Search Results

Search found 21004 results on 841 pages for 'assembly load'.

Page 118/841 | < Previous Page | 114 115 116 117 118 119 120 121 122 123 124 125  | Next Page >

  • Intrinsics program (SSE) - g++ - help needed

    - by Sriram
    Hi all, This is the first time I am posting a question on stackoverflow, so please try and overlook any errors I may have made in formatting my question/code. But please do point the same out to me so I may be more careful. I was trying to write some simple intrinsics routines for the addition of two 128-bit (containing 4 float variables) numbers. I found some code on the net and was trying to get it to run on my system. The code is as follows: //this is a sample Intrinsics program to add two vectors. #include <iostream> #include <iomanip> #include <xmmintrin.h> #include <stdio.h> using namespace std; struct vector4 { float x, y, z, w; }; //functions to operate on them. vector4 set_vector(float x, float y, float z, float w = 0) { vector4 temp; temp.x = x; temp.y = y; temp.z = z; temp.w = w; return temp; } void print_vector(const vector4& v) { cout << " This is the contents of vector: " << endl; cout << " > vector.x = " << v.x << endl; cout << " vector.y = " << v.y << endl; cout << " vector.z = " << v.z << endl; cout << " vector.w = " << v.w << endl; } vector4 sse_vector4_add(const vector4&a, const vector4& b) { vector4 result; asm volatile ( "movl $a, %eax" //move operands into registers. "\n\tmovl $b, %ebx" "\n\tmovups (%eax), xmm0" //move register contents into SSE registers. "\n\tmovups (%ebx), xmm1" "\n\taddps xmm0, xmm1" //add the elements. addps operates on single-precision vectors. "\n\t movups xmm0, result" //move result into vector4 type data. ); return result; } int main() { vector4 a, b, result; a = set_vector(1.1, 2.1, 3.2, 4.5); b = set_vector(2.2, 4.2, 5.6); result = sse_vector4_add(a, b); print_vector(a); print_vector(b); print_vector(result); return 0; } The g++ parameters I use are: g++ -Wall -pedantic -g -march=i386 -msse intrinsics_SSE_example.C -o h The errors I get are as follows: intrinsics_SSE_example.C: Assembler messages: intrinsics_SSE_example.C:45: Error: too many memory references for movups intrinsics_SSE_example.C:46: Error: too many memory references for movups intrinsics_SSE_example.C:47: Error: too many memory references for addps intrinsics_SSE_example.C:48: Error: too many memory references for movups I have spent a lot of time on trying to debug these errors, googled them and so on. I am a complete noob to Intrinsics and so may have overlooked some important things. Any help is appreciated, Thanks, Sriram.

    Read the article

  • OpenGL Calls Lock/Freeze

    - by Necrolis
    I am using some dell workstations(running WinXP Pro SP 2 & DeepFreeze) for development, but something was recenlty loaded onto these machines that prevents any opengl call(the call locks) from completing(and I know the code works as I have tested it on 'clean' machines, I also tested with simple opengl apps generated by dev-cpp, which will also lock on the dell machines). I have tried to debug my own apps to see where exactly the gl calls freeze, but there is some global system hook on ZwQueryInformationProcess that messes up calls to ZwQueryInformationThread(used by ExitThread), preventing me from debugging at all(it causes the debugger, OllyDBG, to go into an access violation reporting loop or the program to crash if the exception is passed along). the hook: ntdll.ZwQueryInformationProcess 7C90D7E0 B8 9A000000 MOV EAX,9A 7C90D7E5 BA 0003FE7F MOV EDX,7FFE0300 7C90D7EA FF12 CALL DWORD PTR DS:[EDX] 7C90D7EC - E9 0F28448D JMP 09D50000 7C90D7F1 9B WAIT 7C90D7F2 0000 ADD BYTE PTR DS:[EAX],AL 7C90D7F4 00BA 0003FE7F ADD BYTE PTR DS:[EDX+7FFE0300],BH 7C90D7FA FF12 CALL DWORD PTR DS:[EDX] 7C90D7FC C2 1400 RETN 14 7C90D7FF 90 NOP ntdll.ZwQueryInformationToken 7C90D800 B8 9C000000 MOV EAX,9C the messed up function + call: ntdll.ZwQueryInformationThread 7C90D7F0 8D9B 000000BA LEA EBX,DWORD PTR DS:[EBX+BA000000] 7C90D7F6 0003 ADD BYTE PTR DS:[EBX],AL 7C90D7F8 FE ??? ; Unknown command 7C90D7F9 7F FF JG SHORT ntdll.7C90D7FA 7C90D7FB 12C2 ADC AL,DL 7C90D7FD 14 00 ADC AL,0 7C90D7FF 90 NOP ntdll.ZwQueryInformationToken 7C90D800 B8 9C000000 MOV EAX,9C So firstly, anyone know what if anything would lead to OpenGL calls cause an infinite lock,and if there are any ways around it? and what would be creating such a hook in kernal memory ? Update: After some more fiddling, I have discovered a few more kernal hooks, a lot of them are used to nullify data returned by system information calls(such as the remote debugging port), I also managed to find out the what ever is doing this is using madchook.dll(by madshi) to do this, this dll is also injected into every running process(these seem to be some anti debugging code). Also, on the OpenGL side, it seems Direct X is fine/unaffected(I ran one of the DX 9 demo's without problems), so could one of these kernal hooks somehow affect OpenGL?

    Read the article

  • Problem with asm program (nasm)

    - by GLeBaTi
    org 0x100 SEGMENT .CODE mov ah,0x9 mov dx, Msg1 int 0x21 ;string input mov ah,0xA mov dx,buff int 0x21 mov ax,0 mov al,[buff+1]; length ;string UPPERCASE mov cl, al mov si, buff cld loop1: lodsb; cmp al, 'a' jnb upper loop loop1 ;output mov ah,0x9 mov dx, buff int 0x21 exit: mov ah, 0x8 int 0x21 int 0x20 upper: sub al,32 jmp loop1 SEGMENT .DATA Msg1 db 'Press string: $' buff db 254,0 this code perform poorly. I think that problem in "jnb upper". This program make small symbols into big symbols.

    Read the article

  • How to specify execution time of x86 and PowerPC instructions?

    - by Goofy
    Hello! I have to approximate execution time of PowerPC and x86 assembler code.I understand that I cannot compute exact it dependson many problems (current processor state - x86 processor dicides internal instructions in microinstructions, memory access time obtainig code from cache of from slower memory etc.). I found some information in Intel Optimizaction reference (APPENDIX C), but it does not provide information about all general purpose instructions. Is there any complete reference about it? What about PowerPC processors? Where can I fund such information?

    Read the article

  • Setting processor to 32-bit mode

    - by dboarman-FissureStudios
    It seems that the following is a common method given in many tutorials on switching a processor from 16-bit to 32-bit: mov eax, cr0 ; set bit 0 in CR0-go to pmode or eax, 1 mov cr0, eax Why wouldn't I simply do the following: or cr0, 1 Is there something I'm missing? Possibly the only thing I can think of is that I cannot perform an operation like this on the cr0 register.

    Read the article

  • Outputting variable values in x86?

    - by Airjoe
    Hello All- I'm working on a homework assignment in x86 and it isn't working as I expect (surprise surprise!). I'd like to be able to output values of variables in x86 functions to ensure that the values are what I expect them to be. Is there a simple way to do this, or is it very complex? For what it's worth, the x86 functions are being used by a C file and compiled with gcc, so if that makes it simpler that is how I'm going about it. Thanks for the help.

    Read the article

  • Why would I need a using statement to Libary B extn methods, if they're used in Library A & it's Li

    - by Greg
    Hi, I have: Main Program Class - uses Library A Library A - has partial classes which mix in methods from Library B Library B - mix in methods & interfaces Why would I need a using statement to LibaryB just to get their extension methods working in the main class? That is given that it's Library B that defines the classes that will be extended. EDIT - Except from code // *** PROGRAM *** using TopologyDAL; using Topology; // *** THIS WAS NEEDED TO GET EXTN METHODS APPEARING *** class Program { static void Main(string[] args) { var context = new Model1Container(); Node myNode; // ** trying to get myNode mixin methods to appear seems to need using line to point to Library B *** } } // ** LIBRARY A namespace TopologyDAL { public partial class Node { // Auto generated from EF } public partial class Node : INode<int> // to add extension methods from Library B { public int Key } } // ** LIBRARY B namespace ToplogyLibrary { public static class NodeExtns { public static void FromNodeMixin<T>(this INode<T> node) { // XXXX } } public interface INode<T> { // Properties T Key { get; } // Methods } }

    Read the article

  • delete a file in protected mode env(like windows xp)

    - by JGC
    hi I write a program to delete a file from somewhere of my harddisk in 8086 but when i use int 21h (ah=41h) an error happens and carry set to 1.and I cannot delete that. does anyone know what can I do? I think it should be from protected mode which does not allow my program to delete another file.I want the answer and language is not matter.

    Read the article

  • easy asm program(nasm)

    - by GLeBaTi
    org 0x100 SEGMENT .CODE mov ah,0x9 mov dx, Msg1 int 0x21 ;string input mov ah,0xA mov dx,buff int 0x21 mov ax,0 mov al,[buff+1]; length ;string UPPERCASE mov cl, al mov si, buff cld loop1: lodsb; cmp al, 'a' jnb upper loop loop1 ;output mov ah,0x9 mov dx, buff int 0x21 exit: mov ah, 0x8 int 0x21 int 0x20 upper: sub al,32 jmp loop1 SEGMENT .DATA Msg1 db 'Press string: $' buff db 254,0 this code perform poorly. I think that problem in "jnb upper". This program make small symbols into big symbols.

    Read the article

  • translate ia32 into C

    - by David Lee
    I am trying to translate the following: Action: push %ebp #function prolog mov %esp, %ebp sub $0x10, %esp mov 0x8(%ebp), %eax #first line compiles to these 4 lines imul 0x8(%ebp), %eax sub $0x7, %eax mov %eax, -0x4(%ebp) addl $0x8, 0xc(%ebp) #second line mov -0x4(%ebp), %eax #third line mov 0xc(%ebp), %edx mov (%edx, %eax, 4), %eax add $0x3, %eax movb $0x41, (%eax) leave ret So far I have the following: //What am I missing? void Action(int x, char **y) { int z = x * x - 7; y+=8; //missing third line } What is the best way to translate this?

    Read the article

  • Is there any kind of standard for 8086 multiprocessing?

    - by Earlz
    Back when I made an 8086 emulator I noticed that there was the LOCK prefix intended for synchonization in a multiprocessor environment. Yet the only multitasking I know of for the x86 arch. involves use of the APIC which didn't come around until either the Pentiums or 486s. Was there any kind of standard for 8086 multitasking or was it done by some manufacturer specific extensions to the instruction set and/or special ports? By standard, I mean things like: How do you separate the 2 processors if they both use the same memory? This is impossible without some kind of way to make each processor execute a different piece of code. (or cause an interrupt on only one processor)

    Read the article

  • ARM Simulator on Windows

    - by Betamoo
    I am studying ARM Processors from a textbook... I thought it will be more useful if I could apply what I learn on an ARM simulator... writing code then watching results and different execution stages would be more fun... I have searched for it, but all I could find was either a freeware on linux or a demo on windows Is there a simulator that allow me to see execution steps and different changes for ARM processor (any version!) that runs on windows?? Thanks

    Read the article

  • C++ word to bytes

    - by Vit
    Hi, I tried to read CPUID using assembler in C++. I know there is function for it in , but I want the asm way. So, after CPUID is executed, it should fill eax,ebx,ecx registers with ASCII coded string. But my problem is, since I can in asm adress only full, or half eax register, how to break that 32 bits into 4 bytes. I used this: #include <iostream> #include <stdlib.h> int main() { _asm { cpuid /*There I need to mov values from eax,ebx and ecx to some propriate variables*/ } system("PAUSE"); return(0); }

    Read the article

  • Why does it NOT give a segmentation violation?

    - by user198729
    The code below is said to give a segmentation violation: #include <stdio.h> #include <string.h> void function(char *str) { char buffer[16]; strcpy(buffer,str); } int main() { char large_string[256]; int i; for( i = 0; i < 255; i++) large_string[i] = 'A'; function(large_string); return 1; } It's compiled and run like this: gcc -Wall -Wextra hw.cpp && a.exe But there is nothing output. NOTE The above code indeed overwrites the ret address and so on if you really understand what's going underneath.

    Read the article

  • Electronic resources for learning Z/OS assembler?

    - by Jared
    This is a follow up to this question. I'm totally blind so printed books aren't an option. All the recommended books appear to have been published before electronic publishing got started. I've been able to learn the very basics but would like something between here's what a register is, and the IBM reference material. Searching the normal places like Safari Books Online has come up dry.

    Read the article

  • What are CFI directives in Gnu Assembler (GAS) used for?

    - by claws
    There seem to be a .CFI directive after every line and also there are wide varities of these ex.,.cfi_startproc , .cfi_endproc etc.. more here. .file "temp.c" .text .globl main .type main, @function main: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 movq %rsp, %rbp .cfi_offset 6, -16 .cfi_def_cfa_register 6 movl $0, %eax leave ret .cfi_endproc .LFE0: .size main, .-main .globl func .type func, @function func: .LFB1: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 movq %rsp, %rbp .cfi_offset 6, -16 .cfi_def_cfa_register 6 movl %edi, -4(%rbp) movl %esi, %eax movb %al, -8(%rbp) leave ret .cfi_endproc .LFE1: .size func, .-func .ident "GCC: (Ubuntu 4.4.1-4ubuntu9) 4.4.1" .section .note.GNU-stack,"",@progbits I didn't get the purpose of these.

    Read the article

  • How to benchmark on multi-core processors

    - by Pascal Cuoq
    I am looking for ways to perform micro-benchmarks on multi-core processors. Context: At about the same time desktop processors introduced out-of-order execution that made performance hard to predict, they, perhaps not coincidentally, also introduced special instructions to get very precise timings. Example of these instructions are rdtsc on x86 and rftb on PowerPC. These instructions gave timings that were more precise than could ever be allowed by a system call, allowed programmers to micro-benchmark their hearts out, for better or for worse. On a yet more modern processor with several cores, some of which sleep some of the time, the counters are not synchronized between cores. We are told that rdtsc is no longer safe to use for benchmarking, but I must have been dozing off when we were explained the alternative solutions. Question: Some systems may save and restore the performance counter and provide an API call to read the proper sum. If you know what this call is for any operating system, please let us know in an answer. Some systems may allow to turn off cores, leaving only one running. I know Mac OS X Leopard does when the right Preference Pane is installed from the Developers Tools. Do you think that this make rdtsc safe to use again? More context: Please assume I know what I am doing when trying to do a micro-benchmark. If you are of the opinion that if an optimization's gains cannot be measured by timing the whole application, it's not worth optimizing, I agree with you, but I cannot time the whole application until the alternative data structure is finished, which will take a long time. In fact, if the micro-benchmark were not promising, I could decide to give up on the implementation now; I need figures to provide in a publication whose deadline I have no control over.

    Read the article

  • Illegal instruction gcc assembler.

    - by Bernt
    In assembler: .globl _test _test: pushl %ebp movl %esp, %ebp movl $0, %eax pushl %eax popl %ebp ret Calling from c main() { _test(); } Compile: gcc -m32 -o test test.c test.s This code gives me illegal instruction sometimes and segment fault other times. In gdc i always get illegal instruction, this is just a simple test, i had a larger program that was working and suddenly after no apperant reason stopped working, now i always get this error even if i start from scratch like above. I have narrowed it down to pushl %eax (or any other register....), if i comment out that line the code runs fine. Any ideas? (I'm running the program at my universities linux cluster, so I have not changed any settings..)

    Read the article

  • "C variable type sizes are machine dependent." Is it really true? signed & unsigned numbers ;

    - by claws
    Hello, I've been told that C types are machine dependent. Today I wanted to verify it. void legacyTypes() { /* character types */ char k_char = 'a'; //Signedness --> signed & unsigned signed char k_char_s = 'a'; unsigned char k_char_u = 'a'; /* integer types */ int k_int = 1; /* Same as "signed int" */ //Signedness --> signed & unsigned signed int k_int_s = -2; unsigned int k_int_u = 3; //Size --> short, _____, long, long long short int k_s_int = 4; long int k_l_int = 5; long long int k_ll_int = 6; /* real number types */ float k_float = 7; double k_double = 8; } I compiled it on a 32-Bit machine using minGW C compiler _legacyTypes: pushl %ebp movl %esp, %ebp subl $48, %esp movb $97, -1(%ebp) # char movb $97, -2(%ebp) # signed char movb $97, -3(%ebp) # unsigned char movl $1, -8(%ebp) # int movl $-2, -12(%ebp)# signed int movl $3, -16(%ebp) # unsigned int movw $4, -18(%ebp) # short int movl $5, -24(%ebp) # long int movl $6, -32(%ebp) # long long int movl $0, -28(%ebp) movl $0x40e00000, %eax movl %eax, -36(%ebp) fldl LC2 fstpl -48(%ebp) leave ret I compiled the same code on 64-Bit processor (Intel Core 2 Duo) on GCC (linux) legacyTypes: .LFB2: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 movq %rsp, %rbp .cfi_offset 6, -16 .cfi_def_cfa_register 6 movb $97, -1(%rbp) # char movb $97, -2(%rbp) # signed char movb $97, -3(%rbp) # unsigned char movl $1, -12(%rbp) # int movl $-2, -16(%rbp)# signed int movl $3, -20(%rbp) # unsigned int movw $4, -6(%rbp) # short int movq $5, -32(%rbp) # long int movq $6, -40(%rbp) # long long int movl $0x40e00000, %eax movl %eax, -24(%rbp) movabsq $4620693217682128896, %rax movq %rax, -48(%rbp) leave ret Observations char, signed char, unsigned char, int, unsigned int, signed int, short int, unsigned short int, signed short int all occupy same no. of bytes on both 32-Bit & 64-Bit Processor. The only change is in long int & long long int both of these occupy 32-bit on 32-bit machine & 64-bit on 64-bit machine. And also the pointers, which take 32-bit on 32-bit CPU & 64-bit on 64-bit CPU. Questions: I cannot say, what the books say is wrong. But I'm missing something here. What exactly does "Variable types are machine dependent mean?" As you can see, There is no difference between instructions for unsigned & signed numbers. Then how come the range of numbers that can be addressed using both is different? I was reading http://stackoverflow.com/questions/2511246/how-to-maintain-fixed-size-of-c-variable-types-over-different-machines I didn't get the purpose of the question or their answers. What maintaining fixed size? They all are the same. I didn't understand how those answers are going to ensure the same size.

    Read the article

  • Software Protection: Shuffeling my application?

    - by Martijn Courteaux
    Hi, I want to continue on my previous question: http://stackoverflow.com/questions/3007168/torrents-can-i-protect-my-software-by-sending-wrong-bytes Developer Art suggested to add a unique key to the application, to identifier the cracker. But JAB said that crackers can search where my unique key is located by checking for binary differences, if the cracker has multiple copies of my software. Then crackers change that key to make them self anonymous. That is true. Now comes the question: If I want to add a unique key, are there tools to shuffle (a kind of obfuscation) the program modules? So, that a binary compare would say that the two files are completely different. So they can't locate the identifier key. I'm pretty sure it is possible (maybe by replacing assembler blocks and make some jumps). I think it would be enough to make 30 to 40 shuffles of my software. Thanks

    Read the article

  • How do I patch a Windows API at runtime so that it to returns 0 in x64?

    - by Jorge Vasquez
    In x86, I get the function address using GetProcAddress() and write a simple XOR EAX,EAX; RET; in it. Simple and effective. How do I do the same in x64? bool DisableSetUnhandledExceptionFilter() { const BYTE PatchBytes[5] = { 0x33, 0xC0, 0xC2, 0x04, 0x00 }; // XOR EAX,EAX; RET; // Obtain the address of SetUnhandledExceptionFilter HMODULE hLib = GetModuleHandle( _T("kernel32.dll") ); if( hLib == NULL ) return false; BYTE* pTarget = (BYTE*)GetProcAddress( hLib, "SetUnhandledExceptionFilter" ); if( pTarget == 0 ) return false; // Patch SetUnhandledExceptionFilter if( !WriteMemory( pTarget, PatchBytes, sizeof(PatchBytes) ) ) return false; // Ensures out of cache FlushInstructionCache(GetCurrentProcess(), pTarget, sizeof(PatchBytes)); // Success return true; } static bool WriteMemory( BYTE* pTarget, const BYTE* pSource, DWORD Size ) { // Check parameters if( pTarget == 0 ) return false; if( pSource == 0 ) return false; if( Size == 0 ) return false; if( IsBadReadPtr( pSource, Size ) ) return false; // Modify protection attributes of the target memory page DWORD OldProtect = 0; if( !VirtualProtect( pTarget, Size, PAGE_EXECUTE_READWRITE, &OldProtect ) ) return false; // Write memory memcpy( pTarget, pSource, Size ); // Restore memory protection attributes of the target memory page DWORD Temp = 0; if( !VirtualProtect( pTarget, Size, OldProtect, &Temp ) ) return false; // Success return true; } This example is adapted from code found here: http://www.debuginfo.com/articles/debugfilters.html#overwrite .

    Read the article

< Previous Page | 114 115 116 117 118 119 120 121 122 123 124 125  | Next Page >