Concise SSE and MMX instruction reference with latencies and throughput
- by Joe
I am trying to optimize some arithmetic by using the MMX and SSE instruction sets with inline assembly. However, I have been unable to find good references for the timing and usage of these instructions. Could you please help me find references that list the throughput, latency, and operands of each instruction, ideally with a short description of what it does?
So far, I have found:
Intel Instruction Set Reference (Software Developer's Manual, Volumes 2A and 2B)
http://www.intel.com/Assets/PDF/manual/253666.pdf
http://www.intel.com/Assets/PDF/manual/253667.pdf
Intel Optimization Reference Manual
http://www.intel.com/Assets/PDF/manual/248966.pdf
Timings of Integer Operations
http://gmplib.org/~tege/x86-timing.pdf