I've been trying to speed up my code at a couple of very crucial lines:
float x = a*b;
float y = c*d;
float z = e*f;
float w = g*h;
All of a, b, c, ... are floats.
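For context, the four products are independent of each other, so in principle they map onto a single packed multiply. A minimal sketch of that idea, assuming the eight inputs have already been gathered into two arrays of four floats (the helper name `mul4` and the array layout are hypothetical, for illustration only, and not my actual code):

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps

// Hypothetical helper: multiplies four pairs of floats with one packed multiply.
// Assumed layout: lhs = {a, c, e, g}, rhs = {b, d, f, h}, out = {x, y, z, w}.
void mul4(const float lhs[4], const float rhs[4], float out[4]) {
    __m128 l = _mm_loadu_ps(lhs);          // unaligned load of a, c, e, g
    __m128 r = _mm_loadu_ps(rhs);          // unaligned load of b, d, f, h
    _mm_storeu_ps(out, _mm_mul_ps(l, r));  // four products, stored as x, y, z, w
}
```

Note that this only covers the multiply itself; gathering scalars into the arrays and reading the results back out are extra work that the plain scalar version does not have.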
I decided to look into using SSE, but I can't seem to find any improvement; in fact, it turns out to be twice as slow. My SSE code is:
Vector4 abcd,
…