How to rotate an SSE/AVX vector
- by user1584773
I need to perform a rotate operation in as few clock cycles as possible.
In the first case, let's assume __m128i as the source and dest type:
source: || A0 || A1 || A2 || A3 ||
dest : || A1 || A2 || A3 || A0 ||
dest = (__m128i)_mm_shuffle_epi32((__m128i)source, _MM_SHUFFLE(0,3,2,1));
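For completeness, here is a minimal self-contained sketch of the SSE case above. The function name `rotate_left_1_sse2` and the array-based interface are just illustrative choices, and it assumes an x86-64 target where SSE2 is always available.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _MM_SHUFFLE */
#include <stdint.h>

/* Hypothetical helper: rotate four 32-bit elements left by one position,
   so that {A0, A1, A2, A3} becomes {A1, A2, A3, A0}. */
void rotate_left_1_sse2(const int32_t src[4], int32_t dst[4])
{
    /* Unaligned load, so the caller's array needs no special alignment. */
    __m128i x = _mm_loadu_si128((const __m128i *)src);

    /* dst[0] = x[1], dst[1] = x[2], dst[2] = x[3], dst[3] = x[0] */
    __m128i y = _mm_shuffle_epi32(x, _MM_SHUFFLE(0, 3, 2, 1));

    _mm_storeu_si128((__m128i *)dst, y);
}
```

The load and store are only there to make the sketch testable from plain arrays; the rotation itself is the single shuffle.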
Now I want to do the same with AVX intrinsics.
So this time let's assume __m256i as the source and dest type:
source: || A0 || A1 || A2 || A3 || A4 || A5 || A6 || A7 ||
dest : || A1 || A2 || A3 || A4 || A5 || A6 || A7 || A0 ||
The AVX intrinsics are missing most of the corresponding SSE integer operations.
Maybe there is some way to get the desired output using the floating-point versions.
I've tried:
dest = (__m256i)_mm256_shuffle_ps((__m256)source, (__m256)source, _MM_SHUFFLE(0,3,2,1));
but what I get is:
|| A0 || A2 || A3 || A4 || A5 || A6 || A7 || A1 ||
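As far as I understand the documentation, `_mm256_shuffle_ps` works on each 128-bit lane separately: within a lane, the two low result elements are selected from the first operand and the two high ones from the second, and nothing can cross the lane boundary. The scalar model below is only a sketch of that documented behaviour (the name `model_shuffle_ps_256` is made up, and it uses plain integers instead of vector registers); with both operands equal, each half of the vector rotates on its own.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm256_shuffle_ps(a, b, imm).
   Elements are indexed 0..7 from the low end; lane 0 is elements 0..3,
   lane 1 is elements 4..7. dst must not alias a or b. */
void model_shuffle_ps_256(const int32_t a[8], const int32_t b[8],
                          unsigned imm, int32_t dst[8])
{
    for (int lane = 0; lane < 2; ++lane) {
        int base = lane * 4;
        /* Two low elements of each lane come from a ... */
        dst[base + 0] = a[base + ((imm >> 0) & 3)];
        dst[base + 1] = a[base + ((imm >> 2) & 3)];
        /* ... and the two high elements come from b. */
        dst[base + 2] = b[base + ((imm >> 4) & 3)];
        dst[base + 3] = b[base + ((imm >> 6) & 3)];
    }
}
```

Since every index is offset by `base`, a single call can never move A0 into the upper half, which is why this instruction alone is not enough for a full 8-element rotate.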
Any idea how to solve this in an efficient way (without mixing SSE and AVX operations and without "manually" swapping A0 and A1)?
Thanks in advance!
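One AVX-only combination that appears to give the requested rotation is an in-lane permute, followed by a swap of the two 128-bit lanes, followed by a blend that takes the last element of each lane from the swapped copy. This is a sketch rather than a claim about the fastest option: the function name `rotate_left_1_avx` and the array interface are illustrative, the `target` attribute assumes GCC or Clang, and the code assumes a CPU that supports AVX.

```c
#include <immintrin.h>  /* AVX intrinsics; also provides _MM_SHUFFLE */
#include <stdint.h>

/* Hypothetical helper: rotate eight 32-bit elements left by one position,
   so that {A0, ..., A7} becomes {A1, ..., A7, A0}, using AVX1 only.
   The target attribute (GCC/Clang) lets this compile without -mavx. */
__attribute__((target("avx")))
void rotate_left_1_avx(const int32_t src[8], int32_t dst[8])
{
    /* Load as integers, then reinterpret as floats (the cast is free). */
    __m256 x = _mm256_castsi256_ps(
        _mm256_loadu_si256((const __m256i *)src));

    /* Rotate inside each 128-bit lane: [A1 A2 A3 A0 | A5 A6 A7 A4] */
    __m256 t0 = _mm256_permute_ps(x, _MM_SHUFFLE(0, 3, 2, 1));

    /* Swap the two lanes of t0:        [A5 A6 A7 A4 | A1 A2 A3 A0] */
    __m256 t1 = _mm256_permute2f128_ps(t0, t0, 0x01);

    /* Take elements 3 and 7 from t1:   [A1 A2 A3 A4 | A5 A6 A7 A0] */
    __m256 y = _mm256_blend_ps(t0, t1, 0x88);

    _mm256_storeu_si256((__m256i *)dst, _mm256_castps_si256(y));
}
```

All three steps are 256-bit AVX operations, so no legacy SSE instructions are mixed in. If AVX2 is available, a single cross-lane permute such as `_mm256_permutevar8x32_epi32` with an index vector of 1, 2, ..., 7, 0 would be an alternative worth measuring.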