How to rotate an SSE/AVX vector

Posted by user1584773 on Stack Overflow
Published on 2012-08-10T17:52:28Z Indexed on 2012/12/01 23:04 UTC

I need to perform a rotate operation in as few clock cycles as possible. In the first case, let's assume __m128i as the source and dest type:

source: || A0 || A1 || A2 || A3 ||

dest : || A1 || A2 || A3 || A0 ||

dest = (__m128i)_mm_shuffle_epi32((__m128i)source, _MM_SHUFFLE(0,3,2,1));
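
For reference, the SSE case can be wrapped in a small self-contained function. This is only a sketch: the helper name `rotate_left_sse` and the array-based interface are assumptions made for illustration and are not part of the original code.

```c
#include <emmintrin.h>  /* SSE2: _mm_shuffle_epi32, unaligned load/store */
#include <stdint.h>

/* Hypothetical helper: rotate four 32-bit elements left by one position,
   {A0, A1, A2, A3} -> {A1, A2, A3, A0}. */
static void rotate_left_sse(const int32_t src[4], int32_t dst[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    /* _MM_SHUFFLE(0,3,2,1): dst[0]=src[1], dst[1]=src[2],
       dst[2]=src[3], dst[3]=src[0] */
    __m128i r = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 3, 2, 1));
    _mm_storeu_si128((__m128i *)dst, r);
}
```

Calling `rotate_left_sse` on an array holding A0..A3 should leave A1, A2, A3, A0 in the output array, matching the layout shown above.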

Now I want to do the same with AVX intrinsics. So this time let's assume __m256i as the source and dest type:

source: || A0 || A1 || A2 || A3 || A4 || A5 || A6 || A7 ||

dest : || A1 || A2 || A3 || A4 || A5 || A6 || A7 || A0 ||

The AVX intrinsics are missing most of the corresponding SSE integer operations. Maybe there is some way to get the desired output by working with the floating-point versions.

I've tried with:

dest = (__m256i)_mm256_shuffle_ps((__m256)source, (__m256)source, _MM_SHUFFLE(0,3,2,1));

but what I get is:

|| A0 || A2 || A3 || A4 || A5 || A6 || A7 || A1 ||

Any idea how to solve this efficiently (without mixing SSE and AVX operations and without "manually" swapping A0 and A1)?

Thanks in advance!
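
One possible approach, sketched here rather than offered as the definitive answer: the 256-bit float shuffles and permutes work within each 128-bit lane, so a full 8-element rotate needs a cross-lane step. The sketch below rotates inside each lane, swaps the two lanes, and then blends the two results, using only AVX intrinsics. The helper name `rotate_left_avx`, the `AVX_TARGET` macro, and the array-based interface are assumptions made for illustration.

```c
#include <immintrin.h>  /* AVX intrinsics */
#include <stdint.h>

/* Assumption: on GCC/Clang, enable AVX for this one function so the file
   builds without -mavx. Other compilers need their own AVX switch. */
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: rotate eight 32-bit elements left by one position,
   {A0, ..., A7} -> {A1, ..., A7, A0}. Requires a CPU with AVX. */
AVX_TARGET
static void rotate_left_avx(const int32_t src[8], int32_t dst[8])
{
    /* Reinterpret the integers as floats; the permutes only move bits. */
    __m256 v = _mm256_castsi256_ps(_mm256_loadu_si256((const __m256i *)src));

    /* Rotate within each 128-bit lane: A1 A2 A3 A0 | A5 A6 A7 A4 */
    __m256 t0 = _mm256_permute_ps(v, _MM_SHUFFLE(0, 3, 2, 1));

    /* Swap the two lanes: A5 A6 A7 A4 | A1 A2 A3 A0 */
    __m256 t1 = _mm256_permute2f128_ps(t0, t0, 0x01);

    /* Take elements 3 and 7 from t1, the rest from t0 (mask 0x88):
       A1 A2 A3 A4 | A5 A6 A7 A0 */
    __m256 r = _mm256_blend_ps(t0, t1, 0x88);

    _mm256_storeu_si256((__m256i *)dst, _mm256_castps_si256(r));
}
```

This costs three shuffle-type instructions and does not mix SSE and AVX operations. If AVX2 is available, `_mm256_permutevar8x32_epi32` with the index vector {1, 2, 3, 4, 5, 6, 7, 0} should do the same rotate with a single cross-lane permute.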

