Add uchar values to a ushort array with SSE2 or SSE3
Posted by pompolus on Stack Overflow
Published on 2012-11-09T18:06:59Z
I have an unsigned short dst[16][16] matrix and a larger unsigned char src[m][n] matrix.
I need to read a 16x16 submatrix from src and add it element-wise to dst, using SSE2 or SSE3.
In an older implementation I was sure that the summed values would never exceed 255, so I could do this:
for (int row = 0; row < 16; ++row)
{
    __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
    dst[row] = _mm_add_epi8(dst[row], subMat);
    src += W; // step to the next row to add
}
where W is the offset from one row of src to the next row I need. This code works, but now the values in src are larger and the sums can exceed 255, so I need to store them as ushort.
I've tried this:
for (int row = 0; row < 16; ++row)
{
    __m128i subMat = _mm_lddqu_si128(reinterpret_cast<const __m128i*>(src));
    dst[row] = _mm_add_epi16(dst[row], subMat);
    src += W; // step to the next row to add
}
but it doesn't work. I'm not very experienced with SSE, so any help would be appreciated.
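A likely reason the second loop fails: _mm_add_epi16 treats the 16 loaded bytes as eight 16-bit lanes, so each pair of neighbouring bytes gets fused into one value instead of being added separately. One common approach is to zero-extend the bytes to 16 bits first by interleaving them with zero, then add the two resulting halves. The following is only a minimal sketch under stated assumptions: the helper name add_block_u8_to_u16 and its signature are hypothetical, dst is taken to be a plain 16x16 array of 16-bit values, and stride plays the role of W.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstddef>
#include <cstdint>

// Hypothetical helper (name and signature are assumptions, not from the post).
// Adds the 16x16 block of bytes starting at src to dst, widening each byte
// to 16 bits first. stride is the distance in bytes between consecutive
// rows of src (W in the post).
void add_block_u8_to_u16(std::uint16_t dst[16][16],
                         const std::uint8_t* src,
                         std::size_t stride)
{
    const __m128i zero = _mm_setzero_si128();
    for (int row = 0; row < 16; ++row)
    {
        // Unaligned load of 16 source bytes (plain SSE2, no SSE3 required).
        __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));

        // Zero-extend: interleaving with zero turns each byte into a 16-bit lane.
        __m128i lo = _mm_unpacklo_epi8(bytes, zero);  // source bytes 0..7
        __m128i hi = _mm_unpackhi_epi8(bytes, zero);  // source bytes 8..15

        // One row of dst is 16 x 16 bits, i.e. two 128-bit vectors.
        __m128i* d = reinterpret_cast<__m128i*>(dst[row]);
        __m128i acc_lo = _mm_loadu_si128(d);
        __m128i acc_hi = _mm_loadu_si128(d + 1);

        _mm_storeu_si128(d,     _mm_add_epi16(acc_lo, lo));
        _mm_storeu_si128(d + 1, _mm_add_epi16(acc_hi, hi));

        src += stride;  // step to the next source row
    }
}
```

With this layout each 16-bit lane wraps around once it passes 65535, which is 257 additions of the maximum byte value 255. If saturation is preferable to wrap-around, _mm_adds_epu16 could be used in place of _mm_add_epi16.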
© Stack Overflow or respective owner