Efficiency of data structures in C99 (possibly affected by endianness)

Posted by Ninefingers on Stack Overflow
Published on 2010-12-25T19:40:30Z Indexed on 2010/12/25 22:54 UTC

Hi All,

I have a couple of questions that are all inter-related. Basically, in the algorithm I am implementing, a word w is defined as four bytes, so it fits exactly in a uint32_t.

However, during the operation of the algorithm I often need to access the various parts of the word. I can do this in two ways. The first uses masks and shifts:

uint32_t w = 0x11223344;
uint8_t a = (w & 0xff000000) >> 24;
uint8_t b = (w & 0x00ff0000) >> 16;
uint8_t c = (w & 0x0000ff00) >>  8;
uint8_t d = (w & 0x000000ff);

However, part of me thinks that isn't particularly efficient. I thought a better way would be to use a union representation like so:

typedef union
{
    struct
    {
        uint8_t d;
        uint8_t c;
        uint8_t b;
        uint8_t a;
    };
    uint32_t n;
} word32;

Using this method I can declare word32 w; assign w.n = 0x11223344; and then access the various parts as I require (w.a == 0x11 on a little-endian machine).
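To make the union approach concrete, here is a minimal sketch. The names word32_sketch, parts and make_word are made up for illustration. The inner struct is given a name because anonymous struct members were only standardised in C11 (clang accepts them in C99 mode as an extension), and the sketch assumes the four uint8_t members are laid out without padding.

```c
#include <stdint.h>

/* Hypothetical variant of the union above, with the inner struct named
 * "parts" so that it is valid in strict C99. */
typedef union
{
    struct
    {
        uint8_t d;  /* lowest address */
        uint8_t c;
        uint8_t b;
        uint8_t a;  /* highest address */
    } parts;
    uint32_t n;
} word32_sketch;

/* Store a 32-bit value through the integer member. */
static inline word32_sketch make_word(uint32_t value)
{
    word32_sketch w;
    w.n = value;
    return w;
}
```

After word32_sketch w = make_word(0x11223344); the member w.parts.a holds the byte at the highest address, which is 0x11 on a little-endian host and 0x44 on a big-endian one.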

However, at this stage I come up against endianness issues: on big-endian systems the struct members are in the wrong order, so I need to re-order the word before it is passed in.
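The re-ordering could look something like the sketch below. The helper names (swap_bytes32, host_is_little_endian, to_union_order) are invented for illustration, and the byte-order check is a run-time probe; a build-time macro would be the other obvious option.

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit word. */
static inline uint32_t swap_bytes32(uint32_t w)
{
    return ((w & 0xff000000u) >> 24) |
           ((w & 0x00ff0000u) >>  8) |
           ((w & 0x0000ff00u) <<  8) |
           ((w & 0x000000ffu) << 24);
}

/* Returns 1 if the least significant byte is stored at the lowest address. */
static inline int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Leave the word alone on little-endian hosts and swap it on big-endian
 * ones, so that a struct laid out as d, c, b, a sees the same bytes. */
static inline uint32_t to_union_order(uint32_t w)
{
    return host_is_little_endian() ? w : swap_bytes32(w);
}
```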

This I can do without too much difficulty. My question, then: is the first approach (bitwise ANDs and shifts) efficient compared to the implementation using a union? Is there any general difference between the two? Which way should I go on a modern x86_64 processor? Is endianness just a red herring here?

I could inspect the assembly output of course, but my knowledge of compilers is not brilliant. I would have thought a union would be more efficient as it would essentially convert to memory offsets, like so:

mov eax, [r9+8]

Would a compiler realise that this is what is happening in the bit-shift case above?
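One way to check would be to put the two variants side by side in a small file and compare what the compiler generates for each, for example with clang -std=c99 -O2 -S. The sketch below is only a test harness under assumptions: the function names are made up, and the union version hard-codes index 3, which is the most significant byte only on a little-endian target such as x86_64.

```c
#include <stdint.h>

/* Shift-based extraction of the most significant byte. Shifting first
 * makes the mask unnecessary; the result matches (w & 0xff000000) >> 24. */
uint8_t top_byte_shift(uint32_t w)
{
    return (uint8_t)(w >> 24);
}

/* Union-based extraction. Index 3 is the most significant byte only on a
 * little-endian host; on a big-endian host it is the least significant. */
uint8_t top_byte_union(uint32_t w)
{
    union
    {
        uint32_t n;
        uint8_t bytes[4];
    } u;
    u.n = w;
    return u.bytes[3];
}
```

The functions are deliberately not static, so that both still appear in the assembly output when nothing in the file calls them.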

If it matters, I'm using C99, specifically my compiler is clang (llvm).

Thanks in advance.
