Efficient mapping for a particular finite integer set

Posted by R.. on Stack Overflow
Published on 2011-02-06T06:50:50Z Indexed on 2011/02/06 7:26 UTC

I'm looking for a small, fast (in both directions) bijective mapping between the following list of integers and a subset of the range 0-127:

0x200C, 0x200D, 0x200E, 0x200F,
0x2013, 0x2014, 0x2015, 0x2017,
0x2018, 0x2019, 0x201A, 0x201C,
0x201D, 0x201E, 0x2020, 0x2021,
0x2022, 0x2026, 0x2030, 0x2039,
0x203A, 0x20AA, 0x20AB, 0x20AC,
0x20AF, 0x2116, 0x2122

One obvious solution is:

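y = (x>>2 & 0x40) | (x & 0x3f);            /* encode: bit 8 -> bit 6, keep the low 6 bits */
x = 0x2000 | (y<<2 & 0x100) | (y & 0x3f);  /* decode: bit 6 -> bit 8, restore the 0x2000 base */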

Edit: I originally left out some of the values, in particular the 0x20Ax ones, which don't round-trip with the above: the encoding discards bit 7, so they decode to 0x202x instead.
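
The limitation noted in the edit can be checked mechanically. The following is a minimal sketch in C that round-trips every listed code point through the two expressions. The names `encode_v1`, `decode_v1`, `roundtrip_failures`, and `check_v1` are hypothetical helpers introduced here for illustration, not part of the original post.

```c
#include <assert.h>
#include <stddef.h>

/* The 27 code points listed in the question. */
static const unsigned codepoints[] = {
    0x200C, 0x200D, 0x200E, 0x200F, 0x2013, 0x2014, 0x2015, 0x2017,
    0x2018, 0x2019, 0x201A, 0x201C, 0x201D, 0x201E, 0x2020, 0x2021,
    0x2022, 0x2026, 0x2030, 0x2039, 0x203A, 0x20AA, 0x20AB, 0x20AC,
    0x20AF, 0x2116, 0x2122
};
#define NCODEPOINTS (sizeof codepoints / sizeof codepoints[0])

/* The two expressions from the post, parenthesized to spell out C precedence
   (hypothetical function names). */
static unsigned encode_v1(unsigned x) { return ((x >> 2) & 0x40) | (x & 0x3f); }
static unsigned decode_v1(unsigned y) { return 0x2000 | ((y << 2) & 0x100) | (y & 0x3f); }

/* Collect the code points that do not survive encode followed by decode.
   Returns how many were written to failed[], which must hold NCODEPOINTS entries. */
static size_t roundtrip_failures(unsigned *failed)
{
    size_t n = 0;
    for (size_t i = 0; i < NCODEPOINTS; i++) {
        unsigned x = codepoints[i];
        if (decode_v1(encode_v1(x)) != x)
            failed[n++] = x;
    }
    return n;
}

/* Asserts that exactly four values fail and that all of them are 0x20Ax. */
static void check_v1(void)
{
    unsigned failed[NCODEPOINTS];
    size_t n = roundtrip_failures(failed);
    assert(n == 4);
    for (size_t i = 0; i < n; i++)
        assert((failed[i] & 0xFFF0) == 0x20A0);
}
```

Calling `check_v1()` from a `main` should pass silently: the 23 other values round-trip, while 0x20AA, 0x20AB, 0x20AC, and 0x20AF lose bit 7 on the way in and come back as 0x202A, 0x202B, 0x202C, and 0x202F.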

Another obvious solution is a lookup table. To keep the table from being unnecessarily large, though, it would need some bit rearrangement anyway, and I suspect the whole task can be better accomplished with simple bit rearrangement alone.
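
As one possible illustration of a pure bit rearrangement that covers all 27 values, here is a sketch in C. It is not from the original post. It assumes the list is exactly the one given and leans on a coincidence of that list: among the values with bit 7 or bit 8 set, bit 3 is set for 0x20AA..0x20AF and clear for 0x2116 and 0x2122. The names `encode_v2`, `decode_v2`, and `check_v2` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The 27 code points listed in the question. */
static const unsigned codepoints[] = {
    0x200C, 0x200D, 0x200E, 0x200F, 0x2013, 0x2014, 0x2015, 0x2017,
    0x2018, 0x2019, 0x201A, 0x201C, 0x201D, 0x201E, 0x2020, 0x2021,
    0x2022, 0x2026, 0x2030, 0x2039, 0x203A, 0x20AA, 0x20AB, 0x20AC,
    0x20AF, 0x2116, 0x2122
};
#define NCODEPOINTS (sizeof codepoints / sizeof codepoints[0])

/* Encode: y bit 6 = (x bit 8) OR (x bit 7); y bits 0-5 = x bits 0-5. */
static unsigned encode_v2(unsigned x)
{
    return (((x >> 2) | (x >> 1)) & 0x40) | (x & 0x3f);
}

/* Decode: when y bit 6 is set, y bit 3 tells the two sources apart:
   set means 0x20AA..0x20AF (restore bit 7), clear means 0x2116 or 0x2122
   (restore bit 8). ASSUMPTION: this only holds for this exact list. */
static unsigned decode_v2(unsigned y)
{
    unsigned hi = (y >> 6) & 1;
    unsigned b3 = (y >> 3) & 1;
    return 0x2000 | (y & 0x3f) | ((hi & b3) << 7) | ((hi & (b3 ^ 1)) << 8);
}

/* Asserts that every code point maps to a distinct value in 0-127
   and decodes back to itself. */
static void check_v2(void)
{
    unsigned char seen[128] = {0};
    for (size_t i = 0; i < NCODEPOINTS; i++) {
        unsigned x = codepoints[i];
        unsigned y = encode_v2(x);
        assert(y < 128);
        assert(!seen[y]);
        seen[y] = 1;
        assert(decode_v2(y) == x);
    }
}
```

Both directions are branch-free and need no table, but the mapping is brittle: adding a code point whose bit pattern breaks the bit 3 coincidence would require rethinking the decode step, so a check like `check_v2()` is worth keeping next to it.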

For the curious, those magic numbers are the only "large" Unicode codepoints that appear in legacy ISO-8859 and Windows codepages.
