Why does new String(bytes, enc).getBytes(enc) not return the original byte array?

Posted by Bozho on Stack Overflow
Published on 2010-03-30T12:12:53Z

I made the following "simulation":

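// Fill the array with every possible byte value, -128..127.
byte[] b = new byte[256];
for (int i = 0; i < 256; i++) {
    b[i] = (byte) (i - 128);
}

// Decode to a String and encode back with the same charset.
// (The String-name overloads throw UnsupportedEncodingException.)
byte[] transformed = new String(b, "cp1251").getBytes("cp1251");

// Report every index whose byte did not survive the round trip.
for (int i = 0; i < b.length; i++) {
    if (b[i] != transformed[i]) {
        System.out.println("Wrong : " + i);
    }
}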

For cp1251 this outputs only one wrong byte - at position 23.
For KOI8-R - all fine.
For cp1252 - 4 or 5 differences.
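
To narrow this down per charset, here is a minimal sketch that round-trips each byte value individually and collects the ones that change. The class name RoundTripCheck and the method lossyBytes are made up for illustration. The likely cause of a mismatch is a byte value the charset leaves unassigned (cp1251 has no character at 0x98, for example): new String(...) decodes such a byte to the replacement character U+FFFD, and getBytes(...) then encodes that character as '?'.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class RoundTripCheck {

    /**
     * Returns the unsigned values (0..255) of all bytes that do not
     * survive a decode + encode round trip in the given charset.
     * Hypothetical helper, not part of any library.
     */
    static List<Integer> lossyBytes(String charsetName) {
        Charset cs = Charset.forName(charsetName);
        List<Integer> lossy = new ArrayList<>();
        for (int value = 0; value < 256; value++) {
            byte[] original = { (byte) value };
            // Unassigned bytes are silently replaced during decoding.
            byte[] roundTripped = new String(original, cs).getBytes(cs);
            if (roundTripped.length != 1 || roundTripped[0] != original[0]) {
                lossy.add(value);
            }
        }
        return lossy;
    }

    public static void main(String[] args) {
        String[] names = { "windows-1251", "KOI8-R", "windows-1252", "ISO-8859-1" };
        for (String name : names) {
            StringBuilder hex = new StringBuilder();
            for (int value : lossyBytes(name)) {
                hex.append(String.format(" 0x%02X", value));
            }
            // One line per charset, followed by its lossy byte values in hex.
            System.out.println(name + ":" + hex);
        }
    }
}
```

Checking one byte at a time keeps the report in terms of byte values rather than array indices, which makes it easier to compare against a published code page table.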

What is the reason for this and how can this be overcome?

I know it is wrong to represent byte arrays as strings in whatever encoding, but it is a requirement of the protocol of a payment provider, so I don't have a choice.

Update: representing it in ISO-8859-1 works, so I'll use that for the byte[] part and cp1251 for the textual part. The question remains only out of curiosity.
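
The ISO-8859-1 workaround can be sketched as follows. ISO-8859-1 maps each of the 256 byte values to the Unicode code point with the same number (U+0000 to U+00FF), so the conversion is lossless in both directions. The class name Latin1Transport and both method names are made up for illustration and say nothing about the payment provider's actual API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1Transport {

    /** Wraps arbitrary bytes in a String, one char per byte (hypothetical helper). */
    static String toTransportString(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** Recovers the original bytes from a String produced by toTransportString. */
    static byte[] fromTransportString(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Same test data as in the question: every byte value from -128 to 127.
        byte[] b = new byte[256];
        for (int i = 0; i < 256; i++) {
            b[i] = (byte) (i - 128);
        }
        byte[] restored = fromTransportString(toTransportString(b));
        System.out.println(Arrays.equals(b, restored)); // prints "true"
    }
}
```

Using the StandardCharsets constant instead of a charset name also avoids the checked UnsupportedEncodingException.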

© Stack Overflow or respective owner
