Heuristic to identify whether a series of 4-byte chunks of data are integers or floats

Posted by flint on Stack Overflow
Published on 2010-03-21T00:53:53Z Indexed on 2010/03/21 1:01 UTC

What's the best heuristic I can use to identify whether a block of X 4-byte values holds integers or floats? A human can do this easily, but I want to do it programmatically.

I realize that since every combination of bits is a valid integer and (almost?) every one is also a valid float, there is no way to know for sure. Still, I would like to identify the most likely candidate; the guess should be right virtually every time, or at least as often as a human's would be.
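To make the "(almost?)" concrete: assuming `float` is an IEEE 754 32-bit type (true on most current platforms, but an assumption here), every bit pattern decodes to something, although that something may be a subnormal, an infinity, or a NaN. A rough sketch for checking which category a given pattern lands in; the helper names `bits_to_float` and `float_category` are invented for this example:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copy the bits of a 32-bit word into a float (memcpy avoids aliasing problems).
float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Name the floating-point category that a 32-bit pattern falls into.
const char* float_category(std::uint32_t bits) {
    switch (std::fpclassify(bits_to_float(bits))) {
        case FP_ZERO:      return "zero";
        case FP_SUBNORMAL: return "subnormal";
        case FP_NORMAL:    return "normal";
        case FP_INFINITE:  return "infinite";
        case FP_NAN:       return "nan";
        default:           return "other";  // implementation-defined categories
    }
}
```

A pattern that comes back as "subnormal" or "nan" is legal as a float, just an unusual one to find in real data, which is the kind of hint a heuristic can use.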

For example, let's take a series of 4-byte raw values and print each one first as an integer and then as a float:

1           1.4013e-45
10          1.4013e-44
44          6.16571e-44
5000        7.00649e-42
1024        1.43493e-42
0           0
0           0
-5          -nan
11          1.54143e-44

These are obviously integers: read as floats, they are all zero, vanishingly small, or NaN.
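For reference, a dump like the one above can be produced roughly as follows. This is a minimal sketch: it assumes `float` is 32 bits wide and that the chunks are stored in the machine's native byte order, and the names `chunk_as_int`, `chunk_as_float`, and `dump_chunks` are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

// Read the 4-byte chunk at index i as a signed integer (native byte order).
std::int32_t chunk_as_int(const unsigned char* data, std::size_t i) {
    std::int32_t v;
    std::memcpy(&v, data + 4 * i, sizeof v);
    return v;
}

// Read the same 4-byte chunk as a float (native byte order).
float chunk_as_float(const unsigned char* data, std::size_t i) {
    float f;
    std::memcpy(&f, data + 4 * i, sizeof f);
    return f;
}

// Print every chunk both ways, one chunk per line: integer first, then float.
void dump_chunks(const unsigned char* data, std::size_t chunk_count) {
    for (std::size_t i = 0; i < chunk_count; ++i) {
        std::printf("%-11ld %g\n",
                    static_cast<long>(chunk_as_int(data, i)),
                    static_cast<double>(chunk_as_float(data, i)));
    }
}
```

The exact text printed for NaN values (such as `-nan`) depends on the C library, so output may differ slightly between platforms.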

Now, another example:

1065353216  1
1084227584  5
1085276160  5.5
1068149391  1.33333
1083179008  4.5
1120403456  100
0           0
-1110651699 -0.1
1195593728  50000

These are obviously floats: read as integers, they are huge and look arbitrary, while the float values are all sensible.
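The intuition behind both examples can be turned into a simple voting scheme. The following is only one possible starting point, not a definitive answer: the thresholds (`kMaxLikelyInt`, `kMinLikelyFloat`, `kMaxLikelyFloat`) and the names `Guess` and `guess_type` are assumptions chosen for illustration and would need tuning for real data. Each non-zero word votes "integer" if it is small as an integer but implausible as a float, and "float" in the opposite case.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

enum class Guess { Integers, Floats, Unknown };

// Hypothetical thresholds; tune them for the data you expect.
constexpr std::int32_t kMaxLikelyInt = 1 << 24;  // |value| below ~16.7M looks like an integer
constexpr float kMinLikelyFloat = 1e-6f;         // smallest "everyday" float magnitude
constexpr float kMaxLikelyFloat = 1e9f;          // largest "everyday" float magnitude

Guess guess_type(const std::vector<std::uint32_t>& words) {
    int int_votes = 0;
    int float_votes = 0;
    for (std::uint32_t w : words) {
        if (w == 0) continue;  // all-zero bits mean 0 either way: no evidence

        std::int32_t i;
        float f;
        std::memcpy(&i, &w, sizeof i);
        std::memcpy(&f, &w, sizeof f);

        const bool likely_int = i > -kMaxLikelyInt && i < kMaxLikelyInt;
        const float mag = std::fabs(f);
        // NaN and infinity fail isfinite; subnormals fall below kMinLikelyFloat.
        const bool likely_float =
            std::isfinite(f) && mag >= kMinLikelyFloat && mag <= kMaxLikelyFloat;

        if (likely_int && !likely_float) ++int_votes;
        if (likely_float && !likely_int) ++float_votes;
    }
    if (int_votes > float_votes) return Guess::Integers;
    if (float_votes > int_votes) return Guess::Floats;
    return Guess::Unknown;
}
```

With these thresholds the two ranges cannot overlap: a small integer always has a float exponent field of 0, 1, 254, or 255, which puts the float reading far outside the "everyday" range. Large integers (IDs, hashes, timestamps) or very small or very large floats would defeat this scheme, so it returns `Unknown` rather than guessing when the votes tie.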

PS: I'm using C++, but you can answer in any language, in pseudocode, or just in English.

© Stack Overflow or respective owner
