Float32 to Float16
Posted by Goz on Stack Overflow
Published on 2010-06-11T21:45:47Z
Can someone explain to me how I convert a 32-bit floating point value to a 16-bit floating point value?
(s = sign, e = exponent, m = mantissa)
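If a 32-bit float is 1s8e23m (IEEE 754 single precision)
And a 16-bit float is 1s5e10m (IEEE 754 half precision)
Then is it as simple as doing this?
unsigned int fltInt32;     /* raw bits of the 32-bit float */
unsigned short fltInt16;   /* raw bits of the 16-bit result */
memcpy( &fltInt32, &flt, sizeof( float ) );          /* flt is the float to convert; needs <string.h> */
fltInt16 = (fltInt32 & 0x007FFFFF) >> 13;            /* keep the top 10 of the 23 mantissa bits */
fltInt16 |= ((fltInt32 & 0x7F800000) >> 26) << 10;   /* keep the top 5 of the 8 exponent bits */
fltInt16 |= ((fltInt32 & 0x80000000) >> 16);         /* move the sign bit from bit 31 to bit 15 */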
I'm assuming it ISN'T that simple ... so can anyone tell me what you DO need to do?
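For reference, here is a minimal sketch of what the conversion might look like, assuming both formats are IEEE 754 (binary32 is 1s8e23m with exponent bias 127, binary16 is 1s5e10m with bias 15). The function name float_to_half is hypothetical, and this is one possible approach rather than a definitive implementation. The main differences from plain bit truncation are that the exponent is re-biased instead of cut down, and that overflow, subnormals, Inf/NaN and rounding (round to nearest, ties to even) each need their own handling.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the IEEE 754 binary16 bit pattern for f. */
uint16_t float_to_half(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);                 /* read the raw binary32 bits */

    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t exp  = (x >> 23) & 0xFFu;        /* 8-bit biased exponent */
    uint32_t man  = x & 0x007FFFFFu;          /* 23-bit mantissa */

    if (exp == 0xFFu) {
        /* Inf stays Inf; NaN keeps a nonzero mantissa so it stays NaN */
        return (uint16_t)(sign | 0x7C00u | (man ? (0x0200u | (man >> 13)) : 0u));
    }

    int32_t e = (int32_t)exp - 127 + 15;      /* re-bias the exponent */

    if (e >= 31) {
        return (uint16_t)(sign | 0x7C00u);    /* too large: infinity */
    }

    if (e <= 0) {
        /* Result is a half-precision subnormal, or zero */
        if (e < -10) {
            return sign;                      /* too small: signed zero */
        }
        man |= 0x00800000u;                   /* restore the implicit leading 1 */
        uint32_t shift   = (uint32_t)(14 - e);          /* 14..24 */
        uint32_t h       = man >> shift;
        uint32_t rem     = man & ((1u << shift) - 1u);  /* discarded bits */
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (h & 1u))) {
            h++;                              /* round to nearest, ties to even */
        }
        return (uint16_t)(sign | h);
    }

    /* Normal number: pack the exponent and the top 10 mantissa bits */
    uint32_t h   = ((uint32_t)e << 10) | (man >> 13);
    uint32_t rem = man & 0x1FFFu;             /* the 13 discarded bits */
    if (rem > 0x1000u || (rem == 0x1000u && (h & 1u))) {
        h++;  /* a carry into the exponent is correct, including up to Inf */
    }
    return (uint16_t)(sign | h);
}
```

With this sketch, float_to_half(1.0f) gives 0x3C00 and float_to_half(-2.0f) gives 0xC000. Values too large for half precision come back as infinity (0x7C00 plus the sign bit).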