The sign of zero with float2
- by JackOLantern
Consider the following code, which performs complex-number arithmetic using plain C/C++ float variables for the real and imaginary parts:
float real_part = log(3.f);
float imag_part = 0.f;
float real_part2 = (imag_part)*(imag_part)-(real_part*real_part);
float imag_part2 = (imag_part)*(real_part)+(real_part*imag_part);
The result will be
real_part2= -1.20695 imag_part2= 0
angle= 3.14159
where angle is the phase of the complex number and, in this case, is pi.
Now consider the following code:
float real_part = log(3.f);
float imag_part = 0.f;
float real_part2 = (-imag_part)*(-imag_part)-(real_part)*(real_part);
float imag_part2 = (-imag_part)*(real_part)+(real_part)*(-imag_part);
The result will be
real_part2= -1.20695 imag_part2= 0
angle= -3.14159
The imaginary part of the result is now -0 (negative zero, even though it is displayed as 0 above), which makes the phase of the result -pi instead of pi.
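The two cases can be reproduced side by side with the sketch below. It is an illustration only: the Pair struct and the square_pair helper are hypothetical names that do not appear in the code above, and the phase is assumed to be computed with std::atan2, which the post does not state explicitly. std::signbit is used because a comparison such as x == 0 cannot distinguish -0 from +0. IEEE 754 arithmetic is assumed; aggressive floating-point compiler options may not preserve the sign of zero.

```cpp
#include <cassert>  // assert, used only by the checks in the note below
#include <cmath>    // std::log, std::atan2, std::signbit

// Hypothetical helper type: a bare real/imaginary pair (stand-in for float2).
struct Pair {
    float re;
    float im;
};

// Same arithmetic as the snippets above: squares the complex number a + i*b.
// The first snippet uses a = imag_part, the second uses a = -imag_part,
// with b = real_part in both.
inline Pair square_pair(float a, float b) {
    Pair out;
    out.re = a * a - b * b;
    out.im = a * b + b * a;  // (+0)+(+0) gives +0, (-0)+(-0) gives -0
    return out;
}

// Phase of the pair, assuming it is obtained via atan2(imag, real).
inline float phase(const Pair& z) {
    return std::atan2(z.im, z.re);
}

// True when the imaginary part is a negative zero.
inline bool has_negative_zero_imag(const Pair& z) {
    return z.im == 0.0f && std::signbit(z.im);
}
```

With these definitions, has_negative_zero_imag(square_pair(-0.0f, std::log(3.0f))) is true and its phase is close to -pi, while square_pair(0.0f, std::log(3.0f)) has a positive zero imaginary part and a phase close to +pi.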
This behavior is still consistent with the definition of the principal argument of a complex number and with the signed zero of floating-point arithmetic, but the change is a problem when defining functions of complex numbers. For example, if the square root of a complex number is defined through de Moivre's formula, the sign of the imaginary part of the result flips, giving the wrong value.
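To make the square-root case concrete, here is a minimal sketch of a de Moivre-style square root, sqrt(z) = sqrt(|z|) * (cos(theta/2) + i*sin(theta/2)) with theta = atan2(imag, real). The Cplx struct and the sqrt_de_moivre function are hypothetical names introduced for illustration, not the actual implementation in question. On the negative real axis the result depends entirely on the sign of the zero imaginary part.

```cpp
#include <cassert>  // assert, used only by the checks in the note below
#include <cmath>    // std::atan2, std::hypot, std::sqrt, std::cos, std::sin

// Hypothetical helper type: a bare real/imaginary pair (stand-in for float2).
struct Cplx {
    float re;
    float im;
};

// Square root via de Moivre's formula (illustrative sketch).
inline Cplx sqrt_de_moivre(const Cplx& z) {
    const float modulus = std::hypot(z.re, z.im);
    const float theta   = std::atan2(z.im, z.re);  // +pi for (x<0, +0), -pi for (x<0, -0)
    const float root    = std::sqrt(modulus);
    Cplx out;
    out.re = root * std::cos(0.5f * theta);
    out.im = root * std::sin(0.5f * theta);  // sign follows the sign of theta
    return out;
}
```

For the values used above, sqrt_de_moivre(Cplx{-1.20695f, 0.0f}) has an imaginary part of about +1.0986, whereas sqrt_de_moivre(Cplx{-1.20695f, -0.0f}) has an imaginary part of about -1.0986, although the two inputs compare equal.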
How can I deal with this effect?