Random loss of precision in Python readline()
Posted by jackyouldon on Stack Overflow
Published on 2010-06-09T13:45:35Z
Hi all,
We have a process that takes a very large CSV file (1.6GB) and splits it into pieces (three in this case). It runs nightly and normally gives us no problems. When it ran last night, however, the first of the output files had lost precision in its numeric fields. The key lines of the script are:
    while lineCounter <= chunk:
        oOutFile.write(oInFile.readline())
        lineCounter = lineCounter + 1
and the normal output might be something like
StringField1; StringField2; StringField3; StringField4; 1000000; StringField5; 0.000054454
etc.
On this one occasion, and in this one output file only, the numeric fields were all written with exactly six decimal places, i.e.
StringField1; StringField2; StringField3; StringField4; 1000000.000000; StringField5; 0.000000
We are using Python 2.6 (and don't want to upgrade unless we really have to), but we can't afford to lose this data. Does anyone have any idea why this might have happened? If readline() is doing some kind of implicit conversion, is there a way to do a binary read instead? We really just want this data to pass through untouched.
It is very weird to us that this affected only one of the output files generated by the same script, and that when the script was rerun the output was as expected.
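For reference, here is a minimal sketch of the kind of binary pass-through we have in mind. It is not our production script: the function name split_lines_binary and its parameters are hypothetical, and it assumes that opening both files in binary mode ("rb"/"wb") is enough to rule out any translation by the file layer. It also assumes each piece should hold at most `chunk` lines, and any lines left over after the last output file are simply not copied.

```python
import itertools


def split_lines_binary(in_path, out_paths, chunk):
    """Copy up to `chunk` lines from in_path into each output file in turn.

    Hypothetical helper for illustration only. Both files are opened in
    binary mode, so every line is written exactly as it was read.
    """
    with open(in_path, "rb") as in_file:
        for out_path in out_paths:
            with open(out_path, "wb") as out_file:
                # islice stops after `chunk` lines or at end of input,
                # whichever comes first.
                for line in itertools.islice(in_file, chunk):
                    out_file.write(line)
```

Called as split_lines_binary("big.csv", ["part1.csv", "part2.csv", "part3.csv"], chunk), where the file names are placeholders, this should leave the numeric fields byte-for-byte identical to the input, since nothing in the loop parses or reformats the line.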
Thanks,
Jack
© Stack Overflow or respective owner