Random loss of precision in Python readline()

Posted by jackyouldon on Stack Overflow
Published on 2010-06-09T13:45:35Z Indexed on 2010/06/09 13:52 UTC

Hi all,

We have a process which takes a very large CSV file (1.6GB) and breaks it down into pieces (in this case 3). This runs nightly and normally doesn't give us any problems. When it ran last night, however, the first of the output files had lost precision on the numeric fields in the data. The key part of the script is these lines:

    while lineCounter <= chunk:
        oOutFile.write(oInFile.readline())
        lineCounter = lineCounter + 1

and the normal output might be something like

StringField1; StringField2; StringField3; StringField4; 1000000; StringField5; 0.000054454

etc.

On this one occasion, and in this one output file only, the numeric fields were all output with six zeros after the decimal point, i.e.

StringField1; StringField2; StringField3; StringField4; 1000000.000000; StringField5; 0.000000

We are using Python v2.6 (and don't want to upgrade unless we really have to), but we can't afford to lose this data. Does anyone have any idea why this might have happened? If readline() is doing some kind of implicit conversion, is there a way to do a binary read? We really just want this data to pass through untouched.
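
For reference, the kind of binary pass-through asked about could be sketched roughly as below. This is only a sketch under assumptions: `copy_chunk` and its parameter names are hypothetical and not part of the original script, and it assumes both files were opened in binary mode ("rb" for the input, "wb" for the output). Note that readline() returns the characters of the line as they are and does not parse numeric fields in either mode, so this may not address whatever actually caused the problem.

```python
def copy_chunk(in_file, out_file, chunk):
    """Copy up to `chunk` lines from in_file to out_file unchanged.

    Both arguments are assumed to be file objects opened in binary
    mode, so each line is written exactly as the bytes that were read.
    Returns the number of lines actually copied.
    """
    written = 0
    while written < chunk:
        line = in_file.readline()
        if not line:
            # readline() returns an empty result at end of file
            break
        out_file.write(line)
        written += 1
    return written
```

Calling it once per output file on the same open input file continues from where the previous chunk stopped, and a return value smaller than `chunk` indicates that the input ran out.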

It is very weird to us that this only affected one of the output files generated by the same script, and that when it was rerun the output was as expected.

Thanks

Jack

© Stack Overflow or respective owner
