Fastest way to convert a file from latin1 to utf-8 in Python
Posted by xsaero00 on Stack Overflow
Published on 2010-03-08T21:22:24Z
python
I need the fastest way to convert files from latin1 to utf-8 in Python. The files are large, about 2 GB each (I am moving DB data). So far I have
import codecs

# tmpfile is the latin1 source path, tmpfile1 the utf-8 destination path
infile = codecs.open(tmpfile, 'r', encoding='latin1')
outfile = codecs.open(tmpfile1, 'w', encoding='utf-8')
for line in infile:
    outfile.write(line)
infile.close()
outfile.close()
but it is still slow. The conversion takes a quarter of the whole migration time.
I could also use a Linux command-line utility if it is faster than native Python code.
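One variation that may be worth timing against the loop above is to skip line iteration and re-encode the file in large binary chunks. This is only a sketch, not a measured result: the function name convert_latin1_to_utf8 and the 16 MB default chunk size are assumptions made for illustration. It relies on latin1 being a single-byte encoding, so a chunk boundary can never split a character.

```python
def convert_latin1_to_utf8(src_path, dst_path, chunk_size=16 * 1024 * 1024):
    """Re-encode a latin1 file as utf-8, reading large binary chunks.

    The name and the default chunk_size are illustrative assumptions.
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of file
            # latin1 maps every byte to exactly one character, so decoding
            # an arbitrary chunk is safe regardless of where it was cut.
            dst.write(chunk.decode('latin1').encode('utf-8'))
```

With the variables from the question this would be called as convert_latin1_to_utf8(tmpfile, tmpfile1). On the command-line side, iconv -f ISO-8859-1 -t UTF-8 tmpfile > tmpfile1 performs the same conversion outside Python; whether either option is actually faster on this data is something to measure.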
© Stack Overflow or respective owner