Python file iterator over a binary file with newer idiom.

Posted by drewk on Stack Overflow
Published on 2010-12-30T21:43:33Z

In Python, for a binary file, I can write this:

buf_size = 1024 * 64           # this is an important size...
with open(file, "rb") as f:
    while True:
        data = f.read(buf_size)
        if not data:
            break
        # deal with the data....

With a text file that I want to read line-by-line, I can write this:

with open(file, "r") as f:
    for line in f:
        pass  # deal with each line....

Which is roughly equivalent to:

with open(file, "r") as f:
    for line in iter(f.readline, ""):
        pass  # deal with each line....

This idiom is documented in PEP 234, but I have not found a similar idiom for binary files.

I have tried this:

>>> with open('dups.txt','rb') as f:
...    for chunk in iter(f.read,''):
...       i+=1

>>> i
1                # 30 MB file, i==1 means read in one go...

I tried putting iter(f.read(buf_size),'') but that fails with a TypeError rather than doing what I want: the parens call f.read right away, so iter() receives the returned string instead of a callable.
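
For reference, one way to give iter() a callable that still carries the buffer size is to wrap f.read with functools.partial. This is only a minimal sketch, assuming Python 3 (where a file opened with "rb" returns bytes, so the sentinel is b"" rather than ''); the helper name count_chunks and the in-memory io.BytesIO stand-in for a real file are made up for illustration.

```python
import io
from functools import partial

buf_size = 1024 * 64  # same chunk size as in the question


def count_chunks(f, size=buf_size):
    """Count the fixed-size chunks read from a binary file object."""
    # partial(f.read, size) is a zero-argument callable, which is what the
    # two-argument form of iter() expects; b"" is the end-of-file sentinel.
    return sum(1 for _ in iter(partial(f.read, size), b""))


# io.BytesIO stands in for a real file opened with mode "rb".
demo = io.BytesIO(b"x" * (buf_size * 2 + 10))
print(count_chunks(demo))  # prints 3: two full chunks plus a 10-byte tail
```

A lambda such as lambda: f.read(size) would serve the same purpose as partial here.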

I know I could write a function, but is there a way to use the default idiom, for chunk in file:, with a fixed buffer size instead of line-oriented reads?
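
The "write a function" fallback mentioned above might look something like the generator below. It is a sketch only, assuming Python 3; the name read_in_chunks is hypothetical and io.BytesIO again stands in for a file opened with "rb".

```python
import io


def read_in_chunks(f, size=1024 * 64):
    """Yield successive chunks of at most `size` bytes from a binary file object."""
    while True:
        data = f.read(size)
        if not data:  # an empty bytes object signals end of file
            return
        yield data


# The generator can then be used with the usual for-loop idiom.
chunks = list(read_in_chunks(io.BytesIO(b"abcdefghij"), size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

This keeps the call site as a plain for chunk in read_in_chunks(f): loop, at the cost of a small helper.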

Thanks for putting up with the Python newbie trying to write his first non-trivial and idiomatic Python script.
