Reasonably faster way to traverse a directory tree in Python?
Posted
by Sridhar Ratnakumar
on Stack Overflow
See other posts from Stack Overflow
or by Sridhar Ratnakumar
Published on 2010-04-22T23:04:59Z
Indexed on
2010/04/22
23:43 UTC
Read the original article
Hit count: 244
Assuming that the given directory tree is of reasonable size: say an open source project like Twisted or Python, what is the fastest way to traverse and iterate over the absolute path of all files/directories inside that directory?
I want to do this from within Python (subprocess
is allowed). os.path.walk is slow. So I tried ls -lR and tree -fi. For a project with about 8337 files (including tmp, pyc, test, .svn files):
$ time tree -fi > /dev/null
real 0m0.170s
user 0m0.044s
sys 0m0.123s
$ time ls -lR > /dev/null
real 0m0.292s
user 0m0.138s
sys 0m0.152s
$ time find . > /dev/null
real 0m0.074s
user 0m0.017s
sys 0m0.056s
$
tree
appears to be faster than ls -lR
(though ls -R
is faster than tree
, but it does not give full paths). find
is the fastest.
Can anyone think of a faster and/or better approach? On Windows, I may simply ship a 32-bit binary tree.exe or ls.exe if necessary.
Update 1: Added find
© Stack Overflow or respective owner