Performance of file operations on thousands of files on NTFS vs HFS, ext3, others
Posted by peterjmag on Super User on 2011-06-26.
[Crossposted from my Ask HN post. Feel free to close it if the question's too broad for superuser.]
This is something I've been curious about for years, but I've never found any good discussions on the topic. Of course, my Google-fu might just be failing me...
I often deal with projects involving thousands of relatively small files, which means I'm frequently performing operations on all of those files or on a large subset of them: copying the project folder elsewhere, deleting a bunch of temporary files, and so on. Across all the machines I've worked with over the years, I've noticed that NTFS handles these tasks consistently more slowly than HFS on a Mac or ext3/ext4 on a Linux box. As far as I can tell, though, the raw throughput on NTFS isn't actually slower (at least not significantly); rather, the per-file delay is just a tiny bit longer, and that little delay really adds up across thousands of files.
(Side note: From what I've read, this is one of the reasons git is such a pain on Windows, since it relies so heavily on the file system for its object database.)
Granted, my evidence is merely anecdotal—I don't currently have any real performance numbers, but it's something that I'd love to test further (perhaps with a Mac dual-booting into Windows). Still, my geekiness insists that someone out there already has.
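For what it's worth, here's a minimal sketch of the kind of test I have in mind: create, stat, and delete a few thousand small files and report the average per-file time for each phase. The file count, file size, and helper names are arbitrary choices of mine, not measurements, and the temporary directory would need to live on whichever filesystem is being tested.

```python
# Rough per-file overhead benchmark (assumed workload: 5000 x 4 KiB files).
# Run the same script on NTFS, HFS, and ext3/ext4 volumes and compare the
# per-file numbers rather than total throughput.
import os
import time
import tempfile

FILE_COUNT = 5000   # assumed number of small files
FILE_SIZE = 4096    # assumed file size in bytes

def timed_phase(label, func, paths):
    """Apply func to every path and print total and per-file timings."""
    start = time.perf_counter()
    for p in paths:
        func(p)
    elapsed = time.perf_counter() - start
    print(f"{label:>8}: {elapsed:6.2f}s total, "
          f"{elapsed / len(paths) * 1e6:8.1f} us per file")

def main():
    payload = b"x" * FILE_SIZE
    with tempfile.TemporaryDirectory() as workdir:
        paths = [os.path.join(workdir, f"f{i:05d}.tmp")
                 for i in range(FILE_COUNT)]

        def create(path):
            with open(path, "wb") as fh:
                fh.write(payload)

        timed_phase("create", create, paths)
        timed_phase("stat", os.stat, paths)
        timed_phase("delete", os.remove, paths)

if __name__ == "__main__":
    main()
```

Obviously the results would be skewed by things like on-access antivirus scanning on Windows and warm filesystem caches, so any real comparison would need to control for those.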
Can anyone explain this, or perhaps point me in the right direction to research it further myself?