Remote I/O costs with a Content Delivery Network

Posted by x711Li on Server Fault
Published on 2012-08-17T15:19:56Z Indexed on 2012/10/03 9:39 UTC

As far as I know, the time it takes to scan a directory grows with the number of files in that directory, due to I/O costs. Would the administrative cost of placing the files in a hashed directory tree be worth the added efficiency when uploading/downloading files through a CDN API?

For instance, given a filename foo.mp3, the MD5 hash for this is 10ebb1120767e9de166e0f5905077cb1. Thus, storing foo.mp3 in ./10/eb/foo.mp3 would allow for fewer files per directory. (An MD5 digest is normally written in hexadecimal, so two characters per level give 16^2 = 256 root directories with 256 subdirectories each, or 65,536 leaf directories in total. Different files that share a prefix simply land in the same directory, so collisions at this level are harmless.)
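To make the layout concrete, here is a minimal sketch of how such a path could be derived. The function name `hashed_path` and its `levels` and `width` parameters are my own placeholders, not part of any CDN API, and the actual upload call is left out because it depends on the provider.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Return a sharded relative path of the form aa/bb/filename.

    Hypothetical helper: `levels` is the directory depth and `width`
    is the number of hex characters used for each directory name.
    """
    # MD5 is used here only to spread names evenly, not for security.
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of `width` hex characters each.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, filename)


if __name__ == "__main__":
    # Prints the relative key under which foo.mp3 would be stored.
    print(hashed_path("foo.mp3"))
```

Because the path is a pure function of the filename, the same computation can be used for both upload and download without ever listing a directory.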

Considering that the directories themselves are never listed, would the I/O costs of directory scanning still apply when files are uploaded/downloaded directly by path?

© Server Fault or respective owner
