linux: accessing thousands of files in hash of directories
- by 130490868091234
I would like to know what is the most efficient way of concurrently accessing thousands of files of a similar size in a modern Linux cluster of computers. I am carrying an indexing operation in each of these files, so the 4 index files, about 5-10x smaller than the data file, are produced next to the file to index. Right now I am using a hierarchy of directories from ./00/00/00 to ./99/99/99 and I place 1 file at the end of each directory, like ./00/00/00/file000000.ext to ./00/00/00/file999999.ext. It seems to work better than having thousands of files in the same directory but I would like to know if there is a better way of laying out the files to improve access.