Testing disk write performance
- by Montecristo
I'm writing an application that stores lots of images (each under 5 MB) on an ext3 filesystem; this is what I have so far. After some searching here on Server Fault I have decided on a directory structure like this:
000/000/000000001.jpg
...
236/519/236519107.jpg
This structure will allow me to save up to 1'000'000'000 images, as I'll store at most 1'000 images in each leaf directory.
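To make the mapping concrete, here is a minimal sketch of how I intend to turn an image ID into a path (the function name, root path and zero-padding are my own assumptions, not existing code):

import os

def image_path(image_id: int, root: str = "/srv/images") -> str:
    # Zero-pad the ID to nine digits, then use the first two groups of three
    # digits as directories, e.g. 236519107 -> 236/519/236519107.jpg.
    padded = f"{image_id:09d}"
    return os.path.join(root, padded[0:3], padded[3:6], padded + ".jpg")

print(image_path(1))          # /srv/images/000/000/000000001.jpg
print(image_path(236519107))  # /srv/images/236/519/236519107.jpg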
I've created the structure and, from a theoretical point of view, it seems fine to me (though I have no practical experience with this), but I want to find out what will happen once the directories fill up with files.
A question about creating this structure: is it better to create it all in one go (it takes approximately 50 minutes on my PC), or should I create directories only as they are needed? From a developer's point of view I think the first option is better (no extra waiting time for the user), but from a sysadmin's point of view, is this ok?
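For the second option, I'm assuming on-demand creation would look roughly like this (os.makedirs with exist_ok=True is a no-op when the leaf already exists, so it can run on every save; the names and root path are placeholders):

import os
import shutil

def save_image(src_file: str, image_id: int, root: str = "/srv/images") -> str:
    padded = f"{image_id:09d}"
    leaf = os.path.join(root, padded[0:3], padded[3:6])
    os.makedirs(leaf, exist_ok=True)      # create 236/519/ only when first needed
    dest = os.path.join(leaf, padded + ".jpg")
    shutil.copyfile(src_file, dest)
    return dest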
I thought I could simulate the filesystem already being under the running application: I'll write a script that saves images as fast as it can (roughly as sketched after the questions below), monitoring the following:
How much time does it take to save an image when the filesystem is empty or nearly empty?
How does this change as the space is used up?
How much time does it take to read an image from a random leaf? Does this change much when there are lots of files?
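Here is a rough sketch of the measurement script I have in mind (the root path, the ~1 MB dummy payload and the run length are placeholders I picked for illustration):

import os
import random
import time

ROOT = "/srv/images"            # assumed test mount point
PAYLOAD = os.urandom(1 << 20)   # ~1 MB of dummy image data

def leaf_and_name(image_id: int) -> tuple[str, str]:
    padded = f"{image_id:09d}"
    return os.path.join(ROOT, padded[0:3], padded[3:6]), padded + ".jpg"

def timed_write(image_id: int) -> float:
    leaf, name = leaf_and_name(image_id)
    os.makedirs(leaf, exist_ok=True)
    start = time.perf_counter()
    with open(os.path.join(leaf, name), "wb") as f:
        f.write(PAYLOAD)
        f.flush()
        os.fsync(f.fileno())    # include the actual flush to disk in the timing
    return time.perf_counter() - start

def timed_read(image_id: int) -> float:
    leaf, name = leaf_and_name(image_id)
    start = time.perf_counter()
    with open(os.path.join(leaf, name), "rb") as f:
        f.read()
    return time.perf_counter() - start

if __name__ == "__main__":
    written = []
    for i in range(1, 10_001):                   # small run; scale up for a real test
        w = timed_write(i)
        written.append(i)
        r = timed_read(random.choice(written))   # read back a random earlier image
        if i % 1000 == 0:
            print(f"{i:>7} images: write {w * 1000:.2f} ms, read {r * 1000:.2f} ms")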
Does running this command
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
make any sense at all? Is it the only thing I need to do to get a clean start when I want to run my tests over again?
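If the answer is that dropping the page cache does make sense, I'm assuming I could call it from the test script between runs like this (it has to run as root):

import os

def drop_caches() -> None:
    os.sync()                                          # flush dirty pages first
    with open("/proc/sys/vm/drop_caches", "w") as f:   # requires root
        f.write("3\n")                                 # drop page cache, dentries and inodes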
Do you have any suggestions or corrections?