Disk fragmentation when dealing with many small files
- by Zorlack
On a daily basis we generate about 3.4 million small JPEG files. We also delete about 3.4 million 90-day-old images. To date, we've dealt with this content by storing the images in a hierarchical manner. The hierarchy is something like this:
/Year/Month/Day/Source/
This hierarchy allows us to efficiently delete a whole day's worth of content across all sources.
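To make the layout concrete, here is a rough Python sketch of how a path in this hierarchy could be built, and how the day directory that has just aged out could be located. The root `/images`, the source name `cam01`, the zero-padded month and day, and the function names are all assumptions for illustration; only the Year/Month/Day/Source order and the 90-day retention come from the description above.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

RETENTION_DAYS = 90  # from the post: images are deleted once they are 90 days old


def day_dir(root, day):
    """Directory holding every source for one day: root/Year/Month/Day.

    Zero-padding of month and day is an assumption, not stated in the post.
    """
    return PurePosixPath(root) / f"{day.year:04d}" / f"{day.month:02d}" / f"{day.day:02d}"


def image_dir(root, day, source):
    """Directory for one source on one day: root/Year/Month/Day/Source."""
    return day_dir(root, day) / source


def expired_day_dir(root, today):
    """Day directory that reaches the retention limit on `today`.

    Removing this single directory drops that day's images for all sources.
    """
    return day_dir(root, today - timedelta(days=RETENTION_DAYS))


if __name__ == "__main__":
    # "/images" and "cam01" are hypothetical names used only for this example.
    print(image_dir("/images", date(2010, 3, 15), "cam01"))
    print(expired_day_dir("/images", date(2010, 3, 31)))
```

The sketch only computes paths. The actual cleanup would be a recursive delete of the expired day directory (for example with `shutil.rmtree`), which is what makes one delete per day possible with this layout.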
The files are stored on a Windows 2003 server connected to a 14-disk SATA RAID6.
We've started having significant performance issues when writing to and reading from the disks.
This may be due to the performance of the hardware, but I suspect that disk fragmentation may be a culprit as well.
Some people have recommended storing the data in a database, but I've been hesitant to do this. Another thought was to use some sort of container file, like a VHD or something.
Does anyone have any advice for mitigating this kind of fragmentation?