I'm having trouble getting the maximum throughput out of my setup. The hardware is as follows:
dual Quad-Core AMD Opteron(tm) Processor 2376
16 GB DDR2 ECC RAM
dual Adaptec 52245 RAID controllers
48 × 1 TB SATA drives set up as 2 RAID-6 arrays (256 KB stripe) + spares.
Software:
Plain vanilla 2.6.32.25 kernel, compiled for AMD-64, optimized for NUMA; Debian Lenny userland.
benchmarks run: disktest, bonnie++, dd, etc. All give the same results; no discrepancy there.
I/O scheduler used: noop. Yeah, no trick here.
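For reference, this is roughly how the scheduler gets set and what a single-array run looks like; the device names are placeholders for the two arrays, not the actual ones on this box:

    # set the noop scheduler on each array (example device names)
    echo noop > /sys/block/sda/queue/scheduler
    echo noop > /sys/block/sdb/queue/scheduler

    # sustained sequential write, then read, against one array, bypassing the page cache
    dd if=/dev/zero of=/dev/sda bs=1M count=100000 oflag=direct
    dd if=/dev/sda of=/dev/null bs=1M count=100000 iflag=direct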
Until now I had basically assumed that striping several physical devices (RAID 0) would scale performance roughly linearly. However, that is not the case here:
each RAID array achieves roughly 780 MB/s sustained write and 1 GB/s sustained read.
writing to both RAID arrays simultaneously with two different processes gives 750 + 750 MB/s, and reading from both gives 1 + 1 GB/s.
however, when I stripe the two arrays together, using either mdadm or LVM (setup sketched after this list), performance drops to about 850 MB/s writing and 1.4 GB/s reading, at least 30% less than expected!
running two parallel writer or reader processes against the striped device doesn't improve the figures; in fact, it degrades performance even further.
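In case it matters, the striped device is created more or less like this; device names, chunk size and volume names are illustrative, not copied from my shell history:

    # md RAID-0 across the two hardware RAID-6 arrays
    mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=256 /dev/sda /dev/sdb

    # or the LVM equivalent: 2 stripes, 256 KB stripe size
    pvcreate /dev/sda /dev/sdb
    vgcreate vg0 /dev/sda /dev/sdb
    lvcreate -i 2 -I 256 -l 100%FREE -n striped vg0

Both variants perform identically for me, as noted further down.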
So what's happening here? I have basically ruled out bus and memory contention, because when I run dd against both arrays simultaneously, the aggregate write speed actually reaches 1.5 GB/s and the aggregate read speed tops 2 GB/s.
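That test is simply two dd processes side by side, something like this (placeholder device names again):

    # one writer per RAID-6 array, running concurrently
    dd if=/dev/zero of=/dev/sda bs=1M count=100000 oflag=direct &
    dd if=/dev/zero of=/dev/sdb bs=1M count=100000 oflag=direct &
    wait

    # and the read equivalent
    dd if=/dev/sda of=/dev/null bs=1M count=100000 iflag=direct &
    dd if=/dev/sdb of=/dev/null bs=1M count=100000 iflag=direct &
    wait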
So it's not the PCIe bus, and I suppose it's not the RAM. It's not the filesystem either, because I get exactly the same numbers benchmarking against the raw device as through XFS. And I get exactly the same performance whether I use LVM striping or md striping.
What's wrong? What's preventing a single process from reaching the maximum possible throughput? Is Linux striping defective? What other tests could I run?