How to diagnose storage system scaling problems?
We are currently testing the maximum sequential read throughput of a storage system
(48 disks in total behind two HP P2000 arrays) connected to an HP DL580 G7 running RHEL 5 with 128 GB of memory.
Initial testing has mainly been done by running dd commands like this:
dd if=/dev/mapper/mpath1 of=/dev/null bs=1M count=3000
These are run in parallel, one per disk.
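Concretely, the parallel runs are launched with something along these lines (the mpath device names and the count of 48 are simplified placeholders for our actual LUNs):

# start one dd reader per multipath device, then wait for all of them
for dev in /dev/mapper/mpath{1..48}; do
    dd if="$dev" of=/dev/null bs=1M count=3000 &
done
wait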
However, we have been unable to scale the results from one array (maximum throughput of 1.3 GB/s) to two arrays (almost the same total throughput). Each array is connected to a dedicated host bus adapter, so the HBAs should not be the bottleneck. The disks are currently in a JBOD configuration, so each disk can be addressed directly.
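To double-check the path layout, and to see whether both HBAs are actually moving data during a run, we can use the standard multipath-tools and sysstat commands (device names below are placeholders):

# show which HBA/path each multipath LUN is using
multipath -ll

# watch per-device read throughput (in kB/s) every 5 seconds while the dd jobs run
iostat -xk 5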
I have two questions:
Is running multiple dd commands in parallel really a good way to test maximum read throughput? We have noticed very high SWAPIN % numbers in iotop, which I find hard to explain because the target is /dev/null.
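One variation that might make the numbers easier to interpret is reading with O_DIRECT, so the page cache is taken out of the picture entirely (assuming our dd supports iflag=direct):

dd if=/dev/mapper/mpath1 of=/dev/null bs=1M count=3000 iflag=direct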
How should we proceed in trying to find the reason for the scaling problem? Do you think the server itself is the bottleneck here, or could there be some Linux parameters that we have overlooked?
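As an illustration of the kind of setting I have in mind (device names are placeholders):

# read-ahead of one multipath device, in 512-byte sectors
blockdev --getra /dev/mapper/mpath1

# I/O scheduler of one underlying SCSI disk (sda is a placeholder)
cat /sys/block/sda/queue/scheduler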