Why does unpartitioned Hitachi HDS5C3020 drive start consuming 50% more power 15 minutes after boot?
- by Pro Backup
In a Debian 6.0.6 system there are 74 pieces of 2TB Toshiba DT01ABA200 drives. These drives are identified as Hitachi HDS5C3020BLE630 drives running firmware revision MZ4OAAB0. 64 Drives attached via HP SAS expander cards to an LSI 2008 SAS controller, another 5 drives are connected directly to the mainboard, 4 drives are connected to a Sil based PCI controller and last 1 drive is only powered and has no data cable connected. The controller LSI and Sil card's their onboard BIOS are both disabled and the mpt2sas and sata_sil modules are removed from the Linux debian 2.6.32-5-amd64 #1 SMP Sun Sep 23 10:07:46 UTC 2012 x86_64 GNU/Linux kernel. The mpt2sas module is loaded after boot using a modprobe command in /etc/rc.local. These 74 drives are not partitioned, neither formatted and also not mounted.
The system consumes:
with 0 drives: 70.6 - 70.9 Watt (also 15 minutes after boot);
with 74 drives: 330 - 360 Watt, just after boot (is equivalent to 3.5 - 3.9W per drive in idle state);
with 74 drives: 420 - 466 Watt, each time in the 15th minute of uptime (is equivalent to 4.7 - 5.3W per drive in idle state).
The drive specification lists 4.7W as read/write, and 3.3W as idle power consumption.
The increased power consumption is most likely on the 5V line, because after roughly 1 minute an "over current protection" (OCP) of the power supply (PSU) shuts down the power. The used PSU is a single rail model with an OCP of 122A on the 12V line and 55A on the 5V line.
Regression:
It doesn't matter whether the drive its APM value is set to disabled or 1 (maximum power saving).
The operating system records no read/write activity in /proc/diskstats. The values there are identical (28 read, 0 write operations) as immediately after the modprobe operation.
Can't test what happens when booting into the mainboard it's BIOS - to exclude any OS intervention - because the Super Micro X8SI6-F mainboard running firmware 06/27/12 has a bug that incorrectly reads a +74.0 C CPU sensor temperature as "High" in BIOS mode, and shuts down the power after 1 minute.
What might be causing the drive read/write activity on all drives in the 15th minute after boot and how to prevent it from happening?