External drive hanging, load average through the roof
- by Paul Tomblin
I have an external USB drive, and I run an hourly rsync to it as a backup. This has been working fine for years. This weekend, I got two new 2 TB internal drives and decided it was time to re-install Ubuntu from scratch to clear out all the old cruft.
About once a day since the re-install, the backup script hangs hard, usually in the "rm -rf" I do before the rsync. By the time I notice the problem, my load average is in the stratosphere and climbing fast (one time it was over 150), but anything that doesn't touch the drive seems to run fine. One thing I find suspicious is that something is running a "smartctl" and a "hdparm" command against the USB drive every five minutes, and I can't figure out what is launching them. I'm pretty sure smartctl isn't supposed to run on external drives. Here's part of the "ps auwwfx" output while it's hung; every one of these processes is stuck in state D (uninterruptible sleep):
root 7310 0.0 0.0 4248 352 ? D 20:15 0:00 /sbin/hdparm -C /dev/sdd
root 7808 0.0 0.0 17372 1632 ? D 20:15 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 8427 0.0 0.0 4248 356 ? D 20:20 0:00 /sbin/hdparm -C /dev/sdd
root 8925 0.0 0.0 17372 1628 ? D 20:20 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 9529 0.0 0.0 4248 356 ? D 20:25 0:00 /sbin/hdparm -C /dev/sdd
root 10026 0.0 0.0 17372 1628 ? D 20:25 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 10655 0.0 0.0 4248 356 ? D 20:30 0:00 /sbin/hdparm -C /dev/sdd
root 11151 0.0 0.0 17372 1632 ? D 20:30 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 11774 0.0 0.0 4248 356 ? D 20:35 0:00 /sbin/hdparm -C /dev/sdd
root 12271 0.0 0.0 17372 1628 ? D 20:35 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 12878 0.0 0.0 4248 352 ? D 20:40 0:00 /sbin/hdparm -C /dev/sdd
root 13374 0.0 0.0 17372 1632 ? D 20:40 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 14011 0.0 0.0 4248 352 ? D 20:45 0:00 /sbin/hdparm -C /dev/sdd
root 14507 0.0 0.0 17372 1628 ? D 20:45 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 15116 0.0 0.0 4248 352 ? D 20:50 0:00 /sbin/hdparm -C /dev/sdd
root 15612 0.0 0.0 17372 1632 ? D 20:50 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 16223 0.0 0.0 4248 352 ? D 20:55 0:00 /sbin/hdparm -C /dev/sdd
root 16734 0.0 0.0 17372 1632 ? D 20:55 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 17345 0.0 0.0 4248 352 ? D 21:00 0:00 /sbin/hdparm -C /dev/sdd
root 17842 0.0 0.0 17372 1628 ? D 21:00 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 18463 0.0 0.0 4248 352 ? D 21:05 0:00 /sbin/hdparm -C /dev/sdd
root 18960 0.0 0.0 17372 1628 ? D 21:05 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 19598 0.0 0.0 4248 356 ? D 21:10 0:00 /sbin/hdparm -C /dev/sdd
root 20096 0.0 0.0 17372 1628 ? D 21:10 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 21280 0.0 0.0 4244 356 ? D 21:15 0:00 /sbin/hdparm -C /dev/sdd
root 21784 0.0 0.0 17372 1632 ? D 21:15 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 22414 0.0 0.0 4244 356 ? D 21:20 0:00 /sbin/hdparm -C /dev/sdd
root 22912 0.0 0.0 17372 1628 ? D 21:20 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 23541 0.0 0.0 4244 356 ? D 21:25 0:00 /sbin/hdparm -C /dev/sdd
root 24038 0.0 0.0 17372 1632 ? D 21:25 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
root 24658 0.0 0.0 4244 356 ? D 21:30 0:00 /sbin/hdparm -C /dev/sdd
root 25157 0.0 0.0 17372 1628 ? D 21:30 0:00 /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
Why is this happening, and how can I stop it?
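One way to narrow down what is launching these commands is to list every process in state D together with its parent. On Linux, processes in uninterruptible sleep count toward the load average, so a pile-up like the one above would be consistent with the climbing load. The following is only a minimal Python sketch under stated assumptions: it expects the column layout of "ps -eo pid,ppid,stat,args" from procps, and the function names, PID 7300, and "/usr/sbin/some-monitor" in the sample are hypothetical, made up for illustration rather than taken from the hung system.

```python
import subprocess

# Hypothetical sample in the layout of `ps -eo pid,ppid,stat,args`.
# PID 7300 and its command are invented for illustration only.
SAMPLE = """\
  PID  PPID STAT COMMAND
 7300     1 Ss   /usr/sbin/some-monitor
 7310  7300 D    /sbin/hdparm -C /dev/sdd
 7808  7300 D    /usr/sbin/smartctl -a -n standby -A -i /dev/sdd
"""


def blocked_with_parents(ps_text):
    """Return (pid, command, ppid, parent_command) for each D-state process."""
    procs = {}
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)       # pid, ppid, stat, full command line
        if len(fields) < 4:
            continue
        pid, ppid, stat, args = fields
        procs[int(pid)] = (int(ppid), stat, args)

    blocked = []
    for pid, (ppid, stat, args) in procs.items():
        if stat.startswith("D"):           # D = uninterruptible sleep
            parent = procs.get(ppid)
            parent_cmd = parent[2] if parent else "?"
            blocked.append((pid, args, ppid, parent_cmd))
    return blocked


def ps_snapshot():
    """Capture a live process table (assumes procps `ps` is installed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Demonstrate on the sample; on a real system pass ps_snapshot() instead.
for pid, cmd, ppid, parent_cmd in blocked_with_parents(SAMPLE):
    print(f"{pid} {cmd}  <- parent {ppid}: {parent_cmd}")
```

Run against a live snapshot while the drive is hung, the parent column should point at whatever is scheduling the hdparm and smartctl calls. If the parent shows as "?" or PID 1, the launcher has probably already exited, and a different approach would be needed.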