How to get rid of a stubborn 'removed' device in mdadm
Posted by T.J. Crowder on Server Fault
Published on 2013-11-12T17:23:22Z
mdadm
One of my server's drives failed and so I removed the failed drive from all three relevant arrays, had the drive swapped out, and then added the new drive to the arrays. Two of the arrays worked perfectly. The third added the drive back as a spare, and there's an odd "removed" entry in the mdadm details.
I tried both
mdadm /dev/md2 --remove failed
and
mdadm /dev/md2 --remove detached
as suggested here and here, neither of which complained, but neither of which had any effect, either.
Does anyone know how I can get rid of that entry and get the drive added back properly? (Ideally without resyncing a third time; I've already had to do it twice and it takes hours. But if that's what it takes, that's what it takes.) The new drive is /dev/sda, the relevant partition is /dev/sda3.
Here's the detail on the array:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Wed Oct 26 12:27:49 2011
     Raid Level : raid1
     Array Size : 729952192 (696.14 GiB 747.47 GB)
  Used Dev Size : 729952192 (696.14 GiB 747.47 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Nov 12 17:48:53 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

           UUID : 2fdbf68c:d572d905:776c2c25:004bd7b2 (local to host blah)
         Events : 0.34665

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       19        1      active sync   /dev/sdb3

       2       8        3        -      spare   /dev/sda3
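To compare the device table before and after each attempt, a small helper can count the slots by state. This is only a sketch and not part of mdadm: the function name summarize_md_detail is hypothetical, and it assumes the usual `mdadm --detail` layout with one device per line and the state keyword in the fifth column.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdadm --detail` output on stdin and counts
# device-table rows by the first word of their State column.
summarize_md_detail() {
    awk '
        # The device table starts after the "Number Major Minor RaidDevice State" header.
        $1 == "Number" && $2 == "Major" { in_table = 1; next }
        in_table && NF >= 5 {
            if ($5 == "removed")     removed++
            else if ($5 == "active") active++
            else if ($5 == "spare")  spare++
        }
        END { printf "removed=%d active=%d spare=%d\n", removed, active, spare }
    '
}
```

Run as root with `mdadm --detail /dev/md2 | summarize_md_detail`. For the table in this question it would report `removed=1 active=1 spare=1`.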
If it's relevant, it's a 64-bit server. It normally runs Ubuntu, but right now I'm in the data centre's "rescue" OS, which is Debian 7 (wheezy). The "removed" entry was there the last time I was in Ubuntu (it won't, currently, boot from the disk), so I don't think it's some Ubuntu/Debian conflict (and they are, of course, closely related).
Update:
Having done extensive tests with test devices on a local machine, I'm just plain getting anomalous behavior from mdadm with this array. For instance, with /dev/sda3 removed from the array again, I did this:
mdadm /dev/md2 --grow --force --raid-devices=1
And that got rid of the "removed" device, leaving me just with /dev/sdb3. Then I nuked /dev/sda3 (wrote a file system to it, so it didn't have the raid fs anymore), then:
mdadm /dev/md2 --grow --raid-devices=2
...which gave me an array with /dev/sdb3 in slot 0 and "removed" in slot 1 as you'd expect. Then
mdadm /dev/md2 --add /dev/sda3
...added it — as a spare again. (Another 3.5 hours down the drain.)
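Since each attempt costs hours, it can help to check whether the kernel is actually rebuilding onto the newly added device rather than leaving it idle as a spare. The following is a rough sketch that pulls the progress figure for one array out of /proc/mdstat-style text. The function name mdstat_progress is hypothetical, and it assumes the usual "recovery = N%" or "resync = N%" progress line.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/mdstat-style text on stdin.
# $1 is the array name without the /dev/ prefix, for example md2.
mdstat_progress() {
    awk -v md="$1" '
        # An array block starts with a line like "md2 : active raid1 ...".
        $1 == md && $2 == ":" { in_md = 1; next }
        # A blank line ends the block for that array.
        in_md && NF == 0 { in_md = 0 }
        # Within the block, look for the progress line and normalise its spacing.
        in_md && match($0, /(recovery|resync) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)
            gsub(/ *= */, " ", s)
            print s
            exit
        }
    '
}
```

Call it as `mdstat_progress md2 < /proc/mdstat`. It prints nothing when no recovery or resync is running for that array, which is itself a useful signal right after an `--add`.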
So with the rebuilt spare in the array, given that mdadm's man page says
RAID-DEVICES CHANGES
...
When the number of devices is increased, any hot spares that are present will be activated immediately.
...I grew the array to three devices, to try to activate the "spare":
mdadm /dev/md2 --grow --raid-devices=3
What did I get? Two "removed" devices, and the spare. And yet when I do this with a test array, I don't get this behavior.
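For anyone wanting to repeat the comparison on a test array, the sequence above can be sketched against loop devices. Everything here is an assumption for illustration: the scratch array name /dev/md9, the directory /tmp/mdtest, the 64M image size, and the helper names run and md_spare_repro are all hypothetical. It also uses `mdadm --zero-superblock` in place of writing a file system over the partition, and the `mdadm --grow <array>` argument order rather than the order typed above. By default it only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands.
# Set DRY_RUN=0 and run as root to execute them against scratch loop devices.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

md_spare_repro() {
    md=${1:-/dev/md9}        # hypothetical scratch array, must not be in use
    dir=${2:-/tmp/mdtest}    # hypothetical directory for the backing files

    if [ "$DRY_RUN" = 1 ]; then
        a=/dev/loopA; b=/dev/loopB    # placeholders, nothing is attached
    else
        mkdir -p "$dir"
        truncate -s 64M "$dir/a.img" "$dir/b.img"
        a=$(losetup --find --show "$dir/a.img")
        b=$(losetup --find --show "$dir/b.img")
    fi

    # Two-device RAID1 with 0.90 metadata, matching the array in the question.
    run mdadm --create "$md" --run --metadata=0.90 --level=1 --raid-devices=2 "$a" "$b"
    run mdadm --wait "$md"
    # Simulate the failed drive, then shrink to one device to drop the "removed" slot.
    run mdadm "$md" --fail "$a" --remove "$a"
    run mdadm --grow "$md" --force --raid-devices=1
    # Wipe the old RAID metadata, grow back to two devices, and re-add.
    run mdadm --zero-superblock "$a"
    run mdadm --grow "$md" --raid-devices=2
    run mdadm "$md" --add "$a"
    run mdadm --detail "$md"
}
```

Calling `md_spare_repro` with the default DRY_RUN=1 lists the steps for review. After a real run, clean up with `mdadm --stop /dev/md9` and `losetup -d` on the two loop devices.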
So I nuked /dev/sda3 again, used it to create a brand-new array, and am copying the data from the old array to the new one:
rsync -r -t -v --exclude 'lost+found' --progress /mnt/oldarray/* /mnt/newarray
This will, of course, take hours. Hopefully when I'm done, I can stop the old array entirely, nuke /dev/sdb3, and add it to the new array. Hopefully, it won't get added as a spare!
© Server Fault or respective owner