mdadm raid1 fails to resync
Posted
by JuanD
on Server Fault
See other posts from Server Fault
or by JuanD
Published on 2010-06-02T17:23:39Z
Indexed on
2010/06/02
17:34 UTC
Read the original article
Hit count: 187
Hello, I'm trying to solve this problem I'm having with an mdadm raid1.
I have an ubuntu 9.04 server running on a software 2-drive raid1 with mdadm. Yesterday, one of the drives failed, and so I replaced it with a brand new drive of the same size. I removed the faulty drive, copied the partition from the remaining good drive to the new drive and then added it to the raid. It re-synced and the system worked fine, until the drive that hadn't failed, was also labeled failed.
Now I had the raid running solely on the new drive. So I purchased another drive and repeated the procedure above. So now I had 2 brand new drives and the raid was syncing. However, after a few minutes I checked /proc/mdstat and the raid was no longer syncing.
mdadm --detail /dev/md1 shows: (sdb is the first new drive, and sdc is the second new drive)
root@dola:/home/jjaramillo# mdadm --detail /dev/md1 /dev/md1: Version : 00.90 Creation Time : Sat Dec 20 00:42:05 2008 Raid Level : raid1 Array Size : 974711680 (929.56 GiB 998.10 GB) Used Dev Size : 974711680 (929.56 GiB 998.10 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 1 Persistence : Superblock is persistent
Update Time : Wed Jun 2 10:09:35 2010
State : clean, degraded
Active Devices : 1 Working Devices : 2 Failed Devices : 0 Spare Devices : 1
UUID : bba497c6:5029ba0b:bfa4f887:c0dc8f3d
Events : 0.5395594
Number Major Minor RaidDevice State
2 8 35 0 spare rebuilding /dev/sdc3
1 8 19 1 active sync /dev/sdb3
I've tried removing and re-adding the drive a few times, but the same thing happens. The raid fails to resync. I've looked at /var/log/messages, and found the following:
Jun 2 07:57:36 dola kernel: [35708.917337] sd 5:0:0:0: [sdb] Unhandled sense code Jun 2 07:57:36 dola kernel: [35708.917339] sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Jun 2 07:57:36 dola kernel: [35708.917342] sd 5:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor] Jun 2 07:57:36 dola kernel: [35708.917346] Descriptor sense data with sense descriptors (in hex): Jun 2 07:57:36 dola kernel: [35708.917348] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 Jun 2 07:57:36 dola kernel: [35708.917357] 00 43 9e 47 Jun 2 07:57:36 dola kernel: [35708.917360] sd 5:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
So it looks like there's some kind of error on sdb (the first new drive). My question is, what would be the best approach to get the raid up and running again? I've thought about dd'ing the /dev/md1 to a blank hard drive, then re-doing the raid from scratch and loading the data back, but there could be an easier solution..
Any help would be appreciated.
© Server Fault or respective owner