Software RAID 10 corrupted superblock after dual disk failure, how do I recover it?

Posted by Shoshomiga on Server Fault, 2012-12-03.

I have a software RAID 10 with 6 x 2 TB hard drives (RAID 1 for /boot); the OS is Ubuntu 10.04.

I had a RAID controller failure that put 2 drives out of sync and crashed the system. At first the OS wouldn't boot and dropped into initramfs instead, saying the drives were busy, but I eventually managed to bring the RAID up by stopping and re-assembling the array.
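
For reference, the stop-and-reassemble step was along these lines (a sketch from memory; the exact device list may not match what I typed at the time):

mdadm --stop /dev/md1
mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2 /dev/sdf2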

The OS booted up and reported filesystem errors. I chose to ignore them, since the filesystem would be remounted read-only anyway if there was a real problem.

Everything seemed to be working fine and the 2 drives started to rebuild. I was fairly sure it was a SATA controller failure because I had DMA errors in my log files.

The OS crashed soon after that with ext filesystem errors.

Now it won't bring up the RAID; it says there is no superblock on /dev/sda2, even if I assemble manually with all the device names.
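
A superblock check on each member looks roughly like this (the device list just mirrors the partition layout below); /dev/sda2 is the one that comes back without md metadata:

mdadm --examine /dev/sd[abcdef]2     # dump the md superblock of each RAID member
mdadm --examine /dev/sda2            # this one is reported as having no superblock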

I also did a memtest and changed the motherboard in addition to everything else.

EDIT: This is my partition layout

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0009c34a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048      511999      254976   83  Linux
/dev/sdb2          512000  3904980991  1952234496   83  Linux
/dev/sdb3      3904980992  3907028991     1024000   82  Linux swap / Solaris

All 6 disks have the same layout: partition #1 is for the RAID 1 /boot, partition #2 is for the RAID 10 (far layout), and partition #3 is swap, although swap was not enabled on sda.

EDIT2: This is the output of mdadm --detail /dev/md1

Layout : near=1, far=2
Chunk Size : 64k

UUID : a0feff55:2018f8ff:e368bf24:bd0fce41
Events : 0.3112126

Number Major Minor RaidDevice State
0      8     34    0          spare rebuilding /dev/sdc2
1      0     0     1          removed
2      8     18    2          active sync /dev/sdb2
3      8     50    3          active sync /dev/sdd2
4      0     0     4          removed
5      8     82    5          active sync /dev/sdf2

6      8     66    -          spare /dev/sde2

EDIT3: I ran ddrescue and it copied everything from sda except a single 4096-byte sector, which I suspect holds the RAID superblock.
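
The clone was made with GNU ddrescue along these lines (the target disk and mapfile names here are just placeholders):

ddrescue -f /dev/sda /dev/sdg sda.mapfile        # first pass, copy everything readable
ddrescue -f -r3 /dev/sda /dev/sdg sda.mapfile    # retry the remaining bad sector a few times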

EDIT4: Here is some more info that is too long to fit here:

lshw: http://pastebin.com/2eKrh7nF

mdadm --detail /dev/sd[abcdef]1 (raid1): http://pastebin.com/cgMQWerS

mdadm --detail /dev/sd[abcdef]2 (raid10): http://pastebin.com/V5dtcGPF

dumpe2fs of /dev/sda2 (from the ddrescue cloned drive): http://pastebin.com/sp0GYcJG

I tried to recreate md1 based on this info with the command

mdadm --create /dev/md1 -v --assume-clean --level=10 --raid-devices=6 --chunk=64K --layout=f2 /dev/sda2 missing /dev/sdc2 /dev/sdd2 missing /dev/sdf2

But I can't mount it. I also tried to recreate it based on my initial mdadm --detail /dev/md1 output, but it still doesn't mount.

It also warns me that /dev/sda2 contains an ext2fs filesystem, but I guess that's because of the ddrescue clone.
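
For completeness, the checks after a create attempt look roughly like this (read-only, with /mnt as a placeholder mount point, plus a read-only fsck to see what state the filesystem is in):

cat /proc/mdstat               # confirm the array assembled with 4 of 6 members
fsck.ext4 -n /dev/md1          # read-only check, nothing written
mount -o ro /dev/md1 /mnt      # this is the step that fails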
