Raid 1 array won't assemble after power outage. How do I fix this ext4 mirror?

Posted by Forkrul Assail on Super User
Published on 2012-11-25T16:38:28Z Indexed on 2012/11/25 17:06 UTC

Two drives in an mdadm RAID 1 array, formatted as ext4, can no longer be mounted after the power went out for an extended period (the UPS drained).

After I turned the machine back on, mdadm reported the array as degraded. A full resync then took about 2 days and completed without problems.

On trying to remount the array I get:

mount: you must specify the filesystem type

cat /etc/fstab lines relevant to setup:

/dev/md127 /media/mediapool ext4 defaults 0 0

dmesg | tail (on trying to mount) says:

[ 1050.818782] EXT3-fs (md127): error: can't find ext3 filesystem on dev md127.
[ 1050.849214] EXT4-fs (md127): VFS: Can't find ext4 filesystem
[ 1050.944781] FAT-fs (md127): invalid media value (0x00)
[ 1050.944782] FAT-fs (md127): Can't find a valid FAT filesystem
[ 1058.272787] EXT2-fs (md127): error: can't find an ext2 filesystem on dev md127.

cat /proc/mdstat says:

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md127 : active (auto-read-only) raid1 sdj[2] sdi[0]
      2930135360 blocks super 1.2 [2/2] [UU]

unused devices: <none>
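
For anyone skimming the mdstat output above, here is a minimal shell sketch that condenses each array to one line: name, state, and member status. The function name `md_summary` is made up for illustration. It only parses text and touches no devices. Keep in mind that `[2/2] [UU]` only says both members are present and in sync at the md layer; it says nothing about the filesystem stored on top.

```shell
# Hypothetical helper: summarise mdstat-style text read from stdin.
md_summary() {
  awk '
    # Array header line, e.g. "md127 : active (auto-read-only) raid1 sdj[2] sdi[0]"
    /^md[0-9]+ :/ {
      name = $1
      state = $3
      if ($4 ~ /^\(/) state = state " " $4   # keep a qualifier such as (auto-read-only)
    }
    # Following line carries the member counters, e.g. "[2/2] [UU]"
    name != "" && match($0, /\[[0-9]+\/[0-9]+\] \[[U_]+\]/) {
      print name ": " state ", " substr($0, RSTART, RLENGTH)
      name = ""
    }
  '
}

# Sample input copied from the output above.
md_summary <<'EOF'
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md127 : active (auto-read-only) raid1 sdj[2] sdi[0]
      2930135360 blocks super 1.2 [2/2] [UU]
EOF
```

On a live system the same function can be fed with `md_summary < /proc/mdstat`.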

fsck /dev/md127 says:

fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/md127

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
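
A note on the `-b 8193` hint: that block number is the first backup superblock only on a filesystem with 1 KiB blocks. With 4 KiB blocks the first backup normally sits at block 32768. The sketch below lists where backups would be expected, under assumptions the post does not confirm: the filesystem spans the whole array, uses 4 KiB blocks with 32768 blocks per group, and has the `sparse_super` feature (backups in groups 1 and powers of 3, 5 and 7). The function name `backup_superblocks` is made up for illustration, and nothing here reads or writes a device.

```shell
# Hypothetical helper: print expected backup superblock locations.
#   $1 = number of block groups
#   $2 = blocks per group  (default 32768, typical for 4 KiB blocks)
#   $3 = first data block  (default 0; it is 1 for 1 KiB blocks)
backup_superblocks() (
  group_count=$1
  blocks_per_group=${2:-32768}
  first_data_block=${3:-0}
  {
    # sparse_super keeps backups in group 1 and in powers of 3, 5 and 7
    [ "$group_count" -gt 1 ] && echo 1
    for base in 3 5 7; do
      g=$base
      while [ "$g" -lt "$group_count" ]; do
        echo "$g"
        g=$((g * base))
      done
    done
  } | sort -n | while read -r g; do
    echo $((g * blocks_per_group + first_data_block))
  done
)

# Array size from /proc/mdstat is given in 1 KiB blocks.
array_size_kib=2930135360
fs_blocks=$((array_size_kib / 4))                 # assumed 4 KiB filesystem blocks
group_count=$(((fs_blocks + 32767) / 32768))      # round up to whole groups
backup_superblocks "$group_count"
```

Any of the printed numbers could then be tried without writing to the disk, for example `e2fsck -n -b 32768 -B 4096 /dev/md127`, where `-n` opens the filesystem read-only and `-B` gives the block size. If the real block size or group size differs, the numbers above will be wrong.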

mdadm -E /dev/sdi gives me:

/dev/sdi:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394
           Name : sojourn:33
  Creation Time : Sat Nov 10 10:43:52 2012
     Raid Level : raid1
   Raid Devices : 2

Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB)
     Array Size : 2930135360 (2794.39 GiB 3000.46 GB)
  Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3e6e9a4f:6c07ab3d:22d47fce:13cecfd0

    Update Time : Tue Nov 13 20:34:18 2012
       Checksum : f7d10db9 - correct
         Events : 27

   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing)

mdadm -E /dev/sdj gives me:

/dev/sdj:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394
           Name : sojourn:33
  Creation Time : Sat Nov 10 10:43:52 2012
     Raid Level : raid1
   Raid Devices : 2

Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB)
     Array Size : 2930135360 (2794.39 GiB 3000.46 GB)
  Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 7fb84af4:e9295f7b:ede61f27:bec0cb57

    Update Time : Tue Nov 13 20:34:18 2012
       Checksum : b9d17fef - correct
         Events : 27

   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)
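
The sizes in the `mdadm -E` output above are in 512-byte sectors, while the array size in /proc/mdstat is in 1 KiB blocks. A small arithmetic sketch, using only the values printed above, shows how they relate and where the array's data begins on each member disk. The `losetup` line is only printed, never run; whether inspecting a member this way is worthwhile here is an assumption, and the variable names are mine.

```shell
# Values copied from the mdadm -E output above.
sector_bytes=512
data_offset_sectors=262144
used_dev_sectors=5860270720
avail_dev_sectors=5860271016
array_size_kib=2930135360          # from /proc/mdstat

# Where block 0 of /dev/md127 lives on each member disk.
data_offset_bytes=$((data_offset_sectors * sector_bytes))
data_offset_mib=$((data_offset_bytes / 1024 / 1024))

# Two 512-byte sectors per KiB: this should equal the array size.
used_dev_kib=$((used_dev_sectors / 2))

# Sectors available on the member but not used by the array.
spare_sectors=$((avail_dev_sectors - used_dev_sectors))

echo "data starts $data_offset_bytes bytes ($data_offset_mib MiB) into each member"
echo "used per member: $used_dev_kib KiB, array size: $array_size_kib KiB"
echo "unused tail: $spare_sectors sectors"

# Printed for review only: a read-only loop device over the member's data area.
echo "losetup --find --show --read-only --offset $data_offset_bytes /dev/sdi"
```

In other words, /dev/md127 and a raw member such as /dev/sdi do not start at the same place: the array's first block is 128 MiB into the disk.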

dmesg | tail says:

[   61.785866] init: alsa-restore main process (2736) terminated with status 99
[   68.433548] eth0: no IPv6 routers present
[  534.142511] EXT4-fs (sdi): ext4_check_descriptors: Block bitmap for group 0 not in group (block 2838187772)!
[  534.142518] EXT4-fs (sdi): group descriptors corrupted!
[  546.418780] EXT2-fs (sdi): error: couldn't mount because of unsupported optional features (240)
[  549.654127] EXT3-fs (sdi): error: couldn't mount because of unsupported optional features (240)
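
The "(240)" in the EXT2/EXT3 messages above can be decoded. As far as I can tell from the ext2/ext3 driver source, the number is printed in hexadecimal and is the set of incompatible feature flags that the driver does not support; treat both points as assumptions. The sketch below maps such a value to feature names using the flag values from the ext4 on-disk format documentation. The function name `decode_incompat` is made up for illustration.

```shell
# Hypothetical helper: decode a hex incompat-feature mask into feature names.
decode_incompat() (
  mask=$((0x$1))          # the kernel message is assumed to be hexadecimal
  names=""
  while read -r bit name; do
    if [ $((mask & $bit)) -ne 0 ]; then
      names="$names $name"
    fi
  done <<'EOF'
0x1 compression
0x2 filetype
0x4 needs_recovery
0x8 journal_dev
0x10 meta_bg
0x40 extents
0x80 64bit
0x100 mmp
0x200 flex_bg
EOF
  echo "${names# }"
)

decode_incompat 240   # prints: extents flex_bg
```

Under those assumptions, the messages mean the ext2 and ext3 drivers saw ext4-only features (extents and flex_bg) on the device and refused to mount it, which is expected for any ext4 filesystem and is separate from the "group descriptors corrupted" error reported by the ext4 driver.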

Since this is RAID 1, it was suggested that I try to mount or fsck the drives separately. After a long fsck on one drive, the output ended with:

Illegal double indirect block (2298566437) in inode 39717736.  CLEARED.
Illegal block #4231180 (2611866932) in inode 39717736.  CLEARED.
Error storing directory block information (inode=39717736, block=0, num=1092368): Memory allocation failed
Recreate journal? yes

Creating journal (32768 blocks):  Done.

*** journal has been re-created - filesystem is now ext3 again ***

The drive, however, still won't mount. dmesg | tail says:

[  170.674659] md: export_rdev(sdc)
[  170.675152] md: export_rdev(sdc)
[  195.275288] md: export_rdev(sdc)
[  195.275876] md: export_rdev(sdc)
[ 1338.540092] CE: hpet increased min_delta_ns to 30169 nsec
[26125.734105] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987)
[26125.734115] EXT4-fs (sdc): group descriptors corrupted!
[26182.325371] EXT3-fs (sdc): error: couldn't mount because of unsupported optional features (240)
[27083.316519] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987)
[27083.316530] EXT4-fs (sdc): group descriptors corrupted!

Please help me fix this. I never in my wildest nightmares thought a complete mirror would die this badly. Am I missing something? Any suggestions for fixing this? Could someone explain why the array would resync after the power outage, only to seemingly nuke the drive?

Thanks for reading. Any help much appreciated. I've tried everything I can think of, including booting and filesystem checking with SystemRescue and Ubuntu liveboot discs.

© Super User or respective owner