Recovering an mdadm+LVM+ext4 partition with a read error

Posted by bitwelder on Server Fault
Published on 2012-10-03T09:18:05Z Indexed on 2012/10/03 9:38 UTC

One of the disks in my NAS has failed. The NAS runs Linux and uses mdadm + LVM for its filesystems.

I have a backup of most of the contents, but not of the most recent changes, and if possible I'd like to recover those from the failing disk.

The disk (a 1 TB WD10EARS 'green drive') throws errors like these:

Oct  3 12:00:41 kernel: [ 3625.620000] ata5.00: read unc at 9453282
Oct  3 12:00:41 kernel: [ 3625.620000] lba 9453282 start 9453280 end 1953511007 
Oct  3 12:00:41 kernel: [ 3625.620000] sde5 auto_remap 0
Oct  3 12:00:41 kernel: [ 3625.630000] ata5.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6
Oct  3 12:00:41 kernel: [ 3625.630000] ata5.00: edma_err_cause=00000084 pp_flags=00000003, dev error, EDMA self-disable
Oct  3 12:00:41 kernel: [ 3625.640000] ata5.00: failed command: READ FPDMA QUEUED
Oct  3 12:00:41 kernel: [ 3625.650000] ata5.00: cmd 60/40:00:e0:3e:90/00:00:00:00:00/40 tag 0 ncq 32768 in
Oct  3 12:00:41 kernel: [ 3625.650000]          res 41/40:00:e2:3e:90/12:00:00:00:00/40 Emask 0x409 (media error) <F>
Oct  3 12:00:41 kernel: [ 3625.660000] ata5.00: status: { DRDY ERR }

However, while testing with 'dd', I noticed that if I skip the first 4 kB, the read seems to be OK. For example, this command doesn't return any read error:

dd if=/dev/sde5 of=/dev/null bs=4k count=1000 skip=1
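As a rough cross-check of the log against the dd result, the position of the bad sector inside the partition can be worked out with shell arithmetic. This is only a sketch: it assumes that the "start" value in the log is the first sector of sde5 and that the drive reports 512-byte logical sectors, neither of which is confirmed by the log itself.

```shell
# Values copied from the kernel log; their interpretation is an assumption.
bad_lba=9453282      # sector reported by "read unc at"
part_start=9453280   # "start" value, assumed to be the first sector of sde5
sector_size=512      # assumed logical sector size in bytes

# Distance of the bad sector from the start of the partition.
offset_sectors=$(( bad_lba - part_start ))
offset_bytes=$(( offset_sectors * sector_size ))

# Index of the 4 kB block (as dd counts them with bs=4k) holding that offset.
block_4k=$(( offset_bytes / 4096 ))

echo "bad sector is $offset_sectors sectors ($offset_bytes bytes) into the partition"
echo "4 kB block index: $block_4k"
```

Under those assumptions the bad sector is 2 sectors (1024 bytes) into the partition, i.e. inside 4 kB block 0, which would match the observation that reads succeed once that block is skipped.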

Supposing there are no other read failures on the rest of the disk, would I be able to recover this 900 GB partition if I clone it somewhere else, leaving out the first 4 kB? As mentioned above, it's a 'linux raid autodetect' partition that contains an LVM2 volume, which in turn contains an ext4 filesystem.
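To make concrete what "clone everything but the first 4 kB" means, here is a minimal sketch of the copy I have in mind. It runs against a scratch file rather than the real disk; the paths `source.img` and `clone.img` are hypothetical stand-ins, and on the NAS the input would be /dev/sde5. It only shows the dd mechanics and says nothing about whether mdadm or LVM can work without that first block.

```shell
# Sketch only: a scratch file stands in for /dev/sde5 so this is safe to try.
workdir=$(mktemp -d)
src="$workdir/source.img"   # hypothetical stand-in for the failing partition
dst="$workdir/clone.img"    # hypothetical destination image

# Build a 64 KiB "partition" filled with random data.
dd if=/dev/urandom of="$src" bs=4k count=16 2>/dev/null

# Copy everything except the first 4 kB block.
# skip=1 skips one block of input, seek=1 skips one block of output, so each
# copied byte lands at the same offset it had in the source.
dd if="$src" of="$dst" bs=4k skip=1 seek=1 2>/dev/null

# The clone keeps the source's size; the skipped first block is a hole.
src_size=$(( $(wc -c < "$src") ))
dst_size=$(( $(wc -c < "$dst") ))

# Compare both files starting 4096 bytes in (SKIP1 and SKIP2 arguments of cmp).
tail_matches=no
cmp -s "$src" "$dst" 4096 4096 && tail_matches=yes

# Count the non-zero bytes in the first 4 kB of the clone.
first_block_nonzero=$(( $(head -c 4096 "$dst" | tr -d '\0' | wc -c) ))

echo "source: $src_size bytes, clone: $dst_size bytes"
echo "data after the first 4 kB identical: $tail_matches"
echo "non-zero bytes in first 4 kB of clone: $first_block_nonzero"

rm -rf "$workdir"
```

Using seek=1 together with skip=1 keeps all offsets aligned, so the clone differs from the source only in its first 4 kB, which read back as zeros.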

© Server Fault or respective owner
