Search Results

Search found 82 results on 4 pages for 'raid6'.

Page 3/4

  • What benchmark tool to use to benchmark hardware for VM server?

    - by Mark0978
    We are setting up a new piece of hardware to virtualize several of our servers on. The choices are RAID 5, RAID 6, and RAID 0+1. We want to benchmark all three before we go live with the machine, but I'm not sure how to test the speed. Since we will be using it to host VMs, what will the actual disk traffic look like? What can I use to see if RAID 6 is too slow? Short of setting up the system with all the VMs on it, running that way, and then redoing all the work, I'm not sure how to test it. It then becomes more of a subjective test than an objective one. I'm worried that RAID 6 will have too much overhead, that RAID 5 will be too fragile with 3 TB drives, and I've never worked with 0+1 at all. So in short, I'd like to set up the base machine (which will be running Linux) and then test the underlying software RAID for speed. What kind of tool exists to simulate this kind of load? Barring a specific tool, is there a generic filesystem testing tool that will simulate different loads?
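
    A generic load generator such as fio can approximate VM-style disk traffic (many concurrent, small, mixed random reads and writes) against each candidate RAID layout. A rough sketch, assuming fio is installed and the array under test is mounted at /mnt/testarray (a placeholder path); the 4k block size, 70/30 read/write mix and queue depth are guesses at a "typical" VM workload, so adjust them to match your actual guests:

        # Mixed random I/O roughly resembling several busy VMs
        fio --name=vmtest --directory=/mnt/testarray --size=4G \
            --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 \
            --bs=4k --iodepth=32 --numjobs=4 --runtime=120 --time_based \
            --group_reporting

    Running the same job against RAID 5, RAID 6 and RAID 0+1 in turn gives directly comparable IOPS and latency numbers for each layout.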

    Read the article

  • Defeating the RAID5 write hole with ZFS (but not RAID-Z) [closed]

    - by Michael Shick
    I'm setting up a long-term storage system for keeping personal backups and archives. I plan to have RAID5 starting with a relatively small array and adding devices over time to expand storage. I may also want to convert to RAID6 down the road when the array gets large. Linux md is a perfect fit for this use case since it allows both of the changes I want on a live array and performance isn't at all important. Low cost is also great. Now, I also want to defend against file corruption, so it looked like a RAID-Z1 would be a good fit, but evidently I would only be able to add additional RAID5 (RAID-Z1) sets at a time rather than individual drives. I want to be able to add drives one at a time, and I don't want to have to give up another device for parity with every expansion. So at this point, it looks like I'll be using a plain ZFS filesystem on top of an md RAID5 array. That brings me to my primary question: Will ZFS be able to correct or at least detect corruption resulting from the RAID5 write hole? Additionally, any other caveats or advice for such a set up is welcome. I'll probably be using Debian, but I'll definitely be using Linux since I'm familiar with it, so that means only as new a version of ZFS as is available for Linux (via ZFS-FUSE or so).
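
    For the detection question: ZFS checksums every block, so even on a single md-backed vdev it can detect corruption that the RAID5 layer silently returns; it can only repair what it has a redundant copy of, which is where the copies property comes in. A minimal sketch, assuming a ZFS implementation for Linux is installed and the md array is /dev/md0:

        # Single-vdev pool on top of the md RAID5 device
        zpool create tank /dev/md0
        # Optional: keep two copies of each block so ZFS can self-heal some damage
        # (only applies to data written after the property is set, and halves usable space)
        zfs set copies=2 tank
        # Periodically verify every checksum and report problems
        zpool scrub tank
        zpool status -v tank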

    Read the article

  • Why do I get a DegradedArray event with mdadm

    - by azera
    Hello. Just so we're clear on what's happening: I bought 4 new SATA 2 drives, with the intent of using them in a RAID 5. All drives are fully recognised by both my BIOS and my Linux box (Gentoo). I created a RAID 5 array and fiddled a bit with it to understand how it works, how to monitor it, etc. At some point this triggered a DegradedArray event, even though the array is brand new. I tried stopping the array and recreating a new array with the same drives, but the new array starts degraded too. Here is what I used to create it: mdadm --create -l5 -n4 /dev/md/md0-r5 /dev/sdb /dev/sdd /dev/sde /dev/sdf Here is the output from my /proc/mdstat and mdadm --detail --scan: **mdstat** Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md127 : active raid5 sdf[4] sde[2] sdd[1] sdb[0] 4395415488 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_] [>....................] recovery = 2.8% (41689732/1465138496) finish=890.3min speed=26645K/sec unused devices: <none> **detail** ARRAY /dev/md/md0-r5 metadata=0.90 spares=1 UUID=453e2833:81f22a74:64188b84:66721085 As such I have a couple of questions: Does a RAID 5 array always start in degraded mode at first? Why does sdf have the number 4 in brackets instead of 3, why does it see a spare disk, and why is the 4th drive marked with _ instead of U? (Bad configuration?) How can I recreate the array from scratch, and do I have to format each drive on its own before recreating it? Thanks for any help; I'm not sure what I should do at the moment.
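
    For what it's worth, mdadm normally builds a brand-new RAID5 exactly this way: the array starts degraded, the last device is treated as a spare, and the initial parity sync "rebuilds" onto it, which is why mdstat shows [4/3] [UUU_] and a recovery line. A small sketch for confirming that this is all that is happening (device names as in the question):

        # Watch the initial sync; when it finishes the array should read [4/4] [UUUU]
        watch cat /proc/mdstat
        # Per-device roles and the array state while it rebuilds
        mdadm --detail /dev/md/md0-r5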

    Read the article

  • Formula to calculate probability of unrecoverable read error during RAID rebuild

    - by OlafM
    I need to compare the reliability of different RAID systems with either consumer or enterprise drives. The formula for the probability of an unrecoverable read error during a rebuild, ignoring mechanical problems, is simple: error_probability = 1 - (1 - per_bit_error_rate)^bits_read. With 3 TB drives I get a 38% probability of experiencing a URE (unrecoverable read error) for a 2+1 disk RAID5 (4.7% for enterprise drives), 21% for a RAID1 (2.4% for enterprise drives), and a 51% probability of an error during recovery for the 3+1 RAID5 often used in SOHO products like Synology's. Most people don't know about this. Calculating the error for single-disk tolerance is easy; my question concerns systems tolerant to multiple disk failures (RAID6/Z2, RAIDZ3 and RAID1 with multiple disks). If only the first disk is used for the rebuild and the second one is read again from the beginning in case of a URE, then the error probability is the one calculated above, squared (14.5% for consumer RAID5 2+1, 4.5% for consumer RAID1 1+2). However, I suppose (at least in ZFS, which has full checksums!) that the second parity/available disk is read only where needed, meaning that only a few sectors are needed: how many UREs can possibly happen on the first disk? Not many, otherwise the error probability for single-disk-tolerance systems would skyrocket even more than I calculated. If I'm correct, a second parity disk would practically lower the risk to extremely low values. Am I correct?
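
    As a quick numeric check of those figures, the usual approximation 1 - (1 - p)^n ≈ 1 - e^(-pn) can be evaluated with awk; the inputs below (10^-14 errors per bit for consumer drives, 10^-15 for enterprise, and 2 x 3 TB read during a 2+1 RAID5 rebuild) are the ones implied in the question:

        # Consumer drives: p = 1e-14 per bit, bits read = 2 disks * 3e12 bytes * 8
        awk 'BEGIN { p = 1e-14; bits = 2 * 3e12 * 8;
                     printf "URE probability ~ %.1f%%\n", 100 * (1 - exp(-p * bits)) }'
        # prints ~38.1%; with p = 1e-15 (enterprise) the same rebuild gives ~4.7%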

    Read the article

  • Degraded RAID-5 array with lvm2 lost superblock and partition table

    - by Fred Phillips
    I have a RAID-5 array of 4x1TB hard disks with one lvm2 partition on Ubuntu Linux 10.04 LTS. One of the disks has failed. I have re-assembled the array without this failed disk but now mdadm --examine claims the array has no superblock and fdisk says it has no partition table. What can I do to recover the data? # mdadm -D /dev/md0 /dev/md0: Version : 1.2 Creation Time : Sat Mar 5 14:43:49 2011 Raid Level : raid5 Array Size : 2930276352 (2794.53 GiB 3000.60 GB) Used Dev Size : 976758784 (931.51 GiB 1000.20 GB) Raid Devices : 4 Total Devices : 4 Persistence : Superblock is persistent Update Time : Sat Mar 5 15:06:49 2011 State : clean, degraded Active Devices : 3 Working Devices : 3 Failed Devices : 1 Spare Devices : 0 Layout : left-symmetric Chunk Size : 512K Name : boba:1 (local to host boba) UUID : 52eb4bc9:c3d8aab5:e0699505:e0e1aa05 Events : 18 Number Major Minor RaidDevice State 0 8 1 0 active sync /dev/sda1 1 8 65 1 active sync /dev/sde1 2 8 49 2 active sync /dev/sdd1 3 0 0 3 removed 4 8 17 - faulty spare /dev/sdb1 # mdadm --examine /dev/md0 mdadm: No md superblock detected on /dev/md0. # fdisk -l /dev/md0 Disk /dev/md0: 3000.6 GB, 3000602984448 bytes 2 heads, 4 sectors/track, 732569088 cylinders Units = cylinders of 8 * 512 = 4096 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 524288 bytes / 1572864 bytes Disk identifier: 0x00000000 Disk /dev/md0 doesn't contain a valid partition table # cat /proc/mdstat Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] md0 : active raid5 sdb1[4](F) sda1[0] sdd1[2] sde1[1] 2930276352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_] unused devices: <none>
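
    Two details in that output are worth separating from the question of data loss: mdadm --examine reads the superblock of member devices rather than of the assembled array, and an LVM physical volume is not expected to carry a partition table, so both messages can appear on a perfectly healthy degraded array. A sketch of what to check instead, using the member names from the mdadm -D output above:

        # Superblocks live on the members, not on /dev/md0
        mdadm --examine /dev/sda1 /dev/sde1 /dev/sdd1
        # Look for LVM metadata on the array and activate whatever is found
        pvscan
        vgscan
        vgchange -ay
        lvs    # the logical volumes listed here can then be fsck'd and mounted read-only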

    Read the article

  • Alignment of ext3 partition on LVM RAID volume group

    - by John P
    I'm trying to add a partition on an LVM logical volume that resides on a RAID6-backed volume group, and fdisk is complaining about the partition not residing on a physical sector boundary. My question is: how do you calculate the correct starting sector for a partition on an LVM volume? This partition will be formatted ext3. Would it be better to just format the logical volume directly instead of creating a new partition? Disk /dev/dedvol/backup: 2199.0 GB, 2199023255552 bytes 255 heads, 63 sectors/track, 267349 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 1048576 bytes / 8388608 bytes Disk identifier: 0x4e428f49 Device Boot Start End Blocks Id System /dev/dedvol/backup1 63 267349 2146982827+ 83 Linux Partition 1 does not start on physical sector boundary. lvdisplay /dev/dedvol/backup --- Logical volume --- LV Name /dev/dedvol/backup VG Name dedvol LV UUID OV2n5j-7LHb-exJL-t8dI-dU8A-2vxf-uIicCt LV Write Access read/write LV Status available # open 0 LV Size 2.00 TiB Current LE 524288 Segments 1 Allocation inherit Read ahead sectors auto - currently set to 32768 Block device 253:1 vgdisplay dedvol --- Volume group --- VG Name dedvol System ID Format lvm2 Metadata Areas 1 Metadata Sequence No 3 VG Access read/write VG Status resizable MAX LV 0 Cur LV 2 Open LV 1 Max PV 0 Cur PV 1 Act PV 1 VG Size 14.55 TiB PE Size 4.00 MiB Total PE 3815448 Alloc PE / Size 3670016 / 14.00 TiB Free PE / Size 145432 / 568.09 GiB VG UUID 8fBcOk-aXGx-P3Qy-VVpJ-0zK1-fQgy-Cb691J
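
    Given the reported I/O sizes (minimum 1048576 bytes, optimal 8388608 bytes), one option is to skip the partition table entirely and put the filesystem straight on the logical volume, passing stride and stripe-width derived from those values. A sketch, not a recommendation; the arithmetic assumes the minimum I/O size is the RAID chunk and the optimal I/O size is one full stripe:

        # stride       = chunk / block size       = 1048576 / 4096 = 256
        # stripe-width = full stripe / block size = 8388608 / 4096 = 2048
        mkfs.ext3 -b 4096 -E stride=256,stripe-width=2048 /dev/dedvol/backup

    A partition created inside the LV would otherwise need its first sector to sit on a multiple of the 1 MiB chunk (e.g. sector 2048) to keep the same alignment.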

    Read the article

  • How to create a software raid5 array without a spare

    - by Yannick M.
    I am trying to create a software raid5 array using mdadm: $ linux # mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 --spare-devices=0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 mdadm: layout defaults to left-symmetric mdadm: chunk size defaults to 64K mdadm: array /dev/md0 started. However when inspecting /proc/mdstat Personalities : [raid6] [raid5] [raid4] md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0] 2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_] [>....................] recovery = 0.3% (2970496/976759936) finish=186.1min speed=87172K/sec unused devices: <none> It seems one drive isn't active, so I check the details of the array: /dev/md0: Version : 00.90.03 Creation Time : Tue Jul 21 16:29:53 2009 Raid Level : raid5 Array Size : 2930279808 (2794.53 GiB 3000.61 GB) Used Dev Size : 976759936 (931.51 GiB 1000.20 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Tue Jul 21 16:29:53 2009 State : clean, degraded, recovering Active Devices : 3 Working Devices : 4 Failed Devices : 0 Spare Devices : 1 Layout : left-symmetric Chunk Size : 64K Rebuild Status : 0% complete UUID : ce8b2f40:821d003c:0027688e:a70977ec Events : 0.1 Number Major Minor RaidDevice State 0 8 1 0 active sync /dev/sda1 1 8 17 1 active sync /dev/sdb1 2 8 33 2 active sync /dev/sdc1 4 8 49 3 spare rebuilding /dev/sdd1 And it seems there are only 3 active devices, with one spare. Is it just me, or something wrong here?
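
    This is mdadm's normal way of building a new RAID5: it starts the array degraded and reconstructs parity onto the last member, which is reported as "spare rebuilding" until the initial sync completes, after which all four devices show "active sync". A sketch of the two options, the second of which skips the initial sync and is generally discouraged because parity starts out inconsistent:

        # Just wait: the state changes once the rebuild shown in mdstat finishes
        watch cat /proc/mdstat
        # Or, at creation time only, skip the initial sync entirely (use with care):
        # mdadm --create /dev/md0 --level=5 --raid-devices=4 --assume-clean \
        #       /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1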

    Read the article

  • How to re-add a RAID-10 failed drive on Ubuntu?

    - by thiesdiggity
    I have a problem that I can't seem to solve. We have a Ubuntu server setup with RAID-10 and two of the drives dropped out of the array. When I try to re-add them using the following command: mdadm --manage --re-add /dev/md2 /dev/sdc1 I get the following error message: mdadm: Cannot open /dev/sdc1: Device or resource busy When I do a "cat /proc/mdstat" I get the following: Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [r$ md2 : active raid10 sdb1[0] sdd1[3] 1953519872 blocks 64K chunks 2 near-copies [4/2] [U__U] md1 : active raid1 sda2[0] sdc2[1] 468853696 blocks [2/2] [UU] md0 : active raid1 sda1[0] sdc1[1] 19530688 blocks [2/2] [UU] unused devices: <none> When I run "/sbin/mdadm --detail /dev/md2" I get the following: /dev/md2: Version : 00.90 Creation Time : Mon Sep 5 23:41:13 2011 Raid Level : raid10 Array Size : 1953519872 (1863.02 GiB 2000.40 GB) Used Dev Size : 976759936 (931.51 GiB 1000.20 GB) Raid Devices : 4 Total Devices : 2 Preferred Minor : 2 Persistence : Superblock is persistent Update Time : Thu Oct 25 09:25:08 2012 State : active, degraded Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Layout : near=2, far=1 Chunk Size : 64K UUID : c6d87d27:aeefcb2e:d4453e2e:0b7266cb Events : 0.6688691 Number Major Minor RaidDevice State 0 8 17 0 active sync /dev/sdb1 1 0 0 1 removed 2 0 0 2 removed 3 8 49 3 active sync /dev/sdd1 Output of df -h is: Filesystem Size Used Avail Use% Mounted on /dev/md1 441G 2.0G 416G 1% / none 32G 236K 32G 1% /dev tmpfs 32G 0 32G 0% /dev/shm none 32G 112K 32G 1% /var/run none 32G 0 32G 0% /var/lock none 32G 0 32G 0% /lib/init/rw tmpfs 64G 215M 63G 1% /mnt/vmware none 441G 2.0G 416G 1% /var/lib/ureadahead/debugfs /dev/mapper/RAID10VG-RAID10LV 1.8T 139G 1.6T 8% /mnt/RAID10 When I do a "fdisk -l" I can see all the drives needed for the RAID-10. The RAID-10 is part of the /dev/mapper, could that be the reason why the device is coming back as busy? Anyone have any suggestions on what I can try to get the drives back into the array? Any help would be greatly appreciated. Thanks!
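
    One thing that stands out in the mdstat above is that /dev/sdc1 is already an active member of md0, which by itself would explain "Device or resource busy" when adding it to md2. A sketch for mapping each partition to the array its superblock belongs to before re-adding anything (the partition glob is a guess, widen it as needed):

        # Which array does each partition's md superblock point at?
        for p in /dev/sd[a-f]1; do
            echo "== $p"
            mdadm --examine "$p" 2>/dev/null | grep -E 'UUID|Raid Level|State'
        done
        # Once the right, unused partitions are identified:
        # mdadm --manage /dev/md2 --add /dev/sdX1    # sdX1 is a placeholder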

    Read the article

  • How do I reinitialise a failed RAID 5 drive using terminal on Ubuntu Server

    - by Stephen
    I've currently put together a new system and part of that has been creating a software RAID 5 using 'mdadm' in Ubuntu Server. I successfully got to the point where I create the array using: sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 I left it to do its thing overnight then used the following command to check on it: watch cat /proc/mdstat To which the following was returned: Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid5 sdd1[4](S) sdc1[2] sdb1[1] sda1[0](F) 5860535808 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2] [_UU_] unused devices: <none> It appears that one has failed (and I'm not too savvy with why another is a spare). So, just to be sure that something else isn't amiss I wanted to try and re-engage the failed drive. Can someone explain how I can do that and what I should do with the spare (if anything). And also how do I know when synchronisation is complete? The tutorial I used to get this far is located here: http://sonniesedge.co.uk/2009/06/13/software-raid-5-on-ubuntu-904/ Many thanks! p.s. Here is some extra information that may help: sudo mdadm --detail /dev/md0 /dev/md0: Version : 1.2 Creation Time : Mon Jun 18 21:14:21 2012 Raid Level : raid5 Array Size : 5860535808 (5589.04 GiB 6001.19 GB) Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) Raid Devices : 4 Total Devices : 4 Persistence : Superblock is persistent Update Time : Mon Jun 18 21:50:26 2012 State : clean, FAILED Active Devices : 2 Working Devices : 3 Failed Devices : 1 Spare Devices : 1 Layout : left-symmetric Chunk Size : 512K Name : myraidbox:0 (local to host myraidbox) UUID : a269ee94:a161600c:fb1665e7:bd2f27b3 Events : 13 Number Major Minor RaidDevice State 0 0 0 0 removed 1 8 17 1 active sync /dev/sdb1 2 8 33 2 active sync /dev/sdc1 3 0 0 3 removed 0 8 1 - faulty spare /dev/sda1 4 8 49 - spare /dev/sdd1
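
    A common way to re-engage a member that md has marked faulty is to drop it from the array and add it back so a fresh rebuild starts, but only after checking that the disk itself is healthy; a sketch (smartmontools assumed installed, device names as in the mdadm output above):

        # Check the drive's own error log before trusting it again
        smartctl -a /dev/sda
        # Remove the faulty member, then add it back to trigger a rebuild onto it
        mdadm /dev/md0 --remove /dev/sda1
        mdadm /dev/md0 --add /dev/sda1
        # Synchronisation is complete when mdstat shows [UUUU] and no recovery line
        cat /proc/mdstat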

    Read the article

  • Failing to load rootfs: Ubuntu 10 + grub2 + rootfs ext4 w/ RAID1

    - by James
    I am having problems booting a new Ubuntu 10 (server) install. My primary HD (/dev/sda) is laid out as follows: Device Boot Start End Blocks Id System /dev/sda1 * 1 18 144553+ 83 Linux <-- /BOOT /dev/sda2 19 182401 1464991447+ 5 Extended /dev/sda5 19 2207 17583111 fd Linux raid autodetect /dev/sda6 2208 11934 78132096 fd Linux raid autodetect <-- / (ROOTFS) /dev/sda7 11935 182401 1369276146 fd Linux raid autodetect The rootfs is part of a RAID1 (software) array (currently degraded): # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda6[1] 78132032 blocks [2/1] [_U] The UUIDs for the partitions are as follows: # blkid /dev/sda1 /dev/sda1: UUID="b25dd301-41b9-4f4d-9b0a-0e31713dd74c" TYPE="ext2" # blkid /dev/sda6 /dev/sda6: UUID="af7b9ede-fa53-c0c1-74be-31ec752c5cd5" TYPE="linux_raid_member" # blkid /dev/md2 /dev/md2: UUID="a0602d42-6855-482f-870c-6f6ecdcdae3f" TYPE="ext4" Finally, I have my grub2 menuentry setup as follows: ### BEGIN /etc/grub.d/10_linux ### menuentry 'Ubuntu, with Linux 2.6.32-25-server' --class ubuntu --class gnu-linux --class gnu --class os { insmod ext2 insmod raid insmod mdraid set root='(hd0,1)' search --no-floppy --fs-uuid --set b25dd301-41b9-4f4d-9b0a-0e31713dd74c linux /vmlinuz-2.6.32-25-server root=UUID=a0602d42-6855-482f-870c-6f6ecdcdae3f ro nosplash noplymouth initrd /initrd.img-2.6.32-25-server } When I attempt to boot, grub loads OK, however I eventually get the following error message: Gave up waiting for root device. ALERT /dev/disk/by-uuid/a0602d42-6855-482f-870c-6f6ecdcdae3f does not exist. Dropping to a shell! If from the grub bootloader I open a grub command line, I can ls (hd0,) and it lists the correct partitions with the UUIDs as shown above - sda6 shows 'a0602d42-6855-482f-870c-6f6ecdcdae3f' (the RAID UUID). If I ls (md2)/ it properly lists all the files on the RAID1 filesystem (ext4) so it doesn't appear to be an issue accessing the raid device. Does anyone have any suggestions as to what the problem might be? I can't figure this one out.
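
    One frequent cause of this symptom is an initramfs that does not assemble md2 on its own, so the root filesystem's UUID never appears under /dev/disk/by-uuid. A sketch of the usual Ubuntu-side fix, run from a working boot or a chroot into the installed system (it assumes mdadm is installed there):

        # Make sure the array definition is in mdadm's config
        mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf
        # Rebuild the initramfs (which copies that config in) and regenerate grub.cfg
        update-initramfs -u
        update-grub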

    Read the article

  • Should I use "Raid 5 + spare" or "Raid 6"?

    - by Trevor Boyd Smith
    What is "Raid 5 + Spare" (excerpt from User Manual, Sect 4.17.2, P.54): RAID5+Spare: RAID 5+Spare is a RAID 5 array in which one disk is used as spare to rebuild the system as soon as a disk fails (Fig. 79). At least four disks are required. If one physical disk fails, the data remains available because it is read from the parity blocks. Data from a failed disk is rebuilt onto the hot spare disk. When a failed disk is replaced, the replacement becomes the new hot spare. No data is lost in the case of a single disk failure, but if a second disk fails before the system can rebuild data to the hot spare, all data in the array will be lost. What is "Raid 6" (excerpt from User Manual, Sect 4.17.2, P.54): RAID6: In RAID 6, data is striped across all disks (minimum of four) and a two parity blocks for each data block (p and q in Fig. 80) is written on the same stripe. If one physical disk fails, the data from the failed disk can be rebuilt onto a replacement disk. This Raid mode can support up to two disk failures with no data loss. RAID 6 provides for faster rebuilding of data from a failed disk. Both "Raid 5 + spare" and "Raid 6" are SO similar ... I can't tell the difference. When would "Raid 5 + Spare" be optimal? And when would "Raid 6" be optimal"? The manual dumbs down the different raid with 5 star ratings. "Raid 5 + Spare" only gets 4 stars but "Raid 6" gets 5 stars. If I were to blindly trust the manual I would conclude that "Raid 6" is always better. Is "Raid 6" always better?

    Read the article

  • Growing a Linux software RAID5 array

    - by chrismetcalf
    On my home file server, I've got a 1.5 TB software RAID5 array, built from four 500 GB Western Digital drives. I've got a fifth drive that I usually run as a hot spare (but I have it out of the array at the moment), and if I can, I'd like to add it to the array and grow it to 2 TB, since I'm running out of space. I Googled for guidance, but there seem to be a lot of differing opinions out there (many of them probably now out-of-date) as to whether or not that is possible and/or smart. What's the right way to go about this, or should I start looking into building a new array with more space? Version details: %> cat /etc/issue Debian GNU/Linux 5.0 \n \l %> uname -a Linux magrathea 2.6.26-1-686-bigmem #1 SMP Sat Jan 10 19:13:22 UTC 2009 i686 GNU/Linux %> /sbin/mdadm --version mdadm - v2.6.7.2 - 14th November 2008 %> cat /proc/mdstat Personalities : [raid1] [raid6] [raid5] [raid4] md1 : active raid1 hdc1[0] hdd1[1] 293033536 blocks [2/2] [UU] md0 : active raid5 sde1[3] sda1[0] sdc1[2] sdb1[1] 1465151808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
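
    The usual md sequence for this (add the fifth disk as a spare, grow the member count, then grow the filesystem) is sketched below; the device name for the fifth drive is a placeholder, the reshape can take many hours, and a current backup plus a --backup-file for the critical section are the standard precautions:

        mdadm --add /dev/md0 /dev/sdf1                  # new disk joins as a spare
        mdadm --grow /dev/md0 --raid-devices=5 \
              --backup-file=/root/md0-grow.bak          # reshape 4 -> 5 members
        cat /proc/mdstat                                # watch the reshape progress
        # When the reshape has finished, enlarge the filesystem sitting on md0
        resize2fs /dev/md0                              # assuming ext3/ext4 directly on md0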

    Read the article

  • How to force mdadm to stop RAID5 array?

    - by lucek
    I have /dev/md127 RAID5 array that consisted of four drives. I managed to hot remove them from the array and currently /dev/md127 does not have any drives: cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid1 sdd1[0] sda1[1] 304052032 blocks super 1.2 [2/2] [UU] md1 : active raid0 sda5[1] sdd5[0] 16770048 blocks super 1.2 512k chunks md127 : active raid5 super 1.2 level 5, 512k chunk, algorithm 2 [4/0] [____] unused devices: <none> and mdadm --detail /dev/md127 /dev/md127: Version : 1.2 Creation Time : Thu Sep 6 10:39:57 2012 Raid Level : raid5 Array Size : 8790402048 (8383.18 GiB 9001.37 GB) Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB) Raid Devices : 4 Total Devices : 0 Persistence : Superblock is persistent Update Time : Fri Sep 7 17:19:47 2012 State : clean, FAILED Active Devices : 0 Working Devices : 0 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 512K Number Major Minor RaidDevice State 0 0 0 0 removed 1 0 0 1 removed 2 0 0 2 removed 3 0 0 3 removed I've tried to do mdadm --stop /dev/md127 but: mdadm --stop /dev/md127 mdadm: Cannot get exclusive access to /dev/md127:Perhaps a running process, mounted filesystem or active volume group? I made sure that it's unmounted, umount -l /dev/md127 and confirmed that it indeed is unmounted: umount /dev/md127 umount: /dev/md127: not mounted I've tried to zero superblock of each drive and I get (for each drive): mdadm --zero-superblock /dev/sde1 mdadm: Unrecognised md component device - /dev/sde1 Here's output of lsof|grep md127: lsof|grep md127 md127_rai 276 root cwd DIR 9,0 4096 2 / md127_rai 276 root rtd DIR 9,0 4096 2 / md127_rai 276 root txt unknown /proc/276/exe What else can I do? LVM is not even installed so it can't be a factor.
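
    "Cannot get exclusive access" usually means something still has md127 open even though no filesystem is mounted; a sketch of read-only checks to find the holder before trying --stop again (device-mapper/LVM would show up under holders, but so would another md device, swap, or a process with the block device open):

        lsof /dev/md127                      # processes holding the device open
        ls /sys/block/md127/holders/         # stacked devices (dm, other md) built on it
        cat /sys/block/md127/md/array_state  # the kernel's view of the array state
        swapon -s                            # a swap area on the array would also pin it
        # If nothing is found holding it:
        mdadm --stop /dev/md127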

    Read the article

  • RAID degraded on Ubuntu server

    - by reano
    We're having a very weird issue at work. Our Ubuntu server has 6 drives, set up with RAID1 as follows: /dev/md0, consisting of: /dev/sda1 /dev/sdb1 /dev/md1, consisting of: /dev/sda2 /dev/sdb2 /dev/md2, consisting of: /dev/sda3 /dev/sdb3 /dev/md3, consisting of: /dev/sdc1 /dev/sdd1 /dev/md4, consisting of: /dev/sde1 /dev/sdf1 As you can see, md0, md1 and md2 all use the same 2 drives (split into 3 partitions). I also have to note that this is done via ubuntu software raid, not hardware raid. Today, the /md0 RAID1 array shows as degraded - it is missing the /dev/sdb1 drive. But since /dev/sdb1 is only a partition (and /dev/sdb2 and /dev/sdb3 are working fine), it's obviously not the drive that's gone AWOL, it seems the partition itself is missing. How is that even possible? And what could we do to fix it? My output of cat /proc/mdstat: Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md1 : active raid1 sda2[0] sdb2[1] 24006528 blocks super 1.2 [2/2] [UU] md2 : active raid1 sda3[0] sdb3[1] 1441268544 blocks super 1.2 [2/2] [UU] md0 : active raid1 sda1[0] 1464710976 blocks super 1.2 [2/1] [U_] md3 : active raid1 sdd1[1] sdc1[0] 2930133824 blocks super 1.2 [2/2] [UU] md4 : active raid1 sdf2[1] sde2[0] 2929939264 blocks super 1.2 [2/2] [UU] unused devices: <none> FYI: I tried the following: mdadm /dev/md0 --add /dev/sdb1 But got this error: mdadm: add new device failed for /dev/sdb1 as 2: Invalid argument Output of mdadm --detail /dev/md0 is: /dev/md0: Version : 1.2 Creation Time : Sat Dec 29 17:09:45 2012 Raid Level : raid1 Array Size : 1464710976 (1396.86 GiB 1499.86 GB) Used Dev Size : 1464710976 (1396.86 GiB 1499.86 GB) Raid Devices : 2 Total Devices : 1 Persistence : Superblock is persistent Update Time : Thu Nov 7 15:55:07 2013 State : clean, degraded Active Devices : 1 Working Devices : 1 Failed Devices : 0 Spare Devices : 0 Name : lia:0 (local to host lia) UUID : eb302d19:ff70c7bf:401d63af:ed042d59 Events : 26216 Number Major Minor RaidDevice State 0 8 1 0 active sync /dev/sda1 1 0 0 1 removed
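
    "Invalid argument" on --add often means the kernel did not like the device itself (the partition vanished, changed size, or carries a stale superblock) rather than anything about md0. A sketch of what to check, with the destructive step last and only reasonable because md0's good copy lives on sda1:

        # Is sdb1 still present and the expected size?
        grep sdb /proc/partitions
        # What does its md superblock say? (the UUID should match md0's)
        mdadm --examine /dev/sdb1
        # If the superblock is stale or damaged, wipe it and add the partition back,
        # which triggers a full resync from sda1:
        mdadm --zero-superblock /dev/sdb1
        mdadm /dev/md0 --add /dev/sdb1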

    Read the article

  • Disk fragmentation when dealing with many small files

    - by Zorlack
    On a daily basis we generate about 3.4 million small JPEG files. We also delete about 3.4 million 90-day-old images. To date, we've dealt with this content by storing the images in a hierarchical manner. The hierarchy is something like this: /Year/Month/Day/Source/ This hierarchy allows us to effectively delete days' worth of content across all sources. The files are stored on a Windows 2003 server connected to a 14-disk SATA RAID6. We've started having significant performance issues when writing to and reading from the disks. This may be due to the performance of the hardware, but I suspect that disk fragmentation may be a culprit as well. Some people have recommended storing the data in a database, but I've been hesitant to do this. Another thought was to use some sort of container file, like a VHD or something. Does anyone have any advice for mitigating this kind of fragmentation? Additional Info: The average file size is 8-14 KB. Format information from fsutil: NTFS Volume Serial Number : 0x2ae2ea00e2e9d05d Version : 3.1 Number Sectors : 0x00000001e847ffff Total Clusters : 0x000000003d08ffff Free Clusters : 0x000000001c1a4df0 Total Reserved : 0x0000000000000000 Bytes Per Sector : 512 Bytes Per Cluster : 4096 Bytes Per FileRecord Segment : 1024 Clusters Per FileRecord Segment : 0 Mft Valid Data Length : 0x000000208f020000 Mft Start Lcn : 0x00000000000c0000 Mft2 Start Lcn : 0x000000001e847fff Mft Zone Start : 0x0000000002163b20 Mft Zone End : 0x0000000007ad2000

    Read the article

  • mdadm lvm and ext4 slowness - How can I speed it up?

    - by beatbreaker
    I can't figure out why I'm getting such terrible times out of my mdadm array, and in particular out of the LVM partitions on it. I made the RAID: mdadm --create --verbose /dev/md0 --level=5 --chunk=1024 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 # cat /proc/mdstat Personalities : [raid6] [raid5] [raid4] md0 : active raid5 sda1[0] sdd1[3] sdc1[2] sdb1[1] 2930279424 blocks level 5, 1024k chunk, algorithm 2 [4/4] [UUUU] I then created the physical volume, volume group, and logical volumes, and formatted the logical volumes to ext4 using the following command, which I got from here: http://busybox.net/~aldot/mkfs_stride.html mkfs.ext3 -b 4096 -E stride=256,stripe-width=768 /dev/datavg/blah Now I'm confused: I had these LVs running really quickly before in mdadm, but now that I've 'optimized' everything it's slower. E.g., before: /dev/datavg/lv_audio: Timing buffered disk reads: 598 MB in 3.01 seconds = 198.85 MB/sec but now after: /dev/datavg/audio: Timing buffered disk reads: 198 MB in 3.00 seconds = 65.96 MB/sec That's pitiful! What's happened here? Did I not follow the instructions correctly? Can I reshape the ext4 partitions to default back to what they were? (I used defaults before and they were fine!)
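
    With a 1024k chunk and 4k blocks, stride=256 and stripe-width=768 are exactly what a 4-disk RAID5 (3 data disks) should use, so the mkfs options themselves look right. A sketch of two other things worth comparing between the old and new layout, since PV data alignment and readahead both have a large effect on sequential hdparm-style numbers:

        # Is the LVM data area aligned to the 1 MiB chunk? (pe_start should be a
        # multiple of 1 MiB)
        pvs -o +pe_start /dev/md0
        # Readahead (in 512-byte sectors) on the raw array vs. the logical volume
        blockdev --getra /dev/md0
        blockdev --getra /dev/datavg/audio
        # For comparison, raise the LV's readahead to 8 MiB and re-run the read test
        blockdev --setra 16384 /dev/datavg/audio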

    Read the article

  • Did I lose my RAID again?

    - by BarsMonster
    Hi! A little history: two years ago I was really excited to find out that mdadm is so powerful that it can even reshape arrays, so you can start with a smaller array and then grow it as you need. I bought 3x 1 TB drives and made a RAID-5. It was fine for a year. Then I bought 2 more, and tried to reshape to a RAID-6 out of the 5 drives, and due to some mess with superblock versions, lost all content. Had to rebuild it from scratch, but 2 TB of data were gone. Yesterday I bought 2 more drives, and this time I had everything: a properly built array, a UPS. I disabled the write-intent bitmap, added the 2 new drives as spares and ran a command to grow the array to 7 disks. It started working, but the speed was ridiculously slow, ~100 KB/sec. After processing the first 37 MB at such an amazing speed, one of the old HDDs failed. I properly shut down the PC and disconnected the failed drive. After bootup it appeared that it had recreated the intent bitmap, as it was still in the mdadm config, so I removed it from the config and rebooted again. Now all I see is that all mdadm processes deadlock and don't do anything. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1937 root 20 0 12992 608 444 D 0 0.1 0:00.00 mdadm 2283 root 20 0 12992 852 704 D 0 0.1 0:00.01 mdadm 2287 root 20 0 0 0 0 D 0 0.0 0:00.01 md0_reshape 2288 root 18 -2 12992 820 676 D 0 0.1 0:00.01 mdadm And all I see in mdstat is: $ cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid6 sdb1[1] sdg1[4] sdf1[7] sde1[6] sdd1[0] sdc1[5] 2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [7/6] [UU_UUUU] [>....................] reshape = 0.0% (37888/976561152) finish=567604147.2min speed=0K/sec I've already tried mdadm 2.6.7, 3.1.4 and 3.2 - nothing helps. Did I lose my data again? Any suggestions on how I can make this work? The OS is Ubuntu Server 10.04.2. PS. Needless to say, the data is inaccessible - I cannot mount /dev/md0 to save the most valuable data. You can see my disappointment - the very specific thing I was excited about has failed twice, taking 5 TB of my data with it.
Update: It appears there is some nice info in kern.log: 21:38:48 ...: [ 166.522055] raid5: reshape will continue 21:38:48 ...: [ 166.522085] raid5: device sdb1 operational as raid disk 1 21:38:48 ...: [ 166.522091] raid5: device sdg1 operational as raid disk 4 21:38:48 ...: [ 166.522097] raid5: device sdf1 operational as raid disk 5 21:38:48 ...: [ 166.522102] raid5: device sde1 operational as raid disk 6 21:38:48 ...: [ 166.522107] raid5: device sdd1 operational as raid disk 0 21:38:48 ...: [ 166.522111] raid5: device sdc1 operational as raid disk 3 21:38:48 ...: [ 166.523942] raid5: allocated 7438kB for md0 21:38:48 ...: [ 166.524041] 1: w=1 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524050] 4: w=2 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524056] 5: w=3 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524062] 6: w=4 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524068] 0: w=5 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524073] 3: w=6 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0 21:38:48 ...: [ 166.524079] raid5: raid level 6 set md0 active with 6 out of 7 devices, algorithm 2 21:38:48 ...: [ 166.524519] RAID5 conf printout: 21:38:48 ...: [ 166.524523] --- rd:7 wd:6 21:38:48 ...: [ 166.524528] disk 0, o:1, dev:sdd1 21:38:48 ...: [ 166.524532] disk 1, o:1, dev:sdb1 21:38:48 ...: [ 166.524537] disk 3, o:1, dev:sdc1 21:38:48 ...: [ 166.524541] disk 4, o:1, dev:sdg1 21:38:48 ...: [ 166.524545] disk 5, o:1, dev:sdf1 21:38:48 ...: [ 166.524550] disk 6, o:1, dev:sde1 21:38:48 ...: [ 166.524553] ...ok start reshape thread 21:38:48 ...: [ 166.524727] md0: detected capacity change from 0 to 2999995858944 21:38:48 ...: [ 166.524735] md: reshape of RAID array md0 21:38:48 ...: [ 166.524740] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. 21:38:48 ...: [ 166.524745] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape. 21:38:48 ...: [ 166.524756] md: using 128k window, over a total of 976561152 blocks. 21:39:05 ...: [ 166.525013] md0: 21:42:04 ...: [ 362.520063] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520073] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:42:04 ...: [ 362.520083] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520092] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:42:04 ...: [ 362.520100] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:42:04 ...: [ 362.520107] Call Trace: 21:42:04 ...: [ 362.520133] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:42:04 ...: [ 362.520148] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:42:04 ...: [ 362.520159] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:42:04 ...: [ 362.520169] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:42:04 ...: [ 362.520179] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520188] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:42:04 ...: [ 362.520194] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:42:04 ...: [ 362.520205] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:42:04 ...: [ 362.520214] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:42:04 ...: [ 362.520222] [<ffffffff8116c2ec>] ? 
alloc_buffer_head+0x1c/0x60 21:42:04 ...: [ 362.520230] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:42:04 ...: [ 362.520236] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:42:04 ...: [ 362.520244] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:42:04 ...: [ 362.520251] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70 21:42:04 ...: [ 362.520258] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:42:04 ...: [ 362.520265] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:42:04 ...: [ 362.520272] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:42:04 ...: [ 362.520279] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:42:04 ...: [ 362.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:42:04 ...: [ 362.520290] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:42:04 ...: [ 362.520297] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:42:04 ...: [ 362.520304] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:42:04 ...: [ 362.520310] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:42:04 ...: [ 362.520317] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:42:04 ...: [ 362.520324] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:42:04 ...: [ 362.520331] [<ffffffff811a7938>] check_partition+0x138/0x190 21:42:04 ...: [ 362.520338] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:42:04 ...: [ 362.520344] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:42:04 ...: [ 362.520350] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:42:04 ...: [ 362.520356] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:42:04 ...: [ 362.520362] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:42:04 ...: [ 362.520369] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:42:04 ...: [ 362.520377] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:42:04 ...: [ 362.520385] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:42:04 ...: [ 362.520391] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:42:04 ...: [ 362.520398] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:42:04 ...: [ 362.520406] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:42:04 ...: [ 362.520414] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:42:04 ...: [ 362.520421] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:42:04 ...: [ 362.520428] [<ffffffff811418b0>] sys_open+0x20/0x30 21:42:04 ...: [ 362.520437] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:42:04 ...: [ 362.520446] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520450] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
21:42:04 ...: [ 362.520454] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:42:04 ...: [ 362.520462] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520470] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:42:04 ...: [ 362.520478] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:42:04 ...: [ 362.520485] Call Trace: 21:42:04 ...: [ 362.520495] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:42:04 ...: [ 362.520502] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:42:04 ...: [ 362.520508] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:42:04 ...: [ 362.520514] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:42:04 ...: [ 362.520520] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:42:04 ...: [ 362.520527] [<ffffffff81145375>] __fput+0xf5/0x210 21:42:04 ...: [ 362.520534] [<ffffffff811454b5>] fput+0x25/0x30 21:42:04 ...: [ 362.520540] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:42:04 ...: [ 362.520546] [<ffffffff81141697>] sys_close+0xb7/0x120 21:42:04 ...: [ 362.520553] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:42:04 ...: [ 362.520559] INFO: task md0_reshape:2287 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:42:04 ...: [ 362.520567] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000 21:42:04 ...: [ 362.520575] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520582] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0 21:42:04 ...: [ 362.520590] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8 21:42:04 ...: [ 362.520597] Call Trace: 21:42:04 ...: [ 362.520608] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:42:04 ...: [ 362.520616] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:42:04 ...: [ 362.520626] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456] 21:42:04 ...: [ 362.520634] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520644] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456] 21:42:04 ...: [ 362.520651] [<ffffffff81052713>] ? __wake_up+0x53/0x70 21:42:04 ...: [ 362.520658] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0 21:42:04 ...: [ 362.520668] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10 21:42:04 ...: [ 362.520675] [<ffffffff8141640c>] md_thread+0x5c/0x130 21:42:04 ...: [ 362.520681] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:42:04 ...: [ 362.520688] [<ffffffff814163b0>] ? md_thread+0x0/0x130 21:42:04 ...: [ 362.520694] [<ffffffff81084416>] kthread+0x96/0xa0 21:42:04 ...: [ 362.520701] [<ffffffff810131ea>] child_rip+0xa/0x20 21:42:04 ...: [ 362.520707] [<ffffffff81084380>] ? kthread+0x0/0xa0 21:42:04 ...: [ 362.520713] [<ffffffff810131e0>] ? child_rip+0x0/0x20 21:42:04 ...: [ 362.520718] INFO: task mdadm:2288 blocked for more than 120 seconds. 21:42:04 ...: [ 362.520721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
21:42:04 ...: [ 362.520725] mdadm D 0000000000000000 0 2288 1 0x00000000 21:42:04 ...: [ 362.520733] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0 21:42:04 ...: [ 362.520741] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000 21:42:04 ...: [ 362.520748] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8 21:42:04 ...: [ 362.520755] Call Trace: 21:42:04 ...: [ 362.520763] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:42:04 ...: [ 362.520771] [<ffffffff812a6d50>] ? exact_match+0x0/0x10 21:42:04 ...: [ 362.520777] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:42:04 ...: [ 362.520783] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0 21:42:04 ...: [ 362.520790] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:42:04 ...: [ 362.520795] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:42:04 ...: [ 362.520801] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:42:04 ...: [ 362.520808] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:42:04 ...: [ 362.520815] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:42:04 ...: [ 362.520821] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:42:04 ...: [ 362.520828] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:42:04 ...: [ 362.520834] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:42:04 ...: [ 362.520841] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40 21:42:04 ...: [ 362.520848] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330 21:42:04 ...: [ 362.520855] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0 21:42:04 ...: [ 362.520862] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:42:04 ...: [ 362.520868] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:42:04 ...: [ 362.520874] [<ffffffff811418b0>] sys_open+0x20/0x30 21:42:04 ...: [ 362.520882] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520065] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520077] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:44:04 ...: [ 482.520087] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520096] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:44:04 ...: [ 482.520104] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:44:04 ...: [ 482.520112] Call Trace: 21:44:04 ...: [ 482.520139] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:44:04 ...: [ 482.520154] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:44:04 ...: [ 482.520165] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:44:04 ...: [ 482.520175] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:44:04 ...: [ 482.520185] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520194] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:44:04 ...: [ 482.520201] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:44:04 ...: [ 482.520212] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:44:04 ...: [ 482.520221] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:44:04 ...: [ 482.520229] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60 21:44:04 ...: [ 482.520237] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:44:04 ...: [ 482.520244] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:44:04 ...: [ 482.520252] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:44:04 ...: [ 482.520258] [<ffffffff81172c90>] ? 
blkdev_get_block+0x0/0x70 21:44:04 ...: [ 482.520266] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:44:04 ...: [ 482.520273] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:44:04 ...: [ 482.520280] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:44:04 ...: [ 482.520286] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:44:04 ...: [ 482.520293] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:44:04 ...: [ 482.520299] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:44:04 ...: [ 482.520306] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:44:04 ...: [ 482.520313] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:44:04 ...: [ 482.520319] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:44:04 ...: [ 482.520327] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:44:04 ...: [ 482.520334] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:44:04 ...: [ 482.520341] [<ffffffff811a7938>] check_partition+0x138/0x190 21:44:04 ...: [ 482.520348] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:44:04 ...: [ 482.520355] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:44:04 ...: [ 482.520361] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:44:04 ...: [ 482.520367] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:44:04 ...: [ 482.520373] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:44:04 ...: [ 482.520380] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:44:04 ...: [ 482.520388] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:44:04 ...: [ 482.520396] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:44:04 ...: [ 482.520403] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:44:04 ...: [ 482.520410] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:44:04 ...: [ 482.520417] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:44:04 ...: [ 482.520426] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:44:04 ...: [ 482.520432] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:44:04 ...: [ 482.520438] [<ffffffff811418b0>] sys_open+0x20/0x30 21:44:04 ...: [ 482.520447] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520458] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520462] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520467] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:44:04 ...: [ 482.520475] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520483] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:44:04 ...: [ 482.520490] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:44:04 ...: [ 482.520498] Call Trace: 21:44:04 ...: [ 482.520508] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:44:04 ...: [ 482.520515] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:44:04 ...: [ 482.520521] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:44:04 ...: [ 482.520527] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:44:04 ...: [ 482.520533] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:44:04 ...: [ 482.520541] [<ffffffff81145375>] __fput+0xf5/0x210 21:44:04 ...: [ 482.520547] [<ffffffff811454b5>] fput+0x25/0x30 21:44:04 ...: [ 482.520554] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:44:04 ...: [ 482.520560] [<ffffffff81141697>] sys_close+0xb7/0x120 21:44:04 ...: [ 482.520568] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:44:04 ...: [ 482.520574] INFO: task md0_reshape:2287 blocked for more than 120 seconds. 
21:44:04 ...: [ 482.520578] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520582] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000 21:44:04 ...: [ 482.520590] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520597] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0 21:44:04 ...: [ 482.520605] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8 21:44:04 ...: [ 482.520612] Call Trace: 21:44:04 ...: [ 482.520623] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:44:04 ...: [ 482.520633] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:44:04 ...: [ 482.520643] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456] 21:44:04 ...: [ 482.520651] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520661] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456] 21:44:04 ...: [ 482.520668] [<ffffffff81052713>] ? __wake_up+0x53/0x70 21:44:04 ...: [ 482.520675] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0 21:44:04 ...: [ 482.520685] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10 21:44:04 ...: [ 482.520692] [<ffffffff8141640c>] md_thread+0x5c/0x130 21:44:04 ...: [ 482.520699] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:44:04 ...: [ 482.520705] [<ffffffff814163b0>] ? md_thread+0x0/0x130 21:44:04 ...: [ 482.520711] [<ffffffff81084416>] kthread+0x96/0xa0 21:44:04 ...: [ 482.520718] [<ffffffff810131ea>] child_rip+0xa/0x20 21:44:04 ...: [ 482.520725] [<ffffffff81084380>] ? kthread+0x0/0xa0 21:44:04 ...: [ 482.520730] [<ffffffff810131e0>] ? child_rip+0x0/0x20 21:44:04 ...: [ 482.520735] INFO: task mdadm:2288 blocked for more than 120 seconds. 21:44:04 ...: [ 482.520739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:44:04 ...: [ 482.520743] mdadm D 0000000000000000 0 2288 1 0x00000000 21:44:04 ...: [ 482.520751] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0 21:44:04 ...: [ 482.520759] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000 21:44:04 ...: [ 482.520767] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8 21:44:04 ...: [ 482.520774] Call Trace: 21:44:04 ...: [ 482.520782] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:44:04 ...: [ 482.520790] [<ffffffff812a6d50>] ? exact_match+0x0/0x10 21:44:04 ...: [ 482.520797] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:44:04 ...: [ 482.520804] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0 21:44:04 ...: [ 482.520810] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:44:04 ...: [ 482.520816] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:44:04 ...: [ 482.520822] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:44:04 ...: [ 482.520829] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:44:04 ...: [ 482.520837] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:44:04 ...: [ 482.520843] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0 21:44:04 ...: [ 482.520850] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:44:04 ...: [ 482.520857] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:44:04 ...: [ 482.520864] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40 21:44:04 ...: [ 482.520871] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330 21:44:04 ...: [ 482.520878] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0 21:44:04 ...: [ 482.520885] [<ffffffff8115dbfa>] ? 
alloc_fd+0x10a/0x150 21:44:04 ...: [ 482.520891] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:44:04 ...: [ 482.520897] [<ffffffff811418b0>] sys_open+0x20/0x30 21:44:04 ...: [ 482.520905] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:46:04 ...: [ 602.520053] INFO: task mdadm:1937 blocked for more than 120 seconds. 21:46:04 ...: [ 602.520059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:46:04 ...: [ 602.520065] mdadm D 00000000ffffffff 0 1937 1 0x00000000 21:46:04 ...: [ 602.520075] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0 21:46:04 ...: [ 602.520084] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0 21:46:04 ...: [ 602.520091] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198 21:46:04 ...: [ 602.520099] Call Trace: 21:46:04 ...: [ 602.520127] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456] 21:46:04 ...: [ 602.520142] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20 21:46:04 ...: [ 602.520153] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456] 21:46:04 ...: [ 602.520162] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456] 21:46:04 ...: [ 602.520171] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40 21:46:04 ...: [ 602.520180] [<ffffffff81414df0>] md_make_request+0xc0/0x130 21:46:04 ...: [ 602.520187] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130 21:46:04 ...: [ 602.520197] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0 21:46:04 ...: [ 602.520206] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20 21:46:04 ...: [ 602.520215] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60 21:46:04 ...: [ 602.520222] [<ffffffff8129fc80>] submit_bio+0x80/0x110 21:46:04 ...: [ 602.520229] [<ffffffff8116c849>] submit_bh+0xf9/0x140 21:46:04 ...: [ 602.520237] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0 21:46:04 ...: [ 602.520244] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70 21:46:04 ...: [ 602.520252] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40 21:46:04 ...: [ 602.520259] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160 21:46:04 ...: [ 602.520266] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20 21:46:04 ...: [ 602.520273] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0 21:46:04 ...: [ 602.520279] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:46:04 ...: [ 602.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20 21:46:04 ...: [ 602.520292] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120 21:46:04 ...: [ 602.520300] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20 21:46:04 ...: [ 602.520306] [<ffffffff810f591e>] read_cache_page+0xe/0x20 21:46:04 ...: [ 602.520314] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0 21:46:04 ...: [ 602.520321] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460 21:46:04 ...: [ 602.520328] [<ffffffff811a7938>] check_partition+0x138/0x190 21:46:04 ...: [ 602.520335] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0 21:46:04 ...: [ 602.520342] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0 21:46:04 ...: [ 602.520348] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0 21:46:04 ...: [ 602.520354] [<ffffffff81174640>] blkdev_get+0x10/0x20 21:46:04 ...: [ 602.520359] [<ffffffff811746c1>] blkdev_open+0x71/0xc0 21:46:04 ...: [ 602.520367] [<ffffffff811419f3>] __dentry_open+0x113/0x370 21:46:04 ...: [ 602.520375] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30 21:46:04 ...: [ 602.520383] [<ffffffff8114de3f>] ? 
inode_permission+0xaf/0xd0 21:46:04 ...: [ 602.520390] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70 21:46:04 ...: [ 602.520397] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0 21:46:04 ...: [ 602.520404] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310 21:46:04 ...: [ 602.520413] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150 21:46:04 ...: [ 602.520419] [<ffffffff81141769>] do_sys_open+0x69/0x170 21:46:04 ...: [ 602.520425] [<ffffffff811418b0>] sys_open+0x20/0x30 21:46:04 ...: [ 602.520434] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b 21:46:04 ...: [ 602.520443] INFO: task mdadm:2283 blocked for more than 120 seconds. 21:46:04 ...: [ 602.520447] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 21:46:04 ...: [ 602.520451] mdadm D 0000000000000000 0 2283 2212 0x00000000 21:46:04 ...: [ 602.520460] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0 21:46:04 ...: [ 602.520468] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0 21:46:04 ...: [ 602.520475] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78 21:46:04 ...: [ 602.520483] Call Trace: 21:46:04 ...: [ 602.520492] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180 21:46:04 ...: [ 602.520500] [<ffffffff8154397b>] mutex_lock+0x2b/0x50 21:46:04 ...: [ 602.520506] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190 21:46:04 ...: [ 602.520512] [<ffffffff811741b0>] blkdev_put+0x10/0x20 21:46:04 ...: [ 602.520518] [<ffffffff811741f3>] blkdev_close+0x33/0x60 21:46:04 ...: [ 602.520526] [<ffffffff81145375>] __fput+0xf5/0x210 21:46:04 ...: [ 602.520533] [<ffffffff811454b5>] fput+0x25/0x30 21:46:04 ...: [ 602.520539] [<ffffffff811415ad>] filp_close+0x5d/0x90 21:46:04 ...: [ 602.520545] [<ffffffff81141697>] sys_close+0xb7/0x120 21:46:04 ...: [ 602.520552] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
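
    Before trying anything invasive it is worth capturing what the reshape thinks it is doing; the hung-task traces above show the reshape thread blocked in get_active_stripe, so even these read-only checks may hang, but they change nothing on disk:

        cat /proc/mdstat
        mdadm --detail /dev/md0
        cat /sys/block/md0/md/sync_action        # should read "reshape" while reshaping
        cat /sys/block/md0/md/reshape_position   # how far the reshape has actually got
        dmesg | tail -n 50                       # any further md/raid456 errors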

    Read the article

  • Linux boot on a raid1 software raid ?

    - by azera
    Hello I am trying to convert my single disk boot to a raid1 boot So far here is what i have: I sucessfully create the raid 1 as degraded with the new drive alone, I copied all the data on it I can mount that raid 1, see its files etc I already have a raid5 that is working on the same box (although not booting on it) I have installed grub on both drive When grub boot, it loads the kernel alright, but during the kernel boot it fails to load the "root block device" The kernel tells me : 1 - detected that root device is an md device 2 - determining root devices 3 - mounting root 4 - mounting /dev/md125 on /newroot failed: input/output error. Please enter another root device: ... At this point, if I enter /dev/sda3 (my "old" root device that isn't converted to raid yet) everything boots fine without the root. The /dev/md125 device is indeed created but it seems to be created after the error happens, as in it creates it after loading the device, when mdadm is loaded. Somehow it looks like it can't/doesn't load the raid array before it needs to mount it, and I don't know how I can solve that. My config files (taken from the system once it boots with sda3 as root device): $ cat /etc/mdadm.conf ARRAY /dev/md/md0-r5 metadata=0.90 UUID=1a118934:c831bdb3:64188b84:66721085 ARRAY /dev/md125 metadata=0.90 UUID=48ec4190:a80d4dde:64188b84:66721085 $ cat /proc/mdstat Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] [raid10] md125 : active raid1 sdc3[1] 477853312 blocks [2/1] [_U] md127 : active raid5 sdd[0] sdf[3] sdb[2] sde[1] 4395415488 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU] unused devices: <none> $ cat /boot/grub/menu.lst default 0 timeout 8 splashimage=(hd0,0)/boot/grub/splash.xpm.gz title Gentoo Linux 2.6.31-r10 root (hd0,0) #kernel /boot/kernel-genkernel-x86_64-2.6.31-gentoo-r10 root=/dev/ram0 real_root=/dev/sda3 kernel /boot/kernel-genkernel-x86_64-2.6.31-gentoo-r10 root=/dev/md125 md=125,/dev/sdc3,/dev/sda3 initrd /boot/initramfs-genkernel-x86_64-2.6.31-gentoo-r10 # blkid /dev/sda1: UUID="89fee223-b845-4e0a-8a0b-e6cf695d5bcf" TYPE="ext2" /dev/sda2: UUID="a72296a8-d7d4-447f-a34b-ee920fd1a767" TYPE="swap" /dev/sda3: UUID="97eb0a6a-c385-4a9d-bf74-c0bab1fa4dc1" TYPE="ext3" /dev/sdb: UUID="1a118934-c831-bdb3-6418-8b8466721085" TYPE="linux_raid_member" /dev/sdc1: UUID="d36537fd-19a0-b8a3-6418-8b8466721085" TYPE="linux_raid_member" /dev/sdd: UUID="1a118934-c831-bdb3-6418-8b8466721085" TYPE="linux_raid_member" /dev/sde: UUID="1a118934-c831-bdb3-6418-8b8466721085" TYPE="linux_raid_member" /dev/md127: UUID="13a41589-4cf1-4c04-91ca-37484182c783" TYPE="ext4" /dev/sdf: UUID="1a118934-c831-bdb3-6418-8b8466721085" TYPE="linux_raid_member" /dev/sdc2: UUID="a1916397-1b48-45d7-9f98-73aa521e882f" TYPE="swap" /dev/sdc3: UUID="48ec4190-a80d-4dde-6418-8b8466721085" TYPE="linux_raid_member" /dev/md125: UUID="c947ed64-1d4d-4d1d-b4d2-24669fff916e" SEC_TYPE="ext2" TYPE="ext3" # mdadm -E mdadm: No devices to examine # fdisk -l Disk /dev/sda: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0xe975e9fc Device Boot Start End Blocks Id System /dev/sda1 1 5 40131 83 Linux /dev/sda2 6 1311 10490445 82 Linux swap / Solaris /dev/sda3 1312 60801 477853425 83 Linux Disk /dev/sdc: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0xe975e9fc Device Boot Start End Blocks Id System /dev/sdc1 1 5 40131 83 Linux /dev/sdc2 6 1311 10490445 82 
Linux swap / Solaris /dev/sdc3 1312 60801 477853425 83 Linux Disk /dev/md125: 489.3 GB, 489321791488 bytes 2 heads, 4 sectors/track, 119463328 cylinders Units = cylinders of 8 * 512 = 4096 bytes Disk identifier: 0x00000000 Disk /dev/md125 doesn't contain a valid partition table
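    A hedged sketch of a fix that matches these symptoms: on a genkernel-built Gentoo system the initramfs has to contain mdadm and be told to assemble arrays before it tries to mount the real root, otherwise /dev/md125 only shows up once the full system (and its mdadm) is running. The option names below are assumptions to check against the installed genkernel version's documentation, not a verified recipe.

        # Rebuild the initramfs with mdadm support and the array definitions included
        genkernel --mdadm --mdadm-config=/etc/mdadm.conf initramfs

        # Then ask the initramfs to assemble arrays before mounting root, e.g. in menu.lst:
        #   kernel /boot/kernel-genkernel-x86_64-2.6.31-gentoo-r10 root=/dev/md125 domdadm
        #   initrd /boot/initramfs-genkernel-x86_64-2.6.31-gentoo-r10

    With 0.90 metadata the kernel can also auto-assemble arrays built on partitions whose type is set to "Linux raid autodetect" (fd), which is another way to make the root array exist before anything needs to mount it.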

    Read the article

  • How to get an inactive RAID device working again?

    - by Jonik
    After booting, my RAID1 device (/dev/md_d0 *) sometimes goes into some funny state and I cannot mount it. * Originally I created /dev/md0 but it has somehow changed itself into /dev/md_d0. # mount /opt mount: wrong fs type, bad option, bad superblock on /dev/md_d0, missing codepage or helper program, or other error (could this be the IDE device where you in fact use ide-scsi so that sr0 or sda or so is needed?) In some cases useful info is found in syslog - try dmesg | tail or so The RAID device appears to be inactive somehow: # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md_d0 : inactive sda4[0](S) 241095104 blocks # mdadm --detail /dev/md_d0 mdadm: md device /dev/md_d0 does not appear to be active. Question is, how to make the device active again (using mdadm, I presume)? (Other times it's alright (active) after boot, and I can mount it manually without problems. But it still won't mount automatically even though I have it in /etc/fstab: /dev/md_d0 /opt ext4 defaults 0 0 So a bonus question: what should I do to make the RAID device automatically mount at /opt at boot time?) This is an Ubuntu 9.10 workstation. Background info about my RAID setup in this question. Edit: My /etc/mdadm/mdadm.conf looks like this. I've never touched this file, at least by hand. # by default, scan all partitions (/proc/partitions) for MD superblocks. # alternatively, specify devices to scan, using wildcards if desired. DEVICE partitions # auto-create devices with Debian standard permissions CREATE owner=root group=disk mode=0660 auto=yes # automatically tag new arrays as belonging to the local system HOMEHOST <system> # instruct the monitoring daemon where to send mail alerts MAILADDR <my mail address> # definitions of existing MD arrays # This file was auto-generated on Wed, 27 Jan 2010 17:14:36 +0200 In /proc/partitions the last entry is md_d0 at least now, after reboot, when the device happens to be active again. (I'm not sure if it would be the same when it's inactive.) Resolution: as Jimmy Hedman suggested, I took the output of mdadm --examine --scan: ARRAY /dev/md0 level=raid1 num-devices=2 UUID=de8fbd92[...] and added it to /etc/mdadm/mdadm.conf, which seems to have fixed the main problem. After changing /etc/fstab to use /dev/md0 again (instead of /dev/md_d0), the RAID device also gets automatically mounted!
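    A hedged sketch of the commands that usually fix this combination of symptoms: stop the half-assembled md_d0 device, reassemble the array properly, and record it in mdadm.conf so the boot-time assembly (which on Ubuntu happens inside the initramfs) finds it by UUID. Device names are taken from the question; the initramfs step assumes a stock initramfs-tools setup.

        # Stop the inactive device, then reassemble from the scanned superblocks
        mdadm --stop /dev/md_d0
        mdadm --assemble --scan        # or explicitly: mdadm --assemble /dev/md0 /dev/sda4

        # Persist the definition so every boot assembles it the same way
        mdadm --examine --scan | tee -a /etc/mdadm/mdadm.conf

        # Regenerate the initramfs so early boot sees the updated mdadm.conf
        update-initramfs -u

    With the array consistently named /dev/md0, the /etc/fstab entry (pointed back at /dev/md0) should then mount it at /opt automatically.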

    Read the article

  • How do I stop and repair a RAID 5 array that has failed and has I/O pending?

    - by Ben Hymers
    The short version: I have a failed RAID 5 array which has a bunch of processes hung waiting on I/O operations on it; how can I recover from this? The long version: Yesterday I noticed Samba access was being very sporadic; accessing the server's shares from Windows would randomly lock up explorer completely after clicking on one or two directories. I assumed it was Windows being a pain and left it. Today the problem is the same, so I did a little digging; the first thing I noticed was that running ps aux | grep smbd gives a lot of lines like this: ben 969 0.0 0.2 96088 4128 ? D 18:21 0:00 smbd -F root 1708 0.0 0.2 93468 4748 ? Ss 18:44 0:00 smbd -F root 1711 0.0 0.0 93468 1364 ? S 18:44 0:00 smbd -F ben 3148 0.0 0.2 96052 4160 ? D Mar07 0:00 smbd -F ... There are a lot of processes stuck in the "D" state. Running ps aux | grep " D" shows up some other processes including my nightly backup script, all of which need to access the volume mounted on my RAID array at some point. After some googling, I found that it might be down to the RAID array failing, so I checked /proc/mdstat, which shows this: ben@jack:~$ cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid5 sdb1[3](F) sdc1[1] sdd1[2] 2930271872 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU] unused devices: <none> And running mdadm --detail /dev/md0 gives this: ben@jack:~$ sudo mdadm --detail /dev/md0 /dev/md0: Version : 00.90 Creation Time : Sat Oct 31 20:53:10 2009 Raid Level : raid5 Array Size : 2930271872 (2794.53 GiB 3000.60 GB) Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB) Raid Devices : 3 Total Devices : 3 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Mon Mar 7 03:06:35 2011 State : active, degraded Active Devices : 2 Working Devices : 2 Failed Devices : 1 Spare Devices : 0 Layout : left-symmetric Chunk Size : 64K UUID : f114711a:c770de54:c8276759:b34deaa0 Events : 0.208245 Number Major Minor RaidDevice State 3 8 17 0 faulty spare rebuilding /dev/sdb1 1 8 33 1 active sync /dev/sdc1 2 8 49 2 active sync /dev/sdd1 I believe this says that sdb1 has failed, and so the array is running with two drives out of three 'up'. Some advice I found said to check /var/log/messages for notices of failures, and sure enough there are plenty: ben@jack:~$ grep sdb /var/log/messages ... Mar 7 03:06:35 jack kernel: [4525155.384937] md/raid:md0: read error NOT corrected!! (sector 400644912 on sdb1). Mar 7 03:06:35 jack kernel: [4525155.389686] md/raid:md0: read error not correctable (sector 400644920 on sdb1). Mar 7 03:06:35 jack kernel: [4525155.389686] md/raid:md0: read error not correctable (sector 400644928 on sdb1). Mar 7 03:06:35 jack kernel: [4525155.389688] md/raid:md0: read error not correctable (sector 400644936 on sdb1). Mar 7 03:06:56 jack kernel: [4525176.231603] sd 0:0:1:0: [sdb] Unhandled sense code Mar 7 03:06:56 jack kernel: [4525176.231605] sd 0:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Mar 7 03:06:56 jack kernel: [4525176.231608] sd 0:0:1:0: [sdb] Sense Key : Medium Error [current] [descriptor] Mar 7 03:06:56 jack kernel: [4525176.231623] sd 0:0:1:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed Mar 7 03:06:56 jack kernel: [4525176.231627] sd 0:0:1:0: [sdb] CDB: Read(10): 28 00 17 e1 5f bf 00 01 00 00 To me it is clear that device sdb has failed, and I need to stop the array, shutdown, replace it, reboot, then repair the array, bring it back up and mount the filesystem. 
I cannot hot-swap a replacement drive in, and don't want to leave the array running in a degraded state. I believe I am supposed to unmount the filesystem before stopping the array, but that is failing, and that is where I'm stuck now: ben@jack:~$ sudo umount /storage umount: /storage: device is busy. (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1)) It is indeed busy; there are some 30 or 40 processes waiting on I/O. What should I do? Should I kill all these processes and try again? Is that a wise move when they are 'uninterruptible'? What would happen if I tried to reboot? Please let me know what you think I should do. And please ask if you need any extra information to diagnose the problem or to help!
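    As a hedged sketch rather than advice specific to this box, the sequence usually suggested for a dead member plus D-state writers looks like the following. Processes stuck in uninterruptible sleep generally cannot be killed; they only go away once their I/O completes or errors out, so the practical route is to fail the bad member, lazily detach the mount, stop the array cleanly and reboot to swap the disk.

        # See exactly what is holding the mount open
        fuser -vm /storage

        # Make sure md has given up on the bad member so it stops retrying it
        mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1

        # Lazy unmount: detaches the filesystem now, cleans up when the last user exits
        umount -l /storage

        # Stop the array cleanly, then power down to replace the drive
        mdadm --stop /dev/md0
        shutdown -h now

        # After booting with the new disk partitioned like the old one, start the rebuild
        mdadm /dev/md0 --add /dev/sdb1

    If umount or mdadm --stop themselves hang on the pending I/O, a reboot after syncing is usually the only way out; the metadata on the two good members keeps them consistent.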

    Read the article

  • Do I need to be worried about these SMART drive temperatures?

    - by Steve Lorimer
    I have 5 hard drives in a machine sitting in a cupboard. /dev/sda is a 500GB Seagate drive, and is the boot disk. /dev/sd{b,c,d,e} are 2TB drives in a raid6 configuration. smartctl is showing significantly higher temperatures (like ~140 degrees celsius) on the raid drives than the boot drive. Do I need to be worried? /dev/sdb and /dev/sde are new Western Digital Black drives (new=1 week) /dev/sdc and /dev/sdd are 5 year old Hitachi drives /dev/sda [SAT], Temperature_Celsius changed from 40 to 39 /dev/sdc [SAT], Temperature_Celsius changed from 142 to 146 /dev/sdc [SAT], Temperature_Celsius changed from 146 to 142 /dev/sdd [SAT], Temperature_Celsius changed from 142 to 146 /dev/sda [SAT], Airflow_Temperature_Cel changed from 61 to 62 /dev/sda [SAT], Temperature_Celsius changed from 39 to 38 /dev/sde [SAT], Temperature_Celsius changed from 107 to 108 /dev/sdb [SAT], Temperature_Celsius changed from 108 to 109 /dev/sdc [SAT], Temperature_Celsius changed from 146 to 150 /dev/sdc [SAT], Temperature_Celsius changed from 146 to 150 /dev/sda [SAT], Airflow_Temperature_Cel changed from 62 to 61 /dev/sda [SAT], Temperature_Celsius changed from 38 to 39 Update: Adding detailed drive information as per request: /dev/sda =========================== smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.9.10-100.fc17.x86_64] (local build) Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Seagate Pipeline HD 5900.2 Device Model: ST3500312CS Serial Number: 5VV47HXA LU WWN Device Id: 5 000c50 02aad5ad6 Firmware Version: SC13 User Capacity: 500,107,862,016 bytes [500 GB] Sector Size: 512 bytes logical/physical Rotation Rate: 5900 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 1.5 Gb/s (current: 1.5 Gb/s) Local Time is: Tue Jun 3 10:54:11 2014 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled /dev/sdb =========================== smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.9.10-100.fc17.x86_64] (local build) Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Device Model: WDC WD2003FZEX-00Z4SA0 Serial Number: WD-WMC1F1398726 LU WWN Device Id: 5 0014ee 003b8bd25 Firmware Version: 01.01A01 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 7200 rpm Device is: Not in smartctl database [for details use: -P showall] ATA Version is: ACS-2 (minor revision not indicated) SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s) Local Time is: Tue Jun 3 10:54:11 2014 EST SMART support is: Available - device has SMART capability. 
SMART support is: Enabled /dev/sdc =========================== smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.9.10-100.fc17.x86_64] (local build) Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723020BLA642 Serial Number: MN1220F30WSTUD LU WWN Device Id: 5 000cca 369cc9f5d Firmware Version: MN6OA580 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s) Local Time is: Tue Jun 3 10:54:11 2014 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled /dev/sdd =========================== smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.9.10-100.fc17.x86_64] (local build) Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723020BLA642 Serial Number: MN1220F30WST4D LU WWN Device Id: 5 000cca 369cc9f48 Firmware Version: MN6OA580 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 1.5 Gb/s) Local Time is: Tue Jun 3 10:54:11 2014 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled /dev/sde =========================== smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.9.10-100.fc17.x86_64] (local build) Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Device Model: WDC WD2003FZEX-00Z4SA0 Serial Number: WD-WMC1F1483782 LU WWN Device Id: 5 0014ee 3002d235c Firmware Version: 01.01A01 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 7200 rpm Device is: Not in smartctl database [for details use: -P showall] ATA Version is: ACS-2 (minor revision not indicated) SATA Version is: SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s) Local Time is: Tue Jun 3 10:54:11 2014 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled
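    One likely explanation, offered as a hedged note rather than a diagnosis: smartd's "Temperature_Celsius changed from X to Y" lines report the vendor-normalised attribute value, not a physical temperature, and vendors scale that value differently, which is why the Hitachi and WD members look absurdly hot next to the Seagate. The raw value of the temperature attribute is the one that is actually in degrees. A quick way to check (standard smartmontools usage; hddtemp is optional and may not be installed):

        # RAW_VALUE on the temperature rows (usually attribute 190 and/or 194) is in degrees C
        for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
            echo "== $d =="
            smartctl -A "$d" | grep -Ei 'temperature|airflow'
        done

        # Or, if installed:
        hddtemp /dev/sd[abcde]

    If the raw values come back in a normal 30-45 degree range, the drives are fine and only the normalised reporting differs between models.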

    Read the article

  • Degraded RAID5 and no md superblock on one of the remaining drives

    - by ark1214
    This is actually on a QNAP TS-509 NAS. The RAID is basically a Linux RAID. The NAS was configured with RAID 5 with 5 drives (/md0 with /dev/sd[abcde]3). At some point, /dev/sde failed and the drive was replaced. While rebuilding (and not completed), the NAS rebooted itself and /dev/sdc dropped out of the array. Now the array can't start because essentially 2 drives have dropped out. I disconnected /dev/sde and hoped that /md0 could resume in degraded mode, but no luck. Further investigation shows that /dev/sdc3 has no md superblock. The data should be good since the array was unable to assemble after /dev/sdc dropped off. All the searches I have done showed how to reassemble the array assuming 1 bad drive. But I think I just need to restore the superblock on /dev/sdc3 and that should bring the array up to a degraded mode which will allow me to back up data and then proceed with rebuilding by adding /dev/sde. Any help would be greatly appreciated. mdstat does not show /dev/md0 # cat /proc/mdstat Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] md5 : active raid1 sdd2[2](S) sdc2[3](S) sdb2[1] sda2[0] 530048 blocks [2/2] [UU] md13 : active raid1 sdd4[3] sdc4[2] sdb4[1] sda4[0] 458880 blocks [5/4] [UUUU_] bitmap: 40/57 pages [160KB], 4KB chunk md9 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0] 530048 blocks [5/4] [UUUU_] bitmap: 33/65 pages [132KB], 4KB chunk mdadm show /dev/md0 is still there # mdadm --examine --scan ARRAY /dev/md9 level=raid1 num-devices=5 UUID=271bf0f7:faf1f2c2:967631a4:3c0fa888 ARRAY /dev/md5 level=raid1 num-devices=2 UUID=0d75de26:0759d153:5524b8ea:86a3ee0d spares=2 ARRAY /dev/md0 level=raid5 num-devices=5 UUID=ce3e369b:4ff9ddd2:3639798a:e3889841 ARRAY /dev/md13 level=raid1 num-devices=5 UUID=7384c159:ea48a152:a1cdc8f2:c8d79a9c With /dev/sde removed, here is the mdadm examine output showing sdc3 has no md superblock # mdadm --examine /dev/sda3 /dev/sda3: Magic : a92b4efc Version : 00.90.00 UUID : ce3e369b:4ff9ddd2:3639798a:e3889841 Creation Time : Sat Dec 8 15:01:19 2012 Raid Level : raid5 Used Dev Size : 1463569600 (1395.77 GiB 1498.70 GB) Array Size : 5854278400 (5583.08 GiB 5994.78 GB) Raid Devices : 5 Total Devices : 4 Preferred Minor : 0 Update Time : Sat Dec 8 15:06:17 2012 State : active Active Devices : 4 Working Devices : 4 Failed Devices : 1 Spare Devices : 0 Checksum : d9e9ff0e - correct Events : 0.394 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 0 8 3 0 active sync /dev/sda3 0 0 8 3 0 active sync /dev/sda3 1 1 8 19 1 active sync /dev/sdb3 2 2 8 35 2 active sync /dev/sdc3 3 3 8 51 3 active sync /dev/sdd3 4 4 0 0 4 faulty removed [~] # mdadm --examine /dev/sdb3 /dev/sdb3: Magic : a92b4efc Version : 00.90.00 UUID : ce3e369b:4ff9ddd2:3639798a:e3889841 Creation Time : Sat Dec 8 15:01:19 2012 Raid Level : raid5 Used Dev Size : 1463569600 (1395.77 GiB 1498.70 GB) Array Size : 5854278400 (5583.08 GiB 5994.78 GB) Raid Devices : 5 Total Devices : 4 Preferred Minor : 0 Update Time : Sat Dec 8 15:06:17 2012 State : active Active Devices : 4 Working Devices : 4 Failed Devices : 1 Spare Devices : 0 Checksum : d9e9ff20 - correct Events : 0.394 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 1 8 19 1 active sync /dev/sdb3 0 0 8 3 0 active sync /dev/sda3 1 1 8 19 1 active sync /dev/sdb3 2 2 8 35 2 active sync /dev/sdc3 3 3 8 51 3 active sync /dev/sdd3 4 4 0 0 4 faulty removed [~] # mdadm --examine /dev/sdc3 mdadm: No md superblock detected on /dev/sdc3. 
[~] # mdadm --examine /dev/sdd3 /dev/sdd3: Magic : a92b4efc Version : 00.90.00 UUID : ce3e369b:4ff9ddd2:3639798a:e3889841 Creation Time : Sat Dec 8 15:01:19 2012 Raid Level : raid5 Used Dev Size : 1463569600 (1395.77 GiB 1498.70 GB) Array Size : 5854278400 (5583.08 GiB 5994.78 GB) Raid Devices : 5 Total Devices : 4 Preferred Minor : 0 Update Time : Sat Dec 8 15:06:17 2012 State : active Active Devices : 4 Working Devices : 4 Failed Devices : 1 Spare Devices : 0 Checksum : d9e9ff44 - correct Events : 0.394 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 3 8 51 3 active sync /dev/sdd3 0 0 8 3 0 active sync /dev/sda3 1 1 8 19 1 active sync /dev/sdb3 2 2 8 35 2 active sync /dev/sdc3 3 3 8 51 3 active sync /dev/sdd3 4 4 0 0 4 faulty removed fdisk output shows /dev/sdc3 partition is still there. [~] # fdisk -l Disk /dev/sdx: 128 MB, 128057344 bytes 8 heads, 32 sectors/track, 977 cylinders Units = cylinders of 256 * 512 = 131072 bytes Device Boot Start End Blocks Id System /dev/sdx1 1 8 1008 83 Linux /dev/sdx2 9 440 55296 83 Linux /dev/sdx3 441 872 55296 83 Linux /dev/sdx4 873 977 13440 5 Extended /dev/sdx5 873 913 5232 83 Linux /dev/sdx6 914 977 8176 83 Linux Disk /dev/sda: 1500.3 GB, 1500301910016 bytes 255 heads, 63 sectors/track, 182401 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sda1 * 1 66 530113+ 83 Linux /dev/sda2 67 132 530145 82 Linux swap / Solaris /dev/sda3 133 182338 1463569695 83 Linux /dev/sda4 182339 182400 498015 83 Linux Disk /dev/sda4: 469 MB, 469893120 bytes 2 heads, 4 sectors/track, 114720 cylinders Units = cylinders of 8 * 512 = 4096 bytes Disk /dev/sda4 doesn't contain a valid partition table Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes 255 heads, 63 sectors/track, 182401 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdb1 * 1 66 530113+ 83 Linux /dev/sdb2 67 132 530145 82 Linux swap / Solaris /dev/sdb3 133 182338 1463569695 83 Linux /dev/sdb4 182339 182400 498015 83 Linux Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes 255 heads, 63 sectors/track, 182401 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdc1 1 66 530125 83 Linux /dev/sdc2 67 132 530142 83 Linux /dev/sdc3 133 182338 1463569693 83 Linux /dev/sdc4 182339 182400 498012 83 Linux Disk /dev/sdd: 2000.3 GB, 2000398934016 bytes 255 heads, 63 sectors/track, 243201 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdd1 1 66 530125 83 Linux /dev/sdd2 67 132 530142 83 Linux /dev/sdd3 133 243138 1951945693 83 Linux /dev/sdd4 243139 243200 498012 83 Linux Disk /dev/md9: 542 MB, 542769152 bytes 2 heads, 4 sectors/track, 132512 cylinders Units = cylinders of 8 * 512 = 4096 bytes Disk /dev/md9 doesn't contain a valid partition table Disk /dev/md5: 542 MB, 542769152 bytes 2 heads, 4 sectors/track, 132512 cylinders Units = cylinders of 8 * 512 = 4096 bytes Disk /dev/md5 doesn't contain a valid partition table
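    For the record, the usual last-resort technique for a missing superblock on an otherwise-good member is to recreate only the metadata with --assume-clean, copying every parameter (level, chunk, layout, metadata version, device order) from the surviving superblocks and putting "missing" in the dead slot so no resync is triggered. This is destructive if any parameter or the order is wrong, so it should only be attempted after imaging the disks; the command below is a hedged sketch assembled from the values printed above, not a verified recipe for this NAS.

        # Image the members first (ddrescue/dd to spare storage) before touching metadata

        # Recreate the superblocks only; --assume-clean skips the initial sync,
        # "missing" leaves the failed fifth slot empty so the array comes up degraded
        mdadm --create /dev/md0 --level=5 --raid-devices=5 \
              --metadata=0.90 --chunk=64 --layout=left-symmetric \
              --assume-clean \
              /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 missing

        # Verify read-only before trusting anything
        mdadm --detail /dev/md0
        fsck -n /dev/md0           # or: mount -o ro /dev/md0 /mnt and inspect the data

    If the data looks right, back it up, and only then add the replacement drive to start the rebuild.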

    Read the article

  • RAID 1 array won't assemble after power outage. How do I fix this ext4 mirror?

    - by Forkrul Assail
    Two ext4 drives on Raid 1 with mdadm won't reassemble after the power went out for an extended period (UPS drained). After turning the machine back on, mdadm said that the array was degraded, after which it took about 2 days for a full resync, which completed without problems. On trying to remount the array I get: mount: you must specify the filesystem type cat /etc/fstab lines relevant to setup: /dev/md127 /media/mediapool ext4 defaults 0 0 dmesg | tail (on trying to mount) says: [ 1050.818782] EXT3-fs (md127): error: can't find ext3 filesystem on dev md127. [ 1050.849214] EXT4-fs (md127): VFS: Can't find ext4 filesystem [ 1050.944781] FAT-fs (md127): invalid media value (0x00) [ 1050.944782] FAT-fs (md127): Can't find a valid FAT filesystem [ 1058.272787] EXT2-fs (md127): error: can't find an ext2 filesystem on dev md127. cat /proc/mdstat says: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md127 : active (auto-read-only) raid1 sdj[2] sdi[0] 2930135360 blocks super 1.2 [2/2] [UU] unused devices: <none> fsck /dev/md127 says: fsck from util-linux 2.20.1 e2fsck 1.42 (29-Nov-2011) fsck.ext2: Superblock invalid, trying backup blocks... fsck.ext2: Bad magic number in super-block while trying to open /dev/md127 The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock: e2fsck -b 8193 <device> mdadm -E /dev/sdi gives me: /dev/sdi: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394 Name : sojourn:33 Creation Time : Sat Nov 10 10:43:52 2012 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB) Array Size : 2930135360 (2794.39 GiB 3000.46 GB) Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB) Data Offset : 262144 sectors Super Offset : 8 sectors State : clean Device UUID : 3e6e9a4f:6c07ab3d:22d47fce:13cecfd0 Update Time : Tue Nov 13 20:34:18 2012 Checksum : f7d10db9 - correct Events : 27 Device Role : Active device 0 Array State : AA ('A' == active, '.' == missing) boot@boot ~ $ sudo mdadm -E /dev/sdj /dev/sdj: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394 Name : sojourn:33 Creation Time : Sat Nov 10 10:43:52 2012 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB) Array Size : 2930135360 (2794.39 GiB 3000.46 GB) Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB) Data Offset : 262144 sectors Super Offset : 8 sectors State : clean Device UUID : 7fb84af4:e9295f7b:ede61f27:bec0cb57 Update Time : Tue Nov 13 20:34:18 2012 Checksum : b9d17fef - correct Events : 27 Device Role : Active device 1 Array State : AA ('A' == active, '.' == missing) machine@user ~ dmesg | tail [ 61.785866] init: alsa-restore main process (2736) terminated with status 99 [ 68.433548] eth0: no IPv6 routers present [ 534.142511] EXT4-fs (sdi): ext4_check_descriptors: Block bitmap for group 0 not in group (block 2838187772)! [ 534.142518] EXT4-fs (sdi): group descriptors corrupted! [ 546.418780] EXT2-fs (sdi): error: couldn't mount because of unsupported optional features (240) [ 549.654127] EXT3-fs (sdi): error: couldn't mount because of unsupported optional features (240) Since this is Raid 1 it was suggested that I try and mount or fsck the drives separately. 
After a long fsck on one drive, it ended with this as tail: Illegal double indirect block (2298566437) in inode 39717736. CLEARED. Illegal block #4231180 (2611866932) in inode 39717736. CLEARED. Error storing directory block information (inode=39717736, block=0, num=1092368): Memory allocation failed Recreate journal? yes Creating journal (32768 blocks): Done. *** journal has been re-created - filesystem is now ext3 again *** The drive, however, still doesn't want to mount: dmesg | tail [ 170.674659] md: export_rdev(sdc) [ 170.675152] md: export_rdev(sdc) [ 195.275288] md: export_rdev(sdc) [ 195.275876] md: export_rdev(sdc) [ 1338.540092] CE: hpet increased min_delta_ns to 30169 nsec [26125.734105] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987) [26125.734115] EXT4-fs (sdc): group descriptors corrupted! [26182.325371] EXT3-fs (sdc): error: couldn't mount because of unsupported optional features (240) [27083.316519] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987) [27083.316530] EXT4-fs (sdc): group descriptors corrupted! Please help me fix this. I never in my wildest nightmares thought a complete mirror would die this badly. Am I missing something? Suggestions on fixing this? Could someone explain why it would resync after the power outage, only to seemingly nuke the drive? Thanks for reading. Any help is much appreciated. I've tried everything I can think of, including booting and filesystem checking with SystemRescue and Ubuntu liveboot discs.
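    Before experimenting further it is worth imaging both members and working only on the copies, since e2fsck has already rewritten one of them. As a hedged sketch of a common next step: operate on the assembled /dev/md127 rather than on the bare /dev/sdi or /dev/sdj (whose filesystem starts 262144 sectors in because of the 1.2 metadata), and point e2fsck at a backup superblock. mke2fs -n is a dry run that only prints where the superblocks would live; the 4096-byte block size is an assumption that has to match how the filesystem was originally created.

        # Copy the members somewhere safe first
        ddrescue /dev/sdi /mnt/spare/sdi.img /mnt/spare/sdi.map
        ddrescue /dev/sdj /mnt/spare/sdj.img /mnt/spare/sdj.map

        # The array is auto-read-only; make it writable for repair attempts
        mdadm --readwrite /dev/md127

        # Dry run: list candidate backup superblock locations, writes nothing
        mke2fs -n -b 4096 /dev/md127

        # Then try e2fsck against one of the printed backups, for example:
        e2fsck -B 4096 -b 32768 /dev/md127

    Running the filesystem tools against the raw members is risky with 1.2 metadata, so keeping all further checks on the md device itself avoids making things worse.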

    Read the article

  • e2fsck / resize2fs problems

    - by BlakBat
    I've got 6 drives (each 1.5T, all same model and firmware revision) that are part of a RAID5 array. The RAID5 makes up an LVM volume group and a logical volume. The latter contains only one ext3 partition. I've recently run: e2fsck -f /dev/vg03/lv01 && resize2fs -M /dev/vg03/lv01 which exited without an error. Now when I try to mount /dev/vg03/lv01 I get: EXT3-fs error (device dm-0): ext3_check_descriptors: Block bitmap for group 30533 not in group (block 1000532368)! EXT3-fs: group descriptors corrupted! How do I get out of this predicament? This is all the info I can currently give you: fdisk -l /dev/sd[cdefgh] shows (correctly) that they are "Linux raid autodetect" but fdisk now shows: fdisk -l /dev/md0 Disk /dev/md0: 7501.5 GB, 7501495664640 bytes ... Disk identifier: 0x00000000 Disk /dev/md0 doesn't contain a valid partition table (instead of an LVM-type partition) fdisk -l /dev/vg03/lv01 Disk /dev/vg03/lv01: 7501.5 GB, 7501491732480 bytes ... Disk identifier: 0x00000000 Disk /dev/vg03/lv01 doesn't contain a valid partition table (instead of an ext3-type partition) I've tried: e2fsck -fy /dev/vg03/lv01 e2fsck 1.41.12 (17-May-2010) e2fsck: Group descriptors look bad... trying backup blocks... Block bitmap for group 30533 is not in group. (block 1000532368) Relocate? yes Inode bitmap for group 30533 is not in group. (block 1000532369) Relocate? yes Pass 1: Checking inodes, blocks, and sizes Relocating group 30533's block bitmap to 1000524246... Error allocating 1 contiguous block(s) in block group 30533 for inode bitmap: Could not allocate block in ext2 filesystem e2fsck: aborted Extra information I can give you: cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active (auto-read-only) raid5 sdg1[0] sdh1[5] sdf1[4] sde1[3] sdc1[2] sdd1[1] 7325679360 blocks level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU] bitmap: 1/175 pages [4KB], 4096KB chunk unused devices: Lastly, all smartctl tests (short and extended) showed no errors on any of the disks. Should I try to resize2fs to grow /dev/vg03/lv01 and redo an e2fsck? Should I cfdisk /dev/md0 and /dev/vg03/lv01 back to their real types? Thanks in advance for any and all help. 2011-09-20 UPDATE I issued the following commands and was able to remount the partition, but by viewing the size (df) of before and after, it seems that 1 TB of data has gone missing. By checking the MD5SUMS (from an old backup) of some files with the "same" files from the remounted partition, some errors have been detected. Commands issued to remount the partition were: dumpe2fs /dev/vg03/lv01 Block count: 1000491435 Block size: 4096 tune2fs -O ^has_journal /dev/vg03/lv01 resize2fs -p /dev/vg03/lv01 dumpe2fs /dev/vg03/lv01 Block count: 1831418880 Block size: 4096 mount -o ro,noatime /dev/vg03/lv01 /mnt/raid OK... but files have been damaged / gone missing.
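    A hedged note on sequencing rather than a fix: once resize2fs -M has shrunk the filesystem below the data it held, anything further should happen on a copy, so snapshot or image the logical volume first (the snapshot size below is an arbitrary assumption and needs free extents in vg03; otherwise dd the LV to other storage). After that, the plan floated at the end of the question is the usual one: grow the filesystem back to the full LV and re-run e2fsck, falling back to a backup superblock if the primary stays bad.

        # Make the experiments reversible
        lvcreate -s -L 50G -n lv01_snap /dev/vg03/lv01

        # Grow the filesystem back to the size of the logical volume
        resize2fs /dev/vg03/lv01

        # Re-check; if the primary superblock is still bad, try a backup
        # (mke2fs -n /dev/vg03/lv01 lists backup locations without writing anything)
        e2fsck -f /dev/vg03/lv01
        e2fsck -B 4096 -b 32768 -f /dev/vg03/lv01

        # Once the filesystem is stable, put the journal back
        tune2fs -j /dev/vg03/lv01

    The "doesn't contain a valid partition table" messages from fdisk are normal for an md device used as a PV and for an LV carrying a filesystem directly, so there is no need to cfdisk anything back.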

    Read the article

< Previous Page | 1 2 3 4  | Next Page >