Search Results

Search found 82 results on 4 pages for 'raid6'.

Page 1/4 | 1 2 3 4  | Next Page >

  • Areca 1280ml RAID6 volume set failed

    - by Richard
    Today we hit some kind of worst case scenario and are open to any kind of good ideas. Here is our problem: We are using several dedicated storage servers to host our virtual machines. Before I continue, here are the specs: Dedicated Server Machine Areca 1280ml RAID controller, Firmware 1.49 12x Samsung 1TB HDDs We configured one RAID6-set with 10 discs that contains one logical volume. We have two hot spares in the system. Today one HDD failed. This happens from time to time, so we replaced it. Upon rebuilding a second disc failed. Normally this is no fun. We stopped heavy IO-operations to ensure a stable RAID rebuild. Sadly the hot-spare disc failed while rebuilding and the whole thing stopped. Now we have the following situation: The controller says that the raid set is rebuilding The controller says that the volume failed It is a RAID 6 system and two discs failed, so the data has to be intact, but we cannot bring the volume online again to access the data. While searching we found the following leads. I don't know whether they are good or bad: Mirroring all the discs to a second set of drives. So we would have the possibility to try different things without loosing more than we already have. Trying to rebuild the array in R-Studio. But we have no real experience with the software. Pulling all drives, rebooting the system, changing into the areca controller bios, reinserting the HDDs one-by-one. Some people are saying that the brought the system online by this. Some are saying that the effect is zero. Some say, that they blew the whole thing. Using undocumented areca commands like "rescue" or "LeVel2ReScUe". Contacting a computer forensics service. But whoa... primary estimates by phone exceeded 20.000€. That's why we would kindly ask for help. Maybe we are missing the obvious? And yes of course, we have backups. But some systems lost one week of data, thats why we'd like to get the system up and running again. Any help, suggestions and questions are more than welcome.

    Read the article

  • Linux software RAID6: 3 drives offline - how to force online?

    - by Ole Tange
    This is similar to 3 drives fell out of Raid6 mdadm - rebuilding? except that it is not due to a failing cable. Instead the 3rd drive fell offline during rebuild of another drive. The drive failed with: kernel: end_request: I/O error, dev sdc, sector 293732432 kernel: md/raid:md0: read error not correctable (sector 293734224 on sdc). After rebooting both these sectors and the sectors around them are fine. This leads me to believe the error is intermittent and thus the device simply took too long to error correct the sector and remap it. I expect that no data was written to the RAID after it failed. Therefore I hope that if I can kick the last failing device online that the RAID is fine and that the xfs_filesystem is OK, maybe with a few missing recent files. Taking a backup of the disks in the RAID takes 24 hours, so I would prefer that the solution works the first time. I have therefore set up a test scenario: export PRE=3 parallel dd if=/dev/zero of=/tmp/raid${PRE}{} bs=1k count=1000k ::: 1 2 3 4 5 parallel mknod /dev/loop${PRE}{} b 7 ${PRE}{} \; losetup /dev/loop${PRE}{} /tmp/raid${PRE}{} ::: 1 2 3 4 5 mdadm --create /dev/md$PRE -c 4096 --level=6 --raid-devices=5 /dev/loop${PRE}[12345] cat /proc/mdstat mkfs.xfs -f /dev/md$PRE mkdir -p /mnt/disk2 umount -l /mnt/disk2 mount /dev/md$PRE /mnt/disk2 seq 1000 | parallel -j1 mkdir -p /mnt/disk2/{}\;cp /bin/* /mnt/disk2/{}\;sleep 0.5 & mdadm --fail /dev/md$PRE /dev/loop${PRE}3 /dev/loop${PRE}4 cat /proc/mdstat # Assume reboot so no process is using the dir kill %1; sync & kill %1; sync & # Force fail one too many mdadm --fail /dev/md$PRE /dev/loop${PRE}1 parallel --tag -k mdadm -E ::: /dev/loop${PRE}? | grep Upda # loop 2,5 are newest. loop1 almost newest => force add loop1 Next step is to add loop1 back - and this is where I am stuck. After that do a xfs-consistency check. When that works, check that the solution also works on real devices (such a 4 USB sticks).

    Read the article

  • Migrating RAID6 array from HP SmartArray P600 to LSI MegaRAID SAS 8888ELP

    - by MLu
    My customer has an external RAID6 array (8x 1TB in HP MSA60 enclosure) attached to HP SmartArray P600 controller. They have now decided to replace the server and the new one comes with LSI MegaRAID SAS 8888ELP. The HW supplier insists the migration is as simple as pulling the eSATA cable from P600 and inserting it into LSI but I'm not convinced. Unfortunately I have nowhere to test, and will have to do the exercise on the production array :( So the question in brief - is it possible to import the HP SmartArray P600-created RAID6 array to LSI MegaRAID 8888ELP?

    Read the article

  • Auto-rebuild RAID6 with MDADM

    - by user65632
    Hello everybody, i'm new in this forum, and because I see that a lot of people get helped here, i'll ask my question here! I have a openSUSE 11.3 Linux computer with 5 disks of 1TB (WD enterprise disks) in it. with mdadm I configured an RAID6 device. Now, after a lot of thorough testing, i've noticed that when the computer goes down unsuspectedly it could happen (1 time out of 10) that while booting, the md0 device isn't recognised, and then the machine goes in "recovery mode", which means that i have to 'CTRL + C' it so it can boot to openSUSE. Once in openSUSE i have to readd the drive manually with 'mdadm /dev/md0 --add /dev/sdX'. After this everything works back fine (after resynching). So my question is: Is there a way to auto-rebuild the RAID6 device when there are problems? and how can I stop this "recovery mode" from happening. Because the computer will be in a place I can't go to, to connect a keyboard, 'CTRL + C' it just to get in openSUSE. If you could help me, I would be a very very very! happy man! :-) thanks in advance! Mikhail

    Read the article

  • Linux software RAID6: rebuild slow

    - by Ole Tange
    I am trying to find the bottleneck in the rebuilding of a software raid6. ## Pause rebuilding when measuring raw I/O performance # echo 1 > /proc/sys/dev/raid/speed_limit_min # echo 1 > /proc/sys/dev/raid/speed_limit_max ## Drop caches so that does not interfere with measuring # sync ; echo 3 | tee /proc/sys/vm/drop_caches >/dev/null # time parallel -j0 "dd if=/dev/{} bs=256k count=4000 | cat >/dev/null" ::: sdbd sdbc sdbf sdbm sdbl sdbk sdbe sdbj sdbh sdbg 4000+0 records in 4000+0 records out 1048576000 bytes (1.0 GB) copied, 7.30336 s, 144 MB/s [... similar for each disk ...] # time parallel -j0 "dd if=/dev/{} skip=15000000 bs=256k count=4000 | cat >/dev/null" ::: sdbd sdbc sdbf sdbm sdbl sdbk sdbe sdbj sdbh sdbg 4000+0 records in 4000+0 records out 1048576000 bytes (1.0 GB) copied, 12.7991 s, 81.9 MB/s [... similar for each disk ...] So we can read sequentially at 140 MB/s in the outer tracks and 82 MB/s in the inner tracks on all the drives simultaneously. Sequential write performance is similar. This would lead me to expect a rebuild speed of 82 MB/s or more. # echo 800000 > /proc/sys/dev/raid/speed_limit_min # echo 800000 > /proc/sys/dev/raid/speed_limit_max # cat /proc/mdstat md2 : active raid6 sdbd[10](S) sdbc[9] sdbf[0] sdbm[8] sdbl[7] sdbk[6] sdbe[11] sdbj[4] sdbi[3](F) sdbh[2] sdbg[1] 27349121408 blocks super 1.2 level 6, 128k chunk, algorithm 2 [9/8] [UUU_UUUUU] [=========>...........] recovery = 47.3% (1849905884/3907017344) finish=855.9min speed=40054K/sec But we only get 40 MB/s. And often this drops to 30 MB/s. # iostat -dkx 1 sdbc 0.00 8023.00 0.00 329.00 0.00 33408.00 203.09 0.70 2.12 1.06 34.80 sdbd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdbe 13.00 0.00 8334.00 0.00 33388.00 0.00 8.01 0.65 0.08 0.06 47.20 sdbf 0.00 0.00 8348.00 0.00 33388.00 0.00 8.00 0.58 0.07 0.06 48.00 sdbg 16.00 0.00 8331.00 0.00 33388.00 0.00 8.02 0.71 0.09 0.06 48.80 sdbh 961.00 0.00 8314.00 0.00 37100.00 0.00 8.92 0.93 0.11 0.07 54.80 sdbj 70.00 0.00 8276.00 0.00 33384.00 0.00 8.07 0.78 0.10 0.06 48.40 sdbk 124.00 0.00 8221.00 0.00 33380.00 0.00 8.12 0.88 0.11 0.06 47.20 sdbl 83.00 0.00 8262.00 0.00 33380.00 0.00 8.08 0.96 0.12 0.06 47.60 sdbm 0.00 0.00 8344.00 0.00 33376.00 0.00 8.00 0.56 0.07 0.06 47.60 iostat says the disks are not 100% busy (but only 40-50%). This fits with the hypothesis that the max is around 80 MB/s. Since this is software raid the limiting factor could be CPU. top says: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 38520 root 20 0 0 0 0 R 64 0.0 2947:50 md2_raid6 6117 root 20 0 0 0 0 D 53 0.0 473:25.96 md2_resync So md2_raid6 and md2_resync are clearly busy taking up 64% and 53% of a CPU respectively, but not near 100%. The chunk size (128k) of the RAID was chosen after measuring which chunksize gave the least CPU penalty. If this speed is normal: What is the limiting factor? Can I measure that? If this speed is not normal: How can I find the limiting factor? Can I change that?

    Read the article

  • RAID6 mdraid -> LVM -> EXT4 root with GRUB2?

    - by Rotonen
    2012-03-31 Debian Wheezy daily build in VirtualBox 4.1.2, 6 disk devices. My steps to reproduce so far: Setup one partition, using the entire disk, as a physical volume for RAID, per disk Setup a single RAID6 mdraid array out of all of those Use the resulting md0 as the only physical volume for the volume group Setup your logical volumes, filesystems and mount points as you wish Install your system Both / and /boot will be in this stack. I've chosen EXT4 as my filesystem for this setup. I can get as far as GRUB2 rescue console, which can see the mdraid, the volume group and the LVM logical volumes (all named appropriately on all levels) on it, but I cannot ls the filesystem contents of any of those and I cannot boot from them. As far as I can see from the documentation the version of GRUB2 shipped there should handle all of this gracefully. http://packages.debian.org/wheezy/grub-pc (1.99-17 at the time of writing.) It is loading the ext2, raid, raid6rec, dosmbr (this one is in the list of modules once per disk) and lvm modules according to the generated grub.cfg file. Also it is defining the list of modules to be loaded twice in the generated grub.cfg file and according to quick Googling around this seems to be the norm and OK for GRUB2. How to get further by getting GRUB2 to actually be able to read the content of the filesystems and boot the system? What am I wrong about in my assumptions of functionality here? EDIT (2012-04-01) My generated grub.cfg: http://pastie.org/3708436 It seems it first makes my /usr logical volume the root and that might be source of the failure? A grub-mkconfig bug? Or is it supposed to get access to stuff from /usr before / and /boot? /boot is on / for me - no separate boot logical volume.

    Read the article

  • Strange performance issue with Dell R7610 and LSI 2208 RAID controller

    - by GregC
    Connecting controller to any of the three PCIe x16 slots yield choppy read performance around 750 MB/sec Lowly PCIe x4 slot yields steady 1.2 GB/sec read Given same files, same Windows Server 2008 R2 OS, same RAID6 24-disk Seagate ES.2 3TB array on LSI 9286-8e, same Dell R7610 Precision Workstation with A03 BIOS, same W5000 graphics card (no other cards), same settings etc. I see super-low CPU utilization in both cases. SiSoft Sandra reports x8 at 5GT/sec in x16 slot, and x4 at 5GT/sec in x4 slot, as expected. I'd like to be able to rely on the sheer speed of x16 slots. What gives? What can I try? Any ideas? Please assist Cross-posted from http://en.community.dell.com/support-forums/desktop/f/3514/t/19526990.aspx Follow-up information We did some more performance testing with reading from 8 SSDs, connected directly (without an expander chip). This means that both SAS cables were utilized. We saw nearly double performance, but it varied from run to run: {2.0, 1.8, 1.6, and 1.4 GB/sec were observed, then performance jumped back up to 2.0}. The SSD RAID0 tests were conducted in a x16 PCIe slot, all other variables kept the same. It seems to me that we were getting double the performance of HDD-based RAID6 array. Just for reference: maximum possible read burst speed over single channel of SAS 6Gb/sec is 570 MB/sec due to 8b/10b encoding and protocol limitations (SAS cable provides four such channels).

    Read the article

  • Suggestions for splitting server roles amongst Hyper-V virtual servers / RAID6 or RAID10? / AppAssure

    - by Anon
    We have 2 Hyper-V hosts at present running 1 virtual server that was converted from a physical box running all roles. My plan is to split the roles over various virtual machines, upgrading to the latest software versions as I go, and use the backup server as a standby in case the main server fails. AppAssure backup software has a feature called Virtual Standby, so the VHD's can be ready to be fired up on the backup server if necessary. Off-site backups will be done via external USB drive for now. I'm just seeking some input/suggestions into how I'm planning to split the roles out amongst various virtual servers. Also, I'm curious how to setup the storage on the servers. We do not have any NAS's, SAN'S or any budget for this. What would the best RAID level be to use? I'm thinking either RAID6 (which is currently used) however I'm concerned about the write speeds, or RAID10 but again I'm worried that I can only lose 1 drive (from the same mirror) as opposed to any 2 with RAID6. I realise I have a hot swap for this, but what if a further drive fails during a rebuild? Is the write penalty of RAID6 worth the extra reliability over RAID10? Or will it be too slow with all the roles I am planning, therefore RAID10 is my only real option? The reason for the needed redundancy is I am the only technician and I'm not always on-site. Options I've considered: 1) 5 drives in RAID6 set, 200gb for host OS, rest for VM storage. 1 drive for hot swap - this is how it is currently setup 2) 4 drives in RAID10 set, 200gb for host OS, rest for VM storage. 2 drives for hot swap 3) 4 drives in RAID10 set for VM storage, 2 drives in RAID1 set for host OS. No drives for hot swap - While this is probably the best option with the amount of drives I have, I don't like the idea of having no hot swap 4) 3 drives in RAID6 set for VM storage, 2 drives in RAID1 set for host OS. 1 drive for hot swap All options give us enough storage capacity for our files, etc. We don't have any budget for extra drives or extra hot swap HD chassis for the servers. We have about 70 clients and about 150 users. MAIN SERVER Intel Xeon 5520 @ 2.27 GHz (2 processors) 16GB RAM 6 x 1TB Seagate Barracuda ES.2 Enterprise SATA drives Intel SRCSATAWB RAID controller Virtual machine workload using Hyper-V on Windows Server 2008 R2: DC01 - Active Directory Domain Controller / DNS server / Global catalog - 1GB RAM DC02 - Active Directory Domain Controller / DNS server / Global catalog - 1GB RAM Member Server - DHCP server, File server, Print server - 1GB RAM SCCM Member Server - 4GB RAM Third Party Software Member Server - A/V server, Ticketing software, etc - 4GB RAM Exchange 2007 - 4GB RAM - however we are probably migrating to a hosted solution, therefore freeing up resources BACKUP SERVER Intel Xeon E5410 @ 2.33GHz (2 processors) 16GB RAM 6 x 2TB WD RE4 SATA drives Intel SRCSASRB RAID controller Virtual machine workload using Hyper-V on Windows Server 2008 R2: AppAssure backup software - 8GB RAM

    Read the article

  • 3Ware 9650SE RAID-6, two degraded drives, one ECC, rebuild stuck

    - by cswingle
    This morning I came in the office to discover that two of the drives on a RAID-6, 3ware 9650SE controller were marked as degraded and it was rebuilding the array. After getting to about 4%, it got ECC errors on a third drive (this may have happened when I attempted to access the filesystem on this RAID and got I/O errors from the controller). Now I'm in this state: > /c2/u1 show Unit UnitType Status %RCmpl %V/I/M Port Stripe Size(GB) ------------------------------------------------------------------------ u1 RAID-6 REBUILDING 4%(A) - - 64K 7450.5 u1-0 DISK OK - - p5 - 931.312 u1-1 DISK OK - - p2 - 931.312 u1-2 DISK OK - - p1 - 931.312 u1-3 DISK OK - - p4 - 931.312 u1-4 DISK OK - - p11 - 931.312 u1-5 DISK DEGRADED - - p6 - 931.312 u1-6 DISK OK - - p7 - 931.312 u1-7 DISK DEGRADED - - p3 - 931.312 u1-8 DISK WARNING - - p9 - 931.312 u1-9 DISK OK - - p10 - 931.312 u1/v0 Volume - - - - - 7450.5 Examining the SMART data on the three drives in question, the two that are DEGRADED are in good shape (PASSED without any Current_Pending_Sector or Offline_Uncorrectable errors), but the drive listed as WARNING has 24 uncorrectable sectors. And, the "rebuild" has been stuck at 4% for ten hours now. So: How do I get it to start actually rebuilding? This particular controller doesn't appear to support /c2/u1 resume rebuild, and the only rebuild command that appears to be an option is one that wants to know what disk to add (/c2/u1 start rebuild disk=<p:-p...> [ignoreECC] according to the help). I have two hot spares in the server, and I'm happy to engage them, but I don't understand what it would do with that information in the current state it's in. Can I pull out the drive that is demonstrably failing (the WARNING drive), when I have two DEGRADED drives in a RAID-6? It seems to me that the best scenario would be for me to pull the WARNING drive and tell it to use one of my hot spares in the rebuild. But won't I kill the thing by pulling a "good" drive in a RAID-6 with two DEGRADED drives? Finally, I've seen reference in other posts to a bad bug in this controller that causes good drives to be marked as bad and that upgrading the firmware may help. Is flashing the firmware a risky operation given the situation? Is it likely to help or hurt wrt the rebuilding-but-stuck-at-4% RAID? Am I experiencing this bug in action? Advice outside the spiritual would be much appreciated. Thanks.

    Read the article

  • How to identify RAID (5 or 6) controllers that allow dynamic resize of the array

    - by David Pfeffer
    I'm building a server with a RAID5 array, based on a hardware controller. I want to be able to later add additional disks and have the array rebalance across all of the disks, enlarging the usable size. I also want to be able to later upgrade to bigger disks (one at a time, of course) and then expand the array to fill the entire drive. These features are available in Linux software raid (md). I've also heard they're available in some hardware controllers. Currently, I own the Adaptec RAID 3805 card and the 3ware 9650se card. I'd prefer to use the Adaptec if possible, but I can't find if either of these cards offer this feature. If they don't, are there other affordable (read as: sub-$600) RAID cards available that can accomplish this?

    Read the article

  • Linux mdadm software RAID 6 - does it support bit corruption recovery?

    - by user101203
    Wikipedia says "RAID 2 is the only standard RAID level, other than some implementations of RAID 6, which can automatically recover accurate data from single-bit corruption in data." Does anyone know if the RAID 6 mdadm implementation in Linux is one such implementation that can automatically detect and recover from single-bit data corruption. This pertains to CentOS / Red Hat 6 if those are different from other versions. I tried searching online but didn't have much luck. With SATA error rates being 1 in 1E14 bits, and a 2TB SATA disk containing 1.6E13 bits, this is especially relevant to preventing data corruption. Thanks!

    Read the article

  • Force RAID to read exiled disk?

    - by user198847
    user user197015 asked on 1th of November the following question: "We have a RAID 6 array (Infortrend EonStor DS S16F) that recently had two disks fail. Immediately prior to replacing these two disks, a third, good, disk was accidentally ejected from the array. After reinserting this disk it is marked as "exiled" by the array's firmware, and so even after replacing the two failed disks with new ones the array refuses to rebuild the logical volume and remains inaccessible. Since the temporarily-ejected disk is still functional and nothing has been written to the array since it was ejected, it seems that it should theoretically be possible to recover all the data on the array, but how can we convince the array to use the data from the "exiled" disk? Thanks for any help or advice you can offer." Now I've got quite the same problem. The post has been deleted by the user, so I don#t know if he was successful. Is there anybody who can help me? Thank you!

    Read the article

  • High IO latency on ESXi 4.1 on HP DL580 G7

    - by teddydestodes
    at my Company we were experiencing some weird spikes on IO latency on one of our ESXi instances. we've spend 24h figuring out whats wrong and no clue so far. after giving up we put all the disks into a different server (HP DL380 G7) with much less RAM and only one 6(HT) cores (from 12 on the DL 580) which run fine for about 2 hours. I dont know the specs for the DL380 but both servers have a Smart Array P410i with BBWC (the DL 580 has 1GB) Is it possible that one(or all) of the disks is failing without actually failing?

    Read the article

  • LSI RAID-on-chip with RAID6 over two SAS links goes red when HDD enclosure is powered cycled; how to recover?

    - by GregC
    I have a RAID6 array managed by LSI 9286-8e card. I also have Sans Digital 24-bay NexentaSTOR JBOD enclosure with SAS extender built-in. They are connected to separate UPS devices. Normally, I'd shut down the PC, leaving RAID6 in healthy state. But today the power to JBOD enclosure was cut but PC kept running. After restarting the PC, all disks in RAID6 have lit up RED, and the only option in LSI MegaRAID manager app was to reset each disk to unassigned, thereby loosing all data on RAID6 array. Thankfully, I am only testing, but how would I recover if this were to happen in production?

    Read the article

  • mdadm+zfs vs mdadm+lvm

    - by Alex
    This may be a naive question since I'm new to this and I cannot find any results about mdadm+zfs, but after some testing it seems it might work: The use case is a server with RAID6 for some data that is backed-up somewhat infrequently. I think I'm well served by any of ZFS or RAID6. Platform is Linux. Performance is secondary. So the two setups I am considering are: A RAID6 array plus regular LVM and ext4 A RAID6 array plus ZFS (without redundancy). Is this second option that I don't see discussed at all. Why ZFS+RAID6? It's mainly because the inability of ZFS to grow a raidz2 with new disks. You can replace disks with larger ones, I know, but not add another disk. You can accomplish 2-disk redundancy and ZFS disk growth using mdadm as the redundancy layer. Besides that main point (otherwise I could go directly to raidz2 without RAID under it), these are the pros-cons that I see for each option: ZFS has snapshots without preallocated space. LVM requires preallocation (might be no longer true). ZFS has checksumming (very interested in this) and compression (nice bonus). LVM has online filesystem growth (ZFS can do it offline with export/mdadm --grow/import). LVM has encryption (ZFS-on-Linux has not). This is the only major con of this combo I see. I guess I could go RAID6+LVM+ZFS... seems too heavy, or not? So, to close with a proper question: 1) Is there anything that inherently discourages or precludes RAID6+ZFS? Anyone has experience with a setup like this? 2) Are there possibilities for checksumming and compression that would make ZFS unnecessary (maintaining the possibility of filesystem growth)? Because the RAID6+LVM combo seems the sanctioned, tested way.

    Read the article

  • mdadm: Win7-install created a boot partition on one of my RAID6 drives. How to rebuild?

    - by EXIT_FAILURE
    My problem happened when I attempted to install Windows 7 on it's own SSD. The Linux OS I used which has knowledge of the software RAID system is on a SSD that I disconnected prior to the install. This was so that windows (or I) wouldn't inadvertently mess it up. However, and in retrospect, foolishly, I left the RAID disks connected, thinking that windows wouldn't be so ridiculous as to mess with a HDD that it sees as just unallocated space. Boy was I wrong! After copying over the installation files to the SSD (as expected and desired), it also created an ntfs partition on one of the RAID disks. Both unexpected and totally undesired! . I changed out the SSDs again, and booted up in linux. mdadm didn't seem to have any problem assembling the array as before, but if I tried to mount the array, I got the error message: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so dmesg: EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! EXT4-fs (md0): group descriptors corrupted! I then used qparted to delete the newly created ntfs partition on /dev/sdd so that it matched the other three /dev/sd{b,c,e}, and requested a resync of my array with echo repair > /sys/block/md0/md/sync_action This took around 4 hours, and upon completion, dmesg reports: md: md0: requested-resync done. A bit brief after a 4-hour task, though I'm unsure as to where other log files exist (I also seem to have messed up my sendmail configuration). In any case: No change reported according to mdadm, everything checks out. mdadm -D /dev/md0 still reports: Version : 1.2 Creation Time : Wed May 23 22:18:45 2012 Raid Level : raid6 Array Size : 3907026848 (3726.03 GiB 4000.80 GB) Used Dev Size : 1953513424 (1863.02 GiB 2000.40 GB) Raid Devices : 4 Total Devices : 4 Persistence : Superblock is persistent Update Time : Mon May 26 12:41:58 2014 State : clean Active Devices : 4 Working Devices : 4 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 4K Name : okamilinkun:0 UUID : 0c97ebf3:098864d8:126f44e3:e4337102 Events : 423 Number Major Minor RaidDevice State 0 8 16 0 active sync /dev/sdb 1 8 32 1 active sync /dev/sdc 2 8 48 2 active sync /dev/sdd 3 8 64 3 active sync /dev/sde Trying to mount it still reports: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so and dmesg: EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! EXT4-fs (md0): group descriptors corrupted! I'm a bit unsure where to proceed from here, and trying stuff "to see if it works" is a bit too risky for me. This is what I suggest I should attempt to do: Tell mdadm that /dev/sdd (the one that windows wrote into) isn't reliable anymore, pretend it is newly re-introduced to the array, and reconstruct its content based on the other three drives. I also could be totally wrong in my assumptions, that the creation of the ntfs partition on /dev/sdd and subsequent deletion has changed something that cannot be fixed this way. My question: Help, what should I do? If I should do what I suggested , how do I do that? From reading documentation, etc, I would think maybe: mdadm --manage /dev/md0 --set-faulty /dev/sdd mdadm --manage /dev/md0 --remove /dev/sdd mdadm --manage /dev/md0 --re-add /dev/sdd However, the documentation examples suggest /dev/sdd1, which seems strange to me, as there is no partition there as far as linux is concerned, just unallocated space. Maybe these commands won't work without. Maybe it makes sense to mirror the partition table of one of the other raid devices that weren't touched, before --re-add. Something like: sfdisk -d /dev/sdb | sfdisk /dev/sdd Bonus question: Why would the Windows 7 installation do something so st...potentially dangerous? Update I went ahead and marked /dev/sdd as faulty, and removed it (not physically) from the array: # mdadm --manage /dev/md0 --set-faulty /dev/sdd # mdadm --manage /dev/md0 --remove /dev/sdd However, attempting to --re-add was disallowed: # mdadm --manage /dev/md0 --re-add /dev/sdd mdadm: --re-add for /dev/sdd to /dev/md0 is not possible --add, was fine. # mdadm --manage /dev/md0 --add /dev/sdd mdadm -D /dev/md0 now reports the state as clean, degraded, recovering, and /dev/sdd as spare rebuilding. /proc/mdstat shows the recovery progress: md0 : active raid6 sdd[4] sdc[1] sde[3] sdb[0] 3907026848 blocks super 1.2 level 6, 4k chunk, algorithm 2 [4/3] [UU_U] [>....................] recovery = 2.1% (42887780/1953513424) finish=348.7min speed=91297K/sec nmon also shows expected output: ¦sdb 0% 87.3 0.0| > |¦ ¦sdc 71% 109.1 0.0|RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR > |¦ ¦sdd 40% 0.0 87.3|WWWWWWWWWWWWWWWWWWWW > |¦ ¦sde 0% 87.3 0.0|> || It looks good so far. Crossing my fingers for another five+ hours :) Update 2 The recovery of /dev/sdd finished, with dmesg output: [44972.599552] md: md0: recovery done. [44972.682811] RAID conf printout: [44972.682815] --- level:6 rd:4 wd:4 [44972.682817] disk 0, o:1, dev:sdb [44972.682819] disk 1, o:1, dev:sdc [44972.682820] disk 2, o:1, dev:sdd [44972.682821] disk 3, o:1, dev:sde Attempting mount /dev/md0 reports: mount: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so And on dmesg: [44984.159908] EXT4-fs (md0): ext4_check_descriptors: Block bitmap for group 0 not in group (block 1318081259)! [44984.159912] EXT4-fs (md0): group descriptors corrupted! I'm not sure what do do now. Suggestions? Output of dumpe2fs /dev/md0: dumpe2fs 1.42.8 (20-Jun-2013) Filesystem volume name: Atlas Last mounted on: /mnt/atlas Filesystem UUID: e7bfb6a4-c907-4aa0-9b55-9528817bfd70 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize Filesystem flags: signed_directory_hash Default mount options: user_xattr acl Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 244195328 Block count: 976756712 Reserved block count: 48837835 Free blocks: 92000180 Free inodes: 243414877 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 791 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8192 Inode blocks per group: 512 RAID stripe width: 2 Flex block group size: 16 Filesystem created: Thu May 24 07:22:41 2012 Last mount time: Sun May 25 23:44:38 2014 Last write time: Sun May 25 23:46:42 2014 Mount count: 341 Maximum mount count: -1 Last checked: Thu May 24 07:22:41 2012 Check interval: 0 (<none>) Lifetime writes: 4357 GB Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: e177a374-0b90-4eaa-b78f-d734aae13051 Journal backup: inode blocks dumpe2fs: Corrupt extent header while reading journal super block

    Read the article

  • Xen kernel can't see 2 disks of 6 of 1TB, does it have a limitation?

    - by PartySoft
    Linux gentoo-xen 2.6.18-xen-r12 #3 SMP Tue Oct 5 09:28:53 PDT 2010 x86_64 Intel(R) Xeon(R) CPU E5506 @ 2.13GHz GenuineIntel GNU/Linux I have 6 disks of 1 TB and i can't see all of them only 4, can anyone give me an ideea what can i do ? Filesystem Size Used Avail Use% Mounted on rootfs 886G 4.4G 836G 1% / /dev/sda3 886G 4.4G 836G 1% / rc-svcdir 1.0M 44K 980K 5% /lib64/rc/init.d shm 7.9G 0 7.9G 0% /dev/shm /dev/sdb1 917G 200M 871G 1% /home2 /dev/sdc1 917G 200M 871G 1% /home3 /dev/sdd1 917G 200M 871G 1% /home4 The hardware is Dual xeon E5506 processors on a supermicro X8DTL mobo 4.346585] ata3.00: ATA-8, max UDMA/133, 1953525168 sectors: LBA48 NCQ (depth 0/32) [ 4.346588] ata3.00: ata3: dev 0 multi count 16 [ 4.352861] ata3.00: configured for UDMA/133 [ 4.352867] scsi3 : ata_piix [ 4.352875] PM: Adding info for No Bus:host3 [ 4.510584] ata4.00: ATA-8, max UDMA/133, 1953525168 sectors: LBA48 NCQ (depth 0/32) [ 4.510587] ata4.00: ata4: dev 0 multi count 16 [ 4.516848] ata4.00: configured for UDMA/133 [ 4.516861] PM: Adding info for No Bus:target2:0:0 [ 4.516905] Vendor: ATA Model: SAMSUNG HD103SJ Rev: 1AJ1 [ 4.516910] Type: Direct-Access ANSI SCSI revision: 05 [ 4.516920] PM: Adding info for scsi:2:0:0:0 [ 4.517452] SCSI device sde: 1953525168 512-byte hdwr sectors (1000205 MB) [ 4.517460] sde: Write Protect is off [ 4.517461] sde: Mode Sense: 00 3a 00 00 [ 4.517478] SCSI device sde: drive cache: write back [ 4.517514] SCSI device sde: 1953525168 512-byte hdwr sectors (1000205 MB) [ 4.517521] sde: Write Protect is off [ 4.517522] sde: Mode Sense: 00 3a 00 00 [ 4.517532] SCSI device sde: drive cache: write back [ 4.517534] sde: sde1 [ 4.524551] sd 2:0:0:0: Attached scsi disk sde [ 4.524855] sd 2:0:0:0: Attached scsi generic sg4 type 0 [ 4.524874] PM: Adding info for No Bus:target3:0:0 [ 4.524928] Vendor: ATA Model: SAMSUNG HD103SJ Rev: 1AJ1 [ 4.524933] Type: Direct-Access ANSI SCSI revision: 05 [ 4.524946] PM: Adding info for scsi:3:0:0:0 [ 4.525216] SCSI device sdf: 1953525168 512-byte hdwr sectors (1000205 MB) [ 4.525227] sdf: Write Protect is off [ 4.525228] sdf: Mode Sense: 00 3a 00 00 [ 4.525242] SCSI device sdf: drive cache: write back [ 4.525280] SCSI device sdf: 1953525168 512-byte hdwr sectors (1000205 MB) [ 4.525286] sdf: Write Protect is off [ 4.525289] sdf: Mode Sense: 00 3a 00 00 [ 4.525301] SCSI device sdf: drive cache: write back [ 4.525302] sdf: sdf1 [ 4.532691] sd 3:0:0:0: Attached scsi disk sdf [ 4.533010] sd 3:0:0:0: Attached scsi generic sg5 type 0 [ 4.977669] scsi: <fdomain> Detection failed (no card) [ 5.030479] GDT-HA: Storage RAID Controller Driver. Version: 3.05 [ 5.030635] GDT-HA: Found 0 PCI Storage RAID Controllers [ 5.372350] Fusion MPT base driver 3.04.01 [ 5.372358] Copyright (c) 1999-2005 LSI Logic Corporation [ 5.579176] Fusion MPT SPI Host driver 3.04.01 [ 5.881777] ieee1394: Initialized config rom entry `ip1394' [ 6.166745] ieee1394: sbp2: Driver forced to serialize I/O (serialize_io=1) [ 6.166748] ieee1394: sbp2: Try serialize_io=0 for better performance [ 6.428866] md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 [ 6.428872] md: bitmap version 4.39 [ 6.431518] md: raid0 personality registered for level 0 [ 6.495979] md: raid1 personality registered for level 1 [ 6.570270] raid5: automatically using best checksumming function: generic_sse [ 6.575523] generic_sse: 6608.000 MB/sec [ 6.575526] raid5: using function: generic_sse (6608.000 MB/sec) [ 6.596226] raid6: int64x1 1835 MB/s [ 6.613231] raid6: int64x2 1773 MB/s [ 6.630256] raid6: int64x4 1675 MB/s [ 6.647296] raid6: int64x8 1027 MB/s [ 6.664267] raid6: sse2x1 3578 MB/s [ 6.681268] raid6: sse2x2 4207 MB/s [ 6.698280] raid6: sse2x4 4625 MB/s [ 6.698281] raid6: using algorithm sse2x4 (4625 MB/s) [ 6.698285] md: raid6 personality registered for level 6 [ 6.698286] md: raid5 personality registered for level 5 [ 6.698288] md: raid4 personality registered for level 4 [ 6.781090] md: raid10 personality registered for level 10 [ 7.007043] Intel(R) PRO/1000 Network Driver - version 7.1.9-k4 [ 7.007046] Copyright (c) 1999-2006 Intel Corporation. [ 9.229465] kjournald starting. Commit interval 5 seconds [ 9.229476] EXT3-fs: mounted filesystem with ordered data mode.

    Read the article

  • Reusing slot numbers in Linux software RAID arrays

    - by thkala
    When a hard disk drive in one of my Linux machines failed, I took the opportunity to migrate from RAID5 to a 6-disk software RAID6 array. At the time of the migration I did not have all 6 drives - more specifically the fourth and fifth (slots 3 and 4) drives were already in use in the originating array, so I created the RAID6 array with a couple of missing devices. I now need to add those drives in those empty slots. Using mdadm --add does result in a proper RAID6 configuration, with one glitch - the new drives are placed in new slots, which results in this /proc/mdstat snippet: ... md0 : active raid6 sde1[7] sdd1[6] sda1[0] sdf1[5] sdc1[2] sdb1[1] 25185536 blocks super 1.0 level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU] ... mdadm -E verifies that the actual slot numbers in the device superblocks are correct, yet the numbers shown in /proc/mdstat are still weird. I would like to fix this glitch, both to satisfy my inner perfectionist and to avoid any potential sources of future confusion in a crisis. Is there a way to specify which slot a new device should occupy in a RAID array? UPDATE: I have verified that the slot number persists in the component device superblock. For the version 1.0 superblocks that I am using that would be the dev_number field as defined in include/linux/raid/md_p.h of the Linux kernel source. I am now considering direct modification of said field to change the slot number - I don't suppose there is some standard way to manipulate the RAID superblock?

    Read the article

  • Why is my RAID /dev/md1 showing up as /dev/md126? Is mdadm.conf being ignored?

    - by mmorris
    I created a RAID with: sudo mdadm --create --verbose /dev/md1 --level=mirror --raid-devices=2 /dev/sdb1 /dev/sdc1 sudo mdadm --create --verbose /dev/md2 --level=mirror --raid-devices=2 /dev/sdb2 /dev/sdc2 sudo mdadm --detail --scan returns: ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb Which I appended it to /etc/mdadm/mdadm.conf, see below: # mdadm.conf # # Please refer to mdadm.conf(5) for information about this file. # # by default (built-in), scan all partitions (/proc/partitions) and all # containers for MD superblocks. alternatively, specify devices to scan, using # wildcards if desired. #DEVICE partitions containers # auto-create devices with Debian standard permissions CREATE owner=root group=disk mode=0660 auto=yes # automatically tag new arrays as belonging to the local system HOMEHOST <system> # instruct the monitoring daemon where to send mail alerts MAILADDR root # definitions of existing MD arrays # This file was auto-generated on Mon, 29 Oct 2012 16:06:12 -0500 # by mkconf $Id$ ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sdb2[0] sdc2[1] 208629632 blocks super 1.2 [2/2] [UU] md1 : active raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: brw-rw---- 1 root disk 9, 1 Oct 30 11:06 md1 brw-rw---- 1 root disk 9, 2 Oct 30 11:06 md2 So I think all is good and I reboot. After the reboot, /dev/md1 is now /dev/md126 and /dev/md2 is now /dev/md127????? sudo mdadm --detail --scan returns: ARRAY /dev/md/ion:1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md/ion:2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md126 : active raid1 sdc2[1] sdb2[0] 208629632 blocks super 1.2 [2/2] [UU] md127 : active (auto-read-only) raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: drwxr-xr-x 2 root root 80 Oct 30 11:18 md brw-rw---- 1 root disk 9, 126 Oct 30 11:18 md126 brw-rw---- 1 root disk 9, 127 Oct 30 11:18 md127 All is not lost, I: sudo mdadm --stop /dev/md126 sudo mdadm --stop /dev/md127 sudo mdadm --assemble --verbose /dev/md1 /dev/sdb1 /dev/sdc1 sudo mdadm --assemble --verbose /dev/md2 /dev/sdb2 /dev/sdc2 and verify everything: sudo mdadm --detail --scan returns: ARRAY /dev/md1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sdb2[0] sdc2[1] 208629632 blocks super 1.2 [2/2] [UU] md1 : active raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: brw-rw---- 1 root disk 9, 1 Oct 30 11:26 md1 brw-rw---- 1 root disk 9, 2 Oct 30 11:26 md2 So once again, I think all is good and I reboot. Again, after the reboot, /dev/md1 is /dev/md126 and /dev/md2 is /dev/md127????? sudo mdadm --detail --scan returns: ARRAY /dev/md/ion:1 metadata=1.2 name=ion:1 UUID=aa1f85b0:a2391657:cfd38029:772c560e ARRAY /dev/md/ion:2 metadata=1.2 name=ion:2 UUID=528e5385:e61eaa4c:1db2dba7:44b556fb cat /proc/mdstat returns: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md126 : active raid1 sdc2[1] sdb2[0] 208629632 blocks super 1.2 [2/2] [UU] md127 : active (auto-read-only) raid1 sdb1[0] sdc1[1] 767868736 blocks super 1.2 [2/2] [UU] unused devices: <none> ls -la /dev | grep md returns: drwxr-xr-x 2 root root 80 Oct 30 11:42 md brw-rw---- 1 root disk 9, 126 Oct 30 11:42 md126 brw-rw---- 1 root disk 9, 127 Oct 30 11:42 md127 What am I missing here?

    Read the article

  • disks not ready in array causes mdadm to force initramfs shell

    - by RaidPinata
    Okay, this is starting to get pretty frustrating. I've read most of the other answers on this site that have anything to do with this issue but I'm still not getting anywhere. I have a RAID 6 array with 10 devices and 1 spare. The OS is on a completely separate device. At boot only three of the 10 devices in the raid are available, the others become available later in the boot process. Currently, unless I go through initramfs I can't get the system to boot - it just hangs with a blank screen. When I do boot through recovery (initramfs), I get a message asking if I want to assemble the degraded array. If I say no and then exit initramfs the system boots fine and my array is mounted exactly where I intend it to. Here are the pertinent files as near as I can tell. Ask me if you want to see anything else. # mdadm.conf # # Please refer to mdadm.conf(5) for information about this file. # # by default (built-in), scan all partitions (/proc/partitions) and all # containers for MD superblocks. alternatively, specify devices to scan, using # wildcards if desired. #DEVICE partitions containers # auto-create devices with Debian standard permissions # CREATE owner=root group=disk mode=0660 auto=yes # automatically tag new arrays as belonging to the local system HOMEHOST <system> # instruct the monitoring daemon where to send mail alerts MAILADDR root # definitions of existing MD arrays # This file was auto-generated on Tue, 13 Nov 2012 13:50:41 -0700 # by mkconf $Id$ ARRAY /dev/md0 level=raid6 num-devices=10 metadata=1.2 spares=1 name=Craggenmore:data UUID=37eea980:24df7b7a:f11a1226:afaf53ae Here is fstab # /etc/fstab: static file system information. # # Use 'blkid' to print the universally unique identifier for a # device; this may be used with UUID= as a more robust way to name devices # that works even if disks are added and removed. See fstab(5). # # <file system> <mount point> <type> <options> <dump> <pass> # / was on /dev/sdc2 during installation UUID=3fa1e73f-3d83-4afe-9415-6285d432c133 / ext4 errors=remount-ro 0 1 # swap was on /dev/sdc3 during installation UUID=c4988662-67f3-4069-a16e-db740e054727 none swap sw 0 0 # mount large raid device on /data /dev/md0 /data ext4 defaults,nofail,noatime,nobootwait 0 0 output of cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid6 sda[0] sdd[10](S) sdl[9] sdk[8] sdj[7] sdi[6] sdh[5] sdg[4] sdf[3] sde[2] sdb[1] 23441080320 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10] [UUUUUUUUUU] unused devices: <none> Here is the output of mdadm --detail --scan --verbose ARRAY /dev/md0 level=raid6 num-devices=10 metadata=1.2 spares=1 name=Craggenmore:data UUID=37eea980:24df7b7a:f11a1226:afaf53ae devices=/dev/sda,/dev/sdb,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi,/dev/sdj,/dev/sdk,/dev/sdl,/dev/sdd Please let me know if there is anything else you think might be useful in troubleshooting this... I just can't seem to figure out how to change the boot process so that mdadm waits until the drives are ready to build the array. Everything works just fine if the drives are given enough time to come online. edit: changed title to properly reflect situation

    Read the article

  • Improving mdadm RAID-6 write speed

    - by BarsMonster
    Hi! I have a mdadm RAID-6 in my home server of 5x1Tb WD Green HDDs. Read speed is more than enough - 268 Mb/s in dd. But write speed is just 37.1 Mb/s. (Both tested via dd on 48Gb file, RAM size is 1Gb, block size used in testing is 8kb) Could you please suggest why write speed is so low and is there any ways to improve it? CPU usage during writing is just 25% (i.e. half of 1 core of Opteron 165) No business critical data there & server is UPS-backed. mdstat is: Personalities : [raid6] [raid5] [raid4] md0 : active raid6 sda1[0] sdd1[4] sde1[3] sdf1[2] sdb1[1] 2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [5/5] [UUUUU] bitmap: 0/8 pages [0KB], 65536KB chunk unused devices: <none> Any suggestions?

    Read the article

  • How to stop RAID5 array while it is shown to be busy?

    - by RCola
    I have a raid5 array and need to stop it, but while trying to stop it getting error. # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU] unused devices: <none> # mdadm --stop mdadm: metadata format 00.90 unknown, ignored. mdadm: metadata format 00.90 unknown, ignored. mdadm: No devices given. # mdadm --stop /dev/md0 mdadm: metadata format 00.90 unknown, ignored. mdadm: metadata format 00.90 unknown, ignored. mdadm: fail to stop array /dev/md0: Device or resource busy and # lsof | grep md0 md0_raid5 965 root cwd DIR 8,1 4096 2 / md0_raid5 965 root rtd DIR 8,1 4096 2 / md0_raid5 965 root txt unknown /proc/965/exe # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU] # grep md0 /proc/mdstat md0 : active raid5 sde1[3](F) sdc1[4](F) sdf1[2] sdd1[1] # grep md0 /proc/partitions 9 0 2120320 md0 While booting, md1 is mounted ok but md0 failed for some unknown reason # dmesg | grep md[0-9] [ 4.399658] raid5: allocated 3179kB for md1 [ 4.400432] raid5: raid level 5 set md1 active with 3 out of 3 devices, algorithm 2 [ 4.400678] md1: detected capacity change from 0 to 2121793536 [ 4.403135] md1: unknown partition table [ 38.937932] Filesystem "md1": Disabling barriers, trial barrier write failed [ 38.941969] XFS mounting filesystem md1 [ 41.058808] Ending clean XFS mount for filesystem: md1 [ 46.325684] raid5: allocated 3179kB for md0 [ 46.327103] raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2 [ 46.330620] md0: detected capacity change from 0 to 2171207680 [ 46.335598] md0: unknown partition table [ 46.410195] md: recovery of RAID array md0 [ 117.970104] md: md0: recovery done. # cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid5 sde1[0] sdf1[2] sdd1[1] 2120320 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU] md1 : active raid5 sdc2[0] sdf2[2] sde2[3](S) sdd2[1] 2072064 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]

    Read the article

  • Reconstructing the disk order in RAID 6 with 7 disks

    - by rkotulla
    a little background to this question first: I am running a RAID-6 within a QNAP TS869L external RAID/NAS system. I started with 5 disks of 3 TB each back in the day, and later added another 2 disks of 3TB to the RAID. The QNAP internals handled the growing and re-syncing etc, and everything seemd to be perfectly fine. About 2 weeks ago, I had one of the disks (disk #5, disk #2 has gone bad in the mean time) fail, and somehow (I have no idea why), also disks 1 and 2 got kicked out of the array. I replaced disk #5, but the RAID didn't start working again. After some calls to QNAP technical support, they re-created the array (using mdadm --create --force --assume-clean ...), but the resulting array couldn't find a filesystem, and I was kindly referred to contact a data recovery company that I can't afford. After some digging through old log files, resetting the disk to factory default, etc, I found a few errors that were made during this re-create - I wish I still had some of the original metadata, but unfortunately i don't (I definitely learned that lesson). I'm currently at the point where I know the correct chunk-size (64K), metadata-version (1.0; factory default was 0.9, but from what I read 0.9 doesn't handle disks over 2 TB, mine are 3 TB), and I now find the ext4 filesystem that should be on the disks. Only variable left to determine is the right disk order! I started using the description found in answer #4 of "Recover RAID 5 data after created new array instead of re-using" but am a little confused on what the order should be for a proper RAID-6. RAID-5 is pretty well documented in a number of places, but RAID-6 much less so. Also, does the layout, i.e. distribution of parity and data chunks across the disks, change after the growing of the array from 5 to 7 disks, or does the re-sync re-organize them in such a way a native 7-disk RAID-6 would have been? Thanks some more mdadm output that might be helpful: mdadm version: [~] # mdadm --version mdadm - v2.6.3 - 20th August 2007 mdadm details from one of the disks in the array: [~] # mdadm --examine /dev/sda3 /dev/sda3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa Name : 0 Creation Time : Tue Jun 10 10:27:58 2014 Raid Level : raid6 Raid Devices : 7 Used Dev Size : 5857395112 (2793.02 GiB 2998.99 GB) Array Size : 29286975360 (13965.12 GiB 14994.93 GB) Used Size : 5857395072 (2793.02 GiB 2998.99 GB) Super Offset : 5857395368 sectors State : clean Device UUID : 7c572d8f:20c12727:7e88c888:c2c357af Update Time : Tue Jun 10 13:01:06 2014 Checksum : d275c82d - correct Events : 7036 Chunk Size : 64K Array Slot : 0 (0, 1, failed, 3, failed, 5, 6) Array State : Uu_u_uu 2 failed mdadm details for the array in the current disk-order (based on my best guess reconstructed from old log-files) [~] # mdadm --detail /dev/md0 /dev/md0: Version : 01.00.03 Creation Time : Tue Jun 10 10:27:58 2014 Raid Level : raid6 Array Size : 14643487680 (13965.12 GiB 14994.93 GB) Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB) Raid Devices : 7 Total Devices : 5 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Tue Jun 10 13:01:06 2014 State : clean, degraded Active Devices : 5 Working Devices : 5 Failed Devices : 0 Spare Devices : 0 Chunk Size : 64K Name : 0 UUID : 1c1614a5:e3be2fbb:4af01271:947fe3aa Events : 7036 Number Major Minor RaidDevice State 0 8 3 0 active sync /dev/sda3 1 8 19 1 active sync /dev/sdb3 2 0 0 2 removed 3 8 51 3 active sync /dev/sdd3 4 0 0 4 removed 5 8 99 5 active sync /dev/sdg3 6 8 83 6 active sync /dev/sdf3 output from /proc/mdstat (md8, md9, and md13 are internally used RAIDs holding swap, etc; the one I'm after is md0) [~] # more /proc/mdstat Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] md0 : active raid6 sdf3[6] sdg3[5] sdd3[3] sdb3[1] sda3[0] 14643487680 blocks super 1.0 level 6, 64k chunk, algorithm 2 [7/5] [UU_U_UU] md8 : active raid1 sdg2[2](S) sdf2[3](S) sdd2[4](S) sdc2[5](S) sdb2[6](S) sda2[1] sde2[0] 530048 blocks [2/2] [UU] md13 : active raid1 sdg4[3] sdf4[4] sde4[5] sdd4[6] sdc4[2] sdb4[1] sda4[0] 458880 blocks [8/7] [UUUUUUU_] bitmap: 21/57 pages [84KB], 4KB chunk md9 : active raid1 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sda1[0] sdb1[1] 530048 blocks [8/7] [UUUUUUU_] bitmap: 37/65 pages [148KB], 4KB chunk unused devices: <none>

    Read the article

1 2 3 4  | Next Page >