Formula to calculate probability of unrecoverable read error during RAID rebuild

Posted by OlafM on Super User
Published on 2012-12-09T11:34:44Z Indexed on 2012/12/13 17:06 UTC

I need to compare the reliability of different RAID systems built with either consumer or enterprise drives. Ignoring mechanical problems, the formula for the probability of hitting a read error during a rebuild is simple:

error_probability = 1 - (1-per_bit_error_rate)^bit_read

and with 3 TB drives I get

38% probability of experiencing a URE (unrecoverable read error) for a 2+1 disk RAID5 (4.7% for enterprise drives)
21% for a RAID1 (2.4% for enterprise drives)
51% probability of an error during recovery for the 3+1 RAID5 often used in SOHO products such as those from Synology. Most people don't know about this.
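
As a quick check of these figures, here is a minimal Python sketch of the formula. The per-bit error rates (1e-14 for consumer drives, 1e-15 for enterprise drives) and the decimal 3 TB capacity are assumptions: they are not stated in the question, but they are commonly quoted datasheet values and they reproduce the percentages listed. The function and constant names are made up for illustration.

```python
import math

# Assumed values, not stated in the question:
BITS_PER_DRIVE = 3e12 * 8   # 3 TB (decimal) expressed in bits
CONSUMER_RATE = 1e-14       # assumed URE rate per bit read, consumer drives
ENTERPRISE_RATE = 1e-15     # assumed URE rate per bit read, enterprise drives


def ure_probability(per_bit_error_rate, bit_read):
    """Return 1 - (1 - per_bit_error_rate)^bit_read.

    log1p/expm1 are used so the tiny per-bit rate is not lost to rounding.
    """
    return -math.expm1(bit_read * math.log1p(-per_bit_error_rate))


def rebuild_ure_probability(drives_read, per_bit_error_rate):
    """Probability of at least one URE while reading `drives_read` whole drives."""
    return ure_probability(per_bit_error_rate, drives_read * BITS_PER_DRIVE)


# Number of surviving drives that must be read in full during the rebuild.
layouts = {"RAID1": 1, "RAID5 2+1": 2, "RAID5 3+1": 3}

for name, drives_read in layouts.items():
    consumer = rebuild_ure_probability(drives_read, CONSUMER_RATE)
    enterprise = rebuild_ure_probability(drives_read, ENTERPRISE_RATE)
    print(f"{name}: consumer {consumer:.1%}, enterprise {enterprise:.1%}")
```

The only input that changes between layouts is how many surviving drives have to be read end to end, which is why the 3+1 array comes out worse than the 2+1 one.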

Calculating the error probability for single-disk fault tolerance is easy. My question concerns systems that tolerate multiple disk failures (RAID6/Z2, RAIDZ3 and RAID1 with multiple disks).

If only the first disk is used for the rebuild, and the second one is read again from the beginning in case of a URE, then the error probability is the one calculated above, squared (14.5% for consumer RAID5 2+1, 4.5% for consumer RAID1 1+2). However, I suppose (at least in ZFS, which has full checksums!) that the second parity/available disk is read only where needed, meaning that only a few sectors have to be read from it. How many UREs can possibly happen on the first disk? Not many, otherwise the error probability for single-disk-tolerant systems would skyrocket even more than I calculated.
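
The "squared" case can be sketched in Python as follows. It assumes a consumer URE rate of 1e-14 per bit and decimal 3 TB drives (neither is stated in the question), and it treats the second full read as an independent retry of the first. The names are hypothetical.

```python
import math

BITS_PER_DRIVE = 3e12 * 8   # assumed: 3 TB (decimal) in bits
CONSUMER_RATE = 1e-14       # assumed URE rate per bit read


def ure_probability(per_bit_error_rate, bit_read):
    """Return 1 - (1 - per_bit_error_rate)^bit_read, computed stably."""
    return -math.expm1(bit_read * math.log1p(-per_bit_error_rate))


def full_reread_failure(drives_read, per_bit_error_rate):
    """Failure probability when a URE triggers a second, complete read.

    Assumes both full passes are independent and each reads the same
    number of bits, so the rebuild fails only if both passes hit a URE.
    """
    single_pass = ure_probability(per_bit_error_rate,
                                  drives_read * BITS_PER_DRIVE)
    return single_pass ** 2


print(f"RAID5 2+1: {full_reread_failure(2, CONSUMER_RATE):.2%}")
print(f"RAID1 1+2: {full_reread_failure(1, CONSUMER_RATE):.2%}")
```

This is the pessimistic bound: it charges the fallback for a complete second pass even though only the damaged location actually needs to be re-read.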

If I'm correct, a second parity disk would practically lower the risk to extremely low values.
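
To put a rough number on "extremely low", here is a hedged Python sketch for the simplest case only: a three-way RAID1 that has lost one disk and reads the second surviving copy only for sectors that failed on the first. Everything in it is an assumption rather than something stated above: the 4096-byte sector size, the 1e-14 consumer error rate, decimal 3 TB drives, and fully independent errors between the two copies. Parity layouts such as RAID6/Z2 would need a per-stripe version of the same idea.

```python
import math

BITS_PER_DRIVE = 3e12 * 8   # assumed: 3 TB (decimal) in bits
CONSUMER_RATE = 1e-14       # assumed URE rate per bit read
SECTOR_BITS = 4096 * 8      # assumed: 4 KiB physical sectors


def ure_probability(per_bit_error_rate, bit_read):
    """Return 1 - (1 - per_bit_error_rate)^bit_read, computed stably."""
    return -math.expm1(bit_read * math.log1p(-per_bit_error_rate))


def sector_fallback_failure(per_bit_error_rate,
                            bits_per_drive=BITS_PER_DRIVE,
                            sector_bits=SECTOR_BITS):
    """Rebuild failure probability when the second copy is read per sector.

    A sector is lost only if it is unreadable on BOTH surviving copies;
    the two copies are assumed to fail independently.
    """
    sector_fail = ure_probability(per_bit_error_rate, sector_bits)
    both_copies_fail = sector_fail ** 2
    sectors = bits_per_drive / sector_bits
    # Probability that at least one sector is unreadable on both copies.
    return -math.expm1(sectors * math.log1p(-both_copies_fail))


print(f"Per-sector fallback: {sector_fallback_failure(CONSUMER_RATE):.3e}")
```

Under these assumptions the result is many orders of magnitude below the full-reread figure, which is consistent with the intuition in the question. Correlated failures (same batch, same age, same controller) are exactly what the independence assumption ignores.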

Am I correct?


Tags: raid, zfs, rebuild
