How ZFS handles online replacement in a RAID-Z (theoretical)
- by Kevin
This is a somewhat theoretical question about ZFS and RAID-Z. I'll use a three disk single-parity array as an example for clarity, but the problem can be extended to any number of disks and any parity.
Suppose we have disks A, B, and C in the pool, and that it is clean.
Suppose now that we physically add disk D with the intention of replacing disk C, and that disk C is still functioning correctly and is only being replaced out of preventive maintenance. Some admins might just yank C and install D, which is a little more organized as devices need not change IDs - however this does leave the array degraded temporarily and so for this example suppose we install D without offlining or removing C. Solaris docs indicate that we can replace a disk without first offlining it, using a command such as:
zpool replace pool C D
This should cause a resilvering onto D. Let us say that resilvering proceeds "downwards" along a "cursor." (I don't know the actual terminology used in the internal implementation.)
Suppose now that midways through the resilvering, disk A fails. In theory, this should be recoverable, as above the cursor B and D contain sufficient parity and below the cursor B and C contain sufficient parity. However, whether or not this is actually recoverable depnds upon internal design decisions in ZFS which I am not aware of (and which the manual doesn't say in certain terms).
If ZFS continues to send writes to C below the cursor, then we are fine. If, however, ZFS internally treats C as though it were gone, resilvering D only from parity between A and B and only writing A and B below the cursor, then we're toast.
Some experimenting could answer this question but I was hoping maybe someone on here already knows which way ZFS handles this situation. Thank you in advance for any insight!