What happens to missed writes after a zpool clear?

Posted by Kevin on Server Fault on 2013-06-19
Filed under: zfs | software-raid
I am trying to understand ZFS's behaviour under a specific condition, but the documentation is not very explicit about it, so I'm left guessing.

Suppose we have a zpool with redundancy. Take the following sequence of events:

  1. A problem arises in the connection between device D and the server. This causes a large number of I/O failures, so ZFS faults the device and puts the pool into a degraded state.

  2. While the pool is in this degraded state, it is mutated (data is written and/or changed).

  3. The connectivity issue is physically repaired such that device D is reliable again.

  4. Knowing that most of the data on D is still valid, and not wanting to stress the pool with a needless resilver, the admin instead runs zpool clear pool D. Oracle's documentation indicates this as the appropriate action when the fault was caused by a transient problem that has since been corrected.
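For concreteness, here is roughly the command sequence I have in mind (pool and D are placeholder names, and the comments describe what I would expect to see rather than actual output):

   # while the link to D is broken, zpool status shows D as FAULTED
   # and the pool as DEGRADED
   zpool status -v pool

   # after the transient problem is physically fixed, clear the fault
   # on that one device rather than replacing or resilvering it
   zpool clear pool D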

I've read that zpool clear only clears the error counters and restores the device to online status. However, this is a bit troubling, because if that's all it does, it will leave the pool in an inconsistent state!

This is because the mutations from step 2 will never have been written to D. Instead, D will reflect the state of the pool prior to the connectivity failure. That is, of course, not a consistent state for a zpool, and it could lead to permanent data loss if another device fails; however, the pool status will not reflect this issue!

I would at least assume, based on ZFS's robust integrity mechanisms (every block is checksummed), that an attempt to read the mutated data from D would catch the stale blocks and repair them from the surviving redundancy. However, this raises two problems:

  1. Reads are not guaranteed to hit all of the mutated data unless a scrub is done (the relevant commands are sketched after this list); and

  2. Once ZFS does hit the mutated data, it (I'm guessing) might fault the drive again, because D would appear to be returning corrupt data and ZFS does not remember the earlier write failures.
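The only way I know of to force every allocated block to be read and verified is an explicit scrub, e.g.:

   # walk every allocated block, verify checksums, and repair any bad
   # copies from the pool's redundancy
   zpool scrub pool

   # watch scrub progress and any checksum errors it finds and repairs
   zpool status -v pool

But a full scrub is close to the whole-pool workload I was hoping to avoid by using zpool clear in the first place.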

Theoretically, ZFS could circumvent this problem by keeping track of the mutations that occur while the pool is degraded and writing them back to D once it is cleared. For some reason I suspect that's not what happens, though.

I'm hoping someone with intimate knowledge of ZFS can shed some light on this aspect.
