Zpool disk failure - Where am I at?
- by JT.WK
After checking the status of one of my zpools today, I was faced with the following:
root@server: zpool status -v myPool
  pool: myPool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 3h6m with 0 errors on Tue Sep 28 11:15:11 2010
config:
        NAME           STATE     READ WRITE CKSUM
        myPool         ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            c6t7d0     ONLINE       0     0     0
            c6t8d0     ONLINE       0     0     0
            spare      ONLINE       0     0     0
              c6t9d0   ONLINE      54     0     0
              c6t36d0  ONLINE       0     0     0
            c6t10d0    ONLINE       0     0     0
            c6t11d0    ONLINE       0     0     0
            c6t12d0    ONLINE       0     0     0
        spares
          c6t36d0      INUSE     currently in use
          c6t37d0      AVAIL
          c6t38d0      AVAIL
errors: No known data errors
From what I can see, c6t9d0 has encountered 54 read errors (the count is in the READ column). It seems the pool has automatically resilvered onto the spare disk c6t36d0, which is now in use.
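For anyone who wants to spot this condition from a script rather than by eye, here is a minimal sketch that filters `zpool status` output for devices with a non-zero READ, WRITE or CKSUM counter. The function name `find_erroring_devices` is made up for this example, and the sketch assumes the five-column device layout shown above.

```shell
#!/bin/sh
# Print every device row from `zpool status` output whose READ, WRITE
# or CKSUM counter is non-zero. The status text is read from stdin.
# find_erroring_devices is a hypothetical helper name, not a zpool feature.
find_erroring_devices() {
    awk '
        # Device rows have five fields: NAME STATE READ WRITE CKSUM.
        # Requiring numeric counters skips the header row, spare rows
        # such as "c6t36d0 INUSE currently in use", and the "errors:" line.
        NF == 5 &&
        $3 ~ /^[0-9.]+[KMGT]?$/ &&
        $4 ~ /^[0-9.]+[KMGT]?$/ &&
        $5 ~ /^[0-9.]+[KMGT]?$/ &&
        ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

It would be used as `zpool status -v myPool | find_erroring_devices`. Fed the status output above, the only row that passes the filter is c6t9d0, with read=54.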
My question is: where exactly am I at? Yes, the 'action' line tells me to determine whether or not the disk needs replacing, but is this disk still in use? Can I replace or remove it?
Any explanation would be much appreciated as I'm quite new to this stuff :)
Update: After following the advice from C10k Consulting, i.e. detaching the failed disk:
zpool detach myPool c6t9d0
and adding as a spare:
zpool add myPool spare c6t9d0
It appears as though all is well. The new status of my zpool is:
root@server: zpool status -v myPool
  pool: myPool
 state: ONLINE
 scrub: resilver completed after 3h6m with 0 errors on Tue Sep 28 11:15:11 2010
config:
        NAME           STATE     READ WRITE CKSUM
        myPool         ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            c6t7d0     ONLINE       0     0     0
            c6t8d0     ONLINE       0     0     0
            c6t36d0    ONLINE       0     0     0
            c6t10d0    ONLINE       0     0     0
            c6t11d0    ONLINE       0     0     0
            c6t12d0    ONLINE       0     0     0
        spares
          c6t37d0      AVAIL
          c6t38d0      AVAIL
          c6t9d0       AVAIL
errors: No known data errors
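The two steps I ran can be wrapped into one small function, sketched below. The names `retire_to_spare` and `DRY_RUN` are invented for this example; the `zpool detach`, `zpool add ... spare` and `zpool status -v` calls are the same ones used above. Treat it as a sketch that assumes the spare has already finished resilvering.

```shell
#!/bin/sh
# Sketch: detach a disk that a hot spare has already replaced, then
# add it back to the pool as a spare. retire_to_spare and DRY_RUN are
# hypothetical names used only in this example.
retire_to_spare() {
    pool=$1
    disk=$2
    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run=echo
    else
        run=
    fi
    # Detach the old disk; the in-use spare then becomes a permanent member.
    $run zpool detach "$pool" "$disk" || return 1
    # Register the detached disk as an available spare.
    $run zpool add "$pool" spare "$disk" || return 1
    # Show the resulting pool layout.
    $run zpool status -v "$pool"
}
```

Running `DRY_RUN=1 retire_to_spare myPool c6t9d0` only prints the three commands, which is a cheap way to check the pool and disk names first. Whether a disk that has logged read errors is worth keeping as a spare is a separate decision that this sketch does not make.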
Thanks for your help, C10k Consulting :)