Does a 3ware "ECC-ERROR" matter on a JBOD when I have ZFS?
- by Stefan Lasiewski
I have a FreeBSD 8.x machine running ZFS on a 3ware 9690SA controller.
The 3ware controller reports an ECC-ERROR on one of the disks:
//host> /c0 show
VPort  Status     Unit  Size       Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0     OK         u0    279.39 GB  SAS   0    -          SEAGATE ST3300657SS
p1     OK         u0    279.39 GB  SAS   1    -          SEAGATE ST3300657SS
p2     OK         u1    931.51 GB  SAS   2    -          SEAGATE ST31000640SS
p3     ECC-ERROR  u2    931.51 GB  SAS   3    -          SEAGATE ST31000640SS
p4     OK         u3    931.51 GB  SAS   4    -          SEAGATE ST31000640SS
Running "/c0 show events" shows no ECC errors in its recent history.
ZFS does not currently detect any errors; "zpool status" reports "No known data errors".
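For anyone who wants to watch for this condition automatically, here is a minimal sketch that scans the port table and lists every port whose status is not OK. It only parses text: the sample is copied from the "/c0 show" output above, the function name non_ok_ports is my own, and it assumes the columns are whitespace-separated with the status in the second field, as they are in this output. Other firmware or CLI versions may lay the table out differently.

```python
# Sample text copied from the "/c0 show" output in this post.
SAMPLE = """\
VPort Status Unit Size Type Phy Encl-Slot Model
------------------------------------------------------------------------------
p0 OK u0 279.39 GB SAS 0 - SEAGATE ST3300657SS
p1 OK u0 279.39 GB SAS 1 - SEAGATE ST3300657SS
p2 OK u1 931.51 GB SAS 2 - SEAGATE ST31000640SS
p3 ECC-ERROR u2 931.51 GB SAS 3 - SEAGATE ST31000640SS
p4 OK u3 931.51 GB SAS 4 - SEAGATE ST31000640SS
"""


def non_ok_ports(show_output):
    """Return (vport, status) pairs for every port row whose status is not OK.

    Assumption: port rows start with "p" followed by digits (e.g. "p3")
    and the status is the second whitespace-separated field.
    """
    flagged = []
    for line in show_output.splitlines():
        fields = line.split()
        # Skip the header, the separator line, and anything that is not a port row.
        if len(fields) < 2:
            continue
        vport, status = fields[0], fields[1]
        if not (vport.startswith("p") and vport[1:].isdigit()):
            continue
        if status != "OK":
            flagged.append((vport, status))
    return flagged


if __name__ == "__main__":
    print(non_ok_ports(SAMPLE))  # prints [('p3', 'ECC-ERROR')]
```

In practice the text would come from capturing the CLI output (for example via cron) rather than from a hard-coded string; that part is left out here because it depends on the local setup.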
My question: Is this ECC-ERROR something that I need to be concerned about?
According to the 3ware CLI 9.5.2 Manual, an ECC-ERROR means that the 3ware controller caught a read error for one or more sectors on this drive. This sometimes occurs when a RAID array is recovering from a failed disk. I believe that ECC-ERRORs can also be detected when the 3ware controller verifies each disk. None of the drives have failed, so there was no drive rebuild; I therefore assume that the 3ware controller discovered a bad sector when it ran its weekly auto-verify scan of the disks. Is this a safe assumption?
According to our logs, ZFS has not detected any bad sectors on this drive. My understanding is that ZFS can work around read errors: if ZFS detects a bad sector on the drive, it will simply mark that sector as bad and never use it again. From the ZFS perspective, one bad sector isn't a big deal, although it might indicate that the drive is starting to fail.