What does "single-bit ECC errors were detected on the RAID controller" mean?
Posted by jsp on Server Fault
Published on 2014-02-07T22:02:55Z
Indexed on 2014/08/23 16:24 UTC
I have a Dell T7600 with a PERC H710P RAID controller and four attached 3TB drives. Over the past few months the RAID controller has been intermittently reporting errors on boot, such as "no boot device found" and "adapter at baseport is not responding", and disks are frequently reported as missing or failed.
I have since replaced the RAID controller, the 4 hard drives, and finally the system's motherboard.
After replacing the motherboard and rebooting a few times, I got the error:
Single bit ECC errors were detected on the RAID controller.
Please contact technical support to resolve this issue.
After rebooting about 20 more times, I haven't seen the ECC error again. The system otherwise seems OK, except that the disk fans will sometimes start blowing at full blast while the system is sitting completely idle, and they do not stop until I reboot.
Are the ECC errors in memory on the RAID controller itself? Or does the RAID controller map in system memory, so that the ECC errors are really in system memory? Or are the ECC errors in the 1GB cache that resides on the RAID controller?
© Server Fault or respective owner