Hardware error messages from syslogd
- by Farhat
I have a 64-core AMD server running CentOS, on which I was running a long job. Partway through the job's output, the messages below appeared. They look like a memory error. How severe is this, and what exactly does it indicate?
Message from syslogd@heracles at Nov 7 21:00:02 ...
kernel:[Hardware Error]: MC4_STATUS[Over|CE|MiscV|-|AddrV|-|-|CECC]: 0xdc10410040080a13
Message from syslogd@heracles at Nov 7 21:00:02 ...
kernel:[Hardware Error]: Northbridge Error (node 4): DRAM ECC error detected on the NB.
Message from syslogd@heracles at Nov 7 21:00:02 ...
kernel:[Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: RES (no timeout)
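
For reference, the raw MC4_STATUS value can be cross-checked against the text the kernel printed. The Python below is only a rough sketch: the bit positions (Over = 62, UC = 61, MiscV = 59, AddrV = 58, PCC = 57, CECC = 46, UECC = 45) and the bus-error layout of the low 16 bits (0000 1PPT RRRR IILL) are assumed from the usual AMD machine-check register layout and should be confirmed against the documentation for the exact CPU family. The function names are made up for illustration.

```python
# Raw value copied from the MC4_STATUS log line above.
STATUS = 0xdc10410040080a13

# Assumed bit positions in the AMD MCi_STATUS register (verify for your CPU family).
FLAG_BITS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57,
    "CECC": 46, "UECC": 45,
}

# Lookup tables for the assumed bus-error code format 0000 1PPT RRRR IILL.
LL = ["RESV", "L1", "L2", "L3/GEN"]   # cache level
II = ["MEM", "RESV", "IO", "GEN"]     # memory or I/O
PP = ["SRC", "RES", "OBS", "GEN"]     # participating processor
RRRR = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]  # transaction


def decode_flags(status):
    """Return a dict mapping each flag name to 0 or 1."""
    return {name: (status >> pos) & 1 for name, pos in FLAG_BITS.items()}


def decode_bus_error(status):
    """Decode the low 16 bits as a bus error, or return None if they are not one."""
    code = status & 0xFFFF
    if code & 0xF800 != 0x0800:       # top five bits must be 0000 1
        return None
    rrrr = (code >> 4) & 0xF
    return {
        "cache level": LL[code & 0x3],
        "mem/io": II[(code >> 2) & 0x3],
        "mem-tx": RRRR[rrrr] if rrrr < len(RRRR) else "RESV",
        "part-proc": PP[(code >> 9) & 0x3],
        "timeout": bool((code >> 8) & 1),
    }


if __name__ == "__main__":
    print(decode_flags(STATUS))
    print(decode_bus_error(STATUS))
```

Under those assumptions the decode agrees with the log text: UC is clear and CECC is set (a corrected ECC error, shown as CE), Over is set (an earlier error was still logged when this one arrived), and the low bits give L3/GEN, MEM, RD, RES with no timeout.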