Complete machine freezes...at a loss

Posted by user28818 on Super User
Published on 2010-06-02T16:42:32Z

Guys,

We built around 12 machines a few months ago to run Ubuntu. They each have the following specs:

ASUS Z8NA-D6 motherboard
Dual quad-core Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
OCZ Mod Extreme Pro 500W power supply
12 GB Kingston RAM
Nvidia GeForce 9800 GT graphics card

My machine ran well for a while. However, it started experiencing random lockups. These lockups are not X lockups; they are complete system freezes. The NIC stops responding, and the magic SysRq keys won't work. The machine is dead.

I first suspected RAM. Memtest86 didn't find anything, but I replaced the RAM anyway. Still, lockups. So I replaced the graphics card. Still, more lockups. They became more and more frequent and started to happen 2-3 times a day.

So I replaced the motherboard and power supply in one fell swoop. Suddenly, no more lockups! Woohoo!

Except, a week later, in the morning, the machine wouldn't wake up. I reset it, started it up, and the log files showed the last entry at around 11 pm the evening before. This has started occurring with more frequency...now just about every morning I come in, the machine is locked up, and has been since the night before.

Yesterday, about 3 weeks after I replaced the motherboard and power supply, the machine actually locked up on me in mid-work. This is the first time since replacing the two (MB and PS) that this has happened while I was using it. All the others occurred while I was away.

I'm at a loss. Nothing in syslog or messages indicates a problem around the time of the lockup. Temps are good...I use lm-sensors to monitor them and have a script that writes the output to a file every minute. The readings never get very high.
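
For anyone who wants to reproduce the per-minute temperature logging, here is a minimal sketch in Python. It is not the actual script from this machine. It assumes the `sensors` command from lm-sensors is on the PATH, and the function names and the `sensors.log` path are hypothetical. Each record is flushed and fsynced, so the last reading before a hard freeze is more likely to reach the disk.

```python
import os
import subprocess
import time
from datetime import datetime


def read_sensors(command=("sensors",)):
    """Return the text output of the sensors command, or a note if it fails."""
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=30
        )
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"sensor read failed: {exc}\n"


def append_reading(path, text, now=None):
    """Append one timestamped record and push it to disk right away."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    with open(path, "a") as log:
        log.write(f"=== {stamp} ===\n{text.rstrip()}\n")
        log.flush()
        # Ask the OS to write the data out now instead of leaving it cached,
        # so a sudden lockup is less likely to lose the final record.
        os.fsync(log.fileno())


def log_forever(path="sensors.log", interval=60):
    """Record one reading every `interval` seconds until interrupted."""
    while True:
        append_reading(path, read_sensors())
        time.sleep(interval)
```

Calling `log_forever()` starts the loop. After a lockup, the timestamp of the last record in the file gives an upper bound on when the machine was still alive.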

The only things I haven't replaced at this point are the case and the hard drives. I doubt either could be the cause.

What would you do if you were in my shoes? Is there a troubleshooting approach I'm missing?

For the record, all of the other machines, all eleven of them, don't have any problems. They're all running the same version of Ubuntu (Lucid) that I am.

Thanks!

© Super User or respective owner
