What can I do to determine the root cause of a Windows server hanging/freezing?
- by Aaronaught
We set up a new server here a few weeks ago that I am informally responsible for managing.
Almost everything works perfectly except for one thing: Every so often it hangs without warning.
To clarify: When I say hangs, I mean completely. None of the services respond and I'm unable to even get onto a local console - the display acts as though there's no VGA signal. One time, the server actually responded to pings, another time I got the "destination host unreachable" response, but most of the time the pings just time out, as one would expect for a hung server.
Event logs don't show anything after a reboot. I don't mean that they don't show anything interesting, I mean that they don't show anything at all from before the failure occurs to after the reboot. And there are never any performance problems, strange errors, or other obvious signs of impending doom before it happens.
I don't expect any easy answers here. What I'd like to know his I can methodically determine the root cause of this problem, be it a misbehaving service, defective hardware, or something else.
Is there any kind of logging I can set up that will help me get to the bottom of this? Any hardware diagnostics or remote monitoring? Anything else I can do to help me discover what's actually happening, or at least be able to eliminate what isn't wrong?
Just to reiterate, I really don't want to start speculating about possible causes and take a trial-and-error approach, because it's going to be at least several days at a time before I would have conclusive results. I'm looking for solutions to reliably trace the problem to its source.