Intermittent Windows Server 2008 BSOD and restart

Posted by Timka on Server Fault
Published on 2013-10-01T19:43:19Z Indexed on 2013/10/28 21:56 UTC

Our EC2 instance (Windows Server 2008) has crashed multiple times over the past 3 months (most recently today at 1:05 EST). After reviewing the MEMORY.DMP file, we noticed that a possible cause of the crashes is rhelnet.sys (RedHat PV NIC Driver).

The server's Event Viewer shows the following records right after the crash:

Critical - Kernel Power:
The system has rebooted without cleanly shutting down first. 
This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

BugCheck:
The computer has rebooted from a bugcheck.  The bugcheck was:
0x000000d1 (0x000000000000002d, 0x0000000000000002, 0x0000000000000000, 0xfffff88001402d14). 
A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 100113-35849-01.
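
For reference, the bugcheck line can be decoded by hand. Code 0xD1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, and its four parameters are, in order: the memory address referenced, the IRQL at the time, the access type (0 = read, 1 = write, 8 = execute), and the address of the code that made the reference. Below is a minimal Python sketch of that decoding; the function name decode_bugcheck and the dictionary keys are my own, made up for illustration, and it only understands 0xD1.

```python
import re

# Access-type values for parameter 3 of bug check 0xD1
# (DRIVER_IRQL_NOT_LESS_OR_EQUAL), per Microsoft's bug check reference.
ACCESS_TYPES = {0: "read", 1: "write", 8: "execute"}


def decode_bugcheck(message):
    """Parse 'CODE (P1, P2, P3, P4)' out of a BugCheck event message.

    Hypothetical helper for illustration; only 0xD1 gets named fields.
    """
    match = re.search(r"(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck code found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    result = {"code": code, "params": params}
    if code == 0xD1 and len(params) == 4:
        result["name"] = "DRIVER_IRQL_NOT_LESS_OR_EQUAL"
        result["memory_referenced"] = params[0]
        result["irql"] = params[1]
        result["access"] = ACCESS_TYPES.get(params[2], "unknown")
        result["faulting_instruction"] = params[3]
    return result


event_text = (
    "The bugcheck was: 0x000000d1 (0x000000000000002d, "
    "0x0000000000000002, 0x0000000000000000, 0xfffff88001402d14)."
)
info = decode_bugcheck(event_text)
print(info["name"])
print("address referenced:", hex(info["memory_referenced"]))
print("IRQL:", info["irql"], "access:", info["access"])
print("faulting instruction:", hex(info["faulting_instruction"]))
```

Read this way, the crash was a read of address 0x2d at IRQL 2 (DISPATCH_LEVEL) by code at 0xfffff88001402d14. A near-null address touched at raised IRQL usually suggests a driver bug rather than failing hardware, though the decoding alone does not say which driver owns that instruction address; the dump analysis is what pointed to rhelnet.sys.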

Could this be a hardware issue? Would it help if we stop and start the instance? Or is it more likely that this is caused by software running on the system?

[Update 10.01.2013]

An Amazon rep suggested replacing the RedHat drivers on our instance with Citrix PV drivers:

Upgrading PV Drivers

[Update 10.08.2013]

We performed the driver upgrade on a cloned instance. Right after the upgrade we noticed the following errors in Event Viewer:

Xennet6 errors in Event Viewer (Event ID# 5001)

After digging a bit more, I found this article, which suggests installing the latest Citrix drivers. Unfortunately, this didn't help at all, and our cloned instance became unresponsive.

[Update 10.08.2013 2]

I recreated the instance and updated the PV drivers again. After searching the Internet, I found this article, where an Amazon rep explains that:

"Event ID 5001 from source Xennet6 cannot be found" message does not 
indicate anything wrong, just that the PV driver is looking for a feature
that we have not implemented in our version of Xen. 

I will keep my test system running for a while to see if there are any issues with it.

© Server Fault or respective owner
