Hyper-V cluster behavior when losing network connectivity
- by ChristopheD
Setup:
A (rather new) Hyper-V R2 cluster with 2 nodes (in a failover configuration). Physical host OS: Windows Server 2008.
About eight VMs (mixed: Windows Server 2008 and Linux).
Yesterday we had a power outage of about 15 minutes.
Our blades are on UPS, so the physical host machines (Windows Server 2008) never went down. Our main switches are not on UPS (yet), and we saw behavior similar to the following (as distilled from the event logs):
The nodes in the cluster lost their means of communication (because the external switches went down).
The cluster wanted to bring down one of the nodes (the first one), presumably to start failover?
The previous step impacted the clustered storage where the virtual machines' VHDs are located.
All VMs were abruptly terminated and were found in a failed state in the failover manager on the host OSes. The Linux VMs were kernel panicking and looked like they had had their disks ripped out.
This whole setup is rather new to us, so we are still learning how it behaves.
The question:
We are putting the switches on UPS soon, but we were wondering whether the above is expected behavior (it seems rather fragile) or whether there are obvious configuration improvements to handle such scenarios.
I can upload an evtx file showing exactly what was going on, if that's necessary.