I have now had 4 Server 2012 VMs fail to boot after a hardware failure in the cluster, requiring the servers to be rebuilt. This has happened in a span of 2 months since we started deploying this configuration and using Server 2012.
This has happened at 3 different locations, meaning it is NOT tied to a specific SAN, switch, or server.
-The first and second failures were when both Hyper-V servers BSOD'd at the same time, causing the guests to die. The BSOD was due to a network driver issue, with both systems receiving the same bad packet.
-The third failure was in our lab as we were building the cluster prior to production deployment. We had a power outage that killed the SAN, but left the storage and switch running.
-The fourth failure was when the iSCSI switch rebooted on us. None of the Server 2008 machines were affected, but 1 of our 2012 Domain Controllers BSOD'd and needed to be rebuilt from scratch.
Here's the hardware setup for each location. I have this setup in 15+ locations, and the problem has happened at several of them, so it is not specific to one location's hardware.
-2 Cisco UCS servers running Server 2012 Datacenter with Hyper-V in a cluster.
-Nimble storage array for iSCSI volumes.
-HP switch for iSCSI backbone.
-All hardware and drivers are updated at the time of the build.
The dead systems all share these things:
-Running Server 2012 Standard
-Domain controller
-Created from the same template VHDX file
At this point I have no faith that any VM will survive any sort of hardware crash or power outage. I could understand a crash occasionally leaving a VM unable to boot, but we've had 5 system crashes, and 4 of those 5 have forced me to rebuild a Server 2012 guest VM. That's an 80% failure rate.
Anyone else experiencing this level of failure?
I've been doing iSCSI VMs on VMware for 6+ years, seen countless crashes, and never had to rebuild a VM because of one. This is my first go at Hyper-V. I'm not trying to bash it and say VMware is better; I'm just amazed at what I'm seeing, as it's not what I expected.
I've followed all the guidelines from Microsoft and the storage vendor for iSCSI configuration. The problem doesn't appear to be storage related, though. The VM will boot, so the VHDX file is not corrupt; it just seems that Server 2012 as a VM is not very crash resilient.
I have some systems in the lab now that are getting ready to ship, so I am going to see if I can BSOD a Server 2012 guest running on local storage.