I have built a Hyper-V failover cluster using Windows Server 2012 on 3 Dell PowerEdge M520 blades connected to a Dell EqualLogic PS4100 array. The blades plug into Dell PowerConnect M6348 switches in the back of the chassis. The network cards are Broadcom. There are 12 cards in total: 4 are plugged into the iSCSI network; 4 are teamed using the built-in Windows teaming, with networks created to carry the internal production VLANs, CSV, heartbeat, Live Migration and Management; and the other 4 are teamed to carry external DMZ VLAN traffic.
All updates have been applied and all drivers and firmware are up to date.
I have migrated and now have 18 guest OSes running on this cluster: Windows Server 2003 R2, 2008 R2 and 2012, plus a few client OSes (XP and Windows 7).
The problem I am having is only with the 2 Windows Server 2003 R2 VMs and the 1 XP VM. Every 20-30 minutes these VMs lose the ability to connect to anything outside of the VLAN they are assigned to. They can ping each other but not the gateway. Looking at the ARP table on the switch, the MAC address for each of these VMs changes to 0000.0000.0000. Before the change, the entry shows the static MAC address assigned to the VM. To get the VM's network working outside of its own IP range again, I have to simulate the network cable being unplugged, whether by repairing the connection inside the VM, live migrating the VM to another host, or changing the adapter setting to "Not connected" and back. 20-30 minutes later the MAC address becomes 0000.0000.0000 again.
Running ipconfig on the VM shows the correct MAC at all times. There are no events in the event logs at all when this happens, on either the host or the VM.
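To catch the moment an entry goes bad rather than noticing it after the fact, the switch's ARP output could be saved periodically and scanned for all-zero MACs. The following is only a minimal sketch: it assumes each ARP line contains an IPv4 address and a dotted MAC (xxxx.xxxx.xxxx) as whitespace-separated fields, and the sample addresses and the function name find_zeroed_entries are made up for illustration, not taken from the actual switch.

```python
import re

# Assumed field formats: dotted-quad IPv4 and a dotted MAC such as 0015.5d01.0a02.
IP_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
MAC_RE = re.compile(r"[0-9A-Fa-f]{4}(?:\.[0-9A-Fa-f]{4}){2}")


def find_zeroed_entries(arp_text):
    """Return the IP addresses whose ARP entry shows an all-zero MAC."""
    zeroed = []
    for line in arp_text.splitlines():
        tokens = line.split()
        ips = [t for t in tokens if IP_RE.fullmatch(t)]
        macs = [t for t in tokens if MAC_RE.fullmatch(t)]
        # Only lines that carry both an IP and a MAC are treated as ARP entries.
        if ips and macs and macs[0] == "0000.0000.0000":
            zeroed.append(ips[0])
    return zeroed


# Hypothetical ARP table dump; the real switch output may be laid out differently.
sample = """\
10.0.1.1    001e.c9aa.bb01   vlan 10
10.0.1.21   0000.0000.0000   vlan 10
10.0.1.22   0015.5d01.0a02   vlan 10
"""

print(find_zeroed_entries(sample))
```

Run against dumps taken every few minutes, this would show exactly when each affected VM's entry is zeroed, which can then be lined up with anything else happening on the network at that time.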
I have tried the following:
Fully patched the VMs
Updated the Hyper-V integration services
Removed the adapters and added new ones
Changed the IP address
Changed the IP range
Changed the MAC address of the VM
Removed hidden devices
Removed the registry entries for the network adapters
Tried the legacy adapter where possible
Disabled the TCP task offload settings, on both physical and VM adapters
Enabled and disabled VMQ, on both physical and VM adapters
Yet after 20-30 minutes the VMs still lose connectivity outside of the IP range they are on, and the MAC address in the switch's ARP table changes to 0000.0000.0000, hence the loss of network outside of the IP range.
I have checked for rogue devices on the network and double-checked all VMs to make sure no duplicate MAC addresses have been used. I have gone over our switch settings but cannot find anything that would only affect VMs running 2003 R2 or XP. I no longer believe it is a switch or network problem, as it is only the 3 VMs running 2003 R2 or XP that have this issue; the other 15 VMs running 2008 R2, 2012 or Windows 7 are working fine on the same hardware, connected to the same switches, in the same IP range.
I am now at a total loss as to what to try next to get the 2003 R2 and XP VMs working.
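For reference, the duplicate MAC check can be done mechanically instead of by eye. This is a minimal sketch, assuming the VM names and MAC addresses have already been exported into a list of pairs; the VM names, MAC values and helper names here are hypothetical. It normalizes the different separator styles so the same address written differently on the host and on the switch still compares equal.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Drop separators and lowercase, so 00-15-5D..., 00:15:5d... and 0015.5d... compare equal."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def find_duplicate_macs(vm_macs):
    """Take (vm_name, mac) pairs; return {normalized_mac: [vm names]} for MACs used more than once."""
    by_mac = defaultdict(list)
    for name, mac in vm_macs:
        by_mac[normalize_mac(mac)].append(name)
    return {mac: names for mac, names in by_mac.items() if len(names) > 1}


# Hypothetical inventory; real values would come from the cluster's VM settings.
inventory = [
    ("vm-2003-a", "00-15-5D-01-0A-01"),
    ("vm-2003-b", "00:15:5d:01:0a:02"),
    ("vm-xp", "0015.5d01.0a01"),
]

print(find_duplicate_macs(inventory))
```

In this made-up inventory the first and third entries are the same address in two notations, so they are reported together; an empty result would mean no duplicates among the listed VMs.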
Any ideas?
Thanks
Barry