Hi all,
I've got a bit of a strange problem. DPM 'suddenly' stopped backing up all VMs on a single node of the Hyper-V cluster, and they all fail with the same errors (listed below). The DPM server has been backing up ALL VMs for at least 3 months now, with no problems apart from the occasional (resolvable) failure.
The most likely cause is the complete network failure at our office. Last Wednesday all of our switches went down, but within 2 hours everything was operational again.
The next day, ALL VMs on node 3 (of 6) failed the nightly backups. Consistency checks continue to fail no matter what I've tried.
Can anyone point me in the right direction? Below is a summary of what info I can provide atm:
Cluster consists of 6 hosts, hosting ~50 VMs.
Hyper-V CSV Serialization is in place.
All backups worked consistently for all VMs for at least 3 months (didn’t work here before then)
All network connectivity was lost (switch stack failure).
Everything was operational again about 2 hours later, but now only the VMs on host 3 fail (VMs on all other hosts back up just fine).
ACTIONS (which failed):
Perform Consistency Check
Delete VM from PG, Re-Add, Perform Consistency Check
Delete VM from PG, Delete Inactive Protection, Re-Add, Perform Consistency Check
Reinstalled “Hyper-V Integration Services”, Perform Consistency Check, Delete VM from PG, Delete Inactive Protection, Re-Add, Perform Consistency Check
(Re)Registered Hyper-V VSS Writers, followed by the same steps as above
Restarted Services, followed by the same steps as above
Defragmented Disk, followed by the same steps as above
Some more, less constructive 'solutions' from the interwebs
Error message on DPM_SRVR:
DPM failed to synchronize changes for Microsoft Hyper-V \Backup Using Child Partition Snapshot\VM_MACHINE on SCVMM VM_MACHINE Resources.*.*.* because the snapshot volume did not have sufficient storage space to hold the churn on the protected computer (ID 30115 Details: VssError:Insufficient storage available to create either the shadow copy storage file or other shadow copy data. (0x8004231F))
Informative messages on VM_MACHINE:
The VSS service is shutting down due to idle timeout.
Error message on CLUSTER_NODE_03:
calculateReserveSize(): The vdisk doesn't have sufficient space to create snap pool:array id=(long_id_number)
Any help would be appreciated. To me it looks like an issue between the node and the VMs, because the VM itself logs no errors, just the 'VSS idle timeout' message.