We're running into an issue on VMware guest machines with large disks: taking and committing snapshots causes the guest to lose 5-6 pings, which in turn triggers a Windows cluster failover. We have already maxed out the cluster.exe /prop thresholds.
Hardware
HP ProLiant Blade 460c G7
24 logical processors
192 GB RAM
ESX
5.0.0, 623860
Storage (SAN)
3PAR
Guest (1 machine)
4 vCPU
16 GB RAM
Windows 2008 R2 (64-Bit)
4 datastores (1 TB each) VMFS
18 VMDKs cut from the 4 datastores above (application DB/log)
The disk is very fast and the server hardware is fast. We're trying to work out where the bottleneck is and how best to approach this issue. This guest is not in production, so very few changes accumulate for the "commit" part of the snapshot. This will eventually be an Exchange 2010 DAG cluster that we'll be using Veeam to back up. Because Veeam uses the APIs to take the snapshot, it's killing the cluster because of the ping loss. We can reproduce this issue within VMware alone by manually taking and committing snapshots. Any help would be greatly appreciated.
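For reference, here is a rough sketch of the comparison we're making between the ping loss and the cluster heartbeat thresholds described above. It assumes the heartbeat tolerance is simply delay multiplied by threshold. The figures used (SameSubnetDelay of 2000 ms, SameSubnetThreshold of 10) are placeholders for illustration, not values read from our cluster. The function names and the one-second ping interval are also just assumptions for the sketch.

```python
def heartbeat_tolerance_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time the cluster tolerates missed heartbeats:
    heartbeat interval multiplied by the number of misses allowed."""
    return delay_ms * threshold


def longest_outage_s(ping_ok: list[bool], interval_s: float = 1.0) -> float:
    """Longest run of consecutive lost pings, converted to seconds.
    Assumes one ping is sent every `interval_s` seconds."""
    longest = current = 0
    for ok in ping_ok:
        # Reset the run on a reply, extend it on a loss.
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest * interval_s


if __name__ == "__main__":
    # Placeholder values, NOT read from the cluster in question.
    tolerance_s = heartbeat_tolerance_ms(2000, 10) / 1000

    # Illustrative ping log: 6 consecutive losses during a snapshot commit.
    pings = [True] * 5 + [False] * 6 + [True] * 5
    outage_s = longest_outage_s(pings)

    print(f"heartbeat tolerance: {tolerance_s:.1f} s")
    print(f"longest ping outage: {outage_s:.1f} s")
    print("outage exceeds tolerance" if outage_s > tolerance_s
          else "outage within tolerance")
```

Feeding a real ping log from a snapshot commit into `longest_outage_s` gives the actual stall length to compare against whatever delay and threshold values are configured on the cluster.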