Over the last couple of days, I've been experiencing connectivity issues with my VMs, and I think I've narrowed it down to when beacon probing is used as the failover detection mechanism in my port groups / VLANs. I guess the wider question is "what are the recommended failover settings for my hardware setup?"
My ESXi hosts are running 5.0.0U1 or 5.1.0, and I've just upgraded to vCenter 5.1.0a. The hosts are BL460c G7 servers in a c7000 chassis with Flex10 ICs. Each ESXi host has a blade server profile with 4 FlexNICs. One pair carries all VM data traffic (multiple tagged VLANs) and belongs to a vDS. The second pair is used for all management traffic and (for the moment, as I've reset it) is on a standalone vSwitch with untagged traffic.
The uplinks of each IC go to an Avaya VSP9000 switch. The two VSP9ks are linked by an IST trunk with all the VLANs tagged.
I initially discovered the source of my dropped traffic when the ARP table entry for a VM seemed to flip-flop between the local 10Gb link on one Avaya and the IST link to the other Avaya (meaning the uplink of the other IC module was being used), potentially causing a small temporary loop.
Regardless of which Load Balancing method I configure in the PG (sticking to the ones supported by VC), and regardless of whether I have both dvUplinks marked as Active or one marked as Active and the other as Standby, I do see occasional flip-flopping when beacon probing is used.
I've set up the virtual networks in Virtual Connect to use Smart Link, so I'm hoping that "Link status only" failover detection should be sufficient.
Now whilst I still don't fully understand beacon probing (I've read up about it on a few posts), I was hoping that it might provide a bit more resilience despite having Smart Link configured on the VC side.
So for those of you with the same hardware setup, how do you have yours configured?
In the long run, I would like to bring the management of the VMs inside the vDS, primarily to have more flexibility in bandwidth management than having a slice of the 10Gb uplink fixed at the Virtual Connect level. I've tried to bring the management back into the vDS but have failed miserably. I think it's down to a chicken-and-egg scenario where I'm getting the order of the commands wrong. I'll look into that once the fundamental network configuration problem above has been sorted.
Thanks