Monday, April 4, 2011

More OTV, Jumbo Frames and VMware fun!

So we're moving datacenters. Primary and backup both moving in the same short timeframe. Fun.

Our new VMware design relies heavily on OTV from Cisco: four blades in each datacenter running the same cluster of virtual machines. We had it tested and working with the secondary DC moved to its new location, but over the weekend we fired up the new primary datacenter and moved the Nexus 7ks to it, while keeping the 6509s in our old primary DC. After that, no VC-to-ESX communication worked.

Now, last week we discovered that OTV adds some packet overhead to communications (we knew it did, but didn't realize the repercussions). VMware's secure communications to the VC server are already pretty close to the maximum packet size (1500 by default). When we tried to add a host to the cluster while the VC was connected to a VLAN using OTV, the host would send its SHA thumbprint info, but the communication would time out after that. That's because OTV adds 70 bytes or so. Plain pings even worked normally, but using the size option (-s or -l, depending on the client) we found that pings with a 1430-byte payload worked and 1431 didn't.
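
Here's roughly how you can automate that probe instead of bisecting ping sizes by hand. A minimal sketch, assuming a Linux iputils ping (-s sets the payload size, -M do sets don't-fragment; on Windows the equivalent flags are -l and -f) and a placeholder hostname for the VC server:

    #!/usr/bin/env python3
    # Minimal sketch: binary-search the largest ICMP payload that gets through
    # the path unfragmented - the same test we did by hand with ping -s / -l.
    import subprocess

    def ping_ok(host, payload):
        """Return True if a single don't-fragment ping of this size gets a reply."""
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return result.returncode == 0

    def max_payload(host, lo=0, hi=1472):
        """Largest working payload; 1472 = 1500 MTU - 20 (IP) - 8 (ICMP)."""
        best = None
        while lo <= hi:
            mid = (lo + hi) // 2
            if ping_ok(host, mid):
                best, lo = mid, mid + 1
            else:
                hi = mid - 1
        return best

    if __name__ == "__main__":
        # "vc.example.com" is a placeholder for the VC server's address.
        size = max_payload("vc.example.com")
        print(f"largest unfragmented payload: {size}")
        if size is not None:
            # payload + 8 (ICMP) + 20 (IP) headers = effective path MTU
            print(f"effective path MTU: {size + 28}")

The largest working payload plus 28 bytes of ICMP and IP headers gives the effective path MTU - in our case 1430 + 28 = 1458 on the wire - which is what the switches have to carry once OTV adds its encapsulation.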

So after discovering this we played around with resizing the MTU on the VMware hosts and the VC, but decided instead that all the switches should have their MTU raised. The network team fixed the MTU on the 7ks, but the 6509s will unfortunately throw OSPF errors if the MTU isn't the same on all the switches. That means a big outage, so we're scheduling it.
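
For what it's worth, the pre-change sanity check is easy to script. A minimal sketch, with made-up switch names and MTU numbers, of the two things we care about - every switch has headroom for the OTV encapsulation, and everybody agrees on the MTU so OSPF doesn't complain:

    # Hypothetical pre-change check; switch names and values are illustrative only.
    OTV_OVERHEAD = 70      # headroom we're allowing for the encapsulation
    PAYLOAD_MTU = 1500     # what the ESX hosts and the VC expect end to end

    def check_mtus(switch_mtus):
        """switch_mtus: {switch name: configured interface MTU}."""
        required = PAYLOAD_MTU + OTV_OVERHEAD
        too_small = {name: mtu for name, mtu in switch_mtus.items() if mtu < required}
        if too_small:
            print(f"no headroom for OTV (need >= {required}): {too_small}")
        if len(set(switch_mtus.values())) > 1:
            print(f"MTU mismatch, OSPF will complain: {switch_mtus}")
        return not too_small and len(set(switch_mtus.values())) == 1

    # Example: the 7ks have been raised but the 6509s haven't been touched yet.
    check_mtus({"n7k-dc1": 9216, "n7k-dc2": 9216,
                "c6509-dc1": 1500, "c6509-dc2": 1500})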

So, why did it work during testing? Because the 7ks could talk directly to each other without a router (6509) prior to the move, and afterwards they couldn't. D'oh.

So why, then, couldn't the VC server, which was hosted in the backup DC, even communicate with the ESX host it resided on? Because of OTV domain ownership. The 7ks in the primary datacenter own all the OTV VLANs, and because the 7ks could no longer talk to each other directly, the OTV VLANs in the backup DC are broken until the 6509 reboots. Big d'oh.




Wednesday, February 23, 2011

VMware HA and Cisco OTV

So we're moving to 2 new datacenters in the next little while, from our existing 2. However, the new ones will eventually be active/active, once we get whichever ONTAP 8.x release supports metro clusters.

So in designing our new HA architecture, I had to take into account the fact that we're using HP blades and OTV from Cisco, which lets us stretch a flat network across the two datacenters and keep the VMs portable between them.

Which seems like a great idea. We'll have Cisco Nexus 7000 10G switches connecting the ESX hosts in each location, and the 6509s will handle OTV domain ownership, since they also control the links between the locations. The links are redundant, the ESX clusters have capacity in both locations, and eventually we'll have metro clusters for the filers, so storage will be available in both places.

HOWEVER, I was talking about the OTV VLANs with our network team today. It seems that in the event of a datacenter outage, an OTV domain owned by the switch in the failed datacenter would be unavailable everywhere, even on the 7000s in the opposite DC. So the HA response for ESX would be to shut everything down, because the heartbeats would fail everywhere.

The solution is to add another service console network for the whole cluster and have its OTV domain owned by the 6509 in the other DC, so the two SC networks are owned from different datacenters. That way, in the event of a DC failure, the surviving hosts can still communicate on their secondary SC network, so they won't declare failure, and they'll properly power on the VMs from the failed DC, which is what we want.
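
To make that concrete, here's a tiny sketch of the logic - not real HA code, and the network names and ownership assignments are made-up assumptions - showing why a second SC network owned from the other DC keeps the surviving hosts from tripping their isolation response:

    # A surviving host only trips its isolation response if it loses heartbeats
    # on *every* service console network. Which 6509 owns each SC network's OTV
    # domain decides what survives a DC outage.

    def usable_sc_networks(failed_dc, sc_owner_dc):
        """SC networks still usable after a DC failure, given which DC's
        switch owns each network's OTV domain."""
        return [net for net, owner in sc_owner_dc.items() if owner != failed_dc]

    def surviving_hosts_isolated(failed_dc, sc_owner_dc):
        """True if hosts in the *other* DC would see every heartbeat network die."""
        return len(usable_sc_networks(failed_dc, sc_owner_dc)) == 0

    # One SC network, OTV domain owned by the DC1 6509: a DC1 outage takes the
    # heartbeat network away from the DC2 hosts too, so HA shuts everything down.
    print(surviving_hosts_isolated("dc1", {"sc-primary": "dc1"}))    # True

    # Add a second SC network whose OTV domain is owned in DC2: the DC2 hosts
    # keep a heartbeat path, stay up, and can power on the failed DC's VMs.
    print(surviving_hosts_isolated("dc1", {"sc-primary": "dc1",
                                           "sc-secondary": "dc2"}))  # False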

And we have to be sure we add hosts to the HA cluster in a specific order. The first 5 hosts added are the HA primaries (masters), so we need at least 1 in each location, preferably 2 - so adding 3 from one DC and then 2 from the other to the HA cluster is important.
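
A trivial sketch of that join order (the host names are placeholders), just to make the 3-and-2 split explicit:

    # The first five hosts added become the HA primaries, so take 3 from one DC
    # and 2 from the other before adding everything else.
    dc1_hosts = ["esx-dc1-1", "esx-dc1-2", "esx-dc1-3", "esx-dc1-4"]
    dc2_hosts = ["esx-dc2-1", "esx-dc2-2", "esx-dc2-3", "esx-dc2-4"]

    join_order = dc1_hosts[:3] + dc2_hosts[:2] + dc1_hosts[3:] + dc2_hosts[2:]
    primaries = join_order[:5]
    print(primaries)  # ['esx-dc1-1', 'esx-dc1-2', 'esx-dc1-3', 'esx-dc2-1', 'esx-dc2-2']

    # Quick check that both sites got at least one primary.
    assert any(h.startswith("esx-dc1") for h in primaries)
    assert any(h.startswith("esx-dc2") for h in primaries)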