VMware or Microsoft?–How robust is your availability?

Disclaimer: facts and figures in this article are based on the state of the technology as it exists at the date of its publication.

Our article today in our “VMware or Microsoft?” series is about availability. 

When I say “availability”, I mean “high availability”. 

And when I say “robust high availability”, I mean a solution such as Windows Failover Clustering that provides high availability and scalability of server workloads.

I argue that Microsoft’s solution is robust and solid, but VMware has argued differently.  In a currently available document comparing vSphere 5 to what was then the beta of Hyper-V in Windows Server 2012, VMware claims that they have “robust high availability” with a “single click, [that] withstands multiple host failures”, whereas Microsoft’s Failover Clustering is “based on legacy quorum model, complex and brittle”.

Really?  They haven’t been watching how far clustering has come in Windows Server lately.  In fact, at best, VMware’s document might be referring to how failover clustering used to work back in 2008.  More specifically, they are referring to the quorum model, in which a cluster needs a majority vote to determine whether or not a node is actually unavailable, so that the resources it was managing can fail over to other nodes.  To always have a clear majority, the number of voting members needs to be odd.  All nodes get a vote, so if you have an even number of nodes, you need something else to break the tie.  That something else is a “cluster witness”, which is either a “witness disk” or a “witness file share”.
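To make the tie problem concrete, here is a minimal Python sketch of plain majority voting. The function name has_quorum and the 2 + 2 network split are my own illustration of the idea, not how Windows actually implements it.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return votes_present > total_votes // 2


# Four nodes, no witness: a network split into 2 + 2 leaves neither side
# with a majority, so neither side can safely keep running the roles.
print(has_quorum(2, 4))  # False

# Same split with a witness (5 votes in total): the side that can still
# reach the witness holds 3 of 5 votes; the other side holds only 2.
print(has_quorum(3, 5))  # True
print(has_quorum(2, 5))  # False
```

With an odd vote count, exactly one side of any two-way split can hold the majority, which is the whole point of the witness.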

From this document on Windows Server 2008 failover clustering:

In a cluster with an even number of nodes and a quorum configuration that includes a witness, when the witness remains online, the cluster can sustain failures of half the nodes. If the witness goes offline, the same cluster can sustain failures of half the nodes minus one.
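The arithmetic behind that statement can be checked with a short Python sketch. It assumes the static (pre-2012) model, where the total vote count never changes, and the function and parameter names are hypothetical ones I picked for illustration.

```python
def tolerable_node_failures(nodes: int,
                            witness_configured: bool,
                            witness_online: bool = True) -> int:
    """How many nodes can be lost while the survivors still hold a majority.

    Static model: the total vote count is fixed by the configuration,
    whether or not every voter is currently reachable.
    """
    total_votes = nodes + (1 if witness_configured else 0)
    needed = total_votes // 2 + 1  # strict majority of configured votes
    witness_vote = 1 if (witness_configured and witness_online) else 0
    # Largest f such that (nodes - f) + witness_vote >= needed
    return max(nodes + witness_vote - needed, 0)


# 4 nodes plus a witness: 5 votes, 3 needed.
print(tolerable_node_failures(4, witness_configured=True, witness_online=True))   # 2
print(tolerable_node_failures(4, witness_configured=True, witness_online=False))  # 1
```

For the 4-node example, that is "half the nodes" (2) with the witness online and "half the nodes minus one" (1) with it offline.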

Well then, please allow me to introduce you to…

The Dynamic Quorum

“Batman and Robin?”

Tell me you didn't LOVE this show as a kid.

No... that was the “dynamic duo”.  I’m talking about the ability of all nodes in a Windows Failover Cluster to have a vote, and for the number of voting members to adjust dynamically as nodes fail; so that there is never any confusion (lack of a quorum) by having an even number of voting members.

In this diagram…

Node & Disk Majority

…we see a healthy 4 node cluster, each running 2 VMs, or any other clustered roles.  (Windows Failover Clustering is not just for virtualization, you know.)  The quorum is maintained because we have a disk witness to break the tie in case two nodes say “one node is down!” and the other two say “no, he’s not!”.

If one of the nodes in our cluster goes away…

Simple Node Majority

…depending upon whether that removal was planned or a complete surprise, the clustered roles are able to fail over or restart on other nodes.  AND, because the cluster now has only three active nodes, those three nodes by themselves form an odd-numbered set of voting members, so a majority is always possible.

“When a node shuts down or crashes, the node loses its quorum vote.  When a node successfully rejoins the cluster, it regains its quorum vote.  By dynamically adjusting the assignment of quorum votes, the cluster can increase or decrease the number of quorum votes that are required to keep running. This enables the cluster to maintain availability during sequential node failures or shutdowns.”

Later, if the node is re-added, it again gets a vote.
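Here is a rough Python sketch of why that matters for sequential failures. It is a simplified model of my own (no witness, no tie-breaking details, hypothetical function name), not the actual cluster service logic.

```python
def survives_sequential_failures(nodes: int, failures: int, dynamic: bool) -> bool:
    """Fail nodes one at a time; True if quorum holds after every failure."""
    total_votes = nodes  # votes currently counted toward quorum
    alive = nodes
    for _ in range(failures):
        alive -= 1
        if alive <= total_votes // 2:  # survivors lack a strict majority
            return False
        if dynamic:
            total_votes = alive  # the failed node's vote is removed
    return True


# Five nodes losing three, one after another:
print(survives_sequential_failures(5, 3, dynamic=False))  # False
print(survives_sequential_failures(5, 3, dynamic=True))   # True
```

In the static case the third failure leaves 2 of 5 votes, which is not a majority. In the dynamic case the required majority shrinks along with the membership, so the cluster keeps running.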

Robust.  But wait… there’s more…

The Dynamic Witness

The story gets even better in Windows Server 2012 R2.  R2 improves on this with something called the “Dynamic Witness”:

“If the cluster is configured to use dynamic quorum (the default), the witness vote is also dynamically adjusted based on the number of voting nodes in current cluster membership. If there are an odd number of votes, the quorum witness does not have a vote. If there is an even number of votes, the quorum witness has a vote.

The quorum witness vote is also dynamically adjusted based on the state of the witness resource. If the witness resource is offline or failed, the cluster sets the witness vote to ‘0’.”

The benefit of this is for the rare case of a witness failure.  If that happens, the witness vote simply goes away and is no longer counted.  A huge benefit of all of this is that you never really have to count your nodes to decide whether or not to configure a quorum witness. Just configure one (as recommended), and let the dynamic nature of our failover clustering take care of it.
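The rule in that quote is simple enough to sketch in a few lines of Python. The names witness_vote and total_votes are mine, chosen for illustration, and the sketch only models the two conditions quoted above.

```python
def witness_vote(voting_nodes: int, witness_online: bool) -> int:
    """Witness counts only when it is online and the node votes are even."""
    if not witness_online:
        return 0
    return 1 if voting_nodes % 2 == 0 else 0


def total_votes(voting_nodes: int, witness_online: bool = True) -> int:
    return voting_nodes + witness_vote(voting_nodes, witness_online)


# Walk a 4-node cluster down one node at a time with a healthy witness.
for n in range(4, 0, -1):
    print(n, witness_vote(n, True), total_votes(n))
# 4 1 5
# 3 0 3
# 2 1 3
# 1 0 1
```

As long as the witness is healthy, the total vote count stays odd no matter how many nodes are currently voting.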

Guest Clustering Without Limits

Microsoft has a distinct advantage over VMware when it comes to guest clustering.  With Hyper-V and with virtual servers running Windows Server 2012 or 2012 R2, you can create clusters of virtual machines that use iSCSI, Fibre Channel, and (in R2) even shared .VHDX files for their shared storage.  Those .VHDX files can live either on a Cluster Shared Volume (CSV) or on just a server file share (SMB share, file-based storage).

So here are a couple of the new, flexible choices you have for guest clustered VM shared storage in Windows Server 2012 R2…

Flexible choices for placement of Shared VHDX

Try doing that on NFS. 

While we’re on the subject of scale…

Does Size Matter?

VMware requires Essentials Plus or better for HA, and unless something else changed in vSphere 5.5 that they haven’t yet said much about, I do believe they still can only support up to 4000 VMs in a 32 node cluster.  (Correct me in the comments and point me to documentation that proves me wrong, please.  I sincerely thought they would up their game here.) 

You can cluster up to 8,000 virtual machines in up to a 64 node cluster with Windows Server 2012 and Windows Failover Clustering.  And you can do it for no additional cost.

“Holy robust high availability, Batman!”

I’m glad you like it.  But if not, or if you have any questions, let me know in the comments.

And for more details on what’s newer than what VMware would have you believe in the world of robust high-availability, check out these two TechNet documents:

What’s New in Failover Clustering in Windows Server 2012

What’s New in Failover Clustering in Windows Server 2012 R2

7 thoughts on “VMware or Microsoft?–How robust is your availability?”

  1. Hi Kevin,
    Great article on what is new in server 2012 R2. I understand that both VMware and Microsoft take a different approach to HA but I would like to see more detailed information describing the differences between the two products. More to the point, differences that highlight exactly why one is better than the other. You have listed cost and cluster maximums are better with Microsoft’s offering, but is there any other factor that puts Microsoft ahead of VMware?
    Average Joe
