Fault tolerance, or FT, is a means of providing zero downtime to select virtual machines by maintaining an exact mirror copy on a second physical server. FT is not needed for every workload; it was never intended for widespread use, and there are plenty of more appropriate alternatives for databases, mail servers, and other elements of the infrastructure designed with high availability and redundancy in mind.
What FT is good for, however, is protecting critical workloads that do not have sufficient redundancy capabilities out of the box, such as a legacy application that fulfills a crucial role but was never architected for failover, or a next-generation workload still in early phases that lacks the robustness that comes with maturity.
Consider Hadoop. Doing interesting things with massive amounts of information is in the foreground these days, and Hadoop is the de facto standard when it comes to analysis of big data. Hadoop, designed to process jobs in a distributed fashion, is tolerant of compute node failures within a cluster, but a couple of its management components have not yet reached the same level of resiliency as the data processing nodes and remain single points of failure. A scenario like this is a perfect match for VMware vSphere Fault Tolerance, so it should be no surprise that the VMware Performance Team has recently published a study characterizing the scale of such a solution.
It’s clear that despite one of the popular objections to FT (single vCPU support), the feature fills a significant gap and critically enhances the overall reliability of a distributed system, thereby contributing to the uptime of hundreds of compute nodes. And for those wishing for FT VMs with multiple vCPUs, SMP FT technology was previewed and demonstrated in a session at VMworld 2012.
Although it’s not widely known, fault tolerance is expected to be available in Windows Hyper-V someday. Don’t just take my word for it; check out this announcement to learn more about the “need” for fault-tolerant Windows servers.
Evidently, the Hyper-V team has been “jealous” of this amazing zero-downtime vSphere advantage for years, so it’s no wonder that they pre-announced the capability for their own product well ahead of availability. You’ll know that they are getting close to finally providing FT capabilities when they cease to criticize VMware FT.
As with private cloud, just because Microsoft talks about something doesn’t mean it exists.