The previous post in this series on Red Hat Enterprise Virtualization (RHEV) explained that the RHEV Manager is not just a mission-critical component of the infrastructure; it’s a huge single point of failure as well.
What happens when the other major component of a RHEV infrastructure fails? Can you rely on RHEV High Availability (HA) to quickly and reliably restart affected VMs when a RHEV Hypervisor fails? It depends — as you will see.
First, let’s make sure everyone is up to speed on HA capabilities provided by the gold standard in virtualization:
VMware HA is a robust feature that was first introduced with VMware Infrastructure 3 in 2006.
VMware vCenter Server is required to configure HA options and add VMware ESX hosts to a cluster, but after that vCenter is hands-off — ESX hosts communicate among themselves to reliably restart virtual machines. In fact, VMware HA can even restart vCenter Server if it is running inside a protected VM — wrap your head around that one.
Powerful options are available for administrators, such as specifying the restart priority of virtual machines and whether or not to force VMs to power off if a host becomes isolated from the rest of the cluster.
VMware has heavily invested in this technology, reducing risk for customers that virtualize with vSphere. For even more information on VMware HA, take a look at Duncan Epping’s HA Deep Dive.
RHEV HA [ha ha]
Looking at this Red Hat Enterprise Virtualization competitive comparison, you might assume that RHEV and vSphere are on equal footing when it comes to protecting virtual machines with HA:
Unsightly details behind the marketing
RHEV HA sounds great in the marketing brochure, but there are a few problems with the execution. RHEV Manager is a single point of failure — running on a physical Windows box — and it’s also the actual brain behind HA. Yes, RHEV-M is responsible for restarting virtual machines when a host fails. If the manager is down, no HA for you!
That alone makes RHEV HA something less than “HA” for most production environments, but there are a few other key weaknesses:
- HA must be manually enabled for each virtual machine — no cluster-wide settings
- No cluster admission control — administrators must manually ensure sufficient capacity would be available in a cluster to accommodate a host failure
- No VM restart priority to ensure the most critical workloads and dependencies are brought online first
- Primitive split-brain protection requires IPMI or other out-of-band management interface to force a host shutdown
- Cannot protect the RHEV Manager itself — chicken-and-egg situation
Wow, I didn’t notice those details in the comparison brochure.
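To make the admission control point concrete, here is a rough sketch of the kind of check an administrator is left to do by hand when the platform doesn't do it. It is only an illustration: the function name and the memory-only model are assumptions made for this example, not how RHEV or vSphere actually calculates failover capacity, and it ignores CPU and whether each VM fits on any single surviving host.

```python
def survives_host_failure(host_capacity_gb, vm_demand_gb, failures=1):
    """Return True if total VM memory still fits after losing the
    `failures` largest hosts (the worst case for remaining capacity).

    Hypothetical helper for illustration only: it compares aggregate
    memory and does not model CPU, reservations, or per-host placement.
    """
    if failures >= len(host_capacity_gb):
        # No hosts would be left to restart anything on.
        return False
    # Sort ascending and drop the largest hosts to model the worst case.
    surviving = sorted(host_capacity_gb)[:len(host_capacity_gb) - failures]
    return sum(vm_demand_gb) <= sum(surviving)


if __name__ == "__main__":
    hosts = [64, 64, 64]   # GB of RAM per host (made-up numbers)
    vms = [40, 40, 40]     # GB of RAM per HA-protected VM (made-up numbers)
    print(survives_host_failure(hosts, vms))
```

With three 64 GB hosts, losing one leaves 128 GB, so 120 GB of protected VMs still fits in aggregate while 150 GB would not. Without admission control, nothing stops a cluster from drifting past that line until the day a host actually fails.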
Whether your datacenter is running Windows Server or the mighty Red Hat Enterprise Linux, doesn’t it make sense to trust the proven leader in virtualization? VMware vSphere is simply the most reliable platform for consolidating workloads and building your private cloud. Going beyond exceptional HA is VMware FT: mirroring mission-critical VMs on backup hosts means zero downtime from host failures.