When you centrally manage multiple virtualization hosts, what measures can be taken to protect the manager itself from outages? While availability could be a concern at first glance, a management platform like VMware vCenter can generally tolerate a short outage without too many folks noticing — VM workloads would never be affected. One exception is in the area of DRS and DPM — if more capacity is needed during such an outage then the infrastructure cannot respond by moving workloads or powering on new hosts. VMware HA, on the other hand, was designed to function properly without vCenter running — for obvious reasons.
System Center Virtual Machine Manager 2008 (SCVMM) does not offer the same degree of dynamic workload balancing, implying that an outage with that product could likely be tolerated for an even longer period. From that perspective, that’s good news for Microsoft — as you are about to see. As for HA on Hyper-V, since Microsoft leverages Failover Clustering, it should be unaffected by a management server outage.
Do you have a strong opinion about running your virtualization management system inside a VM? The VM Guy (Dave Lawrence) ran a poll last month asking this very question. The results were pretty even, but virtualized vCenter came out slightly ahead. It might just be the easiest way to protect your management platform.
Improving VMware vCenter Availability
Depending on your needs, there are several ways to minimize your vCenter downtime:
- Protect the vCenter VM with VMware HA — works for host outages
- Use Microsoft clustering (MSCS) to run an active/passive vCenter setup (physical or virtual)
- Deploy the new vCenter Server Heartbeat solution, also active/passive (physical or virtual)
The above solutions have a common requirement — the VC database also needs protection. This can be done through clustering or other vendor-supported techniques. vCenter Server Heartbeat can also be used for your SQL Server. Regardless of how you protect your database, make sure that you or your DBA have actual backups of the database for recovery purposes. With a backup copy of the database and the vCenter SSL certificates — which are needed to decrypt stored passwords — the environment can be recovered relatively quickly.
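As a rough illustration of that last point, here is a minimal Python sketch of a pre-flight check confirming that a backup folder holds both pieces needed for recovery: a database backup and the SSL certificate files. The folder layout, the `*.bak` naming, the `ssl` subfolder, and the certificate file names (`rui.crt`, `rui.key`, `rui.pfx`) are assumptions made for the example, not something the products enforce, so adjust them to match your own environment. The function name `missing_recovery_items` is hypothetical.

```python
from pathlib import Path

# Assumed certificate file names; verify against your own vCenter install.
REQUIRED_CERT_FILES = ("rui.crt", "rui.key", "rui.pfx")
# Hypothetical naming convention for the SQL Server backup files.
DB_BACKUP_PATTERN = "*.bak"


def missing_recovery_items(backup_dir):
    """Return a list of items that are missing from the backup folder.

    An empty list means the folder holds at least one non-empty database
    backup and every expected certificate file.
    """
    root = Path(backup_dir)
    missing = []

    # At least one non-empty database backup file must be present.
    db_backups = [
        p for p in root.glob(DB_BACKUP_PATTERN)
        if p.is_file() and p.stat().st_size > 0
    ]
    if not db_backups:
        missing.append("database backup (" + DB_BACKUP_PATTERN + ")")

    # Each certificate file is expected in an "ssl" subfolder (assumed layout).
    ssl_dir = root / "ssl"
    for name in REQUIRED_CERT_FILES:
        if not (ssl_dir / name).is_file():
            missing.append("ssl/" + name)

    return missing
```

Running something like this on a schedule against wherever your backups land would flag a missing piece before you need it. It only checks that the files exist, so it is no substitute for an actual test restore.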
And if you are running vCenter in a VM that lives in a DRS cluster, it’s a good idea to disable DRS for that VM. If VC does go down, you don’t want to have to search for it — it will always be on the host where you intentionally placed it. (In fact, that was one of my contributions to the VC-in-a-VM whitepaper VMware released quite some time ago.) In case you were wondering, it works perfectly well to migrate your vCenter VM with VMotion without downtime. Amazing!
SCVMM Availability Options
How does Microsoft System Center Virtual Machine Manager 2008 (SCVMM) compare? You might make an educated assumption that you could simply deploy SCVMM on a Microsoft Failover Cluster (they’re not calling it MSCS anymore, you know). After all, Failover Clustering is the cornerstone of Hyper-V VM availability and quick migration. Actually, the product documentation has this to say:
“Installing the VMM server on a cluster has not been tested and is not supported. To make the VMM server highly available, it is recommended that you install it on a highly available virtual machine.”
Fine, so you have just one option: install SCVMM in an HA VM in order to have protection. Any limitations with that? From the documentation, again:
“If you install the VMM server on a virtual machine, do not migrate this virtual machine to another host from within the VMM Administrator Console. If the virtual machine is highly available, do not migrate this virtual machine to another node on the cluster from within the VMM Administrator Console. When the VMM server is running on a virtual machine, it is literally managing itself; therefore, any migration, including quick migration, will result in a service interruption.”
And they’re not kidding.
In summary, the only way to provide protection from unwanted downtime for SCVMM is to run it inside a VM. However, any maintenance on that Hyper-V host will introduce downtime as the SCVMM VM is quick migrated or suspended. As I said earlier, it’s good news for Microsoft that SCVMM can tolerate outages — there will be plenty to go around.