With vSphere 4, VMware introduced a very powerful availability feature called Fault Tolerance (FT). At a basic level, FT allows you to keep two virtual machines (a Primary VM and a Secondary VM) running in lockstep on two different physical ESX hosts. If the ESX host running the Primary VM were to experience a hardware failure, the Secondary VM would take over on the other host without any downtime. This can greatly reduce downtime due to hardware failures and provide increased service levels for important applications.
FT is often compared to Microsoft Windows Failover Clusters, formerly Microsoft Cluster Server (MSCS), and in fact many have talked about how FT can replace Microsoft clustering altogether. Rather than jump to conclusions like this, it is important to understand the use cases for both technologies. In addition, there are several limitations to FT that need to be considered. Here are some important points to remember about FT:
1) FT only supports a single vCPU, limiting its usefulness for some applications (this will probably change in the future).
2) FT is meant to protect against host level failures only, such as physical server failures.
3) FT keeps protected virtual machines in complete lockstep, meaning whatever happens on the Primary VM also happens on the Secondary VM. Why is this important to understand? Guess what happens when the Primary VM bluescreens: the Secondary VM bluescreens right along with it.
4) FT VMs share the same virtual disk file, meaning a storage level failure affects both.
Microsoft Failover Clustering, on the other hand, can help protect against application and operating system level failures in addition to physical server failures. Clusters also make it easier to patch the underlying operating system with minimal downtime. One more FT limitation is worth noting: FT is compatible with only a limited set of hardware (check VMware's compatibility information to confirm your systems qualify), which may limit its usefulness for organizations using older hardware.
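To make the comparison concrete, the points above can be written down as a small decision helper. This is only a rough Python sketch of the reasoning in this post, not anything VMware or Microsoft ships: the `Workload` fields, the `ft_blockers` and `suitable_options` functions, and the failure categories are all hypothetical names made up for illustration. It also assumes clustering does not cover storage failures, simply because nothing above claims that it does.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Failure(Enum):
    """Failure types discussed in this post (illustrative categories)."""
    HOST_HARDWARE = auto()
    GUEST_OS = auto()
    APPLICATION = auto()
    STORAGE = auto()


# What each technology protects against, per the points above.
# Assumption: storage failures are left out of both sets, since the post
# only says FT does not cover them and makes no claim about clustering.
FT_COVERS = {Failure.HOST_HARDWARE}
CLUSTER_COVERS = {Failure.HOST_HARDWARE, Failure.GUEST_OS, Failure.APPLICATION}


@dataclass
class Workload:
    """Hypothetical description of a VM you want to protect."""
    name: str
    vcpus: int
    host_ft_compatible: bool       # hardware is on the FT compatibility list
    app_supports_clustering: bool  # application can run on a failover cluster
    must_survive: set              # set of Failure values to protect against


def ft_blockers(w):
    """Return the reasons FT alone would not meet this workload's needs."""
    blockers = []
    if w.vcpus != 1:
        blockers.append("FT supports a single vCPU only")
    if not w.host_ft_compatible:
        blockers.append("host hardware is not FT compatible")
    for failure in sorted(w.must_survive - FT_COVERS, key=lambda f: f.name):
        blockers.append(f"FT does not protect against {failure.name}")
    return blockers


def suitable_options(w):
    """List which of the two technologies fit the stated requirements."""
    options = []
    if not ft_blockers(w):
        options.append("FT")
    if w.app_supports_clustering and w.must_survive <= CLUSTER_COVERS:
        options.append("Failover Clustering")
    return options


if __name__ == "__main__":
    web = Workload("web01", 1, True, False, {Failure.HOST_HARDWARE})
    db = Workload("sql01", 4, True, True,
                  {Failure.HOST_HARDWARE, Failure.GUEST_OS})
    for w in (web, db):
        print(w.name, suitable_options(w), ft_blockers(w))
```

A single-vCPU web server that only needs host-level protection comes back as an FT candidate, while a four-vCPU database that must also survive operating system failures points to clustering. Real decisions involve more than this (licensing, patching windows, and so on), so treat it as a checklist in code form rather than a sizing tool.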
All of that said, FT is a great feature and definitely has a place in your virtual infrastructure. Best of all, it is available in vSphere Advanced and above, making it an affordable feature that many organizations may already own. So what are good use cases for FT over Microsoft clusters? Here are some thoughts:
- Provide hardware level protection to applications that don’t natively support any clustering functionality. There are many examples here – web servers, application servers, etc.
- Provide hardware level protection to applications where clustering support may be available but is either expensive or requires a special license. Document management servers, indexers, etc., may make good candidates here.
- Provide extra protection against hardware failures for critical applications during specific business periods where downtime simply cannot be tolerated, such as accounting servers during end of month processing.
VMware FT is a great new feature that can help provide extra uptime to virtual machines in your environment. It is available in vSphere Advanced and above and should definitely be considered as part of a virtual infrastructure design. Just make sure you understand the use cases for it and don’t rule out Microsoft clusters where they are appropriate.