vSphere High Availability
vSphere High Availability (HA) is a VMware product that detects ESXi host failure, for example host power off or network partition, and automatically restarts virtual machines on other hosts in the cluster. It can interoperate effectively with the BOSH Resurrector, which recreates VMs if the Director loses contact with a VM's BOSH Agent.
Note
This feature is available with bosh-vsphere-cpi v30+.
vCenter Configuration¶
Configure vSphere HA as follows:
-
Check Cluster* → Manage → Settings → vSphere HA → Edit... → Turn on vSphere HA
-
Check Host Monitoring
-
Ensure the Response for Failure conditions and VM response → Host Isolation is set to Shut down and restart VMs
BOSH Director Configuration¶
Increase the timeout values of the BOSH Health Monitor on the BOSH Director to allow for smooth interoperation between BOSH and vCenter.
We recommend increasing the agent_timeout
from the default 60s to 180s in the BOSH Director's manifest to allow vCenter time to detect the failed host:
jobs: - name: bosh properties: ... hm: resurrector_enabled: true intervals: agent_timeout: 180
Warning
If vSphere HA is not enabled on the cluster and a host failure occurs, the BOSH Resurrector will be unable to recreate the VMs without manual intervention. Follow the manual procedure as appropriate: Host Failure or Network Partition.