Changing VM Update Strategy¶
vm_strategy property was added to the
update section of deployment manifests to control how VMs are updated during a deploy.
This feature was introduced in bosh/267.2.
Historically, when a VM needs to be replaced (e.g. stemcell update, networking change), the director takes time to: stop all processes, delete the VM in the IaaS, create a new VM, provision it, and start the processes. Depending on your IaaS, this process can take several minutes and, depending on your release deployment strategy, may cause downtime. This is still the default behavior, and it is called the
delete-create strategy. The following diagram represents the process flow, with grey representing potential downtime.
The new strategy is called
create-swap-delete. This strategy shifts IaaS-related and most provisioning steps to occur before the running processes on a VM are stopped. The workflow of this strategy looks like: create a new VM, provision it, stop jobs on the old VM, reattach persistent disks (if present), start jobs. One of the most noticeable impacts this has is to reduce the duration of downtime since the deployment no longer needs to wait for IaaS VM resource management while jobs are stopped.
You can see that
create-swap-delete introduces new functionality for deferring VM deletions as well. Old VMs will be "orphaned" (similar to disks) and scheduled for clean up. Typically cleanup starts within 5 minutes.
To change this behavior in your deployment, add
vm_strategy to your deployment's
update section. For example...
update: canaries: 4 ... vm_strategy: create-swap-delete
update section can be overridden at the instance group level. This allows you to opt-in or opt-out specific instance groups which need different strategies.
- H/A Tradeoff - some deployments may consider ~20s of downtime through
create-swap-deleteto be acceptable when compared to the requirements of running a component with full H/A redundancy.
- Persistent Disks - when using persistent disks, there is still a dependency on the IaaS for detaching and attaching the disk. Note: IaaSes take different periods of time for this operation, and even within the same IaaS, duration can vary.
- Reduce Deployment Time - some deployments may benefit from bulk creating VMs up front rather than as part of the instance update process. Deployment time may remain roughly the same, however, once created, the time during which VMs are being updated and needing special monitoring should be noticeably reduced.
create-swap-deletestrategy will not be used for instance groups which have networks using static IPs due to the exclusivity of the IPs in IaaSes.
- When using
create-swap-delete, your IaaS resource usage will inherently surge during the deploy while BOSH creates additional VMs in preparation for update. You may need to review any resource limits which are in effect for your IaaS and account.