feat(ansible): roles/veza_haproxy_switch — atomic blue/green switch

Per-deploy delta on top of roles/haproxy: re-template the cfg
referencing the freshly-deployed color, validate, atomic-swap, HUP.
Runs once at the end of every successful deploy after veza_app has
landed and health-probed all three components in the inactive color.

Layout:
  defaults/main.yml — paths (haproxy.cfg + .new + .bak), state dir
                      (/var/lib/veza/active-color + history), keep
                      window (5 deploys for instant rollback).
  tasks/main.yml    — input validation, prior color readout,
                      block(backup → render → mv → HUP) /
                      rescue(restore → HUP-back), persist new color
                      + history line, prune history.
  handlers/main.yml — Reload haproxy listen handler.
  meta/main.yml     — Debian 13, no role deps.

Why a separate role from `roles/haproxy`?
  * `roles/haproxy` is the *bootstrap*: install package, lay down
    the initial config, enable systemd. Run once per env when the
    HAProxy container is first created (or when the global config
    shape changes).
  * `roles/veza_haproxy_switch` is the *per-deploy delta*. No apt,
    no service-create — just template + validate + swap + HUP.
    Keeps the per-deploy path narrow.

Rescue semantics:
  * Capture haproxy.cfg → haproxy.cfg.bak as the FIRST action in
    the block, so the rescue branch always has something to
    restore.
  * Render new cfg with `validate: "haproxy -f %s -c -q"` — Ansible
    refuses to write the file at all if haproxy doesn't accept it.
    A typoed template never reaches even haproxy.cfg.new.
  * mv .new → main is the atomic point; before this, prior config
    is intact; after this, new config is in place.
  * HUP via systemctl reload — graceful, drains old workers.
  * On ANY failure in the four-step block, rescue restores from
    .bak and HUPs back. HAProxy ends the deploy serving exactly
    what it served at the start.

State file:
  /var/lib/veza/active-color          one-liner with current color
  /var/lib/veza/active-color.history  last 5 deploys, newest first

The history file is what the rollback playbook reads to do an
instant point-in-time switch (no artefact re-fetch) when the prior
color's containers are still alive.
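
A rollback play consuming the history file could look roughly like the
sketch below. It is not the actual rollback playbook: the `haproxy`
host group, the fact names, and the choice of the second-newest entry
are assumptions made for illustration. The line format it parses is the
one written by the history task in tasks/main.yml.

```yaml
# Hypothetical rollback play; host group and entry selection are assumptions.
- name: Flip HAProxy back to the previous deploy's color
  hosts: haproxy            # assumed inventory group for the HAProxy container
  become: true
  gather_facts: true        # the switch role uses ansible_date_time
  tasks:
    - name: Read the active-color history (newest entry first)
      ansible.builtin.slurp:
        src: /var/lib/veza/active-color.history
      register: history_raw

    - name: Split the history into lines
      ansible.builtin.set_fact:
        history_lines: "{{ (history_raw.content | b64decode).splitlines() }}"

    - name: Require at least one earlier deploy to roll back to
      ansible.builtin.assert:
        that:
          - history_lines | length >= 2

    # Entry 0 is the current deploy; entry 1 is the one before it.
    # Line format: <iso8601> sha=<sha> color=<blue|green> prior=<color>
    - name: Extract the target color and SHA from the previous entry
      ansible.builtin.set_fact:
        rollback_color: "{{ (history_lines[1] | regex_search('color=(blue|green)')).split('=')[1] }}"
        rollback_sha: "{{ (history_lines[1] | regex_search('sha=[^ ]+')).split('=')[1] }}"

    - name: Re-run the switch role toward the previous color
      ansible.builtin.include_role:
        name: veza_haproxy_switch
      vars:
        veza_active_color: "{{ rollback_color }}"
        veza_release_sha: "{{ rollback_sha }}"
```

Because the switch role is reused, the rollback gets the same
validate, swap, and rescue behaviour as a forward deploy, and it
records its own history entry.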

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent 70df301823
commit 4acbcc170a
5 changed files with 232 additions and 0 deletions
47
infra/ansible/roles/veza_haproxy_switch/README.md
Normal file
@@ -0,0 +1,47 @@
# `veza_haproxy_switch` role

Atomically swap HAProxy's active color. Runs against the
`{{ veza_container_prefix }}haproxy` container after `veza_app` has
recreated + health-probed all three components in the inactive color.

## Why a separate role from `haproxy`?

- `roles/haproxy` provisions a fresh HAProxy container — install
  the package, lay down the *initial* config, enable the systemd
  unit. It runs once when the staging/prod env is bootstrapped and
  occasionally when the global config shape changes.
- `roles/veza_haproxy_switch` performs the *per-deploy* delta —
  re-template the cfg with a new `veza_active_color`, validate,
  swap, HUP. It runs once at the end of every successful deploy.

Splitting them keeps the per-deploy path narrow (no apt, no service
install) and lets `roles/haproxy` remain idempotent when the global
shape hasn't changed.

## Inputs

| variable                | required | meaning                                                              |
| ----------------------- | -------- | -------------------------------------------------------------------- |
| `veza_active_color`     | yes      | Color to switch TO (`blue` or `green`). Becomes the new active.      |
| `veza_release_sha`      | yes      | SHA being deployed. Logged in the active-color history file.         |
| `veza_container_prefix` | inherit  | From group_vars/<env>.yml.                                           |
| `haproxy_topology`      | inherit  | Should be `blue-green` for this role to make sense.                  |

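
As an illustration of the inputs above, a final deploy play might call
the role as follows. This is a minimal sketch: the `haproxy` host group
and the SHA value are placeholders, not taken from the repository.

```yaml
# Hypothetical final play of a deploy; group name and SHA are placeholders.
- name: Point HAProxy at the freshly deployed color
  hosts: haproxy              # assumed inventory group
  become: true
  gather_facts: true          # the role stamps history entries with ansible_date_time
  roles:
    - role: veza_haproxy_switch
      vars:
        veza_active_color: green
        veza_release_sha: "0123456789abcdef0123456789abcdef01234567"  # placeholder 40-char SHA
```

Leaving either required variable unset trips the role's input
assertion, since the defaults are empty strings.
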

## Failure semantics

The render → validate → atomic-swap → HUP sequence runs in an
Ansible `block:` with a `rescue:` that restores `haproxy.cfg.bak`
(captured before the swap) and re-HUPs. So an invalid config or a
HUP failure leaves HAProxy serving the *previous* active color
exactly as before — the deploy as a whole then fails at the playbook
level.

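
A caller that wants to report the surviving state before the deploy
aborts could wrap the role in its own `block:`/`rescue:`. The tasks
below are a sketch under assumptions: `deploy_color` and `deploy_sha`
are hypothetical caller variables, and the state file path is the
role's default.

```yaml
# Hypothetical caller-side tasks; deploy_color and deploy_sha are assumed names.
- name: Switch HAProxy and report the surviving color on failure
  block:
    - name: Run the switch role
      ansible.builtin.include_role:
        name: veza_haproxy_switch
      vars:
        veza_active_color: "{{ deploy_color }}"
        veza_release_sha: "{{ deploy_sha }}"
  rescue:
    - name: Read the color still recorded as active
      ansible.builtin.slurp:
        src: /var/lib/veza/active-color
      register: still_active
      failed_when: false      # the file is absent before the first successful switch

    - name: Fail the deploy with that information
      ansible.builtin.fail:
        msg: >-
          HAProxy switch failed; active-color still reads
          {{ (still_active.content | b64decode | trim)
             if still_active.content is defined else 'unknown' }}.
```
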
## What the role does NOT do

- It does not destroy or recreate the HAProxy container. That's a
  one-time operation under `roles/haproxy`.
- It does not touch app containers — by the time this role runs,
  blue/green app containers are both healthy.
- It does not remove the previously-active color's containers. They
  survive (intentional) so a rollback can flip back instantly. The
  next deploy naturally recycles them.
18
infra/ansible/roles/veza_haproxy_switch/defaults/main.yml
Normal file
@@ -0,0 +1,18 @@
---
# These should be set by the caller — defaults here are guards that
# fail loud if the caller forgot to pass them.
veza_active_color: ""
veza_release_sha: ""

# Paths inside the HAProxy container.
haproxy_cfg_path: /etc/haproxy/haproxy.cfg
haproxy_cfg_new_path: /etc/haproxy/haproxy.cfg.new
haproxy_cfg_backup_path: /etc/haproxy/haproxy.cfg.bak
haproxy_state_dir: /var/lib/veza
haproxy_active_color_file: /var/lib/veza/active-color
haproxy_active_color_history: /var/lib/veza/active-color.history

# How many history entries to keep before pruning. The rollback role
# offers point-in-time switch within this window without redeploying
# the artefact.
haproxy_active_color_history_keep: 5
9
infra/ansible/roles/veza_haproxy_switch/handlers/main.yml
Normal file
@@ -0,0 +1,9 @@
---
# HUP haproxy via systemd reload (graceful — drains old workers).
# Used both on success (after atomic swap) and on rescue (after
# restoring backup).
- name: Reload haproxy
  ansible.builtin.systemd:
    name: haproxy
    state: reloaded
  listen: "veza-haproxy reload"
16
infra/ansible/roles/veza_haproxy_switch/meta/main.yml
Normal file
@@ -0,0 +1,16 @@
---
galaxy_info:
  role_name: veza_haproxy_switch
  author: Veza Ops
  description: >-
    Atomically swap HAProxy's active color (blue/green) and persist
    the new state. Runs once per deploy, after veza_app has health-
    probed all components in the inactive color. Block/rescue
    guarantees HAProxy never lands on a bad config.
  license: proprietary
  min_ansible_version: "2.15"
  platforms:
    - name: Debian
      versions: ["13"]

dependencies: []
142
infra/ansible/roles/veza_haproxy_switch/tasks/main.yml
Normal file
@@ -0,0 +1,142 @@
# Atomic blue/green switch. The HAProxy template lives in
# roles/haproxy/templates/haproxy.cfg.j2 — it reads veza_active_color
# to render the right `backup` directives. We re-template, validate,
# atomic-swap, HUP.
#
# Block/rescue: any failure in the four-step sequence restores
# haproxy.cfg from the backup we capture before touching anything.
# That way, an invalid template or a HUP error never leaves HAProxy
# serving from a stale or broken cfg — it stays on whatever was
# active when the role started.
---
- name: Validate inputs
  ansible.builtin.assert:
    that:
      - veza_active_color in ['blue', 'green']
      - veza_release_sha | length == 40
    fail_msg: >-
      veza_haproxy_switch role requires veza_active_color (blue|green)
      and veza_release_sha (40-char git SHA). Got: color={{ veza_active_color }}
      sha={{ veza_release_sha }}.
    quiet: true
  tags: [veza_haproxy_switch, always]

- name: Ensure veza state dir exists in HAProxy container
  ansible.builtin.file:
    path: "{{ haproxy_state_dir }}"
    state: directory
    owner: root
    group: root
    mode: "0755"
  tags: [veza_haproxy_switch]

- name: Read currently-active color (if any)
  ansible.builtin.slurp:
    src: "{{ haproxy_active_color_file }}"
  register: prior_color_raw
  failed_when: false
  changed_when: false
  tags: [veza_haproxy_switch]

- name: Resolve prior_active_color (default blue if no history)
  ansible.builtin.set_fact:
    prior_active_color: >-
      {{ (prior_color_raw.content | b64decode | trim) if prior_color_raw.content is defined
      else 'blue' }}
  tags: [veza_haproxy_switch]

- name: Switch sequence (block/rescue — restores cfg on any failure)
  block:
    - name: Backup current haproxy.cfg
      ansible.builtin.copy:
        src: "{{ haproxy_cfg_path }}"
        dest: "{{ haproxy_cfg_backup_path }}"
        remote_src: true
        mode: "0640"
      tags: [veza_haproxy_switch]

    - name: Render fresh haproxy.cfg with new active_color
      ansible.builtin.template:
        src: "{{ playbook_dir }}/../roles/haproxy/templates/haproxy.cfg.j2"
        dest: "{{ haproxy_cfg_new_path }}"
        owner: root
        group: haproxy
        mode: "0640"
        validate: "haproxy -f %s -c -q"
      vars:
        # The existing template reads `veza_active_color`, which is
        # already in scope as a role input. Also expose the value as
        # `haproxy_active_color` in case a future template revision
        # switches to that name.
        haproxy_active_color: "{{ veza_active_color }}"
      tags: [veza_haproxy_switch]

    - name: Atomic swap — mv haproxy.cfg.new → haproxy.cfg
      ansible.builtin.command: mv -f "{{ haproxy_cfg_new_path }}" "{{ haproxy_cfg_path }}"
      changed_when: true
      tags: [veza_haproxy_switch]

    - name: HUP haproxy (graceful reload, no connection drop)
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded
      tags: [veza_haproxy_switch]
  rescue:
    - name: Restore haproxy.cfg from backup
      ansible.builtin.command:
        cmd: mv -f "{{ haproxy_cfg_backup_path }}" "{{ haproxy_cfg_path }}"
        # Skipped when no backup exists: in that case the backup task
        # itself failed and haproxy.cfg was never touched.
        removes: "{{ haproxy_cfg_backup_path }}"
      tags: [veza_haproxy_switch]

    - name: HUP haproxy back to the prior config
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded
      failed_when: false
      tags: [veza_haproxy_switch]

    - name: Report the failure
      ansible.builtin.fail:
        msg: >-
          HAProxy switch to color {{ veza_active_color }} (sha
          {{ veza_release_sha[:12] }}) failed — config rolled back
          to the prior state. HAProxy continues serving from
          {{ prior_active_color }}. Inspect the validate step's
          stderr in the playbook output above.

# Success path: persist new active color + history.
- name: Write new active color
  ansible.builtin.copy:
    dest: "{{ haproxy_active_color_file }}"
    content: "{{ veza_active_color }}\n"
    owner: root
    group: root
    mode: "0644"
  tags: [veza_haproxy_switch]

- name: Append to active-color history
  ansible.builtin.lineinfile:
    path: "{{ haproxy_active_color_history }}"
    line: "{{ ansible_date_time.iso8601 }} sha={{ veza_release_sha }} color={{ veza_active_color }} prior={{ prior_active_color }}"
    create: true
    insertbefore: BOF
    mode: "0644"
  tags: [veza_haproxy_switch]

- name: Prune history beyond keep limit
  ansible.builtin.shell: |
    set -e
    if [ -f "{{ haproxy_active_color_history }}" ]; then
      head -n {{ haproxy_active_color_history_keep }} "{{ haproxy_active_color_history }}" > "{{ haproxy_active_color_history }}.tmp"
      mv -f "{{ haproxy_active_color_history }}.tmp" "{{ haproxy_active_color_history }}"
    fi
  args:
    executable: /bin/bash
  changed_when: false
  tags: [veza_haproxy_switch]

- name: Drop the now-stale backup
  ansible.builtin.file:
    path: "{{ haproxy_cfg_backup_path }}"
    state: absent
  tags: [veza_haproxy_switch]