veza/infra/ansible/roles/veza_haproxy_switch/README.md
senke 4acbcc170a feat(ansible): roles/veza_haproxy_switch — atomic blue/green switch
Per-deploy delta on top of roles/haproxy: re-template the cfg
referencing the freshly-deployed color, validate, atomic-swap, HUP.
Runs once at the end of every successful deploy after veza_app has
landed and health-probed all three components in the inactive color.

Layout:
  defaults/main.yml — paths (haproxy.cfg + .new + .bak), state dir
                      (/var/lib/veza/active-color + history), keep
                      window (5 deploys for instant rollback); sketched
                      below.
  tasks/main.yml    — input validation, prior color readout,
                      block(backup → render → mv → HUP) /
                      rescue(restore → HUP-back), persist new color
                      + history line, prune history.
  handlers/main.yml — the "Reload haproxy" handler (wired up via listen).
  meta/main.yml     — Debian 13, no role deps.
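
For concreteness, a defaults/main.yml along these lines would match that
layout; the variable names here are illustrative guesses, not necessarily
the role's real ones:

```yaml
# Config paths: the live file plus its staging (.new) and backup (.bak) copies.
haproxy_cfg: /etc/haproxy/haproxy.cfg
haproxy_cfg_new: "{{ haproxy_cfg }}.new"
haproxy_cfg_bak: "{{ haproxy_cfg }}.bak"

# State dir and files consumed by the rollback playbook.
veza_state_dir: /var/lib/veza
veza_active_color_file: "{{ veza_state_dir }}/active-color"
veza_active_color_history: "{{ veza_state_dir }}/active-color.history"

# Keep window: how many deploys stay in the history file for instant rollback.
veza_color_history_keep: 5
```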

Why a separate role from `roles/haproxy`?
  * `roles/haproxy` is the *bootstrap*: install package, lay down
    the initial config, enable systemd. Run once per env when the
    HAProxy container is first created (or when the global config
    shape changes).
  * `roles/veza_haproxy_switch` is the *per-deploy delta*. No apt,
    no service-create — just template + validate + swap + HUP.
    Keeps the per-deploy path narrow.

Rescue semantics:
  * Capture haproxy.cfg → haproxy.cfg.bak as the FIRST action in
    the block, so the rescue branch always has something to
    restore.
  * Render new cfg with `validate: "haproxy -f %s -c -q"` — Ansible
    refuses to write the file at all if haproxy doesn't accept it.
    A typoed template never reaches even haproxy.cfg.new.
  * mv .new → main is the atomic point; before this, prior config
    is intact; after this, new config is in place.
  * HUP via systemctl reload — graceful, drains old workers.
  * On ANY failure in the four-step block, rescue restores from
    .bak and HUPs back. HAProxy ends the deploy serving exactly
    what it served at the start.
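
A condensed sketch of that block/rescue, assuming the default paths above;
the template name is a guess, and the real role routes the HUP through the
"Reload haproxy" handler in handlers/main.yml rather than reloading inline:

```yaml
- name: Switch HAProxy to {{ veza_active_color }}
  block:
    - name: Back up the live config first, so rescue always has something to restore
      ansible.builtin.copy:
        src: /etc/haproxy/haproxy.cfg
        dest: /etc/haproxy/haproxy.cfg.bak
        remote_src: true

    - name: Render the new config; Ansible refuses to write it if validation fails
      ansible.builtin.template:
        src: haproxy.cfg.j2              # template name is illustrative
        dest: /etc/haproxy/haproxy.cfg.new
        validate: "haproxy -f %s -c -q"

    - name: Atomic point, promote the new config
      ansible.builtin.command: mv /etc/haproxy/haproxy.cfg.new /etc/haproxy/haproxy.cfg

    - name: Graceful HUP, drains old workers
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded

  rescue:
    - name: Restore the previous config from the backup
      ansible.builtin.copy:
        src: /etc/haproxy/haproxy.cfg.bak
        dest: /etc/haproxy/haproxy.cfg
        remote_src: true

    - name: HUP back onto the previous config
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded

    - name: Fail the deploy after restoring
      ansible.builtin.fail:
        msg: "Switch to {{ veza_active_color }} failed; previous HAProxy config restored."
```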

State files:
  /var/lib/veza/active-color           one-liner with current color
  /var/lib/veza/active-color.history   last 5 deploys, newest first

The history file is what the rollback playbook reads to do an
instant point-in-time switch (no artefact re-fetch) when the prior
color's containers are still alive.
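
The persist and prune steps could be as simple as the following sketch; the
actual history line format is whatever tasks/main.yml writes, so treat this
purely as an illustration:

```yaml
- name: Record the new active color
  ansible.builtin.copy:
    content: "{{ veza_active_color }}\n"
    dest: /var/lib/veza/active-color
    mode: "0644"

- name: Prepend a history line, newest first (timestamp needs gathered facts)
  ansible.builtin.lineinfile:
    path: /var/lib/veza/active-color.history
    create: true
    insertbefore: BOF
    line: "{{ ansible_date_time.iso8601 }} {{ veza_active_color }} {{ veza_release_sha }}"
    mode: "0644"

- name: Prune history to the keep window (5 deploys)
  ansible.builtin.shell: |
    tmp=$(mktemp)
    head -n 5 /var/lib/veza/active-color.history > "$tmp"
    mv "$tmp" /var/lib/veza/active-color.history
```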


veza_haproxy_switch role

Atomically swap HAProxy's active color. Runs against the {{ veza_container_prefix }}haproxy container after veza_app has recreated + health-probed all three components in the inactive color.

Why a separate role from haproxy?

  • roles/haproxy provisions a fresh HAProxy container — install the package, lay down the initial config, enable the systemd unit. It runs once when the staging/prod env is bootstrapped and occasionally when the global config shape changes.
  • roles/veza_haproxy_switch performs the per-deploy delta — re-template the cfg with a new veza_active_color, validate, swap, HUP. It runs once at the end of every successful deploy.

Splitting them keeps the per-deploy path narrow (no apt, no service install) and lets roles/haproxy remain idempotent when the global shape hasn't changed.

Inputs

  variable                required   meaning
  veza_active_color       yes        Color to switch TO (blue or green). Becomes the new active.
  veza_release_sha        yes        SHA being deployed. Logged in the active-color history file.
  veza_container_prefix   inherit    From group_vars/.yml.
  haproxy_topology        inherit    Should be blue-green for this role to make sense.
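
As a sketch, a deploy play might hand those inputs to the role like this;
the host group and the literal values are placeholders:

```yaml
- name: Flip HAProxy to the freshly deployed color
  hosts: haproxy                      # placeholder group for the HAProxy host
  become: true
  roles:
    - role: veza_haproxy_switch
      vars:
        veza_active_color: green      # the color veza_app just deployed and health-probed
        veza_release_sha: "0abc1234"  # placeholder; normally passed in by the deploy pipeline
```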

Failure semantics

The render → validate → atomic-swap → HUP sequence runs in an Ansible block: with a rescue: that restores haproxy.cfg.bak (captured before the swap) and re-HUPs. So an invalid config or a HUP failure leaves HAProxy serving the previous active color exactly as before — the deploy as a whole then fails at the playbook level.

What the role does NOT do

  • It does not destroy or recreate the HAProxy container. That's a one-time operation under roles/haproxy.
  • It does not touch app containers — by the time this role runs, blue/green app containers are both healthy.
  • It does not remove the previously-active color's containers. They survive (intentional) so a rollback can flip back instantly. The next deploy naturally recycles them.