veza/infra/ansible/roles/veza_haproxy_switch/tasks/main.yml

# Atomic blue/green switch. The HAProxy template lives in
# roles/haproxy/templates/haproxy.cfg.j2 — it reads veza_active_color
# to render the right `backup` directives. We re-template, validate,
# atomic-swap, HUP.
#
# Block/rescue: any failure in the four-step sequence restores
# haproxy.cfg from the backup we capture before touching anything.
# That way, an invalid template or a HUP error never leaves HAProxy
# serving from a stale or broken cfg — it stays on whatever was
# active when the role started.
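#
# For orientation, the per-env stanza this role re-renders looks
# roughly like the sketch below. Illustrative only: the authoritative
# shape lives in roles/haproxy/templates/haproxy.cfg.j2, and the
# backend/server names and port here are assumptions, not the real
# template.
#
#   backend staging_backend_api
#     server blue  veza-staging-backend-blue.lxd:8080  check{{ ' backup' if haproxy_active_colors['staging'] != 'blue' else '' }}
#     server green veza-staging-backend-green.lxd:8080 check{{ ' backup' if haproxy_active_colors['staging'] != 'green' else '' }}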
---
- name: Validate inputs
ansible.builtin.assert:
that:
- veza_active_color in ['blue', 'green']
- veza_release_sha | length == 40
- veza_env in ['staging', 'prod']
fail_msg: >-
veza_haproxy_switch role requires veza_active_color (blue|green),
veza_release_sha (40-char git SHA), and veza_env (staging|prod).
      Got: color={{ veza_active_color | default('UNSET') }} sha={{ veza_release_sha | default('UNSET') }}
env={{ veza_env | default('UNSET') }}.
quiet: true
tags: [veza_haproxy_switch, always]
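
# How the role is typically pulled in at the end of a deploy play; an
# illustrative sketch (variable values are examples, not defaults):
#
#   - ansible.builtin.include_role:
#       name: veza_haproxy_switch
#     vars:
#       veza_env: staging
#       veza_active_color: green
#       veza_release_sha: "{{ new_release_sha }}"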
- name: Ensure veza state dir exists in HAProxy container
ansible.builtin.file:
path: "{{ haproxy_state_dir }}"
state: directory
owner: root
group: root
mode: "0755"
tags: [veza_haproxy_switch]
- name: Read currently-active color for THIS env (if any)
ansible.builtin.slurp:
src: "{{ haproxy_active_color_file }}"
register: prior_color_raw
failed_when: false
changed_when: false
tags: [veza_haproxy_switch]
- name: Resolve prior_active_color (default blue if no history)
ansible.builtin.set_fact:
prior_active_color: >-
{{ (prior_color_raw.content | b64decode | trim) if prior_color_raw.content is defined
else 'blue' }}
tags: [veza_haproxy_switch]
# Read the OTHER env's active color too — the haproxy template renders
# both staging+prod simultaneously, so we need both values in scope.
- name: Read OTHER env's active color
ansible.builtin.slurp:
src: "/var/lib/veza/active-color-{{ 'prod' if veza_env == 'staging' else 'staging' }}"
register: other_color_raw
failed_when: false
changed_when: false
tags: [veza_haproxy_switch]
- name: Build haproxy_active_colors map (current state of every env)
ansible.builtin.set_fact:
haproxy_active_colors:
staging: >-
{%- if veza_env == 'staging' -%}
{{ veza_active_color }}
{%- elif other_color_raw.content is defined -%}
{{ other_color_raw.content | b64decode | trim }}
{%- else -%}
blue
{%- endif -%}
prod: >-
{%- if veza_env == 'prod' -%}
{{ veza_active_color }}
{%- elif other_color_raw.content is defined -%}
{{ other_color_raw.content | b64decode | trim }}
{%- else -%}
blue
{%- endif -%}
tags: [veza_haproxy_switch]
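
# Example of the resulting shape when deploying green to staging while
# prod sits on blue: {'staging': 'green', 'prod': 'blue'}. The template
# renders both envs' backends from this one map in a single pass.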
- name: Switch sequence (block/rescue — restores cfg on any failure)
block:
- name: Backup current haproxy.cfg
ansible.builtin.copy:
src: "{{ haproxy_cfg_path }}"
dest: "{{ haproxy_cfg_backup_path }}"
remote_src: true
mode: "0640"
tags: [veza_haproxy_switch]
- name: Render fresh haproxy.cfg with new active_color
ansible.builtin.template:
src: "{{ playbook_dir }}/../roles/haproxy/templates/haproxy.cfg.j2"
dest: "{{ haproxy_cfg_new_path }}"
owner: root
group: haproxy
mode: "0640"
validate: "haproxy -f %s -c -q"
      vars:
        # The play already supplies veza_active_color; alias it to
        # haproxy_active_color as well, so the template sees the new
        # color under either name (the older template reads
        # veza_active_color; a future revision may read
        # haproxy_active_color).
        haproxy_active_color: "{{ veza_active_color }}"
tags: [veza_haproxy_switch]
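    # mv within one filesystem is a rename(2): readers only ever see
    # the complete old cfg or the complete new one, never a torn write.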
- name: Atomic swap — mv haproxy.cfg.new → haproxy.cfg
ansible.builtin.command: mv -f "{{ haproxy_cfg_new_path }}" "{{ haproxy_cfg_path }}"
changed_when: true
tags: [veza_haproxy_switch]
- name: HUP haproxy (graceful reload, no connection drop)
ansible.builtin.systemd:
name: haproxy
state: reloaded
tags: [veza_haproxy_switch]
rescue:
- name: Restore haproxy.cfg from backup
ansible.builtin.command: mv -f "{{ haproxy_cfg_backup_path }}" "{{ haproxy_cfg_path }}"
      failed_when: false  # best-effort restore; benign if the backup was never written
changed_when: true
tags: [veza_haproxy_switch]
- name: HUP haproxy back to the prior config
ansible.builtin.systemd:
name: haproxy
state: reloaded
failed_when: false
tags: [veza_haproxy_switch]
- name: Report the failure
ansible.builtin.fail:
msg: >-
HAProxy switch to color {{ veza_active_color }} (sha
{{ veza_release_sha[:12] }}) failed — config rolled back
to the prior state. HAProxy continues serving from
{{ prior_active_color }}. Inspect the validate step's
stderr in the playbook output above.
# Success path: persist new active color + history.
- name: Write new active color
ansible.builtin.copy:
dest: "{{ haproxy_active_color_file }}"
content: "{{ veza_active_color }}\n"
owner: root
group: root
mode: "0644"
tags: [veza_haproxy_switch]
- name: Append to active-color history
ansible.builtin.lineinfile:
path: "{{ haproxy_active_color_history }}"
line: "{{ ansible_date_time.iso8601 }} sha={{ veza_release_sha }} color={{ veza_active_color }} prior={{ prior_active_color }}"
create: true
insertbefore: BOF
mode: "0644"
tags: [veza_haproxy_switch]
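
# insertbefore: BOF keeps the history newest-first, so the prune below
# retains the most recent entries. Illustrative line (SHA shortened;
# real entries carry the full 40-char SHA):
#   2026-04-29T14:32:49Z sha=0123abc color=green prior=blue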
- name: Prune history beyond keep limit
ansible.builtin.shell: |
set -e
if [ -f "{{ haproxy_active_color_history }}" ]; then
head -n {{ haproxy_active_color_history_keep }} "{{ haproxy_active_color_history }}" > "{{ haproxy_active_color_history }}.tmp"
mv -f "{{ haproxy_active_color_history }}.tmp" "{{ haproxy_active_color_history }}"
fi
args:
executable: /bin/bash
changed_when: false
tags: [veza_haproxy_switch]
- name: Drop the now-stale backup
ansible.builtin.file:
path: "{{ haproxy_cfg_backup_path }}"
state: absent
tags: [veza_haproxy_switch]