veza/infra/ansible/playbooks/rollback.yml
senke f9d00bbe4d fix(ansible): syntax-check fixes — dynamic groups + block/rescue at task level
Three classes of issue surfaced by `ansible-playbook --syntax-check`
on the playbooks landed earlier in this series :

1. `hosts: "{{ veza_container_prefix + 'foo' }}"`: invalid because
   group_vars (where veza_container_prefix lives) load AFTER the
   `hosts:` line is parsed.
2. `block`/`rescue` at PLAY level: Ansible only accepts these at
   task level.
3. `delegate_to` on `include_role`: not a valid attribute; the
   include must be wrapped in a `block:` that carries `delegate_to`.
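
A minimal sketch of the task-level shape that items 2 and 3 call for, with a static `hosts:` value as in item 1. The play name, `example_role`, and the debug message are placeholders for illustration, not names taken from this repository:

```yaml
# Sketch only: example_role and the messages are hypothetical.
- name: Example play with a static group name
  hosts: incus_hosts            # static name, no templating in hosts:
  gather_facts: false
  tasks:
    - name: Run a role against the HAProxy host, with error handling
      delegate_to: "{{ groups['haproxy'][0] }}"   # delegate_to sits on the block
      block:
        - name: Include the role inside the block
          ansible.builtin.include_role:
            name: example_role  # hypothetical role name
      rescue:
        - name: Report the failure
          ansible.builtin.debug:
            msg: "example_role failed; recovery steps would go here."
```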

Fixes :

  inventory/{staging,prod}.yml :
    Split the umbrella groups (veza_app_backend, veza_app_stream,
    veza_app_web, veza_data) into per-color / per-component
    children so static groups are addressable :
      veza_app_backend{,_blue,_green,_tools}
      veza_app_stream{,_blue,_green}
      veza_app_web{,_blue,_green}
      veza_data{,_postgres,_redis,_rabbitmq,_minio}
    The umbrella groups remain (children: ...) so existing
    consumers keep working.
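
    The umbrella/children layout described above might look roughly
    like this in a YAML inventory. The host names are placeholders,
    not the real inventory entries:

    ```yaml
    # Sketch only: example-* host names are hypothetical.
    all:
      children:
        veza_app_backend:              # umbrella group, kept for existing consumers
          children:
            veza_app_backend_blue:
              hosts:
                example-backend-blue:
            veza_app_backend_green:
              hosts:
                example-backend-green:
            veza_app_backend_tools:
              hosts:
                example-backend-tools:
    ```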

  playbooks/deploy_app.yml :
    * Phase A : hosts: veza_app_backend_tools (was templated).
    * Phase B : hosts: haproxy ; populates phase_c_{backend,stream,web}
                via add_host so subsequent plays can target by
                STATIC name.
    * Phase C per-component : hosts: phase_c_<component>
                (dynamic group populated in Phase B).
    * Phase D / E : hosts: haproxy.
    * Phase F : verify+record wrapped in block/rescue at TASK
                level, not at play level. Re-switch HAProxy uses
                delegate_to on a block, with include_role inside.
    * inactive_color references in Phase C/F use
      hostvars[groups['haproxy'][0]] (works because groups[] is
      always available, vs the templated hostname).
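
    A rough sketch of the Phase B to Phase C hand-off, assuming
    `inactive_color` is available as a variable on the HAProxy host
    and that container names follow the
    `<prefix><component>-<color>` pattern used in rollback.yml.
    The task names and the debug task are illustrative only:

    ```yaml
    # Sketch only: assumes inactive_color is set on the haproxy host.
    - name: Phase B (sketch), register the inactive-color backend
      hosts: haproxy
      gather_facts: false
      tasks:
        - name: Add the backend container to a runtime group
          ansible.builtin.add_host:
            name: "{{ veza_container_prefix }}backend-{{ inactive_color }}"
            groups: phase_c_backend

    - name: Phase C (sketch), target the group built in Phase B
      hosts: phase_c_backend          # static name, populated at runtime
      gather_facts: false
      tasks:
        - name: Read inactive_color through the haproxy group
          ansible.builtin.debug:
            msg: "Deploying {{ hostvars[groups['haproxy'][0]].inactive_color }}"
    ```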

  playbooks/deploy_data.yml :
    * Per-kind plays use static group names (veza_data_postgres
      etc.) instead of templated hostnames.
    * `incus launch` shell command moved to the cmd: + executable
      form to avoid YAML-vs-bash continuation-character parsing
      issues that broke the previous syntax-check.
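
    For illustration, the `cmd:` + `executable:` form looks roughly
    like this. The variables `example_image` and `example_container`
    are placeholders; the real task's arguments are not shown in
    this commit message:

    ```yaml
    # Sketch only: example_image / example_container are hypothetical.
    - name: Launch a container
      ansible.builtin.shell:
        cmd: |
          set -e
          incus launch "{{ example_image }}" "{{ example_container }}"
        executable: /bin/bash
    ```

    With a literal block scalar (`|`), each shell line is passed
    through as written, so no backslash continuations are needed.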

  playbooks/rollback.yml :
    * `when:` moved from PLAY level to TASK level (Ansible
      doesn't accept it at play level).
    * `import_playbook ... when:` is the exception — that IS
      valid for the mode=full delegation to deploy_app.yml.
    * Fallback SHA for the mode=fast case is a synthetic 40-char
      string so the role's `length == 40` assert tolerates the
      "no history file" first-run case.

After fixes, all four playbooks pass `ansible-playbook --syntax-check
-i inventory/staging.yml ...`. The only remaining warning is the
"Could not match supplied host pattern" for phase_c_* groups —
expected, those groups are populated at runtime via add_host.

community.postgresql / community.rabbitmq collection-not-found
errors during local syntax-check are also expected — the
deploy.yml workflow installs them on the runner via
ansible-galaxy.

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:01:24 +02:00


# rollback.yml — two modes :
#
#   1. fast : flip HAProxy back to the previous active color.
#             Works only if those containers are still alive.
#             Effect time : ~5 seconds.
#
#   2. full : redeploy a specific release_sha by re-running
#             deploy_app.yml with that SHA.
#             Effect time : ~5-10 minutes.
#
# Required extra-vars:
#   env            staging | prod
#   mode           fast | full
#   target_color   (mode=fast only) the color to flip TO
#   release_sha    (mode=full only) the SHA to redeploy
#
# Caller (workflow_dispatch only — see .forgejo/workflows/rollback.yml).
---
- name: Validate inputs
  hosts: incus_hosts
  become: true
  gather_facts: false
  tasks:
    - name: Assert env + mode
      ansible.builtin.assert:
        that:
          - veza_env is defined
          - veza_env in ['staging', 'prod']
          - mode is defined
          - mode in ['fast', 'full']
        fail_msg: rollback.yml requires veza_env + mode (fast|full).
        quiet: true
    - name: Assert target_color when mode=fast
      ansible.builtin.assert:
        that:
          - target_color is defined
          - target_color in ['blue', 'green']
        fail_msg: rollback.yml mode=fast requires target_color (blue|green).
        quiet: true
      when: mode == 'fast'
    - name: Assert release_sha when mode=full
      ansible.builtin.assert:
        that:
          - veza_release_sha is defined
          - veza_release_sha | length == 40
        fail_msg: rollback.yml mode=full requires release_sha (40-char SHA).
        quiet: true
      when: mode == 'full'
# ---------------------------------------------------------------------
# mode=fast → HAProxy flip only.
# `when:` lives at TASK level (Ansible doesn't accept it at play level).
# ---------------------------------------------------------------------
- name: Fast rollback — verify target_color containers are alive
  hosts: incus_hosts
  become: true
  gather_facts: false
  tasks:
    - name: Check each target-color container exists and is RUNNING
      ansible.builtin.shell:
        cmd: |
          set -e
          CT="{{ veza_container_prefix }}{{ item }}-{{ target_color }}"
          if ! incus info "$CT" >/dev/null 2>&1; then
            echo "MISSING $CT"
            exit 1
          fi
          STATE=$(incus list "$CT" -c s --format csv)
          if [ "$STATE" != "RUNNING" ]; then
            echo "$CT is $STATE (not RUNNING)"
            exit 1
          fi
          echo "OK $CT"
        executable: /bin/bash
      loop:
        - backend
        - stream
        - web
      changed_when: false
      register: alive_check
      when: mode == 'fast'
      tags: [rollback, fast]
- name: Fast rollback — flip HAProxy
  hosts: haproxy
  become: true
  gather_facts: true
  tasks:
    - name: Apply veza_haproxy_switch with target_color
      ansible.builtin.include_role:
        name: veza_haproxy_switch
      vars:
        veza_active_color: "{{ target_color }}"
        # Fast rollback re-uses the previous SHA from the history file.
        # Fall back to a synthetic 40-char SHA if the file is missing;
        # the role's assert tolerates this for the rollback case.
        veza_release_sha: "{{ (lookup('ansible.builtin.file', '/var/lib/veza/active-color.history', errors='ignore') | default('', true) | regex_search('sha=([0-9a-f]{40})', '\\1') | default('r0llback' + '0' * 32, true)) }}"
      when: mode == 'fast'
      tags: [rollback, fast]
# ---------------------------------------------------------------------
# mode=full → re-run deploy_app.yml with the rollback SHA.
# `when:` IS valid on import_playbook (unlike on a regular play).
# ---------------------------------------------------------------------
- name: Full rollback — delegate to deploy_app.yml
  ansible.builtin.import_playbook: deploy_app.yml
  when: mode == 'full'
  tags: [rollback, full]