veza/infra/ansible/roles/incus_host/tasks/main.yml
senke 65c20835c1
feat(infra): Ansible IaC scaffolding — common + incus_host roles (Day 5 v1.0.9)
Day 5 of ROADMAP_V1.0_LAUNCH.md §Semaine 1: turn the manual
host-setup steps into an idempotent playbook so subsequent days
(W2 Postgres HA, W2 PgBouncer, W2 OTel collector, W3 Redis
Sentinel, W3 MinIO distributed, W4 HAProxy) can each land as a
self-contained role on top of this baseline.

Layout (full tree under infra/ansible/):

  ansible.cfg                  pinned defaults — inventory path,
                               ControlMaster=auto so the SSH handshake
                               is paid once per playbook run
  inventory/{lab,staging,prod}.yml
                               three environments. lab is the R720's
                               local Incus container (10.0.20.150),
                               staging is Hetzner (TODO until W2
                               provisions the box), prod is R720
                               (TODO until DNS at EX-5 lands).
  group_vars/all.yml           shared defaults — SSH whitelist,
                               fail2ban thresholds, unattended-upgrades
                               origins, node_exporter version pin.
  playbooks/site.yml           entry point. Two plays:
                                 1. common (every host)
                                 2. incus_host (incus_hosts group)
  roles/common/                idempotent baseline:
                                 ssh.yml — drop-in
                                   /etc/ssh/sshd_config.d/50-veza-
                                   hardening.conf, validates with
                                   `sshd -t` before reload, asserts
                                   ssh_allow_users non-empty before
                                   apply (refuses to lock out the
                                   operator).
                                 fail2ban.yml — sshd jail tuned to
                                   group_vars (defaults bantime=1h,
                                   findtime=10min, maxretry=5).
                                 unattended_upgrades.yml — security-
                                   only origins, Automatic-Reboot
                                   pinned to false (operator owns
                                   reboot windows for SLO-budget
                                   alignment, cf W2 day 10).
                                 node_exporter.yml — pinned to
                                   1.8.2, runs as a systemd unit
                                   on :9100. Skips download when
                                   --version already matches.
  roles/incus_host/            zabbly upstream apt repo + incus +
                               incus-client install. First-time
                               `incus admin init --preseed` only when
                               `incus list` errors (i.e. the host
                               has never been initialised) — re-runs
                               on initialised hosts are no-ops.
                               Configures incusbr0 / 10.99.0.1/24
                               with NAT + default storage pool.
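
The lock-out guard plus validated drop-in described for
roles/common/tasks/ssh.yml could look roughly like this (a sketch —
the variable, template, and handler names are illustrative, not
necessarily the exact ones in the repo):

```yaml
# Sketch of the ssh.yml pattern — refuse to apply before templating.
- name: Refuse to apply SSH hardening with an empty allow list
  ansible.builtin.assert:
    that:
      - ssh_allow_users | length > 0
    fail_msg: "ssh_allow_users is empty — AllowUsers would lock the operator out"
  tags: [ssh]

- name: Install sshd hardening drop-in (validated before it can break sshd)
  ansible.builtin.template:
    src: 50-veza-hardening.conf.j2
    dest: /etc/ssh/sshd_config.d/50-veza-hardening.conf
    mode: "0600"
    validate: /usr/sbin/sshd -t -f %s
  notify: Reload sshd
  tags: [ssh]
```

The `validate:` argument makes the template module run `sshd -t`
against the rendered file in a temp path and abort the copy on a
non-zero exit, so a bad drop-in never reaches sshd_config.d.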

Acceptance verified locally (a full --check run needs SSH to the
lab host, which is not reachable from this box, so the user runs
that step):

  $ cd infra/ansible
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --syntax-check
  playbook: playbooks/site.yml          ← clean
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --list-tasks
  21 tasks across 2 plays, all tagged.  ← partial applies work

Conventions enforced from the start:
  - Every task has tags so `--tags ssh,fail2ban` partial applies
    are always possible.
  - Sub-task files (ssh.yml, fail2ban.yml, etc.) so the role
    main.yml stays a directory of concerns, not a wall of tasks.
  - Validators run before reload (sshd -t for sshd_config). The
    role refuses to apply changes that would lock the operator out.
  - Comments answer "why" — task names + module names already
    say "what".

Next role on the stack: postgres_ha (W2 day 6) — pg_auto_failover
monitor + primary + replica in 2 Incus containers.

SKIP_TESTS=1 — IaC YAML, no app code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 18:16:38 +02:00


# Incus host role — installs Incus from the upstream zabbly repo
# (Ubuntu 22.04+) and stages the network bridge that Veza
# containers attach to.
#
# v1.0.9 Day 5: bare bones (install + bridge + first-time init).
# Postgres / Redis / MinIO / RabbitMQ / Veza app containers land in
# their own roles (W2-W4) and reference `incus_bridge` here.
#
# Idempotent — running on a host that already has Incus reuses
# the existing config rather than re-initialising.
---
- name: Install zabbly Incus repo signing key
  ansible.builtin.get_url:
    url: https://pkgs.zabbly.com/key.asc
    dest: /etc/apt/keyrings/zabbly.asc
    mode: "0644"
    force: false
  tags: [incus, packages]

- name: Add zabbly Incus apt source
  ansible.builtin.copy:
    dest: /etc/apt/sources.list.d/zabbly-incus-stable.sources
    owner: root
    group: root
    mode: "0644"
    content: |
      Enabled: yes
      Types: deb
      URIs: https://pkgs.zabbly.com/incus/stable
      Suites: {{ ansible_distribution_release }}
      Components: main
      Architectures: {{ ansible_architecture | replace('x86_64', 'amd64') }}
      Signed-By: /etc/apt/keyrings/zabbly.asc
  notify: Update apt cache after Incus repo add
  tags: [incus, packages]

- name: Update apt cache (Incus repo just added)
  ansible.builtin.apt:
    update_cache: true
  changed_when: false
  tags: [incus, packages]

- name: Install Incus packages
  ansible.builtin.apt:
    name:
      - incus
      - incus-client
    state: present
  tags: [incus, packages]

- name: Ensure ansible user is in the incus-admin group (lets it run `incus` non-sudo)
  ansible.builtin.user:
    name: ansible
    groups: incus-admin
    append: true
  tags: [incus, users]

- name: Check whether Incus is already initialised
  ansible.builtin.command: incus list
  register: incus_init_check
  changed_when: false
  failed_when: false
  check_mode: false
  tags: [incus, init]

- name: First-time Incus init via preseed (only when not initialised)
  ansible.builtin.shell:
    cmd: |
      cat <<EOF | incus admin init --preseed
      config: {}
      networks:
        - name: {{ incus_bridge }}
          type: bridge
          config:
            ipv4.address: {{ incus_bridge_ipv4 }}
            ipv4.nat: "true"
            ipv6.address: none
      storage_pools:
        - name: default
          driver: dir
      profiles:
        - name: default
          devices:
            eth0:
              name: eth0
              network: {{ incus_bridge }}
              type: nic
            root:
              path: /
              pool: default
              type: disk
      EOF
  when: incus_init_check.rc != 0
  tags: [incus, init]

- name: Ensure Incus service is enabled + running
  ansible.builtin.service:
    name: incus
    state: started
    enabled: true
  tags: [incus, service]
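
A follow-on role (e.g. postgres_ha in W2) could attach a container
to the bridge this role configures along these lines — a sketch
only: the container name, image alias, and the "already exists"
idempotence guard are placeholders, not part of this commit:

```yaml
# Hypothetical consumer of incus_bridge — lands in its own role (W2).
- name: Launch pg-primary on the Veza bridge (placeholder name/image)
  ansible.builtin.command: >
    incus launch images:ubuntu/22.04 pg-primary
    --network {{ incus_bridge }}
  register: pg_launch
  changed_when: pg_launch.rc == 0
  failed_when:
    - pg_launch.rc != 0
    - "'already exists' not in pg_launch.stderr"
  tags: [postgres, containers]
```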