Day 5 of ROADMAP_V1.0_LAUNCH.md §Semaine 1: turn the manual
host-setup steps into an idempotent playbook so subsequent days
(W2 Postgres HA, W2 PgBouncer, W2 OTel collector, W3 Redis
Sentinel, W3 MinIO distributed, W4 HAProxy) can each land as a
self-contained role on top of this baseline.
Layout (full tree under infra/ansible/):

  ansible.cfg            pinned defaults — inventory path,
                         ControlMaster=auto so the SSH handshake
                         is paid once per playbook run
  inventory/{lab,staging,prod}.yml
                         three environments. lab is the R720's
                         local Incus container (10.0.20.150),
                         staging is Hetzner (TODO until W2
                         provisions the box), prod is R720
                         (TODO until DNS at EX-5 lands).
  group_vars/all.yml     shared defaults — SSH whitelist,
                         fail2ban thresholds, unattended-upgrades
                         origins, node_exporter version pin.
  playbooks/site.yml     entry point. Two plays:
                           1. common (every host)
                           2. incus_host (incus_hosts group)
  roles/common/          idempotent baseline:
                           ssh.yml — drop-in
                             /etc/ssh/sshd_config.d/50-veza-hardening.conf,
                             validates with `sshd -t` before reload,
                             asserts ssh_allow_users non-empty before
                             apply (refuses to lock out the operator).
                           fail2ban.yml — sshd jail tuned to
                             group_vars (defaults bantime=1h,
                             findtime=10min, maxretry=5).
                           unattended_upgrades.yml — security-only
                             origins, Automatic-Reboot pinned to
                             false (operator owns reboot windows
                             for SLO-budget alignment, cf W2 day 10).
                           node_exporter.yml — pinned to 1.8.2,
                             runs as a systemd unit on :9100.
                             Skips download when --version
                             already matches.
  roles/incus_host/      zabbly upstream apt repo + incus +
                         incus-client install. First-time
                         `incus admin init --preseed` only when
                         `incus list` errors (i.e. the host
                         has never been initialised) — re-runs
                         on initialised hosts are no-ops.
                         Configures incusbr0 / 10.99.0.1/24
                         with NAT + default storage pool.
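
The first-run guard described for roles/incus_host/ could be sketched as the two tasks below. This is a minimal sketch, not the role's actual content: the registered variable name, the `incus` tag, the `dir` storage driver, and the default profile devices are assumptions. Only the `incus list` probe, the `--preseed` init, and the incusbr0 / 10.99.0.1/24 NAT bridge come from the description above.

```yaml
# Hypothetical sketch of the init guard in roles/incus_host (not the real task file).
- name: Probe whether Incus is already initialised
  ansible.builtin.command: incus list
  register: incus_probe            # hypothetical variable name
  changed_when: false              # read-only probe, never reports a change
  failed_when: false               # a non-zero rc is the signal, not an error
  check_mode: false                # run the probe even under --check
  tags: [incus]

- name: First-time init via preseed (skipped when the probe succeeded)
  ansible.builtin.command:
    cmd: incus admin init --preseed
    stdin: |
      networks:
        - name: incusbr0
          type: bridge
          config:
            ipv4.address: 10.99.0.1/24
            ipv4.nat: "true"
      storage_pools:
        - name: default
          driver: dir
      profiles:
        - name: default
          devices:
            root:
              path: /
              pool: default
              type: disk
            eth0:
              name: eth0
              network: incusbr0
              type: nic
  when: incus_probe.rc != 0
  tags: [incus]
```

Keeping the probe read-only and gating the init on its return code is what makes a re-run on an initialised host a no-op.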
Acceptance verified locally (a full --check run needs SSH to the lab
host, which is not reachable from this box, so the user runs that
step):
  $ cd infra/ansible
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --syntax-check
  playbook: playbooks/site.yml                  ← clean
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --list-tasks
  21 tasks across 2 plays, all tagged.          ← partial applies work
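
For orientation, the two plays that `--list-tasks` walks through might look roughly like this. It is a sketch only: the play names and `become: true` are assumptions, while the role names and the `incus_hosts` group come from the layout described in this message.

```yaml
# Hypothetical sketch of playbooks/site.yml (play names and become are assumptions).
---
- name: Common baseline (every host)
  hosts: all
  become: true          # assumption: the baseline tasks need root
  roles:
    - role: common

- name: Incus host setup
  hosts: incus_hosts
  become: true
  roles:
    - role: incus_host
```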
Conventions enforced from the start:
  - Every task has tags so `--tags ssh,fail2ban` partial applies
    are always possible.
  - Sub-task files (ssh.yml, fail2ban.yml, etc.) so the role
    main.yml stays a directory of concerns, not a wall of tasks.
  - Validators run before reload (sshd -t for sshd_config). The
    role refuses to apply changes that would lock the operator out.
  - Comments answer "why" — task names + module names already
    say "what".
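
The validate-before-reload convention could be sketched as follows, written as a standalone play so that it is self-contained. Several details are assumptions rather than anything taken from the repository: the directives inside the drop-in, the `ssh` service name (Debian/Ubuntu), the handler name, and the example value of `ssh_allow_users`.

```yaml
# Hypothetical standalone sketch; the real logic lives in roles/common/tasks/ssh.yml.
---
- name: SSH hardening sketch
  hosts: all
  become: true
  vars:
    ssh_allow_users: [ansible]     # example value; the real list is in group_vars
  tasks:
    - name: Refuse to continue if nobody would be allowed to log in
      ansible.builtin.assert:
        that:
          - ssh_allow_users is defined
          - ssh_allow_users | length > 0
        fail_msg: "ssh_allow_users is empty; applying AllowUsers would lock out the operator."
      tags: [common, ssh]

    - name: Install hardening drop-in (written only if sshd accepts it)
      ansible.builtin.copy:
        dest: /etc/ssh/sshd_config.d/50-veza-hardening.conf
        owner: root
        group: root
        mode: "0644"
        # The directives below are illustrative assumptions.
        content: |
          PasswordAuthentication no
          PermitRootLogin no
          AllowUsers {{ ssh_allow_users | join(' ') }}
        validate: /usr/sbin/sshd -t -f %s
      notify: Reload sshd
      tags: [common, ssh]

  handlers:
    - name: Reload sshd
      ansible.builtin.service:
        name: ssh                  # assumption: Debian/Ubuntu unit name
        state: reloaded
```

Because `validate` runs against the temporary file, a syntax error leaves the existing drop-in untouched and the handler never fires. Note that `-f %s` checks the fragment in isolation rather than the merged configuration.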
Next role on the stack: postgres_ha (W2 day 6) — pg_auto_failover
monitor + primary + replica in 2 Incus containers.
SKIP_TESTS=1 — IaC YAML, no app code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
58 lines
1.5 KiB
YAML
# Common baseline applied on every veza host (lab / staging / prod).
# Idempotent — safe to re-run on every playbook execution.
#
# Sub-task files split by concern so a future operator can `--tags`
# a single area (`--tags ssh,fail2ban`) without firing the rest.
---
- name: Update apt cache (only when older than 1 hour)
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: 3600
  changed_when: false
  tags: [common, packages]

- name: Install baseline packages
  ansible.builtin.apt:
    name:
      - curl
      - ca-certificates
      - gnupg
      - lsb-release
      - htop
      - vim
      - git
      - jq
      - rsync
      - ufw
      - fail2ban
      - unattended-upgrades
      - apt-listchanges
      - python3-apt
    state: present
  tags: [common, packages]

- name: Ensure ansible user exists (idempotent — no-op if pre-provisioned)
  ansible.builtin.user:
    name: ansible
    shell: /bin/bash
    groups: sudo
    append: true
    create_home: true
    state: present
  tags: [common, users]

- name: Import SSH hardening sub-tasks
  ansible.builtin.import_tasks: ssh.yml
  tags: [common, ssh]

- name: Import fail2ban sub-tasks
  ansible.builtin.import_tasks: fail2ban.yml
  tags: [common, fail2ban]

- name: Import unattended-upgrades sub-tasks
  ansible.builtin.import_tasks: unattended_upgrades.yml
  tags: [common, unattended-upgrades]

- name: Import node_exporter sub-tasks
  ansible.builtin.import_tasks: node_exporter.yml
  tags: [common, monitoring, node_exporter]
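
The commit message describes the imported node_exporter.yml as skipping the download when `--version` already matches the pin. A minimal sketch of that guard follows, assuming a hypothetical `node_exporter_version` variable (for example `1.8.2`, set in group_vars), an amd64 host, and an install path of `/usr/local/bin`. The systemd unit is omitted, and none of this is the actual task file.

```yaml
# Hypothetical sketch of the download guard in roles/common/tasks/node_exporter.yml.
- name: Read installed node_exporter version (non-zero rc when not installed)
  ansible.builtin.command: /usr/local/bin/node_exporter --version
  register: node_exporter_installed     # hypothetical variable name
  changed_when: false
  failed_when: false
  check_mode: false                     # read-only probe, safe under --check
  tags: [common, monitoring, node_exporter]

- name: Download and install node_exporter when the pin does not match
  when: >-
    node_exporter_installed.rc != 0 or
    ('version ' ~ node_exporter_version ~ ' ') not in
    (node_exporter_installed.stdout ~ node_exporter_installed.stderr)
  tags: [common, monitoring, node_exporter]
  block:
    - name: Fetch and unpack the release tarball
      ansible.builtin.unarchive:
        src: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/node_exporter-{{ node_exporter_version }}.linux-amd64.tar.gz"
        dest: /tmp
        remote_src: true

    - name: Install the binary
      ansible.builtin.copy:
        src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-amd64/node_exporter"
        dest: /usr/local/bin/node_exporter
        remote_src: true
        mode: "0755"
```

The version string is searched in both stdout and stderr because the stream used for `--version` output has differed between releases of Prometheus tooling.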