# Veza Ansible IaC

Infrastructure-as-code for the Veza self-hosted platform. Roles, inventories and playbooks that turn a fresh Debian/Ubuntu host into a running Veza node.

Scope at v1.0.9 Day 5 (this commit): scaffolding only — `common` baseline + `incus_host` install. Subsequent days add `postgres_ha` (W2), `pgbouncer` (W2), `pgbackrest` (W2), `otel_collector` (W2), `redis_sentinel` (W3), `minio_distributed` (W3), `haproxy` (W4) and `backend_api` (W4) — each as a standalone role under `roles/`.

## Layout

```
infra/ansible/
├── ansible.cfg          # pinned defaults (inventory path, ControlMaster)
├── inventory/
│   ├── lab.yml          # R720 lab Incus container — dry-run target
│   ├── staging.yml      # Hetzner staging (TODO IP — W2 provision)
│   └── prod.yml         # R720 prod (TODO IP — DNS at EX-5)
├── group_vars/
│   └── all.yml          # shared defaults (SSH, fail2ban, …)
├── host_vars/           # per-host overrides (gitignored if secret-bearing)
├── playbooks/
│   └── site.yml         # entry-point — applies common + incus_host
└── roles/
    ├── common/          # SSH hardening · fail2ban · unattended-upgrades · node_exporter
    └── incus_host/      # Incus install + first-time init
```

## Quickstart

### Lab dry-run (syntax + dry-execute, no remote changes)

```bash
cd infra/ansible
ansible-playbook -i inventory/lab.yml playbooks/site.yml --check
```

`--check` is the acceptance gate for v1.0.9 Day 5 — it must pass clean before any role change is merged.

### Lab apply

```bash
ansible-playbook -i inventory/lab.yml playbooks/site.yml
```

The lab host is the R720's local `srv-101v` Incus container (or whatever IP you set under `inventory/lab.yml::veza-lab.ansible_host`). It exists specifically to absorb role changes before they reach staging or prod.

### Staging / prod

Currently `TODO_HETZNER_IP` / `TODO_PROD_IP` — fill these in once the boxes are provisioned. Don't run against an inventory still carrying a TODO placeholder; `ansible-playbook` will fail fast with "Could not match supplied host pattern".
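For orientation, a minimal inventory of the shape the quickstart assumes might look like the sketch below. The committed `inventory/lab.yml` is authoritative — the group name, IP and user here are illustrative assumptions, not the real values:

```yaml
# Hypothetical sketch of inventory/lab.yml — values are illustrative,
# check the committed file for the real ones.
all:
  children:
    lab:
      hosts:
        veza-lab:
          ansible_host: 10.99.0.101            # e.g. the srv-101v container's IP
          ansible_user: ansible
          ansible_python_interpreter: /usr/bin/python3
```

The staging/prod inventories follow the same shape, with `ansible_host` still set to their TODO placeholders until the boxes exist.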
### Tags — apply a single concern

```bash
# Re-render only the SSH hardening drop-in
ansible-playbook -i inventory/lab.yml playbooks/site.yml --tags ssh

# Bump node_exporter to a newer pinned version (after editing group_vars/all.yml)
ansible-playbook -i inventory/lab.yml playbooks/site.yml --tags node_exporter
```

Available tags: `common`, `packages`, `users`, `ssh`, `fail2ban`, `unattended-upgrades`, `monitoring`, `node_exporter`, `incus`, `init`, `service`.

## Roles

### `common` — host baseline

- `ssh.yml` — drops `/etc/ssh/sshd_config.d/50-veza-hardening.conf` from a Jinja template. Validates the rendered config with `sshd -t` before reload, and refuses to apply when `ssh_allow_users` is empty (which would lock the operator out).
- `fail2ban.yml` — `/etc/fail2ban/jail.local` with the sshd jail enabled; defaults to bantime=1h / findtime=10min / maxretry=5.
- `unattended_upgrades.yml` — security-only origins; `Automatic-Reboot=false` (the operator decides reboot windows).
- `node_exporter.yml` — installs Prometheus node_exporter pinned to the version in `group_vars/all.yml::monitoring_node_exporter_version`; runs as a systemd unit on `:9100`.

Variables in `group_vars/all.yml`:

| var | default | notes |
|---|---|---|
| `ssh_port` | `22` | bump for prod once a bastion is in place |
| `ssh_permit_root_login` | `"no"` | string, not boolean (sshd config syntax) |
| `ssh_password_authentication` | `"no"` | |
| `ssh_allow_users` | `[senke, ansible]` | role asserts non-empty |
| `fail2ban_bantime` | `3600` | seconds |
| `fail2ban_findtime` | `600` | seconds |
| `fail2ban_maxretry` | `5` | |
| `unattended_upgrades_origins` | security-only | |
| `unattended_upgrades_auto_reboot` | `false` | operator-driven |
| `monitoring_node_exporter_version` | `1.8.2` | upstream pin |
| `monitoring_node_exporter_port` | `9100` | |

### `incus_host` — Incus server install

- Adds the upstream Zabbly Incus apt repo.
- Installs `incus` + `incus-client`.
- Adds the `ansible` user to the `incus-admin` group so subsequent roles can run `incus` without sudo.
- First-time `incus admin init` via preseed if the host has never been initialised. Re-runs on already-initialised hosts are a no-op (an `incus list` probe gates the init).

Bridge config:

| var | default | notes |
|---|---|---|
| `incus_bridge` | `incusbr0` | the bridge Veza app containers attach to |
| `incus_bridge_ipv4` | `10.99.0.1/24` | NAT'd via Incus by default |

## Conventions

- Roles are **idempotent** — running `site.yml` twice produces no changes. CI will eventually validate this with a `--check` run after a real apply.
- **No secrets in git.** `host_vars/<host>.yml` is fine for non-secrets; secrets go in `host_vars/<host>.vault.yml` encrypted with `ansible-vault`. The vault key lives outside the repo.
- **Tags are mandatory** on every task, so a partial apply (`--tags ssh,monitoring`) is always possible. A new role missing tags fails its own commit's `--check` review.
- **Comment the why, not the what.** Role tasks should answer "why this knob, why this default, why this guard" — the task name + module already say what.

## See also

- `ROADMAP_V1.0_LAUNCH.md` §Semaine 1 Day 5 — original scope brief
- `docs/runbooks/` — once roles for production services land, each gets a runbook
- `docker-compose.dev.yml` — the dev-host equivalent of these roles (kept for now; Ansible takes over for staging/prod once W2 lands)
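As a concrete illustration of the conventions above (mandatory tags, the non-empty `ssh_allow_users` guard, `sshd -t` validation before reload), a task in `common` might be shaped like this. This is a hedged sketch, not the committed file — file names, handler name and message wording are assumptions:

```yaml
# Illustrative excerpt in the style of roles/common/tasks/ssh.yml.
# Guard first: an empty AllowUsers list would lock the operator out.
- name: Assert ssh_allow_users is non-empty
  ansible.builtin.assert:
    that: ssh_allow_users | length > 0
    fail_msg: "ssh_allow_users is empty — applying would lock you out"
  tags: [common, ssh]

- name: Render SSH hardening drop-in
  ansible.builtin.template:
    src: 50-veza-hardening.conf.j2
    dest: /etc/ssh/sshd_config.d/50-veza-hardening.conf
    owner: root
    group: root
    mode: "0644"
    validate: sshd -t -f %s   # reject a broken render before it lands on disk
  notify: Reload sshd
  tags: [common, ssh]
```

Every task carries its role tag plus a concern tag, which is what makes `--tags ssh` and friends usable as partial applies.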