veza/infra at 257ea4b159d5c8ff295f5f299f705006a74a69cc - senke/veza


senke 257ea4b159 feat(ansible): playbooks/deploy_data.yml — idempotent data provisioning

First-half of every deploy: ZFS snapshot, then ensure data containers exist + their services are configured + ready. Per requirement: data containers are NEVER destroyed across deploys, only created if absent.

Sequence:

  Pre-flight (incus_hosts)
    Validate veza_env (staging|prod) + veza_release_sha (40-char SHA).
    Compute the list of managed data containers from veza_container_prefix.

  ZFS snapshot (incus_hosts)
    Resolve each container's dataset via `zfs list | grep`. Skip if no ZFS dataset (non-ZFS storage backend) or if the container doesn't exist yet (first-ever deploy).
    Snapshot name: <dataset>@pre-deploy-<sha>. Idempotent — re-runs no-op once the snapshot exists.
    Prune step keeps the {{ veza_release_retention }} most recent pre-deploy snapshots per dataset, drops the rest.

  Provision (incus_hosts)
    For each {postgres, redis, rabbitmq, minio} container: `incus info` to detect existence, `incus launch ... --profile veza-data --profile veza-net` if absent, then poll `incus exec -- /bin/true` until ready.
    refresh_inventory after launch so subsequent plays can use community.general.incus to reach the new containers.

  Configure (per-container plays, ansible_connection=community.general.incus)
    postgres : apt install postgresql-16, ensure veza role + veza database (no_log on password).
    redis    : apt install redis-server, render redis.conf with vault_redis_password + appendonly + sane LRU.
    rabbitmq : apt install rabbitmq-server, ensure /veza vhost + veza user with vault_rabbitmq_password (.* perms).
    minio    : direct-download minio + mc binaries (no apt package), render systemd unit + EnvironmentFile, start, then `mc mb --ignore-existing veza-<env>` to create the application bucket.

Why no `roles/postgres_ha` etc.?
  The existing HA roles (postgres_ha, redis_sentinel, minio_distributed) target multi-host topology and pg_auto_failover. Phase-1 staging on a single R720 doesn't justify HA orchestration; the simpler inline tasks are what the user gets out of the box. When prod splits onto multiple hosts (post v1.1), the inline blocks lift into the existing HA roles unchanged.

Idempotency guarantees:
  * Container exist      : `incus info >/dev/null` short-circuit.
  * Snapshot             : zfs list -t snapshot guard.
  * Postgres role/db     : community.postgresql idempotent.
  * Redis config         : copy with notify-restart only on diff.
  * RabbitMQ vhost/user  : community.rabbitmq idempotent.
  * MinIO bucket         : mc mb --ignore-existing.

Failure mode: any task that fails, fails the playbook hard. The ZFS snapshot is the recovery story — `zfs rollback <dataset>@pre-deploy-<sha>` restores prior state if we corrupt something on a partial run.

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-29 12:23:30 +02:00
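The pre-flight checks and the snapshot naming and prune rules described in the commit message above can be sketched as a few pure functions. This is only an illustration in Python, not the playbook's actual tasks: the function names, the (name, creation time) input shape, and the lowercase-hex reading of "40-char SHA" are assumptions made for the example. Only veza_env, veza_release_sha, the <dataset>@pre-deploy-<sha> pattern, and the retention count come from the commit message.

```python
import re

# Assumption: "40-char SHA" means 40 lowercase hex characters.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def validate_inputs(veza_env: str, veza_release_sha: str) -> None:
    """Mirror the pre-flight step: reject unknown envs and malformed SHAs."""
    if veza_env not in ("staging", "prod"):
        raise ValueError(f"veza_env must be staging or prod, got {veza_env!r}")
    if not SHA_RE.fullmatch(veza_release_sha):
        raise ValueError("veza_release_sha must be a 40-character hex SHA")


def snapshot_name(dataset: str, sha: str) -> str:
    """Build the snapshot name <dataset>@pre-deploy-<sha>."""
    return f"{dataset}@pre-deploy-{sha}"


def snapshots_to_prune(snapshots, retention: int):
    """Return the pre-deploy snapshot names to destroy for one dataset.

    `snapshots` is a list of (name, creation_epoch) pairs (hypothetical
    input shape). The `retention` most recent pre-deploy snapshots are
    kept; snapshots without the pre-deploy marker are never selected.
    """
    if retention < 0:
        raise ValueError("retention must be non-negative")
    pre_deploy = [s for s in snapshots if "@pre-deploy-" in s[0]]
    pre_deploy.sort(key=lambda s: s[1], reverse=True)  # newest first
    return [name for name, _ in pre_deploy[retention:]]
```

Re-running the prune selection with the same inputs yields the same list, and once the surplus snapshots are gone it returns an empty list, which matches the no-op-on-re-run behaviour the commit message describes for the snapshot step.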
ansible	feat(ansible): playbooks/deploy_data.yml — idempotent data provisioning	2026-04-29 12:23:30 +02:00
coturn	feat(webrtc): coturn ICE config endpoint + frontend wiring + ops template (v1.0.9 item 1.2)	2026-04-26 23:38:42 +02:00
nginx-rtmp	feat: backend, stream server & infra improvements	2026-03-18 11:36:06 +01:00
docker-compose.lab.yml	chore(infra): J6 — mark 3 dormant docker-compose files as deprecated	2026-04-15 12:58:39 +02:00