No description
Find a file
senke bf31a91ae6
Some checks failed
Veza CI / Frontend (Web) (push) Failing after 16m6s
Veza CI / Notify on failure (push) Successful in 11s
E2E Playwright / e2e (full) (push) Successful in 19m59s
Veza CI / Rust (Stream Server) (push) Successful in 4m57s
Security Scan / Secret Scanning (gitleaks) (push) Successful in 49s
Veza CI / Backend (Go) (push) Successful in 6m4s
feat(infra): pgbackrest role + dr-drill + Prometheus backup alerts (W2 Day 8)
ROADMAP_V1.0_LAUNCH.md §Semaine 2 day 8 deliverable:
  - Postgres backups land in MinIO via pgbackrest
  - dr-drill restores them weekly into an ephemeral Incus container
    and asserts the data round-trips
  - Prometheus alerts fire when the drill fails OR when the timer
    has stopped firing for >8 days

Cadence:
  full   — weekly  (Sun 02:00 UTC, systemd timer)
  diff   — daily   (Mon-Sat 02:00 UTC, systemd timer)
  WAL    — continuous (postgres archive_command, archive_timeout=60s)
  drill  — weekly  (Sun 04:00 UTC — runs 2h after the Sun full so
           the restore exercises fresh data)

RPO ≈ 1 min (archive_timeout). RTO ≤ 30 min (drill measures actual
restore wall-clock).

Files:
  infra/ansible/roles/pgbackrest/
    defaults/main.yml — repo1-* config (MinIO/S3, path-style,
      aes-256-cbc encryption, vault-backed creds), retention 4 full
      / 7 diff / 4 archive cycles, zstd@3 compression. The role's
      first task asserts the placeholder secrets are gone — refuses
      to apply until the vault carries real keys.
    tasks/main.yml — install pgbackrest, render
      /etc/pgbackrest/pgbackrest.conf, set archive_command on the
      postgres instance via ALTER SYSTEM, detect role at runtime
      via `pg_autoctl show state --json`, stanza-create from primary
      only, render + enable systemd timers (full + diff + drill).
    templates/pgbackrest.conf.j2 — global + per-stanza sections;
      pg1-path defaults to the pg_auto_failover state dir so the
      role plugs straight into the Day 6 formation.
    templates/pgbackrest-{full,diff,drill}.{service,timer}.j2 —
      systemd units. Backup services run as `postgres`,
      drill service runs as `root` (needs `incus`).
      RandomizedDelaySec on every timer to absorb clock skew + node
      collision risk.
    README.md — RPO/RTO guarantees, vault setup, repo wiring,
      operational cheatsheet (info / check / manual backup),
      restore procedure documented separately as the dr-drill.

  scripts/dr-drill.sh
    Acceptance script for the day. Sequence:
      0. pre-flight: required tools, latest backup metadata visible
      1. launch ephemeral `pg-restore-drill` Incus container
      2. install postgres + pgbackrest inside, push the SAME
         pgbackrest.conf as the host (read-only against the bucket
         by pgbackrest semantics — the same s3 keys get reused so
         the drill exercises the production credential path)
      3. `pgbackrest restore` — full + WAL replay
      4. start postgres, wait for pg_isready
      5. smoke query: SELECT count(*) FROM users — must be ≥ MIN_USERS_EXPECTED
      6. write veza_backup_drill_* metrics to the textfile-collector
      7. teardown (or --keep for postmortem inspection)
    Exit codes 0/1/2 (pass / drill failure / env problem) so a
    Prometheus runner can plug in directly.

  config/prometheus/alert_rules.yml — new `veza_backup` group:
    - BackupRestoreDrillFailed (critical, 5m): the last drill
      reported success=0. Pages because a backup we haven't proved
      restorable is dette technique waiting for a disaster.
    - BackupRestoreDrillStale (warning, 1h after >8 days): the
      drill timer has stopped firing. Catches a broken cron / unit
      / runner before the failure-mode alert above ever sees data.
    Both annotations include a runbook_url stub
    (veza.fr/runbooks/...) — those land alongside W2 day 10's
    SLO runbook batch.

  infra/ansible/playbooks/postgres_ha.yml
    Two new plays:
      6. apply pgbackrest role to postgres_ha_nodes (install +
         config + full/diff timers on every data node;
         pgbackrest's repo lock arbitrates collision)
      7. install dr-drill on the incus_hosts group (push
         /usr/local/bin/dr-drill.sh + render drill timer + ensure
         /var/lib/node_exporter/textfile_collector exists)

Acceptance verified locally:
  $ ansible-playbook -i inventory/lab.yml playbooks/postgres_ha.yml \
      --syntax-check
  playbook: playbooks/postgres_ha.yml          ← clean
  $ python3 -c "import yaml; yaml.safe_load(open('config/prometheus/alert_rules.yml'))"
  YAML OK
  $ bash -n scripts/dr-drill.sh
  syntax OK

Real apply + drill needs the lab R720 + a populated MinIO bucket
+ the secrets in vault — operator's call.

Out of scope (deferred per ROADMAP §2):
  - Off-site backup replica (B2 / Bunny.net) — v1.1+
  - Logical export pipeline for RGPD per-user dumps — separate
    feature track, not a backup-system concern
  - PITR admin UI — CLI-only via `--type=time` for v1.0
  - pgbackrest_exporter Prometheus integration — W2 day 9
    alongside the OTel collector

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:51:00 +02:00
.github fix(e2e): unblock @critical green slate for v1.0.9 tag (Day 4 triage) 2026-04-27 16:18:56 +02:00
.husky chore(web): drop legacy openapi-generator-cli — orval is the single source (v1.0.8 B9) 2026-04-26 00:02:58 +02:00
.zap chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
apps/web feat(branding): scaffold Logo component + Sumi icons + brand assets pipeline (Sprint 3) 2026-04-27 17:08:17 +02:00
chat_exports report generation and future tasks selection 2025-12-08 19:57:54 +01:00
config feat(infra): pgbackrest role + dr-drill + Prometheus backup alerts (W2 Day 8) 2026-04-28 00:51:00 +02:00
dev-environment refactor: remove dead code (api_manager.go, unused templates) 2026-02-22 17:44:19 +01:00
docker/haproxy chore: consolidate CI, E2E, backend and frontend updates 2026-02-17 16:43:21 +01:00
docs docs(roadmap): add v1.0 → v2.0.0-public launch roadmap (6 weeks) 2026-04-26 23:50:07 +02:00
docs-assets/mermaid BASE: completing the initial repo state 2025-12-03 22:56:50 +01:00
fixtures release(v0.903): Vault - ORDER BY whitelist, rate limiter, VERSION sync, chat-server cleanup, Go 1.24 2026-02-27 09:43:25 +01:00
full_veza_audit_data feat(v0.923): API contract tests, OpenAPI generation, CI type sync check 2026-02-27 20:23:10 +01:00
home/senke/git/talas/veza/apps/web/src small fixes : cors + login loop 2026-02-07 20:36:48 +01:00
infra feat(infra): pgbackrest role + dr-drill + Prometheus backup alerts (W2 Day 8) 2026-04-28 00:51:00 +02:00
k8s docs(J2): align docs with reality — rewrite CLAUDE.md, fix README, purge chat-server refs 2026-04-14 17:23:50 +02:00
loadtests chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
make feat(ci): enforce OpenAPI type sync — drift prevention (v1.0.8 P0) 2026-04-23 20:33:13 +02:00
packages/design-system refactor(design-system): finish Sprint 2 — light theme + 3 viz pigments canonized 2026-04-27 16:57:12 +02:00
prompts chore: add audit screenshots, audit scripts, and prompt templates 2026-03-31 19:17:05 +02:00
proto chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
scripts feat(infra): pgbackrest role + dr-drill + Prometheus backup alerts (W2 Day 8) 2026-04-28 00:51:00 +02:00
sub_task_agents Phase 2 stabilisation: code mort, Modal→Dialog, feature flags, tests, router split, Rust legacy 2026-02-14 17:23:32 +01:00
test-reports/20251226-132633 [TEST] MVP integration tests executed - 2/28 API passed, 0/20 E2E passed, 3 bugs found 2026-01-04 01:44:13 +01:00
tests fix(e2e): triage @critical batch 2 — chat WS proxy + FeedPage dette (Day 4) 2026-04-27 16:55:15 +02:00
tmt fix: sync E2E tests with seed data + i18n fix 2026-04-02 19:42:03 +02:00
tools BASE: completing the initial repo state 2025-12-03 22:56:50 +01:00
veza-backend-api feat(infra): pgbouncer role + pgbench load test (W2 Day 7) 2026-04-27 18:35:05 +02:00
veza-common chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
veza-docs docs(origin): align brand identity with CHARTE_GRAPHIQUE_TALAS (Sprint 2 follow-up #4) 2026-04-27 16:48:37 +02:00
veza-stream-server chore(infra): J6 — mark 3 dormant docker-compose files as deprecated 2026-04-15 12:58:39 +02:00
.commitlintrc.json chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
.cursorrules docs: retrospective v0.803, archive scope, update SCOPE_CONTROL 2026-03-03 09:25:34 +01:00
.editorconfig initial: initial repo set up (README, LICENSE, CONTRIBUTORS, etc...) 2025-12-03 13:54:23 +01:00
.gitattributes initial: initial repo set up (README, LICENSE, CONTRIBUTORS, etc...) 2025-12-03 13:54:23 +01:00
.gitignore chore(web): install orval + mutator for OpenAPI code generation (v1.0.8 P1) 2026-04-24 00:18:14 +02:00
.gitleaks.toml ci(security): expand gitleaks allowlist for e2e artifacts, docs, templates 2026-04-14 12:32:34 +02:00
.lighthouserc.js feat(v0.14.0): validation runtime & staging pipeline 2026-03-13 16:09:43 +01:00
.lintstagedrc.json fix(ci): lint-staged eslint rule was linting the whole project 2026-04-15 12:47:21 +02:00
.nvmrc v0.9.3 2026-03-05 19:35:57 +01:00
.pa11yci.json chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
.semgrepignore chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
AUDIT_REPORT.md chore(docs): archive obsolete v0.12.6 security docs 2026-04-23 15:32:25 +02:00
CHANGELOG.md chore: release v1.0.8 2026-04-26 00:23:59 +02:00
CLAUDE.md docs: update CLAUDE.md stack table + history post-v1.0.8 2026-04-26 01:46:27 +02:00
CONTRIBUTING.md release(v0.903): Vault - ORDER BY whitelist, rate limiter, VERSION sync, chat-server cleanup, Go 1.24 2026-02-27 09:43:25 +01:00
docker-compose.dev.yml chore(docker): pin MinIO + mc to dated release tags 2026-04-20 20:32:01 +02:00
docker-compose.env.example feat(payments): document Hyperswitch activation and validate checkout flow 2026-02-15 16:08:49 +01:00
docker-compose.override.yml.example BASE: completing the initial repo state 2025-12-03 22:56:50 +01:00
docker-compose.prod.yml chore(docker): pin MinIO + mc to dated release tags 2026-04-20 20:32:01 +02:00
docker-compose.staging.yml chore(docker): pin MinIO + mc to dated release tags 2026-04-20 20:32:01 +02:00
docker-compose.test.yml fix(infra): align PostgreSQL to version 16 in test compose 2026-02-22 17:35:35 +01:00
docker-compose.yml chore(docker): pin MinIO + mc to dated release tags 2026-04-20 20:32:01 +02:00
env.remote-r720.example stabilisation commit: while implementing v0.10.5 2026-03-09 19:36:33 +01:00
FUNCTIONAL_AUDIT.md docs(audit): reconcile top-15 priorities with tier 1-3 + BFG pass 2026-04-23 14:20:28 +02:00
go.work fix(ci): bump go.work to 1.25 to match veza-backend-api/go.mod 2026-04-15 15:06:50 +02:00
go.work.sum chore(release): v0.931 — Cursor (cursor-based pagination, performance baseline) 2026-03-02 12:35:49 +01:00
help chore(cleanup): remove orphan code + archive disabled workflows + .playwright-mcp 2026-04-20 20:33:40 +02:00
Makefile release(v0.903): Vault - ORDER BY whitelist, rate limiter, VERSION sync, chat-server cleanup, Go 1.24 2026-02-27 09:43:25 +01:00
package-lock.json feat(design-system): introduce Style Dictionary (W3C tokens) — Sprint 2 foundation 2026-04-27 04:52:15 +02:00
package.json chore(deps): add @commitlint/cli + config-conventional dev deps 2026-04-25 21:06:38 +02:00
README.md chore(cleanup): J5 — defer GeoIP, rename v2-v3-types, document Storybook kill 2026-04-15 12:43:57 +02:00
RELEASE_NOTES_V1.md chore(release): v0.992 RC2 — Release notes, sign-off final 2026-03-03 19:53:41 +01:00
run-audit.sh chore: add audit screenshots, audit scripts, and prompt templates 2026-03-31 19:17:05 +02:00
rust-toolchain.toml BASE: completing the initial repo state 2025-12-03 22:56:50 +01:00
status.sh docs: add project documentation, logging config, status script 2026-03-18 11:36:36 +01:00
turbo.json feat(design-system): introduce Style Dictionary (W3C tokens) — Sprint 2 foundation 2026-04-27 04:52:15 +02:00
Untitled chore: consolidate CI, E2E, backend and frontend updates 2026-02-17 16:43:21 +01:00
VERSION chore: release v1.0.8 2026-04-26 00:23:59 +02:00
VEZA_VERSIONS_ROADMAP.md docs: update VEZA_VERSIONS_ROADMAP [v1.0.0-rc1 DONE] 2026-03-13 16:24:04 +01:00

Veza Monorepo

CI

Version courante : v1.0.4 (cleanup + consolidation post-audit). Voir CHANGELOG.md et docs/PROJECT_STATE.md.

Project Structure

  • apps/web — Frontend React 18 + Vite 5 + TypeScript strict (source of truth for the UI)
  • veza-backend-api — Main Go 1.25 API service (Gin, GORM, Postgres, Redis, RabbitMQ, Elasticsearch). Handles REST, WebSocket, and chat (chat server was merged into this service in v0.502).
  • veza-stream-server — Rust streaming server (Axum 0.8, Tokio 1.35, Symphonia) — HLS, HTTP Range, WebSocket, gRPC
  • veza-common — Shared Rust types and logging
  • packages/design-system — Shared design tokens

See CLAUDE.md for the full architecture map.

Development Setup

Prerequisites: Node 20 (see .nvmrc), Go, Rust, Docker. Configure .env from .env.example.

# Verify environment
make doctor
./scripts/validate-env.sh development

# Install dependencies
make install-deps

# Option A — Backend in Docker + Web local
make dev

# Option B — All apps local with hot reload (infra from docker-compose.dev.yml)
make dev-full

# Option C — Infra only, then run services manually
docker compose -f docker-compose.dev.yml up -d
make dev-web              # or make dev-backend-api, make dev-stream-server

See docs/ENV_VARIABLES.md for required variables. make build builds all services.

Quick Start

Frontend only

cd apps/web
npm install
npm run dev

Docker Production

Canonical production compose file: docker-compose.prod.yml

docker compose -f docker-compose.prod.yml up -d

See make/config.mk for COMPOSE_PROD and deployment docs.

CI/CD

  • Badge : CI status above. Set SLACK_WEBHOOK_URL (Incoming Webhook) in repo secrets to receive Slack notifications on failure.

Disabled workflows

  • Storybook (chromatic.yml.disabled, storybook-audit.yml.disabled, visual-regression.yml.disabled): deferred until MSW is wired up for /api/v1/auth/me and /api/v1/logs/frontend, which currently causes ~1 400 network errors in the Storybook build. The npm scripts (storybook, build-storybook) still work locally for one-off component inspection. To reactivate in CI, fix the MSW handlers and rename the three files back to .yml.

Documentation

  • Developer Onboarding — Setup, architecture, conventions, troubleshooting
  • Documentation index — Index complet de la documentation
  • See docs/ for detailed architecture and development guides. Older audits and reports are archived in docs/archive/.