senke/veza - Talas Project: Beyond coding. We Forge.

Author	SHA1	Message	Date
senke	6c1e87e52f	feat(marketplace): async stripe connect reversal worker — v1.0.7 item B day 2 Day-2 cut of item B: the reversal path becomes async. Pre-v1.0.7 (and v1.0.7 day 1) the refund handler flipped seller_transfers straight from completed to reversed without ever calling Stripe — the ledger said "reversed" while the seller's Stripe balance still showed the original transfer as settled. The new flow: refund.succeeded webhook → reverseSellerAccounting transitions row: completed → reversal_pending → StripeReversalWorker (every REVERSAL_CHECK_INTERVAL, default 1m) → calls ReverseTransfer on Stripe → success: row → reversed + persist stripe_reversal_id → 404 already-reversed (dead code until day 3): row → reversed + log → 404 resource_missing (dead code until day 3): row → permanently_failed → transient error: stay reversal_pending, bump retry_count, exponential backoff (base * 2^retry, capped at backoffMax) → retries exhausted: row → permanently_failed → buyer-facing refund completes immediately regardless of Stripe health State machine enforcement: * New `SellerTransfer.TransitionStatus(tx, to, extras)` wraps every mutation: validates against AllowedTransferTransitions, guarded UPDATE with WHERE status=<from> (optimistic lock semantics), no RowsAffected = stale state / concurrent winner detected. * processSellerTransfers no longer mutates .Status in place — terminal status is decided before struct construction, so the row is Created with its final state. * transfer_retry.retryOne and admin RetryTransfer route through TransitionStatus. Legacy direct assignment removed. 
* TestNoDirectTransferStatusMutation greps the package for any `st.Status = "..."` / `t.Status = "..."` / GORM Model(&SellerTransfer{}).Update("status"...) outside the allowlist and fails if found. Verified by temporarily injecting a violation during development; the test caught it as expected. Configuration (v1.0.7 item B): * REVERSAL_WORKER_ENABLED=true (default) * REVERSAL_MAX_RETRIES=5 (default) * REVERSAL_CHECK_INTERVAL=1m (default) * REVERSAL_BACKOFF_BASE=1m (default) * REVERSAL_BACKOFF_MAX=1h (default, caps exponential growth) * .env.template documents TRANSFER_RETRY_* and REVERSAL_* env vars so an ops reader can grep them. Interface change: TransferService.ReverseTransfer(ctx, stripe_transfer_id, amount *int64, reason) (reversalID, error) added. All four mocks extended (process_webhook, transfer_retry, admin_transfer_handler, payment_flow integration). amount=nil means full reversal; v1.0.7 always passes nil (partial reversal is future scope per axis-1 P2). Stripe 404 disambiguation (ErrTransferAlreadyReversed / ErrTransferNotFound) is wired in the worker as dead code: the sentinels are declared and the worker branches on them, but StripeConnectService.ReverseTransfer doesn't yet emit them. Day 3 will parse stripe.Error.Code and populate the sentinels; no worker change needed at that point. The handling skeleton stays in day 2 so that the worker's branch shape doesn't change between days and the tests can already cover all four paths against the mock. 
Worker unit tests (9 cases, all green, sqlite :memory:): happy path: reversal_pending → reversed + stripe_reversal_id set * already reversed (mock returns sentinel): → reversed + log * not found (mock returns sentinel): → permanently_failed + log * transient 503: retry_count++, next_retry_at set with backoff, stays reversal_pending * backoff capped at backoffMax (verified with base=1s, max=10s, retry_count=4 → capped at 10s not 16s) * max retries exhausted: → permanently_failed * legacy row with empty stripe_transfer_id: → permanently_failed, does not call Stripe * only picks up reversal_pending (skips all other statuses) * respects next_retry_at (future rows skipped) Existing test updated: TestProcessRefundWebhook_SucceededFinalizesState now asserts the row lands at reversal_pending with next_retry_at set (worker's responsibility to drive to reversed), not reversed. Worker wired in cmd/api/main.go alongside TransferRetryWorker, sharing the same StripeConnectService instance. Shutdown path registered for graceful stop. Cut from day 2 scope (per agreed-upon discipline), landing in day 3: * Stripe 404 disambiguation implementation (parse error.Code) * End-to-end smoke probe (refund → reversal_pending → worker processes → reversed) against local Postgres + mock Stripe * Batch-size tuning / inter-batch sleep — batchLimit=20 today is safely under Stripe's 100 req/s default rate limit; revisit if observed load warrants Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 15:34:29 +02:00
senke	5530267287	feat(workers): hourly cleanup of orphan tracks stuck in processing Upload flow: POST creates a track row with `status=processing` and writes the file at `file_path`. If the uploader process dies (OOM, SIGKILL during deploy, disk wipe) between row-create and status-update, the row stays in `processing` forever with a `file_path` that doesn't exist. The library UI shows a ghost track the user can never play, never reach, and only partially delete. New worker: * `jobs/cleanup_orphan_tracks.go` — `CleanupOrphanTracks` queries tracks with `status=processing AND created_at < NOW()-1h`, stats the `file_path`, and flips the row to `status=failed` with `status_message = "orphan cleanup: file missing on disk after >1h in processing"`. Never deletes; never touches present files or rows already in another state. Safe to run repeatedly. * `ScheduleOrphanTracksCleanup(db, logger)` runs once at boot and then every hour thereafter. Wired in `cmd/api/main.go` right after route setup so restarts trigger an immediate scan. * Threshold exported as `OrphanTrackAgeThreshold` constant so tests and future tuning don't need to edit the worker. Tests: 5 cases in `cleanup_orphan_tracks_test.go`: - `_FlipsStuckMissingFile` happy path - `_LeavesFilePresent` (slow uploads must not be failed) - `_LeavesRecent` (below threshold) - `_IgnoresAlreadyFailed` (idempotent) - `_NilDatabaseIsNoop` (safety) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:57:24 +02:00
senke	ebb28c77a0	fix(backend): J4, GDPR-compliant hard delete with Redis and ES cleanup Closes TODO(HIGH-007). When the hard-delete worker anonymizes a user past their recovery deadline, it now also cleans the user's residual data from Redis and Elasticsearch, not just PostgreSQL. Without this, a user who invoked their right to erasure would still appear in cached feed/profile responses and in ES search results until the next reindex cycle. Worker changes (internal/workers/hard_delete_worker.go): WithRedis / WithElasticsearch builder methods inject the clients. Both are optional: if either is nil (feature disabled or unreachable), the corresponding cleanup is skipped with a debug log and the worker keeps going. Partial progress beats panic. cleanRedisKeys uses SCAN with a cursor loop (COUNT 100), NEVER KEYS, because KEYS would block the Redis server on multi-million-key deployments. Pattern is user:{id}:. Transient SCAN errors retry up to 3 times with 100ms linear backoff; persistent errors return without panic. DEL errors on a batch are logged but non-fatal so subsequent batches are still attempted. 
cleanESDocs hits three indices independently: - users index: DELETE doc by _id (the user UUID); 404 treated as success (already gone = desired state) - tracks index: DeleteByQuery with a terms filter on _id, using the list of track IDs collected from PostgreSQL BEFORE anonymization - playlists index: same pattern as tracks A failure on one index does not prevent the others from being tried; the first error is returned so the caller can log. Track/playlist IDs are pre-collected (collectTrackIDs, collectPlaylistIDs) before the UPDATE anonymization runs, because the anonymization does NOT cascade (no DELETE on users), so tracks and playlists rows remain with their creator_id / user_id intact and resolvable at query time. Wiring (cmd/api/main.go): The worker now receives cfg.RedisClient directly, and an optional ES client built from elasticsearch.LoadConfig() + NewClient. If ES is disabled or unreachable at startup, the worker logs a warning and proceeds with Redis-only cleanup. Tests (internal/workers/hard_delete_worker_test.go, +260 lines): Pure-function unit tests: - TestUUIDsToStrings - TestEsIndexNameFor Nil-client safety tests: - TestCleanRedisKeys_NilClientIsNoop - TestCleanESDocs_NilClientIsNoop ES mock-server tests (httptest.Server mimicking /_doc and /_delete_by_query endpoints with valid ES 8.11 responses): - TestCleanESDocs_CallsAllThreeIndices — verifies the three expected HTTP calls land with the right paths and request bodies containing the provided UUIDs - TestCleanESDocs_SkipsEmptyIDLists — verifies no DeleteByQuery is issued when the ID lists are empty Redis testcontainer integration test (gated by VEZA_SKIP_INTEGRATION): - TestCleanRedisKeys_Integration — seeds 154 keys (4 fixed + 150 bulk to force the SCAN loop past a single batch) plus 4 unrelated keys from another user / global, runs cleanRedisKeys, asserts all 154 own keys are gone and all 4 unrelated keys remain. Verification: go build ./... OK go vet ./... 
OK VEZA_SKIP_INTEGRATION=1 go test ./internal/workers/... short OK go test ./internal/workers/ -run TestCleanRedisKeys_Integration → testcontainers spins up redis:7-alpine, test passes in 1.34s Out of J4 scope (noted for a follow-up): - No "activity" ES index exists in the codebase today (the audit plan mentioned it as a possible target). The three real indices with user data (users, tracks, playlists) are all now cleaned. - Track artist strings (free-form) may still contain the user's display name as a cached value in the tracks index after this cleanup. Actual user-owned tracks are deleted here, but if a third party's track referenced the removed user in its artist field, that reference is not touched. Strict GDPR compliance on that edge case is a separate ticket. Refs: AUDIT_REPORT.md §8.5, §10 P5, §12 item 1	2026-04-15 12:25:39 +02:00
senke	24af2f72bc	ci: bump Go to 1.25 and fix goimports drift in 3 files golangci-lint v2.11.4 requires Go >= 1.25. With the workflow on 1.24, setup-go would silently trigger an in-job auto-toolchain download (observed in run #71: 'go: github.com/golangci/golangci-lint/v2@v2.11.4 requires go >= 1.25.0; switching to go1.25.9') adding ~3 min to every Backend (Go) run. Bump setup-go to 1.25 in ci.yml, backend-ci.yml, go-fuzz.yml so the prebuilt Go is already the right version. Also lint-fix three files that golangci-lint's goimports checker flagged — goimports sorts/groups imports and removes unused ones, which plain gofmt leaves alone: - veza-backend-api/cmd/api/main.go - veza-backend-api/internal/api/handlers/chat_handlers.go - veza-backend-api/internal/handlers/auth_integration_test.go	2026-04-14 17:02:09 +02:00
senke	7b2f873736	feat: backend, stream server & infra improvements Backend (Go): - Config: CORS, RabbitMQ, rate limit, main config updates - Routes: core, distribution, tracks routing changes - Middleware: rate limiter, endpoint limiter, response cache hardening - Handlers: distribution, search handler fixes - Workers: job worker improvements - Upload validator and logging config additions - New migrations: products, orders, performance indexes - Seed tooling and data Stream Server (Rust): - Audio processing, config, routes, simple stream server updates - Dockerfile improvements Infrastructure: - docker-compose.yml updates - nginx-rtmp config changes - Makefile improvements (config, dev, high, infra) - Root package.json and lock file updates - .env.example updates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 11:36:06 +01:00
senke	249fd99730	fix(v0.12.6): apply all pentest remediations — 36 findings across 36 files CRITICAL fixes: - Race condition (TOCTOU) in payout/refund with SELECT FOR UPDATE (CRITICAL-001/002) - IDOR on analytics endpoint — ownership check enforced (CRITICAL-003) - CSWSH on all WebSocket endpoints — origin whitelist (CRITICAL-004) - Mass assignment on user self-update — strip privileged fields (CRITICAL-005) HIGH fixes: - Path traversal in marketplace upload — UUID filenames (HIGH-001) - IP spoofing — use Gin trusted proxy c.ClientIP() (HIGH-002) - Popularity metrics (followers, likes) set to json:"-" (HIGH-003) - bcrypt cost hardened to 12 everywhere (HIGH-004) - Refresh token lock made mandatory (HIGH-005) - Stream token replay prevention with access_count (HIGH-006) - Subscription trial race condition fixed (HIGH-007) - License download expiration check (HIGH-008) - Webhook amount validation (HIGH-009) - pprof endpoint removed from production (HIGH-010) MEDIUM fixes: - WebSocket message size limit 64KB (MEDIUM-010) - HSTS header in nginx production (MEDIUM-001) - CORS origin restricted in nginx-rtmp (MEDIUM-002) - Docker alpine pinned to 3.21 (MEDIUM-003/004) - Redis authentication enforced (MEDIUM-005) - GDPR account deletion expanded (MEDIUM-006) - .gitignore hardened (MEDIUM-007) LOW/INFO fixes: - GitHub Actions SHA pinning on all workflows (LOW-001) - .env.example security documentation (INFO-001) - Production CORS set to HTTPS (LOW-002) All tests pass. Go and Rust compile clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-14 00:44:46 +01:00
senke	99d609bfae	feat(v0.12.8): documentation & public API: rate limiting, scopes, OpenAPI - API key rate limiting middleware (1000 reads/h, 200 writes/h per key); separate read/write tracking, per API key ID (not per IP); X-RateLimit-Limit/Remaining/Reset headers on every response - API key scope enforcement middleware (read → GET, write → POST/PUT/DELETE); admin scope allows everything, CSRF skipped for API key auth - OpenAPI spec: added securityDefinition ApiKeyAuth (X-API-Key header) - Swagger annotations: added ApiKeyAuth in cmd/api/main.go - Wiring in router.go: middlewares applied to the whole /api/v1 group - Tests: 10 tests (5 rate limiter + 5 scope enforcement), all PASS Existing backend already in place (pre-v0.12.8): - Swagger UI (gin-swagger + frontend SwaggerUIDoc component) - API key CRUD (create/list/delete + X-API-Key auth in AuthMiddleware) - Developer Dashboard frontend (API keys, webhooks, playground) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 18:44:09 +01:00
senke	71c15c2590	fix(v0.12.6.1): remediate 2 CRITICAL + 10 HIGH + 1 MEDIUM pentest findings Security fixes implemented: CRITICAL: - CRIT-001: IDOR on chat rooms; added IsRoomMember check before returning room data or message history (returns 404, not 403) - CRIT-002: play_count/like_count exposed publicly; changed JSON tags to "-" so they are never serialized in API responses HIGH: - HIGH-001: TOCTOU race on marketplace downloads; transaction + SELECT FOR UPDATE on GetDownloadURL - HIGH-002: HS256 in production docker-compose; replaced JWT_SECRET with JWT_PRIVATE_KEY_PATH / JWT_PUBLIC_KEY_PATH (RS256) - HIGH-003: context.Background() bypass in user repository; full context propagation from handlers → services → repository (29 files) - HIGH-004: Race condition on promo codes; SELECT FOR UPDATE - HIGH-005: Race condition on exclusive licenses; SELECT FOR UPDATE - HIGH-006: Rate limiter IP spoofing; SetTrustedProxies(nil) default - HIGH-007: GDPR hard delete incomplete; added cleanup for sessions, settings, follows, notifications, audit_logs anonymization - HIGH-008: RTMP callback auth weak; fail-closed when unconfigured, header-only (no query param), constant-time compare - HIGH-009: Co-listening host hijack; UpdateHostState now takes *Conn and verifies IsHost before processing - HIGH-010: Moderator self-strike; added issuedBy != userID check MEDIUM: - MEDIUM-001: Recovery codes used math/rand; replaced with crypto/rand - MEDIUM-005: Stream token forgeable; resolved by HIGH-002 (RS256) Updated REMEDIATION_MATRIX: 14 findings marked ✅ CORRIGÉ (fixed). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 05:40:53 +01:00
senke	f2881ad865	feat(gdpr): v0.10.8 data portability - async ZIP export, account deletion, hard delete cron - Export: data_exports table, POST /me/export (202), GET /me/exports, messages+playback_history - Email notification when the ZIP is ready, rate limit 3/day - Deletion: keep_public_tracks, full PII anonymization (users, user_profiles) - HardDeleteWorker: final anonymization after 30 days - Frontend: POST export, keep_public_tracks checkbox - MSW handlers for Storybook	2026-03-10 13:57:04 +01:00
senke	c2b3a68fd5	feat(v0.10.5): Complete notifications (F551-F555) F555: Backend pagination/filter for GetNotifications (type, page, limit) + frontend pagination F551: WebSocket real-time; backend injects chat hub, sends on CreateNotification; frontend useChat invalidates F553: Quiet hours; migration 132, CreateNotification skips push/WS, UI in PushPreferencesSection F554: Notification grouping; migration 133, group_key/actor_count for like/comment, UI format F552: Weekly digest; migration 134, NotificationDigestWorker, email template, prefs UI Acceptance: no gamification notifications; defaults unchanged; individual toggles for marketing	2026-03-10 10:02:21 +01:00
senke	a3624ce4b3	feat(v0.802): frontend Cloud/Gear, MSW, docs, scope v0.803, archive - Cloud: CloudFileVersions, CloudShareModal, versions/share in CloudView - Gear: GearDocumentsTab, GearRepairsTab, warranty badge, initialTab - MSW: cloud versions/share, gear documents/repairs, tags suggest - Stories: CloudFileVersions, CloudShareModal, GearDetailModal variants - gearService: listDocuments, uploadDocument, deleteDocument, listRepairs, createRepair, deleteRepair - cloudService: listVersions, restoreVersion, shareFile, getSharedFile - gear_warranty_notifier: 24h ticker, notifications for expiring warranty - tag_handler_test: unit tests - docs: API_REFERENCE, CHANGELOG, PROJECT_STATE, FEATURE_STATUS v0.802 - SCOPE_CONTROL, .cursorrules: scope v0.803 - archive: V0_802_RELEASE_SCOPE, RETROSPECTIVE_V0802	2026-02-25 14:00:58 +01:00
senke	e303e33dfc	feat(cloud): GDPR data export and automatic backup cron	2026-02-25 13:35:16 +01:00
senke	1b66260c22	feat(server): start TransferRetryWorker on boot (v0.701)	2026-02-23 23:32:23 +01:00
senke	8ab391dd73	fix(backend): replace panic/Fatal with graceful error when Redis down (audit 1.4, P0) - Add early validation in Setup() returning error if Redis nil in production - Remove panic/Fatal from routes_core.go and router.go applyCSRFProtection - Handle Setup() error in cmd/api/main.go and cmd/modern-server/main.go - Mark audit item 1.4 as done	2026-02-15 14:05:20 +01:00
senke	1ed6e7f07b	state-ownership: delete unused optimisticStoreUpdates.ts file - Deleted apps/web/src/utils/optimisticStoreUpdates.ts (unused file) - File was unused - no imports found in codebase - Mutations already use React Query's onMutate pattern - No TypeScript errors after deletion - Actions 4.4.1.2 and 4.4.1.3 complete	2026-01-15 19:26:53 +01:00
senke	39f7967e1e	Incus deployment fully implemented, Makefile updated and make fmt run	2026-01-13 19:47:57 +01:00
senke	a73c36b3e6	[LOGGING] Fix #10: Silent errors - Added contextual logs for all errors in core/auth and core/track	2026-01-04 01:44:15 +01:00
senke	f6a40c9ec6	[BE-SVC-017] be-svc: Implement graceful shutdown - Created ShutdownManager for coordinated graceful shutdown of all services - Added Shutdowner interface for services that need graceful shutdown - Implemented parallel shutdown with individual timeouts (10s per service) - Added global shutdown timeout (30s total) - Integrated shutdown manager in main.go for: - HTTP server shutdown - JobWorker cancellation - Config.Close() (DB, Redis, RabbitMQ) - Logger sync - Sentry flush - Added comprehensive unit tests for shutdown manager - Prevents registration of new services during shutdown Phase: PHASE-6 Priority: P2 Progress: 113/267 (42.32%)	2025-12-24 17:03:11 +01:00
senke	a7d463b8fd	stabilizing veza-backend-api: P1 & P2	2025-12-16 13:34:08 -05:00
senke	d33c351ac6	overhaul: backend-api Go first; phase 1	2025-12-12 21:34:34 -05:00
okinrev	8caa2fd7ca	STABILIZATION: phase 3–5 – API contract, tests & chat-server hardening	2025-12-06 17:21:59 +01:00
okinrev	5ffcd50e0a	P0: backend/chat/stream stabilization + new v1 migration baseline Backend Go: - Complete replacement of the old migrations with the V1 baseline aligned on ORIGIN. - Global hardening of JSON parsing (BindAndValidateJSON + RespondWithAppError). - Secured config.go, CORS, health statuses and monitoring. - Implemented the P0 transactions (RBAC, playlist duplication, social toggles). - Added a structured job worker (emails, analytics, thumbnails) + associated tests. - New backend docs: AUDIT_CONFIG, BACKEND_CONFIG, AUTH_PASSWORD_RESET, JOB_WORKER_. Chat server (Rust): - Overhaul of the JWT pipeline + security, audit and advanced rate limiting. - Full implementation of the message lifecycle (read receipts, delivered, edit/delete, typing). - Removed panics, robust error handling, structured logs. - Chat migrations aligned with the UUID schema and new features. Stream server (Rust): - Overhaul of the streaming engine (encoding pipeline + HLS) and of the core modules. - P0 transactions for jobs and segments, with atomicity guarantees. - Detailed pipeline documentation (AUDIT_STREAM_, DESIGN_STREAM_PIPELINE, TRANSACTIONS_P0_IMPLEMENTATION). Documentation & audits: - TRIAGE.md and AUDIT_STABILITY.md updated to reflect the actual state of the 3 services. - Complete map of migrations and transactions (DB_MIGRATIONS_*, DB_TRANSACTION_PLAN, AUDIT_DB_TRANSACTIONS, TRANSACTION_TESTS_PHASE3). - Reset and cleanup scripts for the lab DB and V1. This commit freezes all of the P0 stabilization work (UUID, backend, chat and stream) ahead of the next phases (Coherence Guardian, WS hardening, etc.).	2025-12-06 11:14:38 +01:00
okinrev	2425c15b09	adding initial backend API (Go)	2025-12-03 20:29:37 +01:00

23 commits