# Changelog - Veza

## [v1.0.6.2] - 2026-04-17

### Hotfix — subscription payment-gate bypass

Discovered during the 2026-04 audit probe (ops question Q2, "are paid subscriptions actually gated server-side?"). An authenticated user could POST `/api/v1/subscriptions/subscribe` with a paid plan and receive HTTP 201 with `status=active` — with the payment provider never invoked when `HYPERSWITCH_ENABLED=false` (or unset). The resulting row satisfied `checkEligibility()` in the distribution service, which returns `sub.Plan.HasDistribution || sub.Plan.CanSellOnMarketplace`. The Creator plan carries `can_sell_on_marketplace=true`, so any user could reach `/api/v1/distribution/submit` — a paid feature that dispatches to external distribution partners — without paying.

Fix — `GetUserSubscription` now filters out active/trialing rows that lack an effective payment linkage. "Effective" means: on a free plan, or in an unexpired trial, or at least one attached invoice carries a PSP payment intent (`hyperswitch_payment_id` non-empty). This is the sole centralised gate; all paid-feature eligibility paths (distribution and anything added later) route through it.

* `ErrSubscriptionNoPayment` added to `internal/core/subscription`. `GetUserSubscription` returns it when a row sits in active/trialing but fails the payment-effective predicate. Callers treat it as ineligible (distribution returns `false, nil`; subscription HTTP handlers return 404 "Active subscription" for cancel/reactivate/billing-cycle paths; `GET /me/subscription` returns an explicit `needs_payment=true` payload so honest-path users who landed here via a broken flow get actionable information, not a misleading "you're on free" or an opaque 500).
* `Subscribe` and `subscribeToFreePlan` also treat the new error as "no existing active subscription" so a user can re-subscribe cleanly once migration 980 has voided their phantom row.
* Migration `980_void_unpaid_subscriptions.sql` sweeps all pre-v1.0.6.2 phantom rows into `status='expired'`, capturing the `(subscription_id, user_id, plan_id, previous_status)` tuple in a dated audit table (`voided_subscriptions_20260417`) so support can notify any honest-path user who landed there by mistake.
* Probe script `scripts/probes/subscription-unpaid-activation.sh` kept as a versioned regression test. `--dry-run` lists plans; `--destructive` logs in and attempts the exploit, cleaning up after itself. Exit 0 = no bypass; exit 1 = bypass detected.
* Unit test `gate_test.go` covers the 8-branch matrix of the `hasEffectivePayment` predicate (free pass, paid with/without invoice, paid with empty vs populated `hyperswitch_payment_id`, trial variants with future/past/nil `trial_end`, no row at all).
* `TODO(v1.0.7-item-G)` annotation on the `if s.paymentProvider != nil` short-circuit in `createNewSubscription` so the v1.0.7 work that replaces it with a mandatory `pending_payment` state retains the audit trail.

### Security

Closes a subscription-gate bypass affecting distribution eligibility. Internal audit finding; no external report. Axis-1 correctness item P1.7 will be reclassified to P0 and item G added to the v1.0.7 plan in a follow-up commit.

## [v1.0.6.1] - 2026-04-17

### Hotfix — partial UNIQUE on refunds.hyperswitch_refund_id

Surfaced by the v1.0.6 refund smoke test (scenario S4, triggered after S3 left a failed refund in its post-Phase-1 / pre-Phase-2 state): the plain UNIQUE constraint from migration 978 rejected a second refund attempt on a *different* order because both rows had `hyperswitch_refund_id=''` (Go's zero-value string → empty string, not NULL). Postgres treats two empty strings as colliding under a regular UNIQUE; it only skips NULLs.

* Migration `979_refunds_unique_partial.sql` drops the original constraint and replaces it with a partial UNIQUE that only enforces uniqueness when `hyperswitch_refund_id IS NOT NULL AND <> ''`.
* Preserves the load-bearing idempotency guarantee for successful refunds (duplicate webhook lands on the same row because the PSP refund_id is set).
* No Go code change — the model and service logic were already correct; only the DB constraint shape needed fixing.

Smoke coverage that caught it + re-validates the fix:

* S1 happy path: refund + order + license + seller_transfer + seller_balance all reconciled end-to-end
* S2 idempotent replay: succeeded_at + transfer.updated_at + available_cents strictly unchanged across 2 webhook deliveries (THE critical proof — duplicate Hyperswitch retries are no-ops at the row level, not at the handler level)
* S3 PSP error rollback: order reverts to completed, refund persisted as failed, no seller debit
* S4 webhook refund.failed: order reverts, license intact, seller balance intact — **this is the scenario that surfaced the bug**
* S5 double-submit: second POST returns 400 ErrRefundAlreadyRequested, only 1 refund row persisted

## [v1.0.6] - 2026-04-17

### Ergonomics + operational hardening — six items from the v1.0.5 backlog

Follow-up to the hardening sprint. v1.0.5 validated the `register → verify → play` critical path end-to-end; v1.0.6 addresses the next layer — the UX friction and operational blindspots that a first-day public user (or a first-day on-call) would hit. Six targeted commits.

#### Fix 1 — Self-service creator role (`c32278dc1`)

New `POST /api/v1/users/me/upgrade-creator`. Verified users click a "Become an artist" button in `/settings → Account` and their role flips from `user` to `creator` on one conscious click — no KYC, no cooldown, no admin round-trip. One-way by design (downgrade = support ticket) so we don't have to handle the "my uploads orphaned" edge case.

* Gated strictly on `is_verified=true` (403 `EMAIL_NOT_VERIFIED` otherwise).
* Idempotent 200 for anyone already creator-tier — no clutter.
* UPDATE scoped `WHERE role='user'` so a concurrent admin assignment can't be silently overwritten.
* Audit trail: `user.upgrade_creator` action logged with the full role transition metadata.
* Migration `977_users_promoted_to_creator_at.sql` adds a nullable `promoted_to_creator_at TIMESTAMPTZ` column — distinguishes organic self-promotions from admin-assigned roles for analytics.
* Tests: 6 Go (happy path, unverified, already-creator, admin idempotent, 404, no-auth) + 7 Vitest (verified button, unverified state, hidden for creator, hidden for admin, refetch on success, idempotent message, server error toast).

#### Fix 2 — Upload size limits from a single source (`5848c2e40`)

The v1.0.5 audit flagged a "front 500MB vs back 100MB" mismatch. In reality every live pair was aligned (tracks 100/100, cloud 500/500, video 500/500) — the real architectural bug was **five duplicated hardcoded values** that could drift silently as soon as anyone tuned one.

* `internal/config/upload_limits.go`: `AudioLimit`, `ImageLimit`, `VideoLimit` expose `Bytes()`, `MB()`, `HumanReadable()`, `AllowedMIMEs`. Read lazily from env (`MAX_UPLOAD_AUDIO_MB`, `MAX_UPLOAD_IMAGE_MB`, `MAX_UPLOAD_VIDEO_MB`, defaults 100/10/500). Invalid/negative/zero env values fall back to default.
* `track/service.go`, `track_upload_handler.go`, `education_handler.go`, `upload.go:GetUploadLimits` all consume the single source. Changing one env retunes every path.
* Frontend `useUploadLimits()` hook: react-query with 5 min stale, 30 min gc, 1 retry then optimistic fallback to baked-in defaults so the dropzone stays responsive even without the network round trip. `useUploadModal` replaces `MAX_FILE_SIZE` constant with the live value; `UploadModal` forwards `audioMaxHuman` to `UploadModalDropzone` so the label and error toast track the env.
* Out of scope (tracked for later): `CloudUploadModal.tsx` still hardcodes 500MB — cloud uploads accept audio+zip+midi with a different category semantic than the three in `/upload/limits`. Unifying deserves its own design pass.
* Tests: 4 Go (defaults, env override, invalid env fallback, MIME lists) + 4 Vitest (sync fallback, typed mapping, partial-payload fallback per category, network failure keeps fallback).

#### Fix 3 — Unified SMTP env schema (`066144352`)

Two email services in-tree read *different* env vars for the same fields — surfaced during the v1.0.5.1 hotfix:

    internal/email/sender.go    internal/services/email_service.go
    SMTP_USERNAME               SMTP_USER
    SMTP_FROM                   FROM_EMAIL
    SMTP_FROM_NAME              FROM_NAME

v1.0.6 reconciles both onto canonical `SMTP_*` names, with a migration fallback to the legacy names that logs a structured deprecation warning (`remove_in: v1.1.0`).

* `internal/email/sender.go` is the single loader — both services delegate to it via `LoadSMTPConfigFromEnvWithLogger(*zap.Logger)`. Canonical wins over deprecated; no precedence surprise.
* `docker-compose.yml` backend-api env: `FROM_EMAIL` / `FROM_NAME` → `SMTP_FROM` / `SMTP_FROM_NAME` to match the canonical schema.
* `.env.template` trimmed — only canonical vars ship, old ones removed (still accepted in running env for zero-downtime rollover).
* No default injected for Host/Port in the loader. `Host==""` → callers go log-only (matches historic dev behavior). Dev defaults stay in `.env.template`, so prod fails fast instead of silently dialing localhost.
* Tests: 5 Go (empty env, canonical direct, deprecated fallback + warning emission, canonical silently wins over deprecated, nil logger allowed).

#### Fix 4 — Refund reverse-charge with idempotent webhook (`959031667`)

The structural one. Before v1.0.6, `RefundOrder` wrote `status='refunded'` to the DB and called Hyperswitch synchronously, treating the API ack as terminal. In reality Hyperswitch returns `pending` and only finalizes via webhook. Customers could see "refunded" while their bank was still uncredited, and the seller balance kept its credit even on successful refunds.

* Two-phase flow:
  1. **Open pending refund** (short row-locked tx): validate permissions + 14-day window + double-submit guard; persist `Refund{status=pending}`; flip order to `refund_pending` (not `refunded` — that's the webhook's job).
  2. **PSP call outside the tx**: `Provider.CreateRefund` returns `(refund_id, status, err)`. On error, mark refund failed + roll order back to `completed`. On success, capture the `hyperswitch_refund_id` as the idempotency key — stay in `pending` even if the sync status is "succeeded" (per customer guidance: never trust the sync ack, always wait for the webhook).
  3. **`ProcessRefundWebhook`** drives terminal state. Row-lock + `IsTerminal()` short-circuit: any duplicate Hyperswitch retry is a no-op 200. On `refund.succeeded`: flip refund + order to succeeded/refunded, revoke licenses, debit seller balance, mark every `SellerTransfer` for the order as `reversed`.
* Migration `978_refunds_table.sql` with `UNIQUE(hyperswitch_refund_id)` — this is the load-bearing idempotency guarantee.
* Webhook routing: `HyperswitchWebhookPayload.IsRefundEvent()` dispatches `refund.*` events to `ProcessRefundWebhook`; payment events keep flowing through the existing `ProcessPaymentWebhook`.
* `DebitSellerBalance` ported off Postgres-only `GREATEST()` to portable `CASE WHEN`; the path wasn't exercised before v1.0.6, so this is a quality fix not a regression.
* Partial refunds: signature carries `amount *int64` (nil = full) but service call-site passes nil — full-only for v1.0.6. Partial-refund UX is deferred to v1.0.7.
* Stripe Connect Transfers:reversal call flagged TODO(v1.0.7). Internal balance + transfer-status are corrected here so buyer and seller views match the moment the PSP confirms; the missing piece is the money-movement round-trip at Stripe. Internal accounting is consistent — external settlement catches up with v1.0.7.
* Tests: 15 Go cases covering Phase 1 (pending state, PSP error rollback, double-submit, permissions, window), webhook finalization (succeeded, failed, idempotent replay with `succeeded_at` timestamp invariant, unknown refund_id, missing refund_id, non-terminal ignored), and dispatcher logic (6 `IsRefundEvent` cases across flat/nested/event_type shapes).

#### Fix 5 — RTMP ingest health banner on Go Live (`64fa0c9ac`)

"Go Live" was silent when `nginx-rtmp` wasn't running. An artist could copy the RTMP URL + stream key, fire OBS, and broadcast into the void with no in-UI signal.

* `GET /api/v1/live/health` TCP-dials `NGINX_RTMP_ADDR` (default `localhost:1935`), 2s timeout, 15s TTL cache protected by a mutex so a burst of page loads can't hammer the ingest. Returns UI-safe `error` string (no raw hostname leak) and `Cache-Control: private, max-age=15` so browsers honor the same window.
* Unreachable path emits a WARN log so operators see the outage before users do.
* Frontend `useLiveHealth()` hook: react-query 15s stale, 1 retry, then optimistic `{ rtmpReachable: true }` — better to miss a banner than flash a false negative on a transient health-endpoint blip.
* `LiveRtmpHealthBanner` at the top of `GoLivePage`: amber, non-blocking, copy explicitly tells the artist the stream key is still valid but broadcasting won't reach anyone, with a Retry button that invalidates the health query.
* Tests: 3 Go (listener reachable + Cache-Control; dead port unreachable + UI-safe error asserting no `127.0.0.1` leak; TTL cache survives listener teardown) + 3 Vitest (hidden when reachable, visible with Retry when unreachable, Retry invalidates the right query key).

#### Fix 6 — RabbitMQ publish failures no longer silent (`bf688af35`)

`RabbitMQEventBus.Publish` returned the broker error but did not log it. Callers that wrapped `Publish` in fire-and-forget (`_ = eb.Publish(...)`) lost events with zero trace during RMQ outages.
* `Publish` now emits a structured ERROR on broker failure with the exchange, routing_key, payload_bytes, content_type, and message_id context. Function still returns the error so call-sites that actually check it keep working.
* `EventBus disabled` warning kept but upgraded with `payload_bytes` so dashboards can quantify drops when RMQ is intentionally off.
* Aligns the legacy `internal/eventbus` with `infrastructure/eventbus` which already had this pattern.
* Tests: 2 Go (disabled bus emits WARN + returns `EventBusUnavailableError`; nil logger stays panic-free for legacy callers).

### Breaking changes

* `marketplace.MarketplaceService.RefundOrder` now returns `(*Refund, error)` instead of `error`. Callers consuming the service directly need to accept the pending refund row.
* `marketplace.refundProvider` internal interface: `Refund(...) error` → `CreateRefund(...) (refundID, status string, err error)`. `hyperswitch.Provider` implements both; external mocks must be updated.
* Order status machine gains `refund_pending` as an intermediate state. Clients reading `orders.status` should treat it as "in-flight refund, don't show as refunded yet".

### Known gaps (parked for v1.0.7)

* Partial refunds — UX decision + call-site wiring
* Stripe Connect Transfers:reversal — actually move money back at the PSP level (internal accounting is correct today)
* `CloudUploadModal.tsx` hardcoded 500MB — category semantic doesn't map to the three exposed by `/upload/limits`
* Smoke test of refund flow against Hyperswitch sandbox (manual, outside CI)

## [v1.0.5.1] - 2026-04-16

### Hotfix — dev SMTP ergonomics

Follow-up to the v1.0.5 smoke test: a fresh clone + `cp .env.template .env` + `make dev-full` produced a backend with `SMTP_HOST=""`, which silently short-circuits `EmailService.sendEmail` to a log-only path. New contributors hit register → "where's my verification email?" and had no obvious cue that the SMTP hookup was missing.
- `veza-backend-api/.env.template`: `SMTP_HOST` / `SMTP_PORT` now default to the MailHog instance that ships with `make infra-up-dev` (`localhost:1025`, UI on `:8025`). `FROM_EMAIL` / `FROM_NAME` seeded with local-safe values. Comment rewritten to point at both the dev path and the prod override.
- Also exports the duplicate variable names (`SMTP_USERNAME`, `SMTP_FROM`, `SMTP_FROM_NAME`) read by `internal/email/sender.go` — a TODO flagged for v1.0.6 to reconcile the two email services onto a single env schema. Until then both sets cover every code path.

No code change, no migration, no version bump in the Go module. Pure config hotfix.

## [v1.0.5] - 2026-04-16

### Hardening sprint — seven critical-path fixes before public opening

Audit follow-up on the `register → verify → play` critical path. The app was functional on the surface but broken underneath: the player was silent, emails weren't really sent, the marketplace gave products away in production, the chat silently de-synced across pods, maintenance mode was per-pod only, orphaned tracks accumulated forever in `processing`, and the response cache was corrupting range-aware media responses. Seven targeted fixes, each with its own commit, its own tests, and no behaviour change outside scope.

#### Fix 1 — Silent player (`veza-backend-api` + `apps/web`)

- New `GET /api/v1/tracks/:id/stream` handler in `internal/core/track/track_hls_handler.go`. Serves the raw file via `http.ServeContent` — `Range`, `If-Modified-Since` and `If-None-Match` handled for free, so `