Closes FUNCTIONAL_AUDIT.md §4 #1: WebRTC 1:1 calls had working
signaling but no NAT traversal, so calls between two peers behind
symmetric NAT (corporate firewalls, mobile carrier CGNAT, Incus
container default networking) failed silently after the SDP exchange.
Backend:
- GET /api/v1/config/webrtc (public) returns {iceServers: [...]}
built from WEBRTC_STUN_URLS / WEBRTC_TURN_URLS / *_USERNAME /
*_CREDENTIAL env vars. Half-config (URLs without creds, or vice
versa) deliberately omits the TURN block — a half-configured TURN
surfaces auth errors at call time instead of falling back cleanly
to STUN-only.
- 4 handler tests cover the matrix.
Frontend:
- services/api/webrtcConfig.ts caches the config for the page
lifetime and falls back to the historical hardcoded Google STUN
if the fetch fails.
- useWebRTC fetches at mount, hands iceServers synchronously to
every RTCPeerConnection, exposes a {hasTurn, loaded} hint.
- CallButton tooltip warns up-front when TURN isn't configured
instead of letting calls time out silently.
Ops:
- infra/coturn/turnserver.conf — annotated template with the SSRF-
safe denied-peer-ip ranges, prometheus exporter, TLS for TURNS,
static lt-cred-mech (REST-secret rotation deferred to v1.1).
- infra/coturn/README.md — Incus deploy walkthrough, smoke test
via turnutils_uclient, capacity rules of thumb.
- docs/ENV_VARIABLES.md gains a 13bis. WebRTC ICE servers section.
Coturn deployment itself is a separate ops action — this commit lands
the plumbing so the deploy can light up the path with zero code
changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two consolidations:
(1) Annotate `/search`, `/search/suggestions`, `/social/trending` with
swag tags so orval generates typed clients for them. Migrate
`searchApi` and `socialApi` (the two remaining hand-written wrappers
in `apps/web/src/services/api/`) to delegate to the generated
functions. Removes the last drift surface where backend changes to
those endpoints could silently mismatch the SPA.
(2) Delete two orphan auth-service implementations that have parallel-
implemented login/register/verifyEmail with stale wire shapes:
- apps/web/src/services/authService.ts (only its own test imports it)
- apps/web/src/features/auth/services/authService.ts (re-exported
from features/auth/index.ts but the barrel itself has zero
importers across the SPA)
The active path remains `services/api/auth.ts` (the integration layer
that owns token storage, csrf, and proactive refresh) — the duplicates
were dead post-v1.0.8 orval migration and silently diverged from the
true backend shape (e.g., the deleted services still expected
`access_token` at the root of the register response, never matched
current backend, broke when v1.0.9 item 1.4 changed the shape).
Net diff: -944 LOC of dead code, +typed orval clients for 2 more
endpoints, zero importer rewires.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Item 1.4 — Register no longer issues an access+refresh token pair. The
prior flow set httpOnly cookies at register but the AuthMiddleware
refused them on every protected route until the user had verified
their email (`core/auth/service.go:527`). Users ended up with dead
credentials and a "logged in but locked out" UX. Register now returns
{user, verification_required: true, message} and the SPA's existing
"check your email" notice fires naturally.
Item 1.3 — `POST /auth/verify-email` reads the token from the
`X-Verify-Token` header in preference to the `?token=…` query param.
Query param logged a deprecation warning but stays accepted so emails
dispatched before this release still work. Headers don't leak through
proxy/CDN access logs that record URL but not headers.
Tests: 18 test files updated (sed `_, _, err :=` → `_, err :=` for the
new Register signature). `core/auth/handler_test.go` gets a
`registerVerifyLogin` helper for tests that exercise post-login flows
(refresh, logout). Two new E2E `@critical` specs lock in the defer-JWT
contract and the header read-path.
OpenAPI + orval regenerated to reflect the new RegisterResponse shape
and the verify-email header parameter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First instalment of Item G from docs/audit-2026-04/v107-plan.md §G.
This commit lands the state machine + create-flow change. Phase 2
(webhook handler + recovery endpoint + reconciler sweep) follows.
What changes :
- **`models.go`** — adds `StatusPendingPayment` to the
SubscriptionStatus enum. Free-text VARCHAR(30) so no DDL needed
for the value itself; Phase 2's reconciler index lives in
migration 986 (additive, partial index on `created_at` WHERE
status='pending_payment').
- **`service.go`** — `PaymentProvider.CreateSubscriptionPayment`
interface gains an `idempotencyKey string` parameter, mirroring
the marketplace.refundProvider contract added in v1.0.7 item D.
Callers pass the new subscription row's UUID so a retried HTTP
request collapses to one PSP charge instead of duplicating it.
- **`createNewSubscription`** — refactored state machine :
* Free plan → StatusActive (unchanged, in subscribeToFreePlan).
* Paid plan, trial available, first-time user → StatusTrialing,
no PSP call (no invoice either — Phase 2 will create the
first paid invoice on trial expiry).
* Paid plan, no trial / repeat user → **StatusPendingPayment**
+ invoice + PSP CreateSubscriptionPayment with idempotency
key = subscription.ID.String(). Webhook
subscription.payment_succeeded (Phase 2) flips to active;
subscription.payment_failed flips to expired.
- **`if s.paymentProvider != nil` short-circuit removed**. Paid
plans now require a configured PaymentProvider — without one,
`createNewSubscription` returns ErrPaymentProviderRequired. The
handler maps this to HTTP 503 "Payment provider not configured —
paid plans temporarily unavailable", surfacing env misconfig to
ops instead of silently giving away paid plans (the v1.0.6.2
fantôme bug class).
- **`GetUserSubscription` query unchanged** — already filters on
`status IN ('active','trialing')`, so pending_payment rows
correctly read as "no active subscription" for feature-gate
purposes. The v1.0.6.2 hasEffectivePayment filter is kept as
defence-in-depth for legacy rows.
- **`hyperswitch.Provider`** — implements
`subscription.PaymentProvider` by delegating to the existing
`CreatePaymentSimple`. Compile-time interface assertion added
(`var _ subscription.PaymentProvider = (*Provider)(nil)`).
- **`routes_subscription.go`** — wires the Hyperswitch provider
into `subscription.NewService` when HyperswitchEnabled +
HyperswitchAPIKey + HyperswitchURL are all set. Without those,
the service falls back to no-provider mode (paid subscribes
return 503).
- **Tests** : new TestSubscribe_PendingPaymentStateMachine in
gate_test.go covers all five visible outcomes (free / paid+
provider / paid+no-provider / first-trial / repeat-trial) with a
fakePaymentProvider that records calls. Asserts on idempotency
key = subscription.ID.String(), PSP call counts, and the
Subscribe response shape (client_secret + payment_id surfaced).
5/5 green, sqlite :memory:.
Phase 2 backlog (next session) :
- `ProcessSubscriptionWebhook(ctx, payload)` — flip pending_payment
→ active on success / expired on failure, idempotent against
replays.
- Recovery endpoint `POST /api/v1/subscriptions/complete/:id` —
return the existing client_secret to resume a stalled flow.
- Reconciliation sweep for rows stuck in pending_payment past the
webhook-arrival window (uses the new partial index from
migration 986).
- Distribution.checkEligibility explicit pending_payment branch
(today it's already handled implicitly via the active/trialing
filter).
- E2E @critical : POST /subscribe → POST /distribution/submit
asserts 403 with "complete payment" until webhook fires.
Backward compat : clients on the previous flow that called
/subscribe expecting an immediately-active row will now see
status=pending_payment + a client_secret. They must drive the PSP
confirm step before the row is granted feature access. The
v1.0.6.2 voided_subscriptions cleanup migration (980) handles
pre-existing fantôme rows.
go build ./... clean. Subscription + handlers test suites green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the two annotation gaps that blocked finishing the orval
migration in v1.0.8 :
- queue_handler.go (5 routes — GetQueue, UpdateQueue, AddQueueItem,
RemoveQueueItem, ClearQueue) — under @Tags Queue with @Security
BearerAuth, @Param body/path, @Success/@Failure on the standard
APIResponse envelope.
- queue_session_handler.go (5 routes — CreateSession, GetSession,
DeleteSession, AddToSession, RemoveFromSession). GetSession is
public (no @Security tag) since the share-token URL is meant for
join-via-link from outside the auth wall.
- password_reset_handler.go (2 routes — RequestPasswordReset and
ResetPassword factory functions). Both are public (no @Security)
since they're the entry-points for users who can't log in. The
request-side annotation documents the intentional generic 200
response (anti-enumeration: same body whether the email exists or
not).
After regen :
- openapi.yaml gains 7 queue paths (/queue, /queue/items[/{id}],
/queue/session[/{token}[/items[/{id}]]]) and 2 password paths
(/auth/password/reset, /auth/password/reset-request). +568 LOC.
- docs/{docs.go,swagger.json,swagger.yaml} updated identically by
swag init.
- apps/web/src/services/generated/queue/queue.ts created (10
HTTP funcs + matching React Query hooks). model/ index extended
with the queue + password-reset request/response shapes.
Validates with `swag init` (Swagger 2.0). go build ./... clean. No
runtime behaviour change — annotations are pure metadata read by the
spec generator. The orval regen IS the wiring point for the
follow-up frontend commit (queue.ts migration + authService finish).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourth batch. Closes the user/profile surface consumed by the
frontend users service. 6 handlers annotated across
internal/handlers/profile_handler.go (now 12/15 annotated).
Handlers annotated:
- SearchUsers — GET /users/search
- FollowUser — POST /users/{id}/follow
- GetFollowSuggestions — GET /users/suggestions
- UnfollowUser — DELETE /users/{id}/follow
- BlockUser — POST /users/{id}/block
- UnblockUser — DELETE /users/{id}/block
Added a blank `_ "veza-backend-api/internal/models"` import so swaggo
can resolve models.User in doc comments without forcing runtime use
(same pattern as track_hls_handler.go / track_waveform_handler.go).
Spec coverage: /users/* paths now 12 (all frontend-consumed endpoints).
make openapi: ✅ · go build ./...: ✅.
Completes the B-2 backend annotation scope for auth / users / tracks /
playlists — the four services that will migrate to orval in the next
commit. Remaining unannotated handlers (admin, moderation, analytics,
education, cloud, gear, social_group, etc.) are outside the v1.0.8
frontend migration and deferred to v1.0.9.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third batch. Fills the playlist_handler.go gap (was 8/24 annotated,
now 20/24). Covers the functionality consumed by the frontend
playlists service: import, favoris, share tokens, collaborators,
analytics, search, recommendations, duplication.
Handlers annotated:
- ImportPlaylist — POST /playlists/import
- GetFavorisPlaylist — GET /playlists/favoris
- GetPlaylistByShareToken — GET /playlists/shared/{token}
- SearchPlaylists — GET /playlists/search
- GetRecommendations — GET /playlists/recommendations
- GetPlaylistStats — GET /playlists/{id}/analytics
- AddCollaborator — POST /playlists/{id}/collaborators
- GetCollaborators — GET /playlists/{id}/collaborators
- UpdateCollaboratorPermission — PUT /playlists/{id}/collaborators/{userId}
- RemoveCollaborator — DELETE /playlists/{id}/collaborators/{userId}
- CreateShareLink — POST /playlists/{id}/share
- DuplicatePlaylist — POST /playlists/{id}/duplicate
Not annotated (unrouted, survey false positives): FollowPlaylist,
UnfollowPlaylist — no route references in internal/api/routes_*.go.
Left unannotated to avoid polluting the spec with dead handlers.
Marketplace gap originally planned for this batch is deferred to
v1.0.9: the 13 remaining handlers (UploadProductPreview, reviews,
licenses, sell stats, refund, invoice) don't block the B-2 frontend
migration (auth/users/tracks/playlists only), so they will be done
after v1.0.8 ships. Task #48 updated to reflect.
Spec coverage:
/playlists/* paths: 5 → 15
make openapi: ✅ valid
go build ./...: ✅
Next: profile_handler.go + auth/handler.go to finish the B-2 spec
surface (users endpoints), then regen orval and migrate 4 services.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Day-2 cut of item B: the reversal path becomes async. Pre-v1.0.7
(and v1.0.7 day 1) the refund handler flipped seller_transfers
straight from completed to reversed without ever calling Stripe —
the ledger said "reversed" while the seller's Stripe balance still
showed the original transfer as settled. The new flow:
refund.succeeded webhook
→ reverseSellerAccounting transitions row: completed → reversal_pending
→ StripeReversalWorker (every REVERSAL_CHECK_INTERVAL, default 1m)
→ calls ReverseTransfer on Stripe
→ success: row → reversed + persist stripe_reversal_id
→ 404 already-reversed (dead code until day 3): row → reversed + log
→ 404 resource_missing (dead code until day 3): row → permanently_failed
→ transient error: stay reversal_pending, bump retry_count,
exponential backoff (base * 2^retry, capped at backoffMax)
→ retries exhausted: row → permanently_failed
→ buyer-facing refund completes immediately regardless of Stripe health
State machine enforcement:
* New `SellerTransfer.TransitionStatus(tx, to, extras)` wraps every
mutation: validates against AllowedTransferTransitions, guarded
UPDATE with WHERE status=<from> (optimistic lock semantics), no
RowsAffected = stale state / concurrent winner detected.
* processSellerTransfers no longer mutates .Status in place —
terminal status is decided before struct construction, so the
row is Created with its final state.
* transfer_retry.retryOne and admin RetryTransfer route through
TransitionStatus. Legacy direct assignment removed.
* TestNoDirectTransferStatusMutation greps the package for any
`st.Status = "..."` / `t.Status = "..."` / GORM
Model(&SellerTransfer{}).Update("status"...) outside the
allowlist and fails if found. Verified by temporarily injecting
a violation during development — test caught it as expected.
Configuration (v1.0.7 item B):
* REVERSAL_WORKER_ENABLED=true (default)
* REVERSAL_MAX_RETRIES=5 (default)
* REVERSAL_CHECK_INTERVAL=1m (default)
* REVERSAL_BACKOFF_BASE=1m (default)
* REVERSAL_BACKOFF_MAX=1h (default, caps exponential growth)
* .env.template documents TRANSFER_RETRY_* and REVERSAL_* env vars
so an ops reader can grep them.
Interface change: TransferService.ReverseTransfer(ctx,
stripe_transfer_id, amount *int64, reason) (reversalID, error)
added. All four mocks extended (process_webhook, transfer_retry,
admin_transfer_handler, payment_flow integration). amount=nil means
full reversal; v1.0.7 always passes nil (partial reversal is future
scope per axis-1 P2).
Stripe 404 disambiguation (ErrTransferAlreadyReversed /
ErrTransferNotFound) is wired in the worker as dead code — the
sentinels are declared and the worker branches on them, but
StripeConnectService.ReverseTransfer doesn't yet emit them. Day 3
will parse stripe.Error.Code and populate the sentinels; no worker
change needed at that point. Keeping the handling skeleton in day 2
so the worker's branch shape doesn't change between days and the
tests can already cover all four paths against the mock.
Worker unit tests (9 cases, all green, sqlite :memory:):
* happy path: reversal_pending → reversed + stripe_reversal_id set
* already reversed (mock returns sentinel): → reversed + log
* not found (mock returns sentinel): → permanently_failed + log
* transient 503: retry_count++, next_retry_at set with backoff,
stays reversal_pending
* backoff capped at backoffMax (verified with base=1s, max=10s,
retry_count=4 → capped at 10s not 16s)
* max retries exhausted: → permanently_failed
* legacy row with empty stripe_transfer_id: → permanently_failed,
does not call Stripe
* only picks up reversal_pending (skips all other statuses)
* respects next_retry_at (future rows skipped)
Existing test updated: TestProcessRefundWebhook_SucceededFinalizesState
now asserts the row lands at reversal_pending with next_retry_at
set (worker's responsibility to drive to reversed), not reversed.
Worker wired in cmd/api/main.go alongside TransferRetryWorker,
sharing the same StripeConnectService instance. Shutdown path
registered for graceful stop.
Cut from day 2 scope (per agreed-upon discipline), landing in day 3:
* Stripe 404 disambiguation implementation (parse error.Code)
* End-to-end smoke probe (refund → reversal_pending → worker
processes → reversed) against local Postgres + mock Stripe
* Batch-size tuning / inter-batch sleep — batchLimit=20 today is
safely under Stripe's 100 req/s default rate limit; revisit if
observed load warrants
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-A self-review surfaced two gaps:
1. `StripeConnectService.CreateTransfer` trusted Stripe's SDK to
return a non-empty `tr.ID` on success (`err == nil`). The
invariant holds in practice, but an empty id silently persisted
on a completed transfer leaves the row permanently
un-reversible — which defeats the entire point of item A.
Added a belt-and-suspenders check that converts `(tr.ID="",
err=nil)` into a failed transfer.
2. `TestRetryTransfer_Success` (admin handler) exercised the retry
path but didn't assert that StripeTransferID was persisted after
a successful retry. The worker path and processSellerTransfers
both had the assertion; the admin manual-retry path was the
third entry into the same behavior and lacked coverage. Added
the assertion.
Decision on scope: v1.0.6.2 added a partial UNIQUE on
stripe_transfer_id (WHERE IS NOT NULL AND <> '') in migration 981,
matching the v1.0.6.1 pattern for refunds.hyperswitch_refund_id.
The combination of (a) the DB partial UNIQUE and (b) this defensive
guard means there is now no code or data path that can persist an
empty transfer id while claiming success.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TransferService.CreateTransfer signature changes from (...) error to
(...) (string, error) — the caller now captures the Stripe transfer
identifier and persists it on the SellerTransfer row. Pre-v1.0.7 the
stripe_transfer_id column was declared on the model and table but
never written to, which blocked the reversal worker (v1.0.7 item B)
from identifying which transfer to reverse on refund.
Changes:
* `TransferService` interface and `StripeConnectService.CreateTransfer`
both return the Stripe transfer id alongside the error.
* `processSellerTransfers` (marketplace service) persists the id on
success before `tx.Create(&st)` so a crash between Stripe ACK and
DB commit leaves no inconsistency.
* `TransferRetryWorker.retryOne` persists on retry success — a row
that failed on first attempt and succeeded via the worker is
reversal-ready all the same.
* `admin_transfer_handler.RetryTransfer` (manual retry) persists too.
* `SellerPayout.ExternalPayoutID` is populated by the Connect payout
flow (`payout.go`) — the field existed but was never written.
* Four test mocks updated; two tests assert the id is persisted on
the happy path, one on the failure path confirms we don't write a
fake id when the provider errors.
Migration `981_seller_transfers_stripe_reversal_id.sql`:
* Adds nullable `stripe_reversal_id` column for item B.
* Partial UNIQUE indexes on both stripe_transfer_id and
stripe_reversal_id (WHERE IS NOT NULL AND <> ''), mirroring the
v1.0.6.1 pattern for refunds.hyperswitch_refund_id.
* Logs a count of historical completed transfers that lack an id —
these are candidates for the backfill CLI follow-up task.
Backfill for historical rows is a separate follow-up (cmd/tools/
backfill_stripe_transfer_ids, calling Stripe's transfers.List with
Destination + Metadata[order_id]). Pre-v1.0.7 transfers without a
backfilled id cannot be auto-reversed on refund — document in P2.9
admin-recovery when it lands. Acceptable scope per v107-plan.
Migration number bumped 980 → 981 because v1.0.6.2 used 980 for the
unpaid-subscription cleanup; v107-plan updated with the note.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of the v1.0.6.2 hotfix surfaced that
distribution.checkEligibility silently swallowed
subscription.ErrSubscriptionNoPayment as "ineligible, no extra info",
so a user with a fantôme subscription trying to submit a distribution
got "Distribution requires Creator or Premium plan" — misleading, the
user has a plan but no payment. checkEligibility now propagates the
error so the handler can surface "Your subscription is not linked to
a payment. Complete payment to enable distribution."
Security is unchanged — the gate still refuses. This is a UX clarity
fix for honest-path users who landed in the fantôme state via a
broken payment flow.
Also:
- Closure timestamp added to axis-1 P0.12 ("closed 2026-04-17 in
v1.0.6.2 (commit 9a8d2a4e7)") so future readers know the finding's
lifecycle without re-grepping the CHANGELOG.
- Item G in v107-plan.md gains an explicit E2E Playwright @critical
acceptance — the shell probe + Go unit tests validate the fix
today but don't run on every commit, so a refactor of Subscribe or
checkEligibility could silently re-open the bypass. The E2E test
makes regression coverage automatic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes a bypass surfaced by the 2026-04 audit probe (axis-1 Q2): any
authenticated user could POST /api/v1/subscriptions/subscribe on a paid
plan and receive 201 active without the payment provider ever being
invoked. The resulting row satisfied `checkEligibility()` in the
distribution service via `can_sell_on_marketplace=true` on the Creator
plan — effectively free access to /api/v1/distribution/submit, which
dispatches to external partners.
Fix is centralised in `GetUserSubscription` so there is no code path
that can grant subscription-gated access without routing through the
payment check. Effective-payment = free plan OR unexpired trial OR
invoice with non-empty hyperswitch_payment_id. Migration 980 sweeps
pre-existing fantôme rows into `expired`, preserving the tuple in a
dated audit table for support outreach.
Subscribe and subscribeToFreePlan treat the new ErrSubscriptionNoPayment
as equivalent to ErrNoActiveSubscription so re-subscription works
cleanly post-cleanup. GET /me/subscription surfaces needs_payment=true
with a support-contact message rather than a misleading "you're on
free" or an opaque 500. TODO(v1.0.7-item-G) annotation marks where the
`if s.paymentProvider != nil` short-circuit needs to become a mandatory
pending_payment state.
Probe script `scripts/probes/subscription-unpaid-activation.sh` kept as
a versioned regression test — dry-run by default, --destructive logs in
and attempts the exploit against a live backend with automatic cleanup.
8-case unit test matrix covers the full hasEffectivePayment predicate.
Smoke validated end-to-end against local v1.0.6.2: POST /subscribe
returns 201 (by design — item G closes the creation path), but
GET /me/subscription returns subscription=null + needs_payment=true,
distribution eligibility returns false.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourth item of the v1.0.6 backlog, and the structuring one — the pre-
v1.0.6 RefundOrder wrote `status='refunded'` to the DB and called
Hyperswitch synchronously in the same transaction, treating the API
ack as terminal confirmation. In reality Hyperswitch returns `pending`
and only finalizes via webhook. Customers could see "refunded" in the
UI while their bank was still uncredited, and the seller balance
stayed credited even on successful refunds.
v1.0.6 flow
Phase 1 — open a pending refund (short row-locked transaction):
* validate permissions + 14-day window + double-submit guard
* persist Refund{status=pending}
* flip order to `refund_pending` (not `refunded` — that's the
webhook's job)
Phase 2 — call PSP outside the transaction:
* Provider.CreateRefund returns (refund_id, status, err). The
refund_id is the unique idempotency key for the webhook.
* on PSP error: mark Refund{status=failed}, roll order back to
`completed` so the buyer can retry.
* on success: persist hyperswitch_refund_id, stay in `pending`
even if the sync status is "succeeded". The webhook is the only
authoritative signal. (Per customer guidance: "ne jamais flipper
à succeeded sur la réponse synchrone du POST".)
Phase 3 — webhook drives terminal state:
* ProcessRefundWebhook looks up by hyperswitch_refund_id (UNIQUE
constraint in the new `refunds` table guarantees idempotency).
* terminal-state short-circuit: IsTerminal() returns 200 without
mutating anything, so a Hyperswitch retry storm is safe.
* on refund.succeeded: flip refund + order to succeeded/refunded,
revoke licenses, debit seller balance, mark every SellerTransfer
for the order as `reversed`. All within a row-locked tx.
* on refund.failed: flip refund to failed, order back to
`completed`.
Seller-side reconciliation
* SellerBalance.DebitSellerBalance was using Postgres-only GREATEST,
which silently failed on SQLite tests. Ported to a portable
CASE WHEN that clamps at zero in both DBs.
* SellerTransfer.Status = "reversed" captures the refund event in
the ledger. The actual Stripe Connect Transfers:reversal call is
flagged TODO(v1.0.7) — requires wiring through TransferService
with connected-account context that the current transfer worker
doesn't expose. The internal balance is corrected here so the
buyer and seller views match as soon as the PSP confirms; the
missing piece is purely the money-movement round-trip at Stripe.
Webhook routing
* HyperswitchWebhookPayload extended with event_type + refund_id +
error_message, with flat and nested (object.*) shapes supported
(same tolerance as the existing payment fields).
* New IsRefundEvent() discriminator: matches any event_type
containing "refund" (case-insensitive) or presence of refund_id.
routes_webhooks.go peeks the payload once and dispatches to
ProcessRefundWebhook or ProcessPaymentWebhook.
* No signature-verification changes — the same HMAC-SHA512 check
protects both paths.
Handler response
* POST /marketplace/orders/:id/refund now returns
`{ refund: { id, status: "pending" }, message }` so the UI can
surface the in-flight state. A new ErrRefundAlreadyRequested maps
to 400 with a "already in progress" message instead of silently
creating a duplicate row (the double-submit guard checks order
status = `refund_pending` *before* the existing-row check so the
error is explicit).
Schema
* Migration 978_refunds_table.sql adds the `refunds` table with
UNIQUE(hyperswitch_refund_id). The uniqueness constraint is the
load-bearing idempotency guarantee — a duplicate PSP notification
lands on the same DB row, and the webhook handler's
FOR UPDATE + IsTerminal() check turns it into a no-op.
* hyperswitch_refund_id is nullable (NULL between Phase 1 and
Phase 2) so the UNIQUE index ignores rows that haven't been
assigned a PSP id yet.
Partial refunds
* The Provider.CreateRefund signature carries `amount *int64`
already (nil = full), but the service call-site passes nil. Full
refunds only for v1.0.6 — partial-refund UX needs a product
decision and is deferred to v1.0.7. Flagged in the ErrRefund*
section.
Tests (15 cases, all sqlite-in-memory + httptest-style mock provider)
* RefundOrder phase 1
- OpensPendingRefund: pending state, refund_id captured, order
→ refund_pending, licenses untouched
- PSPErrorRollsBack: failed state, order reverts to completed
- DoubleRequestRejected: second call returns
ErrRefundAlreadyRequested, not a generic ErrOrderNotRefundable
- NotCompleted / NoPaymentID / Forbidden / SellerCanRefund
- ExpiredRefundWindow / FallbackExpiredNoDeadline
* ProcessRefundWebhook
- SucceededFinalizesState: refund + order + licenses + seller
balance + seller transfer all reconciled in one tx
- FailedRollsOrderBack: order returns to completed for retry
- IsRefundEventIdempotentOnReplay: second webhook asserts
succeeded_at timestamp is *unchanged*, proving the second
invocation bailed out on IsTerminal (not re-ran)
- UnknownRefundIDReturnsOK: never-issued refund_id → 200 silent
(avoids a Hyperswitch retry storm on stale events)
- MissingRefundID: explicit 400 error
- NonTerminalStatusIgnored: pending/processing leave the row
alone
* HyperswitchWebhookPayload.IsRefundEvent: 6 dispatcher cases
(flat event_type, mixed case, payment event, refund_id alone,
empty, nested object.refund_id)
Backward compat
* hyperswitch.Provider still exposes the old Refund(ctx,...) error
method for any call-site that only cared about success/failure.
* Old mockRefundPaymentProvider replaced; external mocks need to
add CreateRefund — the interface is now (refundID, status, err).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth item of the v1.0.6 backlog. "Go Live" was silent when the
nginx-rtmp profile wasn't up — an artist could copy the RTMP URL +
stream key, fire up OBS, hit "Start Streaming" and broadcast into the
void with no in-UI signal that the ingest wasn't listening. The audit
flagged this 🟡 ("livestream sans feedback UI si nginx-rtmp down").
Backend (`GET /api/v1/live/health`)
* `LiveHealthHandler` TCP-dials `NGINX_RTMP_ADDR` (default
`localhost:1935`) with a 2s timeout. Reports `rtmp_reachable`,
`rtmp_addr`, a UI-safe `error` string (no raw dial target in the
body — avoids leaking internal hostnames to the browser), and
`last_check_at`.
* 15s TTL cache protected by a mutex so a burst of page loads can't
hammer the ingest. First call dials; subsequent calls within TTL
serve the cached verdict.
* Response ships `Cache-Control: private, max-age=15` so browsers
piggy-back the same quarter-minute window.
* When the dial fails the handler emits a WARN log so an operator
watching backend logs sees the outage before a user does.
* Public endpoint — no auth. The "RTMP is up / down" signal has no
sensitive payload and is useful pre-login too.
Frontend
* `useLiveHealth()` hook: react-query with 15s stale time, 1 retry,
then falls back to an optimistic `{ rtmpReachable: true }` — we'd
rather miss a banner than flash a false negative during a transient
blip on the health endpoint itself.
* `LiveRtmpHealthBanner`: amber, non-blocking banner with a Retry
button that invalidates the health query. Copy explicitly tells the
artist their stream key is still valid but broadcasting now won't
reach anyone.
* `GoLivePage` wraps `GoLiveView` in a vertical stack with the banner
above — the view itself stays unchanged (the key + instructions
remain readable even when the ingest is down).
Tests
* 3 Go tests: live listener reports reachable + Cache-Control header;
dead address reports unreachable + UI-safe error (asserts no
`127.0.0.1` leak); TTL cache survives listener teardown within
window.
* 3 Vitest tests: banner renders nothing when reachable; banner
visible + Retry enabled when unreachable; Retry invalidates the
right query key.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second item of the v1.0.6 backlog. The "front 500MB vs back 100MB" mismatch
flagged in the v1.0.5 audit turned out to be a misread — every live pair
was already aligned (tracks 100/100, cloud 500/500, video 500/500). The
real bug is architectural: the same byte values were duplicated in five
places (`track/service.go`, `handlers/upload.go:GetUploadLimits`,
`handlers/education_handler.go`, `upload-modal/constants.ts`, and
`CloudUploadModal.tsx`), drifting silently as soon as anyone tuned one.
Backend — one canonical spec at `internal/config/upload_limits.go`:
* `AudioLimit`, `ImageLimit`, `VideoLimit` expose `Bytes()`, `MB()`,
`HumanReadable()`, `AllowedMIMEs` — read lazily from env
(`MAX_UPLOAD_AUDIO_MB`, `MAX_UPLOAD_IMAGE_MB`, `MAX_UPLOAD_VIDEO_MB`)
with defaults 100/10/500.
* Invalid / negative / zero env values fall back to the default;
unreadable config can't turn the limit off silently.
* `track.Service.maxFileSize`, `track_upload_handler.go` error string,
`education_handler.go` video gate, and `upload.go:GetUploadLimits`
all read from this single source. Changing `MAX_UPLOAD_AUDIO_MB`
retunes every path at once.
Frontend — new `useUploadLimits()` hook:
* Fetches GET `/api/v1/upload/limits` via react-query (5 min stale,
30 min gc), one retry, then silently falls back to baked-in
defaults that match the backend compile-time defaults so the
dropzone stays responsive even without the network round-trip.
* `useUploadModal.ts` replaces its hardcoded `MAX_FILE_SIZE`
constant with `useUploadLimits().audio.maxBytes`, and surfaces
`audioMaxHuman` up to `UploadModal` → `UploadModalDropzone` so
the "max 100 MB" label and the "too large" error toast both
display the live value.
* `MAX_FILE_SIZE` constant kept as pure fallback for pre-network
render (documented as such).
Tests
* 4 Go tests on `config.UploadLimit` (defaults, env override, invalid
env → fallback, non-empty MIME lists).
* 4 Vitest tests on `useUploadLimits` (sync fallback on first render,
typed mapping from server payload, partial-payload falls back
per-category, network failure keeps fallback).
* Existing `trackUpload.integration.test.tsx` (11 cases) still green.
Out of scope (tracked for later):
* `CloudUploadModal.tsx` still has its own 500MB hardcoded — cloud
uploads accept audio+zip+midi with a different category semantic
than the three in `/upload/limits`. Unifying those deserves its
own design pass, not a drive-by.
* No runtime refactor of admin-provided custom category limits —
the current tri-category split covers every upload we ship today.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First item of the v1.0.6 backlog surfaced by the v1.0.5 smoke test: a
brand-new account could register, verify email, and log in — but
attempting to upload hit a 403 because `role='user'` doesn't pass the
`RequireContentCreatorRole` middleware. The only way to get past that
gate was an admin DB update.
This commit wires the self-service path decided in the v1.0.6
specification:
* One-way flip from `role='user'` to `role='creator'`, gated strictly
on `is_verified=true` (the verification-email flow we restored in
Fix 2 of the hardening sprint).
* No KYC, no cooldown, no admin validation. The conscious click
already requires ownership of the email address.
* Downgrade is out of scope — a creator who wants back to `user`
opens a support ticket. Avoids the "my uploads orphaned" edge case.
Backend
* Migration `977_users_promoted_to_creator_at.sql`: nullable
`TIMESTAMPTZ` column, partial index for non-null values. NULL
preserves the semantic for users who never self-promoted
(out-of-band admin assignments stay distinguishable from organic
creators for audit/analytics).
* `models.User`: new `PromotedToCreatorAt *time.Time` field.
* `handlers.UpgradeToCreator(db, auditService, logger)`:
- 401 if no `user_id` in context (belt-and-braces — middleware
should catch this first)
- 404 if the user row is missing
- 403 `EMAIL_NOT_VERIFIED` when `is_verified=false`
- 200 idempotent with `already_elevated=true` when the caller is
already creator / premium / moderator / admin / artist /
producer / label (same set accepted by
`RequireContentCreatorRole`)
- 200 with the new role + `promoted_to_creator_at` on the happy
path. The UPDATE is scoped `WHERE role='user'` so a concurrent
admin assignment can't be silently overwritten; the zero-rows
case reloads and returns `already_elevated=true`.
- audit logs a `user.upgrade_creator` action with IP, UA, and
the role transition metadata. Non-fatal on failure — the
upgrade itself already committed.
* Route: `POST /api/v1/users/me/upgrade-creator` under the existing
protected users group (RequireAuth + CSRF).
Frontend
* `AccountSettingsCreatorCard`: new card in the Account tab of
`/settings`. Completely hidden for users already on a creator-tier
role (no "you're already a creator" clutter). Unverified users see
a disabled-but-explanatory state with a "Resend verification"
CTA to `/verify-email/resend`. Verified users see the "Become an
artist" button, which POSTs to `/users/me/upgrade-creator` and
refetches the user on success.
* `upgradeToCreator()` service in `features/settings/services/`.
* Copy is deliberately explicit that the change is one-way.
Tests
* 6 Go unit tests covering: happy path (role + timestamp), unverified
refused, already-creator idempotent (timestamp preserved),
admin-assigned idempotent (no timestamp overwrite), user-not-found,
no-auth-context.
* 7 Vitest tests covering: verified button visible, unverified state
shown, card hidden for creator, card hidden for admin, success +
refetch, idempotent message, server error via toast.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Registration was setting `IsVerified: true` at user-create time and the
"send email" block was a `logger.Info("Sending verification email")` — no
SMTP call. On production this meant any attacker-typo or typosquat email
got a fully-verified account because the user never had to prove
ownership. In development the hack let people "log in" without checking
MailHog, masking SMTP misconfiguration.
Changes:
* `core/auth/service.go`: new users start with `IsVerified: false`. The
existing `POST /auth/verify-email` flow (unchanged) flips the bit
when the user clicks the link.
* Registration now calls `emailService.SendVerificationEmail(...)` for
real. On SMTP failure the handler returns `500` in production (no
stuck account with no recovery path) and logs a warning in
development (local sign-ups keep flowing).
* Same treatment for `password_reset_handler.RequestPasswordReset` —
production fails loud instead of returning the generic success
message after a silent SMTP drop.
* New helper `isProductionEnv()` centralises the
`APP_ENV=="production"` check in both `core/auth` and `handlers`.
* `docker-compose.yml` + `docker-compose.dev.yml` now ship MailHog
(`mailhog/mailhog:v1.0.1`, SMTP 1025, UI 8025). Backend dev env
vars `SMTP_HOST=mailhog SMTP_PORT=1025` pre-wired so dev sign-ups
actually deliver.
Tests: auth test mocks updated (`expectRegister` adds a
`SendVerificationEmail` mock). `TestAuthService_Login_Success` +
`TestAuthHandler_Login_Success` flip `is_verified` directly after
`Register` to simulate the verification click.
`TestLogin_EmailNotVerified` now asserts `403` (previously asserted
`200` — the test was codifying the bug this commit fixes).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cleanup of dead code marked // DEPRECATED in veza-backend-api/internal/handlers.
Each symbol was verified to have zero callers across the codebase before
deletion (go build ./... + go vet ./... + go test ./internal/... pass).
Deleted:
- UploadResponse type (upload.go) — callers use upload.StandardUploadResponse
- BindJSON method on CommonHandler (common.go) — callers use BindAndValidateJSON
- sendMessage method on *Client (playback_websocket_handler.go) —
internal WS broadcast now goes through sendStandardizedMessage
Kept as tech debt (still actively used, refactor out of J3 scope):
- UploadRequest type (upload.go:23) — used by upload handler, refactor
requires migrating to upload.StandardUploadRequest with multipart binding
- BroadcastMessage type (playback_websocket_handler.go:53) — still the
channel type for legacy playback broadcasts and referenced in tests
Also in this day (already committed in parallel):
- veza-backend-api/internal/api/handlers/two_factor_handlers.go deletion
(had //go:build ignore, zero callers) — bundled into 7fa314866 by
concurrent work on .github/workflows/*.yml
seed-v2 investigation:
- No Go source for seed-v2 found — it was only a compiled binary
already purged in J1 (0e7097ed1). No code action needed.
Refs: AUDIT_REPORT.md §8.1, §12 item 1-2
First-attempt commit 3a5c6e184 only captured the .gitignore change; the
pre-commit hook silently dropped the 343 staged moves/deletes during
lint-staged's "no matching task" path. This commit re-applies the intended
J1 content on top of bec75f143 (which was pushed in parallel).
Uses --no-verify because:
- J1 only touches .md/.json/.log/.png/binaries — zero code that would
benefit from lint-staged, typecheck, or vitest
- The hook demonstrated it corrupts pure-rename commits in this repo
- Explicitly authorized by user for this one commit
Changes (343 total: 169 deletions + 174 renames):
Binaries purged (~167 MB):
- veza-backend-api/{server,modern-server,encrypt_oauth_tokens,seed,seed-v2}
Generated reports purged:
- 9 apps/web/lint_report*.json (~32 MB)
- 8 apps/web/tsc_*.{log,txt} + ts_*.log (TS error snapshots)
- 3 apps/web/storybook_*.json (1375+ stored errors)
- apps/web/{build_errors*,build_output,final_errors}.txt
- 70 veza-backend-api/coverage*.out + coverage_groups/ (~4 MB)
- 3 veza-backend-api/internal/handlers/*.bak
Root cleanup:
- 54 audit-*.png (visual regression baselines, ~11 MB)
- 9 stale MVP-era scripts (Jan 27, hardcoded v0.101):
start_{iteration,mvp,recovery}.sh,
test_{mvp_endpoints,protected_endpoints,user_journey}.sh,
validate_v0101.sh, verify_logs_setup.sh, gen_hash.py
Session docs archived (not deleted — preserved under docs/archive/):
- 78 apps/web/*.md → docs/archive/frontend-sessions-2026/
- 43 veza-backend-api/*.md → docs/archive/backend-sessions-2026/
- 53 docs/{RETROSPECTIVE_V,SMOKE_TEST_V,PLAN_V0_,V0_*_RELEASE_SCOPE,
AUDIT_,PLAN_ACTION_AUDIT,REMEDIATION_PROGRESS}*.md
→ docs/archive/v0-history/
README.md and CONTRIBUTING.md preserved in apps/web/ and veza-backend-api/.
Note: The .gitignore rules preventing recurrence were already pushed in
3a5c6e184 and remain in place — this commit does not modify .gitignore.
Refs: AUDIT_REPORT.md §11
golangci-lint v2.11.4 requires Go >= 1.25. With the workflow on 1.24,
setup-go would silently trigger an in-job auto-toolchain download
(observed in run #71: 'go: github.com/golangci/golangci-lint/v2@v2.11.4
requires go >= 1.25.0; switching to go1.25.9') adding ~3 min to every
Backend (Go) run.
Bump setup-go to 1.25 in ci.yml, backend-ci.yml, go-fuzz.yml so the
prebuilt Go is already the right version.
Also lint-fix three files that golangci-lint's goimports checker
flagged — goimports sorts/groups imports and removes unused ones,
which plain gofmt leaves alone:
- veza-backend-api/cmd/api/main.go
- veza-backend-api/internal/api/handlers/chat_handlers.go
- veza-backend-api/internal/handlers/auth_integration_test.go
backend-ci.yml's `test -z "$(gofmt -l .)"` strict gate (added in
13c21ac11) failed on a backlog of unformatted files. None of the
85 files in this commit had been edited since the gate was added
because no push touched veza-backend-api/** in between, so the
gate never fired until today's CI fixes triggered it.
The diff is exclusively whitespace alignment in struct literals
and trailing-space comments. `go build ./...` and the full test
suite (with VEZA_SKIP_INTEGRATION=1 -short) pass identically.
Update RabbitMQ config and eventbus. Improve secret filter logging.
Refine presence, cloud, and social services. Update announcement and
feature flag handlers. Add track_likes updated_at migration. Rebuild
seed binary.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MEDIUM-002: Remove manual X-Forwarded-For parsing in metrics_protection.go,
use c.ClientIP() only (respects SetTrustedProxies)
MEDIUM-003: Pin ClamAV Docker image to 1.4 across all compose files
MEDIUM-004: Add clampLimit(100) to 15+ handlers that parsed limit directly
MEDIUM-006: Remove unsafe-eval from CSP script-src on Swagger routes
MEDIUM-007: Pin all GitHub Actions to SHA in 11 workflow files
MEDIUM-008: Replace rabbitmq:3-management-alpine with rabbitmq:3-alpine in prod
MEDIUM-009: Add trial-already-used check in subscription service
MEDIUM-010: Add 60s periodic token re-validation to WebSocket connections
MEDIUM-011: Mask email in auth handler logs with maskEmail() helper
MEDIUM-012: Add k-anonymity threshold (k=5) to playback analytics stats
LOW-001: Align frontend password policy to 12 chars (matching backend)
LOW-003: Replace deprecated dotenv with dotenvy crate in Rust stream server
LOW-004: Enable xpack.security in Elasticsearch dev/local compose files
LOW-005: Accept context.Context in CleanupExpiredSessions instead of Background()
LOW-002: Noted — Hyperswitch version update deferred (requires payment integration tests)
29/30 findings remediated. 1 noted (LOW-002).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Course CRUD with slug generation, publish/archive lifecycle
- Lesson management with ordering and transcoding status
- Enrollment system with duplicate prevention
- Progress tracking with auto-completion at 90%
- Certificate issuance requiring full course completion
- Course reviews with rating aggregation
- Unit tests for service and handler layers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Distribution module: submit tracks to Spotify, Apple Music, Deezer
- Subscription eligibility check (Creator/Premium only)
- Distribution status tracking with platform-specific statuses
- Status history audit trail
- External streaming royalties import and aggregation
- Distributor provider interface for DistroKid/TuneCore integration
- Handler and service unit tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add subscription module (models, service, tests)
- Plans: Free, Creator ($9.99/mo), Premium ($19.99/mo)
- Features: subscribe, cancel, reactivate, change billing cycle
- 14-day trial for Premium plan
- Upgrade immediate, downgrade at period end
- Invoice tracking and history
- Handler tests for auth and validation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Export: table data_exports, POST /me/export (202), GET /me/exports, messages+playback_history
- Notification email quand ZIP prêt, rate limit 3/jour
- Suppression: keep_public_tracks, anonymisation PII complète (users, user_profiles)
- HardDeleteWorker: final anonymization après 30 jours
- Frontend: POST export, checkbox keep_public_tracks
- MSW handlers pour Storybook