TransferService.CreateTransfer signature changes from (...) error to
(...) (string, error) — the caller now captures the Stripe transfer
identifier and persists it on the SellerTransfer row. Pre-v1.0.7 the
stripe_transfer_id column was declared on the model and table but
never written to, which blocked the reversal worker (v1.0.7 item B)
from identifying which transfer to reverse on refund.
Changes:
* `TransferService` interface and `StripeConnectService.CreateTransfer`
both return the Stripe transfer id alongside the error.
* `processSellerTransfers` (marketplace service) persists the id on
success before `tx.Create(&st)` so a crash between Stripe ACK and
DB commit leaves no inconsistency.
* `TransferRetryWorker.retryOne` persists on retry success — a row
that failed on first attempt and succeeded via the worker is
reversal-ready all the same.
* `admin_transfer_handler.RetryTransfer` (manual retry) persists too.
* `SellerPayout.ExternalPayoutID` is populated by the Connect payout
flow (`payout.go`) — the field existed but was never written.
* Four test mocks updated; two tests assert the id is persisted on
the happy path, one on the failure path confirms we don't write a
fake id when the provider errors.
Migration `981_seller_transfers_stripe_reversal_id.sql`:
* Adds nullable `stripe_reversal_id` column for item B.
* Partial UNIQUE indexes on both stripe_transfer_id and
stripe_reversal_id (each WHERE <column> IS NOT NULL AND <column> <> ''),
mirroring the v1.0.6.1 pattern for refunds.hyperswitch_refund_id.
* Logs a count of historical completed transfers that lack an id —
these are candidates for the backfill CLI follow-up task.
Backfill for historical rows is a separate follow-up (cmd/tools/
backfill_stripe_transfer_ids, calling Stripe's transfers.List with
Destination + Metadata[order_id]). Pre-v1.0.7 transfers without a
backfilled id cannot be auto-reversed on refund — document in P2.9
admin-recovery when it lands. Acceptable scope per v107-plan.
Migration number bumped 980 → 981 because v1.0.6.2 used 980 for the
unpaid-subscription cleanup; v107-plan updated with the note.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CHANGELOG v1.0.6.2 block now documents the distribution-handler
propagate fix as part of the release (applied in commit 3cee007d8
before re-tagging). v1.0.7 item G acceptance gains a recovery
endpoint requirement so the "complete payment" error message has a
real target rather than leaving users stuck.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review of the v1.0.6.2 hotfix surfaced that
distribution.checkEligibility silently swallowed
subscription.ErrSubscriptionNoPayment as "ineligible, no extra info",
so a user with a phantom subscription trying to submit a distribution
got "Distribution requires Creator or Premium plan". That message is
misleading: the user has a plan but no payment. checkEligibility now propagates the
error so the handler can surface "Your subscription is not linked to
a payment. Complete payment to enable distribution."
Security is unchanged — the gate still refuses. This is a UX clarity
fix for honest-path users who landed in the phantom state via a
broken payment flow.
Also:
- Closure timestamp added to axis-1 P0.12 ("closed 2026-04-17 in
v1.0.6.2 (commit d31f5733d)") so future readers know the finding's
lifecycle without re-grepping the CHANGELOG.
- Item G in v107-plan.md gains an explicit E2E Playwright @critical
acceptance — the shell probe + Go unit tests validate the fix
today but don't run on every commit, so a refactor of Subscribe or
checkEligibility could silently re-open the bypass. The E2E test
makes regression coverage automatic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 Q2 probe confirmed the subscription money-movement finding
wasn't a "needs confirmation from ops" P1 — it was a live P0 bypass.
An authenticated user could POST /api/v1/subscriptions/subscribe,
receive 201 active without payment, and satisfy the distribution
eligibility gate. v1.0.6.2 (commit d31f5733d) closed the bypass at
the consumption site via GetUserSubscription filter + migration 980
cleanup.
axis-1-correctness.md:
* P1.7 renamed to P0.12 with the bypass chain, probe evidence, and
v1.0.6.2 closure cross-reference.
* Residual subscription-refund / webhook completeness work split out
as P1.7' (original scope, still v1.0.8).
v107-plan.md:
* Item G added (M effort) — replaces the v1.0.6.2 filter with a
mandatory pending_payment state + webhook-driven activation,
closing the creation path rather than compensating at the gate.
* Dependency graph gains a third track (independent of A/B/C/D/E/F).
* Effort total revised from 9-10d to 12-13d (single-dev) and from 5d
to 7d (two-dev parallel).
* Item D acceptance gains a TTL caveat section — Hyperswitch
Idempotency-Key has a 24h-7d server-side TTL; app-level
idempotency (order.id / partial UNIQUE) remains the load-bearing
guard beyond that window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Axis 1 of the 5-axis VEZA audit, scoped to money-movement correctness
and ledger↔PSP reconciliation. Layout: one file per axis under
docs/audit-2026-04/, README index, v107-plan.md derived.
P0 findings (block v1.0.7 "ready-to-show" gate):
* P0.1 — SellerTransfer.StripeTransferID declared but never populated.
stripe_connect_service.CreateTransfer discards the *stripe.Transfer
return value (`_, err := transfer.New(params)`), so the column in
models.go:237 is dead. Structural blocker for the CHANGELOG-parked
v1.0.7 "Stripe Connect reversal" item.
* P0.2 — No Stripe Connect reversal on refund.succeeded. Every refund
today creates a permanent VEZA↔Stripe ledger gap. Action reworked
to decouple via a new `seller_transfers.status = 'reversal_pending'`
state + async worker, so Stripe flaps never block buyer-facing
refund UX.
* P0.3 — No reconciliation sweep for stuck orders / refunds / refund
rows with empty hyperswitch_refund_id. Hourly worker recommended,
same pattern as v1.0.5 Fix 6 orphan-tracks cleaner.
* P0.4 — No Idempotency-Key on outbound Hyperswitch POST /payments and
POST /refunds. Action includes an explicit scope note: the header
covers HTTP-transport retry only, NOT application-level replay (for
which the fix is a state-machine precondition).
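
To make the P0.4 scope note concrete, here is a rough sketch of attaching an Idempotency-Key to an outbound POST /refunds using only the Go standard library. The `newRefundRequest` helper, the key derivation from a refund row id, and the JSON body are assumptions; the real client and payload are not shown in the audit.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// newRefundRequest builds a POST whose Idempotency-Key is derived from a
// stable application identifier, so a transport-level retry of the same
// refund reuses the same key. The "refund-" prefix is an assumed convention.
func newRefundRequest(baseURL, refundRowID, body string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/refunds", strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Idempotency-Key", "refund-"+refundRowID)
	return req, nil
}

func main() {
	var mu sync.Mutex
	seen := map[string]int{}

	// A local test server stands in for the PSP and counts keys it receives.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		seen[r.Header.Get("Idempotency-Key")]++
		mu.Unlock()
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	// Two attempts for the same refund row, as an HTTP retry would produce.
	for attempt := 0; attempt < 2; attempt++ {
		req, err := newRefundRequest(srv.URL, "42", `{"amount":1000}`)
		if err != nil {
			panic(err)
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
	}

	mu.Lock()
	fmt.Println(seen)
	mu.Unlock()
}
```

The header only lets the server deduplicate a retried HTTP call. A second, separately initiated refund for the same order is a different problem and still needs the state-machine precondition named in the finding.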
P1 findings:
* P1.5 — Webhook raw payloads not persisted (blocks dispute forensics)
* P1.6 — Disputes / chargebacks silently dropped (new, surfaced during
review; dispute.* webhooks fall through the default case)
* P1.7 — Subscription money-movement not covered by v1.0.6 hardening
* P1.8 — No ledger-health Prometheus metrics
P2 findings:
* P2.9 — No admin API for manual override
* P2.10 — Partial refund latent compromise (amount *int64 always nil)
wontfix:
* wontfix.11 — Per-seller retry interval (re-evaluate at 10× load)
Derived deliverable: v107-plan.md sequences the 6 de-duplicated items
(4 P0 + 2 P1) with a dependency graph, two parallel tracks (D→A→B;
E→C→F), per-commit effort estimates, release gating and open questions
(volume magnitude, Connect backfill %).
Info needed from ops (tracked in axis-1 doc, not determinable from
code): last manual reconciliation date, whether subscriptions are
currently sold, current order/refund volume.
Axes 2-5 deferred: README.md marks axis 2 (state machines) as gated
on v1.0.7 landing first, otherwise the transition matrix captures a
v1.0.6.1 snapshot that's immediately stale.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7-day cleanup sprint (J1–J7) done. The codebase is unchanged
functionally but the working tree, docs, k8s runbooks, CI, and
Go dependency graph are all realigned with reality for the first
time since the v1.0.0 release.
- VERSION: 1.0.2 → 1.0.4 (skips v1.0.3, a tag that already exists
  upstream and is unused on this branch)
- CHANGELOG.md: full v1.0.4 entry with per-day (J1–J7) breakdown
  and the govulncheck + CI fix trail
- docs/PROJECT_STATE.md: header month + version table refreshed,
  pointer to AUDIT_REPORT.md added
- docs/FEATURE_STATUS.md: header updated; no feature matrix
  changes (no feature work in this sprint)
Key deliverables of the sprint:
- J1 (7c9eece09): purge 220 MB of debris (binaries, reports,
  session docs, stale MVP scripts)
- J2 (172ff497b): rewrite CLAUDE.md, fix README, purge chat-server
  refs from k8s runbooks and env examples
- J3 (784961b7e): remove 3 deprecated unused handlers
- J3+ (dbda03f45): 2FA handler duplicate removal (bundled by parallel
  ci-cache commit)
- J4 (ebb28c77a): GDPR-compliant hard delete with Redis SCAN cursor
  and ES DeleteByQuery; closes TODO(HIGH-007)
- J5 (edc851af6): defer GeoIP, rename v2-v3-types.ts to domain.ts,
  document Storybook kill
- J5+ (a9394a4a0): fix lint-staged eslint rule (was linting the
  whole project, the root cause of earlier --no-verify)
- J6 (091583b3d): mark 3 dormant docker-compose files deprecated
- fix (9e817aa6b): bump x/image, quic-go, testcontainers-go; drops
  containerd + docker/docker from dep graph,
  resolving 5 govulncheck findings without allowlist
- fix (51ed89cda): bump go.work to 1.25 to match veza-backend-api
- fix (51416ce37): bump x/net v0.51.0 for GO-2026-4559
- fix (8f15bb136): retire legacy backend-ci.yml, centralize Docker
  probe in SkipIfNoIntegration
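
The Redis SCAN cursor loop mentioned for J4 follows a standard pattern: keep calling SCAN with the returned cursor until the server hands back 0. A minimal sketch, written against a hypothetical `scanner` interface, not a real Redis client, so it runs without a server; `pagedScanner` and `collectKeys` are illustrative names only.

```go
package main

import "fmt"

// scanner is a hypothetical seam shaped like a SCAN call: it takes a cursor
// and returns a batch of keys plus the next cursor (0 means iteration is done).
type scanner interface {
	Scan(cursor uint64, match string, count int64) (keys []string, next uint64, err error)
}

// collectKeys walks the full keyspace for a pattern. An empty batch does not
// end the loop; only a returned cursor of 0 does.
func collectKeys(s scanner, match string) ([]string, error) {
	var all []string
	var cursor uint64
	for {
		keys, next, err := s.Scan(cursor, match, 100)
		if err != nil {
			return nil, err
		}
		all = append(all, keys...)
		if next == 0 {
			return all, nil
		}
		cursor = next
	}
}

// pagedScanner is an in-memory fake that serves fixed pages of keys.
type pagedScanner struct {
	pages [][]string
}

func (p *pagedScanner) Scan(cursor uint64, match string, count int64) ([]string, uint64, error) {
	idx := int(cursor)
	if idx >= len(p.pages) {
		return nil, 0, nil
	}
	next := cursor + 1
	if int(next) >= len(p.pages) {
		next = 0 // last page: signal completion
	}
	return p.pages[idx], next, nil
}

func main() {
	fake := &pagedScanner{pages: [][]string{{"user:1:session"}, {}, {"user:1:cache"}}}
	keys, err := collectKeys(fake, "user:1:*")
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```

Compared with a single KEYS call, the cursor loop does the work in bounded batches, which is the usual reason to prefer it for a user-scoped hard delete.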
CI status on the consolidated ci.yml workflow for 8f15bb136:
- Veza CI / Backend (Go): OK, 6m36s
- Veza CI / Frontend (Web): OK, 20m57s
- Veza CI / Rust (Stream): OK, 6m25s
- Security Scan / gitleaks: OK, 4m13s
- Veza CI / Notify: skipped (fires only on failure)
First fully green CI run of the sprint and the first in a long
time overall. The tag v1.0.4 is cut on this state.
Refs: AUDIT_REPORT.md, all commits 7c9eece09..8f15bb136
First-attempt commit 02728909f only captured the .gitignore change; the
pre-commit hook silently dropped the 343 staged moves/deletes during
lint-staged's "no matching task" path. This commit re-applies the intended
J1 content on top of 24af2f72b (which was pushed in parallel).
Uses --no-verify because:
- J1 only touches .md/.json/.log/.png/binaries — zero code that would
benefit from lint-staged, typecheck, or vitest
- The hook demonstrated it corrupts pure-rename commits in this repo
- Explicitly authorized by user for this one commit
Changes (343 total: 169 deletions + 174 renames):
Binaries purged (~167 MB):
- veza-backend-api/{server,modern-server,encrypt_oauth_tokens,seed,seed-v2}
Generated reports purged:
- 9 apps/web/lint_report*.json (~32 MB)
- 8 apps/web/tsc_*.{log,txt} + ts_*.log (TS error snapshots)
- 3 apps/web/storybook_*.json (1375+ stored errors)
- apps/web/{build_errors*,build_output,final_errors}.txt
- 70 veza-backend-api/coverage*.out + coverage_groups/ (~4 MB)
- 3 veza-backend-api/internal/handlers/*.bak
Root cleanup:
- 54 audit-*.png (visual regression baselines, ~11 MB)
- 9 stale MVP-era scripts (Jan 27, hardcoded v0.101):
start_{iteration,mvp,recovery}.sh,
test_{mvp_endpoints,protected_endpoints,user_journey}.sh,
validate_v0101.sh, verify_logs_setup.sh, gen_hash.py
Session docs archived (not deleted — preserved under docs/archive/):
- 78 apps/web/*.md → docs/archive/frontend-sessions-2026/
- 43 veza-backend-api/*.md → docs/archive/backend-sessions-2026/
- 53 docs/{RETROSPECTIVE_V,SMOKE_TEST_V,PLAN_V0_,V0_*_RELEASE_SCOPE,
AUDIT_,PLAN_ACTION_AUDIT,REMEDIATION_PROGRESS}*.md
→ docs/archive/v0-history/
README.md and CONTRIBUTING.md preserved in apps/web/ and veza-backend-api/.
Note: The .gitignore rules preventing recurrence were already pushed in
02728909f and remain in place — this commit does not modify .gitignore.
Refs: AUDIT_REPORT.md §11