senke/veza

senke 26cb523334 fix(distribution,audit): propagate ErrSubscriptionNoPayment to handler + P0.12 closure date + E2E regression TODO

Self-review of the v1.0.6.2 hotfix surfaced that
distribution.checkEligibility silently swallowed
subscription.ErrSubscriptionNoPayment as "ineligible, no extra info",
so a user with a fantôme subscription trying to submit a distribution
got "Distribution requires Creator or Premium plan" — misleading, the
user has a plan but no payment. checkEligibility now propagates the
error so the handler can surface "Your subscription is not linked to
a payment. Complete payment to enable distribution."

Security is unchanged — the gate still refuses. This is a UX clarity
fix for honest-path users who landed in the fantôme state via a
broken payment flow.

Also:
- Closure timestamp added to axis-1 P0.12 ("closed 2026-04-17 in
  v1.0.6.2 (commit 9a8d2a4e7)") so future readers know the finding's
  lifecycle without re-grepping the CHANGELOG.
- Item G in v107-plan.md gains an explicit E2E Playwright @critical
  acceptance — the shell probe + Go unit tests validate the fix
  today but don't run on every commit, so a refactor of Subscribe or
  checkEligibility could silently re-open the bypass. The E2E test
  makes regression coverage automatic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-17 12:43:21 +02:00

27 KiB

Raw Permalink Blame History

Axis 1 — Correctness / accounting

See README.md for scope, auditor, source state and reading conventions.

Surface summary

PSP	Package	Endpoints called	Env gate
Hyperswitch (payments + refunds)	`internal/services/hyperswitch/`	`POST /payments`, `GET /payments/:id`, `POST /refunds`	`HYPERSWITCH_ENABLED`
Stripe Connect (seller payouts + transfers)	`internal/services/stripe_connect_service.go`	`POST /v1/account_links`, `GET /v1/accounts/:id`, `GET /v1/balance`, `POST /v1/transfers`	`STRIPE_CONNECT_ENABLED`

Table	Status column	Terminal states	Non-terminal
`orders`	`status`	completed, failed, cancelled, refunded	pending, refund_pending
`refunds` (v1.0.6)	`status`	succeeded, failed	pending
`seller_transfers`	`status`	completed, permanently_failed, reversed	pending, failed
`seller_payouts`	`status`	completed, failed	pending, processing
`licenses`	`revoked_at` (nullable)	NULL (active) / timestamp (revoked)	—
`user_subscriptions`	`status` (enum)	canceled, expired	active, trialing, past_due

Webhook entry: POST /api/v1/webhooks/hyperswitch (routes_webhooks.go:55), HMAC-SHA512 verified, dispatches to ProcessPaymentWebhook or ProcessRefundWebhook via HyperswitchWebhookPayload.IsRefundEvent() (service.go:700).

Existing reconciliation: only the TransferRetryWorker for seller_transfers.status='failed' (transfer_retry.go:57), interval TRANSFER_RETRY_INTERVAL default 5m (config.go:399).

P0.1 — `SellerTransfer.StripeTransferID` declared but never populated

Evidence

Column declared: internal/core/marketplace/models.go:237 StripeTransferID string \gorm:"size:255" json:"stripe_transfer_id,omitempty"``
Zero write-sites (grep 'StripeTransferID\s*=' returns 0 hits in production code).
Upstream returns the transfer but we discard it: internal/services/stripe_connect_service.go:205 _, err := transfer.New(params) — the *stripe.Transfer pointer carrying .ID is underscored away.
CreateTransfer signature returns only error (internal/services/stripe_connect_service.go:179), so callers physically cannot persist the id even if they wanted to.

Consequence Direct blocker for the v1.0.6 CHANGELOG-documented "parked v1.0.7 — Stripe Connect Transfers:reversal". When anyone starts that work, there is no stripe_transfer_id in the DB to pass to transfers.CreateReversal(...). The only fallbacks would be:

Query Stripe with transfers.List(Destination=account_id, Metadata[order_id]=...) — fragile, rate-limited, no unique guarantee.
Do a full upstream refactor of the transfer worker — larger scope than the TODO implies.

This is exactly the "temporary compromise frozen into permanent" pattern the audit charter targets.

Action (prerequisite for any v1.0.7 reversal work)

Change TransferService.CreateTransfer signature: (ctx, sellerID, amount, currency, orderID string) (stripeTransferID string, err error).
Update processSellerTransfers (service.go:762) to capture the id and set st.StripeTransferID = stripeTransferID before tx.Create(&st).
Update TransferRetryWorker.retryOne (transfer_retry.go:116) to persist the id on successful retry too.

Criticity — P0. Must land before the reversal commit, not with it.

P0.2 — No Stripe Connect reversal on refund.succeeded

Evidence

internal/core/marketplace/service.go:1529 reverseSellerAccounting does three things on refund webhook succeeded:
1. DebitSellerBalance (in-DB balance row decrement)
2. seller_transfers.status = "reversed" (in-DB ledger flag)
3. Nothing else.
The CHANGELOG v1.0.6 entry explicitly flags this: "Stripe Connect Transfers:reversal call flagged TODO(v1.0.7). [...] the missing piece is the money-movement round-trip at Stripe."

Consequence For every refunded order, the buyer is credited by Hyperswitch (verified via webhook) and the VEZA internal seller balance is debited correctly. But the Stripe-Connect-side transfer that moved the seller's share into their connected account is not reversed. The seller's Stripe account still holds that money. VEZA's books say "debited", Stripe says "paid out". On every refund this creates a real reconciliation gap that a seller's monthly Stripe statement will show.

This isn't a ledger-consistency issue in VEZA's own DB — it's a VEZA-vs-Stripe divergence. It grows linearly with refund volume.

Action — decoupled from the buyer-refund critical path The intuitive fix ("call transfers.NewReversal inside reverseSellerAccounting and return on error to let the webhook retry") is wrong: Hyperswitch has already credited the buyer, the buyer is watching their order page, and a Stripe flap would leave the buyer in refund_pending until Stripe comes back. That couples buyer UX to the weakest dependency in the whole flow.

Correct pattern — split the reversal into two concerns, with an intermediate state that says "VEZA books updated, PSP reversal still owed":

Migration — extend seller_transfers.status to include a new value reversal_pending. Update the state-value documentation comment in internal/core/marketplace/models.go:242.
Refund finalization (service.go:1466 finalizeSuccessfulRefund) — no change to its flow except: instead of transitioning seller_transfers.status directly to reversed in reverseSellerAccounting, transition to reversal_pending and set next_retry_at = NOW(). The buyer-facing flow finalizes immediately: order=refunded, licenses revoked, webhook returns 200.
New worker StripeReversalWorker — scans seller_transfers WHERE status='reversal_pending' AND (next_retry_at IS NULL OR next_retry_at <= NOW()). For each row:
- if StripeTransferID == "" (see P0.1), skip + log ERROR "cannot reverse — transfer id missing", alert ops.
- else call transfer.NewReversal(stripeTransferID, &stripe.TransferReversalParams{Amount: amountCents}).
- on success: status='reversed', persist the reversal id on a new stripe_reversal_id column.
- on Stripe 404 (transfer not found): assume already reversed out-of-band, set status='reversed', log INFO.
- on any other error: exponential backoff on next_retry_at, increment retry_count. Hit the same permanently_failed ceiling the TransferRetryWorker uses today.

This gives us three properties:

Buyer refund never blocks on Stripe health.
VEZA accounting is consistent with Hyperswitch refund state at all times.
The VEZA↔Stripe gap is bounded (every reversal either completes or ends in permanently_failed with a clear ops action).

Criticity — P0. Pre-public-opening of marketplace.

P0.3 — No reconciliation sweep for stuck pending states

Evidence

Inventory search returned zero scheduled jobs that compare DB state to PSP state for payments or refunds.
TransferRetryWorker exists (transfer_retry.go) but only reconciles seller_transfers — not orders, not refunds.
Agent inventory: "No startup sanity checks comparing DB to PSP for orders/refunds. No admin endpoints for manual order/refund/transfer verification. No sweeps for stuck orders in refund_pending or orders exceeding 14-day deadline."

Consequence If a Hyperswitch webhook is never delivered — they've been down, our /webhooks/hyperswitch was down during their retry window, or their delivery system has a bug — then:

An order stuck in pending stays forever. Customer paid, got no license. Only surfaces via customer complaint.
A refund stuck in refund_pending stays forever. Customer asked for refund, got nothing. Order stuck in refund_pending also blocks re-attempts (ErrRefundAlreadyRequested at service.go:1303).
A refund row with empty hyperswitch_refund_id (Phase 1 completed, Phase 2 crashed between PSP call and DB UPDATE) has no way to correlate to the PSP. Nothing queries them.

Action Hourly worker (same pattern as CleanupOrphanTracks from v1.0.5 Fix 6):

// internal/jobs/reconcile_hyperswitch.go
func ReconcilePendingOrders(ctx context.Context, db *gorm.DB, hs *hyperswitch.Client) error {
    // Orders in 'pending' older than 30m: GET /payments/:id, sync
    // Refunds in 'pending' > 30m with non-empty refund_id: GET /refunds/:id, sync
    // Refunds in 'pending' > 5m with EMPTY refund_id: mark failed, roll back order
}

Wire in cmd/api/main.go next to ScheduleOrphanTracksCleanup.

Criticity — P0. The gap is open today on every deployed instance.

P0.4 — `CreatePayment` / `CreateRefund` not idempotent at the HTTP layer

Evidence

internal/services/hyperswitch/client.go:73-78 and internal/services/hyperswitch/client.go:170-175 — only Content-Type and api-key headers are set. No Idempotency-Key.
Hyperswitch's API supports an Idempotency-Key header (standard PSP convention).

Consequence If the HTTP call from VEZA to Hyperswitch retries at the transport layer (curl-level / net/http Transport-level retry, proxy retry, TLS reconnect after a flap) during a single logical call, a second payment or refund object can be created for the same intent. Customer double-charged, or two refunds issued. The order row's hyperswitch_payment_id column is set to the first payment_id we hear back about, so the second shadow payment is orphaned in Hyperswitch with no VEZA row pointing at it.

Scope note — what Idempotency-Key does and does NOT fix This header protects against HTTP-transport retry only. It does not fix, and should not be conflated with:

Application-level replay — buyer clicks "Pay" twice, frontend double-submits a form, VEZA application code retries after a DB transaction aborts. These paths create distinct order.id values, so each call sees a different natural key and the PSP legitimately creates different payments. The fix there is a state-machine precondition (order dedup by (buyer_id, cart_hash) within a window, frontend debounce, CSRF single-use token), not the header.
Crash-between-call-and-save — VEZA POST succeeds, Hyperswitch returns payment_id, VEZA crashes before persisting it. A subsequent application retry with the same order.id does recover thanks to the Idempotency-Key (Hyperswitch returns the cached response), but this only works if the application retry re-uses the same order row. That requires the order row to exist in a pending_payment_submitted intermediate state before the PSP call — which the current code doesn't have. Documenting as a follow-up.

Action Add Idempotency-Key: <natural key> header on every mutating call:

CreatePayment → order_id (UUID, guaranteed unique per order row).
CreateRefund → refund.ID (UUID from the pending refunds row — populated before the PSP call in Phase 1, so the same key is used across any HTTP-level retry within that call).

The http.Client in hyperswitch.Client should NOT set this implicitly; callers must pass the key explicitly so the semantic is auditable at each call site. Two changes, each ~3 lines.

The state-machine-level fix for application retry is out of scope here; it goes on the v1.0.8+ roadmap (documented in v107-plan.md).

Criticity — P0. Low probability per call but unbounded blast radius (customer double-charge).

P1.5 — Webhook raw payloads not persisted

Evidence

routes_webhooks.go:59-82 reads body, verifies signature, dispatches.
No table or log file captures the raw webhook body, signature header, or processing result for later audit.

Consequence When a customer disputes ("I never got refunded"), we cannot reproduce what Hyperswitch told us. The only evidence is in the backend log at whatever retention the host keeps — and structured logs don't carry the raw body (we explicitly redact some fields for security).

Action New table hyperswitch_webhook_log:

CREATE TABLE hyperswitch_webhook_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    signature_header TEXT NOT NULL,
    signature_valid BOOLEAN NOT NULL,
    event_type TEXT,
    payment_id TEXT,
    refund_id TEXT,
    payload BYTEA NOT NULL,
    processing_result TEXT NOT NULL,  -- 'ok', 'error: <msg>', 'skipped'
    processed_at TIMESTAMPTZ
);
CREATE INDEX ON hyperswitch_webhook_log (payment_id) WHERE payment_id IS NOT NULL;
CREATE INDEX ON hyperswitch_webhook_log (refund_id) WHERE refund_id IS NOT NULL;

Insert in routes_webhooks.go:hyperswitchWebhookHandler before signature verification — so we capture attack attempts too. Retention 90 days via a separate sweep job.

Criticity — P1. Not ledger-diverging, but the first ops question on any refund/dispute claim is "what did the webhook say", and today we can't answer it.

P1.6 — Disputes / chargebacks silently dropped

Evidence

HyperswitchWebhookPayload.IsRefundEvent() (service.go:700) only distinguishes refund-vs-non-refund. No dispute discriminator.
Zero grep hits for dispute|chargeback|disputed in internal/core/marketplace/, internal/services/hyperswitch/, or anywhere in the webhook handler path.
Hyperswitch emits dispute.opened, dispute.evidence_required, dispute.won, dispute.lost, dispute.expired — all land on POST /webhooks/hyperswitch and currently fall through ProcessPaymentWebhook's switch to the default case (service.go:718), which logs Debug "ignoring status" and returns 200.

Consequence A dispute/chargeback is the case where the ledger diverges most violently, because:

It's not initiated by VEZA — we learn about it passively when the buyer's bank pushes a reversal through the card network.
On dispute.lost (the common case if we don't respond in time), Hyperswitch debits our acquirer balance for the original amount plus a chargeback fee (typically 15-25€ depending on acquirer).
The VEZA orders row stays completed, the license stays active, the seller stays paid. Buyer has their money back via the card network. Every table in VEZA thinks everything is fine. The only signal is the acquirer statement, and it's weeks delayed.
Chargeback rate is a seller health metric: an account with 1% CB rate gets flagged by Visa/MC and the whole VEZA merchant account can be put under review. Without counting disputes per-seller, we have no early-warning system.

Action (staged) Minimum (ships with P1.5 webhook log): every dispute.* event lands in hyperswitch_webhook_log with the payload captured. Ops can query.

Recommended (v1.0.8 sprint):

New disputes table:

CREATE TABLE disputes (
    id UUID PRIMARY KEY,
    order_id UUID NOT NULL REFERENCES orders(id),
    hyperswitch_dispute_id TEXT UNIQUE NOT NULL,
    status TEXT NOT NULL, -- opened, evidence_required, won, lost, expired
    amount_cents BIGINT NOT NULL,
    reason TEXT,
    evidence_due_at TIMESTAMPTZ,
    opened_at TIMESTAMPTZ NOT NULL,
    resolved_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Extend IsRefundEvent() discriminator to a three-way EventKind() that returns payment | refund | dispute.
New ProcessDisputeWebhook — mirrors ProcessRefundWebhook's idempotency pattern (unique hyperswitch_dispute_id + row lock).
On dispute.lost: same seller-side reversal machinery as refund (P0.2), because the money effect is identical.
Per-seller chargeback counter on seller_balances or a new seller_risk_metrics table, exposed to admin UI.

Criticity — P1 minimum (capture + log). P0 if volume justifies it once public-opened. At current volume (pre-opening), accepting the gap is defensible; post-opening, a single dispute we miss is worse than the engineering cost of handling it.

P1.7 → P0.12 — Subscription payment-gate bypass (escalated 2026-04-17)

Re-classified from P1.7 to P0 on 2026-04-17 after probing ops Q2. The probe did not confirm "subscriptions are sold" — it confirmed "subscriptions are activated server-side without payment, and that state satisfies a paid-feature gate". Criticity moved from "P1 hardening if subs are live" to "P0 security — bypass is live in any environment where HYPERSWITCH_ENABLED=false (which is the current dev/staging default and was the prod default of every env variable default before v1.0.6.2)". Hotfix v1.0.6.2 ships the gate fix (GetUserSubscription filter + migration 980 cleanup). The subscription-refund / webhook work (original P1.7 scope) remains and splits out below as P1.7' for v1.0.8.

Evidence (bypass chain, confirmed by probe)

POST /api/v1/subscriptions/subscribe with plan_id=<Creator> from any authenticated user (role=user, no KYC, no email verification re-check beyond initial signup) returns HTTP 201 status=active.
subscription/service.go:256 sub.Status = StatusActive is assigned unconditionally before the payment call.
subscription/service.go:279 if s.paymentProvider != nil — payment call is conditional; when Hyperswitch is disabled (the default for HYPERSWITCH_ENABLED=false), the row persists active with no PSP reference written.
subscription/service.go:101 (pre-hotfix) WHERE status IN ('active', 'trialing') — the fantôme row satisfies this filter, GetUserSubscription returns it to callers.
distribution/service.go:253 gate returns sub.Plan.HasDistribution || sub.Plan.CanSellOnMarketplace. Creator plan has can_sell_on_marketplace=true (migration 949 line 79) → gate passes → user reaches POST /distribution/submit.
user_subscriptions.hyperswitch_subscription_id declared (models.go:95) but never written anywhere in the codebase — no code path populates it, so it cannot be used as a post-hoc "payment confirmed" signal either.
Probe artefact (captured, cleaned): row 906c585d-1afb-49e5-ac96-5eb5411cda99 created 2026-04-17 via authenticated POST /subscribe from smoke.0416@veza.local (role=user), HS IDs empty. Voided by migration 980 at v1.0.6.2 deploy.

Consequence Any authenticated user gains access to POST /distribution/submit, which dispatches to external distribution partners. Three harms:

Reputational — external partners receive distribution submissions from non-paying VEZA accounts.
Financial — if the distribution workflow triggers billable API calls to the partner platform (depends on workflow, worth auditing as a separate finding under axis 3).
Contractual — VEZA's agreements with partners likely assume "paid-subscriber submissions only". The role column is not affected (user stays user, no bypass of v1.0.6 creator self-service gate), and free-tier upload is unaffected — the specific gate bypassed is the has_distribution || can_sell_on_marketplace check on paid plans.

Action — shipped in v1.0.6.2 (commit d31f5733d)

subscription/service.go GetUserSubscription now routes through hasEffectivePayment — effective payment = free plan OR unexpired trial OR invoice with non-empty hyperswitch_payment_id.
ErrSubscriptionNoPayment added; distribution.checkEligibility and the four subscription handlers map it to "no active subscription" (with needs_payment=true on GET /me/subscription so honest-path users get actionable info).
Migration 980_void_unpaid_subscriptions.sql sweeps pre-existing fantôme rows to status='expired' with audit table voided_subscriptions_20260417.
Probe script scripts/probes/subscription-unpaid-activation.sh kept as versioned regression test.
TODO(v1.0.7-item-G) annotation on subscription/service.go:createNewSubscription marks the deferred work: replace the if s.paymentProvider != nil short-circuit with a mandatory pending_payment state + webhook-driven activation (item G in the v1.0.7 plan).

Criticity — P0, closed 2026-04-17 in v1.0.6.2 (commit d31f5733d). Item G in v1.0.7 hardens the creation path end-to-end.

P1.7' — Subscription money-movement not covered by v1.0.6 hardening (original scope, deferred to v1.0.8)

Evidence

Subscription model has Hyperswitch keys: subscription/models.go:95 HyperswitchSubscriptionID, line 96 HyperswitchCustomerID, line 136 HyperswitchPaymentID.
Status enum: active, trialing, canceled, past_due, expired (subscription/models.go:20-28).
Service.Subscribe (subscription/service.go:149) and Service.CancelSubscription (line 390) exist.
Zero refund code in the subscription package. Cancellation marks the row canceled but does not initiate any PSP reimbursement.
No visible webhook handler branch for subscription.* events — the dispatcher only knows payment vs refund.
No code path was found that transitions a subscription INTO past_due (which implies it's driven by an external event the webhook isn't consuming, or it's dead state).

Consequence The marketplace refund hardening only covers one-shot purchases. A subscription cancellation inside the paid period currently doesn't refund anything, and a failed recurring charge doesn't move the subscription to past_due because nothing wires it up.

Action Dedicated sprint for subscription state-machine hardening. Scope for v1.0.8. Breakdown:

Inventory every state in user_subscriptions.status with evidence of which code path lands there.
Handle subscription.payment_succeeded, subscription.payment_failed, subscription.canceled webhook events in a ProcessSubscriptionWebhook.
Decide product policy on mid-period cancellation (pro-rata refund? honour until end of period? forfeit?).

Criticity — P1 for the hardening. v1.0.6.2 closes the bypass half; this half (refund / webhook completeness) is legitimately v1.0.8 scope and does not produce a bypass on its own.

P1.8 — No ledger-health metrics

Evidence No grep hit for prometheus.*counter.*order|refund|transfer|payout in internal/metrics/. Agent confirms no dashboards.

Consequence Ledger state is invisible until someone complains. Specifically:

Count of orders in pending older than 30 min (webhook never arrived)
Count of orders in refund_pending older than 30 min
Count of seller_transfers in failed at max retry_count
Count of refunds with empty hyperswitch_refund_id older than 5 min
Count of seller_transfers in reversal_pending older than 30 min (new state from P0.2)
Oldest timestamp in each of the above

Action Prometheus gauge metrics updated by a 60s sampler (cheap query each):

ledger_stuck_orders_pending_total{age_bucket="30m_1h|1h_1d|1d+"}
ledger_stuck_refunds_pending_total{age_bucket=...}
ledger_failed_transfers_at_max_retry_total
ledger_orphan_refund_rows_total        // empty hyperswitch_refund_id
ledger_reversal_pending_transfers_total

Grafana panel with 5 number panels + 5 time-series. Alert rules:

ledger_stuck_orders_pending_total > 0 for 10m → page
ledger_orphan_refund_rows_total > 0 for 5m → page (= bug in the two-phase commit between DB and PSP)

Criticity — P1. Ships with P0.3 (the reconciliation sweep) so the metrics track the sweep's effectiveness.

P2.9 — No admin API for manual refund / order override

Evidence No admin endpoint found that can force an order or refund into a specific terminal state. Ops would have to issue raw SQL in prod.

Consequence When something genuinely gets stuck (Hyperswitch confirms out-of-band that a refund succeeded but their webhook is never going to fire for some freak reason), there's no sanctioned path to resolve without raw DB access.

Action POST /api/v1/admin/orders/:id/force-status + POST /api/v1/admin/refunds/:id/force-status, gated by AuthMiddleware.RequireAdminRole, with an audit-log entry. Body: { new_status: "...", reason: "..." }. Validates the target state is reachable from the current one per the documented state machine (depends on Axis 2).

Criticity — P2. Not urgent; raw SQL works until it doesn't.

P2.10 — Partial refund latent compromise

Evidence client.go:142 CreateRefundRequest.Amount *int64 // nil = full refund. service.go:1330 passes nil as amount on every call. CHANGELOG v1.0.6: "full-only for v1.0.6 — partial refund UX is deferred to v1.0.7."

Consequence The amount *int64 parameter is a permanent half-wired compromise. When partial refunds land, the implementer must remember:

Multiple refunds per order must be allowed (change the double-submit guard at service.go:1303).
SellerBalance debit must be proportional to the partial amount, not full transfer amount.
The partial UNIQUE index already tolerates the multi-refund case (v1.0.6.1 migration 979) — but only if each refund has its own PSP-assigned refund_id, which Hyperswitch does.
Webhook lookup already uses hyperswitch_refund_id so no change.

Action When partial lands: extensive state-table tests (can the same order reach (status=refunded, amount < total)? should it be a new partially_refunded status?). Flag this item in the v1.0.7 plan explicitly so it's scoped, not drive-by.

Criticity — P2. No active bug; future hazard only.

wontfix.11 — `TransferRetryInterval` not per-seller configurable

config.go:399 default 5m. A single global interval works fine at current traffic. Re-evaluate at 10× load.

Axis 1 summary

#	Finding	Criticity	Gate
1	`SellerTransfer.StripeTransferID` never set	P0	prerequisite for v1.0.7 reversal
2	No Stripe Connect reversal on refund.succeeded	P0	v1.0.7, decoupled via `reversal_pending` state
3	No reconciliation sweep for stuck pending states	P0	v1.0.7 (hourly job)
4	No `Idempotency-Key` on Hyperswitch calls	P0	v1.0.7 (2 header lines)
5	Webhook raw payloads not persisted	P1	v1.0.7 (migration + insert)
6	Disputes / chargebacks silently dropped	P1	v1.0.8 (min: capture in P1.5 log)
7	Subscription money-movement surface	P1	v1.0.8 sprint
8	No ledger-health metrics	P1	ships with #3
9	No admin API for manual override	P2	v1.0.8+
10	Partial refund latent compromise	P2	flag in v1.0.7 plan
11	Per-seller retry interval	wontfix	—

Blockers for "ready to show" — 1, 2, 3, 4 must land before the marketplace is used by anyone external (grants demo, beta, DIY community). After those four, VEZA's accounting is self-consistent AND consistent with Stripe+Hyperswitch, which is the real bar.

See v107-plan.md for sequencing and effort.

Info needed from ops

Not determinable from code. Answers become part of the audit:

Last manual reconciliation between VEZA DB and Stripe/Hyperswitch dashboards. Date + scope + outcome. If the answer is "jamais" — that's itself a P1 process finding: no reconciliation cadence established. Absence of documented debt = hidden debt.
Are subscriptions currently sold (i.e., is the paid-subscription feature turned on for end-users), or is the code live but the UI hides the entry point? If yes, P1.7 → P0. If hidden-by-UI, additional Axis-3 check needed: is the API reachable despite the UI hiding? Front-flag-only gating is a security finding.
Current volume order of magnitude — number of orders/refunds processed per week — so criticity can be re-checked against blast radius.

27 KiB Raw Permalink Blame History Unescape Escape

Axis 1 — Correctness / accounting

Surface summary

P0.1 — SellerTransfer.StripeTransferID declared but never populated

P0.2 — No Stripe Connect reversal on refund.succeeded

P0.3 — No reconciliation sweep for stuck pending states

P0.4 — CreatePayment / CreateRefund not idempotent at the HTTP layer

P1.5 — Webhook raw payloads not persisted

P1.6 — Disputes / chargebacks silently dropped

P1.7 → P0.12 — Subscription payment-gate bypass (escalated 2026-04-17)

P1.7' — Subscription money-movement not covered by v1.0.6 hardening (original scope, deferred to v1.0.8)

P1.8 — No ledger-health metrics

P2.9 — No admin API for manual refund / order override

P2.10 — Partial refund latent compromise

wontfix.11 — TransferRetryInterval not per-seller configurable

Axis 1 summary

Info needed from ops

27 KiB

Raw Permalink Blame History

P0.1 — `SellerTransfer.StripeTransferID` declared but never populated

P0.4 — `CreatePayment` / `CreateRefund` not idempotent at the HTTP layer

wontfix.11 — `TransferRetryInterval` not per-seller configurable