veza/docs/audit-2026-04/README.md
senke 6b345ede9f docs(audit): 2026-04 correctness/accounting findings (axis 1)
Axis 1 of the 5-axis VEZA audit, scoped to money-movement correctness
and ledger↔PSP reconciliation. Layout: one file per axis under
docs/audit-2026-04/, README index, v107-plan.md derived.

P0 findings (block v1.0.7 "ready-to-show" gate):
  * P0.1 — SellerTransfer.StripeTransferID declared but never populated.
    stripe_connect_service.CreateTransfer discards the *stripe.Transfer
    return value (`_, err := transfer.New(params)`), so the column in
    models.go:237 is dead. Structural blocker for the CHANGELOG-parked
    v1.0.7 "Stripe Connect reversal" item.
  * P0.2 — No Stripe Connect reversal on refund.succeeded. Every refund
    today creates a permanent VEZA↔Stripe ledger gap. Action reworked
    to decouple via a new `seller_transfers.status = 'reversal_pending'`
    state + async worker, so Stripe flaps never block buyer-facing
    refund UX.
  * P0.3 — No reconciliation sweep for stuck orders / refunds / refund
    rows with empty hyperswitch_refund_id. Hourly worker recommended,
    same pattern as v1.0.5 Fix 6 orphan-tracks cleaner.
  * P0.4 — No Idempotency-Key on outbound Hyperswitch POST /payments and
    POST /refunds. Action includes an explicit scope note: the header
    covers HTTP-transport retry only, NOT application-level replay (for
    which the fix is a state-machine precondition).
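
The scope note in P0.4 separates two protections, and a minimal Go sketch can make the split concrete. The key derivation, the `newRefundRequest` and `canStartRefund` helpers, the base URL and the `"paid"` status value are all hypothetical and do not come from the VEZA codebase. Only the `Idempotency-Key` header name and the POST /refunds path come from the finding.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"net/http"
)

// idempotencyKey derives a stable key from the operation name and the local
// row ID, so an HTTP-level retry of the same operation re-sends the same key.
// The derivation scheme is an assumption; the audit does not prescribe one.
func idempotencyKey(operation, localID string) string {
	sum := sha256.Sum256([]byte(operation + ":" + localID))
	return hex.EncodeToString(sum[:16])
}

// newRefundRequest builds the outbound POST /refunds request with the
// Idempotency-Key header set. baseURL and the body shape are placeholders.
func newRefundRequest(baseURL, refundID string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/refunds", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Idempotency-Key", idempotencyKey("refund", refundID))
	return req, nil
}

// ErrRefundNotAllowed is returned when the order state forbids a new refund.
var ErrRefundNotAllowed = errors.New("refund not allowed in current order state")

// canStartRefund is the application-level guard: a replayed user action is
// rejected by the state machine before any HTTP call is made. The "paid"
// status name is hypothetical.
func canStartRefund(orderStatus string) error {
	if orderStatus != "paid" {
		return ErrRefundNotAllowed
	}
	return nil
}
```

The header only helps when the same logical request is re-sent after a transport failure. A second refund attempt that creates a new local row would carry a different key, which is why the precondition check has to run first.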

P1 findings:
  * P1.5 — Webhook raw payloads not persisted (blocks dispute forensics)
  * P1.6 — Disputes / chargebacks silently dropped (new, surfaced during
    review; dispute.* webhooks fall through the default case)
  * P1.7 — Subscription money-movement not covered by v1.0.6 hardening
  * P1.8 — No ledger-health Prometheus metrics

P2 findings:
  * P2.9 — No admin API for manual override
  * P2.10 — Partial refund latent compromise (amount *int64 always nil)

wontfix:
  * wontfix.11 — Per-seller retry interval (re-evaluate at 10× load)

Derived deliverable: v107-plan.md sequences the 6 de-duplicated items
(4 P0 + 2 P1) with a dependency graph, two parallel tracks (D→A→B;
E→C→F), per-commit effort estimates, release gating and open questions
(volume magnitude, Connect backfill %).

Info needed from ops (tracked in axis-1 doc, not determinable from
code): last manual reconciliation date, whether subscriptions are
currently sold, current order/refund volume.

Axes 2-5 deferred: README.md marks axis 2 (state machines) as gated
on v1.0.7 landing first, otherwise the transition matrix captures a
v1.0.6.1 snapshot that's immediately stale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 03:21:33 +02:00

VEZA Audit — 2026-04

Scope — VEZA backend (Go) + web (TypeScript). TALAS software (firmware, PCB reverse-engineering pipeline) is out of scope and will be audited separately when its phase stabilises.

Source state — commits up to a57bb6f78 (v1.0.6.1, 2026-04-17).

Auditor — Claude Opus 4.7 (1M context).

Axes

| # | File | Status |
|---|------|--------|
| 1 | axis-1-correctness.md — correctness / accounting | delivered |
| 2 | axis-2-state-machines.md — transition matrix + illegal-transition tests | 🔲 pending v1.0.7 |
| 3 | axis-3-security.md — attack surface (signatures, rate limits, authz, secrets) | 🔲 pending |
| 4 | axis-4-tests.md — coverage vs reality, failure-injection gap | 🔲 pending |
| 5 | axis-5-debt.md — documented debt vs hidden debt (TODO/FIXME inventory) | 🔲 pending |

Axis 2 is gated on v1.0.7 landing first — otherwise the transition matrix captures a v1.0.6.1 snapshot that's immediately stale. See v107-plan.md for the sequencing.

Reading conventions

Every finding cites file:line evidence. Structure:

### P{0|1|2}.N — short title
**Evidence** — concrete cites
**Consequence** — what breaks today / tomorrow
**Action** — what to do, with enough detail that an implementer can start
**Criticity** — P0 / P1 / P2 / wontfix (with justification)

  • P0 = fix within v1.0.7 or earlier (ledger diverges today, or a v1.0.7 commitment is structurally blocked).
  • P1 = v1.0.7 target. Operational visibility / correctness hardening.
  • P2 = v1.0.8+. Nice-to-have.
  • wontfix = justified non-action.

Info needed from ops (not determinable from code)

Tracked in axis-1-correctness.md. Absence of answers becomes a finding in its own right.

Derived deliverables

  • v107-plan.md — sequencing, dependencies and relative effort for the axis-1 P0 findings + the CHANGELOG-parked v1.0.7 items. Read this before picking up v1.0.7 work.