veza/docs/audit-2026-04
senke 3cd82ba5be fix(hyperswitch): idempotency-key on create-payment and create-refund — v1.0.7 item D
Every outbound POST /payments and POST /refunds from the Hyperswitch
client now carries an Idempotency-Key HTTP header. Key values are
explicit parameters at every call site — no context-carrier magic,
no auto-generation. An empty key is a loud error from the client
(not silent header omission) so a future new call site that forgets
to supply one fails immediately, not months later under an obscure
replay scenario.

Key choices, both stable across HTTP retries of the same logical
call:
  * CreatePayment → order.ID.String() (GORM BeforeCreate populates
    order.ID before the PSP call in ConfirmOrder).
  * CreateRefund → pendingRefund.ID.String() (populated by the
    Phase 1 tx.Create in RefundOrder, available for the Phase 2 PSP
    call).

Scope note (reproduced here for the next reader who grep-s the
commit log for "Idempotency-Key"):

  Idempotency-Key covers HTTP-transport retry (TLS reconnect,
  proxy retry, DNS flap) within a single CreatePayment /
  CreateRefund invocation. It does NOT cover application-level
  replay (user double-click, form double-submit, retry after crash
  before DB write). That class of bug requires state-machine
  preconditions on VEZA side — already addressed by the order
  state machine + the handler-level guards on POST
  /api/v1/payments (for payments) and the partial UNIQUE on
  `refunds.hyperswitch_refund_id` landed in v1.0.6.1 (for refunds).

  Hyperswitch TTL on Idempotency-Key: typically 24h-7d server-side
  (verify against current PSP docs). Beyond TTL, a retry with the
  same key is treated as a new request. Not a concern at current
  volumes; document if retry logic ever extends beyond 1 hour.

Explicitly out of scope: item D does NOT add application-level
retry logic. The current "try once, fail loudly" behavior on PSP
errors is preserved. Adding retries is a separate design exercise
(backoff, max attempts, circuit breaker) not part of this commit.

Interfaces changed:
  * hyperswitch.Client.CreatePayment(ctx, idempotencyKey, ...)
  * hyperswitch.Client.CreatePaymentSimple(...) convenience wrapper
  * hyperswitch.Client.CreateRefund(ctx, idempotencyKey, ...)
  * hyperswitch.Provider.CreatePayment threads through
  * hyperswitch.Provider.CreateRefund threads through
  * marketplace.PaymentProvider interface — first param after ctx
  * marketplace.refundProvider interface — first param after ctx

Removed:
  * hyperswitch.Provider.Refund (zero callers, superseded by
    CreateRefund which returns (refund_id, status, err) and is the
    only method marketplace's refundProvider cares about).

Tests:
  * Two new httptest.Server-backed tests (client_test.go) pin the
    Idempotency-Key header value for CreatePayment and CreateRefund.
  * Two new empty-key tests confirm the client errors rather than
    silently sending no header.
  * TestRefundOrder_OpensPendingRefund gains an assertion that
    f.provider.lastIdempotencyKey == refund.ID.String() — if a
    future refactor threads the key from somewhere else (paymentID,
    uuid.New() per call, etc.) the test fails loudly.
  * Four pre-existing test mocks updated for the new signature
    (mockRefundPaymentProvider in marketplace, mockPaymentProvider
    in tests/integration and tests/contract, mockRefundPayment
    Provider in tests/integration/refund_flow).

Subscription's CreateSubscriptionPayment interface declares its own
shape and has no live Hyperswitch-backed implementation today —
v1.0.6.2 noted this as the payment-gate bypass surface, v1.0.7
item G will ship the real provider. When that lands, item G's
implementation threads the idempotency key through in the same
pattern (documented in v107-plan.md item G acceptance).

CHANGELOG v1.0.7-rc1 entry updated with the full item D scope note
and the "out of scope: retries" caveat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:30:02 +02:00
..
axis-1-correctness.md fix(distribution,audit): propagate ErrSubscriptionNoPayment to handler + P0.12 closure date + E2E regression TODO 2026-04-17 12:43:21 +02:00
README.md docs(audit): 2026-04 correctness/accounting findings (axis 1) 2026-04-17 03:21:33 +02:00
v107-plan.md fix(hyperswitch): idempotency-key on create-payment and create-refund — v1.0.7 item D 2026-04-18 02:30:02 +02:00

VEZA Audit — 2026-04

Scope — VEZA backend (Go) + web (TypeScript). TALAS software (firmware, PCB reverse-engineering pipeline) is out of scope and will be audited separately when its phase stabilises.

Source state — commits up to a57bb6f78 (v1.0.6.1, 2026-04-17).

Auditor — Claude Opus 4.7 (1M context).

Axes

# File Status
1 axis-1-correctness.md — correctness / accounting delivered
2 axis-2-state-machines.md — transition matrix + illegal-transition tests 🔲 pending v1.0.7
3 axis-3-security.md — attack surface (signatures, rate limits, authz, secrets) 🔲 pending
4 axis-4-tests.md — coverage vs reality, failure-injection gap 🔲 pending
5 axis-5-debt.md — documented debt vs hidden debt (TODO/FIXME inventory) 🔲 pending

Axis 2 is gated on v1.0.7 landing first — otherwise the transition matrix captures a v1.0.6.1 snapshot that's immediately stale. See v107-plan.md for the sequencing.

Reading conventions

Every finding cites file:line evidence. Structure:

### P{0|1|2}.N — short title
**Evidence** — concrete cites
**Consequence** — what breaks today / tomorrow
**Action** — what to do, with enough detail that an implementer can start
**Criticity** — P0 / P1 / P2 / wontfix (with justification)

P0 = fix within v1.0.7 or earlier (ledger diverges today, or a v1.0.7 commitment is structurally blocked). P1 = v1.0.7 target. Operational visibility / correctness hardening. P2 = v1.0.8+. Nice-to-have. wontfix = justified non-action.

Info needed from ops (not determinable from code)

Tracked in axis-1-correctness.md. Absence of answers becomes a finding in its own right.

Derived deliverables

  • v107-plan.md — sequencing, dependencies and relative effort for the axis-1 P0 findings + the CHANGELOG-parked v1.0.7 items. Read this before picking up v1.0.7 work.