Every POST /webhooks/hyperswitch delivery now writes a row to
`hyperswitch_webhook_log` regardless of signature validity or
processing outcome. This captures both legitimate deliveries and
attack probes, so a forensics query now has the actual bytes to read,
not just a "webhook rejected" log line. Disputes (axis-1 P1.6) ride
along: the log captures dispute.* events alongside payment and
refund events, ready for when disputes get a handler.
Table shape (migration 984):
* payload TEXT — readable in psql, invalid UTF-8 replaced with
empty (forensics value is in headers + ip + timing for those
attacks, not the binary body).
* signature_valid BOOLEAN + partial index so that "show me attack
attempts" queries are instantaneous.
* processing_result TEXT — 'ok' / 'error: <msg>' /
'signature_invalid' / 'skipped'. Matches the P1.5 action
semantic exactly.
* source_ip, user_agent, request_id — forensics essentials.
request_id is captured from Hyperswitch's X-Request-Id header
when present, else a server-side UUID so every row correlates
to VEZA's structured logs.
* event_type — best-effort extract from the JSON payload, NULL
on malformed input.
Hardening:
* 64KB body cap via io.LimitReader rejects oversize with 413
before any INSERT — prevents log-spam DoS.
* Single INSERT per delivery with final state; no two-phase
update race on signature-failure path. signature_invalid and
processing-error rows both land.
* DB persistence failures are logged but swallowed: the endpoint's
contract is to ack Hyperswitch, not to guarantee a perfect audit
trail.
Retention sweep:
* CleanupHyperswitchWebhookLog in internal/jobs, daily tick,
batched DELETE (10k rows + 100ms pause) so a large backlog
doesn't lock the table.
* HYPERSWITCH_WEBHOOK_LOG_RETENTION_DAYS (default 90).
* Same goroutine-ticker pattern as ScheduleOrphanTracksCleanup.
* Wired in cmd/api/main.go alongside the existing cleanup jobs.
Tests: 5 in webhook_log_test.go (persistence, request_id auto-gen,
invalid-JSON leaves event_type empty, invalid-signature capture,
extractEventType 5 sub-cases) + 4 in
cleanup_hyperswitch_webhook_log_test.go (deletes-older-than, noop,
default-on-zero, context-cancel). Migration 984 applied cleanly to
local Postgres; all indexes present.
Also (v107-plan.md):
* Item G acceptance gains an explicit Idempotency-Key threading
requirement with an empty-key loud-fail test — "literally
copy-paste D's 4-line test skeleton". Closes the risk that
item G silently reopens the HTTP-retry duplicate-charge
exposure D closed.
Out of scope for E (noted in CHANGELOG):
* Rate limit on the endpoint — pre-existing middleware covers
it at the router level; adding a per-endpoint limit is
separate scope.
* Readable-payload SQL view — deferred, the TEXT column is
already human-readable; a convenience view is a nice-to-have
not a ship-blocker.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VEZA Audit — 2026-04
Scope — VEZA backend (Go) + web (TypeScript). TALAS software (firmware, PCB reverse-engineering pipeline) is out of scope and will be audited separately when its phase stabilises.
Source state — commits up to a57bb6f78 (v1.0.6.1, 2026-04-17).
Auditor — Claude Opus 4.7 (1M context).
Axes
| # | File | Status |
|---|---|---|
| 1 | axis-1-correctness.md — correctness / accounting | ✅ delivered |
| 2 | axis-2-state-machines.md — transition matrix + illegal-transition tests | 🔲 pending v1.0.7 |
| 3 | axis-3-security.md — attack surface (signatures, rate limits, authz, secrets) | 🔲 pending |
| 4 | axis-4-tests.md — coverage vs reality, failure-injection gap | 🔲 pending |
| 5 | axis-5-debt.md — documented debt vs hidden debt (TODO/FIXME inventory) | 🔲 pending |
Axis 2 is gated on v1.0.7 landing first — otherwise the transition matrix
captures a v1.0.6.1 snapshot that's immediately stale. See
v107-plan.md for the sequencing.
Reading conventions
Every finding cites file:line evidence. Structure:
### P{0|1|2}.N — short title
**Evidence** — concrete cites
**Consequence** — what breaks today / tomorrow
**Action** — what to do, with enough detail that an implementer can start
**Criticity** — P0 / P1 / P2 / wontfix (with justification)
* P0 = fix within v1.0.7 or earlier (ledger diverges today, or a v1.0.7 commitment is structurally blocked).
* P1 = v1.0.7 target. Operational visibility / correctness hardening.
* P2 = v1.0.8+. Nice-to-have.
* wontfix = justified non-action.
Info needed from ops (not determinable from code)
Tracked in axis-1-correctness.md.
Absence of answers becomes a finding in its own right.
Derived deliverables
v107-plan.md — sequencing, dependencies and relative effort for the axis-1 P0 findings + the CHANGELOG-parked v1.0.7 items. Read this before picking up v1.0.7 work.