Every POST /webhooks/hyperswitch delivery now writes a row to
`hyperswitch_webhook_log` regardless of signature validity or
processing outcome. This captures both legitimate deliveries and
attack probes, so a forensics query now has the actual bytes to read,
not just a "webhook rejected" log line. Disputes (axis-1 P1.6) ride
along: the log captures dispute.* events alongside payment and
refund events, ready for when disputes get a handler.
Table shape (migration 984):
* payload TEXT — readable in psql, invalid UTF-8 replaced with
empty (forensics value is in headers + ip + timing for those
attacks, not the binary body).
* signature_valid BOOLEAN + partial index so that "show me attack
attempts" is instantaneous.
* processing_result TEXT — 'ok' / 'error: <msg>' /
'signature_invalid' / 'skipped'. Matches the P1.5 action
semantic exactly.
* source_ip, user_agent, request_id — forensics essentials.
request_id is captured from Hyperswitch's X-Request-Id header
when present, else a server-side UUID so every row correlates
to VEZA's structured logs.
* event_type — best-effort extract from the JSON payload, NULL
on malformed input.
Hardening:
* 64KB body cap via io.LimitReader rejects oversize with 413
before any INSERT — prevents log-spam DoS.
* Single INSERT per delivery with final state; no two-phase
update race on signature-failure path. signature_invalid and
processing-error rows both land.
* DB persistence failures are logged but swallowed: the
endpoint's contract is to ack Hyperswitch, not to keep a perfect
audit trail.
Retention sweep:
* CleanupHyperswitchWebhookLog in internal/jobs, daily tick,
batched DELETE (10k rows + 100ms pause) so a large backlog
doesn't lock the table.
* HYPERSWITCH_WEBHOOK_LOG_RETENTION_DAYS (default 90).
* Same goroutine-ticker pattern as ScheduleOrphanTracksCleanup.
* Wired in cmd/api/main.go alongside the existing cleanup jobs.
Tests: 5 in webhook_log_test.go (persistence, request_id auto-gen,
invalid-JSON leaves event_type empty, invalid-signature capture,
extractEventType 5 sub-cases) + 4 in
cleanup_hyperswitch_webhook_log_test.go (deletes-older-than, noop,
default-on-zero, context-cancel). Migration 984 applied cleanly to
local Postgres; all indexes present.
Also (v107-plan.md):
* Item G acceptance gains an explicit Idempotency-Key threading
requirement with an empty-key loud-fail test — "literally
copy-paste D's 4-line test skeleton". Closes the risk that
item G silently reopens the HTTP-retry duplicate-charge
exposure D closed.
Out of scope for E (noted in CHANGELOG):
* Rate limit on the endpoint — pre-existing middleware covers
it at the router level; adding a per-endpoint limit is
separate scope.
* Readable-payload SQL view — deferred, the TEXT column is
already human-readable; a convenience view is a nice-to-have
not a ship-blocker.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-- v1.0.7 item E: raw-payload audit log for every Hyperswitch webhook
-- reaching our endpoint. Captures both legitimate deliveries and
-- attack attempts (invalid signatures, malformed bodies); the insert
-- happens regardless of signature-valid / processing-success status,
-- so a forensics query after "something weird happened last Tuesday"
-- has the actual bytes to look at.
--
-- Shape decisions:
--
--   payload TEXT: Hyperswitch sends JSON; TEXT is readable in psql
--     without base64-decoding and plenty fast at our volumes. Invalid
--     UTF-8 is rejected at INSERT time; that class of "attack" is a
--     grossly malformed probe where we have the header + ip + timing
--     anyway, no value in storing the binary payload.
--
--   signature_valid BOOLEAN: HMAC verification outcome. Partial
--     index below makes "attempts with invalid signature last 24h"
--     cheap for forensics.
--
--   processing_result TEXT: 'ok' on successful dispatch, 'error: <msg>'
--     on processing failure (after signature was valid), 'skipped' if
--     the handler declined for another reason, 'signature_invalid' if
--     rejected at the signature gate.
--
--   source_ip / user_agent / request_id: forensics essentials.
--     request_id is captured from Hyperswitch's X-Request-Id header if
--     sent, else the handler generates a UUID so every row has a value
--     correlatable against VEZA's structured logs.
--
--   event_type: pulled from the payload JSON via a best-effort
--     extract; NULL if the payload isn't valid JSON or doesn't carry
--     an event_type field. Useful for "how many dispute.* events have
--     we seen this month"; item P1.6 (disputes) rides along on this
--     log without needing its own handler yet.
--
-- Retention: 90 days by default, swept by CleanupHyperswitchWebhookLog
-- (internal/jobs). Configurable via HYPERSWITCH_WEBHOOK_LOG_RETENTION_DAYS.

CREATE TABLE IF NOT EXISTS hyperswitch_webhook_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    received_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    payload TEXT NOT NULL,
    signature_valid BOOLEAN NOT NULL,
    signature_header TEXT,
    processing_result TEXT NOT NULL,
    event_type TEXT,
    source_ip TEXT,
    user_agent TEXT,
    request_id TEXT NOT NULL
);

-- Received-at ordering index: "what did we receive in the last hour"
-- is the single most common operational query. Cheap, indexed by
-- default PK on id but adding the timestamp index keeps retention
-- sweeps and forensics scans well-planned.
CREATE INDEX IF NOT EXISTS idx_hyperswitch_webhook_log_received_at
    ON hyperswitch_webhook_log(received_at DESC);

-- Partial index on invalid signatures: "show me attack attempts".
-- Partial keeps the index tiny on the common case (valid sigs) and
-- makes the forensics query instantaneous on the rare case.
CREATE INDEX IF NOT EXISTS idx_hyperswitch_webhook_log_signature_invalid
    ON hyperswitch_webhook_log(received_at DESC)
    WHERE signature_valid = false;

-- request_id is required-NOT-NULL at the column level, so an index
-- on it is just for "correlate this Veza log line with the webhook
-- row". Non-unique because retries with the same request_id could
-- land multiple rows (e.g., Hyperswitch redelivers a webhook).
CREATE INDEX IF NOT EXISTS idx_hyperswitch_webhook_log_request_id
    ON hyperswitch_webhook_log(request_id);