docs(audit): P1.7 → P0.12 post-probe; add v1.0.7 item G + Idempotency-Key TTL note
2026-04-17 Q2 probe confirmed the subscription money-movement finding
wasn't a "needs confirmation from ops" P1 — it was a live P0 bypass.
An authenticated user could POST /api/v1/subscriptions/subscribe,
receive 201 active without payment, and satisfy the distribution
eligibility gate. v1.0.6.2 (commit 9a8d2a4e7) closed the bypass at
the consumption site via GetUserSubscription filter + migration 980
cleanup.
axis-1-correctness.md:
* P1.7 renamed to P0.12 with the bypass chain, probe evidence, and
v1.0.6.2 closure cross-reference.
* Residual subscription-refund / webhook completeness work split out
as P1.7' (original scope, still v1.0.8).
v107-plan.md:
* Item G added (M effort) — replaces the v1.0.6.2 filter with a
mandatory pending_payment state + webhook-driven activation,
closing the creation path rather than compensating at the gate.
* Dependency graph gains a third track (independent of A/B/C/D/E/F).
* Effort total revised from 9-10d to 12-13d single-dev, 5d to 7d
two-dev parallel.
* Item D acceptance gains a TTL caveat section — Hyperswitch
Idempotency-Key has a 24h-7d server-side TTL; app-level
idempotency (order.id / partial UNIQUE) remains the load-bearing
guard beyond that window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
9a8d2a4e73
commit
68a0d390e2
2 changed files with 166 additions and 13 deletions
|
|
@ -348,7 +348,87 @@ the engineering cost of handling it.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## P1.7 — Subscription money-movement not covered by v1.0.6 hardening
|
## P1.7 → P0.12 — Subscription payment-gate bypass (escalated 2026-04-17)
|
||||||
|
|
||||||
|
> **Re-classified from P1.7 to P0 on 2026-04-17** after probing ops Q2.
|
||||||
|
> The probe did not confirm "subscriptions are sold" — it confirmed
|
||||||
|
> "subscriptions are activated server-side without payment, and that
|
||||||
|
> state satisfies a paid-feature gate". Criticity moved from "P1
|
||||||
|
> hardening if subs are live" to "P0 security — bypass is live in any
|
||||||
|
> environment where `HYPERSWITCH_ENABLED=false` (which is the current
|
||||||
|
> dev/staging default and was the prod default of every env variable
|
||||||
|
> default before v1.0.6.2)". Hotfix v1.0.6.2 ships the gate fix
|
||||||
|
> (`GetUserSubscription` filter + migration 980 cleanup). The
|
||||||
|
> subscription-refund / webhook work (original P1.7 scope) remains and
|
||||||
|
> splits out below as P1.7' for v1.0.8.
|
||||||
|
|
||||||
|
**Evidence (bypass chain, confirmed by probe)**
|
||||||
|
- `POST /api/v1/subscriptions/subscribe` with `plan_id=<Creator>` from
|
||||||
|
any authenticated user (role=`user`, no KYC, no email verification
|
||||||
|
re-check beyond initial signup) returns **HTTP 201 `status=active`**.
|
||||||
|
- `subscription/service.go:256` `sub.Status = StatusActive` is assigned
|
||||||
|
unconditionally before the payment call.
|
||||||
|
- `subscription/service.go:279` `if s.paymentProvider != nil` —
|
||||||
|
payment call is conditional; when Hyperswitch is disabled (the
|
||||||
|
default for `HYPERSWITCH_ENABLED=false`), the row persists `active`
|
||||||
|
with no PSP reference written.
|
||||||
|
- `subscription/service.go:101` (pre-hotfix) `WHERE status IN
|
||||||
|
('active', 'trialing')` — the fantôme row satisfies this filter,
|
||||||
|
`GetUserSubscription` returns it to callers.
|
||||||
|
- `distribution/service.go:253` gate returns
|
||||||
|
`sub.Plan.HasDistribution || sub.Plan.CanSellOnMarketplace`. Creator
|
||||||
|
plan has `can_sell_on_marketplace=true` (migration 949 line 79) →
|
||||||
|
gate passes → user reaches `POST /distribution/submit`.
|
||||||
|
- `user_subscriptions.hyperswitch_subscription_id` declared
|
||||||
|
(`models.go:95`) but **never written anywhere** in the codebase —
|
||||||
|
no code path populates it, so it cannot be used as a post-hoc
|
||||||
|
"payment confirmed" signal either.
|
||||||
|
- Probe artefact (captured, cleaned): row
|
||||||
|
`906c585d-1afb-49e5-ac96-5eb5411cda99` created 2026-04-17 via
|
||||||
|
authenticated `POST /subscribe` from `smoke.0416@veza.local`
|
||||||
|
(`role=user`), HS IDs empty. Voided by migration 980 at v1.0.6.2
|
||||||
|
deploy.
|
||||||
|
|
||||||
|
**Consequence**
|
||||||
|
Any authenticated user gains access to `POST /distribution/submit`,
|
||||||
|
which dispatches to external distribution partners. Three harms:
|
||||||
|
1. Reputational — external partners receive distribution submissions
|
||||||
|
from non-paying VEZA accounts.
|
||||||
|
2. Financial — if the distribution workflow triggers billable API
|
||||||
|
calls to the partner platform (depends on workflow, worth auditing
|
||||||
|
as a separate finding under axis 3).
|
||||||
|
3. Contractual — VEZA's agreements with partners likely assume
|
||||||
|
"paid-subscriber submissions only".
|
||||||
|
The `role` column is not affected (user stays `user`, no bypass of
|
||||||
|
v1.0.6 creator self-service gate), and free-tier upload is unaffected
|
||||||
|
— the specific gate bypassed is the `has_distribution ||
|
||||||
|
can_sell_on_marketplace` check on paid plans.
|
||||||
|
|
||||||
|
**Action — shipped in v1.0.6.2 (commit d31f5733d)**
|
||||||
|
1. `subscription/service.go` `GetUserSubscription` now routes through
|
||||||
|
`hasEffectivePayment` — effective payment = free plan OR unexpired
|
||||||
|
trial OR invoice with non-empty `hyperswitch_payment_id`.
|
||||||
|
2. `ErrSubscriptionNoPayment` added; `distribution.checkEligibility`
|
||||||
|
and the four subscription handlers map it to "no active
|
||||||
|
subscription" (with `needs_payment=true` on `GET /me/subscription`
|
||||||
|
so honest-path users get actionable info).
|
||||||
|
3. Migration `980_void_unpaid_subscriptions.sql` sweeps pre-existing
|
||||||
|
fantôme rows to `status='expired'` with audit table
|
||||||
|
`voided_subscriptions_20260417`.
|
||||||
|
4. Probe script `scripts/probes/subscription-unpaid-activation.sh`
|
||||||
|
kept as versioned regression test.
|
||||||
|
5. `TODO(v1.0.7-item-G)` annotation on
|
||||||
|
`subscription/service.go:createNewSubscription` marks the deferred
|
||||||
|
work: replace the `if s.paymentProvider != nil` short-circuit with
|
||||||
|
a mandatory `pending_payment` state + webhook-driven activation
|
||||||
|
(item G in the v1.0.7 plan).
|
||||||
|
|
||||||
|
**Criticity** — **P0, closed in v1.0.6.2.** Item G in v1.0.7 hardens
|
||||||
|
the creation path end-to-end.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## P1.7' — Subscription money-movement not covered by v1.0.6 hardening (original scope, deferred to v1.0.8)
|
||||||
|
|
||||||
**Evidence**
|
**Evidence**
|
||||||
- Subscription model has Hyperswitch keys:
|
- Subscription model has Hyperswitch keys:
|
||||||
|
|
@ -383,9 +463,9 @@ v1.0.8. Breakdown:
|
||||||
3. Decide product policy on mid-period cancellation (pro-rata refund?
|
3. Decide product policy on mid-period cancellation (pro-rata refund?
|
||||||
honour until end of period? forfeit?).
|
honour until end of period? forfeit?).
|
||||||
|
|
||||||
**Criticity** — P1 for the hardening, P0 only if subscriptions are
|
**Criticity** — P1 for the hardening. v1.0.6.2 closes the bypass half;
|
||||||
currently sold. Status: needs confirmation from ops — not determinable
|
this half (refund / webhook completeness) is legitimately v1.0.8 scope
|
||||||
from code alone. See [Info needed from ops](#info-needed-from-ops) Q2.
|
and does not produce a bypass on its own.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -4,7 +4,7 @@
|
||||||
> The v1.0.6 CHANGELOG listed 4 "parked v1.0.7" items; the axis-1 audit
|
> The v1.0.6 CHANGELOG listed 4 "parked v1.0.7" items; the axis-1 audit
|
||||||
> added 4 P0 findings. De-duplicated and sequenced below.
|
> added 4 P0 findings. De-duplicated and sequenced below.
|
||||||
|
|
||||||
## The 8 items, de-duplicated → 6
|
## The 9 items (was 8, de-duplicated → 6; +1 after the 2026-04-17 Q2 probe)
|
||||||
|
|
||||||
| # | From | Title | Effort |
|
| # | From | Title | Effort |
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
|
|
@ -14,6 +14,7 @@
|
||||||
| D | audit P0.4 | `Idempotency-Key` on `CreatePayment` / `CreateRefund` | **XS** |
|
| D | audit P0.4 | `Idempotency-Key` on `CreatePayment` / `CreateRefund` | **XS** |
|
||||||
| E | audit P1.5 | Webhook raw-payload log table + insert | **S** |
|
| E | audit P1.5 | Webhook raw-payload log table + insert | **S** |
|
||||||
| F | audit P1.8 | Ledger-health Prometheus metrics + alerts | **S** |
|
| F | audit P1.8 | Ledger-health Prometheus metrics + alerts | **S** |
|
||||||
|
| **G** | **audit P0.12 follow-up (post v1.0.6.2 hotfix)** | **Subscription `pending_payment` state + webhook-driven activation; replace `if s.paymentProvider != nil` short-circuit** | **M** |
|
||||||
|
|
||||||
Dropped from v1.0.7 scope:
|
Dropped from v1.0.7 scope:
|
||||||
- Partial refunds (CHANGELOG-parked) — P2 in audit, feature-class, defer to v1.0.8
|
- Partial refunds (CHANGELOG-parked) — P2 in audit, feature-class, defer to v1.0.8
|
||||||
|
|
@ -53,15 +54,24 @@ Effort legend — **XS** ≤ 2h, **S** ≤ 1 day, **M** ≤ 3 days, **L** > 3 da
|
||||||
│ F metrics │ needs the state-shape C produces; alerts wire
|
│ F metrics │ needs the state-shape C produces; alerts wire
|
||||||
│ + alerts │ to the buckets the reconciler defines
|
│ + alerts │ to the buckets the reconciler defines
|
||||||
└──────────────────┘
|
└──────────────────┘
|
||||||
|
|
||||||
|
┌──────────────────┐
|
||||||
|
│ G subscription │ independent of A/B/C/D/E/F; shares the
|
||||||
|
│ pending_ │ `pending_payment` pattern with B's
|
||||||
|
│ payment │ `reversal_pending`. Builds on v1.0.6.2
|
||||||
|
└──────────────────┘ hotfix (which compensates via filter).
|
||||||
```
|
```
|
||||||
|
|
||||||
**Two parallel tracks:**
|
**Three parallel tracks:**
|
||||||
- **Track 1 (reversal correctness)** — A → B
|
- **Track 1 (reversal correctness)** — A → B
|
||||||
- **Track 2 (operational visibility)** — D, E, C → F
|
- **Track 2 (operational visibility)** — D, E, C → F
|
||||||
|
- **Track 3 (subscription creation path)** — G (single item, independent)
|
||||||
|
|
||||||
Two developers can work in parallel without stepping on each other. A
|
Two developers can work in parallel without stepping on each other. A
|
||||||
single developer sequences as ordered: D first (XS quick win, earns
|
single developer sequences as ordered: D first (XS quick win, earns
|
||||||
trust + unblocks "pre-open" checklist), then A→B, then E, then C→F.
|
trust + unblocks "pre-open" checklist), then A→B, then E, then C→F,
|
||||||
|
then G. G can also run in parallel at any point after D — it shares no
|
||||||
|
data-model surface with the other items.
|
||||||
|
|
||||||
## Commit sequence (single-developer path)
|
## Commit sequence (single-developer path)
|
||||||
|
|
||||||
|
|
@ -82,6 +92,17 @@ Acceptance:
|
||||||
- CHANGELOG entry cross-references P0.4 + its scope note (HTTP retry
|
- CHANGELOG entry cross-references P0.4 + its scope note (HTTP retry
|
||||||
only, not app-level replay).
|
only, not app-level replay).
|
||||||
|
|
||||||
|
**TTL caveat** — Hyperswitch (like most PSPs) honours `Idempotency-Key`
|
||||||
|
server-side only for a finite window: 24 h is common, 7 d at the high
|
||||||
|
end. Beyond the TTL, a replayed call with the same key is treated as
|
||||||
|
a new request. The header therefore protects against HTTP-layer retries
|
||||||
|
within a single request cycle, not against long-tail application
|
||||||
|
replay scenarios (for which the application-level idempotency primitives
|
||||||
|
— order.id on payments, the partial UNIQUE on `refunds.hyperswitch_refund_id`
|
||||||
|
landed in v1.0.6.1 — are the load-bearing guards). Verify the exact TTL
|
||||||
|
against current Hyperswitch docs before landing and note it in the
|
||||||
|
CHANGELOG scope sentence so anyone reading later knows the envelope.
|
||||||
|
|
||||||
**Ship as v1.0.7-alpha-1 for sandbox testing**, don't wait for B/C.
|
**Ship as v1.0.7-alpha-1 for sandbox testing**, don't wait for B/C.
|
||||||
|
|
||||||
### 2. `refactor(connect): persist stripe_transfer_id on create + retry` — A
|
### 2. `refactor(connect): persist stripe_transfer_id on create + retry` — A
|
||||||
|
|
@ -186,10 +207,61 @@ Acceptance:
|
||||||
- Sampler queries are cheap: `SELECT COUNT(*) WHERE ... AND created_at < NOW() - INTERVAL '30 min'` per metric, indexed by `status + created_at`.
|
- Sampler queries are cheap: `SELECT COUNT(*) WHERE ... AND created_at < NOW() - INTERVAL '30 min'` per metric, indexed by `status + created_at`.
|
||||||
- If indexes are missing: migration `983_ledger_health_indexes.sql`.
|
- If indexes are missing: migration `983_ledger_health_indexes.sql`.
|
||||||
|
|
||||||
|
### 7. `feat(subscription): pending_payment state + webhook-driven activation` — G
|
||||||
|
|
||||||
|
Effort: **M**. The follow-up to v1.0.6.2's gate-filter compensation.
|
||||||
|
v1.0.6.2 closed the feature bypass at the consumption site
|
||||||
|
(`GetUserSubscription` filters fantôme rows out); G replaces the
|
||||||
|
creation path so no fantôme rows get written in the first place.
|
||||||
|
|
||||||
|
Migrations:
|
||||||
|
- `984_subscription_pending_payment_enum.sql` adds `'pending_payment'`
|
||||||
|
to the `user_subscriptions.status` VARCHAR (no enum at DB level, so
|
||||||
|
just a documentation + backfill step for Go constants).
|
||||||
|
- `985_backfill_hs_subscription_id_from_invoice.sql` (optional) —
|
||||||
|
backpopulates `user_subscriptions.hyperswitch_subscription_id` from
|
||||||
|
the attached invoice's PSP intent where available. Documents the
|
||||||
|
join rule for the webhook to reconcile on.
|
||||||
|
|
||||||
|
Acceptance:
|
||||||
|
- `subscription/service.go:createNewSubscription` creates paid-plan
|
||||||
|
rows in `pending_payment` state (never `active`). Transition to
|
||||||
|
`active` only via `ProcessSubscriptionWebhook` on
|
||||||
|
`payment_succeeded`. On `payment_failed`: transition to `expired`
|
||||||
|
+ log INFO; no invoice charge, no DB fantôme.
|
||||||
|
- `if s.paymentProvider != nil` short-circuit deleted — paid plans
|
||||||
|
without a payment provider configured return `503 payment provider
|
||||||
|
not configured` (env misconfig is an ops issue, not a silently-free
|
||||||
|
subscription).
|
||||||
|
- `GET /me/subscription` handles `pending_payment` explicitly —
|
||||||
|
returns `status: pending_payment`, `client_secret` echoed back so
|
||||||
|
the frontend can resume a stalled flow.
|
||||||
|
- `distribution.checkEligibility` treats `pending_payment` as
|
||||||
|
ineligible (same as the v1.0.6.2 `ErrSubscriptionNoPayment` path).
|
||||||
|
- Remove the TODO(v1.0.7-item-G) annotation; remove the v1.0.6.2
|
||||||
|
filter from `GetUserSubscription` (redundant once the creation
|
||||||
|
path is correct) — OR keep the filter as defence-in-depth and note
|
||||||
|
it in the code comment.
|
||||||
|
- Webhook dispatcher: new `subscription` event family. Reuse the
|
||||||
|
`webhook_raw_payloads` table from item E for persistence.
|
||||||
|
- Tests:
|
||||||
|
- Subscribe to paid plan with Hyperswitch enabled →
|
||||||
|
`pending_payment` row + PSP intent, no feature access yet.
|
||||||
|
- Webhook `subscription.payment_succeeded` → row transitions to
|
||||||
|
`active`, feature access granted.
|
||||||
|
- Webhook `subscription.payment_failed` → row transitions to
|
||||||
|
`expired`, no charge, no access.
|
||||||
|
- Webhook replay (same payment_id) is idempotent.
|
||||||
|
- Subscribe with provider misconfigured → 503, no row created.
|
||||||
|
- Migration of v1.0.6.2 voided rows — check `voided_subscriptions_20260417`
|
||||||
|
entries stay readable and not re-pickable by the new flow.
|
||||||
|
|
||||||
|
Independent of A/B/C/D/E/F. Can land at any point after D.
|
||||||
|
|
||||||
## Release gating
|
## Release gating
|
||||||
|
|
||||||
**Before tagging v1.0.7:**
|
**Before tagging v1.0.7:**
|
||||||
- All six items landed as separate commits.
|
- All seven items landed as separate commits.
|
||||||
- Refund smoke test (see `docs/audit-2026-04/smoke/refund.md` — TBD)
|
- Refund smoke test (see `docs/audit-2026-04/smoke/refund.md` — TBD)
|
||||||
re-run against sandbox with the reversal path: assert Stripe side
|
re-run against sandbox with the reversal path: assert Stripe side
|
||||||
shows the reversal as well as the refund.
|
shows the reversal as well as the refund.
|
||||||
|
|
@ -206,12 +278,13 @@ the API change).
|
||||||
|
|
||||||
## Effort total
|
## Effort total
|
||||||
|
|
||||||
Worst-case single-developer sequential: **~9-10 working days**
|
Worst-case single-developer sequential: **~12-13 working days**
|
||||||
(XS + S + M + S + M + S = 2h + 1d + 3d + 1d + 3d + 1d).
|
(XS + S + M + S + M + S + M = 2h + 1d + 3d + 1d + 3d + 1d + 3d).
|
||||||
|
|
||||||
With two devs on the parallel tracks: **~5 working days** end-to-end,
|
With two devs on the parallel tracks: **~7 working days** end-to-end
|
||||||
because track 1 (A→B, 4d) and track 2 (D, E, C→F done serially = 5d)
|
— track 1 (A→B, 4d), track 2 (D, E, C→F, 5d), track 3 (G, 3d, can
|
||||||
overlap.
|
run at any point after D). Track 3 extends the critical path unless
|
||||||
|
a third dev picks it up; with 3 devs it lands in the track-2 window.
|
||||||
|
|
||||||
## Unknowns to resolve before starting
|
## Unknowns to resolve before starting
|
||||||
|
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue