veza/veza-backend-api/internal/handlers/admin_transfer_handler.go
senke d2bb9c0e78 feat(marketplace): async stripe connect reversal worker — v1.0.7 item B day 2
Day-2 cut of item B: the reversal path becomes async. Pre-v1.0.7
(and v1.0.7 day 1) the refund handler flipped seller_transfers
straight from completed to reversed without ever calling Stripe —
the ledger said "reversed" while the seller's Stripe balance still
showed the original transfer as settled. The new flow:

  refund.succeeded webhook
    → reverseSellerAccounting transitions row: completed → reversal_pending
    → StripeReversalWorker (every REVERSAL_CHECK_INTERVAL, default 1m)
      → calls ReverseTransfer on Stripe
      → success: row → reversed + persist stripe_reversal_id
      → 404 already-reversed (dead code until day 3): row → reversed + log
      → 404 resource_missing (dead code until day 3): row → permanently_failed
      → transient error: stay reversal_pending, bump retry_count,
        exponential backoff (base * 2^retry, capped at backoffMax)
      → retries exhausted: row → permanently_failed
    → buyer-facing refund completes immediately regardless of Stripe health

State machine enforcement:
  * New `SellerTransfer.TransitionStatus(tx, to, extras)` wraps every
    mutation: validates against AllowedTransferTransitions, guarded
    UPDATE with WHERE status=<from> (optimistic-lock semantics); zero
    RowsAffected means a stale read or a concurrent winner.
  * processSellerTransfers no longer mutates .Status in place —
    terminal status is decided before struct construction, so the
    row is Created with its final state.
  * transfer_retry.retryOne and admin RetryTransfer route through
    TransitionStatus. Legacy direct assignment removed.
  * TestNoDirectTransferStatusMutation greps the package for any
    `st.Status = "..."` / `t.Status = "..."` / GORM
    Model(&SellerTransfer{}).Update("status"...) outside the
    allowlist and fails if found. Verified by temporarily injecting
    a violation during development — test caught it as expected.
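
For orientation, the validation half of TransitionStatus looks roughly like the sketch below. The contents of AllowedTransferTransitions are inferred from the flows this commit describes — the real table lives in the marketplace package and may include states not mentioned here.

```go
package main

import "fmt"

// AllowedTransferTransitions, inferred from this commit's flows.
// Same-state "transitions" (retry bookkeeping) are listed explicitly.
var AllowedTransferTransitions = map[string][]string{
	"completed":        {"reversal_pending"},
	"reversal_pending": {"reversal_pending", "reversed", "permanently_failed"},
	"failed":           {"failed", "completed"},
}

// transitionAllowed is the validation step; the other half of
// TransitionStatus is the guarded write, roughly
//   UPDATE seller_transfers SET status=? ... WHERE id=? AND status=?
// with RowsAffected==0 meaning a concurrent writer won.
func transitionAllowed(from, to string) bool {
	for _, t := range AllowedTransferTransitions[from] {
		if t == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(transitionAllowed("completed", "reversal_pending")) // true
	fmt.Println(transitionAllowed("completed", "reversed"))         // false: must go through the worker
}
```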

Configuration (v1.0.7 item B):
  * REVERSAL_WORKER_ENABLED=true (default)
  * REVERSAL_MAX_RETRIES=5 (default)
  * REVERSAL_CHECK_INTERVAL=1m (default)
  * REVERSAL_BACKOFF_BASE=1m (default)
  * REVERSAL_BACKOFF_MAX=1h (default, caps exponential growth)
  * .env.template documents TRANSFER_RETRY_* and REVERSAL_* env vars
    so an ops reader can grep them.

Interface change: added TransferService.ReverseTransfer(ctx,
stripe_transfer_id, amount *int64, reason) (reversalID, error).
All four mocks are extended (process_webhook, transfer_retry,
admin_transfer_handler, payment_flow integration). amount=nil means
full reversal; v1.0.7 always passes nil (partial reversal is future
scope per axis-1 P2).

Stripe 404 disambiguation (ErrTransferAlreadyReversed /
ErrTransferNotFound) is wired in the worker as dead code — the
sentinels are declared and the worker branches on them, but
StripeConnectService.ReverseTransfer doesn't yet emit them. Day 3
will parse stripe.Error.Code and populate the sentinels; no worker
change needed at that point. Keeping the handling skeleton in day 2
so the worker's branch shape doesn't change between days and the
tests can already cover all four paths against the mock.

Worker unit tests (9 cases, all green, sqlite :memory:):
  * happy path: reversal_pending → reversed + stripe_reversal_id set
  * already reversed (mock returns sentinel): → reversed + log
  * not found (mock returns sentinel): → permanently_failed + log
  * transient 503: retry_count++, next_retry_at set with backoff,
    stays reversal_pending
  * backoff capped at backoffMax (verified with base=1s, max=10s,
    retry_count=4 → capped at 10s not 16s)
  * max retries exhausted: → permanently_failed
  * legacy row with empty stripe_transfer_id: → permanently_failed,
    does not call Stripe
  * only picks up reversal_pending (skips all other statuses)
  * respects next_retry_at (future rows skipped)

Existing test updated: TestProcessRefundWebhook_SucceededFinalizesState
now asserts the row lands at reversal_pending with next_retry_at
set (worker's responsibility to drive to reversed), not reversed.

Worker wired in cmd/api/main.go alongside TransferRetryWorker,
sharing the same StripeConnectService instance. Shutdown path
registered for graceful stop.

Cut from day 2 scope (per agreed-upon discipline), landing in day 3:
  * Stripe 404 disambiguation implementation (parse error.Code)
  * End-to-end smoke probe (refund → reversal_pending → worker
    processes → reversed) against local Postgres + mock Stripe
  * Batch-size tuning / inter-batch sleep — batchLimit=20 today is
    safely under Stripe's 100 req/s default rate limit; revisit if
    observed load warrants

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:34:29 +02:00


package handlers

import (
	"errors"
	"net/http"
	"strconv"

	"github.com/gin-gonic/gin"
	"github.com/google/uuid"
	"go.uber.org/zap"
	"gorm.io/gorm"

	"veza-backend-api/internal/core/marketplace"
	apperrors "veza-backend-api/internal/errors"
)

// AdminTransferHandler handles admin transfer dashboard endpoints (v0.701).
type AdminTransferHandler struct {
	db      *gorm.DB
	ts      marketplace.TransferService
	logger  *zap.Logger
	feeRate float64
}

// NewAdminTransferHandler creates a new AdminTransferHandler.
func NewAdminTransferHandler(db *gorm.DB, ts marketplace.TransferService, feeRate float64, logger *zap.Logger) *AdminTransferHandler {
	return &AdminTransferHandler{
		db:      db,
		ts:      ts,
		logger:  logger,
		feeRate: feeRate,
	}
}

// GetTransfers returns a paginated list of all platform transfers with optional filters.
// Query params: status, seller_id, from, to, limit (default 50, max 100), offset (default 0).
func (h *AdminTransferHandler) GetTransfers(c *gin.Context) {
	query := h.db.Model(&marketplace.SellerTransfer{})
	if status := c.Query("status"); status != "" {
		query = query.Where("status = ?", status)
	}
	if sellerIDStr := c.Query("seller_id"); sellerIDStr != "" {
		sellerID, err := uuid.Parse(sellerIDStr)
		if err != nil {
			RespondWithAppError(c, apperrors.NewValidationError("invalid seller_id"))
			return
		}
		query = query.Where("seller_id = ?", sellerID)
	}
	if from := c.Query("from"); from != "" {
		query = query.Where("created_at >= ?", from)
	}
	if to := c.Query("to"); to != "" {
		query = query.Where("created_at <= ?", to)
	}

	var total int64
	if err := query.Count(&total).Error; err != nil {
		h.logger.Error("GetTransfers count failed", zap.Error(err))
		RespondWithAppError(c, apperrors.NewInternalErrorWrap("Failed to count transfers", err))
		return
	}

	limit := 50
	if l := c.Query("limit"); l != "" {
		if parsed, err := strconv.Atoi(l); err == nil && parsed > 0 && parsed <= 100 {
			limit = parsed
		}
	}
	offset := 0
	if o := c.Query("offset"); o != "" {
		if parsed, err := strconv.Atoi(o); err == nil && parsed >= 0 {
			offset = parsed
		}
	}

	var transfers []marketplace.SellerTransfer
	if err := query.Order("created_at DESC").Limit(limit).Offset(offset).Find(&transfers).Error; err != nil {
		h.logger.Error("GetTransfers find failed", zap.Error(err))
		RespondWithAppError(c, apperrors.NewInternalErrorWrap("Failed to fetch transfers", err))
		return
	}

	RespondSuccess(c, http.StatusOK, gin.H{
		"transfers": transfers,
		"total":     total,
	})
}

// RetryTransfer manually retries a failed transfer.
func (h *AdminTransferHandler) RetryTransfer(c *gin.Context) {
	if h.ts == nil {
		RespondWithAppError(c, apperrors.NewServiceUnavailableError("Stripe Connect is not enabled"))
		return
	}
	idStr := c.Param("id")
	if idStr == "" {
		RespondWithAppError(c, apperrors.NewValidationError("transfer id required"))
		return
	}
	transferID, err := uuid.Parse(idStr)
	if err != nil {
		RespondWithAppError(c, apperrors.NewValidationError("invalid transfer id"))
		return
	}

	var t marketplace.SellerTransfer
	if err := h.db.First(&t, transferID).Error; err != nil {
		if errors.Is(err, gorm.ErrRecordNotFound) {
			RespondWithAppError(c, apperrors.NewNotFoundError("transfer"))
			return
		}
		h.logger.Error("RetryTransfer find failed", zap.Error(err), zap.String("transfer_id", idStr))
		RespondWithAppError(c, apperrors.NewInternalErrorWrap("Failed to fetch transfer", err))
		return
	}
	if t.Status != "failed" {
		RespondWithAppError(c, apperrors.NewValidationError("only failed transfers can be retried"))
		return
	}

	stripeTransferID, err := h.ts.CreateTransfer(c.Request.Context(), t.SellerID, t.AmountCents, t.Currency, t.OrderID.String())
	if err != nil {
		// Failure: stay in 'failed' (same-state transition) but bump retry_count
		// and record the error. For a manual admin retry we don't set
		// next_retry_at — ops is driving this, not the worker.
		failExtras := map[string]interface{}{
			"retry_count":   t.RetryCount + 1,
			"error_message": err.Error(),
		}
		if saveErr := t.TransitionStatus(h.db.WithContext(c.Request.Context()), t.Status, failExtras); saveErr != nil {
			h.logger.Error("RetryTransfer save failed", zap.Error(saveErr))
		}
		h.logger.Error("RetryTransfer CreateTransfer failed", zap.Error(err), zap.String("transfer_id", idStr))
		RespondWithAppError(c, apperrors.NewInternalErrorWrap("Transfer failed", err))
		return
	}

	extras := map[string]interface{}{
		"error_message":      "",
		"stripe_transfer_id": stripeTransferID,
	}
	if err := t.TransitionStatus(h.db.WithContext(c.Request.Context()), marketplace.TransferStatusCompleted, extras); err != nil {
		h.logger.Error("RetryTransfer save failed", zap.Error(err))
		RespondWithAppError(c, apperrors.NewInternalErrorWrap("Failed to update transfer", err))
		return
	}

	RespondSuccess(c, http.StatusOK, t)
}