# Veza Backend API - Troubleshooting Guide
## Table of Contents
1. [Quick Diagnosis](#quick-diagnosis)
2. [Authentication Issues](#authentication-issues)
3. [Database Issues](#database-issues)
4. [Upload Issues](#upload-issues)
5. [Performance Issues](#performance-issues)
6. [Network Issues](#network-issues)
7. [Configuration Issues](#configuration-issues)
8. [Deployment Issues](#deployment-issues)
9. [Security Issues](#security-issues)
10. [API Errors](#api-errors)
11. [Service Integration Issues](#service-integration-issues)
12. [Diagnostic Tools](#diagnostic-tools)
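13. [Getting Help](#getting-help)
14. [Common Error Codes](#common-error-codes)
15. [Quick Reference](#quick-reference)
16. [Additional Resources](#additional-resources)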
## Quick Diagnosis
### Health Check
First, verify the application is running:
```bash
# Check if API is responding
curl http://localhost:8080/healthz
# Expected response:
# {"status":"healthy"}
```
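Right after a restart the API may need a few seconds before it answers. The sketch below polls the health endpoint until it responds or a retry budget runs out. The function name `wait_for_health` and the default retry count and delay are illustrative choices, not part of the Veza tooling.

```bash
# Poll a health endpoint until it returns a 2xx response.
# Usage: wait_for_health [url] [tries] [delay_seconds]
wait_for_health() {
  local url="${1:-http://localhost:8080/healthz}" tries="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps error text
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Only sleep if another attempt is coming
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not healthy after $tries attempt(s)" >&2
  return 1
}

# Example: wait_for_health http://localhost:8080/healthz 30 2
```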
### Common Quick Fixes
1. **Application not starting**: Check logs, verify environment variables
2. **Database connection failed**: Verify PostgreSQL is running, check DATABASE_URL
3. **401 Unauthorized**: Check JWT token, verify token hasn't expired
4. **500 Internal Server Error**: Check application logs for details
5. **Slow responses**: Check database performance, verify connection pool settings
## Authentication Issues
### Cannot Register New User
**Symptoms:**
- Registration returns 500 error
- Error message: "Failed to create user"
- No detailed error in logs
**Diagnosis:**
```bash
# Check application logs (2>&1 includes lines the container wrote to stderr)
docker logs veza-backend-api 2>&1 | grep -i register
# or
journalctl -u veza-backend-api | grep -i register
# Check database constraints
psql "$DATABASE_URL" -c "SELECT constraint_name FROM information_schema.table_constraints WHERE table_name='users';"
# Verify username uniqueness
psql "$DATABASE_URL" -c "SELECT username, email FROM users WHERE username='testuser';"
```
**Common Causes:**
1. **Username Already Exists**
```bash
# Check if username exists
curl "http://localhost:8080/api/v1/auth/check-username?username=testuser"
```
2. **Email Already Exists**
```bash
# Check database
psql "$DATABASE_URL" -c "SELECT email FROM users WHERE email='user@example.com';"
```
3. **Slug Constraint Violation**
```bash
# Check for duplicate slugs
psql "$DATABASE_URL" -c "SELECT slug, COUNT(*) FROM users GROUP BY slug HAVING COUNT(*) > 1;"
```
4. **Password Validation Failure**
- Ensure password meets requirements:
  - Minimum 12 characters
  - At least one uppercase letter
  - At least one lowercase letter
  - At least one number
  - At least one special character
  - Not a common password
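
The rules above can be pre-checked locally before calling the registration endpoint. This is a minimal sketch, not the server's validator: the function name `check_password` is hypothetical, "special character" is assumed to mean any non-alphanumeric character, and the common-password check is left out because it needs a word list.

```bash
# Print each unmet rule; return 0 only if all checked rules pass.
check_password() {
  local pw="$1" failed=0
  [ "${#pw}" -ge 12 ] || { echo "too short (need 12+ characters)"; failed=1; }
  case $pw in *[[:upper:]]*) ;; *) echo "missing uppercase letter"; failed=1 ;; esac
  case $pw in *[[:lower:]]*) ;; *) echo "missing lowercase letter"; failed=1 ;; esac
  case $pw in *[[:digit:]]*) ;; *) echo "missing number"; failed=1 ;; esac
  # Assumption: any non-alphanumeric character counts as "special"
  case $pw in *[![:alnum:]]*) ;; *) echo "missing special character"; failed=1 ;; esac
  return "$failed"
}

# Example: check_password 'alllowercase12!'   # prints "missing uppercase letter"
```
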
**Solutions:**
1. **Fix Username Conflict**
```bash
# Choose a different username
# Or delete the existing user (test environments only; this cannot be undone)
psql "$DATABASE_URL" -c "DELETE FROM users WHERE username='testuser';"
```
2. **Fix Email Conflict**
```bash
# Use a different email
# Or verify and use existing account
```
3. **Fix Slug Issue**
```bash
# The application should auto-generate unique slugs
# If issue persists, check slug generation logic
```
4. **Fix Password Issues**
- Use a stronger password
- Ensure password meets all requirements
- Avoid common passwords
### Cannot Login
**Symptoms:**
- Login returns 401 Unauthorized
- Error message: "Invalid credentials"
- Token generation fails
**Diagnosis:**
```bash
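# Check user exists (show only the hash prefix, not the full hash)
psql "$DATABASE_URL" -c "SELECT id, email, left(password_hash, 4) AS hash_prefix FROM users WHERE email='user@example.com';"
# Check password hash format
# Should be a bcrypt hash starting with $2a$ or $2b$
# Check JWT secret length (printf avoids counting the trailing newline that echo adds)
printf '%s' "$JWT_SECRET" | wc -c
# Should be at least 32 characters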
```
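The length check above can be wrapped in a small helper that fails loudly, which is handy in startup or CI scripts. A minimal sketch follows; the function name `check_jwt_secret` is hypothetical, and the 32-character minimum is the one stated in this guide.

```bash
# Return non-zero if the secret is shorter than 32 characters.
# Uses the first argument if given, otherwise $JWT_SECRET.
check_jwt_secret() {
  local secret="${1-${JWT_SECRET-}}"
  local len="${#secret}"   # string length, no trailing newline involved
  if [ "$len" -lt 32 ]; then
    echo "JWT_SECRET is $len characters; at least 32 are required" >&2
    return 1
  fi
  echo "JWT_SECRET length OK ($len characters)"
}

# Example: check_jwt_secret || exit 1
```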
**Common Causes:**
1. **Invalid Credentials**
- Verify email and password are correct
- Check for typos
- Ensure account is verified
2. **Account Not Verified**
```bash
# Check email verification status
psql "$DATABASE_URL" -c "SELECT email, is_verified FROM users WHERE email='user@example.com';"
# Resend verification email
curl -X POST http://localhost:8080/api/v1/auth/resend-verification \
-H "Content-Type: application/json" \
-d '{"email":"user@example.com"}'
```
3. **JWT Secret Too Short**
```bash
# Verify JWT_SECRET length (printf avoids counting echo's trailing newline)
printf '%s' "$JWT_SECRET" | wc -c
# Must be at least 32 characters
# Generate new secret
openssl rand -base64 32
```
4. **Token Version Mismatch**
```bash
# Check token_version in database
psql "$DATABASE_URL" -c "SELECT id, email, token_version FROM users WHERE email='user@example.com';"
# If token_version was incremented, all tokens are invalidated
```
**Solutions:**
1. **Reset Password**
```bash
# Request password reset
curl -X POST http://localhost:8080/api/v1/auth/password/reset-request \
-H "Content-Type: application/json" \
-d '{"email":"user@example.com"}'
```
2. **Verify Email**
```bash
# Resend verification email
curl -X POST http://localhost:8080/api/v1/auth/resend-verification \
-H "Content-Type: application/json" \
-d '{"email":"user@example.com"}'
```
3. **Update JWT Secret**
```bash
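# Generate new secret (rotating it invalidates all previously issued tokens)
export JWT_SECRET=$(openssl rand -base64 32)
# Recreate the container so it picks up the new value
# (a plain restart keeps the environment the container was created with)
docker-compose up -d backend-api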
```
### 2FA Issues
**Symptoms:**
- Cannot enable 2FA
- 2FA verification fails
- Recovery codes not working
**Diagnosis:**
```bash
# Check 2FA status (has_secret avoids printing the TOTP secret itself)
psql "$DATABASE_URL" -c "SELECT id, email, two_factor_enabled, (two_factor_secret IS NOT NULL) AS has_secret FROM users WHERE email='user@example.com';"
# Check recovery codes
psql "$DATABASE_URL" -c "SELECT backup_codes FROM users WHERE email='user@example.com';"
```
**Common Causes:**
1. **Time Sync Issue**
- TOTP codes are time-sensitive
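- Ensure the clocks on both the user's device and the server are synchronized
- TOTP is computed from UTC-based Unix time, so a timezone setting matters only if it caused the clock itself to be set wrong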
2. **Invalid Secret**
- Secret may be corrupted
- Regenerate 2FA setup
3. **Recovery Codes Used**
- Recovery codes are single-use
- Generate new codes if needed
**Solutions:**
1. **Sync Device Time**
```bash
# On Linux
sudo ntpdate -s time.nist.gov
# On macOS
sudo sntp -sS time.apple.com
```
2. **Regenerate 2FA**
- Disable 2FA
- Re-enable 2FA
- Save new recovery codes
3. **Use Recovery Code**
- Use a recovery code to login
- Regenerate 2FA after login
### Token Expiration Issues
**Symptoms:**
- Token expires too quickly
- Refresh token not working
- Getting 401 errors frequently
**Diagnosis:**
```bash
# Check token expiration settings
echo $JWT_EXPIRY
# Default: 24h
# Check refresh token expiration
# Default: 30 days
```
**Solutions:**
1. **Refresh Token**
```bash
curl -X POST http://localhost:8080/api/v1/auth/refresh \
-H "Content-Type: application/json" \
-d '{"refresh_token":"your_refresh_token"}'
```
2. **Adjust Token Expiration**
```bash
# Set longer expiration (not recommended for production)
export JWT_EXPIRY=48h
```
## Database Issues
### Connection Refused
**Symptoms:**
- Application cannot connect to database
- Error: "connection refused" or "no such host"
- Application fails to start
**Diagnosis:**
```bash
# Check PostgreSQL is running
sudo systemctl status postgresql
# or
docker ps | grep postgres
# Test connection
psql "$DATABASE_URL" -c "SELECT 1;"
# Check network connectivity
ping postgres-host
telnet postgres-host 5432
```
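The connectivity checks above use a placeholder host. The sketch below pulls the real host and port out of a `DATABASE_URL` in the `postgres://user:password@host:port/dbname` format so they can be passed to `ping`, `telnet`, or `pg_isready`. The function name `db_host_port` is hypothetical; it assumes special characters in the password are percent-encoded and does not handle IPv6 literals.

```bash
# Print "host port" for a postgres:// URL (port defaults to 5432).
db_host_port() {
  local rest hostport host port
  rest="${1#*://}"          # drop the scheme
  rest="${rest%%/*}"        # drop "/dbname?params"
  rest="${rest%%\?*}"       # drop "?params" when there is no "/dbname"
  hostport="${rest##*@}"    # drop "user:password@"
  host="${hostport%%:*}"
  case $hostport in
    *:*) port="${hostport##*:}" ;;
    *)   port=5432 ;;
  esac
  printf '%s %s\n' "$host" "$port"
}

# Example:
#   set -- $(db_host_port "$DATABASE_URL")
#   pg_isready -h "$1" -p "$2"
```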
**Common Causes:**
1. **PostgreSQL Not Running**
```bash
# Start PostgreSQL
sudo systemctl start postgresql
# or
docker-compose up -d postgres
```
2. **Wrong Connection String**
```bash
# Verify DATABASE_URL format
echo $DATABASE_URL
# Should be: postgres://user:password@host:port/dbname?sslmode=disable
```
3. **Firewall Blocking**
```bash
# Check firewall rules
sudo iptables -L -n | grep 5432
sudo ufw status | grep 5432
```
4. **Wrong Host/Port**
```bash
# Verify host and port
# Default: localhost:5432
```
**Solutions:**
1. **Start PostgreSQL**
```bash
# Systemd
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Docker
docker-compose up -d postgres
```
2. **Fix Connection String**
```bash
# Verify format
export DATABASE_URL="postgres://veza:password@localhost:5432/veza_dev?sslmode=disable"
```
3. **Configure Firewall**
```bash
# Allow PostgreSQL port
sudo ufw allow 5432/tcp
```
### Migration Errors
**Symptoms:**
- Migrations fail to apply
- Error: "relation already exists" or "column does not exist"
- Database schema inconsistent
**Diagnosis:**
```bash
# Check migration status
migrate -path ./migrations -database "$DATABASE_URL" version
# Check current schema
psql "$DATABASE_URL" -c "\d users"
psql "$DATABASE_URL" -c "\d tracks"
```
**Common Causes:**
1. **Migration Already Applied**
- Migration was partially applied
- Schema is inconsistent
2. **Missing Migration Files**
- Migration files deleted or moved
- Version mismatch
3. **Database Locked**
- Another process is running migrations
- Transaction not committed
**Solutions:**
1. **Check Migration Status**
```bash
migrate -path ./migrations -database "$DATABASE_URL" version
```
2. **Force Migration Version**
```bash
# Set to specific version (use with caution)
migrate -path ./migrations -database "$DATABASE_URL" force <version>
```
3. **Manual Rollback (using _down.sql files)**
Veza uses a custom migration runner that ignores `*_down.sql` files during normal migration. For manual rollback, execute down migrations in reverse order (most recent first):
```bash
# Rollback order (run from the directory that contains migrations/):
psql "$DATABASE_URL" -f migrations/931_add_refresh_tokens_updated_at_down.sql
psql "$DATABASE_URL" -f migrations/930_add_missing_foreign_keys_down.sql
psql "$DATABASE_URL" -f migrations/920_add_performance_indexes_down.sql
psql "$DATABASE_URL" -f migrations/910_create_audit_logs_down.sql
psql "$DATABASE_URL" -f migrations/900_triggers_and_functions_down.sql
psql "$DATABASE_URL" -f migrations/077_create_live_streams_down.sql
psql "$DATABASE_URL" -f migrations/076_create_gear_items_down.sql
# ... continue with older migrations as needed
```
**Warning:** Always back up the database before a rollback. Down migrations are provided only for the most critical migrations (900-931, 076-077). A full rollback may require writing additional `_down.sql` files for the other migrations.
4. **Manual Schema Fix**
```bash
# If needed, manually fix schema
psql "$DATABASE_URL" -f fix_schema.sql
```
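For the manual rollback described above, the reverse order can be listed rather than typed by hand. This is a sketch under the assumption that every down migration follows the `NNN_name_down.sql` pattern with a zero-padded numeric prefix, as in the file names shown; the function name `list_down_migrations` is hypothetical. It only prints the files and leaves running them to the operator.

```bash
# List *_down.sql files newest-first (highest numeric prefix first).
list_down_migrations() {
  local dir="${1:-migrations}" f
  for f in "$dir"/*_down.sql; do
    # Skip the literal pattern when nothing matches
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done | LC_ALL=C sort -r   # byte-wise reverse sort works because prefixes are zero-padded
}

# Example (review the list before running anything):
#   list_down_migrations migrations
```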
### Query Performance Issues
**Symptoms:**
- Slow API responses
- Database queries taking too long
- High database CPU usage
**Diagnosis:**
```bash
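# Check slow queries (requires the pg_stat_statements extension;
# on PostgreSQL 12 and older the column is mean_time instead of mean_exec_time)
psql "$DATABASE_URL" -c "SELECT query, mean_exec_time, calls FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;"
# Check database size
psql "$DATABASE_URL" -c "SELECT pg_size_pretty(pg_database_size(current_database()));"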
# Check table sizes
psql "$DATABASE_URL" -c "SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size FROM pg_tables WHERE schemaname='public' ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;"
```
**Common Causes:**
1. **Missing Indexes**
```bash
# Check indexes
psql "$DATABASE_URL" -c "\d users"
psql "$DATABASE_URL" -c "\d tracks"
# Look for columns used in WHERE clauses without indexes
```
2. **Large Tables**
- Tables growing too large
- Need partitioning or archiving
3. **Inefficient Queries**
- N+1 query problems
- Missing JOIN optimizations
**Solutions:**
1. **Add Missing Indexes**
```sql
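-- Example: Add index on frequently queried column
-- On a live database, consider CREATE INDEX CONCURRENTLY to avoid blocking writes (it cannot run inside a transaction block)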
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_tracks_user_id ON tracks(user_id);
CREATE INDEX idx_tracks_created_at ON tracks(created_at);
```
2. **Optimize Queries**
- Use EXPLAIN ANALYZE to identify slow queries
- Add appropriate indexes
- Use JOINs instead of multiple queries
3. **Connection Pooling**
```bash
# Adjust connection pool settings
export DB_MAX_OPEN_CONNS=50
export DB_MAX_IDLE_CONNS=10
export DB_CONN_MAX_LIFETIME=5m
```
## Upload Issues
### Upload Fails
**Symptoms:**
- File upload returns error
- Upload progress stuck
- File not appearing after upload
**Diagnosis:**
```bash
# Check upload directory permissions
ls -la uploads/
# Should be writable by application user
# Check disk space
df -h
# Check file size limits
echo $MAX_UPLOAD_SIZE
# Default: 500MB
# Check ClamAV status (if enabled)
clamdscan --version
systemctl status clamav-daemon
```
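The disk space check can be turned into a threshold test that is easy to run from cron or a deploy script. A minimal sketch: the function names and the 90% default limit are illustrative, not Veza settings, and the parsing assumes the filesystem name contains no spaces.

```bash
# Print the used-space percentage (digits only) of the filesystem holding a path.
disk_usage_pct() {
  # -P forces the portable one-line-per-filesystem format; column 5 is "Capacity"
  df -P "${1:-.}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn and return 1 when usage is at or above the limit.
check_disk() {
  local path="${1:-uploads}" limit="${2:-90}" used
  used=$(disk_usage_pct "$path") || return 2
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full (limit ${limit}%)" >&2
    return 1
  fi
  echo "OK: $path is ${used}% full"
}

# Example: check_disk uploads/ 90
```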
**Common Causes:**
1. **File Too Large**
- Exceeds MAX_UPLOAD_SIZE limit
- Use chunked upload for large files
2. **Invalid File Format**
- Only MP3, WAV, FLAC, OGG supported
- Convert file to supported format
3. **Disk Space Full**
```bash
# Check disk space
df -h
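# Clean up old files (destructive: preview first, and make sure nothing listed is still referenced by a track)
find uploads/ -type f -mtime +30 -print
find uploads/ -type f -mtime +30 -delete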
```
4. **Permission Issues**
```bash
# Fix permissions
sudo chown -R veza:veza uploads/
sudo chmod -R 755 uploads/
```
5. **ClamAV Scan Fails**
```bash
# Check ClamAV
systemctl status clamav-daemon
# If ClamAV is down and CLAMAV_REQUIRED=true, uploads will fail
# Set CLAMAV_REQUIRED=false for development
export CLAMAV_REQUIRED=false
```
**Solutions:**
1. **Use Chunked Upload**
```bash
# For large files, use chunked upload
# See API documentation for chunked upload endpoints
```
2. **Convert File Format**
```bash
# Convert to MP3 using ffmpeg
ffmpeg -i input.wav -codec:a libmp3lame -b:a 192k output.mp3
```
3. **Free Disk Space**
```bash
# Remove old uploads (destructive: preview with -print before using -delete)
find uploads/ -type f -mtime +90 -print
find uploads/ -type f -mtime +90 -delete
# Or archive to external storage
```
4. **Fix Permissions**
```bash
sudo chown -R veza:veza uploads/
sudo chmod -R 755 uploads/
```
### Chunked Upload Issues
**Symptoms:**
- Chunks fail to upload
- Upload cannot be completed
- Chunks out of order
**Diagnosis:**
```bash
# Check Redis (if used for chunk tracking)
redis-cli ping
# Check upload session
# Check logs for chunk upload errors
docker logs veza-backend-api 2>&1 | grep -i chunk
```
**Common Causes:**
1. **Chunks Out of Order**
- Network issues causing reordering
- Client not uploading in sequence
2. **Missing Chunks**
- Some chunks failed to upload
- Session expired
3. **Session Expired**
- Upload session timeout
- Need to resume upload
**Solutions:**
1. **Resume Upload**
```bash
# Get upload status
curl http://localhost:8080/api/v1/tracks/resume/{uploadId} \
-H "Authorization: Bearer $TOKEN"
# Continue uploading missing chunks
```
2. **Retry Failed Chunks**
- Identify failed chunks
- Re-upload missing chunks
- Complete upload when all chunks received
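
Identifying the failed chunks comes down to comparing the expected chunk count with the indices the server already holds. The sketch below does that comparison locally. It assumes zero-based chunk indices and that the list of received indices has already been extracted from the resume endpoint's response, whose exact format is not covered in this guide. The function name `missing_chunks` is hypothetical.

```bash
# Usage: missing_chunks TOTAL RECEIVED_INDEX...
# Prints every index in 0..TOTAL-1 that is not in the received list.
missing_chunks() {
  local total="$1" i=0
  shift
  while [ "$i" -lt "$total" ]; do
    # " $* " is the received list padded with spaces for whole-word matching
    case " $* " in
      *" $i "*) ;;
      *) echo "$i" ;;
    esac
    i=$((i + 1))
  done
}

# Example: missing_chunks 5 0 1 3   # prints 2 and 4, one per line
```
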
### Processing Fails
**Symptoms:**
- Track uploaded but stuck in "processing" status
- Processing never completes
- Error in processing
**Diagnosis:**
```bash
# Check track status
psql "$DATABASE_URL" -c "SELECT id, title, status, status_message FROM tracks WHERE status='processing';"
# Check processing logs
docker logs veza-backend-api 2>&1 | grep -i process
# Check Stream Server (if used)
curl http://stream-server:8082/health
```
**Common Causes:**
1. **Stream Server Down**
- Stream Server not running
- Cannot process audio
2. **Processing Queue Full**
- Too many tracks in queue
- Processing backlog
3. **Invalid Audio File**
- File corrupted
- Unsupported format
**Solutions:**
1. **Restart Stream Server**
```bash
docker-compose restart stream-server
```
2. **Manual Processing**
```bash
# Trigger processing manually (if endpoint exists)
curl -X POST http://localhost:8080/api/v1/tracks/{id}/process \
-H "Authorization: Bearer $TOKEN"
```
3. **Re-upload File**
- Delete failed track
- Re-upload with valid file
## Performance Issues
### Slow API Responses
**Symptoms:**
- API responses take several seconds
- Timeout errors
- High latency
**Diagnosis:**
```bash
# Check response times
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8080/api/v1/tracks
# Check database query times
psql "$DATABASE_URL" -c "SELECT query, mean_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;"
# Check application metrics
curl http://localhost:8080/metrics | grep http_request_duration
```
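The first command above reads its output format from `curl-format.txt`, which has to exist in the current directory. One possible version of that file is sketched below. The labels are arbitrary, while the `%{...}` names are standard curl `--write-out` variables; the helper name `write_curl_format` is hypothetical.

```bash
# Create a curl --write-out format file with a per-phase timing breakdown.
write_curl_format() {
  cat > "${1:-curl-format.txt}" <<'EOF'
dns_lookup:    %{time_namelookup}s\n
tcp_connect:   %{time_connect}s\n
tls_handshake: %{time_appconnect}s\n
first_byte:    %{time_starttransfer}s\n
total:         %{time_total}s\n
EOF
}

# Example:
#   write_curl_format curl-format.txt
#   curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8080/api/v1/tracks
```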
**Common Causes:**
1. **Slow Database Queries**
- Missing indexes
- Large result sets
- N+1 queries
2. **High Load**
- Too many concurrent requests
- Resource exhaustion
3. **Network Issues**
- Slow network connection
- High latency
**Solutions:**
1. **Optimize Queries**
- Add indexes
- Use pagination
- Optimize JOINs
2. **Scale Application**
```bash
# Scale horizontally
docker-compose up -d --scale backend-api=3
```
3. **Enable Caching**
```bash
# Use Redis for caching
export REDIS_URL=redis://localhost:6379
```
### High Memory Usage
**Symptoms:**
- Application using too much memory
- Out of memory errors
- System slowdown
**Diagnosis:**
```bash
# Check memory usage
docker stats veza-backend-api
# or
ps aux | grep veza-backend-api
# Check for memory leaks
go tool pprof http://localhost:8080/debug/pprof/heap
```
**Common Causes:**
1. **Memory Leaks**
- Goroutines not cleaned up
- Cached data growing unbounded
2. **Large File Processing**
- Processing large audio files in memory
- Should use streaming
3. **Connection Pool Too Large**
- Too many database connections
- Each connection uses memory
**Solutions:**
1. **Reduce Connection Pool**
```bash
export DB_MAX_OPEN_CONNS=25
export DB_MAX_IDLE_CONNS=5
```
2. **Enable Garbage Collection Tuning**
```bash
# 100 is Go's default, so lower it to collect more often (less memory, more CPU)
export GOGC=50
```
3. **Restart Application**
```bash
# Periodic restarts can help
docker-compose restart backend-api
```
### High CPU Usage
**Symptoms:**
- CPU usage consistently high
- Slow response times
- System overheating
**Diagnosis:**
```bash
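# Check CPU usage (-f matches the full command line; plain pgrep cannot match names longer than 15 characters)
top -p "$(pgrep -d, -f veza-backend-api)"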
# or
docker stats veza-backend-api
# Profile CPU usage
go tool pprof http://localhost:8080/debug/pprof/profile
```
**Common Causes:**
1. **Inefficient Algorithms**
- Complex computations
- Unoptimized code paths
2. **Too Many Goroutines**
- Goroutine leaks
- Excessive concurrency
3. **Frequent Garbage Collection**
- High allocation rate
- GC pressure
**Solutions:**
1. **Optimize Code**
- Profile to identify hotspots
- Optimize critical paths
- Reduce allocations
2. **Limit Concurrency**
```bash
export MAX_CONCURRENT_UPLOADS=10
```
3. **Scale Horizontally**
- Add more instances
- Distribute load
## Network Issues
### CORS Errors
**Symptoms:**
- Browser shows CORS errors
- API requests blocked
- "Access-Control-Allow-Origin" errors
**Diagnosis:**
```bash
# Check CORS configuration
echo $CORS_ALLOWED_ORIGINS
# Test CORS headers
curl -H "Origin: http://localhost:3000" \
-H "Access-Control-Request-Method: POST" \
-H "Access-Control-Request-Headers: Content-Type" \
-X OPTIONS \
http://localhost:8080/api/v1/auth/login
```
**Common Causes:**
1. **Origin Not Allowed**
- Frontend origin not in CORS_ALLOWED_ORIGINS
- Missing wildcard or specific origin
2. **CORS Not Configured**
- CORS middleware not enabled
- Missing CORS headers
**Solutions:**
1. **Add Origin to Allowed List**
```bash
export CORS_ALLOWED_ORIGINS=http://localhost:3000,https://app.veza.com
```
2. **Enable CORS in Development**
```bash
# For development, allow all origins (not recommended for production)
export CORS_ALLOWED_ORIGINS=*
```
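Before restarting anything, it can help to confirm whether the frontend origin is in the configured list at all. The sketch below does an exact string comparison against a comma-separated list, with `*` treated as allow-all. That matching rule is an assumption for illustration; the API's actual CORS middleware may compare origins differently. The function name `origin_allowed` is hypothetical.

```bash
# Usage: origin_allowed ORIGIN [LIST]
# LIST defaults to $CORS_ALLOWED_ORIGINS. Returns 0 if ORIGIN is listed or LIST contains "*".
origin_allowed() {
  local origin="$1" list
  [ -n "$origin" ] || return 1
  # Strip spaces so "a, b" and "a,b" behave the same
  list=$(printf '%s' "${2-${CORS_ALLOWED_ORIGINS-}}" | tr -d ' ')
  # Quoted parts of a case pattern match literally, so "*" here is not a wildcard
  case ",$list," in
    *",$origin,"* | *",*,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: origin_allowed http://localhost:3000 || echo "origin missing from CORS_ALLOWED_ORIGINS"
```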
### Connection Timeouts
**Symptoms:**
- Requests timeout
- "Connection timed out" errors
- Intermittent failures
**Diagnosis:**
```bash
# Check network connectivity
ping api.veza.com
telnet api.veza.com 8080
# Check timeout settings
echo $HANDLER_TIMEOUT
# Default: 30s
```
**Common Causes:**
1. **Network Issues**
- Slow network connection
- Firewall blocking
2. **Timeout Too Short**
- Handler timeout too low
- Need longer timeout for large operations
**Solutions:**
1. **Increase Timeout**
```bash
export HANDLER_TIMEOUT=60s
```
2. **Check Network**
```bash
# Test connectivity
curl -v http://localhost:8080/healthz
```
## Configuration Issues
### Environment Variables Not Loaded
**Symptoms:**
- Application uses wrong configuration
- Default values used instead of env vars
- Configuration errors
**Diagnosis:**
```bash
# Check environment variables
docker exec veza-backend-api env | grep -E 'DATABASE|JWT|APP_ENV'
# Check .env file
cat .env
# Verify .env is loaded
# Application should load .env automatically
```
**Common Causes:**
1. **.env File Missing**
- No .env file in project root
- Wrong location
2. **Variable Not Set**
- Environment variable not exported
- Typo in variable name
3. **Wrong Format**
- Invalid value format
- Missing quotes for values with spaces
**Solutions:**
1. **Create .env File**
```bash
cp .env.example .env
# Edit .env with your values
```
2. **Export Variables**
```bash
export DATABASE_URL=postgres://...
export JWT_SECRET=...
```
3. **Verify Format**
```bash
# Check .env file format
cat .env
# Should be: KEY=value (no spaces around =)
```
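The `KEY=value` rule can be checked mechanically. This sketch flags every line that is not blank, not a comment, and not a `KEY=value` assignment without spaces around `=`. It assumes keys are shell-style identifiers and that the loader does not accept an `export` prefix, so such lines are reported too. The function name `lint_env_file` is hypothetical.

```bash
# Print offending lines as "lineno:content"; return 1 if any were found.
lint_env_file() {
  local file="${1:-.env}"
  # -v inverts the match: show lines that are neither blank/comment nor KEY=...
  if grep -nEv '^[[:space:]]*(#.*)?$|^[A-Za-z_][A-Za-z0-9_]*=' "$file"; then
    return 1
  fi
  return 0
}

# Example: lint_env_file .env || echo "fix the lines listed above"
```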
### Invalid Configuration
**Symptoms:**
- Application fails to start
- Configuration validation errors
- "Invalid configuration" messages
**Diagnosis:**
```bash
# Check configuration validation
# Application validates config on startup
# Check required variables
echo $DATABASE_URL
echo $JWT_SECRET
echo $APP_ENV
```
**Common Causes:**
1. **Missing Required Variables**
- DATABASE_URL required
- JWT_SECRET required (min 32 chars)
- CORS_ALLOWED_ORIGINS required in production
2. **Invalid Values**
- JWT_SECRET too short
- Invalid DATABASE_URL format
- Invalid APP_ENV value
**Solutions:**
1. **Set Required Variables**
```bash
export DATABASE_URL=postgres://user:pass@host:5432/dbname
export JWT_SECRET=$(openssl rand -base64 32)
export APP_ENV=production
export CORS_ALLOWED_ORIGINS=https://app.veza.com
```
2. **Validate Configuration**
```bash
# Application validates on startup
# Check startup logs for validation errors
docker logs veza-backend-api 2>&1 | grep -i config
```
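A quick way to catch missing required variables before starting the application is to test each one in the process environment. The sketch below uses `printenv`, so a variable that is set in the shell but not exported is reported as missing, which matches what a child process such as the API would see. The function name `missing_env` is hypothetical.

```bash
# Print the name of every variable that is unset, empty, or not exported.
# Returns 1 if at least one is missing.
missing_env() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example: missing_env DATABASE_URL JWT_SECRET APP_ENV || echo "set the variables listed above"
```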
## Deployment Issues
### Application Won't Start
**Symptoms:**
- Container exits immediately
- Application crashes on startup
- No logs available
**Diagnosis:**
```bash
# Check container logs
docker logs veza-backend-api
# Check exit code
docker inspect --format '{{.State.ExitCode}}' veza-backend-api
# Check health
docker ps -a | grep veza-backend-api
```
**Common Causes:**
1. **Configuration Error**
- Invalid environment variables
- Missing required variables
2. **Database Connection Failed**
- Database not accessible
- Wrong connection string
3. **Port Already in Use**
```bash
# Check port usage
lsof -i :8080
netstat -tulpn | grep 8080
```
**Solutions:**
1. **Check Logs**
```bash
docker logs veza-backend-api
# Look for error messages
```
2. **Verify Configuration**
```bash
docker exec veza-backend-api env | grep -E 'DATABASE|JWT'
```
3. **Free Port**
```bash
# Stop the process using the port (use kill -9 only if it ignores the default SIGTERM)
kill $(lsof -t -i:8080)
# Or change port
export API_PORT=8081
```
### Health Check Fails
**Symptoms:**
- Health endpoint returns error
- Kubernetes marks pod as unhealthy
- Load balancer removes instance
**Diagnosis:**
```bash
# Check health endpoint
curl http://localhost:8080/healthz
# Check readiness
curl http://localhost:8080/readyz
# Check detailed health
curl http://localhost:8080/health
```
**Common Causes:**
1. **Database Unavailable**
- PostgreSQL down
- Connection failed
2. **Redis Unavailable** (if required)
- Redis down
- Connection failed
3. **Application Error**
- Internal error
- Check application logs
**Solutions:**
1. **Check Dependencies**
```bash
# Verify database
psql "$DATABASE_URL" -c "SELECT 1;"
# Verify Redis
redis-cli ping
```
2. **Fix Application**
```bash
# Check logs
docker logs veza-backend-api
# Restart if needed
docker-compose restart backend-api
```
## Security Issues
### JWT Token Issues
**Symptoms:**
- Tokens rejected
- "Invalid token" errors
- Tokens expire unexpectedly
**Diagnosis:**
```bash
# Check JWT secret length (printf avoids counting echo's trailing newline)
printf '%s' "$JWT_SECRET" | wc -c
# Must be at least 32 characters
# Verify token format
# Decode token (without verification) to check structure
```
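Decoding without verification only needs the middle segment of the token, which is base64url-encoded JSON. A minimal sketch follows; the function name `jwt_payload` is hypothetical, it assumes a `base64` that accepts `-d` (GNU coreutils and recent macOS), and it does not check the signature, so the output shows structure and claims such as `exp` but proves nothing about validity.

```bash
# Print the decoded payload (claims) of a JWT without verifying it.
jwt_payload() {
  local payload
  # Take the second dot-separated segment and map base64url to standard base64
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url omits
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Example: jwt_payload "$TOKEN" | jq .
```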
**Common Causes:**
1. **JWT Secret Changed**
- Secret rotated
- All existing tokens invalid
2. **Token Expired**
- Token past expiration time
- Need to refresh
3. **Token Version Mismatch**
- User's token_version incremented
- All tokens invalidated
**Solutions:**
1. **Refresh Token**
```bash
curl -X POST http://localhost:8080/api/v1/auth/refresh \
-H "Content-Type: application/json" \
-d '{"refresh_token":"your_refresh_token"}'
```
2. **Re-login**
- Get new token by logging in again
### Rate Limiting Issues
**Symptoms:**
- Too many requests errors
- Rate limit exceeded
- 429 status codes
**Diagnosis:**
```bash
# Check rate limit headers
curl -I http://localhost:8080/api/v1/tracks
# Look for X-RateLimit-* headers
# Check Redis (if used for rate limiting)
redis-cli GET "ratelimit:user:123"
```
**Common Causes:**
1. **Too Many Requests**
- Exceeded rate limit
- Need to wait or reduce requests
2. **Rate Limit Too Strict**
- Limits too low for use case
- Need to adjust limits
**Solutions:**
1. **Wait for Reset**
- Rate limits reset after time window
- Check X-RateLimit-Reset header
2. **Adjust Limits** (if you have access)
```bash
# Modify rate limit configuration
# See rate limiting middleware configuration
```
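On the client side, waiting for the limit to reset usually means retrying with an increasing delay. The sketch below retries any command with a doubling delay. It is a generic helper with a hypothetical name, `retry_with_backoff`: it retries on every failure, not only on 429, and it does not read the `X-RateLimit-Reset` header because that header's format is not specified in this guide.

```bash
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1   # give up after MAX_ATTEMPTS failures
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example (curl -f exits non-zero on HTTP 4xx/5xx, including 429):
#   retry_with_backoff 5 1 curl -fsS -o /dev/null http://localhost:8080/api/v1/tracks
```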
## API Errors
### 400 Bad Request
**Symptoms:**
- API returns 400 error
- "Bad request" message
- Validation errors
**Diagnosis:**
- Check the request format
- Verify the JSON is valid
- Check required fields

Example error response:
```json
{
  "success": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request",
    "details": ["field: required"]
  }
}
```
**Common Causes:**
1. **Invalid JSON**
- Malformed JSON
- Syntax errors
2. **Missing Required Fields**
- Required field not provided
- Check API documentation
3. **Invalid Field Values**
- Value doesn't meet validation rules
- Check field requirements
**Solutions:**
1. **Validate JSON**
```bash
# Use JSON validator
echo '{"field":"value"}' | jq .
```
2. **Check Required Fields**
- Review API documentation
- Ensure all required fields present
3. **Validate Values**
- Check field types
- Verify value constraints
### 401 Unauthorized
**Symptoms:**
- API returns 401 error
- "Unauthorized" message
- Authentication required
**Diagnosis:**
```bash
# Check if endpoint requires authentication
# Verify token is included
# Check token is valid
# Test with token
curl http://localhost:8080/api/v1/users/me \
-H "Authorization: Bearer $TOKEN"
```
**Common Causes:**
1. **Missing Token**
- Authorization header not included
- Token not provided
2. **Invalid Token**
- Token expired
- Token malformed
- Token signature invalid
3. **Token Not for This Service**
- Wrong issuer
- Wrong audience
**Solutions:**
1. **Include Token**
```bash
curl http://localhost:8080/api/v1/users/me \
-H "Authorization: Bearer $TOKEN"
```
2. **Refresh Token**
```bash
# Get new token
curl -X POST http://localhost:8080/api/v1/auth/refresh \
-H "Content-Type: application/json" \
-d '{"refresh_token":"your_refresh_token"}'
```
### 403 Forbidden
**Symptoms:**
- API returns 403 error
- "Forbidden" message
- Insufficient permissions
**Diagnosis:**
```bash
# Check user permissions
# Verify user has required role
# Check resource ownership
```
**Common Causes:**
1. **Insufficient Permissions**
- User doesn't have required role
- Need admin or creator role
2. **Resource Ownership**
- User doesn't own resource
- Cannot modify others' resources
**Solutions:**
1. **Check Permissions**
```bash
# Verify user role
curl http://localhost:8080/api/v1/auth/me \
-H "Authorization: Bearer $TOKEN"
```
2. **Request Access**
- Contact administrator
- Request role upgrade
### 404 Not Found
**Symptoms:**
- API returns 404 error
- "Not found" message
- Resource doesn't exist
**Diagnosis:**
```bash
# Verify resource exists
psql "$DATABASE_URL" -c "SELECT id FROM tracks WHERE id='track-id';"
# Check URL path
# Verify endpoint exists
```
**Common Causes:**
1. **Resource Deleted**
- Resource was deleted
- ID doesn't exist
2. **Wrong ID**
- Invalid UUID format
- Typo in ID
3. **Wrong Endpoint**
- Endpoint doesn't exist
- Wrong API version
**Solutions:**
1. **Verify Resource**
```bash
# Check if resource exists
curl http://localhost:8080/api/v1/tracks/{id}
```
2. **Check ID Format**
- Must be valid UUID
- Check for typos
### 500 Internal Server Error
**Symptoms:**
- API returns 500 error
- "Internal server error" message
- Application error
**Diagnosis:**
```bash
# Check application logs
docker logs --tail 50 veza-backend-api
journalctl -u veza-backend-api -n 50
# Check for panic or error messages (2>&1 includes lines written to stderr)
docker logs veza-backend-api 2>&1 | grep -iE "panic|error|fatal"
```
**Common Causes:**
1. **Application Bug**
- Code error
- Unhandled exception
2. **Database Error**
- Query failed
- Constraint violation
3. **External Service Error**
- Redis unavailable
- Stream Server error
**Solutions:**
1. **Check Logs**
```bash
# Review error logs
docker logs veza-backend-api 2>&1 | grep -i error
```
2. **Report Bug**
- Collect error details
- Report to development team
- Include logs and request details
3. **Temporary Workaround**
- Retry request
- Use alternative endpoint if available
## Service Integration Issues
### Chat Server Integration
**Symptoms:**
- Cannot get chat token
- Chat connection fails
- WebSocket errors
**Diagnosis:**
```bash
# Check Chat Server
curl http://chat-server:8081/health
# Test chat token endpoint
curl http://localhost:8080/api/v1/chat/token \
-H "Authorization: Bearer $TOKEN"
```
**Common Causes:**
1. **Chat Server Down**
- Chat Server not running
- Cannot connect
2. **Invalid Token**
- Token generation failed
- Token format incorrect
**Solutions:**
1. **Start Chat Server**
```bash
docker-compose up -d chat-server
```
2. **Verify Integration**
```bash
# Check Chat Server URL
echo $CHAT_SERVER_URL
```
### Stream Server Integration
**Symptoms:**
- Tracks not processing
- Streaming fails
- HLS not available
**Diagnosis:**
```bash
# Check Stream Server
curl http://stream-server:8082/health
# Check stream status
curl http://localhost:8080/api/v1/tracks/{id}/hls/status
```
**Common Causes:**
1. **Stream Server Down**
- Stream Server not running
- Cannot process audio
2. **Processing Failed**
- Audio processing error
- Unsupported format
**Solutions:**
1. **Start Stream Server**
```bash
docker-compose up -d stream-server
```
2. **Retry Processing**
```bash
# Trigger processing (requires X-Internal-API-Key matching STREAM_SERVER_INTERNAL_API_KEY)
curl -X POST http://localhost:8080/api/v1/internal/tracks/{id}/stream-ready \
-H "Content-Type: application/json" \
-H "X-Internal-API-Key: $STREAM_SERVER_INTERNAL_API_KEY" \
-d '{"status":"completed"}'
```
## Diagnostic Tools
### Health Checks
```bash
# Basic health
curl http://localhost:8080/healthz
# Detailed health
curl http://localhost:8080/health
# Readiness
curl http://localhost:8080/readyz
```
### Logs
```bash
# Docker logs
docker logs -f veza-backend-api
# Docker Compose logs
docker-compose logs -f backend-api
# Systemd logs
journalctl -u veza-backend-api -f
# Kubernetes logs
kubectl logs -f deployment/veza-backend-api -n veza
```
### Metrics
```bash
# Prometheus metrics
curl http://localhost:8080/metrics
# Specific metric
curl http://localhost:8080/metrics | grep http_request_duration
```
### Database Queries
```bash
# Connect to database (interactive session)
psql "$DATABASE_URL"
# Check users
psql "$DATABASE_URL" -c "SELECT id, email, username, created_at FROM users LIMIT 10;"
# Check tracks
psql "$DATABASE_URL" -c "SELECT id, title, user_id, status FROM tracks LIMIT 10;"
# Check recent errors
psql "$DATABASE_URL" -c "SELECT * FROM audit_logs WHERE level='error' ORDER BY created_at DESC LIMIT 10;"
```
### Performance Profiling
```bash
# Enable pprof
export ENABLE_PPROF=true
# CPU profile
go tool pprof http://localhost:8080/debug/pprof/profile
# Memory profile
go tool pprof http://localhost:8080/debug/pprof/heap
# Goroutine profile
go tool pprof http://localhost:8080/debug/pprof/goroutine
```
## Getting Help
### Before Asking for Help
1. **Check Documentation**
- API Documentation
- Deployment Guide
- Development Guide
2. **Check Logs**
- Application logs
- Database logs
- System logs
3. **Reproduce Issue**
- Steps to reproduce
- Expected vs actual behavior
- Error messages
### Contact Support
- **Email**: support@veza.app
- **Documentation**: https://docs.veza.app
- **Issues**: https://github.com/veza/veza-backend-api/issues
- **Community**: https://community.veza.app
### Information to Include
When reporting issues, include:
1. **Environment**
- OS and version
- Docker version (if applicable)
- Application version
2. **Configuration**
- Relevant environment variables (sanitized)
- Configuration files (sanitized)
3. **Error Details**
- Full error message
- Stack trace (if available)
- Logs around error time
4. **Steps to Reproduce**
- Detailed steps
- Sample requests/responses
- Expected vs actual behavior
5. **Additional Context**
- When issue started
- Recent changes
- Related issues
## Common Error Codes
| Code | Meaning | Common Causes | Solutions |
|------|---------|---------------|-----------|
| 400 | Bad Request | Invalid input, validation error | Check request format, validate fields |
| 401 | Unauthorized | Missing/invalid token | Include valid token, refresh token |
| 403 | Forbidden | Insufficient permissions | Check user role, verify ownership |
| 404 | Not Found | Resource doesn't exist | Verify ID, check endpoint |
| 409 | Conflict | Resource conflict (duplicate) | Use different value, check existing |
| 422 | Unprocessable Entity | Validation error | Fix field values, check constraints |
| 429 | Too Many Requests | Rate limit exceeded | Wait, reduce request rate |
| 500 | Internal Server Error | Application error | Check logs, report bug |
| 503 | Service Unavailable | Service down | Check dependencies, restart service |
## Quick Reference
### Essential Commands
```bash
# Health check
curl http://localhost:8080/healthz
# Check logs
docker logs -f veza-backend-api
# Check database
psql "$DATABASE_URL" -c "SELECT 1;"
# Check Redis
redis-cli ping
# Restart service
docker-compose restart backend-api
# View metrics
curl http://localhost:8080/metrics
```
### Useful Queries
```sql
-- Check recent users
SELECT id, email, username, created_at FROM users ORDER BY created_at DESC LIMIT 10;
-- Check tracks by status
SELECT status, COUNT(*) FROM tracks GROUP BY status;
-- Check failed uploads
SELECT id, title, status, status_message FROM tracks WHERE status='failed';
-- Check active sessions
SELECT user_id, COUNT(*) FROM sessions WHERE expires_at > NOW() GROUP BY user_id;
```
### Environment Variables Checklist
```bash
# Required
DATABASE_URL=postgres://...
# JWT_SECRET: min 32 chars
JWT_SECRET=...
# APP_ENV: production, staging, or development
APP_ENV=production
# Production Required
CORS_ALLOWED_ORIGINS=https://app.veza.com
# Optional but Recommended
REDIS_URL=redis://...
LOG_LEVEL=INFO
LOG_FORMAT=json
```
## Additional Resources
- **API Documentation**: See `docs/API_DOCUMENTATION.md`
- **Deployment Guide**: See `docs/DEPLOYMENT_GUIDE.md`
- **Development Guide**: See `docs/DEVELOPMENT_SETUP_GUIDE.md`
- **Architecture Documentation**: See `docs/ARCHITECTURE.md`