
Deploy Runbook — qgtm.ai

This is the single source of truth for taking the QGTM.AI trading platform from repo to production. A competent ops engineer should be able to follow this and bring the stack online in under a day.

Prerequisites

  • [ ] Access to QGTMAI GitHub org
  • [ ] Cloudflare account with API token
  • [ ] Fly.io account with CLI installed (flyctl)
  • [ ] Neon or Supabase account for PostgreSQL
  • [ ] Upstash account for Redis
  • [ ] Doppler account for secrets management
  • [ ] Alpaca account (paper keys verified working)
  • [ ] Stripe account (test mode configured)
  • [ ] Discord developer account with bot created
  • [ ] Domain qgtm.ai purchased (or alternative)
  • [ ] CLI tools used in the steps below installed: wrangler, terraform, pnpm, python, jq, curl

Step 1: Domain & DNS (Cloudflare)

# 1. Purchase qgtm.ai from your registrar of choice
# 2. Point nameservers to Cloudflare
# 3. Add zone in Cloudflare dashboard

# Verify zone is active
curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones?name=qgtm.ai" | jq '.result[0].status'
# Expected: "active"

# DNS records will be added by Terraform in Step 6
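Nameserver changes can take a while to propagate, so the status check above may need to be repeated. A minimal polling sketch, assuming `CLOUDFLARE_API_TOKEN` is exported; the function names, retry count, and delay are illustrative and not part of the repo:

```shell
# Fetch the zone status from the Cloudflare API (same call as the check above).
fetch_zone_status() {
  curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
    "https://api.cloudflare.com/client/v4/zones?name=$1" | jq -r '.result[0].status'
}

# Poll until the zone reports "active" or the attempts run out.
# Usage: wait_for_zone <zone> [attempts] [delay_seconds]
wait_for_zone() {
  zone="$1"; attempts="${2:-30}"; delay="${3:-60}"
  attempt=1
  while [ "$attempt" -le "$attempts" ]; do
    status=$(fetch_zone_status "$zone")
    if [ "$status" = "active" ]; then
      echo "zone $zone is active"
      return 0
    fi
    echo "attempt $attempt/$attempts: status=${status:-unknown}" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "zone $zone did not become active" >&2
  return 1
}
```

With the defaults, `wait_for_zone qgtm.ai` checks about once a minute for roughly half an hour before giving up.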

Step 2: Secrets Setup (Doppler)

# Install Doppler CLI
brew install dopplerhq/cli/doppler

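# Login, create the project, and link this directory to it
doppler login
doppler projects create qgtm-trading
doppler setup --project qgtm-trading

# Create environments (arguments are name, then slug)
doppler environments create staging staging
doppler environments create production production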

# Copy .env.example to .env.staging and .env.production, fill in every value,
# then upload each file to its matching config
doppler secrets upload --config staging .env.staging
doppler secrets upload --config production .env.production
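Because every value has to be filled in before upload, a quick check for blank entries can catch mistakes early. The helper below is a sketch with a hypothetical name; it only detects keys with nothing after the `=` and does not validate the values themselves:

```shell
# Report keys that still have an empty value in a dotenv-style file.
# Usage: check_env_file <file>; returns non-zero if any value is blank.
check_env_file() {
  env_file="$1"
  # Drop comment lines, then keep KEY= lines with nothing (or only spaces) after "=".
  unfilled=$(grep -v '^[[:space:]]*#' "$env_file" \
    | grep -E '^[A-Za-z_][A-Za-z0-9_]*=[[:space:]]*$' \
    | cut -d= -f1 || true)
  if [ -n "$unfilled" ]; then
    echo "Unfilled keys in $env_file:" >&2
    echo "$unfilled" >&2
    return 1
  fi
  echo "$env_file: all keys have values"
}
```

For example, `check_env_file .env.staging && doppler secrets upload --config staging .env.staging` would skip the upload while any key is still blank.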

Step 3: Database (Neon PostgreSQL)

# Create project via Neon console or CLI
# Save connection string to Doppler as DATABASE_URL

# Run migrations
doppler run --config staging -- python -m scripts.migrate

Step 4: Redis (Upstash)

# Create Redis instance via Upstash console
# Enable TLS
# Save REDIS_URL to Doppler
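TLS-enabled Redis URLs conventionally use the `rediss://` scheme (note the double "s"), so the saved value can be sanity-checked before anything connects. A small sketch, with hypothetical function names, assuming `REDIS_URL` is present in the environment (for instance when run under `doppler run`):

```shell
# Succeeds only if the URL uses the TLS scheme (rediss://).
redis_url_uses_tls() {
  case "$1" in
    rediss://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the REDIS_URL currently in the environment.
check_redis_url() {
  if redis_url_uses_tls "${REDIS_URL:-}"; then
    echo "REDIS_URL uses TLS"
  else
    echo "REDIS_URL is missing or does not start with rediss://" >&2
    return 1
  fi
}
```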

Step 5: Object Storage (Cloudflare R2)

# Create R2 bucket for ArcticDB
wrangler r2 bucket create qgtm-arcticdb

# Create R2 bucket for backtest artifacts
wrangler r2 bucket create qgtm-artifacts

# Save credentials to Doppler as ARCTICDB_URI
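The runbook does not spell out what `ARCTICDB_URI` should look like. The sketch below assumes ArcticDB's S3-style URI form (`s3s://<endpoint>:<bucket>?access=...&secret=...`) and R2's S3-compatible endpoint (`<account_id>.r2.cloudflarestorage.com`); the function name and all four inputs are placeholders, and the ArcticDB version in use may need extra query options such as a region:

```shell
# Build an ArcticDB S3-style URI for an R2 bucket (assumed format, verify before use).
# Usage: arcticdb_r2_uri <account_id> <bucket> <access_key> <secret_key>
arcticdb_r2_uri() {
  printf 's3s://%s.r2.cloudflarestorage.com:%s?access=%s&secret=%s\n' \
    "$1" "$2" "$3" "$4"
}
```

The result could then be stored with something like `doppler secrets set ARCTICDB_URI="$(arcticdb_r2_uri ...)" --config production`, using an R2 API token scoped to the `qgtm-arcticdb` bucket.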

Step 6: Infrastructure (Terraform)

cd infra/terraform

# Initialize
terraform init

# Plan (review carefully)
terraform plan -var-file=production.tfvars

# Apply
terraform apply -var-file=production.tfvars
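Running `plan` and `apply` separately leaves a gap in which the infrastructure can drift between review and apply. One common way to close it is to save the plan with `-out` and apply that exact file. A sketch, where the function name, the `tfplan` file name, and the confirmation prompt are illustrative:

```shell
# Plan to a file, ask for confirmation, then apply exactly that plan.
# Usage: plan_then_apply [var_file] [plan_file]
plan_then_apply() {
  var_file="${1:-production.tfvars}"
  plan_file="${2:-tfplan}"
  terraform plan -var-file="$var_file" -out="$plan_file" || return 1
  printf 'Apply %s? [y/N] ' "$plan_file" >&2
  read -r answer
  if [ "$answer" != "y" ]; then
    echo "aborted; nothing applied" >&2
    return 1
  fi
  # Applying a saved plan file takes no -var-file; the variables are baked into the plan.
  terraform apply "$plan_file"
}
```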

Step 7: Backend Deploy (Fly.io)

# Create apps
flyctl apps create qgtm-api --org qgtmai
flyctl apps create qgtm-worker --org qgtmai
flyctl apps create qgtm-signals --org qgtmai

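# Set secrets from Doppler (flyctl secrets import reads NAME=VALUE pairs from stdin)
doppler secrets download --config production --no-file --format docker \
  | flyctl secrets import --app qgtm-api
# Repeat for qgtm-worker and qgtm-signals if they need the same secrets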

# Deploy
flyctl deploy --app qgtm-api --config infra/fly/fly.api.toml
flyctl deploy --app qgtm-worker --config infra/fly/fly.worker.toml
flyctl deploy --app qgtm-signals --config infra/fly/fly.signals.toml
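The three deploy commands follow one naming pattern, so they can be wrapped in a loop that stops at the first failure instead of continuing with a half-deployed backend. A sketch with a hypothetical function name, assuming it is run from the repo root so the `infra/fly/` paths resolve:

```shell
# Deploy the three backend apps in order, stopping at the first failure.
deploy_backend() {
  for svc in api worker signals; do
    echo "deploying qgtm-$svc"
    flyctl deploy --app "qgtm-$svc" --config "infra/fly/fly.$svc.toml" || {
      echo "deploy of qgtm-$svc failed; remaining apps skipped" >&2
      return 1
    }
  done
}
```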

Step 8: Frontend Deploy (Cloudflare Pages)

cd qgtm_web

# Create Pages project
wrangler pages project create qgtm-web

# Deploy
pnpm build
wrangler pages deploy out --project-name=qgtm-web

# Set custom domain in Cloudflare dashboard: qgtm.ai → qgtm-web
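A failed or misconfigured build can leave the `out` directory empty or stale, and deploying it would publish a broken site. The sketch below guards against that; the function name is hypothetical, and it assumes the build writes an `index.html` at the top of `out`, which this runbook does not confirm:

```shell
# Build, verify the output directory looks sane, then deploy it.
# Run from inside qgtm_web.
deploy_frontend() {
  pnpm build || return 1
  # "out" is the directory the deploy command uploads.
  if [ ! -f out/index.html ]; then
    echo "out/index.html not found; build output missing" >&2
    return 1
  fi
  wrangler pages deploy out --project-name=qgtm-web
}
```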

Step 9: Alpaca Connection

# PAPER MODE FIRST — verify round-trip
doppler run --config staging -- python -m scripts.verify_alpaca

# Expected output:
# ✓ Account status: ACTIVE
# ✓ Paper trading: True
# ✓ Buying power: $100,000.00
# ✓ Test order submitted and cancelled successfully

# LIVE MODE — only after 30+ days of paper trading verification
# 1. Update ALPACA_BASE_URL in Doppler production config
# 2. Set QGTM_LIVE_TRADING_ENABLED=true
# 3. Requires GitHub environment approval
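Since going live means changing both `ALPACA_BASE_URL` and `QGTM_LIVE_TRADING_ENABLED`, a guard that refuses a live endpoint without the flag can catch a half-finished switch. This is a sketch with hypothetical function names, not the platform's own safety check; it relies on Alpaca's paper endpoint being `https://paper-api.alpaca.markets` and the live endpoint being `https://api.alpaca.markets`:

```shell
# Classify an Alpaca base URL as "paper", "live", or "unknown".
alpaca_mode() {
  case "$1" in
    https://paper-api.alpaca.markets*) echo paper ;;
    https://api.alpaca.markets*) echo live ;;
    *) echo unknown ;;
  esac
}

# Refuse to continue when a live endpoint is configured without the live flag.
check_trading_mode() {
  mode=$(alpaca_mode "${ALPACA_BASE_URL:-}")
  live_flag="${QGTM_LIVE_TRADING_ENABLED:-false}"
  case "$mode:$live_flag" in
    paper:*)   echo "paper mode" ;;
    live:true) echo "live mode" ;;
    live:*)    echo "live endpoint without QGTM_LIVE_TRADING_ENABLED=true" >&2; return 1 ;;
    *)         echo "unrecognized ALPACA_BASE_URL" >&2; return 1 ;;
  esac
}
```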

Step 10: Stripe (Test → Live)

# Verify test mode webhooks working
# Create products and prices in live mode
# Update STRIPE_SECRET_KEY in Doppler production
# Update price IDs
# Test a subscription flow end-to-end
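Stripe keys carry their mode in the prefix (`sk_test_` or `sk_live_`, with `rk_` for restricted keys), which makes it easy to confirm that each Doppler config holds the intended kind of key. A sketch with hypothetical function names, assuming `STRIPE_SECRET_KEY` is in the environment:

```shell
# Classify a Stripe API key by its prefix: "test", "live", or "unknown".
stripe_key_mode() {
  case "$1" in
    sk_test_*|rk_test_*) echo test ;;
    sk_live_*|rk_live_*) echo live ;;
    *) echo unknown ;;
  esac
}

# Fail unless STRIPE_SECRET_KEY matches the expected mode.
# Usage: require_stripe_mode <test|live>
require_stripe_mode() {
  actual=$(stripe_key_mode "${STRIPE_SECRET_KEY:-}")
  if [ "$actual" != "$1" ]; then
    echo "expected a $1 key, found: $actual" >&2
    return 1
  fi
  echo "STRIPE_SECRET_KEY is a $1 key"
}
```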

Step 11: Discord Bot

# Create production Discord server
# Invite bot with required permissions
# Set up role-gated channels per tier
# Update DISCORD_GUILD_ID in Doppler
# Verify signal posting works

Step 12: Smoke Tests

# Health checks
curl https://api.qgtm.ai/health
curl https://qgtm.ai

# Paper trade round-trip (the production config must still point at the Alpaca paper endpoint; see Step 9)
doppler run --config production -- python -m scripts.smoke_test_trade

# Signal publish test
doppler run --config production -- python -m scripts.smoke_test_signal

# Subscriber flow test
# 1. Sign up on qgtm.ai
# 2. Subscribe to Pro tier (test card)
# 3. Verify Discord role assigned
# 4. Verify signal received
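The two `curl` health checks above print whatever comes back but do not fail on an error page. A sketch that turns them into pass/fail checks follows; the function names are hypothetical, and the `{"status": "ok"}` body is the one listed in the post-deploy checklist:

```shell
# GET a URL, failing on HTTP errors and timeouts.
http_get() { curl -fsS --max-time 10 "$1"; }

# Succeeds if the response body contains "status": "ok".
health_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Run both checks and return non-zero if either fails.
run_health_checks() {
  failures=0
  if body=$(http_get "https://api.qgtm.ai/health") && health_ok "$body"; then
    echo "ok   api health"
  else
    echo "FAIL api health" >&2
    failures=$((failures + 1))
  fi
  if http_get "https://qgtm.ai" >/dev/null; then
    echo "ok   site"
  else
    echo "FAIL site" >&2
    failures=$((failures + 1))
  fi
  [ "$failures" -eq 0 ]
}
```

The pattern match on the body is deliberately loose; piping the body through `jq -e '.status == "ok"'` would be a stricter alternative.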

Step 13: Monitoring

# Grafana dashboards imported from infra/grafana/
# Prometheus scrape targets configured in Fly.io
# Loki log aggregation via Fly.io log shipping
# Sentry DSN configured in all services

# Verify alerts
# - API latency > 500ms
# - Trading daemon heartbeat missing > 60s
# - Daily PnL loss > circuit breaker threshold
# - Signal delivery failure
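To make the heartbeat threshold concrete: the alert should fire only when the gap since the last heartbeat exceeds 60 seconds, not when it equals it. The sketch below shows that comparison; how the trading daemon records its heartbeat is not described here, so the epoch-seconds input and the function name are assumptions:

```shell
# Succeeds (exit 0) when the heartbeat is older than the allowed gap.
# Usage: heartbeat_stale <last_epoch_seconds> [now_epoch_seconds] [max_gap_seconds]
heartbeat_stale() {
  last="$1"
  now="${2:-$(date +%s)}"
  max_gap="${3:-60}"
  [ $((now - last)) -gt "$max_gap" ]
}
```

A gap of exactly 60 seconds is still considered healthy; 61 seconds is stale.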

Step 14: Rollback

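# Frontend: Cloudflare Pages has instant rollback via the dashboard
wrangler pages deployment list --project-name=qgtm-web
# If your wrangler version has no Pages rollback subcommand, roll back from the dashboard instead
wrangler pages deployment rollback --project-name=qgtm-web <deployment-id>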

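# Backend: Fly.io rollback (redeploy the image of a previous release)
flyctl releases --app qgtm-api --image
flyctl deploy --app qgtm-api --image <previous-image>
# Repeat for qgtm-worker and qgtm-signals if they were part of the same release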

# Database: Neon has point-in-time recovery
# Redis: Upstash has daily backups

# EMERGENCY: Kill switch
doppler run --config production -- python -m qgtm_live.kill_switch
# This: cancels all open orders, flattens all positions, disables live trading flag

Step 15: Post-Deploy Verification

  • [ ] https://qgtm.ai loads correctly
  • [ ] https://api.qgtm.ai/health returns {"status": "ok"}
  • [ ] Paper trade executes end-to-end
  • [ ] Signal publishes to Discord and Telegram
  • [ ] Stripe webhook fires on test subscription
  • [ ] Grafana dashboards show data
  • [ ] Sentry captures test error
  • [ ] SSL certificate valid and auto-renewing
  • [ ] WAF rules active
  • [ ] Rate limiting configured