Deploy Runbook — qgtm.ai
This is the single source of truth for taking the QGTM.AI trading platform from repo to production. A competent ops engineer should be able to follow this and bring the stack online in under a day.
Prerequisites
- [ ] Access to QGTMAI GitHub org
- [ ] Cloudflare account with API token
- [ ] Fly.io account with CLI installed (flyctl)
- [ ] Neon or Supabase account for PostgreSQL
- [ ] Upstash account for Redis
- [ ] Doppler account for secrets management
- [ ] Alpaca account (paper keys verified working)
- [ ] Stripe account (test mode configured)
- [ ] Discord developer account with bot created
- [ ] Domain qgtm.ai purchased (or alternative)
Step 1: Domain & DNS (Cloudflare)
# 1. Purchase qgtm.ai from your registrar of choice
# 2. Point nameservers to Cloudflare
# 3. Add zone in Cloudflare dashboard
# Verify zone is active
curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
"https://api.cloudflare.com/client/v4/zones?name=qgtm.ai" | jq '.result[0].status'
# Expected: "active"
# DNS records will be added by Terraform in Step 6
Step 2: Secrets Setup (Doppler)
# Install Doppler CLI
brew install dopplerhq/cli/doppler
# Login and setup project
doppler login
doppler setup --project qgtm-trading
# Create environments (arguments are the display name, then the slug)
doppler environments create staging staging
doppler environments create production production
# Copy .env.example to .env.staging and .env.production, fill in every value, then upload
doppler secrets upload --config staging .env.staging
doppler secrets upload --config production .env.production
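Since the upload step expects every value to be filled in, it may help to scan each file for empty entries first. The following is a minimal sketch, not part of the repo: find_empty_secrets is a hypothetical helper, and it assumes plain KEY=VALUE lines with optional # comments.

```shell
# Hypothetical helper: list keys in a dotenv-style file whose value is empty.
# Prints one key per line; returns 1 if any were found, 0 otherwise.
find_empty_secrets() {
  local file="$1" line key value found=0
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines and comment lines.
    case "$line" in
      ''|'#'*) continue ;;
    esac
    # Skip anything that is not a KEY=VALUE pair.
    case "$line" in
      *=*) ;;
      *) continue ;;
    esac
    key="${line%%=*}"    # text before the first '='
    value="${line#*=}"   # text after the first '='
    # Treat KEY=, KEY="" and KEY='' as unfilled.
    if [ -z "$value" ] || [ "$value" = '""' ] || [ "$value" = "''" ]; then
      echo "$key"
      found=1
    fi
  done < "$file"
  return "$found"
}
```

One way to use it is to gate the upload on the check, for example: find_empty_secrets .env.staging && doppler secrets upload --config staging .env.staging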
Step 3: Database (Neon PostgreSQL)
# Create project via Neon console or CLI
# Save connection string to Doppler as DATABASE_URL
# Run migrations
doppler run --config staging -- python -m scripts.migrate
Step 4: Redis (Upstash)
Step 5: Object Storage (Cloudflare R2)
# Create R2 bucket for ArcticDB
wrangler r2 bucket create qgtm-arcticdb
# Create R2 bucket for backtest artifacts
wrangler r2 bucket create qgtm-artifacts
# Save credentials to Doppler as ARCTICDB_URI
Step 6: Infrastructure (Terraform)
cd infra/terraform
# Initialize
terraform init
# Plan (review carefully)
terraform plan -var-file=production.tfvars
# Apply
terraform apply -var-file=production.tfvars
Step 7: Backend Deploy (Fly.io)
# Create apps
flyctl apps create qgtm-api --org qgtmai
flyctl apps create qgtm-worker --org qgtmai
flyctl apps create qgtm-signals --org qgtmai
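# Set secrets from Doppler (flyctl secrets import reads NAME=VALUE pairs from stdin,
# so pipe the secrets in rather than wrapping the command in doppler run)
doppler secrets download --config production --no-file --format docker | flyctl secrets import --app qgtm-api
# Repeat for qgtm-worker and qgtm-signals if they need the same secrets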
# Deploy
flyctl deploy --app qgtm-api --config infra/fly/fly.api.toml
flyctl deploy --app qgtm-worker --config infra/fly/fly.worker.toml
flyctl deploy --app qgtm-signals --config infra/fly/fly.signals.toml
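The three deploy commands above differ only in the app name, so they can be wrapped in a loop that stops at the first failure instead of pushing the remaining apps on top of a broken one. This is a minimal sketch; deploy_backend is a hypothetical helper name, and it assumes the config files follow the infra/fly/fly.<name>.toml pattern used above.

```shell
# Hypothetical helper: deploy each backend app in order, stopping at the first failure.
deploy_backend() {
  local app
  for app in api worker signals; do
    echo "Deploying qgtm-$app"
    flyctl deploy --app "qgtm-$app" --config "infra/fly/fly.$app.toml" || {
      echo "Deploy failed for qgtm-$app; stopping" >&2
      return 1
    }
  done
}
```

Run it from the repository root so the relative config paths resolve.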
Step 8: Frontend Deploy (Cloudflare Pages)
cd qgtm_web
# Create Pages project
wrangler pages project create qgtm-web
# Deploy
pnpm build
wrangler pages deploy out --project-name=qgtm-web
# Set custom domain in Cloudflare dashboard: qgtm.ai → qgtm-web
Step 9: Alpaca Connection
# PAPER MODE FIRST — verify round-trip
doppler run --config staging -- python -m scripts.verify_alpaca
# Expected output:
# ✓ Account status: ACTIVE
# ✓ Paper trading: True
# ✓ Buying power: $100,000.00
# ✓ Test order submitted and cancelled successfully
# LIVE MODE — only after 30+ days of paper trading verification
# 1. Update ALPACA_BASE_URL in Doppler production config
# 2. Set QGTM_LIVE_TRADING_ENABLED=true
# 3. Requires GitHub environment approval
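Because going live changes both the endpoint and the flag, a pre-flight check that the two agree can catch a half-applied switch. The sketch below is an illustration under assumptions: check_trading_mode is a hypothetical helper, the flag is assumed to take the literal value true, and the endpoints are assumed to be Alpaca's standard paper-api.alpaca.markets and api.alpaca.markets hosts. How the platform itself reads these variables is not specified here.

```shell
# Hypothetical pre-flight check: fail when the Alpaca endpoint and the
# live-trading flag disagree. Prints "paper" or "live" on success.
check_trading_mode() {
  local url="${ALPACA_BASE_URL:-}" live="${QGTM_LIVE_TRADING_ENABLED:-false}"
  case "$url" in
    # The paper pattern must come first: the live pattern below also matches it.
    *paper-api.alpaca.markets*)
      if [ "$live" = "true" ]; then
        echo "ERROR: live flag is set but the endpoint is paper" >&2
        return 1
      fi
      echo "paper"
      ;;
    *api.alpaca.markets*)
      if [ "$live" != "true" ]; then
        echo "ERROR: live endpoint without QGTM_LIVE_TRADING_ENABLED=true" >&2
        return 1
      fi
      echo "live"
      ;;
    *)
      echo "ERROR: unrecognised ALPACA_BASE_URL: '$url'" >&2
      return 1
      ;;
  esac
}
```

Running the function under doppler run with the relevant config would check the values that the services actually receive.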
Step 10: Stripe (Test → Live)
# Verify test mode webhooks working
# Create products and prices in live mode
# Update STRIPE_SECRET_KEY in Doppler production
# Update price IDs
# Test a subscription flow end-to-end
Step 11: Discord Bot
# Create production Discord server
# Invite bot with required permissions
# Set up role-gated channels per tier
# Update DISCORD_GUILD_ID in Doppler
# Verify signal posting works
Step 12: Smoke Tests
# Health checks
curl https://api.qgtm.ai/health
curl https://qgtm.ai
# Paper trade round-trip
doppler run --config production -- python -m scripts.smoke_test_trade
# Signal publish test
doppler run --config production -- python -m scripts.smoke_test_signal
# Subscriber flow test
# 1. Sign up on qgtm.ai
# 2. Subscribe to Pro tier (test card)
# 3. Verify Discord role assigned
# 4. Verify signal received
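Right after a deploy, the health endpoints may need a few seconds before they respond, so a single curl can fail spuriously. A minimal sketch of a retrying check follows; wait_healthy is a hypothetical helper, and treating only HTTP 200 as healthy is an assumption.

```shell
# Hypothetical helper: poll a URL until it returns HTTP 200 or attempts run out.
# Usage: wait_healthy URL [ATTEMPTS] [DELAY_SECONDS]
wait_healthy() {
  local url="$1" attempts="${2:-10}" delay="${3:-5}" i=1 code
  while [ "$i" -le "$attempts" ]; do
    # -s hides progress, -o discards the body, -w prints only the status code.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url" || true)"
    if [ "$code" = "200" ]; then
      echo "healthy after $i attempt(s): $url"
      return 0
    fi
    echo "attempt $i/$attempts: got '$code', retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  echo "giving up on $url" >&2
  return 1
}
```

For example: wait_healthy https://api.qgtm.ai/health && wait_healthy https://qgtm.ai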
Step 13: Monitoring
# Grafana dashboards imported from infra/grafana/
# Prometheus scrape targets configured in Fly.io
# Loki log aggregation via Fly.io log shipping
# Sentry DSN configured in all services
# Verify alerts
# - API latency > 500ms
# - Trading daemon heartbeat missing > 60s
# - Daily PnL loss > circuit breaker threshold
# - Signal delivery failure
Step 14: Rollback
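# Frontend: Cloudflare Pages has instant rollback via dashboard
wrangler pages deployment list --project-name=qgtm-web
# Confirm your wrangler version has a rollback subcommand (wrangler pages deployment --help);
# if it does not, roll back from the dashboard instead
wrangler pages deployment rollback --project-name=qgtm-web <deployment-id>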
# Backend: Fly.io rollback (redeploy the image from a previous release)
flyctl releases --app qgtm-api --image
flyctl deploy --app qgtm-api --image <previous-image>
# Database: Neon has point-in-time recovery
# Redis: Upstash has daily backups
# EMERGENCY: Kill switch
doppler run --config production -- python -m qgtm_live.kill_switch
# This cancels all open orders, flattens all positions, and disables the live trading flag
Step 15: Post-Deploy Verification
- [ ] https://qgtm.ai loads correctly
- [ ] https://api.qgtm.ai/health returns {"status": "ok"}
- [ ] Paper trade executes end-to-end
- [ ] Signal publishes to Discord and Telegram
- [ ] Stripe webhook fires on test subscription
- [ ] Grafana dashboards show data
- [ ] Sentry captures test error
- [ ] SSL certificate valid and auto-renewing
- [ ] WAF rules active
- [ ] Rate limiting configured