Deploy Runbook — qgtmai.com / api.qgtmai.com

This runbook covers the production deploy path the repo uses today. It does not cover greenfield bootstrap work, and it does not present the legacy Fly.io/Doppler infrastructure as the active deployment model.

Preconditions

  • [ ] main is green in GitHub Actions
  • [ ] You have access to the QGTMAI/trading repository
  • [ ] GitHub secrets for Cloudflare and the DigitalOcean SSH key are current
  • [ ] The DigitalOcean droplet at 142.93.1.195 is reachable
  • [ ] Alpaca paper credentials are valid
  • [ ] Any operator-impacting changes have updated their paired docs / Notion pages
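
Two of the preconditions above can be checked from a workstation. The following is a minimal sketch, not a script from the repo: it assumes the GitHub CLI (`gh`) is installed and authenticated, that `nc` is available, and that the droplet accepts SSH on the default port 22. The function names are hypothetical, and "green" is approximated as "the most recent Actions run on main concluded with success".

```shell
# Hypothetical preflight helper; not part of the repo.
REPO="QGTMAI/trading"
DROPLET_IP="142.93.1.195"

# Succeeds when the most recent Actions run on main concluded "success".
main_is_green() {
  local conclusion
  conclusion="$(gh run list --repo "$REPO" --branch main --limit 1 \
    --json conclusion --jq '.[0].conclusion')" || return 1
  [ "$conclusion" = "success" ]
}

# Succeeds when the droplet accepts a TCP connection on port 22 (assumed SSH port).
droplet_reachable() {
  nc -z -w 5 "$DROPLET_IP" 22
}

# Run both checks and report every failure instead of stopping at the first.
preflight() {
  local failed=0
  main_is_green || { echo "FAIL: latest run on main is not green" >&2; failed=1; }
  droplet_reachable || { echo "FAIL: droplet unreachable on port 22" >&2; failed=1; }
  return "$failed"
}
```

Running `preflight && echo "ok to deploy"` prints the confirmation only when both checks pass. The secrets, Alpaca credentials, and documentation items still need a manual check.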

Documentation Sync Gate

Before merging or deploying any operator-impacting change, confirm all three truth surfaces move together:

  • repo docs updated where the behavior changed
  • matching Notion source-of-truth pages updated
  • docs deploy run completed if anything under docs/** changed

Minimum repo doc set for trading/runtime changes:

  • README.md
  • docs/api-reference.md
  • docs/secrets-reference.md
  • this runbook or a more specific operator runbook when procedures changed

Production Surfaces

| Surface | Current deploy target |
| --- | --- |
| Terminal UI | Cloudflare Pages project qgtm-trading |
| API | DigitalOcean droplet 142.93.1.195 |
| Daemon | Same droplet, restarted via systemd during deploy |
| Redis | Same droplet, Docker/systemd |
| Docs | Cloudflare Pages project qgtm-docs |

Step 1: Trigger The Deploy Workflow

Production deploys are orchestrated by .github/workflows/deploy.yml.

  • Push to main to trigger a change-scoped production deploy.
  • Use workflow_dispatch if you need to deploy only web, only api, or all.

The workflow decides whether web and/or API surfaces changed before deploying.
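
A manual dispatch can also be started from the command line with the GitHub CLI. This is a sketch under assumptions: the runbook does not name the workflow_dispatch input, so the input name `target` used here is hypothetical and should be checked against deploy.yml before use. The function name is also hypothetical.

```shell
# Dispatch deploy.yml for one surface. The input name "target" is an
# assumption; confirm it against the workflow_dispatch inputs in deploy.yml.
trigger_deploy() {
  local target="${1:-all}"
  case "$target" in
    web|api|all) ;;
    *) echo "unknown target: $target (expected web, api, or all)" >&2; return 2 ;;
  esac
  gh workflow run deploy.yml --repo QGTMAI/trading --ref main -f "target=$target"
}
```

For example, `trigger_deploy api` requests an API-only deploy, and `trigger_deploy` with no argument requests all surfaces.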

Step 2: Web Deploy

For web changes, the workflow:

  1. checks out the repo
  2. installs Node and pnpm
  3. builds qgtm_web
  4. deploys qgtm_web/out to Cloudflare Pages project qgtm-trading

Operational expectation:

  • qgtmai.com should reflect the new static export
  • this path does not deploy the docs site
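
For orientation, the web steps above roughly correspond to the commands below when run by hand from the repo root. This is a sketch, not the workflow's actual implementation, which may use GitHub Actions steps instead. It assumes that qgtm_web contains its own package.json with a `build` script, that Wrangler is fetched through `pnpm dlx`, and that CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are set in the environment.

```shell
# Build the static export and publish it to Cloudflare Pages.
# Assumes the current directory is the repo root.
deploy_web() {
  # Build inside a subshell so the caller's working directory is unchanged.
  ( cd qgtm_web && pnpm install --frozen-lockfile && pnpm build ) || return 1
  pnpm dlx wrangler pages deploy qgtm_web/out --project-name qgtm-trading
}
```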

Step 3: API + Daemon Deploy

For API-relevant changes, the workflow SSHes to the droplet and runs the following flow inside /opt/trading:

  1. capture the previous git commit for rollback
  2. git fetch origin main
  3. git reset --hard origin/main
  4. git clean -fd
  5. update selected .env values from GitHub secrets if new values are provided
  6. activate .venv and reinstall the package with pip install -e .
  7. verify from qgtm_api.main import app imports successfully
  8. restart qgtm-api
  9. restart qgtm-daemon
  10. wait for http://127.0.0.1:8000/health

If the import or health check fails, the workflow rolls the droplet back to the previous git commit and restarts qgtm-api.
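
The flow above, including the rollback path, can be sketched as a droplet-side script. This is an illustration under assumptions, not the script the workflow actually runs: the function names, retry count, and polling interval are hypothetical, and step 5 (updating .env from GitHub secrets) is left out because the runbook does not describe its mechanics.

```shell
APP_DIR="${APP_DIR:-/opt/trading}"
HEALTH_URL="${HEALTH_URL:-http://127.0.0.1:8000/health}"

# Poll the health endpoint until it answers or the attempts run out.
wait_for_health() {
  local attempts="${1:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    curl -fsS -o /dev/null "$HEALTH_URL" && return 0
    i=$((i + 1))
    sleep "${HEALTH_INTERVAL:-2}"
  done
  return 1
}

# Return the checkout to the previous commit and restart the API.
rollback() {
  echo "rolling back to $1" >&2
  git -C "$APP_DIR" reset --hard "$1"
  systemctl restart qgtm-api
}

deploy_api() {
  local prev
  prev="$(git -C "$APP_DIR" rev-parse HEAD)" || return 1   # step 1
  git -C "$APP_DIR" fetch origin main || return 1           # step 2
  git -C "$APP_DIR" reset --hard origin/main || return 1    # step 3
  git -C "$APP_DIR" clean -fd || return 1                   # step 4
  # Step 5 (.env updates) is omitted from this sketch.
  # Steps 6-7: reinstall inside the venv, then verify the app imports.
  if ! ( cd "$APP_DIR" && . .venv/bin/activate && pip install -e . \
         && python -c 'from qgtm_api.main import app' ); then
    rollback "$prev"
    return 1
  fi
  systemctl restart qgtm-api      # step 8
  systemctl restart qgtm-daemon   # step 9
  if ! wait_for_health; then      # step 10
    rollback "$prev"
    return 1
  fi
}
```

Capturing the previous commit before the fetch is what makes the rollback possible: once `git reset --hard origin/main` has run, the old HEAD is otherwise only recoverable from the reflog.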

Step 4: Runtime Services

The active production services are managed by systemd wrappers around Docker Compose:

  • infra/systemd/qgtm-api.service
  • infra/systemd/qgtm-daemon.service
  • infra/systemd/qgtm-redis.service

These units reference the server-side compose file at:

/opt/trading/docker-compose.prod.yml

The API and daemon run as separate systemd units on purpose, so restarting or losing the API does not stop the daemon: the two do not share a failure domain.
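
Because each service is its own unit, each can be inspected independently on the droplet. A minimal sketch, assuming the installed unit names match the file names listed above (the function name is hypothetical):

```shell
# Print each unit's state; return non-zero if any unit is not active.
check_units() {
  local unit state failed=0
  for unit in qgtm-api qgtm-daemon qgtm-redis; do
    # "systemctl is-active" prints the state and exits non-zero unless active.
    state="$(systemctl is-active "$unit")" || failed=1
    printf '%-12s %s\n' "$unit" "$state"
  done
  return "$failed"
}
```

If one unit reports a state other than `active`, `journalctl -u <unit>` is the usual next place to look.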

Step 5: Post-Deploy Verification

Minimum verification after every production deploy:

curl -sS https://api.qgtmai.com/health
curl -sS "https://api.qgtmai.com/api/v1/market-data/bars/GLD?limit=5"
curl -sS https://api.qgtmai.com/api/v1/forecast/gold
curl -I https://qgtmai.com/

Check:

  • terminal loads
  • API health returns status: ok
  • core bars endpoint works
  • forecast endpoint works
  • daemon restarted cleanly if API code changed
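
The public checks above can be scripted so that a failing endpoint produces a non-zero exit code. This is a sketch with hypothetical function names. It assumes every endpoint answers with HTTP 200 when healthy, and it uses GET for the site root rather than the HEAD request shown above. It checks status codes only, so the `status: ok` body still needs a look.

```shell
API_BASE="${API_BASE:-https://api.qgtmai.com}"

# Print the HTTP status for a URL; return non-zero unless it is 200.
expect_200() {
  local url="$1" code
  code="$(curl -sS -o /dev/null -w '%{http_code}' "$url")" || code="000"
  printf '%s %s\n' "$code" "$url"
  [ "$code" = "200" ]
}

# Check every public endpoint and report all of them, even after a failure.
verify_public() {
  local failed=0 url
  for url in \
    "$API_BASE/health" \
    "$API_BASE/api/v1/market-data/bars/GLD?limit=5" \
    "$API_BASE/api/v1/forecast/gold" \
    "https://qgtmai.com/"; do
    expect_200 "$url" || failed=1
  done
  return "$failed"
}
```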

For operator-impacting API, daemon, or execution changes, also verify:

curl -H "X-QGTM-API-Key: $QGTM_API_KEY" \
  https://api.qgtmai.com/api/v1/session/readiness

curl -H "X-QGTM-API-Key: $QGTM_API_KEY" \
  "https://api.qgtmai.com/api/v1/orders/attribution?limit=50"

Check:

  • readiness returns ready=true when the stack is healthy
  • checks.order_tag_coverage_ok and checks.legacy_tag_debt_present reflect the expected attribution state
  • order attribution returns the expected realized-PnL and legacy/unknown unattributed split
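
The readiness check can be turned into a pass/fail command with `jq`. The JSON shape is an assumption inferred from the field names above: a top-level boolean `ready` and a `checks` object. Confirm it against docs/api-reference.md. The function names are hypothetical.

```shell
# Read readiness JSON on stdin; succeed only when .ready is true.
readiness_ok() {
  jq -e '.ready == true' >/dev/null
}

# Read readiness JSON on stdin; print the two attribution-related checks.
attribution_checks() {
  jq -c '.checks | {order_tag_coverage_ok, legacy_tag_debt_present}'
}

# Fetch readiness from production and evaluate it.
check_readiness() {
  curl -fsS -H "X-QGTM-API-Key: $QGTM_API_KEY" \
    "https://api.qgtmai.com/api/v1/session/readiness" | readiness_ok
}
```

`check_readiness && echo ready` prints `ready` only when the endpoint responds successfully and reports `ready` as true.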

Step 6: Docs Deploy

Docs deploy separately via .github/workflows/deploy-docs.yml.

That workflow:

  1. builds the MkDocs site into site/
  2. deploys it to Cloudflare Pages project qgtm-docs

Do not treat a successful terminal deploy as a docs deploy. If docs/** changed, wait for the docs workflow and verify https://qgtm-docs.pages.dev/.

Step 7: Rollback

Web

  • Use Cloudflare Pages deployment rollback for project qgtm-trading

API / daemon

  • Re-run deploy after reverting the bad commit on main, or
  • SSH to the droplet, reset /opt/trading to the known-good commit, then restart:
systemctl restart qgtm-api
systemctl restart qgtm-daemon
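
The manual droplet path can be sketched as a small helper that refuses to reset to a commit the local repository does not have. The function name is hypothetical. The sketch does only what the bullet above describes, so it does not reinstall the package or poll the health endpoint.

```shell
APP_DIR="${APP_DIR:-/opt/trading}"

# Reset the checkout to a known-good commit, then restart both services.
rollback_to() {
  local sha="$1"
  # Verify the commit exists locally before touching the working tree.
  git -C "$APP_DIR" cat-file -e "${sha}^{commit}" || {
    echo "unknown commit: $sha" >&2
    return 1
  }
  git -C "$APP_DIR" reset --hard "$sha" || return 1
  systemctl restart qgtm-api
  systemctl restart qgtm-daemon
}
```

`git -C /opt/trading log --oneline -5` lists recent commits to choose from. After the restart, `curl -sS http://127.0.0.1:8000/health` on the droplet confirms the API came back.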

Emergency trading response

If the concern is trading safety rather than web/API correctness, use the kill-switch / flatten procedures from the operator runbooks before worrying about cosmetic rollback.

Legacy / Non-Canonical Paths

The repo still contains Fly.io, Neon, Upstash, Doppler, Terraform, and k8s references. Treat those as historical, experimental, or future-state assets unless they are reintroduced into the active GitHub Actions deploy path.