Chapter 8: System Architecture
The goal: Understand how every component connects, from data ingestion to trade execution to the Bloomberg-style terminal, even if you are not a software engineer. If you can follow a flow chart, you can follow this system.
The 30,000-Foot View
┌─────────────────────────────────────────────────────────────────────┐
│                        QGTM.AI ARCHITECTURE                         │
│                                                                     │
│  DATA SOURCES           SIGNAL ENGINE          EXECUTION            │
│  ────────────           ─────────────          ──────────           │
│  LBMA, COMEX   ──►      29 Strategies ──►      Compliance  ──►      │
│  CFTC, FRED    ──►      Signal Agg    ──►      OMS         ──►      │
│  ETF Flows     ──►      Meta-Labeller ──►      Broker API  ──►      │
│  Alt Data      ──►      Regime Alloc  ──►      Fill Recon           │
│                                                                     │
│                RISK MANAGER (intercepts every step)                 │
│                                                                     │
│  FRONTEND               MONITORING             AUDIT                │
│  ────────               ──────────             ─────                │
│  Next.js Terminal  ◄──  Prometheus  ──►        Merkle Log           │
│  Bloomberg Panels  ◄──  Grafana     ──►        Correlation IDs      │
│  WebSocket Stream  ◄──  PagerDuty   ──►        Signed Releases      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
Python Backend: 14 Packages, 3000+ Tests
The core trading system is Python, organized as a monorepo with 14 internal packages. Each package has a single responsibility and talks to others through typed interfaces defined in qgtm_core.
PACKAGE             RESPONSIBILITY                    TESTS
==================  ================================  =====
qgtm_core           Types, config, universe           ~200
qgtm_data           Data ingestion + PIT timestamps   ~250
qgtm_features       Feature engineering               ~200
qgtm_strategies     29 alpha strategies               ~350
qgtm_backtest       Walk-forward backtesting          ~250
qgtm_risk           8-factor risk + kill switch       ~300
qgtm_portfolio      Signal aggregation + allocation   ~200
qgtm_execution      OMS + broker adapter              ~200
qgtm_live           Daemon, scheduler, heartbeat      ~150
qgtm_signals        Signal publishing + tiers         ~150
qgtm_api            FastAPI backend                   ~200
qgtm_ml             LightGBM meta-labeller + HMM      ~200
qgtm_compliance     Wash sale, PDT, position limits   ~150
qgtm_audit          Merkle log, correlation IDs       ~200
==================  ================================  =====
TOTAL                                                 ~3,000+
Why 14 Packages, Not 1?
Each package can be tested, deployed, and reasoned about independently. If qgtm_data has a bug, we know exactly where to look. If qgtm_risk needs a new factor, we change one package without touching strategies.
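To make "typed interfaces" concrete, here is a minimal sketch of what a shared signal type and a strategy interface could look like. Every name in it (Direction, Signal, Strategy, GoldSilverRatioStrategy, the gs_ratio_z feature) is hypothetical and chosen for illustration; the real definitions in qgtm_core may look different.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Protocol


class Direction(Enum):
    LONG = 1
    FLAT = 0
    SHORT = -1


@dataclass(frozen=True)
class Signal:
    """Hypothetical shared type; the real qgtm_core types may differ."""
    strategy: str
    instrument: str
    direction: Direction
    conviction: float  # 0.0 (no confidence) to 1.0 (full confidence)
    size: float        # fraction of capital the strategy asks for
    generated_at: datetime

    def __post_init__(self) -> None:
        if not 0.0 <= self.conviction <= 1.0:
            raise ValueError("conviction must be in [0, 1]")


class Strategy(Protocol):
    """Any object with this shape can be plugged into the pipeline."""
    name: str

    def generate(self, features: dict[str, float]) -> list[Signal]: ...


class GoldSilverRatioStrategy:
    """Toy strategy, present only to show the interface being satisfied."""
    name = "gs_ratio_demo"

    def generate(self, features: dict[str, float]) -> list[Signal]:
        z = features["gs_ratio_z"]
        if abs(z) < 2.0:
            return []  # no signal unless the ratio is stretched
        direction = Direction.SHORT if z > 0 else Direction.LONG
        conviction = min(abs(z) / 4.0, 1.0)
        return [Signal(self.name, "GLD", direction, conviction, 0.02,
                       datetime.now(timezone.utc))]


def run(strategy: Strategy, features: dict[str, float]) -> list[Signal]:
    """Downstream packages depend on the Strategy shape, not on a concrete class."""
    return strategy.generate(features)
```

The point of the pattern: a package that consumes signals only needs to import the shared types, so a new strategy can be added without any consumer changing.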
The Signal Pipeline
This is the core flow that turns data into trades:
┌─ SIGNAL PIPELINE ──────────────────────────────────────────────────┐
│ │
│ STEP 1: DATA INGESTION (qgtm_data) │
│ Pull LBMA fix, COMEX stocks, FRED macro, ETF flows, COT │
│ Every data point gets as_of + available_at timestamps (PIT) │
│ │ │
│ ▼ │
│ STEP 2: FEATURE ENGINEERING (qgtm_features) │
│ Z-scores, rolling correlations, Kalman filters, vol estimates │
│ Output: feature matrix (29 strategies x N features each) │
│ │ │
│ ▼ │
│ STEP 3: SIGNAL GENERATION (qgtm_strategies) │
│ Each of 29 strategies produces: direction, conviction, size │
│ Raw signals: possibly 50+ signals per day │
│ │ │
│ ▼ │
│ STEP 4: SIGNAL AGGREGATION (qgtm_portfolio) │
│ Combine conflicting signals across strategies │
│ Apply regime weights from G2 classifier │
│ Output: target portfolio (instrument -> weight) │
│ │ │
│ ▼ │
│ STEP 5: REGIME ALLOCATION (qgtm_ml) │
│ HMM + BOCPD determines current regime │
│ Meta-labeller filters: keep / reduce / reject each signal │
│ Output: filtered + regime-weighted target portfolio │
│ │ │
│ ▼ │
│ STEP 6: COMPLIANCE CHECK (qgtm_compliance) │
│ Wash sale? PDT? Position limit? CFTC reporting threshold? │
│ Output: approved target portfolio (or blocked trades flagged) │
│ │ │
│ ▼ │
│ STEP 7: RISK GATING (qgtm_risk) │
│ 8-factor check, leverage check, CVaR check, kill switch check │
│ Output: risk-approved portfolio │
│ │ │
│ ▼ │
│ STEP 8: ORDER MANAGEMENT (qgtm_execution) │
│ Current portfolio vs target portfolio = required trades │
│ Smart order routing: limit vs market, timing, VWAP vs arrival │
│ Idempotency keys prevent duplicate orders │
│ │ │
│ ▼ │
│ STEP 9: BROKER EXECUTION │
│ Orders sent to broker API (Alpaca / Interactive Brokers) │
│ Fill reconciliation: our records vs broker records │
│ │ │
│ ▼ │
│ STEP 10: AUDIT LOG (qgtm_audit) │
│ Every step logged with correlation ID │
│ Merkle-chained: each log entry hashes the previous │
│ Tamper-evident: cannot alter history without breaking the chain │
│ │
└─────────────────────────────────────────────────────────────────────┘
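Two ideas from the pipeline are easier to see in code: point-in-time (PIT) timestamps from Step 1, and the trade diff with idempotency keys from Step 8. The sketch below is an illustration under assumptions, not the platform's implementation. DataPoint, latest_known, required_trades and idempotency_key are hypothetical names, and the key format is invented for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DataPoint:
    series: str
    value: float
    as_of: datetime         # the date the value describes
    available_at: datetime  # when it was actually published


def latest_known(points: list[DataPoint], series: str,
                 now: datetime) -> Optional[DataPoint]:
    """Step 1: newest value for `series` that was already published at `now`."""
    visible = [p for p in points
               if p.series == series and p.available_at <= now]
    return max(visible, key=lambda p: p.as_of, default=None)


def required_trades(current: dict[str, int],
                    target: dict[str, int]) -> dict[str, int]:
    """Step 8: target minus current, skipping instruments that need no trade."""
    symbols = sorted(set(current) | set(target))
    deltas = {s: target.get(s, 0) - current.get(s, 0) for s in symbols}
    return {s: d for s, d in deltas.items() if d != 0}


def idempotency_key(correlation_id: str, symbol: str, quantity: int) -> str:
    """Same inputs always give the same key, so a retry can be spotted as a duplicate."""
    raw = f"{correlation_id}|{symbol}|{quantity}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]
```

CFTC positioning data is a good PIT example: the report describes Tuesday's positions but is published on Friday. A backtest that asks latest_known() for the value "as of Wednesday" must get the previous week's report, because that is all a live system would have seen. Filtering on as_of alone would leak three days of future information.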
Next.js Terminal: Bloomberg-Style Panels
The frontend is a Next.js application styled as a Bloomberg Terminal -- dark background, dense information, keyboard-driven navigation.
The 7 Panel Types
PANEL  NAME        WHAT IT SHOWS
=====  ==========  ============================================
PORT   Portfolio   Current positions, P&L, Greeks, factor
                   exposures, drawdown gauge
RISK   Risk        8-factor decomposition, CVaR, kill switch
                   status, leverage utilization, stress test
                   results
COT    COT         CFTC positioning for gold + silver.
                   Z-score chart, managed money net, commercial
                   hedging pressure, historical extremes
TCA    TCA         Transaction Cost Analysis. Slippage vs VWAP,
                   implementation shortfall, fill rate, broker
                   comparison
OMON   Options     Options monitor. IV surface, skew, vol term
       Monitor     structure, Greek exposures, risk reversals
NEWS   News        Real-time news feed filtered for gold/silver.
                   Sentiment score, source, relevance ranking.
                   FOMC/CPI/NFP event countdown timer.
SCEN   Scenario    Stress testing interface. Select a crisis
                   scenario (Hunt 1980, COVID 2020, etc.) and
                   see simulated portfolio impact in real-time.
Panel Layout
┌─────────────────────┬─────────────────────┬──────────────────┐
│ │ │ │
│ PORT │ RISK │ NEWS │
│ (positions, P&L, │ (8 factors, CVaR, │ (sentiment, │
│ drawdown) │ kill switch) │ events) │
│ │ │ │
├─────────────────────┼─────────────────────┤ │
│ │ │ │
│ COT │ OMON │ │
│ (CFTC positioning, │ (vol surface, │ │
│ z-scores) │ skew, Greeks) │ │
│ │ │ │
├─────────────────────┴─────────────────────┼──────────────────┤
│ │ │
│ TCA / SCEN │ Command │
│ (slippage analysis / stress testing) │ Palette │
│ │ (Cmd+K) │
└───────────────────────────────────────────┴──────────────────┘
WebSocket Streaming + Auto-Reconnect
The terminal receives real-time updates over WebSocket:
FRONTEND (Next.js) BACKEND (FastAPI)
┌──────────────────┐ ┌──────────────────┐
│ │ WebSocket │ │
│ useWebSocket() │◄════════════════►│ /ws/stream │
│ hook handles: │ │ pushes: │
│ - auto-reconnect│ Heartbeat │ - price updates │
│ - exponential │ every 10s │ - P&L changes │
│ backoff │ │ - signal alerts │
│ - message queue │ │ - kill switch │
│ during │ │ status │
│ disconnect │ │ │
└──────────────────┘ └──────────────────┘
Reconnect strategy:
Attempt 1: 1 second delay
Attempt 2: 2 seconds
Attempt 3: 4 seconds
Attempt 4: 8 seconds
Max delay: 30 seconds
Messages queued during disconnect are replayed on reconnect.
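The reconnect schedule above is plain exponential backoff with a cap. The real logic lives in the frontend's useWebSocket() hook; the sketch below restates it in Python only to keep this chapter's examples in one language. reconnect_delay and BufferedSender are hypothetical names, not part of the codebase.

```python
from collections import deque
from typing import Callable


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay in seconds before reconnect attempt number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    # Double the delay on every attempt, but never wait longer than `cap`.
    return min(base * 2 ** (attempt - 1), cap)


class BufferedSender:
    """Queues outgoing messages while disconnected and replays them in order."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send
        self._pending: deque[str] = deque()
        self.connected = False

    def send(self, message: str) -> None:
        if self.connected:
            self._send(message)
        else:
            self._pending.append(message)  # hold until the link is back

    def on_disconnect(self) -> None:
        self.connected = False

    def on_reconnect(self) -> None:
        self.connected = True
        while self._pending:
            self._send(self._pending.popleft())  # oldest first
```

With the defaults, attempts 1 through 7 wait 1, 2, 4, 8, 16, 30 and 30 seconds: the fifth attempt still doubles to 16, and the sixth would be 32 but is clipped to the 30-second maximum.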
Self-Learning Loop
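The system monitors its own performance, flags strategies whose alpha is decaying, and retrains its models on a fixed schedule. New models must prove themselves in shadow mode before they trade.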
┌─ SELF-LEARNING LOOP ──────────────────────────────────────────┐
│ │
│ STEP 1: DECAY MONITOR │
│ Rolling 60-day Sharpe for each strategy vs its backtest │
│ IF live_sharpe < 0.5 * backtest_sharpe for 3+ months: │
│ FLAG strategy for investigation │
│ │
│ STEP 2: AUTO-RETRAIN │
│ LightGBM meta-labeller retrained monthly on walk-forward data │
│ HMM regime classifier refitted quarterly │
│ Kalman filter parameters updated weekly (by design) │
│ │
│ STEP 3: SHADOW-LIVE │
│ New model runs in shadow mode alongside production model │
│ Receives the same inputs and generates its own signals, but │
│ does NOT trade. Results are logged and compared. │
│ │
│ STEP 4: PROMOTE │
│ IF shadow model beats production over 30+ trading days │
│ AND shadow Sharpe > 0.5 │
│ AND shadow max drawdown < production max drawdown │
│ AND team approves: │
│ --> PROMOTE shadow to production │
│ --> Old production becomes the new shadow │
│ │
│ This is a continuous A/B test for models. │
│ │
└────────────────────────────────────────────────────────────────┘
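The decay flag (Step 1) and the promotion rule (Step 4) are both simple boolean checks. The sketch below writes them out under two assumptions that the diagram does not spell out: "3+ months" is read as three consecutive monthly readings, and "beats production" is read as a higher Sharpe ratio. The function names are hypothetical.

```python
def flag_strategy(live_sharpes: list[float], backtest_sharpe: float,
                  months: int = 3, ratio: float = 0.5) -> bool:
    """Step 1: True when the last `months` monthly readings of live Sharpe
    are all below `ratio` times the backtest Sharpe."""
    recent = live_sharpes[-months:]
    if len(recent) < months:
        return False  # not enough history to judge yet
    return all(s < ratio * backtest_sharpe for s in recent)


def should_promote(shadow_sharpe: float, prod_sharpe: float,
                   shadow_max_dd: float, prod_max_dd: float,
                   days_observed: int, team_approved: bool) -> bool:
    """Step 4: every condition must hold; drawdowns are positive fractions."""
    return (
        days_observed >= 30
        and shadow_sharpe > prod_sharpe      # assumed meaning of "beats"
        and shadow_sharpe > 0.5
        and shadow_max_dd < prod_max_dd
        and team_approved                    # a human still signs off
    )
```

For a strategy with a backtest Sharpe of 1.5 the threshold is 0.75, so monthly readings of 0.6, 0.5 and 0.4 raise the flag, while 0.6, 0.9 and 0.4 do not because the middle month recovered.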
Merkle-Chained Audit Log
Every action in the system is logged with a tamper-evident chain.
WHAT IS A MERKLE CHAIN?
Each log entry contains a hash of the PREVIOUS entry.
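If anyone modifies an old entry, its hash changes and no longer matches the prev_hash stored in the next entry, so verification fails from that point on.
This is the same hash-linking principle that blockchains use, without the distributed consensus.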
Entry 1: { action: "BUY GLD 500", hash: sha256(entry_1_data) }
Entry 2: { action: "SELL SLV 200", prev_hash: hash_of_entry_1,
hash: sha256(entry_2_data + prev_hash) }
Entry 3: { action: "RISK CHECK OK", prev_hash: hash_of_entry_2,
hash: sha256(entry_3_data + prev_hash) }
If someone changes Entry 1:
- Entry 1's hash changes
- Entry 2's prev_hash no longer matches
- The chain is broken --> TAMPER DETECTED
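The tamper check described above fits in a few lines. This is a minimal sketch of a hash chain, not the qgtm_audit implementation. It assumes entries are JSON-serializable dictionaries and, unlike the listing above where Entry 1 has no prev_hash, it gives the first entry an all-zero prev_hash so every entry is hashed the same way. The names entry_hash, append and verify are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the first entry


def entry_hash(data: dict, prev_hash: str) -> str:
    """sha256 over the entry's data plus the previous entry's hash."""
    payload = json.dumps(data, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()


def append(log: list[dict], data: dict) -> None:
    """Add an entry that commits to everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"data": data, "prev_hash": prev_hash,
                "hash": entry_hash(data, prev_hash)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash from the start; any edit breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["data"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

One caveat worth knowing: someone who can rewrite the whole log can also recompute every hash after their edit. A chain like this is only tamper-evident if the most recent hash is also recorded somewhere the editor cannot reach. This chapter does not say how the platform handles that.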
Correlation IDs
Every request gets a unique correlation ID that follows it through the entire pipeline:
correlation_id = "sig-20260412-143022-a7f3"
This ID appears in:
- Strategy signal log
- Risk check log
- Compliance check log
- Order creation log
- Broker submission log
- Fill confirmation log
- Audit trail
To debug any trade, search the correlation ID.
Every step is traceable end-to-end.
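One common way to get an ID into every log line without passing it through every function call is a context variable plus a logging filter. The sketch below assumes the ID format shown above (prefix, UTC date, time, four hex characters); how the platform actually generates and propagates IDs is not specified in this chapter, and the names here are hypothetical.

```python
import contextvars
import logging
import secrets
from datetime import datetime, timezone
from typing import Optional

# Holds the ID for whatever request is currently being processed.
_correlation_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "correlation_id", default="-")


def new_correlation_id(prefix: str = "sig",
                       now: Optional[datetime] = None) -> str:
    """Build an ID shaped like sig-20260412-143022-a7f3."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}-{now:%Y%m%d-%H%M%S}-{secrets.token_hex(2)}"


def bind(correlation_id: str) -> None:
    """Attach the ID to the current context; later log calls pick it up."""
    _correlation_id.set(correlation_id)


class CorrelationFilter(logging.Filter):
    """Stamps every log record with the current correlation ID."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = _correlation_id.get()
        return True


def build_logger(name: str = "qgtm_demo") -> logging.Logger:
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(correlation_id)s %(name)s %(message)s"))
    handler.addFilter(CorrelationFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

After bind(new_correlation_id()), every message written through build_logger() starts with the same ID, whether it comes from the risk check or the order submission. That shared prefix is what makes a single search return the whole history of a trade.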
CI/CD Pipeline
Every code change goes through automated quality gates before merging.
PULL REQUEST
│
▼
┌─────────────────────────────────────────────────┐
│ GATE 1: LINT (ruff) │
│ Python linting. Zero warnings tolerated. │
│ Enforces consistent style across 14 packages. │
└─────────────────────┬───────────────────────────┘
│ PASS
▼
┌─────────────────────────────────────────────────┐
│ GATE 2: TYPE CHECK (mypy --strict) │
│ Full static type analysis. Every function has │
│ typed parameters and return types. No Any. │
└─────────────────────┬───────────────────────────┘
│ PASS
▼
┌─────────────────────────────────────────────────┐
│ GATE 3: TESTS (pytest, 3000+ tests) │
│ Unit tests, integration tests, PIT audits. │
│ Coverage must stay above 80%. │
└─────────────────────┬───────────────────────────┘
│ PASS
▼
┌─────────────────────────────────────────────────┐
│ GATE 4: SBOM (Software Bill of Materials) │
│ Every dependency tracked with version + hash. │
│ Vulnerability scan: no known CVEs in deps. │
└─────────────────────┬───────────────────────────┘
│ PASS
▼
┌─────────────────────────────────────────────────┐
│ GATE 5: SIGNED RELEASE │
│ Release artifact is GPG-signed. │
│ Signature verified before deployment. │
│ Rollback to previous signed release in <60s. │
└─────────────────────┬───────────────────────────┘
│ PASS
▼
MERGE + DEPLOY
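The gates are fail-fast: the first failure stops the run and nothing after it executes. The platform runs them in GitHub Actions; the Python sketch below only imitates that ordering so the behaviour can be tried locally. run_gates and EXAMPLE_GATES are hypothetical, and the commands listed are typical invocations of each tool (the coverage flags need the pytest-cov plugin), not the project's actual workflow.

```python
import subprocess


def run_gates(gates: list[tuple[str, list[str]]]) -> tuple[bool, list[str]]:
    """Run each gate in order and stop at the first non-zero exit code.

    Returns (all_passed, names_of_gates_that_passed).
    """
    passed: list[str] = []
    for name, command in gates:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False, passed  # later gates never run
        passed.append(name)
    return True, passed


# Illustrative commands only; the real CI configuration may differ.
EXAMPLE_GATES = [
    ("lint", ["ruff", "check", "."]),
    ("types", ["mypy", "--strict", "."]),
    ("tests", ["pytest", "--cov", "--cov-fail-under=80"]),
]
```

Ordering the gates from cheapest to most expensive means a style error is reported in seconds, without waiting for the full test suite.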
Technology Choices
| Technology | What It Is | Why We Use It |
|---|---|---|
| Python 3.12 | Core language | Industry standard for quant finance |
| FastAPI | Web framework | Async, auto-docs, type-safe |
| Polars | DataFrame library | Much faster than pandas on our workloads (about 40x in the benchmark below) |
| LightGBM | Gradient boosting | Fast training, handles missing data |
| hmmlearn | Hidden Markov Models | Regime detection |
| Next.js | Frontend framework | SSR, fast navigation, React ecosystem |
| WebSocket | Real-time streaming | Sub-100ms latency for price updates |
| PostgreSQL | Primary database | Reliable, ACID, great for time-series |
| Redis | Cache + pub/sub | Hot state, real-time signal distribution |
| Docker | Containerization | Reproducible environments |
| GitHub Actions | CI/CD | Automated testing + deployment |
| Prometheus | Metrics collection | System health monitoring |
| Grafana | Dashboards | Visual monitoring of all metrics |
Why Polars Over Pandas?
Task: Calculate Kalman filter on gold/silver ratio, 20 years of data
Pandas: ██████████████████████████████████ 4.8 seconds
Polars: ████ 0.12 seconds
40x faster
When 29 strategies run every 15 minutes on multiple instruments,
speed is not optional.
The Monorepo
trading/ <-- THE REPO
├── qgtm_core/ <-- Types, config, universe
├── qgtm_data/ <-- Data ingestion + PIT
├── qgtm_features/ <-- Feature engineering
├── qgtm_strategies/ <-- 29 alpha strategies
├── qgtm_backtest/ <-- Walk-forward engine
├── qgtm_risk/ <-- 8-factor risk + kill switch
├── qgtm_portfolio/ <-- Signal aggregation
├── qgtm_execution/ <-- OMS + broker
├── qgtm_live/ <-- Daemon + scheduler
├── qgtm_signals/ <-- Signal publishing
├── qgtm_api/ <-- FastAPI backend
├── qgtm_ml/ <-- Meta-labeller + HMM
├── qgtm_compliance/ <-- Wash sale, PDT, limits
├── qgtm_audit/ <-- Merkle log + correlation IDs
├── qgtm_web/ <-- Next.js Bloomberg terminal
├── infra/ <-- Docker, K8s, Terraform
├── tests/ <-- 3000+ automated tests
├── docs/ <-- This tutorial
├── pyproject.toml <-- Python project config
└── .github/workflows/ <-- CI: ruff, mypy, pytest, SBOM
Summary
┌─── KEY TAKEAWAYS ───────────────────────────────────────────────┐
│ │
│ 1. 14 Python packages, each with a single responsibility. │
│ 3000+ tests. mypy strict. Zero-tolerance linting. │
│ │
│ 2. Signal pipeline: Data -> Features -> Strategies -> │
│ Aggregation -> Regime + Meta-label -> Compliance -> Risk -> │
│ OMS -> Broker -> Audit │
│ │
│ 3. Bloomberg-style terminal with 7 panel types (PORT, RISK, │
│ COT, TCA, OMON, NEWS, SCEN) streaming over WebSocket. │
│ │
│ 4. Self-learning: decay monitor -> auto-retrain -> shadow-live │
│ -> promote. Continuous model improvement without downtime. │
│ │
│ 5. Merkle-chained audit log with correlation IDs means every │
│ trade is traceable end-to-end and tamper-evident. │
│ │
│ 6. CI enforces: ruff lint, mypy strict, 3000+ tests, SBOM │
│ scan, and GPG-signed releases. No shortcuts. │
│ │
└──────────────────────────────────────────────────────────────────┘
Ready to run it yourself? Chapter 9: Running the Platform -->