Chapter 8: System Architecture

The goal: Understand how every component connects -- from data ingestion to trade execution to the Bloomberg-style terminal -- even if you are not a software engineer. If you can follow a flow chart, you can understand this system.

The 30,000-Foot View

  ┌─────────────────────────────────────────────────────────────────────┐
  │                        QGTM.AI ARCHITECTURE                         │
  │                                                                     │
  │   DATA SOURCES          SIGNAL ENGINE          EXECUTION            │
  │   ────────────          ────────────          ──────────            │
  │   LBMA, COMEX    ──►   29 Strategies   ──►   Compliance   ──►     │
  │   CFTC, FRED     ──►   Signal Agg      ──►   OMS          ──►     │
  │   ETF Flows      ──►   Meta-Labeller   ──►   Broker API   ──►     │
  │   Alt Data       ──►   Regime Alloc    ──►   Fill Recon          │
  │                                                                     │
  │                   RISK MANAGER (intercepts every step)              │
  │                                                                     │
  │   FRONTEND                MONITORING           AUDIT               │
  │   ────────                ──────────           ─────               │
  │   Next.js Terminal  ◄──  Prometheus     ──►   Merkle Log           │
  │   Bloomberg Panels  ◄──  Grafana        ──►   Correlation IDs      │
  │   WebSocket Stream  ◄──  PagerDuty      ──►   Signed Releases      │
  │                                                                     │
  └─────────────────────────────────────────────────────────────────────┘

Python Backend: 14 Packages, 3000+ Tests

The core trading system is Python, organized as a monorepo with 14 internal packages. Each package has a single responsibility and talks to others through typed interfaces defined in qgtm_core.

  PACKAGE              RESPONSIBILITY                    TESTS
  ==================   ================================  =====
  qgtm_core            Types, config, universe           ~200
  qgtm_data            Data ingestion + PIT timestamps   ~250
  qgtm_features        Feature engineering               ~200
  qgtm_strategies      29 alpha strategies               ~350
  qgtm_backtest        Walk-forward backtesting          ~250
  qgtm_risk            8-factor risk + kill switch       ~300
  qgtm_portfolio       Signal aggregation + allocation   ~200
  qgtm_execution       OMS + broker adapter              ~200
  qgtm_live            Daemon, scheduler, heartbeat      ~150
  qgtm_signals         Signal publishing + tiers         ~150
  qgtm_api             FastAPI backend                   ~200
  qgtm_ml              LightGBM meta-labeller + HMM      ~200
  qgtm_compliance      Wash sale, PDT, position limits   ~150
  qgtm_audit           Merkle log, correlation IDs       ~200
  ==================   ================================  =====
  TOTAL                                                  ~3,000+

Why 14 Packages, Not 1?

Each package can be tested, deployed, and reasoned about independently. If qgtm_data has a bug, we know exactly where to look. If qgtm_risk needs a new factor, we change one package without touching strategies.
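
To make "typed interfaces" concrete, here is a minimal sketch of what a shared type and a strategy interface might look like. The names Signal, Strategy, AlwaysLongGold, and run are hypothetical, chosen for illustration; they are not taken from the actual qgtm_core package.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Signal:
    """Hypothetical shared type: what a strategy hands to downstream packages."""
    instrument: str
    direction: int      # +1 long, -1 short, 0 flat
    conviction: float   # 0.0 (none) to 1.0 (maximum)


class Strategy(Protocol):
    """Hypothetical interface: anything with this method counts as a strategy."""

    def generate(self, features: dict[str, float]) -> list[Signal]: ...


class AlwaysLongGold:
    """Toy strategy, used only to show that the interface is satisfied."""

    def generate(self, features: dict[str, float]) -> list[Signal]:
        return [Signal(instrument="GLD", direction=1, conviction=0.5)]


def run(strategy: Strategy, features: dict[str, float]) -> list[Signal]:
    # Downstream code depends on the interface, not on a concrete strategy.
    return strategy.generate(features)


signals = run(AlwaysLongGold(), {"gold_silver_ratio_z": 1.2})
```

The point of the design is that a type checker can confirm every strategy returns the same Signal shape, so a package that consumes signals never needs to know which strategy produced them.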

The Signal Pipeline

This is the core flow that turns data into trades:

  ┌─ SIGNAL PIPELINE ──────────────────────────────────────────────────┐
  │                                                                     │
  │  STEP 1: DATA INGESTION (qgtm_data)                                │
  │  Pull LBMA fix, COMEX stocks, FRED macro, ETF flows, COT           │
  │  Every data point gets as_of + available_at timestamps              │
  │  (PIT = point-in-time: use only what was available at the time)     │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 2: FEATURE ENGINEERING (qgtm_features)                        │
  │  Z-scores, rolling correlations, Kalman filters, vol estimates      │
  │  Output: feature matrix (29 strategies x N features each)           │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 3: SIGNAL GENERATION (qgtm_strategies)                        │
  │  Each of 29 strategies produces: direction, conviction, size        │
  │  Raw signals: possibly 50+ signals per day                          │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 4: SIGNAL AGGREGATION (qgtm_portfolio)                        │
  │  Combine conflicting signals across strategies                      │
  │  Apply regime weights from G2 classifier                            │
  │  Output: target portfolio (instrument -> weight)                    │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 5: REGIME ALLOCATION (qgtm_ml)                                │
  │  HMM + BOCPD determines current regime                              │
  │  Meta-labeller filters: keep / reduce / reject each signal          │
  │  Output: filtered + regime-weighted target portfolio                │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 6: COMPLIANCE CHECK (qgtm_compliance)                         │
  │  Wash sale? PDT? Position limit? CFTC reporting threshold?         │
  │  Output: approved target portfolio (or blocked trades flagged)      │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 7: RISK GATING (qgtm_risk)                                   │
  │  8-factor check, leverage check, CVaR check, kill switch check      │
  │  Output: risk-approved portfolio                                    │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 8: ORDER MANAGEMENT (qgtm_execution)                          │
  │  Current portfolio vs target portfolio = required trades            │
  │  Smart order routing: limit vs market, timing, VWAP vs arrival     │
  │  Idempotency keys prevent duplicate orders                          │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 9: BROKER EXECUTION                                           │
  │  Orders sent to broker API (Alpaca / Interactive Brokers)           │
  │  Fill reconciliation: our records vs broker records                 │
  │                              │                                      │
  │                              ▼                                      │
  │  STEP 10: AUDIT LOG (qgtm_audit)                                    │
  │  Every step logged with correlation ID                              │
  │  Merkle-chained: each log entry hashes the previous                │
  │  Tamper-evident: cannot alter history without breaking the chain    │
  │                                                                     │
  └─────────────────────────────────────────────────────────────────────┘
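
Step 8 is the easiest step to illustrate in code. The sketch below compares a current portfolio with a target portfolio to get the required trades, then derives an idempotency key so that a retried order is recognised as a duplicate. It assumes positions are held as share counts and that the key is built from the correlation ID, instrument, and quantity; both are assumptions for illustration, and OrderBook is a hypothetical stand-in for the real OMS.

```python
import hashlib


def required_trades(current: dict[str, int], target: dict[str, int]) -> dict[str, int]:
    """Share delta per instrument: positive means buy, negative means sell."""
    instruments = sorted(set(current) | set(target))
    deltas = {name: target.get(name, 0) - current.get(name, 0) for name in instruments}
    return {name: delta for name, delta in deltas.items() if delta != 0}


def idempotency_key(correlation_id: str, instrument: str, quantity: int) -> str:
    """The same logical order always maps to the same key (assumed key recipe)."""
    raw = f"{correlation_id}|{instrument}|{quantity}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class OrderBook:
    """Toy stand-in for an OMS that refuses to submit the same key twice."""

    def __init__(self) -> None:
        self.submitted: dict[str, tuple[str, int]] = {}

    def submit(self, key: str, instrument: str, quantity: int) -> bool:
        if key in self.submitted:
            return False  # duplicate: this order was already sent
        self.submitted[key] = (instrument, quantity)
        return True


current = {"GLD": 500, "SLV": 200}
target = {"GLD": 300, "SLV": 200, "GDX": 100}
trades = required_trades(current, target)

oms = OrderBook()
results = []
for instrument, quantity in trades.items():
    key = idempotency_key("sig-20260412-143022-a7f3", instrument, quantity)
    results.append(oms.submit(key, instrument, quantity))  # first attempt
    results.append(oms.submit(key, instrument, quantity))  # simulated retry
```

Here SLV drops out because its current and target positions already match, and each simulated retry is rejected because its key has been seen before.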

Next.js Terminal: Bloomberg-Style Panels

The frontend is a Next.js application styled as a Bloomberg Terminal -- dark background, dense information, keyboard-driven navigation.

The 7 Panel Types

  PANEL    NAME               WHAT IT SHOWS
  =====    ===============    ============================================
  PORT     Portfolio          Current positions, P&L, Greeks, factor
                              exposures, drawdown gauge

  RISK     Risk               8-factor decomposition, CVaR, kill switch
                              status, leverage utilization, stress test
                              results

  COT      COT                CFTC positioning for gold + silver.
                              Z-score chart, managed money net, commercial
                              hedging pressure, historical extremes

  TCA      TCA                Transaction Cost Analysis. Slippage vs VWAP,
                              implementation shortfall, fill rate, broker
                              comparison

  OMON     Options Monitor    Options monitor. IV surface, skew, vol term
                              structure, Greek exposures, risk reversals

  NEWS     News               Real-time news feed filtered for gold/silver.
                              Sentiment score, source, relevance ranking.
                              FOMC/CPI/NFP event countdown timer.

  SCEN     Scenario           Stress testing interface. Select a crisis
                              scenario (Hunt 1980, COVID 2020, etc.) and
                              see simulated portfolio impact in real-time.

Panel Layout

  ┌─────────────────────┬─────────────────────┬──────────────────┐
  │                     │                     │                  │
  │       PORT          │       RISK          │      NEWS        │
  │  (positions, P&L,   │  (8 factors, CVaR,  │  (sentiment,     │
  │   drawdown)         │   kill switch)      │   events)        │
  │                     │                     │                  │
  ├─────────────────────┼─────────────────────┤                  │
  │                     │                     │                  │
  │       COT           │       OMON          │                  │
  │  (CFTC positioning, │  (vol surface,      │                  │
  │   z-scores)         │   skew, Greeks)     │                  │
  │                     │                     │                  │
  ├─────────────────────┴─────────────────────┼──────────────────┤
  │                                           │                  │
  │              TCA / SCEN                   │   Command        │
  │  (slippage analysis / stress testing)     │   Palette        │
  │                                           │   (Cmd+K)        │
  └───────────────────────────────────────────┴──────────────────┘

WebSocket Streaming + Auto-Reconnect

The terminal receives real-time updates over WebSocket:

  FRONTEND (Next.js)                    BACKEND (FastAPI)
  ┌──────────────────┐                  ┌──────────────────┐
  │                  │  WebSocket       │                  │
  │  useWebSocket()  │◄════════════════►│  /ws/stream      │
  │  hook handles:   │                  │  pushes:         │
  │  - auto-reconnect│  Heartbeat       │  - price updates │
  │  - exponential   │  every 10s       │  - P&L changes   │
  │    backoff       │                  │  - signal alerts  │
  │  - message queue │                  │  - kill switch   │
  │    during        │                  │    status         │
  │    disconnect    │                  │                  │
  └──────────────────┘                  └──────────────────┘

  Reconnect strategy:
  Attempt 1: 1 second delay
  Attempt 2: 2 seconds
  Attempt 3: 4 seconds
  Attempt 4: 8 seconds
  Max delay: 30 seconds
  Messages queued during disconnect are replayed on reconnect.
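
The reconnect schedule above is plain exponential backoff with a cap. The real hook lives in the Next.js frontend; the Python sketch below only illustrates the arithmetic and the queue-then-replay idea. The names reconnect_delay and OutboundQueue are hypothetical, and the queue assumes the messages being held are outbound ones on the client side.

```python
from collections import deque


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Double the delay on every attempt, but never exceed the cap.
    return min(base * 2 ** (attempt - 1), cap)


class OutboundQueue:
    """Holds messages while disconnected and replays them in order on reconnect."""

    def __init__(self) -> None:
        self.connected = False
        self.pending: deque[str] = deque()
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        if self.connected:
            self.sent.append(message)
        else:
            self.pending.append(message)  # queue during disconnect

    def on_reconnect(self) -> None:
        self.connected = True
        while self.pending:
            self.sent.append(self.pending.popleft())  # replay oldest first


delays = [reconnect_delay(n) for n in range(1, 8)]

queue = OutboundQueue()
queue.send("subscribe GLD")   # sent while disconnected, so it is queued
queue.send("subscribe SLV")
queue.on_reconnect()          # queued messages are replayed here
queue.send("subscribe GDX")   # sent immediately, connection is up
```

Attempts 1 to 4 reproduce the 1, 2, 4, 8 second delays listed above; attempt 5 waits 16 seconds, and every later attempt is held at the 30 second maximum.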

Self-Learning Loop

The system monitors its own performance, flags strategies whose live results fall behind their backtests, and retrains its models on a fixed schedule.

  ┌─ SELF-LEARNING LOOP ──────────────────────────────────────────┐
  │                                                                │
  │  STEP 1: DECAY MONITOR                                        │
  │  Rolling 60-day Sharpe for each strategy vs its backtest       │
  │  IF live_sharpe < 0.5 * backtest_sharpe for 3+ months:         │
  │      FLAG strategy for investigation                            │
  │                                                                │
  │  STEP 2: AUTO-RETRAIN                                          │
  │  LightGBM meta-labeller retrained monthly on walk-forward data │
  │  HMM regime classifier refitted quarterly                      │
  │  Kalman filter parameters updated weekly (by design)           │
  │                                                                │
  │  STEP 3: SHADOW-LIVE                                           │
  │  New model runs in shadow mode alongside production model       │
  │  Receives same inputs, generates its own outputs, but does NOT  │
  │  trade. Results logged and compared.                            │
  │                                                                │
  │  STEP 4: PROMOTE                                               │
  │  IF shadow model beats production over 30+ trading days         │
  │      AND shadow Sharpe > 0.5                                    │
  │      AND shadow max drawdown < production max drawdown          │
  │      AND team approves:                                         │
  │      --> PROMOTE shadow to production                           │
  │      --> Old production becomes the new shadow                  │
  │                                                                │
  │  This is a continuous A/B test for models.                      │
  │                                                                │
  └────────────────────────────────────────────────────────────────┘
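
The decay rule in Step 1 and the promotion rule in Step 4 are both simple boolean checks. The sketch below is one possible reading, not the production logic: it assumes one rolling-Sharpe reading per month, treats "beats production" as "has a higher Sharpe", and expresses drawdown as a positive fraction. ModelStats, is_decayed, and can_promote are hypothetical names.

```python
from dataclasses import dataclass


def is_decayed(monthly_live_sharpe: list[float], backtest_sharpe: float,
               months: int = 3) -> bool:
    """Step 1: live Sharpe below half the backtest Sharpe for `months` readings in a row."""
    recent = monthly_live_sharpe[-months:]
    if len(recent) < months:
        return False  # not enough history to judge yet
    return all(sharpe < 0.5 * backtest_sharpe for sharpe in recent)


@dataclass(frozen=True)
class ModelStats:
    sharpe: float
    max_drawdown: float  # positive fraction, e.g. 0.08 means an 8% drawdown
    trading_days: int


def can_promote(shadow: ModelStats, production: ModelStats, team_approved: bool) -> bool:
    """Step 4: every condition must hold before the shadow model is promoted."""
    return (
        shadow.trading_days >= 30
        and shadow.sharpe > production.sharpe  # "beats production" (assumed meaning)
        and shadow.sharpe > 0.5
        and shadow.max_drawdown < production.max_drawdown
        and team_approved
    )


flagged = is_decayed([0.9, 0.4, 0.3, 0.2], backtest_sharpe=1.0)

production = ModelStats(sharpe=0.7, max_drawdown=0.10, trading_days=250)
shadow = ModelStats(sharpe=0.9, max_drawdown=0.08, trading_days=35)
promote = can_promote(shadow, production, team_approved=True)
blocked = can_promote(shadow, production, team_approved=False)
```

Note that the final condition is a human one: even a shadow model that wins on every metric stays in shadow mode until the team approves.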

Merkle-Chained Audit Log

Every action in the system is logged with a tamper-evident chain.

  WHAT IS A MERKLE CHAIN?

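  Each log entry contains a hash of the PREVIOUS entry.
  If anyone modifies an old entry, every subsequent hash breaks.
  This is the same principle behind blockchain, but simpler.
  (Strictly speaking, a linear chain like this is a hash chain; a
  Merkle tree is the tree-shaped relative that hashes entries in pairs.)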

  Entry 1: { action: "BUY GLD 500", hash: sha256(entry_1_data) }
  Entry 2: { action: "SELL SLV 200", prev_hash: hash_of_entry_1,
             hash: sha256(entry_2_data + prev_hash) }
  Entry 3: { action: "RISK CHECK OK", prev_hash: hash_of_entry_2,
             hash: sha256(entry_3_data + prev_hash) }

  If someone changes Entry 1:
  - Entry 1's hash changes
  - Entry 2's prev_hash no longer matches
  - The chain is broken --> TAMPER DETECTED
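
The chain above can be sketched in a few lines with Python's standard hashlib. This is a minimal illustration, not the qgtm_audit implementation: the entry layout, the all-zero GENESIS value for the first entry, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the very first entry


def entry_hash(action: str, prev_hash: str) -> str:
    """Hash the entry's data together with the previous entry's hash."""
    payload = json.dumps({"action": action, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list[dict[str, str]], action: str) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"action": action, "prev_hash": prev_hash,
                "hash": entry_hash(action, prev_hash)})


def verify(log: list[dict[str, str]]) -> bool:
    """Walk the chain from the start; any edited entry breaks a link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False  # link to the previous entry is broken
        if entry["hash"] != entry_hash(entry["action"], entry["prev_hash"]):
            return False  # entry contents no longer match its own hash
        prev_hash = entry["hash"]
    return True


log: list[dict[str, str]] = []
for action in ["BUY GLD 500", "SELL SLV 200", "RISK CHECK OK"]:
    append(log, action)

intact = verify(log)                # chain is untouched at this point
log[0]["action"] = "BUY GLD 5000"   # someone edits history
tampered_ok = verify(log)           # verification now fails
```

Recomputing the first entry's hash after the edit does not help an attacker, because the second entry still stores the old hash as its prev_hash; they would have to rewrite every later entry as well.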

Correlation IDs

Every request gets a unique correlation ID that follows it through the entire pipeline:

  correlation_id = "sig-20260412-143022-a7f3"

  This ID appears in:
  - Strategy signal log
  - Risk check log
  - Compliance check log
  - Order creation log
  - Broker submission log
  - Fill confirmation log
  - Audit trail

  To debug any trade, search the correlation ID.
  Every step is traceable end-to-end.
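
As a rough sketch of how such an ID could be produced and used, the code below builds an ID in the same shape as the example (prefix, date, time, four hex characters) and filters a log by it. The helper names and the log-line layout are hypothetical; only the ID format comes from the example above.

```python
import secrets
from datetime import datetime, timezone
from typing import Optional


def new_correlation_id(prefix: str = "sig", now: Optional[datetime] = None) -> str:
    """Build an ID shaped like 'sig-20260412-143022-a7f3'."""
    moment = now or datetime.now(timezone.utc)
    suffix = secrets.token_hex(2)  # 2 random bytes -> 4 hex characters
    return f"{prefix}-{moment:%Y%m%d-%H%M%S}-{suffix}"


def log_step(trail: list[str], correlation_id: str, step: str) -> None:
    # Every log line starts with the correlation ID so it can be searched later.
    trail.append(f"{correlation_id} {step}")


def trace(trail: list[str], correlation_id: str) -> list[str]:
    """Return every log line that belongs to one request, in order."""
    return [line for line in trail if line.startswith(correlation_id + " ")]


trail: list[str] = []
first = new_correlation_id(now=datetime(2026, 4, 12, 14, 30, 22))
second = new_correlation_id(prefix="ord", now=datetime(2026, 4, 12, 14, 31, 5))

log_step(trail, first, "strategy signal")
log_step(trail, second, "strategy signal")
log_step(trail, first, "risk check")
log_step(trail, first, "order created")

steps_for_first = trace(trail, first)
```

Searching for the first ID returns its three steps in order and skips the unrelated request, which is the whole debugging workflow in miniature.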

CI/CD Pipeline

Every code change goes through automated quality gates before merging.

  PULL REQUEST
       │
       ▼
  ┌─────────────────────────────────────────────────┐
  │  GATE 1: LINT (ruff)                             │
  │  Python linting. Zero warnings tolerated.        │
  │  Enforces consistent style across 14 packages.   │
  └─────────────────────┬───────────────────────────┘
                        │ PASS
                        ▼
  ┌─────────────────────────────────────────────────┐
  │  GATE 2: TYPE CHECK (mypy --strict)              │
  │  Full static type analysis. Every function has   │
  │  typed parameters and return types. No Any.      │
  └─────────────────────┬───────────────────────────┘
                        │ PASS
                        ▼
  ┌─────────────────────────────────────────────────┐
  │  GATE 3: TESTS (pytest, 3000+ tests)             │
  │  Unit tests, integration tests, PIT audits.      │
  │  Coverage must stay above 80%.                   │
  └─────────────────────┬───────────────────────────┘
                        │ PASS
                        ▼
  ┌─────────────────────────────────────────────────┐
  │  GATE 4: SBOM (Software Bill of Materials)       │
  │  Every dependency tracked with version + hash.   │
  │  Vulnerability scan: no known CVEs in deps.      │
  └─────────────────────┬───────────────────────────┘
                        │ PASS
                        ▼
  ┌─────────────────────────────────────────────────┐
  │  GATE 5: SIGNED RELEASE                          │
  │  Release artifact is GPG-signed.                 │
  │  Signature verified before deployment.           │
  │  Rollback to previous signed release in <60s.    │
  └─────────────────────┬───────────────────────────┘
                        │ PASS
                        ▼
       MERGE + DEPLOY

Technology Choices

  TECHNOLOGY       WHAT IT IS             WHY WE USE IT
  ==============   ====================   ========================================
  Python 3.12      Core language          Industry standard for quant finance
  FastAPI          Web framework          Async, auto-docs, type-safe
  Polars           DataFrame library      40x faster than pandas in our benchmark
  LightGBM         Gradient boosting      Fast training, handles missing data
  hmmlearn         Hidden Markov Models   Regime detection
  Next.js          Frontend framework     SSR, fast navigation, React ecosystem
  WebSocket        Real-time streaming    Sub-100ms latency for price updates
  PostgreSQL       Primary database       Reliable, ACID, great for time-series
  Redis            Cache + pub/sub        Hot state, real-time signal distribution
  Docker           Containerization       Reproducible environments
  GitHub Actions   CI/CD                  Automated testing + deployment
  Prometheus       Metrics collection     System health monitoring
  Grafana          Dashboards             Visual monitoring of all metrics

Why Polars Over Pandas?

  Task: Calculate Kalman filter on gold/silver ratio, 20 years of data

  Pandas:  ██████████████████████████████████  4.8 seconds
  Polars:  ████                                0.12 seconds
                                                40x faster

  When 29 strategies run every 15 minutes on multiple instruments,
  speed is not optional.

The Monorepo

  trading/                             <-- THE REPO
  ├── qgtm_core/                       <-- Types, config, universe
  ├── qgtm_data/                       <-- Data ingestion + PIT
  ├── qgtm_features/                   <-- Feature engineering
  ├── qgtm_strategies/                 <-- 29 alpha strategies
  ├── qgtm_backtest/                   <-- Walk-forward engine
  ├── qgtm_risk/                       <-- 8-factor risk + kill switch
  ├── qgtm_portfolio/                  <-- Signal aggregation
  ├── qgtm_execution/                  <-- OMS + broker
  ├── qgtm_live/                       <-- Daemon + scheduler
  ├── qgtm_signals/                    <-- Signal publishing
  ├── qgtm_api/                        <-- FastAPI backend
  ├── qgtm_ml/                         <-- Meta-labeller + HMM
  ├── qgtm_compliance/                 <-- Wash sale, PDT, limits
  ├── qgtm_audit/                      <-- Merkle log + correlation IDs
  ├── qgtm_web/                        <-- Next.js Bloomberg terminal
  ├── infra/                           <-- Docker, K8s, Terraform
  ├── tests/                           <-- 3000+ automated tests
  ├── docs/                            <-- This tutorial
  ├── pyproject.toml                   <-- Python project config
  └── .github/workflows/              <-- CI: ruff, mypy, pytest, SBOM

Summary

  ┌─── KEY TAKEAWAYS ───────────────────────────────────────────────┐
  │                                                                  │
  │  1. 14 Python packages, each with a single responsibility.       │
  │     3000+ tests. mypy strict. Zero-tolerance linting.            │
  │                                                                  │
  │  2. Signal pipeline: Data -> Features -> Strategies ->           │
  │     Aggregation -> Regime/Meta -> Compliance -> Risk -> OMS      │
  │     -> Broker -> Audit                                           │
  │                                                                  │
  │  3. Bloomberg-style terminal with 7 panel types (PORT, RISK,     │
  │     COT, TCA, OMON, NEWS, SCEN) streaming over WebSocket.       │
  │                                                                  │
  │  4. Self-learning: decay monitor -> auto-retrain -> shadow-live  │
  │     -> promote. Continuous model improvement without downtime.   │
  │                                                                  │
  │  5. Merkle-chained audit log with correlation IDs means every    │
  │     trade is traceable end-to-end and tamper-evident.            │
  │                                                                  │
  │  6. CI enforces: ruff lint, mypy strict, 3000+ tests, SBOM      │
  │     scan, and GPG-signed releases. No shortcuts.                 │
  │                                                                  │
  └──────────────────────────────────────────────────────────────────┘

Ready to run it yourself? Chapter 9: Running the Platform -->