Architecture — QGTM.AI Trading Platform
System Overview
QGTM.AI is a monorepo-based commodity ETF algorithmic trading platform with three primary surfaces:
- Proprietary Trading Engine -- systematic alpha generation across commodity ETFs
- Signal Publication Service -- tiered subscription business delivering vetted trade signals
- Web Terminal -- Bloomberg-style interface for monitoring, research, and subscriber access
Design Principles
- Correctness over speed -- No lookahead bias, proper event ordering, point-in-time data
- Paper-first -- Every path works in paper mode; live trading is a gated promotion
- Repo-first -- Everything is code, committed, versioned, reproducible
- Broker-agnostic -- Alpaca is primary, but the abstraction layer supports IB/Tradier/Tastytrade
- Defense in depth -- Multiple independent risk checks at strategy, portfolio, and execution layers
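
To make the "no lookahead bias, point-in-time data" principle concrete, here is a minimal sketch of a point-in-time lookup. The names `Observation` and `pit_lookup` are hypothetical and the values are illustrative; this is not the platform's `pit_join`, only the rule it is described as enforcing: a decision made at time `as_of` may use only values whose knowledge time is at or before `as_of`.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Observation:
    knowledge_time: datetime  # when the value became available to the system
    value: float


def pit_lookup(history: Sequence[Observation], as_of: datetime) -> Optional[float]:
    """Return the latest value known at or before `as_of`, or None.

    `history` must be sorted by knowledge_time in ascending order.
    """
    times = [obs.knowledge_time for obs in history]
    known = bisect_right(times, as_of)  # number of observations known by as_of
    return history[known - 1].value if known else None


# Illustrative values only, not real data.
series = [
    Observation(datetime(2024, 1, 11), 3.4),
    Observation(datetime(2024, 2, 13), 3.1),
]

# A decision made on 2024-02-01 must not see the 2024-02-13 release.
print(pit_lookup(series, datetime(2024, 2, 1)))
```

Keying the lookup on knowledge time rather than the period the value describes is what keeps revised or late-published data out of a backtest.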
C4 Model Diagrams
Level 1: System Context
Who uses the system and which external systems it interacts with.
C4Context
title QGTM.AI — System Context (C4 Level 1)
Person(owner, "Owner/Operator", "Monitors P&L, approves kill-switch resets, manages strategy lifecycle")
Person(subscriber, "Signal Subscriber", "Receives trade signals via API/Discord/Telegram/email")
System(qgtm, "QGTM.AI Trading Platform", "Systematic commodity ETF trading engine with signal publication and web terminal")
System_Ext(alpaca, "Alpaca", "Primary broker: order execution, market data, positions, account")
System_Ext(fred, "FRED", "Federal Reserve macro data: rates, DXY, breakevens, PMI")
System_Ext(cftc, "CFTC", "Commitments of Traders reports: managed money, commercial, swap dealer positioning")
System_Ext(lbma, "LBMA / Nasdaq DL", "London gold/silver fix prices, vault holdings, clearing volumes")
System_Ext(comex, "CME / COMEX", "Warehouse stocks (registered/eligible), futures term structure")
System_Ext(eia, "EIA", "Energy inventory data for commodity macro context")
System_Ext(noaa, "NOAA / ECMWF", "Weather ensembles for agricultural commodity signals")
System_Ext(monitoring, "Grafana Cloud", "Metrics dashboards, alerting, Prometheus remote write")
System_Ext(discord, "Discord / Telegram", "Signal delivery channels for subscribers")
System_Ext(auth, "Clerk / Auth.js", "Authentication and user management")
System_Ext(secrets, "Doppler", "Secrets management: API keys, broker credentials")
Rel(owner, qgtm, "Monitors, configures, approves")
Rel(subscriber, qgtm, "Receives signals, views terminal")
Rel(qgtm, alpaca, "Orders, market data, positions", "REST/WS")
Rel(qgtm, fred, "Macro series", "REST")
Rel(qgtm, cftc, "COT reports", "HTTP/ZIP")
Rel(qgtm, lbma, "Fix prices, vault data", "REST")
Rel(qgtm, comex, "Warehouse stocks", "REST")
Rel(qgtm, eia, "Energy inventories", "REST")
Rel(qgtm, noaa, "Weather data", "REST")
Rel(qgtm, monitoring, "Metrics, traces", "OTLP/Prometheus")
Rel(qgtm, discord, "Signal delivery", "Bot API")
Rel(qgtm, auth, "AuthN/AuthZ", "OIDC")
Rel(qgtm, secrets, "Secrets", "SDK")
Level 2: Container Diagram
The major deployable units within the QGTM.AI platform.
C4Container
title QGTM.AI — Container Diagram (C4 Level 2)
Person(owner, "Owner/Operator")
Person(subscriber, "Signal Subscriber")
System_Boundary(qgtm, "QGTM.AI Platform") {
Container(web, "Web Terminal", "Next.js 15, React", "Bloomberg-style dark UI with command palette, real-time data grids, keyboard navigation")
Container(api, "API Backend", "FastAPI, Python 3.12+", "REST + WebSocket API for terminal, signals, admin, subscriber management")
Container(daemon, "Trading Daemon", "Python asyncio", "Event-driven live trading: signal generation, risk checks, order execution, reconciliation")
Container(signals, "Signal Publisher", "Python", "Tiered signal delivery to subscribers via Discord, Telegram, email, API")
Container(bot, "Chat Bots", "Python", "Discord + Telegram bots for signal delivery and community interaction")
ContainerDb(postgres, "PostgreSQL", "Neon", "Metadata, signals, subscribers, audit trail")
ContainerDb(duckdb, "DuckDB", "Embedded", "Fast analytical queries on bar/tick data and features")
ContainerDb(redis, "Redis", "Upstash", "Hot state, heartbeats, pub/sub streams, cache")
ContainerDb(r2, "Object Storage", "Cloudflare R2", "Parquet files, model artifacts, backtest results")
Container(prometheus, "Prometheus + Grafana", "Grafana Cloud", "Metrics collection, dashboards, alerting")
}
System_Ext(alpaca, "Alpaca Broker", "Order execution, market data")
System_Ext(data_providers, "Data Providers", "FRED, CFTC, LBMA, COMEX, EIA, NOAA")
Rel(owner, web, "Monitors, manages")
Rel(subscriber, web, "Views signals, research")
Rel(web, api, "REST/WS", "HTTPS")
Rel(api, postgres, "Read/Write", "SQL")
Rel(api, redis, "Cache, pub/sub", "Redis protocol")
Rel(daemon, redis, "Heartbeat, hot state", "Redis protocol")
Rel(daemon, postgres, "Audit log, signals", "SQL")
Rel(daemon, duckdb, "Feature queries", "SQL")
Rel(daemon, alpaca, "Orders, positions, data", "REST/WS")
Rel(daemon, data_providers, "Market/macro data", "REST")
Rel(daemon, r2, "Model artifacts", "S3 API")
Rel(daemon, prometheus, "Metrics, traces", "OTLP")
Rel(signals, postgres, "Read signals", "SQL")
Rel(signals, bot, "Publishes signals")
Rel(bot, subscriber, "Signal delivery")
Level 3: Component Diagram — Trading Daemon
The internal components of the trading daemon, the core of the system.
C4Component
title Trading Daemon — Component Diagram (C4 Level 3)
Container_Boundary(daemon, "Trading Daemon") {
Component(strategies, "Strategy Library", "Python", "40+ strategies: TSMOM, XSMOM, stat-arb, regime, options, ML ensemble. Each emits Signal objects.")
Component(regime, "Regime Detector", "Python", "Classifies market regime (trending, mean-reverting, crisis) using HMM and rule-based indicators")
Component(aggregator, "Signal Aggregator", "Python", "Collects signals from all active strategies, deduplicates, applies regime filter")
Component(allocator, "Portfolio Allocator", "Python", "Multi-strategy weighting: equal-risk, inverse-vol, HRP. Adaptive leverage with drawdown dampening.")
Component(risk, "Risk Manager", "Python", "Tiered kill-switch (WARN/THROTTLE/NO_NEW/FLATTEN), vol targeting, Kelly sizing, CVaR limits, correlation caps")
Component(oms, "Order Management System", "Python", "Signal-to-order lifecycle: idempotent IDs, algo routing (VWAP/TWAP), partial fill handling")
Component(recon, "Reconciliation Engine", "Python", "60s cycle: broker vs internal positions. Classifies: MISSING_LOCALLY, MISSING_AT_BROKER, QTY_MISMATCH, VALUE_MISMATCH")
Component(watchdog, "Watchdog / Dead-Man's Switch", "Python", "Independent heartbeat monitor. Triggers emergency flatten if daemon stalls beyond timeout.")
Component(audit, "Audit Log", "Python", "Immutable append-only log with Merkle chaining. Records every order, fill, risk event, recon result.")
Component(retrain, "Auto-Retrain Loop", "Python", "Decay detect -> retrain -> validate (walk-forward OOS, PBO) -> shadow -> promote. Never auto-promotes without shadow validation.")
Component(decay, "Decay Monitor", "Python", "Tracks rolling Sharpe, hit rate, factor exposure drift. Flags WARNING/DECAYED strategies for retrain.")
Component(scheduler, "Scheduler", "Python", "APScheduler-based: triggers rebalance at 15:30 ET, reconciliation every 60s, data refresh on cron")
}
ContainerDb(redis, "Redis")
ContainerDb(postgres, "PostgreSQL")
ContainerDb(duckdb, "DuckDB")
System_Ext(broker, "Alpaca Broker")
Rel(scheduler, strategies, "Triggers signal generation")
Rel(strategies, aggregator, "Emit signals")
Rel(regime, strategies, "Regime context")
Rel(aggregator, allocator, "Filtered signals")
Rel(allocator, risk, "Proposed portfolio")
Rel(risk, oms, "Risk-checked orders")
Rel(oms, broker, "Submit orders", "REST/WS")
Rel(broker, oms, "Fills, rejects", "WS")
Rel(recon, broker, "Query positions", "REST")
Rel(recon, audit, "Log discrepancies")
Rel(watchdog, redis, "Read heartbeat")
Rel(watchdog, broker, "Emergency flatten")
Rel(retrain, decay, "Monitor status")
Rel(oms, audit, "Order lifecycle events")
Rel(risk, audit, "Kill-switch events")
Rel(strategies, duckdb, "Feature queries")
Rel(audit, postgres, "Persist audit trail")
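
The Audit Log component is described as append-only with Merkle chaining. The sketch below shows the simplest form of that idea, a linear hash chain in which each entry commits to the hash of the previous one. It is an assumption about the general technique, not the platform's implementation: the class `AuditLogSketch`, the field names, and the in-memory list (the real log persists to PostgreSQL) are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: Dict[str, Any]) -> str:
    """SHA-256 over a canonical (sorted-key) JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLogSketch:
    """Hypothetical in-memory hash-chained log."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, event_type: str, payload: Dict[str, Any]) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {"event_type": event_type, "payload": payload, "prev_hash": prev_hash}
        entry_hash = _digest(body)
        self.entries.append({**body, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {key: entry[key] for key in ("event_type", "payload", "prev_hash")}
            if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLogSketch()
log.append("ORDER_SUBMITTED", {"symbol": "GLD", "qty": 10})
log.append("ORDER_FILLED", {"symbol": "GLD", "qty": 10})
print(log.verify())
```

Because each hash covers the previous hash, changing any historical entry invalidates every later one, which is what makes after-the-fact tampering detectable.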
Sequence Diagrams
Signal-to-Order Flow
The critical path from strategy signal generation through order execution to post-trade transaction cost analysis (TCA).
sequenceDiagram
autonumber
participant S as Strategy
participant AGG as Signal Aggregator
participant A as Portfolio Allocator
participant R as Risk Manager
participant OMS as OMS
participant B as Alpaca Broker
participant TCA as TCA Engine
participant AL as Audit Log
S->>S: generate_signals(features)
S->>AGG: Signal(symbol, side, weight, confidence)
Note over AGG: Deduplicate, apply regime filter
AGG->>A: filtered signals by strategy
A->>A: compute_weights(inverse_vol / HRP)
A->>A: adaptive_leverage(drawdown_dampening)
A->>R: proposed_portfolio (target weights)
R->>R: check_kill_tier()
alt Kill tier >= NO_NEW
R-->>OMS: BLOCKED (kill-switch active)
R->>AL: log(KILL_SWITCH_BLOCK)
else Kill tier == THROTTLE
R->>R: apply throttle_factor (50% size reduction)
R->>OMS: throttled orders
else NORMAL / WARN
R->>R: check_position_limits()
R->>R: check_correlation_caps()
R->>R: check_sector_exposure()
R->>OMS: approved orders
end
OMS->>OMS: select_algo(qty vs ADV)
alt Order > 5% ADV
OMS->>OMS: TWAP (capped participation)
else Order > 1% ADV
OMS->>OMS: VWAP (volume curve)
else Small order
OMS->>OMS: Direct market order
end
OMS->>AL: log(ORDER_SUBMITTED)
OMS->>B: submit_order(idempotent_id)
B-->>OMS: fill / partial_fill / reject
OMS->>AL: log(ORDER_FILLED / REJECTED)
OMS->>TCA: Fill(price, qty, timestamp, venue)
TCA->>TCA: arrival_price_cost()
TCA->>TCA: vwap_slippage()
TCA->>TCA: implementation_shortfall()
TCA->>AL: log(TCA_RESULT)
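
The algo-routing branch and the idempotent order ID in the flow above can be sketched as follows. The 5% and 1% ADV thresholds come from the diagram; the function names, the string labels, and the choice of fields hashed into the order ID are assumptions made for illustration.

```python
import hashlib


def select_algo(order_qty: float, adv: float) -> str:
    """Pick an execution algo from order size relative to average daily volume."""
    if adv <= 0:
        raise ValueError("adv must be positive")
    participation = abs(order_qty) / adv
    if participation > 0.05:
        return "TWAP"    # large order: capped participation over time
    if participation > 0.01:
        return "VWAP"    # medium order: follow the volume curve
    return "MARKET"      # small order: direct market order


def idempotent_order_id(strategy_id: str, symbol: str, side: str, rebalance_ts: str) -> str:
    """Derive a deterministic ID so a retried submission maps to the same order.

    Hypothetical scheme: hash the fields that identify one intended order.
    """
    key = "|".join((strategy_id, symbol, side, rebalance_ts))
    return hashlib.sha256(key.encode()).hexdigest()[:32]


print(select_algo(6_000, 100_000))
print(idempotent_order_id("tsmom", "GLD", "buy", "2024-03-01T15:30"))
```

A deterministic ID means a retry after a timeout produces the same identifier as the first attempt, which gives the broker side a key to detect the duplicate.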
Kill-Switch Escalation
Tiered response to adverse conditions -- from a logged warning through to a full position flatten.
sequenceDiagram
autonumber
participant D as Trading Daemon
participant R as Risk Manager
participant OMS as OMS
participant B as Broker
participant AL as Audit Log
participant O as Owner (Alert)
D->>R: update_pnl(daily_pnl, equity)
R->>R: compute drawdown vs peak
alt Daily PnL < -50% of limit
R->>R: escalate to WARN
R->>AL: log(TIER_CHANGE, NORMAL->WARN)
R->>O: alert(WARN: approaching daily loss limit)
end
D->>R: update_pnl(worsening)
alt Daily PnL < -75% of limit
R->>R: escalate to THROTTLE
R->>AL: log(TIER_CHANGE, WARN->THROTTLE)
R->>O: alert(THROTTLE: sizes reduced 50%)
Note over R: throttle_factor = 0.5
end
D->>R: update_pnl(breach)
alt Daily PnL < -100% of limit
R->>R: escalate to NO_NEW
R->>AL: log(TIER_CHANGE, THROTTLE->NO_NEW)
R->>O: alert(NO_NEW: new positions blocked)
Note over OMS: Only close/reduce allowed
end
D->>R: update_pnl(max drawdown breached)
alt Drawdown > max_drawdown_pct
R->>R: escalate to FLATTEN
R->>AL: log(TIER_CHANGE, NO_NEW->FLATTEN)
R->>O: alert(FLATTEN: emergency liquidation)
R->>OMS: flatten_all_positions()
OMS->>B: submit close orders (all positions)
B-->>OMS: fills
OMS->>AL: log(EMERGENCY_FLATTEN_COMPLETE)
end
Note over R: De-escalation requires two-person approval
O->>R: reset_kill_switch(approver="co-pilot")
R->>R: verify approver != operator
R->>R: reset to NORMAL
R->>AL: log(KILL_SWITCH_RESET, approver)
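
A minimal sketch of the escalation ladder above. The tier names, the 50% / 75% / 100% loss thresholds, the 0.5 throttle factor, and the approver check come from the diagram. The class `KillSwitchSketch`, its method names, and the choice to make the tier ratchet upward until an explicit reset are assumptions.

```python
from enum import IntEnum


class KillTier(IntEnum):
    NORMAL = 0
    WARN = 1
    THROTTLE = 2
    NO_NEW = 3
    FLATTEN = 4


def classify_tier(daily_pnl: float, daily_loss_limit: float,
                  drawdown_pct: float, max_drawdown_pct: float) -> KillTier:
    """Map current P&L and drawdown to a tier; daily_loss_limit is a positive amount."""
    if daily_loss_limit <= 0:
        raise ValueError("daily_loss_limit must be positive")
    if drawdown_pct > max_drawdown_pct:
        return KillTier.FLATTEN
    loss_fraction = max(0.0, -daily_pnl) / daily_loss_limit
    if loss_fraction > 1.0:
        return KillTier.NO_NEW
    if loss_fraction > 0.75:
        return KillTier.THROTTLE
    if loss_fraction > 0.5:
        return KillTier.WARN
    return KillTier.NORMAL


class KillSwitchSketch:
    """Hypothetical holder of the current tier; it never de-escalates on its own."""

    def __init__(self, operator: str) -> None:
        self.operator = operator
        self.tier = KillTier.NORMAL

    def update(self, daily_pnl: float, daily_loss_limit: float,
               drawdown_pct: float, max_drawdown_pct: float) -> KillTier:
        observed = classify_tier(daily_pnl, daily_loss_limit, drawdown_pct, max_drawdown_pct)
        self.tier = max(self.tier, observed)  # ratchet: only escalate
        return self.tier

    def size_multiplier(self) -> float:
        """Scale applied to new position sizes at the current tier."""
        if self.tier >= KillTier.NO_NEW:
            return 0.0
        return 0.5 if self.tier == KillTier.THROTTLE else 1.0

    def reset(self, approver: str) -> None:
        """Two-person rule: the approver must differ from the operator."""
        if approver == self.operator:
            raise PermissionError("reset requires a second person")
        self.tier = KillTier.NORMAL


switch = KillSwitchSketch(operator="alice")
print(switch.update(-800.0, 1000.0, 0.01, 0.10).name)
```

Using an ordered enum lets checks such as "kill tier >= NO_NEW" in the signal-to-order flow be written as plain comparisons.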
Self-Learning Loop (Decay Detect, Retrain, Shadow, Promote)
Automated model lifecycle management with mandatory shadow validation.
sequenceDiagram
autonumber
participant DM as Decay Monitor
participant AR as Auto-Retrain
participant DP as Data Provider
participant T as Model Trainer
participant V as Validator
participant SR as Shadow Runner
participant MR as Model Registry
participant AL as Audit Log
participant O as Owner (Alert)
loop Every rebalance cycle
DM->>DM: compute rolling Sharpe, hit rate, exposure drift
DM->>DM: classify: HEALTHY / WARNING / DECAYED
end
alt Status == WARNING or DECAYED
DM->>AR: trigger_retrain(strategy_id, status)
AR->>AL: log(RETRAIN_TRIGGERED, strategy_id)
AR->>DP: fetch_training_data(strategy_id, lookback)
DP-->>AR: TrainingData
AR->>T: train(strategy_id, data, hyperparameters)
T-->>AR: TrainResult(model, metrics)
AR->>V: walk_forward_oos(model, data)
V-->>AR: oos_sharpe, oos_hit_rate
AR->>V: pbo_check(model)
V-->>AR: pbo_probability
alt Validation fails (PBO > 0.5 or OOS Sharpe < threshold)
AR->>AL: log(RETRAIN_FAILED, reason)
AR->>O: alert(retrain failed, manual review needed)
else Validation passes
AR->>MR: register(model, stage=STAGING)
AR->>AL: log(MODEL_STAGED, strategy_id, version)
AR->>SR: run_shadow(candidate, current, duration)
SR->>SR: paper-trade both models in parallel
SR-->>AR: ShadowResult(candidate_sharpe, current_sharpe, p_value)
alt Shadow underperforms (p > 0.05 or worse Sharpe)
AR->>AL: log(SHADOW_FAILED, metrics)
AR->>O: alert(shadow failed, candidate discarded)
AR->>MR: update_stage(model, ARCHIVED)
else Shadow passes
AR->>MR: update_stage(model, PRODUCTION)
AR->>MR: update_stage(old_model, ARCHIVED)
AR->>AL: log(MODEL_PROMOTED, strategy_id, version)
AR->>O: alert(model promoted to production)
end
end
end
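
The two gates in the loop above reduce to a small decision function. The PBO > 0.5 and p > 0.05 cut-offs and the stage names STAGING, ARCHIVED, and PRODUCTION come from the diagram. The function names, the `REJECTED` label for a failed retrain, and treating an equal Sharpe as "not worse" are assumptions in this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShadowResult:
    candidate_sharpe: float
    current_sharpe: float
    p_value: float


def validation_passes(pbo_probability: float, oos_sharpe: float,
                      min_oos_sharpe: float) -> bool:
    """Fail when PBO > 0.5 or the out-of-sample Sharpe is below the threshold."""
    return pbo_probability <= 0.5 and oos_sharpe >= min_oos_sharpe


def shadow_passes(result: ShadowResult, alpha: float = 0.05) -> bool:
    """Fail when p > alpha or the candidate's Sharpe is worse than the current model's."""
    return result.p_value <= alpha and result.candidate_sharpe >= result.current_sharpe


def next_stage(pbo_probability: float, oos_sharpe: float, min_oos_sharpe: float,
               shadow: Optional[ShadowResult]) -> str:
    """Return the stage a candidate model should move to."""
    if not validation_passes(pbo_probability, oos_sharpe, min_oos_sharpe):
        return "REJECTED"   # hypothetical label: retrain failed, nothing registered
    if shadow is None:
        return "STAGING"    # validated, shadow run not finished yet
    return "PRODUCTION" if shadow_passes(shadow) else "ARCHIVED"


print(next_stage(0.2, 1.1, 0.8, ShadowResult(1.3, 1.0, 0.02)))
```

Because `next_stage` returns `PRODUCTION` only when a `ShadowResult` is supplied and passes, a model cannot be promoted without shadow validation.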
Reconciliation Flow
Continuous broker-to-internal position reconciliation with classification and remediation.
sequenceDiagram
autonumber
participant SCH as Scheduler (60s)
participant RE as Reconciliation Engine
participant B as Alpaca Broker
participant INT as Internal State
participant AL as Audit Log
participant O as Owner (Alert)
SCH->>RE: trigger_reconciliation()
RE->>B: get_positions()
B-->>RE: broker_positions[]
RE->>INT: get_local_positions()
INT-->>RE: local_positions[]
RE->>RE: compare(broker, local)
loop For each discrepancy
alt MISSING_LOCALLY (broker has, we don't)
RE->>RE: classify(MISSING_LOCALLY, severity)
RE->>AL: log(RECON_DISCREPANCY, MISSING_LOCALLY)
alt Severity == CRITICAL
RE->>O: alert(unknown position detected!)
end
else MISSING_AT_BROKER (we have, broker doesn't)
RE->>RE: classify(MISSING_AT_BROKER, severity)
RE->>AL: log(RECON_DISCREPANCY, MISSING_AT_BROKER)
RE->>INT: mark_position_stale()
else QTY_MISMATCH
RE->>RE: classify(QTY_MISMATCH, severity)
RE->>AL: log(RECON_DISCREPANCY, QTY_MISMATCH)
alt Auto-correctable (< threshold)
RE->>INT: adjust_to_broker_qty()
else Requires review
RE->>O: alert(quantity mismatch, review needed)
end
else VALUE_MISMATCH
RE->>RE: classify(VALUE_MISMATCH, severity)
Note over RE: Usually stale prices, resolve on next tick
end
end
RE->>AL: log(RECON_COMPLETE, broker=N, local=M, discrepancies=K)
RE-->>SCH: ReconciliationResult
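
The comparison step can be sketched as a pure function over two position maps. The four discrepancy classes come from the diagram; the `Position` tuple, the function name, and the 1% relative value tolerance are assumptions, and severity scoring and remediation are left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class Position(NamedTuple):
    qty: float
    market_value: float


def classify_discrepancies(broker: Dict[str, Position], local: Dict[str, Position],
                           value_tolerance: float = 0.01) -> List[Tuple[str, str]]:
    """Compare broker and internal positions; return (symbol, discrepancy) pairs."""
    findings: List[Tuple[str, str]] = []
    for symbol in sorted(set(broker) | set(local)):
        at_broker, ours = broker.get(symbol), local.get(symbol)
        if ours is None:
            findings.append((symbol, "MISSING_LOCALLY"))      # broker has it, we do not
        elif at_broker is None:
            findings.append((symbol, "MISSING_AT_BROKER"))    # we have it, broker does not
        elif at_broker.qty != ours.qty:
            findings.append((symbol, "QTY_MISMATCH"))
        elif abs(at_broker.market_value - ours.market_value) > value_tolerance * abs(at_broker.market_value):
            findings.append((symbol, "VALUE_MISMATCH"))       # same qty, different valuation
    return findings


# Illustrative positions only.
broker_positions = {"GLD": Position(100, 20000.0), "SLV": Position(50, 1200.0),
                    "USO": Position(10, 700.0), "DBA": Position(5, 120.0)}
local_positions = {"GLD": Position(100, 20000.0), "SLV": Position(40, 960.0),
                   "USO": Position(10, 650.0), "UNG": Position(7, 100.0)}
print(classify_discrepancies(broker_positions, local_positions))
```

Checking quantity before value means a position is reported once, under the more serious class, since a quantity difference almost always implies a value difference too.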
Data Flow Diagram
End-to-end data flow from raw sources through the point-in-time (PIT) layer to strategy signals and execution.
flowchart LR
subgraph Sources["Raw Data Sources"]
ALP[Alpaca<br/>Bars / Trades / Quotes]
FRED_S[FRED<br/>Rates, DXY, Breakevens]
CFTC_S[CFTC<br/>COT Reports]
LBMA_S[LBMA<br/>Fix Prices, Vault Data]
COMEX_S[COMEX<br/>Warehouse Stocks]
EIA_S[EIA<br/>Energy Inventories]
NOAA_S[NOAA / ECMWF<br/>Weather Ensembles]
SAT[Satellite / AIS<br/>Alt Data]
end
subgraph Ingestion["qgtm_data + qgtm_altdata"]
ING[Ingestion Pipelines<br/>Rate limiting, retry,<br/>schema validation]
PIT[PIT Layer<br/>pit_join: knowledge_time at or before as-of time<br/>No future data leakage]
DQ[Data Quality Suite<br/>Null checks, range checks,<br/>freshness, monotonicity,<br/>distribution drift]
end
subgraph Storage["Persistent Storage"]
DUCK[(DuckDB<br/>Bar/tick data,<br/>analytical queries)]
PG[(PostgreSQL<br/>Metadata, signals,<br/>subscribers, audit)]
RD[(Redis<br/>Hot state, streams,<br/>heartbeats)]
R2_S[(R2 / S3<br/>Parquet, models,<br/>backtest artifacts)]
end
subgraph Features["qgtm_features"]
FS[Feature Store<br/>Multi-horizon returns,<br/>vol, term structure,<br/>COT z-scores,<br/>correlation regimes]
end
subgraph Strategies["qgtm_strategies"]
STRAT[Strategy Library<br/>40+ strategies:<br/>TSMOM, XSMOM, stat-arb,<br/>regime, options, ML]
end
subgraph Portfolio["Portfolio + Risk"]
AGG2[Signal Aggregator<br/>Dedup + regime filter]
ALLOC[Portfolio Allocator<br/>Equal-risk / HRP /<br/>inverse-vol]
RISK[Risk Manager<br/>Kill-switch tiers,<br/>vol targeting, CVaR]
end
subgraph Execution["qgtm_execution"]
OMS2[OMS<br/>Algo routing:<br/>VWAP / TWAP / Market]
BRK[Broker Adapter<br/>Alpaca primary,<br/>IB/Tradier/Tastytrade]
TCA2[TCA<br/>Arrival price,<br/>VWAP slippage,<br/>impl. shortfall]
end
Sources --> ING
ING --> PIT
PIT --> DQ
DQ --> DUCK
DQ --> PG
DQ --> R2_S
DUCK --> FS
FS --> STRAT
STRAT --> AGG2
AGG2 --> ALLOC
ALLOC --> RISK
RISK --> OMS2
OMS2 --> BRK
BRK --> TCA2
TCA2 --> PG
RD -.-> |hot state| OMS2
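
Of the weighting schemes named in the Portfolio Allocator node, inverse-vol is the simplest to write down: each strategy's weight is proportional to the reciprocal of its volatility. The sketch below assumes a plain mapping of strategy name to volatility estimate; the function name and inputs are hypothetical, and HRP and drawdown dampening are not covered.

```python
from typing import Dict


def inverse_vol_weights(vols: Dict[str, float]) -> Dict[str, float]:
    """Return weights proportional to 1/vol, normalised to sum to 1."""
    if not vols:
        raise ValueError("vols must not be empty")
    if any(vol <= 0 for vol in vols.values()):
        raise ValueError("volatilities must be positive")
    inverse = {name: 1.0 / vol for name, vol in vols.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Illustrative volatilities: the lower-vol strategy receives the larger weight.
weights = inverse_vol_weights({"trend": 0.10, "carry": 0.20})
print(weights)
```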
Module Responsibilities
| Module | Purpose | Key Abstractions |
|---|---|---|
| qgtm_core | Shared types, config, universe, calendars | Signal, Order, PortfolioSnapshot, RiskLimits |
| qgtm_data | Market data ingestion, PIT joins, quality checks | FundamentalProvider, pit_join, DataQualitySuite |
| qgtm_altdata | Alternative data: satellite, AIS, weather | Batch-oriented pipelines, higher latency tolerance |
| qgtm_features | ArcticDB/DuckDB-backed feature store | Versioned, reproducible feature transforms |
| qgtm_strategies | Strategy library (40+ strategies) | BaseStrategy with declared factor exposures, capacity |
| qgtm_backtest | Backtesting: vectorbt + event-driven harness | Walk-forward, purged k-fold CV, PBO, deflated Sharpe |
| qgtm_risk | Risk management: kill-switch, vol targeting, CVaR | RiskManager, KillTier, DecayMonitor |
| qgtm_portfolio | Multi-strategy allocator | PortfolioAllocator: equal-risk, inverse-vol, HRP |
| qgtm_execution | Broker abstraction, OMS, algo routing, TCA | OrderManagementSystem, AlpacaBroker, VWAPExecutor |
| qgtm_live | Live trading daemon, reconciliation, watchdog | TradingDaemon, ReconciliationEngine, DeadManSwitch |
| qgtm_signals | Signal publication with tiered delays | SignalPublisher |
| qgtm_api | FastAPI backend: REST + WebSocket | Subscriber mgmt, admin endpoints, real-time updates |
| qgtm_web | Next.js 15 terminal UI | Dark theme, command palette, keyboard-first |
| qgtm_bot | Discord + Telegram bots | Signal delivery, community interaction |
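
The table describes qgtm_signals as publishing "with tiered delays" but does not list the tiers. The sketch below shows one way such a delay could be applied. The tier names, the delay values, and the `PublishedSignal` type are placeholders invented for illustration, not the platform's configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, NamedTuple


class PublishedSignal(NamedTuple):
    symbol: str
    side: str
    generated_at: datetime


# Hypothetical tiers and delays; the real values are not given in this document.
TIER_DELAYS: Dict[str, timedelta] = {
    "premium": timedelta(0),
    "standard": timedelta(minutes=15),
    "free": timedelta(hours=24),
}


def visible_signals(signals: List[PublishedSignal], tier: str,
                    now: datetime) -> List[PublishedSignal]:
    """Return the signals whose tier-specific delay has elapsed by `now`."""
    delay = TIER_DELAYS[tier]
    return [s for s in signals if s.generated_at + delay <= now]


generated = datetime(2024, 3, 1, 15, 30, tzinfo=timezone.utc)
feed = [PublishedSignal("GLD", "buy", generated)]
ten_minutes_later = generated + timedelta(minutes=10)
print(len(visible_signals(feed, "premium", ten_minutes_later)))
print(len(visible_signals(feed, "standard", ten_minutes_later)))
```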
Deployment Topology
| Service | Platform | Region |
|---|---|---|
| Frontend | Cloudflare Pages | Edge (global) |
| API Workers | Cloudflare Workers | Edge (global) |
| API Backend | Fly.io | us-east (iad) |
| Live Trader | Fly.io | us-east (iad) |
| Signals | Fly.io | us-east (iad) |
| PostgreSQL | Neon | us-east |
| Redis | Upstash | us-east |
| Object Storage | Cloudflare R2 | Auto |
| Secrets | Doppler | N/A |
| Monitoring | Grafana Cloud | N/A |
Security Model
                +------------------+
                |    Cloudflare    |  <-- WAF, DDoS, rate limiting
                |  Pages/Workers   |
                +--------+---------+
                         |
                +--------+---------+
                |     qgtm_web     |  <-- Auth (Clerk/Auth.js)
                |     qgtm_api     |  <-- RBAC: viewer|trader|admin|subscriber
                +--------+---------+
                         |
          +--------------+--------------+
          |              |              |
    +-----+------+  +----+------+ +-----+-----+
    | qgtm_live  |  | qgtm_risk | | qgtm_exec |
    |            |  |           | |           |
    | LIVE_TRADING_ENABLED      | |   Paper   |
    | = false (default)         | |  default  |
    +------------+  +-----------+ +-----------+
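
The diagram shows live trading disabled by default behind a `LIVE_TRADING_ENABLED` flag, with the execution layer defaulting to paper. A minimal sketch of that gate follows, assuming the flag is read from the environment and that only the literal value `true` enables live mode. The function names and the parsing rule are assumptions.

```python
import os
from typing import Mapping, Optional


def live_trading_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when LIVE_TRADING_ENABLED is explicitly set to 'true'."""
    source = os.environ if env is None else env
    return source.get("LIVE_TRADING_ENABLED", "false").strip().lower() == "true"


def broker_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Fail closed: anything other than an explicit opt-in selects paper trading."""
    return "live" if live_trading_enabled(env) else "paper"


print(broker_mode({}))
print(broker_mode({"LIVE_TRADING_ENABLED": "true"}))
```

Treating a missing, empty, or misspelled value as `false` keeps a misconfigured deployment in paper mode instead of sending real orders.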