Architecture — QGTM.AI Trading Platform

System Overview

QGTM.AI is a monorepo-based commodity ETF algorithmic trading platform with three primary surfaces:

Proprietary Trading Engine -- systematic alpha generation across commodity ETFs
Signal Publication Service -- tiered subscription business delivering vetted trade signals
Web Terminal -- Bloomberg-style interface for monitoring, research, and subscriber access

Design Principles

Correctness over speed -- No lookahead bias, proper event ordering, point-in-time data
Paper-first -- Every path works in paper mode; live trading is a gated promotion
Repo-first -- Everything is code, committed, versioned, reproducible
Broker-agnostic -- Alpaca is primary, but the abstraction layer supports IB/Tradier/Tastytrade
Defense in depth -- Multiple independent risk checks at strategy, portfolio, and execution layers
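
The broker-agnostic and paper-first principles above can be illustrated with a small interface sketch. This is a minimal example under stated assumptions, not the platform's actual abstraction layer: `OrderRequest`, `Broker`, `PaperBroker`, and `rebalance` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class OrderRequest:
    """Hypothetical order shape; the real `Order` type may differ."""
    client_order_id: str  # idempotency key chosen by the caller
    symbol: str
    qty: float            # positive = buy, negative = sell


class Broker(Protocol):
    """Assumed minimal broker interface shared by paper and live adapters."""

    def submit_order(self, order: OrderRequest) -> str: ...
    def get_positions(self) -> dict[str, float]: ...


class PaperBroker:
    """In-memory paper broker that fills every order instantly and in full."""

    def __init__(self) -> None:
        self._positions: dict[str, float] = {}
        self._seen: set[str] = set()

    def submit_order(self, order: OrderRequest) -> str:
        # Re-submitting the same client_order_id is a no-op (idempotent).
        if order.client_order_id not in self._seen:
            self._seen.add(order.client_order_id)
            new_qty = self._positions.get(order.symbol, 0.0) + order.qty
            self._positions[order.symbol] = new_qty
        return order.client_order_id

    def get_positions(self) -> dict[str, float]:
        return dict(self._positions)


def rebalance(broker: Broker, orders: list[OrderRequest]) -> dict[str, float]:
    """Caller code depends only on the Broker protocol, not on a vendor SDK."""
    for order in orders:
        broker.submit_order(order)
    return broker.get_positions()
```

Because `rebalance` only sees the protocol, a live adapter for any supported broker could be swapped in without touching strategy or risk code, which is the point of the principle.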

C4 Model Diagrams

Level 1: System Context

Who uses the system, and which external systems it interacts with.

C4Context
    title QGTM.AI — System Context (C4 Level 1)

    Person(owner, "Owner/Operator", "Monitors P&L, approves kill-switch resets, manages strategy lifecycle")
    Person(subscriber, "Signal Subscriber", "Receives trade signals via API/Discord/Telegram/email")

    System(qgtm, "QGTM.AI Trading Platform", "Systematic commodity ETF trading engine with signal publication and web terminal")

    System_Ext(alpaca, "Alpaca", "Primary broker: order execution, market data, positions, account")
    System_Ext(fred, "FRED", "Federal Reserve macro data: rates, DXY, breakevens, PMI")
    System_Ext(cftc, "CFTC", "Commitments of Traders reports: managed money, commercial, swap dealer positioning")
    System_Ext(lbma, "LBMA / Nasdaq DL", "London gold/silver fix prices, vault holdings, clearing volumes")
    System_Ext(comex, "CME / COMEX", "Warehouse stocks (registered/eligible), futures term structure")
    System_Ext(eia, "EIA", "Energy inventory data for commodity macro context")
    System_Ext(noaa, "NOAA / ECMWF", "Weather ensembles for agricultural commodity signals")
    System_Ext(monitoring, "Grafana Cloud", "Metrics dashboards, alerting, Prometheus remote write")
    System_Ext(discord, "Discord / Telegram", "Signal delivery channels for subscribers")
    System_Ext(auth, "Clerk / Auth.js", "Authentication and user management")
    System_Ext(secrets, "Doppler", "Secrets management: API keys, broker credentials")

    Rel(owner, qgtm, "Monitors, configures, approves")
    Rel(subscriber, qgtm, "Receives signals, views terminal")
    Rel(qgtm, alpaca, "Orders, market data, positions", "REST/WS")
    Rel(qgtm, fred, "Macro series", "REST")
    Rel(qgtm, cftc, "COT reports", "HTTP/ZIP")
    Rel(qgtm, lbma, "Fix prices, vault data", "REST")
    Rel(qgtm, comex, "Warehouse stocks", "REST")
    Rel(qgtm, eia, "Energy inventories", "REST")
    Rel(qgtm, noaa, "Weather data", "REST")
    Rel(qgtm, monitoring, "Metrics, traces", "OTLP/Prometheus")
    Rel(qgtm, discord, "Signal delivery", "Bot API")
    Rel(qgtm, auth, "AuthN/AuthZ", "OIDC")
    Rel(qgtm, secrets, "Secrets", "SDK")

Level 2: Container Diagram

The major deployable units within the QGTM.AI platform.

C4Container
    title QGTM.AI — Container Diagram (C4 Level 2)

    Person(owner, "Owner/Operator")
    Person(subscriber, "Signal Subscriber")

    System_Boundary(qgtm, "QGTM.AI Platform") {
        Container(web, "Web Terminal", "Next.js 15, React", "Bloomberg-style dark UI with command palette, real-time data grids, keyboard navigation")
        Container(api, "API Backend", "FastAPI, Python 3.12+", "REST + WebSocket API for terminal, signals, admin, subscriber management")
        Container(daemon, "Trading Daemon", "Python asyncio", "Event-driven live trading: signal generation, risk checks, order execution, reconciliation")
        Container(signals, "Signal Publisher", "Python", "Tiered signal delivery to subscribers via Discord, Telegram, email, API")
        Container(bot, "Chat Bots", "Python", "Discord + Telegram bots for signal delivery and community interaction")
        ContainerDb(postgres, "PostgreSQL", "Neon", "Metadata, signals, subscribers, audit trail")
        ContainerDb(duckdb, "DuckDB", "Embedded", "Fast analytical queries on bar/tick data and features")
        ContainerDb(redis, "Redis", "Upstash", "Hot state, heartbeats, pub/sub streams, cache")
        ContainerDb(r2, "Object Storage", "Cloudflare R2", "Parquet files, model artifacts, backtest results")
        Container(prometheus, "Prometheus + Grafana", "Grafana Cloud", "Metrics collection, dashboards, alerting")
    }

    System_Ext(alpaca, "Alpaca Broker", "Order execution, market data")
    System_Ext(data_providers, "Data Providers", "FRED, CFTC, LBMA, COMEX, EIA, NOAA")

    Rel(owner, web, "Monitors, manages")
    Rel(subscriber, web, "Views signals, research")
    Rel(web, api, "REST/WS", "HTTPS")
    Rel(api, postgres, "Read/Write", "SQL")
    Rel(api, redis, "Cache, pub/sub", "Redis protocol")
    Rel(daemon, redis, "Heartbeat, hot state", "Redis protocol")
    Rel(daemon, postgres, "Audit log, signals", "SQL")
    Rel(daemon, duckdb, "Feature queries", "SQL")
    Rel(daemon, alpaca, "Orders, positions, data", "REST/WS")
    Rel(daemon, data_providers, "Market/macro data", "REST")
    Rel(daemon, r2, "Model artifacts", "S3 API")
    Rel(daemon, prometheus, "Metrics, traces", "OTLP")
    Rel(signals, postgres, "Read signals", "SQL")
    Rel(signals, bot, "Publishes signals")
    Rel(bot, subscriber, "Signal delivery")

Level 3: Component Diagram — Trading Daemon

The internal components of the trading daemon, the core of the system.

C4Component
    title Trading Daemon — Component Diagram (C4 Level 3)

    Container_Boundary(daemon, "Trading Daemon") {
        Component(strategies, "Strategy Library", "Python", "40+ strategies: TSMOM, XSMOM, stat-arb, regime, options, ML ensemble. Each emits Signal objects.")
        Component(regime, "Regime Detector", "Python", "Classifies market regime (trending, mean-reverting, crisis) using HMM and rule-based indicators")
        Component(aggregator, "Signal Aggregator", "Python", "Collects signals from all active strategies, deduplicates, applies regime filter")
        Component(allocator, "Portfolio Allocator", "Python", "Multi-strategy weighting: equal-risk, inverse-vol, HRP. Adaptive leverage with drawdown dampening.")
        Component(risk, "Risk Manager", "Python", "Tiered kill-switch (WARN/THROTTLE/NO_NEW/FLATTEN), vol targeting, Kelly sizing, CVaR limits, correlation caps")
        Component(oms, "Order Management System", "Python", "Signal-to-order lifecycle: idempotent IDs, algo routing (VWAP/TWAP), partial fill handling")
        Component(recon, "Reconciliation Engine", "Python", "60s cycle: broker vs internal positions. Classifies: MISSING_LOCALLY, MISSING_AT_BROKER, QTY_MISMATCH, VALUE_MISMATCH")
        Component(watchdog, "Watchdog / Dead-Man's Switch", "Python", "Independent heartbeat monitor. Triggers emergency flatten if daemon stalls beyond timeout.")
        Component(audit, "Audit Log", "Python", "Immutable append-only log with Merkle chaining. Records every order, fill, risk event, recon result.")
        Component(retrain, "Auto-Retrain Loop", "Python", "Decay detect -> retrain -> validate (walk-forward OOS, PBO) -> shadow -> promote. Never auto-promotes without shadow validation.")
        Component(decay, "Decay Monitor", "Python", "Tracks rolling Sharpe, hit rate, factor exposure drift. Flags WARNING/DECAYED strategies for retrain.")
        Component(scheduler, "Scheduler", "Python", "APScheduler-based: triggers rebalance at 15:30 ET, reconciliation every 60s, data refresh on cron")
    }

    ContainerDb(redis, "Redis")
    ContainerDb(postgres, "PostgreSQL")
    ContainerDb(duckdb, "DuckDB")
    System_Ext(broker, "Alpaca Broker")

    Rel(scheduler, strategies, "Triggers signal generation")
    Rel(strategies, aggregator, "Emit signals")
    Rel(regime, strategies, "Regime context")
    Rel(aggregator, allocator, "Filtered signals")
    Rel(allocator, risk, "Proposed portfolio")
    Rel(risk, oms, "Risk-checked orders")
    Rel(oms, broker, "Submit orders", "REST/WS")
    Rel(broker, oms, "Fills, rejects", "WS")
    Rel(recon, broker, "Query positions", "REST")
    Rel(recon, audit, "Log discrepancies")
    Rel(watchdog, redis, "Read heartbeat")
    Rel(watchdog, broker, "Emergency flatten")
    Rel(retrain, decay, "Monitor status")
    Rel(oms, audit, "Order lifecycle events")
    Rel(risk, audit, "Kill-switch events")
    Rel(strategies, duckdb, "Feature queries")
    Rel(audit, postgres, "Persist audit trail")
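
The Audit Log component is described as append-only with Merkle chaining. The sketch below shows the simplest form of that idea, a linear hash chain in which each entry commits to the hash of the previous one. It is an illustration only: `AuditChain`, `GENESIS`, and the entry layout are assumptions, and a full Merkle tree (if that is what the platform uses) would differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditChain:
    """Hypothetical in-memory append-only log with hash chaining."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        entry_hash = _digest(prev_hash, event)
        self._entries.append(
            {"event": dict(event), "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self._entries:
            expected = _digest(prev_hash, entry["event"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

Changing any stored event after the fact makes `verify()` return `False`, which is what makes the log tamper-evident rather than merely append-only by convention.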

Sequence Diagrams

Signal-to-Order Flow

The critical path from strategy signal generation through order execution to post-trade transaction cost analysis (TCA).

sequenceDiagram
    autonumber
    participant S as Strategy
    participant AGG as Signal Aggregator
    participant A as Portfolio Allocator
    participant R as Risk Manager
    participant OMS as OMS
    participant B as Alpaca Broker
    participant TCA as TCA Engine
    participant AL as Audit Log

    S->>S: generate_signals(features)
    S->>AGG: Signal(symbol, side, weight, confidence)
    Note over AGG: Deduplicate, apply regime filter
    AGG->>A: filtered signals by strategy
    A->>A: compute_weights(inverse_vol / HRP)
    A->>A: adaptive_leverage(drawdown_dampening)
    A->>R: proposed_portfolio (target weights)
    R->>R: check_kill_tier()
    alt Kill tier >= NO_NEW
        R-->>OMS: BLOCKED (kill-switch active)
        R->>AL: log(KILL_SWITCH_BLOCK)
    else Kill tier == THROTTLE
        R->>R: apply throttle_factor (50% size reduction)
        R->>OMS: throttled orders
    else NORMAL / WARN
        R->>R: check_position_limits()
        R->>R: check_correlation_caps()
        R->>R: check_sector_exposure()
        R->>OMS: approved orders
    end
    OMS->>OMS: select_algo(qty vs ADV)
    alt Order > 5% ADV
        OMS->>OMS: TWAP (capped participation)
    else Order > 1% ADV
        OMS->>OMS: VWAP (volume curve)
    else Small order
        OMS->>OMS: Direct market order
    end
    OMS->>AL: log(ORDER_SUBMITTED)
    OMS->>B: submit_order(idempotent_id)
    B-->>OMS: fill / partial_fill / reject
    OMS->>AL: log(ORDER_FILLED / REJECTED)
    OMS->>TCA: Fill(price, qty, timestamp, venue)
    TCA->>TCA: arrival_price_cost()
    TCA->>TCA: vwap_slippage()
    TCA->>TCA: implementation_shortfall()
    TCA->>AL: log(TCA_RESULT)
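
The algo-routing branch in the diagram above (TWAP above 5% of ADV, VWAP above 1%, otherwise a direct market order) can be sketched as follows. The 5% and 1% thresholds come from the diagram; the function signatures and the way the idempotent ID is derived are assumptions made for illustration.

```python
import hashlib


def select_algo(order_qty: float, adv: float) -> str:
    """Pick an execution algo from order size relative to average daily volume."""
    if adv <= 0:
        raise ValueError("adv must be positive")
    participation = abs(order_qty) / adv
    if participation > 0.05:
        return "TWAP"    # large order: capped participation
    if participation > 0.01:
        return "VWAP"    # medium order: follow the volume curve
    return "MARKET"      # small order: direct market order


def idempotent_order_id(strategy_id: str, symbol: str, side: str,
                        rebalance_ts: str) -> str:
    """Assumed scheme: hash the fields that uniquely identify an order intent,
    so a retry of the same intent yields the same ID."""
    key = f"{strategy_id}|{symbol}|{side}|{rebalance_ts}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]
```

A deterministic ID means a retry after a network timeout resubmits the same ID instead of creating a second order, provided the broker deduplicates on client order IDs.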

Kill-Switch Escalation

Tiered response to adverse conditions -- from an alert-only warning through to flattening all positions.

sequenceDiagram
    autonumber
    participant D as Trading Daemon
    participant R as Risk Manager
    participant OMS as OMS
    participant B as Broker
    participant AL as Audit Log
    participant O as Owner (Alert)

    D->>R: update_pnl(daily_pnl, equity)
    R->>R: compute drawdown vs peak

    alt Daily PnL < -50% of limit
        R->>R: escalate to WARN
        R->>AL: log(TIER_CHANGE, NORMAL->WARN)
        R->>O: alert(WARN: approaching daily loss limit)
    end

    D->>R: update_pnl(worsening)

    alt Daily PnL < -75% of limit
        R->>R: escalate to THROTTLE
        R->>AL: log(TIER_CHANGE, WARN->THROTTLE)
        R->>O: alert(THROTTLE: sizes reduced 50%)
        Note over R: throttle_factor = 0.5
    end

    D->>R: update_pnl(breach)

    alt Daily PnL < -100% of limit
        R->>R: escalate to NO_NEW
        R->>AL: log(TIER_CHANGE, THROTTLE->NO_NEW)
        R->>O: alert(NO_NEW: new positions blocked)
        Note over OMS: Only close/reduce allowed
    end

    D->>R: update_pnl(max drawdown breached)

    alt Drawdown > max_drawdown_pct
        R->>R: escalate to FLATTEN
        R->>AL: log(TIER_CHANGE, NO_NEW->FLATTEN)
        R->>O: alert(FLATTEN: emergency liquidation)
        R->>OMS: flatten_all_positions()
        OMS->>B: submit close orders (all positions)
        B-->>OMS: fills
        OMS->>AL: log(EMERGENCY_FLATTEN_COMPLETE)
    end

    Note over R: De-escalation requires two-person approval
    O->>R: reset_kill_switch(approver="co-pilot")
    R->>R: verify approver != operator
    R->>R: reset to NORMAL
    R->>AL: log(KILL_SWITCH_RESET, approver)
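
The escalation thresholds in the diagram above map naturally onto an ordered enum. The sketch below is one possible reading, not the `RiskManager` implementation: the tier names, the 50% / 75% / 100% loss thresholds, and the 0.5 throttle factor are taken from the diagram, while the function names and signatures are assumptions.

```python
from enum import IntEnum


class KillTier(IntEnum):
    NORMAL = 0
    WARN = 1
    THROTTLE = 2
    NO_NEW = 3
    FLATTEN = 4


def classify_tier(daily_pnl: float, daily_loss_limit: float,
                  drawdown_pct: float, max_drawdown_pct: float) -> KillTier:
    """Map current P&L and drawdown to the tier the diagram prescribes."""
    if drawdown_pct > max_drawdown_pct:
        return KillTier.FLATTEN
    loss_fraction = max(0.0, -daily_pnl) / daily_loss_limit
    if loss_fraction > 1.0:
        return KillTier.NO_NEW
    if loss_fraction > 0.75:
        return KillTier.THROTTLE
    if loss_fraction > 0.5:
        return KillTier.WARN
    return KillTier.NORMAL


def escalate(current: KillTier, observed: KillTier) -> KillTier:
    """Tiers only ratchet upward; going back down requires an explicit reset."""
    return max(current, observed)


def new_position_multiplier(tier: KillTier) -> float:
    """Size multiplier for new positions: full, halved, or blocked."""
    if tier >= KillTier.NO_NEW:
        return 0.0
    return 0.5 if tier == KillTier.THROTTLE else 1.0


def reset_kill_switch(operator: str, approver: str) -> KillTier:
    """Two-person rule: the approver must differ from the operator."""
    if approver == operator:
        raise PermissionError("approver must be a different person")
    return KillTier.NORMAL
```

Keeping `escalate` as a pure `max` makes the one-way ratchet explicit: an intraday recovery in P&L never lowers the tier on its own.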

Self-Learning Loop (Decay Detect, Retrain, Shadow, Promote)

Automated model lifecycle management with mandatory shadow validation.

sequenceDiagram
    autonumber
    participant DM as Decay Monitor
    participant AR as Auto-Retrain
    participant DP as Data Provider
    participant T as Model Trainer
    participant V as Validator
    participant SR as Shadow Runner
    participant MR as Model Registry
    participant AL as Audit Log
    participant O as Owner (Alert)

    loop Every rebalance cycle
        DM->>DM: compute rolling Sharpe, hit rate, exposure drift
        DM->>DM: classify: HEALTHY / WARNING / DECAYED
    end

    alt Status == WARNING or DECAYED
        DM->>AR: trigger_retrain(strategy_id, status)
        AR->>AL: log(RETRAIN_TRIGGERED, strategy_id)

        AR->>DP: fetch_training_data(strategy_id, lookback)
        DP-->>AR: TrainingData

        AR->>T: train(strategy_id, data, hyperparameters)
        T-->>AR: TrainResult(model, metrics)

        AR->>V: walk_forward_oos(model, data)
        V-->>AR: oos_sharpe, oos_hit_rate

        AR->>V: pbo_check(model)
        V-->>AR: pbo_probability

        alt Validation fails (PBO > 0.5 or OOS Sharpe < threshold)
            AR->>AL: log(RETRAIN_FAILED, reason)
            AR->>O: alert(retrain failed, manual review needed)
        else Validation passes
            AR->>MR: register(model, stage=STAGING)
            AR->>AL: log(MODEL_STAGED, strategy_id, version)

            AR->>SR: run_shadow(candidate, current, duration)
            SR->>SR: paper-trade both models in parallel
            SR-->>AR: ShadowResult(candidate_sharpe, current_sharpe, p_value)

            alt Shadow underperforms (p > 0.05 or worse Sharpe)
                AR->>AL: log(SHADOW_FAILED, metrics)
                AR->>O: alert(shadow failed, candidate discarded)
                AR->>MR: update_stage(model, ARCHIVED)
            else Shadow passes
                AR->>MR: update_stage(model, PRODUCTION)
                AR->>MR: update_stage(old_model, ARCHIVED)
                AR->>AL: log(MODEL_PROMOTED, strategy_id, version)
                AR->>O: alert(model promoted to production)
            end
        end
    end
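
The two gates in the loop above (offline validation, then shadow comparison) can be written as pure functions. This is a sketch under assumptions: the 0.5 PBO cutoff and 0.05 p-value cutoff come from the diagram, but `ValidationResult`, `next_stage`, the `min_oos_sharpe` parameter, and the `"REJECTED"` label for a model that never reaches the registry are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValidationResult:
    oos_sharpe: float
    pbo_probability: float


@dataclass(frozen=True)
class ShadowResult:
    candidate_sharpe: float
    current_sharpe: float
    p_value: float


def passes_validation(v: ValidationResult, min_oos_sharpe: float) -> bool:
    # Fails if PBO > 0.5 or the out-of-sample Sharpe is below the threshold.
    return v.pbo_probability <= 0.5 and v.oos_sharpe >= min_oos_sharpe


def passes_shadow(s: ShadowResult, alpha: float = 0.05) -> bool:
    # Fails if the difference is not significant or the candidate is worse.
    return s.p_value <= alpha and s.candidate_sharpe >= s.current_sharpe


def next_stage(v: ValidationResult, s: Optional[ShadowResult],
               min_oos_sharpe: float) -> str:
    """Return the stage a candidate should move to given the evidence so far."""
    if not passes_validation(v, min_oos_sharpe):
        return "REJECTED"      # never registered; manual review
    if s is None:
        return "STAGING"       # validated, shadow run still pending
    return "PRODUCTION" if passes_shadow(s) else "ARCHIVED"
```

Note that `next_stage` cannot return `"PRODUCTION"` without a `ShadowResult`, which encodes the rule that nothing is promoted without shadow validation.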

Reconciliation Flow

Continuous broker-to-internal position reconciliation with classification and remediation.

sequenceDiagram
    autonumber
    participant SCH as Scheduler (60s)
    participant RE as Reconciliation Engine
    participant B as Alpaca Broker
    participant INT as Internal State
    participant AL as Audit Log
    participant O as Owner (Alert)

    SCH->>RE: trigger_reconciliation()
    RE->>B: get_positions()
    B-->>RE: broker_positions[]
    RE->>INT: get_local_positions()
    INT-->>RE: local_positions[]

    RE->>RE: compare(broker, local)

    loop For each discrepancy
        alt MISSING_LOCALLY (broker has, we don't)
            RE->>RE: classify(MISSING_LOCALLY, severity)
            RE->>AL: log(RECON_DISCREPANCY, MISSING_LOCALLY)
            alt Severity == CRITICAL
                RE->>O: alert(unknown position detected!)
            end
        else MISSING_AT_BROKER (we have, broker doesn't)
            RE->>RE: classify(MISSING_AT_BROKER, severity)
            RE->>AL: log(RECON_DISCREPANCY, MISSING_AT_BROKER)
            RE->>INT: mark_position_stale()
        else QTY_MISMATCH
            RE->>RE: classify(QTY_MISMATCH, severity)
            RE->>AL: log(RECON_DISCREPANCY, QTY_MISMATCH)
            alt Auto-correctable (< threshold)
                RE->>INT: adjust_to_broker_qty()
            else Requires review
                RE->>O: alert(quantity mismatch, review needed)
            end
        else VALUE_MISMATCH
            RE->>RE: classify(VALUE_MISMATCH, severity)
            Note over RE: Usually stale prices, resolve on next tick
        end
    end

    RE->>AL: log(RECON_COMPLETE, broker=N, local=M, discrepancies=K)
    RE-->>SCH: ReconciliationResult
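
The classification step in the flow above can be sketched as a comparison of two position maps. The four discrepancy labels come from the diagram; the `Position` shape, the `reconcile` signature, and the 1% relative value tolerance are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Position:
    qty: float
    market_value: float


def classify(broker: Optional[Position], local: Optional[Position],
             value_tol: float = 0.01) -> Optional[str]:
    """Return a discrepancy label for one symbol, or None if the sides agree."""
    if broker is None and local is None:
        return None
    if local is None:
        return "MISSING_LOCALLY"      # broker has it, we do not
    if broker is None:
        return "MISSING_AT_BROKER"    # we have it, broker does not
    if broker.qty != local.qty:
        return "QTY_MISMATCH"
    # Same quantity: compare values with a relative tolerance (assumed 1%).
    scale = max(abs(broker.market_value), 1e-9)
    if abs(broker.market_value - local.market_value) / scale > value_tol:
        return "VALUE_MISMATCH"       # usually stale prices
    return None


def reconcile(broker: dict[str, Position],
              local: dict[str, Position]) -> dict[str, str]:
    """Compare both sides symbol by symbol and collect discrepancies."""
    discrepancies: dict[str, str] = {}
    for symbol in sorted(set(broker) | set(local)):
        kind = classify(broker.get(symbol), local.get(symbol))
        if kind is not None:
            discrepancies[symbol] = kind
    return discrepancies
```

Exact quantity equality is used here for brevity; with fractional shares a small absolute tolerance would be a safer comparison.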

Data Flow Diagram

End-to-end data flow from raw sources through the point-in-time (PIT) layer to strategy signals and execution.

flowchart LR
    subgraph Sources["Raw Data Sources"]
        ALP[Alpaca<br/>Bars / Trades / Quotes]
        FRED_S[FRED<br/>Rates, DXY, Breakevens]
        CFTC_S[CFTC<br/>COT Reports]
        LBMA_S[LBMA<br/>Fix Prices, Vault Data]
        COMEX_S[COMEX<br/>Warehouse Stocks]
        EIA_S[EIA<br/>Energy Inventories]
        NOAA_S[NOAA / ECMWF<br/>Weather Ensembles]
        SAT[Satellite / AIS<br/>Alt Data]
    end

    subgraph Ingestion["qgtm_data + qgtm_altdata"]
        ING[Ingestion Pipelines<br/>Rate limiting, retry,<br/>schema validation]
        PIT[PIT Layer<br/>pit_join: knowledge_time ≤ timestamp<br/>No future data leakage]
        DQ[Data Quality Suite<br/>Null checks, range checks,<br/>freshness, monotonicity,<br/>distribution drift]
    end

    subgraph Storage["Persistent Storage"]
        DUCK[(DuckDB<br/>Bar/tick data,<br/>analytical queries)]
        PG[(PostgreSQL<br/>Metadata, signals,<br/>subscribers, audit)]
        RD[(Redis<br/>Hot state, streams,<br/>heartbeats)]
        R2_S[(R2 / S3<br/>Parquet, models,<br/>backtest artifacts)]
    end

    subgraph Features["qgtm_features"]
        FS[Feature Store<br/>Multi-horizon returns,<br/>vol, term structure,<br/>COT z-scores,<br/>correlation regimes]
    end

    subgraph Strategies["qgtm_strategies"]
        STRAT[Strategy Library<br/>40+ strategies:<br/>TSMOM, XSMOM, stat-arb,<br/>regime, options, ML]
    end

    subgraph Portfolio["Portfolio + Risk"]
        AGG2[Signal Aggregator<br/>Dedup + regime filter]
        ALLOC[Portfolio Allocator<br/>Equal-risk / HRP /<br/>inverse-vol]
        RISK[Risk Manager<br/>Kill-switch tiers,<br/>vol targeting, CVaR]
    end

    subgraph Execution["qgtm_execution"]
        OMS2[OMS<br/>Algo routing:<br/>VWAP / TWAP / Market]
        BRK[Broker Adapter<br/>Alpaca primary,<br/>IB/Tradier/Tastytrade]
        TCA2[TCA<br/>Arrival price,<br/>VWAP slippage,<br/>impl. shortfall]
    end

    Sources --> ING
    ING --> PIT
    PIT --> DQ
    DQ --> DUCK
    DQ --> PG
    DQ --> R2_S
    DUCK --> FS
    FS --> STRAT
    STRAT --> AGG2
    AGG2 --> ALLOC
    ALLOC --> RISK
    RISK --> OMS2
    OMS2 --> BRK
    BRK --> TCA2
    TCA2 --> PG
    RD -.-> |hot state| OMS2
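
The point-in-time rule at the heart of this flow is that a bar at time t may only see observations that were already known at t, that is, those with a knowledge time at or before t. The sketch below illustrates that rule with an as-of lookup. `pit_join` is the name used in this document, but the signature shown here, the `Observation` type, and the use of integer timestamps are assumptions.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    knowledge_time: int  # when the value became known (e.g. epoch seconds)
    value: float


def pit_lookup(history: list[Observation], as_of: int) -> Optional[float]:
    """Latest value with knowledge_time <= as_of, or None if nothing was known.

    `history` must be sorted by knowledge_time in ascending order.
    """
    times = [obs.knowledge_time for obs in history]
    idx = bisect_right(times, as_of)
    return history[idx - 1].value if idx > 0 else None


def pit_join(bar_times: list[int],
             history: list[Observation]) -> list[tuple[int, Optional[float]]]:
    """Attach to each bar the most recent observation known at that bar's time."""
    ordered = sorted(history, key=lambda obs: obs.knowledge_time)
    return [(t, pit_lookup(ordered, t)) for t in bar_times]
```

Joining on knowledge time rather than on the period a report covers is what prevents lookahead: a report covering last Tuesday but published on Friday is invisible to every bar before Friday.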

Module Responsibilities

Module	Purpose	Key Abstractions
`qgtm_core`	Shared types, config, universe, calendars	`Signal`, `Order`, `PortfolioSnapshot`, `RiskLimits`
`qgtm_data`	Market data ingestion, PIT joins, quality checks	`FundamentalProvider`, `pit_join`, `DataQualitySuite`
`qgtm_altdata`	Alternative data: satellite, AIS, weather	Batch-oriented pipelines, higher latency tolerance
`qgtm_features`	ArcticDB/DuckDB-backed feature store	Versioned, reproducible feature transforms
`qgtm_strategies`	Strategy library (40+ strategies)	`BaseStrategy` with declared factor exposures, capacity
`qgtm_backtest`	Backtesting: vectorbt + event-driven harness	Walk-forward, purged k-fold CV, PBO, deflated Sharpe
`qgtm_risk`	Risk management: kill-switch, vol targeting, CVaR	`RiskManager`, `KillTier`, `DecayMonitor`
`qgtm_portfolio`	Multi-strategy allocator	`PortfolioAllocator`: equal-risk, inverse-vol, HRP
`qgtm_execution`	Broker abstraction, OMS, algo routing, TCA	`OrderManagementSystem`, `AlpacaBroker`, `VWAPExecutor`
`qgtm_live`	Live trading daemon, reconciliation, watchdog	`TradingDaemon`, `ReconciliationEngine`, `DeadManSwitch`
`qgtm_signals`	Signal publication with tiered delays	`SignalPublisher`
`qgtm_api`	FastAPI backend: REST + WebSocket	Subscriber mgmt, admin endpoints, real-time updates
`qgtm_web`	Next.js 15 terminal UI	Dark theme, command palette, keyboard-first
`qgtm_bot`	Discord + Telegram bots	Signal delivery, community interaction
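
Of the weighting schemes listed for `PortfolioAllocator`, inverse-vol is the simplest to show: each strategy's weight is proportional to the reciprocal of its volatility, normalized to sum to one. The function below is a minimal sketch of that formula, not the allocator's actual code; the name `inverse_vol_weights` and its input shape are assumptions.

```python
def inverse_vol_weights(vols: dict[str, float]) -> dict[str, float]:
    """Weights proportional to 1 / volatility, normalized to sum to 1."""
    if not vols:
        return {}
    if any(v <= 0 for v in vols.values()):
        raise ValueError("volatilities must be positive")
    inverse = {name: 1.0 / v for name, v in vols.items()}
    total = sum(inverse.values())
    return {name: x / total for name, x in inverse.items()}
```

For example, a strategy with 10% volatility paired with one at 30% receives three times the weight of the more volatile one, so each contributes roughly equal standalone risk.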

Deployment Topology

Service	Platform	Region
Frontend	Cloudflare Pages	Edge (global)
API Workers	Cloudflare Workers	Edge (global)
API Backend	Fly.io	us-east (iad)
Live Trader	Fly.io	us-east (iad)
Signals	Fly.io	us-east (iad)
PostgreSQL	Neon	us-east
Redis	Upstash	us-east
Object Storage	Cloudflare R2	Auto
Secrets	Doppler	N/A
Monitoring	Grafana Cloud	N/A

Security Model

                    +------------------+
                    |    Cloudflare    |  <-- WAF, DDoS, rate limiting
                    |   Pages/Workers  |
                    +--------+---------+
                             |
                    +--------+---------+
                    |     qgtm_web     |  <-- Auth (Clerk/Auth.js)
                    |     qgtm_api     |  <-- RBAC: viewer|trader|admin|subscriber
                    +--------+---------+
                             |
              +--------------+--------------+
              |              |              |
        +-----+------+ +----+------+ +-----+-----+
        |  qgtm_live | | qgtm_risk | | qgtm_exec |
        |            | |           | |           |
        | LIVE_TRADING_ENABLED    | | Paper     |
        | = false (default)       | | default   |
        +------------+ +-----------+ +-----------+
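
The diagram shows `LIVE_TRADING_ENABLED` defaulting to false, with paper mode as the fallback. One way such a gate could be read at startup is sketched below. Treating the flag as an environment variable and accepting only the literal string "true" are assumptions; the function names are hypothetical.

```python
import os
from typing import Mapping, Optional


def live_trading_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when the flag is explicitly set to 'true' (case-insensitive)."""
    source = os.environ if env is None else env
    return source.get("LIVE_TRADING_ENABLED", "false").strip().lower() == "true"


def select_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Fail closed: anything other than an explicit opt-in means paper mode."""
    return "live" if live_trading_enabled(env) else "paper"
```

Accepting a single exact value keeps the gate fail-closed: a typo, an empty string, or a missing variable all resolve to paper mode.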