RFC-0001: Last-Mile Execution Plan — v1.0.0-live
Author: Claude (Last-Mile Execution Agent)
Date: 2026-04-12
Status: ACTIVE
Goal: Close 9 open gaps, move from release-candidate to PRODUCTION-CERTIFIED
Current State
| Metric | Value |
|---|---|
| Gates | 11/15 GREEN, 4 RED (5, 7, 8, 12→docs) |
| Open gaps | 4 SEV1 + 4 SEV2 + 1 SEV3 = 9 |
| Tests | 3,023 / 90.71% coverage |
| Strategies | 42 files (29 active, 5 shadow, 4 blocked) |
| LOC | ~22,500 Python + ~18,700 TS |
Strategy Edge Summary (42 files)
| File | Edge |
|---|---|
| tsmom.py | Time-series momentum on precious metals |
| xsmom.py | Cross-sectional momentum across commodity ETFs |
| gold_silver_ratio.py | Mean-reversion on Au/Ag ratio |
| real_rate_gold.py | Gold tracks inverse real rates (DFII10) |
| breakeven_inflation_gold.py | Gold as inflation hedge via breakeven spread |
| dxy_gold.py | Gold inverse-USD momentum |
| cot_positioning.py | CFTC positioning extremes as contrarian signal |
| cot_precious.py | Precious-metals-specific COT signal |
| vol_risk_parity.py | Inverse-vol weighting across PM universe |
| vol_risk_premium_pm.py | Harvest PM vol risk premium via options |
| vol_term_structure_pm.py | PM vol term structure slope signal |
| mean_reversion.py | Short-term mean-reversion on PM ETFs |
| hedging_pressure.py | Futures term structure hedging pressure |
| cross_carry.py | Carry across precious metals futures curves |
| miners_vs_metal.py | GDX/GLD relative value |
| gold_platinum.py | Au/Pt substitution effects |
| central_bank_gold.py | Central bank buying/selling flow signal |
| sge_withdrawals.py | Shanghai Gold Exchange withdrawal signal |
| comex_warehouse.py | COMEX registered/eligible inventory signal |
| inventory_surprise.py | Inventory report surprise (EIA, USDA) |
| event_drift_pm.py | Post-event drift (FOMC, CPI, NFP) for PM |
| seasonality_pm.py | Calendar effects in gold/silver |
| overnight_gold.py | Overnight return anomaly in gold |
| gc_term_structure.py | Gold futures term structure signal |
| si_term_structure.py | Silver futures term structure signal |
| backwardation_stress.py | Backwardation as supply stress indicator |
| fix_dislocation.py | London fix vs. spot dislocation |
| levered_etf_decay.py | Leveraged ETF decay arbitrage |
| skew_trades.py | Options skew as directional signal |
| gamma_scalp_pm.py | Gamma scalping on PM options |
| tail_hedge_pm.py | Tail-risk hedging overlay |
| options_strategies.py | Options spread strategies on PM |
| kalman_pairs.py | Kalman-filter pairs trading |
| etf_flow_pm.py | ETF flow momentum/contrarian |
| meta_labeller_pm.py | Meta-labelling for trade filtering |
| regime_classifier_pm.py | HMM/GMM regime classification |
| regime_detector.py | Regime transition detection |
| ml_ensemble.py | Gradient boosted ensemble signal |
| forecasting.py | Point forecasting models |
| precious_metals.py | Broad PM composite signal |
| vix_haven.py | VIX-driven safe-haven rotation |
| pysystemtrade_patterns.py | Systematic trend patterns (Rob Carver style) |
Execution Plan (Ordered)
Phase 1: SEV1 Gaps (Blocks Live Trading)
1A. GAP-002 — Backtest Suite with PBO/DSR
- Fix/create scripts/backtest_all.py to run the canonical runtime PM strategy set and any explicitly included supplemental modules
- Implement CPCV (Combinatorial Purged Cross-Validation)
- Enforce PBO < 0.5 gate, DSR >= 0.95 confidence, HLZ penalty
- Output per-strategy research/backtests/<id>/report.html
- Golden-file metrics in tests/.golden/backtests/<id>.json
- Quarantine failing strategies to qgtm_strategies/_quarantine/
- Commit research/backtests/SUMMARY.md
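The CPCV step above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `cpcv_splits`, the group counts, and the symmetric purge window are all hypothetical. The sample is cut into contiguous groups, every combination of `n_test_groups` groups serves once as the test set, and training observations within `purge` bars of any test block are dropped.

```python
from itertools import combinations


def cpcv_splits(n_samples, n_groups=6, n_test_groups=2, purge=0):
    """Yield (train, test) index lists for combinatorial purged CV.

    Hypothetical helper: names and defaults are illustrative only.
    """
    # Contiguous, near-equal groups covering 0..n_samples-1.
    bounds = [i * n_samples // n_groups for i in range(n_groups + 1)]
    groups = [range(bounds[i], bounds[i + 1]) for i in range(n_groups)]

    for test_ids in combinations(range(n_groups), n_test_groups):
        test = [i for g in test_ids for i in groups[g]]
        # Block each test group plus `purge` observations on either side.
        blocked = set()
        for g in test_ids:
            lo = max(groups[g].start - purge, 0)
            hi = min(groups[g].stop + purge, n_samples)
            blocked.update(range(lo, hi))
        train = [i for i in range(n_samples) if i not in blocked]
        yield train, test


# 6 groups taken 2 at a time gives one split per combination.
splits = list(cpcv_splits(n_samples=60, n_groups=6, n_test_groups=2, purge=2))
n_splits = len(splits)
```

Each strategy would be fitted on `train` and scored on `test` for every split; the resulting out-of-sample metrics are what a PBO or DSR gate would consume. A real purge would normally be driven by label horizons rather than a fixed bar count.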
1B. GAP-003 — Kill-Switch Fire Drill
- Create staging daemon against Alpaca paper with synthetic positions
- Fire each tier: WARN → THROTTLE → NO_NEW → FLATTEN_ALL
- Prove two-person reset required
- Record asciinema to docs/dr/drills/killswitch_<date>.cast
- Property-based test for state transitions
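The tier sequence and two-person reset above can be sketched as a small state machine. This is an assumed design for illustration: the `KillSwitch` class, its method names, and the `NORMAL` resting state are hypothetical and not taken from the codebase. Escalation only ratchets upward, and a reset needs approvals from two distinct operators.

```python
from enum import IntEnum


class Tier(IntEnum):
    NORMAL = 0        # assumed resting state, not named in the plan
    WARN = 1
    THROTTLE = 2
    NO_NEW = 3
    FLATTEN_ALL = 4


class KillSwitch:
    """Hypothetical state machine: escalation ratchets up, reset needs two people."""

    def __init__(self):
        self.tier = Tier.NORMAL
        self._approvals = set()

    def escalate(self, tier):
        # A request for a lower or equal tier is ignored (no implicit de-escalation).
        if tier > self.tier:
            self.tier = tier
            self._approvals.clear()
        return self.tier

    def approve_reset(self, operator):
        # Returns True only when this approval completes a two-person reset.
        if self.tier == Tier.NORMAL:
            return False
        self._approvals.add(operator)
        if len(self._approvals) >= 2:
            self.tier = Tier.NORMAL
            self._approvals.clear()
            return True
        return False


switch = KillSwitch()
switch.escalate(Tier.FLATTEN_ALL)
first = switch.approve_reset("alice")    # one approver is not enough
repeat = switch.approve_reset("alice")   # the same person does not count twice
second = switch.approve_reset("bob")     # a second distinct approver resets
```

The invariants a property-based test could assert are that `escalate` never lowers the tier and that no sequence of approvals from a single operator returns the switch to `NORMAL`.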
1C. GAP-004 — Reconciliation Loop E2E
- Stand up mocked broker endpoint
- Deliberately desync internal state
- Prove auto-detection within 60s + correct remediation
- Record docs/dr/drills/reconcile_<date>.md
- Regression test tests/test_reconcile_drill.py
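The detection half of the drill above reduces to diffing two position maps. The sketch below assumes positions are held as `{symbol: signed quantity}` dictionaries and that the broker is treated as the source of truth; `Break` and `find_breaks` are hypothetical names, and the quantities are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Break:
    symbol: str
    internal_qty: int
    broker_qty: int

    @property
    def delta(self):
        # Quantity the internal book must change by to match the broker.
        return self.broker_qty - self.internal_qty


def find_breaks(internal, broker):
    """Compare two {symbol: signed quantity} maps and return mismatches."""
    symbols = sorted(set(internal) | set(broker))
    return [
        Break(s, internal.get(s, 0), broker.get(s, 0))
        for s in symbols
        if internal.get(s, 0) != broker.get(s, 0)
    ]


internal = {"GLD": 100, "SLV": -50}
broker = {"GLD": 100, "SLV": -40, "GDX": 10}   # deliberately desynced
breaks = find_breaks(internal, broker)
```

In the drill, the loop would run this comparison on a timer well inside the 60s detection budget and hand each `Break` to whatever remediation policy is chosen.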
1D. GAP-005 — Dead-Man's Switch
- Run daemon; kill process; prove watchdog flattens within SLA
- Ensure watchdog is on separate failure domain
- Record docs/dr/drills/watchdog_<date>.md
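The watchdog behaviour above can be sketched as a heartbeat timer. Everything here is an assumption for illustration: the `HeartbeatWatchdog` class, the 5-second SLA, and the callback are hypothetical, and in a real deployment the heartbeat would arrive over a channel that crosses the failure-domain boundary rather than a method call.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical dead-man's switch: fires once if the daemon goes silent."""

    def __init__(self, sla_seconds, on_timeout, clock=time.monotonic):
        self.sla_seconds = sla_seconds
        self.on_timeout = on_timeout      # e.g. a flatten-all call (assumed)
        self.clock = clock
        self.last_beat = clock()
        self.fired = False

    def beat(self):
        # Called on behalf of the trading daemon while it is healthy.
        self.last_beat = self.clock()

    def check(self):
        # Called periodically from the watchdog's own process.
        silent_for = self.clock() - self.last_beat
        if not self.fired and silent_for > self.sla_seconds:
            self.fired = True
            self.on_timeout()
        return self.fired


# Drive the watchdog with a fake clock so the drill is deterministic.
now = [0.0]
events = []
dog = HeartbeatWatchdog(5.0, lambda: events.append("FLATTEN_ALL"), clock=lambda: now[0])
now[0] = 4.0
within_sla = dog.check()     # daemon silent, but still inside the SLA
now[0] = 10.0
after_sla = dog.check()      # silence exceeds the SLA, callback fires
```

Injecting the clock keeps the regression test fast and avoids sleeping for the full SLA.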
Phase 2: SEV2 Gaps (Reliability)
2A. GAP-012/013 — IBKR + FIX
- Stand up IB Gateway paper mock container in CI
- Prove IBKRBroker.submit_order → fill → reconcile → audit-log
- QuickFIX test acceptor for FIX 4.4 handshake
- Commit session logs to tests/fixtures/fix_sessions/
2B. GAP-014 — K8s Deployment
- Create infra/terraform/k3s.tf or infra/scripts/bootstrap_k3s.sh
- Apply infra/k8s/*.yml to k3s cluster
- Primary + warm standby with leader election
- Script failover drill
2C. GAP-015/016 — Observability
- Deploy Grafana + Prometheus + Loki + Tempo
- Import infra/grafana/qgtm-dashboard.json
- Four operator dashboards
- Wire PagerDuty/Opsgenie for SEV1 alerts
- Measure p99 signal→order-ack
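For the p99 measurement above, one simple option is a nearest-rank percentile over recorded latencies. The function name and the sample data below are hypothetical; the real samples would be the difference between signal and order-ack timestamps.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (len(ordered) * pct + 99) // 100   # integer ceil(n * pct / 100)
    return ordered[rank - 1]


# Made-up latencies in milliseconds: 1 ms through 200 ms.
latencies_ms = [float(i) for i in range(1, 201)]
p99 = percentile_nearest_rank(latencies_ms, 99)
```

Nearest-rank always returns an observed value, which makes the figure easy to trace back to a specific order. Integer arithmetic for the rank avoids floating-point rounding at the boundary.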
Phase 3: SEV3 + Docs
3A. GAP-018 — Chaos/DR Drill
- Chaos Mesh on staging (broker disconnect, feed dropout, clock skew, OOM)
- Weekly scheduled run; first recorded
- Quarterly DR drill script
3B. B-006 — Docs Site
- Deploy mkdocs to Cloudflare Pages
- mkdocs build → wrangler pages deploy site --project-name qgtm-docs
- Bind to docs.qgtmai.com
Phase 4: Compounding Loop Strengthening
- Auto-retrain e2e test (tests/test_auto_retrain_e2e.py)
- Drift monitors (KS + PSI) on every feature
- Auto-research agent (SSRN/arXiv crawl → RFC drafts)
- Auto-postmortem on loss days
- Regime-change detector → bus event → allocator reweight
- Capacity-aware sizing enforcement
- Shadow-live harness
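The KS and PSI drift monitors listed above can be sketched with the standard library alone. This is a minimal version under assumptions: the function names are hypothetical, PSI bins are placed at quantiles of the reference sample, and empty bins are floored at a small `eps` to keep the logarithm finite.

```python
import bisect
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index with bins at reference quantiles."""
    ref = sorted(reference)
    edges = [ref[(len(ref) * k) // n_bins] for k in range(1, n_bins)]

    def shares(xs):
        counts = [0] * n_bins
        for x in xs:
            counts[bisect.bisect_right(edges, x)] += 1
        # Floor empty bins so the log ratio stays finite.
        return [max(c / len(xs), eps) for c in counts]

    expected, actual = shares(reference), shares(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Made-up feature values: the live window is shifted by half the range.
train_window = [float(i) for i in range(1000)]
live_window = [float(i) for i in range(500, 1500)]
ks_value = ks_statistic(train_window, live_window)
psi_value = psi(train_window, live_window)
```

A monitor would compute both per feature on a schedule and raise an alert when either crosses a threshold chosen for that feature. The thresholds themselves are not specified in this plan.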
Phase 5: Operator Experience
- Partner 30-second phone dashboard route
- Owner Sunday-night 5-minute screen
- Weekly digest auto-send (email + Discord)
- make bootstrap && make demo in ≤ 10 minutes
Phase 6: Code Quality
- mypy --strict expansion to full tree
- ruff ALL: resolve 847 remaining findings
- Mutation testing (mutmut) ≥ 80% on core/risk/execution/portfolio
- Property-based tests (Hypothesis) max_examples ≥ 1000
- Coverage 90.71% → ≥ 95%
- Custom AST lints (no negative .shift(-n) look-ahead, no bare datetime.now(), no float for monetary values)
- Dependency cycle guard
- Dead-code scan (vulture)
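Two of the custom AST lints above can be sketched with the standard `ast` module. The class and function names are hypothetical, and the rules are deliberately coarse: any method named `shift` with a negative numeric literal is flagged, and any argument-less `.now()` call is flagged regardless of the receiver. The float-in-money rule is not covered here.

```python
import ast


class LookaheadAndClockLint(ast.NodeVisitor):
    """Hypothetical lint: flags negative .shift(...) and argument-less .now()."""

    def __init__(self):
        self.findings = []

    @staticmethod
    def _is_negative_literal(node):
        # A literal such as -1 parses as UnaryOp(USub, Constant(1)).
        return (
            isinstance(node, ast.UnaryOp)
            and isinstance(node.op, ast.USub)
            and isinstance(node.operand, ast.Constant)
            and isinstance(node.operand.value, (int, float))
        )

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Attribute):
            args = list(node.args) + [kw.value for kw in node.keywords]
            if func.attr == "shift" and any(self._is_negative_literal(a) for a in args):
                self.findings.append((node.lineno, "negative shift (look-ahead)"))
            if func.attr == "now" and not node.args and not node.keywords:
                self.findings.append((node.lineno, "bare datetime.now()"))
        self.generic_visit(node)


def lint_source(source):
    checker = LookaheadAndClockLint()
    checker.visit(ast.parse(source))
    return checker.findings


# The sample is only parsed, never executed, so `df` need not exist.
sample = (
    "import datetime as dt\n"
    "label = df['close'].shift(-1)\n"
    "lagged = df['close'].shift(1)\n"
    "stamp = dt.datetime.now()\n"
    "aware = dt.datetime.now(dt.timezone.utc)\n"
)
findings = lint_source(sample)
```

Here lines 2 and 4 of the sample are reported, while the positive shift and the timezone-aware call pass. A negative shift passed through a variable would slip past this check, so it is a guard rail rather than a proof of no look-ahead.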
Agent Roster
| Agent | Surface | Key Deliverables |
|---|---|---|
| Agent BT | Backtest | PBO/DSR/CPCV, golden files, SUMMARY.md |
| Agent DR | Drills | Kill-switch, reconcile, watchdog fire drills |
| Agent BR | Brokers | IBKR mock, FIX test acceptor |
| Agent K8 | Infra | k3s, Grafana, Prometheus, Loki, PagerDuty |
| Agent SL | Self-Learning | Auto-retrain e2e, drift, auto-research |
| Agent OX | Operator XP | Dashboard, Sunday screen, digest, Makefile |
| Agent QC | Quality | mypy, ruff, mutation, coverage, lints |
| Agent DX | Docs | Cloudflare Pages deploy, 4 operator front doors |
| Agent IV | Validator | Adversarial re-check of every claim |
Stop Conditions
All must be true:
1. gap_ledger.md: 0 SEV1 open, 0 SEV2 open
2. EXECUTIVE_SUMMARY.md: 15/15 gates GREEN
3. ELITE_STATE.md: PRODUCTION-CERTIFIED (Agent IV signed)
4. research/backtests/SUMMARY.md: approved roster, DSR CIs, PBO < 0.5
5. Live probes pass (health, WS, Lighthouse, Grafana, audit)
6. Partner 30s test + owner 5min test pass
7. Fresh-clone ≤ 10 min on CI
8. Chaos + DR + rollback drills recorded
9. mypy --strict full tree, ruff ALL clean, mutation ≥ 80%
10. Tagged v1.0.0-live, signed, SBOM, deployed, verified