Governance — QGTM Trading Platform

Owner: QGTM AI
Last reviewed: 2026-04-12
Next review: 2026-07-12 (quarterly)
Classification: INTERNAL; share with auditors under NDA

1. Access Control

Resource	Who	How Provisioned	MFA Required
GitHub repo (`QGTMAI/trading`)	@QGTMAI (owner)	GitHub RBAC	Yes
Production server (DigitalOcean)	@QGTMAI	SSH key + Tailscale	Yes
Alpaca broker (paper)	@QGTMAI	API key in Vault	N/A
Alpaca broker (live)	@QGTMAI	API key in Vault, IP-restricted	N/A
Cloudflare (qgtmai.com)	@QGTMAI	SSO	Yes
Discord / Telegram bots	@QGTMAI	Bot tokens in Vault	N/A
`.env` / secrets	Never committed	`.gitignore` enforced, pre-commit hook	N/A

Principle of least privilege: No service account has broader permissions than needed. Broker keys are scoped to trading only (no withdrawal authority).

2. Change Management

Branch Protection (main)

Direct push: blocked
Required reviewers: 1 (general), 2 (risk-critical paths)
Status checks required: lint, mypy, test, coverage >= 70%, sbom
Force push: blocked
Merge method: squash-and-merge only

CODEOWNERS (enforced paths)

See CODEOWNERS. Two-person review paths:

Path	Reason
`qgtm_risk/`	Kill switch, position limits, circuit breakers
`qgtm_execution/`	Order routing, broker integration
`qgtm_live/`	Live daemon, watchdog, reconciliation
`qgtm_backtest/`	Changes affect all reported metrics
`infra/`	Deployment, secrets, infrastructure
`.github/`	CI/CD pipeline definitions
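
As an illustration of the review rule above, the sketch below maps a PR's changed files to the number of approvals it would need. It is a minimal sketch, not the repository's actual enforcement: the helper name `required_reviewers` is hypothetical, and it assumes every risk-critical path is a top-level directory, as listed in the table.

```python
from pathlib import PurePosixPath

# Risk-critical top-level directories, copied from the table above.
RISK_CRITICAL_DIRS = {
    "qgtm_risk",
    "qgtm_execution",
    "qgtm_live",
    "qgtm_backtest",
    "infra",
    ".github",
}


def required_reviewers(changed_paths):
    """Return 2 if any changed file sits under a risk-critical directory, else 1.

    Hypothetical helper: paths are repo-relative and use forward slashes.
    """
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in RISK_CRITICAL_DIRS:
            return 2
    return 1


if __name__ == "__main__":
    print(required_reviewers(["docs/strategies/tsmom.md"]))
    print(required_reviewers(["docs/index.md", "qgtm_risk/kill_switch.py"]))
```

A check like this could run in CI as a second line of defense alongside the CODEOWNERS file, so a misconfigured branch rule does not silently drop the two-person requirement.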

PR Process

Branch from main using one of the prefixes feat/, fix/, refactor/, or docs/
Write failing test first (if applicable)
Open PR with description, link to issue if any
CI must pass all required status checks: lint, mypy (strict), full test suite, coverage >= 70%, sbom
CODEOWNERS auto-assigned for review
Squash merge after approval(s)
Delete source branch
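
The branch-prefix convention in the first step can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `is_valid_branch` is hypothetical, and the characters allowed after the prefix (letters, digits, dot, underscore, hyphen) are an assumption, since the process above only fixes the prefixes.

```python
import re

# Allowed prefixes come from the PR process above; the character set that
# follows the prefix is an assumption made for this sketch.
BRANCH_PATTERN = re.compile(r"(feat|fix|refactor|docs)/[A-Za-z0-9._-]+")


def is_valid_branch(name: str) -> bool:
    """Return True if the whole branch name follows the prefix convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for branch in ("feat/kill-switch-tiers", "hotfix/urgent", "main"):
        print(branch, is_valid_branch(branch))
```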

3. Release Process

Tag (vX.Y.Z)
  --> CI: lint + mypy + 1700+ tests
    --> SBOM generated (CycloneDX)
      --> Container image built + signed (cosign)
        --> Canary deploy (1 replica, 10% traffic)
          --> Smoke tests pass
            --> Full rollout

Step	Tool	Artifact
Tagging	`git tag -s vX.Y.Z`	Signed tag
SBOM	`syft` + CycloneDX	`sbom.json` in release assets
Image signing	`cosign`	Signature in OCI registry
Canary	K8s Deployment (1 replica)	Prometheus alert on error rate
Full rollout	K8s rolling update	Zero-downtime, readiness probes

Rollback

kubectl rollout undo deployment/qgtm-api

Automatic rollback triggers: error rate > 5% or p99 latency > 200 ms at any point during the canary window.
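
The rollback condition can be expressed as a small predicate. This is a sketch only: `CanaryMetrics` and `should_roll_back` are hypothetical names, and how the metrics are collected (for example from Prometheus) is outside its scope. The two thresholds are the ones stated above, applied as strict inequalities.

```python
from dataclasses import dataclass

ERROR_RATE_LIMIT = 0.05        # 5% error rate, from the trigger above
P99_LATENCY_LIMIT_MS = 200.0   # 200 ms p99 latency, from the trigger above


@dataclass(frozen=True)
class CanaryMetrics:
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float  # 99th percentile latency in milliseconds


def should_roll_back(metrics: CanaryMetrics) -> bool:
    """Return True if either canary threshold is exceeded."""
    return (
        metrics.error_rate > ERROR_RATE_LIMIT
        or metrics.p99_latency_ms > P99_LATENCY_LIMIT_MS
    )


if __name__ == "__main__":
    print(should_roll_back(CanaryMetrics(error_rate=0.01, p99_latency_ms=150.0)))
    print(should_roll_back(CanaryMetrics(error_rate=0.08, p99_latency_ms=150.0)))
```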

4. Two-Person Rule

These actions require approval from two authorized individuals:

Action	Method	Evidence
Merge to risk-critical paths	GitHub CODEOWNERS (2 reviewers)	PR approval log
Production deploy	Tag + CI + manual approval gate	GitHub Actions log
Broker API key rotation	Owner + documented witness	Vault audit log
Kill switch override	Owner + manual confirmation	Audit log entry
Position limit increase	Owner + risk review documented	Git commit + PR
Database migration (production)	Owner + reviewed migration script	PR + deploy log

5. Model Risk Management (SR 11-7)

Model Inventory

See model_inventory.md for the full register of 18 models.

Category	Count	Examples
Signal generation	10	TSMOM, XSMOM, ML ensemble, regime detector
Portfolio allocation	3	HRP, risk parity, regime-adaptive
Risk estimation	3	Factor model, EVT tail, options Greeks
Execution	2	VWAP, TWAP

Validation Cadence

Activity	Frequency	Owner
Backtest re-run (walk-forward)	Monthly	@QGTMAI
PBO (Probability of Backtest Overfitting)	Quarterly	@QGTMAI
Deflated Sharpe ratio check	Quarterly	@QGTMAI
Feature importance drift	Monthly	Automated (CI)
Out-of-sample performance vs. backtest	Weekly (live)	Watchdog daemon

Model Change Protocol

Document hypothesis and expected impact
Run full walk-forward backtest on historical data
Compare PBO, deflated Sharpe, max drawdown vs. baseline
PR with backtest results attached
Two-person review for strategy changes
Canary period: paper trading for 2 weeks minimum

6. Incident Response

Kill Switch Tiers

Tier	Trigger	Action	Recovery
T1 — Soft	Single strategy drawdown > threshold	Halt that strategy, flatten positions	Auto-resume after cooldown
T2 — Hard	Portfolio drawdown > daily limit	Halt all strategies, flatten all	Manual re-enable required
T3 — Emergency	Broker connectivity loss / data corruption	Halt daemon, cancel all open orders	Manual restart after investigation
T4 — Total	Security breach / unauthorized access	Kill all processes, rotate all keys	Postmortem required before restart
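
The tier table implies a precedence: when several triggers are active at once, the most severe tier governs the response. A minimal sketch of that assessment follows. The names `Tier`, `assess_tier`, and `auto_resume_allowed` are hypothetical, and the boolean inputs stand in for whatever the watchdog actually measures.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    T1_SOFT = 1
    T2_HARD = 2
    T3_EMERGENCY = 3
    T4_TOTAL = 4


def assess_tier(
    *,
    security_breach: bool = False,
    broker_or_data_failure: bool = False,
    portfolio_drawdown_breached: bool = False,
    strategy_drawdown_breached: bool = False,
) -> Optional[Tier]:
    """Return the most severe tier whose trigger is active, or None."""
    if security_breach:
        return Tier.T4_TOTAL
    if broker_or_data_failure:
        return Tier.T3_EMERGENCY
    if portfolio_drawdown_breached:
        return Tier.T2_HARD
    if strategy_drawdown_breached:
        return Tier.T1_SOFT
    return None


def auto_resume_allowed(tier: Tier) -> bool:
    """Only T1 resumes automatically; T2 and above need a human, per the table."""
    return tier == Tier.T1_SOFT


if __name__ == "__main__":
    tier = assess_tier(strategy_drawdown_breached=True,
                       portfolio_drawdown_breached=True)
    print(tier.name, auto_resume_allowed(tier))
```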

Escalation Path

Alert fires (Prometheus/watchdog)
  --> PagerDuty / Discord alert
    --> @QGTMAI acknowledges (< 15 min SLA)
      --> Assess tier
        --> Execute response per tier table
          --> Postmortem within 48 hours

Postmortem Process

Timeline: Reconstruct events from audit log (correlation IDs)
Root cause: 5-Whys analysis
Impact: P&L impact, duration, affected strategies
Remediation: Concrete action items with owners and deadlines
Prevention: Systemic fixes, not just patches
Document: Stored in docs/postmortems/YYYY-MM-DD-title.md

7. Audit Trail

Merkle-Chained Log

All trading decisions are recorded in an append-only, Merkle-chained audit log (qgtm_core/audit_log.py).

Property	Implementation
Integrity	SHA-256 hash chain; each entry includes previous hash
Correlation IDs	Every request gets a UUID propagated through all subsystems
Tamper detection	Hash chain verification on startup and hourly
Fields per entry	timestamp, correlation_id, event_type, strategy, symbol, signal, action, quantity, price, rationale
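
To make the integrity property concrete, here is a minimal sketch of a linear SHA-256 hash chain with verification. It is not the code in qgtm_core/audit_log.py: the function names, the all-zero genesis hash, and the canonical JSON serialization are assumptions chosen for illustration.

```python
import hashlib
import json

# Assumed placeholder used as the "previous hash" of the first entry.
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a record that commits to the hash of the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    record = {
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; any edit or reordering breaks the chain."""
    prev_hash = GENESIS_HASH
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, {"event_type": "signal", "symbol": "SPY", "quantity": 10})
    append_entry(log, {"event_type": "order", "symbol": "SPY", "quantity": 10})
    print(verify_chain(log))
    log[0]["payload"]["quantity"] = 1000  # simulate tampering
    print(verify_chain(log))
```

Because each record's hash depends on its predecessor, changing an old entry invalidates that entry and every later link, which is what the startup and hourly verification would detect.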

Retention Policy

Data Type	Retention	Storage
Audit log entries	7 years	PostgreSQL + monthly Parquet export to S3
Trade records	7 years	PostgreSQL
Backtest results	3 years	Local Parquet files
Application logs	90 days	Structured JSON (stdout), rotated
Metrics (Prometheus)	30 days	Prometheus TSDB

Querying

# Find all actions for a specific correlation ID
grep "corr_id=abc-123" audit.log

# Verify hash chain integrity
python -m qgtm_core.audit_log --verify

8. Quarterly Review

Cadence: First week of January, April, July, October.

Standing Agenda

Architecture review: Are there new single points of failure?
Model performance: PBO, deflated Sharpe, OOS vs backtest drift
Risk parameter review: Position limits, drawdown thresholds, correlation assumptions
Access audit: Review all service accounts, API keys, SSH keys; revoke unused
Dependency audit: Review SBOM for CVEs; update pinned versions
Incident review: All postmortems since last quarter; systemic patterns
Backlog triage: Prioritize open issues, tech debt, gate progress
Regulatory check: Wash-sale compliance, PDT status, position limit changes

Deliverable

docs/quarterly/YYYY-QN-review.md committed to repo with findings and action items.

9. Key-Person Risk

Current State

Domain	Primary	Backup	Cross-Training Status
Strategy development	@QGTMAI	--	Document all strategies in `docs/strategies/`
Infrastructure / DevOps	@QGTMAI	--	Runbooks in `docs/runbooks/`
Broker integration	@QGTMAI	--	`docs/runbooks/broker-setup.md`
Risk management	@QGTMAI	--	Model inventory + parameter docs
Frontend (Next.js)	@QGTMAI	--	`qgtm_web/README.md`

Mitigation Plan

Documentation-first: Every system has a runbook; every strategy has a design doc
Automated operations: CI/CD handles build, test, deploy; watchdog handles live monitoring
Kill switch independence: T2+ kill switch can be triggered by watchdog without human intervention
Secrets recovery: All secrets backed up in encrypted vault with documented recovery procedure
Succession plan: If the team grows, new members follow the onboarding checklist in docs/onboarding.md

10. Regulatory Compliance

Wash-Sale Rule (IRC 1091)

Control	Implementation
30-day window tracking (30 days before and after a loss sale)	`qgtm_risk/wash_sale.py` checks before order submission
Substantially identical security detection	Symbol + correlated ETF lookup
Cost basis adjustment	Logged for tax reporting
Override	Requires manual flag + rationale in audit log
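
The pre-submission check can be sketched as a date-window test against recent loss sales. This is an illustrative sketch, not the contents of `qgtm_risk/wash_sale.py`: the function name and data shapes are hypothetical, and the "substantially identical" mapping is a stand-in for the symbol and correlated-ETF lookup, not a tax determination.

```python
from datetime import date, timedelta

WASH_SALE_WINDOW = timedelta(days=30)


def buy_would_trigger_wash_sale(symbol, buy_date, loss_sales, identical=None):
    """Return True if buying `symbol` on `buy_date` falls within 30 days of a
    loss sale of the same or a substantially identical security.

    loss_sales: iterable of (symbol, sale_date) for positions closed at a loss.
    identical:  hypothetical mapping of symbol -> symbols treated as
                substantially identical (illustrative only).
    """
    identical = identical or {}
    related = {symbol} | set(identical.get(symbol, ()))
    return any(
        sold_symbol in related and abs(buy_date - sale_date) <= WASH_SALE_WINDOW
        for sold_symbol, sale_date in loss_sales
    )


if __name__ == "__main__":
    losses = [("SPY", date(2026, 3, 1))]
    print(buy_would_trigger_wash_sale("SPY", date(2026, 3, 20), losses))
    print(buy_would_trigger_wash_sale("SPY", date(2026, 4, 1), losses))
```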

Pattern Day Trader (PDT) Monitoring

Control	Implementation
Round-trip counter	Rolling 5-business-day window
Threshold alert	Warning at 3 round trips; the 4th is blocked unless the account equity check passes
Account equity check	Verify > $25,000 before allowing 4th round trip
Override	Disabled in production; available in paper mode only
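
The counter and threshold logic above can be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, the window skips weekends but not exchange holidays, and the decision is evaluated for one prospective round trip on top of those already completed in the window.

```python
from datetime import date, timedelta

PDT_EQUITY_MINIMUM = 25_000  # dollars, from the table above


def window_start(today: date, business_days: int = 5) -> date:
    """Earliest date in a window of `business_days` weekdays ending on `today`.

    Assumption: weekends are skipped; exchange holidays are not handled.
    """
    day, counted = today, 0
    while True:
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            counted += 1
            if counted == business_days:
                return day
        day -= timedelta(days=1)


def pdt_decision(round_trip_dates, today: date, equity: float) -> str:
    """Decide whether one more round trip today should be allowed."""
    start = window_start(today)
    completed = sum(1 for d in round_trip_dates if start <= d <= today)
    prospective = completed + 1  # the round trip under consideration
    if prospective >= 4:
        return "allow" if equity > PDT_EQUITY_MINIMUM else "block"
    if prospective == 3:
        return "warn"
    return "allow"


if __name__ == "__main__":
    trips = [date(2026, 4, 6), date(2026, 4, 7), date(2026, 4, 8)]
    print(pdt_decision(trips, date(2026, 4, 10), equity=10_000))
```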

Position Limits

Control	Implementation
Per-symbol max	Configurable in `qgtm_core/config.py`; enforced pre-trade
Portfolio concentration	No single position > 20% of NAV (default)
Sector concentration	No sector > 40% of NAV (default)
Leverage cap	Max 1.0x (no leverage by default); configurable per regime
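
A pre-trade check against these defaults might look like the sketch below. The function name, argument shapes, and violation labels are hypothetical. It assumes positions are given as post-trade market values and that leverage is measured as gross exposure divided by NAV; the per-symbol maximum from `qgtm_core/config.py` is not modeled here.

```python
def check_position_limits(
    positions,
    sectors,
    nav,
    *,
    max_position=0.20,   # single position cap as a fraction of NAV (default above)
    max_sector=0.40,     # sector cap as a fraction of NAV (default above)
    max_leverage=1.0,    # gross exposure / NAV (default above)
):
    """Return a list of limit violations for a proposed post-trade portfolio.

    positions: symbol -> market value after the proposed trade.
    sectors:   symbol -> sector name; unknown symbols fall into "UNKNOWN".
    """
    violations = []

    gross = sum(abs(value) for value in positions.values())
    if gross > max_leverage * nav:
        violations.append("leverage")

    sector_totals = {}
    for symbol, value in positions.items():
        if abs(value) > max_position * nav:
            violations.append(f"position:{symbol}")
        sector = sectors.get(symbol, "UNKNOWN")
        sector_totals[sector] = sector_totals.get(sector, 0.0) + abs(value)

    for sector, total in sector_totals.items():
        if total > max_sector * nav:
            violations.append(f"sector:{sector}")

    return violations


if __name__ == "__main__":
    book = {"AAA": 15_000, "BBB": 15_000, "CCC": 15_000}
    tags = {"AAA": "tech", "BBB": "tech", "CCC": "tech"}
    print(check_position_limits(book, tags, nav=100_000))
```

An empty list means the order may proceed; any entry would block submission before the order reaches the broker.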

Reporting

Tax lots: Tracked per trade for Schedule D / Form 8949
Wash-sale adjustments: Flagged and logged for tax preparer
1099-B reconciliation: Quarterly reconciliation against broker statements

Document History

Date	Change	Author
2026-04-12	Initial creation	@QGTMAI