05 — Portfolio Integration: Ensemble Construction + Risk Layer
Status: Research + Spec (no production code in this document)
Scope: The layer that combines validated strategy sleeves (trend, carry, vol-premium,
macro/regime, stat-arb/pairs) into one diversified, risk-targeted, drawdown-controlled
portfolio.
Owner system: qgtm_portfolio/, qgtm_risk/, qgtm_live/daemon.py, qgtm_execution/
Assumes: A separate promotion gate (built by another agent) decides which sleeves receive
capital. This spec decides how much each promoted sleeve gets, and how the book is risk-managed.
0. TL;DR — Recommendation
Architecture (the one-paragraph version). Stop combining signals and start combining portfolios. Run a two-step allocator: first, normalize and volatility-target each sleeve so every promoted sleeve speaks the same risk language (Stage 0 in §3); second, allocate capital across sleeves with Hierarchical Risk Parity (HRP) on a Ledoit–Wolf-shrunk covariance of sleeve returns, with Equal Risk Contribution (ERC) as the robustness cross-check (Stage 1 in §3). Assemble and net at the symbol level, apply one portfolio volatility target, enforce gross/net/concentration caps, then pass the book through a smooth drawdown governor that de-risks continuously (replacing today's 1.0 → 0.5 → 0.0 cliff) with the discrete kill-tier ladder kept as a safety overlay. Regime is a bounded tilt on the vol target and sleeve weights, not a hard method switch. Turnover is controlled with EWMA-smoothed sleeve weights, a no-trade band, cross-sleeve netting on the execution path, and a slow cadence for the risk allocation decoupled from fast signal refresh.
Why HRP/ERC and not mean-variance / min-variance: with ~5–12 sleeves and ~1–2 years of daily data, we are squarely in the regime where convex optimizers amplify estimation error and lose to naïve diversification out-of-sample (DeMiguel, Garlappi & Uppal 2009; Ledoit & Wolf 2004; López de Prado 2016). HRP needs no matrix inversion, survives a singular/near-duplicate covariance, and clusters correlated sleeves so they don't double-count. The real risk in this layer is estimation error and overfitting, not insufficient optimization.
First production configuration (robust, low-parameter):
| Knob | First value | Rationale |
|---|---|---|
| Cross-sleeve method | HRP (fallback ladder HRP → ERC → inverse-vol → 1/N) | Robust at N≈T, no inversion |
| Sleeve covariance | Ledoit–Wolf shrinkage, 252-day window, EWMA half-life 63d | Stable Σ; reuses existing estimator |
| Per-sleeve vol target | 10% annualized | Equal-risk sleeves |
| Portfolio vol target | 10% annualized (regime band 8–12%) | One vol-target stage only |
| Leverage bounds | L ∈ [0.3, 2.0] | Conservative vs. current 3.0 |
| Gross / net / single-name caps | 2.0× / 1.0× / 15% of equity | Current RiskLimits defaults, gross lowered from 3.0 |
| Skill tilt | OFF for first ~40 trading days, then bounded ±25% | Don't let noisy IC drive the base |
| Drawdown governor | smooth taper from DD 5% → floor 0.2 at 12% → cooloff at 15% | Continuous de-risk |
| Meta-allocation cadence | daily @ 15:30 ET, EWMA-smoothed (λ=0.8); intraday only refreshes symbol targets | Risk weights shouldn't whipsaw intraday |
| No-trade band | 0.75% of equity per name | Turnover control |
| Estimation hygiene | weights at t use data ≤ t-1; purge same-day; no truncate-to-shortest | No look-ahead |
Expected diversification uplift: roughly 1.6×–2.2× a single sleeve's Sharpe under realistic sleeve counts/correlations — but bank only a fraction of the theoretical figure because correlations rise toward 1 in crises (see §8–§9).
1. How allocation works today (current-state audit)
The live decision path is TradingDaemon._rebalance() in qgtm_live/daemon.py. Tracing it end to
end, signals become positions like this:
- Per-strategy signal generation. Each of the ~52 PM strategies emits Signal objects with a signed weight ∈ [-1, 1] (daemon._rebalance, signal loop ~L3033–3150).
- Signal aggregation (telemetry path). self.signal_aggregator.aggregate(strategy_signals) (daemon L3236) runs SignalAggregator in qgtm_portfolio/signal_aggregator.py: per-strategy weight = inverse_vol × skill_score × capacity_headroom × meta_label_confidence, nets equivalent ETF wrappers (_net_equivalent_wrappers, e.g. GLD/IAU/SGOL), clips each symbol to ±40%, normalizes to target_gross_exposure = 2.0. This result (agg_result) is stored for telemetry (_last_agg_result) but is not what actually sizes orders.
- Drawdown overlay (applied to the telemetry vector). self.drawdown_manager.apply(...) (daemon L3298) scales agg_result by a stepwise multiplier (1.0 / 0.5 / 0.0 per DrawdownManager._compute_multiplier).
- Kill-switch coordination. self.risk.sync_drawdown_state(...) (daemon L3320) maps drawdown onto RiskManager kill tiers (WARN/THROTTLE/NO_NEW/FLATTEN) in qgtm_risk/manager.py.
- Ensemble allocation (execution path). combined = self.allocator.allocate_regime_aware(...) (daemon L3353). EnsembleAllocator (qgtm_portfolio/ensemble_allocator.py) wraps the legacy PortfolioAllocator (qgtm_portfolio/allocator.py) and:
  - shrinks each strategy's signals by IC skill (ic_tracker.skill_score) × correlation discount (correlation_penalty.correlation_discount);
  - combines signals weighted by per-strategy weight (default equal weight / 1/N), blends with regime sector weights by confidence, applies adaptive_leverage (0.5×–2.0×);
  - replaces signal magnitudes with min_variance optimizer output over the symbol covariance (rolling 60-day Ledoit–Wolf via covariance.rolling_covariance), then scales the whole vector to target_vol = 0.15 with leverage bounded [0.25, 3.0].
- Hygiene. Drop operator-killed symbols, synthesize close-outs for stale/dust positions (daemon L3385–3453).
- Per-signal sizing → execution. For each combined signal, RiskManager.plan_order (qgtm_risk/manager.py L333) sizes notional as signal.weight × (vol_target / realized_vol) × kelly_fraction (0.5) × throttle_factor, clipped to the per-symbol cap and a 3.0× gross-of-equity ceiling, then a pre-trade compliance check and OMS submission.
1.1 Weaknesses (this is the case for the redesign)
- W1: Volatility is targeted twice. EnsembleAllocator scales to 15% ex-ante vol, then RiskManager.plan_order independently re-scales by vol_target/realized_vol × half-Kelly. Two compounding vol controls mean the realized portfolio vol target is not well defined.
- W2: The drawdown governor de-risks the wrong vector. drawdown_manager.apply() scales agg_result, which is not executed. The executed combined book only feels drawdown through the discrete kill-tier throttle_factor (1.0 → 0.5 → 0.0). So the "smooth" governor is effectively decorative on the live path and the real response is a cliff. The mandate explicitly wants smooth de-risking.
- W3: Cross-strategy allocation is 1/N, not risk-based. PortfolioAllocator(method="equal_risk") returns equal weights because strategy_performance is never passed in (allocator._compute_weights). A 30%-vol stat-arb sleeve and a 6%-vol carry sleeve get equal nominal weight, so portfolio risk is dominated by the most volatile sleeves. Equal nominal weight ≠ equal risk.
- W4: It combines signals, not portfolios. Strategy signals are summed at the symbol level and a single optimizer runs over the combined symbol covariance. There is no allocation across sleeves based on the sleeve-return covariance, so cross-sleeve diversification is implicit and uncontrolled, and correlated sleeves (e.g. two trend variants) double-count.
- W5: min_variance is the fragile choice. Minimum-variance over a 60-day shrunk covariance on a ~40–50-name universe (N≈T) is the exact estimation-error-amplifying setup HRP/ERC were designed to avoid. It corner-concentrates into the lowest-vol names (subject to the 40% cap), which reduces diversification.
- W6: Estimation hygiene gaps. The covariance provider truncates every series to the shortest common history (make_covariance_provider), silently discarding data and biasing toward whichever sleeve/asset has least history; 60 obs is short for the universe size; realized_vol for sizing is read from signal.metadata["realized_vol"] and is frequently 0 (→ plan_order returns 0 → trade silently dropped). No explicit point-in-time contract is documented.
- W7: The good regime machinery is dead code on the live path. RegimeAllocator._REGIME_OPT_MAP (risk_on→risk_parity, crisis→cash+hedge, transition→HRP) and the VolTargeter regime vol bands live inside PortfolioOrchestrator, which the daemon never calls. The live regime overlay is the cruder allocate_regime_aware sector-weight blend.
- W8: Turnover control is downstream-only. Equivalent-wrapper netting lives in the unused aggregator path. The live path cancels all open orders and re-solves min_variance every rebalance (up to 4×/day in day-trading mode). There is a cost gate (PreTradeCostModel) and dust/stale close-out, but no no-trade band, turnover penalty, or weight-change threshold in the allocator itself.
- W9: No capital ladder. target_gross_exposure (1.0→2.0) and max_leverage (2.0→3.0) were hand-bumped "to use capital". VehicleRouter (which would pick ETF vs. futures by account size) is off the live path.
- W10: Sizing blurs conviction and risk. Taking signs from the legacy allocator and magnitudes from min_variance discards each sleeve's within-sleeve conviction sizing.
Recurring theme: a lot of correct machinery already exists (HRP, ERC, Ledoit–Wolf, IC tracker, correlation penalty, kill tiers, stress lab) — it is either not wired into the executed path or wired in a fragile configuration (min-variance over symbols, 1/N over sleeves, double vol-target, cliff drawdown). The redesign is mostly re-wiring and consolidating into one coherent pipeline, not green-field building.
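The W3 point (equal nominal weight ≠ equal risk) can be made concrete with the two sleeve vols quoted there. The sketch below is illustrative only: `risk_shares` is a hypothetical helper, and it assumes zero correlation between the two sleeves, which the audit does not state.

```python
def risk_shares(weights, vols):
    """Share of portfolio variance per sleeve, ASSUMING zero cross-sleeve correlation."""
    contrib = [(w * v) ** 2 for w, v in zip(weights, vols)]
    total = sum(contrib)
    return [c / total for c in contrib]

vols = [0.30, 0.06]            # stat-arb and carry sleeve vols from W3
equal_nominal = [0.5, 0.5]     # what the 1/N path effectively does today

inv = [1.0 / v for v in vols]
inverse_vol = [x / sum(inv) for x in inv]   # equal-risk weights when correlation is zero

print("equal nominal:", risk_shares(equal_nominal, vols))
print("inverse vol:  ", risk_shares(inverse_vol, vols))
```

Under these assumptions the 30%-vol sleeve carries about 96% of portfolio variance at equal nominal weight, versus 50% once the weights are inverse-vol.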
2. Research foundation (grounding + citations)
| Method | Takeaway used here | Source |
|---|---|---|
| 1/N benchmark / estimation error | Across 14 optimizing models and 7 datasets, none consistently beats naïve 1/N out-of-sample on Sharpe/CEQ/turnover; the estimation window needed for sample MV to beat 1/N is ~3,000 months (25 assets) / ~6,000 months (50 assets). Any optimizer must clear the 1/N bar net of costs to justify itself. | DeMiguel, Garlappi & Uppal (2009), Review of Financial Studies 22(5):1915–1953 |
| Covariance shrinkage | "Nobody should be using the sample covariance matrix for portfolio optimization." Use a convex blend δF + (1−δ)S; data-driven δ; PSD even when T≈N. | Ledoit & Wolf (2004), J. Portfolio Management 30(4):110–119 |
| HRP | Cluster → quasi-diagonalize → recursive bisection. No matrix inversion (works on singular Σ), lower out-of-sample variance than the Critical Line Algorithm even though min-variance is CLA's objective; advantage grows with N/T. | López de Prado (2016), J. Portfolio Management 42(4):59–69, DOI 10.3905/jpm.2016.42.4.059; weight-noise proof in Antonov, Lipton & López de Prado (2024) |
| ERC / risk budgeting | Equalize each component's total risk contribution; σ_minvar ≤ σ_ERC ≤ σ_1/N; "minimum variance subject to a diversification constraint"; robust to covariance estimation error. Caveat: more turnover than 1/N, computationally heavier. | Maillard, Roncalli & Teiletche (2010), J. Portfolio Management 36(4):60–70 |
| Volatility targeting | Scaling exposure inversely to recent variance raised Sharpe ~25% for the market factor and produced positive alpha across value/momentum/carry. | Moreira & Muir (2017), Journal of Finance 72(4):1611–1644, DOI 10.1111/jofi.12513 |
| Vol-targeting caveat | Out-of-sample, vol management does not systematically beat unmanaged portfolios; gains concentrate in momentum/trend; the naïve scaling coefficient suffers look-ahead bias. → Use lagged realized vol, longer half-lives, and expect the benefit mostly on the trend/momentum sleeves. | Cederburg, O'Doherty, Wang & Yang (2020), J. Financial Economics; Liu, Tang & Zhou (2019) |
| Fractional Kelly | Full Kelly maximizes log-growth but is extremely sensitive to mean estimates; half-Kelly trades a small amount of growth for a large reduction in drawdown/ruin risk; "security is gained by reducing the Kelly fraction" under estimation error. → Keep the existing half-Kelly and apply it once. | MacLean, Thorp & Ziemba (2010), Quantitative Finance 10(7):681–687; MacLean, Ziemba & Blazenko (1992), Management Science 38(11) |
| Combining strategies (ensembling) | Treat each validated strategy as an investable asset and allocate across them with risk parity/ERC; a portfolio-level clipping layer is required on top of independent sleeve sizing; "combine portfolios" preserves sleeve diversification whereas "combine signals" trades diversity for signal quality. | Practitioner consensus (delphicalpha "7 Layers of Ensembling"; QuantInsti multi-strategy; Quant.SE) |
| NCO (documented upgrade) | Cluster → intra-cluster allocation → inter-cluster allocation → dot product; ~55% RMSE reduction vs. Markowitz for the max-Sharpe portfolio; agnostic to the inner optimizer. | López de Prado (2019), "A Robust Estimator of the Efficient Frontier", SSRN 3469961; Machine Learning for Asset Managers (2020) |
| Overfitting controls | PBO via combinatorially symmetric / purged cross-validation: fraction of train/test splits where the in-sample winner ranks below the out-of-sample median; PBO > 0.5 ⇒ the selection process overfits. Deflated Sharpe Ratio corrects an observed Sharpe for the number of trials + non-normality. | Bailey, Borwein, López de Prado & Zhu (2015), J. Computational Finance; Bailey & López de Prado (2014), J. Portfolio Management 40(5):94–107 |
Synthesis that drives the design. Returns are nearly unforecastable and covariance is noisy at N≈T; therefore the robust play is to (a) drop expected-return inputs at the cross-sleeve level, (b) use a shrunk covariance, (c) prefer a hierarchical/risk-budgeting allocator that needs no inversion, (d) target volatility with lagged estimates, (e) size with half-Kelly once, and (f) benchmark everything against 1/N and gate it with PBO/DSR. This is a deliberately low-parameter design: fewer knobs ⇒ less overfitting surface.
3. Target architecture — the Portfolio Integration Layer (PIL)
promoted sleeves (trend, carry, vol-prem, macro, stat-arb …)
│ each emits a signed symbol-weight vector w_s + a realized return series r_s
▼
┌──────────────────────────────────────────────────────────────────────────┐
│ STAGE 0 Sleeve isolation & normalization │
│ • normalize each w_s to unit gross (‖w_s‖₁ = 1) │
│ • per-sleeve vol target: scale w_s to σ_sleeve (10%) using lagged σ̂_s │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 1 Cross-sleeve meta-allocation ←★ the core │
│ • Σ_S = LedoitWolf( sleeve-return matrix ) (K×K, K = #sleeves) │
│ • a = HRP(Σ_S) [ERC cross-check; fallback inverse-vol → 1/N] │
│ • bounded skill tilt from ic_tracker; per-sleeve cap; EWMA-smooth a │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 2 Assemble & net │
│ • W = Σ_s a_s · (vol-scaled w_s); net at symbol level │
│ • net equivalent wrappers (GLD/IAU/SGOL) ON THE EXEC PATH │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 3 ONE portfolio vol target │
│ • L = clip(σ_target,port / sqrt(WᵀΣ_assetsW), L_min, L_max); W ← L·W │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 4 Exposure caps (gross / net / per-name / per-cluster) │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 5 Smooth drawdown governor + kill-tier safety overlay │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 6 Regime overlay = bounded tilt on vol target + sleeve weights │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 7 Turnover control (no-trade band, EWMA, cost gate, cadence split) │
├──────────────────────────────────────────────────────────────────────────┤
│ STAGE 8 Capital ladder (account-size → gross cap, vehicle, min ticket) │
└──────────────────────────────────────────────────────────────────────────┘
▼
final target book → RiskManager.plan_order (half-Kelly, ONCE) → compliance → OMS
Stage 0 — Sleeve isolation & normalization
Each promoted sleeve s is treated as a self-contained mini-portfolio. Two transforms put all sleeves
on a common footing:
- Normalize w_s ← w_s / ‖w_s‖₁ so a sleeve that emits weight=1.0 on everything cannot dominate
a sleeve that emits 0.2; conviction becomes relative within the sleeve.
- Per-sleeve vol target: scale w_s by σ_sleeve / σ̂_s, where σ̂_s is the sleeve's own
lagged realized return volatility (EWMA, half-life ~21–63d, floored). After this step every sleeve has
the same ex-ante standalone vol (≈10%). This is the "combine portfolios at equal risk" precondition.
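A minimal sketch of the two Stage 0 transforms, under stated assumptions: the function names are hypothetical, the 42-day half-life is just a point inside the 21–63d range above, the 2% vol floor is an illustrative value, and the EWMA is taken around a zero mean. It is not the production estimator.

```python
import math

def normalize_unit_gross(w):
    """Scale a symbol->weight dict so the absolute weights sum to 1."""
    gross = sum(abs(x) for x in w.values())
    return dict(w) if gross == 0 else {k: x / gross for k, x in w.items()}

def ewma_vol(returns, half_life=42, periods_per_year=252):
    """Annualized zero-mean EWMA vol of daily returns (newest observation last)."""
    if not returns:
        raise ValueError("need at least one lagged return")
    decay = 0.5 ** (1.0 / half_life)
    num = den = 0.0
    weight = 1.0
    for r in reversed(returns):      # newest return gets weight 1, older ones decay
        num += weight * r * r
        den += weight
        weight *= decay
    return math.sqrt(periods_per_year * num / den)

def vol_target_sleeve(w, lagged_returns, sigma_sleeve=0.10, sigma_floor=0.02):
    """Unit-gross normalize, then scale to the per-sleeve vol target.

    lagged_returns must end at t-1 (no same-day data); sigma_floor is an assumed value.
    """
    w_unit = normalize_unit_gross(w)
    sigma_hat = max(ewma_vol(lagged_returns), sigma_floor)
    scale = sigma_sleeve / sigma_hat
    return {k: x * scale for k, x in w_unit.items()}

# A sleeve that has been moving about 1% a day, emitting equal long/short conviction.
scaled = vol_target_sleeve({"SPY": 1.0, "TLT": -1.0}, [0.01, -0.01] * 60)
print(scaled)
```

The floor matters in practice: without it, a sleeve with a quiet recent history would be levered up without bound.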
Stage 1 — Cross-sleeve meta-allocation (the core change)
Estimate the sleeve-return covariance Σ_S (K×K) from the per-sleeve realized P&L series with
Ledoit–Wolf shrinkage (reuse covariance.ledoit_wolf_shrinkage). Allocate capital across sleeves with
HRP (reuse optimizer.hrp):
1. distance d_ij = sqrt(0.5(1−ρ_ij)) from the sleeve correlation;
2. hierarchical linkage + quasi-diagonalization;
3. recursive bisection allocating inversely to each cluster's variance.
HRP is the production default because it (a) needs no inversion and survives a singular Σ_S (very
likely when two promoted sleeves are near-duplicates), (b) clusters correlated sleeves and splits
risk hierarchically, which is the structural fix for double-counting (W4), and (c) has the lowest
weight-estimation noise of the candidates (Antonov et al. 2024). ERC (optimizer.risk_parity) runs
in parallel as a cross-check: a large HRP-vs-ERC weight divergence is a useful instability alarm.
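The three HRP steps above can be sketched in plain Python for a small sleeve count. This is an illustration under assumptions, not the `optimizer.hrp` implementation: it uses naive single-linkage directly on the correlation distance, takes the resulting leaf order as the quasi-diagonal order, and bisects that order in halves. All function names are hypothetical.

```python
import math

def _corr(cov):
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))] for i in range(len(cov))]

def _single_linkage_order(dist):
    """Leaf order of a naive single-linkage tree (merged clusters stay contiguous)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return clusters[0]

def _cluster_var(cov, items):
    """Variance of a cluster under intra-cluster inverse-variance weights."""
    ivp = [1.0 / cov[i][i] for i in items]
    w = [x / sum(ivp) for x in ivp]
    return sum(w[p] * cov[items[p]][items[q]] * w[q]
               for p in range(len(items)) for q in range(len(items)))

def hrp_weights(cov):
    k = len(cov)
    rho = _corr(cov)
    dist = [[math.sqrt(max(0.0, 0.5 * (1.0 - rho[i][j]))) for j in range(k)] for i in range(k)]
    order = _single_linkage_order(dist)
    w = [1.0] * k
    stack = [order]
    while stack:                                 # recursive bisection, done iteratively
        items = stack.pop()
        if len(items) < 2:
            continue
        left, right = items[: len(items) // 2], items[len(items) // 2:]
        v_l, v_r = _cluster_var(cov, left), _cluster_var(cov, right)
        alpha = 1.0 - v_l / (v_l + v_r)          # lower-variance side gets more capital
        for i in left:
            w[i] *= alpha
        for i in right:
            w[i] *= 1.0 - alpha
        stack += [left, right]
    return w

# Two perfectly correlated sleeves (singular covariance) plus one independent sleeve.
singular = [[0.01, 0.01, 0.0], [0.01, 0.01, 0.0], [0.0, 0.0, 0.01]]
print(hrp_weights(singular))
```

No matrix is inverted anywhere, so the singular input is handled without special casing, and the duplicated pair shares one cluster's allocation rather than being counted twice.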
Then apply, in order:
- a per-sleeve cap (e.g. 25–35%) and renormalize;
- a bounded skill tilt: a_s ← a_s · clip(skill_mult_s, 1−κ, 1+κ) where skill_mult_s derives
from ic_tracker.skill_score(s) ∈ [0,1] and κ ≈ 0.25; renormalize. This is the gentle "combine
signals" benefit layered on the robust "combine portfolios" base — bounded so a noisy IC can never
swing the base allocation.
- EWMA smoothing of the final sleeve weights: a_t = λ·a_{t-1} + (1−λ)·a_target, λ≈0.8, to damp
turnover from week-to-week covariance noise.
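The three post-HRP steps listed above, in the same order, might look like the sketch below. The helper names are hypothetical; the excess above the cap is redistributed pro rata to uncapped sleeves (one possible reading of "cap and renormalize"), the skill multiplier is taken as 2·skill as in §4, and a missing skill score defaults to the neutral 0.5, which is an assumption.

```python
def cap_and_renormalize(a, a_max):
    """Cap each sleeve at a_max, redistributing excess pro rata. Assumes sum(a) == 1."""
    if a_max * len(a) < 1.0 - 1e-12:
        raise ValueError("cap too tight: weights cannot sum to 1")
    a = dict(a)
    capped = set()
    while True:
        over = [s for s, v in a.items() if s not in capped and v > a_max + 1e-12]
        if not over:
            return a
        capped.update(over)
        free_mass = sum(v for s, v in a.items() if s not in capped)
        target_free = 1.0 - a_max * len(capped)
        for s in a:
            if s in capped:
                a[s] = a_max
            elif free_mass > 0.0:
                a[s] *= target_free / free_mass

def skill_tilt(a, skill, kappa=0.25):
    """a_s <- a_s * clip(2*skill_s, 1-kappa, 1+kappa), then renormalize to sum 1."""
    tilted = {s: v * min(max(2.0 * skill.get(s, 0.5), 1.0 - kappa), 1.0 + kappa)
              for s, v in a.items()}
    total = sum(tilted.values())
    return {s: v / total for s, v in tilted.items()}

def ewma_smooth(prev, target, lam=0.8):
    """a_t = lam * a_{t-1} + (1 - lam) * a_target."""
    keys = set(prev) | set(target)
    return {s: lam * prev.get(s, 0.0) + (1.0 - lam) * target.get(s, 0.0) for s in keys}

base = {"trend": 0.6, "carry": 0.3, "statarb": 0.1}
capped = cap_and_renormalize(base, a_max=0.35)
tilted = skill_tilt(capped, {"trend": 1.0, "carry": 0.5, "statarb": 0.0})
smoothed = ewma_smooth(prev=capped, target=tilted)
print(capped, tilted, smoothed, sep="\n")
```

After the tilt and renormalization a sleeve can sit slightly above the cap; whether to re-apply the cap at that point is left open here.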
Documented upgrade path: swap Stage 1 for NCO (cluster sleeves with the existing
portfolio_optimization.compute_correlation_clusters, ERC within clusters, ERC across clusters). Keep
it out of the first config to minimize parameters.
Stage 2 — Assemble & net
W = Σ_s a_s · (vol-scaled w_s), summed at the symbol level so offsetting long/short exposures across
sleeves cancel before sizing. Run _net_equivalent_wrappers (GLD/IAU/SGOL → primary) on the
execution path (fixes W8). Output: the pre-risk target book.
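As a sketch of the assemble-and-net step: sleeve books are summed per symbol after wrapper symbols are mapped to a primary. The function and the wrapper map are hypothetical; the spec names GLD/IAU/SGOL as equivalents but does not say which is primary, so GLD is an assumption here.

```python
from collections import defaultdict

# ASSUMPTION: GLD is the primary for the gold wrappers; the spec only lists the group.
WRAPPER_PRIMARY = {"IAU": "GLD", "SGOL": "GLD"}

def assemble_book(sleeve_alloc, sleeve_weights, wrapper_primary=WRAPPER_PRIMARY):
    """W = sum_s a_s * w_s, summed per symbol after mapping wrappers to their primary."""
    book = defaultdict(float)
    for sleeve, a in sleeve_alloc.items():
        for symbol, w in sleeve_weights[sleeve].items():
            book[wrapper_primary.get(symbol, symbol)] += a * w
    return dict(book)

alloc = {"trend": 0.6, "carry": 0.4}
weights = {
    "trend": {"GLD": 0.5, "SPY": 0.5},     # vol-scaled sleeve books from Stage 0
    "carry": {"IAU": -0.5, "TLT": 0.5},    # short gold through a different wrapper
}
print(assemble_book(alloc, weights))
```

Here the trend sleeve's long GLD and the carry sleeve's short IAU net to a single small GLD position before any order is sized, instead of generating two offsetting trades.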
Stage 3 — One portfolio volatility target
Compute ex-ante portfolio vol σ_ex = sqrt(Wᵀ Σ_assets W) with Σ_assets from
covariance.rolling_covariance (or factor_model_covariance via pm_factor_model when N>T). Scale
W ← L·W, L = clip(σ_target,port / σ_ex, L_min, L_max). Vol targeting happens exactly here and
nowhere else. Neutralize the second vol scaling inside RiskManager.plan_order (treat W as the final
target weight; plan_order keeps only the half-Kelly haircut and the caps). Fixes W1.
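The single vol-target stage reduces to one leverage scalar. A minimal sketch, with hypothetical function names and the first-config bounds from §0; returning the floor leverage when the estimated vol is zero is an assumption, not something the spec defines.

```python
import math

def portfolio_vol(weights, cov):
    """Ex-ante vol sqrt(W' Sigma W) for list-based weights and covariance."""
    n = len(weights)
    var = sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))
    return math.sqrt(max(var, 0.0))

def vol_target_leverage(weights, cov, sigma_target=0.10, l_min=0.3, l_max=2.0):
    """L = clip(sigma_target / sigma_ex, l_min, l_max)."""
    sigma_ex = portfolio_vol(weights, cov)
    if sigma_ex == 0.0:
        return l_min   # ASSUMPTION: degenerate estimate falls back to the floor
    return min(max(sigma_target / sigma_ex, l_min), l_max)

cov = [[0.04, 0.0], [0.0, 0.04]]        # two uncorrelated 20%-vol assets
w = [0.5, 0.5]
lev = vol_target_leverage(w, cov)
print(lev, [lev * x for x in w])
```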
Stage 4 — Exposure caps
Gross Σ|W| ≤ G_max, net |ΣW| ≤ N_max, per-name ≤ c_name, per-cluster ≤ c_cluster. Reuse
RiskLimits + PreTradeCompliance. Apply the gross cap after vol targeting:
W ← W · min(1, G_max / Σ|W|).
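A sketch of the cap sequence (per-name clip, gross scale, net check) using the first-config limits. `apply_exposure_caps` is hypothetical. For the net cap it scales the dominant side proportionally, which is a simpler stand-in for trimming the largest net contributors and should be read as an assumption.

```python
def apply_exposure_caps(book, c_name=0.15, g_max=2.0, n_max=1.0):
    """Per-name clip, then gross scale, then net trim on a symbol->weight dict."""
    capped = {s: max(-c_name, min(c_name, w)) for s, w in book.items()}

    gross = sum(abs(w) for w in capped.values())
    if gross > g_max:
        capped = {s: w * g_max / gross for s, w in capped.items()}

    net = sum(capped.values())
    if abs(net) > n_max:
        longs = sum(w for w in capped.values() if w > 0)
        shorts = sum(w for w in capped.values() if w < 0)
        if net > 0:    # shrink longs until longs + shorts == n_max
            f = (n_max - shorts) / longs
            capped = {s: (w * f if w > 0 else w) for s, w in capped.items()}
        else:          # shrink shorts until longs + shorts == -n_max
            f = (-n_max - longs) / shorts
            capped = {s: (w * f if w < 0 else w) for s, w in capped.items()}
    return capped

all_long = {f"N{i}": 0.20 for i in range(10)}
out = apply_exposure_caps(all_long)
print(sum(abs(w) for w in out.values()), sum(out.values()))
```

Every step only shrinks positions, so a later step cannot re-breach an earlier cap.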
Stage 5 — Smooth drawdown governor (+ kill-tier safety overlay)
Replace the stepwise multiplier with a continuous function m(DD) applied to the final executed
book (fixes W2):
- m(DD) = 1 for DD ≤ d0;
- m(DD) = 1 − (1 − m_min)·(DD − d0)/(dH − d0) for d0 < DD < dH;
- m(DD) = m_min for DD ≥ dH (then 0 during cooloff).
d0=5%, dH=12%, m_min=0.2, cooloff/flat at the 15% hard cap. Measure DD in vol-adjusted
units (DD / σ_target,port) so the same fractional drawdown triggers consistently across vol regimes.
The discrete kill tiers in RiskManager remain as a safety overlay that can only further reduce
exposure, never raise it (belt and suspenders). sync_drawdown_state is the single coordination
point so the two governors never disagree.
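The governor is a piecewise-linear function of drawdown. The sketch below uses the parameters stated above; the function names are hypothetical, drawdown is passed as a plain fraction (the vol-adjusted unit conversion is not modeled), and the cooloff is shown only as the stateless "zero at the hard cap" case, without the timer that would keep it at zero afterwards.

```python
def drawdown_multiplier(dd, d0=0.05, d_h=0.12, m_min=0.2, d_hard=0.15):
    """Continuous de-risking multiplier m(DD) with a hard zero at d_hard."""
    if dd >= d_hard:
        return 0.0          # cooloff / flat at the hard cap
    if dd <= d0:
        return 1.0
    if dd >= d_h:
        return m_min
    return 1.0 - (1.0 - m_min) * (dd - d0) / (d_h - d0)

def governed_scale(dd, kill_tier_throttle):
    """Smooth governor times the kill-tier throttle, which can only reduce exposure."""
    return drawdown_multiplier(dd) * min(kill_tier_throttle, 1.0)

for dd in (0.03, 0.085, 0.12, 0.15):
    print(dd, round(drawdown_multiplier(dd), 3))
```

Because the throttle is clipped at 1 before multiplying, the overlay can never raise exposure above what the smooth governor allows.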
Stage 6 — Regime overlay (bounded tilt, not a switch)
Wire the existing regime knowledge as continuous tilts, gated by regime confidence:
- scale the portfolio vol target within a band using _REGIME_VOL_TARGETS (risk_on → top of band,
crisis → bottom);
- apply a small bounded multiplicative tilt to sleeve weights (e.g. crisis: defensive sleeves
×1.2, cyclical ×0.8, capped) and/or nudge the HRP↔ERC blend.
Avoid hard method switching (risk_parity ↔ min_variance ↔ cash): discrete switches add path-dependence and a large overfitting surface, and regime detection is itself noisy. Tilts are bounded and confidence-weighted. This both uses the dormant regime machinery (W7) and keeps it robust.
Stage 7 — Turnover & rebalancing control
- No-trade band: only trade name i if |W_target,i − W_current,i| > τ (≈0.75% of equity).
- Cadence split: recompute the full meta-allocation (Stage 1) slowly, daily at 15:30 ET (or weekly). Intraday slots (REBALANCE_SCHEDULE_ET = 09:35 / 12:00 / 14:00 / 15:30) only refresh symbol-level targets within fixed sleeve weights. Risk allocation must not whipsaw intraday.
- Cost-aware: net offsets (Stage 2), use PreTradeCostModel to skip trades whose expected edge < expected cost, EWMA-smooth sleeve weights (Stage 1).
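The no-trade band is a simple filter on the difference between target and current weights. A minimal sketch with a hypothetical function name; weights are fractions of equity, so the band of 0.75% is 0.0075.

```python
def trades_outside_band(target, current, band=0.0075):
    """Return symbol -> weight change for names whose gap exceeds the no-trade band."""
    trades = {}
    for symbol in set(target) | set(current):
        delta = target.get(symbol, 0.0) - current.get(symbol, 0.0)
        if abs(delta) > band:
            trades[symbol] = delta
    return trades

target = {"SPY": 0.100, "TLT": 0.052, "GLD": 0.000}
current = {"SPY": 0.080, "TLT": 0.050, "GLD": 0.004}
print(trades_outside_band(target, current))
```

Only SPY trades in this example. Note that the GLD position is smaller than the band, so this filter alone would never close it; residuals like that are what the existing dust/stale close-out path handles.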
Stage 8 — Capital ladder
Account-size tiers control gross cap, vehicle selection, min ticket / dust thresholds, and per-name
caps. Wire VehicleRouter onto the exec path (fixes W9):
| Tier (equity) | Gross cap | Vehicles | Per-name cap | Notes |
|---|---|---|---|---|
| < $100k | 1.0× | ETFs only | 20% | dust threshold high; few names |
| $100k–$1M | 1.5× | ETFs (+ liquid futures where min-size allows) | 15% | current operating range |
| $1M–$10M | 2.0× | ETF + futures (margin-efficient) | 12% | migrate to futures via VehicleRouter |
| > $10M | up to firm cap (≤3.0×) | futures-first + options overlays | 10% | capacity-aware sleeve caps |
4. Allocation algorithm + exact math
Notation. Sleeves s = 1..K. Sleeve realized return at time t: r_{s,t}. Sleeve target weight
vector over symbols: w_s ∈ R^M. Asset covariance Σ_A ∈ R^{M×M}. Sleeve covariance Σ_S ∈ R^{K×K}.
Sleeve return definition (canonical). Formalize today's ad-hoc _strategy_pnl_buf / IC P&L into a
single series: r_{s,t} = Σ_i w_{s,i,t-1} · ret_{i,t} — the return the normalized sleeve target from
the prior rebalance would have earned. (This matches ic_tracker.rolling_sharpe's pnl_t but is the
authoritative input to Stage 1.)
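The lag in the canonical sleeve return is what enforces the point-in-time contract, so it is worth spelling out. A minimal sketch with a hypothetical function name; inputs are per-day dicts in chronological order, and a symbol missing from a day's returns is treated as a zero return, which is an assumption.

```python
def sleeve_returns(weights_by_day, asset_returns_by_day):
    """r_{s,t} = sum_i w_{s,i,t-1} * ret_{i,t}, using the PRIOR day's weights.

    Day 0 has no prior weights, so the output starts at day 1.
    """
    out = []
    for t in range(1, len(asset_returns_by_day)):
        w_prev = weights_by_day[t - 1]
        ret_t = asset_returns_by_day[t]
        out.append(sum(w * ret_t.get(sym, 0.0) for sym, w in w_prev.items()))
    return out

weights = [{"SPY": 1.0}, {"SPY": -1.0}, {"SPY": 1.0}]
returns = [{"SPY": 0.05}, {"SPY": 0.02}, {"SPY": 0.01}]
print(sleeve_returns(weights, returns))
```

The day-1 return uses the day-0 weight, and the day-0 asset return is never used at all: a sleeve cannot be credited for a move that happened before it held the position.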
Stage 0. w_s ← w_s / Σ_i|w_{s,i}|; then w_s ← w_s · (σ_sleeve / max(σ̂_s, σ_floor)), where
σ̂_s² = EWMA of r_{s,·}² (half-life h_σ).
Stage 1 — sleeve covariance & HRP.
- Σ_S = ledoit_wolf_shrinkage(R) where R is the T×K lagged sleeve-return matrix (annualized).
- ρ = corr(Σ_S), D_ij = sqrt(0.5(1−ρ_ij)), linkage → quasi-diagonal order → recursive bisection:
at each split into clusters (L, R), cluster variance v_C = w_C^T Σ_{S,C} w_C with intra-cluster
inverse-variance weights, allocation factor α = 1 − v_L/(v_L+v_R); propagate. Output base a^{HRP}.
- Skill tilt: a_s ← a^{HRP}_s · clip(2·skill_s, 1−κ, 1+κ), cap a_s ≤ a_max, renormalize Σ a_s=1.
- Smooth: a_t = λ a_{t-1} + (1−λ) a_target, where a_target is the capped, tilted, renormalized vector from the previous step.
ERC cross-check (Maillard et al.). Solve for a such that
a_i (Σ_S a)_i = a_j (Σ_S a)_j ∀ i,j, i.e. equal total risk contribution RC_i = a_i (Σ_S a)_i.
(optimizer.risk_parity already implements the Newton iteration.)
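For reference, the ERC condition can be solved for small K without any linear-algebra library. The sketch below is not the Newton iteration in `optimizer.risk_parity`; it swaps in cyclical coordinate descent on the risk-budgeting objective ½·xᵀΣx − Σ b_i·ln x_i with equal budgets b_i = 1/K, then normalizes. Function names are hypothetical, and a positive-definite Σ_S is assumed.

```python
import math

def erc_weights(cov, iters=500, tol=1e-12):
    """Equal-risk-contribution weights via cyclical coordinate descent."""
    k = len(cov)
    b = 1.0 / k                                   # equal risk budgets
    x = [1.0 / math.sqrt(cov[i][i]) for i in range(k)]
    for _ in range(iters):
        max_change = 0.0
        for i in range(k):
            # First-order condition for coordinate i: cov_ii*x_i^2 + c*x_i - b = 0
            c = sum(cov[i][j] * x[j] for j in range(k) if j != i)
            new = (-c + math.sqrt(c * c + 4.0 * cov[i][i] * b)) / (2.0 * cov[i][i])
            max_change = max(max_change, abs(new - x[i]))
            x[i] = new
        if max_change < tol:
            break
    total = sum(x)
    return [v / total for v in x]

def risk_contributions(a, cov):
    """RC_i = a_i * (Sigma a)_i."""
    k = len(a)
    return [a[i] * sum(cov[i][j] * a[j] for j in range(k)) for i in range(k)]

cov = [[0.04, 0.018, 0.0], [0.018, 0.09, 0.006], [0.0, 0.006, 0.01]]
a = erc_weights(cov)
print(a, risk_contributions(a, cov))
```

Comparing these weights with the HRP output on the same Σ_S gives the divergence alarm described in Stage 1.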
Stage 2. W = Σ_s a_s · w_s (symbol-summed), then equivalent-wrapper netting.
Stage 3. σ_ex = sqrt(W^T Σ_A W); L = clip(σ_P / σ_ex, L_min, L_max); W ← L·W.
Stage 4. Per-name clip W_i ← clip(W_i, −c_name, c_name); gross scale
W ← W·min(1, G_max/Σ|W_i|); verify |ΣW_i| ≤ N_max (trim the largest net contributors if not).
Stage 5. W ← m(DD)·W, then apply the kill-tier throttle_factor as W ← min(throttle,1)·W.
Sizing (once). Final notional per name: notional_i = sizing_base · W_i · kelly_fraction
(half-Kelly), with kelly_fraction = 0.5 retained from RiskLimits. The vol_target/realized_vol
term in plan_order is set to 1 because vol targeting already happened in Stage 3.
5. Inputs, estimation & look-ahead control
| Input | How estimated | Lookback / shrinkage | Look-ahead guard |
|---|---|---|---|
| Sleeve returns r_{s,t} | Σ_i w_{s,i,t-1}·ret_{i,t} (prior-bar weights × realized) | full available, ≥60d to activate | weights are lagged by construction |
| Sleeve covariance Σ_S | Ledoit–Wolf constant-correlation | 252d window, EWMA half-life 63d | use returns ≤ t-1; purge same-day bar |
| Sleeve vol σ̂_s | EWMA of r² | half-life 21–63d, floored | lagged |
| Asset covariance Σ_A | rolling_covariance (LW) or factor_model_covariance (N>T) | 90–120d; factor model for large N | lagged; do not truncate to shortest history; use pairwise-complete + shrinkage or the factor model |
| Skill / IC | ic_tracker.skill_score | window 21, half-life 63 | forecast-vs-next-period realized only |
| Regime | MarketRegimeDetector | as configured | features ≤ t-1 |
Estimation-error is the real enemy (call-out). Per DeMiguel et al. (2009), with K≈5–12 sleeves and ~250–500 daily observations we have nowhere near the sample size for mean-variance to beat 1/N. That is precisely why Stage 1 uses HRP/ERC (no μ, no inversion) + shrinkage and why the skill tilt is bounded. Two concrete anti-look-ahead fixes vs. today: (1) replace the truncate-to-shortest covariance assembly (W6) with pairwise-complete estimation + shrinkage (or the factor model); (2) make the point-in-time contract explicit — every weight applied at the open of bar t is a pure function of data through bar t-1.
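To illustrate the first fix, a pairwise-complete estimator uses, for each pair of series, every date on which both are observed, instead of truncating all series to the shortest history. The sketch below is a hypothetical helper, not `make_covariance_provider`; missing observations are represented as `None`, and `min_obs` is an assumed guard.

```python
def pairwise_cov(x, y, min_obs=2):
    """Sample covariance over the dates where both series are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < min_obs:
        raise ValueError("not enough overlapping observations")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    return sum((a - mx) * (b - my) for a, b in pairs) / (n - 1)

def pairwise_complete_cov(series, min_obs=2):
    """Dict {(p, q): cov} over all pairs; each entry uses its own overlap window."""
    names = list(series)
    return {(p, q): pairwise_cov(series[p], series[q], min_obs)
            for p in names for q in names}

series = {
    "old_sleeve": [0.01, -0.01, 0.02, -0.02],
    "new_sleeve": [None, None, 0.02, -0.02],   # promoted later, shorter history
}
cov = pairwise_complete_cov(series)
print(cov[("old_sleeve", "old_sleeve")], cov[("old_sleeve", "new_sleeve")])
```

The long-history sleeve keeps all four observations for its own variance while the cross term uses only the two-day overlap. Because entries come from different windows, the assembled matrix is not guaranteed to be positive semi-definite, which is why the text pairs this with shrinkage or the factor model.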
6. Rebalancing rules
- Risk allocation (Stage 1): once daily at 15:30 ET; EWMA-smoothed; only updates that move a sleeve weight by > 2% absolute are applied.
- Symbol refresh (Stages 2–5): at each REBALANCE_SCHEDULE_ET slot, within fixed sleeve weights; trade a name only if it is outside the no-trade band and passes the cost gate.
- Newly promoted sleeve: enters at ≤ ½ of its HRP weight, ramped to full over ~20 trading days (probation), tying into the promotion gate. A demoted/quarantined sleeve has its weight set to 0 and its positions handed to the existing stale-close-out path.
- Event-driven: keep EventTriggerEngine extra rebalances, but they refresh symbol targets only; never recompute sleeve covariance intraday (avoids covariance whipsaw on a vol spike).
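Two of these rules are easy to get subtly wrong, so a sketch may help. Both functions are hypothetical. The linear ramp shape is an assumption (the rule only fixes the ½ entry point and the ~20-day horizon), and the 2% threshold is read here as "apply the whole new vector if any sleeve moves by more than 2%", which keeps the weights summing to 1; a per-sleeve reading is also possible.

```python
def probation_multiplier(days_since_promotion, ramp_days=20, entry_fraction=0.5):
    """Fraction of the HRP weight a newly promoted sleeve receives (linear ramp assumed)."""
    progress = min(max(days_since_promotion / ramp_days, 0.0), 1.0)
    return entry_fraction + (1.0 - entry_fraction) * progress

def apply_weight_update(current, proposed, min_move=0.02):
    """Adopt the proposed sleeve weights only if some sleeve moves by more than min_move."""
    symbols = set(current) | set(proposed)
    moved = any(abs(proposed.get(s, 0.0) - current.get(s, 0.0)) > min_move for s in symbols)
    return dict(proposed) if moved else dict(current)

print([probation_multiplier(d) for d in (0, 10, 20, 40)])
print(apply_weight_update({"trend": 0.50, "carry": 0.50}, {"trend": 0.51, "carry": 0.49}))
print(apply_weight_update({"trend": 0.50, "carry": 0.50}, {"trend": 0.55, "carry": 0.45}))
```

What happens to the capital a probationary sleeve does not yet receive (held as cash or redistributed to the other sleeves) is not specified above and is left open here.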
7. Mapping onto the existing codebase (reuse vs. add)
Reuse as-is:
- qgtm_portfolio/covariance.py — ledoit_wolf_shrinkage, rolling_covariance, factor_model_covariance.
- qgtm_portfolio/optimizer.py — hrp, risk_parity (ERC), inverse_vol, equal_weight already implemented and tested.
- qgtm_portfolio/ic_tracker.py — sleeve skill scores for the bounded tilt.
- qgtm_portfolio/correlation_penalty.py — optional extra redundancy discount (HRP already handles most of this structurally).
- qgtm_risk/manager.py — kill-tier ladder, compliance caps, half-Kelly sizing (with the second vol-scaling neutralized).
- qgtm_risk/pm_factor_model.py — Σ_A when N>T.
- qgtm_risk/stress.py (StressTestLab, HISTORICAL_SCENARIOS) — validation §9.
- qgtm_execution/pretrade_cost.py (PreTradeCostModel) — turnover/cost gate.
- qgtm_portfolio/signal_aggregator.py — _net_equivalent_wrappers, VehicleRouter, and DrawdownManager (upgraded to smooth).
Add (thin, low-parameter):
- StrategyReturnTracker — canonical per-sleeve realized return series (formalizes _strategy_pnl_buf).
- SleeveAllocator — Stage 1: HRP/ERC over Σ_S (shrunk), caps, bounded skill tilt, EWMA smoothing, fallback ladder HRP→ERC→inverse-vol→1/N.
- PortfolioIntegrator — the Stage 0–8 orchestrator that replaces the dual aggregator/ensemble execution path with one pipeline. It supersedes the unused PortfolioOrchestrator and the EnsembleAllocator min-variance path, and outputs the final symbol weights the daemon sizes to.
- Smooth m(DD) on DrawdownManager (or a new smooth_multiplier) applied to the executed book.
- Regime → vol-band + bounded-tilt wiring; capital-ladder config table.
Net effect on daemon._rebalance(): the two parallel weight computations (telemetry agg_result
and executed combined) collapse into a single PortfolioIntegrator.build_target_book(...) call whose
output flows to sizing. One vol target, one drawdown governor, one allocator.
8. Expected Sharpe uplift from diversification
For K sleeves each with standalone Sharpe SR and average pairwise return correlation ρ̄, an
equal-risk combination has approximately
SR_port ≈ SR · sqrt(K / (1 + (K−1)·ρ̄)).
Illustrative values (per-sleeve SR = 0.6):
| K | ρ̄ = 0.0 | ρ̄ = 0.2 | ρ̄ = 0.4 |
|---|---|---|---|
| 3 | 1.04 | 0.88 | 0.77 |
| 5 | 1.34 | 1.00 | 0.83 |
| 8 | 1.70 | 1.10 | 0.87 |
So with 5–8 genuinely diversifying sleeves (ρ̄ ≈ 0.2) the combined Sharpe lands around 1.0–1.1,
a ~1.7×–1.8× uplift over a single sleeve. Do not bank the full figure: (a) HRP/ERC are not
mean-variance optimal so they capture most, not all, of the theoretical benefit; (b) ρ̄ rises
toward 1 in crises, which drives the ratio K / (1 + (K−1)·ρ̄) back toward 1 and erases the uplift;
(c) estimation error and turnover costs subtract. A
prudent planning assumption is 0.85–1.05 combined Sharpe before costs, with the explicit
acknowledgement (DeMiguel et al.) that if HRP cannot beat 1/N-risk net of costs in walk-forward, we
default to equal-vol/1/N across sleeves.
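The sensitivity of the uplift to sleeve count and correlation is easy to explore numerically. The sketch below assumes the standard equal-Sharpe, equal-correlation approximation SR·sqrt(K / (1 + (K−1)·ρ̄)); `combined_sharpe` is a hypothetical helper.

```python
import math

def combined_sharpe(sr, k, rho_bar):
    """Sharpe of an equal-risk mix of k sleeves with common Sharpe and correlation."""
    return sr * math.sqrt(k / (1.0 + (k - 1) * rho_bar))

for k in (3, 5, 8):
    row = [round(combined_sharpe(0.6, k, rho), 2) for rho in (0.0, 0.2, 0.4)]
    print(k, row)

# Crisis case: as correlation goes to 1 the combination is no better than one sleeve.
print(combined_sharpe(0.6, 8, 1.0))
```

Adding sleeves has sharply diminishing returns once ρ̄ is above zero: as K grows, the multiplier is bounded by sqrt(1/ρ̄), so at ρ̄ = 0.2 no number of sleeves can deliver more than about 2.24× a single sleeve's Sharpe.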
9. Behavior in crises
- Vol targeting (Stage 3) cuts gross as realized vol spikes — the Moreira–Muir mechanism, and the Cederburg et al. evidence says this helps trend/momentum sleeves most (which this system runs).
- HRP (Stage 1) degrades gracefully when cross-sleeve correlations jump to ~1: it loses diversification but never blows up (no inversion), unlike min-variance.
- Smooth drawdown governor (Stage 5) tapers exposure continuously from 5% DD, hitting the 0.2 floor by 12% and cooloff at 15% — no cliff, no whipsaw re-entry.
- Regime tilt (Stage 6) lowers the vol target to the bottom of the band and tilts toward defensive sleeves under a confident crisis read.
- Kill-tier overlay is the hard backstop (FLATTEN at the 15% hard DD / daily-loss breaker).
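Net: in a vol spike the book shrinks through four distinct, mutually reinforcing channels (vol target ↓, HRP de-concentrates, DD governor ↓, regime tilt ↓), with the kill switch as the final stop. The channels are not statistically independent (the regime tilt acts through the same vol target as Stage 3), so treat them as layered defenses rather than separate guarantees.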
10. Validation plan
- Combined-portfolio walk-forward. Rolling/expanding windows; at each step estimate Σ_S on lagged data, compute HRP, hold, record OOS. Report net-of-cost Sharpe, vol (vs. target), max DD, turnover, effective-N (portfolio_optimization._effective_n). Benchmark ladder: HRP vs. ERC vs. inverse-vol vs. 1/N vs. current production path. HRP must beat 1/N-risk net of costs or we ship 1/N-risk.
- PBO on the meta-allocation. Treat the configuration family ({HRP, ERC, inverse-vol, MV}, lookbacks, shrinkage targets, vol target, skill-tilt bound) as the strategy set. Run CSCV/CPCV; require PBO < 0.5 (target < 0.2). Compute the Deflated Sharpe Ratio on the final combined track using the number of configurations tried (Bailey & López de Prado). A high PBO means the choice among methods is itself overfit → fall back to the most robust (HRP or 1/N-risk).
- Stress scenarios. Run StressTestLab.run_all (HISTORICAL_SCENARIOS: 1987 Black Monday, 2008 collapse, 2020 COVID / negative WTI, 2011 silver squeeze, 2022 energy) plus monte_carlo_stress on the assembled book; confirm worst-case loss respects the DD caps and the governor actually cuts gross.
- Parameter-sensitivity (overfitting screen). Perturb lookback, shrinkage, vol target, EWMA λ by ±50% (keeping λ strictly below 1). Acceptance criterion: net-of-cost Sharpe stays stable across the perturbed grid, with no single setting on which the result depends. High sensitivity = overfit and is itself a rejection signal; favor the flatter configuration even at slightly lower in-sample Sharpe.
- Live shadow run. Run PortfolioIntegrator in shadow alongside the current allocator for ~4–8 weeks. Compare realized vs. target vol, turnover, gross, and ex-ante/ex-post tracking before cutover.
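For the PBO gate, a compact CSCV sketch shows what is being measured. It follows the Bailey et al. recipe in outline (split the history into blocks, pick the in-sample best configuration on every half of the blocks, and record where it ranks out-of-sample), but it is a simplified illustration with hypothetical function names, a per-period (non-annualized) Sharpe, and no purging or embargo.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def sharpe(xs):
    """Per-period Sharpe; returns 0 for a constant series."""
    sd = pstdev(xs)
    return mean(xs) / sd if sd > 0 else 0.0

def pbo_cscv(returns, n_blocks=8):
    """returns: one equal-length return series per configuration. n_blocks must be even."""
    size = len(returns[0]) // n_blocks
    blocks = [range(b * size, (b + 1) * size) for b in range(n_blocks)]
    logits = []
    for train in combinations(range(n_blocks), n_blocks // 2):
        test = [b for b in range(n_blocks) if b not in train]
        is_idx = [i for b in train for i in blocks[b]]
        oos_idx = [i for b in test for i in blocks[b]]
        is_sr = [sharpe([r[i] for i in is_idx]) for r in returns]
        oos_sr = [sharpe([r[i] for i in oos_idx]) for r in returns]
        best = max(range(len(returns)), key=is_sr.__getitem__)
        rank = sum(1 for x in oos_sr if x <= oos_sr[best])   # 1 = worst, N = best
        omega = rank / (len(returns) + 1)                    # relative OOS rank in (0, 1)
        logits.append(math.log(omega / (1.0 - omega)))
    # PBO = share of splits where the in-sample winner lands at or below the OOS median.
    return sum(1 for x in logits if x <= 0.0) / len(logits)

configs = [
    [0.02, 0.01] * 8,     # consistently best configuration
    [0.01, -0.01] * 8,
    [0.005, -0.02] * 8,
]
print(pbo_cscv(configs, n_blocks=4))
```

In this toy input one configuration dominates in every block, so the in-sample winner is always the out-of-sample winner and the estimate is 0. On real configuration families the interesting case is the opposite one, where the winner changes from split to split.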
11. Where overfitting / estimation error actually bites (explicit risk register)
- Sleeve covariance instability with few sleeves and short history → mitigated by shrinkage + HRP + EWMA smoothing + ERC cross-check. Do not add μ estimates at the cross-sleeve level.
- Vol-targeting look-ahead → use only lagged realized vol; expect the benefit mostly on trend sleeves (Cederburg et al.).
- Skill-tilt feedback (chasing recently hot sleeves) → bounded ±25%, off during warmup, derived from rank-IC not raw P&L.
- Regime overfitting → bounded tilts, confidence-gated, never a hard method switch.
- Method selection overfitting → PBO/DSR gate; default to HRP/1/N-risk if PBO is high.
- Turnover/cost drag → no-trade band, cadence split, cost gate, EWMA smoothing.
Guiding principle: favor robust, low-parameter methods over fragile optimization. Every added knob must pay for itself against the 1/N-risk benchmark net of costs and after the PBO penalty.
12. References
- Antonov, A., Lipton, A., & López de Prado, M. (2024). Analytical weight-noise bounds for HRP vs. minimum-variance.
- Bailey, D. H., Borwein, J., López de Prado, M., & Zhu, Q. J. (2015). The Probability of Backtest Overfitting. Journal of Computational Finance.
- Bailey, D. H., & López de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management 40(5):94–107.
- Cederburg, S., O'Doherty, M., Wang, F., & Yang, X. (2020). On the performance of volatility-managed portfolios. Journal of Financial Economics.
- Choueifaty, Y., & Coignard, Y. (2008). Toward Maximum Diversification. Journal of Portfolio Management.
- DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal Versus Naive Diversification: How Inefficient is the 1/N Portfolio Strategy? Review of Financial Studies 22(5):1915–1953.
- Ledoit, O., & Wolf, M. (2004). Honey, I Shrunk the Sample Covariance Matrix. Journal of Portfolio Management 30(4):110–119.
- López de Prado, M. (2016). Building Diversified Portfolios that Outperform Out-of-Sample. Journal of Portfolio Management 42(4):59–69. DOI 10.3905/jpm.2016.42.4.059.
- López de Prado, M. (2019). A Robust Estimator of the Efficient Frontier (NCO). SSRN 3469961; Machine Learning for Asset Managers (2020), Cambridge University Press.
- MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2010). Good and bad properties of the Kelly and fractional Kelly criteria. Quantitative Finance 10(7):681–687.
- MacLean, L. C., Ziemba, W. T., & Blazenko, G. (1992). Growth Versus Security in Dynamic Investment Analysis. Management Science 38(11):1562–1585.
- Maillard, S., Roncalli, T., & Teiletche, J. (2010). The Properties of Equally-Weighted Risk Contribution Portfolios. Journal of Portfolio Management 36(4):60–70.
- Moreira, A., & Muir, T. (2017). Volatility-Managed Portfolios. Journal of Finance 72(4):1611–1644. DOI 10.1111/jofi.12513.
- Rockafellar, R. T., & Uryasev, S. (2000). Optimization of Conditional Value-at-Risk. Journal of Risk.