05 — Portfolio Integration: Ensemble Construction + Risk Layer

Status: Research + Spec (no production code in this document)
Scope: The layer that combines validated strategy sleeves (trend, carry, vol-premium, macro/regime, stat-arb/pairs) into one diversified, risk-targeted, drawdown-controlled portfolio.
Owner system: qgtm_portfolio/, qgtm_risk/, qgtm_live/daemon.py, qgtm_execution/
Assumes: A separate promotion gate (built by another agent) decides which sleeves receive capital. This spec decides how much each promoted sleeve gets, and how the book is risk-managed.

0. TL;DR — Recommendation

Architecture (the one-paragraph version). Stop combining signals and start combining portfolios. Run a two-stage allocator: (Stage 1) normalize and volatility-target each sleeve so every promoted sleeve speaks the same risk language; (Stage 2) allocate capital across sleeves with Hierarchical Risk Parity (HRP) on a Ledoit–Wolf-shrunk covariance of sleeve returns, with Equal Risk Contribution (ERC) as the robustness cross-check. Assemble and net at the symbol level, apply one portfolio volatility target, enforce gross/net/concentration caps, then pass the book through a smooth drawdown governor that de-risks continuously (replacing today's 1.0 → 0.5 → 0.0 cliff) with the discrete kill-tier ladder kept as a safety overlay. Regime is a bounded tilt on the vol target and sleeve weights, not a hard method switch. Turnover is controlled with EWMA-smoothed sleeve weights, a no-trade band, cross-sleeve netting on the execution path, and a slow cadence for the risk allocation decoupled from fast signal refresh.

Why HRP/ERC and not mean-variance / min-variance: with ~5–12 sleeves and ~1–2 years of daily data, we are squarely in the regime where convex optimizers amplify estimation error and lose to naïve diversification out-of-sample (DeMiguel, Garlappi & Uppal 2009; Ledoit & Wolf 2004; López de Prado 2016). HRP needs no matrix inversion, survives a singular/near-duplicate covariance, and clusters correlated sleeves so they don't double-count. The real risk in this layer is estimation error and overfitting, not insufficient optimization.

First production configuration (robust, low-parameter):

Knob	First value	Rationale
Cross-sleeve method	HRP (fallback ladder HRP → ERC → inverse-vol → 1/N)	Robust at N≈T, no inversion
Sleeve covariance	Ledoit–Wolf shrinkage, 252-day window, EWMA half-life 63d	Stable Σ; reuses existing estimator
Per-sleeve vol target	10% annualized	Equal-risk sleeves
Portfolio vol target	10% annualized (regime band 8–12%)	One vol-target stage only
Leverage bounds	L ∈ [0.3, 2.0]	Conservative vs. current 3.0
Gross / net / single-name caps	2.0× / 1.0× / 15% of equity	Current `RiskLimits` defaults, gross lowered from 3.0
Skill tilt	OFF for first ~40 trading days, then bounded ±25%	Don't let noisy IC drive the base
Drawdown governor	smooth taper from DD 5% → floor 0.2 at 12% → cooloff at 15%	Continuous de-risk
Meta-allocation cadence	daily @ 15:30 ET, EWMA-smoothed (λ=0.8); intraday only refreshes symbol targets	Risk weights shouldn't whipsaw intraday
No-trade band	0.75% of equity per name	Turnover control
Estimation hygiene	weights at t use data ≤ t-1; purge same-day; no truncate-to-shortest	No look-ahead

Expected diversification uplift: roughly 1.6×–2.2× a single sleeve's Sharpe under realistic sleeve counts/correlations — but bank only a fraction of the theoretical figure because correlations rise toward 1 in crises (see §8–§9).

1. How allocation works today (current-state audit)

The live decision path is TradingDaemon._rebalance() in qgtm_live/daemon.py. Tracing it end to end, signals become positions like this:

Per-strategy signal generation. Each of the ~52 PM strategies emits Signal objects with a signed weight ∈ [-1, 1] (daemon._rebalance, signal loop ~L3033–3150).
Signal aggregation (telemetry path). self.signal_aggregator.aggregate(strategy_signals) (daemon L3236) runs SignalAggregator in qgtm_portfolio/signal_aggregator.py: per-strategy weight = inverse_vol × skill_score × capacity_headroom × meta_label_confidence, nets equivalent ETF wrappers (_net_equivalent_wrappers, e.g. GLD/IAU/SGOL), clips each symbol to ±40%, normalizes to target_gross_exposure = 2.0. This result (agg_result) is stored for telemetry (_last_agg_result) but is not what actually sizes orders.
Drawdown overlay (applied to the telemetry vector). self.drawdown_manager.apply(...) (daemon L3298) scales agg_result by a stepwise multiplier (1.0 / 0.5 / 0.0 per DrawdownManager._compute_multiplier).
Kill-switch coordination. self.risk.sync_drawdown_state(...) (daemon L3320) maps drawdown onto RiskManager kill tiers (WARN/THROTTLE/NO_NEW/FLATTEN) in qgtm_risk/manager.py.
Ensemble allocation (execution path). combined = self.allocator.allocate_regime_aware(...) (daemon L3353). EnsembleAllocator (qgtm_portfolio/ensemble_allocator.py) wraps the legacy PortfolioAllocator (qgtm_portfolio/allocator.py) and:
shrinks each strategy's signals by IC skill (ic_tracker.skill_score) × correlation discount (correlation_penalty.correlation_discount);
combines signals weighted by per-strategy weight (default equal weight / 1/N), blends with regime sector weights by confidence, applies adaptive_leverage (0.5×–2.0×);
replaces signal magnitudes with min_variance optimizer output over the symbol covariance (rolling 60-day Ledoit–Wolf via covariance.rolling_covariance), then scales the whole vector to target_vol = 0.15 with leverage bounded [0.25, 3.0].
Hygiene. Drop operator-killed symbols, synthesize close-outs for stale/dust positions (daemon L3385–3453).
Per-signal sizing → execution. For each combined signal, RiskManager.plan_order (qgtm_risk/manager.py L333) sizes notional as signal.weight × (vol_target / realized_vol) × kelly_fraction(0.5) × throttle_factor, clipped to the per-symbol cap and a 3.0× gross-of-equity ceiling, then a pre-trade compliance check and OMS submission.

1.1 Weaknesses (this is the case for the redesign)

W1 — Volatility is targeted twice. EnsembleAllocator scales to 15% ex-ante vol, then RiskManager.plan_order independently re-scales by vol_target/realized_vol × half-Kelly. Two compounding vol controls mean the realized portfolio vol target is not well defined.
W2 — The drawdown governor de-risks the wrong vector. drawdown_manager.apply() scales agg_result, which is not executed. The executed combined book only feels drawdown through the discrete kill-tier throttle_factor (1.0 → 0.5 → 0.0). So the drawdown governor is effectively decorative on the live path, and the real response is a cliff. The mandate explicitly wants smooth de-risking.
W3 — Cross-strategy allocation is 1/N, not risk-based. PortfolioAllocator(method="equal_risk") returns equal weights because strategy_performance is never passed in (allocator._compute_weights). A 30%-vol stat-arb sleeve and a 6%-vol carry sleeve get equal nominal weight, so portfolio risk is dominated by the most volatile sleeves. Equal nominal weight ≠ equal risk.
W4 — It combines signals, not portfolios. Strategy signals are summed at the symbol level and a single optimizer runs over the combined symbol covariance. There is no allocation across sleeves based on the sleeve-return covariance, so cross-sleeve diversification is implicit and uncontrolled, and correlated sleeves (e.g. two trend variants) double-count.
W5 — min_variance is the fragile choice. Minimum-variance over a 60-day shrunk covariance on a ~40–50-name universe (N≈T) is the exact estimation-error-amplifying setup HRP/ERC were designed to avoid. It corner-concentrates into the lowest-vol names (subject to the 40% cap), which reduces diversification.
W6 — Estimation hygiene gaps. The covariance provider truncates every series to the shortest common history (make_covariance_provider), silently discarding data and biasing toward whichever sleeve/asset has least history; 60 obs is short for the universe size; realized_vol for sizing is read from signal.metadata["realized_vol"] and is frequently 0 (→ plan_order returns 0 → trade silently dropped). No explicit point-in-time contract is documented.
W7 — The good regime machinery is dead code on the live path. RegimeAllocator._REGIME_OPT_MAP (risk_on→risk_parity, crisis→cash+hedge, transition→HRP) and the VolTargeter regime vol bands live inside PortfolioOrchestrator, which the daemon never calls. The live regime overlay is the cruder allocate_regime_aware sector-weight blend.
W8 — Turnover control is downstream-only. Equivalent-wrapper netting lives in the unused aggregator path. The live path cancels all open orders and re-solves min_variance every rebalance (up to 4×/day in day-trading mode). There is a cost gate (PreTradeCostModel) and dust/stale close-out, but no no-trade band, turnover penalty, or weight-change threshold in the allocator itself.
W9 — No capital ladder. target_gross_exposure (1.0→2.0) and max_leverage (2.0→3.0) were hand-bumped "to use capital". VehicleRouter (which would pick ETF vs. futures by account size) is off the live path.
W10 — Sizing blurs conviction and risk. Taking signs from the legacy allocator and magnitudes from min_variance discards each sleeve's within-sleeve conviction sizing.

Recurring theme: a lot of correct machinery already exists (HRP, ERC, Ledoit–Wolf, IC tracker, correlation penalty, kill tiers, stress lab) — it is either not wired into the executed path or wired in a fragile configuration (min-variance over symbols, 1/N over sleeves, double vol-target, cliff drawdown). The redesign is mostly re-wiring and consolidating into one coherent pipeline, not green-field building.

2. Research foundation (grounding + citations)

Method	Takeaway used here	Source
1/N benchmark / estimation error	Across 14 optimizing models and 7 datasets, none is consistently better than naïve 1/N out-of-sample on Sharpe, CEQ, or turnover; the estimation window needed for sample MV to beat 1/N is ~3,000 months (25 assets) / ~6,000 months (50 assets). Any optimizer must clear the 1/N bar net of costs to justify itself.	DeMiguel, Garlappi & Uppal (2009), Review of Financial Studies 22(5):1915–1953
Covariance shrinkage	"Nobody should be using the sample covariance matrix for portfolio optimization." Use a convex blend `δF + (1−δ)S`; data-driven δ; PSD even when T≈N.	Ledoit & Wolf (2004), J. Portfolio Management 30(4):110–119
HRP	Cluster → quasi-diagonalize → recursive bisection. No matrix inversion (works on singular Σ), lower out-of-sample variance than the Critical Line Algorithm even though min-variance is CLA's objective; advantage grows with N/T.	López de Prado (2016), J. Portfolio Management 42(4):59–69, DOI 10.3905/jpm.2016.42.4.059; weight-noise proof in Antonov, Lipton & López de Prado (2024)
ERC / risk budgeting	Equalize each component's total risk contribution; `σ_minvar ≤ σ_ERC ≤ σ_1/N`; "minimum variance subject to a diversification constraint"; robust to covariance estimation error. Caveat: more turnover than 1/N, computationally heavier.	Maillard, Roncalli & Teiletche (2010), J. Portfolio Management 36(4):60–70
Volatility targeting	Scaling exposure inversely to recent variance raised Sharpe ~25% for the market factor and produced positive alpha across value/momentum/carry.	Moreira & Muir (2017), Journal of Finance 72(4):1611–1644, DOI 10.1111/jofi.12513
Vol-targeting caveat	Out-of-sample, vol management does not systematically beat unmanaged portfolios; gains concentrate in momentum/trend; the naïve scaling coefficient suffers look-ahead bias. → Use lagged realized vol, longer half-lives, and expect the benefit mostly on the trend/momentum sleeves.	Cederburg, O'Doherty, Wang & Yang (2020), J. Financial Economics; Liu, Tang & Zhou (2019)
Fractional Kelly	Full Kelly maximizes log-growth but is extremely sensitive to mean estimates; half-Kelly trades a small amount of growth for a large reduction in drawdown/ruin risk; "security is gained by reducing the Kelly fraction" under estimation error. → Keep the existing half-Kelly and apply it once.	MacLean, Thorp & Ziemba (2010), Quantitative Finance 10(7):681–687; MacLean, Ziemba & Blazenko (1992), Management Science 38(11)
Combining strategies (ensembling)	Treat each validated strategy as an investable asset and allocate across them with risk parity/ERC; a portfolio-level clipping layer is required on top of independent sleeve sizing; "combine portfolios" preserves sleeve diversification whereas "combine signals" trades diversity for signal quality.	Practitioner consensus (delphicalpha "7 Layers of Ensembling"; QuantInsti multi-strategy; Quant.SE)
NCO (documented upgrade)	Cluster → intra-cluster allocation → inter-cluster allocation → dot product; ~55% RMSE reduction vs. Markowitz for the max-Sharpe portfolio; agnostic to the inner optimizer.	López de Prado (2019), "A Robust Estimator of the Efficient Frontier", SSRN 3469961; Machine Learning for Asset Managers (2020)
Overfitting controls	PBO via combinatorially symmetric / purged cross-validation: fraction of train/test splits where the in-sample winner is not the out-of-sample winner; PBO > 0.5 ⇒ the selection process overfits. Deflated Sharpe Ratio corrects an observed Sharpe for the number of trials + non-normality.	Bailey, Borwein, López de Prado & Zhu (2015), J. Computational Finance; Bailey & López de Prado (2014), J. Portfolio Management 40(5):94–107

Synthesis that drives the design. Returns are nearly unforecastable and covariance is noisy at N≈T; therefore the robust play is to (a) drop expected-return inputs at the cross-sleeve level, (b) use a shrunk covariance, (c) prefer a hierarchical/risk-budgeting allocator that needs no inversion, (d) target volatility with lagged estimates, (e) size with half-Kelly once, and (f) benchmark everything against 1/N and gate it with PBO/DSR. This is a deliberately low-parameter design: fewer knobs ⇒ less overfitting surface.

3. Target architecture — the Portfolio Integration Layer (PIL)

 promoted sleeves (trend, carry, vol-prem, macro, stat-arb …)
        │  each emits a signed symbol-weight vector  w_s  +  a realized return series r_s
        ▼
 ┌──────────────────────────────────────────────────────────────────────────┐
 │ STAGE 0  Sleeve isolation & normalization                                  │
 │   • normalize each w_s to unit gross (‖w_s‖₁ = 1)                           │
 │   • per-sleeve vol target: scale w_s to σ_sleeve (10%) using lagged σ̂_s    │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 1  Cross-sleeve meta-allocation  ←★ the core                          │
 │   • Σ_S  = LedoitWolf( sleeve-return matrix )      (K×K, K = #sleeves)      │
 │   • a    = HRP(Σ_S)   [ERC cross-check; fallback inverse-vol → 1/N]         │
 │   • bounded skill tilt from ic_tracker; per-sleeve cap; EWMA-smooth a       │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 2  Assemble & net                                                     │
 │   • W = Σ_s a_s · (vol-scaled w_s);  net at symbol level                    │
 │   • net equivalent wrappers (GLD/IAU/SGOL) ON THE EXEC PATH                 │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 3  ONE portfolio vol target                                          │
 │   • L = clip(σ_target,port / sqrt(WᵀΣ_assetsW), L_min, L_max);  W ← L·W     │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 4  Exposure caps (gross / net / per-name / per-cluster)              │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 5  Smooth drawdown governor + kill-tier safety overlay               │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 6  Regime overlay = bounded tilt on vol target + sleeve weights      │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 7  Turnover control (no-trade band, EWMA, cost gate, cadence split)  │
 ├──────────────────────────────────────────────────────────────────────────┤
 │ STAGE 8  Capital ladder (account-size → gross cap, vehicle, min ticket)    │
 └──────────────────────────────────────────────────────────────────────────┘
        ▼
 final target book → RiskManager.plan_order (half-Kelly, ONCE) → compliance → OMS

Stage 0 — Sleeve isolation & normalization

Each promoted sleeve s is treated as a self-contained mini-portfolio. Two transforms put all sleeves on a common footing:
- Normalize w_s ← w_s / ‖w_s‖₁ so a sleeve that emits weight=1.0 on everything cannot dominate a sleeve that emits 0.2; conviction becomes relative within the sleeve.
- Per-sleeve vol target: scale w_s by σ_sleeve / σ̂_s, where σ̂_s is the sleeve's own lagged realized return volatility (EWMA, half-life ~21–63d, floored).

After this step every sleeve has the same ex-ante standalone vol (≈10%). This is the "combine portfolios at equal risk" precondition.
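The two Stage 0 transforms can be sketched in a few lines. This is a minimal illustration rather than the project's implementation: the helper names `ewma_vol` and `stage0_scale`, the 42-day half-life (inside the 21–63d range above), the zero-mean variance estimate, and the 2% vol floor are all assumptions made for the example.

```python
import numpy as np

def ewma_vol(returns, half_life=42, periods_per_year=252):
    """Annualized EWMA volatility of a lagged daily return series (zero-mean assumption)."""
    r = np.asarray(returns, dtype=float)
    decay = 0.5 ** (1.0 / half_life)
    # Newest observation (last element) receives the largest weight.
    w = decay ** np.arange(len(r) - 1, -1, -1)
    var = np.sum(w * r**2) / np.sum(w)
    return float(np.sqrt(var * periods_per_year))

def stage0_scale(w_s, lagged_returns, sigma_sleeve=0.10, sigma_floor=0.02):
    """Normalize a sleeve to unit gross, then scale it to the per-sleeve vol target."""
    w = np.asarray(w_s, dtype=float)
    gross = np.abs(w).sum()
    if gross == 0.0:
        return w  # a flat sleeve stays flat
    w = w / gross  # L1 normalization: conviction becomes relative within the sleeve
    # lagged_returns must be the normalized sleeve's returns through t-1 only.
    sigma_hat = max(ewma_vol(lagged_returns), sigma_floor)  # hypothetical floor
    return w * (sigma_sleeve / sigma_hat)

# A sleeve whose normalized book has realized roughly 16% annualized vol is scaled down.
history = [0.01, -0.01] * 50
scaled = stage0_scale([2.0, -1.0, 1.0], history)
```

The floor matters in practice: without it, a sleeve with a quiet recent history would be levered up without bound.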

Stage 1 — Cross-sleeve meta-allocation (the core change)

Estimate the sleeve-return covariance Σ_S (K×K) from the per-sleeve realized P&L series with Ledoit–Wolf shrinkage (reuse covariance.ledoit_wolf_shrinkage). Allocate capital across sleeves with HRP (reuse optimizer.hrp):
1. distance d_ij = sqrt(0.5(1−ρ_ij)) from the sleeve correlation;
2. hierarchical linkage + quasi-diagonalization;
3. recursive bisection allocating inversely to each cluster's variance.

HRP is the production default because it (a) needs no inversion and survives a singular Σ_S (very likely when two promoted sleeves are near-duplicates), (b) clusters correlated sleeves and splits risk hierarchically, which is the structural fix for double-counting (W4), and (c) has lower weight-estimation noise than minimum-variance (Antonov et al. 2024). ERC (optimizer.risk_parity) runs in parallel as a cross-check: a large HRP-vs-ERC weight divergence is a useful instability alarm.
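For orientation, the three HRP steps can be written compactly with NumPy and SciPy. This is an illustrative sketch and not the existing optimizer.hrp: the function name `hrp_weights` is hypothetical, single linkage is an assumed choice, and the linkage is run directly on the correlation distance, which simplifies the original paper's distance-of-distances step.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform

def _cluster_var(cov, idx):
    """Variance of a cluster under intra-cluster inverse-variance weights."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)

def hrp_weights(cov):
    """Hierarchical Risk Parity weights from a (possibly singular) covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    if n == 1:
        return np.ones(1)
    std = np.sqrt(np.diag(cov))
    corr = np.clip(cov / np.outer(std, std), -1.0, 1.0)
    # Step 1: correlation distance d_ij = sqrt(0.5 * (1 - rho_ij)).
    dist = np.sqrt(0.5 * (1.0 - corr))
    np.fill_diagonal(dist, 0.0)
    # Step 2: hierarchical linkage, then the quasi-diagonal leaf order.
    order = leaves_list(linkage(squareform(dist, checks=False), method="single"))
    # Step 3: recursive bisection, splitting risk inversely to cluster variance.
    weights = np.ones(n)
    clusters = [order.tolist()]
    while clusters:
        cluster = clusters.pop()
        if len(cluster) < 2:
            continue
        mid = len(cluster) // 2
        left, right = cluster[:mid], cluster[mid:]
        v_l, v_r = _cluster_var(cov, left), _cluster_var(cov, right)
        alpha = 1.0 - v_l / (v_l + v_r)
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        clusters += [left, right]
    return weights / weights.sum()

# Two near-duplicate sleeves make the covariance singular; no inversion is needed.
singular = np.array([[0.01, 0.01, 0.0], [0.01, 0.01, 0.0], [0.0, 0.0, 0.01]])
a_hrp = hrp_weights(singular)
```

Only the diagonal of each sub-matrix is inverted, which is why a rank-deficient Σ_S does not break the allocation.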

Then apply, in order:
- a per-sleeve cap (e.g. 25–35%) and renormalize;
- a bounded skill tilt: a_s ← a_s · clip(skill_mult_s, 1−κ, 1+κ) where skill_mult_s derives from ic_tracker.skill_score(s) ∈ [0,1] and κ ≈ 0.25; renormalize. This is the gentle "combine signals" benefit layered on the robust "combine portfolios" base, bounded so a noisy IC can never swing the base allocation;
- EWMA smoothing of the final sleeve weights: a_t = λ·a_{t-1} + (1−λ)·a_target, λ≈0.8, to damp turnover from week-to-week covariance noise.
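The three post-processing steps might look like the following sketch. The function names are hypothetical, the mapping skill_mult = 2·skill is taken from the Stage 1 math in section 4, and the iterative redistribution inside the cap is one possible reading of "cap and renormalize" that keeps the cap binding after renormalization.

```python
import numpy as np

def cap_weights(a, a_max):
    """Cap each sleeve weight at a_max and redistribute the excess pro rata."""
    a = np.asarray(a, dtype=float)
    a = a / a.sum()
    if a_max * a.size < 1.0:
        raise ValueError("cap too tight: a_max * K must be >= 1")
    capped = np.zeros(a.size, dtype=bool)
    while True:
        over = (a > a_max + 1e-12) & ~capped
        if not over.any():
            return a
        capped |= over
        a[capped] = a_max
        free = ~capped
        if free.any() and a[free].sum() > 0.0:
            # Whatever is left after the capped sleeves is shared by the rest.
            a[free] *= (1.0 - a_max * capped.sum()) / a[free].sum()

def skill_tilt(a, skill, kappa=0.25):
    """Bounded multiplicative tilt; skill in [0, 1], 0.5 is neutral."""
    mult = np.clip(2.0 * np.asarray(skill, dtype=float), 1.0 - kappa, 1.0 + kappa)
    tilted = np.asarray(a, dtype=float) * mult
    return tilted / tilted.sum()

def ewma_smooth(a_prev, a_target, lam=0.8):
    """a_t = lam * a_{t-1} + (1 - lam) * a_target."""
    return lam * np.asarray(a_prev, dtype=float) + (1.0 - lam) * np.asarray(a_target, dtype=float)

capped = cap_weights([0.6, 0.3, 0.1], a_max=0.35)
tilted = skill_tilt([0.5, 0.5], skill=[1.0, 0.0])
smoothed = ewma_smooth([0.5, 0.5], tilted)
```

Because the tilt multiplier is clipped to [1−κ, 1+κ], even a skill score of 0 or 1 moves a sleeve's pre-normalization weight by at most 25%.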

Documented upgrade path: swap Stage 1 for NCO (cluster sleeves with the existing portfolio_optimization.compute_correlation_clusters, ERC within clusters, ERC across clusters). Keep it out of the first config to minimize parameters.

Stage 2 — Assemble & net

W = Σ_s a_s · (vol-scaled w_s), summed at the symbol level so offsetting long/short exposures across sleeves cancel before sizing. Run _net_equivalent_wrappers (GLD/IAU/SGOL → primary) on the execution path (fixes W8). Output: the pre-risk target book.

Stage 3 — One portfolio volatility target

Compute ex-ante portfolio vol σ_ex = sqrt(Wᵀ Σ_assets W) with Σ_assets from covariance.rolling_covariance (or factor_model_covariance via pm_factor_model when N>T). Scale W ← L·W, L = clip(σ_target,port / σ_ex, L_min, L_max). Portfolio-level vol targeting happens exactly here and nowhere downstream; the Stage 0 per-sleeve scaling only equalizes sleeve risk before allocation. Neutralize the second vol scaling inside RiskManager.plan_order (treat W as the final target weight; plan_order keeps only the half-Kelly haircut and the caps). Fixes W1.
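As a small worked example of the Stage 3 scalar, the sketch below computes L for a two-asset book. The function name `portfolio_leverage` is hypothetical, the defaults come from the first production configuration in section 0, and returning L_min when the ex-ante vol estimate is zero is an assumed conservative choice.

```python
import numpy as np

def portfolio_leverage(W, cov_assets, sigma_target=0.10, l_min=0.3, l_max=2.0):
    """L = clip(sigma_target / sqrt(W' Sigma W), l_min, l_max)."""
    W = np.asarray(W, dtype=float)
    cov = np.asarray(cov_assets, dtype=float)
    variance = max(float(W @ cov @ W), 0.0)  # guard against tiny negative round-off
    if variance == 0.0:
        return l_min  # assumption: do not lever up on a degenerate vol estimate
    return float(np.clip(sigma_target / np.sqrt(variance), l_min, l_max))

# Two uncorrelated 20%-vol assets at 50/50 have about 14.1% ex-ante vol, so L is about 0.71.
W = np.array([0.5, 0.5])
L = portfolio_leverage(W, np.diag([0.04, 0.04]))
W_targeted = L * W
```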

Stage 4 — Exposure caps

Gross Σ|W| ≤ G_max, net |ΣW| ≤ N_max, per-name ≤ c_name, per-cluster ≤ c_cluster. Reuse RiskLimits + PreTradeCompliance. Apply the gross cap after vol targeting: W ← W · min(1, G_max / Σ|W|).
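A minimal sketch of the per-name and gross caps follows, assuming the first-configuration limits (15% per name, 2.0× gross, 1.0× net). The function name `apply_caps` is hypothetical; the net cap is only checked and reported here, because the trimming rule for net breaches is a separate design decision, and the per-cluster cap is omitted.

```python
import numpy as np

def apply_caps(W, c_name=0.15, g_max=2.0, n_max=1.0):
    """Clip each name, scale gross down to g_max if needed, and report the net check."""
    W = np.clip(np.asarray(W, dtype=float), -c_name, c_name)
    gross = np.abs(W).sum()
    if gross > g_max:
        W = W * (g_max / gross)  # W <- W * min(1, G_max / sum|W|)
    net_ok = bool(abs(W.sum()) <= n_max + 1e-12)
    return W, net_ok

small_book, small_ok = apply_caps([0.5, -0.2, 0.1])
long_book, long_ok = apply_caps(np.full(20, 0.15))
```

In the second call, twenty 15% longs sum to 3.0× gross, so the book is scaled to 2.0× and the net check fails, which is the case that would need trimming.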

Stage 5 — Smooth drawdown governor (+ kill-tier safety overlay)

Replace the stepwise multiplier with a continuous function m(DD) applied to the final executed book (fixes W2):

            1                                  for DD ≤ d0
 m(DD) =    1 − (1 − m_min)·(DD − d0)/(dH − d0) for d0 < DD < dH
            m_min                               at DD ≥ dH (then 0 during cooloff)

with d0=5%, dH=12%, m_min=0.2, cooloff/flat at the 15% hard cap. Measure DD in vol-adjusted units (DD / σ_target,port) so the same fractional drawdown triggers consistently across vol regimes. The discrete kill tiers in RiskManager remain as a safety overlay that can only further reduce exposure, never raise it (belt and suspenders). sync_drawdown_state is the single coordination point so the two governors never disagree.
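The governor is simple enough to state as a pure function. The sketch below uses the parameter values quoted above; the name `drawdown_multiplier` and the argument `d_hard` are hypothetical, and drawdown is passed as a plain positive fraction, so any vol-adjusted rescaling is assumed to happen before the call.

```python
def drawdown_multiplier(dd, d0=0.05, d_h=0.12, m_min=0.2, d_hard=0.15):
    """Continuous exposure multiplier m(DD); dd is a positive fraction (0.08 = 8%)."""
    if dd <= d0:
        return 1.0
    if dd >= d_hard:
        return 0.0  # cooloff / flat at the hard cap
    if dd >= d_h:
        return m_min
    # Linear taper from 1.0 at d0 down to m_min at d_h.
    return 1.0 - (1.0 - m_min) * (dd - d0) / (d_h - d0)

# Halfway through the taper (8.5% drawdown) the book runs at about 60% of target.
halfway = drawdown_multiplier(0.085)
```

The function is continuous at d0 and d_h, so a small change in drawdown never produces a large change in exposure until the hard cap is reached.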

Stage 6 — Regime overlay (bounded tilt, not a switch)

Wire the existing regime knowledge as continuous tilts, gated by regime confidence:
- scale the portfolio vol target within a band using _REGIME_VOL_TARGETS (risk_on → top of band, crisis → bottom);
- apply a small bounded multiplicative tilt to sleeve weights (e.g. crisis: defensive sleeves ×1.2, cyclical ×0.8, capped) and/or nudge the HRP↔ERC blend.

Avoid hard method switching (risk_parity ↔ min_variance ↔ cash): discrete switches add path-dependence and a large overfitting surface, and regime detection is itself noisy. Tilts are bounded and confidence-weighted. This both uses the dormant regime machinery (W7) and keeps it robust.
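One way to express "bounded and confidence-weighted" is sketched below. Everything here is illustrative: the function names, the signed `regime_score` in [-1, 1] (crisis to risk_on), and the linear interpolation are assumptions, with the 8–12% band taken from the first production configuration.

```python
def regime_vol_target(regime_score, confidence, base=0.10, band=(0.08, 0.12)):
    """Move the vol target inside the band, proportionally to score * confidence."""
    lo, hi = band
    score = max(-1.0, min(1.0, regime_score)) * max(0.0, min(1.0, confidence))
    if score >= 0.0:
        return base + score * (hi - base)
    return base + score * (base - lo)

def regime_tilt(a, raw_tilt, confidence, max_tilt=0.2):
    """Apply capped, confidence-weighted multipliers to sleeve weights, then renormalize."""
    tilted = {}
    for sleeve, weight in a.items():
        mult = max(1.0 - max_tilt, min(1.0 + max_tilt, raw_tilt.get(sleeve, 1.0)))
        # confidence = 0 leaves the weight untouched; confidence = 1 applies the full tilt.
        tilted[sleeve] = weight * (1.0 + confidence * (mult - 1.0))
    total = sum(tilted.values())
    return {sleeve: weight / total for sleeve, weight in tilted.items()}

crisis_target = regime_vol_target(regime_score=-1.0, confidence=1.0)
half_confident = regime_tilt({"trend": 0.5, "carry": 0.5}, {"trend": 1.2, "carry": 0.8}, 0.5)
```

With zero confidence both functions return their inputs unchanged, which is the property that distinguishes a tilt from a hard switch.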

Stage 7 — Turnover & rebalancing control

No-trade band: only trade name i if |W_target,i − W_current,i| > τ (≈0.75% of equity).
Cadence split: recompute the full meta-allocation (Stage 1) slowly — daily at 15:30 ET (or weekly). Intraday slots (REBALANCE_SCHEDULE_ET = 09:35 / 12:00 / 14:00 / 15:30) only refresh symbol-level targets within fixed sleeve weights. Risk allocation must not whipsaw intraday.
Cost-aware: net offsets (Stage 2), use PreTradeCostModel to skip trades whose expected edge < expected cost, EWMA-smooth sleeve weights (Stage 1).
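The no-trade band reduces to a filter over weight differences. A minimal sketch, assuming weights are expressed as fractions of equity and using a hypothetical function name:

```python
def trades_outside_band(target, current, tau=0.0075):
    """Return the weight changes whose absolute size exceeds the no-trade band tau."""
    trades = {}
    for name in sorted(set(target) | set(current)):
        delta = target.get(name, 0.0) - current.get(name, 0.0)
        if abs(delta) > tau:
            trades[name] = delta
    return trades

target = {"SPY": 0.10, "GLD": 0.052, "TLT": 0.0}
current = {"SPY": 0.08, "GLD": 0.050, "TLT": 0.03}
# GLD moves by only 0.2% of equity, so it is left alone; SPY and TLT trade.
orders = trades_outside_band(target, current)
```

Names missing from the target are treated as a target of zero, so a position that drops out of the book is closed once it is larger than the band.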

Stage 8 — Capital ladder

Account-size tiers control gross cap, vehicle selection, min ticket / dust thresholds, and per-name caps. Wire VehicleRouter onto the exec path (fixes W9):

Tier (equity)	Gross cap	Vehicles	Per-name cap	Notes
< $100k	1.0×	ETFs only	20%	dust threshold high; few names
$100k–$1M	1.5×	ETFs (+ liquid futures where min-size allows)	15%	current operating range
$1M–$10M	2.0×	ETF + futures (margin-efficient)	12%	migrate to futures via `VehicleRouter`
> $10M	up to firm cap (≤3.0×)	futures-first + options overlays	10%	capacity-aware sleeve caps
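The ladder could be carried as a small config table with a lookup, as in the sketch below. The `Tier` dataclass and `tier_for` are hypothetical names, the values are copied from the table above, and treating each boundary as inclusive on the lower edge is an assumption the table does not settle.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    min_equity: float
    gross_cap: float
    per_name_cap: float
    vehicles: str

# Ordered by ascending equity threshold.
LADDER = (
    Tier(0.0, 1.0, 0.20, "ETFs only"),
    Tier(100_000.0, 1.5, 0.15, "ETFs (+ liquid futures where min-size allows)"),
    Tier(1_000_000.0, 2.0, 0.12, "ETF + futures (margin-efficient)"),
    Tier(10_000_000.0, 3.0, 0.10, "futures-first + options overlays"),
)

def tier_for(equity):
    """Return the highest tier whose threshold the account equity meets."""
    if equity < 0:
        raise ValueError("equity must be non-negative")
    chosen = LADDER[0]
    for tier in LADDER:
        if equity >= tier.min_equity:
            chosen = tier
    return chosen

current_tier = tier_for(250_000)
```

The top tier's 3.0× is an upper bound ("up to firm cap"), so a real config would likely take the minimum of this value and the firm-level limit.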

4. Allocation algorithm + exact math

Notation. Sleeves s = 1..K. Sleeve realized return at time t: r_{s,t}. Sleeve target weight vector over symbols: w_s ∈ R^M. Asset covariance Σ_A ∈ R^{M×M}. Sleeve covariance Σ_S ∈ R^{K×K}.

Sleeve return definition (canonical). Formalize today's ad-hoc _strategy_pnl_buf / IC P&L into a single series: r_{s,t} = Σ_i w_{s,i,t-1} · ret_{i,t} — the return the normalized sleeve target from the prior rebalance would have earned. (This matches ic_tracker.rolling_sharpe's pnl_t but is the authoritative input to Stage 1.)
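The lag in this definition is easy to get wrong, so a short sketch may help. It assumes row t of the weight matrix is the normalized target set at the close of bar t and row t of the return matrix is realized over bar t; the function name is hypothetical.

```python
import numpy as np

def sleeve_returns(weights, asset_returns):
    """r_{s,t} = sum_i w_{s,i,t-1} * ret_{i,t} for one sleeve.

    weights, asset_returns: arrays of shape (T, M).
    Returns a length T-1 series aligned to bars 1..T-1.
    """
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)
    if weights.shape != asset_returns.shape:
        raise ValueError("weights and asset_returns must share shape (T, M)")
    # Pair weights from bar t-1 with returns from bar t: no same-bar information.
    return np.sum(weights[:-1] * asset_returns[1:], axis=1)

w = np.array([[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]])
ret = np.array([[0.01, 0.02], [0.02, -0.02], [0.03, 0.01]])
r_s = sleeve_returns(w, ret)
```

The first return row and the last weight row are deliberately unused: the first has no prior weights, and the last has no realized return yet.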

Stage 0. w_s ← w_s / Σ_i|w_{s,i}|; then w_s ← w_s · (σ_sleeve / max(σ̂_s, σ_floor)), where σ̂_s² = EWMA of r_{s,·}² (half-life h_σ).

Stage 1 (sleeve covariance & HRP).
- Σ_S = ledoit_wolf_shrinkage(R) where R is the T×K lagged sleeve-return matrix (annualized).
- ρ = corr(Σ_S), D_ij = sqrt(0.5(1−ρ_ij)), linkage → quasi-diagonal order → recursive bisection: at each split into clusters (L, R), cluster variance v_C = w_C^T Σ_{S,C} w_C with intra-cluster inverse-variance weights w_C, allocation factor α = 1 − v_L/(v_L+v_R) to the left cluster and 1 − α to the right; propagate. Output base a^{HRP}.
- Skill tilt: a_s ← a^{HRP}_s · clip(2·skill_s, 1−κ, 1+κ), cap a_s ≤ a_max, renormalize Σ a_s=1.
- Smooth: a_t = λ a_{t-1} + (1−λ) a_target, where a_target is the tilted, capped allocation from the previous step.

ERC cross-check (Maillard et al.). Solve for a such that a_i (Σ_S a)_i = a_j (Σ_S a)_j ∀ i,j, i.e. equal total risk contribution RC_i = a_i (Σ_S a)_i. (optimizer.risk_parity already implements the Newton iteration.)
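A quick way to test whether a candidate allocation satisfies the ERC condition is to compute the risk-contribution shares directly. The helper names below are hypothetical; the sketch only checks a given a and does not solve for it.

```python
import numpy as np

def risk_contributions(a, cov):
    """Share of portfolio variance from each sleeve: RC_i = a_i * (cov @ a)_i, normalized."""
    a = np.asarray(a, dtype=float)
    rc = a * (np.asarray(cov, dtype=float) @ a)
    return rc / rc.sum()

def erc_gap(a, cov):
    """Largest minus smallest risk share; zero at an exact ERC solution."""
    rc = risk_contributions(a, cov)
    return float(rc.max() - rc.min())

# Two uncorrelated sleeves with 10% and 20% vol: inverse-vol weights are ERC.
cov = np.diag([0.01, 0.04])
gap_inverse_vol = erc_gap([2 / 3, 1 / 3], cov)
shares_equal_weight = risk_contributions([0.5, 0.5], cov)
```

Under equal nominal weights the 20%-vol sleeve carries 80% of the variance, which is the W3 problem in miniature.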

Stage 2. W = Σ_s a_s · w_s (symbol-summed), then equivalent-wrapper netting.

Stage 3. σ_ex = sqrt(W^T Σ_A W); L = clip(σ_P / σ_ex, L_min, L_max); W ← L·W.

Stage 4. Per-name clip W_i ← clip(W_i, −c_name, c_name); gross scale W ← W·min(1, G_max/Σ|W_i|); verify |ΣW_i| ≤ N_max (trim the largest net contributors if not).

Stage 5. W ← m(DD)·W, then apply the kill-tier throttle_factor as W ← min(throttle,1)·W.

Sizing (once). Final notional per name: notional_i = sizing_base · W_i · kelly_fraction (half-Kelly), with kelly_fraction = 0.5 retained from RiskLimits. The vol_target/realized_vol term in plan_order is set to 1 because vol targeting already happened in Stage 3.
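As a worked instance of the formula, the sketch below assumes sizing_base is account equity, which the text does not specify, and uses a hypothetical function name.

```python
def target_notional(W, sizing_base, kelly_fraction=0.5):
    """notional_i = sizing_base * W_i * kelly_fraction, with no further vol scaling."""
    return {name: sizing_base * weight * kelly_fraction for name, weight in W.items()}

# A 20% long and a 10% short on $1,000,000 become $100,000 long and $50,000 short.
notionals = target_notional({"SPY": 0.20, "TLT": -0.10}, sizing_base=1_000_000)
```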

5. Inputs, estimation & look-ahead control

Input	How estimated	Lookback / shrinkage	Look-ahead guard
Sleeve returns `r_{s,t}`	`Σ_i w_{s,i,t-1}·ret_{i,t}` (prior-bar weights × realized)	full available, ≥60d to activate	weights are lagged by construction
Sleeve covariance `Σ_S`	Ledoit–Wolf constant-correlation	252d window, EWMA half-life 63d	use returns ≤ t-1; purge same-day bar
Sleeve vol `σ̂_s`	EWMA of `r²`	half-life 21–63d, floored	lagged
Asset covariance `Σ_A`	`rolling_covariance` (LW) or `factor_model_covariance` (N>T)	90–120d; factor model for large N	lagged; do not truncate to shortest history — use pairwise-complete + shrinkage or the factor model
Skill / IC	`ic_tracker.skill_score`	window 21, half-life 63	forecast-vs-next-period realized only
Regime	`MarketRegimeDetector`	as configured	features ≤ t-1

Estimation-error is the real enemy (call-out). Per DeMiguel et al. (2009), with K≈5–12 sleeves and ~250–500 daily observations we have nowhere near the sample size for mean-variance to beat 1/N. That is precisely why Stage 1 uses HRP/ERC (no μ, no inversion) + shrinkage and why the skill tilt is bounded. Two concrete estimation-hygiene fixes vs. today: (1) replace the truncate-to-shortest covariance assembly (W6) with pairwise-complete estimation + shrinkage (or the factor model); (2) make the point-in-time contract explicit: every weight applied at the open of bar t is a pure function of data through bar t-1.
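To make the pairwise-complete idea concrete, the sketch below estimates each covariance entry from the rows where both series have data, instead of truncating everything to the shortest history. The function name is hypothetical and missing history is assumed to be encoded as NaN. A pairwise-complete matrix is not guaranteed to be positive semi-definite, so the shrinkage step would still be needed afterwards.

```python
import numpy as np

def pairwise_complete_cov(R):
    """Covariance where entry (i, j) uses every row in which both i and j are observed."""
    R = np.asarray(R, dtype=float)
    n = R.shape[1]
    cov = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(i, n):
            ok = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
            if ok.sum() >= 2:
                # np.cov on two 1-D series returns a 2x2 matrix; [0, 1] is their covariance.
                cov[i, j] = cov[j, i] = np.cov(R[ok, i], R[ok, j])[0, 1]
    return cov

# Series 0 has four observations, series 1 only the last two.
R = np.array([[0.01, np.nan], [-0.02, np.nan], [0.03, 0.01], [0.00, -0.01]])
cov = pairwise_complete_cov(R)
```

Here the variance of series 0 still uses all four observations, whereas truncate-to-shortest would have discarded the first two.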

6. Rebalancing rules

Risk allocation (Stage 1): once daily at 15:30 ET; EWMA-smoothed; only updates that move a sleeve weight by > 2% absolute are applied.
Symbol refresh (Stages 2–5): at each REBALANCE_SCHEDULE_ET slot, within fixed sleeve weights; trade a name only if outside the no-trade band and net of cost-gate.
Newly promoted sleeve: enters at ≤ ½ of its HRP weight, ramped to full over ~20 trading days (probation), tying into the promotion gate. A demoted/quarantined sleeve has its weight set to 0 and its positions handed to the existing stale-close-out path.
Event-driven: keep EventTriggerEngine extra rebalances, but they refresh symbol targets only — never recompute sleeve covariance intraday (avoids covariance whipsaw on a vol spike).
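The probation rule for a newly promoted sleeve can be sketched as a ramp multiplier on its HRP weight. The linear shape is an assumption (the text only fixes the starting point of at most one half and the roughly 20-day horizon), and the function name is hypothetical.

```python
def probation_multiplier(days_since_promotion, ramp_days=20, start=0.5):
    """Fraction of the sleeve's HRP weight it is allowed to hold during probation."""
    if days_since_promotion <= 0:
        return start
    if days_since_promotion >= ramp_days:
        return 1.0
    # Linear ramp from `start` on day 0 to 1.0 on day `ramp_days`.
    return start + (1.0 - start) * days_since_promotion / ramp_days

# A sleeve with a 20% HRP weight, ten trading days after promotion, holds 15%.
ramped_weight = 0.20 * probation_multiplier(10)
```

How the withheld weight is handled (left in cash or redistributed to the other sleeves) is not specified above and is left open here.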

7. Mapping onto the existing codebase (reuse vs. add)

Reuse as-is:
- qgtm_portfolio/covariance.py: ledoit_wolf_shrinkage, rolling_covariance, factor_model_covariance.
- qgtm_portfolio/optimizer.py: hrp, risk_parity (ERC), inverse_vol, equal_weight already implemented and tested.
- qgtm_portfolio/ic_tracker.py: sleeve skill scores for the bounded tilt.
- qgtm_portfolio/correlation_penalty.py: optional extra redundancy discount (HRP already handles most of this structurally).
- qgtm_risk/manager.py: kill-tier ladder, compliance caps, half-Kelly sizing (with the second vol-scaling neutralized).
- qgtm_risk/pm_factor_model.py: Σ_A when N>T.
- qgtm_risk/stress.py (StressTestLab, HISTORICAL_SCENARIOS): validation §9.
- qgtm_execution/pretrade_cost.py (PreTradeCostModel): turnover/cost gate.
- qgtm_portfolio/signal_aggregator.py: _net_equivalent_wrappers, VehicleRouter, and DrawdownManager (upgraded to smooth).

Add (thin, low-parameter):
- StrategyReturnTracker: canonical per-sleeve realized return series (formalizes _strategy_pnl_buf).
- SleeveAllocator: Stage 1, i.e. HRP/ERC over Σ_S (shrunk), caps, bounded skill tilt, EWMA smoothing, fallback ladder HRP→ERC→inverse-vol→1/N.
- PortfolioIntegrator: the Stage 0–8 orchestrator that replaces the dual aggregator/ensemble execution path with one pipeline. It supersedes the unused PortfolioOrchestrator and the EnsembleAllocator min-variance path, and outputs the final symbol weights the daemon sizes to.
- Smooth m(DD) on DrawdownManager (or a new smooth_multiplier) applied to the executed book.
- Regime → vol-band + bounded-tilt wiring; capital-ladder config table.

Net effect on daemon._rebalance(): the two parallel weight computations (telemetry agg_result and executed combined) collapse into a single PortfolioIntegrator.build_target_book(...) call whose output flows to sizing. One vol target, one drawdown governor, one allocator.

8. Expected Sharpe uplift from diversification

For K sleeves each with standalone Sharpe SR and average pairwise return correlation ρ̄, an equal-risk combination has approximately

SR_portfolio ≈ SR · sqrt( K / (1 + (K−1)·ρ̄) ).

Worked examples (per-sleeve SR = 0.6):

K	ρ̄ = 0.0	ρ̄ = 0.2	ρ̄ = 0.4
3	1.04	0.88	0.77
5	1.34	1.00	0.83
8	1.70	1.10	0.87

So with 5–8 genuinely diversifying sleeves (ρ̄ ≈ 0.2) the formula gives a combined Sharpe of about 1.0–1.1, a ~1.7×–1.8× uplift over a single sleeve. Do not bank the full figure: (a) HRP/ERC are not mean-variance optimal so they capture most, not all, of the theoretical benefit; (b) ρ̄ rises toward 1 in crises, which pushes the multiplier sqrt(K / (1 + (K−1)·ρ̄)) back toward 1; (c) estimation error and turnover costs subtract. A prudent planning assumption is 0.85–1.05 combined Sharpe before costs, with the explicit acknowledgement (DeMiguel et al.) that if HRP cannot beat 1/N-risk net of costs in walk-forward, we default to equal-vol/1/N across sleeves.
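The approximation is easy to evaluate for any sleeve count and correlation. The short sketch below (function name hypothetical) loops over the same K and ρ̄ grid and also shows the crisis limit ρ̄ → 1, where the uplift disappears.

```python
import math

def combined_sharpe(sr, k, rho):
    """SR_portfolio ~ SR * sqrt(K / (1 + (K - 1) * rho)) for K equal-risk sleeves."""
    return sr * math.sqrt(k / (1.0 + (k - 1) * rho))

for k in (3, 5, 8):
    row = [round(combined_sharpe(0.6, k, rho), 2) for rho in (0.0, 0.2, 0.4)]
    print(k, row)

# With rho = 1 the sleeves are one bet and the combined Sharpe is just the sleeve Sharpe.
crisis_sharpe = combined_sharpe(0.6, 8, 1.0)
```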

9. Behavior in crises

Vol targeting (Stage 3) cuts gross as realized vol spikes — the Moreira–Muir mechanism, and the Cederburg et al. evidence says this helps trend/momentum sleeves most (which this system runs).
HRP (Stage 1) degrades gracefully when cross-sleeve correlations jump to ~1: it loses diversification but never blows up (no inversion), unlike min-variance.
Smooth drawdown governor (Stage 5) tapers exposure continuously from 5% DD, hitting the 0.2 floor by 12% and cooloff at 15% — no cliff, no whipsaw re-entry.
Regime tilt (Stage 6) lowers the vol target to the bottom of the band and tilts toward defensive sleeves under a confident crisis read.
Kill-tier overlay is the hard backstop (FLATTEN at the 15% hard DD / daily-loss breaker).

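Net: in a vol spike the book shrinks through three mutually reinforcing channels (vol target ↓, DD governor ↓, regime tilt ↓), while HRP keeps the sleeve mix stable instead of concentrating it, with the kill switch as the final stop. The channels are not statistically independent, since all of them respond to the same jump in realized vol and losses, so they should be read as layered defenses rather than separate bets.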

10. Validation plan

Combined-portfolio walk-forward. Rolling/expanding windows; at each step estimate Σ_S on lagged data, compute HRP, hold, record OOS. Report net-of-cost Sharpe, vol (vs. target), max DD, turnover, effective-N (portfolio_optimization._effective_n). Benchmark ladder: HRP vs. ERC vs. inverse-vol vs. 1/N vs. current production path. HRP must beat 1/N-risk net of costs or we ship 1/N-risk.
PBO on the meta-allocation. Treat the configuration family — {HRP, ERC, inverse-vol, MV}, lookbacks, shrinkage targets, vol target, skill-tilt bound — as the strategy set. Run CSCV/CPCV; require PBO < 0.5 (target < 0.2). Compute the Deflated Sharpe Ratio on the final combined track using the number of configurations tried (Bailey & López de Prado). A high PBO means the choice among methods is itself overfit → fall back to the most robust (HRP or 1/N-risk).
Stress scenarios. Run StressTestLab.run_all (HISTORICAL_SCENARIOS: 1987 Black Monday, 2008 collapse, 2020 COVID / negative WTI, 2011 silver squeeze, 2022 energy) plus monte_carlo_stress on the assembled book; confirm worst-case loss respects the DD caps and the governor actually cuts gross.
Parameter-sensitivity (overfitting screen). Perturb lookback, shrinkage, vol target, EWMA λ by ±50%. Acceptance criterion: Sharpe is stable. High sensitivity = overfit and is itself a rejection signal — favor the flatter configuration even at slightly lower in-sample Sharpe.
Live shadow run. Run PortfolioIntegrator in shadow alongside the current allocator for ~4–8 weeks. Compare realized vs. target vol, turnover, gross, and ex-ante/ex-post tracking before cutover.

11. Where overfitting / estimation error actually bites (explicit risk register)

Sleeve covariance instability with few sleeves and short history → mitigated by shrinkage + HRP + EWMA smoothing + ERC cross-check. Do not add μ estimates at the cross-sleeve level.
Vol-targeting look-ahead → use only lagged realized vol; expect the benefit mostly on trend sleeves (Cederburg et al.).
Skill-tilt feedback (chasing recently hot sleeves) → bounded ±25%, off during warmup, derived from rank-IC not raw P&L.
Regime overfitting → bounded tilts, confidence-gated, never a hard method switch.
Method selection overfitting → PBO/DSR gate; default to HRP/1/N-risk if PBO is high.
Turnover/cost drag → no-trade band, cadence split, cost gate, EWMA smoothing.

Guiding principle: favor robust, low-parameter methods over fragile optimization. Every added knob must pay for itself against the 1/N-risk benchmark net of costs and after the PBO penalty.

12. References

Antonov, A., Lipton, A., & López de Prado, M. (2024). Analytical weight-noise bounds for HRP vs. minimum-variance.
Bailey, D. H., Borwein, J., López de Prado, M., & Zhu, Q. J. (2015). The Probability of Backtest Overfitting. Journal of Computational Finance.
Bailey, D. H., & López de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management 40(5):94–107.
Cederburg, S., O'Doherty, M., Wang, F., & Yang, X. (2020). On the performance of volatility-managed portfolios. Journal of Financial Economics.
Choueifaty, Y., & Coignard, Y. (2008). Toward Maximum Diversification. Journal of Portfolio Management.
DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal Versus Naive Diversification: How Inefficient is the 1/N Portfolio Strategy? Review of Financial Studies 22(5):1915–1953.
Ledoit, O., & Wolf, M. (2004). Honey, I Shrunk the Sample Covariance Matrix. Journal of Portfolio Management 30(4):110–119.
López de Prado, M. (2016). Building Diversified Portfolios that Outperform Out-of-Sample. Journal of Portfolio Management 42(4):59–69. DOI 10.3905/jpm.2016.42.4.059.
López de Prado, M. (2019). A Robust Estimator of the Efficient Frontier (NCO). SSRN 3469961; Machine Learning for Asset Managers (2020), Cambridge University Press.
MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2010). Good and bad properties of the Kelly and fractional Kelly criteria. Quantitative Finance 10(7):681–687.
MacLean, L. C., Ziemba, W. T., & Blazenko, G. (1992). Growth Versus Security in Dynamic Investment Analysis. Management Science 38(11):1562–1585.
Maillard, S., Roncalli, T., & Teiletche, J. (2010). The Properties of Equally-Weighted Risk Contribution Portfolios. Journal of Portfolio Management 36(4):60–70.
Moreira, A., & Muir, T. (2017). Volatility-Managed Portfolios. Journal of Finance 72(4):1611–1644. DOI 10.1111/jofi.12513.
Rockafellar, R. T., & Uryasev, S. (2000). Optimization of Conditional Value-at-Risk. Journal of Risk.