MVG Sweep — Post Forward-Leak Fix (2026-04-16)
TL;DR — The strategy has no edge
After fixing the Polars slice(offset, length) forward-leak in
qgtm_backtest/intraday.py, the MVG v1 sweep collapses. Every config
now produces a negative Sharpe and a negative total return, with DSR
confidence of at most 0.1834 and PBO = 1.0. The previously reported
"best DSR 0.36" result was an artifact of the bug: the strategy was
peeking at future bars, which inflated signal quality post-warmup.
Recommendation: keep QGTM_ENABLE_MVG=false. The strategy as
currently parameterised does not have edge on 30m GLD over
2022-01-03 → 2026-04-14. Any further work should start from a
re-research of the underlying thesis, not a reparameterisation of
this strategy.
Root cause
In intraday.py, the reference engine built each bar's window with
Polars .slice(offset, length) and passed an end index as the second
argument. That argument is a length, not an end index. For i >= W,
offset = i + 1 - W and length = i + 1, so the slice spans
[i + 1 - W, 2i + 2 - W), clamped at the end of the data, and reaches
past bar i by up to i + 1 - W future bars. Every MVG entry from bar
220 onward (W = 220) was computed with look-ahead.
The vectorized engine in intraday_vec.py replicated the same
effective end-bar mapping via j_i = min(i + max(0, i+1-W), n-1) to
preserve equivalence with the reference — so PR #3's equivalence
tests still passed; both engines had the same bug.
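The slice arithmetic can be checked without Polars. The sketch below is a pure-Python stand-in that assumes only what this report states: slice(offset, length) takes a length and is clamped at the end of the data. The helper names (slice_bounds, buggy_window, fixed_window, vec_effective_end) are illustrative, not identifiers from intraday.py, and the fixed call shape is an assumption about what a correct trailing window looks like rather than a quote of the patch.

```python
def slice_bounds(n: int, offset: int, length: int) -> range:
    """Row indices selected by slice(offset, length) on n rows, clamped at n."""
    start = min(max(offset, 0), n)
    return range(start, min(start + length, n))


def buggy_window(i: int, W: int, n: int) -> range:
    # Pre-fix shape described above: an end index (i + 1) passed as the length.
    offset = max(0, i + 1 - W)
    return slice_bounds(n, offset, i + 1)


def fixed_window(i: int, W: int, n: int) -> range:
    # Assumed correct shape: a trailing window that ends at bar i.
    offset = max(0, i + 1 - W)
    return slice_bounds(n, offset, i + 1 - offset)


def vec_effective_end(i: int, W: int, n: int) -> int:
    # End-bar mapping the vectorized engine used before the fix.
    return min(i + max(0, i + 1 - W), n - 1)


W, n = 220, 14_396
for i in (0, 219, 220, 1_000, 14_395):
    bug, fix = buggy_window(i, W, n), fixed_window(i, W, n)
    assert fix[-1] == i                            # fixed window stops at bar i
    assert bug[-1] == vec_effective_end(i, W, n)   # both engines shared one end bar
    print(i, "buggy last bar:", bug[-1], "fixed last bar:", fix[-1])
```

The second assertion is the reason the equivalence tests kept passing: the last bar read by the buggy slice and the vectorized engine's j_i are the same index for every i.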
Scope of contamination
- Bars affected: every bar with index i >= W (i.e. >= 220 for the default window of 220); at i = W - 1 the slice still ends exactly at bar i. Over a 14,396-bar cache that is roughly 14,176 bars, or 98.5% of the bar-walk.
- Future bars visible: for a bar at index i, the slice reached i + 1 - W bars past bar i, capped at the n - 1 - i bars that actually follow it. At i = 220 the slice leaked 1 future bar; the leak then grew by one bar per step, peaking near the middle of the cache at about 7,100 bars before the end of the data began to clamp it. Feature recomputation from the slice tail (ATR, ADX, Bollinger width, swing-high/low) used that post-bar data.
- Strategies affected: only MVG v1 consumes this engine. No live trading consumed these numbers: the MVG flag has been OFF (QGTM_ENABLE_MVG=false) since the strategy was quarantined by the DSR gate.
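The contamination figures follow directly from n and W. This is a minimal sketch of that arithmetic, assuming the slice bounds from the root-cause section and that the slice is clamped at the end of the cache; leaked_future_bars is a hypothetical helper, not a function from the repo.

```python
n, W = 14_396, 220  # bar count and default window quoted in this report


def leaked_future_bars(i: int, W: int, n: int) -> int:
    """Bars after index i that the pre-fix slice could actually read."""
    requested = max(0, i + 1 - W)   # overreach past bar i implied by the slice math
    available = n - 1 - i           # bars that exist after bar i
    return min(requested, available)


overreaching = sum(1 for i in range(n) if i + 1 - W > 0)
worst_i = max(range(n), key=lambda i: leaked_future_bars(i, W, n))

print(f"bars whose slice overreached: {overreaching} of {n} ({overreaching / n:.1%})")
print(f"leak at i = 220: {leaked_future_bars(220, W, n)} bar")
print(f"largest leak: {leaked_future_bars(worst_i, W, n)} bars at i = {worst_i}")
```

The requested overreach keeps growing with i, but the bars that can actually be read shrink toward the end of the cache, so the final bar reads no future data at all.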
Before vs after — 4-config sweep
Backtest window: 2022-01-03 → 2026-04-14, 30m GLD bars (14,396 bars). Fixed engine via the vectorized path (equivalent to reference engine to 5e-6 on trade geometry; equivalence test suite passes).
| Config | N (before) | N (after) | Sharpe (before) | Sharpe (after) | Total Return (before) | Total Return (after) | DSR (before) | DSR (after) | PBO (before) | PBO (after) |
|---|---|---|---|---|---|---|---|---|---|---|
| LS_pure_rr3.0 | 27 | 306 | -0.57 | -1.06 | -0.79% | -8.10% | 0.2733 | 0.0931 | 0.625 | 1.0000 |
| S_pure_rr3.0 | 16 | 170 | -1.34 | -1.50 | -1.57% | -8.37% | 0.3589 | 0.1053 | 0.625 | 1.0000 |
| LS_decay_rr3.0 | 55 | 463 | -2.42 | -2.49 | -1.63% | -13.54% | 0.2976 | 0.1834 | 0.5556 | 1.0000 |
| S_decay_rr3.0 | 27 | 225 | -1.69 | -2.09 | -0.75% | -7.32% | 0.3056 | 0.1425 | 0.5556 | 1.0000 |
Key observations:
- Trade count rose roughly 8-11x in every config (for example 27 → 306 for LS_pure). The old engine was aggressively filtering signals using post-bar information, producing fewer but higher-quality-looking trades. Remove the leak and the regime filter passes far more often on noisy bars.
- Returns went from mildly negative to deeply negative. Worst case (LS_decay) is now -13.5% over the 4.3-year window vs -1.6% before. The decay exit amplifies the damage: it closes mostly-losing short positions early, before they can recover.
- DSR confidence collapsed. The best config drops from 0.36 to 0.18, nowhere near the 0.95 hard gate. PBO went to 1.0 across the board, meaning that in every split the config selected as "best in-sample" ranked below the median out-of-sample. The strategy is memorising noise.
- Win rate dropped from ~50% to ~42-50% while the risk-reward stayed fixed at 3:1, with profit factor under 1.0 in all cases.
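The trade-count multipliers behind the first observation can be recomputed from the sweep table. The snippet below is a small sketch that only uses the N (before) and N (after) columns; nothing in it comes from the sweep script itself.

```python
# (trades before, trades after) per config, copied from the sweep table
trades = {
    "LS_pure_rr3.0": (27, 306),
    "S_pure_rr3.0": (16, 170),
    "LS_decay_rr3.0": (55, 463),
    "S_decay_rr3.0": (27, 225),
}

ratios = {name: after / before for name, (before, after) in trades.items()}
for name, ratio in ratios.items():
    print(f"{name:<16} {ratio:4.1f}x")

mean_ratio = sum(ratios.values()) / len(ratios)
print(f"mean multiplier: {mean_ratio:.1f}x")
```

The per-config multipliers fall between about 8x and 11x, which averages out to just under 10x.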
Go-live implications
- Hard gate remains blocked. DSR < 0.95 on every config. Soft gate (Sharpe ≥ 1.0) also failed.
- The go-live target (2026-05-12 at the earliest) is not delayed by this result. MVG was never going to clear the gate: the "0.36 best" number that suggested it was in the ballpark was a bug artifact.
- No real-capital exposure. MVG flag has been OFF. No production trade decision used these numbers.
- Memory file: project_intraday_lookahead_bug.md should be marked RESOLVED (bug fixed, baseline reset, no edge found at current parameters). The resolution is recorded in this report; the memory-file update itself is the follow-up item.
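The gate outcome can be restated as a check over the post-fix numbers. This is a sketch under two assumptions that the report does not spell out: a gate passes when the metric is at or above its threshold, and the hard gate alone decides whether the flag may be enabled. SweepResult and gate_status are hypothetical names, not types or functions from the repo.

```python
from dataclasses import dataclass

DSR_HARD_GATE = 0.95     # hard-gate threshold quoted in this report
SHARPE_SOFT_GATE = 1.0   # soft-gate threshold quoted in this report


@dataclass(frozen=True)
class SweepResult:       # hypothetical container for one table row
    config: str
    sharpe: float
    dsr: float


# Post-fix values copied from the sweep table.
POSTFIX = [
    SweepResult("LS_pure_rr3.0", -1.06, 0.0931),
    SweepResult("S_pure_rr3.0", -1.50, 0.1053),
    SweepResult("LS_decay_rr3.0", -2.49, 0.1834),
    SweepResult("S_decay_rr3.0", -2.09, 0.1425),
]


def gate_status(r: SweepResult) -> dict[str, bool]:
    # Assumption: a gate passes when the metric is at or above its threshold.
    return {"hard": r.dsr >= DSR_HARD_GATE, "soft": r.sharpe >= SHARPE_SOFT_GATE}


for r in POSTFIX:
    print(r.config, gate_status(r))

any_hard_pass = any(gate_status(r)["hard"] for r in POSTFIX)
print("QGTM_ENABLE_MVG should stay false:", not any_hard_pass)
```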
Verification
- Regression tests added in tests/qgtm_backtest/test_intraday_no_lookahead.py: 7 tests, 3 parametric variants of the core leak check, plus vec-engine and spike-trigger end-to-end checks. All pass.
- The regression suite fails on the pre-fix engine (verified by temporarily stashing the fix and re-running: "bar 20 saw bar 21 — forward leak").
- Vectorized-engine equivalence suite still passes (7 tests): both engines now produce bit-for-bit identical results, both leak-free.
- Pre-existing tests/test_intraday_backtest.py (16 tests) passes unchanged.
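A leak check of this kind can be sketched in a few lines. The version below is an illustration of the idea, not the contents of test_intraday_no_lookahead.py: it stores each bar's own index as its value, so the maximum value in a window is the latest bar that window saw. The function names and the list-based windows are assumptions made for the sketch; the real engine works on Polars frames.

```python
from typing import Callable, Sequence

WindowFn = Callable[[Sequence[float], int, int], Sequence[float]]


def trailing_window(bars: Sequence[float], i: int, W: int) -> Sequence[float]:
    """Leak-free window: the last W bars ending at bar i (inclusive)."""
    return bars[max(0, i + 1 - W): i + 1]


def leaky_window(bars: Sequence[float], i: int, W: int) -> Sequence[float]:
    """Pre-fix shape: (offset, length) with an end index passed as the length."""
    offset = max(0, i + 1 - W)
    return bars[offset: offset + (i + 1)]


def assert_no_lookahead(window_fn: WindowFn, n: int = 50, W: int = 20) -> None:
    # Each bar's value is its own index, so max(window) is the latest bar seen.
    bars = [float(k) for k in range(n)]
    for i in range(n):
        latest = int(max(window_fn(bars, i, W)))
        assert latest <= i, f"bar {i} saw bar {latest}: forward leak"


assert_no_lookahead(trailing_window)   # passes silently

try:
    assert_no_lookahead(leaky_window)
except AssertionError as exc:
    print(exc)
```

With W = 20 the leaky variant first fails at bar 20, which is the first index where the slice overreaches, mirroring the failure message quoted above.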
Files touched
- qgtm_backtest/intraday.py: fixed slice math in IntradayBarWalkEngine.run (line ~357) and _extract_current_mvs (line ~571).
- qgtm_backtest/intraday_vec.py: removed j_sig / j_dec effective-end-bar shims; feature lookups now use index i directly; fallback slice math fixed to match.
- tests/qgtm_backtest/test_intraday_no_lookahead.py: NEW, 7 regression tests covering the leak.
- scripts/_mvg_postfix_sweep.py: one-shot vec-engine sweep that produced the numbers in this report. Kept for reproducibility.
- backtest_results/mvg_v1/sweep_postfix.json: raw sweep output.
Delta summary
| Metric | Before (best) | After (best) | Δ |
|---|---|---|---|
| DSR confidence | 0.3589 (S_pure) | 0.1834 (LS_decay) | -0.1755 |
| Sharpe | -0.57 (LS_pure) | -1.06 (LS_pure) | -0.49 |
| Total return | -0.75% (S_decay) | -7.32% (S_decay) | -6.6 pp |
| Trade count | 16-55 | 170-463 | ~8-11x per config |
| PBO | 0.5556-0.625 | 1.0 (all) | worst possible |
The honest read: MVG v1 has no edge on this data. The fix didn't kill a good strategy; it revealed that the apparent edge was the bug.