
HVF / iHVF Breakout — Backtest Report ⚠️ NOT a validated systematic edge — 2026-06-29

Hypothesis. The Hunt Volatility Funnel breakout shown live by the HVF Radar (hvfradar.py) — enter on the break through H3 (bullish HVF) or L3 (bearish iHVF), stop at the opposite anchor, target = first-leg height projected from the funnel centre — has positive expectancy after costs.
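The rule above can be sketched as a small level calculator. This is only an illustration under stated assumptions: the report does not define "first-leg height" or "funnel centre", so the sketch takes the first-leg height to be H1 minus L1 and the centre to be the midpoint of H3 and L3. The names `TradePlan` and `plan_breakout` are hypothetical and are not taken from hvfradar.py.

```python
from dataclasses import dataclass


@dataclass
class TradePlan:
    entry: float
    stop: float
    target: float

    @property
    def reward_to_risk(self) -> float:
        """Distance to target divided by distance to stop (the R multiple at target)."""
        return abs(self.target - self.entry) / abs(self.entry - self.stop)


def plan_breakout(h1: float, l1: float, h3: float, l3: float, bullish: bool) -> TradePlan:
    """Build entry, stop and target for a bullish HVF or a bearish iHVF."""
    # Assumption: first-leg height = H1 - L1, funnel centre = midpoint of H3 and L3.
    # The radar may define either quantity differently.
    leg_height = h1 - l1
    centre = (h3 + l3) / 2.0
    if bullish:
        # Enter on the break through H3, stop at the opposite anchor L3.
        return TradePlan(entry=h3, stop=l3, target=centre + leg_height)
    # Bearish iHVF: enter on the break through L3, stop at H3.
    return TradePlan(entry=l3, stop=h3, target=centre - leg_height)


# Illustrative anchors only, not real market data.
long_plan = plan_breakout(h1=120.0, l1=80.0, h3=104.0, l3=96.0, bullish=True)
short_plan = plan_breakout(h1=120.0, l1=80.0, h3=104.0, l3=96.0, bullish=False)
```

With these made-up anchors the long plan enters at 104, stops at 96 and targets 140, so the setup offers 4.5 R at target before costs.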

Verdict (short version). No robust edge after costs. The only timeframe with a proper pre-2026 / 2026 split (4h) shows an apparent in-sample edge that rests on a small sample, is concentrated in shorts, and whose short component collapses out of sample. Combined 4h 2026 is only marginally positive (PF ≈ 1.24 on 23 trades, inside the noise). The timeframes that generate enough trades to mean anything (15m, 5m) are negative after costs; 5m is badly negative. As a discretionary radar it still flags clean setups, but it is not an automated positive-expectancy system.


Method

Data constraint that shapes everything

Only 4h and 1d have pre-2026 history; the radar's faster timeframes do not exist before 2026:

TF  | History starts     | Pre-2026 split possible?
4h  | 2024-06-04 (~2 yr) | ✅ yes, the primary test
1d  | 2024-06-05 (~2 yr) | ✅ yes (but radar doesn't scan 1d)
1h  | 2026-03-06         | ❌ 2026-only
15m | 2026-05-05         | ❌ 2026-only
5m  | 2026-05-15         | ❌ 2026-only

So the tune-pre / test-post rule can only be honoured on 4h (and 1d). The lower timeframes can only ever be 2026-only — reported as out-of-sample, never tunable.
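A minimal sketch of that split, using the history start dates from the table above. The helper names are hypothetical, and assigning a trade to a period by its entry date is an assumption; the report does not say which timestamp it uses.

```python
from datetime import date

CUTOFF = date(2026, 1, 1)  # boundary between tunable data and untouched test data


def can_split(history_start: date) -> bool:
    """A timeframe supports tune-pre / test-post only if its history begins before the cutoff."""
    return history_start < CUTOFF


def split_trades(trades):
    """Split (entry_date, r_multiple) pairs into pre-2026 and 2026 groups.

    Assumption: trades are assigned to a period by entry date.
    """
    train = [t for t in trades if t[0] < CUTOFF]
    oos = [t for t in trades if t[0] >= CUTOFF]
    return train, oos


# History start dates as reported in the table above.
history_starts = {
    "4h": date(2024, 6, 4),
    "1d": date(2024, 6, 5),
    "1h": date(2026, 3, 6),
    "15m": date(2026, 5, 5),
    "5m": date(2026, 5, 15),
}
splittable = [tf for tf, start in history_starts.items() if can_split(start)]
```

Only 4h and 1d end up in `splittable`; every other timeframe has an empty train set by construction.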


Results

4h — the primary, split-validated timeframe (85 trades over 2 yr)

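Period           | Leg          | Trades | Win% | Avg R | PF   | 8-cap equity | Max DD
TRAIN (pre-2026) | Long (HVF)   | 22     | 36.4 | +0.31 | 1.48 |              |
TRAIN (pre-2026) | Short (iHVF) | 40     | 60.0 | +0.99 | 3.79 |              |
TRAIN (pre-2026) | Combined     | 62     | 51.6 | +0.75 | 2.65 | $15,510      | 5.7%
OOS 2026         | Long (HVF)   | 13     | 61.5 | +0.63 | 2.60 |              |
OOS 2026         | Short (iHVF) | 10     | 20.0 | −0.53 | 0.27 |              |
OOS 2026         | Combined     | 23     | 43.5 | +0.13 | 1.24 | $10,276      | 3.9%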

The train edge is almost entirely shorts (iHVF PF 3.79). Out of sample the short leg collapses (PF 3.79 → 0.27, avg R −0.53). The long leg actually held up (1.48 → 2.60) but on only 13 trades. Combined OOS PF 1.24 on 23 trades is inside the noise — not an edge you can lean on. The 8-position cap never binds on 4h (signals are too sparse: 0 skipped).
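As a sanity check on the table, the combined rows can be reproduced from the per-leg rows by trade-weighted averaging. The sketch below does that and also shows the usual definition of profit factor (gross winning R over gross losing R), which is assumed here because the report does not spell out its formula. Function names are illustrative.

```python
def profit_factor(r_multiples):
    """Gross winning R divided by gross losing R (the usual definition, assumed here)."""
    gains = sum(r for r in r_multiples if r > 0)
    losses = -sum(r for r in r_multiples if r < 0)
    return float("inf") if losses == 0 else gains / losses


def pool(legs):
    """Combine (trades, win_pct, avg_r) legs into trade-weighted totals."""
    n = sum(trades for trades, _, _ in legs)
    win_pct = sum(trades * win for trades, win, _ in legs) / n
    avg_r = sum(trades * r for trades, _, r in legs) / n
    return n, round(win_pct, 1), round(avg_r, 2)


# Per-leg figures taken from the 4h table: (trades, win %, average R).
train = pool([(22, 36.4, 0.31), (40, 60.0, 0.99)])
oos = pool([(13, 61.5, 0.63), (10, 20.0, -0.53)])

# Toy R series, only to show how profit_factor behaves.
toy_pf = profit_factor([2.0, -1.0, 1.0, -1.0])
```

Both pooled rows match the reported Combined rows (62 trades, 51.6%, +0.75 R and 23 trades, 43.5%, +0.13 R). Profit factor cannot be pooled this way because it needs the gross win and loss totals of each leg, not their averages.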

4h monthly (sum R by exit month):

TRAIN  2024-07 +13.8 | 2024-08 -1.0 | 2024-09 -1.0 | 2024-10 +1.9 | 2024-12 +1.5
       2025-01 -4.6  | 2025-02 +0.7 | 2025-03 +10.0| 2025-04 +8.8 | 2025-05 -2.6
       2025-06 +0.9  | 2025-07 +1.9 | 2025-08 +3.2 | 2025-09 -1.0 | 2025-10 +0.6
       2025-11 +2.5  | 2025-12 +11.9
OOS    2026-01 -1.0  | 2026-02 -3.9 | 2026-03 +2.3 | 2026-04 +4.2 | 2026-05 +0.5
       2026-06 -0.2

The train curve leans on three clusters (Jul-2024, Mar–Apr 2025, Dec-2025); 2026 is a flat grind around zero.
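The concentration can be quantified directly from the monthly figures. The sketch below sums the train months listed above and measures how much of the total comes from the strongest months; it also includes a hypothetical helper, `sum_r_by_exit_month`, showing one way such a table could be built from (exit date, R) pairs.

```python
from collections import defaultdict
from datetime import date


def sum_r_by_exit_month(trades):
    """Sum R multiples per exit month from (exit_date, r_multiple) pairs."""
    totals = defaultdict(float)
    for exit_date, r in trades:
        totals[exit_date.strftime("%Y-%m")] += r
    return dict(totals)


# Train months exactly as listed in the report (sum R by exit month).
train_monthly = {
    "2024-07": 13.8, "2024-08": -1.0, "2024-09": -1.0, "2024-10": 1.9, "2024-12": 1.5,
    "2025-01": -4.6, "2025-02": 0.7, "2025-03": 10.0, "2025-04": 8.8, "2025-05": -2.6,
    "2025-06": 0.9, "2025-07": 1.9, "2025-08": 3.2, "2025-09": -1.0, "2025-10": 0.6,
    "2025-11": 2.5, "2025-12": 11.9,
}
cluster_months = ["2024-07", "2025-03", "2025-04", "2025-12"]

total_r = sum(train_monthly.values())
cluster_r = sum(train_monthly[m] for m in cluster_months)
cluster_share = cluster_r / total_r
```

Four of the seventeen train months contribute about 44.5 R of roughly 47.5 R, which is around 94% of the in-sample result.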

1d — secondary (19 trades, mostly shorts)

TRAIN: 17 trades, win 17.6%, avg R −0.27, PF 0.58 (negative). OOS: only 2 trades. Too few signals to say anything; in-sample it loses.

Lower timeframes — 2026-only (out-of-sample by construction)

TF  | Trades | Win% | Avg R | PF   | 8-cap equity | Max DD | Last-14d
1h  | 34     | 50.0 | +0.52 | 1.99 | $11,696      | 5.1%   | +7.7 R (PF 2.48)
15m | 121    | 33.1 | −0.08 | 0.89 | $8,822       | 22.1%  | −7.4 R (PF 0.78)
5m  | 323    | 34.4 | −0.26 | 0.67 | $4,121       | 64.7%  | −11.1 R (PF 0.88)

Clear monotonic decay with speed: the faster the timeframe, the worse it gets. 1h is positive (again driven by shorts, PF 2.69; longs lose, PF 0.49) but sits on only ~4 months of one regime. 15m is negative; 5m is a wipeout (−59% equity, 65% drawdown) — noise plus the ~0.17% round-trip cost on small stop distances overwhelms any signal.
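The cost effect is simple arithmetic: a fixed percentage cost takes a larger bite out of 1 R when the stop sits closer to the entry. The sketch below expresses the ~0.17% round-trip cost in R units; the stop distances are hypothetical examples, not measured values from the backtest.

```python
ROUND_TRIP_COST = 0.0017  # ~0.17% of notional, the figure quoted in the report


def cost_in_r(stop_distance_pct: float, cost_pct: float = ROUND_TRIP_COST) -> float:
    """Round-trip cost as a fraction of 1 R, where 1 R equals the stop distance."""
    return cost_pct / stop_distance_pct


# Hypothetical stop distances: 0.3% (tight, fast timeframe) up to 4% (wide, slow timeframe).
for stop in (0.003, 0.01, 0.04):
    print(f"stop {stop:.1%}: cost = {cost_in_r(stop):.2f} R per trade")
```

Under these assumed distances a 0.3% stop pays roughly 0.57 R per trade in costs, while a 4% stop pays about 0.04 R, so a fast timeframe needs a much larger gross edge just to break even.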

Last-14-day (to data end)

4h: 3 trades, ≈flat (−0.2 R). 1h: +7.7 R (PF 2.48). 15m: −7.4 R. 5m: −11.1 R. Mixed and small — consistent with "no dependable edge," good recent run on 1h shorts notwithstanding.

Robustness to the one non-radar knob (time-stop)

max_hold is my modelling choice, not the radar's. Sweeping it confirms the verdict is stable, not a single-parameter fluke:

max_hold (bars) | 4h OOS PF | 1h OOS PF
40              | 1.64      | 2.69
60              | 1.33      | 1.95
90              | 1.24      | 1.99
120             | 1.20      | 2.08

4h stays weakly positive on a tiny sample; 1h stays robustly positive on 2026-only data. Neither finding changes with the time-stop.
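For readers who want to see what the time-stop does mechanically, here is a minimal sketch of a single long trade with a `max_hold` exit and a stop-first rule on bars that touch both levels. It is an assumption-laden illustration, not the backtester behind the numbers above: `simulate_long` is a hypothetical name, costs are ignored, and exiting at the close of the `max_hold`-th bar is a guess at the convention.

```python
def simulate_long(bars, entry, stop, target, max_hold=90):
    """Return the R multiple of one long trade, before costs.

    bars: list of (high, low, close) tuples after entry, oldest first.
    """
    risk = entry - stop
    for i, (high, low, close) in enumerate(bars, start=1):
        if low <= stop:
            # Stop is checked first, so a bar touching both levels counts as a loss.
            return -1.0
        if high >= target:
            return (target - entry) / risk
        if i >= max_hold:
            # Time-stop: exit at the close of the max_hold-th bar (assumed convention).
            return (close - entry) / risk
    # Data ended before any exit: mark to the last close (assumption).
    return (bars[-1][2] - entry) / risk if bars else 0.0


# Illustrative bars only: one ambiguous bar, then a drifting series.
ambiguous = simulate_long([(150.0, 90.0, 100.0)], entry=100.0, stop=95.0, target=140.0)
drift = [(101.0, 99.0, 100.5)] * 5
timed_out = simulate_long(drift, entry=100.0, stop=95.0, target=140.0, max_hold=2)
```

A sweep like the one in the table would rerun every trade with `max_hold` set to 40, 60, 90 and 120 and recompute the profit factor each time.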


Honest conclusion

Caveats (don't over-read the green cells)

  1. Survivorship. tracked_symbols is today's active perp set; delisted perps are absent. This both flatters and distorts — and it especially muddies the short results (coins that died would have been the best shorts, yet they're gone from the data).
  2. Sample size. The timeframes with a real train/test split (4h, 1d) produce only 85 and 19 trades in two years. You cannot certify or kill an edge on that. The big samples (5m: 323) come only from 2026 and only where it loses.
  3. Modelling choices. Time-stop (90 bars), stop-first-on-ambiguous-bar, and 1%-risk sizing are mine, not the radar's. The first is shown robust above; the second is conservative.
  4. No parameter fitting was done — a plus for honesty (2026 is genuinely untouched) but it also means we have not searched for a better-performing variant; we tested exactly what is live.
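To put a number on the sample-size caveat, the sketch below computes a Wilson score interval for the 4h OOS win rate (10 wins in 23 trades, from the 4h table). It assumes trades are independent, which overlapping crypto positions probably are not, so the true uncertainty is likely wider. The function name is illustrative.

```python
import math


def win_rate_interval(wins: int, trades: int, z: float = 1.96):
    """Wilson score interval for a win rate (about 95% coverage when z = 1.96)."""
    p = wins / trades
    denom = 1 + z * z / trades
    centre = (p + z * z / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z * z / (4 * trades * trades)) / denom
    return centre - half, centre + half


# 4h OOS combined: 43.5% of 23 trades is 10 wins (8 long + 2 short).
low, high = win_rate_interval(10, 23)
```

The interval runs from roughly 26% to 63%, wide enough to contain both the 51.6% train win rate and win rates consistent with a clearly losing system.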

What the radar is still good for

The radar remains a useful discretionary screen — it isolates clean, completed funnels at their decision point, and the higher-timeframe short (iHVF) setups in particular showed the most life (4h train, 1h 2026). Used the way Jacques uses the momentum shortlist — hand-vetted, regime-aware, not fired mechanically — it earns its place. It should not be wired to auto-trade.

Possible next steps (only if we want to pursue it)

Files