HVF / iHVF Breakout — Backtest Report ⚠️ NOT a validated systematic edge — 2026-06-29

Hypothesis. The Hunt Volatility Funnel breakout shown live by the HVF Radar (hvfradar.py) — enter on the break through H3 (bullish HVF) or L3 (bearish iHVF), stop at the opposite anchor, target = first-leg height projected from the funnel centre — has positive expectancy after costs.

Verdict (short version). No robust edge after costs. The only timeframe with a proper pre-2026 / 2026 split (4h) shows an apparent in-sample edge that rests on a small sample and is concentrated in shorts, and the short component collapses out of sample. Combined 4h 2026 is only marginally positive (PF ≈ 1.24 on 23 trades, inside the noise). The timeframes that generate enough trades to mean anything (15m, 5m) are negative after costs; 5m is badly negative. As a discretionary radar it still flags clean setups, but it is not an automated positive-expectancy system.

Method

Engine: /root/backtest_hvf_PY-00028_290626.py (trade log CSVs: /root/backtest_hvf_trades_<tf>.csv).
Faithful replay. The backtest reuses the radar's exact six-pivot detection (find_pivots + the completed-HVF / completed-iHVF gates: descending highs / rising lows, H2 ≥ 0.61 and H3 ≥ 0.50 retraces, genuine-turn check, anchor ≤ 40 bars). No lookahead — a swing pivot is only "known" PIVOT_WINDOW = 10 bars after it prints, exactly as in live use.
Trade model. Arm on a completed funnel while price is still on the watching side of the trigger; enter on a genuine break through the trigger (gap-aware fill); stop at the opposite anchor; target as defined; time-stop at max_hold = 90 bars. If one bar spans both stop and target, assume stop first (conservative). One position per symbol; re-anchors to the latest confirmed pivot.
Costs (mandatory minimums): 0.055% fee + 0.03% slippage per side. Slippage is applied in the fill price and fees (0.11% round-trip) are deducted from the return, for ~0.17% total round-trip drag.
Equity sim: start $10,000, risk 1% to the stop per trade, max 8 concurrent positions. Max drawdown from that curve.
Validation split: TRAIN = entries before 2026-01-01 (in-sample characterisation), OOS = 2026 entries (untouched). We did not fit any parameters — the radar's live settings are used as-is — so 2026 is a true out-of-sample test and the all-2026 lower timeframes are legitimately out-of-sample by construction.
Universe: the 464 active Bybit USDT perps in tracked_symbols, read-only from crypto_cross.db.
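
The trade model and cost rules above can be sketched as follows. This is a minimal illustration, not the engine's code: the names Bar, resolve_exit and net_r are hypothetical, targets are assumed to fill exactly at the target level, and a gap through the stop is assumed to fill at the bar's open. Only the fee and slippage figures and the stop-first rule come from the report.

```python
from dataclasses import dataclass

FEE_PER_SIDE = 0.00055   # 0.055% fee per side (from the report)
SLIP_PER_SIDE = 0.0003   # 0.03% slippage per side (from the report)


@dataclass
class Bar:               # hypothetical candle container
    open: float
    high: float
    low: float
    close: float


def resolve_exit(bar, side, stop, target):
    """Return the raw exit price if this bar closes the trade, else None.

    side is +1 for a long (HVF) and -1 for a short (iHVF).
    The stop is tested first, so a bar that spans both levels is a loss.
    """
    if side == 1:
        if bar.low <= stop:
            return min(bar.open, stop)   # assumed: gap through the stop fills at the open
        if bar.high >= target:
            return target                # assumed: target fills at its level
    else:
        if bar.high >= stop:
            return max(bar.open, stop)
        if bar.low <= target:
            return target
    return None


def net_r(side, entry, exit_price, stop):
    """R-multiple after slippage (in the fills) and fees (off the return)."""
    fill_in = entry * (1 + side * SLIP_PER_SIDE)        # slippage worsens the entry
    fill_out = exit_price * (1 - side * SLIP_PER_SIDE)  # and the exit
    gross = side * (fill_out - fill_in) / fill_in
    net = gross - 2 * FEE_PER_SIDE
    risk = abs(entry - stop) / entry                    # raw entry-to-stop distance
    return net / risk


# A long from 100 with stop 95 and target 110; the bar touches both levels.
ambiguous = Bar(open=100, high=111, low=94, close=105)
exit_price = resolve_exit(ambiguous, side=1, stop=95, target=110)
print(exit_price, round(net_r(1, 100, exit_price, 95), 3))
```

With these assumptions a full stop-out costs slightly more than 1 R, because slippage and fees are charged on top of the stop distance.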

Data constraint that shapes everything

Only 4h and 1d have pre-2026 history; the radar's faster timeframes do not exist before 2026:

TF	History starts	Pre-2026 split possible?
4h	2024-06-04 (~2 yr)	✅ yes — the primary test
1d	2024-06-05 (~2 yr)	✅ yes (but radar doesn't scan 1d)
1h	2026-03-06	❌ 2026-only
15m	2026-05-05	❌ 2026-only
5m	2026-05-15	❌ 2026-only

So the tune-pre / test-post rule can only be honoured on 4h (and 1d). The lower timeframes can only ever be 2026-only — reported as out-of-sample, never tunable.
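
As a rough sketch of the split rule, trades can be partitioned by entry timestamp. The record layout (a dict with an entry_time field) is an assumption for illustration; the actual columns of the trade log CSVs are not shown in this report.

```python
from datetime import datetime

OOS_START = datetime(2026, 1, 1)  # TRAIN = entries before this date, OOS = entries on or after it


def split_by_entry(trades):
    """Partition trades into (train, oos) by entry time, never by exit time."""
    train = [t for t in trades if t["entry_time"] < OOS_START]
    oos = [t for t in trades if t["entry_time"] >= OOS_START]
    return train, oos


# Hypothetical records, only to show the behaviour at the boundary.
trades = [
    {"symbol": "AAAUSDT", "entry_time": datetime(2025, 12, 31, 20, 0)},
    {"symbol": "BBBUSDT", "entry_time": datetime(2026, 1, 1, 0, 0)},
    {"symbol": "CCCUSDT", "entry_time": datetime(2026, 5, 20, 8, 0)},
]
train, oos = split_by_entry(trades)
print(len(train), len(oos))
```

A timeframe whose history starts in 2026 yields an empty train list under this rule, which is why 1h, 15m and 5m can only be reported as 2026-only.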

Results

4h — the primary, split-validated timeframe (85 trades over 2 yr)

Period	Leg	Trades	Win%	Avg R	PF	8-cap equity	Max DD
TRAIN (pre-2026)	Long (HVF)	22	36.4	+0.31	1.48	—	—
	Short (iHVF)	40	60.0	+0.99	3.79	—	—
	Combined	62	51.6	+0.75	2.65	$15,510	5.7%
OOS 2026	Long (HVF)	13	61.5	+0.63	2.60	—	—
	Short (iHVF)	10	20.0	−0.53	0.27	—	—
	Combined	23	43.5	+0.13	1.24	$10,276	3.9%

The train edge is almost entirely shorts (iHVF PF 3.79). Out of sample the short leg collapses (PF 3.79 → 0.27, avg R −0.53). The long leg actually held up (1.48 → 2.60) but on only 13 trades. Combined OOS PF 1.24 on 23 trades is inside the noise — not an edge you can lean on. The 8-position cap never binds on 4h (signals are too sparse: 0 skipped).
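
For reference, the table columns can be reproduced from a list of per-trade R-multiples along these lines. It is a sketch under one assumption: that profit factor is computed on R-multiples (gross winning R over gross losing R); the engine may compute it on dollar P&L instead. The second part checks that the Combined avg R rows are the trade-weighted means of the Long and Short rows in the table.

```python
def summarize(r_multiples):
    """Trades, win %, average R and profit factor for a list of R-multiples."""
    wins = [r for r in r_multiples if r > 0]
    losses = [r for r in r_multiples if r <= 0]
    gross_loss = -sum(losses)
    return {
        "trades": len(r_multiples),
        "win_pct": 100 * len(wins) / len(r_multiples),
        "avg_r": sum(r_multiples) / len(r_multiples),
        "pf": sum(wins) / gross_loss if gross_loss > 0 else float("inf"),
    }


def combined_avg_r(legs):
    """Trade-weighted mean of per-leg average R; legs = [(trades, avg_r), ...]."""
    total_trades = sum(n for n, _ in legs)
    return sum(n * avg for n, avg in legs) / total_trades


# Leg figures copied from the 4h table: (trades, avg R) for Long and Short.
train_combined = combined_avg_r([(22, 0.31), (40, 0.99)])
oos_combined = combined_avg_r([(13, 0.63), (10, -0.53)])
print(round(train_combined, 2), round(oos_combined, 2))  # 0.75 0.13
```

Both values match the Combined rows (+0.75 and +0.13), so the combined figures are consistent with the per-leg rows.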

4h monthly (sum R by exit month):

TRAIN  2024-07 +13.8 | 2024-08 -1.0 | 2024-09 -1.0 | 2024-10 +1.9 | 2024-12 +1.5
       2025-01 -4.6  | 2025-02 +0.7 | 2025-03 +10.0| 2025-04 +8.8 | 2025-05 -2.6
       2025-06 +0.9  | 2025-07 +1.9 | 2025-08 +3.2 | 2025-09 -1.0 | 2025-10 +0.6
       2025-11 +2.5  | 2025-12 +11.9
OOS    2026-01 -1.0  | 2026-02 -3.9 | 2026-03 +2.3 | 2026-04 +4.2 | 2026-05 +0.5
       2026-06 -0.2

The train curve leans on three clusters (Jul-2024, Mar–Apr 2025 and Dec-2025); 2026 is a flat grind around zero.
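
How concentrated the train result is can be checked directly from the monthly figures above. The snippet below is a small illustrative calculation; the helper name top_share is hypothetical and the numbers are copied from the TRAIN rows.

```python
# Monthly sum of R by exit month, copied from the 4h TRAIN rows above.
train_monthly = {
    "2024-07": 13.8, "2024-08": -1.0, "2024-09": -1.0, "2024-10": 1.9,
    "2024-12": 1.5, "2025-01": -4.6, "2025-02": 0.7, "2025-03": 10.0,
    "2025-04": 8.8, "2025-05": -2.6, "2025-06": 0.9, "2025-07": 1.9,
    "2025-08": 3.2, "2025-09": -1.0, "2025-10": 0.6, "2025-11": 2.5,
    "2025-12": 11.9,
}


def top_share(monthly, n):
    """Fraction of total R contributed by the n best months."""
    total = sum(monthly.values())
    best = sorted(monthly.values(), reverse=True)[:n]
    return sum(best) / total


total_r = sum(train_monthly.values())
share = top_share(train_monthly, 4)
print(f"total {total_r:.1f} R, top-4 months {share:.0%}")
```

The four best months (Jul-2024, Mar-2025, Apr-2025, Dec-2025) contribute about 44.5 R of the 47.5 R monthly total, so the other thirteen months together add roughly 3 R.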

1d — secondary (19 trades, mostly shorts)

TRAIN: 17 trades, win 17.6%, avg R −0.27, PF 0.58 (negative). OOS: only 2 trades. Too few signals to say anything; in-sample it loses.

Lower timeframes — 2026-only (out-of-sample by construction)

TF	Trades	Win%	Avg R	PF	8-cap equity	Max DD	Last-14d
1h	34	50.0	+0.52	1.99	$11,696	5.1%	+7.7 R (PF 2.48)
15m	121	33.1	−0.08	0.89	$8,822	22.1%	−7.4 R (PF 0.78)
5m	323	34.4	−0.26	0.67	$4,121	64.7%	−11.1 R (PF 0.88)

Among the lower timeframes the decay with speed is monotonic: the faster the timeframe, the worse the result. 1h is positive (again driven by shorts, PF 2.69; longs lose, PF 0.49) but sits on only ~4 months of one regime. 15m is negative; 5m is a wipeout (−59% equity, 65% drawdown): noise plus the ~0.17% round-trip cost on small stop distances overwhelms any signal.
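
The cost argument can be made concrete by expressing the round-trip drag in R, that is, as a fraction of the stop distance. The stop distances below are illustrative values only, not measurements from the trade logs; the fee and slippage figures are the report's.

```python
ROUND_TRIP_COST = 2 * (0.00055 + 0.0003)  # fee + slippage per side, both sides: ~0.17%


def cost_in_r(stop_distance):
    """Round-trip cost as a fraction of the risk unit (entry-to-stop distance)."""
    return ROUND_TRIP_COST / stop_distance


# Hypothetical stop distances as a fraction of entry price.
for stop_distance in (0.005, 0.01, 0.02, 0.05):
    print(f"stop {stop_distance:.1%}: cost {cost_in_r(stop_distance):.3f} R per trade")
```

A 0.5% stop pays about 0.34 R per trade in costs, while a 5% stop pays about 0.03 R. If intraday funnels produce tighter stops than 4h funnels, the same 0.17% drag removes a much larger share of each trade's R.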

Last-14-day (to data end)

4h: 3 trades, ≈flat (−0.2 R). 1h: +7.7 R (PF 2.48). 15m: −7.4 R. 5m: −11.1 R. Mixed and small — consistent with "no dependable edge," good recent run on 1h shorts notwithstanding.

Robustness to the one non-radar knob (time-stop)

max_hold is my modelling choice, not the radar's. Sweeping it confirms the verdict is stable, not a single-parameter fluke:

max_hold (bars)	4h OOS PF	1h OOS PF
40	1.64	2.69
60	1.33	1.95
90	1.24	1.99
120	1.20	2.08

4h stays weakly positive on a tiny sample; 1h stays robustly positive on 2026-only data. Neither finding changes with the time-stop.

Honest conclusion

The HVF breakout does not pass validation as a systematic, positive-expectancy strategy. On the only properly split timeframe (4h), the apparent edge is small-sample, short-driven, and the short component does not survive out of sample; combined 2026 is barely above breakeven.
Where it has enough trades to be significant, it loses (15m negative, 5m badly negative after costs). The favourable readings (4h train, 1h) ride on small samples and/or a single 2026 regime.
Per CLAUDE.md, this is a result, not a failure. It tells us the funnel geometry alone, traded mechanically across the whole universe, is not an edge after realistic costs — especially intraday where cost drag dominates.

Caveats (don't over-read the green cells)

Survivorship. tracked_symbols is today's active perp set; delisted perps are absent. This both flatters and distorts — and it especially muddies the short results (coins that died would have been the best shorts, yet they're gone from the data).
Sample size. The timeframes with a real train/test split (4h, 1d) produce only 85 and 19 trades in two years. You cannot certify or kill an edge on that. The big samples (5m: 323) come only from 2026 and only where it loses.
Modelling choices. Time-stop (90 bars), stop-first-on-ambiguous-bar, and 1%-risk sizing are mine, not the radar's. The first is shown robust above; the second is conservative.
No parameter fitting was done — a plus for honesty (2026 is genuinely untouched) but it also means we have not searched for a better-performing variant; we tested exactly what is live.

What the radar is still good for

The radar remains a useful discretionary screen — it isolates clean, completed funnels at their decision point, and the higher-timeframe short (iHVF) setups in particular showed the most life (4h train, 1h 2026). Used the way Jacques uses the momentum shortlist — hand-vetted, regime-aware, not fired mechanically — it earns its place. It should not be wired to auto-trade.

Possible next steps (only if we want to pursue it)

Add a regime / trend filter (e.g. only take iHVF shorts when BTC is risk-off, only HVF longs when risk-on) and re-test; the long/short asymmetry hints that the funnel needs directional context.
Backfill delisted perps to attack survivorship before trusting the short numbers.
Treat 1h as a live-forward paper test (it's the most promising clean window) rather than claiming it from 4 months of history.

Files

Engine: /root/backtest_hvf_PY-00028_290626.py
Trade logs: /root/backtest_hvf_trades_{4h,1d,1h,15m,5m}.csv
Radar under test: /root/crypto_bots/hvfradar.py (see 2026-06-27_hvf-radar-log.md)
Data: /root/crypto_bots/crypto_cross.db (candles, tracked_symbols), read-only.