Top-20 Universe & the April Whipsaw — Diagnostic + Backtest ⛔ top-20 FAILS
Date: 2026-06-12
Analyst: Claude (for Jacques)
Engine: /root/backtest_xsmom_PY-00027_100626.py (unchanged)
Diagnostics: /root/xsmom_whipsaw_diag_120626.py, /root/xsmom_top20_120626.py
Data outputs: /root/research/2026-06-12_april-whipsaw-weekly.csv
Builds on: 2026-06-10_cross-sectional-momentum.md (the strategy that PASSED).
1. Question
The passing strategy (regime-gated long/short 21-day momentum, top-100 PIT universe) had one ugly month: April 2026, −20.9%, flagged as open question (a): what caused the April whipsaw, and can a second filter fix it without overfitting? Jacques asked to try a quality filter: trade only the top 20 coins "by market cap."
Data caveat (unchanged from prior reports): there is no market-cap field in any DB. The honest proxy used throughout the project is trailing-30d dollar volume, recomputed weekly point-in-time. "Top 20 by market cap" is therefore implemented as top 20 by dollar volume, PIT.
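The proxy described above can be sketched as follows. This is a minimal illustration, not the code in the engine: the function name, the `(day_index, symbol, close, volume)` bar layout, and the integer day index are all assumptions made for the example. The one property it is meant to show is the point-in-time cut: only bars strictly before the rebalance day contribute to the ranking.

```python
from collections import defaultdict


def top_n_by_dollar_volume(bars, asof_day, n=20, window=30):
    """Rank symbols by trailing dollar volume, point-in-time.

    bars: iterable of (day_index, symbol, close, volume) tuples (hypothetical layout).
    Only bars with asof_day - window <= day_index < asof_day are used, so the
    rebalance day itself and anything after it never leak into the ranking.
    """
    totals = defaultdict(float)
    for day, symbol, close, volume in bars:
        if asof_day - window <= day < asof_day:
            totals[symbol] += close * volume  # dollar volume = price * units traded
    # Sort by descending dollar volume; ties broken alphabetically for determinism.
    ranked = sorted(totals, key=lambda s: (-totals[s], s))
    return ranked[:n]
```

Calling this once per weekly rebalance with the current `asof_day` would give a universe that is recomputed weekly, as the caveat describes.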
2. What the April whipsaw actually is
Per-week reconstruction of the headline variant over 2026 (full log in the CSV):
- BTC rose +18.5% across April; the BTC regime gate read BULL every week. The filter was not wrong — the market was up. The damage was idiosyncratic to the picks.
- The short leg made money in April (+94% summed net). The long leg lost −186%. The entire whipsaw is the long side.
- The long leg buys the 3 strongest coins = last week's biggest pumpers, which then revert violently:
| Week | BTC | Strategy | Long picks (return over the week) |
|---|---|---|---|
| Mar 25–Apr 01 | +3.4% | −3.4% | SIREN −87%, ARIA +49%, BEAT −32% |
| Apr 08–15 | +5.3% | −14.6% | ARIA −87%, NOM −47%, JCT −2% |
| Apr 15–22 | +4.5% | −4.0% | RAVE −90%, LAB +13%, BIO −6% |
SIREN went +132% → −87% in consecutive weeks. These same hype microcaps (RIVER, SIREN, ARIA, RAVE) also produced January's +107% and March's +132% wins. So the long leg is a high-variance lottery on a few extremely volatile coins, equal-weighted as if their volatility were ordinary. April is the stretch where the lottery lost three weeks running.
Diagnosis: not a regime-filter miss — a position-sizing / pick-quality problem.
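To make the sizing point concrete, the arithmetic below takes the three long picks from each week in the table and equal-weights them. It is a gross, back-of-envelope sketch: it ignores costs, the short leg, and whatever leverage or caps the engine applies, so it is not expected to reproduce the Strategy column or the summed leg figures. The helper name is hypothetical.

```python
def equal_weight_leg(pick_returns):
    """Split a leg equally across its picks.

    pick_returns: dict of symbol -> weekly return in percent.
    Returns (per-pick contribution in percentage points, leg return in percent).
    """
    weight = 1.0 / len(pick_returns)
    contributions = {s: r * weight for s, r in pick_returns.items()}
    return contributions, sum(contributions.values())


# Long picks and weekly returns copied from the table above.
weeks = {
    "Mar 25-Apr 01": {"SIREN": -87.0, "ARIA": 49.0, "BEAT": -32.0},
    "Apr 08-15": {"ARIA": -87.0, "NOM": -47.0, "JCT": -2.0},
    "Apr 15-22": {"RAVE": -90.0, "LAB": 13.0, "BIO": -6.0},
}

for label, picks in weeks.items():
    contributions, leg = equal_weight_leg(picks)
    worst = min(contributions, key=contributions.get)
    print(f"{label}: leg {leg:+.1f}%, worst pick {worst} {contributions[worst]:+.1f} pts")
```

With only three equal-weighted names, a single pick that falls 87 to 90% costs the leg roughly 29 to 30 percentage points on its own, which is the mechanism the diagnosis points at.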
3. The top-20 test (Jacques' requested fix)
Same headline params (regime_ls, L21, skip0, N3, btc_ma50), full costs, max-8, PIT universe; only the universe size changed, to 20. Tune and OOS are reported separately.
| Universe | Tune (<2026) | OOS 2026 | April |
|---|---|---|---|
| TOP-20 | PF 1.26, ret +57%, DD −13.6%, Sharpe 1.37 | PF 0.94, ret −7.8%, DD −26.1%, Sharpe −0.32 | −25.4% |
| TOP-100 (baseline) | PF 1.24, ret +62%, DD −31.8%, Sharpe 0.88 | PF 1.26, ret +19.8%, DD −20.9%, Sharpe +1.05 | −20.9% |
- Top-20 looks better in tuning (higher Sharpe, lower drawdown) and then fails OOS: the classic overfitting pattern, where an in-sample improvement does not survive unseen data. Its 2026 result (PF 0.94, −7.8%) sits right on top of the earlier top-10 failure (PF 0.99, −8.5%).
- April got worse, not better: −25.4% vs −20.9%.
- The bombs are still in the top-20: SIREN −87% (Mar 25), RAVE −90% (Apr 15).
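For readers who want to recompute the table's headline metrics from a return series, a minimal sketch follows. The report does not show the engine's exact definitions, so these are assumptions: profit factor is taken as gross gains over gross losses on per-period returns (the engine may compute it per trade), drawdown is measured on a compounded equity curve, and the tune/OOS boundary is the 2026 cutoff used in the table. All function names are hypothetical.

```python
import math


def profit_factor(returns):
    """Gross gains divided by gross losses (assumed definition)."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return math.inf if losses == 0 else gains / losses


def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded equity curve, as a negative fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def split_tune_oos(dated_returns, oos_start="2026-01-01"):
    """Split (iso_date, return) pairs so no OOS period is ever seen during tuning."""
    tune = [r for d, r in dated_returns if d < oos_start]  # ISO dates sort lexically
    oos = [r for d, r in dated_returns if d >= oos_start]
    return tune, oos
```

Computing the metrics separately on the two halves is what lets the table show top-20 winning in tuning while losing out-of-sample.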
Why concentrating doesn't help (the key lesson)
We rank by dollar volume, which explodes exactly when a coin is pumping. SIREN and RAVE trade enormous volume because they are being hyped, so a "top-20 by volume" screen pulls them in at the worst moment, not out. Even with true market-cap data the result would likely be similar — these coins' caps also balloon 5–10× during the pump. Concentration throws away the breadth that gave top-100 its edge while keeping the bombs — worst of both worlds, matching the top-10 report's conclusion.
4. Honest conclusion
Top-20 FAILS out-of-sample (PF 0.94) and worsens the April whipsaw. Do not pursue. The cross-sectional momentum edge lives in breadth (top-100), confirmed a second time.
The whipsaw is a long-leg sizing problem, not a universe problem: the strategy equal-weights coins that can move ±90%/week. The unaddressed, more promising fix remains volatility-scaled position sizing (shrink the bet on wild coins) and/or a "don't buy the parabolic" rule (skip a long whose prior-week return is already in an extreme top percentile). Either targets the actual mechanism instead of the universe.
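Neither fix has been tested in this report; the sketch below only illustrates what the two rules could look like. Inverse-volatility weighting is one common way to do volatility-scaled sizing and is an assumption here, as are the function names, the use of population standard deviation on daily returns, and the 0.95 percentile threshold. Any such threshold is a new tunable parameter and would need the same tune/OOS discipline as the rest of the project.

```python
import statistics


def inverse_vol_weights(daily_returns_by_symbol, floor=1e-6):
    """Weight each pick by 1 / volatility, normalised to sum to 1.

    daily_returns_by_symbol: dict of symbol -> list of recent daily returns.
    floor guards against division by zero for a flat series.
    """
    vols = {s: max(statistics.pstdev(r), floor) for s, r in daily_returns_by_symbol.items()}
    inverse = {s: 1.0 / v for s, v in vols.items()}
    total = sum(inverse.values())
    return {s: w / total for s, w in inverse.items()}


def drop_parabolic(candidates, prior_week_return, pct=0.95):
    """Skip long candidates whose prior-week return ranks at or above pct of the universe.

    prior_week_return: dict of symbol -> prior-week return for the whole universe.
    A candidate is dropped when the share of universe returns strictly below
    its own return is >= pct.
    """
    universe = list(prior_week_return.values())
    kept = []
    for symbol in candidates:
        below = sum(1 for r in universe if r < prior_week_return[symbol])
        if below / len(universe) < pct:
            kept.append(symbol)
    return kept
```

Under inverse-volatility weights, a coin whose daily swings are 30 times larger than another's receives about one thirtieth of the capital, so a −90% week on it would cost the leg far less than it does under equal weighting.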
Status: top-20 = FAIL. Top-100 strategy stands. Next lead: vol-scaled sizing.