Methodology
Every number in this app comes from a transparent statistical method — no black-box ML, no invented confidence. This page documents the exact formulas and, just as importantly, what the models do not claim.
Trend forecast (linear regression / drift)
For a price series p₀…pₙ₋₁ we fit an ordinary least-squares line of price on index (xᵢ = i, yᵢ = pᵢ) and extrapolate forward by h periods:
slope β = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
intercept α = ȳ − β·x̄
prediction(h) = α + β·(n − 1 + h)
direction = up if predicted > current, down if predicted < current, else flat (0.05% dead-band)
confidence = R² = (Σ(xᵢ−x̄)(yᵢ−ȳ))² / [Σ(xᵢ−x̄)²·Σ(yᵢ−ȳ)²]
In plain terms: draw the straight line that best fits recent prices, then follow its slope forward. Direction comes from the trend, not a lagging average — and a jagged, unpredictable series honestly scores a low R².
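The formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the app's actual code: the function name `linear_forecast` and its return keys are made up here, and the 0.05% dead-band is assumed to be measured relative to the current price.

```python
def linear_forecast(prices, h, dead_band=0.0005):
    """OLS line of price on index, extrapolated h periods past the last point."""
    n = len(prices)
    if n < 3:
        raise ValueError("need at least 3 points to fit a trend")

    x_bar = (n - 1) / 2                      # mean of the indices 0..n-1
    y_bar = sum(prices) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(prices))
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    syy = sum((y - y_bar) ** 2 for y in prices)

    beta = sxy / sxx                         # slope
    alpha = y_bar - beta * x_bar             # intercept
    predicted = alpha + beta * (n - 1 + h)   # h periods after the last index

    # Assumption: the dead-band is a relative move versus the current price.
    current = prices[-1]
    change = (predicted - current) / current
    if change > dead_band:
        direction = "up"
    elif change < -dead_band:
        direction = "down"
    else:
        direction = "flat"

    # R² of a simple regression; a constant series explains nothing, so score 0.
    r_squared = sxy ** 2 / (sxx * syy) if syy > 0 else 0.0
    return {"predicted": predicted, "direction": direction, "confidence": r_squared}
```

For a perfectly straight series such as `[100, 102, 104, 106, 108]` with `h=2`, the slope is 2, the forecast is 112 and R² is 1. A jagged series of the same length would return a much lower confidence.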
Confidence band (prediction interval)
The shaded band around the forecast is a 95% prediction interval in simplified form, built from the regression residuals (1.96 is the normal 95% quantile):
residual std error RSE = √( Σ(yᵢ − ŷᵢ)² / (n − 2) )
half-width(h) = 1.96 · RSE · √(1 + h/n)
band = [predicted − half-width, predicted + half-width] (floored at 0)
In plain terms: the band is how much the line typically missed by, widened for distance. Further out means less certain — so the cone visibly fans open.
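A small sketch of the band calculation, refitting the same least-squares line so the block stands alone. The name `prediction_band` is hypothetical, and flooring both ends of the band at zero is an assumption about how "floored at 0" is applied.

```python
from math import sqrt


def prediction_band(prices, h, z=1.96):
    """Return (lower, predicted, upper) for a forecast h periods ahead."""
    n = len(prices)
    if n < 3:
        raise ValueError("need at least 3 points (RSE divides by n - 2)")

    x_bar = (n - 1) / 2
    y_bar = sum(prices) / n
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    beta = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(prices)) / sxx
    alpha = y_bar - beta * x_bar

    # Residual standard error: typical miss of the fitted line.
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in enumerate(prices))
    rse = sqrt(sse / (n - 2))

    predicted = alpha + beta * (n - 1 + h)
    half_width = z * rse * sqrt(1 + h / n)   # grows with the horizon h

    # Assumption: both ends are floored at 0, since prices cannot be negative.
    lower = max(0.0, predicted - half_width)
    upper = max(0.0, predicted + half_width)
    return lower, predicted, upper
```

Calling it with a longer horizon on the same series returns a wider interval, which is the fanning cone described above. A perfectly linear series has zero residuals, so its band collapses onto the forecast.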
Alternative models & backtesting
Five transparent models are compared: linear-regression drift, EMA trend, momentum, mean-reversion, and a naive baseline (tomorrow = today). Evaluation is walk-forward, out-of-sample: at each point t the model is fit on p₀…pₜ and scored against the real p₍ₜ₊ₕ₎ — it never sees the future point it is judged on (no look-ahead bias).
Directional accuracy = correct up/down/flat calls / total
MAE = mean |predicted − actual|
MAPE = mean( |predicted − actual| / |actual| ) · 100
Confusion matrix = predicted direction × actual direction
In plain terms: the naive baseline is the bar every model must clear to be worth anything. If a fancy model can't beat "tomorrow = today", it adds nothing.
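The walk-forward loop and its metrics can be sketched as follows. Everything here is illustrative: `walk_forward`, `call_direction`, `naive_model` and `drift_model` are hypothetical names, `drift_model` is a simple stand-in rather than any of the app's five models, and the actual direction is assumed to use the same 0.05% dead-band as the predicted one.

```python
from collections import Counter


def call_direction(target, current, dead_band=0.0005):
    """Classify a move from current to target as up, down or flat."""
    change = (target - current) / current
    if change > dead_band:
        return "up"
    if change < -dead_band:
        return "down"
    return "flat"


def naive_model(history, h):
    """Baseline: the price h periods ahead equals the last seen price."""
    return history[-1]


def drift_model(history, h):
    """Hypothetical stand-in: last price plus h times the average past step."""
    step = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + h * step


def walk_forward(prices, model, h=1, min_train=3):
    """Fit on p[0..t], score against p[t+h]; the model never sees p[t+h]."""
    hits, abs_errors, pct_errors = 0, [], []
    confusion = Counter()
    for t in range(min_train - 1, len(prices) - h):
        history = prices[: t + 1]            # only the past is visible
        predicted = model(history, h)
        actual = prices[t + h]
        called = call_direction(predicted, history[-1])
        realised = call_direction(actual, history[-1])
        confusion[(called, realised)] += 1
        hits += called == realised
        abs_errors.append(abs(predicted - actual))
        pct_errors.append(abs(predicted - actual) / abs(actual))
    total = len(abs_errors)
    if total == 0:
        raise ValueError("not enough history for one out-of-sample point")
    return {
        "n": total,
        "directional_accuracy": hits / total,
        "mae": sum(abs_errors) / total,
        "mape": 100 * sum(pct_errors) / total,
        "confusion": confusion,              # keys: (predicted, actual)
    }
```

On a steadily rising series the naive baseline always calls "flat" and is wrong on direction every time, while the drift stand-in is right. On a noisy real series the gap is usually far smaller, which is why the comparison against the baseline matters.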
Strategy backtest (signal → PnL)
Directional forecasts can be evaluated as a trading rule, not just a classification score. The engine walks forward on historical prices: at each point t the model sees only p₀…pₜ, emits a signal for the next h periods, and simulates a position held until t+h. Trades are non-overlapping — the next decision happens only after the prior horizon closes.
Long-only: enter when direction = up and confidence ≥ min_confidence
Long/short: long on up, short on down (same confidence gate)
Round-trip cost = 2 × (fee_bps + slippage_bps) / 10 000 (default 10 + 5 bps per side)
Metrics: total return, CAGR, Sharpe (per-trade), max drawdown, profit factor, win rate
Benchmark: buy-and-hold over the same entry→exit window, same costs
In plain terms: turn each call into a real trade with fees and slippage, then compare against simply holding. It is not proof of investable alpha — past walk-forward PnL does not guarantee future results.
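A minimal sketch of the non-overlapping trade loop under the rules above. The `backtest` function and the `signal_fn` callback (returning a direction and a confidence) are hypothetical, and the benchmark is assumed to be a single buy-and-hold from the first entry to the last exit with one round-trip cost.

```python
def backtest(prices, signal_fn, h=7, fee_bps=10, slippage_bps=5,
             long_short=False, min_confidence=0.0, min_history=3):
    """Walk forward, hold each position for h periods, never overlap trades."""
    cost = 2 * (fee_bps + slippage_bps) / 10_000   # round trip, as a fraction
    trades = []                                    # (entry, exit, net_return)
    t = min_history - 1
    while t + h < len(prices):
        direction, confidence = signal_fn(prices[: t + 1], h)  # past only
        side = 0
        if confidence >= min_confidence:
            if direction == "up":
                side = 1
            elif direction == "down" and long_short:
                side = -1
        if side == 0:
            t += 1                                 # no trade, look again next period
            continue
        gross = side * (prices[t + h] / prices[t] - 1)
        trades.append((t, t + h, gross - cost))
        t += h                                     # next decision after the horizon closes

    if not trades:
        return {"trades": [], "total_return": 0.0,
                "win_rate": None, "buy_and_hold": None}

    equity = 1.0
    for _, _, net in trades:
        equity *= 1 + net                          # compound net returns
    first_entry, last_exit = trades[0][0], trades[-1][1]
    # Assumption: benchmark pays one round-trip cost over the same window.
    buy_and_hold = prices[last_exit] / prices[first_entry] - 1 - cost
    wins = sum(net > 0 for _, _, net in trades)
    return {"trades": trades, "total_return": equity - 1,
            "win_rate": wins / len(trades), "buy_and_hold": buy_and_hold}
```

With the default 10 + 5 bps per side, each round trip costs 0.30% of the position, so a signal has to be right by more than that margin on average before the strategy line can pull ahead of buy-and-hold.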
Live example — BTC (linear signal, 7-period horizon)
Strategy equity vs period-matched buy-and-hold on synced price history.
Statistical validation
Accuracy alone does not prove edge. Phase B adds inference on top of walk-forward evaluation:
Wilson 95% CI: directional accuracy vs random (50%)
Bootstrap CI: mean trade / excess return (2000 resamples)
Permutation test: sign-flip p-value for positive mean return
Calibration bins: confidence (R²) vs empirical hit rate
Walk-forward folds: rolling train/test windows with embargo
Bonferroni: α/5 across models; one declared best
In plain terms: we ask "could this just be luck?" A result marked significant at 95% is unlikely under a random null — but that is still not a guarantee of future profit.
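Three of the checks listed above fit in a short sketch. These are textbook versions written for illustration, not the app's implementation: the function names are hypothetical, the bootstrap uses a plain percentile interval, and the seeds are arbitrary.

```python
import random
from math import sqrt


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a hit rate of hits out of n calls."""
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def bootstrap_mean_ci(returns, resamples=2000, seed=0):
    """Percentile 95% interval for the mean, resampling with replacement."""
    rng = random.Random(seed)
    n = len(returns)
    means = sorted(sum(rng.choices(returns, k=n)) / n for _ in range(resamples))
    return means[int(0.025 * resamples)], means[int(0.975 * resamples) - 1]


def sign_flip_p_value(returns, permutations=2000, seed=0):
    """One-sided p-value: how often random sign flips match the observed mean."""
    rng = random.Random(seed)
    n = len(returns)
    observed = sum(returns) / n
    at_least = 0
    for _ in range(permutations):
        flipped = sum(r if rng.random() < 0.5 else -r for r in returns) / n
        at_least += flipped >= observed
    return (at_least + 1) / (permutations + 1)   # add-one avoids p = 0


def bonferroni_alpha(alpha=0.05, n_models=5):
    """Per-model significance bar after correcting for n_models comparisons."""
    return alpha / n_models
```

As a usage example with made-up counts, `wilson_interval(24, 44)` gives an interval that straddles 0.5, so 24 correct calls out of 44 cannot be distinguished from coin-flipping. A model would then also need a p-value below `bonferroni_alpha()` (1% for five models) before being flagged significant.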
Live example — BTC validation report
Confidence calibration
Bars = empirical hit rate per confidence bin; line = mean predicted confidence. Perfect calibration follows the dashed diagonal.
Model selection (Bonferroni α=1.0%)
Declared best after multiple-testing correction: Naive (baseline)
| Model | Accuracy | p-value | Sig. |
|---|---|---|---|
| Linear Regression | 54.5% | 0.546 | — |
| EMA Trend | 34.1% | 0.035 | — |
| Momentum | 50.0% | 1.000 | — |
| Mean Reversion | 47.7% | 0.763 | — |
| Naive (baseline) | 4.5% | <0.001 | ✓ |
Portfolio Monte Carlo (GBM)
Each asset follows Geometric Brownian Motion. Daily drift μ and volatility σ are estimated from that asset’s own log-returns (drift shrunk 50% toward zero to avoid over-extrapolating short history):
rᵢ = ln(pᵢ / pᵢ₋₁) (log-returns)
μ = 0.5 · mean(r), σ = stdev(r)
daily step: Sₜ₊₁ = Sₜ · exp( (μ − ½σ²) + σ·Z ), Z ~ N(0,1)
We run thousands of seeded paths, then report: expected/median return, 95% VaR (5th percentile), P(profit), and a percentile fan (p5/p25/p50/p75/p95).
In plain terms: roll the dice thousands of times on each asset, then read off the range of outcomes — the median, the unlucky tail, and the odds of finishing green.
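A single-asset sketch of the simulation, following the daily step exactly as written above. The function name `simulate_gbm`, the default path count, the seed and the nearest-rank percentile rule are all assumptions made for the example.

```python
import random
from math import exp, log
from statistics import mean, median, stdev


def simulate_gbm(prices, horizon_days, n_paths=5000, seed=42, shrink=0.5):
    """Seeded Monte Carlo of one asset's return over horizon_days."""
    log_returns = [log(b / a) for a, b in zip(prices, prices[1:])]
    mu = shrink * mean(log_returns)      # drift shrunk 50% toward zero
    sigma = stdev(log_returns)

    rng = random.Random(seed)            # seeded, so runs are reproducible
    outcomes = []
    for _ in range(n_paths):
        s = 1.0                          # path value relative to today's price
        for _ in range(horizon_days):
            z = rng.gauss(0.0, 1.0)
            s *= exp((mu - 0.5 * sigma ** 2) + sigma * z)
        outcomes.append(s - 1.0)         # total return of this path
    outcomes.sort()

    def percentile(q):
        # Nearest-rank percentile on the sorted outcomes (an assumption).
        return outcomes[min(n_paths - 1, int(q * n_paths))]

    fan = {f"p{k}": percentile(k / 100) for k in (5, 25, 50, 75, 95)}
    return {
        "mu": mu,
        "sigma": sigma,
        "expected_return": mean(outcomes),
        "median_return": median(outcomes),
        "var_95": fan["p5"],             # 5th-percentile return
        "p_profit": sum(o > 0 for o in outcomes) / n_paths,
        "fan": fan,
    }
```

Because the generator is seeded, two calls with the same inputs return identical numbers, which keeps the displayed fan stable between page loads. A portfolio version would repeat this per asset and combine the weighted outcomes.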
Alpha++ (ensemble v2)
Alpha++ amplifies the investable ensemble with market breadth, walk-forward quality gates, and a unified alpha score that blends signal strength with historical edge:
amplified_vote = 85%·ensemble_v1 + 15%·breadth_tilt(advancing_pct)
composite = (regime_gated_vote + 1) / 2 × 100 × quality_mult × confluence_bonus
alpha_score = 45%·composite + 25%·backtest_accuracy + 20%·strategy_α + 10%·win_rate
× confluence × (0.85 + 0.15·confidence)
Regime min score: risk_on 68 · chop 72 · risk_off 78 (longs need ≥82 in risk_off)
Grades: A≥85 · B≥75 · C≥65 · D≥50 · F<50
In plain terms: agreement across signals is rewarded; weak walk-forward accuracy and wild volatility are penalised. An empty alpha watchlist in choppy regimes is expected: restraint, not an error.
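The scoring arithmetic can be illustrated with a short sketch. Several readings here are assumptions rather than confirmed behaviour: the multipliers on the continuation line are taken to apply to the whole weighted sum, all blended inputs are assumed to share a 0–100 scale with confidence on 0–1, and the regime minimum is assumed to apply to any signal with longs held to the stricter 82 in risk_off. The function names are hypothetical, and inputs such as `quality_mult` and `confluence` are passed in rather than derived.

```python
REGIME_MIN_SCORE = {"risk_on": 68, "chop": 72, "risk_off": 78}
RISK_OFF_LONG_MIN = 82
GRADE_FLOORS = [("A", 85), ("B", 75), ("C", 65), ("D", 50)]


def composite_score(regime_gated_vote, quality_mult=1.0, confluence_bonus=1.0):
    """Map a vote in [-1, 1] to a 0-100 score, then apply the multipliers."""
    return (regime_gated_vote + 1) / 2 * 100 * quality_mult * confluence_bonus


def alpha_score(composite, backtest_accuracy, strategy_alpha, win_rate,
                confluence=1.0, confidence=1.0):
    """Weighted blend; assumes the multipliers scale the whole sum."""
    blend = (0.45 * composite + 0.25 * backtest_accuracy
             + 0.20 * strategy_alpha + 0.10 * win_rate)
    return blend * confluence * (0.85 + 0.15 * confidence)


def grade(score):
    """Letter grade from the thresholds A>=85, B>=75, C>=65, D>=50, else F."""
    for letter, floor in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


def qualifies(score, regime, is_long=True):
    """Regime gate; assumes longs face the stricter bar only in risk_off."""
    minimum = REGIME_MIN_SCORE[regime]
    if regime == "risk_off" and is_long:
        minimum = RISK_OFF_LONG_MIN
    return score >= minimum
```

Under these assumptions a score of 70 passes in risk_on but not in chop, which is one way the watchlist ends up empty in choppy regimes.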
Forward test (track record)
The forward log is a live, immutable test of the frozen ensemble model. Each worker sync may open new qualifying long signals; trades close automatically at horizon maturity. No retroactive edits.
Open: investable long, composite/alpha ≥ threshold, liquid volume
Hold: fixed horizon (steps × sync interval)
Close: mark-to-market at maturity; apply round-trip costs (fee + slippage bps)
Equity: compound net returns scaled by vol-target position size
Metrics: total return, win rate, Sharpe (per-trade), max drawdown
In plain terms: a public, tamper-proof paper-trading log. Not audited fund performance — early windows are small, so read cumulative stats with caution until the 90-day target elapses.
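The equity and metric lines above can be sketched as follows. The name `track_record` is hypothetical, position sizes are assumed to be simple fractions that scale each trade's net return, and the per-trade Sharpe is taken as mean over sample standard deviation of those scaled returns with no annualisation.

```python
from statistics import mean, stdev


def track_record(net_returns, position_sizes=None):
    """Compound closed-trade returns and summarise the resulting equity curve."""
    if not net_returns:
        raise ValueError("no closed trades yet")
    sizes = position_sizes or [1.0] * len(net_returns)
    scaled = [r * s for r, s in zip(net_returns, sizes)]

    equity = [1.0]
    for r in scaled:
        equity.append(equity[-1] * (1 + r))   # compound each closed trade

    # Max drawdown: largest drop from a running peak, as a fraction of the peak.
    peak, max_drawdown = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = max(max_drawdown, 1 - value / peak)

    # Per-trade Sharpe needs at least two trades and some variation.
    sharpe = None
    if len(scaled) > 1 and stdev(scaled) > 0:
        sharpe = mean(scaled) / stdev(scaled)

    return {
        "total_return": equity[-1] - 1,
        "win_rate": sum(r > 0 for r in scaled) / len(scaled),
        "sharpe_per_trade": sharpe,
        "max_drawdown": max_drawdown,
        "equity": equity,
    }
```

For example, net returns of +10%, −50% and +20% at full size compound to an equity of 0.66, a total return of −34% with a 50% maximum drawdown, even though two of the three trades were winners. Halving the position size shrinks both the loss and the drawdown.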
Glossary
The terms that recur above, in one place:
- R². Share of price variance the trend line explains (0–1). Used as the confidence score, so a noisy series scores low.
- Walk-forward. Fit only on the past, score on the next unseen point. No look-ahead, so accuracy isn't inflated by hindsight.
- MAE / MAPE. Mean absolute error (in price units) and mean absolute percentage error: how far predictions land from reality on average.
- Directional accuracy. How often the up/down/flat call was correct, independent of magnitude.
- Sharpe ratio. Return earned per unit of volatility, annualised. Higher means a smoother path to the same return.
- VaR (95%). Value at Risk: the loss that the worst 5% of simulated outcomes meet or exceed. A tail-risk yardstick.
- GBM. Geometric Brownian Motion: the standard random-walk-with-drift model for prices that can't go below zero.
- Bootstrap / permutation. Resampling tests that ask: could this result happen by luck? The bootstrap turns a metric into a confidence interval; the permutation test turns it into a p-value.
- Bonferroni. A stricter significance bar (α/N) applied when testing many models, so one isn't crowned by chance alone.
- Drawdown. The largest peak-to-trough drop in equity: how much pain the strategy put you through along the way.
Honest limitations
- Not AI/ML. No neural networks, LSTM, XGBoost, or SHAP. These are classical statistical models, chosen because they are transparent and defensible.
- Independence assumption. The Monte Carlo simulates each asset independently; it does not model cross-asset correlation, so it understates tail risk for positively correlated holdings.
- Short history. Forecasts and backtests need enough synced points; with little data the app honestly shows “not enough history yet” rather than fabricating numbers.
- Sampling cadence. The forecast horizon is in sampling periods (≈ sync interval), not calendar days — labelled honestly.
- Past ≠ future. All methods extrapolate history. Markets are not stationary; treat outputs as analysis, not financial advice.
References
- Hull, J. — Options, Futures, and Other Derivatives (GBM, risk-neutral pricing).
- Sharpe, W. (1966) — “Mutual Fund Performance” (the Sharpe ratio).
- Hyndman & Athanasopoulos — Forecasting: Principles and Practice (walk-forward evaluation, MAPE).
- Tsay, R. — Analysis of Financial Time Series (log-returns, volatility).