Glossary
26 terms. Plain English first, then technical.
Conformal Prediction
[Conformal Prediction] A method that wraps any ML model to produce prediction intervals with guaranteed coverage rates. It needs no assumptions about the shape of the data distribution, only that the data are exchangeable.
Distribution-free framework producing prediction sets C(X) with finite-sample coverage guarantee P(Y ∈ C(X)) ≥ 1-α under exchangeability.
Coverage Rate
[Conformal Prediction] How often the prediction interval actually contains the true value. If coverage is 90%, the real answer falls inside the interval 90% of the time.
Empirical frequency at which Y_{test} ∈ C(X_{test}) across held-out test folds. Target: 87.5% (α=0.125).
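The coverage calculation can be sketched in a few lines. This is a minimal illustration, not the pipeline's code: the function name `empirical_coverage` and the toy intervals are assumptions made for the example.

```python
def empirical_coverage(intervals, y_true):
    """Fraction of true values that fall inside their [lo, hi] interval."""
    if len(intervals) != len(y_true):
        raise ValueError("intervals and y_true must have the same length")
    if not y_true:
        raise ValueError("need at least one observation")
    hits = sum(1 for (lo, hi), y in zip(intervals, y_true) if lo <= y <= hi)
    return hits / len(y_true)


# Toy data (hypothetical): the second true value falls outside its interval.
intervals = [(58.0, 82.0), (40.0, 55.0), (10.0, 20.0), (0.0, 5.0)]
y_true = [70.0, 56.0, 15.0, 2.5]
coverage = empirical_coverage(intervals, y_true)  # 3 of 4 covered
```

Comparing this number with the 87.5% target is what decides whether a slot counts as on-target, overcovering, or undercovering.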
Prediction Interval
[Conformal Prediction] A range of values (like "$58k to $82k") that we expect to contain the true outcome. Wider intervals = more uncertainty, narrower = more confidence.
Set-valued output C(X) ⊆ Y from a conformal predictor, calibrated to contain Y with probability ≥ 1-α.
StandardCP
[Conformal Prediction] The basic conformal method. Works well for most data types. Computes a single threshold from calibration data to determine interval width.
Split conformal prediction: the threshold is the ⌈(1-α)(n+1)⌉-th smallest nonconformity score on a held-out calibration set of size n, i.e. the empirical quantile at level ⌈(1-α)(n+1)⌉/n.
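A minimal sketch of the split conformal threshold, assuming absolute-residual scores and a symmetric interval around a point prediction. The names `split_conformal_threshold` and `symmetric_interval` and the sample scores are hypothetical, chosen only for illustration.

```python
import math


def split_conformal_threshold(cal_scores, alpha):
    """Return the ceil((1 - alpha) * (n + 1))-th smallest calibration score."""
    n = len(cal_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    rank = math.ceil((1 - alpha) * (n + 1))
    if rank > n:
        # Too few calibration points for this alpha: only an infinite
        # threshold keeps the coverage guarantee.
        return math.inf
    return sorted(cal_scores)[rank - 1]


def symmetric_interval(point_pred, threshold):
    """Interval [prediction - q, prediction + q] for absolute-residual scores."""
    return (point_pred - threshold, point_pred + threshold)


# Hypothetical absolute residuals |Y - mu_hat(X)| from 7 calibration points.
scores = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
q = split_conformal_threshold(scores, alpha=0.125)  # rank = ceil(0.875 * 8) = 7
lo, hi = symmetric_interval(70.0, q)
```

With only 7 calibration points and α=0.125 the rank is 7, so the threshold is the largest score. Smaller calibration sets would force an infinite interval.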
MondrianCP
[Conformal Prediction] A conformal method that gives separate coverage guarantees for each market regime (expansion vs contraction). Ensures accuracy within each state, not just overall.
Group-conditional conformal prediction: partitions calibration set by group labels g(X) and computes per-group quantile thresholds.
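The per-group thresholds can be sketched as below. This is an illustration under assumptions: the function name `mondrian_thresholds`, the regime labels used as group keys, and the toy scores are not taken from the pipeline.

```python
import math
from collections import defaultdict


def mondrian_thresholds(cal_scores, cal_groups, alpha):
    """Compute one conformal threshold per group label."""
    by_group = defaultdict(list)
    for score, group in zip(cal_scores, cal_groups):
        by_group[group].append(score)

    thresholds = {}
    for group, scores in by_group.items():
        n = len(scores)
        rank = math.ceil((1 - alpha) * (n + 1))
        # A group with too few calibration points gets an infinite threshold.
        thresholds[group] = math.inf if rank > n else sorted(scores)[rank - 1]
    return thresholds


# Hypothetical calibration scores: contraction residuals are twice as large.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0] + [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
groups = ["EXPANSION"] * 7 + ["CONTRACTION"] * 7
thresholds = mondrian_thresholds(scores, groups, alpha=0.125)
```

At prediction time the threshold is looked up by the group of the new point, so intervals widen in the regime where calibration errors were larger.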
CQR (Conformalized Quantile Regression)
[Conformal Prediction] A conformal method for data where uncertainty varies. Produces wider intervals during volatile periods and tighter intervals during calm periods.
Trains quantile regressors at α/2 and 1-α/2, then conformally calibrates residuals for finite-sample heteroscedastic intervals.
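A sketch of the calibration step only, assuming the lower and upper quantile predictions already exist. It uses the usual CQR score max(q_lo(X) - Y, Y - q_hi(X)). The function names and numbers are hypothetical.

```python
import math


def cqr_scores(q_lo, q_hi, y):
    """Signed distance of each y outside (positive) or inside (negative) its band."""
    return [max(lo - yi, yi - hi) for lo, hi, yi in zip(q_lo, q_hi, y)]


def cqr_interval(lo_pred, hi_pred, cal_scores, alpha):
    """Shift both quantile predictions outward by the conformal threshold."""
    n = len(cal_scores)
    rank = math.ceil((1 - alpha) * (n + 1))
    q = math.inf if rank > n else sorted(cal_scores)[rank - 1]
    return (lo_pred - q, hi_pred + q)


# Hypothetical calibration set: constant band [10, 20] and seven outcomes.
q_lo = [10.0] * 7
q_hi = [20.0] * 7
y_cal = [12.0, 15.0, 21.0, 9.0, 18.0, 22.0, 14.0]
scores = cqr_scores(q_lo, q_hi, y_cal)

# New point whose raw quantile band is [30, 50].
interval = cqr_interval(30.0, 50.0, scores, alpha=0.125)
```

Because the threshold can be negative, CQR can also shrink a band that the quantile regressors made too wide.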
Nonconformity Score
[Conformal Prediction] A measure of how "unusual" a new data point is compared to what the model expected. Higher scores = more surprising outcomes.
Score function s(X, Y) measuring deviation between prediction and observation. For regression: |Y - μ̂(X)|. For classification: 1 - π̂_y(X).
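The two score functions named above can be written directly. This is a minimal sketch: the function names and the dictionary of class probabilities are illustrative assumptions.

```python
def regression_score(y, y_hat):
    """Absolute residual |Y - mu_hat(X)|."""
    return abs(y - y_hat)


def classification_score(class_probs, true_label):
    """One minus the predicted probability of the true class."""
    return 1.0 - class_probs[true_label]


r = regression_score(70.0, 65.0)
c = classification_score({"up": 0.75, "down": 0.25}, "up")
```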
Exchangeability
[Conformal Prediction] The assumption that the order of data points doesn't matter: future data is drawn from the same process as past data. This is what makes conformal guarantees work.
Joint distribution P(Z_1,...,Z_n) is invariant under permutations. Weaker than i.i.d. but required for conformal validity. Violated by structural breaks and concept drift.
Alpha (α)
[Conformal Prediction] The error rate you're willing to accept. α=0.10 means you want the true value inside the interval at least 90% of the time. Lower α = wider intervals.
Miscoverage level: P(Y ∉ C(X)) ≤ α. Default α=0.125 (87.5% coverage target).
On-Target
[Conformal Prediction] A model slot whose coverage falls within the acceptable range (85-90%). The sweet spot: not too wide, not too narrow.
Coverage ∈ [0.85, 0.90]. The model's prediction intervals are correctly calibrated.
Overcovering
[Conformal Prediction] The prediction interval is too wide: it catches the true value more than 90% of the time. Safe but not as useful. Like predicting "the temperature will be between -40 and 150 degrees."
Coverage > 0.90. Intervals are wider than necessary. May indicate insufficient calibration data or overly conservative alpha.
Undercovering
[Conformal Prediction] The prediction interval is too narrow: the true value falls outside the interval more than 15% of the time. The model is overconfident.
Coverage < 0.85. Intervals fail to contain the true value at the target rate. May indicate exchangeability violations or distribution shift.
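The three coverage states (on-target, overcovering, undercovering) reduce to two threshold checks. A minimal sketch: the function name and the string labels are hypothetical, while the 0.85 and 0.90 bounds come from the definitions above.

```python
def coverage_status(coverage, low=0.85, high=0.90):
    """Classify an empirical coverage value against the [low, high] band."""
    if coverage < low:
        return "UNDERCOVERING"
    if coverage > high:
        return "OVERCOVERING"
    return "ON_TARGET"


statuses = [coverage_status(c) for c in (0.80, 0.85, 0.875, 0.90, 0.95)]
```

Both band edges count as on-target, matching the closed interval [0.85, 0.90].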
Regime
[Regime Analysis] The current state of the economy/market: EXPANSION (liquidity growing, good for risk assets), CONTRACTION (liquidity shrinking, bad for risk assets), or TRANSITION (changing states).
Three-state classification from year-over-year ΔNet S-D with hysteresis. EXPANSION: ΔNet S-D > +σ for ≥3 months. CONTRACTION: ΔNet S-D < -σ for ≥3 months.
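One way to read the persistence rule is sketched below. It assumes that any month not yet confirmed as EXPANSION or CONTRACTION is labeled TRANSITION, which the definition above does not state explicitly. The function name and sample values are hypothetical.

```python
def classify_regimes(delta_net_sd, sigma, min_months=3):
    """Label each month from a series of year-over-year Net S-D changes.

    Assumption: a regime is confirmed only after `min_months` consecutive
    readings beyond the +/- sigma band; everything else is TRANSITION.
    """
    labels = []
    up_run = 0    # consecutive months above +sigma
    down_run = 0  # consecutive months below -sigma
    for d in delta_net_sd:
        up_run = up_run + 1 if d > sigma else 0
        down_run = down_run + 1 if d < -sigma else 0
        if up_run >= min_months:
            labels.append("EXPANSION")
        elif down_run >= min_months:
            labels.append("CONTRACTION")
        else:
            labels.append("TRANSITION")
    return labels


# Hypothetical monthly values with sigma = 1.0.
delta = [1.2, 1.5, 1.1, 0.3, -1.4, -1.2, -1.6, -1.1]
regimes = classify_regimes(delta, sigma=1.0)
```

A stickier hysteresis, where a confirmed regime persists until the opposite one is confirmed, would need an extra state variable and is not shown here.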
Net Supply-Demand (Net S-D)
[Macro Economics] A single number showing whether there's more money flowing into the system (positive = bullish) or out of it (negative = bearish). Combines liquidity supply (Fed balance sheet, M2, bank reserves) against liquidity demand (Treasury General Account, Treasury issuance, reverse repo).
Composite z-score: z(supply_avg) - z(demand_avg). Supply: Fed balance sheet, M2, bank reserves. Demand: TGA, RRP, Treasury issuance.
Z-Score
[Technical] How many standard deviations a value is from its average. Z=0 means average, Z=+2 means unusually high, Z=-2 means unusually low.
z = (x - μ) / σ where μ and σ are computed over a trailing window (default 36 months).
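A trailing z-score can be sketched with the standard library. Two choices here are assumptions rather than documented behavior: the window includes the current observation, and σ is the sample standard deviation. The function name is hypothetical.

```python
import statistics


def trailing_zscore(values, window=36):
    """Z-score of each value against the trailing `window` observations.

    Returns None until a full window is available, or when the window
    has zero variance.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
            continue
        recent = values[i + 1 - window : i + 1]
        mu = statistics.fmean(recent)
        sd = statistics.stdev(recent)
        out.append((values[i] - mu) / sd if sd > 0 else None)
    return out


# Small window so the example is easy to check by hand.
z = trailing_zscore([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

For a steadily rising series every full window looks the same, so each z-score is 1.0.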
Hamilton Filter
[Macro Economics] A way to remove long-term trends from economic data to reveal cyclical patterns. Better than the commonly used HP filter because it doesn't create fake cycles.
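Regression-based detrending: ε_t = Y_t - β̂₀ - β̂₁·Y_{t-h} (single-lag form; Hamilton's original specification regresses Y_t on four lags, Y_{t-h} through Y_{t-h-3}), avoiding HP filter spectral artifacts.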
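The single-lag regression ε_t = Y_t - β̂₀ - β̂₁·Y_{t-h} can be sketched with ordinary least squares in plain Python. The function name `hamilton_residuals` and the test series are assumptions made for illustration.

```python
def hamilton_residuals(y, h):
    """Residuals from an OLS regression of Y_t on a constant and Y_{t-h}."""
    if h < 1 or len(y) <= h + 1:
        raise ValueError("series too short for the chosen horizon h")
    x = y[:-h]       # regressor Y_{t-h}
    target = y[h:]   # regressand Y_t
    n = len(target)
    mean_x = sum(x) / n
    mean_t = sum(target) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("lagged series is constant")
    sxy = sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, target))
    beta1 = sxy / sxx
    beta0 = mean_t - beta1 * mean_x
    return [ti - beta0 - beta1 * xi for xi, ti in zip(x, target)]


# A pure linear trend has no cycle, so the residuals should vanish.
trend = [2.0 * i + 1.0 for i in range(20)]
cycle = hamilton_residuals(trend, h=4)
```

The first h observations are lost because no lagged value exists for them.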
Lomb-Scargle Periodogram
[Macro Economics] A method for detecting periodic cycles in data, even when measurements aren't evenly spaced. Tells us how long liquidity cycles last and where we are in the current one.
Spectral estimation for unevenly sampled data, identifying dominant periodicities and fitting sinusoidal models for cycle phase classification.
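The classical Lomb-Scargle power at a set of trial periods can be computed directly from its textbook formula. This sketch is an illustration, not necessarily how the pipeline computes it: the function name, the synthetic 48-month sinusoid, and the sampling pattern are all assumptions.

```python
import math


def lomb_scargle_power(t, y, periods):
    """Classical Lomb-Scargle power of (t, y) at each trial period."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]  # work with the mean-subtracted signal
    powers = []
    for period in periods:
        w = 2 * math.pi / period
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(
            sum(math.sin(2 * w * ti) for ti in t),
            sum(math.cos(2 * w * ti) for ti in t),
        ) / (2 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        yc_c = sum(a * b for a, b in zip(yc, c))
        yc_s = sum(a * b for a, b in zip(yc, s))
        cos_term = yc_c ** 2 / cc if cc > 0 else 0.0
        sin_term = yc_s ** 2 / ss if ss > 0 else 0.0
        powers.append(0.5 * (cos_term + sin_term))
    return powers


# Unevenly spaced sample times (in months) and a synthetic 48-month cycle.
t = [i + 0.1 * ((7 * i) % 10) for i in range(120) if i % 3 != 0]
y = [math.sin(2 * math.pi * ti / 48) for ti in t]
periods = [24, 36, 48, 60, 72]
power = lomb_scargle_power(t, y, periods)
best_period = periods[power.index(max(power))]
```

The trial period with the highest power is taken as the dominant cycle length. A sinusoid fitted at that period supplies the phase angle used for cycle position.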
FRED
[Macro Economics] Federal Reserve Economic Data, a free database of 800,000+ economic time series maintained by the St. Louis Fed. Our primary data source (281 series).
FRED API provides macroeconomic series with documented vintages and revision history. Our ingestion covers 6 categories across 281 series from 1998-present.
Slot
[Technical] One of our 12 independent forecasting models. Each slot targets a specific asset + problem type + time horizon (e.g., "btc-direction-12m" predicts BTC direction over 12 months).
Independent pipeline instance with its own feature set, model type, conformal method, and coverage target. Slots span binary, regression, multiclass, clustering, and hybrid problem types.
Temporal Cross-Validation
[Technical] Testing model accuracy by training on past data and predicting the future, never peeking at future data during training. The honest way to evaluate financial models.
Time-series CV with purged groups: train/test split respects temporal ordering with gap buffer ≥ forecast horizon to prevent data leakage.
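A sketch of an expanding-window split with a gap buffer between train and test. The function name, the parameters, and the 12-sample gap (standing in for a 12-month forecast horizon on monthly data) are illustrative assumptions.

```python
def purged_time_splits(n_samples, n_splits, test_size, gap):
    """Yield (train_indices, test_indices) with `gap` samples dropped between them."""
    splits = []
    for k in range(n_splits):
        test_end = n_samples - (n_splits - 1 - k) * test_size
        test_start = test_end - test_size
        train_end = test_start - gap  # purge the samples closest to the test fold
        if train_end <= 0:
            raise ValueError("not enough data for the requested splits")
        train = list(range(train_end))
        test = list(range(test_start, test_end))
        splits.append((train, test))
    return splits


# 100 monthly observations, 3 folds of 10 months, 12-month purge gap.
splits = purged_time_splits(n_samples=100, n_splits=3, test_size=10, gap=12)
first_train, first_test = splits[0]
```

scikit-learn's `TimeSeriesSplit` offers a comparable `gap` parameter. The plain-Python version is used here only to keep the index arithmetic visible.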
M2 Money Supply
[Macro Economics] The total amount of money in circulation, including cash, checking deposits, and easily convertible savings. When M2 grows, there's more liquidity; when it shrinks, there's less.
M2 = M1 + small-denomination time deposits + retail money market funds; savings deposits have been counted within M1 since May 2020 (previously a separate M2 component). FRED series: M2SL (seasonally adjusted, monthly).
VIX
[Macro Economics] The "fear index", a measure of expected stock market volatility over the next 30 days. High VIX (>25) = scared market. Low VIX (<15) = calm market.
CBOE Volatility Index: 30-day implied volatility of S&P 500 options. Computed from a weighted strip of OTM calls and puts. VIX z-score > 1.5 correlates with 2.3x ATR (Average True Range) expansion.
Cycle Position
[Regime Analysis] Where we are in the liquidity cycle right now: TROUGH (bottom, about to improve), RECOVERY (improving), PEAK (top, about to worsen), or CONTRACTION (worsening).
Quadrant classification from Lomb-Scargle sinusoidal fit phase angle φ. TROUGH: [π, 3π/2), RECOVERY: [3π/2, 2π), PEAK: [0, π/2), CONTRACTION: [π/2, π).
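The quadrant mapping can be sketched as a lookup on the phase angle. The function name is hypothetical, and wrapping φ into [0, 2π) before classifying is an assumption about how out-of-range angles are handled.

```python
import math


def cycle_position(phi):
    """Map a phase angle (radians) to its quadrant label."""
    phi = phi % (2 * math.pi)  # wrap into [0, 2*pi)
    if phi < math.pi / 2:
        return "PEAK"
    if phi < math.pi:
        return "CONTRACTION"
    if phi < 3 * math.pi / 2:
        return "TROUGH"
    return "RECOVERY"


positions = [cycle_position(p) for p in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2, -0.1)]
```

Each quadrant includes its lower bound and excludes its upper bound, matching the half-open intervals in the definition.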
Coverage Guarantee
[Conformal Prediction] The mathematical promise that our prediction intervals will be correct at least X% of the time. Not a hope or an estimate, but a provable statistical property.
Finite-sample marginal validity: P(Y_{n+1} ∈ C(X_{n+1})) ≥ 1-α for any underlying distribution, provided the calibration and test data are exchangeable.
Correlation Matrix
[Technical] A table showing how closely different assets move together. +1 = move in lockstep, -1 = move opposite, 0 = no relationship. Useful for diversification and pair trading.
Pearson correlation coefficients from daily log returns over selectable rolling windows (30d-1Y). Net S-D row uses monthly z-score changes with minimum 24-month window.
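A sketch of the core computation for a single window: log returns from prices, then pairwise Pearson correlation. The function names, the asset labels, and the prices are hypothetical. Rolling windows and the monthly Net S-D row are not shown.

```python
import math


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(price_series):
    """Nested dict of pairwise correlations between assets' log returns."""
    returns = {name: log_returns(p) for name, p in price_series.items()}
    return {a: {b: pearson(returns[a], returns[b]) for b in returns} for a in returns}


# B is A scaled by 2 (same returns); C is the reciprocal of A (opposite returns).
base = [100.0, 110.0, 99.0, 120.0, 115.0]
prices = {"A": base, "B": [2 * p for p in base], "C": [1 / p for p in base]}
matrix = correlation_matrix(prices)
```

Scaling a price series leaves its log returns unchanged, which is why A and B correlate perfectly in this toy example.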
Feature Engineering
[Technical] The process of creating useful input signals from raw data. We transform 281 raw FRED series into ~11,240 derived features (z-scores, changes, filtered cycles) before the model picks the most useful ones.
Transformations include trailing z-scores (36mo), YoY changes, Hamilton residuals, Lomb-Scargle parameters, rolling correlations, and regime indicators. Total feature space: ~40 derived features × 281 series.