agentic ml system

ALPHAWATCH

Self-healing ML for stock prediction

Detects when market conditions drift beyond the training distribution, retrains automatically with walk-forward cross-validation, shadow-trades the candidate model, then promotes it or rolls back. No human in the loop.

alphawatch — agent log
09:14:01 [detect] PSI(volatility)=0.31 — threshold breached (>0.20)
09:14:02 [detect] KL(returns)=0.13 — threshold breached (>0.10)
09:14:03 [diagnose] LLM: volatility regime shift detected — bull→neutral
09:14:04 [diagnose] plan: walk-forward retrain with expanded VIX features
09:14:05 [retrain] fold 1/4 — sharpe=1.81 accuracy=0.669
09:15:22 [retrain] fold 4/4 — sharpe=1.91 accuracy=0.683 → v2.4.2-rc
09:15:23 [backtest] OOS sharpe=1.88 mdd=-3.8% calmar=2.31 — PASSED
09:15:24 [shadow] 10% allocation — 24hr evaluation window started
10:15:24 [shadow] gate passed — sharpe delta +0.06 vs stable
10:15:25 [promote] v2.4.2-rc → PRODUCTION | v2.4.1 → retired | SHAP logged
4,800+ lines of code
7 LangGraph agent nodes
30 unit tests
5 drift signals monitored

Seven nodes.
Zero manual intervention.

When drift is detected, a LangGraph agent activates and walks through the full remediation cycle autonomously.

node 01
ingest
Pulls latest OHLCV data via yfinance (or Polygon.io WebSocket in production) and rebuilds the full feature store — 40+ technical indicators across price, volume, volatility, momentum, and regime.
yfinance · polygon.io · pandas
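To make the feature-store step concrete, here is a minimal sketch that derives a handful of indicators from an OHLCV frame with pandas. The function name `build_features`, the column names, and the 20-bar window are illustrative assumptions, not the project's actual feature store, and the data is synthetic so the example runs without a network call.

```python
import numpy as np
import pandas as pd


def build_features(ohlcv: pd.DataFrame, window: int = 20) -> pd.DataFrame:
    """Derive a few indicators from OHLCV columns (hypothetical subset of a feature store)."""
    close = ohlcv["close"]
    feats = pd.DataFrame(index=ohlcv.index)

    # Price: simple one-bar return.
    feats["ret_1d"] = close.pct_change()

    # Volatility: annualized rolling standard deviation of returns.
    feats["hist_vol"] = feats["ret_1d"].rolling(window).std() * np.sqrt(252)

    # Bollinger bands: 2 standard deviations around the rolling mean.
    mid = close.rolling(window).mean()
    band = 2 * close.rolling(window).std()
    feats["bb_width"] = (2 * band) / mid
    feats["bb_pos"] = (close - (mid - band)) / (2 * band)

    # Volume: current volume relative to its rolling average.
    feats["vol_ratio"] = ohlcv["volume"] / ohlcv["volume"].rolling(window).mean()

    # Drop the warm-up rows where rolling windows are not yet full.
    return feats.dropna()


# Synthetic OHLCV data standing in for a yfinance or Polygon.io download.
rng = np.random.default_rng(0)
n = 120
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n)))
ohlcv = pd.DataFrame({
    "open": close,
    "high": close * 1.01,
    "low": close * 0.99,
    "close": close,
    "volume": rng.integers(1_000, 5_000, n),
})
features = build_features(ohlcv)
```

The warm-up rows are dropped rather than filled, so downstream drift statistics never see placeholder values.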
node 02
detect
Computes PSI and KL divergence between training baseline and recent production window. Triggers on PSI > 0.20 or KL > 0.10 across any of five monitored signals.
PSI · KL divergence · scipy
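A rough sketch of how the two drift statistics could be computed, using numpy only. The decile binning on the baseline, the `eps` floor for empty bins, and the direction of the KL term (recent window relative to baseline) are assumptions made for illustration; the project's own implementation may differ.

```python
import numpy as np


def binned_fractions(baseline, recent, bins=10, eps=1e-6):
    """Bin both samples on the baseline's quantile edges and return bin fractions."""
    # Interior edges only, so values outside the baseline range fall in the end bins.
    interior = np.quantile(baseline, np.linspace(0, 1, bins + 1))[1:-1]
    p = np.bincount(np.searchsorted(interior, baseline), minlength=bins) / len(baseline)
    q = np.bincount(np.searchsorted(interior, recent), minlength=bins) / len(recent)
    # Floor empty bins to avoid log(0), then renormalize.
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return p / p.sum(), q / q.sum()


def psi(baseline, recent, bins=10):
    """Population Stability Index: sum((q - p) * ln(q / p))."""
    p, q = binned_fractions(baseline, recent, bins)
    return float(np.sum((q - p) * np.log(q / p)))


def kl_divergence(baseline, recent, bins=10):
    """KL(recent || baseline) on the same bins: sum(q * ln(q / p))."""
    p, q = binned_fractions(baseline, recent, bins)
    return float(np.sum(q * np.log(q / p)))


# Synthetic example: a calm window versus one with doubled volatility.
rng = np.random.default_rng(42)
baseline = rng.normal(0.0, 1.0, 5000)
calm = rng.normal(0.0, 1.0, 5000)
stressed = rng.normal(0.0, 2.0, 5000)

psi_calm = psi(baseline, calm)
psi_stressed = psi(baseline, stressed)
kl_stressed = kl_divergence(baseline, stressed)
```

With these synthetic samples the calm window stays well under the 0.20 PSI threshold, while the doubled-volatility window exceeds both the PSI and KL thresholds.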
node 03
diagnose
Claude classifies the drift type — feature distribution shift, volatility regime change, or concept drift — and generates a specific remediation plan. Falls back to rule-based logic if no API key is set.
Claude API · LangGraph · Pydantic
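The rule-based fallback might look something like the sketch below. Everything here is hypothetical: the signal names, the `accuracy_drop` input, the 0.05 cutoff, and the plan strings are invented for illustration, and a stdlib dataclass stands in for the Pydantic model so the example has no dependencies.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    drift_type: str  # "feature_shift", "volatility_regime", "concept", or "none"
    plan: str


def rule_based_diagnose(breached: set[str], accuracy_drop: float = 0.0) -> Diagnosis:
    """Classify drift without an LLM (hypothetical rules, not the project's)."""
    # Volatility or regime signals breaching point to a regime change.
    if {"volatility", "regime"} & breached:
        return Diagnosis("volatility_regime", "walk-forward retrain, reuse hyperparameters")
    # Any other feature-level breach is treated as a plain distribution shift.
    if breached:
        return Diagnosis("feature_shift", "walk-forward retrain, reuse hyperparameters")
    # Stable inputs but degraded accuracy suggests the input-output relationship moved.
    if accuracy_drop > 0.05:
        return Diagnosis("concept", "walk-forward retrain with full HPO")
    return Diagnosis("none", "no action")


example = rule_based_diagnose({"volatility", "returns"})
```

Keeping the fallback a pure function of the drift report means the rest of the graph does not need to know whether an API key was set.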
node 04
retrain
Walk-forward cross-validation with expanding windows. Optuna HPO fires only on concept drift — reuses existing hyperparameters for simpler feature or regime shifts to save time. LightGBM + XGBoost soft-voting ensemble. All metrics logged to MLflow.
LightGBM · XGBoost · Optuna · MLflow
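Expanding-window walk-forward splitting can be sketched in a few lines of plain Python. The helper name `expanding_window_splits` and the equal-sized fold layout are assumptions for illustration; the four folds match the agent log above.

```python
def expanding_window_splits(n_samples: int, n_folds: int = 4):
    """Yield (train, test) index ranges; training always starts at 0 and grows each fold."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = fold_size * k
        # The last fold absorbs any remainder so every sample is used.
        test_end = n_samples if k == n_folds else fold_size * (k + 1)
        yield range(0, train_end), range(train_end, test_end)


splits = list(expanding_window_splits(100, n_folds=4))
for train, test in splits:
    print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
```

Each test block sits strictly after its training block in time, which is what keeps the evaluation free of look-ahead.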
node 05
backtest
OOS walk-forward backtest with transaction costs and slippage. Gates on Sharpe ≥ 1.0 and directional accuracy ≥ 52%. Computes Sharpe, Calmar, Sortino, max drawdown, win rate, profit factor.
walk-forward · Sharpe gate · numpy
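As a sketch of the gate, the snippet below computes three of the listed metrics from a series of per-period net returns and applies the two thresholds. The function names, the 252-period annualization, and the sample returns are assumptions; transaction costs and slippage are taken to be already deducted from the returns.

```python
import math
import statistics


def backtest_metrics(returns, periods_per_year=252):
    """Annualized Sharpe, max drawdown, and Calmar from per-period net returns."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)
    sharpe = mean / stdev * math.sqrt(periods_per_year) if stdev > 0 else 0.0

    # Track the equity curve and its worst peak-to-trough decline.
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        max_dd = min(max_dd, equity / peak - 1)

    # Calmar: annualized growth rate divided by the magnitude of max drawdown.
    years = len(returns) / periods_per_year
    cagr = equity ** (1 / years) - 1
    calmar = cagr / abs(max_dd) if max_dd < 0 else float("inf")
    return {"sharpe": sharpe, "max_drawdown": max_dd, "calmar": calmar}


def passes_gate(metrics, accuracy, min_sharpe=1.0, min_accuracy=0.52):
    """Gate from the text: Sharpe >= 1.0 and directional accuracy >= 52%."""
    return metrics["sharpe"] >= min_sharpe and accuracy >= min_accuracy


# Made-up net returns, repeated to cover roughly one trading year.
sample_returns = [0.01, -0.005, 0.008, -0.002, 0.006] * 50
metrics = backtest_metrics(sample_returns)
```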
node 06
shadow
New model runs at 10% allocation via Alpaca paper trading for 24 hours. Collects live P&L against the stable model. Gate: shadow Sharpe must be positive and win rate above 45%.
Alpaca paper · 24hr gate · 10% allocation
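The shadow stage reduces to two small pieces of logic: splitting capital between the stable and candidate models, and checking the gate once the window closes. The sketch below assumes a list of per-trade P&L values and a per-trade (not annualized) Sharpe, since only its sign matters for the gate; the helper names and sample numbers are hypothetical and no Alpaca calls are made.

```python
import statistics


def split_capital(total: float, shadow_fraction: float = 0.10):
    """Return (stable_capital, shadow_capital) for the evaluation window."""
    shadow = total * shadow_fraction
    return total - shadow, shadow


def shadow_gate(shadow_pnl, min_win_rate=0.45):
    """Gate from the text: shadow Sharpe must be positive and win rate above 45%."""
    win_rate = sum(p > 0 for p in shadow_pnl) / len(shadow_pnl)
    spread = statistics.stdev(shadow_pnl)
    sharpe = statistics.fmean(shadow_pnl) / spread if spread > 0 else 0.0
    return {
        "win_rate": win_rate,
        "sharpe": sharpe,
        "passed": sharpe > 0 and win_rate > min_win_rate,
    }


stable_capital, shadow_capital = split_capital(100_000)
good = shadow_gate([12.0, -4.0, 7.5, 3.0, -2.5, 9.0])
bad = shadow_gate([-5.0, -3.0, 2.0, -4.0])
```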
node 07
promote
If the shadow gate passes, the candidate is promoted to 100% of production allocation and the previous model is archived. If it fails, auto-rollback restores the stable version. Every decision is logged to MLflow with the agent's reasoning trace, giving an audit trail aimed at SR 11-7 model-risk reviews.
MLflow registry · SHAP · auto-rollback
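The promote-or-rollback decision can be illustrated with an in-memory stand-in for the registry. This is a sketch only: `ModelRegistry` and its fields are hypothetical and do not use the MLflow API, and the version strings are borrowed from the agent log above.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """In-memory stand-in for a model registry (hypothetical, not MLflow)."""
    production: str
    archived: list[str] = field(default_factory=list)
    audit_log: list[dict] = field(default_factory=list)

    def resolve(self, candidate: str, gate_passed: bool, reasoning: str) -> str:
        """Promote the candidate if the shadow gate passed, otherwise keep the stable model."""
        if gate_passed:
            self.archived.append(self.production)
            self.production = candidate
            action = "promote"
        else:
            action = "rollback"
        # Record every decision together with the reasoning behind it.
        self.audit_log.append({
            "action": action,
            "candidate": candidate,
            "production": self.production,
            "reasoning": reasoning,
        })
        return action


registry = ModelRegistry(production="v2.4.1")
first = registry.resolve("v2.4.2-rc", gate_passed=True, reasoning="shadow gate passed")
second = registry.resolve("v2.4.3-rc", gate_passed=False, reasoning="win rate below 45%")
```

Logging the failed candidate as well as the promoted one is what lets a reviewer reconstruct why a given version is in production.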

Five signals.
Hourly checks.

PSI and KL divergence computed across feature groups. Drift type classification routes to the correct remediation path.

PSI: price features (threshold 0.20)
Triggers on return distributions, momentum, and Bollinger band position. Catches price regime shifts before model accuracy degrades.
PSI: volume features (threshold 0.20)
OBV, MFI, volume ratio. Volume distribution breakdowns often precede momentum reversals.
PSI: volatility (threshold 0.20)
ATR, historical vol, Bollinger width. The most sensitive signal: catches VIX expansion and volatility clustering early.
KL divergence: returns (threshold 0.10)
KL is more sensitive to tail shifts than PSI. Catches flash crash conditions and fat-tail regime changes that PSI can miss.
PSI: market regime (threshold 0.20)
ADX trend strength, volatility regime flag. Routes to full HPO retrain if regime classification itself has shifted.
trigger thresholds
PSI > 0.20
KL > 0.10
any breach activates the agent
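The "any breach" rule amounts to a threshold lookup over the five signals. In the sketch below the dictionary keys are illustrative names; the volatility and returns scores are the ones shown in the agent log, while the other three scores are made up.

```python
# Thresholds from the text: 0.20 for each PSI signal, 0.10 for KL on returns.
THRESHOLDS = {
    "psi_price": 0.20,
    "psi_volume": 0.20,
    "psi_volatility": 0.20,
    "kl_returns": 0.10,
    "psi_regime": 0.20,
}


def breached_signals(scores: dict[str, float]) -> dict[str, float]:
    """Return the signals whose score exceeds its threshold."""
    return {name: value for name, value in scores.items() if value > THRESHOLDS[name]}


scores = {
    "psi_price": 0.08,       # made-up value
    "psi_volume": 0.05,      # made-up value
    "psi_volatility": 0.31,  # from the agent log
    "kl_returns": 0.13,      # from the agent log
    "psi_regime": 0.12,      # made-up value
}
breaches = breached_signals(scores)
agent_should_run = bool(breaches)
```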

Production stack.

Every component is swappable. yfinance ships as the free default. Polygon.io, Alpaca, and Anthropic activate when keys are set.

Layer            Component             Role
ML               LightGBM + XGBoost    Soft-voting ensemble
HPO              Optuna                TPE sampler, 50 trials
Agent            LangGraph             7-node typed graph
LLM              Claude (Anthropic)    Diagnose + plan nodes
Drift            PSI + KL divergence   Pure numpy/scipy
Explainability   SHAP                  Feature importance logging
Registry         MLflow                Lineage + artifacts
Scheduling       Apache Airflow        Hourly drift DAG
API              FastAPI               WebSocket dashboard
Trading          Alpaca                Paper shadow trading
Data             yfinance / Polygon    OHLCV + real-time
Infra            Docker + Kubernetes   Full K8s manifests

The agent loop.

LangGraph with typed Pydantic state. Each node is independently testable. The LLM only touches two nodes — everything else is deterministic Python.

agent/pipeline.py PYTHON
# AgentState, llm_diagnose, walk_forward_train, log_step, and graph are
# defined elsewhere in the project.

def node_diagnose(state: AgentState) -> AgentState:
    """LLM-powered drift diagnosis and remediation planning."""
    drift_report = state.get("drift_report", {})
    # Assumption: recent performance metrics live in state under "performance";
    # the excerpt passes `perf` without defining it.
    perf = state.get("performance", {})
    diagnosis, plan = llm_diagnose(drift_report, perf, state["ticker"])
    state["diagnosis"] = diagnosis
    state["action_plan"] = plan
    return log_step(state, "diagnose", "info", f"Plan: {plan}")

def node_retrain(state: AgentState) -> AgentState:
    """Walk-forward retrain. HPO only on concept drift."""
    # Feature and regime shifts reuse the existing hyperparameters.
    run_hpo = state.get("drift_type") == "concept"
    result = walk_forward_train(
        ticker=state["ticker"],
        run_hpo_flag=run_hpo,
    )
    state["retrain_result"] = result
    return log_step(state, "retrain", "info",
        f"sharpe={result['overall_sharpe']:.2f} version={result['version']}")

# Graph wiring: a linear path through the seven nodes
graph.add_edge("ingest",   "detect")
graph.add_edge("detect",   "diagnose")
graph.add_edge("diagnose", "retrain")
graph.add_edge("retrain",  "backtest")
graph.add_edge("backtest", "shadow")
graph.add_edge("shadow",   "promote")
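Because each node is a plain function from state to state, it can be exercised without LangGraph at all. The sketch below is a guess at what that looks like: the `AgentState` fields, the body of `log_step`, and the two stub nodes are all hypothetical stand-ins (a `TypedDict` is used here instead of the project's Pydantic state), and the nodes are simply called in sequence.

```python
from typing import Any, TypedDict


class AgentState(TypedDict, total=False):
    """Hypothetical subset of the agent state."""
    ticker: str
    drift_type: str
    logs: list[dict[str, Any]]


def log_step(state: AgentState, node: str, level: str, message: str) -> AgentState:
    """Append a structured log entry and hand the state back (assumed behavior)."""
    logs = list(state.get("logs", []))
    logs.append({"node": node, "level": level, "message": message})
    state["logs"] = logs
    return state


def node_detect_stub(state: AgentState) -> AgentState:
    """Stub node: pretend a volatility breach was found."""
    state["drift_type"] = "volatility_regime"
    return log_step(state, "detect", "warn", "PSI(volatility) above threshold")


def node_retrain_stub(state: AgentState) -> AgentState:
    """Stub node: decide on HPO from the drift type without training anything."""
    run_hpo = state.get("drift_type") == "concept"
    return log_step(state, "retrain", "info", f"run_hpo={run_hpo}")


# Run the stubs in order, the way a linear graph would.
state: AgentState = {"ticker": "SPY"}
for node in (node_detect_stub, node_retrain_stub):
    state = node(state)
```

Testing a node this way only needs a dictionary and an assertion on what it wrote back, which is what makes the non-LLM nodes easy to cover with unit tests.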
open source

Built for production. Ready to run.

Clone it, add your API keys, and run make full to train, backtest, and launch the dashboard in one command.