agentic ml system

ALPHAWATCH

Self-healing ML for stock prediction

Detects when market conditions drift beyond the training distribution, retrains automatically using walk-forward cross-validation, shadow-trades the candidate model, then promotes it or rolls back. No human in the loop.


alphawatch — agent log

09:14:01 [detect] PSI(volatility)=0.31 — threshold breached (>0.20)
09:14:02 [detect] KL(returns)=0.13 — threshold breached (>0.10)
09:14:03 [diagnose] LLM: volatility regime shift detected — bull→neutral
09:14:04 [diagnose] plan: walk-forward retrain with expanded VIX features
09:14:05 [retrain] fold 1/4 — sharpe=1.81 accuracy=0.669
09:15:22 [retrain] fold 4/4 — sharpe=1.91 accuracy=0.683 → v2.4.2-rc
09:15:23 [backtest] OOS sharpe=1.88 mdd=-3.8% calmar=2.31 — PASSED
09:15:24 [shadow] 10% allocation — 24hr evaluation window started
10:15:24 [shadow] gate passed — sharpe delta +0.06 vs stable
10:15:25 [promote] v2.4.2-rc → PRODUCTION | v2.4.1 → retired | SHAP logged
10:15:25 $

agent pipeline

Seven nodes.
Zero manual intervention.

When drift is detected, a LangGraph agent activates and walks through the full remediation cycle autonomously.

node 01

ingest

Pulls latest OHLCV data via yfinance (or Polygon.io WebSocket in production) and rebuilds the full feature store — 40+ technical indicators across price, volume, volatility, momentum, and regime.

yfinance · polygon.io · pandas

node 02

detect

Computes the Population Stability Index (PSI) and KL divergence between the training baseline and a recent production window. Triggers on PSI > 0.20 or KL > 0.10 across any of five monitored signals.

PSI · KL divergence · scipy
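The PSI check in the detect node can be sketched as follows. This is a minimal illustration, not AlphaWatch's implementation: the function name `psi`, the ten equal-width bins, and the `eps` floor for empty bins are all assumptions, and it uses only the standard library where the project itself lists numpy/scipy.

```python
import math


def psi(baseline, recent, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bin edges come from the baseline sample (assumed convention); values in
    `recent` that fall outside that range are clamped into the end bins.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below stays finite.
        return [max(c / len(sample), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(recent)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

Calling `psi(train_values, recent_values) > 0.20` would then reproduce the trigger condition described above for a single feature.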

node 03

diagnose

Claude classifies the drift type — feature distribution shift, volatility regime change, or concept drift — and generates a specific remediation plan. Falls back to rule-based logic if no API key is set.

Claude API · LangGraph · Pydantic

node 04

retrain

Walk-forward cross-validation with expanding windows. Optuna HPO fires only on concept drift — reuses existing hyperparameters for simpler feature or regime shifts to save time. LightGBM + XGBoost soft-voting ensemble. All metrics logged to MLflow.

LightGBM · XGBoost · Optuna · MLflow
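To make the expanding-window idea in the retrain node concrete, here is a small sketch of how the fold boundaries could be generated. The helper name `expanding_window_splits` and the equal-sized test blocks are assumptions made for illustration; the project's own `walk_forward_train` may split differently.

```python
def expanding_window_splits(n_samples, n_folds=4):
    """Yield (train_range, test_range) pairs for walk-forward validation.

    The training window always starts at index 0 and grows by one block per
    fold; each test block comes strictly after its training window, so no
    future data leaks into training.
    """
    block = n_samples // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested number of folds")
    for fold in range(1, n_folds + 1):
        train_end = fold * block
        # The last fold absorbs any remainder so every sample is used.
        test_end = n_samples if fold == n_folds else train_end + block
        yield range(0, train_end), range(train_end, test_end)
```

With 100 rows and four folds this gives training windows of 20, 40, 60, and 80 rows, each followed by a 20-row test block.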

node 05

backtest

OOS walk-forward backtest with transaction costs and slippage. Gates on Sharpe ≥ 1.0 and directional accuracy ≥ 52%. Computes Sharpe, Calmar, Sortino, max drawdown, win rate, profit factor.

walk-forward · Sharpe gate · numpy
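The backtest gate can be illustrated with a short sketch. The function names, the 252-period annualisation, and the zero risk-free rate are assumptions; transaction costs and slippage are taken to be already deducted from the returns passed in. The project lists numpy for this step, whereas this sketch stays in the standard library.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Worst peak-to-trough loss of the compounded equity curve (negative fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def passes_backtest_gate(returns, hit_rate, min_sharpe=1.0, min_accuracy=0.52):
    """Apply the two thresholds quoted above: Sharpe >= 1.0 and accuracy >= 52%."""
    return sharpe_ratio(returns) >= min_sharpe and hit_rate >= min_accuracy
```

Calmar would follow as annualised return divided by the absolute value of `max_drawdown`.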

node 06

shadow

New model runs at 10% allocation via Alpaca paper trading for 24 hours. Collects live P&L against the stable model. Gate: shadow Sharpe must be positive and win rate above 45%.

Alpaca paper · 24hr gate · 10% allocation
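A sketch of the shadow gate, under the assumption that it is evaluated on a list of per-trade P&L values collected during the 24-hour window. The function name `shadow_gate` and the input shape are hypothetical; only the two conditions (positive Sharpe, win rate above 45%) come from the description above.

```python
import statistics


def shadow_gate(trade_pnls, min_win_rate=0.45):
    """Return True if the shadow model's trades clear both gate conditions."""
    if len(trade_pnls) < 2:
        return False  # too little evidence to evaluate
    # Sharpe is mean / stdev and stdev is never negative, so the ratio is
    # positive exactly when the mean P&L is positive.
    sharpe_positive = statistics.mean(trade_pnls) > 0
    win_rate = sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)
    return sharpe_positive and win_rate > min_win_rate
```

Note that a single large winner is not enough on its own: `shadow_gate([5.0, -1.0, -1.0, -1.0])` fails on win rate even though the mean P&L is positive.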

node 07

promote

If shadow gate passes, promotes to 100% production and archives the previous model. If it fails, auto-rollback restores the stable version. Every decision is logged to MLflow with agent reasoning trace — SR 11-7 audit-ready.

MLflow registry · SHAP · auto-rollback
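The promote-or-rollback decision can be sketched with an in-memory stand-in for the model registry. `ModelRegistry` and its fields are hypothetical names for illustration; the project itself logs to the MLflow registry, whose API is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Toy registry: one production version, an archive, and an audit trail."""
    production: str
    archived: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def resolve(self, candidate, gate_passed, reason):
        """Promote the candidate if the shadow gate passed, else keep stable."""
        if gate_passed:
            self.archived.append(self.production)
            self.production = candidate
            decision = "promote"
        else:
            decision = "rollback"
        # Record every decision, including the reasoning, for later audit.
        self.audit_log.append({
            "decision": decision,
            "candidate": candidate,
            "production": self.production,
            "reason": reason,
        })
        return decision
```

For example, `ModelRegistry("v2.4.1").resolve("v2.4.2-rc", True, "shadow gate passed")` returns `"promote"` and moves `v2.4.1` into the archive.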

drift detection

Five signals.
Hourly checks.

PSI and KL divergence computed across feature groups. Drift type classification routes to the correct remediation path.

PSI — price features

0.20

Triggers on return distributions, momentum, and Bollinger band position. Catches price regime shifts before model accuracy degrades.

PSI — volume features

0.20

OBV, MFI, volume ratio. Volume distribution breakdowns often precede momentum reversals.

PSI — volatility

0.20

ATR, historical vol, Bollinger width. The most sensitive signal — catches VIX expansion and volatility clustering early.

KL divergence — returns

0.10

KL is more sensitive to tail shifts than PSI. Catches flash crash conditions and fat-tail regime changes that PSI misses.

PSI — market regime

0.20

ADX trend strength, volatility regime flag. Routes to full HPO retrain if regime classification itself has shifted.
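For the returns signal, KL divergence is computed between two binned distributions. The sketch below assumes both inputs are histograms over the same bins and measures KL(recent || baseline); the direction, the function name, and the `eps` floor are assumptions, and the project itself lists numpy/scipy rather than the standard library used here.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(P || Q) in nats for two histograms defined over the same bins.

    Inputs may be raw counts or probabilities; both are normalised first.
    """
    p_total, q_total = sum(p), sum(q)
    kl = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / p_total
        q_i = max(q_i / q_total, eps)  # avoid division by zero in empty bins
        if p_i > 0:  # bins with zero mass in P contribute nothing
            kl += p_i * math.log(p_i / q_i)
    return kl
```

Unlike PSI, this quantity is not symmetric: swapping the two arguments generally gives a different value, so the choice of which window plays the role of P matters.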

trigger thresholds

PSI > 0.20
KL > 0.10

any breach activates the agent
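The "any breach" rule amounts to a per-signal threshold comparison. In this sketch the signal keys (`psi_price`, `kl_returns`, and so on) are made-up names for the five signals listed above; the threshold values are the ones quoted on this page.

```python
THRESHOLDS = {
    "psi_price": 0.20,
    "psi_volume": 0.20,
    "psi_volatility": 0.20,
    "kl_returns": 0.10,
    "psi_regime": 0.20,
}


def breached_signals(scores, thresholds=THRESHOLDS):
    """Return the names of all signals whose score strictly exceeds its limit."""
    return [
        name for name, limit in thresholds.items()
        if scores.get(name, 0.0) > limit
    ]


def should_activate_agent(scores):
    """Any single breach is enough to start the remediation cycle."""
    return bool(breached_signals(scores))
```

Feeding in the values from the agent log, `breached_signals({"psi_volatility": 0.31, "kl_returns": 0.13})` returns both signal names, so the agent activates.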

technology

Production stack.

Every component is swappable. yfinance ships as the free default. Polygon.io, Alpaca, and Anthropic activate when keys are set.

LightGBM + XGBoost

Soft-voting ensemble

HPO

Optuna

TPE sampler, 50 trials

Agent

LangGraph

7-node typed graph

LLM

Claude (Anthropic)

Diagnosis + planning (diagnose node)

Drift

PSI + KL divergence

Pure numpy/scipy

Explainability

SHAP

Feature importance logging

Registry

MLflow

Lineage + artifacts

Scheduling

Apache Airflow

Hourly drift DAG

API

FastAPI

WebSocket dashboard

Trading

Alpaca

Paper shadow trading

Data

yfinance / Polygon

OHLCV + real-time

Infra

Docker + Kubernetes

Full K8s manifests

under the hood

The agent loop.

LangGraph with typed Pydantic state. Each node is independently testable. The LLM is called only from the diagnose node, which produces both the diagnosis and the remediation plan; everything else is deterministic Python.

agent/pipeline.py (Python)
def node_diagnose(state: AgentState) -> AgentState:
    """LLM-powered drift diagnosis and remediation planning."""
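    drift_report = state.get("drift_report", {})
    # Recent performance metrics; the "performance" state key is an assumed name.
    perf = state.get("performance", {})
    diagnosis, plan = llm_diagnose(drift_report, perf, state["ticker"])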
    state["diagnosis"] = diagnosis
    state["action_plan"] = plan
    return log_step(state, "diagnose", "info", f"Plan: {plan}")

def node_retrain(state: AgentState) -> AgentState:
    """Walk-forward retrain. HPO only on concept drift."""
    # Feature and regime shifts reuse the existing hyperparameters.
    run_hpo = state.get("drift_type") == "concept"
    result = walk_forward_train(
        ticker=state["ticker"],
        run_hpo_flag=run_hpo,
    )
    state["retrain_result"] = result
    return log_step(state, "retrain", "info",
        f"sharpe={result['overall_sharpe']:.2f} version={result['version']}")

# Graph wiring
graph.add_edge("ingest",   "detect")
graph.add_edge("detect",   "diagnose")
graph.add_edge("diagnose", "retrain")
graph.add_edge("retrain",  "backtest")
graph.add_edge("backtest", "shadow")
graph.add_edge("shadow",   "promote")
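Because every node takes the state and returns the updated state, the linear wiring above can be exercised without the LangGraph runtime. The runner below is a plain-Python stand-in written for illustration, not LangGraph's API; `run_pipeline` and the `nodes` mapping are hypothetical, and a real graph may also contain conditional edges that this sketch ignores.

```python
EDGES = [
    ("ingest", "detect"),
    ("detect", "diagnose"),
    ("diagnose", "retrain"),
    ("retrain", "backtest"),
    ("backtest", "shadow"),
    ("shadow", "promote"),
]


def run_pipeline(nodes, state, edges=EDGES, start="ingest"):
    """Call each node in edge order, threading the state through.

    `nodes` maps a node name to a function state -> state.
    """
    successor = dict(edges)  # each node has at most one outgoing edge here
    name = start
    while name is not None:
        state = nodes[name](state)
        name = successor.get(name)  # None after the last node
    return state
```

This is mainly useful in tests: each node can be replaced by a stub, and the final state shows which nodes ran and in what order.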

Built for production. Ready to run.