Predictive degradation detection module — monitors server telemetry (CPU, memory, HTTP 500 rate, network latency) and flags early-stage degradation before a full outage, triggering a reroute signal for the traffic router.
Model: IsolationForest (unsupervised anomaly detection), random_state=42, n_jobs=1
Threshold: -0.500710 (58th percentile of degrading-segment scores; the most aggressive cutoff that still keeps false positives on normal traffic low: 86 of 3,600 readings, about 2.4%, in the confusion matrix below)
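
The threshold choice can be sketched as follows. This is a minimal illustration and not the project's `train.py`: the synthetic data, the feature means, and the rule "flag a reading when its score is below the threshold" are all assumptions. It relies only on the documented scikit-learn behavior that `score_samples` returns lower values for more anomalous points.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical stand-in data with 4 features: CPU, memory, error rate, latency.
baseline_mean = np.array([40.0, 50.0, 0.5, 80.0])
baseline_std = np.array([5.0, 5.0, 0.2, 10.0])
normal = rng.normal(baseline_mean, baseline_std, size=(600, 4))

# Degrading rows drift linearly away from the baseline (a ramp, not a step).
ramp = np.linspace(0.0, 1.0, 300)[:, None]
drift = np.array([40.0, 30.0, 5.0, 200.0])
degrading = rng.normal(baseline_mean, baseline_std, size=(300, 4)) + ramp * drift

# Assumption: the forest is fitted on the full history (normal + degrading).
model = IsolationForest(random_state=42, n_jobs=1)
model.fit(np.vstack([normal, degrading]))

# Lower score_samples values mean "more anomalous".
normal_scores = model.score_samples(normal)
degrading_scores = model.score_samples(degrading)

# Take the 58th percentile of degrading scores as the cutoff, so roughly
# 58% of degrading readings fall below it and get flagged.
threshold = np.percentile(degrading_scores, 58)
recall = float(np.mean(degrading_scores < threshold))
false_positive_rate = float(np.mean(normal_scores < threshold))

print(f"threshold={threshold:.6f} recall={recall:.3f} fpr={false_positive_rate:.3f}")
```

Raising the percentile flags more degrading readings but also pulls more normal readings under the cutoff, which is the precision/recall trade-off the reported threshold settles.
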

| | Predicted Normal | Predicted Degrading |
|---|---|---|
| Actual Normal (3,600) | 3,514 (TN) | 86 (FP) |
| Actual Degrading (1,800) | 756 (FN) | 1,044 (TP) |

| Metric | Value | What it means |
|---|---|---|
| Precision (degrading) | 92.4% | When we call it "degrading," we're right 92% of the time — few false alarms |
| Recall (degrading) | 58.0% | We catch 58% of all degrading readings — biased toward precision over recall to avoid rerouting healthy traffic |
| F1 | 0.713 | Balanced measure of precision/recall |
| Accuracy | 84.4% | Overall correctness (less meaningful here — classes are imbalanced 2:1 normal:degrading) |
| Detection point | ~42% through ramp | Reroute fires with over half the degradation window still remaining before critical state |
| Inference latency | Sub-millisecond | No I/O or DataFrame overhead — model + scaler loaded once at startup |
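
The precision, recall, F1, and accuracy figures in the table follow directly from the four confusion-matrix counts. The short check below recomputes them; the variable names are illustrative only.

```python
# Counts taken from the confusion matrix above.
tn, fp = 3514, 86     # actual normal (3,600 readings)
fn, tp = 756, 1044    # actual degrading (1,800 readings)

precision = tp / (tp + fp)                 # share of "degrading" calls that are right
recall = tp / (tp + fn)                    # share of degrading readings that are caught
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)
false_positive_rate = fp / (fp + tn)       # share of normal readings flagged

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f} fpr={false_positive_rate:.3f}")
```
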

simulator.py → historical_logs.csv → train.py → model.joblib → app.py (FastAPI /predict) → demo_client.py

- `simulator.py`: generates synthetic telemetry, a healthy baseline segment followed by a smooth, gradual degradation ramp (not a step change), so early detection is possible.
- `train.py`: trains IsolationForest on 4 raw features (CPU, memory, error rate, latency), tunes the threshold against the labeled degrading segment, and saves `model.joblib`.
- `app.py`: FastAPI service. Loads the model once at startup. `POST /predict` returns `{"status": "healthy" | "degrading", "health_score": <float 0-1>, "action": "none" | "reroute"}`.
- `demo_client.py`: streams simulated live telemetry to the API once per second, prints color-coded status, and fires a one-time reroute banner when `action == "reroute"`.
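
The response shape and the one-time reroute banner can be sketched in plain Python. This is not the code in `app.py` or `demo_client.py`: `build_response`, `RerouteLatch`, the `score < threshold` rule, and the `1 + score` mapping to `health_score` are all hypothetical. The mapping only leans on the fact that IsolationForest's `score_samples` values lie between -1 and 0.

```python
THRESHOLD = -0.500710  # cutoff quoted in this README


def build_response(raw_score: float, threshold: float = THRESHOLD) -> dict:
    """Map a raw anomaly score to the documented /predict response shape."""
    degrading = raw_score < threshold  # assumption: lower score = less healthy
    # Assumed mapping: shift the [-1, 0] score range onto [0, 1] and clip.
    health_score = min(1.0, max(0.0, 1.0 + raw_score))
    return {
        "status": "degrading" if degrading else "healthy",
        "health_score": round(health_score, 4),
        "action": "reroute" if degrading else "none",
    }


class RerouteLatch:
    """Client-side helper: announce the reroute only the first time it appears."""

    def __init__(self) -> None:
        self.fired = False

    def should_announce(self, response: dict) -> bool:
        if response["action"] == "reroute" and not self.fired:
            self.fired = True
            return True
        return False


latch = RerouteLatch()
for score in (-0.40, -0.60, -0.70):
    response = build_response(score)
    if latch.should_announce(response):
        print("REROUTE:", response)
    else:
        print(response["status"], response["health_score"])
```

Keeping the latch in the client means the API stays stateless: every degrading reading still returns `"action": "reroute"`, and the client decides how often to surface it.
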

    python3 simulator.py     # generates historical_logs.csv
    python3 train.py         # trains model, saves model.joblib
    python3 app.py           # starts API on :8000

    # in a second terminal
    python3 demo_client.py
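
To try the API by hand instead of through `demo_client.py`, a single reading can be posted with the standard library. The JSON field names (`cpu`, `memory`, `error_rate`, `latency_ms`) and the `localhost` host are assumptions; check the request model in `app.py` for the real schema.

```python
import json
import urllib.request

# Port 8000 comes from the run steps; the host is an assumption.
API_URL = "http://localhost:8000/predict"


def build_predict_request(reading: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request for one telemetry reading."""
    body = json.dumps(reading).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_reading(reading: dict) -> dict:
    """Send the reading and decode the JSON reply (requires the API to be running)."""
    with urllib.request.urlopen(build_predict_request(reading), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical field names and values, for illustration only.
sample_reading = {"cpu": 41.5, "memory": 52.0, "error_rate": 0.4, "latency_ms": 85.0}
request = build_predict_request(sample_reading)

# With the API running in another terminal:
# print(send_reading(sample_reading))
```
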