Blog·02/03/2026

Drift: how to detect that your model is degrading

The model does not change. The data does. And if nobody is watching, the model can be giving bad predictions for weeks without anyone knowing. How to monitor drift in practice.

Drift is silent degradation: the model stays the same, the data moves away from what it was trained on, and predictions gradually get worse. Without monitoring, nobody detects it until the business notices, and by then the damage is done.

Types of drift that matter. Data drift (or covariate shift): the distribution of the input variables, P(x), changes, but the relationship between inputs and label, P(y | x), stays the same. Concept drift: the relationship between the input variables and the target label changes (for example, the behaviour of customers who default changes after an economic crisis). The first is easier to detect because it only needs the inputs; the second requires recent labelled data to confirm.

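PSI as a practical drift metric. The Population Stability Index (PSI) measures how far the distribution of a variable in production has moved from its training reference profile. It is calculated by comparing two histograms built on the same bins: for each bin, PSI_bin = (actual - expected) * ln(actual / expected), where actual and expected are the fractions of production and training observations that fall in that bin, and the total PSI is the sum over all bins. Empty bins need a small floor value, otherwise the logarithm is undefined. As a common rule of thumb, values below 0.1 indicate stable distributions. Between 0.1 and 0.2, there is moderate drift worth investigating. Above 0.2, the distribution has changed significantly and the model has probably degraded, although confirming the loss in performance still requires labelled data.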
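The PSI calculation can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`quantile_edges`, `bin_fractions`, `psi`), the choice of ten quantile bins taken from the reference sample, and the `1e-6` floor for empty bins are all assumptions made for the example.

```python
import math
from bisect import bisect_right


def quantile_edges(reference, n_bins=10):
    """Interior bin edges placed at the quantiles of the reference sample."""
    ordered = sorted(reference)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values in each bin; empty bins are floored at eps."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    # The floor avoids log(0) and division by zero for empty bins.
    return [max(c / total, eps) for c in counts]


def psi(reference, production, n_bins=10):
    """Sum over bins of (actual - expected) * ln(actual / expected)."""
    edges = quantile_edges(reference, n_bins)
    expected = bin_fractions(reference, edges)
    actual = bin_fractions(production, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Synthetic data, for illustration only.
reference = list(range(1000))
same = list(range(1000))
shifted = [v + 500 for v in range(1000)]

psi_same = psi(reference, same)        # identical distribution
psi_shifted = psi(reference, shifted)  # half the range has moved out
```

With identical samples the index is zero; with the shifted sample it lands far above the 0.2 threshold. The eps floor inflates the value when a bin empties completely, so very large PSI values are better read as "changed a lot" than compared with each other.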

Which variables to monitor. Not all features matter equally. The most predictive variables in the model (those with highest importance in SHAP or feature_importances_) are the ones that have the most impact if they change. Start by monitoring those, plus the ones that by their nature are most likely to change (demographics, prices, user behaviour).
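As a rough sketch of that selection rule: take the top features by importance, then add the ones you already know to be volatile. The function name `select_monitored`, the feature names and the importance values below are invented for the example; in practice the scores would come from SHAP or from the model's `feature_importances_`.

```python
def select_monitored(importances, volatile=(), top_k=5):
    """Top-k features by importance, plus known-volatile model features."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    selected = ranked[:top_k]
    for name in volatile:
        # Only add features the model actually uses, and only once.
        if name in importances and name not in selected:
            selected.append(name)
    return selected


# Hypothetical importance scores, for illustration only.
importances = {
    "income": 0.41,
    "debt_ratio": 0.27,
    "age": 0.12,
    "region": 0.08,
    "price_index": 0.05,
    "signup_channel": 0.04,
    "device": 0.03,
}

monitored = select_monitored(importances, volatile=["price_index"], top_k=3)
# monitored == ["income", "debt_ratio", "age", "price_index"]
```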

How to build a minimal monitoring system. You need three things: (1) a reference profile built on the training data (percentiles and histograms for numeric features, frequencies for categorical ones); (2) a job that runs periodically on recent traffic (the last 24 hours, the last week); and (3) per-feature drift metrics exposed to an alerting system. With Prometheus and Grafana, the drift of each feature is visible in a dashboard and you can configure an alert that fires when it exceeds the threshold.
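The three pieces can be sketched with the standard library alone. Treat this as an outline under stated assumptions rather than a production design: it handles numeric features only, the helper names (`build_profile`, `drift_report`, `to_prometheus_text`) and the metric name `feature_drift_psi` are made up for the example, and the last function only renders text in the Prometheus exposition format. Serving that text to a scraper, or using the official client library instead, is left out.

```python
import json
import math
from bisect import bisect_right

EPS = 1e-6  # floor for empty bins, avoids log(0)


def fractions(values, edges):
    """Fraction of values in each bin defined by the interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [max(c / len(values), EPS) for c in counts]


def build_profile(training_rows, features, n_bins=10):
    """Reference profile: per-feature quantile edges and expected fractions."""
    profile = {}
    for name in features:
        ordered = sorted(row[name] for row in training_rows)
        n = len(ordered)
        edges = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
        profile[name] = {"edges": edges, "expected": fractions(ordered, edges)}
    return profile


def drift_report(profile, recent_rows):
    """PSI per feature for one batch of recent traffic."""
    report = {}
    for name, ref in profile.items():
        actual = fractions([row[name] for row in recent_rows], ref["edges"])
        report[name] = sum(
            (a - e) * math.log(a / e) for a, e in zip(actual, ref["expected"])
        )
    return report


def to_prometheus_text(report):
    """Render the report as Prometheus text exposition lines."""
    lines = ["# TYPE feature_drift_psi gauge"]
    for name, value in sorted(report.items()):
        lines.append(f'feature_drift_psi{{feature="{name}"}} {value:.6f}')
    return "\n".join(lines) + "\n"


# Synthetic rows, for illustration only: income shifts, age does not.
training = [{"income": i, "age": 20 + i % 50} for i in range(1000)]
recent = [{"income": i + 500, "age": 20 + i % 50} for i in range(1000)]

# The profile is plain JSON, so it can be stored next to the model artifact.
stored = json.dumps(build_profile(training, ["income", "age"]))
report = drift_report(json.loads(stored), recent)
exposition = to_prometheus_text(report)
```

The periodic job would load the stored profile, call `drift_report` on the latest window of traffic and publish the result. The alert rule itself (for example, firing when a feature stays above 0.2) lives in the alerting system, not in this code.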

The difference between detecting drift and acting on it. Detecting data drift is straightforward with PSI. Deciding what to do is harder: retrain with recent data, adjust the decision threshold, or review the data pipeline for errors? The answer depends on the type of drift, how much recent labelled data you have and how much the business has changed. The one option that is not acceptable is failing to detect it.
