Automating post-deployment Data Collection for ML Monitoring

To ensure your models perform as expected, you need to monitor them. And to make sure that monitoring happens 24/7, you might want to automate some things, like smoothly bringing data into the monitoring pipeline.
This blog post introduces the NannyML Cloud SDK, a code interface to most aspects of NannyML Cloud. It makes monitoring ML models easier by allowing you to programmatically create new models, log model inferences, and trigger new monitoring runs whenever you want.
The following image provides a high-level overview of how NannyML Cloud works.
High-level diagram of NannyML Cloud
You can think of the SDK as an alternative to the web component: both communicate with the server through the API, but the SDK does so from Python rather than from JavaScript in a browser.

Installing the NannyML Cloud SDK

The first thing we need to do is install the NannyML Cloud SDK. The package hasn't been published on PyPI yet, so you cannot install it via the regular Python channels. Instead, you'll have to clone the repository and install it from your local copy.
git clone
cd nannyml-cloud-sdk
pip install .


To tell the SDK to send the data over your NannyML Cloud instance, you’ll need the instance’s URL and an API token. You can create an API token on your settings page.
Create an API token on your NannyML Cloud instance settings page
Then, you can add these credentials to your script:
import nannyml_cloud_sdk as nml_sdk

nml_sdk.url = "your instance URL goes here"
nml_sdk.api_token = "api token goes here"
Or, better, set them as environment variables and reference them in your code.
import nannyml_cloud_sdk as nml_sdk
import os

nml_sdk.url = os.environ['NML_SDK_URL']
nml_sdk.api_token = os.environ['NML_SDK_API_TOKEN']
We recommend using an environment variable for the API token. This prevents accidentally leaking any token associated with your personal account when sharing code.
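If the credentials are missing, the first SDK call fails with an authentication error that can be hard to trace back to its cause. Here is a minimal fail-fast sketch; the helper `check_credentials` is our own, not part of the SDK:

```python
import os

# Environment variables the script expects (same names as in the snippet above)
REQUIRED_VARS = ('NML_SDK_URL', 'NML_SDK_API_TOKEN')

def check_credentials(env=os.environ):
    """Return (url, api_token), raising early if either variable is missing or empty."""
    missing = [var for var in REQUIRED_VARS if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env['NML_SDK_URL'], env['NML_SDK_API_TOKEN']
```

Running this check at the top of a scheduled monitoring script turns a cryptic mid-run failure into an immediate, descriptive error.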
Once authenticated, we can start communicating with the cloud instance.

Collecting Data

To illustrate how to log data and create models on NannyML Cloud, we will use the Car Loan Dataset. This dataset is included in the NannyML OSS library and contains monitoring data for a model predicting whether a customer will repay a loan obtained to purchase a car.
We will load the reference and analysis data. Here is a quick recap of what both of these sets are:
  • Reference dataset: NannyML uses the reference set to establish a baseline for model performance and drift detection. Typically, it's the test set together with the model outputs.
    First five rows of the reference set
  • Analysis dataset: the latest production data. NannyML checks whether the model maintains its performance in this period and whether the feature distributions have shifted. Since NannyML can estimate the performance, this set doesn't need to include the ground truth values.
    First five rows of the analysis set
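The difference between the two sets boils down to one column. A minimal illustration with made-up feature values (the column names are hypothetical stand-ins, with 'work_home_actual' as the target column used later in this post):

```python
# Hypothetical single rows from each set. The reference row carries the ground
# truth column; the analysis row only has features and model outputs.
reference_row = {
    'car_value': 39811,
    'salary_range': '40K - 60K',
    'y_pred_proba': 0.99,
    'y_pred': 1,
    'work_home_actual': 1,   # ground truth is available
}
analysis_row = {
    'car_value': 12838,
    'salary_range': '0 - 20K',
    'y_pred_proba': 0.97,
    'y_pred': 1,
    # no 'work_home_actual' key: targets not (yet) available in production
}
```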
# Load a NannyML binary classification dataset to use as an example
import pandas as pd

reference_data = pd.read_csv('')
analysis_data = pd.read_csv('')
We can now use the Schema class together with its from_df method to configure the schema of the model. In this case, we define the problem as 'BINARY_CLASSIFICATION', but other options like 'MULTICLASS_CLASSIFICATION' and 'REGRESSION' are available.
# Inspect schema from the dataset and apply overrides
schema = nml_sdk.Schema.from_df(
    'BINARY_CLASSIFICATION',
    reference_data,
    target_column_name='work_home_actual',
    identifier_column_name='identifier',
)

Setting up the Model

Once we have the model schema, we can create a new model by using the create method. Here is where we define things like the name of the model, how the data should be chunked, the main monitoring performance metric, etc.
Let’s add only the first 25,000 rows of the analysis data; we’ll add the rest later when explaining how to perform continuous monitoring by programmatically adding new analysis data.
# Create model
model = nml_sdk.Model.create(
    name='Blog example model (car loan)',
    schema=schema,
    chunk_period='MONTHLY',
    reference_data=reference_data,
    analysis_data=analysis_data[:25000],
    # we need to set up the target schema if we want to add ground truth in the future
    target_data=pd.DataFrame(columns=["identifier", "work_home_actual"]),
    key_performance_metric='F1',
)
This will create a new model in your NannyML Cloud dashboard.

Ensure continuous monitoring

The previous steps allowed us to monitor an ML model on the first half of the analysis data. In a real-world scenario, the analysis data changes every day, week, etc., as the model makes new predictions. So, how do we add new analysis data to a previously created model?
This is where the SDK comes in handy. Instead of manually adding the new production examples, we can automate the process.
We can load the previous model by searching for it by name. Then, it's a matter of loading the new model predictions, adding them to the model with the add_analysis_data method, and triggering a new monitoring run.
# Find the previous model in NannyML Cloud by name
model, = nml_sdk.Model.list(name='Blog example model (car loan)')

# Add new inferences to NannyML Cloud
new_inferences = analysis_data[25000:]
nml_sdk.Model.add_analysis_data(model['id'], new_inferences)

# Trigger analysis of the new data
nml_sdk.Run.trigger(model['id'])
The new_inferences variable can be a dataset with several new model predictions or even a single prediction.
In this case, we are using the second half of the analysis dataset. It is also worth noting that you can trigger a monitoring run whenever you want (e.g., after adding 1000 observations) by calling the trigger method.
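That "trigger after N observations" pattern can be sketched with a small buffer class. The class below is our own illustration, not part of the SDK; the actual SDK calls are left as comments so the sketch stays self-contained:

```python
# Buffer incoming inferences and only push them (and trigger a monitoring run)
# once a full batch has accumulated.
class InferenceBuffer:
    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.rows = []

    def add(self, row):
        """Store one prediction; flush when the batch is full. Returns True on flush."""
        self.rows.append(row)
        if len(self.rows) >= self.batch_size:
            return self.flush()
        return False

    def flush(self):
        if not self.rows:
            return False
        # nml_sdk.Model.add_analysis_data(model['id'], pd.DataFrame(self.rows))
        # nml_sdk.Run.trigger(model['id'])
        self.rows.clear()
        return True
```

A scheduled job could call `add` for each new prediction and rely on `flush` to push the batch and trigger the run only when enough data has arrived.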
If you now go to the NannyML Cloud dashboard, you should see new monitoring results for the recent periods.

(optional) Adding delayed ground truth data

You may have noticed that all this time we have been monitoring the model without access to ground truth for the analysis period. We set it up this way on purpose because that is how real-life scenarios tend to look; for many ML applications, ground truth becomes available only after a delay, or never at all.
But if ground truth does become available, you can add it to NannyML Cloud with the add_analysis_target_data method of the Model class, and use it to calculate the realized performance and monitor concept shift!
Let’s imagine that in the Car Loan scenario, we've patiently waited for the actual outcomes of the loans to become available. If so, we could add them as delayed_ground_truth to NannyML.
delayed_ground_truth = pd.read_csv('')

# If you have delayed access to ground truth, you can add them to NannyML Cloud
# later. This will match analysis & target datasets using an identifier column.
nml_sdk.Model.add_analysis_target_data(model['id'], delayed_ground_truth)

# Trigger analysis of the new data
nml_sdk.Run.trigger(model['id'])
After doing that, we unlock other features, like concept drift detection and comparing realized vs. estimated performance.
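As the comment in the snippet notes, the analysis and target datasets are matched on the identifier column. A rough sketch of that matching logic with toy data (NannyML Cloud performs this server-side; the dictionaries below are hypothetical):

```python
# Toy predictions logged earlier and ground truth that arrived later.
predictions = [
    {'identifier': 1, 'y_pred': 1},
    {'identifier': 2, 'y_pred': 0},
    {'identifier': 3, 'y_pred': 1},
]
delayed_targets = [
    {'identifier': 1, 'work_home_actual': 1},
    {'identifier': 2, 'work_home_actual': 1},
]

# Join on the identifier: only rows with a known target can contribute
# to realized performance metrics.
target_by_id = {t['identifier']: t['work_home_actual'] for t in delayed_targets}
matched = [
    {**p, 'work_home_actual': target_by_id[p['identifier']]}
    for p in predictions
    if p['identifier'] in target_by_id
]
```

This is why the target schema set up in Model.create needs the identifier column: without it, delayed targets could not be attached to the right predictions.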


We have learned how to use the NannyML Cloud SDK to automate data collection and monitoring runs in NannyML Cloud. If you want to keep exploring what else you can do with the SDK, check out our docs and API reference.

Ready to learn how well your ML models are working?

Join 1100+ other data scientists now!


Written by

Santiago Víquez

Machine Learning Developer Advocate at NannyML