How to Detect Under-Performing Segments in ML Models

Machine Learning (ML) models tend to behave differently across different data segments. Without monitoring each segment, you wouldn’t notice the problem until it’s too late.


Introduction

Machine Learning (ML) models tend to behave differently across different data segments.
Consider a customer churn model. It might show strong performance on average, but averages can mask issues. High-income customers might start churning at higher rates, even though the model predicts churn accurately for other groups.
Without visibility into this segment, you wouldn’t notice the problem until it’s too late. By then, a key customer group could be gone.
This is a major challenge for data scientists. They rely on model predictions to make decisions, but they can’t trust the model unless they understand how it works for different segments of data.
Traditional monitoring tools fail to detect these nuanced patterns. That’s why we’ve added Segmentation to NannyML.
In this blog, you’ll learn how to create segments, compare performance across them, and gain a clearer understanding of how your model is performing.

Why You Need Segmentation

Segmentation in NannyML Cloud lets you break your data into smaller, more manageable groups (called segments) and analyze them separately. Segments refer to distinct groups within your data, such as customer demographics, product categories, or income ranges.
You can choose any column from your dataset for segmentation, and the tool will automatically create segments based on the unique values in that column.
By understanding how each segment behaves, you gain a clearer picture of how different data groups influence your model’s performance.
Even if overall performance looks healthy, specific segments may show early signs of degradation. Detecting these issues early allows you to intervene before they become bigger problems.
When you identify where performance is dropping, you can direct your efforts more precisely and make targeted adjustments that lead to better outcomes.

How to Add Segments in NannyML Cloud

There are two ways to configure segmentation for a model:

For a New Model

When setting up a new model, you can choose columns for segmentation by using the "Segment by" dropdown menu or flagging specific columns directly.
Adding segmentation in a new model configuration

For an Existing Model

For an existing model, you can still set up segments. Simply go to the model settings, navigate to the Schema tab, and configure segments as described above.
Adding segmentation in an existing model configuration

Types of Segments

Creating Segments from Categorical Features

Suppose your dataset has a column like "product category" with distinct values such as "electronics," "clothing," and "furniture." By selecting "product category" as the segmentation column, NannyML will automatically create separate segments for each unique value within this column.
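To make the idea of "one segment per unique value" concrete, here is a minimal pandas sketch. It is not NannyML's implementation; the column name `product_category` and the toy rows are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "product_category": ["electronics", "clothing", "furniture",
                         "electronics", "clothing"],
    "y_pred": [1, 0, 0, 0, 1],
})

# Each unique value of the chosen column defines one segment.
segments = {name: group for name, group in df.groupby("product_category")}

# Number of rows that fall into each segment.
segment_sizes = {name: len(group) for name, group in segments.items()}
print(segment_sizes)
```

Checking segment sizes first is a useful habit: a segment with very few rows will produce noisy metrics.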

Segmenting by Ranges in Continuous Columns

For continuous features like "income," you can use discretization to create meaningful segments. To segment by income ranges, you could introduce categories such as "20k-40k," "40k-60k," and so on, and monitor these ranges separately. The choice of ranges depends on the business context, for example whether the aim is to track changes in behavior across income brackets or to identify specific economic groups.
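One way to build such a range column before uploading the data is `pandas.cut`. This is a sketch under assumptions: the bin edges, labels, and the `income` values below are made up, and you would pick edges that fit your own business context.

```python
import pandas as pd

# Hypothetical income values for illustration.
incomes = pd.Series([25_000, 39_999, 40_000, 58_500, 75_000], name="income")

# Assumed bin edges and labels; adjust to your own context.
bin_edges = [20_000, 40_000, 60_000, 80_000]
bin_labels = ["20k-40k", "40k-60k", "60k-80k"]

# right=False makes each bin include its lower edge: [20k, 40k), [40k, 60k), ...
income_range = pd.cut(incomes, bins=bin_edges, labels=bin_labels, right=False)
print(income_range.tolist())
```

Values outside the outermost edges become missing, so it is worth choosing edges that cover the full range you expect in production.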

Combining Features for Custom Segments

For more advanced segmentations, you may want to combine multiple columns. For instance, if your dataset includes "region" and "product type," selecting each column individually would create separate segments for each region and each product type. However, to track segments like "North America-electronics" or "Europe-clothing," you would need to create a new feature that combines both columns. This custom feature engineering step provides deeper insights into how different factors interact within your data.
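A combined feature like this can be created with a simple string concatenation. The sketch below assumes columns named `region` and `product_type`; the new column name `region_product` is hypothetical.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration.
df = pd.DataFrame({
    "region": ["North America", "Europe", "Europe", "North America"],
    "product_type": ["electronics", "clothing", "electronics", "electronics"],
})

# Join the two columns into one categorical feature to segment on.
df["region_product"] = df["region"] + "-" + df["product_type"]

# Count rows per combined segment to spot combinations that are too small.
combined_counts = df["region_product"].value_counts().to_dict()
print(combined_counts)
```

Keep in mind that the number of combined segments grows with the product of the unique values in each column, so some combinations may end up with too few rows to monitor reliably.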
Once added, you can select the segments you are interested in under Filter > Segments.
Performance Metric filtered by Segments.
👉
Note: Segmentation is currently based on user-defined groups, relying on criteria you set up initially. We’re working on adding algorithm-driven segmentation to detect patterns and shifts automatically where performance is lagging.

Let’s see an example…

Now that we have segments, let's see how we can actually detect issues in the model predictions.
Let’s consider the hotel booking dataset, which tracks various details about reservations, cancellations, and customer profiles. You’ve deployed an ML model to predict whether a booking will be canceled (is_canceled). This prediction helps optimize inventory and manage revenue.
Accuracy throughout the data
The accuracy of the model is estimated to drop. To understand where the model is failing the most, we can break down model performance and analyze different segment groups.
This dataset includes a market_segment column that groups bookings into categories such as aviation, complementary, corporate, direct, and groups. Segmenting by this column gives the following view:
Accuracy across various segments
Accuracy differs across segments. It is estimated to decrease for all segments other than complementary bookings. This insight is a good starting point for further investigation. You can conduct domain-specific research, look into seasonal trends, and review the data distribution in production.
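The same kind of per-segment comparison can be sketched offline with pandas. Note the assumptions: the charts above show estimated accuracy, whereas this sketch computes realized accuracy and therefore assumes the true `is_canceled` labels are already available. The `y_pred` column, the helper `accuracy_by_segment`, and all rows are made up for illustration.

```python
import pandas as pd

def accuracy_by_segment(df, segment_col, y_true="is_canceled", y_pred="y_pred"):
    """Share of correct predictions within each segment (hypothetical helper)."""
    correct = df[y_true] == df[y_pred]
    return correct.groupby(df[segment_col]).mean()

# Toy reference period: the model does well on corporate bookings.
reference = pd.DataFrame({
    "market_segment": ["Corporate"] * 4 + ["Complementary"] * 2,
    "is_canceled":    [1, 0, 1, 0, 0, 1],
    "y_pred":         [1, 0, 1, 0, 0, 0],
})

# Toy analysis period: corporate accuracy drops, complementary is unchanged.
analysis = pd.DataFrame({
    "market_segment": ["Corporate"] * 4 + ["Complementary"] * 2,
    "is_canceled":    [1, 0, 1, 0, 0, 1],
    "y_pred":         [0, 1, 1, 0, 0, 0],
})

ref_acc = accuracy_by_segment(reference, "market_segment")
ana_acc = accuracy_by_segment(analysis, "market_segment")

# Most negative change first, so the worst-hit segment is at the top.
change = (ana_acc - ref_acc).sort_values()
worst_segment = change.index[0]
print(change)
```

Sorting by the change rather than by the raw accuracy highlights segments that degraded, which is usually a better lead than segments that were always hard to predict.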

Taking action when segments underperform

When a specific segment begins to underperform, here's how you can take action to address the issue:
  • Set Up Custom Metrics for Segments: Design metrics that focus on the specific behavior of the underperforming segment. For example, if corporate bookings have higher cancellation rates, build metrics that track cancellations for this group. Learn how to set up custom metrics.
  • Refine Business Strategies: If the model struggles with predicting cancellations for aviation bookings, look at external factors like flight schedule changes or seasonal travel trends. Use this knowledge to adjust both the model and your approach to bookings.
  • Retrain the Model: Retrain the model with updated data that better represents the underperforming segment. Fine-tune it so that it works better for specific groups and reflects what’s happening in production data. Learn how to retrain your model the right way.
📌
Everything that applies to the entire data applies to a segment.
You can then monitor performance metrics and look out for signs of covariate shift, concept drift, and data quality issues for each distinct category within every column.
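As one small example of a per-segment data quality check, the sketch below compares the missing-value rate of a feature between a reference and an analysis period. It is an illustration only: the `lead_time` feature, the helper `missing_rate_by_segment`, the toy rows, and the 0.1 alert threshold are all assumptions.

```python
import pandas as pd

def missing_rate_by_segment(df, segment_col, feature):
    """Fraction of missing values of `feature` in each segment (hypothetical helper)."""
    return df[feature].isna().groupby(df[segment_col]).mean()

# Toy reference period with no missing values.
reference = pd.DataFrame({
    "market_segment": ["Direct", "Direct", "Groups", "Groups"],
    "lead_time": [10.0, 35.0, 60.0, 90.0],
})

# Toy analysis period where one "Groups" booking lost its lead_time.
analysis = pd.DataFrame({
    "market_segment": ["Direct", "Direct", "Groups", "Groups"],
    "lead_time": [12.0, 30.0, None, 85.0],
})

increase = (
    missing_rate_by_segment(analysis, "market_segment", "lead_time")
    - missing_rate_by_segment(reference, "market_segment", "lead_time")
)

# Assumed alert threshold: flag segments whose missing rate rose by more than 0.1.
flagged = increase[increase > 0.1].index.tolist()
print(flagged)
```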

Conclusion

Meme by Author
This blog highlighted the importance of segment-level monitoring in understanding how your ML models behave across different data groups.
With this feature, you can see where the issues originate and take action before the model’s performance impacts the business.
 
Segmentation is just one example of how advanced ML monitoring can give you deeper insights into model performance. Along with tools for detecting data drift and shifts in performance, you can gain a more complete understanding of how your models behave after deployment.
If you’re facing challenges in model monitoring or want to explore how these capabilities can be applied to your use cases, we encourage you to schedule a demo and speak directly with the founders.

Written by

Kavita Rana

Data Science Intern at NannyML