Track Model Drift on Unstructured Data

Discover how to effectively monitor drift in your LLM and ML models that work with unstructured data using Fiddler's powerful tracking capabilities. 

In this product tour, we walk through tracking drift for models like image classifiers by leveraging image embeddings and text embeddings. Learn how these embeddings provide an aggregate drift metric, enable deeper insights through clustering, and pinpoint the sources of drift, whether that is a change in sensors, a change in lighting, or the introduction of new data. See how Fiddler simplifies drift analysis for language models, providing granular insights without sampling.

Thumbnail image for product tour video titled 'Track Model Drift on Unstructured Data with Fiddler'
Video transcript

[00:00:00] I'm here to share how you can leverage Fiddler to track drift on your models that run on unstructured data, like this image classifier, which takes images as input and produces an output label based on the classes it has been trained on.

[00:00:14] In this case, instead of leveraging the image itself, we leverage the image embeddings for tracking drift.

[00:00:20] These embeddings capture a lot of meaning and open up possibilities: they give your team access to an aggregate drift value in a metric of your choice, and they also let you dig deeper by breaking that drift down across different clusters. These clusters will help you identify exactly where the source of drift lies.

[00:00:42] Is it a change in the sensors, the lighting, or, in a lot of cases, just the introduction of new images?
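
To make the idea of an aggregate drift value plus a per-cluster breakdown concrete, here is a minimal sketch. It is not Fiddler's implementation: the choice of k-means for binning, Jensen-Shannon divergence as the metric, and every function name below are assumptions made for illustration, and the 2-D synthetic vectors merely stand in for real image embeddings.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def histogram(points, centroids):
    """Fraction of points that fall into each cluster."""
    counts = [0] * len(centroids)
    for p in points:
        counts[nearest(p, centroids)] += 1
    return [c / len(points) for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, so the value lies in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Synthetic stand-ins for embeddings: two well-separated groups of vectors.
rng = random.Random(42)

def blob(cx, cy, n):
    return [[rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)] for _ in range(n)]

baseline = blob(0, 0, 200) + blob(5, 5, 200)    # balanced reference data
production = blob(0, 0, 50) + blob(5, 5, 350)   # the mix has shifted

# Clusters are fitted on the baseline only, then reused for production.
centroids = kmeans(baseline, k=2)
base_hist = histogram(baseline, centroids)
prod_hist = histogram(production, centroids)

drift = js_divergence(base_hist, prod_hist)           # single aggregate value
shift = [q - p for p, q in zip(base_hist, prod_hist)]  # per-cluster change

print(f"aggregate drift: {drift:.3f}")
for i, s in enumerate(shift):
    print(f"cluster {i}: share changed by {s:+.3f}")
```

The aggregate number answers whether the data has drifted at all; the per-cluster shares point to which region of the embedding space gained or lost traffic, which is where one would start looking for causes such as a sensor or lighting change.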

[00:00:48] We supercharged this ability to show you granularity in drift when working with language models. Here, your team is just responsible for pushing the raw input and output string data to Fiddler. Behind the scenes, we can generate text embeddings using a model of your choice to provide you with a similar aggregate value, without any sampling, for the drift that your model is experiencing on a given day.

[00:01:16] But just like the last example, we can make it very granular by clustering this data into a number of bins, again of your choice, and tagging them with keywords using the TF-IDF algorithm to help you identify which keywords, phrases, or topics are resulting in that drift.
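
The keyword-tagging step can be sketched as follows. This is an illustrative version only, assuming each cluster is treated as one document and scored with the textbook TF-IDF formula; the function names, the tokenizer, and the sample prompts are hypothetical and do not come from Fiddler.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_keywords(clusters, n=3):
    """Return the n highest TF-IDF terms for each cluster.

    clusters maps a cluster id to the list of raw strings assigned to it.
    Each cluster is treated as a single document.
    """
    term_counts = {
        cid: Counter(tok for text in texts for tok in tokenize(text))
        for cid, texts in clusters.items()
    }
    # Document frequency: in how many clusters does each term appear?
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    n_docs = len(clusters)

    result = {}
    for cid, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            # tf * idf; a term present in every cluster gets idf = log(1) = 0.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        result[cid] = sorted(scores, key=lambda t: (-scores[t], t))[:n]
    return result

# Hypothetical prompts already grouped into two clusters.
clusters = {
    0: ["how do I reset my password",
        "password reset link not working",
        "reset password email"],
    1: ["refund for my order",
        "where is my refund",
        "order refund status"],
}

tags = top_keywords(clusters, n=2)
print(tags)  # {0: ['password', 'reset'], 1: ['refund', 'order']}
```

Words shared by every cluster, such as "my" here, score zero and drop out, so the tags that remain are the terms that distinguish a drifting cluster from the rest.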

[00:01:35] Overall, this gives your team an aggregated view to make better decisions and improve your models over time.