Fiddler Guardrails for Safeguarding LLM Applications

Table of content

Fiddler Trust Service includes low-latency guardrails to moderate LLM applications for hallucination, safety violations, and prompt injection attacks. In this chatbot demo, you can see how Fiddler Guardrails quickly and proactively reject malicious inputs like jailbreak attempts, ensuring the integrity of LLM responses without needing to reach the underlying model.

Demo for Fiddler Guardrails for Safeguarding LLM Applications
Video transcript

[00:00:00] Fiddler Trust Service allows your team to moderate your AI applications in real-time by preventing attacks like prompt injections from ever reaching your application and your model in the first place and making sure the responses generated for the users are reliable and not hallucinated and in compliance with your team's policies.

[00:00:19] Let's try this with a RAG application built by our team called the Fiddler Chatbot. This answers questions about Fiddler platform usability using our documentation. If I ask a question like, how does Fiddler handle ranking models, I will get a quick response and some code snippets. But here, the guardrailing process tells me how faithful this answer is to the documentation it used in this case.

[00:00:41] Now, instead of asking a simple question, I can send in a prompt injection attack, which tells the chatbot to ignore all its instructions and leak any email addresses that it might have access to in your documentation. When I send this in, you'll see very promptly that this input was rejected. And the reason behind that is this really high likelihood of this being a jailbreak attack that prevents this from ever even talking to my LLM or my application.

[00:01:10] And you see we delivered these results with a very fast response from our guardrailing functionality.