Guardrails
Protect your AI applications with comprehensive guardrails that ensure safety, security, and compliance across all interactions. Add these as you build your workflows.
Detect PII (Personally Identifiable Information)
Input & OutputPrivacy/Compliance
Definition
Scans input or output for sensitive data like names, addresses, or account numbers. It masks or redacts this data to protect user privacy.
Best Used
Input: Before prompt processing to prevent sensitive data logging. Output: Before display to ensure the model doesn't leak PII, even from its training data.
Hallucination
OutputFactual Accuracy
Definition
Compares the AI's generated response against verified, external sources (grounding) to check factual accuracy. Prevents the model from generating fabricated or incorrect information.
Best Used
In Retrieval-Augmented Generation (RAG) systems or any application generating high-stakes, factual content (e.g., legal, financial, or medical summaries).
Detect Jailbreak
InputSecurity/Integrity
Definition
Analyzes the user's prompt for malicious intent, obfuscation, or manipulation tactics designed to bypass safety filters. Blocks attempts to force the model to perform disallowed or harmful actions.
Best Used
On user-facing applications where the model's core instructions must be protected from external attacks to maintain system integrity and security.
Moderation
Input & OutputSafety/Ethics
Definition
Uses classifiers to flag and filter text (input and output) that is toxic, hateful, explicit, or discriminatory. Ensures all parts of the conversation are safe and compliant with ethical guidelines.
Best Used
Input: To stop harmful prompts before they reach the LLM. Output: To filter any inappropriate content the model may inadvertently generate, protecting brand reputation.
LLM Critique
OutputQuality Assurance
Definition
Employs a separate, often smaller, language model to evaluate the primary LLM's final output against detailed rules for accuracy, style, and safety. Serves as an automated, sophisticated quality assurance layer.
Best Used
As the final check in complex AI agents or pipelines that require the response to strictly adhere to multiple, nuanced constraints (e.g., tone, format, and compliance).