Guardrails
Overview
Guardrails constrain agent behavior: they check requests and responses against rules you define, helping agents operate within acceptable boundaries and follow security best practices.
Quick Start
csharp
using LlmTornado.Agents;

// Structured output the guardrail agent is asked to produce.
public struct IsMath
{
    public string Reasoning { get; set; }
    public bool IsMathRequest { get; set; }
}

// Input guardrail: a small agent classifies the request before the main agent runs.
// `api` is assumed to be an already configured client instance.
async ValueTask<GuardRailFunctionOutput> MathGuardRail(string? input = "")
{
    TornadoAgent mathGuardrail = new TornadoAgent(
        api,
        ChatModel.OpenAi.Gpt41.V41Mini,
        instructions: "Check if the user is asking you a Math related question.",
        outputSchema: typeof(IsMath));

    Conversation result = await TornadoRunner.RunAsync(mathGuardrail, input);
    IsMath? isMath = result.Messages.Last().Content.JsonDecode<IsMath>();

    // The second argument is true when the request is NOT math-related,
    // which is the condition this guardrail rejects.
    return new GuardRailFunctionOutput(isMath?.Reasoning ?? "", !isMath?.IsMathRequest ?? false);
}

TornadoAgent agent = new TornadoAgent(
    client: api,
    model: ChatModel.OpenAi.Gpt41.V41,
    instructions: "You are a useful assistant"
);
try
{
    // "What is the weather?" is not a math question, so the guardrail is expected to reject it.
    Conversation result = await agent.RunAsync("What is the weather?", inputGuardRailFunction: MathGuardRail);
    Console.WriteLine(result.Messages.Last().Content);
}
catch (GuardRailTriggerException guardRailEx)
{
    Console.WriteLine(guardRailEx.Message);
}

Other considerations
Content Filtering
- Reject harmful or inappropriate requests
- Filter sensitive information from outputs
- Validate input before processing
- Sanitize responses
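As a rough illustration of the input-side checks listed above, the sketch below uses only the .NET standard library. The `InputChecks` class, its method names, and the two patterns are hypothetical and not part of LlmTornado; real deployments would likely need broader detection than two regular expressions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Usage (top-level statements must come before type declarations).
Console.WriteLine(InputChecks.ContainsSensitiveInfo("My SSN is 123-45-6789")); // True
Console.WriteLine(InputChecks.ContainsSensitiveInfo("What is 2 + 2?"));        // False
Console.WriteLine(InputChecks.Sanitize("  hello\u0000 world  "));              // hello world

// Hypothetical helper; the patterns are illustrative, not exhaustive.
public static class InputChecks
{
    // US Social Security number written as 123-45-6789.
    private static readonly Regex Ssn =
        new Regex(@"\b\d{3}-\d{2}-\d{4}\b", RegexOptions.Compiled);

    // 16 digits, optionally grouped in fours by spaces or hyphens.
    private static readonly Regex Card =
        new Regex(@"\b(?:\d{4}[ -]?){3}\d{4}\b", RegexOptions.Compiled);

    // True when the input matches any of the sensitive patterns.
    public static bool ContainsSensitiveInfo(string input) =>
        Ssn.IsMatch(input) || Card.IsMatch(input);

    // Drops control characters (keeping newlines and tabs) and trims whitespace.
    public static string Sanitize(string input) =>
        new string(input.Where(c => !char.IsControl(c) || c == '\n' || c == '\t').ToArray()).Trim();
}
```

A check like `ContainsSensitiveInfo` can run before the request ever reaches the agent, while `Sanitize` is cheap enough to apply to every input.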
Output Validation
- Check responses meet quality standards
- Verify structured output schemas
- Ensure factual accuracy when possible
- Validate against business rules
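Schema and business-rule checks on structured output can be sketched with `System.Text.Json` alone. The `MathCheck` record and `OutputValidator` class below are hypothetical stand-ins, and the rules (a required flag and a non-empty explanation) are assumptions chosen for illustration rather than anything the library enforces.

```csharp
using System;
using System.Text.Json;

// Example payloads standing in for raw model output.
string ok = "{\"Reasoning\":\"asks for a sum\",\"IsMathRequest\":true}";
string missingField = "{\"Reasoning\":\"asks for a sum\"}";
string malformed = "not json";

Console.WriteLine(OutputValidator.TryValidate(ok, out _, out string? e1) ? "valid" : e1);           // valid
Console.WriteLine(OutputValidator.TryValidate(missingField, out _, out string? e2) ? "valid" : e2); // Missing IsMathRequest
Console.WriteLine(OutputValidator.TryValidate(malformed, out _, out string? e3) ? "valid" : e3);

// Nullable members let the validator tell "missing" apart from "false" or "".
public sealed record MathCheck(string? Reasoning, bool? IsMathRequest);

public static class OutputValidator
{
    public static bool TryValidate(string json, out MathCheck? result, out string? error)
    {
        result = null;
        error = null;
        try
        {
            // Step 1: the payload must parse as JSON of the expected shape.
            result = JsonSerializer.Deserialize<MathCheck>(json);
        }
        catch (JsonException ex)
        {
            error = "Malformed JSON: " + ex.Message;
            return false;
        }

        // Step 2: assumed business rules on the parsed value.
        if (result is null) { error = "Empty payload"; return false; }
        if (result.IsMathRequest is null) { error = "Missing IsMathRequest"; return false; }
        if (string.IsNullOrWhiteSpace(result.Reasoning)) { error = "Reasoning must not be empty"; return false; }
        return true;
    }
}
```

Returning an error string instead of throwing keeps the caller free to retry, fall back, or surface the reason to a log.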
Implementation
Tool Permissions
csharp
// Maps tool names to whether a call must be approved before it runs.
Dictionary<string, bool> permissions = new Dictionary<string, bool>
{
    ["send_email"] = true,   // Requires permission
    ["read_file"] = false,   // No permission needed
    ["delete_data"] = true   // Requires permission
};

// `api` and `model` are assumed to be defined elsewhere.
TornadoAgent agent = new TornadoAgent(
    api, model,
    toolPermissionRequired: permissions
);

Input Validation
csharp
async Task<Conversation> SafeRunAsync(TornadoAgent agent, string userInput)
{
    // Validate input before it reaches the agent.
    // ContainsSensitiveInfo is a helper you supply; it is not defined here.
    if (ContainsSensitiveInfo(userInput))
    {
        throw new InvalidOperationException("Input contains sensitive information");
    }

    // Reject oversized input.
    if (userInput.Length > 10000)
    {
        throw new InvalidOperationException("Input too long");
    }

    return await agent.RunAsync(userInput);
}

Output Filtering
csharp
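using System.Text.RegularExpressions;

async Task<string> FilteredResponse(TornadoAgent agent, string input)
{
    Conversation result = await agent.RunAsync(input);
    // Fall back to an empty string if the last message has no text content.
    string response = result.Messages.Last().Content ?? string.Empty;

    // Mask sensitive patterns before returning the response.
    response = Regex.Replace(response, @"\b\d{3}-\d{2}-\d{4}\b", "***-**-****"); // SSN
    // Matches only unseparated 16-digit numbers.
    response = Regex.Replace(response, @"\b\d{16}\b", "****-****-****-****"); // Credit card
    return response;
}

Best Practices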
- Define clear boundaries in instructions
- Implement multiple layers of protection
- Log and monitor agent behavior
- Test guardrails thoroughly
- Update guardrails as threats evolve
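To make the "multiple layers" and "log and monitor" points concrete, here is a minimal sketch of a layered check pipeline in plain C#. The layer names, the rules, and the `Evaluate` function are hypothetical; in practice each layer could wrap a guardrail like the ones shown earlier, and the console output would go to a real logger.

```csharp
using System;
using System.Collections.Generic;

// Each layer returns null when the input passes, or a reason when it is rejected.
var layers = new List<(string Name, Func<string, string?> Check)>
{
    ("length", s => s.Length > 10000 ? "input too long" : null),
    ("empty", s => string.IsNullOrWhiteSpace(s) ? "input is empty" : null),
    ("blocklist", s => s.Contains("password", StringComparison.OrdinalIgnoreCase)
        ? "mentions a credential"
        : null),
};

// Runs the layers in order and stops at the first one that rejects the input.
string? Evaluate(string input)
{
    foreach (var (name, check) in layers)
    {
        string? reason = check(input);
        string status = reason is null ? "pass" : "block: " + reason;

        // Log every decision so guardrail behavior can be monitored over time.
        Console.WriteLine($"[guardrail:{name}] {status}");

        if (reason is not null)
        {
            return reason;
        }
    }
    return null;
}

Console.WriteLine(Evaluate("What is 2 + 2?") ?? "allowed");         // allowed
Console.WriteLine(Evaluate("My password is hunter2") ?? "allowed"); // mentions a credential
```

Keeping each layer as a small named function makes it straightforward to test layers in isolation and to add or retire one as threats change.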