Guardrails

Overview

Guardrails provide safety mechanisms and constraints for agent behavior, ensuring agents operate within acceptable boundaries and follow security best practices.

Quick Start

csharp

using LlmTornado.Agents;

public struct IsMath
{
    public string Reasoning { get; set; }
    public bool IsMathRequest { get; set; }
}
    
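// Input guardrail: runs a small classifier agent and trips when the request is NOT math-related.
// `api` is assumed to be an already configured TornadoApi instance.
async ValueTask<GuardRailFunctionOutput> MathGuardRail(string? input = "")
{
    TornadoAgent mathGuardrail = new TornadoAgent(
        api,
        ChatModel.OpenAi.Gpt41.V41Mini,
        instructions: "Check if the user is asking you a Math related question.",
        outputSchema: typeof(IsMath));

    Conversation result = await TornadoRunner.RunAsync(mathGuardrail, input);

    // Parse the structured output produced by the classifier agent.
    IsMath? isMath = result.Messages.Last().Content.JsonDecode<IsMath>();

    // The second argument is the tripwire flag: true triggers the guardrail.
    return new GuardRailFunctionOutput(isMath?.Reasoning ?? "", !isMath?.IsMathRequest ?? false);
}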

TornadoAgent agent = new TornadoAgent(
    client: api,
    model: ChatModel.OpenAi.Gpt41.V41,
    instructions: "You are a useful assistant"
);

try
{
    Conversation result = await agent.RunAsync("What is the weather?", inputGuardRailFunction: MathGuardRail);

    Console.WriteLine(result.Messages.Last().Content);
}
catch (GuardRailTriggerException guardRailEx)
{
    Console.WriteLine(guardRailEx.Message);
}

Other Considerations

Content Filtering

Reject harmful or inappropriate requests
Filter sensitive information from outputs
Validate input before processing
Sanitize responses

Output Validation

Check responses meet quality standards
Verify structured output schemas
Ensure factual accuracy when possible
Validate against business rules
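
As a rough illustration of verifying a structured output schema, the sketch below checks that a JSON payload has the two fields of the `IsMath` type from the Quick Start. `IsValidIsMathJson` is a hypothetical helper, not part of LlmTornado, and it assumes the model emits property names exactly as they are declared on the type.

```csharp
using System;
using System.Text.Json;

// Hypothetical validator for the IsMath payload from the Quick Start.
// Returns true only if the JSON parses and both expected fields have the expected type.
bool IsValidIsMathJson(string json)
{
    try
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object) return false;

        // "Reasoning" must be present and be a JSON string.
        bool hasReasoning = root.TryGetProperty("Reasoning", out JsonElement reasoning)
            && reasoning.ValueKind == JsonValueKind.String;

        // "IsMathRequest" must be present and be a JSON boolean.
        bool hasFlag = root.TryGetProperty("IsMathRequest", out JsonElement flag)
            && (flag.ValueKind == JsonValueKind.True || flag.ValueKind == JsonValueKind.False);

        return hasReasoning && hasFlag;
    }
    catch (JsonException)
    {
        return false; // Not well-formed JSON.
    }
}
```

A check like this could run before the payload is deserialized, so that a malformed response is rejected explicitly instead of silently producing default values.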

Implementation

Tool Permissions

csharp

Dictionary<string, bool> permissions = new Dictionary<string, bool>
{
    ["send_email"] = true,      // Requires permission
    ["read_file"] = false,      // No permission needed
    ["delete_data"] = true      // Requires permission
};

TornadoAgent agent = new TornadoAgent(
    api, model,
    toolPermissionRequired: permissions
);
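
To make the meaning of such a map concrete, here is a library-independent sketch of a gate that consults it before a tool runs. This is not LlmTornado's internal logic. `MayRunTool` and the `approve` callback are hypothetical names, and treating unlisted tools as requiring approval is a conservative default chosen for this sketch.

```csharp
using System;
using System.Collections.Generic;

Dictionary<string, bool> toolPermissions = new Dictionary<string, bool>
{
    ["send_email"] = true,   // Requires permission
    ["read_file"] = false    // No permission needed
};

// Hypothetical gate: returns true when the tool may run.
// `approve` stands in for whatever prompt or policy check the host application uses.
bool MayRunTool(string toolName, Func<string, bool> approve)
{
    // Tools missing from the map are treated as requiring approval (assumption of this sketch).
    bool needsApproval = !toolPermissions.TryGetValue(toolName, out bool required) || required;

    return !needsApproval || approve(toolName);
}
```

With this gate, `read_file` runs without asking, while `send_email` and any unlisted tool run only if the `approve` callback returns true.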

Input Validation

csharp

async Task<Conversation> SafeRunAsync(TornadoAgent agent, string userInput)
{
    // Validate input
    if (ContainsSensitiveInfo(userInput))
    {
        throw new InvalidOperationException("Input contains sensitive information");
    }
    
    if (userInput.Length > 10000)
    {
        throw new InvalidOperationException("Input too long");
    }
    
    return await agent.RunAsync(userInput);
}
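
`ContainsSensitiveInfo` is not defined in the snippet above. One possible shape for it, offered as a minimal sketch, is a pair of regular expression checks. The patterns are illustrative assumptions and far from exhaustive; a production system would likely need a dedicated detection library or service.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: flags input that appears to contain a US SSN or a 16-digit card number.
bool ContainsSensitiveInfo(string input)
{
    if (string.IsNullOrEmpty(input)) return false;

    // US Social Security number written as 123-45-6789.
    bool hasSsn = Regex.IsMatch(input, @"\b\d{3}-\d{2}-\d{4}\b");

    // 16-digit card number, contiguous or grouped in fours by spaces or hyphens.
    bool hasCard = Regex.IsMatch(input, @"\b(?:\d{4}[ -]?){3}\d{4}\b");

    return hasSsn || hasCard;
}
```

Because the check runs before the agent is called, rejected input never reaches the model.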

Output Filtering

csharp

using System.Linq;
using System.Text.RegularExpressions;

async Task<string> FilteredResponse(TornadoAgent agent, string input)
{
    Conversation result = await agent.RunAsync(input);
    string response = result.Messages.Last().Content ?? string.Empty;
    
    // Mask sensitive patterns before returning the text to the caller
    response = Regex.Replace(response, @"\b\d{3}-\d{2}-\d{4}\b", "***-**-****"); // SSN (123-45-6789)
    response = Regex.Replace(response, @"\b\d{16}\b", "****-****-****-****"); // Credit card (16 contiguous digits only)
    
    return response;
}
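
The credit card pattern above only matches 16 contiguous digits, so a number written as `4111 1111 1111 1111` would pass through unmasked. The following sketch pulls the masking into a standalone function that can be tested without calling an agent and that also covers grouped digits. `Redact` is a hypothetical name, and the patterns remain illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: masks SSNs and 16-digit card numbers in free text.
string Redact(string text)
{
    // SSN in the form 123-45-6789.
    text = Regex.Replace(text, @"\b\d{3}-\d{2}-\d{4}\b", "***-**-****");

    // 16-digit card number, contiguous or grouped in fours by spaces or hyphens.
    text = Regex.Replace(text, @"\b(?:\d{4}[ -]?){3}\d{4}\b", "****-****-****-****");

    return text;
}

Console.WriteLine(Redact("Card 4111 1111 1111 1111, SSN 123-45-6789"));
// prints "Card ****-****-****-****, SSN ***-**-****"
```

Keeping the masking logic in a pure function makes it straightforward to add new patterns and cover them with unit tests.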

Best Practices

Define clear boundaries in instructions
Implement multiple layers of protection
Log and monitor agent behavior
Test guardrails thoroughly
Update guardrails as threats evolve
