Moderation
Overview
Content moderation helps ensure that generated content complies with usage policies. LlmTornado provides moderation capabilities to check text for potentially harmful, inappropriate, or policy-violating content across various categories.
Quick Start
```csharp
using LlmTornado;
using LlmTornado.Moderation;
TornadoApi api = new TornadoApi("your-api-key");
// Check content for policy violations
ModerationResult? result = await api.Moderation.CreateModeration("Text to check");
if (result?.Results?[0].Flagged == true)
{
Console.WriteLine("Content flagged for moderation");
}
```

Prerequisites
- The LlmTornado package installed
- A valid API key
- Understanding of content policy categories
Basic Usage
Simple Moderation Check
```csharp
ModerationResult? result = await api.Moderation.CreateModeration(
"Hello, how are you today?");
ModerationEntry entry = result.Results[0];
Console.WriteLine($"Flagged: {entry.Flagged}");
Console.WriteLine($"Sexual: {entry.Categories.Sexual}");
Console.WriteLine($"Violence: {entry.Categories.Violence}");
Console.WriteLine($"Hate: {entry.Categories.Hate}");Batch Moderation
```csharp
string[] texts = [
"First text to check",
"Second text to check",
"Third text to check"
];
ModerationResult? result = await api.Moderation.CreateModeration(texts);
for (int i = 0; i < result.Results.Count; i++)
{
Console.WriteLine($"Text {i + 1}: Flagged = {result.Results[i].Flagged}");
}
```

Detailed Category Scores
```csharp
ModerationResult? result = await api.Moderation.CreateModeration("Text to analyze");
ModerationEntry entry = result.Results[0];
Console.WriteLine("Category Scores:");
Console.WriteLine($"Sexual: {entry.CategoryScores.Sexual:F4}");
Console.WriteLine($"Hate: {entry.CategoryScores.Hate:F4}");
Console.WriteLine($"Violence: {entry.CategoryScores.Violence:F4}");
Console.WriteLine($"Self-harm: {entry.CategoryScores.SelfHarm:F4}");
Console.WriteLine($"Sexual/Minors: {entry.CategoryScores.SexualMinors:F4}");
Console.WriteLine($"Hate/Threatening: {entry.CategoryScores.HateThreatening:F4}");
Console.WriteLine($"Violence/Graphic: {entry.CategoryScores.ViolenceGraphic:F4}");Advanced Usage
Pre-Moderation Filter
Implement a filter for user input:
```csharp
async Task<bool> IsContentSafe(string userInput)
{
ModerationResult? result = await api.Moderation.CreateModeration(userInput);
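// Note: a null result (for example, a failed request) is treated as safe here (fail-open).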
return result?.Results?[0].Flagged != true;
}
// Usage
string userInput = Console.ReadLine() ?? string.Empty;
if (await IsContentSafe(userInput))
{
// Process the input (assumes an existing chat conversation)
conversation.AppendUserInput(userInput);
ChatRichResponse response = await conversation.GetResponseRich();
}
else
{
Console.WriteLine("Your input contains inappropriate content.");
}
```

Best Practices
- Moderate User Input - Check content before processing
- Moderate AI Output - Verify generated content is appropriate (see the sketch after this list)
- Handle False Positives - Provide appeal mechanisms
- Respect Privacy - Only moderate what's necessary
- Log Violations - Track patterns for improvement
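Moderating model output follows the same pattern as moderating user input. The sketch below assumes the `api` instance from the Quick Start and an existing `conversation`, and uses `GetResponse()` to obtain the reply as plain text; the logging line is a placeholder for whatever logging you already use.
```csharp
// Generate a reply, then moderate it before showing it to the user.
string reply = await conversation.GetResponse() ?? string.Empty;

ModerationResult? outputCheck = await api.Moderation.CreateModeration(reply);
ModerationEntry? outputEntry = outputCheck?.Results?[0];

if (outputEntry?.Flagged == true)
{
    // Placeholder logging: record flagged outputs so patterns can be reviewed later.
    Console.WriteLine($"[moderation] output flagged at {DateTime.UtcNow:O}");
    Console.WriteLine("The generated response was withheld.");
}
else
{
    Console.WriteLine(reply);
}
```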
API Reference
Moderation Endpoint
- `CreateModeration(string input)` - Check a single text
- `CreateModeration(string[] input)` - Check multiple texts
ModerationResult
- `List<ModerationEntry> Results` - Moderation results for each input
- `string Id` - Request identifier
- `string Model` - Model used for moderation
ModerationEntry
- `bool Flagged` - Whether the content is flagged
- `Categories Categories` - Boolean flags for each category
- `CategoryScores CategoryScores` - Confidence scores (0-1)
Related Topics
- Chat Basics - Core chat functionality
- Agents - Building safe AI agents