The SDK provides built-in content moderation capabilities that detect potentially harmful content and expose violation details through response data. How you handle these violations is entirely up to your application’s requirements and policies.
Quick Start
Enable content moderation by setting the compliance option to true:
import { AnimusClient } from 'animus-client';
const client = new AnimusClient({
tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
chat: {
model: 'vivian-llama3.1-70b-1.0-fp8',
systemMessage: 'You are a helpful assistant.',
compliance: true // Enable content moderation (default: true)
}
});
// For event-driven chat, check violations in messageComplete event
client.on('messageComplete', (data) => {
if (data.compliance_violations && data.compliance_violations.length > 0) {
// Handle violations according to your application's needs
console.log("Violations detected:", data.compliance_violations);
// Your custom handling logic here
} else {
console.log("Content approved:", data.content);
}
});
// Send message - compliance checking happens automatically
client.chat.send("User message to check");
// Or use direct API method for immediate response
const response = await client.chat.completions({
messages: [{ role: 'user', content: 'User message to check' }],
compliance: true
});
if (response.compliance_violations && response.compliance_violations.length > 0) {
console.log("Violations detected:", response.compliance_violations);
} else {
console.log("Content approved");
}
Performance Note: Enabling compliance checking adds a small amount of latency to requests as content is analyzed for violations. While minimal, consider this when designing real-time applications.
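If you want to gauge this overhead in your own environment, you can time the same request with and without compliance enabled. The sketch below is illustrative, not part of the SDK; it assumes the client from the Quick Start example, and a single sample is noisy, so average several runs before drawing conclusions.
// Rough, illustrative latency comparison — not an SDK feature.
// Model latency varies per request, so average multiple runs in practice.
async function measureComplianceOverhead(message: string): Promise<void> {
  const t1 = performance.now();
  await client.chat.completions({
    messages: [{ role: 'user', content: message }],
    compliance: true // moderated request
  });
  const withCompliance = performance.now() - t1;

  const t2 = performance.now();
  await client.chat.completions({
    messages: [{ role: 'user', content: message }],
    compliance: false // unmoderated request
  });
  const withoutCompliance = performance.now() - t2;

  console.log(
    `With compliance: ${withCompliance.toFixed(0)} ms, ` +
    `without: ${withoutCompliance.toFixed(0)} ms`
  );
}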
Configuration Options
Global Compliance Setting
Enable moderation for all chat interactions:
const client = new AnimusClient({
tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
chat: {
model: 'vivian-llama3.1-70b-1.0-fp8',
compliance: true // All messages will be moderated
}
});
Per-Message Compliance
Control moderation on a per-message basis:
const client = new AnimusClient({
tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
chat: {
model: 'vivian-llama3.1-70b-1.0-fp8'
// compliance not set globally
}
});
// Enable moderation for specific messages using completions
const response = await client.chat.completions({
messages: [{ role: 'user', content: 'Message to moderate' }],
compliance: true
});
// Or disable for trusted content
const trustedResponse = await client.chat.completions({
messages: [{ role: 'user', content: 'Safe admin message' }],
compliance: false
});
Violation Detection
When compliance checking is enabled, the response includes violation information that your application can use:
const response = await client.chat.completions({
messages: [{ role: 'user', content: 'User message' }],
compliance: true
});
if (response.compliance_violations && response.compliance_violations.length > 0) {
// Violations detected - handle according to your application's policy
const violations = response.compliance_violations;
// Examples of how you might handle violations:
// - Block the message entirely
// - Show a warning to the user
// - Log for review
// - Apply content filters
// - Escalate to moderators
handleViolations(violations, response.choices[0].message.content);
} else {
// No violations detected - safe to display
displayMessage(response.choices[0].message.content);
}
function handleViolations(violations: string[], content: string) {
// Your application's violation handling logic
console.log(`Content flagged for: ${violations.join(', ')}`);
// Example handling strategies (showWarning, blockContent, and escalateToModerators
// are your own UI/moderation helpers):
if (violations.includes('drug_use')) {
// Maybe just warn the user
showWarning("Please avoid discussing drug use");
}
if (violations.some(v => ['pedophilia', 'rape', 'murder'].includes(v))) {
// Maybe block completely and escalate
blockContent();
escalateToModerators(content, violations);
}
}
Content Violation Categories
The system can detect and report the following categories of potentially harmful content:
Category | Description
---|---
pedophilia | Sexual content involving minors
beastiality | Content involving sexual acts with animals
murder | Content that promotes or glorifies murder
rape | Content related to sexual assault
incest | Content involving sexual relations between family members
gore | Explicit and graphic violent content
prostitution | Content promoting or soliciting prostitution
drug_use | Content promoting or describing drug use
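Because violations arrive as plain strings, it can help to centralize the category names and a severity grouping in one place so policy code can branch on severity rather than individual categories. The sketch below is one possible approach; the severity groupings are an application-level policy choice, not something defined by the SDK.
// Illustrative only — the severity groupings are an application policy choice,
// not something the SDK defines.
type ViolationCategory =
  | 'pedophilia' | 'beastiality' | 'murder' | 'rape'
  | 'incest' | 'gore' | 'prostitution' | 'drug_use';

const VIOLATION_SEVERITY: Record<ViolationCategory, 'severe' | 'moderate'> = {
  pedophilia: 'severe',
  beastiality: 'severe',
  rape: 'severe',
  incest: 'severe',
  murder: 'severe',
  gore: 'moderate',
  prostitution: 'moderate',
  drug_use: 'moderate'
};

function maxSeverity(violations: string[]): 'severe' | 'moderate' | 'none' {
  // Ignore any categories this code doesn't know about
  const known = violations.filter((v): v is ViolationCategory => v in VIOLATION_SEVERITY);
  if (known.some(v => VIOLATION_SEVERITY[v] === 'severe')) return 'severe';
  return known.length > 0 ? 'moderate' : 'none';
}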
Streaming with Moderation
Content moderation works with both streaming and non-streaming responses. For streaming responses, compliance violations are available in the chunk data:
// Streaming with compliance checking
const stream = await client.chat.completions({
messages: [{ role: 'user', content: 'User message' }],
compliance: true,
stream: true
});
let fullContent = '';
let hasViolations = false;
let violations: string[] = [];
for await (const chunk of stream) {
// Check for compliance violations in the chunk
if (chunk.compliance_violations && chunk.compliance_violations.length > 0) {
hasViolations = true;
violations = chunk.compliance_violations;
console.log('Compliance violations detected:', violations);
// You can choose to:
// - Stop streaming immediately
// - Continue streaming but flag the content
// - Handle violations after stream completes
break; // Example: stop streaming on violation
}
// Process content delta
const delta = chunk.choices?.[0]?.delta?.content || '';
if (delta) {
fullContent += delta;
displayStreamingContent(delta);
}
}
// Handle violations after streaming
if (hasViolations) {
handleViolations(violations, fullContent);
} else {
console.log('Content approved');
}
Non-Streaming with Moderation
For non-streaming responses, compliance violations are available in the response object:
// Non-streaming with compliance
const response = await client.chat.completions({
messages: [{ role: 'user', content: 'User message' }],
compliance: true,
stream: false
});
if (response.compliance_violations && response.compliance_violations.length > 0) {
console.log('Violations detected:', response.compliance_violations);
handleViolations(response.compliance_violations, response.choices[0].message.content);
} else {
console.log('Content approved');
displayContent(response.choices[0].message.content);
}
Event-Driven Chat with Moderation
When using chat.send(), compliance violations are available in the messageComplete event:
client.on('messageComplete', (data) => {
if (data.compliance_violations && data.compliance_violations.length > 0) {
console.log('Violations detected:', data.compliance_violations);
handleViolations(data.compliance_violations, data.content);
} else {
console.log('Content approved:', data.content);
displayMessage(data.content);
}
});
client.chat.send("User message to check");
Application Examples
Flexible Violation Handling
class ContentPolicy {
// Define your application's content policy
handleViolations(violations: string[], content: string, userId: string): PolicyAction {
// Severe violations - immediate action
if (violations.some(v => ['pedophilia', 'rape', 'murder'].includes(v))) {
return {
action: 'block',
reason: 'Severe policy violation',
escalate: true
};
}
// Moderate violations - warning
if (violations.includes('drug_use') || violations.includes('gore')) {
return {
action: 'warn',
reason: 'Content may violate community guidelines',
allowWithWarning: true
};
}
// Minor violations - log only
return {
action: 'log',
reason: 'Minor policy concern',
allowContent: true
};
}
}
interface PolicyAction {
action: 'block' | 'warn' | 'log' | 'allow';
reason: string;
escalate?: boolean;
allowWithWarning?: boolean;
allowContent?: boolean;
}
// Usage (userMessage and userId come from your application)
const policy = new ContentPolicy();
const response = await client.chat.completions({
messages: [{ role: 'user', content: userMessage }],
compliance: true
});
if (response.compliance_violations && response.compliance_violations.length > 0) {
const action = policy.handleViolations(
response.compliance_violations,
response.choices[0].message.content,
userId
);
switch (action.action) {
case 'block':
blockMessage(action.reason);
if (action.escalate) escalateToModerators();
break;
case 'warn':
if (action.allowWithWarning) {
showWarningAndDisplay(response.choices[0].message.content, action.reason);
}
break;
case 'log':
logViolation(response.compliance_violations);
if (action.allowContent) displayMessage(response.choices[0].message.content);
break;
}
}
Educational Approach
// Example: Educational response to violations
function handleEducationalResponse(violations: string[]) {
const educationalMessages: Record<string, string> = {
'drug_use': 'We encourage discussions about health and wellness instead.',
'gore': 'We prefer conversations that promote positive interactions.',
'murder': 'Let\'s focus on constructive and peaceful topics.',
// Add more educational responses
};
violations.forEach(violation => {
const message = educationalMessages[violation];
if (message) {
showEducationalMessage(message);
}
});
}
Moderation Queue
// Example: Send flagged content to moderation queue
class ModerationQueue {
// Note: moderationService, notifyUser, releaseMessage, and notifyUserOfRejection
// are placeholders for your own moderation infrastructure
async addToQueue(content: string, violations: string[], userId: string) {
// Add to your moderation system
await this.moderationService.queue({
content,
violations,
userId,
timestamp: new Date(),
status: 'pending_review'
});
// Notify user
this.notifyUser(userId, 'Your message is under review');
}
async handleModeratorDecision(messageId: string, decision: 'approve' | 'reject') {
// Handle moderator's decision
if (decision === 'approve') {
this.releaseMessage(messageId);
} else {
this.notifyUserOfRejection(messageId);
}
}
}
Error Handling
Handle moderation failures gracefully:
async function safeModeration(message: string): Promise<ModerationResult> {
try {
const response = await client.chat.completions({
messages: [{ role: 'user', content: message }],
compliance: true
});
return {
success: true,
violations: response.compliance_violations || [],
content: response.choices[0].message.content
};
} catch (error) {
console.error('Moderation check failed:', error);
// Decide how to handle moderation failures
// Options:
// - Block content (fail-safe)
// - Allow content (fail-open)
// - Queue for manual review
return {
success: false,
violations: ['moderation_error'],
error: 'Unable to check content safety'
};
}
}
interface ModerationResult {
success: boolean;
violations: string[];
content?: string;
error?: string;
}
Key Points
- Violation Detection: The system detects and reports violations, but doesn’t automatically block content
- Application Control: Your application decides how to handle each type of violation
- Flexible Policies: Implement content policies that match your application’s needs
- User Experience: Choose between blocking, warning, educating, or queuing for review
- Escalation: Decide which violations require immediate action vs. review
- Logging: Track violations for analysis and policy improvement (a minimal example follows below)
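For the logging point above, a minimal in-memory tracker is sketched here; in production you would more likely forward violations to your own analytics or logging backend.
// Minimal in-memory violation tracker — illustrative only.
class ViolationLog {
  private counts = new Map<string, number>();

  record(violations: string[], userId: string): void {
    for (const v of violations) {
      this.counts.set(v, (this.counts.get(v) ?? 0) + 1);
    }
    console.log(`[moderation] user=${userId} violations=${violations.join(', ')}`);
  }

  // Per-category totals, e.g. for periodic policy review
  summary(): Record<string, number> {
    return Object.fromEntries(this.counts);
  }
}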
Next Steps