Vision Completions
Basic Image Analysis
Ask questions about images using the vision completion API:
import { AnimusClient, MediaMessage } from 'animus-client';
const client = new AnimusClient({
tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
vision: {
model: 'animuslabs/Qwen2-VL-NSFW-Vision-1.2',
temperature: 0.2
}
});
// Analyze an image with a question
const visionMessages: MediaMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: 'What do you see in this image? Describe it in detail.' },
{ type: 'image_url', image_url: { url: 'https://example.com/image.jpg' } }
]
}
];
const response = await client.media.completions({
messages: visionMessages
});
console.log('Vision Analysis:', response.choices[0].message.content);
Multiple Images
You can analyze multiple images in a single request:
const multiImageMessages: MediaMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: 'Compare these two images. What are the differences?' },
{ type: 'image_url', image_url: { url: 'https://example.com/image1.jpg' } },
{ type: 'image_url', image_url: { url: 'https://example.com/image2.jpg' } }
]
}
];
const comparison = await client.media.completions({
messages: multiImageMessages,
temperature: 0.3
});
console.log('Image Comparison:', comparison.choices[0].message.content);
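When the number of images varies, the content array can be assembled programmatically. The following is a minimal sketch, not part of `animus-client`: `ContentPart` and `buildImageContent` are hypothetical names, and the part shapes simply mirror the objects used in the examples above.

```typescript
// Hypothetical local type mirroring the content parts shown in the examples above.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

// Builds one text part followed by one image part per URL.
function buildImageContent(prompt: string, imageUrls: readonly string[]): ContentPart[] {
  if (imageUrls.length === 0) {
    throw new Error('At least one image URL is required');
  }
  const parts: ContentPart[] = [{ type: 'text', text: prompt }];
  for (const url of imageUrls) {
    parts.push({ type: 'image_url', image_url: { url } });
  }
  return parts;
}
```

The returned array could then be passed as the `content` of a single `user` message, assuming the message type accepts this shape as it does in the examples above.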
Base64 Images
You can also use base64-encoded images:
// Convert file to base64
function fileToBase64(file: File): Promise<string> {
return new Promise((resolve, reject) => {
const reader = new FileReader();
reader.onload = () => resolve(reader.result as string);
reader.onerror = reject;
reader.readAsDataURL(file);
});
}
// Use with vision API
const fileInput = document.getElementById('imageInput') as HTMLInputElement;
const file = fileInput.files?.[0];
if (file) {
const base64Image = await fileToBase64(file);
const messages: MediaMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: 'Analyze this uploaded image.' },
{ type: 'image_url', image_url: { url: base64Image } }
]
}
];
const response = await client.media.completions({ messages });
console.log('Uploaded image analysis:', response.choices[0].message.content);
}
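`FileReader` is only available in browsers. Outside the browser, the same kind of `data:` URL can be built from raw bytes. This is a sketch under assumptions: `bytesToDataUrl` is a hypothetical helper, it relies on the global `btoa` (available in browsers and recent Node.js versions), and it assumes the API accepts data URLs in `image_url.url` as the example above suggests.

```typescript
// Hypothetical helper (not part of animus-client): encodes raw bytes as a data URL.
function bytesToDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" where each character code is one byte.
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}
```

The caller is responsible for passing the correct MIME type (for example `image/png`), since it cannot be inferred from the bytes by this helper.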
Image Analysis
Extract structured metadata from images:
// Analyze image for categories and tags
const imageAnalysis = await client.media.analyze({
media_url: 'https://example.com/photo.jpg',
metadata: ['categories', 'tags', 'objects', 'faces']
});
console.log('Categories:', imageAnalysis.metadata?.categories);
console.log('Tags:', imageAnalysis.metadata?.tags);
console.log('Objects detected:', imageAnalysis.metadata?.objects);
console.log('Faces detected:', imageAnalysis.metadata?.faces);
Video Analysis
Analyze videos with automatic polling for results:
// Start video analysis (this will poll until complete)
console.log('Starting video analysis...');
const videoAnalysis = await client.media.analyze({
media_url: 'https://example.com/video.mp4',
metadata: ['actions', 'scene', 'objects', 'categories']
});
console.log('Video analysis complete!');
console.log('Actions detected:', videoAnalysis.results?.[0]?.actions);
console.log('Scene analysis:', videoAnalysis.results?.[0]?.scene);
console.log('Objects in video:', videoAnalysis.results?.[0]?.objects);
Manual Status Checking
For more control over video analysis, you can check status manually:
// Start analysis without waiting
const analysisRequest = await client.media.analyze({
media_url: 'https://example.com/long-video.mp4',
metadata: ['actions', 'scene'],
wait_for_completion: false // Don't wait automatically
});
const jobId = analysisRequest.job_id;
// Check status every 5 seconds until the job completes or fails.
// A loop (rather than setTimeout recursion) lets the caller await the final result.
const checkStatus = async () => {
  while (true) {
    const status = await client.media.getAnalysisStatus(jobId);
    console.log(`Job ${jobId}: ${status.status} (${status.percent_complete}% complete)`);
    if (status.status === 'completed') {
      console.log('Analysis results:', status.results);
      return status.results;
    }
    if (status.status === 'failed') {
      console.error('Analysis failed:', status.error);
      return null;
    }
    // Still processing: wait 5 seconds before checking again
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
};
const results = await checkStatus();
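Manual polling is usually bounded by a deadline so that a stuck job does not keep the caller waiting forever. The sketch below is a generic helper, not part of `animus-client`: `pollJob` and `JobStatus` are hypothetical names, the `completed` and `failed` states come from the example above, and any other state is assumed to mean the job is still running.

```typescript
// Hypothetical shape, modeled on the status fields used in the example above.
interface JobStatus<T> {
  status: string;
  results?: T;
  error?: string;
}

// Calls fetchStatus repeatedly until the job completes, fails, or the timeout elapses.
async function pollJob<T>(
  fetchStatus: () => Promise<JobStatus<T>>,
  intervalMs: number = 5000,
  timeoutMs: number = 10 * 60 * 1000
): Promise<T | undefined> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await fetchStatus();
    if (status.status === 'completed') {
      return status.results;
    }
    if (status.status === 'failed') {
      throw new Error(status.error ?? 'Analysis failed');
    }
    // Give up if the next check would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error('Timed out waiting for analysis');
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the client from the earlier examples, usage might look like `await pollJob(() => client.media.getAnalysisStatus(jobId))`, assuming that method returns an object with the fields shown above.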
Advanced Vision Features
Conversational Vision
Build conversational interfaces around images:
// Initialize with vision configuration
const client = new AnimusClient({
tokenProviderUrl: 'your-token-url',
vision: {
model: 'animuslabs/Qwen2-VL-NSFW-Vision-1.2'
}
});
// Start a conversation about an image
let conversation: MediaMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: 'What do you see in this image?' },
{ type: 'image_url', image_url: { url: 'https://example.com/artwork.jpg' } }
]
}
];
const firstResponse = await client.media.completions({
messages: conversation
});
// Add the response to conversation
conversation.push({
role: 'assistant',
content: firstResponse.choices[0].message.content
});
// Continue the conversation
conversation.push({
role: 'user',
content: [
{ type: 'text', text: 'What style of art is this? Who might have painted it?' }
]
});
const secondResponse = await client.media.completions({
messages: conversation
});
console.log('Art analysis:', secondResponse.choices[0].message.content);
Vision with Custom Parameters
Adjust vision analysis output with per-request sampling parameters:
const detailedAnalysis = await client.media.completions({
messages: visionMessages,
temperature: 0.1, // More focused responses
max_tokens: 1000, // Longer descriptions
top_p: 0.9 // Nucleus sampling
});
Supported Formats
Images
- JPEG/JPG - Standard photo format
- PNG - Images with transparency
- WebP - Modern web format
- GIF - Animated images (first frame analyzed)
- BMP - Bitmap images
Videos
- MP4 - Most common video format
- AVI - Audio Video Interleave
- MOV - QuickTime format
- WebM - Web-optimized format
- MKV - Matroska format
Size Limitations
- Images: Maximum 10MB per image
- Videos: Maximum 100MB per video
- Resolution: Up to 4K (3840x2160) for optimal performance
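Checking a file against these formats and limits before uploading avoids a round trip that is likely to be rejected. The following is a sketch only: `checkMedia` and `MEDIA_RULES` are hypothetical names, the extensions and limits are copied from the lists above, and interpreting "MB" as 1024 * 1024 bytes is an assumption.

```typescript
// Extensions and limits copied from the lists above.
// Assumption: "MB" means 1024 * 1024 bytes.
const MEDIA_RULES = {
  image: { extensions: ['jpg', 'jpeg', 'png', 'webp', 'gif', 'bmp'], maxBytes: 10 * 1024 * 1024 },
  video: { extensions: ['mp4', 'avi', 'mov', 'webm', 'mkv'], maxBytes: 100 * 1024 * 1024 },
};

type MediaKind = 'image' | 'video';

interface MediaCheck {
  ok: boolean;
  kind?: MediaKind;
  reason?: string;
}

// Classifies a file by extension and verifies it fits the documented size limit.
function checkMedia(fileName: string, sizeBytes: number): MediaCheck {
  const dot = fileName.lastIndexOf('.');
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : '';
  const kinds: MediaKind[] = ['image', 'video'];
  for (const kind of kinds) {
    const rule = MEDIA_RULES[kind];
    if (rule.extensions.indexOf(ext) >= 0) {
      if (sizeBytes <= rule.maxBytes) {
        return { ok: true, kind };
      }
      const limitMb = rule.maxBytes / (1024 * 1024);
      return { ok: false, kind, reason: `${kind} exceeds ${limitMb}MB limit` };
    }
  }
  return { ok: false, reason: `Unsupported format: ${ext || '(none)'}` };
}
```

An extension check is only a heuristic; the server remains the authority on what it accepts.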
Error Handling
Handle media-specific errors gracefully:
import { ApiError, AuthenticationError } from 'animus-client';
try {
const analysis = await client.media.analyze({
media_url: 'https://example.com/image.jpg',
metadata: ['categories']
});
} catch (error) {
if (error instanceof ApiError) {
if (error.status === 400) {
console.error('Invalid media URL or format');
} else if (error.status === 413) {
console.error('Media file too large');
} else if (error.status === 422) {
console.error('Unsupported media format');
} else {
console.error(`API Error (${error.status}):`, error.message);
}
} else if (error instanceof AuthenticationError) {
console.error('Authentication failed:', error.message);
} else {
console.error('Unexpected error:', error);
}
}
Best Practices
// For better performance, resize large images before analysis
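function resizeImage(file: File, maxWidth: number = 1920): Promise<string> {
  return new Promise((resolve, reject) => {
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d')!;
    const img = new Image();
    const objectUrl = URL.createObjectURL(file);
    img.onload = () => {
      // Scale so both sides fit within maxWidth; capping at 1 avoids upscaling small images
      const ratio = Math.min(1, maxWidth / img.width, maxWidth / img.height);
      canvas.width = Math.round(img.width * ratio);
      canvas.height = Math.round(img.height * ratio);
      ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
      URL.revokeObjectURL(objectUrl); // Release the temporary object URL
      resolve(canvas.toDataURL('image/jpeg', 0.8));
    };
    img.onerror = () => {
      URL.revokeObjectURL(objectUrl);
      reject(new Error('Failed to load image for resizing'));
    };
    img.src = objectUrl;
  });
}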
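The scaling arithmetic in the resize helper can be isolated into a pure function, which makes it testable without a browser canvas. This is a sketch, with `fitWithin` as a hypothetical name; it caps the ratio at 1 so that images already smaller than the limit keep their original size.

```typescript
// Computes target dimensions so that both sides fit within maxSide,
// preserving aspect ratio and never enlarging the image.
function fitWithin(
  width: number,
  height: number,
  maxSide: number
): { width: number; height: number } {
  const ratio = Math.min(1, maxSide / width, maxSide / height);
  return {
    width: Math.round(width * ratio),
    height: Math.round(height * ratio),
  };
}
```

For example, a 4000x2000 image limited to 1920 pixels per side comes out at 1920x960, while an 800x600 image is left unchanged.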
Batch Processing
// Process multiple images efficiently
async function batchAnalyzeImages(imageUrls: string[]) {
const results = await Promise.allSettled(
imageUrls.map(url =>
client.media.analyze({
media_url: url,
metadata: ['categories', 'tags']
})
)
);
return results.map((result, index) => ({
url: imageUrls[index],
success: result.status === 'fulfilled',
data: result.status === 'fulfilled' ? result.value : null,
error: result.status === 'rejected' ? result.reason : null
}));
}
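`Promise.allSettled` over a mapped array starts every request at once, which may be undesirable for large batches. One option is to cap the number of requests in flight. The sketch below is generic and independent of `animus-client`: `mapWithConcurrency` and `Settled` are hypothetical names, and the result shape deliberately mirrors what `Promise.allSettled` returns.

```typescript
// Local result type mirroring the entries returned by Promise.allSettled.
type Settled<R> =
  | { status: 'fulfilled'; value: R }
  | { status: 'rejected'; reason: unknown };

// Runs worker over items with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'fulfilled', value: await worker(items[index], index) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(run());
  }
  await Promise.all(runners);
  return results;
}
```

In the batch function above, the worker could be `(url) => client.media.analyze({ media_url: url, metadata: ['categories', 'tags'] })`, with a limit chosen to suit whatever rate limits apply.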
Caching Results
// Cache analysis results to avoid re-processing
class MediaAnalysisCache {
private cache = new Map<string, any>();
async analyze(url: string, metadata: string[]) {
const cacheKey = `${url}-${metadata.join(',')}`;
if (this.cache.has(cacheKey)) {
return this.cache.get(cacheKey);
}
const result = await client.media.analyze({
media_url: url,
metadata
});
this.cache.set(cacheKey, result);
return result;
}
}
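A cache key built by joining strings depends on the order of the metadata fields, and it can collide when a URL or field name itself contains the separator. A possible alternative, sketched here with the hypothetical name `analysisCacheKey`, normalizes the field list and serializes the parts as JSON.

```typescript
// Builds a cache key that ignores the order and duplication of metadata fields.
function analysisCacheKey(url: string, metadata: readonly string[]): string {
  // Deduplicate, then sort so ['tags', 'categories'] and ['categories', 'tags'] match.
  const fields = Array.from(new Set(metadata)).sort();
  // JSON keeps the URL and the fields unambiguous even if they contain '-' or ','.
  return JSON.stringify([url, fields]);
}
```

Swapping this in for the template-string key in the cache class above would let requests for the same fields in a different order share one cached result.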
Integration Examples
Image Upload and Analysis
// Complete image upload and analysis workflow
async function handleImageUpload(file: File) {
try {
// Show loading state
showLoadingIndicator();
// Convert to base64 for analysis
const base64Image = await fileToBase64(file);
// Analyze the image
const [visionResponse, metadataResponse] = await Promise.all([
client.media.completions({
messages: [{
role: 'user',
content: [
{ type: 'text', text: 'Describe this image in detail.' },
{ type: 'image_url', image_url: { url: base64Image } }
]
}]
}),
client.media.analyze({
media_url: base64Image,
metadata: ['categories', 'tags', 'objects']
})
]);
// Display results
displayAnalysisResults({
description: visionResponse.choices[0].message.content,
categories: metadataResponse.metadata?.categories,
tags: metadataResponse.metadata?.tags,
objects: metadataResponse.metadata?.objects
});
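} catch (error) {
// A caught value is `unknown` in strict TypeScript, so narrow it before reading `.message`
const message = error instanceof Error ? error.message : String(error);
showErrorMessage('Failed to analyze image: ' + message);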
} finally {
hideLoadingIndicator();
}
}
Next Steps