Vision Completions

Basic Image Analysis

Ask questions about images using the vision completion API:

import { AnimusClient, MediaMessage } from 'animus-client';

const client = new AnimusClient({
  tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
  vision: {
    model: 'animuslabs/Qwen2-VL-NSFW-Vision-1.2',
    temperature: 0.2
  }
});

// Analyze an image with a question
const visionMessages: MediaMessage[] = [
  {
    role: 'user',
    content: [
      { type: 'text', text: 'What do you see in this image? Describe it in detail.' },
      { type: 'image_url', image_url: { url: 'https://example.com/image.jpg' } }
    ]
  }
];

const response = await client.media.completions({
  messages: visionMessages
});

console.log('Vision Analysis:', response.choices[0].message.content);

Multiple Images

You can analyze multiple images in a single request:

const multiImageMessages: MediaMessage[] = [
  {
    role: 'user',
    content: [
      { type: 'text', text: 'Compare these two images. What are the differences?' },
      { type: 'image_url', image_url: { url: 'https://example.com/image1.jpg' } },
      { type: 'image_url', image_url: { url: 'https://example.com/image2.jpg' } }
    ]
  }
];

const comparison = await client.media.completions({
  messages: multiImageMessages,
  temperature: 0.3
});

console.log('Image Comparison:', comparison.choices[0].message.content);

Base64 Images

You can also pass base64-encoded images as data URLs:

// Read a File into a base64 data URL ("data:<mime>;base64,...")
function fileToBase64(file: File): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = reject;
    reader.readAsDataURL(file);
  });
}

// Use with vision API
const fileInput = document.getElementById('imageInput') as HTMLInputElement;
const file = fileInput.files?.[0];

if (file) {
  const base64Image = await fileToBase64(file);
  
  const messages: MediaMessage[] = [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Analyze this uploaded image.' },
        { type: 'image_url', image_url: { url: base64Image } }
      ]
    }
  ];

  const response = await client.media.completions({ messages });
}
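Base64 encoding inflates a payload by roughly one third, so it can help to know how many bytes a data URL actually carries before sending it. The sketch below estimates the decoded size from the string alone. `estimateDataUrlBytes` is a hypothetical helper, not part of `animus-client`, and it assumes the input is a well-formed base64 data URL such as the one `FileReader.readAsDataURL` produces.

```typescript
// Hypothetical helper (not part of animus-client): estimate how many bytes a
// base64 data URL decodes to, without decoding it.
function estimateDataUrlBytes(dataUrl: string): number {
  const marker = ';base64,';
  const start = dataUrl.indexOf(marker);
  if (dataUrl.indexOf('data:') !== 0 || start === -1) {
    throw new Error('Not a base64 data URL');
  }

  const payload = dataUrl.slice(start + marker.length);

  // Every 4 base64 characters encode 3 bytes; each trailing '=' is one
  // byte of padding that carries no data.
  let padding = 0;
  if (payload.slice(-2) === '==') {
    padding = 2;
  } else if (payload.slice(-1) === '=') {
    padding = 1;
  }

  return Math.floor((payload.length * 3) / 4) - padding;
}
```

In the upload flow above, a call such as `estimateDataUrlBytes(base64Image)` could run right after `fileToBase64(file)` to reject oversized files early.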

Media Analysis

Image Metadata Extraction

Extract structured metadata from images:

// Analyze image for categories and tags
const imageAnalysis = await client.media.analyze({
  media_url: 'https://example.com/photo.jpg',
  metadata: ['categories', 'tags', 'objects', 'faces']
});

console.log('Categories:', imageAnalysis.metadata?.categories);
console.log('Tags:', imageAnalysis.metadata?.tags);
console.log('Objects detected:', imageAnalysis.metadata?.objects);
console.log('Faces detected:', imageAnalysis.metadata?.faces);

Video Analysis

Analyze videos with automatic polling for results:

// Start video analysis (this will poll until complete)
console.log('Starting video analysis...');

const videoAnalysis = await client.media.analyze({
  media_url: 'https://example.com/video.mp4',
  metadata: ['actions', 'scene', 'objects', 'categories']
});

console.log('Video analysis complete!');
console.log('Actions detected:', videoAnalysis.results?.[0]?.actions);
console.log('Scene analysis:', videoAnalysis.results?.[0]?.scene);
console.log('Objects in video:', videoAnalysis.results?.[0]?.objects);

Manual Status Checking

For more control over video analysis, you can check status manually:

// Start analysis without waiting
const analysisRequest = await client.media.analyze({
  media_url: 'https://example.com/long-video.mp4',
  metadata: ['actions', 'scene'],
  wait_for_completion: false // Don't wait automatically
});

const jobId = analysisRequest.job_id;

// Check status periodically until the job finishes
const checkStatus = async () => {
  while (true) {
    const status = await client.media.getAnalysisStatus(jobId);

    console.log(`Job ${jobId}: ${status.status} (${status.percent_complete}% complete)`);

    if (status.status === 'completed') {
      console.log('Analysis results:', status.results);
      return status.results;
    }
    if (status.status === 'failed') {
      console.error('Analysis failed:', status.error);
      return null;
    }

    // Still processing, check again in 5 seconds
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
};

// Resolves with the results once the job completes, or null if it failed
const results = await checkStatus();
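For long videos, a fixed polling interval with no upper bound can keep a client waiting indefinitely. The sketch below shows one way to add exponential backoff and an attempt limit. `pollWithBackoff`, the `JobStatus` shape, and the default delays are all assumptions made for illustration (the shape is inferred from the fields used in the example above); none of them come from `animus-client`.

```typescript
// Assumed shape, based on the status fields used in the example above.
interface JobStatus<T> {
  status: string;
  results?: T;
  error?: unknown;
}

// Hypothetical helper: call `fetchStatus` repeatedly, doubling the wait
// between calls, until the job completes, fails, or attempts run out.
async function pollWithBackoff<T>(
  fetchStatus: () => Promise<JobStatus<T>>,
  { initialDelayMs = 1000, maxDelayMs = 15000, maxAttempts = 30 } = {}
): Promise<T | undefined> {
  let delay = initialDelayMs;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus();

    if (job.status === 'completed') {
      return job.results;
    }
    if (job.status === 'failed') {
      throw new Error(`Analysis failed: ${String(job.error)}`);
    }

    // Still processing: wait, then double the delay up to the cap.
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxDelayMs);
  }

  throw new Error(`Gave up after ${maxAttempts} status checks`);
}
```

With the job from the example above, this could be called as `await pollWithBackoff(() => client.media.getAnalysisStatus(jobId))`.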

Advanced Vision Features

Conversational Vision

Build conversational interfaces around images:

// Initialize with vision configuration
const client = new AnimusClient({
  tokenProviderUrl: 'your-token-url',
  vision: {
    model: 'animuslabs/Qwen2-VL-NSFW-Vision-1.2'
  }
});

// Start a conversation about an image
let conversation: MediaMessage[] = [
  {
    role: 'user',
    content: [
      { type: 'text', text: 'What do you see in this image?' },
      { type: 'image_url', image_url: { url: 'https://example.com/artwork.jpg' } }
    ]
  }
];

const firstResponse = await client.media.completions({
  messages: conversation
});

// Add the response to conversation
conversation.push({
  role: 'assistant',
  content: firstResponse.choices[0].message.content
});

// Continue the conversation
conversation.push({
  role: 'user',
  content: [
    { type: 'text', text: 'What style of art is this? Who might have painted it?' }
  ]
});

const secondResponse = await client.media.completions({
  messages: conversation
});

console.log('Art analysis:', secondResponse.choices[0].message.content);

Vision with Custom Parameters

Adjust the sampling behavior of a vision request with custom parameters:

const detailedAnalysis = await client.media.completions({
  messages: visionMessages,
  temperature: 0.1,  // Lower values give more focused, repeatable responses
  max_tokens: 1000,  // Allow longer descriptions
  top_p: 0.9         // Nucleus sampling threshold
});

Supported Media Types

Image Formats

  • JPEG/JPG - Standard photo format
  • PNG - Images with transparency
  • WebP - Modern web format
  • GIF - Animated images (first frame analyzed)
  • BMP - Bitmap images

Video Formats

  • MP4 - Most common video format
  • AVI - Audio Video Interleave
  • MOV - QuickTime format
  • WebM - Web-optimized format
  • MKV - Matroska format

Size Limitations

  • Images: Maximum 10MB per image
  • Videos: Maximum 100MB per video
  • Resolution: Up to 4K (3840x2160) for optimal performance
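
The format and size limits above can be checked on the client before any upload starts. The following is a minimal sketch: `checkMediaFile` and its constants are hypothetical names, the format is guessed from the file extension only, and 1 MB is assumed to mean 1024 * 1024 bytes, which the limits above do not specify.

```typescript
// Extensions and limits copied from the lists above.
const IMAGE_EXTENSIONS = ['jpg', 'jpeg', 'png', 'webp', 'gif', 'bmp'];
const VIDEO_EXTENSIONS = ['mp4', 'avi', 'mov', 'webm', 'mkv'];
const MAX_IMAGE_BYTES = 10 * 1024 * 1024;  // assumption: 1 MB = 1024 * 1024 bytes
const MAX_VIDEO_BYTES = 100 * 1024 * 1024;

type MediaCheck =
  | { ok: true; kind: 'image' | 'video' }
  | { ok: false; reason: string };

// Hypothetical helper: accepts anything with a name and a size in bytes,
// which includes the browser's File object.
function checkMediaFile(file: { name: string; size: number }): MediaCheck {
  const dot = file.name.lastIndexOf('.');
  const ext = dot === -1 ? '' : file.name.slice(dot + 1).toLowerCase();

  const isImage = IMAGE_EXTENSIONS.indexOf(ext) !== -1;
  const isVideo = VIDEO_EXTENSIONS.indexOf(ext) !== -1;
  if (!isImage && !isVideo) {
    return { ok: false, reason: `Unsupported format: "${ext}"` };
  }

  const limit = isImage ? MAX_IMAGE_BYTES : MAX_VIDEO_BYTES;
  if (file.size > limit) {
    return { ok: false, reason: `File is ${file.size} bytes, limit is ${limit}` };
  }

  return { ok: true, kind: isImage ? 'image' : 'video' };
}
```

A check like this only saves a round trip; the server remains the authority on what it accepts.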

Error Handling

Handle media-specific errors gracefully:

import { ApiError, AuthenticationError } from 'animus-client';

try {
  const analysis = await client.media.analyze({
    media_url: 'https://example.com/image.jpg',
    metadata: ['categories']
  });
} catch (error) {
  // AuthenticationError is checked first in case it is a subclass of ApiError
  if (error instanceof AuthenticationError) {
    console.error('Authentication failed:', error.message);
  } else if (error instanceof ApiError) {
    if (error.status === 400) {
      console.error('Invalid media URL or format');
    } else if (error.status === 413) {
      console.error('Media file too large');
    } else if (error.status === 422) {
      console.error('Unsupported media format');
    } else {
      console.error(`API Error (${error.status}):`, error.message);
    }
  } else {
    console.error('Unexpected error:', error);
  }
}

Best Practices

Performance Optimization

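// For better performance, downscale large images before analysis.
// maxWidth caps the longer side; smaller images are left at their original size.
function resizeImage(file: File, maxWidth: number = 1920): Promise<string> {
  return new Promise((resolve, reject) => {
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d')!;
    const img = new Image();
    const objectUrl = URL.createObjectURL(file);

    img.onload = () => {
      URL.revokeObjectURL(objectUrl); // release the temporary blob URL

      // Never upscale: the ratio is capped at 1
      const ratio = Math.min(1, maxWidth / img.width, maxWidth / img.height);
      canvas.width = Math.round(img.width * ratio);
      canvas.height = Math.round(img.height * ratio);

      ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
      resolve(canvas.toDataURL('image/jpeg', 0.8));
    };

    img.onerror = () => {
      URL.revokeObjectURL(objectUrl);
      reject(new Error('Could not load image for resizing'));
    };

    img.src = objectUrl;
  });
}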

Batch Processing

// Process multiple images efficiently
async function batchAnalyzeImages(imageUrls: string[]) {
  const results = await Promise.allSettled(
    imageUrls.map(url => 
      client.media.analyze({
        media_url: url,
        metadata: ['categories', 'tags']
      })
    )
  );

  return results.map((result, index) => ({
    url: imageUrls[index],
    success: result.status === 'fulfilled',
    data: result.status === 'fulfilled' ? result.value : null,
    error: result.status === 'rejected' ? result.reason : null
  }));
}
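
Note that `Promise.allSettled` over a mapped array starts every request at once, which may be too aggressive for large batches. One possible way to cap the number of in-flight requests is sketched below. `mapSettledWithLimit` and the `Settled` type are hypothetical and written for this example; the result shape mirrors what `Promise.allSettled` returns.

```typescript
// Same shape as the entries returned by Promise.allSettled.
type Settled<R> =
  | { status: 'fulfilled'; value: R }
  | { status: 'rejected'; reason: unknown };

// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight, preserving input order in the results.
async function mapSettledWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none are left.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { status: 'fulfilled', value: await worker(items[index]) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const runnerCount = Math.min(Math.max(1, limit), items.length);
  const runners: Promise<void>[] = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(run());
  }
  await Promise.all(runners);

  return results;
}
```

In `batchAnalyzeImages`, the `Promise.allSettled(...)` call could be swapped for `mapSettledWithLimit(imageUrls, 4, url => client.media.analyze({ media_url: url, metadata: ['categories', 'tags'] }))`, where the limit of 4 is an arbitrary choice.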

Caching Results

// Cache analysis results to avoid re-processing
class MediaAnalysisCache {
  private cache = new Map<string, any>();

  async analyze(url: string, metadata: string[]) {
    const cacheKey = `${url}-${metadata.join(',')}`;
    
    if (this.cache.has(cacheKey)) {
      return this.cache.get(cacheKey);
    }

    const result = await client.media.analyze({
      media_url: url,
      metadata
    });

    this.cache.set(cacheKey, result);
    return result;
  }
}
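
The cache above stores a result only after the request resolves, so two concurrent calls with the same key both reach the API. Caching the promise instead of the resolved value avoids that. The sketch below is one way to do it; `PromiseCache` is a hypothetical class, and evicting failed entries so that later calls can retry is a design choice rather than a requirement.

```typescript
// Hypothetical helper: caches in-flight and completed loads by key so that
// concurrent callers share a single request.
class PromiseCache<T> {
  private entries = new Map<string, Promise<T>>();

  get(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.entries.get(key);
    if (existing) {
      return existing;
    }

    const pending = load().catch((err) => {
      // Drop failed loads so the next call retries instead of
      // receiving the cached rejection.
      this.entries.delete(key);
      throw err;
    });

    this.entries.set(key, pending);
    return pending;
  }
}
```

Inside `MediaAnalysisCache.analyze`, this could be used as `cache.get(cacheKey, () => client.media.analyze({ media_url: url, metadata }))`.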

Integration Examples

Image Upload and Analysis

// Complete image upload and analysis workflow
async function handleImageUpload(file: File) {
  try {
    // Show loading state
    showLoadingIndicator();

    // Convert to base64 for analysis
    const base64Image = await fileToBase64(file);

    // Analyze the image
    const [visionResponse, metadataResponse] = await Promise.all([
      client.media.completions({
        messages: [{
          role: 'user',
          content: [
            { type: 'text', text: 'Describe this image in detail.' },
            { type: 'image_url', image_url: { url: base64Image } }
          ]
        }]
      }),
      client.media.analyze({
        media_url: base64Image,
        metadata: ['categories', 'tags', 'objects']
      })
    ]);

    // Display results
    displayAnalysisResults({
      description: visionResponse.choices[0].message.content,
      categories: metadataResponse.metadata?.categories,
      tags: metadataResponse.metadata?.tags,
      objects: metadataResponse.metadata?.objects
    });

  } catch (error) {
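    // `error` is `unknown` under strict TypeScript settings, so narrow it first
    const message = error instanceof Error ? error.message : String(error);
    showErrorMessage('Failed to analyze image: ' + message);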
  } finally {
    hideLoadingIndicator();
  }
}

Next Steps