Vision Completions
Basic Image Analysis
Ask questions about images using the vision completion API.
Multiple Images
You can analyze multiple images in a single request.
Base64 Images
You can also use base64-encoded images.
Media Analysis
Image Metadata Extraction
Extract structured metadata from images.
Video Analysis
Analyze videos with automatic polling for results.
Manual Status Checking
For more control over video analysis, you can check status manually.
Advanced Vision Features
Conversational Vision
Build conversational interfaces around images.
Vision with Custom Parameters
Fine-tune vision analysis with custom parameters.
Supported Media Types
Image Formats
- JPEG/JPG - Standard photo format
- PNG - Images with transparency
- WebP - Modern web format
- GIF - Animated images (first frame analyzed)
- BMP - Bitmap images
Video Formats
- MP4 - Most common video format
- AVI - Audio Video Interleave
- MOV - QuickTime format
- WebM - Web-optimized format
- MKV - Matroska format
Size Limitations
- Images: Maximum 10MB per image
- Videos: Maximum 100MB per video
- Resolution: Up to 4K (3840x2160) for optimal performance
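The format and size limits above can be checked on the client before any upload is attempted. The following is a minimal sketch, not part of the SDK: `validate_media` and `MediaValidationError` are hypothetical names, and it assumes 1MB means 1024 * 1024 bytes, which the limits above do not specify.

```python
from pathlib import Path

# Extensions and limits copied from the lists above.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".webp", ".gif", ".bmp"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm", ".mkv"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024   # assumption: 1MB = 1024 * 1024 bytes
MAX_VIDEO_BYTES = 100 * 1024 * 1024


class MediaValidationError(ValueError):
    """Hypothetical error type for media rejected before upload."""


def validate_media(filename: str, size_bytes: int) -> str:
    """Return "image" or "video" if the file passes both checks, else raise."""
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        kind, limit = "image", MAX_IMAGE_BYTES
    elif ext in VIDEO_EXTENSIONS:
        kind, limit = "video", MAX_VIDEO_BYTES
    else:
        raise MediaValidationError(f"unsupported media type: {filename}")
    if size_bytes > limit:
        raise MediaValidationError(
            f"{kind} is {size_bytes} bytes, above the {limit}-byte limit"
        )
    return kind
```

Rejecting a file locally avoids a round trip for input that the limits above already rule out.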
Error Handling
Handle media-specific errors gracefully.
Best Practices
Performance Optimization
Batch Processing
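Because several images can be analyzed in one request, a large set of images can be split into fixed-size batches and sent a few requests at a time. The sketch below illustrates the idea under stated assumptions: `analyze_batch` is a hypothetical callable standing in for whatever multi-image call your client exposes, and the default `batch_size` and `max_workers` values are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def analyze_in_batches(
    image_urls: Sequence[str],
    analyze_batch: Callable[[List[str]], List[str]],  # hypothetical: one request per batch
    batch_size: int = 4,
    max_workers: int = 2,
) -> List[str]:
    """Run `analyze_batch` over every batch and return results in input order."""
    results: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order the batches were submitted.
        for batch_result in pool.map(analyze_batch, chunked(image_urls, batch_size)):
            results.extend(batch_result)
    return results
```

Keeping `max_workers` small limits the number of concurrent requests, which matters if the service applies rate limits.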
Caching Results
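Analyzing the same image with the same prompt more than once is usually wasted work, so results can be cached under a key derived from both. This is a minimal in-memory sketch: `AnalysisCache` is a hypothetical helper, and `analyze` stands in for the real analysis call, whose signature is an assumption.

```python
import hashlib
from typing import Callable, Dict


class AnalysisCache:
    """Hypothetical in-memory cache keyed by image content and prompt."""

    def __init__(self, analyze: Callable[[bytes, str], str]) -> None:
        self._analyze = analyze            # assumption: takes (image bytes, prompt)
        self._results: Dict[str, str] = {}

    @staticmethod
    def _key(image_bytes: bytes, prompt: str) -> str:
        # Hash the image and the prompt separately so the key is unambiguous.
        image_digest = hashlib.sha256(image_bytes).hexdigest()
        prompt_digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        return f"{image_digest}:{prompt_digest}"

    def get(self, image_bytes: bytes, prompt: str) -> str:
        """Return the cached result, calling `analyze` only on a cache miss."""
        key = self._key(image_bytes, prompt)
        if key not in self._results:
            self._results[key] = self._analyze(image_bytes, prompt)
        return self._results[key]
```

Keying on a content hash rather than a filename means a renamed copy of an image still hits the cache. A production cache would also need an eviction policy, which this sketch omits.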
Integration Examples
Image Upload and Analysis
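An upload handler typically validates the file, encodes it, and passes it to the vision call together with a question. The sketch below shows that flow under stated assumptions: it assumes the API accepts base64 images as `data:` URLs, which is not confirmed here, and `ask` is a hypothetical callable standing in for the real vision completion call.

```python
import base64
from pathlib import Path
from typing import Callable

# MIME types for the image formats listed under Supported Media Types.
MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes


def to_data_url(filename: str, data: bytes) -> str:
    """Encode image bytes as a base64 data URL, rejecting unsupported input."""
    ext = Path(filename).suffix.lower()
    if ext not in MIME_TYPES:
        raise ValueError(f"unsupported image format: {filename}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"image exceeds {MAX_IMAGE_BYTES} bytes")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{MIME_TYPES[ext]};base64,{encoded}"


def handle_upload(
    filename: str,
    data: bytes,
    question: str,
    ask: Callable[[str, str], str],  # hypothetical: (image URL, question) -> answer
) -> str:
    """Validate and encode an uploaded image, then ask a question about it."""
    return ask(to_data_url(filename, data), question)
```

Passing the analysis call in as a parameter keeps the handler easy to test with a stub and independent of any particular client.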
Next Steps
Authentication Setup
Set up a secure token provider for media analysis
Tool Calling
Combine vision with function calling capabilities
Image Generation
Generate images automatically from conversations
Event System
Handle media analysis events in your application
