Important: Streaming is automatically disabled when autoTurn (Conversational Turns) is enabled. The SDK will use non-streaming mode to properly handle response splitting and natural conversation flow. Choose either streaming OR conversational turns based on your use case.
Basic Streaming
Using AsyncIterable Pattern
Stream responses using the modern AsyncIterable pattern:
```typescript
import { AnimusClient } from 'animus-client';

const client = new AnimusClient({
  tokenProviderUrl: 'https://your-backend.com/api/get-animus-token',
  chat: {
    model: 'vivian-llama3.1-70b-1.0-fp8',
    systemMessage: 'You are a helpful assistant.',
    // Note: autoTurn must be false or undefined for streaming to work
    autoTurn: false
  }
});

try {
  // Enable streaming in the request
  const stream = await client.chat.completions({
    messages: [
      { role: 'user', content: 'Write a short story about a robot learning to paint.' }
    ],
    stream: true
  });

  let fullContent = '';

  // Process each chunk as it arrives
  for await (const chunk of stream) {
    const delta = chunk.choices?.[0]?.delta?.content || '';
    fullContent += delta;

    // Update UI incrementally
    updateChatDisplay(fullContent);
    console.log('Streaming:', delta);
  }

  console.log('Stream complete. Final content:', fullContent);
} catch (error) {
  console.error('Streaming error:', error);
}
```
Real-time UI Updates
Here’s how to implement streaming in a web application:
```typescript
// HTML element to display the streaming response
const responseElement = document.getElementById('ai-response')!;

async function streamResponse(userMessage: string) {
  try {
    const stream = await client.chat.completions({
      messages: [{ role: 'user', content: userMessage }],
      stream: true,
      temperature: 0.7
    });

    // Clear previous content
    responseElement.textContent = '';
    let accumulatedText = '';

    for await (const chunk of stream) {
      const delta = chunk.choices?.[0]?.delta?.content || '';
      if (delta) {
        accumulatedText += delta;
        responseElement.textContent = accumulatedText;
        // Auto-scroll to bottom
        responseElement.scrollTop = responseElement.scrollHeight;
      }
    }

    console.log('Streaming complete');
  } catch (error) {
    responseElement.textContent =
      'Error: ' + (error instanceof Error ? error.message : String(error));
  }
}
```
Advanced Streaming Features
Streaming with Reasoning
When reasoning is enabled, thinking content appears directly in the stream:
```typescript
const stream = await client.chat.completions({
  messages: [{ role: 'user', content: 'Solve this math problem: 2x + 5 = 15' }],
  stream: true,
  reasoning: true // Include model's thinking process
});

let thinkingContent = '';
let responseContent = '';
let inThinkingBlock = false;

// Simplified parser: assumes each <think> / </think> tag arrives whole
// within a single chunk
for await (const chunk of stream) {
  const delta = chunk.choices?.[0]?.delta?.content || '';

  // Parse thinking blocks in real-time
  if (delta.includes('<think>')) {
    inThinkingBlock = true;
  }

  if (inThinkingBlock) {
    thinkingContent += delta;
    updateThinkingDisplay(thinkingContent);
  } else {
    responseContent += delta;
    updateResponseDisplay(responseContent);
  }

  if (delta.includes('</think>')) {
    inThinkingBlock = false;
  }
}
```
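The loop above assumes each `<think>` / `</think>` tag arrives whole within a single chunk. If tags can be split across chunk boundaries, a small stateful parser is safer. `ThinkTagParser` below is an illustrative sketch, not part of the SDK:

```typescript
// Hypothetical helper: splits streamed deltas into "thinking" and "response"
// text, handling <think> / </think> tags split across chunk boundaries.
class ThinkTagParser {
  private pending = '';     // carry-over for a possibly partial tag
  private inThinking = false;

  feed(delta: string): { thinking: string; response: string } {
    let text = this.pending + delta;
    this.pending = '';
    let thinking = '';
    let response = '';

    while (text.length > 0) {
      const tag = this.inThinking ? '</think>' : '<think>';
      const idx = text.indexOf(tag);
      if (idx !== -1) {
        const before = text.slice(0, idx);
        if (this.inThinking) thinking += before; else response += before;
        this.inThinking = !this.inThinking;
        text = text.slice(idx + tag.length);
      } else {
        // Hold back a possible partial tag at the end for the next chunk
        const keep = partialTagLength(text, tag);
        const emit = text.slice(0, text.length - keep);
        this.pending = text.slice(text.length - keep);
        if (this.inThinking) thinking += emit; else response += emit;
        text = '';
      }
    }
    return { thinking, response };
  }
}

// Length of the longest proper prefix of `tag` that `text` ends with
function partialTagLength(text: string, tag: string): number {
  const max = Math.min(tag.length - 1, text.length);
  for (let len = max; len > 0; len--) {
    if (text.endsWith(tag.slice(0, len))) return len;
  }
  return 0;
}
```

Feed each delta to `feed()` and append the returned pieces to your thinking and response displays respectively.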
Streaming with Custom Processing
You can implement custom processing for different types of content:
```typescript
class StreamProcessor {
  private buffer = '';
  private onText: (text: string) => void;
  private onCode: (code: string, language: string) => void;

  constructor(
    onText: (text: string) => void,
    onCode: (code: string, language: string) => void
  ) {
    this.onText = onText;
    this.onCode = onCode;
  }

  processChunk(delta: string) {
    this.buffer += delta;

    // Detect completed code blocks in the accumulated buffer
    const codeBlockRegex = /```(\w+)?\n([\s\S]*?)```/g;
    let match;
    while ((match = codeBlockRegex.exec(this.buffer)) !== null) {
      const language = match[1] || 'text';
      const code = match[2];
      this.onCode(code, language);
    }

    // Process regular text (the buffer with completed code blocks removed).
    // Note: callbacks receive the cumulative content so far, not just the
    // newest delta, so they should replace rather than append in the UI.
    const textContent = this.buffer.replace(codeBlockRegex, '');
    if (textContent.trim()) {
      this.onText(textContent);
    }
  }
}

// Usage
const processor = new StreamProcessor(
  (text) => updateTextDisplay(text),
  (code, lang) => updateCodeDisplay(code, lang)
);

for await (const chunk of stream) {
  const delta = chunk.choices?.[0]?.delta?.content || '';
  processor.processChunk(delta);
}
```
Streaming Limitations
Compatibility with Other Features
Streaming is NOT compatible with:
- autoTurn (Conversational Turns): the SDK automatically disables streaming when autoTurn is enabled
- Compliance checks: content moderation is not available for streaming responses

Streaming IS compatible with:
- reasoning: thinking content appears in the stream
- check_image_generation: image prompts can be detected in streaming responses
- All standard chat parameters (temperature, max_tokens, etc.)
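Because autoTurn and streaming are mutually exclusive, it can help to make the choice explicit in one place in your code. `resolveStreaming` below is a hypothetical helper that mirrors the SDK's stated rule; it is not an SDK API:

```typescript
interface StreamingOptions {
  autoTurn?: boolean;
  stream?: boolean;
}

// Mirrors the documented rule: autoTurn forces non-streaming mode.
function resolveStreaming(opts: StreamingOptions): boolean {
  if (opts.autoTurn) {
    return false; // the SDK disables streaming when autoTurn is enabled
  }
  return opts.stream ?? false;
}
```

For example, `resolveStreaming({ autoTurn: true, stream: true })` returns `false`, matching the behavior described above.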
Error Handling
Implement robust error handling for streaming:
```typescript
async function safeStreaming(userMessage: string) {
  try {
    const stream = await client.chat.completions({
      messages: [{ role: 'user', content: userMessage }],
      stream: true
    });

    for await (const chunk of stream) {
      try {
        const delta = chunk.choices?.[0]?.delta?.content || '';
        // Process chunk safely
        if (delta) {
          processStreamChunk(delta);
        }
      } catch (chunkError) {
        console.error('Error processing chunk:', chunkError);
        // Continue processing other chunks
      }
    }
  } catch (streamError) {
    console.error('Stream initialization error:', streamError);
    // Fall back to non-streaming
    const response = await client.chat.completions({
      messages: [{ role: 'user', content: userMessage }],
      stream: false
    });
    displayFinalResponse(response.choices[0].message.content);
  }
}
```
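Beyond a one-shot fallback, transient network errors are often worth retrying before giving up on streaming. The backoff schedule below is a generic sketch (the base and cap values are arbitrary choices, not SDK defaults):

```typescript
// Exponential backoff delays in milliseconds: baseMs * 2^attempt, capped at maxMs.
function backoffDelays(attempts: number, baseMs = 500, maxMs = 8000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}
```

`backoffDelays(4)` yields `[500, 1000, 2000, 4000]`; wait each delay in turn before re-attempting the streaming request, then fall back to non-streaming as shown above.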
Throttling Updates
For better performance, throttle UI updates:
```typescript
class ThrottledDisplay {
  private updateQueue = '';
  private isUpdating = false;
  private element: HTMLElement;

  constructor(element: HTMLElement) {
    this.element = element;
  }

  addContent(delta: string) {
    this.updateQueue += delta;
    if (!this.isUpdating) {
      this.isUpdating = true;
      // Batch DOM writes to at most one per animation frame
      requestAnimationFrame(() => {
        this.element.textContent += this.updateQueue;
        this.updateQueue = '';
        this.isUpdating = false;
      });
    }
  }
}

// Usage
const display = new ThrottledDisplay(document.getElementById('response')!);

for await (const chunk of stream) {
  const delta = chunk.choices?.[0]?.delta?.content || '';
  display.addContent(delta);
}
```
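The DOM-free core of this pattern is a coalescing queue: deltas accumulate into one pending string, and each animation frame drains whatever has arrived since the last flush. Separating it out makes the batching logic easy to unit-test (this class is an illustration, not an SDK type):

```typescript
// Coalesces many small deltas into a single string per flush.
class CoalescingQueue {
  private pending = '';

  add(delta: string) {
    this.pending += delta;
  }

  // Called once per animation frame: returns and clears all queued text.
  flush(): string {
    const out = this.pending;
    this.pending = '';
    return out;
  }
}
```

In the browser, `ThrottledDisplay` above is this queue plus a `requestAnimationFrame` callback that appends `flush()`'s result to the element.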
Buffering Strategy
Implement smart buffering for smoother display:
```typescript
class StreamBuffer {
  private buffer: string[] = [];
  private displayInterval: ReturnType<typeof setInterval>;
  private onDisplay: (text: string) => void;

  constructor(onDisplay: (text: string) => void, intervalMs = 50) {
    this.onDisplay = onDisplay;
    this.displayInterval = setInterval(() => {
      if (this.buffer.length > 0) {
        this.onDisplay(this.buffer.shift()!);
      }
    }, intervalMs);
  }

  addChunk(delta: string) {
    this.buffer.push(delta);
  }

  finish() {
    // Flush the remaining buffer immediately
    while (this.buffer.length > 0) {
      this.onDisplay(this.buffer.shift()!);
    }
    clearInterval(this.displayInterval);
  }
}
```
Best Practices
When to Use Streaming vs Conversational Turns
Use Streaming for:
- Long-form content generation (stories, articles, explanations)
- Real-time code generation
- When you want immediate character-by-character display
- Applications where you control the entire response flow

Use Conversational Turns for:
- Natural chat interfaces that mimic human conversation
- When you want automatic response splitting
- Applications that benefit from realistic typing delays
- Chat apps where the AI should feel more human-like
UI/UX Considerations
- Visual feedback: show a typing indicator or cursor
- Graceful degradation: fall back to non-streaming on errors
- Performance: throttle updates for smooth rendering
- Accessibility: announce streaming status to screen readers
```typescript
// Example with typing indicator
function showTypingIndicator() {
  const indicator = document.createElement('span');
  indicator.textContent = '▋';
  indicator.className = 'typing-cursor';
  responseElement.appendChild(indicator);
}

function hideTypingIndicator() {
  const cursor = responseElement.querySelector('.typing-cursor');
  if (cursor) cursor.remove();
}
```
Next Steps
- Media & Vision: add image and video analysis to your streaming responses
- Auto-Turn Conversations: learn about conversational turns (an alternative to streaming)
- Authentication Setup: secure your streaming implementation
- Event System: comprehensive event handling for complex applications