Async video generation is now available through the API and dashboard, and multi-region routing is now fully documented and exposed in the UI.
Video generation
Generate videos through LLM API using the same OpenAI-compatible interface you use for everything else. Jobs run asynchronously: submit a request, poll for completion, and download the finished MP4. The dashboard surfaces a video-specific model browser with capability badges and a cost calculator, so you can see what a generation will cost before you run it.
– Async job queue — Submit a request, get a job ID, and poll for status. The job runs in the background, so the request doesn’t block.
– MP4 download — Completed videos are available for direct download from the dashboard or via the API response.
– Video cost calculator — See expected cost per generation before you commit.
– Per-org concurrency limits — Admins can cap the number of concurrent video jobs per org to keep costs predictable.
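The submit-then-poll flow might look roughly like the sketch below. It is not the documented client: the status values ("completed", "failed"), the "error" field, and the `get_status` callable are assumptions for illustration, and the actual endpoint and response fields should be taken from the API reference. The HTTP call is left to the caller so the loop itself stays independent of any particular route.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    get_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll a video job until it finishes, fails, or the timeout elapses.

    `get_status` is a hypothetical caller-supplied function that fetches the
    job record for `job_id` and returns it as a dict. The "status" values
    checked here are assumed, not taken from the API documentation.
    """
    deadline = clock() + timeout
    while True:
        job = get_status(job_id)
        state = job.get("status")
        if state == "completed":  # assumed terminal success value
            return job
        if state == "failed":  # assumed terminal failure value
            raise RuntimeError(f"video job {job_id} failed: {job.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"video job {job_id} still {state!r} after {timeout}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays. In production the defaults apply, and `get_status` would wrap whatever authenticated GET request returns the job record.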
Multi-region routing
Pin any request to a specific geographic region by passing the X-LLMAPI-Region header, which is supported across Azure, AWS Bedrock, and Google Vertex. This is useful for data residency compliance or for cutting latency for users in a specific geography.
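As a minimal sketch, region pinning amounts to adding one header to an otherwise ordinary request. Only the header name X-LLMAPI-Region comes from this note. The Bearer-token Authorization header is assumed from the OpenAI-compatible convention, the helper name is hypothetical, and the region value in the usage note is a placeholder, since the accepted region identifiers are not listed here.

```python
from typing import Dict, Optional


def build_headers(api_key: str, region: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, optionally pinning the request to a region.

    Assumes Bearer-token auth (OpenAI-compatible convention). The region
    string is passed through unchanged; valid values are not verified here.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if region:
        # Header name as given in the release note.
        headers["X-LLMAPI-Region"] = region
    return headers
```

The returned dict can be handed to whichever HTTP client you already use, for example `build_headers(key, region="<region-id>")`. Omitting `region` leaves the request unpinned.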
