Video generation and multi-region routing

Async video generation is now available through the API and dashboard, and multi-region routing now has full documentation and UI exposure.

Video generation

Generate videos through LLM API using the same OpenAI-compatible interface you use for everything else. Jobs run asynchronously — submit a request, poll for completion, download the finished MP4. The dashboard surfaces a video-specific model browser with capability badges and a cost calculator so you can see what a generation will cost before you run it.

– Async job queue — Submit a request, get a job ID, poll for status — runs in the background so it doesn’t block.
– MP4 download — Completed videos available for direct download from the dashboard or via the API response.
– Video cost calculator — See expected cost per generation before you commit.
– Per-org concurrency limits — Admins can cap the number of concurrent video jobs per org to keep costs predictable.

Multi-region routing

Pin any request to a specific geographic region by passing the X-LLMAPI-Region header — supported across Azure, AWS Bedrock, and Google Vertex. Useful for data residency compliance or cutting latency for users in a specific geography.

Cut your AI bill, not your usage

Route every request to the right model. Track every dollar you spend. Cut your LLM costs by up to 60%.