Back to changelog

Video generation and multi-region routing

Async video generation is now available through the API and dashboard, and multi-region routing now has full documentation and UI exposure.

Video generation

Generate videos through LLM API using the same OpenAI-compatible interface you use for everything else. Jobs run asynchronously — submit a request, poll for completion, download the finished MP4. The dashboard surfaces a video-specific model browser with capability badges and a cost calculator so you can see what a generation will cost before you run it.

Async job queue — Submit a request, get a job ID, poll for status — runs in the background so it doesn’t block.
MP4 download — Completed videos available for direct download from the dashboard or via the API response.
Video cost calculator — See expected cost per generation before you commit.
Per-org concurrency limits — Admins can cap the number of concurrent video jobs per org to keep costs predictable.

Multi-region routing

Pin any request to a specific geographic region by passing the X-LLMAPI-Region header — supported across Azure, AWS Bedrock, and Google Vertex. Useful for data residency compliance or cutting latency for users in a specific geography.

Back to changelog

Get Started

Cut your AI bill, not your usage

Route every request to the right model. Track every dollar you spend. Cut your LLM costs by up to 60%.