Changelog

Every change across LLM API in one place: inference discounts, product updates, new and retired models.

Claude web search, new voice models, and model retirements
Claude can now search the web for real answers, Cartesia adds two new voice models, four new text models join the catalog, and a set of older models is retired or scheduled for retirement at the end of the year. Claude web search Claude models now correctly route to a backend that runs live web […]
New pricing: subscriptions with 10x token value and 50% PAYG volume discounts
Two new ways to pay launch today — AI for Coding, a flat monthly subscription that converts your fee into a leveraged token pool, and AI for Production, pay-as-you-go credits with automatic volume discounts as your usage grows. AI for Coding — subscription plans Nine frontier models, one token pool. Subscribe once and route your […]
Sign in with LLM API, ElevenLabs transcription & Bedrock API keys
Your coding tools can now connect to LLM API without asking you to copy-paste keys, ElevenLabs Scribe brings best-in-class transcription to the audio endpoint, and AWS Bedrock is now easier to set up. Sign in with LLM API External editors and coding assistants — starting with Kilo Code — can now connect to your LLM […]
Multi-tenant workspaces, batch jobs & dashboard improvements
Multi-tenant workspace support is live — run multiple customers out of a single LLM API organization, each fully isolated. This release also ships the batch jobs dashboard and dashboard UX improvements. Multi-tenant workspaces You can now carve your LLM API organization into isolated tenant spaces — each with its own credentials, plan, seat limits, and […]
Claude Opus 4.8, MiMo, guardrails log & model deprecation dates
Anthropic’s latest flagship arrives, Xiaomi MiMo joins the catalog, every guardrail evaluation gets a full audit log, and deprecation dates are now visible across all providers. Claude Opus 4.8 Anthropic’s latest flagship is now available via both the standard Anthropic endpoint and AWS Bedrock. Opus 4.8 supports a 1M token context window, vision, tool use, […]
Team invitations, provider fallback, new models & guardrail templates
Invite teammates by email, build automatic fallback chains between providers, add Azure Whisper STT plus four new models, and create guardrail rules from a template gallery. Team invitations Invite people to your organization by email; they don’t need to sign up first. Pending invitations appear in a dedicated table with role badges and a […]
Subscription plans, batch processing & API key management
Subscription plans land in the dashboard this week alongside a proper batch processing UI, long-requested API key search and sort, and reasoning cost visibility in logs. Batch processing Batch is now a first-class feature in the dashboard. Filter models by batch capability, toggle a Batch only view, and inspect batch metadata — status, counts, cost, […]
Streaming STT, Gemini TTS, guardrails & provider controls
Live streaming transcription, new audio providers, per-project content guardrails, and org-level provider blocking — a dense release week for audio and safety capabilities. Deepgram live streaming STT Deepgram now supports real-time streaming transcription via WebSocket — send audio as it’s recorded and get transcripts back word by word instead of waiting for the full file. […]
Bonuses, rewards & top-up improvements
A new onboarding experience, a rewards system for early contributors, and a revamped top-up flow ship together this week. Get Started page New users land on a structured Get Started page that links out to quick-start guides, docs, the blog, and community channels — and highlights recently added models. The fastest path from signup to […]
Video generation and multi-region routing
Async video generation is now available through the API and dashboard, and multi-region routing now has full documentation and UI exposure. Video generation Generate videos through LLM API using the same OpenAI-compatible interface you use for everything else. Jobs run asynchronously — submit a request, poll for completion, download the finished MP4. The dashboard surfaces […]
API key controls & dashboard polish
Tighter access controls on shared API keys and a visual polish pass across the dashboard. API key access controls Public API keys now support fine-grained scope control — administrators can limit exactly what a shared key can access. If a key is leaked, the blast radius is contained. Invalid or expired keys now return clear […]
GitHub sign-in, EvalLab, speech-to-text & low balance alerts
Sign in with GitHub lands alongside EvalLab — a tool for running model comparisons — plus a full speech-to-text interface, low balance email alerts, and new providers. GitHub sign-in Sign in and link your account with GitHub OAuth, alongside the existing Google and Apple options. EvalLab A new EvalLab section in the dashboard lets you […]
