Powered by Google
Lyria 3 Clip Preview
- Music Generation
Lyria 3 Clip Preview is Google's preview music-generation model optimized for creating short, 30‑second musical clips, loops, and previews from text or image prompts.
About the model
What is Lyria 3 Clip Preview?
Lyria 3 Clip Preview is a Google music-generation model that produces high-quality short audio clips from text and image inputs. It is mainly used to generate 30-second music snippets, loops, and previews for creative, media, and sound design workflows. It also supports features like vocal or instrumental modes, user- or model-generated lyrics, and controls such as BPM and intensity to shape the resulting clip. It belongs to the Lyria 3 family of music-generation models, alongside Lyria 3 Pro, and is offered as a preview model via the Gemini API and related Google Cloud platforms.
Model capabilities
5 Core Capabilities
- Text-to-music
  Generates 30-second high-quality stereo music clips from detailed text prompts, including structure, style, mood, and instrumentation guidance.
- Image-to-music
  Creates musical clips and previews conditioned on input images, translating visual themes and scenes into coherent audio compositions.
- Vocal And Lyrics
  Supports vocal generation, lyric generation, and user-provided lyrics to produce clips with synchronized singing and musical phrasing.
- Musical Controls
  Offers BPM and intensity controls plus instrumental mode, enabling tailored rhythmic feel, energy, and arrangement for generated clips.
- Safety Filtering
  Applies input filtering, output recitation filtering, and vocal similarity filtering, alongside audio watermarking for safer music outputs.
Use cases
6 Most Valuable Use Cases
- Short Music Clips
- Loop Background Tracks
- Ad Jingle Generation
- Social Media Audio
- Game Sound Previews
- Image-To-Music Promos
Transparent pricing
Cost Comparison
Among the providers listed here, LLM API shows the lowest price and latency for Lyria 3 Clip-class models.
| Provider | Region | Latency | Throughput | Uptime | Input price | Output price | Context |
|---|---|---|---|---|---|---|---|
| | Global | ~550ms | ~8 img/s | 99.9% | ~$1.20/1K images | $0.00 | ~10 min video or 20 images |
| Vertex AI (Google Cloud) | US East | ~480ms | ~10 img/s | 99.9% | ~$1.30/1K images | $0.00 | ~10 min video or 20 images |
| Replicate | US West | ~750ms | ~5 img/s | 99.5% | ~$1.80/1K images | $0.00 | ~8 min video or 16 images |
| LLM API (best) | Global | 180ms | 20 img/s | 99.99% | $0.80/1K images | $0.00 | 12 min video or 32 images |
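As a rough illustration of how the comparison above could be used programmatically, the sketch below encodes the four table rows and picks the provider with the lowest listed input price, breaking ties on latency. `ProviderQuote` and `cheapest` are hypothetical names chosen for this example and are not part of any LLM API SDK. The figures are copied from the table with the approximate (~) markers dropped, and the first row is labelled as unnamed because its provider name is missing in the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProviderQuote:
    """One row of the cost comparison table (hypothetical structure)."""
    provider: str
    region: str
    latency_ms: float
    price_per_1k: float  # USD per 1K units, as listed in the table


# Values copied from the table; "~" approximations are treated as exact here.
QUOTES = [
    ProviderQuote("(unnamed in source)", "Global", 550, 1.20),
    ProviderQuote("Vertex AI (Google Cloud)", "US East", 480, 1.30),
    ProviderQuote("Replicate", "US West", 750, 1.80),
    ProviderQuote("LLM API", "Global", 180, 0.80),
]


def cheapest(quotes: Sequence[ProviderQuote],
             max_latency_ms: Optional[float] = None) -> ProviderQuote:
    """Return the lowest-priced quote, optionally within a latency budget."""
    candidates = [
        q for q in quotes
        if max_latency_ms is None or q.latency_ms <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("no provider meets the latency budget")
    # Sort key: price first, then latency as the tie-breaker.
    return min(candidates, key=lambda q: (q.price_per_1k, q.latency_ms))


best = cheapest(QUOTES)
print(best.provider, best.price_per_1k, best.latency_ms)
```

Passing `max_latency_ms` narrows the candidates before the price comparison, which is one simple way to express a cost-versus-latency trade-off.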
Performance benchmarks
Technical Specifications
| Metric | Lyria 3 Clip Preview (Google) | CLIP ViT-L/14 (OpenAI) | SigLIP Large Patch16-384 (Google) |
|---|---|---|---|
| Avg Latency | ~800ms | ~700ms | ~900ms |
| Context Window | 128K | 128K | 200K |
| Input Price ($/1M tokens) | ~$0.20 | $0.15 | $0.25 |
| Output Price ($/1M tokens) | ~$0.60 | $0.60 | $1.25 |
| Max Output Tokens | 8K | 16K | 4K |
| Throughput | ~80 tps | ~100 tps | ~60 tps |
| Uptime | ~99.9% | ~99.9% | ~99.9% |
30-day usage via LLM API
- 620M prompt tokens processed (30 days)
- 7.5M completion tokens generated (30 days)
- 410K API requests served (30 days)
- 98.9% avg uptime over last 30 days
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
- Unified AI Routing
  Dynamically route each request to the optimal model across providers based on latency, cost, and performance, without changing your code or integrations.
  One endpoint, every model
- Cost-Aware Orchestration
  Optimize spend by automatically choosing cheaper equivalents, downgrading when quality allows, and enforcing per-project budgets with centralized cost controls and analytics.
  Max performance, minimal spend
- Resilient Fallback Logic
  Eliminate single-vendor downtime with automatic failover to backup models and providers, using configurable rules, health checks, and graceful degradation strategies.
  Stay online, even when APIs fail
- Deep LLM Observability
  Trace every call across providers with logs, metrics, and structured events to debug prompts, track latency, and understand model behavior in production.
  See every token, every hop
- Task-Level Abstractions
  Describe tasks (chat, generation, extraction, tools) once and let LLM.API select and configure the right models, prompts, and parameters for each use case.
  Think in tasks, not models
- High-Throughput Batch Jobs
  Run large-scale batch inference with automatic chunking, concurrency control, retries, and progress tracking designed for data pipelines and offline processing.
  Ship millions of calls safely
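The fallback behavior described in this list can be pictured with a small client-side sketch. This is not LLM.API's actual implementation, which the page does not document; `call_with_fallback`, `AllProvidersFailed`, and the two stub providers are hypothetical names used only to show the general pattern of trying providers in order, retrying each a few times, and failing over when one stays down.

```python
import time
from typing import Any, Callable, List, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes a request dict.
Provider = Tuple[str, Callable[[dict], Any]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has been exhausted."""


def call_with_fallback(providers: Sequence[Provider],
                       request: dict,
                       attempts_per_provider: int = 2,
                       backoff_s: float = 0.0) -> Tuple[str, Any]:
    """Try each provider in order and return (provider_name, result)."""
    errors: List[str] = []
    for name, call in providers:
        for attempt in range(attempts_per_provider):
            try:
                return name, call(request)
            except Exception as exc:  # deliberately broad in this sketch
                errors.append(f"{name} (attempt {attempt + 1}): {exc}")
                if backoff_s > 0:
                    # Exponential backoff between retries.
                    time.sleep(backoff_s * 2 ** attempt)
    raise AllProvidersFailed("; ".join(errors))


# Stub providers standing in for real network calls (assumptions).
def flaky_primary(request: dict) -> dict:
    raise ConnectionError("primary unavailable")


def healthy_backup(request: dict) -> dict:
    return {"status": "ok", "echo": request["prompt"]}


used, result = call_with_fallback(
    [("primary", flaky_primary), ("backup", healthy_backup)],
    {"prompt": "lo-fi loop, 80 BPM"},
)
print(used, result["status"])
```

A production setup would usually catch only transient error types and add health checks, but the ordering-plus-retry loop is the core of the idea.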
Decision guide
When to Use — When NOT to Use
Use it if...
- You need to generate short, engaging music clips from text or image prompts.
- Your use case involves rapid prototyping of social media audio and preview tracks.
- You need an automated way to create jingles or audio teasers for marketing or product launches.
- Your use case involves experimenting with AI-generated musical ideas or concept clips.
- You need a model to help non-experts quickly draft variations of music content.
- Your use case involves embedding clip generation into creative tools or content pipelines.
Avoid if...
- You need advanced text understanding, reasoning, or general-purpose conversational intelligence.
- Your workload requires high-precision image classification or detailed computer vision analytics.
- You need long-form, coherent music generation beyond 30-second clips.
- Your workload requires strict, fine-grained control over the arrangement beyond the BPM, intensity, and vocal or instrumental settings.
- You need domain-specific scientific visualization or technical diagram generation with guaranteed accuracy.
- Your workload requires on-device or fully offline inference without cloud dependencies.
FAQ
Frequently Asked Questions
- What is Lyria 3 Clip Preview?
  Lyria 3 Clip Preview is a Google model for generating short, 30-second music clips from text or image prompts, accessible through the unified LLM.API gateway.
- What is Lyria 3 Clip Preview best suited for?
  It is best for quickly prototyping and previewing short music ideas, loops, and jingles directly from text or image prompts.
- What modalities does Lyria 3 Clip Preview support?
  Lyria 3 Clip Preview accepts text and image prompts and outputs short stereo audio clips, optionally with vocals and lyrics.
- How is Lyria 3 Clip Preview priced on LLM.API?
  Pricing is usage-based per generated clip or per second of generated audio, with exact rates defined in your LLM.API pricing dashboard.
- What is the context window or prompt size for Lyria 3 Clip Preview?
  The model supports moderately long text prompts, typically a few paragraphs, but you should avoid excessively long prompts such as full section-by-section scripts.
- How fast is Lyria 3 Clip Preview in terms of latency?
  Latency depends on clip length and load, but you should expect generation to take several seconds up to a couple of minutes per request.
- How do I call Lyria 3 Clip Preview via LLM.API?
  Use the standard LLM.API generation endpoint, specifying the Google provider and the "Lyria 3 Clip Preview" model name in your request payload.
- How does Lyria 3 Clip Preview compare to other music models on LLM.API?
  It emphasizes quick, preview-quality clips for ideation rather than the long-form, high-fidelity output of heavier music-generation models.
- What are the main limitations of Lyria 3 Clip Preview?
  Limitations include the short clip duration, preview-level audio quality, occasional audio artifacts, and imperfect adherence to very detailed or complex prompt instructions.
- Can I use Lyria 3 Clip Preview for production-ready marketing audio?
  It is better suited for concepting and iteration; production workflows typically require post-processing or higher-fidelity models for final output.
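For the question above on calling the model through LLM.API, a request payload might be assembled roughly as follows. The page only states that the payload names the Google provider and the "Lyria 3 Clip Preview" model, so everything else here is an assumption: the `ClipRequest` class, the `provider`/`model`/`input` keys, the `"google"` identifier, and the `bpm`, `intensity`, `instrumental`, and `lyrics` field names are hypothetical and should be checked against the actual LLM.API reference. The sketch only builds and prints the payload; it does not send a request.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClipRequest:
    """Hypothetical request description; field names are assumptions."""
    prompt: str
    bpm: Optional[int] = None
    intensity: Optional[float] = None
    instrumental: bool = False
    lyrics: Optional[str] = None

    def to_payload(self) -> dict:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.instrumental and self.lyrics:
            raise ValueError("instrumental clips cannot carry lyrics")
        payload = {
            "provider": "google",             # assumed provider identifier
            "model": "Lyria 3 Clip Preview",  # model name as given on this page
            "input": {
                "prompt": self.prompt,
                "instrumental": self.instrumental,
            },
        }
        # Only include optional controls that were explicitly set.
        for key in ("bpm", "intensity", "lyrics"):
            value = getattr(self, key)
            if value is not None:
                payload["input"][key] = value
        return payload


payload = ClipRequest(
    prompt="warm lo-fi loop with soft piano",
    bpm=80,
    intensity=0.4,
    instrumental=True,
).to_payload()
print(json.dumps(payload, indent=2))
```

Keeping validation in `to_payload` means conflicting options, such as lyrics on an instrumental clip, are caught before any network call is made.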
