Powered by Amazon
Nova Premier 1.0
- Instruction Following
Nova Premier 1.0 is Amazon’s most capable multimodal Nova-family model, optimized for complex reasoning with a very large 1M-token context window.
About the model
What is Nova Premier 1.0?
Nova Premier 1.0 is a proprietary multimodal large language model from Amazon that processes text, images, and video with high-accuracy reasoning. It is primarily used for enterprise scenarios that require deep, multi-step reasoning over long contexts such as large document analysis, complex codebases, or multimodal business workflows. It is also widely used as a high-end “teacher” model for distilling smaller, task-specialized Nova variants to reduce latency and cost in production deployments. Nova Premier 1.0 belongs to Amazon’s Nova model family, sitting above Nova Pro, Lite, and Micro in capability and serving as the top-tier foundation model in that lineup.
Model capabilities
5 Core Capabilities
-
Advanced Chat
Supports multi-turn conversational assistants, reasoning over context, and following complex instructions for customer support, analytics, and applications.
-
Text Translation
Translates text between multiple languages, enabling cross-lingual understanding for global applications and multilingual user experiences.
-
Image Understanding
Analyzes user-provided images, recognizing objects, scenes, and visual attributes to support multimodal assistant and search scenarios.
-
Screen Monitoring
Processes visual content from interfaces or dashboards to extract insights, summarize on-screen information, or support monitoring workflows.
-
Document OCR
Extracts machine-readable text from images of documents, enabling downstream search, analysis, and data processing pipelines.
Use cases
6 Most Valuable Use Cases
- Long Document Analysis
- Complex Code Review
- Video Content Summaries
- Agentic Workflow Orchestration
- Enterprise Knowledge Search
- Custom Model Distillation
Transparent pricing
Cost Comparison
LLM API offers the lowest cost and latency for Nova Premier–class models.
| Provider | Region | Latency | Throughput | Uptime | Input ($/1M) | Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
| LLM API BEST | Global | 80ms | 120 tps | 99.995% | $0.40 | $0.80 | 256K |
| Amazon (Nova Premier 1.0) | US East | ~140ms | ~70 tps | 99.9% | ~$0.70 | ~$1.60 | ~200K |
| OpenAI (o3-mini equivalent) | Global | ~120ms | ~80 tps | 99.9% | ~$0.60 | ~$1.20 | ~200K |
| Google (Gemini 1.5 Pro equivalent) | Global | ~150ms | ~60 tps | 99.9% | ~$0.80 | ~$1.80 | ~128K |
| Anthropic (Claude 3.5 Sonnet equivalent) | Global | ~130ms | ~75 tps | 99.9% | ~$0.65 | ~$1.50 | ~200K |
Performance benchmarks
Technical Specifications
| Metric | Nova Premier 1.0 | Anthropic Claude 3 Sonnet | OpenAI GPT-4.1 |
|---|---|---|---|
| Avg Latency | ~180ms | ~220ms | ~250ms |
| Context Window | 128K | 200K | 128K |
| Input Price ($/1M) | $0.80 | $3.00 | $5.00 |
| Output Price ($/1M) | $2.40 | $15.00 | $15.00 |
| Max Output Tokens | 8K | 8K | 8K |
| Throughput | ~120 tps | ~80 tps | ~100 tps |
| Uptime | 99.9% | 99.9% | 99.9% |
30-day usage via LLM API
- 38B
- Prompt tokens processed (last 30 days)
- 24B
- Completion tokens generated (last 30 days)
- 12.5M
- API requests served (last 30 days)
- 99.8%
- Average API uptime (last 30 days)
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
-
Unified AI Routing
Automatically route each request to the optimal model across providers based on latency, cost, and quality—without changing your integration or client code.
One endpoint, any model -
Cost-Aware Orchestration
Balance speed and quality against budget with per-request cost controls, dynamic model selection, and clear usage visibility so you never overspend unintentionally.
Control cost per call -
Automatic Fallback Logic
Handle provider outages and rate limits gracefully with built-in failover to backup models, keeping your production workloads resilient without custom retry code.
Stay online by default -
Deep Observability
Get end-to-end traces, latency and error metrics, and request logs across all providers in one place to debug faster and confidently run AI in production.
See every token flow -
Task-Level Abstractions
Describe intent—chat, classify, extract, generate—and let LLM.API pick and tune the right models, simplifying prompts and speeding up feature development.
Code to tasks, not models -
High-Throughput Batch
Ship massive workloads with parallelized, provider-optimized batch execution, automatic retries, and structured outputs for analytics, backfills, and offline pipelines.
Scale to millions of calls
Decision guide
When to Use — When NOT to Use
Use it if...
- You need a managed, general-purpose LLM tightly integrated with the AWS ecosystem and tooling.
- You need strong instruction-following and conversation quality for typical customer support chatbots.
- You need a capable model for code assistance, refactoring, and common programming questions.
- Your use case involves multi-turn business workflows that benefit from reliable tool-calling orchestration.
- Your use case involves summarizing, classifying, or extracting data from moderate-length business documents.
- You need enterprise features like AWS security, IAM integration, and regional data residency controls.
Avoid if...
- You need frontier-level reasoning or creativity comparable to the very latest state-of-the-art models.
- You need extremely long context handling for hundreds of pages or entire codebases at once.
- Your workload requires ultra-low-cost, low-intelligence text generation for simple pattern-based outputs.
- You need cutting-edge multimodal capabilities like advanced vision, audio, or video understanding in one model.
- Your workload requires fine-grained control over model weights or on-prem deployment beyond AWS-managed options.
- You need guaranteed, audited domain-specific performance where a specialized industry model already exists.
FAQ
Frequently Asked Questions
-
What is Nova Premier 1.0?
Nova Premier 1.0 is an Amazon large language model focused on high‑quality general text generation and reasoning for enterprise and developer workloads.
-
What is Nova Premier 1.0 best suited for?
Nova Premier 1.0 is best for chatbots, agents, code assistance, and knowledge-heavy applications requiring strong reasoning over long business or technical documents.
-
What is the context window of Nova Premier 1.0 on LLM.API?
Nova Premier 1.0 supports up to a 32K token context window when accessed through LLM.API.
-
How fast is Nova Premier 1.0 when called through LLM.API?
Typical end-to-end latency is usually under a few seconds for short prompts, depending on prompt size, output length, and network conditions.
-
What modalities does Nova Premier 1.0 support via LLM.API?
Nova Premier 1.0 supports text input and text output; it does not natively process images, audio, or video through LLM.API.
-
How is Nova Premier 1.0 priced on LLM.API?
Nova Premier 1.0 uses a pay-as-you-go per-token pricing model on LLM.API, with separate rates for prompt tokens and completion tokens.
-
How do I call Nova Premier 1.0 using the LLM.API endpoint?
You select the Amazon provider and the Nova Premier 1.0 model identifier in your LLM.API request while keeping the same unified API schema as other models.
-
How does Nova Premier 1.0 compare to similar large models on LLM.API?
Compared with similar general-purpose models, Nova Premier 1.0 targets strong reasoning quality at competitive cost, rather than bleeding-edge multimodal capabilities.
-
What are key limitations of Nova Premier 1.0?
Nova Premier 1.0 can hallucinate, lacks real-time knowledge, cannot browse the web itself, and should not be used for unsupervised high-risk decisions.
-
Does Nova Premier 1.0 support function calling or tool use via LLM.API?
Nova Premier 1.0 can be orchestrated with tools using LLM.API’s function-calling style schemas, but tool execution must be implemented in your application backend.
