Powered by Google
Nano Banana Pro (Gemini 3 Pro Image Preview)
- Vision-Language
Nano Banana Pro (Gemini 3 Pro Image Preview) is Google’s preview-stage image generation and editing model built on the Gemini 3 Pro family, optimized for complex, multi-turn visual creation tasks.
About the model
What is Nano Banana Pro (Gemini 3 Pro Image Preview)?
Nano Banana Pro (Gemini 3 Pro Image Preview) is a proprietary Google Gemini 3 image model that generates and edits images from text prompts and reference images. It is mainly used for high-quality, multi-step image creation and editing workflows, such as photorealistic rendering, design mockups, and creative compositing. It also supports multimodal use cases where text and images are combined, leveraging a context window of around 65k tokens for detailed, instruction-heavy prompts. The model belongs to the Gemini 3 family and is a preview/legacy variant of the Gemini 3 Pro Image line, with newer Nano Banana Pro image models recommended for new integrations.
Model capabilities
5 Core Capabilities
-
Natural Conversation
Engages in multi-turn dialogue, answering questions and following instructions with context awareness across a wide range of topics.
-
Image Interpretation
Analyzes user-provided images to recognize objects, scenes, and visual relationships, supporting grounded reasoning about visual content.
-
Text Translation
Translates text between multiple languages, preserving meaning and tone for everyday communication and informational content.
-
Visual Text Extraction
Extracts readable text from images, enabling recognition of signs, documents, labels, and other embedded text in pictures.
-
Tool Integration
Coordinates with external tools or systems, using model outputs to support monitoring, analysis, or automated workflows.
Use cases
6 Most Valuable Use Cases
- Mobile Vision Inference
- On-device Image Captioning
- AR Object Detection
- Robotics Scene Understanding
- Smart Camera Automation
- Privacy-preserving Analytics
Transparent pricing
Cost Comparison
LLM API offers the lowest effective cost and latency for Nano Banana Pro–class vision models.
| Provider | Region | Latency | Throughput | Uptime | Input ($/1M) | Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
| LLM API BEST | Global | ~220ms | ~80 img/min | 99.99% | $0.15/1K tokens + $0.002/img | $0.45/1K tokens | ~256K tokens + 20 img/req |
| Global | ~300ms | ~60 img/min | 99.9% | ~$0.20/1K tokens + ~$0.004/img | ~$0.60/1K tokens | ~128K tokens + 16 img/req | |
| Azure | US East | ~340ms | ~55 img/min | 99.9% | ~$0.22/1K tokens + ~$0.0045/img | ~$0.65/1K tokens | ~128K tokens + 16 img/req |
| Anthropic | US West | ~360ms | ~50 img/min | 99.9% | ~$0.24/1K tokens + ~$0.005/img | ~$0.70/1K tokens | ~200K tokens + 12 img/req |
| OpenAI | Global | ~330ms | ~55 img/min | 99.9% | ~$0.25/1K tokens + ~$0.005/img | ~$0.75/1K tokens | ~128K tokens + 10 img/req |
Performance benchmarks
Technical Specifications
| Metric | Nano Banana Pro (Gemini 3 Pro Image Preview) | OpenAI o3-mini (vision) | Claude 3.7 Sonnet (vision) |
|---|---|---|---|
| Latency per Image | ~800ms | ~900ms | ~950ms |
| Throughput | ~40 img/s | ~35 img/s | ~30 img/s |
| Max Resolution | ~4K | ~4K | ~4K |
| Price per Image | ~$0.005 | ~$0.01 | ~$0.009 |
| Supported Formats | PNG, JPEG, WEBP | PNG, JPEG, WEBP | PNG, JPEG, WEBP |
| Context Window (w/ Image) | 128K tokens | 200K tokens | 200K tokens |
| Max Output Tokens | 8K | 16K | 8K |
| Uptime | 99.9% | 99.9% | 99.9% |
30-day usage via LLM API
- 3.8B
- Prompt tokens processed (last 30 days)
- 11.4M
- API requests served (last 30 days)
- 4.5B
- Completion tokens generated (last 30 days)
- 99.95%
- Average uptime (last 30 days)
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
-
Unified AI Routing
Dynamically route each request to the optimal model across providers based on latency, cost, or performance policies—without changing your application code.
One endpoint, any model -
Cost-Aware Orchestration
Automatically balance premium and budget models using configurable cost guards, so you control spend while keeping response quality and performance predictable at scale.
Optimize spend by default -
Resilient Fallbacks
Define automatic cross-provider fallbacks so your workloads keep running through rate limits, outages, or model deprecations—with no manual error-handling sprawl.
Stay online, by default. -
Deep Observability
Get unified logs, metrics, traces, and request payloads across all models and vendors, with per-model performance and cost breakdowns for real-time debugging and tuning.
See every token, everywhere. -
Task-Level Abstractions
Describe tasks like chat, generation, tools, or RAG once, and let LLM.API handle model-specific parameters, prompts, and formats for each provider.
Think tasks, not models. -
High-Throughput Batching
Submit large batches of prompts through one API, and let LLM.API parallelize, retry, and aggregate responses for consistent throughput across providers.
Scale tokens, not code.
Decision guide
When to Use — When NOT to Use
Use it if...
- You need fast, low-cost image understanding for previews, thumbnails, or basic tagging.
- You need to quickly classify or caption user-uploaded photos before further processing.
- Your use case involves simple multimodal prompts combining short text with a single image.
- Your use case involves prototyping lightweight visual features without needing top-tier reasoning.
- You need a small model to pre-filter or route images for larger backends.
- You need to extract obvious objects, colors, or layouts from everyday consumer images.
Avoid if...
- You need state-of-the-art vision-language reasoning on complex diagrams, charts, or scientific images.
- Your workload requires long-context multimodal analysis spanning many images and extensive text.
- You need highly reliable domain-specific medical or industrial image interpretation with safety guarantees.
- You need robust handling of very high-resolution images or detailed small-object detection.
- Your workload requires consistent, top-tier general reasoning or coding beyond basic visual tasks.
- You need full general-purpose chat capabilities rather than focused image preview understanding.
FAQ
Frequently Asked Questions
-
What is Nano Banana Pro (Gemini 3 Pro Image Preview)?
Nano Banana Pro (Gemini 3 Pro Image Preview) is a lightweight Gemini-3–based multimodal model from Google optimized for fast image understanding and text generation.
-
What is Nano Banana Pro (Gemini 3 Pro Image Preview) best suited for?
It is best for low-latency applications like UI assistants, rapid image captioning, and lightweight reasoning over images and short texts.
-
What is the context window of Nano Banana Pro (Gemini 3 Pro Image Preview)?
Nano Banana Pro (Gemini 3 Pro Image Preview) supports a 32K token context window for combined input and output.
-
Which modalities does Nano Banana Pro (Gemini 3 Pro Image Preview) support via LLM.API?
It supports text input and output plus image input, including multi-image prompts, but does not generate images.
-
How is Nano Banana Pro (Gemini 3 Pro Image Preview) priced on LLM.API?
Pricing is per input and output token, with Nano Banana Pro positioned as a cheaper Gemini tier; check LLM.API pricing docs for current rates.
-
How fast is Nano Banana Pro (Gemini 3 Pro Image Preview) in terms of latency?
It is optimized for low latency, typically returning short responses in a few hundred milliseconds under normal load.
-
How do I call Nano Banana Pro (Gemini 3 Pro Image Preview) through the LLM.API?
Specify the provider as Google and the model name as "nano-banana-pro-gemini-3-pro-image-preview" in your LLM.API completion or chat request.
-
How does Nano Banana Pro (Gemini 3 Pro Image Preview) compare to larger Gemini models?
It trades some reasoning depth and long-context performance for significantly lower cost and faster responses than flagship Gemini 3 Pro models.
-
What limitations should I be aware of with Nano Banana Pro (Gemini 3 Pro Image Preview)?
It can hallucinate, struggles with very long multi-step reasoning or domain-expert tasks, and should not be used without human review for high-risk decisions.
-
Can I fine-tune or customize Nano Banana Pro (Gemini 3 Pro Image Preview) via LLM.API?
Direct fine-tuning is not supported; use system prompts, few-shot examples, and retrieval to adapt behavior.
