Powered by Google

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

  • Text Generation

Nano Banana 2 (Gemini 3.1 Flash Image Preview) is Google DeepMind’s image generation and editing model built on the Gemini 3.1 Flash architecture, optimized for fast, cost‑efficient, high‑quality visuals. It balances strong multimodal understanding with 4K-capable output and low latency for both text-to-image and image-edit tasks.

Start Using API

What is Nano Banana 2 (Gemini 3.1 Flash Image Preview)?

Nano Banana 2 (Gemini 3.1 Flash Image Preview) is a Google DeepMind model for high-speed, high-quality image understanding, generation, and editing built on the Gemini 3.1 Flash family. It is mainly used for text-to-image generation, enabling rapid creation of detailed images with controllable aspect ratios and resolutions up to 4K. It is also used for image editing workflows, where users supply reference or input images for transformations, variations, and iterative refinements within apps and APIs. As part of the Gemini 3.1 model line, it succeeds earlier Gemini image capabilities and sits alongside the higher-end Nano Banana Pro image models.

5 Core Capabilities

  • Conversational Chat

    Supports interactive, multi-turn dialogue, following instructions and maintaining context for tasks like Q&A, drafting, and brainstorming.

  • Image Understanding

    Interprets images to identify objects, scenes, and visual details, enabling visual question answering and description tasks.

  • Optical Character Recognition

    Reads and extracts text from images, including screenshots and documents, enabling search, analysis, and transformation of visual text content.

  • Code and Tools

    Helps write and reason about code, and orchestrate external tools or APIs by interpreting structured instructions and outputs.

  • Language Translation

    Translates between multiple natural languages while preserving meaning and tone, supporting cross-lingual understanding and communication tasks.

6 Most Valuable Use Cases

  • Marketing Visual Creation
  • Product Mockup Design
  • UI Layout Ideation
  • Grounded Image Generation
  • Image Editing & Variants
  • High-Volume Asset Production

Cost Comparison

LLM API offers the lowest effective cost and latency for Nano Banana 2–class vision models.

Provider Region Latency Throughput Uptime Input ($/1M) Output ($/1M) Context
LLM API BEST Global 80ms 120 img/min 99.99% $0.30/1K img $0.00 16 images / 128K tokens equivalent
Google Global ~200ms ~60 img/min 99.9% ~$0.60/1K img $0.00 ~16 images / 128K tokens equivalent
Vertex AI US East ~220ms ~48 img/min 99.9% ~$0.65/1K img $0.00 ~16 images / 128K tokens equivalent
AWS Bedrock (equivalent vision model) US East ~240ms ~45 img/min 99.9% ~$0.80/1K img $0.00 ~8 images / 128K tokens equivalent
Azure OpenAI (equivalent vision model) Global ~230ms ~50 img/min 99.9% ~$0.75/1K img $0.00 ~8 images / 128K tokens equivalent

Technical Specifications

Metric Nano Banana 2 (Gemini 3.1 Flash Image Preview) GPT-4o mini (Image Preview) Claude 3.5 Haiku (Vision)
Latency per Image ~250ms ~220ms ~260ms
Context Window 128K 128K 200K
Max Resolution 2K 2K 2K
Price per Image $0.002 $0.002 $0.003
Supported Formats PNG, JPG, WEBP PNG, JPG, WEBP PNG, JPG, WEBP
Throughput 80 img/s 90 img/s 70 img/s
Uptime 99.9% 99.9% 99.9%

30-day usage via LLM API

3.8B
Prompt tokens processed (last 30 days)
240M
Completion tokens generated (last 30 days)
9.5M
API requests served (last 30 days)
99.8%
Avg uptime (last 30 days)
Start Using API

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

  • Intelligent AI Routing

    Automatically route each request to the optimal model and provider based on latency, cost, and quality — without changing your integration or redeploying.

    One endpoint, every model
  • Cost-Aware Orchestration

    Dynamically choose cheaper equivalents, downgrade for non-critical paths, and enforce budgets with per-route policies so you never get surprised by your AI bill.

    Lower spend, same output
  • Resilient Fallback Flows

    Define automatic cross-provider fallbacks and retries so traffic fails over seamlessly during outages, rate limits, or model errors — no manual playbooks required.

    Zero-downtime AI calls
  • Full-Stack Observability

    Get unified traces, logs, metrics, and payload sampling across all providers to debug latency, failures, and regressions from a single, model-agnostic view.

    See every token, everywhere
  • Task-Level Abstractions

    Describe tasks like chat, embedding, or tool-calling once and let LLM.API handle provider-specific quirks, parameters, and response formats for you.

    Code to tasks, not vendors
  • High-Throughput Batching

    Batch thousands of requests per task across providers with built-in retries, rate-limit smoothing, and streaming results to maximize throughput and minimize cost.

    Scale to millions of calls

When to Use — When NOT to Use

Use it if...

  • You need fast, low-cost image understanding for previews, tagging, and lightweight classification.
  • You need to quickly analyze UI mockups or app screenshots for basic structure.
  • You need visual content checks, like detecting obvious unsafe or off-brand imagery.
  • Your use case involves generating short descriptions or captions based on images.
  • Your use case involves prototyping visual features before upgrading to heavier multimodal models.
  • You need to enrich images with simple metadata, labels, or alt-text at scale.

Avoid if...

  • You need state-of-the-art reasoning over complex diagrams, technical schematics, or dense charts.
  • Your workload requires very long multimodal context, like many related images plus documents.
  • You need highly accurate OCR on small text, dense tables, or multilingual documents.
  • Your workload requires creative image generation rather than understanding or preview analysis.
  • You need robust domain-specific visual reasoning, such as medical, legal, or industrial inspection.
  • Your workload requires consistent, production-grade decisions on safety-critical or regulated visual content.

Frequently Asked Questions

  • What is Nano Banana 2 (Gemini 3.1 Flash Image Preview)?

    Nano Banana 2 (Gemini 3.1 Flash Image Preview) is a Google multimodal model optimized for fast, low-cost text and image understanding via LLM.API.

  • What is Nano Banana 2 (Gemini 3.1 Flash Image Preview) best suited for?

    It is best for latency-sensitive applications needing quick image interpretation, lightweight vision-language reasoning, and inexpensive high-volume text processing.

  • What context window does Nano Banana 2 (Gemini 3.1 Flash Image Preview) support via LLM.API?

    Nano Banana 2 (Gemini 3.1 Flash Image Preview) supports a 32K token context window through LLM.API.

  • How fast is Nano Banana 2 (Gemini 3.1 Flash Image Preview) in terms of latency?

    It is tuned for very low end-to-end latency, making it suitable for real-time or interactive user experiences.

  • What input and output modalities does Nano Banana 2 (Gemini 3.1 Flash Image Preview) support?

    The model accepts text and image inputs and returns text outputs, enabling vision-language workflows.

  • How is Nano Banana 2 (Gemini 3.1 Flash Image Preview) priced on LLM.API?

    Pricing is usage-based per token and image processed, with exact rates available in the LLM.API Google models pricing table.

  • How do I call Nano Banana 2 (Gemini 3.1 Flash Image Preview) through LLM.API?

    Use the LLM.API chat or completions endpoint specifying the Nano Banana 2 model name, sending text and optional image inputs in the request body.

  • How does Nano Banana 2 (Gemini 3.1 Flash Image Preview) compare to larger Gemini models?

    Compared to larger Gemini models, it trades some reasoning depth and creativity for significantly lower latency and cost.

  • What are the main limitations of Nano Banana 2 (Gemini 3.1 Flash Image Preview)?

    It may struggle with highly complex reasoning, long multi-step problem solving, and domain-expert tasks compared to larger frontier models.

  • Can Nano Banana 2 (Gemini 3.1 Flash Image Preview) be used for structured tool-calling or function calling?

    Yes, you can use LLM.API tool-calling interfaces, but the model is mainly optimized for lightweight text and image understanding.

Start in 2 lines of code

Get My API Key