Hailuo 2.3

Instruction Following

Hailuo 2.3 by MiniMax is a high-fidelity AI video generation model designed for realistic, cinematic 1080p clips from text or image prompts, with strong motion, physics, and facial expression modeling.

Start Using API

API Performance

Latency: ~6.0s time to first token
Context: ~32K token context
Input: ~$0.60 per 1M tokens
Output: ~$2.40 per 1M tokens
Uptime: 99%
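As a rough illustration of how the per-token rates above turn into a per-request cost, the sketch below multiplies token counts by the approximate listed prices. The function name and the example token counts are illustrative assumptions, not values from the page.

```python
# Approximate rates listed above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 2,000 input tokens and 500 output tokens.
cost = estimate_cost(2_000, 500)
print(f"Estimated cost: ${cost:.4f}")
```

Because the listed prices are approximate, treat the result as an estimate rather than a billing figure.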

About the model

What is Hailuo 2.3?

Hailuo 2.3 is MiniMax’s flagship AI video generation model for producing ultra-realistic short video clips from text-to-video and image-to-video inputs. It is mainly used for cinematic content creation, visual effects, and storytelling where natural motion, camera movements, and expressive characters are important. It is also widely adopted for animating still images into smooth, temporally stable clips for social media, advertising, and creative prototyping. As part of the Hailuo AI video family from MiniMax, it follows earlier Hailuo releases and sits alongside variants such as Hailuo 2.3 Fast and specialized I2V/T2V configurations.

Input / Output

Input

Text prompts for video generation
Images as first frame or reference for image-to-video

Output

Generated video clips (e.g. MP4 links)

Model capabilities

5 Core Capabilities

Text-to-video

Generates short high-fidelity videos directly from text prompts, emphasizing cinematic composition, realistic motion, and temporal consistency.

Image-to-video

Extends a single reference image into a coherent video clip, preserving character appearance while adding dynamic motion and camera movement.

Cinematic Motion

Specializes in human-centered footage with coordinated full-body motion, readable facial expressions, and stable stylized looks across frames.

Stylized Generation

Produces anime and illustration-style sequences with consistent visual identity, suitable for game-adjacent, VFX, and creative storytelling workflows.

Multimodal Integration

Integrates into the broader MiniMax multimodal ecosystem, enabling workflows that combine text, images, and video for production pipelines.

Use cases

6 Most Valuable Use Cases

Short Ad Videos
Cinematic Scene Prototyping
Social Media Clips
Product Demo Videos
Image-to-Video Animation
Anime Style Sequences

Transparent pricing

Cost Comparison

Of the providers compared below, LLM API lists the lowest per-token prices and latency, and the highest throughput and uptime, for Hailuo 2.3-class models.

Provider	Region	Latency	Throughput	Uptime	Input ($/1M)	Output ($/1M)	Context
LLM API (best)	Global	120ms	120 tps	99.99%	$0.05	$0.10	256K
MiniMax	Global	~220ms	~60 tps	~99.9%	~$0.15	~$0.30	~128K
OpenRouter	Global	~320ms	~40 tps	~99.5%	~$0.20	~$0.40	~128K
Fireworks	US East	~250ms	~80 tps	~99.9%	~$0.18	~$0.36	~200K
Together AI	US West	~260ms	~70 tps	~99.0%	~$0.16	~$0.32	~128K
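To make the comparison concrete, the sketch below filters the table's rows by a latency ceiling and an uptime floor, then ranks what remains by combined input and output price. The `Provider` class, the `rank` function, and the thresholds are hypothetical choices for illustration; the figures are the approximate values from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    latency_ms: int
    uptime_pct: float
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


# Rows copied from the table above (approximate values).
PROVIDERS = [
    Provider("LLM API", 120, 99.99, 0.05, 0.10),
    Provider("MiniMax", 220, 99.9, 0.15, 0.30),
    Provider("OpenRouter", 320, 99.5, 0.20, 0.40),
    Provider("Fireworks", 250, 99.9, 0.18, 0.36),
    Provider("Together AI", 260, 99.0, 0.16, 0.32),
]


def rank(providers, max_latency_ms, min_uptime_pct):
    """Keep providers that meet both thresholds, cheapest first."""
    eligible = [
        p for p in providers
        if p.latency_ms <= max_latency_ms and p.uptime_pct >= min_uptime_pct
    ]
    return sorted(eligible, key=lambda p: (p.input_price + p.output_price, p.latency_ms))


# Hypothetical policy: at most 260 ms latency and at least 99.9% uptime.
names = [p.name for p in rank(PROVIDERS, 260, 99.9)]
print(names)
```

Ranking by the simple sum of input and output price is only one possible policy. A real workload would weight the two by its own input-to-output token ratio.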

Performance benchmarks

Technical Specifications

Metric	Hailuo 2.3	GPT-4.1 Mini (OpenAI)	Claude 3.5 Haiku (Anthropic)
Avg Latency	~220ms	~250ms	~260ms
Context Window	128K	128K	200K
Input Price ($/1M)	$0.15	$0.15	$0.25
Output Price ($/1M)	$0.60	$0.60	$1.25
Max Output Tokens	4K	4K	4K
Throughput	80 tps	60 tps	50 tps
Uptime	99.9%	99.9%	99.9%

30-day usage via LLM API

8.4B: Prompt tokens processed (30 days)
620M: Completion tokens generated (30 days)
19.5M: API requests served (30 days)
99.8%: Average uptime over last 30 days
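The 30-day totals above also imply an average request size. The short calculation below derives it, assuming the three figures cover the same set of requests; the variable names are illustrative.

```python
# 30-day totals as reported above.
PROMPT_TOKENS = 8_400_000_000
COMPLETION_TOKENS = 620_000_000
REQUESTS = 19_500_000

avg_prompt = PROMPT_TOKENS / REQUESTS
avg_completion = COMPLETION_TOKENS / REQUESTS

print(f"{avg_prompt:.0f} prompt tokens per request")
print(f"{avg_completion:.0f} completion tokens per request")
```

This works out to roughly 431 prompt tokens and 32 completion tokens per request on average.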

Architecture & Integration

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

Intelligent Model Routing

Automatically route each request to the best model across providers based on policy, latency, or performance, without changing your application code or client integration.
One endpoint, any model.

Cost-Aware Execution

Optimize spend in real time with per-route pricing rules, model tiering, and usage controls so you never overpay for simple or high-volume workloads.
Max performance, minimal cost.

Resilient Fallback Flows

Define automatic failover and retry chains across providers so requests keep succeeding through outages, rate limits, or timeouts, with no custom error-handling glue code.
Stay online, even upstream.

Deep LLM Observability

Inspect every call with traces, metrics, and structured logs to debug prompts, compare models, and enforce SLAs across all providers from one unified view.
See every token, everywhere.

Task-Level Abstractions

Describe what you want (chat, extraction, classification, tools) and let LLM.API standardize schemas and orchestration, freeing you from provider-specific APIs and formats.
Think tasks, not vendors.

High-Throughput Batch Jobs

Run massive prompt batches with parallelism, retries, and progress tracking built in, instead of hand-rolling queues, workers, and ad-hoc rate limiting.
Batch at production scale.
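For a sense of what a failover and retry chain replaces, here is a minimal client-side sketch of the pattern: try each provider in order, retry a fixed number of times, and move on when it keeps failing. It is not LLM.API's implementation. `call_with_fallback` and the two stand-in provider functions are hypothetical names used only to show the control flow.

```python
from typing import Callable, List, Sequence, Tuple


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    retries_per_provider: int = 2,
) -> Tuple[str, str]:
    """Return (provider_name, reply) from the first provider that succeeds."""
    errors: List[str] = []
    for name, send in providers:
        for attempt in range(1, retries_per_provider + 1):
            try:
                return name, send(prompt)
            except Exception as exc:  # e.g. timeout, rate limit, outage
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers for demonstration only; no network calls are made.
def flaky(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def healthy(prompt: str) -> str:
    return f"ok: {prompt}"


used, reply = call_with_fallback("hello", [("primary", flaky), ("backup", healthy)])
print(used, reply)
```

A production chain would usually add backoff between attempts and distinguish retryable errors from permanent ones. Both are omitted here to keep the sketch short.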

Decision guide

When to Use — When NOT to Use

Use it if...

You need a cost-effective general-purpose chat model for everyday applications and prototypes.
Your use case involves multilingual customer support where perfect nuance is not mission-critical.
You need a lightweight assistant for simple content drafting, rewriting, and email generation.
Your use case involves integrating an affordable LLM into mobile or web consumer apps.
You need a general chatbot for FAQs, basic reasoning, and common productivity tasks.
Your use case involves educational helpers for explanations, summaries, and simple tutoring scenarios.

Avoid if...

You need state-of-the-art complex reasoning, planning, or coding performance comparable to top-tier models.
Your workload requires strict enterprise-grade compliance, certifications, and detailed governance guarantees.
You need highly specialized domain expertise, such as advanced legal or medical decision support.
Your workload requires extremely long-context processing for huge documents or multi-session reasoning.
You need guaranteed best-in-class tool use, code execution reliability, and complex agentic workflows.
Your workload requires robust offline deployment options or fine-grained control over model weights.

FAQ

Frequently Asked Questions

What is Hailuo 2.3?

Hailuo 2.3 is a MiniMax AI video generation model, accessible through LLM.API, for text-to-video and image-to-video workloads.

What modalities does Hailuo 2.3 support via LLM.API?

Hailuo 2.3 accepts text prompts and images (as a first frame or reference) as input and returns generated video clips.

What is the context window of Hailuo 2.3?

Hailuo 2.3 supports up to a 32,000-token context window for prompts and conversation history.

How fast is Hailuo 2.3 in terms of latency and throughput?

Typical responses from Hailuo 2.3 start streaming within a few hundred milliseconds, with throughput suitable for low-latency interactive applications.

How is Hailuo 2.3 priced when used through LLM.API?

Hailuo 2.3 uses LLM.API’s unified usage-based pricing, billed per input and output token according to the MiniMax Hailuo 2.3 price tier.

What is Hailuo 2.3 particularly good at?

Hailuo 2.3 is strong at cinematic text-to-video and image-to-video generation with realistic motion, readable facial expressions, and consistent anime or illustration-style sequences.

How do I call Hailuo 2.3 using LLM.API?

You select the MiniMax Hailuo 2.3 model in your LLM.API request and authenticate with your LLM.API key; no direct MiniMax account is required.

How does Hailuo 2.3 compare to similar models on LLM.API?

Hailuo 2.3 typically offers a lower-cost, speed-focused alternative to larger frontier models, with slightly reduced reasoning depth and instruction-following precision.

What are the main limitations of Hailuo 2.3?

Hailuo 2.3 can hallucinate facts, struggle with highly specialized reasoning, and should not be used without human review for safety-critical decisions.

Does Hailuo 2.3 support tools, function calling, or structured outputs?

Tool calling and structured output support depend on LLM.API’s orchestration layer rather than Hailuo 2.3’s native capabilities.
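The answer to "How do I call Hailuo 2.3 using LLM.API?" comes down to two things: naming the model in the request and sending your LLM.API key. The sketch below builds such a request with Python's standard library and does not send it. The endpoint URL, the model ID, the `prompt` field, and the Bearer-token scheme are all assumptions for illustration; check the LLM.API documentation for the real values.

```python
import json
import urllib.request

# Hypothetical placeholders: the real endpoint and model ID are not given on this page.
API_URL = "https://api.example.com/v1/video/generations"
MODEL_ID = "minimax/hailuo-2.3"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that selects the model and carries the key."""
    body = json.dumps({"model": MODEL_ID, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("YOUR_LLM_API_KEY", "A slow dolly shot of a lighthouse at dusk")
print(req.get_method(), req.full_url)
```

To send the request you would pass `req` to `urllib.request.urlopen`, once the placeholders are replaced with real values.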
