Wan 2.7

Text Generation

Wan 2.7 is Alibaba’s latest open-source multimodal visual generation model for high-quality video and image creation, offering text-to-video, image-to-video, text-to-image, and editing in a single architecture.

Start Using API

API Performance

Latency: ~8.0s avg generation time
Context: ~10s max video duration
Input: ~$1.00 per generated video
Output: ~$1.00 per generated video
Uptime: 99% 99%

About the model

What is Wan 2.7?

Wan 2.7 is an AI visual generation model from Alibaba that unifies video and image generation and editing in one system. It is mainly used for generating cinematic short videos from text or image prompts and for image-to-video transformations in creative, marketing, and storytelling workflows. It is also used for text-to-image creation, reference-guided image generation, and instruction-based image or video editing for design and content production teams. Wan 2.7 is part of Alibaba’s Wan video model family developed within the broader Qwen ecosystem, succeeding earlier Wan 2.x releases.

Input / Output

Input

Text prompts (for text-to-video and text-to-image generation, editing instructions)
Images (for image-to-video, image-to-image generation, and image editing)

Output

Generated videos (AI-created or edited video clips, often with audio)
Generated images (AI-created or edited pictures and frames)

Model capabilities

5 Core Capabilities

Text-to-Video

Generates high-quality video clips directly from detailed text prompts, supporting controllable camera movement, scenes, and lighting for creators.
Image-to-Video

Animates still images into coherent motion videos, preserving subject appearance and layout while adding realistic movement and transitions.
Reference-Based Editing

Edits and extends existing video using reference frames and instructions, enabling consistent subjects, motion control, and frame-level refinements.
Unified Visual Suite

Acts as a multimodal visual model handling text-to-video, image-to-video, text-to-image, and image editing within a single architecture.
Thinking Mode Control

Interprets user intent before rendering with a dedicated thinking phase, improving creative consistency, controllability, and reducing failed generations.

Use cases

6 Most Valuable Use Cases

Text-to-Image Generation
Image-to-Video Animation
Text-to-Video Creation
Video-to-Video Editing
Marketing Visual Production
Multimodal Media Research

Transparent pricing

Cost Comparison

LLM API offers the lowest cost and highest performance for Wan 2.7–class models.

Provider	Region	Latency	Throughput	Uptime	Input ($/1M)	Output ($/1M)	Context
LLM API BEST	Global	120ms	120 tps	99.99%	$0.15	$0.45	128K tokens
Alibaba Cloud	APAC East	~260ms	~60 tps	~99.95%	~$0.40	~$1.20	~32K tokens
OpenAI	Global	~180ms	~90 tps	99.9%	~$0.50	~$1.50	~128K tokens
Azure AI	US East	~200ms	~80 tps	99.9%	~$0.55	~$1.60	~128K tokens
Anthropic	US West	~190ms	~70 tps	~99.9%	~$0.60	~$1.80	~200K tokens

Performance benchmarks

Technical Specifications

Metric	Wan 2.7	Qwen2.5-72B-Instruct	Llama 3.1 70B
Avg Latency	~220ms	~250ms	~260ms
Context Window	128K	128K	128K
Input Price ($/1M)	$0.60	$0.80	$1.00
Output Price ($/1M)	$2.40	$3.20	$4.00
Max Output Tokens	4K	4K	4K
Throughput	48 tps	40 tps	42 tps
Uptime	99.9%	99.9%	99.9%

30-day usage via LLM API

11.8B: Prompt tokens processed (last 30 days)
7.4B: Completion tokens generated (last 30 days)
3.1M: API requests served (last 30 days)
99.8%: Average API uptime (last 30 days)

Start Using API

Architecture & Integration

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

Unified AI Routing

Dynamically route each request to the optimal model across providers based on performance, cost, and availability—without changing your code or client integration.
One endpoint, any model
Cost-Aware Orchestration

Automatically balance premium and budget models with configurable cost ceilings, so you keep latency low and quality high while tightly controlling spend.
Optimize quality per dollar
Resilient Fallback Logic

Define provider and model failover chains so requests transparently retry on alternates, insulating your app from regional outages, rate limits, or model regressions.
Stay online, even upstream
Deep Model Observability

Get unified traces, latency, error, and token metrics across all providers with request-level logs for fast debugging, tuning, and capacity planning.
See every token, everywhere
Task-Level Abstractions

Call high-level tasks—chat, generation, tools, and more—instead of vendor-specific APIs, so you can swap models without rewriting application logic.
Program to tasks, not models
High-Throughput Batch APIs

Submit large batches of prompts in a single call with automatic chunking, concurrency control, and retries to maximize throughput and minimize overhead.
Scale workloads, not code

Decision guide

When to Use — When NOT to Use

Use it if...

You need a strong general-purpose Chinese language model from a major Chinese provider.
You need reasonably capable text generation for chatbots, assistants, or content drafting.
You need to integrate with Alibaba Cloud services or an existing Alibaba ecosystem.
Your use case involves moderate reasoning tasks that do not require frontier-level performance.
Your use case involves experimentation with multiple Chinese LLMs, including non–US-based offerings.
You need a vendor-diverse backup model where Western foundation models are restricted.

Avoid if...

You need state-of-the-art reasoning, coding, or tool-use comparable to the latest frontier models.
Your workload requires detailed, up-to-date knowledge of non-Chinese global regulatory landscapes.
You need guaranteed support for highly specialized domains like advanced biotech, aerospace, or cryptography.
You need the broadest ecosystem of third-party tools, plugins, and community examples available.
Your workload requires clear, well-documented compliance attestations for US or EU-specific regulations.
You need extremely transparent, English-first documentation and debugging resources for all model behaviors.

FAQ

Frequently Asked Questions

What is Wan 2.7?

Wan 2.7 is an Alibaba large language model accessible via LLM.API, targeting general-purpose text generation and understanding tasks.
What is Wan 2.7 best suited for?

Wan 2.7 is best for cost-efficient chatbots, content generation, and general NLP tasks where balanced quality and efficiency matter.
What is the context window of Wan 2.7?

Wan 2.7 supports a context window of up to 8,192 tokens via LLM.API.
How fast is Wan 2.7 on LLM.API?

Wan 2.7 is optimized for low latency on LLM.API, typically returning first tokens within a few hundred milliseconds under normal load.
Which modalities does Wan 2.7 support?

Wan 2.7 is a text-only model on LLM.API, supporting text inputs and text outputs.
How is Wan 2.7 priced on LLM.API?

Wan 2.7 uses LLM.API’s unified token-based billing, with separate input and output token rates shown in your LLM.API pricing dashboard.
How do I call Wan 2.7 through LLM.API?

You select provider 'Alibaba' and model 'Wan 2.7' in the LLM.API request payload, keeping the standard chat or completion schema unchanged.
How does Wan 2.7 compare to similar models on LLM.API?

Wan 2.7 generally trades slightly lower peak quality than top-tier frontier models for better cost efficiency and predictable performance.
Does Wan 2.7 support streaming responses on LLM.API?

Yes, Wan 2.7 supports token streaming via LLM.API by enabling the standard 'stream' flag in your request.
What are key limitations of Wan 2.7?

Wan 2.7 may struggle with highly specialized domain knowledge, strict mathematical reasoning, and tasks requiring very long-context retention beyond its context window.

Start in 2 lines of code

Get My API Key

Wan 2.7

What is Wan 2.7?

5 Core Capabilities

Text-to-Video

Image-to-Video

Reference-Based Editing

Unified Visual Suite

Thinking Mode Control

6 Most Valuable Use Cases

Cost Comparison

Technical Specifications

Why Build on LLM.API?

Unified AI Routing

Cost-Aware Orchestration

Resilient Fallback Logic

Deep Model Observability

Task-Level Abstractions

High-Throughput Batch APIs

When to Use — When NOT to Use

Use it if...

Avoid if...

Start in 2 lines of code