Powered by Alibaba
Wan 2.7
- Text Generation
Wan 2.7 is Alibaba’s latest open-source multimodal visual generation model for high-quality video and image creation, offering text-to-video, image-to-video, text-to-image, and editing in a single architecture.
About the model
What is Wan 2.7?
Wan 2.7 is an AI visual generation model from Alibaba that unifies video and image generation and editing in one system. It is mainly used for generating cinematic short videos from text or image prompts and for image-to-video transformations in creative, marketing, and storytelling workflows. It is also used for text-to-image creation, reference-guided image generation, and instruction-based image or video editing for design and content production teams. Wan 2.7 is part of Alibaba’s Wan video model family developed within the broader Qwen ecosystem, succeeding earlier Wan 2.x releases.
Model capabilities
5 Core Capabilities
-
Text-to-Video
Generates high-quality video clips directly from detailed text prompts, supporting controllable camera movement, scenes, and lighting for creators.
-
Image-to-Video
Animates still images into coherent motion videos, preserving subject appearance and layout while adding realistic movement and transitions.
-
Reference-Based Editing
Edits and extends existing video using reference frames and instructions, enabling consistent subjects, motion control, and frame-level refinements.
-
Unified Visual Suite
Acts as a multimodal visual model handling text-to-video, image-to-video, text-to-image, and image editing within a single architecture.
-
Thinking Mode Control
Interprets user intent before rendering with a dedicated thinking phase, improving creative consistency, controllability, and reducing failed generations.
Use cases
6 Most Valuable Use Cases
- Text-to-Image Generation
- Image-to-Video Animation
- Text-to-Video Creation
- Video-to-Video Editing
- Marketing Visual Production
- Multimodal Media Research
Transparent pricing
Cost Comparison
LLM API offers the lowest cost and highest performance for Wan 2.7–class models.
| Provider | Region | Latency | Throughput | Uptime | Input ($/1M) | Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
| LLM API BEST | Global | 120ms | 120 tps | 99.99% | $0.15 | $0.45 | 128K tokens |
| Alibaba Cloud | APAC East | ~260ms | ~60 tps | ~99.95% | ~$0.40 | ~$1.20 | ~32K tokens |
| OpenAI | Global | ~180ms | ~90 tps | 99.9% | ~$0.50 | ~$1.50 | ~128K tokens |
| Azure AI | US East | ~200ms | ~80 tps | 99.9% | ~$0.55 | ~$1.60 | ~128K tokens |
| Anthropic | US West | ~190ms | ~70 tps | ~99.9% | ~$0.60 | ~$1.80 | ~200K tokens |
Performance benchmarks
Technical Specifications
| Metric | Wan 2.7 | Qwen2.5-72B-Instruct | Llama 3.1 70B |
|---|---|---|---|
| Avg Latency | ~220ms | ~250ms | ~260ms |
| Context Window | 128K | 128K | 128K |
| Input Price ($/1M) | $0.60 | $0.80 | $1.00 |
| Output Price ($/1M) | $2.40 | $3.20 | $4.00 |
| Max Output Tokens | 4K | 4K | 4K |
| Throughput | 48 tps | 40 tps | 42 tps |
| Uptime | 99.9% | 99.9% | 99.9% |
30-day usage via LLM API
- 11.8B
- Prompt tokens processed (last 30 days)
- 7.4B
- Completion tokens generated (last 30 days)
- 3.1M
- API requests served (last 30 days)
- 99.8%
- Average API uptime (last 30 days)
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
-
Unified AI Routing
Dynamically route each request to the optimal model across providers based on performance, cost, and availability—without changing your code or client integration.
One endpoint, any model -
Cost-Aware Orchestration
Automatically balance premium and budget models with configurable cost ceilings, so you keep latency low and quality high while tightly controlling spend.
Optimize quality per dollar -
Resilient Fallback Logic
Define provider and model failover chains so requests transparently retry on alternates, insulating your app from regional outages, rate limits, or model regressions.
Stay online, even upstream -
Deep Model Observability
Get unified traces, latency, error, and token metrics across all providers with request-level logs for fast debugging, tuning, and capacity planning.
See every token, everywhere -
Task-Level Abstractions
Call high-level tasks—chat, generation, tools, and more—instead of vendor-specific APIs, so you can swap models without rewriting application logic.
Program to tasks, not models -
High-Throughput Batch APIs
Submit large batches of prompts in a single call with automatic chunking, concurrency control, and retries to maximize throughput and minimize overhead.
Scale workloads, not code
Decision guide
When to Use — When NOT to Use
Use it if...
- You need a strong general-purpose Chinese language model from a major Chinese provider.
- You need reasonably capable text generation for chatbots, assistants, or content drafting.
- You need to integrate with Alibaba Cloud services or an existing Alibaba ecosystem.
- Your use case involves moderate reasoning tasks that do not require frontier-level performance.
- Your use case involves experimentation with multiple Chinese LLMs, including non–US-based offerings.
- You need a vendor-diverse backup model where Western foundation models are restricted.
Avoid if...
- You need state-of-the-art reasoning, coding, or tool-use comparable to the latest frontier models.
- Your workload requires detailed, up-to-date knowledge of non-Chinese global regulatory landscapes.
- You need guaranteed support for highly specialized domains like advanced biotech, aerospace, or cryptography.
- You need the broadest ecosystem of third-party tools, plugins, and community examples available.
- Your workload requires clear, well-documented compliance attestations for US or EU-specific regulations.
- You need extremely transparent, English-first documentation and debugging resources for all model behaviors.
FAQ
Frequently Asked Questions
-
What is Wan 2.7?
Wan 2.7 is an Alibaba large language model accessible via LLM.API, targeting general-purpose text generation and understanding tasks.
-
What is Wan 2.7 best suited for?
Wan 2.7 is best for cost-efficient chatbots, content generation, and general NLP tasks where balanced quality and efficiency matter.
-
What is the context window of Wan 2.7?
Wan 2.7 supports a context window of up to 8,192 tokens via LLM.API.
-
How fast is Wan 2.7 on LLM.API?
Wan 2.7 is optimized for low latency on LLM.API, typically returning first tokens within a few hundred milliseconds under normal load.
-
Which modalities does Wan 2.7 support?
Wan 2.7 is a text-only model on LLM.API, supporting text inputs and text outputs.
-
How is Wan 2.7 priced on LLM.API?
Wan 2.7 uses LLM.API’s unified token-based billing, with separate input and output token rates shown in your LLM.API pricing dashboard.
-
How do I call Wan 2.7 through LLM.API?
You select provider 'Alibaba' and model 'Wan 2.7' in the LLM.API request payload, keeping the standard chat or completion schema unchanged.
-
How does Wan 2.7 compare to similar models on LLM.API?
Wan 2.7 generally trades slightly lower peak quality than top-tier frontier models for better cost efficiency and predictable performance.
-
Does Wan 2.7 support streaming responses on LLM.API?
Yes, Wan 2.7 supports token streaming via LLM.API by enabling the standard 'stream' flag in your request.
-
What are key limitations of Wan 2.7?
Wan 2.7 may struggle with highly specialized domain knowledge, strict mathematical reasoning, and tasks requiring very long-context retention beyond its context window.
