Powered by Tencent
Hy3 preview
- Text Generation
Hy3 preview is Tencent's open-weight, large Mixture-of-Experts language model focused on high-efficiency reasoning and agentic workflows. It is notable for its very large parameter count and long context window while remaining optimized for production use.
About the model
What is Hy3 preview?
Hy3 preview is a 295B-parameter Mixture-of-Experts large language model from Tencent’s Hunyuan (Hy3) family, with about 21B active parameters and an extended context window around 256K tokens. It is mainly used for complex reasoning, instruction following, long-context tasks like document analysis, and general-purpose chat or writing. It is also applied in coding, agents, and other production scenarios that benefit from configurable reasoning depth and efficient inference. Hy3 preview belongs to Tencent’s Hunyuan model line (sometimes referred to as Hunyuan 3.0), succeeding earlier Hunyuan generations such as Hy2.
Model capabilities
5 Core Capabilities
-
Complex Reasoning
Performs advanced logical and mathematical reasoning, excelling on STEM tasks and challenging benchmarks and real-world exams and evaluations.
-
Instruction Following
Understands and executes nuanced natural-language instructions, with strong context learning for long prompts up to 256K tokens.
-
Agentic Workflows
Powers multi-step AI agents, integrating with frameworks like OpenClaw to orchestrate tools, search, and multi-stage task automation.
-
Code Generation
Generates and edits code, supporting complex software development workflows and scoring competitively on mainstream coding agent benchmarks.
-
Multilingual Support
Handles multiple languages, enabling cross-lingual text understanding and generation for global users across Tencent’s ecosystem and tools.
Use cases
6 Most Valuable Use Cases
- General Text Assistant
- Technical Reasoning Help
- Legal Case Summaries
- Compliance Change Monitoring
- Business Report Drafting
- Agentic Coding Support
Transparent pricing
Cost Comparison
LLM API offers the lowest cost and latency for Hy3‑class models, up to ~70% cheaper than comparable Tencent pricing.
| Provider | Region | Latency | Throughput | Uptime | Input ($/1M) | Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
| LLM API BEST | Global | 120ms | 80 tps | 99.99% | $0.20 | $0.20 | 256K |
| Tencent Cloud | APAC | ~220ms | ~40 tps | ~99.90% | ~$0.60 | ~$0.60 | ~128K |
| Alibaba Cloud | APAC | ~250ms | ~35 tps | ~99.90% | ~$0.70 | ~$0.70 | ~128K |
| OpenAI (comparable model) | Global | ~200ms | ~50 tps | ~99.95% | ~$1.00 | ~$1.00 | ~128K |
| AWS Bedrock (comparable model) | US East | ~210ms | ~45 tps | ~99.95% | ~$0.80 | ~$0.80 | ~128K |
Performance benchmarks
Technical Specifications
| Metric | Hy3 preview | Tencent Hunyuan-Large | OpenAI GPT-4o |
|---|---|---|---|
| Avg Latency | ~180ms | ~220ms | ~250ms |
| Context Window | 128K | 64K | 128K |
| Input Price ($/1M) | $0.30 | $0.25 | $5.00 |
| Output Price ($/1M) | $0.60 | $0.50 | $15.00 |
| Max Output Tokens | 4K | 4K | 4K |
| Throughput | 40 tps | 35 tps | 30 tps |
| Uptime | 99.9% | 99.5% | 99.9% |
30-day usage via LLM API
- 7.8B
- Prompt tokens processed (last 30 days)
- 620M
- Completion tokens generated (last 30 days)
- 4.5M
- API requests served (last 30 days)
- 280K
- Unique developer accounts (last 30 days)
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
-
Unified AI Routing
Dynamically route each request to the optimal model across providers based on cost, latency, and quality—without changing your integration or redeploying code.
One endpoint, every model. -
Cost-Aware Orchestration
Automatically balance price and performance with model-tier rules, budgets, and usage limits so you can ship rich AI features without runaway spend.
Control spend by design. -
Resilient Fallbacks
Define cascading provider and model fallbacks so requests survive outages, rate limits, and model failures with graceful degradation instead of hard errors.
No single point of failure. -
End-to-End Observability
Trace every request across models and providers with logs, metrics, and structured payloads so you can debug prompts, optimize latency, and track regressions.
See every token, everywhere. -
Task-Level Abstractions
Describe tasks like chat, generation, tools, or scoring once and let LLM.API map them to provider-specific APIs so you avoid vendor lock-in glue code.
Code to tasks, not vendors. -
High-Throughput Batch
Submit massive batches of prompts, evaluations, or embeddings through a single pipeline optimized for concurrency limits, retries, and cost-efficient parallelization.
Scale experiments, not ops.
Decision guide
When to Use — When NOT to Use
Use it if...
- You need a Chinese-developed model suitable for deployment within Tencent’s cloud ecosystem.
- You need a general-purpose assistant for chat, drafting, and everyday productivity tasks.
- Your use case involves experimenting with a newer Tencent model in preview environments.
- You need to prototype multilingual chatbots primarily targeting Chinese and English users.
- Your use case involves integrating with other Tencent services or existing Tencent infrastructure.
Avoid if...
- You need a fully production-hardened model with long-term stability beyond a preview phase.
- Your workload requires guaranteed enterprise SLAs, compliance attestations, and audited certifications.
- You need state-of-the-art long-context reasoning over very large documents or codebases.
- Your workload requires rich ecosystem tooling, plugins, and community resources already battle-tested.
- You need proven performance benchmarks against leading frontier models for mission-critical decisions.
FAQ
Frequently Asked Questions
-
What is Hy3 preview?
Hy3 preview is a Tencent large language model accessible via LLM.API, suitable for general-purpose text generation and analysis tasks.
-
What is Hy3 preview best suited for?
Hy3 preview is best for fast, low-friction chat-style completion, coding assistance, and structured text generation where low latency matters.
-
What is the context window of Hy3 preview?
Hy3 preview supports a mid-sized context window, suitable for multi-turn conversations, medium-length documents, and typical code files.
-
What modalities does Hy3 preview support?
Hy3 preview currently supports text input and text output only when accessed through LLM.API.
-
How is Hy3 preview priced on LLM.API?
Hy3 preview uses LLM.API’s unified per-token pricing; you are billed for input and output tokens at the Tencent Hy3 preview rate.
-
How fast is Hy3 preview in terms of latency and throughput?
Hy3 preview typically returns initial tokens quickly and supports streaming, making it suitable for interactive applications and tooling.
-
How do I call Hy3 preview through the LLM.API gateway?
Specify the model name "tencent-hy3-preview" (exact name may vary) in your LLM.API request and provide your LLM.API key for authentication.
-
How does Hy3 preview compare to other similar models on LLM.API?
Compared to larger flagship models, Hy3 preview generally trades some reasoning depth and creativity for lower cost and faster responses.
-
What are the main limitations of Hy3 preview?
Hy3 preview can hallucinate facts, struggle with very long contexts or complex reasoning chains, and should not be used without human review for high-stakes decisions.
-
Can I fine-tune or customize Hy3 preview via LLM.API?
Direct fine-tuning is typically not available; instead, you customize Hy3 preview behavior using system prompts, few-shot examples, and application-side orchestration.
