Hy3 preview

Text Generation

Hy3 preview is Tencent's open-weight, large Mixture-of-Experts language model focused on high-efficiency reasoning and agentic workflows. It is notable for its very large parameter count and long context window while remaining optimized for production use.

Start Using API

API Performance

Latency: ~4.6s avg response
Context: 262K token context
Input: $0.07 per 1M tokens
Output: $0.26 per 1M tokens
Uptime: 99% 99%

About the model

What is Hy3 preview?

Hy3 preview is a 295B-parameter Mixture-of-Experts large language model from Tencent’s Hunyuan (Hy3) family, with about 21B active parameters and an extended context window around 256K tokens. It is mainly used for complex reasoning, instruction following, long-context tasks like document analysis, and general-purpose chat or writing. It is also applied in coding, agents, and other production scenarios that benefit from configurable reasoning depth and efficient inference. Hy3 preview belongs to Tencent’s Hunyuan model line (sometimes referred to as Hunyuan 3.0), succeeding earlier Hunyuan generations such as Hy2.

Input / Output

Input

Text prompts (natural language or code as tokens)

Output

Text responses (natural language or code)
Source code generation in multiple languages

Model capabilities

5 Core Capabilities

Complex Reasoning

Performs advanced logical and mathematical reasoning, excelling on STEM tasks and challenging benchmarks and real-world exams and evaluations.
Instruction Following

Understands and executes nuanced natural-language instructions, with strong context learning for long prompts up to 256K tokens.
Agentic Workflows

Powers multi-step AI agents, integrating with frameworks like OpenClaw to orchestrate tools, search, and multi-stage task automation.
Code Generation

Generates and edits code, supporting complex software development workflows and scoring competitively on mainstream coding agent benchmarks.
Multilingual Support

Handles multiple languages, enabling cross-lingual text understanding and generation for global users across Tencent’s ecosystem and tools.

Use cases

6 Most Valuable Use Cases

General Text Assistant
Technical Reasoning Help
Legal Case Summaries
Compliance Change Monitoring
Business Report Drafting
Agentic Coding Support

Transparent pricing

Cost Comparison

LLM API offers the lowest cost and latency for Hy3‑class models, up to ~70% cheaper than comparable Tencent pricing.

Provider	Region	Latency	Throughput	Uptime	Input ($/1M)	Output ($/1M)	Context
LLM API BEST	Global	120ms	80 tps	99.99%	$0.20	$0.20	256K
Tencent Cloud	APAC	~220ms	~40 tps	~99.90%	~$0.60	~$0.60	~128K
Alibaba Cloud	APAC	~250ms	~35 tps	~99.90%	~$0.70	~$0.70	~128K
OpenAI (comparable model)	Global	~200ms	~50 tps	~99.95%	~$1.00	~$1.00	~128K
AWS Bedrock (comparable model)	US East	~210ms	~45 tps	~99.95%	~$0.80	~$0.80	~128K

Performance benchmarks

Technical Specifications

Metric	Hy3 preview	Tencent Hunyuan-Large	OpenAI GPT-4o
Avg Latency	~180ms	~220ms	~250ms
Context Window	128K	64K	128K
Input Price ($/1M)	$0.30	$0.25	$5.00
Output Price ($/1M)	$0.60	$0.50	$15.00
Max Output Tokens	4K	4K	4K
Throughput	40 tps	35 tps	30 tps
Uptime	99.9%	99.5%	99.9%

30-day usage via LLM API

7.8B: Prompt tokens processed (last 30 days)
620M: Completion tokens generated (last 30 days)
4.5M: API requests served (last 30 days)
280K: Unique developer accounts (last 30 days)

Start Using API

Architecture & Integration

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

Unified AI Routing

Dynamically route each request to the optimal model across providers based on cost, latency, and quality—without changing your integration or redeploying code.
One endpoint, every model.
Cost-Aware Orchestration

Automatically balance price and performance with model-tier rules, budgets, and usage limits so you can ship rich AI features without runaway spend.
Control spend by design.
Resilient Fallbacks

Define cascading provider and model fallbacks so requests survive outages, rate limits, and model failures with graceful degradation instead of hard errors.
No single point of failure.
End-to-End Observability

Trace every request across models and providers with logs, metrics, and structured payloads so you can debug prompts, optimize latency, and track regressions.
See every token, everywhere.
Task-Level Abstractions

Describe tasks like chat, generation, tools, or scoring once and let LLM.API map them to provider-specific APIs so you avoid vendor lock-in glue code.
Code to tasks, not vendors.
High-Throughput Batch

Submit massive batches of prompts, evaluations, or embeddings through a single pipeline optimized for concurrency limits, retries, and cost-efficient parallelization.
Scale experiments, not ops.

Decision guide

When to Use — When NOT to Use

Use it if...

You need a Chinese-developed model suitable for deployment within Tencent’s cloud ecosystem.
You need a general-purpose assistant for chat, drafting, and everyday productivity tasks.
Your use case involves experimenting with a newer Tencent model in preview environments.
You need to prototype multilingual chatbots primarily targeting Chinese and English users.
Your use case involves integrating with other Tencent services or existing Tencent infrastructure.

Avoid if...

You need a fully production-hardened model with long-term stability beyond a preview phase.
Your workload requires guaranteed enterprise SLAs, compliance attestations, and audited certifications.
You need state-of-the-art long-context reasoning over very large documents or codebases.
Your workload requires rich ecosystem tooling, plugins, and community resources already battle-tested.
You need proven performance benchmarks against leading frontier models for mission-critical decisions.

FAQ

Frequently Asked Questions

What is Hy3 preview?

Hy3 preview is a Tencent large language model accessible via LLM.API, suitable for general-purpose text generation and analysis tasks.
What is Hy3 preview best suited for?

Hy3 preview is best for fast, low-friction chat-style completion, coding assistance, and structured text generation where low latency matters.
What is the context window of Hy3 preview?

Hy3 preview supports a mid-sized context window, suitable for multi-turn conversations, medium-length documents, and typical code files.
What modalities does Hy3 preview support?

Hy3 preview currently supports text input and text output only when accessed through LLM.API.
How is Hy3 preview priced on LLM.API?

Hy3 preview uses LLM.API’s unified per-token pricing; you are billed for input and output tokens at the Tencent Hy3 preview rate.
How fast is Hy3 preview in terms of latency and throughput?

Hy3 preview typically returns initial tokens quickly and supports streaming, making it suitable for interactive applications and tooling.
How do I call Hy3 preview through the LLM.API gateway?

Specify the model name "tencent-hy3-preview" (exact name may vary) in your LLM.API request and provide your LLM.API key for authentication.
How does Hy3 preview compare to other similar models on LLM.API?

Compared to larger flagship models, Hy3 preview generally trades some reasoning depth and creativity for lower cost and faster responses.
What are the main limitations of Hy3 preview?

Hy3 preview can hallucinate facts, struggle with very long contexts or complex reasoning chains, and should not be used without human review for high-stakes decisions.
Can I fine-tune or customize Hy3 preview via LLM.API?

Direct fine-tuning is typically not available; instead, you customize Hy3 preview behavior using system prompts, few-shot examples, and application-side orchestration.

Start in 2 lines of code

Get My API Key

Hy3 preview

What is Hy3 preview?

5 Core Capabilities

Complex Reasoning

Instruction Following

Agentic Workflows

Code Generation

Multilingual Support

6 Most Valuable Use Cases

Cost Comparison

Technical Specifications

Why Build on LLM.API?

Unified AI Routing

Cost-Aware Orchestration

Resilient Fallbacks

End-to-End Observability

Task-Level Abstractions

High-Throughput Batch

When to Use — When NOT to Use

Use it if...

Avoid if...

Start in 2 lines of code