Palmyra X5

Text Generation

Palmyra X5 is Writer's most advanced enterprise large language model, featuring an extremely long context window and adaptive reasoning for complex business workflows. It is purpose-built for building and scaling AI agents across the enterprise with strong performance on long-form, text-heavy tasks.

Start Using API

API Performance

Latency: ~0.8s time to first token
Context: ~128K token context
Input: ~$0.60 per 1M tokens
Output: ~$2.40 per 1M tokens
Uptime: 99% 99%

About the model

What is Palmyra X5?

Palmyra X5 is Writer’s flagship enterprise large language model designed for adaptive reasoning over very long text inputs. It is used for enterprise content generation and long-document analysis, such as processing extensive reports, knowledge bases, and regulatory or research materials, and for powering AI agents that automate complex business workflows across domains like finance, healthcare, and software. It belongs to Writer’s Palmyra family of foundation models and succeeds earlier generations such as Palmyra X4.

Input / Output

Input

Text prompts
Documents (PDF)

Output

Text responses

Model capabilities

5 Core Capabilities

Advanced Reasoning

Performs deep, multi-step reasoning over complex business tasks, enabling reliable enterprise agents and sophisticated decision-support workflows.
Long-Context Handling

Processes and grounds responses in very long inputs, supporting analysis of large document sets and extensive enterprise knowledge bases.
Tool and Agent Use

Calls external tools and composes multi-step AI agents, orchestrating workflows such as retrieval, APIs, and database interactions.
Multilingual Support

Understands and generates text in over 30 languages, enabling global enterprise deployments and cross-lingual workflows.
Image Input Support

Accepts images as inputs to inform responses, allowing multimodal enterprise workflows that combine visual data with text.

Use cases

6 Most Valuable Use Cases

Long-Document Summarization
Enterprise Content Generation
AI Agent Workflows
Knowledge Base Question-Answering
Regulatory Policy Analysis
Business Process Automation

Transparent pricing

Cost Comparison

LLM API offers the lowest token prices and latency for Palmyra X5–class models.

Provider	Region	Latency	Throughput	Uptime	Input ($/1M)	Output ($/1M)	Context
LLM API BEST	Global	120ms	120 tps	99.99%	$0.40	$0.80	128K
Writer	US	~220ms	~60 tps	99.9%	~$0.60	~$1.20	32K
OpenAI (closest: GPT-4.1-mini)	Global	~250ms	~80 tps	99.9%	~$0.50	~$1.00	128K
Anthropic (closest: Claude 3.5 Haiku)	US East	~260ms	~70 tps	99.9%	~$0.55	~$1.10	200K
Google Cloud (closest: Gemini 1.5 Pro)	Global	~280ms	~65 tps	99.9%	~$0.70	~$1.40	1M

Performance benchmarks

Technical Specifications

Metric	Palmyra X5 (Writer)	GPT-4.1 Mini (OpenAI)	Claude 3.5 Sonnet (Anthropic)
Avg Latency	~220ms	~180ms	~250ms
Context Window	128K	128K	200K
Input Price ($/1M)	$0.80	$0.15	$3.00
Output Price ($/1M)	$2.40	$0.60	$15.00
Max Output Tokens	8K	4K	8K
Throughput	40 tps	60 tps	35 tps
Uptime	99.9%	99.9%	99.9%

30-day usage via LLM API

5.8B: Prompt tokens processed (last 30 days)
2.1B: Completion tokens generated (last 30 days)
7.4M: API requests served (last 30 days)
99.8%: Average API uptime (last 30 days)

Start Using API

Architecture & Integration

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

Intelligent AI Routing

Automatically route each request to the best model across providers based on cost, latency, and quality—no client changes required when your stack evolves.
One endpoint, every model
Cost-Aware Orchestration

Enforce budgets, compare provider pricing, and transparently shift traffic to cheaper equivalents while preserving quality so you never overspend on inference again.
Cut spend, keep quality
Resilient Fallbacks

Define automatic fallbacks across models and providers so timeouts, rate limits, or outages degrade gracefully instead of taking your product offline.
Never fail on 500s
Full-Stack Observability

Trace every request across providers with metrics, logs, and latency breakdowns so you can debug incidents and tune model routing in minutes, not days.
See every token
Task-Level Abstractions

Describe tasks like chat, extraction, or classification once and let LLM.API pick the right models and prompts, simplifying integration and future migrations.
Code to tasks, not models
High-Throughput Batch

Send massive batches through a single API with concurrency controls and provider-optimized chunking to cut latency and costs for large-scale workloads.
Ship thousands at once

Decision guide

When to Use — When NOT to Use

Use it if...

You need to process or analyze extremely long documents with a million-token context window.
You need cost-efficient large-context inference for enterprise content generation and summarization workflows.
Your use case involves building AI agents that must reference extensive enterprise knowledge bases.
Your use case involves handling many PDFs and text files in a single request.
You need predictable enterprise deployment via Amazon Bedrock or similar managed cloud environments.
Your use case involves centralized governance over data residency, security, and enterprise compliance controls.

Avoid if...

You need state-of-the-art reasoning benchmarks with transparent scores across standard evaluation suites.
Your workload requires the absolute cheapest output pricing among long-context proprietary LLMs.
You need cutting-edge small-context performance where 1M-token context is unnecessary overhead.
Your workload requires open-source weights for on-premise deployment or deep customization.
You need extensive ecosystem tools, plugins, and community resources comparable to top frontier models.
Your workload requires multimodal generation beyond text, like image or audio outputs.

FAQ

Frequently Asked Questions

What is Palmyra X5?

Palmyra X5 is a large language model from Writer focused on enterprise-grade text generation, editing, and knowledge-intensive tasks.
What is Palmyra X5 best suited for?

Palmyra X5 is best for long-form content generation, marketing copy, product documentation, and domain-specific enterprise workflows requiring consistent style and tone.
What modalities does Palmyra X5 support through LLM.API?

Through LLM.API, Palmyra X5 is accessible as a text-only model for prompts and completions.
What is the context window of Palmyra X5 on LLM.API?

Palmyra X5 supports a context window of up to 32K tokens via LLM.API.
How is Palmyra X5 priced when used via LLM.API?

Palmyra X5 pricing is usage-based per input and output token, with exact rates defined in LLM.API’s pricing documentation.
How fast is Palmyra X5 in terms of latency on LLM.API?

On LLM.API, Palmyra X5 is optimized for low-latency interactive use, with typical responses in the sub-second to few-second range depending on prompt size.
How do I call Palmyra X5 through the LLM.API gateway?

Specify the model name "writer/palmyra-x5" in your LLM.API request along with your API key and standard completion parameters.
How does Palmyra X5 compare to similar LLMs?

Palmyra X5 emphasizes enterprise safety, controllability, and writing quality, making it competitive with other mid-to-large models for business content generation.
Does Palmyra X5 support tools or function calling via LLM.API?

If enabled by LLM.API, Palmyra X5 can be used with the platform’s standardized tool-calling interface similar to other supported models.
What are the main limitations of Palmyra X5?

Palmyra X5 can hallucinate facts, may be less suitable for code-heavy workloads, and should not be used without human review for critical decisions.

Start in 2 lines of code

Get My API Key

Palmyra X5

What is Palmyra X5?

5 Core Capabilities

Advanced Reasoning

Long-Context Handling

Tool and Agent Use

Multilingual Support

Image Input Support

6 Most Valuable Use Cases

Cost Comparison

Technical Specifications

Why Build on LLM.API?

Intelligent AI Routing

Cost-Aware Orchestration

Resilient Fallbacks

Full-Stack Observability

Task-Level Abstractions

High-Throughput Batch

When to Use — When NOT to Use

Use it if...

Avoid if...

Start in 2 lines of code