Powered by Writer
Palmyra X5
- Text Generation
Palmyra X5 is Writer's most advanced enterprise large language model, featuring an extremely long context window and adaptive reasoning for complex business workflows. It is purpose-built for building and scaling AI agents across the enterprise with strong performance on long-form, text-heavy tasks.
About the model
What is Palmyra X5?
Palmyra X5 is Writer’s flagship enterprise large language model designed for adaptive reasoning over very long text inputs. It is used for enterprise content generation and long-document analysis, such as processing extensive reports, knowledge bases, and regulatory or research materials, and for powering AI agents that automate complex business workflows across domains like finance, healthcare, and software. It belongs to Writer’s Palmyra family of foundation models and succeeds earlier generations such as Palmyra X4.
Model capabilities
5 Core Capabilities
-
Advanced Reasoning
Performs deep, multi-step reasoning over complex business tasks, enabling reliable enterprise agents and sophisticated decision-support workflows.
-
Long-Context Handling
Processes and grounds responses in very long inputs, supporting analysis of large document sets and extensive enterprise knowledge bases.
-
Tool and Agent Use
Calls external tools and composes multi-step AI agents, orchestrating workflows such as retrieval, APIs, and database interactions.
-
Multilingual Support
Understands and generates text in over 30 languages, enabling global enterprise deployments and cross-lingual workflows.
-
Image Input Support
Accepts images as inputs to inform responses, allowing multimodal enterprise workflows that combine visual data with text.
Use cases
6 Most Valuable Use Cases
- Long-Document Summarization
- Enterprise Content Generation
- AI Agent Workflows
- Knowledge Base Question-Answering
- Regulatory Policy Analysis
- Business Process Automation
Transparent pricing
Cost Comparison
LLM API offers the lowest token prices and latency for Palmyra X5–class models.
| Provider | Region | Latency | Throughput | Uptime | Input ($/1M) | Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
| LLM API BEST | Global | 120ms | 120 tps | 99.99% | $0.40 | $0.80 | 128K |
| Writer | US | ~220ms | ~60 tps | 99.9% | ~$0.60 | ~$1.20 | 32K |
| OpenAI (closest: GPT-4.1-mini) | Global | ~250ms | ~80 tps | 99.9% | ~$0.50 | ~$1.00 | 128K |
| Anthropic (closest: Claude 3.5 Haiku) | US East | ~260ms | ~70 tps | 99.9% | ~$0.55 | ~$1.10 | 200K |
| Google Cloud (closest: Gemini 1.5 Pro) | Global | ~280ms | ~65 tps | 99.9% | ~$0.70 | ~$1.40 | 1M |
Performance benchmarks
Technical Specifications
| Metric | Palmyra X5 (Writer) | GPT-4.1 Mini (OpenAI) | Claude 3.5 Sonnet (Anthropic) |
|---|---|---|---|
| Avg Latency | ~220ms | ~180ms | ~250ms |
| Context Window | 128K | 128K | 200K |
| Input Price ($/1M) | $0.80 | $0.15 | $3.00 |
| Output Price ($/1M) | $2.40 | $0.60 | $15.00 |
| Max Output Tokens | 8K | 4K | 8K |
| Throughput | 40 tps | 60 tps | 35 tps |
| Uptime | 99.9% | 99.9% | 99.9% |
30-day usage via LLM API
- 5.8B
- Prompt tokens processed (last 30 days)
- 2.1B
- Completion tokens generated (last 30 days)
- 7.4M
- API requests served (last 30 days)
- 99.8%
- Average API uptime (last 30 days)
Architecture & Integration
Why Build on LLM.API?
One unified API. Every major model. Built-in reliability, cost control, and observability.
-
Intelligent AI Routing
Automatically route each request to the best model across providers based on cost, latency, and quality—no client changes required when your stack evolves.
One endpoint, every model -
Cost-Aware Orchestration
Enforce budgets, compare provider pricing, and transparently shift traffic to cheaper equivalents while preserving quality so you never overspend on inference again.
Cut spend, keep quality -
Resilient Fallbacks
Define automatic fallbacks across models and providers so timeouts, rate limits, or outages degrade gracefully instead of taking your product offline.
Never fail on 500s -
Full-Stack Observability
Trace every request across providers with metrics, logs, and latency breakdowns so you can debug incidents and tune model routing in minutes, not days.
See every token -
Task-Level Abstractions
Describe tasks like chat, extraction, or classification once and let LLM.API pick the right models and prompts, simplifying integration and future migrations.
Code to tasks, not models -
High-Throughput Batch
Send massive batches through a single API with concurrency controls and provider-optimized chunking to cut latency and costs for large-scale workloads.
Ship thousands at once
Decision guide
When to Use — When NOT to Use
Use it if...
- You need to process or analyze extremely long documents with a million-token context window.
- You need cost-efficient large-context inference for enterprise content generation and summarization workflows.
- Your use case involves building AI agents that must reference extensive enterprise knowledge bases.
- Your use case involves handling many PDFs and text files in a single request.
- You need predictable enterprise deployment via Amazon Bedrock or similar managed cloud environments.
- Your use case involves centralized governance over data residency, security, and enterprise compliance controls.
Avoid if...
- You need state-of-the-art reasoning benchmarks with transparent scores across standard evaluation suites.
- Your workload requires the absolute cheapest output pricing among long-context proprietary LLMs.
- You need cutting-edge small-context performance where 1M-token context is unnecessary overhead.
- Your workload requires open-source weights for on-premise deployment or deep customization.
- You need extensive ecosystem tools, plugins, and community resources comparable to top frontier models.
- Your workload requires multimodal generation beyond text, like image or audio outputs.
FAQ
Frequently Asked Questions
-
What is Palmyra X5?
Palmyra X5 is a large language model from Writer focused on enterprise-grade text generation, editing, and knowledge-intensive tasks.
-
What is Palmyra X5 best suited for?
Palmyra X5 is best for long-form content generation, marketing copy, product documentation, and domain-specific enterprise workflows requiring consistent style and tone.
-
What modalities does Palmyra X5 support through LLM.API?
Through LLM.API, Palmyra X5 is accessible as a text-only model for prompts and completions.
-
What is the context window of Palmyra X5 on LLM.API?
Palmyra X5 supports a context window of up to 32K tokens via LLM.API.
-
How is Palmyra X5 priced when used via LLM.API?
Palmyra X5 pricing is usage-based per input and output token, with exact rates defined in LLM.API’s pricing documentation.
-
How fast is Palmyra X5 in terms of latency on LLM.API?
On LLM.API, Palmyra X5 is optimized for low-latency interactive use, with typical responses in the sub-second to few-second range depending on prompt size.
-
How do I call Palmyra X5 through the LLM.API gateway?
Specify the model name "writer/palmyra-x5" in your LLM.API request along with your API key and standard completion parameters.
-
How does Palmyra X5 compare to similar LLMs?
Palmyra X5 emphasizes enterprise safety, controllability, and writing quality, making it competitive with other mid-to-large models for business content generation.
-
Does Palmyra X5 support tools or function calling via LLM.API?
If enabled by LLM.API, Palmyra X5 can be used with the platform’s standardized tool-calling interface similar to other supported models.
-
What are the main limitations of Palmyra X5?
Palmyra X5 can hallucinate facts, may be less suitable for code-heavy workloads, and should not be used without human review for critical decisions.
