Powered by Writer

Palmyra X5

  • Text Generation

Palmyra X5 is Writer's most advanced enterprise large language model, featuring an extremely long context window and adaptive reasoning for complex business workflows. It is purpose-built for building and scaling AI agents across the enterprise with strong performance on long-form, text-heavy tasks.

Start Using API

What is Palmyra X5?

Palmyra X5 is Writer’s flagship enterprise large language model designed for adaptive reasoning over very long text inputs. It is used for enterprise content generation and long-document analysis, such as processing extensive reports, knowledge bases, and regulatory or research materials, and for powering AI agents that automate complex business workflows across domains like finance, healthcare, and software. It belongs to Writer’s Palmyra family of foundation models and succeeds earlier generations such as Palmyra X4.

5 Core Capabilities

  • Advanced Reasoning

    Performs deep, multi-step reasoning over complex business tasks, enabling reliable enterprise agents and sophisticated decision-support workflows.

  • Long-Context Handling

    Processes and grounds responses in very long inputs, supporting analysis of large document sets and extensive enterprise knowledge bases.

  • Tool and Agent Use

    Calls external tools and composes multi-step AI agents, orchestrating workflows such as retrieval, APIs, and database interactions.

  • Multilingual Support

    Understands and generates text in over 30 languages, enabling global enterprise deployments and cross-lingual workflows.

  • Image Input Support

    Accepts images as inputs to inform responses, allowing multimodal enterprise workflows that combine visual data with text.

6 Most Valuable Use Cases

  • Long-Document Summarization
  • Enterprise Content Generation
  • AI Agent Workflows
  • Knowledge Base Question-Answering
  • Regulatory Policy Analysis
  • Business Process Automation

Cost Comparison

LLM API offers the lowest token prices and latency for Palmyra X5–class models.

Provider Region Latency Throughput Uptime Input ($/1M) Output ($/1M) Context
LLM API BEST Global 120ms 120 tps 99.99% $0.40 $0.80 128K
Writer US ~220ms ~60 tps 99.9% ~$0.60 ~$1.20 32K
OpenAI (closest: GPT-4.1-mini) Global ~250ms ~80 tps 99.9% ~$0.50 ~$1.00 128K
Anthropic (closest: Claude 3.5 Haiku) US East ~260ms ~70 tps 99.9% ~$0.55 ~$1.10 200K
Google Cloud (closest: Gemini 1.5 Pro) Global ~280ms ~65 tps 99.9% ~$0.70 ~$1.40 1M

Technical Specifications

Metric Palmyra X5 (Writer) GPT-4.1 Mini (OpenAI) Claude 3.5 Sonnet (Anthropic)
Avg Latency ~220ms ~180ms ~250ms
Context Window 128K 128K 200K
Input Price ($/1M) $0.80 $0.15 $3.00
Output Price ($/1M) $2.40 $0.60 $15.00
Max Output Tokens 8K 4K 8K
Throughput 40 tps 60 tps 35 tps
Uptime 99.9% 99.9% 99.9%

30-day usage via LLM API

5.8B
Prompt tokens processed (last 30 days)
2.1B
Completion tokens generated (last 30 days)
7.4M
API requests served (last 30 days)
99.8%
Average API uptime (last 30 days)
Start Using API

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

  • Intelligent AI Routing

    Automatically route each request to the best model across providers based on cost, latency, and quality—no client changes required when your stack evolves.

    One endpoint, every model
  • Cost-Aware Orchestration

    Enforce budgets, compare provider pricing, and transparently shift traffic to cheaper equivalents while preserving quality so you never overspend on inference again.

    Cut spend, keep quality
  • Resilient Fallbacks

    Define automatic fallbacks across models and providers so timeouts, rate limits, or outages degrade gracefully instead of taking your product offline.

    Never fail on 500s
  • Full-Stack Observability

    Trace every request across providers with metrics, logs, and latency breakdowns so you can debug incidents and tune model routing in minutes, not days.

    See every token
  • Task-Level Abstractions

    Describe tasks like chat, extraction, or classification once and let LLM.API pick the right models and prompts, simplifying integration and future migrations.

    Code to tasks, not models
  • High-Throughput Batch

    Send massive batches through a single API with concurrency controls and provider-optimized chunking to cut latency and costs for large-scale workloads.

    Ship thousands at once

When to Use — When NOT to Use

Use it if...

  • You need to process or analyze extremely long documents with a million-token context window.
  • You need cost-efficient large-context inference for enterprise content generation and summarization workflows.
  • Your use case involves building AI agents that must reference extensive enterprise knowledge bases.
  • Your use case involves handling many PDFs and text files in a single request.
  • You need predictable enterprise deployment via Amazon Bedrock or similar managed cloud environments.
  • Your use case involves centralized governance over data residency, security, and enterprise compliance controls.

Avoid if...

  • You need state-of-the-art reasoning benchmarks with transparent scores across standard evaluation suites.
  • Your workload requires the absolute cheapest output pricing among long-context proprietary LLMs.
  • You need cutting-edge small-context performance where 1M-token context is unnecessary overhead.
  • Your workload requires open-source weights for on-premise deployment or deep customization.
  • You need extensive ecosystem tools, plugins, and community resources comparable to top frontier models.
  • Your workload requires multimodal generation beyond text, like image or audio outputs.

Frequently Asked Questions

  • What is Palmyra X5?

    Palmyra X5 is a large language model from Writer focused on enterprise-grade text generation, editing, and knowledge-intensive tasks.

  • What is Palmyra X5 best suited for?

    Palmyra X5 is best for long-form content generation, marketing copy, product documentation, and domain-specific enterprise workflows requiring consistent style and tone.

  • What modalities does Palmyra X5 support through LLM.API?

    Through LLM.API, Palmyra X5 is accessible as a text-only model for prompts and completions.

  • What is the context window of Palmyra X5 on LLM.API?

    Palmyra X5 supports a context window of up to 32K tokens via LLM.API.

  • How is Palmyra X5 priced when used via LLM.API?

    Palmyra X5 pricing is usage-based per input and output token, with exact rates defined in LLM.API’s pricing documentation.

  • How fast is Palmyra X5 in terms of latency on LLM.API?

    On LLM.API, Palmyra X5 is optimized for low-latency interactive use, with typical responses in the sub-second to few-second range depending on prompt size.

  • How do I call Palmyra X5 through the LLM.API gateway?

    Specify the model name "writer/palmyra-x5" in your LLM.API request along with your API key and standard completion parameters.

  • How does Palmyra X5 compare to similar LLMs?

    Palmyra X5 emphasizes enterprise safety, controllability, and writing quality, making it competitive with other mid-to-large models for business content generation.

  • Does Palmyra X5 support tools or function calling via LLM.API?

    If enabled by LLM.API, Palmyra X5 can be used with the platform’s standardized tool-calling interface similar to other supported models.

  • What are the main limitations of Palmyra X5?

    Palmyra X5 can hallucinate facts, may be less suitable for code-heavy workloads, and should not be used without human review for critical decisions.

Start in 2 lines of code

Get My API Key