Powered by Poolside

Laguna XS.2 (free)

  • Text Generation

Laguna XS.2 (free) by Poolside is a compact, open‑weight agentic coding model optimized for fast, affordable software engineering workflows, available at no cost via selected providers. It combines a Mixture‑of‑Experts architecture with strong coding performance while remaining lightweight enough to run in more constrained environments.

Start Using API

What is Laguna XS.2 (free)?

Laguna XS.2 is Poolside’s open‑weight, 33B‑parameter Mixture‑of‑Experts language model with about 3B active parameters, designed primarily for agentic coding and software engineering tasks. It is used for building coding agents that can iteratively edit code, run tools, and solve multi‑step programming tasks, and for running private or on‑premise development assistants thanks to its relatively low resource requirements. It also supports general chat-style interactions for developers through APIs and integrations such as OpenRouter and third‑party platforms offering a free preview tier. Laguna XS.2 is part of Poolside’s Laguna family of models and is a second‑generation, XS‑class successor building on the training pipeline and lessons from the larger Laguna M.1 model.

5 Core Capabilities

  • Conversational Chat

    Supports instruction-following, multi-turn chat, and reasoning-focused assistant interactions over long contexts using a text-to-text chat interface.

  • Code Generation

    Optimized for software engineering and agentic coding, generating and editing code, fixing bugs, and handling multi-step programming tasks.

  • Long-Context Reasoning

    Handles up to 262k-token contexts with sliding window and global attention, enabling long-horizon reasoning and document-spanning workflows.

  • Tool And Function Use

    Natively supports tool use and function calling, reasoning before and between tool calls for automated workflows and coding agents.

  • Multilingual Text

    Processes and generates text in multiple languages, enabling cross-lingual chat, documentation assistance, and programming help with non-English content.

6 Most Valuable Use Cases

  • Local Coding Agents
  • Automated Bug Fixing
  • Codebase Refactoring
  • Tool-Assisted Debugging
  • Software Dev Assistants
  • Multi-Step Code Reasoning

Cost Comparison

LLM API offers the lowest cost and latency vs comparable Laguna-class APIs.

Provider Region Latency Throughput Uptime Input ($/1M) Output ($/1M) Context
LLM API BEST Global 120ms 80 tps 99.99% $0.05 $0.10 128K
Poolside Global ~220ms ~35 tps ~99.9% $0.00 $0.00 ~64K
OpenAI US East ~250ms ~40 tps 99.9% ~$0.30 ~$0.60 ~128K
Anthropic US West ~260ms ~30 tps 99.9% ~$0.35 ~$0.80 ~200K
Google Cloud EU West ~280ms ~25 tps 99.9% ~$0.40 ~$0.80 ~128K

Technical Specifications

Metric Laguna XS.2 (free) GPT-4o mini (OpenAI) Claude 3.5 Haiku (Anthropic)
Avg Latency ~250ms ~300ms ~350ms
Context Window 128K 128K 200K
Input Price ($/1M) $0.00 $0.15 $0.25
Output Price ($/1M) $0.00 $0.60 $1.25
Max Output Tokens 4K 4K 4K
Throughput ~120 tps ~100 tps ~90 tps
Uptime 99.0% 99.9% 99.9%

30-day usage via LLM API

12.5B
Prompt tokens processed (last 30 days)
7.8B
Completion tokens generated (last 30 days)
9.3M
API requests served (last 30 days)
410K
Unique users (last 30 days)
Start Using API

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

  • Intelligent AI Routing

    Automatically route each request to the optimal model across providers based on latency, capability, and cost—without changing your integration.

    One endpoint, any model
  • Cost-Aware Orchestration

    Blend premium and budget models with per-request cost controls, guardrails, and policies so teams can ship richer AI features without runaway spend.

    More AI, less spend
  • Resilient Fallback Flows

    Define automatic cross-provider fallbacks, retries, and degradations so production workloads keep working even when individual models or regions fail.

    Designed for failure
  • Full-Stack Observability

    Trace every call across providers with logs, metrics, and structured spans to debug prompts, tune routing, and meet compliance requirements.

    See every token
  • Task-Level Abstractions

    Describe tasks like chat, tools, RAG, or classification once and let LLM.API handle provider-specific prompts, parameters, and response shaping.

    Tasks, not prompts
  • High-Throughput Batch

    Submit large batches through a unified API with provider-aware chunking, concurrency control, and retries to slash latency and infrastructure overhead.

    Scale to millions

When to Use — When NOT to Use

Use it if...

  • You need a free model from Poolside for experimentation or early prototyping.
  • Your use case involves simple Q&A, proofreading, or light content rewriting tasks.
  • You need a backup or fallback model to reduce overall API spending.
  • Your use case involves building internal tools where occasional inaccuracies are acceptable.
  • You need to test Poolside platform integration before committing to paid tiers.
  • Your use case involves short-form text generation like summaries, captions, or brief replies.

Avoid if...

  • You need guaranteed top-tier reasoning performance comparable to the latest frontier models.
  • Your workload requires highly reliable code generation, debugging, and complex software design support.
  • You need enterprise-grade SLAs, dedicated support, or strict performance guarantees for production systems.
  • Your workload requires specialized capabilities like vision, audio, tools, or very long context windows.
  • You need state-of-the-art performance on complex math, formal logic, or multi-step planning.
  • Your workload requires tight latency guarantees for real-time or user-facing critical interactions.

Frequently Asked Questions

  • What is Laguna XS.2 (free)?

    Laguna XS.2 (free) is a lightweight Poolside language model accessible via LLM.API, intended for general-purpose text generation and experimentation at no charge.

  • What modalities does Laguna XS.2 (free) support?

    Laguna XS.2 (free) supports text-only input and output, without native image, audio, or video understanding capabilities.

  • How is Laguna XS.2 (free) priced on LLM.API?

    Laguna XS.2 (free) is offered with no per-token usage fees on LLM.API, subject to platform-specific free-tier and rate limits.

  • What is the context window of Laguna XS.2 (free)?

    Laguna XS.2 (free) supports a 16K token context window for combined input and output on LLM.API.

  • How fast is Laguna XS.2 (free) in terms of latency and throughput?

    Laguna XS.2 (free) is optimized for low-latency responses and higher throughput than larger models, though exact speeds depend on LLM.API load and client location.

  • How do I call Laguna XS.2 (free) through LLM.API?

    You can call Laguna XS.2 (free) by selecting its model name in the LLM.API completion or chat endpoint and authenticating with your LLM.API key.

  • What types of tasks is Laguna XS.2 (free) best suited for?

    Laguna XS.2 (free) works best for lightweight tasks like drafting text, basic coding help, quick data transformations, and prototyping chat-style assistants.

  • How does Laguna XS.2 (free) compare to larger Poolside or other premium models?

    Laguna XS.2 (free) generally trades off some reasoning depth, coding precision, and factual accuracy for lower cost and faster responses.

  • What are the main limitations of Laguna XS.2 (free)?

    Laguna XS.2 (free) can hallucinate, struggle with complex multi-step reasoning, lack up-to-date knowledge, and is not suitable for high-stakes or compliance-critical applications.

  • Does Laguna XS.2 (free) support streaming responses on LLM.API?

    Yes, Laguna XS.2 (free) can be used with streaming responses if you enable streaming in your LLM.API request parameters.

Start in 2 lines of code

Get My API Key