Perceptron Mk1

Text Classification

Perceptron Mk1 is Perceptron's highest‑quality proprietary vision‑language model focused on video understanding and embodied visual reasoning. It is notable for combining multimodal inputs (text, images, video) with long‑context reasoning and structured visual outputs for production use.

Start Using API

API Performance

Latency: ~1.0s avg response
Context: ~8K token context
Input: Free per 1M tokens
Output: Free per 1M tokens
Uptime: 99% 99%

About the model

What is Perceptron Mk1?

Perceptron Mk1 is a closed-source vision-language model from Perceptron designed for image and video understanding, OCR, object detection, document parsing, and embodied visual reasoning. It is mainly used for tasks like video question answering, summarization, event and temporal segment detection, and detailed scene analysis across robotics, autonomous systems, surveillance, and AR applications. It also supports structured outputs such as spatial annotations (points, boxes, polygons) and temporal clips for production APIs that need precise localization and grounding in complex visual data. Perceptron Mk1 is the first and highest-quality model in the proprietary Perceptron Mk family of multimodal vision and reasoning models.

Input / Output

Input

Text prompts and natural language instructions
Images (vision inputs for OCR, detection, captioning, etc.)
Video clips (for temporal and embodied reasoning)

Output

Natural-language text responses (descriptions, answers, summaries)
Structured text outputs (JSON-like annotations: points, boxes, polygons, clips)

Model capabilities

5 Core Capabilities

Conversational Chat

Engages in multi-turn text conversations, following instructions, answering questions, and adapting responses to user context and intent.
Image Analysis

Interprets visual content in images, identifying objects, scenes, relationships, and other salient details to support downstream reasoning tasks.
Text Translation

Translates written text between multiple languages while preserving meaning, tone, and basic formatting across diverse domains and styles.
Screen Content Handling

Processes user interface or webpage-like content, enabling reasoning about layouts, elements, and on-screen information for digital tasks.
Optical Character Recognition

Extracts machine-readable text from images, screenshots, or scanned documents to enable search, editing, or further automated processing.

Use cases

6 Most Valuable Use Cases

Video Question Answering
Video Event Detection
Image Object Detection
Document OCR Parsing
Robotics Data Curation
Embodied Spatial Reasoning

Transparent pricing

Cost Comparison

LLM API offers the lowest cost and best performance for Perceptron‑class models.

Provider	Region	Latency	Throughput	Uptime	Input ($/1M)	Output ($/1M)	Context
LLM API BEST	Global	120ms	120 tps	99.99%	$0.20	$0.20	64K tokens
Perceptron	US East	~220ms	~60 tps	~99.9%	~$0.45	~$0.45	~32K tokens
OpenAI	Global	~250ms	~80 tps	~99.9%	~$0.50	~$0.50	~32K tokens
Anthropic	US West	~260ms	~70 tps	~99.9%	~$0.55	~$0.55	~200K tokens
Azure AI	EU West	~240ms	~75 tps	~99.95%	~$0.52	~$0.52	~32K tokens

Performance benchmarks

Technical Specifications

Metric	Perceptron Mk1	OpenAI GPT-4.1 Mini	Anthropic Claude 3 Haiku
Avg Latency	~180ms	~200ms	~220ms
Context Window	128K	128K	200K
Input Price ($/1M)	$0.15	$0.15	$0.25
Output Price ($/1M)	$0.60	$0.60	$0.80
Max Output Tokens	8K	4K	8K
Throughput	~120 tps	~100 tps	~90 tps
Uptime	99.9%	99.9%	99.9%

30-day usage via LLM API

620M: Prompt tokens processed (last 30 days)
4.8M: API requests served (last 30 days)
710M: Completion tokens generated (last 30 days)
99.6%: Avg uptime over API (last 30 days)

Start Using API

Architecture & Integration

Why Build on LLM.API?

One unified API. Every major model. Built-in reliability, cost control, and observability.

Intelligent Model Routing

Dynamically route each request to the optimal model across providers using policies, constraints, and real-time signals—no client rewrites or vendor-specific logic needed.
One endpoint, any model
Cost-Aware Orchestration

Control spend with price-based routing, per-project limits, and smart downgrades—keeping quality high while preventing surprise bills in production workloads.
Optimize quality per dollar
Resilient Fallback Flows

Define graceful failover chains across models and providers so timeouts, rate limits, or outages degrade smoothly instead of breaking user experiences.
Never ship single points of failure
End-to-End Observability

Trace every request across providers with unified logs, metrics, and latency breakdowns to debug issues fast and tune prompts and routing policies confidently.
See every token, everywhere
Task-Level Abstractions

Describe tasks like chat, embeddings, tools, or reranking once and let LLM.API translate them into each provider’s API, schema, and capabilities.
Standard tasks, many backends
High-Throughput Batch APIs

Ship large workloads as batches with concurrency control, retries, and partial-failure handling built in—cutting overhead and maximizing throughput across providers.
Batch at scale, safely

Decision guide

When to Use — When NOT to Use

Use it if...

You need a general-purpose model from Perceptron for prototyping typical AI features.
You need basic question-answering or summarization without demanding complex multi-step reasoning.
Your use case involves moderate-length chat interactions with straightforward instructions and responses.
You need an additional non-OpenAI model to diversify provider or vendor dependencies.
Your use case involves experimentation or benchmarking across multiple mid-range foundation models.
You need a Perceptron-native model to integrate with existing Perceptron tooling or dashboards.

Avoid if...

You need state-of-the-art reasoning, planning, or tool use comparable to leading flagship models.
Your workload requires very long context windows handling large codebases or multi-document reasoning.
You need specialized capabilities like high-fidelity image generation or advanced multimodal understanding.
Your workload requires strict, well-documented compliance certifications or audited industry-specific guarantees.
You need a highly mature ecosystem with extensive third-party integrations, plugins, and community tools.
Your workload requires detailed, battle-tested performance benchmarks across many public evaluation suites.

FAQ

Frequently Asked Questions

What is Perceptron Mk1?

Perceptron Mk1 is a large language model from Perceptron focused on fast, low-cost text generation for general software and product development use cases.
What is Perceptron Mk1 best suited for?

Perceptron Mk1 is best for code generation, refactoring, technical writing, and structured data transformations where low latency and high throughput matter.
How is Perceptron Mk1 priced on LLM.API?

Perceptron Mk1 uses LLM.API’s unified per-token billing; check your LLM.API pricing dashboard for current input and output token rates.
What is the context window of Perceptron Mk1?

Perceptron Mk1 supports a context window defined by the LLM.API integration; refer to the model card for the latest maximum token limit.
How fast is Perceptron Mk1 in terms of latency?

Perceptron Mk1 is optimized for low latency, typically returning first tokens quickly enough for interactive applications when used via LLM.API.
What modalities does Perceptron Mk1 support?

Perceptron Mk1 is a text-only model, accepting text prompts and returning text completions via the LLM.API interface.
How do I call Perceptron Mk1 through LLM.API?

Specify the Perceptron Mk1 model name in your LLM.API completion or chat endpoint request, using the same authentication and parameters as other models.
How does Perceptron Mk1 compare to similar models on LLM.API?

Perceptron Mk1 typically trades some reasoning depth for higher throughput and lower cost compared with larger, more capable general-purpose models.
What are the main limitations of Perceptron Mk1?

Perceptron Mk1 can struggle with complex multi-step reasoning, domain-expert answers, and may occasionally produce incorrect or hallucinated information.
Can I use Perceptron Mk1 for streaming responses?

Yes, when enabled in LLM.API, Perceptron Mk1 supports streaming token responses to improve perceived latency for end users.

Start in 2 lines of code

Get My API Key

Perceptron Mk1

What is Perceptron Mk1?

5 Core Capabilities

Conversational Chat

Image Analysis

Text Translation

Screen Content Handling

Optical Character Recognition

6 Most Valuable Use Cases

Cost Comparison

Technical Specifications

Why Build on LLM.API?

Intelligent Model Routing

Cost-Aware Orchestration

Resilient Fallback Flows

End-to-End Observability

Task-Level Abstractions

High-Throughput Batch APIs

When to Use — When NOT to Use

Use it if...

Avoid if...

Start in 2 lines of code