How much does Perplexity Sonar Reasoning cost via Railwail?

Input: €1.00 per 1M tokens. Output: €5.00 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
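As a rough worked example of the rates above, the sketch below estimates what a single call might cost. The helper name estimateCostEur is hypothetical and not part of the railwail SDK. It also assumes reasoning tokens are billed as output tokens and that there is no separate search fee, neither of which this page confirms.

```typescript
// Rates quoted on this page, in EUR per 1M tokens.
const INPUT_EUR_PER_MILLION = 1.0;
const OUTPUT_EUR_PER_MILLION = 5.0;

// Hypothetical helper (not part of the railwail SDK).
// ASSUMPTION: reasoning tokens count as output tokens; no separate search fee.
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_EUR_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_EUR_PER_MILLION;
  return inputCost + outputCost;
}

// 2,000 input tokens and 1,000 output tokens:
// 0.002 EUR + 0.005 EUR = 0.007 EUR
console.log(estimateCostEur(2_000, 1_000).toFixed(4)); // "0.0070"
```

The real charge is whatever the usage data returned with a response implies, so treat this only as a budgeting aid.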

What is the context window of Perplexity Sonar Reasoning?

Perplexity Sonar Reasoning supports a 127K-token context window, enough for long books, technical manuals, and extended analysis.

How fast is Perplexity Sonar Reasoning?

Latency depends on prompt length and load. Because the model spends time on thinking tokens and a live web search, responses typically take 5 to 20 seconds. We measure p50/p95 in real time on /rankings.
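For readers unfamiliar with the p50/p95 notation, the sketch below computes both from a list of latency samples using the nearest-rank method. How /rankings actually computes its figures is not stated on this page, so the method and the sample values here are illustrative assumptions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Made-up latency samples in milliseconds, for illustration only.
const latenciesMs = [5600, 800, 12500, 2100, 4800, 18000, 1200, 9400, 1500, 7300];

console.log(percentile(latenciesMs, 50)); // 4800
console.log(percentile(latenciesMs, 95)); // 18000
```

The p95 value is dominated by the slowest requests, which is why it is usually the more useful number when choosing a client-side timeout.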

Is Perplexity Sonar Reasoning better than Bio_ClinicalBERT?

It depends on your use case. Perplexity Sonar Reasoning (Custom) and Bio_ClinicalBERT (huggingface) are both strong choices in text & chat. Compare them side-by-side at /compare/sonar-reasoning-vs-bio-clinicalbert.

Perplexity Sonar Reasoning

Custom

Text & Chat

Perplexity's reasoning model with chain-of-thought and integrated web search.

TL;DR·Last updated June 24, 2026

Perplexity Sonar Reasoning is a text & chat AI model from Custom, priced at €1.00 per 1M input tokens with a 127K-token context window.

Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Perplexity Sonar Reasoning into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("sonar-reasoning", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("sonar-reasoning", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("sonar-reasoning", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
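Since the API is described as OpenAI-compatible, a request can presumably also be sent without the SDK. The sketch below builds a Chat Completions style body and posts it with fetch. The base URL is a placeholder, not a real Railwail endpoint, and the helper names buildChatRequest and sendChat are hypothetical; only the model ID, the temperature and max_tokens options, and the 8,192-token output cap come from this page.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  max_tokens: number;
}

// Max output listed under Specifications on this page.
const MAX_OUTPUT_TOKENS = 8192;

// Hypothetical helper: builds an OpenAI-style chat request body and
// clamps max_tokens to the model's listed output limit.
function buildChatRequest(
  messages: ChatMessage[],
  temperature = 0.7,
  maxTokens = 500,
): ChatRequestBody {
  return {
    model: "sonar-reasoning",
    messages,
    temperature,
    max_tokens: Math.min(maxTokens, MAX_OUTPUT_TOKENS),
  };
}

// ASSUMPTION: placeholder base URL. Replace it with the endpoint given in
// the provider's documentation; this page does not state one.
const BASE_URL = "https://api.example.invalid/v1";

// Hypothetical helper: posts the body in the OpenAI Chat Completions shape.
async function sendChat(apiKey: string, body: ChatRequestBody): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}

const body = buildChatRequest([{ role: "user", content: "Hello!" }], 0.7, 20_000);
console.log(body.max_tokens); // 8192
```

Clamping max_tokens on the client is a design choice to avoid a rejected request; the server may enforce its own limit regardless.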

Specifications

Context window

127,000 tokens

Max output

8,192 tokens

Developer

Custom
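To make the two limits above concrete, the sketch below checks whether a prompt is likely to fit once room is reserved for the reply. It assumes roughly 4 characters per token, which is a common rule of thumb for English text and not a property of this model's tokenizer, and it assumes output tokens count against the same context window. The helper names are hypothetical.

```typescript
// Limits listed under Specifications on this page.
const CONTEXT_WINDOW_TOKENS = 127_000;
const MAX_OUTPUT_TOKENS = 8_192;

// ASSUMPTION: about 4 characters per token. Real counts depend on the
// tokenizer and the language of the text.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// ASSUMPTION: prompt and reply share the same context window.
function fitsInContext(prompt: string, reservedOutputTokens = MAX_OUTPUT_TOKENS): boolean {
  return estimateTokens(prompt) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}

console.log(fitsInContext("a".repeat(400_000))); // true:  100,000 + 8,192 <= 127,000
console.log(fitsInContext("a".repeat(480_000))); // false: 120,000 + 8,192 >  127,000
```

Because retrieved web passages are also injected into the context, the practical budget for your own prompt is likely smaller than this check suggests.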

Deep dive — Perplexity AI's Perplexity Sonar Reasoning

About Perplexity AI

Founded 2022 · San Francisco, California, USA

Perplexity AI was founded in August 2022 by Aravind Srinivas (CEO, former OpenAI and DeepMind researcher), Denis Yarats (former Quora), Johnny Ho (former Quora) and Andy Konwinski (Databricks co-founder). Headquartered in San Francisco, Perplexity built one of the first AI-native search products — an answer engine combining live web retrieval with LLM synthesis and inline citations. The company has raised over $500M from investors including IVP, Nvidia, NEA, Jeff Bezos and SoftBank, with a 2024 valuation above $9B. Perplexity offers its Sonar API for developers to access search-grounded LLM responses. Sonar Reasoning, released January 2025, is built on the open-source DeepSeek-R1 reasoning model wrapped in Perplexity's live web-search retrieval stack with the reasoning trace exposed in the API response.

Visit Perplexity AI →

Architecture

Reasoning Mixture-of-Experts Transformer with live web retrieval

Sonar Reasoning is Perplexity's hosted deployment of DeepSeek-R1 (671B Sparse Mixture-of-Experts, 37B active per token, DeepSeekMoE architecture with Multi-head Latent Attention) wrapped in Perplexity's retrieval pipeline. The base model uses 256 fine-grained experts plus shared experts, top-8 routing, 128K context, and was trained by DeepSeek with the GRPO reinforcement-learning recipe described in the R1 technical report. Perplexity hosts the model on US-based infrastructure, applies its own safety post-training on the Perplexity domain, and exposes the reasoning chain-of-thought as a separate field in the API response alongside the final answer. Each request triggers a live web-search retrieval that injects ranked passages into the model context with source URLs returned alongside the answer. Released January 2025 via the Sonar API.

Parameters: 671B total, 37B active per token (DeepSeek-R1 base)
Context: 127K tokens
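The top-8 routing mentioned above can be illustrated with a toy example. The sketch below picks the k highest-scoring experts for one token and normalizes their weights with a softmax over the selected scores. This is a generic top-k gate written for illustration, with six experts instead of 256; it is not DeepSeek's actual gating function, which differs in its details.

```typescript
interface ExpertPick {
  expert: number; // index of the selected expert
  weight: number; // normalized gate weight; weights sum to 1 across picks
}

// Toy top-k gate: choose the k experts with the highest router scores and
// softmax-normalize over just those k scores.
function routeTopK(scores: number[], k: number): ExpertPick[] {
  if (k < 1 || k > scores.length) {
    throw new Error("k must be between 1 and scores.length");
  }
  const ranked = scores
    .map((score, expert) => ({ expert, score }))
    .sort((a, b) => b.score - a.score) // stable sort keeps index order on ties
    .slice(0, k);
  const maxScore = ranked[0].score; // subtract the max for numerical stability
  const exps = ranked.map((r) => Math.exp(r.score - maxScore));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return ranked.map((r, i) => ({ expert: r.expert, weight: exps[i] / total }));
}

const picks = routeTopK([0.1, 2.0, -1.0, 0.5, 2.0, 0.0], 2);
console.log(picks.map((p) => p.expert)); // [1, 4]
console.log(picks.map((p) => p.weight)); // [0.5, 0.5]
```

Only the selected experts run for a given token, which is how a 671B-parameter model can activate about 37B parameters (roughly 5.5%) per token.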

What it can do

Combines live web search with explicit reasoning traces
Returns ranked citations with source URLs alongside answers
Strong on math, logic and research-style questions
Reasoning chain-of-thought exposed in the API
Knowledge always current via live search — bypasses model cutoff limits
Cheaper than OpenAI o-series for comparable reasoning quality at release
Hosted on Perplexity's US infrastructure
Best for: research-grade Q&A, market intelligence, citation-required workflows, fact-grounded reasoning.
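The capabilities above mention an exposed reasoning trace, but this page does not name the response field that carries it. DeepSeek-R1 itself emits its reasoning inline between <think> and </think> tags, so a client that receives R1-style text may want to separate the trace from the final answer. The sketch below does that under this assumption; splitReasoning is a hypothetical helper, not part of the railwail SDK.

```typescript
interface SplitResponse {
  reasoning: string; // text found inside the first <think>...</think> block
  answer: string; // everything else, trimmed
}

// ASSUMPTION: the trace arrives inline in <think> tags, as in raw
// DeepSeek-R1 output. If the API returns it in a separate field,
// this helper is unnecessary.
function splitReasoning(raw: string): SplitResponse {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: "", answer: raw.trim() };
  }
  const start = match.index ?? 0;
  const end = start + match[0].length;
  return {
    reasoning: match[1].trim(),
    answer: (raw.slice(0, start) + raw.slice(end)).trim(),
  };
}

const parsed = splitReasoning("<think>2 + 2 is 4.</think>\nThe answer is 4.");
console.log(parsed.reasoning); // "2 + 2 is 4."
console.log(parsed.answer); // "The answer is 4."
```

Keeping the trace separate makes it easy to log it for debugging while showing users only the answer and its citations.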

Training & License

Inherits DeepSeek-R1's training pipeline: DeepSeek-V3-Base pretraining on ~14.8T tokens of multilingual web, code and math data, followed by the R1 reasoning RL stage with GRPO. Perplexity adds retrieval-grounded supervised fine-tuning on its domain. The base model's knowledge cutoff is approximately July 2024, but live search supplies current information for recent questions.
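The core idea of GRPO is to score each sampled answer relative to the other answers sampled for the same prompt, instead of against a learned value model. The sketch below shows only that group-relative advantage step. It assumes the population standard deviation and leaves out the clipped policy-ratio objective and the KL penalty, so it is an illustration of the normalization and not a training loop.

```typescript
// Group-relative advantage: (reward - group mean) / group standard deviation.
// ASSUMPTION: population standard deviation (divide by n, not n - 1).
function groupRelativeAdvantages(rewards: number[]): number[] {
  if (rewards.length === 0) return [];
  const mean = rewards.reduce((sum, r) => sum + r, 0) / rewards.length;
  const variance =
    rewards.reduce((sum, r) => sum + (r - mean) ** 2, 0) / rewards.length;
  const std = Math.sqrt(variance);
  // If every answer got the same reward there is no signal to learn from.
  if (std === 0) return rewards.map(() => 0);
  return rewards.map((r) => (r - mean) / std);
}

// Four sampled answers to one prompt: two correct (1), two incorrect (0).
console.log(groupRelativeAdvantages([1, 0, 0, 1])); // [1, -1, -1, 1]
```

Answers that beat the group average get a positive advantage and are reinforced; those below it are discouraged.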

License: Perplexity API Terms of Service. The base DeepSeek-R1 weights are MIT-licensed and can be self-hosted separately, but the live-search and reasoning-citation pipelines are Perplexity-specific and hosted-only.

Known limitations

Reasoning latency is high (typically 5-20s) due to thinking tokens plus search
Quality depends on what the search retrieves — bad sources lead to bad answers
Per-call cost includes reasoning tokens plus search — can be expensive on long answers
No vision input
Reasoning trace can be verbose and is not directly user-facing
Hosted-only — Perplexity does not offer self-hosting for the Sonar stack

Related Models

Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8