Perplexity Sonar

Custom
Text & Chat

Perplexity's fastest and cheapest web-grounded chat model. Live-source citations included.

TL;DR · Last updated May 16, 2026

Perplexity Sonar is a text & chat AI model from Custom, priced at €1.00 per 1M input tokens, with a 127K-token context window.


Direct API access coming soon

Pricing

Price per Generation
Per generation
Free

API Integration

Use our OpenAI-compatible API to integrate Perplexity Sonar into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("sonar", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("sonar", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("sonar", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Specifications
Context window
127,000 tokens
Max output
8,192 tokens
Developer
Custom
Category
Text & Chat
Supported Formats
text
Tags
perplexity
web-search
citations
cost-efficient
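
The figures above (127,000-token context window, 8,192-token maximum output, €1.00 per 1M input tokens) can be turned into a rough pre-flight check. The sketch below is illustrative only: `estimateTokens` and `checkBudget` are hypothetical helpers, the 4-characters-per-token ratio is a crude assumption rather than the model's real tokenizer, and it assumes the reserved output counts against the same context window.

```typescript
// Figures copied from the specification list and TL;DR line of this page.
const CONTEXT_WINDOW_TOKENS = 127000;
const MAX_OUTPUT_TOKENS = 8192;
const EUR_PER_MILLION_INPUT_TOKENS = 1.0;

// Assumption: roughly 4 characters per token. This is a heuristic,
// not the tokenizer the model actually uses.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface Budget {
  inputTokens: number;  // estimated prompt size
  fits: boolean;        // true if prompt + reserved output fit the window
  inputCostEur: number; // estimated input-side cost only
}

// Assumption: the reserved output shares the context window with the prompt.
function checkBudget(prompt: string, maxTokens: number = 500): Budget {
  const inputTokens = estimateTokens(prompt);
  // Never reserve more output than the model's stated maximum.
  const reservedOutput = Math.min(maxTokens, MAX_OUTPUT_TOKENS);
  return {
    inputTokens,
    fits: inputTokens + reservedOutput <= CONTEXT_WINDOW_TOKENS,
    inputCostEur: (inputTokens / 1000000) * EUR_PER_MILLION_INPUT_TOKENS,
  };
}
```

A caller could run `checkBudget(prompt, 500)` before sending a request and trim the prompt when `fits` is false. The cost figure covers input tokens only, since no output price is listed on this page.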

Deep dive — Perplexity AI's Perplexity Sonar

About Perplexity AI
Founded 2022 · San Francisco, USA

Perplexity AI was founded in August 2022 by Aravind Srinivas (CEO, former OpenAI researcher and DeepMind/Google Brain alumnus), Denis Yarats (CTO, former Meta AI researcher), Andy Konwinski (co-founder of Databricks) and Johnny Ho. The company's product is an "answer engine" that combines real-time web search with LLM-based reasoning to return cited answers, positioning itself as a search alternative to Google. Perplexity launched its consumer chat product in late 2022 and quickly raised over $500M across multiple rounds from investors including IVP, NEA, NVIDIA, Jeff Bezos and Susan Wojcicki, reaching a $9B valuation in late 2024.

The company released its own Sonar model family in 2024-2025, fine-tuned for grounded, citation-bearing answers from live web retrieval. Sonar is positioned as a fast, cost-efficient default model in the Perplexity API and consumer product, with Sonar Pro and Sonar Reasoning as higher-capability variants. Perplexity also ships Perplexity Pages (long-form content), Perplexity Spaces (collaboration) and partnership integrations with Apple Intelligence and several US news organisations.

Architecture
Search-grounded LLM (Llama-derived, tuned for retrieval-augmented answering)

Perplexity Sonar is the default fast, low-cost model in Perplexity's API and consumer product, released as a public API tier in January 2025. It is built on a Llama-3.x-derived base that Perplexity fine-tuned for retrieval-augmented generation, citation insertion and concise web-grounded answers. The training process is not fully disclosed, but it follows standard supervised fine-tuning and preference optimisation on Perplexity's curated logs of high-quality, well-cited answers, plus synthetic question-answer pairs grounded in real web pages.

At inference time, every Sonar query is augmented with a live web search step run on Perplexity's in-house search index, which retrieves the most relevant pages, ranks them, and supplies them to the model as context. The model is trained to cite sources inline using numbered references, to refuse to answer when the retrieved sources do not support a claim, and to compose answers in a concise, structured format.

Sonar supports a 127K-token context window and the standard OpenAI-compatible chat completions API, making it a drop-in replacement for OpenAI models in many search-enabled agent stacks. At launch the cost was roughly $1 per 1,000 search-augmented requests, making it one of the cheapest fully grounded options on the market.
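
The inference-time flow described above (retrieve pages, rank them, supply them as context, ask for numbered citations) can be sketched generically. This is not Perplexity's implementation, which is not public. `RetrievedPage`, its `score` field, `buildGroundedMessages` and the wording of the system prompt are all hypothetical names and choices made for illustration.

```typescript
// Hypothetical shape of one search result; the real index format is not public.
interface RetrievedPage {
  url: string;
  title: string;
  snippet: string;
  score: number; // assumed relevance score, higher is better
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Rank the pages, keep the top K, and number them so the model can cite [n].
function buildGroundedMessages(
  question: string,
  pages: RetrievedPage[],
  topK: number = 3
): ChatMessage[] {
  // Copy before sorting so the caller's array is left untouched.
  const ranked = pages.slice().sort((a, b) => b.score - a.score).slice(0, topK);
  const sources = ranked
    .map((p, i) => `[${i + 1}] ${p.title} (${p.url})\n${p.snippet}`)
    .join("\n\n");
  const system =
    "Answer using only the numbered sources below. Cite them inline as [n]. " +
    "If the sources do not support an answer, say so.\n\n" +
    sources;
  return [
    { role: "system", content: system },
    { role: "user", content: question },
  ];
}
```

The returned array has the same `{ role, content }` shape as the message history in the API Integration example, so a self-managed retrieval step could feed any chat model this way. With Sonar the search and context assembly happen on Perplexity's side.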

Parameters
Undisclosed (Llama-3.x-derived base, mid-sized)
Context
127K tokens
What it can do
  • Live web search built into every request
  • Inline citations with numbered source references
  • Concise, structured answers tuned for search use cases
  • 127K context window
  • OpenAI-compatible chat completions API
  • Low latency and low cost (around $1 per 1k search-augmented requests at launch)
  • Returns sources, images and related questions in the response
  • Drop-in replacement for retrieval-augmented agents
  • Refuses to answer when source corpus is insufficient
  • Available in Perplexity API and consumer product
  • Best for: real-time research, grounded Q&A, news summarisation, RAG without managing your own retrieval.
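
The inline numbered citations listed above can be mapped back to source URLs on the client side. The sketch below assumes the answer text contains markers such as `[1]` and that the sources arrive as a parallel array of URLs. That response shape is an assumption for illustration, not a documented field of the railwail client, and `resolveCitations` is a hypothetical helper.

```typescript
interface ResolvedCitation {
  marker: number;     // the number inside the brackets, e.g. 2 for "[2]"
  url: string | null; // null when the marker has no matching source
}

// Collect each distinct [n] marker in order of first appearance and
// look up source n (1-based) in the sources array.
function resolveCitations(answer: string, sources: string[]): ResolvedCitation[] {
  const seen = new Set<number>();
  const result: ResolvedCitation[] = [];
  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const marker = Number(match[1]);
    if (seen.has(marker)) continue; // report each marker once
    seen.add(marker);
    result.push({ marker, url: sources[marker - 1] ?? null });
  }
  return result;
}
```

Markers that resolve to `null` point outside the source list and are worth surfacing to the user rather than silently dropping.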
Training & License

Built on a Llama-3.x-derived base, fine-tuned on Perplexity's curated logs of high-quality cited answers plus synthetic retrieval-grounded QA pairs. At inference time, the model is augmented with Perplexity's proprietary live web search index.

License: Proprietary commercial license via the Perplexity API. Weights not released. Standard Perplexity Terms of Service apply.

Known limitations
  • Quality depends heavily on Perplexity's retrieval ranking
  • May cite low-quality sources if highly ranked
  • Closed weights; no on-prem option
  • Citations occasionally do not support the exact claim
  • Less capable than Sonar Pro on complex multi-step reasoning
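
Because citations occasionally fail to support the exact claim, a caller may want a cheap sanity check before trusting a cited sentence. The sketch below is a crude keyword-overlap heuristic, not a real entailment check and not anything Perplexity provides. `keywords`, `supportScore`, `needsReview` and the 0.5 threshold are hypothetical choices for illustration.

```typescript
// Lowercased alphanumeric words longer than three characters.
function keywords(text: string): Set<string> {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words.filter((w) => w.length > 3));
}

// Fraction of the claim's keywords that also appear in the source snippet.
// Returns 0 when the claim has no keywords at all.
function supportScore(claim: string, snippet: string): number {
  const claimWords = keywords(claim);
  if (claimWords.size === 0) return 0;
  const sourceWords = keywords(snippet);
  let shared = 0;
  claimWords.forEach((w) => {
    if (sourceWords.has(w)) shared += 1;
  });
  return shared / claimWords.size;
}

// Flag claims whose overlap with the cited snippet falls below the threshold.
function needsReview(claim: string, snippet: string, threshold: number = 0.5): boolean {
  return supportScore(claim, snippet) < threshold;
}
```

A low score only means the wording differs, so flagged claims should go to a human or a stronger check instead of being rejected outright. A high score likewise does not prove the source supports the claim.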


Start using Perplexity Sonar today

Get started with free credits. No credit card required. Access Perplexity Sonar and 100+ other models through a single API.