How much does Whisper Large v3 Turbo cost via Railwail?

Per-call: €0.006. No monthly minimum, no subscription. Start with €5 free credits.

What is the context window of Whisper Large v3 Turbo?

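Whisper Large v3 Turbo is a speech-to-text model, so it has no token context window in the chat-model sense. It processes audio in 30-second windows, and longer recordings are split into chunks.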

How fast is Whisper Large v3 Turbo?

Latency depends on audio length and load, typically 200ms to 2s for short clips. We measure p50/p95 in real time on /rankings.

Is Whisper Large v3 Turbo better than Incredibly Fast Whisper?

It depends on your use case. Whisper Large v3 Turbo (OpenAI) and Incredibly Fast Whisper (Community) are both strong choices in speech-to-text. Compare them side-by-side at /compare/whisper-large-v3-turbo-vs-incredibly-fast-whisper.

Does Whisper Large v3 Turbo support audio input?

Yes. Whisper Large v3 Turbo takes audio as its input. Supported formats: MP3, WAV, M4A and FLAC (max 25MB). Use the standard Railwail API endpoint with audio content blocks.

Whisper Large v3 Turbo

OpenAI

Speech-to-Text

OpenAI's distilled Whisper Large v3. ~216x realtime, 99+ languages, MIT-licensed weights.

TL;DR · Last updated June 24, 2026

Whisper Large v3 Turbo is a speech-to-text AI model from OpenAI, priced at €0.006 per generation. It processes audio in 30-second windows.

Try Whisper Large v3 Turbo

Upload an audio file (MP3, WAV, M4A or FLAC, max 25MB) and choose a language to get a written transcript.

Pricing

Price per Generation

Per generation: €0.006
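
As a rough budgeting aid, the flat per-generation price can be turned into a call count. The sketch below assumes that one transcription request counts as one generation and that no other fees apply; the helper names are illustrative and not part of any Railwail SDK.

```typescript
// Price per generation from the pricing table, in thousandths of a euro
// (integer arithmetic avoids floating-point drift on values like 0.006).
const PRICE_MILLI_EUR = 6;

// How many generations a budget covers. Assumes 1 request = 1 generation.
function callsForBudget(budgetEur: number): number {
  return Math.floor(Math.round(budgetEur * 1000) / PRICE_MILLI_EUR);
}

// Total cost in euros for a given number of generations.
function costForCalls(calls: number): number {
  return (calls * PRICE_MILLI_EUR) / 1000;
}

console.log(callsForBudget(5));  // generations covered by the €5 free credits
console.log(costForCalls(1000)); // cost of 1,000 generations in EUR
```

Under these assumptions the €5 of free credits covers 833 generations, and 1,000 generations cost €6.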

API Integration

Use our OpenAI-compatible API to integrate Whisper Large v3 Turbo into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("whisper-large-v3-turbo", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("whisper-large-v3-turbo", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("whisper-large-v3-turbo", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
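
Because Whisper Large v3 Turbo takes audio rather than chat messages, a transcription request may look more like the following sketch. It assumes that the OpenAI-compatible API exposes the usual multipart `/audio/transcriptions` route with `file`, `model` and `language` fields; the base URL is a placeholder, the route is an assumption not confirmed by this page, and the helper names are hypothetical. It relies on the `fetch`, `FormData` and `Blob` globals available in Node 18+ and browsers.

```typescript
// Placeholder: replace with the real API base URL (not stated on this page).
const RAILWAIL_BASE_URL = "https://example.invalid/v1";

// Upload limit shown by the upload widget ("max 25MB"); decimal megabytes assumed.
const MAX_UPLOAD_BYTES = 25 * 1000 * 1000;

// Build the multipart body in the shape OpenAI-style transcription routes expect.
function buildTranscriptionForm(audio: Blob, filename: string, language?: string): FormData {
  if (audio.size > MAX_UPLOAD_BYTES) {
    throw new RangeError(`Audio is ${audio.size} bytes; limit is ${MAX_UPLOAD_BYTES}`);
  }
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-large-v3-turbo");
  if (language) form.append("language", language); // e.g. "en"; omit to auto-detect
  return form;
}

// Assumed route: POST {base}/audio/transcriptions returning JSON with a `text` field.
async function transcribe(apiKey: string, audio: Blob, filename: string, language?: string): Promise<string> {
  const res = await fetch(`${RAILWAIL_BASE_URL}/audio/transcriptions`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, filename, language),
  });
  if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);
  const data = (await res.json()) as { text?: string };
  return data.text ?? "";
}
```

Keeping the form construction separate from the network call makes the size check and field names easy to verify before any request is sent.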

Specifications

Price

€0.006

Developer

OpenAI

Deep dive — OpenAI's Whisper Large v3 Turbo

About OpenAI

Founded 2015 · San Francisco, California, USA

OpenAI was founded in December 2015 by Sam Altman, Elon Musk, Greg Brockman, Ilya Sutskever, Wojciech Zaremba and John Schulman, and was restructured as the capped-profit OpenAI LP in 2019. Whisper Large v3 Turbo was released in October 2024 as a distilled fast variant of Whisper Large v3, designed to deliver approximately 8x faster inference at near-identical accuracy by reducing the decoder depth from 32 to 4 layers. The release was led by the original Whisper authors (Alec Radford, Jong Wook Kim, Tao Xu) and remained under the MIT licence. Turbo was distributed via GitHub, the Hugging Face Hub and the OpenAI Whisper API as the new default model where supported, replacing many in-production deployments of Whisper Large v3 within weeks of launch.

Architecture

Distilled encoder-decoder Transformer (4-layer decoder) for speech recognition

Whisper Large v3 Turbo is a distilled variant of Whisper Large v3 that keeps the same 32-layer audio encoder and 128-mel front-end but shrinks the decoder from 32 to just 4 Transformer layers, taking the total parameter count from 1.55B to 809M. The smaller decoder gives roughly 8x faster inference on long-form audio (and 4-5x faster on short clips) at a WER cost of approximately 0.5-1 percentage points on most benchmarks. The model was distilled on the same multilingual corpus as Large v3 (5 million hours total, of which 4 million are pseudo-labelled) with knowledge-distillation losses from the Large v3 teacher. Translation-to-English capability was deliberately removed to focus capacity on transcription quality. The 30-second sliding window, 99-language coverage and special task tokens are unchanged. Turbo runs in real-time on consumer GPUs (RTX 3060) and at 3-4x real-time on Apple Silicon CPUs via whisper.cpp.

Parameters: 809M
Context: 30-second audio window
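
The parameter figures above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes a model width of 1280 (the width of the Whisper large family, a figure not stated on this page) and counts only the weight matrices of a decoder layer, ignoring biases and layer norms, so it is an estimate rather than an exact count.

```typescript
// Assumption: Whisper large models use a hidden width of 1280.
const D_MODEL = 1280;

// Weight matrices per decoder layer, ignoring biases and layer norms:
//   self-attention  (q, k, v, out): 4 * d^2
//   cross-attention (q, k, v, out): 4 * d^2
//   feed-forward (d -> 4d -> d):    8 * d^2
const WEIGHTS_PER_DECODER_LAYER = 16 * D_MODEL ** 2;

// Estimated parameter count after dropping decoder layers.
function estimateAfterPruning(totalParams: number, layersBefore: number, layersAfter: number): number {
  return totalParams - (layersBefore - layersAfter) * WEIGHTS_PER_DECODER_LAYER;
}

const estimate = estimateAfterPruning(1.55e9, 32, 4);
console.log(`${(WEIGHTS_PER_DECODER_LAYER / 1e6).toFixed(1)}M weights per decoder layer`);
console.log(`~${(estimate / 1e6).toFixed(0)}M parameters after removing 28 layers`);
```

Removing 28 layers of about 26M weights each takes away roughly 0.73B parameters, which lands within about 1% of the quoted 809M given that 1.55B is itself a rounded figure.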

What it can do

8x faster long-form transcription than Whisper Large v3
99-language transcription with automatic language detection
Word-level timestamps preserved
Runs in real-time on a single consumer GPU (RTX 3060 / M2 Pro)
Half the memory footprint of Large v3 (809M vs 1.55B)
Open weights under MIT licence
Drop-in replacement for Large v3 in most pipelines
Best for: production ASR on commodity hardware, on-premise transcription, batch processing

Training & License

Distilled from Whisper Large v3 on the same 5-million-hour multilingual audio corpus with knowledge-distillation losses. Translation-to-English data was excluded.

License: MIT licence for code and weights; commercial use permitted.

Known limitations

No translation-to-English mode (transcription only)
WER 0.5-1 pp worse than Large v3 on average
Same 30-second hard window requires chunking
Same hallucination behaviour on silent / music-only audio
No native diarisation
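
The 30-second window means longer recordings have to be split before transcription. A minimal sketch of fixed-window chunking follows, assuming a simple optional overlap between neighbouring chunks; the function and type names are illustrative, and production pipelines often cut on silence or on predicted timestamps instead of at fixed offsets.

```typescript
interface Span {
  startSec: number;
  endSec: number;
}

// Split a recording of `totalSec` seconds into windows of at most `windowSec`
// seconds. Consecutive windows share `overlapSec` seconds so that words cut
// at a boundary appear whole in at least one chunk.
function chunkSpans(totalSec: number, windowSec = 30, overlapSec = 0): Span[] {
  if (windowSec <= 0 || overlapSec < 0 || overlapSec >= windowSec) {
    throw new RangeError("Require windowSec > 0 and 0 <= overlapSec < windowSec");
  }
  const spans: Span[] = [];
  const step = windowSec - overlapSec;
  for (let start = 0; start < totalSec; start += step) {
    const end = Math.min(start + windowSec, totalSec);
    spans.push({ startSec: start, endSec: end });
    if (end === totalSec) break; // final window reached the end of the audio
  }
  return spans;
}

console.log(chunkSpans(75));        // three windows: 0-30, 30-60, 60-75
console.log(chunkSpans(75, 30, 5)); // three windows: 0-30, 25-55, 50-75
```

With overlap, the transcripts of neighbouring chunks need de-duplicating at the seams; that merging step is left out here.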

Related Models

Incredibly Fast Whisper

Community

Whisper Large v3 wrapped with Hugging Face Transformers optimizations (batched inference, flash attention) for very high throughput. Transcribes hours of audio in minutes on a single GPU. Maintained by Vaibhav Srivastav. Good when you need bulk transcription fast.

€1.00

Whisper

OpenAI

OpenAI's Whisper running on Replicate. General-purpose speech recognition trained on 680k hours of multilingual audio. Transcribes and translates 99 languages, robust to accents and background noise, and outputs plain text, segments, or word-level timestamps.

€2.00