How much does ElevenLabs v3 (alpha) cost via Railwail?

Input: €0.300 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
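
For rough budgeting, the quoted rate can be turned into a small estimator. This is a minimal sketch: the function names are hypothetical, and it assumes billing is strictly linear in input tokens. How Railwail counts tokens for speech input is not stated on this page.

```typescript
// Rate and free-credit amount quoted on this page.
const EUR_PER_MILLION_TOKENS = 0.3;
const FREE_CREDITS_EUR = 5;

/** Estimated cost in EUR for a given number of input tokens (assumes linear billing). */
function estimateCostEur(tokens: number): number {
  if (!isFinite(tokens) || tokens < 0) {
    throw new RangeError("tokens must be a non-negative finite number");
  }
  return (tokens / 1_000_000) * EUR_PER_MILLION_TOKENS;
}

/** Whole number of input tokens that a EUR budget covers at the quoted rate. */
function tokensForBudget(budgetEur: number): number {
  if (!isFinite(budgetEur) || budgetEur < 0) {
    throw new RangeError("budgetEur must be a non-negative finite number");
  }
  return Math.floor((budgetEur / EUR_PER_MILLION_TOKENS) * 1_000_000);
}

console.log(estimateCostEur(250_000).toFixed(4), "EUR for 250k tokens");
console.log(tokensForBudget(FREE_CREDITS_EUR), "tokens covered by the free credits");
```

At this rate the €5 of free credits covers a little over 16 million input tokens.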

What is the context window of ElevenLabs v3 (alpha)?

Railwail lists no context window for ElevenLabs v3 (alpha). The model description on this page puts the input limit at roughly 10,000 characters per request.

How fast is ElevenLabs v3 (alpha)?

Latency depends on prompt length and load — typically 200ms to 2s for short prompts. We measure p50/p95 in real-time on /rankings.
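
To make the p50/p95 figures concrete, here is a minimal sketch of how such percentiles can be computed from a list of latency samples. It uses the nearest-rank method, which is an assumption; the page does not say how /rankings computes its numbers. The sample values are made up for illustration.

```typescript
/** Nearest-rank percentile of latency samples, for p in (0, 100]. */
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Illustrative samples only, not measurements of any real model.
const latenciesMs = [220, 310, 450, 280, 1900, 640, 390, 300, 510, 260];
console.log("p50:", percentile(latenciesMs, 50), "ms");
console.log("p95:", percentile(latenciesMs, 95), "ms");
```

With a single slow outlier in the samples, p50 stays near the typical request while p95 jumps to the outlier, which is why both figures are usually reported together.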

Is ElevenLabs v3 (alpha) better than ElevenLabs Multilingual V2?

It depends on your use case. Both are ElevenLabs text-to-speech models: v3 (alpha) adds expressive audio tags and wider language coverage at higher latency and price, while Multilingual V2 is the established predecessor. Compare them side-by-side at /compare/eleven-v3-vs-elevenlabs-multilingual-v2.

ElevenLabs v3 (alpha)

Name: ElevenLabs v3 (alpha)
Brand: ElevenLabs
SKU: eleven-v3
Price: 0.0003 EUR
Availability: InStock

ElevenLabs

Text-to-Speech

ElevenLabs' v3 alpha TTS. Most expressive voice model with audio tags and laughter, higher latency.

TL;DR · Last updated May 16, 2026

ElevenLabs v3 (alpha) is a text-to-speech AI model from ElevenLabs, priced at €0.300 per 1M input tokens; its context window is not listed.

Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate ElevenLabs v3 (alpha) into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("eleven-v3", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("eleven-v3", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("eleven-v3", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Developer

ElevenLabs

Deep dive — ElevenLabs's ElevenLabs v3 (alpha)

About ElevenLabs

Founded 2022 · London, UK / New York, USA

ElevenLabs was founded in 2022 by Piotr Dabkowski (CTO, ex-Google ML engineer) and Mati Staniszewski (CEO, ex-Palantir), two Polish high-school friends frustrated with the poor quality of TV-show dubbing in Polish. The company set out to build voice AI that captures intonation and emotion across languages. Headquartered in London and New York with engineering hubs in Warsaw and the Bay Area, ElevenLabs raised a $19M Series A in June 2023 led by Andreessen Horowitz, an $80M Series B in January 2024 also led by a16z at a $1.1B valuation, and a $180M Series C in January 2025 at a $3.3B valuation co-led by a16z and ICONIQ. ElevenLabs v3 (alpha) was previewed in 2025 as the next-generation flagship model with expressive emotion tags, longer context and more languages, succeeding the Multilingual V2 family that became the de facto standard for AI dubbing.

Architecture

Proprietary autoregressive Transformer TTS with neural codec and emotion/prosody conditioning

ElevenLabs v3 (alpha) is the company's 2025 flagship text-to-speech model and the first ElevenLabs system to expose explicit emotion and event tags inside text input ([whispers], [laughs], [angry], [sighs]). It is a proprietary Transformer-based autoregressive model that predicts neural-codec audio tokens conditioned on a text prompt and a speaker embedding obtained from a few seconds of reference audio (Instant Voice Clone) or a fully fine-tuned voice (Professional Voice Clone, requires ~30 minutes of clean audio). v3 expands language coverage from 29 (v2) to 70+ languages, lengthens the input window to roughly 10,000 characters per request, and adds dialogue mode for multi-speaker scenes. ElevenLabs has not published a technical paper; product blog posts describe internal improvements in speaker disentanglement, code-switching and emotional range. v3 is offered through the same hosted API and Studio UI as Multilingual V2 but at higher latency and price.

Parameters: Undisclosed
Context: ~10,000 characters per request
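
The inline tags described above are plain bracketed words inside the input text. The following sketch shows one way to compose tagged text and to flag tags that this page does not mention. The helper names are hypothetical, and the tag list contains only the tags named on this page; the full set the model accepts is not documented here.

```typescript
// Tags named on this page; treat this list as an assumption, not a complete vocabulary.
const KNOWN_TAGS = ["whispers", "laughs", "angry", "sighs", "crying"] as const;
type AudioTag = (typeof KNOWN_TAGS)[number];

/** Prefix a line of text with an inline audio tag, e.g. "[whispers] Hello". */
function withTag(tag: AudioTag, text: string): string {
  return `[${tag}] ${text.trim()}`;
}

/** Return bracketed tags found in the text that are not in KNOWN_TAGS. */
function unknownTags(text: string): string[] {
  const found = (text.match(/\[[^\[\]]+\]/g) || []).map((s) => s.slice(1, -1));
  const known: readonly string[] = KNOWN_TAGS;
  return found.filter((t) => known.indexOf(t) === -1);
}

const script = [
  withTag("whispers", "I think someone is listening."),
  withTag("laughs", "Just kidding."),
].join(" ");

console.log(script);
console.log(unknownTags("[shouts] Hey! [laughs]"));
```

The resulting string is what would be passed as the text to speak. Checking for unlisted tags before sending is a cheap guard against typos such as `[wispers]`, which might otherwise be read aloud or ignored.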

What it can do

Expressive emotion and event tags ([laughs], [whispers], [angry], [crying])
70+ languages with high-quality code-switching
Multi-speaker dialogue mode for podcast and audiobook generation
Instant Voice Clone from ~1 minute of audio and Professional Voice Clone from ~30 minutes
Long-form input up to ~10,000 characters per request
Studio editor for multi-paragraph projects with per-line speaker control
Best for: audiobooks, dubbing, narrative podcasts, character voices for games
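
For long-form work such as audiobooks, text longer than the roughly 10,000-character request limit has to be split before synthesis. Below is a minimal sketch of a splitter that prefers sentence boundaries. The function name and the exact limit are assumptions; whether the service counts characters the same way as JavaScript string length is not stated on this page.

```typescript
// Approximate per-request limit, taken from the figure quoted on this page.
const MAX_CHARS = 10_000;

/**
 * Split text into chunks of at most maxChars UTF-16 code units, preferring to
 * break after sentence-ending punctuation and falling back to a hard cut.
 * A hard cut may split a surrogate pair; acceptable for a sketch.
 */
function chunkText(text: string, maxChars: number = MAX_CHARS): string[] {
  if (maxChars <= 0) throw new RangeError("maxChars must be positive");
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    const head = rest.slice(0, maxChars);
    // Position of the last sentence boundary inside the head, or -1 if none.
    const cut = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? ")
    );
    const end = cut > 0 ? cut + 1 : maxChars; // cut + 1 keeps the punctuation mark
    chunks.push(rest.slice(0, end).trim());
    rest = rest.slice(end).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

console.log(chunkText("One. Two. Three.", 10));
```

Each chunk can then be synthesized as a separate request and the audio concatenated. Breaking on sentence boundaries tends to keep prosody natural across the joins, though that is a general observation and not something this page confirms for v3.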

Training & License

Not disclosed. ElevenLabs licences professional voice talent, uses public-domain audiobooks and crowd-sourced opt-in voice contributions; commercial recordings are excluded per their public statements.

License: Proprietary commercial SaaS. Commercial use of generated audio is permitted on paid plans; voice clones remain customer property.

Known limitations