SeamlessM4T v2 Large (Speech)
Meta's SeamlessM4T v2 Large in speech mode: speech-to-speech, speech-to-text, and text-to-speech translation across 100+ languages in a single unified model.
SeamlessM4T v2 Large (Speech) is a speech-to-text AI model from Meta, priced at €0.000 per 1M input tokens. Its context window is not specified.
Supported audio uploads: MP3, WAV, M4A, FLAC (max 25MB)
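The upload limits listed here (MP3, WAV, M4A, FLAC, 25MB maximum) can be checked on the client before a file is sent. The sketch below is only an illustration: `checkAudioUpload` is a hypothetical helper, not part of the railwail package, and it assumes that "25MB" means 25 × 1024 × 1024 bytes, which the page does not state.

```javascript
// Hypothetical helper, not part of the railwail package.
// Formats and size limit are taken from the upload note on this page.
const ALLOWED_EXTENSIONS = ["mp3", "wav", "m4a", "flac"];

// Assumption: "25MB" is read as 25 * 1024 * 1024 bytes.
const MAX_BYTES = 25 * 1024 * 1024;

function checkAudioUpload(fileName, sizeBytes) {
  // Take the text after the last dot as the extension, case-insensitively.
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();

  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `unsupported format: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_BYTES) {
    return { ok: false, reason: "file exceeds 25MB limit" };
  }
  return { ok: true, reason: null };
}
```

A call such as `checkAudioUpload("interview.flac", 3_000_000)` returns an object whose `ok` field says whether the file passes both checks. Note that this only inspects the file name and size; it does not verify the actual audio encoding.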
Pricing
API Integration
Use our OpenAI-compatible API to integrate SeamlessM4T v2 Large (Speech) into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple: just pass a string
const reply = await rw.run("seamlessm4t-v2-large-speech", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("seamlessm4t-v2-large-speech", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("seamlessm4t-v2-large-speech", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
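The integration example reads `res.choices[0].message.content` and `res.usage` directly, which throws if the response has no choices. As a minimal sketch, the hypothetical helper below (`summarizeResponse`, not part of the railwail package) reads the same fields defensively. It assumes the response follows the OpenAI chat-completion shape, including a numeric `usage.total_tokens` field; the page itself only shows `res.usage` without listing its fields.

```javascript
// Hypothetical helper, not part of the railwail package.
// Assumes `res` follows the OpenAI chat-completion response shape.
function summarizeResponse(res) {
  // Optional chaining yields undefined instead of throwing on missing parts.
  const text = res?.choices?.[0]?.message?.content ?? null;

  // Assumption: usage reports a numeric `total_tokens`, as in OpenAI's format.
  const total = res?.usage?.total_tokens;

  return {
    text,
    totalTokens: typeof total === "number" ? total : null,
  };
}
```

With the `res` object from the example, `summarizeResponse(res)` would return the reply text and the total token count, or `null` for whichever field is missing.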
Related Models
Whisper Large V3
OpenAI's Whisper model. State-of-the-art speech recognition supporting 99+ languages.
Whisper Large v3 Turbo
OpenAI's distilled Whisper Large v3. ~216x realtime, 99+ languages, MIT-licensed weights.
Deepgram Nova-3
Deepgram's flagship STT. First to offer realtime multilingual transcription with self-serve customization.
ElevenLabs Scribe v1
ElevenLabs' STT. 99 languages, word-level timestamps, speaker diarization, audio-event tagging.
Start using SeamlessM4T v2 Large (Speech) today
Get started with free credits. No credit card required. Access SeamlessM4T v2 Large (Speech) and 100+ other models through a single API.