How much does Sora cost via Railwail?

Per-call: €1.00. No monthly minimum, no subscription. Start with €5 free credits.

What is the context window of Sora?

Railwail lists Sora's context window as unknown. Sora is a video generation model, so this figure matters less than it would for a text model.

How fast is Sora?

Median response latency: 180.0s (p50 across recent Railwail traffic). See live p50/p95 metrics on /rankings.
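
For readers unfamiliar with the notation, p50 and p95 are percentiles of the observed latencies. The sketch below computes them with the nearest-rank method on made-up sample values; how Railwail actually computes its metrics is not stated on this page, and the function name is illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Made-up latencies in seconds, for illustration only.
const latencies = [95, 120, 150, 170, 180, 185, 200, 240, 310, 420];
console.log(percentile(latencies, 50)); // 180
console.log(percentile(latencies, 95)); // 420
```

A p95 far above the p50, as in this invented sample, means a minority of jobs take much longer than the typical one.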

Is Sora better than Google Veo 2?

It depends on your use case. Sora (OpenAI) and Google Veo 2 (Google DeepMind) are both strong choices in video generation. Compare them side-by-side at /compare/sora-vs-google-veo-2.

Sora

New · Popular · OpenAI · Video Generation

OpenAI video generation model. Create realistic and imaginative videos from text prompts up to 20 seconds.

Queue video with Sora

Video generation runs asynchronously: we'll queue a job (typically 30 s to 2 min) and you can track it in your history.
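
Because a queued job finishes some time after it is submitted, a client typically polls for completion. The sketch below is a minimal, generic polling loop. `JobStatus`, `pollUntilDone`, and the `fetchStatus` callback are hypothetical names for illustration; they are not part of the railwail package, whose job-tracking interface is not documented on this page.

```typescript
// Hypothetical job states; the real service may use different names.
type JobStatus =
  | { state: "queued" | "running" }
  | { state: "succeeded"; videoUrl: string }
  | { state: "failed"; error: string };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchStatus repeatedly until the job finishes or the deadline passes.
async function pollUntilDone(
  fetchStatus: () => Promise<JobStatus>,
  intervalMs = 5000,
  timeoutMs = 5 * 60 * 1000,
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await fetchStatus();
    if (status.state === "succeeded") return status.videoUrl;
    if (status.state === "failed") throw new Error(status.error);
    // Stop early if the next wait would run past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error("Timed out waiting for video job");
    }
    await sleep(intervalMs);
  }
}
```

The default five-minute timeout is an arbitrary choice that leaves headroom over the job durations quoted on this page; tune it to your own tolerance.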

TL;DR · Last updated March 25, 2026

Sora is a video generation AI model from OpenAI, priced at €1.00 per generation on Railwail. Its context window is listed as unknown.

About this model

Sora is OpenAI's video generation model that can create realistic and imaginative scenes from text instructions. It understands not just what the user has asked for in a prompt, but also how those things exist in the physical world.

Try Sora

Inputs: Prompt · Reference Images · Duration · Aspect Ratio

Examples

See what Sora can generate

Cinematic Scene (0:08)

"A cinematic aerial shot of a lighthouse on a rocky coast at sunset, golden light reflecting on crashing waves"

Nature Close-up (0:04)

"Close-up of rain drops falling on a green leaf in slow motion, crystal clear water beads"

Pricing

Price per generation: €1.00
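
Since billing is a flat fee per generation, budgeting is simple arithmetic. The helper below is a small sketch using the €1.00 per-generation price and the €5 free-credit figure quoted on this page. The function names are illustrative, it assumes the free credits can be spent on Sora, and amounts are kept in integer cents to avoid floating-point rounding.

```typescript
const PRICE_PER_GENERATION_CENTS = 100; // €1.00 per generation, as listed on this page
const FREE_CREDIT_CENTS = 500; // €5 free starting credits, as listed on this page

// Number of whole generations a given balance covers.
function generationsCovered(
  balanceCents: number,
  priceCents: number = PRICE_PER_GENERATION_CENTS,
): number {
  return Math.floor(balanceCents / priceCents);
}

// Amount still owed, in cents, after free credits are applied (never negative).
function costAfterCredits(
  generations: number,
  creditCents: number = FREE_CREDIT_CENTS,
): number {
  const total = generations * PRICE_PER_GENERATION_CENTS;
  return Math.max(0, total - creditCents);
}

console.log(generationsCovered(FREE_CREDIT_CENTS)); // 5
console.log(costAfterCredits(12) / 100); // 7
```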

API Integration

Use our OpenAI-compatible API to integrate Sora into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

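// Simple: pass the video prompt as a string
const reply = await rw.run(
  "sora",
  "A cinematic aerial shot of a lighthouse on a rocky coast at sunset"
);
console.log(reply);

// With message history (chat-style input; assumes the user message is used as the video prompt)
const reply2 = await rw.run("sora", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Close-up of rain drops falling on a green leaf in slow motion" },
]);
console.log(reply2);

// Full response with usage info.
// temperature and max_tokens are text-generation options; whether they
// affect a video model such as Sora is not stated on this page.
const res = await rw.chat("sora", [
  { role: "user", content: "A lighthouse on a rocky coast at sunset" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);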

Specifications

Price: €1.00 per generation
Avg. latency: 180.0s
Est. duration: 3 min
Developer: OpenAI

Deep dive — OpenAI's Sora

About OpenAI

Founded 2015 · San Francisco, USA

OpenAI was founded in December 2015 by Sam Altman, Elon Musk, Greg Brockman, Ilya Sutskever, Wojciech Zaremba, John Schulman, Andrej Karpathy and others, originally as a non-profit AI research lab. It restructured in 2019 into a capped-profit company and built the GPT, DALL-E, Whisper and Codex families.

Sora was unveiled by OpenAI on 15 February 2024 as a research preview, demonstrating one-minute high-fidelity video generation from text prompts. After a months-long red-team and creator-partner phase, Sora launched publicly in December 2024 as 'Sora 1' inside the ChatGPT Plus and Pro plans. OpenAI's headline framing for Sora was that scaling a diffusion-transformer trained on long video clips and 'spacetime patches' produced emergent world-model-like behaviour: object permanence, physical plausibility and persistent identity.

Architecture

Diffusion Transformer (DiT) on spacetime patches of video latents

OpenAI describe Sora as a diffusion transformer that operates on 'spacetime patches' of a learned video latent representation. A video is first encoded by a learned 3D causal VAE into a spatio-temporal latent tensor, then split into a sequence of cube-shaped patches that play the role of tokens for a transformer. The DiT denoiser is conditioned on rich text embeddings produced by a GPT-family captioner trained to generate dense, highly descriptive captions for training videos, a recaptioning trick borrowed from DALL-E 3. The model supports text-to-video, image-to-video, video-to-video, in-painting and extension. Native generations at launch were up to 1 minute at 1080p in research demos, with public Sora capped lower for compute reasons (typically up to 20 seconds at 720p / 1080p). Training uses a massive curated multilingual video corpus with synthetic captions; OpenAI have not published a formal paper, only a technical report 'Video generation models as world simulators'.

Parameters: Undisclosed
Context: unknown
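
To make the patching step in the architecture notes concrete, the sketch below counts how many transformer tokens a latent video tensor yields when it is cut into non-overlapping spacetime patches, and maps a latent coordinate to its token index. The latent shape and patch size are made-up illustrative numbers, since Sora's actual dimensions are undisclosed. The function names, the round-up handling of uneven dimensions, and the frame-major ordering are assumptions for this example.

```typescript
// Shape of a video latent: frames x height x width (channels omitted for brevity).
interface LatentShape { frames: number; height: number; width: number; }
// Extent of one spacetime patch along time, height and width.
interface PatchSize { t: number; h: number; w: number; }

// Number of tokens when the latent is tiled by non-overlapping patches.
// Dimensions that do not divide evenly are rounded up, as if padded.
function tokenCount(latent: LatentShape, patch: PatchSize): number {
  const nt = Math.ceil(latent.frames / patch.t);
  const nh = Math.ceil(latent.height / patch.h);
  const nw = Math.ceil(latent.width / patch.w);
  return nt * nh * nw;
}

// Index of the patch (token) containing latent position (frame, y, x),
// ordered frame-major, then by row, then by column.
function patchIndex(
  latent: LatentShape,
  patch: PatchSize,
  frame: number,
  y: number,
  x: number,
): number {
  const nh = Math.ceil(latent.height / patch.h);
  const nw = Math.ceil(latent.width / patch.w);
  const pt = Math.floor(frame / patch.t);
  const ph = Math.floor(y / patch.h);
  const pw = Math.floor(x / patch.w);
  return pt * nh * nw + ph * nw + pw;
}

// Illustrative numbers only.
const latent: LatentShape = { frames: 16, height: 32, width: 32 };
const patch: PatchSize = { t: 2, h: 2, w: 2 };
console.log(tokenCount(latent, patch)); // 2048
console.log(patchIndex(latent, patch, 15, 31, 31)); // 2047
```

The token count grows with duration and resolution alike, which is consistent with the page's note that the public version is capped lower for compute reasons.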

What it can do

Text-to-video, image-to-video, video-to-video, in-painting and extension
Up to ~1 minute high-fidelity video in research demos
Strong object permanence and emergent physical plausibility
Rich cinematographic prompt vocabulary
Spacetime-patch DiT architecture pioneered at this scale
Storyboard mode and remix in the Sora consumer app
Available via ChatGPT Plus / Pro and Sora app
Dense recaptioning pipeline (DALL-E 3 style) for strong prompt adherence
Best for: cinematic shorts, creative ideation, complex multi-shot scenes.

Training & License

Massive curated multilingual video corpus including licensed footage, public web data and partner sources, with dense synthetic captions produced by a GPT-family captioner. Exact size and sources undisclosed.

License: Proprietary commercial licence via OpenAI terms; commercial use on Pro plans subject to content policy and provenance requirements (C2PA metadata, visible watermark).

Known limitations

Public Sora 1 capped well below research-demo 1-minute / 1080p capacity
No native audio in Sora 1
Strict moderation, including on people, brands and political content
Closed model without a peer-reviewed paper
Per-clip cost and queue times can be high

Related Models

Google Veo 2 · Google DeepMind · €5.00
Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.

Google Veo 3 · Google DeepMind · €0.75
Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.

Google Veo 3 (Replicate) · Google DeepMind · €8.00
Google's Veo 3 served via Replicate. Text-to-video with native synchronized audio generation. High-fidelity motion and scene coherence in short clips.

Google Veo 3.1 · Google DeepMind · €6.00
Latest Veo with image-to-video and context-aware audio

Start using Sora today

Get started with free credits. No credit card required. Access Sora and 100+ other models through a single API.
