How much does Gemini 2.0 Flash cost via Railwail?

Input: €1.00 per 1M tokens. Output: €4.00 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.

What is the context window of Gemini 2.0 Flash?

Gemini 2.0 Flash supports a 1M tokens context window — enough for entire codebases or research papers in one prompt.

How fast is Gemini 2.0 Flash?

Average response latency: 1.2s (p50 across recent Railwail traffic). See live p50/p95 metrics on /rankings.

Is Gemini 2.0 Flash better than Bio_ClinicalBERT?

It depends on your use case. Gemini 2.0 Flash (Google DeepMind) and Bio_ClinicalBERT (huggingface) are both strong choices in text & chat. Compare them side-by-side at /compare/gemini-2-flash-vs-bio-clinicalbert.

Gemini 2.0 Flash

Name: Gemini 2.0 Flash
Brand: Google
SKU: gemini-2-flash
Price: 0.001 EUR
Availability: InStock

New

Popular

Google DeepMind

Text & Chat

Google's fastest multimodal model. Supports text, images, audio, and video input.

Try Gemini 2.0 Flash now

Send a single prompt and stream a response inline. Hit Cmd+Enter to submit.

Press Cmd+Enter to send

Response appears here.

TL;DR·Last updated March 4, 2026

Gemini 2.0 Flash is text & chat AI model from Google DeepMind, priced at €1.00 per 1M input tokens with a 1M tokens context window.

Try Gemini 2.0 Flash

System Prompt

Message

Temperature

0.7

Max Tokens

Examples

See what Gemini 2.0 Flash can generate

Data Extraction

Extract all dates, names, and locations from this text: 'On March 15, 2024, Dr. Sarah Mitchell presented her findings at the Berlin Conference Center. Her colleague James Wu joined via video from Tokyo.'

Dates: March 15, 2024 Names: Dr. Sarah Mitchell, James Wu Locations: Berlin Conference Center (Berlin), Tokyo

Brainstorming

Give me 5 unique app ideas that combine fitness with social gaming

1) StepQuest – An RPG where your real-world steps power your character through dungeons, and friends can join as party members. 2) PulseRacer – Heart rate-based racing game where your cardio intensity controls your car's speed in real-time multiplayer. 3) FlexTown – Build a virtual city where each building unlocks only after completing specific workout challenges with friends. 4) GymBounty – A bounty-hunter game where players earn rewards by completing fitness challenges posted by other users. 5) TrailTales – Collaborative storytelling app where each chapter unlocks after your running group collectively logs enough miles.

Pricing

Price per Generation

Per generationFree

API Integration

Use our OpenAI-compatible API to integrate Gemini 2.0 Flash into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("gemini-2-flash", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("gemini-2-flash", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("gemini-2-flash", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Context window

1,000,000 tokens

Max output

8,192 tokens

Avg. latency

1.2s

Developer

Google DeepMind

Deep dive — Google DeepMind's Gemini 2.0 Flash

About Google DeepMind

Founded 2010 · Mountain View, USA / London, UK

Google DeepMind was created in April 2023 by merging Google Brain (founded 2011 inside Google) with DeepMind (founded 2010 in London by Demis Hassabis, Shane Legg and Mustafa Suleyman; acquired by Google in 2014). CEO Demis Hassabis leads the combined unit. The lab is responsible for some of the most influential papers in AI: 'Attention Is All You Need' (Vaswani et al., 2017), AlphaGo (2016), AlphaFold (2018-2021), Chinchilla scaling laws (2022) and the Gemini Technical Report (2023). The Gemini product line started in December 2023 with Gemini 1.0 Ultra/Pro/Nano, expanded with Gemini 1.5 (long context, MoE) in early 2024, Gemini 2.0 Flash announced in December 2024 and generally available in February 2025, Gemini 2.0 Pro Experimental and the Gemini 2.5 family (Pro and Flash) in 2025. Google DeepMind also produces Imagen, Veo, Lyria and the AlphaProteo/AlphaFold series in life sciences. Gemini powers AI Overviews in Google Search, Gemini in Google Workspace and on Pixel/Android devices.

Visit Google DeepMind →

Architecture

Sparse Mixture-of-Experts Transformer (natively multimodal, fast tier)

Gemini 2.0 Flash is the speed- and cost-optimised tier of the Gemini 2.0 family, announced in December 2024 and generally available in February 2025. It is a natively multimodal Sparse Mixture-of-Experts Transformer that accepts text, image, audio and video inputs and can emit text plus, in the Flash Experimental variants, images and audio. The model was pretrained on Google's TPU v5p infrastructure on a multi-trillion-token mixture of web text, code, books, image-text pairs and YouTube video frames with a knowledge cutoff in August 2024. Post-training combines supervised fine-tuning, RLHF and reinforcement learning against verifiable rewards, with optimisation focused on making the model 2x faster than Gemini 1.5 Pro while exceeding it on most benchmarks. Gemini 2.0 Flash is the first generally available model in the family with built-in tool use including Code Execution, Google Search grounding and the new Live API for low-latency bidirectional streaming of voice and video. Flash also supports a 1M token context window in production, native image output via the Flash Image (Imagen-integrated) endpoint and structured function calling. The model is the default for the Gemini app's chat experience and is widely used in Google Workspace AI features for summarisation, drafting and meeting notes.

Parameters: Undisclosed (sparse MoE, smaller than Gemini 2.5 Pro)
Context: 1.0M tokens

What it can do

1,048,576 token context window in production
Natively multimodal: text, image, audio, video input
Low-latency Live API for streaming voice/video conversations
Built-in tools: Google Search grounding, Code Execution
Function calling and parallel tool calls
Roughly 2x faster than Gemini 1.5 Pro at lower cost
Native image generation via Flash Image variant
Multilingual coverage across 100+ languages
Audio output with multi-voice support in Live API
Strong long-document Q&A and summarisation
Best for: real-time voice agents, high-throughput multimodal chat, long-document RAG, mobile assistants.

Training & License

Pretrained on Google's multi-trillion-token corpus of web text, code, books, image-text pairs, audio waveforms and video frames. Knowledge cutoff August 2024. Post-training uses supervised fine-tuning, RLHF and RL against verifiable rewards, with a focus on inference efficiency.

License: Proprietary commercial license via Google AI Studio, Vertex AI and the Gemini app. Commercial use permitted under Google Cloud terms.

Known limitations

Weaker on hardest reasoning benchmarks than Gemini 2.5 Pro
Live API regional availability is limited
Long-context recall degrades beyond ~500K tokens
Image generation variant has separate quota and limits
Knowledge cutoff August 2024

Research papers

Frequently asked questions

Related Models

View all Text & Chat

Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8