Gemini 2.0 Flash

New
Popular
Google DeepMind
Text & Chat

Google's fastest multimodal model. Supports text, images, audio, and video input.

Try Gemini 2.0 Flash now
Send a single prompt and stream a response inline. Hit Cmd+Enter to submit.
Sign in to try this model with €5 free credits.
Sign in
Press Cmd+Enter to send
Response appears here.
TL;DR·Last updated March 4, 2026

Gemini 2.0 Flash is text & chat AI model from Google DeepMind, priced at €1.00 per 1M input tokens with a 1M tokens context window.

Try Gemini 2.0 Flash

0.7

Sign in to generate — 50 free credits on sign-up

Examples

See what Gemini 2.0 Flash can generate

Data Extraction

Extract all dates, names, and locations from this text: 'On March 15, 2024, Dr. Sarah Mitchell presented her findings at the Berlin Conference Center. Her colleague James Wu joined via video from Tokyo.'

Dates: March 15, 2024 Names: Dr. Sarah Mitchell, James Wu Locations: Berlin Conference Center (Berlin), Tokyo

Brainstorming

Give me 5 unique app ideas that combine fitness with social gaming

1) StepQuest – An RPG where your real-world steps power your character through dungeons, and friends can join as party members. 2) PulseRacer – Heart rate-based racing game where your cardio intensity controls your car's speed in real-time multiplayer. 3) FlexTown – Build a virtual city where each building unlocks only after completing specific workout challenges with friends. 4) GymBounty – A bounty-hunter game where players earn rewards by completing fitness challenges posted by other users. 5) TrailTales – Collaborative storytelling app where each chapter unlocks after your running group collectively logs enough miles.

Pricing

Price per Generation
Per generationFree

API Integration

Use our OpenAI-compatible API to integrate Gemini 2.0 Flash into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("gemini-2-flash", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("gemini-2-flash", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("gemini-2-flash", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Specifications
Context window
1,000,000 tokens
Max output
8,192 tokens
Avg. latency
1.2s
Developer
Google DeepMind
Category
Text & Chat
Tags
fast
multimodal
affordable

Deep dive — Google DeepMind's Gemini 2.0 Flash

About Google DeepMind
Founded 2010 · Mountain View, USA / London, UK

Google DeepMind was created in April 2023 by merging Google Brain (founded 2011 inside Google) with DeepMind (founded 2010 in London by Demis Hassabis, Shane Legg and Mustafa Suleyman; acquired by Google in 2014). CEO Demis Hassabis leads the combined unit. The lab is responsible for some of the most influential papers in AI: 'Attention Is All You Need' (Vaswani et al., 2017), AlphaGo (2016), AlphaFold (2018-2021), Chinchilla scaling laws (2022) and the Gemini Technical Report (2023). The Gemini product line started in December 2023 with Gemini 1.0 Ultra/Pro/Nano, expanded with Gemini 1.5 (long context, MoE) in early 2024, Gemini 2.0 Flash announced in December 2024 and generally available in February 2025, Gemini 2.0 Pro Experimental and the Gemini 2.5 family (Pro and Flash) in 2025. Google DeepMind also produces Imagen, Veo, Lyria and the AlphaProteo/AlphaFold series in life sciences. Gemini powers AI Overviews in Google Search, Gemini in Google Workspace and on Pixel/Android devices.

Visit Google DeepMind
Architecture
Sparse Mixture-of-Experts Transformer (natively multimodal, fast tier)

Gemini 2.0 Flash is the speed- and cost-optimised tier of the Gemini 2.0 family, announced in December 2024 and generally available in February 2025. It is a natively multimodal Sparse Mixture-of-Experts Transformer that accepts text, image, audio and video inputs and can emit text plus, in the Flash Experimental variants, images and audio. The model was pretrained on Google's TPU v5p infrastructure on a multi-trillion-token mixture of web text, code, books, image-text pairs and YouTube video frames with a knowledge cutoff in August 2024. Post-training combines supervised fine-tuning, RLHF and reinforcement learning against verifiable rewards, with optimisation focused on making the model 2x faster than Gemini 1.5 Pro while exceeding it on most benchmarks. Gemini 2.0 Flash is the first generally available model in the family with built-in tool use including Code Execution, Google Search grounding and the new Live API for low-latency bidirectional streaming of voice and video. Flash also supports a 1M token context window in production, native image output via the Flash Image (Imagen-integrated) endpoint and structured function calling. The model is the default for the Gemini app's chat experience and is widely used in Google Workspace AI features for summarisation, drafting and meeting notes.

Parameters
Undisclosed (sparse MoE, smaller than Gemini 2.5 Pro)
Context
1.0M tokens
What it can do
  • 1,048,576 token context window in production
  • Natively multimodal: text, image, audio, video input
  • Low-latency Live API for streaming voice/video conversations
  • Built-in tools: Google Search grounding, Code Execution
  • Function calling and parallel tool calls
  • Roughly 2x faster than Gemini 1.5 Pro at lower cost
  • Native image generation via Flash Image variant
  • Multilingual coverage across 100+ languages
  • Audio output with multi-voice support in Live API
  • Strong long-document Q&A and summarisation
  • Best for: real-time voice agents, high-throughput multimodal chat, long-document RAG, mobile assistants.
Training & License

Pretrained on Google's multi-trillion-token corpus of web text, code, books, image-text pairs, audio waveforms and video frames. Knowledge cutoff August 2024. Post-training uses supervised fine-tuning, RLHF and RL against verifiable rewards, with a focus on inference efficiency.

License: Proprietary commercial license via Google AI Studio, Vertex AI and the Gemini app. Commercial use permitted under Google Cloud terms.

Known limitations
  • Weaker on hardest reasoning benchmarks than Gemini 2.5 Pro
  • Live API regional availability is limited
  • Long-context recall degrades beyond ~500K tokens
  • Image generation variant has separate quota and limits
  • Knowledge cutoff August 2024

Frequently asked questions

Start using Gemini 2.0 Flash today

Get started with free credits. No credit card required. Access Gemini 2.0 Flash and 100+ other models through a single API.