How much does Reka Flash cost via Railwail?

Input: €0.200 per 1M tokens. Output: €0.800 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
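
The per-token rates above turn into a per-request cost with simple arithmetic. The following is a minimal sketch that assumes the listed rates; `estimateCostEur` is a hypothetical helper for illustration and is not part of the railwail package.

```typescript
// Rates taken from the pricing listed above (EUR per 1M tokens).
const INPUT_EUR_PER_MILLION = 0.2;
const OUTPUT_EUR_PER_MILLION = 0.8;

// Hypothetical helper (not part of the railwail package):
// estimates the cost of one request from its token counts.
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return (
    (inputTokens / 1_000_000) * INPUT_EUR_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_EUR_PER_MILLION
  );
}

// Example: a 10,000-token prompt with a 1,000-token reply.
const exampleCost = estimateCostEur(10_000, 1_000);
console.log(exampleCost.toFixed(4)); // prints "0.0028"
```

At these rates the €5 of free credits would cover roughly 25 million input tokens if no output were generated, so real usage will land somewhat below that.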

What is the context window of Reka Flash?

Reka Flash supports a 128K-token context window, enough for long books, technical manuals, and extended analysis.

How fast is Reka Flash?

Latency depends on prompt length and load, and is typically 200 ms to 2 s for short prompts. We measure p50/p95 in real time on /rankings.
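
For readers unfamiliar with p50/p95, the sketch below shows one common way to compute them (the nearest-rank method) from a list of latency samples. How /rankings actually computes its figures is not stated on this page, so the method, the `percentile` helper, and the sample values are all assumptions made for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b); // numeric sort, input untouched
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latency samples in milliseconds.
const latenciesMs = [220, 250, 300, 410, 480, 520, 640, 900, 1200, 1900];
console.log(percentile(latenciesMs, 50)); // prints 480
console.log(percentile(latenciesMs, 95)); // prints 1900
```

The p95 value is dominated by the slowest requests, which is why it is usually reported next to the median.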

Is Reka Flash better than BLIP?

It depends on your use case. Reka Flash (Custom) and BLIP (Salesforce) are both strong choices in the multimodal category. Compare them side-by-side at /compare/reka-flash-vs-blip-captioning.

Does Reka Flash support image input (vision)?

Yes. Reka Flash accepts image inputs in addition to text. Send images via the standard OpenAI-compatible `messages` array with `image_url` content blocks. Supported input types: text, image, video.
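
As an illustration of the `image_url` content blocks mentioned above, the sketch below builds a user message in the widely used OpenAI-compatible shape. The `imageQuestion` helper and the example URL are hypothetical, and whether the railwail client accepts array-valued `content` exactly like this is an assumption, not something this page confirms.

```typescript
// Content-part shapes used by OpenAI-compatible chat APIs.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pairs a question with one image reference.
function imageQuestion(question: string, imageUrl: string): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
}

// Placeholder URL for illustration only.
const visionMessages = [
  imageQuestion("What is shown in this picture?", "https://example.com/cat.png"),
];
console.log(JSON.stringify(visionMessages, null, 2));
```

If the client forwards `messages` unchanged, an array like `visionMessages` would be passed wherever a plain message history is accepted.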

Reka Flash

Custom

Multimodal

Reka's 21B dense multimodal model balancing speed and quality. Up to 128k context.

TL;DR · Last updated June 24, 2026

Reka Flash is a multimodal AI model from Reka AI, priced at €0.200 per 1M input tokens with a 128K-token context window.

Direct API access coming soon

Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Reka Flash into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("reka-flash", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("reka-flash", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("reka-flash", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Context window: 128,000 tokens
Max output: 4,096 tokens
Developer: Custom
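
The two token limits above can be combined into a quick pre-flight check before sending a long prompt. This is a rough sketch under two assumptions that the page does not confirm: that prompt and output share the 128,000-token window, and that English text averages about 4 characters per token. The helper names are hypothetical, and a real tokenizer will give different counts.

```typescript
// Limits taken from the specifications above.
const CONTEXT_WINDOW_TOKENS = 128_000;
const MAX_OUTPUT_TOKENS = 4_096;

// Rough heuristic only: about 4 characters per token (assumption).
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Tokens left for the prompt after reserving room for the reply.
function promptBudget(maxOutputTokens: number = MAX_OUTPUT_TOKENS): number {
  const reserved = Math.min(maxOutputTokens, MAX_OUTPUT_TOKENS);
  return CONTEXT_WINDOW_TOKENS - reserved;
}

// True when the estimated prompt size fits in the remaining budget.
function fitsInContext(prompt: string, maxOutputTokens: number = MAX_OUTPUT_TOKENS): boolean {
  return roughTokenCount(prompt) <= promptBudget(maxOutputTokens);
}

console.log(promptBudget()); // prints 123904
console.log(fitsInContext("Summarize this paragraph.")); // prints true
```

Because the character heuristic is imprecise, leaving a safety margin below the computed budget is a sensible precaution.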

Deep dive — Reka AI's Reka Flash

About Reka AI

Founded 2022 · San Francisco, California, USA

Reka AI was founded in mid-2022 by Dani Yogatama, Yi Tay, Donovan Ong and Qi Liu, senior researchers from Google DeepMind, Google Brain, Apple and Facebook AI. The company is headquartered in San Francisco with engineering hubs in London and Singapore. Reka raised a $58M Series A in 2023 led by DST Global Partners and a Series B in 2024 at a reported near-unicorn valuation, with backers including Snowflake. Reka's multimodal-by-design family contains three tiers: Edge (~7B), Flash (~21B) and Core (flagship). Reka Flash launched in February 2024 as the cost-performance sweet spot and was the first publicly available Reka model. The Flash technical report described the multimodal training approach later expanded in the April 2024 Reka Core paper. Flash is offered exclusively as a hosted model on the Reka API, Reka Playground and Snowflake Cortex.

Architecture

Decoder-only Transformer trained multimodally on text, image, video and audio (mid-size variant)

Reka Flash is a ~21B parameter decoder-only Transformer trained on Reka's multimodal corpus of text, code, images, video frames and audio. As described in the Reka Core technical report, all three Reka sizes share the same architecture (modality encoders projecting into a shared token embedding space) but differ in scale and the proportion of training compute spent on each modality. Flash supports a 128K-token context window, accepts text, image, video (up to several minutes via uniform frame sampling) and audio (via a learned audio encoder), and produces JSON output and tool calls. Public Reka benchmarks place Flash between GPT-3.5 and GPT-4 on MMLU and BIG-Bench Hard, while matching GPT-4V on Perception Test and VideoMME at a fraction of the price. Flash is the recommended default model in Snowflake Cortex's multimodal API and is positioned as the workhorse Reka model for production AI features.
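
To make "uniform frame sampling" concrete, the sketch below picks evenly spaced timestamps from a video by splitting it into equal segments and taking each segment's midpoint. Reka's actual frame count and offsets are not described here, so this only illustrates the general idea; `uniformFrameTimestamps` is a hypothetical name.

```typescript
// Returns `frameCount` timestamps (in seconds), one at the midpoint
// of each of `frameCount` equal-length segments of the video.
function uniformFrameTimestamps(durationSec: number, frameCount: number): number[] {
  if (!(durationSec > 0)) throw new RangeError("duration must be positive");
  if (!Number.isInteger(frameCount) || frameCount <= 0) {
    throw new RangeError("frameCount must be a positive integer");
  }
  const segment = durationSec / frameCount;
  return Array.from({ length: frameCount }, (_, i) => (i + 0.5) * segment);
}

// A 2-minute clip sampled at 4 frames.
console.log(uniformFrameTimestamps(120, 4)); // prints [ 15, 45, 75, 105 ]
```

Sampling a fixed number of frames keeps the visual token cost roughly constant regardless of clip length, at the price of coarser temporal coverage for longer videos.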

Parameters: 21B
Context: 128K tokens

What it can do

Multimodal input: text, image, video up to several minutes, audio
128K-token context window
JSON output and function calling
Multilingual coverage across 32+ languages
Available on Reka API, Reka Playground and Snowflake Cortex
Strong cost-performance ratio: between GPT-3.5 and GPT-4 quality at a much lower price
21B parameters give faster serving than Core
Best for: production multimodal workloads, RAG, audio-grounded chat, video QA
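
The function-calling capability listed above is commonly exposed through OpenAI-style tool definitions. The sketch below shows that general shape and how a returned tool call's arguments might be parsed. The `get_weather` tool and the sample call are hypothetical, and whether Railwail forwards a `tools` option to Reka Flash is an assumption this page does not confirm.

```typescript
// Tool definition in the OpenAI-compatible shape.
type ToolDefinition = {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
};

// Hypothetical tool, for illustration only.
const getWeatherTool: ToolDefinition = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// In this shape, tool calls carry their arguments as a JSON string.
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

// Parses the argument string and rejects anything that is not a JSON object.
function parseToolArguments(call: ToolCall): Record<string, unknown> {
  const parsed: unknown = JSON.parse(call.function.arguments);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new TypeError("tool arguments must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// Hand-written sample of what a model response might contain.
const sampleCall: ToolCall = {
  id: "call_1",
  type: "function",
  function: { name: getWeatherTool.function.name, arguments: '{"city":"Berlin"}' },
};
console.log(parseToolArguments(sampleCall).city); // prints "Berlin"
```

Validating the parsed arguments before acting on them matters because a model can emit malformed or unexpected JSON.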

Training & License

Multimodal pretraining over a curated corpus of text, code, images, video frames and audio with a progressive curriculum. Flash uses the same data mix as Reka Core, with a compute budget smaller than Core's but larger than Edge's.

License: Proprietary commercial API. Generated outputs may be used commercially under the Reka terms.

Known limitations

Closed weights, hosted only
Quality below Core on hardest reasoning and video tasks
Smaller ecosystem and tooling than OpenAI / Anthropic
Audio understanding lighter than dedicated ASR models
No external fine-tuning

Related Models

BLIP

Salesforce

Salesforce BLIP. Vision-language model for image captioning and visual question answering. Given an image it writes a short natural-language caption, or answers a question about the image when one is supplied. A widely used baseline for automatic captioning.

€1.00

CLIP Interrogator

Community

pharmapsychotic's CLIP Interrogator. Takes an image and produces a Stable-Diffusion-style text prompt by combining BLIP captioning with CLIP to rank likely subjects, artists, mediums and styles. Commonly used to reverse-engineer a prompt from an existing picture.

€1.00

Claude 3.5 Sonnet (vision)

Anthropic

Anthropic Claude 3.5 Sonnet with image input. 200k context, strong on dense documents, tables, charts and handwriting. Reliable structured extraction from screenshots and scans.

Free

Claude Opus 4.7