Reka Flash

Custom
Multimodal

Reka's 21B dense multimodal model balancing speed and quality. Up to 128k context.

TL;DR · Last updated May 16, 2026

Reka Flash is a multimodal AI model from Custom, priced at €0.200 per 1M input tokens, with a 128K-token context window.

Direct API access coming soon

Pricing

Price per Generation
Per generation: Free
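
To make the token rate quoted in the TL;DR concrete, here is a minimal sketch of the arithmetic. It uses only the €0.200 per 1M input tokens figure from this page; the function name `estimateInputCostEur` is illustrative, and the output-token rate is not stated here, so it is deliberately left out.

```javascript
// Input rate quoted on this page: EUR 0.200 per 1M input tokens.
// The output-token rate is not stated on this page, so it is not modelled.
const INPUT_EUR_PER_MILLION = 0.2;

// Estimate the input-side cost in euros for a given number of prompt tokens.
// (Hypothetical helper, not part of the railwail package.)
function estimateInputCostEur(promptTokens) {
  if (!Number.isInteger(promptTokens) || promptTokens < 0) {
    throw new RangeError("promptTokens must be a non-negative integer");
  }
  return (promptTokens / 1e6) * INPUT_EUR_PER_MILLION;
}

// A prompt that fills the whole 128,000-token context window.
console.log(estimateInputCostEur(128000).toFixed(4)); // "0.0256"
```

Under that rate, even a prompt that fills the full context window costs under three euro cents on the input side.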

API Integration

Use our OpenAI-compatible API to integrate Reka Flash into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("reka-flash", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("reka-flash", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("reka-flash", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
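
Since Reka Flash accepts images as well as text, a message that carries both might be built as sketched below. This assumes the endpoint accepts OpenAI-style content parts (`type: "text"` and `type: "image_url"`), which this page implies by calling the API OpenAI-compatible but does not confirm. The helper `buildImageMessage` is hypothetical and not part of the railwail package.

```javascript
// Build a user message that pairs a text prompt with an image URL,
// using the OpenAI-style "content parts" layout.
// ASSUMPTION: the API accepts this layout; this page does not confirm it.
function buildImageMessage(prompt, imageUrl) {
  // new URL() throws a TypeError on malformed input, so bad URLs fail early.
  const url = new URL(imageUrl);
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: url.href } },
    ],
  };
}

const message = buildImageMessage(
  "What is shown in this picture?",
  "https://example.com/cat.png"
);
console.log(JSON.stringify(message, null, 2));

// If the assumption holds, the message could be passed to the client shown above:
// const res = await rw.chat("reka-flash", [message]);
```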

Specifications
Context window: 128,000 tokens
Max output: 4,096 tokens
Developer: Custom
Category: Multimodal
Supported formats: text, image, video
Tags: reka, multimodal, cost-efficient
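
The two limits in the specifications interact when choosing `max_tokens`. The sketch below assumes that prompt and completion tokens share the 128,000-token window, which is common for chat APIs but is not stated on this page. The helper `planMaxTokens` is hypothetical, and the prompt token count is assumed to be known already (for example from `res.usage` on an earlier call).

```javascript
// Limits taken from the specifications on this page.
const CONTEXT_WINDOW = 128000;
const MAX_OUTPUT = 4096;

// Return the largest max_tokens value that is safe to request for a prompt
// of the given size, or 0 when the prompt already fills the window.
// ASSUMPTION: prompt and completion tokens share one context window.
function planMaxTokens(promptTokens, requestedMaxTokens) {
  const remaining = CONTEXT_WINDOW - promptTokens;
  if (remaining <= 0) return 0;
  return Math.min(requestedMaxTokens, MAX_OUTPUT, remaining);
}

console.log(planMaxTokens(1000, 500));    // 500  (request fits as-is)
console.log(planMaxTokens(1000, 10000));  // 4096 (capped by max output)
console.log(planMaxTokens(126000, 4096)); // 2000 (capped by remaining window)
console.log(planMaxTokens(128000, 500));  // 0    (no room left)
```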

Deep dive — Reka AI's Reka Flash

About Reka AI
Founded 2022 · San Francisco, California, USA

Reka AI was founded in mid-2022 by Dani Yogatama, Yi Tay, Donovan Ong and Qi Liu, senior researchers from Google DeepMind, Google Brain, Apple and Facebook AI. The company is headquartered in San Francisco with engineering hubs in London and Singapore. Reka raised a $58M Series A in 2023 led by DST Global Partners and a Series B in 2024 at a reported near-unicorn valuation, with backers including Snowflake. Reka's multimodal-by-design family contains three tiers: Edge (~7B), Flash (~21B) and Core (flagship). Reka Flash launched in February 2024 as the cost-performance sweet spot and was the first publicly available Reka model. The Flash technical report described the multimodal training approach later expanded in the April 2024 Reka Core paper. Flash is offered exclusively as a hosted model on the Reka API, Reka Playground and Snowflake Cortex.

Architecture
Decoder-only Transformer trained multimodally on text, image, video and audio (mid variant)

Reka Flash is a ~21B parameter decoder-only Transformer trained on Reka's multimodal corpus of text, code, images, video frames and audio. As described in the Reka Core technical report, all three Reka sizes share the same architecture (modality encoders projecting into a shared token embedding space) but differ in scale and the proportion of training compute spent on each modality. Flash supports a 128K-token context window, accepts text, image, video (up to several minutes via uniform frame sampling) and audio (via a learned audio encoder), and produces JSON output and tool calls. Public Reka benchmarks place Flash between GPT-3.5 and GPT-4 on MMLU and BIG-Bench Hard, while matching GPT-4V on Perception Test and VideoMME at a fraction of the price. Flash is the recommended default model in Snowflake Cortex's multimodal API and is positioned as the workhorse Reka model for production AI features.
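
The uniform frame sampling mentioned above can be illustrated with a generic sketch: pick a fixed number of evenly spaced frames from a clip, regardless of its length. Reka's actual sampling rate and frame count are not given in this page, so the function name, the midpoint rule and the example numbers are all assumptions.

```javascript
// Pick `numSamples` evenly spaced frame indices from a clip of `totalFrames`.
// Each index is the midpoint of one of `numSamples` equal-width segments.
// Generic illustration only; not Reka's published preprocessing.
function uniformFrameIndices(totalFrames, numSamples) {
  if (!Number.isInteger(totalFrames) || totalFrames < 0) {
    throw new RangeError("totalFrames must be a non-negative integer");
  }
  if (!Number.isInteger(numSamples) || numSamples <= 0) {
    throw new RangeError("numSamples must be a positive integer");
  }
  // Short clips: keep every frame rather than repeating any.
  if (numSamples >= totalFrames) {
    return Array.from({ length: totalFrames }, (_, i) => i);
  }
  const step = totalFrames / numSamples;
  return Array.from({ length: numSamples }, (_, i) =>
    Math.floor((i + 0.5) * step)
  );
}

console.log(uniformFrameIndices(100, 4)); // [ 12, 37, 62, 87 ]
```

Because the number of sampled frames is fixed, a longer video is covered more sparsely, which is one reason the supported length is bounded to a few minutes.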

Parameters: 21B
Context: 128K tokens
What it can do
  • Multimodal input: text, image, video up to several minutes, audio
  • 128K-token context window
  • JSON output and function calling
  • Multilingual coverage across 32+ languages
  • Available on Reka API, Reka Playground and Snowflake Cortex
  • Strong cost-performance ratio: between GPT-3.5 and GPT-4 quality at a much lower price
  • 21B parameters give faster serving than Core
  • Best for: production multimodal workloads, RAG, audio-grounded chat, video QA
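
When relying on the JSON output capability listed above, it is usually worth parsing the reply defensively, since any model can occasionally return malformed JSON or wrap it in a Markdown code fence. The sketch below is a generic pattern under that assumption. The helper `parseJsonReply` is hypothetical, and nothing here reflects a documented Reka or railwail response format.

```javascript
// Try to parse a model reply as JSON, tolerating a Markdown code fence
// (three backticks, optionally followed by "json") around the payload.
// Returns { ok: true, value } on success or { ok: false, error } on failure.
function parseJsonReply(text) {
  const unfenced = text
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "") // drop an opening fence, if any
    .replace(/\s*`{3}$/, "");          // drop a closing fence, if any
  try {
    return { ok: true, value: JSON.parse(unfenced) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

console.log(parseJsonReply('{"city": "Paris"}').value.city); // Paris
console.log(parseJsonReply("Sure, here you go!").ok);        // false
```

Returning a result object instead of throwing lets the caller decide whether to retry the request or fall back to plain text.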
Training & License

Multimodal pretraining over a curated corpus of text, code, images, video frames and audio, using a progressive curriculum. Flash uses the same data mix as Reka Core, with a compute budget smaller than Core's but larger than Edge's.

License: Proprietary commercial API. Generated outputs may be used commercially under the Reka terms.

Known limitations
  • Closed weights, hosted only
  • Quality below Core on hardest reasoning and video tasks
  • Smaller ecosystem and tooling than OpenAI / Anthropic
  • Audio understanding lighter than dedicated ASR models
  • No external fine-tuning

Start using Reka Flash today

Get started with free credits. No credit card required. Access Reka Flash and 100+ other models through a single API.