How much does AI21 Jamba 1.5 Mini cost via Railwail?

Input: €0.200 per 1M tokens. Output: €0.400 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
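
As a rough illustration of how the per-token rates above turn into a per-request cost, here is a minimal sketch. The helper name `estimateCostEur` is hypothetical and not part of any Railwail SDK; only the two rates come from this page.

```typescript
// Rates copied from the pricing answer above (EUR per 1M tokens).
const INPUT_EUR_PER_MILLION = 0.2;
const OUTPUT_EUR_PER_MILLION = 0.4;

// Hypothetical helper: estimates the cost of a single request in EUR.
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_EUR_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_EUR_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a 200,000-token prompt with a 4,000-token reply.
const exampleCost = estimateCostEur(200_000, 4_000);
console.log(exampleCost.toFixed(4)); // prints "0.0416"
```

Under these rates, even a prompt that fills most of the context window costs a few euro cents per call.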

What is the context window of AI21 Jamba 1.5 Mini?

AI21 Jamba 1.5 Mini supports a 256K-token context window, enough for entire codebases or research papers in one prompt.

How fast is AI21 Jamba 1.5 Mini?

Latency depends on prompt length and load, typically 200 ms to 2 s for short prompts. We measure p50/p95 in real time on /rankings.
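
For readers unfamiliar with p50/p95, the sketch below shows one common way to compute them, the nearest-rank method. It assumes you have collected your own latency samples. The sample values are made up for illustration, and this is not necessarily how /rankings computes its figures.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical latency samples in milliseconds.
const latenciesMs = [210, 250, 300, 320, 400, 450, 600, 800, 1200, 1900];
console.log(percentile(latenciesMs, 50)); // prints 400
console.log(percentile(latenciesMs, 95)); // prints 1900
```

The p50 value describes a typical request, while p95 shows how slow the slowest 5% of requests are.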

Is AI21 Jamba 1.5 Mini better than Bio_ClinicalBERT?

It depends on your use case. AI21 Jamba 1.5 Mini (Custom) is a generative text & chat model, while Bio_ClinicalBERT (huggingface) is a BERT encoder for clinical text served as a fill-mask endpoint, so the two suit different tasks. Compare them side-by-side at /compare/jamba-1-5-mini-vs-bio-clinicalbert.

AI21 Jamba 1.5 Mini

Custom

Text & Chat

Cost-efficient hybrid Mamba-Transformer model with 256k context. Tuned for high-throughput RAG.

TL;DR · Last updated June 24, 2026

AI21 Jamba 1.5 Mini is a text & chat AI model from Custom, priced at €0.200 per 1M input tokens and €0.400 per 1M output tokens, with a 256K-token context window.

Direct API access coming soon

Pricing

Price per generation: Free

API Integration

Use our OpenAI-compatible API to integrate AI21 Jamba 1.5 Mini into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("jamba-1-5-mini", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("jamba-1-5-mini", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("jamba-1-5-mini", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Context window: 256,000 tokens
Max output: 4,096 tokens
Developer: Custom
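
The two limits above can be combined into a simple pre-flight check before sending a long prompt. This is a sketch under one assumption that the page does not confirm: that the prompt and the generated output share the same 256,000-token window. The function names are hypothetical, and you need your own tokeniser to count tokens.

```typescript
// Limits taken from the specifications above.
const CONTEXT_WINDOW_TOKENS = 256_000;
const MAX_OUTPUT_TOKENS = 4_096;

// Assumption: prompt and completion share one context window, so the
// prompt budget is the window minus the space reserved for the reply.
function maxPromptTokens(requestedOutputTokens: number): number {
  const reserved = Math.min(requestedOutputTokens, MAX_OUTPUT_TOKENS);
  return CONTEXT_WINDOW_TOKENS - reserved;
}

// Returns true when a prompt of the given size leaves room for the reply.
function fitsInContext(promptTokens: number, requestedOutputTokens: number): boolean {
  return promptTokens <= maxPromptTokens(requestedOutputTokens);
}

console.log(maxPromptTokens(4_096)); // prints 251904
console.log(fitsInContext(250_000, 500)); // prints true
```

If the check fails, trim the retrieved context or lower `max_tokens` rather than relying on the API to truncate for you.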

Deep dive: AI21 Jamba 1.5 Mini by AI21 Labs

About AI21 Labs

Founded 2017 · Tel Aviv, Israel

AI21 Labs is an Israeli LLM pioneer founded in 2017 by Yoav Shoham (Stanford emeritus), Ori Goshen and Amnon Shashua (Mobileye founder). Long active in commercial LLMs (Jurassic-1, Jurassic-2), AI21 pioneered the hybrid State-Space + Transformer Jamba architecture in March 2024. Jamba 1.5 Mini is the small-and-fast variant of the August 2024 Jamba 1.5 release, optimised for long-context inference at production cost points and tractable on a single H100 80GB. AI21 has raised over $336M from investors including Google, Nvidia, Walden Catalyst and Pitango.

Visit AI21 Labs →

Architecture

Hybrid Mamba-Transformer Mixture-of-Experts

Jamba 1.5 Mini uses the same hybrid SSM + Transformer + MoE recipe as Jamba 1.5 Large at a smaller scale. Its 32 layers interleave Mamba (selective state-space) and self-attention layers in a 7:1 ratio. The MoE layers use 16 experts with top-2 routing, so only 12B of the 52B total parameters are active per token. Per-token compute is therefore comparable to a dense 12B model, although all 52B parameters must still be held in memory. The Mamba layers carry a fixed-size state, so only the few attention layers need a cache that grows with context length. The model uses a 64,000-token BPE tokeniser and supports the same nine languages as Jamba 1.5 Large (English, Spanish, French, Portuguese, Italian, Dutch, German, Arabic, Hebrew). Released August 2024 under the Jamba Open Model License with hosted access via AI21 Studio, AWS Bedrock, Azure AI Studio and Snowflake Cortex.

Parameters: 52B total, 12B active per token (16 experts, top-2 routing)
Context: 256K tokens
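
To make "16 experts, top-2 routing" concrete, here is a generic sketch of top-k expert gating. It is not AI21's implementation. The function name, the example logits and the choice to renormalise with a softmax over only the selected experts are all assumptions made for illustration.

```typescript
interface ExpertChoice {
  expert: number; // index of the selected expert
  weight: number; // mixing weight; weights of the selected experts sum to 1
}

// Picks the k experts with the highest router logits and turns their
// logits into mixing weights with a softmax over the selected experts only.
function routeTopK(routerLogits: number[], k: number): ExpertChoice[] {
  if (k < 1 || k > routerLogits.length) {
    throw new Error("k must be between 1 and the number of experts");
  }
  const ranked = routerLogits
    .map((logit, expert) => ({ expert, logit }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  // Subtract the largest logit before exponentiating for numerical stability.
  const maxLogit = Math.max(...ranked.map((r) => r.logit));
  const exps = ranked.map((r) => Math.exp(r.logit - maxLogit));
  const total = exps.reduce((sum, x) => sum + x, 0);
  return ranked.map((r, i) => ({ expert: r.expert, weight: exps[i] / total }));
}

// Hypothetical router output for one token across 16 experts.
const routerLogits = new Array<number>(16).fill(0);
routerLogits[3] = 2.0;
routerLogits[11] = 1.0;
const chosen = routeTopK(routerLogits, 2);
console.log(chosen.map((c) => c.expert)); // experts 3 and 11, in that order
```

Because only the two selected experts run for each token, per-token compute tracks the 12B active parameters rather than the 52B total.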

What it can do

Hybrid Mamba+Transformer+MoE architecture at compact scale
52B total / 12B active parameters
256K effective context
Fits on a single 80GB GPU (H100 or A100) with INT8 weights; FP16 weights alone need about 104 GB
Native function calling and JSON-mode output
Multilingual (9 languages)
Open weights under Jamba Open Model License
Best for: cheap long-context summarisation, RAG with large retrieval windows, single-GPU enterprise pilots.
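
The single-GPU claim in the list above can be sanity-checked with simple arithmetic on weight storage. The sketch below counts only the weights (52B parameters times bytes per parameter) and ignores activations, caches and framework overhead, so treat the results as lower bounds. The helper names are hypothetical.

```typescript
const TOTAL_PARAMS = 52e9; // total parameter count from this page
const GPU_MEMORY_GB = 80; // H100 80GB or A100 80GB

// Weight storage in GB (1 GB = 1e9 bytes) for a given precision.
function weightMemoryGB(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1e9;
}

// True when the weights alone fit in the given GPU memory.
function weightsFit(params: number, bytesPerParam: number, gpuGB: number): boolean {
  return weightMemoryGB(params, bytesPerParam) <= gpuGB;
}

console.log(weightMemoryGB(TOTAL_PARAMS, 2)); // FP16: prints 104
console.log(weightMemoryGB(TOTAL_PARAMS, 1)); // INT8: prints 52
console.log(weightsFit(TOTAL_PARAMS, 1, GPU_MEMORY_GB)); // prints true
```

By this estimate, 16-bit weights alone exceed 80 GB, so single-GPU deployment is most plausible with 8-bit weights.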

Training & License

Same data mixture and methodology as Jamba 1.5 Large: trillions of tokens of web, code, math, books and multilingual sources, knowledge cutoff March 2024, followed by supervised fine-tuning and preference optimisation.

License: Jamba Open Model License. Permits research and commercial use with attribution and AUP compliance. Hosted access via AI21 Studio, AWS Bedrock, Azure AI Studio and Snowflake Cortex.

Known limitations

Lower quality than Jamba 1.5 Large or Mixtral 8x7B Instruct on hard reasoning
No vision modality
Limited community ecosystem: fewer inference engines support hybrid Mamba + attention models
Trails dense 70B-class instruct models on harder benchmarks
Multilingual coverage narrower than Command R, Aya or Mistral Large

Related Models

Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8