AI21 Jamba 1.5 Large
AI21's flagship hybrid Mamba-Transformer model with a 256k context window for long-document tasks.
AI21 Jamba 1.5 Large is a text & chat AI model from AI21 Labs, priced at €2.00 per 1M input tokens, with a 256K-token context window.
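As a rough illustration of the listed rate, the sketch below estimates the input-side cost of a single prompt. The helper name `estimateInputCost` is hypothetical, the rate of 2.00 per 1M input tokens is taken from the listing above, and output-token pricing is not listed here, so it is left out.

```javascript
// Listed rate: 2.00 per 1M input tokens (currency as shown on the page).
const INPUT_PRICE_PER_MILLION = 2.0;
// Listed context window, used only to reject prompts that cannot fit.
const CONTEXT_WINDOW_TOKENS = 256000;

// Hypothetical helper: input-side cost of one prompt at the listed rate.
function estimateInputCost(inputTokens) {
  if (!Number.isInteger(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative integer");
  }
  if (inputTokens > CONTEXT_WINDOW_TOKENS) {
    throw new RangeError("prompt exceeds the 256K-token context window");
  }
  return (inputTokens / 1000000) * INPUT_PRICE_PER_MILLION;
}

// A prompt that fills the whole window costs about 0.51 on the input side.
console.log(estimateInputCost(256000).toFixed(2));
```

Completion tokens are billed separately on most platforms, so treat this as a lower bound on the cost of a request.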
API Integration
Use our OpenAI-compatible API to integrate AI21 Jamba 1.5 Large into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple: just pass a string
const reply = await rw.run("jamba-1-5-large", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("jamba-1-5-large", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("jamba-1-5-large", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: AI21 Labs's AI21 Jamba 1.5 Large
AI21 Labs is one of the earliest commercial LLM companies, founded in 2017 in Tel Aviv by Yoav Shoham (Stanford emeritus, AI pioneer), Ori Goshen and Amnon Shashua (Mobileye founder, ex-Intel SVP). AI21 built the Jurassic-1 (2021) and Jurassic-2 (2023) families and pioneered hybrid State-Space + Transformer architectures with Jamba in March 2024, the first production-scale Mamba-Transformer hybrid LLM. Jamba 1.5 followed in August 2024 at two scales: Mini (52B total / 12B active) and Large (398B total / 94B active). AI21 has raised over $336M from investors including Google, Nvidia, Walden Catalyst and Pitango, and serves enterprise customers through AI21 Studio, AWS Bedrock, Azure AI Studio, and Snowflake Cortex.
Jamba 1.5 Large is a hybrid State-Space + Transformer Mixture-of-Experts model. The architecture interleaves Mamba (selective state-space) layers with standard self-attention layers in a 7:1 Mamba-to-attention ratio across 72 layers. Mixture-of-Experts replaces the MLP in every second layer, with 16 experts and top-2 routing, giving 94B active parameters out of 398B total. The Mamba layers process the sequence in linear time with a fixed-size recurrent state, while the attention layers preserve in-context retrieval quality. Together they enable a 256,000-token effective context, empirically validated on the RULER long-context benchmark, where pure-Transformer 128K models degrade noticeably. The model uses a 64,000-token BPE tokeniser and supports nine languages (English, Spanish, French, Portuguese, Italian, Dutch, German, Arabic, Hebrew). It was released in August 2024 under the Jamba Open Model License, with hosted access via AI21 Studio, AWS Bedrock, Azure AI Studio and Snowflake Cortex.
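The interleaving described above can be sketched as a simple layer schedule. This is a minimal illustration, not AI21's implementation: it assumes 72 layers grouped into blocks of 8 (seven Mamba layers plus one attention layer), and both the slot of the attention layer inside each block (`ATTENTION_OFFSET`) and the placement of MoE on every second layer (`MOE_PERIOD`) are assumptions chosen for the example.

```javascript
// Assumed layout values; only the 7:1 ratio and the expert counts come from the text.
const NUM_LAYERS = 72;
const LAYERS_PER_BLOCK = 8; // 7 Mamba layers + 1 attention layer
const ATTENTION_OFFSET = 4; // hypothetical slot of the attention layer in a block
const MOE_PERIOD = 2;       // assumed: an MoE feed-forward on every second layer

// Build one descriptor per layer: which sequence mixer and which feed-forward it uses.
function buildLayerSchedule() {
  const layers = [];
  for (let i = 0; i < NUM_LAYERS; i++) {
    const mixer = i % LAYERS_PER_BLOCK === ATTENTION_OFFSET ? "attention" : "mamba";
    const ffn = i % MOE_PERIOD === 1 ? "moe" : "mlp";
    layers.push({ index: i, mixer, ffn });
  }
  return layers;
}

const schedule = buildLayerSchedule();
const countLayers = (key, value) => schedule.filter((layer) => layer[key] === value).length;

console.log({
  mamba: countLayers("mixer", "mamba"),
  attention: countLayers("mixer", "attention"),
  moe: countLayers("ffn", "moe"),
});
```

Under these assumptions the schedule has 63 Mamba layers and 9 attention layers, which is the 7:1 ratio, and only those 9 attention layers keep a key-value cache that grows with context length.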
- Parameters: 398B total, 94B active per token (16 experts, top-2 routing)
- Context: 256K tokens
- Hybrid Mamba+Transformer+MoE architecture
- 398B total / 94B active parameters
- 256K effective context, best-in-class on the RULER long-context benchmark
- Fixed-size Mamba state regardless of sequence length, which keeps long-context inference cheap
- Native function calling and JSON-mode structured output
- Multilingual: English, Spanish, French, Portuguese, Italian, Dutch, German, Arabic, Hebrew
- Open weights on Hugging Face under Jamba Open Model License
- Best for: long-document analysis, many-document RAG, long-trace agents, cost-efficient enterprise long-context inference.
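The "16 experts, top-2 routing" entry in the list above can be illustrated with a toy router. This is a sketch under stated assumptions: the function names are hypothetical, the experts are scalar functions rather than real feed-forward networks, and renormalising the softmax over only the two selected experts is one common recipe that may differ from what Jamba actually does.

```javascript
const NUM_EXPERTS = 16;
const TOP_K = 2;

// Pick the k highest-scoring experts and softmax-normalise their weights.
function routeTopK(routerLogits, k = TOP_K) {
  const ranked = routerLogits
    .map((logit, expert) => ({ expert, logit }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  const maxLogit = ranked[0].logit; // subtract the max for numerical stability
  const exps = ranked.map((r) => Math.exp(r.logit - maxLogit));
  const total = exps.reduce((sum, value) => sum + value, 0);
  return ranked.map((r, i) => ({ expert: r.expert, weight: exps[i] / total }));
}

// Weighted sum of the selected experts' outputs; the other 14 never run.
function moeForward(x, routerLogits, experts) {
  return routeTopK(routerLogits).reduce(
    (acc, { expert, weight }) => acc + weight * experts[expert](x),
    0
  );
}

// Toy experts: expert i multiplies its input by (i + 1).
const toyExperts = Array.from({ length: NUM_EXPERTS }, (_, i) => (x) => x * (i + 1));
const toyLogits = new Array(NUM_EXPERTS).fill(0);
toyLogits[3] = 2; // experts 3 and 7 get the highest router scores
toyLogits[7] = 2;

console.log(routeTopK(toyLogits));
console.log(moeForward(1, toyLogits, toyExperts));
```

Because only two of the sixteen experts run for each token, the compute per token tracks the 94B active parameters rather than the 398B total.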
Pretrained on trillions of tokens of web data, code, math, books and multilingual sources (exact figure not disclosed). Knowledge cutoff March 2024. Post-training is supervised fine-tuning plus preference optimisation; the SSAM (state-space attention mix) post-training adapts Mamba-state regularisation.
License: Jamba Open Model License. Permissive for research and commercial use with attribution and AUP compliance; less permissive than Apache 2.0 but more open than research-only licenses. Hosted commercial access via AI21 Studio, AWS Bedrock, Azure AI Studio and Snowflake Cortex.
Known limitations
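- 398B total parameters are about 796 GB of weights at FP16, more than a single 8x H100 (80 GB) node can hold; fitting on one 8-GPU node requires roughly 8-bit weight quantisation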
- No vision modality
- Hybrid architecture has less community tooling; some inference engines do not support it
- Behind GPT-4o / Claude 3.5 Sonnet on hardest reasoning, code and math
- Jamba Open Model License has acceptable-use restrictions and attribution requirements
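A back-of-the-envelope check of the hardware figure in the list above. The weight estimate uses only the 398B parameter count. The key-value cache estimate relies on assumed attention dimensions (9 attention layers, 8 KV heads, head size 128, 16-bit values), which are illustrative values rather than confirmed specifications, and the helper names are hypothetical.

```javascript
const GB = 1e9;

// Memory for the weights alone at a given precision.
function weightMemoryGB(totalParams, bytesPerParam) {
  return (totalParams * bytesPerParam) / GB;
}

// Key-value cache size: keys and values (hence the factor 2) for every
// attention layer, KV head and cached token.
function kvCacheGB({ attentionLayers, kvHeads, headDim, bytesPerValue, tokens }) {
  return (2 * attentionLayers * kvHeads * headDim * bytesPerValue * tokens) / GB;
}

const TOTAL_PARAMS = 398e9;
const NODE_MEMORY_GB = 8 * 80; // one node with eight 80 GB GPUs

const fp16Weights = weightMemoryGB(TOTAL_PARAMS, 2); // 796 GB
const int8Weights = weightMemoryGB(TOTAL_PARAMS, 1); // 398 GB

// Assumed attention dimensions (not confirmed by this page).
const assumedDims = { kvHeads: 8, headDim: 128, bytesPerValue: 2, tokens: 256000 };
const hybridCache = kvCacheGB({ attentionLayers: 9, ...assumedDims });
const allAttentionCache = kvCacheGB({ attentionLayers: 72, ...assumedDims });

console.log({ fp16Weights, int8Weights, NODE_MEMORY_GB });
console.log("fits at 16-bit:", fp16Weights <= NODE_MEMORY_GB);
console.log("fits at 8-bit:", int8Weights <= NODE_MEMORY_GB);
console.log({ hybridCache, allAttentionCache });
```

Under these assumptions the 16-bit weights alone exceed the 640 GB of one 8x 80 GB node, while 8-bit weights leave room for activations and cache. The cache comparison shows why the hybrid layout helps at long context: the same dimensions with attention in all 72 layers would need eight times the cache.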
Related Models
Claude Opus 4
Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.
Claude Sonnet 4
Anthropic's most capable model. Excellent for complex analysis, coding, math, and creative writing.
DeepSeek V3.1
DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.
DeepSeek V4 Pro
DeepSeek's April 2026 flagship. 1.6T MoE / 49B active params, 1M context, rivals top closed-source models on STEM and coding at a fraction of the price.