How much does Cohere Command R (08-2024) cost via Railwail?

Input: €0.150 per 1M tokens. Output: €0.600 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
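
As a rough worked example of the rates above, the sketch below estimates the cost of a single request. The helper name `estimateCostEur` is hypothetical and not part of any Railwail SDK; it only applies the two per-million-token prices quoted in this answer.

```typescript
// Prices quoted above, in EUR per 1M tokens.
const INPUT_EUR_PER_MILLION = 0.15;
const OUTPUT_EUR_PER_MILLION = 0.6;

// Hypothetical helper: estimate the cost of one request in EUR.
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_EUR_PER_MILLION +
    (outputTokens / 1e6) * OUTPUT_EUR_PER_MILLION
  );
}

// Example: a 10,000-token prompt with a 1,000-token reply.
const exampleCost = estimateCostEur(10000, 1000);
console.log(exampleCost.toFixed(4)); // prints "0.0021"
```

At these rates, output tokens cost four times as much as input tokens, so long replies dominate the bill faster than long prompts.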

What is the context window of Cohere Command R (08-2024)?

Cohere Command R (08-2024) supports a 128K-token context window, enough for long books, technical manuals, and extended analysis.

How fast is Cohere Command R (08-2024)?

Latency depends on prompt length and load; it is typically 200 ms to 2 s for short prompts. We publish real-time p50/p95 measurements at /rankings.
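
For readers unfamiliar with p50/p95, here is a minimal sketch of how such percentiles can be computed from raw latency samples. It uses the nearest-rank method, which is an assumption; the page does not say how the /rankings figures are actually calculated, and the sample values are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Illustrative latencies in milliseconds (not real measurements).
const latenciesMs = [220, 310, 450, 380, 1900, 260, 530, 290, 410, 340];
console.log(percentile(latenciesMs, 50)); // prints 340
console.log(percentile(latenciesMs, 95)); // prints 1900
```

The single slow request barely moves p50 but sets p95, which is why both figures are usually reported together.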

Is Cohere Command R (08-2024) better than Bio_ClinicalBERT?

It depends on your use case. Cohere Command R (08-2024) (Cohere) is a generative text & chat model, while Bio_ClinicalBERT (huggingface) is a fill-mask encoder for clinical text, so they suit different tasks. Compare them side-by-side at /compare/command-r-08-2024-vs-bio-clinicalbert.

Cohere Command R (08-2024)

Cohere

Text & Chat

Cohere's mid-tier RAG/tool model. Cost-efficient sibling of Command R+ with 128k context.

TL;DR · Last updated June 24, 2026

Cohere Command R (08-2024) is a text & chat AI model from Cohere, priced at €0.150 per 1M input tokens with a 128K-token context window.

Pricing

Price per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Cohere Command R (08-2024) into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("command-r-08-2024", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("command-r-08-2024", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("command-r-08-2024", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
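
Because the API is described as OpenAI-compatible, a request can in principle also be assembled without the SDK. The sketch below builds such a request under several assumptions: the base URL `https://api.example.com/v1` is a placeholder (the real endpoint is not given on this page), the `/chat/completions` path and Bearer authentication follow the usual OpenAI convention, and `buildChatRequest` is a hypothetical helper.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assemble an OpenAI-style chat completion request.
function buildChatRequest(
  baseUrl: string, // placeholder; substitute the provider's documented base URL
  apiKey: string,
  messages: ChatMessage[],
  options: { temperature?: number; max_tokens?: number } = {}
): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "command-r-08-2024", messages, ...options }),
  };
}

const chatRequest = buildChatRequest(
  "https://api.example.com/v1",
  "YOUR_API_KEY",
  [{ role: "user", content: "Hello!" }],
  { temperature: 0.7, max_tokens: 500 }
);
// To send it:
// fetch(chatRequest.url, { method: chatRequest.method, headers: chatRequest.headers, body: chatRequest.body })
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.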

Specifications

Context window: 128,000 tokens
Max output: 4,096 tokens
Developer: Cohere
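
The two limits above can be combined into a simple prompt budget. The sketch below assumes that prompt and completion tokens share the same 128,000-token window, which is the usual convention but is not stated on this page, and that the prompt token count comes from a real tokeniser. Both function names are hypothetical.

```typescript
const CONTEXT_WINDOW_TOKENS = 128000;
const MAX_OUTPUT_TOKENS = 4096;

// Largest prompt that still leaves room for the requested completion.
function maxPromptTokens(requestedOutput: number = MAX_OUTPUT_TOKENS): number {
  // The completion can never exceed the model's output cap.
  const reserved = Math.min(requestedOutput, MAX_OUTPUT_TOKENS);
  return CONTEXT_WINDOW_TOKENS - reserved;
}

function fitsContext(
  promptTokens: number,
  requestedOutput: number = MAX_OUTPUT_TOKENS
): boolean {
  return promptTokens <= maxPromptTokens(requestedOutput);
}

console.log(maxPromptTokens()); // prints 123904
console.log(fitsContext(125000)); // prints false
console.log(fitsContext(125000, 500)); // prints true
```

Requesting a shorter completion frees up room for a longer prompt, which matters mainly for documents close to the window limit.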

Deep dive: Cohere's Command R (08-2024)

About Cohere

Founded 2019 · Toronto, Canada

Cohere is an enterprise-focused LLM lab founded in 2019 by Aidan Gomez (co-author of 'Attention Is All You Need'), Ivan Zhang and Nick Frosst. Headquartered in Toronto with offices in San Francisco and London, Cohere targets enterprise customers — banks, telcos, government — with deployment options spanning Cohere's hosted API, AWS Bedrock, Azure, Oracle Cloud and on-prem (Cohere North). The Command R family launched March 2024 as Cohere's RAG-and-tool-use flagship line, refreshed as command-r-08-2024 in August 2024 with substantial gains on math, code, reasoning and reduced spurious refusals. Investors include Nvidia, Cisco, Salesforce and Oracle, with total funding above $1B and a 2024 valuation around $5.5B.

Architecture

Decoder-only Transformer optimised for RAG and tool use

Command R 08-2024 is a 35B dense decoder-only transformer designed specifically for retrieval-augmented generation and multi-step tool use. The architecture uses grouped-query attention (8 KV heads), RoPE positional embeddings (extended for 128K context), SwiGLU activations, RMSNorm and no biases, with a 256,000-token multilingual BPE tokeniser shared with the Aya line. The 08-2024 refresh kept the architecture and tokeniser identical to the March 2024 release but substantially improved math, code, reasoning, structured output and reduced 'I cannot answer' refusals. Post-training emphasises grounded citation generation (Cohere's structured `<co: 0>` citation tokens) and parallel multi-step tool calling. Open weights are released under CC-BY-NC 4.0 for research; commercial use goes through the hosted Cohere API, AWS Bedrock, Azure, Oracle marketplaces or licensed on-prem deployment.

Parameters: 35B (dense)
Context: 128K tokens
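
To illustrate why grouped-query attention matters at 128K context, the sketch below estimates key/value cache size for a single sequence. Only the 8 KV heads come from the text above; the layer count, head dimension, query-head count and 2-byte precision are hypothetical values chosen for illustration and are not confirmed figures for this model.

```typescript
interface KvCacheConfig {
  layers: number; // hypothetical
  kvHeads: number; // 8 per the architecture description
  headDim: number; // hypothetical
  bytesPerValue: number; // 2 for fp16/bf16 (assumed)
}

// Keys and values are each cached per layer, per KV head, per token.
function kvCacheBytes(cfg: KvCacheConfig, seqLen: number): number {
  return 2 * cfg.layers * cfg.kvHeads * cfg.headDim * seqLen * cfg.bytesPerValue;
}

const gqaConfig: KvCacheConfig = { layers: 40, kvHeads: 8, headDim: 128, bytesPerValue: 2 };
// Same hypothetical model with one KV head per query head (64 assumed).
const mhaConfig: KvCacheConfig = { ...gqaConfig, kvHeads: 64 };

const GIB = 1024 ** 3;
const gqaGiB = kvCacheBytes(gqaConfig, 128000) / GIB;
const mhaGiB = kvCacheBytes(mhaConfig, 128000) / GIB;
console.log(gqaGiB.toFixed(2)); // prints "19.53"
console.log((mhaGiB / gqaGiB).toFixed(0)); // prints "8"
```

Under these assumed dimensions, sharing 8 KV heads instead of caching 64 cuts the full-context cache by a factor of eight, which is the main practical benefit of grouped-query attention for long inputs.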

What it can do

Excellent retrieval-augmented generation with structured inline citations
Native multi-step tool/function calling with parallel-tool support
128K context window for long-document analysis
Strong multilingual quality across 10+ business languages
Improved math, code and reasoning in the 08-2024 refresh
Open weights (CC-BY-NC) for research evaluation
Hosted on Cohere API, AWS Bedrock, Azure, Oracle Cloud
Best for: enterprise RAG, multilingual support agents, mid-tier tool-using assistants.

Training & License

Trained on trillions of tokens of multilingual web data, code, math and Cohere-curated synthetic RAG and tool-use traces. Knowledge cutoff approximately early 2024. Post-training is supervised fine-tuning followed by preference optimisation, with specific data and methods proprietary to Cohere.

License: Cohere Terms of Service for the hosted API. Open weights on Hugging Face under CC-BY-NC 4.0 (research-only). Commercial production access via Cohere API, AWS Bedrock, Azure, Oracle Cloud or licensed on-prem deployments.

Known limitations

Open weights are non-commercial — production needs hosted/licensed access
Smaller and less capable than Command R+ on hard reasoning
No vision modality
Cohere tool-use schema differs from OpenAI — adapter work required
Latency higher than smaller tiers for simple short-text tasks
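
The adapter work mentioned in the list above might look like the following sketch, which converts an OpenAI-style function tool into a flat, per-parameter shape. The target shape (`parameter_definitions` with Python-style type names) is an assumption modelled on Cohere's v1 chat API and should be checked against current Cohere documentation; whether any conversion is needed behind an OpenAI-compatible endpoint is not stated on this page. All type and function names are hypothetical.

```typescript
interface OpenAiTool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, { type: string; description?: string }>;
      required?: string[];
    };
  };
}

// Assumed target shape; verify against the provider's documentation.
interface CohereStyleTool {
  name: string;
  description: string;
  parameter_definitions: Record<
    string,
    { type: string; description: string; required: boolean }
  >;
}

// JSON Schema type names mapped to Python-style names (assumed mapping).
const TYPE_MAP: Record<string, string> = {
  string: "str",
  integer: "int",
  number: "float",
  boolean: "bool",
};

function toCohereStyleTool(tool: OpenAiTool): CohereStyleTool {
  const { name, description, parameters } = tool.function;
  const required = new Set(parameters.required ?? []);
  const parameter_definitions: CohereStyleTool["parameter_definitions"] = {};
  for (const [key, schema] of Object.entries(parameters.properties)) {
    parameter_definitions[key] = {
      type: TYPE_MAP[schema.type] ?? schema.type, // pass unknown types through
      description: schema.description ?? "",
      required: required.has(key),
    };
  }
  return { name, description, parameter_definitions };
}

const weatherTool: OpenAiTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city.",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name" },
        days: { type: "integer" },
      },
      required: ["city"],
    },
  },
};
console.log(JSON.stringify(toCohereStyleTool(weatherTool), null, 2));
```

This handles only flat parameter lists; nested objects and arrays would need additional mapping rules.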

Related Models

Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8