How much does Cohere Aya 23 35B cost via Railwail?

No monthly minimum, no subscription. Start with €5 free credits.

What is the context window of Cohere Aya 23 35B?

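Cohere Aya 23 35B supports an 8,192-token (8.2K) context window. At the common rule of thumb of about 0.75 English words per token, that is roughly 6,000 words, shared between the prompt and the response.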

How fast is Cohere Aya 23 35B?

Latency depends on prompt length and load, and is typically 200 ms to 2 s for short prompts. We measure p50/p95 in real time on /rankings.
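For readers unfamiliar with p50/p95, the sketch below shows one common way to compute them, the nearest-rank method. This is only an illustration: the sample values are made up, `percentile` is a hypothetical helper, and nothing here is confirmed to match how /rankings computes its figures.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical latency samples in milliseconds (illustrative, not measured data).
const latenciesMs = [210, 250, 300, 320, 400, 450, 600, 900, 1500, 2000];

const p50 = percentile(latenciesMs, 50); // median request
const p95 = percentile(latenciesMs, 95); // slow tail
console.log({ p50, p95 });
```

p50 describes a typical request, while p95 shows how slow the slowest 5% of requests are, which is why the two are usually reported together.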

Is Cohere Aya 23 35B better than Bio_ClinicalBERT?

It depends on your use case. Cohere Aya 23 35B (Custom) and Bio_ClinicalBERT (huggingface) are both strong choices in text & chat. Compare them side-by-side at /compare/aya-23-35b-vs-bio-clinicalbert.

Cohere Aya 23 35B

Custom

Text & Chat

Open-weights multilingual research model from Cohere covering 23 languages. 35B parameters.

TL;DR · Last updated June 24, 2026

Cohere Aya 23 35B is a text & chat AI model from Custom, priced at €0.000 per 1M input tokens, with an 8,192-token context window.

Direct API access coming soon

Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Cohere Aya 23 35B into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("aya-23-35b", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("aya-23-35b", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("aya-23-35b", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
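Because the model's context window is 8,192 tokens, long message histories may need trimming before they are sent. The sketch below is one possible approach, not part of the railwail package: `estimateTokens` and `trimHistory` are hypothetical helpers, and the ratio of about 4 characters per token is a rough assumption that will be inaccurate for the real tokeniser, especially on non-Latin scripts.

```javascript
const CONTEXT_WINDOW = 8192; // tokens, from the model's listed context window

// Rough heuristic (assumption): about 4 characters per token.
// The model's real tokeniser will produce different counts.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep all system messages, then keep the most recent other messages
// that still fit once some room is reserved for the reply.
function trimHistory(messages, reservedForOutput = 500) {
  const budget = CONTEXT_WINDOW - reservedForOutput;
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept = [];
  // Walk backwards from the newest message and stop at the first one that does not fit.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}

const history = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "x".repeat(40000) }, // far too large under the heuristic
  { role: "assistant", content: "Short answer." },
  { role: "user", content: "Explain quantum computing simply." },
];
const trimmed = trimHistory(history);
console.log(trimmed.map((m) => m.role));
```

The trimmed array could then be passed to `rw.run` in place of the raw history, as in the message-history example above.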

Specifications

Context window: 8,192 tokens
Max output: 4,096 tokens
Developer: Custom
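The two limits above interact when choosing a `max_tokens` value. The sketch below assumes, as is typical for decoder-only models but not stated on this page, that the prompt and the completion share the 8,192-token window. `maxTokensFor` is a hypothetical helper, not part of the railwail package.

```javascript
const CONTEXT_WINDOW_TOKENS = 8192; // listed context window
const MAX_OUTPUT_TOKENS = 4096;     // listed max output

// Largest max_tokens value that respects both limits for a given prompt size.
function maxTokensFor(promptTokens, requested = MAX_OUTPUT_TOKENS) {
  const remaining = CONTEXT_WINDOW_TOKENS - promptTokens;
  if (remaining <= 0) {
    throw new RangeError("prompt does not fit in the context window");
  }
  return Math.min(requested, MAX_OUTPUT_TOKENS, remaining);
}

console.log(maxTokensFor(1000));      // 4096: capped by the output limit
console.log(maxTokensFor(6000));      // 2192: capped by the remaining context
console.log(maxTokensFor(1000, 500)); // 500: the caller asked for less
```

The result could be passed as the `max_tokens` option in a `rw.chat` call like the one shown under API Integration.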

Deep dive — Cohere For AI's Cohere Aya 23 35B

About Cohere For AI

Founded 2022 · Toronto, Canada

Cohere For AI (C4AI) is the non-profit research lab of Cohere, founded in 2022 and led by Sara Hooker. Its remit is open science and global multilingual NLP. C4AI's flagship Aya initiative — launched in 2023 — coordinated 3,000+ collaborators across 119 countries to build the largest open-access multilingual instruction-tuning dataset (the Aya Collection: 513M instances across 114 languages). Aya 23, released May 2024, is the second generation of the Aya model family. It builds on Cohere's Command R base architecture and trades broad coverage (101 languages in Aya-101) for stronger per-language quality across 23 widely-used languages. Cohere itself was founded in 2019 by Aidan Gomez (co-author of 'Attention Is All You Need'), Ivan Zhang and Nick Frosst.

Architecture

Decoder-only Transformer (Command R base architecture)

Aya 23 35B is a dense decoder-only transformer fine-tuned from Cohere's Command R 35B base. The architecture uses grouped-query attention (8 KV heads), RoPE positional embeddings, SwiGLU activations, RMSNorm, no biases, and a 256,000-token multilingual BPE tokeniser that is particularly efficient on non-Latin scripts (Arabic, Hebrew, Korean, Japanese, Chinese). The model was instruction-tuned on the Aya Collection — 513M templated and translated instruction instances — narrowed to coverage for 23 target languages, supplemented by the Aya Dataset (204K human-curated prompt-completion pairs from 65 languages collected via the Aya Annotation Platform) and ShareGPT, Dolly and Flan corpora. There is no public RLHF or DPO stage for Aya 23. Released May 2024 under CC-BY-NC 4.0 alongside an 8B sibling.

Parameters: 35B (dense)
Context: 8.2K tokens
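For anyone planning to run the open weights, the parameter count translates directly into memory for the weights alone. The sketch below is back-of-the-envelope arithmetic using the nominal 35B figure and standard bytes-per-parameter values. It ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```javascript
const PARAMS = 35e9; // nominal parameter count from the spec above

// Standard storage sizes per parameter at common precisions.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1, int4: 0.5 };

// Memory needed for the weights only, in decimal gigabytes (1 GB = 1e9 bytes).
function weightMemoryGB(params, precision) {
  const bytes = BYTES_PER_PARAM[precision];
  if (bytes === undefined) throw new Error(`unknown precision: ${precision}`);
  return (params * bytes) / 1e9;
}

for (const precision of Object.keys(BYTES_PER_PARAM)) {
  console.log(precision, weightMemoryGB(PARAMS, precision), "GB");
}
```

At 16-bit precision this comes to about 70 GB for the weights, which is more than a single 24 GB consumer GPU can hold without quantisation or sharding.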

What it can do

State-of-the-art open multilingual performance across 23 languages
Particularly strong on Arabic, Hebrew, Persian, Korean, Hindi, Vietnamese
Outperforms Aya-101 13B, Mistral 7B Instruct v0.2, Gemma 1.1 7B on multilingual benchmarks
256K multilingual BPE tokeniser efficient for non-Latin scripts
Open weights on Hugging Face under CC-BY-NC 4.0
8K context window
Strong instruction following in covered languages
Best for: research and academic multilingual NLP, non-commercial multilingual prototypes.

Training & License

Instruction-tuned on the Aya Collection (513M instances across 114 languages, filtered to the 23 target languages for the Aya 23 variants) plus the Aya Dataset (204K human-curated multilingual prompt-completion pairs from 65 languages, collected through the Aya Annotation Platform), ShareGPT, Dolly and Flan. No public RLHF stage. Base model knowledge cutoff: March 2024.

License: Open weights under CC-BY-NC 4.0 — non-commercial research use only. Commercial use requires the hosted Cohere API under separate commercial terms.

Known limitations

CC-BY-NC license blocks commercial use of the open weights
Narrower language coverage than Aya-101 (23 vs 101 languages)
Short 8K context by 2024 standards
No tool-use or function-calling fine-tune
No vision input

Related Models

Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8