DeepSeek V3.1

Popular
DeepSeek
Text & Chat

DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.

TL;DR · Last updated May 16, 2026

DeepSeek V3.1 is a text & chat AI model from DeepSeek, priced at €0.270 per 1M input tokens, with a 131.1K-token context window.
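To make the quoted input price concrete, here is a minimal sketch that estimates the input-side cost of a single prompt. The helper name `estimateInputCostEur` is hypothetical, and the sketch covers input tokens only, since this page does not state an output-token price.

```typescript
// Figures quoted on this page.
const INPUT_PRICE_EUR_PER_MILLION = 0.27; // EUR per 1M input tokens
const CONTEXT_WINDOW_TOKENS = 131072;

// Hypothetical helper: estimates the input-side cost only.
// Output-token pricing is not stated on this page, so it is not modeled.
function estimateInputCostEur(inputTokens: number): number {
  if (!Number.isInteger(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative integer");
  }
  if (inputTokens > CONTEXT_WINDOW_TOKENS) {
    throw new RangeError("prompt exceeds the context window");
  }
  return (inputTokens / 1000000) * INPUT_PRICE_EUR_PER_MILLION;
}

// Cost of a prompt that fills the whole context window.
console.log(estimateInputCostEur(CONTEXT_WINDOW_TOKENS).toFixed(4)); // "0.0354"
```

Under these figures, even a prompt that fills the entire window costs under four euro cents on the input side.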


Pricing

Price per Generation
Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate DeepSeek V3.1 into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple: just pass a string
const reply = await rw.run("deepseek-v3-1", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("deepseek-v3-1", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("deepseek-v3-1", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
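Because the API is described as OpenAI-compatible, the same call can in principle be made without the SDK. The sketch below only builds the request and does not send it. The base URL is a placeholder, not a documented endpoint, and `buildChatRequest` is a hypothetical helper. The `/chat/completions` path and Bearer authorization follow the usual OpenAI convention and are assumed to apply here.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatOptions {
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
}

// ASSUMPTION: placeholder base URL. Replace it with the provider's real endpoint.
const BASE_URL = "https://api.example.com/v1";

// Hypothetical helper: builds an OpenAI-style chat completion request.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  options: ChatOptions = {},
) {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, ...options }),
    },
  };
}

const req = buildChatRequest(
  "YOUR_API_KEY",
  "deepseek-v3-1",
  [{ role: "user", content: "Hello!" }],
  { temperature: 0.7, max_tokens: 500 },
);

// To send it: const res = await fetch(req.url, req.init);
console.log(req.url);
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test before any credits are spent.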
Specifications
Context window: 131,072 tokens
Max output: 8,192 tokens
Developer: DeepSeek
Category: Text & Chat
Supported formats: text
Tags: deepseek, open-weights, moe, coding, reasoning
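The two limits in the specifications interact when choosing `max_tokens`. The following sketch clamps a requested output length to both. It assumes that prompt and output tokens share the 131,072-token window, which is typical for chat models but is not stated on this page. The helper name `clampMaxTokens` is hypothetical.

```typescript
// Limits from the specifications on this page.
const CONTEXT_WINDOW = 131072;
const MAX_OUTPUT = 8192;

// Hypothetical helper. ASSUMPTION: prompt and output share the context window.
function clampMaxTokens(promptTokens: number, requested: number): number {
  if (promptTokens < 0 || requested < 1) {
    throw new RangeError("promptTokens must be >= 0 and requested must be >= 1");
  }
  const remaining = CONTEXT_WINDOW - promptTokens;
  if (remaining < 1) {
    throw new RangeError("prompt leaves no room for output");
  }
  // The tightest of: what was asked for, the output cap, and the space left.
  return Math.min(requested, MAX_OUTPUT, remaining);
}

console.log(clampMaxTokens(1000, 500));    // 500
console.log(clampMaxTokens(1000, 20000));  // 8192
console.log(clampMaxTokens(131000, 500));  // 72
```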

Deep dive: DeepSeek V3.1

About DeepSeek
Founded 2023 · Hangzhou, China

DeepSeek AI was founded in July 2023 in Hangzhou by Liang Wenfeng, who also co-founded the quantitative hedge fund High-Flyer. The fund's GPU cluster, acquired before export controls took effect, financed DeepSeek's training runs. The lab is known for transparent technical reports and an aggressive open-weights strategy under the MIT license. Releases include DeepSeek Coder (Nov 2023), DeepSeek LLM 67B (Jan 2024), DeepSeekMath with GRPO (Feb 2024), DeepSeek V2, which introduced Multi-head Latent Attention (May 2024), DeepSeek V3 (December 2024), trained for a reported ~$5.6M in GPU-hours, DeepSeek R1 (January 2025), and DeepSeek V3.1 (2025), an incremental update that consolidates the base model and R1's reasoning capabilities into a unified hybrid model. The company has roughly 200 researchers and is privately backed by High-Flyer rather than venture capital. The V3 and R1 releases triggered a global re-evaluation of frontier-AI training economics and a notable stock-market move in late January 2025.

Architecture
Sparse Mixture-of-Experts Transformer (hybrid base + thinking modes)

DeepSeek V3.1 is a 2025 update of the V3 base that unifies chat (non-thinking) and reasoning (thinking) modes into a single hybrid checkpoint. It retains the V3 architecture - a Sparse MoE Transformer with 671B total and 37B active parameters using DeepSeekMoE routing and Multi-head Latent Attention - but expands the pretraining corpus and updates the post-training recipe. According to DeepSeek's release notes, V3.1 was continually pretrained on ~840B additional tokens of long-context data, extending effective context handling and improving long-document recall within the 128K window. Post-training merged the V3 chat data with R1-style long-CoT reasoning data plus tool-use and agentic trajectories. V3.1 exposes two operating modes selected via the chat template: 'non-thinking' (V3-style fast responses) and 'thinking' (R1-style chain-of-thought before the answer), letting developers choose per request. Tool use and function calling are first-class and improved over both V3 and R1. The model also includes targeted strengthening on coding, agent benchmarks (SWE-bench, Terminal-Bench), and search-augmented reasoning. Weights are released under MIT license and the official DeepSeek API hosts both V3.1 and V3.1-Terminus checkpoints.
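To illustrate why only a fraction of a sparse MoE model's parameters are active per token, here is a toy top-k router. It is a generic softmax top-k sketch, not DeepSeekMoE's actual routing scheme, which this page names but does not describe. The function names and scores are made up for illustration.

```typescript
// Numerically stable softmax over raw router scores.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

interface RoutedExpert {
  expert: number; // index of the selected expert
  weight: number; // mixing weight, renormalized over the selected experts
}

// Generic top-k routing (illustrative only): keep the k highest-scoring
// experts for a token and renormalize their weights to sum to 1.
function routeTopK(scores: number[], k: number): RoutedExpert[] {
  if (k < 1 || k > scores.length) {
    throw new RangeError("k must be between 1 and the number of experts");
  }
  const top = softmax(scores)
    .map((p, i) => ({ expert: i, weight: p }))
    .sort((a, b) => b.weight - a.weight)
    .slice(0, k);
  const total = top.reduce((acc, e) => acc + e.weight, 0);
  return top.map((e) => ({ expert: e.expert, weight: e.weight / total }));
}

// Four toy experts, two active per token.
const routed = routeTopK([0.1, 2.0, -1.0, 1.5], 2);
console.log(routed.map((e) => e.expert)); // [1, 3]

// Share of parameters active per token, using the figures quoted above.
console.log(((37 / 671) * 100).toFixed(1) + "%"); // "5.5%"
```

Only the selected experts run for a given token, which is how a 671B-parameter model can operate with roughly 37B parameters active at a time.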

Parameters: 671B total, 37B active per token (unchanged from V3)
Context: 128K tokens
What it can do
  • Hybrid model: switchable thinking / non-thinking modes in one checkpoint
  • 671B-parameter MoE with 37B active per token
  • 128K context window, continually pretrained on ~840B additional long-context tokens
  • Strong agentic and tool-use performance on SWE-bench Verified and Terminal-Bench
  • Function calling and parallel tool calls
  • Long-CoT reasoning inherited from R1
  • Open weights under MIT license
  • DeepSeek API approximately 1/20th the cost of GPT-4o-class models
  • Compatible with vLLM, SGLang, llama.cpp, HuggingFace
  • Improved code editing and diff-format generation
Best for: budget-conscious agentic workloads, coding, hybrid reasoning, and on-prem enterprise deployments.
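The function-calling and parallel tool-call capabilities listed above are typically consumed on the client side by dispatching each requested call and returning one `tool` message per call. The sketch below assumes the OpenAI-style `tool_calls` shape, where `arguments` is a JSON string. The two tools and the name `dispatchToolCalls` are hypothetical.

```typescript
// OpenAI-style tool call as returned in an assistant message (assumed shape).
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical tools, for illustration only.
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) => ({ city: args.city, forecast: "sunny" }),
  add: (args) => Number(args.a) + Number(args.b),
};

// Runs every requested call and returns one tool message per call.
// Errors are reported back to the model instead of being thrown.
function dispatchToolCalls(
  calls: ToolCall[],
  registry: Record<string, ToolHandler>,
): ToolMessage[] {
  return calls.map((call): ToolMessage => {
    const handler = registry[call.function.name];
    let content: string;
    if (!handler) {
      content = JSON.stringify({ error: `unknown tool: ${call.function.name}` });
    } else {
      try {
        content = JSON.stringify(handler(JSON.parse(call.function.arguments)));
      } catch (err) {
        content = JSON.stringify({ error: String(err) });
      }
    }
    return { role: "tool", tool_call_id: call.id, content };
  });
}

// Two calls requested in a single assistant turn.
const toolMessages = dispatchToolCalls(
  [
    { id: "call_1", type: "function", function: { name: "add", arguments: '{"a":2,"b":3}' } },
    { id: "call_2", type: "function", function: { name: "get_weather", arguments: '{"city":"Hangzhou"}' } },
  ],
  handlers,
);
console.log(toolMessages[0].content); // "5"
```

Each returned message carries the originating `tool_call_id`, so the results can be appended to the conversation and matched to the calls that produced them.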
Training & License

Built on V3's 14.8T-token base, then continually pretrained on roughly 840B additional tokens biased toward long-context documents and code. Post-training combines V3 chat data with R1-style long-CoT and agentic tool-use trajectories.
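As a quick check on the scale of the continued pretraining, the arithmetic below uses only the two token counts quoted above.

```typescript
// Token counts quoted in the training description.
const BASE_TOKENS = 14.8e12;  // V3 base corpus
const EXTRA_TOKENS = 840e9;   // V3.1 continued pretraining

const totalTokens = BASE_TOKENS + EXTRA_TOKENS;
const extraShare = EXTRA_TOKENS / BASE_TOKENS;

console.log((totalTokens / 1e12).toFixed(2) + "T tokens in total");     // "15.64T tokens in total"
console.log((extraShare * 100).toFixed(1) + "% on top of the V3 base"); // "5.7% on top of the V3 base"
```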

License: MIT license for weights, code and tokenizer; commercial use permitted.

Known limitations
  • Sensitive Chinese political topics filtered
  • Large memory footprint requires multi-GPU inference
  • Text-only inputs (no native vision)
  • Knowledge cutoff approximately late 2024
  • Hybrid mode switching adds prompt-template complexity
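The multi-GPU requirement follows from simple arithmetic on the parameter count. The sketch below estimates weight memory only. It ignores the KV cache, activations, and framework overhead, so real deployments need more. The 80 GB GPU size and the bytes-per-parameter values are illustrative assumptions, not figures from this page.

```typescript
// Total parameter count quoted on this page.
const TOTAL_PARAMS = 671e9;

// Weight memory in GB (10^9 bytes) at a given numeric precision.
function weightMemoryGB(bytesPerParam: number): number {
  return (TOTAL_PARAMS * bytesPerParam) / 1e9;
}

// Lower bound on GPU count needed just to hold the weights.
// ASSUMPTION: weights only, with no KV cache, activations, or overhead.
function minGpusForWeights(bytesPerParam: number, gpuMemoryGB: number): number {
  return Math.ceil(weightMemoryGB(bytesPerParam) / gpuMemoryGB);
}

console.log(weightMemoryGB(1));        // 671 (GB at 1 byte per parameter)
console.log(minGpusForWeights(1, 80)); // 9
console.log(minGpusForWeights(2, 80)); // 17
```

Even at one byte per parameter, the weights alone exceed the memory of a single 80 GB accelerator many times over, which is why all experts must be resident across several devices even though only 37B parameters are active per token.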


Start using DeepSeek V3.1 today

Get started with free credits. No credit card required. Access DeepSeek V3.1 and 100+ other models through a single API.