DeepSeek V3.1
DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.
DeepSeek V3.1 is a text and chat AI model from DeepSeek, priced at €0.270 per 1M input tokens, with a 131.1K-token context window.
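The 131.1K figure is 128 × 1024 = 131,072 tokens, which is why the same window is quoted elsewhere as 128K. A minimal sketch of a pre-flight budget check follows. The helper names `fitsContext` and `remainingTokens` are hypothetical, the sketch assumes the prompt and the completion share one window, and the token counts must come from a real tokenizer.

```typescript
// Context window quoted on this page: 128K tokens = 128 * 1024 = 131072.
const CONTEXT_WINDOW_TOKENS = 128 * 1024;

// Hypothetical helper: true if the prompt plus the requested completion
// length fits in the window. Assumes both share the same budget.
function fitsContext(promptTokens: number, maxCompletionTokens: number): boolean {
  if (promptTokens < 0 || maxCompletionTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return promptTokens + maxCompletionTokens <= CONTEXT_WINDOW_TOKENS;
}

// Tokens left for the completion after a prompt of the given size.
function remainingTokens(promptTokens: number): number {
  return Math.max(0, CONTEXT_WINDOW_TOKENS - promptTokens);
}
```

For example, a 130,000-token prompt with `max_tokens: 500` fits, while a 131,000-token prompt leaves only 72 tokens for the reply.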
Pricing
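As a rough illustration of the quoted input price, the sketch below converts a token count into an estimated cost in euros. Only the €0.270 per 1M input tokens figure comes from this page. The function name is hypothetical, and output-token pricing is not stated here, so it is deliberately left out.

```typescript
// Input price quoted on this page, in euros per one million input tokens.
const INPUT_PRICE_EUR_PER_MILLION = 0.27;

// Hypothetical helper: estimated cost of the input side of a request.
// Output tokens are not priced here because the page does not state that rate.
function estimateInputCostEur(inputTokens: number): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative finite number");
  }
  return (inputTokens / 1e6) * INPUT_PRICE_EUR_PER_MILLION;
}

// One million input tokens cost 0.27 EUR.
const costPerMillion = estimateInputCostEur(1e6);
// A prompt that fills the whole 131,072-token window costs about 0.035 EUR.
const costFullWindow = estimateInputCostEur(131072);
```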
API Integration
Use our OpenAI-compatible API to integrate DeepSeek V3.1 into your application.
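Because the API is described as OpenAI-compatible, a request can presumably also be assembled without the SDK. The sketch below only builds such a request and does not send it. The base URL is a placeholder, and the `/chat/completions` path and Bearer authorization header follow the usual OpenAI convention; all three are assumptions to check against the provider's documentation. `buildChatRequest` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Placeholder base URL (assumption): replace with the documented endpoint.
const ASSUMED_BASE_URL = "https://api.example.com/v1";

// Build an OpenAI-style chat completion request for the model ID used on this page.
function buildChatRequest(
  apiKey: string,
  messages: ChatMessage[],
  options: { temperature?: number; max_tokens?: number } = {}
): BuiltRequest {
  if (messages.length === 0) {
    throw new Error("at least one message is required");
  }
  return {
    url: `${ASSUMED_BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "deepseek-v3-1", messages, ...options }),
  };
}

const exampleRequest = buildChatRequest(
  "YOUR_API_KEY",
  [{ role: "user", content: "Hello!" }],
  { temperature: 0.7, max_tokens: 500 }
);
// Passing exampleRequest.url plus { method, headers, body } to fetch() would send it.
```

The SDK route is shorter and is the path this page documents: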
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple — just pass a string
const reply = await rw.run("deepseek-v3-1", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("deepseek-v3-1", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("deepseek-v3-1", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: DeepSeek V3.1
DeepSeek AI was founded in July 2023 in Hangzhou by Liang Wenfeng, who also co-founded the High-Flyer quantitative hedge fund. The fund, with its pre-export-control GPU cluster, backed DeepSeek's training runs. The lab is known for transparent technical reports and an aggressive open-weights strategy under the MIT license. Releases include DeepSeek Coder (Nov 2023), DeepSeek LLM 67B (Jan 2024), DeepSeekMath with GRPO (Feb 2024), DeepSeek V2, which introduced Multi-head Latent Attention (May 2024), DeepSeek V3 (December 2024), trained for a reported ~$5.6M in GPU-hours, DeepSeek R1 (January 2025), and DeepSeek V3.1 (2025), an incremental update that consolidates the base model and R1's reasoning capabilities into a unified hybrid model. The company has roughly 200 researchers and is privately backed by High-Flyer rather than venture capital. Its V3/R1 release triggered a global re-evaluation of frontier-AI training economics and a notable stock-market move in late January 2025.
DeepSeek V3.1 is a 2025 update of the V3 base that unifies chat (non-thinking) and reasoning (thinking) modes into a single hybrid checkpoint. It retains the V3 architecture, a sparse MoE Transformer with 671B total and 37B active parameters using DeepSeekMoE routing and Multi-head Latent Attention, but expands the pretraining corpus and updates the post-training recipe. According to DeepSeek's release notes, V3.1 was continually pretrained on ~840B additional tokens of long-context data, extending effective context handling and improving long-document recall within the 128K window. Post-training merged the V3 chat data with R1-style long-CoT reasoning data plus tool-use and agentic trajectories. V3.1 exposes two operating modes selected via the chat template: 'non-thinking' (V3-style fast responses) and 'thinking' (R1-style chain-of-thought before the answer), letting developers choose per request. Tool use and function calling are first-class and improved over both V3 and R1. The model also includes targeted strengthening on coding, agent benchmarks (SWE-bench, Terminal-Bench), and search-augmented reasoning. Weights are released under the MIT license, and the official DeepSeek API hosts both V3.1 and V3.1-Terminus checkpoints.
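In thinking mode the chain-of-thought precedes the answer, so client code usually needs to separate the two. The sketch below assumes the raw completion marks the end of the reasoning with a `</think>` tag, as R1-style outputs do. That delimiter is an assumption about the raw text, since some hosted APIs return the reasoning in a separate field instead. `splitThinking` is a hypothetical helper.

```typescript
interface SplitOutput {
  reasoning: string; // empty when no reasoning block is present
  answer: string;
}

// Split a raw completion into chain-of-thought and final answer.
function splitThinking(raw: string): SplitOutput {
  const closeTag = "</think>";
  const closeIndex = raw.indexOf(closeTag);
  if (closeIndex === -1) {
    // No delimiter found: treat the whole text as a non-thinking answer.
    return { reasoning: "", answer: raw.trim() };
  }
  // The opening tag may live in the prompt template rather than in the
  // output, so it is stripped only if it is actually present.
  const reasoning = raw.slice(0, closeIndex).replace(/^\s*<think>/, "").trim();
  const answer = raw.slice(closeIndex + closeTag.length).trim();
  return { reasoning, answer };
}
```

Keeping the reasoning out of the message history sent on later turns is a common choice, since it saves input tokens; whether that suits a given application is a design decision rather than a requirement stated here.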
- Parameters: 671B total, 37B active per token (architecture unchanged from V3; pretraining extended for V3.1)
- Context: 128K tokens (131,072)
- Hybrid model: switchable thinking / non-thinking modes in one checkpoint
- 671B-parameter MoE with 37B active per token
- 128K context window, retrained on ~840B additional long-context tokens
- Strong agentic and tool-use performance on SWE-bench Verified and Terminal-Bench
- Function calling and parallel tool calls
- Long-CoT reasoning inherited from R1
- Open weights under MIT license
- DeepSeek API priced at approximately 1/20th the cost of GPT-4o-class models
- Compatible with vLLM, SGLang, llama.cpp, and Hugging Face Transformers
- Improved code editing and diff-format generation
- Best for: budget-conscious agentic workloads, coding, hybrid reasoning, on-prem enterprise.
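To illustrate the function-calling and parallel-tool-call items in the list above, here is a minimal sketch of a client-side dispatcher. It assumes responses use the OpenAI-style `tool_calls` shape (an `id` plus a `function` object whose `arguments` field is a JSON string), which an OpenAI-compatible API would be expected to follow. `runToolCalls` and the handler map are hypothetical names.

```typescript
// Shape of one tool call in an OpenAI-style chat response (assumed).
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Message sent back to the model with the result of one tool call.
interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

// Run all requested tool calls concurrently and return one tool message per
// call, in the same order as the calls were received.
async function runToolCalls(
  calls: ToolCall[],
  handlers: Map<string, ToolHandler>
): Promise<ToolMessage[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolMessage> => {
      let content: string;
      try {
        const handler = handlers.get(call.function.name);
        if (handler === undefined) {
          throw new Error(`unknown tool: ${call.function.name}`);
        }
        const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
        const result = await handler(args);
        content = JSON.stringify(result ?? null);
      } catch (err) {
        // Report the failure to the model instead of aborting the whole batch.
        content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
      }
      return { role: "tool", tool_call_id: call.id, content };
    })
  );
}
```

The returned messages would be appended to the conversation after the assistant message that requested them, followed by another chat call so the model can use the results.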
Training data: V3.1 builds on V3's 14.8T-token base and was then continually pretrained on roughly 840B additional tokens biased toward long-context documents and code. Post-training combines V3 chat data with R1-style long-CoT and agentic tool-use trajectories.
License: MIT license for weights, code and tokenizer; commercial use permitted.
Known limitations
- Sensitive Chinese political topics filtered
- Large memory footprint requires multi-GPU inference
- Text-only inputs (no native vision)
- Knowledge cutoff approximately late 2024
- Hybrid mode switching adds prompt-template complexity
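The memory-footprint limitation above follows from simple arithmetic: all 671B parameters must be resident even though only 37B are active per token, because a mixture-of-experts design reduces compute per token, not the memory needed for weights. The sketch below estimates weight memory only. The 80 GB device size and the precision choices are illustrative assumptions, and KV cache, activations, and framework overhead are ignored, so real deployments need more headroom.

```typescript
const TOTAL_PARAMS = 671e9;  // total parameters quoted on this page
const ACTIVE_PARAMS = 37e9;  // parameters active per token

// Bytes needed to hold the weights alone at a given precision.
function weightBytes(params: number, bytesPerParam: number): number {
  return params * bytesPerParam;
}

// Minimum number of devices whose combined memory holds the weights.
// Lower bound only: excludes KV cache, activations, and runtime overhead.
function minDevicesForWeights(
  params: number,
  bytesPerParam: number,
  deviceMemoryGB: number
): number {
  const bytesPerDevice = deviceMemoryGB * 1e9;
  return Math.ceil(weightBytes(params, bytesPerParam) / bytesPerDevice);
}

const fp8WeightsGB = weightBytes(TOTAL_PARAMS, 1) / 1e9;            // 671 GB at 1 byte/param
const bf16WeightsGB = weightBytes(TOTAL_PARAMS, 2) / 1e9;           // 1342 GB at 2 bytes/param
const fp8Devices80GB = minDevicesForWeights(TOTAL_PARAMS, 1, 80);   // 9 devices
const bf16Devices80GB = minDevicesForWeights(TOTAL_PARAMS, 2, 80);  // 17 devices
const activeFraction = ACTIVE_PARAMS / TOTAL_PARAMS;                // about 5.5% of weights per token
```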
Related Models
Claude Opus 4
Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.
Claude Sonnet 4
Anthropic's most capable model. Excellent for complex analysis, coding, math, and creative writing.
DeepSeek V4 Pro
DeepSeek's April 2026 flagship. 1.6T MoE / 49B active params, 1M context, rivals top closed-source models on STEM and coding at a fraction of the price.
GPT-4.1
OpenAI's newest flagship model. Improved reasoning, instruction following, and coding over GPT-4o.
Start using DeepSeek V3.1 today
Get started with free credits. No credit card required. Access DeepSeek V3.1 and 100+ other models through a single API.