How much does DeepSeek Coder V2 cost via Railwail?

Input: €1.40 per 1M tokens. Output: €2.80 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
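
As a rough illustration of the per-token pricing above, the following sketch estimates the cost of a single request. The function name and the example token counts are hypothetical; only the two rates come from this page, and actual billing may round or meter differently.

```python
# Rates quoted on this page (EUR per 1M tokens).
INPUT_EUR_PER_MILLION = 1.40
OUTPUT_EUR_PER_MILLION = 2.80


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in EUR (hypothetical helper)."""
    total = (
        input_tokens * INPUT_EUR_PER_MILLION
        + output_tokens * OUTPUT_EUR_PER_MILLION
    )
    return total / 1_000_000


# Example: a 50,000-token prompt with a 2,000-token reply.
print(f"{estimate_cost_eur(50_000, 2_000):.4f} EUR")  # 0.0756 EUR
```

At these rates, the €5 of free credits would cover roughly 66 such requests.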

What is the context window of DeepSeek Coder V2?

DeepSeek Coder V2 supports a 128K-token context window, enough for long books, technical manuals, and extended analysis.

How fast is DeepSeek Coder V2?

Average response latency: 2.0s (p50 across recent Railwail traffic). See live p50/p95 metrics on /rankings.

Is DeepSeek Coder V2 better than Codestral?

It depends on your use case. DeepSeek Coder V2 (DeepSeek) and Codestral (Mistral AI) are both strong choices for code. Compare them side-by-side at /compare/deepseek-coder-v2-vs-codestral.

DeepSeek Coder V2

DeepSeek

Code

DeepSeek's specialized coding model. Excellent at code generation, debugging, and explanation.

TL;DR · Last updated March 4, 2026

DeepSeek Coder V2 is a code AI model from DeepSeek, priced at €1.40 per 1M input tokens and €2.80 per 1M output tokens, with a 128K-token context window.

Examples

See what DeepSeek Coder V2 can generate

Python Binary Search

Write an efficient binary search function in Python that returns the index of a target value or -1 if not found

def binary_search(arr: list[int], target: int) -> int:
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# Example usage:
# sorted_list = [1, 3, 5, 7, 9, 11, 13]
# print(binary_search(sorted_list, 7))  # Output: 3

React Custom Hook

Create a React custom hook called useDebounce that debounces a value by a given delay

import { useState, useEffect } from 'react';

function useDebounce<T>(value: T, delay: number): T {
  const [debouncedValue, setDebouncedValue] = useState<T>(value);

  useEffect(() => {
    const timer = setTimeout(() => {
      setDebouncedValue(value);
    }, delay);

    return () => {
      clearTimeout(timer);
    };
  }, [value, delay]);

  return debouncedValue;
}

export default useDebounce;

// Usage:
// const debouncedSearch = useDebounce(searchTerm, 300);

Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate DeepSeek Coder V2 into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("deepseek-coder-v2", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("deepseek-coder-v2", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("deepseek-coder-v2", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Context window: 128,000 tokens
Max output: 8,192 tokens
Avg. latency: 2.0s
Developer: DeepSeek
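
The context-window and max-output figures above can be turned into a simple prompt budget check. This is a minimal sketch that assumes the prompt and the completion share the same 128,000-token window, which is common but not confirmed on this page; the helper names are hypothetical.

```python
CONTEXT_WINDOW = 128_000  # tokens, from the specifications above
MAX_OUTPUT = 8_192        # tokens, from the specifications above


def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Tokens left for the prompt after reserving room for the reply.

    Assumes prompt and completion share one context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def prompt_fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the prompt leaves enough room for the reserved reply."""
    return prompt_tokens <= max_input_tokens(reserved_output)


print(max_input_tokens())    # 119808
print(prompt_fits(120_000))  # False
```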

Deep dive — DeepSeek's DeepSeek Coder V2

About DeepSeek

Founded 2023 · Hangzhou, China

DeepSeek AI was founded in July 2023 in Hangzhou by Liang Wenfeng, who is also co-founder of the High-Flyer quantitative hedge fund. High-Flyer's GPU cluster (thousands of NVIDIA A100/H800 cards stockpiled before US export controls tightened) bootstrapped DeepSeek's training capacity. The lab is known globally for highly efficient training recipes documented in transparent technical reports. The DeepSeek-Coder line started with V1 (1.3B-33B dense models, November 2023). DeepSeek-Coder V2, released June 2024, brought MoE scaling: a 236B-total, 21B-active model that matched or exceeded GPT-4 Turbo on code benchmarks at release. A smaller 'Lite' sibling (16B total parameters, 2.4B active) was also released. DeepSeek's later DeepSeek-V3 (December 2024) and DeepSeek-R1 (January 2025) absorbed many DeepSeek-Coder design lessons. The DeepSeek-Coder weights are released under the DeepSeek License, which grants broad commercial-use rights.

Architecture

Sparse Mixture-of-Experts Transformer for code (DeepSeekMoE + Multi-head Latent Attention)

DeepSeek-Coder V2 is a Sparse Mixture-of-Experts Transformer using the DeepSeekMoE architecture: 160 fine-grained routed experts plus 2 shared 'always-on' experts per MoE layer, with top-6 routing among the 160. This fine-grained plus shared design (introduced in the DeepSeek-MoE paper) gives better expert specialisation than coarse 8x or 16x MoEs at similar active-parameter budgets. The model has 60 layers and a hidden size of 5,120, uses Multi-head Latent Attention (MLA) for a memory-efficient KV cache, RoPE position embeddings with a 128K context extension, and SwiGLU activations. The DeepSeek BPE tokeniser, with a vocabulary of roughly 100,000 tokens, is shared across the DeepSeek family. Training continued from an intermediate DeepSeek-V2 checkpoint and added 6T more tokens of code, math and natural-language data. Programming language coverage is 338 languages, broader than any contemporaneous open code model. The model supports fill-in-the-middle via `<|fim_begin|>`, `<|fim_hole|>` and `<|fim_end|>` tokens. Post-training uses supervised fine-tuning plus DeepSeek's Group Relative Policy Optimisation (GRPO) RL method. A smaller 'Lite' variant (DeepSeek-Coder-V2-Lite, 16B total parameters with 2.4B active) is also released. Open weights are released under the permissive DeepSeek License Agreement, which allows commercial use.

Parameters: 236B total, 21B active per token (160 fine-grained experts + 2 shared, top-6 routing)
Context: 128K tokens
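
To make the routing figures above concrete, here is a toy sketch of top-6 selection among 160 routed experts, with 2 shared experts that always run. It is not DeepSeek's implementation: the random router scores, the function names, and the use of raw softmax probabilities as gate weights are all illustrative assumptions.

```python
import math
import random

NUM_ROUTED_EXPERTS = 160  # routed experts per MoE layer (from the text)
NUM_SHARED_EXPERTS = 2    # shared experts that run for every token
TOP_K = 6                 # routed experts selected per token


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, gate_weight) pairs for the top_k routed experts."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:top_k]]


# Stand-in router scores for one token; a real router computes these
# from the token's hidden state.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]

selected = route_token(logits)
experts_run = NUM_SHARED_EXPERTS + len(selected)
total_experts = NUM_SHARED_EXPERTS + NUM_ROUTED_EXPERTS
print(f"experts evaluated for this token: {experts_run} of {total_experts}")
```

Only 8 of the 162 experts in a layer run for any given token, which is why the active parameter count (21B) is a small fraction of the total (236B).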

What it can do

236B total / 21B active — MoE cheaper to serve than 200B+ dense alternatives
Matched or exceeded GPT-4 Turbo on HumanEval, MBPP, LiveCodeBench at release
Supports 338 programming languages — broader than Codestral, CodeLlama or any peer at release
128K context for whole-repo and large-file reasoning
Native fill-in-the-middle (FIM) tokens for IDE completion
Multi-head Latent Attention for memory-efficient inference
Strong general reasoning and math from the V2 base — not just code
Open weights under permissive DeepSeek License (commercial use allowed)
Best for: high-quality code generation, repository-scale reasoning, polyglot codebases, self-hosted production code AI.
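
The fill-in-the-middle capability listed above is typically used by wrapping the code before and after the cursor in sentinel tokens. The sketch below assembles such a prompt under two assumptions: the token strings are copied as written on this page (the tokenizer's actual spelling may differ, so check the model card), and the begin, prefix, hole, suffix, end ordering is a commonly used layout rather than something this page confirms. The helper name is hypothetical.

```python
# Sentinel tokens as written on this page; verify the exact spelling
# against the model's tokenizer before relying on them.
FIM_BEGIN = "<|fim_begin|>"
FIM_HOLE = "<|fim_hole|>"
FIM_END = "<|fim_end|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt (assumed ordering).

    The model is expected to generate the text that belongs at the hole.
    """
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(1, 2))\n"
print(build_fim_prompt(prefix, suffix))
```

An IDE integration would send this string as a raw completion prompt and insert the model's output at the cursor position.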

Training & License

Continued pretraining from an intermediate DeepSeek-V2 checkpoint. Added 6T tokens of code-and-math-heavy data: source code across 338 languages (60% of the additional mix), math corpus (10%) and natural-language corpus (30%). Knowledge cutoff November 2023. Post-training is supervised fine-tuning plus GRPO RL on reasoning and code benchmarks.

License: DeepSeek License Agreement. Permissive commercial license with standard acceptable-use restrictions. No revenue threshold or separate-licence requirement — among the most liberal frontier-scale open-weights licenses.

Known limitations

236B total weights need ~470GB FP16 (~120GB INT4) — heavy for self-hosting
Superseded on many benchmarks by DeepSeek-V3 and later DeepSeek releases
Latency higher than smaller dense code models for short completion calls
MoE routing means fewer inference engines support it cleanly
No vision modality
Filters politically sensitive Chinese topics consistent with regulations
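
The self-hosting figures in the first limitation follow from simple arithmetic on the parameter count. This back-of-the-envelope sketch covers weights only; it ignores the KV cache, activations, and any quantisation overhead, so real deployments need more memory than shown.

```python
TOTAL_PARAMS = 236e9   # total parameters, from this page
ACTIVE_PARAMS = 21e9   # parameters active per token, from this page


def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


fp16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
int4_gb = weight_memory_gb(TOTAL_PARAMS, 4)
active_share = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"FP16: ~{fp16_gb:.0f} GB, INT4: ~{int4_gb:.0f} GB")  # FP16: ~472 GB, INT4: ~118 GB
print(f"active per token: {active_share:.1%}")              # active per token: 8.9%
```

All 236B weights must be resident even though only about 9% are used per token, which is why the model is cheap to run per token but heavy to host.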

Related Models

Codestral

Mistral AI

Mistral's code-specialized model. Optimized for code generation, completion, and understanding across 80+ languages.

Free

Code Llama 13B Instruct

Code Llama 34B Instruct

Code Llama 70B Instruct

Start using DeepSeek Coder V2 today

Get started with free credits. No credit card required. Access DeepSeek Coder V2 and 100+ other models through a single API.