Snowflake Arctic Instruct
Snowflake's open MoE model: 480B total / 17B active params with dense+MoE hybrid architecture.
Snowflake Arctic Instruct is a text & chat AI model from Custom, priced at €0.000 per 1M input tokens, with a 4.1K-token context window.
Pricing
API Integration
Use our OpenAI-compatible API to integrate Snowflake Arctic Instruct into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple: just pass a string
const reply = await rw.run("arctic-instruct", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("arctic-instruct", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("arctic-instruct", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Snowflake AI Research's Snowflake Arctic Instruct
Snowflake AI Research is the applied-AI team inside the Snowflake data cloud company, established as a focused unit in 2023 under Yuxiong He (former Microsoft DeepSpeed lead) and Samyam Rajbhandari. Snowflake itself is a public NYSE company (founded 2012) headquartered in Bozeman, Montana with engineering primarily in California. The AI Research team built Arctic to demonstrate that enterprise-targeted LLMs (SQL, code, instruction following) could be trained cheaply by combining many-small-expert MoE designs with carefully curated training data. Arctic Instruct was released in April 2024 under Apache 2.0, with the team publicly reporting a training cost of approximately $2M, a fraction of comparable Western frontier runs.
Snowflake Arctic Instruct is a hybrid dense-MoE transformer released in April 2024 under Apache 2.0. The architecture combines a 10B dense transformer with a 128-way MoE component (128 experts of ~3.66B parameters each, top-2 routing) for a total of 480B parameters and 17B active per forward pass. Compared to peer MoEs (Mixtral 8x22B, DBRX), Arctic uses many more, smaller experts and a high ratio of expert count to active parameters, a deliberate choice to maximise specialisation for enterprise SQL and code tasks at low active compute. Training used a three-stage curriculum on roughly 3.5T tokens of web, code, GitHub, StackExchange and Snowflake-curated enterprise SQL data, with the final stage heavily oversampling SQL and structured outputs. Snowflake reported a total training compute cost of approximately $2M on a cluster of around 3,200 H100 GPUs. Post-training was supervised fine-tuning plus DPO on enterprise instruction data; no full RLHF pipeline was reported.
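The top-2 routing described above can be illustrated with a small sketch. This is not Snowflake's implementation: the function names (top2Route, hybridLayer), the router scores, and the way the dense path and the expert outputs are summed are all assumptions made for illustration. It only shows the general idea that each token activates two of the experts, weighted by a softmax over their router scores, alongside an always-on dense path.

```javascript
// Illustrative top-2 gating (hypothetical helper, not Snowflake's code).
// `logits` holds one router score per expert for a single token.
function top2Route(logits) {
  // Find the indices of the two largest router scores in one pass.
  let first = -1;
  let second = -1;
  for (let i = 0; i < logits.length; i++) {
    if (first === -1 || logits[i] > logits[first]) {
      second = first;
      first = i;
    } else if (second === -1 || logits[i] > logits[second]) {
      second = i;
    }
  }
  // Softmax restricted to the two selected experts, so the weights sum to 1.
  const m = Math.max(logits[first], logits[second]);
  const e1 = Math.exp(logits[first] - m);
  const e2 = Math.exp(logits[second] - m);
  return [
    { expert: first, weight: e1 / (e1 + e2) },
    { expert: second, weight: e2 / (e1 + e2) },
  ];
}

// Assumed hybrid combination: dense output plus the weighted outputs of the
// two routed experts. `dense` and each entry of `experts` are hypothetical
// functions mapping number[] -> number[] of the same length.
function hybridLayer(x, dense, experts, logits) {
  const out = dense(x).slice();
  for (const { expert, weight } of top2Route(logits)) {
    const y = experts[expert](x);
    for (let i = 0; i < out.length; i++) out[i] += weight * y[i];
  }
  return out;
}
```

Only the two selected experts run for a given token, which is why the active parameter count stays far below the total even with 128 experts.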
- Parameters: 480B total, 17B active per token (10B dense + 128 x 3.66B experts, top-2 routing)
- Context: 4.1K tokens
- Hybrid 10B-dense + 128-expert MoE architecture
- 17B active parameters out of 480B total: cheap inference for an MoE this large
- Strong text-to-SQL performance (matches or beats Llama 3 70B on Spider, BIRD)
- Solid enterprise code generation in Python, Java, SQL
- Instruction following tuned for structured (JSON) outputs
- Apache 2.0 open weights: fully permissive commercial use
- Reported $2M training cost: a milestone in cheap-to-train frontier MoE
- Best for: enterprise SQL generation, structured extraction, self-hosted business assistants.
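The headline parameter counts follow from the figures quoted above. The short calculation below uses only the rounded numbers from this page (10B dense, 128 experts of about 3.66B each, top-2 routing), so the results are approximate; the variable names are illustrative.

```javascript
// Rounded figures taken from the spec list above.
const denseParams = 10e9;    // dense transformer
const expertParams = 3.66e9; // parameters per expert (approximate)
const numExperts = 128;
const activeExperts = 2;     // top-2 routing

// Total: dense path plus every expert.
const totalParams = denseParams + numExperts * expertParams;
// Active per token: dense path plus only the two routed experts.
const activeParams = denseParams + activeExperts * expertParams;

console.log((totalParams / 1e9).toFixed(1));                  // 478.5 (quoted as ~480B)
console.log((activeParams / 1e9).toFixed(1));                 // 17.3  (quoted as ~17B)
console.log(((100 * activeParams) / totalParams).toFixed(1)); // 3.6   (% of weights used per token)
```

Roughly 3.6% of the weights are used for any single token, which is the basis for the "cheap inference" claim, although all 480B parameters still have to be held in memory when self-hosting.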
Pretrained on approximately 3.5 trillion tokens in a three-stage curriculum: broad web data, code-and-math-heavy mix, and a final SQL-and-enterprise-instruction-heavy phase. Sources include filtered Common Crawl, GitHub, StackExchange, books, and Snowflake-curated enterprise SQL corpora. Knowledge cutoff is early 2024. Post-training is supervised fine-tuning plus DPO on enterprise instruction data.
License: Apache 2.0 for both base and Instruct weights. Commercial use, redistribution and modification permitted without royalty.
Known limitations
- Very short 4K context window for a 2024 release
- Weaker on open-ended creative writing than Llama 3 70B Instruct
- Multilingual support limited: trained primarily on English
- Reasoning chains shorter than dedicated reasoning models
- No vision or audio input
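Because the context window is so short, long conversations need to be trimmed before they are sent. The sketch below is one possible approach, not part of the railwail SDK: estimateTokens and trimHistory are hypothetical helpers, and the 4-characters-per-token estimate is a rough heuristic for English text rather than Arctic's actual tokenizer.

```javascript
// Rough token estimate: about 4 characters per token is a common heuristic
// for English text. This is an assumption, not Arctic's real tokenizer.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep all system messages plus the most recent turns that fit the budget.
function trimHistory(messages, maxPromptTokens) {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((n, m) => n + estimateTokens(m.content), 0);
  const kept = [];
  // Walk backwards so the newest turns are retained first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > maxPromptTokens) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

For example, assuming the 4.1K figure corresponds to 4,096 tokens and the request sets max_tokens to 500, calling trimHistory(history, 4096 - 500) leaves room for the reply. A safety margin is advisable since the estimate is approximate.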
Frequently asked questions