GPT-5.4 Mini

OpenAI's efficient mid-tier model: 2x faster than its predecessor, GPT-5 mini, with a 400K-token context window and near-GPT-5.4 quality on SWE-Bench Pro at a fraction of the cost.

TL;DR · Last updated May 16, 2026

GPT-5.4 Mini is a multimodal AI model from OpenAI, priced at €0.001 per 1M input tokens, with a 400K-token context window.
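
As a rough illustration of what the quoted input price means in practice, the sketch below estimates the input-token cost of a request. The function name `estimateInputCostEur` and the optional `regionalUplift` parameter are assumptions made for illustration. The per-token rate is the figure quoted above; output-token pricing is not stated here, so it is left out.

```typescript
// Hypothetical helper: estimates input cost from the rate quoted on this page.
// Output-token pricing is not stated in this section, so it is not modelled.
const EUR_PER_MILLION_INPUT_TOKENS = 0.001; // figure quoted above

function estimateInputCostEur(
  inputTokens: number,
  regionalUplift: number = 0, // e.g. 0.10 for a 10% uplift (assumption: applied multiplicatively)
): number {
  if (inputTokens < 0 || regionalUplift < 0) {
    throw new RangeError("inputTokens and regionalUplift must be non-negative");
  }
  const base = (inputTokens / 1_000_000) * EUR_PER_MILLION_INPUT_TOKENS;
  return base * (1 + regionalUplift);
}

// Filling the whole 400K-token context window once:
console.log(estimateInputCostEur(400_000).toFixed(6)); // "0.000400"
```

Whether a regional uplift applies to this rate is not confirmed by the page, so the parameter defaults to zero.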

About this model

Released March 17, 2026, GPT-5.4 mini brings the strengths of GPT-5.4 to a smaller, faster model designed for the subagent era. It offers a 400K-token context window, vision input and integrated tool use, and runs 2x faster than GPT-5 mini. It shows significant gains in coding, reasoning, multimodal understanding and tool use, and approaches the full GPT-5.4 on SWE-Bench Pro and OSWorld-Verified. Recommended for subagent workflows, customer-facing chat, coding assistants and high-volume API workloads.
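
A minimal sketch of the subagent pattern mentioned above: an orchestrator splits work into subtasks and fans them out to a cheaper model with bounded concurrency. `ModelCall`, `runSubagents` and the default limit of 4 are hypothetical names and values chosen for illustration. The model call is injected, so any client can be plugged in.

```typescript
// Hypothetical types and names; the model call is injected so the sketch stays client-agnostic.
type ModelCall = (prompt: string) => Promise<string>;

// Runs one model call per subtask, at most `limit` at a time, preserving input order.
async function runSubagents(
  subtasks: string[],
  callModel: ModelCall,
  limit: number = 4,
): Promise<string[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results: string[] = new Array<string>(subtasks.length);
  let next = 0; // index of the next unclaimed subtask

  async function worker(): Promise<void> {
    while (next < subtasks.length) {
      const i = next++; // claimed synchronously, before any await
      results[i] = await callModel(subtasks[i]);
    }
  }

  const workerCount = Math.min(limit, subtasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

If any single call rejects, the whole batch rejects. A production orchestrator would probably want per-subtask error handling and retries, which this sketch omits.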

Pricing

Price per Generation
Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate GPT-5.4 Mini into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("gpt-5-4-mini", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("gpt-5-4-mini", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("gpt-5-4-mini", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
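
Because the API is described as OpenAI-compatible, the same call can presumably be expressed as a plain Chat Completions request body. The sketch below only builds that payload. `buildChatRequest` is a hypothetical helper, and the endpoint URL and auth header are not given in this section, so they are left out.

```typescript
// Message and request shapes follow the OpenAI Chat Completions format.
type Role = "system" | "user" | "assistant";
interface ChatMessage { role: Role; content: string; }
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
}

// Hypothetical helper: accepts a bare string or a message array, like `rw.run` above.
function buildChatRequest(
  model: string,
  input: string | ChatMessage[],
  options: { temperature?: number; max_tokens?: number } = {},
): ChatRequest {
  const messages: ChatMessage[] =
    typeof input === "string" ? [{ role: "user", content: input }] : input;
  if (messages.length === 0) throw new Error("at least one message is required");
  return { model, messages, ...options };
}

const requestBody = buildChatRequest("gpt-5-4-mini", "Hello!", {
  temperature: 0.7,
  max_tokens: 500,
});
console.log(JSON.stringify(requestBody, null, 2));
```
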
Specifications
Context window: 400,000 tokens
Max output: 128,000 tokens
Developer: OpenAI
Category: Multimodal
Supported formats: text, image
Tags: openai, balanced, cost-efficient, vision, subagents, tools
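
The context-window and max-output limits listed above suggest a simple pre-flight check before sending a long prompt. This is a sketch under an assumption the page does not confirm: that prompt tokens and requested output tokens share the 400,000-token window. `checkTokenBudget` is a hypothetical name, and the token counts would come from whatever tokenizer you use.

```typescript
const CONTEXT_WINDOW = 400_000; // tokens, from the specifications above
const MAX_OUTPUT = 128_000;     // tokens, from the specifications above

interface BudgetCheck { ok: boolean; reason?: string; remaining: number; }

// Assumption: prompt tokens plus requested output tokens must fit in one context window.
function checkTokenBudget(promptTokens: number, maxOutputTokens: number): BudgetCheck {
  const remaining = CONTEXT_WINDOW - promptTokens - maxOutputTokens;
  if (maxOutputTokens > MAX_OUTPUT) {
    return { ok: false, reason: "requested output exceeds the max output limit", remaining };
  }
  if (remaining < 0) {
    return { ok: false, reason: "prompt plus output exceeds the context window", remaining };
  }
  return { ok: true, remaining };
}

console.log(checkTokenBudget(300_000, 128_000).ok);        // false: 428,000 > 400,000
console.log(checkTokenBudget(250_000, 100_000).remaining); // 50000
```
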

Deep dive — OpenAI's GPT-5.4 Mini

About OpenAI
Founded 2015 · San Francisco, USA

OpenAI was founded in December 2015 as a non-profit AI research organisation and transitioned to a capped-profit structure in 2019. The GPT lineage spans GPT-1 (2018) through GPT-5 (mid-2025) and the GPT-5.x family (2025-2026), which unified the o-series reasoning models with the general-purpose GPT line. GPT-5.4 mini and nano were announced on March 17, 2026 as the small-model tier of the GPT-5.4 generation. OpenAI is backed by Microsoft, Khosla Ventures, Andreessen Horowitz, Thrive Capital and Sequoia, with total funding above $60 billion and a 2026 valuation above $300 billion.

Architecture
Unified Transformer (mid-tier, with integrated 'Thinking' tier)

GPT-5.4 mini is a smaller variant of the unified GPT-5.4 architecture, announced on March 17, 2026 alongside GPT-5.4 nano as the small-model tier of the GPT-5.4 generation. It retains native text and image input, an integrated 'Thinking' tier for on-demand reasoning, and the full tool-use API, while running more than 2x faster than GPT-5 mini at significantly lower cost. Pretraining used a multi-trillion-token mixture similar to that of GPT-5.4, with heavier distillation pressure from larger teacher models. Post-training included supervised fine-tuning, RLHF and reinforcement learning against verifiable rewards on coding, reasoning and tool-use trajectories. On evaluations such as SWE-Bench Pro and OSWorld-Verified, GPT-5.4 mini approaches the performance of the full GPT-5.4 while costing roughly one-third as much.

Parameters: Undisclosed (estimated tens of billions of parameters, likely sparse MoE)
Context: 400K tokens
What it can do
  • 2x faster than GPT-5 mini at lower cost
  • Approaches full GPT-5.4 on SWE-Bench Pro and OSWorld-Verified
  • 400K token context window
  • Native multimodal input: text and images
  • Integrated 'Thinking' tier activates for harder reasoning
  • Native tool use, function calling and parallel tool calls
  • Designed for the subagent era: works well under an orchestrator
  • Strong coding, classification and extraction performance
  • Available in ChatGPT, Codex CLI and the OpenAI API
  • Regional processing endpoints available at a 10% price uplift
  • Best for: subagent workflows, customer-facing chat, coding assistants, high-volume API workloads.
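
To make the tool-use bullets above concrete, here is a sketch of how a client might execute parallel tool calls returned by the model. The `tool_calls` shape follows the OpenAI Chat Completions format. The `get_weather` and `get_time` tools, their handlers and the returned values are invented for illustration.

```typescript
// Shape of a tool call in an OpenAI-style assistant message.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}
interface ToolResultMessage { role: "tool"; tool_call_id: string; content: string; }

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical local tools; real ones would call your own services.
const handlers: Record<string, ToolHandler> = {
  get_weather: async (args) => ({ city: args.city, tempC: 21 }),
  get_time: async (args) => ({ zone: args.zone, time: "12:00" }),
};

// Runs all requested tool calls concurrently and returns one tool message per call.
async function runToolCalls(calls: ToolCall[]): Promise<ToolResultMessage[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResultMessage> => {
      const handler = handlers[call.function.name];
      let content: string;
      try {
        if (!handler) throw new Error(`unknown tool: ${call.function.name}`);
        const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
        content = JSON.stringify(await handler(args));
      } catch (err) {
        // Report failures back to the model instead of aborting the other calls.
        content = JSON.stringify({ error: String(err) });
      }
      return { role: "tool", tool_call_id: call.id, content };
    }),
  );
}
```

The returned messages would be appended to the conversation after the assistant message that requested them, so the model can read every result in its next turn.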
Training & License

Pretrained on a multi-trillion-token mixture of web text, code, scientific papers and licensed data, with heavy distillation from larger GPT-5.4 teacher models. Post-training uses supervised fine-tuning, RLHF and RL against verifiable rewards. The knowledge cutoff is approximately late 2025.

License: Proprietary commercial license via OpenAI API and Azure OpenAI.

Known limitations
  • Below full GPT-5.4 on the hardest agentic and reasoning benchmarks
  • Smaller context window than GPT-5.4 (400K vs 1.05M)
  • No native audio or video input
  • Knowledge cutoff in late 2025
  • Thinking mode adds latency and token cost when enabled
