Physical Intelligence π-0.5
Upgraded π-0 with open-world generalization via knowledge insulation. Weights and fine-tuning open-sourced.
Physical Intelligence π-0.5 is a VLA / robotics AI model from Physical Intelligence, priced at €0.000 per 1M input tokens, with an unknown context window.
API Integration
Use our OpenAI-compatible API to integrate Physical Intelligence π-0.5 into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple — just pass a string
const reply = await rw.run("pi-0-5-pi", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("pi-0-5-pi", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("pi-0-5-pi", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Physical Intelligence (PI)'s Physical Intelligence π-0.5
Physical Intelligence (PI) is a San Francisco robot-foundation-model startup founded in 2024 by Sergey Levine, Chelsea Finn, Karol Hausman and co-founders. π-0.5 is the upgraded successor to π-0, announced in April 2025 and focused on open-world generalisation: the same robot policy is shown operating in previously unseen real homes (making beds, cleaning kitchens, sorting laundry) without per-house fine-tuning. The π-0.5 release also expanded the openpi GitHub repository with new checkpoints, training recipes and evaluation tools. π-0.5 is positioned as the next step toward general-purpose physical intelligence and continues PI's strategy of partial open-sourcing: research weights are public, while production-scale checkpoints and proprietary data remain internal.
π-0.5 builds on the π-0 architecture (PaliGemma 3B multimodal backbone + flow-matching action expert) and introduces a co-training recipe combining heterogeneous data sources: high-quality teleoperation, web-scale image+text, multi-robot cross-embodiment data and large amounts of in-the-wild egocentric video. The training pipeline interleaves robot action prediction with auxiliary objectives (next-token prediction over instructions, captioning, scene reasoning), which significantly improves zero-shot transfer to unseen homes. Inputs remain multi-view RGB + proprioception + natural-language instruction; outputs are flow-matched continuous action chunks. The technical report shows π-0.5 generalising to homes never seen during training, performing multi-step household tasks with natural-language prompts. π-0.5 also improves on long-horizon planning by adopting a hierarchical inference strategy that issues sub-instructions through the same VLA.
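To make "flow-matched continuous action chunks" more concrete, here is a minimal toy sketch of flow-matching sampling. It is not PI's implementation: the `makeToyVelocity` function is a hypothetical stand-in for the learned action expert, and the chunk size, step count and target values are assumptions chosen for illustration. The sampler starts from Gaussian noise and integrates a velocity field from t = 0 to t = 1 with fixed-step Euler updates.

```javascript
// Toy flow-matching sampler: integrate a velocity field from noise (t = 0)
// to an action chunk (t = 1) with fixed-step Euler integration.

// Standard normal sample via the Box-Muller transform.
function gaussian() {
  const u = 1 - Math.random(); // in (0, 1], avoids log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Hypothetical stand-in for the learned action expert. For the straight-line
// path x_t = (1 - t) * noise + t * target, the velocity that reaches the
// target is (target - x_t) / (1 - t).
function makeToyVelocity(target) {
  return (x, t) => x.map((xi, i) => (target[i] - xi) / (1 - t));
}

// Euler integration: x <- x + dt * v(x, t), for t = 0, dt, ..., 1 - dt.
function sampleActionChunk(velocity, dim, steps = 10) {
  let x = Array.from({ length: dim }, gaussian);
  const dt = 1 / steps;
  for (let k = 0; k < steps; k++) {
    const t = k * dt;
    const v = velocity(x, t);
    x = x.map((xi, i) => xi + dt * v[i]);
  }
  return x;
}

// A flattened "chunk" of 2 timesteps x 3 joint targets (made-up numbers).
const target = [0.1, -0.2, 0.3, 0.15, -0.25, 0.35];
const chunk = sampleActionChunk(makeToyVelocity(target), target.length);
console.log(chunk.map((a) => a.toFixed(3)));
```

With this toy field the last Euler step lands on the target regardless of the starting noise. A real action expert would instead predict the velocity from the camera images, proprioception and instruction, so different observations would yield different chunks.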
- Parameters: ~3B (PaliGemma backbone + action expert)
- Context: unknown
- Open-world generalisation across unseen homes
- Long-horizon household tasks: cleaning, bed making, laundry
- Co-training with robot data + egocentric video + web text
- Hierarchical self-prompting for multi-step tasks
- Flow-matching action head for smooth high-frequency control
- Single set of weights deployed across many real homes
- Successor to π-0 with improved language grounding
- Partial open-source via openpi GitHub repository
- Best for: open-world robot research, household manipulation.
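The hierarchical self-prompting item above can be pictured as a two-level control loop: the model first names the next sub-instruction in text, then produces an action chunk conditioned on it. The sketch below is only an illustration under assumptions. The `policy` and `env` objects and their method names (`nextSubtask`, `actionChunk`, `observe`, `apply`) are hypothetical and do not come from openpi, and the stubs replace the real model and robot with a scripted plan.

```javascript
// Hypothetical two-level control loop. `policy` and `env` are stand-ins;
// none of these method names come from openpi.
function runHierarchical(policy, env, task, maxSteps = 20) {
  const log = [];
  for (let step = 0; step < maxSteps; step++) {
    const obs = env.observe();
    // High level: the model names the next sub-instruction in text.
    const subtask = policy.nextSubtask(task, obs);
    if (subtask === null) break; // model signals the task is complete
    // Low level: the same model emits an action chunk for that sub-instruction.
    const actions = policy.actionChunk(subtask, obs);
    env.apply(actions);
    log.push(subtask);
  }
  return log;
}

// Scripted stub so the loop can be exercised without a robot or a model.
function makeStubPolicy(plan) {
  return {
    nextSubtask: (_task, obs) => (obs.done < plan.length ? plan[obs.done] : null),
    actionChunk: (_subtask, _obs) => [[0, 0, 0]], // placeholder actions
  };
}

function makeStubEnv() {
  let done = 0;
  return {
    observe: () => ({ done }),
    apply: (_actions) => { done += 1; }, // pretend each chunk finishes a subtask
  };
}

const plan = ["pick up the pillow", "place the pillow on the bed"];
const log = runHierarchical(makeStubPolicy(plan), makeStubEnv(), "make the bed");
console.log(log);
```

The `maxSteps` cap is a design choice for the sketch: it keeps the loop bounded if the high-level step never reports completion.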
Training data: co-trained corpus combining roughly 10,000 or more hours of robot teleoperation, Open-X-Embodiment cross-embodiment data, large-scale egocentric / web video, and web-scale image-text pairs. Multi-stage curriculum mixing action prediction with vision-language objectives.
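One common way to co-train on heterogeneous sources like these is to draw each training example from a weighted mixture. The sketch below shows that idea only; the source names are paraphrased from the description above, and the weights, the seeded generator and the function names are assumptions for illustration, since the actual mixing ratios and curriculum are not given here.

```javascript
// Illustrative mixture weights only; the real ratios are not given here.
const sources = [
  { name: "robot_teleoperation", weight: 0.4 },
  { name: "cross_embodiment", weight: 0.2 },
  { name: "egocentric_video", weight: 0.2 },
  { name: "web_image_text", weight: 0.2 },
];

// Small seeded linear congruential generator so runs are reproducible.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Pick one source with probability proportional to its weight.
function sampleSource(srcs, rng) {
  const total = srcs.reduce((sum, s) => sum + s.weight, 0);
  let r = rng() * total;
  for (const s of srcs) {
    r -= s.weight;
    if (r < 0) return s;
  }
  return srcs[srcs.length - 1]; // guard against floating-point round-off
}

// Draw a batch and count how many examples come from each source.
function sampleBatchCounts(srcs, batchSize, rng) {
  const counts = Object.fromEntries(srcs.map((s) => [s.name, 0]));
  for (let i = 0; i < batchSize; i++) {
    counts[sampleSource(srcs, rng).name] += 1;
  }
  return counts;
}

console.log(sampleBatchCounts(sources, 1000, makeRng(42)));
```

A multi-stage curriculum could be approximated in this sketch by swapping in a different weight table per training stage.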
License: Partially open-source via the openpi GitHub repository for research use. Production deployment licensing is handled directly by Physical Intelligence.
Known limitations
- Still largely confined to quasi-static manipulation
- Hardware variations cause performance regression
- Latency and onboard compute requirements remain high
- Generalisation to industrial / outdoor settings less proven
- Open-source release trails internal latest checkpoint
- Documentation primarily targets researchers, not integrators
Related Models
Gemini Robotics (2025)
Google DeepMind's vision-language-action model based on Gemini 2.0. Generalist robot policy with strong dexterity.
Gemini Robotics-ER
Embodied-reasoning variant of Gemini Robotics. Enhanced 3D spatial reasoning and trajectory planning.
Google RT-2-X
Google's VLA from the RT-X collaboration. Trained on Open-X-Embodiment (22 robots, 527 skills); shows positive transfer across embodiments.
LeRobot SmolVLA
Hugging Face's 450M VLA pretrained on 487 community LeRobot datasets. Runs on consumer GPUs.
Start using Physical Intelligence π-0.5 today
Get started with free credits. No credit card required. Access Physical Intelligence π-0.5 and 100+ other models through a single API.