Robotics / VLA

Vision-Language-Action models for robotics and embodied AI

12 models available

GR00T N1

Robotics · NVIDIA
New · Popular

NVIDIA's open foundation model for generalist humanoid robots. A dual-system VLA that pairs a vision-language module with a diffusion-based action module for whole-body control and human-like manipulation.

€0.13
Tags: humanoid, NVIDIA, whole-body

OpenVLA

Robotics · OpenVLA
New · Popular

Open-source 7B Vision-Language-Action model built on the Prismatic VLM with a Llama 2 backbone. Converts visual observations and language goals into robot actions.

€0.05
Tags: open-source, 7B, Llama-based

Pi0

Robotics · Physical Intelligence
New · Popular

Physical Intelligence's foundation model for robot control. Pairs a pretrained vision-language backbone with a flow-matching action expert for dexterous manipulation across diverse tasks.

€0.13
Tags: dexterous, foundation-model, Physical Intelligence

RT-2

Robotics · Google DeepMind
Popular

Google DeepMind's Robotic Transformer 2. A Vision-Language-Action model co-fine-tuned on web and robotics data; it represents robot actions as text tokens, translating visual observations and language instructions directly into actions.

€0.13
Tags: robotics, Google DeepMind, manipulation
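Models like RT-2 and OpenVLA emit actions as discrete tokens. A minimal sketch of the general scheme (an illustration of the technique, not this API's implementation): each continuous action dimension is uniformly discretized into 256 bins so the model can output actions as text tokens, which are then de-tokenized back into continuous commands.

```python
N_BINS = 256  # RT-2-style uniform discretization per action dimension

def tokenize_action(action, low, high):
    """Map each continuous action value to an integer bin token in [0, 255]."""
    tokens = []
    for a in action:
        a = min(max(a, low), high)          # clip to the action range
        frac = (a - low) / (high - low)     # normalize to [0, 1]
        tokens.append(min(int(frac * N_BINS), N_BINS - 1))
    return tokens

def detokenize_action(tokens, low, high):
    """Map bin tokens back to bin-center continuous values."""
    return [low + (t + 0.5) / N_BINS * (high - low) for t in tokens]

# Example: a hypothetical 7-DoF end-effector delta (xyz, rpy, gripper).
action = [0.1, -0.5, 0.0, 0.25, -1.0, 1.0, 0.9]
tokens = tokenize_action(action, -1.0, 1.0)
recovered = detokenize_action(tokens, -1.0, 1.0)
```

The round-trip error of this scheme is bounded by half a bin width, i.e. (high − low) / (2 × 256).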

Gemini Robotics

Robotics · Google DeepMind
New

Google DeepMind's Gemini model adapted for robotics. Leverages Gemini's multimodal understanding for zero-shot robot task planning and execution.

€0.10
Tags: Gemini, zero-shot, task-planning

Helix

Robotics · Figure AI
New

Figure AI's VLA model powering its humanoid robots. Combines language understanding with full-body motion planning for household and industrial tasks.

€0.13
Tags: humanoid, Figure AI, full-body

LLARVA

Robotics · LLARVA

Vision-Language-Action model using LLM backbones for structured robot action prediction. Bridges language models and low-level robot control.

€0.05
Tags: LLM-based, structured-actions, research

Octo

Robotics · UC Berkeley

Open-source generalist robot policy from UC Berkeley. Supports multiple robot embodiments and can be fine-tuned for new tasks with minimal data.

€0.07
Tags: open-source, generalist, fine-tunable

Pi0.5

Robotics · Physical Intelligence
New

Physical Intelligence's latest VLA model with improved generalization. Handles complex multi-step manipulation tasks with fewer demonstrations.

€0.13
Tags: next-gen, multi-step, Physical Intelligence

RoboFlamingo

Robotics · RoboFlamingo

Robotics adaptation of the Flamingo vision-language model. Enables few-shot learning of robot tasks via language-conditioned visuomotor policies.

€0.05
Tags: few-shot, Flamingo-based, research

RT-X

Robotics · Google DeepMind

Cross-embodiment robotic foundation model from the Open X-Embodiment collaboration. Trained on data from 22 robot embodiments for generalized manipulation.

€0.10
Tags: cross-embodiment, open-data, manipulation

SpatialVLA

Robotics · SpatialVLA

VLA model with explicit 3D spatial reasoning. Uses depth perception and spatial understanding for more precise robotic manipulation.

€0.08
Tags: 3D-spatial, depth, precision
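Depth-aware spatial reasoning of this kind rests on recovering 3D positions from camera observations. As a minimal illustrative sketch (standard pinhole-camera geometry, not SpatialVLA's actual pipeline), a pixel plus its depth reading can be back-projected into a 3D point in the camera frame:

```python
def deproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth (metres) into a 3D point
    in the camera frame, given pinhole intrinsics:
    fx, fy = focal lengths in pixels; cx, cy = principal point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# A pixel at the principal point maps straight down the optical axis;
# intrinsic values here are made up for illustration.
point = deproject_pixel(320, 240, 1.0, fx=600, fy=600, cx=320, cy=240)
# point == (0.0, 0.0, 1.0)
```

Precise manipulation targets (e.g. a grasp point) can then be expressed in metric coordinates rather than pixels.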

Start Building with AI

Access all models through a single API. Get free credits when you sign up — no credit card required.
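As a sketch of what a unified request might look like, the snippet below assembles a JSON payload asking a VLA model for its next action. The endpoint URL, model id, and field names are assumptions for illustration only, not this provider's documented schema; the request is built but not sent.

```python
import json

# Placeholder endpoint; the real URL comes from the provider's docs.
API_URL = "https://api.example.com/v1/robotics/act"

def build_action_request(model, image_b64, instruction):
    """Assemble a hypothetical JSON request for a VLA action prediction:
    one camera observation plus a natural-language goal."""
    return json.dumps({
        "model": model,                     # e.g. "openvla" or "rt-2"
        "observation": {"image": image_b64},  # base64-encoded camera frame
        "instruction": instruction,          # natural-language goal
    })

payload = build_action_request("openvla", "<base64 image>", "pick up the red block")
```

The same payload shape would be reused across models, with only the `model` field changing, which is the point of routing every VLA through one API.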