Robotics / VLA

Vision-Language-Action models for robotics and embodied AI

12 models available

GR00T N1

Robotics · NVIDIA
New · Popular

NVIDIA's foundation model for generalist humanoid robots. A dual-system VLA that pairs a vision-language reasoner with a diffusion-based action module for whole-body control and human-like manipulation.

Free
humanoid · NVIDIA · whole-body

OpenVLA

Robotics · OpenVLA
New · Popular

Open-source 7B Vision-Language-Action model built on the Prismatic VLM with a Llama 2 backbone. Converts visual observations and language goals into robot actions; a usage sketch follows below.

Free
open-source · 7B · Llama-based
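OpenVLA publishes checkpoints on Hugging Face, so inference fits in a few lines. The sketch below follows the openvla/openvla-7b model card; the prompt template, the unnorm_key value, and the exact kwargs are assumptions that may drift between releases.

```python
# Minimal OpenVLA inference sketch (per the openvla/openvla-7b model card).
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda:0")

image = Image.open("frame.png")  # current camera frame (path is illustrative)
prompt = "In: What action should the robot take to pick up the red block?\nOut:"

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
# Emits a 7-DoF end-effector action (position delta, rotation delta, gripper),
# un-normalized with statistics from the named training data mixture.
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
```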

Pi0

Robotics · Physical Intelligence
New · Popular

Physical Intelligence's foundation model for robot control. Combines vision-language understanding with dexterous manipulation across diverse tasks.

Free
dexterous · foundation-model · Physical Intelligence
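Pi0's weights are available through Physical Intelligence's openpi repository. A hedged sketch, assuming the README's entry points (policy_config.create_trained_policy, policy.infer) and DROID-style input keys; all of these are assumptions that may change:

```python
# Sketch of Pi0 inference via the openpi repo; config name, checkpoint URI,
# and observation keys follow the README at the time of writing (assumptions).
import numpy as np
from openpi.training import config as openpi_config
from openpi.policies import policy_config
from openpi.shared import download

cfg = openpi_config.get_config("pi0_fast_droid")
ckpt = download.maybe_download("s3://openpi-assets/checkpoints/pi0_fast_droid")
policy = policy_config.create_trained_policy(cfg, ckpt)

# Dummy observation; real keys and shapes are tied to the DROID robot setup.
example = {
    "observation/exterior_image_1_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/wrist_image_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/joint_position": np.zeros((7,), dtype=np.float32),
    "observation/gripper_position": np.zeros((1,), dtype=np.float32),
    "prompt": "pick up the fork",
}
action_chunk = policy.infer(example)["actions"]  # a chunk of future actions
```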

RT-2

Robotics · Google DeepMind
Popular

Google DeepMind's Robotic Transformer 2. A Vision-Language-Action model that translates camera images and language instructions directly into robot actions, emitted as discretized text tokens (sketched below).

Free
robotics · Google DeepMind · manipulation
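RT-2's weights are not public, but the mechanism behind "actions from a language model" comes down to representing each action dimension as one of 256 discrete bins that double as text tokens. A minimal sketch of that discretization; the bin count matches the paper, while the action bounds here are invented for illustration:

```python
import numpy as np

N_BINS = 256  # RT-2 discretizes each action dimension into 256 bins

def action_to_tokens(action, low, high, n_bins=N_BINS):
    """Map a continuous action vector to integer bins (one token per dim)."""
    action = np.clip(action, low, high)
    return np.round((action - low) / (high - low) * (n_bins - 1)).astype(int)

def tokens_to_action(bins, low, high, n_bins=N_BINS):
    """Invert the discretization back to a continuous action."""
    return low + bins.astype(float) / (n_bins - 1) * (high - low)

# Example: a 7-DoF end-effector action (bounds purely illustrative).
low = np.array([-0.05] * 3 + [-0.25] * 3 + [0.0])
high = np.array([0.05] * 3 + [0.25] * 3 + [1.0])
a = np.array([0.01, -0.02, 0.03, 0.0, 0.1, -0.1, 1.0])

tokens = action_to_tokens(a, low, high)     # what the VLM is trained to emit
recovered = tokens_to_action(tokens, low, high)  # what the robot executes
```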

Gemini Robotics

Robotics · Google DeepMind
New

Google DeepMind's Gemini model adapted for robotics. Leverages Gemini's multimodal understanding for zero-shot robot task planning and execution.

Free
Gemini · zero-shot · task-planning
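Gemini Robotics is not an open checkpoint, so the snippet below only illustrates the zero-shot planning idea with the public google-genai SDK standing in for the robotics stack; the model name and prompt are assumptions, not the Gemini Robotics API.

```python
# Hedged illustration of VLM-based task planning (not Gemini Robotics itself).
from google import genai

client = genai.Client()  # reads the API key from the environment

prompt = (
    "You control a two-arm kitchen robot. Break the task 'clear the table' "
    "into an ordered list of primitive skills (pick, place, wipe)."
)
response = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder model name
    contents=prompt,
)
print(response.text)  # a step plan a lower-level controller could consume
```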

Helix

Robotics · Figure AI
New

Figure AI's VLA model powering their humanoid robots. Combines language understanding with full-body motion planning for household and industrial tasks.

Free
humanoid · Figure AI · full-body

LLARVA

Robotics · LLARVA

Vision-Language-Action model using LLM backbones for structured robot action prediction. Bridges language models and low-level robot control.

Free
LLM-based · structured-actions · research

Octo

Robotics · UC Berkeley

Open-source generalist robot policy from UC Berkeley. Supports multiple robot embodiments and can be fine-tuned for new tasks with minimal data.

Free
open-source · generalist · fine-tunable
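Octo's checkpoints live on Hugging Face and load through the rail-berkeley/octo package. A sketch following the project README; the observation layout and the dataset-statistics key are assumptions tied to how the checkpoint was trained:

```python
# Octo inference sketch (per the rail-berkeley/octo README).
import jax
import numpy as np
from octo.model.octo_model import OctoModel

model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base-1.5")
task = model.create_tasks(texts=["pick up the spoon"])

# Batched, history-stacked dummy observation (shapes are assumptions).
observation = {
    "image_primary": np.zeros((1, 2, 256, 256, 3), dtype=np.uint8),
    "timestep_pad_mask": np.ones((1, 2), dtype=bool),
}
actions = model.sample_actions(
    observation,
    task,
    unnormalization_statistics=model.dataset_statistics["bridge_dataset"]["action"],
    rng=jax.random.PRNGKey(0),
)
```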

Pi0.5

Robotics · Physical Intelligence
New

Physical Intelligence's latest VLA model with improved generalization. Handles complex multi-step manipulation tasks with fewer demonstrations.

Free
next-gen · multi-step · Physical Intelligence

RoboFlamingo

Robotics · RoboFlamingo

Robotics adaptation of the OpenFlamingo vision-language model. Enables few-shot learning of robot tasks through language-conditioned visuomotor policies.

Free
few-shot · Flamingo-based · research

RT-X

Robotics · Google DeepMind

Cross-embodiment robotic foundation model from the Open X-Embodiment collaboration. Trained on data from 22 robot types for generalized manipulation.

Free
cross-embodiment · open-data · manipulation
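The Open X-Embodiment data behind RT-X is published in RLDS format and readable with tensorflow_datasets. A sketch based on the project's example colab; the GCS path and version string are assumptions that may have moved:

```python
# Stream one Open X-Embodiment dataset (RLDS episodes) from GCS.
import tensorflow_datasets as tfds

builder = tfds.builder_from_directory(
    builder_dir="gs://gresearch/robotics/fractal20220817_data/0.1.0"
)
ds = builder.as_dataset(split="train[:10]")

for episode in ds:
    for step in episode["steps"]:  # each step holds observation and action
        image = step["observation"]["image"]
        action = step["action"]
```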

SpatialVLA

Robotics · SpatialVLA

VLA model with explicit 3D spatial reasoning. Uses depth perception and spatial understanding for more precise robotic manipulation.

Free
3D-spatial · depth · precision
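SpatialVLA's exact pipeline is not reproduced here; the sketch below only illustrates what explicit 3D spatial input means in practice, back-projecting a depth map into camera-frame points with a pinhole model. The intrinsics are placeholders.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a metric depth map into camera-frame 3D points, the kind
    of explicit spatial signal a depth-aware VLA can condition on."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grid (h, w)
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)  # (H, W, 3) xyz in meters

# Flat wall 1.5 m away, with made-up VGA intrinsics.
points = depth_to_points(np.full((480, 640), 1.5),
                         fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```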
