Robotics / VLA
Vision-Language-Action models for robotics and embodied AI
12 models available
GR00T N1
NVIDIA's open foundation model for humanoid robots. Dual-system VLA that pairs a vision-language module for reasoning with a diffusion-based action module, enabling whole-body control and human-like manipulation.
OpenVLA
Open-source 7B Vision-Language-Action model built on Prismatic VLM and Llama 2. Converts visual observations and language goals into robot actions.
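Since OpenVLA ships open weights on Hugging Face, a minimal inference call looks roughly like the sketch below, which follows the usage shown in the project's README. The predict_action helper and unnorm_key argument come from the checkpoint's remote code and may change between releases; the image path and instruction here are placeholders.

```python
# Minimal OpenVLA inference sketch, following the project's documented Hugging Face usage.
# predict_action / unnorm_key are provided by the checkpoint's remote code and may differ
# across releases; the image file and instruction are placeholders.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda:0")

image = Image.open("wrist_camera.png")  # placeholder for the current camera observation
instruction = "pick up the red block"
prompt = f"In: What action should the robot take to {instruction}?\nOut:"

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
# Returns a 7-DoF end-effector action (delta position, delta orientation, gripper),
# un-normalized using the statistics of the named training dataset.
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
print(action)
```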
Pi0
Physical Intelligence's foundation model for robot control. Pairs a pretrained vision-language backbone with a flow-matching action expert that outputs continuous action chunks for dexterous manipulation across diverse tasks and robot platforms.
RT-2
Google DeepMind's Robotic Transformer 2. Vision-Language-Action model co-trained on web-scale vision-language data and robot trajectories; it represents actions as text tokens, translating camera images and language instructions directly into robot actions.
Gemini Robotics
Google DeepMind's Gemini model adapted for robotics. Leverages Gemini's multimodal understanding for zero-shot robot task planning and execution.
Helix
Figure AI's VLA model powering its humanoid robots. Combines language understanding with continuous upper-body control (torso, head, wrists, and fingers) for household and industrial tasks.
LLARVA
Vision-Language-Action model using LLM backbones for structured robot action prediction. Bridges language models and low-level robot control.
Octo
Open-source generalist robot policy from UC Berkeley. Supports multiple robot embodiments and can be fine-tuned for new tasks with minimal data.
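Octo's pretrained checkpoints are also public. The JAX sketch below approximates the loading-and-inference flow from the project's README; the method and argument names (create_tasks, sample_actions, the observation keys) are taken from that documentation and may differ slightly between Octo releases.

```python
# Approximate Octo inference sketch (JAX). Method and argument names follow the
# project README and may vary between Octo releases.
import jax
import numpy as np
from octo.model.octo_model import OctoModel

model = OctoModel.load_pretrained("hf://rail-berkeley/octo-base-1.5")

# Dummy 256x256 RGB observation with batch and time-window axes: (1, 1, H, W, C).
img = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
observation = {
    "image_primary": img[np.newaxis, np.newaxis, ...],
    "timestep_pad_mask": np.array([[True]]),
}

# Language-conditioned task; goal images are also supported.
task = model.create_tasks(texts=["pick up the spoon"])

# Sample an action chunk from the diffusion action head (normalized action space;
# statistics in model.dataset_statistics can be used to recover real-world scales).
actions = model.sample_actions(observation, task, rng=jax.random.PRNGKey(0))
print(actions.shape)
```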
Pi0.5
Physical Intelligence's successor to Pi0, co-trained on heterogeneous data for open-world generalization. Handles long-horizon, multi-step manipulation tasks in environments not seen during training.
RoboFlamingo
Robotics adaptation of the OpenFlamingo vision-language model. Fine-tunes the VLM into a language-conditioned visuomotor policy that learns new manipulation tasks from a small number of demonstrations.
RT-X
Cross-embodiment robotic foundation model from the Open X-Embodiment collaboration. Trained on data from 22 robot types for generalized manipulation.
SpatialVLA
VLA model with explicit 3D spatial reasoning. Uses depth perception and spatial understanding for more precise robotic manipulation.
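Across all of these models the runtime contract is the same: each control step feeds the current camera frame and a natural-language instruction to the policy, then executes the action it returns. The loop below illustrates that pattern with purely hypothetical placeholders; get_camera_frame, vla_policy, and send_to_robot stand in for a real camera driver, model, and robot interface, not any vendor's API.

```python
# Generic VLA control loop. Every name here (get_camera_frame, vla_policy, send_to_robot)
# is a hypothetical placeholder for a real camera driver, model, and robot interface.
import numpy as np

def get_camera_frame() -> np.ndarray:
    """Placeholder camera driver: returns a dummy 224x224 RGB frame."""
    return np.zeros((224, 224, 3), dtype=np.uint8)

def vla_policy(image: np.ndarray, instruction: str) -> np.ndarray:
    """Placeholder policy: a real VLA maps (image, instruction) -> action.
    Here we return a zero 7-DoF action: [dx, dy, dz, droll, dpitch, dyaw, gripper]."""
    return np.zeros(7, dtype=np.float32)

def send_to_robot(action: np.ndarray) -> None:
    """Placeholder robot interface: would stream the action to the controller."""
    print("executing action:", action)

instruction = "put the cup on the shelf"
for step in range(10):                        # fixed horizon for the sketch
    frame = get_camera_frame()                # observe
    action = vla_policy(frame, instruction)   # predict
    send_to_robot(action)                     # act
```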