Vision-Language-Action
-
Google Unveils Offline AI Robot Model with Vision and Language Capabilities
Google DeepMind has introduced Gemini Robotics On-Device, an offline AI model for robots. This vision-language-action (VLA) model processes visual input, language, and actions entirely on-device, enabling real-time responses to human commands. Designed for two-armed robots, it interprets natural-language instructions. Removing the reliance on cloud connectivity is an advantage in latency- and connectivity-critical scenarios. Built on Gemini 2.0, the model competes with robotics offerings from rivals such as Nvidia and OpenAI.