Generative AI systems that create text, images, audio, and even video are becoming commonplace. Just as AI models can output those data types, they can also be used to output robot actions. That’s the foundation of Google DeepMind’s Gemini Robotics project, which has announced a pair of new models that work together to create the first robots that “think” before acting. Traditional LLMs have their own set of problems, but the introduction of simulated reasoning significantly upgraded their capabilities, and now the same could be happening with AI robotics.

The team at DeepMind contends that generative AI is a uniquely important technology for robotics because it unlocks general functionality. Current robots have to be trained intensively on specific tasks, and they are typically bad at doing anything else. “Robots today are highly bespoke and difficult to deploy, often taking many months in order to install a single cell that can do a single task,” said Carolina Parada, head of robotics at Google DeepMind.

The fundamentals of generative systems make AI-powered robots more general. They can be presented with entirely new situations and workspaces without needing to be reprogrammed. DeepMind’s current approach to robotics relies on two models: one that thinks and one that does.

From Ars Technica - All content via this RSS feed