Google DeepMind is launching two new AI models designed to help robots “perform a wider range of real-world tasks than ever before.” The first, called Gemini Robotics, is a vision-language-action model that can understand new situations, even ones it has not been trained on.
Gemini Robotics is built on Gemini 2.0, the latest version of Google’s flagship AI model. During a press briefing, Carolina Parada, senior director and head of robotics at Google DeepMind, said that Gemini Robotics “draws from Gemini’s multimodal world understanding and transfers it to the real world by adding physical actions as a new modality.”
The new model makes progress in three key areas that Google DeepMind says are essential to building helpful robots: generality, interactivity, and dexterity. In addition to generalizing to new scenarios, Gemini Robotics is better at interacting with people and their environment. It can also perform more precise physical tasks, such as folding a piece of paper or removing a bottle cap.
“While we have made progress in each one of these areas individually in the past with general robotics, we are bringing [drastically] increasing performance in all three areas with a single model,” Parada said. “This enables us to build robots that are more capable, that are more responsive, and that are more robust to changes in their environment.”
Google DeepMind is also launching Gemini Robotics-ER (for embodied reasoning), which the company describes as an advanced visual language model that can “understand our complex and dynamic world.”
As Parada explains, when you are packing a lunchbox and have items on the table in front of you, you need to know where everything is, how to open the lunchbox, how to grasp the items, and where to place them. That is the kind of reasoning Gemini Robotics-ER is expected to do. It is designed for roboticists to connect with existing low-level controllers (the systems that control a robot’s movements), allowing them to enable new capabilities powered by Gemini Robotics-ER.
In terms of safety, Google DeepMind researcher Vikas Sindhwani told reporters that the company is developing a “layered approach,” adding that Gemini Robotics models “are trained to evaluate whether or not a potential action is safe to perform in a given scenario.” The company is also releasing new benchmarks and frameworks to help further safety research in the AI industry. Last year, Google DeepMind introduced its “Robot Constitution,” a set of Isaac Asimov-inspired rules for its robots to follow.
Google DeepMind is working with Apptronik to “build the next generation of humanoid robots.” It is also giving “trusted testers” access to its Gemini Robotics-ER model, including Agile Robots, Agility Robotics, Boston Dynamics, and Enchanted Tools. “We are very focused on building the intelligence that is going to be able to understand the physical world and be able to act on that physical world,” Parada said. “We are very excited to basically leverage this across multiple embodiments and many applications for us.”
