Sunday, April 20, 2025

The latest DeepMind research at ICLR 2023


Research on artificial intelligence models that can generalize, scale, and accelerate learning

Next week marks the start of the 11th International Conference on Learning Representations (ICLR), taking place from May 1 to 5 in Kigali, Rwanda. It will be the first major artificial intelligence (AI) conference to be hosted in Africa and the first in-person event since the start of the pandemic.

Researchers from around the world will gather to share their cutting-edge work in deep learning, spanning the fields of AI, statistics, and data science, and applications including computer vision, gaming, and robotics. We are proud to support the conference as a Diamond Sponsor and DEI champion.

Teams from across DeepMind are presenting 23 papers this year. Here are some highlights:

Open questions on the road to AGI

Recent advances have shown incredible AI capabilities across text and images, but more research is needed before systems can generalize across domains and scales. This will be a key step on the path to developing artificial general intelligence (AGI) as a transformative tool in our everyday lives.

We present a new approach in which models learn by solving two problems in one. By training models to look at a problem from two perspectives at the same time, they learn how to reason on tasks that require solving similar problems, which benefits generalization. We also explored the ability of neural networks to generalize by comparing them to the Chomsky hierarchy of languages. By rigorously testing 2,200 models on 16 different tasks, we found that certain models struggle to generalize, and that augmenting them with external memory is crucial to improving performance.
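To make this style of evaluation concrete, here is a minimal sketch (not the paper's benchmark) of two formal-language tasks from different levels of the Chomsky hierarchy, with a harness that measures how a predictor's accuracy changes on sequences longer than those seen in training. The task choices and the `length_generalization_gap` helper are illustrative assumptions, not taken from the paper.

```python
import random

def make_parity_example(length, rng):
    """Regular-level task: label is 1 iff the bit string has an odd number of 1s."""
    bits = [rng.randint(0, 1) for _ in range(length)]
    return bits, sum(bits) % 2

def make_balanced_example(length, rng):
    """Context-free-level task: label is 1 iff the bracket string is balanced."""
    s = [rng.choice("()") for _ in range(length)]
    depth, ok = 0, True
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            ok = False
    return s, int(ok and depth == 0)

def length_generalization_gap(predict, make_example, train_len=10, test_len=50,
                              n=500, seed=0):
    """Accuracy at the training length vs. accuracy far beyond it."""
    rng = random.Random(seed)
    def acc(length):
        return sum(predict(x) == y for x, y in
                   (make_example(length, rng) for _ in range(n))) / n
    return acc(train_len), acc(test_len)
```

An oracle that truly computes parity scores perfectly at both lengths, while a model that merely memorized short strings collapses at `test_len` — the kind of gap that external memory can help close.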

The next challenge is how to make progress on longer-term tasks at an expert level, where rewards are few and far between. We developed a new approach and an open-source training dataset to help models learn to explore in human-like ways over long time horizons.

Pioneering approaches

As we develop more advanced AI capabilities, we must ensure current methods work as intended and efficiently in the real world. For example, although language models can produce impressive answers, many cannot explain their responses. We introduce a method for using language models to solve multi-step reasoning problems by exploiting their underlying logical structure, providing explanations that can be understood and verified by humans. On the other hand, adversarial attacks are a way of probing the limits of AI models by pushing them to produce wrong or harmful outputs. Training on adversarial examples makes models more robust to attacks, but can come at the cost of performance on "regular" inputs. We show that by adding adapters, we can create models that let us control this trade-off on the fly.
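The adapter technique itself isn't shown here, but the robustness/accuracy trade-off it controls is easy to reproduce with standard adversarial training. Below is a toy sketch using the well-known Fast Gradient Sign Method (FGSM) on a logistic-regression model — an illustration of the trade-off, not the paper's method; the data, `eps`, and function names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: two Gaussian blobs in 2D.
n, d = 400, 2
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, d)),
               rng.normal(1.0, 1.0, (n // 2, d))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(w, b, X, y, eps):
    """FGSM: move each input a step of size eps along the sign of the
    loss gradient with respect to that input."""
    p = sigmoid(X @ w + b)
    grad_x = np.outer(p - y, w)          # d(cross-entropy)/dx for this model
    return X + eps * np.sign(grad_x)

def train(eps=0.0, steps=300, lr=0.1):
    """Logistic regression; if eps > 0, each step also fits FGSM examples."""
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        Xt, yt = (X, y) if eps == 0 else (
            np.vstack([X, fgsm(w, b, X, y, eps)]), np.concatenate([y, y]))
        p = sigmoid(Xt @ w + b)
        w -= lr * Xt.T @ (p - yt) / len(yt)
        b -= lr * float(np.sum(p - yt)) / len(yt)
    return w, b

def accuracy(w, b, X, y):
    return float(np.mean((sigmoid(X @ w + b) > 0.5) == y))
```

Evaluating both `train(0.0)` and `train(0.5)` on clean and FGSM-perturbed inputs exposes the trade-off that the adapters are designed to tune at inference time rather than fixing at training time.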

Reinforcement learning (RL) has proved effective for a range of real-world challenges, but RL algorithms are usually designed to do one task well and struggle to generalize to new ones. We propose algorithmic distillation, a method that enables a single model to efficiently generalize to new tasks by training a transformer to imitate the learning histories of RL algorithms across tasks. RL models also learn by trial and error, which can be very data- and time-intensive. Our Agent 57 model needed nearly 80 billion frames of data to achieve human-level performance across 57 Atari games. We share a new way to train to this level using 200 times less experience, drastically reducing compute and energy costs.
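A minimal sketch of the idea behind algorithmic distillation — illustrative code, not DeepMind's implementation — is to record the full learning history of a simple RL learner on many random tasks, then turn those histories into (context, next action) pairs for a sequence model to imitate. Here the learner is epsilon-greedy on Bernoulli bandits; all names and parameters are assumptions.

```python
import random

def bandit_learning_history(n_arms=5, steps=200, eps=0.1, rng=None):
    """Run an epsilon-greedy learner on one random Bernoulli bandit task and
    record its whole learning history: (action, reward) at every step."""
    rng = rng or random.Random()
    probs = [rng.random() for _ in range(n_arms)]       # the hidden task
    counts = [0] * n_arms
    values = [0.0] * n_arms
    history = []
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(n_arms)                   # explore
        else:
            a = max(range(n_arms), key=lambda i: values[i])  # exploit
        r = 1 if rng.random() < probs[a] else 0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]        # incremental mean
        history.append((a, r))
    return history

def distillation_dataset(n_tasks=50, seed=0, **kw):
    """Across-task histories as (preceding history, next action) pairs: the
    sequence model is trained to imitate the learning *algorithm*, not any
    single fixed policy."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_tasks):
        h = bandit_learning_history(rng=rng, **kw)
        for t in range(1, len(h)):
            data.append((h[:t], h[t][0]))
    return data
```

A transformer fitted to such pairs sees policies improving within each context, so at deployment it can keep improving in-context on a new task instead of replaying one memorized behavior.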

Artificial intelligence for science

Artificial intelligence is a powerful tool for researchers to analyze vast amounts of complex data and understand the world around us. Several papers show how AI is accelerating scientific progress, and how science is advancing AI.

Predicting a molecule's properties from its 3D structure is critical for drug discovery. We present a denoising method that achieves state-of-the-art molecular property prediction, enables large-scale pre-training, and generalizes across diverse biological datasets. We also introduce a new transformer that can make more accurate quantum-chemical calculations using only data on atomic positions.
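To make the denoising objective concrete — a toy sketch, not the paper's graph-network implementation — the snippet below corrupts a conformation's 3D coordinates with Gaussian noise and scores a model on how well it recovers that noise. The function names and `sigma` are illustrative assumptions.

```python
import numpy as np

def denoising_batch(coords, sigma, rng):
    """Corrupt 3D atom coordinates with Gaussian noise; the pre-training
    target is the noise itself (predict epsilon from the corrupted structure)."""
    noise = rng.normal(0.0, sigma, coords.shape)
    return coords + noise, noise

def denoising_loss(predict_noise, coords, sigma, rng, n_samples=200):
    """Mean-squared error between predicted and true noise, averaged over
    independent corruptions of the same conformation."""
    total = 0.0
    for _ in range(n_samples):
        noisy, eps = denoising_batch(coords, sigma, rng)
        total += np.mean((predict_noise(noisy) - eps) ** 2)
    return total / n_samples
```

A predictor that always outputs zeros has expected per-coordinate loss of `sigma**2`, giving a floor that any model with real structural knowledge (bond lengths, angles) should beat — which is what makes the objective useful for large-scale pre-training.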

Finally, with FIGnet, we draw inspiration from physics to model collisions between complex shapes, like a teapot or a donut. This simulator could have applications in robotics, graphics, and mechanical design.

See the full list of DeepMind papers and the schedule of events at ICLR 2023.
