Towards more multimodal, robust and general AI systems
The 37th Annual Conference on Neural Information Processing Systems (NeurIPS), the world's largest artificial intelligence (AI) conference, begins next week. NeurIPS 2023 takes place December 10-16 in New Orleans, USA.
At the main conference and workshops, teams from across Google DeepMind are presenting more than 180 papers.
We will be showcasing demonstrations of our cutting-edge AI models for global weather forecasting, materials discovery, and watermarking of AI-generated content. There will also be an opportunity to hear from the team behind Gemini, our largest and most capable AI model.
Here are some highlights of our research:
Multimodality: language, video, action
UniSim is a universal real-world interaction simulator.
Generative AI models can create images, compose music, and write stories. But however capable these models are in one medium, most struggle to transfer those skills to another. We explore how generative abilities can aid learning across modalities. In a spotlight presentation, we show that diffusion models can be used for image classification without the need for additional training. Diffusion models like Imagen classify images in a more human-like way than other models, relying on shapes rather than textures. Separately, we show how simply predicting captions from images can improve computer vision learning. Our approach outperformed state-of-the-art methods on vision-and-language tasks and showed greater scaling potential.
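The zero-shot classification idea can be illustrated with a toy sketch: score an image under each class-conditional "denoiser" and pick the class whose conditional model best explains it. Everything below (the templates, shapes, and the stand-in denoiser) is an illustrative assumption, not the paper's actual Imagen-based setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for class-conditional denoising: each "class" is represented
# by a template image, and the "denoiser" for class c assumes the clean image
# is that template. A real diffusion classifier uses a trained, conditioned
# noise-prediction network instead.
templates = {
    "circle": np.array([1.0, 0.0, 1.0, 0.0]),
    "square": np.array([0.0, 1.0, 0.0, 1.0]),
}

def class_conditional_noise_error(image, template, n_draws=64, sigma=0.5):
    """Average squared error of the class-conditional noise prediction.

    With a neural denoiser the error genuinely varies across noise draws and
    timesteps; in this linear toy it collapses to the distance to the
    template, but the averaging structure mirrors the real method.
    """
    errors = []
    for _ in range(n_draws):
        noise = rng.normal(0.0, sigma, size=image.shape)
        noisy = image + noise
        predicted_noise = noisy - template  # the denoiser's guess for this class
        errors.append(np.mean((predicted_noise - noise) ** 2))
    return float(np.mean(errors))

def classify(image):
    # The class whose conditional model best explains the image wins.
    scores = {c: class_conditional_noise_error(image, t)
              for c, t in templates.items()}
    return min(scores, key=scores.get)

print(classify(np.array([0.9, 0.1, 1.1, 0.0])))  # → circle
```

No classifier head is trained: classification falls out of comparing how well each conditional generative model reconstructs the input.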
More multimodal models could pave the way for more useful digital and robotic assistants that help people in their daily lives. In a spotlight poster, we create agents that can interact with the digital world the way humans do: through screenshots and keyboard and mouse actions. Separately, we show that by using video generation, including subtitled videos, models can transfer knowledge by predicting video plans for real robot actions.
One of the next milestones could be generating realistic experience in response to actions taken by humans, robots and other interactive agents. We will present a demo of UniSim, our universal simulator of real-world interaction. This type of technology could be applied across industries, from video games and film to training agents for the real world.
Building safe and understandable AI
Artistic illustration of AI safety research, created by artist Khyati Trehan as part of Google DeepMind's Visualizing AI project.
Privacy must be considered at every step when developing and deploying large models.
In a paper recognized with a NeurIPS Best Paper Award, our researchers show how to evaluate privacy-protective training with a technique efficient enough for real-world use. Our teams also study how to measure whether language models memorize training data, to help protect private and sensitive material. In another oral presentation, our scientists investigate the limits of training "student" and "teacher" models that have different levels of access and vulnerability to attack.
Large language models can produce impressive responses, but are susceptible to "hallucinations": text that seems correct but is made up. Our researchers ask whether finding the location of a stored fact would allow it to be edited. Surprisingly, they found that locating a fact and editing that location does not edit the fact, highlighting how complex it is to understand and control the knowledge stored in an LLM. With Tracr, we propose a new way to evaluate interpretability methods by compiling human-readable programs into transformer models. Our open-source version of Tracr serves as a ground truth for evaluating interpretability methods.
Emergent abilities
Artistic illustration of artificial general intelligence (AGI), created by Novoto Studio as part of Google DeepMind's Visualizing AI project.
As large models become more capable, our research pushes the boundaries of what is possible in developing more general AI systems.
Although language models are used for general tasks, they lack the exploration and contextual understanding needed to solve more complex problems. We introduce Tree of Thoughts, a new framework for language model inference that helps models explore and evaluate a wide range of possible solutions. By organizing reasoning and planning as a tree rather than the commonly used flat chain of thought, we show that a language model can solve complex tasks such as the Game of 24 far more accurately.
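The tree-over-chain idea can be sketched without a language model. For the Game of 24 (combine four numbers with +, -, *, / to reach 24), treat each partial combination as a "thought", expand the frontier, and prune with a simple heuristic. This is a minimal stand-in using brute-force expansion plus beam search; the actual framework uses the LM itself to propose and evaluate thoughts:

```python
from itertools import combinations

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else None,
}

def expand(state):
    """Successor 'thoughts': combine any two remaining numbers with one op."""
    nums, trace = state
    for i, j in combinations(range(len(nums)), 2):
        rest = [n for k, n in enumerate(nums) if k not in (i, j)]
        for a, b in ((nums[i], nums[j]), (nums[j], nums[i])):
            for sym, op in OPS.items():
                val = op(a, b)
                if val is None:
                    continue
                yield (rest + [val], trace + [f"{a:g} {sym} {b:g} = {val:g}"])

def score(state):
    """Heuristic promise of a partial solution (lower is better)."""
    nums, _ = state
    return min(abs(n - 24) for n in nums)

def solve_24(numbers, beam_width=10_000):
    """Beam search over the thought tree; a generous beam is exhaustive,
    a small one relies on the heuristic to keep promising branches."""
    frontier = [(list(numbers), [])]
    while frontier:
        children = []
        for state in frontier:
            for nums, trace in expand(state):
                if len(nums) == 1 and abs(nums[0] - 24) < 1e-6:
                    return trace  # one full line of reasoning to 24
                children.append((nums, trace))
        # Keep only the most promising partial thoughts.
        frontier = sorted(children, key=score)[:beam_width]
    return None

print(solve_24([4, 9, 10, 13]))
```

A flat chain of thought commits to one sequence of steps; the tree keeps many partial lines of reasoning alive and discards the weak ones, which is what makes search-heavy puzzles like this tractable.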
To help people solve problems and find what they are looking for, AI models must process billions of unique values efficiently. With feature multiplexing, a single representation space is used for many different features, enabling large embedding models (LEMs) to scale to products serving billions of users.
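The core multiplexing idea, sharing one embedding table across many categorical features instead of keeping a separate table per feature, can be sketched with a hashing trick. The table size, dimension, and feature names below are hypothetical:

```python
import hashlib

import numpy as np

rng = np.random.default_rng(0)

# One shared ("multiplexed") embedding table reused by many categorical
# features, instead of a separate table per feature. Sizes are hypothetical.
TABLE_SIZE, DIM = 1 << 16, 8
shared_table = rng.normal(0.0, 0.1, size=(TABLE_SIZE, DIM))

def lookup(feature: str, value: str) -> np.ndarray:
    """Hash (feature, value) into the shared table. Salting the hash with the
    feature name keeps distinct features from colliding systematically."""
    digest = hashlib.blake2s(f"{feature}={value}".encode()).digest()
    idx = int.from_bytes(digest[:8], "little") % TABLE_SIZE
    return shared_table[idx]

# A user is represented by concatenating shared-table embeddings for all
# of their feature values.
user_vec = np.concatenate([
    lookup("country", "BR"),
    lookup("language", "pt"),
    lookup("device", "mobile"),
])
print(user_vec.shape)  # (24,)
```

Memory no longer grows with the number of features times their vocabularies; one fixed-size table serves them all, with hash collisions as the trade-off.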
Finally, with DoReMi we show how using AI to automate the mixture of training data types can significantly speed up language model training and improve performance on new, unseen tasks.
Supporting the global AI community
We are proud to sponsor NeurIPS and to support workshops led by LatinX in AI, Queer in AI, and Women in ML, helping foster research collaboration and a diverse AI and machine learning community. This year, NeurIPS features a creative track that includes our Visualizing AI project, which commissions artists to create more diverse and accessible representations of artificial intelligence.
If you are attending NeurIPS, come to our booth to learn more about our cutting-edge research and meet our teams leading workshops and presenting at the conference.