Saturday, March 7, 2026

Gemini 2.5: Our most intelligent models get even better


New capabilities in Gemini 2.5

Native audio output and Live API improvements

Today, the Live API introduces a preview of audio-visual input and native audio output dialogue, so you can directly build conversational experiences with a more natural and expressive Gemini.

It also lets the user control the model's tone, accent and speaking style. For example, you can ask the model to tell a story in a dramatic voice. It also supports tool use, so the model can search on your behalf.

You can experiment with a set of early features including:

  • Affective dialogue, in which the model detects emotions in the user’s voice and responds accordingly.
  • Proactive Audio, where the model will ignore background conversations and know when to respond.
  • Live API thinking, where the model leverages Gemini’s thinking capabilities to handle more complicated tasks.
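As a rough illustration of how these options might be combined for a session, here is a minimal sketch that assembles a configuration as a plain Python dictionary. The function name `build_live_config` is hypothetical, and the field names (`response_modalities`, `system_instruction`, `enable_affective_dialog`, `proactivity`) are assumptions that should be checked against the current Live API reference before use. The sketch makes no network calls.

```python
from typing import Any


def build_live_config(
    style_instruction: str,
    *,
    affective_dialog: bool = False,
    proactive_audio: bool = False,
) -> dict[str, Any]:
    """Assemble a hypothetical Live API session config as a plain dict.

    All field names are assumptions for illustration only.
    """
    config: dict[str, Any] = {
        # Ask for spoken replies rather than text.
        "response_modalities": ["AUDIO"],
        # Natural-language steering of tone, accent and speaking style.
        "system_instruction": style_instruction,
    }
    if affective_dialog:
        # Early feature: respond to emotion detected in the user's voice.
        config["enable_affective_dialog"] = True
    if proactive_audio:
        # Early feature: ignore background conversations and decide when to respond.
        config["proactivity"] = {"proactive_audio": True}
    return config


config = build_live_config(
    "Tell the story in a dramatic voice.",
    affective_dialog=True,
)
```

Keeping the early features behind explicit flags makes it easy to switch each one on separately while experimenting with the preview.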

We’re also rolling out new text-to-speech previews in 2.5 Pro and 2.5 Flash. They provide first-of-its-kind multi-speaker support, enabling text-to-speech with two voices via native audio output.

Like Native Audio dialogue, text-to-speech is crisp and can capture very subtle nuances, such as whispers. It works in over 24 languages and switches between them seamlessly.
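To make the two-voice idea concrete, the sketch below turns a short dialogue into a speaker-labelled transcript and pairs each speaker with a voice. It is only an illustration: `build_two_speaker_request`, the voice names `voice_a` and `voice_b`, and the request field names are hypothetical placeholders, not the documented request format. The code builds the payload locally and does not call any service.

```python
from typing import Any


def build_two_speaker_request(
    lines: list[tuple[str, str]],
    voices: dict[str, str],
) -> dict[str, Any]:
    """Build a hypothetical two-voice text-to-speech request.

    `lines` is a list of (speaker, text) pairs; `voices` maps each
    speaker name to a voice name. Field names are assumptions.
    """
    # Collect distinct speakers in order of first appearance.
    speakers = list(dict.fromkeys(name for name, _ in lines))
    if len(speakers) != 2:
        raise ValueError(f"expected exactly two speakers, got {len(speakers)}")
    missing = [s for s in speakers if s not in voices]
    if missing:
        raise ValueError(f"no voice assigned for: {', '.join(missing)}")

    # One "Speaker: text" line per turn.
    transcript = "\n".join(f"{name}: {text}" for name, text in lines)
    return {
        "contents": transcript,
        "speech_config": {
            "multi_speaker_voice_config": {
                "speaker_voice_configs": [
                    {"speaker": s, "voice_name": voices[s]} for s in speakers
                ]
            }
        },
    }


request = build_two_speaker_request(
    [
        ("Ana", "Did you hear that?"),
        ("Ben", "(whispering) Yes. Stay quiet."),
        ("Ana", "Okay."),
    ],
    voices={"Ana": "voice_a", "Ben": "voice_b"},  # placeholder voice names
)
```

Validating the speaker count up front reflects the two-voice limit described above, so a malformed script fails before any request is sent.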
