Oriol Nieto from Adobe uploaded a short video with a few scenes and a voiceover, but no sound effects. The AI model analyzed the video and divided it into scenes, tagging each with a description and an emotional label. Then came the sound effects. For example, the model recognized a scene with an alarm clock and automatically generated a matching sound effect. It identified a scene where the main character (in this case, an octopus) is driving a car and added the sound effect of a door closing.
It wasn’t perfect. The alarm sound wasn’t realistic, and in a scene where two characters hugged, the model added an unnatural rustling of clothes that didn’t work. Rather than editing manually, Nieto used a conversational interface (similar to ChatGPT) to describe the changes. The car scene, for instance, lacked any ambient engine noise; instead of manually selecting the scene, he simply asked the model to add a car sound effect. It found the right scene, generated the effect, and placed it perfectly.
These experimental features are not publicly available yet, but Sneaks demos usually find their way into the Adobe suite. For example, Harmonize, a Photoshop feature that automatically composites assets into a scene with the right colors and lighting, was demonstrated at last year’s Sneaks conference and is now part of Photoshop. Expect this year’s features to arrive sometime in 2026.
Adobe’s announcement comes just months after video game voice actors ended a nearly year-long strike aimed at securing protections around artificial intelligence: companies must now obtain consent and provide disclosure when game developers want to recreate a voice actor’s voice or likeness using AI. Voice actors have been bracing for AI’s impact on their business for some time, and Adobe’s new features, even if they don’t generate voices from scratch, are another sign of the change AI is forcing on the creative industry.
