As Amazon takes a significant step into the AI space with its new Nova family of foundation models, Google is expanding its own multimodal AI capabilities. The tech giant’s cloud division has announced that its latest video and image generation models, Veo and Imagen 3, are now available on Vertex AI.
This move enables teams to integrate cutting-edge video and image generation capabilities into AI-powered workflows, unlocking a variety of use cases, especially in marketing and advertising. It also makes Google Cloud the first hyperscaler to offer a video model to its customers.
While Veo is currently in private preview, Imagen 3 will be generally available to all Vertex AI users starting next week. Notably, Imagen 3 also includes editing features, allowing users to refine generated images to meet specific creative needs.
What do Veo and Imagen 3 offer?
First unveiled at Google’s I/O developer conference, Veo is Google DeepMind’s answer to competitors such as Runway’s Gen-3 and OpenAI’s Sora, providing advanced video generation capabilities. The model transforms text or image prompts into cinematic, high-resolution videos in a variety of visual styles, and can generate clips longer than 60 seconds. What sets it apart is frame-level consistency, which ensures objects move smoothly within the shot.
Imagen 3, also from DeepMind, handles text-to-image generation, creating photorealistic visuals in a variety of styles. Google claims it outperforms its predecessors in terms of detail, lighting accuracy and artifact reduction.
Beyond generation, allowlisted users also have access to advanced customization options in Imagen 3. These include image upscaling, inpainting, outpainting and background replacement, all driven by text prompts. Additionally, users can provide reference images, allowing Imagen 3 to create content tailored to a specific brand aesthetic, logo or product features.
Wider implications for industry
Vertex AI has long been Google Cloud’s flagship platform for streamlining the development and deployment of AI applications. With the integration of Veo and Imagen 3, the platform gives organizations an even more comprehensive set of tools to innovate in marketing, sales and beyond.
For example, Imagen 3 simplifies the creation of high-quality assets such as product photos and social media content, while Veo extends this capability by giving teams the option to convert these visuals into polished videos. This speeds up production, lowers costs and accelerates prototyping, enabling teams to quickly make changes to their creative strategies.
“Customers like Agoda are leveraging the power of AI models like Veo, Gemini and Imagen to streamline video ad production, achieving significant reductions in production time,” said Warren Barkley, senior director of product management at Google, in a blog post. He also highlighted that both models come with safety features such as digital watermarks and content moderation guardrails to mitigate risks associated with generative AI.
Other early adopters include Mondelez International – owner of brands including Oreo, Cadbury and Milka – and global marketing and communications firm WPP. As Google’s foundation models expand in reach, companies across industries have a huge opportunity to transform the way they create and deliver visual content.
The competition continues to heat up
While all major cloud service providers, including Google Cloud, Amazon Web Services and Microsoft Azure, provide image generation models on their AI orchestration platforms, video generation offerings have been scarce so far. Google’s decision to put Veo in private preview changes that today.
Interestingly, shortly after Google announced Veo, AWS made a splash at re:Invent by announcing Nova Reel, a foundation model that generates six-second, studio-quality videos based on text and image prompts.
This model, along with other models in the Nova family, will be available through Amazon Bedrock, the company’s fully managed service designed to simplify the creation and deployment of generative AI applications.
For its part, Microsoft appears to be lagging behind in this category at this stage. Its AI Foundry does not include video generation models. However, we expect this to change as soon as Sora from OpenAI hits the market.