Responsibility and safety
New research examines today's misuse of multimodal generative AI to help build safer, more responsible technologies
Generative artificial intelligence (AI) models that can produce images, text, audio, video, and more are enabling a new era of creativity and commercial opportunity. But as these capabilities grow, so does the potential for misuse, including manipulation, fraud, intimidation, and harassment.
As part of our commitment to developing and using AI responsibly, we have published a new paper, in partnership with Jigsaw and Google.org, analyzing how generative AI technologies are being misused today. Teams across Google are using this and other research to develop better safeguards for our generative AI technologies, among other safety initiatives.
Together, we collected and analyzed nearly 200 media reports of public misuse incidents published between January 2023 and March 2024. From these reports, we defined and categorized common tactics for misusing generative AI and identified new patterns in how these technologies are being exploited or compromised.
By clarifying the current risks and tactics seen across different types of generative AI systems, our work can help shape AI governance and guide companies like Google and others building AI technologies in developing more comprehensive safety evaluations and mitigation strategies.
Highlighting the main categories of abuse
While generative AI tools are a unique and compelling way to foster creativity, the ability to create personalized, immersive content can be misused by malicious actors.
Through our analysis of media reports, we identified two main categories of generative AI misuse tactics: exploiting generative AI capabilities and compromising generative AI systems. Examples of exploitation included creating realistic depictions of human likenesses to impersonate public figures, while instances of compromise included "jailbreaking" to remove model safeguards and using adversarial inputs to cause malfunctions.
Relative frequency of AI abuse tactics in our dataset. Each reported case of abuse in the media may involve one or more tactics.
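To make this taxonomy concrete, here is a minimal Python sketch of how coded incidents could be tallied into the kind of relative frequencies shown in the chart above. The tactic names, schema, and sample incidents are hypothetical illustrations, not the paper's actual dataset or coding scheme.

```python
from collections import Counter

# Illustrative sketch only: a hypothetical coding of incidents, not the paper's
# actual dataset. Each media report is coded with one or more tactics, grouped
# into the two categories described above.
TACTIC_CATEGORY = {
    "impersonation": "exploitation of capabilities",
    "falsified_evidence": "exploitation of capabilities",
    "synthetic_personas": "exploitation of capabilities",
    "jailbreaking": "compromise of systems",
    "adversarial_inputs": "compromise of systems",
}

# Hypothetical sample of coded incidents (the real analysis covered ~200 reports).
incidents = [
    {"id": 1, "tactics": ["impersonation", "falsified_evidence"]},
    {"id": 2, "tactics": ["jailbreaking"]},
    {"id": 3, "tactics": ["synthetic_personas", "impersonation"]},
]

# Count each tactic at most once per incident, then express counts as relative
# frequencies. One incident can involve several tactics, so shares need not sum to 1.
tactic_counts = Counter(t for inc in incidents for t in set(inc["tactics"]))
total_incidents = len(incidents)

for tactic, count in tactic_counts.most_common():
    share = count / total_incidents
    print(f"{tactic:<20} {TACTIC_CATEGORY[tactic]:<28} {share:.0%} of incidents")
```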
Exploitation cases were the most common in our dataset: malicious actors using easily accessible, consumer-grade generative AI tools, often in ways that did not require advanced technical skills. For example, we reviewed a high-profile case from February 2024 in which a multinational company reportedly lost HK$200 million (approx. US$26 million) after an employee was tricked into making a financial transfer during an online meeting. In that case, every other "person" in the meeting, including the company's CFO, was a convincing, computer-generated fraudster.
Some of the most prominent tactics we observed, such as impersonation, deception, and synthetic personas, predate the invention of generative AI and have long been used to influence the information ecosystem and manipulate others. But broader access to generative AI tools may alter the costs and incentives behind information manipulation, giving these age-old tactics new potency and potential, especially for those who previously lacked the technical sophistication to carry them out.
Identifying abuse strategies and combinations
Falsifying evidence and manipulating human likenesses are among the most common tactics in real-world misuse cases. During the period we analyzed, most generative AI misuse was aimed at influencing public opinion, enabling scams or other fraudulent activity, or generating profit.
By observing how bad actors combine generative AI misuse tactics in pursuit of different goals, we identified specific combinations of misuse, which we term strategies.
A diagram illustrating how bad actors’ goals (left) translate into their abuse strategies (right).
Emerging forms of generative AI misuse that are not overtly malicious still raise ethical concerns. For example, new forms of political outreach are blurring the lines between authenticity and deception, such as government officials suddenly speaking a variety of voter-friendly languages without transparent disclosure that they are using generative AI, or activists using the AI-generated voices of deceased victims to plead for gun reform.
While the study provides new insights into emerging forms of misuse, it is worth noting that the dataset represents a limited sample of media reports. Media reports may prioritize sensational incidents, which can skew the dataset toward particular types of misuse. Detecting or reporting cases of misuse may also be more challenging for those involved because generative AI systems are so novel. The dataset also does not directly compare generative AI misuse with traditional content creation and manipulation tactics, such as image editing or setting up "content farms" to create large volumes of text, video, GIFs, images, and more. So far, anecdotal evidence suggests that traditional content manipulation tactics remain more prevalent.
Getting ahead of potential abuses
Our paper highlights opportunities to design initiatives that protect the public, such as promoting broad generative AI literacy campaigns, developing better interventions to protect the public from bad actors, or forewarning people and equipping them to spot and refute the manipulative strategies used in generative AI misuse.
This research helps our teams better safeguard our products by informing our evolving safety initiatives. On YouTube, we now require creators to disclose when their work is meaningfully altered or synthetically generated and seems realistic. Similarly, we have updated our election advertising policies to require advertisers to disclose when their election ads include material that has been digitally altered or generated.
As we broaden our understanding of malicious uses of generative AI and make further technical advances, we know it is more important than ever to make sure our work is not happening in isolation. We recently joined the Coalition for Content Provenance and Authenticity (C2PA) as a steering committee member to help develop the technical standard and drive adoption of Content Credentials, tamper-resistant metadata that shows how content was made and edited over time.
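As a rough illustration of why tamper-evident provenance metadata is useful, the sketch below chains each edit record to a hash of the previous one, so any later alteration of the history fails verification. This is a conceptual toy under simplifying assumptions, not the C2PA standard itself, which relies on cryptographically signed manifests; the function names and record fields are hypothetical.

```python
import hashlib
import json

def add_record(chain, action, tool):
    """Append an edit record linked to the previous record's hash (toy example)."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    record = {"action": action, "tool": tool, "prev_hash": prev_hash}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    chain.append(record)
    return chain

def verify(chain):
    """Recompute each hash and check linkage; returns False if any record was altered."""
    prev_hash = "genesis"
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True

history = []
add_record(history, "created", "generative image model")
add_record(history, "cropped", "photo editor")
print(verify(history))          # True: edit history is intact
history[0]["tool"] = "camera"   # tampering with the creation record...
print(verify(history))          # False: the history no longer verifies
```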
At the same time, we are also conducting research that advances our existing red-teaming efforts, including improving best practices for testing the safety of large language models (LLMs), and developing pioneering tools to help identify AI-generated content, such as SynthID, which is being integrated into a growing range of products.
In recent years, Jigsaw has conducted research with creators of disinformation to understand the tools and tactics they use, developed prebunking videos to forewarn people of attempts to manipulate them, and shown that prebunking campaigns can improve resilience to disinformation at scale. This work is part of Jigsaw's broader portfolio of information interventions to help people protect themselves online.
By proactively addressing potential misuses, we can foster responsible and ethical use of generative AI while minimizing its risks. We hope these insights into the most common misuse tactics and strategies will help researchers, policymakers, and industry trust and safety teams build safer, more responsible technologies and develop better measures to combat misuse.