Google has open-sourced its watermarking tool for AI-generated text

An LLM generates text one token at a time. A token can represent a single character, a word, or part of a phrase. To produce a coherent sequence of text, the model predicts which token is most likely to come next. These predictions are based on the preceding words and on the probability score assigned to each candidate token.
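
The token-by-token loop described above can be sketched in a few lines of Python. This is only a toy illustration: `TOY_SCORES`, `toy_model`, `sample_next_token` and `generate` are hypothetical names, and the lookup table stands in for a real model, which would condition on the whole context rather than on the last token alone.

```python
import random

# Hypothetical lookup table standing in for a real model: it maps the most
# recent token to candidate next tokens and their probability scores.
TOY_SCORES = {
    "my": {"favorite": 0.8, "least": 0.2},
    "least": {"favorite": 1.0},
    "favorite": {"fruit": 1.0},
    "fruit": {"is": 1.0},
    "is": {"mango": 0.4, "lychee": 0.3, "papaya": 0.2, "durian": 0.1},
}

def toy_model(context):
    """Return {token: probability} for the next position, or {} to stop."""
    return TOY_SCORES.get(context[-1], {})

def sample_next_token(scores, rng):
    """Pick one token at random, weighted by its probability score."""
    tokens = list(scores)
    weights = [scores[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def generate(prompt, max_new_tokens, rng):
    """Extend the prompt one token at a time until the model has no candidates."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = toy_model(tokens)
        if not scores:
            break
        tokens.append(sample_next_token(scores, rng))
    return tokens

if __name__ == "__main__":
    print(" ".join(generate(["my"], 10, random.Random(0))))
```

Each pass through the loop makes one weighted random choice, so the same prompt can end in a different fruit from one run to the next.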

For example, given the phrase “My favorite tropical fruit is __.”, the LLM might complete the sentence with the token “mango”, “lychee”, “papaya” or “durian”, and each of these tokens is assigned a probability score. When there are many different tokens to choose from, SynthID can adjust the probability score of each predicted token, in cases where doing so does not compromise the quality, accuracy and creativity of the output.
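
To make the idea of adjusting probability scores concrete, here is a minimal sketch of a generic keyed-bias scheme. It is not SynthID's actual algorithm, which the passage does not spell out. The names `g_value` and `adjust_scores`, the `boost` factor and the `max_top_prob` cutoff are all assumptions made for illustration. The cutoff is a crude stand-in for the condition that scores are only adjusted when there are several plausible tokens to choose from.

```python
import hashlib

def g_value(key, prev_token, token):
    """Pseudorandom bit (0 or 1) derived from a secret key, the previous token,
    and a candidate token. Hypothetical helper, not part of any SynthID API."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1

def adjust_scores(scores, key, prev_token, boost=2.0, max_top_prob=0.9):
    """Nudge probability scores toward tokens whose keyed bit is 1."""
    # If the model is already nearly certain, leave the scores untouched so
    # the adjustment cannot push it away from the obviously right token.
    if max(scores.values()) >= max_top_prob:
        return dict(scores)
    weighted = {
        tok: p * (boost if g_value(key, prev_token, tok) else 1.0)
        for tok, p in scores.items()
    }
    total = sum(weighted.values())
    # Renormalize so the adjusted scores still sum to 1.
    return {tok: w / total for tok, w in weighted.items()}

if __name__ == "__main__":
    fruit = {"mango": 0.4, "lychee": 0.3, "papaya": 0.2, "durian": 0.1}
    print(adjust_scores(fruit, key="secret", prev_token="is"))
```

Because the bit depends on a secret key, the shift looks like ordinary variation to anyone who does not hold the key.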

This process is repeated throughout the generated text, so a single sentence may contain ten or more adjusted probability scores, and a page may contain hundreds. The final pattern of scores, which combines the model's word choices with the adjusted probability scores, is considered the watermark.
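
A rough way to picture how such a pattern could later be checked is to recompute a keyed bit for every token and see how often it comes up 1. The sketch below assumes a generic keyed-bit scheme rather than SynthID's real detector, and `g_value` and `watermark_score` are hypothetical names.

```python
import hashlib

def g_value(key, prev_token, token):
    """Pseudorandom bit (0 or 1) derived from a secret key, the previous token,
    and the token itself. Hypothetical helper, not part of any SynthID API."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1

def watermark_score(tokens, key):
    """Fraction of tokens whose keyed bit is 1.

    Text written without the key should average about 0.5; text generated
    with scores biased toward bit-1 tokens should land noticeably higher.
    """
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0  # too short to say anything
    hits = sum(g_value(key, prev, tok) for prev, tok in pairs)
    return hits / len(pairs)

if __name__ == "__main__":
    text = "my favorite tropical fruit is mango".split()
    print(watermark_score(text, key="secret"))
```

A handful of tokens gives only weak evidence either way, which is why the number of adjusted scores in a sentence or a page matters: the longer the text, the more clearly a biased pattern stands out from chance.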
