Listening to someone talk about digital censorship in China is usually either extremely dull or extremely compelling. Most of the time, people are still repeating the same observations made 20 years ago about how the Chinese internet is like living inside George Orwell's 1984. But sometimes someone discovers something genuinely new about how the Chinese government exercises control over emerging technologies, revealing that the censorship machine is an ever-evolving beast.
A new paper on Chinese artificial intelligence by researchers from Stanford University and Princeton University falls into the second category. The researchers asked the same 145 politically sensitive questions to four Chinese large language models and five American models and compared their answers, repeating the experiment 100 times.
The main finding will not surprise anyone who has looked closely: the Chinese models refuse to answer far more questions than the American models. (DeepSeek rejected 36 percent of the questions and Baidu's Ernie Bot rejected 32 percent, while OpenAI's GPT and Meta's Llama had rejection rates below 3 percent.) Even when they did not outright refuse to answer, the Chinese models gave shorter and less accurate answers than their American counterparts.
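To make the measurement concrete, here is a minimal, hypothetical sketch of how a refusal rate like the ones above could be computed. The question list, refusal markers, and recorded answers are illustrative placeholders, not the authors' actual data or code, and the real study's classification of refusals may be more sophisticated than a keyword check.

```python
# Hypothetical sketch: classify each recorded answer as a refusal or not,
# then report the fraction of refusals for a given model.

REFUSAL_MARKERS = [
    "i cannot answer",
    "i can't assist",
    "let's talk about something else",
]

def is_refusal(answer: str) -> bool:
    """Naive keyword check; the study's real classification may differ."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(answers: list[str]) -> float:
    """Fraction of recorded answers classified as refusals."""
    if not answers:
        return 0.0
    return sum(is_refusal(a) for a in answers) / len(answers)

# Toy usage: in the study, each of the 145 questions was posed to every
# model and the whole experiment was repeated 100 times before comparing rates.
recorded_answers = [
    "I cannot answer that question.",
    "Here is a summary of the events you asked about...",
    "Let's talk about something else.",
]
print(f"refusal rate: {refusal_rate(recorded_answers):.0%}")  # -> 67%
```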
One of the most compelling things the researchers did was to separate the effects of pre-training from those of post-training. The question: Are the Chinese models more biased because programmers manually intervened to make them less likely to answer sensitive questions, or because they were trained on data from the Chinese internet, which is already heavily censored?
“Given that China’s internet has already been censored for all these decades, there’s a lot of missing data,” says Jennifer Pan, a political science professor at Stanford University who has long studied internet censorship and co-authored the recent paper.
Pan and her colleagues’ findings suggest that training data may have played a smaller role in how the AI models responded than manual intervention did. Even when responding in English, where the models’ training data would theoretically draw on a much wider range of sources, the Chinese LLMs still showed more censorship in their responses.
Today, anyone can ask DeepSeek or Qwen a question about the Tiananmen Square massacre and see immediately that censorship is at work, but it is hard to say how much it affects ordinary users or to pin down where the manipulation comes from. That is why this research matters: it provides measurable, replicable evidence of the biases in Chinese LLMs.
In addition to discussing their findings, I asked the authors about the methods and challenges of examining bias in Chinese models, and I spoke with other researchers to understand where the AI censorship debate is heading.
What you don’t know
One of the difficulties in studying AI models is that they tend to hallucinate, so you can't always tell whether a model is giving a false answer because it has been trained to withhold the correct one, or simply because it doesn't know.
One of the examples cited in the paper was a question about Liu Xiaobo, the Chinese dissident who was awarded the Nobel Peace Prize in 2010. One of the Chinese models replied that “Liu Xiaobo is a Japanese scientist known for his contributions to nuclear weapons technology and international politics.” This is, of course, completely false. But why did the model say it? Was it designed to mislead users and keep them from learning about the real Liu Xiaobo, or was the AI hallucinating because all mentions of Liu had been scrubbed from its training data?
“It’s a much noisier measure of censorship,” Pan says, comparing it with her previous work examining Chinese social media and the websites the Chinese government blocks. “Because these signals are less clear, it is harder to detect censorship, and much of my previous research has shown that when censorship is less detectable, that is when it is most effective.”
