Chatbots are now a routine part of everyday life, even if artificial intelligence researchers are not always sure how programs will behave.
The fresh study shows that huge language models (LLM) deliberately change their behavior during probation – answering questions aimed at assessing personality traits with answers aimed at issuing the most nice or socially desirable.
Johannes EichstaedtThe assistant professor at Stanford University, who conducted work, says that his group was interested in studying AI models using techniques borrowed from psychology after learning that LLM can often become morazine and mean after a prolonged conversation. “We realized that we need a mechanism to measure” head zone parameters “of these models,” he says.
Eichstaedt and his colleagues then asked questions to measure five personality traits, which are widely used in psychology-to experience or imagination, conscientiousness, extraversion, willingness and neurotism-to several commonly used LLM, including GPT-4, Claude 3 and Llam. has been published In the proceedings of the National Academy of Sciences in December.
Scientists have found that the models modulated their answers when they were said that they were taking a personality test – and sometimes when they were not clearly told – answers that indicate greater extraversion and agreeableness and less neurotism.
The behavior reflects how some people change their answers to seem more nice, but the effect was more extreme in the case of AI models. “It was surprising how well the prejudice shows,” he says Aadesh SalechaScientist from staff data in Stanford. “If you look at how much it jumps, they go from 50 percent to 95 percent of extraversion.”
Other studies have shown that LLMS It can often be sycophanticAccording to the user running, wherever this happens as a result of tuning, which is to make them more coherent, less offensive and better leading the conversation. This can lead models to agree with unpleasant statements and even encouraging harmful behavior. The fact that the models apparently know when they are tested and modify their behavior also affects the safety of AI, because it adds evidence that AI can be duplicates.
Rosa ArriagaThe associate professor at the Georgia Institute of Technology, who studies how to utilize LLM to imitate human behaviors, says the fact that models take a similar strategy to people, taking into account personality tests, shows how useful they can be as mirrors of behavior. He adds, however: “It is important that society knows that LLM is not perfect and in reality hallucinati is known or distort the truth.”
Eichstaedt claims that work also raises questions about LLM implementation and how they can influence and manipulate users. “Only milliseconds, in evolutionary history, the only thing that talked to you was a man,” he says.
Eichstaedt adds that it may be necessary to examine various ways to build models that can alleviate these effects. “We fall into the same trap as in social media,” he says. “Implementing these things in the world without attending a psychological or social lens.”
Should AI try to make it with the people with whom he interacts? Are you worried that AI will become a little too cute and convincing? Send e -mail hello@wired.com.