New research drawing on pragmatics and philosophy suggests ways to align conversational agents with human values
Language is an essential human trait and the primary means by which we communicate information, including thoughts, intentions, and feelings. Recent breakthroughs in AI research have led to the creation of conversational agents that can communicate with humans in nuanced ways. These agents are powered by large language models—computational systems trained on vast corpora of text to predict and generate language using advanced statistical techniques.
However, while language models such as InstructGPT, Gopher, and LaMDA have achieved record levels of performance on tasks such as translation, question answering, and reading comprehension, they have also been shown to exhibit a range of potential risks and failure modes. These include producing toxic or discriminatory language and false or misleading information [1, 2, 3].
These shortcomings limit the productive use of conversational agents in practical applications and highlight how they fall short of certain communicative ideals. To date, most approaches to conversational-agent alignment have focused on anticipating and mitigating the risk of harms [4].
Our new paper, In Conversation with AI: Aligning Language Models with Human Values, takes a different approach, examining what successful communication between a human and an artificial conversational agent might look like and what values should guide these interactions across different conversational domains.
Insights from pragmatics
To address these issues, the paper draws on pragmatics, a tradition in linguistics and philosophy which holds that the purpose of a conversation, its context, and the set of associated norms are essential elements of good conversational practice.
The linguist and philosopher Paul Grice, who viewed conversation as a collaborative endeavor between two or more parties, held that participants should:
- Speak informatively
- Tell the truth
- Provide relevant information
- Avoid vague or ambiguous statements
However, our paper shows that these maxims need further refinement before they can be used to evaluate conversational agents, given the variation in goals and values embodied across different conversational domains.
Discursive ideals
For example, scientific research and communication are primarily aimed at understanding or predicting empirical phenomena. Given these goals, a conversational agent designed to support scientific research would ideally make only statements whose truth is supported by sufficient empirical evidence or otherwise qualify its positions according to appropriate confidence intervals.
For example, an agent that states “At a distance of 4.246 light-years, Proxima Centauri is the closest star to Earth” should do so only after the model underlying it has verified that the statement is consistent with the evidence.
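The idea of asserting only evidence-backed claims, and hedging otherwise, can be sketched as a simple confidence gate. This is a hypothetical illustration, not the paper's method: `verify_against_evidence`, its hard-coded fact table, and the 0.9 threshold are all assumptions for the sake of the example.

```python
# Hypothetical sketch: gate an agent's factual assertions on an
# evidence check, qualifying the claim when confidence is low.

def verify_against_evidence(claim: str) -> float:
    """Toy stand-in for an evidence-retrieval step: return a
    confidence score in [0, 1] for the claim. A real system would
    query a curated corpus; here we hard-code one known fact."""
    known_facts = {
        "Proxima Centauri is the closest star to Earth": 0.97,
    }
    return known_facts.get(claim, 0.2)

def assert_claim(claim: str, threshold: float = 0.9) -> str:
    """State the claim outright only when the evidence score clears
    the threshold; otherwise qualify the statement."""
    confidence = verify_against_evidence(claim)
    if confidence >= threshold:
        return claim + "."
    return f"I am not certain, but possibly: {claim}."
```

In this sketch, a well-supported claim is asserted plainly, while an unverified one is explicitly hedged—mirroring the paper's point that scientific assertions should be backed by sufficient evidence or qualified accordingly.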
However, a conversational agent playing the role of moderator in public political discourse may need to demonstrate quite different virtues. In this context, the goal is primarily to manage differences and enable productive cooperation in community life. The agent will therefore need to foreground the democratic values of tolerance, civility, and respect [5].
Moreover, these values explain why the generation of toxic or hurtful speech by language models is often so problematic: offensive language fails to convey equal respect for the participants in the conversation, which is a key value for the context in which the models are deployed. At the same time, scientific virtues, such as comprehensive presentation of empirical data, may be less significant in the context of public debate.
Finally, in the domain of creative storytelling, communicative exchange aims at novelty and originality, values that again differ significantly from those above. In this context, greater latitude for pretense may be appropriate, although it remains important to protect communities from malicious content produced under the guise of “creative use.”
Paths ahead
This research has a number of practical implications for the development of aligned conversational agents. To begin with, such agents will need to embody different traits depending on the contexts in which they are deployed: there is no one-size-fits-all account of language-model alignment. Instead, the appropriate mode and standards of evaluation for an agent—including standards of truthfulness—will vary according to the context and purpose of the conversational exchange.
In addition, conversational agents may develop the capacity for more substantive and respectful conversation over time, through a process we call context construction and elucidation. Even when a person is not aware of the values that govern a given conversational practice, the agent can still help the human understand these values by prefiguring them in conversation, making the exchange deeper and more fruitful for the human speaker.