New Research Explores Aligning AI Conversational Agents with Human Values
A recent study has examined ways to align conversational agents with human values, drawing upon pragmatics and philosophy. Language plays a crucial role in human communication, allowing us to convey thoughts, intentions, and emotions. Advances in AI research have led to the development of conversational agents that can interact with humans in a nuanced manner. These agents are powered by large language models, such as InstructGPT, Gopher, and LaMDA: computational systems trained on extensive collections of text using statistical techniques.
However, these language models have shown potential risks and limitations. They can produce toxic or discriminatory language, as well as false or misleading information. These issues hinder the effective use of conversational agents in practical applications and highlight their deviation from certain communicative ideals. Previous approaches have focused on minimizing risks and harms.
By contrast, a new paper titled “In conversation with AI: aligning language models with human values” takes a different approach. It explores what successful communication between humans and artificial conversational agents could look like and defines the values that should guide these interactions across different conversational domains.
Insights from Pragmatics
The study incorporates pragmatics, a discipline in linguistics and philosophy which holds that the purpose of a conversation, its context, and a set of shared norms are vital elements of effective communication. According to the linguist and philosopher Paul Grice, participants in a conversation should:
- Provide informative responses
- Speak truthfully
- Share relevant information
- Avoid ambiguous or obscure statements
However, the paper suggests that these maxims need further refinement to evaluate conversational agents, considering the variations in goals and values across different conversation domains.
The research highlights how different conversational domains require conversational agents to embody distinct virtues. For scientific investigation and communication, where the goal is understanding or predicting empirical phenomena, conversational agents should rely on verifiable evidence or qualify their statements based on confidence intervals.
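As a concrete illustration of that scientific-domain virtue, here is a minimal sketch of how an agent might qualify an empirical claim with a confidence interval. The function name, the example claim, and the numbers are hypothetical, not taken from the paper; the interval uses the standard normal approximation for a proportion:

```python
import math

def qualify_statement(claim: str, p_hat: float, n: int, z: float = 1.96) -> str:
    """Attach a confidence interval to an empirical claim.

    p_hat: the agent's estimated probability that the claim holds.
    n: the (hypothetical) number of evidence samples behind the estimate.
    z: critical value; 1.96 gives an approximate 95% interval.
    """
    # Standard error of a proportion under the normal approximation.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    lo = max(0.0, p_hat - z * se)
    hi = min(1.0, p_hat + z * se)
    return (f"{claim} (estimated probability {p_hat:.0%}, "
            f"95% CI {lo:.0%}-{hi:.0%}, n={n})")

print(qualify_statement("It will rain tomorrow", 0.8, 50))
```

Rather than asserting the claim outright, the agent's output carries its evidential basis with it, which is one way of making the "verifiable evidence" norm operational.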
In public political discourse, a conversational agent playing the role of a moderator needs to prioritize democratic values such as tolerance, civility, and respect. Toxic or prejudiced speech generated by language models undermines these values by failing to demonstrate equal respect for all participants.
In creative storytelling, the aim is novelty and originality, requiring conversational agents to embrace more imaginative expression. However, safeguards should be in place to prevent malicious content disguised as “creative uses.”
This research has practical implications for the development of conversational AI agents. There is no one-size-fits-all approach to aligning language models. Instead, conversational agents should embody different traits depending on their specific contexts, purposes, and evaluative standards. Context construction and elucidation could also enable conversational agents to foster more robust and respectful conversations over time. By making the values that govern a conversation explicit up front, agents can help humans understand each other better and engage in more fruitful communication.