New Ways to Align Conversational Agents with Human Values
Language is how humans communicate and share information, and recent advances in AI have produced conversational agents that can communicate with us in sophisticated ways. These agents are powered by large language models, which are trained on large amounts of text to predict and generate language using statistical techniques.
However, while these models have achieved impressive results, they also come with potential risks. They can produce toxic or discriminatory language and false information. Most approaches to fixing these issues focus on reducing harm.
But a new paper, “In conversation with AI: aligning language models with human values,” takes a different approach. It explores what successful communication between humans and AI might look like, and what values should guide these interactions.
Lessons from Pragmatics
The paper draws on pragmatics, a tradition in linguistics and philosophy that focuses on the purpose of a conversation and the norms that guide it. According to this tradition, contributions to a conversation should be informative, truthful, and relevant, and should avoid ambiguous statements.
However, the paper argues that these guidelines need further refinement to evaluate conversational agents, as different conversational domains have different goals and values.
Ideals in Different Conversational Domains
For example, a conversational agent assisting scientific investigation should only make statements based on empirical evidence. In contrast, an agent moderating public political discourse would need to prioritize democratic values like toleration and respect. In the domain of creative storytelling, communicative exchange aims at novelty and originality.
Values such as toleration and respect also explain why the generation of toxic or prejudicial speech by language models is problematic: such speech fails to communicate equal respect for participants in the conversation.
This research has practical implications for developing conversational AI agents. Agents will need to embody different traits depending on the context in which they are deployed, and they may be able to cultivate more robust and respectful conversations over time.
By understanding and prefiguring these values in conversation, AI can make communication deeper and more fruitful for the human speaker.