Introducing GopherCite: Improving Language Model Trustworthiness
Large Language Models (LLMs) are gaining popularity due to their potential to enhance applications ranging from search engines to chatbot-style assistants. However, a series of DeepMind papers highlights the challenges these models raise. One such concern is that LLMs can generate plausible-sounding but false statements, leading users to believe falsehoods. To address this problem, DeepMind has developed GopherCite.
GopherCite tackles the issue of hallucinated facts in language models. It supports its factual claims by providing evidence from the web. The model utilizes Google Search to find relevant web pages and quotes passages to back up its responses. If there is insufficient evidence, GopherCite candidly admits, “I don’t know,” instead of providing unsubstantiated answers.
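The core idea — every claim should be backed by a verbatim quote from a retrieved source, and the model should abstain otherwise — can be illustrated with a minimal sketch. The function names and answer format below are illustrative assumptions, not DeepMind's actual implementation:

```python
# Sketch (illustrative only): a claim counts as supported only if its
# evidence quote appears verbatim in the cited source document.

def normalize(text: str) -> str:
    """Collapse whitespace so minor formatting differences don't matter."""
    return " ".join(text.split())

def is_supported(quote: str, source_document: str) -> bool:
    """A quote supports a claim only if it is a verbatim excerpt."""
    return normalize(quote) in normalize(source_document)

def answer_with_evidence(claim: str, quote: str, source_document: str) -> str:
    """Return the claim with its supporting quote, or abstain."""
    if is_supported(quote, source_document):
        return f'{claim}\n\nSupporting quote: "{quote}"'
    return "I don't know."

doc = "The report states that the bridge opened to traffic in 1937."
print(answer_with_evidence(
    "The bridge opened in 1937.",
    "the bridge opened to traffic in 1937",
    doc,
))
```

The key property is that the evidence check is mechanical: a reader (or an automated verifier) can confirm the quote exists in the source without trusting the model itself.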
By supporting simple factual claims with verifiable evidence, GopherCite aims to enhance the trustworthiness of language models for both users and evaluators. A comparison between Gopher and GopherCite showcases the positive impact of this change. While Gopher invents facts without proper verification, GopherCite presents evidence to correct such falsehoods.
DeepMind built GopherCite by training Gopher according to human preferences. Participants in a user study selected their preferred answer from a pair of candidates, judging in part how well each answer was supported by evidence. These preferences served as training data for both supervised learning and reinforcement learning. This approach mirrors recent work by Google and OpenAI on improving factual accuracy.
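Such pairwise preference data is commonly used to train a reward model with a Bradley–Terry-style objective: the model should score the human-preferred answer above the rejected one. The sketch below shows that loss on hypothetical scalar scores; it is a generic illustration of preference learning, not the paper's specific architecture:

```python
import math

# Pairwise preference loss, as used in reward modelling from human
# feedback: penalize the model when it ranks the rejected answer
# above the human-preferred one.

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood that the preferred answer wins the comparison."""
    return -math.log(sigmoid(score_preferred - score_rejected))

# The loss shrinks as the reward model separates the pair correctly.
print(preference_loss(2.0, 0.0))  # correct ranking: small loss
print(preference_loss(0.0, 2.0))  # wrong ranking: large loss
```

In practice the scores would come from a learned model scoring (question, answer, evidence) triples, and this loss would be minimized over many human comparisons.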
To evaluate GopherCite, DeepMind conducted a user study with paid participants. The model's answers were judged high quality 80% of the time on fact-seeking questions and 67% of the time on explanation-seeking questions. When GopherCite abstained from the questions it was least confident about, accuracy on the questions it did answer improved markedly. This mechanism for abstaining from answers is a significant contribution of the work.
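Abstention of this kind can be viewed as selective prediction: answer only when a confidence score clears a threshold, trading coverage for accuracy. A minimal sketch, where the confidence scores and threshold are illustrative rather than taken from the paper:

```python
# Sketch of selective answering: decline low-confidence questions so
# that accuracy on the questions actually answered improves.

def selective_answer(answer: str, confidence: float, threshold: float = 0.5) -> str:
    """Return the answer only if the confidence score clears the threshold."""
    return answer if confidence >= threshold else "I don't know."

# Hypothetical (answer, confidence) pairs from a scoring model.
candidates = [
    ("Paris is the capital of France.", 0.92),
    ("The moon is made of cheese.", 0.11),
]
for answer, conf in candidates:
    print(selective_answer(answer, conf))
# prints "Paris is the capital of France." then "I don't know."
```

Raising the threshold answers fewer questions but, if the confidence score is well calibrated, a larger fraction of those answers will be correct.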
However, GopherCite does have limitations. It struggles when faced with “adversarial” questions designed to trick the model into providing false information. DeepMind believes this issue can be addressed by enabling models to engage in dialogue with users and ask clarifying questions.
In conclusion, GopherCite represents a meaningful step toward more trustworthy language models. While evidence citation is important, it is not the sole solution: DeepMind notes that some claims require logical arguments or multiple pieces of evidence rather than a single quote. The team plans to continue this research, prioritizing safety and trustworthiness.
For more in-depth information, refer to DeepMind’s paper, which covers methods, experiments, and relevant context in the research literature. Additionally, an FAQ about GopherCite, answered by the model itself, is available to provide further insights.