Assessing Free-Text Justifications: A New Framework for Evaluating Model Explanations

Model explanations play a crucial role in building trust and interpretability in Natural Language Processing (NLP). Free-text rationales have gained popularity because they provide human-like explanations for model predictions. However, existing evaluation metrics focus mainly on predictive accuracy, that is, how well a rationale supports the label, and fail to measure how much additional information the rationale contributes. In this article, we discuss a new automatic evaluation framework called REV2 that assesses free-text justifications based on whether they support the intended label and how much additional information they offer.

Introducing the REV2 Evaluation Framework:
REV2 evaluates free-text justifications along two dimensions: (1) whether the justification supports the intended label and (2) how much new information it adds beyond what the input already provides. Existing metrics cannot distinguish between rationales that carry different amounts of new, label-relevant information. For example, two rationales may differ widely in how much useful information they contain, yet existing metrics would score them the same.

How REV2 Works:
The REV2 framework is based on conditional V-information, which measures how much information a representation contains beyond a baseline representation. The baseline here is a vacuous justification, one that simply pairs an input with a label without adding any new information. REV2 compares two representations: one from an evaluation model trained to produce the label given the input and the rationale, and the other from a model that sees only the input, without any justification.
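To make the comparison concrete, here is a minimal Python sketch of a pointwise score in the spirit of conditional V-information: the log-probability of the intended label with the rationale, minus the log-probability of the same label under the baseline. This is an illustration under stated assumptions, not the authors' implementation. The function names `rev_score` and `mean_rev` are hypothetical, and the probabilities are toy values that would in practice come from the two trained evaluation models.

```python
import math
from typing import Iterable, Tuple


def rev_score(p_with_rationale: float, p_baseline: float) -> float:
    """Pointwise information gain, in bits, from adding a rationale.

    Both arguments are HYPOTHETICAL inputs: the probability that each
    evaluation model assigns to the intended label. Real values would
    come from trained models, which this sketch does not include.
    """
    for p in (p_with_rationale, p_baseline):
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
    # Log-likelihood of the label with the rationale, minus the baseline.
    return math.log2(p_with_rationale) - math.log2(p_baseline)


def mean_rev(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the pointwise scores over a collection of examples."""
    scores = [rev_score(p_r, p_b) for p_r, p_b in pairs]
    if not scores:
        raise ValueError("need at least one example")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy numbers, invented for illustration only.
    print(rev_score(0.8, 0.4))  # rationale helps: positive score
    print(rev_score(0.4, 0.4))  # no new information: zero
    print(rev_score(0.2, 0.4))  # rationale hurts: negative score
```

Under this sketch, a rationale that doubles the probability of the intended label scores about 1 bit, a rationale that changes nothing scores 0, and a rationale that lowers the probability scores below 0.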

Assessing Fresh and Label-Relevant Information:
REV2 addresses the limitations of other metrics by accounting for vacuous justifications. The framework is evaluated on two reasoning tasks, commonsense question answering and natural language inference, across four benchmarks. Quantitative assessments show that REV2 produces ratings that align more closely with human judgments than existing metrics do.

Sensitivity to Input Disturbances:
REV2’s evaluation also sheds light on how input perturbations affect rationale quality. It helps explain why the rationales produced by chain-of-thought prompting do not always improve prediction performance. Across different levels of input perturbation, REV2's scores change accordingly, which shows that the metric is sensitive to the quality of the justifications it evaluates.

The REV2 evaluation framework offers a comprehensive assessment of free-text justifications in AI models. By considering both the support for the intended label and the additional information provided, REV2 produces ratings that align more closely with human judgments. It also highlights the impact of input perturbations on rationale quality. Further details and the implementation can be found in the paper and GitHub repository linked below.

1. Paper: [Link to Paper]
2. GitHub: [Link to GitHub]