Formal Specifications: Are They Really Interpretable?
Autonomous systems and artificial intelligence (AI) are becoming more prevalent in our daily lives, and new methods are emerging to ensure that these systems behave as expected. One such approach, formal specifications, uses mathematical formulas that can be translated into natural-language expressions. Some researchers believe that this approach can spell out an AI’s decisions in a way that humans can understand.
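To make this concrete, here is an illustrative example of what such a specification can look like; it is a generic temporal-logic formula, not one drawn from the study itself:

```latex
% Illustrative only: a simple temporal-logic specification.
% G = "globally" (at every time step), F = "finally" (at some future step).
\[
  \mathbf{G}\,\bigl(\mathit{request} \rightarrow \mathbf{F}\,\mathit{response}\bigr)
\]
% Natural-language reading: "Every request is eventually followed by a response."
```

The claim under test is that such formulas, especially once rendered in natural language, let people check whether a system’s plan actually meets its requirements.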
However, a recent study by MIT Lincoln Laboratory researchers suggests otherwise. The study aimed to test the interpretability of formal specifications by asking participants to validate an AI agent’s plan in a virtual game. Surprisingly, participants were correct less than half of the time when presented with the formal specification of the plan.
Hosea Siu, a researcher in the laboratory’s AI Technology Group, says, “The results are bad news for researchers who claim that formal methods make systems interpretable. While it may be true in an abstract sense, it is not practical for system validation.”
The Importance of Interpretability
Interpretability is crucial for trusting machines in real-world applications. If a robot or AI can explain its actions, humans can assess whether adjustments are necessary or whether it can be trusted to make fair decisions. Moreover, interpretability enables users, not just developers, to understand and trust the capabilities of technology. Unfortunately, interpretability has long been a challenge in the AI and autonomy field: the “black box” nature of machine learning makes it difficult to explain the decisions a system makes.
Siu remarks, “We scrutinize claims of accuracy in machine learning systems by asking for evidence. We need to apply the same scrutiny to claims of interpretability.”
Lost in Translation
To test the interpretability of formal specifications, the researchers asked participants, both experts and non-experts in formal methods, to validate a robot’s behaviors in a game of capture the flag. Participants were given the specifications in three formats: a raw logical formula, a translation of that formula into natural language, and a decision-tree format, which the AI community often regards as especially human-interpretable.
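As a rough illustration of the validation task, consider one hypothetical capture-the-flag requirement shown both as a raw formula and in natural language, together with a minimal checker over a finite plan trace. The rule, the predicate names, and the checker are assumptions for illustration, not the study’s actual specifications or tooling:

```python
# Hypothetical requirement, in the study's first two presentation formats:
#   Raw formula (LTL-style):  G(enemy_near -> X retreat)
#   Natural language:         "Whenever an enemy is near, the robot
#                              retreats on the very next step."
# The checker below evaluates this rule over a finite trace of observations.

def satisfies(trace):
    """Return True if G(enemy_near -> X retreat) holds on a finite trace."""
    for t in range(len(trace) - 1):
        if trace[t]["enemy_near"] and not trace[t + 1]["retreat"]:
            return False  # enemy seen, but no retreat on the next step
    return True

# A candidate plan trace: the robot spots an enemy at step 1 and retreats at step 2.
plan = [
    {"enemy_near": False, "retreat": False},
    {"enemy_near": True,  "retreat": False},
    {"enemy_near": False, "retreat": True},
]
print(satisfies(plan))  # True: every enemy sighting is followed by a retreat
```

Participants in the study faced the human analogue of this check: given a plan and a specification in one of the three formats, decide whether the plan satisfies it.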
The results were disappointing across all three presentation types, with only around 45% accuracy. Interestingly, those trained in formal specifications did slightly better but were overconfident in their answers, regardless of whether they were correct. This combination of low accuracy and high confidence poses a significant problem for system validation, because failure modes are more likely to be overlooked.
Siu suggests, “We shouldn’t abandon formal specifications as a way to explain system behaviors. However, we need to put more effort into designing how they are presented to people and how they are used.”
Implications for the Future
The researchers acknowledge that their results may not directly reflect real-world robot validation performance. Instead, they aim to use these results to understand what the formal logic community may be missing in terms of interpretability and how these claims can be improved for practical use.
This study is part of a larger project that focuses on improving the relationship between robots and human operators. By allowing operators to directly teach robots tasks, this project aims to enhance interpretability and trust. Ultimately, the researchers hope that their findings will contribute to the better application of autonomy in human decision-making.
Siu concludes, “Our results emphasize the need for human evaluations of autonomy and AI systems before making grand claims about their utility with humans.”