Standard AI Safety Benchmarks: Advancing Responsible AI Development
The development of AI safety benchmarks is crucial for ensuring that AI systems are developed and deployed responsibly. Standard benchmarks already exist in various fields, from measuring product qualities to evaluating safety in the automotive industry. However, there are currently no standard benchmarks for AI safety. To address this gap, the non-profit MLCommons Association is spearheading an effort to develop these benchmarks with the help of expert researchers from academia and industry.
Why are AI safety benchmarks important? While AI technology has the potential for immense benefits, such as improving healthcare diagnostics and energy usage analysis, it also carries risks if not properly managed. AI systems can be misused, respond in biased or offensive ways, or support malicious activities. Standard AI safety benchmarks can help mitigate these risks by providing measures of safety across various categories, such as harmful use and AI-control risks. These benchmarks will enable society to reap the benefits of AI while ensuring that appropriate precautions are taken.
What are standard AI safety benchmarks? Existing AI safety tests focus on providing prompts to AI systems and algorithmically scoring their responses. However, these tests have limitations, as they often rely on open datasets that may have unintentionally biased training data. MLCommons proposes a multi-stakeholder process for selecting and grouping tests into subsets to measure safety for specific AI use-cases. The goal is to translate the technical results of these tests into scores that are understandable by everyone. MLCommons aims to create a platform that brings together existing tests and encourages the development of more rigorous tests to advance the state of AI safety.
The importance of collective effort: Developing mature and trusted AI safety benchmarks requires the involvement of the entire AI community. Responsible AI developers already use various safety measures, but determining if sufficient precautions have been taken can be challenging as the AI systems landscape expands. Standard AI benchmarks can help vendors and users measure AI safety and promote an ecosystem focused on improving AI safety. To accomplish this, researchers, engineers, companies, and stakeholders from different backgrounds must collaborate and provide innovative improvements to safety testing technology. This collective effort will ensure that multiple perspectives are incorporated, including those of public advocates, policy makers, academics, engineers, and business leaders.
Google’s support for MLCommons: Google is committed to the safe, secure, and trustworthy development and use of AI. As part of this commitment, Google supports the MLCommons Association’s efforts to develop AI safety benchmarks in the following ways:
1. Funding for the testing platform: Google, along with other companies, provides financial support to develop a testing platform specifically for AI safety benchmarks.
2. Technical expertise and resources: Google contributes technical expertise and resources, such as the Monk Skin Tone Examples Dataset, to ensure the benchmarks are well-designed and effective.
3. Datasets: Google shares internal datasets for multilingual representational bias and externalized tests for stereotyping harms, such as SeeGULL and SPICE. Additionally, datasets focused on collecting human annotations responsibly and inclusively, like DICES and SRP, are made available.
The future of AI safety benchmarks: These benchmarks will greatly contribute to the advancement of AI safety research and the responsible development of AI systems. AI safety is a collective-action problem, and other organizations, such as the Frontier Model Forum and Partnership on AI, are also leading standardization initiatives. Google looks forward to continued collaboration and collective efforts to promote responsible development and use of new generative AI tools.
Acknowledgements: The authors would like to thank the Google team, including Peter Mattson, Lora Aroyo, Chris Welty, Kathy Meier-Hellstern, Parker Barnes, Tulsee Doshi, Manvinder Singh, Brian Goldman, Nitesh Goyal, Alice Friend, Nicole Delange, Kerry Barker, Madeleine Elish, Shruti Sheth, Dawn Bloxwich, William Isaac, and Christina Butterfield, for their contributions to this work.