Introducing a Comprehensive Framework for Evaluating the Social and Ethical Risks of AI Systems
Generative AI systems are increasingly capable and are being applied across many fields, from writing books to creating designs. To ensure they are developed and deployed responsibly, it is crucial to evaluate the ethical and social risks they pose.
In our paper, we propose a three-layered framework for evaluating the social and ethical risks of AI systems. This framework includes assessing AI system capability, human interaction, and systemic impacts.
Current safety evaluations have three main gaps: context, specific risks, and multimodality. To address these gaps, we suggest repurposing existing evaluation methods and taking a comprehensive approach, as demonstrated in our case study on misinformation. This approach considers factors such as how likely the AI system is to provide inaccurate information and how people use the system in different contexts. By conducting evaluations at multiple layers, we can assess whether a harm such as misinformation actually occurs and how it spreads.
For any technology to work as intended, both its social and technical challenges must be addressed. It is therefore important to consider different layers of context when assessing the safety of AI systems. Building on previous research, we identify potential risks associated with large-scale language models, such as privacy breaches and job automation, and our framework offers a way to evaluate these risks comprehensively.
Context plays a critical role in evaluating AI risks. The capabilities of an AI system indicate the potential risks it may pose. For instance, systems that produce inaccurate or misleading outputs are more likely to contribute to misinformation, which can erode public trust. Evaluating these capabilities is essential, but it is equally important to consider the context in which the system is used. Who uses the system, and for what purpose? Does the system function as intended? Answering these questions gives a more complete picture of the system's safety.
Apart from capability evaluation, our framework includes evaluating human interaction and systemic impact. Human interaction evaluation focuses on how people use AI systems and examines whether the system performs as intended. It also considers the experiences of different user groups and identifies any unexpected side effects. Systemic impact evaluation looks at how AI systems are embedded in broader structures like social institutions and labor markets. Evaluating the impact on these structures can reveal risks that only become apparent when the AI system is widely deployed.
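The three layers described above can be pictured as a simple checklist structure. The following sketch is purely illustrative: the paper defines the layers conceptually, and the class, field names, and example questions here are hypothetical choices made for this example.

```python
# Illustrative sketch only: the three-layered framework is described
# conceptually in the paper; this data structure is a hypothetical rendering.
from dataclasses import dataclass, field


@dataclass
class EvaluationLayer:
    name: str
    question: str                                   # what this layer asks
    example_checks: list[str] = field(default_factory=list)


FRAMEWORK = [
    EvaluationLayer(
        "capability",
        "What is the system technically able to do?",
        ["Does it produce inaccurate or misleading outputs?"],
    ),
    EvaluationLayer(
        "human interaction",
        "How do people actually use the system, and does it behave as intended?",
        [
            "Who uses the system, and for what purpose?",
            "Are there unexpected side effects across different user groups?",
        ],
    ),
    EvaluationLayer(
        "systemic impact",
        "What happens when the system is embedded in broader structures?",
        ["Effects on social institutions and labor markets at wide deployment"],
    ),
]

for layer in FRAMEWORK:
    print(f"{layer.name}: {layer.question}")
```

Each layer would in practice be filled in by the actors best positioned to evaluate it, which is the division of responsibility discussed below.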
Ensuring the safety of AI systems is a shared responsibility. AI developers, application developers, public authorities, and broader stakeholders all play a role in evaluating and mitigating risks. Each layer of evaluation in our framework requires input from different actors, depending on who is best positioned to perform evaluations.
We identified three main gaps in safety evaluations of generative multimodal AI. First, current assessments lack context: evaluations mostly focus on system capabilities, neglecting human interaction and systemic impact. Second, more risk-specific evaluations are needed to cover a wider range of potential harms. Third, evaluations largely overlook non-text outputs such as images, audio, and video, which are becoming increasingly important.
To address these gaps, we’re compiling a list of safety evaluation publications for generative AI systems. This repository will provide accessible resources for researchers. We encourage contributions to this list of evaluations to facilitate the development of a comprehensive evaluation ecosystem.
To put comprehensive evaluations into practice, we can repurpose existing methods and leverage large AI models. Additionally, we need to develop approaches for evaluating human interaction and systemic impact. By adapting existing evaluation methods and collaborating with different stakeholders, we can create a robust evaluation framework for safe AI systems.