Title: Addressing Biased Data in AI-Assisted Healthcare: A Sociotechnical Approach
In a recent opinion piece published in the New England Journal of Medicine (NEJM), computer science and bioethics professors from MIT, Johns Hopkins University, and the Alan Turing Institute argue that a sociotechnical perspective is needed to address biased medical data in artificial intelligence (AI) models. The White House Office of Science and Technology Policy (OSTP) identified this issue as a key concern in its recent Blueprint for an AI Bill of Rights. To address bias in public health effectively, the authors argue, a technical approach must be complemented by an understanding of historical and current social factors.
Understanding Biased Data:
Biased clinical data should be seen as “artifacts” that reveal societal practices, belief systems, and cultural values. For example, a widely used algorithm in the healthcare industry concluded that sicker Black patients required the same level of care as healthier white patients. Because the algorithm failed to account for unequal access to healthcare, its output amounted to algorithmic discrimination. Rather than treating biased datasets or missing data as problems to be fixed or discarded, the authors advocate an “artifacts” approach that raises awareness of both the social and historical factors that influence data collection and alternative approaches to AI development in healthcare.
The Role of a Sociotechnical Perspective:
Engaging bioethicists or clinicians early, at the problem formulation stage, is crucial when developing models for deployment in clinical settings. Computer scientists often lack a complete understanding of the social and historical factors that shape the data they use. Development teams also need the expertise to recognize when existing models may not work well for specific subgroups. Additionally, researchers must be prepared to investigate race-based correction as part of the research process.
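One way to make "recognizing when existing models may not work well for specific subgroups" concrete is to report a metric per subgroup rather than only in aggregate. The following is a minimal sketch, not a method taken from the NEJM piece: the function name `recall_by_group`, the group labels "A" and "B", and the synthetic labels are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Return the true positive rate (recall) for each subgroup.

    y_true, y_pred: sequences of 0/1 labels (1 = patient needs extra care).
    groups: sequence of subgroup identifiers, one per patient.
    Subgroups with no positive cases are omitted from the result.
    """
    true_pos = defaultdict(int)    # positives the model correctly flagged
    actual_pos = defaultdict(int)  # all patients who were truly positive
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            actual_pos[group] += 1
            if pred == 1:
                true_pos[group] += 1
    return {g: true_pos[g] / actual_pos[g] for g in actual_pos}


# Synthetic, illustrative labels only (not real patient data).
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
groups = ["A"] * 5 + ["B"] * 5

rates = recall_by_group(y_true, y_pred, groups)
for group, rate in sorted(rates.items()):
    print(f"group {group}: recall = {rate:.2f}")
```

In this toy data the overall recall is 0.50, which hides the fact that the model finds most high-need patients in one group and few in the other. A gap like this does not explain itself; interpreting it still requires the social and historical context the authors describe.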
Potential Pitfalls and Risks:
Including self-reported race in clinical risk scores, on the assumption that it improves the performance of machine learning models, can actually result in worse risk scores and metrics for minority populations. There is no one-size-fits-all solution: self-reported race is a social construct, and its use requires scrutiny. Any solution should be evidence-based.
Biased datasets should not be accepted, but high-quality training data remains essential for developing safe, high-performing AI models in healthcare. The National Institutes of Health (NIH) plays a crucial role in driving ethical practices and has prioritized the collection of ethically sourced datasets. Emphasizing local context and understanding the historical and contemporary factors that shape datasets can lead to new policies and structures that eliminate bias.
Addressing biased data in AI-assisted healthcare requires a sociotechnical approach that goes beyond technical solutions. By viewing biased data as informative artifacts and considering social and historical factors, researchers can identify discriminatory practices that may not be immediately apparent. This approach paves the way for meaningful health outcomes and helps ensure that AI in healthcare benefits all patient populations. It is a step toward progress rather than a replication of existing poor practices.