In a recent study presented at the International Conference on Machine Learning (ICML), a team of researchers from MIT analyzed the biases that can arise in machine learning models. They focused on the phenomenon of “subpopulation shifts,” in which a model performs differently for different subgroups of the data. The researchers identified four types of shift that can bias a model: spurious correlations, attribute imbalance, class imbalance, and attribute generalization.
To illustrate the concept, the researchers used the example of sorting images of animals into two classes: cows and camels. They explained that biases can arise from the class (the animal itself) or from an attribute (such as the background). For example, if all the training images showed cows on grass and camels on sand, the model may learn to use the background as a shortcut and misclassify a cow photographed on sand. This is a spurious correlation: an attribute that happens to co-occur with a class, rather than the class itself, ends up driving the prediction.
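To make this concrete, here is a minimal sketch in Python, using invented synthetic features rather than the study's image data, of how a classifier latches onto a spuriously correlated “background” feature and fails once that correlation no longer holds:

```python
# A minimal sketch (not the authors' code) of a spurious correlation.
# The "shape" and "background" features are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, background_matches_label):
    # label: 0 = cow, 1 = camel
    y = rng.integers(0, 2, size=n)
    # "animal shape" feature: weakly informative about the true class
    shape = y + rng.normal(0, 1.5, size=n)
    if background_matches_label:
        background = y.astype(float)  # cows always on grass, camels on sand
    else:
        background = rng.integers(0, 2, size=n).astype(float)  # random backgrounds
    X = np.column_stack([shape, background])
    return X, y

X_train, y_train = make_data(2000, background_matches_label=True)
X_test, y_test = make_data(2000, background_matches_label=False)

clf = LogisticRegression().fit(X_train, y_train)
print("train-like accuracy:", clf.score(*make_data(2000, True)))  # high: background is a shortcut
print("shifted accuracy:   ", clf.score(X_test, y_test))          # drops: shortcut no longer holds
```

On the training distribution the background perfectly predicts the label, so accuracy is high; on the shifted data the shortcut breaks and accuracy falls toward what the weakly informative shape feature alone can support.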
In the context of medical diagnosis, the researchers highlighted the importance of avoiding such biases. They gave the example of using chest X-ray images to diagnose pneumonia. If the dataset has an attribute imbalance (for instance, far more X-rays from men than from women) or a class imbalance (far more healthy cases than sick ones), the model may perform better for certain groups of people, such as men, or better on healthy cases than on sick ones.
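One practical response is to audit performance group by group rather than relying on a single overall accuracy figure. The sketch below, built on entirely hypothetical numbers, shows how per-group accuracy can expose gaps that an aggregate figure hides:

```python
# A sketch (illustrative, not from the study) of auditing per-group
# performance, with groups defined by an attribute (sex) and a class
# (healthy vs. pneumonia). All data here is hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
sex = rng.choice(["male", "female"], size=n, p=[0.8, 0.2])           # attribute imbalance
label = rng.choice(["healthy", "pneumonia"], size=n, p=[0.9, 0.1])   # class imbalance

# Pretend predictions that are worse on the under-represented groups.
error_rate = 0.05 + 0.10 * (sex == "female") + 0.15 * (label == "pneumonia")
correct = rng.random(n) > error_rate

for s in ("male", "female"):
    for c in ("healthy", "pneumonia"):
        mask = (sex == s) & (label == c)
        print(f"{s:>6} / {c:<9} n={mask.sum():5d} accuracy={correct[mask].mean():.3f}")
```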
The researchers tested 20 state-of-the-art algorithms on a range of datasets to study their performance across different population subgroups. They found that retraining the classifier layer (the final layer of the neural network) could reduce certain types of bias, but attribute generalization remained a challenge.
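As an illustration of the general technique (not the authors' exact procedure), here is a minimal PyTorch sketch of classifier-layer retraining: the pretrained feature extractor is frozen, and only a fresh final layer is fit, here on a hypothetical group-balanced data loader:

```python
# A sketch of last-layer (classifier) retraining. The backbone, the
# two-class task, and balanced_loader are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None)      # stand-in for a pretrained backbone
for p in model.parameters():
    p.requires_grad = False         # freeze the feature extractor
model.fc = nn.Linear(model.fc.in_features, 2)  # fresh, trainable classifier head

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def retrain_head(balanced_loader, epochs=5):
    """balanced_loader is assumed to yield (image, label) batches sampled so
    that every (class, attribute) group appears about equally often."""
    model.eval()  # keep batch-norm statistics frozen along with the weights
    for _ in range(epochs):
        for x, y in balanced_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)  # gradients flow only into model.fc
            loss.backward()
            optimizer.step()
```

Because only the small final layer is updated, this kind of retraining is cheap compared with retraining the whole network.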
The researchers also discussed how model performance is evaluated in terms of fairness. They observed that improving worst-group accuracy (the accuracy on the subgroup where the model performs worst) can come at the cost of worst-case precision (the lowest precision across the predicted classes). They emphasized the importance of balancing accuracy and precision in classification tasks, especially in medical diagnostics, where a drop in precision means more false positive diagnoses.
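The two metrics are easy to compute side by side. The following sketch, on hypothetical labels and predictions, shows how each is defined:

```python
# A sketch (hypothetical data) of the two metrics in tension:
# worst-group accuracy vs. worst-case (per-class) precision.
import numpy as np

rng = np.random.default_rng(2)
groups = rng.integers(0, 4, size=5000)   # e.g. sex x class combinations
y_true = rng.integers(0, 2, size=5000)
y_pred = np.where(rng.random(5000) < 0.85, y_true, 1 - y_true)

# Worst-group accuracy: accuracy on the group the model handles worst.
wga = min((y_pred[groups == g] == y_true[groups == g]).mean() for g in range(4))

# Worst-case precision: the lowest precision over the predicted classes.
def precision(cls):
    predicted = y_pred == cls
    return (y_true[predicted] == cls).mean() if predicted.any() else float("nan")

wcp = min(precision(c) for c in (0, 1))
print(f"worst-group accuracy: {wga:.3f}")
print(f"worst-case precision: {wcp:.3f}")
```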
The MIT team is currently working on a study with a medical center to examine whether machine learning models can work in an unbiased manner for all populations. They acknowledged the disparities that still exist across different age groups, genders, ethnicities, and intersectional groups.
The ultimate goal of the researchers is to achieve fairness in healthcare for all populations. However, they recognize that reforming the current system will not be easy, and they are committed to better understanding the sources of unfairness and finding ways to address them.
The research conducted by the MIT team is funded by the MIT-IBM Watson AI Lab.