Machine learning models can fail when they try to make predictions for individuals who were underrepresented in the datasets they were trained on.
For instance, a model that predicts the best treatment option for someone with a chronic disease may be trained using a dataset that contains mostly male patients. When that model is deployed in a hospital, it could make incorrect predictions for female patients.
To improve outcomes, engineers can try to balance the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, which hurts the model's overall performance.
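For reference, conventional balancing can be as simple as downsampling every subgroup to the size of the smallest one. The sketch below illustrates that baseline approach (not the new technique), assuming hypothetical numpy arrays holding the training data, labels, and subgroup labels.

```python
import numpy as np

def balance_by_downsampling(X, y, groups, seed=0):
    """Downsample every subgroup to the size of the smallest one.

    X, y, and groups are parallel arrays; `groups` holds a subgroup
    label (e.g., patient sex) for each training example.
    """
    rng = np.random.default_rng(seed)
    group_ids, counts = np.unique(groups, return_counts=True)
    target = counts.min()  # every subgroup is cut down to this size
    keep = np.concatenate([
        rng.choice(np.flatnonzero(groups == g), size=target, replace=False)
        for g in group_ids
    ])
    return X[keep], y[keep], groups[keep]  # much of the majority-group data is discarded
```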
MIT researchers have developed a new technique to identify and remove specific points in a training dataset that contribute most to model failure in underrepresented subgroups. By removing far fewer data points than other approaches, this technique improves performance with respect to underrepresented groups while maintaining the overall accuracy of the model.
Additionally, the technique can identify hidden sources of bias in a training dataset that lacks labels. In many applications, unlabeled data are far more prevalent than labeled data.
This method could also be combined with other approaches to improve the fairness of machine learning models deployed in high-stakes situations. For example, it may one day help ensure that underrepresented patients are not misdiagnosed due to a biased AI model.
“Many other algorithms that attempt to address this problem assume that each data point is as important as every other data point. In this paper, we show that this assumption is not true. There are certain points in our dataset that are contributing to this bias, and we can find and remove those data points to improve performance,” says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and lead author of a paper on this technique.
She co-authored the paper with co-lead authors Saachi Jain PhD ’24 and fellow EECS graduate student Kristian Georgiev; Andrew Ilyas MEng ’18, PhD ’23, a Stein Fellow at Stanford University; and senior authors Marzyeh Ghassemi, an EECS associate professor and member of the Institute of Medical Engineering Sciences and the Laboratory for Information and Decision Systems, and Aleksander Madry, the Cadence Design Systems Professor at MIT. The research will be presented at the Conference on Neural Information Processing Systems.
Removing bad examples
Machine learning models are often trained using large datasets collected from many sources on the internet. These datasets are too large to carefully curate manually and may contain bad examples that hurt model performance.
Scientists also know that some data points have a greater impact on a model’s performance on certain downstream tasks than others.
The MIT researchers combined these two ideas into an approach that identifies and removes problematic data points. They seek to solve a problem known as worst-group error, which occurs when a model underperforms on minority subgroups in a training dataset.
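As a rough illustration of the quantity being targeted, a minimal sketch of worst-group error is shown below, assuming arrays of true labels, model predictions, and subgroup labels; the function name is illustrative.

```python
import numpy as np

def worst_group_error(y_true, y_pred, groups):
    """Error rate of the subgroup the model serves worst.

    `groups` assigns each evaluation example to a subgroup
    (e.g., male or female patients); the metric is the maximum
    error rate over all subgroups.
    """
    return max(
        np.mean(y_pred[groups == g] != y_true[groups == g])
        for g in np.unique(groups)
    )
```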
The researchers’ new method builds on previous work that introduced a technique called TRAK that identifies the most important training samples for a given model output.
This new method takes the incorrect predictions that the model makes about minority subgroups and uses TRAK to identify which training samples contribute the most to those incorrect predictions.
“By aggregating this information across bad test predictions in a sensible way, we can find the specific parts of the training data that are dragging down accuracy on the worst-performing group,” Ilyas explains.
The researchers then remove those specific samples and retrain the model on the remaining data.
Since more data usually yields better overall performance, removing only the samples that drive worst-group failures maintains the model's overall accuracy while boosting its performance on underrepresented subgroups.
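A rough sketch of that pipeline is below. It assumes a precomputed attribution matrix, such as one produced by a data-attribution method like TRAK, along with validation-set predictions and subgroup labels; the simple summation used to aggregate the scores, and all function and variable names, are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def pick_examples_to_remove(scores, val_wrong, val_groups, worst_group, k):
    """Rank training examples by how much they drive worst-group mistakes.

    scores      : (n_train, n_val) attribution matrix, e.g., from a data-
                  attribution method such as TRAK (computed elsewhere);
                  scores[i, j] estimates how much training example i pushed
                  the model toward its prediction on validation example j.
    val_wrong   : boolean array, True where the model's prediction was wrong.
    val_groups  : subgroup label for each validation example.
    worst_group : subgroup with the highest error rate.
    k           : number of training examples to drop.
    """
    bad = val_wrong & (val_groups == worst_group)   # mistakes on the worst group
    influence = scores[:, bad].sum(axis=1)          # aggregate over those mistakes
    return np.argsort(influence)[-k:]               # most responsible training examples

# Drop those examples and retrain on what remains (hypothetical names):
# keep = np.setdiff1d(np.arange(len(y_train)), pick_examples_to_remove(...))
# model.fit(X_train[keep], y_train[keep])
```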
A more accessible approach
Their method outperformed multiple techniques across three machine learning datasets. In one instance, it boosted worst-group accuracy while removing roughly 20,000 fewer training samples than a conventional data-balancing method. Their technique also achieved higher accuracy than methods that require making changes to the inner workings of a model.
Because the MIT method instead involves modifying the dataset, it is easier for practitioners to use and can be applied to many types of models.
It can also be used when bias is unknown, because subgroups in the training dataset do not need to be labeled. By identifying the data points that contribute most to a feature the model is learning, practitioners can understand the variables it is using to make its predictions.
“This is a tool anyone can use when training a machine learning model. They can look at those data points and see whether they match the capability they are trying to teach the model,” Hamidieh says.
Using this technique to detect bias in unknown subgroups would require intuition about which groups to look for, so the researchers hope to validate it and explore it more fully through future human studies.
They also want to improve the performance and reliability of the technique and ensure the method is accessible and easy to use for practitioners who may one day deploy it in real-world environments.
“Having tools that allow us to look critically at our data and identify which data points lead to bias or other undesirable behavior is a first step toward building fairer, more reliable models,” Ilyas says.
This research was funded in part by the National Science Foundation and the Defense Advanced Research Projects Agency.