Researchers find that models trained using common data-collection techniques judge rule violations more harshly than humans would.
Machine-learning models often make harsher judgments than humans because they are trained on the wrong type of data, which can have serious real-world implications, according to researchers from MIT and elsewhere.
In an effort to improve fairness or reduce backlogs, machine-learning models are sometimes designed to mimic human decision-making, such as deciding whether social media posts violate toxic content policies.
But researchers from MIT and elsewhere have found that these models often do not replicate human decisions about rule violations. If models are not trained with the right data, they are likely to make different, often harsher judgments than humans would.
In this case, the “right” data are those that have been labeled by humans who were explicitly asked whether items defy a certain rule. Training involves showing a machine-learning model millions of examples of this “normative data” so it can learn a task.
But data used to train machine-learning models are typically labeled descriptively — meaning humans are asked to identify factual features, such as, say, the presence of fried food in a photo. If “descriptive data” are used to train models that judge rule violations, such as whether a meal violates a school policy that prohibits fried food, the models tend to over-predict rule violations.
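To make the distinction concrete, here is a minimal sketch (with hypothetical field names and data, not the paper's) of the two labeling pipelines: descriptive annotators report only factual features, and a rule is applied afterward to infer a violation, while normative annotators are shown the policy and judge it directly.

```python
# Hypothetical sketch of the two labeling pipelines; field names, policy,
# and data are illustrative, not taken from the study.

def descriptive_label(features: dict) -> bool:
    """Annotator reports factual features only; the rule is applied afterward."""
    # School-meal policy from the example above: fried food is prohibited.
    return features["contains_fried_food"]

def normative_label(judgment: dict) -> bool:
    """Annotator sees the policy itself and judges whether it is violated."""
    return judgment["violates_fried_food_policy"]

# The same meal can end up with different labels under the two schemes:
meal = {
    "contains_fried_food": True,          # descriptive: the feature is present
    "violates_fried_food_policy": False,  # normative: annotator judged it acceptable
}
print(descriptive_label(meal), normative_label(meal))  # True False
```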
This drop in accuracy could have serious implications in the real world.
Marzyeh Ghassemi is senior author of a new paper detailing these findings, which was published on May 10 in the journal Science Advances.
In each case, the descriptive labelers were asked to indicate whether three factual features were present in the image or text, such as whether the dog appears aggressive. Their responses were then used to craft judgments. (If a user said a photo contained an aggressive dog, then the policy was violated.) The labelers did not know the pet policy. On the other hand, normative labelers were given the policy prohibiting aggressive dogs, and then asked whether it had been violated by each image, and why.
The researchers found that humans were significantly more likely to label an object as a violation in the descriptive setting. The disparity, computed as the average absolute difference in labels, ranged from 8 percent on a dataset of images used to judge dress code violations to 20 percent for the dog images.
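Read literally, that disparity metric is simply the mean absolute gap between the two sets of binary labels for the same items; a toy sketch (illustrative numbers, not the study's data) might look like this:

```python
import numpy as np

# Hypothetical binary labels (1 = violation) for the same eight items under
# the two annotation settings; the values are illustrative only.
descriptive = np.array([1, 1, 0, 1, 1, 0, 1, 0])
normative   = np.array([1, 0, 0, 1, 0, 0, 1, 0])

# Average absolute difference in labels, as described in the text.
disparity = np.mean(np.abs(descriptive - normative))
print(f"label disparity: {disparity:.0%}")  # 25% in this toy example
```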
“While we didn’t explicitly test why this happens, one hypothesis is that maybe how people think about rule violations is different from how they think about descriptive data. Generally, normative decisions are more lenient,” says lead author Aparna Balagopalan.
Yet data are usually gathered with descriptive labels to train a model for a particular machine-learning task. These data are often repurposed later to train different models that perform normative judgments, like rule violations.
Training troubles
To study the potential impacts of repurposing descriptive data, the researchers trained two models to judge rule violations using one of their four data settings. They trained one model using descriptive data and the other using normative data, and then compared their performance.
They found that a model trained on descriptive data underperforms a model trained to make the same judgments using normative data. Specifically, the descriptive model is more likely to misclassify inputs by falsely predicting a rule violation, and its accuracy drops even further on objects that human labelers disagreed about.
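A minimal sketch of that comparison, assuming synthetic features and off-the-shelf scikit-learn classifiers rather than the authors' actual models and datasets:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                    # item features (synthetic)
# Normative labels: violation only when the signal clearly crosses the policy line.
y_normative = (X[:, 0] > 0.5).astype(int)
# Descriptive labels: the raw feature is flagged more aggressively, so more positives.
y_descriptive = ((X[:, 0] > 0.2) | (rng.random(1000) < 0.05)).astype(int)

X_test = rng.normal(size=(500, 5))
y_test = (X_test[:, 0] > 0.5).astype(int)         # held-out normative judgments

for name, labels in [("normative-trained", y_normative), ("descriptive-trained", y_descriptive)]:
    model = LogisticRegression().fit(X, labels)
    pred = model.predict(X_test)
    accuracy = np.mean(pred == y_test)
    false_violations = np.mean((pred == 1) & (y_test == 0))
    print(f"{name}: accuracy={accuracy:.2f}, false-violation rate={false_violations:.2f}")
```

In this toy setup the descriptive labels flag more items than the normative policy would, so the descriptively trained model predicts violations for items the normative test labels consider acceptable.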
“This shows that the data do really matter. It is important to match the training context to the deployment context if you are training models to detect if a rule has been violated,” Balagopalan says.
It can be very difficult for users to determine how data have been gathered; this information can be buried in the appendix of a research paper or not revealed by a private company, Ghassemi says.
Improving dataset transparency is one way this problem could be mitigated. If researchers know how data were gathered, then they know how those data should be used. Another possible strategy is to fine-tune a descriptively trained model on a small amount of normative data. This idea, known as transfer learning, is something the researchers want to explore in future work.
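One way that fine-tuning strategy could look in outline (a sketch assuming a simple linear model and synthetic data, not the researchers' proposed setup):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)

# Large descriptively labeled set and a small normatively labeled set (both synthetic).
X_desc = rng.normal(size=(5000, 5))
y_desc = (X_desc[:, 0] > 0.2).astype(int)   # descriptive labels flag more violations
X_norm = rng.normal(size=(200, 5))
y_norm = (X_norm[:, 0] > 0.5).astype(int)   # normative labels are more lenient

model = SGDClassifier(loss="log_loss", random_state=0)
# Pre-train on the plentiful descriptive labels...
model.partial_fit(X_desc, y_desc, classes=np.array([0, 1]))
# ...then fine-tune on the small normative sample to shift the decision boundary.
for _ in range(20):
    model.partial_fit(X_norm, y_norm)
```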
They also want to conduct a similar study with expert labelers, like doctors or lawyers, to see whether expert labeling leads to the same label disparity.
“The way to fix this is to transparently acknowledge that if we want to reproduce human judgment, we must only use data that were collected in that setting. Otherwise, we are going to end up with systems that are going to have extremely harsh moderations, much harsher than what humans would do. Humans would see nuance or make another distinction, whereas these models don’t,” Ghassemi says.
Reference: “Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data” by Aparna Balagopalan, David Madras, David H. Yang, Dylan Hadfield-Menell, Gillian K. Hadfield and Marzyeh Ghassemi, 10 May 2023, Science Advances.
DOI: 10.1126/sciadv.abq0701
This research was funded, in part, by the Schwartz Reisman Institute for Technology and Society, Microsoft Research, the Vector Institute, and a Canada Research Council Chair.