Subtle biases in AI can influence emergency decisions | MIT News

Estimated read time: 6 min


It’s no secret that people have biases—some unconscious, perhaps, and others painfully overt. The average person might assume that computers—machines typically made of plastic, steel, glass, silicon, and various metals—are free from bias. While that assumption may hold for the hardware, the same is not always true of computer programs, which are written by fallible people and can be fed data that is itself compromised in certain respects.

Artificial intelligence (AI) systems—those based on machine learning in particular—are seeing increasing use in medicine, for example to diagnose certain diseases or to evaluate X-rays. These systems are also used to support decision-making in other areas of healthcare. However, recent research has shown that machine learning models can encode biases against minority subgroups, and the recommendations they make may consequently reflect those same biases.

A new study by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT Jameel Clinic, published last month in Communications Medicine, assesses the impact that discriminatory AI models can have, particularly for systems intended to provide advice in urgent situations. “We found that the way advice is framed can have significant repercussions,” explains the paper’s lead author, Hammaad Adam, a doctoral student at MIT’s Institute for Data, Systems, and Society. “Fortunately, the harm caused by biased models can be reduced (though not necessarily eliminated) when advice is given in a different way.” The paper’s other authors are Aparna Balagopalan and Emily Alsentzer, both PhD students, and Professors Fotini Christia and Marzyeh Ghassemi.

AI models used in medicine can suffer from inaccuracies and inconsistencies, in part because the data used to train the models is often not representative of real-world settings. Different types of X-ray machines, for example, can record things differently and therefore yield different results. Furthermore, models trained predominantly on white patients may not be as accurate when applied to other groups. The Communications Medicine paper does not focus on issues of this nature; instead, it addresses problems that stem from biases and ways to mitigate the negative consequences.

A group of 954 people (438 clinicians and 516 nonexperts) took part in an experiment to see how AI biases can affect decision-making. Participants were presented with call summaries from a fictitious crisis hotline, each involving a male individual going through a mental health emergency. The summaries indicated whether the individual was Caucasian or African American, and also mentioned his religion if he was Muslim. A typical call summary might describe a circumstance in which an African American man was found at home in a state of delirium, noting that he had “consumed no drugs or alcohol, because he is an observant Muslim.” Study participants were instructed to contact the police if they believed the patient was likely to turn violent; otherwise, they were encouraged to seek medical help.

Participants were randomly divided into a control or “base” group plus four other groups designed to test responses under slightly different conditions. “We want to understand how biased models can influence decisions, but first we need to understand how human biases can influence decision-making,” notes Adam. What the researchers found in their analysis of the base group was somewhat surprising: “In the setting we considered, the human participants showed no bias. That’s not to say that humans aren’t biased, but the way we conveyed information about a person’s race and religion evidently was not strong enough to elicit their biases.”

The other four groups were given advice from either a biased or an unbiased model, and that advice was presented in either a “prescriptive” or a “descriptive” form. A biased model is more likely than an unbiased model to recommend police assistance in a situation involving an African American or Muslim person. The study participants, however, did not know which type of model their advice came from, or even that the models giving the advice could be biased at all. Prescriptive advice states in no uncertain terms what the participant should do, telling them to contact the police in one case or to seek medical help in another. Descriptive advice is less direct: a flag is displayed to show that the AI system perceives a risk of violence associated with a particular call; no flag is displayed if the threat of violence is deemed small.
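As a rough illustration of that distinction (not the study’s actual interface), the sketch below shows how the same model output might be framed either prescriptively or descriptively; the risk score, threshold, and wording are hypothetical.

```python
# A minimal sketch, not the study's interface: the same hypothetical risk
# score from a model is framed either prescriptively (an instruction) or
# descriptively (a flag). The 0.5 threshold is an assumption.
def prescriptive_advice(risk_score: float, threshold: float = 0.5) -> str:
    """Tells the participant exactly what to do."""
    if risk_score >= threshold:
        return "Contact the police."
    return "Seek medical help."

def descriptive_advice(risk_score: float, threshold: float = 0.5) -> str:
    """Only reports the model's assessment; the decision stays with the reader."""
    if risk_score >= threshold:
        return "Flag: the system has identified a risk of violence on this call."
    return ""  # no flag shown when the estimated risk is low

print(prescriptive_advice(0.8))  # "Contact the police."
print(descriptive_advice(0.3))   # "" (no flag displayed)
```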

One of the main findings of the experiment, the authors write, is that participants were “strongly affected by prescriptive recommendations from a biased AI system.” But they also found that “using descriptive rather than prescriptive recommendations allowed participants to retain their original, unbiased decisions.” In other words, the bias built into an AI model can be reduced by appropriately framing the advice it gives. Why the different results, depending on how the advice was framed? Adam explains that when someone is told to do something, like call the police, that leaves no room for doubt. But when the situation is merely described, with or without a flag, “it leaves room for the participant’s own interpretation; it allows them to be more flexible and consider the situation themselves.”

Second, the researchers found that the language models typically used to give such advice are easily biased. Language models are a class of machine learning systems that are trained on text, such as the entire contents of Wikipedia and other web material. When these models are “fine-tuned” on a much smaller subset of data for training purposes (just 2,000 sentences, as opposed to 8 million web pages), the resulting models can easily be biased.
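As a hedged sketch of what such fine-tuning looks like in practice (not the authors’ code), the snippet below fine-tunes a generic pretrained language model on a tiny labeled dataset using the Hugging Face libraries; the base model name, labels, and example sentences are illustrative assumptions.

```python
# A hedged sketch, not the study's code: fine-tuning a generic pretrained
# language model on a tiny labeled dataset. The model name, labels, and
# example sentences are assumptions for illustration. If the small
# fine-tuning set correlates mentions of a group with one label, the model
# can absorb that correlation as if it were signal.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy fine-tuning set standing in for the ~2,000 sentences described above.
examples = {
    "text": [
        "Caller is calm and asking about counseling options.",
        "Caller reports shouting and a possible weapon in the home.",
    ],
    "label": [0, 1],  # 0 = recommend medical help, 1 = recommend police
}

model_name = "distilbert-base-uncased"  # assumed base model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

train_dataset = Dataset.from_dict(examples).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-advice-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_dataset,
)
trainer.train()  # with so little data, whatever correlations it contains dominate
```

The point is not the particular libraries but the data regime: a few thousand fine-tuning sentences cannot wash out a spurious correlation the way a web-scale pretraining corpus might.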

Third, the MIT team discovered that decision makers who are themselves unbiased can still be misled by recommendations made by biased models. Medical training (or lack thereof) did not change the responses in a significant way. The authors stated that “clinicians were affected by biased models as much as non-experts”.

“These findings could be applicable elsewhere,” Adam says; they are not necessarily limited to healthcare. When it comes to deciding who should receive a job interview, for example, a biased model is more likely to reject Black applicants. The results could be different, however, if instead of explicitly (and prescriptively) telling the employer to “reject this applicant,” a descriptive flag were attached to the file to indicate the applicant’s “possible lack of experience.”

Adam asserts that the implications of this work are broader than simply figuring out how to deal with individuals in the midst of mental health crises. “Our ultimate goal is to ensure that machine learning models are used in a way that is fair, secure, and robust.”
