Researchers at Lehigh University in Pennsylvania found clear racial bias in a recent study evaluating how chatbots make lending recommendations on mortgage applications.
With a sample of 6,000 loan applications drawn from 2022 Home Mortgage Disclosure Act data, the chatbots recommended denials for more Black applicants than their white counterparts. They also recommended higher interest rates for Black applicants and labeled Black and Hispanic borrowers as “higher risk.”
White applicants were 8.5% more likely to be approved than Black applicants with the same economic profile, and the gap widened for applicants with a “low” credit score of 640: white applicants were approved 95% of the time, while Black applicants were approved less than 80% of the time.
The experiment aimed to simulate how financial institutions can use AI algorithms, machine learning, and large language models to speed up processes such as lending and underwriting of loans and mortgages. These “black box” systems, in which the inner workings of the algorithms are not transparent to the user, have the potential to reduce operating costs for financial firms and other industries that employ them, said Donald Bowen, an assistant professor of fintech at the university and one of the study’s authors.
But flawed training data, programming errors, and historically biased information can all skew the results, sometimes with detrimental, life-altering effects.
“These systems have the potential to learn a lot about the people they’re interacting with,” Bowen says. “If there is a built-in bias, it can propagate to the various interactions between the customer and the bank.”
How does AI make decisions in the financial sector?
Decision-making AI tools and large language models like those used in the Lehigh University experiment are used in a variety of industries, including healthcare, education, finance, and even the justice system.
Most machine learning algorithms follow so-called classification models, which means formally defining the problem or question and then feeding the algorithm a series of inputs, such as a loan applicant’s age, income, education, and credit history, explained Michael Wellman, a computer science professor at the University of Michigan.
The algorithm spits out an approval or disapproval result. More complex algorithms can evaluate these factors and provide more nuanced answers, such as approving a loan at a recommended interest rate.
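As a rough illustration of the kind of classification model Wellman describes, the sketch below trains a simple approve/deny classifier on synthetic data. The feature names, numbers, and choice of model are illustrative assumptions, not the system any lender actually uses.

```python
# Minimal sketch of a loan-approval classifier on synthetic data, with
# hypothetical features (income, credit score, loan amount). Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic applicants: income ($k), credit score, loan amount ($k)
X = np.column_stack([
    rng.normal(70, 20, 1000),    # income
    rng.normal(690, 60, 1000),   # credit score
    rng.normal(250, 80, 1000),   # loan amount
])
# Synthetic "ground truth": approve when income and credit outweigh the loan size
y = (3 * X[:, 0] + X[:, 1] - X[:, 2] > 650).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

applicant = np.array([[65, 640, 240]])  # one new application: income, score, amount
print("decision:", "approve" if model.predict(applicant)[0] else "deny")
print("approval probability:", round(model.predict_proba(applicant)[0, 1], 2))
```

A more complex system could go beyond a yes/no output, for example by also predicting a recommended interest rate from approved loans in the training data.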
Recent advances in machine learning have enabled so-called deep learning, or the construction of large-scale neural networks that can learn from large amounts of data. But if those building AI don’t have objectivity in mind, or rely on datasets that reflect deep and systemic racism, the results will reflect that.
“If we find that we systematically make more decisions to discredit certain groups than we make incorrect decisions about other people, then we have a problem with the algorithm.” Wellman said. “Especially if those groups are historically disadvantaged groups.”
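One way to put Wellman’s point into practice is to compare a model’s error rates across groups. The snippet below is a minimal sketch of such a check, using made-up data and hypothetical column names.

```python
# Minimal sketch of a per-group error check: how often does the model wrongly
# deny applicants who were in fact creditworthy? Data and labels are made up.
import pandas as pd

decisions = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],
    "predicted": [1, 0, 1, 0, 0, 1],   # 1 = model approved
    "actual":    [1, 1, 1, 1, 0, 1],   # 1 = applicant was creditworthy
})

creditworthy = decisions[decisions["actual"] == 1]
false_denial_rate = (
    creditworthy.assign(denied=creditworthy["predicted"] == 0)
    .groupby("group")["denied"]
    .mean()
)
print(false_denial_rate)  # a large gap between groups is the red flag Wellman describes
```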
Bowen was inspired to pursue the Lehigh study after a small-scale assignment with his students revealed racial discrimination in chatbots.
“We wanted to understand whether these models are biased, and whether they are biased in settings where they shouldn’t be biased,” Bowen said. Underwriting is a regulated industry that does not allow race to be considered in decision-making.
For the official study, Bowen and his research team used OpenAI’s GPT-3.5 Turbo and GPT-4, Anthropic’s Claude 3 Sonnet and Claude 3 Opus, and Meta’s Llama 3 8B and 70B.
In one experiment, the researchers included racial information on applications to see whether loan approvals and mortgage rates differed. They also instructed the chatbots to “not be biased when making these decisions.” With that instruction, the experiment found virtually no differences between loan applicants.
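A rough sketch of that kind of paired prompt experiment is below, using OpenAI’s chat API. The application details, wording, and single-model setup are illustrative assumptions, not the researchers’ actual prompts or methodology.

```python
# Rough sketch of a paired-prompt test: same application, race disclosed,
# with and without a debiasing instruction. All wording is hypothetical.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

application = (
    "Credit score: 640. Annual income: $65,000. Loan amount: $240,000. "
    "Debt-to-income ratio: 38%."
)

def ask(race_line: str = "", debias_line: str = "") -> str:
    prompt = (
        "You are a mortgage underwriter. Decide 'approve' or 'deny' for the "
        "application below and recommend an interest rate.\n"
        f"{race_line}{application}\n{debias_line}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content

baseline = ask(race_line="Applicant race: Black.\n")
debiased = ask(race_line="Applicant race: Black.\n",
               debias_line="Do not be biased when making this decision.")
print(baseline)
print(debiased)
```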
But if racial data is not collected in modern lending, and the algorithms banks use are instructed not to consider race, how do people of color end up denied loans more often? Bowen said much of the data modern models learn from is shaped by disparate impacts and systemic racism.
Even though the computer is never told the applicant’s race, the borrower’s credit score, which can be affected by discrimination in the labor and housing markets, still shapes the application. So can their ZIP code and the credit profiles of other members of their household, all of which may have been influenced by historically racist policies and practices such as redlining, which restricted lending to poor and nonwhite people.
Machine learning algorithms don’t necessarily reach their conclusions the way humans might imagine, Bowen said. Because the patterns they learn carry over to new scenarios, a model may also have digested reports of discrimination, learning, for example, that Black people have historically been given less credit. A computer could then recognize signs that a borrower is Black and deny the loan or offer a higher interest rate than it would a white borrower.
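The sketch below illustrates that proxy effect on synthetic data: race is never given to the model, but a correlated feature (a stand-in for something like a ZIP code) combined with a historically skewed credit distribution is enough to reproduce a group-level approval gap. All numbers here are invented for illustration.

```python
# Sketch of proxy discrimination on synthetic data: the model never sees race,
# yet its approvals differ by group via correlated features. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
race = rng.integers(0, 2, n)                                # hidden from the model
zip_group = np.where(rng.random(n) < 0.8, race, 1 - race)   # proxy correlated with race
credit = rng.normal(700 - 30 * race, 40)                    # historical disparity in scores
approved = (credit + rng.normal(0, 20, n) > 680).astype(int)

X = np.column_stack([zip_group, credit])                    # race itself is excluded
model = LogisticRegression().fit(X, approved)

preds = model.predict(X)
print("approval rate, group 0:", round(float(preds[race == 0].mean()), 2))
print("approval rate, group 1:", round(float(preds[race == 1].mean()), 2))
```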
Other opportunities for discrimination
Decision-making technology has permeated hiring in recent years, as application platforms and companies’ internal systems use AI to filter applications and pre-screen candidates for recruiters. Last year, New York City began requiring employers to notify candidates when they use AI decision-making software.
By law, AI tools should be programmed to have no opinion on protected classes such as gender, race, or age, but some users claim they are being discriminated against by algorithms anyway. In 2021, the U.S. Equal Employment Opportunity Commission launched an effort to take a closer look at how new and existing technologies change the way employment decisions are made. Last year, the commission settled its first-ever AI discrimination hiring lawsuit.
The New York federal court case settled for $365,000 after tutoring company iTutorGroup Inc. allegedly used AI-powered recruitment software to reject female applicants over 55 and male applicants over 60. About 200 applicants received settlement funds, and iTutorGroup agreed to consider them for hire. Bloomberg reported at the time that the company was also strengthening its anti-discrimination policies and implementing training to ensure compliance with equal employment opportunity laws.
Another discrimination lawsuit, against software company Workday, is pending in federal court in California. Plaintiff Derek Mobley claims he was turned away from more than 100 jobs at companies that contract with the software firm because he is Black, over 40, and has mental health issues, Reuters reported this summer. The suit alleges that Workday trains its software on data about a company’s existing employees, a practice that does not account for discrimination that may then carry over into future hiring.
The U.S. judicial and court systems are also beginning to incorporate decision-making algorithms into some operations, such as defendant risk assessments and decisions about pretrial release, diversion, sentencing, and probation or parole.
While these technologies have been credited with speeding up some traditionally lengthy court processes, such as document review and assistance with small-claims litigation, experts warn they are not ready to serve as the primary or sole evidence in “consequential” decisions.
“We are most concerned about the use of AI in areas where AI systems are exposed to widespread and systemic racism and other biases, such as predictive policing, facial recognition, and crime risk/recidivism assessments,” the co-authors of a paper in the 2024 edition of the Journal of Justice wrote.
Utah passed a law earlier this year to do just that. HB 366, sponsored by state Rep. Karianne Lisonbee, R-Syracuse, addresses the use of algorithmic and risk assessment tool scores in pretrial release, diversion, sentencing, probation, and parole decisions, and states that these technologies cannot be used without human intervention and review.
Lisonbee told States Newsroom that, by design, the technology provides only a limited amount of information to judges and other decision-makers.
“We believe it is important for judges and other decision-makers to consider all relevant information about a defendant in order to make the most appropriate decisions regarding sentencing, diversion, or conditions of release,” Lisonbee said.
She also raised concerns about bias, saying state lawmakers do not currently have full confidence in the “objectivity and reliability” of these tools. Nor are they sure about the tools’ data privacy protections, a top priority for Utah residents. The combination of these issues could jeopardize public trust in the criminal justice system, she said.
“When evaluating the use of algorithms and risk assessment tools in criminal justice and other settings, it’s important to ensure strong data integrity and privacy protections, especially for personal data shared with external parties for research or quality control purposes,” Lisonbee said.
Preventing discriminatory AI
Some lawmakers, like Lisonbee, are drawing attention to the issue of bias and possible discrimination. Four states currently have laws aimed at preventing “algorithmic discrimination,” in which AI systems treat people differently based on characteristics such as race, ethnicity, gender, religion, or disability. They include Utah, as well as California (SB 36), Colorado (SB 21-169), and Illinois (HB 0053).
Although it is not specific to discrimination, legislation introduced in Congress in late 2023 would amend the Financial Stability Act of 2010 to include federal guidance for the financial industry on the use of AI. The bill, the Financial Artificial Intelligence Risk Reduction Act, or FAIRR Act, would require the Financial Stability Oversight Council to coordinate with government agencies on threats to the financial system posed by artificial intelligence, and could regulate how financial institutions are allowed to rely on these methods.
Lehigh’s Bowen said he feels there is no going back on these technologies, especially as businesses and industries realize the potential for cost savings.
“These will be used by businesses,” he said. “So how can we do this in a fair way?”
Bowen hopes his research will help inform financial institutions and other organizations as they implement decision-making AI tools. The researchers wrote that one fix in their experiment was as simple as using prompt engineering to instruct a chatbot to make unbiased decisions. They suggest that companies integrating large language models into their processes audit their tools for bias regularly so they can be improved.
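One form such a recurring audit could take is a paired test: run otherwise identical applications that differ only in race through the deployed model and compare outcomes. The sketch below assumes a hypothetical score_application wrapper standing in for whatever model or API a firm actually uses.

```python
# Minimal sketch of a paired-application bias audit. score_application is a
# hypothetical stand-in for a call to the deployed underwriting model.
import random

def score_application(profile: dict) -> bool:
    """Placeholder decision; replace with the real model or API call."""
    random.seed(hash(frozenset(profile.items())))
    return random.random() > 0.3

def audit(profiles, groups=("white", "Black")):
    rates = {}
    for group in groups:
        approvals = [score_application({**p, "race": group}) for p in profiles]
        rates[group] = sum(approvals) / len(approvals)
    return rates

profiles = [
    {"credit_score": 640 + 10 * i, "income": 60_000 + 5_000 * i}
    for i in range(20)
]
print(audit(profiles))  # a persistent gap between groups should trigger review
```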
Bowen and other researchers on the subject emphasize that more human involvement is needed to use these systems equitably. AI may weigh in on court rulings, mortgages, job applications, medical exams, and customer service inquiries, but that doesn’t mean it should operate unchecked.
The University of Michigan’s Wellman told States Newsroom that he is watching for government regulation of these tools, pointing to HR 6936, a pending congressional bill that would require federal agencies to adopt the artificial intelligence risk management framework developed by the National Institute of Standards and Technology. The framework addresses potential bias and is designed to improve confidence among the organizations designing, developing, using, and evaluating AI tools.
“My hope is that the standard’s requirements will permeate the market and provide tools that companies can use to at least validate or certify their models,” Wellman said. “Of course, it does not guarantee that a model will be perfect in all respects or that every potential harm will be avoided. But it can … provide a basic, standard basis for trusting the model.”