UNIVERSITY PARK, Pa. — Artificial intelligence (AI) systems such as home assistants, search engines, and large language models like ChatGPT may appear nearly omniscient, but their output is only as good as the data used to train them. For convenience, however, users often rely on AI systems without knowing which training data were used or who prepared the data, including any potential biases held by the data or the trainers. A new study by Penn State researchers suggests that making this information available could help users form appropriate expectations of AI systems and make more informed decisions about whether and how to use them.
The researchers investigated whether displaying racial diversity cues (visual signals on an AI interface that convey the racial composition of the training data and the backgrounds of the typically crowdsourced workers who labeled it) could increase users' expectations of fairness and their trust in the system. Their findings were recently published in the journal Human-Computer Interaction.
AI training data is often systematically biased with respect to race, gender, and other characteristics, said S. Shyam Sundar, Evan Pugh University Professor and director of the Center for Socially Responsible Artificial Intelligence at Penn State.
“Users may not realize that using certain AI systems can perpetuate biased human decision-making,” he said.
Lead author Chen “Chris” Chen, an assistant professor of communication design at Elon University who received a doctorate in mass communications from Penn State, explained that users have no way to evaluate the biases embedded in an AI system because they are given no information about its training data or the people who trained it.
“This bias is often revealed only after users have completed their tasks, meaning the harm has already been done and users did not have enough information to decide whether to trust the AI before using it,” Chen said.
Sundar said one solution is to communicate the nature of the training data to users, particularly its racial composition.
“That’s what we did in this experimental study, and we aimed to see whether it changed users’ perceptions of the system,” Sundar said.
To understand how diversity cues affect perceptions of AI trustworthiness, the researchers created two experimental conditions: one with racial diversity and one without. In the diversity condition, participants received a brief explanation of machine learning models and data labeling practices, along with bar graphs showing an even distribution of facial images in the training data across three racial groups, White, Black, and Asian, each making up roughly one-third of the dataset. In the no-diversity condition, the bar graph showed that 92% of the images belonged to a single majority racial group. Labeler backgrounds were presented the same way: the diversity condition showed a balanced representation, with approximately one-third each of White, Black, and Asian labelers, while the no-diversity condition showed that 92% of labelers belonged to a single racial group.
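As a rough, hypothetical illustration (not code from the study), the Python sketch below shows how the two kinds of stimulus bar graphs described above might be produced with matplotlib. The group labels and the one-third and 92% figures come from the article; the split of the remaining 8% in the no-diversity condition, along with all variable names, is an assumption made only for this example.

    # Hypothetical sketch: recreating the two diversity-cue bar graphs
    # described in the article with matplotlib (not the researchers' code).
    import matplotlib.pyplot as plt

    groups = ["White", "Black", "Asian"]
    diversity_condition = [33.3, 33.3, 33.4]    # roughly one-third per group
    no_diversity_condition = [92.0, 4.0, 4.0]   # 92% one group; the 8% split is assumed

    fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
    panels = [
        ("Racial diversity condition", diversity_condition),
        ("No-diversity condition", no_diversity_condition),
    ]
    for ax, (title, values) in zip(axes, panels):
        ax.bar(groups, values)                  # one bar per racial group
        ax.set_title(title)
        ax.set_ylabel("Share of training images (%)")

    plt.tight_layout()
    plt.show()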