In 1994, Florida jewelry designer Diana Duyser discovered what she believed to be the image of the Virgin Mary in a grilled cheese sandwich, which she preserved and later auctioned for $28,000. But how much do we really understand about pareidolia, the phenomenon of seeing faces and patterns in objects where none actually exist?
A new study from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) delves into this phenomenon, introducing an extensive human-labeled dataset of 5,000 pareidolic images, far surpassing previous collections. Using this dataset, the team discovered several surprising results about the differences between human and machine perception, and how the ability to see a face in a slice of toast may have saved the lives of our distant ancestors.
“Face pareidolia has long fascinated psychologists, but it’s been largely unexplored in the computer vision community,” says Mark Hamilton, an MIT PhD student in electrical engineering and computer science, CSAIL affiliate, and lead researcher on the work. “We wanted to create a resource that could help us understand how both humans and AI systems process these illusory faces.”
So what did all of these fake faces reveal? For one thing, AI models don’t seem to recognize pareidolic faces the way we do. Surprisingly, the team found that it wasn’t until algorithms were trained to recognize animal faces that they became significantly better at detecting pareidolic ones. This unexpected connection hints at a possible evolutionary link between our ability to spot animal faces, crucial for survival, and our tendency to see faces in inanimate objects. “A result like this seems to suggest that pareidolia might not arise from human social behavior, but from something deeper: like quickly spotting animals, or identifying which way a deer was facing so our primordial ancestors could hunt,” says Hamilton.
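One way to picture this finding is as a transfer-learning experiment. The sketch below is not the authors’ code; it assumes a hypothetical animal-face data loader and uses a COCO-pretrained detector from torchvision as a stand-in for the face detectors studied in the paper:

```python
# A minimal sketch (not the authors' code) of the experiment described above:
# fine-tune a pretrained detector on animal faces, then test whether it
# becomes better at finding pareidolic faces. The data loaders are
# hypothetical placeholders.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_detector(num_classes: int = 2) -> torch.nn.Module:
    # COCO-pretrained Faster R-CNN with a fresh box-prediction head:
    # class 0 = background, class 1 = "face".
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

def fine_tune(model, animal_face_loader, epochs: int = 5):
    # Standard detection fine-tuning loop; `animal_face_loader` is assumed
    # to yield (images, targets) with "boxes" and "labels" for each image.
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, targets in animal_face_loader:
            losses = model(images, targets)  # dict of detection losses
            loss = sum(losses.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

# After fine-tuning on animal faces, one would run inference on pareidolic
# images and compare detections against the human-drawn boxes in the dataset.
```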
Another intriguing discovery is what the researchers call the “Goldilocks zone of pareidolia,” a class of images where pareidolia is most likely to occur. “There’s a certain range of visual complexity where both humans and machines are most likely to perceive faces in non-face objects,” says William T. Freeman, MIT professor of electrical engineering and computer science and principal investigator on the project. “Too simple, and there isn’t enough detail to form a face. Too complex, and it becomes visual noise.”
To uncover this, the team developed an equation modeling how humans and algorithms detect illusory faces. Analyzing this equation revealed a clear “pareidolic peak” where the likelihood of seeing faces is highest, corresponding to images with “just the right amount” of complexity. This predicted “Goldilocks zone” was then validated in tests with both real human subjects and AI face detection systems.
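The article doesn’t reproduce the paper’s actual equation, but its qualitative behavior can be illustrated with a toy model. The peaked functional form below (a Gaussian over log-complexity) and its parameters are assumptions chosen purely for illustration, not the study’s fitted model:

```python
# Toy illustration of a "Goldilocks zone": detection probability that peaks
# at intermediate visual complexity. The functional form is an assumption,
# not the paper's actual equation.
import numpy as np

def p_face(complexity, peak=1.0, width=0.5):
    """Chance of perceiving a face, peaked at `peak` on a log-complexity axis."""
    return np.exp(-(np.log(complexity / peak) ** 2) / (2 * width ** 2))

complexity = np.logspace(-2, 2, 201)  # from very simple to very complex images
probs = p_face(complexity)
print(f"peak near complexity = {complexity[probs.argmax()]:.2f}")  # mid-range
```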
The new dataset, “Faces in Things,” dwarfs those of previous studies, which typically used only 20 to 30 stimuli. This scale enabled the researchers to explore how state-of-the-art face detection algorithms behave after being fine-tuned on pareidolic faces, showing not only that these algorithms can be edited to detect such faces, but that they can then act as a silicon stand-in for our own visual system. This let the team ask, and answer, questions about the origins of pareidolic face detection that would be impossible to pose in humans.
To build the dataset, the team curated approximately 20,000 candidate images from the LAION-5B dataset, which were then meticulously labeled and judged by human annotators. This process involved drawing bounding boxes around perceived faces and answering detailed questions about each one, such as the perceived emotion, the apparent age, and whether the face was accidental or intentional. “Collecting and annotating thousands of images was a monumental task,” says Hamilton. “Much of the dataset exists thanks to my mom, who lovingly spent countless hours labeling images for our analysis.”
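For a concrete sense of what such labels might look like, here is one hypothetical annotation record; the field names and values are illustrative guesses, not the dataset’s published schema:

```python
# One hypothetical "Faces in Things" annotation record (illustrative only;
# field names and values are guesses, not the dataset's actual schema).
annotation = {
    "image_id": "laion5b_000123",          # source image in LAION-5B
    "faces": [
        {
            "bbox": [112, 40, 310, 265],   # [x_min, y_min, x_max, y_max] in pixels
            "emotion": "surprised",        # emotion the annotator perceived
            "perceived_age": "young adult",
            "accidental": True,            # pareidolic, not an intentional face
        }
    ],
}
```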
Video: “Can AI detect faces in objects?” (MIT CSAIL)
The research could also have applications in improving face detection systems by reducing false positives, with potential implications for fields such as self-driving cars, human-computer interaction, and robotics. The dataset and models could also help in areas like product design, where understanding and controlling pareidolia could lead to better products. “Imagine being able to automatically tweak the design of a car or a child’s toy so it looks friendlier, or ensure a medical device doesn’t inadvertently appear threatening,” says Hamilton.
“It’s fascinating how humans instinctively interpret inanimate objects with human-like traits. For instance, when you glance at an electrical socket, you might immediately envision it singing, and you can even imagine how it would ‘move its lips.’ Algorithms, however, don’t naturally recognize these cartoonish faces in the same way we do,” says Hamilton. “This raises intriguing questions: What accounts for this difference between human perception and algorithmic interpretation? Is pareidolia beneficial or detrimental? Why don’t algorithms experience this effect as we do? These questions sparked our investigation, since this classic psychological phenomenon in humans had not been thoroughly explored in algorithms.”
As the researchers prepare to share their dataset with the scientific community, they’re already looking ahead. Future work may involve training vision-language models to understand and describe pareidolic faces, potentially leading to AI systems that can engage with visual stimuli in more human-like ways.
“What a lovely paper! It’s fun to read, and it makes you think. Hamilton et al. pose an intriguing question: Why do we see faces in things?” says Pietro Perona, the Allen E. Puckett Professor of Electrical Engineering at the California Institute of Technology, who was not involved in the study. “As they point out, learning from examples, such as animal faces, goes only part of the way toward explaining the phenomenon. Thinking about this question should teach us something important about how our visual system generalizes beyond the training it receives through life.”
Hamilton and Freeman’s co-authors include Simon Stent, a research scientist at the Toyota Research Institute; Ruth Rosenholtz, principal investigator in the Department of Brain and Cognitive Sciences, NVIDIA research scientist, and former CSAIL member; and CSAIL affiliated researchers Vasha DuTell, Anne Harrington MEng ’23, and Jennifer Corbett. Their work was supported, in part, by the National Science Foundation and the CSAIL MEnTorEd Opportunities in Research (METEOR) fellowship, and was sponsored by the U.S. Air Force Research Laboratory and the U.S. Air Force Artificial Intelligence Accelerator. The MIT SuperCloud and the Lincoln Laboratory Supercomputing Center provided HPC resources for the researchers’ work.
The research will be presented this week at the European Conference on Computer Vision.