The SAT is to standardized testing what floppy disks are to data storage.
As new AI tools challenge traditional approaches to probing student knowledge and enable new ways of administering and scoring tests, companies that provide some of the most popular standardized tests are rethinking their offerings.
For example, ETS, one of the oldest and largest organizations in the field of standardized testing, has moved away from traditional college admissions exams like the SAT to focus on new approaches that measure students’ skills and persistence.
It’s been a tumultuous time for academic testing in general, and for ETS, a 75-year-old nonprofit, in particular. During the pandemic, at least 1,600 colleges and universities made the SAT and other admissions tests optional, at least temporarily, over concerns about equity and access. And earlier this year, ETS announced it would no longer administer the College Board’s SAT. College Board spokeswoman Holly Stepp says the organization has shifted to an entirely digital format, “and now develops and administers SAT- and PSAT-related assessments in-house.”
ETS began a rebranding effort in April to focus on what it called “talent solutions” rather than just academic testing, and it also downsized to recalibrate, offering buyouts to many employees earlier this year after laying off 6% of its workforce last September.
“The assessments that ETS offers in the future will be more behavioral than cognitive,” says Kara McWilliams, vice president of product innovation and development at ETS. “That means creating experiences that measure user behavior, not the answer to a question,” she adds. “So we want to look at things like perseverance, and as we think about how we build these [assessment] experiences, we build nudges into them, so we understand things like, ‘Did you ask for a hint? Did you reach out to a friend? Did you ask for more time?’ What actions are you taking to get to the answer? It doesn’t matter what the answer is, it matters how you get there.”
One example of that work is the group’s new focus on its “Skills for the Future” initiative, a collaboration with the Carnegie Foundation for the Advancement of Teaching, to rethink how assessment is done.
The goal of the effort is to move away from having students set aside their work, sit in a room, and answer questions for a few hours, says Timothy Knowles, president of the Carnegie Foundation. Instead, he says, the group is experimenting with using data schools already hold on students, including from after-school activities like sports, clubs, and internships, to measure and track progress in skills like communication, collaboration, and critical thinking.
“The idea is to build an insight system that’s useful for kids, families and educators,” he says, “so we can understand where people are in terms of developing skills that are predictive of success. So we’re figuring out how to visualize this in a way that’s not punitive or problematic for kids.”
Schools and school systems already have a wealth of data that goes largely unused, he says. The question is, “Can we look at that data in different ways and infer from that data how well young people are mastering certain skills?”
The effort has partnered with education leaders in five states — Indiana, Nevada, North Carolina, Rhode Island and Wisconsin — to help pilot the approach starting in January, Knowles says. ETS and Carnegie officials plan to use generative AI to review and tag existing student work, analyze state education data and run interactive assessments, though not all of those uses will be ready by January.
But experts urge caution, especially when AI is used to analyze data or write test questions.
“We still have a lot to learn about whether bias is built into the use of AI,” says Nicol Turner Lee, director of the Center for Technology Innovation at the Brookings Institution. “AI is only as good as the training data. If the training data is biased toward advantaged students who have more resources than students in disadvantaged schools, it will hurt them.”
She points to a controversial experiment conducted in 2020, when the pandemic was at its peak and many schools were closed and forced to teach remotely. Since many students were unable to take the in-person end-of-year exams offered by the International Baccalaureate, the group decided to build a model based on historical data to predict student performance.
“They developed an algorithm that basically predicts which schools are more likely to produce tertiary-level graduates,” she says.
Thousands of students voiced dissatisfaction with the scores they received, and some governments launched formal investigations. “The algorithm itself didn’t take into account the location of the school or the resources of the school,” Turner Lee says.
Turner Lee says ETS officials invited her to speak at a recent event to share her views and concerns about approaches to using AI in testing and assessment.
“Think about how hard we’ve worked to address inequities in standardized testing,” she says, “and we have to be cautious about going all in because the datasets we use to train AI are themselves likely historically biased.”
Other test providers are experimenting with using AI to create new kinds of test questions.
Next year’s edition of the Programme for International Student Assessment (PISA), a global test that measures reading, mathematics and science literacy for 15-year-olds, will include a new type of “performance task” designed to see how students approach problems and will be graded by AI.
ETS’s McWilliams says the past year has “changed” her thinking about AI in testing.
Whereas last year the focus was on using AI to create traditional multiple-choice questions, now, she says, “What I’m really focused on is generating content dynamically on the fly. Rather than multiple-choice questions, I’m focusing on more experiential tasks where individuals can most meaningfully demonstrate their knowledge and abilities.”
One example is a new tool called Authentic Interview Prep, which uses AI to help people hone their job interview skills.
“A lot of people get nervous when they have an interview,” she says. “So what we’re trying to do is create an experience that helps people understand how to have a more meaningful interview. The AI does a lot of different things, like giving me feedback on my tone of voice, the speed at which I speak, the eye contact I make with you, and then instantly you get haptics on your watch that tell you, ‘Kara, calm down. You’re talking too quickly,’ or, ‘Make more eye contact.’”
Of course, such tests aren’t for college or grad school admissions; they’re a different kind of assessment than the SAT. She says the SAT will likely have some role for the foreseeable future. “What I’m thinking about now is, ‘What content can I use to inform the experiences that people have every day?'”