Picture the scene: A new technology has been introduced that is unlike anything we’ve seen before. This technology creates a new means of sharing information, one that is both interesting and entertaining and that promises to generate new forms of knowledge on a regular basis. Indeed, this new creation appears so transformative that it leads one of the world’s most prominent entrepreneurs to predict that the method of transmitting knowledge to students will be radically altered in just a few years.
I’m referring, of course, to 1913 and the introduction of motion-picture technology—movies—which led Thomas Edison to predict that books would be obsolete in schools within a decade.
What, did you have something else in mind?
Here we go again. The introduction of generative artificial intelligence, in particular “large language model” (LLM) chatbots such as ChatGPT, has led many to predict that this new technology will transform education as we know it. Bill Gates, for example, predicted in April 2023 that within 18 months—by October of this year—generative AI “will be as good a tutor as any human could be.” Not to be outdone, Sal Khan, the founder of Khan Academy, thinks AI will be “probably the biggest transformation that education has ever seen.” His organization is marketing the education-focused chatbot Khanmigo to schools right now.
Why do we keep making this mistake? Why do we seem doomed to repeat the endless cycle of ed-tech enthusiasts vastly overpromising and radically underdelivering on results? There are many reasons, but I’ll focus here on just one: we keep misunderstanding the role that technology can play in education because we’ve failed to properly understand the science of how we humans think and learn.
Here’s one example. Cognitive scientists use the term “theory of mind” to describe our capacity to ascribe mental states to ourselves and to others. In education, teachers must have a rough understanding of what’s happening in the minds of each of their students so that they can make inferences about what misconceptions a particular student may have, or what existing knowledge they are drawing upon when trying to learn a new concept. Importantly, emerging science suggests that we develop this ability through explicit cultural practices—our conversations and cooperative interactions with other humans.
You know, like what happens every day in school.
Neither ChatGPT nor Khanmigo nor any other existing LLM-based technology is presently capable of developing anything close to a robust theory of mind regarding what’s happening in the minds of its human users. That’s just not how the technology works. Instead, these models are essentially next-word prediction engines, meaning that after they’ve been prompted by human-generated text, they run a complicated set of statistical algorithms to predict what text to generate as output. This often feels like human conversation, as if there were another mind operating behind the machine, but feelings can be deceiving.
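To see what I mean by a prediction engine, here is a deliberately tiny sketch in Python. Real LLMs are deep neural networks trained on enormous corpora of subword tokens, not word counts over a few sentences, so treat this as a cartoon of the mechanism rather than a description of any actual chatbot; the corpus and the bigram counting are my own illustrative choices.

```python
import random
from collections import defaultdict, Counter

# A cartoon of next-word prediction: count which word follows which in a
# tiny corpus, then generate text by repeatedly sampling a likely next word.
corpus = (
    "the pen costs more than the crayon . "
    "the crayon costs less than the pen . "
    "the pen and the crayon together cost two fifty ."
).split()

# Bigram counts: for each word, how often each other word follows it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(prompt_word: str, length: int = 10) -> str:
    """Extend a prompt by sampling likely next words, one at a time."""
    words = [prompt_word]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:
            break
        # Pick the next word in proportion to how often it was observed.
        # No understanding, no model of the reader's mind: just statistics.
        nxt = random.choices(list(candidates),
                             weights=list(candidates.values()))[0]
        words.append(nxt)
    return " ".join(words)

print(generate("the"))
```

The output can look like fluent English about pens and crayons, yet nothing in the loop represents what the reader knows or believes. Scaling this idea up by many orders of magnitude changes the fluency, not the fundamental character of the computation.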
To demonstrate this deficiency, let’s do some algebra together (sorry). Consider this simple problem: If a pen and crayon together cost $2.50, and the crayon costs $2 less than the pen, how much do the pen and crayon each cost? I’ll spare you the mental effort: if P = cost of the pen, then the cost of the crayon is (P–2), which means that P + (P–2) = $2.50.
What happens when we ask Khanmigo to help us simplify the equation from here? The specifics of the chatbot response will vary, but here’s one example of how it can go (with my prompts on the right, Khanmigo’s responses on the left):
It’s not just that Khanmigo gets the math wrong here (as it often does). What’s arguably worse is that it appears to be having a conversation with me, but there’s no real “thinking” happening on its end. Khanmigo has no capacity to theorize what’s occurring inside my head as I repeatedly try to convince it that my simplification of the equation is correct. Instead, it just vaguely insinuates that my reasoning is off in some unidentified way. The entire interaction is an illusion of a conversation. I’m offering justifications for my thinking, while Khanmigo is just generating text in response to my prompts.
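For the record, here is the simplification I was defending, worked out to the end:

```latex
\begin{align*}
P + (P - 2) &= 2.50 \\
2P - 2 &= 2.50 \\
2P &= 4.50 \\
P &= 2.25
\end{align*}
```

So the pen costs $2.25 and the crayon costs $0.25: together they make $2.50, and the crayon is exactly $2 less than the pen.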
This isn’t just unhelpful; it’s counterproductive to learning. At this stage of my life, I’m confident in my knowledge of algebra, but imagine being an eighth-grade student learning the subject for the first time and having Khanmigo repeatedly but wrongly hint that you’ve miscalculated something that in fact you’ve done correctly. The cognitive scientist Gary Marcus describes the behavior of generative AI as “frequently wrong, never in doubt,” and that’s just about the worst quality I can imagine for an educator. We might even call this anti-learning, since these tools are correcting “errors” that students aren’t making.
At this point, an AI enthusiast might object that I’ve used a cherry-picked example of Khanmigo misfiring. But the problem is that both sides are arguing in the dark. In my experience, Khanmigo frequently struggles with simple math, but we lack good data on how often these errors occur. Similarly, AI supporters might argue that these tools are destined to get better over time, but there is zero rigorous evidence—I mean, none—that they are effective at improving student learning at scale right now.
So, maybe we should slow down before broadly pushing this technology into our education system.
There I go again, sounding like a Luddite. Yet one of the great ironies in education is that, though we profess to want to develop critical thinking in all our students, whenever a seemingly transformative new technology is introduced into society writ large, prominent and powerful actors in the education system seem unwilling to think critically about whether its costs might outweigh the alleged benefits.
This needs to stop. By fostering a broader understanding of the science of how our minds work, and by comparing and contrasting this science with the technology underlying generative AI, we can make better decisions about how to use this new tool—or whether to use it at all.
Benjamin Riley is the founder of Cognitive Resonance, a new venture dedicated to improving understanding of human cognition and generative AI.