Study Shows How ChatGPT Harms Learning

Does AI actually help students learn? A recent high school experiment provides a cautionary tale. Researchers at the University of Pennsylvania found that Turkish high school students who had access to ChatGPT while doing practice math problems did worse on a math test than students who didn’t have access to ChatGPT. Those with ChatGPT solved 48% more of the practice problems correctly, but they ultimately scored 17% worse on a test of the topic they were learning.

A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127% more of them correctly than students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better. Students who worked through the practice problems the old-fashioned way, on their own, matched their test scores.

The researchers titled their paper “Generative AI Can Harm Learning” to make clear to parents and educators that the current crop of freely available AI chatbots can “substantially inhibit learning.” Even a fine-tuned version of ChatGPT designed to mimic a tutor doesn’t necessarily help. The researchers believe the problem is that students are using the chatbot as a “crutch.” When they analyzed the questions that students typed into ChatGPT, they found that students often simply asked for the answer, so they were not building the skills that come from solving the problems themselves.

ChatGPT’s errors may also have been a contributing factor. The chatbot answered the math problems correctly only half of the time. Its arithmetic computations were wrong 8% of the time, but the bigger problem was that its step-by-step approach to solving a problem was wrong 42% of the time. The tutoring version of ChatGPT was directly fed the correct solutions, so these errors were minimized.

This is just one experiment, conducted in another country, and more studies will be needed to confirm its findings. But it was a large one, involving nearly a thousand students in grades nine through eleven during the fall of 2023. The authors likened the problem of learning with ChatGPT to autopilot. They recounted how over-reliance on autopilot led the Federal Aviation Administration to recommend that pilots minimize their use of the technology; regulators wanted to make sure that pilots still know how to fly when autopilot fails to function correctly.

ChatGPT is not the first technology to present a tradeoff in education. Typewriters and computers reduce the need for handwriting. Calculators reduce the need for arithmetic. When students have access to ChatGPT, they might answer more problems correctly, but they learn less. Getting the right answer to one problem won’t help them with the next one.

Allison Green
Boston Tutoring Services
