
SELS Dialogues Journal Volume 3 Issue 1

A diverse collection of articles, each offering a unique perspective and contributing to the ever-expanding landscape of knowledge and creativity.


Educational Technology

Will ChatGPT Get an A on My Test?

by Vincent Wong

ChatGPT is an advanced AI language model developed by OpenAI, capable of understanding and generating human-like text responses across various subjects and contexts. Whether you are a teacher or a student, it is either your worst nightmare or a dream come true. Regardless of which side you are on, whether ChatGPT will ace your test should pique your interest. The answer will either leave your jaw dropping in disbelief or have you doing fist pumps in excitement. In this article, I will share my journey of finding the answer to that question for my own assessments.

It was early July; my 8-year-old daughter and I were both out of school. One afternoon, she came up with a game in which she tries to make ChatGPT guess an object or a character she is thinking about. To her amazement, ChatGPT was able to guess everything she came up with, ranging from Rubik's Cube and Moana to more obscure things like tapioca and gravity. After an hour or so, she asked me, "Baba… you think ChatGPT can get 100 on your test?" Challenge accepted! To initiate my experiment, I fed ChatGPT 3.5 one of my online assessments, verbatim. It was a biology test with a mix of knowledge-based multiple-choice (MC) questions, application (App) questions in multiple-choice format, and multi-select (MS) questions that had more than one correct answer. ChatGPT scored 83%, substantially higher than the class average of 75%. The 70-minute test took the AI only 15 minutes to complete. It performed best on MC questions, producing a correct answer 93% of the time. It performed significantly worse on MS and application questions, scoring 72% and 63%, respectively.
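
For readers who would like to run a similar check on their own question bank, the sketch below shows one way the process could be automated with the OpenAI Python client. It is only a rough illustration, not the workflow used for this article: the model name, the sample question, the answer key, and the letter-matching grading rule are all placeholder assumptions.

```python
# Minimal sketch: feed MC questions to ChatGPT and tally a score.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Illustrative question bank: each entry pairs a prompt with its answer key.
questions = [
    {
        "prompt": (
            "Which of the following describes the expiratory reserve volume?\n"
            "A) Air remaining in the lungs after a maximal exhalation\n"
            "B) Maximum volume of air exhaled after a normal exhalation\n"
            "C) Volume of air moved in a normal breath\n"
            "D) Total volume of air the lungs can hold"
        ),
        "answer": "B",
    },
    # ... remaining questions from the assessment would go here
]

correct = 0
for q in questions:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # ChatGPT 3.5, as in the experiment above
        messages=[
            {"role": "system",
             "content": "Answer with the letter of the best choice only."},
            {"role": "user", "content": q["prompt"]},
        ],
    )
    reply = response.choices[0].message.content.strip().upper()
    if reply.startswith(q["answer"]):
        correct += 1

print(f"Score: {correct}/{len(questions)} "
      f"({100 * correct / len(questions):.0f}%)")
```

Multi-select and application questions would need a different prompt and grading rule than the simple letter match shown here.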

This raised the question: how could I outsmart ChatGPT to the extent that it would stumble or even fail? To assess the limitations of ChatGPT, I changed different parameters of my test questions and fed them to the AI model. Let's start with the parameters that had no impact on the AI's performance.

• Number of choices: For MC and MS questions, doubling and even tripling the number of choices for each question had no effect on the test scores. ChatGPT took a few seconds longer to generate the answer, but the extra choices did not seem to confuse the model.

• Different wordings: ChatGPT was able to answer the questions correctly regardless of how they were phrased. I tried replacing certain phrases with their definitions, but this had no effect. For example, instead of asking for the "expiratory reserve volume", I asked for the "maximum volume of air exhaled" or the "ERV". ChatGPT was able to identify them as having the same meaning.

• Extra information: Including irrelevant information in a question had limited effect. For example, adding information related to the heart to a question about respiration had no effect. However, extra information about family members in a genetics question caused ChatGPT to generate the wrong answer. The AI model seems to be good at discerning information from distinctly different categories (e.g. heart and lung), but it struggles when the extra information is similar to the relevant information.

Now let's look at some strategies that can either deter the use of ChatGPT or reduce its performance.

• Lock-Down Browser: This is perhaps the most straightforward strategy. Lock-Down Browser prevents copying and pasting questions directly from D2L into ChatGPT. A student can still type the questions into ChatGPT on a separate device, but this takes significantly longer. This method doesn't prevent the use of ChatGPT, but it should serve as a deterrent.

