
SELS Dialogues Journal Volume 3 Issue 1

A diverse collection of articles, each offering a unique perspective and contributing to the ever-expanding landscape of knowledge and creativity.


Educational Technology

Will ChatGPT Get an A on My Test?

by Vincent Wong

ChatGPT is an advanced AI language model developed by OpenAI, capable of understanding and generating human-like text responses across various subjects and contexts. Whether you are a teacher or a student, it is either your worst nightmare or a dream come true. Regardless of which side you are on, whether ChatGPT will ace your test should pique your interest. The answer will either leave your jaw dropping in disbelief or have you doing fist pumps in excitement. In this article, I will share my journey of finding the answer to that question for my own assessments.

It was early July; my 8-year-old daughter and I were both out of school. One afternoon, she came up with a game in which she tries to make ChatGPT guess an object or a character she is thinking about. To her amazement, ChatGPT was able to guess everything she came up with, ranging from Rubik's Cube and Moana to more obscure things like tapioca and gravity. After an hour or so, she asked me, "Baba… you think ChatGPT can get 100 on your test?" Challenge accepted! To initiate my experiment, I fed ChatGPT 3.5 one of my online assessments, verbatim. It was a biology test with a mix of knowledge-based multiple-choice (MC) questions, application (App) questions in multiple-choice format, and multi-select (MS) questions that had more than one correct answer. ChatGPT scored 83%, substantially higher than the class average of 75%. The 70-minute test took the AI only 15 minutes to complete. It performed best on MC questions, producing a correct answer 93% of the time. It performed significantly worse on MS and application questions, scoring 72% and 63%, respectively.
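
For readers who would like to run a similar check on their own question bank, the sketch below shows one way the process could be automated with the OpenAI Python client. It is only a rough illustration, not the workflow used for this article: the model name, the sample question, the answer key, and the letter-matching grading rule are all placeholder assumptions.

```python
# Minimal sketch: feed MC questions to ChatGPT and tally a score.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Illustrative question bank: each entry pairs a prompt with its answer key.
questions = [
    {
        "prompt": (
            "Which of the following describes the expiratory reserve volume?\n"
            "A) Air remaining in the lungs after a maximal exhalation\n"
            "B) Maximum volume of air exhaled after a normal exhalation\n"
            "C) Volume of air moved in a normal breath\n"
            "D) Total volume of air the lungs can hold"
        ),
        "answer": "B",
    },
    # ... remaining questions from the assessment would go here
]

correct = 0
for q in questions:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # ChatGPT 3.5, as in the experiment above
        messages=[
            {"role": "system",
             "content": "Answer with the letter of the best choice only."},
            {"role": "user", "content": q["prompt"]},
        ],
    )
    reply = response.choices[0].message.content.strip().upper()
    if reply.startswith(q["answer"]):
        correct += 1

print(f"Score: {correct}/{len(questions)} "
      f"({100 * correct / len(questions):.0f}%)")
```

Multi-select and application questions would need a different prompt and grading rule than the simple letter match shown here.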

This raised the question: how could I outsmart ChatGPT to the extent that it would stumble or even fail? To assess the limitations of ChatGPT, I changed different parameters of my test questions and fed them to the AI model. Let's start with the parameters that had no impact on the AI's performance.

• Number of choices: For MC and MS questions, doubling and even tripling the number of choices for each question had no effect on the test scores. ChatGPT took a few seconds longer to generate the answer, but the extra choices did not seem to confuse the model.

• Different wordings: ChatGPT was able to answer the questions correctly regardless of how they were phrased. I tried replacing certain phrases with their definitions, but this had no effect. For example, instead of asking for the "expiratory reserve volume", I asked for the "maximum volume of air exhaled" or the "ERV". ChatGPT was able to identify them as having the same meaning.

• Extra information: Including irrelevant information in a question had limited effect. For example, adding information related to the heart to a question about respiration had no effect. However, extra information about family members in a genetics question caused ChatGPT to generate the wrong answer. The AI model seems to be good at discerning information from distinctly different categories (e.g. heart and lung), but it struggles when the extra information is similar to the relevant information.

Now let's look at some strategies that can either deter the use of ChatGPT or reduce its performance.

• Lock-Down Browser: This is perhaps the most straightforward strategy. Lock-Down Browser prevents copying and pasting questions directly from D2L into ChatGPT. A student can still type the questions into ChatGPT on a separate device, but this takes significantly longer. This method doesn't prevent the use of ChatGPT, but it should serve as a deterrent.

