
Dropout Prediction in MOOCs using Learner Activity Features

Sherif Halawa, Daniel Greene, John Mitchell

Absent Mode (M2) Predictor

For most students, absences of several days at a time are not uncommon. As an absence lengthens, however, the probability that the student will not continue to persist in the course increases. The job of this predictor is to red-flag a student once he or she has been absent for a certain number of days, called the “absence threshold”.

To determine the optimum threshold, we studied the variation of accuracy with the threshold. The threshold was varied from 0 to 3 “course units”, where a course unit is defined as the time period between the release of two units of course content. For most courses, a course unit is 1 week long. The variation of specificity and recall with the threshold is presented in Figure 3.

Results

Figure 3. Variation of specificity and recall with the absence threshold.

At very low thresholds, S is very low and R is very high because almost every student has an absence at least as long as the threshold. As the threshold is increased, S improves and R deteriorates. We identified the point at 2 course units (14 days for a typical course) as a convenient threshold, where R and S are both above 0.75, and have selected this value as the threshold of our M2 predictor.
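To make the rule concrete, the following is a minimal sketch of the M2 check under the 2-course-unit (14-day) setting chosen above. The data layout and function names are our own illustration, not the authors' code; it assumes only that each student's most recent activity timestamp is available.

```python
from datetime import datetime, timedelta

ABSENCE_THRESHOLD = timedelta(days=14)  # 2 course units for a typical course

def m2_flag(last_activity, as_of):
    """Red-flag a student whose current absence has reached the threshold."""
    return (as_of - last_activity) >= ABSENCE_THRESHOLD

# Hypothetical usage: a student last active 16 days ago is flagged, one active
# 5 days ago is not. In the threshold study above, the same check is repeated
# with thresholds from 0 to 3 course units and S and R recomputed each time.
now = datetime(2014, 3, 1)
print(m2_flag(now - timedelta(days=16), as_of=now))  # True
print(m2_flag(now - timedelta(days=5), as_of=now))   # False
```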

Specificity and Recall

First, we evaluate our predictor’s specificity and recall observed over 10 test courses different from the 6 training-set courses. Table 3 shows the best, median, and worst observed recall and specificity figures.

Table 3. Best, median, and worst specificity and recall for various predictor components

                      Individual feature predictors                M1         M2         Integrated
              assn-performance  video-skip  assn-skip  lag         predictor  predictor  predictor

Specificity
  Best        1.0               0.86        0.96       0.96        0.85       0.93       0.68
  Median      1.0               0.82        0.92       0.84        0.72       0.80       0.58
  Worst       0.96              0.40        0.73       0.47        0.36       0.70       0.29

Recall
  Best        0.008             0.58        0.38       0.43        0.67       0.98       0.99
  Median      0.006             0.30        0.25       0.17        0.48       0.93       0.93
  Worst       0.00              0.23        0.10       0.14        0.41       0.77       0.91
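For reference on how the figures in Table 3 are read, the sketch below shows the standard specificity and recall computation over per-student flags and dropout labels, treating dropout as the positive class. The variable names and data layout are our own, hypothetical illustration.

```python
def specificity_and_recall(flags, dropouts):
    """flags[i]    -- True if the predictor red-flagged student i
       dropouts[i] -- True if student i actually dropped out (positive class)"""
    tp = sum(f and d for f, d in zip(flags, dropouts))
    fp = sum(f and not d for f, d in zip(flags, dropouts))
    tn = sum(not f and not d for f, d in zip(flags, dropouts))
    fn = sum(not f and d for f, d in zip(flags, dropouts))
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # persisting students not flagged
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # dropouts that were flagged
    return specificity, recall

# Hypothetical example: 3 of 4 dropouts flagged, 1 of 5 persisting students flagged.
flags    = [True, True, True, False, True, False, False, False, False]
dropouts = [True, True, True, True, False, False, False, False, False]
print(specificity_and_recall(flags, dropouts))  # (0.8, 0.75)
```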

To develop an understanding of the strengths and weaknesses of our predictor, we need to interpret the numbers in Table 3.

The ‘assn-performance’ (assessment performance) feature

This feature generally has high precision and specificity. Over 95% of the students it flags (students with average assessment scores below 0.5) eventually drop out. However, its recall was observed to be generally very low compared to the other features. For the majority of MOOC quizzes, mean scores are in the range of 70% to 85%. Even though some students occasionally score below 50% on certain quizzes, very few students have average quiz scores below 50%. This could be attributed to the deliberate easiness with which MOOC assessments are designed, or to MOOCs’ self-selective nature (students who believe that the course will be too difficult refrain from enrolling or from attempting assessments).
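A minimal sketch of the assn-performance flag as described above is given below. The 0.5 cutoff comes from the text; the function name, data layout, and handling of students with no graded work are our own assumptions.

```python
def assn_performance_flag(assessment_scores, cutoff=0.5):
    """Red-flag a student whose average assessment score (each on a 0-1 scale)
    falls below the cutoff. Students with no graded work are not flagged here."""
    if not assessment_scores:
        return False
    return sum(assessment_scores) / len(assessment_scores) < cutoff

# Hypothetical usage: one weak quiz does not trigger the flag if the average stays high.
print(assn_performance_flag([0.9, 0.45, 0.8]))  # False (average ~0.72)
print(assn_performance_flag([0.4, 0.3, 0.5]))   # True  (average 0.40)
```

Because average quiz scores rarely fall this low, the rule fires for very few students, which is consistent with the near-zero recall and near-perfect specificity seen for this feature in Table 3.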
