13.05.2018 Views


Handling Missing Data in Air Pollution Datasets: A Comparison between Linear Interpolation Method and Mean Top Bottom Method

Muhammad Hafizi Bin Mat Rahim

Supervisor: Dr. Nurul Adyani Bt Ghazali

Bachelor of Technology (Environment)

School of Ocean Engineering

Universiti Malaysia Terengganu

Missing data are a common problem in real-life datasets, especially air quality datasets. This study aims to replace missing values using the Linear Interpolation (LI) and Mean Top Bottom (MTB) techniques in air quality datasets from Kemaman and Kuala Terengganu. Five performance indicators, namely mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), prediction accuracy (PA) and index of agreement (d2), were calculated to identify the better method for replacing the missing values. Based on these indicators, the MTB method was found to predict missing values better than the LI method: it gave the smallest MAE and RMSE values and the highest R², PA and d2 values for almost all parameters.
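The two imputation techniques and two of the error indicators can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: MTB is taken to mean the average of the nearest observed value above and below a gap, LI draws a straight line between those same two neighbours, and gaps that touch the start or end of the series are left unfilled. The function names and the toy series are hypothetical and are not taken from the study. R², PA and d2 are omitted because the abstract gives no formulas for them.

```python
import math


def _neighbours(series, i):
    """Indices of the nearest observed values to the left and right of i."""
    left = i - 1
    while left >= 0 and series[left] is None:
        left -= 1
    right = i + 1
    while right < len(series) and series[right] is None:
        right += 1
    return left, right


def linear_interpolation(series):
    """Fill each None on the straight line between its observed neighbours."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left, right = _neighbours(series, i)
        if left < 0 or right >= len(series):
            continue  # assumption: gaps at the series edges stay unfilled
        weight = (i - left) / (right - left)
        filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def mean_top_bottom(series):
    """Fill each None with the mean of its observed neighbours (assumed MTB)."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left, right = _neighbours(series, i)
        if left < 0 or right >= len(series):
            continue  # assumption: gaps at the series edges stay unfilled
        filled[i] = (series[left] + series[right]) / 2
    return filled


def mae(observed, predicted):
    """Mean absolute error between observed and imputed values."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error between observed and imputed values."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Toy evaluation: hide known values, impute them, then score the imputations.
complete = [10.0, 12.0, 14.0, 17.0, 18.0, 20.0]  # hypothetical readings
hidden = [1, 2, 3]                               # positions masked as missing
gappy = [None if i in hidden else v for i, v in enumerate(complete)]
truth = [complete[i] for i in hidden]

for name, method in [("LI", linear_interpolation), ("MTB", mean_top_bottom)]:
    imputed = method(gappy)
    estimates = [imputed[i] for i in hidden]
    print(name, "MAE:", mae(truth, estimates), "RMSE:", rmse(truth, estimates))
```

Masking values that are actually known, imputing them, and then scoring the imputations is one common way to compare methods; whether the study followed this exact procedure is not stated in the abstract. The toy series only demonstrates the mechanics and says nothing about which method performs better on the Kemaman and Kuala Terengganu data.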

45 | UMT UNDERGRADUATE RESEARCH DAY 2019
