
Volume Two - Academic Conferences

Phelim Murnion and Markus Helfert

intervention within that context. On this basis we postulate a spectrum of EDM research in terms of integration with the educational context, from studies with little or no integration with the context to studies which are highly integrated. However, the level of integration is unlikely to be a simple scalar variable. Any analysis of an ‘integration’ construct within the EDM literature requires a more concrete version of this construct. Since the relationship of data mining to the domain context is a feature of the data mining methodology adopted, an examination of data mining methodologies is a necessary next step.

2.2 Data Mining methodologies

Data mining arose from a number of related computational and statistical approaches, which form a set of data mining methodologies. These methodologies can emphasise technical aspects rather than processes and the relationship to the task domain (Peng, Kou et al. 2008). However, in order to structure the methodologies, research and practice in data mining have expanded in perspective to include steps along a data mining process or data mining model (figure 1). This model can be described as the ‘technical perspective’ on data mining and has been commonly used in EDM research (Luo 2001), (García E. 2007).

Figure 1: Data Mining Model: Technical perspective<br />

As data mining matured as a discipline and as data mining applications were implemented in a wider variety of problem domains, the technical steps were subsumed into a more comprehensive methodology known as the data mining cycle (e.g. figure 2). A number of data mining methodologies/cycles have been developed, but the process of developing these methodologies has exhibited two common features (Hofmann and Tierney 2009): the replacement of a sequence of steps with a cyclical, iterative process, and a greater focus on the connections between the data mining process and the underlying problem context. The most widely accepted model, known as the CRISP-DM (CRoss Industry Standard Process for Data Mining) cycle (Shearer 2000), exhibits both features (figure 2).

Figure 2: CRISP-DM cycle (Shearer 2000)<br />
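
The contrast between a linear sequence of steps and a cyclical process can be illustrated with a minimal sketch. The six phase names below are those of the published CRISP-DM model; everything else (the names `Phase`, `run_cycle`, `evaluation_ok` and `max_iterations`) is hypothetical and introduced only for illustration. The sketch models just one feedback loop, the return from evaluation to business understanding, whereas the full model also allows movement back and forth between other adjacent phases.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, in their nominal forward order."""
    BUSINESS_UNDERSTANDING = "business understanding"
    DATA_UNDERSTANDING = "data understanding"
    DATA_PREPARATION = "data preparation"
    MODELING = "modeling"
    EVALUATION = "evaluation"
    DEPLOYMENT = "deployment"


# Every phase except deployment is repeated on each pass of the cycle.
ITERATED_PHASES = [p for p in Phase if p is not Phase.DEPLOYMENT]


def run_cycle(evaluation_ok, max_iterations=5):
    """Return the sequence of phases visited.

    evaluation_ok is a hypothetical callback: given the pass number
    (starting at 1), it reports whether the results satisfy the
    objectives of the problem context. If not, the process returns
    to business understanding instead of moving on to deployment.
    """
    trace = []
    for iteration in range(1, max_iterations + 1):
        trace.extend(ITERATED_PHASES)
        if evaluation_ok(iteration):
            trace.append(Phase.DEPLOYMENT)
            break
    return trace


if __name__ == "__main__":
    # Assume, for illustration, that the second pass is acceptable.
    for phase in run_cycle(lambda iteration: iteration >= 2):
        print(phase.value)
```

In this sketch the decision to deploy is taken by the evaluation callback rather than by the data mining steps themselves, which is one simple way of representing the connection between the process and the underlying problem context.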
