29.03.2013 Views

October 2006 Volume 9 Number 4

October 2006 Volume 9 Number 4

October 2006 Volume 9 Number 4

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

P(B|D) is the probability of a network B given data D.<br />

P(D|B) is the probability the network gives to data D.<br />

P(D) is the ‘general’ probability of data.<br />

P(B) is the probability of the network before seen the data.<br />

Applying equation (1) to (2), we get:<br />

P(<br />

D | B1)<br />

P(<br />

B1)<br />

r = (3)<br />

P(<br />

D | B ) P(<br />

B )<br />

2<br />

Having not seen the data, no prior knowledge is obtainable and thus no straightforward method of computing<br />

P(B1) and P(B2) is feasible. A common way to deal with this is to assume that every network has the same<br />

probability with all the others, so equation (3) becomes:<br />

P(<br />

D | B1)<br />

r =<br />

P(<br />

D | B )<br />

2<br />

The probability the model gives to the data can be extracted using the following formula (Glymour and Cooper,<br />

1999):<br />

P(<br />

D | B)<br />

=<br />

n qi<br />

ri<br />

ijk<br />

qi<br />

ri<br />

qi<br />

∏<br />

i=<br />

1 j=<br />

1 Ξ<br />

k=<br />

1 Ξ<br />

Γ(<br />

+ Nij<br />

) Γ(<br />

)<br />

qi<br />

ri<br />

qi<br />

∏∏<br />

Ξ<br />

Γ(<br />

)<br />

where:<br />

Γ is the gamma function.<br />

n equals to the number of variables.<br />

ri denotes the number of values in i:th variable.<br />

qi denotes the number of possible different value combinations the parent variables can take.<br />

Nij depicts the number of rows in data that have j:th value combinations for parents of i:th variable.<br />

Nijk corresponds to the number of rows that have k:th value for the i:th variable and which also have j:th<br />

value combinations for parents of i:th variable.<br />

Ξ is the equivalent sample size, a parameter that determines how readily we change our beliefs about the<br />

quantitative nature of dependencies when we see the data. In our study, we follow a simple choice inspired<br />

by Jeffreys (1939) prior. Ξ is equal to the average number of values variables have, divided by 2.<br />

Given the great number of possible networks produced by the learning process, a search algorithm has to be<br />

applied. We follow greedy search with one modification: instead of comparing all candidate networks, we<br />

consider investigating the set that resembles the best current model the most.<br />

In general, a BBN is capable of computing the probability distribution for any partial subset of variables, given<br />

the values or distributions of any subset of the remaining variables. Note that the values have to be discretised,<br />

and different discretisation size affects the network. As we shall discuss in the result section, BBNs constitute a<br />

significant tool for knowledge representation, visualising the relationships between features and subsets of them.<br />

This fact has a significant result on identifying which features actually affect the class variable, thus reducing<br />

training data size without any significant impact in the performance.<br />

However, we should stress the fact that transforming the Bayesian Belief Network elicitation process to a<br />

tractable procedure is not enough. The most important step, while attempting to resolve a problem using the<br />

application of such a technique, is the establishment of a deep understanding of the nature of the problem. This<br />

understanding could lead to a manageable transformation of the problem domain, where applications of BBNs<br />

could serve as an important inference tool. From our expertise obtained dealing with the adaptation problems<br />

described in the following, attempting to re-express the problem to be modelled should first include an initial<br />

understanding and clarification of the prominent variables of the domain which influence our reasoning about the<br />

state or the goal of the student while executing a task. These variables, coupled with nodes indicating the issue<br />

(the ‘class’ variable) that is needed to infer upon, are used to feed the BBN construction process. The states of<br />

the variables represent actual user’s actions, neither at a high task level, nor in the lowest keystroke level of<br />

interaction. The granularity of the abstraction is heavily depended by the nature of the problem and our aim. It<br />

2<br />

Γ(<br />

Ξ<br />

+ N<br />

)<br />

154

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!