Online Boosting Based Intrusion Detection in Changing Environments

use of the information supplied by part of the training data set. Thus, the key issue of online learning research is how to control the above difference while training a classifier online.

Table 1. Adaboost Algorithm

Input: $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, $M$, $L_b$
Initialization: $w_n^{(1)} = 1/N$, $n = 1, 2, \ldots, N$
For $m = 1, 2, \ldots, M$
    $h^{(m)} = L_b(\{(x_1, y_1), \ldots, (x_n, y_n)\}, w^{(m)})$
    Calculate the weighted error of $h^{(m)}$:
        $\varepsilon^{(m)} = \sum_{n:\, h^{(m)}(x_n) \neq y_n} w_n^{(m)}$
    If $\varepsilon^{(m)} \geq 1/2$, then
        Set $M = m - 1$ and stop loop
    endif
    Update the weights:
        $w_n^{(m+1)} = w_n^{(m)} \times \begin{cases} \dfrac{1}{2(1 - \varepsilon^{(m)})} & \text{if } h^{(m)}(x_n) = y_n \\[4pt] \dfrac{1}{2\varepsilon^{(m)}} & \text{if } h^{(m)}(x_n) \neq y_n \end{cases}$    (3)
Output the final strong classifier:
    $H(x) = \mathrm{sign}\Big( \sum_{m=1}^{M} \lg \dfrac{1 - \varepsilon^{(m)}}{\varepsilon^{(m)}} \, h^{(m)}(x) \Big)$
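As a concrete illustration of Table 1, the following is a minimal sketch of batch Adaboost in Python. The single-feature decision-stump base learner, the toy data, and all function names are our own illustration, not part of the paper; only the weighting logic follows Table 1.

```python
import math

def train_stump(X, y, w):
    """Base learner L_b: best single-feature threshold stump under weights w."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted(set(row[j] for row in X)):
            for s in (1, -1):
                pred = [s if row[j] >= thr else -s for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, s)
    _, j, thr, s = best
    return lambda x, j=j, thr=thr, s=s: s if x[j] >= thr else -s

def adaboost(X, y, M, Lb=train_stump):
    """Batch Adaboost as in Table 1; labels are in {-1, +1}."""
    N = len(X)
    w = [1.0 / N] * N                       # w_n^(1) = 1/N
    ensemble = []                           # (vote weight, weak classifier)
    for m in range(M):
        h = Lb(X, y, w)
        eps = sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
        if eps >= 0.5:                      # no better than chance: stop loop
            break
        if eps == 0.0:                      # perfect stump: avoid division by zero
            ensemble.append((1.0, h))
            break
        # Eq. (3): correct samples are down-weighted, errors up-weighted,
        # and the weights still sum to 1 after the update.
        w = [wi / (2 * (1 - eps)) if h(xi) == yi else wi / (2 * eps)
             for wi, xi, yi in zip(w, X, y)]
        # any log base yields the same sign as the paper's lg
        ensemble.append((math.log((1 - eps) / eps), h))
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# toy 1-D data: the positive class is an interval, so one stump is not enough
X = [[0.05], [0.1], [0.2], [0.4], [0.5], [0.6], [0.8], [0.9]]
y = [-1, -1, -1, 1, 1, 1, -1, -1]
clf = adaboost(X, y, M=3)
print(all(clf(xi) == yi for xi, yi in zip(X, y)))  # True
```

Three boosting rounds suffice here because the weight update forces later stumps to concentrate on the samples the earlier ones misclassified.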

In order to adapt Adaboost to the data stream environment, Oza proposed an online version of Adaboost in [16], where a convergence proof for the online version was also given. Recently, Grabner and Bischof [17] successfully introduced the online boosting algorithm into the computer vision field.

The detailed online boosting algorithm is presented in Table 2. Here $h$ is the set of weak classifiers to be updated online, and $L_o$ is the online base model learning algorithm. Note that in the batch Adaboost algorithm, the sum of the sample weights remains 1:

$\sum_{n=1}^{N} w_n^{(m)} = 1, \quad m = 1, \ldots, M$,

where the definition of $w_n^{(m)}$ is already given in Section 2.2. However, in online boosting, the weight $\lambda$ evolves individually for each training sample. The weight $w_n^{(m)}$ is actually a sampling weight used while generating the $m$-th weak classifier. The same function is performed by the parameter $k$ in online boosting, which is randomly generated from a Poisson distribution with parameter $\lambda$. As for the weighted classification error of $h^{(m)}$, an approximation is used:

$\varepsilon^{(m)} = \dfrac{sw_m}{sc_m + sw_m}$    (4)

which involves only samples already seen. Moreover, the number of weak classifiers is not fixed in Adaboost, while in online boosting the number of weak classifiers is fixed beforehand, and the weak classifiers are all learned online. Although it may differ greatly from the classifier learned in batch mode when only a few training samples have been processed, the online ensemble classifier converges statistically to the ensemble generated in batch mode as the number of training samples increases [16].
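The Poisson-based resampling described above can be checked numerically. The sketch below is our own illustration in plain Python; it uses Knuth's sampling method because the standard library has no Poisson sampler, and all names are assumptions, not the paper's.

```python
import math
import random

def poisson(lam, rng):
    """Sample k ~ Poisson(lam) via Knuth's method."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(42)
lam = 2.5
draws = [poisson(lam, rng) for _ in range(100_000)]
mean_k = sum(draws) / len(draws)
# E[k] = lam, so on average the sample is presented to the online learner
# lam times -- mirroring the role of the sampling weight in batch boosting.
print(round(mean_k, 2))
```

Because $E[k] = \lambda$, showing a sample to the online base learner $k$ times approximates training on it with weight $\lambda$, which is exactly the substitution online boosting makes.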

Table 2. Online Boosting Algorithm

Input: $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, $M$, $L_o$
Initialization: $sc_m = 0$, $sw_m = 0$, $m = 1, 2, \ldots, M$
For each new training sample $(x, y)$
    Initialize the weight of the current sample: $\lambda = 1$
    For $m = 1, 2, \ldots, M$
        Set $k$ according to $\mathrm{Poisson}(\lambda)$
        Do $k$ times
            $h^{(m)} = L_o((x, y), h^{(m)})$
        If $h^{(m)}(x) = y$, then
            $sc_m = sc_m + \lambda$
            $\varepsilon^{(m)} = \dfrac{sw_m}{sc_m + sw_m}$
            $\lambda = \lambda \cdot \dfrac{1}{2(1 - \varepsilon^{(m)})}$
        else
            $sw_m = sw_m + \lambda$
            $\varepsilon^{(m)} = \dfrac{sw_m}{sc_m + sw_m}$
            $\lambda = \lambda \cdot \dfrac{1}{2\varepsilon^{(m)}}$
        endif
Output the final strong classifier:
    $H(x) = \mathrm{sign}\Big( \sum_{m=1}^{M} \lg \dfrac{1 - \varepsilon^{(m)}}{\varepsilon^{(m)}} \, h^{(m)}(x) \Big)$
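To make Table 2 concrete, here is a minimal sketch of the online boosting loop in Python. The weak learner (a nearest-class-mean rule on one feature), the Poisson sampler, and all class and function names are our own assumptions for illustration; only the update logic follows Table 2.

```python
import math
import random

def poisson(lam, rng):
    """k ~ Poisson(lam), via Knuth's method (stdlib random lacks Poisson)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

class MeanClassifier:
    """Illustrative online weak learner L_o: nearest class mean on one feature."""
    def __init__(self, feature):
        self.j = feature
        self.sums = {1: 0.0, -1: 0.0}
        self.counts = {1: 0, -1: 0}

    def update(self, x, y):
        self.sums[y] += x[self.j]
        self.counts[y] += 1

    def predict(self, x):
        if not (self.counts[1] and self.counts[-1]):
            return 1  # untrained default
        mp = self.sums[1] / self.counts[1]
        mn = self.sums[-1] / self.counts[-1]
        return 1 if abs(x[self.j] - mp) <= abs(x[self.j] - mn) else -1

class OnlineBoost:
    def __init__(self, M, num_features, seed=0):
        self.h = [MeanClassifier(m % num_features) for m in range(M)]
        self.sc = [0.0] * M   # weighted count of correctly classified samples
        self.sw = [0.0] * M   # weighted count of misclassified samples
        self.rng = random.Random(seed)

    def partial_fit(self, x, y):
        lam = 1.0  # weight of the current sample
        for m, h in enumerate(self.h):
            for _ in range(poisson(lam, self.rng)):
                h.update(x, y)                     # h^(m) = L_o((x, y), h^(m))
            if h.predict(x) == y:
                self.sc[m] += lam
                eps = self.sw[m] / (self.sc[m] + self.sw[m])
                lam *= 1.0 / (2 * (1 - eps))       # shrink weight when correct
            else:
                self.sw[m] += lam
                eps = self.sw[m] / (self.sc[m] + self.sw[m])
                lam *= 1.0 / (2 * eps)             # grow weight when wrong

    def predict(self, x):
        score = 0.0
        for m, h in enumerate(self.h):
            total = self.sc[m] + self.sw[m]
            eps = self.sw[m] / total if total else 0.5
            eps = min(max(eps, 1e-12), 1 - 1e-12)  # keep lg((1-eps)/eps) finite
            score += math.log((1 - eps) / eps) * h.predict(x)
        return 1 if score >= 0 else -1

stream = [([-2.0], -1), ([-1.2], -1), ([-0.8], -1),
          ([0.8], 1), ([1.2], 1), ([2.0], 1)]
clf = OnlineBoost(M=3, num_features=1, seed=42)
for _ in range(10):                 # simulate a repeating data stream
    for x, y in stream:
        clf.partial_fit(x, y)
print(clf.predict([-1.5]), clf.predict([1.5]))
```

Note how each sample is processed once and discarded, and how the running counts $sc_m$ and $sw_m$ stand in for the full weighted error of the batch algorithm, using equation (4).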

2.4 Online boosting based intrusion detection

An intrusion detection algorithm is expected to fulfill three main requirements in order to be suitable for practical use:
- the detection should be performed in real time;
- the detection accuracy should be as high as possible, which means a high detection rate to guarantee system security and a low false alarm rate to reduce unnecessary human burden;
- the detector should adapt quickly to changing network environments, which implies the ability to accurately detect any new type of attack soon after its emergence.

In order to make the updating of the intrusion detector efficient, the training of the detector should not be time-consuming, which rules out some complex classifiers; on the other hand, the strict requirement on detection performance makes the direct use of simple classifiers impractical. Considering the variety of attributes in network connection data, the situation is even worse. We will try to resolve these difficulties in the proposed online boosting based detection method.
