23.07.2013 Views

Online Boosting Based Intrusion Detection in Changing Environments

Online Boosting Based Intrusion Detection in Changing Environments

Online Boosting Based Intrusion Detection in Changing Environments

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

To focus on some specific <strong>in</strong>trusion types, Jirapumm<strong>in</strong><br />

et al. [19] employed a hybrid neural network model to detect<br />

TCP SYN flood<strong>in</strong>g and port scan attacks. In order to<br />

exam<strong>in</strong>e the detection ability of the specific types of attack,<br />

a smaller data set that conta<strong>in</strong>s only three attack types is<br />

constructed. The samples are randomly selected from the<br />

KDD Cup 1999 data set, and the numbers of samples are<br />

listed <strong>in</strong> Table 5. <strong>Detection</strong> results on the test data are<br />

presented <strong>in</strong> Table 6, <strong>in</strong>clud<strong>in</strong>g also the results given by<br />

Sarasamma et al. [10] for comparison. We can see that the<br />

specific types of <strong>in</strong>trusions are accurately detected by the<br />

onl<strong>in</strong>e boost<strong>in</strong>g based method, and a smaller FAR of 0.17 is<br />

obta<strong>in</strong>ed compared with the other two methods.<br />

Table 5. A Smaller Data Set<br />

Categories Tra<strong>in</strong><strong>in</strong>g data Test data<br />

Normal 20676 60593<br />

Neptune 22783 58001<br />

Satan 316 1633<br />

Portsweep 225 354<br />

Total 44000 120581<br />

Table 6. <strong>Detection</strong> Results for Some Specific Attacks<br />

Algorithms DR (%) FAR (%)<br />

Hybrid Neural Networks [19] 90-99.72 0.06-4.5<br />

Hierarchical K-Map [10] 99.17-99.63 0.34-1.41<br />

Proposed Method 99.55 0.17<br />

3.3 Fast adaption to chang<strong>in</strong>g environments<br />

In the onl<strong>in</strong>e boost<strong>in</strong>g based <strong>in</strong>trusion detection method,<br />

the characteristic of a new type of attack is quickly learned<br />

<strong>in</strong> onl<strong>in</strong>e mode, and the ability to detect the new attack is<br />

soon <strong>in</strong>corporated <strong>in</strong>to the ensemble classifier.<br />

For example, there are two new attack types emerg<strong>in</strong>g,<br />

“guess-passwd” (a R2L attack) and “back” (a DOS attack),<br />

which have not been learned for the current detector. For the<br />

detection algorithms learned <strong>in</strong> batch mode, such as<br />

Adaboost, the detector has to be retra<strong>in</strong>ed from the very<br />

beg<strong>in</strong>n<strong>in</strong>g, and make multi-pass over the entire tra<strong>in</strong><strong>in</strong>g data<br />

<strong>in</strong>clud<strong>in</strong>g the new emerg<strong>in</strong>g ones. However, <strong>in</strong> the proposed<br />

detection method, the detector can be efficiently updated<br />

onl<strong>in</strong>e, and only one pass over the labeled <strong>in</strong>trusion data of<br />

new attack types is needed.<br />

The onl<strong>in</strong>e learn<strong>in</strong>g processes are showed <strong>in</strong> Fig. 1 and<br />

Fig. 2. The horizontal axis <strong>in</strong>dicates the sequence of samples<br />

of this new attack that are be<strong>in</strong>g learned, and the vertical<br />

axis <strong>in</strong>dicates the class label y ~ predicted by the detector.<br />

~ y 1<br />

means the sample is classified as a normal connection,<br />

and ~ y 1<br />

means a network <strong>in</strong>trusion. Note that the<br />

classification results are derived from the updated detector<br />

immediately after onl<strong>in</strong>e learn<strong>in</strong>g of each tra<strong>in</strong><strong>in</strong>g sample.<br />

We can see that the new types of attack are accurately<br />

detected soon after onl<strong>in</strong>e learn<strong>in</strong>g on a few samples (7<br />

samples for “guess-passwd” and 15 samples for “back”).<br />

Figure 1. <strong>Onl<strong>in</strong>e</strong> Learn<strong>in</strong>g of “guess-passwd” Attack<br />

Figure 2. <strong>Onl<strong>in</strong>e</strong> Learn<strong>in</strong>g of “back” Attack<br />

3.4 Comparison with some other algorithms<br />

We also compare the proposed method with some<br />

recent progresses on the <strong>in</strong>trusion detection research. For<br />

example, Esk<strong>in</strong> et al. proposed a geometric framework for<br />

unsupervised anomaly detection <strong>in</strong> [20], and three methods<br />

are presented to detect anomalies <strong>in</strong> the data set. Kayacik et<br />

al. [21] performed <strong>in</strong>trusion detection based on selforganiz<strong>in</strong>g<br />

feature maps. Recently, a multi-layer hierarchical<br />

K-Map was proposed by Sarasamma et al. [10].<br />

<strong>Detection</strong> results of the above methods are listed <strong>in</strong><br />

Table 7. Note that these results are all obta<strong>in</strong>ed us<strong>in</strong>g the<br />

KDD Cup 1999 data set. We can see that the detection<br />

performance of the onl<strong>in</strong>e boost<strong>in</strong>g based method is very<br />

comparable with those of the other methods, and the false<br />

alarm rate is superior to most of exist<strong>in</strong>g methods. Moreover,<br />

the proposed method only needs one pass over the new types<br />

of <strong>in</strong>trusion data <strong>in</strong> order to successfully detect them. This<br />

outperforms the other methods which have to make multipass<br />

over the entire tra<strong>in</strong><strong>in</strong>g data and retra<strong>in</strong> the whole<br />

detector.<br />

Table 7. Comparison with Some Other Algorithms<br />

Algorithms DR (%) FAR (%)<br />

Cluster<strong>in</strong>g [20] 93 10<br />

K-NN [20] 91 8<br />

SVM [20] 91-98 6-10<br />

SOM [21] 89-90.6 4.6-7.6<br />

Hierarchical K-Map [10] 90.94-93.46 2.19-3.99<br />

Proposed Method 90.1329 2.2374

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!