Network Traffic Anomaly Detection Based on Packet Bytes


Lower probabilities result in higher anomaly scores, since these are presumably more likely to be hostile.

ADAM is a classifier which can be trained on both known attacks and on (presumably) attack-free traffic. Patterns which do not match any learned category are flagged as anomalous. ADAM also models address subnets (prefixes) in addition to ports and individual addresses. NIDES is a component of EMERALD [9], a system that integrates host and network based techniques using both signature and anomaly detection. The NIDES component, like SPADE and ADAM, models ports and addresses, flagging differences between short and long term behavior.

SPADE, ADAM, and NIDES use stationary models, in which the probability of an event is estimated by its average frequency during training. PHAD [6], ALAD [8], and LERAD [7] use nonstationary models, in which the probability of an event depends instead on the time since it last occurred. For each attribute, they collect a set of allowed values (anything observed at least once in training), and flag novel values as anomalous. Specifically, they assign a score of tn/r to a novel-valued attribute, where t is the time since the attribute was last anomalous (during either training or testing), n is the number of training observations, and r is the size of the set of allowed values. Note that r/n is the average rate of anomalies in training; thus attributes with high n/r are not likely to generate anomalies in testing and ought to score high. The factor t makes the model nonstationary, yielding higher scores for attributes for which there have been no anomalies for a long time.

PHAD, ALAD, and LERAD differ in the attributes that they monitor.
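The tn/r score described above can be sketched in a few lines of Python. This is a minimal illustration only; the class and method names are ours, not from the cited systems, and details such as time units and when the allowed set is extended vary between PHAD, ALAD, and LERAD.

```python
class AttributeModel:
    """Nonstationary tn/r anomaly score for one attribute (sketch)."""

    def __init__(self):
        self.allowed = set()      # allowed values; r = len(self.allowed)
        self.n = 0                # number of training observations
        self.last_anomaly = 0.0   # time this attribute was last anomalous

    def train(self, value, t):
        """Observe one training value at time t."""
        self.n += 1
        if value not in self.allowed:
            self.allowed.add(value)   # novel in training: extend allowed set
            self.last_anomaly = t

    def score(self, value, t):
        """Return tn/r for a novel value at time t, 0 for an allowed value."""
        if value in self.allowed:
            return 0.0
        elapsed = t - self.last_anomaly   # t in tn/r: time since last anomaly
        self.last_anomaly = t
        return elapsed * self.n / len(self.allowed)
```

Attributes with many training observations and few allowed values (high n/r) score high when a novel value finally appears, and the factor t suppresses repeated anomalies: a second novel value arriving shortly after the first scores much lower.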
PHAD (Packet Header Anomaly Detector) has 34 attributes, corresponding to the Ethernet, IP, TCP, UDP, and ICMP packet header fields. It builds a single model of all network traffic, incoming or outgoing. ALAD (Application Layer Anomaly Detector) models incoming server TCP requests: source and destination addresses and ports, opening and closing TCP flags, and the list of commands (the first word on each line) in the application payload. Depending on the attribute, it builds separate models for each target host, port number (service), or host/port combination. LERAD (LEarning Rules for Anomaly Detection) also models TCP connections, but samples the training data to suggest large subsets to model. For example, if it samples two HTTP port requests to the same host, then it might suggest a rule that all requests to this host must be HTTP, and it builds a port model for this host.

PHAD, ALAD, and LERAD were tested on the 1999 DARPA off-line intrusion detection evaluation data set [5], by training on one week of attack-free traffic (inside sniffer, week 3) and testing on two weeks of traffic (weeks 4 and 5) containing 185 detectable instances of 58 attacks. The DARPA data set uses variations of exploits taken from public sources, which are used to attack systems running SunOS, Solaris, Linux, Windows NT, and a Cisco router on an Ethernet network with a simulated Internet connection and background traffic.
At a threshold allowing 100 false alarms, PHAD detects 54 attack instances [6], ALAD detects 60, or 70 when the results are merged with PHAD [8], and LERAD detects 114, or 62% [7].

Table 1 shows the results for the top 4 systems (including versions of EMERALD and ADAM) of the 18 (submitted by 8 organizations) that participated in the original 1999 evaluation. These systems used a variety of techniques, both host and network based, and both signature and anomaly detection. The number of detections is shown out of the total number of instances that the system was designed to detect. This counts only attacks visible in the examined data, and only those of the types specified by the developer: probe, denial of service (DOS), remote to local (R2L), or user to root (U2R). PHAD, ALAD, and LERAD were evaluated later under similar criteria by the authors, but the evaluations were not blind or independent. The developers were able to tune their systems on the test data.

Table 1. Top four systems by percentage of attacks detected in the 1999 DARPA IDS evaluation at 100 false alarms, out of the total number of attacks they were designed to detect [5, Table 6].

  System     Detections
  Expert 1   85/169 (50%)
  Expert 2   81/173 (47%)
  Dmine      41/102 (40%)
  Forensics  15/27  (55%)

3. THE NETAD MODEL

NETAD (Network Traffic Anomaly Detector), like PHAD, detects anomalies in network packets. However, it differs as follows:

1. The traffic is filtered, so only the start of incoming server requests is examined.
2. Starting with the IP header, we treat each of the first 48 bytes as an attribute for our models -- we do not parse the packet into fields.
3. There are 9 separate models corresponding to the most common protocols (IP, TCP, HTTP, etc.).
4. The anomaly score tn/r is modified to (among other things) score rare, but not necessarily novel, events.

3.1. Traffic Filtering

The first stage of NETAD is to filter out uninteresting traffic. Most attacks are initiated against a target server or operating system, so it is usually sufficient to examine only the first few packets of incoming server requests. This not only filters out traffic likely to generate false alarms, but also speeds up processing. NETAD removes the following data.

• Non-IP packets (ARP, hub test, etc.).
• All outgoing traffic (which means that response anomalies cannot be detected).
• All TCP connections starting with a SYN-ACK packet, indicating the connection was initiated by a local client. Normally, attacks are initiated remotely against a server.
• UDP to high numbered ports (>1023). Normally this is to a client (e.g. a DNS resolver).
• TCP starting after the first 100 bytes (as determined by the sequence number). A 4K hash table is used to store the starting TCP SYN sequence number for each source/destination address/port combination. There is a small amount of packet loss due to hash collisions.
• Packets to any address/port/protocol combination (TCP, UDP, or ICMP) if more than 16 have been received in the last 60 seconds. Again, a 4K hash table (without collision detection) is used to index a queue of the last 16 packet times. This limits packet floods.
• Packets are truncated to 250 bytes (although NETAD uses only the first 48 bytes).
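The filtering rules above can be sketched as a single predicate over decoded packets. This is a hypothetical sketch: the field names and the dictionary-based flow table are ours, the real NETAD uses fixed 4K hash tables, and the TCP sequence-number rule (dropping segments past the first 100 payload bytes), which needs per-connection SYN tracking, is omitted.

```python
from collections import deque

MAX_PER_MIN = 16   # packets allowed per flow key per 60 seconds
HIGH_PORT = 1023   # UDP above this is assumed to be client traffic
TRUNCATE = 250     # bytes kept per packet

recent = {}  # flow key -> arrival times of the last 16 packets

def keep_packet(pkt, now):
    """Return the (truncated) payload if pkt passes the filters, else None."""
    if not pkt["is_ip"]:
        return None                    # non-IP packets (ARP, hub test, ...)
    if pkt["outgoing"]:
        return None                    # all outgoing traffic
    if pkt["syn_ack"]:
        return None                    # connection opened by a local client
    if pkt["proto"] == "UDP" and pkt["dport"] > HIGH_PORT:
        return None                    # UDP to a high (client) port
    key = (pkt["dst"], pkt["dport"], pkt["proto"])
    q = recent.setdefault(key, deque(maxlen=MAX_PER_MIN))
    if len(q) == MAX_PER_MIN and now - q[0] < 60:
        return None                    # rate limit: packet flood
    q.append(now)
    return pkt["data"][:TRUNCATE]      # truncate to 250 bytes
```

The bounded deque stands in for NETAD's queue of the last 16 packet times: a packet is dropped only when 16 earlier packets to the same address/port/protocol all arrived within the last minute.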
