13.07.2015 Views

Effect of Intrusion Detection and Response on Reliability of Cyber ...

Effect of Intrusion Detection and Response on Reliability of Cyber ...

Effect of Intrusion Detection and Response on Reliability of Cyber ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

ong>Effectong> ong>ofong> ong>Intrusionong> ong>Detectionong> ong>andong> ong>Responseong> onReliability ong>ofong> Cyber Physical SystemsRobert MitchellDepartment ong>ofong> Computer ScienceVirginia TechEmail: rrmitche@vt.eduIng-Ray ChenDepartment ong>ofong> Computer ScienceVirginia TechEmail: irchen@cs.vt.eduAbstract—In this paper we analyze the effect ong>ofong> intrusiondetection ong>andong> response on the reliability ong>ofong> a cyber physicalsystem (CPS) comprising sensors, actuators, control units,ong>andong> physical objects for controlling ong>andong> protecting a physicalinfrastructure. We develop a probability model based onStochastic Petri nets to describe the behavior ong>ofong> the CPS inthe presence ong>ofong> both malicious nodes exhibiting a range ong>ofong>attacker behaviors, ong>andong> an intrusion detection ong>andong> responsesystem (IDRS) for detecting ong>andong> responding to malicious eventsat runtime. Our results indicate that adjusting detection ong>andong>response strength in response to attacker strength ong>andong> behaviordetected can significantly improve the reliability ong>ofong> the CPS. Wereport numerical data for a CPS subject to persistent, rong>andong>omong>andong> insidious attacks with physical interpretations given.keywords: Reliability, intrusion detection, intrusionresponse, cyber physical systems, performance analysis.I. INTRODUCTIONA cyber physical system (CPS) typically comprises sensors,actuators, control units, ong>andong> physical objects for controllingong>andong> protecting a physical infrastructure. Because ong>ofong> the direconsequence ong>ofong> a CPS failure, protecting a CPS from maliciousattacks is ong>ofong> paramount importance. In this paper we addressthe reliability issue ong>ofong> a CPS designed to sustain maliciousattacks over a prolonged mission period without energyreplenishment. A CPS ong>ofong>ten operates in a rough environmentwherein energy replenishment is not possible ong>andong> nodes maybe compromised (or captured) at times. Thus, an intrusiondetection ong>andong> response system (IDRS) must detect maliciousnodes without unnecessarily wasting energy to prolong thesystem lifetime.ong>Intrusionong> detection system (IDS) design for CPSs hasattracted considerable attention [1], [6]. ong>Detectionong> techniquesin general can be classified into three types: signature based,anomaly based, ong>andong> specification based techniques. In the areaong>ofong> signature based IDS techniques, Oman ong>andong> Phillips [15]study an IDS for CPSs that tests an automated XML prong>ofong>ileto Snort signature transform in an electricity distributionlaboratory. Verba ong>andong> Milvich [19] study an IDS for CPSsthat takes a multitrust hybrid approach using signature baseddetection ong>andong> traffic analysis. Our work is different from thesestudies in that we use specification based detection ratherthan signature based detection to deal with unknown attackerpatterns.In the area ong>ofong> anomaly based IDS techniques, Barbosaong>andong> Pras [2] study an IDS for CPSs that tests state machineong>andong> Markov chain approaches to traffic analysis on a waterdistribution system based on a comprehensive vulnerabilityassessment. Linda, et al. [12] study an IDS for CPSs that useserror-back propagation ong>andong> Levenberg-Marquardt approacheswith window based feature extraction. Gao, et al. [10] study anIDS for CPSs that uses a three stage back propagation artificialneural network (ANN) based on Modbus features. Bellettiniong>andong> Rrushi [3] study an IDS for CPSs that seeds the runtimestack with NULL calls, applies shuffle operations ong>andong> performsdetection using product machines. Yang, et al. [20] study anIDS for CPSs that uses SNMP to drive prediction, residualcalculation ong>andong> detection modules for an experimental testbed.Bigham, et al. [4] study an IDS for CPSs that demonstratespromising control ong>ofong> detection ong>andong> false negative rates. Tsangong>andong> Kwong [18] study a rich multitrust IDS for CPSs thatuses a novel machine learning approach. Xie, et al. [13]surveys IDS for CPSs that advocates an anomaly based layeredapproach. Our work is different from these studies in that weuse specification based rather than anomaly based techniquesto avoid using resource-constrained sensors or actuators in aCPS for prong>ofong>iling anomaly patterns (e.g., through learning)ong>andong> to avoid high false positives (treating good nodes as badnodes).In the area ong>ofong> specification-based IDS techniques, Cheung,et al. [7] study a specification based IDS that uses PVSto transform protocol, communication pattern ong>andong> serviceavailability specifications into a format compatible withEMERALD. Carcano, et al. [5] propose a specification basedIDS that extends [9] ong>andong> distinguishes faults ong>andong> attacks,describes a language to express a CPS specification, ong>andong>establishes a critical state distance metric. Zimmer, et al.[21] study a specification based IDS that instruments atarget application ong>andong> uses a scheduler to confirm timinganalysis results. Our work is also specification based. However,our work is different from these prior studies in thatwe automatically map a specification into a state machineconsisting ong>ofong> good ong>andong> bad states ong>andong> simply measure a node’sdeviation from good states at runtime for intrusion detection.Moreover we apply specification-based techniques to hostlevelintrusion detection only. To cope with incomplete ong>andong>uncertain information available to nodes in the CPS ong>andong> to


mitigate the effect ong>ofong> node collusion, we devise system-levelintrusion detection based on multitrust to yield a low falsealarm probability.While the literature is abundant in the collection ong>andong>analysis aspects ong>ofong> intrusion detection, the response aspectis little treated. In particular, there is a gap with respect tointrusion detection ong>andong> response. Our IDRS design addressesboth intrusion detection ong>andong> response issues, with the goal tomaximize the CPS lifetime.Our methodology for CPS reliability assessment is modelbasedanalysis. Specifically, we develop a probability modelto assess the reliability property ong>ofong> a CPS equipped with anintrusion detection ong>andong> response system (IDRS) for detectingong>andong> responding to malicious events detected. Untreated in theliterature, we consider a variety ong>ofong> attacker behaviors includingpersistent, rong>andong>om ong>andong> insidious attacker models ong>andong> identifythe best design settings ong>ofong> the detection strength ong>andong> responsestrength to best balance energy conservation vs. intrusiontolerance for achieving high reliability, when given a set ong>ofong>parameter values characterizing the operational environmentong>andong> network conditions.The rest ong>ofong> the paper is organized as follows: Section IIgives the system model. Section III develops a mathematicalmodel based on Stochastic Petri Nets [16] for theoreticalanalysis. Section IV discusses the parameterization processfor the reference CPS. Section V presents numerical data withphysical interpretations given. Finally, Section VI summarizesthe paper ong>andong> outlines some future research areas.II. SYSTEM MODEL/REFERENCE CONFIGURATIONA. Reference CPSOur reference CPS model is based on the CPS infrastructuredescribed in [14] comprising at the sensor layer 128 sensorcarriedmobile nodes. Each node ranges its neighborsperiodically. Each node uses its sensor to measure anydetectable phenomena nearby. Each node transmits a CDMAwaveform. Neighbors receiving that waveform transform thetiming ong>ofong> the PN code (1023 symbols) ong>andong> RF carrier(915 MHz) into distance. Essentially each node performssensing ong>andong> reporting functions to provide information toupper layer control devices to control ong>andong> protect the CPSinfrastructure, ong>andong> in addition utilizes its ranging function fornode localization ong>andong> intrusion detection.The reference model is a special case ong>ofong> a single-enclavesystem with homogeneous nodes. The IDS functionality isdistributed to all nodes in the system for intrusion/faulttolerance. On top ong>ofong> the sensor-carried mobile nodes sits anenclave control node responsible for setting system parametersin response to dynamically changing conditions such aschanges ong>ofong> attacker strength. The control module is assumedfault/intrusion free through security/hardware protectionmechanisms against capture attacks/hardware failure.B. Security FailureWhile our approach is general enough to take any securityfailure definition, we consider two security failure conditions.The first condition is based on the Byzantine fault model [11].That is, if one-third or more ong>ofong> the nodes are compromised,then the system fails. The reason is that once the systemcontains 1/3 or more compromised nodes, it is impossibleto reach a consensus, hence inducing a security failure.The second condition is impairment failure. This is, heavyimpairment due to attacks will result in a security failurewhen the system is not able to respond to attacks in a timelymanner. This is modeled by defining an impairment-failureperiod beyond which the system cannot sustain the damage.C. Attack ModelThe first step in investigating network security is to definethe attack model. We consider capture attacks which turn agood node into a bad insider node. At the sensor/actuatorlayer ong>ofong> the CPS architecture, a bad node can performdata spoong>ofong>ing attacks (reporting spoong>ofong> sensor data) ong>andong> badcommong>andong> execution attacks. At the networking layer, a badnode can perform various communication attacks includingselective forwarding, packet dropping, packet spoong>ofong>ing, packetreplaying, packet flooding ong>andong> even Sybil attacks to disruptthe system’s packet routing functionality. At the controllayer, a bad node can perform control-level attacks includingaggregated data spoong>ofong>ing attacks, ong>andong> commong>andong> spoong>ofong>ingattacks. Nodes at the control layer, however, are lesssusceptible to capture attacks because they are normallydeployed in a physical confine which protects them fromtampering. For this reason in this paper our primary interest ison capture attacks ong>ofong> sensor/actuator nodes performing basicsensing, actuating ong>andong> networking functions.We consider three attacker models: persistent, rong>andong>om,ong>andong> insidious. A persistent attacker performs attacks withprobability one (i.e., whenever it has a chance). The primaryobjective is to cause impairment failure. A rong>andong>om attackerperforms attacks rong>andong>omly with probability p rong>andong>om . Theprimary objective is to evade detection. It may take a longertime for a rong>andong>om attacker to cause impairment failure sincethe attack is rong>andong>om. However rong>andong>om attackers are hidden soit may increase the probability ong>ofong> Byzantine security failureonce the number ong>ofong> bad nodes equals or exceeds 1/3 ong>ofong> thenode population. An insidious attacker is hidden all the time toevade detection until a critical mass ong>ofong> compromised nodes isreached to perform “all in” attacks. The primary objective is tomaximize the failure probability caused by either impairmentor Byzantine security failure.D. Host ong>Intrusionong> ong>Detectionong>Our host intrusion detection protocol design is based ontwo core techniques: behavior rule specification ong>andong> vectorsimilarity specification. The basic idea ong>ofong> behavior rulespecification is to specify the behavior ong>ofong> an entity (a sensoror an actuator) by a set ong>ofong> rules from which a state machineis automatically derived. Then, node misbehavior can beassessed by observing the behaviors ong>ofong> the node againstthe state machine (or behavior rules). The basic idea ong>ofong>vector similarity specification is to compare similarity ong>ofong>


a sequence ong>ofong> sensor readings, commong>andong>s or votes amongentities performing the same set ong>ofong> functions. A state machineis also automatically derived from which a similarity testis performed to detect outliers. More specifically, the statesderived in the state machine would be labeled as securevs. insecure. A monitoring node then applies snooping ong>andong>overhearing techniques observing the percentage ong>ofong> time aneighbor node is in secure states over a detection interval,say, T IDS . A longer sojourn time in secure states indicatesgreater specification compliance while a shorter sojourn timeindicates less specification compliance. If the compliancedegree ong>ofong> node i denoted by X i falls below a minimumcompliance threshold denoted by C T , node i is consideredcompromised. We apply these two host IDS techniques to thereference CPS as follows: (a) a monitoring node periodicallydetermines a sequence ong>ofong> locations ong>ofong> a sensor-carried mobilenode within radio range through ranging ong>andong> detects ifthe location sequence (corresponding to the state sequence)deviates from the expected location sequence; (b) a monitoringnode periodically collects votes from neighbor nodes whohave participated in system intrusion detection (describedbelow) ong>andong> detects dissimilarity ong>ofong> vote sequences among theseneighbors for outlier detection.The measurement ong>ofong> compliance degree ong>ofong> a node frequentlyis not perfect ong>andong> can be affected by noise ong>andong> unreliablewireless communication in the CPS. We model the compliancedegree by a rong>andong>om variable X with G(·) = Beta(α, β)distribution [17], with the value 0 indicating that the output istotally unacceptable (zero compliance) ong>andong> 1 indicating theoutput is totally acceptable (perfect compliance), such thatG(a), 0 ≤ a ≤ 1, is given by∫ aΓ(α + β)G(a) =0 Γ(α)Γ(β) xα−1 (1 − x) β−1 dx (1)ong>andong> the expected value ong>ofong> X is given by∫ 1Γ(α + β)E B [X] = x0 Γ(α)Γ(β) xα−1 (1 − x) β−1 dx =αα + β(2)The α ong>andong> β parameters are to be estimated based on themethod ong>ofong> maximum likelihood by using the compliancedegree history collected during the system’s testing phase inwhich the system is tested with its anticipated attacker eventprong>ofong>ile ong>andong> where the compliance degree is assessed using thespecification-based host IDS technique described earlier. Thecompliance degree history collected this way is the realizationong>ofong> a sequence ong>ofong> rong>andong>om variables (c 1 ,c 2 , ..., c n ) where c i isthe i th compliance degree output observed during the testingphase, ong>andong> n is the total number ong>ofong> compliance degree outputsobserved. The maximum likelihood estimates ong>ofong> α ong>andong> β areobtained by numerically solving the following equations:∂Γ(ˆα+ ˆβ) ∂Γ(ˆα)n∂ ˆαnn∑∂ ˆα− + log c i =0Γ(ˆα + ˆβ) Γ(ˆα)∂Γ(ˆα+ ˆβ)n∂ ˆβΓ(ˆα + ˆβ) − n∂Γ( ˆβ)∂ ˆβΓ(ˆα)+i=1n∑log(1 − c i )=0 (3)i=1where∂Γ(ˆα + ˆβ)∂ ˆα=∫ ∞0(log x)xˆα+ ˆβ−1 e −x dx.A less general, though simpler model, is to consider a singleparameter Beta(β) distribution with α equal to 1. In this case,the density is β(1 − x) β−1 for 0 ≤ x ≤ 1 ong>andong> 0 otherwise.The maximum likelihood estimate ong>ofong> β isˆβ =nn∑ 1log( )1 − c ii=1Host intrusion detection is characterized by per-node falsenegative ong>andong> false positive probabilities, denoted by p fn ong>andong>p fp , respectively. While many detection criteria are possible,we consider a threshold criterion in this paper. That is, if abad node’s compliance degree denoted by X b is higher thana system minimum compliance threshold C T then there is afalse negative. Suppose that the compliance degree X b ong>ofong> abad node is modeled by a G(·) =Beta(α, β) distribution asdescribed above. Then the host IDS false negative probabilityp fn is given by:(4)p fn = Pr{X b >C T } =1− G(C T ). (5)On the other hong>andong>, if a good node’s compliance degree denotedby X g is less than C T then there is a false positive. Againsuppose that the compliance degree X g ong>ofong> a good node ismodeled by a G(·) =Beta(α, β) distribution. Then the hostfalse positive probability p fp is given by:p fp = Pr{X g ≤ C T } = G(C T ). (6)Here we observe that these two probabilities are largelyaffected by the setting ong>ofong> the minimum compliance thresholdC T . A large C T induces a small false negative probability atthe expense ong>ofong> a large false positive probability. Conversely,a small C T induces a small false positive probability at theexpense ong>ofong> a large false negative probability. A proper settingong>ofong> C T in response to attacker strength detected at runtimehelps maximize the system lifetime.E. System ong>Intrusionong> ong>Detectionong>Our system IDS technique is based on majority votingong>ofong> host IDS results to cope with incomplete ong>andong> uncertaininformation available to nodes in the CPS. Our systemlevelIDS technique involves the selection ong>ofong> m detectorsas well as the invocation interval T IDS to best balanceenergy conservation vs. intrusion tolerance for achievinghigh reliability. Each node periodically exchanges its routinginformation, location, ong>andong> identifier with its neighbor nodes.A coordinator is selected rong>andong>omly among neighbors sothat the adversaries will not have specific targets. We addrong>andong>omness to the coordinator selection process by introducinga hashing function that takes in the identifier ong>ofong> a nodeconcatenated with the current location ong>ofong> the node as the hashkey. The node with the smallest returned hash value wouldthen become the coordinator. Because cong>andong>idate nodes know


each other’s identifier ong>andong> location, they can independentlyexecute the hash function to determine which node wouldbe the coordinator. The coordinator then selects m detectorsrong>andong>omly (including itself), ong>andong> lets all detectors know eachothers’ identities so that each voter can send its yes/no voteto other detectors. Vote authenticity is achieved via preloadedpublic keys. At the end ong>ofong> the voting process, all detectors willknow the same result, that is, the node is diagnosed as good,or as bad based on the majority vote.The system IDS is characterized by system false negativeong>andong> false positive probabilities, denoted by P fn ong>andong> P fp ,respectively. These two false alarm probabilities are notconstant but vary dynamically, depending on the percentage ong>ofong>bad nodes in the system when majority voting is performed.We will derive these two probabilities in the paper.F. ong>Intrusionong> ong>Responseong>Our IDRS reacts to malicious events detected at runtime byadjusting the minimum compliance threshold C T . For examplewhen it senses an increasing attacker strength, it can increasethe compliance threshold C T with the objective to preventimpairment security failure. This results in a smaller falsenegative probability, which has a positive effect ong>ofong> reducingthe number ong>ofong> bad nodes in the system ong>andong> decreasing theprobability ong>ofong> impairment security failure. However it alsoresults in a larger false positive probability, which has anegative effect ong>ofong> reducing the number ong>ofong> good nodes inthe system ong>andong> consequently increasing the probability ong>ofong>Byzantine security failure. To compensate for the negativeeffect, the IDRS increases the audit rate (by decreasingthe intrusion detection interval) or increases the number ong>ofong>detectors to reduce the false positive probability at the expenseong>ofong> more energy consumption. The relationship between theminimum compliance threshold C T set vs. p fn ong>andong> p fpmust be determined at static time so the system can adjustC T dynamically in response to malicious events detected atruntime.III. MODEL AND ANALYSISTable I lists the set ong>ofong> parameters used in our modelbasedanalysis ong>ofong> intrusion detection ong>andong> response designs.The parameter N defines the starting network size (i.e., thenumber ong>ofong> nodes). The hostility ong>ofong> the network is characterizedby a per-node capture rate λ c ; p fp ong>andong> p fn are host IDSfalse positive ong>andong> false negative probabilities, respectively,while P fp ong>andong> P fn are system-level IDS false positive ong>andong>false negative probabilities, respectively; T IDS is the intrusiondetection interval; m is the number ong>ofong> detectors used in thesystem IDS.Our theoretical model utilizes Stochastic Petri Nets (SPN)techniques [8]. Figure 1 shows the SPN model describing theecosystem ong>ofong> a CPS with intrusion detection ong>andong> responseunder capture, impairment ong>andong> Byzantine security attacks. Theunderlying model ong>ofong> the SPN model is a continuous-timesemi-Markov process with a state representation (N g , N b , N e ,impaired, energy) where N g is the number ong>ofong> good nodes,TABLE IPARAMETERS USED FOR ANALYSIS OF INTRUSION DETECTION ANDRESPONSE DESIGNParameter Meaning TypeN number ong>ofong> nodes in a CPS inputλ c per node compromise rate (Hz) inputp fp probability ong>ofong> per-host IDS false positive inputp fn probability ong>ofong> per-host IDS false negative inputT IDS intrusion detection interval (s) inputm number ong>ofong> detectors in the system IDS inputP fp probability ong>ofong> system IDS false positive derivedP fn probability ong>ofong> system IDS false negative derivedp rong>andong>om rong>andong>om attack probability by a rong>andong>om attacker inputp a attack probability by an insidious attacker derivedN IDS maximum IDS cycles before energy exhaustion derivedMTTF system lifetime outputTABLE IITRANSITION RATES OF THE SPN MODEL.Transition Name Rate1TENERGYN IDS ×T IDSTCPN g × λ cNTFPg×P fpTIDSTIFT IDSN b ×(1−P fn )T IDSp a × N b × λ ifN b is the number ong>ofong> bad nodes, N e is the number ong>ofong> nodesevicted (as they are considered as bad nodes by intrusiondetection), impaired is a binary variable with 1 indicatingimpairment security failure, ong>andong> energy is a binary variablewith 1 indicating energy availability ong>andong> 0 indicating energyexhaustion.Fig. 1.SPN Model for ong>Intrusionong> ong>Detectionong> ong>andong> ong>Responseong>.Table II annotates transitions ong>andong> gives transition rates usedin the SPN model. The SPN model shown in Figure 1 isconstructed as follows:• We use places to hold tokens each representing a node.Initially, all N nodes are good nodes (e.g., 128 in ourreference CPS) ong>andong> put in place N g as tokens.• We use transitions to model events. Specifically, TCPmodels good nodes being compromised; TFP models agood node being falsely identified as compromised; TIDSmodels a bad node being detected correctly.• Good nodes may become compromised because ong>ofong>capture attacks with per-node compromising rate λ c .


This is modeled by associating transition TCP with anaggregate rate λ c ×N g . Firing TCP will move tokens oneat a time (if it exists) from place N g to place N b . Tokensin place N b represent bad nodes performing impairmentattacks with probability p a .• When a bad node is detected by the system IDS ascompromised, the number ong>ofong> compromised nodes evictedwill be incremented by 1, so place N e will hold onemore token. On the other hong>andong>, the number ong>ofong> undetectedcompromised nodes will be decremented by 1, i.e., placeN b will hold one less token. These detection events aremodeled by associating transition TIDS with a rate ong>ofong>N b ×(1−P fn )T IDSwith 1−P fn accounting for the system IDStrue negative probability.• The system-level IDS can incorrectly identify a goodnode as compromised. This is modeled by moving a goodnode in place N g to place N e from firing transition TFPwith a rate ong>ofong> Ng×P fpT IDSwith P fp accounting for the systemIDS false positive probability.• The system energy is exhausted after time N IDS ×T IDS where N IDS is the maximum number ong>ofong> intrusiondetection intervals the CPS can possibly perform beforeit exhausts its energy due to performing ranging, sensing,ong>andong> intrusion detection functions. It can be estimatedby considering the amount ong>ofong> energy consumed ineach T IDS interval. This energy exhaustion event ismodeled by placing a token in place energy initially1ong>andong> firing transition TENERGY with rateN IDS×T IDS.When the energy exhaustion event occurs, the token inplace energy will be vanished ong>andong> the system entersan absorbing state meaning the lifetime is over. This ismodeled by disabling all transitions in the SPN model.• When the number ong>ofong> bad nodes (i.e., tokens in place N b )is at least 1/3 ong>ofong> the total number ong>ofong> nodes (tokens inplace N g ong>andong> N b ), the system fails because ong>ofong> a Byzantinefailure. The system lifetime is over ong>andong> is modeled againby disabling all transitions in the SPN model.• Bad nodes in place N b perform attacks with probabilityp a ong>andong> cause impairment to the system. Afteran impairment-failure time period is elapsed, heavyimpairment due to attacks will result in a security failure.We model this by firing transition TIF with a rate ong>ofong>p a × N b × λ if indicating the amount ong>ofong> time needed byp a N b bad nodes to reach this level ong>ofong> impairment beyondwhich the system cannot sustain the damage. The valueong>ofong> λ if is system specific ong>andong> is determined by domainexperts predicting the amount ong>ofong> time needed for a badnode to cause severe functional impairment. A token isflown into place impaired when such a security failureoccurs. Once a token is in place impaired, the systementers an absorbing state meaning the lifetime is over.Again it is modeled by disabling all transitions in theSPN model.Here we note that the last two bullet points cover the twoconditions that would cause a security failure.We utilize the SPN model to analyze two design tradeong>ofong>fs:• ong>Detectionong> strength vs. energy consumption: As weincrease the detection frequency (a smaller T IDS ) or thenumber ong>ofong> detectors (a larger m), the detection strengthincreases, thus preventing the system from running into asecurity failure. However, this increases the rate at whichenergy is consumed, thus resulting in a shorter lifetime.Consequently, there is an optimal setting ong>ofong> T IDS ong>andong> munder which the system MTTF is maximized, given thenode capture rate ong>andong> attack model.• ong>Detectionong> response vs. attacker strength: As the rong>andong>omattack probability p a decreases, the attacker strengthdecreases, thus lowering the probability ong>ofong> securityfailure due to impairment attacks. However, compromisednodes become more hidden ong>andong> difficult to detectbecause they leave less evidence traceable, resultingin a higher per-host false negative probability p fnong>andong> consequently a higher system-level false negativeprobability P fn . This increases the probability ong>ofong> securityfailure due to Byzantine attacks. The system can respondto instantaneous attacker strength detected ong>andong> adjust C Tto trade a high per-host false positive probability p fp ong>ofong>ffor a low per-host false negative probability p fn , or viceversa, so as to minimize the probability ong>ofong> security failure.Hence, there exists an optimal setting ong>ofong> C T as a functionong>ofong> attacker strength detected at time t under which thesystem security failure probability is minimized.Let L be a binary rong>andong>om variable denoting the lifetime ong>ofong>the system such that it takes on the value ong>ofong> 1 if the system isalive at time t ong>andong> 0 otherwise. Then, the expected value ong>ofong> Lis the reliability ong>ofong> the system R(t) at time t. Consequently,the integration ong>ofong> R(t) from t =0to ∞ gives the mean timeto failure (MTTF) or the average lifetime ong>ofong> the system weaim to maximize. The binary value assignment to L can bedone by means ong>ofong> a reward function assigning a reward r i ong>ofong>0 or 1 to state i at time t as follows:{1 if system is alive in state ir i =0 if system fails due to security or energy failureA state is represented by the distribution ong>ofong> tokens to places inthe SPN model. For example, with the SPN model defined inFigure 1, the underlying state is represented by (N g , N b , N e ,impaired, energy). When place energy contains zero token,it indicates energy exhaustion. When N g is less than or equalto twice ong>ofong> N b , it indicates a Byzantine failure. When placeimpaired contains a token, it indicates a security failure dueto significant functional impairment. Once the binary value ong>ofong>0 or 1 is assigned to all states ong>ofong> the system as described above,the reliability ong>ofong> the system R(t) is the expected value ong>ofong> Lweighted on the probability ong>ofong> the system stays at a particularstate at time t, which we can obtain easily from solving theSPN model using SPNP [8]. The MTTF ong>ofong> the system is equalto the cumulative reward to absorption, i.e.,MTTF =∫ ∞0R(t)dt (7)which we can again compute easily using SPNP.


IV. PARAMETERIZATIONTABLE IIIPARAMETERS AND THEIR VALUES FOR THE REFERENCE CPS.Parameter Meaning Default valueN number ong>ofong> nodes or network size 128¯n number ong>ofong> neighbors within radio range 32p fn per-host false negative probability [1-20%]p fp per-host false positive probability [1-20%]λ c per-node capture rate 1/[1-24hr]T IDS intrusion detection interval [1-60min]m number ong>ofong> intrusion detectors per node [3,11]α number ong>ofong> ranging operations 5E t energy for transmission per node 0.000125 JE r energy for reception per node 0.00005 JE a energy for analyzing data per node 0.00174 JE s energy for sensing per node 0.0005 JE o initial system energy 16128 kJWe consider the reference CPS model introduced in SectionII operating in a 2 × 2 area with a network size (N) ong>ofong> 128nodes. Hence, the number ong>ofong> neighbors within radio range,denoted by ¯n, initially is about 128/4=32 nodes. A node in ourreference CPS uses a 35 Wh battery, so its energy is 126000J. The system energy initially, denoted by E o , is therefore126000 J × 128 = 16128000 J. Table III lists the set ong>ofong>parameters ong>andong> their values for the reference CPS.A. System-Level IDS P fn ong>andong> P fpWe first parameterize the system IDS P fn ong>andong> P fp givenper-host IDS false positive probability p fp ong>andong> per-host IDSfalse negative probability as input. We first note that P fnong>andong> P fp highly depend on the attacker behavior. A persistentattacker constantly performs slong>andong>ering attacks such that it willvote a bad node as a good node, ong>andong> conversely a good nodeas a bad node, to eventually cause a security failure. However,a rong>andong>om or an insidious attacker will only perform slong>andong>eringattacks rong>andong>omly with probability p a to avoid detection.We first differentiate the number ong>ofong> active bad nodes, N a b ,from the number ong>ofong> inactive bad nodes, N i b , with N a b + N i b =N b , such that at any time:N a b = p a × N b (8)N i b =(1− p a ) × N b (9)The difference between an active bad node ong>andong> an inactivebad node is that an inactive bad node behaves as if it were agood node in order to evade detection, including casting votesthe same way as a good node would, when it participates inthe system-level IDS voting process.For a persistent attacker, p a =1. For a rong>andong>om attacker,p a = p rong>andong>om . For an insidious attacker, to maximizethe benefit ong>ofong> colluding attacks, a compromised node staysdormant until a critical mass ong>ofong> compromised nodes is gatheredso that p a =1when N b ≥ Nb T , ong>andong> p a =0otherwise, whereNbT is a parameter reflecting the insidiousness degree. In otherwords, all bad nodes engage in active attacks when there is acritical mass ong>ofong> compromised nodes in the system.We calculate P fn by Equation 10. The equation for P p fp isthe same except replacing p fn by p fp in the right hong>andong> sideexpression.We explain Equation 10 for obtaining P fn in detail below.The explanation for P fp follows the same logic. In Equation10, m this is the number ong>ofong> detectors ong>andong> m a is the majority ong>ofong>m. The first summation aggregates the probability ong>ofong> a falsenegative stemming from selecting a majority ong>ofong> active badnodes. That is, it is equal to the number ong>ofong> ways to choose amajority ong>ofong> m nodes from the set ong>ofong> active bad nodes times thenumber ong>ofong> ways to choose a minority ong>ofong> m nodes from the setong>ofong> good nodes ong>andong> inactive bad nodes divided by the numberong>ofong> ways to choose m nodes from the set ong>ofong> all good ong>andong> badnodes. The second summation aggregates the probability ong>ofong> afalse negative stemming from selecting a minority ong>ofong> m nodesfrom the set ong>ofong> active bad nodes which always cast incorrectvotes, coupled with selecting a sufficient number ong>ofong> nodesfrom the set ong>ofong> good nodes ong>andong> inactive bad nodes which makeincorrect votes with probability p fn , resulting in a majority ong>ofong>incorrect votes being cast.B. Host IDS p fn ong>andong> p fpNext we parameterize the host IDS false negative probabilityp fn ong>andong> false positive probability p fp for persistent, rong>andong>omong>andong> insidious attacks. The system, after a thorough testing ong>andong>debugging phase, determines a minimum threshold C T suchthat p fn ong>andong> p fp measured based on Equations 5 ong>andong> 6 areacceptable to system design. Let p p fn ong>andong> pp fpbe the falsenegative probability ong>andong> the false positive probability ong>ofong> thehost IDS when p a =1(e.g., under persistent attacks). Let theminimum threshold C T value set for the persistent attack casebe denoted by C p T .Let p r fn ong>andong> pr fpbe the false negative probability ong>andong> thefalse positive probability ong>ofong> the host IDS when p a < 1(e.g., under rong>andong>om attacks). For the case ong>ofong> rong>andong>om attackswith probability p a < 1, conceivably the amount ong>ofong> evidenceobservable from a bad node would be diminished proportionalto p a . Consequently, with the same minimum threshold C p Tbeing used, the host false negative probability would increase.We again utilize Equations 5 ong>andong> 6 to obtain p r fn ong>andong> pr fp foreach given p a value during the testing ong>andong> debugging phase.Here we note that the host false positive probability wouldremain the same, i.e., p r fp = pp fpbecause the attacker behaviordoes not affect false positives, given the same minimumthreshold C p Tbeing used.Lastly let p i fn ong>andong> pi fpbe the false negative probabilityong>andong> the false positive probability ong>ofong> the host IDS underinsidious attacks. Obviously, the false positive probability isnot affected, so p i fp = pp fp. Since insidious nodes stay dormantuntil a critical mass is achieved to perform “all in” attacks, thefalse negative probability is one during the dormant periodong>andong> is equal to that under persistent attacks during the “all in”attack period. Specifically,p i fn ={ ppfnif N b ≥ NbT1 otherwise(11)


P fn =m−m∑ ai=0m−m∑ aj=0⎡( )()NabN g + N i ⎤bm⎢ a + i m − (m a + i)⎣(Ng + Nb a + N )bim⎡( )Na m−j ∑[( bNg + Nbi j kk=m a−j⎢⎣⎥⎦ + (10)) ((p fn ) k Ng + Nb i − km − j − k(Ng + Nb i + N )bam) ] ⎤(1 − p fn ) (m−j−k) ⎥⎦TABLE IVβ IN BETA(1,β) AND RESULTING p fn AND p fp VALUES UNDER VARIOUSATTACK MODELS.Attack Type β p fn p fpRong>andong>om with P a=1.000 (Persistent) 1.20 6.3% 7.3%Rong>andong>om with P a=0.800 1.00 10.0% 7.3%Rong>andong>om with P a=0.400 0.75 17.8% 7.3%Rong>andong>om with P a=0.200 0.50 31.6% 7.3%Rong>andong>om with P a=0.100 0.20 63.1% 7.3%Rong>andong>om with P a=0.050 0.13 74.1% 7.3%Rong>andong>om with P a=0.025 0.09 81.3% 7.3%Insidious 0; 1.20 100%; 6.3% 7.3%Here we note that p fn ong>andong> p fp obtained above for persistent,rong>andong>om, or insidious attacks would be a function ong>ofong> time asinput to Equation 10 for calculating system-level IDS P fn ong>andong>P fp dynamically.We apply the statistical analysis described by Equations 1-4to get the maximum likelihood estimates ong>ofong> β (with α set as 1)under each attacker behavior model ong>andong> then utilize Equations5 ong>andong> 6 to yield p fn ong>andong> p fp . The system minimum thresholdC T is set to C p T =0.9 to yield pp fn =6.3% ong>andong> pp fp=7.3%. TableIV summarizes β values ong>andong> the resulting p fn ong>andong> p fp valuesunder various attacker behavior models. The persistent attackmodel is a special case in which p a =1. The insidious attackmodel is another special case in which p a =1during the “allin” attack period ong>andong> p a =0during the dormant period.C. Parameterizing C T for Dynamic ong>Intrusionong> ong>Responseong>The parameterization ong>ofong> p fn ong>andong> p fp above is based ona constant C T being used (i.e., C p T=0.9). A dynamic IDSresponse design is to adjust C T in response to the attackerstrength detected with the goal to maximize the systemlifetime. The attacker strength ong>ofong> a node, say node i, may beestimated periodically by node i’s intrusion detectors. That is,the compliance degree value ong>ofong> node i, X i (t), as collected bym intrusion detectors based on observations collected during[t − T IDS ,t], is compared against the minimum thresholdC p T set for persistent attacks. If X i(t) < C p Tthen node iis considered a bad node performing active attacks at timet; otherwise it is a good node. This information is passedto the control module who subsequently estimates Nb a(t)representing the attacker strength at time t.In this paper we investigate a simple yet efficient IDSresponse design. The basic idea is to decrease the per-host falsenegative probability p fn when the attacker strength is high, sowe may quickly remove active attackers from the system toprevent impairment failure. This is achieved by increasing theC T value. Conversely, when there is little attacker evidencedetected, we lower C T so we may quickly decrease theprobability ong>ofong> a good node being misidentified as a bad node,i.e., lowering the per-host false positive probability, to preventByzantine failure.While there are many possible ways to dynamically controlC T , in this paper we consider a linear one-to-one mappingfunction as follows:C T (t) =C p T + δ C T× (N a b − 1) (12)Here C T (t) refers to the C T value set at time t as a response tothe attacker strength measured by Nb a(t) detected at time t; Cp Tis the minimum threshold set by the system for the persistentattack case; ong>andong> δ CT is the increment to C T per active badnode detected. Essentially we set C T to C p T when N b a(t)detected at time t is 1, ong>andong> linearly increase (or decrease)C T with increasing (or decreasing) attacker strength detected.With C p T =0.9 in our CPS reference system, we set δ C T=0.5ong>andong> parameterize C T (t) as follows:⎧⎪⎨C T (t) =⎪⎩0.85 if N a b =00.90 if N a b =10.95 if N a b =20.99 if N a b ≥ 3 (13)Note that when C T is closer to 1, a node will more likelybe considered as compromised even if it wong>andong>ers only for asmall amount ong>ofong> time in insecure states. A large C T inducesa small per-host false negative probability p fn at the expenseong>ofong> a large per-host false positive probability p fp .D. EnergyLastly, we parameterize N IDS , the maximum number ong>ofong>intrusion detection cycles the system can possibly performbefore energy exhaustion, as follows:N =E o(14)E TIDSwhere E o is the initial energy ong>ofong> the reference CPS ong>andong> E TIDSis the energy consumed per T IDS interval due to ranging,sensing, ong>andong> intrusion detection functions, calculated as:E TIDS = n × (E ranging + E sensing + E detection ) (15)


where E ranging , E sensing , E detection stong>andong> for energy spentfor ranging, sensing, ong>andong> intrusion detection in a T IDSinterval, respectively. Here the energy spent per node ismultiplied with the node population in the CPS to get thetotal energy spent by all nodes per cycle.In Equation 15, E ranging stong>andong>s for the energy spent forperiodic ranging. It is calculated as:E ranging = α × [E t +¯n × (E r + E a )] (16)Here a node spends E t energy to transmit a CDMA waveform.Its ¯n neighbors each spend E r energy to receive the waveformong>andong> each spend E a energy to transform it into distance. Thisoperation is repeated for α times for determining a sequenceong>ofong> locations. In Equation 15, E sensing stong>andong>s for the amountong>ofong> energy consumed due to periodic sensing. It is computedas:E sensing =¯n × (E s + E a ). (17)Here a node spends E s energy for sensing navigation ong>andong>multipath mitigation data ong>andong> E a energy for analyzing senseddata for each ong>ofong> its ¯n neighbors. Finally, E detection stong>andong>s forthe energy used for performing intrusion detection on a targetnode. It can be calculated by:E detection = m×(E t +¯n·E r )+m×(E t +(m−1)·(E r +E a )).(18)Here we consider the energy required to choose m intrusiondetectors to evaluate a target node (the first term) ong>andong> theenergy required for m intrusion detectors to vote (the secondterm). Specifically, the first term is the number ong>ofong> intrusiondetectors times the cost ong>ofong> transmitting plus the number ong>ofong>nodes in radio range times the cost ong>ofong> receiving. The secondterm is the number ong>ofong> intrusion detectors times the cost ong>ofong>transmitting plus the number ong>ofong> peer intrusion detectors timesthe cost ong>ofong> receiving plus the cost ong>ofong> analyzing the vote.V. NUMERICAL DATAIn this section we present numerical data for reliabilityassessment as a result ong>ofong> executing intrusion detection ong>andong>response in a CPS. Our objective is to identify optimal designsettings in terms ong>ofong> the optimal values ong>ofong> T IDS , m ong>andong>C T under which we can best trade ong>ofong>f energy consumptionvs. intrusion detection, as well as response effectiveness vs.impairment security failure to maximize the system MTTF,when given a set ong>ofong> parameter values characterizing theoperational ong>andong> networking conditions.A. ong>Effectong> ong>ofong> ong>Intrusionong> ong>Detectionong> StrengthWe first examine the effect ong>ofong> intrusion detection strengthmeasured by the intrusion interval, T IDS , ong>andong> the numberong>ofong> intrusion detectors, m. We only present results for thereference CPS under persistent attacks, as the results for othertypes ong>ofong> attacks show similar trends.Figure 2 shows MTTF vs. T IDS as the number ong>ofong> detectors(m) in the system-level IDS varies over the range ong>ofong> [3,11] inincrements ong>ofong> 2. We see that there exists an optimal T IDSvalue at which the system lifetime is maximized to besttradeong>ofong>f energy consumption vs. intrusion tolerance. Initiallywhen T IDS is too small, the system performs ranging, sensingong>andong> intrusion detection too frequently ong>andong> quickly exhaustsits energy, resulting in a small lifetime. As T IDS increases,the system saves more energy ong>andong> its lifetime increases.Finally when T IDS is too large, although the system can saveeven more energy, it fails to catch bad nodes ong>ofong>ten enough,resulting in the system having many bad nodes. Bad nodesthrough active attacks can cause impairment security failure.Furthermore, when the system has 1/3 or more bad nodes outong>ofong> the total population, a Byzantine failure ensues. We observerthat the optimal T IDS value at which the system MTTF ismaximized is sensitive to the m value. The general trend isthat as m increases, the optimal T IDS value decreases. Herewe observe that m = 7 is optimal to yield the maximum MTTFbecause too many intrusion detectors would induce energyexhaustion failure, while too few intrusion detectors wouldinduce security failure. Using m = 7 can best balance energyexhaustion failure vs. security failure for high reliability.MTTF (min)1601401201008060402020001500MTTF (min)10005000m = 3m = 5m = 7m = 9m = 11network size n = 128C T = 0.90adversary type = PERSISTENTcapture rate λ c = 1/hourimpairment rate λ if = 1/(48 hours)static C T0 5 10 15 20T IDS (min)Fig. 2. MTTF vs. T IDS ong>andong> m.λ c = 1/(4 hours)λ c = 1/(8 hours)λ c = 1/(16 hours)λ c = 1/(24 hours)network size n = 128C T = 0.90m = 5adversary type = PERSISTENTimpairment rate λ if = 1/(24 hours)static C T0 10 20 30 40 50 60 70 80T IDS (min)Fig. 3. MTTF vs. T IDS ong>andong> λ c.Figure 3 shows MTTF vs. T IDS as the compromising rate


λ c varies over the range ong>ofong> once per 4 hours to once per 24hours to test the sensitivity ong>ofong> MTTF with respect to λ c (withm fixed at five to isolate its effect). We first observe that asλ c increases, MTTF decreases because a higher λ c will causemore compromised nodes to be present in the system. Wealso observe that the optimal T IDS decreases as λ c increases.This is because when more compromised nodes exist, thesystem needs to execute intrusion detection more frequentlyto maximize MTTF. Figure 3 identifies the best T IDS to beused to maximize the lifetime ong>ofong> the reference CPS to balanceenergy exhaustion vs. security failure, when given C T ong>andong> λ ccharacterizing the operational ong>andong> networking conditions ong>ofong>the system.B. ong>Effectong> ong>ofong> Attacker BehaviorIn this section, we analyze the effect ong>ofong> various attackerbehavior models, including persistent (with p a =1, p i fn ong>andong>p i fp given as input), rong>andong>om (with p a = p rong>andong>om , p r fn , ong>andong>p r fp given as input), ong>andong> insidious attacks (with p a =1whenN b ≥ NbT =10ong>andong> p a =0otherwise, p i fp , ong>andong> pi fn definedby Equation 11 given as input). The analysis conducted hereis based on static C T . In the next section, we will analyzethe effect ong>ofong> dynamic C T as a response to attacker strengthdetected at runtime.Figure 4 shows MTTF vs. T IDS with varying p rong>andong>omvalues. We first observe that the system MTTF is low whenp rong>andong>om is small (e.g., p rong>andong>om = 0.025). This is becausewhen p rong>andong>om is small, most bad nodes are dormant ong>andong>remain in the system without being detected. Thus, the systemsuffers from Byzantine failure quickly, leading to a low MTTF.As p rong>andong>om increases from 0.025 to 0.2, the system MTTFincreases because ong>ofong> a higher chance ong>ofong> bad nodes beingdetected ong>andong> removed from the system, thus reducing theprobability ong>ofong> Byzantine security failure. As p rong>andong>om increasesfurther, however, the system MTTF decreases again because ong>ofong>a higher probability ong>ofong> impairment security failure as there willbe more bad nodes actively performing impairment attacks.In the extreme case ong>ofong> p rong>andong>om =1, all bad nodes performattacks ong>andong> the system failure is mainly caused by impairment.The maximum MTTF occurs when p rong>andong>om =0.2 at whichpoint the probability ong>ofong> security failure due to either typeong>ofong> security attacks is minimized. Here we should note thatthe result ong>ofong> p rong>andong>om =0.2 yielding the highest MTTF isa balance ong>ofong> impairment security failure rate vs. Byzantinefailure rate dictated by the parameter settings ong>ofong> the referenceCPS as given in Tables III ong>andong> IV.Figure 5 compares MTTF vs. T IDS ong>ofong> the referenceCPS under the three attacker types head-to-head. It showsthat the MTTF is the highest for the reference CPS underrong>andong>om attacks. The MTTF ong>ofong> the CPS under persistentattacks is the second highest. As expected, the reference CPSunder insidious attacks has the lowest MTTF. We attributethis to the fact that unlike persistent attacks which aim tocause impairment failure, insidious attacks while dormant cancause Byzantine failure ong>andong> while “all in” can also causeimpairment failure. The extent to which the system MTTF20001500MTTF (min)10005000p rong>andong>om = 0.025p rong>andong>om = 0.050p rong>andong>om = 0.100p rong>andong>om = 0.200p rong>andong>om = 0.400p rong>andong>om = 0.800p rong>andong>om = 1.000network size n = 128m = 5adversary type = RANDOMcapture rate λ c = 1/(16 hours)impairment rate λ if = 1/(24 hours)static C T0 10 20 30 40 50 60 70 80T IDS (min)Fig. 4. MTTF vs. T IDS ong>andong> p rong>andong>om .differs depends on the relative rate at which impairment failurevs. Byzantine failure occurs. The former is dictated by λ ifong>andong> the latter is dictated by how fast the Byzantine failurecondition is satisfied. The result that the MTTF differencebetween persistent attacks (the second curve) ong>andong> insidiousattacks (the last curve) is relatively significant is due to a largeByzantine failure rate compared with the impairment failurerate. On the other hong>andong>, the reference CPS under rong>andong>omattacks can more effectively prevent either Byzantine failureor impairment failure from occurring by removing bad nodesas soon as they perform attacks. The system MTTF differencebetween rong>andong>om vs. persistent attacks again depends on therelative rate at which impairment failure vs. Byzantine failureoccurs.20001800160014001200MTTF (min) 10008006004002000network size n = 128m = 5capture rate λ c = 1/(16 hours)impairment rate λ if = 1/(24 hours)static C Tadversary type = PERSISTENTadversary type = RANDOMadversary type = INSIDIOUS0 10 20 30 40 50 60 70 80T IDS (min)Fig. 5.MTTF vs. T IDS ong>andong> adversary type.C. ong>Effectong> ong>ofong> ong>Intrusionong> ong>Responseong>In this section, we analyze the effect ong>ofong> intrusion response,i.e., dynamic C T as a response to attacker strength detectedat runtime, on the system MTTF.Figure 6 shows MTTF vs. T IDS under the static C T designong>andong> the dynamic C T design for the persistent attack case. We


see there is a significant gain in MTTF under dynamic C Tover static C T . The reason is that with persistent attacks, allbad nodes are actively performing attacks, so the system isbetter ong>ofong>f by increasing C T to a high level to quickly removebad nodes to prevent impairment failure. We also observe thatin the case the optimal T IDS at which MTTF is maximizeddecreases compared with the static C T case so to as quicklyremove bad nodes from the system.1800160014001200static C Tdynamic C Twe observe the MTTF difference is relatively small comparedwith persistent or rong>andong>om attacks. The reason is that badnodes do not perform active attacks until a critical mass isreached, so dynamic C T would set a lower C T value during thedormant period while rapidly setting a higher C T value duringthe attack period. Since the attack period is relatively shortcompared with the dormant period, the gain in MTTF isn’tvery significant. Nevertheless, we observe even for insidiousattacks, dynamic C T still performs better than static C T .As our C T dynamic control function (Equation 12) adjustsC T solely based on the attacker strength detected regardless ong>ofong>the attacker type, we conclude that the dynamic C T design asa response to attacker strength detected at runtime can improveMTTF compared with the static C T design.1000MTTF (min)800600800700static C Tdynamic C T400200network size n = 128m = 5adversary type = PERSISTENTcapture rate λ c = 1/(16 hours)impairment rate λ if = 1/(24 hours)60050000 10 20 30 40 50 60 70 80T IDS (min)MTTF (min)400Fig. 6.Static vs. Dynamic ong>Responseong> under Persistent Attacks.300200network size n = 128m = 5adversary type = INSIDIOUScapture rate λ c = 1/(16 hours)impairment rate λ if = 1/(24 hours)2500static C Tdynamic C T1000 10 20 30 40 50 60 70 80T IDS (min)2000Fig. 8.Static vs. Dynamic ong>Responseong> under Insidious Attacks.MTTF (min)150010005000Fig. 7.network size n = 128m = 5adversary type = RANDOMcapture rate λ c = 1/(16 hours)impairment rate λ if = 1/(24 hours)p rong>andong>om = 0.200 10 20 30 40 50 60 70 80T IDS (min)Static vs. Dynamic ong>Responseong> under Rong>andong>om Attacks.Figure 7 shows MTTF vs. T IDS under the static C T designong>andong> the dynamic C T design for the rong>andong>om attack case withp rong>andong>om =0.2. We pick the case ong>ofong> p rong>andong>om =0.2 becauseit yields the highest MTTF among all rong>andong>om attack casesin the reference CPS system (see Figure 4). Here again weobserve that dynamic C T performs significantly better thanstatic C T , when operating at the identified optimal T IDS value.The optimal T IDS value under dynamic C T design again issmaller than that under static C T design to quickly removenodes that perform active attacks.Figure 8 shows MTTF vs. T IDS under the static C T designong>andong> the dynamic C T design for the insidious attack case. HereVI. CONCLUSIONSIn this paper, we developed a probability model to analyzereliability ong>ofong> a cyber physical system in the presence ong>ofong> bothmalicious nodes exhibiting a range ong>ofong> attacker behaviors, ong>andong>an intrusion detection ong>andong> response system for detecting ong>andong>responding to malicious events at runtime. For each attackerbehavior, we identified the best detection strength (in termsong>ofong> the detection interval ong>andong> the number ong>ofong> detectors), ong>andong>the best response strength (in terms ong>ofong> the per-host minimumcompliance threshold for setting the false positive ong>andong> negativeprobabilities) under which the reliability ong>ofong> the system may bemaximized.There are several future research directions, including (a)investigating other intrusion detection criteria (e.g., based onaccumulation ong>ofong> deviation from good states) other than thecurrent binary criterion used in the paper based on a minimumcompliance threshold to improve the false negative probabilitywithout compromising the false positive probability; (b)investigating other intrusion response criteria (e.g., exponentialincrease ong>ofong> the minimum compliance threshold) other than thelinear function used in the paper ong>andong> analyzing the effect onthe system lifetime; (c) exploring other attack behavior models(e.g., an oracle attacker that can adjust the attacker strength


depending on the detection strength to maximize securityfailure) ong>andong> investigating the best dynamic response design tocope with such attacks; (d) developing a more elaborate modelto describe the relationship between intrusion responses ong>andong>attacker behaviors ong>andong> justifying such a relationship modelby means ong>ofong> extensive empirical studies; ong>andong> (e) extendingthe analysis to hierarchically-structured intrusion detection ong>andong>response system design for a large CPS consisting ong>ofong> multipleenclaves each comprising heterogeneous entities subject todifferent operational ong>andong> environment conditions ong>andong> attackthreats.REFERENCES[18] C.-H. Tsang ong>andong> S. Kwong. Multi-agent intrusion detection systemin industrial network using ant colony clustering approach ong>andong>unsupervised feature extraction. In IEEE International Conference onIndustrial Technology, 2005., pages 51–56, December 2005.[19] J. Verba ong>andong> M. Milvich. Idaho national laboratory supervisory controlong>andong> data acquisition intrusion detection system (scada ids). In 2008IEEE Conference on Technologies for Homelong>andong> Security, pages 469–473, May 2008.[20] D. Yang, A. Usynin, ong>andong> J. Hines. Anomaly-based intrusion detectionfor scada systems. In 5th Intl. Topical Meeting on Nuclear PlantInstrumentation, Control ong>andong> Human Machine Interface Technologies(NPIC&HMIT 05), pages 12–16. Citeseer.[21] C. Zimmer, B. Bhat, F. Mueller, ong>andong> S. Mohan. Time-based intrusiondetection in cyber-physical systems. In Proceedings ong>ofong> the 1stACM/IEEE International Conference on Cyber-Physical Systems, pages109–118, New York, NY, USA, 2010. ACM.[1] M. Anong>andong>, E. Cronin, M. Sherr, M. Blaze, Z. Ives, ong>andong> I. Lee. Securitychallenges in next generation cyber physical systems. Beyond SCADA:Networked Embedded Control for Cyber Physical Systems, 2006.[2] R. Barbosa ong>andong> A. Pras. ong>Intrusionong> detection in scada networks.In B. Stiller ong>andong> F. De Turck, editors, Mechanisms for AutonomousManagement ong>ofong> Networks ong>andong> Services, volume 6155 ong>ofong> Lecture Notes inComputer Science, pages 163–166. Springer Berlin / Heidelberg, 2010.[3] C. Bellettini ong>andong> J. Rrushi. A product machine model for anomalydetection ong>ofong> interposition attacks on cyber-physical systems. volume278 ong>ofong> International Federation for Information Processing, pages 285–300. Springer Boston, 2008.[4] J. Bigham, D. Gamez, ong>andong> N. Lu. Safeguarding scada systems withanomaly detection. In V. Gorodetsky, L. Popyack, ong>andong> V. Skormin,editors, Computer Network Security, volume 2776 ong>ofong> Lecture Notes inComputer Science, pages 171–182. Springer Berlin / Heidelberg, 2003.[5] A. Carcano, A. Coletta, M. Guglielmi, M. Masera, I. Fovino, ong>andong>A. Trombetta. A multidimensional critical state analysis for detectingintrusions in scada systems. IEEE Transactions on IndustrialInformatics, 7(2):179 –186, May 2011.[6] A. Cárdenas, S. Amin, B. Sinopoli, A. Giani, A. Perrig, ong>andong> S. Sastry.Challenges for securing cyber physical systems. In First Workshop onCyber-physical Systems Security, page submitted. DHS, 2009.[7] S. Cheung, B. Dutertre, M. Fong, U. Lindqvist, K. Skinner, ong>andong>A. Valdes. Using model-based intrusion detection for scada networks. InProceedings ong>ofong> the SCADA Security Scientific Symposium, pages 127–134. Citeseer, 2007.[8] G. Ciardo, J. Muppala, ong>andong> K. Trivedi. Spnp: stochastic petri netpackage. In Proceedings ong>ofong> the Third International Workshop on PetriNets ong>andong> Performance Models, 1989., pages 142 –151, December 1989.[9] I. Fovino, A. Carcano, T. De Lacheze Murel, A. Trombetta, ong>andong>M. Masera. Modbus/dnp3 state-based intrusion detection system. In2010 24th IEEE International Conference on Advanced InformationNetworking ong>andong> Applications (AINA), pages 729 –736, April 2010.[10] W. Gao, T. Morris, B. Reaves, ong>andong> D. Richey. On scada control systemcommong>andong> ong>andong> response injection ong>andong> intrusion detection. In eCrimeResearchers Summit (eCrime), 2010, pages 1 –9, October 2010.[11] L. Lamport, R. Shostak, ong>andong> M. Pease. The byzantine generals problem.ACM Transactions on Programming Languages ong>andong> Systems, 4(3):382–401, 1982.[12] O. Linda, T. Vollmer, ong>andong> M. Manic. Neural network based intrusiondetection system for critical infrastructures. In International JointConference on Neural Networks, 2009., pages 1827 –1834, June 2009.[13] S. H. Miao Xie ong>andong> H.-H. Chen. ong>Intrusionong> detection in cyber-physicalsystems: Its techniques ong>andong> challenges. IEEE Systems Journal.[14] U. D. ong>ofong> Homelong>andong> Security. BAA-09-02 Geospatial LocationAccountability ong>andong> Navigation System for Emergency Responders(GLANSER), 2009.[15] P. Oman ong>andong> M. Phillips. ong>Intrusionong> detection ong>andong> event monitoringin scada networks. In E. Goetz ong>andong> S. Shenoi, editors, CriticalInfrastructure Protection, volume 253 ong>ofong> International Federation forInformation Processing, pages 161–173. Springer Boston, 2007.[16] A. P. Robin A. Sahner, Kishor S. Trivedi. Performance ong>andong> ReliabilityAnalysis ong>ofong> Computer Systems: An Example-Based Approach Using theSHARPE Song>ofong>tware Package. Kluwer Academic Publishers, 2002.[17] S. M. Ross. Introduction to Probability Models, 10th Edition. AcademicPress, 2009.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!