Error, Uncertainty, and Loss in Digital Evidence
Error, Uncertainty, and Loss in Digital Evidence
Error, Uncertainty, and Loss in Digital Evidence
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2they were computer-generated rather than the result of human entries. (State ofMissouri v. Dunn, 1999)Even if it is not possible to exactly quantify the uncerta<strong>in</strong>ty <strong>in</strong> a given piece of digitalevidence, courts should look to experts for a clear sense of how reliable the data are. Alldigital evidence has some degree of uncerta<strong>in</strong>ty <strong>and</strong> an expert should be capable ofdescrib<strong>in</strong>g <strong>and</strong> estimat<strong>in</strong>g the level of certa<strong>in</strong>ty that can be placed <strong>in</strong> a given piece ofevidence. If we do not make an effort to estimate uncerta<strong>in</strong>ty <strong>in</strong> digital evidence, it couldbe argued that there is no basis on which to assess the reliability or accuracy of theevidence. Additionally, forensic exam<strong>in</strong>ers who do not account for error, uncerta<strong>in</strong>ty, <strong>and</strong>loss dur<strong>in</strong>g their analysis may reach <strong>in</strong>correct conclusions <strong>in</strong> the <strong>in</strong>vestigative stage <strong>and</strong>may f<strong>in</strong>d it harder to justify their assertions when cross-exam<strong>in</strong>ed. Furthermore, unlesscourts require some quantification of the uncerta<strong>in</strong>ty associated with given digitalevidence, technical experts are less likely to provide it, <strong>and</strong> the result<strong>in</strong>g rul<strong>in</strong>gs will beweakened accord<strong>in</strong>gly.This paper explores the <strong>in</strong>herent uncerta<strong>in</strong>ty <strong>in</strong> evidence collected from computernetworks <strong>and</strong> describes some potential sources of error <strong>and</strong> loss. The aim of this work isto provide a basis to help forensic exam<strong>in</strong>ers implement the scientific method bydetect<strong>in</strong>g, quantify<strong>in</strong>g <strong>and</strong> compensat<strong>in</strong>g for errors <strong>and</strong> loss <strong>in</strong> evidence collected fromnetworked systems. A method for estimat<strong>in</strong>g <strong>and</strong> report<strong>in</strong>g the level of certa<strong>in</strong>ty of digitalevidence <strong>in</strong> a st<strong>and</strong>ard manner is provided <strong>in</strong> Table 1 <strong>in</strong> Subsection 2.3. This paperfocuses on logs relat<strong>in</strong>g to network activities <strong>in</strong>clud<strong>in</strong>g server logon records, router
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2NetFlow logs, <strong>and</strong> network capture logs. The general concepts <strong>and</strong> approaches toassess<strong>in</strong>g uncerta<strong>in</strong>ty can be generalized to other computer, network <strong>and</strong>telecommunication systems.2. Separation: Time, Space <strong>and</strong> Abstraction<strong>Digital</strong> data are separated by both time <strong>and</strong> distance from the events they represent. A logfile that records network activities is an historic record of events that happened at variousplaces <strong>in</strong> the world. Even when view<strong>in</strong>g network traffic us<strong>in</strong>g a sniffer, there is a delaybetween the activity that generated the traffic <strong>and</strong> the display of the data on the monitor.Additionally, networks are comprised of layers that perform different functions fromcarry<strong>in</strong>g electronic pulses over network cables to present<strong>in</strong>g data <strong>in</strong> a form that computerapplications can <strong>in</strong>terpret. For <strong>in</strong>stance, the Open System Interconnection (OSI) referencemodel divides networks <strong>in</strong>to seven layers: the application, presentation, session,transport, network, data l<strong>in</strong>k, <strong>and</strong> physical layers. 2 Each layer of abstraction hides thecomplexity of lower layer <strong>and</strong> provides a new opportunity for error <strong>and</strong> loss as isdiscussed throughout this paper.Each of these dimensions of separation prevents exam<strong>in</strong>ers from view<strong>in</strong>g the true data<strong>and</strong> can distort what the exam<strong>in</strong>er sees. Learn<strong>in</strong>g how to quantify <strong>and</strong> account for theresult<strong>in</strong>g uncerta<strong>in</strong>ty is a critical aspect of a forensic exam<strong>in</strong>er’s work.2 For more <strong>in</strong>formation about network layers <strong>and</strong> the digital evidence they conta<strong>in</strong>, see (Casey, 2000)
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 22.1 Temporal <strong>Uncerta<strong>in</strong>ty</strong>When <strong>in</strong>vestigat<strong>in</strong>g a crime, it is almost <strong>in</strong>variably desirable to know the time <strong>and</strong>sequence of events. Fortunately, <strong>in</strong> addition to stor<strong>in</strong>g, retriev<strong>in</strong>g, manipulat<strong>in</strong>g, <strong>and</strong>transmitt<strong>in</strong>g data, computers associate date/time stamps with almost every task theyperform. However, forensic exam<strong>in</strong>ers must keep <strong>in</strong> m<strong>in</strong>d that computer clocks are<strong>in</strong>herently limited, with resolutions <strong>in</strong> the millisecond or microsecond range <strong>and</strong> can driftover time (Stevens, 1994). Although these time periods may be negligible <strong>in</strong> some cases,there are situations where small discrepancies can be important. For <strong>in</strong>stance, whencreat<strong>in</strong>g a timel<strong>in</strong>e us<strong>in</strong>g sniffer logs from several systems that record hundreds of events<strong>in</strong> the same second, m<strong>in</strong>or clock discrepancies can make it difficult to create an accuratechronological sequence of actions. Also, small discrepancies can accumulate to createlarger errors such as clock drift result<strong>in</strong>g <strong>in</strong> a large offset. Until errors are actuallyquantified it cannot be stated with certa<strong>in</strong>ty that they are negligible.The most common source of temporal error is system clock offset. If a router’s clock isseveral hours fast, this discrepancy can make it difficult to correlate events with logsfrom other systems <strong>and</strong> can impair subsequent analysis. Also, errors <strong>in</strong> an evidencecollection system's clock can create discrepancies dur<strong>in</strong>g the analysis <strong>and</strong> report<strong>in</strong>gstages. For <strong>in</strong>stance, most syslog servers generate a date/time stamp for log entries thatthey receive from remote systems over the network. Thus, a clock offset on the server can<strong>in</strong>troduce error even if the clock of the computer that sent the syslog message is correct.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Determ<strong>in</strong><strong>in</strong>g whether a system clock was accurate when specific records were generatedcan be a challeng<strong>in</strong>g task <strong>in</strong> a networked environment. The system clock of <strong>in</strong>terest maybe elsewhere on the network <strong>and</strong> might be overlooked dur<strong>in</strong>g the <strong>in</strong>itial search. For<strong>in</strong>stance, <strong>in</strong> a 1995 homicide case the primary suspect claimed that he was at work at thetime of the murder <strong>and</strong> his alibi depended largely on the last configuration time of anetwork device – a Shiva Fastpath. The last configuration time of the device supportedthe suspect’s alibi but <strong>in</strong>vestigators believed that the evidence had been fabricated severaldays after the crime.The Fastpath device did not have an <strong>in</strong>ternal time-keep<strong>in</strong>g source <strong>and</strong> relied on anexternal time source – the system time of the remote management console on aMac<strong>in</strong>tosh computer. Furthermore, the log show<strong>in</strong>g the last configuration time of theFastpath was only recorded on the management console. Therefore, because the suspecthad full control over the management console, it was possible for him to reset the time onthe device from the management console to give the impression that the device was lastconfigured at a specific time. In this case, the management console had not been collectedas evidence dur<strong>in</strong>g the <strong>in</strong>itial search, mak<strong>in</strong>g it impossible to determ<strong>in</strong>e conclusively ifthe date/time stamps had been fabricated. Because the management console was notcollected dur<strong>in</strong>g the <strong>in</strong>itial search <strong>and</strong> seizure, there was a high degree of uncerta<strong>in</strong>ty <strong>in</strong>the date/time stamp, the suspect’s alibi could not be confirmed, <strong>and</strong> he was convicted ofthe murder.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Time zone differences are another common source of temporal error <strong>in</strong> network logs.Some Web servers (e.g., Microsoft IIS) create logs with date/time stamps <strong>in</strong> GMT <strong>and</strong>computer systems around the world often use local time <strong>in</strong> their logs. A failure to correctfor time zone offsets can result <strong>in</strong> a great deal of confusion. For <strong>in</strong>stance, whenrequest<strong>in</strong>g <strong>in</strong>formation from an Internet Service Provider, a time zone discrepancy cancause the ISP to provide the wrong subscriber <strong>in</strong>formation <strong>and</strong> thus possibly implicate an<strong>in</strong>nocent <strong>in</strong>dividual.Smaller temporal offsets can be <strong>in</strong>troduced by system process<strong>in</strong>g when there is a delaybetween an event <strong>and</strong> the associated record<strong>in</strong>g of the event. For example, before an entryis made <strong>in</strong> a sniffer log, the associated datagram must travel from the network cablethrough the network <strong>in</strong>terface card <strong>and</strong> sniffer application before it is stored <strong>in</strong> a file ondisk. More sophisticated network traffic monitor<strong>in</strong>g systems such as an IDS can<strong>in</strong>troduce further delays because the packet must be compared with a set of signaturesbefore a log entry is created (the example <strong>in</strong> Figure 1 shows a total delay of 17 seconds).
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2account orig<strong>in</strong>ation uncerta<strong>in</strong>ty, a warrant was served at a suspected hacker’s home <strong>in</strong>California. All that law enforcement officers found were a compromised home computerconnected to the Internet via DSL <strong>and</strong> an <strong>in</strong>nocent, frightened computer owner.2.3 Estimat<strong>in</strong>g <strong>Uncerta<strong>in</strong>ty</strong> <strong>in</strong> Network Related <strong>Evidence</strong>In some <strong>in</strong>stances, it may be relatively easy for a forensic exam<strong>in</strong>er to qualify assertionswith appropriate uncerta<strong>in</strong>ty. For <strong>in</strong>stance, when network traffic statistics are onlyavailable <strong>in</strong> aggregate form, the exam<strong>in</strong>er may be provided with summarized data aboutconnections to/from a suspect’s computer that are totaled every ten m<strong>in</strong>utes. With thisdata, it is not possible to dist<strong>in</strong>guish between a Web site that the suspect accessed for tenm<strong>in</strong>utes <strong>and</strong> a Web site that was only viewed for a few seconds. To ensure that theresult<strong>in</strong>g uncerta<strong>in</strong>ty is conveyed to others, the exam<strong>in</strong>er's report should <strong>in</strong>dicate that thedate/time stamps associated with the suspect's network activity are accurate to with<strong>in</strong> 10m<strong>in</strong>utes.Another approach to calculat<strong>in</strong>g uncerta<strong>in</strong>ty is us<strong>in</strong>g probability distribution functions.For example, clock offsets of 355 computers on a university network were calculated. 3Four of the mach<strong>in</strong>es’ clocks were more than six years slow <strong>and</strong> another mach<strong>in</strong>e’s clockwas over one year slow. Factor<strong>in</strong>g these systems <strong>in</strong>to the calculations places theuncerta<strong>in</strong>ty <strong>in</strong>to years as opposed to hours. Exclud<strong>in</strong>g these outliers <strong>and</strong> plott<strong>in</strong>g ahistogram of the rema<strong>in</strong><strong>in</strong>g 350 values shows a dom<strong>in</strong>ant central peak with gradualdecl<strong>in</strong>e on both sides (Figure 2) suggest<strong>in</strong>g that this data has a normal distribution3 These values are not <strong>in</strong>tended to be representative but simply to demonstrate a common approach tocalculat<strong>in</strong>g uncerta<strong>in</strong>ty.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2enabl<strong>in</strong>g us to calculate the st<strong>and</strong>ard deviation (σ = 6.7 hours). 4 Thus, there is a 95%probability that time measured by a system clock is with<strong>in</strong> +/- 13.1 hours (1.96σ) of theactual time.Histogram of Clock Offsets on 350 Unix computers(extreme values not shown)140120# Computers100806040200-60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60M<strong>in</strong>utesFigure 2: Histogram of clock offsets show<strong>in</strong>g a Gaussian distributionBased on these results, a forensic exam<strong>in</strong>er might qualify all assertions by stat<strong>in</strong>g thatdate/time stamps from any given system are accurate to +/- 13 hours (with 95%confidence). However, this uncerta<strong>in</strong>ty was calculated us<strong>in</strong>g data that may not berepresentative of computers <strong>in</strong>volved <strong>in</strong> crime <strong>and</strong> five <strong>in</strong>convenient values were4 Although exclud<strong>in</strong>g these anomalous values simplifies this demonstration, it also weakens the underly<strong>in</strong>gassumption that the temporal errors <strong>in</strong> this experiment have a Gaussian distribution <strong>and</strong> may <strong>in</strong>validate theassociated uncerta<strong>in</strong>ty computation.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2excluded for convenience, mak<strong>in</strong>g the results highly suspect. So, although it is relativelystraightforward to estimate uncerta<strong>in</strong>ty <strong>in</strong> this way, it is not necessarily applicable oraccurate.In most cases, quantify<strong>in</strong>g uncerta<strong>in</strong>ty <strong>in</strong> network related evidence is very difficult. Thestate of a network changes constantly, tamper<strong>in</strong>g <strong>and</strong> corruption are unpredictable, <strong>and</strong>quantify<strong>in</strong>g temporal or orig<strong>in</strong>ation uncerta<strong>in</strong>ty of a given log entry may not be possible.Consider the follow<strong>in</strong>g example from a Microsoft IIS Web server access log show<strong>in</strong>g an<strong>in</strong>trusion:2002-03-08 04:42:31 10.10.24.93 - 192.168.164.88 80 GET /scripts/..%5c../w<strong>in</strong>nt/system32/cmd.exe /c+tftp+i+172.16.19.42+get+su.exe+c:\<strong>in</strong>etpub\scripts\su.exe502The date/time stamps of this type of log file are usually <strong>in</strong> GMT but can be configured touse local time. Without additional <strong>in</strong>formation, we cannot determ<strong>in</strong>e if this eventoccurred on 2002-03-08 at 04:42 GMT or 2002-03-07 at 23:42 EST. Additionally, thesystem clock of the computer could be <strong>in</strong>correct by m<strong>in</strong>utes, hours, or days. Of course,these potential errors can be avoided by document<strong>in</strong>g the system clock time <strong>and</strong> the Webserver configuration but orig<strong>in</strong>ation uncerta<strong>in</strong>ty can be more problematic. In the aboveexample, the attacker could be connect<strong>in</strong>g through a proxy so the IP address 10.10.24.93may not be on the same local area network or even <strong>in</strong> the same geographical region as theattacker.Also note that the Web server request recorded <strong>in</strong> this log entry was constructed toexploit a vulnerability that allows an attacker to execute comm<strong>and</strong>s on the server. In thiscase, the attacker is <strong>in</strong>struct<strong>in</strong>g the Web server to fetch a file called su.exe from another
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2system (172.16.19.42). We do not know if the IP address 172.16.19.42 is <strong>in</strong> the samegeographical region as the attacker <strong>and</strong> we do not even know if this comm<strong>and</strong> wassuccessful. 5 F<strong>in</strong>ally, s<strong>in</strong>ce the <strong>in</strong>truder could execute comm<strong>and</strong>s on the Web server,he/she could have fabricated this log entry to misdirect <strong>in</strong>vestigators or deleted log entriesthat implicated him/her as is discussed <strong>in</strong> Section 3. The total potential uncerta<strong>in</strong>ty <strong>in</strong> theabove log entry is a comb<strong>in</strong>ation of these (<strong>and</strong> possibly other) uncerta<strong>in</strong>ties. However, itis not clear how the <strong>in</strong>dividual uncerta<strong>in</strong>ties <strong>in</strong>teract or how they can be comb<strong>in</strong>ed toestimate the maximum potential uncerta<strong>in</strong>ty. Given the number of unknowns <strong>in</strong> theequation, this problem is effectively <strong>in</strong>determ<strong>in</strong>ate. So, it is necessary to estimateuncerta<strong>in</strong>ty <strong>in</strong> a heuristic manner.To strike a balance between the need for some estimate of certa<strong>in</strong>ty <strong>in</strong> digital evidence onnetworked systems versus the cost <strong>and</strong> complexity of calculat<strong>in</strong>g uncerta<strong>in</strong>ty, thefollow<strong>in</strong>g chart is provided, from lowest level of certa<strong>in</strong>ty (C0) to highest level ofcerta<strong>in</strong>ty (C6). 6Certa<strong>in</strong>ty Description/IndicatorsCommensurateExamplesLevelQualificationC0 <strong>Evidence</strong> contradicts known facts. Erroneous/Incorrect Exam<strong>in</strong>ers found a vulnerability <strong>in</strong> InternetExplorer (IE) that allowed scripts on aparticular Web site to create questionablefiles, desktop shortcuts, <strong>and</strong> IE favorites.The suspect did not purposefully createthese items on the system.C1 <strong>Evidence</strong> is highly questionable. Highly Uncerta<strong>in</strong> Miss<strong>in</strong>g entries from log files or signs oftamper<strong>in</strong>g.C2Only one source of evidence thatis not protected aga<strong>in</strong>st tamper<strong>in</strong>g.Somewhat Uncerta<strong>in</strong> E-mail headers, sulog entries, <strong>and</strong> syslogwith no other support<strong>in</strong>g evidence.C3The source(s) of evidence aremore difficult to tamper with butthere is not enough evidence tosupport a firm conclusion or therePossibleAn <strong>in</strong>trusion came from Czechoslovakiasuggest<strong>in</strong>g that the <strong>in</strong>truder might be fromthat area. However, a later connectioncame from Korea suggest<strong>in</strong>g that the5 A 200 return code would tell us that the comm<strong>and</strong> was executed but a 502 return code <strong>in</strong>dicates that theserver became overloaded <strong>and</strong> may or may not have executed the comm<strong>and</strong>.6 Similar types of scales have been found extremely useful <strong>in</strong> meteorology such as the Beaufort W<strong>in</strong>d Scale<strong>and</strong> the Fujita Tornado Scale, which was <strong>in</strong>troduced by F. T. Fujita <strong>in</strong> 1971 <strong>and</strong> has been universallyadopted <strong>in</strong> rat<strong>in</strong>g the <strong>in</strong>tensity of tornados by exam<strong>in</strong><strong>in</strong>g the damage they have done (The Fujita Scale,1999).
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2C4C5C6are unexpla<strong>in</strong>ed <strong>in</strong>consistencies <strong>in</strong>the available evidence.<strong>Evidence</strong> is protected aga<strong>in</strong>sttamper<strong>in</strong>g or multiple, <strong>in</strong>dependentsources of evidence agree butevidence is not protected aga<strong>in</strong>sttamper<strong>in</strong>g.Agreement of evidence frommultiple, <strong>in</strong>dependent sources thatare protected aga<strong>in</strong>st tamper<strong>in</strong>g.However small uncerta<strong>in</strong>ties exist(e.g., temporal error, data loss).The evidence is tamper proof <strong>and</strong>unquestionable.ProbableAlmost Certa<strong>in</strong>Certa<strong>in</strong><strong>in</strong>truder might be elsewhere or that there ismore than one <strong>in</strong>truder.Web server defacement probably orig<strong>in</strong>atedfrom a given apartment s<strong>in</strong>ce tcpwrapperlogs show FTP connections from theapartment at the time of the defacement<strong>and</strong> Web server access logs show the pagebe<strong>in</strong>g accessed from the apartment shortlyafter the defacement.IP address, user account, <strong>and</strong> ANI<strong>in</strong>formation lead to suspect’s home.Monitor<strong>in</strong>g Internet traffic <strong>in</strong>dicates thatcrim<strong>in</strong>al activity is com<strong>in</strong>g from the house.Although this is <strong>in</strong>conceivable at themoment, such sources of digital evidencemay exist <strong>in</strong> the future.Table 1: A proposed scale for categoriz<strong>in</strong>g levels of certa<strong>in</strong>ty <strong>in</strong> digital evidenceIn the 1995 homicide mentioned <strong>in</strong> Subsection 2.1, the certa<strong>in</strong>ty level of the lastconfiguration time of a Shiva Fastpath was greater than C0 (absence of evidence is notevidence of absence) <strong>and</strong> was greater than C1 because there was no evidence oftamper<strong>in</strong>g. However, s<strong>in</strong>ce the date/time stamp was easily falsified, it had a relatively lowcerta<strong>in</strong>ty level (C2). E-mail headers that do not appear to be forged also have a C2certa<strong>in</strong>ty level but can drop to C1 certa<strong>in</strong>ty level if any <strong>in</strong>consistencies or signs of forgeryare detected. If there is an <strong>in</strong>dication that an offender is us<strong>in</strong>g a proxy or some othermethod of conceal<strong>in</strong>g his/her location, there is virtually no certa<strong>in</strong>ty <strong>in</strong> associated IPaddresses recorded <strong>in</strong> log files (C1).When multiple sources of evidence are available, the certa<strong>in</strong>ty level <strong>in</strong> the data isbolstered because details can be corroborated (C4). 7 However, if there are <strong>in</strong>consistenciesbetween the various sources the certa<strong>in</strong>ty level is reduced (C3) <strong>and</strong> signs of tamper<strong>in</strong>g <strong>in</strong>7 Investigators should seek the sources, conduits, <strong>and</strong> targets of an offense. Each of these three areas canhave multiple sources of digital evidence <strong>and</strong> can be used to establish the cont<strong>in</strong>uity of offense (COO).Notably, additional systems may be peripherally <strong>in</strong>volved <strong>in</strong> an offense <strong>and</strong> may have related evidence. Themore corroborat<strong>in</strong>g evidence that <strong>in</strong>vestigators can obta<strong>in</strong>, the more certa<strong>in</strong>ty they can have <strong>in</strong> theirconclusions.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2the available sources of evidence reduce the certa<strong>in</strong>ty level even further (C1). When thereis complete agreement between protected sources of digital evidence, there is a relativelyhigh certa<strong>in</strong>ty level (C5) <strong>in</strong> the evidence but some uncerta<strong>in</strong>ty still exists. Currently, thereare no tamper proof sources of digital evidence (C6) but future developments mayprovide this level of certa<strong>in</strong>ty.The certa<strong>in</strong>ty levels <strong>in</strong> Table 1 apply to data, not to conclusions based on digital evidences<strong>in</strong>ce erroneous conclusions can be drawn from reliable data. Also, the certa<strong>in</strong>ty level <strong>in</strong>digital evidence depends on its context, which <strong>in</strong>cludes offender actions <strong>and</strong> can only beassessed after a crime has been committed. An entry <strong>in</strong> a s<strong>in</strong>gle log file that is susceptibleto tamper<strong>in</strong>g has a low certa<strong>in</strong>ty level (C2) unless there are other <strong>in</strong>dependent sources ofcorroborat<strong>in</strong>g evidence (C4). A st<strong>and</strong>ard syslog entry has a lower certa<strong>in</strong>ty level (C2)than a cryptographically signed entry (C4) unless there is some <strong>in</strong>dication that the signedlogs have been compromised, <strong>in</strong> which case the certa<strong>in</strong>ty level drops to C1.In addition to provid<strong>in</strong>g forensic exam<strong>in</strong>ers with a practical method for estimat<strong>in</strong>guncerta<strong>in</strong>ty, this heuristics approach allows <strong>in</strong>vestigators, attorneys, judges, <strong>and</strong> jurorswho do not have a deep technical underst<strong>and</strong><strong>in</strong>g of network technology to assess thereliability of a given piece of digital evidence.3. Log Tamper<strong>in</strong>g, Corruption <strong>and</strong> <strong>Loss</strong><strong>Error</strong>s <strong>and</strong> losses can be <strong>in</strong>troduced <strong>in</strong>to log files at various stages:• At the time of the event: a fake IP address can be <strong>in</strong>serted <strong>in</strong>to a packet or a logentry can be fabricated to misdirect <strong>in</strong>vestigators
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2• Dur<strong>in</strong>g observation: Spann<strong>in</strong>g ports on a switch to monitor network traffic<strong>in</strong>creases the load on the switch <strong>and</strong> may <strong>in</strong>troduce load<strong>in</strong>g error, <strong>in</strong>creas<strong>in</strong>g thenumber of dropped datagrams.• At the time of the log creation: remote logg<strong>in</strong>g facilities such as syslog can havesome percentage of lost messages <strong>and</strong> a clock offset on the logg<strong>in</strong>g system canresult <strong>in</strong> <strong>in</strong>correct date/time stamps.• After creation: log files can become corrupted or can be altered to mislead<strong>in</strong>vestigators• Dur<strong>in</strong>g exam<strong>in</strong>ation: the application used to represent log files can <strong>in</strong>troduceerrors by <strong>in</strong>correctly <strong>in</strong>terpret<strong>in</strong>g data or by fail<strong>in</strong>g to display certa<strong>in</strong> details• Dur<strong>in</strong>g analysis: mistakes by an exam<strong>in</strong>er can result <strong>in</strong> <strong>in</strong>correct conclusionsWhen deal<strong>in</strong>g with networked systems, it is crucial to extend one’s th<strong>in</strong>k<strong>in</strong>g beyond thes<strong>in</strong>gle computer <strong>and</strong> consider other connected systems. We will concentrate on simplescenarios s<strong>in</strong>ce they are more likely than sophisticated attacks. However, one should keep<strong>in</strong> m<strong>in</strong>d that DNS records can be modified to make an attacker’s computer appear to be atrusted computer, logg<strong>in</strong>g programs such as tcpwrappers can be maliciously altered(CERT, 1999), <strong>and</strong> many other components of a network can be tampered with to createerror <strong>and</strong> loss.3.1 Log Tamper<strong>in</strong>gOn Unix, system messages are sent to syslogd through the /dev/log socket <strong>and</strong> are writtento a file on disk or sent to a remote logg<strong>in</strong>g host. S<strong>in</strong>ce /dev/log usually allows anyone onthe system to write to it <strong>and</strong> syslogd is often configured to accept messages over thenetwork from any system, it is relatively easy to fabricate a syslog entry. Additionally,
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2syslogd does not add a date/time stamp to locally generated messages that already have atime stamp. So, even if a user does not have write access to the syslog file, he/she canfabricate an entry with any date/time stamp <strong>and</strong> pass it through /dev/log us<strong>in</strong>g a simpleprogram as shown here:$ ls -latc /var/log/messages-rw------- 1 root root 756174 May 6 17:34 /var/log/messages$ ls -latc /dev/logsrw-rw-rw- 1 root root 0 Apr 22 14:18 /dev/log$ ./make-syslog "Apr 13 04:55:05 lpd[1534]: unexpected error from 192.168.35.65"$ suPassword:# tail -f /var/log/messagesMay 6 17:35:00 server1 CROND[21478]: (root) CMD (/usr/local/security/regular-job.pl)Apr 13 04:55:05 server1 lpd[1534]: unexpected error from 192.168.35.65May 6 17:36:00 server1 CROND[21482]: (root) CMD (/usr/local/security/regular-jop.pl)The only component of a syslog message that cannot be fabricated <strong>in</strong> this way is thehostname (server1 <strong>in</strong> this <strong>in</strong>stance) mak<strong>in</strong>g st<strong>and</strong>ard syslog a relatively unreliable sourceof evidence (C2).Although Unix utmp <strong>and</strong> wtmp files (C3) are harder to tamper with because root accessis required, they provide another example of log file tamper<strong>in</strong>g. Logon sessions to Unixsystems are recorded <strong>in</strong> the wtmp <strong>and</strong> utmp files which are ma<strong>in</strong>ta<strong>in</strong>ed by log<strong>in</strong> <strong>and</strong> <strong>in</strong>it.The wtmp file reta<strong>in</strong>s an historical record of logon sessions whereas the utmp file showswho is currently logged <strong>in</strong>to the system. As noted <strong>in</strong> the manual page for utmp:The utmp file allows one to discover <strong>in</strong>formation about who iscurrently us<strong>in</strong>g the system. There may be more users currentlyus<strong>in</strong>g the system, because not all programs use utmp logg<strong>in</strong>g.Warn<strong>in</strong>g: utmp must not be writable, because many system programs(foolishly) depend on its <strong>in</strong>tegrity. You risk faked systemlogfiles <strong>and</strong> modifications of system files if you leave utmpwritable to any user.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2The same warn<strong>in</strong>gs apply to the wtmp log s<strong>in</strong>ce not all legitimate programs utilize thelog<strong>in</strong> process <strong>and</strong> most <strong>in</strong>truder backdoors purposefully avoid mak<strong>in</strong>g log entries.Additionally, it is common for <strong>in</strong>truders who ga<strong>in</strong> unauthorized root access to computersystems to remove entries from the wtmp log us<strong>in</strong>g programs such as wzap to concealtheir activities. Although W<strong>in</strong>dows NT logon entries are difficult to fabricate (C3), verylittle logg<strong>in</strong>g is enabled by default <strong>and</strong>, when they are enabled, the logs can be deletedus<strong>in</strong>g programs such as W<strong>in</strong>Zapper. 8 Application logs such as Web server access logs areeven easier to alter, mak<strong>in</strong>g them even less reliable (C2). 93.2 Log CorruptionWhen users logout of a Unix system, <strong>in</strong>it usually clears most of the associated utmp entry<strong>and</strong> records the logout <strong>in</strong> the wtmp file. However, the logout process sometimes fails toupdate these files, leav<strong>in</strong>g remnants of previous sessions that can <strong>in</strong>troduce errors <strong>in</strong> anumber of ways. The most common error is a wtmp entry that shows the user account asstill logged <strong>in</strong> long after the user disconnected from the system. 10In one harassment case, wtmp logs (C3) suggested that the primary suspect had been8 http://ntsecurity.nu/toolbox/w<strong>in</strong>zapper/9 The certa<strong>in</strong>ty level <strong>in</strong> these logs depends on their context. Although Microsoft Internet Information Server(IIS) makes it difficult to alter <strong>and</strong> delete its active logs, IIS generally creates a new log each day leav<strong>in</strong>gthe previous log vulnerable to tamper<strong>in</strong>g or destruction. Intruders will return to a Web server shortly afterIIS creates a new log file to overwrite <strong>in</strong>crim<strong>in</strong>at<strong>in</strong>g evidence from the <strong>in</strong>active log. Interest<strong>in</strong>gly, theW<strong>in</strong>dows swap file can conta<strong>in</strong> an abundance of <strong>in</strong>formation about recent Web server queries, provid<strong>in</strong>gforensic exam<strong>in</strong>ers with <strong>in</strong>formation to corroborate the exist<strong>in</strong>g Event Logs <strong>and</strong> IIS access logs orreconstruct events when access logs have been overwritten, <strong>in</strong>creas<strong>in</strong>g the certa<strong>in</strong>ty <strong>in</strong> the log entries to C4or C5.10 A similar problem can occur <strong>in</strong> TACACS logs when an <strong>in</strong>dividual connects to the Internet through a dialupterm<strong>in</strong>al server. The logout may not be recorded, giv<strong>in</strong>g the impression that a particular <strong>in</strong>dividual isstill connected to the Internet after the session has been term<strong>in</strong>ated.11 The system clock was four m<strong>in</strong>utes fast – a fact that would have made it more difficult to locate theimplicat<strong>in</strong>g pacct records if it had been overlooked. Also, pacct log entries are listed <strong>in</strong> order of the timethat a process ended rather than when it began. If this feature is not accounted for, the order of events will
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2logged <strong>in</strong>to a system at the time of one act. A closer exam<strong>in</strong>ation of the log entry showedthat a logout entry had never been entered for his session, giv<strong>in</strong>g the impression that thesuspect had been logged <strong>in</strong> for 5 days, 16 hours, <strong>and</strong> 44 m<strong>in</strong>utes.serverZ% lastuserA pts/14 roosevelt.corpX. Sun Dec 9 13:10 - 13:17 (00:07)userB pts/2 pc01.sales.corpX Sun Dec 9 13:09 - 13:29 (00:10)userC pts/13 l<strong>in</strong>coln.corpX.co Sun Dec 9 13:01 - 16:16 (03:15)userH pts/3 homepc.isp.com Fri Dec 7 14:14 - 10:53 (6+20:38)suspect pts/7 201.4.136.24 Fri Dec 7 08:39 - 01:23 (5+16:44)An exam<strong>in</strong>ation of the process account<strong>in</strong>g logs (C3) for the time <strong>in</strong> question showed thatno processes had been runn<strong>in</strong>g under the suspect’s account <strong>and</strong> that only one of the otherlogon sessions had been active, implicat<strong>in</strong>g another <strong>in</strong>dividual. 11A less common, secondary logg<strong>in</strong>g error occurs when someone logs <strong>in</strong>to a Unix term<strong>in</strong>althat has an associated utmp entry that was not properly cleared <strong>and</strong> then uses the su(substitute user) comm<strong>and</strong>. Instead of mak<strong>in</strong>g an entry <strong>in</strong> the sulog (C2) show<strong>in</strong>g thecurrent user su-<strong>in</strong>g, su erroneously obta<strong>in</strong>s the user name of the previous user from theutmp file <strong>and</strong> makes an entry <strong>in</strong> the sulog show<strong>in</strong>g the previous user su-<strong>in</strong>g. This anomalycan give the impression that someone escalated their privileges on a system withoutauthorization when <strong>in</strong> fact an authorized <strong>in</strong>dividual escalated privileges <strong>and</strong> the logsmislabeled the act. Both of these examples show how corrupt log files can implicate an<strong>in</strong>nocent <strong>in</strong>dividual if potential errors are not accounted for.be illogical, <strong>and</strong> important events may be overlooked because the correspond<strong>in</strong>g log entries appear after thetime period of <strong>in</strong>terest.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 23.3 Data <strong>Loss</strong>In addition to the log tamper<strong>in</strong>g <strong>and</strong> corruption, uncerta<strong>in</strong>ty can be <strong>in</strong>troduced by lost logentries. Because syslog <strong>and</strong> other logg<strong>in</strong>g mechanisms like NetFlow (detailed <strong>in</strong>Subsection 4.2) use UDP by default when send<strong>in</strong>g messages to remote systems, there isno guarantee of delivery. When a remote logg<strong>in</strong>g host is heavily loaded it may drop somesyslog messages sent from remote hosts, thus leav<strong>in</strong>g gaps <strong>in</strong> the logs. Alternatives to thest<strong>and</strong>ard syslog use TCP <strong>and</strong> sequence numbers to <strong>in</strong>crease reliability <strong>and</strong> facilitate losscalculations. Also, some alternatives use digital signatures to check the <strong>in</strong>tegrity of logfiles (C4), mak<strong>in</strong>g it more difficult to modify logs without detection (Tan, J., 2001).<strong>Loss</strong>es can also occur <strong>in</strong> system kernels <strong>and</strong> at the hardware level, miss<strong>in</strong>g importantdetails. Figure 3 illustrates some losses that can occur on a networked system used tomonitor network communications.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Figure 3: Potential data losses as pulses <strong>in</strong> network cables are translated <strong>in</strong>to humanreadable formNetwork <strong>in</strong>terface cards <strong>and</strong> network devices such as switches, routers, <strong>and</strong> firewalls c<strong>and</strong>rop packets. Also, programs used to monitor network traffic can become overloaded <strong>and</strong>fail to reta<strong>in</strong> all packets captured by the kernel as described <strong>in</strong> Subsection 4.1. AlthoughTCP is designed to retransmit dropped packets, network sniffers are not activeparticipants <strong>in</strong> the communication channel <strong>and</strong> will not cause packets to be resent.Furthermore, network-monitor<strong>in</strong>g applications may show only certa<strong>in</strong> types of data (e.g.,only Internet Protocol data) <strong>and</strong> may <strong>in</strong>troduce error or discard <strong>in</strong>formation by design orun<strong>in</strong>tentionally dur<strong>in</strong>g operation.4. Quantify<strong>in</strong>g <strong>and</strong> Reduc<strong>in</strong>g <strong>Loss</strong>es 12Because of the transient nature of evidence on a network, there is usually only oneopportunity to capture it, <strong>and</strong> it can be difficult to determ<strong>in</strong>e what <strong>in</strong>formation was notcollected. Although it may not be possible to <strong>in</strong>fer the content of lost datagrams, it isuseful to quantify the percentage loss. Quantify<strong>in</strong>g the amount of loss gives a sense of thecompleteness of the logs <strong>and</strong> the result<strong>in</strong>g crime reconstruction. High losses dur<strong>in</strong>g themonitor<strong>in</strong>g <strong>and</strong> collection phase translate to low level of certa<strong>in</strong>ty <strong>in</strong> the crimereconstruction phase. Furthermore, quantification can help identify <strong>and</strong> m<strong>in</strong>imize thecause of data loss as demonstrated <strong>in</strong> Subsection 4.2.12 The Snort <strong>and</strong> NetFlow data <strong>in</strong> this section were collected at Yale University, Information SecurityOffice.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Snort analyzed 936893902 out of 1062045648 packets, dropp<strong>in</strong>g 125151746 (11.784%) packetsThe percentage of packets dropped by one Snort system 18 over a 74-day period on anetwork that peaks regularly at 60 Mbits/second is shown <strong>in</strong> Figure 4. This chart<strong>in</strong>dicates that the number of packets missed by Snort <strong>in</strong>creases with total traffic. 19 Thehigh percentage losses at low traffic levels may be due to higher than normal loads on thecollection system.3025Percentage loss201510500 500000000 1E+09 1.5E+09 2E+09 2.5E+09Total packets captured by libpcapFigure 4: High losses <strong>in</strong> Snort correspond to higher total number of packetscaptured by libpcap18 A st<strong>and</strong>ard desktop computer with a Pentium processor, 100Mbit network <strong>in</strong>terface card <strong>and</strong> 256MBRAM, runn<strong>in</strong>g FreeBSD.19 These losses do not <strong>in</strong>clude packets dropped by the network <strong>in</strong>terface card as will be discussed later <strong>in</strong>this section. In this <strong>in</strong>stance, the losses <strong>in</strong> the network <strong>in</strong>terface card are negligible <strong>in</strong> comparison to thelosses <strong>in</strong> the kernel.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2In an effort to identify the cause of the high losses, the number of lost NetFlow logentries was tracked over a four-day period <strong>in</strong> 15 m<strong>in</strong>ute <strong>in</strong>tervals <strong>and</strong> is plotted <strong>in</strong> Figure5. The peak losses occurred between 14:45 <strong>and</strong> 16:45 each day, which corresponds withtimes of peak <strong>in</strong>bound traffic for the network. Zero losses occurred between 04:30 <strong>and</strong>07:00, correspond<strong>in</strong>g with times of low network usage. More notably, the number ofcaptured flows never exceeded 219090 flows <strong>in</strong> any 15 m<strong>in</strong>ute <strong>in</strong>terval, <strong>in</strong>dicat<strong>in</strong>g thatsome limit<strong>in</strong>g factor was caus<strong>in</strong>g the large losses.The limit<strong>in</strong>g factor turned out to be a rate-limit<strong>in</strong>g device called a Packeteer that wasun<strong>in</strong>tentionally restrict<strong>in</strong>g the NetFlow UDP traffic between the router <strong>and</strong> collector. Therate-limit<strong>in</strong>g was removed <strong>and</strong> losses over another four-day period are plotted <strong>in</strong> Figure6, show<strong>in</strong>g significant reduction <strong>in</strong> losses (predom<strong>in</strong>antly zero) <strong>and</strong> <strong>in</strong>crease <strong>in</strong> the totalnumber of captured flows. This example demonstrates that uncerta<strong>in</strong>ty can be <strong>in</strong>troduced<strong>in</strong> unexpected ways on a network.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2NetFlow: Captured <strong>and</strong> Lost FlowsFlows4500004000003500003000002500002000001500001000005000000 20 40 60 80 100Hourslost flowscaptured flowsFigure 5: Captured <strong>and</strong> lost flows between Apr 15 00:00 – Apr 18 20:15, 2002. Highlosses <strong>in</strong> NetFlow correspond to higher traffic loads between 14:45 <strong>and</strong> 16:45 eachday
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2NetFlow: Captured <strong>and</strong> Lost Flows600000500000Flows400000300000200000100000lost flowscaptured flows00 20 40 60 80 100HoursFigure 6: Captured <strong>and</strong> lost flows between May 9, 21:45 – May 13 17:45, 2002.<strong>Loss</strong>es are significantly reduced after limit<strong>in</strong>g factor (Packeteer rule) was removed.In this situation, the NetFlow collector was runn<strong>in</strong>g on a Sun Netra system that was notphysically close to the router. The losses could be further reduced by connect<strong>in</strong>g thecollector directly to the router us<strong>in</strong>g a dedicated <strong>in</strong>terface thus elim<strong>in</strong>at<strong>in</strong>g the physicaldistance <strong>and</strong> shared b<strong>and</strong>width that can cause UDP datagrams to be dropped. Connect<strong>in</strong>gthe NetFlow collector directly to the router would have the added benefit of reduc<strong>in</strong>g thepotential for load<strong>in</strong>g error. 20 Us<strong>in</strong>g a more powerful computer may also reduce the lossesif the processor or disk access speeds are limit<strong>in</strong>g factors.20 Export<strong>in</strong>g NetFlow logs from a router through one of its exist<strong>in</strong>g <strong>in</strong>terfaces consumes b<strong>and</strong>width that<strong>in</strong>terferes with the transmission of other data pass<strong>in</strong>g through the router, thus <strong>in</strong>troduc<strong>in</strong>g load<strong>in</strong>g error.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 24.3 Network Interface <strong>Loss</strong>esSome communication devices have Control/Status Registers that record the number ofdropped datagrams <strong>and</strong> packets. For <strong>in</strong>stance, Universal AsynchronousReceiver/Transmitter (UART) devices such as modems <strong>and</strong> network <strong>in</strong>terface cards(NIC) have registers that are updated each time the device encounters a problem with adatagram. One manufacturer provides the follow<strong>in</strong>g <strong>in</strong>formation about <strong>in</strong>terface errors,<strong>in</strong>clud<strong>in</strong>g datagrams lost due to receiver overruns a.k.a. FIFO overruns (NX Networks,1997).• packet too long or failed, frame too long: "The <strong>in</strong>terface received a packet that is larger than the maximum sizeof 1518 bytes for an Ethernet frame."• CRC error or failed, FCS (Frame Check Sequence) error: "The <strong>in</strong>terface received a packet with a CRC error."• Fram<strong>in</strong>g error or failed, alignment error: "The <strong>in</strong>terface received a packet whose length <strong>in</strong> bits is not a multiple ofeight."• FIFO Overrun: "The Ethernet chipset is unable to store bytes <strong>in</strong> the local packet buffer as fast as they come offthe wire."• Collision <strong>in</strong> packet: "Increments when a packet collides as the <strong>in</strong>terface attempts to receive a packet, but thelocal packet buffer is full. This error <strong>in</strong>dicates that the network has more traffic than the <strong>in</strong>terface can h<strong>and</strong>le."• Buffer full warn<strong>in</strong>gs: "Increments each time the local packet buffer is full."• Packet misses: "The <strong>in</strong>terface attempted to receive a packet, but the local packet buffer is full. This error<strong>in</strong>dicates that the network has more traffic than the <strong>in</strong>terface can h<strong>and</strong>le."Many network <strong>in</strong>terface cards also have a register that <strong>in</strong>dicates how many datagramswere captured by the device but were not read by the system. Figure 7 depicts theselosses when tcpdump is used.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Figure 7: Potential losses when monitor<strong>in</strong>g/captur<strong>in</strong>g network traffic us<strong>in</strong>g tcpdumpIn Figure 7, when the Ethernet card’s buffer is full <strong>and</strong> is unable to capture a datagramfrom the Ethernet cable, an error is generated <strong>and</strong> stored <strong>in</strong> a Control/Status Register onthe card (receiver overruns = datagrams on the wire – datagrams captured by NIC). Iflibpcap does not read packets from the NIC’s buffer quickly enough, each unread packetis recorded as a dropped packet by the NIC (dropped packets = packets available on NIC– packets read by libpcap; reported by netstat or ifconfig). When an application such astcpdump does not process packets stored by libpcap quickly enough, each unread packetis recorded as a dropped packet by libpcap as shown <strong>in</strong> Subsection 4.1 (dropped packets= packets read by libpcap – packets read by tcpdump).Some system kernels display receiver overruns based on Control/Status Register values<strong>in</strong> Ethernet cards. For example, on L<strong>in</strong>ux systems netstat <strong>and</strong> ifconfig can be used to
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2display data <strong>in</strong> network structures <strong>in</strong>clud<strong>in</strong>g receiver overruns. 21 Output of thesecomm<strong>and</strong>s on a L<strong>in</strong>ux mach<strong>in</strong>e is shown below with receiver overruns <strong>in</strong> bold (alsoviewable us<strong>in</strong>g cat /proc/net/dev).% netstat -nidKernel Interface tableIface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flgeth0 1500 0 19877416 0 0 128 7327647 0 0 0 BRU% /sb<strong>in</strong>/ifconfigeth0L<strong>in</strong>k encap:Ethernet HWaddr 00:B0:D0:F3:CB:B5<strong>in</strong>et addr:128.36.232.10 Bcast:128.36.232.255Mask:255.255.255.0UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1RX packets:19877480 errors:0 dropped:0 overruns:128 frame:0TX packets:7327676 errors:0 dropped:0 overruns:0 carrier:1collisions:442837 txqueuelen:100Interrupt:23 Base address:0xec80Similar <strong>in</strong>formation is available on network devices such as switches, routers <strong>and</strong>firewalls. The follow<strong>in</strong>g output obta<strong>in</strong>ed from a PIX firewall us<strong>in</strong>g the show <strong>in</strong>terfacecomm<strong>and</strong> gives a significant amount of error <strong>in</strong>formation, <strong>in</strong>clud<strong>in</strong>g overruns.# show <strong>in</strong>ter<strong>in</strong>terface ethernet0 "outside" is up, l<strong>in</strong>e protocol is upHardware is i82557 ethernet, address is 00a0.c5d1.35d3IP address 192.168.12.15, subnet mask 255.255.255.0MTU 1500 bytes, BW 100000 Kbit full duplex1600481967 packets <strong>in</strong>put, 4258033917 bytes, 1172 no bufferReceived 969256 broadcasts, 0 runts, 0 giants3 <strong>in</strong>put errors, 3 CRC, 0 frame, 0 overrun, 3 ignored, 0 abort1974900124 packets output, 1524603379 bytes, 12 underruns0 output errors, 0 collisions, 0 <strong>in</strong>terface resets0 babbles, 0 late collisions, 0 deferred0 lost carrier, 0 no carrier<strong>in</strong>put queue (curr/max blocks): hardware (128/128) software (0/129)output queue (curr/max blocks): hardware (0/128) software (0/167)Although it may not be feasible to prevent such errors, it is important to document themwhen collect<strong>in</strong>g evidence. For <strong>in</strong>stance, when network traffic is be<strong>in</strong>g monitored througha spanned port on a switch, the <strong>in</strong>terface on some switches can be polled for errors when21 FreeBSD, Solaris, <strong>and</strong> other versions of Unix do not report receiver overruns <strong>in</strong> the netstat or ifconfigoutput but this <strong>and</strong> other <strong>in</strong>formation about network <strong>in</strong>terfaces can be obta<strong>in</strong>ed us<strong>in</strong>g SNMP. For <strong>in</strong>stance,<strong>in</strong>formation about <strong>in</strong>terfaces on FreeBSD can be obta<strong>in</strong>ed us<strong>in</strong>g snmpwalk <strong>and</strong> snmpnetstat <strong>in</strong> net-snmpdistribution (e.g., snmpwalk localhost public <strong>in</strong>terfaces.ifTable.ifEntry)
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2the capture is complete. Additionally, errors on the collection system’s network <strong>in</strong>terfacecard can be similarly documented. Presumably these losses will be negligible unless thesystem is malfunction<strong>in</strong>g but document<strong>in</strong>g the errors or lack thereof will help establishthis fact.5. <strong>Error</strong>s <strong>in</strong> Reconstruction <strong>and</strong> InterpretationGiven the complexity of networked systems, large amounts of data, <strong>and</strong> occurrence ofevidence dynamics 22 there is a high potential for mistakes <strong>and</strong> mis<strong>in</strong>terpretations <strong>in</strong> theanalysis phase of an <strong>in</strong>vestigation. Tools exist to help us process <strong>and</strong> exam<strong>in</strong>e the vastamounts of data common <strong>in</strong> network <strong>in</strong>vestigations but these tools add another layer ofabstraction <strong>and</strong> are useful only when sufficient data are obta<strong>in</strong>ed <strong>and</strong> the mean<strong>in</strong>g of thedata are understood. W<strong>in</strong>dows Event Log analysis provides a useful example of howanalysis tools can <strong>in</strong>troduce errors. To facilitate analysis of W<strong>in</strong>dows NT Event Logs,many exam<strong>in</strong>ers dump the log entries <strong>in</strong>to a text file us<strong>in</strong>g dumpel or dempevt. Althoughdempevt presents more <strong>in</strong>formation than dumpel, it can <strong>in</strong>correctly <strong>in</strong>terpret date/timestamps after one hour is <strong>in</strong>serted for daylight sav<strong>in</strong>gs time. The output from dempevt failsto correct for the time change <strong>and</strong> represents times prior to daylight sav<strong>in</strong>gs one hour off(Casey, 2001).<strong>Uncerta<strong>in</strong>ty</strong> <strong>in</strong> orig<strong>in</strong> can also lead to <strong>in</strong>terpretation error, as was demonstrated when apharmaceutical company <strong>in</strong> the United K<strong>in</strong>gdom relied on a forged “From” e-mail header(C1) to accuse a researcher <strong>in</strong> the United States of purposefully send<strong>in</strong>g a virus to several22 <strong>Evidence</strong> dynamic is any <strong>in</strong>fluence that changes, relocates, obscures, or obliterates evidence, regardlessof <strong>in</strong>tent, between the time evidence is transferred <strong>and</strong> the time the case is adjudicated (Chisum, Turvey2000), (Casey 2001)
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2e-mail accounts that had been targeted by extreme animal rights groups <strong>in</strong> the past. Byexam<strong>in</strong><strong>in</strong>g the “Received” headers (C2) of the offend<strong>in</strong>g e-mail messages, <strong>in</strong>vestigatorslocated the computer that was used to send the messages. The computer was a publicworkstation that the accused researcher did not remember us<strong>in</strong>g. Investigators found thatthe computer was <strong>in</strong>fected with the Sircam worm – the worm had evidently obta<strong>in</strong>ed thepharmaceutical company’s e-mail addresses from a Web page <strong>in</strong> the browser cache <strong>and</strong>forged a message to these accounts, mak<strong>in</strong>g it appear to come from the researcher’s e-mail address. The Sircam worm was spread<strong>in</strong>g itself to others <strong>in</strong> a similar manner,weaken<strong>in</strong>g the already feeble theory that the research purposefully targeted thepharmaceutical company.A more subtle example of <strong>in</strong>terpretation error arose <strong>in</strong> a case of an employee suspected ofdownload<strong>in</strong>g several pornographic files to disk, creat<strong>in</strong>g shortcuts on the desktop toseveral pornographic sites, <strong>and</strong> add<strong>in</strong>g several Favorite items to the Internet ExplorerWeb browser. The employee denied all knowledge of these items. An analysis of theWeb sites that were viewed by the employee at the time <strong>in</strong> question helped exam<strong>in</strong>ersdeterm<strong>in</strong>e that one site exploited a vulnerability <strong>in</strong> Internet Explorer (Microsoft, 1999)<strong>and</strong> created the offend<strong>in</strong>g items on the computer. Although the suspect’s onl<strong>in</strong>e behaviordid not cross the crim<strong>in</strong>al threshold, it was sufficiently reprehensible to justify histerm<strong>in</strong>ation.There are many potential <strong>in</strong>terpretation errors that can occur when trac<strong>in</strong>g a logged eventback to its source. Locat<strong>in</strong>g a computer <strong>in</strong>truder, for <strong>in</strong>stance, often beg<strong>in</strong>s with an IP
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2address <strong>in</strong> a log file. However, rely<strong>in</strong>g on <strong>in</strong>formation from only one source can result <strong>in</strong>the wrong IP address.In one Web defacement case, a system adm<strong>in</strong>istrator asserted that a specific <strong>in</strong>dividualwas responsible because he was the only person logged <strong>in</strong>to the Web server at the time ofthe defacement. However, this conclusion was based on one log file – the wtmp log (C3)on a Unix mach<strong>in</strong>e that shows when user accounts logged on <strong>and</strong> off. A thoroughexam<strong>in</strong>ation of other log files on the system showed the follow<strong>in</strong>g:1) The tcpwrappers log showed an FTP connection from a PC to the Web servershortly before the page was defaced2) The Web server access log showed the defaced page be<strong>in</strong>g viewed from severalPCs shortly after the defacementThese support<strong>in</strong>g log entries (C4) were used to locate the general source of the attack – ashared apartment – <strong>and</strong> personal <strong>in</strong>terviews of the reduced suspect pool led to the v<strong>and</strong>al.The orig<strong>in</strong>al suspect was not <strong>in</strong>volved <strong>and</strong>, even if the <strong>in</strong>itial suspect's account had beenused to deface the Web page, this does not necessarily imply that the account owner wasresponsible. A wtmp entry does not necessarily <strong>in</strong>dicate that the user logged <strong>in</strong> at a giventime s<strong>in</strong>ce it is possible that someone else used the <strong>in</strong>dividual’s account.Once there is a high degree of certa<strong>in</strong>ty that a given IP address is the actual orig<strong>in</strong> of theattack, the responsible ISP is contacted for subscriber <strong>in</strong>formation. In one <strong>in</strong>trusion case,
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2an ISP became very concerned when an <strong>in</strong>trusion appeared to orig<strong>in</strong>ate from one of theiremployees. It transpired that the IP address had been transcribed <strong>in</strong>correctly <strong>in</strong> thesubpoena. The <strong>in</strong>trusion was actually committed us<strong>in</strong>g a Californian subscriber’s account.However, the subscriber’s password had been stolen <strong>and</strong> used by the <strong>in</strong>truder without herknowledge, mak<strong>in</strong>g it more difficult to identify the offender. Fortunately, the ISP usedAutomatic Number Identification (ANI), allow<strong>in</strong>g the <strong>in</strong>vestigators to determ<strong>in</strong>e theorig<strong>in</strong>at<strong>in</strong>g phone number <strong>in</strong> Texas for each connection to the compromised systems.Although the phone number led to the <strong>in</strong>truder <strong>in</strong> this case, ANI <strong>in</strong>formation is notalways conclusive s<strong>in</strong>ce someone could have broken <strong>in</strong>to the <strong>in</strong>dividual’s home computer<strong>and</strong> used it as a launch<strong>in</strong>g pad for attacks.Even when an exam<strong>in</strong>er underst<strong>and</strong>s the mean<strong>in</strong>g of a given log entry, <strong>in</strong>terpretationerrors can result when the possibility of a wily offender is ignored. It is not uncommonfor computer <strong>in</strong>truders to fabricate log entries to give a false orig<strong>in</strong> of attack. Therefore,forensic exam<strong>in</strong>ers should always enterta<strong>in</strong> the possibility that a sophisticated attacker <strong>in</strong>one location has staged a crime scene to make it appear to be an unsophisticated <strong>in</strong>truder<strong>in</strong> a different location.Mistakes <strong>in</strong> <strong>in</strong>terpretation <strong>and</strong> analysis can be reduced by rigorous application of thescientific method – perform<strong>in</strong>g exhaustive <strong>in</strong>vestigation <strong>and</strong> research, question<strong>in</strong>g allassumptions, <strong>and</strong> develop<strong>in</strong>g a theory that expla<strong>in</strong>s the facts. Even if a forensic exam<strong>in</strong>erhas a high degree of confidence <strong>in</strong> his/her assumptions, potential errors <strong>and</strong> alternateexplanations should be explored <strong>and</strong> elim<strong>in</strong>ated or at least documented as unlikely. If a
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2forensic exam<strong>in</strong>er has complete confidence <strong>in</strong> his/her conclusions, this is usually an<strong>in</strong>dication that he/she is miss<strong>in</strong>g someth<strong>in</strong>g – there is always uncerta<strong>in</strong>ty <strong>and</strong> allassertions should be qualified accord<strong>in</strong>gly as discussed <strong>in</strong> Subsection 2.3.6. Case Example 23On 04/23/02, an organization found that a server conta<strong>in</strong><strong>in</strong>g sensitive <strong>in</strong>formation hadbeen compromised. Although there were no unusual logon entries <strong>in</strong> the wtmp log (C3),the contents of the root account’s comm<strong>and</strong> history file (/.sh_history) gave a sense ofwhat the <strong>in</strong>truder was do<strong>in</strong>g on a system.# cat /.sh_historymkdir nispluscd nisplusmkdir /dev/pts/01wcp /lib/boot/uconf.<strong>in</strong>v /dev/pts/01wget ftp://ftp.uu.net/tmp/genocide-AIX-000DC06D4C00-20020419.tar.gzgzip -dc *|tar -xvf -./setuptelnet 192.168.34.30 4580TERM=xtermexport TERMrm -rf /tmp/.newtelnet 192.168.34.30 4580Although the filename appears to conta<strong>in</strong> a date 20020419, all of the <strong>in</strong>formation <strong>in</strong> thecomm<strong>and</strong> history file has a high temporal uncerta<strong>in</strong>ty (C2) because it does not havedate/time stamps associated with each entry. Also, although this log conta<strong>in</strong>s the hostftp.uu.net, it is unlikely that this is the orig<strong>in</strong> of the <strong>in</strong>itial <strong>in</strong>trusion.The ctime of the /dev/pts/01 directory shown <strong>in</strong> the .sh_history file suggests April 23 as adate, but this is not necessarily the time that the directory was created as on a W<strong>in</strong>dows23 Some hostnames <strong>and</strong> IP addresses <strong>in</strong> this example have been modified to protect the <strong>in</strong>nocent
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2system. 24 Mis<strong>in</strong>terpret<strong>in</strong>g the <strong>in</strong>ode ctime as the creation time can lead to errors <strong>in</strong>analysis.# ls -latc /dev/pts/total 24crw--w--w- 1 user staff 27, 1 Apr 23 15:40 1crw--w--w- 1 root system 27, 4 Apr 23 15:40 4crw--w--w- 1 user staff 27, 0 Apr 23 15:35 0crw--w--w- 1 user staff 27, 3 Apr 23 15:34 3crw--w--w- 1 user staff 27, 2 Apr 23 15:31 2drwxr-xr-x 2 root system 512 Apr 23 15:22 01The ctime of the /dev/pts/01/uconf.<strong>in</strong>v file shown <strong>in</strong> the .sh_history suggests that the<strong>in</strong>truder copied this rootkit configuration file on April 21 at 22:25.# ls -latc /dev/pts/01 | head -5total 48drwxr-xr-x 2 root system 512 Apr 23 15:22 .-rwxr-xr-x 1 root system 4675 Apr 23 15:22 a.out-rw-r--r-- 1 root system 85 Apr 23 15:22 foo.c-rw-r--r-- 1 root system 437 Apr 21 22:25 uconf.<strong>in</strong>vdrwxr-xr-x 3 root system 3584 Apr 21 22:23 ..A search for other files created on April 21 supports the hypothesis that the entries <strong>in</strong> the.sh_history file were from April 21 between 22:25 <strong>and</strong> 22:36 (genocide-AIX-000DC06D4C00-20020419.tar.gz).# ls -latc /usr/lib/security/nisplus | head -22total 3016drwxr-xr-x 3 root system 1024 Apr 23 14:00 .-rw------- 1 root system 24099 Apr 23 14:00 drone2.dat-rw-r--r-- 1 root system 91 Apr 23 14:00 servers-rw-r--r-- 1 root system 0 Apr 21 22:36 identd-rw-r--r-- 1 root system 6 Apr 21 22:35 pid.drone-rw-r--r-- 1 root system 1769 Apr 21 22:35 cront-rwxr-xr-x 1 root system 1472 Apr 21 22:35 droneload-rw-r--r-- 1 root system 3 Apr 21 22:35 b<strong>in</strong>name-rw-r--r-- 1 root system 6 Apr 21 22:35 nick-rw-r--r-- 1 root system 5 Apr 21 22:35 port-rw-r--r-- 1 root system 5 Apr 21 22:35 user-rw-r--r-- 1 root system 15 Apr 21 22:35 vhost-rw-r--r-- 1 root system 14 Apr 21 22:35 realname-rw-r--r-- 1 root system 6 Apr 21 22:34 ircnick-rwxr-xr-x 1 root system 769266 Apr 21 22:29 sh-rw-r--r-- 1 root system 805 Apr 21 22:27 drone.dat-rwxr-xr-x 1 root system 449 Apr 21 22:27 geno.ans-rwxr-xr-x 1 root system 455 Apr 21 22:27 loader24 In this <strong>in</strong>stance, the ctime was probably updated when files were added to the directory.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2-rwxr-xr-x 1 root system 449 Apr 21 22:27 motd-rwxr-xr-x 1 root system 5093 Apr 21 22:27 nickpool-rwxr-xr-x 1 root system 218323 Apr 21 22:27 wget-rw-r--r-- 1 root system 417075 Apr 21 22:27 genocide-AIX-000DC06D4C00-20020419.tar.gz6.2 SyslogSearch<strong>in</strong>g the syslog file for activity on April 21 suggests that <strong>in</strong>truder broke <strong>in</strong> on April21 at 00:06.Apr 21 00:06:03 target rcp[35640]: IMPORT file from host ns1.corpX.com cmd is rcp -f /var/adm/aixkit.tar, localuser adm.However, surround<strong>in</strong>g log entries were stamped April 20 20:06, <strong>in</strong>dicat<strong>in</strong>g that the timestamp on this message is <strong>in</strong>correct (bias error = 5 hours) <strong>and</strong> that the log entry might befabricated (C1).A search for files created on April 20 suggests that the <strong>in</strong>itial <strong>in</strong>trusion occurred on April20 at 20:06.# ls -latc /usr/lib/boot | head -12total 17152-rw------- 1 root system 512 Apr 23 14:06 ssh_r<strong>and</strong>om_seeddrwxr-xr-x 5 b<strong>in</strong> b<strong>in</strong> 1024 Apr 20 20:06 .-rw-r--r-- 1 root system 6 Apr 20 20:06 sshd.pid-rwxr-xr-x 1 root system 3119 Apr 20 20:06 <strong>in</strong>v-r-xr-xr-x 1 root system 213679 Apr 20 20:06 netstat-r-xr-xr-x 1 root system 64506 Apr 20 20:06 psr-rw-r--r-- 1 root system 437 Apr 20 20:06 uconf.<strong>in</strong>v-rw-r--r-- 1 root system 10 Apr 20 20:06 iver-rw-r--r-- 1 root system 880 Apr 20 20:06 ssh_config-rw------- 1 root system 530 Apr 20 20:06 ssh_host_key-rw-r--r-- 1 root system 334 Apr 20 20:06 ssh_host_ke y.pub-rw-r--r-- 1 root system 408 Apr 20 20:06 sshd_config6.3 NetFlowAn analysis of NetFlow logs for the time period gives a more complete underst<strong>and</strong><strong>in</strong>g ofthe sequence of events.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2• On 04/20 between 18:55 <strong>and</strong> 19:11, there was an automated scan fromalcor.hknet.cz of the entire network for vulnerable FTP servers.• On 04/20 between 20:11:39 <strong>and</strong> 20:12:13, an <strong>in</strong>dividual us<strong>in</strong>g alcor.hknet.czga<strong>in</strong>ed unauthorized access to the target system via a vulnerability <strong>in</strong> the FTPserver. The <strong>in</strong>truder downloaded data to the compromised system fromns1.corpX.com apparently us<strong>in</strong>g rcp (port 514) as suggested <strong>in</strong> the questionedsyslog entry:Start End SrcIPaddress SrcP DstIPaddress DstP P Fl Pkts Octets0420.20:11:39.737 0420.20:11:42.861 192.168.34.30 21 195.113.115.170 4113 6 6 2 840420.20:11:39.873 0420.20:11:41.33 192.168.34.30 21 195.113.115.170 4113 6 1 11 27760420.20:11:43.890 0420.20:11:43.890 192.168.34.30 21 195.113.115.170 4123 6 2 1 440420.20:11:44.22 0420.20:11:49.186 192.168.34.30 21 195.113.115.170 4123 6 0 10 27620420.20:11:51.110 0420.20:11:51.110 192.168.34.30 0 17.16.103.2 2048 1 0 1 15000420.20:11:51.110 0420.20:12:13.210 192.168.34.30 1023 17.16.103.2 514 6 3 283 11364• On 04/21 between 22:27 <strong>and</strong> 22:30 an <strong>in</strong>truder accessed a hacked SSH server onport 38290 of the compromised system from Seoul National PolytechnicUniversity (203.246.80.143).• On 04/21 between 22:33:04 <strong>and</strong> 22:33:09, the <strong>in</strong>truder downloaded <strong>and</strong> <strong>in</strong>stalledan IRC bot from ftp.uu.net (192.48.96.9) as recorded <strong>in</strong> the .sh_history file <strong>and</strong><strong>in</strong>stalled the bot a few m<strong>in</strong>utes later. The <strong>in</strong>truder returned several times throughthe hacked SSH server on port 38290:Start End SrcIPaddress SrcP DstIPaddress DstP P Fl Pkts Octets0421.22:27:48.2 0421.22:27:48.2 192.168.34.30 0 203.246.80.143 2048 1 0 1 15000421.22:27:48.2 0421.22:27:48.2 192.168.34.30 38290 203.246.80.143 1022 6 2 1 440421.22:28:28.294 0421.22:28:36.118 192.168.34.30 38290 203.246.80.143 1022 6 0 13 12720421.22:30:56.229 0421.22:31:24.781 192.168.34.30 38290 203.246.80.143 1022 6 0 32 20320421.22:33:04.694 0421.22:33:09.170 192.168.34.30 35334 192.48.96.9 21 6 7 19 9010421.22:33:04.694 0421.22:33:04.694 192.168.34.30 0 192.48.96.9 2048 1 0 1 15000421.22:34:17.123 0421.22:34:18.443 192.168.34.30 38290 203.246.80.143 1022 6 0 7 9080421.22:37:38.798 0421.22:37:38.798 192.168.34.30 0 203.246.80.143 2048 1 0 1 15000421.22:45:59.949 0421.22:46:00.205 192.168.34.30 38290 203.246.80.143 1022 6 1 4 200
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2In summary, log files <strong>and</strong> file system <strong>in</strong>formation on the compromised host gave a senseof when the attack occurred but have low levels of certa<strong>in</strong>ty (C1 – C3). The reliability ofthe data is questionable because the clock on the compromised system was approximately5 m<strong>in</strong>utes slow giv<strong>in</strong>g all timestamps a bias error of -5 m<strong>in</strong>utes. Also, the <strong>in</strong>truder ga<strong>in</strong>edroot access <strong>and</strong> had the ability to change any <strong>in</strong>formation on the system <strong>and</strong> one relevantsyslog entry had a bias error of +5 hours. Additionally, the orig<strong>in</strong> of the attack was notevident from <strong>in</strong>formation on the host. In fact, <strong>in</strong>formation on the host could have led<strong>in</strong>vestigators to <strong>in</strong>correctly assume that ftp.uu.net or ns1.corpX.com.NetFlow logs give a more complete overview of events but a significant amount of datawas not be<strong>in</strong>g collected at the time of the <strong>in</strong>trusion due to the losses shown <strong>in</strong> Figure 5.Therefore, although the NetFlow logs have more accurate date/time stamps, the<strong>in</strong>formation associated with each NetFlow entry has some degree of uncerta<strong>in</strong>ty due tothe high losses (C4). Also, the NetFlow logs <strong>in</strong>dicate that the attack was launched fromalcor.hknet.cz but did not <strong>in</strong>dicate where the <strong>in</strong>truder was physically located. It is likelythat the <strong>in</strong>truder was not proximate to the computer used to launch the attack, furthersupported by the later connections from Korea. Of course, there could be multipleattackers <strong>in</strong> different countries – the po<strong>in</strong>t is that there is a degree of uncerta<strong>in</strong>tyregard<strong>in</strong>g the <strong>in</strong>truder’s physical location <strong>and</strong> other aspects of the <strong>in</strong>trusion that wouldneed to be addressed through further <strong>in</strong>vestigation.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2Notably, if the system clocks on both the router <strong>and</strong> compromised host had not beendocumented, the bias error could be <strong>in</strong>ferred from the data, <strong>and</strong> the discrepancy could bedocumented by stat<strong>in</strong>g that all timestamps have an uncerta<strong>in</strong>ty of +/- 5 m<strong>in</strong>utes (with<strong>in</strong>some level of confidence). This would not significantly alter the conclusions <strong>in</strong> the reportregard<strong>in</strong>g the orig<strong>in</strong> <strong>and</strong> time of the <strong>in</strong>trusion but would help readers underst<strong>and</strong>discrepancies <strong>and</strong> generate subpoenas for the attack<strong>in</strong>g system. For <strong>in</strong>stance, if thesubpoena asked for subscriber <strong>in</strong>formation <strong>and</strong> only provided a specific time (Apr 2020:06 EST) rather than a range of time (Apr 20 20:01 - 20:17), the <strong>in</strong>correct subscriber<strong>in</strong>formation might be given.7. Summary <strong>and</strong> ConclusionsEach layer of abstraction on a network provides a new opportunity for error. The orig<strong>in</strong><strong>and</strong> time of events can be uncerta<strong>in</strong>, errors can occur <strong>in</strong> logg<strong>in</strong>g applications, systemlimitations can exacerbate data loss, <strong>in</strong>dividuals can conceal or fabricate evidence, <strong>and</strong>mistakes can occur <strong>in</strong> data presentation <strong>and</strong> analysis. At the same time, networks providemany opportunities from a forensic st<strong>and</strong>po<strong>in</strong>t. One of the advantages of networks as asource of evidence is that a s<strong>in</strong>gle event leaves traces on multiple systems. Therefore, it ispossible to compare traces from different systems for consistency <strong>and</strong> it is difficult forcrim<strong>in</strong>als to destroy or alter all digital traces of their network activities.To reduce the <strong>in</strong>cidence of <strong>in</strong>correct conclusions based on unreliable or <strong>in</strong>accurate data itis necessary to quantify uncerta<strong>in</strong>ty <strong>and</strong> correct for it whenever possible. In addition tous<strong>in</strong>g corroborat<strong>in</strong>g data from multiple, <strong>in</strong>dependent sources, forensic exam<strong>in</strong>ers shouldattempt to rate their level of confidence <strong>in</strong> the relevant digital evidence. A practical
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2approach to categoriz<strong>in</strong>g uncerta<strong>in</strong>ty <strong>in</strong> digital data was proposed <strong>in</strong> Subsection 2.3, <strong>and</strong>numerous examples were given. Us<strong>in</strong>g this type of systematic method to qualifyconclusions helps decision makers assess the reliability of the <strong>in</strong>formation they are be<strong>in</strong>ggiven <strong>and</strong> anticipates the challenges that will be raised <strong>in</strong> courts as attorneys becomemore familiar with digital evidence. Describ<strong>in</strong>g measures taken to document <strong>and</strong>m<strong>in</strong>imize loss of data can further raise the confidence <strong>in</strong> evidence collected from anetwork.Even when uncerta<strong>in</strong>ty is addressed, completely reliable sources of digital evidence donot exist <strong>and</strong> digital data is circumstantial at best, mak<strong>in</strong>g it difficult to implicate an<strong>in</strong>dividual directly with digital data alone. Therefore, it is generally necessary to seekfirmer proof such as confessions, video surveillance, or physical evidence to overcomethe <strong>in</strong>herent uncerta<strong>in</strong>ty of digital evidence. 25 The amount of support<strong>in</strong>g evidence that isrequired depends on the reliability of the available digital evidence.Ultimately, abid<strong>in</strong>g by the scientific method will help forensic exam<strong>in</strong>ers to avoidegregious errors. Carefully explor<strong>in</strong>g potential sources of error, hypothesis test<strong>in</strong>g, <strong>and</strong>qualify<strong>in</strong>g conclusions with appropriate uncerta<strong>in</strong>ty will protect forensic exam<strong>in</strong>ers fromoverstat<strong>in</strong>g or mis<strong>in</strong>terpret<strong>in</strong>g the facts.AcknowledgementsI am grateful to Professor James Casey at the University of California, Berkeley for acritical read<strong>in</strong>g of the manuscript <strong>and</strong> for <strong>in</strong>troduc<strong>in</strong>g me to the Fujita Tornado scale,25 Some video surveillance is stored <strong>in</strong> digital form <strong>and</strong> is subject to the uncerta<strong>in</strong>ties of any other digitaldata. Time stamps can be <strong>in</strong>correct, the content is mutable, <strong>and</strong> the exam<strong>in</strong>er’s analysis can be <strong>in</strong>correct.
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2which guided my th<strong>in</strong>k<strong>in</strong>g about rat<strong>in</strong>g uncerta<strong>in</strong>ty. Thanks to Andrew Fisher for afruitful discussion regard<strong>in</strong>g statistical analysis of uncerta<strong>in</strong>ty <strong>in</strong> evidence. Thanks also tomy colleagues at Yale University for their <strong>in</strong>sights <strong>and</strong> assistance, especially H. MorrowLong <strong>and</strong> Andy Newman for their technical clarifications, <strong>and</strong> the Data NetworkOperations staff.ReferencesBeckwith, T. G., Marangoni, & R. D., Lienhard, J. H. (1993). Mechanical Measurements5 th Ed., Read<strong>in</strong>g MA: Addison WesleyCasey, E. (2000). <strong>Digital</strong> <strong>Evidence</strong> <strong>and</strong> Computer Crime: Forensic Science, Computers,<strong>and</strong> the Internet, London: Academic PressCasey, E. (2001). H<strong>and</strong>book of Computer Crime Investigation: Forensic Tools <strong>and</strong>Technology, London: Academic PressCERT (1999). Trojan horse version of TCP Wrappers, CERT Advisory CA-1999-01[onl<strong>in</strong>e]. Available: http://www.cert.org/advisories/CA-1999-01.htmlChisum,W. J., & Turvey, B. (2000). <strong>Evidence</strong> dynamics: Locard’s Exchange Pr<strong>in</strong>ciple<strong>and</strong> crime reconstruction, Journal of Behavioral Profil<strong>in</strong>g, Vol. 1, No. 1, 25. RetrievedMay 31, 2002 from http://www.profil<strong>in</strong>g.org/Cisco (2000). NetFlow services <strong>and</strong> applications whitepaper [onl<strong>in</strong>e]. Available:http://www.cisco.com/warp/public/cc/pd/iosw/ioft/neflct/tech/napps_wp.htm.Daubert v. Merrell Dow Pharaceutical, Inc (1993) 113 S. Ct. 2786Microsoft (1999), Vulnerability <strong>in</strong> "ImportExportFavorites", Knowledge Base Article ID:Q241438 [onl<strong>in</strong>e]. Available:http://support.microsoft.com/support/kb/articles/Q241/4/38.ASP
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2NX Networks (1997), LAN/WAN Interface Guide [onl<strong>in</strong>e]. Available:http://www.nxnetworks.com/Docs/software_documentation/openroute_21/lanwan/ethmon4.htmPalmer, G. (2002), Forensic Analysis <strong>in</strong> the <strong>Digital</strong> World, IJDE, Volume 1, Issue 1.Retrieved May 31, 2002 from http://www.ijde.org/gary_article.htmlState of Missouri v. Dunn (1999). Retrieved May 31, 2002 fromhttp://www.missourilawyersweekly.com/mocoa/56028.htmStevens, S. W. (1994). TCP/IP Illustrated, Appendix B, Volume 1: The Protocols.Addison Wesley.Strong, J. W. (1992). McCormick on <strong>Evidence</strong>, section 294 (4th Ed.), West WadsworthTan, J. (2001). Forensic Read<strong>in</strong>ess, @Stake. Retrieved May 31, 2002 fromhttp://@stake.com/research/reports/<strong>in</strong>dex.html#forensic_read<strong>in</strong>essThe Fujita Scale (1999). The Tornado Project. Retrieved May 31, 2002 fromhttp://www.tornadoproject.com/fujitascale/fscale.htm© 2002 International Journal of <strong>Digital</strong> <strong>Evidence</strong>About the AuthorEoghan Casey (eco@corpus-delicti.com) is System Security Adm<strong>in</strong>istrator at YaleUniversity where he <strong>in</strong>vestigates computer <strong>in</strong>trusions, cyberstalk<strong>in</strong>g reports, <strong>and</strong> othercomputer-related crimes, <strong>and</strong> assists <strong>in</strong> the research <strong>and</strong> implementation of universitywide security solutions. He is the author of <strong>Digital</strong> <strong>Evidence</strong> <strong>and</strong> Computer Crime <strong>and</strong>the editor of the H<strong>and</strong>book of Computer Crime Investigation. Mr. Casey is also a partner<strong>in</strong> Knowledge Solutions, LLC. Additional <strong>in</strong>formation about the author is available at
International Journal of <strong>Digital</strong> <strong>Evidence</strong> Summer 2002, Volume 1, Issue 2http://www.corpus-delicti.com/eco/. The views expressed <strong>in</strong> this article are those of theauthors <strong>and</strong> do not necessarily reflect those of Yale University. Address correspondenceto: Eoghan Casey, 175 Whitney Ave., New Haven, CT 06520.