Hadoop Development - CSC
Real world example - a Cyber problem (cont’d)
• As always, there is more than one way to approach this, but one simple way would seem to be to have a map-reduce job that takes as input:
– The auditd logs, and<br />
– A list of parent process ids that we want to trace back<br />
• and which outputs a file that contains:
– A map from each parent process-id (the key) to the list of auditd events that have this process-id as their ppid entry.
– A list of the parent process-ids found that also have "terminal" entries (i.e. for which we have now found the user).
• We can then run this map-reduce job repeatedly, feeding the output of one run, together with the auditd logs, as input to the next, until there is no difference in the output file between two consecutive runs, or until the output file is empty (whichever occurs first).
(Note that since we will be dealing with audit logs from many different servers, we will need to combine the IP address of the server with the process-id to form a unique key.)
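The iteration described above can be sketched in plain Python as an in-memory stand-in for the Hadoop job. This is only an illustration under stated assumptions, not the job itself: the event layout (dicts with host, pid, ppid and tty fields), the function names run_job and trace, and the max_runs guard are all hypothetical. The slide also does not spell out how the next run's list of process-ids is derived from the output; the sketch assumes each unresolved process-id is replaced by its own parent, found from events whose pid matches it, and is kept unchanged when no such event exists.

```python
from collections import defaultdict

NO_TTY = "(none)"  # value auditd prints in the tty field when there is no terminal


def run_job(events, targets):
    """One simulated map-reduce pass over simplified audit events.

    events  -- list of dicts with keys host, pid, ppid, tty (assumed layout)
    targets -- set of (host, process-id) keys that we want to trace back
    """
    # Map: emit each event under its (host, ppid) key if that key is a target.
    grouped = defaultdict(list)
    for ev in events:
        key = (ev["host"], ev["ppid"])
        if key in targets:
            grouped[key].append(ev)
    # Reduce: a key is resolved once any of its events carries a terminal.
    resolved = {key for key, evs in grouped.items()
                if any(e["tty"] != NO_TTY for e in evs)}
    return dict(grouped), resolved


def trace(events, targets, max_runs=50):
    """Repeat run_job until nothing is left to trace or nothing changes."""
    events = list(events)
    targets, found = set(targets), set()
    for _ in range(max_runs):  # hypothetical guard against endless loops
        if not targets:
            break  # empty output: every key has been resolved
        _, resolved = run_job(events, targets)
        found |= resolved
        # Assumption: step each unresolved key back to its own parent;
        # keep the key as it is when the logs hold no event for that pid.
        pending = set()
        for key in targets - resolved:
            parents = {(e["host"], e["ppid"]) for e in events
                       if (e["host"], e["pid"]) == key}
            pending |= parents or {key}
        if pending == targets:
            break  # no difference between two consecutive runs
        targets = pending
    return found, targets
```

Calling trace(events, {(host, ppid), ...}) returns two sets: the keys for which a terminal entry was found, and the keys that could not be traced any further. Using the (host, process-id) pair as the key follows the note above about combining the server's IP address with the process-id.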
TBSC 2009<br />