

[Figure 4: Relationships between the main HARMUR dataset objects (ThreatGroup, Content, Server, Domain, Threat, ServerState, SecurityState, ThreatClass), as they are implemented in the current version of the WOMBAT API [24].]

Adversaries' Holy Grail: Access Control Analytics∗

Ian Molloy, Jorge Lobo, Suresh Chari
IBM T.J. Watson Research Center
{molloyim, jlobo, schari}@us.ibm.com

ABSTRACT

The analysis of access control data has many applications in information security, including role mining and policy learning, discovering errors in deployed policies, regulatory compliance, intrusion detection, and risk mitigation. The success of research in these areas hinges on the availability of high-quality real-world data. Thus far, little access control data has been released to the public. We analyze eight publicly released access control datasets and contrast them with three client policies in our possession. Our analysis indicates that there are many differences in the structure and distribution of permissions between the public and client datasets, including sparseness, permission distributions, and cohesion. The client datasets also revealed a wide range of semantics and granularities of permissions, ranging from application-specific rights to general accounts on systems, which we could not observe in the public data due to anonymization. Finally, we analyze the distribution of user attributes, which the public datasets lack. We find that techniques that work well on some datasets do not work equally well on others, and we discuss possible future research directions based on our experience with real-world data.

Categories and Subject Descriptors

D.4.6 [Operating Systems]: Security and Protection—Access Controls

General Terms

Experimentation, Security

Keywords

Access Control, Real-World Data, RBAC, Analysis, Metrics

∗ This research is conducted through participation in the International Technology Alliance sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defense and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defense, or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
BADGERS'11, April 10–13, 2011, Salzburg, Austria.
Copyright 2011 ACM 978-1-4503-0615-7 ...$10.00.

1. INTRODUCTION

Provisioning entitlements in an organization is a challenging task, especially in organizations with thousands of users and tens of thousands of resources. A common solution is to use role-based access control: roles are assigned permissions, and users are authorized to roles. However, before roles can be used, they must first be defined by administrators, a challenging and time-consuming task known as role engineering. Simplifying the role engineering task has become an active area of research. Many, including ourselves, have investigated how to apply data mining and analytics techniques to existing data [3, 4, 6, 8–11, 13, 16, 17].
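The role engineering problem sketched above is commonly cast as boolean matrix decomposition: find a user-role assignment UA and a role-permission assignment PA whose boolean product reproduces the observed user-permission assignment UPA. The following Python sketch illustrates that formulation on toy data; it is not the mining algorithm of any of the cited works, and the names and example matrices are illustrative assumptions.

# Minimal sketch (assumed formulation, not any cited algorithm): role mining
# viewed as boolean matrix decomposition. A user-permission assignment UPA is
# covered by a user-role assignment UA and a role-permission assignment PA
# when UPA equals the boolean matrix product of UA and PA.

def boolean_product(ua, pa):
    """User u holds permission p iff some role r is assigned to u and grants p."""
    n_users, n_roles, n_perms = len(ua), len(pa), len(pa[0])
    return [[int(any(ua[u][r] and pa[r][p] for r in range(n_roles)))
             for p in range(n_perms)]
            for u in range(n_users)]

# Toy policy: 4 users x 5 permissions (illustrative data only).
UPA = [[1, 1, 1, 0, 0],
       [1, 1, 1, 0, 0],
       [0, 0, 1, 1, 1],
       [1, 1, 1, 1, 1]]

# Candidate decomposition with two roles.
UA = [[1, 0],
      [1, 0],
      [0, 1],
      [1, 1]]
PA = [[1, 1, 1, 0, 0],   # role 0 grants the first three permissions
      [0, 0, 1, 1, 1]]   # role 1 grants the last three permissions

print(boolean_product(UA, PA) == UPA)   # True: two roles exactly cover this toy policy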
We have spent the past four years building and validating our own techniques and others' solutions on real customer data. The best way to validate academic work is with real data, using appropriate metrics that assess the quality and fitness of solutions to real-world problems. In this paper we discuss our experience with customer access control data and some of the differences we have observed between real-world data and the assumptions made in theoretical work. We analyze and compare eleven access control datasets: eight have been publicly released, and three are confidential policies from clients. We found that the public and private data differ in several key aspects that critically impacted the utility of well-studied solutions on private data. Key differences, illustrated by the sketch that follows this list, include:

• Customer data is more sparse, assigning users a small fraction of the entitlements.
• Public data is more compressible, allowing it to be expressed with a smaller relative number of roles.
• Customer data has higher entropy; clusters of users and permissions are less well defined, smaller, and lack cohesion.
• Customer data has a long-tail distribution, while groups of permissions in the public data are assigned to similar numbers of users.
• The granularity of permissions varies, e.g., from accounts on systems to columns of tables in databases.
• There is known noise in the private data, while the anonymization of the public data makes identifying noise difficult.
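As a rough illustration of the kinds of measurements behind these comparisons, the sketch below computes a policy's density (the complement of sparseness) and its per-permission popularity distribution, whose skew indicates a long tail. The exact metrics used in this paper are not reproduced here; the function names, the toy matrix, and the interpretive comments are assumptions for illustration only.

# Hedged sketch of simple policy measurements; the metric definitions below
# (density, permission popularity) are illustrative assumptions, not the
# exact measures reported in the paper.

def density(upa):
    """Fraction of all (user, permission) pairs that are assigned.
    Low density corresponds to the sparseness observed in customer data."""
    n_users, n_perms = len(upa), len(upa[0])
    assigned = sum(sum(row) for row in upa)
    return assigned / (n_users * n_perms)

def permission_popularity(upa):
    """Number of users holding each permission, sorted in descending order.
    A long-tail policy has a few popular permissions and many rare ones."""
    n_perms = len(upa[0])
    return sorted((sum(row[p] for row in upa) for p in range(n_perms)),
                  reverse=True)

# Toy policy reused for illustration (4 users x 5 permissions).
UPA = [[1, 1, 1, 0, 0],
       [1, 1, 1, 0, 0],
       [0, 0, 1, 1, 1],
       [1, 1, 1, 1, 1]]

print(density(UPA))                # 0.7: 14 of the 20 possible assignments are made
print(permission_popularity(UPA))  # [4, 3, 3, 2, 2]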
