Presentation: Data preservation and security - UK Data Archive

data.archive.ac.uk

DATA PRESERVATION AND SECURITY

……………………………………………………….………………………………..................................................................................................

MIKE KING

………………………………………………………………………...

DIGITAL PRESERVATION, SYSTEMS AND SECURITY

UK DATA ARCHIVE

UNIVERSITY OF ESSEX

………………………………………................................................

HOW TO SET UP A DATA SERVICE, UK DATA ARCHIVE

3-5 JULY 2013


TWO SIDES OF THE SECTION

……………………………………………………………………………………………………………………………….……………………………..

• Digital Preservation

• Systems and security

… the quick 45-minute version

…………………………………………………………………………………………………………………………………………………………..…

UK DATA ARCHIVE


DIGITAL PRESERVATION - OVERVIEW

……………………………………………………………………………………………………………………………….……………………………..

What we do:

• maintain and preserve material acquired and ingested into the archive

• this data is stored in a non-proprietary, safe and secure format that can be easily accessed and migrated for secondary research



DPS OVERVIEW – HUB OF THE ARCHIVE

……………………………………………………………………………………………………………………………….……………………………..

[Diagram labels: Research and Development; Data Services; Systems and Preservation; Support Services; Projects; Data In; Data Out]



DIGITAL PRESERVATION - WHAT

……………………………………………………………………………………………………………………………….……………………………..

We currently preserve

• approximately 6,336 studies

• occupying about 1,632 GB, with capacity for more than 8 TB on the main system

• 1,255,814 files, 113,832 directories (average file size 1.29 MB)

• growing by about 120 GB per year

In 44 years we have not lost any data!
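
The figures on this slide can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 GB = 1000 MB) and treats the quoted "more than 8 TB" as exactly 8,000 GB, so the headroom it reports is a lower bound. The function names are made up for this example.

```python
# Figures quoted on the slide. Decimal units are an assumption, and the
# capacity is the slide's lower bound ("more than 8 TB").
USED_GB = 1_632
CAPACITY_GB = 8_000
FILES = 1_255_814
GROWTH_GB_PER_YEAR = 120


def average_file_size_mb(used_gb: float, files: int) -> float:
    """Mean file size in MB, assuming 1 GB = 1000 MB."""
    return used_gb * 1000 / files


def years_of_headroom(used_gb: float, capacity_gb: float,
                      growth_gb_per_year: float) -> float:
    """Years until capacity is reached at a constant growth rate."""
    return (capacity_gb - used_gb) / growth_gb_per_year


if __name__ == "__main__":
    print(f"average file size: {average_file_size_mb(USED_GB, FILES):.2f} MB")
    print(f"headroom: {years_of_headroom(USED_GB, CAPACITY_GB, GROWTH_GB_PER_YEAR):.1f} years")
```

Under these assumptions the average file size comes out close to the quoted 1.29 MB, and the main system has several decades of headroom at the current growth rate.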



DIGITAL PRESERVATION – WHY

……………………………………………………………………………………………………………………………….……………………………..

• studies created by depositors can be used for secondary research as soon as they are processed, and for many years to come, through an active file format migration strategy

• depositors do not have the infrastructure to maintain the material deposited or keep it usable over time

• economies of scale allow us to support end users with queries

• end users have a consistent, secure structure for gaining access to data from different depositors



HOW: FILE STORAGE

……………………………………………………………………………………………………………………………….……………………………..

• standard and consistent directory structure for all preserved data

• everything is in a specific location – all studies can be treated the same, and the consistent structure makes it easy to locate information precisely

• makes caching of specific information types simple

• splits unrestricted documentation from “protect and above” data

• makes future migration to other systems and formats easier

• data and documentation stored in portable format

• ability to freely and intelligently read on many platforms

• easier conversion to required format

• easier migration to new portable format
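
A consistent layout means that the location of any item can be computed from the study number alone. The sketch below is a minimal illustration of that idea, not the Archive's actual tooling: the directory names (`mrdoc`, `stata/stata8`, `read<number>.htm`) are borrowed from the DIP bundle listing for study 6221 later in this presentation, while the archive root and both function names are hypothetical.

```python
from pathlib import Path


def study_locations(root: Path, study_number: int) -> dict[str, Path]:
    """Expected locations for one study under a consistent layout.

    The sub-paths follow the study 6221 example; `root` is a hypothetical
    archive root, not a real Archive path.
    """
    study = root / str(study_number)
    return {
        "read_file": study / f"read{study_number}.htm",
        "documentation": study / "mrdoc",
        "stata8_data": study / "stata" / "stata8",
    }


def missing_locations(root: Path, study_number: int) -> list[str]:
    """Names of expected locations that do not exist on disk, sorted."""
    locations = study_locations(root, study_number)
    return sorted(name for name, path in locations.items() if not path.exists())
```

Because every study is laid out the same way, one function covers all of them, and a check such as `missing_locations(root, 6221)` can flag a study that deviates from the structure.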



HOW: CONSISTENT DIRECTORY STRUCTURE

……………………………………………………………………………………………………………………………….……………………………..

[Diagram of a study directory. Labels: Study Number; Note and Read files; Data format files (SPSS exp, SAS, SIR); Original deposited format; mrdoc; Processing information and control files; Machine readable document files (pdf, word, ascii)]



EXAMPLE STUDY 6536 (TOP LEVEL)

……………………………………………………………………………………………………………………………….……………………………..



HOW: DATA RESILIENCE

……………………………………………………………………………………………………………………………….……………………………..

Multi-copy and multi-version resilience

• four copies on separate media in two independent server rooms

• historical copies of older versions which have been superseded or deleted

• highly resilient RAID 5 or 6 disk protection strategy on all servers

• near-site copy

• all with automated preservation metadata to confirm that all copies are identical (dates, location and MD5 checksum)

• backed by anti-virus protection, firewalls, and intrusion and failure detection systems
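
The checksum comparison mentioned above can be sketched in a few lines. This is an illustrative example using Python's standard `hashlib`, not the Archive's preservation software; the function names are invented for the sketch, and it covers only the MD5 part of the check, not dates or locations.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 hex digest of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_identical(copies: list[Path]) -> bool:
    """True when every listed copy has the same MD5 checksum.

    An empty list yields False, since there is nothing to confirm.
    """
    return len({md5_of(path) for path in copies}) == 1
```

In a four-copy arrangement like the one described, `copies_identical` would be called with the four paths of the same file; a mismatch signals that at least one copy needs to be investigated and restored from the others.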



SEGREGATION AND MULTI RESILIENCE (SIMPLIFIED VIEW)

……………………………………………………………………………………………………………………………….……………………………..

[Diagram labels: Database Server; Offsite; Study Updates; Web Server; General User Web Access; Restricted Data Download; Front end; Main Preservation; Unrestricted Documentation; Near-Site; Main Preservation 2; Archive Staff]



……………………………………………………………………………………………………………………………….……………………………..

Preservation Workflow



Preservation System Workflow (DETAILED)

……………………………………………………………………………………………………………………………….……………………………..



PRE-PRESERVATION PROCESSING

……………………………………………………………………………………………………………………………….……………………………..



PLATTERING – UPDATING PRESERVATION SYSTEM

……………………………………………………………………………………………………………………………….……………………………..



DUPLICATING AND VERIFYING IDENTICAL

……………………………………………………………………………………………………………………………….……………………………..



OFFSITE AND EARLIER VERSIONS

……………………………………………………………………………………………………………………………….……………………………..



CREATING DIP (DELIVERY PACKAGE)

……………………………………………………………………………………………………………………………….……………………………..



DIP BUNDLE FOR 6221

……………………………………………………………………………………………………………………………….……………………………..

6221stata8_b77d59ae50facc93eafbace4970d85e3.zip

Sfe - 6221/dp/download/6221_file_information.rtf

doc – mrdoc/UKDA/UKDA_Study_6221_Information.htm

Sfe - 6221/mrdoc/pd

Sfe - 6221/mrdoc/allissue/*

Sfe - 6221/read6221.htm

Sfe - 6221/stata/stata8/*
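
As a rough illustration of how a delivery bundle like this might be assembled, the sketch below zips every file under a study directory that matches a list of glob patterns. It is an assumption-laden example: the function name and arguments are hypothetical, the patterns are modelled on the paths listed above, and the 32-character hexadecimal suffix in the real bundle name is not explained on the slide, so the sketch does not attempt to reproduce it.

```python
import zipfile
from pathlib import Path


def build_bundle(study_dir: Path, patterns: list[str], bundle_path: Path) -> list[str]:
    """Zip every file under study_dir matching any pattern.

    Archive member names are kept relative to the parent of study_dir, so
    they start with the study number (for example "6221/read6221.htm").
    Returns the member names that were written, in sorted order.
    """
    members = sorted({
        path
        for pattern in patterns
        for path in study_dir.glob(pattern)
        if path.is_file()
    })
    names = [m.relative_to(study_dir.parent).as_posix() for m in members]
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for member, name in zip(members, names):
            bundle.write(member, name)
    return names
```

A call such as `build_bundle(Path("6221"), ["read6221.htm", "mrdoc/allissue/*", "stata/stata8/*"], Path("bundle.zip"))` would collect the read file, the documentation and the Stata 8 data into one download, leaving other formats out of the package.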



PRESERVATION POLICY

……………………………………………………………………………………………………………………………….……………………………..

UK Data Archive Preservation policy

http://www.data-archive.ac.uk/media/54776/ukda062-dps_preservationpolicy.pdf



DP – SYSTEMS

……………………………………………………………………………………………………………………………….……………………………..

Systems



SYSTEMS AND IT INFRASTRUCTURE

……………………………………………………………………………………………………………………………….……………………………..

We currently support:

• two partially mirrored server rooms

• over 60 servers in 11 racks

• over 20 network switches

• four firewalls

• four SANs

• 77 desktop PCs

• 33 laptops

• 33 printers

• one Access Grid Node (AGN)

• four AV presentation rooms

• 30+ TB of data, of which 8 TB is unique



SERVERS AND THEIR ENVIRONMENT

……………………………………………………………………………………………………………………………….……………………………..

Servers

• each of our servers is based in one of the two server rooms. These two rooms are independent of each other and are protected as follows:

• three-level physical intruder prevention system

- magnetic lock

- PIR and contact break alarm system

- good old-fashioned key lock

• fire detection and extinguishing system

• multi-point failover air-conditioning system with automated thermal shutdown

• uninterruptible power supply protection with dual failover recovery and automatic shutdown of servers



NETWORK INFRASTRUCTURE

……………………………………………………………………………………………………………………………….……………………………..

• the Archive operates 23 Gigabit network switches supplying Gigabit connections to the desktop

• each rack in the server room is fed by two load-balanced switches providing multiple network redundancy to our servers

• the feed to and from the University is now provided by a dual resilient 10 Gigabit connection to the East Anglia Metropolitan Area Network, which links into SuperJANET



SYSTEMS – NETWORK INFRASTRUCTURE

……………………………………………………………………………………………………………………………….……………………………..



DESKTOP SUPPORT

……………………………………………………………………………………………………………………………….……………………………..

We currently support

• 77 desktop PCs

- all PCs have a consistent base install with a core set of applications. Additional software can be requested.

• 33 laptops

- some laptops are assigned to individual users and others can be booked from a pool

• 33 printers

- we are slowly phasing out individual printers in favour of faster central ones in breakout areas



DPS – STANDARDS COMPLIANCE

……………………………………………………………………………………………………………………………….……………………………..

• ISO/IEC 27001:2005 certification

- Information Security Management

• ISO/DIS 14721

- Open archival information system (OAIS) - Reference model

• ISO 15489-1:2001

- Information and documentation - Records management

Details of these standards are outlined in the Archive’s preservation policy, which is supplied on your memory key



DPS HELPDESK

……………………………………………………………………………………………………………………………….……………………………..

http://www.esds.ac.uk:4000/



DPS HELPDESK

……………………………………………………………………………………………………………………………….……………………………..

• allows the whole DPS team to work on and monitor jobs

• allows DPS to track requests for security and auditing purposes

• serves DPS as a knowledge base for resolving similar problems in future



SECURITY: INFORMATION CLASSIFICATION

……………………………………………………………………………………………………………………………….……………………………..

• the Archive information classification policy is based on the UK Government Protective Marking System

http://www.cabinetoffice.gov.uk/spf/sp2_pmac.aspx

• the Archive system comprises six markings. In ascending order of sensitivity they are:

• UNRESTRICTED

• PROTECT

• RESTRICTED

• CONFIDENTIAL (We do not hold data in this classification)

• SECRET (We do not hold data in this classification)

• TOP SECRET (We do not hold data in this classification)

‘Unclassified’ material may be unmarked, or it may be marked ‘UNCLASSIFIED’, ‘NOT PROTECTIVELY MARKED’ or ‘NON-PROTECTIVELY MARKED’ to indicate positively that a protective marking is not needed
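
Because the markings are strictly ordered, a rule such as "PROTECT or above" reduces to a simple comparison. The sketch below models the six markings as an ordered enumeration. Only the ordering comes from the slide: the numeric values, the class name and the helper function are assumptions made for illustration.

```python
from enum import IntEnum


class Marking(IntEnum):
    """The six markings in ascending order of sensitivity.

    The integer values are arbitrary; only their order matters.
    """
    UNRESTRICTED = 0
    PROTECT = 1
    RESTRICTED = 2
    CONFIDENTIAL = 3
    SECRET = 4
    TOP_SECRET = 5


def is_protect_or_above(marking: Marking) -> bool:
    """True when material carries a marking of PROTECT or higher."""
    return marking >= Marking.PROTECT
```

A check like `is_protect_or_above(Marking.RESTRICTED)` can then gate any handling step that applies only to PROTECT-and-above material, without listing the individual markings each time.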



SECURITY: HOW WE PROTECT DATA (1)

……………………………………………………………………………………………………………………………….……………………………..

• Encryption

• backup copies of all our data are protected using 256-bit encryption

• data exchanged with the Archive classified as PROTECT or above is encrypted using a variety of applications

• for incoming data we are reliant on our depositors following our published guidelines. This does not always happen, and depositor/user education is an ongoing process



SECURITY: HOW WE PROTECT DATA (2)

……………………………………………………………………………………………………………………………….……………………………..

Encryption

The Archive’s preferred encryption applications are TrueCrypt and PGP (our public key is published on our web site)

Other organisations have standardised on different encryption applications and currently we support:

• PGP http://www.pgp.com/

• TrueCrypt http://www.truecrypt.org/

• PrivateCrypto http://tinyurl.com/394otjt

• SafeHouse Explorer http://www.safehousesoftware.com/



SECURITY: HOW WE PROTECT DATA (3)

……………………………………………………………………………………………………………………………….……………………………..

• Encryption

• all Archive laptops are protected by full disk encryption and encrypted containers

• SDS workstations are also protected by full disk encryption and encrypted containers

• encrypted USB keys are supplied as necessary and are securely erased when they are returned. For USB keys we use encrypted containers which do not require admin rights to run the software



SECURITY: DATA DESTRUCTION

……………………………………………………………………………………………………………………………….……………………………..

• all media which have ever contained PROTECT or above data must be securely disposed of when no longer needed

• a brief outline of appropriate disposal:

- paper, CD and DVD – shredded in a cross-cut shredder certified to a level appropriate to the data classification

- hard drives – both the drive electronics and the data platters should be destroyed by commercial shredding. However, for small quantities, destruction with a large hammer can be just as effective!

- magnetic tape media – bulk degaussing or shredding



QUESTIONS

……………………………………………………………………………………………………………………………….……………………………..
