UK Data Archive News Issue 17 September - November 2011

data.archive.ac.uk


UK DATA ARCHIVE

NEWS

September - November 2011

ISSUE 17

PERSISTENT IDENTIFIER PROJECT SET TO ROLL OUT

Work on the UK Data Archive’s Digital Object Identifier (DOI) project is moving forward with a pilot launch in late September 2011. This unique system promises to improve data citations over time.

This work, done in consultation with the DataCite organisation and the British Library, is set to benefit researchers who are discouraged by the temporary nature of many web-based data citations that change or break over time. Our goal is to ensure that descriptions of all revisions to the data remain visible and accessible – while retaining the researchers’ link to the exact instance of the data they originally cited.

HOW THE DOI SYSTEM WORKS

The DOI framework is an international standard for identifying objects in a unique way. By using a DOI, you are ensuring that even if the location of an object changes, the DOI will always link to the same object.

In the new system, a DOI used on its own or as part of a citation will resolve to a ‘change log’ which identifies the full history of amendments made to a study over its lifetime. The study may change for a number of reasons, for example: a new wave is added to a series; we or the data producers have found a way to enrich the usability of the data; the underlying file formats have changed to support preservation; or changes are made in response to requests from our community of users.

To the Archive, a changed digital object is a new digital object, while to researchers a study is an entity which may be expected to grow and improve over time. For this reason, the new system differentiates between ‘low-impact’ changes (such as an amended catalogue record) and ‘high-impact’ changes (for example, updates which could affect the ability to replicate results). In the case of high-impact changes, a new DOI will be generated and the study citation updated.
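As a rough illustration only, the low-impact/high-impact rule described above could be sketched as follows. This is not the Archive's implementation: the `Study` class, the `Impact` labels, the study number and the DOI pattern (including the `10.9999` prefix) are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum


class Impact(Enum):
    LOW = "low"    # e.g. an amended catalogue record
    HIGH = "high"  # e.g. an update that could affect replication of results


@dataclass
class Study:
    study_number: int
    version: int = 1
    change_log: list = field(default_factory=list)

    def doi(self) -> str:
        # Hypothetical DOI pattern; prefix and suffix layout are placeholders.
        return f"10.9999/example-sn-{self.study_number}-{self.version}"

    def record_change(self, description: str, impact: Impact) -> str:
        """Log a change and return the DOI that applies after it."""
        if impact is Impact.HIGH:
            self.version += 1  # high-impact change: a new DOI is generated
        # Every change, low or high impact, is kept in the change log.
        self.change_log.append((self.doi(), impact.value, description))
        return self.doi()


study = Study(study_number=1234)  # hypothetical study number
first = study.doi()
after_low = study.record_change("Catalogue record amended", Impact.LOW)
after_high = study.record_change("New wave added to the series", Impact.HIGH)
```

In this sketch the low-impact change leaves the DOI untouched, while the high-impact change produces a new one; both are recorded in the change log.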

As material is often deposited with access restrictions – and some information is necessarily protected – a DOI cannot take a user directly to the data, so the new system will generate a separate ‘jump’ page providing the full history of the study and each DOI associated with it. A DOI will remain persistent over time, greatly facilitating data citation and improving the visibility of associated research. At the same time, presenting all DOIs related to a study in one location ensures visibility for the most recent versions of the data.

When the system is released publicly later this year, the latest DOI on the jump page will always direct users to the data via its catalogue record. The jump page will be fully integrated into the data catalogue during the planned redevelopment of the ESDS website. In this way, researchers using UK Data Archive citations incorporating DOIs can be confident that they will persistently identify the original data source while providing seamless access to later revisions.
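The jump page behaviour described above can be pictured with a small sketch. It assumes, purely for illustration, that a jump page is a list of DOI entries ordered oldest first, and that only the latest entry carries a link onward to the catalogue record. The function name, field names, DOIs and URL are hypothetical and are not taken from the Archive's system.

```python
from typing import List, NamedTuple


class DoiEntry(NamedTuple):
    doi: str
    note: str


def build_jump_page(study_title: str, entries: List[DoiEntry], catalogue_url: str) -> dict:
    """Describe a jump page: the full DOI history, with the latest DOI
    linking to the data via the catalogue record."""
    if not entries:
        raise ValueError("a study needs at least one DOI")
    latest = entries[-1].doi  # entries are assumed to be ordered oldest first
    return {
        "title": study_title,
        "history": [
            {
                "doi": e.doi,
                "note": e.note,
                "latest": e.doi == latest,
                # Earlier DOIs stay listed but do not link to the data.
                "data_link": catalogue_url if e.doi == latest else None,
            }
            for e in entries
        ],
    }


page = build_jump_page(
    "Example Study (hypothetical)",
    [
        DoiEntry("10.9999/example-sn-1234-1", "Original release"),
        DoiEntry("10.9999/example-sn-1234-2", "New wave added"),
    ],
    "https://catalogue.example.org/study/1234",
)
```

Every DOI ever cited stays visible in the history, so an older citation still identifies the exact version used, while the reader can see that a later version exists.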

THE UK’S LARGEST COLLECTION OF DIGITAL RESEARCH DATA IN THE SOCIAL SCIENCES AND HUMANITIES



ARCHIVE COLLABORATING WITH MRC ON DATA SHARING

As social science researchers gear up to meet new data management and sharing policy guidelines, so too are medical researchers. And now the Medical Research Council (MRC) is looking to a UK Data Archive expert to help it develop a support service for researchers working with population and patient cohorts.

The MRC is developing its Data Support Service with hands-on input from Veerle Van den Eynden, who currently manages the Archive’s Research Data Management Support Services. Over the next nine months, Veerle will be seconded to the MRC Unit for Lifelong Health and Ageing to work with MRC researchers, data managers and with the MRC head office.

The project focuses on population-based research data, working in depth with key studies and extending work on access and sharing already being pioneered by teams such as those managing the MRC National Survey of Health and Development. In the last two years an online gateway for the discovery of cohort studies and their variables, along with an underlying directory, has been developed for MRC by a consortium from the Science and Technology Facilities Council (STFC), Oxford University and University College London.

In the coming months, the gateway will be further developed and made available to researchers along with information on associated tools and standards.

Key to supporting MRC scientists will be working closely with them on requirements. Veerle will help MRC develop a network of data managers and data scientists to share expertise and best practices, and then provide practical guidance and tools to facilitate data sharing and data management planning.

These activities build on her previous work at the UK Data Archive and for the Data Support Service of the Rural Economy and Land Use Programme (Relu-DSS), where she provided support, guidance and training on data management to researchers in order to promote good data practices and optimise data sharing.

This is an exciting opportunity for the Archive to work closely again with the MRC in furthering its support for value-added data sharing.

Pictured: Veerle Van den Eynden

BEHIND THE SCENES AT THE ARCHIVE: THE DATA SERVICES TEAM

The Data Services team are highly specialised data professionals who work at the core of the Archive to validate and enhance ESDS social science data deposits. The aim of the team is to prepare high quality datasets for curation, enabling users to access all of the data and documentation they need to facilitate their research.

Here’s how it works: Once data are acquired, they are passed to the Data Services team, who are responsible for the Archive’s ingest activities. They carry out data validation and quality control and convert the data into a form suitable for both long-term preservation and immediate access. Each data collection is unique and requires specific research and data conditioning so that the resulting preservation and dissemination formats are of the highest order to enable secondary analysts to make informed use of the data.

Once the Submission Information Package (SIP) is ready for ingest, the Data Services team’s role in the data lifecycle begins. Data deposits, both quantitative and qualitative, are carefully validated and checked to ensure data integrity, anonymisation and confidentiality. Any anomalies found are resolved in collaboration with the data depositor. Data enrichment tasks such as improving labelling, creating additional metadata, grouping survey variables, creating data lists and transferring to a preferred dissemination format are carried out by the team. They then assign keyword search terms that link to the HASSET thesaurus, and create and update online study descriptions for each dataset.

When checks are complete, data files are migrated into several preferred standard dissemination/preservation formats, depending on their original deposit condition. Some quantitative studies are mounted in Nesstar, the Archive’s premier online data browsing tool. Nesstar allows users to explore a range of social survey data, view variable frequencies and question text, and produce online tabulations and graphs.

The team also ensure that each study includes sufficient documentation. This may include questionnaires, methodological information, interview schedules and research reports to accompany the data, forming the complete Dissemination Information Package (DIP).

When data conditioning and curation are complete, the Archival Information Package (AIP) is submitted to the Archive’s preservation system for long-term management. As new data are added – or the required data formats migrate forward in response to changes in the needs of the user community – the files from the preservation system form the basis for the generation and update of studies into new access formats.

Version control is then applied to each data revision to ensure all information is retained. It is crucial to the Archive to ensure that deposited data are securely preserved, and that persistent data access is maintained, allowing re-use and citation of every major version of the data.

The Archive has been curating data for more than 40 years. The Data Services team play an integral part in assuring that our digital data are enriched, preserved, and accessible.

[Figure: The data lifecycle: creating data, processing data, analysing data, preserving data, giving access to data, re-using data]

Learn more at www.data-archive.ac.uk/curate/process

NEW TOOLKIT HELPS KEEP RESEARCH DATA SAFE

The JISC-funded Keeping Research Data Safe (KRDS) project is rolling forward with significant contributions from the UK Data Archive and partners. The latest phase involves extending the KRDS Benefits Framework and developing another advanced tool to assess not only benefits, but also impact.

The KRDS Benefits Framework aims to provide a coherent structure for those involved in digital preservation to demonstrate the direct benefits of their services. The Archive used it successfully as part of the review of its flagship service, the ESDS.

During that review, it became clear that the Benefits Framework did not go far enough, as it does not provide the means to capture or measure the impact of a service or the data which are being provided. This led to the KRDS project incorporating additional methodological features from another tool.

The Value Chain Tool extends the value of the Benefits Framework. It provides a more precise statement of the benefits of any archive-related activity and it allows generic benefits to be reconfigured into organisation-specific benefits. Once completed it can be used to show how the data service or repository provides additional value, as well as how the resulting impact could potentially be measured or demonstrated as a case study. The format of the tool focuses attention on the measurable impacts, but without excluding the possibility of a more qualitative interpretation of impacts. With the estimation of an impact weighting score, it assists in making decisions about how best to prioritise activities in order to both maximise benefit and impact and demonstrate value for money.

We believe that the Benefits Framework and its more advanced tool should have significant use within the data service community. It influenced the ESRC’s call for the economic impact of research data infrastructure and it will influence the UK Data Archive’s implementation of its strategic plan.

In the coming months we will be feeding some of the elements of both tools into the revisions of our advice to researchers on research data management planning, especially complementing work already carried out on costing research data management.

Full details of the tools and the Archive case study are available at beagrie.com/krds-i2s2.php

ARCHIVE BOOSTS QUALITATIVE RESEARCH AT ESSEX SUMMER SCHOOL

The University of Essex’s 44th annual summer school for social science data analysis got a boost this year from some Archive experts in qualitative data.

Staff led two half-day workshops to introduce the Archive’s ESDS Qualidata collections and teaching resources. This included discussions of working with recently archived data, such as research into flooding in the UK (SN 6605 Flood, Vulnerability and Resilience, 2007-2009).

The session also covered ongoing work in digitising collections originally deposited on paper and making them available digitally for teaching and analysis. Examples used included research into Italian cultural life (SN 6479 Oral History of Cultural Consumption in Italy, 1936-1954) and Peter Townsend’s seminal research into residential care for the elderly (SN 4750 Last Refuge, 1958-1959).

Learn more about the ESDS Qualidata service and its collection: www.esds.ac.uk/qualidata/

www.data-archive.ac.uk



ARCHIVE HELPS SET THE BAR FOR TRUSTED DIGITAL REPOSITORIES

An international team of test auditors gathered at the UK Data Archive in June 2011 to examine our structures, policies and procedures against a new international security standard for trusted digital repositories.

The ISO 16363 standard extends the reference model for an Open Archival Information System (OAIS), which guides best practice in the preservation community. Since the Archive is registered to ISO 27001 for information security – and has passed the self-audit of the Data Seal of Approval (DSA) – this voluntary audit against the proposed new standard represents the next step in best practice for digital repositories.

Our motivation was to demonstrate a commitment to developing – and adhering to – appropriate standards that can be widely recognised and adopted. We clearly have a vested interest in being confident that sister archives, project partners and providers of technical registries related to preservation can be deemed trustworthy as a result of self-audit or full certification against a recognised standard.

The standard covers all processes necessary to demonstrate that an organisation can be trusted by depositors, funders and users. This includes everything from organisational infrastructure and security risk management to digital object management.

The Archive was the only repository in the UK chosen to undertake this test audit. The results will be used to inform and fortify the APARSEN project, which seeks to lower barriers to long-term accessibility and usability of digital information and data through the development of a Virtual Centre of Digital Preservation Excellence.

Archive Director Matthew Woollard and Metadata Manager Herve L’Hours hosted the test auditors, a team with a vast range of experience in data management, archiving, preservation and standardisation.

Learn more about the APARSEN project and test audit at www.data-archive.ac.uk/about/projects/aparsen

DDI MEETS QUALITATIVE DATA

Developers who are working to implement the Data Documentation Initiative (DDI) are invited to contribute to a three-day meeting in December dedicated to formally describing qualitative data.

The meeting will continue the work of the DDI Alliance working group on qualitative data and will focus on a possible extension of the current DDI-Lifecycle specification (DDI 3 branch).

Qualitative data present challenges for systematic description as they are typically unstructured and complex, and capturing adequate context is difficult. The UK Data Archive has spent some years working on descriptive schema – such as QuDex, currently being tested by various applications – and encourages the move by the DDI Alliance to welcome qualitative data into its house.

The event is set for 7-9 December, immediately following the European DDI 2011 Conference (EDDI11) in Gothenburg, Sweden. It’s open to anybody actively working in this area of metadata, especially technical developers working on archival data collections.

For more on the conference, go to www.iza.org/conference_files/EDDI2011/call_for_papers

Learn more about QuDex at dext.data-archive.ac.uk/schema/schema.asp

UK DATA ARCHIVE
UNIVERSITY OF ESSEX
WIVENHOE PARK
COLCHESTER
ESSEX CO4 3SQ, UK

T +44 (0) 1206 872001
E comms@data-archive.ac.uk

ISSN 1755-8190 (print)
ISSN 1755-8204 (online)

WE ARE SUPPORTED BY THE UNIVERSITY OF ESSEX, THE ECONOMIC AND SOCIAL RESEARCH COUNCIL AND THE JOINT INFORMATION SYSTEMS COMMITTEE
