EN100-web
Special theme: Scientific Data Sharing and Re-use<br />
Capturing the Experimental Context<br />
via Research Objects<br />
by Catherine Jones, Brian Matthews and Antony Wilson<br />
Data publication and sharing are becoming accepted parts of the data ecosystem to support<br />
research, and this is becoming recognised in the field of ‘facilities science’. We define facilities<br />
science as that undertaken at large-scale scientific facilities, in particular neutron and synchrotron<br />
x-ray sources, although similar characteristics can also apply to large telescopes, particle physics<br />
institutes, environmental monitoring centres and satellite observation platforms. In facilities<br />
science, a centrally managed set of specialized and high value scientific instruments is made<br />
accessible to users to run experiments which require the particular characteristics of those<br />
instruments.<br />
The institutional nature of the facilities,<br />
with the provision of support infrastructure<br />
and staff, has allowed the facilities<br />
to support their user communities by<br />
systematically providing data acquisition,<br />
management, cataloguing and<br />
access. This has been successful to date;<br />
however, as the expectations of facilities<br />
users and funders develop, this approach<br />
has its limitations in the support of validation<br />
and reuse, and thus we propose to<br />
evolve the focus of the support provided.<br />
A research project produces many outputs<br />
during its lifespan; some are formally<br />
published, some relate to the<br />
administration of the project and some<br />
relate to the stages in the research process.<br />
Changes in culture are encouraging the<br />
re-use of existing data, which means that<br />
data should be kept, and remain discoverable<br />
and usable, for the long term. For example,<br />
a scientist wishing to reuse data may<br />
have discovered the data through a<br />
journal article, but to reuse it they will<br />
also need to understand the analysis<br />
that produced the data behind the<br />
publication. This reuse may happen<br />
years after the original experiment was<br />
undertaken; to make it possible, the<br />
digital data object and its context must<br />
be preserved from the start.<br />
We propose that instead of focussing on<br />
traditional artefacts such as data or publications<br />
as the unit of dissemination,<br />
we elevate the notion of experiment or<br />
‘investigation’, an aggregation of the<br />
artefacts and supporting metadata surrounding<br />
a particular experiment on a<br />
facility, to a first-class object of discourse,<br />
which can be managed, published<br />
and cited in its own right. By providing<br />
this aggregate ‘research object’,<br />
we can offer information at the right<br />
level to support validation and reuse:<br />
it captures the context of a given digital<br />
object and keeps that context available<br />
over the long term.<br />
In the SCAPE project, STFC<br />
has built on the notion of a Research<br />
Object which enables the aggregation of<br />
information about research artefacts.<br />
These are usually represented as a<br />
Linked Data graph; thus RDF is used as<br />
the underlying model and representation,<br />
with a URI used to uniquely identify<br />
artefacts, and OAI-ORE used as an<br />
aggregation container, with standard<br />
vocabularies for provenance, citation<br />
and for facilities investigations. Within<br />
the SCAPE project, the focus of the<br />
research lifecycle is the experiment<br />
undertaken at the ISIS Neutron<br />
Spallation Facility. By following the<br />
lifecycle of a successful beam line<br />
application, we can collect all the artefacts<br />
and objects related to it, with their<br />
appropriate relationships. As this is<br />
strongly related to the allocation of the<br />
facility’s resources, it is a highly<br />
appropriate intellectual unit for the<br />
facility, which wants to record and<br />
evaluate the scientific results arising<br />
from the allocation of its scarce<br />
resources.<br />
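To make the aggregation idea concrete, the following is a minimal sketch of how an investigation might be expressed as a Linked Data graph with an OAI-ORE aggregation. It is an illustration only, not the SCAPE or STFC implementation: the `example.org` URIs, the artefact names and the helper functions `build_research_object` and `to_ntriples` are all assumptions made for this example. Only the vocabulary terms (`ore:Aggregation`, `ore:aggregates`, `dcterms:title`, `prov:wasDerivedFrom`) come from published standards, and the article's facility-specific vocabulary is not shown.

```python
from typing import NamedTuple

# Terms from published vocabularies (RDF, OAI-ORE, Dublin Core, W3C PROV).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"
PROV = "http://www.w3.org/ns/prov#"


class Literal(NamedTuple):
    """A plain string literal, to tell it apart from a URI."""
    value: str


def build_research_object(base, title, artefacts, derivations):
    """Return a list of (subject, predicate, object) triples.

    base        -- URI prefix identifying the investigation (hypothetical)
    artefacts   -- local names of the aggregated artefacts
    derivations -- (derived, source) pairs of local names
    """
    aggregation = base + "aggregation"
    triples = [
        (aggregation, RDF_TYPE, ORE + "Aggregation"),
        (aggregation, DCTERMS + "title", Literal(title)),
    ]
    # Each artefact gets its own URI and is aggregated by the investigation.
    for name in artefacts:
        triples.append((aggregation, ORE + "aggregates", base + name))
    # Provenance links record which artefact was derived from which.
    for derived, source in derivations:
        triples.append((base + derived, PROV + "wasDerivedFrom", base + source))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines (only quotes and backslashes are escaped)."""
    def term(t):
        if isinstance(t, Literal):
            escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
            return '"%s"' % escaped
        return "<%s>" % t
    return "\n".join("%s %s %s ." % (term(s), term(p), term(o)) for s, p, o in triples)


# Invented example: one investigation and four artefacts from its lifecycle.
research_object = build_research_object(
    base="http://example.org/investigation/1234/",
    title="Example neutron scattering investigation",
    artefacts=["proposal", "raw-data", "analysed-data", "publication"],
    derivations=[("analysed-data", "raw-data"), ("publication", "analysed-data")],
)
print(to_ntriples(research_object))
```

Because every artefact is named by a URI and linked to the aggregation, a later user who starts from the publication could in principle follow the provenance links back to the analysed and raw data. A production system would more likely use an RDF library and a full resource map than hand-built triples.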
Figure 1: Schematic of an Investigation Research Object.<br />
ERCIM NEWS 100 January 2015