Dataset compliance

The MQA also undertakes a simple assessment of each dataset's compliance with the DCAT standard. Validation is performed against the metadata stored in CKAN, and users of the dashboard can identify which datasets are non-compliant, and why.

7.5 Choose metrics that help benchmark publisher performance, but don't rely on one metric (e.g. quantity)

Benchmarks relating to the quantity of data published by teams within an organisation, or by an organisation itself, have the benefit of being straightforward to measure. They are a useful mechanism for comparing different Open Data publishers, querying blockages and encouraging friendly competition between departments, cities, regions and countries. They can accelerate a comprehensive data
initiative within an organisation, and drive culture change. They indirectly result in teams developing processes, standards and timetables for publishing Open Data. Large quantity metrics, such as the 8,000 Open Dataset challenge set by Defra in the UK, mean organisations have to get a handle on the data they hold, who is responsible for it and how it is being used.

Relying on quantity measurements alone, however, is usually insufficient. One national portal owner said that they intentionally avoided using the number of datasets as a proxy for measuring the success of the portal, because it was difficult to know exactly what constitutes one dataset, and focussing on quantity can be reductive. Another told us that what a portal should measure depends on what stage it is at: in a portal's first year or two, the number of datasets is the right metric, but once it reaches a certain number those metrics become less important.

Focusing on quantity alone can affect the quality of the datasets published, and how they are used. Just as important is ensuring that the documentation published alongside datasets is accurate and comprehensive, and that data is published in open, machine-readable formats and with the correct licence. Teams may focus too much on publishing the 'low-hanging fruit': old or out-of-date data, aggregate data, or data that is not heavily used within the organisation. As a result, the data that gets published is not widely used, the benefits are not seen by those involved in the initiative, and cynicism sets in.

Case study: Open Defra (UK)

In June 2015, the then Secretary of State for the Department for Environment, Food and Rural Affairs (Defra), Liz Truss, unveiled her vision for Defra to use data more effectively to transform the food, agriculture and environmental sectors.92 Open Data was central to the vision, with the Environment Secretary committing the whole Defra group to publishing 8,000 datasets as Open Data within 12 months (#OpenDefra).
The Defra group comprises 33 agencies and public bodies.93 Many of these bodies contributed to reaching the 8,000 dataset target. By June 2016, Defra had exceeded the target, publishing 11,000 datasets as Open Data. The five biggest contributors to that total were:94

- Natural England – over 2,300 datasets
- Environment Agency – over 1,600 datasets
- Centre for Environment Fisheries and Aquaculture Science (CEFAS) – over 1,200 datasets
- Joint Nature Conservation Committee (JNCC) – over 1,000 datasets
- Rural Payments Agency – over 700 datasets

92 DEFRA, 2015, Environment Secretary unveils vision for Open Data to transform food and farming
93 UK Cabinet Office, Department for Environment, Food & Rural Affairs
94 As of 16 June 2016
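The compliance assessment described under 'Dataset compliance' above can be sketched as a simple validation pass over CKAN-style dataset metadata. This is an illustrative sketch only: the field names and the lists of mandatory properties below are assumptions loosely based on DCAT's mandatory properties (title, description, licence, distribution access URL and format), not the MQA's actual validation rules.

```python
# Hypothetical DCAT compliance check over CKAN-style metadata.
# Field names and mandatory-property lists are illustrative assumptions,
# not the MQA's actual rules.

DATASET_MANDATORY = ["title", "notes", "license_id"]  # dataset-level fields
RESOURCE_MANDATORY = ["url", "format"]                # per-distribution fields

def check_dataset(dataset: dict) -> list[str]:
    """Return a list of compliance problems; an empty list means compliant."""
    problems = []
    for field in DATASET_MANDATORY:
        if not dataset.get(field):
            problems.append(f"missing dataset field: {field}")
    for i, resource in enumerate(dataset.get("resources", [])):
        for field in RESOURCE_MANDATORY:
            if not resource.get(field):
                problems.append(f"resource {i}: missing {field}")
    return problems

# Example: a dataset with no licence, and a resource without a stated format
example = {
    "title": "River levels",
    "notes": "Daily river level readings",
    "resources": [{"url": "https://example.org/data.csv"}],
}
print(check_dataset(example))
```

A dashboard like the MQA's could then aggregate these per-dataset problem lists to show publishers not just which datasets are non-compliant, but why.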