icegov2012 proceedings
approach [28] to expose indicators that play a significant role in<br />
releasing data to the public at the four organizations. Important<br />
elements in this approach are desktop research, workshops,<br />
questionnaires, and in-depth interviews with key persons at<br />
different levels of the organizations. We have found that<br />
important indicators for data release are the way data is stored in<br />
an organization (distributed/decentralized versus centralized),<br />
whether data is internally produced or externally gathered, the use<br />
of data, and the availability of guidelines to determine whether<br />
data is suitable for release. We discuss how these indicators may<br />
contribute to shaping an open data policy at a local level.<br />
We note that while the combination of the indicators may be<br />
considered as a predictor of data release by an organization, the<br />
focus of the paper is rather to identify these indicators.<br />
The remainder of this paper is organized as follows. In Section 2,<br />
we embed our work in the field of open data in more detail. In<br />
Section 3, we motivate the use of a participatory action research<br />
approach for our purpose, while we implement this approach in<br />
Section 4. In Section 5, we present our findings from the<br />
participatory action research sessions. Section 6 is devoted to the<br />
lessons that we have learned so far. Section 7 reports on our<br />
future activities, while Section 8 concludes the paper.<br />
2. RELATED WORK<br />
Despite pressure from higher levels of (federal) government<br />
to realise data release at the local level, there are barriers along<br />
several dimensions to open data release. From a user perspective,<br />
major barriers are the access to proper datasets and the adequate<br />
use of datasets. Access to proper datasets is hampered by the<br />
fact that the datasets are fragmented and offered on several<br />
websites, which are in some cases hard to find [3, 17]. Moreover,<br />
access to datasets is in some cases restricted to specific user<br />
groups. Adequate use of datasets is hindered because the metadata<br />
of datasets are poorly documented, and therefore the<br />
semantics of the data may be ambiguous. Furthermore, how to<br />
determine the quality of a dataset is an open question. In [29] an<br />
overview of the barriers that users may encounter in using public<br />
sector information is provided.<br />
Along with barriers from a user perspective, data providers also<br />
encounter barriers to data release. From this perspective, major<br />
barriers are the lack of knowledge to deposit data, the economic<br />
loss that data release may entail, and the lack of knowledge to<br />
decide which datasets are eligible for release. In order to deposit a<br />
dataset, the dataset often has to be formatted and processed<br />
according to the requirements of the system that accepts it.<br />
Dataset providers who are not familiar with the technical<br />
aspects of data processing, in particular, must invest considerable<br />
effort to deposit a dataset.<br />
The economic impact of data release becomes clear when taking<br />
into account that some European public sector bodies used a cost<br />
recovery model to fund data collection. For example, the Dutch<br />
Business Register (KvK) has a cost recovery ratio of 19.50%<br />
[25]. By releasing data that is currently sold, certain agencies<br />
lose a valuable source of income [27].<br />
Organizations need practical frameworks that help them<br />
decide whether a dataset is eligible for release [30]. In these<br />
frameworks, special attention should be given to privacy issues.<br />
Nowadays, organizations are struggling with the question of<br />
whether a dataset is or may become privacy sensitive [7]. For<br />
example, according to [16, 18], combining datasets could lead to<br />
undesirable results such as revealing the identity of persons.<br />
Additionally, publishing data that was gathered and managed for<br />
one purpose might lead to different conclusions if used in<br />
an unrelated context [16]. The potential violation of privacy in<br />
particular has gained a lot of attention in the literature; see, for<br />
example, [1, 5, 13, 16, 22, 23, 26].<br />
In [5] a study is devoted to the impact that information<br />
technology has on several privacy issues. While the field of<br />
database security [1] mainly focuses on technical solutions that<br />
enforce the “need to know” principle, i.e., access control policies<br />
that may prevent the disclosure of private information, more<br />
comprehensive alternatives to prevent such disclosure are proposed in<br />
[4, 16, 26]. In [26] a framework is proposed to protect the privacy<br />
of citizens, while in [4] and [16] comprehensive architectures are<br />
proposed to minimize the violation of privacy laws and regulations.<br />
In [13], the authors plead for a so-called ambient law that<br />
articulates fundamental legal protections, including privacy,<br />
within the socio-technical infrastructure. In [23] privacy concerns<br />
about using cameras and solutions to these concerns are discussed<br />
in the context of monitoring dementia patients, while [8, 22] do<br />
the same in the context of public safety.<br />
However, public sector information (PSI) is not always personally<br />
bound. Moreover, the barriers that may be encountered depend on<br />
the type of data to be released: for different types of data,<br />
different kinds of barriers may be encountered. At a national level,<br />
there are also barriers to data release above and beyond privacy. A<br />
comparative study by [15] highlights these issues. The barriers to data<br />
release cited at the national level include a traditionally closed<br />
government culture, limited data quality, and the uncertain<br />
economic impact of data release.<br />
These studies, while providing a view of data release at the national<br />
level, lack a focus on the issues experienced at the local level by<br />
public sector information professionals. There is a lack of<br />
understanding at local government levels of the impact, barriers<br />
and opportunities of open data release. Our study focuses on<br />
understanding the underlying processes entailed by open data at<br />
a municipal level. The rationale for choosing this level instead of the<br />
national level is that data is mainly gathered at local levels, and<br />
therefore support at these levels is of crucial importance for the success of<br />
open data. To prevent the understanding of the underlying<br />
processes from being dominated by privacy issues, we place our<br />
emphasis in this study on the release of data that is not<br />
personally bound, which minimizes the chance that the data is privacy<br />
sensitive. For example, we focus on the underlying<br />
processes that are entailed by the release of data that pertains to all<br />
trees in Rotterdam, such as the number of trees in a street, the<br />
kinds of trees in the street, the year in which a tree was planted, and so on.<br />
3. APPROACH<br />
As mentioned above, there is a lack of understanding of how data<br />
should be released, what the effects of release might be, and which<br />
processes are needed to facilitate data release. This applies not only to<br />
national or federal governments, but also to local government. Experience<br />
within our education program has also shown that data<br />
release is not self-evident.<br />
In order to explore data release, the University of Applied<br />
Sciences in Rotterdam initiated a research project with four<br />
services that form part of the Municipality of Rotterdam. They are