28.02.2013 Views

Bio-medical Ontologies Maintenance and Change Management


172 D. Apiletti et al.

Along with the ongoing trend to use the XML format in biological databases, recent research papers [10][19] have explored several kinds of constraints for XML data, such as key, inclusion, inverse and path constraints. Functional dependencies other than key dependencies, by contrast, have been little investigated. In [14] the authors address this issue with a subgraph-based approach which captures new kinds of functional dependencies useful in designing XML documents. Furthermore, they analyze various properties such as expressiveness, semantic behaviour and axiomatizations.

Constraints in hierarchically structured data have been discussed in [13]. In particular, the authors focus on functional and key dependencies and investigate how constraints can be used to check the consistency of data exchanged among different sources. This leads to the constraint propagation problem, which they address for the case of propagating XML keys to relational views by providing two algorithms. The importance of translating constraints in data exchanges is also discussed in [9], which proposes a suitable language for this task. The proposed language is able to express a wide variety of database constraints and transformations, and its development was motivated by experience with biomedical databases in providing informatics support to genome research centers.

In [8] the authors introduce the notion of pseudo-constraints, which are predicates having significantly few violations. This concept is similar to quasi functional dependencies, but they define pseudo-constraints on the Entity-Relationship model, whereas we use association rules to define quasi functional dependencies. Furthermore, they focus on cyclic pseudo-constraints and propose an algorithm for extracting this kind of cyclic pattern. In contrast, our notion of dependency is an implication between sets of elements and is not bound to the structure of the data source used to mine the pattern.
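The idea of a dependency that holds with only a few violations can be illustrated with a small sketch. The code below is not the definition used in this work or in [8]; it assumes a table stored as a list of dictionaries and measures, for a candidate dependency from one attribute to another, the fraction of rows that disagree with the most frequent value in their group. The function name `violation_ratio`, the attribute names and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict


def violation_ratio(rows, lhs, rhs):
    """Fraction of rows that violate the candidate dependency lhs -> rhs.

    For every distinct lhs value, the most frequent rhs value is taken as
    the expected one; all other rows in that group count as violations.
    """
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    # Rows that agree with the dominant rhs value of their lhs group.
    kept = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return 1 - kept / len(rows)


# Toy data (hypothetical): one row out of five breaks family -> class.
rows = [
    {"family": "kinase", "class": "enzyme"},
    {"family": "kinase", "class": "enzyme"},
    {"family": "kinase", "class": "enzyme"},
    {"family": "kinase", "class": "receptor"},
    {"family": "globin", "class": "transport"},
]
ratio = violation_ratio(rows, "family", "class")
```

A ratio of zero corresponds to an exact functional dependency; a small non-zero ratio marks a candidate that holds for most of the data, and the disagreeing rows are the natural candidates for inspection.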

In [7] quasi functional dependencies have been exploited to detect anomalies in the data. In this work, by contrast, the focus is on the extraction of constraints, and anomaly detection is only a further possible application of our method.

3 Background

In this section we provide an overview of the main concepts behind the successful discovery of constraints. We start with a survey of the relational model, which is widely adopted in database design and whose strongly structured format eases the definition of constraints. Then we focus on a particular kind of constraint: integrity constraints and their subtypes. Finally, we conclude with an introduction to association rules, a well-established data mining technique for inferring unknown correlations from data.
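As a rough preview of the last topic, association rules are commonly evaluated with two standard measures, support and confidence. The sketch below computes both for a rule over a list of transactions represented as Python sets. The exact formulation adopted in this work is not shown in this excerpt, so the function names and the toy transactions are illustrative assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, body, head):
    """Estimated probability of `head` given `body` for the rule body -> head."""
    body_support = support(transactions, body)
    if body_support == 0:
        return 0.0
    return support(transactions, set(body) | set(head)) / body_support


# Toy transactions (hypothetical): "a" co-occurs with "b" in two of the
# three transactions that contain "a".
transactions = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
rule_support = support(transactions, {"a", "b"})
rule_confidence = confidence(transactions, {"a"}, {"b"})
```

A rule with confidence equal to one holds without exception in the data, while a confidence slightly below one indicates a correlation that holds for most transactions.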

3.1 Relational Model

There are several ways to describe a database at the logical level, e.g., hierarchical, relational, or object-oriented. Nowadays the relational model is the most widely
