28.02.2013 Views

Bio-medical Ontologies Maintenance and Change Management

Extraction of Constraints from Biological Data

2. Evidence can be found in the literature that the crystal structure of the MTH1020 protein reveals an Ntn-hydrolase fold [22].

3. The results can be compared to the SCOP constraints (which are known), and confirm the correctness of this relationship (i.e., it is not an error).

Finally, applying the same procedure to the CATH database as well, we discovered the anomalies reported in Fig. 3. Among these, we noticed the one for the hypothetical protein MTH1020. Thus, the method, applied independently to two different databases, reveals the same anomaly. This result confirms the consistency of the proposed method for anomaly discovery.

5.4 Quasi Functional Dependencies<br />

To identify a functional dependency, the dependency degree between a pair of attributes must be equal to 1 (see Eq. (3)). We verified that the proposed method is sound and complete by means of two experiments, which show that it correctly identifies all and only the functional dependencies contained in the database.

In the first experiment we ran the algorithm on the SCOP database with the lowest support value. Table 4 shows the detected functional dependencies. The first two columns contain the functionally dependent attributes (their names are explained in the following example). The third column reports the dependency degree obtained by applying Eq. (3) to the original data. Only the pairs of attributes with a dependency degree of 1 are shown. Since the SCOP database constraints are known, we exploited this structural knowledge to prove the correctness of our method. Our method detects all and only the functional dependencies in the SCOP database. Hence, it is sound (i.e., it only detects correct functional dependencies) and complete (i.e., all known functional dependencies are detected).
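The check for a dependency degree of 1 can be illustrated with a minimal sketch. Eq. (3) is not reproduced in this section, so the measure below is an assumption: it scores a candidate dependency lhs -> rhs as the fraction of rows that agree with the most frequent rhs value of their lhs group. The function names, attribute names, and toy rows are hypothetical and are not taken from SCOP or from Table 4.

```python
from collections import Counter, defaultdict
from itertools import permutations


def dependency_degree(rows, lhs, rhs):
    """Assumed stand-in for Eq. (3): share of rows consistent with lhs -> rhs.

    Each lhs value is mapped to its most frequent rhs value; rows that
    disagree with that mapping count against the dependency.
    """
    if not rows:
        return 1.0
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / len(rows)


def functional_dependencies(rows, attributes):
    """Return every ordered attribute pair whose degree is exactly 1."""
    # kept == len(rows) divides to exactly 1.0, so the comparison is safe.
    return [
        (lhs, rhs)
        for lhs, rhs in permutations(attributes, 2)
        if dependency_degree(rows, lhs, rhs) == 1.0
    ]


# Hypothetical toy hierarchy: family -> superfamily -> fold.
rows = [
    {"protein": "p1", "family": "f1", "superfamily": "s1", "fold": "d1"},
    {"protein": "p2", "family": "f1", "superfamily": "s1", "fold": "d1"},
    {"protein": "p3", "family": "f2", "superfamily": "s1", "fold": "d1"},
    {"protein": "p4", "family": "f3", "superfamily": "s2", "fold": "d1"},
]

for pair in functional_dependencies(rows, ["family", "superfamily", "fold"]):
    print(pair, dependency_degree(rows, *pair))
```

In this toy data the lower levels of the hierarchy determine the upper ones but not the reverse, which mirrors the kind of structural constraint the experiment relies on.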

In the second experiment we simulated fault injection tests. In this way, we demonstrate that even in the presence of errors our method is sound and complete, because it still detects all and only the non-faulted dependencies. We changed random values at different levels of the hierarchy, substituting the actual value with one randomly taken from another branch of the tree. This kind of misclassification is rather subtle to identify, since the value is acceptable (i.e., it is valid and it is not misspelled), but it assigns a particular protein to the wrong class for one or more levels of the hierarchy. The fourth column of Table 4 reports the dependency degree after the fault injection simulation, in which a random fault was injected at some levels of the classification hierarchy. All the affected attributes, shown in bold in the first two columns, report dependencies whose degree falls below 1. Thus, a quasi functional dependency analysis can be performed to detect all the injected faults.
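The fault injection idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the dependency degree is an assumed stand-in for Eq. (3) (the fraction of rows that agree with the most frequent right-hand value of their group), the fault replaces one value with a different valid value of the same attribute as an approximation of "another branch of the tree", and all names and rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def dependency_degree(rows, lhs, rhs):
    """Assumed stand-in for Eq. (3): share of rows consistent with lhs -> rhs."""
    if not rows:
        return 1.0
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / len(rows)


def inject_fault(rows, attribute, rng):
    """Return a copy of rows in which one random row is misclassified.

    The replacement is a valid, correctly spelled value of the same
    attribute, so the fault cannot be caught by simple validation.
    """
    faulty = [dict(row) for row in rows]  # leave the original data untouched
    values = sorted({row[attribute] for row in rows})
    victim = rng.randrange(len(faulty))
    alternatives = [v for v in values if v != faulty[victim][attribute]]
    faulty[victim][attribute] = rng.choice(alternatives)
    return faulty


# Hypothetical toy hierarchy in which every family has two proteins.
rows = [
    {"protein": "p1", "family": "f1", "superfamily": "s1"},
    {"protein": "p2", "family": "f1", "superfamily": "s1"},
    {"protein": "p3", "family": "f2", "superfamily": "s1"},
    {"protein": "p4", "family": "f2", "superfamily": "s1"},
    {"protein": "p5", "family": "f3", "superfamily": "s2"},
    {"protein": "p6", "family": "f3", "superfamily": "s2"},
]

faulty = inject_fault(rows, "superfamily", random.Random(0))

print("before:", dependency_degree(rows, "family", "superfamily"))
print("after: ", dependency_degree(faulty, "family", "superfamily"))
```

With this simplified measure, a fault is visible only when the affected group contains at least one other row that still carries the correct value, which is why every family in the toy data has two proteins.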
