28.02.2013 Views

Bio-medical Ontologies Maintenance and Change Management


182 D. Apiletti et al.

Fig. 3. Number of quasi tuple constraints found for different values of tolerance (y-axis: Rules, 0 to 1400; x-axis: Tolerance; series: SCOP, CATH)

1. querying different related databases (e.g., GO, PDB, Swiss-Prot, CATH, SCOP),
2. searching relevant information in literature,
3. comparing the obtained results with the schema constraints of the examined database. If the database constraints are known, errors can be distinguished from exceptions automatically. In particular, if the attribute values of a tuple do not satisfy the constraints, we can conclude that it is an error; otherwise, it is a biological exception.
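The automatic check described in the third approach can be sketched as follows. This is only a minimal illustration, assuming the schema constraints are available as simple predicates over a tuple; the function name `classify_anomaly`, the constraint names, and the set `KNOWN_CLASSES` are hypothetical and are not taken from SCOP or CATH.

```python
def classify_anomaly(row, constraints):
    """Label an anomalous tuple as an error or a biological exception.

    A tuple that violates at least one known schema constraint is an error;
    a tuple that satisfies all of them is treated as a biological exception.
    """
    violated = [name for name, check in constraints.items() if not check(row)]
    label = "error" if violated else "biological exception"
    return label, violated


# Hypothetical constraints, for illustration only.
KNOWN_CLASSES = {"alpha", "beta", "alpha/beta", "alpha+beta"}
constraints = {
    "class_is_known": lambda r: r["class"] in KNOWN_CLASSES,
    "superfamily_not_empty": lambda r: bool(r["superfamily"].strip()),
}

valid_row = {"class": "alpha+beta", "superfamily": "hypothetical protein MTH1020"}
broken_row = {"class": "alpha+beta", "superfamily": "  "}

print(classify_anomaly(valid_row, constraints))
print(classify_anomaly(broken_row, constraints))
```

Returning the names of the violated constraints alongside the label keeps the decision inspectable, which is useful when a curator has to confirm the error.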

For example, in the SCOP database, we can consider the following quasi tuple constraint.

(fold=Ntn hydrolase-like) → (superfamily=hypothetical protein MTH1020) [c=0.99771]

The extracted anomaly rule with respect to this constraint is:

(fold=Ntn hydrolase-like) → (superfamily=N-terminal nucleophile aminohydrolases) [c=0.00229]
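The relationship between the two confidence values can be illustrated with a short sketch. It assumes that a quasi tuple constraint is a rule whose confidence is at least 1 − tolerance but below 1, and that an anomaly rule shares its body with a quasi constraint but has a different head; this reading of the tolerance parameter is an assumption, not a definition quoted from the text. The row counts (436 and 1) are hypothetical toy values chosen only because they reproduce the confidences shown above.

```python
from collections import Counter


def rule_confidences(rows, lhs, rhs):
    """Confidence of every rule (lhs=a) -> (rhs=b) observed in the rows."""
    body_counts = Counter(r[lhs] for r in rows)
    pair_counts = Counter((r[lhs], r[rhs]) for r in rows)
    return {(a, b): n / body_counts[a] for (a, b), n in pair_counts.items()}


def split_rules(confidences, tolerance):
    """Split rules into quasi tuple constraints and their anomaly rules.

    Assumption: quasi constraints satisfy 1 - tolerance <= confidence < 1.
    """
    quasi = {k: c for k, c in confidences.items() if 1 - tolerance <= c < 1}
    bodies = {a for a, _ in quasi}
    anomalies = {
        k: c for k, c in confidences.items() if k[0] in bodies and k not in quasi
    }
    return quasi, anomalies


# Hypothetical toy data, not the real SCOP contents.
rows = [
    {"fold": "Ntn hydrolase-like", "superfamily": "hypothetical protein MTH1020"}
] * 436 + [
    {"fold": "Ntn hydrolase-like",
     "superfamily": "N-terminal nucleophile aminohydrolases"}
]

quasi, anomalies = split_rules(
    rule_confidences(rows, "fold", "superfamily"), tolerance=0.01
)
for (fold, superfamily), c in {**quasi, **anomalies}.items():
    print(f"(fold={fold}) -> (superfamily={superfamily}) [c={c:.5f}]")
```

Because both rules share the same body, their confidences sum to 1, so lowering the tolerance below the anomaly's confidence would remove both the quasi constraint and its anomaly rule from the result.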

Given such an anomaly, the three above-described approaches can be applied to distinguish between an error and a biological exception.

1. In the CATH database, the same protein is classified in the same alpha+beta class as in SCOP, and it has a 4-layer sandwich architecture (and Glutamine topology) consisting of the 4 layers alpha/beta/beta/alpha, which are the same as those of the N-terminal nucleophile aminohydrolases fold found in the SCOP classification.
