28.02.2013 Views

Bio-medical Ontologies Maintenance and Change Management


180 D. Apiletti et al.

Fig. 2. SCOP (top) and CATH (bottom) hierarchical trees

The CATH database is a hierarchical classification of protein domain structures in the Protein Data Bank (http://www.rcsb.org/pdb/). Protein structures are classified using a combination of automated and manual procedures. There are four major levels in this hierarchy: Class, Architecture, Topology and Homologous superfamily, as shown in Fig. 2. Domains within each Homologous superfamily level are subclustered into sequence families using multi-linkage clustering, identifying five family levels (named S, O, L, I, D). Thus, the complete classification hierarchy consists of nine levels (CATHSOLID).
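As a minimal sketch of the nine-level hierarchy described above, the following Python fragment maps a classification code onto the level names C, A, T, H, S, O, L, I, D. The dot-separated numeric code format, the example code, and the function names `parse_cathsolid` and `ancestors` are assumptions made for illustration; they are not taken from the text.

```python
from typing import Dict, List

# The four major levels and five sequence-family levels named in the text.
MAJOR_LEVELS = ["Class", "Architecture", "Topology", "Homologous superfamily"]
FAMILY_LEVELS = ["S", "O", "L", "I", "D"]
ALL_LEVELS = MAJOR_LEVELS + FAMILY_LEVELS  # nine levels: CATHSOLID


def parse_cathsolid(code: str) -> Dict[str, int]:
    """Split a dot-separated code (assumed format) into one number per level."""
    parts = code.split(".")
    if len(parts) != len(ALL_LEVELS):
        raise ValueError(
            f"expected {len(ALL_LEVELS)} dot-separated fields, got {len(parts)}"
        )
    return {level: int(part) for level, part in zip(ALL_LEVELS, parts)}


def ancestors(code: str) -> List[str]:
    """Return the code prefix at every level, from Class down to D."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    example = "1.10.8.10.1.1.1.1.1"  # hypothetical code, for illustration only
    print(parse_cathsolid(example))
    print(ancestors(example)[:4])  # prefixes for the four major levels
```

Each prefix returned by `ancestors` identifies one node on the path from the root of the tree to a domain, which is how a tree such as the one in Fig. 2 could be traversed.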

5.2 Extracting Association Rules

The first step in finding association rules is to look for attribute values that appear in the same tuple. Every attribute-value couple is an item. A group of items is
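The first step described above, treating every attribute-value couple as an item and looking for items that appear in the same tuple, might be sketched in Python as follows. The attribute names, the sample rows, and the function names are hypothetical, and counting only pairs of items is a simplification chosen for brevity, not the authors' procedure.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Tuple

Item = Tuple[str, str]  # one (attribute, value) couple


def tuple_to_items(row: Dict[str, str]) -> FrozenSet[Item]:
    """Turn one tuple (a row) into its set of attribute-value items."""
    return frozenset(row.items())


def count_cooccurring_pairs(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count how many tuples contain each pair of items together."""
    counts: Counter = Counter()
    for row in rows:
        # Sort so each pair is always stored in the same order.
        items = sorted(tuple_to_items(row))
        for pair in combinations(items, 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical rows; attribute names and values are illustrative only.
    rows = [
        {"class": "alpha", "arch": "bundle"},
        {"class": "alpha", "arch": "bundle"},
        {"class": "beta", "arch": "sandwich"},
    ]
    for pair, n in count_cooccurring_pairs(rows).most_common():
        print(pair, n)
```

Pairs that co-occur in many tuples are the natural candidates from which rules could later be derived.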
