
Automatic functional annotation of predicted active sites - European ...



Clearly, the annotation for a whole protein cannot be transferred to residue site annotation,
because different groups of residues in the protein structure have different functions.
In this respect, the biological community is missing an information extraction system for
the annotation of proteins at the residue level.

2.1.3 Gene Ontology

The Gene Ontology (GO) [AL02] [GOC06] is one of the most widely used functional
classification schemes and includes all of the most important criteria for the annotation of biological
data [PKS06]. As of the November 2008 version, the ontology lists a total of 26,302 terms: 15,643
biological process terms, 2,233 cellular component terms, and 8,426 molecular function
terms. The UniProtKB/InterPro group at the European Bioinformatics
Institute (EBI) belongs to the Gene Ontology Consortium and uses its standard
vocabulary for the annotation of protein function. The vocabulary is meant to describe
the biological phenomenology of genes and gene products (proteins), which is why
GO terms are not suitable for describing the function and properties of a protein
residue. Figure 2.4 lists some examples where the identification of GO terms [GJYLRS08]
did not find the more relevant keywords for the annotation of residues. At the moment,
no ontology dedicated solely to the functional annotation of protein residues has been
developed. However, terminology can in general be collected from other substantial resources,
such as the Open Biomedical Ontologies [SAR+07], which contain, for example,
REX (an ontology of physico-chemical processes) and PSI-MOD (an ontology describing
protein chemical modifications).

2.1.4 Biomedical literature

Biomedical research tackles biological questions from a number of perspectives, and the
published experimental data are always heterogeneous. The sum of the descriptions of biological
phenomena enables scientists to understand mechanisms in biology within various
