Automatic functional annotation of predicted active sites - European ...
Clearly, the annotation for a whole protein cannot be transferred to the annotation of residue sites,
because different groups of residues in the protein structure have different functions.
In this respect, the biological community is missing an information extraction system for
the annotation of proteins at residue level.
2.1.3 Gene Ontology
The Gene Ontology (GO) [AL02] [GOC06] is one of the most widely used functional
classification schemes and covers the most important criteria for the annotation of biological
data [PKS06]. Currently, the ontology lists a total of 26,302 terms: 15,643
biological process terms, 2,233 cellular component terms, and 8,426 molecular function
terms (version of November 2008). The UniProtKB/InterPro group at the European Bioinformatics
Institute (EBI) belongs to the Gene Ontology Consortium and uses its standard
vocabulary for the annotation of protein function. The vocabulary is meant to describe
the biological phenomenology of genes and gene products (proteins), which is why
GO terms are not suitable for describing the function and properties of a protein
residue. Figure 2.4 lists some examples in which the identification of GO terms [GJYLRS08]
did not find the more relevant keywords for the annotation of residues. At the moment,
no ontology dedicated solely to the functional annotation of protein residues has been
developed. However, terminology can in general be collected from other substantial resources,
such as the Open Biomedical Ontologies [SAR+07], which contain, for example,
REX (an ontology of physico-chemical processes) and PSI-MOD (an ontology describing
protein chemical modifications).
2.1.4 Biomedical literature
Biomedical research tackles biological questions from a number of perspectives, and the
published experimental data are always heterogeneous. The sum of the descriptions of biological
phenomena enables scientists to understand mechanisms in biology within various