28.02.2013 Views

Bio-medical Ontologies Maintenance and Change Management


A Summary of Genomic Databases: Overview and Discussion

is a sequence of symbols over an alphabet of four characters, namely A, T, C and G, representing the four nucleotides Adenine, Thymine, Cytosine and Guanine. In genomic sequences, three kinds of subsequences can be distinguished: i) genic subsequences, which code for protein expression; ii) regulatory subsequences, located upstream or downstream of the gene whose expression they influence; iii) subsequences apparently not related to any function.

Each gene is associated with a functional result that is usually a protein. Protein synthesis can be summarized in two main steps: transcription and translation. During transcription, the genic DNA sequence is copied into a messenger ribonucleic acid (mRNA), which delivers the needed information to the synthesis apparatus. During translation, the mRNA is translated into a protein, that is, a sequence of symbols over an alphabet of 20 characters, each denoting an amino acid and each corresponding to a nucleotide triplet. Therefore, changing even a single nucleotide in a genic subsequence may cause a change in the corresponding protein sequence. Proteins and, more generally, macromolecules are characterized by different levels of information, related to the elementary units that constitute them and to their spatial arrangement. Such levels are known as structures. In particular, the biological functions of macromolecules also depend on their three-dimensional shapes, usually called tertiary structures. Moreover, beyond the information encoded in the genomic sequence, additional knowledge about the biological functions of genes and molecules can be obtained through experimental techniques of computational biology. Such techniques are used, for example, to build detailed pathway maps, that is, directed graphs representing metabolic reactions of cellular signals [14].
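The two steps of protein synthesis described above can be illustrated with a minimal Python sketch. It makes simplifying assumptions that are not part of the text: the input is taken to be the coding strand of the gene (so transcription reduces to replacing T with U), translation starts at the first nucleotide and stops at the first stop codon, and the standard genetic code is used. The function names `transcribe` and `translate` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G base order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna_coding_strand):
    """Return the mRNA for a DNA coding strand (assumption: T is simply replaced by U)."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA, triplet by triplet, into one-letter amino acid codes."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequences: the second differs from the first by one nucleotide.
original = "ATGGAAGTGTAA"
mutated = "ATGGTAGTGTAA"

print(translate(transcribe(original)))  # MEV
print(translate(transcribe(mutated)))   # MVV
```

In this example the single substitution A to T in the second triplet changes the encoded amino acid, and with it the protein sequence. Because several triplets can encode the same amino acid, other substitutions would leave the protein unchanged, which is why the text says a change "may" occur.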

1.2 Biological Database Classification

As pointed out in [31], biological databases can be classified, with respect to their biological content, into the following categories (see also Figure 1):

Macromolecular Databases: contain information on the three main classes of macromolecules, namely DNA, RNA and proteins.
  – DNA Databases: describe DNA sequences.
  – RNA Databases: describe RNA sequences.
  – Protein Databases: store information on proteins from three different points of view:
    · Protein Sequence Databases: describe amino acid sequences.
    · Protein Structure Databases: store information on protein structures, that is, the shapes, charges and chemical features of proteins that characterize their biological functions.
    · Protein Motif and Domain Databases: store the sequences and structures of particular protein portions (motifs and domains) with which specific functional meanings can be associated, grouped by biological function, cellular localization, etc.
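The part of the classification listed above can be represented as a small tree. The following Python sketch is only an illustration: it covers just the categories named in this list, and the names `TAXONOMY` and `leaf_categories` are hypothetical.

```python
# Nested dictionaries: each key is a category, each value holds its subcategories.
# Only the categories named in the list above are included.
TAXONOMY = {
    "Macromolecular Databases": {
        "DNA Databases": {},
        "RNA Databases": {},
        "Protein Databases": {
            "Protein Sequence Databases": {},
            "Protein Structure Databases": {},
            "Protein Motif and Domain Databases": {},
        },
    },
}


def leaf_categories(tree, path=()):
    """Yield the full path to every category that has no subcategories."""
    for name, children in tree.items():
        here = path + (name,)
        if children:
            yield from leaf_categories(children, here)
        else:
            yield here


for category_path in leaf_categories(TAXONOMY):
    print(" > ".join(category_path))
```

Each printed line gives the path from the top-level category down to a leaf, for example the path from Macromolecular Databases through Protein Databases to Protein Sequence Databases.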
