14.06.2013 Views

Databases and Systems


can be used for gene function assignments. The KEGG ortholog group tables are a clean reference data set of orthologous relations that is intended to make this process easier. The tables contain not only information on orthologous and paralogous genes, but also information on groups of genes that are thought to form a functional unit, such as a regulatory unit in a metabolic pathway or a molecular unit of assembly. Thus, the ortholog group tables represent a library of 'network motifs', or conserved local network patterns that are associated with particular functions. They are maintained manually with the aid of network comparison tools in KEGG.

Molecular catalogs

The molecular catalogs are intended to represent functional and structural classifications of proteins, RNAs, other biological macromolecules, small chemical compounds, and their assemblies. However, the current version of KEGG contains only a few tables, mostly for enzyme classifications.

Chemical compounds

The living cell contains a number of non-genetic compounds that are synthesized, transported from outside, or simply carried over through cell divisions. In order to represent a complete network of molecular interactions, it is necessary to have a complete catalog of compounds in the cell and possibly in the environment as well. The COMPOUND section of the LIGAND database currently consists of over 5,000 chemical compounds, mostly metabolites, with links to their locations on the metabolic pathways and to the enzymatic reactions involved. Each COMPOUND entry also contains the CAS number and the chemical structure, which is entered manually using the ISIS system.

Enzymatic reactions

Information on enzymatic reactions and enzyme molecules is currently stored in the ENZYME section of the LIGAND database. Work is in progress, however, to organize a third section of the LIGAND database, REACTION, containing both enzymatic and non-enzymatic reactions. A reaction with multiple substrates and multiple products is decomposed into a set of binary relations or approximated by a reaction between two major compounds. The reaction data are especially important for computing possible chemical networks, from which possible gene (enzyme) networks can also be obtained.
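The decomposition described above can be illustrated with a minimal Python sketch. It assumes the simplest reading of "a set of binary relations", namely every substrate paired with every product, and then searches the resulting compound network for a chain of relations. The function names (`binary_relations`, `build_network`, `find_path`) and the two example reactions are hypothetical illustrations, not KEGG or LIGAND data structures or entries.

```python
from collections import deque
from itertools import product


def binary_relations(substrates, products):
    """Decompose one reaction into all (substrate, product) pairs."""
    return list(product(substrates, products))


def build_network(reactions):
    """Map each compound to the set of compounds it can be converted to in one step."""
    network = {}
    for substrates, products in reactions:
        for s, p in binary_relations(substrates, products):
            network.setdefault(s, set()).add(p)
    return network


def find_path(network, start, goal):
    """Breadth-first search for a shortest chain of binary relations, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sort neighbours so the search order is deterministic.
        for nxt in sorted(network.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Illustrative input only: two simplified reactions written by hand.
reactions = [
    (["glucose", "ATP"], ["glucose 6-phosphate", "ADP"]),
    (["glucose 6-phosphate"], ["fructose 6-phosphate"]),
]
network = build_network(reactions)
path = find_path(network, "glucose", "fructose 6-phosphate")
print(path)
```

Pairing every substrate with every product also creates relations such as ATP to glucose 6-phosphate, which is one reason the text mentions the alternative of approximating a reaction by its two major compounds.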

Molecular relations

The binary relations of successive enzymes are also extracted from the KEGG metabolic pathway diagrams. They form one class of molecular relation data in KEGG. Another important class of molecular relations is protein-protein interactions in regulatory pathways, such as signal transduction, cell cycle, and developmental pathways. These data are not yet well organized, except for a few attempts in BRITE
