2005 gtl abstracts.indb - Genomics - U.S. Department of Energy
Genomics:GTL Program Projects
19
Mapping of Biological Pathways and Networks across Microbial Genomes
F. Mao, V. Olman, Z. Su, P. Dam, and Ying Xu* (xyn@bmb.uga.edu)
University of Georgia, Athens, GA and Oak Ridge National Laboratory, Oak Ridge, TN

Homology exists beyond the individual gene level; it can also exist at the level of biological pathways and networks. A number of databases collect experimentally validated and reliably predicted pathways and networks, providing a rich source of information for genome annotation and systems-level biological studies. A key to using such information effectively is identifying orthologous genes accurately. However, existing methods for mapping these known pathways and networks have serious limitations that greatly reduce the utility of this information. Virtually all existing mapping methods are based on sequence similarity information, using tools such as reciprocal BLAST search or COG mapping. A fundamental problem with such methods is that sequence similarity information alone does NOT contain all the information needed to identify true orthologous genes!
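
To make the limitation concrete, the following is a minimal sketch of a reciprocal-best-hit rule, the kind of similarity-only criterion referred to above. The gene names and bit scores are hypothetical, and the functions `best_hits` and `reciprocal_best_hits` are illustrative helpers, not part of BLAST or of any published tool.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score
    (for example a BLAST bit score; values here are made up).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical scores: a1 is similar to both b1 and b2 (e.g. paralogs).
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 240.0,
          ("a2", "b2"): 180.0, ("a2", "b3"): 90.0}
b_vs_a = {("b1", "a1"): 250.0, ("b2", "a1"): 240.0,
          ("b2", "a2"): 180.0, ("b3", "a2"): 90.0}

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh)
```

In this toy input, gene a2 receives no partner because its best hit b2 scores higher against a1. Similarity alone cannot say whether b2 is the true counterpart of a2, which is the kind of ambiguity that additional genomic context is meant to resolve.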

We have recently developed a computational method and software, called P-MAP, for mapping a known pathway/network from one microbial organism to another by combining homology information with genomic structure information. The basic idea is that in microbes, genes working in the same pathway can generally be decomposed into a few operons or, in the case of complex pathways/networks, regulons. Such information has not been used effectively in pathway mapping. When mapping known pathways, we first predict all the operons in a genome using our operon prediction program. The predictions are then validated against microarray data, mainly by checking that genes predicted to be in the same operon or in adjacent operons have consistent expression patterns. Our evaluation indicates that the prediction accuracy is close to 90%. With this information, we then look for a mapping of the genes in a pathway template to the target genome that simultaneously gives relatively high sequence similarity between predicted orthologous gene pairs and groups all the mapped genes into a small number of operons, preferably operons that are co-regulated according to predicted cis-regulatory elements and available microarray data. We have formulated the mapping problem as a linear integer programming (LIP) problem and solve it using a commercial LIP solver called COIN.
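
The trade-off described above can be illustrated with a toy objective: total sequence similarity of the chosen gene pairs minus a penalty for each distinct operon the mapped genes fall into. The sketch below is not the P-MAP formulation and does not call an LIP solver; it swaps in exhaustive enumeration, which is only feasible for tiny inputs. The gene names, similarity values, operon labels, and the `operon_penalty` weight are all assumptions made for illustration.

```python
from itertools import product


def map_pathway(candidates, operon_of, operon_penalty):
    """Pick one target gene per template gene by brute force.

    candidates:     {template_gene: {target_gene: similarity}}
    operon_of:      {target_gene: predicted operon label}
    operon_penalty: cost charged per distinct operon used (assumed weight)

    Score = sum of similarities - operon_penalty * number of operons used.
    """
    template_genes = sorted(candidates)
    options = [sorted(candidates[g]) for g in template_genes]
    best_score, best_assignment = float("-inf"), None
    for choice in product(*options):
        if len(set(choice)) < len(choice):
            continue  # enforce a one-to-one mapping
        similarity = sum(candidates[g][t]
                         for g, t in zip(template_genes, choice))
        operons_used = {operon_of[t] for t in choice}
        score = similarity - operon_penalty * len(operons_used)
        if score > best_score:
            best_score = score
            best_assignment = dict(zip(template_genes, choice))
    return best_assignment, best_score


# Hypothetical template pathway pA-pB-pC and candidate target genes.
candidates = {
    "pA": {"t1": 0.90, "t7": 0.95},
    "pB": {"t2": 0.85},
    "pC": {"t3": 0.80, "t9": 0.82},
}
operon_of = {"t1": "op1", "t2": "op1", "t3": "op1", "t7": "op4", "t9": "op5"}

mapping, score = map_pathway(candidates, operon_of, operon_penalty=0.2)
similarity_only, _ = map_pathway(candidates, operon_of, operon_penalty=0.0)
print(mapping)
print(similarity_only)
```

With the penalty set to zero the search simply picks the most similar gene for each template gene, scattering the pathway over three operons. With a positive penalty it prefers the slightly less similar candidates that sit together in one operon. An integer programming formulation would express the same choice with 0/1 variables per candidate pair and per operon rather than enumerating assignments.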

We have applied the P-MAP program to map known biological pathways in KEGG and MetaCyc to the cyanobacterial genomes, and we are currently mapping them to the Shewanella oneidensis MR-1 genome. Some of the mapping results can be found at http://csbl.bmb.uga.edu/WH8102.

Acknowledgement: This project is supported by the U.S. Department of Energy’s Genomics:GTL Program under project “Carbon Sequestration in Synechococcus sp: From Molecular Machines to Hierarchical Modeling” (http://www.genomes-to-life.org).
* Presenting author