Databases and Systems


influenzae lacks the upper portion of the cycle shown in Figure 3. Interestingly, H. pylori is complementary, having only the upper portion and lacking the lower portion. Furthermore, examination of the KEGG ortholog group table shows that these two portions are coded in different operons whenever an operon structure is observed. These observations suggest that the TCA cycle is actually formed by two sets of pathways that are under different regulatory control mechanisms.

Network comparison

The four types of networks (Table 2) can be compared by using the search capabilities of KEGG (Table 4). For example, the genome-pathway comparison is done as follows. Starting from the genome map of a given organism, the user displays the area of interest in the enlarged window and asks where in the known biochemical pathways the genes in the window function. The query can be done by simply clicking on the button marked PATHWAY. A typical result would be a gene cluster in the genome forming a functional unit in the biochemical pathway; namely, the genes in the window code for a set of proteins in successive steps of, say, amino acid biosynthesis.
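The genome-pathway comparison described above can be illustrated with a minimal sketch. It is not KEGG's implementation: the gene names, the pathway labels, the step numbers, and the helper functions `pathways_in_window` and `successive_steps` are all hypothetical, and the gene-to-pathway annotation is assumed to be available as a simple lookup table.

```python
from collections import defaultdict

# Hypothetical toy data: gene order along a genome map.
GENOME = ["g1", "g2", "g3", "g4", "g5", "g6"]

# Hypothetical annotation: gene -> (pathway name, step index in that pathway).
GENE_TO_STEP = {
    "g2": ("amino acid biosynthesis (toy)", 1),
    "g3": ("amino acid biosynthesis (toy)", 2),
    "g4": ("amino acid biosynthesis (toy)", 3),
    "g6": ("other pathway (toy)", 5),
}


def pathways_in_window(genome, gene_to_step, start, end):
    """Group the genes in genome[start:end] by the pathway they function in."""
    hits = defaultdict(list)
    for gene in genome[start:end]:
        if gene in gene_to_step:
            pathway, step = gene_to_step[gene]
            hits[pathway].append((step, gene))
    # Sort each pathway's hits by step index.
    return {pathway: sorted(pairs) for pathway, pairs in hits.items()}


def successive_steps(step_gene_pairs):
    """Return True if the sorted steps form an unbroken run such as 1, 2, 3."""
    steps = [step for step, _ in step_gene_pairs]
    return len(steps) > 1 and all(b - a == 1 for a, b in zip(steps, steps[1:]))


# The "enlarged window" covers the genes g2..g5 in this toy genome.
result = pathways_in_window(GENOME, GENE_TO_STEP, 1, 5)
for pathway, pairs in result.items():
    print(pathway, pairs, "successive:", successive_steps(pairs))
```

In this toy case the window contains three adjacent genes that map to steps 1, 2, and 3 of the same pathway, which is the kind of gene cluster forming a functional unit that the text describes.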

Another example of network comparison involves a hierarchy versus biochemical pathways. For example, in the KEGG table of contents page, select the hierarchical classification (molecular catalog) of enzymes by SCOP 3D folds. By opening the third-level data for beta/alpha (TIM)-barrel in the hierarchy, the user can search all occurrences of TIM-barrel proteins against the known metabolic pathways. This is done by clicking on "Pathway Search by EC" and choosing "Search against 3D structures in PDB" to limit the search to only those enzymes with known structures. One of the results of this query is Phenylalanine, tyrosine and tryptophan biosynthesis, where the last steps of tryptophan biosynthesis are populated by TIM-barrel proteins, which suggests possible gene duplications in the evolution of pathway formation [1].
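The hierarchy-versus-pathway search can be sketched in the same spirit. The following is only an illustration under stated assumptions: the enzyme labels (`EC-A`, `EC-M`, and so on) are placeholders rather than real EC numbers, the pathways are invented, and `search_fold_against_pathways` is a hypothetical helper, not a KEGG function.

```python
# Hypothetical toy data: fold class -> enzyme labels (placeholders, not real EC numbers).
FOLD_TO_ECS = {"TIM-barrel (toy)": {"EC-A", "EC-B", "EC-C", "EC-X"}}

# Hypothetical set of enzymes that have a known 3D structure.
HAS_STRUCTURE = {"EC-A", "EC-B", "EC-C"}

# Hypothetical pathways, each an ordered list of enzyme steps.
PATHWAYS = {
    "pathway P (toy)": ["EC-M", "EC-N", "EC-A", "EC-B", "EC-C"],
    "pathway Q (toy)": ["EC-X", "EC-Y"],
}


def search_fold_against_pathways(fold, pathways, fold_to_ecs, has_structure=None):
    """Return, per pathway, the step positions occupied by enzymes of the given fold.

    If has_structure is given, only enzymes in that set are considered,
    mirroring the option that limits the search to known structures.
    """
    ecs = fold_to_ecs[fold]
    if has_structure is not None:
        ecs = ecs & has_structure
    hits = {}
    for name, steps in pathways.items():
        positions = [i for i, ec in enumerate(steps) if ec in ecs]
        if positions:
            hits[name] = positions
    return hits


all_hits = search_fold_against_pathways("TIM-barrel (toy)", PATHWAYS, FOLD_TO_ECS)
structure_hits = search_fold_against_pathways(
    "TIM-barrel (toy)", PATHWAYS, FOLD_TO_ECS, HAS_STRUCTURE
)
print(all_hits)
print(structure_hits)
```

With the structure filter applied, only the last three steps of the toy pathway P are reported, analogous to the run of TIM-barrel proteins at the end of tryptophan biosynthesis mentioned above.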

Pathway reconstruction with reference

In the traditional similarity search of individual genes (or proteins) against repositories of all known sequences, it is always problematic to determine an appropriate level of sequence similarity that can be extended to functional similarity. The prediction tools in KEGG incorporate an additional feature that is used for interpretation of sequence similarity; namely, the requirement for reconstructing a complete pathway or a complete functional unit from a set of genes or proteins. The reference for reconstruction is the set of pathway diagrams and the refined data set of ortholog group tables for a limited, but increasing, number of functional units. For example, by searching sequence similarities against the KEGG ortholog group table for ABC transporters using a set of consecutive genes in the genome as a query (an ABC transporter is often coded in an operon), a transporter can be reconstructed with prediction of substrate specificity according to the subgrouping of the ortholog group table.
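The idea of requiring a complete functional unit can be sketched as follows. This is a simplified illustration, not the KEGG prediction tool: the similarity search itself is assumed to have been run already and is represented by a hypothetical `hits` table, the three component names in `REQUIRED` are an assumed simplification of an ABC transporter, and the gene and subgroup labels are invented.

```python
# Assumed simplification: one transporter = one gene for each of these components.
REQUIRED = ("substrate-binding", "permease", "ATP-binding")

# Hypothetical query: consecutive genes along the genome.
GENES_IN_ORDER = ["a", "b", "c", "d"]

# Hypothetical result of a similarity search against an ortholog group table:
# gene -> (subgroup of the table, component matched).
HITS = {
    "a": ("sugar (toy)", "substrate-binding"),
    "b": ("sugar (toy)", "permease"),
    "c": ("sugar (toy)", "ATP-binding"),
    "d": ("peptide (toy)", "permease"),
}


def reconstruct(genes_in_order, hits, required=REQUIRED):
    """Find runs of consecutive genes that together cover every required
    component within a single subgroup of the ortholog group table."""
    found = []
    n = len(required)
    for i in range(len(genes_in_order) - n + 1):
        window = genes_in_order[i:i + n]
        if not all(gene in hits for gene in window):
            continue  # a gene with no hit breaks the functional unit
        subgroups = {hits[gene][0] for gene in window}
        components = {hits[gene][1] for gene in window}
        # Accept only a complete unit whose genes agree on one subgroup.
        if len(subgroups) == 1 and components == set(required):
            found.append((subgroups.pop(), window))
    return found


transporters = reconstruct(GENES_IN_ORDER, HITS)
print(transporters)
```

Here gene `d` has a hit on its own, but it is not reported because it does not complete a unit; only the run `a`, `b`, `c` is reconstructed, and its subgroup label stands in for the predicted substrate specificity.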
