13.07.2015 Views

The Genom of Homo sapiens.pdf


SYSTEMS BIOLOGY IN S. CEREVISIAE AND HALOBACTERIUM SP. 355

tectable primary sequence homology. Rosetta employs the following steps. (1) Proteins are subdivided into putative domains. (2) Three- and nine-residue local structure fragments are estimated based on local sequence similarity to corresponding three- and nine-residue segments from proteins of known structure. (3) These precomputed local structure fragments are assembled into global structures by minimizing a scoring function that favors hydrophobic burial/packing, strand-pairing, compactness, and energetically favorable residue pairings. Since the predictions from Rosetta are lower in resolution than experimentally solved structures, they need to be supported by other data types (microarrays, proteomics, predicted networks, etc.). In the third, fourth, and fifth community-wide critical assessments of structure prediction (CASP3, 4, and 5), Rosetta was one of the most effective methods employed (Bonneau et al. 2001). We are using Rosetta to predict the three-dimensional structures of all of the Halobacterium sp. proteins and protein domains of unknown function that are less than 150 residues in length.

When PSI-BLAST or Pfam hmmer searches yield only statistically marginal sequence similarities, structure-based methods such as Rosetta can help confirm or negate the putative relationships. For example, a Pfam search on the Halobacterium protein VNG0511H yields a weak hit (homology) (e-value: 0.61) to PF00392, a family of bacterial transcription factors. The Rosetta-predicted structure for this protein, which has a strong match to a CATH family of helix-turn-helix DNA-binding proteins (CATH I.D.: 1.10.10.10), supported this initial tentative predicted function.
CATH, like SCOP, is a hierarchical classification of protein domain structures. Thus, the agreement of these two procedures (weak Pfam and Rosetta) imparts a higher degree of confidence to this prediction not obtainable with either individual method. We can use the knowledge that VNG0551H is a DNA-binding protein to tentatively assign the role of a putative mediator of the changes observed in the gene(s) downstream to its position in the inferred gene regulatory network. Likewise, all function annotations that point to regulatory or DNA-binding roles should be examined in the context of one or more gene regulatory networks.

A second example, the protein VNG1302H, has a weak BLAST match (e-value: 0.091) to one of three domains of a protein disulfide isomerase from Cricetulus griseus. The Rosetta-predicted structure for the protein VNG1302H had a strong match to 1A8L (the structure for a protein disulfide oxidoreductase from the hyperthermophile Pyrococcus furiosus), which has a thioredoxin-

[Figure 6 image, panels A and B. Labels recovered from the image: trxB2, VNG0450C, trxB1_1, VNG5220G, txrB3, VNG1302H, Model 1: 1A8L-02 (Oxidoreductase), msrA, trh_1, trh_2, VNG5077G, trxB1_2, argS, trh1, VNG2468C, VNG2115H, VNG1012H, VNG2310H, VNG1546H, trxA1_1, trxA2, trxA1_2, VNG1302H, VNG5221G, VNG0711C, VNG5076G.]

Figure 6. Structure/function annotation for VNG1302H. (A) The physical interaction network for the Halobacterium sp. VNG1302H. Blue edges indicate domain fusion type interactions, green edges are yeast interaction data mapped onto Halobacterium sp. via the COG database. VNG1302H is indicated with the sign of the rising sun. All nodes shown have one or more direct connections to VNG1302H. Proteins/nodes in this subnetwork that are known to have redox-related functions are indicated in bold. These functions include thioredoxin, thioredoxin-like, and a peptide methionine sulfoxide reductase (msrA).
(B) Rosetta structure prediction for VNG1302H and the closest match in the PDB to this model. This fold similarity (note the structural alignment of two key cysteines, indicated by black spheres) indicates a thioredoxin (redox) function.
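
The network reasoning in panel A of Figure 6 can be illustrated with a minimal sketch: collect the direct neighbors of a query protein and report what fraction of them carry a given annotation. This is not the authors' software. The function names are hypothetical, only five of the figure's nodes are used, and the choice of which neighbors count as redox-related is an assumption made from their gene names (thioredoxin-type genes and msrA), since the bold marking is not recoverable from the text.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def annotated_neighbor_fraction(adjacency, query, annotated):
    """Return (fraction, hits): the share of the query's direct neighbors
    that belong to the annotated set, and which neighbors those are."""
    neighbors = adjacency.get(query, set())
    if not neighbors:
        return 0.0, set()
    hits = neighbors & annotated
    return len(hits) / len(neighbors), hits


# Subset of Figure 6 nodes; the caption states that every node shown
# connects directly to VNG1302H. Edge evidence types are omitted here.
edges = [
    ("VNG1302H", "trxB2"),
    ("VNG1302H", "trxA2"),
    ("VNG1302H", "msrA"),
    ("VNG1302H", "argS"),
    ("VNG1302H", "VNG0450C"),
]

# ASSUMPTION: redox membership is inferred from gene names for illustration.
redox_annotated = {"trxB2", "trxA2", "msrA"}

adjacency = build_adjacency(edges)
fraction, hits = annotated_neighbor_fraction(adjacency, "VNG1302H", redox_annotated)
print(f"{len(hits)} of {len(adjacency['VNG1302H'])} neighbors are redox-annotated "
      f"({fraction:.0%}): {sorted(hits)}")
```

A high fraction of redox-annotated neighbors would be read the same way as in the caption: as supporting context for the fold-based thioredoxin assignment, not as independent proof of function.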

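The argument that a marginal sequence hit and a strong fold match reinforce each other, as made for VNG0511H and VNG1302H, can be sketched as a simple decision rule. The e-values are those quoted in the text. Everything else is an assumption for illustration: the `Evidence` record, the `assess` function, the two e-value cutoffs, and the coarse function classes ("DNA-binding", "redox") used to compare the sequence hit with the structure match.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    protein: str
    seq_evalue: float     # e-value of the best Pfam or BLAST hit
    seq_class: str        # coarse function class implied by the sequence hit
    struct_strong: bool   # True if the predicted structure strongly matches a known fold
    struct_class: str     # coarse function class implied by the matched fold


def assess(ev, confident_evalue=1e-3, marginal_evalue=1.0):
    """Combine sequence and structure evidence into a verdict.

    Both cutoffs are hypothetical placeholders, not values from the text.
    """
    if ev.seq_evalue <= confident_evalue:
        return "sequence alone"
    marginal = ev.seq_evalue <= marginal_evalue
    if marginal and ev.struct_strong and ev.seq_class == ev.struct_class:
        return "tentative: sequence and structure agree"
    if marginal and ev.struct_strong:
        return "conflict: structure disagrees with sequence hit"
    return "unresolved"


examples = [
    # Pfam hit to PF00392 (e-value 0.61); fold match to CATH 1.10.10.10.
    Evidence("VNG0511H", 0.61, "DNA-binding", True, "DNA-binding"),
    # BLAST hit to a protein disulfide isomerase (e-value 0.091); fold match to 1A8L.
    Evidence("VNG1302H", 0.091, "redox", True, "redox"),
]

results = {ev.protein: assess(ev) for ev in examples}
for protein, verdict in results.items():
    print(protein, "->", verdict)
```

Under these assumed cutoffs both examples land in the agreement category, which mirrors the text's point that neither the weak sequence hit nor the low-resolution structure would justify the annotation alone.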