advances-in-protein-chemistry

entire set of comprehensive, pre-computed structural alignments is obtained with the VAST algorithm [36], whereas the visualization tools for 3D structures and structure/sequence alignments are provided by the molecular graphics viewer Cn3D [37]. As of April 29, 2013, MMDB held 89,571 structure records in total, including 22,084 records of proteins only, 715 of DNA only and 508 of RNA only. MMDB also contains 2,571 protein-DNA complexes, 1,113 protein-RNA complexes and 116 protein-DNA-RNA complexes, in addition to more than 60,000 proteins bound to chemicals.

2.2.2.3 ModBase

ModBase is a database of annotated, homology-based protein structure models. Models in ModBase are generated by an automated software pipeline for comparative protein structure modeling, known as ModPipe [3], which relies mostly on modules of Modeller [38]. For fold assignment and target–template alignment, ModPipe uses sequence–sequence [39], sequence–profile [40,41] and profile–profile [40,42] methods with an E-value cut-off of 1.0 to increase the chance of identifying the best available template structure. Ten models are generated [38] for each target–template alignment, and the model with the best Discrete Optimized Protein Energy (DOPE) statistical potential [43] score is selected and further assessed by several additional quality criteria: (i) target–template sequence identity, (ii) GA341 score [44], (iii) Z-DOPE score [43], (iv) ModPipe Quality Score (MPQS) and (v) TSVMod score [45]. Because of the rapid growth of the public sequence databases, models in ModBase are organized into data sets that are useful for specific projects. Currently, ModBase includes 27,288,148 models and 4,332,658 unique modeled sequences from more than 50 complete genomes [46].
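The selection step described above (several candidate models per alignment, one kept according to its DOPE score) can be illustrated with a minimal sketch. It assumes the usual convention that a lower (more negative) DOPE value indicates a better model. The `CandidateModel` class, the model names and all score values are hypothetical and are not taken from ModPipe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateModel:
    """One candidate model built from a target-template alignment (hypothetical record)."""
    name: str            # illustrative label, not a real ModBase identifier
    dope: float          # DOPE score; lower is assumed to be better
    seq_identity: float  # target-template sequence identity in percent


def select_best_model(models):
    """Return the candidate with the lowest DOPE score."""
    if not models:
        raise ValueError("at least one candidate model is required")
    return min(models, key=lambda m: m.dope)


# Made-up scores for three of the candidates, for illustration only.
candidates = [
    CandidateModel("model_01", dope=-31250.4, seq_identity=42.0),
    CandidateModel("model_07", dope=-32890.1, seq_identity=42.0),
    CandidateModel("model_10", dope=-30110.9, seq_identity=42.0),
]

best = select_best_model(candidates)
print(best.name, best.dope)
```

In a real pipeline the selected model would then be checked against the additional quality criteria listed above; those checks are omitted here because their thresholds are not given in the text.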
ModBase can be queried through its web interface using UniProtKB [47] and GI [48] identifiers, gene names, annotation keywords, PDB codes [49], data set names, organism names, sequence similarity to the modeled sequences (BLAST [41]) and model-specific criteria such as model reliability, model size and target–template sequence identity. The coordinate and alignment files can also be retrieved as text files [50].

2.2.2.4 SWISS-MODEL Repository

SWISS-MODEL Repository is a database of 3D protein structure models constructed with the SWISS-MODEL homology-modelling pipeline from protein sequences in the UniProt database [47]. The SWISS-MODEL pipeline integrates several steps: template selection, alignment of the target sequence with the template structure, model building, energy minimization and/or refinement, and model quality assessment [51]. Target sequences are uniquely identified by the MD5 cryptographic hash of the full-length raw amino acid sequence, which reduces the redundancy among protein sequence database entries and in turn assists cross-referencing between databases that use different accession codes. The current SWISS-MODEL Repository release contains 3,143,784 model entries for 2,286,870 unique sequences in the UniProt database (2013_02). The database can be queried for particular proteins using various database accession codes (e.g. UniProt AC and ID, GenBank, IPI, RefSeq) or directly with the protein amino acid sequence. For a given query protein, a graphical outline shows the segments for which models or experimental structures are available. SWISS-MODEL Repository users can review the quality of the models in the database, search for alternative template structures, and construct models interactively using the SWISS-MODEL Workspace [52]. The Repository is updated regularly to reflect the growth of the sequence and structure databases.

3. 3D Structure Prediction

3.1 A brief history of molecular modelling

The first homology-based model dates back to 1969, when a wire-and-plastic model of the bonds and atoms of α-lactalbumin was constructed using the coordinates of hen egg-white lysozyme and physically adjusting those amino acids that did not match the structure [53]. The two proteins share 39% sequence identity. Afterwards, the crystal structure of lysozyme was used to generate a model for α-lactalbumin [54]. These models were created by taking the existing coordinates of the known structure and mutating the side chains that were not identical in the protein to be modeled. This approach to protein modeling is still used with substantial success, particularly when the proteins share a considerable degree of sequence similarity [55].

McLachlan and Shotton [56] used the structures of mammalian chymotrypsin and elastase to model the structure of α-lytic proteinase of the bacterium Myxobacter 495. The modeling was difficult because the sequence similarity between the target and the templates was only about 18%. Subsequently, the crystal structure of α-lytic proteinase was determined and compared with the homology model [57]. Although the domains of the model had been constructed accurately, misalignment of the sequences had led to local errors.

Modeling of variable regions on the basis of equivalent regions from homologous proteins of known structure was introduced next [58,59]. To construct homology models of various serine proteases, the structures of trypsin, chymotrypsin and elastase were superimposed, and many equivalent Cα atoms were found to lie within 1.0 Å of one another. The regions comprising the residues of these Cα atoms were termed structurally conserved regions (SCRs). All remaining sites correspond to structurally variable regions (VRs), or loop regions, where the insertions and deletions were located. The backbone of the SCRs and VRs was generated from fragments of known serine proteases, whereas the side chains were modeled on the conformations found at the equivalent positions for identical side chains in the known structures.

OMICS Group eBooks 08
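The SCR definition given above (equivalent Cα atoms of superimposed structures lying within 1.0 Å of one another) can be expressed as a short sketch. It assumes the structures have already been superimposed and that position `i` in every coordinate list refers to the same equivalent residue. The function names, the `min_length` parameter and the toy coordinates are illustrative assumptions, not part of the original method.

```python
from itertools import combinations
from math import dist


def conserved_positions(structures, cutoff=1.0):
    """Flag positions whose C-alpha atoms all lie within `cutoff` angstroms of one another.

    `structures` is a list of equal-length lists of (x, y, z) tuples, one list
    per superimposed structure, indexed by equivalent residue position.
    """
    n = len(structures[0])
    if any(len(s) != n for s in structures):
        raise ValueError("all structures must have the same number of positions")
    flags = []
    for i in range(n):
        atoms = [s[i] for s in structures]
        flags.append(all(dist(a, b) <= cutoff for a, b in combinations(atoms, 2)))
    return flags


def group_regions(flags, min_length=1):
    """Merge consecutive conserved positions into (start, end) index ranges (inclusive)."""
    regions, start = [], None
    for i, conserved in enumerate(flags):
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    if start is not None and len(flags) - start >= min_length:
        regions.append((start, len(flags) - 1))
    return regions


# Toy coordinates for three superimposed structures (made up for illustration).
a = [(3.8 * i, 0.0, 0.0) for i in range(6)]
b = [(3.8 * i, 0.4, 0.0) for i in range(6)]
b[3] = (3.8 * 3, 2.5, 0.0)  # position 3 deviates by more than 1.0 angstrom
c = [(3.8 * i, 0.0, 0.5) for i in range(6)]

flags = conserved_positions([a, b, c])
print(group_regions(flags, min_length=2))  # [(0, 2), (4, 5)]
```

Positions not covered by any returned range would correspond to the variable or loop regions in this simplified picture.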
Furthermore, the initial models of the aspartic proteinase renin and of renin-inhibitor complexes were built using the 3D structures of the remotely related fungal proteinases [60-62]. Later, models of renin were constructed using the structures of the mammalian aspartic proteases pepsin and chymosin [63,64]. Comparison of the renin models based on fungal and mammalian templates revealed that the inaccuracies arose from differences in the arrangement of helices and strands between the fungal and mammalian proteinases, as well as from slightly different variable regions. The modeling of the renin active site, on the other hand, was reasonably accurate [65].

In the early 1980s, manual homology modeling was aided by the manipulation of protein molecules on a graphics terminal, made possible by computer programs such as FRODO [66]. Since then, many homology modeling packages have been developed [42]. They can be grouped into three classes: rigid-body assembly, segment matching, and modeling by satisfaction of spatial restraints [67].

3.2 Protein 3D structure prediction tools

The prediction of protein 3D structures remains an exceedingly complicated and uncertain undertaking. These difficulties can, however, be addressed to a certain extent with state-of-the-art tools developed over the years. These tools (Table 2) employ either homology-based methods or, where no significant similarity is found, ab initio methods.

SNo. | Software | Link | Description
1. | MODELLER | http://salilab.org/modeller/ | Satisfaction of spatial restraints
2. | SWISS-MODEL | http://swissmodel.expasy.org/ | Local similarity/fragment assembly
3. | 3D-JIGSAW | http://bmm.cancerresearchuk.org/~3djigsaw/ | Fragment assembly
4. | ROBETTA | http://robetta.bakerlab.org/ | Rosetta homology modeling and ab initio fragment assembly with Ginzu domain prediction
5. | RaptorX | http://raptorx.uchicago.edu/ | Remote homology detection, protein 3D modeling, binding site prediction
6. | ESyPred3D | http://www.unamur.be/sciences/biologie/urbm/bioinfo/esypred/ | Template detection, alignment, 3D modeling
7. | HHpred | http://toolkit.tuebingen.mpg.de/hhpred | Template detection, alignment, 3D modeling
8. | EasyModeller | NA | GUI to MODELLER
9. | CPHModel | http://www.cbs.dtu.dk/services/CPHmodels/ | Fragment assembly
10. | BHAGEERATH-H | http://www.scfbio-iitd.res.in/bhageerath/bhageerath_h.jsp | Combination of ab initio folding and homology methods
11. | GeneSilico | https://genesilico.pl/meta2 | 'Meta-server' for 3D structure prediction
12. | Geno3D | http://geno3d-pbil.ibcp.fr/cgi-bin/geno3d_automat.pl?page=/GENO3D/geno3d_home.html | Predicts 3D models based on distance geometry, simulated annealing and energy minimization algorithms
13. | PSIPRED Protein Sequence Analysis Workbench | http://bioinf.cs.ucl.ac.uk/psipred/ | Several high-quality protein structure prediction and function annotation algorithms
14. | I-TASSER | http://zhanglab.ccmb.med.umich.edu/I-TASSER/ | Combination of ab initio folding and threading methods
15. | QUARK | http://zhanglab.ccmb.med.umich.edu/QUARK/ | Monte Carlo fragment assembly
16. | MUSTER | http://zhanglab.ccmb.med.umich.edu/MUSTER/ | Profile-profile alignment
17. | SPARKS-X | http://sparks.informatics.iupui.edu/yueyang/sparks-x/ | 3D structure modeling by fold recognition according to sequence profiles and structural profiles
18. | Phyre and Phyre2 | http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index | Remote template detection, alignment, 3D modeling, multi-templates, ab initio
19. | PEP-FOLD | http://bioserv.rpbs.univ-paris-diderot.fr/PEP-FOLD/ | De novo structure prediction of linear and disulfide-bonded cyclic peptides

Table 2: List of protein 3D structure prediction tools.

3.2.1 Homology modeling

The 3D structure of a protein provides invaluable information about its function and allows the efficient design of experiments such as site-directed mutagenesis, studies of disease-related mutations or structure-based drug design [68]. Traditional approaches to determining the 3D structure of a protein include X-ray crystallography and NMR spectroscopy. Other theoretical methods have not shown much promise in providing high-resolution information for the bulk of proteins. The number of structurally characterized proteins is very small compared with the number of known protein sequences. As of May 07, 2013, there were 90,424 structures in the PDB (http://www.rcsb.org/pdb/home/home.do), far fewer than the 540,052 sequence entries in UniProtKB/Swiss-Prot (http://www.uniprot.org/) as of May 01, 2013. Despite the immense growth of structural genomics efforts, it is unrealistic to expect that the structures of all these hundreds of thousands of proteins can be determined experimentally. Homology modeling (also known as comparative modeling) therefore offers the only feasible way to obtain structural information for such a large number of proteins [69]. A prerequisite for successful model building is the availability of at least one experimentally determined 3D structure, known as the template, that shares significant amino acid sequence similarity with the target sequence [68].
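Target–template sequence identity, referred to repeatedly above, can be computed from a pairwise alignment with a few lines of code. The sketch below assumes one common convention, counting identical residues over the columns in which neither sequence has a gap; other denominators (for example the length of the shorter sequence) are also in use and give different values. The function name and the two aligned sequences are made up for illustration.

```python
def percent_identity(target_aln, template_aln, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    aligned = identical = 0
    for t, s in zip(target_aln, template_aln):
        if t == gap or s == gap:
            continue  # skip insertion/deletion columns
        aligned += 1
        if t.upper() == s.upper():
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Hypothetical gapped alignment of a target and a template (not real proteins).
target = "MKT-AYIAKQR"
template = "MKSGAYLA-QR"

print(f"{percent_identity(target, template):.1f}% identity")  # 77.8% identity
```

Here 7 of the 9 gap-free columns are identical, which gives the value shown.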
The main steps required to create a homology-based model are summarized in Figure 2 and include: (1) identification of homologs that can be used as templates for modeling; (2) target–template sequence alignment; (3) building a model for the target based on the information from the alignment; and (4) evaluation of the model [70,71]. These modeling steps usually require extensive expertise in structural biology and the use of highly specialized computational tools [72]. Some of these frequently used homology-based modeling tools are summarized below.