29.10.2014 Views

advances-in-protein-chemistry

advances-in-protein-chemistry

advances-in-protein-chemistry

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

programm<strong>in</strong>g (IP) problem based on the contact map graph of the prote<strong>in</strong> 3D structure template. Furthermore, the IP<br />

formulation is then relaxed to a l<strong>in</strong>ear programm<strong>in</strong>g (LP) problem, and later resolved by the canonical branch-and-bound<br />

scheme. RAPTOR extensively outperforms other programs at the fold similarity level as shown by benchmark tests for fold<br />

recognition. RAPTOR was ranked as top 1 among <strong>in</strong>dividual prediction servers, <strong>in</strong> terms of the recognition capability and<br />

alignment accuracy for Fold Recognition (FR) family targets as evaluated by CAFASP3 [153].<br />

However, RAPTOR was significantly outperformed by RaptorX which is good at align<strong>in</strong>g prote<strong>in</strong>s with sparse sequence<br />

profile. RaptorX utilizes a statistical knowledge based method to design a new thread<strong>in</strong>g scor<strong>in</strong>g function, which aims at<br />

enhanced measur<strong>in</strong>g the compatibility between a target sequence and a template structure. Additionally, RaptorX also has<br />

a multiple-template thread<strong>in</strong>g component and conta<strong>in</strong>s a new module for alignment quality prediction. RaptorX <strong>in</strong>cludes<br />

three ma<strong>in</strong> components: s<strong>in</strong>gle-template thread<strong>in</strong>g, alignment quality prediction, and multiple-template thread<strong>in</strong>g.<br />

Initially, RaptorX aligns a specific target to each of its templates us<strong>in</strong>g the s<strong>in</strong>gle-template alignment algorithm. Then, the<br />

alignment quality is predicted and all the templates are ranked by this predicted quality <strong>in</strong> a descend<strong>in</strong>g order. If the target is<br />

not appropriate for multiple-template thread<strong>in</strong>g, RaptorX generates a 3D model for the target from the pairwise alignment<br />

by means of the highest predicted quality. Else, RaptorX runs multiple-template thread<strong>in</strong>g for the target and constructs a<br />

correspond<strong>in</strong>g 3D model. RaptorX does extremely well at the alignment of hard targets, which have less than 30% sequence<br />

identity with experimentally determ<strong>in</strong>ed structures <strong>in</strong> PDB. RaptorX outperformed all the CASP9 participat<strong>in</strong>g servers<br />

<strong>in</strong>clud<strong>in</strong>g those us<strong>in</strong>g consensus and ref<strong>in</strong>ement methods, bl<strong>in</strong>dly tested on the 50 hardest CASP9 template-based model<strong>in</strong>g<br />

targets [154,155] As an <strong>in</strong>put, RaptorX takes an am<strong>in</strong>o acid sequence and predicts its secondary and tertiary structures as<br />

well as solvent accessibility and disordered regions. It also allocates confidence scores like P-value for the relative global<br />

quality, GDT (global distance test) and uGDT (un-normalized GDT) for the absolute global quality, and RMSD for the<br />

absolute local quality of each residue <strong>in</strong> the model to <strong>in</strong>dicate the quality of a predicted 3D model [156].<br />

3.2.3 Ab <strong>in</strong>itio<br />

For a given prote<strong>in</strong> am<strong>in</strong>o acid sequence, the aim of Ab <strong>in</strong>itio structure prediction is to predict the 3D structure of<br />

its native state. A prote<strong>in</strong> sequence folds to a native conformation or a collection of conformations that is at or close to<br />

the global free-energy m<strong>in</strong>imum. Therefore, the difficulty <strong>in</strong> f<strong>in</strong>d<strong>in</strong>g native-like conformations for a given sequence has<br />

to address the development of a precise potential and an efficient method for search<strong>in</strong>g the resultant energy landscape<br />

[157]. Traditionally, the most acclaimed structure prediction methods have been homology-based comparative model<strong>in</strong>g<br />

and fold recognition [158]. These two approaches construct prote<strong>in</strong> models by align<strong>in</strong>g query sequences aga<strong>in</strong>st solved<br />

template structures. If close templates are recognized, high-resolution models could be generated by the template-based<br />

model<strong>in</strong>g. However, if templates are lack<strong>in</strong>g from the Prote<strong>in</strong> Data Bank (PDB), the models require to be built from scratch,<br />

i.e. Ab <strong>in</strong>itio fold<strong>in</strong>g, the most complicated category of prote<strong>in</strong>-structure prediction [159,160]. As the size of the prote<strong>in</strong><br />

<strong>in</strong>creases, the conformational phase space of sampl<strong>in</strong>g also <strong>in</strong>creases sharply mak<strong>in</strong>g the Ab <strong>in</strong>itio model<strong>in</strong>g of bigger<br />

prote<strong>in</strong>s exceed<strong>in</strong>gly tricky [161]. Current Ab <strong>in</strong>itio predictions are ma<strong>in</strong>ly focused on small prote<strong>in</strong>s. Exist<strong>in</strong>g Ab <strong>in</strong>itio<br />

predictions are primarily focused on small prote<strong>in</strong>s and quite a few successful examples have been described <strong>in</strong> literature<br />

[139]. In this chapter, we highlight some of the well known current methods for predict<strong>in</strong>g tertiary structure which employ<br />

Ab <strong>in</strong>itio methods <strong>in</strong> the absence of homology to a known structure.<br />

3.2.3.1 I-TASSER<br />

I-TASSER is a web-based hierarchical prote<strong>in</strong> structure modell<strong>in</strong>g method which employs the Profile-Profile thread<strong>in</strong>g<br />

Alignment (PPA) improved by secondary-structure [162] and the iterative implementation of the Thread<strong>in</strong>g ASSEmbly<br />

Ref<strong>in</strong>ement (TASSER) program [163]. In I-TASSER, the query sequences are <strong>in</strong>itially threaded through a representative<br />

PDB structure library (us<strong>in</strong>g 70% sequence identity as cut-off) to explore for the potential folds by four straightforward<br />

variants of PPA methods, with diverse comb<strong>in</strong>ations of the hidden Markov model [164] and PSI-BLAST [41] profiles<br />

and the Needleman-Wunsch [140] and Smith-Waterman [39] alignment algorithms. The cont<strong>in</strong>uous fragments are then<br />

elim<strong>in</strong>ated from the thread<strong>in</strong>g aligned regions which are used to reconstruct full-length models whereas the thread<strong>in</strong>g<br />

unaligned regions (e.g. loops) are built by Ab <strong>in</strong>itio model<strong>in</strong>g [165]. The replica-exchange Monte Carlo simulation [166]<br />

is used to search conformational space. SPICKER [167,168] is used to cluster the structure trajectories and the cluster<br />

centroids are acquired by the averag<strong>in</strong>g the coord<strong>in</strong>ates of the entire clustered structures.<br />

Fragment assembly simulation is implemented aga<strong>in</strong>, which beg<strong>in</strong>s from the cluster centroid of the <strong>in</strong>itial round<br />

simulation, so that the steric clashes on the centroid structures are excluded and promote additional model ref<strong>in</strong>ement.<br />

Spatial restra<strong>in</strong>ts are removed from the centroids and the PDB structures are explored by the structure alignment program<br />

TM-align [169], which are used to direct the second round of simulation. In the end, the structure decoys are clustered<br />

and the structures with the lowest possible energy <strong>in</strong> each cluster are selected, which has the Cα atoms and the side-cha<strong>in</strong><br />

centres’ of mass specified. Backbone atoms (N, C, O) are added by Pulchra [170] and Scwrl_3.0 [171] is used to build sidecha<strong>in</strong><br />

rotamers. The absolute quality of the f<strong>in</strong>al model <strong>in</strong> comparison with the native structure is given by TM-score.<br />

3.2.3.2 QUARK<br />

QUARK is a web-based Ab <strong>in</strong>itio prote<strong>in</strong> structure prediction method which focuses on the elaborate design of the<br />

force field and the search eng<strong>in</strong>e by tak<strong>in</strong>g a semi-reduced model to represent prote<strong>in</strong> residues by the full backbone atoms<br />

and the side-cha<strong>in</strong> center of mass [172]. Initially, QUARK predicts a wide range of carefully selected structural features<br />

by us<strong>in</strong>g neural network (NN) for a query sequence. Based on the idea borrowed from Rosetta and I-TASSER, the global<br />

fold is then created by replica-exchange Monte Carlo (REMC) simulations by assembl<strong>in</strong>g the small fragments as generated<br />

OMICS Group eBooks<br />

015

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!