bbc 2015

BeNeLux Bioinformatics Conference, Antwerp, December 7-8 2015
10th Benelux Bioinformatics Conference (bbc 2015)
Poster

P40. RIBOSOME PROFILING ENABLES THE DISCOVERY OF SMALL OPEN READING FRAMES (SORFS), A NEW SOURCE OF BIOACTIVE PEPTIDES

Volodimir Olexiouk 1,*, Jeroen Crappé 1, Steven Verbruggen 1 & Gerben Menschaert 1,*
1 Lab of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University.

INTRODUCTION
Evidence for micropeptides, defined as translation products of small open reading frames (sORFs), has recently emerged. Limitations of sequencing technologies and of proteomics long stalled the discovery of micropeptides. The advent of ribosome profiling (RIBO-SEQ), a next-generation sequencing technique that reveals the translation machinery at sub-codon resolution, provided evidence that sORFs are translated. RIBO-SEQ captures and subsequently sequences the approximately 30 nt mRNA fragments protected within ribosomes, providing a means to identify translated sORFs that possibly encode functional micropeptides. Since the advent of ribosome profiling, several micropeptides with important cellular functions have been described (e.g. Toddler, Pri-peptides, Sarcolipin and Myoregulin).

METHODS
RIBO-SEQ allows the identification of sORFs with ribosomal activity; however, downstream analysis is necessary to further assess their coding potential (the potential of sORFs to truly encode functional micropeptides). Here we propose a pipeline that starts from RIBO-SEQ data, implements state-of-the-art tools and metrics for assessing the coding potential of sORFs, and creates a list of candidate sORFs for downstream analysis (e.g. proteomic identification).
In summary, the assessment of the coding potential includes: PhyloCSF (conservation analysis), FLOSS score (length distribution analysis of ribosome-protected fragments (RPFs)), ORFscore (analysis of the distribution of RPFs towards the first frame of a coding sequence (CDS)), BLASTp (sequence similarity) and VarAn (genetic variation analysis). To help set a community standard and to make sORFs accessible to a larger audience, a public database (www.sorfs.org) is provided in which publicly available datasets were processed by this pipeline, allowing users to browse, query and export identified ORFs. Furthermore, a PRIDE-respin pipeline was developed to periodically search the PRIDE database for proteomic evidence.

RESULTS & DISCUSSION
The pipeline has been tested and curated on three cell lines: HCT116 (human), E14 mESC (mouse) and S2 (fruit fly). The results were similar to those reported in recent literature, which supports the relevance of the pipeline. All metrics listed above were carefully inspected for their biological relevance and contributed significantly to the detection of sORFs. The pipeline is currently being finalized but is available upon request. The public repository is accessible at http://www.sorfs.org and includes the datasets mentioned above, resulting in 263354 sORFs. Two query interfaces were implemented: a default query interface intended for browsing sORFs, and a BioMart query interface for advanced querying and export functions. Each sORF has its own detail page, which visualizes the metrics discussed above and the ribosome profiling data; a link to the UCSC browser is also provided to visualize the RIBO-SEQ data.

REFERENCES
Pauli,A., Norris,M.L., Valen,E., Chew,G.-L., Gagnon,J.A., Zimmerman,S., Mitchell,A., Ma,J., Dubrulle,J., Reyon,D., et al. (2014) Toddler: an embryonic signal that promotes cell movement via Apelin receptors. Science, 343, 1248636.
Crappé,J., Ndah,E., Koch,A., Steyaert,S., Gawron,D., De Keulenaer,S., De Meester,E., De Meyer,T., Van Criekinge,W., Van Damme,P., et al. (2014) PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration. Nucleic Acids Res., 10.1093/nar/gku1283.
Ingolia,N.T. (2014) Ribosome profiling: new views of translation, from single codons to genome scale. Nat. Rev. Genet., 15, 205–13.
Crappé,J., Van Criekinge,W., Trooskens,G., Hayakawa,E., Luyten,W., Baggerman,G. and Menschaert,G. (2013) Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs. BMC Genomics, 14, 648.
Chanut-Delalande,H., Hashimoto,Y., Pelissier-Monier,A., Spokony,R., Dib,A., Kondo,T., Bohère,J., Niimi,K., Latapie,Y., Inagaki,S., et al. (2014) Pri peptides are mediators of ecdysone for the temporal control of development. Nat. Cell Biol., 16.
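
Two of the metrics named in the P40 methods, ORFscore (preference of RPFs for the first reading frame) and the FLOSS score (RPF length distribution), lend themselves to a short illustration. The sketch below is not the pipeline's code: the function names are hypothetical, and the formulas (a log2-transformed chi-square against a uniform frame distribution, and half the L1 distance between length histograms) are assumptions about how such statistics are commonly computed, not details given in the abstract.

```python
import math
from collections import Counter


def frame_counts(rpf_positions, cds_start):
    """Count RPF positions per reading frame relative to the CDS start."""
    counts = [0, 0, 0]
    for pos in rpf_positions:
        counts[(pos - cds_start) % 3] += 1
    return counts


def frame_preference_score(counts):
    """ORFscore-style statistic (assumed form).

    log2 of (chi-square against a uniform frame distribution + 1),
    made negative when the first frame is not the dominant one.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = total / 3.0
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    score = math.log2(chi2 + 1.0)
    if counts[0] < counts[1] or counts[0] < counts[2]:
        score = -score
    return score


def length_distribution_distance(orf_lengths, reference_lengths):
    """FLOSS-style statistic (assumed form).

    Half the L1 distance between two normalized fragment-length
    histograms: 0 means identical, 1 means no overlap at all.
    """
    orf = Counter(orf_lengths)
    ref = Counter(reference_lengths)
    n_orf, n_ref = sum(orf.values()), sum(ref.values())
    if n_orf == 0 or n_ref == 0:
        raise ValueError("both length lists must be non-empty")
    lengths = set(orf) | set(ref)
    return 0.5 * sum(abs(orf[k] / n_orf - ref[k] / n_ref) for k in lengths)
```

Under these assumptions, a candidate sORF whose fragments fall mostly in the first frame and whose fragment lengths resemble those of annotated coding sequences gets a high frame score and a low length-distribution distance. The thresholds for calling a candidate would have to come from the pipeline itself.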
P41. RIGAPOLLO, A HMM-SVM BASED APPROACH TO SEQUENCE ALIGNMENT

Gabriele Orlando 1,2,3,4, Wim Vranken 1,2,3 & Tom Lenaerts 1,4,5
1 Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, CP 263
2 Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2
3 Structural Biology Research Center, VIB, 1050 Brussels, Belgium
4 Machine Learning group, Université Libre de Bruxelles, Brussels, 1050, Belgium
5 Artificial Intelligence lab, Vrije Universiteit Brussel, Brussels, 1050, Belgium

INTRODUCTION
Producing reliable protein alignments is a central problem for many bioinformatics applications, such as homology modelling. Over the years, many different algorithms have been developed, and different kinds of information have been used to align very divergent sequences [1]. Here we present a pairwise alignment tool, called Rigapollo, based on a pairwise HMM-SVM that includes backbone dynamics predictions [2] in the alignment process: recent work suggests that protein backbone dynamics are often evolutionarily conserved and contain information orthogonal to amino acid conservation.

METHODS
Rigapollo uses a pairwise HMM-SVM alignment approach to infer the optimal alignment between two proteins, taking into consideration both sequence and dynamics information. The model (described in Figure 1) is composed of 3 states: M (match), G1 (gap in the first sequence) and G2 (gap in the second sequence). The transition probabilities are defined in the same way as in a standard HMM. The tool is further designed as follows.

Defining the N-dimensional feature vectors: Each amino acid in the sequences is described by an N-dimensional feature vector.
That vector can be defined using any kind of information, ranging from evolutionary information (e.g. a PSSM calculated with HHblits [3]) to dynamics predictions (using the DynaMine predictor [2]). While standard pairwise HMMs require the definition of a finite and discrete alphabet of observable states, our model works directly on these feature vectors (which may or may not be orthonormal), evaluating the emission probability with a support vector machine (SVM).

Definition of the emission probability: We define the emission probability using an SVM trained to discriminate matches from mismatches. We define as matches all the positions in the reference pairwise alignments that do not contain gaps, and we describe each of them by the concatenation of the previously defined feature vectors of the two aligned residues. These matches are considered positive examples. For the mismatches, we follow the same procedure but pair positions that, in the reference alignment, are shifted by a number of amino acids varying between 5 and 10. After training, the predicted emission probability for the M state, given the concatenation of two feature vectors, is a function f(D) of the distance D from the decision hyperplane of the SVM. The corresponding emission probabilities for the states G1 and G2 are modeled as 1-f(D).

RESULTS & DISCUSSION
To evaluate the performance of Rigapollo, we adopted two publicly available subsets of the BAliBASE and SABmark alignment datasets, already used to evaluate other pairwise alignment tools [1]. All pairwise alignments were extracted from the MSAs, and those whose percentage of sequence identity was equal to the median of the full database were placed in the subset. The datasets consist of 38 and 123 manually curated, structure-based pairwise alignments, respectively, and share very low sequence identity.
To evaluate the performance, we performed a 10-fold randomized cross-validation. Rigapollo increases the quality of low-sequence-identity pairwise alignments by 5 to 10% with respect to state-of-the-art methods, and the increase in performance appears to be more marked for very divergent sequences, such as those in the SABmark dataset, where the dynamics information seems to significantly increase the quality of the alignment. This is probably because dynamics are often well conserved in functional patterns, even when the sequence is not preserved [2].

Figure 1: Structure of the pairwise HMM-SVM model

REFERENCES
[1] Do, Chuong B., et al. Research in Computational Molecular Biology. Springer Berlin Heidelberg, 2006.
[2] Cilia, Elisa, et al. Nucleic Acids Research 42.W1 (2014): W264-W270.
[3] Remmert, Michael, et al. Nature Methods 9.2 (2012): 173-175.
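
The three-state model in the P41 methods (M, G1, G2) with SVM-derived emissions can be illustrated with a small Viterbi decoder. This is a minimal sketch under stated assumptions, not the Rigapollo implementation: the logistic form of f(D), the transition parameters `delta` (gap opening) and `epsilon` (gap extension), and the choice to evaluate gap emissions 1-f(D) on the most recently consumed residue pair are all assumptions made for illustration. The matrix `decision` stands in for SVM decision values that in practice would come from a trained classifier (for example scikit-learn's `SVC.decision_function`).

```python
import math

NEG_INF = float("-inf")


def match_probability(d):
    """Map a decision value D to f(D) in (0, 1) with a logistic function (assumed form)."""
    d = max(-50.0, min(50.0, d))          # avoid overflow in exp()
    p = 1.0 / (1.0 + math.exp(-d))
    return min(max(p, 1e-9), 1.0 - 1e-9)  # clip so that log() stays finite


def viterbi_align(decision, delta=0.1, epsilon=0.3):
    """Return the most probable list of aligned (i, j) residue pairs.

    decision[i][j] is the decision value D for residue i of sequence 1
    paired with residue j of sequence 2 (0-based indices).
    """
    n, m = len(decision), len(decision[0])
    log_mm = math.log(1.0 - 2.0 * delta)   # M -> M
    log_open = math.log(delta)             # M -> G1 or M -> G2
    log_ext = math.log(epsilon)            # G1 -> G1 or G2 -> G2
    log_close = math.log(1.0 - epsilon)    # G1 -> M or G2 -> M

    states = ("M", "G1", "G2")
    # score[s][i][j]: best log-probability after consuming i and j residues, ending in s
    score = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in states}
    score["M"][0][0] = 0.0  # start in M before any residue is consumed

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            # Assumption: gap states reuse D of the last consumed pair (clipped at 0).
            f = match_probability(decision[max(i - 1, 0)][max(j - 1, 0)])
            log_match, log_gap = math.log(f), math.log(1.0 - f)
            if i > 0 and j > 0:  # M consumes one residue from each sequence
                best, prev = max([
                    (score["M"][i - 1][j - 1] + log_mm, "M"),
                    (score["G1"][i - 1][j - 1] + log_close, "G1"),
                    (score["G2"][i - 1][j - 1] + log_close, "G2"),
                ])
                score["M"][i][j], back["M"][i][j] = best + log_match, prev
            if j > 0:  # G1: gap in sequence 1, consume a residue of sequence 2
                best, prev = max([
                    (score["M"][i][j - 1] + log_open, "M"),
                    (score["G1"][i][j - 1] + log_ext, "G1"),
                ])
                score["G1"][i][j], back["G1"][i][j] = best + log_gap, prev
            if i > 0:  # G2: gap in sequence 2, consume a residue of sequence 1
                best, prev = max([
                    (score["M"][i - 1][j] + log_open, "M"),
                    (score["G2"][i - 1][j] + log_ext, "G2"),
                ])
                score["G2"][i][j], back["G2"][i][j] = best + log_gap, prev

    # Trace back from the best-scoring state at the end of both sequences.
    state = max(states, key=lambda s: score[s][n][m])
    i, j, pairs = n, m, []
    while i > 0 or j > 0:
        prev = back[state][i][j]
        if state == "M":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif state == "G1":
            j -= 1
        else:
            i -= 1
        state = prev
    return pairs[::-1]
```

Because the decoder only needs a matrix of decision values, the feature vectors can hold any mix of evolutionary and dynamics information without changing the alignment step, which mirrors the separation between feature definition and emission scoring described in the methods.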