bbc 2015

BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015
Abstract ID: O20
Oral presentation
10th Benelux Bioinformatics Conference bbc 2015

O20. MULTI-OMICS INTEGRATION: RIBOSOME PROFILING APPLICATIONS

Volodimir Olexiouk 1, Elvis Ndah 1, Sandra Steyaert 1, Steven Verbruggen 1, Eline De Schutter 1, Alexander Koch 1, Daria Gawron 2, Wim Van Criekinge 1, Petra Van Damme 2, Gerben Menschaert 1,*.

Lab of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University 1; Dept. Medical Protein Research, VIB-Ghent University 2.
* Gerben.menschaert@ugent.be

Ribosome profiling is a relatively new NGS technology that enables genome-wide monitoring of the in vivo synthesis of mRNA-encoded translation products. The technique, sometimes referred to as RIBOseq, exploits the fact that translating ribosomes protect mRNA fragments from nuclease digestion, which allows the genomic positions of translating ribosomes to be determined with sub-codon to single-nucleotide precision. Since the advent of the technology, several bioinformatics solutions have been devised to investigate this type of data. Here we present several solutions to detect novel proteoforms by combining RIBOseq and mass spectrometry data, to detect putatively coding small open reading frames (sORFs), and to evaluate the impact of DNA and RNA methylation on the translation level.

INTRODUCTION

Integration of different OMICS technologies is routinely adopted to investigate biological systems. Our lab focuses on high-throughput data analysis and the development of novel data integration methodologies. Our current focus is on ribosome profiling (Ingolia et al., 2011), an NGS-based technique to measure the so-called translatome (i.e. the mRNA that shows ribosome occupancy).
This technique is applied in combination with other sequencing-based protocols to measure expression (RNAseq), with mass spectrometry to measure translation products, and with protocols that chart maps of regulatory elements such as DNA methylation (reduced representation bisulfite sequencing, RRBS) and RNA methylation (m6Aseq) to address several biological questions.

METHODS

For the integration of RIBOseq and mass spectrometry (MS), we devised a tool called PROTEOFORMER (www.biobix.be/proteoformer). This proteogenomics tool consists of several steps. It starts with the mapping of ribosome-protected fragments (RPFs) and quality control of the resulting alignments. It further includes modules for the identification of transcripts undergoing protein synthesis, of translation initiation positions with sub-codon specificity, and of single nucleotide polymorphisms (SNPs). We used PROTEOFORMER to create protein sequence search databases from publicly available mouse and in-house human RIBOseq experiments and evaluated these with matching proteomics data (Crappé et al., 2015).

Another pipeline based on RIBOseq data is built around the discovery of putatively coding small open reading frames (sORFs). Here, the first step is to delineate sORFs based on RPF coverage throughout the coding sequence and at the translation initiation site. Afterwards, state-of-the-art tools and metrics assessing the coding potential of sORFs are applied, and a list of candidate sORFs is compiled for downstream analysis (e.g. MS-based identification).

To assess the impact of DNA methylation at the translation level, a double knockout DNMT model was studied (WT and DNMT1 + 3B knockout HCT116 cell line). Genome-wide DNA methylation profiling was performed using RRBS, while ribosome profiling, quantitative shotgun proteomics and positional proteomics (N-terminal COFRADIC) were used to obtain protein expression data.
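The sORF delineation step described above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the function name `find_sorfs`, the restriction to ATG start codons, the 100-codon length limit, and the coverage threshold are all assumptions chosen for illustration, and the abstract does not state which criteria the pipeline uses.

```python
from typing import List, Sequence, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(
    seq: str,
    coverage: Sequence[int],
    max_codons: int = 100,                 # hypothetical sORF length limit
    min_covered_fraction: float = 0.5,     # hypothetical coverage threshold
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based and end-exclusive, of candidate sORFs.

    A candidate is kept when (a) it runs from an ATG to the first in-frame
    stop codon, (b) it has at most `max_codons` codons excluding the stop,
    (c) there is RPF signal on the start codon, and (d) a sufficient
    fraction of its positions is covered by at least one RPF.
    """
    seq = seq.upper()
    if len(seq) != len(coverage):
        raise ValueError("sequence and coverage must have the same length")

    candidates = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                n_codons = (end - start) // 3 - 1  # stop codon not counted
                orf_cov = coverage[start:end]
                covered = sum(1 for c in orf_cov if c > 0) / len(orf_cov)
                has_tis_signal = sum(coverage[start:start + 3]) > 0
                if (n_codons <= max_codons and has_tis_signal
                        and covered >= min_covered_fraction):
                    candidates.append((start, end))
                break
    return candidates
```

Candidates returned by such a filter would still need the coding-potential assessment mentioned in the text before being passed on to downstream analysis.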
An initial experiment to integrate m6Aseq (measuring the m6A epitranscriptome) and ribosome profiling has also been performed on HCT116 cells.

RESULTS & DISCUSSION

The RIBOseq-MS integration (through PROTEOFORMER) increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse respectively, and enables proteome-wide detection of 5’-extended proteoforms, upstream ORF (uORF) translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.

The sORF pipeline was tested and curated on three different cell lines (HCT116: human, E14 mESC: mouse, and S2: fruit fly). The public repository has been made available at www.sorfs.org (Olexiouk V. et al., in review), and so far includes the datasets mentioned above.

In the study of the effect of DNA methylation at the proteome level in the DNMT double knockout, we found that the knockout cells show more significantly upregulated than downregulated genes, and that these upregulated genes were characterized by higher levels of promoter methylation in the wild-type cells. Both the MS and RIBOseq analyses corroborated these findings.

Preliminary results based on the m6A sequencing confirm previous findings on known m6A sequence motifs and on the enrichment of m6A sites in specific functional regions (around translation start sites and in 3’UTR regions). Moreover, after integrating m6A and RIBOseq data, some examples hint at an effect of m6A on ribosomal pausing.

REFERENCES

Ingolia N. et al. Cell 11;147(4):789-802 (2011).
Crappé, J., Ndah, E. et al. NAR 11;43(5):e29 (2015).
Abstract ID: O21
Oral presentation

O21. CLUB-MARTINI: SELECTING FAVORABLE INTERACTIONS AMONGST AVAILABLE CANDIDATES: A COARSE-GRAINED SIMULATION APPROACH TO SCORING DOCKING DECOYS

Qingzhen Hou 1*, Kamil K. Belau 2, Marc F. Lensink 3, Jaap Heringa 1 & K. Anton Feenstra 1*.

Center for Integrative Bioinformatics VU (IBIVU), VU University Amsterdam, De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands 1; Intercollegiate Faculty of Biotechnology, University of Gdańsk - Medical University of Gdańsk, Kładki 24, 80-822 Gdańsk, Poland 2; Institute for Structural and Functional Glycobiology (UGSF), CNRS UMR8576, FRABio FR3688, University Lille, 59000, Lille, France 3.

Protein-protein interactions (PPIs) play a central role in all cellular processes. Large-scale identification of native binding orientations is essential to understand the role of particular protein-protein interactions in their biological context. We estimate the binding free energy using coarse-grained simulations with the MARTINI forcefield, and use these estimates to rank decoys for 15 CAPRI benchmark targets. In our top 100 and top 10 ranked structures, for the 'easier' targets that have many near-native conformations, we obtain a strong enrichment of structures of acceptable or better quality; for the 'hard' targets with very few near-native complexes among the decoys, our method is still able to retain structures that have native interface contacts. Moreover, CLUB-MARTINI is rather precise for some targets and able to pinpoint near-native binding modes in top 1, 5, 10 and 20 selections.

INTRODUCTION

Measuring binding free energy is essential to understand the relevance of particular protein-protein interactions in their biological context. Moreover, at the atomic scale, molecular simulations give us insight into the physically realistic details of these interactions.
In our recent study, we successfully applied coarse-grained molecular dynamics simulations to estimate binding free energy with an accuracy similar to that of full atomistic simulation while being 500-fold less time-consuming (May et al., 2014). The approach relied on the availability of crystal structures of the protein complex of interest. Here, we investigate the effectiveness of this approach as a scoring method to identify stable binding conformations among the decoys produced by protein docking. We apply our method to rank more than 19 000 docked protein conformations, or ‘decoys’, for 15 benchmark targets from the Critical Assessment of PRedicted Interactions (CAPRI) (Lensink & Wodak, 2014).

METHODS

For each target, the binding free energy of all decoys was calculated using the MARTINI forcefield as introduced before (May et al., 2014). In short, for a set of closely spaced separation distances, we calculate the constraint force applied to maintain the set distance. Integrating this force over the separation distance yields a potential of mean force (PMF), from which the binding free energy is extracted as the highest minus the lowest value. Previously, for accuracy, we used up to 20 replicate simulations for each distance in the PMF, but for efficiency, here we initially use only a single replicate. We then selected the lowest-scoring half to run an additional four replicates to obtain better sampling and more accurate estimates of the binding free energy. In total, we used approximately 800 000 core-hours of compute time.

RESULTS & DISCUSSION

We obtained strong enrichment of acceptable and high-quality structures in the top 100 based on our PMF free energies, as shown in Figure 1. We estimate the error of our energies to be significant. This can be improved by increasing sampling, but that remains very expensive.
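The PMF procedure summarized in the METHODS can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function names, the use of trapezoidal integration, and the sign convention of the integral are assumptions. The binding free energy, taken as the highest minus the lowest PMF value, does not depend on the sign convention or on the integration constant.

```python
from typing import List, Sequence


def mean_over_replicates(replicate_forces: Sequence[Sequence[float]]) -> List[float]:
    """Average the constraint force at each distance over all replicates."""
    n_replicates = len(replicate_forces)
    return [sum(column) / n_replicates for column in zip(*replicate_forces)]


def potential_of_mean_force(
    distances: Sequence[float], mean_forces: Sequence[float]
) -> List[float]:
    """Cumulative trapezoidal integral of the mean force over distance.

    The PMF is set to zero at the first distance (arbitrary constant).
    """
    if len(distances) != len(mean_forces) or len(distances) < 2:
        raise ValueError("need at least two matching distance/force points")
    pmf = [0.0]
    for i in range(1, len(distances)):
        step = distances[i] - distances[i - 1]
        average_force = 0.5 * (mean_forces[i] + mean_forces[i - 1])
        pmf.append(pmf[-1] + average_force * step)
    return pmf


def binding_free_energy(pmf: Sequence[float]) -> float:
    """Binding free energy as the highest minus the lowest PMF value."""
    return max(pmf) - min(pmf)


if __name__ == "__main__":
    # Invented numbers for illustration only: two replicates, four distances.
    distances = [2.0, 2.5, 3.0, 3.5]
    forces = mean_over_replicates([[12.0, 6.0, 2.0, 0.0],
                                   [8.0, 6.0, 2.0, 0.0]])
    print(binding_free_energy(potential_of_mean_force(distances, forces)))
```

Adding replicates only changes the input to `mean_over_replicates`, which mirrors the two-stage sampling described above: one replicate for all decoys, then additional replicates for the selected half.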
Moreover, for several targets, we can select near-native structures in the top 1, top 5 and top 10, as shown in Table 1, which means that, overall, our method is rather precise. From estimates of the error, we expect that we can improve accuracy by extending the amount of sampling done at each distance.

In conclusion, our approach can find favorable interactions among the candidates produced by docking programs. To the best of our knowledge, this is the first time that interaction free energy from a coarse-grained force field has been used as a scoring method to rank docking solutions at a large scale.

FIG. 1. Enrichment in percentage of acceptable or better structures. For each of the 13 targets with acceptable or better decoys, the two columns (left to right) show the CAPRI Score_set and the top 100 of our binding free energy ranking. Red, orange and yellow represent the fractions of high, medium and acceptable quality structures over the number of all or selected docking decoys. The order (left to right) is based on the fraction of acceptable structures in each target (easy to difficult).

Table 1. Successful selections of top-ranked structures.

Selection   Target   High   Medium   Acceptable   Total (%)
TOP 1       T47      1      0        0            100
            T53      0      0        1            100
TOP 5       T47      3      2        0            100
            T41      0      0        4            80
            T53      0      0        3            60
            T37      0      2        0            40
TOP 10      T47      7      3        0            100
            T41      0      1        7            80
            T53      0      1        5            60
            T37      0      3        0            30
            T50      0      0        1            10
TOP 20      T47      14     6        0            100
            T41      0      4        13           85
            T53      0      3        9            60
            T37      0      4        2            30
            T50      0      0        3            15
            T40      1      2        0            15
            T46      0      0        1            5

REFERENCES

May, Pool, Van Dijk, Bijlard, Abeln, Heringa & Feenstra. Coarse-grained versus atomistic simulations: realistic interaction free energies for real proteins. Bioinformatics (2014) 30: 326-334.
Lensink & Wodak. Score_set: A CAPRI benchmark for scoring protein complexes. Proteins (2014) 82:3163-3169.
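The Total (%) column of Table 1 is consistent with counting the structures of acceptable or better quality among the N top-ranked decoys and dividing by N. The sketch below shows that arithmetic under stated assumptions: the function name `top_n_success` and the label "incorrect" for decoys below acceptable quality are placeholders, and the order of labels in the example list is invented. Only the counts for target T41 in the TOP 20 selection are taken from Table 1.

```python
from typing import Dict, List

QUALITY_CLASSES = ("high", "medium", "acceptable")


def top_n_success(ranked_qualities: List[str], n: int) -> Dict[str, float]:
    """Count quality classes among the n best-ranked decoys.

    `ranked_qualities` holds one quality label per decoy, already sorted
    from best to worst rank. The returned dict also contains the share of
    acceptable-or-better structures as a percentage of n.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_qualities[:n]
    result: Dict[str, float] = {q: top.count(q) for q in QUALITY_CLASSES}
    hits = sum(result[q] for q in QUALITY_CLASSES)
    result["total_percent"] = 100.0 * hits / n
    return result


if __name__ == "__main__":
    # Counts for T41, TOP 20 (Table 1): 0 high, 4 medium, 13 acceptable.
    # The ordering within the list is invented for illustration.
    t41_top20 = ["medium"] * 4 + ["acceptable"] * 13 + ["incorrect"] * 3
    print(top_n_success(t41_top20, 20))
```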