bbc 2015

BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015 Abstract ID: P Poster 10th Benelux Bioinformatics Conference bbc 2015

P18. RNA-SEQ REVEALS ALTERNATIVE SPLICING WITH ALTERNATIVE FUNCTIONALITY IN MUSHROOMS

Thies Gehrmann 1, Jordi F. Pelkmans 2, Han Wösten 2, Marcel J.T. Reinders 1 & Thomas Abeel 1*.
Delft Bioinformatics Lab, Delft Technical University 1; Fungal Microbiology, Science Faculty, Utrecht University 2; * T.Abeel@tudelft.nl

Alternative splicing is well studied in mammalian genomes, where alternative transcripts are often associated with disease and their role in regulation is gradually being unveiled. In fungi, the study of alternative splicing has only scratched the surface. Using RNA-Seq data, we predict alternative transcripts based on existing gene predictions in two mushroom-forming fungi. We study the alternative functionality of genes through functional domains, developmental stages, tissue and time. This analysis reveals the amount of alternative functionality induced by alternative splicing, which was previously unknown in fungi, and underscores the need for further research.

INTRODUCTION
Transcript reconstruction algorithms rely on the sparsity of the genome (the presence of intergenic regions) in order to distinguish between genes. In fungi, because the genome is dense, transcripts overlap in their upstream and downstream untranslated regions (UTRs), which prevents the use of existing tools for transcript prediction (Roberts et al. 2011). Previous studies (Xie et al. 2015, Zhao et al. 2013) were limited to the study of splice junctions, without more advanced functional analyses. We transform the genomes of S. commune and A. bisporus so that existing transcript reconstruction algorithms can predict alternative transcripts from RNA-Seq data from different tissue types and developmental stages. We present a functional analysis of the resulting transcripts.
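The genome transformation, which METHODS describes as splitting the genome into chunks defined by existing gene annotations, can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the `Gene` record, the function names `genome_to_chunks` and `to_chunk_coords`, and the half-open coordinate convention are all hypothetical, and strand handling, overlapping annotations and the Cufflinks step are left out.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    """Hypothetical minimal gene annotation (not a real GFF parser)."""
    gene_id: str
    contig: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)


def genome_to_chunks(contigs, genes):
    """Cut one chunk per annotated gene; intergenic sequence is dropped."""
    chunks = {}
    for gene in genes:
        sequence = contigs[gene.contig]
        if not (0 <= gene.start < gene.end <= len(sequence)):
            raise ValueError(f"{gene.gene_id}: coordinates outside {gene.contig}")
        chunks[gene.gene_id] = sequence[gene.start:gene.end]
    return chunks


def to_chunk_coords(gene: Gene, position: int) -> Optional[int]:
    """Map a contig position into the gene's chunk, or None if it falls outside."""
    if gene.start <= position < gene.end:
        return position - gene.start
    return None


# Toy example: two genes on one scaffold, separated by an intergenic "GGG".
contigs = {"scaffold_1": "AAACCCGGGTTTAAACCC"}
genes = [Gene("g1", "scaffold_1", 0, 6), Gene("g2", "scaffold_1", 9, 15)]
chunks = genome_to_chunks(contigs, genes)
```

In this toy case each chunk holds only the sequence under one annotation, so reads (or read alignments) would have to be re-expressed in chunk coordinates, as `to_chunk_coords` does for a single position, before each chunk is handed to a transcript reconstruction tool.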
METHODS
We apply a transformation to our fungal genomes in order to reduce the impact of overlapping UTRs, which prevent the prediction of alternative transcripts. We split the genome into chunks, with each chunk being defined by existing gene annotations. Thus, the transformation essentially removes intergenic regions (which contain the UTRs). Each chunk is then analyzed separately by Cufflinks (Roberts et al. 2011). Predicted transcripts are filtered based on read information and ORF sanity. Protein domain annotations are predicted for each transcript using InterPro (Zdobnov & Apweiler 2001). For each gene with multiple alternative transcripts, we construct a consensus sequence which allows us to call specific splicing events without the influence of erroneous reference annotations.

RESULTS & DISCUSSION
For both fungi, we find that alternative splicing is prevalent and that many genes have multiple alternative transcripts (see Table 1).

               # Orig. Genes   # Filt. Genes   # Transcripts
S. commune     16,319          14,615          20,077
A. bisporus    10,438          9,612           14,320

TABLE 1. The number of originally annotated genes in S. commune and A. bisporus decreases once prediction based on RNA-Seq data filters some of them out. The number of new transcripts predicted indicates that alternative splicing is not a rare event in these fungi.

The frequencies of specific events in the two fungi are similar and match what is seen in humans (Sammeth et al. 2008). However, there are significant differences in event usage. While most transcripts in S. commune have only one event associated with them, most transcripts in A. bisporus have at least two events. We show that this is a result of co-operative events. As our dataset consists of multiple developmental time points and tissue types, we are able to observe the alternative use of transcripts through time. If a gene swaps transcript usage at a certain time point, this is indicative of a functional involvement of that particular transcript (Lees et al. 2015).
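The idea of a gene swapping transcript usage at a time point can be made concrete with a small Python sketch. It is only one possible reading, under assumptions the abstract does not state: a "swap" is taken to mean a change in the most highly expressed transcript between consecutive time points, expression is given as one value per transcript per time point, and the names `dominant_transcripts`, `find_switches` and the `min_total` threshold are hypothetical.

```python
def dominant_transcripts(expression, min_total=1.0):
    """Return the most expressed transcript per time point.

    `expression` maps transcript id -> list of expression values, one per
    ordered time point. Time points where the gene's summed expression is
    below `min_total` (an assumed threshold) yield None.
    """
    n_points = len(next(iter(expression.values())))
    dominant = []
    for t in range(n_points):
        total = sum(values[t] for values in expression.values())
        if total < min_total:
            dominant.append(None)
            continue
        dominant.append(max(expression, key=lambda tx: expression[tx][t]))
    return dominant


def find_switches(expression, min_total=1.0):
    """List (time_index, previous, current) for each change of dominant transcript."""
    switches = []
    previous = None
    for t, current in enumerate(dominant_transcripts(expression, min_total)):
        if current is None:
            continue  # gene barely expressed: no call at this time point
        if previous is not None and current != previous:
            switches.append((t, previous, current))
        previous = current
    return switches


# Toy gene with two transcripts over four time points.
toy_gene = {"tx1": [10.0, 8.0, 2.0, 1.0], "tx2": [1.0, 3.0, 9.0, 12.0]}
toy_switches = find_switches(toy_gene)
```

For the toy gene, `tx1` dominates the first two time points and `tx2` the last two, so a single switch is reported at index 2. A real analysis would also need replicates and a statistical test before calling a switch.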
We find multiple transcripts in both S. commune and A. bisporus which are activated in specific developmental stages of the mushroom. Furthermore, in A. bisporus, we are able to identify transcripts which are activated specifically in certain tissue types through development. Using protein domain predictions for each transcript in a gene, we can measure how gene functionality changes across its transcripts. Figure 1 shows that functional annotations are not always preserved across all transcripts, indicating alternative functionality.

FIGURE 1. Many genes in S. commune demonstrate alternative functionality through alternative splicing.

This is the first genome-wide functional analysis of alternative splicing in fungi from RNA-Seq data. We find a wealth of alternative splicing events in two fungi, resulting in many newly discovered transcripts. Although their functional influence is not yet demonstrated, we present evidence to suggest that they are relevant to mushroom development.

REFERENCES
Lees, J. G., et al. BMC Genomics 16:1 (2015).
Roberts, A., et al. Bioinformatics 27:17, 2325–2329 (2011).
Sammeth, M., et al. PLoS Computational Biology 4:8 (2008).
Xie, B.-B., et al. BMC Genomics 16:54 (2015).
Zdobnov, E. M., & Apweiler, R. Bioinformatics 17:9 (2001).
Zhao, C., et al. BMC Genomics 14:21 (2013).

P19. MSQROB: AN R/BIOCONDUCTOR PACKAGE FOR ROBUST RELATIVE QUANTIFICATION IN LABEL-FREE MASS SPECTROMETRY-BASED QUANTITATIVE PROTEOMICS

Ludger Goeminne 1,2,3*, Kris Gevaert 2,3 & Lieven Clement 1.
Department of Applied Mathematics, Computer Science and Statistics, Ghent University 1; VIB Medical Biotechnology Center 2; Department of Biochemistry, Ghent University 3. * ludger.goeminne@UGent.be

MSqRob is an R/Bioconductor package that uses robust ridge regression on peptide-level data for robust relative quantification of proteins in label-free data-dependent acquisition (DDA) mass spectrometry (MS)-based proteomic experiments. It has been shown that statistical methods that perform inference at the peptide level outperform workflows that summarize peptide intensities prior to inference. MSqRob improves upon existing peptide-level methods by three modular extensions: (1) ridge regression, (2) empirical Bayes variance estimation and (3) M-estimation with Huber weights. The extensions make MSqRob less sensitive to outliers and missing peptides, enabling more proteins to be processed. Our software provides streamlined data analysis pipelines for experiments with simple layouts as well as for more complex multi-factorial designs. Using a spike-in dataset, we illustrate that MSqRob yields more stable protein fold change estimates and improves the differential abundance (DA) ranking.

INTRODUCTION
In a typical label-free DDA LC-MS/MS-based proteomic workflow, proteins are digested to peptides, separated by RP-HPLC and analyzed by a mass spectrometer. However, several issues inherent to the protocol make data analysis non-trivial. Most of the common data analysis procedures use summarization-based workflows. We have previously shown that inference at the peptide level outperforms these summarization-based approaches (Goeminne et al., 2015).
However, even these pipelines are sensitive to outliers and suffer from overfitting. Here, we present MSqRob, an R/Bioconductor package that starts from peptide-level data and provides robust inference on DA at the protein level.

METHODS
Dataset. To demonstrate the performance of our package, we use the CPTAC dataset, in which 48 known human proteins were spiked in at different concentrations in a yeast proteome background. Ideally, when comparing different spike-in conditions, only the human proteins should be flagged as differentially abundant.

Competing analytical methods. MaxLFQ+Perseus, which summarizes peptide data and then applies pairwise t-tests.

LM model. Generally, peptide-based models are constructed as follows:

$$ y_{ijklmn} = \mathrm{treat}_{ij} + \mathrm{pep}_{ik} + \mathrm{biorep}_{il} + \mathrm{techrep}_{im} + \varepsilon_{ijklmn} $$

with $y_{ijklmn}$ the $n$th $\log_2$-transformed normalized feature intensity for the $i$th protein under the $j$th treatment $\mathrm{treat}_{ij}$, the $k$th peptide sequence $\mathrm{pep}_{ik}$, the $l$th biological repeat $\mathrm{biorep}_{il}$ and the $m$th technical repeat $\mathrm{techrep}_{im}$, and $\varepsilon_{ijklmn}$ a normally distributed error term with mean zero and variance $\sigma_i^2$.

MSqRob. MSqRob adds the following improvements to the LM model:
1. Ridge regression: shrink parameter estimates towards 0 by adding a ridge penalty term to the loss function.
2. Empirical Bayes (EB) variance estimation: stabilize variance estimates by borrowing information across proteins, shrinking individual variances towards the pooled variance.
3. M-estimation with Huber weights: down-weight observations with large errors.

RESULTS & DISCUSSION
MSqRob uses MaxQuant or Mascot peptide-level data as input. It performs preprocessing and robust model fitting, and returns log2 fold change estimates and FDR-corrected p-values for all model parameters and/or (user-specified) contrasts. Advanced users have the flexibility to (a) adopt their own preprocessing pipeline (e.g. transformation, normalization, dropping contaminants) and (b) specify the appropriate model structure.
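To illustrate how a ridge penalty and Huber weights interact, the following Python sketch fits a deliberately reduced model with a single coefficient, y = beta * x + error, by iteratively reweighted least squares. It is not MSqRob's implementation (MSqRob is an R package and fits the full peptide-level model); the function names, the fixed `ridge` value, the tuning constant `k = 1.345` and the median-absolute-residual scale estimate are assumptions chosen for the example, and the empirical Bayes step is omitted.

```python
import statistics


def huber_weights(residuals, scale, k=1.345):
    """Huber weights: 1 for |r| <= k*scale, k*scale/|r| for larger residuals."""
    cutoff = k * scale
    return [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]


def robust_ridge_slope(x, y, ridge=1.0, k=1.345, n_iter=50, tol=1e-8):
    """Estimate beta in y = beta * x + error with a ridge penalty and Huber weights.

    Each iteration solves the weighted, penalized least-squares problem
    min_beta sum_i w_i * (y_i - beta * x_i)**2 + ridge * beta**2,
    whose solution is sum(w*x*y) / (sum(w*x*x) + ridge), then updates w.
    Requires ridge > 0 or at least one nonzero x.
    """
    weights = [1.0] * len(x)
    beta = 0.0
    for _ in range(n_iter):
        numerator = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
        denominator = sum(w * xi * xi for w, xi in zip(weights, x)) + ridge
        new_beta = numerator / denominator
        residuals = [yi - new_beta * xi for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = max(1.4826 * statistics.median(abs(r) for r in residuals), 1e-12)
        weights = huber_weights(residuals, scale, k)
        if abs(new_beta - beta) < tol:
            beta = new_beta
            break
        beta = new_beta
    return beta


# Toy data: five consistent observations near 1 and one gross outlier.
x_toy = [1.0] * 6
y_toy = [1.0, 1.1, 0.9, 1.05, 0.95, 8.0]
ordinary = sum(y_toy) / len(y_toy)                     # plain mean, pulled up by the outlier
robust = robust_ridge_slope(x_toy, y_toy, ridge=0.0)   # Huber weights only
shrunken = robust_ridge_slope(x_toy, y_toy, ridge=5.0) # Huber weights plus ridge penalty
```

On the toy data the unweighted estimate is dragged towards the outlier, the Huber-weighted estimate stays close to the bulk of the observations, and adding the ridge penalty pulls the estimate further towards 0. In practice the penalty strength would be estimated from the data rather than fixed by hand.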
Compared to competing methods, MSqRob returns more stable log2 fold change estimates, improves DA ranking (Figure 1) and is able to discern between consistently strong DA and an accidental hit caused by outliers or by a small variance due to random chance in low-abundant proteins.

FIGURE 1. Receiver operating characteristic (ROC) curves showing the superior performance of MSqRob compared to a simple linear model (LM) and a summarization-based approach (MaxLFQ+Perseus) when comparing the lowest spike-in concentration 6A with the second lowest spike-in concentration 6B. Stars denote the methods' cut-offs at an estimated 5% FDR.

REFERENCES
Goeminne LJE et al. Journal of Proteome Research 14, 2457-2465 (2015).