bbc 2015


BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015
Abstract ID: P Poster
10th Benelux Bioinformatics Conference bbc 2015

P30. GALAHAD: A WEB SERVER FOR THE ANALYSIS OF DRUG EFFECTS FROM GENE EXPRESSION DATA

Griet Laenen 1,2,*, Amin Ardeshirdavani 1,2, Yves Moreau 1,2 & Lieven Thorrez 1,3.
Dept. of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven 1; iMinds Medical IT Dept., KU Leuven 2; Dept. of Development and Regeneration @ Kulak, KU Leuven 3.
* griet.laenen@esat.kuleuven.be

Galahad (https://galahad.esat.kuleuven.be) is a web-based application for the analysis of gene expression data from drug treatment versus control experiments, aimed at predicting a drug's molecular targets and biological effects. Galahad provides data quality assessment and exploratory analysis, as well as computation of differential expression. Based on the obtained differential expression values, drug target prioritization and both pathway and disease enrichment can be calculated and visualized. Drug target prioritization is based on the integration of the gene expression data with a functional protein association network.

INTRODUCTION

Gene expression analysis is frequently employed to study the effects of drug compounds on cells. The observed transcriptional patterns can provide valuable information for identifying compound–protein interactions as well as the resulting biological effects. To facilitate the analysis of this particular data type and enable an in-depth exploration of a drug's mode of effect, we have developed Galahad [1].

INPUT

The main input for Galahad consists of raw Affymetrix human, mouse, or rat DNA microarray data derived from both untreated control samples and samples treated with a drug of interest. In addition, Galahad can start from differential expression data obtained with other platforms to perform drug target prioritization and enrichment analysis.

METHODS

The different analyses are depicted in Figure 1 and include:
- preprocessing of the raw data with RMA or MAS5.0, as indicated by the user;
- quality assessment and exploratory analysis to ascertain data quality, uncover experimental issues, and help decide whether certain arrays should be treated as outliers;
- differential expression analysis to determine the significance of gene up- and downregulation following drug treatment;
- genome-wide drug target prioritization by means of an in-house developed algorithm for network neighborhood analysis that integrates the expression data with functional protein association information [2];
- prediction of molecular pathways involved in the drug's mode of effect;
- identification of associated disease phenotypes, enabling side effect prediction and drug repositioning.

OUTPUT

The output is displayed in a series of tabs corresponding to the different analyses selected by the user:
- in the Quality Control and Data Exploration tabs, several diagnostic plots are displayed along with a short explanation;
- the Differential Expression tab contains a sorted table listing all genes together with their log2 ratios and P-values for differential expression, as well as links to the corresponding GeneCards sections;
- in the Drug Target Prioritization tab, a ranked list of genes as potential targets of the drug can be found, together with the network diffusion-based scores and P-values for prioritization, and links to the corresponding GeneCards sections; in addition, a network-based visualization is available for each gene, showing the 10 interaction partners contributing most to the gene's ranking;
- the tabs summarizing the results for Pathway and Disease Enrichment contain a sorted table with pathway or disease ontology IDs, names, and database links, together with the number of differentially expressed genes in the corresponding gene sets and the accompanying P-values; in addition, network graphs are available, consisting of the top 10 most significant pathways or disease phenotypes, along with their associated genes colored according to fold change.

FIGURE 1. Overview of the Galahad analysis steps.

REFERENCES

1. Laenen G. et al. Nucl Acids Res 43, W208-W212 (2015).
2. Laenen G. et al. Mol BioSyst 9, 1676-1685 (2013).
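
To make the enrichment step of the Galahad abstract (P30) more concrete, the sketch below shows one common way to turn "the number of differentially expressed genes in a gene set" into a P-value. It assumes a one-sided hypergeometric test. The abstract does not state which test Galahad applies, so the test itself, the function name, and the parameter names are illustrative assumptions, not the server's implementation.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_de, n_set, n_overlap):
    """One-sided hypergeometric P-value for gene-set enrichment.

    Assumption for illustration only (not taken from the abstract):
    n_universe -- number of genes measured in total
    n_de       -- number of differentially expressed (DE) genes
    n_set      -- number of genes in the pathway or disease gene set
    n_overlap  -- number of DE genes that fall inside the gene set

    Returns P(X >= n_overlap) when n_de genes are drawn at random,
    without replacement, from the universe.
    """
    max_overlap = min(n_de, n_set)
    total = comb(n_universe, n_de)
    # Sum the upper tail; math.comb returns 0 when k > n, so
    # impossible configurations contribute nothing.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 genes measured, 500 DE genes,
# a pathway of 100 genes, 12 of which are DE.
p_value = hypergeom_enrichment_p(20000, 500, 100, 12)
```

A table like the one described for the Pathway and Disease Enrichment tabs could then be obtained by computing such a P-value per gene set and sorting on it; correction for multiple testing would normally follow but is left out of this sketch.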
P31. KMAD: KNOWLEDGE BASED MULTIPLE SEQUENCE ALIGNMENT FOR INTRINSICALLY DISORDERED PROTEINS

Joanna Lange 1,2, Lucjan S Wyrwicz 1 & Gert Vriend 2,*.
Laboratory of Bioinformatics and Biostatistics, M. Sklodowska-Curie Memorial Cancer Center; Institute of Oncology 1, CMBI, Radboud University Nijmegen 2.
* vriend@cmbi.ru.nl

INTRODUCTION

Intrinsically disordered proteins (IDPs) lack tertiary structure and thus differ from globular proteins in terms of their sequence–structure–function relations. IDPs have lower sequence conservation, different types of active sites, and a different distribution of functionally important regions, which together make their multiple sequence alignment (MSA) difficult. The algorithms underlying existing MSA programs are directly or indirectly based on knowledge obtained from studying three-dimensional protein structures. Here we introduce KMAD, a tool for Knowledge based Multiple sequence Alignment for intrinsically Disordered proteins, which incorporates SLiM, domain, and PTM annotations to improve the alignments. The KMAD web server is accessible at http://www.cmbi.ru.nl/kmad/. A standalone version is freely available.

METHODS

A dataset of proteins experimentally proven to be disordered was obtained from DisProt (Sickmeier et al., 2007). For each IDP, all homologous sequences were extracted from SwissProt (The UniProt Consortium, 2014) using BLAST. The sequence sets were aligned with several MSA tools. Apart from manual validation, we also performed a benchmark validation on reference sets from BAliBASE (Thompson et al., 2005) and PREFAB, which hold structure-based 'gold standard' sequence alignments. For this purpose we used KMAD and a modified version of KMAD that performs a 'refinement' of Clustal Omega (Sievers et al., 2011) alignments.

RESULTS & DISCUSSION

Manual validation showed that KMAD avoids many mistakes made by Clustal Omega. An example of an alignment mistake is shown in Figure 1.

FIGURE 1. Excerpts from (a) Clustal Omega and (b) KMAD alignments of human sialoprotein (SIAL_HUMAN) with four homologues. Different types of PTMs are highlighted in bright colours.

In sequence alignment research it is common practice to compare the alignments produced by MSA software with those obtained from structure superpositions. IDPs do not possess a static 3D structure, so this method is not applicable to KMAD alignments. Both validation methods that we used have their disadvantages, but so far there is no alternative. Validation on benchmark alignments of structured proteins is biased towards Clustal Omega, because it was optimized to work with structured proteins. Manual inspection, on the other hand, relies on the same features that influence the alignment, which is not a very elegant method, but given the nature of IDPs it is probably the best we can do.

REFERENCES

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–1797.
Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. D., and Higgins, D. G. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology, 7(539), 539.
Sickmeier, M., Hamilton, J. A., LeGall, T., Vacic, V., Cortese, M. S., Tantos, A., Szabo, B., Tompa, P., Chen, J., Uversky, V. N., Obradovic, Z., and Dunker, A. K. (2007). DisProt: the Database of Disordered Proteins. Nucleic Acids Research, 35(Database issue), D786–93.
The UniProt Consortium (2014). Activities at the Universal Protein Resource (UniProt). Nucleic Acids Research, 42(Database issue), D191–8.
Thompson, J. D., Koehl, P., Ripp, R., and Poch, O. (2005). BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins: Structure, Function, and Bioinformatics, 61(1), 127–136.
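
The idea in the KMAD abstract (P31) of letting SLiM, domain, and PTM annotations influence an alignment can be illustrated with a toy pair-scoring function. This is a minimal sketch under stated assumptions, not KMAD's actual scoring scheme: the bonus weights, the `kind:name` feature encoding, the identity-based base score, and all function names are hypothetical.

```python
# Hypothetical bonus per annotation kind; these values are not from KMAD.
FEATURE_BONUS = {"ptm": 3.0, "slim": 2.0, "domain": 1.0}


def base_score(res_a, res_b):
    """Toy identity scoring; a real aligner would use a substitution matrix."""
    return 1.0 if res_a == res_b else -1.0


def annotated_pair_score(res_a, res_b, feats_a, feats_b):
    """Score aligning two residues, rewarding annotations they share.

    feats_a and feats_b are sets of strings such as "ptm:phosphorylation"
    (an encoding assumed here for illustration only).
    """
    score = base_score(res_a, res_b)
    for feature in feats_a & feats_b:
        kind = feature.split(":", 1)[0]
        score += FEATURE_BONUS.get(kind, 0.0)
    return score


# A mismatching pair that shares a PTM annotation can outscore
# a plain match without annotations.
shared_ptm = annotated_pair_score(
    "S", "T", {"ptm:phosphorylation"}, {"ptm:phosphorylation"}
)
plain_match = annotated_pair_score("S", "S", set(), set())
```

Plugged into a standard dynamic-programming aligner, a score of this shape would favour columns in which annotated positions line up, which is the kind of behaviour the manual validation in the abstract looks for.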