bbc 2015
BBC2015_booklet
BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015<br />
Abstract ID: O1<br />
Oral presentation<br />
10th Benelux Bioinformatics Conference bbc 2015<br />
O1. CELL TYPE-SELECTIVE DISEASE ASSOCIATION<br />
OF GENES UNDER HIGH REGULATORY LOAD<br />
Mafalda Galhardo 1, Philipp Berninger 2, Thanh-Phuong Nguyen 1, Thomas Sauter 1 & Lasse Sinkkonen 1*.<br />
Life Sciences Research Unit, University of Luxembourg, Luxembourg, Luxembourg 1; Biozentrum, University of Basel<br />
and Swiss Institute of Bioinformatics, Basel, Switzerland 2. * lasse.sinkkonen@uni.lu<br />
Identification of biomarkers and drug targets is a key task of biomedical research. We previously showed that disease-linked<br />
metabolic genes are often under combinatorial regulation (Galhardo et al. 2014). Here we extend this analysis to<br />
include almost 100 transcription factors (TFs) and key histone modifications from over 100 samples, and show that genes<br />
under high regulatory load (HRL) are enriched for disease association across cell types. Network and pathway analysis<br />
suggests that HRL genes play a central role in biological networks and are heavily regulated at both the transcriptional and<br />
post-transcriptional levels, which may explain the observed enrichment. Thus, epigenomic mapping of enhancers<br />
presents an unbiased approach for identification of novel disease-associated genes.<br />
INTRODUCTION<br />
Identification of disease-relevant genes and gene products<br />
as biomarkers and drug targets is one of the key tasks of<br />
biomedical research. Still, the great majority of research is<br />
focused on a small minority of genes, while many remain<br />
unstudied (Pandey et al. 2014). Unbiased prioritization<br />
among these neglected genes would help realize<br />
the full potential of genomics in understanding diseases.<br />
Many databases cataloguing disease-associated genes have<br />
been created, including DisGeNET, which draws from<br />
multiple sources (Bauer-Mehren et al. 2010). In addition,<br />
large amounts of publicly available epigenomic data on<br />
the cell type-selective regulation of these genes have been<br />
produced. The importance of epigenetic regulation for<br />
disease development is increasingly recognized, for<br />
example in the analysis of GWAS data, where causal SNPs<br />
are mostly located within gene regulatory regions<br />
(Maurano et al. 2012).<br />
METHODS<br />
Public ChIP-seq data produced by the ENCODE project<br />
(Dunham et al. 2012), the BLUEPRINT Epigenome<br />
project (Martens et al. 2013) and the NIH Epigenomic<br />
Roadmap project (Kundaje et al. 2015) were downloaded<br />
in May 2014. The data were used to rank active protein-coding<br />
genes (based on NCBI Entrez and marked by<br />
H3K4me3) by their regulatory load, based on the number<br />
of associated TFs or enhancer (H3K27ac) regions, using<br />
the GREAT tool. The enrichment of disease genes from<br />
DisGeNET among HRL genes was tested using either the<br />
Matlab® hypergeometric cumulative distribution function,<br />
adjusted for multiple testing with the Benjamini-Hochberg<br />
procedure, or a normalized enrichment score.<br />
Enriched diseases were clustered using the R package<br />
“blockcluster”. Peak calling for super-enhancers was done<br />
using HOMER. A liver disease gene network comprising<br />
8278 interactions was constructed from HPRD, based on<br />
liver disease genes from MeSH and genes from CTD.<br />
Statistical analysis of KEGG pathway<br />
enrichments and betweenness centrality was done using<br />
random sampling tests. miRNA target predictions were<br />
obtained from TargetScan 6.2. Further details of the<br />
methods can be found in Galhardo et al. 2015.<br />
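The enrichment test described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab code: the function names, the toy gene counts and the use of the upper tail P(X >= k) are assumptions made for illustration only. It computes a hypergeometric p-value for the overlap between HRL genes and one disease gene set, then applies a Benjamini-Hochberg adjustment across diseases.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are taken without replacement
    from `population` genes, of which `successes` are disease genes."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy counts (not taken from the study):
# 1000 active genes, 100 of them called HRL.
N_ACTIVE, N_HRL = 1000, 100
# disease -> (disease genes among active genes, disease genes among HRL genes)
toy_diseases = {"disease_A": (30, 12), "disease_B": (40, 3)}

raw = [
    hypergeom_upper_tail(N_ACTIVE, n_disease, N_HRL, n_overlap)
    for n_disease, n_overlap in toy_diseases.values()
]
for name, p_adj in zip(toy_diseases, benjamini_hochberg(raw)):
    print(name, p_adj)
```

Exact integer binomials keep the sketch dependency-free; for genome-scale counts a library routine would usually be preferred.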
RESULTS & DISCUSSION<br />
Using ENCODE ChIP-seq profiles for 93 transcription<br />
factors (TFs) in nine cell lines, we show that HRL genes<br />
are enriched for disease-association across cell types<br />
(Figure 1). TF load correlates with the enhancer load of<br />
the genes, allowing the identification of HRL genes by<br />
epigenomic mapping of active enhancers marked by<br />
H3K27ac modifications. Identification of the HRL genes<br />
across 139 samples from 96 different cell and tissue types<br />
reveals a consistent enrichment for disease-associated<br />
genes in a cell type-selective manner.<br />
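As a rough illustration of ranking genes by regulatory load, the Python sketch below counts the distinct TFs associated with each gene and takes the top of the ranking as HRL genes. The gene and TF names are hypothetical, the region-to-gene assignment (done with GREAT in the study) is assumed to have happened already, and the cutoff fraction is an assumption, since the abstract does not state the threshold used.

```python
from collections import defaultdict


def rank_by_regulatory_load(associations):
    """Rank genes by the number of distinct TFs associated with them.

    `associations` is an iterable of (gene, tf) pairs. Returns a list of
    (gene, load) tuples, highest load first, ties broken alphabetically.
    """
    tfs_per_gene = defaultdict(set)
    for gene, tf in associations:
        tfs_per_gene[gene].add(tf)
    loads = [(gene, len(tfs)) for gene, tfs in tfs_per_gene.items()]
    return sorted(loads, key=lambda item: (-item[1], item[0]))


def top_fraction(ranked, fraction):
    """Return the genes in the top `fraction` of the ranking (at least one)."""
    n_top = max(1, int(len(ranked) * fraction))
    return [gene for gene, _ in ranked[:n_top]]


# Hypothetical toy associations (not data from the study).
toy_pairs = [
    ("GENE1", "TF_A"), ("GENE1", "TF_B"), ("GENE1", "TF_B"),
    ("GENE2", "TF_A"),
    ("GENE3", "TF_A"), ("GENE3", "TF_B"), ("GENE3", "TF_C"),
]
ranking = rank_by_regulatory_load(toy_pairs)
hrl_genes = top_fraction(ranking, 0.34)  # assumed cutoff, for illustration
print(ranking, hrl_genes)
```

The same ranking applies to enhancer load if the pairs are (gene, H3K27ac region) instead of (gene, TF).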
The HRL genes are involved in more pathways than<br />
expected by chance, exhibit increased betweenness<br />
centrality in the interaction network of liver disease genes,<br />
and carry longer 3’UTRs with more microRNA binding<br />
sites than the average gene, suggesting a role as hubs<br />
within regulatory networks.<br />
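A random sampling test of the kind mentioned in the Methods can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the procedure from Galhardo et al. 2015: the per-gene scores (for example betweenness centrality or number of pathways) are assumed to be precomputed, and the function name, iteration count and one-sided empirical p-value are choices made here.

```python
import random


def random_sampling_pvalue(scores, gene_set, n_iter=999, seed=0):
    """Compare the mean score of `gene_set` with random gene sets of equal size.

    `scores` maps gene -> numeric score. Returns (observed mean, empirical
    one-sided p-value for the observed mean being at least this high).
    """
    rng = random.Random(seed)
    genes = sorted(scores)
    members = sorted(g for g in gene_set if g in scores)
    size = len(members)
    observed = sum(scores[g] for g in members) / size
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(genes, size)
        if sum(scores[g] for g in sample) / size >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_iter + 1)


# Hypothetical toy scores (not data from the study).
toy_scores = {f"g{i}": i for i in range(10)}
toy_hrl = {"g7", "g8", "g9"}
obs_mean, p_emp = random_sampling_pvalue(toy_scores, toy_hrl)
print(obs_mean, p_emp)
```

Fixing the seed makes the sketch reproducible; in practice the number of iterations limits the smallest p-value that can be reported.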
Thus, epigenomic mapping of enhancers presents an<br />
unbiased approach for identification of novel disease-associated<br />
genes (Galhardo et al. 2015).<br />
[Figure 1: workflow diagram. Input: human ChIP-seq data, comprising transcription factor binding sites (93 TFs) from 9 ENCODE cell lines (A549, GM12878, H1hESC, HCT116, HeLaS3, HepG2, HUVEC, K562, MCF7) and active enhancers (H3K27ac) from 139 samples comprising 96 tissue or cell types. Steps: gene ranking by regulatory load (number of TFs or enhancers per gene); disease genes (min score 0.08). Outcome: high regulatory load genes are enriched for disease association.]<br />
FIGURE 1. Workflow of the disease-gene enrichment analysis.<br />
REFERENCES<br />
Pandey AK et al. PLoS One, 9:e88889 (2014).<br />
Bauer-Mehren A et al. Nucleic Acids Res., 33:D514-D517 (2010).<br />
Maurano et al. Science, 337:1190-1195 (2012).<br />
Galhardo et al. Nucleic Acids Res., 42:1474-1496 (2014).<br />
Dunham et al. Nature, 489:57-74 (2012).<br />
Martens et al. Haematologica, 98:1487-1489 (2013).<br />
Kundaje et al. Nature, 518:317-330 (2015).<br />
Galhardo et al. Nucleic Acids Res., doi:10.1093/nar/gkv863 (2015).<br />