
BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015<br />

Abstract ID: O6<br />

Oral presentation<br />

10th Benelux Bioinformatics Conference bbc 2015<br />

O6. COMBINING TREE-BASED AND DYNAMICAL SYSTEMS<br />

FOR THE INFERENCE OF GENE REGULATORY NETWORKS<br />

Vân Anh Huynh-Thu¹* & Guido Sanguinetti²,³<br />

GIGA-R & Department of Electrical Engineering and Computer Science, University of Liège¹; School of Informatics,<br />

University of Edinburgh²; SynthSys – Systems and Synthetic Biology, University of Edinburgh³. *vahuynh@ulg.ac.be<br />

INTRODUCTION<br />

Reconstructing the topology of gene regulatory networks<br />

(GRNs) from time series of gene expression data remains<br />

an important open problem in computational systems<br />

biology. Current approaches can be broadly divided into<br />

model-based and model-free approaches, and face one of<br />

two limitations: model-free methods are scalable but<br />

suffer from a lack of interpretability, and cannot in general<br />

be used for out-of-sample predictions. On the other hand,<br />

model-based methods focus on identifying a dynamical<br />

model of the system; these are clearly interpretable and<br />

can be used for predictions, but they rely on strong<br />

assumptions and are typically very demanding<br />

computationally. Here, we aim to bridge the gap between<br />

model-based and model-free methods by proposing a<br />

hybrid approach to the GRN inference problem, called<br />

Jump3 (Huynh-Thu & Sanguinetti, 2015). Our approach<br />

combines formal dynamical modelling with the efficiency<br />

of a nonparametric, tree-based method, allowing the<br />

reconstruction of GRNs of hundreds of genes.<br />

METHODS<br />

Gene expression model. At the heart of the Jump3<br />

framework, we use the on/off model of gene expression<br />

(Ptashne & Gann, 2002), where the rate of transcription of<br />

a gene can vary between two levels depending on the<br />

activity state μ of the promoter of the gene. The expression<br />

x of a gene is modelled through the following stochastic<br />

differential equation:<br />

$$dx_i = \left(A_i \mu_i(t) + b_i - \lambda_i x_i\right)dt + \sigma\, d\omega(t),$$<br />

where subscript i refers to the i-th target gene. Here, the<br />

promoter state $\mu_i(t)$ is a binary variable (the promoter is<br />

either active or inactive) that depends on the expression<br />

levels of the transcription factors (TFs) that bind to the<br />

promoter. $A_i$, $b_i$ and $\lambda_i$ are kinetic parameters, and the term<br />

$\sigma\, d\omega(t)$ represents a white-noise driving process with<br />

variance $\sigma^2$.<br />
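To make the on/off model concrete, the following is a minimal simulation sketch of the stochastic differential equation above. It is an illustration under stated assumptions, not the procedure used in Jump3: the Euler-Maruyama discretisation, the function name `simulate_on_off`, the promoter schedule `promoter_state`, and all parameter values are hypothetical choices made for the example.

```python
import math
import random


def simulate_on_off(mu, n_steps, dt=0.01, A=1.0, b=0.1, lam=0.5,
                    sigma=0.05, x0=0.0, seed=0):
    """Euler-Maruyama simulation of dx = (A*mu(t) + b - lam*x) dt + sigma dW.

    mu is a function t -> 0 or 1 giving the promoter state at time t.
    All default parameter values are arbitrary (assumed for illustration).
    Returns the list of time points and the list of expression values.
    """
    rng = random.Random(seed)
    x = x0
    times, xs = [0.0], [x0]
    for k in range(n_steps):
        t = k * dt
        drift = A * mu(t) + b - lam * x
        # The Brownian increment over a step of length dt has variance dt.
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift * dt + noise
        times.append((k + 1) * dt)
        xs.append(x)
    return times, xs


def promoter_state(t):
    """Hypothetical promoter schedule: active between t = 2 and t = 6."""
    return 1 if 2.0 <= t < 6.0 else 0


times, xs = simulate_on_off(promoter_state, n_steps=1000)
```

Without noise, the expression relaxes towards $b_i/\lambda_i$ while the promoter is inactive and towards $(A_i + b_i)/\lambda_i$ while it is active, which is the two-level behaviour the model is meant to capture.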

Network reconstruction with jump trees. Recovering<br />

the regulatory links pointing to gene i amounts to finding<br />

the genes whose expression is predictive of the promoter<br />

state $\mu_i$. To achieve this goal, we propose a procedure that<br />

learns, for each target gene i, an ensemble of decision trees<br />

predicting the promoter state $\mu_i$ at any time t from the<br />

expression levels of the candidate regulators at the same<br />

time t. However, standard tree-based methods cannot be<br />

applied here since the output $\mu_i(t)$ is a latent variable. We<br />

therefore propose a new decision tree algorithm called<br />

“jump tree”, which splits the observations by maximising<br />

the marginal likelihood of the dynamical on/off model.<br />

The learned tree-based model is then used to derive an<br />

importance score for each candidate regulator, computed<br />

as the sum of the likelihood gains that are obtained at all<br />

the tree nodes where this regulator was selected to split the<br />

observations. The importance of a candidate regulator j is<br />

used as the weight of the putative regulatory link of the<br />

network that is directed from gene j to gene i.<br />
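The importance score described above can be sketched as a simple aggregation. This is only an illustration of the summation step: the representation of each tree as a list of (regulator, likelihood gain) pairs, the function name `regulator_importances`, the gene names, and the gain values are all hypothetical, and the sketch does not learn jump trees or compute marginal likelihoods. Whether the sums are further normalised across trees is not stated here, so no normalisation is applied.

```python
from collections import defaultdict


def regulator_importances(trees):
    """Sum the likelihood gains over all nodes where each regulator splits.

    trees: list of trees, each given as a list of (regulator, gain) pairs,
    one pair per internal node (a hypothetical, simplified representation).
    """
    scores = defaultdict(float)
    for tree in trees:
        for regulator, gain in tree:
            scores[regulator] += gain
    return dict(scores)


# Hypothetical ensemble learned for one target gene i (made-up values).
ensemble_for_gene_i = [
    [("geneA", 2.5), ("geneB", 0.5), ("geneA", 1.0)],
    [("geneC", 0.75), ("geneA", 1.5)],
]

weights = regulator_importances(ensemble_for_gene_i)
# Rank candidate regulators j by the weight of the link j -> i.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
```

Repeating this for every target gene gives one weight per directed link, and sorting all links by weight yields a ranking of putative regulatory interactions.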

RESULTS & DISCUSSION<br />

We evaluated Jump3 on the networks of the DREAM4 In<br />

Silico Network challenge (Prill et al., 2010). For each<br />

network topology, two types of simulated expression data<br />

were used: data simulated using the on/off model (toy<br />

data) and the time series data that was provided in the<br />

context of the DREAM4 challenge. We compared Jump3<br />

to other GRN inference methods: two model-free methods,<br />

which are time-lagged variants of GENIE3 (Huynh-Thu et<br />

al., 2010) and CLR (Faith et al., 2007) respectively; two<br />

model-based methods, namely Inferelator (Greenfield et<br />

al., 2010) and TSNI (Bansal et al., 2006); and G1DBN<br />

(Lèbre, 2009), a method based on dynamic Bayesian<br />

networks. Areas Under the Precision-Recall curves<br />

(AUPRs) obtained for size-100 networks are shown in<br />

Table 1. Jump3 yields the highest AUPR in the case of the<br />

toy data. As expected, its performance decreases when the<br />

networks are inferred from the DREAM4 data, due to the<br />

mismatch between the on/off model and the one used to<br />

simulate the data. However, Jump3 still outperforms the<br />

other methods.<br />

Method        Toy              DREAM4<br />

Jump3         0.272 ± 0.060    0.187 ± 0.058<br />

GENIE3-lag    0.114 ± 0.010    0.176 ± 0.056<br />

CLR-lag       0.088 ± 0.008    0.169 ± 0.047<br />

Inferelator   0.069 ± 0.006    0.144 ± 0.036<br />

TSNI          0.020 ± 0.003    0.042 ± 0.010<br />

G1DBN         0.104 ± 0.024    0.114 ± 0.043<br />

TABLE 1. Comparison of network inference methods (mean AUPR and<br />

standard deviation).<br />
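For readers unfamiliar with the metric, the sketch below shows one common way to summarise a precision-recall curve from a ranked list of weighted links. It assumes the step-wise "average precision" estimate of the area and ignores ties; the exact AUPR computation behind Table 1 is not specified here and may differ. The function name `average_precision` and the toy edges and weights are hypothetical.

```python
def average_precision(scored_edges, true_edges):
    """Step-wise estimate of the area under the precision-recall curve.

    scored_edges: list of ((regulator, target), weight) pairs.
    true_edges: set of (regulator, target) pairs in the gold standard.
    Assumption: ties in weight are broken by input order.
    """
    if not true_edges:
        raise ValueError("the gold standard must contain at least one edge")
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    hits = 0
    total = 0.0
    for rank, (edge, _weight) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            # Precision at the rank where recall increases.
            total += hits / rank
    return total / len(true_edges)


# Hypothetical predictions and gold standard for a three-gene network.
predictions = [
    (("g1", "g2"), 0.9),
    (("g3", "g2"), 0.8),
    (("g1", "g3"), 0.4),
    (("g2", "g1"), 0.1),
]
gold = {("g1", "g2"), ("g1", "g3")}

score = average_precision(predictions, gold)
```

In this toy case the true links are ranked first and third, so the score is (1/1 + 2/3) / 2 = 5/6; a ranking that places every true link ahead of every false one scores 1.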

We also applied Jump3 to gene expression data from<br />

murine bone marrow-derived macrophages treated with<br />

interferon gamma (Blanc et al., 2011). Several of the hub<br />

TFs in the predicted network have biologically relevant<br />

annotations. They include interferon genes, one gene<br />

associated with cytomegalovirus infection, and cancer-associated<br />

genes, showing the potential of Jump3 for<br />

biologically meaningful hypothesis generation.<br />

REFERENCES<br />

Bansal M et al. Bioinformatics 22, 815-822 (2006).<br />

Blanc M et al. PLoS Biol 9, e1000598 (2011).<br />

Faith JJ et al. PLoS Biol 5, e8 (2007).<br />

Greenfield A et al. PLoS ONE 5, e13397 (2010).<br />

Huynh-Thu VA & Sanguinetti G. Bioinformatics 31, 1614-1622 (2015).<br />

Huynh-Thu VA et al. PLoS ONE 5, e12776 (2010).<br />

Lèbre S. Stat Appl Genet Mol Biol 8, Article 9 (2009).<br />

Prill RJ et al. PLoS ONE 5, e9202 (2010).<br />

Ptashne M & Gann A. Genes and Signals. Cold Spring Harbor<br />

Laboratory Press (2002).<br />
