bbc 2015

BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015
10th Benelux Bioinformatics Conference bbc 2015
Abstract ID: P
Poster

P62. FLOREMI: SURVIVAL TIME PREDICTION BASED ON FLOW CYTOMETRY DATA

Sofie Van Gassen (1,2,3,*), Celine Vens (2,3,4), Tom Dhaene (1), Bart N. Lambrecht (2,3) & Yvan Saeys (2,3).
(1) Department of Information Technology, Ghent University - iMinds; (2) VIB Inflammation Research Center; (3) Department of Respiratory Medicine, Ghent University; (4) Department of Public Health and Primary Care, KU Leuven Kulak.
(*) sofie.vangassen@irc.vib-ugent.be

Flow cytometry is a high-throughput technique for single-cell analysis. It enables researchers and pathologists to study blood and tissue samples by measuring several cell properties, such as cell size, granularity and the presence of cellular markers. While this technique provides a wealth of information, analyzing all of the data manually becomes difficult. The FlowCAP challenges were organized to investigate automated alternatives. We present an algorithm that obtained the best results in the FlowCAP IV challenge, which involved predicting the time of progression to AIDS for HIV patients.

INTRODUCTION

The main task of the most recent challenge, FlowCAP IV, was survival modeling: participants had to predict the time of progression to AIDS for HIV patients, based on flow cytometry data from an unstimulated and a stimulated blood sample. A secondary task was to identify cell populations that could be indicative of the progression rate. Two difficulties had to be taken into account: the raw dataset was about 20 GB in size, and about eighty percent of the survival times were censored.

METHODS

We developed a new algorithm, FloReMi, which combines several preprocessing steps with a density-based clustering algorithm, a feature selection step and a random survival forest (Van Gassen et al., 2015).
The input for our algorithm consisted of two flow cytometry samples per patient: one unstimulated PBMC sample and one PBMC sample stimulated with HIV antigens. For each of these samples, 16 parameters were measured for hundreds of thousands of cells. First, we applied quality control to remove erroneous measurements from the samples. We also made an automatic selection of live T cells to focus on the cells of interest in this specific flow cytometry staining.

Once the dataset was cleaned up, we extracted features for each patient. This was done by clustering the cells using the flowDensity (Malek et al., 2015) and flowType (Aghaeepour et al., 2012) algorithms. These algorithms divide the values of each marker into either "high" or "low" and use all combinations of "high", "low" or "neutral" marker values to group the cells. This resulted in 3^10 = 59,049 different cell subsets. For each of these subsets, we computed the number of cells assigned to it and the mean fluorescence intensity for 13 markers, giving 14 values per subset. Per patient, we collected these numbers for both samples and also computed the differences between the two. This resulted in a total of 59,049 x 14 x 3 = 2,480,058 features per patient.

Because traditional machine learning algorithms cannot handle this number of features, we then applied a feature selection step. To estimate the usefulness of a feature, we fitted a Cox proportional hazards model on each feature separately. The resulting p-value indicates how well the feature corresponds with the known survival times for the training set. We ordered the features by these scores and picked only those that were uncorrelated with the others. This resulted in a final selection of 13 features, on which we applied several machine learning techniques. We compared the results of the Cox Proportional Hazards model, the Additive Hazards model and the Random Survival Forest.

RESULTS & DISCUSSION

All three methods performed well on the training dataset.
However, on the test dataset, both the Cox Proportional Hazards model and the Additive Hazards model performed poorly, probably due to overfitting on the training data. Only the Random Survival Forest obtained good results on the test dataset (Figure 1). This method outperformed all other methods submitted to the challenge.

FIGURE 1. On the training dataset, there was a strong correlation between the scores and the actual survival times for all models. On the test dataset, only the Random Survival Forest performed well.

One important challenge remains: the biological interpretation of our final features. Although they correlate with the time of transition from HIV to AIDS, they are hard to interpret as known cell types because our feature extraction is unsupervised. Our method is a first step towards new insights into the progression from HIV to AIDS.

REFERENCES

Malek M et al. Bioinformatics 31.4, 606-607 (2015).
Aghaeepour N et al. Bioinformatics 28, 1009-1016 (2012).
Van Gassen S et al. Cytometry A, DOI 10.1002/cyto.a.22734 (2015).
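The feature arithmetic and the selection step described in the Methods of P62 can be illustrated with a minimal sketch. It is not the FloReMi implementation. The sketch assumes that the per-feature Cox p-values have already been computed by some survival library, and that "uncorrelated" means an absolute Pearson correlation below a cutoff. The cutoff value, the function names and the data layout are all hypothetical. The figure of 10 clustered markers is inferred from the 3^10 subsets quoted in the abstract.

```python
from math import sqrt

N_CLUSTER_MARKERS = 10  # inferred from the 3^10 subsets quoted in the abstract
N_MFI_MARKERS = 13      # mean fluorescence intensity is reported for 13 markers
N_VIEWS = 3             # unstimulated sample, stimulated sample, their difference


def count_features():
    """Reproduce the per-patient feature count quoted in the abstract."""
    subsets = 3 ** N_CLUSTER_MARKERS   # each marker is high, low or neutral
    per_subset = 1 + N_MFI_MARKERS     # one cell count plus 13 MFI values
    return subsets * per_subset * N_VIEWS


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_uncorrelated(features, p_values, max_abs_corr=0.2, max_features=None):
    """Greedy filter: walk features from best to worst p-value and keep a
    feature only if it is weakly correlated with every feature kept so far.

    features: dict name -> list of values, one per training patient
    p_values: dict name -> precomputed univariate Cox p-value (assumed input)
    max_abs_corr: hypothetical cutoff; the abstract does not state one
    """
    ranked = sorted(features, key=lambda name: p_values[name])
    selected = []
    for name in ranked:
        if all(abs(pearson(features[name], features[kept])) <= max_abs_corr
               for kept in selected):
            selected.append(name)
        if max_features is not None and len(selected) == max_features:
            break
    return selected


if __name__ == "__main__":
    print(count_features())
    toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [1, -1, -1, 1]}
    toy_p = {"a": 0.001, "b": 0.002, "c": 0.03}
    print(select_uncorrelated(toy, toy_p))
```

In the toy example, feature "b" is dropped because it is almost perfectly correlated with the better-ranked feature "a", while "c" is kept. With millions of candidate features, a practical version would first truncate the ranked list before computing pairwise correlations.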
P63. STUDYING BET PROTEIN-CHROMATIN OCCUPATION TO UNDERSTAND GENOTOXICITY OF MLV-BASED GENE THERAPY VECTORS

Sebastiaan Vanuytven (1,*), Jonas Demeulemeester (1), Zeger Debyser (1) & Rik Gijsbers (1,2).
(1) Laboratory for Molecular Virology and Gene Therapy, KU Leuven; (2) Leuven Viral Vector Core, KU Leuven.
(*) Sebastiaan.vanuytven@student.kuleuven.be

Integrating retroviral vectors are used to treat genetic and acquired disorders that, theoretically, can be cured by introducing specific gene expression cassettes into patient cells. Clinical trials held over the past two decades have shown that this approach is effective in curing genetic disorders and can produce better results than the standard therapy (Touzot F et al., 2015). Nevertheless, adverse events in a limited number of patients treated with gammaretroviral vectors have deterred their widespread application. Specifically, vector integration in the proximity of proto-oncogenes resulted in insertional mutagenesis and clonal expansion of the cells (Hacein-Bey-Abina S et al., 2003).

INTRODUCTION

Retroviruses and their derived viral vectors do not integrate at random. Their overall integration pattern is dictated by cellular cofactors that are co-opted by the invading viral complex. For gammaretroviral vectors (prototype MLV), the cellular bromo- and extraterminal domain (BET) family of proteins (BRD2, BRD3 and BRD4) tethers the viral integrase to the host cell chromatin (De Rijck J et al., 2013). At the moment, the only available ChIP-seq data derive from HEK-293T cells exogenously overexpressing FLAG-tagged versions of the BET proteins (LeRoy G et al., 2012). The detailed chromatin binding profile of endogenous BET proteins in human cells therefore remains unknown. Here we report on the chromatin occupation of the endogenous BET proteins in K562 and human primary CD4+ T cells.
METHODS

Following fixation, all three BET proteins were pulled down with specific antibodies (Bethyl Laboratories, α-BRD2: A302-583A; α-BRD3: A302-368A; α-BRD4: A301-985A or Abcam ab84776). Subsequently, 1x10^7 cells per sample were processed for ChIP as previously described (Pradeepa MM et al., 2012). ChIPed DNA was amplified with WGA2 using the manufacturer's protocol (Sigma Aldrich). All ChIP experiments were done with at least two biological replicates in K562 and CD4+ T cells. After processing the ChIP-seq data, we compared the obtained BET protein-binding sites with MLV integration sites, histone modifications and other genetic features. Furthermore, we used motif discovery in the neighbourhood of BET binding sites and MLV integration sites to identify potential new players in the MLV integration process.

RESULTS & DISCUSSION

Analysis showed that 24% of the MLV integration sites overlap with a BET-binding site in K562 cells, the majority of which are BRD4 sites. In addition, BET binding sites located in promoter and enhancer regions are preferred for MLV integration. Further evaluation demonstrated a strong correlation between MLV integration in these sites and the occurrence of the transcription factor recognition motifs for MAX, GATA2, EGR1, GABPA and YY1. This suggests a role for these proteins, or the underlying chromatin structures, in targeting MLV integration to these locations in the genome via interaction with BET proteins and/or the MLV long terminal repeat sequences.

Recently, we generated MLV-based vectors that no longer recognize BET proteins: BET-independent MLV-based (BinMLV) vectors (El Ashkar S et al., 2014). A principal component analysis (PCA) shows that the integration preferences of BinMLV vectors are shifted away from epigenetic marks associated with enhancers and promoters, and that the vectors also associate less with BET and MAX binding sites.
Even though BinMLV vectors still did not integrate at random, their overall distribution can be described as safer, with 3% more integration sites in so-called genomic "safe-harbor" regions (Sadelain M et al., 2012).

REFERENCES

De Rijck J et al. The BET family of proteins targets moloney murine leukemia virus integration near transcription start sites, Cell Rep, 5, 886-894, (2013).
El Ashkar S et al. BET-independent MLV-based Vectors Target Away From Promoters and Regulatory Elements, Mol Ther Nucleic Acids, 3, e179, (2014).
Hacein-Bey-Abina S et al. LMO2-associated clonal T cell proliferation in two patients after gene therapy for SCID-X1, Science, 302, 415-419, (2003).
LeRoy G et al. Proteogenomic characterization and mapping of nucleosomes decoded by Brd and HP1 proteins, Genome Biol, 13, R68, (2012).
Pradeepa MM et al. Psip1/Ledgf p52 binds methylated histone H3K36 and splicing factors and contributes to the regulation of alternative splicing, PLoS Genet, 8, e1002717, (2012).
Sadelain M, Papapetrou EP and Bushman FD. Safe harbours for the integration of new DNA in the human genome, Nat Rev Cancer, 12, 51-58, (2012).
Touzot F et al. Faster T-cell development following gene therapy compared with haploidentical HSCT in the treatment of SCID-X1, Blood, 125, 3563-3569, (2015).
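The overlap statistic reported in the Results of P63 (the fraction of MLV integration sites that fall inside a BET-binding site) can be illustrated with a minimal sketch. The abstract does not say how overlap was defined or which tools were used, so everything here is an assumption: integration sites are treated as single positions, binding sites as half-open intervals [start, end), and a site counts as overlapping only if it lies inside an interval. The function names and the toy coordinates are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_peaks(peaks):
    """Group (chrom, start, end) half-open intervals per chromosome and merge
    overlapping or touching ones, so each position maps to at most one interval."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)  # extend the current interval
            else:
                out.append([start, end])
        # Store parallel sorted lists of starts and ends for binary search.
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def fraction_overlapping(sites, peaks):
    """Fraction of (chrom, position) sites that lie inside any peak."""
    if not sites:
        return 0.0
    merged = merge_peaks(peaks)
    hits = 0
    for chrom, pos in sites:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits / len(sites)


if __name__ == "__main__":
    toy_peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
    toy_sites = [("chr1", 120), ("chr1", 299), ("chr1", 300),
                 ("chr2", 10), ("chr3", 5)]
    print(fraction_overlapping(toy_sites, toy_peaks))
```

Merging the intervals first keeps each lookup to one binary search. A window-based definition of overlap could be emulated by widening each peak by a fixed number of bases before merging.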