Annual Scientific Report 2015
EMBL_EBI_ASR_2015_DigitalEdition
Stegle Group<br />
Statistical Genomics &<br />
Systems Genetics<br />
We use computational approaches to map genotype to phenotype on a<br />
genome-wide scale. Using statistics, we seek to understand how genetic<br />
background and environment jointly shape phenotypic traits or cause diseases,<br />
how genetic and external factors are integrated at different molecular layers, and<br />
how molecular signatures vary between individual cells.<br />
To make accurate inferences from high-dimensional<br />
‘omics datasets, it is essential to account for biological<br />
and technical noise and to propagate evidence strength<br />
between the different steps of a given analysis. To<br />
address these needs, we develop statistical analysis<br />
methods in the areas of gene regulation, genome-wide<br />
association studies (GWAS) and causal reasoning<br />
in molecular systems. Our methodological work ties<br />
in with experimental collaborations, and we actively<br />
develop methods to fully exploit large-scale datasets<br />
that are obtained using the most recent technologies. In<br />
doing so, we derive computational methods to dissect<br />
phenotypic variability at the level of the transcriptome,<br />
epigenome and the proteome, and derive advanced<br />
statistical methods for the emerging field of<br />
single-cell biology.<br />
Major achievements<br />
In 2015 we developed and applied methods for linking<br />
genetic variation data to phenotype. We derived a<br />
new statistical model for studying genetic<br />
associations between sets of genetic variants and<br />
multiple correlated phenotypes (Casale et al., 2015).<br />
The model makes it possible to interrogate very large<br />
cohorts with hundreds of thousands of samples,<br />
increases statistical power and clarifies the genetic basis<br />
of phenotypic correlation between genetically diverse<br />
individuals.<br />
In addition to deriving new statistical tools, we<br />
actively applied these methods to study the regulatory<br />
consequences of copy-number changes and other<br />
structural variants in the human genome on gene-expression<br />
levels. In a collaboration with the Korbel<br />
team at EMBL Heidelberg, we surveyed the effect of<br />
structural variants on gene expression at a genome-wide<br />
scale using the data from the final release of the 1000<br />
Genomes Project (Sudmant et al., 2015).<br />
In parallel to our efforts in population genomics, we<br />
extended our methodological work to the field of<br />
single-cell genomics. In collaboration with the Marioni<br />
and Teichmann groups at EMBL-EBI we devised new<br />
ways to dissect transcriptional heterogeneity between<br />
single cells (Buettner et al., 2015). Our approach, for<br />
the first time, enables modelling both known and<br />
unknown factors that underlie single-cell transcriptome<br />
variation. This method has already helped identify new<br />
sub-clusters of cells in single-cell RNA-seq studies of<br />
differentiating T-cells and will be an important building<br />
block for our future aims.<br />
Future plans<br />
In 2016 we will continue to develop innovative statistical<br />
approaches to analyse data from high-throughput<br />
genetic and molecular profiling studies. Our ongoing<br />
efforts are motivated by single-cell genomics data, in<br />
particular using new assays that allow us to profile multiple<br />
molecular layers in the same sets of cells in parallel.<br />
By linking these layers, we hope to gain new insights<br />
into gene regulation, the sources of transcriptome<br />
heterogeneity and, ultimately, cell-fate decisions in<br />
development and cell differentiation.<br />
Selected publications<br />
Buettner F, et al. (2015) Computational analysis of<br />
cell-to-cell heterogeneity in single-cell RNA-sequencing<br />
data reveals hidden subpopulations of cells. Nature<br />
Biotechnol. 33:155-160<br />
Casale FP, et al. (2015) Efficient set tests for the genetic<br />
analysis of correlated traits. Nature Methods 12:755-758<br />
Stegle O, Teichmann SA and Marioni JC (2015)<br />
Computational and analytical challenges in single-cell<br />
transcriptomics. Nature Rev. Genet. 16:133-145<br />
Stephan J, Stegle O and Beyer A (2015) A random forest<br />
approach to capture genetic effects in the presence of<br />
population structure. Nature Commun. 6:7432<br />
Sudmant PH, et al. (2015) An integrated map of<br />
structural variation in 2,504 human genomes. Nature<br />
526:75-81<br />
2015 EMBL-EBI Annual Scientific Report