
Literature Study<br />

MOLECULAR IMAGING<br />

IN<br />

BIOINFORMATICS<br />

Exploring Interdisciplinary Connections<br />

February 11, 2008<br />

Bioinformatics<br />

Information and Communication Theory Group<br />

Delft Technical University<br />

Laboratory for Clinical and Experimental Image Processing (LKEB)<br />

Radiology<br />

Leiden University Medical Center<br />

Author:<br />

Martin Wildeman, 1047973<br />

Supervisors:<br />

Prof. dr. ir. M. J. T. Reinders<br />

Dr. ir. B. P. F. Lelieveldt


Contents<br />

1 Introduction<br />
2 Molecular Imaging<br />
2.1 About Molecular Imaging<br />
2.2 Novel contrast mechanisms<br />
2.2.1 About Reporter Genes<br />
2.2.2 Direct and Indirect Protein Detection<br />
2.2.3 Reporter Gene Applications<br />
2.2.4 Current Limitations on Reporter Genes<br />
2.3 Molecular Imaging Modalities<br />
2.3.1 Nuclear Imaging<br />
2.3.2 Computed Tomography<br />
2.3.3 Magnetic Resonance Imaging<br />
2.3.4 Optical Imaging<br />
2.3.5 Ultrasound Imaging<br />
2.4 Acquisition Challenges<br />
2.4.1 Quantification of BLT and FMT<br />
2.4.2 Combining Information: Multi-modality fusion<br />
2.4.3 Combining Information: Follow Up Registration<br />
2.4.4 Current Limitations in Molecular Imaging<br />
3 Molecular Imaging as extra data source for model generation<br />
3.1 Acquisition of Spatiotemporal Gene Expression Data<br />
3.2 Inferring a Quantitative Model using Spatiotemporal Protein Expression<br />
3.3 Quantitative vs. Qualitative Network Models<br />
3.4 Modeling pathways using time series expression data, using conventional micro-array data<br />
3.5 Discussion<br />
3.5.1 General<br />
3.5.2 Creating models for whole body imaging data<br />
4 Molecular Imaging as a means for hypothesis testing<br />
4.1 Gene Tracking<br />
4.2 Cell Tracking<br />
4.3 General signal detection and limitations<br />
4.4 Discussion<br />
5 Discussion<br />
5.1 Advantages of MI for the field of bioinformatics<br />
5.2 Current Issues and Challenges<br />
5.3 Conclusion


Abbreviations<br />

This paper uses many abbreviations. For readability, they are listed here:<br />

• AFP - Auto Fluorescent Protein<br />

• BLI - Bioluminescence Imaging<br />

• BLT - Bioluminescence Tomography<br />

• BRET - Bioluminescence Resonance Energy Transfer<br />

• (C)CCD - (Cooled) Charge-Coupled Device<br />

• CRET - Chemiluminescence Resonance Energy Transfer<br />

• CT - Computed Tomography<br />

• (D)BN - (Dynamic) Bayesian Network<br />

• ES Cell - Embryonic Stem cell<br />

• FMI - Fluorescence Molecular Imaging<br />

• FMT - Fluorescence Molecular Tomography<br />

• FRET - Fluorescence Resonance Energy Transfer<br />

• GOI - Gene of interest<br />

• GFP - Green Fluorescent Protein<br />

• MI - Molecular Imaging<br />

• MRI - Magnetic Resonance Imaging<br />

• NMR - Nuclear Magnetic Resonance<br />

• PET - Positron Emission Tomography<br />

• SNR - Signal to Noise Ratio<br />

• SPECT - Single Photon Emission Computed Tomography<br />

• WT - Wild Type<br />

• YAC - Yeast Artificial Chromosome<br />



CHAPTER 1<br />

Introduction<br />

This literature study presents the results of research into possible connections between two fields of research: bioinformatics and molecular imaging. To be able to identify potential connections, the possibilities, limitations and pitfalls of both fields were studied. Existing techniques from each field were then translated and interpreted in terms of possible applications in the other field.<br />

To be able to study the two fields, it is first important to define both fields as they will be used in this paper.<br />

Firstly, the term bioinformatics in this study has been narrowed down to the definition<br />

of computational biology, as given by the NIH: Computational Biology is “the<br />

development and application of data-analytical and theoretical methods, mathematical<br />

modeling and computational simulation techniques to the study of biological, behavioral,<br />

and social systems” [1].<br />

Secondly, the term molecular imaging in this study is defined as “the in vivo characterization and measurement of biological processes at a cellular and molecular level in a noninvasive manner”. In this paper the term will mainly refer to the field of small animal whole body molecular imaging.<br />

Recent developments in molecular imaging have made it possible to visualize gene expression in vivo. It has thereby become possible to acquire data sets that cover gene expression in time and in space. These new data could be useful for computational biology, but how they can be used is still a topic of research. Conversely, some analytical tools from computational biology could aid the research that is currently done with molecular imaging, and turn the mostly qualitative interpretations of data given nowadays into statistically sound quantitative measurements.<br />

This paper is divided into five chapters, including this introduction. First, an overview of the background knowledge needed to study possible connections between the two fields is presented in Chapter 2. After the basics of biology and molecular imaging have been covered, a study of existing techniques from computational biology is presented in Chapter 3, including possible applications to the field of molecular imaging. Chapter 4 covers current visualizations in molecular imaging, including a review of how statistical tests can be applied to these visualizations. The last chapter presents a discussion of global concepts and challenges.<br />


CHAPTER 2<br />

Molecular Imaging<br />

2.1 About Molecular Imaging<br />

Molecular Imaging can be defined as the in vivo characterization and measurement of<br />

biological processes at a cellular and molecular level in a noninvasive manner. Molecular<br />

Imaging is a relatively new imaging paradigm that instead of looking at macroscopic<br />

physical processes, sheds light onto biological processes. This field of research has its<br />

roots in the field of nuclear medicine, where images are acquired with Positron Emission<br />

Tomography (PET), by using radio labeled tracers. These tracers are injected into<br />

patients to visualize components of interest. The main advantage of molecular imaging, compared to other imaging techniques such as cryosectioning, is that biological processes can be measured in the same animal throughout the whole study. This way, in follow-up studies over time, it is certain that the same process is observed and studied, and thus no correction for differences in anatomy between organisms is needed. Furthermore, fewer animals are sacrificed compared to invasive studies, which is an improvement from an ethical point of view.<br />

Two developments have made it possible for Molecular Imaging to emerge. Firstly new<br />

contrast agents have been developed, which make current modalities from medical<br />

imaging able to be used for detecting molecular processes. This will be covered in<br />

section 2.2. Secondly, imaging devices have been miniaturized, which allows for small<br />

animal research and thus introduces molecular imaging to the pre-clinical and research<br />

laboratories. This will be discussed in section 2.3.<br />

2.2 Novel contrast mechanisms<br />

With the advent of new specific contrast agents, the field of molecular imaging has advanced rapidly. Based on new, advanced biological insights it has become possible to construct probes that bind to specific biomarkers. Biomarkers are proteins that are specific for some type of tissue or disease. Contrast agents can be fused to proteins directly, for instance to monoclonal antibodies, so that they bind to specific receptors that are, for example, uniquely expressed in certain tissue cells. Methods also exist to encapsulate contrast agents in carrier proteins. In molecular imaging, specific molecules, cells or tissues are visualized by means of these contrast agents. To be able to do so, four basic criteria always have to be met [2]. First, the affinity of the molecular probe has to be high and specific enough to discriminate between different cell types. Second, the probe has to be able to cross all kinds of barriers, such as the blood-brain barrier, so that it diffuses homogeneously throughout the body, or at least the ‘spread function’ of the diffusion has to be known so that it can be corrected for. Third, it must be possible to amplify the signal of the contrast agent. Fourth, the acquisition devices must be sensitive enough to measure the low concentrations of the contrast agents.<br />

In the last decades it has become possible to visualize gene expression in vivo by the use of reporter genes. These reporter genes are in fact contrast enhancers for a specific modality. Reporter genes are used in nuclear imaging and optical imaging, but techniques have also been developed for magnetic resonance and ultrasound. These new contrast agents enable the study of gene expression in both space and time, which is an advantage over micro-arrays, the current standard for measuring gene expression, because micro-arrays only allow for temporal expression profiles. No spatial component is possible with micro-array measurements, because micro-arrays measure RNA concentrations in a solution extracted from animal tissue, which basically yields an average expression level. The only way to incorporate some qualitative spatial expression profile in micro-arrays is to make use of sectioned tissue profiling [3]. This literature study will mainly focus on reporter gene expression and its measurement in molecular imaging.<br />

2.2.1 About Reporter Genes<br />

The purpose of reporter genes is to make invisible gene expression visible. Also<br />

substrate-protein and protein-protein interactions or other molecular events that are<br />

normally not visible may become detectable in an indirect manner. When using reporter<br />

genes it is important to keep in mind that the genes that are detected are not the<br />

compound of interest, but that the measurements are expected to be directly correlated<br />

with these compounds. In this way information on non detectable processes can still<br />

be acquired. In Bright Field Microscopy and (Laser Scanning) Confocal Microscopy<br />

it already was possible to directly view gene expression by tagging proteins with auto<br />

fluorescent protein (AFP) genes. A lot of research has been done on these AFPs and<br />

currently a range of dyes with an emission wavelength between 500 and 950 nm is<br />

available.<br />

Another gene used as a reporter is found in the North American firefly (Photinus pyralis) and codes for the enzyme luciferase. Luciferase produces light by catalyzing a chemical reaction with its substrate luciferin and ATP. Luciferase was first used as a reporter for measuring the concentration of ATP in samples in spectroscopic experiments [4].<br />

Reporter genes can be used to report on genes whose expression is otherwise invisible. This is done by expressing the reporter gene at the same time and rate as the gene of interest. The behavior of the reporter gene is then studied and the results are extrapolated to the gene of interest. If a reporter gene is expressed, it is very likely that the gene of interest is also expressed, given of course that both have the same promoter (region).<br />

Fig. 2.1: A. The transcription of a gene is regulated by its promoter. All kinds of regulating transcription factors bind to this promoter with a certain affinity. B. If the same promoter is placed upstream of a reporter gene, this reporter gene will be regulated by the same transcription factors as the gene of interest, and thus in parallel.<br />

Because reporter genes are heterologous, i.e. they do not occur naturally in the host organism, they can be toxic to the host carrying them or, in a less severe case, affect biological processes so that quantitative measurements are no longer reliable. To minimize these effects, regulated gene expression is desirable. Alfke et al. gave a proof of concept in which reporter genes were only expressed at the times that measurements were needed [5].<br />

2.2.2 Direct and Indirect Protein Detection<br />

A reporter gene can be constructed by cutting the gene out of a source DNA using restriction enzymes. If the same promoter as that of the gene of interest (GOI) is placed upstream of the reporter gene, the likely effect is that transcription of the reporter gene will be the same as that of the GOI, see Fig. 2.1. When a copy of the promoter is placed upstream of the reporter gene, the only thing that can be said about the GOI is that it is transcribed. Nothing can be said about post-transcriptional effects (for instance splicing) or about whether the gene is translated into an active enzyme. Caution should also be taken when trying to predict the amount of active gene product (protein) that is formed, because transcription of a gene and translation into a protein do not always relate one to one.<br />

It is also possible to construct proteins with a reporter fused to them. This way the products of the genes of interest can be observed directly [6]. The genes for these so-called fusion proteins are inserted into the genome using standard recombination techniques. GFP proteins are considered to be non-toxic, but it has to be mentioned that fusing a GFP to a protein may alter its functionality or influence post-translational modifications.<br />

A gene can be copied using a technique called Polymerase Chain Reaction (PCR). To do this, the right primers have to be constructed. Primers are short complementary DNA strands (oligonucleotides) that bind with sufficient energy at certain temperatures to provide a starting point for DNA polymerase to begin synthesis. If enough DNA of the copied gene and of the vectors is produced, the two can be ligated into constructs, which in turn can be transfected into host cells. It is also possible to insert the DNA directly into undifferentiated embryonic stem cells (ES cells) and apply recombination. In this way specific genes can be replaced with (non)functional genes or they can be deleted (knockout).<br />

It is important to emphasize that most reporter genes provide an indirect measuring technique. Detection of a reporter is thus not the detection of a functional gene of interest, but merely an indication that the genes downstream of the same promoter as the measured reporter (among which the GOI) are transcribed.<br />

2.2.3 Reporter Gene Applications<br />

With the ability to synthesize gene constructs that can be measured, the question arises what we want to measure. Two things can be measured with reporter genes: the first is the presence and number of cells of a certain genotype, and the second is the expression level of a certain gene.<br />

In the first case, a reporter gene is placed in a construct such that it is positioned downstream of an ‘always on’ promoter, mostly a viral promoter such as SV40 or CMV, and is thus constantly expressed in a cell. If the rate of synthesis within the cell is known, and thereby also the concentration of reporter protein within a cell and the number of photons emitted per cell per second, then the number of observed cells can be determined quantitatively. This can be exploited, for instance, to determine how fast a tumor is growing over time and if, when and where it is metastasizing. Infection processes of viruses, bacteria or parasites can also be studied, as will be discussed in Chapter 4. This technique requires the ability to introduce gene constructs into cell lines.<br />
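The cell-counting argument above can be illustrated with a minimal Python sketch. It assumes that the photon output per cell is known from calibration and constant, and it ignores absorption and scattering in tissue, which in practice complicate quantification. The function names and all numbers are hypothetical and only serve as an illustration.

```python
import math


def estimate_cell_count(total_flux, photons_per_cell, background_flux=0.0):
    """Estimate the number of reporter-expressing cells from a photon flux.

    All fluxes are in photons per second. The per-cell output is assumed to
    be known from calibration and constant ('always on' promoter).
    """
    if photons_per_cell <= 0:
        raise ValueError("photons_per_cell must be positive")
    signal = max(total_flux - background_flux, 0.0)
    return signal / photons_per_cell


def doubling_time(count_start, count_end, elapsed_days):
    """Doubling time in days, assuming exponential growth between two scans."""
    return elapsed_days * math.log(2) / math.log(count_end / count_start)


# Hypothetical calibration and measurements (illustrative numbers only).
PHOTONS_PER_CELL = 50.0  # photons per cell per second
cells_day0 = estimate_cell_count(5.0e6, PHOTONS_PER_CELL)
cells_day6 = estimate_cell_count(2.0e7, PHOTONS_PER_CELL)
tumor_doubling_days = doubling_time(cells_day0, cells_day6, elapsed_days=6.0)
```

With these made-up numbers the signal quadruples in six days, which corresponds to a doubling time of about three days. The same animal is measured at both time points, which is exactly what makes such a follow-up estimate possible.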

In the second case, the reporter gene is placed downstream of the same promoter as a gene of interest. This makes it possible to study gene regulation within an organism. With high-throughput studies, this would allow for spatiotemporal gene expression studies and thereby act as a data source for gene regulatory network inference, as will be discussed in Chapter 3. Measuring gene expression profiles requires the ability to generate transgenic model organisms.<br />

There are several techniques for introducing foreign DNA into animal cells. In cultured cells, micro-injection can be applied. In vivo, DNA can be introduced by particle bombardment. Both methods are called direct DNA transfer. Transfection is also possible, and the last method of introducing foreign DNA is transduction with the use of retroviruses. Gene therapy, for instance, is based on this transduction method. The most used technique for producing transgenic mice is to inject DNA into the pronucleus of a fertilized egg [7]. A targeting vector with an inserted promoter and reporter gene is transferred to the DNA of the recipient cells, and a small percentage of these cells will have the new gene incorporated into their genome. The number of inserted gene copies is not always the same and varies from a few to hundreds. YAC vectors are also used because they can carry larger strands of DNA and are thus able to express larger, more complex proteins. For GFP and luciferase, though, SV40 vectors suffice [8]. For the generation of genetically altered mice, micro-injection into blastocysts is most commonly applied, which at first yields chimeric mice, because the blastocysts then contain both original and transformed ES cells. Offspring of these mice that inherit the same inserted gene from both parents are homozygous. A schematic overview is given in Fig. 2.2.<br />

Fig. 2.2: Constructed genes are purified and inserted into oocytes. A selection is then made from the mice that are born [9].<br />

There is a difference between transient and stable transfection. When genes are inserted into the genome by making use of a recombinase, they will be expressed stably, but when new DNA is inserted extra-chromosomally, it will be degraded over time, because it will not be replicated. For temporal gene expression measurements, stable transfection is needed, also to be certain that each cell contains the same genome.<br />

2.2.4 Current Limitations on Reporter Genes<br />

Gene Transfer Reliability<br />

Transfection is not always effective or efficient. The undetermined copy number of the inserted gene, mentioned before, makes a quantitative analysis of gene expression impossible. When multiple copies are present, this will result in more transcription and thus in more gene expression. To make things worse, copy number and expression level are not always related one to one [10]. With most DNA transfer techniques it is also difficult to predict side effects based on the location where the DNA is inserted. For example, many non-coding RNAs (ncRNAs) have an unknown function and it is expected that many ncRNAs are not (yet) known. The size of ncRNAs varies from about 20 nucleotides (microRNA) to thousands of nucleotides [11]. Random insertions can therefore give unpredicted results.<br />

With a technique called Flp-In from Invitrogen, it becomes easier to insert genes into a genome. The problem to be solved for this technique is to produce a stable cell line that contains only one Flp site and that seems to behave like a normal cell line (the long-term side effects of DNA insertion cannot be predicted). Once such a cell line is generated, virtually any gene can be inserted at the Flp site by Flp recombinase-mediated recombination [12]. A Southern blot can be used to detect whether there is one and only one copy of the inserted Flp site [13].<br />

This technique is mostly used to generate genetically altered cell lines on demand. When cell lines carrying this Flp-In site are transfected with an ‘always on’ promoter and a reporter gene, these cells become trackable with FLI, BLI or any other modality for which a reporter gene exists. Note that it is only possible to track the cells and keep track of the number of cells (quantification). No gene regulation can be monitored using this ‘always on’ technique. This tracking is important for temporal studies of, for example, tumor growth and metastasis, or for tracking infectious agents such as viruses or bacteria, as will be discussed later.<br />

As long as the regulatory effect of non-coding elements is not completely understood,<br />

it cannot be guaranteed that an insertion has no effect, but if a stable cell line with<br />

a Flp insertion is used, it is relatively certain that new insertions at that site have no<br />

side-effects on the normal functioning of the studied organism or cell line.<br />

Diffusion Coefficient<br />

When measuring reporter gene concentration it is important to keep in mind that the reporter and the gene of interest probably have the same rate of synthesis, because they share the same promoter region, but that they are unlikely to have the same degradation rate. With the basic conservation law it can be shown that a protein with a faster degradation rate will appear in a lower concentration than a protein with the same rate of synthesis but a lower degradation rate.<br />

The general formula of gene formation can be stated as follows:<br />

\[
\left(\begin{array}{c}\text{time rate of change}\\ \text{of protein conc.}\end{array}\right) = \text{Regulation} + \text{Diffusion} + \text{Decay} \tag{2.1}
\]

The only term in this equation that is equal for the gene of interest and the reporter gene is the regulation term. The decay rate and the diffusion coefficient differ. As a result, the protein concentration of the gene of interest cannot be determined from the measured protein concentration of the reporter gene. Something qualitative can be said about up- or downregulation, but quantitative measurements of up- or downregulation are not possible if the diffusion and decay parameters are unknown.<br />
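The effect of the decay term can be made concrete with a small Python sketch. It is a simplification of Eq. 2.1 and not a model taken from the cited literature: diffusion is ignored, regulation is reduced to a constant synthesis rate, and decay is assumed to be first order, so that dP/dt = synthesis_rate - decay_rate * P. All parameter values are hypothetical.

```python
import math


def protein_conc(t, synthesis_rate, decay_rate, p0=0.0):
    """Closed-form solution of dP/dt = synthesis_rate - decay_rate * P.

    The concentration relaxes exponentially from p0 towards the steady
    state synthesis_rate / decay_rate.
    """
    steady_state = synthesis_rate / decay_rate
    return steady_state + (p0 - steady_state) * math.exp(-decay_rate * t)


# Hypothetical parameters: identical synthesis (same promoter),
# but the reporter is degraded five times faster than the GOI product.
SYNTHESIS_RATE = 10.0   # concentration units per hour
reporter_level = protein_conc(1000.0, SYNTHESIS_RATE, decay_rate=0.5)
goi_level = protein_conc(1000.0, SYNTHESIS_RATE, decay_rate=0.1)
ratio = goi_level / reporter_level
```

Under these assumptions both proteins are regulated identically, yet the long-term concentration of the gene of interest is five times that of the reporter. A measured reporter concentration therefore only translates into a concentration of the gene of interest if both decay rates are known.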

Post Translational Effects<br />

In addition to these unknown diffusion parameters, it should also be taken into consideration that the fact that a gene is transcribed does not guarantee that the protein is actually formed or, if it is formed, that it will be in a functional shape. Transcribed RNA (pre-mRNA) in eukaryotes is usually spliced into mature messenger RNA (mRNA). The coding sequence of this mRNA determines the number and order of amino acids in the protein. A single pre-mRNA transcript can be spliced in different ways, so that several isoforms of the same gene can appear, which in turn results in different forms of the protein. With reporter genes it is not possible to distinguish between different protein isoforms. Alternative splicing is thought to be one of the most important contributors to the functional complexity of the human genome. Given that different isoforms may be responsible for different regulatory effects, and that a gene can code for up to 40,000 protein isoforms, at least some caution should be taken when interpreting gene expression data [14]. For different forms of splicing, see Fig. 2.3.<br />
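How quickly the number of isoforms grows can be sketched in a few lines of Python. The toy gene below is hypothetical, and the model only covers exon inclusion/exclusion and mutually exclusive exons, not altered splice sites. The last line uses the exon cluster sizes commonly reported for the Drosophila Dscam gene; this example is an assumption added here for illustration and is not necessarily the case referred to in [14].

```python
from itertools import product
from math import prod

# Hypothetical toy gene: each slot lists the alternatives that splicing can
# choose from; None means that the exon is skipped.
toy_gene = [
    ["E1"],          # constitutive exon, always included
    ["E2", None],    # cassette exon, included or excluded
    ["E3a", "E3b"],  # mutually exclusive exons
    ["E4"],          # constitutive exon
]


def isoforms(gene):
    """Return every possible exon combination as a tuple of exon names."""
    return [
        tuple(exon for exon in choice if exon is not None)
        for choice in product(*gene)
    ]


def isoform_count(gene):
    """Number of isoforms: the product of the alternatives per slot."""
    return prod(len(slot) for slot in gene)


all_isoforms = isoforms(toy_gene)
toy_count = isoform_count(toy_gene)

# Mutually exclusive exon clusters reported for Drosophila Dscam
# (12, 48, 33 and 2 alternatives); an outside example, not from this study.
dscam_count = prod([12, 48, 33, 2])
```

The toy gene yields four isoforms, whereas four exon clusters of the sizes listed for Dscam already allow 38,016 combinations. A reporter driven by the same promoter would give the same signal for all of them.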


Fig. 2.3: Different splicing effects are possible. a: Exons can be included or excluded, and splice sites can be altered. b: Initiation of translation or stop signals can be altered, and in-frame deletions or insertions are possible [14].<br />

Fig. 2.4: Many modalities from clinical imaging have been miniaturized for use in Molecular Imaging [16].<br />

Protein Tagging<br />

When protein tagging is possible, it is relatively certain that the molecule that is visualized is the product of the gene of interest. The main reporters used for tagging are proteins of the GFP family. Although these are thought to be non-toxic, it should be taken into account that tagging may alter the functionality of proteins and may thereby alter biological regulation and functioning in the studied organisms [15]. Biological processes are based on equilibria, and minor distortions may have large effects.<br />

2.3 Molecular Imaging Modalities<br />

Besides the emergence of in vivo gene reporters, another trend seen in the field of molecular imaging is the miniaturization of detection devices. These micro devices are cheaper than their clinical counterparts and allow for small animal whole body imaging [16]. Because these new acquisition devices are smaller, some scaling problems need to be tackled, for instance how much resolution is needed to obtain meaningful information and what the measured volume must be [2].<br />

In short, commonly seen reporter genes can be grouped by imaging modality into three categories: radionuclide imaging, optical imaging and magnetic resonance imaging. Each category has its own advantages and disadvantages in terms of resolution, sensitivity, acquisition time and substrate administration [16]. The modalities CT and echography (ultrasound) can also be used in molecular imaging, but because they cannot, or can hardly, be used for visualizing gene expression, they are discussed in less detail in this literature study. It should be noted, though, that CT may provide much extra information as an underlying modality if extra resolution or spatial context is required. To be able to use this information, image registration is needed, as discussed in section 2.4.2.<br />

Most imaging modalities seen in medical imaging can be used in molecular imaging, given appropriate contrast agents. Nuclear imaging, radiographic imaging (CT), magnetic resonance imaging, optical imaging and ultrasound imaging are described briefly below. For each modality a reporter gene, if applicable, and a short description of the acquisition are given. The same argument holds for all modalities: if a contrast enhancer can be bound to a molecular probe, it is suitable as an (indirect) reporter for gene expression, provided that it is not toxic and that it can pass all necessary barriers. A short overview of the different modalities and their general specifications is given in Table 2.1.<br />

2.3.1 Nuclear Imaging<br />

Nuclear Imaging is based on unstable isotopes that emit positrons or γ-rays and thereby fall into a more stable energy state. Two modalities are seen in molecular imaging, namely PET and SPECT. In PET, the most used isotopes are ¹⁵O, ¹³N, ¹¹C and ¹⁸F, all of which emit positrons. When an emitted positron collides with an electron, the pair annihilates into two γ-rays that travel in nearly opposite directions (∼180° apart). In PET, these γ-rays are collected by a ring of gamma detectors and converted to a visible image. Two photons that arrive in the detector ring at the same time (coincide) come from the same source, which must lie on the line between the two detectors (see Fig. 2.5). Because the γ-rays travel along one line, and taking into account attenuation in the different tissue types, the location of the positron-emitting source can be determined in 3D space [16].<br />
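The idea of coincidence detection can be sketched in Python as follows. This is a simplified illustration and not the processing chain of an actual scanner: the event format, the coincidence window of 10 ns, the ring of 32 detectors and the ring radius are all hypothetical, and effects such as scatter, random coincidences and attenuation are ignored.

```python
import math


def find_coincidences(events, window_ns=10.0):
    """Pair detector hits that arrive within one coincidence window.

    events: iterable of (timestamp_ns, detector_id) tuples.
    Returns a list of (detector_a, detector_b) pairs; each pair defines a
    line between two detectors on which the annihilation must have occurred.
    """
    ordered = sorted(events)
    pairs = []
    i = 0
    while i < len(ordered) - 1:
        t0, d0 = ordered[i]
        t1, d1 = ordered[i + 1]
        if t1 - t0 <= window_ns and d0 != d1:
            pairs.append((d0, d1))
            i += 2  # both hits are consumed by this coincidence
        else:
            i += 1  # unpaired hit (single), discard it
    return pairs


def detector_position(detector_id, n_detectors, radius_mm):
    """Position (x, y) of a detector on an idealized circular ring."""
    angle = 2.0 * math.pi * detector_id / n_detectors
    return (radius_mm * math.cos(angle), radius_mm * math.sin(angle))


# Hypothetical list of hits: (time in ns, detector number).
hits = [(0.0, 3), (4.0, 17), (500.0, 8), (1000.0, 5), (1003.0, 21), (1500.0, 2)]
coincidences = find_coincidences(hits)
lines = [
    (detector_position(a, 32, 100.0), detector_position(b, 32, 100.0))
    for a, b in coincidences
]
```

Each entry in `lines` is the pair of end points of one such line. Many of these lines, acquired from all angles, are what the image reconstruction combines to locate the source.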

Isotopes used in SPECT, such as ¹²³I and ⁹⁹ᵐTc, emit γ-rays [19] that do not travel simultaneously in opposite directions. It is thus not possible to use coincidence detection in a detector ring to pinpoint the location of the source of emission. Instead, the γ-rays are detected by special cameras that consist of a pinhole collimator, a scintillating crystal and a photon detector. The scintillating crystal converts the γ-rays to photons in the visible frequency range, which are then detected by the photon detector. Because of the pinholes, only photons traveling along a line parallel to the pinholes/septa are detected. Knowing that captured γ-rays can only have come from the source directly, a line in 2D space on which the source must lie is known (Fig. 2.6). By rotating the camera around the sample, it is possible to reconstruct 2D images. The technique of SPECT is therefore comparable to CT, but photons of a different energy are used. Multiple 2D images acquired with SPECT can be reconstructed into a 3D model in the same way as in CT, as will be seen later.<br />

Fig. 2.5: PET tracers are injected into the organism. PET tracers contain atoms that are unstable and emit positrons. If these positrons collide with electrons, they annihilate into two γ-rays traveling in opposite directions. To measure gene expression, reporter genes are used that can accumulate PET tracers in a cell, so that these cells become visible [17, 18].<br />

Fig. 2.6: SPECT is based on pinhole detection. PET is based on coincidence events [19].<br />

The sensitivity of SPECT is about an order of magnitude lower than what can be achieved with PET. This is because in SPECT the γ-rays have to be channeled through septa in a lead barrier, so that only rays traveling straight through are detected. The longer these septa are, the higher the resolution of SPECT becomes, but also the lower its sensitivity (fewer rays are detected, because more are shielded). An advantage of SPECT over PET is that the tracers used have a longer half-life. This allows for studies of slower or longer biological processes. The biggest disadvantage of SPECT is its lower (but still good) sensitivity compared to PET.<br />

The reporter genes for PET are genes whose products have a high binding specificity for certain<br />

radiolabeled biological molecules. These substrates are normal substrates labeled with<br />

positron-emitting isotopes. To make sure that the overall criteria are met, specifically<br />

barrier crossing, it is important to use a molecular target that is expressed on the surface<br />

of a cell, a so-called cell surface protein, or to make use of a molecular probe that can<br />

freely pass the cell membrane (for an example, see [20]). If the probe can pass the membrane,<br />

it is important that it is ‘trapped’ inside the cell after some chemical reaction, so that<br />

it accumulates there. The accumulated compound must not kill the cell (toxicity),<br />

but its accumulation inside the cell produces a higher signal.<br />

Monoclonal antibodies can also be used to detect certain cell types [21].<br />

2.3.2 Computed Tomography<br />

X-rays make it possible to detect heavy atoms, such as<br />

calcium, because the attenuation of x-rays differs between atoms of different<br />

weight.<br />

By rotating the sample or the scanner, multiple projections of the sample can be obtained<br />

(see Fig. 2.7). The scanned sample can be reconstructed slice by slice, where<br />

multiple projections of a slice are backprojected to obtain a 2D image. The projections<br />

can be filtered before backprojection, to pass or suppress certain frequencies. Heavy<br />

atoms cause more attenuation than light atoms, so CT is sensitive to differences in<br />

(average) atomic weight between tissues. Positions of heavy atoms, or contrast agents, can be<br />

reconstructed with this backprojection algorithm. The resolution of CT is<br />

limited by the ionizing effect of x-rays. This effect causes direct radiation damage and,<br />

in the longer term, DNA damage. To obtain a higher resolution, more rays per voxel are<br />

needed, which causes more damage, and this damage needs to be minimized.<br />
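
The slice-by-slice reconstruction described above can be sketched in a few lines of NumPy.<br />
This is a simplified, unfiltered backprojection under stated assumptions: parallel rays, nearest-bin<br />
sampling, and a hypothetical 64 x 64 phantom with a single attenuating voxel. The function names are<br />
illustrative and are not taken from any of the cited papers.<br />

```python
import numpy as np


def detector_bins(n, angle_rad):
    """Index of the detector bin that each pixel of an n x n slice projects onto."""
    centre = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n] - centre
    t = xs * np.cos(angle_rad) + ys * np.sin(angle_rad)
    return np.clip(np.round(t + centre).astype(int), 0, n - 1)


def project(image, angles_deg):
    """Forward step: sum pixel values along parallel rays for every view angle."""
    n = image.shape[0]
    sinogram = np.zeros((len(angles_deg), n))
    for i, angle in enumerate(np.deg2rad(angles_deg)):
        np.add.at(sinogram[i], detector_bins(n, angle).ravel(), image.ravel())
    return sinogram


def backproject(sinogram, angles_deg):
    """Smear every projection back across the slice and average over all views."""
    n = sinogram.shape[1]
    recon = np.zeros((n, n))
    for row, angle in zip(sinogram, np.deg2rad(angles_deg)):
        recon += row[detector_bins(n, angle)]
    return recon / len(angles_deg)


# Hypothetical phantom: one strongly attenuating voxel in an empty slice.
phantom = np.zeros((64, 64))
phantom[20, 40] = 1.0
angles = np.arange(0, 180, 3)
recon = backproject(project(phantom, angles), angles)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(recon), recon.shape))
```

Without the filtering step the reconstruction is blurred around the voxel, but its brightest point still coincides with the true position, which is the property the text relies on.<br />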

Gene reporting probes, to be detectable, need to contain heavy atoms. The effects of<br />

large quantities of these substrates are not known, and CT is not used to measure gene<br />

expression. X-ray imaging, and especially computed tomography (CT), is<br />

currently used mainly as a structural modality in MI. By making use of modality fusion,<br />

expression data can be fused into a high-resolution spatial context.<br />

Fig. 2.7: Multiple 2D x-ray images of a body are acquired at different rotation angles. From a set of<br />

these images a 3D space can be reconstructed. (kabayim.com/images/spiralCT.jpg)<br />

2.3.3 Magnetic Resonance Imaging<br />

Nuclei are brought into alignment by a strong magnetic field. They can be in a low-energy<br />

spin state, when the nuclear magnetic moment is aligned with the magnetic field, or in a high-energy<br />

spin state, when it is aligned against the field. All elements with a nucleus that has<br />

an odd number of nucleons (protons and/or neutrons) can be used for MRI. To<br />

be more precise, every nucleus that contains an unpaired proton and/or neutron is suitable<br />

for MRI. The most commonly used nuclei are ¹H, ²H, ³¹P, ²³Na, ¹⁴N, ¹³C and<br />

¹⁹F. Every isotope that has a nonzero nuclear spin can be used for Nuclear Magnetic<br />

Resonance. Once the nuclei are aligned with the magnetic field, an RF pulse is generated<br />

by passing a current through a coiled wire around the sample. This pulse brings the<br />

nuclei out of alignment with the static magnetic field. Afterwards, the spins<br />

return to alignment with the static magnetic field, and the time needed for<br />

this realignment, called the spin relaxation time, is measured. This can be done with<br />

the same coil or with an additional electromagnetic coil.<br />

The location of the molecules can be determined by applying a gradient in the strength of<br />

the static magnetic field. This works because the precession frequency of the spins is determined by the<br />

strength of the magnetic field, as shown in equation 2.2.<br />

$$\omega_0 = \gamma B_0 \tag{2.2}$$<br />

Only nuclei that have the same frequency ($\omega_0$) as the RF signal will respond to this<br />

signal. This is why the technique is called Magnetic Resonance. $B_0$ is the strength of the<br />

magnetic field in tesla and $\gamma$ is the gyromagnetic ratio, which is a specific property of<br />

the nucleus.<br />
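
As a worked example of equation 2.2, the sketch below computes the resonance frequency of ¹H and<br />
inverts the relation to recover a position along a field gradient. It assumes the commonly quoted value<br />
of 42.577 MHz/T for the ¹H gyromagnetic ratio divided by 2π, and a hypothetical linear gradient; the<br />
field strength and gradient values are illustrative only.<br />

```python
GAMMA_BAR_1H_MHZ_PER_T = 42.577  # gyromagnetic ratio of 1H divided by 2*pi


def larmor_mhz(b_tesla, gamma_bar=GAMMA_BAR_1H_MHZ_PER_T):
    """Resonance frequency in MHz for a field strength in tesla (equation 2.2)."""
    return gamma_bar * b_tesla


def position_from_frequency(f_mhz, b0_tesla, gradient_t_per_m,
                            gamma_bar=GAMMA_BAR_1H_MHZ_PER_T):
    """Invert f = gamma_bar * (B0 + G * x) to find the position x in metres."""
    return (f_mhz / gamma_bar - b0_tesla) / gradient_t_per_m


# Hypothetical scanner: 3 T static field with a 10 mT/m gradient.
f_centre = larmor_mhz(3.0)
f_offset = larmor_mhz(3.0 + 0.01 * 0.05)   # nucleus located 5 cm from the centre
x = position_from_frequency(f_offset, b0_tesla=3.0, gradient_t_per_m=0.01)
```

Because the gradient makes the field strength depend on position, each resonance frequency corresponds to one position along the gradient axis.<br />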

There are two relaxation processes, $T_1$ and $T_2$, which correspond to the Z axis and the<br />

X-Y plane respectively. Although the differences between them are quite fundamental, they are<br />

considered to be out of the scope of this study.<br />

The measured relaxation times are mainly determined by the chemo-physical environment.<br />

The combination of all measured relaxation times results in an NMR signal in the<br />

time domain. This signal can then be converted to the frequency domain by applying<br />

a Fourier transform [16, 22]. MR is very sensitive to differences in soft tissues. Extra<br />

contrast agents, such as gadolinium or dysprosium, can be used to enhance the MR<br />

signals in regions of interest.<br />
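
The conversion from the time domain to the frequency domain can be illustrated with NumPy's FFT.<br />
The sketch assumes a hypothetical signal made of two decaying cosines; the frequencies, decay times<br />
and sampling rate are arbitrary illustrative values, not measured NMR data.<br />

```python
import numpy as np

fs = 1000.0                       # sampling rate in Hz (assumed)
t = np.arange(1000) / fs          # one second of signal
# Hypothetical components: (frequency in Hz, decay time in s).
components = [(50.0, 0.10), (120.0, 0.05)]
signal = sum(np.exp(-t / tau) * np.cos(2 * np.pi * f * t) for f, tau in components)

spectrum = np.abs(np.fft.rfft(signal))          # magnitude spectrum
freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

strongest = freqs[np.argmax(spectrum)]          # tallest peak overall
upper = freqs > 100.0
second = freqs[upper][np.argmax(spectrum[upper])]  # tallest peak above 100 Hz
```

Each component shows up as a peak at its own frequency, and a faster decay gives a lower, broader peak.<br />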

MR is not yet widely used for imaging gene expression, because of its lack of sensitivity<br />

to small amounts of reporter genes. With appropriate amplification strategies,<br />

though, it is possible to obtain enough signal, and with MR a very high resolution can<br />

be achieved. Louie et al. developed a shielding container that is able to ‘switch off’<br />

gadolinium. In the presence of β-Gal, the protein produced by the LacZ gene,<br />

this shielding container is cleaved in such a way that a coordination site at the Gd³⁺<br />

becomes free and the agent is ‘activated’ (see Fig. 2.8). The activated Gd atom generates a<br />

roughly twofold stronger signal than the inactive Gd. Furthermore, MR does not suffer<br />

from the limitations concerning spatial reconstruction algorithms that are seen in<br />

optical imaging [23].<br />

Fig. 2.8: The gadolinium encapsulation is cleaved by β-galactosidase at the red bond shown in (A).<br />

This way the Gd³⁺ becomes detectable by MRI once it comes into contact with water. On the left<br />

is the intact cage and on the right is the cleaved cage, where gadolinium is free. (A) shows the<br />

structural formula and (B) shows the same molecules in a space-<br />

filling model. The purple atom that can be seen in (B), right, is the free gadolinium atom<br />

[23].<br />

MRI is still mainly used in MI as an extra structural modality for modality fusion.<br />

Combined PET-MRI scanners also exist, but combined PET-CT scanners are more common.<br />

2.3.4 Optical Imaging<br />

Optical imaging makes use of the frequency spectrum in the range of visible and<br />

near-infrared light. Images are acquired with basic CCD cameras. Photography in<br />

the clinical field was mainly used to document phenotypic effects of diseases or<br />

injuries, mostly for educational purposes, but with the advent of optical contrast<br />

agents, it is now possible to use it as a molecular imaging modality. An<br />

important development that made this possible is the availability of more sensitive cameras.<br />

These cameras work in the same way as normal CCD cameras, but they<br />

are cooled down. The technique is called CCCD (Cooled Charge Coupled Device) and<br />

allows light sources of very low intensity to be detected.<br />


Fig. 2.9: Schematic overview of different capturing techniques. (a) and (b) show planar imaging; (c) shows<br />

the principle of tomography. (d) is a reconstructed result of optical tomography, of which<br />

the emission source has yet to be calculated [25].<br />

Fluorescence Molecular Imaging<br />

The most common Auto Fluorescent Proteins (AFPs) are the eGFPs (enhanced Green Fluorescent<br />

Proteins). These proteins must be excited with an outside light source, the<br />

excitation beam or source. An AFP must be excited with light of a higher energy than the light it<br />

emits. Therefore, with appropriate filtering, the emitted light can be separated from the excitation light for imaging.<br />

In this way only the light that originates from the AFPs is recorded. This is<br />

done because light from other homologous AFPs with<br />

overlapping spectra might otherwise interfere. With FMI, images can be acquired in planar form, resulting<br />

in a 2D image, or with a technique called optical tomography, which yields a 3D image.<br />

The penetration depth of tomography is much higher than that of planar<br />

imaging, but planar imaging allows much higher throughput [24]. A<br />

schematic view of the different capturing techniques is given in Fig. 2.9.<br />

Bioluminescence Imaging<br />

When bioluminescent proteins, of which luciferase is the most common, are present in<br />

an organism, an image of the gene expression can also be made with a Cooled CCD<br />

camera. This is called bioluminescence imaging (BLI). Although the emission intensity<br />

of light in BLI is much lower than in FMI, BLI has a much higher sensitivity. This<br />

is because there is less background signal in BLI: the only sources of light are the<br />

proteins themselves [25]. Bioluminescent sources can be detected by using a very sensitive<br />

camera, combined with a dark chamber in which no photons are present other than the<br />

photons of the bioluminescent protein. A schematic overview of the steps needed for BLI<br />

is shown in Fig. 2.10.<br />

Fig. 2.10: Schematic of Bioluminescence Imaging. (A) BLI genes are inserted into cell lines<br />

or DNA constructs, (B) which are then inserted into an animal model, (C) and images are<br />

captured. (D) Acquired data is then quantified and visualized [26].<br />

Protein-protein interaction with FRET, BRET and the yeast two-hybrid system<br />

GFP and Luciferase can also be used to measure protein-protein interaction, by making<br />

use of a phenomenon called FRET or BRET (fluorescence or bioluminescence resonance energy transfer) [27, 28]. This makes it possible to<br />

visualize protein-protein interaction [29] by the use of fusion proteins.<br />

Copies of genes are inserted into the organism of interest. With FRET two GFPs, and<br />

with BRET a Luciferase and a GFP, are fused to gene X and gene Y by placing them<br />

downstream of a promoter. When the products of gene X and Y bind, the two fused reporters come into close proximity<br />

of each other, such that resonance energy transfer is possible, as can be seen in<br />

Fig. 2.11. Not only protein-protein interaction can be visualized, but also, for instance,<br />

protease activity, where the protease acts on a cleavage site in the linker between two fused<br />

GFP proteins. Acquisition is possible with a CCCD camera. Another method of visualizing<br />

protein-protein interaction is the yeast two-hybrid system (Y2H). In [30], in a proof of<br />

concept, the interaction of MyoD and ID is visualized. Y2H is an indirect measuring<br />

technique. The interaction of the two proteins of interest induces the transcription of<br />

Luciferase, which in turn is translated and can be visualized with a Cooled CCD camera.<br />

The reporter gene to be used can be chosen freely. For the mechanism, see Fig. 2.12.<br />

Fig. 2.11: Principles of FRET. (a, b) If proteins are in close proximity (less than 60 Å), the emission<br />

of the acceptor GFP is measured. Otherwise, only the emission of the donor GFP,<br />

with a different wavelength, is measured. (c) shows some techniques involving FRET<br />

[29].<br />
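
The strong distance dependence of resonance energy transfer mentioned above can be made concrete<br />
with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The sketch below assumes a hypothetical<br />
Förster radius R0 of 50 Å, which is an illustrative value and not taken from the cited papers.<br />

```python
def fret_efficiency(r_angstrom, r0_angstrom=50.0):
    """Energy transfer efficiency for a donor-acceptor distance r.

    r0_angstrom is the Foerster radius, the distance at which half of the
    donor energy is transferred (assumed value, depends on the protein pair).
    """
    return 1.0 / (1.0 + (r_angstrom / r0_angstrom) ** 6)


close = fret_efficiency(30.0)    # interacting proteins, well within range
half = fret_efficiency(50.0)     # exactly at the Foerster radius
far = fret_efficiency(100.0)     # non-interacting proteins
```

Under this assumption the transfer is almost complete at 30 Å and nearly absent at 100 Å, which is why acceptor emission is only seen when the two fused proteins are in close proximity.<br />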


Fig. 2.12: The Yeast Two Hybrid system. Gene X and Y are fused to GAL4 and VP16, which together<br />

form an active transcription factor [31] for a luciferase gene; the luc gene is placed<br />

downstream of a GAL4 binding site [30].<br />

2.3.5 Ultrasound Imaging<br />

Ultrasound imaging is based on echoes. To obtain an image with ultrasound, short, high-<br />

frequency sound pulses are generated. At each boundary where the tissue type<br />

changes, a portion of the signal is reflected and can be detected by a scanner. The time<br />

it takes for a signal to return to the source is proportional to the distance that the signal<br />

has travelled. Ultrasound contrast agents are used to enhance the signal. The most common<br />

agents are small air or gas bubbles, called micro-bubbles. Not only do they form a<br />

strongly reflective boundary (blood/gas), they also resonate, which makes them even more<br />

reflective [32]. Micro-bubbles are quantifiable. Although the resolution of traditional ultrasound<br />

is not very high, ultrasonic biomicroscopy can achieve resolutions of up to ∼40 µm,<br />

and scanning acoustic microscopy, which uses sound of an even higher<br />

frequency (200 MHz and higher), can achieve resolutions of 3 µm. It should be<br />

noted, though, that penetration depth decreases as frequency increases. New<br />

micro-bubble contrast agents can bind to specific surfaces and thereby enhance contrast.<br />

These micro-bubbles are encapsulated in a protein and fused to specific antibodies. This<br />

is used, for instance, to image inflammatory cells, and these specific contrast agents<br />

open the door for molecular imaging. Ultrasound is not used for imaging gene expression.<br />

This is mainly due to the lack of suitable gene reporters, but the trade-off between resolution<br />

and penetration depth also plays a role. This technique may provide useful information<br />

on concentration flows, as will be discussed briefly in Chapter 3.<br />
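
The relation between echo time and distance described above can be written as a one-line calculation.<br />
The sketch assumes a speed of sound of 1540 m/s, a value conventionally used for soft tissue that does<br />
not come from the text, and a hypothetical round-trip time.<br />

```python
SPEED_OF_SOUND_M_PER_S = 1540.0  # assumed average for soft tissue


def echo_depth_m(round_trip_time_s, speed=SPEED_OF_SOUND_M_PER_S):
    """Depth of a reflecting boundary.

    The pulse travels to the boundary and back, hence the division by two.
    """
    return speed * round_trip_time_s / 2.0


depth = echo_depth_m(65e-6)  # hypothetical echo received after 65 microseconds
```

With these assumed values the reflecting boundary lies at a depth of roughly 5 cm.<br />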

Table 2.1: Short list of specifications of different modalities. Source: Molecular Imaging in Living<br />

Subjects, Massoud<br />

2.4 Acquisition Challenges<br />

2.4.1 Quantification of BLT and FMT<br />

Forward and Inverse Problem<br />

In contrast to PET, BLT and FMT require a scattering and absorption model to<br />

be able to solve the inverse problem. Finding the right parameters is called the forward<br />

problem: given the source of emission, what must the parameters of the model<br />

be to generate the observed data? Once these parameters are estimated, one can try<br />

to solve the inverse problem: given a model with known parameters and given an<br />

observation, what is the shape, location and density of the emission source? For FMT<br />

it is possible to make an approximation of the forward model, because a known input<br />

light source is available, of which the output can be measured. From the attenuation<br />

model, obtained from the known laser light source, it is then possible to start solving<br />

the inverse problem for a fluorescent source. The forward problem cannot be solved<br />

with BLT as no known light source can be used for estimating the parameters of the<br />

model. A priori anatomical information therefore has to be incorporated [33]. To do<br />

that, a second modality, such as MRI or CT is needed to provide anatomical details<br />

about the model. A priori model information can also be obtained from mouse atlas<br />

databases, see Fig. 2.13 [34]. The problem with multi-modality, though, is that it is not<br />

straightforward to register these modalities to each other, and errors are introduced<br />

because of differences between the model and the atlas.<br />

When registration is complete and successful, different tissues in the model can be<br />

segmented, and with those segments the inverse problem can be solved. For the optical<br />

parameters, mean values from the literature can be used. To approximate the photon<br />

propagation, the following equation can be used [35]:<br />

$$\begin{cases} -\nabla \cdot \bigl( D(x) \nabla \Phi(x) \bigr) + \mu_a(x) \Phi(x) = S(x) \\ D(x) = \bigl( 3 ( \mu_a(x) + (1-g) \mu_s(x) ) \bigr)^{-1} \end{cases} \qquad (x \in \Omega) \tag{2.3}$$<br />

In this equation $S(x)$ is the unknown source density and $\Phi(x)$ is the photon density at<br />

location $x$. $\mu_a$, $\mu_s$ and $g$ are optical parameters: the absorption coefficient, the scattering coefficient and the anisotropy factor. In the paper of Cong [35] equation 2.3 is<br />

solved using a modified Newton method, but it is also possible to use a MAP approach<br />

[33]. It has been proved that this inverse problem has a unique solution [36], provided that the<br />

model is sufficiently well defined.<br />
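
To make equation 2.3 more tangible, the sketch below solves it in the forward direction (known source,<br />
unknown photon density) on a 1D grid with finite differences. This is a strong simplification and not<br />
the method of [35]: it assumes homogeneous tissue, one spatial dimension and zero photon density just<br />
outside both ends of the grid. The optical parameter values are hypothetical, in units of 1/mm.<br />

```python
import numpy as np


def solve_diffusion_1d(source, mu_a, mu_s, g, dx):
    """Solve -D * phi'' + mu_a * phi = S on a uniform 1D grid.

    Assumes constant optical parameters and phi = 0 just outside both ends.
    """
    d = 1.0 / (3.0 * (mu_a + (1.0 - g) * mu_s))  # diffusion coefficient, eq. 2.3
    n = len(source)
    main = np.full(n, 2.0 * d / dx**2 + mu_a)     # diagonal of the system matrix
    off = np.full(n - 1, -d / dx**2)              # coupling to neighbouring nodes
    system = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.solve(system, source)


# Hypothetical setup: 10 mm of tissue, point source in the middle.
source = np.zeros(101)
source[50] = 1.0
phi = solve_diffusion_1d(source, mu_a=0.01, mu_s=10.0, g=0.9, dx=0.1)
```

The resulting photon density peaks at the source and decays towards the boundaries. Solving the inverse problem means recovering `source` from measurements of `phi` at the surface only, which is what makes it much harder than this forward calculation.<br />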

Resolution Improvement<br />

A problem concerning the ill-posedness in BLT is that the optical parameters of the<br />

body tissue are temperature dependent [37]. This temperature dependency can be modeled,<br />

but only at the cost of an even more complex model and thus of extra<br />

computational power. Adding this temperature dependency will yield a higher resolution and a more accurate result.<br />

It should also be noted, though, that the temperature<br />

has to be measured for every tissue, which will likely introduce a new inverse problem<br />

for the infrared spectrum.<br />

Chaudhari et al. [38] propose to use spectral information for the reconstruction of a BLI<br />

source. Because of attenuation in the body tissues, there is a spectral shift in the signal.<br />

By capturing hyper-spectral (100 spectral bins) or multi-spectral (10 bins) images, these attenuation<br />

differences can be taken into account. This way, two sources that overlap in<br />

a 2D image, of which one is superficial and one is located deeper, can be distinguished.<br />

It should be noted that for each spectral band, an individual inverse problem has to be<br />

solved.<br />
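
Why spectral information helps to separate a superficial source from a deep one can be shown with a<br />
deliberately simplified model. The sketch is not the method of [38]: it assumes that each spectral<br />
band is attenuated exponentially with depth (a Beer-Lambert-type assumption that ignores scattering<br />
geometry), and that the emission fractions and attenuation coefficients of both bands are known. All<br />
numbers are hypothetical.<br />

```python
import math


def depth_from_two_bands(i1, i2, e1, e2, mu1, mu2):
    """Estimate source depth from intensities measured in two spectral bands.

    Model (assumed): i_k = S * e_k * exp(-mu_k * depth), where e_k is the
    fraction of light emitted in band k and mu_k its attenuation coefficient.
    """
    return math.log((i1 / e1) / (i2 / e2)) / (mu2 - mu1)


# Hypothetical bands: red light (band 1) is attenuated less than green (band 2).
e_red, e_green = 0.4, 0.6          # emission fractions
mu_red, mu_green = 0.2, 1.0        # attenuation per mm
true_depth, strength = 3.0, 1000.0
i_red = strength * e_red * math.exp(-mu_red * true_depth)
i_green = strength * e_green * math.exp(-mu_green * true_depth)
estimated = depth_from_two_bands(i_red, i_green, e_red, e_green, mu_red, mu_green)
```

The ratio of the two bands does not depend on the source strength, so a deep source reveals itself by a spectrum that is shifted towards the less attenuated band.<br />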

Backprojection<br />

It remains to be seen whether these complex optimization problems are useful. The<br />

optical properties of different tissues in the small animal models are unknown and simplified<br />

assumptions are used for the reconstruction of the BLT energy source [39]. The<br />

most important question for combining BLT (or Fluorescence Tomography for that<br />

matter) and the field of Systems Biology will be: how much resolution in space and<br />

time is needed, for cell-specific and process-dynamic behavior respectively, for a feasible<br />

application of molecular imaging to track gene expression in the organism? In the<br />

paper of Kok [39] a relatively straightforward algorithm is used for reconstruction of<br />

the bioluminescent source. Scattering is not taken into account and the tissue structure<br />

is assumed to be homogeneous, which is clearly not the case. Despite these simplifications<br />

a good estimation is achieved for source localization of superficial lesions.<br />

Combined with the fact that the authors only want to attract attention to a location in<br />

the accompanying CT (or another structural data-file), the algorithm can be seen as<br />

an efficient and simple reconstruction algorithm. The authors use a backprojection of<br />

eight planar images, each rotated a known number of degrees, onto a ‘3D’ structural<br />

data set. This method provides good resolution for superficial BLI sources, but has<br />

lower resolving power for deeper-lying tissues. It is shown in [40], though, that even<br />

at coarse-grained resolutions interesting new information can be obtained from gene<br />

expression data.<br />

2.4.2 Combining Information: Multi-modality fusion<br />

Because different modalities contain different information it is useful to combine this<br />

information. CT for example is sensitive to elements with a high atomic number, for<br />

example calcium which is found in bones and calcification. Heavy atoms such as iodine<br />

can be injected in the blood stream as contrast agents making veins and blood-rich<br />

organs detectable. MRI on the other hand is very powerful for visualizing different soft<br />

tissues. When these two modalities are correctly combined, they support each other<br />

and fill in tissue differences that the other modality is not able to detect.<br />

Bioluminescence and fluorescence planar images by themselves do not give much detail<br />

on the location of gene expression. This is due to diffusion and scattering inside the<br />

body, before photons reach the surface of the body (e.g. the skin of the mouse) from<br />

which the picture is taken. As a result, only a rough indication (in terms of millimeters)<br />

of the location can be given based on the set of 2D images. A huge advantage of BLI<br />

and FMI, though, is that they are much more sensitive to abnormalities than the existing<br />

medical imaging modalities. It is therefore possible to detect diseases well before<br />

morphological changes are observable. If a detection is made with BLI or FMI, other<br />

modalities can be used to study morphological changes in detail at the specific sites of<br />

interest [39].<br />

Fig. 2.13: Mouse atlas with a surface rendering of the skeleton and different organs [34].<br />

How to align different modalities? The position of the mouse model during the acquisition<br />

of different modalities most likely differs. If the two modalities are combined, a<br />

reconstruction of the source will be possible. For the combination of multiple modalities,<br />

though, alignment by image registration is needed. This 3D alignment is not a<br />

straightforward procedure [16]. If all modalities can be aligned to a standard atlas,<br />

they can be fused through that atlas. In the paper of Baiker [41] the skeleton<br />

is registered automatically by an optimization that minimizes the differences between<br />

a mouse skeleton atlas and a skeleton generated from a CT scan. By extending this<br />

work, it is also possible to register some marks on the mouse skin and, combined with<br />

the skeleton information, interpolate where the organs of the mouse are located. It is<br />

also possible to generate a 3D image from planar images by using structured light. By<br />

combining those models, it should be possible to estimate where different tissues in<br />

the model are located.<br />
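
The simplest building block of such a registration is a rigid alignment of corresponding landmarks.<br />
The sketch below uses the Kabsch algorithm (a least-squares fit via singular value decomposition); it is<br />
not the articulated skeleton registration of [41]. The landmark coordinates, rotation and shift are<br />
hypothetical, and the correspondence between atlas and scan landmarks is assumed to be known.<br />

```python
import numpy as np


def rigid_register(moving, fixed):
    """Least-squares rotation and translation mapping moving onto fixed.

    moving, fixed: (N, 3) arrays of corresponding landmarks.
    Returns (rot, shift) such that fixed ~= moving @ rot.T + shift.
    """
    mc, fc = moving.mean(axis=0), fixed.mean(axis=0)
    h = (moving - mc).T @ (fixed - fc)            # 3 x 3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, fc - rot @ mc


# Hypothetical landmarks: the "scan" is the atlas rotated by 30 degrees and shifted.
rng = np.random.default_rng(0)
atlas_landmarks = rng.random((6, 3))
angle = np.deg2rad(30.0)
true_rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle), np.cos(angle), 0.0],
                     [0.0, 0.0, 1.0]])
true_shift = np.array([5.0, -2.0, 1.0])
scan_landmarks = atlas_landmarks @ true_rot.T + true_shift

rot, shift = rigid_register(atlas_landmarks, scan_landmarks)
aligned = atlas_landmarks @ rot.T + shift
```

A rigid fit like this has only six degrees of freedom; soft tissue deformation requires non-rigid models on top of it.<br />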

It is important to note that a mapping to an atlas is needed for both qualitative and<br />

quantitative gene expression measurements [42]. To be able to tell in which organ gene<br />

expression occurs, for instance, one first has to know where the organs are located in the<br />

3D space of an organism. A whole range of mouse atlas databases is currently<br />

available [34]. Only a few of them also contain spatiotemporal gene expression data (the Mouse<br />

Atlas Project developed at the University of Edinburgh and DigiMouse), to which new<br />

measurements can be correlated [43, 42, 34].<br />


2.4.3 Combining Information: Follow Up Registration<br />

Although in vivo imaging allows for continuous measurements in time without moving<br />

the animal, most if not all diseases that are studied have a progression in terms of<br />

weeks rather than in terms of hours. It is therefore infeasible to continuously maintain<br />

the studied animal at the exact same position and it is thus necessary to be able to<br />

register images of the same animal in individual experiments.<br />

For follow-up registration, the same atlas approach can be used as for multi modality<br />

fusion. Once it is possible to register the modality on an atlas, it is a small step to<br />

register a ‘time series’ of this same modality to this atlas.<br />

To overcome or prevent some of the registration problems, it is also possible to combine<br />

multiple modalities during the acquisition [38]. This way, it is ensured that both<br />

modalities are in exactly the same location in x,y,z space. Prita Ray et al. [20]<br />

are doing much work on multi-modal capturing, by constructing multi-modal reporter<br />

genes. In this way FMT, BLT and PET can be acquired with the use of one and the<br />

same reporter gene construct. A combined micro PET-CT scanner is also used to<br />

obtain high-resolution anatomical images and gene expression data [44].<br />

In the ideal case, the lab assistant should not need to worry about how to position the<br />

animal for measurements, but positioning the animal in the same way each experiment<br />

makes the registration a lot easier. An effective way to fix the organism in a spatial<br />

context is the use of animal holders. By positioning animals in the same way each<br />

time an acquisition is done, the registration problem becomes easier to solve because the<br />

number of degrees of freedom is reduced.<br />

2.4.4 Current Limitations in Molecular Imaging<br />

To obtain useful gene expression data with molecular imaging, multiple measurements<br />

have to be made and results have to be combined in one data set. These measurements<br />

contain some noise, which introduces inaccuracies, but registration steps will also introduce<br />

new inaccuracies that further decrease the resolution of measurements that can<br />

be achieved. Different kinds of noise are discussed below.<br />

General Noise<br />

Every modality suffers from its own noise problems. The basic problem with noise is<br />

that it can overlap with the signal, especially when the signal-to-noise ratio (SNR) is not<br />

high enough. To overcome some of these SNR problems, amplification<br />

of the reporter contrast agents can be used, but if a quantification of gene expression<br />

levels is necessary, it must be known how much amplification is used.<br />

Attenuation<br />

Solving the inverse problem is a difficult task. Using the anatomical information<br />

from an atlas introduces an error due to the difference between the organism under<br />

study and the reference organism. The optical parameters of the body tissue are temperature<br />

dependent [37]. This temperature dependency can be modeled, but this comes<br />

at the cost of an even more complex model and thus at the cost of extra computational<br />

power. Moreover, the temperature in an organism is not homogeneous but varies in<br />

space and over time. This will likely affect reconstruction accuracy.<br />

Multi-modality and Follow-up registration<br />

A problem with BLI and FLI is that they are based on 2D images that only provide pictures<br />

of the surface. It is possible to register CT data to a 3D mouse atlas, and it is also<br />

possible to register 2D BLI data to 3D CT data [39]. Both registration steps introduce<br />

errors. Moreover, it is relatively easy to model rigid conformational changes,<br />

but more difficult to model soft tissue deformations. If BLI sources are located in<br />

soft tissues, the reconstruction of the source therefore becomes less accurate. In the<br />

ideal case, small animal models are used to be able to mimic diseases in humans, but if<br />

sufficiently high resolutions cannot be obtained with small animal models, one can turn<br />

to smaller, simpler and transparent organisms, in which the light sources<br />

can be seen directly, so that reconstruction of the light source, if needed at all,<br />

becomes straightforward.<br />



CHAPTER 3<br />

Molecular Imaging as extra data source for model<br />

generation<br />

With the ability to visualize gene expression, the question arises of what can be done<br />

with the acquired data. To answer this question we take a look at the field of bioinformatics,<br />

where gene expression data is already analysed.<br />

One reason to strive for an understanding of the underlying cellular processes in an<br />

organism is to be able to predict its behavior and to change or correct its behavior if<br />

needed. To do this, it is not always necessary to understand the full functioning of the<br />

system.<br />

There are two approaches for gaining insight in cellular processes. Firstly, by doing<br />

experiments at a low level and secondly by simulating (high level) processes to mimic<br />

observed data. With large complex biological networks possibly only the latter approach<br />

is feasible for obtaining a ‘full’ understanding [45, 40].<br />

In an attempt to relate the field of molecular imaging to the field of bioinformatics,<br />

some examples from bioinformatics are studied and related to MI in this Chapter.<br />

Firstly, some studies will be highlighted in which spatiotemporal data is acquired using<br />

high-throughput techniques; secondly, some findings on mathematical models for network<br />

inference will be presented; thirdly, a short concept will be given on how to translate<br />

these mathematical models from quantitative to qualitative models, because data<br />

quality is not always good enough for quantitative model construction. Finally, a concept<br />

of statistical model inference will be given, based on time series microarray<br />

experiments.<br />

Some findings will then be discussed and questions will be posed in the discussion<br />

section.<br />


3.1 Acquisition of Spatiotemporal Gene Expression Data<br />

In a spatiotemporal gene expression study on Drosophila melanogaster, Seroude et al.<br />

obtained a set of genes whose expression changes with age [46]. For the<br />

measurements, extraction and cryosectioning were used for the temporal and spatial expression<br />

profiles respectively. Genes were visualized using the Flytrap system and staining of<br />

β-galactosidase. This way, a 3D+t gene expression profile was obtained. It should be<br />

noted that this experiment was not an in vivo measurement, but the possibility of using Flytrap<br />

to express GFP [47] could open the door for non-invasive molecular imaging. In situ<br />

images of Drosophila melanogaster can be clustered by using pattern recognition<br />

techniques. In [3] embryo images were studied using a Gaussian Mixture Model,<br />

an eigenvector basis and a discrete Haar wavelet as feature space. All pictures were<br />

aligned by making sure that the dorsal side of the embryos was on top and the anterior<br />

on the left. Similar spatial gene expression patterns were clustered using graph partitioning.<br />

This way the authors were able to cluster the embryos into different developmental<br />

stages (temporal) and co-regulated spatial expression profiles in those stages (spatial<br />

correlation). Genes with similar expression profiles are thought to be involved in the<br />

same pathway. With this procedure they obtained a 99.55% staging overlap,<br />

that is, agreement between the developmental stage annotated<br />

by the algorithm and the expert annotation. This overlap suggests that automated<br />

gene expression measurements are feasible. Indeed, in [48] it is stated that automatic<br />

high-throughput ISH measurements are feasible, and the authors created a mouse atlas<br />

containing spatial gene expression data. Clustering was also performed on their gene expression<br />

profiles.<br />

Besides providing spatial information, spatiotemporal expression measurements are<br />
sensitive to gene expression in small clusters of cells. In microarray data these<br />
expression profiles would be averaged out by larger cell clusters with different<br />
expression levels [48]. As a purely hypothetical example: if a developing embryo shows<br />
upregulation in the anterior and downregulation in the posterior, a microarray<br />
experiment would detect no regulation, whereas a spatial measurement would be able to<br />
show this ‘expression gradient’.<br />
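The averaging effect in this hypothetical example can be illustrated with a few lines of Python.<br />
This is only a sketch: the fold-change values and the number of positions are invented for<br />
illustration and do not come from any of the cited studies.<br />

```python
import numpy as np

# Hypothetical log2 fold changes along the anterior-posterior axis:
# upregulated in the anterior half, downregulated in the posterior half.
positions = np.linspace(0.0, 1.0, 10)            # 0 = anterior, 1 = posterior
log_fold_change = np.where(positions < 0.5, 1.0, -1.0)

# A whole-embryo microarray sample effectively averages over all cells.
bulk_signal = log_fold_change.mean()

# A spatial measurement keeps the per-position values, so the gradient survives.
spatial_range = log_fold_change.max() - log_fold_change.min()

print("bulk signal:", bulk_signal, "spatial range:", spatial_range)
```

The bulk value cancels out, while the spatially resolved profile still shows the full difference<br />
between anterior and posterior.<br />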

Dupuy et al. acquired spatiotemporal gene expression profiles using in vivo imaging<br />
[49]. Because the spatiotemporal in vivo imaging techniques used in their paper<br />
may be extendable to whole body molecular imaging, their publication<br />
is covered in extra detail here.<br />

In their paper Dupuy et al. performed a high-throughput analysis of about 900 gene promoters,<br />
using the technique visualized in Fig. 2.1. Each of those 900 promoters<br />
drove the expression of GFP, and together these promoters covered about 5% of the<br />
protein-coding genes in C. elegans. Because they wanted to measure gene expression<br />
in a developmental study, the authors needed some way to incorporate a temporal component<br />
in their spatial gene expression profile measurements.<br />

Temporal arrangement using COPAS<br />

The authors measured gene expression using GFP as a reporter gene and measured<br />
expression profiles along the longitudinal axis of the organism Caenorhabditis elegans.<br />
Instead of measuring expression profiles directly over time, the authors used the body<br />
length of the organism as an indication of age. The animals could be sorted automatically<br />
by length with a device called COPAS (‘complex object parametric analysis and sorter’,<br />
produced by a company called Union Biometrica). The device is based<br />
on flow cytometry, which separates particles by size: larger or heavier<br />
particles have a longer time of flight than smaller ones. Images<br />
were acquired with a CCD camera and a confocal microscope. The COPAS system<br />
is able to generate fluorescence emission profiles along the anterior-posterior axis of C.<br />
elegans automatically.<br />

Fig. 3.1: a Images as captured and converted into a one dimensional GFP intensity bar. b They<br />
are aligned with respect to orientation and length, to get a chronogram c. Then the<br />
chronograms are normalized in time d so that correlation can be calculated [49].<br />

Chronograms<br />

From the large number of gene expression profiles measured this way, the<br />
authors created a set of what they call chronograms. A chronogram is a two-dimensional<br />
expression profile, containing a spatial component and a temporal component.<br />
As can be seen in Fig. 3.1, the expression data were converted into intensity bars, based<br />
on the intensity measurements of COPAS. These intensity bars were then aligned and<br />
stacked on top of each other by size, as can be seen in Fig. 3.1 c. To be able to<br />
compare chronograms between genes, they were normalized to a<br />
standard chronogram size, which contains one line for each size. If no measurements<br />
are available for a certain size, an empty line appears in the normalized chronogram.<br />
When multiple measurements are available for a certain size, these measurements are<br />
averaged into one line in the normalized chronogram (Fig. 3.1 d).<br />
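The normalization step can be sketched as follows. This is a minimal illustration and not the<br />
implementation of Dupuy et al.: the function name, the binning of body length via `size_bins`<br />
and the use of NaN for empty lines are assumptions made for this sketch.<br />

```python
import numpy as np

def normalized_chronogram(lengths, profiles, size_bins):
    """Average intensity profiles per body-size bin.

    lengths:   (n_worms,) body length of each measured animal
    profiles:  (n_worms, n_positions) GFP intensity along the body axis,
               already aligned and resampled to a common number of positions
    size_bins: (n_bins + 1,) increasing bin edges for body length
    Returns an (n_bins, n_positions) array; sizes without data stay NaN.
    """
    lengths = np.asarray(lengths, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    n_bins = len(size_bins) - 1
    chrono = np.full((n_bins, profiles.shape[1]), np.nan)
    # np.digitize returns 1..n_bins for lengths that fall inside the edges.
    row = np.digitize(lengths, size_bins) - 1
    for b in range(n_bins):
        members = profiles[row == b]
        if len(members) > 0:
            chrono[b] = members.mean(axis=0)   # several measurements -> one line
    return chrono

# Hypothetical data: two animals of similar size, none of medium size, one large.
example = normalized_chronogram(
    lengths=[0.25, 0.30, 0.90],
    profiles=[[1.0, 3.0], [3.0, 5.0], [2.0, 2.0]],
    size_bins=[0.0, 0.4, 0.8, 1.2],
)
```

In this toy input the first line is the average of two animals, the second line is empty and the<br />
third line holds a single measurement.<br />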

The acquired chronograms report the activity of the proximal promoters of 1,610<br />
unique predicted loci, i.e. promoter activity was measured for each of them and<br />
each of these 1,610 chronograms corresponds to a promoter region that occurs at only one<br />
locus on the chromosome. Roughly 900 measurements contained an average signal that<br />
was above background noise. Most of the other 700 chronograms had too low an intensity,<br />
probably due to an extrachromosomal promoter::GFP construct, a result of the limitations<br />
in gene transfer discussed earlier in this study.<br />


Spatial prior knowledge<br />

The chronograms can be related to tissue-specific expression profiles. A gene that is,<br />
for example, only expressed in the pharynx has a different ‘fingerprint’ than a gene<br />
that is only expressed in the gonad sheath. To generate these fingerprints, qualitative<br />
tags indicating locations of gene expression, obtained from microscopy and microarray<br />
experiments, were clustered, and the chronograms of all genes known to be<br />
expressed in the same (qualitative) regions were averaged into one chronogram. The<br />
authors warn that this procedure only gives robust fingerprints when a large number of<br />
measurements carry the same tag: many genes are expressed in multiple<br />
regions, and with few chronograms to average over, these extra locations may show up<br />
as a signal in fingerprints where they do not belong. The fingerprint chronograms<br />
allow for qualitative location statements on newly obtained chronograms.<br />

Temporal prior knowledge<br />

The same approach was used for expression profiles with known high correlations obtained<br />
from microarray data. Most of the time these microarray-derived expression clusters<br />
did not give clear patterns in the averaged chronograms, indicating<br />
that co-expression in time, as measured in microarray data, does not necessarily mean<br />
co-expression in space. Some clusters, such as the ‘neurons’, ‘germ line’ and ‘intestine’<br />
clusters, did correspond with the associated high correlation in microarray data<br />
(i.e. a clear expression pattern was seen).<br />

The chronogram promoter activity measurements can be correlated with each other. Chronograms<br />
with high correlation can be clustered and will most likely be functionally related.<br />
To get an even better spatial localization, the authors predict that in the near<br />
future COPAS will be able to generate 3D aligned expression profiles. This, they expect,<br />
will give more accurate four-dimensional chronograms, in which overlapping organs<br />
no longer cause inaccuracies.<br />
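Correlating two normalized chronograms can be sketched as below. The choice of the Pearson<br />
correlation over all pixels measured in both chronograms is an assumption of this sketch; the<br />
exact similarity measure of Dupuy et al. is not reproduced here, and the example values are invented.<br />

```python
import numpy as np

def chronogram_correlation(a, b):
    """Pearson correlation over pixels that are measured in both chronograms."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    valid = ~np.isnan(a) & ~np.isnan(b)      # skip empty (NaN) lines
    if valid.sum() < 2:
        return float("nan")
    return float(np.corrcoef(a[valid], b[valid])[0, 1])

nan = float("nan")
gene_x = [[1.0, 2.0], [nan, nan], [3.0, 5.0]]
gene_y = [[2.0, 4.0], [1.0, 1.0], [6.0, 10.0]]   # proportional to gene_x where both exist
gene_z = [[5.0, 3.0], [nan, nan], [2.0, 1.0]]    # roughly the opposite pattern

r_xy = chronogram_correlation(gene_x, gene_y)
r_xz = chronogram_correlation(gene_x, gene_z)
```

Pairs with a correlation close to 1, such as `gene_x` and `gene_y` here, would be candidates for<br />
the same cluster.<br />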

To summarize the paper of Dupuy et al. briefly: age (developmental stage) is used as<br />
the temporal element in the measurements. This makes high-throughput measurements<br />
feasible, with the alignment of the measurements done automatically. When temporal<br />
and spatial expression are combined, a so-called chronogram is obtained; see Fig. 3.1.<br />
After normalization these chronograms can be correlated, and when a high correlation<br />
is seen, the measured proteins are likely to be involved in the<br />
same cellular process.<br />

Because Caenorhabditis elegans is a transparent organism, measurements are direct<br />
and precise. In whole body imaging of mice this is a potential problem,<br />
because there the location of expression has to be estimated for each gene.<br />

3.2 Inferring a Quantitative Model using Spatiotemporal<br />

Protein Expression<br />

Reinitz et al. state that high detail is not needed to model processes: the detail of the<br />
model will simply be lower if less detailed, lower-resolution data are available [40]. In<br />
their work they use low-resolution spatial gene expression profiles to study regulatory<br />
effects on eve stripe formation. With a few simplifications, necessary because<br />
of a lack of detailed data, they were still able to construct a model capable<br />
of simulating eve stripe formation. Where Reinitz et al. used only the longitudinal<br />
protein gradients for their model, Krul et al. take the geometrical complexity of<br />
reality into account [50]. They do this by defining cells as point-shaped objects<br />
and the extracellular space as the space around them, with this space having the shape of the<br />
organism, Drosophila. Krul et al. also simplified the model by looking only at a small<br />
selection of known regulating proteins. With this simplification they were still able to<br />
mimic the system's behavior, but the simplifications caused deviations, which<br />
became larger when the processes were studied in a two-dimensional space. The<br />
model they used consists of the following functions, in which the differences between<br />
intracellular and extracellular concentrations and between diffusing and non-diffusing<br />
components are taken into account.<br />

The change over time is described by:<br />

\[
\frac{\partial g_{ij}(t)}{\partial t} = \frac{\phi(h_{ij})}{k_j + \phi(h_{ij})} - \lambda_j\, g_{ij}(t) \tag{3.1}
\]

where<br />

\[
h_{ij} = \sum_{k=1}^{N_g} W_{jk}\, g_{ik} + h_j \quad \text{and} \quad i = 1,\dots,N_c, \quad j = 1,\dots,N_g .
\]

The extracellular protein concentrations are modeled by:<br />

\[
\frac{\partial c_j(x,t)}{\partial t} = D_j \nabla^2 c_j(x,t) - \lambda_j\, c_j(x,t) \tag{3.2}
\]

Equations 3.1 and 3.2 are constrained by:<br />

\[
g_{ij}(t) = c_j(x_i,t) \tag{3.3}
\]

The symbols in these equations represent: $g_{ij}$: concentration in cell i for gene j, $c_j$:<br />
extracellular concentration of gene j, $\lambda_j$: degradation rate of gene j, $k_j$: formation rate<br />
of gene j, $h_j$: activation threshold for gene j and $D_j$: diffusion coefficient of gene j.<br />
$W_{jk}$ contains the regulatory effect of gene k on gene j, as follows from the sum over k in $h_{ij}$.<br />
Its entries are real numbers that are positive, negative or zero, for upregulation,<br />
downregulation and no regulation respectively. $N_c$ is the number of cells present in the<br />
model and $N_g$ is the number of genes incorporated in the model.<br />

Clearly W is the matrix with the parameters that we want to estimate, because with these<br />
regulation parameters a gene regulation network can be constructed. Positive or negative<br />
feedback loops for each gene relation are modeled. λ, k, h and D are also parameters<br />
that need to be set.<br />
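To give an impression of how the diffusion and degradation terms of equation 3.2 behave, the<br />
following sketch integrates the equation numerically for a single gene. It is not the<br />
implementation of Krul et al.: the one-dimensional domain, the zero-flux boundaries, the explicit<br />
Euler scheme and all parameter values are assumptions chosen for illustration.<br />

```python
import numpy as np

def simulate_extracellular(c0, D, lam, dx, dt, n_steps):
    """Explicit Euler integration of dc/dt = D * d2c/dx2 - lam * c in 1D.

    Zero-flux (reflecting) boundaries are an assumption of this sketch.
    """
    # The explicit scheme is only stable for D*dt/dx**2 <= 0.5.
    if D * dt / dx**2 > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    c = np.asarray(c0, dtype=float).copy()
    for _ in range(n_steps):
        padded = np.pad(c, 1, mode="edge")                 # reflecting boundaries
        laplacian = (padded[:-2] - 2.0 * c + padded[2:]) / dx**2
        c = c + dt * (D * laplacian - lam * c)             # diffusion minus degradation
    return c

c0 = np.zeros(51)
c0[25] = 1.0          # hypothetical point source in the middle (arbitrary units)
c_end = simulate_extracellular(c0, D=1.0, lam=0.1, dx=1.0, dt=0.2, n_steps=50)
```

Diffusion only redistributes the protein, so in this sketch the total amount decreases purely<br />
through the degradation term, by a factor (1 - dt·λ) per time step.<br />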

Krul et al. tuned the parameters by hand until the model mimicked the observed system. Reinitz et al.<br />
used an optimization algorithm called simulated annealing, but other optimization algorithms,<br />
such as a genetic algorithm, can be used as well. The cost function they used (equation<br />
3.4) is the difference between the model and the measurements.<br />

\[
E = \sum_{\substack{\text{all } a,\, i,\, t \text{ and genotypes} \\ \text{for which data exists}}} \left( g_i^a(t)_{\text{model}} - g_i^a(t)_{\text{data}} \right)^2 + (\text{penalty terms}) \tag{3.4}
\]


These penalty terms can take many forms; their purpose is to direct the<br />
search faster or more accurately to the optimal solution. They can even be used to avoid<br />
local suboptima. An example of the latter is the so-called niche penalty, used<br />
in genetic algorithms to prevent a local suboptimum from becoming dominant over other<br />
populations in the optimization field that score less well [51]. Other terms that<br />
can be used are functions that penalize infeasible solutions. For example, a<br />
protein concentration may not exceed its solubility limit. Penalty terms that<br />
reduce the complexity of the model, e.g. the number of regulatory connections, can also be<br />
included [52]. Reinitz et al. used a reduction of the search space as penalty term, and they<br />
also incorporated a term Λ which, with a given penalty function, makes sure that the<br />
maximum saturation of u is limited to (1 − Λ). In the paper of Reinitz, $u^a$ denotes the<br />
total regulatory effect on the promoter of gene a. The regulatory effects cannot be<br />
too large, so this is also a reduction in the search space of the optimization algorithm.<br />
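A cost function in the spirit of equation 3.4 can be sketched as below. The two penalty terms<br />
(a bound on the concentration and a count of regulatory connections), their weights and the<br />
default `c_max` are assumptions made for illustration; they are not the penalty terms of Reinitz et al.<br />

```python
import numpy as np

def cost(model, data, W, c_max=1.0, sparsity_weight=0.01, bound_weight=10.0):
    """Squared error over available data plus two illustrative penalty terms."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    available = ~np.isnan(data)                  # sum only where data exists
    squared_error = np.sum((model[available] - data[available]) ** 2)
    # Penalty 1: concentrations above a maximum (e.g. solubility) are infeasible.
    bound_penalty = bound_weight * np.sum(np.clip(model - c_max, 0.0, None) ** 2)
    # Penalty 2: discourage many regulatory connections (model complexity).
    sparsity_penalty = sparsity_weight * np.count_nonzero(W)
    return float(squared_error + bound_penalty + sparsity_penalty)

nan = float("nan")
data = [[0.2, nan], [0.5, 0.9]]                  # one measurement is missing
W = [[0.0, 1.0], [-1.0, 0.0]]                    # two regulatory connections
cost_feasible = cost([[0.2, 0.7], [0.5, 0.9]], data, W)
cost_infeasible = cost([[0.2, 1.2], [0.5, 0.9]], data, W)
```

Both model outputs fit the available data equally well; the second one is only more expensive<br />
because one simulated concentration exceeds the assumed maximum.<br />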

It should be noted that equations 2.1, 3.1, 3.2 and 3.3 are based on the conservation law,<br />
which can be written as [53]:<br />

\[
\frac{d}{dt} \int_{x_a}^{x_b} c(x,t)\,dx = \int_{x_a}^{x_b} \frac{\partial}{\partial x} J(x,t)\,dx + \int_{x_a}^{x_b} f(x,t,c(x,t))\,dx \tag{3.5}
\]

J is the flux (or transport rate) of the component and f is the production rate.<br />

In more recent work eve stripe formation could be predicted correctly by a more<br />
advanced model. Based on cis-regulatory mechanisms, also known as enhancers, the<br />
activation of expression could be predicted correctly, including the effect of mutations<br />
in the regulatory DNA [54].<br />

A more recent paper by Fomekong-Nanfack et al. also addresses parameter<br />
estimation [55]. It investigates how to optimize the parameters of the<br />
eve stripe formation model to fit the observed data. The paper states that<br />
brute-force global optimization is still the most used method for parameter<br />
estimation problems. This is because the parameter fitness landscape is<br />
unknown in most cases, so that the parameter search space is assumed to<br />
be unrestricted. An effective optimization algorithm needs to be found and applied for<br />
each optimization problem. The authors chose an evolution strategy (ES) and studied its<br />
performance. An island-Evolutionary Algorithm was used and good results were achieved<br />
with this method: 62% of the solutions found were considered to be ‘good’ solutions.<br />
It is further stressed that a good search algorithm is mandatory for a three-dimensional<br />
reaction-diffusion model, because a one-dimensional model is already difficult (time<br />
consuming) to solve. The authors conclude that an ES algorithm is very effective<br />
for obtaining an initial guess for local search algorithms, after which these local<br />
search algorithms should be used for fine-tuning the parameter estimation.<br />
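The two-stage idea (a global evolutionary search that provides the starting point for a local<br />
search) can be sketched as follows. This is not the island-based algorithm of [55]: a plain<br />
elitist (μ+λ) evolution strategy with a fixed mutation step, a simple coordinate-wise local<br />
search and the Rastrigin test function are all substitutes chosen here for illustration.<br />

```python
import numpy as np

def objective(p):
    """Toy multimodal landscape (Rastrigin function), minimum 0 at p = 0."""
    p = np.asarray(p, dtype=float)
    return float(10.0 * p.size + np.sum(p**2 - 10.0 * np.cos(2.0 * np.pi * p)))

def evolution_strategy(f, dim, rng, mu=10, lam=40, sigma=0.5, generations=60):
    """Elitist (mu + lambda) evolution strategy with a fixed mutation step."""
    parents = rng.uniform(-5.0, 5.0, size=(mu, dim))
    initial_best = min(f(p) for p in parents)
    for _ in range(generations):
        picks = rng.integers(0, mu, size=lam)
        offspring = parents[picks] + sigma * rng.normal(size=(lam, dim))
        pool = np.vstack([parents, offspring])
        order = np.argsort([f(p) for p in pool])
        parents = pool[order[:mu]]        # keep the mu best of parents + offspring
    return parents[0], initial_best

def local_search(f, start, step=0.1, shrink=0.5, min_step=1e-6):
    """Coordinate-wise search that only accepts improvements."""
    best = np.array(start, dtype=float)
    best_cost = f(best)
    while step > min_step:
        improved = False
        for i in range(best.size):
            for direction in (1.0, -1.0):
                trial = best.copy()
                trial[i] += direction * step
                trial_cost = f(trial)
                if trial_cost < best_cost:
                    best, best_cost, improved = trial, trial_cost, True
        if not improved:
            step *= shrink                # refine once no neighbor is better
    return best, best_cost

rng = np.random.default_rng(0)
es_solution, initial_best = evolution_strategy(objective, dim=3, rng=rng)
es_cost = objective(es_solution)
refined, refined_cost = local_search(objective, es_solution)
```

Because both stages only keep improvements, the cost can never increase from the initial<br />
population to the ES result and from the ES result to the refined solution.<br />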

3.3 Quantitative vs. Qualitative Network Models<br />

Although in theory it could be possible to generate a quantitative network model of<br />
spatiotemporal gene expression, current measurements of gene expression are not precise<br />
enough. Moreover, quantitative data on kinetics and molecular concentrations<br />
are largely unavailable [56]. This is the case for microarray data, and the information missing<br />
there will not be available for whole-body optical imaging either. For large<br />
networks it is therefore necessary to infer a qualitative model instead of a quantitative one.<br />


De Jong et al. [57] describe a method to qualitatively describe a gene regulatory network.<br />
Each protein concentration change can be modeled by an equation with generic<br />
form:<br />

\[
\dot{x}_i = f_i(x) - g_i(x)\,x_i \quad \text{and} \quad x_i \ge 0, \; 1 \le i \le n \tag{3.6}
\]

This equation can be written in vector notation and becomes<br />

\[
\dot{x} = f(x) - g(x)\,x \quad \text{with} \quad f = (f_1,\dots,f_n)' \text{ and } g = \mathrm{diag}(g_1,\dots,g_n) \tag{3.7}
\]

$f_i$ defines how the rate of synthesis of protein i is influenced by the concentrations of<br />
all genes x.<br />

\[
f_i(x) = \sum_{l \in L} \kappa_{il}\, b_{il}(x) \tag{3.8}
\]

$\kappa_{il}$ is here the reaction rate parameter and $b_{il} : \mathbb{R}^n_{\ge 0} \to \{0,1\}$ is a regulation function.<br />

And L is a set of regulation function indices. If no regulators exist for some protein,<br />

then L is an empty set. The regulation function g(x) works at a similar level, with<br />

the exception that its outcome must be strictly positive. (You cannot have negative<br />

degradation, but you can have negative feedback regulation.) In following equations,<br />

there will be a naming convention used, where γ stands for degradation rates and κ<br />

stands for synthesis rates.<br />

$b_{il}$ describes the underlying logic of the gene regulation. An example of such a<br />
function is $b_{il}(x) = s^+(x_j,\theta_j)$, which means that $b_{il}$ equals 1 if $x_j$ is above threshold<br />
$\theta_j$ and 0 otherwise. The complementary function $s^-(x_j,\theta_j) = 1 - s^+(x_j,\theta_j)$ equals 1<br />
if $x_j$ is below the threshold.<br />

These binary conditions are based on the observation that gene expression levels<br />
normally respond like steep, switch-like sigmoid functions, which means that a gene is<br />
either regulated or not regulated by a certain other gene (still scaled, of course, by some<br />
rate κ).<br />
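The step functions and their relation to a steep sigmoid can be written down in a few lines.<br />
The sketch assumes the convention in which $s^+$ switches on above the threshold and $s^-$ below it<br />
(the convention used for $s^-$ in the autoregulation example of this section); the Hill-type sigmoid and<br />
its steepness value are assumptions added for comparison.<br />

```python
import numpy as np

def s_plus(x, theta):
    """Step function: 1 when concentration x is above threshold theta, else 0."""
    return (np.asarray(x, dtype=float) > theta).astype(float)

def s_minus(x, theta):
    """Complementary step function: 1 when x is not above the threshold."""
    return 1.0 - s_plus(x, theta)

def steep_sigmoid(x, theta, steepness=50.0):
    """Hill-type switch that approaches s_plus for large steepness."""
    x = np.asarray(x, dtype=float)
    return x**steepness / (x**steepness + theta**steepness)

x = np.array([0.2, 0.8])      # hypothetical concentrations
theta = 0.5                   # hypothetical threshold

# Synthesis rate f(x) = kappa * b(x) for an activator and for a repressor.
kappa = 2.0
rate_if_activated = kappa * s_plus(x, theta)
rate_if_repressed = kappa * s_minus(x, theta)
```

For concentrations that are not very close to the threshold, the steep sigmoid and the step<br />
function give practically the same value, which is the motivation for the binary approximation.<br />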

What follows is a simple example, taken from the paper of de Jong, of two genes that<br />
autoregulate and regulate each other. Fig. 3.2 shows a scheme of the regulation,<br />
then how this translates into a quantitative model, and then how the same model translates<br />
into a qualitative model. The difference is that in a quantitative model each parameter<br />
is given a hard, observed value, whereas in a qualitative model these values are<br />
constrained by inequalities.<br />

There are threshold inequalities, which basically say that $\theta_1,\dots,\theta_n$ must lie between 0<br />
and the maximum possible concentration of protein a ($max_a$), and equilibrium inequalities,<br />
which indicate that some threshold must be below some equilibrium. In the example<br />
of Fig. 3.2 this translates to<br />

\[
\theta_a^2 < \frac{\kappa_a}{\gamma_a} < max_a
\]

which means that the threshold must be lower than the target equilibrium $\kappa_a / \gamma_a$,<br />
because otherwise the observed negative autoregulation<br />
cannot be explained by the model. The term $\kappa_a\, s^-(x_a,\theta_a^2)$, with $s^-(x_a,\theta_a^2) = 1$ below the threshold, means that while protein<br />
concentration $x_a$ is below threshold $\theta_a^2$, protein A is synthesized with rate $\kappa_a$, and while<br />
it is above this threshold it is synthesized with rate 0.<br />
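The role of the equilibrium inequality can be checked numerically. The sketch below integrates<br />
a single negatively autoregulated protein with a simple Euler scheme; the parameter values, the<br />
step size and the function name are assumptions for illustration and are not taken from [57].<br />

```python
def simulate_autoregulation(kappa, gamma, theta, x0=0.0, dt=0.001, n_steps=20000):
    """Euler integration of dx/dt = kappa * s_minus(x, theta) - gamma * x."""
    x = x0
    trajectory = [x]
    for _ in range(n_steps):
        synthesis = kappa if x < theta else 0.0   # synthesis only below the threshold
        x = x + dt * (synthesis - gamma * x)
        trajectory.append(x)
    return trajectory

# Threshold below the target equilibrium kappa/gamma = 2.0:
# the concentration rises until the threshold and is then held there.
low_threshold = simulate_autoregulation(kappa=2.0, gamma=1.0, theta=1.0)

# Threshold above kappa/gamma: the concentration relaxes to kappa/gamma and
# never reaches the threshold, so the autoregulation is never triggered.
high_threshold = simulate_autoregulation(kappa=2.0, gamma=1.0, theta=3.0)
```

Only in the first case does the negative autoregulation become visible in the trajectory, which<br />
is what the inequality on the threshold expresses.<br />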


Fig. 3.2: A: A schematic model of gene regulation translates into piecewise-linear equations (B).<br />
In a quantitative model, the values for κ and θ are known and as such put in the model<br />
as a priori knowledge. C gives the qualitative model of the same situation, in which the<br />
unknown parameters are optimized along with the gene regulation relations [57].<br />

3.4 Modeling pathways using time series expression data,<br />

using conventional micro-array data<br />

Signaling networks and gene networks are, unlike metabolic networks, not well studied<br />

and the network structures are largely unknown. Therefore it is not possible to use<br />

standard analytical tools from metabolic networks to study gene networks [45]. It is<br />

possible to estimate models of gene regulation though, using statistical approaches. To<br />

determine if molecular imaging is suitable for these statistical approaches, we take a<br />

look into microarray data, to study how statistical model inference is applied in this<br />

field of research. As with molecular imaging it is possible to obtain expression data<br />

over time, by taking multiple samples of a culture, or samples of tissue over time. Time<br />

series experiments are most feasible when studying single cell organisms such as yeast<br />

or bacteria while changing the conditions over time.<br />

Bayesian Networks<br />

One way of analyzing such microarray data is to use Bayesian Networks to<br />
model the regulatory effects of genes on each other. A basic example of a Bayesian Network<br />
is shown in Fig. 3.3. With Bayesian Networks, genes that are co-regulated can<br />
be associated with each other with a certain probability. For instance, given that gene A<br />
is upregulated, gene B has a 95% chance of also being upregulated (see Fig. 3.3). It<br />
is not possible, though, to model regulation effects over time, or to model a regulatory<br />
pathway, with standard Bayesian Networks. Due to the acyclic constraint of Bayesian<br />
Networks, it is also not possible to model autoregulation and feedback loops.<br />

Fig. 3.3: Example of a Bayesian Network. Left side is a network with only observable data.<br />
Right contains hidden nodes that are estimated to obtain observed data [58].<br />

Fig. 3.4: A DBN can model feedback loops, by introducing a time component.<br />
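The probabilistic statement in this example can be worked out for a minimal two-node network<br />
A → B. Only the 95% value comes from the example above; the prior for gene A and the probability<br />
of B being upregulated when A is not are assumptions added to make the calculation complete.<br />

```python
# Conditional probability table for a two-node network A -> B.
p_a_up = 0.30                  # assumed prior P(A up), not from the text
p_b_up_given_a_up = 0.95       # the value used in the example above
p_b_up_given_a_not_up = 0.10   # assumed, not from the text

# Marginal probability of B being upregulated (sum over the states of A).
p_b_up = p_b_up_given_a_up * p_a_up + p_b_up_given_a_not_up * (1.0 - p_a_up)

# Bayes' rule: how likely is A upregulated when B is observed to be upregulated?
p_a_up_given_b_up = p_b_up_given_a_up * p_a_up / p_b_up

print(round(p_b_up, 3), round(p_a_up_given_b_up, 3))
```

The same calculation, repeated over many nodes and candidate structures, is what network<br />
inference automates.<br />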

In Bayesian Networks, prior knowledge can be incorporated. If for instance gene A<br />
and gene B are located on the same operon (in prokaryotes), they will automatically be<br />
expressed at the same time. Their co-regulation is then not due to a regulatory effect between<br />
gene A and gene B, but to a common, invisible (i.e. unmeasured) parent (see Fig. 3.3, right<br />
part).<br />

Dynamic Bayesian Networks<br />

Unlike BNs, Dynamic Bayesian Networks (DBNs), also called Temporal Bayesian Networks,<br />
are able to model dynamic systems and feedback mechanisms [59]. Ong et al. use<br />
a Dynamic Bayesian Network for pathway modeling because a DBN is able to handle<br />
prior knowledge, hidden variables, time series data and stochasticity [58]. A DBN is<br />
in fact a BN in which each node refers to an ‘object’ at a given time point.<br />
An object can thus occur multiple times in a DBN (Fig. 3.4).<br />

With these DBNs, a most likely regulatory pathway can be estimated using an<br />
expectation maximization algorithm.<br />
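Why the time component removes the acyclicity problem can be shown with a small sketch that<br />
unrolls a regulatory scheme over time slices. The function names and the example scheme are<br />
hypothetical; the acyclicity check is a standard topological sort (Kahn's algorithm).<br />

```python
def unroll(edges, n_slices):
    """Unroll regulatory edges (src, dst) into a time-sliced DBN graph.

    Every edge src -> dst becomes (src, t) -> (dst, t + 1), so cycles in the
    regulatory scheme turn into edges that point forward in time.
    """
    return [((src, t), (dst, t + 1))
            for t in range(n_slices - 1)
            for src, dst in edges]

def is_acyclic(edge_list):
    """Kahn's algorithm: True if the directed graph contains no cycle."""
    nodes = {n for edge in edge_list for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, dst in edge_list:
        indegree[dst] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for src, dst in edge_list:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    queue.append(dst)
    return visited == len(nodes)

# Feedback loop A -> B -> A plus autoregulation of A: cyclic as a static network.
regulation = [("A", "B"), ("B", "A"), ("A", "A")]
dbn_edges = unroll(regulation, n_slices=3)
```

The static scheme violates the acyclic constraint of a BN, whereas its unrolled version is a<br />
valid directed acyclic graph in which each object appears once per time slice.<br />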


A Bayesian approach to top-down modeling is feasible and suitable, because intracellular<br />
networks tend to be sparse and scale-free [45]. In [58] the authors had only a small<br />
number of data points available, but they were still able to reconstruct the biological<br />
mechanism by incorporating prior knowledge into the model. With wild-type (WT) time series<br />
expression data, the set of genes that function in a system and the order in time of their<br />
expression can be determined. For the study of gene regulatory networks, individual<br />
knockout experiments are needed [60].<br />

Data quality<br />

When microarray experiments are used to obtain time series expression data, it is difficult<br />
to obtain a continuous representation of gene expression profiles. This is due to<br />
background noise, missing data points, unsynchronized cell cycles, different phases and<br />
amplitudes of expression and differences in cycle length, which in turn might cause<br />
aliasing if the signal is undersampled. Clustering of expression data also<br />
becomes difficult, due to the sparsity of the data. Finding correlation in an experiment with<br />
10 time samples is not a trivial task, especially when interpreting causality (e.g. high<br />
correlation, but time shifted).<br />
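The time-shift problem can be made concrete with a lagged correlation on ten samples. The<br />
sinusoidal profiles, the period and the delay of two samples are invented for this sketch; real<br />
expression profiles are noisier and far less regular.<br />

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]      # earlier part of the regulator profile
        b = y[lag:]                # later part of the target profile
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

t = np.arange(10)                                 # ten time samples
regulator = np.sin(2 * np.pi * t / 8.0)           # hypothetical expression profile
target = np.sin(2 * np.pi * (t - 2) / 8.0)        # same profile, two samples later

correlations = lagged_correlation(regulator, target, max_lag=3)
best_lag = max(correlations, key=correlations.get)
```

Without a shift the two profiles look almost uncorrelated, although one follows the other<br />
exactly. Note that every additional lag also removes samples from an already short series.<br />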

Data amount<br />

Whereas each microarray sample costs about $300 [61], an extra bioluminescence<br />
snapshot is virtually free. Oversampling is therefore<br />
not expensive, which is an important advantage, especially in view of the curse<br />
of dimensionality: the more dimensions the data have, the<br />
more data points are needed. With a microarray containing, say, a thousand gene probes,<br />
a dozen samples is not much to work with. For robust classification in general, a<br />
sample-per-feature ratio of 5-10 is needed [62]. When looking at BLI in a steady-state<br />
process, additional snapshots generate data points that are not completely independent,<br />
because they stem from the same source and process. They therefore add no extra information<br />
about the studied process, but the measurements do become more reliable with<br />
more samples, because random noise is averaged out. In conclusion,<br />
an in vivo mouse model would be very<br />
suitable for obtaining time series expression data.<br />
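The statement that random noise is averaged out can be illustrated with a small simulation.<br />
The signal level, the noise level and the assumption of independent Gaussian noise per snapshot<br />
are hypothetical choices for this sketch and do not model a real BLI acquisition.<br />

```python
import numpy as np

rng = np.random.default_rng(1)
true_signal = 5.0      # hypothetical steady-state signal (arbitrary units)
noise_sd = 1.0         # hypothetical measurement noise per snapshot

def spread_of_average(n_snapshots, n_repeats=2000):
    """Standard deviation of the mean of n_snapshots noisy measurements."""
    snapshots = true_signal + noise_sd * rng.normal(size=(n_repeats, n_snapshots))
    return float(snapshots.mean(axis=1).std())

spread_1 = spread_of_average(1)      # single snapshot: about noise_sd
spread_16 = spread_of_average(16)    # 16 snapshots: about noise_sd / sqrt(16)
```

Under these assumptions the uncertainty of the averaged measurement shrinks with the square<br />
root of the number of snapshots, so sixteen snapshots reduce it roughly fourfold.<br />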

Another problem caused by the sparsity of available data sets is that once classifiers<br />
or models are built, there is no way to determine whether they are really robust or<br />
correct, because there are simply not enough datasets available to test their robustness.<br />

Pathway selection<br />

Microarray data can be used in the search for a high-level model. Using Bayesian<br />
inference it is possible to construct the most likely model that best fits the data, and by<br />
making perturbations to the network, dependencies can be modeled further. Often,<br />
especially when many genes are involved in the studied network, many<br />
solutions exist that all give about the same fit to the data. It is possible<br />
to select the top-scoring pathway as the correct one, but there is no way to be certain<br />
whether this pathway is actually the correct one. The only way to gain more<br />
certainty is to make use of extra data, obtained from additional experiments.<br />

If ambiguous pathways are found, the most discriminating genes between those pathways<br />
can be selected for additional knockout experiments [63] (see Fig. 3.5). The<br />
‘most discriminating’ genes can be found in different ways. In [63] mutual information<br />
is used, but random selection or hub-based selection can also be used. Selection based on<br />
mutual information picks the hypothetical knockout experiment that, given the estimated model, is<br />
expected to cause the maximal information gain (i.e. reduction in ambiguity). By first<br />
designing experiments with these high-scoring genes, a fast decrease in the number of ambiguous<br />
pathways is observed. A problem with single knockout experiments is that multiple<br />
genes that independently regulate another gene (multiple inbound interactions) are<br />
not detected in these experiments. Multiple-gene knockout experiments are therefore<br />
needed to obtain a fully unambiguous regulatory pathway.<br />
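The selection of the most informative knockout can be sketched with a simplified version of the<br />
information gain criterion. The sketch assumes that all candidate pathways are equally likely and<br />
that each predicts a single, deterministic outcome per experiment; under those assumptions the<br />
expected information gain equals the entropy of the predicted outcomes. The knockout names and<br />
predictions are invented, and this is not the procedure of [63].<br />

```python
import math
from collections import Counter

def expected_information_gain(predicted_outcomes):
    """Entropy (in bits) of the outcomes predicted by equally likely models.

    With deterministic predictions and a uniform prior over models, this equals
    the mutual information between the experiment outcome and the model identity.
    """
    counts = Counter(predicted_outcomes)
    total = len(predicted_outcomes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical predictions of four candidate pathways for three knockouts.
# Each entry is the predicted state of a readout gene.
predictions = {
    "knockout_g1": ["up", "up", "up", "up"],          # all models agree
    "knockout_g2": ["up", "up", "down", "down"],      # splits the models in two groups
    "knockout_g3": ["up", "down", "same", "same"],    # splits the models in three groups
}
gains = {ko: expected_information_gain(p) for ko, p in predictions.items()}
best_experiment = max(gains, key=gains.get)
```

An experiment on which all candidate pathways agree cannot reduce the ambiguity, so it receives<br />
a gain of zero; the experiment that separates the candidates into the most evenly sized groups is<br />
selected first.<br />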

With in vivo imaging, once a discriminating gene is found, a knockout model could<br />
easily be created using the Flp-In system of Invitrogen. With this method, genes<br />
of interest can be overexpressed or silenced using Flp recombinase. The Flp-In<br />
technique ensures that only one insertion is made in the genome and that this<br />
insertion is made at a non-functional but actively transcribed part of the DNA. For example,<br />
a pathway can be knocked down by eliminating a certain key gene in order to study redundancy<br />
in its functionality, or kinetics can be studied by regulating certain network<br />
components [64].<br />

Model Validation<br />

In their paper on model testing, de Jong et al. [65] state that it is infeasible to manually<br />
check the validity of a large (inferred) network model, due to the complexity of the<br />
model and the large number of free parameters. The only way to check the validity of<br />
a network is to use even more data and check how well the model behaves<br />
compared to the observed data. This implies that high-throughput measurements<br />
are needed for network validation, which immediately raises questions about the feasibility of<br />
studies with whole body molecular imaging.<br />

3.5 Discussion<br />

3.5.1 General<br />

In most if not all studies of spatial gene expression, no model inference<br />
has been done yet, but databases with spatiotemporal gene expression data have been made<br />
available, which in turn should open the door for network inference. If registration<br />
problems can be solved and spatial gene expression over time can be registered<br />
accurately, then there is no reason why network model inference cannot be done. This<br />
does not mean it will be an easy or straightforward task, as will be discussed in this<br />
section.<br />

With 3D gene expression atlases, such as genepaint.org, it is possible to obtain gene<br />
expression data from in situ hybridization. Genepaint.org only contains a single time snapshot of<br />
the developing mouse embryo (E14.5) [48]. It is therefore not possible to directly infer<br />
a regulatory network from the data, but it is possible to cluster the data and thereby create<br />
groups of genes that have a high probability of being part of the same network module,<br />
because they share the same spatial expression profile during this developmental stage<br />
of the embryo. A big problem with this approach is that genes that are silenced<br />
by some gene, and thus directly regulated by that gene, are not clustered with that gene,<br />
because their spatial expression profiles do not match. With temporal observations, the<br />
chance of clustering these negative feedback regulations is bigger, because it is possible<br />
to make use of mixed correlation. In the paper of Visel, only co-expressed genes are<br />
marked as candidates for a perturbation study. A WT and a Pax6-deficient mouse strain<br />
are studied at time point E15.5, and the expression profiles of the genes of interest (i.e.<br />
the genes that had the same spatial expression profile at stage E14.5) are studied and<br />
compared to E14.5 and to each other. If expression differs between Pax6-deficient and WT mice,<br />
then these genes are directly or indirectly regulated by Pax6. This reasoning<br />
is valid, but it should be noted that it will be very difficult to obtain a gene regulatory<br />
network if all negative feedback loops are left out of scope by this approach.<br />

Fig. 3.5: By running knockout experiments on the top-priority scoring genes, the actual network can<br />
be found [63].<br />

The EMAP database does contain temporal information on mouse embryo development<br />
and is therefore preferable for gene network inference. A module called<br />
emage contains gene expression data that is mapped to an anatomical mouse atlas.<br />
A text-based gene expression database (GXD) is also available, which contains the<br />
annotation information. Note that this latter information is qualitative: it mentions the<br />
organs where expression is observed, not the coordinates inside the mouse atlas.<br />
Emage is also accessible through a programmer's SOAP WSDL interface, which allows<br />
for data mining [66].<br />

3.5.2 Creating models for whole body imaging data<br />

Because the resolution that can feasibly be obtained in small animals is not as high as, for example,<br />
in Drosophila, the level of detail of the model will automatically also be<br />
lower when using small animal models. And with a lower resolution, the model<br />
has less explanatory power and results obtained from the model are not necessarily<br />
biologically meaningful.<br />

Although whole-body imaging does not easily allow for a large quantitative model, it<br />

does generate new information, because a 3D reconstruction of the location of gene expression<br />

gives much more information than one-dimensional microarray data alone. Microarrays<br />

allow many gene expression levels to be probed, and thus large networks to be<br />

inferred, whereas whole-body imaging allows only a few expression profiles at a time.<br />

Keep in mind that each gene to be visualized requires a genetic modification of the<br />

organism.<br />

The power of small animal in vivo imaging is that processes can be followed in time.<br />

More samples are needed to reduce the degrees of freedom of the network that is modeled.<br />

With current techniques it is possible to visualize multiple gene expression profiles<br />

in the same animal by using multiple fluorescent proteins with different emission<br />

spectra. BD Living Colors™ fluorescent proteins are an example of fluorescent<br />

proteins that are suitable for this [67]. New attenuation problems arise when fluorophores<br />

of different wavelengths are used, but provided that these can be solved, around 5 to<br />

6 different probes can be measured simultaneously. Despite the high spectral overlap<br />

between the different fluorophores, it is still possible to separate different reporters by using<br />

multispectral imaging and multiplexing [68]. Caution is needed when using<br />

multiple fluorophores at the same time: not all fluorophores can be detected with the<br />

same sensitivity, which could falsely suggest that the more sensitive fluorophores are<br />

expressed earlier (because they are detectable earlier) than the less sensitive ones [69].<br />
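
As a hedged illustration of the separation step, the sketch below performs linear unmixing<br />

for the simplest case of two fluorophores measured in two detection channels. It assumes that<br />

the measured signal is a linear mixture of known reference spectra. The function name<br />

`unmix_two` and all numbers are hypothetical and are not taken from [68].<br />

```python
def unmix_two(signal, spectra):
    """Solve signal = a0 * spectra[0] + a1 * spectra[1] for the abundances (a0, a1).

    signal:  (s0, s1) intensities measured in two detection channels.
    spectra: two reference emission spectra, each given as (channel0, channel1).
    Assumption of this sketch: the mixture is linear and the spectra are known.
    """
    (e00, e01), (e10, e11) = spectra
    det = e00 * e11 - e10 * e01
    if abs(det) < 1e-12:
        raise ValueError("reference spectra are not distinguishable")
    s0, s1 = signal
    # Cramer's rule for the 2x2 linear system.
    a0 = (s0 * e11 - s1 * e10) / det
    a1 = (s1 * e00 - s0 * e01) / det
    return a0, a1


# Hypothetical, strongly overlapping spectra; the mixture holds 2 units of the
# first and 5 units of the second fluorophore.
reference = ((0.8, 0.2), (0.3, 0.7))
abundances = unmix_two((3.1, 3.9), reference)
```

With more fluorophores than two, the same idea becomes a least-squares problem over all<br />

detection channels, which is why sufficiently many spectral bands are needed.<br />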

The possibility of tagging multiple genes, and thus of visualizing them, in<br />

combination with the alignment of distinct measurements to an atlas using registration<br />

techniques, also makes it possible to use a network inference algorithm<br />

similar to that of Reinitz and Krul [40, 50] in small animal whole body molecular<br />

imaging.<br />

The model would need some changes to overcome the scaling problems observed in<br />

molecular imaging. Equations 3.1 and 3.2 are discussed below, including some caveats<br />

and the changes that are needed to apply them to whole body imaging.<br />

Since we will not be able to see gene expression at cellular resolution, we need to<br />

define something else as a cell. The most logical solution would be to define a voxel in<br />

the 3D image as a ‘cell’. $g_{ij}$ in equation 3.1 would then not refer to cell i, but to voxel<br />

i, and $N_c$ would be the number of voxels inside the animal body. This immediately<br />

raises a problem: the correspondence problem. The voxels have to be numbered in<br />

such a way that after each registration, each voxel is numbered in exactly the same<br />

way. The model must also contain the same number of voxels for<br />

each measurement. These problems can be overcome by discretizing the mouse atlas, to<br />

which the measurements are already registered, into a fixed number of voxels. The measurements<br />

that are registered onto the atlas can then be interpolated, so that each voxel gets an<br />

averaged value.<br />
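
The discretization described above can be sketched as follows. This is a minimal illustration,<br />

assuming that the registered measurement is available as scattered (x, y, z, value) samples<br />

in atlas coordinates. The function name `voxelize`, the grid size and the sample values are<br />

hypothetical, and simple per-voxel averaging stands in for a real interpolation scheme.<br />

```python
from collections import defaultdict


def voxelize(samples, grid_shape, bounds):
    """Average scattered (x, y, z, value) samples into a fixed voxel grid.

    grid_shape: (nx, ny, nz), identical for every measurement, so that
                voxel i always refers to the same atlas location.
    bounds:     ((xmin, xmax), (ymin, ymax), (zmin, zmax)) of the atlas volume.
    Returns a dict mapping the linear voxel index i to the mean value there.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, z, value in samples:
        idx = []
        for coord, (lo, hi), n in zip((x, y, z), bounds, grid_shape):
            if not lo <= coord <= hi:
                break  # sample lies outside the atlas volume, skip it
            # Clamp so that coord == hi still falls into the last voxel.
            idx.append(min(int((coord - lo) / (hi - lo) * n), n - 1))
        else:
            ix, iy, iz = idx
            # Fixed linear numbering shared by all registered measurements.
            i = (ix * grid_shape[1] + iy) * grid_shape[2] + iz
            sums[i] += value
            counts[i] += 1
    return {i: sums[i] / counts[i] for i in sums}


unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
measurement = [
    (0.1, 0.1, 0.1, 2.0),
    (0.2, 0.2, 0.2, 4.0),
    (0.9, 0.9, 0.9, 10.0),
    (5.0, 5.0, 5.0, 1.0),  # outside the atlas, ignored
]
profile = voxelize(measurement, (2, 2, 2), unit_cube)  # → {0: 3.0, 7: 10.0}
```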

Concentration model<br />

Equation 3.2 was used to model the diffusion of proteins that can cross the<br />

cell barrier. These extracellular proteins can have a signaling function, whereas intracellular<br />

proteins that cannot cross the cell membrane will not have this signaling function.<br />

The paracrine proteins, as the diffusing proteins are called, are likely to have a smoother<br />

distribution than the proteins that stay inside the cells. Paracrine signaling accounts<br />

for signaling between cells in close proximity to each other and is therefore<br />

likely to cause the formation and survival of differentiated cell clusters.<br />

When looking at whole body models, though, endocrine signals should also be taken<br />

into account. Endocrine signals are produced in the endocrine glands and commonly<br />

consist of hormones. The activation of receptors and glands can be visualized<br />

by using multiple fluorophores [69, 4, 68], but no literature on direct in vivo visualization<br />

of endocrine signaling molecules has been found, and it is doubtful whether reporter<br />

genes can be used to visualize the synthesis of hormones, because hormones are very small<br />

molecules compared to the reporter genes. Hormone levels can be measured directly,<br />

though, because hormones are present in the blood as endocrine signaling compounds, but it<br />

is doubtful whether their concentrations will be homogeneous.<br />

It might, however, also be possible to incorporate endocrine signaling into the model<br />

as unknown/invisible regulation factors, without measuring them. The difference from<br />

paracrine signaling is that the molecules can pass the endothelial barrier, so that they<br />

can travel through the blood circulatory system.<br />

In the model this will translate into a third equation, that is comparable to equation 3.2.<br />

The organs most likely will act as cells and the bloodstream will act as the extracellular<br />

region. The diffusion through the bloodstream will be faster than in the extracellular<br />

region but the rest of the equation will remain the same.<br />

With endocrine signaling incorporated into the model, the steep protein concentration<br />

gradients that are most likely to be observed at the boundaries of organs, or more<br />

generally, at the boundaries between clusters of different cell types, can be explained. The<br />

equation for endocrine signaling will be of the form:<br />

\[ \frac{\partial b_j(x,t)}{\partial t} = D2_j \nabla^2 b_j(x,t) - \lambda_j b_j(x,t) \qquad (3.9) \]<br />

where $b_j(x,t)$ is the concentration of gene j (or compound j, because it is most likely a<br />

hormone) and $D2_j$ is the diffusion coefficient in the bloodstream. $\lambda_j$ is still the degradation<br />

component.<br />
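
To make the behavior of equation 3.9 concrete, the sketch below integrates it numerically<br />

on a one-dimensional grid with an explicit finite-difference scheme. This is only an<br />

illustration under stated assumptions: one spatial dimension, zero-flux boundaries, and<br />

hypothetical parameter values. The function name `diffuse_step` is not part of the model<br />

described in this study.<br />

```python
def diffuse_step(b, D, lam, dx, dt):
    """One explicit Euler step of db/dt = D * d2b/dx2 - lam * b on a 1D grid.

    b:   list of concentrations per grid point.
    D:   diffusion coefficient (D2_j in equation 3.9).
    lam: degradation rate (lambda_j in equation 3.9).
    Zero-flux boundaries are an assumption of this sketch. The explicit
    scheme is only stable when D * dt / dx**2 <= 0.5.
    """
    n = len(b)
    new = [0.0] * n
    for i in range(n):
        left = b[i - 1] if i > 0 else b[i]        # mirrored ghost point
        right = b[i + 1] if i < n - 1 else b[i]   # mirrored ghost point
        laplacian = (left - 2.0 * b[i] + right) / dx ** 2
        new[i] = b[i] + dt * (D * laplacian - lam * b[i])
    return new


# A single source voxel spreads out and decays (hypothetical values).
profile = [0.0, 0.0, 1.0, 0.0, 0.0]
after_one_step = diffuse_step(profile, D=1.0, lam=0.1, dx=1.0, dt=0.1)
```

A larger diffusion coefficient flattens the profile faster, which is the only difference<br />

the text assumes between transport through the bloodstream and through the extracellular region.<br />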

The difference in solubility of proteins in different cell types might also need<br />

to be taken into account, but it may be negligible because both are watery environments.<br />

Also, endocrine molecules are secreted directly into the bloodstream, which<br />

makes it difficult to make a distinction between the concentration in the bloodstream<br />

and that in the secreting cell. The equation will probably be of the form:<br />

\[ g_{ij}(t) = b_j(x_i, t) \qquad (3.10) \]<br />

where $g_{ij}(t)$ is the concentration of gene j in voxel i at time t and $b_j(x_i,t)$ is the concentration<br />

of gene j at the location of voxel i at time t.<br />

Location model<br />

When we are able to relate different expression profiles to different organs, we would<br />

gain extra insight into the functionality of the proteins. This is not necessary for the model<br />

to work, though.<br />

It may also be necessary to know the direction of the bloodstream near endocrine glands, to<br />

correctly predict the concentration gradients of the endocrine signals. With ultrasound<br />

it is possible, by using high-frequency Doppler flow mapping, to determine parameters<br />

such as blood velocity, blood flow and blood volume [70].<br />

Again, if endocrine signaling is modeled as an invisible or free parameter, this extra<br />

data is not needed. Endocrine signaling can then be seen as a way to explain steep<br />

concentration gradients in spatial expression profiles combined with strong temporal relationships<br />

between seemingly spatially unconnected regions, i.e. it explains how a gene can be expressed<br />

in, for example, the liver and the kidneys, but not in between.<br />

For a full understanding of spatial and temporal regulation, it is necessary to register<br />

anatomical data to the gene expression data. In that way, steep concentration gradients<br />

can be explained by, for example, the boundary of an organ.<br />

It should be kept in mind, though, that steep gradients in protein concentration can also<br />

be caused by paracrine signaling, as can be seen with the eve stripe formation. Co-regulation<br />

in non-continuous space, though, cannot be explained by paracrine signaling<br />

alone.<br />



CHAPTER 4<br />

Molecular Imaging as a means for hypothesis testing<br />

Molecular Imaging has the potential to generate data for regulatory network model inference.<br />

As was shown in Chapter 3, it has some major limitations though, such as the lack<br />

of high throughput possibilities, of direct protein measurements and of direct expression detection<br />

(need for reconstruction), but it does generate some new information that is<br />

not available with current techniques. The most important new aspect is probably the<br />

possibility to study processes over time.<br />

This new aspect of the data is not only useful for model inference. It also enables researchers<br />

to study (morphologic) processes over time. Although molecular imaging<br />

techniques such as BLI, FMI and PET lack high contrast, they are much more sensitive<br />

and specific than their clinical counterparts, and thus processes that could not be<br />

detected with other techniques can now be visualized and studied.<br />

If researchers can see and study processes over time, they can test new<br />

or existing hypotheses. Two possible fields of study emerge from molecular imaging:<br />

gene tracking and cell tracking. The differences are explained below.<br />

4.1 Gene Tracking<br />

With reporter genes, different processes can be visualized. The effect of repressors<br />

and enhancers can be studied, and predicted pathways can be validated by knocking out<br />

or upregulating gene expression, provided that this is not lethal. Gene activity during<br />

events in the body can also be measured, for example during growth, degradation, apoptosis<br />

or the circadian cycle. All these processes can be studied using the techniques discussed<br />

in Chapter 2. Examples found in literature are the inhibition of the Cdk2 gene [71],<br />

transcriptional regulation of the CYP3A4 gene [72], visualization of active estrogen<br />

receptors [73] and responses to bacterial and viral infections [26].<br />


Currently, mainly qualitative visual inspections are done on these processes. It is<br />

possible, though, to create statistical tests to determine gene expression levels. In the<br />

study on CYP3A4, the authors used a post hoc t-test to compare mean<br />

expression differences over time within one group, and multivariate analysis of variance<br />

(MANOVA) tests to compare control groups with injected groups for different injections<br />

and to compare male with female mice [72].<br />

When combining a two-dimensional BLI/FMI image with a three-dimensional anatomical<br />

atlas, it would also be possible to attach qualitative expression tags to the BLI image,<br />

in terms of the location of expression. Looking at the combination of the 2D<br />

image and the registered 3D anatomical atlas, statements can be made such as: ‘The chance of this gene<br />

being expressed in the liver is 50%, in the stomach 30% and in the kidneys 20%.’<br />
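
One naive way to arrive at such percentages is sketched below. It assumes that the atlas<br />

is already registered to the 2D image, that each image pixel corresponds to one column of<br />

atlas voxels along the projection direction, and that every labeled voxel in that column<br />

is equally likely to be the source. Tissue attenuation and scattering are ignored, and<br />

the function name `organ_probabilities` and the label names are hypothetical.<br />

```python
from collections import Counter


def organ_probabilities(atlas, x, y, background="none"):
    """Fraction of labeled atlas voxels per organ along the ray behind pixel (x, y).

    atlas[x][y] is the list of organ labels along the projection (depth) axis.
    Assumption of this sketch: a uniform prior over depth, no attenuation.
    """
    labels = [label for label in atlas[x][y] if label != background]
    if not labels:
        return {}
    counts = Counter(labels)
    return {organ: n / len(labels) for organ, n in counts.items()}


# One pixel whose ray crosses 5 liver, 3 stomach and 2 kidney voxels.
column = ["none"] + ["liver"] * 5 + ["stomach"] * 3 + ["kidney"] * 2
toy_atlas = [[column]]
tags = organ_probabilities(toy_atlas, 0, 0)
```

For this toy column the result is 50% liver, 30% stomach and 20% kidney. A realistic<br />

estimate would weight each voxel by the expected photon attenuation along the ray.<br />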

Some genes are expected to have a function in the development of organs. For example,<br />

gene expression is expected to be visible before formation of an organ. To test whether this<br />

expression is significantly concentrated at the site of organ formation, one<br />

must first be able to indicate where the organ is formed. This can be done by making<br />

an analysis over time and registering the gene expression to another modality where<br />

the morphological formation of the organ can be detected. If the location of the organ<br />

formation is known, and the genes of interest are expected to be functional for the<br />

formation of that organ, then it is expected that those specific genes are expressed at<br />

higher levels at these locations than in other locations.<br />

4.2 Cell Tracking<br />

When no transgenic animals are used for the research, molecular imaging can still be<br />

useful. It is possible to generate xenografts that are detectable by molecular imaging<br />

techniques. The most commonly used reporter genes are luc and GFP. Examples<br />

of cells that can be tracked are labeled bacteria and viruses, to determine their pathogenicity.<br />

The effectiveness of antibiotic therapies can also be studied this way [4].<br />

A lot of work is done on cell tracking of cancer cells. Cell lines with an ‘always on’ luc<br />

reporter gene are constructed and injected into model organisms. The Flp-In<br />

system can be used to easily knock out or upregulate specific genes in a (tumor) cell;<br />

the effect can afterwards be measured by using an ‘always on’ luc gene.<br />

It should be noted that what is followed is the proliferation and location of tumor cells:<br />

what is seen is not gene regulation, but the amount and location of active (living)<br />

tumor cells, or other studied xenografts for that matter. When comparing differences in<br />

tumor growth in follow-up studies, a t-test could be used to look for statistically significant<br />

differences in tumor growth.<br />
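
As a hedged sketch of such a comparison, the code below computes the t statistic and the<br />

degrees of freedom for two groups of tumor signal measurements. Welch's unequal-variance<br />

variant is chosen here as an assumption, since the text does not specify which t-test is<br />

meant. The function name `welch_t` and the data are hypothetical. The resulting statistic<br />

still has to be compared with a critical value of the t distribution.<br />

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical total photon counts (arbitrary units) in two groups of animals.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(control, treated)
```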

It should be noted that, although the amount of active reporter enzyme will be roughly<br />

the same for each tumor cell, the turnover rate, as with all enzymatic reactions, depends not<br />

only on the enzyme concentration, but also on the amount of substrate (luciferin)<br />

and the reaction temperature. Both these variables may vary between follow-up studies.<br />

The diffusion speed of substrate through the body also depends on temperature<br />

profiles. All measurement techniques where substrates are involved will suffer from<br />

these dependencies in terms of accurate quantification. FLI is likely to be less sensitive<br />

to changes in environment.<br />


Fig. 4.1: Two datasets were drawn from the same Gaussian distribution, one of 100 and one<br />

of 100,000 samples. Two kernel density estimates (normal kernel, width 0.2)<br />

were then plotted, one per dataset. The estimate made with 100,000 data points clearly<br />

resembles the Gaussian distribution better than the estimate made with 100 samples.<br />

4.3 General signal detection and limitations<br />

To be able to make any statements about a studied signal, a first step is to determine whether any<br />

signal of interest is present at all, or whether the signal consists only of noise. To be<br />

able to draw such conclusions, the characteristics of the noise have to be determined and<br />

tests have to be created to see whether there is any signal present that is unlikely to be<br />

caused by noise alone.<br />

If such a test is created, a threshold can be set on its p-value, i.e. the probability of<br />

obtaining an observation at least as extreme under the null hypothesis, so that an image with a p-value below the threshold<br />

can be labeled as ‘signal found’. When the p-value is low, the chance of the<br />

observation being generated under the null hypothesis (no signal is present) is so<br />

small that it is likely that a signal is present, and thus a significant signal is detected. A<br />

common p-value threshold used in scientific research is 0.05.<br />
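
A minimal sketch of such a test is given below. It assumes that a set of noise-only<br />

measurements of some summary statistic (for example the total image intensity) is available<br />

and uses them directly as the null distribution. The function name `empirical_p_value`<br />

and the numbers are hypothetical.<br />

```python
def empirical_p_value(observed, null_samples):
    """Fraction of noise-only values that are at least as large as the observation.

    The +1 terms count the observation itself, so the p-value is never
    exactly zero for a finite number of noise samples.
    """
    exceed = sum(1 for value in null_samples if value >= observed)
    return (exceed + 1) / (len(null_samples) + 1)


# 99 hypothetical noise-only measurements with values 0, 1, ..., 98.
noise_only = list(range(99))
p = empirical_p_value(1000, noise_only)   # no noise value is this large
signal_found = p < 0.05
```

With 99 noise samples the smallest attainable p-value is 0.01, which illustrates why many<br />

noise measurements are needed before small thresholds can be used at all.<br />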

To determine whether a signal is significant, or whether it is significantly located in<br />

space somewhere, a null hypothesis has to be constructed and rejected. A dataset of n<br />

elements can be seen as n random samples from a probability density function.<br />

Model estimation<br />

Thus, to be able to say something about significance, an observation has to be tested<br />

against some null distribution, but before that is possible, that null distribution has to<br />

be estimated.<br />

To do this, regression to some data has to be applied. The more data points are available<br />

from the distribution to test against (the null hypothesis), the more accurate the<br />

estimation of this null distribution will be (See Fig. 4.1) [74].<br />
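
The effect shown in Fig. 4.1 can be reproduced with the sketch below, which builds a kernel<br />

density estimate with a normal kernel of width 0.2 and compares it to the true standard<br />

normal density. The function names, the random seed and the evaluation grid are assumptions<br />

made for this illustration and do not come from [74].<br />

```python
import math
import random


def gaussian_kde(samples, width):
    """Return a density function built from normal kernels of the given width."""
    norm = 1.0 / (len(samples) * width * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / width) ** 2) for s in samples)

    return density


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def max_abs_error(n, width=0.2, seed=0):
    """Largest deviation between the KDE of n samples and the true density."""
    rng = random.Random(seed)
    kde = gaussian_kde([rng.gauss(0.0, 1.0) for _ in range(n)], width)
    grid = [i / 10.0 for i in range(-30, 31)]
    return max(abs(kde(x) - normal_pdf(x)) for x in grid)
```

Calling `max_abs_error(100)` and `max_abs_error(100_000)` mirrors the two datasets of<br />

Fig. 4.1; the second call is slow in pure Python because every grid point sums over all samples.<br />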

A distinction can be made between empirical estimation, parametric estimation<br />

and semi-parametric estimation. The first does not assume any information to<br />

be known about the model; non-parametric estimators such as kernel smoothing or<br />

K Nearest Neighbor algorithms can be used to ‘reconstruct’ the model from which the<br />

samples were drawn.<br />

Fig. 4.2: 1. Only noise, 2. Only expression in tissue, 3. Only expression in/on bone, 4. Overall<br />

expression or more noise<br />

The second one assumes full knowledge about the form of the model, such as a normal or a Poisson<br />

distribution. The only things that then have to be estimated are the parameters of the<br />

distribution. If a correct distribution form is chosen, this method will give smooth<br />

and well-fitted distributions.<br />

The last model is a mixture of parametric and non parametric estimators. A mixture of<br />

Gaussians is a good example.<br />
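
As an illustration of the parametric case, the sketch below assumes that the noise follows<br />

a normal distribution, estimates its two parameters from noise-only samples, and then uses<br />

the fitted model as null distribution. The normality assumption, the function names and<br />

the sample values are hypothetical; `statistics.NormalDist` is part of the Python standard library.<br />

```python
from statistics import NormalDist


def fit_noise_model(noise_samples):
    """Parametric estimation: assume a normal form, estimate mean and stdev only."""
    return NormalDist.from_samples(noise_samples)


def p_value(observed, null_model):
    """One-sided p-value: probability of a value >= observed under the null model."""
    return 1.0 - null_model.cdf(observed)


# Hypothetical background intensities from regions without expression.
noise = [9.0, 10.0, 11.0, 10.0, 10.0, 9.0, 11.0, 10.0]
null_model = fit_noise_model(noise)
signal_found = p_value(13.0, null_model) < 0.05
```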

Model testing<br />

The important question for each test of significance is against which null distribution<br />

the test will be applied. In other words, what distribution has to be rejected in<br />

order to accept the alternative hypothesis, which states that the dataset is not generated<br />

by the probability function of the null hypothesis?<br />

If the significance of a signal can be determined and a significant signal of p



null hypothesis would hold, and thus the observation could be generated by noise and<br />

thus no significant signal would be found.<br />

It is also possible that expression occurs only in A while the test is designed for B (2).<br />

If inaccuracies in the measurements are present, then noise at the borders of B will be<br />

higher than normal noise and the test could falsely suggest that the expression measured<br />

in B is not caused by noise, and thus that expression is occurring in B. The statement<br />

that this observation is not caused by noise is indeed correct, but the alternative hypothesis<br />

that the expression thus originates from B is easily falsified visually. Another, more<br />

robust hypothesis is thus needed. This shows the complexity of statistical testing: not<br />

only is it necessary to select the null hypothesis carefully, the alternative hypothesis<br />

needs to be correct as well.<br />

The last possibility is that expression is seen both in A and B (4). Here a new difficulty<br />

appears, because it could mean that somehow the sample is very noisy, but it could also<br />

well be that indeed overall expression is observed.<br />

4.4 Discussion<br />

For detecting signals in acquired images of gene expression, the methods<br />

found in literature for detecting regions of interest often rely on a qualitative, subjective,<br />

visual selection. Quantification is done by counting the number of illuminated<br />

pixels that have a value above a certain threshold and by translating this to the number<br />

of measured photons, or photons per second [75, 76]. For automatic processing and<br />

high throughput analysis, these regions of interest need to be found automatically<br />

if present.<br />
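
The threshold-based quantification described above can be sketched as follows. The sketch<br />

assumes that pixel values are already calibrated photon counts; the function name<br />

`quantify`, the threshold and the exposure time are hypothetical and not taken from [75, 76].<br />

```python
def quantify(image, threshold, exposure_s):
    """Count pixels above a threshold and convert their sum to photons per second.

    image:      2D list of photon counts per pixel.
    threshold:  pixels with a value above this are considered illuminated.
    exposure_s: acquisition time in seconds.
    """
    selected = [value for row in image for value in row if value > threshold]
    total_photons = sum(selected)
    return len(selected), total_photons, total_photons / exposure_s


toy_image = [
    [0, 5, 12],
    [20, 3, 11],
]
n_pixels, photons, rate = quantify(toy_image, threshold=10, exposure_s=2.0)
# → 3 pixels, 43 photons, 21.5 photons per second
```

The result depends directly on the chosen threshold, which is why a manually chosen value<br />

makes the quantification subjective.<br />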

It is also important to calculate the probabilities for different qualitative location<br />

tags, which have the following meaning: given a segmentation and expression<br />

at location (x, y, z), the probability that the expression is located in a given organ is p %. Manual<br />

analysis would not be able to provide such objective probability estimates.<br />

It is important to keep in mind that much data is needed to estimate probability distribution<br />

functions. When studying gene expression in 2D, a lot of samples are needed for<br />

reliable density estimations. For the estimation of the noise distribution this is probably<br />

still feasible, but estimating a reliable model for gene expression gets complicated,<br />

and one mouse as data source simply does not suffice. In [77] it is stated that for a<br />

two-dimensional non-parametric density estimation of a normal distribution with a relative<br />

MSE of less than 0.1, using normal kernels for the estimation, at least 19 samples<br />

are needed. For three dimensions, 67 samples are already needed.<br />

It is also important to note that registering segmented data (in the form of an atlas)<br />

to measured BLI, FMI, PET or SPECT data will not always be a trivial task.<br />

With these modalities, often only two-dimensional planar images are<br />

available onto which the 3D BLI, FMI, PET or SPECT data acquisition is calibrated.<br />

The only information available for registration in these cases consists of the two-dimensional<br />

surface pictures of the organism, to which the 3D atlas has to be registered. This 2D/3D sparse data<br />

registration needs to be solved before segmentation of the BLI, FMI, etc. data can be<br />

accomplished, let alone before the statistical tests can be designed and applied.<br />



CHAPTER 5<br />

Discussion<br />

In this chapter a global discussion is presented on the topics covered in this literature<br />

study. New aspects that are introduced by MI and that are unique in bioinformatics<br />

will be highlighted and global issues that are limiting the feasibility of application in<br />

bioinformatics will be summarized, including challenges that must be solved and the<br />

expertise that is needed to do so.<br />

First, it should be noted that visualization of gene expression itself<br />

can already be seen as bioinformatics. The definition of bioinformatics in this paper is<br />

therefore restricted to the field of computational biology.<br />

5.1 Advantages of MI for the field of bioinformatics<br />

As stated several times in this paper, the most important advantage of MI over existing<br />

data sources in bioinformatics is the possibility of follow-up studies in the same animal,<br />

due to the non-invasive nature of MI. In all other known techniques, animals have to be<br />

sacrificed in order to obtain spatial and/or temporal gene expression profiles, by using<br />

sectioning techniques and extraction techniques respectively. A further advantage is that<br />

spatial and temporal information are obtained simultaneously.<br />

Another advantage, shared with cryosectioning and in situ hybridization, is the high sensitivity<br />

to local gene expression, compared to microarrays, in which RNA concentrations<br />

are averaged out in an extraction sample.<br />


5.2 Current Issues and Challenges<br />

Image Processing<br />

In molecular imaging, digital image processing is a very important aspect. Thresholding,<br />

backprojection, registration of multiple modalities to each other and registration<br />

of modalities onto an atlas are all examples of image processing techniques. Though<br />

in theory it is possible to do spatial registration on different modalities by applying<br />

some optimization function, it will not always be straightforward to formulate<br />

these optimization functions.<br />

New gene expression measurements, two or three dimensional, need to be aligned to<br />

MRI or CT data, which are also in two or three dimensional format. Also 2D optical<br />

surface images that are directly related (in space) to BLI, FMI, PET or SPECT, in the<br />

form of for example structured light, need to be registered to a 3D atlas.<br />

Registration is needed, to be able to relate spatial expression to segmented models, and<br />

thus to obtain qualitative knowledge on spatial expression. The segmentation information<br />

will be available in an atlas, and once registration of gene expression to an atlas<br />

is successful, the corresponding segmentation information can be related to the spatial<br />

gene expression information.<br />

All these problems lie in the field of image processing and new modality specific optimization<br />

algorithms need to be constructed. In principle the data to do that is available,<br />

so in time these problems will be solved.<br />

Undefined sources<br />

When interpreting gene expression data with small animal whole body optical imaging,<br />

the major challenge is that registration on some sort of atlas is needed before an estimation<br />

can be made on the qualitative spatial expression levels of the measured genes.<br />

Combined with the fact that RNA expression levels are measured indirectly by the use<br />

of reporter genes, in contrast to direct measurements by microarray probes, and<br />

the fact that post-translational effects are not detectable with MI, this means that many assumptions<br />

on gene expression are needed when using molecular imaging as data source. This<br />

may or may not influence data analysis, and this uncertainty makes the use of optical<br />

imaging as data source difficult.<br />

Radionuclide imaging gives similar problems. Here the main problems would be that<br />

reporter genes need to be expressed at cell surfaces to be able to detect radioactive<br />

compounds, or reporter enzymes are needed to ‘trap’ radioactive compounds inside the<br />

cells, with possible toxic effects and disturbed biological processes as a result.<br />

The problems in MI concerning gene expression are thus not the technical challenges<br />

of reconstructing the source of emission of photons, which can be calculated for every<br />

modality to some resolution, but the biological meaning of what is actually measured<br />

(See Fig. 5.1).<br />

What is needed for MI to overcome this problem is the development of contrast agents<br />

that are directly correlated to the expression levels of the gene of interest. Most likely<br />

this must be some sort of fusion protein, because the only way to be certain that a<br />

protein is expressed is to be able to detect it directly. This is also the only accurate<br />

way to determine protein concentrations in vivo, because otherwise differences in<br />

diffusion will prevent accurate concentration measurements. Solutions to this problem<br />

will probably come from the field of pharmaceutical development, through newly developed<br />

probes, and from the field of life sciences [16].<br />

Fig. 5.1: By using gene reporters as gene expression source, many parameters remain unknown,<br />

with unreliable expression estimations as a result<br />

Statistical Approaches<br />

In cases where it is known what the expression levels of reporter genes mean, such<br />

as with cell tracking or fusion protein detection, in for example gene therapy [16],<br />

there is a need for high(er) throughput measurements to be able to construct reliable<br />

density models for obtaining reliable prior probability distributions. Without enough<br />

data samples, only statistical statements on difference in expression can be made in<br />

follow-up studies in the same animal model, with the use of t-tests, but even then<br />

multiple measurements are needed, to at least get an indication of means and variances<br />

at different time points.<br />

To be able to generate more data, an efficient and reliable way of obtaining gene expression is<br />

needed. The paper of Dupuy et al. on high throughput analysis in<br />

C. elegans shows that gene transfer efficiency was responsible for too low a signal in 36%<br />

of the total samples obtained. The Flp-In technique will enable efficient gene transfer.<br />

New developments will probably come from high throughput screening of<br />

cell lines. It will be more difficult to obtain similar results for more complex organisms,<br />

because high throughput screening is less feasible for those organisms and long-term<br />

effects are more difficult to spot, since full development of the organism is needed<br />

before side effects can be seen.<br />

Not only is an efficient gene transfer system necessary, also fully automatic registration<br />

is needed for high throughput segmentation. For these problems to be solved, work<br />

has to be done in the fields of genetics and image processing for data generation and<br />

processing.<br />

If and when enough data is available, statistical tests will have to be designed, to obtain<br />

new (statistical) information on developments in studied processes. For different<br />

studies, different tests will have to be developed.<br />

5.3 Conclusion<br />

The field of molecular imaging comprises some very powerful techniques to visualize<br />

the expression of certain genes. Unfortunately, some criteria needed for use in<br />

bioinformatics are not met. The most important criterion that is not met is that it is not<br />

yet feasible to do high throughput measurements for whole body imaging. The main<br />

reason for this is that, unlike with microarrays, only a few (up to 5 with FMI) genes<br />

per animal can be measured at a time with MI. This is because for each promoter a<br />

unique reporter gene is needed to specifically visualize the corresponding gene of<br />

interest. Generating mice to obtain as many expression levels as with microarrays<br />

would be time consuming.<br />

Also, some registration problems need to be solved before data from molecular imaging<br />

can be used for bioinformatics. Once registration, segmentation and high throughput<br />

measurements are technically feasible, molecular imaging could prove to be<br />

a valuable addition to the existing data modalities in bioinformatics.<br />

The fact that only indirect measurements of protein expression are obtained does<br />

not necessarily mean that the data cannot be used. Regulation networks can still be<br />

obtained from the 3D+t gene expression data, but it should not be forgotten that measurements<br />

are indirect and thus expression data could be incorrect.<br />

Molecular imaging does provide a new way to observe biological processes in vivo that<br />

were previously not available for study. For instance,<br />

so-called ‘biomarkers’ that are used and searched for in bioinformatics can (indirectly)<br />

be visualized in MI by using reporter genes or specific antibody contrast agent fusions,<br />

so that it can be determined not only whether a disease is present, but also where it is located.<br />

In other words, microarrays can be used to search for genes of interest and, once found,<br />

the ‘behavior’ of those genes can be studied with MI techniques.<br />

The behavior of, for example, cancer cells after genetic alteration can also be studied, which opens new possibilities for research on gene therapy in cancer treatment.

To put it boldly and briefly: the field of bioinformatics, in the form of computational biology, and the field of molecular imaging, in the form of whole-body imaging, are not yet ready for each other, but if the discussed technical challenges are solved, their combination holds great potential.

