
Paterson Institute for Cancer Research SCIENTIFIC REPORT 2005


BIOINFORMATICS

age, MIAME VICE is becoming the main route of data exchange between the Molecular Biology Core Facility (page 52) and its users.

The complex relationship between genes, probes and transcripts

Affymetrix microarrays record the presence of a transcript in solution by measuring the level of hybridization between the transcript and a set of short (typically 25mer) oligonucleotide probes anchored to the array surface. Each ‘probeset’ consists of a series of ‘perfect match’ (PM) probes, designed to match the transcript exactly, and a series of ‘mismatch’ (MM) probes, identical to the PM probes except that the middle residue has been changed. Hybridization conditions are controlled with the aim of maximizing the binding between a transcript and its PM probes, whilst minimizing the binding to its MM probes (see www.affymetrix.com for more details). The intention is that the PM probes record the presence of the transcript, whilst the MM probes measure background and non-specific hybridization. One advantage of this approach is that the combination of short oligonucleotides and strict hybridization conditions makes it possible to use in silico searches to predict which probes are likely to bind to which transcripts. This information is important because many transcripts have similar sequences and certain probes are capable of binding to more than one mRNA molecule (for example, because alternative splicing can lead to a single gene encoding a set of transcripts, because of homology, or because of repetitive or low-complexity regions).
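
The PM/MM design and the in silico search described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the report: the "changed" middle residue is taken to be swapped for its complement, probe sequences are written in the same orientation as the transcript so that an exact substring match stands in for hybridization, and all sequences and names (`pm`, `toy_transcripts`, `tx_a`, and so on) are invented toy data rather than real array content.

```python
# Toy illustration only: sequences and transcript names are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_probe(pm_seq: str) -> str:
    """Derive an MM probe by changing the middle residue of a PM probe.

    Assumption: the middle base is replaced by its complement.
    """
    mid = len(pm_seq) // 2  # index 12 (the 13th base) for a 25mer
    return pm_seq[:mid] + COMPLEMENT[pm_seq[mid]] + pm_seq[mid + 1:]


def transcripts_hit(probe: str, transcripts: dict[str, str]) -> list[str]:
    """Return the names of transcripts containing an exact match to the probe.

    Assumption: exact substring identity is used as a crude proxy for binding.
    """
    return sorted(name for name, seq in transcripts.items() if probe in seq)


pm = "ATGGCCAAGTTGACCGATCCGTTAC"  # invented 25mer PM probe
mm = mismatch_probe(pm)

toy_transcripts = {
    "tx_a": "GGG" + pm + "TTT",   # contains the probe target
    "tx_b": "CC" + pm + "AAAA",   # e.g. a splice variant sharing the same region
    "tx_c": "GATTACAGATTACAGATTACAGATTACA",  # unrelated sequence
}

pm_hits = transcripts_hit(pm, toy_transcripts)  # probe matching two transcripts
mm_hits = transcripts_hit(mm, toy_transcripts)  # the MM probe matches none
```

In this toy case the single PM probe matches both `tx_a` and `tx_b`, which is the kind of one-probe, many-transcripts situation the paragraph describes.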

Not only do some probesets target multiple transcripts; the reverse is also true, in that multiple probesets can target a single transcript. This can occur, for example, with probesets designed to identify different splice variants of the same gene, or where one probeset is designed to identify a gene family whilst another targets a particular family member (see figure).
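
One way to picture this many-to-many relationship is as a set of (probeset, transcript) pairs indexed in both directions. The sketch below is only an illustration under that assumption; the identifiers (`ps_family`, `tx_1`, and so on) are hypothetical, and it does not describe how ADAPT or any Affymetrix annotation is actually stored.

```python
from collections import defaultdict

# Hypothetical probeset-to-transcript pairs, e.g. as produced by a sequence search.
pairs = [
    ("ps_family", "tx_1"),   # a probeset designed against a gene family
    ("ps_family", "tx_2"),
    ("ps_member", "tx_1"),   # a probeset specific to one family member
    ("ps_single", "tx_3"),   # a simple one-to-one case
]


def build_indexes(pairs):
    """Index the pairs in both directions so either side can be looked up."""
    by_probeset = defaultdict(set)
    by_transcript = defaultdict(set)
    for probeset, transcript in pairs:
        by_probeset[probeset].add(transcript)
        by_transcript[transcript].add(probeset)
    return by_probeset, by_transcript


by_probeset, by_transcript = build_indexes(pairs)

# Probesets that target more than one transcript.
multi_target = sorted(ps for ps, txs in by_probeset.items() if len(txs) > 1)

# Transcripts that are targeted by more than one probeset.
shared = sorted(tx for tx, pss in by_transcript.items() if len(pss) > 1)
```

Here `multi_target` picks out `ps_family`, while `shared` picks out `tx_1`, which is targeted by both `ps_family` and `ps_member`.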

Identifying these situations is useful when considering experimental data in which evidence from a particular probeset is weak. If all the other probesets targeting the same transcript behave similarly, this can provide supporting evidence; if they behave differently, it may be possible to discount the probeset from further analysis. We have developed an online database, ADAPT, that allows these complex relationships to be investigated.
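
The idea of checking whether sibling probesets "behave similarly" could be sketched as follows. This is an assumption-laden illustration, not the method used by ADAPT: similarity is measured here with a Pearson correlation across samples, the cut-off of 0.8 is arbitrary, and the probeset names and expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def sibling_support(target, siblings, profiles, threshold=0.8):
    """Split sibling probesets into those that agree or disagree with the target.

    The threshold is an arbitrary, illustrative choice.
    """
    agree = [s for s in siblings
             if pearson(profiles[target], profiles[s]) >= threshold]
    disagree = [s for s in siblings if s not in agree]
    return agree, disagree


# Invented expression profiles across four samples.
profiles = {
    "ps_weak": [1.0, 1.2, 1.1, 1.6],   # low-intensity probeset under scrutiny
    "ps_a":    [5.0, 6.1, 5.4, 8.2],   # follows the same trend
    "ps_b":    [7.0, 5.0, 6.5, 3.0],   # follows the opposite trend
}

agree, disagree = sibling_support("ps_weak", ["ps_a", "ps_b"], profiles)
```

With these toy values `ps_a` supports the weak probeset and `ps_b` does not; what to conclude from such a split remains a judgment for the analyst.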

Comparisons between protein and mRNA gene expression data

A significant opportunity exists to place proteomics and microarray data side by side, allowing comparison of changes in gene expression at the transcript and protein level. Doing this requires complex mappings between genomic, transcript and protein sequences, a difficult task because of the structure of existing databases and the complex many-to-many relationship that exists between genes and gene products. We have been developing software tools and analysis strategies to support these analyses in collaboration with Professor Tony Whetton at the University of Manchester and the Kouskoff and Lacaud groups at the Paterson Institute.

The complex many-to-many relationships that exist between Affymetrix probesets (blue balls) and transcripts (yellow balls) for the HGU133 array.


Publications listed on page 56
