04.04.2013 Views

Transcriptional Characterization of Glioma Neural Stem Cells Diva ...



which transcript it was extracted. If the cut site was less than 17 nt from the end of the transcript, we extended the sequence with adenine stretches, representing the poly-adenine (poly-A) tail. In these cases, additional upstream tags were also extracted until one tag fully contained in the Ensembl cDNA sequence was obtained. We disregarded Ensembl transcripts not belonging to a gene of biotype "protein_coding" or "processed_transcript". Ensembl annotates genes of several other biotypes, e.g. pseudogenes and small RNAs, but those annotations are not based on full-length transcript sequences, so we would not expect to find valid virtual tags in those transcripts. For the majority of this study, we used a conservative subset of the virtual tags from SAGE Genie and Ensembl comprising 25,593 unique tags assigned to 15,103 genes (Table 5.2). Specifically, we used SAGE Genie tags extracted from the 3’-most cut site in RefSeq or MGC cDNAs having a poly-A tail or a polyadenylation signal, and Ensembl tags from transcripts of type "protein_coding" or "non_coding". Any virtual tags that mapped to multiple loci by these criteria were excluded. For certain analyses, we made use of more comprehensive virtual tag sets. In addition, we determined unique, perfect matches for tags to the genome using Bowtie as described above. We calculated a single expression value for each gene in each cell line by summing the counts of tags assigned to the gene.
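The final two steps of this procedure (excluding ambiguous virtual tags, then summing tag counts per gene) can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `gene_expression`, the dictionary layouts, and the example tags and gene names are hypothetical. Ambiguity is approximated here as a tag being assigned to more than one gene, which is a simplification of the multi-locus criterion described above.

```python
from collections import defaultdict


def gene_expression(tag_counts, tag_to_genes):
    """Sum tag counts per gene for one cell line.

    tag_counts:   {tag_sequence: observed count}  (hypothetical layout)
    tag_to_genes: {tag_sequence: set of gene IDs the virtual tag is assigned to}
    Tags with no assignment, or with more than one, are skipped.
    """
    expression = defaultdict(int)
    for tag, count in tag_counts.items():
        genes = tag_to_genes.get(tag, set())
        if len(genes) != 1:
            # Unassigned or ambiguous tags do not contribute to any gene.
            continue
        (gene,) = genes
        expression[gene] += count
    return dict(expression)


# Toy example with placeholder tags and gene names.
counts = {"AAAA": 5, "CCCC": 3, "GGGG": 7, "TTTT": 2}
mapping = {
    "AAAA": {"GENE_A"},
    "CCCC": {"GENE_A"},
    "GGGG": {"GENE_A", "GENE_B"},  # ambiguous, excluded
}
print(gene_expression(counts, mapping))
```

In this toy input, the tag "TTTT" has no assignment and "GGGG" is ambiguous, so only the two uniquely assigned tags contribute to the gene total.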

5.2 Array Comparative Genomic Hybridization<br />

We re-analysed the array comparative genomic hybridization (CGH) data described by Pollard et al. [404]. CGH was performed with Human Genome CGH Microarray 4x44K arrays (Agilent), using genomic DNA from each cell line hybridised in duplicate (dye swap) and normal human female DNA as reference (Promega). Log2 ratios were computed from processed Cy3 and Cy5 intensities reported by the software CGH Analytics (Agilent). We corrected for effects related to GC content and restriction fragment size using a modified version of the waves array CGH correction algorithm [271]. Log2 ratios were adjusted by sequential loess normalization on three factors: fragment GC content, fragment size, and probe GC content. These were selected after investigating dependence of log ratio on multiple factors, including GC content in windows of up to 500 kilobases centred around each probe. The Bioconductor package CGHnormaliter [506] was then used to correct for intensity dependence and log2 ratios were scaled to be comparable between arrays using the
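As a rough illustration of the log2 ratio computation and the sequential loess adjustment described in this section, the sketch below fits a simple local linear regression with tricube weights to each covariate in turn and subtracts the fitted trend. It is not the modified waves correction algorithm [271] itself. The function names, the `span` parameter and its default, and the choice to subtract the full fitted trend (which also centres the ratios near zero) are all assumptions made for illustration.

```python
import math


def log2_ratio(test_intensity, ref_intensity):
    """Log2 ratio of test over reference channel intensity."""
    return math.log2(test_intensity / ref_intensity)


def loess_fit(x, y, span=0.5):
    """Local linear fit with tricube weights; returns fitted y at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(span * n)))  # neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)  # at least 1, because the point itself has weight 1
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def sequential_loess(log_ratios, covariates, span=0.5):
    """Remove the loess trend on each covariate in turn (order matters)."""
    adjusted = list(log_ratios)
    for cov in covariates:
        trend = loess_fit(cov, adjusted, span)
        adjusted = [a - t for a, t in zip(adjusted, trend)]
    return adjusted


# Toy data: log ratios that depend linearly on a hypothetical GC covariate.
gc = [i / 19 for i in range(20)]
size = [float((i * 7) % 20) for i in range(20)]
ratios = [2 * g + 1 for g in gc]
residuals = sequential_loess(ratios, [gc, size])
print(max(abs(r) for r in residuals))
```

In practice the three covariates would be passed in the order given in the text (fragment GC content, fragment size, probe GC content), and a tested loess implementation would be preferable to this hand-written fit.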
