
PDF - White Rose Etheses Online


Chapter 1 - DNA sequencing - an overview

The recent improvements in sequencing have made it possible to include genome sequence information as part of a wider research project with relative speed and ease, and at low cost.

EST sequencing

The introduction of high-throughput sequencing technologies has also improved the scope for transcriptomic investigations, studying the profile of expression in coding regions of the genome. Other technologies, in particular microarray analysis platforms, have contributed to this field of study, allowing for the complete comparison of expression profiles in different tissues. This information can be particularly useful in the characterisation of disease, for example in comparing the genome-wide expression profiles of cancerous tissue with an equivalent healthy sample (Volinia, Calin et al. 2006), or in studying the knock-on effects of variation in the levels of expression of a single gene product (e.g. Branney, Faas et al. 2009).

Expressed sequence tags (ESTs) are reads sequenced from cDNA prepared from a sample, providing insight into the expression profile of the sample based on the mRNA present. Rather than determining the full cDNA sequence, ESTs are read from either end of the transcript and can be used with a reference genome to identify their original position (Nagaraj, Gasser et al. 2007; Scheibye-Alsing, Hoffmann et al. 2009). The high-throughput nature of modern sequencing methods allows for a picture of the full expression profile (the transcriptome) of a sample to be built, based on the assumption that the number of ESTs sequenced for a gene is directly proportional to the level of expression of that gene (after copy number variations have been considered).

Amplicon sequencing

High-throughput sequencing has also allowed for the identification and study of polymorphisms between the genomes of individuals of the same, or closely related, species. Primer sets are used in PCR to amplify specific target regions of sequence from each genome into ‘amplicons’, which are then sequenced. By sequencing amplicons on a massively parallel platform, the amplicon sequences are covered at great depth, allowing for even rare polymorphisms to be identified when the sequences are compared (Rosani, Varotto et al. 2011).
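
The link between sequencing depth and the detection of rare polymorphisms can be illustrated with a simple binomial model. The sketch below is an illustration only and is not taken from the thesis or the cited work: it assumes that each read covering a site carries the variant independently with probability equal to the variant frequency, and it ignores sequencing error and amplification bias. The function name `prob_variant_seen` and the example depths, frequency and read threshold are hypothetical.

```python
from math import comb


def prob_variant_seen(depth: int, freq: float, min_reads: int) -> float:
    """Probability that a variant at frequency `freq` appears in at least
    `min_reads` of `depth` reads covering a site.

    Assumes independent reads (binomial sampling) and no sequencing error.
    """
    # Probability of observing fewer than `min_reads` variant-carrying reads.
    p_fewer = sum(
        comb(depth, k) * freq**k * (1 - freq) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical scenario: a polymorphism present in 1% of templates,
    # required to appear in at least 3 reads before it is considered.
    for depth in (20, 200, 2000):
        p = prob_variant_seen(depth, freq=0.01, min_reads=3)
        print(f"depth {depth:>5}: P(seen in >= 3 reads) = {p:.4f}")
```

Under these assumptions the detection probability rises steeply with depth, which is the reason deep amplicon coverage makes rare variants observable at all.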

SNP analysis

The large-scale identification of single nucleotide polymorphisms (SNPs) throughout genomic sequences has been made possible by high-throughput sequencing. The process requires that sequencing be performed at high depth, so that each potential polymorphism site is covered multiple times. If this criterion is fulfilled, the reads may be mapped to a reference genome sequence and positions at which the nucleotide sequence differs can be identified as candidate SNP sites. Many methods have been developed that determine the confidence with which an SNP can be called at a particular site (e.g. McKenna, Hanna et al. 2010; Le and Durbin 2011), but as a rule the more consistently that a polymorphism is detected (i.e. the more it appears in the sequencing reads covering that site), the greater confidence can be had in the assignment of a SNP at that position (Nielsen, Paul et al. 2011).

SNP calling is an important field in the study of population genomics, as it allows for the genetic differences between individuals of a species to be identified and studied (Nielsen, Paul et al. 2011). It forms a key part of studies such as the 1000 Genomes Project (www.1000genomes.org), which aim to elucidate the key genomic differences between individuals and populations.
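
The general rule described above (map reads to a reference, then trust a mismatch more the more consistently it appears in the reads covering that site) can be sketched as a naive threshold-based caller. This is a toy illustration, not the method of McKenna, Hanna et al. (2010), Le and Durbin (2011) or any other published tool: it assumes ungapped alignments given as hypothetical `(start, sequence)` pairs, ignores base qualities, and uses arbitrary `min_depth` and `min_fraction` thresholds.

```python
from collections import Counter


def call_candidate_snps(reference, aligned_reads, min_depth=4, min_fraction=0.8):
    """Return candidate SNP sites as (position, ref_base, alt_base, alt_count, depth).

    `aligned_reads` is a list of (start, sequence) pairs: ungapped alignments
    with 0-based start positions on `reference`. Thresholds are illustrative.
    """
    # Count the bases observed at every reference position (a simple pileup).
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    candidates = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads cover this site to say anything
        base, count = counts.most_common(1)[0]
        # Report the site only if a non-reference base dominates the reads.
        if base != reference[pos] and count / depth >= min_fraction:
            candidates.append((pos, reference[pos], base, count, depth))
    return candidates


if __name__ == "__main__":
    # Made-up reference and reads, purely for demonstration.
    ref = "ACGTACGT"
    reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT"), (0, "ACGAACGT")]
    for site in call_candidate_snps(ref, reads):
        print(site)
```

Production callers replace the fixed thresholds with probabilistic models that account for base and mapping quality, but the dependence on depth and on consistency across reads is the same.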

