Shotgun Sequencing

reu.dimacs.rutgers.edu

First Presentation - DIMACS REU

Shotgun Sequencing
• Multiple copies of genome
• Sheared random fragments
• BIG DATA
• Sequenced Reads
• Contig Assembly
• Scaffold Assembly
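The shearing and read-sampling steps can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: reads have a fixed length, start positions are uniformly random, and there are no sequencing errors. The function name `shotgun_reads` and all parameter values are hypothetical, and the genome is random made-up data.

```python
import random

def shotgun_reads(genome, n_reads, read_len, seed=0):
    """Sample fixed-length reads from uniformly random start positions."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]

# Toy genome: 200 random bases (illustrative data only).
genome = "".join(random.Random(1).choice("ACGT") for _ in range(200))
reads = shotgun_reads(genome, n_reads=50, read_len=25)

# Mean coverage (depth) = total sequenced bases / genome length.
coverage = len(reads) * 25 / len(genome)
print(f"{len(reads)} reads, mean coverage {coverage:.2f}x")
```

Contig and scaffold assembly then try to recover the genome from the overlaps between such reads.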


Assembly difficulties
• 0.1%-15% per-base error rate, depending on technology
• One solution: remove all infrequent K-mers
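A minimal sketch of the "remove infrequent K-mers" idea: count every K-mer across the reads and keep only those seen at least some minimum number of times. The helper names and the `min_count` threshold are assumptions for illustration; the slides do not specify a threshold. The reads below are made-up toy data.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def frequent_kmers(counts, min_count):
    """Keep K-mers seen at least min_count times (threshold is an assumption)."""
    return {kmer for kmer, c in counts.items() if c >= min_count}

# Three identical reads plus one read with a single substituted base.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
counts = count_kmers(reads, k=4)
solid = frequent_kmers(counts, min_count=2)
print(sorted(solid))
```

In this toy case the three K-mers that overlap the substituted base each occur once and are dropped. The filter relies on coverage being roughly even, so that real K-mers are rarely infrequent.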


Base call error filtering
anana-slug.soe.ucsc.edu


Single Cell Genomics


Single cell problems
www.nature.com


Single cell problems
• Heterogeneous Sample Coverage
• Amplified Single Cell Sample Coverage
www.neb.uk.com
Nikolenko et al.


A possibility
• Collapse Hamming distance balls around most frequent K-mers
www.math.cornell.edu
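One way to read "collapse Hamming distance balls" is the greedy sketch below: visit K-mers from most to least frequent, and fold each one into the first already-chosen center within a given Hamming radius, otherwise make it a new center. This is an assumed interpretation, not necessarily the procedure used in the cited work; the function names, the radius of 1, and the counts are all hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def collapse_hamming_balls(counts, radius=1):
    """Greedily merge each K-mer into the most frequent center within `radius`."""
    centers = {}
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    for kmer, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for center in centers:  # insertion order = most frequent center first
            if hamming(kmer, center) <= radius:
                centers[center] += c
                break
        else:
            centers[kmer] = c  # no nearby center: this K-mer starts a new ball
    return centers

# Made-up counts: two low-count neighbours of ACGT and one unrelated K-mer.
counts = Counter({"ACGT": 10, "ACGA": 1, "TCGT": 2, "GGGG": 5})
collapsed = collapse_hamming_balls(counts)
print(collapsed)
```

Unlike a global frequency cutoff, this keeps a low-count K-mer such as GGGG when nothing more frequent lies nearby, which matters when coverage is uneven. The pairwise comparison is quadratic in the number of centers, so it is only suitable for small illustrations.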


How can this be improved?
• First: Does this really work? How well?
  – How many real K-mers do we lose?
  – What kind of data performs best?
• More biologically sophisticated heuristics
  – Ex. Maximum Likelihood/Maximum entropy
• Find pairs of reliable K-mers with reliable distance estimate
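The question "how many real K-mers do we lose?" can be measured directly when a reference sequence is available, for example with simulated reads. The sketch below compares the K-mers kept by any filter against those of the reference. The names `kmer_set` and `filter_report` are hypothetical, and the reference and kept set are made-up toy data.

```python
def kmer_set(seq, k):
    """All distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def filter_report(reference, kept, k):
    """Compare the K-mers a filter kept against the reference's real K-mers."""
    real = kmer_set(reference, k)
    lost = real - kept        # real K-mers the filter discarded
    false_kept = kept - real  # erroneous K-mers the filter let through
    return {
        "real": len(real),
        "lost_real": len(lost),
        "false_kept": len(false_kept),
        "lost_fraction": len(lost) / len(real),
    }

reference = "ACGTACGG"                       # toy reference
kept = {"ACGT", "CGTA", "GTAC", "TTTT"}      # toy output of some filter
report = filter_report(reference, kept, k=4)
print(report)
```

Running the same report on data sets with different coverage profiles is one way to approach "what kind of data performs best?".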
