SFAF2016 Meeting Guide Final 3
11th Annual Sequencing, Finishing, and Analysis in the Future Meeting
QUALITY ASSESSMENT AND VALIDATION CRITERIA – TOWARDS THE DEFINITION OF TABLE 1.
Wednesday, 1st June 20:00 La Fonda NM Room (1st floor) Poster (PS‐1b.09)
Dominika Borek, Maciej Puzio, Zbyszek Otwinowski
UT Southwestern Medical Center
Although next-generation sequencing (NGS) provides the means to study properties of nucleic acids on unprecedented scales, concise measures for assessing the confidence of results from NGS experiments are lacking. Tens to hundreds of statistical indicators are currently available, each separately providing information about (1) the quality of the sequencing library and the material used to generate it, (2) the performance of the equipment, (3) potential biases in the results, and other important sequencing-related features. However, the average consumer of NGS technology is rarely in a position to integrate all of this information efficiently. As a result, decisions about whether an experiment was successful and whether its results are trustworthy are frequently arbitrary. Comparative analyses and meta-analyses are most affected: differences are attributed to biological phenomena when they frequently originate from differences in the experimental approach. The lack of transparent validation criteria leads not only to incorrect or sub-optimal interpretation of results but also to expensive over-sequencing.
We have developed alignment-free metrics that provide transparent and comprehensive validation of NGS experiment results and define the so-called Table 1, which concisely summarizes the quality of an experiment and its data analysis, so that NGS users and reviewers of publications and grant applications can quickly, and yet with high certainty, assess the quality of a particular NGS experiment.
Our approach is based on data mining of sequencing reads, which includes analysis of overdispersion properties. This is followed by the analysis of residuals to detect whether our models of the experiment are sufficiently complete.
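The abstract does not specify how overdispersion or residuals are computed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that k-mer counts taken directly from reads serve as the alignment-free observable, that a single-mean Poisson model is the baseline, and that the variance-to-mean ratio is the overdispersion measure. The function names `kmer_counts`, `dispersion_index`, and `pearson_residuals` are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def kmer_counts(reads, k):
    """Count every k-mer across all reads (no reference or alignment needed)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def dispersion_index(values):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 if overdispersed."""
    return pvariance(values) / mean(values)


def pearson_residuals(values):
    """Residual of each count against a single-mean Poisson model (assumed baseline)."""
    m = mean(values)
    return [(v - m) / m ** 0.5 for v in values]


# Toy reads, invented for illustration only.
reads = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
counts = kmer_counts(reads, k=4)
values = list(counts.values())

print("dispersion index:", dispersion_index(values))
for kmer, resid in zip(counts, pearson_residuals(values)):
    print(kmer, round(resid, 3))
```

Large, structured residuals under such a baseline would suggest that the model of the experiment is missing an error source. A real analysis would need a far richer model than a single Poisson mean.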
This approach partitions uncertainty into components related to individual error sources and estimates the magnitude of each source. Together, these estimates directly assess the quality of NGS experiments and contribute to the validation of NGS results.
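As a rough illustration of what partitioning uncertainty can look like, the sketch below splits the variance of replicate counts into a sampling component and an excess component. It is an assumption-laden example, not the decomposition used by the authors: it assumes a negative-binomial-style relation Var = mean + phi * mean^2, with the Poisson term standing in for counting noise and the remainder attributed to all other error sources combined. The function name `partition_variance` and the input counts are hypothetical.

```python
from statistics import mean, variance


def partition_variance(counts):
    """Split sample variance into Poisson sampling noise and excess variance.

    Assumes Var = mean + phi * mean**2 (method-of-moments estimate of phi).
    """
    m = mean(counts)
    total = variance(counts)            # unbiased sample variance
    sampling = m                        # variance expected from Poisson counting alone
    excess = max(0.0, total - sampling) # everything not explained by counting noise
    phi = excess / m ** 2               # estimated overdispersion parameter
    return {"mean": m, "total": total, "sampling": sampling,
            "excess": excess, "phi": phi}


# Hypothetical replicate counts for one feature, invented for illustration.
result = partition_variance([10, 20, 30, 40])
print(result)
```

With several identified error sources, the excess term would itself be divided among them. This two-way split only shows the bookkeeping.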