European Human Genetics Conference 2007 June 16 – 19, 2007 ...
Genomics, technology, bioinformatics
sequence searching for putative ncRNAs in this region. Among 86
stable hairpin candidates, of 100 nucleotides and not overlapping with
repeats, we selected 30 to check their expression by northern blot and
primer extension, and we identified 21 new primate-specific ncRNAs.
Two correspond to new microRNAs and 19 to a new class and family
of ncRNAs. Computational analysis based on conservation of the
secondary structure indicates that 15 of them are genuine ncRNAs. Most
predicted genes targeted by these two miRNAs were common to both,
and 11% of the targets encode proteins that participate in the development
of the central nervous system, suggesting a role of 15q11.2
miRNAs in neurological functions and disorders.
P1283. High throughput genomics using Applied Biosystems’
SOLiD System
M. D. Rhodes, G. Costa, J. Ichikawa, J. Malek, A. Sheridan, L. Apone, C. Hendrickson,
H. Peckham, S. Ranade, J. Sorenson, K. McKernan, A. Blanchard;
Applied Biosystems, Foster City, CA, United States.
Massively parallel sequencing systems are making genetic analysis
cheaper and enabling experiments capable of answering increasingly
complex biological questions. The SOLiD system is a new platform
that uses either fragment or mate-paired libraries to generate >1-2 GB of
data/run with >99.99% consensus accuracy. We will present a summary
of the current chemistry of the SOLiD system together with
data demonstrating sequence quality and coverage for a
number of increasingly complex targets, including bacteria, yeast, C.
elegans and mammals. The SOLiD chemistry and instrumentation
can be readily adapted to a number of applications by modifying
the ways in which the input nucleic acids are prepared and the output
data are analyzed. Some of the applications under development with
collaborators will be presented, as well as data demonstrating the utility
of the system.
P1284. Next generation sequencing - Opportunities and
constraints in pooling and quantitative sequencing
F. M. De La Vega1, F. C. L. Hyland1, J. Sorenson1, E. Cuppen2, K. McKernan3;
1Applied Biosystems, Foster City, CA, United States, 2Hubrecht Lab, Utrecht,
The Netherlands, 3Applied Biosystems, Beverly, MA, United States.
Next generation sequencing promises to address a range of applications,
including identification of somatic mutation profiles, gene expression
by tag counting, and measurement of allele frequencies in
pooled samples of cases and controls in association studies. We developed
a model to simulate digital sequencing with pooled samples
in the presence of error. We found that, for pooling and quantitative
sequencing, the number of samples that can be pooled and the minor
allele frequency of variants that can be detected are critically dependent
on the threshold for SNP calling, which in turn is strongly influenced by
the measurement error rate. As next generation sequencing platforms
typically produce short reads (25-35 bp), coverage needs to increase
over 20x to compensate. Increasing the coverage improves the estimate
of the error rate, but cannot overcome problems with detecting
very low frequency variants with large numbers of pooled samples. We
validated this model through empirical sequencing by oligonucleotide
ligation and detection (Applied Biosystems SOLiD(TM) system) of 81
PCR amplicons from exons of EMS-mutagenized C. elegans worms
encompassing ca. 25 kb of sequence with over 1500x coverage. Amplicons
were pooled down to a 1:100 ratio (1:200 ratio for alleles). The
results were compared with di-deoxy sequencing data generated independently.
Our results suggest that, although coverage needs to increase
significantly when using short reads as compared with di-deoxy
sequencing, a low platform error rate is the most critical factor for detecting
allele variants in pooled samples or mixtures by next generation
sequencing platforms.
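The dependence of detection on the calling threshold and the error rate can be illustrated with a minimal sketch. This is not the authors' model. It assumes independent reads at a site, a symmetric per-base error rate that converts one allele into the other, and a SNP call whenever the variant read count reaches the smallest threshold whose probability under error alone is at most alpha. All function names, parameters and default values are hypothetical.

```python
from math import exp, lgamma, log, log1p


def log_binom_pmf(k, n, p):
    """Log of the binomial probability P(X = k) for n trials, with 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log1p(-p)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); terms are built in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    return min(1.0, sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1)))


def call_threshold(coverage, error_rate, alpha=1e-3):
    """Smallest variant-read count that error alone reaches with probability <= alpha."""
    k = 0
    while upper_tail(k, coverage, error_rate) > alpha:
        k += 1  # stops at coverage + 1 at the latest, where the tail is empty
    return k


def detection_power(coverage, error_rate, allele_freq, alpha=1e-3):
    """Probability that a variant at pool frequency allele_freq reaches the threshold."""
    k = call_threshold(coverage, error_rate, alpha)
    # Assumed read model: true variant reads that survive error,
    # plus reference reads miscalled as the variant.
    p_variant_read = allele_freq * (1 - error_rate) + (1 - allele_freq) * error_rate
    return upper_tail(k, coverage, p_variant_read)


if __name__ == "__main__":
    # One heterozygous carrier among 100 diploid samples: allele frequency 1/200.
    for error_rate in (0.001, 0.01):
        print(error_rate,
              call_threshold(1500, error_rate),
              detection_power(1500, error_rate, 1 / 200))
```

Under these assumptions, raising the error rate raises the calling threshold faster than the expected number of true variant reads, so detection power for a 1:200 allele falls even at 1500x coverage.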
P1285. Measuring the quality of computer tools used in
diagnostic genetic testing
A. Devereau1,2, N. Walker2;
1National Genetics Reference Laboratory, Manchester, United Kingdom, 2University
of Manchester, Manchester, United Kingdom.
As part of the EuroGentest project (www.eurogentest.org) we investigated
quality measurement of computer tools used within diagnostic
laboratories. A literature survey highlighted the need to provide specific
information, such as what the tool does and who it is for; information
about data used by the tool; information about tool operation; and information
on its performance, such as sensitivity and specificity, or the
completeness of its data coverage. A survey of laboratories found that
most tools in use either serve particular pieces of equipment or are databases.
It also showed that quality assessment is often missing or unstructured.
Furthermore, there is no trustworthy source to validate the
quality of the clinically most critical features of the tools. We propose
that tools be categorised according to their purpose, with a specific
list of features developed for each category. Assessment of performance
requires standardised tests for parameters specific to each
tool category. Expert groups for each tool type are needed to propose
and review the features and performance measures. One important
category of tool that we found in laboratories was sequence analysis
tools. We have used previous work to propose a list of specific features
and performance measures for these tools. We have investigated presentation
of these data using a ‘wiki’, i.e. a web site editable by its users.
This allows groups of experts to be formed for a particular tool category
and to collaboratively enter and review data about the tools.
P1286. In search of tissue specific regulators in periodontium - a
bioinformatic approach.
A. M. Lichanska, N. Pham;
School of Dentistry, Brisbane, Australia.
Tissue specific gene expression can be regulated by tissue specific
promoters, enhancers, silencers, transcription factors, differential
methylation, tissue specific alternative splicing, as well as other transcriptional
and post-transcriptional factors. Many methods exist for
studying regulatory elements; however, they are mostly
useful in cases where some information about the promoters active
in a given tissue is available.
While tooth development is well described, the regulation of gene
expression in the tissues that mature after tooth development is
complete is unclear. Periodontal ligament tissue (PDL) is essential for
structural support of the teeth (it attaches the root to the bone). Understanding
what makes it so special is essential for the development
of regenerative treatments for this tissue.
Expression profiling of primary cell cultures of periodontal
ligament tissue and outer gum tissue (gingiva) was performed using
Affymetrix HU133A arrays. The analysis identified 333 genes differentially
regulated in these tissues. This set of genes was then subjected
to promoter analysis to identify CpG islands and promoter
binding sites. We used a number of tools, such as Promoter-Express,
TRES, TFSEARCH, PAINT, CpGProD, CpG islands searcher,
Methylator and MethCGI, to generate an overview of the promoters of
the differentially regulated genes. As a result, we identified signature
promoter features of these differentially expressed genes.
Currently, we are analyzing the role of differential methylation between
the two tissues and the role of selected transcription factors
identified in this screen.
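As an illustration of the kind of test that underlies CpG island identification, here is a minimal sketch. It does not reproduce any of the tools named above. It assumes the widely used Gardiner-Garden and Frommer style criteria (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) applied to a single candidate sequence. The function names and default thresholds are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() finds every CpG
    gc_fraction = (c + g) / n
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC-content and observed/expected thresholds."""
    if len(seq) < min_len:
        return False
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice such a test would be run over sliding windows of the promoter region of each differentially regulated gene; the windowing step is omitted here for brevity.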
P1287. Direct Sequencing Quality Control: a Novel Software
Approach to Reducing Variant Review Time and Labor
S. Jankowski, E. Vennemeyer, K. Hunkapiller, P. Shah;
Applied Biosystems, Foster City, CA, United States.
With the completion of the Human Genome Project, the shift from de
novo sequencing to direct sequencing (resequencing) has created the
need for more accurate variant detection for medical research and clinical
diagnostics. The bottleneck in the workflow (from DNA extraction
to result data analysis) has been cited as taking up to 70% of a researcher’s
time per project, due to manual review of individual nucleotide
bases. This review has been required because of the need for
confidence in the variant result.
Confidence can be increased by applying diligent quality control
metrics, including Quality Values for DNA traces and confidence
values for variant validity. Drawing on Applied Biosystems’ experience,
the system will filter out low quality data. The software will then
direct users to review only low confidence variants.
A flexible, workflow-based system is being built to enable researchers
to obtain high confidence results in less time. Methods for filtering
low quality data based on optimal settings and quality visualization
tools will be integrated into the system, along with simpler variant
review and reporting tools, to allow researchers to quickly analyze their
data.
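The filter-then-review workflow described above can be sketched as follows. This is an illustrative outline only, not the software described in the abstract. It assumes that trace Quality Values follow the common Phred convention, QV = -10 log10(P_error), and that each variant carries a confidence score between 0 and 1. The class, the function names and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    position: int
    trace_qv: float    # Phred-scaled quality value of the underlying base call (assumed)
    confidence: float  # variant confidence in [0, 1] (hypothetical scale)


def error_probability(qv):
    """Convert a Phred-scaled quality value into an error probability."""
    return 10 ** (-qv / 10)


def triage(calls, min_trace_qv=20, min_confidence=0.9):
    """Split calls into filtered (low quality), accepted, and manual-review buckets."""
    buckets = {"filtered": [], "accepted": [], "review": []}
    for call in calls:
        if call.trace_qv < min_trace_qv:
            buckets["filtered"].append(call)   # low quality data is dropped first
        elif call.confidence >= min_confidence:
            buckets["accepted"].append(call)   # high confidence: no manual review
        else:
            buckets["review"].append(call)     # only these reach the reviewer
    return buckets
```

With thresholds like these, manual review is limited to variants on good traces that still have low confidence, which is the reduction in review effort the abstract aims for.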