02.10.2015 Views

2010 Best Practices Competition IT & Informatics HPC

IT Informatics - Cambridge Healthtech Institute

IT Informatics - Cambridge Healthtech Institute

SHOW MORE
SHOW LESS
  • No tags were found...

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Published Resources for the Life Sciences<br />

250 First Avenue, Suite 300, Needham, MA 02494 | phone: 781‐972‐5400 | fax: 781‐972‐5425<br />

We also identified a need to export results and annotation from HCS Road to third party<br />

applications so researchers can perform calculations and generate visualizations that are<br />

not part of a common workflow. We use TIBCO Spotfire for many of our external<br />

visualizations because: it can retrieve data directly from the HCS Road database; it<br />

supports multiple user-configurable visualizations; it provides tools for filtering and<br />

annotating data, and it can perform additional analyses using internal calculations or by<br />

communicating with Accelerys PipelinePilot (SanDiego, CA). Figure 3 shows a Spotfire<br />

visualization for analyzing RNAi screening results. This workflow retrieves results and<br />

treatment information from the HCS Road database. The user is presented with<br />

information on the distribution of normalized values for each endpoint and can select<br />

wells that pass the desired activity threshold. Additional panels identify RNAi reagents<br />

where multiple replicate wells pass the threshold and genes where multiple different RNAi<br />

reagents scored as hits, an analysis that is unique to RNAi screening. Within Spotfire,<br />

HCS assay results can be cross-referenced with other information such as mRNA<br />

expression profiling to identify RNAi reagents whose phenotype correlates with levels of<br />

target expression in the assay cell line (not shown).<br />

Cell-level data<br />

Managing and analyzing cell-level data was a high priority in the development of HCS<br />

Road. Cell level data enables the analysis of correlations between measurements at the<br />

cellular level, the use of alternative data reduction algorithms such as the Kolmogorov-<br />

Smirnov distance 13; 14 , classification of subpopulations by cell cycle phase 15 , and other<br />

approaches beyond basic well-level statistics 16 . However, the volume of cell data in an<br />

HCS experiment can be very large. Storing cell data as one row per measurement per cell<br />

creates a table with large numbers of records and slows down data loading and retrieval.<br />

Because cell data is typically used on a per-plate/feature basis for automated analyses and<br />

for manual inspection, we chose to store it in files on the HCS Road file share (Fig. 1)<br />

rather than in the database. When cell data is needed, it is automatically imported into a<br />

database table using Oracle’s bulk data loading tools. When the cell measurements are no<br />

longer needed the records are deleted from the Road database (but are still retained in<br />

files). This controls database growth and improves performance compared to retaining<br />

large numbers of records in the database.<br />

ROI achieved:<br />

HCS Road currently supports target identification, lead identification and lead profiling<br />

efforts across multiple groups within BMS Applied Biotechnology. Scientists can analyze<br />

their experiments more rapidly and the time needed to load, annotate and review<br />

experiments has been reduced from days to hours. Integration with existing databases

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!