Dr. Richard Durisen is currently using the Data Capacitor and the TeraGrid to simulate gravitational instabilities in protoplanetary discs in order to study the origins of gas giant planets. Durisen's research team uses Pople, a shared-memory system at PSC, as well as NCSA's Cobalt, to produce more than 50 TB of simulation data. The second part of their workflow is to analyze the simulation data produced; this step was performed at Mississippi State University on their Raptor cluster.

By using the Data Capacitor to bridge IU, PSC, and NCSA, Durisen's team can see results as if they were happening locally. Should there be a problem with the run or the input parameters, it is possible to stop the simulation and correct the problem without wasting valuable CPU time. Additionally, being able to see the simulation unfold can give the researcher insight into the next run (or set of runs) to be performed.
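To make the monitoring pattern concrete, here is a minimal sketch in Python of the kind of watcher a researcher might run against the shared mount while a remote simulation writes its output. The mount point, snapshot naming, and sanity check are hypothetical illustrations, not details of Durisen's actual workflow.

```python
import glob
import os
import time

# Hypothetical mount point on the Data Capacitor's wide-area Lustre
# filesystem; the real path and output naming scheme will differ.
RUN_DIR = "/N/dc/scratch/durisen/run042"

def newest_snapshot(run_dir):
    """Return the most recently written snapshot file, or None."""
    snapshots = glob.glob(os.path.join(run_dir, "snapshot_*.dat"))
    return max(snapshots, key=os.path.getmtime) if snapshots else None

def looks_unphysical(path):
    """Placeholder sanity check on a snapshot (e.g. truncated output).
    A real check would parse the simulation's file format."""
    return os.path.getsize(path) == 0

while True:
    latest = newest_snapshot(RUN_DIR)
    if latest:
        print(f"latest snapshot: {latest}")
        if looks_unphysical(latest):
            # The run can now be stopped at the compute site before
            # more CPU time is spent on bad input parameters.
            print("snapshot failed sanity check; flagging run for abort")
            break
    time.sleep(60)  # poll once a minute; the remote mount behaves like a local one
```

Because all three sites see the same filesystem, a script like this needs no transfer step at all; it simply reads the output as it appears.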

By using the Data Capacitor to bridge IU and MSU, Durisen's team was able to take advantage of free compute cycles that were made available. Normally this would have required the researcher to transfer the data between sites, but the Data Capacitor made that step unnecessary.

Conclusion

Using the Data Capacitor as a central filesystem has permitted data sharing across distance and has simplified workflows that would previously have been more complex and involved significant data transfer. When data is centrally located, data management becomes significantly easier.

The Data Capacitor also provides a resource for storing large data sets that might overwhelm a smaller institution. For example, the scratch filesystem at MSU was 5 TB in size; Durisen's team would have had to fill that filesystem ten times over to complete their analysis.

One might expect significant performance degradation when using a wide-area filesystem, but this was not always the case. In each case where data was produced across distance, there was no performance differential between the local and remote mounts. In the case where data was analyzed at MSU, there was a 40% performance difference between the local and remote filesystems. However, this figure does not take into account the considerable time that would be required to transfer 50 TB of data to the local filesystem.
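To put that trade-off in perspective, a back-of-the-envelope estimate of the transfer time is easy to work out. The link speeds below are illustrative assumptions, not measurements from the project:

```python
# Rough time to move the 50 TB data set over a dedicated link,
# assuming full, continuous utilization (an optimistic best case).
DATA_BITS = 50 * 1e12 * 8  # 50 TB (decimal) expressed in bits

for label, gbps in [("1 Gb/s", 1), ("10 Gb/s", 10)]:
    seconds = DATA_BITS / (gbps * 1e9)
    print(f"{label:>8}: {seconds / 86400:.1f} days")

# 1 Gb/s  -> ~4.6 days
# 10 Gb/s -> ~0.5 days (roughly 11 hours)
# Real transfers add protocol overhead, contention, and restarts.
```

Under these assumptions, avoiding the transfer entirely more than compensates for the 40% remote-mount penalty on a single analysis pass.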
