03.01.2015 Views

prepublication copy - The Department of Astronomy & Astrophysics ...


support be proposed and highly reviewed. Proposals to NSF’s Astronomy and Astrophysics Grants program or to the ATI program could include support for the development of software tools related to data reduction and analysis, and archiving.

Because data archives are so central to modern astronomy, it is a matter of concern that no model exists for long-term preservation (curation) of ground-based data once observing projects or facilities are no longer funded.[10] To realize the full benefit of ground-based data, especially from surveys, it is therefore necessary for NSF to adopt a model of long-lived data archive centers for long-term curation, like that used by NASA (for example IPAC, MAST, and HEASARC) and by the Canadian Astronomy Data Center (CADC), with capabilities similar to those available through existing successful archives.

A coordinated inter-agency effort will be particularly important with the advent of the petabyte-scale surveys anticipated in the future. Solar physics is one example of an opportunity for synergy in combining ground-based and space data. The rapidly growing database from existing facilities, including SDO, presents an opportunity to combine complementary datasets in order to get a balanced view of the dynamic Sun. This investment is likely to pay a large dividend when ATST comes on line in 2017.

The recent report “Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century,” by the National Science Board, and the National Academies report “Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age” both recognized the growing importance of long-term curation, and the NSB report recommended that the NSF develop a global strategy to address it. The NSF Office of Cyberinfrastructure DataNet program (Sustainable Digital Data Preservation and Access Network Partners), which is partnering with research institutions to develop data preservation facilities of general utility to the research community and which includes participation by astronomers, is an important first step in the process.

RECOMMENDATION: NSF, NASA, and DOE should plan for effective long-term curation of, and access to, large astronomical data sets after completion of the missions or projects that produced these data, given the likely future scientific benefit of the data. NASA currently supports widely used curated data archives, and similar data curation models could be adopted by NSF and DOE.

The committee estimated the cost of achieving these data archiving goals on the basis of an informal survey of existing archives. Data gathered by the survey's infrastructure study groups indicate that adding a new survey similar to SDSS to the portfolio of an existing archive center would involve startup costs of about $0.4M (approximately $0.15M for personnel and $0.25M for hardware) and an annual operating budget of about $0.2M to $0.3M ($0.15M for personnel and the remainder for maintenance and upgrades). Starting a new archive from scratch would be significantly more expensive, so it would be particularly cost-effective for NSF and DOE to coordinate with NASA to use existing archive and data distribution centers. This would add the scientific advantage of having a core of resident astronomers, computer scientists, and technical support staff.[11] Supporting additional archiving and long-term curation for a few existing observatories and instruments would cost roughly $15M per decade. Numerical codes could also be curated.
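As a rough check on the scale of these figures, the sketch below adds up the quoted startup and annual operating costs for one SDSS-like survey over ten years. It is illustrative arithmetic only: the ten-year horizon, the flat annual cost, and all names in the code (`decade_cost_k`, the constants) are assumptions made for this example, not part of the committee's estimate.

```python
# Illustrative arithmetic only. Figures are the rounded estimates quoted in the
# text, expressed in thousands of dollars ($K) so that all sums are exact integers.
STARTUP_K = 400         # one-time startup cost, about $0.4M
ANNUAL_LOW_K = 200      # lower bound on annual operating budget, about $0.2M
ANNUAL_HIGH_K = 300     # upper bound on annual operating budget, about $0.3M
DECADE_BUDGET_K = 15000  # "roughly $15M per decade" quoted for additional curation


def decade_cost_k(startup_k: int, annual_k: int, years: int = 10) -> int:
    """Total cost in $K of hosting one survey for `years` years.

    Assumes a flat annual cost with no inflation (an assumption of this sketch).
    """
    return startup_k + annual_k * years


low = decade_cost_k(STARTUP_K, ANNUAL_LOW_K)
high = decade_cost_k(STARTUP_K, ANNUAL_HIGH_K)
print(f"One SDSS-like survey over a decade: ${low / 1000:.1f}M to ${high / 1000:.1f}M")

# How many such surveys the quoted decade figure would cover, at each bound.
print(f"Surveys covered by $15M: {DECADE_BUDGET_K // high} to {DECADE_BUDGET_K // low}")
```

Under these assumptions a single survey added to an existing center costs about $2.4M to $3.4M per decade, so the $15M figure is of the same order as hosting four to six such surveys. The text does not state how the $15M estimate was derived, so this comparison is indicative only.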

Data Reduction and Analysis Software

Major instruments with wide public use on federally supported telescopes and facilities would benefit greatly from pipelines that deliver calibrated data and data products for storage in a public archive. General-purpose community analysis software packages like the IRAF and AIPS packages currently used

[10] An exception is the 2MASS survey, which resides within the Infrared Science Archive at IPAC.

[11] For a copy of this NRC report, see http://www.nap.edu/catalog.php?record_id=11909. Accessed May 2010.

PREPUBLICATION COPY—SUBJECT TO FURTHER EDITORIAL CORRECTION<br />
