NASA Scientific and Technical Aerospace Reports

clusters with usually small IT budgets, it is important that the storage system ‘just work’ with relatively little administrative overhead.

Author

Unix (Operating System); Bandwidth; Data Storage; Object-Oriented Programming

20040121042 Sandia National Labs., Albuquerque, NM, USA

The Data Services Archive

Haynes, Rena A.; Johnson, Wilbur R.; NASA/IEEE MSST 2004 Twelfth NASA Goddard Conference on Mass Storage Systems and Technologies in cooperation with the Twenty-First IEEE Conference on Mass Storage Systems and Technologies; April 2004, pp. 261-271; In English; See also 20040121020

Contract(s)/Grant(s): DE-AC04-94AL-85000; No Copyright; Avail: CASI; A03, Hardcopy

As access to multi-teraflop platforms has become more available in the Department of Energy Advanced Simulation Computing (ASCI) environment, large-scale simulations are generating terabytes of data that may be located remotely from the site where the data will be archived. This paper describes the Data Service Archive (DSA), a service-oriented capability for simplifying and optimizing the distributed archive activity. The DSA is a distributed application that uses Grid components to allocate, coordinate, and monitor operations required for archiving large datasets. Additional DSA components provide optimization and resource management of striped tape storage.

Author

Computer Systems Simulation; Data Integration; Distributed Processing; Data Transfer (Computers); Massively Parallel Processors

20040121044 Illinois Univ., Chicago, IL, USA

Using DataSpace to Support Long-Term Stewardship of Remote and Distributed Data

Grossman, Robert L.; Hanley, Dave; Hong, Xin-Wei; Krishnaswamy, Parthasarathy; NASA/IEEE MSST 2004 Twelfth NASA Goddard Conference on Mass Storage Systems and Technologies in cooperation with the Twenty-First IEEE Conference on Mass Storage Systems and Technologies; April 2004, pp. 239-243; In English; See also 20040121020; No Copyright; Avail: CASI; A01, Hardcopy

In this note, we introduce DataSpace Archives. DataSpace Archives are built on top of DataSpace's DSTP servers and are designed not only to provide long-term archiving of data, but also to enable the archived data to be discovered, explored, integrated, and mined. DataSpace Archives are based upon web services. Web services' UDDI and WSDL mechanisms provide a simple means for any web service client to discover relevant archived data. In addition, data in DataSpace Archives can carry a variety of XML metadata, and the DSTP servers that underlie the DataSpace Archives provide direct access to this metadata. Unfortunately, web services today do not provide the scalability required to work with large remote data sets. For this reason, DataSpace Archives employ a scalable web service we have developed called SOAP+.

Author

Metadata; Data Storage; Data Transfer (Computers)

20040121045 Deutsches Elektronen-Synchrotron, Hamburg, Germany

dCache, the Commodity Cache

Fuhrmann, Patrick; NASA/IEEE MSST 2004 Twelfth NASA Goddard Conference on Mass Storage Systems and Technologies in cooperation with the Twenty-First IEEE Conference on Mass Storage Systems and Technologies; April 2004, pp. 171-175; In English; See also 20040121020; No Copyright; Avail: CASI; A01, Hardcopy

The software package presented in this paper has proven capable of managing the storage and exchange of several hundred terabytes of data, transparently distributed among dozens of disk storage nodes. One of the key design features of dCache is that, although the location and multiplicity of the data are determined autonomously by the system based on configuration, CPU load, and disk space, the name space is uniquely represented within a single file system tree. The system has been shown to significantly improve the efficiency of connected tape storage systems through caching, gather & flush, and scheduled staging techniques. Furthermore, it optimizes the throughput to and from data clients and smooths the load on the connected disk storage nodes by dynamically replicating datasets when load hot spots are detected. The system tolerates failures of its data servers, which allows administrators to use commodity disk storage components. Access to the data is provided by various FTP dialects, including GridFTP, as well as by a native protocol offering regular file system operations such as open/read/write/seek/stat/close. Furthermore, the software comes with an implementation of the
