09.07.2015 Views

SAGA: A Simple API for Grid Applications High-Level Application ...

SAGA: A Simple API for Grid Applications High-Level Application ...

SAGA: A Simple API for Grid Applications High-Level Application ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

2 COMPUTATIONAL METHODS IN SCIENCE AND TECHNOLOGYto computational biology, have developed their own <strong>API</strong>s tosimplify access to the grid <strong>for</strong> their application such as theParticle Physics Data <strong>Grid</strong> [6] and the Illinois Bio<strong>Grid</strong> [7].This momentum has culminated in the <strong>for</strong>mation of the <strong>Simple</strong><strong>API</strong> <strong>for</strong> <strong>Grid</strong> <strong><strong>Application</strong>s</strong> Research Group (<strong>SAGA</strong>-RG) withinthe Global <strong>Grid</strong> Forum (GGF).<strong>SAGA</strong> was founded at GGF11 in Hawaii, after successfulBoFs at GGF10 and GGF9 and originally chaired by membersof the <strong>Grid</strong>Lab, Globus/Python-CoG, and Reality<strong>Grid</strong> project.The aim of the group has been to identify a set of basic gridoperations and derive a simple, consistent <strong>API</strong>, which eases thedevelopment of applications that make use of grid technologies.The implementation of the <strong>SAGA</strong> <strong>API</strong> has been guidedby examination of the requirements expressed by existing andemerging grid applications in order to find common themesand motifs that can be reused across more than one use-case.The broad array of use cases that have been collected from thenascent grid application development community continues toplay a pivotal role in all design decisions.The next section describe the process which has been usedto derive the initial scope of the <strong>API</strong> in more detail, andalso documents general design principles applied to the initialimplementation. The <strong>SAGA</strong> <strong>API</strong> is then put in relation toother <strong>API</strong> related work in GGF, and also to projects withsimilar scope, such as the GAT and Globus-CoG. A moredetailed description of the current version of the <strong>SAGA</strong> <strong>API</strong>specification follows, including a number of explicit codeexamples.II. <strong>SAGA</strong>: GENERAL DESIGNThe original <strong>SAGA</strong> Research Group charter outlined asystematic approach to deriving the <strong>API</strong>: use case analysis,requirements analysis, and <strong>API</strong> development. <strong>API</strong> developmentwas further split into the creation of an abstract, languageindependent, specification, and then a set of language bindingscovering the major languages used in scientific applicationdevelopment — Fortran, C, C++, Java, Perl and Python. AtGGF12 in Brussels, it was suggested that a design teamshould be <strong>for</strong>med to rapidly proceed through the analysisprocess and produce a strawman <strong>API</strong> that could be used <strong>for</strong>further discussion and refinement into a final <strong>API</strong>. That <strong>SAGA</strong>Strawman <strong>API</strong> was supposed to be iterated against the use caserequirement analysis which was to be produced in parallel, inorder to stay true to the process outlined in the charter. 1A. Use Case AnalysisUpon creation, the <strong>SAGA</strong> group sent out a call <strong>for</strong> usecases, which resulted in 24 responses. These came froma wide range of applications areas, such as visualization,climate simulations, generic <strong>Grid</strong> job submission, applicationmonitoring and others. A large set of use cases (9) wassubmitted by the <strong>Grid</strong>RPC Working Group, which targets asimilar set of application in GGF (see section III-A).Common to the majority of the submitted use cases wastheir explicit focus on scientific applications (not surprising, as1 Currently, the <strong>SAGA</strong> group is undergoing a re-chartering process andtrans<strong>for</strong>ms into a Working Group, and is here referred to as <strong>SAGA</strong>-WG.almost all use cases came from the academic community). Fora discussion of non-scientific applications as <strong>SAGA</strong> customers,see subsection II-C.The analysis of the <strong>SAGA</strong> use cases focused on the identificationof the <strong>SAGA</strong> <strong>API</strong> scope, on the level of abstractionneeded and wanted by the application programmers, and onnon-functional requirements to the <strong>API</strong>, such as targetedprogramming languages, requirements to thread safety etc.Additional similar requirements are also drawn from otherprominent projects (such as listed in section IV), which alltook active participation in that <strong>SAGA</strong> group ef<strong>for</strong>t.B. The Design Team and the <strong>SAGA</strong> Strawman <strong>API</strong>In response to the suggestions at GGF12, the <strong>SAGA</strong> groupsent out a call and <strong>for</strong>med a design team, which met inDecember 2003 under the auspices of the Center <strong>for</strong> Computationand Technology, at Louisiana State University, BatonRouge, and at the Albert-Einstein-Institut, Golm, Germany,with an Access<strong>Grid</strong> link connecting the two halves of thedesign team, and allowing occasional outside participation.The major participants were Tom Goodale, David Konerding,Shantenu Jha, Andre Merzky, John Shalf, and Chris Smith,with Craig Lee, Gregor von Laszewski, and Hrabri Rajicproviding important input. This was followed in January of2004 by a meeting held at the Lawrence Berkeley Lab inBerkeley, Cali<strong>for</strong>nia, USA, also with an AG-Link to the VrijeUniversiteit, Amsterdam, Netherlands.Be<strong>for</strong>e embarking on developing the <strong>API</strong>s, a few generaldesign issues were considered and agreed upon.• The <strong>API</strong> would be developed and specified in an objectorientedmanner, using the Scientific Interface DescriptionLanguage (SIDL) to provide a language-neutral representation.The consensus was that it would be easier toprovide a mapping from an object-oriented representationinto a procedural language such as Fortran or C, than toprovide a mapping from a procedural representation intoan object-oriented language such as C++ or Java.• Any asynchronicity would be handled by a polling mechanismrather than a subscribe/listen mechanism. Thismakes the implementation in non-multi-threaded environmentsclearer, and avoids defining some mechanismto represent closures in non object-oriented languages.While both these things are possible — co-operativemulti-tasking using a call to a routine giving time toasynchronous tasks, and closures by use of a data-pointer,function-pointer pair — they would provide greater divergencebetween the different implementation languages.The team did not rule out future versions of the <strong>API</strong>from having such features 2 .• Each grid subsystem should be specified independentlyfrom other subsystems, and should not make use ofthem as far as possible. This allows the independentdevelopment and implementation of parts of the <strong>API</strong>, atthe cost of a small extra amount of complexity whenusing data derived from one part of the <strong>API</strong> in another,DRAFT2 But see notes to asynchronous notification below.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!