New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
CHAPTER 5. COMPUTER SCIENCE GRID STRATEGIES
b) to have a method to dump its content to a given stream object¹⁷ and
c) to have a method to load its content from a given stream.
A list element consists of the variable name and a link to the object it
represents. When a worker then calls its save-state method, the list is traversed,
and for each object in this list the specific save method is called and the content
is appended to the stream.
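The save and load traversal described above can be sketched as follows. This is only a minimal illustration, not the implementation used in this work: the class names `IntValue`, `TextValue` and `Worker`, and the methods `register`, `save_state` and `load_state`, are hypothetical. The sketch assumes that every registered object offers a `dump` and a `load` method taking a stream, and that values are written in a fixed byte order so that the bytes do not depend on the host system.

```python
import io
import struct


class IntValue:
    """Hypothetical state object holding one integer."""

    def __init__(self, value=0):
        self.value = value

    def dump(self, stream):
        # Fixed-width, big-endian encoding, independent of the host byte order.
        stream.write(struct.pack(">q", self.value))

    def load(self, stream):
        (self.value,) = struct.unpack(">q", stream.read(8))


class TextValue:
    """Hypothetical state object holding one string."""

    def __init__(self, text=""):
        self.text = text

    def dump(self, stream):
        data = self.text.encode("utf-8")
        stream.write(struct.pack(">I", len(data)))  # length prefix
        stream.write(data)

    def load(self, stream):
        (length,) = struct.unpack(">I", stream.read(4))
        self.text = stream.read(length).decode("utf-8")


class Worker:
    """Hypothetical worker keeping a list of (variable name, object) pairs."""

    def __init__(self):
        self._state = []

    def register(self, name, obj):
        self._state.append((name, obj))

    def save_state(self, stream):
        # Traverse the list and append each object's content to the stream.
        for _name, obj in self._state:
            obj.dump(stream)

    def load_state(self, stream):
        # Objects are read back in the order in which they were written.
        for _name, obj in self._state:
            obj.load(stream)


# Save the state of one worker and restore it into a second one.
worker = Worker()
worker.register("iteration", IntValue(42))
worker.register("label", TextValue("peak picking"))
buffer = io.BytesIO()
worker.save_state(buffer)

restored_iteration, restored_label = IntValue(), TextValue()
restored = Worker()
restored.register("iteration", restored_iteration)
restored.register("label", restored_label)
buffer.seek(0)
restored.load_state(buffer)
```

Because the sketch writes no variable names or type tags to the stream, the restoring worker has to register the same objects in the same order. A format meant to be shared between workers written in different languages would have to fix such conventions explicitly.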
The above definition is quite general and needs to be specified further to
address the key issue here, namely to enable storing and reading data in a
system-independent way. What we really need is a mechanism that allows for,
say, storing the state of a worker written in Java running on a Linux system,
and restoring this state on a Windows box running a C++ worker. There
are many commercial and open-source solutions available for this problem, for
example CORBA, RPC, (D)COM or SOAP (for an overview see (Emmerich
and Kaveh, 2002; Elfwing et al., 2002) and references therein). They all have
in common that they use some interface definition language (IDL) to describe
a software component’s interface and are also capable of transferring values (and
mapping types) between systems. The main reason why we developed our
own (proprietary) approach here is that, in our experience, none of these systems
really works in practice when used with more than a couple of programming
languages. This is mainly because all tested open-source
implementations of these standards were incomplete or inadequate (see
e.g. (Henning, 2006) and references therein) and generated APIs that are
incoherent, strange or even impossible to use.
To address this, we are using a combination of the enterprise application patterns
called Domain Model¹⁸ and Active Record¹⁹ (see e.g. (Fowler et al.,
2003)). This means that each object (such as a peak, a spectrum or a peak assignment
result) used in an algorithm can be mapped to a database object
and hence stored in a database. The beauty of this approach is that it is
fully programming-language and OS independent, while allowing large objects
(such as 2D spectra) to be simply referenced rather than copied. For
example, if a peak picking algorithm is analyzing a 2 GByte 2D spectrum and
is about to store its state, it does not need to store the 2D spectrum to the
¹⁷ A stream is a source or sink of data, usually individual bytes or characters. Streams are
an abstraction used when reading or writing files, or communicating over network sockets.
¹⁸ A domain model can be thought of as a conceptual model of a system which describes
the various entities involved in that system and their relationships. The domain model
is created to document the key concepts and the vocabulary of the system. The model
displays the relationships among all major entities within the system and usually identifies
their important methods and attributes. This means that the model provides a structural
view of the system which is normally complemented by the dynamic views in Use Case
models. An important benefit of a domain model is to describe and constrain system scope.
The domain model can be used at a low level in the software development cycle since the
semantics shown therein can be used in the source code. Entities become classes, while
methods and attributes can be carried directly to the source code; the same names typically
appear in the source code.
¹⁹ Active Record is an approach to accessing data in a database. A database table or view
is wrapped into a class, thus an object instance is tied to a single row in the table. After
creation of an object, a new row is added to the table upon save. Any object loaded gets
its information from the database; when an object is updated, the corresponding row in
the table is also updated. The wrapper class implements accessor methods or properties for
each column in the table or view. This pattern is commonly used by object persistence tools,
and in object-relational mapping. Typically foreign key relationships will be exposed as an
object instance of the appropriate type via a property. Implementations of Active Record
can be found in various frameworks for many programming environments.
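As an illustration of the Active Record pattern described in this footnote, the following minimal sketch wraps a single table into a class. It is not the implementation used in this work. The class `Peak`, the table `peak` and its columns are hypothetical, and SQLite (through Python's standard `sqlite3` module) is chosen only to keep the example self-contained. The spectrum a peak belongs to is stored as a key (`spectrum_id`), so the peak row references the spectrum rather than copying it.

```python
import sqlite3


class Peak:
    """Hypothetical Active Record class: one instance maps to one row of 'peak'."""

    def __init__(self, conn, spectrum_id, mz, intensity, row_id=None):
        self.conn = conn
        self.id = row_id                # None until the object has been saved
        self.spectrum_id = spectrum_id  # reference to a spectrum, not a copy of it
        self.mz = mz
        self.intensity = intensity

    @staticmethod
    def create_table(conn):
        conn.execute(
            "CREATE TABLE IF NOT EXISTS peak ("
            "id INTEGER PRIMARY KEY, spectrum_id INTEGER, mz REAL, intensity REAL)"
        )

    def save(self):
        if self.id is None:
            # First save: a new row is added to the table.
            cur = self.conn.execute(
                "INSERT INTO peak (spectrum_id, mz, intensity) VALUES (?, ?, ?)",
                (self.spectrum_id, self.mz, self.intensity),
            )
            self.id = cur.lastrowid
        else:
            # Later saves: the corresponding row is updated.
            self.conn.execute(
                "UPDATE peak SET spectrum_id = ?, mz = ?, intensity = ? WHERE id = ?",
                (self.spectrum_id, self.mz, self.intensity, self.id),
            )
        self.conn.commit()

    @classmethod
    def find(cls, conn, row_id):
        # A loaded object gets its information from the database.
        row = conn.execute(
            "SELECT id, spectrum_id, mz, intensity FROM peak WHERE id = ?",
            (row_id,),
        ).fetchone()
        if row is None:
            return None
        return cls(conn, row[1], row[2], row[3], row_id=row[0])


conn = sqlite3.connect(":memory:")
Peak.create_table(conn)
peak = Peak(conn, spectrum_id=7, mz=1046.54, intensity=3.2e4)
peak.save()             # inserts a row
peak.intensity = 4.0e4
peak.save()             # updates the same row
loaded = Peak.find(conn, peak.id)
row_count = conn.execute("SELECT COUNT(*) FROM peak").fetchone()[0]
```

For brevity the sketch exposes the foreign key as a plain integer. A fuller implementation would typically expose it as an object of the referenced type, as the footnote notes.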