12.07.2015 Views

The Computational Materials Repository


3.3 System Components and Processes

discussed in section 3.3.4.1.

Technical Details: There are a few practical issues that needed to be handled for our file system. The first is that people submit their results with arbitrary file names, which leads to file name conflicts. The second is that storing a huge number of files in one directory causes a noticeable delay when listing the content of the directory to check for newly added db-files (listing 100000 files took approx. 3 minutes on our NFS file server). For this reason the uploading process (3.3.4.2) first renames the files and then moves them into the “inbox”, which writes a file into a dedicated subdirectory if the current destination has more than 1000 files. For example, the file 100059 01434...db should be copied to the directory inbox/. If inbox/ already contains 1000 files, these files are distributed into subdirectories by their first character. Our file would go to inbox/1. When the directory inbox/1 has more than 1000 files it is split again, but this time the second character defines the directory. In our case the file would end up in the directory inbox/1/0. This pattern is followed until a directory with fewer than 1000 files is found. This technique is applied to all directories that store db-files within the db-file repository and ensures quick access times.

3.3.2.2 local repository

A local repository is a directory that contains db-files.

Purpose: Store db-files locally and perform basic queries on the db-files in that directory.

Usage: When retrieving data from third-party databases or when creating new datasets, the db-files can be stored in a local repository in order to be updated/modified before being uploaded to the database. Another use is to filter db-files without a database. DirectoryReader enables basic queries to be executed and works the same way as DBReader.
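The directory-splitting scheme described under Technical Details for the db-file repository can be sketched in Python as follows. This is only an illustration under stated assumptions, not the CMR implementation: the function names store and _move_into_shard are hypothetical, a directory counts as full once it holds max_files files, a directory counts as already split once it contains any subdirectory, and file names are assumed to be unique already (the renaming step of the upload process is not shown).

```python
import os
import shutil

MAX_FILES = 1000  # threshold from the text; the exact comparison is an assumption


def _move_into_shard(directory, filename, depth):
    """Move one file into the subdirectory named after its character at `depth`."""
    stem = os.path.splitext(filename)[0]
    if depth >= len(stem):
        return  # name too short to split further; leave the file in place
    shard = os.path.join(directory, stem[depth])
    os.makedirs(shard, exist_ok=True)
    shutil.move(os.path.join(directory, filename), os.path.join(shard, filename))


def store(src_path, inbox, max_files=MAX_FILES):
    """Move src_path below inbox, descending one file name character per level."""
    name = os.path.basename(src_path)
    stem = os.path.splitext(name)[0]
    directory, depth = inbox, 0
    os.makedirs(directory, exist_ok=True)
    while depth < len(stem):
        entries = os.listdir(directory)
        files = [e for e in entries if os.path.isfile(os.path.join(directory, e))]
        subdirs = [e for e in entries if os.path.isdir(os.path.join(directory, e))]
        if not subdirs and len(files) < max_files:
            break  # directory is not split and still has room
        if not subdirs:
            # Directory is full: distribute its files by the character at `depth`.
            for f in files:
                _move_into_shard(directory, f, depth)
        # Descend into the subdirectory selected by the new file's own name.
        directory = os.path.join(directory, stem[depth])
        os.makedirs(directory, exist_ok=True)
        depth += 1
    dest = os.path.join(directory, name)
    shutil.move(src_path, dest)
    return dest
```

With a small max_files the behaviour is easy to follow by hand: once inbox/ is full, a file whose name starts with 10 is placed in inbox/1, and after inbox/1 fills up as well it is placed in inbox/1/0, mirroring the example in the text. Because each directory is listed only once per level, the cost of a lookup grows with the depth of the tree rather than with the total number of stored files.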
The performance of DirectoryReader is, however, considerably slower, because it has to read all db-files into memory before being able to filter the data.

Usage example: An example of how queries are performed on a local repository with a DirectoryReader is shown in Fig. 2.20.

3.3.2.3 CMR Database

In this work the CMR database is often referred to simply as the database. The technically correct term would be a MySQL database with the CMR database schema.

Purpose: The CMR database allows data to be uploaded from the db-file repository and enables fast queries and downloads of data, including attached scripts and files.

Usage: The upload process (3.3.4.2) loads data from the db-file repository into the CMR database. The only other component that has write access to the database is the agent (3.3.3.1). Disallowing write access to casual users ensures that they cannot break anything and that all data is uploaded consecutively.

Implementation: This section provides more information about the internal organization of the data in the database. CMR allows to use customized database
