13.07.2015 Views

User Manual - Web Curator Tool - SourceForge



The harvested material is captured in ARC format, which has strong storage and archiving characteristics.

The system epitomises best practice through its use of auditing, permission management, and preservation metadata.

How Does it Work?

The Web Curator Tool has the following major components:

The Control Centre

The Control Centre includes an access-controlled web interface where users control the tool.

It has a database of selected websites, with associated permission records and other settings, and maintains a harvest queue of scheduled harvests.

Harvest Agents

When the Control Centre determines that a harvest is ready to start, it delegates it to one of its associated harvest agents.

The harvest agent is responsible for crawling the website using the Heritrix web harvester, and downloading the required web content in accordance with the harvester settings and any bandwidth restrictions.

Each installation can have more than one harvest agent, depending on the level of harvesting the organization undertakes.

Digital Asset Store

When a harvest agent completes a harvest, the results are stored on the digital asset store.

The Control Centre provides a set of quality review tools that allow users to assess the harvest results stored in the digital asset store.

Successful harvests can then be submitted to a digital archive for long-term preservation.

Web Curator Tool User Manual Version 1.6 Page 7 of 80
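
The flow between the three components described above can be pictured with a small model. This is only an illustrative sketch in Python, not the tool's actual code or API: every class and method name here (ControlCentre, HarvestAgent, DigitalAssetStore, schedule, run_due) is hypothetical, scheduled times are plain integers, the "crawl" just produces placeholder text instead of running Heritrix, and the round-robin choice of agent is an assumption, since the manual only says a harvest is delegated to "one of" the associated agents.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledHarvest:
    start_time: int                    # hypothetical: scheduled start as an integer tick
    site: str = field(compare=False)   # selected website; ignored when ordering the queue


class DigitalAssetStore:
    """Holds harvest results so they can later be reviewed."""

    def __init__(self):
        self.results = {}

    def save(self, site, content):
        self.results[site] = content


class HarvestAgent:
    """Stand-in for a harvest agent; no real crawling happens here."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def harvest(self, job):
        # A real agent would run the crawler; this sketch stores placeholder text.
        content = f"placeholder content for {job.site} (agent {self.name})"
        self.store.save(job.site, content)


class ControlCentre:
    """Keeps the harvest queue and delegates due harvests to agents."""

    def __init__(self, agents):
        self.queue = []       # min-heap ordered by start_time
        self.agents = agents
        self._next = 0        # assumption: agents are picked round-robin

    def schedule(self, site, start_time):
        heapq.heappush(self.queue, ScheduledHarvest(start_time, site))

    def run_due(self, now):
        """Delegate every harvest whose start time has arrived."""
        started = []
        while self.queue and self.queue[0].start_time <= now:
            job = heapq.heappop(self.queue)
            agent = self.agents[self._next % len(self.agents)]
            self._next += 1
            agent.harvest(job)
            started.append((job.site, agent.name))
        return started


store = DigitalAssetStore()
centre = ControlCentre([HarvestAgent("agent-1", store), HarvestAgent("agent-2", store)])
centre.schedule("example.org", start_time=5)
centre.schedule("example.com", start_time=1)
centre.schedule("example.net", start_time=9)

started = centre.run_due(now=5)
print(started)
```

In this sketch the two harvests due by tick 5 are handed to the two agents in turn and their results land in the store, while the third stays in the queue until its scheduled time. Quality review and submission to a digital archive are left out.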
