10.08.2013 Views

Library of Congress NAVCC Overview and Update - (lib.stanford ...


Library of Congress<br />

Archive Environments<br />

Presented by:<br />

Carl Watts<br />

Information Technology Specialist<br />

Enterprise Server Engineering


Overview<br />

• Four Archive Environments<br />

– Packard Campus Audio and Visual Archive<br />

• Production Environment<br />

– Library of Congress Bit Preservation Program<br />

• LCBP Platform<br />

• GPFSLCBP Platform<br />

– Legacy Archive


PCAVC Architecture V2<br />

• 2GBps maximum throughput<br />

• Commissioned 2007<br />

• Currently Stores:<br />

– 1.3 PB<br />

– about 140K files<br />

– current growth: 95 TB / month<br />

– about 9K files per month
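
The storage figures above imply an average file size and a rough doubling time. A minimal sketch of that arithmetic follows, assuming decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and a constant growth rate. The variable and function names are illustrative and do not come from the presentation.

```python
# Figures taken from the slide; decimal units are an assumption.
stored_tb = 1300                  # 1.3 PB
stored_files = 140_000            # "about 140K files"
growth_tb_per_month = 95
growth_files_per_month = 9_000    # "about 9K files per month"


def average_file_size_gb(total_tb, file_count):
    """Average file size in GB for a given volume (TB) and file count."""
    return total_tb * 1000 / file_count


avg_stored_gb = average_file_size_gb(stored_tb, stored_files)
avg_new_gb = average_file_size_gb(growth_tb_per_month, growth_files_per_month)

# Months needed to add as much data again, if growth stayed at 95 TB / month.
months_to_double = stored_tb / growth_tb_per_month

print(f"average stored file: {avg_stored_gb:.1f} GB")
print(f"average new file:    {avg_new_gb:.1f} GB")
print(f"months to double:    {months_to_double:.1f}")
```

Under these assumptions the archive holds files averaging roughly 9 GB each, newly ingested files average closer to 11 GB, and the stored volume would double in a little over a year.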


Packard Campus Production Archive: Architecture Version 2<br />

[Architecture diagram; component labels recovered from the figure]<br />

• Sites: NAVCC Culpeper, Library of Congress Remote Site<br />

• Networks: Production Data Network, Archive Data Network, NAVCC Network Core, Remote Site Network<br />

• Links: 1Gbps, 10Gbps, Cisco MDS and DWDM (each labeled twice)<br />

• Storage: HSM Cache (labeled twice, with 30TB and 10TB capacity labels), SL8500 Tape Library with T10000 Tape Drives (labeled twice)<br />

• Other components: POD (labeled three times), Workflow Server, Archive Management Server (labeled twice)


PCAVC Architecture V3 (in test)<br />

• Goal of 6GBps ingest rate<br />

– Oracle M9000<br />

• Increase HSM Cache Size<br />

– DDN SFA10000 to replace Sun FLX380<br />

• Change SMB data transfers<br />

– Signiant Content Distribution Management<br />

• Improve Tape Monitoring<br />

– Crossroads ReadVerify Appliance
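
To put the 6GBps goal in context, the sketch below compares how long one month of current growth (95 TB, from the V2 slide) would take to move at the V2 maximum of 2GBps and at the V3 goal. It assumes that "GBps" means gigabytes per second, that units are decimal, and that the rate is sustained with no overhead; the function name is hypothetical.

```python
GB = 10**9   # bytes, decimal units assumed
TB = 10**12


def hours_to_transfer(volume_bytes, rate_bytes_per_s):
    """Hours needed to move a volume at a sustained rate, ignoring overhead."""
    return volume_bytes / rate_bytes_per_s / 3600


monthly_growth = 95 * TB          # current growth from the V2 slide

v2_hours = hours_to_transfer(monthly_growth, 2 * GB)   # V2 maximum throughput
v3_hours = hours_to_transfer(monthly_growth, 6 * GB)   # V3 ingest goal

# Volume that a sustained 6 GB/s would move in one day.
v3_tb_per_day = 6 * GB * 86_400 / TB

print(f"V2: {v2_hours:.1f} h, V3: {v3_hours:.1f} h, V3 daily: {v3_tb_per_day:.1f} TB")
```

Read this way, a month of growth needs about 13 hours of transfer time at the V2 peak rate and under 5 hours at the V3 target, which leaves considerable headroom for bursts and for the second copy.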


LC Bit Preservation #1<br />

• 4Gbps maximum throughput<br />

• Commissioned 2008<br />

• Currently Stores:<br />

– 830 TB<br />

– about 32M files<br />

– current growth: 45 TB / month<br />

– more than 1 million files per month<br />

• Advanced Tape Monitoring
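
The figures on this slide describe a very different workload from the Packard Campus archive: far more files, each far smaller. A minimal sketch of the implied average file size and average ingest rate follows, assuming decimal units, a 30-day month, and that "4Gbps" means gigabits per second as written. The variable names are illustrative only.

```python
# Figures from the slide. Decimal units and a 30-day month are assumptions.
stored_tb = 830
stored_files = 32_000_000         # "about 32M files"
growth_tb_per_month = 45
max_throughput_gbps = 4           # read as gigabits per second, as written

# Average size of a stored file, in MB.
avg_file_mb = stored_tb * 10**6 / stored_files

# Average ingest rate implied by monthly growth, in gigabits per second.
growth_bits = growth_tb_per_month * 10**12 * 8
seconds_per_month = 30 * 24 * 3600
avg_ingest_gbps = growth_bits / seconds_per_month / 10**9

# Share of the stated maximum throughput used by average growth.
utilization = avg_ingest_gbps / max_throughput_gbps

print(f"average file size:   {avg_file_mb:.1f} MB")
print(f"average ingest rate: {avg_ingest_gbps:.3f} Gbps")
print(f"share of maximum:    {utilization:.1%}")
```

On these assumptions the average file is about 26 MB and steady-state growth uses only a few percent of the stated maximum throughput, so file count rather than raw bandwidth is the more likely pressure point in this environment.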


LC Bit Preservation #1 and #2<br />

[Architecture diagram; component labels recovered from the figure]<br />

• Sites: Primary Datacenter (DC), Library of Congress Remote Site<br />

• External sources: Internet / Internet 2, NDIIPP Partners and Contractor (1Gbps), Web Capture<br />

• Networks and links: External Production Network, Remote Site Network, Cisco MDS and DWDM (each labeled twice)<br />

• Servers: Ingest Workstations, External Ingest Servers, External Content Servers, Content Processing Servers<br />

• LCBP: Archive Management Server, 20TB HSM Cache, SAMFS<br />

• GPFSLCBP: Archive Management Server, 20TB HSM Cache, GPFS/HPSS<br />

• Remote site: Archive Management Server, 10TB HSM Cache<br />

• Tape: SL8500 Tape Library with T10K(B) Tape Drives (labeled twice), TS3584 Tape Library with LTO4 Tape Drives, TS3584 Tape Library with LTO3 Tape Drives<br />

• Tape Monitoring (labeled twice)


LCBP Environment Improvements<br />

• Second Archive for risk mitigation<br />

– 2x tape technology<br />

– 2x OS<br />

– 2x HSM<br />

– second vendor<br />

– adds 3rd and 4th copy<br />

• Increase network bandwidth to 10GigE


Migration


Near Future Directions<br />

• Moving toward digital submission<br />

• Adding Storage Resource Management<br />

• Reducing the acquisition cost of storage<br />

– Segmenting Tier 2 storage into 3 sub-tiers<br />

– $1k or less per TB (no more than 10x the consumer price per TB)<br />

• Heterogeneous Global Namespace<br />

• Geographically Dispersed Archive<br />

• Erasure Archive


Thank You<br />

Contact Information: Carl Watts<br />

Email: cwat@loc.gov<br />

LinkedIn: www.linkedin.com/in/carlwatts<br />

Join the PASIG Group on LinkedIn<br />

Some of the opinions expressed during this presentation are my own and do not necessarily represent those of the Library of Congress.
