26.12.2014 Views

Digital Object Architecture - Erpanet


Handle System Overview

Larry Lannom

17 June 2004

Corporation for National Research Initiatives

http://www.cnri.reston.va.us/

http://www.handle.net/

Copyright © 2004 Corporation for National Research Initiatives. Permission is hereby granted to reproduce, disseminate, redistribute, perform and/or display this work publicly, provided, however, that credit is given to the person named as writer of the work and CNRI, and you do not abridge or edit the work in any way that alters its integrity or meaning.


Digital Object Architecture - Goals

• Framework for managing Digital (Information) Objects

• Give it a name and talk to it

– Don’t worry about where it is

– Don’t worry about what it’s made of

• Rise above details of application versions and content formats



Digital Object Architecture

[Diagram: Client, Repositories / Collections, Resource Discovery, and Resolution System]

Resource Discovery

• Search Engines

• Metadata Databases

• Catalogues, Guides, etc.


Digital Object Architecture Components

Handle System

• Go from name to attributes

• Fundamental indirection system for Digital Object management on the net

• No free lunch

– Added layer of infrastructure

– Must be managed



Naming Resources on the Net

The Problem

[Diagram: chapter.pdf is reached over the Internet directly at www.acme.com]


Naming Resources on the Net

The Solution

[Diagram: a Naming Service on the Internet maps the name to the file's location at www.acme.com]

Name = Value(s)

10.123/xyz = http://www.acme.com/chapter.pdf


Naming Resources on the Net

The Solution

[Diagram: the file is now at www.newbusiness.com instead of www.acme.com; the Naming Service entry carries the new location]

Name = Value(s)

10.123/xyz = http://www.newbusiness.com/chapter.pdf
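The indirection on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionary and the function names `resolve` and `update` are hypothetical and are not the Handle System API. The point is that the name stays fixed while its value changes.

```python
# Hypothetical in-memory naming service: name -> list of values.
naming_service = {"10.123/xyz": ["http://www.acme.com/chapter.pdf"]}


def resolve(name):
    """Return a copy of the values currently registered for a name."""
    return list(naming_service[name])


def update(name, values):
    """Replace the values for a name; the name itself never changes."""
    naming_service[name] = list(values)


# The file moves from acme.com to newbusiness.com: only the value changes.
before = resolve("10.123/xyz")
update("10.123/xyz", ["http://www.newbusiness.com/chapter.pdf"])
after = resolve("10.123/xyz")
```

Anything that cites `10.123/xyz` keeps working after the move, because only the service entry was edited.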


CNRI Handle System

• Distributed, scalable, secure

• Enforces unique names

• Enables association of one or more typed values, e.g., URL, with each name

• Optimized for speed and reliability

• Open, well-defined protocol and data model

• Provides infrastructure for application domains, e.g., digital libraries, electronic publishing ...



Handle System Usage

• Library of Congress

• DTIC (Defense Technical Information Center)

• IDF (International DOI Foundation)

– CrossRef (scholarly journal consortium)

– Enpia (Korean content management technology firm)

– CDI (U.S. content management technology firm)

– LON (U.S. learning object technology firm)

– CAL (Copyright Agency Ltd - Australia)

– TSO (U.K. publisher & info mgmt service provider)

– MEDRA (Multilingual European DOI Registration Agency)

– Nielsen BookData (bibliographic data - ISBN)

– R.R. Bowker (bibliographic data - ISBN)

– Office of Publications of the European Community (applied)

• NTIS (National Technical Information Service)

• DSpace (MIT + HP)

• ADL/SCORM: new CORDRA effort

• Various digital library production and research projects



Handles Resolve to Typed Data

Handle      Data type  Index  Handle data
10.123/456  URL        1      http://acme.com/….
            URL        2      http://a-books.com/….
            DLS        9      acme/repository
            HS_ADMIN   100    acme.admin/jsmith
            XYZ        12     1001110011110



The Two Types of Handle Query

1. Request all data

Give me all data associated with handle 10.1000/123.

[Diagram: a Handle Client queries the Handle System, made up of the GHR (Global Handle Registry) and many LHSs (Local Handle Services)]

Handle       Index  Type  Data
10.1000/123  3      URL   URL1 (Server in US)
             2      URL   URL2 (Server in Asia)
             5      URL   URL3 (Server in Europe)
             10     PK    public key
             9      EM    email address
             4      IP    rights data

2. Request all data of a given type

Give me all data of type URL associated with handle 10.1000/123.

[Diagram: the same Handle Client and Handle System as in query 1]

Handle       Index  Type  Data
10.1000/123  3      URL   URL1 (Server in US)
             2      URL   URL2 (Server in Asia)
             5      URL   URL3 (Server in Europe)
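The two query forms can be illustrated with a small Python sketch. It assumes a hypothetical in-memory record store and a made-up `query` function; the index, type, and data values are the ones on the slide, but nothing here is the real Handle protocol or client library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int
    value_type: str
    data: str


# Hypothetical store holding the record shown on the slide.
RECORDS = {
    "10.1000/123": [
        HandleValue(3, "URL", "URL1 (Server in US)"),
        HandleValue(2, "URL", "URL2 (Server in Asia)"),
        HandleValue(5, "URL", "URL3 (Server in Europe)"),
        HandleValue(10, "PK", "public key"),
        HandleValue(9, "EM", "email address"),
        HandleValue(4, "IP", "rights data"),
    ]
}


def query(handle, value_type=None):
    """Query 1 when value_type is None; query 2 when a type is given."""
    values = RECORDS[handle]
    if value_type is None:
        return list(values)
    return [v for v in values if v.value_type == value_type]


all_values = query("10.1000/123")
url_values = query("10.1000/123", "URL")
```

Filtering by type lets a client that only wants a location skip the key, email, and rights values.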


Handle Resolution

The Handle System is a collection of handle services, each of which consists of one or more replicated sites, each of which may have one or more servers.

[Diagram: a Client resolves through the Handle System (GHR and LHSs). Services are expanded into sites (Site 1, Site 2, Site 3, ... Site n) and sites into servers (#1, #2, #3, #4, ... #n). The handle record shown is:]

Handle       Data type  Index  Handle data
123.456/abc  URL        4      http://www.acme.com/
             URL        8      http://www.ideal.com/


Handle Clients

Request to Client: Resolve hdl:10.1000/1

1. Client sends request to Global to resolve 0.NA/10.1000 (naming authority handle for 10.1000)

[Diagram: Client and Global Handle Registry]


Handle Clients

Request to Client: Resolve hdl:10.1000/1

2. Global responds with Service Information for 10.1000

[Diagram: Client and Global Handle Registry; the response is the Service Information document for the Acme Local Handle Service]


Handle Clients

Service Information - Acme Local Handle Service

Site              Server    IP Address     Port #  Public Key  ...
Primary Site      Server 1  123.45.67.8    2641    K03RLQ...   ...
                  Server 2  123.52.67.9    2641    5&M#FG...   ...
Secondary Site A  Server 1  321.54.678.12  2641    F^*JLS...   ...
                  Server 2  321.54.678.14  2641    3E$T%...    ...
                  Server 3  762.34.1.1     2641    A2S4D...    ...
Secondary Site B  Server 1  123.45.67.4    2641    N0L8H7...   ...




Handle Clients

Request to Client: Resolve hdl:10.1000/1

3. Client queries Server 3 in Secondary Site A for 10.1000/1

[Diagram: Client, Global Handle Registry, and the Acme Local Handle Service with Primary Site (servers #1, #2), Secondary Site A (servers #1, #2, #3), and Secondary Site B (server #1)]


Handle Clients

Request to Client: Resolve hdl:10.1000/1

4. Server responds with handle data

[Diagram: Client, Global Handle Registry, and the Acme Local Handle Service with Primary Site (servers #1, #2), Secondary Site A (servers #1, #2, #3), and Secondary Site B (server #1)]
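Steps 1 to 4 can be walked through with a minimal Python sketch. Several things here are assumptions rather than facts from the slides: the dictionaries standing in for the Global Handle Registry and for server storage, the example URL value for 10.1000/1 (its real data is not shown), and the use of an MD5 digest modulo the server count to pick a server within a site. The slides only say that hashing determines the server.

```python
import hashlib

# Hypothetical stand-in for the Global Handle Registry: naming authority
# handle -> service information (site name -> server names).
GLOBAL_REGISTRY = {
    "0.NA/10.1000": {
        "Primary Site": ["Server 1", "Server 2"],
        "Secondary Site A": ["Server 1", "Server 2", "Server 3"],
        "Secondary Site B": ["Server 1"],
    }
}

# Hypothetical handle data; the slides do not show the values of 10.1000/1.
HANDLE_DATA = {"10.1000/1": [("URL", 1, "http://www.example.com/")]}


def pick_server(handle, servers):
    """Assumed hashing scheme: stable digest of the handle, modulo server count."""
    digest = hashlib.md5(handle.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def build_storage(service_info):
    """Place each handle on the server it hashes to within every site."""
    storage = {}
    for site, servers in service_info.items():
        for handle, values in HANDLE_DATA.items():
            key = (site, pick_server(handle, servers))
            storage.setdefault(key, {})[handle] = values
    return storage


STORAGE = build_storage(GLOBAL_REGISTRY["0.NA/10.1000"])


def resolve(handle, site="Secondary Site A"):
    prefix = handle.split("/", 1)[0]
    service_info = GLOBAL_REGISTRY["0.NA/" + prefix]       # steps 1 and 2
    server = pick_server(handle, service_info[site])       # step 3
    values = STORAGE.get((site, server), {}).get(handle)   # step 4
    return server, values


server, values = resolve("10.1000/1")
```

Which server the handle lands on depends on the digest, so the sketch makes no claim that it matches the "Server 3" shown on the slide.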


Handle Clients

[Diagram: a Web Client sends an HTTP Get for http://hdl.handle.net/123.456/abc to a Proxy/Web Server. The proxy sends Resolve Handle to the Handle System (GHR and LHSs), receives Handle Data, and answers the Web Client with an HTTP Redirect. A Handle Administration Client is also shown.]
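The proxy path in this diagram can be sketched as a pure function. The function name, the `lookup` callable, the 302 status code, and the choice of the first URL value are all assumptions made for illustration; the slide only shows an HTTP Get answered by an HTTP Redirect. The record is the 123.456/abc example from the Handle Resolution slide.

```python
# Hypothetical URL values for the example handle, in index order (4, then 8).
EXAMPLE_URLS = {"123.456/abc": ["http://www.acme.com/", "http://www.ideal.com/"]}


def lookup(handle):
    """Stand-in for a 'resolve handle, type URL' request to the Handle System."""
    return EXAMPLE_URLS.get(handle, [])


def proxy_redirect(path, resolve_urls):
    """Map the path of an HTTP Get to a (status, headers) redirect response."""
    handle = path.lstrip("/")
    urls = resolve_urls(handle)
    if not urls:
        return 404, {}
    # Assumption: redirect to the first URL value returned.
    return 302, {"Location": urls[0]}


found = proxy_redirect("/123.456/abc", lookup)
missing = proxy_redirect("/123.456/nope", lookup)
```

The browser needs no handle support in this arrangement, because the proxy does the resolution for it.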


Handle Clients

[Diagram: a Client with a Client Plug-In handles hdl:/123.456/abc itself, sending a Resolve Handle Request to the Handle System (GHR and LHSs) and receiving Handle Data. A Handle Administration Client is also shown.]


Handle Clients

[Diagram: a Web Handle Administration Client talks HTTP to a Web Server whose Admin Forms call the Handle Admin API against the Handle System (GHR and LHSs).]


Handle Clients

[Diagram: a Custom Client and a Web Handle Administration Client, each connected to the Handle System (GHR and LHSs).]


Handle Clients

[Diagram: a Web client and Handle Administration embedded in another process, each connected to the Handle System (GHR and LHSs).]


Handle Clients

[Diagram: Handle Resolution embedded in another process and Handle Administration embedded in another process, each connected to the Handle System (GHR and LHSs).]


HS Administration

• Ownership is at the handle level

• Administrators defined by handles

• Administrator handles contain keys

• All admin transactions validated via challenge/response from server to client

• Allows distributed administration
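The challenge/response idea in the bullets above can be sketched with Python's standard library. This is a simplified stand-in, not the Handle System's actual authentication exchange: it assumes a shared secret key and HMAC-SHA256, and the key value and function names are hypothetical. The administrator handle acme.admin/jsmith is taken from the typed-data slide.

```python
import hashlib
import hmac
import secrets

# Hypothetical key material stored under the administrator's handle.
ADMIN_KEYS = {"acme.admin/jsmith": b"hypothetical-secret"}


def issue_challenge():
    """Server side: a fresh random nonce for each admin transaction."""
    return secrets.token_bytes(16)


def sign(challenge, key):
    """Client side: prove possession of the key without sending it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(admin_handle, challenge, response):
    """Server side: recompute the response from the key found via the handle."""
    key = ADMIN_KEYS.get(admin_handle)
    if key is None:
        return False
    return hmac.compare_digest(sign(challenge, key), response)


challenge = issue_challenge()
good = verify("acme.admin/jsmith", challenge, sign(challenge, b"hypothetical-secret"))
bad = verify("acme.admin/jsmith", challenge, sign(challenge, b"wrong-key"))
```

Because the key is looked up through the administrator's handle, any server can validate an administrator without a central account database, which is what makes distributed administration workable.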



Handle System Usage

• Prefixes

– DOI - 700

– Other - 300

• Handles

– DOI - 12M

– Other - unknown

• Global

– Three service sites (all currently in VA)

– 10M resolutions last month



Handle System Management and Standards

• Specification

– RFC 3650: Overview

– RFC 3651: Namespace and Service Definition

– RFC 3652: Protocol

• HSAC - Handle System Advisory Committee

• URI/URL/URN

– IETF votes for URN, we don’t see any advantage

• Extra layer of indirection, still need the native protocol

– What are the practical implications

– INFO submission from OpenURL group (also not faring well in the IETF)

– Open to advice



HS Developments

• DOI AP/Services evolution

– Son of Appropriate Copy

– Rights Clearance services

• GRID computing - Globus Toolkit

• Licensing

• Delegation

• Renewed Repository/Registry work



www.handle.net

llannom@cnri.reston.va.us



Appropriate Copy Problem

[Diagram: at XYZ University, a Reference with DOI for article.html in ABC Journal links to http://dx.doi.org/10.123/456. The dx.doi.org proxy server resolves 10.123/456 through the Handle System to http://abc.com/article.html, so the user is sent to article.html at the ABC Journal publisher (abc.com) rather than to the university's Local Copy of article.html in ABC Journal.]


Appropriate Copy Problem: solved

[Diagram: at XYZ University, the Reference with DOI for article.html in ABC Journal sends http://dx.doi.org/10.123/456 together with a cookie. The dx.doi.org proxy server understands cookies and answers with a Redirect to Local Server. The Local Server draws on Metadata from a Metadata Database and points to the Local Copy of article.html in ABC Journal. Also shown: Handle System; ABC Journal publisher (abc.com).]


Appropriate Copy Problem

solved w/o local copy

[Diagram: at XYZ University, the Reference with DOI for article.html in ABC Journal goes to the dx.doi.org proxy server, which understands cookies and works with the Handle System. The Local Server draws on Metadata from a Metadata Database. The Local Copy of article.html in ABC Journal is crossed out (X); article.html is reached at the ABC Journal publisher (abc.com).]


Appropriate Copy Problem

extensible solution

[Diagram: at XYZ University, the Reference with DOI for article.html in ABC Journal sends http://dx.doi.org/10.123/456 together with a cookie. The dx.doi.org proxy server understands cookies and answers with a Redirect to Local Server. The Handle System supplies a Metadata Location (Meta1.com). The Local Server obtains Metadata from the Metadata Collection Services (Meta1.com, Meta2.com, Meta3.com). The Local Copy of article.html in ABC Journal is crossed out (X); article.html is at the ABC Journal publisher (abc.com).]
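The cookie-driven redirect in the Appropriate Copy slides can be sketched as one decision function. The cookie name `local_server`, the record keys `URL` and `METADATA`, the query-string layout, and the resolver address are all hypothetical; the slides only establish that the proxy understands cookies, redirects to a local server, and that the Handle System can hold a metadata location.

```python
from urllib.parse import urlencode


def choose_redirect(doi, record, cookies):
    """Decide where a cookie-aware proxy might send a DOI request."""
    local_server = cookies.get("local_server")  # hypothetical cookie name
    if local_server is None:
        # No cookie: default behaviour, go to the publisher's URL.
        return record["URL"]
    # Cookie present: hand the DOI, plus the metadata location from the
    # handle record, to the institution's local server.
    query = urlencode({"doi": doi, "metadata": record.get("METADATA", "")})
    return f"{local_server}?{query}"


record = {"URL": "http://abc.com/article.html", "METADATA": "Meta1.com"}
default_target = choose_redirect("10.123/456", record, {})
local_target = choose_redirect(
    "10.123/456", record, {"local_server": "http://resolver.xyz.example/find"}
)
```

Passing the metadata location along is what would let the local server choose among several metadata collection services instead of one fixed database.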


Mirroring

Local Handle Service

• Primary Site: Server P1, Server P2, Server P3

• Secondary Site "A": Server SA1, Server SA2

• Secondary Site "B": Server SB1, Server SB2, Server SB3, Server SB4



Mirroring

Local Handle Service

When Secondary Site "A" started running, each secondary server sent a request to each server in the Primary Site asking for updates.

[Diagram: Primary Site (Servers P1, P2, P3), Secondary Site "A" (Servers SA1, SA2), Secondary Site "B" (Servers SB1, SB2, SB3, SB4)]



Mirroring

Local Handle Service

Each server P1-P3 "knows" which handles in its transaction log hash to which secondary server, and sends them. Each secondary will continue to request updates on a regular basis. The request is made in the form of "all transactions since transaction X".

[Diagram: Primary Site (Servers P1, P2, P3), Secondary Site "A" (Servers SA1, SA2), Secondary Site "B" (Servers SB1, SB2, SB3, SB4)]



Mirroring

Local Handle Service

For example, for a given new administrative action, the admin client knows, because of hashing, that the action is performed on Primary Server P2. Server P2 then knows to send that action to Secondary Site "A", Server SA2 and to Secondary Site "B", Server SB1.

[Diagram: Client, Primary Site (Servers P1, P2, P3), Secondary Site "A" (Servers SA1, SA2), Secondary Site "B" (Servers SB1, SB2, SB3, SB4)]
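The pull-based mirroring described on these slides can be sketched as follows. The class names, the tuple layout of a log entry, and the MD5-modulo bucketing are assumptions for illustration; the slides state only that handles hash to a particular secondary server and that secondaries ask for "all transactions since transaction X".

```python
import hashlib


def bucket(handle, count):
    """Assumed hashing scheme: which of `count` servers a handle belongs to."""
    digest = hashlib.md5(handle.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % count


class PrimaryServer:
    def __init__(self, name):
        self.name = name
        self.log = []  # entries are (transaction id, handle, action)

    def record(self, handle, action):
        self.log.append((len(self.log) + 1, handle, action))

    def transactions_since(self, last_txn, index, count):
        """Entries after last_txn whose handle hashes to secondary `index`."""
        return [t for t in self.log
                if t[0] > last_txn and bucket(t[1], count) == index]


class SecondaryServer:
    def __init__(self, index, count):
        self.index, self.count = index, count
        self.last_txn = {}  # primary name -> last transaction id received
        self.applied = []

    def pull(self, primary):
        """Ask one primary for all transactions since the last one seen."""
        since = self.last_txn.get(primary.name, 0)
        updates = primary.transactions_since(since, self.index, self.count)
        if updates:
            self.last_txn[primary.name] = updates[-1][0]
        self.applied.extend(updates)
        return updates


p2 = PrimaryServer("P2")
for i in range(6):
    p2.record(f"10.1000/{i}", "create")
site_a = [SecondaryServer(i, 2) for i in range(2)]
first_pull = [s.pull(p2) for s in site_a]
second_pull = [s.pull(p2) for s in site_a]
```

Each secondary tracks its position per primary, so a repeated pull with no new activity returns nothing, and every logged transaction reaches exactly one server in the secondary site.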

