Digital Object Architecture - Erpanet
Handle System Overview
Larry Lannom
17 June 2004
Corporation for National Research Initiatives
http://www.cnri.reston.va.us/
http://www.handle.net/
Copyright © 2004 Corporation for National Research Initiatives. Permission is hereby granted to reproduce, disseminate, redistribute, perform and/or display this work publicly, provided, however, that credit is given to the person named as writer of the work and CNRI, and you do not abridge or edit the work in any way that alters its integrity or meaning.
Digital Object Architecture - Goals
• Framework for managing Digital (Information) Objects
• Give it a name and talk to it
– Don’t worry about where it is
– Don’t worry about what it’s made of
• Rise above details of application versions and content formats
Digital Object Architecture
Client
Repositories / Collections
Resource Discovery
• Search Engines
• Metadata Databases
• Catalogues, Guides, etc.
Resolution System
Digital Object Architecture Components
Handle System
• Go from name to attributes
• Fundamental indirection system for Digital Object management on the net
• No free lunch
– Added layer of infrastructure
– Must be managed
Naming Resources on the Net
The Problem
[Diagram: a reference across the Internet points directly at chapter.pdf on www.acme.com]
Naming Resources on the Net
The Solution
[Diagram: a Naming Service on the Internet holds Name = Value(s): 10.123/xyz = http://www.acme.com/chapter.pdf. The file is served from www.acme.com as http://www.acme.com/chapter.pdf]
Naming Resources on the Net
The Solution
[Diagram: the file is now served from www.newbusiness.com as http://www.newbusiness.com/chapter.pdf instead of www.acme.com. The Naming Service entry reads Name = Value(s): 10.123/xyz = http://www.newbusiness.com/chapter.pdf]
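The indirection shown on these slides can be sketched as a small lookup table. This is only an illustration of the Name = Value(s) idea: `NamingService`, `register`, `update`, and `resolve` are hypothetical names chosen here, not part of the Handle System software.

```python
class NamingService:
    """Toy name -> value(s) table; illustrative only, not the Handle System API."""

    def __init__(self):
        self._values = {}  # name -> list of values

    def register(self, name, *values):
        # Names are unique: registering an existing name is an error.
        if name in self._values:
            raise ValueError(f"name already registered: {name}")
        self._values[name] = list(values)

    def update(self, name, *values):
        # Replace the values stored under an existing name.
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = list(values)

    def resolve(self, name):
        return list(self._values[name])


service = NamingService()
service.register("10.123/xyz", "http://www.acme.com/chapter.pdf")
before = service.resolve("10.123/xyz")

# The file moves to a new host; only the value behind the name changes.
service.update("10.123/xyz", "http://www.newbusiness.com/chapter.pdf")
after = service.resolve("10.123/xyz")
```

References that cite 10.123/xyz keep working after the move, because they never contained the host name in the first place.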
CNRI Handle System
• Distributed, scalable, secure
• Enforces unique names
• Enables association of one or more typed values, e.g., URL, with each name
• Optimized for speed and reliability
• Open, well-defined protocol and data model
• Provides infrastructure for application domains, e.g., digital libraries, electronic publishing ...
Handle System Usage
• Library of Congress
• DTIC (Defense Technical Information Center)
• IDF (International DOI Foundation)
– CrossRef (scholarly journal consortium)
– Enpia (Korean content management technology firm)
– CDI (U.S. content management technology firm)
– LON (U.S. learning object technology firm)
– CAL (Copyright Agency Ltd - Australia)
– TSO (U.K. publisher & info mgmt service provider)
– MEDRA (Multilingual European DOI Registration Agency)
– Nielsen BookData (bibliographic data - ISBN)
– R.R. Bowker (bibliographic data - ISBN)
– Office of Publications of the European Community (applied)
• NTIS (National Technical Information Service)
• DSpace (MIT + HP)
• ADL/SCORM: new CORDRA effort
• Various digital library production and research projects
Handles Resolve to Typed Data
Handle       Data type   Index   Handle data
10.123/456   URL         1       http://acme.com/….
             URL         2       http://a-books.com/….
             DLS         9       acme/repository
             HS_ADMIN    100     acme.admin/jsmith
             XYZ         12      1001110011110
The Two Types of Handle Query
1. Request all data
Give me all data associated with handle 10.1000/123.
Handle        Index   Type   Data
10.1000/123   3       URL    URL1 (Server in US)
              2       URL    URL2 (Server in Asia)
              5       URL    URL3 (Server in Europe)
              10      PK     public key
              9       EM     email address
              4       IP     rights data
2. Request all data of a given type
Give me all data of type URL associated with handle 10.1000/123.
Handle        Index   Type   Data
10.1000/123   3       URL    URL1 (Server in US)
              2       URL    URL2 (Server in Asia)
              5       URL    URL3 (Server in Europe)
[Diagram for both queries: the Client sends the handle to the Handle System, drawn as one GHR and many LHS]
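The two query types can be sketched against the example record for 10.1000/123. This is a minimal in-memory illustration, not the wire protocol: `HandleValue`, `RECORDS`, and `resolve` are names assumed here, and the data strings are the placeholder labels from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int
    type: str
    data: str


# The example record from the slide, kept in slide order.
RECORDS = {
    "10.1000/123": [
        HandleValue(3, "URL", "URL1 (Server in US)"),
        HandleValue(2, "URL", "URL2 (Server in Asia)"),
        HandleValue(5, "URL", "URL3 (Server in Europe)"),
        HandleValue(10, "PK", "public key"),
        HandleValue(9, "EM", "email address"),
        HandleValue(4, "IP", "rights data"),
    ]
}


def resolve(handle, value_type=None):
    """Query type 1 (value_type=None): all data. Query type 2: one data type only."""
    values = RECORDS[handle]
    if value_type is None:
        return list(values)
    return [value for value in values if value.type == value_type]


all_values = resolve("10.1000/123")
url_values = resolve("10.1000/123", "URL")
```

Filtering by type lets a client that only wants to follow a link skip keys, rights data, and other values it has no use for.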
Handle Resolution
The Handle System is a collection of handle services, each of which consists of one or more replicated sites, each of which may have one or more servers.
[Diagram: a Client, the GHR, and several LHS. A service is expanded into Site 1, Site 2, Site 3, ... Site n, and a site is expanded into servers #1, #2, #3, #4 ... #n]
Handle        Type   Index   Data
123.456/abc   URL    4       http://www.acme.com/
              URL    8       http://www.ideal.com/
Handle Clients
Request to Client: Resolve hdl:10.1000/1
1. Client sends request to the Global Handle Registry to resolve 0.NA/10.1000 (naming authority handle for 10.1000)
2. Global Handle Registry responds with Service Information for 10.1000
Handle Clients
Service Information - Acme Local Handle Service
Site               Server     IP Address      Port #   Public Key   ...
Primary Site       Server 1   123.45.67.8     2641     K03RLQ...    ...
                   Server 2   123.52.67.9     2641     5&M#FG...    ...
Secondary Site A   Server 1   321.54.678.12   2641     F^*JLS...    ...
                   Server 2   321.54.678.14   2641     3E$T%...     ...
                   Server 3   762.34.1.1      2641     A2S4D...     ...
Secondary Site B   Server 1   123.45.67.4     2641     N0L8H7...    ...
Handle Clients
Request to Client: Resolve hdl:10.1000/1
3. Client queries Server 3 in Secondary Site A for 10.1000/1
4. Server responds with handle data
[Diagram: Client, Global Handle Registry, and Acme Local Handle Service with Primary Site (servers #1, #2), Secondary Site A (servers #1, #2, #3), and Secondary Site B (server #1)]
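The four resolution steps can be sketched end to end with in-memory stand-ins. Everything here is an assumption made for illustration: `GLOBAL_REGISTRY` and `LOCAL_HANDLE_DATA` replace real network services, the URL stored for 10.1000/1 is a made-up placeholder, and the CRC32-based server choice is a stand-in for the hash the real protocol defines, so it need not pick the same server as the slide.

```python
import zlib

# Stand-in for the Global Handle Registry:
# naming authority handle -> service information (site -> servers).
GLOBAL_REGISTRY = {
    "0.NA/10.1000": {
        "Primary Site": ["Server 1", "Server 2"],
        "Secondary Site A": ["Server 1", "Server 2", "Server 3"],
        "Secondary Site B": ["Server 1"],
    }
}

# Stand-in for the data held by the local handle service (placeholder value).
LOCAL_HANDLE_DATA = {
    "10.1000/1": [(1, "URL", "http://www.example.org/placeholder")],
}


def naming_authority_handle(handle):
    """10.1000/1 -> 0.NA/10.1000, the handle that describes the prefix."""
    prefix = handle.split("/", 1)[0]
    return "0.NA/" + prefix


def pick_server(servers, handle):
    # Placeholder hash: any deterministic handle -> server mapping would do here.
    return servers[zlib.crc32(handle.encode("utf-8")) % len(servers)]


def resolve(request, site_name):
    handle = request[len("hdl:"):] if request.startswith("hdl:") else request
    # Steps 1 and 2: ask the global registry for the prefix's service information.
    service_info = GLOBAL_REGISTRY[naming_authority_handle(handle)]
    # Step 3: choose a server within the selected site and query it.
    server = pick_server(service_info[site_name], handle)
    # Step 4: that server answers with the handle data.
    return site_name, server, LOCAL_HANDLE_DATA[handle]


site, server, values = resolve("hdl:10.1000/1", "Secondary Site A")
```

A real client would typically cache the service information from step 2, so later handles under the same prefix need only steps 3 and 4.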
Handle Clients
[Diagram: a Web Client sends an HTTP Get for http://hdl.handle.net/123.456/abc to a Proxy/Web Server. The proxy sends Resolve Handle to the Handle System (GHR and LHS), receives the Handle Data, and answers the Web Client with an HTTP Redirect. A Handle Administration Client is also shown.]
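The proxy path can be sketched as two small helpers: one builds the proxy URL a web client would request, the other picks the redirect target from the handle data. This is a sketch under assumptions: `proxy_url`, `redirect_location`, and `RECORDS` are hypothetical names, the record reuses the 123.456/abc values from the Handle Resolution slide, and taking the first URL value is a simplification, not a statement of how the real proxy chooses.

```python
from urllib.parse import quote, unquote, urlsplit

PROXY_BASE = "http://hdl.handle.net/"  # proxy host shown on the slide

# Example record from the Handle Resolution slide: (index, type, data).
RECORDS = {
    "123.456/abc": [
        (4, "URL", "http://www.acme.com/"),
        (8, "URL", "http://www.ideal.com/"),
    ]
}


def proxy_url(handle):
    """Turn a handle (with or without the hdl: scheme) into a proxy URL."""
    if handle.startswith("hdl:"):
        handle = handle[len("hdl:"):]
    return PROXY_BASE + quote(handle.lstrip("/"), safe="/")


def redirect_location(url, records):
    """What the proxy would put in its HTTP Redirect: here, the first URL value."""
    handle = unquote(urlsplit(url).path.lstrip("/"))
    for _index, value_type, data in records[handle]:
        if value_type == "URL":
            return data
    return None


request_url = proxy_url("hdl:/123.456/abc")
location = redirect_location(request_url, RECORDS)
```

The web client needs no handle software at all in this arrangement; it only follows an ordinary HTTP redirect.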
Handle Clients
• Client with Plug-In: resolves hdl:/123.456/abc by sending a Resolve Handle Request to the Handle System and receiving Handle Data
• Web Handle Administration Client: HTTP to a Web Server with Admin Forms, which use the Handle Admin API
• Custom Client
• Handle Administration embedded in another process
• Handle Resolution embedded in another process
[Diagram on each slide: the Handle System drawn as one GHR and many LHS]
HS Administration
• Ownership is at the handle level
• Administrators defined by handles
• Administrator handles contain keys
• All admin transactions validated via challenge/response from server to client
• Allows distributed administration
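A challenge/response exchange of the kind described above can be sketched as follows. This is not the authentication scheme specified for the Handle System; it assumes a shared secret key and HMAC-SHA256 purely for illustration, and `ADMIN_KEYS`, `issue_challenge`, `sign_challenge`, and `verify_response` are hypothetical names. The administrator handle acme.admin/jsmith is borrowed from the typed-data slide.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for the key stored in the administrator's handle record.
ADMIN_KEYS = {"acme.admin/jsmith": b"example-secret"}


def issue_challenge():
    """Server side: a fresh random nonce for each admin transaction."""
    return secrets.token_bytes(16)


def sign_challenge(secret_key, challenge):
    """Client side: prove possession of the key without sending it."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()


def verify_response(admin_handle, challenge, response):
    """Server side: look up the administrator's key and check the response."""
    key = ADMIN_KEYS.get(admin_handle)
    if key is None:
        return False
    expected = sign_challenge(key, challenge)
    return hmac.compare_digest(expected, response)


challenge = issue_challenge()
good = verify_response(
    "acme.admin/jsmith", challenge, sign_challenge(b"example-secret", challenge)
)
bad = verify_response(
    "acme.admin/jsmith", challenge, sign_challenge(b"wrong-key", challenge)
)
```

Because the administrator is itself identified by a handle, the server can look up the key through ordinary resolution, which is what makes distributed administration possible.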
Handle System Usage
• Prefixes
– DOI - 700
– Other - 300
• Handles
– DOI - 12M
– Other - unknown
• Global
– Three service sites (all currently in VA)
– 10M resolutions last month
Handle System Management and Standards
• Specification
– RFC 3650: Overview
– RFC 3651: Namespace and Service Definition
– RFC 3652: Protocol
• HSAC - Handle System Advisory Committee
• URI/URL/URN
– IETF votes for URN, we don’t see any advantage
• Extra layer of indirection, still need the native protocol
– What are the practical implications
– INFO submission from OpenURL group (also not faring well in the IETF)
– Open to advice
HS Developments
• DOI AP/Services evolution
– Son of Appropriate Copy
– Rights Clearance services
• GRID computing - Globus Toolkit
• Licensing
• Delegation
• Renewed Repository/Registry work
www.handle.net
llannom@cnri.reston.va.us
Corporation for National Research Initiatives
Appropriate Copy Problem
[Diagram: a user at XYZ University follows a reference with the DOI for article.html in ABC Journal. The request http://dx.doi.org/10.123/456 reaches the dx.doi.org proxy server, the Handle System resolves 10.123/456 to http://abc.com/article.html, and the user is sent to the ABC Journal publisher at abc.com, although XYZ University holds a local copy of article.html in ABC Journal.]
Appropriate Copy Problem: solved
[Diagram: the request http://dx.doi.org/10.123/456 from XYZ University now carries a cookie. The dx.doi.org proxy server understands cookies and redirects to the university's Local Server, which leads to the local copy of article.html in ABC Journal. Metadata sources shown: the ABC Journal publisher (abc.com) and a Metadata Database.]
Appropriate Copy Problem: solved without local copy
[Diagram: same flow through the dx.doi.org proxy server (understands cookies), the Handle System, and the Local Server at XYZ University, with metadata from the Metadata Database. The local copy of article.html in ABC Journal is crossed out, and article.html is served by the ABC Journal publisher at abc.com.]
Appropriate Copy Problem: extensible solution
[Diagram: the request http://dx.doi.org/10.123/456 plus cookie from XYZ University is redirected to the Local Server by the dx.doi.org proxy server (understands cookies). The Handle System holds a Metadata Location (Meta1.com), and metadata comes from Metadata Collection Services: Meta1.com, Meta2.com, Meta3.com. The local copy of article.html in ABC Journal is crossed out, and article.html is served by the ABC Journal publisher at abc.com.]
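The proxy's decision in the "solved" slides can be sketched as a single function: with no cookie it redirects to the publisher URL stored in the Handle System, and with a cookie it redirects to the institution's local server. The cookie name `local_server`, the `?doi=` query format, and the resolver address are all assumptions made for this sketch; the slides only say that the proxy understands cookies.

```python
from urllib.parse import quote

# Stand-in for the Handle System record behind the DOI (from the slides).
HANDLE_RECORDS = {"10.123/456": "http://abc.com/article.html"}

LOCAL_SERVER_COOKIE = "local_server"  # hypothetical cookie name


def choose_redirect(doi, cookies):
    """Return the URL the proxy would redirect to for this request."""
    local_server = cookies.get(LOCAL_SERVER_COOKIE)
    if local_server:
        # Hand the DOI to the institution's own server, which knows
        # whether a local copy exists.
        return local_server + "?doi=" + quote(doi, safe="")
    # Default behaviour: the publisher's copy from the Handle System.
    return HANDLE_RECORDS[doi]


publisher_copy = choose_redirect("10.123/456", {})
appropriate_copy = choose_redirect(
    "10.123/456", {LOCAL_SERVER_COOKIE: "http://resolver.xyz.example/lookup"}
)
```

The reference itself never changes: the same DOI link leads to different copies depending on who follows it.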
Mirroring
Local Handle Service
• Primary Site: Server P1, Server P2, Server P3
• Secondary Site "A": Server SA1, Server SA2
• Secondary Site "B": Server SB1, Server SB2, Server SB3, Server SB4
When Secondary Site "A" started running, each secondary server sent a request to each server in the Primary Site asking for updates.
Each server P1-P3 "knows" which handles in its transaction log hash to which secondary server, and sends them. Each secondary will continue to request updates on a regular basis. The request is made in the form of "all transactions since transaction X".
For example, for a given new administrative action, the admin client knows, because of hashing, that the action is performed on Primary Server P2. Server P2 then knows to send that action to Secondary Site "A", Server SA2 and to Secondary Site "B", Server SB1.
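The pull-based mirroring described above can be sketched with in-memory objects. This is a simplified model under stated assumptions: `PrimaryServer`, `SecondaryServer`, and `server_index` are hypothetical names, CRC32 stands in for whatever hash the real servers use, and each transaction is reduced to a (number, handle, action) tuple.

```python
import zlib


def server_index(handle, server_count):
    """Placeholder hash from handle to server slot (an assumption, not the real hash)."""
    return zlib.crc32(handle.encode("utf-8")) % server_count


class PrimaryServer:
    def __init__(self):
        self.log = []  # (transaction number, handle, action), in order

    def record(self, handle, action):
        self.log.append((len(self.log) + 1, handle, action))

    def transactions_since(self, last_seen, slot, site_size):
        """Answer "all transactions since transaction X" for one secondary server."""
        return [
            txn for txn in self.log
            if txn[0] > last_seen and server_index(txn[1], site_size) == slot
        ]


class SecondaryServer:
    def __init__(self, slot, site_size):
        self.slot = slot
        self.site_size = site_size
        self.last_seen = {}  # primary position -> last transaction number applied
        self.handles = {}    # handle -> latest action

    def pull(self, primaries):
        """Ask every primary server for updates; return how many were applied."""
        applied = 0
        for position, primary in enumerate(primaries):
            since = self.last_seen.get(position, 0)
            for number, handle, action in primary.transactions_since(
                since, self.slot, self.site_size
            ):
                self.handles[handle] = action
                self.last_seen[position] = number
                applied += 1
        return applied


primaries = [PrimaryServer() for _ in range(3)]
handles = [f"10.1000/{n}" for n in range(1, 21)]
for handle in handles:
    # The admin client uses the same hash to find the responsible primary server.
    primaries[server_index(handle, len(primaries))].record(handle, "create")

site_a = [SecondaryServer(slot, 2) for slot in range(2)]
for secondary in site_a:
    secondary.pull(primaries)
```

Because sites may have different numbers of servers, the same handle can live on P2 in the primary site, SA2 in site "A", and SB1 in site "B"; the hash is simply taken over each site's own server count.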