Janifer Gatenby - ODIN
International Standard Name Identifier<br />
ISNI<br />
Janifer Gatenby<br />
EMEA Program Manager Metadata<br />
OCLC<br />
The world’s libraries. Connected.<br />
Berlin 18 October 2012
Agenda<br />
Introduction<br />
Purpose and focus of ISNI<br />
Some ISNI features<br />
ISNI and ORCID<br />
Raison d’être<br />
Rights management of licensed digital resources<br />
Digitisation rights negotiation<br />
Shared disambiguation workload<br />
Improved enquiry – precision and recall<br />
Interoperation of organisations involved in trade of electronic resources<br />
Generating links as assertions to facilitate all of the above<br />
Focus of ISNI<br />
Start with the largest possible pool of identities, uniting legacy data<br />
Develop high quality data and a matching system<br />
Develop an effective diffusion system<br />
Enable end users to submit comments and corrections directly<br />
ISNI – Linking identifier<br />
VIAF<br />
JNAM<br />
Institutions<br />
RING<br />
ORCID<br />
Articles, inst<br />
BRTH<br />
Theses, inst PROQ<br />
SCHU<br />
AMS, MLA ++
ISNI – Bridging identifier<br />
VIAF<br />
32 national libraries<br />
IPDA<br />
37 organisations<br />
Institutions<br />
RING<br />
MusicBrainz<br />
Titles; IPI<br />
CISAC<br />
DDEX, record labels<br />
Titles; IPI IFRRO<br />
225 organisations<br />
136 organisations
ISNI – Diffusion<br />
Web page comments<br />
Notification schema<br />
Wikipedia links (currently 288,000)<br />
Schema.org, RDF/XML schema
What is it?<br />
An international standard:<br />
ISO 27729<br />
…for a global identifier:<br />
ISNI: 0000 0001 2101 4127<br />
A network<br />
Registration Agencies<br />
ISNI Assignment Agency<br />
VIAF ISNI<br />
Quality Team<br />
ISO Registration Authority<br />
ISNI-IA<br />
Board<br />
Rights management societies<br />
Libraries, Education, trade<br />
A central database of identifiers
ISNI system components<br />
• Database tailored for ISNI<br />
• Public, Member and Administrative view<br />
• Web and SRU access<br />
• 15+ indexes + restrictors and facets<br />
• Web maintenance client<br />
• Advanced record maintenance client<br />
• Loading from VIAF + 2 ISNI specific formats<br />
• Batch and Atom Pub<br />
• Evaluation, matching and merging<br />
• Notification and reports<br />
ISNI<br />
Some Features<br />
• Confidence levels<br />
• data and match<br />
• Assignment system (online addition)<br />
• Record links to related names<br />
• Public and Private data<br />
• Direct maintenance<br />
Assigned<br />
1 or 2<br />
sources<br />
3 or more<br />
sources<br />
Non<br />
VIAF<br />
sources<br />
(Green)<br />
• Assigned = 1.5 million<br />
• 3 or more VIAF sources = c. 850,000<br />
• 1 or 2 VIAF sources + at least one non VIAF source = c. 340,000<br />
• At least 2 non VIAF sources = c. 60,000<br />
• Work in progress – unique names<br />
• Possible matches = 300,000<br />
• Provisional = 15 million<br />
• Suspect = 12,000
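The source-count categories listed for assigned ISNIs can be read as a small decision rule. The following is a minimal sketch of that reading only: the function name, its parameters, and the order in which overlapping cases are resolved are assumptions, not the ISNI system's actual logic.

```python
def source_category(viaf_sources, non_viaf_sources):
    """Return the slide's source category for a record, or None.

    Hypothetical helper: the inputs are counts of independent sources.
    """
    if viaf_sources < 0 or non_viaf_sources < 0:
        raise ValueError("source counts cannot be negative")
    if viaf_sources >= 3:
        return "3 or more VIAF sources"
    # Assumption: a record with 1 or 2 VIAF sources and several non VIAF
    # sources is counted here rather than in the last category.
    if viaf_sources >= 1 and non_viaf_sources >= 1:
        return "1 or 2 VIAF sources + at least one non VIAF source"
    if non_viaf_sources >= 2:
        return "at least 2 non VIAF sources"
    # None of the listed categories applies.
    return None


print(source_category(4, 0))
print(source_category(1, 1))
print(source_category(0, 2))
print(source_category(1, 0))
```

A record with a single VIAF source and nothing else falls outside all three categories, which is consistent with the separate "Provisional" count on the slide.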
VIAF sources<br />
Not a 1 to 1 correspondence<br />
ISNI excludes sparse and undifferentiated records.<br />
Sometimes 2 ISNIs to 1 VIAF (pseudonyms).<br />
1 ISNI to 2+ VIAF (merges).
ISNI includes:<br />
More musicians – composers and performers<br />
More academics<br />
Education<br />
Trade<br />
& more<br />
Theses<br />
Rights management<br />
societies<br />
…<br />
ISNI sources<br />
Access Copyright, Canada
Authors Guild
Authors’ Licensing and Collecting Society, UK
American Musicological Society
Boekenbank, Belgium
Books in Print
British Library Sound Archive
British Library Theses
Centrum Dienstverlening Auteurs- en aanverwante Rechten, Netherlands
Centro Español de Derechos Reprográficos
Irish Copyright Licensing Agency
International Performers’ Database Association
International Confederation of Societies of Authors and Composers
Freebase
Jisc Names Project, UK
Modern Language Association
MusicBrainz
ProLitteris, Switzerland
Proquest, Scholar Universe
Ringgold
Scholar Universe, Proquest
VG WORT, Germany
Virtual International Authority File
Matching – Evaluation<br />
• NACO normalisation<br />
• Common surnames<br />
• Forename equivalents<br />
• Only fullest name form used<br />
• Noise titles<br />
• Dates<br />
• flourished and lived<br />
• Suspect 1900 year only<br />
• String compression 15% faster than Unicode string similarity<br />
Example forename equivalents<br />
• anthony: tony, anthony<br />
• antoin: antoin, antoine<br />
• antoine: antoine, antoin<br />
• anton: anton, antonius, antonie, antonio<br />
• antonie: antonie, anton<br />
• antonio: antonio, anton<br />
• antonius: antonius, anton<br />
• arnold: arnold, arnoldus<br />
• arnoldus: arnoldus, arnold
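The forename equivalents above behave like a lookup table: each name maps to the forms it may be matched against. A minimal sketch, assuming the first word of each bullet is the key and the rest are its equivalents (the table and function names are illustrative, not taken from the ISNI system):

```python
# Table copied from the slide; the key-then-equivalents reading is an assumption.
FORENAME_EQUIVALENTS = {
    "anthony": {"tony", "anthony"},
    "antoin": {"antoin", "antoine"},
    "antoine": {"antoine", "antoin"},
    "anton": {"anton", "antonius", "antonie", "antonio"},
    "antonie": {"antonie", "anton"},
    "antonio": {"antonio", "anton"},
    "antonius": {"antonius", "anton"},
    "arnold": {"arnold", "arnoldus"},
    "arnoldus": {"arnoldus", "arnold"},
}


def forenames_equivalent(a, b):
    """True if either forename lists the other as an equivalent."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    return (b in FORENAME_EQUIVALENTS.get(a, ())
            or a in FORENAME_EQUIVALENTS.get(b, ()))


print(forenames_equivalent("Tony", "Anthony"))
print(forenames_equivalent("anton", "Antonio"))
print(forenames_equivalent("antonio", "antonius"))
```

In this reading the relation is not transitive: "antonio" and "antonius" each match "anton" but do not match each other, because neither lists the other.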
Matching – Evaluation Functions<br />
• Name<br />
• Title<br />
• Date<br />
• Publisher<br />
• Personal affiliation<br />
• Organisation affiliation<br />
• Working on Dewey with truncation points<br />
• ISBN, ISWC, ISAN, DOI +<br />
• Other name identifier e.g. IPI, VIAF, IPD<br />
• Instrument<br />
• Linked entities<br />
• Adjust down for sparseness and / or common surnames
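The evaluation functions above suggest a score that grows with agreeing evidence and is adjusted down for sparse records or common surnames. The sketch below illustrates that idea only. Every weight and penalty value is hypothetical, since the slides name the kinds of evidence but give no numbers, and only four of the listed kinds are modelled.

```python
# Hypothetical weights and penalties; none of these values come from ISNI.
EVIDENCE_WEIGHTS = {"name": 0.4, "title": 0.2, "date": 0.15, "identifier": 0.25}
SPARSE_PENALTY = 0.2
COMMON_SURNAME_PENALTY = 0.1


def match_score(agreeing_evidence, sparse=False, common_surname=False):
    """Sum the weights of agreeing evidence, then adjust down and clamp to [0, 1]."""
    score = sum(EVIDENCE_WEIGHTS[kind] for kind in set(agreeing_evidence))
    if sparse:
        score -= SPARSE_PENALTY
    if common_surname:
        score -= COMMON_SURNAME_PENALTY
    return max(0.0, min(1.0, score))


plain = match_score(["name", "title"])
adjusted = match_score(["name", "title"], common_surname=True)
print(plain > adjusted)
```

The point of the sketch is the direction of the adjustment: the same agreeing evidence yields a lower score when the surname is common or the record is sparse.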
Matching Resolution screen: 2 matches<br />
JISC Names matching with 2+ independent sources<br />
Total records 46,380; 92% assigned
ISNI record with 3 independent matching sources<br />
Surprising associations<br />
Data Quality Infrastructure<br />
OCLC Assignment Agency<br />
• Matching, merging and splitting infrastructure<br />
• Data sampling and anomaly checks, e.g. 7,000 date anomalies<br />
• Special fixes, e.g. pseudonyms<br />
• Fixes to incoming data<br />
• Data enrichment, e.g. Wikipedia, Dewey, Creation classes<br />
• Notification system<br />
ISNI Quality Team<br />
• Sampling<br />
• Merging and Splitting<br />
• Input to matching and resolution system<br />
• Responding to help desk and input notes from the data contributors and the public<br />
• Trusted to see full records<br />
ISNI Data Contributors and Registration Agencies<br />
• Quality of the input data<br />
• Resolution of possible and exact matches<br />
• Enrichment and signalling of errors as detected
Example: Record to split<br />
Two identities mixed in the same VIAF record, detected when an Authors Guild record matched a VIAF record with a death date<br />
Record provided by Authors Guild (for the living PhD):<br />
Record no.: 13388323X<br />
Identifier: AGLD: 7227<br />
Name: Armstrong, Thomas PhD<br />
Title: Power of Neurodiversity: Unleashing the Advantages of Your Differently Wired Brain, The
End user corrections and enrichments<br />
Member view includes browse<br />
Search by SRU API<br />
See Document:<br />
ISNI SRU search API guidelines.doc<br />
Example search by name keyword (pica.nw):<br />
http://isni.oclc.nl/sru/?query=pica.nw+%3D+%22maloy%2Brebecca%22&operation=searchRetrieve&recordSchema=isni-b<br />
This search is for any records containing both “Rebecca” and “Maloy” in the name.<br />
Response in XML enquiry response schema: ISNI enquiry response v2.xsd
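The example URL can be assembled from its parts rather than typed by hand, which avoids mistakes in the percent-encoding. A minimal sketch using only the Python standard library; the helper name is an assumption, and the base URL, index name, and parameters are those shown in the example above. It builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Base URL as shown in the example search.
SRU_BASE = "http://isni.oclc.nl/sru/"


def build_name_keyword_url(*words, record_schema="isni-b"):
    """Build a searchRetrieve URL for the pica.nw (name keyword) index."""
    # The example joins the keywords with a literal "+" inside the quoted term.
    term = "+".join(word.lower() for word in words)
    params = {
        "query": f'pica.nw = "{term}"',
        "operation": "searchRetrieve",
        "recordSchema": record_schema,
    }
    # urlencode percent-encodes '=', '"' and '+', and turns spaces into '+'.
    return SRU_BASE + "?" + urlencode(params)


url = build_name_keyword_url("Maloy", "Rebecca")
print(url)
```

Called with "Maloy" and "Rebecca", this reproduces the example URL character for character.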
• Access all records not just assigned<br />
• Complete data returned except private<br />
• 15 indexes plus limiters instead of 4<br />
Atom Publishing Protocol IETF-RFC-5023<br />
POST /ATOM/isni HTTP/1.1<br />
Connection: TE, close (This line has been generated by a Perl client and is not used by the ISNI Atom Pub catcher)<br />
Host: isni-m.oclc.nl:80<br />
User-Agent: libwww-perl/5.823 (This line has been generated by a Perl client and is not used by the ISNI Atom Pub catcher)<br />
Content-Type: application/atom+xml<br />
Content-Length: - generated from the content block<br />
<br />
..<br />
....2014-05-20T09:09:35.5063705+02:00<br />
etc.<br />
Requests only (inserts)<br />
• ISNI XML Request Schema<br />
• ISNI XML Response Schema<br />
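The request headers shown above can be reproduced with a standard HTTP client. The sketch below builds, but does not send, such a POST with Python's standard library. The function name is an assumption, and the payload is a placeholder, not a valid document under the ISNI XML Request Schema.

```python
from urllib.request import Request

# Host and path as shown in the example request.
ATOM_ENDPOINT = "http://isni-m.oclc.nl:80/ATOM/isni"


def build_atom_request(entry_xml: bytes) -> Request:
    """Prepare a POST carrying an Atom entry; nothing is sent here."""
    return Request(
        ATOM_ENDPOINT,
        data=entry_xml,
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )


# Placeholder body only; a real request needs XML that follows the
# ISNI XML Request Schema.
placeholder = b"<placeholder/>"
req = build_atom_request(placeholder)

print(req.get_method(), req.selector)
print("Host:", req.host)
# urllib stores header names capitalised as "Content-type".
print("Content-Type:", req.get_header("Content-type"))
# Content-Length is added by urllib from the body when the request is sent.
print("Body length:", len(req.data))
```

Sending would be a call to `urllib.request.urlopen(req)`; that step is left out because it needs a valid payload and access to the member service.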
ISNI-IA – Academic Community<br />
• Database already strong<br />
• BRTH, SCHU, PROQ, JNAM, MLA, AMS<br />
• 1.47 million records, 220,000 assigned, > 2 million links<br />
• Institutions<br />
• Ongoing assignments<br />
• via RAGs (e.g. Proquest)<br />
• Theses<br />
• Discussions on further JISC involvement – ZETOC<br />
• ORCID<br />
• Allocated a non-conflicting range of identifiers<br />
• New overtures<br />
High quality data<br />
sources<br />
Matches & Links<br />
Diffusion<br />
Exposure<br />
Shared identifier<br />
Data exchange<br />
Synchronisation<br />
Enquiry<br />
Researcher<br />
claiming<br />
Publications<br />
Rich CV<br />
Links