Janifer Gatenby - ODIN

International Standard Name Identifier
ISNI

Janifer Gatenby
EMEA Program Manager Metadata
OCLC
The world’s libraries. Connected.

Berlin 18 October 2012


Agenda

Introduction
Purpose and focus of ISNI
Some ISNI features
ISNI and ORCID


Raison d’être

Rights management of licensed digital resources
Digitisation rights negotiation
Shared disambiguation workload
Improved enquiry – precision and recall
Interoperation of organisations involved in trade of electronic resources
Generating links as assertions to facilitate all of the above


Focus of ISNI

Start with the largest possible pool of identities, uniting legacy data
Develop high-quality data and a matching system
Develop an effective diffusion system
Enable end users to submit comments and corrections directly


ISNI – Linking identifier

[Diagram labels: VIAF; JNAM; Institutions; RING; ORCID; Articles, inst; BRTH; Theses, inst; PROQ; SCHU; AMS, MLA ++]


ISNI – Bridging identifier

[Diagram labels: VIAF; 32 national libraries; IPDA; 37 organisations; Institutions; RING; MusicBrainz; Titles; IPI; CISAC; DDEX, record labels; Titles; IPI; IFRRO; 225 organisations; 136 organisations]


ISNI – Diffusion

Web page comments
Notification schema
Wikipedia links (currently 288,000)
Schema.org, RDF/XML schema


What is it?

An international standard: ISO 27729
…for a global identifier: ISNI: 0000 0001 2101 4127
A network
A central database of identifiers

[Diagram labels: Registration Agency (×5); Quality Team; ISNI Assignment Agency; VIAF ISNI; ISO Registration Authority; ISNI-IA; Rights management societies; Quality Team; Board; Libraries, Education, trade]
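
The identifier shown on this slide, 0000 0001 2101 4127, is 16 characters long, and the last character is a check character. The sketch below assumes the ISO 7064 MOD 11-2 scheme that is generally documented for ISO 27729 identifiers. The function names are illustrative and are not part of any ISNI software.

```python
def isni_check_character(first15: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for 15 base digits."""
    if len(first15) != 15 or not (first15.isascii() and first15.isdigit()):
        raise ValueError("expected exactly 15 ASCII digits")
    total = 0
    for ch in first15:
        # Add each digit, then double the running total.
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """Check a 16-character ISNI, ignoring spaces and hyphens."""
    compact = isni.replace(" ", "").replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not (base.isascii() and base.isdigit()):
        return False
    return isni_check_character(base) == check


print(is_valid_isni("0000 0001 2101 4127"))
```

A check like this catches most single-digit typing errors before an identifier is sent to the database. It says nothing about whether the ISNI has actually been assigned.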


ISNI system components

• Database tailored for ISNI
• Public, Member and Administrative view
• Web and SRU access
• 15+ indexes + restrictors and facets
• Web maintenance client
• Advanced record maintenance client
• Loading from VIAF + 2 ISNI specific formats
• Batch and Atom Pub
• Evaluation, matching and merging
• Notification and reports


ISNI
Some Features

• Confidence levels
  • data and match
• Assignment system (online addition)
• Record links to related names
• Public and Private data
• Direct maintenance


[Chart labels: Assigned; 1 or 2 sources; 3 or more sources; Non VIAF sources; (Green)]

• Assigned = 1.5 million
• 3 or more VIAF sources = c. 850,000
• 1 or 2 VIAF sources + at least one non-VIAF source = c. 340,000
• At least 2 non-VIAF sources = c. 60,000
• Work in progress – unique names
• Possible matches = 300,000
• Provisional = 15 million
• Suspect = 12,000
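
The three source-count categories listed under "Assigned" can be read as a small decision rule. The sketch below rests on that reading, which is an assumption: the slide reports counts per category and does not describe the assignment algorithm itself. The function name and return labels are hypothetical.

```python
from typing import Optional


def assignment_basis(viaf_sources: int, non_viaf_sources: int) -> Optional[str]:
    """Name the slide category a record falls into, or None if it fits none.

    Assumption: the categories are checked in the order the slide lists them.
    """
    if viaf_sources < 0 or non_viaf_sources < 0:
        raise ValueError("source counts cannot be negative")
    if viaf_sources >= 3:
        return "3 or more VIAF sources"
    if viaf_sources >= 1 and non_viaf_sources >= 1:
        # Only 1 or 2 VIAF sources can reach this branch.
        return "1 or 2 VIAF sources + at least one non-VIAF source"
    if non_viaf_sources >= 2:
        return "at least 2 non-VIAF sources"
    return None


print(assignment_basis(2, 1))
```

Under this reading, a record with a single source of any kind fits none of the listed categories.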


VIAF sources

Not a 1 to 1 correspondence
ISNI excludes sparse, undifferentiated records.
Sometimes 2 ISNIs to 1 VIAF (pseudonyms). 1 ISNI to 2+ VIAF (merges).


ISNI includes:

More musicians – composers and performers
More academics

[Diagram labels: Education; Trade; & more; Theses; Rights management societies; …]

ISNI sources

Access Copyright, Canada
Authors Guild
Authors’ Licensing and Collecting Society, UK
American Musicological Society
Boekenbank, Belgium
Books in Print
British Library Sound Archive
British Library Theses
Centrum Dienstverlening Auteurs- en aanverwante Rechten, Netherlands
Centro Español de Derechos Reprográficos
Irish Copyright Licensing Agency
International Performers’ Database Association
International Confederation of Societies of Authors and Composers
Freebase
Jisc Names Project, UK
Modern Languages Association
MusicBrainz
ProLitteris, Switzerland
Proquest, Scholar Universe
Ringgold
Scholar Universe, Proquest
VG WORT, Germany
Virtual International Authority File


Matching – Evaluation

• NACO normalisation
• Common surnames
• Forename equivalents
• Only fullest name form used
• Noise titles
• Dates
  • flourished and lived
  • Suspect 1900 year only
• String compression 15% faster than UNICODE string similarity

Example forename equivalents

• anthony tony anthony
• antoin antoin antoine
• antoine antoine antoin
• anton anton antonius antonie antonio
• antonie antonie anton
• antonio antonio anton
• antonius antonius anton
• arnold arnold arnoldus
• arnoldus arnoldus arnold
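
The equivalents table above can be used as a simple lookup during name matching. The sketch below assumes that the first word in each row is the key and the remaining words are its accepted equivalents. That reading of the slide layout is an assumption, and the function name is illustrative only.

```python
# Rows copied from the slide: key -> accepted equivalent forms.
FORENAME_EQUIVALENTS = {
    "anthony": {"tony", "anthony"},
    "antoin": {"antoin", "antoine"},
    "antoine": {"antoine", "antoin"},
    "anton": {"anton", "antonius", "antonie", "antonio"},
    "antonie": {"antonie", "anton"},
    "antonio": {"antonio", "anton"},
    "antonius": {"antonius", "anton"},
    "arnold": {"arnold", "arnoldus"},
    "arnoldus": {"arnoldus", "arnold"},
}


def forenames_equivalent(a: str, b: str) -> bool:
    """Return True if two forenames are identical or listed as equivalents."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    # Look the pair up in both directions; "tony" has no row of its own.
    return (b in FORENAME_EQUIVALENTS.get(a, set())
            or a in FORENAME_EQUIVALENTS.get(b, set()))


print(forenames_equivalent("Anthony", "Tony"))
print(forenames_equivalent("antonio", "antonius"))
```

In the rows shown, equivalence is not transitive: "antonio" and "antonius" each map to "anton" but not to each other, and the lookup preserves that.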


Matching – Evaluation Functions

• Name
• Title
• Date
• Publisher
• Personal affiliation
• Organisation affiliation
• Working on Dewey with truncation points
• ISBN, ISWC, ISAN, DOI +
• Other name identifier e.g. IPI, VIAF, IPD
• Instrument
• Linked entities
• Adjust down for sparseness and / or common surnames


Matching Resolution screen: 2 matches


JISC Names matching with 2+ independent sources

Total records 46,380; 92% assigned


ISNI record with 3 independent matching sources


Surprising associations


Data Quality Infrastructure

OCLC Assignment Agency
• Matching, merging and splitting infrastructure
• Data sampling and anomaly checks, e.g. 7,000 date anomalies
• Special fixes, e.g. pseudonyms
• Fixes to incoming data
• Data enrichment, e.g. Wikipedia, Dewey, Creation classes
• Notification system

ISNI Quality Team
• Sampling
• Merging and splitting
• Input to matching and resolution system
• Responding to help desk and input notes from the data contributors and the public
• Trusted to see full records

ISNI Data Contributors and Registration Agencies
• Quality of the input data
• Resolution of possible and exact matches
• Enrichment and signalling errors as detected


Example: Record to split

Two identities in the same VIAF record, detected by an Authors Guild record matching a VIAF record with a death date.

Record provided by Authors Guild
Record no.: 13388323X
Identifier: AGLD: 7227
Name: Armstrong, Thomas PhD
Title: Power of Neurodiversity: Unleashing the Advantages of Your Differently Wired Brain, The

[Slide annotations: "For the living PhD"; "mixes"]


End user corrections and enrichments


Member view includes browse


Search by SRU API

See document: ISNI SRU search API guidelines.doc

Example search by name keyword (pica.nw):

http://isni.oclc.nl/sru/?query=pica.nw+%3D+%22maloy%2Brebecca%22&operation=searchRetrieve&recordSchema=isni-b

This search is for any records containing both “Rebecca” and “Maloy” in the name.

The response is in the XML enquiry response schema (ISNI enquiry response v2.xsd).
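
The example search URL on this slide can be assembled programmatically instead of being percent-encoded by hand. The sketch below is a minimal version that reproduces that URL with Python's standard library. The helper name is hypothetical, and joining keywords with a literal "+" simply mirrors the slide's example rather than a documented rule. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address taken from the slide's example URL.
SRU_BASE = "http://isni.oclc.nl/sru/"


def name_keyword_search_url(*keywords: str, record_schema: str = "isni-b") -> str:
    """Build an SRU searchRetrieve URL on the pica.nw (name keyword) index."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    if any('"' in kw for kw in keywords):
        raise ValueError("keywords must not contain double quotes")
    # Mirrors the slide: keywords joined by "+" inside a quoted CQL term.
    cql = 'pica.nw = "{}"'.format("+".join(keywords))
    params = {
        "query": cql,
        "operation": "searchRetrieve",
        "recordSchema": record_schema,
    }
    # urlencode percent-encodes the quotes, "=" and "+" in the query value.
    return SRU_BASE + "?" + urlencode(params)


print(name_keyword_search_url("maloy", "rebecca"))
```

Only the keyword list changes between name searches. Other indexes would need their own index names from the SRU guidelines document.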


• Access all records not just assigned
• Complete data returned except private
• 15 indexes plus limiters instead of 4


Atom Publishing Protocol IETF-RFC-5023

POST /ATOM/isni HTTP/1.1
Connection: TE, close (This line has been generated by a Perl client and is not used by the ISNI Atom Pub catcher)
Host: isni-m.oclc.nl:80
User-Agent: libwww-perl/5.823 (This line has been generated by a Perl client and is not used by the ISNI Atom Pub catcher)
Content-Type: application/atom+xml
Content-Length: - generated from the content block

..
....2014-05-20T09:09:35.5063705+02:00
etc.

Requests only (inserts)
• ISNI XML Request Schema
• ISNI XML Response Schema
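
The request headers on this slide can be reproduced with a standard HTTP client. The sketch below prepares, but does not send, a POST to the host and path shown above. The function name is hypothetical. The body is a placeholder Atom entry: the real payload must follow the ISNI XML Request Schema, which this slide does not reproduce, and whether that payload is wrapped in an Atom entry at all is an assumption.

```python
import urllib.request

# Host, port and path taken from the request shown on the slide.
ATOM_ENDPOINT = "http://isni-m.oclc.nl:80/ATOM/isni"

# Placeholder only: a real body must follow the ISNI XML Request Schema.
PLACEHOLDER_BODY = (
    b'<entry xmlns="http://www.w3.org/2005/Atom">'
    b"<!-- ISNI XML request goes here -->"
    b"</entry>"
)


def build_atom_post(body: bytes) -> urllib.request.Request:
    """Prepare (but do not send) an Atom Pub POST carrying the given body."""
    headers = {
        "Content-Type": "application/atom+xml",
        # Content-Length is generated from the content block, as on the slide.
        "Content-Length": str(len(body)),
    }
    return urllib.request.Request(
        ATOM_ENDPOINT, data=body, headers=headers, method="POST"
    )


request = build_atom_post(PLACEHOLDER_BODY)
print(request.get_method(), request.full_url)
```

The Connection and User-Agent lines are left out because the slide notes that the ISNI Atom Pub catcher does not use them.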


ISNI-IA – Academic Community

• Database already strong
• BRTH, SCHU, PROQ, JNAM, MLA, AMS
• 1.47 million records, 220,000 assigned, > 2 million links
• Institutions
• Ongoing assignments
• via RAGs (e.g. Proquest)
• Theses
• Discussions on further JISC involvement – ZETOC
• ORCID
• Allocated a non-conflicting range of identifiers
• New overtures


[Diagram labels: High quality data sources; Matches & Links; Diffusion; Exposure; Shared identifier; Data exchange; Synchronisation; Enquiry; Researcher claiming; Publications; Rich CV; Links]
