Use Case Virtual International Authority File (VIAF)
Contents
- 1 Name
- 2 Owner
- 3 Background and Current Practice
- 4 Goal
- 5 Target Audience
- 6 Use Case Scenario
- 7 Application of linked data for the given use case
- 8 Existing Work (optional)
- 9 Related Vocabularies (optional)
- 10 Problems and Limitations
- 11 Related Use Cases and Unanticipated Uses (optional)
- 12 Library Linked Data Dimensions / Topics
- 13 References (optional)
Name
The Wiki page URL should be of the form "Use_Case_Name", where Name is a short name by which we can refer to the use case in discussions. The Wiki page URL can act as a URI identifier for the use case.
Virtual International Authority File (VIAF)
Owner
The person responsible for maintaining the correctness/completeness of this use case. Most obviously, this would be the creator.
Jeff Young
Background and Current Practice
Where this use case takes place in a specific domain, and so requires some prior information to understand, this section is used to describe that domain. As far as possible, please put explanation of the domain in here, to keep the scenario as short as possible. If this scenario is best illustrated by showing how applying technology could replace current existing practice, then this section can be used to describe the current practice. Often, the key to why a use case is important also lies in what problem would occur if it was not achieved, or what problem means it is hard to achieve.
[Note: VIAF's Web interface is based on an SRU XML "record" database. The "General Perspective" sections of this document reflect this reasonably familiar conceptualization. Sections labeled "Linked Data Perspective" try to show how run-time Linked Data features and use cases have been layered on.]
General Perspective
[Copied from http://www.oclc.org/research/activities/viaf/]
A joint project with the Library of Congress, the Deutsche Nationalbibliothek, and the Bibliothèque nationale de France, VIAF explores virtually combining the name authority files of all three institutions into a single name authority service.
As of the fall of 2009 there are 18 personal name authority files from 15 organizations participating in VIAF.
Linked Data Perspective
Increasingly, VIAF participants publish their authority records on the Web in a hodgepodge of formats and languages using a variety of REST and non-REST URI forms. VIAF uses these source URIs, if available, but the URI behaviors, semantics, and document markups are often inconsistent and site-specific and sometimes ever-so-slightly and yet perniciously deviant:
| Institution | Link | rdf:type | Note |
|---|---|---|---|
| National Library of Sweden | http://libris.kb.se/resource/auth/207435 | a foaf:Person | Linked Data 303 URIs forwarding to Different Documents |
| Deutsche Nationalbibliothek | http://d-nb.info/gnd/118640445 | (via inferencing:) a rdaFrbr:Person; perhaps other rdf:types that require a priori knowledge of the http://d-nb.info/gnd/ namespace | Not Linked Data (no 303), but content-negotiates to HTML and RDF/XML. |
| Library of Congress/NACO (OCLC Research proxy) | http://errol.oclc.org/laf/n++79034525.html | [Web document] | HTML only |
| Biblioteca Nacional de España | http://catalogo.bne.es/uhtbin/authoritybrowse.cgi?action=display&authority_id=XX971832 | [Web document] | HTML only |
| Getty Research Institute | http://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500010879 | [Web document] | HTML only |
For records that aren't identified at the source, VIAF HTML links to a local "processed" representation instead:
- http://viaf.org/processed/NKC%7Cjn19990005005 (HTML only)
Goal
Two short statements stating (1) what is achieved in the scenario without reference to linked data, and (2) how we use linked data technology to achieve this goal.
- Consolidate the classification, identity, and discoverability of things (people, organizations, etc.) coming from various authority agencies for unexpected reuse.
- Strengthen the coherent/resolvable identity of these things and integrate them with variant human and machine-friendly Web document representations using 303 (See Other) and content-negotiable generic documents.
Target Audience
The main audience of your case. For example scholars, the general public, service providers, archivists, computer programs...
General Perspective
[Copied from http://www.oclc.org/research/activities/viaf/]
The goal of this project is to facilitate research across languages anywhere in the world by making authorities truly international.
OCLC is conducting this research because we have proven software for matching and linking authority records for personal names. This software will be used to match the authority records from The Deutsche Nationalbibliothek (dnb) and the Bibliothèque nationale de France (BnF) to the corresponding authority records from the Library of Congress (LC).
Once the existing authority records are linked, shared OAI servers will be established to maintain the authority files and to provide user access to the files. Users then will be able to see names displayed in the most appropriate language. For example, German users will be able to see a name displayed in the form established by the dnb, while French users will see the same name as established by the BnF, and American users will view the name as established by LC. Users in their respective countries will be able to view name records as established by the other nations, thus making the authorities truly international and facilitating research across languages anywhere in the world.
Linked Data Perspective
The only audience limitation for identifying an instance of some class (currently "people") is the extent of included individuals that are recognized by participating authority agencies. Within that set, any audience can benefit from globally-unique, actionable, descriptively-essential, interrelated identifiers for the individuals.
Use Case Scenario
The use case scenario itself, described as a story in which actors interact with systems. This section should focus on the user needs in this scenario. Do not mention technical aspects and/or the use of linked data.
General Perspective
The VIAF server root (http://viaf.org/) includes a form for searching "cluster" records using an SRU API. For example:
An essential feature of this SRU implementation is REST URI support for accessing individual "records":
These identifiers are suitable for human and machine-oriented applications because of their support for content-negotiation (HTML, XML, RDF/XML).
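As a minimal sketch of what content negotiation against one of these identifiers looks like from a client, the Python below builds (but does not send) three requests that differ only in their Accept header. The record number and the media types are taken from elsewhere on this page; the helper name `build_request` and the `ACCEPT_BY_FORMAT` mapping are illustrative assumptions, and whether the live service still responds this way is not verified here.

```python
from urllib.request import Request

# Generic document URI for a record number that appears on this page.
GENERIC_DOCUMENT = "http://viaf.org/viaf/96994048/"

# One media type per representation named above (HTML, XML, RDF/XML).
# The mapping keys are hypothetical labels used only in this sketch.
ACCEPT_BY_FORMAT = {
    "html": "text/html",
    "xml": "application/xml",
    "rdf": "application/rdf+xml",
}


def build_request(fmt: str) -> Request:
    """Build a GET request asking for one representation; nothing is sent."""
    return Request(GENERIC_DOCUMENT, headers={"Accept": ACCEPT_BY_FORMAT[fmt]})


if __name__ == "__main__":
    for fmt in ACCEPT_BY_FORMAT:
        req = build_request(fmt)
        print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

The same URI is used in every request; only the Accept header changes, which is what lets one identifier serve both human and machine-oriented clients.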
Linked Data Perspective
The numerous elements in XML "cluster records" (not to mention their attributes and hierarchy) are VIAF-specific, evolving, and detailed artifacts of the authority matching algorithms. The need is to represent these "cluster records" in a variety of result-oriented conceptual models (OWL) that have been designed for casual and detailed use cases.
Application of linked data for the given use case
This section describes how linked data technology could be used to support the use case above. Try to focus on linked data on an abstract level, without mentioning concrete applications and/or vocabularies. Hint: Nothing library domain specific.
Implementation Issues
The http://viaf.org/viaf/* URI pattern was baked into VIAF early based on the SRU "VIAF database" foundation:
Rewrite rules were added to give each record in this database a REST URI that delivers an HTML representation:
This URI was eventually upgraded to support Linked Data 303 URIs forwarding to One Generic Document to satisfy human (HTML), machine (XML), and semantic (RDF/XML) agents. The implementation's ability to map database "records" to a conceptualized 303 "real world object" is extremely handy because this pattern has potential for transitioning legacy physical models to broadly-conceptualized Linked Data models. Nevertheless, it can also be awkward because mapping database records to real world objects is currently the ONLY way for this implementation to support 303 real world objects. I hope this constraint will be relaxed once the value of Linked Data has proven itself.
In the meantime, the technical challenge has been to expose rationalized OWL individuals using pre-ordained URIs:
| Resource Type | URI | Behavior | Note |
|---|---|---|---|
| Real World Object | http://viaf.org/viaf/96994048 | 303 (See Other) | |
| Generic Document | http://viaf.org/viaf/96994048/ | 200 (OK) Content-negotiation | Note the useful Content-Location header (thanks Ralph!) |
| Web Document | http://viaf.org/viaf/96994048/viaf.html | 200 (OK) (application/xhtml+xml, text/html) | Content-negotiation default |
| Web Document | http://viaf.org/viaf/96994048/viaf.xml | 200 (OK) (application/xml, text/xml) | [editorial note: Ralph, please remove the distracting XSL Stylesheet reference from this representation] |
| Web Document | http://viaf.org/viaf/96994048/rdf.xml | 200 (OK) (application/rdf+xml) | A more conventional and standards-compliant name would have been something like "about.rdf". |
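The behavior in the table above can be modeled as a small lookup, which may help when reasoning about what a client should expect at each URI. This is a sketch built only from the table, not VIAF's actual rewrite rules: the function names `pick_document` and `respond` are hypothetical, the 404 fallback is an assumption, and the Accept parsing deliberately ignores q-values.

```python
from typing import Optional, Tuple

BASE = "http://viaf.org/viaf/"

# (file name, media types) for each web document, in the table's order.
# The first entry is the content-negotiation default.
WEB_DOCUMENTS = [
    ("viaf.html", ("application/xhtml+xml", "text/html")),
    ("viaf.xml", ("application/xml", "text/xml")),
    ("rdf.xml", ("application/rdf+xml",)),
]


def pick_document(accept: str) -> str:
    """Pick a web document for an Accept header (simplified: q-values ignored)."""
    requested = [part.split(";")[0].strip() for part in accept.split(",")]
    for media_type in requested:
        for name, types in WEB_DOCUMENTS:
            if media_type in types:
                return name
    return WEB_DOCUMENTS[0][0]  # fall back to the HTML default


def respond(uri: str, accept: str = "*/*") -> Tuple[int, Optional[str]]:
    """Return (status, Location or Content-Location) as modeled from the table."""
    if not uri.startswith(BASE):
        return 404, None  # assumption: anything outside the pattern is not found
    rest = uri[len(BASE):]
    if rest.isdigit():
        # Real world object: 303 (See Other) to the generic document.
        return 303, uri + "/"
    if rest.endswith("/") and rest[:-1].isdigit():
        # Generic document: 200 with a Content-Location naming the variant served.
        return 200, uri + pick_document(accept)
    record_id, _, name = rest.partition("/")
    if record_id.isdigit() and name in {doc for doc, _ in WEB_DOCUMENTS}:
        # Specific web document: served directly, no negotiation.
        return 200, uri
    return 404, None


if __name__ == "__main__":
    print(respond("http://viaf.org/viaf/96994048"))
    print(respond("http://viaf.org/viaf/96994048/", "application/rdf+xml"))
```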
If you're a fan of opaque URIs, then you might sympathize with this constraint. I mourn for human intuition and would prefer URIs with a class name path segment (e.g. "/person"), but because the Linked Data aspects of VIAF are experimental this hasn't been a priority. As a consequence, the classes that diverse users are likely to recognize have reluctantly been encoded in the URI hash "fragment" or suppressed altogether:
| Real World Object | rdf:type |
|---|---|
| http://viaf.org/viaf/96994048/#{URL-encoded form of the established heading} | viaf:EstablishedHeading |
| http://viaf.org/viaf/96994048/#XRefAlternate:{URL-encoded form of the variant heading} | viaf:EstablishedHeading |
| http://viaf.org/viaf/96994048/#skos:Concept | skos:Concept |
| http://viaf.org/viaf/96994048/#foaf:Person | foaf:Person |
| http://viaf.org/viaf/96994048/#rdaEnt:Person | rdaEnt:Person |
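To make the hash URI patterns above concrete, here is a hedged Python sketch that assembles them for a record. The page does not say exactly which characters VIAF percent-encodes in a heading, so encoding everything outside the unreserved set is an assumption, as are the function names and the sample heading "Doe, Jane".

```python
from urllib.parse import quote


def generic_document(record_id: str) -> str:
    """Generic document URI for a record, following the pattern on this page."""
    return f"http://viaf.org/viaf/{record_id}/"


def class_uri(record_id: str, curie: str) -> str:
    """Hash URI for a class-named individual, e.g. 'foaf:Person'."""
    return f"{generic_document(record_id)}#{curie}"


def heading_uri(record_id: str, heading: str, variant: bool = False) -> str:
    """Hash URI for an established heading, or a variant heading if variant=True.

    Assumption: the heading is percent-encoded with no extra safe characters.
    """
    prefix = "XRefAlternate:" if variant else ""
    return f"{generic_document(record_id)}#{prefix}{quote(heading, safe='')}"


if __name__ == "__main__":
    # "Doe, Jane" is a made-up heading used only for illustration.
    print(class_uri("96994048", "foaf:Person"))
    print(heading_uri("96994048", "Doe, Jane"))
    print(heading_uri("96994048", "Doe, Jane", variant=True))
```

Because the fragment is never sent to the server, all of these URIs dereference to the same generic document; the fragment only distinguishes individuals inside the returned RDF.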
Ontology Considerations
Unfortunately, the "/viaf" URI path segment didn't provide much insight into the class of "Real World Object" in this case. The initial "cluster records" focus has been on "personal names", but with plans to expand outwards eventually. The obvious existing classes foaf:Person and skos:Concept were both lacking because of preferred/alternate name/label limitations. Eventually we developed a generalized VIAF ontology (JPEG) (OWL) and assigned the 303 real world objects to rdf:type viaf:NameAuthorityCluster. Although this class lacks the intuition of foaf:Person and skos:Concept, it has the merit of reflecting a key class in VIAF's internal and feedback processes.
Internal Rationalization
Native XML "cluster records" are constructed to support internal processing and are presumably not intuitive or stable representations for feedback to participating institutions:
- http://viaf.org/viaf/24604287/viaf.xml
- [editorial note: Ralph, please remove the XSL Stylesheet reference on this representation.]
In contrast, the VIAF Ontology is a concise distillation of the conceptual results:
Bulk harvesting of this distillation could be supported using RSS, Atom, and/or OAI-PMH, but this feature has not been implemented yet. In the meantime, the OWL individuals are available record-by-record by content-negotiating application/rdf+xml from generic resource URIs:
Users can also bypass content-negotiation by accessing the RDF/XML representation directly:
External Rationalization
Keep in mind that VIAF never uses or stores RDF. The VIAF OWL was rationalized after the XML cluster records were developed, and the RDF representations for individuals are conjured from those cluster records at runtime using XSLT. This should be reassuring for legacy Web applications because it means they shouldn't need to be rewritten to be interoperable on the Semantic Web. Instead, existing REST URIs for HTML documents can be upgraded to support content-negotiation for retrospectively-rationalized RDF.
RDF produced like this at runtime could be rationalized against a local or external OWL ontology. In VIAF, we have tried to crosswalk to as many other OWL ontologies as seem useful. Currently, this includes foaf:Person, skos:Concept, and rdaEnt:Person. Each new crosswalk results in some redundancy in the RDF representation, but in principle it should make these resources suitable for use from all these perspectives. These crosswalks are still experimental, though, and are likely to change significantly in the future.
Existing Work (optional)
This section is used to refer to existing technologies or approaches which achieve the use case (Hint: Specific approaches in the library domain). It may especially refer to running prototypes or applications.
Related Vocabularies (optional)
Here you can list and clarify the use of vocabularies (element sets and value vocabularies) which can be helpful and applied within this context.
Problems and Limitations
This section lists reasons why this scenario is or may be difficult to achieve, including pre-requisites which may not be met, technological obstacles etc. Please explicitly list here the technical challenges made apparent by this use case. This will aid in creating a roadmap to overcome those challenges.
SKOSXL
Some of the classes and properties in the VIAF ontology need to be reconsidered from a SKOSXL perspective. The viaf:Heading, viaf:EstablishedHeading, viaf:XRefAlternate, and viaf:XRefRelated classes can probably be moved to skosxl:Label.
Switching to SKOSXL properties isn't so easy, though. Ideally, viaf:hasEstablishedForm and viaf:hasXrefAlternate would be moved to skosxl:prefLabel and skosxl:altLabel, respectively. The difficulty is in the SKOS S14 integrity condition:
S14: A resource has no more than one value of skos:prefLabel per language tag.
In VIAF, "preferred" literals are coupled with contributing agencies rather than language tag. There isn't a trivial one-to-one mapping between agency and language tag, so some work is needed to satisfy the condition.
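The conflict can be illustrated with a small check. The Python below is a sketch under stated assumptions: the agencies, language tags, and literals are all made up, and the premise that each agency's preferred form has already been assigned a language tag is itself hypothetical (the page notes that no trivial agency-to-tag mapping exists). The function `s14_violations` is an illustrative name, not part of any SKOS tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (contributing agency, language tag, preferred literal)
PreferredForm = Tuple[str, str, str]


def s14_violations(forms: Iterable[PreferredForm]) -> Dict[str, List[str]]:
    """Return language tags that would carry more than one distinct prefLabel."""
    by_lang = defaultdict(set)
    for _agency, lang, literal in forms:
        by_lang[lang].add(literal)
    return {lang: sorted(lits) for lang, lits in by_lang.items() if len(lits) > 1}


if __name__ == "__main__":
    # Hypothetical cluster: two agencies share a language but prefer different forms.
    cluster = [
        ("Agency A", "en", "Doe, Jane, 1900-1980"),
        ("Agency B", "en", "Doe, Jane"),
        ("Agency C", "de", "Doe, Jane"),
    ]
    print(s14_violations(cluster))
```

In this made-up cluster the two English-language agencies disagree, so mapping both preferred forms to skos:prefLabel with the tag "en" would break S14, while the single German form would not.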
The rdf:type on the 303 URI is unstable
The rdf:type assigned to VIAF 303 URIs (e.g. http://viaf.org/viaf/108389263) has been unstable and remains uncomfortable. Originally it was a foaf:Person then a skos:Concept, and is currently a viaf:NameAuthorityCluster with the foaf:Person and skos:Concept identified with hash URIs:
It's becoming quite clear, though, that people don't understand hash URIs and most automatically assume that the 303 URI is the only one that matters. Here are some possible resolutions that would be nice to have some feedback on:
- shuffle the URIs so the Person/Organization gets the 303 instead (and risk confusing existing users)
- merge the rdf:types into the individual identified with the 303 even if it creates a weird chimera of Person (or Organization, etc.), skos:Concept, viaf:NameAuthorityCluster (and risk confusing existing and future users)
Sharing Linked Data URIs in MARC
Some of the contributing agencies are starting to identify "the thing" the authority record is about. If these identifiers are included in the authority records that contributors send in, then VIAF can assert owl:sameAs to wire them together in the cluster. Corine Deliot has described a solution derived from MARBI discussions of ISNI. (Her example using the VIAF URI is problematic as described in the previous section.)
Related Use Cases and Unanticipated Uses (optional)
The scenario above describes a particular case of using linked data. However, by allowing this scenario to take place, the likely solution allows for other use cases. This section captures unanticipated uses of the same system apparent in the use case scenario.
VIAF seems to be a specific example of Use Case Vocabulary Merging.
Library Linked Data Dimensions / Topics
The dimensions and topics are used to organize the use cases. At the same time, they might help you to identify additional aspects currently not covered. If appropriate topics and/or dimensions are missing, please specify them here and annotate them by a “*”.
*these items are not in the initial list, suggestion for adding them
References (optional)
This section is used to refer to cited literature and quoted websites.