Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Knowledge organisation systems, such as taxonomies, thesauri or subject heading lists, play a fundamental role in information structuring and access. The Semantic Web Deployment Working Group aims at providing a model for representing such vocabularies on the Semantic Web: SKOS (Simple Knowledge Organisation System).
This document presents the preparatory work for a future version of SKOS. It lists representative use cases, which were obtained after a dedicated questionnaire was sent to a wide audience. It also features a set of fundamental or secondary requirements derived from these use cases, that will be used to guide the design of SKOS.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is the first public Working Draft of the "SKOS (Simple Knowledge Organization System) Use Cases and Requirements", developed by the W3C Semantic Web Deployment Working Group [SWD]. The SWD Working Group is chartered to advance the November 2005 SKOS Core Vocabulary Specification Working Draft and the SKOS Core Guide Working Draft to W3C Recommendation.
The Use Cases detailed in this document have been selected as representative of the use cases submitted in response to a "Call for Use Cases" published in December 2006. These use cases as well as Issues identified by the working group have resulted in draft Requirements that will guide the design of the future SKOS Recommendation. Early feedback is therefore most useful. Feedback on use cases that can help to resolve open issues is especially important. Note also that any feature listed under Candidate Requirements should be considered as "at risk" without further feedback.
Comments on this Working Draft are encouraged and may be sent to public-swd-wg@w3.org; please include the text "[SKOS] UCR comment" in the subject line. All messages received at this address are viewable in a public archive. Commenters may wish to review the list of open issues before generating a new comment.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Knowledge organisation systems play a fundamental role in information structuring and access, e.g. for asset description or web site organisation. Such vocabularies, coming in the form of thesauri, classification schemes, subject heading lists, taxonomies or even folksonomies, are developed and used worldwide, by institutions as well as individuals. However, these very important knowledge resources are still mostly isolated from the outside world, and not widely used in implementing systems.
The development of new information technologies and infrastructures, such as the World Wide Web, calls for new ways to create, manage, publish and use these knowledge organisation systems. It is especially expected that conceptual schemes will benefit from greater shareability, e.g. by being published via web services. In the meantime, the documentary systems which use them will turn to advanced information retrieval techniques to construct most of their semantic structure and lexical content.
SKOS (Simple Knowledge Organisation System) [SWBP-SKOS-CORE-GUIDE] provides a model to represent and use vocabularies and ontologies in the framework of the Semantic Web. A first version has been produced by the Semantic Web Best Practices and Deployment working group [SWBPD], and is already used in some research projects. The Semantic Web Deployment Working Group [SWD] has been chartered to continue this work, and to "produce guidelines and an RDF vocabulary (SKOS) for transforming an existing vocabulary representation into an RDF/OWL representation" [SWD-Charter].
In order to delimit the scope and elicit the required features for SKOS, the SWD working group has issued a call for use cases, asking for descriptions of existing or planned SKOS applications, according to a specific questionnaire. Following the gathering of these use cases, the Working Group has elicited a number of requirements for SKOS, which are motivated by the previous work on SKOS or by contributions received after the call for use cases.
This document gives an account of this process. First, section 2 presents summaries of selected contributions, and pointers to the complete set of cases which were sent to the Working Group. Second, section 3 lists the requirements the Working Group has elicited so far.
(Contributed by Antoine Isaac.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucManuscriptsDetailed
and at http://www.w3.org/2006/07/SWD/wiki/EucIconclassDetailed)
The purpose of this application is to provide the user with access to two collections of illuminated manuscripts from the Dutch and French national libraries, Medieval Illuminated Manuscripts and Mandragore (accessible online at http://www.kb.nl/manuscripts and http://mandragore.bnf.fr). The descriptions of images from these two collections follow different metadata schemes, and contain values from different controlled vocabularies for subject indexing. The user should however be able to search for items from the two collections using their preferred point of view, either using vocabulary from collection 1 or vocabulary from collection 2.
The main feature of the application is collection browsing, which uses hierarchical links in vocabularies: if a concept matching a query has subconcepts, the documents indexed against these subconcepts should be returned. The application also uses mapping links between concepts from the two vocabularies. For example, if an equivalence link is found between a query concept from one vocabulary and another concept from the second one, documents indexed by this other concept shall also be included in the query results.
Requires: R-ConceptualRelations, R-IndexingRelationship
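The browsing behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the application's actual code: the concept identifiers, the `NARROWER`, `EXACT_MATCH` and `INDEX` tables and the `expand` and `search` functions are all hypothetical, and the sample data is invented for the example. It also assumes that equivalence links may be followed in both directions and combined with hierarchical links, which the description above does not state explicitly.

```python
from collections import deque

# Hypothetical data, invented for illustration only.
# Hierarchical links: concept -> direct subconcepts (within one vocabulary).
NARROWER = {
    "voc1:animals": ["voc1:mammals", "voc1:monsters"],
    "voc1:mammals": ["voc1:pig"],
}
# Equivalence mapping links between the two vocabularies (stored one way,
# applied in both directions below).
EXACT_MATCH = {"voc1:pig": "voc2:cochon"}
# Indexing: document -> concepts it is indexed against.
INDEX = {
    "doc1": {"voc1:monsters"},
    "doc2": {"voc2:cochon"},
    "doc3": {"voc1:landscape"},
}


def expand(concept):
    """Return the concept, its subconcepts and their mapped equivalents."""
    symmetric = dict(EXACT_MATCH)
    symmetric.update({v: k for k, v in EXACT_MATCH.items()})
    seen, queue = {concept}, deque([concept])
    while queue:
        current = queue.popleft()
        neighbours = list(NARROWER.get(current, []))
        if current in symmetric:
            neighbours.append(symmetric[current])
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def search(concept):
    """Return the sorted documents indexed against any expanded concept."""
    concepts = expand(concept)
    return sorted(doc for doc, subjects in INDEX.items() if subjects & concepts)
```

With this sample data, `search("voc1:animals")` returns both `doc1` (indexed against a subconcept) and `doc2` (indexed against a concept of the other vocabulary that is mapped to a subconcept).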
Additionally, the application enables search based on free text queries over the collection metadata: documents can be retrieved based on free-text querying of the different fields used to describe the documents (creator, place, subject, etc.). For subject indexing, if a text query matches the label of a controlled vocabulary concept, the documents indexed against this concept will be returned.
The two collections use respectively the Iconclass and Mandragore analysis vocabularies.
Iconclass (http://www.iconclass.nl) contains 28000 items used to describe the subjects of an image (persons, events, abstract ideas). Complete versions are available for English, German, French, Italian, and partial translations for Finnish and Norwegian.
Requires: R-MultilingualLexicalInformation
The main building blocks of Iconclass are subjects, used to describe the subjects of images. An Iconclass subject consists of a notation (an alphanumeric identifier used for annotation) and a textual correlate (e.g. “25F9 mis-shapen animals; monsters”). Subjects are organized in hierarchical trees, as in the following extract:
2 Nature
  25 earth, world as celestial body
    25F animals
      25F(+) KEY
      25F1 groups of animals
      …
      25F9 mis-shapen animals; monsters
      25FF fabulous animals (sometimes wrongly called 'grotesques'); 'Mostri' (Ripa)
Subjects can have associative cross-reference links between them (systematic references) and are linked to keywords that are used to search for them in Iconclass tools. Keywords form a network of their own, featuring see links (from one non-preferred keyword, not attached to any subject, to a preferred one), see also links (between keywords that are semantically or iconographically related) and translation links (between keywords in different languages).
Requires: R-LabelRepresentation, R-RelationshipsBetweenLabels
Iconclass additionally provides auxiliary mechanisms for subject specialization at indexing time. These actually allow for collection-specific vocabulary extension:
11H(…) saints can be specialized into 11H(VALENTINE), which does not exist in the standard Iconclass.
25F2 mammals can be combined with (+33) head of an animal, resulting in 25F(+33), which will index an image of a mammal's head.
Similarly, 11H(VALENTINE)2 can be synthesized from 11H(VALENTINE) and 11H(...)2 early life of male saint, to index an image which specifically denotes the early days of St. Valentine.
Requires: R-ConceptSchemeExtension, R-SkosSpecialization, R-IndexingAndNonIndexingConcepts, R-ConceptCoordination
Maintenance of the vocabulary is done via manual editing of semi-structured source files. As a general rule, the standard version will only be changed in a conservative way, not modifying the existing subjects.
Mandragore contains 16000 subjects. 15800 are descriptors, which are used to describe the illuminations and form a flat list. Additional structure is given by 200 abstract topic classes which form a hierarchy organizing the descriptors according to general domains, but cannot themselves be used to describe documents:
ZOOLOGIE
  .zoologie (généralités)
  .mollusques
  .mammifères
    cochon [mammifère ongulé]
    girafe [mammifère ongulé]
A descriptor is specified by a French label (“cochon”, for pig), optional rejected forms (“porc”), an optional definition (“mammifère ongulé”, hoofed mammal) and a reference to one or more topic classes (“.mammifères”, mammals). A note can sometimes be found as a complementary definition.
To enable integrated browsing, elements from the Mandragore and Iconclass vocabularies must be linked together using equivalence or specialization links.
Requires: R-ConceptualMappingLinks
(Contributed by Matthias Samwald,
Medizinische Universität Wien.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucBiozenDetailed)
Bio-zen (http://neuroscientific.net/index.php?id=43) allows the description of biological systems and the representation of scientific discourse on the web in a highly distributed manner. It is intended to be used by researchers and developers in the life sciences.
SKOS is used in bio-zen for the representation of many existing life sciences vocabularies, taxonomies and ontologies coming from the "Open Biomedical Ontologies" (OBO) collection (http://www.fruitfly.org/~cjm/obo-download/). The size of all converted taxonomies taken together is on the order of millions of concepts. Typical examples are the Gene Ontology or Medical Subject Headings (MeSH), an entry of which is displayed here:
id | MESH:A.01.047.025 |
name | abdominal_cavity |
def | "The region in the abdomen extending from the thoracic DIAPHRAGM to the plane of the superior pelvic aperture (pelvic inlet). The abdominal cavity contains the PERITONEUM and abdominal VISCERA\, as well as the extraperitoneal space which includes the RETROPERITONEAL SPACE." [MESH:A.01.047.025] |
synonym | abdominal_cavity |
synonym | cavitas_abdominis |
is_a | MESH:A.01.047 ! abdomen |
To represent such vocabulary elements as well as other types of information, the existing SKOS model has been integrated into a single OWL ontology, together with the DOLCE foundational ontology and the Dublin Core metadata model. In the process, the SKOS model has been extended with special types of concepts, e.g. biozen:sequence-concept. To enable efficient reasoning over the available dataset, the existing constructs have been made compatible with the OWL-DL language.
Requires: R-CompatibilityWithOWL-DL
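One possible way of carrying an entry such as the MeSH example above over to SKOS is sketched below in Python. The correspondence between OBO tags and SKOS properties (`name` to `skos:prefLabel`, `synonym` to `skos:altLabel`, `def` to `skos:definition`, `is_a` to `skos:broader`) is an assumption made for illustration and is not claimed to be the conversion used by bio-zen; the function name `obo_to_skos` and the tuple representation of triples are likewise hypothetical. The definition is left out of the sample entry for brevity.

```python
# Assumed correspondence between OBO tags and SKOS properties
# (illustrative only; bio-zen's actual conversion may differ).
TAG_TO_SKOS = {
    "name": "skos:prefLabel",
    "synonym": "skos:altLabel",
    "def": "skos:definition",
    "is_a": "skos:broader",
}


def obo_to_skos(pairs):
    """Turn the (tag, value) pairs of one OBO entry into (s, p, o) triples."""
    subject = next(value for tag, value in pairs if tag == "id")
    pref_label = next((v for t, v in pairs if t == "name"), None)
    triples = [(subject, "rdf:type", "skos:Concept")]
    for tag, value in pairs:
        prop = TAG_TO_SKOS.get(tag)
        if prop is None:
            continue  # "id" and unknown tags produce no triple
        if tag == "is_a":
            # Drop the trailing "! label" comment of the OBO format.
            value = value.split("!")[0].strip()
        if tag == "synonym" and value == pref_label:
            continue  # do not repeat the preferred label as an alternative
        triples.append((subject, prop, value))
    return triples


entry = [
    ("id", "MESH:A.01.047.025"),
    ("name", "abdominal_cavity"),
    ("synonym", "abdominal_cavity"),
    ("synonym", "cavitas_abdominis"),
    ("is_a", "MESH:A.01.047 ! abdomen"),
]
triples = obo_to_skos(entry)
```

The synonym that merely repeats the name is skipped, so that the same string is not asserted both as preferred and as alternative label of the concept.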
The bio-zen framework will consist of several applications, especially Semantic Wikis. The bio-zen ontology incorporates constructs for making statements about digital information resources, that is, for creating "concept tags". This concept-tagging is an important feature of bio-zen, because it eases the integration of information from different sources.
Requires: R-IndexingRelationship
(Contributed by Margherita Sini and Johannes
Keizer, Food and Agriculture Organization.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucAimsDetailed)
This application coming from the AIMS project (http://www.fao.org/aims) is a semantic search service that makes use of mapped agriculture thesauri. It allows users to search any available terminology in any of the languages in which the thesauri are provided and retrieve information from resources which may have been indexed by one of the mapped vocabularies. Typical functions are navigating resources, helping to build boolean searches via concept identification, or expanding given searches by extra languages or synonyms.
Requires: R-IndexingRelationship
The service builds on several agriculture vocabularies: the Agrovoc Thesaurus (http://www.fao.org/aims/ag_intro.htm), the Agris/Caris Classification Scheme (ASC), the FAO Technical Knowledge Classification Scheme (TKCS), the subjects from the FAOTERM vocabulary, etc.
Agrovoc contains 35000 terms in 12 languages (not all of the languages feature the same translated terms, however), while ASC, TKCS and FAOTERM range between 100 and 200 categories available in the 5 official FAO languages. Agrovoc terms consist of one or more words and always represent a single concept. Terms are divided into descriptors and non-descriptors; currently only the former are used for indexing. For each descriptor, a word block is displayed showing its relations to other terms: BT (broader term), NT (narrower term), RT (related term), UF (non-descriptor). There are also scope notes, used to clarify the meaning of both descriptors and non-descriptors.
Term code | 1939 |
Term label | EN : Cows, FR : Vache, ES : Vaca, AR : بقرات, ZH : 母牛, PT : Vaca, CS : krávy, JA : 雌牛, TH : แม่โค, SK : kravy, DE : KUH |
BT | Cattle (code 1391) |
NT | Suckler cows, Dairy cows (26767, 36875) |
RT | Heifers, Cow milk, Milk yielding animals, Females (3535, 4833, 15969, 16080) |
SNR | Females (15969) |
Scope Note | Use only for cattle and zebu cattle; for other species use "Females" (15969) plus the descriptor for the species |
Requires: R-ConceptualRelations, R-LabelRepresentation, R-TextualDescriptionsForConcepts, R-MultilingualLexicalInformation
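To illustrate how multilingual labels of this kind can support searching in any language and expanding a query with translations, here is a minimal Python sketch. It is not the AIMS implementation: the `CONCEPTS` layout and the functions `build_label_index` and `expand_query` are hypothetical, and only a few of the labels of the "Cows" word block are included.

```python
# A cut-down rendering of two Agrovoc entries; the dictionary layout is
# hypothetical and only some labels are included.
CONCEPTS = {
    1939: {"labels": {"en": "Cows", "fr": "Vache", "es": "Vaca", "de": "KUH"}},
    1391: {"labels": {"en": "Cattle"}},
}


def build_label_index(concepts):
    """Map each lower-cased label, in any language, to the term codes using it."""
    index = {}
    for code, concept in concepts.items():
        for label in concept["labels"].values():
            index.setdefault(label.lower(), set()).add(code)
    return index


def expand_query(text, concepts, index):
    """Return all labels of the concepts one of whose labels matches the text."""
    labels = set()
    for code in index.get(text.lower(), ()):
        labels.update(concepts[code]["labels"].values())
    return sorted(labels)


LABEL_INDEX = build_label_index(CONCEPTS)
```

A query for "vache" is resolved to term code 1939 through the index, and `expand_query` then returns the labels of that concept in the other languages, which can be added to the search.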
The AIMS project also includes some more specific links, presented at http://www.fao.org/aims/cs_relationships.htm: Concept-to-Concept relationships (subclass of; caused by; member of; part of), Term-to-Term relationships (related term; synonym; translation) and String-to-String relationships (spelling variant; acronym).
Examples of such links are:
synonym | bucket | pail |
abbreviation_of | Corp. | Corporation |
acronym | Food and Agriculture Organization | FAO |
spelling_variant | organisation | organization |
translation | vache | cow |
scientific_taxonomic_name | African violet | Saintpaulia |
Requires: R-SkosSpecialization, R-RelationshipsBetweenLabels
Currently the Agrovoc management system lacks distributed maintenance, but it is expected that a new system will soon solve this problem, which is crucial since changes are made by experts from all over the world.
For AIMS, Agrovoc has been converted into SKOS (ftp://ftp.fao.org/gi/gil/gilws/aims/kos/agrovoc_formats/skos/2006) and is being mapped to two other vocabularies: the Chinese Agricultural Thesaurus (CAT) and the National Agricultural Library thesaurus (NAL). This mapping uses links inspired by the SKOS mapping vocabulary [SWBP-SKOS-MAPPING], as below:
CAT-ID | CAT-EN | Map | AG-ID | AG-EN | AG-ID | AG-EN |
30854 | Senta flammea | Exact | 9748 | Cheena | ||
50008 | Mayetola destructor | Exact-OR | 24260 | Triticale (gramineae) | 7949 | Triticales (product) |
1160 | Two-shear sheep | NT1 | 3662 | Hordeum vulgare |
Requires: R-ConceptualMappingLinks
(Contributed by Sean Barker, BAE Systems.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucProductLifeCycleSupportDetailed)
The problem of the Product Life Cycle Support (PLCS) application is to integrate a network of interconnected supply chains, with multiple, large customers buying a wide range of products (from shoes to aircraft) each dictating their own standards, and with every supplier being part of multiple supply chains. Each customer wants to maintain a common approach over all its supply chains. And each supplier wants to maintain the same system for each of the supply chains it works in.
The aim of this application is to propose a data exchange mechanism for managing the life support of complex products (http://www.oasis-open.org), including configuration definition, maintenance definition, maintenance planning and scheduling, and maintenance and usage recording (including configuration change).
To that end, an upper ontology of several hundred items for the description of the product life cycle will be defined. There is no chance of the entire supply system (tens of thousands of businesses) developing a single detailed model. However, given the upper ontology, they will be free to specialize individual ontology terms (playing the role of placeholders for local extension) to meet their precise needs.
PLCS is conceptually a co-operatively developed web in XML, with the live version being a set of runtime views assembled from files submitted by a dozen or so contributors. It may be useful, where ontologies diverge, to map terms between the diverging branches, either to indicate where terms can be harmonized to their equivalent, or to identify that there is a similarity link that is not exact equivalence.
Requires: R-ConceptualRelations, R-ConceptSchemeExtension, R-ConceptualMappingLinks
The PLCS vocabulary addresses hundreds of separate functions, including classification of items, classification of information usages (e.g. types of part identifier), classification of entity roles (e.g. date as start date) or classification of relationships (e.g. supersedes).
Typical examples of terms are:
Identification_code | An Identification_code is an identifier_type which is encoded according to some convention. Typically but not necessarily concatenated from parts each with a meaning. E.g. tag number, serial number, package number and document number. |
Part_identification_code | A Part_identification_code is an Identification_code that identifies the types of parts. For example, a part number. CONSTRAINT: An Identification_assignment classified as a Part_identification_code can only be assigned to Part Organization_name |
Owner_of | An Owner_of is an Organization_or_person_in_organization_assignment that is assigning a person or organization to something in the role of owner. For example, the owner of the car. |
The vocabulary has been encoded using OWL, and is managed via the Protégé OWL editor.
Requires: R-TextualDescriptionsForConcepts
(Contributed by Véronique Malaisé and
Hennie Brugman, Vrije Universiteit Amsterdam and Max Planck Institute for
Psycholinguistics.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucRankingForDescriptionDetailed
and at http://www.w3.org/2006/07/SWD/wiki/EucGtaaBrowser)
Radio and television programs at the Dutch national broadcasting archive (Sound and Vision) are typically associated with contextual text descriptions: web site texts, subtitles, program guide texts, texts from the production process, etc. These context documents are used by documentalists at Sound and Vision who manually describe programs using concepts from the GTAA thesaurus (Gemeenschappelijke Thesaurus Audiovisuele Archieven - Common Thesaurus for Audiovisual Archives).
The CHOICE project (part of the Dutch CATCH research program) uses natural language processing techniques to automatically extract candidate GTAA terms from the context documents. The application focused on in this section takes these candidate terms as input, and ranks them on the basis of the structure of the GTAA thesaurus. For example, the fact that "Voting" and "Democratization" are related in GTAA by a two-step path (via the "Election" term and two "related-to" links) will positively influence the ranking of these terms. Ranked terms will be presented to documentalists to speed up their description work.
The GTAA vocabulary covers a wide range of topics, as it is meant to describe anything that can be broadcast on TV or radio. It contains approximately 160,000 terms, divided into 6 disjoint facets: Keywords, Locations, Person Names, Organization-Group-Other Names, Maker Names, and Genres.
The thesaurus mainly uses constructs from the ISO 2788 standard, like Broader Term, Narrower Term, Related Term and Scope Notes. Terms from all facets of the GTAA may have Related Terms, Use/Use For and Scope Notes, but only Keywords and Genres can also have Broader Term/Narrower Term relations, organizing them into a set of hierarchies. In addition to these standard features, Keyword terms are thematically classified in 88 subcategories of 16 top Categories.
Preferred Term | ambachten (crafts) |
Related Terms | ondernemingen (ventures) , beroepen (professions), artistieke beroepen (artistic professions) |
Broader Term | beroepen (professions) |
Narrower Terms | boekbinders (bookbinders), bouwvakkers (building workers), glasblazers (glassblowers) |
Scope Note | niet voor afzonderlijke ambachten maar alleen als verzamelbegrip, bijv. voor (markten van) oude ambachten (not for specific crafts, only in general meaning, e.g. (markets of) old crafts) |
Categories | 05 economie (economy), 09 techniek (technique) |
Requires: R-ConceptualRelations, R-LabelRepresentation, R-SkosSpecialization
The application, envisioned as a SOAP web service, uses a Sesame RDF web repository containing the SKOS version of the GTAA thesaurus to retrieve the 'term contexts' of the terms in the input list, which is stored in a local RDF repository.
This term context includes, for one given term, all terms that are directly connected to it by Broader Term, Narrower Term or Related Term relations. This includes pre-computed inter-facet links that are not part of the ISO standard, though allowed by the GTAA data model. For example, one can link a "King" in the Person facet to the general subject "Kings" and the country which this King rules.
For the ranking, it is now assumed that candidate terms that are mutually connected by thesaurus relations (directly or indirectly) are more likely to be good descriptions than isolated candidate terms. Later on, it might be interesting to differentiate between types of thesaurus relations, or to use more complex patterns of these relations.
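The ranking idea can be sketched in Python as follows. This is an assumption-laden illustration rather than the CHOICE implementation: the scoring function (the number of other candidate terms reachable within two thesaurus links), the function names `build_graph`, `within` and `rank`, and the sample links are all hypothetical. Broader, Narrower and Related Term relations are treated alike, as undirected links.

```python
from collections import deque

# Hypothetical fragment of thesaurus links (Broader, Narrower and Related
# Term relations are all treated alike, as undirected edges).
LINKS = [
    ("Voting", "Election"),
    ("Election", "Democratization"),
    ("Crafts", "Professions"),
]


def build_graph(links):
    """Build an undirected adjacency map from pairs of linked terms."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def within(graph, start, max_steps):
    """Return the terms reachable from start in at most max_steps links."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        term, dist = queue.popleft()
        if dist == max_steps:
            continue
        for neighbour in graph.get(term, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    seen.discard(start)
    return seen


def rank(candidates, graph, max_steps=2):
    """Order candidates by how many other candidates lie within max_steps."""
    scores = {
        term: len(within(graph, term, max_steps) & set(candidates))
        for term in candidates
    }
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(candidates, key=lambda t: (-scores[t], t))


GRAPH = build_graph(LINKS)
```

With `max_steps=1`, `within` returns exactly the directly connected terms, which corresponds to the term context described above. With the default of two steps, "Voting" and "Democratization" support each other through "Election" and are ranked above an isolated candidate such as "Crafts".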
The thesaurus-based recommendation system can also be integrated with a recommendation system that is based on co-occurrences between terms that are used in previously existing descriptions of programs.
(Contributed by William Bug, Drexel
University College of Medicine.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucBirnLexDetailed)
BIRNLex is an integrated ontology+lexicon used for various purposes — some end-user/interactive, others back-end/infrastructure — within the BIRN Project to support semantically-formal data annotation, semantic data integration, and semantically-driven, federated query resolution.
Requires: R-ConceptualMappingLinks, R-IndexingRelationship, R-LexicalMappingLinks
Below are examples of BIRNLex class definitions that illustrate the need for lexical support and links to external knowledge sources. The general design goal has been to use both the Dublin Core metadata elements and SKOS wherever possible, and to use SKOS for all lexical qualities. Certain annotation properties should be shared across all biomedical knowledge resources; other required elements are specific to the needs of BIRN (the group producing BIRNLex).
Class | Anterior_ascending_limb_of_lateral_sulcus |
birn_annot:birnlexCurator | Bill Bug |
birn_annot:birnlexExternalSource | NeuroNames |
birn_annot:bonfireID | C0262186 |
birn_annot:curationStatus | raw import |
birn_annot:neuronames | ID 49 |
birn_annot:UmlsCui | C0262186 |
obo_annot:createdDate | "2006-10-08"^^http://www.w3.org/2001/XMLSchema#date |
obo_annot:modifiedDate | "2006-10-08"^^http://www.w3.org/2001/XMLSchema#date |
skos:prefLabel | Anterior_ascending_limb_of_lateral_sulcus |
skos:scopeNote | human-only |
Class | Medium_spiny_neuron |
birn_annot:birnlexCurator | Maryann Martone |
birn_annot:birnlexDefinition | The main projection neuron found in caudate nucleus, putamen and nucleus accumbens... |
birn_annot:bonfireID | BF_C000100 |
birn_annot:curationStatus | pending final vetting |
dc:source | Maryann Martone |
obo_annot:createdDate | "2006-07-15"^^http://www.w3.org/2001/XMLSchema#date |
obo_annot:modifiedDate | "2006-09-28"^^http://www.w3.org/2001/XMLSchema#date |
skos:prefLabel | Medium_spiny_neuron |
Requires: R-CompatibilityWithDC, R-CompatibilityWithOWL-DL, R-ConceptualRelations, R-LabelRepresentation, R-ConceptSchemeExtension
The following is a subset of BIRNLex applications, either extant or in the offing:
In all of these applications, it is critical to have a clear, distinct, and shared representation for the associated lexicon. For instance, when integrating BIRN segmented brain images with those from other projects across the net, use of lexical variants from a variety of public terminologies and thesauri such as SNOMED and MeSH can provide a powerful means to largely automate semantic integration of like entities, e.g. corresponding brain regions or equivalent behavioral assays described using different preferred labels/names. In providing a community-shared formalism for representing the associated lexicon, SKOS can greatly simplify this task. If, for instance, the lexical repository contained in UMLS (a collection of Lexical Unique Identifiers, each lexical variant of a term getting one LUI) were represented according to SKOS, this would provide an extremely valuable resource to the community of semantically-oriented bioinformatics researchers, as well as a powerful tool to support latent semantic analysis or natural language processing when linking to unstructured text.
The following terminologies and ontologies are being linked into BIRNLex: Neuronames, Brainmap.org classification schemes, RadLex, Gene Ontology, Reactome, OBI, PATO, Subcellular Anatomy Ontology (CCDB - http://ccdb.ucsd.edu/), MeSH.
Neuronames concerns brain anatomy and comprises about 750 classes and thousands of associated lexical variants. The Brainmap.org classification includes hierarchies to describe neuroanatomy, subject variables, stimulus conditions, and experimental paradigms associated with functional MRI of the nervous system. The Subcellular Anatomy Ontology is designed to describe the subcellular entities associated with ultrastructural and histological imaging of neural tissue. Currently the application is only dealing with English lexical entries.
BIRNLex curators are working with the National Center for Biomedical Ontology (NCBO) to adopt the OBO Foundry recommendations in the construction of BIRNLex. Use of SKOS elements can be useful, so that, for instance, software applications can draw on "skos:prefLabel", "obo_annot:synonym", "obo_annot:definition", etc.
The management of BIRNLex is currently done manually in Protege-OWL.
Requires: R-CompatibilityWithOWL-DL
However, the ultimate goal is to adopt a client-server infrastructure that will create an RDF-based backend store and support both curation of the ontology and annotation using the ontology via Java Portlet-based applications. BIRN has a core infrastructure staff dedicated to use of the GridSphere Java Portlet implementation framework (www.gridsphere.org).
(Contributed by Curt Langlotz.
Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucRadlexDetailed)
RadLex provides a structured vocabulary of terms used in the field of radiology. Currently completed are listings of anatomic terms and "findings", which includes things that can be seen on or inferred from images produced by radiologists. These two sets include a total of about 7500 terms. A list of the terms used to describe the creation of such images, including information about the equipment used and the various imaging sequences performed, will be complete by the end of 2007.
An example application demonstrating functionality is an image annotation program that reads in RadLex and provides users the ability to search for and use particular RadLex terms to associate with images, post-coordinating them if necessary. Users would want to be able to retrieve RadLex terms by name or synonym.
Requires: R-ConceptualRelations, R-LabelRepresentation, R-TextualDescriptionsForConcepts, R-ConceptCoordination
RadLex, which can be searched and browsed online at www.radlex.org, is a taxonomy currently built predominantly using is-a relations. But there are also part-of and other relations (especially for anatomy), and new relations will be added as RadLex expands. Each term has a rich set of metadata fields to include provenance information and terminological data such as synonyms, definition, and related terms from other vocabularies.
The practical fields include:
and optionally, any
Requires: R-ConceptualRelations, R-AnnotationOnLabel, R-RelationshipsBetweenLabels, R-LexicalMappingLinks
The relationships used among terms include:
For instance, “nervous system” has a part called “brain”, and “nervous system” contains “nervous system spaces”. The view of the hierarchy itself does not reveal the relationships among the terms; this information is found within the term features, shown in this format on the right-hand side. In this framework, the hierarchy is generated from the different relationships among terms, using either SPARQL or a custom interface to an application that consumes the terminology.
There are 9 separate hierarchies in the vocabulary: Treatment; Image acquisition, Processing and Display; Modifier; Finding; Anatomic Location; Uncertainty (to be renamed Certainty); Teaching Attribute; Relationship; and Image Quality (as seen in the screenshots above). There are currently no relations holding between terms in different hierarchies, though these could be developed in the future (e.g. linking particular Findings to potential Anatomic Locations).
The RadLex vocabulary is provided in English, with plans to include other languages (e.g., German).
Requires: R-MultilingualLexicalInformation
Protégé has been used to create a machine-readable version of the vocabulary, which is available at http://www.radlex.org/radlex/docs/downloads.html. RadLex will be available in OWL-DL in the future.
Requires: R-CompatibilityWithOWL-DL
During the design of the vocabulary, basic guidelines from Cimino and Chute were used, such as ensuring that a term only corresponds to one concept. As the terminology is being developed into a more structured form, with more types of relationships, a term may have several parents as long as each is linked by a different relationship type, for example one IS-A parent and one PART-OF parent.
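One reading of this rule is that a term has at most one parent per relationship type. Under that assumption, a consistency check could look like the Python sketch below. The function name `conflicting_parents`, the triple layout and the two IS-A parents in the sample data are hypothetical and are not taken from RadLex; only the part-of link between "brain" and "nervous system" comes from the example given earlier.

```python
from collections import defaultdict

# Hypothetical relation statements: (child, relationship type, parent).
RELATIONS = [
    ("brain", "PART-OF", "nervous system"),
    ("brain", "IS-A", "organ"),
    ("brain", "IS-A", "anatomical structure"),
]


def conflicting_parents(relations):
    """Return {(child, relation type): parents} wherever a child has more
    than one parent for the same relationship type."""
    parents = defaultdict(set)
    for child, rel_type, parent in relations:
        parents[(child, rel_type)].add(parent)
    return {key: found for key, found in parents.items() if len(found) > 1}
```

Here the first two statements alone are acceptable (one PART-OF parent and one IS-A parent), whereas adding the second IS-A parent is reported as a conflict.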
Potential changes in the vocabulary are submitted to the chair of the RadLex Steering Committee of the Radiological Society of North America, who consults with the relevant lexicon development committee. Accepted changes are periodically incorporated into the vocabulary. The first release was made public in November 2006.
Currently, a mapping is being developed between RadLex and the corresponding terms/codes in SNOMED (Systematized Nomenclature of Medicine) and the ACR (American College of Radiology) Index, the vocabularies that were used as a starting point for terminology development.
From a representational point of view, this mapping shall consist of equivalence and specialization links. Later, we expect people to compose atomic terms (post-coordination) to describe composite entities.
Requires: R-ConceptCoordination
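To make the two kinds of mapping link mentioned above concrete, here is a small Python sketch of a mapping table restricted to equivalence and specialization links. The class, the link-type names and all term identifiers are placeholders invented for this sketch; no actual RadLex, SNOMED or ACR Index codes are shown.

```python
from dataclasses import dataclass

# The two kinds of link named in the text; the string values are assumptions.
LINK_TYPES = {"equivalent", "specialization"}


@dataclass(frozen=True)
class MappingLink:
    source: str     # term identifier in the source vocabulary
    target: str     # term identifier in the target vocabulary
    link_type: str  # one of LINK_TYPES

    def __post_init__(self):
        if self.link_type not in LINK_TYPES:
            raise ValueError(f"unsupported link type: {self.link_type}")


def targets(links, source, link_type):
    """Return the targets mapped from source through links of the given type."""
    return [l.target for l in links if l.source == source and l.link_type == link_type]


# Placeholder identifiers only.
LINKS = [
    MappingLink("radlex:TERM-A", "snomed:TERM-X", "equivalent"),
    MappingLink("radlex:TERM-A", "acr:TERM-Y", "specialization"),
]

if __name__ == "__main__":
    print(targets(LINKS, "radlex:TERM-A", "equivalent"))
```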
(Contributed by Jon Phipps, Cornell University. Complete description available at http://www.w3.org/2006/07/SWD/wiki/RucMetadataRegistryExtended)
The NSDL Registry is intended to provide a complete environment for developing and managing controlled vocabularies. Services are primarily directed at vocabulary owners and include provisions for:
The registry currently has a number of vocabularies registered. A sample entry of a vocabulary/scheme and a single concept is below (taken from http://metadataregistry.org/uri/NSDLEdLvl.html).
Scheme | NSDLEdLvl |
Name | NSDL Education Level Vocabulary |
Owner | National Science Digital Library |
Community | Science, Mathematics, Engineering, Technology |
URL | http://metamanagement.comm.nsdl.org/cgi-bin/wiki.pl?VocabDevel |
Concept | NSDLEdLvl/1023 |
Label | Middle School |
Top Concept | No |
Status | published |
history note | Term source: http://www.ed.gov |
has narrower | Grade 6 |
has narrower | Grade 7 |
has broader | Grades Pre-K to 12 |
alternative label | Junior High School |
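The concept rows of the sample entry above line up fairly directly with SKOS Core properties. The following Python sketch shows one plausible reading of that correspondence. The row-to-property mapping is an assumption of this sketch rather than something stated by the registry; rows with no obvious SKOS counterpart (such as Status) are left out, and the broader and narrower values are kept as the labels shown in the table, since their concept identifiers are not given.

```python
# Assumed correspondence between registry row names and SKOS Core properties.
ROW_TO_SKOS = {
    "Label": "skos:prefLabel",
    "alternative label": "skos:altLabel",
    "history note": "skos:historyNote",
    "has narrower": "skos:narrower",
    "has broader": "skos:broader",
}

# Concept rows copied from the sample entry.
ENTRY = [
    ("Label", "Middle School"),
    ("Top Concept", "No"),
    ("Status", "published"),
    ("history note", "Term source: http://www.ed.gov"),
    ("has narrower", "Grade 6"),
    ("has narrower", "Grade 7"),
    ("has broader", "Grades Pre-K to 12"),
    ("alternative label", "Junior High School"),
]


def to_statements(concept_id, rows):
    """Turn mapped rows into (subject, property, value) statements; skip the rest."""
    return [
        (concept_id, ROW_TO_SKOS[name], value)
        for name, value in rows
        if name in ROW_TO_SKOS
    ]


if __name__ == "__main__":
    for statement in to_statements("NSDLEdLvl/1023", ENTRY):
        print(statement)
```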
The SWD Working Group maintains on its wiki site the complete list of descriptions that were sent following its call for use cases:
The use cases presented in the previous section motivate a number of requirements that the SKOS specification must or should meet in order to fulfill its aim as a standard model for porting simple concept schemes to the Semantic Web. Depending on the level of consensus reached in the Working Group, these requirements are categorized into accepted and candidate requirements.
Note: in the following, to avoid ambiguities, vocabulary will be used to refer to the SKOS vocabulary, that is, the set of constructs (classes, properties) introduced in the SKOS model. Concept Scheme will be used to refer to the objects built with SKOS, i.e. the application-specific collections of concepts that are mentioned in SKOS use cases.
@@ Some requirements are linked to issues that are still being examined by the Working Group, as found on the wiki site http://www.w3.org/2006/07/SWD/wiki/SkosIssuesSandbox. @@
Motivation: Tgn, Manuscripts, Aims, ProductLifeCycleSupport, RankingForDescription, etc.
Motivation: Manuscripts, BirnLex, ProductLifeCycleSupport
Motivation: Manuscripts, Aims, ProductLifeCycleSupport, BirnLex, MetadataRegistry
Motivation: Tgn, Manuscripts, Aims, RankingForDescription, etc.
Motivation: Manuscripts, Aims, RadLex
Motivation: Manuscripts, Tgn, Aims, Biozen, RankingForDescription
@@ Linked to SKOS-I-extension-6, SKOS-I-SpecializationOfRelationships @@
Motivation: Aims, ProductLifeCycleSupport, TacticalSituationObject, BirnLexDetailed, etc.
Motivation: RadLex
@@ Linked to SKOS-I-AnnotationOnLabel @@
Motivation: BirnLex
@@ Linked to SKOS-I-CompatibilityWithDC @@
@@ Linked to SKOS-I-CompatibilityWithISO11179 @@
@@ Linked to SKOS-I-CompatibilityWithISO2788 @@
@@ Linked to SKOS-I-CompatibilityWithISO5964 @@
Motivation: Biozen, BirnLex, RadLex
@@ Linked to SKOS-I-owlImport-7, SKOS-I-Semantics-10 @@
Motivation: Manuscripts, RadLex, UDC, Rameau
@@ Linked to SKOS-I-ConceptSchemeContainment @@
Motivation: GtaaBrowser, MetadataRegistry
@@ Linked to SKOS-I-Semantics-10 @@
@@ Linked to SKOS-I-GroupingInConceptHierarchies, SKOS-I-collections-5 @@
@@ Linked to SKOS-I-IndexingAndNonIndexingConcepts, SKOS-I-coordination-8 @@
Motivation: Manuscripts, Biozen, Aims, BirnLex
@@ Linked to SKOS-I-IndexingRelationship @@
@@ Linked to SKOS-I-LexicalMappingLinks @@
Motivation: MetadataRegistry
@@ Linked to SKOS-I-MappingProvenanceInformation @@
Motivation: Manuscripts, Aims, RadLex
To elicit the requirements that a new version of the Simple Knowledge Organisation System (SKOS) should meet, the Semantic Web Deployment Working Group issued a call for use cases to the different communities concerned with the use of SKOS.
The Working Group received more than 25 submissions, which illustrates the variety of uses to which such a proposal can be put. Eight of them were selected for this document as the most representative.
Some of these use cases came with very high-quality descriptions, and most correspond to development efforts that are currently under way and therefore go beyond pure research hypotheses. This gives a sound basis for the process of gathering requirements for SKOS, which the second part of this document describes.
Currently, requirements are divided into accepted and candidate requirements, reflecting the level of consensus they have reached in the Working Group at the time this document was created. In the near future, the Working Group will have to make a final decision regarding the candidate requirements, either accepting them or rejecting them. It will of course have to adapt the existing SKOS material so that it meets the accepted requirements.
The editors gratefully acknowledge contributions from Lora Aroyo, Hugh Barnes, Bruce Bargmeyer, Sean Barker, Sean Bechhofer, Pieter Bellekens, Hennie Brugman, Dario Cerizza, Irene Celino, Thierry Cloarec, Francesco Corcoglioniti, Sarah Currier, Emanuele Della Valle, Diane Hillmann, Chris Holmes, Bernard Horan, Julian Johnson, Simon Jupp, Johannes Keizer, Walter Koch, Véronique Malaisé, George Macgregor, Frédéric Martin, John McCarthy, Emma McCulloch, Alistair Miles, Mitsuharu Nagamori, Dennis Nicholson, Matthias Samwald, Margherita Sini, Aida Slavic, Davide Sommacampagna, Robert Stevens, Doug Tudhope, Andrea Turati, Bernard Vatant, Anna Veronesi.