Final Model Specification

From Ontology-Lexica Community Group

This is an incomplete draft. The final published specification describes the model definitively.


This document describes the specification of the lexicon model for ontologies (lemon) as resulting from the work of the W3C Ontology Lexicon Community Group.

The aim of the lexicon model for ontologies (lemon) is to provide rich linguistic grounding for ontologies. Rich linguistic grounding includes the representation of morphological and syntactic properties of lexical entries as well as the syntax-semantics interface, i.e. the meaning of these lexical entries with respect to an ontology or vocabulary.

This document is structured into nine sections, where the first five correspond to the main modules of the lexicon model for ontologies. Depending on their needs and requirements, applications will use one or more of the modules mentioned below, with the use of the ontolex module being the minimal choice.

  • Ontology-lexicon interface (ontolex)
  • Syntax and Semantics (synsem)
  • Decomposition (decomp)
  • Variation and Translation (vartrans)
  • Linguistic Metadata (lime)

The last three sections do not describe the formal modelling but clarify

  • how one can add linguistic levels of description by means of external ontologies (section 7)
  • how one can use the lexicon model for ontologies to describe lexical nets and other linguistic resources (section 8)
  • the relation between the lexicon model for ontologies and the Simple Knowledge Organization System (SKOS), the Lexical Markup Framework (LMF), and the Open Annotation Model (section 9)


Introduction

Ontologies are an important component of the Semantic Web but current standards such as OWL only support the addition of a simple label to entities in the ontology. It is not currently possible to add inflected forms, different genders, usage notes or even create a full lexical resource such as Princeton WordNet. The model described in this document aims to overcome this issue by extending OWL to support rich lexical information rendering ontologies suitable for human consumption and supporting meaningful interaction with and manipulation of them by human users.

OWL and RDF(S) rely on a property rdfs:label to capture the relation between a vocabulary element and its (preferred) lexicalization in a given language. This lexicalization provides a lexical anchor that makes the concept, property, individual etc. understandable to a human user. The use of a simple label for linguistic grounding as available in OWL and RDF(S) is far from being able to capture the necessary linguistic and lexical information that Natural Language Processing (NLP) applications working with a particular ontology need. Such NLP applications are for example:

  • Natural language generation systems that produce coherent discourses by verbalizing a set of triples
  • Question Answering systems that interpret user questions with respect to one or more ontologies
  • Text interpretation systems that extract triples with respect to one or more ontologies
  • Query interpretation and semantic search in information retrieval systems
  • Natural language based interfaces to ontologies, Semantic Web and Linked Data.


Purpose of the model

The purpose of the model is to support linguistic grounding of a given ontology by adding information to the ontology about how the elements in the vocabulary of the ontology (individuals, classes, properties) are lexicalized in a given natural language.

The model follows the principle of semantics by reference [1] in the sense that the semantics of a lexical entry is expressed by reference to an individual, class or property defined in the ontology. In some cases, the lexicon itself can add named concepts which are not made explicit in the ontology.

The model described here is intended to be open in the sense that it provides a core vocabulary to add information about the linguistic realization of ontology and vocabulary elements. This vocabulary can and should be extended as required by a particular application. In particular, the model is agnostic with respect to the linguistic theory and category systems used to describe the linguistic properties of lexical entries and their syntactic behavior, and it encourages the reuse of existing data category systems or linguistic ontologies. We make explicit in this document at which points we refer to an external repository of data categories or introduce novel sub-properties of properties defined in the lexicon model for ontologies.

The model as presented here is inspired by many other models, in particular the Lexical Markup Framework (LMF), the LexInfo model, the LIR model, the Linguistic Meta Model (LMM), the semiotics.owl ontology design pattern, and the Senso Comune core model.

It is important to also mention what is not the purpose of the model:

  • It is not the goal of the model to replace any existing W3C Standard or Recommendation. In particular, this model is not intended to represent informal schemas such as taxonomies, thesauri and other classification schemes. This is covered by the SKOS model.
  • It is not a vocabulary for annotation of texts. If you need to add annotations to textual data, please consider using the Open Annotation model, the NLP Interchange Format (NIF) or the Extremely Annotational RDF Markup (EARMARK).
  • It is not a formal, semantic model. The model is not supposed to be used to define an ontology and instead assumes that there is a given ontology in some ontology language that is to be linked to a lexicon that expresses how the classes, properties and individuals defined in the ontology are lexicalized.
  • It does not contain a complete collection of linguistic categories. There are many existing efforts to provide a vocabulary to describe the properties of linguistic objects, such as ISOcat, the CLARIN concept registry, OLiA, GOLD and LexInfo. The model recommends building on those and does not introduce any vocabulary of linguistic description.
  • It is not a generic vocabulary supporting the publication of any sort of linguistic data, including typological data, corpora, word lists etc. The lexicon model for ontologies is a model for describing lexical resources in connection with ontologies. See the activities of the Open Linguistics Working Group for more information.

Namespaces

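The model is available with the following sub-namespaces for the various modules of the overall model:

  • ontolex: http://www.w3.org/ns/lemon/ontolex#
  • synsem: http://www.w3.org/ns/lemon/synsem#
  • decomp: http://www.w3.org/ns/lemon/decomp#
  • vartrans: http://www.w3.org/ns/lemon/vartrans#
  • lime: http://www.w3.org/ns/lemon/lime#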

All modules may be imported from the following URL:

Conventions in this document

Throughout this document, we will use Turtle RDF Syntax to provide examples showing the use of the model. Axioms will be paraphrased in natural language. We will assume the following namespaces to be defined throughout all the examples in this document:

@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
@prefix decomp: <http://www.w3.org/ns/lemon/decomp#> .
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .

As we sometimes also refer to other models, we will also assume the following namespaces to be defined in all examples:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix skos: <http://www.w3.org/2004/02/skos/core#>.
@prefix dbr: <http://dbpedia.org/resource/>.
@prefix dbo: <http://dbpedia.org/ontology/>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#>.
@prefix semiotics: <http://www.ontologydesignpatterns.org/cp/owl/semiotics.owl#>.
@prefix oils: <http://lemon-model.net/oils#>.
@prefix dct: <http://purl.org/dc/terms/>.
@prefix provo: <http://www.w3.org/ns/prov#>.

Furthermore, we require that instances of the model adhere to the RDF 1.1 specification and follow the appropriate guidelines. In particular, we require that language tags adhere to Best Current Practice 47 (BCP 47), where tags are made up of a language code (based on ISO 639 parts 1, 2, 3 or 5), optionally followed by a hyphen and an ISO 3166-1 country code. Language tags may also contain further subtags expressing e.g. the region, script or further variants.

In all examples in this document, the above namespaces are assumed to be introduced using an appropriate @prefix statement. Prefixes are omitted from class and object property definitions if the referenced ontology element is defined in the same module. For cross-module and external references, the prefix is made explicit.
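
The language tag requirement stated above can be illustrated with a small sketch. The default namespace and the form identifiers are hypothetical and only serve the illustration; the tags themselves are well-formed BCP 47 tags.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

# Language code only.
:form_cat ontolex:writtenRep "cat"@en .

# Language code followed by an ISO 3166-1 country code.
:form_colour ontolex:writtenRep "colour"@en-GB , "color"@en-US .

# Language code followed by a script subtag (Serbian in Latin and Cyrillic script).
:form_voda ontolex:writtenRep "voda"@sr-Latn , "вода"@sr-Cyrl .
```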

Core

The following diagram depicts the core model (ontolex). Boxes represent classes of the model. Arrows with filled heads represent object properties, while arrows with empty heads represent subclass relations. In arrows labeled 'X/Y' (e.g. sense/isSenseOf), X (sense) is the name of the object property and Y (isSenseOf) the name of the inverse property.

Lexical Entries

The main class of the core of the lexicon ontology model is the class Lexical Entry. A lexical entry is defined as follows:

Class: Lexical Entry


URI: http://www.w3.org/ns/lemon/ontolex#LexicalEntry

A lexical entry represents a unit of analysis of the lexicon that consists of a set of forms that are grammatically related and a set of base meanings that are associated with all of these forms. Thus, a lexical entry is a word, multiword expression or affix with a single part-of-speech, morphological pattern, etymology and set of senses.


SubClassOf: lexicalForm min 1 Form, canonicalForm max 1 Form, dct:language exactly 1, semiotics:Expression

A Lexical Entry thus needs to be associated with at least one form, and has at most one canonical form (see below).

Lexical entries are further specialized into words, affixes (e.g., suffix, prefix, infix or circumfix) and multiword expressions.

Class: Word


URI: http://www.w3.org/ns/lemon/ontolex#Word

A word is a lexical entry that consists of a single token.


SubClassOf: LexicalEntry


Class: Multiword Expression


URI: http://www.w3.org/ns/lemon/ontolex#MultiwordExpression

A multiword expression is a lexical entry that consists of two or more words.


SubClassOf: LexicalEntry


Class: Affix


URI: http://www.w3.org/ns/lemon/ontolex#Affix

An affix is a lexical entry that represents a morpheme (suffix, prefix, infix, circumfix) that is attached to a word stem to form a new word.


SubClassOf: LexicalEntry

The following code gives examples of lexical entries for each of these subclasses, corresponding to the word cat, the multiword expression minimum finance lease payments and the affix anti-:


:cat a ontolex:Word .

:minimum_finance_lease_payments a ontolex:MultiwordExpression .

:anti- a ontolex:Affix .
Example ontolex/example1


Forms

A lexical entry can be realized in different ways from a grammatical point of view. These different grammatical realizations are represented as different forms of the lexical entry. A form is defined as follows:

Class: Form


URI: http://www.w3.org/ns/lemon/ontolex#Form

A form represents one grammatical realization of a lexical entry.


SubClassOf: writtenRep min 1 rdf:langString

A lexical entry can be associated with one of its forms by means of the lexicalForm property, although it is preferred to use one of the two subproperties (canonical form, other form) defined below.

ObjectProperty: Lexical Form


URI: http://www.w3.org/ns/lemon/ontolex#lexicalForm

The lexical form property relates a lexical entry to one grammatical form variant of the lexical entry.


Domain: LexicalEntry

Range: Form

Each form can thus have one or more written representations, defined as follows:

DatatypeProperty: Written Representation


URI: http://www.w3.org/ns/lemon/ontolex#writtenRep

The written representation property indicates the written representation of a form.


Domain: Form

Range: rdf:langString

SubPropertyOf: representation

A simple example of a lexical entry with two different forms corresponding to two different grammatical realizations (as singular and plural noun, respectively) is given below:

:lex_child a ontolex:LexicalEntry ;                                                
  ontolex:lexicalForm :form_child_singular, :form_child_plural .              
                                                                                
:form_child_singular a ontolex:Form ;                                          
  ontolex:writtenRep "child"@en .                                               
                                                                                
:form_child_plural a ontolex:Form ;                                            
  ontolex:writtenRep "children"@en .
Example ontolex/example2


Different forms are used to express different morphological forms of the entry. They should not be used to represent orthographic variants, which should be represented as different representations of the same form. For example, for the lexical entry color, we would have two different representations of the same form, one for the British English written representation colour and one for the American English written representation color. Both representations have the same pronunciation and the same meaning, so they are two orthographic variants of the same form of the lexical entry:


:lex_color a ontolex:LexicalEntry;
     ontolex:lexicalForm :form_color.

:form_color a ontolex:Form;
     ontolex:writtenRep "colour"@en-GB, "color"@en-US.
Example ontolex/example3


A form may also have a phonetic representation, indicating the pronunciation of the word.

DatatypeProperty: Phonetic Representation


URI: http://www.w3.org/ns/lemon/ontolex#phoneticRep

The phonetic representation property indicates one phonetic representation of the pronunciation of the form using a scheme such as the International Phonetic Alphabet (IPA).


Domain: Form

Range: rdf:langString

SubPropertyOf: representation

The following example shows how we can represent two different pronunciations for one form of a lexical entry using the example of "privacy" (the phonetic code is based on IPA):


:lex_privacy a ontolex:LexicalEntry;
     ontolex:lexicalForm :form_privacy.

:form_privacy a ontolex:Form;
     ontolex:writtenRep "privacy"@en;
     ontolex:phoneticRep "ˈpɹɪv.ə.si"@en-GB-fonipa;
     ontolex:phoneticRep "ˈpɹaɪ.və.si"@en-US-fonipa.
Example ontolex/example4


Phonetic representation and written representation are both considered to be sub-properties of a more general property representation, for which users may define extra sub-properties as required.

DatatypeProperty: Representation


URI: http://www.w3.org/ns/lemon/ontolex#representation

The representation property indicates a string by which the form is represented according to some scheme.


Domain: Form

Range: rdf:langString
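
As a minimal sketch of how a user-defined sub-property of representation might look, the following fragment introduces a romanized representation. The property :romanizedRep, the namespace and the form identifier are hypothetical and are not part of the model.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

# Hypothetical sub-property for a romanized (transliterated) representation.
:romanizedRep a owl:DatatypeProperty ;
  rdfs:subPropertyOf ontolex:representation ;
  rdfs:label "romanized representation"@en .

# A form carrying both the written representation and the custom one.
:form_neko a ontolex:Form ;
  ontolex:writtenRep "猫"@ja ;
  :romanizedRep "neko"@ja-Latn .
```

Because the new property is declared as a sub-property of representation, a consumer that only knows the core model can still retrieve the value through ontolex:representation when RDFS entailment is applied.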

A lexical entry has a canonical form, which is the form that primarily identifies this entry and may be used as an index term in the lexicon. The canonical form for single words is typically the lemma of that word and is determined by lexicographic conventions for that language. In the case of verbs, the lemma is typically the infinitive form or, alternatively, the present tense of the verb (note that if an external particle is used to indicate the infinitive as in English "to play", this particle should be omitted). For nouns it is the noun singular form, while for adjectives it is the positive (i.e., non-negative, non-graded) form. For multi-word entries it is assumed that the same principles of lemmatization are applied to the head word.

The property canonical form has a LexicalEntry as domain and a Form as range. It is a subproperty of the property lexicalForm. The canonical form has to be unique, so that the property canonical form is declared to be functional:

ObjectProperty: Canonical Form


URI: http://www.w3.org/ns/lemon/ontolex#canonicalForm

The canonical form property relates a lexical entry to its canonical or dictionary form. This usually indicates the "lemma" form of a lexical entry.


Domain: LexicalEntry

Range: Form

Characteristics: Functional

SubPropertyOf: lexicalForm
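
The consequences of these two axioms can be sketched as follows, assuming standard OWL semantics. The identifiers and the namespace are hypothetical; the entailments are shown as comments and are not asserted in the data.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

:lex_child a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_child .

:form_child a ontolex:Form ;
  ontolex:writtenRep "child"@en .

:form_children a ontolex:Form ;
  ontolex:writtenRep "children"@en .

# Entailed, because canonicalForm is a sub-property of lexicalForm:
#   :lex_child ontolex:lexicalForm :form_child .

# If a second canonical form were asserted, e.g.
#   :lex_child ontolex:canonicalForm :form_children .
# a reasoner would conclude, because canonicalForm is functional:
#   :form_child owl:sameAs :form_children .
```

Note that functionality by itself does not make the second assertion an error; the two forms are merely inferred to be identical unless they are explicitly declared to be different individuals.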

It is recommended to use the rdfs:label property to indicate the canonical form in addition to the property canonicalForm to ensure compatibility with RDFS-based systems that expect an RDFS label. The lexical entries for the noun "cat", the verb "marry" and the adjective "high" would look as follows (in Turtle syntax):


:lex_cat a ontolex:LexicalEntry, ontolex:Word;
     ontolex:canonicalForm :form_cat;
     rdfs:label "cat"@en .

:form_cat a ontolex:Form;
     ontolex:writtenRep "cat"@en .

:lex_marry a ontolex:LexicalEntry, ontolex:Word;
     ontolex:canonicalForm :form_marry;
     rdfs:label "marry"@en .

:form_marry a ontolex:Form;
     ontolex:writtenRep "marry"@en .

:lex_high a ontolex:LexicalEntry, ontolex:Word;
     ontolex:canonicalForm :form_high;
     rdfs:label "high"@en .

:form_high a ontolex:Form; 
    ontolex:writtenRep "high"@en .
Example ontolex/example5


Of course, lexical entries need not correspond to a single word; they can also correspond to a multi-word term, as the following example for the lexical entry "intangible assets" shows:


:lex_intangible_assets a ontolex:LexicalEntry, ontolex:MultiwordExpression;
     ontolex:canonicalForm :form_intangible_assets;
     rdfs:label "intangible assets"@en .

:form_intangible_assets a ontolex:Form;
     ontolex:writtenRep "intangible assets"@en .
Example ontolex/example6


The full form of a multiword expression and any abbreviated form of it are treated as distinct lexical entries, as there may be distinct lexical and pragmatic properties associated with the two forms of the term. Links using other vocabularies such as LexInfo may be used to describe the type of abbreviation:


:nasa a ontolex:LexicalEntry, lexinfo:Acronym ;
  ontolex:canonicalForm :form_nasa ;
  lexinfo:abbreviationFor :national_aeronautics_and_space_administration;
  rdfs:label "NASA"@en .

:form_nasa a ontolex:Form ;
  ontolex:writtenRep "NASA"@en .

:national_aeronautics_and_space_administration a ontolex:LexicalEntry, ontolex:MultiwordExpression ;
  ontolex:canonicalForm :form_national_aeronautics_and_space_administration ;
  rdfs:label "National Aeronautics and Space Administration"@en .

:form_national_aeronautics_and_space_administration a ontolex:Form ;
  ontolex:writtenRep "National Aeronautics and Space Administration"@en .
Example ontolex/example6a



It is also possible to indicate non-canonical forms of lexical entries, which we call other forms:

ObjectProperty: Other Form


URI: http://www.w3.org/ns/lemon/ontolex#otherForm

The other form property relates a lexical entry to a non-preferred ("non-lemma") form that realizes the given lexical entry.


Domain: LexicalEntry

Range: Form

SubPropertyOf: lexicalForm

For example we may specify non-canonical forms of the verb (to) marry as follows:


:lex_marry a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_marry ;
  ontolex:otherForm :form_marries .

:form_marry a ontolex:Form;
     ontolex:writtenRep "marry"@en .

:form_marries a ontolex:Form;
     ontolex:writtenRep "marries"@en .
Example ontolex/example7



The morphological class (i.e., declension, conjugation or similar) may be specified with the morphological pattern property to avoid having to list all regular forms of a word. The implementation of these patterns is not specified by this document (but should be provided by some suitable vocabulary such as LIAM).

ObjectProperty: Morphological Pattern


URI: http://www.w3.org/ns/lemon/ontolex#morphologicalPattern

The morphological pattern property indicates the morphological class of a word.


Domain: LexicalEntry

The following example shows how to indicate the conjugation for the Latin words amare and videre.


:amare ontolex:morphologicalPattern :latin_first_conjugation ;
  ontolex:canonicalForm :amare_form .

:amare_form ontolex:writtenRep "amare"@la .

:videre ontolex:morphologicalPattern :latin_second_conjugation ;
  ontolex:canonicalForm :videre_form .

:videre_form ontolex:writtenRep "videre"@la .
Example ontolex/example8


Semantics

The model supports the specification of the meaning of lexical entries with respect to a given ontology. The lexicon model for ontologies follows the paradigm of semantics by reference in the sense that the meaning of a lexical entry is specified by pointing to the ontological concept that captures or represents its meaning.

The property denotes is defined as follows:

ObjectProperty: Denotes


URI: http://www.w3.org/ns/lemon/ontolex#denotes

The denotes property relates a lexical entry to a predicate in a given ontology that represents its meaning and has some denotational or model-theoretic semantics.


Domain: LexicalEntry

Range: rdfs:Resource

SubPropertyOf: semiotics:denotes

InverseOf: isDenotedBy

PropertyChain: sense o reference

For the lexical entries cat and marriage, the meaning could be expressed by pointing to the corresponding DBpedia resources:


:lex_cat a ontolex:LexicalEntry;
   ontolex:canonicalForm :form_cat;
   ontolex:denotes <http://dbpedia.org/resource/Cat>.

:form_cat a ontolex:Form;
   ontolex:writtenRep "cat"@en.

:lex_marriage a ontolex:LexicalEntry;
   ontolex:canonicalForm :form_marriage;
   ontolex:denotes <http://dbpedia.org/resource/Marriage>.

:form_marriage a ontolex:Form;
   ontolex:writtenRep "marriage"@en .
Example ontolex/example9



The following example shows how we can model the fact that a word is ambiguous with respect to the meanings it denotes. For example, the word 'troll' can refer both to a mythical creature and to someone who makes inflammatory posts on the internet. These two meanings can be captured as follows:


:troll a ontolex:LexicalEntry ;
  ontolex:denotes <http://dbpedia.org/resource/Troll> ;
  ontolex:denotes <http://dbpedia.org/resource/Internet_troll> .
Example ontolex/example10


Two terms may be different lexical entries if they are distinct in part-of-speech, gender, inflected forms or etymology. For example the following words with lemma 'bank' are all considered distinct:


:bank1_en a ontolex:LexicalEntry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
  lexinfo:partOfSpeech lexinfo:noun ;
  lexinfo:etymologicalRoot :banque_frm ;
  ontolex:denotes <http://dbpedia.org/resource/Bank> .

:bank2_en a ontolex:LexicalEntry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
  lexinfo:partOfSpeech lexinfo:noun ;
  lexinfo:etymologicalRoot :hobanca_ang ;
  ontolex:denotes <http://dbpedia.org/resource/Bank_(geographic)> .

:bank3_en a ontolex:LexicalEntry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
  lexinfo:partOfSpeech lexinfo:verb ;
  lexinfo:etymologicalRoot :hobanca_ang ;
  ontolex:denotes <http://dbpedia.org/resource/Banked_turn> .

:bank1_de a ontolex:LexicalEntry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
  lexinfo:partOfSpeech lexinfo:noun ;
  lexinfo:gender lexinfo:feminine ;
  ontolex:denotes <http://dbpedia.org/resource/Bank> ;
  ontolex:otherForm :banken .

:banken ontolex:writtenRep "Banken"@de ;
  lexinfo:number lexinfo:plural .

:bank2_de a ontolex:LexicalEntry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
  lexinfo:partOfSpeech lexinfo:noun ;
  lexinfo:gender lexinfo:feminine ;
  ontolex:denotes <http://dbpedia.org/resource/Bench_(furniture)> ;
  ontolex:otherForm :baenke .

:baenke ontolex:writtenRep "Bänke"@de ;
  lexinfo:number lexinfo:plural .
Example ontolex/example10a


Note that the target of a denotation does not need to be an individual in the ontology but may also refer to a class, property or datatype property defined by the ontology. The model is agnostic with respect to the ontology language used to express the ontological meaning referred to. The assumption is merely that the entity in the range represents some predicate that has a denotational semantics in some formal logical system.
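
As a minimal sketch of this point, the following fragment shows one lexical entry denoting a class and another denoting a property. The entry identifiers and the namespace are hypothetical, and dbo:spouse is assumed here to be a property of the DBpedia ontology.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix dbo:     <http://dbpedia.org/ontology/> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

# A noun whose meaning is given by a class.
:lex_river a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_river ;
  ontolex:denotes dbo:River .

:form_river a ontolex:Form ;
  ontolex:writtenRep "river"@en .

# A noun whose meaning is given by a property (assumed to exist in the ontology).
:lex_spouse a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_spouse ;
  ontolex:denotes dbo:spouse .

:form_spouse a ontolex:Form ;
  ontolex:writtenRep "spouse"@en .
```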

Each property in the model that links to an ontology has an inverse property named "is X-ed by", where X is the original property name, so that the lexicon can also be defined in an ontology-focused manner. In the case of denotes, this inverse property is isDenotedBy.
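
An ontology-focused description using the inverse property might look like the following sketch, in which the ontology element is the subject of the statement. The entry identifiers and the namespace are hypothetical.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix dbr:     <http://dbpedia.org/resource/> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

# Starting from the ontology element and pointing to its lexicalization.
dbr:Cat ontolex:isDenotedBy :lex_cat .

:lex_cat a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_cat .

:form_cat a ontolex:Form ;
  ontolex:writtenRep "cat"@en .

# Entailed, because isDenotedBy is the inverse of denotes:
#   :lex_cat ontolex:denotes dbr:Cat .
```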

In some cases the meaning of a lexical entry is not explicit in the given ontology. Yet, to represent the meaning of a lexical entry we might want to create a new class at the interface between lexicon and ontology by reusing atomic ontological entities defined in the ontology in question. For example, we might want to express the meaning of an adjective by creating an anonymous restriction class at the level of the lexicon-ontology interface. This is illustrated below for the adjective "female" expressing the membership of an anonymous class ∃gender.{female}:


:female a ontolex:LexicalEntry; 
  lexinfo:partOfSpeech lexinfo:adjective;
  ontolex:canonicalForm :female_canonical_form;
  ontolex:sense :female_sense.

:female_canonical_form ontolex:writtenRep "female"@en.

:female_sense ontolex:reference [
    a owl:Restriction;
    owl:onProperty <http://dbpedia.org/ontology/gender> ;
    owl:hasValue <http://dbpedia.org/resource/Female> ] ;
  synsem:isA :female_arg .
Example ontolex/example10b


Lexical Sense & Reference

For many practical modelling situations, the denotes property is not sufficient to capture the precise linking between a lexical entry and its meaning with respect to a given ontology. Thus, the lexicon model for ontologies introduces an intermediate element called lexical sense to capture the particular sense of a word that refers to the particular ontology entity. The lexical entry is linked to a lexical sense by means of the sense property and the lexical sense is linked to the ontology by means of the reference property. The property chain sense o reference is equivalent to the property denotes introduced above.
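
The two-step link from lexical entry to sense to ontology entity can be sketched as follows. The identifiers and the namespace are hypothetical; the denotes triple appears only as a comment because it follows from the property chain rather than being asserted.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix dbr:     <http://dbpedia.org/resource/> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

:lex_cat a ontolex:LexicalEntry ;
  ontolex:canonicalForm :form_cat ;
  ontolex:sense :cat_sense .

:form_cat a ontolex:Form ;
  ontolex:writtenRep "cat"@en .

:cat_sense a ontolex:LexicalSense ;
  ontolex:reference dbr:Cat .

# Entailed through the property chain sense o reference:
#   :lex_cat ontolex:denotes dbr:Cat .
```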

Class: LexicalSense


URI: http://www.w3.org/ns/lemon/ontolex#LexicalSense

A lexical sense represents the lexical meaning of a lexical entry when interpreted as referring to the corresponding ontology element. A lexical sense thus represents a reification of a pair of a uniquely determined lexical entry and a uniquely determined ontology entity it refers to. A link between a lexical entry and an ontology entity via a Lexical Sense object implies that the lexical entry can be used to refer to the ontology entity in question.


SubClassOf: reference exactly 1 rdfs:Resource, isSenseOf exactly 1 LexicalEntry, semiotics:Meaning

Via the lexical sense object we can attach additional properties to a pair of lexical entry and ontological predicate that it denotes to describe under which conditions (context, register, domain, etc.) it is valid to regard the lexical entry as having the ontological entity as meaning. For example, we may wish to express the usages of the word "consumption" in terms of the topic and diachronic usage of the word. As shown in the following example, we can use the Dublin Core property subject to indicate the topic of the Sense. The example also shows how to use the property dating defined in the LexInfo ontology to specify that the fourth sense of consumption is outdated.

:lex_consumption a ontolex:LexicalEntry;
   ontolex:canonicalForm :form_consumption;
   ontolex:sense :consumption_sense1;
   ontolex:sense :consumption_sense2;
   ontolex:sense :consumption_sense3;
   ontolex:sense :consumption_sense4 .

:form_consumption ontolex:writtenRep "consumption"@en.

:consumption_sense1 a ontolex:LexicalSense;
  dct:subject <http://dbpedia.org/resource/Ecology> ;
  ontolex:reference <http://dbpedia.org/resource/Consumption_(ecology)> .

:consumption_sense2 a ontolex:LexicalSense;
  dct:subject <http://dbpedia.org/resource/Anatomy> ;
  ontolex:reference <http://dbpedia.org/resource/Ingestion> .

:consumption_sense3 a ontolex:LexicalSense;
   dct:subject <http://dbpedia.org/resource/Economics> ;
   ontolex:reference <http://dbpedia.org/resource/Consumption_(economics)> .

:consumption_sense4 a ontolex:LexicalSense;
   dct:subject <http://dbpedia.org/resource/Medicine> ;
   lexinfo:dating lexinfo:old ;
   ontolex:reference <http://dbpedia.org/resource/Tuberculosis> .
Example ontolex/example11



The lexical sense has a single lexical entry and a single reference in the ontology. As a consequence, the properties "sense" and "reference" are defined as inverse functional and functional, respectively.

ObjectProperty: Sense


URI: http://www.w3.org/ns/lemon/ontolex#sense

The sense property relates a lexical entry to one of its lexical senses.


Domain: LexicalEntry

Range: LexicalSense

InverseOf: isSenseOf

Characteristics: Inverse Functional



ObjectProperty: Reference


URI: http://www.w3.org/ns/lemon/ontolex#reference

The reference property relates a lexical sense to an ontological predicate that represents the denotation of the corresponding lexical entry.


Domain: LexicalSense or synsem:OntoMap

Range: rdfs:Resource

InverseOf: isReferenceOf

Characteristics: Functional
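
A practical consequence of these characteristics is that two lexical entries with the same meaning do not share a sense; each entry has its own sense, and both senses point to the same reference. The following is a minimal sketch with hypothetical identifiers; dbr:Car is used purely for illustration.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix dbr:     <http://dbpedia.org/resource/> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical namespace

:lex_car a ontolex:LexicalEntry ;
  ontolex:sense :car_sense .

:lex_automobile a ontolex:LexicalEntry ;
  ontolex:sense :automobile_sense .

# One sense per entry (sense is inverse functional) ...
:car_sense a ontolex:LexicalSense ;
  ontolex:reference dbr:Car .

# ... and one reference per sense (reference is functional).
:automobile_sense a ontolex:LexicalSense ;
  ontolex:reference dbr:Car .
```

Keeping the senses separate leaves room to attach entry-specific information, such as register or domain, to each pairing of entry and reference.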


Usage

The interpretation of a word (lexical entry) with respect to a meaning defined in a given ontology is often modulated by usage conditions or pragmatic implications, in particular due to register, connotations or meaning nuances of a word. Consider, for example, the French words 'rivière' and 'fleuve', which refer to rivers flowing into other rivers and rivers flowing into the sea, respectively. As ontological classes that capture the specific meanings of these French words might not be available in the ontology, these meaning nuances can be specified using the property usage, which captures information about the usage conditions and pragmatic implications under which the lexical entry can be used to refer to the ontological meaning in question. These usage conditions do not replace a formally defined sense but complement it with additional information describing the usage of the lexical entry.

This specification does not prescribe how constraints on the usage of senses are to be expressed. Nevertheless, the example below shows how to model the lexical meaning of 'rivière' and 'fleuve' when used to refer to the DBpedia class River:


:riviere a ontolex:LexicalEntry ;
  ontolex:sense :riviere_sense .

:fleuve a ontolex:LexicalEntry ;
  ontolex:sense :fleuve_sense .

:riviere_sense ontolex:reference <http://dbpedia.org/ontology/River> ;
  ontolex:usage [ 
    rdf:value "A riviere is a river that flows into another river"@en
  ] .

:fleuve_sense ontolex:reference <http://dbpedia.org/ontology/River>;
  ontolex:usage [
    rdf:value "A fleuve is a river that flows into the sea"@en
  ] .

Example ontolex/example12


ObjectProperty: Usage


URI: http://www.w3.org/ns/lemon/ontolex#usage

The usage property indicates usage conditions or pragmatic implications when using the lexical entry to refer to the given ontological meaning.


Domain: LexicalSense

Range: rdfs:Resource

Lexical Concept

We have seen above how to capture the fact that a certain lexical entry can be used to denote a certain ontological predicate. We capture this by saying that the lexical entry denotes the class or ontology element in question. However, sometimes we would like to express the fact that a certain lexical entry evokes a certain mental concept rather than that it refers to a class with a formal interpretation in some model. Thus, in the lexicon model for ontologies we introduce the class Lexical Concept that represents a mental abstraction, concept or unit of thought that can be lexicalized by a given collection of senses. A lexical concept is thus a subclass of skos:Concept.

Class: Lexical Concept


URI: http://www.w3.org/ns/lemon/ontolex#LexicalConcept

A lexical concept represents a mental abstraction, concept or unit of thought that can be lexicalized by a given collection of senses.


SubClassOf: skos:Concept, semiotics:Meaning

The lexical entry is said to evoke a particular lexical concept, similar to how a lexical entry denotes an ontology reference.

ObjectProperty: Evokes


URI: http://www.w3.org/ns/lemon/ontolex#evokes

The evokes property relates a lexical entry to one of the lexical concepts it evokes, i.e. the mental concept that speakers of a language might associate with the lexical entry when hearing it.


Domain: Lexical Entry

Range: Lexical Concept

InverseOf: isEvokedBy

Property Chain: sense o isLexicalizedSenseOf
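
The property chain listed above could be written in OWL 2 roughly as follows. This is only a sketch: it assumes the usual owl: prefix and the ontolex: namespace given in the URIs of this section, and the normative axioms are those of the published ontology file.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .

# Intended reading of the chain "sense o isLexicalizedSenseOf":
#   ?entry ontolex:sense ?sense .
#   ?sense ontolex:isLexicalizedSenseOf ?concept .
#   =>  ?entry ontolex:evokes ?concept .
ontolex:evokes a owl:ObjectProperty ;
  owl:inverseOf ontolex:isEvokedBy ;
  owl:propertyChainAxiom ( ontolex:sense ontolex:isLexicalizedSenseOf ) .
```

Under this reading, an OWL 2 reasoner could derive an evokes statement from a sense statement and an isLexicalizedSenseOf statement, so stating evokes explicitly is a convenience rather than a necessity.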

The evoked concept is different from the reference in the ontology, as the reference primarily gives an interpretation of a word in terms of the identifiers that would be generated by the semantic parsing of a sentence. For example, to interpret the sentence "John F. Kennedy died in 1963", we may understand the verb "die (in)" as generating the URI deathDate within a SPARQL query. However, we might also want to record the actual lexical sense of the word with respect to a mental lexicon, in which die evokes the event of dying, as modelled in the following example:


:die a ontolex:Word ;
     ontolex:denotes <http://dbpedia.org/ontology/deathDate> ;
     ontolex:evokes  :Dying .
Example ontolex/example13


We can link a lexical concept to a lexical sense that lexicalizes the concept via the property lexicalized sense:

ObjectProperty: Lexicalized Sense


URI: http://www.w3.org/ns/lemon/ontolex#lexicalizedSense

The lexicalized sense property relates a lexical concept to a corresponding lexical sense that lexicalizes the concept.


Domain: Lexical Concept

Range: Lexical Sense

InverseOf: isLexicalizedSenseOf

A simple example involving the use of a lexical concept is the following:

:temporary_change_of_possession a ontolex:LexicalConcept;
     ontolex:lexicalizedSense :borrow_sense;
     ontolex:lexicalizedSense :lend_sense;
     ontolex:isEvokedBy :borrow_le;
     ontolex:isEvokedBy :lend_le.

:borrow_le a ontolex:LexicalEntry;
     ontolex:sense :borrow_sense;
     ontolex:evokes :temporary_change_of_possession.

:lend_le a ontolex:LexicalEntry;
    ontolex:sense :lend_sense;
    ontolex:evokes :temporary_change_of_possession.

Example ontolex/example14



Similarly, we can link a lexical concept to a reference in the ontology by means of the concept property:

ObjectProperty: Concept


URI: http://www.w3.org/ns/lemon/ontolex#concept

The concept property relates an ontological entity to a lexical concept that represents the corresponding meaning.


Domain: owl:Thing

Range: Lexical Concept

InverseOf: isConceptOf

The combined usage of the properties denotes, sense, evokes, concept and lexicalized sense is demonstrated in the example below for the case of a lexical resource such as Princeton WordNet. Roughly, the synsets in a wordnet correspond to lexical concepts in OntoLex. The modelling would thus look as follows:

:cat_lex a ontolex:LexicalEntry ;                                               
  ontolex:canonicalForm :cat_form ;
  ontolex:sense :cat_sense ;
  ontolex:denotes <http://dbpedia.org/resource/Cat> ;
  ontolex:evokes pwn:102124272-n .

:cat_form ontolex:writtenRep "cat"@en .

:cat_sense a ontolex:LexicalSense ;
  ontolex:reference <http://dbpedia.org/resource/Cat> ;
  ontolex:isLexicalizedSenseOf pwn:102124272-n ;
  ontolex:isSenseOf :cat_lex .

<http://dbpedia.org/resource/Cat>
  ontolex:concept pwn:102124272-n ;
  ontolex:isReferenceOf :cat_sense ;
  ontolex:isDenotedBy :cat_lex .

pwn:102124272-n
  ontolex:isEvokedBy :cat_lex ;
  ontolex:lexicalizedSense :cat_sense ;
  ontolex:isConceptOf <http://dbpedia.org/resource/Cat> .
Example ontolex/example15


A definition can be added to a lexical concept as a gloss by using the skos:definition property.

In addition to organizing a lexicon by lexical entries, we may alternatively create a lexicon of concepts by means of the concept set class, defined as follows:

Class: Concept Set


URI: http://www.w3.org/ns/lemon/ontolex#ConceptSet

A concept set represents a collection of lexical concepts.


SubClassOf: skos:ConceptScheme, void:Dataset

In this way lexicons can be ordered onomasiologically, that is, by meanings rather than by lemmas. The concept set is a special type of skos:ConceptScheme. A lexical concept is linked to a ConceptSet using the property skos:inScheme:


:conceptLexicon a ontolex:ConceptSet .

:consumption1 a ontolex:LexicalConcept ;
  ontolex:isConceptOf <http://dbpedia.org/resource/Tuberculosis> ;
  skos:definition "Tuberculosis, MTB, or TB (short for tubercle bacillus), in the past also called phthisis, phthisis pulmonalis, or consumption, is a widespread, and in many cases fatal, infectious disease caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. Tuberculosis typically attacks the lungs, but can also affect other parts of the body. It is spread through the air when people who have an active TB infection cough, sneeze, or otherwise transmit respiratory fluids through the air."@en;
  ontolex:isEvokedBy :consumption ;
  skos:inScheme :conceptLexicon .
                                                                                
:consumption2 a ontolex:LexicalConcept ;                                         
  ontolex:isConceptOf <http://dbpedia.org/resource/Consumption_(Economics)> ;
  skos:definition "Consumption is a major concept in economics and is also studied by many other social sciences. Economists are particularly interested in the relationship between consumption and income, and therefore in economics the consumption function plays a major role."@en;
  ontolex:isEvokedBy :consumption ;
  skos:inScheme :conceptLexicon .
                                                                                
:tuberculosis1 a ontolex:LexicalConcept ;
  ontolex:isConceptOf <http://dbpedia.org/resource/Tuberculosis> ;
  skos:definition "Tuberculosis, MTB, or TB (short for tubercle bacillus), in the past also called phthisis, phthisis pulmonalis, or consumption, is a widespread, and in many cases fatal, infectious disease caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. Tuberculosis typically attacks the lungs, but can also affect other parts of the body. It is spread through the air when people who have an active TB infection cough, sneeze, or otherwise transmit respiratory fluids through the air."@en;
  ontolex:isEvokedBy :tuberculosis ;
  skos:inScheme :conceptLexicon .

:consumption a ontolex:LexicalEntry ;
  ontolex:canonicalForm :consumption_lemma .

:consumption_lemma ontolex:writtenRep "consumption"@en .

:tuberculosis a ontolex:LexicalEntry ;
  ontolex:canonicalForm :tuberculosis_lemma .

:tuberculosis_lemma ontolex:writtenRep "tuberculosis"@en .
Example ontolex/example17


Syntax and Semantics (synsem)

Syntactic Frames

Most words in a language do not stand on their own, but have a certain syntactic behavior in the sense that they appear in certain syntactic structures and require a number of syntactic arguments to be complete. Examples of this are i) transitive verbs (e.g. to own), which require a syntactic subject and a syntactic object, ii) relational nouns (e.g. capital (of), mother (of), son (of), brother (of), etc.), which require a prepositional object, or iii) adjectives, which require a noun to modify. The syntactic behavior of a lexical entry is defined in the lexicon model for ontologies by a syntactic frame:

Class: Syntactic Frame


URI: http://www.w3.org/ns/lemon/synsem#SyntacticFrame

A syntactic frame represents the syntactic behavior of an open-class word in terms of the (syntactic) arguments it requires. It essentially describes the so-called subcategorization structure of the word in question.



In order to relate a lexical entry to one of its various syntactic behaviors as captured by a syntactic frame, the synsem module defines the syntactic behavior property. Each lexical entry should have its own syntactic frame instance; generic behavior such as 'transitive' should be captured by classes.

ObjectProperty: Syntactic Behavior


URI: http://www.w3.org/ns/lemon/synsem#synBehavior

The syntactic behavior property relates a lexical entry to one of its syntactic behaviors as captured by a syntactic frame.


Domain: ontolex:LexicalEntry

Range: SyntacticFrame

Characteristics: InverseFunctional

The following example shows how to indicate that the verb (to) own can be used as a transitive verb. This is accomplished by adding a frame own_frame_transitive that is declared as a transitive frame, using the class TransitiveFrame defined in the LexInfo Ontology.


:own_lex a ontolex:LexicalEntry ;
  ontolex:canonicalForm :own_form ;
  synsem:synBehavior :own_frame_transitive .

:own_frame_transitive a synsem:SyntacticFrame, lexinfo:TransitiveFrame.

:own_form ontolex:writtenRep "own"@en . 
Example synsem/example1
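
To illustrate the recommendation that each lexical entry has its own frame instance, the following sketch adds a second, hypothetical entry for the verb (to) possess. The entry name, the frame name, the default namespace and the lexinfo: namespace URI are assumptions made for illustration only.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem:  <http://www.w3.org/ns/lemon/synsem#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .  # assumed namespace
@prefix :        <http://example.org/lexicon#> .                   # hypothetical

# "possess" gets its own frame instance; what it shares with "own"
# is only the class lexinfo:TransitiveFrame.
:possess_lex a ontolex:LexicalEntry ;
  synsem:synBehavior :possess_frame_transitive .

:possess_frame_transitive a synsem:SyntacticFrame, lexinfo:TransitiveFrame .

# Pointing :possess_lex at :own_frame_transitive instead would be problematic:
# because synBehavior is inverse functional, an OWL reasoner could then
# conclude that :possess_lex and :own_lex are the same individual.
```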


Arguments of a syntactic frame are represented by the class Syntactic Argument:

Class: Syntactic Argument


URI: http://www.w3.org/ns/lemon/synsem#SyntacticArgument

A syntactic argument represents a slot that needs to be filled for a certain syntactic frame to be complete. Syntactic arguments typically realize a certain grammatical function (e.g. subject, direct object, indirect object, prepositional object, etc.).



The object property synArg is used to relate a (syntactic) frame to one of its syntactic arguments.

ObjectProperty: SynArg


URI: http://www.w3.org/ns/lemon/synsem#synArg

The object property synArg relates a syntactic frame to one of its syntactic arguments.


Domain: SyntacticFrame

Range: SyntacticArgument

The following example shows how to extend the example for the verb (to) own by specifically indicating the arguments, in this case via two specific sub-properties of synArg, namely lexinfo:subject and lexinfo:directObject, defined in the external LexInfo ontology.


:own_lex a ontolex:LexicalEntry ;
  ontolex:canonicalForm :own_form ;
  synsem:synBehavior :own_frame_transitive .

:own_form ontolex:writtenRep "own"@en. 

:own_frame_transitive a lexinfo:TransitiveFrame;
       lexinfo:subject :own_frame_subj;
       lexinfo:directObject :own_frame_obj.
Example synsem/example2


Note that if an external ontology is used to describe the type of arguments in more detail, e.g. indicating the grammatical function as in the example above, the external property used needs to be a sub-property of synArg.
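
The sub-property requirement just stated can be sketched as follows for a hypothetical external vocabulary. The ex: namespace and its two properties are invented for illustration and are not part of LexInfo or of this specification.

```turtle
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
@prefix ex:     <http://example.org/grammar#> .   # hypothetical vocabulary

# Each grammatical-function property is declared as a sub-property of synArg,
# so every use of it also counts as a synArg link from frame to argument.
ex:subject a owl:ObjectProperty ;
  rdfs:subPropertyOf synsem:synArg .

ex:directObject a owl:ObjectProperty ;
  rdfs:subPropertyOf synsem:synArg .
```

With these declarations, a statement such as `:own_frame_transitive ex:subject :own_frame_subj` would entail the corresponding synsem:synArg statement under RDFS reasoning.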

Ontology Mappings

At the lexicon-ontology interface, syntactic frames need to be mapped or bound to ontological structures that represent their meaning. In the same way that a lexical sense binds a lexical entry to an ontology entity, the ontology map maps a syntactic frame onto an ontology entity.

Class: OntoMap


URI: http://www.w3.org/ns/lemon/synsem#OntoMap

An ontology mapping (OntoMap for short) specifies how a syntactic frame and its syntactic arguments map to a set of concepts and properties in the ontology that together specify the meaning of the syntactic frame.



In order to link an ontology map to a corresponding sense, the model provides the property ontoMapping, which is defined as functional and inverse functional, so that an ontology map stands in an exact 1:1 relationship with a lexical sense. It is therefore recommended that, when a lexicon requires both the ontology map and the lexical sense, the two entities be defined using the same URI, as there is no technical reason to distinguish them and they have very similar functions.

ObjectProperty: ontoMapping


URI: http://www.w3.org/ns/lemon/synsem#ontoMapping

The ontoMapping property relates an ontology mapping to its corresponding lexical sense.


Domain: OntoMap

Range: LexicalSense

Characteristics: Functional, InverseFunctional

The synsem module introduces the property ontoCorrespondence to establish a mapping between an argument of a predicate defined in the ontology and the syntactic argument that realizes this predicate argument in a given syntactic frame:

ObjectProperty: ontoCorrespondence


URI: http://www.w3.org/ns/lemon/synsem#ontoCorrespondence

The ontoCorrespondence property binds an argument of a predicate defined in the ontology to a syntactic argument that realizes this predicate argument syntactically.


Domain: OntoMap or LexicalSense

Range: SyntacticArgument

Without loss of generality, we assume that an ontology consists of symbols representing individuals, unary predicates and binary predicates, as indicated by the following table:

Type | Predicate | Predicate Logic Notation | RDF Notation
Class | Unary predicate | City(x) | ?x rdf:type dbo:City
Object, Datatype or Annotation Property | Binary predicate | knows(x,y) | ?x foaf:knows ?y
Individual | Constant (null-ary predicate) | London | dbr:London

Predicates with an arity of more than two can be represented by complex senses (see below). This is due to the fact that this module is aligned to RDF and OWL, which distinguish between: individuals/resources (constants), classes (unary predicates) and properties (predicates of arity "2").

In the following, we introduce three sub-properties of the ontoCorrespondence property. The first property, isA, is used to refer to the single argument of a unary predicate in the ontology:

ObjectProperty: Is A


URI: http://www.w3.org/ns/lemon/synsem#isA

The isA property represents the single argument of a class or unary predicate.


SubPropertyOf: ontoCorrespondence

Following the terminology used in RDF/OWL we call the first argument of a property its subject and the second argument the object. The synsem module defines two properties subjOfProp and objOfProp that can be used to refer to the 1st (subject) and 2nd (object) argument of a property, that is a predicate of arity "2".

ObjectProperty: Subject of Property


URI: http://www.w3.org/ns/lemon/synsem#subjOfProp

The subjOfProp property represents the 1st argument or subject of a binary predicate (property) in the ontology.


SubPropertyOf: ontoCorrespondence


ObjectProperty: Object of Property


URI: http://www.w3.org/ns/lemon/synsem#objOfProp

The objOfProp property represents the 2nd argument or object of a binary predicate (property) in the ontology.


SubPropertyOf: ontoCorrespondence

Finally, we can specify the reference owner that expresses the meaning of "to own" with respect to the DBpedia ontology, specifying the mapping between arguments of the property owner and the arguments that realize these arguments syntactically.


:own_lex a ontolex:LexicalEntry ;
  ontolex:canonicalForm :own_form ;
  synsem:synBehavior :own_frame_transitive ;
  ontolex:denotes <http://dbpedia.org/ontology/owner> .

:own_form ontolex:writtenRep "own"@en. 

:own_frame_transitive a lexinfo:TransitiveFrame;
       lexinfo:subject :own_subj;
       lexinfo:directObject :own_obj.

:own_ontomap a synsem:OntoMap;
         synsem:subjOfProp :own_obj;
         synsem:objOfProp :own_subj.
Example synsem/example3
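
To make the argument mapping concrete, the following sketch shows the kind of triple that a sentence of the form "X owns Y" would correspond to under this mapping. The individuals ex:Alice and ex:Bike1 and the ex: namespace are hypothetical.

```turtle
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix ex:  <http://example.org/data#> .   # hypothetical data namespace

# Sentence: "Alice owns Bike1"
#   syntactic subject (:own_subj) = ex:Alice
#   direct object     (:own_obj)  = ex:Bike1
# The mapping binds objOfProp to :own_subj and subjOfProp to :own_obj,
# so the owner ends up in object position of dbo:owner:
ex:Bike1 dbo:owner ex:Alice .
```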


As a further example we show a lexical entry for the relational noun "father (of)". The entry indicates that the relational noun "father (of)" can be used to verbalize the DBpedia property father, whereby the subject in a copula construction such as "X is father of Y" (:arg1 below) corresponds to the 2nd argument of the property father, and the prepositional argument at position Y (:arg2 below) corresponds to the 1st argument of the property father. We use the LexInfo vocabulary to provide linguistic information.


:father_of a ontolex:LexicalEntry ;
    lexinfo:partOfSpeech lexinfo:noun ;
    ontolex:canonicalForm :father_form ;
    synsem:synBehavior :father_of_nounpp ;
    ontolex:sense :father_sense_ontomap .

:father_form a ontolex:Form ;
    ontolex:writtenRep "father"@en .

:father_of_nounpp a lexinfo:NounPPFrame ;
    lexinfo:subject :arg1 ;
    lexinfo:prepositionalArg :arg2 .

# The sense and the ontology map share one URI, as recommended above.
:father_sense_ontomap a synsem:OntoMap, ontolex:LexicalSense ;
    synsem:ontoMapping :father_sense_ontomap ;
    ontolex:reference <http://dbpedia.org/ontology/father> ;
    synsem:subjOfProp :arg2 ;
    synsem:objOfProp :arg1 .

:arg2 synsem:marker :of .

Example synsem/example4


ObjectProperty: Marker


URI: http://www.w3.org/ns/lemon/synsem#marker

The object property marker indicates the marker of a syntactic argument; this can be a case marker or some other lexical entry such as a preposition or particle.


Domain: SyntacticArgument

Range: rdfs:Resource

The following example shows how to specify that the intransitive verb operate, subcategorizing a prepositional phrase introduced by the preposition in, can be used to denote the property regionServed in DBpedia. The entry specifies that in a construction such as "X operates in Y", X refers to the subject of the property regionServed, and Y refers to the object of the property regionServed. Again, we use the LexInfo ontology in our example to provide linguistic information:


:operate_in a ontolex:LexicalEntry ;
    lexinfo:partOfSpeech lexinfo:verb ;
    ontolex:canonicalForm :operate_form ;
    synsem:synBehavior :operate_intransitivepp ;
    ontolex:sense :operate_sense_ontomap .

:operate_form a ontolex:Form ;
    ontolex:writtenRep "operate"@en .

:operate_intransitivepp a synsem:SyntacticFrame ;
    lexinfo:subject :operate_subj ;
    lexinfo:prepositionalArg :operate_pobj .

:operate_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap ;
    synsem:ontoMapping :operate_sense_ontomap ;
    ontolex:reference <http://dbpedia.org/ontology/regionServed> ;
    synsem:subjOfProp :operate_subj ;
    synsem:objOfProp :operate_pobj .

:operate_pobj synsem:marker :in .

Example synsem/example5


Complex ontology mappings / submappings

In many cases, the meaning of a syntactic frame cannot be expressed by exactly one binary predicate as in the examples given above. Take for instance the case of the transitive verb (to) launch, which subcategorizes a subject expressing the company that launched a product, a direct object expressing the launched product, and a prepositional object introduced by the preposition "in" indicating the year of the launch of the product in question. The important point here is that there are three syntactic arguments (subject, object and prepositional object, represented as arg1, arg2 and arg3 below, respectively) that realize the arguments of a complex predicate that consists of the sub-predicates dbpedia:product and dbpedia:launchDate.

Thus, the synsem module introduces the property submap that relates a (complex) ontological map involving various ontological predicates to a set of less complex ontological maps that bind the arguments of one of the involved predicates to a syntactic argument that realizes it.

ObjectProperty: Submap


URI: http://www.w3.org/ns/lemon/synsem#submap

The submap property relates a (complex) ontological mapping to a set of bindings that together bind the arguments of the involved predicates to a set of syntactic arguments that realize them syntactically.


Domain: OntoMap

Range: OntoMap

The following example shows how to use the submap property to indicate that the meaning of the phrase X launched Y in Z is a composition of the properties dbpedia:product and dbpedia:launchDate, which together express the meaning of the syntactic frame:


:launch a ontolex:LexicalEntry ;
  lexinfo:partOfSpeech lexinfo:verb ;
  ontolex:canonicalForm :launch_canonical_form ;
  synsem:synBehavior :launch_transitive_pp ;
  ontolex:sense :launch_sense_ontomap .

:launch_canonical_form ontolex:writtenRep "launch"@en .

:launch_transitive_pp a lexinfo:TransitivePPFrame ;
  lexinfo:subject              :arg1 ;
  lexinfo:directObject         :arg2 ;
  lexinfo:prepositionalAdjunct :arg3 .

# The prepositional argument ("in <year>") may be left out.
:arg3 synsem:marker :in ;
  synsem:optional "true"^^xsd:boolean .

:launch_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap ;
  synsem:ontoMapping :launch_sense_ontomap ;
  synsem:submap :launch_submap1 ;
  synsem:submap :launch_submap2 .

:launch_submap1 ontolex:reference <http://dbpedia.org/ontology/product> ;
  synsem:subjOfProp :arg1 ;
  synsem:objOfProp  :arg2 .

:launch_submap2 ontolex:reference <http://dbpedia.org/ontology/launchDate> ;
  synsem:subjOfProp :arg2 ;
  synsem:objOfProp  :arg3 .

Example synsem/example6


An argument can be marked as not compulsory by means of the optional property. It is generally advisable to use this property only with complex senses. Indicating that an argument is optional means that it does not have to be realized syntactically, in which case, from a semantic point of view, the corresponding argument of the ontological predicate is existentially quantified over. In the above example we have indicated that arg3 is optional, which allows the correct semantics to be assigned to an expression such as X launched Y by existentially quantifying over the year.

DatatypeProperty: Optional


URI: http://www.w3.org/ns/lemon/synsem#optional

The optional property indicates whether a syntactic argument is optional, that is, it can be syntactically omitted.


Domain: SyntacticArgument

Range: xsd:boolean
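
The effect of an optional argument on the resulting semantics can be sketched with instance data for the launch example above. The individuals and the ex: namespace are hypothetical, and the use of xsd:gYear for the year is an assumption; a blank node stands in for the existentially quantified launch date.

```turtle
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/data#> .   # hypothetical data namespace

# "Acme launched Widget in 2010": all three arguments are realized.
ex:Acme   dbo:product    ex:Widget .
ex:Widget dbo:launchDate "2010"^^xsd:gYear .

# "Acme launched Gadget": arg3 is omitted, so the launch date is only
# asserted to exist. In RDF a blank node expresses this.
ex:Acme   dbo:product    ex:Gadget .
ex:Gadget dbo:launchDate [] .
```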

The following example shows how we can capture the diathesis alternation between X gave Y Z and X gave Z to Y, which in our modelling represent the same ontological meaning:



:give a ontolex:LexicalEntry ;
    lexinfo:partOfSpeech lexinfo:verb ;
    ontolex:canonicalForm :give_form ;
    synsem:synBehavior :give_ditransitive ;
    synsem:synBehavior :give_transitive_pp ;
    ontolex:sense :giving_sense_ontomap .

:give_form a ontolex:Form ;
    ontolex:writtenRep "give"@en .

# Frame for "X gave Z to Y"
:give_transitive_pp a lexinfo:TransitivePPFrame ;
    lexinfo:subject :give_subj1 ;
    lexinfo:directObject :give_dobj1 ;
    lexinfo:prepositionalArg :give_pobj1 .

# Frame for "X gave Y Z"
:give_ditransitive a lexinfo:DitransitiveFrame ;
    lexinfo:subject :give_subj2 ;
    lexinfo:indirectObject :give_iobj2 ;
    lexinfo:directObject :give_dobj2 .

:giving_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap ;
    synsem:ontoMapping :giving_sense_ontomap ;
    ontolex:reference <http://www.ontologyportal.org/SUMO.owl#Giving> ;
    synsem:submap :giving_submap1 ;
    synsem:submap :giving_submap2 ;
    synsem:submap :giving_submap3 .

:giving_submap1 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#agent> ;
    synsem:subjOfProp :giving_event ;
    synsem:objOfProp :give_subj1 ;
    synsem:objOfProp :give_subj2 .

:giving_submap2 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#patient> ;
    synsem:subjOfProp :giving_event ;
    synsem:objOfProp :give_dobj2 ;
    synsem:objOfProp :give_dobj1 .

:giving_submap3 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#destination> ;
    synsem:subjOfProp :giving_event ;
    synsem:objOfProp :give_iobj2 ;
    synsem:objOfProp :give_pobj1 .

:give_pobj1 synsem:marker :to .
Example synsem/example7
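
As a sketch of what "the same ontological meaning" amounts to, both "John gave Mary the book" and "John gave the book to Mary" could be interpreted as the instance data below. The individuals and the ex: namespace are hypothetical, and typing the event node as an instance of Giving is an assumption about how the reference of the sense would be applied.

```turtle
@prefix sumo: <http://www.ontologyportal.org/SUMO.owl#> .
@prefix ex:   <http://example.org/data#> .   # hypothetical data namespace

# One event node plays the role of :giving_event in all three submaps.
ex:giving1 a sumo:Giving ;
  sumo:agent       ex:John ;    # subject in both frames
  sumo:patient     ex:Book1 ;   # direct object in both frames
  sumo:destination ex:Mary .    # indirect object, or the "to" argument
```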


Adjectives may be modelled as follows:


:female a ontolex:LexicalEntry ;
  lexinfo:partOfSpeech lexinfo:adjective ;
  ontolex:canonicalForm :female_canonical_form ;
  synsem:synBehavior :female_syn, :female_syn1 ;
  ontolex:sense :female_sense_ontomap .

:female_canonical_form ontolex:writtenRep "female"@en .

# The reference is the class of all things whose gender is Female.
:female_sense_ontomap ontolex:reference [
    a owl:Restriction ;
    owl:onProperty <http://dbpedia.org/ontology/gender> ;
    owl:hasValue <http://dbpedia.org/resource/Female> ] ;
  synsem:ontoMapping :female_sense_ontomap ;
  synsem:isA :female_arg .

:female_syn a lexinfo:AdjectivePredicateFrame ;
  lexinfo:copulativeSubject :female_arg .

:female_syn1 a lexinfo:AdjectiveAttributiveFrame ;
  lexinfo:attributiveArg :female_arg .
Example synsem/example8


Note that in the above example the property synsem:isA is used to mark the single argument/variable of the class of all the things that have female gender. The copulative subject in an expression such as "Mary is female" is bound to this single argument of the corresponding ontological predicate. The semantics is thus in essence the characteristic function that for each element decides if it is in the set denoted by the class.
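
As a brief sketch, the sentence "Mary is female" would then correspond to the following statement, where ex:Mary and the ex: namespace are hypothetical. Under OWL semantics this statement makes ex:Mary a member of the restriction class used as the reference of the sense.

```turtle
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix ex:  <http://example.org/data#> .   # hypothetical data namespace

# The copulative subject fills the single argument marked by synsem:isA.
ex:Mary dbo:gender dbr:Female .
```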

Conditions

Conditions specify the constraints that a context must satisfy for a lexical entry to be used to refer to a certain ontological predicate (reference). These contextual conditions are attached to the lexical sense that mediates the relation between a lexical entry and the ontological predicate it can be used to express.

ObjectProperty: condition


URI: http://www.w3.org/ns/lemon/synsem#condition

The condition property defines an evaluable constraint that derives from using a certain lexical entry to express a given ontological predicate.


Domain: LexicalSense

Range: rdfs:Resource

SubPropertyOf: context

Two special types of conditions are defined in the synsem module, which formulate constraints on the type of arguments that can be used at the first or second position of a property when a certain lexical entry is used to express that property. Take for instance the distinction between the English verbs (to) ride and (to) drive. Both express the means of transportation, but have different implications. Ride implies that the means of transportation is a bike. Instead of introducing different ontological predicates and different senses, the modulation can be captured by specifying restrictions on the values that can fill the 1st or 2nd argument of the corresponding ontological predicate. This is illustrated by the example below:


:ride a ontolex:LexicalEntry ;
  ontolex:sense :ride_sense1 .

:ride_sense1 a ontolex:LexicalSense ;
  ontolex:reference :methodOfTransportation ;
  synsem:propertyRange :Bicycle ;
  synsem:semArg :subj, :obj .

:methodOfTransportation a rdf:Property ;
  rdfs:range :Vehicle .
Example synsem/example9


It is important to note that the propertyDomain or propertyRange properties do not modify in any way the ontological status or commitment of the corresponding property (here: methodOfTransportation). Instead, they make explicit certain implications on the type of arguments involved that derive from the use of a certain lexical entry to express the property in question.

ObjectProperty: propertyDomain


URI: http://www.w3.org/ns/lemon/synsem#propertyDomain

The propertyDomain property specifies a constraint on the type of arguments that can be used at the first position of the property that is referenced by the given sense.


Domain: LexicalSense

Range: rdfs:Resource


ObjectProperty: propertyRange


URI: http://www.w3.org/ns/lemon/synsem#propertyRange

The propertyRange property specifies a constraint on the type of arguments that can be used at the second position of the property that is referenced by the given sense.


Domain: LexicalSense

Range: rdfs:Resource
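
The ride example above covers only one of the two verbs discussed. A corresponding entry for (to) drive might be sketched as follows; the classes :Car and :Person, the default namespace, and the choice of constraints are hypothetical and serve only to show propertyDomain and propertyRange side by side.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem:  <http://www.w3.org/ns/lemon/synsem#> .
@prefix :        <http://example.org/lexicon#> .   # hypothetical

:drive a ontolex:LexicalEntry ;
  ontolex:sense :drive_sense1 .

:drive_sense1 a ontolex:LexicalSense ;
  ontolex:reference :methodOfTransportation ;
  synsem:propertyDomain :Person ;   # constraint on the 1st argument (assumed)
  synsem:propertyRange  :Car .      # constraint on the 2nd argument (assumed)
```

As with ride, these constraints describe what the use of the word implies; they leave the declared domain and range of :methodOfTransportation untouched.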

Decomposition (decomp)

Subterms

Decomposition is the process of indicating which elements constitute a multi-word or compound lexical entry. The simplest way to do this is by means of the subterm property, which indicates that a lexical entry is a part of another entry. This property makes it possible to specify which lexical entries a certain compound lexical entry is composed of.

ObjectProperty: Subterm


URI: http://www.w3.org/ns/lemon/decomp#subterm

The property subterm relates a compound lexical entry to one of the lexical entries it is composed of.


Domain: LexicalEntry

Range: LexicalEntry

The subterm property is used to indicate which terms have been derived from another term by means of adding or removing words, for example:


:AfricanSwineFever a ontolex:LexicalEntry ;
  decomp:subterm :SwineFever .
Example decomp/example1


The subterm property may also be used to indicate the decomposition of compound words. The following example shows how to indicate that the German compound Lungenentzündung ('pneumonia' literally 'lung inflammation') is decomposed into the lexical entries Lunge and Entzündung:


:Lungenentzündung a ontolex:LexicalEntry ;
  decomp:subterm :Lunge_lex;
  decomp:subterm :Entzündung_lex .
Example decomp/example2


It is important to mention that the subterm property is a relation between lexical entries and indicates neither the specific inflected form of a lexical entry that appears in the compound nor the position at which it appears.

Components

The subterm property allows us to indicate which lexical entries a compound is composed of, but it does not indicate the internal structure of the compound. This can be achieved by introducing so-called components. Such components represent a fixed list of each of the elements that compose a lexical entry. In the most common case of a multiword expression, the components of the lexical entry are the individual tokens that compose that entry.

Class: Component


URI: http://www.w3.org/ns/lemon/decomp#Component

A component is a particular realization of a lexical entry that forms part of a compound lexical entry.



Each component is said to be a constituent of a lexical entry:

ObjectProperty: Constituent


URI: http://www.w3.org/ns/lemon/decomp#constituent

The property constituent relates a lexical entry or component to a component that it is constituted by.


Domain: LexicalEntry or Component

Range: Component


:AfricanSwineFever a ontolex:MultiwordExpression ;
  decomp:constituent :African_comp , :Swine_comp , :Fever_comp ;
  decomp:subterm :SwineFever .

:African_comp a decomp:Component .

:Swine_comp a decomp:Component .

:Fever_comp a decomp:Component .

:SwineFever a ontolex:MultiwordExpression ;
  decomp:constituent :Swine_comp , :Fever_comp .
Example decomp/example3


As a component represents a particular realization of a lexical entry which forms part of a compound lexical entry, we need to link the component to the corresponding lexical entry it is a realization of. This is done by the property correspondsTo:

ObjectProperty: Corresponds To


URI: http://www.w3.org/ns/lemon/decomp#correspondsTo

The property correspondsTo links a component to a corresponding lexical entry or argument.


Domain: Component

Range: LexicalEntry or Argument

It may be necessary to add inflectional properties to the component to uniquely determine the actual form of the lexical entry. This inflectional information can be attached to the component as shown in the following example for the Spanish term 'comunidad autónoma' (federal state), whose second word is the singular feminine form autónoma instead of the canonical form autónomo.


:comunidad_autonoma_lex a ontolex:LexicalEntry ;
  decomp:constituent :comunidad_component;
  decomp:constituent :autonoma_component .

:comunidad_component a decomp:Component;
     decomp:correspondsTo :comunidad_lex.

:autonoma_component a decomp:Component;
     decomp:correspondsTo :autonomo_lex;
     lexinfo:gender lexinfo:feminine;
     lexinfo:number lexinfo:singular.

Example decomp/example4



If we want to specify the order of the components, we can use the RDF properties rdf:_1, rdf:_2, etc. as in the following example to specify the absolute order, in addition to the constituent properties. Note that the property constituent alone is not sufficient to specify the order of components.



:comunidad_autonoma_lex a ontolex:LexicalEntry ;
  decomp:constituent :comunidad_component;
  rdf:_1             :comunidad_component; 
  decomp:constituent :autonoma_component;
  rdf:_2             :autonoma_component;
  ontolex:denotes <http://dbpedia.org/ontology/federalState>;
  ontolex:canonicalForm :comunidad_autonoma_lex_canonical_form.

:comunidad_autonoma_lex_canonical_form ontolex:writtenRep "comunidad autónoma"@es.

:comunidad_component a decomp:Component;
     decomp:correspondsTo :comunidad_lex.

:autonoma_component a decomp:Component;
     decomp:correspondsTo :autonomo_lex;
     lexinfo:gender lexinfo:feminine;
     lexinfo:number lexinfo:singular.
Example decomp/example5
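
The same ordering technique can be applied to the multiword expression of example3. The following is a minimal sketch rather than part of the specification's example set: the entry names :African, :Swine and :Fever follow example6, and the base namespace http://www.example.com/lexicon# is an assumption made only to keep the snippet self-contained.

```turtle
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix decomp:  <http://www.w3.org/ns/lemon/decomp#> .
@prefix :        <http://www.example.com/lexicon#> .   # hypothetical base namespace

# Each component is listed as a constituent and additionally given an absolute position.
:AfricanSwineFever a ontolex:MultiwordExpression ;
  decomp:constituent :African_comp , :Swine_comp , :Fever_comp ;
  rdf:_1 :African_comp ;
  rdf:_2 :Swine_comp ;
  rdf:_3 :Fever_comp .

# Each component points to the lexical entry it realizes.
:African_comp a decomp:Component ;
  decomp:correspondsTo :African .

:Swine_comp a decomp:Component ;
  decomp:correspondsTo :Swine .

:Fever_comp a decomp:Component ;
  decomp:correspondsTo :Fever .
```

Stating both decomp:constituent and rdf:_n keeps the membership information available to consumers that do not interpret the ordering properties.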


Phrase structure

The constituent property can also be used to specify the structure of a phrase, by means of showing some components as being constituted of further components. In this way, each of the components represents a node in the phrase structure tree and may be annotated with a phrase tag as in the following example:


:AfricanSwineFever_root a decomp:Component ;
  decomp:correspondsTo :AfricanSwineFever ;
  decomp:constituent :African_node, :SwineFever_node ;
  rdf:_1 :African_node;
  rdf:_2 :SwineFever_node;
  olia:hasTag penn:NP .

:African_node a decomp:Component ;
  decomp:correspondsTo :African ;
  olia:hasTag penn:JJ .

:SwineFever_node a decomp:Component ;
  decomp:constituent :Swine_node, :Fever_node ;
  rdf:_1 :Swine_node;
  rdf:_2 :Fever_node;
  olia:hasTag penn:NP .

:Swine_node a decomp:Component ; 
  decomp:correspondsTo :Swine ;
  olia:hasTag penn:NN .

:Fever_node a decomp:Component ; 
  decomp:correspondsTo :Fever ;
  olia:hasTag penn:NN .
Example decomp/example6


The syntactic categories of the phrases are indicated using the property olia:hasTag from the OLiA vocabulary using the Penn TreeBank tagset.

The following example shows how to use the synsem module in conjunction with the decomp module to indicate the phrase structure tree of a frame. This is done by making the frame the target of the correspondsTo property and including components in the tree that correspond to individual arguments. In this way, lexicalized grammars can be represented within the lexicon.


:know a ontolex:Word ;
  synsem:synBehavior :know_frame .

:know_frame a synsem:Frame ;
  lexinfo:subject :subject ;
  lexinfo:directObject :directObject .

:know_root a decomp:Component ;
  decomp:correspondsTo :know_frame ;
  decomp:constituent :X_node, :knowY_node ;
  olia:hasTag penn:S .

:X_node a decomp:Component ;
  decomp:correspondsTo :subject ;
  olia:hasTag penn:NP .

:knowY_node a decomp:Component ;
  decomp:constituent :know_node, :Y_node ;
  olia:hasTag penn:VP .

:know_node a decomp:Component ;
  decomp:correspondsTo :know ;
  olia:hasTag penn:V .

:Y_node a decomp:Component ;
  decomp:correspondsTo :directObject ;
  olia:hasTag penn:NP .
Example decomp/example7


Variation & Translation (vartrans)

The variation and translation module introduces vocabulary needed to represent relations between lexical entries and lexical senses that are variants of each other. The following diagram provides an overview of the vocabulary introduced by the module:

Lexico-Semantic Relations

The model defines a generic class, lexico-semantic relation, that relates two lexical entries or two lexical senses to each other. In the simplest case, such a relation is expressed by means of two properties, lexicalRel and senseRel, which directly link two related lexical entries or two related lexical senses.

ObjectProperty: lexicalRel


URI: http://www.w3.org/ns/lemon/vartrans#lexicalRel

The lexicalRel property relates two lexical entries that stand in some lexical relation.


Domain: ontolex:LexicalEntry

Range: ontolex:LexicalEntry


ObjectProperty: senseRel


URI: http://www.w3.org/ns/lemon/vartrans#senseRel

The senseRel property relates two lexical senses that stand in some sense relation.


Domain: ontolex:LexicalSense

Range: ontolex:LexicalSense

In general, these properties should not be used directly but instead a sub-property should be introduced, for example:


:fao lexinfo:initialismFor :food_and_agriculture_organization.

:surrogate_mother lexinfo:hypernym :mother.

lexinfo:initialismFor rdfs:subPropertyOf vartrans:lexicalRel.
lexinfo:hypernym rdfs:subPropertyOf vartrans:senseRel.
    
Example vartrans/example3


In the case that further information about the relationship needs to be represented it is possible to create an individual that 'reifies' the relationship.

Class: Lexico-Semantic Relation


URI: http://www.w3.org/ns/lemon/vartrans#LexicoSemanticRelation

A lexico-semantic relation represents the relation between two lexical entries or lexical senses that are related by some lexical or semantic relationship.


subClassOf: relates exactly 2 (ontolex:LexicalEntry OR ontolex:LexicalSense)

The object property relates links a lexico-semantic relation to the lexical entries or lexical senses between which it establishes the relation:

ObjectProperty: relates


URI: http://www.w3.org/ns/lemon/vartrans#relates

The relates property links a lexico-semantic relation to the two lexical entries or lexical senses between which it establishes the relation.


Domain: LexicoSemanticRelation

Range: ontolex:LexicalEntry OR ontolex:LexicalSense

Because many lexico-semantic relations are asymmetric, it is necessary to distinguish the source from the target:

ObjectProperty: source


URI: http://www.w3.org/ns/lemon/vartrans#source

The source property indicates the lexical sense or lexical entry involved in a lexico-semantic relation as a 'source'.


SubPropertyOf: relates


ObjectProperty: target


URI: http://www.w3.org/ns/lemon/vartrans#target

The target property indicates the lexical sense or lexical entry involved in a lexico-semantic relation as a 'target'.


SubPropertyOf: relates

The class lexico-semantic relation is specialized into the following two subclasses: lexical relation and sense relation, which relate two lexical entries or two lexical senses, respectively:

Class: Lexical Relation


URI: http://www.w3.org/ns/lemon/vartrans#LexicalRelation

A lexical relation is a lexico-semantic relation that represents the relation between two lexical entries the surface forms of which are related grammatically, stylistically or by some operation motivated by linguistic economy.


subClassOf: LexicoSemanticRelation, relates exactly 2 ontolex:LexicalEntry

By lexical relations we understand those relations at the surface forms, mainly motivated by grammatical requirements, style (Wortklang), and linguistic economy (helping to avoid excessive denominative repetition and improving textual coherence). Examples of lexical relations are the following:

  • Derivational relation (e.g., adjective → adverb variation: quick vs. quickly)
  • Morphosyntactic relation (e.g. ecological tourism vs. eco-tourism)
  • Abbreviation relation (including acronyms, among others; e.g., peer to peer and p2p; WYSIWYG, FAO, UNO, etc.)

The specific type of lexical or sense relation can be specified via the object property category, which is defined as follows:

ObjectProperty: Category


URI: http://www.w3.org/ns/lemon/vartrans#category

The category property indicates the specific type of relation by which two lexical entries or two lexical senses are related.


Domain: LexicoSemanticRelation

Characteristics: Functional

The following example shows how to model the relation between "Food and Agriculture Organization" and its initialism "FAO" as one example of a lexical relation:


:fao a ontolex:LexicalEntry ;
     ontolex:sense :fao_sense; 
     ontolex:lexicalForm :fao_form.

:fao_sense ontolex:reference <http://dbpedia.org/resource/Food_and_Agriculture_Organization> .

:food_and_agriculture_organization a ontolex:LexicalEntry;
     ontolex:sense :food_and_agriculture_organization_sense ;
     ontolex:lexicalForm :food_and_agriculture_organization_form.

:food_and_agriculture_organization_sense ontolex:reference <http://dbpedia.org/resource/Food_and_Agriculture_Organization> .

:fao_form ontolex:writtenRep "FAO"@en .
:food_and_agriculture_organization_form ontolex:writtenRep "Food and Agriculture Organization"@en .

:fao_initialism a vartrans:LexicalRelation ;
      vartrans:source :food_and_agriculture_organization ; 
      vartrans:target :fao ;
      vartrans:category :initialism.
Example vartrans/example1


Class: Sense Relation


URI: http://www.w3.org/ns/lemon/vartrans#SenseRelation

A sense relation is a lexico-semantic relation that represents the relation between two lexical senses the meanings of which are related.


subClassOf: LexicoSemanticRelation, relates exactly 2 ontolex:LexicalSense

Examples of semantic relations are the equivalence relation between two senses, hypernymy and hyponymy relations, synonymy, antonymy, translations, etc.

The following example shows a sense relation:


:surrogate_mother_lex a ontolex:LexicalEntry ;
     ontolex:sense :surrogate_mother_sense ;
     ontolex:canonicalForm :surrogate_mother_form.

:surrogate_mother_sense ontolex:reference <http://dbpedia.org/ontology/surrogate_mother>.

:surrogate_mother_form ontolex:writtenRep "surrogate mother"@en .

:mother_lex a ontolex:LexicalEntry ;
     ontolex:sense :mother_sense ;
     ontolex:canonicalForm :mother_form.

:mother_sense ontolex:reference <http://dbpedia.org/ontology/mother>.

:mother_form ontolex:writtenRep "mother"@en .

:senseRelation a vartrans:SenseRelation ;
      vartrans:source :surrogate_mother_sense ;
      vartrans:target :mother_sense ; 
      vartrans:category :hypernym .
Example vartrans/example2


Further, we consider terminological relations, which are defined as follows:

Class: Terminological Relation


URI: http://www.w3.org/ns/lemon/vartrans#TerminologicalRelation

A terminological relation is a sense relation that relates two lexical senses of terms that are semantically related in the sense that they can be exchanged in most contexts, but their surface forms are not directly related. The variants vary along dimensions that are not captured by the given ontology and are intentionally (pragmatically) caused.


SubclassOf: SenseRelation

Examples of categories of terminological relations include:

  • Diatopic (dialectal or geographical variants) (e.g., gasoline vs. petrol)
  • Diaphasic (register) (e.g., headache vs. cephalalgia; swine flu vs. pig flu vs. H1N1 vs. Mexican pandemic flu)
  • Diachronic (or chronological variants) (e.g., tuberculosis vs. phthisis)
  • Diastratic (discursive or stylistic variants) (e.g., man vs. bloke)
  • Dimensional variants: the terms point to the same concept but highlight a different property or dimension of the concept (e.g., bio-sanitary waste vs. hospital waste; Novel Coronavirus vs. Middle East Respiratory Syndrome Coronavirus; obsolete technology vs. dangerous technology; madre de alquiler (rental mother) vs. vientre de alquiler (womb mother), in Spanish).

We illustrate the use of terminological relations with the following example of a diachronic variant:


:tuberculosis a ontolex:LexicalEntry ;
       ontolex:lexicalForm :tuberculosis_form ; 
       ontolex:sense :tuberculosis_sense.

:tuberculosis_form ontolex:writtenRep "tuberculosis"@en .

:tuberculosis_sense ontolex:reference <http://dbpedia.org/resource/Tuberculosis>.

:phthisis a ontolex:LexicalEntry ;
       ontolex:lexicalForm :phthisis_form ; 
       ontolex:sense :phthisis_sense.

:phthisis_form ontolex:writtenRep "phthisis"@en .

:phthisis_sense ontolex:reference <http://dbpedia.org/resource/Tuberculosis>;
                dct:subject <http://dbpedia.org/resource/Medicine> .

:phthisis_diachronic_relation a vartrans:TerminologicalRelation ;
      vartrans:source :phthisis_sense ;
      vartrans:target :tuberculosis_sense ; 
      vartrans:category :diachronic.
Example vartrans/example4


Translation

Translation relates two lexical entries from different languages the meaning of which is 'equivalent'. This 'equivalence' can be expressed at three different levels:

  • Ontological Equivalence (Shared reference): The simplest case is to have two entries in different languages that denote the same ontology entity. In this case they are clearly translations as they have the same interpretation.
  • Translation: In these cases, the lexical entries might not denote exactly the same concept, but their lexical meanings (senses) are equivalent in that they can be exchanged for each other in most contexts. Translation in this case is a subclass of sense relation.
  • Translatable as: In this case, we underspecify the exact involved meanings of the two lexical entries that are said to be translations of each other, essentially specifying that, in some contexts, one lexical entry in a source language can be replaced by an entry in the target language, depending on the senses of these lexical entries in the given context.

Translation as shared reference

In order to express that the lexical senses of two lexical entries are ontologically equivalent, we do not need other machinery than the one introduced already above:


:surrogate_mother a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
      ontolex:sense :surrogate_mother_sense.

:surrogate_mother_sense ontolex:reference ontology:SurrogateMother.

:madre_de_alquiler a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> ;
      ontolex:sense :madre_de_alquiler_sense.

:madre_de_alquiler_sense ontolex:reference ontology:SurrogateMother.

:leihmutter a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
      ontolex:sense :leihmutter_sense.

:leihmutter_sense ontolex:reference ontology:SurrogateMother.
Example vartrans/example5


By this, the corresponding senses of the lexical entries surrogate mother, madre de alquiler and Leihmutter are said to be equivalent in that they denote the same class in the ontology.

Translation as a relation between lexical senses

The second alternative mentioned above can be realized through the class translation, which relates two senses that can be regarded as equivalent in that they can be exchanged for each other.

Class: Translation


URI: http://www.w3.org/ns/lemon/vartrans#Translation

A translation is a sense relation expressing that two lexical senses corresponding to two lexical entries in different languages can be translated to each other without any major meaning shifts.


subClassOf: SenseRelation


:zip_code a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
      ontolex:sense :zip_code_sense.

:zip_code_sense ontolex:reference <http://dbpedia.org/ontology/zipCode>.

:postleitzahl a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
      ontolex:sense :postleitzahl_sense.

:postleitzahl_sense ontolex:reference <http://de.dbpedia.org/resource/Postleitzahl>.


:trans a vartrans:Translation;
       vartrans:source :zip_code_sense;
       vartrans:target :postleitzahl_sense;
       vartrans:category <http://purl.org/net/translation-categories#directEquivalent>.
Example vartrans/example6


Thus, in spite of having different denotations, both Postleitzahl and zip code can be seen as cross-lingual equivalents and thus as translations of each other.

Besides the class Translation, which reifies the translation relation between two lexical senses, the model also provides a shortcut: the property translation directly expresses the translation relation between two lexical senses and is regarded as equivalent to the reification:

ObjectProperty: translation


URI: http://www.w3.org/ns/lemon/vartrans#translation

The translation property relates two lexical senses of two lexical entries that stand in a translation relation to one another.


subPropertyOf: senseRel


With the translation property, the above example can be replaced with:


:zip_code a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
      ontolex:sense :zip_code_sense.

:zip_code_sense ontolex:reference <http://dbpedia.org/ontology/zipCode>.

:postleitzahl a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
      ontolex:sense :postleitzahl_sense.

:postleitzahl_sense ontolex:reference <http://de.dbpedia.org/resource/Postleitzahl>.

:zip_code_sense vartrans:translation :postleitzahl_sense.
Example vartrans/example7


Translatable As

The third option foreseen in the vartrans model is one where we say that a lexical entry can be translated into some other entry in some contexts, underspecifying the exact lexical senses involved and the exact contextual conditions under which this translation is valid. For this, the model introduces the property translatableAs:

ObjectProperty: translatableAs


URI: http://www.w3.org/ns/lemon/vartrans#translatableAs

The translatableAs property relates a lexical entry in some language to a lexical entry in another language that it can be translated as depending on the particular context and specific senses of the involved lexical entries.


Domain: ontolex:LexicalEntry

Range: ontolex:LexicalEntry

Characteristics: Symmetric

Subproperty of: isSenseOf o translation o sense

The following example shows how to use the relation translatableAs to specify that corner (which can mean street intersection or intersection of two inside walls) can be translated as the Spanish rincón (intersection of two inside walls) or esquina (street intersection), depending on the particular sense involved.


:corner a ontolex:LexicalEntry;
      dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> .
 
:rincón a ontolex:LexicalEntry;
       dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> .

:esquina a ontolex:LexicalEntry;
       dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> .

:corner vartrans:translatableAs :rincón.
:corner vartrans:translatableAs :esquina.

Example vartrans/example8
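
Where the senses involved are known, the same information can be stated more precisely with the translation property introduced in the previous section. The sketch below is illustrative only: the sense identifiers (:corner_street_sense, :corner_wall_sense, :esquina_sense, :rincón_sense) and the base namespace are hypothetical names that do not appear in the specification.

```turtle
@prefix ontolex:  <http://www.w3.org/ns/lemon/ontolex#> .
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix :         <http://www.example.com/lexicon#> .   # hypothetical base namespace

# The English entry has two senses (hypothetical identifiers).
:corner a ontolex:LexicalEntry ;
  ontolex:sense :corner_street_sense , :corner_wall_sense .

:esquina a ontolex:LexicalEntry ;
  ontolex:sense :esquina_sense .

:rincón a ontolex:LexicalEntry ;
  ontolex:sense :rincón_sense .

# Sense-level statements make explicit which sense is translated by which entry.
:corner_street_sense vartrans:translation :esquina_sense .
:corner_wall_sense   vartrans:translation :rincón_sense .
```

The entry-level translatableAs statements remain useful when such a sense inventory is not available.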


We can group translations into a set by using the class translation set:

Class: Translation Set


URI: http://www.w3.org/ns/lemon/vartrans#TranslationSet

A translation set is a set of translations that have some common source.



In order to relate a translation set to one of the translations contained in it, the model defines a property trans:

ObjectProperty: trans


URI: http://www.w3.org/ns/lemon/vartrans#trans

The trans property relates a TranslationSet to one of its translations.


Domain: vartrans:TranslationSet

Range: vartrans:Translation
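
As a minimal sketch of how TranslationSet and trans fit together, the following groups two reified translations that share the same source sense. The names :zip_code_sense and :postleitzahl_sense follow example6; :zip_code_translations, :trans_de, :trans_es, :codigo_postal_sense and the base namespace are hypothetical and not part of the specification.

```turtle
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix :         <http://www.example.com/lexicon#> .   # hypothetical base namespace

# The set points to each of its member translations via vartrans:trans.
:zip_code_translations a vartrans:TranslationSet ;
  vartrans:trans :trans_de , :trans_es .

:trans_de a vartrans:Translation ;
  vartrans:source :zip_code_sense ;
  vartrans:target :postleitzahl_sense .

# Hypothetical Spanish counterpart, added only to give the set a second member.
:trans_es a vartrans:Translation ;
  vartrans:source :zip_code_sense ;
  vartrans:target :codigo_postal_sense .
```
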

Metadata (lime)

The LInguistic MEtadata (lime) module allows for describing metadata at the level of the lexicon-ontology interface. This module is intended to complement existing metadata schemas such as Dublin Core, the PROV ontology, DCAT or VoID, as lime provides a profile to describe metadata as related to the lexicon-ontology interface.

Following the conceptual model of the lexicon-ontology interface, lime distinguishes three main metadata entities:

  1. the reference dataset (describing the semantics of the domain, e.g., the ontology),
  2. the lexicon (being a collection of lexical entries),
  3. the concept set (an optional set of lexical concepts, providing a conceptual backbone to a lexicon)

Note: the reference dataset here is not limited to OWL vocabularies, but includes any RDF dataset which contains references to objects of a domain of discourse.

As a metadata vocabulary, lime focuses on summarizing quantitative and qualitative information about these entities and the relations among them.

Metadata is attached in particular to three types of sets that lime distinguishes:

  1. the set of lexicalizations, containing the bindings between logical predicates in the ontology and lexical entries in the lexicon
  2. the set of conceptualizations, containing the bindings between lexical concepts in the concept set and entries in the lexicon
  3. the set of lexical links, linking lexical concepts from a concept set to references in an ontology

In the following sections, we provide detailed descriptions for the lime vocabulary to describe metadata for the lexicon as a whole as well as for the three types of sets described above. Metadata about ontologies (and domain datasets as well) and lexical concept sets can be provided by means of the already mentioned existing metadata vocabularies.

Lexicon and Lexicon Metadata

The main metadata-bearing entity in lemon is a lexicon object that represents a collection of lexical entries for a particular language. A small example lexicon consisting of four lexical entries for cat, marry, high and intangible assets would look as follows:


:lexicon a lime:Lexicon;
   lime:language "en";
   lime:entry :lex_high;
   lime:entry :lex_cat;
   lime:entry :lex_marry;
   lime:entry :lex_intangible_assets.
Example lime/example1


A lexicon is expected to consist of at least one lexical entry and is defined as a subclass of void:Dataset:

Class: Lexicon


URI: http://www.w3.org/ns/lemon/lime#Lexicon

A lexicon represents a collection of lexical entries for a particular language or domain.


SubClassOf: entry min 1 ontolex:LexicalEntry, language exactly 1 rdfs:Literal, void:Dataset

The property linking a lexicon to a lexical entry is the property entry:

ObjectProperty: Entry


URI: http://www.w3.org/ns/lemon/lime#entry

The entry property relates a lexicon to one of the lexical entries contained in it.


Domain: Lexicon

Range: ontolex:LexicalEntry

The language property can be stated on either a lexicon or a lexical entry, and its value should be a literal representing the language. Note that all entries in the same lexicon should be in the same language, and that the language of the lexicon and of its entries should be consistent with the language tags used on all forms.

DatatypeProperty: Language


URI: http://www.w3.org/ns/lemon/lime#language

The language property indicates the language of a lexicon, a lexical entry, a concept set or a lexicalization set.


Domain: Lexicon or ontolex:LexicalEntry or ConceptSet or LexicalizationSet

Range: rdfs:Literal

Beyond using the lime:language property, which has a literal as its range, it is recommended to use the Dublin Core language property with reference to either Lexvo.org or the Library of Congress vocabulary.

The property lexical entries indicates the number of lexical entries contained in a lexicon. The property is also used for lexicalization and conceptualization sets, indicating in this case the number of lexical entries involved in these sets.

DatatypeProperty: Lexical Entries


URI: http://www.w3.org/ns/lemon/lime#lexicalEntries

The lexical entries property indicates the number of distinct lexical entries contained in a lexicon, lexicalization set or conceptualization set.


Domain: Lexicon or LexicalizationSet or ConceptualizationSet

Range: xsd:integer

The model also allows specifying the linguistic (annotation) model used to describe characteristics of lexical entries via the linguisticCatalog property:

ObjectProperty: Linguistic Catalog


URI: http://www.w3.org/ns/lemon/lime#linguisticCatalog

The linguistic catalog property indicates the catalog of linguistic categories used in a lexicon to define linguistic properties of lexical entries.


Domain: Lexicon

SubPropertyOf: void:vocabulary

As an example we may describe a simple lexicon using the above introduced properties in addition to Dublin Core properties. The part-of-speech of the four lexical entries is indicated using the lexinfo vocabulary, so that the value of linguisticCatalog is set to http://www.lexinfo.net/ontologies/2.0/lexinfo. In the example, there is one (RDF) resource that represents both the lexicon itself and its metadata:


:lexicon a lime:Lexicon;
   lime:language "en";
   dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
   lime:lexicalEntries "4"^^xsd:integer;                                               
   lime:linguisticCatalog <http://www.lexinfo.net/ontologies/2.0/lexinfo> ;
   dct:description "This is an example lexicon"@en;                              
   dct:description "Questo è un lessico di esempio"@it;                          
   dct:creator <http://john.mccr.ae/>;                                           
   void:triples "29"^^xsd:integer ;                                                       
   lime:entry :lex_high;                                                     
   lime:entry :lex_cat;                                                      
   lime:entry :lex_marry;                                                    
   lime:entry :lex_intangible_assets.                                        

                                                                                
:lex_cat a ontolex:LexicalEntry, lexinfo:Noun;                                                
   ontolex:canonicalForm :form_cat.
:form_cat ontolex:writtenRep "cat"@en.                                          
                                                                                
:lex_marry a ontolex:LexicalEntry, lexinfo:Verb;                                              
   ontolex:canonicalForm :form_marry.                    
:form_marry ontolex:writtenRep "marry"@en .                                     
                                                                                
:lex_high a ontolex:LexicalEntry, lexinfo:Adjective;                                               
   ontolex:canonicalForm :form_high.                   
:form_high ontolex:writtenRep "high"@en .                                       
                                                                                
:lex_intangible_assets a ontolex:LexicalEntry, lexinfo:Noun;                               
  ontolex:canonicalForm :form_intangible_assets.               
:form_intangible_assets ontolex:writtenRep "intangible assets"@en.
Example lime/example2


Lexicalization Set

A lexicalization set is a void:Dataset that comprises a collection of so-called lexicalizations, which we understand as pairs of a lexical entry and an associated reference in the ontology.

Class: Lexicalization Set


URI: http://www.w3.org/ns/lemon/lime#LexicalizationSet

A lexicalization set is a dataset that comprises a collection of lexicalizations, that is pairs of lexical entry and corresponding reference in the associated ontology/vocabulary/dataset.


SubClassOf: void:Dataset, lexiconDataset max 1 Lexicon, referenceDataset exactly 1 void:Dataset, partition only LexicalizationSet, lexicalizationModel exactly 1

The lexicalization set is linked to the ontology and the lexicon by means of the properties reference dataset and lexicon dataset, respectively.

ObjectProperty: Reference Dataset


URI: http://www.w3.org/ns/lemon/lime#referenceDataset

The reference dataset property indicates the dataset that contains the domain objects or vocabulary elements that are either referenced by a given lexicon, providing the grounding vocabulary for the meaning of the lexical entries, or linked to lexical concepts in a concept set by means of a lexical link set.


Domain: LexicalizationSet or LexicalLinkset

Range: void:Dataset


ObjectProperty: Lexicon Dataset


URI: http://www.w3.org/ns/lemon/lime#lexiconDataset

The lexicon dataset property indicates the lexicon that contains the entries referred to in a lexicalization set or a conceptualization set.


Domain: LexicalizationSet or ConceptualizationSet

Range: Lexicon

The optionality of the lexicon dataset property is required to support other lexicalization models (e.g. RDFS, SKOS, SKOS-XL) that do not introduce a separate notion of lexicon, since lexical entries only exist implicitly being part of a lexicalization. The property lexicalization model indicates the specific lexicalization model used.

ObjectProperty: Lexicalization Model


URI: http://www.w3.org/ns/lemon/lime#lexicalizationModel

The lexicalization model property indicates the model used for representing lexical information. Possible values include (but are not limited to) http://www.w3.org/2000/01/rdf-schema# (for the use of rdfs:label), http://www.w3.org/2004/02/skos/core (for the use of skos:pref/alt/hiddenLabel), http://www.w3.org/2008/05/skos-xl (for the use of skosxl:pref/alt/hiddenLabel) and http://www.w3.org/ns/lemon/all for lemon.


Domain: LexicalizationSet

Range: rdfs:Resource

SubPropertyOf: void:vocabulary
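
To illustrate the optionality of the lexicon dataset, the following sketch describes a lexicalization set whose labels are given in SKOS. It is a hypothetical example: the resource :SkosLexicalization, the base namespace and the dataset URI http://www.example.com/thesaurus are assumptions and do not appear in the specification. No lexiconDataset is stated, since SKOS does not introduce a separate lexicon.

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix :     <http://www.example.com/metadata#> .   # hypothetical base namespace

# A lexicalization set that relies on skos:prefLabel / altLabel / hiddenLabel.
:SkosLexicalization a lime:LexicalizationSet ;
  lime:language "en" ;
  lime:lexicalizationModel <http://www.w3.org/2004/02/skos/core> ;
  lime:referenceDataset <http://www.example.com/thesaurus> .   # hypothetical dataset
```
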

The model defines the property references, which indicates the number of vocabulary elements lexicalized by at least one lexical entry. This number can be smaller than both the number of entities in the ontology (if some vocabulary elements are not lexicalized) and the number of lexical entries in the lexicon (if several lexical entries refer to the same ontology element).

DatatypeProperty: References


URI: http://www.w3.org/ns/lemon/lime#references

The references property indicates the number of distinct ontology or vocabulary elements that are either associated with lexical entries via a lexicalization set or linked to lexical concepts via a lexical link set.


Domain: LexicalizationSet or LexicalLinkset

Range: xsd:integer

In the following example, we describe a lexicalization set expressing how elements of an ontology can be verbalized in Japanese by means of entries from a supplied lexicon. The metadata clearly tells which ontology and lexicon are involved in the lexicalization set, that is http://www.example.com/ontology and http://www.example.com/lexicon, respectively, as well as the relevant natural language. The knowledge of these facts about a lexicalization set allows us to assess its usefulness for a given task as well as to discover relevant lexicalization sets, when we are constrained by the choice of an ontology, lexicon or natural language.

The ontology is modelled as an instance of the class voaf:Vocabulary that is a kind of void:Dataset representing vocabularies (both RDFS Schemas and OWL Ontologies). We benefit from the more specific distinctions made by VOAF, by breaking down the total number of entities in the ontology (held by the property void:entities) into separate counts for the classes and properties (held by voaf:classNumber and voaf:propertyNumber, respectively).

Similarly, terms from the lime vocabulary are used to represent statistics about the linguistic content of the lexicon and the lexicalization set. Overall, the ontology defines 100 entities and the lexicon 80 lexical entries; however, only 20 entities from the target ontology have been associated with a total of 50 lexical entries. In this sense, only 20 references from the ontology have been actually lexicalized by linking them to a lexical entry.

When counting the entities in the ontology or, in general, in the reference dataset, we recommend ignoring the resources describing the ontology itself (that is, any instance of the class owl:Ontology) as well as other metadata entities.


:Lexicalization a lime:LexicalizationSet ;
  lime:language "ja";
  dct:language  <http://id.loc.gov/vocabulary/iso639-1/ja>, <http://lexvo.org/id/iso639-3/jpn> ;
  lime:lexicalizationModel <http://www.w3.org/ns/lemon/all> ;
  lime:referenceDataset <http://www.example.com/ontology> ;
  lime:lexiconDataset <http://www.example.com/lexicon> ;
  lime:references 20 ;
  lime:lexicalEntries 50 .

<http://www.example.com/ontology> a owl:Ontology, voaf:Vocabulary, void:Dataset ;
  void:entities 100 ;
  voaf:classNumber 60 ;
  voaf:propertyNumber 40 .

<http://www.example.com/lexicon> a ontolex:Lexicon ;
  ontolex:language "ja" ;
  dct:language  <http://id.loc.gov/vocabulary/iso639-1/ja>, <http://lexvo.org/id/iso639-3/jpn> ;
  lime:lexicalEntries 80 .
Example lime/example3


A lexicalization set comprises a set of pairs of a lexical entry and the corresponding reference that the lexical entry denotes. These pairs are expressed differently depending on the lexical model adopted:

  • In the lexicon model for ontologies (lemon), the pairs are indicated by relating a lexical entry to a reference through the denotes property or via the property chain sense ∘ reference.
  • In RDFS, a lexicalization is expressed via the property rdfs:label.
  • In SKOS(-XL), a lexicalization is expressed via the skos(-xl):{pref,alt,hidden}Label properties.
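
To make the three options above concrete, the following sketch expresses the same lexicalization (the word "cat" for a class Cat) in each lexical model. The resource names and the two example namespaces are hypothetical and only serve as illustration; in practice a dataset would normally use just one of these models.

```turtle
@prefix ontolex:  <http://www.w3.org/ns/lemon/ontolex#> .
@prefix rdfs:     <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:     <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl:   <http://www.w3.org/2008/05/skos-xl#> .
# Hypothetical namespaces, assumed for this illustration only
@prefix ontology: <http://www.example.com/ontology#> .
@prefix :         <http://www.example.com/lexicon#> .

# lemon, direct link: the entry denotes the ontology element
:cat_entry a ontolex:LexicalEntry ;
  ontolex:denotes ontology:Cat .

# lemon, same pair expressed via the chain sense followed by reference
:cat_entry ontolex:sense :cat_sense .
:cat_sense a ontolex:LexicalSense ;
  ontolex:reference ontology:Cat .

# RDFS: a plain label on the ontology element
ontology:Cat rdfs:label "cat"@en .

# SKOS: a preferred label
ontology:Cat skos:prefLabel "cat"@en .

# SKOS-XL: a reified label
ontology:Cat skosxl:prefLabel :cat_label .
:cat_label a skosxl:Label ;
  skosxl:literalForm "cat"@en .
```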

In addition to specifying the number of entities in the ontology lexicalized, it is also possible to give the total number of lexicalizations, that is the total connections between lexical entries and references. This number should in most cases be the same as the total number of lexical senses defined in the lexicon. The value may be given by the absolute number of lexicalizations:

DatatypeProperty: Lexicalizations


URI: http://www.w3.org/ns/lemon/lime#lexicalizations

The lexicalizations property indicates the total number of lexicalizations in a lexicalization set, that is the number of unique pairs of lexical entry and denoted ontology element.


Domain: LexicalizationSet

Range: xsd:integer

In addition to, or instead of, the absolute number of lexicalizations, the model also supports indicating the average number of lexicalizations per ontology element:

DatatypeProperty: Average Number of Lexicalizations


URI: http://www.w3.org/ns/lemon/lime#avgNumOfLexicalizations

The average number of lexicalizations property indicates the average number of lexicalizations per ontology element.


Domain: LexicalizationSet

Range: xsd:decimal

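The average number of lexicalizations is calculated by dividing the total number of lexicalizations by the total number of entities in the reference dataset:

   avgNumOfLexicalizations = lexicalizations / entities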

The following example describes an ontology consisting of 30 ontology elements. The corresponding lexicalization set contains 20 lexicalizations involving 15 lexical entries (so some entries have multiple meanings in the ontology). On average, for each element in the ontology there are thus 20/30 ≈ 0.67 lexicalizations.


:Lexicalization a lime:LexicalizationSet ;
  lime:lexicalizations 20 ;
  lime:references 20 ;
  lime:lexicalEntries 15 ;
  lime:avgNumOfLexicalizations 0.67 ;
  lime:referenceDataset <http://www.example.com/ontology> ;
  lime:lexiconDataset <http://www.example.com/lexicon> .

<http://www.example.com/ontology> a owl:Ontology, void:Dataset ;
  void:entities 30 .
Example lime/example4


Finally, the percentage property may be used to express the percentage of entities in an ontology which are lexicalized, formally:

DatatypeProperty: Percentage


URI: http://www.w3.org/ns/lemon/lime#percentage

The percentage property expresses the percentage of entities in the reference dataset which have at least one lexicalization in a lexicalization set or are linked to a lexical concept in a lexical linkset.


Domain: LexicalizationSet or LexicalLinkset

Range: xsd:decimal
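
As a minimal sketch of how this property might be used, the following metadata describes a lexicalization set in which 20 of the 100 entities of the reference dataset are lexicalized. The value is written here as a ratio between 0 and 1 (20/100 = 0.2); this is an assumption based on the coverage ratio given in the formal definition of properties, not something the property definition above states explicitly.

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
# Hypothetical namespace for the metadata resources
@prefix :     <http://www.example.com/metadata#> .

:Lexicalization a lime:LexicalizationSet ;
  lime:referenceDataset <http://www.example.com/ontology> ;
  lime:lexiconDataset <http://www.example.com/lexicon> ;
  lime:references 20 ;
  # assumed encoding: references / entities = 20 / 100
  lime:percentage 0.2 .

<http://www.example.com/ontology> a owl:Ontology, void:Dataset ;
  void:entities 100 .
```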

Partitions

In many cases, we want to provide descriptive metadata about a subset of a lexicalization set, that is for the subset representing all the lexicalizations for a certain type of ontology entity (class, property, etc.). To logically partition a lexicalization set, the lime module introduces the property partition:

ObjectProperty: Partition


URI: http://www.w3.org/ns/lemon/lime#partition

The partition property relates a lexicalization set or lexical linkset to a logical subset that contains lexicalizations for a given ontological type only.


Domain: LexicalizationSet or LexicalLinkset

Range: LexicalizationSet or LexicalLinkset

SubPropertyOf: void:subset


ObjectProperty: Resource Type


URI: http://www.w3.org/ns/lemon/lime#resourceType

The resource type property indicates the type of ontological entity of a lexicalization set or lexical linkset.


Domain: LexicalizationSet or LexicalLinkset

Range: rdfs:Class

Characteristics: Functional

For example, we may restrict our metadata to the logical partition of lexicalizations that denote an element in the extension of a particular class:


:Lexicalization a lime:LexicalizationSet ;
  lime:partition :CountryPartition ;
  lime:references 2000 .

:CountryPartition
  lime:resourceType ontology:Country ;
  lime:references 50 .
Example lime/example5


In addition, it is possible to give RDF(S) or OWL types as the target of the resource type property. This allows us to state the number of classes that are lexicalized by at least one lexical entry:


:Lexicalization a lime:LexicalizationSet ;
  lime:partition :ClassPartition .

:ClassPartition
  lime:resourceType owl:Class ;
  lime:references 50 .
Example lime/example6


Lexical Linkset

Lexical linksets are similar in many ways to the lexicalization sets described above, but they connect a concept set, rather than a lexicon, to an ontology. Their primary purpose is to describe the linking of a concept set, such as the synsets in a wordnet, to an ontology.

Class: Lexical Linkset


URI: http://www.w3.org/ns/lemon/lime#LexicalLinkset

A lexical linkset represents a collection of links between a reference dataset and a set of lexical concepts (e.g. synsets of a wordnet).


SubClassOf: void:Linkset, conceptualDataset exactly 1 ontolex:ConceptSet, referenceDataset exactly 1 void:Dataset, partition only LexicalLinkset

The lexical linkset is linked to a concept set by means of the conceptual dataset property:

ObjectProperty: Conceptual Dataset


URI: http://www.w3.org/ns/lemon/lime#conceptualDataset

The conceptual dataset property relates a lexical linkset or a conceptualization set to a corresponding concept set.


Domain: LexicalLinkset or ConceptualizationSet

Range: ontolex:ConceptSet

There are several properties that are analogous to properties defined for a lexicalization set. For example, the concepts property indicates the number of concepts in a concept set:

DatatypeProperty: Concepts


URI: http://www.w3.org/ns/lemon/lime#concepts

The concepts property indicates the number of lexical concepts defined in a concept set or involved in either a LexicalLinkset or ConceptualizationSet.


Domain: ontolex:ConceptSet or LexicalLinkset or ConceptualizationSet

Range: xsd:integer

Similarly, the links and avgNumOfLinks properties are analogous to the properties lexicalizations and avgNumOfLexicalizations.

DatatypeProperty: Links


URI: http://www.w3.org/ns/lemon/lime#links

The links property indicates the number of links between concepts in the concept set and entities in the reference dataset.


Domain: LexicalLinkset

Range: xsd:integer


DatatypeProperty: Average Number of Links


URI: http://www.w3.org/ns/lemon/lime#avgNumOfLinks

The average number of links property indicates the average number of links to lexical concepts for each ontology element in the reference dataset.


Domain: LexicalLinkset

Range: xsd:decimal


Finally, we note that the references, percentage and partition properties apply to the lexical linkset in the same way as to the lexicalization set.
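
The following sketch shows how the properties of this section might be combined to describe a lexical linkset. All resource names and figures are hypothetical. It assumes that avgNumOfLinks is computed per entity of the reference dataset (80/100 = 0.8), as its definition above suggests, and that percentage is given as a ratio (40/100 = 0.4).

```turtle
@prefix lime:    <http://www.w3.org/ns/lemon/lime#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
# Hypothetical namespace for the metadata resources
@prefix :        <http://www.example.com/metadata#> .

:Linkset a lime:LexicalLinkset ;
  lime:referenceDataset <http://www.example.com/ontology> ;
  lime:conceptualDataset :ConceptSet ;
  lime:references 40 ;     # ontology entities linked to at least one concept
  lime:concepts 60 ;       # lexical concepts involved in at least one link
  lime:links 80 ;          # total number of (entity, concept) pairs
  lime:avgNumOfLinks 0.8 ; # links / entities = 80 / 100
  lime:percentage 0.4 .    # assumed ratio: references / entities = 40 / 100

<http://www.example.com/ontology> a owl:Ontology, void:Dataset ;
  void:entities 100 .

:ConceptSet a ontolex:ConceptSet ;
  lime:concepts 500 .      # all concepts defined in the concept set
```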

Conceptualization Set

A conceptualization set is analogous to a lexicalization set, but associates a concept set with a lexicon and consists of conceptualizations, that is pairs formed by a single lexical entry and its associated lexical concept.

Class: Conceptualization Set


URI: http://www.w3.org/ns/lemon/lime#ConceptualizationSet

A conceptualization set represents a collection of links between lexical entries in a lexicon and lexical concepts in a concept set they evoke.


SubClassOf: void:Dataset, lexiconDataset exactly 1 Lexicon, conceptualDataset exactly 1 ontolex:ConceptSet

A number of properties already described for other metadata entities can also be used in the description of a conceptualization set:

  • lime:lexiconDataset and lime:conceptualDataset: indicating the lexicon dataset and the conceptual dataset, respectively
  • lexicalEntries: indicating the number of distinct lexical entries
  • concepts: indicating the number of distinct lexical concepts

Additional properties have been defined specifically to characterize a given set of conceptualizations:

DatatypeProperty: Conceptualizations


URI: http://www.w3.org/ns/lemon/lime#conceptualizations

The conceptualizations property indicates the number of distinct conceptualizations in a conceptualization set.


Domain: ConceptualizationSet

Range: xsd:integer


DatatypeProperty: Average Ambiguity


URI: http://www.w3.org/ns/lemon/lime#avgAmbiguity

The average ambiguity property indicates the average number of lexical concepts evoked by each lemma/canonical form in the lexicon.


Domain: ConceptualizationSet

Range: xsd:decimal


DatatypeProperty: Average Synonymy


URI: http://www.w3.org/ns/lemon/lime#avgSynonymy

The average synonymy property indicates the average number of lexical entries evoking each lexical concept in the concept set.


Domain: ConceptualizationSet

Range: xsd:decimal

The following example shows how to describe the metadata of a version of WordNet 3.0 transformed into RDF. The example illustrates how to describe the main components of the resource (a lexicon, a concept set and a conceptualization set relating them). The transformation to RDF is based on a straightforward mapping between the WordNet meta-model and the ontolex model, in which synsets are represented as lexical concepts, words as lexical entries and word senses as lexical senses.

With this mapping in mind, it should be clear how some of the statistics about WordNet 3.0 can be specified by means of the vocabulary introduced by the lime module:


:WnConceptualizationSet a lime:ConceptualizationSet ;
  lime:conceptualDataset :WnConceptSet ;
  lime:lexiconDataset :WnLexicon ;
  lime:lexicalEntries "155287"^^xsd:integer ;
  lime:concepts "117659"^^xsd:integer ;
  lime:conceptualizations "206941"^^xsd:integer ;
  lime:avgAmbiguity "1.33"^^xsd:decimal ;
  lime:avgSynonymy "1.76"^^xsd:decimal
  .

:WnConceptSet a ontolex:ConceptSet ;
  lime:concepts "117659"^^xsd:integer .

:WnLexicon a ontolex:Lexicon ;
  lime:lexicalEntries "155287"^^xsd:integer .
Example lime/example7


Formal definition of properties

The lime module essentially provides vocabulary to describe the relation between three sets:

  • L: the set of lexical entries
  • O: the set of ontology elements
  • C: the set of concepts

The model considers binary relations over these sets as follows:

  • R_lex ⊆ O × L: the set of lexicalizations, that is the set of pairs (o,l), with o ∈ O, l ∈ L
  • R_con ⊆ L × C: the set of conceptualizations, that is the set of pairs (l,c), with l ∈ L, c ∈ C
  • R_link ⊆ O × C: the set of links between ontology references and concepts, that is the set of pairs (o,c), with o ∈ O, c ∈ C

For each R_i, it holds that the relation is a subset of the Cartesian product of the involved sets, i.e. R_i ⊆ A × B.

For each of these relations R_i ⊆ A × B, we define the following counts:

  • cardinality(R_i): the total number of pairs in the relation R_i = |R_i|
  • count(π_A(R_i)): the number of a's that occur in at least one pair in R_i = |{a ∈ A | ∃ b ∈ B . (a,b) ∈ R_i}|
  • count(π_B(R_i)): the number of b's that occur in at least one pair in R_i = |{b ∈ B | ∃ a ∈ A . (a,b) ∈ R_i}|

and ratios:

  • coverage_A(R_i): ratio between the elements in A that participate in at least one (a,b) pair, and the total number of elements in A = |{a ∈ A | ∃ b ∈ B . (a,b) ∈ R_i}| / |A|
  • average_A(R_i): average number of b's in B related with each a in A = |R_i| / |A|
  • average_B(R_i): average number of a's in A related with each b in B = |R_i| / |B|

The lime model does not introduce properties for all of the above counts and ratios for all three relations. The properties it does provide are listed in the following table:

Relation | Related Dataset | cardinality(R_i) | count(π_A(R_i)) | count(π_B(R_i)) | coverage_A(R_i) | average_A(R_i) | average_B(R_i)
R_lex ⊆ O × L | lime:LexicalizationSet | lime:lexicalizations | lime:references | lime:lexicalEntries | lime:percentage | lime:avgNumOfLexicalizations | N/A
R_con ⊆ L × C | lime:ConceptualizationSet | lime:conceptualizations | lime:lexicalEntries | lime:concepts | N/A | lime:avgAmbiguity | lime:avgSynonymy
R_link ⊆ O × C | lime:LexicalLinkset | lime:links | lime:references | lime:concepts | lime:percentage | lime:avgNumOfLinks | N/A
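
As a worked check of the two average ratios, the figures from the WordNet 3.0 example given earlier (206941 conceptualizations, 155287 lexical entries, 117659 lexical concepts) can be plugged into the definitions for the conceptualization relation. This is only an illustrative calculation; it assumes that avgAmbiguity and avgSynonymy are computed exactly as the two averages of the conceptualization relation and rounded to two decimal places.

```latex
\begin{align*}
\text{avgAmbiguity} &= \frac{|R_{con}|}{|L|} = \frac{206941}{155287} \approx 1.33 \\
\text{avgSynonymy}  &= \frac{|R_{con}|}{|C|} = \frac{206941}{117659} \approx 1.76
\end{align*}
```

Both results agree with the values stated in the WordNet example.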

Publication Scenarios

In this section, we describe different publication scenarios for lemon models. The lexicon ontology model essentially describes three types of entities:

  • a reference dataset (e.g. an ontology)
  • a lexicon
  • a lexicalization set linking the two

Irrespective of their logical dependencies, all of the entities above can be published as physically independent data sources. At the other extreme, the entities can be published together as one data source.

We highlight four common publication scenarios:

  1. Independent resources: A reference dataset, a lexicon and a lexicalization set are published as independent data sources. This scenario is very common for independently developed resources: a reference dataset and a general-purpose lexicon (i.e. one not tailored towards that dataset) exist and are published separately, possibly by different publishers. A third party then decides to link these datasets by means of a lexicalization set, publishes it as a third entity and advertises it through proper lime metadata.
  2. Linking to 3rd party lexicon: A general-purpose lexicon is published as an independent resource. Then, in developing a reference dataset/ontology, its authors decide to publish it together with a lexicalization set based on the lexical entries from the existing lexicon.
  3. Linking to 3rd party ontology: A lexicon tailored to an existing reference dataset is published together with a lexicalization set. This is the opposite scenario to scenario 2 above. In this case the reference dataset or ontology vocabulary is the pre-existing resource developed by some 3rd party, and a lexicon is created ad hoc for it, with the associated lexicalizations.
  4. Integrated: Reference dataset, dataset-specific lexicon and lexicalization set are combined into a single data source: this scenario corresponds to closed environments where a single party is in control of the ontology, the lexicon and the lexicalizations and publishes the three as one dataset. In this scenario, the reference dataset is created and lexicalized with lexical elements created specifically for it. This scenario is the typical setting of ontology vocabularies/datasets naturally lexicalized by means of rdfs:label, skos or skosxl labeling properties.

Similarly, a ConceptSet represents a collection of lexical concepts, and a ConceptualizationSet represents the triples expressing how lexical concepts relate to lexical entries from a given lexicon. Considerations similar to the ones above apply to these datasets.

Identifying a concept set as an independent dataset allows the same lexical concepts to be reused across different conceptualization sets. For example, the lexical concepts of an existing wordnet can be reused to conceptualize a lexicon in a natural language other than the one for which the resource was initially conceived. Alternatively, it is possible to define different concept sets, one for each conceptualization set, and then to relate them via a VoID Linkset.
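
The reuse of one concept set across several conceptualization sets could be described roughly as follows. This is a sketch with hypothetical resource names: an English and an Italian lexicon are each related to the same concept set by their own conceptualization set.

```turtle
@prefix lime:    <http://www.w3.org/ns/lemon/lime#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace for the metadata resources
@prefix :        <http://www.example.com/metadata#> .

# One concept set, published as an independent dataset
:SharedConceptSet a ontolex:ConceptSet .

# Two lexicons in different natural languages
:EnLexicon a ontolex:Lexicon ;
  ontolex:language "en" .

:ItLexicon a ontolex:Lexicon ;
  ontolex:language "it" .

# Each conceptualization set points to its own lexicon
# but to the same concept set
:EnConceptualizationSet a lime:ConceptualizationSet ;
  lime:lexiconDataset :EnLexicon ;
  lime:conceptualDataset :SharedConceptSet .

:ItConceptualizationSet a lime:ConceptualizationSet ;
  lime:lexiconDataset :ItLexicon ;
  lime:conceptualDataset :SharedConceptSet .
```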

Linguistic Description

An important goal of a lexicon is to record linguistic properties of the lexical entries it defines, such as their part of speech, gender, aspect, inflectional pattern, etc. The lemon model does not prescribe any vocabulary for doing so, but leaves it to the discretion of the user of the model to select an appropriate vocabulary that is in line with a given theoretical linguistic framework or grammar. We show below how third-party category systems can be reused to describe the properties of lexical entries in a lemon lexicon. We will use the lexinfo ontology in our examples as one such third-party ontology describing relevant linguistic categories and properties.

Morphosyntactic Description

A lexicon typically indicates the part-of-speech of a given lexical entry. We can specify the part of speech of a word as follows using the lexinfo vocabulary:


:cat a ontolex:Word ;
  lexinfo:partOfSpeech lexinfo:noun .
Example description/example1


When defining categories, it is crucial to link these categories to other models to establish coherence. The partOfSpeech property is defined as follows in lexinfo:


lexinfo:partOfSpeech 
  rdfs:label "part of speech"@en ;
  rdfs:comment "A category assigned to a word based on its grammatical and semantic properties."@en ;
  dcr:datcat <http://www.isocat.org/datcat/DC-1345> ,
             <http://www.isocat.org/datcat/DC-396> ;
  rdfs:range lexinfo:PartOfSpeech ;
  rdfs:subPropertyOf lexinfo:morphosyntacticProperty .
Example description/example2


The concrete part of speech "noun" is defined as follows and linked to the ISOcat category DC-1333.


lexinfo:noun
  a lexinfo:PartOfSpeech, lexinfo:NounPOS ;
  rdfs:label "noun"@en ;
  rdfs:comment "Part of speech used to express the name of a person, place, action or thing."@en ;
  dcr:datcat <http://www.isocat.org/datcat/DC-1333> .
Example description/example2b


Indeed, we could have written our example above also as follows:


:cat a ontolex:Word ;
  lexinfo:partOfSpeech <http://www.isocat.org/datcat/DC-1333> .
Example description/example3


The following morpho-syntactic properties are defined in the lexinfo ontology:

  • Animacy: indicating whether a word denotes something animate (human or animal)
  • Aspect: indicating the grammatical aspect (e.g. perfect or imperfect for verbs)
  • Case: indicating the grammatical case (e.g. nominative, accusative, dative, genitive, etc.)
  • Cliticness: indicating whether the word acts as a clitic
  • Definiteness: indicating whether the word refers to a particular element in a set
  • Degree: indicating whether an adjective is comparative or superlative
  • Finiteness: indicating whether the form is finite
  • Gender: indicating the grammatical gender of a word (e.g. feminine, masculine, neuter etc.)
  • Modification Type: indicating whether a modifier precedes or follows a word
  • Mood: indicating the modality (imperative, conditional, etc.) of a verb
  • Negative: indicating the negative form of verbs (e.g. in Japanese)
  • Number: indicating the grammatical number of a word (e.g. singular, plural, etc.)
  • Part of speech: indicating the syntactic category of the word in question (e.g. noun, verb, adjective, etc.)
  • Person: indicating whether a noun or pronoun refers to the speaker (first person) or listener (second person) or other entity (third person) and agreeing forms of verbs
  • Tense: indicating whether a word makes a temporal reference to the past, present or future
  • Voice: indicating the voice of the sentence (active vs. passive)

When using these properties, care should be taken to distinguish between linguistic properties of the entry itself and properties of any of its forms. By default, it should be assumed that a property of a lexical entry also holds for all its forms. In many languages, for example, gender is an entry property for nouns but a form property for adjectives:


:spiaggia a ontolex:Word ;
  ontolex:canonicalForm :spiaggia_lemma ;
  ontolex:otherForm :spiaggia_plural ;
  lexinfo:partOfSpeech lexinfo:noun ;
  lexinfo:gender lexinfo:feminine .

:spiaggia_lemma 
  ontolex:writtenRep "spiaggia"@it ;
  lexinfo:number lexinfo:singular .

:spiaggia_plural
  ontolex:writtenRep "spiagge"@it ;
  lexinfo:number lexinfo:plural .

:famoso a ontolex:Word ;
  ontolex:canonicalForm :famoso_lemma ;
  ontolex:otherForm :famosa_form, :famose_form, :famosi_form ;
  lexinfo:partOfSpeech lexinfo:adjective .

:famoso_lemma
  ontolex:writtenRep "famoso"@it ;
  lexinfo:number lexinfo:singular ;
  lexinfo:gender lexinfo:masculine .

:famosa_form 
  ontolex:writtenRep "famosa"@it ;
  lexinfo:number lexinfo:singular ;
  lexinfo:gender lexinfo:feminine .
Example description/example4
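
The same distinction between entry and form properties applies to the other morphosyntactic properties listed above. The following sketch attaches tense to a single form of an English verb rather than to the entry. The resource names are hypothetical, and the identifiers lexinfo:verb, lexinfo:tense and lexinfo:past are assumed to be the names lexinfo uses for these categories.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
# Hypothetical namespace for the lexicon
@prefix :        <http://www.example.com/lexicon#> .

# Part of speech is a property of the entry and holds for all its forms
:sing a ontolex:Word ;
  ontolex:canonicalForm :sing_lemma ;
  ontolex:otherForm :sing_past ;
  lexinfo:partOfSpeech lexinfo:verb .

:sing_lemma
  ontolex:writtenRep "sing"@en .

# Tense is a property of this particular form only
:sing_past
  ontolex:writtenRep "sang"@en ;
  lexinfo:tense lexinfo:past .
```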


For convenience, lexinfo also introduces specific classes for each part of speech so that the part of speech of a word can be specified by an rdf:type statement. For example, the part of speech Noun is defined as follows:

Noun ≡ ∃ partOfSpeech.NounPOS

It is recommended to use both the rdf:type statement and the lexinfo:partOfSpeech property to maximize interoperability, in spite of the small redundancy:


:geneesmiddel a lexinfo:Noun ;
  lexinfo:partOfSpeech lexinfo:noun .
Example description/example5


Pragmatic & Paradigmatic Description

Pragmatic aspects related to the usage of a lexical entry, as well as the paradigmatic relationships between lexical entries, can also be described using the lemon model by resorting to some external vocabulary. As with the description of the morphosyntactic properties of lexical entries and their forms, lemon does not prescribe any vocabulary but encourages the use of external vocabularies, e.g. to describe aspects related to the temporal use of a lexical entry (indicating whether its use is modern or anachronistic) or to specify lexico-semantic relationships between lexical senses. Examples of such paradigmatic or lexico-semantic relationships are synonymy, antonymy, holonymy, hypernymy and meronymy.
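
As a minimal sketch of a lexico-semantic relationship between lexical senses, the following example relates the senses of two adjectives by antonymy. The resource names are hypothetical, and lexinfo:antonym is assumed to be the property that the external vocabulary offers for this relationship; any other vocabulary providing sense relations could be used in the same way.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
# Hypothetical namespace for the lexicon
@prefix :        <http://www.example.com/lexicon#> .

:hot a ontolex:Word ;
  ontolex:sense :hot_sense .

:cold a ontolex:Word ;
  ontolex:sense :cold_sense .

# The relationship is stated between senses, not between entries
:hot_sense a ontolex:LexicalSense ;
  lexinfo:antonym :cold_sense .

:cold_sense a ontolex:LexicalSense .
```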

Arguments

When describing syntactic frames it is important to specify the grammatical role or function played by different syntactic arguments. We might want to specify, for instance, which argument plays the grammatical role of subject and which argument plays the role of a direct object, etc. LexInfo distinguishes the following types of arguments:


  • Subject: indicating the syntactic subject of a sentence. The subject typically expresses the agent of the action denoted by the verb, but this need not be so in all cases. In a passive construction, the subject actually expresses the patient or beneficiary of the action denoted by the verb.
  • Object: distinguishing between direct, indirect, prepositional (a non-optional object marked with a preposition) and genitive objects
  • Adjunct: adjuncts are optional arguments. Lexinfo distinguishes between prepositional, possessive, comparative and superlative adjuncts.
  • Copulative: a copulative argument is an argument involved in a so-called copula construction with a copulative verb. Lexinfo distinguishes the copulative subject and the copula predicate.
  • Clausal: certain verbs also subcategorize a whole clause or sentence as an argument (e.g. the verb (to) claim). In this case lexinfo talks about a clausal argument and distinguishes between declarative, gerundive, infinitive, interrogative (infinitive), possessive infinitive, prepositional gerund/interrogative, sentential or subjunctive clausal arguments.
  • Attributive: the word modified by an adjective in an attributive construction

Each argument type is associated with a specific property that relates a frame to the object representing the syntactic argument and thereby indicates its grammatical role.


:father a lexinfo:Noun ;
  synsem:synBehavior :father_frame.

:father_frame a lexinfo:NounPredicateFrame ;
  rdfs:label "X is the father of Y" , "X is Y's father" ;
  lexinfo:copulativeArg :father_frame_arg1 ;
  lexinfo:possessiveAdjunct :father_frame_arg2 .

:father_frame_arg1 a lexinfo:CopulativeArg .

:father_frame_arg2 a lexinfo:PossessiveAdjunct .
Example description/example6


Frames

Syntactic or subcategorization frames describe which syntactic arguments a certain lexical entry (verb, noun etc.) requires to be complete. A verb that requires a subject and a direct object is called a transitive verb. The corresponding frame, which generalizes across particular verbs, is called a transitive frame or transitive construction (in construction grammar theories).

In lexinfo, frames can be axiomatized by describing which type of arguments they subcategorize. A transitive frame would be axiomatized as follows in lexinfo:

   TransitiveFrame ≡ VerbFrame ⊓ (=1 subject ⊓ =1 directObject)
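
An entry using such a frame might be described as in the following sketch, which follows the pattern of the noun frame example in the section on arguments. The resource names are hypothetical. The class lexinfo:Verb is assumed by analogy with lexinfo:Noun, and the names lexinfo:TransitiveFrame, lexinfo:subject and lexinfo:directObject are taken from the axiom above.

```turtle
@prefix synsem:  <http://www.w3.org/ns/lemon/synsem#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
# Hypothetical namespace for the lexicon
@prefix :        <http://www.example.com/lexicon#> .

:own a lexinfo:Verb ;
  synsem:synBehavior :own_frame .

# Exactly one subject and one direct object, as required by the axiom
:own_frame a lexinfo:TransitiveFrame ;
  rdfs:label "X owns Y" ;
  lexinfo:subject :own_frame_subject ;
  lexinfo:directObject :own_frame_object .
```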

Lexical Nets

Lexical nets, so-called wordnets in particular, are an important type of lexical resource used very often in natural language processing applications. Lexical nets organize the senses of words into groups of equivalent meaning, so-called synsets. Further, synsets are related to each other by lexico-semantic relationships, so that the resource can be regarded as a "net". We discuss below how lexical nets can be represented with the lemon vocabulary, using Princeton WordNet as an example.

Lexical nets in lemon

As mentioned above, lexical nets indicate the different lexical senses that a word has and group these senses into sets of equivalent senses (so-called synsets). Below we state how the main entities of a lexical net (words, lemmas, senses and synsets) can be represented in lemon:

  • Synset: Lexical Concept
  • Word: Lexical Entry
  • (Word) Sense: Lexical Sense
  • Lemma: Canonical Form

Lexico-semantic relations should be represented between lexical concepts. The WordNet-RDF ontology defines some of these lexico-semantic relations:

http://wordnet-rdf.princeton.edu/ontology

The following example shows how to model the lexical entry for cat:


@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix wordnet-ontology: <http://wordnet-rdf.princeton.edu/ontology#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

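# Relative IRIs are used for the entry, its canonical form and its senses,
# because '#' would start a comment inside a Turtle prefixed name.
<cat> a ontolex:LexicalEntry ;
    wordnet-ontology:part_of_speech wordnet-ontology:noun ;
    ontolex:canonicalForm <cat#canonicalForm> ;
    ontolex:sense <cat#1-n>,
        <cat#2-n>,
        <cat#3-n>,
        <cat#4-n>,
        <cat#5-n>,
        <cat#6-n> .

<cat#1-n> a ontolex:LexicalSense ;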
    wordnet-ontology:gloss "feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats"@eng ;
    wordnet-ontology:lex_id 0 ;
    wordnet-ontology:old_sense_key "cat%1:05:00::" ;
    wordnet-ontology:sense_number 1 ;
    wordnet-ontology:tag_count 18 ;
    ontolex:isLexicalizedSenseOf <http://wordnet-rdf.princeton.edu/wn31/102124584-n> ;
    ontolex:reference <http://dbpedia.org/resource/Cat> ;
    owl:sameAs <http://lemon-model.net/lexica/uby/wn/WN_Sense_574>,
        <http://www.lexvo.org/page/wordnet/30/noun/cat_1_05_00> .

<http://wordnet-rdf.princeton.edu/wn31/102124584-n> a ontolex:LexicalConcept, wordnet-ontology:Synset ;
    rdfs:label "cat"@eng,
        "true cat"@eng ;
    wordnet-ontology:gloss "feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats"@eng ;
    wordnet-ontology:hypernym <http://wordnet-rdf.princeton.edu/wn31/102123961-n> ;
    wordnet-ontology:hyponym <http://wordnet-rdf.princeton.edu/wn31/102124772-n>,
        <http://wordnet-rdf.princeton.edu/wn31/102127587-n> ;
    wordnet-ontology:lexical_domain wordnet-ontology:noun.animal ;
    wordnet-ontology:part_of_speech wordnet-ontology:noun ;
    wordnet-ontology:synset_member <http://wordnet-rdf.princeton.edu/wn31/cat-n>,
        <http://wordnet-rdf.princeton.edu/wn31/true+cat-n> ;
    owl:sameAs <http://www.w3.org/2006/03/wn/wn20/instances/synset-cat-noun-1> ,
        <http://lemon-model.net/lexica/uby/wn/WN_Synset_11048> .
Example wordnet/example1


Relation to Other Models

In this section, we informally clarify the relation to other models, in particular SKOS, the Lexical Markup Framework (LMF), and the Open Annotation standard.

SKOS(-XL)

SKOS is a vocabulary used to represent so-called knowledge organization systems (KOS), comprising taxonomies, classification schemes, thesauri etc. SKOS thus addresses an orthogonal use case to lemon. lemon was designed to provide detailed information about the linguistic grounding of an ontological vocabulary, specifying in particular by which lexical entries a class or property can be verbalized. SKOS has only a very rudimentary way of doing this, namely by means of SKOS labels and the properties prefLabel, altLabel and hiddenLabel. This is by no means a criticism of SKOS; it merely makes clear that SKOS and lemon have been designed with different purposes and use cases in mind.

Nevertheless, SKOS and lemon can be used in conjunction to provide more detailed information about the "labels". The use case we address is one where a thesaurus or other taxonomic resource or classification system in SKOS needs to be enriched with more detailed linguistic information. We recommend using the property evokes and its inverse isEvokedBy to relate a skos:Concept to a lexical entry, as shown in the following example:


:financial_assets a skos:Concept;
                ontolex:isEvokedBy :financial_assets_lex.

:financial_assets_lex a ontolex:LexicalEntry;
                 ontolex:evokes :financial_assets;
                 ontolex:form :financial_assets_form. 

:financial_assets_form ontolex:writtenRep "financial assets".
Example other/example-skos1


The above represents the recommended way of linking a SKOS concept to a lexical entry in the lexicon ontology model.

Statements about preferred lexicalizations, akin to the properties prefLabel, altLabel and hiddenLabel used in SKOS, can be attached via the lexical senses, as the following example shows:


:tuberculosis a skos:Concept;
     ontolex:isEvokedBy :tuberculosis_lex;
     ontolex:isEvokedBy :consumption_lex.

:tuberculosis_lex a ontolex:LexicalEntry;
      ontolex:sense :tuberculosis_sense;
      ontolex:evokes :tuberculosis.
   
:tuberculosis_sense a ontolex:LexicalSense;
      ontolex:isLexicalizedSenseOf :tuberculosis; 
      ontolex:usage [ rdf:value "preferred" ].

:consumption_lex a ontolex:LexicalEntry;
       ontolex:sense :consumption_sense;
       ontolex:evokes :tuberculosis.

:consumption_sense a ontolex:LexicalSense;
        ontolex:isLexicalizedSenseOf :tuberculosis;
        ontolex:usage [ rdf:value "outdated" ]. 
Example other/example-skos2


If you are using reified labels as in SKOS-XL, it is possible to have forms or lexical entries in the range of the skosxl:prefLabel, skosxl:altLabel and skosxl:hiddenLabel properties. However, we note that it would then follow that lexical entries and forms are inferred to be skosxl:Labels, which does not correspond to this community's understanding of forms and lexical entries as linguistic objects rather than mere "labels".

LMF

The Lexical Markup Framework (LMF) (ISO-24613:2008) is a standard for representing machine-readable lexicons. The model is not suited, however, to publishing lexica on the web as linked data, as it only provides a serialization in XML rather than in RDF. Further, LMF does not address the interface between lexica and ontologies as lemon does.

Nevertheless, the lemon model draws heavy inspiration from the LMF model. lemon has imported many classes/entities from LMF and adopted its core ontology. On the other hand, lemon has added vocabulary to describe the syntax-semantics interface with respect to an ontology and has removed a number of classes that create syntactic overhead. A complete description of the relationship between LMF and the original lemon model is provided here. The main differences are summarized below:

  • lemon defines the meaning of a term by reference to an ontology element defined by the OWL model.
  • lemon provides a more compact description than LMF to describe the syntax-semantics interface
  • lemon relies on external category systems and linguistic ontologies to describe linguistic properties of lexical entries instead of proposing its own category system
  • ontolex does not include a module for describing inflectional morphology patterns (called intentional morphology in LMF). Further, it does not allow global constraints on the lexicon to be defined. This can be done using OWL axioms, but not in lemon itself.

OpenAnnotation

In many use cases the need arises to annotate a text corpus with links to entities defined in a lexicon, e.g. lexical entries, forms, lexical senses, lexical concepts etc. lemon does not support this annotation per se, as there are other models dedicated exactly to this, such as the Open Annotation standard. In both models an element of a lexicon may be the target of an annotation. This target may be a form, lexical entry, lexical sense or lexical concept, and it is important to give the class to make clear what the target of the annotation is.

We will now give an example of annotating the word "cat" occurring at character 7 in a file at the URL http://www.example.com/doc.txt, where the lemon element is given as the body of the annotation:


@prefix dctypes: <http://purl.org/dc/dcmitype/> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .

<anno> a oa:Annotation ;
  oa:hasBody <cat> ;
  oa:hasTarget <anno#target> .

<anno#target> a dctypes:Text ;
  oa:hasSelector <http://www.example.com/doc.txt#char=7,10> .

<http://www.example.com/doc.txt#char=7,10> a oa:FragmentSelector .

<cat> a ontolex:LexicalEntry .
Example other/example-oa