Lime old specification
The metadata module (lime) provides a vocabulary for describing metadata about lexica as well as about ontologies and other datasets. It builds on the VoID and DCAT vocabularies and allows the following information to be added to a dataset:
- The number of lexical entries available in a lexicon as well as in a dataset
- The average number of ontology elements that a lexical entry in the dataset refers to
- The average number of lexical entries per ontology/dataset element in a given language, etc.
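The averages listed above can be written out explicitly. This is only a sketch: the symbols E, O and R are notation introduced here, and the denominators are assumed to count all entries and all elements, not only the linked ones, which the text does not state.

```latex
% Notation introduced for illustration (not part of the lime vocabulary):
%   E = set of lexical entries (in a given language)
%   O = set of ontology/dataset elements
%   R \subseteq E \times O = pairs (e, o) where entry e refers to element o
\[
  \text{number of entries} = |E|, \qquad
  \text{avg. elements per entry} = \frac{|R|}{|E|}, \qquad
  \text{avg. entries per element} = \frac{|R|}{|O|}
\]
% Worked example: |E| = 4, |O| = 2, |R| = 5
\[
  \frac{|R|}{|E|} = \frac{5}{4} = 1.25, \qquad
  \frac{|R|}{|O|} = \frac{5}{2} = 2.5
\]
```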
We provide a list of properties available in the metadata module:
- lime:language (with domain void:Dataset and range xsd:string), used to express the natural languages for which lexicalizations exist in the given dataset.
- lime:linguisticModel: describes the model/vocabulary by which lexicalization information is attached; the domain is void:Dataset and the range is the URI of the vocabulary used (?); lime:linguisticModel is a subproperty of void:vocabulary.
- lime:languageCoverage: describes how many lexicalizations of a particular type of concept there are for a specific language in the dataset in question. The domain is void:Dataset (and the range is?).
- lime:type (???): do we want to have a type for the resource so that we can hook it up to some classification?
Class: Lexical Resource?
URI: http://www.w3.org/ns/lemon/lime#LexicalResource
SubClassOf: void:Dataset
ObjectProperty: In resource
URI: http://www.w3.org/ns/lemon/lime#inResource
Domain: LexicalResource
Range: Lexicon
ObjectProperty: Linguistic Model
URI: http://www.w3.org/ns/lemon/lime#linguisticModel
Domain: LexicalResource or Lexicon
Range: ???
ObjectProperty: Language Coverage
URI: http://www.w3.org/ns/lemon/lime#languageCoverage
Domain: LexicalResource
Range: void:Dataset(?)
DatatypeProperty: Total Entries
URI: http://www.w3.org/ns/lemon/lime#totalEntries
Or maybe numberOfLexicalEntries??
Domain: LexicalResource or Lexicon
Range: xsd:integer
DatatypeProperty: Entries per entity
URI: http://www.w3.org/ns/lemon/lime#avgNumOfEntries
Or maybe averageEntriesPerOntologyEntity
Domain: LexicalResource or Lexicon
Range: xsd:decimal
DatatypeProperty: Entities per entry
URI: http://www.w3.org/ns/lemon/lime#averageAmbiguity
Domain: LexicalResource or Lexicon
Range: xsd:decimal
All of the above properties apply to a void:Dataset and can in principle be used to add metadata about any dataset, including a lexicon or an ontology.
We give examples of the usage of the model below:
The following code says that, for a given dataset :dat, 75% of the owl:Class objects have attached lexical information in English and that, on average, there are 3.5 lexical entries for every owl:Class. For German, the coverage is smaller: 10% of the owl:Class objects have attached lexical information, and the average number of entries is 1.2.
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <http://example.org/> .   # placeholder namespace for the example resources

:dat a void:Dataset ;
    lime:languageCoverage :lang_cov_en ;
    lime:languageCoverage :lang_cov_de ;
    lime:language "de" ;
    lime:language "en" .

# English coverage
:lang_cov_en lime:lang "en" ;
    lime:resourceCoverage :lang_cov_en_resource ;
    lime:totalEntries 555 .

:lang_cov_en_resource lime:class owl:Class ;
    lime:percentage 0.75 ;
    lime:avgNumOfEntries 3.5 .

# German coverage
:lang_cov_de lime:lang "de" ;
    lime:resourceCoverage :lang_cov_de_resource ;
    lime:totalEntries 55 .

:lang_cov_de_resource lime:class owl:Class ;
    lime:percentage 0.10 ;
    lime:avgNumOfEntries 1.2 .
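As a further sketch, the class and datatype properties listed earlier (lime:LexicalResource, lime:totalEntries, lime:avgNumOfEntries, lime:averageAmbiguity) might be attached directly to a resource as follows. The ex: namespace, the resource name and all values are hypothetical. Since the range of lime:linguisticModel is still open, that property is left out.

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace, for illustration only

# A lexical resource (a subclass of void:Dataset) with summary statistics.
ex:myResource a lime:LexicalResource ;
    lime:language "en" ;
    lime:totalEntries 555 ;          # number of lexical entries
    lime:avgNumOfEntries 3.5 ;       # entries per entity (illustrative value)
    lime:averageAmbiguity 1.4 .      # entities per entry (illustrative value)
```

In Turtle, the bare literal 555 is typed as xsd:integer, and 3.5 and 1.4 as xsd:decimal, which matches the ranges given for these properties.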
For a given lexicon, we define the following metadata properties:
- lime:language (with domain lemon:Lexicon and range xsd:string)
- lime:numberOfLexicalEntries (with domain lemon:Lexicon and range xsd:integer)
- lime:averageAmbiguity (with domain lemon:Lexicon and range xsd:double)
- lime:averageEntriesPerOntologyEntity (with domain lemon:Lexicon and range xsd:double)
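A minimal sketch of such lexicon metadata in Turtle follows. The ex: namespace, the lexicon name and all values are hypothetical, and the lemon: namespace URI is an assumption, since the text does not give it.

```turtle
@prefix lime:  <http://www.w3.org/ns/lemon/lime#> .
@prefix lemon: <http://lemon-model.net/lemon#> .   # assumed namespace URI for lemon
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:    <http://example.org/> .             # hypothetical namespace

ex:lexicon_en a lemon:Lexicon ;
    lime:language "en" ;
    lime:numberOfLexicalEntries 555 ;                          # bare integer literal: xsd:integer
    lime:averageAmbiguity "1.4"^^xsd:double ;                  # illustrative value
    lime:averageEntriesPerOntologyEntity "3.5"^^xsd:double .   # illustrative value
```

The doubles are typed explicitly because a bare 1.4 in Turtle would be read as xsd:decimal, not xsd:double.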