Eurosentiment
Introduction
In the EuroSentiment project, the lemon model will be used to represent language resources for sentiment analysis, such as WordNet Affect, in a semantically interoperable way that follows Linked Data principles. The representation of WordNet Affect in lemon is in effect a straightforward transformation of the WordNet data model into the lemon model. Importantly, however, we introduce URIs to uniquely and formally define the structure and content of this WordNet-based language resource. These URIs are adopted from existing Linked Data resources, which further enhances semantic interoperability. We also integrate domain categories into the representation so that polarity (sentiment tendency) can be defined per domain for each lexical item. The lemon model can represent all aspects of lexical information: lexical sense (word meaning) and polarity, but also morphosyntactic features such as part-of-speech and inflection. WordNet Affect itself does not provide such morphosyntactic information, but it is available from other language resources, including those held by EuroSentiment SME partners, and these can easily be integrated with the WordNet Affect information using lemon. The representation of WordNet Affect in lemon relies on representing sentiment concepts with a formally defined vocabulary (ontology) based on MARL, a model for the representation of sentiment annotations.
lemon Representation of WordNet Affect
Consider the following example for the English noun ‘fear’ in WordNet and equivalent Italian synonyms in WordNet Affect:
Princeton WordNet:
n#05590260 12 n 03 fear 0 fearfulness 0 fright 0 017 @ 05560878 n 0000 ! 05595229 n 0101 = 00080744 a 0000 = 00084648 a 0000 ~ 05590744 n 0000 ~ 05590900 n 0000 ~ 05591021 n 0000 ~ 05591212 n 0000 ~ 05591290 n 0000 ~ 05591377 n 0000 ~ 05591481 n 0000 ~ 05591591 n 0000 ~ 05591681 n 0000 ~ 05591792 n 0000 ~ 05592739 n 0000 ~ 05593389 n 0000 %p 10337259 n 0000 | an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight)
WordNet Affect:
n#05590260 fifa paura spavento terrore timore | "una emozione che si prova prima di qualche specifico dolore o pericolo"
n#05590260 affective-label="negative-fear"
n#05590260 domain-label="Psychological_Features"
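The Princeton WordNet line above is dense, so a small parser may help in reading it. The following is a minimal sketch, assuming the line follows the usual WordNet database layout (identifier, lexicographer file number, synset type, a hexadecimal word count, word/lex_id pairs, a decimal pointer count, four-token pointers, then the gloss after `|`). The `n#` prefix on the identifier is taken as given, and the names `Synset` and `parse_synset_line` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Synset:
    synset_id: str                        # first token, e.g. "n#05590260"
    words: List[str]                      # synonyms listed in the synset
    pointers: List[Tuple[str, str, str]]  # (symbol, target offset, target POS)
    gloss: str                            # definition text after "|"


def parse_synset_line(line: str) -> Synset:
    """Split one synset line into words, pointers and gloss (assumed layout)."""
    data, _, gloss = line.partition("|")
    tokens = data.split()
    word_count = int(tokens[3], 16)  # word count is written in hexadecimal
    # Words alternate with a lex_id token, so take every second token.
    words = [tokens[4 + 2 * i] for i in range(word_count)]
    start = 4 + 2 * word_count
    pointer_count = int(tokens[start])  # pointer count is written in decimal
    pointers = []
    for i in range(pointer_count):
        chunk = tokens[start + 1 + 4 * i : start + 5 + 4 * i]
        symbol, offset, pos, _source_target = chunk
        pointers.append((symbol, offset, pos))
    return Synset(tokens[0], words, pointers, gloss.strip())
```

Applied to the ‘fear’ line, this sketch would yield the three English synonyms and one pointer tuple per related synset, for example the `@` (hypernym) and `~` (hyponym) links.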
Using lemon we can represent and integrate information on the Italian synonyms, their links to the English-based synset using Princeton WordNet URIs, and sentiment properties using an extended version of the MARL ontology. Domain properties are based on a separate ‘Eurosentiment domain ontology’. The example illustrates the positive polarity of ‘fear’ in English (and ‘fifa, paura, spavento, terrore, timore’ in Italian) in the context of ‘horror movies’ and its negative polarity in the context of ‘children’s movies’.
Declaration of namespaces used – wn declares WordNet 3.0 synsets, lemon declares the core lemon lexicon model, lexinfo declares specific properties for part-of-speech etc., ed declares domain categories, marl declares sentiment properties:
@prefix wn: <http://semanticweb.cs.vu.nl/europeana/lod/purl/vocabularies/princeton/wn30/> .
@prefix lemon: <http://www.monnet-project.eu/lemon#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
@prefix ed: <http://www.eurosentiment/domain/> .
@prefix marl: <http://purl.org/marl/ns#> .
Declaration of lexicon identifier, language and lexical entries:
:lexicon a lemon:Lexicon ;
    lemon:language "it" ;
    lemon:entry :fifa, :paura, :spavento, :terrore, :timore .
Declaration of lemma, sense (link to synset in WordNet 3.0, polarity and domain context) and part-of-speech of ‘fifa’:
:fifa a lemon:LexicalEntry ;
    lemon:canonicalForm [ lemon:writtenRep "fifa"@it ] ;
    lemon:sense [ lemon:reference wn:synset-fear-noun-1 ;
        marl:polarityValue 0.375 ;
        marl:hasPolarity marl:positive ;
        lemon:context ed:horror_movies ] ;
    lemon:sense [ lemon:reference wn:synset-fear-noun-1 ;
        marl:polarityValue -0.375 ;
        marl:hasPolarity marl:negative ;
        lemon:context ed:children_movies ] ;
    lexinfo:partOfSpeech lexinfo:noun .
Declarations of lemma and part-of-speech of ‘paura, spavento, terrore, timore’:
:paura a lemon:LexicalEntry ;
    lemon:canonicalForm [ lemon:writtenRep "paura"@it ] ;
    lexinfo:partOfSpeech lexinfo:noun .
:spavento a lemon:LexicalEntry ;
    lemon:canonicalForm [ lemon:writtenRep "spavento"@it ] ;
    lexinfo:partOfSpeech lexinfo:noun .
:terrore a lemon:LexicalEntry ;
    lemon:canonicalForm [ lemon:writtenRep "terrore"@it ] ;
    lexinfo:partOfSpeech lexinfo:noun .
:timore a lemon:LexicalEntry ;
    lemon:canonicalForm [ lemon:writtenRep "timore"@it ] ;
    lexinfo:partOfSpeech lexinfo:noun .
Declarations of sense equivalence (synonymy) of ‘paura, spavento, terrore, timore’ with ‘fifa’:
:paura a lemon:LexicalSense ;
    lemon:equivalent :fifa .
:spavento a lemon:LexicalSense ;
    lemon:equivalent :fifa .
:terrore a lemon:LexicalSense ;
    lemon:equivalent :fifa .
:timore a lemon:LexicalSense ;
    lemon:equivalent :fifa .
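The point of attaching a domain context to each sense is that a consumer can pick the polarity that applies in a given domain. The following is a minimal sketch of that lookup in plain Python, using the two senses of ‘fifa’ declared above. The class `Sense`, the constant `FIFA_SENSES` and the function `sense_in_context` are hypothetical names for illustration only; an actual application would more likely query the RDF data directly.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sense:
    reference: str         # synset identifier, mirrors lemon:reference
    polarity_value: float  # mirrors marl:polarityValue
    polarity: str          # mirrors marl:hasPolarity
    context: str           # domain category, mirrors lemon:context


# The two senses of 'fifa', copied from the declaration above.
FIFA_SENSES = (
    Sense("wn:synset-fear-noun-1", 0.375, "positive", "ed:horror_movies"),
    Sense("wn:synset-fear-noun-1", -0.375, "negative", "ed:children_movies"),
)


def sense_in_context(senses: Sequence[Sense], context: str) -> Optional[Sense]:
    """Return the first sense declared for the given domain, or None."""
    for sense in senses:
        if sense.context == context:
            return sense
    return None


horror = sense_in_context(FIFA_SENSES, "ed:horror_movies")
children = sense_in_context(FIFA_SENSES, "ed:children_movies")
print(horror.polarity, horror.polarity_value)
print(children.polarity, children.polarity_value)
```

Returning `None` for an unknown domain, rather than falling back to some default polarity, is a design choice of this sketch; the source does not specify a fallback.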
lemon Representation of Lexical Features
The examples in the previous section showed how WordNet-based language resources can be represented with lemon. Many other types of language resources exist, however, including sentiment dictionaries that list domain words with their polarity scores as well as inflectional variants, part-of-speech, etc. Such resources can also be represented using lemon, which makes them interoperable with the lemon version of WordNet Affect and with other lemon-based language resources.
Consider the following example for the German noun ‘Einschlag’ (‘impact’) with lexical features (inflection, part-of-speech) and polarity score:
Einschlag    Einschlag  NN  negative  -/-0.0048/-  L
Einschlages  Einschlag  NN  negative  -/-0.0048/-  L
Einschlags   Einschlag  NN  negative  -/-0.0048/-  L
Einschläge   Einschlag  NN  negative  -/-0.0048/-  L
Einschlägen  Einschlag  NN  negative  -/-0.0048/-  L
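Before such a dictionary can be converted to lemon, its rows have to be read into structured records. The following is a minimal sketch, assuming each row has six whitespace-separated columns (inflected form, lemma, part-of-speech tag, polarity label, a slash-separated score field, and a trailing flag) and that `-` marks an empty score slot. The meaning of the individual score slots and of the final `L` column is not stated in the source, so they are carried over unchanged. The names `DictionaryRow`, `parse_score` and `parse_row` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryRow:
    form: str               # inflected surface form, e.g. "Einschlages"
    lemma: str              # citation form, e.g. "Einschlag"
    pos: str                # part-of-speech tag as given, e.g. "NN"
    polarity: str           # polarity label, e.g. "negative"
    score: Optional[float]  # first numeric slot of the score field, if any
    flag: str               # trailing column, copied verbatim


def parse_score(field: str) -> Optional[float]:
    """Return the first slash-separated slot that is a number, else None."""
    for slot in field.split("/"):
        try:
            return float(slot)
        except ValueError:
            continue  # assumption: "-" marks an empty slot
    return None


def parse_row(line: str) -> DictionaryRow:
    """Parse one six-column dictionary row into a DictionaryRow."""
    form, lemma, pos, polarity, score_field, flag = line.split()
    return DictionaryRow(form, lemma, pos, polarity, parse_score(score_field), flag)


row = parse_row("Einschlägen Einschlag NN negative -/-0.0048/- L")
print(row.form, row.lemma, row.polarity, row.score)
```

Grouping the parsed rows by `lemma` would then give one lexical entry per lemma, with the remaining forms as inflectional variants.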
Using lemon we can represent this and integrate it with additional information as follows:
Declaration of namespaces used – wn declares WordNet 3.0 synsets, lemon declares the core lemon lexicon model, isocat declares specific properties for part-of-speech etc. (isocat is part of the lexinfo model used in the previous example), marl declares sentiment properties:
@prefix wn: <http://semanticweb.cs.vu.nl/europeana/lod/purl/vocabularies/princeton/wn30/> .
@prefix lemon: <http://www.monnet-project.eu/lemon#> .
@prefix isocat: <https://catalog.clarin.eu/isocat/interface/index.html> .
@prefix marl: <http://purl.org/marl/ns#> .
Declaration of lexicon identifier, language and lexical entry:
:lexicon a lemon:Lexicon ;
    lemon:language "de" ;
    lemon:entry :Einschlag .
Declaration of lemma, sense (link to synset in WordNet 3.0, polarity), alternate forms (inflectional variants with features) and part-of-speech:
:Einschlag
    lemon:canonicalForm [ lemon:writtenRep "Einschlag"@de ;
        isocat:DC-1297 isocat:DC-1883 ;     # gender=masculine
        isocat:DC-1298 isocat:DC-1387 ;     # number=singular
        isocat:DC-2720 isocat:DC-1331 ] ;   # case=nominative
    lemon:sense [ lemon:reference wn:synset-impact-noun-1 ;
        marl:polarityValue -0.0048 ;
        marl:hasPolarity marl:negative ] ;
    lemon:altForm
        [ lemon:writtenRep "Einschlages"@de ;
          isocat:DC-1297 isocat:DC-1883 ;   # gender=masculine
          isocat:DC-1298 isocat:DC-1387 ;   # number=singular
          isocat:DC-2720 isocat:DC-1293 ] , # case=genitive
        [ lemon:writtenRep "Einschlags"@de ;
          isocat:DC-1297 isocat:DC-1883 ;   # gender=masculine
          isocat:DC-1298 isocat:DC-1387 ;   # number=singular
          isocat:DC-2720 isocat:DC-1293 ] , # case=genitive
        [ lemon:writtenRep "Einschläge"@de ;
          isocat:DC-1297 isocat:DC-1880 ;   # gender=masculine
          isocat:DC-1298 isocat:DC-1354 ;   # number=plural
          isocat:DC-2720 isocat:DC-1331 ] , # case=nominative
        [ lemon:writtenRep "Einschlägen"@de ;
          isocat:DC-1297 isocat:DC-1880 ;   # gender=masculine
          isocat:DC-1298 isocat:DC-1354 ;   # number=plural
          isocat:DC-2720 isocat:DC-1265 ] ; # case=dative
    isocat:DC-1345 isocat:DC-1333 .         # partOfSpeech=noun
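Entries of this shape can be generated mechanically from a lemma and its inflected forms. The following is a minimal sketch that renders only the written representations (no morphosyntactic features, sense or polarity). It assumes the lemma is usable as a local name under the default `:` prefix, and it relies on the Turtle rule that multiple objects of one predicate are separated by commas. The function names `form_node` and `entry_to_turtle` are hypothetical.

```python
from typing import Sequence


def form_node(written_rep: str, lang: str) -> str:
    """Render one blank node carrying a language-tagged lemon:writtenRep."""
    escaped = written_rep.replace("\\", "\\\\").replace('"', '\\"')
    return f'[ lemon:writtenRep "{escaped}"@{lang} ]'


def entry_to_turtle(lemma: str, alt_forms: Sequence[str], lang: str) -> str:
    """Render a lemon lexical entry with its canonical and alternate forms."""
    lines = [
        f":{lemma} a lemon:LexicalEntry ;",
        f"    lemon:canonicalForm {form_node(lemma, lang)}",
    ]
    if alt_forms:
        lines[-1] += " ;"
        # All alternate forms are objects of a single lemon:altForm predicate.
        nodes = " ,\n        ".join(form_node(form, lang) for form in alt_forms)
        lines.append(f"    lemon:altForm\n        {nodes}")
    lines[-1] += " ."
    return "\n".join(lines)


print(entry_to_turtle(
    "Einschlag",
    ["Einschlages", "Einschlags", "Einschläge", "Einschlägen"],
    "de",
))
```

The output still needs the `@prefix` declarations shown earlier before it can be loaded as a complete Turtle document.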