SALT (Semantically Annotated LaTeX)
This page contains a series of examples of SALT-annotated documents. To keep things simple, we used the light version of SALT, which does not include the annotation of complex rhetorical relations. It contains only claims and rhetorical blocks.
The SALT ontologies can be found at: http://salt.semanticauthoring.org/ontologies.html
Documents
- SALT annotated examples in LaTeX: Media:HCLSIG$$SWANSIOC$$Actions$$RhetoricalStructure$$models$$salt$salt-latex.zip
- SALT instances after extraction: Media:HCLSIG$$SWANSIOC$$Actions$$RhetoricalStructure$$models$$salt$iswc2009.pdf.rdf
- The SALT extraction process uses both the LaTeX sources and the compiled PDF to create SALT ontology instances. This results in a rich model that contains:
- Shallow metadata: title, authors, affiliations
- Full linear structure of the publication
- Full list of references, including their citation context (the paragraph where the reference is actually cited)
- Rhetorical blocks: the abstract is extracted automatically, while the rest are created manually. The examples above contain scenario, motivation, contribution, background and conclusion blocks.
- Rhetorical elements: here only claims.
- Unlike other approaches, SALT does not duplicate the textual content of rhetorical blocks, rhetorical elements or citation contexts in the instance model. Instead, it uses a pointer-based approach: every element that points to textual content (directly, such as Paragraph or TextChunk, or indirectly via Annotations, Claims, RhetoricalBlocks or CitationContexts) carries two properties, startPointer and endPointer, which precisely define the location of the text span inside the PDF. One can then use the SALT deserializer to extract the actual text from the PDF, given a set of pointers.
- SALT(ed) PDF - the above instance model is attached inside the PDF with a salt_metadata signature: Media:HCLSIG$$SWANSIOC$$Actions$$RhetoricalStructure$$models$$salt$iswc2009.pdf
- We used to replace the XMP field of the PDF with SALT metadata, but since typical PDF readers cannot interpret it, we decided to attach it to the PDF instead. We are currently working on creating actual XMP metadata snippets from the complete SALT instance model.
- SALT citation: Media:HCLSIG$$SWANSIOC$$Actions$$RhetoricalStructure$$models$$salt$salt_citation.ppt
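The pointer-based approach described above can be illustrated with a minimal Python sketch. It is not the SALT deserializer: the class `TextSpan`, the function `resolve`, and the reading of startPointer and endPointer as character offsets into already extracted text are all assumptions made for illustration. The actual pointer encoding used by SALT inside the PDF is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSpan:
    """Hypothetical stand-in for a SALT element that points at text (e.g. a Claim)."""
    kind: str
    start_pointer: int  # assumption: character offset, inclusive
    end_pointer: int    # assumption: character offset, exclusive


def resolve(span: TextSpan, document_text: str) -> str:
    """Return the text a span points to, without the span storing any text itself."""
    if not 0 <= span.start_pointer <= span.end_pointer <= len(document_text):
        raise ValueError(f"pointers out of range for {span.kind}")
    return document_text[span.start_pointer:span.end_pointer]


# Stand-in for the text extracted from the compiled PDF.
document_text = "SALT enriches LaTeX documents. Pointers avoid duplicating text."

# The instance model keeps only the pointers, never a copy of the sentence.
start = document_text.index("Pointers")
claim = TextSpan(kind="Claim", start_pointer=start, end_pointer=len(document_text))

print(resolve(claim, document_text))
```

Because the model stores only the two pointers, the text lives in exactly one place (the PDF), and any element type that refers to text can be resolved through the same lookup.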