W3C

RDFa in HTML Overview

Editors' Draft

This version:
$Id: rdfa-overview.html,v 1.1 2008/11/20 05:54:10 adida Exp $
Latest version:
http://www.w3.org/2006/07/SWD/RDFa/
Previous version:
This is the first draft version.
Editors:
Michael Hausenblas, JOANNEUM RESEARCH michael.hausenblas@joanneum.at
Ben Adida, Creative Commons ben@adida.net

Abstract

RDFa is a serialization syntax for embedding an RDF graph (cf. [RDF Concepts], Section 3.1) into XHTML 1.1 [XHTML 1.1] using a selected set of attributes. RDFa also defines how these attributes are to be interpreted to generate the RDF graph contained in an XHTML 1.1 document. This document further describes the RDF features supported by the RDFa specification.

This document gives an overview of what the RDFa specification comprises, along with a non-normative description of the terminology used herein. It also describes how RDFa relates to other approaches to deploying RDF metadata in (X)HTML content.

Status of this document

This is an Editors' Draft and hence a work in progress. Publication as an Editors' Draft does not imply endorsement by the W3C Membership. As this is a draft document, it may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The work described herein has been produced by the RDF-in-HTML Task Force [RDFHTML-TF], a joint task force of the Semantic Web Deployment Working Group [SWD-WG] and the HTML Working Group [HTML-WG].

Please send comments to public-rdf-in-xhtml-tf@w3.org. The public archive of this list is available at http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/

Last Modified: $Date: 2008/11/20 05:54:10 $


Table of contents

  1. Introduction
  2. Document Roadmap
  3. Terminology
  4. RDF Coverage (RiR)
  5. Relation to Other Approaches

  6. References
  7. Acknowledgements

1. Introduction

This section gives a high-level overview of RDFa processing. In terms of the RDFa Terminology, an RDFa-aware agent uses the RDFa interpretation rules formally defined in the RDFa Syntax document to process the content of an XHTML document and generate an RDF graph, as depicted in Figure 1 below.

The subset of RDF that is supported by the RDFa specification (called RiR) is described in the RDF Coverage section.

Figure 1.: RDFa is used to process an XHTML document to generate an RDF graph.

References in Figure 1.:
[1] ... The documents listed in the Document Roadmap of this document.
[2] ... The Resource Description Framework (RDF) [RDF Concepts]
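
As a rough illustration of the flow in Figure 1, the following Python sketch walks a small XHTML fragment and emits triples. It is not the processing model of the RDFa Syntax document: it assumes only the `about`, `property` and `content` attributes, full URIs rather than prefixed names as `property` values, and literal objects only. `DOC`, the example.org URIs and the name `extract_triples` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; URIs and values are purely illustrative.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div about="http://example.org/report">
      <span property="http://purl.org/dc/elements/1.1/title">RDFa Overview</span>
      <span property="http://purl.org/dc/elements/1.1/creator">Example Author</span>
    </div>
  </body>
</html>"""


def extract_triples(xhtml_text, base="http://example.org/doc"):
    """Walk the element tree and emit (subject, predicate, literal) triples."""
    triples = []

    def walk(element, subject):
        # An `about` attribute sets the subject for this element and its descendants.
        subject = element.get("about", subject)
        predicate = element.get("property")
        if predicate is not None:
            # Prefer an explicit `content` attribute, else use the element text.
            value = element.get("content")
            if value is None:
                value = "".join(element.itertext()).strip()
            triples.append((subject, predicate, value))
        for child in element:
            walk(child, subject)

    walk(ET.fromstring(xhtml_text), base)
    return triples


for triple in extract_triples(DOC):
    print(triple)
```

The subject is passed down the recursion, so nested elements inherit it from the nearest ancestor that carries `about`; an actual RDFa-aware agent has to handle considerably more cases.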

RDFa itself is intended to be a technique for adding metadata to any (XML) markup document, including SMIL, RSS, SVG, MathML, etc. Note, however, that RDFa is currently being defined only for the (X)HTML family of languages. An RDFa FAQ is available as well.

2. Document Roadmap

RDFa is described by a set of documents, each fulfilling a different purpose. The following provides a brief roadmap for navigating through this set of documents:

Note: The document you are currently reading is to be understood as an umbrella specification.

3. Terminology

A non-normative tabular overview of the RDFa Terminology is given in Table 1, Table 2, and Table 3 below. The formal, yet non-normative, definition of the RDFa Terminology is available as an OWL-Lite ontology in RDF/XML format. The RDFa Terminology is depicted in Figure 2.

Figure 2.: The RDFa Terminology.

Note that the figure above and the tables below use the namespaces defined in Table 4.

Table 1.: Concepts in the terminology used in RDFa documents.
Concept | Description
rdfat:SWApplication | A piece of software that rdfat:operatesOn an rdfat:RDFGraph.
rdfat:RDFGraph | An RDF graph is a set of triples; each triple contains three components: subject, predicate, and object.
rdfat:Vocabulary | A set of atomic words known to an agent forming a part of a specific language.
rdfat:RDFaAwareAgent | A software agent that applies the rules defined in rdfat:RDFaHTML and rdfat:generates an rdfat:RDFGraph from an rdfat:XHTML11 document.
rdfat:RDFaRESTExtractor | An rdfat:RDFaAwareAgent that provides a REST interface.
rdfat:RDFaDOMExtractor | An rdfat:RDFaAwareAgent that provides a DOM interface.

Table 2.: Properties in the terminology used in RDFa documents.
Property | Description | Domain | Range
rdfat:uses | An rdfat:SWApplication rdfat:uses an rdfat:RDFaAwareAgent. | rdfat:SWApplication | rdfat:RDFaAwareAgent
rdfat:operatesOn | An rdfat:SWApplication rdfat:operatesOn an rdfat:RDFGraph. | rdfat:SWApplication | rdfat:RDFGraph
rdfat:isEmbeddedIn | An rdfat:Vocabulary rdfat:isEmbeddedIn another rdfat:Vocabulary. | rdfat:Vocabulary | rdfat:Vocabulary
rdfat:processes | An rdfat:RDFaAwareAgent rdfat:processes an rdfat:Vocabulary. | rdfat:RDFaAwareAgent | rdfat:Vocabulary
rdfat:generates | An rdfat:RDFaAwareAgent rdfat:generates an rdfat:RDFGraph. | rdfat:RDFaAwareAgent | rdfat:RDFGraph

Table 3.: Instances in the terminology used in RDFa documents.
Instance | Type | Description
rdfat:XHTML11 | rdfat:Vocabulary | XHTML 1.1 [XHTML 1.1]
rdfat:RDFaHTML | rdfat:Vocabulary | RDFa in HTML, as of RDFa Syntax document, with rdfat:RDFaHTML rdfat:isEmbeddedIn rdfat:XHTML11
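
To make the domain and range columns of Table 2 concrete, the sketch below records them as plain Python data and checks whether a statement uses a property with correctly typed subject and object. This is only an illustration: in RDF Schema, domain and range license inferences rather than act as validation constraints, and subclass relations (such as rdfat:RDFaDOMExtractor being an rdfat:RDFaAwareAgent) are ignored here. The `ex:` instances and the function name `conforms` are hypothetical.

```python
# Domain and range of each property, copied from Table 2.
PROPERTIES = {
    "rdfat:uses": ("rdfat:SWApplication", "rdfat:RDFaAwareAgent"),
    "rdfat:operatesOn": ("rdfat:SWApplication", "rdfat:RDFGraph"),
    "rdfat:isEmbeddedIn": ("rdfat:Vocabulary", "rdfat:Vocabulary"),
    "rdfat:processes": ("rdfat:RDFaAwareAgent", "rdfat:Vocabulary"),
    "rdfat:generates": ("rdfat:RDFaAwareAgent", "rdfat:RDFGraph"),
}

# Instance types: the first two come from Table 3, the ex: entries are hypothetical.
TYPES = {
    "rdfat:XHTML11": "rdfat:Vocabulary",
    "rdfat:RDFaHTML": "rdfat:Vocabulary",
    "ex:myParser": "rdfat:RDFaAwareAgent",
    "ex:myGraph": "rdfat:RDFGraph",
}


def conforms(subject, prop, obj):
    """True if subject and object have exactly the domain and range types of prop."""
    domain, range_ = PROPERTIES[prop]
    return TYPES.get(subject) == domain and TYPES.get(obj) == range_


print(conforms("rdfat:RDFaHTML", "rdfat:isEmbeddedIn", "rdfat:XHTML11"))
print(conforms("ex:myGraph", "rdfat:generates", "ex:myParser"))
```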

Table 4.: XML Namespaces used in this document.
Prefix | Namespace
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs | http://www.w3.org/2000/01/rdf-schema#
owl | http://www.w3.org/2002/07/owl#
rdfat | http://www.w3.org/2006/07/SWD/RDFa#
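
The prefixed names used in the tables above abbreviate full URIs. A minimal sketch of the expansion, using the bindings of Table 4 (the function name `expand` is hypothetical):

```python
# Prefix bindings copied from Table 4.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfat": "http://www.w3.org/2006/07/SWD/RDFa#",
}


def expand(prefixed_name):
    """Expand a name such as 'rdfat:RDFGraph' into a full URI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in NAMESPACES:
        raise ValueError("unknown or missing prefix in %r" % prefixed_name)
    return NAMESPACES[prefix] + local_name


print(expand("rdfat:RDFGraph"))
# prints http://www.w3.org/2006/07/SWD/RDFa#RDFGraph
```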

4. RDF Coverage (RiR)

This section provides an informal description of the subset of RDF (cf. [RDF Concepts]) that is supported by the RDFa specification. As per the decision of the Semantic Web Deployment Working Group [SWD-WG] at the Boston F2F meeting on 2007-01-23, RDFa is not required to support every feature of RDF/XML.

The following is a list of RDF features covered by RDFa (RiR), based on the RDF Semantics document [RDF Semantics], along with the proposed levels in parentheses. Level 1 denotes the Core of RDFa, Level 2 the full implementation.

  1. URIRef (1)
  2. bNodes (1)
  3. Literals (1)
  4. Datatypes (1)
  5. Containers (2)
  6. Collections (2)
  7. Reification (2)
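
The proposed levels can be read as cumulative: a Level 2 implementation covers everything in Level 1 plus the additional features. A small Python sketch of that reading follows; the cumulative interpretation and the names `FEATURE_LEVELS` and `features_up_to` are assumptions made for illustration.

```python
# Feature -> proposed level, copied from the list above.
FEATURE_LEVELS = {
    "URIRef": 1,
    "bNodes": 1,
    "Literals": 1,
    "Datatypes": 1,
    "Containers": 2,
    "Collections": 2,
    "Reification": 2,
}


def features_up_to(level):
    """Features an implementation of the given level would have to support."""
    return [name for name, lvl in FEATURE_LEVELS.items() if lvl <= level]


print(features_up_to(1))
# prints ['URIRef', 'bNodes', 'Literals', 'Datatypes']
```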

TBD: Further elaborate on features and if it makes sense to introduce a subdivision by levels for RDFa.

5. Relation to Other Approaches

This section gives an informal description of how RDFa relates to other approaches that aim at deploying RDF metadata in (X)HTML content.

5.1 RDFa and GRDDL

Gleaning Resource Descriptions from Dialects of Languages (GRDDL) [GRDDL] is a mechanism for extracting RDF from XML dialects. Since XHTML is one of the XML dialects that GRDDL can process, the following question arises: what is the relationship between RDFa and GRDDL? This section aims to answer that question.

GRDDL as an RDFa Parser

One valid use case is to use GRDDL to extract RDF from an XHTML+RDFa document. Effectively, GRDDL acts as the RDFa parser in this case. The schema document for XHTML+RDFa can (and likely will) contain a namespace-level GRDDL transformation. Such GRDDL transformations, indicated in the namespace document, are meant to apply to instances of documents that reference the namespace document. Thus, a GRDDL agent will find an XHTML+RDFa document, follow its namespace pointer, find the namespace-level transformation, and apply it to the XHTML+RDFa document to extract RDF.
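
The four steps in the last sentence above can be sketched as follows. This is a simulation only: no network access or XSLT processing takes place, and `WEB`, `TRANSFORMATIONS`, `rdfa_to_rdf`, `glean` and all example.org URIs are hypothetical stand-ins rather than part of GRDDL.

```python
# Hypothetical in-memory stand-in for the web: URI -> what a retrieval would reveal.
WEB = {
    "http://example.org/page.xhtml": {
        "namespace": "http://example.org/ns/xhtml-rdfa",
        "content": "<html>(XHTML+RDFa markup)</html>",
    },
    "http://example.org/ns/xhtml-rdfa": {
        "namespace_transformation": "http://example.org/transform/rdfa-to-rdf",
    },
}


def rdfa_to_rdf(content):
    """Placeholder for the real transformation (typically an XSLT stylesheet)."""
    return ["(triples extracted from %d characters of markup)" % len(content)]


# Hypothetical registry: transformation URI -> callable implementing it.
TRANSFORMATIONS = {"http://example.org/transform/rdfa-to-rdf": rdfa_to_rdf}


def glean(document_uri):
    document = WEB[document_uri]                  # 1. find the XHTML+RDFa document
    namespace_doc = WEB[document["namespace"]]    # 2. follow its namespace pointer
    # 3. find the namespace-level transformation
    transformation_uri = namespace_doc["namespace_transformation"]
    transform = TRANSFORMATIONS[transformation_uri]
    return transform(document["content"])         # 4. apply it to the document


print(glean("http://example.org/page.xhtml"))
```

The point of the indirection is that the document itself never names the transformation; every document that references the same namespace document picks it up automatically.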

It is important to note that, while such a mechanism extracts RDF correctly from an RDFa document, it may lose some of the features of RDFa. Specifically, the binding of triples to specific rendered HTML regions is lost. In other words, a GRDDL approach to parsing RDFa is quite reasonable when machines, and only machines, will ever deal with the structured data from that point on. If it is desirable for humans to be involved in selecting and accessing this structured data, it may be best to use a native RDFa parser that maintains the DOM-to-RDF correspondence.
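
The difference can be illustrated with a toy extractor that keeps, for each triple, a reference to the element that produced it. As before, this is a simplified sketch and not the RDFa processing rules: only `about` and `property` are handled, and `DOC`, `extract_with_origin` and the example.org URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the id attribute makes the originating element easy to spot.
DOC = """<div xmlns="http://www.w3.org/1999/xhtml" about="http://example.org/photo">
  <span id="t" property="http://purl.org/dc/elements/1.1/title">Sunset</span>
</div>"""


def extract_with_origin(xhtml_text):
    """Return (triple, element) pairs so each triple stays tied to its markup."""
    pairs = []

    def walk(element, subject):
        subject = element.get("about", subject)
        predicate = element.get("property")
        if predicate is not None:
            literal = "".join(element.itertext()).strip()
            pairs.append(((subject, predicate, literal), element))
        for child in element:
            walk(child, subject)

    walk(ET.fromstring(xhtml_text), None)
    return pairs


pairs = extract_with_origin(DOC)

# A native parser can hand on both parts, e.g. to highlight the rendered region.
for triple, element in pairs:
    print(triple, "came from element with id", element.get("id"))

# A transformation to RDF/XML effectively keeps only this part.
triples_only = [triple for triple, _ in pairs]
```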

GRDDL for RDFa on Other XHTML Documents

Note that RDFa is not currently defined for any existing version of XHTML/HTML. A document-level transformation, rather than a namespace-level transformation, can be used to indicate the presence of RDFa statements in such existing XHTML/HTML documents, e.g. an XHTML 1.0 or HTML 4.01 document. Such documents will likely not validate because of the extra RDFa attributes, but they can still be processed by GRDDL, using the GRDDL XHTML Profile or the GRDDL HTML Profile.
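
In GRDDL, a document-level transformation on an XHTML document is announced by the GRDDL profile URI in the `profile` attribute of `head` together with a link whose `rel` value is `transformation`. The sketch below detects such declarations; it looks only at `link` elements inside `head`, and the stylesheet URI, `DOC` and `document_transformations` are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical document; the transformation URI is a placeholder.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/rdfa2rdfxml.xsl" />
  </head>
  <body><p about="http://example.org/x">Some text</p></body>
</html>"""


def document_transformations(xhtml_text):
    """Return the transformation URIs declared at document level, if any."""
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML + "head")
    # Without the GRDDL profile, rel="transformation" carries no GRDDL meaning.
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    return [
        link.get("href")
        for link in head.iter(XHTML + "link")
        if "transformation" in link.get("rel", "").split() and link.get("href")
    ]


print(document_transformations(DOC))
# prints ['http://example.org/rdfa2rdfxml.xsl']
```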

Just like the previous case, this RDF extraction is meant mostly for machine readers: the DOM-RDF tie-ins are lost by GRDDL processing. However, as there is no other way to cleanly include RDFa statements in existing versions of XHTML/HTML, this direction should not be discounted. It is even possible that, by detecting this GRDDL RDFa transformation on a given document, RDFa native parsers would be able to provide the DOM-RDF correspondence features of RDFa on XHTML/HTML documents that contain the appropriate GRDDL/RDFa declaration.

5.2 RDFa and Microformats

Besides RDFa, there are other HTML/XHTML approaches to embedding structured data. Microformats are the preeminent example. Unfortunately, microformat syntax varies from one application domain to another: it would be quite useful to transform microformats into a generic syntax and structural approach, like RDF, while maintaining the DOM-RDF correspondence, like RDFa. GRDDL is already being used to transform HTML+microformats into RDF/XML.

One way to transform microformats to RDFa is hGRDDL, a GRDDL-like feature. Ideally, an XHTML+microformat document would contain an hGRDDL profile which would trigger a GRDDL-like transform from XHTML+microformat to XHTML+RDFa. All of the structure and the DOM-to-data-structure correspondence from microformats would be preserved in the RDFa, allowing RDFa to become a "big umbrella" of structured data in HTML: eRDF, microformats, and custom-designed structures can all feed into the RDFa parser pipeline.
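
The idea of rewriting microformat markup into RDFa markup in place, so that the DOM structure survives, can be sketched as follows. This is not hGRDDL itself, whose details are not given here: the class-to-property mapping, the example.org vocabulary URIs, `DOC` and `microformat_to_rdfa` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from microformat class names to property URIs.
CLASS_TO_PROPERTY = {
    "fn": "http://example.org/vocab#fn",
    "org": "http://example.org/vocab#org",
}

# Hypothetical hCard-style fragment.
DOC = '<div class="vcard"><span class="fn">Jane Doe</span>, <span class="org">Example Corp</span></div>'


def microformat_to_rdfa(markup):
    """Annotate elements in place; no element is moved, added or removed."""
    root = ET.fromstring(markup)
    for element in root.iter():
        for name in element.get("class", "").split():
            if name in CLASS_TO_PROPERTY:
                element.set("property", CLASS_TO_PROPERTY[name])
    return root


print(ET.tostring(microformat_to_rdfa(DOC), encoding="unicode"))
```

Because the transform only adds attributes, every resulting RDFa statement still sits on the element that carried the original microformat class, which is what allows the DOM-to-data correspondence to be kept.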

6. References

[GRDDL]
Gleaning Resource Descriptions from Dialects of Languages (GRDDL), Dan Connolly, Editor, W3C Working Draft (work in progress), 2 March 2007, http://www.w3.org/TR/2007/WD-grddl-20070302/. Latest version available at http://www.w3.org/TR/grddl/.
[HTML-WG]
HTML Working Group, see http://w3.org/MarkUp/Group/.
[SWD-WG]
Semantic Web Deployment Working Group, see http://www.w3.org/2006/07/SWD/.
[RDFHTML-TF]
RDF-in-HTML Task Force, see http://w3.org/2001/sw/BestPractices/HTML/.
[RDF/XML Syntax]
RDF/XML Syntax Specification (Revised), Dave Beckett, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/. Latest version available at http://www.w3.org/TR/rdf-syntax-grammar/.
[RDF Concepts]
Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne and Jeremy J. Carroll, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. Latest version available at http://www.w3.org/TR/rdf-concepts/.
[RDF Schema]
RDF Vocabulary Description Language 1.0: RDF Schema, Dan Brickley and R. V. Guha, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-schema-20040210/. Latest version available at http://www.w3.org/TR/rdf-schema/.
[RDF Semantics]
RDF Semantics, Patrick Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/. Latest version available at http://www.w3.org/TR/rdf-mt/.
[XHTML 1.1]
XHTML 1.1 - Module-based XHTML, Murray Altheim and Shane McCarron, Editors, ??? http://www.w3.org/TR/xhtml11/.

7. Acknowledgements

This document is the result of discussions within the RDF-in-HTML Task Force. The editors would like to thank Tim Boland (NIST) and Karl Dubost (W3C) for their helpful comments and their support w.r.t. Quality Assurance issues.