Copyright © 2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This document introduces the RDF/A syntax for expressing RDF metadata within XHTML. The reader is expected to be fairly familiar with XHTML, and somewhat familiar with RDF.
This is an internal draft produced by the RDF-in-HTML task force [RDFHTML], a joint task force of the Semantic Web Best Practices and Deployment Working Group [SWBPD-WG] and HTML Working Group [HTML-WG].
This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.
1 Purpose of RDF/A and Preliminaries
2 A Scenario: The Shutr Photo Management System
3 Simple Metadata
3.1 Literal Properties
3.2 URI Properties
4 Beyond the Current Document
4.1 Qualifying Other Documents
4.2 Inheriting about
4.3 Qualifying Chunks of Documents
4.4 Compact URIs (CURIEs)
4.4.1 Mixing CURIEs and URIs
4.4.2 Which Attributes are Which?
4.4.3 Back to Shutr
5 Bibliography
RDF/A is a set of attributes used to embed RDF in XHTML. An important goal of RDF/A is to achieve this RDF embedding without repeating existing XHTML content when that content is the metadata. Though RDF/A was initially designed for XHTML2, one should be able to use RDF/A with other XML dialects, e.g. XHTML1, SVG, given proper schema additions.
We note that RDF/A makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions.
Consider a (fictional) photo management web site called Shutr, whose web site is http://shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.
The primary interface to Shutr is its web site and the XHTML it delivers. Since photos are contributed by users with a significant amount of built-in metadata (camera type, exposure, etc.) and additional, explicitly provided metadata (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this rich metadata.
We explore how Shutr might use RDF/A to express this RDF metadata right in the XHTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to URI http://shutr.net/rdf/shutr#.
The simplest structured metadata Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific metadata for now, as that involves RDF statements about an image, which is not an XHTML document. We will, of course, get back to this soon.)
A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a metadata property.
Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album "Vacation in the South of France." This photo album resides at http://shutr.net/user/markb/album/12345. The XHTML document presented upon request of that URI includes the following XHTML snippet:
<h1>Photo Album #12345: Vacation in the South of France</h1>
<h2>created by Mark Birbeck</h2>
Notice how the rendered XHTML contains elements of the photo album's structured metadata. Using RDF/A, Shutr can mark up this XHTML to indicate these structured metadata properties without repeating the raw data:
<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
An RDF/A-aware browser would thus extract the following RDF triples:
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)
One might wonder, given the above example, if the span element is required to attach RDF properties to rendered content. In fact, it is not: the property attribute can be used on any XHTML element. For example, if the original HTML did not include the explicit words "Photo Album #12345":
<h1>Vacation in the South of France</h1>
<h2>created by Mark Birbeck</h2>
Then the RDF/A might look like this:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
and would yield the same RDF triples, of course.
A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDF/A, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:
<h1>Vacation in the South of France</h1>
<h2>created by Mark Birbeck on 2006-01-02</h2>
A precise way to augment this HTML with RDF/A is:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" type="xsd:date">2006-01-02</span></h2>
which would yield the following triples (note that the default datatype is XMLLiteral, which explains the first example above):
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
<> dc:date "2006-01-02"^^xsd:date .
Going further, Shutr realizes that 2006-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDF/A:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" type="xsd:date" content="2006-01-02">
    January 2nd, 2006
  </span>
</h2>
The above XHTML will render the date as "January 2nd, 2006" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the metadata.
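The extraction rules described so far can be sketched in a few lines of Python. This is only an illustration, not a conforming RDF/A processor: the function name literal_triples is hypothetical, the subject is fixed to the current document, and nested markup inside an XMLLiteral is flattened to its text for simplicity.

```python
import xml.etree.ElementTree as ET

# The snippet from the example above, wrapped in a single root element
# so that it parses as well-formed XML.
XHTML = """
<div>
  <h1 property="dc:title">Vacation in the South of France</h1>
  <h2>created by <span property="dc:creator">Mark Birbeck</span>
    on <span property="dc:date" type="xsd:date" content="2006-01-02">
      January 2nd, 2006
    </span>
  </h2>
</div>
"""


def literal_triples(xhtml, subject="<>"):
    """Return (subject, predicate, value, datatype) for each property attribute."""
    triples = []
    for element in ET.fromstring(xhtml).iter():
        predicate = element.get("property")
        if predicate is None:
            continue
        # The content attribute, when present, overrides the rendered text.
        value = element.get("content")
        if value is None:
            value = "".join(element.itertext()).strip()
        # Without an explicit type attribute, the datatype is XMLLiteral.
        datatype = element.get("type", "XMLLiteral")
        triples.append((subject, predicate, value, datatype))
    return triples


for s, p, o, dt in literal_triples(XHTML):
    print(f'{s} {p} "{o}"^^{dt} .')
```

Run on the snippet above, the sketch prints the three triples already listed for this example, with the date taken from the content attribute rather than from the rendered text.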
A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another XHTML document, all reachable via the web.
As Mark Birbeck uploads many photo albums to Shutr, the site decides to build a user-profile page for him, a page that summarizes all of his albums and user profile information for others to see. This profile lives at http://shutr.net/user/markb. Thus, the dc:creator property should probably reference this URI. At the same time, Mark's name on the Shutr site should consistently link to this same URI in a clickable fashion.
The raw XHTML snippet might look like:
<h2>created by <a href="/user/markb">Mark Birbeck</a></h2>
Using the rel attribute, one can easily update this HTML to include an RDF/A statement:
<h2>created by <a rel="dc:creator" href="/user/markb">Mark Birbeck</a></h2>
This would then yield the expected triple:
<> dc:creator </user/markb> .
Similarly, Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following XHTML snippet (currently -- January 2006 -- recommended by Creative Commons):
This document is licensed under a <a href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
This clickable link has an intended semantic meaning: it is the document's license. Using RDF/A can cement that meaning within the XHTML itself:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDF/A yields the following triple:
<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
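As a rough illustration of the rel/href pairing, the following Python sketch collects one triple per element that carries both attributes. The function name uri_triples is hypothetical, the subject is assumed to be the current document, and the two links from this section are wrapped in a single root element so that the snippet parses.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<div>
  <h2>created by <a rel="dc:creator" href="/user/markb">Mark Birbeck</a></h2>
  <p>This document is licensed under a
    <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </p>
</div>
"""


def uri_triples(xhtml, subject="<>"):
    """Return (subject, predicate, object) for each element with rel and href."""
    triples = []
    for element in ET.fromstring(xhtml).iter():
        rel = element.get("rel")
        href = element.get("href")
        # rel names the predicate; href on the same element is the URI object.
        if rel is not None and href is not None:
            triples.append((subject, rel, f"<{href}>"))
    return triples


for s, p, o in uri_triples(SNIPPET):
    print(f"{s} {p} {o} .")
```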
Compared with other existing RDF mechanisms to indicate Creative Commons licensing (e.g. a parallel RDF/XML file or inline RDF/XML within XHTML comments), the RDF/A approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the XHTML will carry through both the rendered and semantic data.
In both cases, the target URI may provide an XHTML document which includes further RDF/A statements. The Creative Commons license page, for example, may include RDF/A statements about its legal details.
The above examples casually swept under the rug the issue of the RDF subject: all the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given XHTML2 document will be about that document itself. In RDF/A, the default subject is the current document, but it can easily be overridden using the about attribute.
Shutr may choose to present many photos in a given XHTML page. In particular, at the URI http://shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Metadata about each photo can be included simply by specifying an about attribute:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
  </li>
  <li>
    <img src="/user/markb/photo/34567" />,
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
  </li>
</ul>
The above RDF/A yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
    taken by photographer
    <a about="/user/markb/photo/23456" rel="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a about="/user/markb/photo/23456" rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
  <li>
    <img src="/user/markb/photo/34567" />
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
    taken by photographer
    <a about="/user/markb/photo/34567" rel="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a about="/user/markb/photo/34567" rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>
This yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/23456> dc:creator </user/markb> .
</user/markb/photo/23456> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
</user/markb/photo/34567> dc:creator </user/stevenp> .
</user/markb/photo/34567> cc:license <http://creativecommons.org/licenses/by/2.5/> .
4.2 Inheriting about
At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDF/A allows the value of this attribute to be inherited from a parent element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDF/A browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about, or until it gets to the root element, at which point the default is about="".
Thus, the markup for the above example can be simplified to:
<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    <span property="dc:title">
      Sunset in Nice
    </span>,
    taken by photographer
    <a rel="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
  <li about="/user/markb/photo/34567">
    <img src="/user/markb/photo/34567" />
    <span property="dc:title">
      W3C Meeting in Mandelieu
    </span>,
    taken by photographer
    <a rel="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>
    licensed under a
    <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>
which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral ;
    dc:creator </user/markb> ;
    cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ;
    dc:creator </user/stevenp> ;
    cc:license <http://creativecommons.org/licenses/by/2.5/> .
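The subject lookup described in this section (walk up the parent hierarchy until an about attribute is found, otherwise fall back to about="") might be sketched as follows. The helper names subject_of and triples are hypothetical, datatypes are omitted, and the sample markup is a shortened variant of the photo list with an album title added to show the default subject.

```python
import xml.etree.ElementTree as ET

PAGE = """
<div>
  <h1 property="dc:title">Vacation in the South of France</h1>
  <ul>
    <li about="/user/markb/photo/23456">
      <span property="dc:title"> Sunset in Nice </span>,
      taken by photographer
      <a rel="dc:creator" href="/user/markb"> Mark Birbeck </a>
    </li>
  </ul>
</div>
"""


def subject_of(element, parents):
    """Return the nearest about value on the element or its ancestors."""
    node = element
    while node is not None:
        about = node.get("about")
        if about is not None:
            return about
        node = parents.get(node)  # None once we step past the root
    return ""  # default subject: the current document


def triples(xhtml):
    root = ET.fromstring(xhtml)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    result = []
    for element in root.iter():
        subject = subject_of(element, parents)
        if element.get("property") is not None:
            text = "".join(element.itertext()).strip()
            result.append((subject, element.get("property"), text))
        if element.get("rel") is not None and element.get("href") is not None:
            result.append((subject, element.get("rel"), element.get("href")))
    return result


for triple in triples(PAGE):
    print(triple)
```

The title on the h1 element has no about on any ancestor, so its subject is the empty string (the current document), while both statements inside the li element inherit the photo's URI.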
While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDF/A provides ways to make metadata statements about chunks of documents using natural XHTML constructs.
Consider the page http://shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:
<ul>
  <li id="nikon_d200">
    Nikon D200, purchased on 2004-06-01.
  </li>
  <li id="canon_sd550">
    Canon Powershot SD550, purchased on 2005-08-01.
  </li>
</ul>
and the photo page will then include information about which camera was used to take each photo:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />
    ...
    using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
  ...
</ul>
The RDF/A syntax for formally specifying the relationship is exactly the same as before, as expected:
<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    ...
    using the <a rel="shutr:takenWith" href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
  ...
</ul>
which generates the triple:
</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200> .
Then, the XHTML snippet at http://shutr.net/user/markb/cameras is:
<ul>
  <li id="nikon_d200" about="#nikon_d200">
    <span property="dc:title" type="xsd:string">
      Nikon D200
    </span>
    purchased on
    <span property="dc:date" type="xsd:date">
      2004-06-01
    </span>
  </li>
  <li id="canon_sd550" about="#canon_sd550">
    <span property="dc:title" type="xsd:string">
      Canon Powershot SD550
    </span>
    purchased on
    <span property="dc:date" type="xsd:date">
      2005-08-01
    </span>
  </li>
</ul>
which then yields the following triples:
<#nikon_d200> dc:title "Nikon D200"^^xsd:string ;
    dc:date "2004-06-01"^^xsd:date .
<#canon_sd550> dc:title "Canon Powershot SD550"^^xsd:string ;
    dc:date "2005-08-01"^^xsd:date .
One immediately wonders whether the redundancy between the about and id attributes can be simplified. Partly for this purpose, RDF/A includes the elements link and meta, which behave in a special way: they only apply to their immediate parent element, even if an ancestor element bears an alternate about attribute.
<ul>
  <li id="nikon_d200">
    <meta property="dc:title" type="xsd:string">
      Nikon D200
    </meta>
    purchased on
    <meta property="dc:date" type="xsd:date">
      2004-06-01
    </meta>
  </li>
  <li id="canon_sd550">
    <meta property="dc:title" type="xsd:string">
      Canon Powershot SD550
    </meta>
    purchased on
    <meta property="dc:date" type="xsd:date">
      2005-08-01
    </meta>
  </li>
</ul>
One might now wonder how meta and link behave when their parent element doesn't have an id or about attribute. The result of such syntax is an RDF bnode, an advanced topic which we skip in this Primer.
For Shutr, as for many other web publishers, the introduction of RDF/A attributes tends to increase the size of the XHTML noticeably, sometimes unnecessarily so: there is significant data duplication with full expression of URIs. We have already shown how judicious use of the about attribute can reduce the number of times an RDF subject is expressed. We have also shown how the use of link and meta elements can further reduce the use of the about attribute when attaching metadata to particular XHTML chunks.
We now address URI duplication, RDF/A's most significant data duplication issue, with Compact URIs, known as CURIEs. A CURIE, e.g. dc:title, is composed of a prefix, e.g. dc, followed by a colon, followed by a suffix, e.g. title. The compact URI is resolved by looking up the namespace URI bound to the prefix and appending the suffix to it.
Note that QNames used for RDF properties are valid CURIEs, and resolve in exactly the same way. Thus dc:title and cc:license resolve as expected when dc and cc are correctly defined namespaces.
The differences to note between CURIEs and QNames are:
The prefix of a CURIE may be empty, e.g. :next, in which case the base URI defaults to the default XML namespace, which is usually xhtml2 in our case.
A CURIE may use _ as a prefix when referencing bnodes. More on this in the Advanced section.
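CURIE resolution can be illustrated with a short Python sketch. It assumes that resolving a CURIE means concatenating the namespace URI bound to the prefix with the suffix, as the examples in this Primer suggest. The function name resolve_curie is hypothetical, the default namespace http://example.org/default# is a placeholder rather than the real XHTML2 namespace, and the bnode prefix _ is not handled.

```python
# Prefix bindings used for illustration. The shutr and cclic bindings come
# from this Primer; the dc binding is the Dublin Core elements namespace.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "shutr": "http://shutr.net/rdf/shutr#",
    "cclic": "http://creativecommons.org/licenses/",
}

# Placeholder for the default XML namespace (an assumption for this sketch).
DEFAULT_NAMESPACE = "http://example.org/default#"


def resolve_curie(curie, namespaces, default_namespace=DEFAULT_NAMESPACE):
    """Expand prefix:suffix into a full URI by concatenation."""
    prefix, colon, suffix = curie.partition(":")
    if not colon:
        raise ValueError(f"not a CURIE (no colon): {curie!r}")
    if prefix == "":
        # Empty prefix, e.g. ":next": fall back to the default namespace.
        return default_namespace + suffix
    if prefix not in namespaces:
        raise KeyError(f"undefined prefix: {prefix!r}")
    return namespaces[prefix] + suffix


print(resolve_curie("dc:title", NAMESPACES))
print(resolve_curie("cclic:by/2.5/", NAMESPACES))
print(resolve_curie(":next", NAMESPACES))
```

Note that the suffix by/2.5/ would not be a legal QName local part, which is one practical reason to treat CURIEs as a superset of QNames in a sketch like this.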
One of the most important applications of CURIEs in RDF/A is the use of a CURIE/URI attribute, where either a normal URI or a CURIE can be used interchangeably. In order to differentiate between the two types, square brackets [] are used around a CURIE, whereas a URI is written normally.
For example, if Shutr wants to reference the Creative Commons license http://creativecommons.org/licenses/by/2.5/ in an attribute that accepts both CURIEs and URIs, it can use either:
... attr="http://creativecommons.org/licenses/by/2.5/" ...
or, assuming the namespace cclicenses has been properly defined:
... attr="[cclicenses:by/2.5/]" ...
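A processor reading such a mixed attribute has to decide which form it was given. The Python sketch below shows one way to do this; the function name resolve_mixed is hypothetical, and the binding of cclicenses to http://creativecommons.org/licenses/ is an assumption inferred from the two equivalent forms above.

```python
# Assumed binding for the cclicenses prefix (inferred, not stated in the text).
NAMESPACES = {"cclicenses": "http://creativecommons.org/licenses/"}


def resolve_mixed(value, namespaces):
    """Resolve an attribute value that may hold a URI or a bracketed CURIE."""
    if value.startswith("[") and value.endswith("]"):
        # Square brackets mark a CURIE: strip them, then expand the prefix.
        prefix, _, suffix = value[1:-1].partition(":")
        return namespaces[prefix] + suffix
    # Anything else is taken to be an ordinary URI and returned unchanged.
    return value


plain = resolve_mixed("http://creativecommons.org/licenses/by/2.5/", NAMESPACES)
compact = resolve_mixed("[cclicenses:by/2.5/]", NAMESPACES)
print(plain == compact)
```

Under the assumed binding, both spellings resolve to the same URI, so the comparison prints True.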
In RDF/A, the attributes property, rel, and rev are all CURIE-only, which ensures backwards compatibility with past uses of rel, e.g. rel="next". The about and href attributes, on the other hand, accept mixed CURIE/URI datatypes. This ensures compatibility with browsers that expect clickability for the href, and consistency between subject and object.
Thus, getting back to Shutr's photo list:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    Sunset in Nice,
    taken by
    <a href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons License
    </a>.
  </li>
  <li>
    <img src="/user/markb/photo/34567" />,
    W3C Meeting in Mandelieu
    taken by
    <a href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
</ul>
adding metadata to these photos with CURIEs can save significant space (over the non-CURIE use) as soon as there are a number of photos in the list:
<ul xmlns:cclic="http://creativecommons.org/licenses/"
    xmlns:photos="/user/markb/photo/">
  <li about="[photos:23456]">
    <img src="/user/markb/photo/23456" />,
    <span property="dc:title">
      Sunset in Nice
    </span>,
    taken by
    <a rel="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a rel="cc:license" href="[cclic:by/2.5/]">
      Creative Commons License
    </a>.
  </li>
  <li about="[photos:34567]">
    <img src="/user/markb/photo/34567" />,
    <span property="dc:title">
      W3C Meeting in Mandelieu
    </span>
    taken by
    <a rel="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a rel="cc:license" href="[cclic:by-nc/2.5/]">
      Creative Commons Non-Commercial License
    </a>.
  </li>
</ul>
Of course, this assumes a browser that can parse CURIEs for clickable links. Initially, complete URIs may be preferable in the href attribute.