Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa is a syntax for expressing this structured data in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.
This document is an introduction to RDFa. A more detailed syntax specification is being produced.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is joint work of the W3C Semantic Web Deployment Working Group [SWD-WG] and the former W3C HTML Working Group, now called the W3C XHTML2 Working Group [XHTML2-WG]. This work is part of both the W3C Semantic Web Activity and the HTML Activity. The two Working Groups expect to advance this work to Recommendation Status.
Comments on this Working Draft are welcome and may be sent to public-rdf-in-xhtml-tf@w3.org; please include the text "comment" in the subject line. All messages received at this address are viewable in a public archive.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
Changes since the previous version include:
The role attribute is no longer used to declare rdf:type, as this was rushed out before the group agreed on the syntax. We now use the class attribute.
The use of link and meta in the body is no longer mentioned in this Primer, because it may confuse readers working with XHTML 1.1 and earlier.
1 Purpose of RDFa and Preliminaries
1.1 Audience
2 A First Scenario: Publishing Events and Contacts
2.1 The Basic HTML
2.2 Publishing An Event
2.3 Publishing Contact Information
2.4 The Complete HTML with RDFa
2.5 RDFa with Limited HTML control
2.6 The RDF Triples
3 A Second Scenario: Publishing Photos
3.1 The Shutr Photo Management System
3.2 Literal Properties
3.3 URI Properties
4 Beyond the Current Document
4.1 Qualifying Other Documents
4.2 Inheriting about
4.3 Qualifying Chunks of Documents
5 More Complex Structured Data: Social Networking with FOAF
5.1 Two Layers of Structured Data
5.2 Additional Layers
5.3 The RDF Triples
5.4 Naming the Nodes
6 Bibliography
7 Acknowledgments
Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa is a syntax that expresses this structured data using a set of elements and attributes that embed RDF in HTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing HTML content when that content is the structured data. RDFa is designed to work with different XML dialects, e.g. XHTML1, SVG, etc., given proper schema additions. In addition, RDFa is defined so as to be compatible with non-XML HTML.
An XHTML document marked up with RDFa constructs should validate, and a non-XML HTML document marked up with RDFa remains compliant. RDFa uses existing HTML constructs and HTML-compatible extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into HTML documents.
We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions:
dc: http://purl.org/dc/elements/1.1/
foaf: http://xmlns.com/foaf/0.1/
cc: http://web.resource.org/cc/
xsd: http://www.w3.org/2001/XMLSchema
The audience for this document should have a working knowledge of XHTML. Some familiarity with RDF is useful, though the basics can be picked up from reading this Primer. Similarly, the basic XML concepts used in this work, in particular namespaces, can be picked up from reading this Primer.
Jo blogs about her work, which involves web development. Jo has an upcoming talk at the XTech Conference, on May 8th at 10am, where she will be discussing "web widgets". She blogs an announcement of her talk at http://jo-blog.example.org/. Her blog also includes her contact information (Jo has a fantastic spam filter, so she is unafraid of publishing her email address):
<html>
  <head><title>Jo's Blog</title></head>
  <body>
    ...
    <p>
      I'm giving a talk at the XTech Conference about web widgets,
      on May 8th at 10am.
    </p>
    ...
    <p class="contactinfo">
      My name is Jo Smith. I'm a distinguished web engineer at
      <a href="http://example.org">Example.org</a>.
      You can contact me <a href="mailto:jo@example.org">via email</a>.
    </p>
    ...
  </body>
</html>
This short piece of mark-up is already full of structured data.
The markup describes an event: a talk that Jo is giving. This event starts at 10am on May 8th. A summary of the event is "a talk at XTech 2007 on web widgets." We also have contact information for Jo: she works for the organization Example.org, with job title of "Distinguished Web Engineer." She can be contacted at the email address "jo@example.org."
At the moment, it is very difficult for software like web browsers and search engines to make use of this implicit data. We need a standard mechanism to explicitly express it. This is precisely where RDFa comes in.
Jo would like to add some structure to this blog entry so that readers of her blog might be able to add her talk directly to their calendar. RDFa allows her to do just that, using extra attributes. Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the data's structure.
The first step is to reference the iCal vocabulary within the HTML page, so that a parser may know where to look up the vocabulary terms:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"> ...
Then, Jo declares a new event and gives it a name:
<p class="cal:Vevent" id="xtech_conference_talk"> ... </p>
Note how the class attribute is used here to define the type of the data being expressed, exactly as this attribute was initially intended in HTML. (If Jo wanted to declare multiple types, she could include more than one value in the class attribute, separated by spaces.) Now, Jo wants to make sure that all information within this p describes the event itself. She adds an additional attribute:
<p class="cal:Vevent" id="xtech_conference_talk" about="#xtech_conference_talk"> ... </p>
Then, inside this event declaration, Jo can set up the event fields, reusing the existing HTML. For example, the event summary can be declared as:
I'm giving <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
The property attribute on the span element declares a data field that pertains to the closest declared about, in this case the Vevent declared in the p. Note how the existing rendered content, "a talk at the XTech Conference about web widgets", is the value of this field. Sometimes, this isn't the right thing. Specifically, the start time of the event should be rendered nicely ("May 8th at 10am") but should be represented in an easily machine-parsable form, the standard iCal format: 20070508T1000+0200. In this case, the markup needs only a slight modification:
<span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>
In this case, the actual content of the span element, "May 8th at 10am", is ignored for structured data purposes: it has been replaced by the explicit content attribute. The full markup is then:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
  <head><title>Jo's Blog</title></head>
  <body>
    ...
    <p class="cal:Vevent" about="#xtech_conference_talk">
      I'm giving
      <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
      on
      <span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>.
    </p>
    ...
  </body>
</html>
The above markup can now be interpreted as a set of RDF triples, the details of which we explain in Section 2.6 The RDF Triples.
Note that Jo could have used any other HTML element, not just span, to carry the structure of her data. In other words, when the structure of the data is already laid out in the HTML using elements such as h1, em, div, etc., Jo can simply add the property attribute, and optionally the content attribute, to indicate the specific structure.
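To make the interplay of property and content concrete, here is a minimal Python sketch of how a tool might collect the literal fields from Jo's event paragraph. It is an illustration only, not a conforming RDFa parser: it assumes the markup is well-formed XML, it collapses whitespace in rendered text, and the names EVENT, rendered_text and literal_properties are our own.

```python
from xml.dom import minidom

# Jo's event paragraph, as marked up in this section.
EVENT = """<p class="cal:Vevent" about="#xtech_conference_talk">
  I'm giving <span property="cal:summary">a talk at the XTech Conference
  about web widgets</span>, on
  <span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>.
</p>"""


def rendered_text(node):
    """Return the text beneath a node with whitespace collapsed (our assumption)."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(rendered_text(child))
    return " ".join("".join(parts).split())


def literal_properties(markup):
    """Map each property name to its value; content wins over rendered text."""
    doc = minidom.parseString(markup)
    found = {}
    for element in doc.getElementsByTagName("*"):
        if not element.hasAttribute("property"):
            continue
        if element.hasAttribute("content"):
            value = element.getAttribute("content")
        else:
            value = rendered_text(element)
        found[element.getAttribute("property")] = value
    return found


if __name__ == "__main__":
    for name, value in literal_properties(EVENT).items():
        print(name, "=", value)
```

The summary is taken from the rendered text, while the start time comes from the content attribute, so the human-friendly "May 8th at 10am" never reaches the structured data.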
Now that Jo has published an event using structured data, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target for structured markup with RDFa:
...
<p class="contactinfo">
  My name is Jo Smith. I'm a distinguished web engineer at
  <a href="http://example.org">Example.org</a>.
  You can contact me <a href="mailto:jo@example.org">via email</a>.
</p>
...
Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact to designate this vocabulary. Note that, although Jo already imported the iCal vocabulary, adding the vCard vocabulary is just as easy and does not interfere:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#" xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"> ...
Jo then sets up her vCard using RDFa, by deciding that the appropriate p will be her vCard. She notes, however, that the vCard schema does not require declaring a vCard type. Instead, it is recommended that a vCard refer to a web page that identifies the individual. Jo thus uses RDFa's special attribute about for just this purpose, indicating that all contained HTML pertains to Jo's designated URL. Note how the about attribute is inherited from parent elements in the HTML: the about attribute on the nearest ancestor applies to declared structured data.
... <p class="contactinfo" about="http://example.org/staff/jo"> ...everything here pertains to http://example.org/staff/jo... </p> ...
"Simple enough!" Jo realizes. She adds her first vCard fields: name, title, organization and email.
...
<p class="contactinfo" about="http://example.org/staff/jo">
  My name is <span property="contact:fn">Jo Smith</span>.
  I'm a <span property="contact:title">distinguished web engineer</span>
  at <a rel="contact:org" href="http://example.org">Example.org</a>.
  You can contact me
  <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
</p>
...
Notice how Jo was able to use the rel attribute directly within the anchor tag for designating her organization and email address. In this case, the rel indicates a relationship between the current URL, designated by about, and the target URL, designated by href. The exact meaning of this relationship is defined by the rel. In this case, contact:org indicates the relationship of "vCard organization", while contact:email indicates the relationship of "vCard email".
The rel attribute is naturally paired with the href attribute, much like the property attribute is paired with the content attribute. The astute reader will notice that we have defined what happens when a property attribute is present without a content attribute, but not what happens when a rel attribute is present without its corresponding href. We explore this feature in Section 5 More Complex Structured Data: Social Networking with FOAF.
(The above example slightly simplifies the vCard vocabulary where email is concerned, since vCard technically requires indicating the type of the email. This simplification is for clarity's sake.)
Jo's complete HTML with RDFa is thus:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
  ...
  <p class="cal:Vevent" about="#xtech_conference_talk">
    I'm giving
    <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
    on
    <span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>.
  </p>
  ...
  <p class="contactinfo" about="http://example.org/staff/jo">
    My name is <span property="contact:fn">Jo Smith</span>.
    I'm a <span property="contact:title">distinguished web engineer</span>
    at <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
  </p>
  ...
Note how, if Jo changes her email address link, her organization, or the title of her talk, the RDFa approach will automatically pick up these changes in the marked-up, structured data. The only place where this doesn't happen is where the content attribute overrides the rendered content, which is inevitable when the human-rendered data and the machine-readable data must differ.
The RDF triples generated by the above markup are detailed in Section 2.6 The RDF Triples.
What if Jo does not have complete control over the HTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the html element at the top of her page without adding them to every page on her site. Or, she may be using a web blogging provider that doesn't allow her to change the header of the page to begin with.
Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an HTML element. Jo's HTML blog page could express the exact same structured data with the following markup:
<html>
  ...
  <p class="cal:Vevent" about="#xtech_conference_talk"
     xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    I'm giving
    <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
    on
    <span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>.
  </p>
  ...
  <p class="contactinfo" about="http://example.org/staff/jo"
     xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
    My name is <span property="contact:fn">Jo Smith</span>.
    I'm a <span property="contact:title">distinguished web engineer</span>
    at <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
  </p>
  ...
Of course, just like in the case of the vocabularies defined on the top-level html tag, more than one vocabulary can be imported into any element. In this case, each p only needs one vocabulary: the first uses iCal, the second uses vCard. This approach helps with the desired ability to copy-and-paste HTML from one page to another: the closer the namespace declarations are to their relevant statements, the easier it is to copy and paste the content.
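The scoping of locally declared vocabularies can be sketched in a few lines of Python. The sketch below is our own illustration, assuming well-formed XML input and the standard xml.dom.minidom module, which keeps xmlns declarations as ordinary attributes. The names PAGE, expand_qname and expanded_properties are hypothetical helpers, not part of RDFa.

```python
from xml.dom import minidom

# A cut-down version of Jo's page with one local declaration per paragraph.
PAGE = """<html><body>
<p class="cal:Vevent" about="#xtech_conference_talk"
   xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
  <span property="cal:summary">a talk at the XTech Conference about web widgets</span>
</p>
<p class="contactinfo" about="http://example.org/staff/jo"
   xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
  <span property="contact:fn">Jo Smith</span>
</p>
</body></html>"""


def expand_qname(element, qname):
    """Expand prefix:name using the nearest enclosing xmlns:prefix declaration."""
    prefix, local_name = qname.split(":", 1)
    declaration = "xmlns:" + prefix
    node = element
    # Walk from the element up through its ancestors.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute(declaration):
            return node.getAttribute(declaration) + local_name
        node = node.parentNode
    raise KeyError("prefix not in scope: " + prefix)


def expanded_properties(markup):
    """Return the full URI of every property attribute, in document order."""
    doc = minidom.parseString(markup)
    return [
        expand_qname(element, element.getAttribute("property"))
        for element in doc.getElementsByTagName("*")
        if element.hasAttribute("property")
    ]


if __name__ == "__main__":
    for uri in expanded_properties(PAGE):
        print(uri)
```

Because the lookup stops at the enclosing p, the cal prefix is simply not in scope inside the contact paragraph, which is why each paragraph can be copied elsewhere together with its own declaration.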
RDFa is parsed to generate RDF triples, which we denote using N3 notation [N3]. URIs are written using angle brackets, e.g. <http://example.org/foo/bar>, literals are written in quotation marks, e.g. "a talk", and QNames are written directly, e.g. cal:summary.
In Section 2.2 Publishing An Event, Jo published an event. The RDF triples extracted from her markup are:
<http://jo-blog.example.org/blog/?p=123#xtech_conference_talk>
    rdf:type cal:Vevent;
    cal:summary "a talk at the XTech Conference about web widgets"^^XMLLiteral;
    cal:dtstart "20070508T1000+0200" .
In Section 2.3 Publishing Contact Information, Jo published contact information. The RDFa is parsed to generate the following RDF triples:
<http://example.org/staff/jo>
    contact:fn "Jo Smith"^^XMLLiteral;
    contact:title "distinguished web engineer"^^XMLLiteral;
    contact:org <http://example.org>;
    contact:email <mailto:jo@example.org> .
(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)
Consider a (fictional) photo management web site called Shutr, whose web site is http://www.shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.
The primary interface to Shutr is its web site and the HTML it delivers. Since photos are contributed by users with a significant amount of built-in structured data (camera type, exposure, etc.) and additional, explicitly provided data (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this structure.
We explore how Shutr might use RDFa to express RDF right in the HTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to URI http://www.shutr.net/rdf/shutr#.
The simplest structured data Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific data for now, as that involves RDF statements about an image, which is not an HTML document. We will, of course, get back to this soon.)
A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a property. In RDFa, literal properties are expressed using the property attribute and an optional content attribute.
Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album "Vacation in the South of France." This photo album resides at http://www.shutr.net/user/markb/album/12345. The HTML document presented upon request of that URI includes the following HTML snippet:
<h1>Photo Album #12345: Vacation in the South of France</h1>
<h2>created by Mark Birbeck</h2>
Notice how the rendered HTML contains elements of the photo album's structured data. Using RDFa, Shutr can mark up this HTML to indicate these structured data properties without repeating the raw data:
<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
An RDFa-aware browser would thus extract the following RDF triples:
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
When the existing HTML elements already delineate the exact structure, adding a new span element is not required. One can easily add the RDFa attributes to existing HTML elements:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
which yields the same RDF triples, of course. The use of an extra span is helpful when existing HTML markup isn't enough to isolate the rendered content that is relevant to the RDF triple.
A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDFa, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:
<h1>Vacation in the South of France</h1>
<h2>created by Mark Birbeck on 2007-01-02</h2>
A precise way to augment this HTML with RDFa is:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>
  created by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" datatype="xsd:date">2007-01-02</span>
</h2>
which would yield the following triples (note how the default datatype is XMLLiteral, which explains the first example above):
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
<> dc:date "2007-01-02"^^xsd:date .
Going further, Shutr realizes that 2007-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDFa:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>
  created by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" datatype="xsd:date" content="2007-01-02">January 2nd, 2007</span>
</h2>
The above HTML will render the date as "January 2nd, 2007" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the data.
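The rules for typed literals described in this section can be summarized in a short Python sketch. It is a simplified illustration rather than a real RDFa processor: it assumes well-formed XML, wraps the snippet in a div so that it has a single root element, leaves the subject fixed at the current document (<>), and uses our own helper names rendered_text and literal_triples.

```python
from xml.dom import minidom

# The album header from this section, wrapped in a div (our addition)
# so that the snippet parses as a single XML document.
ALBUM = """<div>
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
on <span property="dc:date" datatype="xsd:date" content="2007-01-02">
January 2nd, 2007
</span></h2>
</div>"""


def rendered_text(node):
    """Return the text beneath a node with whitespace collapsed (our assumption)."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(rendered_text(child))
    return " ".join("".join(parts).split())


def literal_triples(markup):
    """Return one N3-style line per property attribute, in document order."""
    doc = minidom.parseString(markup)
    lines = []
    for element in doc.getElementsByTagName("*"):
        if not element.hasAttribute("property"):
            continue
        # content, when present, replaces the rendered text.
        if element.hasAttribute("content"):
            value = element.getAttribute("content")
        else:
            value = rendered_text(element)
        # Without an explicit datatype, this Primer defaults to XMLLiteral.
        datatype = element.getAttribute("datatype") or "XMLLiteral"
        predicate = element.getAttribute("property")
        lines.append('<> %s "%s"^^%s .' % (predicate, value, datatype))
    return lines


if __name__ == "__main__":
    print("\n".join(literal_triples(ALBUM)))
```

Running the sketch on the markup above reproduces the three triples listed earlier in this section, with the date carried by the content attribute rather than by the rendered "January 2nd, 2007".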
If Shutr wants to indicate that a specific object, e.g. "Mark Birbeck", plays two different roles, e.g. creator and publisher, the markup can easily be updated to reflect this duality without repeating the data:
<h1 property="dc:title">Vacation in the South of France</h1>
created by <span property="dc:creator dc:publisher">Mark Birbeck</span>
In all of the above markup and triples, as well as in the rest of the document, we use the dc:creator predicate with both literals, e.g. strings, and URIs, e.g. names and "persons": both are allowed by the Dublin Core specification. We show in Section 4 Beyond the Current Document how to refer to fragments of a document, and in Section 5 More Complex Structured Data: Social Networking with FOAF how to create deeper structures to address this type of issue.
A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another HTML document, all reachable via the web. In RDFa, URI properties are expressed using the rel and href attributes. The href attribute is the well-understood target of a link, while rel indicates a relationship.
Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following HTML snippet:
This document is licensed under a <a href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
This clickable link has an intended meaning: it is the document's license. Using RDFa can cement that meaning within the HTML itself:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDFa yields the following triple:
<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
It is worth noting that the rel attribute, like property and class, supports multiple values, separated by spaces. The triples generated are the same as if each value were declared independently.
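As an illustration of multi-valued attributes, the following Python sketch expands each space-separated value of property and rel into its own triple. It is a simplified sketch under stated assumptions: the input is well-formed XML, the subject is always the current document (<>), literals default to XMLLiteral, and the pairing of cc:license with dc:rights on a single link is a hypothetical example of ours, not markup taken from Shutr.

```python
from xml.dom import minidom

# dc:creator and dc:publisher share one span; the two rel values on the
# link are a hypothetical pairing chosen only for illustration.
SNIPPET = """<div>
created by <span property="dc:creator dc:publisher">Mark Birbeck</span>,
licensed under a
<a rel="cc:license dc:rights"
   href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
</div>"""


def direct_text(element):
    """Return the text directly inside an element, stripped of outer whitespace."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def triples(markup):
    """Emit one (subject, predicate, object) tuple per attribute value."""
    doc = minidom.parseString(markup)
    found = []
    for element in doc.getElementsByTagName("*"):
        # Each value of property yields its own literal-valued triple.
        for predicate in element.getAttribute("property").split():
            literal = '"%s"^^XMLLiteral' % direct_text(element)
            found.append(("<>", predicate, literal))
        # Each value of rel yields its own URI-valued triple.
        if element.hasAttribute("href"):
            target = "<%s>" % element.getAttribute("href")
            for predicate in element.getAttribute("rel").split():
                found.append(("<>", predicate, target))
    return found


if __name__ == "__main__":
    for triple in triples(SNIPPET):
        print(" ".join(triple), ".")
```

The sketch yields four triples from two elements, exactly as if each value had been declared on a separate element.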
Compared with other existing RDF mechanisms for indicating Creative Commons licensing, e.g. a parallel RDF/XML file or inline RDF/XML within HTML comments, the RDFa approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the HTML will carry through both the rendered and semantic data.
In both cases, the target URI may provide an HTML document which includes further RDFa statements. The Creative Commons license page, for example, may include RDFa statements about its legal details.
The above examples casually swept under the rug the issue of the RDF subject: most of the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given HTML document will be about that document itself. In RDFa, the default subject is the current document, but it can easily be overridden using the about attribute, which we briefly introduced in the very first example.
Shutr may choose to present many photos in a given HTML page. In particular, at the URI http://www.shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Structured data about each photo can be included simply by specifying an about attribute:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">Sunset in Nice</span>
  </li>
  <li>
    <img src="/user/markb/photo/34567" />,
    <span about="/user/markb/photo/34567" property="dc:title">W3C Meeting in Mandelieu</span>
  </li>
</ul>
The above RDFa yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">Sunset in Nice</span>
    taken by photographer
    <a about="/user/markb/photo/23456" property="dc:creator" href="/user/markb">Mark Birbeck</a>,
    licensed under a
    <a about="/user/markb/photo/23456" rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
  <li>
    <img src="/user/markb/photo/34567" />
    <span about="/user/markb/photo/34567" property="dc:title">W3C Meeting in Mandelieu</span>
    taken by photographer
    <a about="/user/markb/photo/34567" property="dc:creator" href="/user/stevenp">Steven Pemberton</a>,
    licensed under a
    <a about="/user/markb/photo/34567" rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">Creative Commons Commercial License</a>.
  </li>
</ul>
This yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/23456> dc:creator "Mark Birbeck"^^XMLLiteral .
</user/markb/photo/23456> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
</user/markb/photo/34567> dc:creator "Steven Pemberton"^^XMLLiteral .
</user/markb/photo/34567> cc:license <http://creativecommons.org/licenses/by/2.5/> .
4.2 Inheriting about
At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDFa allows the value of this attribute to be inherited from a parent or ancestor element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDFa browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about, or until it gets to the root element, at which point the default is about="".
Thus, the markup for the above example can be simplified to:
<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    <span property="dc:title">Sunset in Nice</span>,
    taken by photographer
    <a property="dc:creator" href="/user/markb">Mark Birbeck</a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
  <li about="/user/markb/photo/34567">
    <img src="/user/markb/photo/34567" />
    <span property="dc:title">W3C Meeting in Mandelieu</span>,
    taken by photographer
    <a property="dc:creator" href="/user/stevenp">Steven Pemberton</a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">Creative Commons Commercial License</a>.
  </li>
</ul>
which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:
</user/markb/photo/23456>
    dc:title "Sunset in Nice"^^XMLLiteral ;
    dc:creator "Mark Birbeck"^^XMLLiteral ;
    cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567>
    dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ;
    dc:creator "Steven Pemberton"^^XMLLiteral ;
    cc:license <http://creativecommons.org/licenses/by/2.5/> .
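The inheritance rule for about amounts to a walk up the element tree. The Python sketch below shows one way to compute the subject of each statement. It is a rough illustration assuming well-formed XML; the wrapping div and the names ALBUM, subject_of and subjects_and_predicates are ours.

```python
from xml.dom import minidom

# An abbreviated album page: the h1 has no about on any ancestor, while
# the statements inside each li inherit the subject from that li.
ALBUM = """<div>
<h1 property="dc:title">Vacation in the South of France</h1>
<ul>
  <li about="/user/markb/photo/23456">
    <span property="dc:title">Sunset in Nice</span>, licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
  <li about="/user/markb/photo/34567">
    <span property="dc:title">W3C Meeting in Mandelieu</span>
  </li>
</ul>
</div>"""


def subject_of(element):
    """Return the nearest about value on the element or one of its ancestors."""
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("about"):
            return "<%s>" % node.getAttribute("about")
        node = node.parentNode
    # No about found up to the root: the subject is the current document.
    return "<>"


def subjects_and_predicates(markup):
    """Pair every property or rel value with its inherited subject."""
    doc = minidom.parseString(markup)
    pairs = []
    for element in doc.getElementsByTagName("*"):
        for attribute in ("property", "rel"):
            for predicate in element.getAttribute(attribute).split():
                pairs.append((subject_of(element), predicate))
    return pairs


if __name__ == "__main__":
    for subject, predicate in subjects_and_predicates(ALBUM):
        print(subject, predicate)
```

The album title falls back to the default subject, while each photo's statements pick up the about value of the enclosing li, without that URI being repeated on every element.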
While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDFa provides ways to make structured data statements about chunks of documents using natural HTML constructs.
Consider the page http://www.shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:
<ul>
  <li id="nikon_d200">
    Nikon D200, purchased on 2004-06-01.
  </li>
  <li id="canon_sd550">
    Canon Powershot SD550, purchased on 2005-08-01.
  </li>
</ul>
and the photo page will then include information about which camera was used to take each photo:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />
    ...
    using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
  ...
</ul>
The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:
<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    ...
    using the <a rel="shutr:takenWith" href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
  ...
</ul>
which corresponds to:
</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200> .
Then, the HTML snippet at http://www.shutr.net/user/markb/cameras is:
<ul>
  <li id="nikon_d200" about="#nikon_d200">
    <span property="dc:title" datatype="xsd:string">Nikon D200</span>
    purchased on
    <span property="dc:date" datatype="xsd:date">2004-06-01</span>
  </li>
  <li id="canon_sd550" about="#canon_sd550">
    <span property="dc:title" datatype="xsd:string">Canon Powershot SD550</span>
    purchased on
    <span property="dc:date" datatype="xsd:date">2005-08-01</span>
  </li>
</ul>
which then yields the following triples:
<#nikon_d200>
    dc:title "Nikon D200"^^xsd:string ;
    dc:date "2004-06-01"^^xsd:date .
<#canon_sd550>
    dc:title "Canon Powershot SD550"^^xsd:string ;
    dc:date "2005-08-01"^^xsd:date .
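The triples above use relative references, so it is worth checking that the camera named on the photo page and the camera described on the cameras page are the same RDF node. The following Python sketch resolves both references against their page URIs with the standard urllib.parse.urljoin function. We assume here that the photo markup is served from the album URI given earlier; the constant and function names are our own.

```python
from urllib.parse import urljoin

# Page URIs taken from this Primer; we assume the photo list is served
# from the album page.
ALBUM_PAGE = "http://www.shutr.net/user/markb/album/12345"
CAMERAS_PAGE = "http://www.shutr.net/user/markb/cameras"


def absolute(base, reference):
    """Resolve a relative URI reference against the URI of the page it appears in."""
    return urljoin(base, reference)


# Object of shutr:takenWith, as written in href on the album page.
camera_from_album = absolute(ALBUM_PAGE, "/user/markb/cameras#nikon_d200")

# Subject of the camera description, as written in about on the cameras page.
camera_from_cameras = absolute(CAMERAS_PAGE, "#nikon_d200")

if __name__ == "__main__":
    print(camera_from_album)
    print(camera_from_cameras)
    print(camera_from_album == camera_from_cameras)
```

Both references resolve to http://www.shutr.net/user/markb/cameras#nikon_d200, so statements made on the two pages attach to the same camera.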
If the reader wishes only to embed simple, name-value pairs into an HTML document, this section is not required reading. However, many structured datasets quickly require some additional level of depth. In this section, we consider these more complex structures. One popular RDF vocabulary is FOAF [FOAF], which provides structure for social networking and personal information. FOAF is particularly interesting to consider because it provides deeper structure than the examples provided so far: a FOAF person has an office, which has an address, which has a street, city, zip code, and country. So far, we have only explained how to define structure "one-level deep."
Consider, specifically, that Tim Berners-Lee is encouraging folks to publish a FOAF file. Let's express (a portion of) Tim's FOAF file using RDFa. We start with a portion of Tim's homepage:
<dl> <dt>Email</dt> <dd>timbl@w3.org</dd> <dt>Address</dt> <dd> 77 Massachusetts Ave.<br /> MIT Room 32-G524<br /> Cambridge MA 02139<br /> USA </dd> <dt>Phone</dt> <dd>+1 (617) 253 5702</dd> <dt>Fax:</dt> <dd>+1 (617) 258 5999</dd> </dl>
We can easily mark up the "one-layer deep" structure, specifically the email, phone, and fax fields, with the properties foaf:mbox, foaf:phone, and foaf:fax:
<dl class="foaf:Person" about="#card" id="card"> <dt>Email</dt> <dd property="foaf:mbox">timbl@w3.org</dd> ... <dt>Phone</dt> <dd property="foaf:phone">+1 (617) 253 5702</dd> <dt>Fax:</dt> <dd property="foaf:fax">+1 (617) 258 5999</dd> </dl>
Now, we need to express the address information in relation to Tim, as well as the address's properties, e.g. street address, city, state, etc. Recall that, when referencing another named resource, we've used the rel attribute. For example, to describe the licensing of a document, we've used the markup:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/"> Creative Commons License </a>
What we need to do here is describe a relationship of foaf:address between Tim and some unnamed node of data, off which we want to hang additional properties. Thus, we use the rel attribute again, this time without a corresponding href.
... <dt>Address</dt> <dd rel="foaf:address"> 77 Massachusetts Ave.<br /> MIT Room 32-G524<br /> Cambridge MA 02139<br /> USA </dd> ... </dl>
The HTML element on which this rel is expressed, in this case dd, then represents a blank node (in RDF terminology) that is the object of the foaf:address relationship. In addition, the subject of all contained RDFa statements is transparently set to be this blank node, as if there were an implicit about. This then allows the following markup to say exactly what we mean:
<dl class="foaf:Person" about="#card" id="card"> ... <dt>Address</dt> <dd rel="foaf:address"> <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br /> <span property="foaf:address_line_2">MIT Room 32-G524</span><br /> <span property="foaf:city">Cambridge</span> MA 02139<br /> <span property="foaf:country">USA</span> </dd> ... </dl>
This layering of structured data easily extends to multiple layers. Consider what Tim might do if he were to list both his home and office addresses. The properties foaf:office and foaf:home each relate a foaf:Person to a location, and each location has an address. Thus, the markup becomes, quite naturally:
<dl class="foaf:Person" about="#card" id="card"> ... <dt>Office Address</dt> <dd rel="foaf:office"> <div rel="foaf:address"> <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br /> <span property="foaf:address_line_2">MIT Room 32-G524</span><br /> <span property="foaf:city">Cambridge</span> MA 02139<br /> <span property="foaf:country">USA</span> </div> </dd> <dt>Home Address</dt> <dd rel="foaf:home"> <div rel="foaf:address"> <span property="foaf:address_line_1">1 Web Way</span><br /> <span property="foaf:city">Cambridge</span> MA 02139<br /> <span property="foaf:country">USA</span> </div> </dd> ... </dl>
Using this technique, it is relatively easy and natural to express fairly extensive and complex structured data.
When interpreted as RDF, the use of the rel attribute without a corresponding href creates a new RDF blank node. Specifically, in the first example where Tim publishes a single address, the triples are:
<#card> rdf:type foaf:Person . <#card> foaf:address _:dd0 . _:dd0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral . _:dd0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral . _:dd0 foaf:city "Cambridge"^^XMLLiteral . _:dd0 foaf:country "USA"^^XMLLiteral .
In the case of the multiple addresses, the triples become:
<#card> rdf:type foaf:Person . <#card> foaf:office _:dd0 . <#card> foaf:home _:dd1 . _:dd0 foaf:address _:div0 . _:div0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral . _:div0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral . _:div0 foaf:city "Cambridge"^^XMLLiteral . _:div0 foaf:country "USA"^^XMLLiteral . _:dd1 foaf:address _:div1 . _:div1 foaf:address_line_1 "1 Web Way"^^XMLLiteral . _:div1 foaf:city "Cambridge"^^XMLLiteral . _:div1 foaf:country "USA"^^XMLLiteral .
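To make the subject-chaining rule concrete, here is a minimal Python sketch of how a processor might derive such triples. It is an illustration under stated assumptions, not the normative RDFa processing model: the function name extract_triples and the blank node labelling scheme are invented for this example, literals are emitted as plain strings without a datatype, prefixed names are not expanded, and an id or about on the element carrying rel is ignored.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# A shortened copy of the single-address markup from the text.
MARKUP = """
<dl class="foaf:Person" about="#card" id="card">
  <dt>Address</dt>
  <dd rel="foaf:address">
    <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
    <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
    <span property="foaf:city">Cambridge</span> MA 02139<br />
    <span property="foaf:country">USA</span>
  </dd>
</dl>
"""


def extract_triples(root):
    """Walk the tree and collect (subject, predicate, object) tuples."""
    triples = []
    counters = defaultdict(int)  # per-tag counters for labels such as _:dd0

    def visit(element, subject):
        rel = element.get("rel")
        href = element.get("href")
        if rel is not None and href is not None:
            # rel with href: the object is the named resource.
            triples.append((subject, rel, "<%s>" % href))
        elif rel is not None:
            # rel without href: the element stands for a new blank node,
            # which also becomes the subject of everything it contains.
            node = "_:%s%d" % (element.tag, counters[element.tag])
            counters[element.tag] += 1
            triples.append((subject, rel, node))
            subject = node
        elif element.get("about") is not None:
            subject = "<%s>" % element.get("about")

        # In this document's examples, a prefixed class value types the subject.
        cls = element.get("class")
        if cls is not None and ":" in cls:
            triples.append((subject, "rdf:type", cls))

        prop = element.get("property")
        if prop is not None:
            text = "".join(element.itertext()).strip()
            triples.append((subject, prop, '"%s"' % text))

        for child in element:
            visit(child, subject)

    visit(root, "<>")  # "<>" stands for the current document
    return triples


TRIPLES = extract_triples(ET.fromstring(MARKUP))
for s, p, o in TRIPLES:
    print(s, p, o, ".")
```

Passing the current subject down the recursion is what lets nested rel elements, such as foaf:office containing foaf:address, produce a chain of blank nodes without any extra bookkeeping.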
In most cases, it is neither useful nor desired to make the internal components of the structured data accessible by the outside world. For example, there is likely no good reason for Tim to give his office address a URI which other RDF statements might reference. Simply handing out his FOAF URI is enough.
However, in some cases, it may in fact be useful to name all components yet to continue to use the rel without an href to designate the structured relationship. RDFa allows this as naturally as possible: if the element on which the rel is added has an id or an about, then the value of that attribute becomes the name of the node, and all triples are appropriately updated. about takes precedence over id, since it is an explicit RDFa statement.
For example, if Tim chose the following markup:
<dl class="foaf:Person" about="#card" id="card"> ... <dt>Address</dt> <dd rel="foaf:address" id="address"> <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br /> <span property="foaf:address_line_2">MIT Room 32-G524</span><br /> <span property="foaf:city">Cambridge</span> MA 02139<br /> <span property="foaf:country">USA</span> </dd> ... </dl>
the triples would then become:
<#card> rdf:type foaf:Person . <#card> foaf:address <#address> . <#address> foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral . <#address> foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral . <#address> foaf:city "Cambridge"^^XMLLiteral . <#address> foaf:country "USA"^^XMLLiteral .
Note that this doesn't change anything significant about the structured data itself, only that the address is now addressable by other structured data statements.
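The naming rule can be summarised in a few lines of Python. This is only a sketch of the precedence described above; the function name object_node and the dictionary representation of an element's attributes are assumptions made for illustration.

```python
def object_node(attrs, fresh_blank_node):
    """Choose the node for an element that carries rel without href.

    attrs is a dict of the element's attributes; fresh_blank_node is the
    label to fall back on when the element is left unnamed.
    """
    if "about" in attrs:
        # about is an explicit RDFa statement, so it takes precedence.
        return "<%s>" % attrs["about"]
    if "id" in attrs:
        # An id names a fragment of the current document.
        return "<#%s>" % attrs["id"]
    return fresh_blank_node


# The three cases: named by id, named by about (which wins over id), unnamed.
print(object_node({"rel": "foaf:address", "id": "address"}, "_:dd0"))
print(object_node({"rel": "foaf:address", "id": "address", "about": "#office"}, "_:dd0"))
print(object_node({"rel": "foaf:address"}, "_:dd0"))
```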
This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Michael Hausenblas, Steven Pemberton, Ralph Swick, Elias Torres, and Wing Yung. This work would not have been possible without the help of the Semantic Web Deployment Working Group, in particular chairs Guus Schreiber and Tom Baker. Earlier versions of this document were produced with the help of members of the Semantic Web Best Practices and Deployment Working Group, chaired by Guus Schreiber and David Wood. Gary Ng and David Booth provided insightful comments on previous versions.