RDF in HTML
The TAG meeting of 2002-04-08 concluded with
the following reflection - not consensus.
Requirements
- There is a requirement for namespace documents to be human-readable.
This allows, for example, an engineer to read a namespace document for a
new standard and find a link to a specification that explains how to
author content.
- There is a requirement for semantic web applications to be able to put
arbitrary information about things in a namespace (be they
rdf:Properties
or for that matter any other individual
thing). Some forms of semantic web processing also need to be able to
pick up that data quickly and in real time without unnecessary
indirections.
- There is a requirement for a machine application now to be able to
find, for example, XML Schema documents by dereferencing a namespace URI
(a function which the RDDL language provides).
Observations
- XHTML satisfies the first requirement as a W3C Recommendation for human
readable documents.
- RDF satisfies the second requirement.
- RDF-encoded RDDL information (see schema) satisfies the
third requirement.
Conclusion
- It would make sense to institute a convention where a namespace
document would be an XHTML document with embedded RDF.
Problems
- The RDF specification specifies how to understand the semantics (in
terms of RDF triples) in an RDF document that contains only RDF, but does
not explain how and when one can extract semantics from documents in
other namespaces which contain embedded RDF.
- The XHTML specification explains how to process XHTML namespace
content, but gives no indication about how to process embedded RDF
information.
Therefore, despite widely adopted specifications for XHTML and RDF, there
is no specification for the interpretation of the mixture. The TAG felt that
this lack, falling between the scopes of two working groups, was within its
scope to fill or ask to be filled.
A further problem is the question of how to define the meaning of a
URIref with a fragment id within such a document. This is the subject of a
lot of discussion on www-tag.
Possibilities
We have to do one of the following:
1. Specify the architecture for XML so that the thing referred to by a
#idvalue reference to an XML document depends not simply on the MIME
type but, for a mixed-namespace document, on the namespace of the
identified element; or
2. Change XHTML and its MIME type to know about embedded RDF and specify
its meaning; or
3. Specify the architecture so that the semantic web languages always use
#idvalue to refer to the abstract thing described by a bit of XML, while
hypertext languages always mean the bit of the document; or
4. Just don't mix HTML and RDF, as it will always be confusing to have two
parts of the meaning of a document.
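To make the first possibility concrete, here is a minimal sketch in Python of looking up the namespace of the element that a fragment id identifies. The function name `namespace_of_fragment`, the sample document, and the choice to treat both a plain `id` attribute and `rdf:ID` as identifying an element are assumptions made for illustration; none of them comes from a specification.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical mixed-namespace document used only for this illustration.
MIXED_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <head>
    <title>Example namespace</title>
    <rdf:RDF>
      <rdf:Property rdf:ID="colour"/>
    </rdf:RDF>
  </head>
  <body><p id="intro">Human-readable introduction.</p></body>
</html>"""


def namespace_of_fragment(xml_text, fragment_id):
    """Return the namespace URI of the element identified by fragment_id.

    Returns "" for an element in no namespace and None if nothing matches.
    Assumption: an element is identified either by a plain id attribute
    or by rdf:ID.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        plain_id = element.get("id")
        rdf_id = element.get("{%s}ID" % RDF_NS)
        if fragment_id in (plain_id, rdf_id):
            tag = element.tag
            # ElementTree spells qualified names as "{namespace}local".
            if tag.startswith("{"):
                return tag[1:].split("}", 1)[0]
            return ""
    return None
```

With this sample, `namespace_of_fragment(MIXED_DOC, "intro")` gives the XHTML namespace and `namespace_of_fragment(MIXED_DOC, "colour")` gives the RDF namespace, so a processor following the first possibility would interpret the two references differently.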
My gut feeling is to go for 3. I think 1 means that you can't use fragids
to point to a generic bit of XML when just doing XML text processing.
Solution 2 doesn't solve the general problem, and will need n^2 fixes for n
languages. Solution 3 has the problem that the same URIref is associated
with two different levels of meaning in different contexts. On the face of
it this violates the rule that the same URI always refers to the same thing,
but actually it doesn't: you just say that both uses refer to the bit of
document, and that there is an implicit dereference operation in every use
of a URIref in a semantic web language. This is, I think, normal: for
example, a graphics language which refers to a circle by URIref does refer
to the circle, not to the bit of XML.
This, however, means that EARL (an RDF language for talking about
accessibility tests), for example, cannot use RDF to refer to pieces of an
RDF document as XML.
Proposal
While one could imagine more complex specifications for processing the
mixture, here is a very straightforward one:
- An XHTML processor ignores all embedded RDF.
- An RDF processor processes all RDF within an XHTML document as though
it were within an RDF document.
- An RDF processor ignores any XHTML within which any RDF is
embedded.
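The three rules above can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. This is only an illustration under stated assumptions: the sample document, the placement of `rdf:RDF` inside `head`, and the function names `extract_embedded_rdf` and `strip_embedded_rdf` are hypothetical, and a real RDF processor would go on to parse the extracted elements into triples.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical namespace document: XHTML for people, one embedded rdf:RDF block.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example namespace</title>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <rdf:Property rdf:ID="colour">
        <rdfs:label>colour</rdfs:label>
      </rdf:Property>
    </rdf:RDF>
  </head>
  <body><p>Human-readable description of the namespace.</p></body>
</html>"""


def extract_embedded_rdf(xhtml_text):
    """RDF processor's view: every rdf:RDF element, ignoring the XHTML around it."""
    root = ET.fromstring(xhtml_text)
    return list(root.iter("{%s}RDF" % RDF_NS))


def strip_embedded_rdf(xhtml_text):
    """XHTML processor's view: the document tree with rdf:RDF subtrees removed."""
    root = ET.fromstring(xhtml_text)
    # Snapshot both loops so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "{%s}RDF" % RDF_NS:
                parent.remove(child)
    return root
```

Each element returned by `extract_embedded_rdf(SAMPLE)` could be serialized with `ET.tostring` and handed to an RDF parser as though it were a standalone RDF document, which is the behaviour the second rule asks for.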
If this seemed on the face of it to be acceptable to the community, the
TAG would then encourage:
- That the HTML Working Group and/or RDF Core Working Group make a
statement to the above effect in the form of a W3C Note or an appendix to
another specification;
- That W3C look for a way to bring consensus at a more formal level on
the RDF vocabulary appropriate for reference to things such as XML
schemas for documents.
Last change
$Id: htmlrdf.html,v 1.9 2002/04/17 17:05:44 timbl Exp $
Tim Berners-Lee