Linking CSS style sheets to XML documents

Abstract

HTML 4.0 has various ways to associate style rules with a document, in particular the LINK and STYLE elements and the STYLE attribute. Style sheets also make heavy use of the CLASS and ID attributes to fine-tune styles for specific HTML elements. Because of that, CSS provides easy-to-use shorthand notations for CLASS and ID.

Some XML-based document formats will want to define a similar vocabulary of style linking mechanisms. This Note outlines several ways of achieving HTML's functionality, or a generalization of it, in XML-based documents. It is merely a snapshot of the discussions, not a final specification.

Status

This Note records the state the discussion has reached in the CSS & FP Working Group (WG). It is a snapshot of the discussions, not a specification. With the formation of the XML Syntax WG, the responsibility for the topic of style sheet linking for XML documents has passed to the latter group. The CSS & FP WG doesn't plan to produce any further working drafts on the topic, but still wants to make its work public in order to invite feedback, both on the proposed mechanisms themselves and on possible enhancements to CSS to better support XML-based formats.

Feedback is welcome on the mailing list www-style@w3.org (public) or the mailing lists of the XML Syntax WG and CSS & FP WG (W3C Members only).

Introduction

In HTML, there are several mechanisms for attaching style sheets to the document or to individual elements, in particular these five: the LINK and STYLE elements, and the CLASS, ID and STYLE attributes. Designers of document formats written in XML might want to provide similar mechanisms. To make things easier both for applications that work with several XML-based formats at the same time and for the designers of those formats, this specification describes a set of conventions for adding one or more of the five mechanisms to an XML-based format.

A format written in XML doesn't have to provide all of these mechanisms. Many XML-based formats do not need a style sheet at all, and others may only need a single mechanism to link to a style sheet. This specification includes a convention by which a format can declare in a self-describing way which of the mechanisms it uses. "Self-describing" in this context means that an application without a priori knowledge about the format (other than that it is written in XML) can find out from the actual data which mechanisms are used and with which syntax.

For example, here is a piece of text written in some (unnamed) XML-based format, that declares itself to use the LINK mechanism:

The keyword "xml-stylesheet" indicates that the document has a link to a style sheet in the same way as a LINK element with REL="stylesheet" in HTML. There are some variations of this processing instruction corresponding to the different variants of LINK in HTML.
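As a rough illustration of what "self-describing" could mean in practice, the following Python sketch scans a document for "xml-stylesheet" processing instructions without knowing anything else about the format. Only the keyword "xml-stylesheet" comes from the text above; the sample document, its element names and the helper name find_stylesheet_pis are assumptions made for this illustration.

```python
from xml.dom import minidom

# Hypothetical document in an unnamed XML-based format (illustrative only).
SAMPLE = (
    '<?xml-stylesheet href="style.css" type="text/css"?>\n'
    "<article><title>Example</title><para>Some text.</para></article>"
)


def find_stylesheet_pis(xml_text):
    """Return the data of every top-level xml-stylesheet PI, in document order."""
    doc = minidom.parseString(xml_text)
    return [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


if __name__ == "__main__":
    # An application with no knowledge of <article> can still see the link.
    print(find_stylesheet_pis(SAMPLE))
```

The sketch only looks at PIs in front of the root element; PIs placed before nested subtrees would need a full tree walk.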

The LINK mechanism is the simplest to use. The others need a little more mark-up.

Summary

mechanism	#	characteristic	1*	2*	comments
LINK	#1	PI	-	n/a	
LINK	#2	predefined elt.	×	-	
LINK	#3	namespace	×	-	
LINK	#4	XLink	×	×	"stylesheet" is not a reserved relation
STYLE elt.	#1	PI	-	×	
STYLE elt.	#2	predefined elt.	×	-	
STYLE elt.	#3	namespace	×	-	
CLASS	#1	PI	-	×	
CLASS	#2	predefined att.	×	-	
STYLE att.	#1	PI	-	×	
STYLE att.	#2	predefined att.	×	-	
ID	#1	PI	-	×	high potential for errors (ID not unique...)
ID	#2	predefined att.	×	-	same as for ID #1

1* = works in trivial subset?
2* = element/attribute name can be internationalized?
× = yes, - = no

The LINK mechanism

The LINK mechanism is used to apply one or more external style sheets to a subtree (or the whole tree) of a document. It can indicate a default as well as alternative style sheets, and also style sheet fragments that are common to all alternatives. The style sheets can be specified as a list of style fragments that are to be concatenated.

A default style sheet is one that is intended to be applied unless a user indicates otherwise. An alternative style sheet is one that is only applied if the user explicitly asks for it.

Alternative 1

The syntax of this LINK mechanism is simple. It is a processing instruction (PI) that is inserted before the subtree to which the style sheets apply. It has the following general form:

The order of the attributes is arbitrary. The "media" and "type" attributes are optional. If the "title" attribute is omitted, the link is to a common style sheet fragment, which is concatenated at the start of all style sheets with titles. If the "media" attribute is present, it indicates that the style sheet fragment is only applicable if the output medium is one of those mentioned in the attribute.

"uuu" is the relative URI of the style sheet fragment, relative to the base URI of the document in which the PI is found. "mmm" is a comma-separated list of media types. The media types available for CSS are listed in [???]. "ttt" is a MIME type. For CSS this would be "text/css". The "type" attribute, if present, must indicate the actual MIME type of the style sheet, and its purpose is to help applications avoid fruitless requests to a server for style formats they can't handle anyway.
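The attribute rules above can be sketched in Python as follows. This is a minimal illustration, not a normative parser: it assumes the PI carries pseudo-attributes named "href", "title", "media" and "type" written as name="value" pairs, and the function name parse_link_pi and the example URLs are hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches name="value" or name='value' pairs inside the PI data.
_PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_link_pi(data, base_uri):
    """Turn the data of one LINK-style PI into a dictionary.

    The order of the pseudo-attributes does not matter. "href" is resolved
    against the base URI of the containing document; "media" becomes a list
    of media types, or None when the attribute is absent.
    """
    attrs = {}
    for match in _PSEUDO_ATTR.finditer(data):
        name, double_quoted, single_quoted = match.groups()
        attrs[name] = double_quoted if double_quoted is not None else single_quoted
    media = attrs.get("media")
    return {
        "href": urljoin(base_uri, attrs["href"]),  # assumed to be required
        "title": attrs.get("title"),
        "media": None if media is None
        else [m.strip() for m in media.split(",") if m.strip()],
        "type": attrs.get("type"),
    }


if __name__ == "__main__":
    print(parse_link_pi(
        'type="text/css" media="screen, print" href="style.css"',
        "http://example.org/docs/doc.xml",
    ))
```

An application could inspect the "type" entry before fetching "href", which is the saving in server requests that the text describes.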

A set of PIs specifies as many style sheets as there are different "title" attributes. The style sheets for each title are found by concatenating the fragments with the same title, and prepending any fragments without titles. Only fragments that apply to the desired output medium are used.

The title that occurred first in the list of PIs indicates the default style sheet. The other style sheets are the alternative style sheets.
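The concatenation and default rules above might be implemented along the following lines. The sketch assumes each PI has already been reduced to a dictionary with an "href", an optional "title" and an optional "media" list (these key names and the function name are hypothetical), and it assumes that the default is chosen among the titles that still have a fragment for the requested medium, a point the text leaves open.

```python
def assemble_style_sheets(links, medium):
    """Group style sheet fragments by title for one output medium.

    links  -- dictionaries in document order, each with "href" and
              optionally "title" and "media" (a list of media types)
    medium -- the desired output medium, e.g. "screen"

    Returns (sheets, default): sheets maps each title to the ordered list
    of fragment URIs to concatenate, default is the first title or None.
    """
    # Only fragments that apply to the desired output medium are used.
    applicable = [
        link for link in links
        if link.get("media") is None or medium in link["media"]
    ]
    # Fragments without a title are common to all titled style sheets.
    common = [link["href"] for link in applicable if link.get("title") is None]

    sheets = {}
    for link in applicable:
        title = link.get("title")
        if title is not None:
            # Start every titled style sheet with the common fragments.
            sheets.setdefault(title, list(common)).append(link["href"])

    # Dictionaries keep insertion order, so the first key is the first title.
    default = next(iter(sheets), None)
    return sheets, default


if __name__ == "__main__":
    pis = [
        {"href": "common.css"},
        {"href": "rustic.css", "title": "Rustic"},
        {"href": "modern.css", "title": "Modern", "media": ["screen"]},
    ]
    print(assemble_style_sheets(pis, "screen"))
```

A document that contains only untitled fragments yields no titled style sheet in this sketch; how that case should behave is not covered by the text.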

The PI applies to the following subtree, excluding any nested subtrees that have their own PI, and possibly excluding nested documents (see below).

This mechanism is a slight generalization of that proposed in a Note by James Clark.

Alternative 2

A disadvantage may be that this puts a predefined, English-like name in the list of elements, which may not match the naming scheme of other elements. An advantage is that this will work in the trivial subset of XML (still under development), which will likely not allow processing instructions.

Alternative 3

Instead of a reserved element, it is also possible to use the conventions of XML namespaces (still under development). In that case, the link to the style sheet will be carried by an element (as in alternative 2), but the name of the element is "stylesheet" and the element has an associated URL "http://www.w3.org/TR/XML-style". Depending on how XML namespaces are defined, it could look like this:
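One possible rendering is sketched below in Python. It assumes the xmlns declaration syntax of the later "Namespaces in XML" Recommendation, which this Note could not yet rely on, so the sample document is an assumption rather than the Note's own syntax. The prefix "s", the element names "doc" and "para" and the function name are hypothetical; the element name "stylesheet" and the URL are those given above.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "http://www.w3.org/TR/XML-style"  # URL given in the text above

# Hypothetical document; the namespace syntax is an assumption.
SAMPLE = (
    '<doc xmlns:s="http://www.w3.org/TR/XML-style">'
    '<s:stylesheet href="style.css" type="text/css"/>'
    "<para>Some text.</para>"
    "</doc>"
)


def namespaced_links(xml_text):
    """Return the href of every "stylesheet" element in the style namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as {namespace-URI}local-name.
    return [el.get("href") for el in root.iter(f"{{{STYLE_NS}}}stylesheet")]


if __name__ == "__main__":
    print(namespaced_links(SAMPLE))
```

Because the match is made on the namespace URL and the local name, the prefix chosen by the document author does not matter.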

Alternative 4

The conventions for linking in XML, XLink (under development), could also be used as a basis for defining a LINK mechanism. In this case, the element that does the linking can be called anything, but it must have two predefined attributes (with fixed values), in addition to the normal attributes for LINK: "xml:link=simple" indicates that this is a link to something, and "role" takes the place of "rel" to indicate what kind of link it is:
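A recognizer for this convention could be sketched as follows in Python. The element name "look" and the rest of the sample document are hypothetical, as is the function name; the two fixed attributes, xml:link="simple" and role="stylesheet", are the ones described above, and "href" is assumed to be one of the normal LINK attributes.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the reserved "xml" prefix under this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document: the linking element may have any name.
SAMPLE = (
    "<doc>"
    '<look xml:link="simple" role="stylesheet" href="style.css" type="text/css"/>'
    '<see xml:link="simple" role="citation" href="other.xml"/>'
    "<para>Some text.</para>"
    "</doc>"
)


def xlink_style_links(xml_text):
    """Return the href of every simple link whose role is "stylesheet"."""
    root = ET.fromstring(xml_text)
    return [
        el.get("href")
        for el in root.iter()
        if el.get(f"{{{XML_NS}}}link") == "simple"
        and el.get("role") == "stylesheet"
    ]


if __name__ == "__main__":
    print(xlink_style_links(SAMPLE))
```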

A potential problem with this approach is that there is currently no way to reserve the keyword "stylesheet", but if it is sufficiently advertised, people are likely to use it as intended here.

The STYLE mechanism

The STYLE mechanism can be used when it is desirable to embed the style sheet directly in the XML document, instead of linking to it. With this mechanism, one or more XML elements are defined as fulfilling a role similar to that of the STYLE element in HTML, i.e., an element whose content is a style sheet.

The style sheet applies to the whole of the document in which it is embedded (possibly excluding any embeddings). If the STYLE element is itself part of an embedded document, it only applies to the embedded document (see below).

Alternative 1

To make this mechanism self-describing, a PI must be inserted somewhere before the element that acts as STYLE element. This PI can also indicate the names of the attributes, if they are different from the ones used by HTML. The general form of this PI is:

The order of the attributes is arbitrary. The "element" attribute gives the name of the element that fulfills the role of STYLE element. The "title-att" attribute names the attribute that acts as the "title" attribute on the STYLE element (default is "title"). The "type-att" and "media-att" attributes are analogous for the "type" and "media" attributes of the STYLE element.

The PI applies to the subtree that follows it, with the exception of any nested subtrees that have their own PI, and possibly with the exception of embedded documents.

This declares that any element called "layout" that occurs inside the subtree rooted at "doc" is in fact a STYLE element, and therefore contains a style sheet that is to be applied to the whole of the document. The PI also defines that the "label" attribute on that element fulfills the role of "title". Since the PI doesn't rename any other attributes, they are assumed to have their default names ("type" and "media").
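The effect of such a declaration can be sketched in Python. The sketch starts from pseudo-attributes that have already been extracted from the PI, so it makes no assumption about the name of the PI itself. The attribute names "element", "title-att", "type-att" and "media-att" and their defaults are taken from the text; the sample document and the function names are hypothetical, and the scoping rules for nested PIs and embedded documents are ignored.

```python
from xml.dom import minidom


def style_declaration(pseudo_attrs):
    """Fill in the default attribute names for a STYLE declaration."""
    return {
        "element": pseudo_attrs["element"],
        "title": pseudo_attrs.get("title-att", "title"),
        "type": pseudo_attrs.get("type-att", "type"),
        "media": pseudo_attrs.get("media-att", "media"),
    }


def embedded_style_sheets(xml_text, declaration):
    """Collect the style sheets held by the declared STYLE-like elements."""
    doc = minidom.parseString(xml_text)
    sheets = []
    for element in doc.getElementsByTagName(declaration["element"]):
        # The content of the element is the style sheet itself.
        text = "".join(
            child.data
            for child in element.childNodes
            if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
        )
        sheets.append({
            "title": element.getAttribute(declaration["title"]) or None,
            "type": element.getAttribute(declaration["type"]) or None,
            "media": element.getAttribute(declaration["media"]) or None,
            "css": text.strip(),
        })
    return sheets


if __name__ == "__main__":
    decl = style_declaration({"element": "layout", "title-att": "label"})
    sample = (
        '<doc><layout label="Rustic" type="text/css">para { color: red }</layout>'
        "<para>Some text.</para></doc>"
    )
    print(embedded_style_sheets(sample, decl))
```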

The attributes on the STYLE element (or the "layout" element in the example) act in the same way as the attributes in the LINK mechanism explained above.

Alternative 2

Instead of an indirection via a PI, it is also possible to use a predefined element.

Although that will put a strange, fixed name in the list of element names, it may be good enough, and its simplicity has something to be said for it. It is also likely to work in the trivial subset (which is expected not to allow PIs).

Alternative 3

XML namespaces can also be used to label an element as a STYLE element. Depending on the namespace syntax, this could look like:

The CLASS mechanism

The CLASS attribute in HTML is a way to make variants (or "subclasses") of elements that are typically rendered the same as the element from which they are derived, except for some small change. For example, a paragraph with class="warning" may be rendered as a normal paragraph, except that it has a red border around it.

In CSS, there is a special convenience-syntax for CLASS attributes. Instead of writing a selector like

Alternative 1

An XML-based data format that has an attribute with a function similar to that of CLASS in HTML, and wants to make it available for use with the short syntax in CSS, can indicate this with yet another variant of the PI:

where "ccc" is the name of the attribute that acts like CLASS. This PI again applies to the subtree that follows it, excepting any nested subtrees that have their own PI and possibly excepting any embedded documents. Here is an example:
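What the declaration buys a style sheet author can be sketched in Python: once the attribute "ccc" is known, a selector in the short syntax, such as "para.warning", can be matched against it. The sketch assumes the same matching rule CSS uses for CLASS in HTML (the attribute value is a space-separated list of words) and handles only one element name and one class per selector. The attribute name "kind", the element names and the function name are hypothetical.

```python
def matches_short_selector(selector, element_name, element_attrs, class_attr):
    """Match a selector such as "para.warning" or ".warning" on one element.

    class_attr is the attribute declared to act like CLASS ("ccc" above).
    """
    name, _, class_name = selector.partition(".")
    if name not in ("", "*", element_name):
        return False
    if class_name:
        # The class attribute holds a space-separated list of class names.
        words = element_attrs.get(class_attr, "").split()
        return class_name in words
    return True


if __name__ == "__main__":
    # A hypothetical format that declares "kind" as its CLASS-like attribute.
    attrs = {"kind": "warning urgent"}
    print(matches_short_selector("para.warning", "para", attrs, "kind"))
```

Without the declaration, an application would have no way to know that ".warning" refers to the "kind" attribute rather than to an attribute literally named "class".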

Alternative 2

As with the STYLE element above, it may be good enough to reserve a fixed name for the attribute (for example "xml-class"), although in this case that seems less desirable, especially in document formats inspired by languages other than English.

The STYLE ATTRIBUTE mechanism

HTML has a STYLE attribute that makes it possible to set a style on an individual element, by embedding the style rules in an attribute of the element. Compared to the CLASS mechanism (see above) or the ID mechanism (see below) this saves one indirection. Using CSS syntax, such an attribute contains style rules without selectors, since the rules implicitly apply just to that element.
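As an illustration of "style rules without selectors", the Python sketch below splits the value of such an attribute into property/value pairs. It is deliberately naive: it assumes CSS declaration syntax ("property: value" separated by semicolons) and does not handle semicolons inside quoted strings or URLs. The function name is hypothetical.

```python
def parse_inline_style(value):
    """Split "color: red; margin-left: 2em" into a property dictionary.

    No selector is involved: the declarations apply to the element that
    carries the attribute.
    """
    declarations = {}
    for part in value.split(";"):
        name, separator, property_value = part.partition(":")
        if separator and name.strip():
            declarations[name.strip().lower()] = property_value.strip()
    return declarations


if __name__ == "__main__":
    print(parse_inline_style("color: red; margin-left: 2em"))
```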

Alternative 1

The way an XML document can declare that it uses this mechanism is completely analogous to the way it indicates the CLASS mechanism:

Alternative 2

Instead of declaring the attribute with a PI, it may be good enough if the name of the attribute is fixed, e.g.: "xml-style".

The ID mechanism

Like the CLASS attribute, the ID attribute in HTML also benefits from a short syntax as well as from special cascading rules in CSS. In this case an element with ID="pq34" can be given a style with a rule like this:

In any HTML document there can only be one element with a given value for ID. If there are two or more elements with the same ID value, the document is in error and the behavior of the style sheet is undefined.

Alternative 1

An XML-based format can opt to make the short syntax available to a CSS style sheet, by allowing this PI in a document:

where "iii" is the name of the attribute that will benefit from the special support in CSS.

Note that the PI again applies to the following subtree, and may be repeated in front of subtrees in which a different attribute acts as the ID attribute.

Also note that there is no requirement that the document format is defined by a DTD, and even if it is, there is no requirement that the indicated attribute is of (XML-)type "ID". But there is a requirement that in a given document the values of the ID attributes are unique.
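Since uniqueness is not guaranteed by a DTD in this case, an application may want to check it itself. The Python sketch below reports the values that occur more than once for the attribute named by "iii". The attribute name "ref", the sample document and the function name are hypothetical, and the sketch treats the whole document as one scope, ignoring PIs that are repeated in front of subtrees.

```python
from collections import Counter
from xml.dom import minidom


def duplicate_ids(xml_text, id_attr):
    """Return the sorted values of id_attr that occur on more than one element."""
    doc = minidom.parseString(xml_text)
    values = [
        element.getAttribute(id_attr)
        for element in doc.getElementsByTagName("*")
        if element.hasAttribute(id_attr)
    ]
    return sorted(value for value, count in Counter(values).items() if count > 1)


if __name__ == "__main__":
    sample = (
        '<doc><para ref="pq34">one</para>'
        '<para ref="pq34">two</para><note ref="b2"/></doc>'
    )
    # A non-empty result means #-type selectors are unreliable for this document.
    print(duplicate_ids(sample, "ref"))
```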

Alternative 2

A somewhat more limited solution is to state that the ID attribute must be called "id", and vice versa, that any attribute called "id" is treated (for the purposes of CSS) as an ID. Of course, using non-unique "id" attributes in combination with #-type selectors in CSS will still be an error.

Embedded documents

Many formats written in XML will have a mechanism for allowing other formats (typically XML-based formats) to be embedded. For example, a document format may allow a mathematical formula written in MathML to be embedded, or a diagram in some vector-graphics format.

As of this writing, there are no common conventions for recognizing an embedding, but work is being done on something called "namespaces."

For the application of style sheets to embedded documents the following needs to be worked out:

For the moment these are open issues, and it is advisable not to use the style linking mechanisms described above in the case of documents with namespace-like embeddings.

Linked documents

In some cases, documents are embedded by reference, such as when a document is embedded using the equivalent of the IMG or OBJECT element in HTML. Sometimes the embedded document could inherit the style of the containing document. How to arrange for this is also still an open issue.

Style sheets and HTTP

The HTTP header called "Link:" can be used as an alternative to the LINK mechanism, for style sheets that apply to the whole document. If a document is received from a server with Link headers that indicate a style sheet, those Link headers are only used for those parts of the document to which no other LINK mechanism applies. In other words, a LINK mechanism in the document itself overrides the HTTP headers. The Link header looks like this:
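The precedence rule can be sketched in Python as follows. The sketch assumes the usual shape of a Link header value, a URI in angle brackets followed by parameters such as rel="stylesheet", and it simplifies the rule to whole documents: if the document has any LINK mechanism of its own, the headers are ignored. It does not handle several links in one header value. The function names and the example values are hypothetical.

```python
import re


def parse_link_header(value):
    """Parse one Link header value (without the "Link:" field name)."""
    match = re.match(r"\s*<([^>]*)>(.*)$", value)
    if match is None:
        return None
    uri, rest = match.groups()
    params = {
        name.lower(): param_value
        for name, param_value in re.findall(r';\s*([\w-]+)\s*=\s*"?([^";]*)"?', rest)
    }
    return {"href": uri, **params}


def effective_style_links(document_links, header_values):
    """Apply the rule that links in the document override HTTP headers."""
    if document_links:
        return document_links
    parsed = [parse_link_header(value) for value in header_values]
    return [
        link for link in parsed
        if link is not None and link.get("rel", "").lower() == "stylesheet"
    ]


if __name__ == "__main__":
    headers = ['<style.css>; rel="stylesheet"; type="text/css"']
    print(effective_style_links([], headers))
```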

Style sheets and RDF

A style sheet can be considered as a piece of meta-data for a document, and as such it is a candidate for expression in the RDF formalism. There are different ways of doing that: for example, each individual style property could be described as an RDF property on an element, but in this description we consider the whole style sheet to be a single node, which is a property of the whole document, not of individual elements in that document.

In other words, this section describes an RDF-way to achieve the equivalent of the LINK mechanism in the special case that the style sheets only apply to the whole document, and not to subtrees.

The following RDF properties are defined:

RDF schema for linking style sheets
Property name	Applies to	Type	Description
stylesheet	XML document	Set of StyleSheets	Links a resource to its style sheets
title	StyleSheet	Text	Provides a label for a style sheet
media	StyleSheet	Set of Media	A set of media descriptors for a style sheet, restricting the use of the style sheet to the given media.
type	StyleSheet	Text	A MIME type for the style sheet
location	StyleSheet	URI	Where to find the style sheet
default	XML document	StyleSheet	The default style sheet for this document

Here is an example. If "doc.doc" is the URL of a document written in some XML-based format, and "style1.css" and "style2.css" are the URLs of two alternative style sheets for this document, their relation might be described as the following set of predicates ("3-tuples"):
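One way such a set of 3-tuples might look is sketched below in Python, with each tuple written as (subject, property, value). The property names come from the table above and the URLs from the example; the node identifiers "sheet1" and "sheet2", the titles, the representation of a set as repeated tuples, and the helper names are assumptions made for this illustration.

```python
# Each tuple is (subject, property, value).
TRIPLES = [
    ("doc.doc", "stylesheet", "sheet1"),
    ("doc.doc", "stylesheet", "sheet2"),
    ("doc.doc", "default", "sheet1"),
    ("sheet1", "location", "style1.css"),
    ("sheet1", "type", "text/css"),
    ("sheet1", "title", "Rustic"),  # hypothetical title
    ("sheet2", "location", "style2.css"),
    ("sheet2", "type", "text/css"),
    ("sheet2", "title", "Modern"),  # hypothetical title
]


def values_of(triples, subject, prop):
    """Return every value of prop on subject, in the order given."""
    return [value for s, p, value in triples if s == subject and p == prop]


def default_location(triples, document):
    """Follow "default" and then "location" to find the default style sheet."""
    defaults = values_of(triples, document, "default")
    if not defaults:
        return None
    locations = values_of(triples, defaults[0], "location")
    return locations[0] if locations else None


if __name__ == "__main__":
    print(default_location(TRIPLES, "doc.doc"))
```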

Security considerations

The mechanisms described in this specification, if implemented, will make it possible to build a document from arbitrary elements and attributes and have it displayed in an apparently meaningful way. However, a CSS style sheet does not contain any machine-readable semantics and thus the document will not be usable by any other program. It is recommended to use existing formats, such as HTML, whenever possible.

A link to an external style sheet may cause executable code to be downloaded. Care must be taken that what is downloaded is indeed a style sheet and not something else.

An ill-specified style sheet may make important text hard to read or even invisible.

Other dangers may be the result of buggy formatters: very large style sheets may cause buffers to overflow.

NOTE-XML-and-stylesheets-19981012

Linking style sheets to XML documents

W3C Note
