Copyright ©2000, 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document explains how to design accessible XML languages. Compared to the HTML or MathML languages, XML is one level up: it is a meta-syntax used to describe these languages as well as new ones, and it provides no intrinsic guarantee of device independence or textual alternate support. In this context, guidelines are needed that explain to designers of XML formats and tools how to include basic accessibility features - such as the ones present in HTML - in all their new developments.
This document is a WAI PF Internal Working Draft made available by the Protocols and Formats Working Group (PFWG). It is intended that this document will become a W3C Note, following appropriate review.
This version, dated 22 April, follows the face-to-face meeting of the working group in Amsterdam on the 17th and 18th of April. It reflects changes made at the meeting, but has not been reviewed by the wider membership of the working group.
This document includes a very preliminary attempt to assign some priorities (according to the model used in WCAG) to checkpoints. There is not yet consensus in the group that this should be done, or that these priorities are appropriately assigned. For any checkpoint marked [[p12@@]] there is known disagreement among the people at the face-to-face meeting.
Publication of this draft does not imply endorsement by the W3C membership. A list of current W3C technical reports and publications, including working drafts and notes, can be found at http://www.w3.org/TR/.
Please send comments about this document to w3c-wai-pf@w3.org.
XML (eXtensible Markup Language) is a meta-syntax, used to create new languages.
It can be seen as a simplification of SGML (Standard Generalized Markup Language), designed to promote a wider acceptance in Web markets, but serving the same functionality of extensibility and new language design.
HTML (HyperText Markup Language), on the other hand, is one particular application of SGML, which covers one set of needs ("simple" hypertext documents) and one set of elements and attributes.
For instance, in HTML, authors can write elements like:-
<TITLE>XML and Accessibility</TITLE>
<ADDRESS lang=fr>Daniel Dardailler</ADDRESS>
<H1>Background</H1>
and they can only use elements (TITLE, H1, etc.) defined by the HTML specification (which defines about a hundred), and their attributes.
In SGML and XML, authors can define their own set of elements, and end up with documents like:-
<MENU>New England Restaurant</MENU>
<APPETIZER>Clam Chowder
  <PHOTO url="clam.jpg">A large creamy bowl of clam chowder, with bread crumbs on top</PHOTO>
</APPETIZER>
which may fit more closely the needs of their information system.
Within W3C, the HTML language is now being recast as XML - this is called XHTML - including a modularization of HTML to suit the needs of a larger community (mobile users, Web TV, etc).
XML is therefore not to be seen as a replacement for HTML, but as a new building layer. Examples of its use are XHTML (for general hypertext content), MathML (for representing mathematical formulas), SMIL (for synchronizing multimedia), SVG (for scalable graphics), and other new languages designed by other organizations (such as OpenEBook, XML-EDI, etc.).
Furthermore, it is important to understand that XML is not only a User Interface technology (like HTML): it can be, and often is, used in protocol communication, to serialize and encode data to be sent from one machine to another.
The XML grammars (called schemas - but see the caveat about our use of the term "schema" in the definition section) can be classified along two different axes:-
According to this taxonomy, these guidelines only address Data-centric schemas. This does not imply that the second type of schema doesn't have accessibility issues or features (see how XSLT can help Braille formatting, for instance). However, since such schemas do not convey data, they are out of our scope here.
The WAI (Web Accessibility Initiative) has done extensive work in the HTML area, resulting in much new functionality being added to version 4.0 of the language (see the HTML4 Accessibility Improvements paper).
These features include:
One area of concern with the advent of XML is that the freedom of design it brings will result in a loss of accessibility features, present today because of HTML's pervasive presence and widely available specification.
For instance, one could design a new XML language that would prevent the creation of accessible documents, by not including in the element or attribute set a way to attach an alternate textual description for a photo:-
<MENU>New England Restaurant</MENU>
<APPETIZER>Clam Chowder
  <PHOTO url="clam.jpg"/> <!-- no alt attribute or textual content model -->
</APPETIZER>
In this example, the problem is not that the author of this document failed to attach an alt attribute value to the PHOTO element. The problem is that the designer of the language didn't put the attribute in the language itself (e.g. in the schema).
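The difference between the two PHOTO examples can be checked mechanically. The following Python sketch, which is not part of the guidelines themselves, parses both menu fragments with the standard library's xml.etree.ElementTree and reports whether each PHOTO element carries a textual alternative. The DOC wrapper element and the function name photo_alternatives are assumptions introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical DOC wrapper: the fragments in the text have two top-level
# elements, so a single root is added to make them parseable.
ACCESSIBLE = """<DOC>
<MENU>New England Restaurant</MENU>
<APPETIZER>Clam Chowder
  <PHOTO url="clam.jpg">A large creamy bowl of clam chowder, with bread crumbs on top</PHOTO>
</APPETIZER>
</DOC>"""

INACCESSIBLE = """<DOC>
<MENU>New England Restaurant</MENU>
<APPETIZER>Clam Chowder
  <PHOTO url="clam.jpg"/>
</APPETIZER>
</DOC>"""


def photo_alternatives(xml_text):
    """Return the text alternative of every PHOTO element (None if absent)."""
    root = ET.fromstring(xml_text)
    result = []
    for photo in root.iter("PHOTO"):
        text = (photo.text or "").strip()
        result.append(text or None)
    return result
```

A check like this can only detect a missing alternative in a given document. It cannot repair a language whose schema gives the author nowhere to put one, which is why the responsibility lies with the language designer.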
But let's start by defining what we mean by accessible schema and documents (details on these definitions are provided at the end of this document):-
An XML schema is accessible if it enables, and indeed actively promotes, the creation of accessible documents.
A document is accessible if it can be equally understood by its targeted audience regardless of the device used to access it.
An accessible document is also defined as conforming to the Web Content Accessibility Guidelines.
As explained in the introduction, we're only considering Data-centric languages here, and for them, the message is simple: be device independent and export your semantics as much as you can.
While the priority is stronger on the first aspect (multi-modality), both aspects are important, as without the knowledge of the meaning of the XML elements and attributes, there is little chance that alternative user agents can do something intelligent with just the document bits.
This semantic knowledge can of course be provided through human-readable documentation, but having machine-readable assertions of semantics that can then be used to present the document in various media is paramount for pervasive access (i.e., you don't need a programmer, you just need a program). Enabling others to map from your language to existing ones, or vice versa, is a useful accessibility feature.
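As a rough illustration of a machine-readable mapping, the Python sketch below renames the elements of the restaurant menu language to XHTML elements using a lookup table. The table MENU_TO_XHTML, the DOC wrapper element, the choice of target elements and the function to_xhtml are all hypothetical; a real language would more likely publish such assertions in its schema or in a transformation sheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical machine-readable assertion: each element of the menu
# language is declared equivalent to an XHTML element.
MENU_TO_XHTML = {
    "DOC": "div",
    "MENU": "h1",
    "APPETIZER": "p",
}


def to_xhtml(element, mapping):
    """Rebuild the tree, renaming each element according to the mapping."""
    # Unmapped elements fall back to a generic span so their text survives.
    target = ET.Element(mapping.get(element.tag, "span"))
    target.text = element.text
    target.tail = element.tail
    for child in element:
        target.append(to_xhtml(child, mapping))
    return target


source = ET.fromstring(
    "<DOC><MENU>New England Restaurant</MENU>"
    "<APPETIZER>Clam Chowder</APPETIZER></DOC>"
)
converted = ET.tostring(to_xhtml(source, MENU_TO_XHTML), encoding="unicode")
```

Once the mapping exists as data, any program can apply it, so a user agent that already knows how to present XHTML can present the menu without a programmer writing menu-specific code.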
ICADD (International Committee on Accessible Document Design) was a pioneer in this topic, for SGML accessibility and ways to convey arbitrary schema semantics (using specific SGML binding mechanisms). A few years later, ICADD has not really been adopted (in fact, the ICADD DTD was replaced by HTML and its well known semantics), and people are still trying to solve the same problem, albeit with more experience in the field of HTML accessibility, and applied to XML this time.
This section provides a list of four abstract guidelines. Some examples of checkpoints are provided, but the detailed checkpoints and techniques that schema designers can follow to achieve accessibility when designing new XML schemas still have to be defined by WAI and W3C.
For example, textual alternatives can be repurposed for many different output devices, whereas non-textual content is often confined to a certain set of devices. Thus, by allowing and encouraging synchronized textual alternatives, you allow your tagset to be more interoperable, and hence accessible.
xml:lang to associate sign language video, the desc element of SVG, or the caption element for the XHTML tables module. SMIL's systemCaptions doesn't do it - it is defined as an operational processing thing, not as an association semantic. See also WCAG 1 checkpoint 1.1 [[p1]]
Data-oriented XML should contain precise methods of encoding the data for its particular scope. By increasing the semantics of your tagset, and setting linking devices to outside presentations or further semantics, you allow your data to become "Webized" and hence to operate within many environments.
Select default style hints for your languages, and provide them along with the documentation, or in the schema of the language itself (where possible).
Make sure that people can map to and from your elements, and easily make assertions about them. Furthermore, make sure that you provide your own first-party assertions about your languages: e.g. don't make users second-guess an element's purpose.
Languages used only for presentation to a certain scope of users (i.e. final form tagsets) should adhere to the following caveats:-
In the presentation of guidelines for XML accessibility, we try to separate abstract guidelines from implementation techniques. This allows us to talk about the general guideline principles without spending the time up-front to solve the implementation issues.
In fact, there are several techniques for achieving the same result, and people's decisions will depend on the time and products available and on their own commitment to access.
For instance, if an XML designer wants to create some kind of "list" element in a given markup, this can be implemented using various techniques:
Schema: Even though we use the term "schema", we don't want people to assume we are only talking about a schema as defined in XML Schema but rather some document or collection of documents which contains all the references for interpreting a document which is encoded in accordance with the usage of some application or community of discourse. "Profile" might be a better word for our usage.
An XML schema is accessible if it enables and actively promotes the creation of accessible documents.
A document is accessible if it can be equally understood by its targeted audience regardless of the device used to access it.
An accessible document is also defined as conforming to the Web Content Accessibility Guidelines.
The word "promote" is important as "enable" alone does not cover the case where a schema could include some open string representation somewhere and claim minimal accessibility.
To take an example, if HTML didn't have an ALT attribute on IMG, it would still in theory "enable" the creation of accessible documents, since HTML files carry textual content and one could always describe images inline, as in:-
<IMG SRC="Tax.gif"> How to pay your taxes
but this doesn't "promote" accessibility, as most authors will not want to repeat "How to pay your taxes" if the logo already says "How to pay your taxes" (assuming CSS cannot be used for that). Having ALT "promotes" accessibility as it allows images to be described without any loss, such as duplicated text, for those who view the image.
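The following Python sketch shows how a text-only presentation might use ALT. It relies on the standard library's html.parser module; the class TextOnlyRenderer and the function render_text are hypothetical names, and the renderer is a deliberately minimal stand-in for a real user agent.

```python
from html.parser import HTMLParser


class TextOnlyRenderer(HTMLParser):
    """Collect the text of a page, substituting ALT text for images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def render_text(html):
    """Return the text a text-only device would present for this markup."""
    renderer = TextOnlyRenderer()
    renderer.feed(html)
    renderer.close()
    return " ".join(renderer.parts)
```

With ALT, the author writes the description once and a graphical browser shows only the image. With the inline workaround, the same text reaches the text-only user, but sighted users see the phrase twice, which is the duplication that discourages authors.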
In any case, accessibility is not just about alternative content, as the next section will show.
The word "device" is also important as it encompasses more than just media independence: it's both output (graphical, voice, braille, text-only) and input (mouse, keyboard, voice, keypad, one-touch).
This term also potentially carries with it the issues related to high bandwidth availability (or lack thereof), where access to data becomes impossible over slow connections because of its volume.
The term "equal understanding" is critical as it permits some form of graceful transformation when content designed primarily for one medium is presented in another.
Graceful transformation is a key concept in the area of accessibility. Let's define it.
Definition:
For instance, suppose I need to check the online yellow line train schedule and I don't have visual access to the Web. If the train Web site uses a yellow wagon animated icon to point me at the schedule, and does not provide a label somewhere saying that this is for the yellow line, thus relying only on my capacity to see the color, I suddenly cannot understand this site: it does not transform gracefully.
If the schema designer hasn't provided a way to attach alternate content to some rich piece like an animated yellow wagon, the content provider will not reach all of his/her audience with this information.
Suppose now that on a different page this Web site provides a nice clickable 2D map with all the stops and asks me to select my start and destination. If a simple list of the line stops is provided in textual form, it does transform gracefully: it's not as fast as a couple of mouse clicks, so there is some "degradation" in the system, but a user reliant on text can obtain the information.
The WAI Protocols and Formats Working Group (PF) participants have contributed directly to the content of this document.