Copyright ©2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document contains the requirements for the Document Object Model, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model provides a standard set of objects for representing HTML and XML documents, a standard model of how these objects can be combined, and a standard interface for accessing and manipulating them. Vendors can support the DOM as an interface to their proprietary data structures and APIs, and content authors can write to the standard DOM interfaces rather than product-specific APIs, thus increasing interoperability on the Web.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This document is a Working Draft of the requirements of the Document Object Model. Comments on this document are invited and are to be sent to the public mailing list www-dom@w3.org. An archive is available at http://lists.w3.org/Archives/Public/www-dom/.
This is still a draft document and may be updated, replaced or obsoleted by other documents at any time. It is therefore inappropriate to use it as reference material or to cite it as other than "work in progress". This is work in progress and does not imply endorsement by, or the consensus of, either W3C or members of the DOM Working Group.
This document has been produced as part of the W3C DOM Activity. The authors of this document are the DOM WG members.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.
Listed below are the general requirements of the Document Object Model.
This refers to navigation around a document, such as finding the parent of a given element, or finding which child elements a given parent element contains.
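As an illustration only, the following minimal sketch shows both directions of navigation using Python's xml.dom.minidom, one existing DOM implementation. The sample document and variable names are assumptions made for the example and are not part of these requirements.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, written without whitespace between tags
# so that no whitespace-only text nodes appear among the children.
doc = parseString("<book><title>DOM</title><chapter/><chapter/></book>")
book = doc.documentElement

# Downward navigation: which child elements does a given parent contain?
child_names = [n.tagName for n in book.childNodes
               if n.nodeType == n.ELEMENT_NODE]

# Upward navigation: what is the parent of a given element?
title = book.firstChild
parent_name = title.parentNode.tagName
```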
These requirements are specific to HTML documents.
The event model must be rich enough to create completely interactive documents. This requires the ability to respond to any user action that may occur on the document. Therefore, many of these requirements only apply if a UI component is involved.
Cascading Style Sheets (CSS) is one model for manipulating the style of the document. The Stylesheet Object Model exposes the ability to create, modify, and associate CSS style sheets with the document. The stylesheet model will be extensible to other stylesheet formats in the future.
The DOM Range API should support the following types of operations on ranges:
Here are the items that will be addressed:
Here are other items that will be considered:
The DOM Level 3 Events specification will attempt to address some of the remaining issues from the DOM Level 2 Event specification as well as a couple of requested enhancements to the model. It will neither attempt to redesign the model nor attempt to define any additional event models.
The specification must define a technique for registering EventListeners in groups. These groups will then have specified behavior in which attempts to modify the flow of an event are restricted to, and affect only, the group to which the EventListener in question belongs.
It is also required that whatever technique is specified to accomplish this purpose be compatible with the existing DOM Level 2 Event model and with any EventListeners registered using DOM Level 2 Event model methods.
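One possible reading of the grouping requirement is sketched below. Every name here (Event, Target, add_event_listener, the group argument, and the "default" group standing in for DOM Level 2 registrations) is hypothetical and chosen for illustration; this is not a proposed API, and only the bubbling phase is modeled.

```python
from collections import defaultdict


class Event:
    def __init__(self, type_):
        self.type = type_
        self.stopped_groups = set()  # groups whose propagation was stopped
        self.current_group = None    # group of the listener now running

    def stop_propagation(self):
        # Only the group of the currently running listener is affected.
        self.stopped_groups.add(self.current_group)


class Target:
    DEFAULT_GROUP = "default"  # assumption: stands in for DOM Level 2 listeners

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.listeners = defaultdict(list)  # (type, group) -> callbacks

    def add_event_listener(self, type_, callback, group=DEFAULT_GROUP):
        self.listeners[(type_, group)].append(callback)

    def dispatch(self, event):
        node = self
        while node is not None:  # bubble from the target up to the root
            for (type_, group), callbacks in list(node.listeners.items()):
                if type_ != event.type or group in event.stopped_groups:
                    continue
                for callback in callbacks:
                    event.current_group = group
                    callback(event)
            node = node.parent


root = Target("root")
child = Target("child", parent=root)
log = []


def group_a_on_child(event):
    log.append("A@child")
    event.stop_propagation()  # stops group "A" only


child.add_event_listener("click", group_a_on_child, group="A")
root.add_event_listener("click", lambda e: log.append("A@root"), group="A")
root.add_event_listener("click", lambda e: log.append("default@root"))
child.dispatch(Event("click"))
```

In this sketch the listener in group "A" on the root is skipped, while the listener registered without a group still runs, which is the compatibility behavior the requirement asks for.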
The specification must define a set of key events to handle keyboard input. This key event set must be fully internationalizable. It is hoped that this key event set will be compatible with existing key event sets used in current systems; however, this is not a requirement.
The specification should attempt to define a set of input events to handle IME based keyboard input. It is expected that this requirement will depend heavily on any key event set defined by the specification.
The specification must define a device independent event set. This event set should allow notification of typical user interaction with a document without requiring the use of either mouse events or key events.
Each Document View must provide a Device Independent UI Event Model.
The following events are not present in the DOM Level 2 specification. Those related to selection should be picked up when a selection model is included:
The content model referenced in these use cases/requirements is an abstraction and does not refer to DTDs or XML Schemas or any transformations between the two.
For the CM-editing and document-editing worlds, the following use cases and requirements are common to both and could be labeled as the "Validation and Other Common Functionality" section:
Use Cases:
Requirements:
Specific to the CM-editing world, the following are use cases and requirements and could be labeled as the "CM-editing" section:
Use Cases:
Requirements:
Specific to the document-editing world, the following are use cases and requirements and could be labeled as the "Document-editing" section:
Use Cases:
Requirements:
General Issues:
An isNamespaceAware attribute has been added to the generic CM object to help applications determine if qualified names are important. Note that this should not be interpreted as helping identify what the underlying content model is. A MathML example to show how namespaced documents will be validated will be added later.
DOM Level 3 will provide an API for loading XML source documents into a DOM representation and for saving a DOM representation as an XML document.
Some environments, such as the Java platform or COM, have their own ways to persist objects to streams and to restore them. There is no direct relationship between these mechanisms and the DOM load/save mechanism. This specification defines how to serialize documents only to and from XML format.
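The load and save round trip can be illustrated with an existing implementation. The sketch below uses parseString and toxml from Python's xml.dom.minidom, which are that library's own entry points and not the API this document anticipates; the sample document is an assumption.

```python
from xml.dom.minidom import parseString

# Load: XML source text into a DOM representation.
source = '<note lang="en"><to>DOM WG</to></note>'  # hypothetical sample
doc = parseString(source)

# Modify through the normal DOM API.
doc.documentElement.setAttribute("status", "draft")

# Save: DOM representation back to XML source, then load it again.
saved = doc.toxml()
reloaded = parseString(saved)
status = reloaded.documentElement.getAttribute("status")
recipient = reloaded.getElementsByTagName("to")[0].firstChild.data
```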
Requirements that apply to both loading and saving documents.
Documents must be able to be parsed from and saved to the following sources:
Note that Input and Output streams take care of the in memory case. One point of caution is that a stream doesn't allow a base URI to be defined against which all relative URIs in the document are resolved.
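The caution about streams can be made concrete. In the sketch below, a document is parsed from an in-memory byte stream, and the application must supply the base URI itself in order to resolve a relative reference. The base URI, the img element, and the href attribute are hypothetical values chosen for the example.

```python
import io
from urllib.parse import urljoin
from xml.dom.minidom import parse

# An input stream covers the in-memory case, but carries no base URI.
stream = io.BytesIO(b'<doc><img href="figures/a.png"/></doc>')
doc = parse(stream)
href = doc.getElementsByTagName("img")[0].getAttribute("href")

# Assumption: the application knows where the stream came from and
# passes that information along separately.
base_uri = "http://example.org/specs/draft.xml"
resolved = urljoin(base_uri, href)
```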
While creating a new document using the DOM API, a mechanism must be provided to specify that the new document uses a pre-existing Content Model and to cause that Content Model to be loaded.
Note that while DOM Level 2 creation can specify a Content Model when creating a document (public and system IDs for the external subset, and a string for the subset), DOM Level 2 implementations do not process the Content Model's content. For DOM Level 3, the Content Model's content must be read.
When processing a series of documents, all of which use the same Content Model, implementations should be able to reuse the already parsed and loaded Content Model rather than reparsing it again for each new document.
This feature may not have an explicit DOM API associated with it, but it does require that nothing in this section, or the Content Model section, of this specification block it or make it difficult to implement.
Some means is required to allow applications to map public and system IDs to the correct document. This facility should provide sufficient capability to allow the implementation of catalogs, but providing catalogs themselves is not a requirement. In addition XML Base needs to be addressed.
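A minimal sketch of such a mapping facility follows. The function name resolve_entity, the catalog contents, and the identifiers are all hypothetical; the fallback of resolving a system ID against a base URI is one simple assumption about how XML Base might be taken into account.

```python
from urllib.parse import urljoin

# Hypothetical catalog: public ID -> location of the local copy.
CATALOG = {
    "-//EXAMPLE//DTD Note 1.0//EN": "file:///usr/share/xml/example/note.dtd",
}


def resolve_entity(public_id, system_id, base_uri, catalog=CATALOG):
    """Return the URI from which the external entity should be read."""
    if public_id in catalog:
        # A catalog entry wins over the system ID.
        return catalog[public_id]
    # Otherwise resolve the system ID relative to the base URI.
    return urljoin(base_uri, system_id)


mapped = resolve_entity("-//EXAMPLE//DTD Note 1.0//EN", "note.dtd",
                        "http://example.org/docs/a.xml")
fallback = resolve_entity(None, "note.dtd", "http://example.org/docs/a.xml")
```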
Loading a document can cause the generation of errors including:
Saving a document can cause the generation of errors including:
This section, as well as the DOM Level 3 Content Model section, should use a common error reporting mechanism. Well-formedness and validity checking are in the domain of the Content Model section, even though the corresponding errors may commonly be generated in response to an application asking that a document be loaded.
The following requirements apply to loading documents.
Parsers may have properties or options that can be set by applications. Examples include:
A mechanism to set properties, query the state of properties, and to query the set of properties supported by a particular DOM implementation is required.
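The three operations named above (set a property, query its state, and query which properties are supported) might take a shape like the following. The class ParserProperties and the property names are hypothetical and are not drawn from any specification.

```python
class ParserProperties:
    """Hypothetical property bag; names and defaults are illustrative only."""

    _DEFAULTS = {
        "validate": False,
        "namespace-aware": True,
        "expand-entities": True,
    }

    def __init__(self):
        self._values = dict(self._DEFAULTS)

    def supported(self):
        # Query the set of properties this implementation supports.
        return sorted(self._values)

    def is_supported(self, name):
        return name in self._values

    def get(self, name):
        # Query the state of one property.
        self._check(name)
        return self._values[name]

    def set(self, name, value):
        # Set one property; unsupported names are rejected.
        self._check(name)
        self._values[name] = bool(value)

    def _check(self, name):
        if name not in self._values:
            raise KeyError(f"unsupported parser property: {name}")


props = ParserProperties()
props.set("validate", True)
```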
The fundamental requirement is to write a DOM document as XML source. All information to be serialized should be available via the normal DOM API.
There are several options that can be defined when saving an XML document. Some of these are:
The following items are not committed to, but are under consideration. Public feedback on these items is especially requested.
Provide the ability for a thread that requested the loading of a document to continue execution without blocking while the document is being loaded. This would require some sort of notification or completion event when the loading process was done.
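One way to picture non-blocking loading with a completion notification is sketched below, using a worker thread from Python's concurrent.futures. The helper load_async and the on_loaded callback are hypothetical names; an event-based design would be equally consistent with the text above.

```python
from concurrent.futures import ThreadPoolExecutor
from xml.dom.minidom import parseString


def load_async(executor, source, on_loaded):
    """Start parsing on a worker thread; call on_loaded(document) when done."""
    future = executor.submit(parseString, source)
    future.add_done_callback(lambda f: on_loaded(f.result()))
    return future


loaded = []  # filled in by the completion callback
with ThreadPoolExecutor(max_workers=1) as executor:
    load_async(executor, "<doc><p>hello</p></doc>", loaded.append)
    # The requesting thread is free to continue here without blocking.
# Leaving the "with" block waits for outstanding work to finish.
```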
Provide the ability to examine the partial DOM representation before it has been fully loaded.
In one form, a document may be loaded asynchronously while a DOM based application is accessing the document. In another form, the application may explicitly ask for the next incremental portion of a document to be loaded.
Provide the capability to write out only a part of a document. May be able to leverage TreeWalkers, or the Filters associated with TreeWalkers, or Ranges as a means of specifying the portion of the document to be written.
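In its simplest form, partial output means serializing a single subtree. The sketch below does this with xml.dom.minidom by calling toxml on an element rather than on the document; selection by TreeWalker, Filter, or Range is not shown, and the sample document is an assumption.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<doc><intro>skip</intro>"
    "<section id='s1'><p>keep</p></section></doc>"
)

# Choose the portion of the document to be written.
section = doc.getElementsByTagName("section")[0]

# Serialize only that subtree.
fragment = section.toxml()
```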
Document fragments, as specified by the XML Fragment specification, should be able to be loaded. This is useful to applications that only need to process some part of a large document. Because the DOM is typically implemented as an in-memory representation of a document, fully loading large documents can require large amounts of memory.
XPath should also be considered as a way to identify XML Document fragments to load.
Document fragments, as specified by the XML Fragment specification, should be able to be loaded into the context of an existing document at a point specified by a node position, or perhaps a range. This is a separate feature from simply loading document fragments as a new Node.
This document discusses the requirements and framework for using multiple implementations of DOM or DOM-based APIs designed for a particular markup language within a single standard DOM application. Up until now, the Document Object Model design has been concerned with defining an API to an entire XML document, where all methods and attributes in the API apply equally to the entire document and it is assumed that only one implementation of the DOM is needed by an application.
With the advent of markup languages such as Scalable Vector Graphics and the Mathematical Markup Language, it has become obvious that this simple model no longer applies. It is quite possible to have documents which embed some MathML or SVG markup, where a DOM application might reasonably expect to be able to use the specialized MathML or SVG DOM-based APIs. Similarly, many DOM applications are being designed to "glue" together two systems that both implement the DOM, and need some standard mechanism to assist in making the multiple implementations interoperate.
A module of Level 3 DOM, which we shall refer to by the shorthand name "EDOM", will address this issue.
As new XML vocabularies are developed, those defining the vocabularies are beginning to define specialized APIs for manipulating XML instances of those vocabularies by extending the DOM to provide interfaces and methods that perform operations frequently needed by their users. For example, the MathML and SVG groups are developing DOM extensions to allow users to manipulate instances of these vocabularies using semantics appropriate to mathematics and images (respectively) as well as the generic DOM "tree" semantics. Instances of SVG or MathML are often embedded in XML documents conforming to a different schema such as XHTML or DocBook. While the XML Namespaces Recommendation provides a mechanism for integrating these documents at the syntax level, it has become clear that the DOM Level 2 Recommendation is not rich enough to cover all the issues that have been encountered in having these different DOM implementations be used together in a single application. The Embedded DOM module deals with the requirements brought about by embedding fragments written according to a specific markup language (the embedded component) in a document where the rest of the markup is not written according to that specific markup language (the host document). It does not deal with fragments embedded by reference or linking.
We are seeing at least two implementation scenarios in which DOM components can be embedded in a host DOM. One extreme might be called the "monolithic" scenario in which a single product (e.g. the Mozilla browser) implements both the generic host DOM and the specialized embedded DOM. The embedded DOM still has a different DOMImplementation object than the host because it will support a different feature set, although it is quite likely that the embedded and host DOMs will use compatible classes or data structures. At the other extreme, the embedded DOM reflects a completely different implementation, perhaps from an entirely different vendor, e.g. an Adobe SVG component plugged into the SoftQuad editor.
The general objective of the EDOM ET is to define whatever mechanisms are required in order to make documents that are actually handled by two or more DOM implementations work together as seamlessly and compatibly as possible under various implementation scenarios. Ideally, a DOM application writer should see the entire document as a coherent unit, with certain Nodes that are actually handled by embedded DOMs simply having more specialized capabilities. It is not clear at this point whether this is achievable for all scenarios, but our goal is to make it seamless for applications that do not care about the differences, and to make it possible for applications that do care about the differences to discover which DOM handles embedded nodes, be informed of where the boundaries are, and to use that implementation to its fullest extent.
Achieving these objectives may entail clarifications to the wording of the DOM specification, new interfaces or methods on existing interfaces, revised requirements for the Load/Save module so that the multiple DOMs are built and linked together at parse time, or some combination of these.
We will consider the following use cases when assessing proposed requirements and in designing the DOM extensions to support embedded DOMs. All assume that some DOM Level 3 methods have been called to link the various DOM implementations together so that DOM boundaries can be detected and handled.
A DOM application running on the host DOM implementation may need to access information controlled by the embedded DOM, e.g. to serialize the entire document.
A DOM application running on the embedded DOM implementation may need to access information controlled by the host DOM, e.g. to query a style attribute, namespace declaration, etc.
A DOM application may need to detect and process events irrespective of whether they occur in a host or embedded DOM.
It may be acceptable for DOM Level 3 applications to use additional APIs to detect host/embedded DOM boundaries and to navigate across them. Nevertheless, it would be far better for users if ordinary node navigation operations, validation, iterators/treewalkers, and event propagation worked seamlessly across DOM boundaries.
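To make the idea of boundary detection concrete, the sketch below walks a single xml.dom.minidom tree and reports each place where a child element's namespace differs from its parent's. Treating a namespace change as a stand-in for a host/embedded DOM boundary is an assumption made for illustration only; a real embedded DOM would involve a second DOMImplementation, which this example does not model. The helper name find_boundaries is hypothetical.

```python
from xml.dom.minidom import parseString

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

# Hypothetical host document (XHTML) with an embedded SVG fragment.
source = (
    f'<html xmlns="{XHTML}"><body><p>text</p>'
    f'<svg xmlns="{SVG}"><circle r="1"/></svg>'
    "</body></html>"
)


def find_boundaries(element, found=None):
    """Collect (parent name, child name, child namespace) at each change."""
    found = [] if found is None else found
    for child in element.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if child.namespaceURI != element.namespaceURI:
            found.append((element.localName, child.localName,
                          child.namespaceURI))
        find_boundaries(child, found)  # ordinary navigation continues inside
    return found


doc = parseString(source)
boundaries = find_boundaries(doc.documentElement)
```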
We have prepared the following grid to clarify the different scenarios under which a standard for defining how an embedded DOM interoperates with a host DOM could be implemented, and what this means for the application programmer using the DOM interfaces.
One axis of the grid reflects the different architectures in which one DOM can be embedded in another. The alternatives we are considering include:
The other axis reflects the properties or features of the DOM API that could be preserved across the host / embedded border. The features under consideration include:
The Embedded DOM plans to support the following use cases derived from this grid:
Feature | Monolithic | Dynamic | Wrappers | Data Island |
Awareness | Yes | Yes | Yes | Maybe |
Boundary Navigation | Yes | Yes | Yes | Maybe |
Seamlessness | Yes | Yes | Maybe | No |
Event Propagation | Yes | Yes | Maybe | No |
CSS Inheritance | Maybe | Maybe | Maybe | No |
Ranges are probably out of scope for the EDOM. Does anyone disagree?
Do we want to support both a procedural mechanism and a non-procedural mechanism (something like the HTML <object> tag) to specify the relationship between the host and embedded DOM?
Do we have two DOM trees, one for the generic DOM, and one for the specialized DOM? (The sentiment at the Redwood Shores F2F was "no").
How do you know when to hand control over to the embedded DOM? Do getChildNodes() and getParentNode() throw a new exception?
What sort of node should the top-level embedded node be? If it is a special node that is a "document" in some sense but a child of another in other senses, that would help. But it does need to be an element rather than a document, so that the ownerDocument is consistent throughout the complete document, including the embedded fragment.
How do you get a handle to a DOMImplementation object without a Document object ... we may need to solve the bootstrapping problem.
The embedded DOM API needs to add something like a createMyTopLevelElement from a Core DOM element, or maybe from a string. The string method, while not as elegant, is more likely to work across different languages and platforms.
What happens if a node from the embedded DOM tree is moved outside the embedded DOM tree? Is this possible?
What about recursive embedded DOMs? Including the case where XHTML is within SVG is within XHTML.
There is a widely-perceived need to offer a vendor-neutral way to use XPath expressions to select matching nodes in a DOM document. A module of DOM, which we shall refer to by the shorthand name "XPATH", will address this issue.
Levels 1 and 2 of the DOM have some functionality to allow specific nodes to be located without the user having to navigate through the DOM tree, notably getElementsByTagName and getElementById. The Working Group's intention was to add these very limited APIs as a stop-gap until an XML-aware "query" language was available. While not by any means a complete XML query language, XPath does provide a syntax for locating XML content by value and has been a W3C Recommendation since November 1999. Various DOM implementations support APIs that allow one or more nodes to be located via an XPath expression, and there have been a number of calls from the user community to incorporate this capability into the DOM Level 3 Recommendation.
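For reference, the existing tag-name lookup can be exercised as follows with xml.dom.minidom. The sample document is an assumption, and only getElementsByTagName is shown.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<lib><book><title>A</title></book>"
    "<book><title>B</title></book></lib>"
)

# Locate nodes by tag name, without walking the tree by hand.
titles = [t.firstChild.data for t in doc.getElementsByTagName("title")]
```

This finds elements by name only; selecting by attribute value or by position still requires manual navigation, which is the gap an XPath-based API would fill.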
A semi-public mailing list was established in early 2000 to solicit input and advice (the archives can be viewed at http://lists.w3.org/Archives/Public/www-dom-xpath). The comments on this list tended to push for a quite complete XPath API rather than the quite limited interfaces provided by existing DOM implementations. Since this is out of scope for the DOM WG, efforts were made to find another W3C Working Group to take on this requirement. Those efforts have not been successful.
Thus, it falls to the DOM WG to define a simple but effective API for DOM that provides a basis for writing interoperable DOM applications that use XPath expressions rather than tree navigation to locate nodes matching simple search criteria.
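The kind of lookup envisaged can be sketched with Python's xml.etree.ElementTree, whose findall method accepts a limited subset of XPath. ElementTree is not a DOM implementation, so this shows only the style of query, not the API this module would define. The sample data is an assumption.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<lib>"
    "<book year='1999'><title>A</title></book>"
    "<book year='2001'><title>B</title></book>"
    "</lib>"
)

# Select by an expression instead of by tree navigation:
# titles of books whose year attribute equals 2001.
matches = root.findall("./book[@year='2001']/title")
titles = [m.text for m in matches]
```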