DOM Level 3 will provide an API for loading XML source documents into a DOM representation and for saving a DOM representation as a XML document.
Some environments, such as the Java platform or COM, have their own ways to persist objects to streams and to restore them. There is no direct relationship between these mechanisms and the DOM load/save mechanism. This specification defines how to serialize documents only to and from XML format.
Requirements that apply to both loading and saving documents.
Documents must be able to be parsed from and saved to the following sources:
Note that Input and Output streams take care of the in memory case. One point of caution is that a stream doesn't allow a base URI to be defined against which all relative URIs in the document are resolved.
While creating a new document using the DOM API, a mechanism must be provided to specify that the new document uses a pre-existing Content Model and to cause that Content Model to be loaded.
Note that while DOM Level 2 creation can specify a Content Model when creating a document (public and system IDs for the external subset, and a string for the internal subset), DOM Level 2 implementations do not process the Content Model's content. For DOM Level 3, the Content Model's content must be be read.
When processing a series of documents, all of which use the same Content Model, implementations should be able to reuse the already parsed and load ed Content Model rather than reparsing it again for each new document.
This feature may not have an explicit DOM API associated with it, but it does require that nothing in this section, or the Content Model section, of this specification block it or make it difficult to implement.
Some means is required to allow applications to map public and system IDs to the correct document. This facility should provide sufficient capability to allow the implementation of catalogs, but providing catalogs themselves is not a requirement. In addition XML Base needs to be addressed.
Loading a document can cause the generation of errors including:
Saving a document can cause the generation of errors including:
This section, as well as the DOM Level 3 Content Model section should use a common error reporting mechanism. Well-formedness and validity checking are in the domain of the Content Model section, even though they may be commonly generated in response to an application asking that a document be loaded.
The following requirements apply to loading documents.
Parsers may have properties or options that can be set by applications. Examples include:
A mechanism to set properties, query the state of properties, and to query the set of properties supported by a particular DOM implementation is required.
The fundamental requirement is to write a DOM document as XML source. All information to be serialized should be available via the normal DOM API.
There are several options that can be defined when saving an XML document. Some of these are:
The following items are not committed to, but are under consideration. Public feedback on these items is especially requested.
Provide the ability for a thread that requested the loading of a document to continue execution without blocking while the document is being loaded. This would require some sort of notification or completion event when the loading process was done.
Provide the ability to examine the partial DOM representation before it has been fully loaded.
In one form, a document may be loaded asynchronously while a DOM based application is accessing the document. In another form, the application may explicitly ask for the next incremental portion of a document to be loaded.
Provide the capability to write out only a part of a document. May be able to leverage TreeWalkers, or the Filters associated with TreeWalkers, or Ranges as a means of specifying the portion of the document to be written.
Document fragments, as specified by the XML Fragment specification, should be able to be loaded. This is useful to applications that only need to process some part of a large document. Because the DOM is typically implemented as an in-memory representation of a document, fully loading large documents can require large amounts of memory.
XPath should also be considered as a way to identify XML Document fragments to load.
Document fragments, as specified by the XML Fragment specification, should be able to be loaded into the context of an existing document at a point specified by a node position, or perhaps a range. This is a separate feature than simply loading document fragments as a new Node.