This is the official issues list for the DOM Level 3 Core Last Call period.
This document contains a list of issues regarding the DOM Level 3 Core specification Last Call period. All comments or issues regarding the specification or this document must be reported to www-dom@w3.org (public archives) before July 31, 2003. After this date, we don't guarantee to take your comments into account before moving to Candidate Recommendation.
An XML version of this issues list is also available.
DOM Working Group members: see also the W3C Member-only bugzilla for non-official issues.
Id:Title | State | Type | Open actions | Ack. |
---|---|---|---|---|
clover-1 : an application data | agreed | editorial | Agreement | |
clover-2 : DOM features and plus (+) sign | declined | clarification | Agreement | |
clover-3 : renameNode and non-namespace nodes | declined | proposal | Agreement | |
clover-4 : logically adjacent and wholeText | agreed | clarification | Agreement | |
clover-5 : error types | agreed | clarification | Agreement | |
clover-6 : whitespace-in-element-content | declined | clarification | Agreement | |
clover-7 : Text.data | agreed | editorial | Agreement | |
stenius-1 : derived types | agreed | request | Agreement | |
arnold-1 : NameList: contains and containsNS | agreed | request | Agreement | |
adams-1 : DOMImplementationRegistry behavior | agreed | proposal | Agreement | |
aillon-1 : Document.xmlVersion | agreed | clarification | No reply from reviewer | |
arnold-2 : attribute name case | agreed | error | No response to reviewer | |
arnold-3 : hasFeature and Core 1.0 | agreed | clarification | Agreement | |
arnold-4 : ordered collection | agreed | editorial | Agreement | |
arnold-5 : getName vs getLocalName | agreed | proposal | No reply from reviewer | |
arnold-6 : exceptions in NameList | agreed | proposal | Agreement | |
arnold-7 : DOMImplementationSource | agreed | proposal | Agreement | |
clover-8 : DOMError severity | agreed | proposal | Agreement | |
nichol-1 : Node.namespaceURI | agreed | proposal | No reply from reviewer | |
arnold-8 : actual encoding? | agreed | clarification | Agreement | |
arnold-9 : config or configuration? | declined | editorial | Agreement | |
arnold-10 : documentURI and valid URIs | declined | proposal | Agreement | |
arnold-11 : recognized XML encoding | declined | clarification | Agreement | |
arnold-12 : Requirements of standalone | declined | proposal | Agreement | |
arnold-13 : adoptNode and importNode | declined initial suggestion, agreed on follow-up | editorial | Proposal incorporated | |
importNode-1 : adoptNode and importNode | declined | editorial | Agreement | |
arnold-14 : adoptNode return value | declined | proposal | Agreement | |
arnold-15 : Attributes and renameNode | declined | clarification | Agreement | |
manian-1 : getDOMImplementation(s) | agreed | proposal | No reply from reviewer | |
manian-2 : unspecified standalone | agreed | proposal | No reply from reviewer | |
manian-3 : whitespaces | agreed | proposal | No reply from reviewer | |
arnold-16 : Position comparison | declined | clarification | No reply from reviewer | |
arnold-17 : getFeature and hasFeature | declined | proposal | Agreement | |
arnold-18a : isWhitespaceInElementContent() or isWhitespaceInElementContent ? | agreed | proposal | No reply from reviewer | |
arnold-18 : isId() or isId ? | declined | proposal | No reply from reviewer | |
arnold-19 : DOMError.clone() | declined | proposal | Agreement | |
arnold-20 : User data and event handling | declined | proposal | Agreement | |
arnold-21 : anonymous types | declined | proposal | Agreement | |
arnold-22 : XML Schema and DTD types | declined | proposal | No reply from reviewer | |
arnold-23 : nested and anonymous types | declined | proposal | No reply from reviewer | |
arnold-24 : handleErrors | agreed | clarification | Agreement | |
arnold-25 : byte and character offsets | agreed | proposal | Agreement | |
lesch-1 : isWhitespaceInElementContent or isWhiteSpaceInElementContent ? | agreed | editorial | No reply from reviewer | |
yergeau-1 : nodeName? | agreed | clarification | No reply from reviewer | |
yergeau-2 : Attribute values and entity references | agreed | clarification | No reply from reviewer | |
yergeau-3 : Entity declarations | agreed | clarification | No reply from reviewer | |
yergeau-4 : redundant if? | agreed | editorial | No reply from reviewer | |
yergeau-5 : Namespace algorithm | agreed | editorial | No reply from reviewer | |
yergeau-6 : actualEncoding vs xmlEncoding | agreed | error | No reply from reviewer | |
yergeau-7 : xmlVersion mapping | agreed | editorial | No reply from reviewer | |
yergeau-8 : notations mapping | agreed | editorial | No reply from reviewer | |
yergeau-9 : no namespace and Infoset | agreed | error | No reply from reviewer | |
yergeau-10 : previous and next sibling mapping | agreed | error | No reply from reviewer | |
yergeau-11 : CDATA sections and Infoset | agreed | editorial | No reply from reviewer | |
yergeau-12 : actualEncoding and xmlEncoding | agreed | clarification | Agreement | |
yergeau-13 : renameNode and invalid characters | agreed | error | Agreement | |
yergeau-14 : normalize() and character normalization | agreed | proposal | No reply from reviewer | |
yergeau-15 : CharacterData methods and character normalization | agreed | clarification | No reply from reviewer | |
yergeau-16 : "cdata-sections" default must be false | declined | proposal | Agreement | |
yergeau-17 : check character normalization? | declined | clarification | Objection | |
yergeau-18 : Unicode 4.0 | agreed | editorial | Agreement | |
yergeau-19 : URI/IRIs | agreed | clarification | No reply from reviewer | |
rancier-1 : iterators for DOMConfiguration | agreed | proposal | No reply from reviewer | |
ustiansky-1 : What exception to raise? | no decision (raised) | error | | |
stenback-1 : setting Document.prefix | agreed | error | No response to reviewer | |
adler-1 : "negative count" and unsigned counts | declined | error | No response to reviewer | |
The repeated wording "an application data" is rather odd; 'data' is of course plural.
the new method of prepending a '+' to the feature name seems rather clumsy. If a Level 2 feature is updated to a Level 3 feature which can be non-castable, an application that wants the Level 2 feature and doesn't care about casting would have to call hasFeature twice to find out whether the feature can be supplied, once with "+"..."3.0" and once with "2.0".
it seems to be impossible to rename a node and end up with
a non-namespace (Level 1) node. For orthogonality,
shouldn't there be renameNode
and
renameNodeNS
?
We discussed that and didn't find enough interest in having a renameNode/renameNodeNS solution, so unless people start to express an interest in having it, we won't do it. By the way, createDocument is another exception to that orthogonality...
by my reading of the definition of "logically adjacent text nodes", fooNode's wholeText should also give "barfoo". Is this a mistake? If not, why is fooNode adjacent to barNode but not vice versa? If wholeText is only supposed to look forwards, the spec should say so.
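For illustration only, the symmetry the reviewer expects can be checked with a short Java sketch. It assumes the JDK's bundled DOM implementation and uses two plain sibling Text nodes rather than the draft's own example; the class name WholeTextSketch is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class WholeTextSketch {
    /** Builds a root element holding two sibling Text nodes ("bar", "foo") and returns wholeText of each. */
    static String[] wholeTextOfBothNodes() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);
        Text bar = doc.createTextNode("bar");
        Text foo = doc.createTextNode("foo");
        root.appendChild(bar);
        root.appendChild(foo);
        // wholeText is queried from each node in turn, so the two results can be compared.
        return new String[] { bar.getWholeText(), foo.getWholeText() };
    }

    public static void main(String[] args) {
        for (String s : wholeTextOfBothNodes()) {
            System.out.println(s);
        }
    }
}
```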
still seems a bit vague. How exactly does a fatal error differ from an error? Can an error handler be called for arbitrary DOM exceptions, or just the few circumstances defined here? Are parse errors in Load/Save going to cause DOMErrors? What should DOMErrorHandlers do with unrecognised errors? Are the "wf-..." errors warnings?
A fatal error stops the processing, unlike an error.
No. DOMExceptions are and stay exceptions. The relatedException is meant for platform-dependent exceptions, not DOMException. An example would be a SecurityException or IOException when using DOMParser.
We certainly don't intend to define the required behaviors of DOMErrorHandler for recognized or unrecognized errors. An unrecognized error must have one of the three severity levels, but that's all. No changes were made to the specification.
"Discard all Text nodes that contain whitespaces in element content" implies that Text nodes in element content with *any* whitespace characters in their data should be removed, rather than Text nodes composed *only* of whitespace. I'm pretty sure this is not what was meant.
The definition of whitespaces in element content is quite clear in the draft, especially since the following sentence indicates that Text.isWhitespaceInElementContent() should be used. We added an extra link to the infoset property as well.
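The two readings in dispute can be told apart with a minimal Java sketch. It looks only at the character data and deliberately ignores the content-model condition that the Infoset property also requires; the class and method names are hypothetical.

```java
class WhitespaceSketch {
    /** True if c is one of the four XML white space characters (space, tab, LF, CR). */
    static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** Reading A: the data contains at least one white space character. */
    static boolean containsWhitespace(String data) {
        for (int i = 0; i < data.length(); i++) {
            if (isXmlSpace(data.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    /** Reading B: the data is non-empty and composed only of white space. */
    static boolean isAllWhitespace(String data) {
        if (data.isEmpty()) {
            return false;
        }
        for (int i = 0; i < data.length(); i++) {
            if (!isXmlSpace(data.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsWhitespace("a b") + " " + isAllWhitespace("a b"));
        System.out.println(containsWhitespace(" \n") + " " + isAllWhitespace(" \n"));
    }
}
```

Only reading B matches the reviewer's understanding of the intent: a Text node such as "a b" contains white space but would not be discarded.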
"The DOMString attribute of the Text node..." - surely "The data attribute...".
Is it possible to find out if the type of an Element or Attribute is derived from some specific type in the Schema?
What I would like to see is something like the Xerces XSTypeDefinition.derivedFrom
method in the TypeInfo
interface.
isDerivedFrom
has been added to
TypeInfo
.
add this issue on the f2f agenda.
Consider the following addition for the
NameList
interface:
boolean contains(DOMString name);
boolean containsNS(DOMString namespaceURI, DOMString localName);
While the Group rejected the proposal to remove
ElementEditVAL.isElementDefined
and
ElementEditVAL.isElementDefinedNS
(cf the
appropriate resolution of this DOM Level 3 Validation
issue), it has been decided to consider the addition of
contains
and containsNS
.
contains and containsNS have been added.
The implementation of
DOMImplementationRegistry
shown in Appendix
G.1 makes use of a Java System Property to record a list
of names of classes that implement
DOMImplementationSource
. A user of the
registry may naturally infer that sources are checked in
the order specified by this property (and the order by
which additional sources are added to the registry using
the addSource()
method). However, this may or
may not be the case since the implementation of
DOMImplementationRegistry
shown here makes
use of a Hashtable
which does not guarantee
any specific enumeration order.
I would suggest doing one of the following:
DOMImplementationRegistry
in Appendix G.1
to specify it in terms of an interface rather than a
class; furthermore, define its behavior more clearly
in terms of order of evaluation in the case that two
or more DOMImplementationSource
instances
implement the same features;
List
;
DOMImplementationSource
will be returned
when more than one supports the same features.
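To illustrate the ordering concern, here is a minimal Java sketch of a registry that keeps its sources in a List, so that the first registered source supporting a feature wins. It is not the Appendix G.1 code; Source is a hypothetical stand-in for DOMImplementationSource, and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OrderedRegistrySketch {
    /** Hypothetical stand-in for a DOMImplementationSource: a name plus supported features. */
    static final class Source {
        final String name;
        final Set<String> features;

        Source(String name, String... features) {
            this.name = name;
            this.features = new HashSet<>(Arrays.asList(features));
        }
    }

    // A List preserves registration order; Hashtable enumeration order is unspecified.
    private final List<Source> sources = new ArrayList<>();

    void addSource(Source source) {
        sources.add(source);
    }

    /** Returns the name of the first registered source supporting the feature, or null. */
    String firstSupporting(String feature) {
        for (Source s : sources) {
            if (s.features.contains(feature)) {
                return s.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        OrderedRegistrySketch registry = new OrderedRegistrySketch();
        registry.addSource(new Source("first", "Core", "XML"));
        registry.addSource(new Source("second", "Core", "LS"));
        System.out.println(registry.firstSupporting("Core"));
        System.out.println(registry.firstSupporting("LS"));
    }
}
```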
The draft states:
"
An attribute specifying, as part of the XML declaration,
the version number of this document. If there is no
declaration, the value is "1.0"
.
"
HTML (served as text/html) documents do not have XML
declarations, and in this case it seems to me this property
should return null
, not "1.0"
.
Am I correct in making this assumption? Or should this
really be returning "1.0"
in that case as
well?
Document interface, "xmlStandalone" and "xmlVersion" attributes: both descriptions say that there is a NOT_SUPPORTED_ERR if this document *does* support the "XML" feature. Shouldn't that be does *not* support?
Document.xmlVersion
should be
null
if not dealing with XML.
The description should say that the exception is raised if the XML feature is *not* supported.
" typically using uppercase for element names and lowercase for attribute names "
The "typical" behavior in the L3 Core is contrary to DOM L2 HTML, which specifies that both attribute names and element names should be uppercase, which currently only Opera and Konqueror implement. The best solution I see is to issue an erratum to L2 HTML specifying that attribute names should be lowercase. Otherwise, this sentence should be expanded so that it does not appear to recommend behavior contrary to L2 HTML.
The sentences:
"For instance, element and attribute names are exposed as all uppercase (for consistency) when used on an HTML document, regardless of the character case used in the markup. Since XHTML is based on XML, in XHTML everything is case sensitive, and element and attribute names must be lowercase in the markup. "
should read
"For instance, element names are exposed as all uppercase (for consistency) when used on an HTML document, regardless of the character case used in the markup. The names of attributes defined in HTML 4.01 are also exposed as all lowercase, regardless of the character case used in the markup, but for other attributes (i.e. ones that are not defined by HTML 4.01), the character casing is implementation dependent. Since XHTML is based on XML, in XHTML everything is case sensitive, and element and attribute names must be lowercase in the markup."
propose an erratum for DOM Level 2 HTML
To contact Opera and Konqueror regarding the erratum
There has been a bug on www-dom-ts to ask for clarification if hasFeature("Core", "1.0") should return true since the L1 recommendation only had "XML" and "HTML" features. This sentence should reflect the resolution of that issue. At the end of the first paragraph of 1.4, "1.0" is omitted which could be interpreted as supporting the "Core didn't exist in L1" position.
The feature "Core" wasn't defined in "1.0". The text in DOM Level 2 Core is indeed misleading and should not make any claims regarding hasFeature("Core", "1.0"). The DOM Level 3 Core was changed as follows:
" For example, this specification defines the features "Core" and "XML", and thus for the version "3.0". Versions "1.0" and "2.0" can also be used for features defined in the corresponding DOM Levels. "
" ordered collection of parallel pairs of name and namespace values "
should read
ordered collection of qualified names
We change NameList: "ordered collection of parallel pairs of name and namespace values (which could be null values)"
add a descriptive paragraph for NameLists in Validation.
Reply to Curt, as soon as Ben is done.
getName()
does not define whether the return
value is a local name or might contain a namespace prefix.
I would assume that the local name would be preferable.
Changing the name to getLocalName()
would be
clearer and consistent with XPath.
Correct. This depends on its usage. NameList is meant to be a generic interface and used in different ways depending on the context. The DOM Level 3 Validation draft has been clarified to make that context clear.
add a descriptive paragraph for NameLists in Validation.
Reply to Curt, as soon as Ben is done.
Throwing an exception on out of range indexes is not
consistent with DOMStringList
and other
lists. I can understand the motivation since
getNamespaceURI()
could be null
before the end of the list, however you could distinguish
between a null
namespace and end of the list
since getName
would be null
at
the end of the list.
The index is declared as unsigned. The exceptions have been removed. getName() can return null even before the end of the list, if wildcards are in use for example. (as indicated in ElementEditVal) The application can only rely on NameList.length in order to know the end of the list.
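The point about relying on NameList.length can be sketched in Java against the org.w3c.dom.NameList binding. ArrayNameList below is a hypothetical array-backed implementation written only for this illustration, and the null name stands in for a wildcard entry.

```java
import java.util.Objects;
import org.w3c.dom.NameList;

class NameListSketch {
    /** Minimal array-backed NameList; both arrays are assumed to have the same length. */
    static final class ArrayNameList implements NameList {
        private final String[] namespaces;
        private final String[] names;

        ArrayNameList(String[] namespaces, String[] names) {
            this.namespaces = namespaces;
            this.names = names;
        }

        public String getName(int index) {
            return index >= 0 && index < names.length ? names[index] : null;
        }

        public String getNamespaceURI(int index) {
            return index >= 0 && index < namespaces.length ? namespaces[index] : null;
        }

        public int getLength() {
            return names.length;
        }

        public boolean contains(String str) {
            for (String n : names) {
                if (Objects.equals(n, str)) {
                    return true;
                }
            }
            return false;
        }

        public boolean containsNS(String namespaceURI, String name) {
            for (int i = 0; i < names.length; i++) {
                if (Objects.equals(namespaces[i], namespaceURI) && Objects.equals(names[i], name)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Unreliable: stops at the first null name, which may occur before the end of the list. */
    static int naiveCount(NameList list) {
        int i = 0;
        while (list.getName(i) != null) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        NameList list = new ArrayNameList(
                new String[] { null, "urn:x", null },
                new String[] { "a", null, "c" });
        System.out.println(naiveCount(list) + " vs " + list.getLength());
    }
}
```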
I dislike the form of this interface for a couple of reasons: it requires each implementation source to parse the features list, which could have been done once for all implementation sources, and it enables the implementation source to return inconsistent first implementation sources. I'd suggest something like
interface DOMImplementationSource {
  DOMImplementation getDOMImplementation(DOMStringList features, DOMStringList versions, unsigned int index);
}
I believe that eliminates any use of
DOMImplementationList
so that interface could
be eliminated.
We clarified the description of getDOMImplementation as follows:
" This method returns the first item of the list returned by getDOMImplementationList. "
getDOMImplementationList()
seemed to be
required of an implementation source but never used by a
DOMImplementationRegistry.
This question was reconsidered.
DOMImplementationRegistry.getDOMImplementationList was broken and has been fixed.
Just a minor further nitpick on L3 Core: the DOMError ErrorSeverity definition group starts at 0; for consistency with the rest of the spec it should probably be 1.
I complained originally that what exactly DOMError severity meant was not adequately defined in the WD. Having now implemented it this way, I'd suggest something like the following:
SEVERITY_WARNING
error will not cause
processing to stop unless a DOMErrorHandler
returns false
. If there is no
DOMErrorHandler
set up, processing will always
continue.
SEVERITY_ERROR
error will cause
processing to stop unless a
DOMErrorHandler
returns
true
. If there is no
DOMErrorHandler
set up, processing will always
stop.
SEVERITY_FATAL_ERROR
will always cause
processing to stop. Return value from
handleError
is ignored.
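The semantics proposed above can be written as a small decision function. This Java sketch encodes the reviewer's proposal only, not confirmed specification text; the class and method names are hypothetical, and treating unrecognized severities like fatal errors is an assumption.

```java
import org.w3c.dom.DOMError;

class SeverityPolicySketch {
    /**
     * Decides whether processing continues after an error is reported.
     * handlerResult is the value returned by handleError, or null when no handler is set.
     */
    static boolean continueProcessing(short severity, Boolean handlerResult) {
        switch (severity) {
            case DOMError.SEVERITY_WARNING:
                // Continue unless a handler explicitly returns false.
                return handlerResult == null || handlerResult;
            case DOMError.SEVERITY_ERROR:
                // Stop unless a handler explicitly returns true.
                return handlerResult != null && handlerResult;
            default:
                // SEVERITY_FATAL_ERROR (and, by assumption, anything unrecognized) always stops.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(continueProcessing(DOMError.SEVERITY_WARNING, null));
        System.out.println(continueProcessing(DOMError.SEVERITY_ERROR, null));
        System.out.println(continueProcessing(DOMError.SEVERITY_FATAL_ERROR, true));
    }
}
```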
Going back to DOM 2, this attribute's description has
started "The namespace URI of this node, or null
if it is
unspecified". This seems quite clear, and in
programming languages like Java, I expect a null
value in
that language to be returned.
However, there are implementations that do not use what I would consider to be a null. For example, Oracle's XML parser's Node.getNamespaceURI() returns an empty (zero-length) string, and Microsoft's .NET framework (the NamespaceURI property of the System.Xml.XmlNode class) likewise returns an empty string.
I would like to see DOM 3 Core clarify or amend the
statement "The namespace URI of this node, or null
if it
is unspecified". If implementations returning empty
strings are to be considered out-of-spec, please specify
this explicitly. If returning a zero-length string is an
acceptable implementation according to the standard,
please state this. If this issue is clarified elsewhere
in the DOM 3 spec, please refer to that place from the
description of Node.namespaceURI.
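Until the point is clarified, portable application code can guard against both conventions. The following Java sketch is one possible workaround and is not something the specification prescribes; the helper names are hypothetical.

```java
import org.w3c.dom.Node;

class NamespaceUriSketch {
    /** Maps an empty namespace URI to null so that both conventions compare equal. */
    static String normalize(String namespaceURI) {
        return (namespaceURI == null || namespaceURI.isEmpty()) ? null : namespaceURI;
    }

    /** True if the node is in no namespace, whether the implementation reports null or "". */
    static boolean hasNoNamespace(Node node) {
        return normalize(node.getNamespaceURI()) == null;
    }

    public static void main(String[] args) {
        System.out.println(normalize(""));
        System.out.println(normalize("urn:example"));
    }
}
```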
Not adequately explained. Is this the encoding at the time of parsing? Do subsequent saves change the value? initialEncoding might be better.
The description is accurate. It is the encoding used while loading the document. The value is read-only and cannot be changed, including by the LS module. write* operations in LS never modify the document.
I assume that you meant "adequate" in the first sentence in the previous paragraph and I would disagree since I could not find anything in the spec like the second sentence in the previous paragraph. Is there some other recommendation that already uses the term "actual encoding" in this manner?
If that is your intention, I think that "initialEncoding" is clearer.
This question was reconsidered.
In order to clearly express that it represents the encoding used during the load of the Document, the attribute was renamed "inputEncoding". This matches the idea of an input, introduced by the DOMInput interface in the LS module.
Using an abbreviation is unusual.
Should it be possible (or required) for an implementation to raise an exception if a new value is set that is not a valid URI?
We didn't want to go into the issue of URI/IRI checking, so no exception or error if you set to an invalid one. I clarified that no lexical checking was done when setting documentURI. baseURI will return null if an absolute URI cannot be determined. Note that we don't check the xml:base attributes either.
Can an implementation raise an exception if it does not recognize the encoding on setting? How is this affected by saves? Does this affect saves (which would be in the L/S spec)? Maybe it is cleaner just to allow the encoding to be specified on the save request.
xmlEncoding has been changed to read-only, in order to simplify the computation of the encoding used at save time (i.e. only DOMOutput.encoding could be changed). Save defines an "unsupported-encoding" error if the encoding is not supported.
What occurs when this attribute is set to
true
and the document does not satisfy the
requirements for standalone="yes"
?
normalizeDocument and the DOMSerializer will catch it, as defined by the XML specification: this is a validity constraint. I added text to that effect in the description of xmlStandalone.
There should be something in L/S that allows you to
specify that you want the document serialized with
standalone="yes"
which would either place
everything in the internal subset or expand entity
references and explicitly serialize default attribute
values.
It would be dangerous to start doing "magic" based on the value of the standalone attribute. XML parsers are required to check the value of standalone when validating. It is therefore logical to do the same in the DOM and make the check dependent on the validate and validate-if-schema parameters. No change is needed in the specifications since this error type is already controlled by XML.
The difference between adoptNode
and
importNode
is not immediately obvious.
We think that the current description is clear enough.
It would be good if the first sentences in each description were parallel. For example, adopt could say:
"Attempts to adopt a node from another document to this document. [...] If supported, the source node is removed from the original document and altered changing the ownerDocument of the node and any descendants..."
The new text says:
"Attempts to adopt a node from another document to this document. If supported, it changes the ownerDocument of the source node, its children, as well as the attached attribute nodes if there are any. If the source node has a parent it is first removed from the child list of its parent. This effectively allows moving a subtree from one document to another (unlike importNode() which create a copy of the source node instead of moving it). When it fails, applications should use Document.importNode() instead."
The fact that this does not throw an INVALID_CHARACTER_ERR when a 1.0 document adopts a node containing names not legal in 1.0 is clarified but really bizarre. Why is this different from importNode()?
importNode will invoke createElement -> so exception
adoptNode will never invoke createElement -> so no exception
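The move-versus-copy distinction between the two methods can be observed with a short Java sketch. It assumes the JDK's bundled DOM implementation, where both documents come from the same implementation so that adoption is expected to succeed; the class and method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class AdoptImportSketch {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates <root><item/></root> in a fresh document and returns the item element. */
    static Element newAttachedItem() {
        Document src = newDocument();
        Element root = src.createElement("root");
        src.appendChild(root);
        Element item = src.createElement("item");
        root.appendChild(item);
        return item;
    }

    /** importNode copies: the source keeps its parent and a different node is returned. */
    static boolean importLeavesSourceInPlace() {
        Element item = newAttachedItem();
        Document dst = newDocument();
        Node copy = dst.importNode(item, true);
        return copy != item && item.getParentNode() != null && copy.getOwnerDocument() == dst;
    }

    /** adoptNode moves: the same node object changes owner and is detached from its parent. */
    static boolean adoptMovesSource() {
        Element item = newAttachedItem();
        Document dst = newDocument();
        // adoptNode may return null if the implementation refuses; that counts as "not moved".
        Node adopted = dst.adoptNode(item);
        return adopted == item && item.getParentNode() == null && item.getOwnerDocument() == dst;
    }

    public static void main(String[] args) {
        System.out.println(importLeavesSourceInPlace());
        System.out.println(adoptMovesSource());
    }
}
```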
"or null
if the operation fails, such as when the source
node comes from a different implementation".
This seems to be an opening for an implementation to always return null. The expected failure scenarios should be enumerated as exceptions.
Correct, it is indeed an opening for an implementation to refuse to adopt a node from one document to another. Several failure scenarios can happen, predictable or implementation dependent ones, so any list would be incomplete.
What is the behavior when attempting to change the attribute name to the name of an attribute that already exists on the element?
The current description says:
" When the node being renamed is an Attr that is attached to an Element, the node is first removed from the Element attributes map. Then, once renamed, either by modifying the existing node or creating a new one as described above, it is put back. "
i.e. it will replace the old attribute node, since it is equivalent to the removal of the Attr node, then set it back using setAttributeNode.
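The described remove, rename and set-back sequence can be exercised with a short Java sketch. It assumes the JDK's bundled DOM implementation, and the attribute names "a" and "b" and the class name are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RenameAttrSketch {
    /**
     * Renames attribute "a" (value "1") to "b" on an element that already has b="2".
     * Returns "<attribute count>:<value of b>" after the rename.
     */
    static String renameOntoExisting() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element e = doc.createElement("e");
        doc.appendChild(e);
        e.setAttribute("a", "1");
        e.setAttribute("b", "2");
        Attr a = e.getAttributeNode("a");
        // No namespace is given, so this is a Level 1 style rename.
        doc.renameNode(a, null, "b");
        return e.getAttributes().getLength() + ":" + e.getAttribute("b");
    }

    public static void main(String[] args) {
        System.out.println(renameOntoExisting());
    }
}
```

If the renamed node replaces the old "b" as the resolution describes, a single attribute remains and it carries the value of the renamed node.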
The two APIs getDOMImplementation and getDOMImplementations could cause confusion. I would prefer if a more distinct name is used, e.g. getDOMImplementationList. (Similarly, I would prefer if getFeatures is changed to getFeatureList.)
To be consistent with the other attributes, probably it should be added that "This attribute is false when unspecified".
This is said to be from the XML declaration, but is boolean whereas the XML declaration can specify 3 values: "yes", "no" and not specified. Either the datatype should be changed or (my preference) it should be specified that this is true when the XML declaration says "yes", false otherwise.
Appendix C.1.1 says that this comes from the [standalone] property, but does not address the case where the property has no value.
It is not clear whether the text should be returned when it contains "any" whitespace or when it has "all" whitespace as its element content.
How do the definitions differ from XPath's axes? Why CONTAINED_BY instead of ANCESTOR and CONTAINS instead of DESCENDANT? Could there not be bits to indicate parent, child, attribute, sibling and self? Is there a compelling reason to fabricate an order for attributes of the same element or for nodes in different documents?
I would think that Node.compareDocumentPosition would be very prone to usage errors. Most scripting languages would not provide symbolic constants and it would be relatively easy to misuse bit-wise operators.
It would seem a whole lot simpler and more usable to provide boolean methods that correspond to the XPath axes:
boolean isChild(Node node);
boolean isDescendant(Node node);
boolean isParent(Node node);
boolean isAncestor(Node node);
boolean isFollowingSibling(Node node); // always false for attribute
boolean isPrecedingSibling(Node node); // always false for attributes
boolean isSibling(Node node);
boolean isFollowing(Node node);
boolean isPreceding(Node node);
boolean isAttribute(Node node);
Plus the existing isSame(Node node) for the self axis. descendant-or-self and ancestor-or-self could be handled by or'ing isSame and the appropriate other axis test.
Adding a method to determine if the nodes are in the same document might also be helpful.
If you are only interested if the particular pair are, for example, parent and child, it should be more efficient to only check that instead of figuring out how they are related.
Adding several methods to achieve the current functionality of compareDocumentPosition is not a compelling reason to change the current proposal, especially since it is easier to add a new constant than a new method in future extensions. So we still prefer the current proposal in the specification.
It would be useful if there was some statement that if
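Boolean helpers of the kind the reviewer asks for can be layered on top of compareDocumentPosition by application code. The following Java sketch shows two of them; the helper names are hypothetical, and the JDK's bundled DOM implementation is assumed for the demonstration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class PositionSketch {
    /** True if candidate is a descendant of reference (CONTAINED_BY bit is set). */
    static boolean isDescendant(Node reference, Node candidate) {
        short pos = reference.compareDocumentPosition(candidate);
        return (pos & Node.DOCUMENT_POSITION_CONTAINED_BY) != 0;
    }

    /** True if candidate is an ancestor of reference (CONTAINS bit is set). */
    static boolean isAncestor(Node reference, Node candidate) {
        short pos = reference.compareDocumentPosition(candidate);
        return (pos & Node.DOCUMENT_POSITION_CONTAINS) != 0;
    }

    /** Builds root > child > grandchild and checks both helpers against it. */
    static boolean demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        Element child = doc.createElement("child");
        Element grandchild = doc.createElement("grandchild");
        doc.appendChild(root);
        root.appendChild(child);
        child.appendChild(grandchild);
        return isDescendant(root, grandchild)
                && isAncestor(grandchild, root)
                && !isDescendant(grandchild, root);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Wrapping the bit tests in named helpers keeps the bitwise operators in one place, which addresses the usability concern without adding methods to Node.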
DOMImplementation.hasFeature(feature,
version)
returns true
that
Node.getFeature(feature, version)
could not
return null
or something else that would
allow you to always use getFeature()
and not
have to try both getFeature()
and casting.
You are comparing DOMImplementation.hasFeature and Node.getFeature (instead of Node.isSupported and Node.getFeature?). If Node.isSupported("+Events", "3.0") returns true, then it is guaranteed that getFeature will work on that node with the same feature/version.
This would be more appropriate as a read-only attribute. The description is insufficient since it doesn't explicitly explain under what conditions the implementation is able and required to determine that the content model only contains elements.
Done. For Java, if an attribute starts with "is" that is followed by an upper case character, then the getter keeps the name as-is. The setter replaces "is" with "set". Only exception is HTMLInputElement.isMap as suggested. isId and isWhitespaceInElementContent (now renamed to isElementContentWhitespace) are attributes again.
We renamed the method to make it clear that it reflects the Infoset property "element-content-whitespace". Also added a link to the Infoset in the description.
Attr.isId()
is declared as a method, though
it would be more appropriate as an attribute. If made a
read-write attribute, then
Element.setIdAttribute()
,
Element.setIdAttributeNS()
and
Element.setAttributeNode()
could be
eliminated since you could do the equivalent using
Element.getAttributeNode[NS]().isId = true | false
.
done, but as a read-only attribute, since that would require the implementation to create an attribute node for the invocation of isId. The purpose of the setIdAttribute* is to provide the ability for the element node to be returned by getElementById, which does not require creating an Attr node.
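The interplay between setIdAttribute, the read-only isId and getElementById can be sketched in Java as follows. The sketch assumes the JDK's bundled DOM implementation; the element name, the attribute name "key" and the class name are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class IdAttributeSketch {
    /** Marks an ordinary attribute as an ID and checks isId() and getElementById(). */
    static boolean demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element item = doc.createElement("item");
        doc.appendChild(item);
        item.setAttribute("key", "x1");
        // The element, not the Attr, is asked to treat "key" as an ID attribute.
        item.setIdAttribute("key", true);
        Attr key = item.getAttributeNode("key");
        return key.isId() && doc.getElementById("x1") == item;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```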
The description of
DOMErrorHandler.handleError
mentions that the
error parameter may be reused across multiple calls. That
would suggest the need for a clone method on
DOMError
in case a handler wishes to hold on to
the error.
I would prefer enhancing Events to enable UserData
maintenance, though that may require defining events types
for cloning and importing and adding a
DOMStringList
userDataKeys attribute to
Node
.
If not, does there need to be a NODE_ADOPTED
operation type?
NODE_DELETED
is problematic. However,
mimicking DOMNodeRemovedFromDocument
would be
well defined for garbage-collected platforms.
Having attributes that allow you to readily determine if the type is complex vs simple or named vs anonymous would seem to be helpful. Such as:
readonly attribute boolean isSimple;
readonly attribute boolean isAnonymous; (may be redundant with typeName == null)
XML Schema Datatypes has equivalents for the DTD attribute
types which could be used instead of null
and attribute
type property. The current definition would not provide
an easy way to distinguish between:
<!ATTLIST foo name CDATA #IMPLIED>
and
<!-- contrived, no namespace schema -->
<xsd:schema>
  <xsd:simpleType name="CDATA"/>
  ...
  <xsd:attribute name="name" type="CDATA"/>
An anonymous type could be nested many levels down in the
content model, a containingElements NameList
attribute could be used to enumerate the element names.
For the schema:
<xsd:schema targetNamespace="http://www.example.com/typeinfo">
  <xsd:element name="hello">
    <xsd:complexType>
      <xsd:choice>
        <xsd:element name="world">
          <xsd:simpleType>...</xsd:simpleType>
        </xsd:element>
The TypeInfo
associated with the world element in:
<hello xmlns="http://www.example.com/typeinfo">
  <world/>
</hello>
would have
typeName == null,
typeNamespace == null,
isSimple == true,
containingElements.getNamespace(0) == "http://www.example.com/typeinfo",
containingElements.getName(0) == "hello"
containingElements.getNamespace(1) == null (if elementFormDefault = false) or "http://www..."
containingElements.getName(1) == world
Why not look at the ancestors of the node where the TypeInfo is attached?
typeName and typeNamespace cannot both be null if there is a declaration. In your case, it will expose the namespace and local name of the corresponding anonymous type name, with the anonymous type name being an implementation-defined, globally unique qualified name provided by the processor for every anonymous type declared in a schema.
The description of the boolean return value appears to describe three different behaviors by explaining how a value of true differs from the default behavior and how a value of false differs from the default behavior. But there is no way of specifying that you want the implementation to do the default behavior.
This should be split up into a distinct byteOffset and characterOffset. An implementation could return values for both, a value for one and -1 for the other or -1 for both. However, the user doesn't have to somehow know the nature of the source to interpret the values.
There should be two attributes, one for byte offset and the other for character offset (or alternatively another attribute that says whether "offset" is byte or character), since the application may not be able to determine if the source was bytes or characters.
offset has been split into byteOffset and utf16Offset (since the DOM deals with UTF-16 units, not characters).
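Why the two offsets differ can be seen from a short Java sketch that locates a marker in a string and reports its position both in UTF-16 units and in bytes. UTF-8 is an assumed source encoding chosen for the example, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class OffsetSketch {
    /** Offset of the first occurrence of marker, in UTF-16 code units, or -1 if absent. */
    static int utf16Offset(String source, String marker) {
        return source.indexOf(marker);
    }

    /** Offset of the same occurrence in bytes, assuming the source was encoded as UTF-8. */
    static int utf8ByteOffset(String source, String marker) {
        int index = source.indexOf(marker);
        if (index < 0) {
            return -1;
        }
        // Encode only the prefix before the marker and count its bytes.
        return source.substring(0, index).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String accented = "\u00e9<a/>";            // one BMP character before the tag
        String supplementary = "\uD83D\uDE00<a/>"; // one character outside the BMP before the tag
        System.out.println(utf16Offset(accented, "<") + " / " + utf8ByteOffset(accented, "<"));
        System.out.println(utf16Offset(supplementary, "<") + " / " + utf8ByteOffset(supplementary, "<"));
    }
}
```

The second string shows that a UTF-16 offset is not a character offset either: a single character outside the Basic Multilingual Plane occupies two UTF-16 units.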
White space is two words (see all prose and productions in
XML
1.0 except one typo in 1998 and the Infoset since
WD-xml-infoset-20010202).
I don't know why it wound up being one word in
[element content whitespace]. Do you? isWhitespaceInElementContent
could (should?) be
isWhiteSpaceInElementContent
and whitespace-in-element-content
could be white-space-in-element-content
. If
Last Call is too late to change those, perhaps at least
the prose could match XML and the Infoset to say "white
space" (rather than "whitespaces").
The name of the method isWhitespaceInElementContent has been changed to isElementContentWhitespace to better reflect that it refers to the Infoset property "element-content-whitespace". Same kind of change has been applied to the "whitespace-in-element-content" parameter. Since it is too late to rename the Infoset property, and our method and parameter are mapped to it, we don't plan to rename them, whatever the outcome of the whitespace vs white space is. Regarding the prose itself, we will happily change to whatever becomes the recommendation, and thus at any stage.
the description doesn't say that this is supposed to be the qualified name of the node. Nor do the descriptions of Element.tagName and Attr.name. One has to determine that from the description of Node.prefix!
The spec should clearly specify that when retrieving the value of an attribute that contains a reference to an entity for which no definition is available, the processor will treat this entity reference as an empty reference (see the reply to C15 in Core I18N response). Same comment for Element.getAttribute() and Element.getAttributeNS().
The 4th paragraph starts "XML does not mandate that a non-validating XML processor read and process entity declarations made in the external subset or declared in external parameter entities." The last occurrence of "external" is superfluous and somewhat misleading, since non-validating processors are not obligated to read even *internal* parameter entities. This latter point is pretty obscure and under-documented in the XML spec, but see the last sentence in XML 1.0, as well as published erratum E8 at XML 1.0 Erratum E8.
void Element.normalizeNamespaces() {
  ...
  // Fixup element's namespace
  //
  if ( Element's namespaceURI != null ) {
    ...
  } else {
    // Element has no namespace URI:
    if ( Element's localName is null ) {
      ...
    } else {
      // Element has no namespace URI
      // Element has no pseudo-prefix
      if ( default Namespace in scope is "no namespace" ) {
        ==> do nothing, we're fine as we stand
      } else {
        if ( there's a conflicting local default namespace declaration already present ) {
          ==> change its value to use this empty namespace.
        } else {
          ==> Set the default namespace to "no namespace" by creating or changing a local declaration attribute: xmlns="".
        }
There seems to be useless redundancy in the last "if". Paraphrasing: "If there's a declaration, change it, else create one or change an existing one". Either drop the "if" and keep the "else" part or keep the "if" and drop "or changing" from the "else" part.
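The redundancy can be made concrete with a small Python sketch of the simplified branch. This is an illustration under stated assumptions, not the specification's algorithm: the function name fix_default_namespace is hypothetical, and the element's local namespace declarations are modelled as a plain dict from attribute name to value.

```python
def fix_default_namespace(local_decls, in_scope_default):
    """Make an element with no namespace URI and no prefix serialize correctly.

    local_decls: dict of the element's own declaration attributes (hypothetical model).
    in_scope_default: default namespace URI in scope, or None for "no namespace".
    """
    if in_scope_default is None:
        # Default namespace in scope is already "no namespace": nothing to do.
        return local_decls
    # One assignment covers both cases: it changes a conflicting local
    # declaration if present and creates xmlns="" otherwise.
    local_decls["xmlns"] = ""
    return local_decls


changed = fix_default_namespace({"xmlns": "http://example.org/ns"}, "http://example.org/ns")
created = fix_default_namespace({}, "http://example.org/inherited")
untouched = fix_default_namespace({}, None)
```

Because "create or change" already subsumes "change", the inner if/else in the pseudocode collapses to a single step.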
The first sentence (after the initial Note) is really hard to parse. Comments in the algorithms say things like "if the prefix/namespace pair is within scope...", using this wording would be clearer. It's not an element that's within scope, it's really the prefix/namespaceURI pair. Some rewording/tightening needed.
The section was reworded a bit.
Look at Appendix B1.1 for further changes (if necessary).
In C 1.1.1, Document.actualEncoding should be set to [character encoding scheme], which is "The name of the character encoding scheme in which the document entity is expressed." in the infoset spec, not necessarily the value of the encoding pseudo-att of the XML declaration. Document.xmlEncoding probably should not be set.
In appendix C.1.2, [character encoding scheme] should be set from Document.actualEncoding.
Document.xmlVersion should be "The [version] property or 1.0 if the latter has no value."
Same, mutatis mutandis, for xmlStandalone (false if no value).
[notations] should be "Document.doctype.notations", in order to point to the correct DocumentType object.
the statement "Element nodes with no namespace URI (Node.namespaceURI equals to null) cannot be represented using the Infoset." is counterfactual. The infoset supports these, provided names do not contain colons.
Node.previousSibling and Node.nextSibling should be set to null.
it should be clarified that CDATASection nodes cannot occur from an infoset mapping, since the infoset doesn't contain CDATA section boundaries.
This is very much improved since the previous version, but unfortunately still not totally clear. Since the DOM stores documents in UTF-16 exclusively, these attributes must necessarily refer to the encoding of a serialized document that is parsed to create the DOM tree; since xmlEncoding is not read-only, it can also be set programmatically, but that shouldn't change its semantics. Semantics is precisely where some dark spots remain. actualEncoding is pretty clearly defined as the actual encoding of the parsed document, supposedly gleaned from the parser. xmlEncoding is then said to be taken from the XML declaration, but Appendix C.1.1 says that xmlEncoding is supposed to come from the infoset's [character encoding scheme] property. The latter is defined as "The name of the character encoding scheme in which the document entity is expressed", matching the semantics of actualEncoding, not those of an encoding label read from the XML declaration. So the meaning of xmlEncoding remains pretty murky.
One wonders why there are actually 2 attributes, since there is only one encoding of interest: that of the document that was parsed to create the DOM tree. If the intent was to enable DOM users to control encoding during later serialization, this is defeated by the order of priorities specified in DOMSerializer.write(): actualEncoding precedes xmlEncoding. The former being read-only, the user has no control.
xmlEncoding is now read-only, and only represents what was found in the XML declaration, if any. actualEncoding is the encoding used to load the document, again if any. For the Save module, it should be clarified that the XML declaration, if generated, will carry whatever encoding was used for the serialization, which is not necessarily actualEncoding or xmlEncoding.
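The separation between the declared encoding and the encoding used at serialization time can be sketched with Python's xml.dom.minidom as a stand-in. This is an assumption-laden analogy: minidom is not a DOM Level 3 Load and Save implementation, and its Document.encoding attribute is used here only as a rough counterpart of xmlEncoding.

```python
from xml.dom import minidom

# A document declared and encoded as ISO-8859-1.
source = '<?xml version="1.0" encoding="ISO-8859-1"?><a>\u00e9</a>'.encode("iso-8859-1")
doc = minidom.parseString(source)

# The label read from the XML declaration (rough analogue of xmlEncoding).
declared = doc.encoding

# Serializing with a different encoding: the generated XML declaration
# carries the encoding actually used for the output, not the declared one.
serialized = doc.toxml(encoding="utf-8")
print(declared)
print(serialized)
```

In this sketch the value read at load time stays informational, while the serializer decides what goes into the new declaration.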
Should specify, like createAttribute() and others, that an INVALID_CHARACTER_ERR exception can be thrown, depending on the "xmlVersion" attribute.
fixed
This should also perform character normalization, perhaps conditional to the config of the containing Document. This method's business in life is to concatenate Text nodes; concatenation is one of the well-known cases that actually produces character denormalization. It would be silly to have a method called normalize() which actually denormalizes, so any denormalizations caused by concatenation should be repaired as part of the method's normal functioning. Backward compatibility can probably be addressed by making the repairs conditional on xmlVersion or the config of the containing document or both.
Also, it should be specified that this method is sensitive to the value of the "cdata-sections" config parameter.
normalize() is a DOM Level 1 method. The name is unfortunate since it collides with character normalization, but we cannot change its semantics or rename it. This explains the introduction of normalizeDocument(), instead of reusing normalize() on Document nodes. Another example of a discrepancy in names is our namespaceURI and the [namespace name] Infoset property.
This doesn't seem to address the comment. Backward compatibility is certainly an issue, but not necessarily a show-stopper: there are numerous instances of "Modified in DOM Level 3" in the spec. We did offer some ideas for addressing compatibility, they may be dead-on-arrival but we would like to understand why.
This question was reconsidered.
The issue was reconsidered and we agreed with the recommendation.
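The reviewer's point that concatenating Text nodes can denormalize text can be reproduced with a short Python sketch. It uses xml.dom.minidom and unicodedata purely as an illustration; minidom's normalize() only merges adjacent Text nodes, and the explicit NFC repair at the end is an assumption about what a normalization-aware implementation might do, not behaviour taken from the specification.

```python
import unicodedata
from xml.dom import minidom

doc = minidom.parseString("<p/>")
p = doc.documentElement

# Two adjacent Text nodes, each of which is in NFC on its own:
# a plain "e" and a lone combining acute accent.
p.appendChild(doc.createTextNode("e"))
p.appendChild(doc.createTextNode("\u0301"))

# Node.normalize() merges the adjacent Text nodes into one.
p.normalize()
merged = p.firstChild.data

# The concatenated string is no longer in NFC.
is_nfc = unicodedata.normalize("NFC", merged) == merged

# Repairing the text after concatenation yields the precomposed character.
repaired = unicodedata.normalize("NFC", merged)
print(len(p.childNodes), is_nfc, repaired)
```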
Are the various methods supposed to maintain character normalization? Under the control of the config of the containing Document? Of "strictErrorChecking"?
The config parameters "check-character-normalization" and "normalize-characters" appear to be pertinent, but neither their descriptions nor the descriptions of the CharacterData.* methods say that they have any effect for these methods.
The description of the DOMString interface has been changed to reflect the fact that only normalizeDocument has control over character normalization.
some rewording would be appreciated.
This question was reconsidered.
The paragraph was reworded to include the change on Node.normalize() as well.
We need a section somewhere that says nothing performs character normalization except normalizeDocument, when requested.
Characters are fully-normalized according to the rules defined in [CharModel] supplemented by the definitions of relevant constructs from Section 2.13 of [XML 1.1], if the normalization happened at load time, or by calling the method Document.normalizeDocument() (in both cases, the parameter "normalize-characters" needs to be true). Note that, with the exception of Document.normalizeDocument(), manipulating characters using DOM methods does not preserve a fully-normalized text.
This should default to false. CDATA sections are mere syntactic sugar with no structural role (hint: they do not exist in the infoset), they do not deserve to be preserved by default.
The parse methods of the LS module don't load CDATA sections by default (the "infoset" parameter is true by default, which implies that the cdata-sections default is false for the parse methods). So unless an application adds CDATASection nodes during manipulations, the "cdata-sections" parameter won't change anything in the tree. And if the application does add CDATASection nodes in the tree, or the parse operation was requested to preserve the CDATA sections, then they should be preserved by default since the application explicitly asked to get them.
it is not clear *when* this setting has any effect (i.e. what methods of what interfaces it affects). Since Charmodel says that text SHOULD be checked, the default for this should be true, the user having the chance to set it to false after careful consideration of the consequences (see definition of SHOULD in RFC2119).
The parameter check-character-normalization is optional so the default cannot be true. Applications can certainly check if the parameter is activated, or can be activated, using the methods defined on the DOMConfiguration object.
The optional character of check-character-normalization is the first wrong (the other being that "true" is not the default). As argued in our comment, a DOM user cannot do the right thing (check normalization unless careful consideration of the consequences...) if the normalization checking functionality is absent. The DOM is missing functionality essential for things as simple and basic as string matching. We object to that.
For performance reasons, we believe the character normalization cannot be true by default. Also, no one in the WG committed to implement this feature.
The reference to Unicode 3.0 should be updated to Unicode 4.0, ISBN 0-321-18578-1.
fixed
We consider this section overly vague. At least two points should be improved:
The paragraph was reworded to include the proposed changes.
send a proposal to fix DOM URIs section.
Would an iterator make sense in the DOMConfiguration? That would allow the properties to be discovered dynamically, instead of hardwiring strings like "canonical-form" in applications. Or perhaps a getNextProperty()?
Added a read-only attribute parameterNameList, of type DOMStringList, on the DOMConfigurationList. This gives you the possibility of iterating through the parameter names supported by the object. Note that this list can also contain parameter names defined outside the specification, i.e. extensions provided by the implementation.
Some methods may raise several types of exceptions. There can be situations where more than one exceptional condition is met at the same time. Is it described anywhere how to determine the precedence of exceptions?
For example if I try to call insertBefore method on a readonly node with newChild being the node created from a different document, which exception should be raised: NO_MODIFICATION_ALLOWED_ERR or WRONG_DOCUMENT_ERR?
It appears that the DOM Core spec doesn't specify what should happen if the prefix attribute is set on a node of a type other than ELEMENT_NODE or ATTRIBUTE_NODE. It does state that the prefix is always null, but should setting it be a no-op or throw an exception?
Consider the interface CharacterData from DOM Level 1. The method deleteData has an unsigned long parameter named count. The technical report says that if the specified count is negative the exception INDEX_SIZE_ERR should be raised. But since count is an unsigned long, it makes no sense to say an exception should be raised if the count is negative; an unsigned long can't be negative.
In practice, this creates a problem for language binding implementors. For example, the W3C DOM test suite has an ECMAScript test that attempts to pass a negative number to deleteData, and expects INDEX_SIZE_ERR. But conceptually, it's the ECMAScript DOM language binding that is going to encounter the need to convert the negative number to an unsigned long, and I think it may be unreasonable to expect the language binding to know that it should raise a DOMException INDEX_SIZE_ERR in this case, since the process of converting argument types shouldn't be required to know details of individual DOM methods.
I was unable to find any standard that defines what should happen to values of inappropriate types that are passed through a language binding. I expect that's outside the scope of the DOM technical report.
There's definitely a problem here, but I'm not sure what can be done to the DOM technical report to fix this. Perhaps the idea of using unsigned long as the type in the CharacterData interface is the problem; it's certainly inconsistent to talk about a negative count if a count is an unsigned long.
This issue exists for the substringData, insertData, deleteData, and replaceData methods of the interface CharacterData, and the splitText method of the interface Text.
One can expect an error but can't expect a specific error. This is described in the DOMException interface already. No change needed to the specification.
Add this to the FAQ and reply to the DOM TS list.
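The language-binding problem raised in this issue can be sketched in Python. The sketch assumes a binding that converts its argument with modulo 2^32 arithmetic (as ECMAScript's ToUint32 operation does) before the DOM method is invoked; whether a given binding actually does so is outside the specification. Both function names are hypothetical, and the toy delete_data indexes by code point rather than by UTF-16 unit.

```python
def to_uint32(value):
    """Wrap an integer into the unsigned 32-bit range (modulo 2**32)."""
    return value % (1 << 32)


def delete_data(data, offset, count):
    """Toy stand-in for CharacterData.deleteData operating on a str."""
    if offset > len(data):
        raise IndexError("INDEX_SIZE_ERR")
    # If offset + count exceeds the length, everything from offset
    # to the end of the data is deleted.
    return data[:offset] + data[offset + count:]


# The caller passes -1, but the method only ever sees the converted value.
converted = to_uint32(-1)
result = delete_data("abcdef", 2, converted)
print(converted, result)
```

Under that assumption the method never observes a negative count: it receives a very large unsigned value, so it has no way of raising INDEX_SIZE_ERR specifically for negativity.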
Last update: $Date: 2004/02/19 16:25:03 $
This page was generated as part of the Extensible Issue Tracking System (ExIT)
Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.