This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
In the situation of element-only content there exists a string-value of the content but no typed-value. So I think xpath20, 2.5.2 should be changed slightly to avoid misunderstanding:

original: "An implementation may store both the typed value and the string value of a node, or it may store only one of these and derive the other from it as needed. The string value of a node must be a valid lexical representation of the typed value of the node, but the node is not required to preserve the string representation from the original source document. For example, if the typed value of a node is the xs:integer value 30, its string value might be "30" or "0030"."

changed (changes in capital letters): "An implementation may store both the typed value and the string value of a node, or it may store only one of these and derive the other from it as needed. The string value of a node must be a valid lexical representation of the typed value of the node, IF THIS EXISTS, but the node is not required to preserve the string representation from the original source document, SO THE STRING VALUE MAY DIFFER SLIGHTLY. For example, if the typed value of a node is the xs:integer value 30, its string value might be "30" or "0030"."

The most important word in the second change is 'slightly'. It means that a string value derived in this way is not allowed to differ much from a 'directly' constructed string value.

--------------------------------

An additional comment would also be helpful on 3.3.1.3 Relationship Between Typed-Value and String-Value: "In order to permit these various implementation strategies, some variations in the string value of a node are defined as insignificant. Implementations that store only the typed value of a node are permitted to return a string value that is different from the original lexical form of the node content."

Klaus Bosse
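The two storage strategies in the quoted paragraph can be illustrated with a minimal Python sketch for the xs:integer example. The function names `typed_from_string` and `string_from_typed` are hypothetical and not taken from any XPath implementation; the sketch only assumes that the xs:integer lexical form is an optional sign followed by decimal digits.

```python
import re

# Lexical form of xs:integer: optional sign, then one or more ASCII digits.
_XS_INTEGER = re.compile(r"[+-]?[0-9]+")


def typed_from_string(lexical: str) -> int:
    """Strategy A: only the string value is stored; derive the typed value."""
    text = lexical.strip(" \t\r\n")  # whitespace is collapsed before parsing
    if not _XS_INTEGER.fullmatch(text):
        raise ValueError(f"not a valid xs:integer lexical form: {lexical!r}")
    return int(text)


def string_from_typed(value: int) -> str:
    """Strategy B: only the typed value is stored; derive a string value."""
    return str(value)  # no leading zeros, no leading plus sign


original = "0030"                      # lexical form in the source document
typed = typed_from_string(original)    # the xs:integer value 30
derived = string_from_typed(typed)     # "30", not the original "0030"

print(typed, repr(derived))
```

Under strategy B the original lexical form "0030" is not recoverable, but the derived string is still a valid lexical representation of the same typed value, which is all the quoted paragraph requires.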
A better way to improve this paragraph might be simply to precede it with "In the case of a node with no children, ...". This seems to deal with the problem that the choice described in this paragraph isn't available for all nodes.

I think the other change "so the string value may differ slightly" is unnecessary and undesirable. We already say that the string value must be a valid lexical representation of the typed value, and that's a much more precise statement than saying it may vary "slightly". It's a matter of opinion whether "1" differs only slightly from "true", but it's a matter of fact that they are both valid lexical representations of the boolean value TRUE.

Michael Kay
personal response
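The point about "1" and "true" can be made concrete with a small Python sketch. It assumes the xs:boolean lexical space is exactly "true", "false", "1" and "0"; the helper names `parse_xs_boolean` and `is_valid_string_value` are hypothetical.

```python
# The four lexical forms of xs:boolean and the values they denote.
XS_BOOLEAN_LEXICAL = {"true": True, "1": True, "false": False, "0": False}


def parse_xs_boolean(lexical: str) -> bool:
    """Map an xs:boolean lexical form to its value, or raise ValueError."""
    text = lexical.strip(" \t\r\n")
    if text not in XS_BOOLEAN_LEXICAL:
        raise ValueError(f"not a valid xs:boolean lexical form: {lexical!r}")
    return XS_BOOLEAN_LEXICAL[text]


def is_valid_string_value(candidate: str, typed_value: bool) -> bool:
    """True if candidate is a valid lexical representation of typed_value."""
    try:
        return parse_xs_boolean(candidate) is typed_value
    except ValueError:
        return False


for candidate in ("true", "1", "yes", "0"):
    print(candidate, is_valid_string_value(candidate, True))
```

The check is a yes/no test against the lexical space, with no notion of how "slightly" two strings differ.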
(In reply to comment #1)
> A better way to improve this paragraph might be simply to precede it with "In
> the case of a node with no children, ...". This seems to deal with the problem
> that the choice described in this paragraph isn't available for all nodes.
>
> I think the other change "so the string value may differ slightly" is
> unnecessary and undesirable. We already say that the string value must be a
> valid lexical representation of the typed value, and that's a much more precise
> statement than saying it may vary "slightly". It's a matter of opinion whether
> "1" differs only slightly from "true", but it's a matter of fact that they are
> both valid lexical representations of the boolean value TRUE.
>
> Michael Kay
> personal response

First, on the second part of your comment: fully agreed.

Now to the first part: this would indeed avoid confusion and would be fine with me, but then it excludes not only the case of element-only content but also that of mixed content. Is this intended?
For mixed content (a) the typed value and the string value consist of the same sequence of Unicode characters, and (b) it's also equal to the concatenation of the string-values of the descendant text nodes: so I think it's fairly obvious that you can store things any way you like and you get exactly the same result. It's with simple-valued content, e.g. numbers and dates, that results can vary depending on the implementation strategy, so that's the case we want to talk about. Michael Kay (personal response)
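A short Python sketch of the mixed-content case, using the standard-library `xml.etree.ElementTree` module. ElementTree is not schema-aware, so the sample `<p>` element is simply assumed to have a mixed-content type; `string_value` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def string_value(element: ET.Element) -> str:
    """Concatenate the descendant text nodes in document order."""
    return "".join(element.itertext())


# Assumed to be an element whose complex type allows mixed content.
para = ET.fromstring("<p>Total: <b>30</b> items</p>")

derived_from_text_nodes = string_value(para)
stored_directly = "Total: 30 items"  # what a string-storing implementation keeps

print(repr(derived_from_text_nodes))
print(derived_from_text_nodes == stored_directly)
```

Because the typed value of a mixed-content element is this same character sequence, either storage choice yields an identical result, which is why the case needs no special wording.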
(In reply to comment #3) Yes, my fault ([DM] 6.2.4 typed-value, 4th item). I somehow mixed it up with the typed-value of a list. I think there is even more need to change the first sentence in [DM] '3.3.1.3 Relationship Between Typed-Value and String-Value' in a way similar to the way you suggested in comment #1. Should this be reported for [DM]?
Klaus,

Thank you for your comment, which was considered by the joint Query and XSLT working groups on May 9, 2006.

The working groups agreed with your observation that an element node that has an element-only complex type does not have a typed value, and that this fact is pertinent to the following sentence in XPath/XQuery Section 2.5.2: "An implementation may store both the typed value and the string value of a node, or it may store only one of these and derive the other from it as needed." The working groups agreed to remove the words "from it" from this sentence, reflecting the fact that, in the case of an element node with an element-only complex type, the string value of the node is derived from its descendants rather than from its typed value. This change will be reflected in the next version of the XPath and XQuery specifications.

Section 3.3.1.2 of the Data Model document states that "Implementations are allowed some flexibility in how [the typed-value and string-value properties] are stored." It then briefly outlines some possible strategies, subject to the constraint that the relationship between the string value of a node and its typed value must be consistent with schema validation. The working groups feel that this explanation is adequate and does not need to be changed.

If you are satisfied with this resolution of your issue, please close this Bugzilla entry. If you take no action the entry will be closed by the working groups in two weeks.

Regards,
Don Chamberlin (for the joint working groups)
Sorry, the above comment should reference Data Model Section 3.3.1.3, not 3.3.1.2. --Don Chamberlin
[XPath] and [DM]

I agree with your suggestion on the first point ([XPath] Section 2.5.2). This puts the focus on the flexibility of implementations and not on the asymmetric relation between string-value and typed-value, but that is OK here.

I see now that in [DM] Section 3.3.1.3 the first sentence is correct because it says "typed-value and string-value properties" (and not only "typed-value and string-value"), as you emphasized. But I would prefer a note here, like the one for string-values ("If an implementation stores only the string-value of a node, the following considerations apply:..."), which says that the relation between string-values and typed-values is not symmetric, because this may not be obvious to the reader (me) at this point in the document (see [DM] 6.2.4 typed-value). But, OK, this is not a tutorial.

So I am reopening the bug, but if you close it I will not protest.

Regards
Klaus Bosse
Proposed resolution:

Having reviewed the bug again and attempted to reconstruct mentally the discussions we had back in May, my best effort to resolve this issue is to add the following to the end of 3.3.1.3 in XDM:

First, a new bullet at the end of the existing bulleted list:

* Where an element with a complex type and element-only content occurs, it is an error to attempt to access the typed-value of the node.

And the following paragraph below the list:

If an implementation stores only the typed-value of a node, it must be prepared to construct string values from not only the node, but in some cases also the descendants of that node. For example, an element with a complex type and element-only content has no typed-value but does have a string-value that is the concatenation of the string-values of all its Text Node descendants in document order.
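The behaviour described in the proposed text can be sketched in Python as follows. This is an illustration under stated assumptions, not an XDM implementation: `NoTypedValueError`, `typed_value`, `string_value` and the `content_kind` flag are all hypothetical, and the flag stands in for schema type information that `xml.etree.ElementTree` does not carry.

```python
import xml.etree.ElementTree as ET


class NoTypedValueError(Exception):
    """Hypothetical error for a node that has no typed value."""


def string_value(element: ET.Element) -> str:
    """Concatenate the text node descendants in document order."""
    return "".join(element.itertext())


def typed_value(element: ET.Element, content_kind: str) -> str:
    """Return the typed value, or raise for element-only content.

    content_kind is an assumed stand-in for the element's schema type.
    """
    if content_kind == "element-only":
        raise NoTypedValueError(f"<{element.tag}> has no typed value")
    if content_kind == "mixed":
        return string_value(element)  # same characters as the string value
    raise NotImplementedError("simple content needs schema-aware parsing")


order = ET.fromstring("<order><qty>30</qty><unit>kg</unit></order>")

print(repr(string_value(order)))  # built from the descendants of <order>
try:
    typed_value(order, "element-only")
except NoTypedValueError as exc:
    print("error:", exc)
```

The string value is available even though the typed-value accessor fails, which is the asymmetry the new bullet and paragraph are meant to spell out.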