This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Thinking about some of our use cases for check clauses the other day, I became slightly more concerned than I had been by the restrictions placed on the XPath expressions of assertions in our current working draft.

The use cases provide the following examples of XPath expressions for 'high priority' use cases. Those marked * are not legal according to the grammar in our most recent public working draft. Some variations on them, which do not appear in the wiki, some of which are intended to have the same semantics when used as assertions and some of which are intended to illustrate syntactic limits on the existing grammar, are also given, marked + (or +* if they are not accepted by the grammar).

It's possible that my parser is faulty, so first of all I ask those interested to see if they agree with me about which of these XPath expressions are accepted, and which are not accepted, by our current XPath subset.

Value-equals test required (1)
  *   @type='BridgeEthernet' & @BrEthernetIP = ''
  +*  @type='BridgeEthernet' and @BrEthernetIP = ''
  +*  @type eq 'BridgeEthernet' and @BrEthernetIP eq ''
  +   @type eq xsd:string('BridgeEthernet') and @BrEthernetIP eq xsd:string('')
  *   ./@type='BridgedEthernet'
  +   ./@type eq xsd:string('BridgedEthernet')
  *   ./@type='BridgedEthernet' and not ./@BrEthernetIP
  +*  ./@type eq xsd:string('BridgedEthernet') and not ./@BrEthernetIP
  +*  ./@type eq xsd:string('BridgedEthernet') and not(./@BrEthernetIP)
  +   ./@type eq xsd:string('BridgedEthernet') and ./@BrEthernetIP

Value arithmetic required - attributes (2)
  *   @min <= @max
  +   @min le @max
  *   . < ../@min
  *   . le ../@min
  *   @max >= @min
  +   @max ge @min

Constraints on grandchildren (5)

Simple attribute implication (6)
  *   ./@attrOne or not(./@attrTwo)
  *   ./@attrOne or not ./@attrTwo
  +   ./@attrOne or ./@attrTwo
      ./@attrOne
  *   not(./@attrOne)
  *   ./@attrTwo and not(@attrOne)

Attribute mutex (7)
  *   (./@dec or ./@hex) and not(./@dec and ./@hex)
  +*  (./@dec or ./@hex) and (./@dec and ./@hex)
  +*  (./@dec or ./@hex)
  +   ./@dec or ./@hex
      ./@dec and ./@hex
  +*  (./@dec and ./@hex)
  *   not(./@dec) and not(./@hex)

Open content, sort of (9)

Value arithmetic required - elements (12)
  *   (./a + ./b + ./c) <= 30
  +*  (./a + ./b + ./c) le 30
  +*  ./a <= 30
  +*  ./a le 30
  +*  ./a le xsd:int(30)
  +   ./a le xsd:int('30')
  *   ./a + ./b + ./c > 30.00

Require somewhere (20)
  *   count(//buyer) > 0
  +*  count(//buyer) gt 0
  +*  count(//buyer) gt xsd:int('0')
  +*  count(//buyer)
  *   count(//buyer-id | //buyer/@id) > 0
  *   count(//seller) > 0
  *   count(//seller-id | //seller/@id) > 0
  *   count(//binding-jurisdiction) = 1
  *   count(//severability | //nonseverability) = 1
  *   count(//start-date) = 1 and count(//end-date) = 1

Deep inclusions (23)
  *   not(ancestor::html:form)
  *   not(.//html:input[not(./ancestor::html:form])
  *   count(.//html:input[ancestor::html:form]) = count(.//html:input)
  *   count(.//html:form//html:input) = count(.//html:input)

I think the bottom line is (a) either the grammar or my parser is having some trouble with not() and count(), and (b) that if my parser is correct then our subset is too small, because it makes it too hard to write useful assertions.
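For readers who want to cross-check the markings by hand, the following is a rough Python sketch that tallies which surface constructs each expression uses (function calls, general versus value comparisons, absolute `//` paths, named axes). It does not encode the working draft's grammar and makes no claim about which expressions the subset accepts; the feature names, the regular expressions, and the `features` helper are all assumptions made for illustration.

```python
import re

# Surface-level patterns only. These are illustrative assumptions and do
# not reproduce the XPath subset grammar of any working draft.
FEATURES = {
    # A name (optionally prefixed) immediately followed by "(".
    "function call": re.compile(r"[A-Za-z_][\w.-]*(?::[\w.-]+)?\("),
    # XPath 1.0 style general comparison operators.
    "general comparison": re.compile(r"<=|>=|!=|=|<|>"),
    # XPath 2.0 two-letter value comparison operators.
    "value comparison": re.compile(r"\b(?:eq|ne|lt|le|gt|ge)\b"),
    # "//" that does not continue an existing step such as "." or a name.
    "absolute descendant path": re.compile(r"(?<![.\w\])])//"),
    # A named axis such as ancestor::
    "axis step": re.compile(r"\b[a-z-]+::"),
}


def features(expr):
    """Return the sorted names of the constructs found in one expression."""
    # Blank out string literals so their contents cannot match a pattern.
    bare = re.sub(r"'[^']*'", "''", expr)
    return sorted(name for name, pat in FEATURES.items() if pat.search(bare))


if __name__ == "__main__":
    for expr in [
        "@min le @max",
        "count(//buyer) > 0",
        "not(ancestor::html:form)",
        "./@type eq xsd:string('BridgedEthernet')",
    ]:
        print(expr, "->", features(expr))
```

Grouping the listed expressions by these tallies makes it easier to see whether the rejected ones share a construct, which is the question raised about not() and count().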
I was also thinking about expanding the subset. My focus has been:
- Allow "quantified" expressions (some/every ... satisfies ...) and possibly "if" expressions (if ... then ... else ...) (it's unfortunate there isn't a short form like if ... then ...)
- Allow more than attributes in predicates (hoping that it's still streamable)

Now how are you suggesting we expand it?
- not/count: in XPath 2.0, I think they became fn:not and fn:count, which are allowed by the grammar. Hum... not sure whether it still allows XPath 1.0 functions without the namespace.
- About casting: I think maybe it's OK to omit xs:string(). Treat it as the default. We can also treat integer literals in the same way. Or we can go to the extreme and omit all constructor functions and implicitly cast the string value to the value space of the other operand.
- Comparison: I think we have to use the 2-letter operators to match XPath 2.0 semantics.
- About arithmetic and promotion/casting: this is the discussion we had and I'm inclined to say "no" for now.

Also note that for "Require somewhere", we only allow ".//buyer" and not "//buyer". To make sure "buyer" appears somewhere in the tree, you only need
    .//buyer
which is equivalent to
    fn:count(.//buyer) > xs:int('0')
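The equivalence between a bare path and a count comparison can be illustrated with a small Python sketch. It uses the standard library's `xml.etree.ElementTree`, which supports only a limited path syntax and is not an XPath 2.0 processor, so this shows the logic (a non-empty node sequence is treated as true) rather than any schema processor's behaviour. The sample documents and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_descendant(context, name):
    """Existence test: true when .//name selects at least one element."""
    return bool(context.findall(f".//{name}"))


def count_descendants(context, name):
    """Number of elements selected by .//name from the context element."""
    return len(context.findall(f".//{name}"))


# Hypothetical sample documents, one with a buyer and one without.
with_buyer = ET.fromstring(
    "<contract><parties><buyer/></parties></contract>"
)
without_buyer = ET.fromstring(
    "<contract><parties><seller/></parties></contract>"
)

if __name__ == "__main__":
    for doc in (with_buyer, without_buyer):
        # The two tests agree on every document.
        print(has_descendant(doc, "buyer"), count_descendants(doc, "buyer") > 0)
```

Because the path is relative to the context element, the check covers the subtree under the element carrying the assertion, which is why ".//buyer" suffices when the assertion sits on the document element.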
Here are some other things which I would like to say, but can't:

(1) events must be in chronological order
    every $x in event, $y in event satisfies if $y >> $x then $y/date >= $x/date
(2) currency must be one of the currencies in http://example.com/currencies
    . = doc(http://example.com/currencies)/currencies/currency
(3) events must not be in the future
    date <= current-date()
(4) date must not be a Sunday
(5) height must be a multiple of 0.25

I'm even finding it difficult to write basic co-occurrence tests such as
    if (@a > 0) then exists(@b)
I think there are workarounds for most of these within the proposed subset, but some of them are unnecessarily tortuous, for example
    not(@a gt 0 and not(@b))
The restrictions are so arbitrary that it's going to be very hard for users to remember them, let alone to learn how to work around them.

Michael Kay
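The co-occurrence workaround relies on the identity "P implies Q" = "not (P and not Q)". A minimal Python sketch that checks this over all four truth assignments follows. It assumes that each XPath subexpression has already been reduced to a boolean (for example, that a missing @a makes the comparison false) and that the missing else branch of the if expression is read as true; the function names are hypothetical.

```python
from itertools import product


def implication(a_gt_0, has_b):
    """Models: if (@a > 0) then exists(@b), with the else branch read as true."""
    return has_b if a_gt_0 else True


def workaround(a_gt_0, has_b):
    """Models: not(@a gt 0 and not(@b))."""
    return not (a_gt_0 and not has_b)


def equivalent():
    """Compare the two forms on every combination of truth values."""
    return all(
        implication(p, q) == workaround(p, q)
        for p, q in product([False, True], repeat=2)
    )


if __name__ == "__main__":
    print(equivalent())
```

The two forms agree on every input; the only failing case for both is @a greater than 0 with @b absent.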
An update on current status. At the face-to-face meeting at the end of October and beginning of November, the WG agreed that in principle a legal schema can use any legal XPath 2.0 expression as an assertion. To avoid requiring all XSD processors to implement all of XPath 2.0, the subset defined in the spec will be retained, and all schema processors are required to support at least that subset of XPath 2.0; processors may choose to support more, or all, of XPath 2.0. Schema authors who care more about power than interoperability will choose schema processors accordingly; schema authors who care about interoperability more than about power will restrict themselves to expressions in the subset. (Schema authors who care about both power and interoperability will presumably just curse the Working Group.)

So a technical direction for resolution of this issue has been set, although no final wording has been adopted (and thus the decision is not yet part of the status quo text). A wording proposal is expected to go to the Working Group real soon now, possibly today or tomorrow.
The wording proposal mentioned in comment 3 has been discussed at length by the Working Group and most of it was adopted in the WG call of 23 March 2007. The part not adopted dealt with the typing of the data model instance; see bug 4416. The part adopted makes clear that the XPath subset is not a restriction on the XPath expressions contained in a legal schema, but an implementation minimum. Accordingly, I'm marking this issue closed.
As the originator of this issue, I assent to the WG's resolution of the question and accordingly close the bug report. I note in passing that some other readers of the XML Schema 1.1 spec are still unhappy with the definition of the XPath subset as an implementation minimum, but I will leave it to those readers to make their views heard.