This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Replacing optional content is very awkward, as discussed here: http://lists.w3.org/Archives/Member/member-query-ultf/2006Feb/0040.html My preferred solution to this is Version 2 of the fold proposal: http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2006Oct/0058 Giorgio provides good reasons for preferring Version 2: http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2006Oct/0064.html
The thing I don't like about this is that it special-cases operations on singletons. We should try to design operations that a general enough to work on sets or sequences. Is it really that much of a hardship to do a delete first, then an insert?
Responding to comment #1 from Michael Kay: As I understand it, the update spec does not currently provide any combination of expressions that is equivalent to the proposed "fold" expression. Here are two attempts to simulate the effects of the following "fold" expression: fold <c/> into /a/b (1) We could try a delete followed by an insert: delete /a/b, insert <c/> after /a/b In this case, the new node <c/> will be inserted just after the position of the deleted b-node, but in nondeterministic order with other nodes that are also inserted after /a/b in the same snapshot. So this approach is not equivalent to the fold expression. (2) We could try a conditional with a replace-branch and an insert-branch: if (/a/b) then replace /a/b with <c/> else insert <c/> into /a This approach has a serious problem: the else-branch does not specify the position within the parent where the new node should be inserted. It cannot say "after /a/b" because in this branch we know that /a/b does not exist. So this approach is also not equivalent to the fold expression. Regards, Don Chamberlin
I don't think I understand the semantics of the proposed fold operation. If my existing element has children Q, T, and G, and I fold in a sequence containing Z, T, and K, what is the resulting sequence? Does it use schema knowledge to know that the Z belongs before the Q and the K belongs after the G? If the schema said that the K element has to come first, would it give me K Z Q T G? If this is intended to work in the absence of a schema, I fail to see how. Michael Kay
CORRECTS ERRORS IN COMMENT #2 (SORRY) As I understand it, the update spec does not currently provide any combination of expressions that is equivalent to the proposed "fold" expression. Here are two attempts to simulate the effects of the following "fold" expression: fold <c/> into /a (1) We could try a delete followed by an insert: delete /a/c, insert <c/> after /a/c In this case, the new node <c/> will be inserted just after the position of the deleted c-node, but in nondeterministic order with other nodes that are also inserted after /a/c in the same snapshot. So this approach is not equivalent to the fold expression. (2) We could try a conditional with a replace-branch and an insert-branch: if (/a/c) then replace /a/c with <c/> else insert <c/> into /a This approach has a serious problem: the else-branch does not specify the position within the parent where the new node should be inserted. It cannot say "after /a/c" because in this branch we know that /a/c does not exist. So this approach is also not equivalent to the fold expression. Regards, Don Chamberlin
As stated, the "fold" proposal fails to address an important issue. Consider the example fold <c/> into /a If /a/c exists, the semantics are equivalent to replace /a/c with <c/> and these semantics are well-defined. If /a/c does not exist, the semantics are equivalent to insert <c/> into /a which causes a new c-element to be inserted into a non-deterministic position within /a. But the user will probably want to control the position of the new c-element within its parent. The user may intend the new node to be the first child of its parent, or the last child of its parent, or just before or after some designated sibling. These options are all supported by the insert expression but are missing from the proposed "fold" expression. The missing functionality could be added to the "fold" expression by allowing the "into $parent" clause to take any of the following forms: into $parent as first into $parent as last into $parent before $sibling after $sibling Consider the following example: fold <c/> after /a/b The semantics would be as follows: if /a/b/../c exists, semantics are equivalent to replace /a/b/../c with <c/> If /a/b/../c does not exist, semantics are equivalent to insert <c/> after /a/b It is worth noting that specifying "after", "as last", etc. can be done fairly easily with the keyword-syntax approach (Jonathan's Version 1) but not with the functional approach (Jonathan's Version 2). This comment is not intended to express support or lack of support for the "fold" proposal, but simply to call attention to an issue that needs to be resolved to make the proposal well-specified. Regards, Don Chamberlin
Just thinbking out loud on this topic. The "do fold $children into $parent" seems to be a shorthand for the following (considering only element nodes): let $matching := $children[name(.) = $parent/*/name()] let $nonmatching := $children except $matching return ( for $m in $matching return do replace ($parent/*[name() = name($m)]) with $m, do insert $nonmatching into $parent ) As Michael Kay pointed out, this will cause an error if one of the elements being has a sibling with the same name (caught by one of the executions of replace). I suppose that we could delete the 2nd and subsequent duplicates of the matching nodes and execute the replace only on the first copy. let $matching := $children[name(.) = $parent/*/name()] let $nonmatching := $children except $matching return ( for $m in $matching return ( let $matched := $parent/*[name() = name($m)] return ( do delete $matched[position() >= 2], do replace $matched[1] with $m ) ), do insert $nonmatching into $parent ) Hmmm. Getting a bit complicated. As Don has suggested, it could be extended to allow "do fold $children as first into $parent": let $matching := $children[name() = $parent/*/name()] let $nonmatching := $children except $matching return ( for $m in $matching return do replace ($parent/*[name() = name($m)]) with $m, do insert $nonmatching as first into $parent ) It could be extended to allow the non-matching nodes to be placed before or after an existing node, using "do fold $children after $node": let $parent := $node/.. let $matching := $children[name() = $parent/*/name()] let $nonmatching := $children except $matching return ( for $m in $matching return do replace ($parent/*[name() = name($m)]) with $m, do insert $nonmatching after $node )
(In reply to comment #1) > The thing I don't like about this is that it special-cases operations on > singletons. We should try to design operations that a general enough to work on > sets or sequences. > > Is it really that much of a hardship to do a delete first, then an insert? There may not be a better solution. Optimizers can catch this fairly easily. Unfortunately, users may try a variety of things to get around the problem, and I'm not sure optimizers will catch all of them, so there is at least an education issue (which could be addressed with a use case or an example in the language book). It's especially confusing if you implement static typing, the data is known to be there, but your system refuses to replace it because of static typing (which is why many implementations won't implement the static typing by the book). Jonathan
(In reply to comment #3) > I don't think I understand the semantics of the proposed fold operation. If my > existing element has children Q, T, and G, and I fold in a sequence containing > Z, T, and K, what is the resulting sequence? Does it use schema knowledge to > know that the Z belongs before the Q and the K belongs after the G? If the > schema said that the K element has to come first, would it give me K Z Q T G? > > If this is intended to work in the absence of a schema, I fail to see how. My intent was indeed to allow implementations to use a schema to determine the right place to put the data. For a view of a relational database, it would simply put data in the corresponding column. This part of the semantics is the reason for using "insert into" rather than specifying position. Jonathan
> (2) We could try a conditional with a replace-branch and an insert-branch: > > if (/a/c) > then replace /a/c with <c/> > else insert <c/> into /a > > This approach has a serious problem: the else-branch does not specify the > position within the parent where the new node should be inserted. It cannot say > "after /a/c" because in this branch we know that /a/c does not exist. So this > approach is also not equivalent to the fold expression. Actually, I believe this is precisely equivalent to the fold proposal, which says: "If Parent has a child element or attribute $c with the same name and node kind as $n, invokes upd:replaceNode($c, $n); if there is no child element or attribute with the same name and node kind, invokes upd:insertInto($parent, $n)." Order is not intended to be specifiable - the reason this is easier in SQL is precisely that (1) order does not need to be specified and (2) the system knows that there is only one column with a given name in a table. If we try to make fold generalize to everything you can find in an arbitrary XML document, we lose the ability to simplify based on these two assumptions. The reason for this proposal is to deal with a specific impedance mismatch that is irritating. Jonathan
(In reply to comment #5) > It is worth noting that specifying "after", "as last", etc. can be done fairly > easily with the keyword-syntax approach (Jonathan's Version 1) but not with the > functional approach (Jonathan's Version 2). Of course, that's also easily done using delete followed by an insert. Perhaps closing this issue with no action is the best we can do. Jonathan
The response in comment #10 suggests that you are satisfied with the resolution of this bug. Therefore, we are marking it CLOSED.