This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
In the CR, the following query (examples-362-5 in XQFTTS) is said to return an empty result when applied to the sample document (FT-3-examples-source-document.xml):

/books/book[@number="1" and . contains text "efficient" ftand ftnot "and" window 3 words]

The reasoning is that there is no occurrence of "efficient" within a window of 3 tokens that does not also contain an occurrence of "and".

The formal definition of FTWindow seems to indicate a different result:

- There are 3 (not 2) occurrences of "and" in book: two within the <p> element and a third in the <title> element. Applying FTNot and then FTAnd would yield three matches, each with the StringInclude of the single occurrence of "efficient" (within <p>) and a StringExclude that corresponds to one of the occurrences of "and".
- The StringInclude (trivially) fulfills the window condition in all three cases.
- According to 4.2.6.8, fts:ApplyFTWordWindow, only the StringExcludes that lie within the window limits are retained. Clearly, the two StringExcludes within <p> fulfill this criterion and are retained in the result. The StringExclude stemming from <title> does not fulfill the window condition and is therefore dropped.
- As a consequence, there is now a Match without a StringExclude, causing the book to become part of the result.

Is my understanding of the semantics correct? If yes, a possible solution would be to modify the search context so that the search runs inside <p> rather than inside <book>.
> Applying FTNot and then FTAnd would yield three matches, each with
> the StringInclude of the single occurrence of "efficient" (within <p>),
> and a StringExclude that corresponds to the occurrences of "and"

Actually, I think you'll find it yields a single match, containing the one StringInclude and all three StringExcludes. See the example in 4.2.6.3 FTUnaryNot, in which FTUnaryNot transforms:
    an AllMatches containing 3 Matches each with 1 StringInclude
into:
    an AllMatches containing 1 Match with 3 StringExcludes.
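The 3-to-1 transformation described above can be sketched in a few lines of Python. This is only an illustration, not the normative XQuery definition of fts:ApplyFTUnaryNot: the list-of-pairs representation of a match and the name `unary_not` are assumptions made for this sketch. It inverts one string match taken from each input match and combines the picks into a single output match.

```python
from itertools import product

# Hypothetical representation: a match is a list of (kind, label) pairs,
# where kind is "include" or "exclude". Not the spec's XML data model.
FLIP = {"include": "exclude", "exclude": "include"}


def unary_not(matches):
    """Sketch of the negation step: pick one string match from every
    input match, invert it, and gather the picks into one output match."""
    return [
        [(FLIP[kind], label) for kind, label in choice]
        for choice in product(*matches)
    ]


# The three matches generated by the search term "and".
and_matches = [[("include", "A1")], [("include", "A2")], [("include", "A3")]]
negated = unary_not(and_matches)
print(negated)
# [[('exclude', 'A1'), ('exclude', 'A2'), ('exclude', 'A3')]]
```

Because each input match here holds exactly one StringInclude, there is only one way to pick, so the result is a single match carrying all three StringExcludes.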
For completeness, here is my analysis of that example.

In the example document, the 'book' node has one occurrence of the word "efficient" (call it E) and 3 occurrences of the word "and" (call them A1, A2, A3). I'll use an ad hoc notation for matches.

"efficient"
    generates 1 match:
        [include E]

"and"
    generates 3 matches:
        [include A1], [include A2], [include A3]

ftnot "and"
    fts:ApplyFTUnaryNot calls fts:UnaryNotHelper and generates a single match:
        [exclude A1, exclude A2, exclude A3]

"efficient" ftand ftnot "and"
    fts:ApplyFTAnd generates a single match:
        [include E, exclude A1, exclude A2, exclude A3]

"efficient" ftand ftnot "and" window 3 words
    fts:ApplyFTWindow calls fts:ApplyFTWordWindow, which takes the single match (from ApplyFTAnd) and finds all windows of width 3 that contain all the stringIncludes of the match (i.e., just E), thus:
        and enable efficient
        enable efficient and
        efficient and effective
    For each such window, it generates a match consisting of:
    -- the "join" of the stringIncludes (i.e., just E again), and
    -- all of the stringExcludes (from the input match) that fall within the window.
    For the first window, this is the stringExclude for A2. For the second and third windows, this is the stringExclude for A3. That is, it generates 3 matches:
        [include E, exclude A2], [include E, exclude A3], [include E, exclude A3]

. ftcontains "efficient" ftand ftnot "and" window 3 words
    fts:FTContainsExpr receives the above 3 matches, looks for one that has zero stringExcludes, finds no such match, and so returns false (which agrees with the prose accompanying the example).

If there's a flaw in this reasoning, please let us know.
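The window step and the final test can be replayed with a small Python sketch. It is a simplification under stated assumptions, not the spec's fts:ApplyFTWordWindow: the token positions are invented for illustration (only their relative order follows the analysis above), and the names `apply_word_window` and `contains` are hypothetical.

```python
# Hypothetical token positions: A1 sits far away in <title>; the <p> text
# contains the run "and enable efficient and effective" at positions 20..24.
POS = {"A1": 3, "A2": 20, "E": 22, "A3": 23}


def apply_word_window(includes, excludes, n):
    """For every n-word window covering all includes, emit one match that
    keeps the includes and only those excludes lying inside the window."""
    lo = min(POS[i] for i in includes)
    hi = max(POS[i] for i in includes)
    matches = []
    # Window starts range from (hi - n + 1) to lo; empty if the includes
    # span more than n words.
    for start in range(hi - n + 1, lo + 1):
        end = start + n - 1
        kept = [x for x in excludes if start <= POS[x] <= end]
        matches.append({"include": list(includes), "exclude": kept})
    return matches


def contains(matches):
    """The containment test succeeds only if some match has no excludes."""
    return any(not m["exclude"] for m in matches)


# Single match produced by FTAnd: [include E, exclude A1, exclude A2, exclude A3]
windows = apply_word_window(["E"], ["A1", "A2", "A3"], 3)
satisfied = contains(windows)
print([m["exclude"] for m in windows])  # [['A2'], ['A3'], ['A3']]
print(satisfied)                        # False
```

Every window around E still carries a StringExclude, so no match qualifies and the book is not selected. Had FTAnd produced three separate matches with one StringExclude each, the match holding A1 would lose its exclude in every window and the test would succeed, which is the outcome the original report expected.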
At its meeting on December 7, the Task Force endorsed my responses above. Consequently, I'm marking this issue Resolved-Invalid. If you agree, please mark it Closed.