Warning:
This wiki has been archived and is now read-only.
Design:Negation
Contents
Negation
Definition and Scope of feature
Provide an explicit operator for negation in SPARQL, i.e. to express a graph pattern which must not match in the dataset.
This defines negation for the pattern based NOT EXISTS.
An alterntive design is the use of set MINUS, then it is the removal of certain query solutions from a set of other query solutions. See below.
Status
In discussion in the working group.
http://www.w3.org/2009/sparql/track/issues/29 - which negation design to choose?
Design : NOT EXISTS
One design is a binary NOT EXISTS operator as a graph pattern operation. This is the UNSAID design from the previous working group. The operator filters matches to the preceding (left hand side) pattern by taking each possibility, then seeing is the pattern (right hand side) matches for that set of bindings.
Examples : Graph Pattern Operator
All people, who do not know Simon
Dataset:
<#alice> foaf:name "Alice"; foaf:knows [foaf:name "Simon"] . <#bob> foaf:name "Bob" .
Query:
SELECT ?name WHERE { ?x foaf:name ?name NOT EXISTS { ?x foaf:knows [foaf:name "Simon"] } }
returns "Bob".
Ordering issue: In the following, the NOT EXISTS always matches, because ?x is not yet bound and Alice matches the NOT EXISTS pattern. Hence, also Bob is not returned.
SELECT ?name WHERE { NOT EXISTS { ?x foaf:knows [foaf:name "Simon"] } ?x foaf:name ?name }
This form is clearly readable also for nested NOT EXISTS:
All people, with only mutual friends (i.e. who do not foaf:know people, who do not foaf:know themselves)
SELECT ?name WHERE { ?x foaf:name ?name NOT EXISTS { ?x foaf:knows ?y NOT EXISTS { ?y foaf:knows ?x } } }
Filter based
The graph operator acts to filter matches of the left hand side using the righ hand side pattern. In this sense, it is a FILTER, but unlike a SPARQL FILTER syntax it does not act logically over later triple patterns. In the translation to the SPARQL algebra, it does split the BGP at this point (c.f. OPTIONAL) and then becomes an algebra filter.
Functions EXISTS and NOT EXISTS for a FILTER could also be introduced:
All people, who do not know Simon
SELECT ?name WHERE { ?x foaf:name ?name FILTER ( !EXISTS { ?x foaf:knows [foaf:name "Simon"] } ) }
As FILTER is evaluated in the end, this does not result in teh same ordering issues. Moreover, if we follow the initial assumption, that negated subqueries can not bind variables, they naturally correspond to subqueries.
A disadvantage here is, that this syntactic form is more difficult to detect and optimize if it is part of a complex filter expression.
This form becomes more difficult to read in the presence of nested negation:
All people, with only mutual friends (i.e. who do not foaf:know people, who do not foaf:know themselves)
SELECT ?name WHERE { ?x foaf:name ?name FILTER ( !EXISTS { ?x foaf:knows ?y FILTER ( !EXISTS { ?y foaf:knows ?x } ) } ) }
Syntax
The keywords EXISTS and NOT EXISTS can be used both inside and outside FILTERs.
EXISTS is mainly for completeness.
GraphPatternNotTriples ::= ... | ExistsPattern | NotExistsPattern ExistsPattern ::= 'EXISTS' GroupGraphPattern ExistsPattern::= 'NOT EXISTS' GroupGraphPattern
For for FILTER
BuiltInCall ::= ... | 'EXISTS' GroupGraphPattern | 'NOT EXISTS' GroupGraphPattern
This is (syntactically) ExistsPattern and NotExistsPattern again but a parser might trigger different internal structures so it is better to restate the forms.
Algebra Operator
There is a filter operator "exists" that takes a graph pattern. exists returns true/false depending on whether the pattern matches. No additional binding of variables occurs.
Mapping from AST to Algebra
{ ?s rdf:type <t> NOT EXISTS { ?s <p> ?v } ?s :p ?o }
becomes a filter using (not (exists ...)) over the preceding pattern. That is, it is a filter in the algebra but it does not get moved around during translation to the algebra as it would if a FILTER in the syntax (without adding more syntax {} to control that).
(join (filter (not (exists { ?s <p> ?v })) (bgp (triple (?s rdf:type <t>)))) (bgp (triple (?s :p ?o))))
So NOT EXISTS as a graph pattern operator is equivalence to the filter form:
P1 NOT EXISTS P2 {P1 FILTER (NOT EXISTS P2)}
The additional { } puts the FILTER in the right place.
Test Cases
@@TODO
Alternative Design : MINUS
An alternative approach is based on set based MINUS, e.g. SeRQL. This assumes both operands of MINUS return the same variable bindings (in SQL terms, column compatible), in order to allow for the set based semantics. E.g. if the left hand side of MINUS binds 3 variables, the right hand side must bind exactly these three as well.
SQL has both MINUS and EXISTS condition.
Example in SeRQL (from here):
SELECT title FROM {album} dc10:title {title} MINUS SELECT title FROM {album} dc10:title {title}; dc10:creator {creator} WHERE creator like "Paul" USING NAMESPACE dc10 = <http://purl.org/dc/elements/1.0/>, dc11 = <http://purl.org/dc/elements/1.1/>
Distinguish MINUS from NOT EXISTS
Dataset:
_:simon1 foaf:name "Simon" . _:simon2 foaf:name "Simon" . foaf:knows _:simon1 . _:alice foaf:name "Alice" ; foaf:knows _:simon1 . _:bob foaf:name "Bob" . _:eve foaf:knows _:simon1 . _:dan foaf:name "Dan" ; foaf:knows _:simon2 .
UNSAID Query:
SELECT ?who WHERE { ?who foaf:name ?name UNSAID { ?who foaf:knows ?whom . ?whom foaf:name "Simon" . } }
?who | ?name | ?whom |
---|---|---|
_:simon1 | "Simon" | |
_:bob | "Bob" |
MINUS Query:
SELECT ?who WHERE { ?who foaf:name ?name MINUS { ?who foaf:knows ?whom . ?whom foaf:name "Simon" . } }
?who | ?name |
---|---|
_:simon1 | "Simon" |
_:simon2 | "Simon" |
_:alice | "Alice" |
_:bob | "Bob" |
_:dan | "Dan" |
?who | ?whom |
---|---|
_:simon2 | _:simon1 |
_:alice | _:simon1 |
_:eve | _:simon1 |
_:dan | _:simon2 |
Both return ( _:simon1, _:bob ). P1 MINUS P2 has more of a bottom-up evaluation so the temporary set may be larger (include folks like _:eve who know Simon, but don't have a foaf:name).
MINUS with unification on multiple columns
SELECT ?who
WHERE {
?who foaf:name ?name
MINUS {
?who foaf:knows ?whom .
?whom foaf:name ?name .
}
}
?who | ?name |
---|---|
_:simon1 | "Simon" |
_:simon2 | "Simon" |
_:alice | "Alice" |
_:bob | "Bob" |
_:dan | "Dan" |
?who | ?name | ?whom |
---|---|---|
_:simon2 | "Simon" | _:simon1 |
_:alice | "Simon" | _:simon1 |
_:eve | "Simon" | _:simon1 |
_:dan | "Simon" | _:simon2 |
Because the binding { ?who:simon2, ?name:"Simon" } appears in the MINUS set, that row gets removed, yielding the same results as
SELECT ?who
WHERE {
?who foaf:name ?name
UNSAID {
?who foaf:knows ?whom .
?whom foaf:name ?name .
}
}
@@Untested thesis: the difference in these two forms is only detectable in oddball queries with a OPTIONAL in P1 binds variable ?x differently than it does in an OPTIONAL in P2.