This document contains working notes for a heterogenous RDF database. It is not endorsed by the W3C in any way.
description | algae query | SQL query | ||
---|---|---|---|---|
graph | parse tree | constraints | parse tree | |
OrderTracking4-alg.sh | OrderTracking4-queryGraph-img | OrderTracking4-query-img | OrderTracking4-compiledGraph-img | OrderTracking4-compiled-img |
OrderTracking8-alg.sh | OrderTracking8-queryGraph-img | OrderTracking8-query-img | OrderTracking8-compiledGraph-img | OrderTracking8-compiled-img |
The task is to transform the faux SQL compile tree (called so because it is an attempt at looking like what you get when you parse the SQL) into a valid SQL query string. State stored by the altae tree to SQL tree transformer and potentially by the SQL tree to query string transformer tells the results interpreter how fields should be interpreted as literals, or uris or parts the a table key. Ideally, each result from the SQL engine will result in a result in the RDF ResultSet, bit is possible to impose constraints on the returned tuples from he query.
One pproach is to walk the top-level conjunctions of the SQL tree for join candiates. Each attribute-binding constraint has a WHERE clause and a set of aliases that it binds together. A spanning tree can tell us if all aliases are bound against the others. Higher level constraints can keep a list of all the aliases which are mentioned in any branch of contained disjunctions, and those constrained definitively (by not being in a disjunction or by being constrained definitively in all branches of a disjunction). I suspect this algorithm is used for SQL interpreters for validating well formed formulae, but I've never written one and would like help here.
query 8 presents a difficult case where different sides of a disjunction entail different aliases. Since no path of the disjunction may chose not to bind aliases that another binds, the aliases not in common must be outer joined on the appropriate alias.attributes for the defining disjunction branch. It is important that this join, when applies to the corresponding alias/attributes on the unrelated branches of the disjunction present at most one answer (none, resulting in no bindings in the RDF graph, is far preferable).
Working this problem from a different angle, we can solve the different joins problem:
(?n1 p1 ?n2. ?n2 ?pz ?n3) || (?n1 ?px ?n2. ?n1 ?px ?n3.[?n3!=?n2] ?n3 p2 ?n4)
1n1 p1 1n2. 1n2 1pz 1n3. 2n1 2px 2n2. 2n1 2px 2n3. 2n3 p2 2n4.
CREATE TABLE S (s CHAR(4), p CHAR(4), o CHAR(4)); INSERT INTO S VALUES ('1n1', 'p1', '1n2'); INSERT INTO S VALUES ('1n2', '1pz', '1n3'); INSERT INTO S VALUES ('2n1', '2px', '2n2'); INSERT INTO S VALUES ('2n1', '2px', '2n3'); INSERT INTO S VALUES ('2n3', 'p2', '2n4');
SELECT *, (a.p='p1' AND a.o=b.s) AS _disj1a, (a.p=b.p AND a.s=b.s AND a.o!=b.o AND b.o=c.s AND c.p='p2') AS _disj1b FROM S AS a INNER JOIN S AS b ON (a.o=b.s OR a.p=b.p) LEFT OUTER JOIN S AS c ON (b.o=c.s AND c.p='p2') WHERE (a.p='p1' AND a.o=b.s) OR (a.p=b.p AND a.s=b.s AND a.o!=b.o AND b.o=c.s AND c.p='p2')
+------+------+------+------+------+------+------+------+------+---------+---------+ | s | p | o | s | p | o | s | p | o | _disj1a | _disj1b | +------+------+------+------+------+------+------+------+------+---------+---------+ | 1n1 | p1 | 1n2 | 1n2 | 1pz | 1n3 | NULL | NULL | NULL | 1 | 0 | | 2n1 | 2px | 2n2 | 2n1 | 2px | 2n3 | 2n3 | p2 | 2n4 | 0 | 1 | +------+------+------+------+------+------+------+------+------+---------+---------+