SPARQL Query Language for RDF

7 RDF Dataset

The RDF data model expresses information as graphs comprising triples, each with a subject, predicate and object. Many RDF data stores hold multiple RDF graphs and record information about each graph, allowing an application to make queries that involve information from more than one graph.

A SPARQL query is made against an RDF Dataset, which represents such a collection of graphs. Different parts of the query are matched against different graphs, as described in the next section. A dataset contains one aggregate graph and zero or more named graphs, each identified by a URI reference.

Definition: RDF Dataset

An RDF dataset is a set
{ G, (<u₁>, G₁), (<u₂>, G₂), . . . (<u_n>, G_n) }
where G and each G_i are graphs, and each <u_i> is a URI. Each <u_i> is distinct.

G is called the aggregate graph. Each pair (<u_i>, G_i) is called a named graph.

In the previous sections, all queries have been shown executed against a single, aggregate graph. A query does not need to involve the aggregate graph; the query can just involve the named graphs.

All named graphs are merged into the aggregate graph. The semantics of merging are described in RDF Semantics ([13]). The aggregate graph may also contain data not from any named graph. Queries that do not use the GRAPH directive do not use any named graphs.

# Aggregate graph
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

_:x foaf:name "Bob" .
_:x foaf:mbox <mailto:bob@oldcorp.example.org> .

_:y foaf:name "Alice" .
_:y foaf:mbox <mailto:alice@work.example> .

<http://example.org/bob>    dc:publisher  "Bob" .
<http://example.org/alice>  dc:publisher  "Alice" .

# Named graph: http://example.org/bob
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Bob" .
_:a foaf:mbox <mailto:bob@oldcorp.example.org> .

# Named graph: http://example.org/alice
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .

In this example, the aggregate graph contains the merge of named graphs alice and bob. It also contains the names of the publishers of the two graphs.
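
The structure of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the language definition: terms are modelled as plain strings, a graph as a set of triples, and the helper name merge and the dictionary layout are hypothetical. The merge renames blank node labels per graph, so that the two nodes labelled _:a in the named graphs remain distinct in the aggregate graph.

```python
def merge(named_graphs):
    """Merge named graphs into one set of triples.

    Blank node labels are renamed per graph so that nodes that merely
    share a label (such as _:a) in different graphs stay distinct.
    """
    merged = set()
    for index, uri in enumerate(sorted(named_graphs)):
        for triple in named_graphs[uri]:
            merged.add(tuple(
                f"{term}_g{index}" if term.startswith("_:") else term
                for term in triple
            ))
    return merged


# Terms are plain strings: "_:a" is a blank node, "<...>" a URI,
# '"..."' a literal. This encoding is an assumption of the sketch.
named_graphs = {
    "<http://example.org/bob>": {
        ("_:a", "foaf:name", '"Bob"'),
        ("_:a", "foaf:mbox", "<mailto:bob@oldcorp.example.org>"),
    },
    "<http://example.org/alice>": {
        ("_:a", "foaf:name", '"Alice"'),
        ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
    },
}

# Data held only in the aggregate graph: the publisher of each named graph.
publishers = {
    ("<http://example.org/bob>", "dc:publisher", '"Bob"'),
    ("<http://example.org/alice>", "dc:publisher", '"Alice"'),
}

aggregate = merge(named_graphs) | publishers
dataset = {"aggregate": aggregate, "named": named_graphs}

print(len(aggregate), "triples in the aggregate graph")
```

The aggregate graph here holds the four merged triples plus the two publisher triples, while each named graph stays available separately under its URI.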

8 Querying the Dataset

When querying a collection of graphs, the GRAPH keyword allows access to the URIs naming the graphs in the RDF Dataset, or restricts a graph pattern to be applied to a specific named graph.

The following two graphs will be used in examples:

# Named graph: http://example.org/foaf/aliceFoaf
@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix  rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix  rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

_:a  foaf:name     "Alice" .
_:a  foaf:mbox     <mailto:alice@work.example> .
_:a  foaf:knows    _:b .

_:b  foaf:name     "Bob" .
_:b  foaf:mbox     <mailto:bob@work.example> .
_:b  foaf:nick     "Bobby" .
_:b  rdfs:seeAlso  <http://example.org/foaf/bobFoaf> .

<http://example.org/foaf/bobFoaf>
     rdf:type      foaf:PersonalProfileDocument .

# Named graph: http://example.org/foaf/bobFoaf
@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix  rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix  rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

_:z  foaf:mbox     <mailto:bob@work.example> .
_:z  rdfs:seeAlso  <http://example.org/foaf/bobFoaf> .
_:z  foaf:nick     "Robert" .
<http://example.org/foaf/bobFoaf>
     rdf:type      foaf:PersonalProfileDocument .

8.1 Accessing Graph Names

Access to the graph names of the collection of graphs being queried is by a variable in the GRAPH clause.

The query below matches the graph pattern on each of the named graphs in the dataset and forms solutions which have the src variable bound to the URI of the graph being matched. The pattern part of the GRAPH clause only matches triples in a single named graph, in the same way that a graph pattern matches the aggregate graph when there is no GRAPH clause being applied.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?bobNick
WHERE
  {
    GRAPH ?src
    { ?x foaf:mbox <mailto:bob@work.example> .
      ?x foaf:nick ?bobNick
    }
  }

The query result gives the names of the graphs where the information was found and the value of Bob's nick in each:

src	bobNick
<http://example.org/foaf/aliceFoaf>	"Bobby"
<http://example.org/foaf/bobFoaf>	"Robert"
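
The way the variable in the GRAPH clause ranges over graph names can be sketched as a loop over the named graphs. The Python below is a hand-written illustration of this one query, not a query processor: the function name bob_nicks is hypothetical, and URIs and literals are both stored as bare strings for brevity. The point it shows is that both triple patterns are matched inside the same graph before the graph's URI is reported as src.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
BOB_MBOX = "mailto:bob@work.example"

# Named graphs keyed by URI; each graph is a set of (subject, predicate, object).
# Only the triples relevant to the query are reproduced here.
named_graphs = {
    "http://example.org/foaf/aliceFoaf": {
        ("_:a", FOAF + "name", "Alice"),
        ("_:a", FOAF + "mbox", "mailto:alice@work.example"),
        ("_:a", FOAF + "knows", "_:b"),
        ("_:b", FOAF + "name", "Bob"),
        ("_:b", FOAF + "mbox", BOB_MBOX),
        ("_:b", FOAF + "nick", "Bobby"),
    },
    "http://example.org/foaf/bobFoaf": {
        ("_:z", FOAF + "mbox", BOB_MBOX),
        ("_:z", FOAF + "nick", "Robert"),
    },
}


def bob_nicks(graphs):
    """Return sorted (src, bobNick) rows, matching within one graph at a time."""
    rows = []
    for src, graph in graphs.items():
        # ?x foaf:mbox <mailto:bob@work.example>
        owners = {s for (s, p, o) in graph
                  if p == FOAF + "mbox" and o == BOB_MBOX}
        # ?x foaf:nick ?bobNick, with ?x taken from the same graph
        for (s, p, o) in graph:
            if p == FOAF + "nick" and s in owners:
                rows.append((src, o))
    return sorted(rows)


for src, nick in bob_nicks(named_graphs):
    print(src, nick)
```

Because the set of owners is recomputed for each graph, a blank node in one graph is never joined with a triple from another graph.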

8.2 Restricting by Graph URI

The query can restrict the matching applied to a specific graph by supplying the graph URI. This query looks for Bob's nick as given in the graph http://example.org/foaf/bobFoaf.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX data: <http://example.org/foaf/>

SELECT ?nick
WHERE
  {
     GRAPH data:bobFoaf {
         ?x foaf:mbox <mailto:bob@work.example> .
         ?x foaf:nick ?nick }
  }

which yields a single solution:

nick
"Robert"

8.3 Restricting by Bound Variables

A variable used in the GRAPH clause may also be used elsewhere in the query, whether in another GRAPH clause or in a graph pattern matched against the aggregate graph in the dataset.

This can be used to find information in one part of a query and use it to restrict the graphs matched in another part of the query. The query below uses the graph with URI http://example.org/foaf/aliceFoaf to find the profile document for Bob; it then matches another pattern against the profile document graph. The pattern in the second GRAPH clause finds the blank node (variable ?w) for the person with the same mail box (variable ?mbox) as the person found in the first GRAPH clause (variable ?whom). A separate variable is needed because the blank node matched by ?whom in Alice's FOAF file is not the same as the blank node in the profile document: they are in different graphs.

PREFIX  data:  <http://example.org/foaf/>
PREFIX  foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX  rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?mbox ?nick ?ppd
WHERE
{
  GRAPH data:aliceFoaf
  {
    ?alice foaf:mbox <mailto:alice@work.example> ;
           foaf:knows ?whom .
    ?whom  foaf:mbox ?mbox ;
           rdfs:seeAlso ?ppd .
    ?ppd  a foaf:PersonalProfileDocument .
  } .
  GRAPH ?ppd
  {
      ?w foaf:mbox ?mbox ;
         foaf:nick ?nick 
  }
}

mbox	nick	ppd
<mailto:bob@work.example>	"Robert"	<http://example.org/foaf/bobFoaf>

Any triple in Alice's FOAF file giving Bob's nick is not used to provide a nick for Bob because the pattern involving variable nick is restricted by ppd to a particular Personal Profile Document.

8.4 Named and Aggregate Graphs

Query patterns can involve both the aggregate graph and the named graphs. In this example, an aggregator has read in a web resource on two different occasions. Each time a graph is read into the aggregator, it is given a URI by the local system. The graphs are nearly the same but the email address for "Bob" has changed.

The aggregate graph is used to record the provenance information, and the RDF data actually read is kept in two separate graphs, each of which is given a different URI by the system. The RDF dataset consists of two named graphs and the information about them.

RDF Dataset:

# Aggregate graph
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<urn:x-local:graph1> dc:publisher "Bob" .
<urn:x-local:graph1> dc:date "2004-12-06"^^xsd:date .

<urn:x-local:graph2> dc:publisher "Bob" .
<urn:x-local:graph2> dc:date "2005-01-10"^^xsd:date .

# Graph: locally allocated URI: urn:x-local:graph1
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .

_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@oldcorp.example.org> .

# Graph: locally allocated URI: urn:x-local:graph2
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .

_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@newcorp.example.org> .

This query finds email addresses, detailing the name of the person and the date the information was discovered.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

SELECT ?name ?mbox ?date
WHERE
  {  ?g dc:publisher ?name ;
        dc:date ?date .
    GRAPH ?g 
      { ?person foaf:name ?name ; foaf:mbox ?mbox }
  }

The results show that the email address for "Bob" has changed.

name	mbox	date
"Bob"	<mailto:bob@oldcorp.example.org>	"2004-12-06"^^xsd:date
"Bob"	<mailto:bob@newcorp.example.org>	"2005-01-10"^^xsd:date

The URI for the date datatype has been abbreviated in the results for clarity.
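
The interplay between the aggregate graph and the GRAPH clause in this query can be sketched as a two-stage match. The Python below is a minimal illustration, assuming terms are plain strings and variables start with "?"; the helpers match_triple and match_all are hypothetical names, not part of SPARQL. The outer pattern is matched against the aggregate graph, and each resulting binding of ?g then selects the named graph against which the inner pattern is matched.

```python
def match_triple(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_all(patterns, graph, binding):
    """Return every extension of binding that matches all patterns in one graph."""
    solutions = [binding]
    for pattern in patterns:
        solutions = [m for s in solutions for t in graph
                     for m in [match_triple(pattern, t, s)] if m is not None]
    return solutions


aggregate = {
    ("<urn:x-local:graph1>", "dc:publisher", '"Bob"'),
    ("<urn:x-local:graph1>", "dc:date", '"2004-12-06"^^xsd:date'),
    ("<urn:x-local:graph2>", "dc:publisher", '"Bob"'),
    ("<urn:x-local:graph2>", "dc:date", '"2005-01-10"^^xsd:date'),
}
named = {
    "<urn:x-local:graph1>": {
        ("_:a", "foaf:name", '"Alice"'),
        ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
        ("_:b", "foaf:name", '"Bob"'),
        ("_:b", "foaf:mbox", "<mailto:bob@oldcorp.example.org>"),
    },
    "<urn:x-local:graph2>": {
        ("_:a", "foaf:name", '"Alice"'),
        ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
        ("_:b", "foaf:name", '"Bob"'),
        ("_:b", "foaf:mbox", "<mailto:bob@newcorp.example.org>"),
    },
}

outer = [("?g", "dc:publisher", "?name"), ("?g", "dc:date", "?date")]
inner = [("?person", "foaf:name", "?name"), ("?person", "foaf:mbox", "?mbox")]

rows = []
for solution in match_all(outer, aggregate, {}):
    graph = named.get(solution["?g"])  # GRAPH ?g with ?g already bound
    if graph is not None:
        for full in match_all(inner, graph, solution):
            rows.append((full["?name"], full["?mbox"], full["?date"]))
rows.sort(key=lambda row: row[2])

for row in rows:
    print(*row)
```

Alice's mail box does not appear in the rows because ?name is already bound to the publisher name "Bob" when the inner pattern is matched.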

8.5 Definition for GRAPH

Definition: RDF Dataset Graph Pattern

If D is a dataset { G, (<u₁>, G₁), ... }, and P is a graph pattern, then S is a pattern solution of GRAPH(g, P) if either of the following holds:

g is a URI where g = <u_i> for some i, and S is a pattern solution of P on G_i.
g is a variable, S maps the variable g to <u_i> for some i, and S is a pattern solution of P on G_i.
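
The two cases of this definition can be read as a small evaluation procedure. The Python below is a sketch under simplifying assumptions: P is restricted to a list of triple patterns, terms are plain strings, variables start with "?", and the names eval_graph, match_triple and match_all are hypothetical. When g is a URI, P is matched on the graph with that name only; when g is a variable, each named graph is tried in turn and g is bound to its URI.

```python
def match_triple(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_all(patterns, graph, binding):
    """Return every extension of binding that matches all patterns in one graph."""
    solutions = [binding]
    for pattern in patterns:
        solutions = [m for s in solutions for t in graph
                     for m in [match_triple(pattern, t, s)] if m is not None]
    return solutions


def eval_graph(named, g, patterns, binding=None):
    """Solutions of GRAPH(g, P) over the named graphs of a dataset."""
    binding = dict(binding or {})
    if g.startswith("?") and g not in binding:
        # g is a variable: S maps g to <u_i> and solves P on G_i.
        starts = [(uri, {**binding, g: uri}) for uri in sorted(named)]
    else:
        # g is a URI (or a variable already bound to one): only that graph.
        uri = binding.get(g, g)
        starts = [(uri, binding)] if uri in named else []
    solutions = []
    for uri, start in starts:
        solutions.extend(match_all(patterns, named[uri], start))
    return solutions


named = {
    "<http://example.org/foaf/aliceFoaf>": {
        ("_:b", "foaf:mbox", "<mailto:bob@work.example>"),
        ("_:b", "foaf:nick", '"Bobby"'),
    },
    "<http://example.org/foaf/bobFoaf>": {
        ("_:z", "foaf:mbox", "<mailto:bob@work.example>"),
        ("_:z", "foaf:nick", '"Robert"'),
    },
}
pattern = [("?x", "foaf:mbox", "<mailto:bob@work.example>"),
           ("?x", "foaf:nick", "?nick")]

by_uri = eval_graph(named, "<http://example.org/foaf/bobFoaf>", pattern)
by_var = eval_graph(named, "?src", pattern)
print([s["?nick"] for s in by_uri])
print([(s["?src"], s["?nick"]) for s in by_var])
```

A URI that names no graph in the dataset yields no solutions in this sketch, since there is no i for which g = <u_i>.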

9 Specifying RDF Datasets

A SPARQL query may use a FROM clause to supplement the dataset.

FROM uri

The graph identified by the URI SHOULD be available in the aggregate graph and as a named graph identified by that URI. In the following example, the dataset includes at least the two graphs listed in the FROM NAMED clauses.

# Graph: http://example.org/bob
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Bob" .
_:a foaf:mbox <mailto:bob@oldcorp.example.org> .

# Graph: http://example.org/alice
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?name

FROM NAMED <http://example.org/alice>
FROM NAMED <http://example.org/bob> 

WHERE
{ GRAPH ?src { ?x foaf:name ?name } }

src	name
<http://example.org/bob>	"Bob"
<http://example.org/alice>	"Alice"

B. References

Normative References

[13]: RDF Semantics, P. Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/. Latest version available at http://www.w3.org/TR/rdf-mt/.

Issues

Default trust only works if the query engine MAY NOT (and has not and will not during the query) absorb data in a FROM NAMED into the default graph. c.f. non-empty default graphs
How does one query for this: Bob said "Sue said 'Clark can fly'" ?

Changes

s/default graph/aggregate graph/
pulled out some query processor references
s/GRAPH/SOURCE/ to leave room for a named graph specification in the future?