Copyright © 2004 W3C® ( MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, and document use rules apply.
RDF is a flexible, extensible way to represent information about World Wide Web resources. It is used to represent, among other things, personal information, social networks, metadata about digital artifacts like music and images, as well as provide a means of integration over disparate sources of information. A standardized query language for RDF data with multiple implementations offers developers and end users a way to write and to consume the results of queries across this wide range of information. This document describes a query language for RDF, called SPARQL, for querying RDF data.
This document describes the query language part of SPARQL for easy access to RDF stores. It is designed to meet the requirements and design objectives described in the W3C RDF Data Access Working Group (DAWG) document "RDF Data Access Use Cases and Requirements".
This is a first Public Working Draft of the Data Access SPARQL Query Language by the RDF Data Access Working Group (part of the Semantic Web Activity) for review by W3C Members and other interested parties. It represents the editors' best effort to reflect implementation experience and incorporate input from various members of the WG, but is not yet endorsed by the WG as a whole. Some sections are incomplete, and a number of issues remain open, both in the document and on the working group's issues list. Please send comments to public-rdf-dawg-comments@w3.org, a mailing list with a public archive.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing [and excluding] a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.
Per section 4 of the W3C Patent Policy, Working Group participants have 150 days from the title page date of this document to exclude essential claims from the W3C RF licensing requirements with respect to this document series. Exclusions are with respect to the exclusion reference document, defined by the W3C Patent Policy to be the latest version of a document in this series that is published no later than 90 days after the title page date of this document.
See also:
Section status: bare outline
Key features in one page. Refs to other documents by DAWG.
An RDF graph is a set of triples, each consisting of a subject, an object, and a property relationship between them as defined in RDF Concepts and Abstract syntax. These triples can come from a variety of sources. For instance, they may come directly from an RDF document. They may be inferred from other RDF triples. They may be the RDF expression of data stored in other formats, such as XML or relational databases.
SPARQL is a query language for accessing such RDF graphs. It provides facilities to:
As a data access language, it is suitable for both local and remote use. When used across networks, the companion document [@@ protocol document not yet published @@] describes a remote access protocol.
When undeclared, the namespace rdf stands in place of http://www.w3.org/1999/02/22-rdf-syntax-ns#, the namespace rdfs stands in place of http://www.w3.org/2000/01/rdf-schema#, and the namespace xsd stands in place of http://www.w3.org/2001/XMLSchema#.
Queries match graph patterns against the target graph of the query. Patterns are like graphs but may have named variables in place of some of the nodes or predicates; the simplest graph patterns are single triple patterns, and graph patterns can be combined using various operators into more complicated graph patterns.
A binding is a mapping from a variable in a query to a term. A pattern solution is a set of bindings which, when applied to the variables in the query, can be used to produce a subgraph of the target graph; query results are a set of pattern solutions. If there are no result mappings, the query results are an empty set.
Pictorially, suppose we have a graph with two triples and the given triple pattern:
[Figures: triple1, triple2, triplePattern1]
with the result:
who | addr |
---|---|
_:1 | "alice@work.example" |
_:2 | "robt@home.example" |
RDF graphs are constructed from one or more triples, ex. graph1.
A query for graphPattern1 will return the email address of people known by Alice (specifically, the person with the mbox alice@work.example). When matched against the example RDF graph, we get one result mapping which binds three variables:
who | whom | address |
---|---|---|
_:1 | _:2 | "robt@home.example" |
The example below shows a query to find the title of a book from the information in an RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. Here, the SELECT clause names the variable of interest to the application, and the WHERE clause has one triple pattern.
Data:
<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .
Query:
SELECT ?title WHERE ( <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title )
Query Result:
title |
---|
"SPARQL Tutorial" |
The terms delimited by "<>" are URI References [13] (URIRefs); URIRefs can also be abbreviated with an XML QName-like form [14]; this is syntactic assistance and is translated to the full URIRef. Other RDF terms are literals which, following N-Triples syntax [7], are a string with an optional language tag (introduced with '@') and an optional datatype URIRef (introduced by '^^').
Variables in SPARQL queries have global scope; it is the same variable everywhere the name is used. Variables are indicated by '?'; the '?' does not form part of the variable's name.
An alternative choice here is '$'. Awaiting reports of usage in DB connection technologies.
Because URIRefs can be long, SPARQL provides an abbreviation mechanism. Prefixes can be defined and a QName-like syntax provides shorter forms: we also use the N3/Turtle [15] prefix mechanism for describing data. Prefixes apply to the whole query.
PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?title WHERE ( <http://example.org/book/book1> dc:title ?title )
PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX : <http://example.org/book/> SELECT ?title WHERE ( :book1 dc:title ?title )
Similarly, we abbreviate data:
@prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix : <http://example.org/book/> . :book1 dc:title "SPARQL Tutorial" .
Prefixes are syntactic: the prefix name does not affect the query, nor do prefix names in queries need to be the same prefixes as used for data. This query is equivalent to the previous one and will give the same results when applied to the same graph.
PREFIX dcore: <http://purl.org/dc/elements/1.1/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> SELECT ?title WHERE ( ?book dcore:title ?title )
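The point that prefixes are purely syntactic can be illustrated with a minimal Python sketch. The helper name expand_qname and the string representation of terms are assumptions made for this illustration only; they are not part of SPARQL.

```python
# Hypothetical helper: expands a QName-like form to a full URIRef.
# Terms are modelled as plain strings, which is an assumption of this sketch.
def expand_qname(term, prefixes):
    """Return the full URIRef for a prefixed name, or the term unchanged."""
    if term.startswith(("<", "?", '"')):
        return term  # already a URIRef, a variable, or a literal
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return "<" + prefixes[prefix] + local + ">"
    return term  # unknown prefix (for example a bNode label) is left alone

# The query and the data use different prefix names for the same namespace.
query_prefixes = {"dcore": "http://purl.org/dc/elements/1.1/"}
data_prefixes = {"dc": "http://purl.org/dc/elements/1.1/"}

in_query = expand_qname("dcore:title", query_prefixes)
in_data = expand_qname("dc:title", data_prefixes)
print(in_query == in_data)
```

Both names expand to the same URIRef, so matching behaves identically whichever prefix name is chosen.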
RDF has typed literals. Such literals are written using "^^". Integers can be directly written and are interpreted as typed literals of datatype xsd:integer.
@prefix ns: <http://example.org/ns#> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . @prefix : <http://example.org/book/> . :book1 ns:numPages "200"^^xsd:integer . :book2 ns:numPages 100 .
The building blocks of queries are triple patterns. Syntactically, a SPARQL triple pattern is a subject, predicate and object delimited by parentheses. The previous example shows a triple pattern with a variable subject (the variable book), a predicate of dcore:title and a variable object (the variable title).
( ?book dcore:title ?title )
A triple pattern applied to a graph matches all triples with identical RDF terms for the corresponding subject, predicate and object. The variables in the triple pattern, if any, are bound to the corresponding RDF terms in the matching triples.
Definition: RDF Term
An RDF Term is anything that can occur in the RDF data model.
let RDF-U be the set of all RDF URI References
let RDF-L be the set of all RDF Literals
let RDF-B be the set of all bNodes
The set of RDF Terms, RDF-T, is RDF-U union RDF-L union RDF-B
Definition: Query Variable
Let V be the set of all query variables. V and RDF-T are disjoint.
A query variable is a name, used to define queries as graph patterns. A query variable is associated with RDF terms in a graph by a binding.
An RDF triple contains three components:
In SPARQL, a triple pattern is an RDF triple but with the addition that components can be a query variable instead.
Definition: Triple Pattern
The set of triple patterns is (RDF-U union RDF-B union V) x (RDF-U union V) x (RDF-T union V)
Definition: Binding
A binding is a pair which defines a mapping from a variable to an RDF Term. If B is such a binding, var(B) is the variable of the binding, and val(B) is the RDF term.
In this document, we illustrate bindings in results in tabular form:
x | y |
---|---|
"Alice" | "Bob" |
Not every binding needs to exist in every row of the table.
Definition: Triple Pattern Matching
A binding, B, defines a substitution subst(T, B) on triple T that replaces every occurrence of the variable, var(B), with the corresponding RDF Term, val(B).
If SB is a set of bindings, with no two bindings having the same variable, we write subst(T, SB) for the triple pattern formed by substituting variables in T using each B in SB.
Triple Pattern T matches RDF graph G with set of bindings, SB, if subst(T, SB) is a triple of G.
If the same variable name is used more than once in a pattern then, within each solution to the query, the variable has the same value.
For example, the query:
SELECT * WHERE ( ?x ?x ?v )
matches the triple:
rdf:type rdf:type rdf:Property .
with set of bindings:
x | v |
---|---|
rdf:type | rdf:Property |
It does not match the triple:
rdfs:seeAlso rdf:type rdf:Property .
because the variable x would need to be both rdfs:seeAlso and rdf:type in the same set of bindings.
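The definition of triple pattern matching and the example above can be sketched in Python. This is an illustration under stated assumptions, not a reference implementation: terms are plain strings, variables are strings that keep their leading '?', a set of bindings is a dict, and the function names are hypothetical.

```python
def is_var(term):
    # Assumption of this sketch: variables are strings starting with '?'.
    return isinstance(term, str) and term.startswith("?")

def subst(pattern, bindings):
    """subst(T, SB): replace every bound variable in T with its RDF term."""
    return tuple(bindings.get(t, t) if is_var(t) else t for t in pattern)

def match_triple(pattern, triple, bindings=None):
    """Return the set of bindings under which pattern equals triple, else None."""
    result = dict(bindings or {})
    for p, t in zip(pattern, triple):
        if is_var(p):
            # setdefault binds the variable on first use and returns the
            # existing value on reuse, so a clash is detected here.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

graph = {
    ("rdf:type", "rdf:type", "rdf:Property"),
    ("rdfs:seeAlso", "rdf:type", "rdf:Property"),
}
pattern = ("?x", "?x", "?v")

candidates = (match_triple(pattern, triple) for triple in sorted(graph))
solutions = [sb for sb in candidates if sb is not None]
print(solutions)
```

Only the first triple yields a set of bindings; for the second, x would need two different values, so match_triple returns None.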
The keyword WHERE is followed by a Graph Pattern which is made of one or more Triple Patterns. These Triple Patterns are "and"ed together. More formally, the Graph Pattern is the conjunction of the Triple Patterns. In each query solution, all the triple patterns must be satisfied with the same binding of variables to values.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> . _:a foaf:name "Johnny Lee Outlaw" . _:a foaf:mbox <mailto:jlow@example.com> .
There is a bNode [12] in this dataset. Just within the file, for encoding purposes, the bNode is identified by _:a but the information about the bNode label is not in the RDF graph. No query will be able to identify that bNode by the label used in the serialization.
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?mbox WHERE ( ?x foaf:name "Johnny Lee Outlaw" ) ( ?x foaf:mbox ?mbox )
Query Result:
mbox |
---|
<mailto:jlow@example.com> |
This query contains a conjunctive graph pattern. A conjunctive graph pattern is a set of triple patterns, each of which must match for the graph pattern to match.
Definition: Graph Pattern (Partial Definition) – Conjunction
A set of triple patterns is a graph pattern GP.
For binding B, we write subst(GP, B) for the set of triple patterns formed by applying, for each T in GP, subst(T, B).
Definition: Graph Pattern Matching
For set of bindings, SB, we write subst(GP, SB) for the graph pattern produced by using each binding B in SB to substitute variables in GP with the corresponding RDF Terms as given by SB.
Graph Pattern GP matches RDF graph G with set of bindings SB if subst(GP, SB) is a subgraph of G.
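Conjunctive graph pattern matching can be sketched as repeated triple pattern matching, where each triple pattern extends the bindings found so far. The code below is a naive illustration, assuming terms are plain strings and bindings are dicts; the function names are hypothetical and no optimization is attempted.

```python
def is_var(term):
    return term.startswith("?")

def match_triple(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def solve(graph_pattern, graph):
    """Return every set of bindings under which all triple patterns match."""
    solutions = [{}]  # start with one empty set of bindings
    for pattern in graph_pattern:
        extended = []
        for sb in solutions:
            for triple in graph:
                candidate = match_triple(pattern, triple, sb)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions

graph = [
    ("_:a", "foaf:name", '"Johnny Lee Outlaw"'),
    ("_:a", "foaf:mbox", "<mailto:jlow@example.com>"),
]
graph_pattern = [
    ("?x", "foaf:name", '"Johnny Lee Outlaw"'),
    ("?x", "foaf:mbox", "?mbox"),
]

solutions = solve(graph_pattern, graph)
mboxes = [sb["?mbox"] for sb in solutions]
print(mboxes)
```

Because every triple pattern must match with the same bindings, a pattern that finds no triple empties the list of solutions.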
The results of a query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query, depending on the data.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> . _:a foaf:name "Johnny Lee Outlaw" . _:a foaf:mbox <mailto:jlow@example.com> . _:b foaf:name "Peter Goodguy" . _:b foaf:mbox <mailto:peter@example.org> .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name, ?mbox WHERE ( ?x foaf:name ?name ) ( ?x foaf:mbox ?mbox )
Query Result:
name | mbox |
---|---|
"Johnny Lee Outlaw" | <mailto:jlow@example.com> |
"Peter Goodguy" | <mailto:peter@example.org> |
The results enumerate the RDF terms to which the selected variables can be bound in the graph pattern. In the above example, the following two subsets of the data caused the two matches.
_:a foaf:name "Johnny Lee Outlaw" . _:a foaf:mbox <mailto:jlow@example.com> .
_:b foaf:name "Peter Goodguy" . _:b foaf:mbox <mailto:peter@example.org> .
For a simple, conjunctive graph pattern match, all the variables used in the query pattern will be bound in every solution.
Definition: Pattern Solution
A Pattern Solution of Graph Pattern GP on graph G is any set of bindings SB such that GP matches G with SB. Each binding B in SB has a different variable.
For a graph pattern GP formed as a set of triple patterns, subst(GP, SB) has no variables and is a subgraph of G.
Definition: Query Solution
A Query Solution is a Pattern Solution where the pattern is the whole pattern of the query.
Definition: Query Results
The Query Results, for a given graph pattern GP on G, is R(GP,G), and is the set of all query solutions such that GP matches G.
R(GP, G) may be the empty set.
Graph pattern matching creates bindings of variables. It is possible to further restrict possible solutions by constraining the allowable binding of variables to RDF Terms. Constraints in SPARQL take the form of boolean-valued expressions; the language also allows application-specific filter functions.
Data:
@prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix : <http://example.org/book/> . @prefix ns: <http://example.org/ns#> . :book1 dc:title "SPARQL Tutorial" . :book1 ns:price 42 . :book2 dc:title "The Semantic Web" . :book2 ns:price 23 .
Query:
PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX ns: <http://example.org/ns#> SELECT ?title ?price WHERE ( ?x dc:title ?title ) ( ?x ns:price ?price ) AND ?price < 30
Query Result:
title | price |
---|---|
"The Semantic Web" | 23 |
By having a constraint on the "price" variable, only one of the books matches the query. Like a triple pattern, this is just a restriction on the allowable values of a variable.
Definition: Constraints
A constraint is a boolean-valued expression of variables and RDF Terms that can be applied to restrict query solutions.
Definition: Graph Pattern (Partial Definition) – Constraints
A graph pattern can also include constraints. These constraints further restrict the possible query solutions of matching a graph pattern with a graph.
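A constraint can be pictured as a boolean-valued function applied to each query solution. The sketch below assumes solutions are Python dicts and that prices are already plain numbers; apply_constraint is a hypothetical name used only for this illustration.

```python
# Query solutions produced by matching the two triple patterns of the
# price example, written out by hand for this sketch.
solutions = [
    {"?title": "SPARQL Tutorial", "?price": 42},
    {"?title": "The Semantic Web", "?price": 23},
]

def apply_constraint(solutions, constraint):
    """Keep only the solutions for which the boolean-valued constraint holds."""
    return [sb for sb in solutions if constraint(sb)]

# Corresponds to the constraint "?price < 30" in the query above.
cheap = apply_constraint(solutions, lambda sb: sb["?price"] < 30)
print(cheap)
```

The constraint does not add bindings; it only removes solutions that fail the test.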
SPARQL defines a set of operations that all implementations must provide. In addition, there is an extension mechanism for boolean tests that are specific to an application domain or kind of data.
So far, the graph matching and value constraints allow queries that perform exact matches on a graph. For every solution of the query, every variable has an RDF Term. Sometimes, useful additional information about one item of interest in the graph can be found while, for another item, the information is not present. If the application writer wants that additional information, the query should not fail just because some of it is missing.
Optional portions of the graph may be specified in either of two equivalent ways:
OPTIONAL (?s ?p ?o)...
[ (?s ?p ?o)... ]
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
_:a rdf:type foaf:Person .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .
_:b rdf:type foaf:Person .
_:b foaf:name "Bob" .
Query (these two are the same query using slightly different syntax):
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?mbox WHERE ( ?x foaf:name ?name ) OPTIONAL ( ?x foaf:mbox ?mbox )
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?mbox WHERE ( ?x foaf:name ?name ) [ ( ?x foaf:mbox ?mbox ) ]
Query result:
name | mbox |
---|---|
"Alice" | <mailto:alice@work.example> |
"Bob" | |
Now, there is no value of mbox where the name is "Bob". It is left unbound in the result.
This query finds the names of people in the dataset and, if there is an mbox property, retrieves that as well. In the example, only a single triple pattern is given in the optional match part of the query but in general it is a graph pattern.
For each optional block, the query processor attempts to match the query pattern. Failure to match the block does not cause this query solution to be rejected. The whole graph pattern of an optional block must match for the optional to add to the query solution.
A query may have zero or more top-level optional blocks. These blocks will fail or provide bindings independently. Optional blocks can also be nested, that is, an optional block may appear inside another optional block.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
_:a foaf:name "Alice" .
_:a foaf:homepage <http://work.example.org/alice/> .
_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@work.example> .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?mbox ?hpage WHERE ( ?x foaf:name ?name ) [ ( ?x foaf:mbox ?mbox ) ] [ ( ?x foaf:homepage ?hpage ) ]
Query result:
name | mbox | hpage |
---|---|---|
"Alice" | | <http://work.example.org/alice/> |
"Bob" | <mailto:bob@work.example> | |
In this example, there are two independent optional blocks. Each depends only on variables defined in the non-optional part of the graph pattern. If a new variable is introduced in an optional block (as mbox and hpage are introduced in the previous example), that variable can be bound in that block and can not be mentioned in a subsequent block.
A variable must only be bound in one optional block: either it is known to be bound in an outer optional block or set of triple patterns, or it is used in only one optional block at a given level of nesting.
The purpose of this rule is to enable the query processor to process the query blocks in arbitrary (or optimized) order. If a variable was introduced in one optional block and mentioned in another, it would be used to constrain the second. Reversing the order of the optional blocks would reverse the blocks in which the variable was introduced and was used to constrain. Such a query could give different results depending on the order in which those blocks were evaluated.
In an optional match, either a graph pattern matches a graph and so defines a set of bindings, or gives an empty set of bindings but does not cause matching to fail overall. Any graph pattern optionally matches any graph in some way; it provides an empty set of bindings if the graph pattern does not match the graph.
Definition: Optional Matching
Given graph pattern GP1, and graph pattern GP2, let GP = (GP1 union GP2).
The optional match of GP2 of graph G, given GP1, defines a pattern solution PS such that:
if GP matches G, then PS is the set of bindings where GP matches G
else PS is a pattern solution of GP1 matching G.
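The definition of optional matching can be sketched as follows: each solution of GP1 is extended by GP2 where that is possible and kept unchanged otherwise. This is a naive Python illustration assuming string terms and dict bindings; solve and optional are hypothetical names.

```python
def match_triple(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def solve(graph_pattern, graph, start=None):
    """All extensions of the starting bindings that match every triple pattern."""
    solutions = [dict(start or {})]
    for pattern in graph_pattern:
        solutions = [
            candidate
            for sb in solutions
            for triple in graph
            for candidate in [match_triple(pattern, triple, sb)]
            if candidate is not None
        ]
    return solutions

def optional(solutions, optional_pattern, graph):
    """Extend each solution with the optional block, or keep it as it is."""
    out = []
    for sb in solutions:
        extended = solve(optional_pattern, graph, sb)
        out.extend(extended if extended else [sb])
    return out

graph = [
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
    ("_:b", "foaf:name", '"Bob"'),
]
required = solve([("?x", "foaf:name", "?name")], graph)
results = optional(required, [("?x", "foaf:mbox", "?mbox")], graph)
rows = sorted((sb["?name"], sb.get("?mbox")) for sb in results)
print(rows)
```

In the rows, None stands for a variable left unbound, as for the mbox of "Bob" in the example above.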
Section status: placeholder.
This section will discuss combining graph patterns.
Graph patterns may contain nested patterns. We've seen this earlier in optional matches. Nested patterns are delimited with ()s:
It is likely that the grouping markers will change to be {} - braces - but the grammar rework has not been done.
( ( ?s ?p ?n2 ) ( ?n2 ?p2 ?n3 ) )
Definition: Graph Pattern – Nesting
A graph pattern GP can contain other graph patterns GPi. A query solution of Graph Pattern GP on graph G is any B such that each element GPi of GP matches G with binding B.
For example:
SELECT ?name ?mbox PREFIX foaf: <http://xmlns.com/foaf/0.1/> WHERE ( ?x foaf:name ?name ) ( ( ?x foaf:mbox ?mbox ) )
Because this example has a simple conjunction for the nested pattern, and because the nested pattern is a conjunctive element in the outer pattern, this has the same results:
SELECT ?name ?mbox PREFIX foaf: <http://xmlns.com/foaf/0.1/> WHERE ( ?x foaf:name ?name ) ( ?x foaf:mbox ?mbox )
Optional blocks can be nested. The outer optional block must match for any nested one to apply. That is, the outer graph pattern is fixed for the purposes of any nested optional block.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .
_:a vcard:N _:d .
_:d vcard:Family "Hacker" .
_:d vcard:Given "Alice" .
_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@work.example> .
_:c foaf:name "Eve" .
_:c vcard:N _:e .
_:e vcard:Family "Hacker" .
_:e vcard:Given "Eve" .
Query:
SELECT ?foafName ?mbox ?fname ?gname
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
WHERE
  ( ?x foaf:name ?foafName )
  [ ( ?x foaf:mbox ?mbox ) ]
  [ ( ?x vcard:N ?vc )
    [ ( ?vc vcard:Family ?fname )
      ( ?vc vcard:Given ?gname ) ]
  ]
Query result:
foafName | mbox | fname | gname |
---|---|---|---|
"Alice" | <mailto:alice@work.example> | "Hacker" | "Alice" |
"Bob" | <mailto:bob@work.example> | | |
"Eve" | | "Hacker" | "Eve" |
This query finds the name, optionally the mbox, and also optionally the vCard structured name components. By nesting the optional access to vcard:Family and vcard:Given, the query only reaches these if there is a vcard:N property. It is possible to expand out optional blocks to remove nesting at the cost of duplication of expressions. Here, the expression is a simple triple pattern on vcard:N but it could be a complex graph match with value constraints.
Section status: working group is not working on this feature at the moment.
Feature on issues list as to whether to have it in the language or not.
Section status: working group is not working on this feature at the moment. It is currently likely to be dropped from the SPARQL query language.
Section status: Initial Draft
Charter: RDF graphs are often constructed through logical inference, and sometimes the graphs are never materialized. Such graphs may be arbitrarily large or infinite.
A SPARQL query treats an RDF graph purely as data. A query processor is unaware of any inference an RDF store may provide and SPARQL makes no distinction between inferred triples and asserted triples.
May need revising - depends on discussions of accessing direct subclass relationship in an RDF inferred graph and other cases.
A SPARQL query is executed against some real or virtual RDF graph. The RDF graph can be given implicitly by the local API, externally from the SPARQL protocol or it can be specified in the query itself. The FROM clause gives URIs that the query processor can use to supply RDF Graphs for the query execution.
Query:
SELECT * FROM <http://www.w3.org/2000/08/w3c-synd/home.rss> WHERE ( ?x ?y ?z )
A query processor could use this URI to retrieve the document, parse it and use the resulting triples to provide the query graph. Aggregate graphs may also be queried by using multiple source URIs in the FROM clause such as:
SELECT ... FROM <uri1>, <uri2>
An aggregate graph is the RDF-merge of a number of subgraphs. Implementations may provide a single web service target that aggregates multiple source URIs, accessed by the DAWG protocol or some other mechanism.
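An RDF-merge keeps the bNodes of each source graph distinct, even when two documents happen to use the same bNode label. The following Python sketch shows one way to picture this, assuming graphs are sets of string triples; the renaming scheme and the name rdf_merge are assumptions of the sketch, not something the draft prescribes.

```python
def rdf_merge(graphs):
    """Union the graphs after renaming bNode labels apart per source graph."""
    merged = set()
    for index, graph in enumerate(graphs):
        def rename(term, index=index):
            # bNode labels are local to a document, so each source graph
            # gets its own label space before the union is taken.
            if term.startswith("_:"):
                return f"_:g{index}_{term[2:]}"
            return term
        merged.update(tuple(rename(t) for t in triple) for triple in graph)
    return merged

# Two hypothetical source graphs that both use the label _:1.
first = {("_:1", "foaf:mbox", "<mailto:alice@work.example>")}
second = {("_:1", "foaf:mbox", "<mailto:bob@work.example>")}

merged = rdf_merge([first, second])
subjects = {s for s, _, _ in merged}
print(sorted(subjects))
```

A plain set union would have given both mboxes to a single bNode, which neither source document states.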
Will need to significantly update when the protocol is drafted.
The RDF graph may be constructed through inference rather than retrieval or never be materialized.
Section status: under discussion – likely to change. Just some initial text here.
While the RDF data model is limited to expressing triples with a subject, predicate and object, many RDF data stores augment this with a notion of the source of each triple (see the note at the end of this section). Typically, implementations associate RDF triples or graphs with a URI specifying their real or virtual origin. The SOURCE keyword allows you to query or constrain the source of the following triple pattern or nested graph pattern. The general form of the SOURCE query is:
SOURCE <identifier> (?s ?p ?o)
If the identifier is a constant or bound variable, it will constrain the matches for the following term. If an unbound variable, the variable will be bound to all of the known sources for the term. A data store that does not support source SHOULD bind SOURCE variables to NULL and fail to match source-constrained queries.
aliceFoaf.n3:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:1 foaf:mbox <mailto:alice@work.example>.
_:1 foaf:knows _:2.
_:2 foaf:mbox <mailto:bob@work.example>.
_:2 foaf:age 32.
bobFoaf.n3:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:1 foaf:mbox <mailto:bob@work.example>.
_:1 foaf:PersonalProfileDocument <bobFoaf.n3>.
_:1 foaf:age 35.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?mbox ?age ?ppd
WHERE ( ?alice foaf:mbox <mailto:alice@work.example> )
      ( ?alice foaf:knows ?whom )
      ( ?whom foaf:mbox ?mbox )
      ( ?whom foaf:PersonalProfileDocument ?ppd )
      SOURCE ?ppd ( ?whom foaf:age ?age )
mbox | age | ppd |
---|---|---|
<mailto:bob@work.example> | 35 | <bobFoaf.n3> |
This query returns the email addresses of people that Alice knows. It also returns their age according to their PersonalProfileDocument documents, as well as the URI of the document. Alice's guess of Bob's age (32) is not returned.
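One way to picture SOURCE is a store that keeps each triple together with the URI of its origin. The Python sketch below makes several assumptions: triples are stored as (source, subject, predicate, object) tuples, Bob is identified by the single hypothetical label _:bob across both documents, and the function names are invented for the illustration.

```python
# Hypothetical store content, loosely following the two documents above.
quads = [
    ("<aliceFoaf.n3>", "_:bob", "foaf:age", 32),
    ("<bobFoaf.n3>", "_:bob", "foaf:age", 35),
    ("<bobFoaf.n3>", "_:bob", "foaf:PersonalProfileDocument", "<bobFoaf.n3>"),
]

def match_quad(source, pattern, quad, bindings):
    """Match the source term and the triple pattern against one stored quad."""
    result = dict(bindings)
    for p, t in zip((source,) + tuple(pattern), quad):
        if isinstance(p, str) and p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def match_with_source(source, pattern, quads, bindings):
    """Return the extended bindings for every quad that matches."""
    matches = (match_quad(source, pattern, quad, bindings) for quad in quads)
    return [m for m in matches if m is not None]

# ?ppd is already bound, so it constrains which age triple may match.
bound = {"?whom": "_:bob", "?ppd": "<bobFoaf.n3>"}
constrained = match_with_source("?ppd", ("?whom", "foaf:age", "?age"), quads, bound)
ages = [sb["?age"] for sb in constrained]

# An unbound source variable is bound to each known source in turn.
unbound = match_with_source("?src", ("_:bob", "foaf:age", "?age"), quads, {})
print(ages, sorted(sb["?src"] for sb in unbound))
```

With the source bound to Bob's own document, only the age 35 is found; Alice's statement about Bob's age comes from a different source and is skipped.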
Any variable that is bound to NULL must not match another variable that is bound to NULL. Thus,
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?given ?family WHERE SOURCE ?ppd ( ?whom foaf:given ?given ) SOURCE ?ppd ( ?whom foaf:family ?family )
will match only if the sources of both triples are known and the same.
Note: It is possible that future work will standardize the expression of source in an RDF graph. Until then, the semantics of a SPARQL SOURCE constant are not defined in terms of the RDF semantics.
Section status: Create this after preceding sections moderately stable.
Will be a brief summary of the terms defined above to bring them together.
Section status: Initial Draft.
SPARQL has a number of query forms for returning results. These result forms use the bindings in the query results to form result sets or RDF graphs. A result set is a serialization of the bindings in a query result. The query forms are:
- SELECT
- Returns all, or a subset of, the variables bound in a query pattern match. Formats for the result set can be in XML or RDF/XML (see the result format document)
- CONSTRUCT
- Returns either an RDF graph that provides matches for all the query results or an RDF graph constructed by substituting variables in a set of triple patterns.
- DESCRIBE
- Returns an RDF graph that describes the resources found.
- ASK
- Returns whether a query pattern matches or not.
The SELECT form of results returns the variables directly. The syntax SELECT * is shorthand for select all the variables.
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name WHERE ( ?x foaf:name ?name )
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT * WHERE ( ?x foaf:name ?name )
Results can be thought of as a table, with one row per query solution. Some cells may be empty because a variable is not bound in that particular solution.
The result set can be modified by adding the DISTINCT keyword which ensures that every combination of variable bindings (i.e. each result) in a result set is unique. Thought of as a table, each row is different.
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT DISTINCT ?name WHERE ( ?x foaf:name ?name )
The LIMIT form puts an upper bound on the number of solutions returned. A query may return a number of results up to and including the limit.
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name WHERE ( ?x foaf:name ?name ) LIMIT 20
Limits on the number of results can also be applied via the SPARQL query protocol [@@ protocol document not yet published @@].
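Thinking of the result set as a table, DISTINCT and LIMIT can be sketched as simple operations on a list of rows. The Python below assumes each row is a dict of bindings; distinct and limit are hypothetical helper names, and the order in which a processor returns rows is not addressed.

```python
def distinct(rows):
    """Drop rows whose combination of variable bindings was already seen."""
    seen = set()
    unique = []
    for row in rows:
        key = frozenset(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def limit(rows, n):
    """Return at most n rows."""
    return rows[:n]

# Two different people may share a name, so ?name can repeat.
rows = [{"?name": "Alice"}, {"?name": "Bob"}, {"?name": "Alice"}]

unique_rows = distinct(rows)
first_row = limit(unique_rows, 1)
print(unique_rows, first_row)
```

A limit larger than the number of solutions simply returns all of them, matching "up to and including the limit".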
The CONSTRUCT form returns an RDF graph specified by either a graph template or by "*". If a graph template is supplied, then the RDF graph is formed by taking each query solution and substituting the variables into the graph template and merging the triples into a single RDF graph:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> CONSTRUCT ( ?x foaf:name ?name ) WHERE ( ?x vcard:FN ?name )
The CONSTRUCT form returns a single RDF graph formed as the union of the graph template with variable values obtained from each query result. Explicit variable bindings are not returned.
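Substituting query solutions into a graph template can be sketched in a few lines of Python. The sketch assumes string terms and dict bindings, and it skips template triples that still contain an unbound variable; that last choice is an assumption of the sketch, since this draft does not say what happens in that case.

```python
def construct(template, solutions):
    """Build one graph by substituting each solution into the template."""
    graph = set()
    for sb in solutions:
        for pattern in template:
            triple = tuple(sb.get(term, term) for term in pattern)
            # Assumption: a triple with a leftover variable is not emitted.
            if not any(term.startswith("?") for term in triple):
                graph.add(triple)
    return graph

template = [("?x", "foaf:name", "?name")]
solutions = [
    {"?x": "_:a", "?name": '"Alice"'},
    {"?x": "_:b", "?name": '"Bob"'},
    {"?x": "_:c"},  # ?name unbound in this solution
]

result_graph = construct(template, solutions)
print(sorted(result_graph))
```

Because the result is a set of triples, identical triples produced by different solutions appear only once in the returned graph.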
The form CONSTRUCT * returns a subgraph that has all the triples that matched the query. It will give all the same bindings if the query is executed on the subgraph.
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> CONSTRUCT * WHERE ( ?x vcard:FN ?name )
Section status: placeholder text - not integrated
See also Principles of Boundaries in the Semantic Web or Concise Bounded Descriptions or BrownSauce.
See also LSID getMetaData()
The DESCRIBE form returns RDF data associated with a resource. The resource can be a query variable or it can be a URI. The RDF returned is the choice of the implementation; it should be the useful information the server has (within security matters outside of SPARQL) about a resource. It may include information about other resources: the RDF data for a book may also include details of the author.
The result is a single RDF graph.
A simple query such as
DESCRIBE ?x WHERE (?x ent:employeeId "1234")
might return a description of the employee and some other potentially useful details:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .
@prefix myOrg: <http://myorg.example/employees#> .
_:a myOrg:employeeId "1234" ;
    foaf:mbox_sha1sum "ABCD1234" ;
    vcard:N [ vcard:Family "Smith" ; vcard:Given "John" ] .
foaf:mbox_sha1sum rdf:type owl:InverseFunctionalProperty .
which includes the bNode closure for the vcard vocabulary vcard:N and vcard:ORG triples. For a vocabulary such as FOAF, where the data is typically all bNodes, it would be appropriate to return sufficient information to identify a node, such as the InverseFunctionalProperty foaf:mbox_sha1sum, as well as information such as the name and other recorded details. In the example, the match to the WHERE clause was returned but this is not required.
In the returned graph there is information about one of the properties that the query server has deemed to be relevant and helpful in further processing.
DESCRIBE ?x, ?y WHERE (?x ns:marriedTo ?y)
When there are multiple resources found, the RDF data for each is merged into the result graph.
If the application already has the URI for the resource, it can be provided directly.
DESCRIBE <http://example.org/>
Possible graphs to note as reasonable returns:
bNode closure
LSID getMetaData() ??
Applications can use the ASK form to test whether or not a query pattern has a solution. No information is returned about the possible query solutions, just whether the server can find one or not.
PREFIX foaf: <http://xmlns.com/foaf/0.1/> ASK (?x foaf:mbox_sha1sum "ABCD1234")
This query tests whether the knowledge base has any FOAF information with the given property and value.
Section status: placeholder text - not integrated
SPARQL defines a number of test operations on the RDF values in a query. These operations are chosen from the XQuery and XPath Functions and Operators.
Evaluation rules:
The SPARQL language provides a subset of the operations on plain literals, XSD integers and XSD floats defined in XQuery and XPath Functions and Operators.
Not limited to integers and floats: may have other types such as dates.
XQuery and XPath Functions and Operators
XMLSchema
Best Practices Task Force on XML Schema Datatypes
Section status: placeholder text - not integrated
Implementations may provide custom extended value testing operations, for example, for specialised datatypes. These are provided by functions in the query that return true or false for their arguments.
&qname(?var or constant, ?var or constant , ...)
Example:
SELECT ?x WHERE (?x ns:location ?loc) AND &func:test(?loc, 20)
A function can test some condition of bound and unbound variables or constants. The function is called for each possible query result (or the equivalent effect if optimized in some way). A function is named by URIRef in a qname form, and returns a boolean value. "true" means accept; "false" means reject this result.
If a query processor encounters a function that it does not provide, the query is not executed and an error is returned.
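The behaviour described above can be sketched in Python. Everything named here is an assumption made for illustration: the registry dictionary, the UnknownFunctionError class, the apply_value_tests helper, and in particular the meaning given to func:test (location not greater than a limit), which this document does not define. The sketch rejects the query before evaluation if any function is not provided, then calls each function once per candidate result and keeps only results for which every function returns true. Unbound variables are passed as None.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Solution = Dict[str, Any]
Call = Tuple[str, Sequence[Any]]  # (function qname, arguments)


class UnknownFunctionError(Exception):
    """Raised when a query uses a function the processor does not provide."""


def apply_value_tests(
    solutions: List[Solution],
    calls: List[Call],
    registry: Dict[str, Callable[..., bool]],
) -> List[Solution]:
    # Check every function first: the query is not executed if one is missing.
    for name, _ in calls:
        if name not in registry:
            raise UnknownFunctionError(name)
    kept: List[Solution] = []
    for solution in solutions:
        accepted = True
        for name, args in calls:
            # Variables are looked up in the solution; unbound ones become None.
            values = [
                solution.get(a) if isinstance(a, str) and a.startswith("?") else a
                for a in args
            ]
            if not registry[name](*values):
                accepted = False
                break
        if accepted:
            kept.append(solution)
    return kept


# Hypothetical semantics for func:test, invented for this example only.
REGISTRY = {"func:test": lambda loc, limit: loc is not None and loc <= limit}

SOLUTIONS = [
    {"?x": "ex:a", "?loc": 10},
    {"?x": "ex:b", "?loc": 30},
    {"?x": "ex:c"},  # ?loc is unbound in this candidate
]

kept = apply_value_tests(SOLUTIONS, [("func:test", ["?loc", 20])], REGISTRY)
```

Checking the registry up front mirrors the rule that an unknown function prevents execution altogether, rather than failing partway through the results.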
Section status: drafted – terminal syntax not checked against that of the XML 1.1 spec
This grammar defines the allowable constructs in a SPARQL query. The EBNF format is the same as that used in the XML 1.1 [14] specification. Please see the "Notation" section of that specification for specific information about the notation.
References to lexical tokens are enclosed in <>. Whitespace is skipped.
Notes: The term "literal" refers to a constant value, and not only an RDF Literal.
The grammar starts with the Query production.
Productions:
[1] Query ::= PrefixDecl* ReportFormat PrefixDecl* FromClause? WhereClause?
[2] ReportFormat ::= 'select' 'distinct'? <VAR> ( CommaOpt <VAR> )*
[3] FromClause ::= 'from' FromSelector ( CommaOpt FromSelector )*
[4] FromSelector ::= URI
[5] WhereClause ::= 'where' GraphPattern
[6] SourceGraphPattern ::= 'source' '*' GraphPattern1
[7] OptionalGraphPattern ::= 'optional' GraphPattern1
[8] GraphPattern ::= PatternElement PatternElement*
[9] PatternElement ::= TriplePatternList
[10] GraphPattern1 ::= PatternElement1
[11] PatternElement1 ::= SingleTriplePatternOrGroup
[12] PatternElementForms ::= SourceGraphPattern
[13] SingleTriplePatternOrGroup ::= TriplePattern
[14] ExplicitGroup ::= '(' GraphPattern ')'
[15] TriplePatternList ::= TriplePattern TriplePattern*
[16] TriplePattern ::= '(' VarOrURI VarOrURI VarOrLiteral ')'
[17] VarOrURI ::= <VAR>
[18] VarOrLiteral ::= <VAR>
[19] PrefixDecl ::= 'prefix' <NCNAME> ':' QuotedURI
[20] Expression ::= ConditionalOrExpression
[21] ConditionalOrExpression ::= ConditionalAndExpression ( '||' ConditionalAndExpression )*
[22] ConditionalAndExpression ::= ValueLogical ( '&&' ValueLogical )*
[23] ValueLogical ::= StringEqualityExpression
[24] StringEqualityExpression ::= EqualityExpression StringComparitor*
[25] StringComparitor ::= 'eq' EqualityExpression
[26] EqualityExpression ::= RelationalExpression RelationalComparitor?
[27] RelationalComparitor ::= '==' RelationalExpression
[28] RelationalExpression ::= AdditiveExpression NumericComparitor?
[29] NumericComparitor ::= '<' AdditiveExpression
[30] AdditiveExpression ::= MultiplicativeExpression AdditiveOperation*
[31] AdditiveOperation ::= '+' MultiplicativeExpression
[32] MultiplicativeExpression ::= UnaryExpression MultiplicativeOperation*
[33] MultiplicativeOperation ::= '*' UnaryExpression
[34] UnaryExpression ::= UnaryExpressionNotPlusMinus
[35] UnaryExpressionNotPlusMinus ::= ( '~' | '!' ) UnaryExpression
[36] PrimaryExpression ::= <VAR>
[37] FunctionCall ::= '&' <QNAME> '(' ArgList? ')'
[38] ArgList ::= VarOrLiteral ( ',' VarOrLiteral )*
[39] Literal ::= URI
[40] NumericLiteral ::= <INTEGER_LITERAL>
[41] TextLiteral ::= String <LANG>? ( '^^' URI )?
[42] String ::= <STRING_LITERAL1>
[43] URI ::= QuotedURI
[44] QName ::= <QNAME>
[45] QuotedURI ::= <URI>
[46] CommaOpt ::= ','?
Terminals: These terminals are further factored for readability.
[47] <URI> ::= "<" <NCCHAR1> (~[">"," "])* ">"
[48] <QNAME> ::= (<NCNAME>)? ":" <NCNAME>
[49] <VAR> ::= "?" <NCNAME>
[50] <LANG> ::= '@' <A2Z><A2Z> ("-" <A2Z><A2Z>)?
[51] <A2Z> ::= ["a"-"z","A"-"Z"]
[52] <INTEGER_LITERAL> ::= (["+","-"])? <DECIMAL_LITERAL> (["l","L"])?
[53] <DECIMAL_LITERAL> ::= <DIGITS>
[54] <HEX_LITERAL> ::= "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+
[55] <FLOATING_POINT_LITERAL> ::= (["+","-"])? (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)?
[56] <EXPONENT> ::= ["e","E"] (["+","-"])? (["0"-"9"])+
[57] <STRING_LITERAL1> ::= "'" ( (~["'","\\","\n","\r"]) | ("\\" ~["\n","\r"]) )* "'"
[58] <STRING_LITERAL2> ::= "\"" ( (~["\"","\\","\n","\r"]) | ("\\" ~["\n","\r"]) )* "\""
[59] <DIGITS> ::= (["0"-"9"])
[60] <PATTERN_LITERAL> ::= [m]/pattern/[i][m][s][x]
[61] <NCCHAR1> ::= ["A"-"Z"]
[62] <NCNAME> ::= <NCCHAR1> (<NCCHAR1> | "." | "-" | ["0"-"9"] | "\u00B7" )*
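As an informal reading aid, three of the terminals above can be transcribed into Python regular expressions. This is a sketch only: the dictionary TERMINALS and the helper is_terminal are hypothetical names, and the patterns are a direct transcription of productions [50], [54] and [55] as they appear here, not a validated lexer for the language.

```python
import re

# Direct transcriptions of the terminal definitions, one regex per terminal.
TERMINALS = {
    # [50] '@' <A2Z><A2Z> ("-" <A2Z><A2Z>)?
    "LANG": re.compile(r"@[a-zA-Z]{2}(?:-[a-zA-Z]{2})?"),
    # [54] "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+
    "HEX_LITERAL": re.compile(r"0[xX][0-9a-fA-F]+"),
    # [55] sign? digits+ "." digits* exponent?
    "FLOATING_POINT_LITERAL": re.compile(
        r"[+-]?[0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?"
    ),
}


def is_terminal(name: str, text: str) -> bool:
    """True if the whole of `text` matches the named terminal."""
    return TERMINALS[name].fullmatch(text) is not None
```

For instance, is_terminal("LANG", "@en-GB") holds, while is_terminal("FLOATING_POINT_LITERAL", ".5") does not, because production [55] requires at least one digit before the decimal point.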
Section status: misc
References
[1] "Three Implementations of SquishQL, a Simple RDF Query Language", Libby Miller, Andy Seaborne, Alberto Reggiori; ISWC2002
[2] "RDF Query and Rules: A Framework and Survey", Eric Prud'hommeaux
[3] "RDF Query and Rule languages Use Cases and Example", Alberto Reggiori, Andy Seaborne
[4] "RDQL Tutorial for Jena" (in the Jena tutorial).
[6] "Enabling Inference", R.V. Guha, Ora Lassila, Eric Miller, Dan Brickley
[8] "RDF", http://www.w3.org/RDF/
[9] "Representing vCard Objects in RDF/XML", Renato Iannella, W3C Note.
[10] "RDF Data Access Working Group"
[11] "RDF Data Access Use Cases and Requirements - W3C Working Draft 2 June 2004", Kendall Grant Clark.
[12] "Resource Description Framework (RDF): Concepts and Abstract Syntax", Graham Klyne, Jeremy J. Carroll, W3C Recommendation.
[13] "RFC 2396", T. Berners-Lee, R. Fielding, L. Masinter, Internet Draft.
[14] "Namespaces in XML 1.1", Tim Bray et al., W3C Recommendation.
[15] "Turtle - Terse RDF Triple Language", Dave Beckett.
CVS Change Log:
$Log: Overview.html,v $ Revision 1.12 2018/10/09 13:30:00 denis fix validation of xhtml documents Revision 1.11 2017/10/02 10:29:26 denis add fixup.js to old specs Revision 1.10 2004/10/12 20:41:32 eric - fixed latest version link Revision 1.9 2004/10/12 19:37:51 connolly oops... not trade, but reg Revision 1.8 2004/10/12 19:33:13 eric - fixed copyright Revision 1.7 2004/10/12 18:56:57 eric - spelling Revision 1.6 2004/10/12 18:55:59 eric - reflect impelementation experience in the SOTD Revision 1.5 2004/10/12 18:43:03 connolly 2 kinds of issues Revision 1.4 2004/10/12 18:40:37 connolly working on STOD Revision 1.3 2004/10/12 18:29:43 eric - remove "this is a live document" - add RCS-Id Revision 1.2 2004/10/12 14:40:24 matthieu Fixed anchor error references1 => references Revision 1.1 2004/10/12 14:21:44 matthieu Created Revision 1.115 2004/10/12 09:46:58 eric - CSS validated - removed links to old text Revision 1.114 2004/10/12 09:39:36 eric validating pubrules compliance and CSS Revision 1.113 2004/10/12 09:28:11 eric : commit to check pubrules - switch to publication headers (this version, ...) Revision 1.112 2004/10/11 12:17:59 aseaborne Changes in respect SimonR comments (part III) http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0092.html Revision 1.111 2004/10/11 11:06:29 aseaborne Fixed CVS Date field Revision 1.110 2004/10/11 11:05:12 aseaborne Fixed entities. Revision 1.109 2004/10/11 10:55:59 aseaborne Changes based on Kevin's comments: http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0085.html Changes recorded in the archive. Revision 1.107 2004/10/11 08:39:13 aseaborne Changes based on SimonR's comments (partII) http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0083.html Reversed: bNode put back in list of RDF term in 2-intro Revision 1.106 2004/10/10 13:05:54 eric address more of SimonR's comments. Revision 1.105 2004/10/10 12:24:17 eric address more of PatH's comments. 
Revision 1.104 2004/10/08 16:36:45 aseaborne Changes based on first set of SimonR's comments: http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0075.html Revision 1.102 2004/10/08 16:00:43 aseaborne Updates based on PatH's comments. http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0054.html Revision 1.101 2004/10/08 15:30:42 aseaborne Minor changes Revision 1.100 2004/10/08 15:24:02 aseaborne + Various small typographical changes. + Remove any implied fixed connection between FROM URIs and the graph (sec 8). Indeed, a query processor can ignore FROM if it so chooses. + Tidied DESCRIBE Revision 1.99 2004/10/08 14:53:45 aseaborne + Various small typographical changes. + Added "Pattern Solution" - a set of bindings with unique variables that satisfies some graph pattern Revision 1.97 2004/10/07 22:14:02 eric addressed howard's publication issues Revision 1.96 2004/10/07 15:20:32 aseaborne Fix <br> Revision 1.95 2004/10/07 15:11:54 aseaborne spell check with ,spell Revision 1.94 2004/10/07 15:08:27 aseaborne spell check with ,spell Revision 1.93 2004/10/07 15:00:00 aseaborne Changes arising from DaveB II comments + T => RDF-T in defn of query variable + Typos fixed as pointed out. Revision 1.92 2004/10/07 14:45:41 aseaborne + Removed defintion of nesting Need to talk about combination of graph patterns here. + Removed defn of target graph. Revision 1.91 2004/10/07 13:54:03 aseaborne + Changes example of matches to show data fragment (sec 4) Avoids potential issues of patterns matching patterns + Query Solution and Query Results defintions tidied up. Only the plural "Query Results" is defined. + Value Constraints defintion tidied + Query Stage becomes Query Block + Removed red issues in sec 4 (Including Optional Values) as the issue is in the issues list. + Commented out the Document Outline. 
+ Have example with OPTIONAL + Sec 4.3 (defn of optional match) - rewritten Revision 1.90 2004/10/06 15:45:49 aseaborne Fixed some <br>'s Revision 1.89 2004/10/06 15:34:47 aseaborne + Defn of Graph Pattern simplified. + Reworked matching definitions. + Defn of Graph Pattern Matching simplified. Revision 1.88 2004/10/06 14:49:09 aseaborne + Section: Binding: remove use of "value" - can confuse. + Replaced subscipts on RDF.. with RDF-B etc. + removed "substitution" as a definition. Text merged into triple pattern matching. Use subst() + Went back and lowerceased var() and val() Revision 1.86 2004/10/06 13:21:32 aseaborne Fixed <a /> tags which break some editors Revision 1.85 2004/10/06 10:34:54 aseaborne CVS comment with </div>s broke defns.xsl\! Revision 1.84 2004/10/06 10:32:04 aseaborne Excess trailing </div>s broke defns.xsl Revision 1.83 2004/10/06 08:21:06 eric finishing first pass through DaveB's comments Revision 1.82 2004/10/05 06:44:06 eric working on dave's comments Revision 1.81 2004/10/04 23:32:42 eric - improved title Revision 1.80 2004/10/04 15:38:57 aseaborne + Removed TOC 4.4 as there isn't such a section + Sec 8: Noted we need to revisit the text on what a query targets. 
+ 11.3 Fix syntax of result data Revision 1.79 2004/10/04 13:07:10 eric addressed some of daves comments Revision 1.78 2004/10/04 12:28:30 eric - s/class="underline"/class="definedTerm"/ - added <div class="exampleOuter exampleInner"></div> around examples - added but disabled style for examples - removed commented section 9 Revision 1.77 2004/10/03 13:06:28 eric - further simplified grammar - added *lots* of style to the grammar Revision 1.76 2004/10/03 01:15:44 eric address typographic issues raised by steveH 02102004T22:14:56+0100 Revision 1.75 2004/10/03 01:01:39 eric - simplified grammar - up-cased keywords in grammar Revision 1.74 2004/10/02 07:22:35 eric - changed grammar1 anchor to grammar - changed start production Revision 1.73 2004/10/01 16:17:47 eric fixed a couple syntax probs in the grammar file Revision 1.72 2004/10/01 16:11:28 eric embedded tokens in BNF and comment error messages Revision 1.71 2004/10/01 10:35:30 eric snapshot of grammar work Revision 1.70 2004/09/30 17:41:49 aseaborne + removed editor notes that no longer are needed + removed old style result sets - leaving table-style Revision 1.69 2004/09/30 13:38:42 aseaborne + removed issue slist - linked to issues doc. + Added informative links to XQuery/XPath Functions&Operators and XPath2.0 docs + Noted SWBPWG taskforce on xsd datatypes. + Noted bNodes in CONSTRUCT Revision 1.68 2004/09/30 03:54:04 eric validated Revision 1.67 2004/09/29 15:51:47 aseaborne + Some text in sec 12 - Testing Values - just a hint of where we are going + Some intro text. 
+ Fixed document outline to at least be not complete incorrect Revision 1.66 2004/09/29 14:30:27 aseaborne + Wrote first pass of Query Forms (sec 11) using material already there Revision 1.64 2004/09/29 08:27:23 eric first pass at grammar Revision 1.63 2004/09/29 00:20:14 eric struck duplicate id="TriplePatterns" Revision 1.62 2004/09/28 23:59:40 eric html-validated Revision 1.61 2004/09/28 16:47:49 aseaborne + Name is now SPARQL - no mention of BRQL + Added Kendall's abstract text. + Fixed a comment in sec 9 that ended too early. Revision 1.60 2004/09/27 14:44:06 aseaborne Continued changes due to Bristol Face-to-face meeting: + Bug-fixed definitions + Fixec orrupted ndashes (became \266 - reason unknown) Revision 1.59 2004/09/26 08:13:12 eric drafted new SOURCE section. should be superset of two old SOURCE sections. Revision 1.58 2004/09/25 14:28:06 eric some intro words to make it clear we are operating outside the RDF model when implementing SOURCE Revision 1.57 2004/09/25 10:09:12 eric dtd-valid Revision 1.56 2004/09/24 13:08:26 aseaborne Changes from the Bristol Face-to-face meeting: + Change examples to (triple) syntax + Move text for disjunction to OtherText.html + Move text for unsaid to OtherText.html