Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
SPARQL is a query language and protocol for RDF. This document specifies the SPARQL Protocol for RDF; it uses WSDL 2.0 to describe a means for conveying SPARQL queries to a SPARQL query processing service and returning the query results to the entity that requested them. This protocol was developed by the W3C RDF Data Access Working Group (DAWG), part of the Semantic Web Activity as described in the activity statement.
This is a Last Call Working Draft. The first release of this document was 14 January 2005 and the RDF Data Access Working Group (part of the Semantic Web Activity) has made its best effort to address comments received since then, releasing several drafts and resolving a list of issues meanwhile. This document defines HTTP and SOAP interfaces for RDF queries. Several existing RDF query services offer such interfaces. The Working Group would appreciate feedback from users and custodians of these services. The interfaces are described with WSDL 2.0. Review and advice from users of WSDL is also encouraged.
Comments on this document are due 14 October 2005; please send them to public-rdf-dawg-comments@w3.org, a mailing list with a public archive.
A change log shows the differences between this document and the previous version.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing [and excluding] a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.
Per section 4 of the W3C Patent Policy, Working Group participants have 150 days from the title page date of this document to exclude essential claims from the W3C RF licensing requirements with respect to this document series. Exclusions are with respect to the exclusion reference document, defined by the W3C Patent Policy to be the latest version of a document in this series that is published no later than 90 days after the title page date of this document.
This document describes the SPARQL Protocol for RDF (hereinafter, SPARQL Protocol), a means of conveying SPARQL queries from query clients to query processors. SPARQL Protocol has been designed for compatibility with the SPARQL Query Language for RDF [SPARQL]. SPARQL Protocol is described in two ways: first, as an interface independent of any concrete realization, implementation, or binding to another protocol; second, as HTTP and SOAP bindings of this interface.
When this document uses the words must, must not, should, should not, may and recommended, and the words appear as emphasized text, they must be interpreted as described in RFC 2119 [RFC2119].
SPARQL Protocol contains one interface, SparqlQuery, which in turn contains one operation, query. SPARQL Protocol is described abstractly with WSDL 2.0 [WSDL2] in terms of a web service that implements its interface, types, faults, and operations, as well as by HTTP and SOAP bindings.
query operation
SparqlQuery is the protocol's only interface. It contains one operation, query, which is used to convey a SPARQL query string and, optionally, an RDF dataset description.
The query operation is described as an In-Out message exchange pattern [WSDL2-Adjuncts]. The constraints of an In-Out message exchange pattern are as follows:
This pattern consists of exactly two messages, in order, as follows:
A message:
indicated by a Message Label component whose {message label} is 'In' and {direction} is 'in'
received from some node N
A message:
indicated by a Message Label component whose {message label} is 'Out' and {direction} is 'out'
sent to node N
This pattern uses the rule 2.1.1 Fault Replaces Message.
This interface and its operation are described in the following WSDL 2.0 fragment (from sparql-protocol-query.wsdl, which contains the relevant namespace declarations):
<interface name="SparqlQuery" styleDefault="http://www.w3.org/2005/08/wsdl/style/iri">
  <!-- the faults -->
  <fault name="MalformedQuery" element="st:malformed-query"></fault>
  <fault name="QueryRequestRefused" element="st:query-request-refused"></fault>
  <operation name="query" pattern="http://www.w3.org/2005/08/wsdl/in-out" wsdlx:safe="true">
    <documentation>The operation is used to convey queries and their results from clients to services and back again.</documentation>
    <input messageLabel="In" element="st:query-request"/>
    <output messageLabel="Out" element="st:query-result"/>
    <outfault ref="tns:MalformedQuery" messageLabel="Out"/>
    <outfault ref="tns:QueryRequestRefused" messageLabel="Out"/>
  </operation>
</interface>
query In Message
Abstractly, the content of the In Message of SparqlQuery's query operation is an instance of an XML Schema complex type, called st:query-request in Figure 1.0, composed of two further parts: one SPARQL query string; and zero or one RDF dataset descriptions. The SPARQL query string, represented in the message schema by query, is defined by [SPARQL] as "a sequence of characters in the language defined by the [SPARQL] grammar, starting with the Query production". The RDF dataset description is composed of zero or one default RDF graphs (composed by the RDF merge of the RDF graphs identified by zero or more default-graph-uri types) and of zero or more named RDF graphs, identified by zero or more named-graph-uri types. These correspond to the FROM and FROM NAMED keywords in [SPARQL], respectively.
The RDF dataset may be specified either in a [SPARQL] query using FROM and FROM NAMED keywords; or it may be specified in the protocol described in this document; or it may be specified in both the query string and in the protocol.
In the case where both the query and the protocol specify an RDF dataset, but not the identical RDF dataset, the dataset specified in the protocol must be the RDF dataset consumed by SparqlQuery's query operation.
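The precedence rule above can be sketched in Python. This is only an illustration under stated assumptions: the Dataset class and the effective_dataset function are hypothetical names that do not appear in this specification, and returning None to mean "use the service-supplied dataset" is a choice made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    """Hypothetical RDF dataset description: default and named graph URIs."""
    default_graph_uris: List[str] = field(default_factory=list)
    named_graph_uris: List[str] = field(default_factory=list)

    def is_empty(self) -> bool:
        return not self.default_graph_uris and not self.named_graph_uris


def effective_dataset(query_dataset: Dataset,
                      protocol_dataset: Dataset) -> Optional[Dataset]:
    """Pick the dataset the query operation should consume."""
    # A dataset given in the protocol overrides FROM / FROM NAMED in the query.
    if not protocol_dataset.is_empty():
        return protocol_dataset
    # Otherwise the dataset named in the query string, if any, is used.
    if not query_dataset.is_empty():
        return query_dataset
    # Assumption: None signals that the service-supplied dataset applies.
    return None
```

With a query naming http://my.example/john via FROM NAMED and a protocol request naming http://my.example/bob via named-graph-uri, the function returns the protocol dataset, so only the latter graph is consulted.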
These types are defined in the following XML Schema fragment, from sparql-protocol-types.xsd:
<xs:element name="query-request">
  <xs:complexType>
    <xs:all>
      <xs:element minOccurs="1" maxOccurs="1" name="query" type="xs:string">
        <xs:annotation>
          <xs:documentation>query is an xs:string constrained by the language definition, http://www.w3.org/TR/rdf-sparql-query/#grammar, as "a sequence of characters in the language defined by the [SPARQL] grammar, starting with the Query production".</xs:documentation>
        </xs:annotation>
      </xs:element>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="default-graph-uri" type="xs:anyURI"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="named-graph-uri" type="xs:anyURI"/>
    </xs:all>
  </xs:complexType>
</xs:element>
Figure 1.1 XML Schema fragment
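query Out Message
Abstractly, the content of the Out Message of SparqlQuery's query operation is an instance of an XML Schema complex type, called query-result in Figure 1.2, composed of either a SPARQL results document in XML (the vbr:sparql element) or an RDF graph serialized as RDF/XML (the rdf:RDF element).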
The query-result type is defined in this W3C XML Schema fragment, from sparql-protocol-types.xsd:
<xs:element name="query-result">
  <xs:annotation>
    <xs:documentation>The type for serializing query results, either as XML or RDF/XML.</xs:documentation>
  </xs:annotation>
  <xs:complexType>
    <xs:choice>
      <xs:element maxOccurs="1" ref="vbr:sparql"/>
      <xs:element maxOccurs="1" ref="rdf:RDF"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
Figure 1.2 XML Schema fragment
query Fault Messages
[WSDL2-Adjuncts] defines several fault propagation rules which specify how operation faults and messages interact. The query operation described here employs the Fault Replaces Message rule:
Any message after the first in the pattern may be replaced with a fault message, which must have identical direction. The fault message must be delivered to the same target node as the message it replaces. If there is no path to this node, the fault must be discarded.
Thus, the query operation contained in the SparqlQuery interface may return, in place of the Out Message, either the MalformedQuery message or the QueryRequestRefused message, both of which are defined in this XML Schema fragment from sparql-protocol-types.xsd:
<xs:element type="xs:string" name="fault-details">
  <xs:annotation>
    <xs:documentation>This element contains human-readable information about the fault returned by the SPARQL query processing service.</xs:documentation>
  </xs:annotation>
</xs:element>
<xs:element name="malformed-query">
  <xs:complexType>
    <xs:all>
      <xs:element minOccurs="0" maxOccurs="1" ref="spt:fault-details"/>
    </xs:all>
  </xs:complexType>
</xs:element>
<xs:element name="query-request-refused">
  <xs:complexType>
    <xs:all>
      <xs:element minOccurs="0" maxOccurs="1" ref="spt:fault-details"/>
    </xs:all>
  </xs:complexType>
</xs:element>
</xs:schema>
When a SPARQL query string is not a legal sequence of characters in the language defined by the SPARQL grammar, the MalformedQuery fault message should be returned, and an HTTP 2xx status code must not be returned.
When the MalformedQuery fault message is returned, query processing services should include explanatory, debugging, or other additional information intended for human consumption via the fault-details type defined in Figure 1.3.
The QueryRequestRefused fault message must be returned when a client submits a request that the server is unable or unwilling to process, perhaps because of resource consumption or other policy considerations. The QueryRequestRefused fault message does not indicate whether the server may or may not process a subsequent, identical request or requests. Consult 3. Policy Considerations for further discussion.
When the QueryRequestRefused fault message is returned, query processing services should include explanatory, debugging, or other additional information intended for human consumption via the fault-details type defined in Figure 1.3.
The SparqlQuery interface operation query described thus far is an abstract operation; it requires protocol bindings to become an invocable operation. The next two sections of this document describe HTTP and SOAP bindings. A compliant SPARQL Protocol service must support the SparqlQuery interface; if a SPARQL Protocol service supports HTTP bindings, it must support the bindings as described in sparql-protocol-query.wsdl. A SPARQL Protocol service may support other interfaces. See 2.3 SOAP Bindings for more information.
[WSDL2-Adjuncts] defines a means of binding abstract interface operations to HTTP. The HTTP bindings for the query operation (from sparql-protocol-query.wsdl) are as follows:
<binding name="queryHttp" interface="tns:SparqlQuery" type="http://www.w3.org/2005/08/wsdl/http" whttp:version="1.1">
  <fault name="MalformedQuery" whttp:code="400"/>
  <fault name="QueryRequestRefused" whttp:code="500"/>
  <!-- the GET binding for query operation -->
  <operation ref="tns:query" whttp:method="GET" whttp:inputSerialization="application/x-www-form-urlencoded"/>
  <!-- the POST binding for query operation -->
  <operation ref="tns:query" whttp:method="POST" whttp:inputSerialization="application/x-www-form-urlencoded"/>
</binding>
The name of the HTTP binding is queryHttp, which is described as a binding of the SparqlQuery interface. The two faults described in SparqlQuery interface, MalformedQuery and QueryRequestRefused, are bound to HTTP status codes 400 Bad Request and 500 Internal Server Error, respectively [HTTP].
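The way a fault replaces the Out Message and is then bound to an HTTP status code can be sketched as follows. This is a hedged illustration, not an implementation defined by this document: the exception classes, the respond function and the handler callable are hypothetical, and only the two status codes are taken from the binding above.

```python
class SparqlFault(Exception):
    """Hypothetical base class for the faults of the SparqlQuery interface."""
    status = 500  # overridden by subclasses

    def __init__(self, details=""):
        super().__init__(details)
        self.details = details  # optional human-readable fault-details


class MalformedQuery(SparqlFault):
    status = 400  # bound to 400 Bad Request


class QueryRequestRefused(SparqlFault):
    status = 500  # bound to 500 Internal Server Error


def respond(handler, request):
    """Return (status, body) for one In-Out exchange.

    `handler` is an assumed callable that evaluates the request and
    either returns the Out Message body or raises a SparqlFault.
    """
    try:
        return 200, handler(request)
    except SparqlFault as fault:
        # Fault Replaces Message: the fault is sent instead of the Out Message.
        return fault.status, fault.details
```

Keeping the status code on the fault class means the HTTP layer never has to inspect the query itself to decide which code to send.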
The HTTP binding for the query operation contains two parts. The first uses [HTTP] GET with application/x-www-form-urlencoded serialization and UTF-8 encoding. The second uses [HTTP] POST with application/x-www-form-urlencoded serialization and UTF-8 encoding. The GET binding should be used except in cases where the URL-encoded query exceeds practicable limits, in which case the POST binding should be used.
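A client might choose between the two bindings roughly as sketched below. The function name build_request and the 2000-character threshold are assumptions made for illustration; this document does not define what a "practicable limit" is. Only the parameter names query, default-graph-uri and named-graph-uri and the serialization type come from the text.

```python
from urllib.parse import urlencode

# Assumed threshold; the specification leaves "practicable limits" undefined.
MAX_GET_URL_LENGTH = 2000


def build_request(endpoint, query, default_graph_uris=(), named_graph_uris=()):
    """Return (method, url, headers, body) for the query operation."""
    params = [("query", query)]
    params += [("default-graph-uri", uri) for uri in default_graph_uris]
    params += [("named-graph-uri", uri) for uri in named_graph_uris]
    # urlencode percent-encodes using UTF-8 by default.
    encoded = urlencode(params)

    url = endpoint + "?" + encoded
    if len(url) <= MAX_GET_URL_LENGTH:
        return "GET", url, {}, None

    # Fall back to POST, carrying the same parameters in the request body.
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return "POST", endpoint, headers, encoded.encode("ascii")
```

Repeating a parameter name, as done here for the graph URIs, is how several default or named graphs are conveyed in one request.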
(Note: The bindings shown here are not legal according to the latest draft of the WSDL 2.0 recommendation. The issues related to describing SPARQL Protocol for RDF with WSDL2 are summarized in this email message and in the thread that follows it. In particular, the whttp:outputSerialization attribute is required in WSDL 2.0, and is required to have a single Internet Media Type as its value. However, in the service design described herein, the query operation may return XML, RDF/XML or an equivalent graph serialization IMT. Second, WSDL2 does not allow whttp:inputSerialization to have the value "application/x-www-form-urlencoded" when the value of the binding style is "http://www.w3.org/2005/08/wsdl/style/iri". That is, roughly, WSDL2 does not allow us to describe the service design for query's POST binding we prefer, which is to POST application/x-www-form-urlencoded to the endpoint. Third, whttp:faultSerialization is also required and required to have as its value a single Internet Media Type. Similarly to the case with whttp:outputSerialization, DAWG has designed a service where the value of the fault serialization is implementation-dependent and cannot be represented by a single XML Schema type. DAWG acknowledges the risk inherent in describing its protocol in an illegal variant of WSDL 2.0.)
The following abstract HTTP trace examples illustrate invocation of the query operation under several different scenarios. These example traces are abstracted from complete HTTP traces in three ways: (1) in each example the string "EncodedQuery" represents the URL-encoded string equivalent of the SPARQL query given in the first block of each example; (2) only partial response bodies, containing the query results, are displayed; (3) the URI values of default-graph-uri and named-graph-uri are also not URL-encoded.
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?book ?who WHERE { ?book dc:creator ?who }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery HTTP/1.1
Host: my.example
User-agent: my-sparql-client/0.1
That query against the service-supplied RDF dataset, executed by that SPARQL query service, returns the following query result:
HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:12 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="book"/>
<variable name="who"/>
</head>
<results distinct="false" ordered="false">
<result>
<binding name="book"><uri>http://my.example/book/book5</uri></binding>
<binding name="who"><bnode>r29392923r2922</bnode></binding>
</result>
...
<result>
<binding name="book"><uri>http://my.example/book/book6</uri></binding>
<binding name="who"><bnode>r8484882r49593</bnode></binding>
</result>
</results>
</sparql>
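A client could read a SELECT response such as the one above with a standard XML parser. The following is a minimal sketch, assuming the results namespace shown in these traces; the function name parse_select_results and the (kind, value) tuple representation are hypothetical choices, not something this document prescribes.

```python
import xml.etree.ElementTree as ET

# Namespace of the results format as it appears in the traces.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def parse_select_results(xml_text):
    """Return (variable names, rows) from a SELECT results document."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the uri, bnode or literal child element
            kind = value.tag.split("}", 1)[1]  # strip the namespace part
            row[binding.get("name")] = (kind, value.text)
        rows.append(row)
    return variables, rows
```

Keeping the element kind alongside the text lets the caller distinguish a URI from a blank node label or a literal with the same characters.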
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?book ?who WHERE { ?book dc:creator ?who }
is conveyed to the SPARQL query service, http://my.other.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.other.example/books HTTP/1.1
Host: my.other.example
User-agent: my-sparql-client/0.1
That query, against the RDF dataset identified by the value of the default-graph-uri parameter, http://my.other.example/books, executed by that SPARQL query service, returns the following query result:
HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:12 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="book"/>
<variable name="who"/>
</head>
...
<results distinct="false" ordered="false">
<result>
<binding name="book"><uri>http://my.example/book/book2</uri></binding>
<binding name="who"><bnode>r1115396427r1133</bnode></binding>
</result>
<result>
<binding name="book"><uri>http://my.example/book/book3</uri></binding>
<binding name="who"><bnode>r1115396427r1133</bnode></binding>
</result>
<result>
<binding name="book"><uri>http://my.example/book/book1</uri></binding>
<binding name="who"><literal>J.K. Rowling</literal></binding>
</result>
</results>
</sparql>
This SPARQL query
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX myfoaf: <http://my.example/jose/foaf.rdf#> CONSTRUCT { myfoaf:jose foaf:depiction <http://my.example/jose/jose.jpg>. myfoaf:jose foaf:schoolHomepage <http://www.edu.example/>. ?s ?p ?o.} WHERE { ?s ?p ?o. myfoaf:jose foaf:nick "Little Jo". FILTER ( ! (?s = myfoaf:kendall && ?p = foaf:knows && ?o = myfoaf:edd ) && ! ( ?s = myfoaf:julia && ?p = foaf:mbox && ?o = <mailto:julia@mail.example> ) && ! ( ?s = myfoaf:julia && ?p = rdf:type && ?o = foaf:Person)) }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/jose-foaf.rdf HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
Accept: text/rdf+n3, application/rdf+xml
With the response illustrated here:
HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:11 GMT
Server: Apache/1.3.29 (Unix)
Connection: close
Content-Type: text/rdf+n3; charset=utf-8
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix myfoaf: <http://my.example/jose/foaf.rdf#>.
myfoaf:jose foaf:name "Jose Jimeñez";
foaf:depiction <http://my.example/jose/jose.jpg>;
foaf:nick "Little Jo";
...
foaf:schoolHomepage <http://www.edu.example/>;
foaf:workplaceHomepage <http://www.corp.example/>;
foaf:homepage <http://my.example/jose/>;
foaf:knows myfoaf:juan;
rdf:type foaf:Person.
myfoaf:juan foaf:mbox <mailto:juan@mail.example>;
rdf:type foaf:Person.
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> ASK WHERE { ?book dc:creator "J.K. Rowling"}
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/books HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response illustrated here:
HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head></head>
<boolean>true</boolean>
</sparql>
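An ASK response carries a single boolean element rather than a list of results. A minimal sketch of reading it, assuming the namespace shown in the trace and a hypothetical function name parse_ask_result:

```python
import xml.etree.ElementTree as ET

# Namespace of the results format as it appears in the traces.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def parse_ask_result(xml_text):
    """Return the truth value carried by an ASK results document."""
    root = ET.fromstring(xml_text)
    node = root.find("sr:boolean", NS)
    if node is None or node.text is None:
        raise ValueError("no boolean element in results document")
    return node.text.strip() == "true"
```

Raising an error when the element is absent avoids silently treating a SELECT-style document as a false answer.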
This SPARQL query
PREFIX books: <http://my.example/book/> DESCRIBE books:book6
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated here:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/books HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response illustrated here:
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/rdf+xml; charset=utf-8
<?xml version="1.0"?>
<rdf:RDF ...
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:books="http://my.example/book/"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://my.example/book/book6">
<dc:title>Example Book #6 </dc:title>
</rdf:Description>
</rdf:RDF>
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?who ?g ?mbox WHERE { ?g dc:publisher ?who . GRAPH ?g { ?x foaf:mbox ?mbox } }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated here (with line breaks for legibility):
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/publishers
&default-graph-uri=http://my.example/morepublishers&named-graph-uri=http://your.example/foaf-alice
&named-graph-uri=http://www.example/foaf-bob&named-graph-uri=http://www.example/foaf-susan
&named-graph-uri=http://this.example/john/foaf HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response illustrated here:
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="who"/>
<variable name="g"/>
<variable name="mbox"/>
</head>
...
<results ordered="false" distinct="false">
<result>
<binding name="who">
<literal>Alice</literal>
</binding>
<binding name="g">
<uri>http://your.example/foaf-alice</uri>
</binding>
<binding name="mbox">
<uri>mailto:alice@example.org</uri>
</binding>
</result>
<result>
<binding name="who">
<literal>Bob</literal>
</binding>
<binding name="g">
<uri>http://www.example/foaf-bob</uri>
</binding>
<binding name="mbox">
<uri>mailto:bob@work.example</uri>
</binding>
</result>
<result>
<binding name="who">
<literal>Susan</literal>
</binding>
<binding name="g">
<uri>http://www.example/foaf-susan</uri>
</binding>
<binding name="mbox">
<uri>mailto:susan@work.example</uri>
</binding>
</result>
<result>
<binding name="who">
<literal>John</literal>
</binding>
<binding name="g">
<uri>http://this.example/john/foaf</uri>
</binding>
<binding name="mbox">
<uri>mailto:john@home.example</uri>
</binding>
</result>
</results>
</sparql>
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?who ?g ?mbox FROM <http://my.example/publishers> FROM NAMED <http://my.example/alice> FROM NAMED <http://my.example/bob> WHERE { ?g dc:publisher ?who . GRAPH ?g { ?x foaf:mbox ?mbox } }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response illustrated here:
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
...
<head>
<variable name="who"/>
<variable name="g"/>
<variable name="mbox"/>
</head>
<results ordered="false" distinct="false">
<result>
<binding name="who">
<literal>Bob Hacker</literal>
</binding>
<binding name="g">
<uri>http://my.example/bob</uri>
</binding>
<binding name="mbox">
<uri>mailto:bob@oldcorp.example</uri>
</binding>
</result>
<result>
<binding name="who">
<literal>Alice Hacker</literal>
</binding>
<binding name="g">
<uri>http://my.example/alice</uri>
</binding>
<binding name="mbox">
<uri>mailto:alice@work.example</uri>
</binding>
</result>
</results>
</sparql>
This SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?who ?g ?mbox FROM <http://my.example/publishers> FROM NAMED <http://my.example/john> FROM NAMED <http://my.example/susan> WHERE { ?g dc:publisher ?who . GRAPH ?g { ?x foaf:mbox ?mbox } }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/morepublishers
&named-graph-uri=http://my.example/bob&named-graph-uri=http://my.example/alice HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response illustrated here:
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="who"/>
<variable name="g"/>
<variable name="mbox"/>
</head>
<results ordered="false" distinct="false">
<result>
<binding name="who">
<literal>Bob Hacker</literal>
</binding>
<binding name="g">
<uri>http://my.example/bob</uri>
</binding>
<binding name="mbox">
<uri>mailto:bob@oldcorp.example</uri>
</binding>
</result>
<result>
<binding name="who">
<literal>Alice Hacker</literal>
</binding>
<binding name="g">
<uri>http://my.example/alice</uri>
</binding>
<binding name="mbox">
<uri>mailto:alice@work.example</uri>
</binding>
</result>
</results>
</sparql>
This syntactically invalid SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name WHERE { ?x foaf:name ?name ORDER BY ?name }
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://my.example/morepublishers HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response (the MalformedQuery fault replacing the Out Message, as per 2.1 SparqlQuery) illustrated here:
HTTP/1.1 400 Bad Request
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: text/plain
4:syntax error, unexpected ORDER, expecting '}'
This SPARQL query
PREFIX bio: <http://bio.example/schema/#> SELECT ?valence FROM <http://another.example/protein-db.rdf> WHERE { ?x bio:protein ?valence } ORDER BY ?valence
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
GET /sparql/?query=EncodedQuery&default-graph-uri=http://another.example/protein-db.rdf HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
With the response (the QueryRequestRefused fault replacing the Out Message, as per 2.1 SparqlQuery) illustrated here:
HTTP/1.1 500 Internal Server Error
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: text/html
<html>
<head>
<title>SPARQL Processing Service: Query Request Refused</title>
</head>
<body>
<p> Query Request Refused: your request could not be processed because <code>http://another.example/protein-db.rdf</code> could not be retrieved within the time allotted.</p>
</body>
</html>
Some SPARQL queries, perhaps machine generated, may be longer than can be reliably conveyed by way of the HTTP GET binding described in 2.2 HTTP Bindings. In those cases the POST binding described in 2.2 may be used. This SPARQL query
PREFIX bio: <http://bio.example/schema/#> ... SELECT ?valence ... FROM <http://another.example/protein-db.rdf> WHERE { ?x bio:protein ?valence ... } ORDER BY ?valence ...
is conveyed to the SPARQL query service, http://my.example/sparql/, as illustrated in this HTTP trace:
POST /sparql/ HTTP/1.1
Host: my.example
User-agent: sparql-client/0.1
Content-Type: application/x-www-form-urlencoded
Content-Length: ...
query=EncodedQuery&default-graph-uri=http://another.example/protein-db.rdf
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2005 12:48:25 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml; charset=utf-8
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
...
</sparql>
[WSDL2-Adjuncts] defines a means of binding abstract interface operations to SOAP. The SOAP bindings for the query operation (from sparql-protocol-query.wsdl) are as follows:
<binding name="querySoap" interface="SparqlQuery" type="http://www.w3.org/2005/08/wsdl/soap" wsoap:version="1.2" wsoap:protocol="http://www.w3.org/2003/05/soap/bindings/HTTP">
  <fault ref="MalformedQuery" wsoap:code="soap:Sender"/>
  <fault ref="QueryRequestRefused" wsoap:code="soap:Sender"/>
  <operation ref="query" wsoap:mep="http://www.w3.org/2003/05/soap/mep/request-response"/>
</binding>
The name of the SOAP binding of SparqlQuery's query operation is querySoap; it is a SOAP binding because of the value of the type attribute, which is set to the URI identifying SOAP. The version of SOAP is 1.2. The underlying protocol used in this SOAP binding is HTTP, as determined by the URI value of the wsoap:protocol attribute. If a SPARQL Protocol service supports SOAP bindings with the value of the {http://www.w3.org/2005/08/wsdl/soap, protocol} attribute set to http://www.w3.org/2003/05/soap/bindings/HTTP, it must support the bindings as described in sparql-protocol-query.wsdl. SOAP bindings with wsoap:protocol values set to transmission protocols other than HTTP are not described in this document.
The two fault elements refer to the fault messages defined in the SparqlQuery interface.
Finally, the operation element references the query operation of the SparqlQuery interface which has been previously described in Figure 1.0 above. Since this SOAP binding describes the operation as using HTTP as the underlying transport protocol, the value of the wsoap:mep attribute determines which HTTP method is to be used. This operation is described as being implemented by a SOAP message exchange pattern http://www.w3.org/2003/05/soap/mep/request-response, which, according to [SOAP12] 7.4 Supported Features, is bound to an HTTP POST method.
POST /services/sparql-query HTTP/1.1
Content-Type: text/xml; charset=utf-8
Accept: application/soap+xml, application/dime, multipart/related, text/*
User-Agent: Axis/1.2.1
Host: my.example
SOAPAction: ""
Content-Length: 431
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<query-request xmlns="http://www.w3.org/2005/09/sparql-protocol-types/#">
<query xmlns="">SELECT ?z {?x ?y ?z . FILTER regex(?z, 'Harry')}</query>
</query-request>
</soapenv:Body>
</soapenv:Envelope>
HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
<?xml version="1.0" encoding="utf-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<query-result xmlns="http://www.w3.org/2005/09/sparql-protocol-types/#">
<ns1:sparql xmlns:ns1="http://www.w3.org/2005/sparql-results#">
<ns1:head>
<ns1:variable name="z"/>
</ns1:head>
<ns1:results distinct="false" ordered="false">
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Chamber of Secrets</ns1:literal>
</ns1:binding>
</ns1:result>
...
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Half-Blood Prince</ns1:literal>
</ns1:binding>
</ns1:result>
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Goblet of Fire</ns1:literal>
</ns1:binding>
</ns1:result>
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Philosopher's Stone</ns1:literal>
</ns1:binding>
</ns1:result>
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Order of the Phoenix</ns1:literal>
</ns1:binding>
</ns1:result>
<ns1:result>
<ns1:binding name="z">
<ns1:literal>Harry Potter and the Prisoner Of Azkaban</ns1:literal>
</ns1:binding>
</ns1:result>
</ns1:results>
</ns1:sparql>
</query-result>
</soapenv:Body>
</soapenv:Envelope>
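A SOAP request body for the query operation could be assembled as sketched below. This is an illustration under assumptions: the function name build_soap_request is hypothetical, the envelope and types namespaces are copied from the example trace, the wrapper element name query-request is taken from the input element of the interface description, and the child elements are left unqualified as in the trace.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from the example trace; treat them as assumptions.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
TYPES = "http://www.w3.org/2005/09/sparql-protocol-types/#"


def build_soap_request(query, default_graph_uris=(), named_graph_uris=()):
    """Serialize a query-request inside a SOAP envelope, as bytes."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    request = ET.SubElement(body, "{%s}query-request" % TYPES)
    # Child elements carry no namespace, matching xmlns="" in the trace.
    ET.SubElement(request, "query").text = query
    for uri in default_graph_uris:
        ET.SubElement(request, "default-graph-uri").text = uri
    for uri in named_graph_uris:
        ET.SubElement(request, "named-graph-uri").text = uri
    return ET.tostring(envelope, encoding="utf-8")
```

The returned bytes would be sent as the body of an HTTP POST, which is the method the request-response message exchange pattern is bound to.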
There are at least two possible sources of denial-of-service attacks against SPARQL query processing services. First, under-constrained queries can result in very large numbers of results, which may require equally large expenditures of computing resources to process, assemble, or return. Another possible source is queries containing very complex RDF dataset descriptions (complex because of resource size, the number of resources to be retrieved, or a combination of size and number), which the service may be unable to assemble without significant expenditure of resources, including bandwidth, CPU, or secondary storage. In some cases such expenditures may effectively constitute a denial-of-service attack. There may be other sources of denial-of-service attacks against SPARQL query processing services.
Further, since a SPARQL query processing service may make HTTP requests of other origin servers on behalf of its clients, it may be used as a vector of attacks against other sites or services. In this case, since the service is effectively acting as a proxy for a third-party client, it is important to avoid anonymizing client requests in a way that impedes valid forensic tracing. SPARQL query processing services should log client requests in such a way as to avoid anonymizing them with regard to third-party origin servers or services, and they should do so in keeping with the Privacy considerations discussed below.
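One way a service might keep such a trail is to write a log entry for every resource it retrieves on a client's behalf. The sketch below is illustrative only: the function name `audit_record`, the field names, and the `query_id` identifier are all assumptions, and the resulting log should itself be treated as confidential data.

```python
import json
from datetime import datetime, timezone


def audit_record(client_addr, query_id, fetched_uri, when=None):
    """Build one JSON log line tying an outbound fetch to the requesting client."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "time": when.isoformat(),
            "client": client_addr,       # who asked the service to act
            "query_id": query_id,        # hypothetical service-assigned identifier
            "fetched_uri": fetched_uri,  # third-party resource retrieved for the client
        },
        sort_keys=True,
    )


# Example values are placeholders (documentation address and example.org URI).
line = audit_record(
    "192.0.2.10",
    "q-0001",
    "http://example.org/data.rdf",
    when=datetime(2005, 9, 14, 12, 0, tzinfo=timezone.utc),
)
```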
SPARQL query processing services may choose to detect these and other costly, or otherwise unsafe, queries, impose time or memory limits on queries, or impose other restrictions to reduce the service's (and other services') vulnerability to denial-of-service attacks. They also may refuse to process such query requests.
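A minimal sketch of one such restriction follows, assuming the query engine exposes its solutions as an iterator. The names `limited_results` and `QueryRefused` are hypothetical; a real service would decide for itself how to report the refusal to the client, for example with a fault message.

```python
import time


class QueryRefused(Exception):
    """Raised when a query exceeds the service's configured limits."""


def limited_results(solutions, max_results, max_seconds, clock=time.monotonic):
    """Yield solutions until a result-count or wall-clock limit is exceeded."""
    deadline = clock() + max_seconds
    for count, solution in enumerate(solutions, start=1):
        if count > max_results:
            raise QueryRefused("result limit of %d exceeded" % max_results)
        if clock() > deadline:
            raise QueryRefused("time limit of %s seconds exceeded" % max_seconds)
        yield solution


# Usage: wrap whatever iterator the (hypothetical) engine returns.
capped = list(limited_results(iter(range(3)), max_results=5, max_seconds=60))
```

Because the clock is checked only between solutions, this approach cannot interrupt an evaluator that blocks before producing its next solution, and it does not bound memory use; those cases would need support from the engine or the operating system.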
SPARQL query processing services must take care that facts disclosed in or implied by query results do not violate applicable privacy, security or other policies. Conversely, it is good practice to consider query interfaces when gathering data and to publish a realistic privacy policy for the benefit of everyone who contributes data.
It is not practical to identify every kind of private or sensitive information. Addresses, credit card numbers, and health records, for example, are clearly sensitive; even the language constraints on literal queries can associate the client with a particular ethnic or language group. Implementation details, including query URLs and server logs, may contain information about clients' areas of interest. This information is confidential unless otherwise indicated in a privacy policy. Storage and distribution of this data may be constrained by law in some countries.
There is no a priori method of determining the sensitivity of any particular piece of information within the context of any given request. SPARQL query processing services should supply as much control over this information as possible to the provider of that information.
My thanks to members of DAWG, especially Bijan Parsia, Bryan Thompson, Andy Seaborne, Steve Harris, Eric Prud'hommeaux, Yoshio FUKUSHIGE, Howard Katz, Dirk-Willem van Gulik, Dan Connolly, and Lee Feigenbaum. Particular thanks are owed to Elias Torres for his generous assistance and support. Thanks as well to my UMD colleagues Jim Hendler, Ron Alford, Amy Alford, Yarden Katz, Chris Testa, and members of the Mindlab Semantic Web Undergraduate Social. Particular thanks are also owed to my UMD-NASA colleague and friend, Andy Schain. I also thank Jacek Kopecky, Morten Frederiksen, Mark Baker, Jan Algermissen, Danny Ayers, Bernd Simon, Graham Klyne, Arjohn Kampman, Tim Berners-Lee, Dan Brickley, and Patrick Stickler.
Changes since 27 May Working Draft:
Revision 1.71 2005/09/12 14:14:29 kclark
- changed compliance language to not restrict SOAP bindings with non-HTTP transport protocols
- added http example numbers
- finished SOAP bindings description
- renamed query outer element to query-request

Revision 1.70 2005/09/09 19:41:45 etorres
- updated divs and anchors for all examples to match proto-test cases

Revision 1.69 2005/09/08 20:23:35 etorres
- added closing div tag to toc section

Revision 1.68 2005/09/06 14:50:37 kclark
- fix bug in compliance language

Revision 1.67 2005/09/05 17:08:18 kclark
- start of soap bindings description, which is unfinished

Revision 1.66 2005/09/05 17:01:48 kclark
- fixing some HTML bugs and spelling errors

Revision 1.65 2005/09/05 16:59:21 kclark
- removing spurious cvs merge conflict marker

Revision 1.64 2005/09/05 16:57:42 kclark
- changed TOC
- removed Accept: from all examples but the con-neg example
- changed sparql-query to query (In Message type name)
- noted risk with WSDL2
- removed output and fault serialization IMTs from HTTP bindings
- finished description of HTTP bindings
- lots of editorial tweaks from EricP
- added normative reference to RDF concepts

Revision 1.63 2005/08/19 19:42:37 connolly
change entity refs to numeric char refs so that we can check the spec without reading the DTD

Revision 1.62 2005/08/19 14:09:35 connolly
wf fixes

Revision 1.61 2005/08/15 19:07:05 kclark
- fixing missing <

Revision 1.60 2005/08/15 19:03:18 kclark
- WSDL, XSD, and spec excerpts in synch; discussion of excerpts needs to be updated yet
- new paragraph in security section based on steve h's feedback/review

Revision 1.59 2005/08/11 17:27:19 kclark
- abstracting HTTP traces
- added <div> containers around traces
- added new trace for POST binding
- tweaked several examples

Revision 1.58 2005/08/09 18:06:15 kclark
- changed 'may' to 'must' for malformed query fault, as a result of WG decision
- added some additional uris to consult re: security
- tweaked security language to be more explicit about retrieving arbitrary numbers of web resources based on user input

Revision 1.57 2005/08/08 20:55:08 kclark
- many tweaks resulting from Andy Seaborne's review, including: s/RDF dataset/RDF dataset description/ s/XML type/instance of an XML type/ (did this by hand, not s&r) killed the bit about "equivalent serialization"
- added new examples from Elias Torres
- added examples for fault returns
- changed CONSTRUCT example: complex FILTER, con-neg
- changed may to must for query req refused fault
- changed must to may for malformed query fault

Revision 1.56 2005/08/03 20:27:23 kclark
- reorganized some sections
- added sample SOAP trace (which isn't really valid yet, just a place holder)
- tweaked policy language very slightly

Revision 1.55 2005/07/29 14:09:39 kclark
- general readability edits
- changed should to may for fault message fault-details
- added anchors for dataset sections

Revision 1.54 2005/07/28 15:11:44 kclark
- tweaked policy considerations section for readability & concision

Revision 1.53 2005/07/27 19:42:14 kclark
- changed to new MIME type for query results
- added new http trace example
- added two new trace stubs
- added query request refused fault type
- expanded policy section with more about security
- synch'd rdf dataset with rq23
- added initial POST binding for query operation
- flattened structure of query type by inlining rdf-dataset, per AndyS
- using new "SPARQL Results Document" from rf1
- added semantics for malformed query fault, though I believe this may be incomplete as spec'd currently

Revision 1.52 2005/07/27 18:12:33 kclark
- synching rdf dataset with rq23
- changing SOTD to reflect editor's draft status