PROV-AQ: Provenance Access and Query

W3C Working Group Note 30 April 2013

This version: http://www.w3.org/TR/2013/NOTE-prov-aq-20130430/
Latest published version: http://www.w3.org/TR/prov-aq/
Previous version: http://www.w3.org/TR/2013/WD-prov-aq-20130312/
Editors: Graham Klyne, University of Oxford; Paul Groth, VU University Amsterdam
Authors: Luc Moreau, University of Southampton; Olaf Hartig, Invited Expert; Yogesh Simmhan, Invited Expert; James Myers, Rensselaer Polytechnic Institute; Timothy Lebo, Rensselaer Polytechnic Institute; Khalid Belhajjame, University of Manchester; Simon Miles, Invited Expert; Stian Soiland-Reyes, University of Manchester

Abstract

This document specifies how to use standard Web protocols, including HTTP, to obtain information about the provenance of resources on the Web. We describe both simple access mechanisms for locating provenance records associated with web pages or resources, and provenance query services for more complex deployments. This is part of the larger W3C PROV provenance family of documents.

The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

PROV Family of Documents

This document is part of the PROV family of documents, a set of documents defining various aspects that are necessary to achieve the vision of inter-operable interchange of provenance information in heterogeneous environments such as the Web. These documents are listed below. Please consult the [ PROV-OVERVIEW ] for a guide to reading these documents.

PROV-OVERVIEW (Note), an overview of the PROV family of documents [ PROV-OVERVIEW ];
PROV-PRIMER (Note), a primer for the PROV data model [ PROV-PRIMER ];
PROV-O (Recommendation), the PROV ontology, an OWL2 ontology allowing the mapping of the PROV data model to RDF [ PROV-O ];
PROV-DM (Recommendation), the PROV data model for provenance [ PROV-DM ];
PROV-N (Recommendation), a notation for provenance aimed at human consumption [ PROV-N ];
PROV-CONSTRAINTS (Recommendation), a set of constraints applying to the PROV data model [ PROV-CONSTRAINTS ];
PROV-XML (Note), an XML schema for the PROV data model [ PROV-XML ];
PROV-AQ (Note), mechanisms for accessing and querying provenance (this document);
PROV-DICTIONARY (Note) introduces a specific type of collection, consisting of key-entity pairs [ PROV-DICTIONARY ];
PROV-DC (Note) provides a mapping between PROV-O and Dublin Core Terms [ PROV-DC ];
PROV-SEM (Note), a declarative specification in terms of first-order logic of the PROV data model [ PROV-SEM ];
PROV-LINKS (Note) introduces a mechanism to link across bundles [ PROV-LINKS ].

Implementations Encouraged

The Provenance Working Group encourages implementation of the material defined in this document. Although work on this document by the Provenance Working Group is complete, errors may be recorded in the errata and these may be addressed in future revisions.

Please Send Comments

This document was published by the Provenance Working Group as a Working Group Note. If you wish to make comments regarding this document, please send them to public-prov-comments@w3.org (subscribe, archives). All comments are welcome.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction
2. Accessing provenance records
3. Locating provenance records
4. Provenance query services
5. Provenance pingback
6. Security considerations
A. Acknowledgements
B. Terms added to prov: namespace
C. References
- C.1 Informative references

1. Introduction

The Provenance Data Model [ PROV-DM ], Provenance Ontology [ PROV-O ] and related specifications define how to represent provenance on the World Wide Web (see the [ PROV-OVERVIEW ]).

This note describes how standard web protocols may be used to locate, retrieve and query provenance records:

Simple mechanisms for retrieving and discovering provenance records are described in section 2. Accessing provenance records and section 3. Locating provenance records.
Provenance query mechanisms that may be used for more demanding deployments are described in section 4. Provenance query services.
A simple "ping-back" mechanism allowing for discovery of additional provenance that would otherwise be unknown to the publisher of the resource (e.g. provenance about future entities that are based upon or influenced by a resource) is described in section 5. Provenance pingback.

Most mechanisms described in this note are independent of the provenance format used, and may be used to access provenance in any available format. For interoperable provenance publication, use of PROV represented in any of its specified formats is recommended. Where alternative formats are available, selection may be made by HTTP content negotiation [ HTTP11 ].

For ease of reference, the main body of this document contains some links to external web pages. Such links are distinguished from internal references thus: W3C Provenance Working Group .

This document is a W3C Note, not a formal W3C Specification. However, to clarify the description of intended behaviours, it does use the key words MUST, MUST NOT, REQUIRED, SHOULD, SHOULD NOT, RECOMMENDED, MAY and OPTIONAL as described in [ RFC2119 ].

1.1 Concepts

This document uses the term URI for web resource identifiers, as this is the term used in many of the currently ratified specifications that this document builds upon. In many situations, a URI may also be an IRI [ RFC3987 ], which is a generalisation of a URI allowing a wider range of Unicode characters. Every absolute URI is an IRI, but not every IRI is a URI. When IRIs are used in situations that require a URI, they must first be converted according to the mapping defined in section 3.1 of [ RFC3987 ]. A notable example is retrieval over the HTTP protocol. The mapping involves UTF-8 encoding of non-ASCII characters, %-encoding of octets not allowed in URIs, and Punycode-encoding of domain names.
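The IRI-to-URI mapping summarised above can be sketched in a few lines of Python. This is an illustrative approximation, not a complete implementation of [ RFC3987 ] section 3.1: the function name iri_to_uri is hypothetical, IPv6 literal hosts are not handled, and Python's built-in "idna" codec follows the older IDNA 2003 rules.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters left untouched when %-encoding. "%" is kept so that
# existing escapes are not encoded a second time.
_SAFE = "/%:@!$&'()*+,;=?"


def iri_to_uri(iri: str) -> str:
    """Approximate the RFC 3987 section 3.1 mapping (illustrative only)."""
    parts = urlsplit(iri)

    # Domain name: Punycode-encode non-ASCII labels.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += f":{parts.port}"

    # Preserve any userinfo, %-encoding it like the other components.
    userinfo, at, _ = parts.netloc.rpartition("@")
    if at:
        netloc = quote(userinfo, safe=_SAFE) + "@" + netloc

    # Remaining components: UTF-8 encode, then %-encode disallowed octets.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

For instance, under these assumptions an IRI whose path contains "é" is mapped to a URI containing "%C3%A9", the %-encoded UTF-8 octets of that character.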

In defining the specification below, we make use of the following concepts.

Resource: a resource in the general sense of "whatever might be identified by a URI", as described by the Architecture of the World Wide Web [ WEBARCH ], section 2.2. A resource may be associated with multiple instances or views (constrained resources) with differing provenance.
Constrained resource: a specialization (e.g. an aspect, version or instance) of a resource, about which one may wish to present provenance records. For example, a weather report for a given date may be an aspect of a resource that is maintained as the current weather report. A constrained resource is itself a resource, and may have its own URI different from that of the original. See also section 1.2 Provenance and resources, [ PROV-DM ] section 5.5.1, and [ WEBARCH ] section 2.3.2.
Target-URI: a URI denoting a resource (including any constrained resource), and which identifies that resource for the purpose of expressing provenance. Such a resource is typically an entity in the sense of [ PROV-DM ], but may be something else described by provenance records, such as an activity.
Provenance record: provenance represented in some fashion.
Provenance-URI: a URI denoting some provenance record.
Provenance query service: a service that accesses provenance given a query containing a target-URI or other information that identifies the desired provenance.
Service-URI: the URI of a provenance query service.
Pingback-URI: the URI of a provenance pingback service that can receive references to additional provenance related to an entity.
Accessing provenance records: given the identity of a resource, the process of discovering and retrieving some provenance record(s) about that resource. This may involve locating a provenance record, then performing an HTTP GET to retrieve it, or locating and using a query service for provenance about an identified resource, or some other mechanism not covered in this document.
Locating provenance records: given the identity of a resource, discovery of a provenance-URI or a service-URI that may be used to obtain a provenance record about that resource.
Provenance provider: an agent that makes available provenance records.
Provenance consumer: an agent that receives and interprets provenance records.

1.2 Provenance and resources

Fundamentally, a provenance record is about resources. In general, resources may vary over time and context. For example, a resource describing the weather in London changes from day to day, and a listing of restaurants near you will vary depending on your location.

Provenance records a history of the entities, activities, and people involved in producing an artifact, and may be collected from several sources at different times [ PROV-DM ]. In order to create a meaningful history, the individual provenance records used must retain their intended meaning when interpreted in a context other than that in which they were collected. Yet, we may still want to make provenance assertions about dynamic or context-dependent resources (e.g. a weather forecast for London on a particular day may have been derived from a particular set of Meteorological Office data).

Provenance records for dynamic and context-dependent resources are possible through a notion of constrained resources. A constrained resource is simply a resource (in the sense defined by [ WEBARCH ], section 2.2 ) that is a specialization or instance of some other resource. For example, a W3C specification typically undergoes several public revisions before it is finalized. A URI that refers to the "current" revision might be thought of as denoting the specification throughout its lifetime. Each individual revision would also have its own target-URI denoting the specification at that particular stage in its development. Using these, we can make provenance assertions that a particular revision was published on a particular date, and was last modified by a particular editor. Target-URIs may use any URI scheme, and are not required to be dereferencable.

Requests for provenance about a resource may return provenance records that use one or more target-URIs to refer to versions of that resource, such as when there are assertions referring to the same underlying resource in different contexts. For example, a provenance record for a W3C document might include information about all revisions of the document using statements that use the different target-URIs of the various revisions.

These ideas are represented in the provenance data model [ PROV-DM ] by the concepts entity and specialization. In particular, an entity may be a specialization of some resource whose "fixed aspects" provide sufficient constraint for expressed provenance about the resource to be invariant with respect to that entity. This entity is itself just another resource (e.g. the weather forecast for a given date as opposed to the current weather forecast), with its own URI for referring to it within a provenance record.

1.3 Interpreting provenance records

The mechanisms described in this document are intended to allow a provider to supply information that allows a consumer to access provenance records, which themselves explicitly identify the entities they describe. A provenance record may contain information about several entities, referring to them using their various target-URIs. Thus, a consumer should be selective in its use of the information provided when interpreting a provenance record.

A provenance record consumer will need to isolate information about the specific entity or entities of interest. These may be constrained resources identified by separate target-URIs that differ from the resource URI, in which case the consumer needs to discover those target-URIs. The mechanisms defined later allow a provider to expose such URIs.

While a provider should avoid giving spurious information, there are no fixed semantics, particularly when multiple resources are indicated, and a client should not assume that a specific given provenance-URI will yield information about a specific target-URI. In the general case, a client presented with multiple provenance-URIs and multiple target-URIs should look at all of the provenance-URIs for information about any or all of the target-URIs.

A provenance record is not of itself guaranteed to be authoritative or correct. Trust in provenance records must be determined separately from trust in the original resource. Just as in the web at large, it is a user's responsibility to determine an appropriate level of trust in any other resource; e.g. based on the domain that serves it, or an associated digital signature. (See also section 6. Security considerations .)

1.4 URI types and dereferencing

A number of resource types are described above in section 1.1 Concepts . The table below summarizes what these various URIs are intended to denote, and the kind of information that should be returned if they are dereferenced:

URI type	Denotes	Dereferences to
Target-URI	Any resource that is described by some provenance, typically an entity (in the sense of [ PROV-DM ]), but may be of another type (such as a [ PROV-DM ] activity).	Not specified (the URI is not even required to be dereferencable).
Provenance-URI	A provenance record, or provenance description, in the sense described by [ PROV-DM ] ( PROV Overview ).	A provenance record in any defined format, selectable via content negotiation.
Service-URI	A provenance query service. The service-URI is the initial URI used when accessing a provenance query service; following REST API style [ REST-APIs ], URIs for accessing provenance are determined via the service description.	A provenance query service description per section 4.1 Provenance query service description. Alternative formats may be offered via HTTP content negotiation.
Pingback-URI	A provenance pingback service. This is a service to which provenance pingback information can be submitted using an HTTP POST operation per section 5. Provenance pingback. No other operations are specified.	None specified (the owner of a provenance pingback URI may choose to return useful information, but is not required to do so).

2. Accessing provenance records

This specification describes two ways to access provenance records:

Direct access: given a provenance-URI, simply dereference it, and
Indirectly via a query service: given the URIs of some resource (or maybe other information) and a provenance query service, use the service to access provenance of the resource.

Web applications may access a provenance record in the same way as any resource on the Web, by dereferencing its URI (commonly using an HTTP GET operation). Thus, any provenance record may be associated with a provenance-URI , and may be accessed by dereferencing that URI using web mechanisms. How much or how little provenance is returned in a provenance record is a matter for the provider, taking account that a provenance trace may extend as linked data across multiple provenance records.
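As an illustration of direct access, the following Python sketch builds an HTTP GET request for a provenance-URI and uses the Accept header to state format preferences. The function name build_provenance_request and the default media types (Turtle preferred, RDF/XML as a fallback) are assumptions chosen for the example; a provider may offer other formats.

```python
from urllib.request import Request


def build_provenance_request(
    provenance_uri: str,
    accept=("text/turtle", "application/rdf+xml;q=0.8"),
) -> Request:
    """Build (but do not send) a GET request for a provenance record.

    The Accept header lists the formats the consumer can process, so the
    provider can select a representation by HTTP content negotiation.
    """
    return Request(
        provenance_uri,
        headers={"Accept": ", ".join(accept)},
        method="GET",
    )
```

The request can then be sent with urllib.request.urlopen(request), and the Content-Type of the response indicates which format the provider selected.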

When there is no easy way to associate a provenance-URI with a resource (e.g. for resources not directly web-accessible, or whose publication mechanism is controlled by someone else), a provenance description may be obtained using a provenance query service at an indicated service-URI. A REST protocol for provenance queries is defined in section 4. Provenance query services; also described there is a mechanism for locating a SPARQL query service [ SPARQL-SD ].

When publishing provenance, corresponding provenance-URIs or service-URIs should be discoverable using one or more of the mechanisms described in section 3. Locating provenance records.

Note

Provenance may be presented as a bundle, which is "a named set of provenance descriptions, and is itself an entity, so allowing provenance of provenance to be expressed" [ PROV-DM ]. A provenance description at a dereferencable provenance-URI may be treated as a bundle, and this is a good way to make provenance easily accessible. But there are other possible implementations of a bundle, such as a named graph in an RDF dataset [ RDF-CONCEPTS11 ], for which the bundle URI may not be directly dereferencable.

When a bundle is published as part of an RDF Dataset, to access it would require accessing the RDF Dataset and then extracting the identified graph component; this in turn would require knowing a URI or some other way to retrieve the RDF dataset. This specification does not describe a specific mechanism for extracting components from a document containing multiple graphs.

The W3C Linked Data Platform group ( www.w3.org/2012/ldp/ ) is chartered to produce a W3C Recommendation for HTTP-based (RESTful) application integration patterns using read/write Linked Data; we anticipate that they may address access to RDF Datasets in due course.

3. Locating provenance records

A provenance record can be accessed using direct web retrieval, given its provenance-URI . If this is known in advance, there is nothing more to specify. If a provenance-URI is not known then a mechanism to discover one must be based on information that is available to the would-be accessor. Likewise, provenance may be exposed by a query service, in which case, the corresponding service-URI must be discovered.

Three mechanisms are defined for a provenance consumer to find information about a provenance-URI or service-URI, along with a target-URI:

The consumer knows the resource URI and the resource is accessible using HTTP
The consumer has a copy of a resource represented as HTML or XHTML
The consumer has a copy of a resource represented as RDF (including the range of possible RDF syntaxes, such as HTML with embedded RDFa)

These particular cases are selected as corresponding to current primary web protocol and data formats. Similar approaches may be defined for other protocols or resource formats.

Provenance records may be offered by several providers other than that of the original resource publisher, each with different concerns, and presenting provenance at different locations. It is possible that these different providers may present contradictory provenance.

3.1 Resource accessed by HTTP

For a resource accessible using HTTP, a provenance record may be indicated using an HTTP Link header field, as defined by Web Linking (RFC 5988) [ LINK-REL ]. The Link header field is included in the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here).

A has_provenance link relation type for referencing a provenance record may be used thus:

Link: <provenance-URI>;
  rel="http://www.w3.org/ns/prov#has_provenance";
  anchor="target-URI"

When used in conjunction with an HTTP success response code ( 2xx ), this HTTP header field indicates that provenance-URI is the URI of a provenance record about the originally requested resource, and that the requested resource is identified within the provenance record as target-URI. (See also section 1.3 Interpreting provenance records .)

If no anchor parameter is provided then the target-URI is assumed to be the URI of the requested resource used in the corresponding HTTP request.

This note does not define the meaning of these links returned with other HTTP response codes: future revisions may define interpretations for these.

An HTTP response MAY include multiple has_provenance link header fields, indicating a number of different provenance resources (and anchors) that are known to the responding server, each referencing a provenance record about the accessed resource.

The presence of a has_provenance link in an HTTP response does not preclude the possibility that other providers also may offer provenance records about the same resource. In such cases, discovery of the additional provenance records must use other means (e.g. see section 4. Provenance query services ).
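The rules above (match the link relation, take the anchor as the target-URI, and fall back to the request URI when no anchor is given) can be sketched in Python as follows. This is a simplified illustration: the function name provenance_links is hypothetical, each header value is assumed to carry a single link, and parameter values are assumed to be double-quoted. A full parser would follow the complete Web Linking syntax [ LINK-REL ].

```python
import re

PROV_NS = "http://www.w3.org/ns/prov#"


def provenance_links(link_values, request_uri, rel=PROV_NS + "has_provenance"):
    """Return (link-URI, target-URI) pairs for Link header values matching rel.

    link_values: iterable of Link header field values, one link per value.
    request_uri: URI used in the HTTP request; it is the default target-URI
                 when a link carries no anchor parameter.
    """
    results = []
    for value in link_values:
        match = re.match(r"\s*<([^>]*)>(.*)", value, re.S)
        if not match:
            continue  # not a well-formed link value; skip it
        link_uri, params_text = match.groups()
        # Collect name="value" parameters (quoted values only, by assumption).
        params = {
            name.lower(): val
            for name, val in re.findall(r';\s*([\w-]+)\s*=\s*"([^"]*)"', params_text)
        }
        # rel may hold several space-separated relation types.
        if rel in params.get("rel", "").split():
            results.append((link_uri, params.get("anchor", request_uri)))
    return results
```

Passing rel=PROV_NS + "has_query_service" selects service-URIs in the same way. After redirection, request_uri should be the URI of the final request, since that is the resource the links refer to.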

An example HTTP response including provenance headers might look like this (where C: and S: prefixes indicate client and server emitted data respectively):

Example 1

C: GET http://example.com/resource123/ HTTP/1.1
C: Accept: text/html
S: HTTP/1.1 200 OK
S: Content-type: text/html
S: Link: <http://example.com/resource123/provenance/>;
         rel="http://www.w3.org/ns/prov#has_provenance";
         anchor="http://example.com/resource123/"
S:
S: <html ...>
S:  :
S: </html>

3.1.1 Specifying Provenance Query Services

The resource provider may indicate that provenance records about the resource are provided by a provenance query service. This is done through the use of a has_query_service link relation type following the same pattern as above:

Link: <service-URI>;
  rel="http://www.w3.org/ns/prov#has_query_service";
  anchor="target-URI"

The has_query_service link identifies the service-URI. Dereferencing this URI yields a service description that provides further information to enable a client to submit a query to retrieve a provenance record for a resource; see section 4. Provenance query services for more details.

Example 2

C: GET http://example.com/resource123/ HTTP/1.1
C: Accept: text/html
S: HTTP/1.1 200 OK
S: Content-type: text/html
S: Link: <http://example.com/resource123/provenance-query/>; 
         rel="http://www.w3.org/ns/prov#has_query_service"; 
         anchor="http://example.com/resource123/"
S:
S: <html ...>
S:  :
S: </html>

There MAY be multiple has_query_service link header fields, and these MAY appear in an HTTP response together with has_provenance link header fields.

3.1.2 Content negotiation, redirection and Link: headers

When performing content negotiation for a resource, it is common for HTTP 302 or 303 redirect response codes to be used to direct a client to an appropriately-formatted resource. When accessing a resource for which provenance is available, link headers SHOULD be included with the response to the final redirected request, and not on the intermediate 303 responses. (When accessing a resource from a browser using Javascript, the intermediate 303 responses are usually handled transparently by the browser and are not visible to the HTTP client code.)

Following content negotiation, any provenance link returned refers to the resource whose URI is used in the corresponding HTTP request, or the given anchor parameter if that is different.

An example transaction using content negotiation and redirection might look like this (where C: and S: prefixes indicate client and server emitted data respectively):

Example 3

C: GET http://example.com/resource123/ HTTP/1.1
C: Accept: text/html
S: HTTP/1.1 302 Found
S: Location: /resource123/content.html
S: Vary: Accept
S:
S: HTML content for http://example.com/resource123/
S: is available at http://example.com/resource123/content.html

C: GET http://example.com/resource123/content.html HTTP/1.1
C: Accept: text/html
S: HTTP/1.1 200 OK
S: Content-type: text/html
S: Link: <http://example.com/resource123/provenance/>;
         rel="http://www.w3.org/ns/prov#has_provenance";
         anchor="http://example.com/resource123/20130226/content.html"
S:
S: <html>
S:  <!-- HTML content here... -->
S: </html>

This example indicates a provenance record at http://example.com/resource123/provenance/, which uses http://example.com/resource123/20130226/content.html as the target-URI for the requested resource. If the anchor= parameter were to be omitted from the Link header field, the indicated target-URI would be http://example.com/resource123/content.html.

3.2 Resource represented as HTML

For a document presented as HTML or XHTML, without regard for how it has been obtained, a provenance record may be associated with a resource by adding a <link> element to the HTML <head> section. Two link relation types for referencing provenance may be used:

  <html>
     <head>
        <link rel="http://www.w3.org/ns/prov#has_provenance" href="provenance-URI">
        <link rel="http://www.w3.org/ns/prov#has_anchor" href="target-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
       <!-- HTML content here... -->
     </body>
</html>

The first link element (#has_provenance) gives the provenance-URI of a provenance record for the document.

The target-URI given by the second link element (#has_anchor) specifies an identifier for the document that may be used within the provenance record when referring to the document.

If no target-URI is provided (via a #has_anchor link element) then it is assumed to be the URI of the document. It is RECOMMENDED that this convention be used only when the document has a URI that is reasonably expected to be known or easily discoverable by a consumer of the document (e.g. when delivered from a web server, or as part of a MIME structure containing content identifiers [ RFC2392 ]).

An HTML document header MAY present multiple provenance-URIs over several #has_provenance link elements, indicating a number of different provenance records that are known to the publisher of the document, each of which may provide provenance about the document (see section 1.3 Interpreting provenance records).
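A consumer holding a copy of an HTML document might extract these link elements along the following lines. This Python sketch uses only the standard library; the names ProvLinkParser and html_provenance are hypothetical, the first #has_anchor value is taken as the target-URI, and for brevity the parser does not check that each link element actually appears inside the head section.

```python
from html.parser import HTMLParser

PROV_NS = "http://www.w3.org/ns/prov#"


class ProvLinkParser(HTMLParser):
    """Collect href values of <link> elements whose rel is in the PROV namespace."""

    def __init__(self):
        super().__init__()
        self.links = {}  # local relation name -> list of href values

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may contain several space-separated relation types.
        for rel in (attributes.get("rel") or "").split():
            if rel.startswith(PROV_NS):
                self.links.setdefault(rel[len(PROV_NS):], []).append(href)


def html_provenance(html_text, document_uri):
    """Return the target-URI, provenance-URIs and service-URIs found in html_text.

    document_uri is used as the target-URI when no #has_anchor link is present.
    """
    parser = ProvLinkParser()
    parser.feed(html_text)
    anchors = parser.links.get("has_anchor", [])
    return {
        "target": anchors[0] if anchors else document_uri,
        "provenance": parser.links.get("has_provenance", []),
        "query_services": parser.links.get("has_query_service", []),
    }
```

Applied to the HTML example above, this would yield the target-URI from the #has_anchor element and a one-element list of provenance-URIs.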

Note

The mechanisms used with HTTP and HTML/RDF are slightly inconsistent in their approach to specifying target-URI values. In HTTP Link header fields, an optional anchor= parameter may be supplied for each such header. In HTML and RDF, separate #has_anchor relations are defined. It was felt that avoiding reinvention of existing mechanisms was more important than being completely consistent. If anchors are processed as described in section 1.3 Interpreting provenance records (3rd paragraph), observable behaviour across all approaches should be consistent.

3.2.1 Specifying Provenance Query Services

The document creator may specify that the provenance about the document is provided by a provenance query service. This is done through the use of a third link relation type following the same pattern as above:

  <html xmlns="http://www.w3.org/1999/xhtml">
     <head>
        <link rel="http://www.w3.org/ns/prov#has_query_service" href="service-URI">
        <link rel="http://www.w3.org/ns/prov#has_anchor" href="target-URI">
        <title>Welcome to example.com</title>
     </head>
     <body>
       <!-- HTML content here... -->
     </body>
</html>

The has_query_service link element identifies the service-URI. Dereferencing this URI yields a service description that provides further information to enable a client to query for provenance about a resource; see section 4. Provenance query services for more details.

There MAY be multiple #has_query_service link elements, and these MAY appear in the same document as #has_provenance link elements (though we do not anticipate that #has_provenance and #has_query_service link relations will commonly be used together).

3.3 Resource represented as RDF

If a resource is represented as RDF (in any of its recognized syntaxes, including RDFa), it may contain references to its own provenance using additional RDF statements. For this purpose, the link relations introduced above (section 3. Locating provenance records) may be used as RDF properties: prov:has_provenance, prov:has_anchor, and prov:has_query_service, where the prov: prefix here indicates the PROV namespace URI http://www.w3.org/ns/prov#.

The RDF property prov:has_provenance is a relation between two resources, where the object of the property is a provenance-URI that denotes a provenance record about the subject resource. Multiple prov:has_provenance assertions may be made about a subject resource.

Property prov:has_anchor specifies a target-URI used in the indicated provenance to refer to the containing RDF document.

Property prov:has_query_service specifies a service-URI for provenance queries.

Example 4

@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
<> dcterms:title        "Welcome to example.com" ;
   prov:has_anchor       <http://example.com/data/resource.rdf> ;
   prov:has_provenance   <http://example.com/provenance/resource.rdf> ;
   prov:has_query_service <http://example.com/provenance-query-service/> .
# (More RDF data ...)

(The above example uses Turtle RDF syntax [ TURTLE ].)

Note

These terms ( prov:has_provenance, prov:has_anchor, and prov:has_query_service ) may be also used in RDF statements with other subjects to indicate provenance of other resources, but discussion of such use is beyond the scope of this document.

See also the note about target-URIs at the end of section 3.2 Resource represented as HTML .

4. Provenance query services

This section describes a simple HTTP query protocol for accessing provenance records, and also a mechanism for locating a SPARQL service endpoint [ SPARQL-SD ]. The HTTP query protocol specifies HTTP operations [ HTTP11 ] for retrieving provenance records from a provenance query service, following the approach of the SPARQL Graph Store HTTP Protocol [ SPARQL-HTTP ].

The introduction of query services is motivated by the following possible considerations:

third-party providers of provenance descriptions may be unable to use the mechanisms of Section 3 because the corresponding target-URI is outside their control;
services unknown to the original publisher may have provenance records about the same resource;
there is no known dereferenceable provenance-URI for a particular entity;
query services may provide additional filters over what provenance is returned; and
query services may support more expressive selections, such as "which entities were derived from entities attributed to agent X".

The patterns for using provenance query services are designed around REST principles [ REST ], which aim to minimize coupling between client and server implementation details.

The query mechanisms provided by a provenance query service are described by a service description, which is obtained by dereferencing a service-URI . A service description may contain information about additional mechanisms that are not described here. In keeping with REST practice for web applications, alternative service descriptions using different formats may be offered and accessed using HTTP content negotiation. We describe below a service description format that uses RDF to describe two query mechanisms.

The general procedure for using a provenance query service is:

retrieve the service description;
within the service description, locate information about a recognized query mechanism (ignoring unrecognized descriptions if the description covers multiple service options);
if a recognized query mechanism is found, extract information needed to use that mechanism (e.g. a URI template or a SPARQL service endpoint URI); and
use the information obtained to query for required provenance, using the selected query mechanism.
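The steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes the service description has already been retrieved and parsed into (subject, predicate, object) string tuples by some RDF library, and the function name select_query_mechanism is hypothetical.

```python
PROV = "http://www.w3.org/ns/prov#"
SD = "http://www.w3.org/ns/sparql-service-description#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def select_query_mechanism(triples, service_uri):
    """Return ("direct", template) or ("sparql", endpoint), or None.

    `triples` is an iterable of (subject, predicate, object) string tuples,
    assumed to have been parsed from the retrieved service description.
    """
    triples = list(triples)

    def objects(subject, predicate):
        return [o for s, p, o in triples if s == subject and p == predicate]

    for mechanism in objects(service_uri, PROV + "describesService"):
        types = objects(mechanism, RDF_TYPE)
        if PROV + "DirectQueryService" in types:
            for template in objects(mechanism, PROV + "provenanceUriTemplate"):
                return ("direct", template)
        elif SD + "Service" in types:
            for endpoint in objects(mechanism, SD + "endpoint"):
                return ("sparql", endpoint)
        # Descriptions of unrecognized mechanisms are ignored.
    return None
```

The returned URI template or SPARQL endpoint URI is then used to issue the actual query, as described in the following sections.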

The remainder of this section covers the following topics:

section 4.1 Provenance query service description - describes an RDF-based service description format and vocabularies to convey information about direct HTTP query and/or SPARQL service options.
- section 4.1.1 Direct HTTP query service description - RDF structure for describing a direct HTTP query service.
- section 4.1.2 SPARQL query service description - RDF structure for describing a SPARQL query service.
section 4.2 Direct HTTP query service invocation - describes how to perform a direct HTTP query for provenance, using information obtained from the service description.
section 4.3 Provenance query service discovery - briefly discusses some possible approaches to discovery of provenance query services.

4.1 Provenance query service description


Dereferencing a service-URI yields a service description. The service description may be in any format selectable through content negotiation, and it may contain descriptions of one or more available query mechanisms. The format described here uses RDF, serialized as Turtle [ TURTLE ], but any selectable RDF serialization could be used. In this RDF service description, each query mechanism is associated with an RDF type, as explained below.

The overall structure of a service description is as follows:

<service-URI> a prov:ServiceDescription ;
    prov:describesService <direct-query-description>, <sparql-query-description> .
<direct-query-description> a prov:DirectQueryService ;
  prov:provenanceUriTemplate "direct-query-template"
  .
<sparql-query-description> a sd:Service ;
  sd:endpoint <sparql-query> ;
  # other details...
  .

We see here that the service-URI identifies a resource of type prov:ServiceDescription, which collects descriptions of one or more provenance query mechanisms. Each associated mechanism is indicated by a prov:describesService statement.

Note We expect the presentation of service descriptions to be considered by the W3C Linked Data Platform group ( www.w3.org/2012/ldp/ ); at the time of writing, there is no consensus (cf. message at lists.w3.org/Archives/Public/public-ldp/2012Nov/0036.html and responses). As and when such consensus emerges, we recommend that provenance query service implementers consider adopting it, or at least consider making their implementations compatible with it.

4.1.1 Direct HTTP query service description

A direct HTTP query service is described by an RDF resource of type prov:DirectQueryService. It allows for accessing provenance about a specified target-URI. The query URI to use is described by a URI Template [ URI-template ] (level 2 or above) in which the variable uri stands for the target-URI. The URI template is specified as:


<direct-query-description> a prov:DirectQueryService ;
  prov:provenanceUriTemplate "uri-template" .

where direct-query-description is any distinct RDF subject node (i.e. a blank node or a URI), and uri-template is a URI template [ URI-template ].

The URI template indicated by prov:provenanceUriTemplate may expand to an absolute or relative URI reference. A URI for the desired provenance record is obtained by expanding the URI template with the variable uri set to the target-URI for which provenance is requested. In this example, if the target-URI contains '#' or '&' these must be %-escaped as %23 or %26 respectively before template expansion [ RFC3986 ]. If the result is a relative reference, it is interpreted per [ RFC3986 ] (section 5.2) using the URI of the service description as its base URI (which is generally the same as the query service-URI, unless HTTP redirection has been invoked).

Example 5

<http://example.com/prov/service> a prov:ServiceDescription;
    prov:describesService _:direct .
_:direct a prov:DirectQueryService ;
  prov:provenanceUriTemplate "http://www.example.com/provenance/service?target={uri}" .
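The expansion and resolution steps described above might be sketched as follows. This is a simplified illustration that handles only the single simple variable {uri} (not a general URI Template processor); the function name provenance_uri is hypothetical.

```python
from urllib.parse import quote, urljoin


def provenance_uri(template, target_uri, service_description_uri):
    """Expand the simple {uri} variable, then resolve any relative reference."""
    # Simple expansion percent-encodes every character outside the
    # unreserved set (letters, digits, "-", ".", "_", "~").
    encoded = quote(target_uri, safe="")
    expanded = template.replace("{uri}", encoded)
    # urljoin resolves a relative reference against the base URI and
    # leaves an absolute URI unchanged.
    return urljoin(service_description_uri, expanded)


print(provenance_uri(
    "http://www.example.com/provenance/service?target={uri}",
    "http://www.example.com/entity123",
    "http://example.com/prov/service",
))
```

With a relative template such as "/direct?target={uri}", the same call resolves the result against the URI of the service description.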

A provenance query service MAY recognize additional parameters encoded as part of a URI for the provenance record. If it does, it SHOULD include these in the provenance URI template in the service description, so that clients may discover how a URI is formed using this additional information. For example, a query service might offer to include just the immediate provenance of a target, or to also supply provenance of other resources from which the target is derived. If a service accepts an additional parameter steps that defines the number of previous steps to include in a provenance trace, it might publish its service description thus:


Example 6

<http://example.com/prov/service> a prov:ServiceDescription;
    prov:describesService _:direct .
_:direct a prov:DirectQueryService ;
  prov:provenanceUriTemplate "http://www.example.com/provenance/service?target={uri}{&steps}" .

(Note that in this case, a "level 3" URI template feature is used [ URI-template ].)

Section 4.2 Direct HTTP query service invocation discusses how a client interacts with a direct HTTP query service.

4.1.2 SPARQL query service description

A SPARQL query service is described by an RDF resource of type sd:Service [ SPARQL-SD ].

It allows for accessing provenance information using a SPARQL query, which may be constructed to retrieve provenance for a particular resource, or for multiple resources. The query may be formulated using the PROV-O vocabulary terms [ PROV-O ], and others supported by the SPARQL endpoint as appropriate.

The SPARQL query service description is constructed as defined by SPARQL 1.1 Service Description [ SPARQL-SD ]; e.g.


Example 7

<http://example.com/prov/service> a prov:ServiceDescription;
    prov:describesService _:sparql .
_:sparql a sd:Service ;
    sd:endpoint <http://www.example.com/provenance/sparql> ;
    sd:supportedLanguage sd:SPARQL11Query .

where http://www.example.com/provenance/sparql is the URI of a provenance query SPARQL endpoint.

The SPARQL service description may be detailed or sparse, provided that it includes at least a sd:endpoint statement with the SPARQL service endpoint URI.


The endpoint may be given as an absolute or relative URI reference. If a relative reference is given, it is interpreted in the normal way for the RDF format used, which will commonly be relative to the URI of the service document itself.
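As a hedged illustration of using a discovered endpoint, the sketch below builds a SPARQL protocol GET URL that asks for two PROV-O relations about a target. The function name sparql_provenance_url and the choice of query are assumptions made for illustration; the sketch also assumes the endpoint URI has no query component of its own.

```python
from urllib.parse import urlencode

# Characters that may not appear unescaped inside a SPARQL IRI reference.
FORBIDDEN_IN_IRI = set('<>"{}|^`\\')


def sparql_provenance_url(endpoint, target_uri):
    """Build a SPARQL protocol GET URL asking for basic PROV-O statements."""
    # Refuse targets that could break out of the <...> IRI syntax.
    if any(c in FORBIDDEN_IN_IRI or ord(c) <= 0x20 for c in target_uri):
        raise ValueError("target URI cannot be embedded in a SPARQL query")
    query = (
        "PREFIX prov: <http://www.w3.org/ns/prov#>\n"
        "SELECT ?source ?agent WHERE {\n"
        f"  OPTIONAL {{ <{target_uri}> prov:wasDerivedFrom ?source }}\n"
        f"  OPTIONAL {{ <{target_uri}> prov:wasAttributedTo ?agent }}\n"
        "}"
    )
    # The SPARQL protocol carries the query text in the "query" parameter.
    return endpoint + "?" + urlencode({"query": query})
```

A real client would normally select the query according to the provenance it needs and the vocabulary the endpoint supports.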

4.1.3 Service description example

The following service description example uses Turtle [ TURTLE ] syntax to describe both direct HTTP and SPARQL query services:


Example 8

@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix sd:      <http://www.w3.org/ns/sparql-service-description#> .

<> a prov:ServiceDescription ;
    prov:describesService <#direct>, <#sparql> ;
    dcterms:publisher <#us>
    .
<#us> a foaf:Organization ;
    foaf:name "and not a service!"
    .
<#direct> a prov:DirectQueryService ;
    prov:provenanceUriTemplate "/direct?target={+uri}"
    .
<#sparql> a sd:Service ;
    sd:endpoint </sparql/> ;
    sd:supportedLanguage sd:SPARQL11Query ;
    sd:resultFormat <http://www.w3.org/ns/formats/RDF_XML> ,
                    <http://www.w3.org/ns/formats/Turtle> ,
                    <http://www.w3.org/ns/formats/SPARQL_Results_XML> ,
                    <http://www.w3.org/ns/formats/SPARQL_Results_JSON> ,
                    <http://www.w3.org/ns/formats/SPARQL_Results_CSV> ,
                    <http://www.w3.org/ns/formats/SPARQL_Results_TSV>
.

4.2 Direct HTTP query service invocation

This section describes the interaction between a client and a direct HTTP query service whose service description is as presented in section 4.1.1 Direct HTTP query service description, once the service description has been analyzed and its URI template has been extracted.

The target-URI for which provenance is required is used in the expansion of the supplied URI template [ URI-template ] to formulate an HTTP GET request.

Thus, in the first service description example in section 4.1.1 Direct HTTP query service description, the URI template is http://www.example.com/provenance/service?target={uri}. If the supplied target-URI is http://www.example.com/entity123, this would be used as the value for variable uri when expanding the template. The resulting HTTP request used to retrieve a provenance record would be:

Example 9

GET /provenance/service?target=http%3A%2F%2Fwww.example.com%2Fentity123 HTTP/1.1
Host: www.example.com

Any server that implements this protocol and receives a request URI in a form corresponding to its published URI template SHOULD return a provenance record for the embedded target-URI. The target-URI is obtained by percent-decoding [ RFC3986 ] the part of the request URI corresponding to occurrences of the variable uri in the URI template. E.g., in the above example, the decoded target-URI is http://www.example.com/entity123. The target-URI MUST be an absolute URI, and the server SHOULD respond with 400 Bad Request if it is not.
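On the server side, recovering the target-URI might look like the following minimal sketch. It assumes the published template places the variable in a query parameter named target, as in the example above; the function name target_from_request is hypothetical, and the returned status code 200 only signals that processing may continue.

```python
from urllib.parse import urlsplit, unquote


def target_from_request(request_uri, variable="target"):
    """Return (status, target_uri) for a request matching ...?target={uri}."""
    query = urlsplit(request_uri).query
    for pair in query.split("&"):
        name, sep, value = pair.partition("=")
        if name == variable and sep:
            # Percent-decode the part of the URI produced by the template.
            target = unquote(value)
            # An absolute URI must at least carry a scheme.
            if urlsplit(target).scheme:
                return 200, target
            return 400, None
    # No target parameter at all: treat as a bad request.
    return 400, None
```

A request whose target decodes to a relative reference such as entity123 is rejected with 400, in line with the requirement above.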

A server SHOULD NOT offer a template containing {+uri} or other non-simple variable expansion options [ URI-template ] unless none of the valid target-URIs for which it can provide provenance contain problematic characters like '#' or '&'.

Note

The defined URI template expansion process [ URI-template ] generally takes care of %-escaping characters that are not permitted in URIs. However, when expanding a template with {+uri} (or other non-simple variable expansion options), some permitted characters such as '#' and '&' are not escaped. If the supplied target-URI contains these characters, then they may disrupt interpretation of the resulting query URI. A generally more reliable approach is to use {uri} in the URI template string, which will cause all URI-reserved characters to be %-escaped as part of the URI-template expansion, as in the example above.
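The difference between the two expansion styles can be seen in a small sketch. The helper names expand_simple and expand_reserved are hypothetical and approximate the {uri} and {+uri} behaviours for a single string value; the reserved variant assumes any "%" in the input is already part of a valid percent-encoded triplet.

```python
from urllib.parse import quote, urlsplit

# Characters left unescaped by reserved expansion ({+var}).
RESERVED = ":/?#[]@!$&'()*+,;="


def expand_simple(value):
    """Approximate {var}: escape everything outside the unreserved set."""
    return quote(value, safe="")


def expand_reserved(value):
    """Approximate {+var}: reserved characters and "%" pass through."""
    return quote(value, safe=RESERVED + "%")


target = "http://www.example.com/doc?a=1&b=2#frag"
base = "http://www.example.com/provenance/service?target="

# With {+uri}, the '#' in the target becomes the fragment of the query URI
# and the '&' starts a new query parameter.
risky = urlsplit(base + expand_reserved(target))
print(risky.query, "|", risky.fragment)

# With {uri}, the whole target stays inside the target parameter.
safe = urlsplit(base + expand_simple(target))
print(safe.query, "|", safe.fragment)
```

In the first case the fragment never reaches the server, so the service would see a truncated target-URI.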

If the provenance described by the request is unknown to the server, a suitable error response code SHOULD be returned. In the absence of any security or privacy concerns about the resource, that might be 404 Not Found. But if the existence or non-existence of a resource is considered private or sensitive, an authorization failure or other response may be returned.

The direct HTTP query service may return provenance in any available format. For interoperable provenance publication, use of PROV represented in any of its specified formats is recommended. Where alternative formats are available, selection may be made by content negotiation, using Accept: header fields in the HTTP request. Services MUST identify the Content-Type of the provenance returned.
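A client might express its format preferences as in the sketch below, which only constructs the request object and does not send it. The function name provenance_request, the default media types, and the q-value scheme are assumptions for illustration.

```python
from urllib.request import Request


def provenance_request(query_uri, media_types=("text/turtle", "application/rdf+xml")):
    """Build a GET request whose Accept header lists preferred formats."""
    # The first media type is preferred; later ones get decreasing q-values.
    parts = [
        mt if i == 0 else f"{mt};q={1 - i / 10:.1f}"
        for i, mt in enumerate(media_types)
    ]
    return Request(query_uri, headers={"Accept": ", ".join(parts)})
```

After sending the request, the client would inspect the Content-Type of the response to decide how to parse the returned provenance.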

Additional URI query parameters may be used as indicated by the service description in section 4.1.1 Direct HTTP query service description. The second service description example specifies a URI template with an additional variable which may be used to control the scope of provenance information returned: http://www.example.com/provenance/service?target={uri}{&steps}. Following [ URI-template ], if no value for variable steps is provided when expanding the template, this extra element is effectively ignored. But if a steps value of (say) 2 is provided, then the resulting HTTP query would be:

Example 10

GET http://www.example.com/provenance/service?target=http%3A%2F%2Fwww.example.com%2Fentity&steps=2 HTTP/1.1

Note

The use of any specific URI template variable other than uri for the target-URI is a matter for agreement between the client and query service, and is not specified in this note. It is mentioned here simply to show that the possibility exists to formulate more detailed queries.

4.3 Provenance query service discovery

Previously, section 3. Locating provenance records has described use of HTTP Link: header fields, HTML <link> elements and RDF statements to indicate provenance query services. Beyond that, this specification does not define any specific mechanism for discovering query services. Applications may use any appropriate mechanism, including but not limited to: prior configuration, search engines, service registries, etc.

To facilitate service discovery, we recommend that RDF publication of dataset and service descriptions use the property prov:has_query_service and the provenance service type prov:ServiceDescription as appropriate (see the appendix section B. ).

For example, a VoID description [ VoID ] of a dataset might indicate a provenance query service providing information about that dataset:

  <http://example.org/dataset/> a void:Dataset ;
      prov:has_query_service <http://example.org/provenance/> .

The RDF service description example in section 4.1.3 Service description example shows use of the prov:ServiceDescription type.

5. Provenance pingback


This section describes a mechanism that may be used to discover related provenance information that the publisher of a resource does not otherwise know about; e.g. provenance describing how it is used after it has been created.

The mechanisms discussed in previous sections are primarily concerned with the publisher enabling access to known provenance about an entity, answering questions such as:

what was this resource based upon?
how was it constructed?
who made it?
when was it made?

These questions can be opened up to consider provenance information created by unrelated third parties, like:

what new resources are based on this resource?
what has this resource been used for?
who has used it?
what other resources are derived from the same sources as this resource?

The ability to answer such broader questions requires some cooperation among the parties who use a resource; for example, a consumer could report use directly to the publisher, or a search engine could discover and report downstream resource usage. To facilitate such cooperation, a resource publisher may receive provenance "ping-backs". (The mechanism described here is inspired by blog pingbacks, but avoids the need for XML-RPC and is specific for provenance records.)

A resource may have an associated provenance ping-back URI, which may be presented with references to provenance about the resource. The ping-back URI is associated with a resource using mechanisms similar to those used for presenting a provenance-URI, but using a prov:pingback link relation instead of prov:has_provenance. A consumer of the resource, or some other system, may perform an HTTP POST operation to the pingback URI, with a request body containing a list of provenance-URIs for provenance records describing uses of the resource.

For example, consider a resource that is published by acme.example.org, and is subsequently used by coyote.example.org in the construction of some new entity; we might see an exchange along the following lines. We start with coyote.example.org retrieving a copy of acme.example.org's resource:


Example 11

C: GET http://acme.example.org/super-widget123 HTTP/1.1

S: 200 OK
S: Link: <http://acme.example.org/super-widget123/provenance>;
         rel="http://www.w3.org/ns/prov#has_provenance"
S: Link: <http://acme.example.org/super-widget123/pingback>;
         rel="http://www.w3.org/ns/prov#pingback"
 :
(super-widget123 resource data)

The first of the links in the response is a has_provenance link with a provenance-URI that has been described previously (section 3.1 Resource accessed by HTTP). The second identifies a distinct resource that exists to receive provenance pingbacks. Later, when a new resource has been created or some related action performed based upon acme.example.org/super-widget123, a client may post a pingback request to the supplied pingback URI:


Example 12

C: POST http://acme.example.org/super-widget123/pingback HTTP/1.1
C: Content-Type: text/uri-list
C:
C: http://coyote.example.org/contraption/provenance
C: http://coyote.example.org/another/provenance

S: 204 No Content

The pingback request supplies a list of provenance-URIs from which additional provenance may be retrieved. The pingback service may do as it chooses with these URIs; e.g., it may choose to save them for later use, to retrieve the associated provenance and save that, to publish the URIs along with other provenance information about the original entity to which they relate, or even to ignore them.
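A client-side pingback might be assembled as in this minimal sketch, which builds (but does not send) the POST request. The function name pingback_request is hypothetical; the body follows the text/uri-list convention of one URI per line with CRLF line endings.

```python
from urllib.request import Request


def pingback_request(pingback_uri, provenance_uris):
    """Build a POST request carrying provenance-URIs as text/uri-list."""
    # One URI per line, each line terminated by CRLF.
    body = "".join(uri + "\r\n" for uri in provenance_uris).encode("ascii")
    return Request(
        pingback_uri,
        data=body,
        method="POST",
        headers={"Content-Type": "text/uri-list"},
    )
```

The request could then be submitted with urllib.request.urlopen, subject to the precautions discussed under security considerations.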

There is no required information in the server response to a pingback POST request. In the examples here, the pingback service responds positively with 204 No Content and an empty response body. Other HTTP status values like 200 OK, 201 Created, 202 Accepted, and 303 See Other might also be appropriate positive responses depending on the domain and application.

The only defined operation on a pingback-URI is POST, which supplies links to provenance information or services as described above. A pingback-URI MAY respond to other requests, but no requirements are imposed on how it responds. In particular, it is not specified here how a pingback resource should respond to an HTTP GET request.

The pingback client MAY include extra has_provenance links to indicate provenance records related to different resources, specified with correspondingly different anchor URIs. These MAY indicate further provenance about existing resources, or about new resources (such as new entities derived or specialized from that for which the pingback URI was provided). For example:

Example 13

C: POST http://acme.example.org/super-widget123/pingback HTTP/1.1
C: Link: <http://coyote.example.org/extra/provenance>;
         rel="http://www.w3.org/ns/prov#has_provenance";
         anchor="http://acme.example.org/extra-widget"
C: Content-Type: text/uri-list
C:
C: http://coyote.example.org/contraption/provenance
C: http://coyote.example.org/another/provenance
C: http://coyote.example.org/extra/provenance
S: 204 No Content

The client MAY also supply has_query_service links indicating provenance query services that can describe the target-URI. The anchor MUST be included, and SHOULD be either the target-URI of the resource for which the pingback URI was provided (from the examples above, that would be http://acme.example.org/super-widget123), or some related resource with relevant provenance. For example:

Example 14

C: POST http://acme.example.org/super-widget123/pingback HTTP/1.1
C: Link: <http://coyote.example.org/sparql>;
         rel="http://www.w3.org/ns/prov#has_query_service";
         anchor="http://acme.example.org/super-widget123"
C: Content-Type: text/uri-list
C: Content-Length: 0
C:
S: 204 No Content

Here, the pingback client has supplied a query service URI, but did not submit any provenance-URIs and the URI list is therefore empty. The Link header field indicates that http://coyote.example.org/sparql is a provenance query service that can provide provenance information relating to http://acme.example.org/super-widget123 (that being the URI of the resource for which the pingback URI was provided).
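The Link header field values used in these pingback examples could be produced by a small helper such as the following sketch. The function name prov_link_header is hypothetical, and the absolute-URI check is a simplification that only looks for a scheme.

```python
from urllib.parse import urlsplit

PROV_NS = "http://www.w3.org/ns/prov#"


def prov_link_header(link_uri, relation, anchor):
    """Format a Link header field value for a PROV relation with an anchor."""
    # The anchor names the resource that the link is about.
    if not urlsplit(anchor).scheme:
        raise ValueError("anchor should be an absolute URI")
    return f'<{link_uri}>; rel="{PROV_NS}{relation}"; anchor="{anchor}"'


print(prov_link_header(
    "http://coyote.example.org/sparql",
    "has_query_service",
    "http://acme.example.org/super-widget123",
))
```

Making the anchor a required argument reflects the requirement that it be included in pingback requests.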

6. Security considerations

Provenance is central to establishing trust in data. If provenance is corrupted, it may lead agents (human or software) to draw inappropriate and possibly harmful conclusions. Therefore, care is needed to ensure that the integrity of provenance is maintained. Just as provenance can help determine a level of trust in some information, a provenance record related to the provenance itself ("provenance of provenance") can help determine trust in the provenance.

The HTTP security considerations [ HTTP11 ] generally apply for all of the resources and services located through the mechanism in this document.

Secure HTTP (https) SHOULD be used across unsecured networks when accessing provenance that may be used as a basis for trust decisions, or to obtain a provenance URI for same.

When retrieving a provenance URI from a document, steps SHOULD be taken to ensure the document itself is an accurate copy of the original whose author is being trusted (e.g. signature checking, or use of a trusted secure web service). (See also section 1.3 Interpreting provenance records .)

Provenance may present a route for leakage of privacy-related information, combining as it does a diversity of information types with possible personally-identifying information; e.g. editing timestamps may provide clues to the working patterns of document editors, or derivation traces might indicate access to sensitive materials. In particular, note that the fact that a resource is openly accessible does not mean that its provenance should also be. When publishing provenance, its sensitivity SHOULD be considered and appropriate access controls applied where necessary. When a provenance-aware publishing service accepts some resource for publication, the contributors SHOULD have some opportunity to review and correct or conceal any provenance that they don't wish to be exposed. Provenance management systems SHOULD embody mechanisms for enforcement and auditing of privacy policies as they apply to provenance. Implementations MAY choose to use standard HTTP authorization mechanisms to restrict access to resources, returning 401 Unauthorized, 403 Forbidden or 404 Not Found as appropriate.

Provenance may be used by audits to establish accountability for information use [ INFO-ACC ] and to verify use of proper processes in information processing activities. Thus, provenance management systems can provide mechanisms to support auditing and enforcement of information handling policies. In such cases, provenance itself may be a valuable target for attack by malicious agents, and care must be taken to ensure it is stored securely and in a fashion that resists attempts to tamper with it.

The pingback service described in section 5. Provenance pingback might be abused for "link spamming" (similar to the way that weblog ping-backs have been used to direct viewers to spam sites). As with many such services, an application needs to find a balance between maintaining ease of submission for useful information and blocking unwanted information. We have no easy solutions for this problem, and the caveats noted above about establishing integrity of provenance records apply similarly to information provided by ping-back calls.

When clients and servers are retrieving submitted URIs such as provenance descriptions and following or registering links, reasonable care should be taken to prevent malicious use such as distributed denial of service attacks (DDoS), cross-site request forgery (CSRF), spamming and hosting of inappropriate materials. Reasonable preventions might include same-origin policy, HTTP authorization, SSL, rate-limiting, spam filters, moderation queues, user acknowledgements and validation. It is out of scope for this document to specify how such mechanisms work and should be applied.


Provenance pingback uses an HTTP POST operation, which may be used for non-"safe" interactions in the sense of [ WEBARCH ] ( section 3.4 ). Care needs to be taken that user agents are not tricked into POSTing to incorrect URIs in such a way that may incur unintended effects or obligations. For example, a malicious site may present a pingback URI that executes an instruction on a different web site. Risks of such abuse may be mitigated by: performing pingbacks only to URIs from trusted sources; performing pingbacks only to the same origin as the provider of the pingback URI (like in-browser javascript same-origin restrictions); not sending credentials with pingback requests that were not obtained specifically for that purpose; and any other measures that may be appropriate.
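One of the mitigations mentioned above, restricting pingbacks to the same origin as the resource that supplied the pingback URI, might be checked as in this sketch. The function names origin and same_origin are hypothetical, and the comparison of scheme, host and port is a simplified reading of the browser same-origin rule.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri):
    """Return the (scheme, host, port) triple identifying a URI's origin."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    # Fall back to the scheme's default port when none is given.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(resource_uri, pingback_uri):
    """True if the pingback URI shares the origin of the resource URI."""
    return origin(resource_uri) == origin(pingback_uri)
```

A client applying this check would decline to POST to a pingback URI on a different host, scheme or port from the resource that advertised it.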

Accessing provenance services might reveal to the service and to third parties information which is considered private, including which resources a client has taken interest in. For instance, a browser extension which collects all provenance data for a resource which is being saved to the local disk could be revealing user interest in a sensitive resource to a third-party site listed by a prov:has_provenance or prov:has_query_service relation. A detailed query submitted to a third-party provenance query service might reveal personal information such as social security numbers. Accordingly, user agents in particular SHOULD NOT follow provenance and provenance service links without first obtaining the user's explicit permission to do so.
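A user agent might apply the recommendation above by filtering discovered links through a consent step before dereferencing any of them. The following is a minimal sketch under stated assumptions: links are taken to be already parsed into (relation URI, target URI) pairs, and `provenance_links_to_follow` and the `ask_user` callback are hypothetical names introduced for this example.

```python
PROV_NS = "http://www.w3.org/ns/prov#"

# Link relations whose targets reveal user interest when dereferenced.
PROVENANCE_RELS = {
    PROV_NS + "has_provenance",
    PROV_NS + "has_query_service",
}


def provenance_links_to_follow(links, ask_user):
    """Return the provenance link targets that the user has approved.

    links    -- iterable of (relation_uri, target_uri) pairs
    ask_user -- callable (relation_uri, target_uri) -> bool; it must return
                True only when the user has given explicit permission
    """
    approved = []
    for rel, target in links:
        if rel not in PROVENANCE_RELS:
            continue  # Not a provenance link; outside the scope of this check.
        if ask_user(rel, target):
            approved.append(target)
    return approved
```

Nothing is retrieved inside the function; it only decides which URIs may be retrieved later, so no request reaches a third party before permission is given.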

A. Acknowledgements

The editors thank the members of the W3C Provenance Working Group for their contributions, reviews and feedback throughout the development of this specification.

Thanks to Erik Wilde and other members of the W3C Linked Data Platform working group for an extended discussion of REST service design issues, which has informed some aspects of the provenance service mechanisms.

Thanks to Robin Berjon for making our lives easier with his ReSpec tool.

Members of the PROV Working Group at the time of publication of this document were: Ilkay Altintas (Invited expert), Reza B'Far (Oracle Corporation), Khalid Belhajjame (University of Manchester), James Cheney (University of Edinburgh, School of Informatics), Sam Coppens (iMinds - Ghent University), David Corsar (University of Aberdeen, Computing Science), Stephen Cresswell (The National Archives), Tom De Nies (iMinds - Ghent University), Helena Deus (DERI Galway at the National University of Ireland, Galway, Ireland), Simon Dobson (Invited expert), Martin Doerr (Foundation for Research and Technology - Hellas (FORTH)), Kai Eckert (Invited expert), Jean-Pierre EVAIN (European Broadcasting Union, EBU-UER), James Frew (Invited expert), Irini Fundulaki (Foundation for Research and Technology - Hellas (FORTH)), Daniel Garijo (Ontology Engineering Group, Universidad Politécnica de Madrid, Spain), Yolanda Gil (Invited expert), Ryan Golden (Oracle Corporation), Paul Groth (Vrije Universiteit), Olaf Hartig (Invited expert), David Hau (National Cancer Institute, NCI), Sandro Hawke (W3C/MIT), Jörn Hees (German Research Center for Artificial Intelligence (DFKI) GmbH), Ivan Herman (W3C/ERCIM), Ralph Hodgson (TopQuadrant), Hook Hua (Invited expert), Trung Dong Huynh (University of Southampton), Graham Klyne (University of Oxford), Michael Lang (Revelytix, Inc.), Timothy Lebo (Rensselaer Polytechnic Institute), James McCusker (Rensselaer Polytechnic Institute), Deborah McGuinness (Rensselaer Polytechnic Institute), Simon Miles (Invited expert), Paolo Missier (School of Computing Science, Newcastle University), Luc Moreau (University of Southampton), James Myers (Rensselaer Polytechnic Institute), Vinh Nguyen (Wright State University), Edoardo Pignotti (University of Aberdeen, Computing Science), Paulo da Silva Pinheiro (Rensselaer Polytechnic Institute), Carl Reed (Open Geospatial Consortium), Adam Retter (Invited expert), Christine Runnegar (Invited expert), Satya Sahoo (Invited expert), David Schaengold (Revelytix, Inc.), Daniel Schutzer (FSTC, Financial Services Technology Consortium), Yogesh Simmhan (Invited expert), Stian Soiland-Reyes (University of Manchester), Eric Stephan (Pacific Northwest National Laboratory), Linda Stewart (The National Archives), Ed Summers (Library of Congress), Maria Theodoridou (Foundation for Research and Technology - Hellas (FORTH)), Ted Thibodeau (OpenLink Software Inc.), Curt Tilmes (National Aeronautics and Space Administration), Craig Trim (IBM Corporation), Stephan Zednik (Rensselaer Polytechnic Institute), Jun Zhao (University of Oxford), Yuting Zhao (University of Aberdeen, Computing Science).

B. Terms added to prov: namespace

This specification defines the following additional names in the provenance namespace with URI http://www.w3.org/ns/prov# .

Name	Description	Definition ref
`ServiceDescription`	Type for a generic provenance query service. Mainly for use in RDF provenance query service descriptions, to facilitate discovery in linked data environments.	section 4.3 Provenance query service discovery
`DirectQueryService`	Type for a direct HTTP query service description. Mainly for use in RDF provenance query service descriptions, to distinguish direct HTTP query service descriptions from other query service descriptions.	section 4.1.1 Direct HTTP query service description
`has_anchor`	Indicates a target-URI for a resource, used by an associated provenance record.	section 3.2 Resource represented as HTML, section 3.3 Resource represented as RDF
`has_provenance`	Indicates a provenance-URI for a resource; the resource identified by this property presents a provenance record about its subject or anchor resource.	section 3.1 Resource accessed by HTTP, section 3.2 Resource represented as HTML
`has_query_service`	Indicates a provenance query service that can access provenance related to its subject or anchor resource.	section 3.1.1 Specifying Provenance Query Services
`describesService`	Relates a generic provenance query service resource (type `prov:ServiceDescription`) to a specific query service description (e.g. a `prov:DirectQueryService` or a `sd:Service`).	section 4.1 Provenance query service description
`provenanceUriTemplate`	Indicates a URI template string for constructing provenance-URIs	section 4.1.1 Direct HTTP query service description
`pingback`	Relates a resource to a provenance pingback service that may receive additional provenance links about the resource.	section 5. Provenance pingback

The ontology describing these terms is available here.

C. References

[HTTP11]: R. Fielding et al. Hypertext Transfer Protocol - HTTP/1.1. June 1999. RFC 2616. URL: http://www.ietf.org/rfc/rfc2616.txt
[INFO-ACC]: Weitzner, Abelson, Berners-Lee, Feigenbaum, Hendler, and Sussman. Information Accountability. Communications of the ACM, Jun. 2008, 82-87. URL: http://doi.acm.org/10.1145/1349026.1349043 (alt: http://dig.csail.mit.edu/2008/06/info-accountability-cacm-weitzner.pdf)
[LINK-REL]: M. Nottingham. Web Linking. October 2010. Internet RFC 5988. URL: http://www.ietf.org/rfc/rfc5988.txt
[PROV-CONSTRAINTS]: James Cheney; Paolo Missier; Luc Moreau; eds. Constraints of the PROV Data Model. 30 April 2013, W3C Recommendation. URL: http://www.w3.org/TR/2013/REC-prov-constraints-20130430/
[PROV-DC]: Daniel Garijo; Kai Eckert; eds. Dublin Core to PROV Mapping. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-dc-20130430/
[PROV-DICTIONARY]: Tom De Nies; Sam Coppens; eds. PROV Dictionary: Modeling Provenance for Dictionary Data Structures. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-dictionary-20130430/
[PROV-DM]: Luc Moreau; Paolo Missier; eds. PROV-DM: The PROV Data Model. 30 April 2013, W3C Recommendation. URL: http://www.w3.org/TR/2013/REC-prov-dm-20130430/
[PROV-LINKS]: Luc Moreau; Timothy Lebo; eds. Linking Across Provenance Bundles. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-links-20130430/
[PROV-N]: Luc Moreau; Paolo Missier; eds. PROV-N: The Provenance Notation. 30 April 2013, W3C Recommendation. URL: http://www.w3.org/TR/2013/REC-prov-n-20130430/
[PROV-O]: Timothy Lebo; Satya Sahoo; Deborah McGuinness; eds. PROV-O: The PROV Ontology. 30 April 2013, W3C Recommendation. URL: http://www.w3.org/TR/2013/REC-prov-o-20130430/
[PROV-OVERVIEW]: Paul Groth; Luc Moreau; eds. PROV-OVERVIEW: An Overview of the PROV Family of Documents. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-overview-20130430/
[PROV-PRIMER]: Yolanda Gil; Simon Miles; eds. PROV Model Primer. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-primer-20130430/
[PROV-SEM]: James Cheney; ed. Semantics of the PROV Data Model. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-sem-20130430
[PROV-XML]: Hook Hua; Curt Tilmes; Stephan Zednik; eds. PROV-XML: The PROV XML Schema. 30 April 2013, W3C Note. URL: http://www.w3.org/TR/2013/NOTE-prov-xml-20130430/
[RDF-CONCEPTS11]: Richard Cyganiak; David Wood; eds. RDF 1.1 Concepts and Abstract Syntax. Working Draft. URL: http://www.w3.org/TR/rdf11-concepts/
[REST]: R. Fielding. Representational State Transfer (REST). 2000, Ph.D. dissertation. URL: http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
[REST-APIs]: R. Fielding. REST APIs must be hypertext driven. October 2008 (blog post). URL: http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
[RFC2119]: S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[RFC2392]: E. Levinson. Content-ID and Message-ID Uniform Resource Locators. August 1998. Internet RFC 2392. URL: http://www.ietf.org/rfc/rfc2392.txt
[RFC3986]: T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax (RFC 3986). January 2005. RFC. URL: http://www.ietf.org/rfc/rfc3986.txt
[RFC3987]: M. Dürst; M. Suignard. Internationalized Resource Identifiers (IRIs) (RFC 3987). January 2005. RFC. URL: http://www.ietf.org/rfc/rfc3987.txt
[SPARQL-HTTP]: Chimezie Ogbuji. SPARQL 1.1 Graph Store HTTP Protocol. 21 March 2013, W3C Recommendation. URL: http://www.w3.org/TR/sparql11-http-rdf-update/
[SPARQL-SD]: G. T. Williams. SPARQL 1.1 Service Description. 21 March 2013, W3C Recommendation. URL: http://www.w3.org/TR/sparql11-service-description/
[TURTLE]: Eric Prud'hommeaux; Gavin Carothers. Turtle: Terse RDF Triple Language. 19 February 2013, W3C Candidate Recommendation. URL: http://www.w3.org/TR/turtle/
[URI-template]: J. Gregorio; R. Fielding; M. Hadley; M. Nottingham; D. Orchard. URI Template. March 2012. Internet RFC 6570. URL: http://tools.ietf.org/html/rfc6570
[VoID]: Keith Alexander; Richard Cyganiak; Michael Hausenblas; Jun Zhao. Describing Linked Datasets with the VoID Vocabulary. 03 March 2011, W3C Interest Group Note. URL: http://www.w3.org/TR/void/
[WEBARCH]: Norman Walsh; Ian Jacobs. Architecture of the World Wide Web, Volume One. 15 December 2004, W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-webarch-20041215/
