Copyright © 2018 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.
By using DCAT to describe datasets in data catalogs, publishers are using a standard model and vocabulary that facilitates the consumption and aggregation of metadata from multiple catalogs, and in doing so can increase the discoverability of datasets. It also makes it possible to have a decentralized approach to publishing data catalogs and makes federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure. Aggregated DCAT metadata can serve as a manifest file as part of the digital preservation process.
The namespace for DCAT terms is http://www.w3.org/ns/dcat#
The suggested prefix for the DCAT namespace is dcat
The (revised) DCAT vocabulary is available here.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
Since the First Public Working Draft, the main changes to the DCAT vocabulary have been:
dcat:Resource class for representing any resource that can be included in the catalog; this is now the super-class of dcat:Dataset
dcat:DataService, as a sub-class of dcat:Resource, to support cataloguing service end-points providing access to resources
dcat:DataDistributionService, as a sub-class of dcat:DataService, representing service end-points providing access to datasets through their distributions
The detailed differences between the two documents can be seen here, and the list of all the changes since the previous version of DCAT is given in the Change History section.
This document is part of the output of the Dataset Exchange Working Group (DXWG). All documents from the group are listed here.
The DCAT documents are about the revised Data Catalog Vocabulary.
These documents give guidance on profiling. Some of the documents are general while some are technology-specific. Please consult the Profile Guidance [PROF-GUIDE] document for an overview of all profiling documents. It is the recommended starting point.
The original DCAT vocabulary (originally hosted at http://vocab.deri.ie/dcat) was developed at the Digital Enterprise Research Institute (DERI), refined by the eGov Interest Group, and then finally standardized in 2014 [VOCAB-DCAT-20140116] by the Government Linked Data (GLD) Working Group.
This revised version of DCAT was developed by the Dataset Exchange Working Group in response to a new set of Use Cases and Requirements [DCAT-UCR] submitted on the basis of experience with the DCAT vocabulary from the time of the original version, and new applications not originally considered. A summary of the changes from [VOCAB-DCAT-20140116] can be found in the Change History.
DCAT incorporates terms from pre-existing vocabularies where stable terms with appropriate meanings could be found, such as foaf:homepage and dct:title. Informal summary definitions of the externally-defined terms are included here for convenience, while authoritative definitions are available in the normative references. Changes to definitions in the references, if any, supersede the summaries given in this specification. Note that conformance to DCAT (Section 4) concerns usage of only the terms in the DCAT namespace itself, so possible changes to the external definitions will not affect the conformance of DCAT implementations.
This document was published by the Dataset Exchange Working Group as a Working Draft. This document is intended to become a W3C Recommendation.
Comments regarding this document are welcome. Please send them to public-dxwg-comments@w3.org (archives).
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 February 2018 W3C Process Document.
This section is non-normative.
From DCAT 2014 [VOCAB-DCAT-20140116]
Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various specialty formats. DCAT does not make any assumptions about the serialization format of the datasets described in a catalog. Other, complementary vocabularies MAY be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [VOID] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.
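As an illustration of combining DCAT with a complementary vocabulary, the following sketch types one resource as both a dcat:Dataset and a void:Dataset and attaches VoID statistics to it. The resource IRI, the title, the figures, and the SPARQL endpoint are hypothetical values chosen for the example; the void prefix is not part of the namespace table in this document.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
# Assumption: the empty prefix is bound to an example namespace.
@prefix :     <http://example.org/> .

# A dataset that is published as RDF, so VoID statistics apply to it.
:dataset-rdf
  a dcat:Dataset , void:Dataset ;
  dct:title "Hypothetical RDF dataset" ;
  void:triples 1250000 ;      # number of triples (illustrative figure)
  void:entities 84000 ;       # number of described entities (illustrative figure)
  void:sparqlEndpoint <http://example.org/sparql> ;
  .
```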
This document does not prescribe any particular method of deploying data expressed in DCAT. DCAT is applicable in many contexts including RDF accessible via SPARQL endpoints, embedded in HTML pages as RDFa, or serialized as e.g. RDF/XML or Turtle. The examples in this document use Turtle simply because of Turtle's readability.
This section is non-normative.
The original Recommendation [VOCAB-DCAT-20140116], published in January 2014, provided the basic framework for describing datasets. Importantly, it made the distinction between a dataset as an abstract idea and a distribution as a manifestation of the dataset. Although DCAT has been widely adopted, it has become clear that the original specification lacked a number of essential features that were added either through application profiles, such as the European Commission's DCAT-AP [DCAT-AP], or the development of larger vocabularies that, to a greater or lesser extent, built upon the base standard, such as the Healthcare and Life Sciences Community Profile [HCLS-Dataset], the Data Tag Suite [DATS] and more. This version of DCAT has been developed to address the specific shortcomings that have come to light through the experiences of different communities, the aim being, of course, to improve interoperability between the outputs of these larger vocabularies.
This draft includes re-writing of the specification throughout. Significant changes from the 2014 Recommendation are marked within the text using "Note" sections, as well as being described in the Change History.
The namespace for DCAT is http://www.w3.org/ns/dcat#. However, note that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [DCTERMS].
DCAT defines a minimal set of classes and properties of its own.
A full set of namespaces and prefixes used in this document is shown in the table below.
Prefix | Namespace |
---|---|
dcat | http://www.w3.org/ns/dcat# |
dct | http://purl.org/dc/terms/ |
dctype | http://purl.org/dc/dcmitype/ |
dqv | http://www.w3.org/ns/dqv# |
foaf | http://xmlns.com/foaf/0.1/ |
owl | http://www.w3.org/2002/07/owl# |
prov | http://www.w3.org/ns/prov# |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
schema | https://schema.org/ |
skos | http://www.w3.org/2004/02/skos/core# |
vcard | http://www.w3.org/2006/vcard/ns# |
xsd | http://www.w3.org/2001/XMLSchema# |
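
For readers who want to parse the Turtle examples in this document, the table above corresponds to the following prefix declarations. The binding of the empty prefix is an assumption made here for illustration; it is not listed in the table, and some later examples use further prefixes that the table does not define.

```turtle
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .
@prefix dqv:    <http://www.w3.org/ns/dqv#> .
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix prov:   <http://www.w3.org/ns/prov#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <https://schema.org/> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix vcard:  <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
# Assumption: an example namespace for the relative names such as :catalog.
@prefix :       <http://example.org/> .
```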
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST NOT, SHOULD, and SHOULD NOT are to be interpreted as described in [RFC2119].
Modified from DCAT 2014 [VOCAB-DCAT-20140116]
A data catalog conforms to DCAT if:
A DCAT profile is a specification for data catalogs that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile MAY include:
The requirement for a DCAT profile to conform to all of DCAT is under discussion.
This section is non-normative.
Significantly extended from DCAT 2014 [VOCAB-DCAT-20140116]
DCAT is an RDF vocabulary well-suited to representing data catalogs such as data.gov and data.gov.uk. DCAT defines eight main classes:
dcat:Catalog represents the catalog.
dcat:Resource represents an item described by an entry in a catalog.
dcat:Dataset represents a dataset in a catalog.
dcat:Distribution represents an accessible form or representation of a dataset, for example a downloadable file.
dcat:DataService represents a data service in a catalog. Example data services include data distribution services and data discovery services.
dcat:DataDistributionService represents a data service that provides access to distributions of datasets and extracts of datasets, such as an API.
dcat:DiscoveryService represents a data service that supports discovery functions.
dcat:CatalogRecord describes a dataset entry in the catalog, primarily concerning the registration information such as who added the item and when.

Along with the rest of the Vocabulary overview, this diagram is non-normative. Furthermore, while the diagram uses UML-style class notation, it should be interpreted following the usual RDF open-world assumptions around the presence/absence of properties, relationships, and their cardinality. The properties shown in each class reflect those recommended in the descriptions of classes in the Vocabulary specification. To assist in understanding the full scope of each class, properties are copied down from each '::super-class'. Cardinalities are shown in a few places to reinforce expectations, but these are not axiomatized or enforced in any way by this recommendation.
A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset is a conceptual entity, and can be represented by one or more distributions that serialize the dataset for transfer. Distributions of a dataset can be provided via data distribution services. Detailed properties for a data distribution service API are out of the scope of this version of DCAT.
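The relationship between a dataset, one of its distributions, and a data distribution service can be sketched as follows. This is a minimal illustration that uses only properties appearing in the examples of this document; all resource names and URLs are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
# Assumption: the empty prefix is bound to an example namespace.
@prefix :     <http://example.org/> .

# The conceptual entity.
:dataset-x
  a dcat:Dataset ;
  dct:title "Hypothetical dataset" ;
  dcat:distribution :dataset-x-csv ;
  .

# One serialization of the dataset, available for download.
:dataset-x-csv
  a dcat:Distribution ;
  dcat:downloadURL <http://example.org/files/x.csv> ;
  .

# A service end-point that provides access to the same dataset.
:service-x
  a dcat:DataDistributionService ;
  dcat:endpointURL <http://example.org/api> ;
  dcat:servesDataset :dataset-x ;
  .
```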
Datasets and data services, and potentially other types of thing, can be included in a catalog. Types of data service that might be found in a catalog include data distribution services, discovery services such as portals and catalog services, data transformation services such as coordinate transformation services, re-sampling and interpolation services, and various data processing services.
The scope of DCAT 2014 [VOCAB-DCAT-20140116] was limited to catalogs of datasets. A number of use cases for the revision involve also having data distribution services as members of a catalog - see DCAT Distribution to describe web services - ID6 and Modeling service-based data access - ID18. It has been decided to add an explicit class for Data Distribution Services in this revision of DCAT, and to enable these to be part of a Catalog. Provision for other data services to also be part of a Catalog is also made, as well as for Catalogs to be composed of other Catalogs. See Issue #172.
A CatalogRecord describes an entry in the catalog. Notice that while dcat:Resource represents the dataset or service itself, dcat:CatalogRecord is the record that describes the registration of an item in the catalog. The use of dcat:CatalogRecord is considered optional. It is used to capture provenance information about entries in a catalog. If this distinction is not necessary then dcat:CatalogRecord can be safely ignored.
There is ongoing discussion about whether a DCAT Resource represents only a dataset or also a distribution.
RDF allows resources to have global identifiers (IRIs) or to be blank nodes. Blank nodes can be used to denote resources without explicitly naming them with an IRI. They can appear in the subject and object position of a triple [RDF11-PRIMER]. While blank nodes may offer flexibility for some use cases, in a Linked Data context, blank nodes limit our ability to collaboratively annotate data. A blank node resource cannot be the target of a link and it can't be annotated with new information from new sources. As one of the biggest benefits of the Linked Data approach is that "anyone can say anything anywhere", use of blank nodes undermines some of the advantages we can gain from wide adoption of the RDF model. Even within the closed world of a single application dataset, use of blank nodes can quickly become limiting when integrating new data [LinkedDataPatterns]. For these reasons, it is recommended that instances of the DCAT main classes have a global identifier, and use of blank nodes is generally discouraged when encoding DCAT in RDF.
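The difference can be illustrated with two ways of describing a distribution. In this sketch, which uses hypothetical names and URLs, the first distribution has a global identifier and can be linked to or annotated by other sources, while the second is a blank node that cannot be referenced from outside the graph.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
# Assumption: the empty prefix is bound to an example namespace.
@prefix :     <http://example.org/> .

# Recommended: the distribution is named with an IRI.
:dataset-a
  a dcat:Dataset ;
  dcat:distribution :dataset-a-csv ;
  .

:dataset-a-csv
  a dcat:Distribution ;
  dcat:downloadURL <http://example.org/files/a.csv> ;
  .

# Discouraged: the distribution is a blank node.
:dataset-b
  a dcat:Dataset ;
  dcat:distribution [
    a dcat:Distribution ;
    dcat:downloadURL <http://example.org/files/b.csv> ;
  ] ;
  .
```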
All RDF examples in this document are written in Turtle syntax [Turtle] and many are available from the DXWG code repository.
Each RDF example in this document is intended to demonstrate specific capabilities of DCAT, and therefore only shows a subset of all the potential properties and links which might appear in a complete DCAT resource.
This example provides a quick overview of how DCAT might be used to represent a government catalog and its datasets.
First, the catalog description:
:catalog
  a dcat:Catalog ;
  dct:title "Imaginary Catalog" ;
  rdfs:label "Imaginary Catalog" ;
  foaf:homepage <http://example.org/catalog> ;
  dct:publisher :transparency-office ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  dcat:dataset :dataset-001 , :dataset-002 , :dataset-003 ;
  .
The publisher of the catalog has the relative URI :transparency-office. Further description of the publisher can be provided as in the following example:
:transparency-office
  a foaf:Organization ;
  rdfs:label "Transparency Office" ;
  .
The catalog lists each of its datasets via the dcat:dataset property. In the example above, an example dataset was mentioned with the relative URI :dataset-001. A possible description of it using DCAT is shown below:
:dataset-001
  a dcat:Dataset ;
  dct:title "Imaginary dataset" ;
  dcat:keyword "accountability" , "transparency" , "payments" ;
  dct:creator :finance-employee-001 ;
  dct:issued "2011-12-05"^^xsd:date ;
  dct:modified "2011-12-05"^^xsd:date ;
  dcat:contactPoint <http://example.org/transparency-office/contact> ;
  dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ;
  dct:spatial <http://www.geonames.org/6695072> ;
  dct:publisher :finance-ministry ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-W> ;
  dcat:distribution :dataset-001-csv ;
  .
In order to express the frequency of update in the example above, we chose to use an instance from the Content-Oriented Guidelines developed as part of the W3C Data Cube Vocabulary [VOCAB-DATA-CUBE] efforts. Additionally, we chose to describe the spatial and temporal coverage of the example dataset using URIs from Geonames and the Interval dataset (originally available from http://reference.data.gov.uk/id/interval) from data.gov.uk, respectively. A contact point is also provided where comments and feedback about the dataset can be sent. Further details about the contact point, such as email address or telephone number, can be provided using vCard [VCARD-RDF].
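A vCard description of the contact point might look like the following sketch. The contact IRI is the one used in the dataset example above; the name, email address, and telephone number are hypothetical placeholder values.

```turtle
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .

# Details for the resource referenced by dcat:contactPoint.
<http://example.org/transparency-office/contact>
  a vcard:Organization ;
  vcard:fn "Transparency Office" ;                  # formatted name (hypothetical)
  vcard:hasEmail <mailto:contact@example.org> ;     # hypothetical address
  vcard:hasTelephone <tel:+15550100> ;              # hypothetical number
  .
```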
The dataset distribution :dataset-001-csv can be downloaded as a 5 KB CSV file. This information is represented via an RDF resource of type dcat:Distribution.
:dataset-001-csv
  a dcat:Distribution ;
  dcat:downloadURL <http://www.example.org/files/001.csv> ;
  dct:title "CSV distribution of imaginary dataset 001" ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
  dcat:byteSize "5120"^^xsd:decimal ;
  .
The catalog classifies its datasets according to a set of domains represented by the relative URI :themes. SKOS can be used to describe the domains used:
:catalog dcat:themeTaxonomy :themes .

:themes
  a skos:ConceptScheme ;
  skos:prefLabel "A set of domains to classify documents" ;
  .

:dataset-001 dcat:theme :accountability .
Notice that this dataset is classified under the domain represented by the relative URI :accountability. It is recommended to define the concept as part of the concept scheme identified by the URI :themes that was used to describe the catalog domains. An example SKOS description:
:accountability
  a skos:Concept ;
  skos:inScheme :themes ;
  skos:prefLabel "Accountability" ;
  .
The type or genre of a dataset can be indicated using the dct:type property. It is recommended that the value of the property be taken from a well-governed and broadly recognised set of resource types, such as the DCMI Type Vocabulary, the MARC Genre/Terms Scheme, the ISO 19115 MD_Scope codes, the DataCite resource types, or the PARSE.Insight content-types from Re3data [RE3DATA-SCHEMA].
In the following examples, a (notional) dataset is classified separately using values from different vocabularies.
:dataset-001
  rdf:type dcat:Dataset ;
  dct:type <http://purl.org/dc/dcmitype/Text> ;
  .

:dataset-001
  rdf:type dcat:Dataset ;
  dct:type <http://id.loc.gov/vocabulary/marcgt/man> ;
  .
It is also possible for multiple classifications to be present in a single description.
:dataset-001
  rdf:type dcat:Dataset ;
  dct:type <http://purl.org/dc/dcmitype/Text> ;
  dct:type <http://id.loc.gov/vocabulary/marcgt/man> ;
  dct:type <http://registry.it.csiro.au/def/datacite/resourceType/Text> ;
  dct:type <http://registry.it.csiro.au/def/re3data/contentType/doc> ;
  .

<http://registry.it.csiro.au/def/datacite/resourceType/Text>
  rdfs:label "Text" ;
  dct:source "DataCite resource types" ;
  .

<http://registry.it.csiro.au/def/re3data/contentType/doc>
  rdfs:label "Standard office documents" ;
  dct:source "Re3data content types" ;
  .
If the catalog publisher decides to keep metadata describing its records (i.e. the records containing metadata describing the datasets), dcat:CatalogRecord can be used. For example, while :dataset-001 was issued on 2011-12-05, its description on Imaginary Catalog was added on 2011-12-11. This can be represented by DCAT as in the following:
:catalog dcat:record :record-001 .

:record-001
  a dcat:CatalogRecord ;
  foaf:primaryTopic :dataset-001 ;
  dct:issued "2011-12-11"^^xsd:date ;
  .
:dataset-002 is available as a CSV file. However, :dataset-002 can only be obtained through a Web page where the user needs to follow some links, provide some information and check some boxes before accessing the data.
:dataset-002
  a dcat:Dataset ;
  dcat:landingPage <http://example.org/dataset-002.html> ;
  dcat:distribution :dataset-002-csv ;
  .

:dataset-002-csv
  a dcat:Distribution ;
  dcat:accessURL <http://example.org/dataset-002.html> ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
  .
On the other hand, :dataset-003 can be obtained through some landing page but also can be downloaded from a known URL.
:dataset-003
  a dcat:Dataset ;
  dcat:landingPage <http://example.org/dataset-003.html> ;
  dcat:distribution :dataset-003-csv ;
  .

:dataset-003-csv
  a dcat:Distribution ;
  dcat:downloadURL <http://example.org/dataset-003.csv> ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
  .
The background to this example is discussed in Best practice for a loosely-structured catalog.
In many legacy catalogues and repositories (e.g. CKAN), ‘datasets’ are ‘just a bag of files’. There is no distinction made between part/whole, distribution (representation), and other kinds of relationship (e.g. documentation, schema, supporting documents) from the dataset to each of the files.
If the nature of the relationships between a dataset and component resources in a catalogue, repository, or elsewhere is not known, dct:relation can be used:
:d33937
  dct:description "A set of RDF graphs representing the International [Chrono]stratigraphic Chart, ..." ;
  dct:identifier "https://doi.org/10.25919/5b4d2b83cbf2d"^^xsd:anyURI ;
  dct:creator <https://orcid.org/0000-0002-3884-3420> ;
  dct:relation <https://vocabs.ands.org.au/viewById/196> ;
  dct:relation :ChronostratChart2017-02.pdf ;
  dct:relation :ChronostratChart2017-02.jpg ;
  dct:relation :timescale.zip ;
  dct:relation :isc2017.jsonld ;
  dct:relation :isc2017.nt ;
  dct:relation :isc2017.rdf ;
  dct:relation :isc2017.ttl ;
  .
If it is clear that any of these related resources is a proper representation of the dataset, dcat:distribution should be used.
:d33937
  rdf:type dcat:Dataset ;
  dct:description "A set of RDF graphs representing the International [Chrono]stratigraphic Chart, ..." ;
  dct:identifier "https://doi.org/10.25919/5b4d2b83cbf2d"^^xsd:anyURI ;
  dct:relation <https://vocabs.ands.org.au/viewById/196> ;
  dct:relation :ChronostratChart2017-02.pdf ;
  dct:relation :ChronostratChart2017-02.jpg ;
  dct:relation :timescale.zip ;
  dcat:distribution :d33937-jsonld ;
  dcat:distribution :d33937-nt ;
  dcat:distribution :d33937-rdf ;
  dcat:distribution :d33937-ttl ;
  .

:d33937-jsonld
  rdf:type dcat:Distribution ;
  dcat:downloadURL :isc2017.jsonld ;
  dcat:byteSize "698039"^^xsd:decimal ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/application/ld+json> ;
  .

:d33937-nt
  rdf:type dcat:Distribution ;
  dcat:downloadURL :isc2017.nt ;
  dcat:byteSize "2047874"^^xsd:decimal ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/application/n-triples> ;
  .

:d33937-rdf
  rdf:type dcat:Distribution ;
  dcat:downloadURL :isc2017.rdf ;
  dcat:byteSize "1600569"^^xsd:decimal ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/application/rdf+xml> ;
  .

:d33937-ttl
  rdf:type dcat:Distribution ;
  dcat:downloadURL :isc2017.ttl ;
  dcat:byteSize "531703"^^xsd:decimal ;
  dcat:mediaType <https://www.iana.org/assignments/media-types/text/turtle> ;
  .
This example is available from the DXWG code repository at csiro-dap-examples.ttl
The provenance or business context of a dataset can be described using elements from the W3C Provenance Ontology [PROV-O].
For example, a simple link from a dataset description to the project that generated the dataset can be formalized as follows (other details elided for clarity):
dap:atnf-P366-2003SEPT
  rdf:type dcat:Dataset ;
  dct:bibliographicCitation "Burgay, M; McLaughlin, M; Kramer, M; Lyne, A; Joshi, B; Pearce, G; D'Amico, N; Possenti, A; Manchester, R; Camilo, F (2017): Parkes observations for project P366 semester 2003SEPT. v1. CSIRO. Data Collection. https://doi.org/10.4225/08/598dc08d07bb7" ;
  dct:title "Parkes observations for project P366 semester 2003SEPT" ;
  dcat:landingPage <https://data.csiro.au/dap/landingpage?pid=csiro:P366-2003SEPT> ;
  prov:wasGeneratedBy dap:P366 ;
  .

dap:P366
  rdf:type prov:Activity ;
  dct:type "Observation" ;
  prov:startedAtTime "2000-11-01"^^xsd:date ;
  prov:used dap:Parkes-radio-telescope ;
  prov:wasInformedBy dap:ATNF ;
  rdfs:label "P366 - Parkes multibeam high-latitude pulsar survey" ;
  rdfs:seeAlso <https://doi.org/10.1111/j.1365-2966.2006.10100.x> ;
  .
This example is available from the DXWG code repository at csiro-dap-examples.ttl
Several properties capture provenance information, including within the citation and title, but the primary link to a formal description of the project is through prov:wasGeneratedBy. A terse description of the project is shown as a prov:Activity, though this would not necessarily be part of the same catalog. Note that as the project is ongoing, the activity has no end date.
Further provenance information might be provided using the other starting point properties from PROV, in particular prov:wasAttributedTo (to link to an agent associated with the dataset production) and prov:wasDerivedFrom (to link to a predecessor dataset). Both of these complement Dublin Core properties already used in DCAT, as follows:
For a more detailed discussion of the use of PROV for dataset provenance, including recommendations on the use of qualified properties, see the chapter on Provenance Patterns below.
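The two PROV starting-point properties mentioned above, prov:wasAttributedTo and prov:wasDerivedFrom, can be sketched as follows. The dataset and agent names are hypothetical and the descriptions are deliberately minimal.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
# Assumption: the empty prefix is bound to an example namespace.
@prefix :     <http://example.org/> .

# A dataset linked to the agent associated with its production
# and to the predecessor dataset it was derived from.
:dataset-v2
  a dcat:Dataset ;
  dct:title "Hypothetical dataset, second release" ;
  prov:wasAttributedTo :research-team ;
  prov:wasDerivedFrom :dataset-v1 ;
  .

:research-team a prov:Agent .

:dataset-v1 a dcat:Dataset .
```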
Data services may be described using DCAT. The values of the classifiers dct:type, dct:conformsTo, and dcat:endpointDescription provide progressively more detail about a service, whose actual endpoint is given by the dcat:endpointURL.
The first example describes a data catalog hosted by the European Environment Agency.
This is classified as a dcat:DiscoveryService and also has the dct:type set to discovery from the INSPIRE classification of spatial data service types.
This example is available from the DXWG code repository at eea-csw.ttl
a:EEA-CSW-Endpoint
  rdf:type dcat:DiscoveryService ;
  dc:subject "infoCatalogueService"@en ;
  dct:accessRights <http://publications.europa.eu/resource/authority/access-right/PUBLIC> ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/csw> ;
  dct:description "The EEA public catalogue of spatial datasets references the spatial datasets used by the European Environment Agency as well as the spatial datasets produced by or for the EEA. In the latter case, when datasets are publicly available, a link to the location from where they can be downloaded is included in the dataset's metadata. The catalogue has been initially populated with the most important spatial datasets already available on the data&maps section of the EEA website and is currently updated with any newly published spatial dataset."@en ;
  dct:identifier "eea-sdi-public-catalogue" ;
  dct:issued "2012-01-01"^^xsd:date ;
  dct:license <https://creativecommons.org/licenses/by/2.5/dk/> ;
  dct:spatial [
    rdf:type dct:Location ;
    locn:geometry "<gml:Envelope srsName=\"http://www.opengis.net/def/crs/OGC/1.3/CRS84\"><gml:lowerCorner>-180 -90</gml:lowerCorner><gml:upperCorner>180 90</gml:upperCorner></gml:Envelope>"^^gsp:gmlLiteral ;
    locn:geometry "POLYGON((-180 90,180 90,180 -90,-180 -90,-180 90))"^^gsp:wktLiteral ;
  ] ;
  dct:title "European Environment Agency's public catalogue of spatial datasets."@en ;
  dct:type <http://inspire.ec.europa.eu/metadata-codelist/ResourceType/service> ;
  dct:type <http://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/discovery> ;
  dcat:contactPoint a:EEA ;
  dcat:endpointDescription <https://sdi.eea.europa.eu/catalogue/srv/eng/csw?service=CSW&request=GetCapabilities> ;
  dcat:endpointURL <http://sdi.eea.europa.eu/catalogue/srv/eng/csw> ;
  .
The next example shows a dataset hosted by Geoscience Australia, which is available from three distinct services, as indicated by the value of the dcat:servesDataset property of each of the service descriptions.
These are classified as a dcat:DataDistributionService and also have the dct:type set to download and view from the INSPIRE classification of spatial data service types.
This example is available from the DXWG code repository at ga-courts.ttl
ga-courts:jc
  rdf:type dcat:Dataset ;
  dct:description "The dataset contains spatial locations, in point format, of the Australian High Court, Australian Federal Courts and the Australian Magistrates Courts." ;
  dct:spatial [
    rdf:type dct:Location ;
    locn:geometry "<gml:Envelope srsName=\"http://www.opengis.net/def/crs/EPSG/0/4283\"><gml:lowerCorner>115.864566 -42.885989</gml:lowerCorner><gml:upperCorner>153.276835 -12.460578</gml:upperCorner></gml:Envelope>"^^gsp:gmlLiteral ;
  ] ;
  dct:title "Judicial Courts" ;
  dct:type <http://purl.org/dc/dcmitype/Dataset> ;
  dcat:landingPage <https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/cc365600-294a-597d-e044-00144fdd4fa6> ;
  .

ga-courts:jc-esri
  rdf:type dcat:DataDistributionService ;
  dct:conformsTo <https://developers.arcgis.com/rest/> ;
  dct:description "This web service provides access to the National Judicial Courts dataset and presents the spatial locations of all the known Australian High Courts, Australian Federal Courts and the Australian Federal Circuit Courts located within Australia, all complemented with feature attribution." ;
  dct:identifier "2b8540c8-4a43-144d-e053-12a3070a3ff7" ;
  dct:title "National Judicial Courts MapServer" ;
  dct:type <http://purl.org/dc/dcmitype/Service> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/download> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/view> ;
  dcat:endpointURL <http://services.ga.gov.au/gis/rest/services/Judicial_Courts/MapServer> ;
  dcat:landingPage <https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/2b8540c8-4a43-144d-e053-12a3070a3ff7> ;
  dcat:servesDataset ga-courts:jc ;
  .

ga-courts:jc-wfs
  rdf:type dcat:DataDistributionService ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wfs/2.0.0> ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wfs/1.1.0> ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wfs/1.0.0> ;
  dct:description "This web service provides access to the National Judicial Courts dataset and presents the spatial locations of all the known Australian High Courts, Australian Federal Courts and the Australian Federal Circuit Courts located within Australia, all complemented with feature attribution." ;
  dct:identifier "2b8540c8-4a42-144d-e053-12a3070a3ff7" ;
  dct:title "National Judicial Courts WFS" ;
  dct:type <http://purl.org/dc/dcmitype/Service> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/download> ;
  dcat:endpointDescription <http://services.ga.gov.au/gis/services/Judicial_Courts/MapServer/WFSServer?request=GetCapabilities&service=WFS> ;
  dcat:endpointURL <http://services.ga.gov.au/gis/services/Judicial_Courts/MapServer/WFSServer> ;
  dcat:landingPage <https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/2b8540c8-4a42-144d-e053-12a3070a3ff7> ;
  dcat:servesDataset ga-courts:jc ;
  .

ga-courts:jc-wms
  rdf:type dcat:DataDistributionService ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wms/1.3> ;
  dct:description "This web service provides access to the National Judicial Courts dataset and presents the spatial locations of all the known Australian High Courts, Australian Federal Courts and the Australian Federal Circuit Courts located within Australia, all complemented with feature attribution." ;
  dct:identifier "2b8540c8-4a41-144d-e053-12a3070a3ff7" ;
  dct:title "National Judicial Courts WMS" ;
  dct:type <http://purl.org/dc/dcmitype/Service> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/view> ;
  dcat:endpointDescription <http://services.ga.gov.au/gis/services/Judicial_Courts/MapServer/WMSServer?request=GetCapabilities&service=WMS> ;
  dcat:endpointURL <http://services.ga.gov.au/gis/services/Judicial_Courts/MapServer/WMSServer> ;
  dcat:landingPage <https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/2b8540c8-4a41-144d-e053-12a3070a3ff7> ;
  dcat:servesDataset ga-courts:jc ;
  .
This section will contain more examples on the use of DCAT.
The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism can also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below.
Guidance on the use of DCAT in a weakly-axiomatized environment, such as schema.org, has been identified as a requirement to be satisfied in this revision of DCAT.
An RDF graph containing a proposed alignment of DCAT with schema.org is available. Comments on this alignment are invited.
The use of guarded constraints (existence, cardinality, range-type) to control the use of the recommended properties in the context of a class is being considered as part of the revision of DCAT.
The axiomatization of DCAT 2014 used global domain and range constraints for many of the properties defined in the DCAT namespace [VOCAB-DCAT-20140116]. This makes quite strong ontological commitments, some of which are now being reconsidered - see individual issues noted inline below.
The (revised) DCAT vocabulary is available in RDF. The primary artefact dcat.ttl is a serialization of the core DCAT vocabulary. Alongside it is a set of other RDF files that provide additional information.
The implementation of a DCAT 2014 profile of the revised DCAT is being considered.
The definitions (including domain and range) of terms outside the DCAT namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [DC11], [DCTERMS], [FOAF], [RDF-SCHEMA], [SKOS-REFERENCE], [XMLSCHEMA11-2] and [VCARD-RDF].
The following properties are recommended for use on this class: catalog record, hasPart, dataset, service, catalog, description, homepage, language, license, publisher, release date, rights, spatial/geographical, themes, title, update/modification date
The scope of DCAT 2014 was limited to catalogs of datasets [VOCAB-DCAT-20140116]. A number of use cases for the revision involve also having data distribution services as members of a catalog - see DCAT Distribution to describe web services - ID6 and Modeling service-based data access - ID18. It has been decided to add an explicit class for Data Distribution Services in this revision of DCAT, and to enable these to be part of a Catalog. Provision for other services to also be part of a Catalog will also be made, as well as for Catalogs to be composed of other Catalogs. See Issue #172 and Issue #116.
RDF Class: | dcat:Catalog |
---|---|
Sub-class of: | Dataset |
Definition: | A curated collection of metadata about datasets and data services |
Usage note: | A web-based data catalog is typically represented as a single instance of this class. |
See also: | Catalog record, Dataset |
RDF Property: | dct:title |
---|---|
Definition: | A name given to the catalog. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the catalog. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | Date of formal issuance (e.g., publication) of the catalog. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
See also: | resource release date, catalog record listing date and distribution release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the catalog was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
See also: | resource modification date, catalog record modification date and distribution modification date |
RDF Property: | dct:language |
---|---|
Definition: | A language of the catalog. This refers to the language used in the textual metadata describing titles, descriptions, etc. of the resources (i.e. datasets and services) in the catalog. |
Range: | dct:LinguisticSystem. Resources defined by the Library of Congress (1, 2) SHOULD be used. If an ISO 639-1 (two-letter) code is defined for the language, then its corresponding IRI SHOULD be used; if no ISO 639-1 code is defined, then the IRI corresponding to the ISO 639-2 (three-letter) code SHOULD be used. |
Usage note: | Multiple values can be used. The publisher might also choose to describe the language on the resource (e.g. dataset, or service) level (see dataset language). |
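
As an illustration of the range guidance above, the following minimal sketch declares two languages for a catalog. The `ex:` namespace and the catalog IRI are hypothetical, and the language IRIs assume the Library of Congress patterns `http://id.loc.gov/vocabulary/iso639-1/{code}` and `http://id.loc.gov/vocabulary/iso639-2/{code}`.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:catalog
  a dcat:Catalog ;
  # English has an ISO 639-1 (two-letter) code, so that IRI is used.
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  # Hawaiian has no ISO 639-1 code, so the ISO 639-2 (three-letter) IRI is used.
  dct:language <http://id.loc.gov/vocabulary/iso639-2/haw> ;
  .
```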
RDF Property: | foaf:homepage |
---|---|
Definition: | A homepage of the catalog (a public web document usually available in HTML). |
Range: | foaf:Document |
Usage note: | foaf:homepage is an inverse functional property (IFP) which means that it SHOULD be unique and precisely identify the catalog. This allows smushing various descriptions of the catalog when different URIs are used. |
RDF Property: | dct:publisher |
---|---|
Definition: | The entity responsible for making the catalog available. |
Usage note: | Resources of type foaf:Agent are recommended as values for this property. |
See also: | Class: Organization/Person |
RDF Property: | dct:spatial |
---|---|
Definition: | The geographical area covered by the catalog. |
Range: | dct:Location |
RDF Property: | dcat:themeTaxonomy |
---|---|
Definition: | A knowledge organization system (KOS) used to classify the catalog's datasets and services. |
Domain: | dcat:Catalog |
Range: | skos:ConceptScheme |
Models for the kind of license or rights representation indicated by the dct:license and dct:rights properties are being considered as part of the revision of DCAT. See also License and rights statements.
RDF Property: | dct:license |
---|---|
Definition: | A legal document under which the catalog itself, rather than the datasets or services it lists, is made available. |
Range: | dct:LicenseDocument |
Usage note: | If the license of the catalog applies to all of its datasets and distributions, it SHOULD also be replicated on each distribution. |
See also: | catalog rights, distribution license |
RDF Property: | dct:rights |
---|---|
Definition: | Information about the rights under which the catalog can be used/reused, but not the datasets or services listed. |
Range: | dct:RightsStatement |
Usage note: | If the rights associated with the catalog apply to all of its datasets and distributions, they SHOULD also be replicated on each distribution. |
See also: | catalog license, distribution rights |
Explicit use of this property added in this revision of DCAT.
RDF Property: | dct:hasPart |
---|---|
Definition: | An item that is listed in the catalog. |
Domain: | dcat:Catalog |
Range: | dcat:Resource |
Usage note: | This is the most general predicate for membership of a catalog. Use of a more specific sub-property is recommended when available. |
See also: | Sub-properties of dct:hasPart, in particular dcat:dataset, dcat:catalog and dcat:service. |
RDF Property: | dcat:dataset |
---|---|
Definition: | A collection of data that is listed in the catalog. |
Sub property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:Dataset |
Property added in this revision of DCAT.
RDF Property: | dcat:service |
---|---|
Definition: | A site or end-point that is listed in the catalog. |
Sub property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:DataService |
Property added in this revision of DCAT.
RDF Property: | dcat:catalog |
---|---|
Definition: | A catalog whose contents are of interest in the context of this catalog |
Sub property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:Catalog |
RDF Property: | dcat:record |
---|---|
Definition: | A record describing the registration of a single dataset or data service that is part of the catalog. |
Domain: | dcat:Catalog |
Range: | dcat:CatalogRecord |
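
The membership properties defined above can be combined on a single catalog. The following is a minimal sketch only: all `ex:` IRIs are hypothetical, and `dct:hasPart` is used for the one member that has no more specific membership property.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:catalog
  a dcat:Catalog ;
  dct:title "Example catalog"@en ;
  dcat:dataset ex:dataset-001 ;          # a dataset listed in the catalog
  dcat:service ex:service-001 ;          # a data service listed in the catalog
  dcat:catalog ex:partner-catalog ;      # another catalog of interest
  dct:hasPart ex:other-resource ;        # general membership, no specific property available
  dcat:record ex:record-001 ;            # the catalog's entry for ex:dataset-001
  .

ex:dataset-001     a dcat:Dataset .
ex:service-001     a dcat:DataService .
ex:partner-catalog a dcat:Catalog .
ex:other-resource  a dcat:Resource .

ex:record-001
  a dcat:CatalogRecord ;
  foaf:primaryTopic ex:dataset-001 ;
  .
```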
The following properties are recommended for use on this class: conformsTo, contact point, description, identifier, keyword/tag, landing page, resource language, relation, publisher, release date, theme/category, title, type/genre, update/modification date
The possible association of items with zero or multiple catalogs has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to link a catalogued resource with the source of funding that supported its production has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to describe the business or project context related to production of a catalogued resource has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Class: | dcat:Resource |
---|---|
Definition: | Resource published or curated by a single agent. |
Usage note: | The class of all catalogued resources, the superclass of dcat:Dataset, dcat:DataService, dcat:Catalog and any other member of a dcat:Catalog. This class carries properties common to all catalogued resources, including datasets and data services. It is strongly recommended to use a more specific sub-class when available. |
See also: | Catalog record |
The class dcat:Resource has been added to the DCAT vocabulary in this revision.
RDF Property: | dct:title |
---|---|
Definition: | A name given to the item. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the item. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | Date of formal issuance (e.g., publication) of the item. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
Usage note: | This property SHOULD be set using the first known date of issuance. |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the item was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
Usage note: | The value of this property indicates a change to the actual item, not a change to the catalog record. An absent value MAY indicate that the item has never changed after its initial publication, or that the date of last modification is not known, or that the item is continuously updated. |
See also: | dataset frequency |
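
The date ranges above call for ISO 8601 strings typed with an XML Schema datatype. A minimal sketch of what that might look like follows; the `ex:` IRIs and the date values are hypothetical, and the choice between `xsd:date` and `xsd:dateTime` depends on the precision the publisher actually knows.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-001
  a dcat:Dataset ;
  dct:title "Example dataset"@en ;
  # First known date of issuance, known only to the day.
  dct:issued "2011-12-05"^^xsd:date ;
  # Most recent change to the dataset itself (not to its catalog record).
  dct:modified "2018-03-01T09:30:00Z"^^xsd:dateTime ;
  .
```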
RDF Property: | dct:language |
---|---|
Definition: | The language of the item. |
Range: | dct:LinguisticSystem. Resources defined by the Library of Congress (1, 2) SHOULD be used. If an ISO 639-1 (two-letter) code is defined for the language, then its corresponding IRI SHOULD be used; if no ISO 639-1 code is defined, then the IRI corresponding to the ISO 639-2 (three-letter) code SHOULD be used. |
Usage note: | This overrides the value of the catalog language in case of conflict. If the item is available in multiple languages, use multiple values for this property. If each language is available separately for a dataset, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dct:language (i.e. the dataset will have multiple dct:language values and each distribution will have one of these languages as value of its dct:language property). |
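
The usage note above, for a dataset whose languages are available separately, might be realized as in the following sketch. The `ex:` IRIs are hypothetical; the language IRIs assume the Library of Congress ISO 639-1 vocabulary.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

# The dataset carries every language in which it is available.
ex:dataset-007
  a dcat:Dataset ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
               <http://id.loc.gov/vocabulary/iso639-1/fr> ;
  dcat:distribution ex:dataset-007-en , ex:dataset-007-fr ;
  .

# Each distribution carries exactly one of those languages.
ex:dataset-007-en
  a dcat:Distribution ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  .

ex:dataset-007-fr
  a dcat:Distribution ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/fr> ;
  .
```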
The desire to have qualified forms of properties (as done in [PROV-O]) has been raised. If dcat:Dataset is a prov:Entity (not decided yet) then `publisher` could be a qualified prov:Entity → prov:Agent relationship.
RDF Property: | dct:publisher |
---|---|
Definition: | The entity responsible for making the item available. |
Usage note: | Resources of type foaf:Agent are recommended as values for this property. |
See also: | Class: Organization/Person |
The desirability of dereferenceable identifiers has been raised as an item for consideration in the revision of DCAT. dct:identifier has limited expressivity for this. It has been suggested that ADMS identifier, or a property from another ontology might be recruited to help DCAT in this area.
The need to clearly distinguish between primary and legacy identifiers for a dataset has been identified as a requirement to be satisfied in the revision of DCAT.
The need to indicate the scheme or authority for identifiers for a dataset has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Property: | dct:identifier |
---|---|
Definition: | A unique identifier of the item. |
Range: | rdfs:Literal |
Usage note: | The identifier might be used as part of the URI of the item, but still having it represented explicitly is useful. |
In DCAT 2014 [VOCAB-DCAT-20140116] the domain of dcat:theme was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #123.
RDF Property: | dcat:theme |
---|---|
Definition: | A main category of the resource. A resource can have multiple themes. |
Sub property of: | dct:subject |
Range: | skos:Concept |
Usage note: | The set of skos:Concepts used to categorize the resources are organized in a skos:ConceptScheme describing all the categories and their relations in the catalog. |
See also: | catalog themes |
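
To illustrate how dcat:theme relates to the catalog's theme taxonomy, the following sketch defines a small concept scheme and classifies one dataset with it. The scheme, concept and dataset IRIs are all hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:catalog
  a dcat:Catalog ;
  dcat:themeTaxonomy ex:themes ;
  dcat:dataset ex:dataset-001 ;
  .

ex:themes
  a skos:ConceptScheme ;
  dct:title "Example themes"@en ;
  .

ex:accountability
  a skos:Concept ;
  skos:inScheme ex:themes ;
  skos:prefLabel "Accountability"@en ;
  .

ex:dataset-001
  a dcat:Dataset ;
  dcat:theme ex:accountability ;     # a concept from the catalog's scheme
  dcat:keyword "transparency"@en ;   # a free-text tag, for comparison
  .
```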
Added in DCAT revision - see Issue #64.
RDF Property: | dct:type |
---|---|
Definition: | The nature or genre of the resource. |
Sub property of: | dc:type |
Range: | rdfs:Class |
Usage note: | The value SHOULD be taken from a well governed and broadly recognised controlled vocabulary, such as the DCMI Types vocabulary [DCTERMS]. |
RDF Property: | dct:relation |
---|---|
Definition: | A resource with an unspecified relationship to the catalogued item. |
Usage note: | dct:relation SHOULD be used where the nature of the relationship between a catalogued item and related resources is not known. A more specific sub-property SHOULD be used if the nature of the relationship is known. The property dcat:distribution SHOULD be used to link from a dcat:Dataset to a representation of the dataset, described as a dcat:Distribution. |
See also: | Sub-properties of dct:relation, in particular dcat:distribution, dct:hasPart (and its sub-properties dcat:catalog, dcat:dataset, dcat:service), dct:isPartOf, dct:conformsTo, dct:isFormatOf, dct:hasFormat, dct:isVersionOf, dct:hasVersion, dct:replaces, dct:isReplacedBy, dct:references, dct:isReferencedBy, dct:requires, dct:isRequiredBy |
Many existing and legacy catalogues do not distinguish between dataset components, representations, documentation, schemata and other resources that are lumped together as part of a dataset. dct:relation is a super-property of a number of more specific properties which express more precise relationships, so use of dct:relation is not inconsistent with a subsequent reclassification with more specific semantics, though the more specialized sub-properties SHOULD be used to link a dataset to component and supplementary resources if possible.
The general need to describe relationships between datasets has been identified as a requirement to be satisfied in the revision of DCAT. Guidance on the use of more specific relationship predicates is required, particularly in the context of the dcat:Dataset sub-class.
Use of this Dublin Core Terms property in this context added in this revision of DCAT.
RDF Property: | dct:conformsTo |
---|---|
Definition: | An established standard to which the described resource conforms. |
Range: | dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.) |
In DCAT 2014 [VOCAB-DCAT-20140116] the domain of dcat:keyword was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #121.
RDF Property: | dcat:keyword |
---|---|
Definition: | A keyword or tag describing the resource. |
Range: | rdfs:Literal |
In DCAT 2014 [VOCAB-DCAT-20140116] the domain of dcat:contactPoint was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #95.
The axiomatization of dcat:contactPoint is being re-evaluated as part of the revision of DCAT.
RDF Property: | dcat:contactPoint |
---|---|
Definition: | Relevant contact information (provided using vCard [VCARD-RDF]) for the catalogued resource. |
Range: | vcard:Kind |
In DCAT 2014 [VOCAB-DCAT-20140116] the domain of dcat:landingPage was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #122.
RDF Property: | dcat:landingPage |
---|---|
Definition: | A Web page that can be navigated to in a Web browser to gain access to the catalog, a dataset, its distributions and/or additional information. |
Sub property of: | foaf:page |
Range: | foaf:Document |
Usage note: | If the distribution(s) are accessible only through a landing page (i.e. direct download URLs are not known), then the landing page link SHOULD be duplicated as dcat:accessURL on a distribution. (see 5.5 Dataset available only behind some Web page) |
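
The usage note above, for a dataset reachable only through a landing page, can be sketched as follows. The dataset, distribution and page IRIs are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-002
  a dcat:Dataset ;
  dcat:landingPage <http://example.org/dataset-002.html> ;
  dcat:distribution ex:dataset-002-dist ;
  .

# No direct download URL is known, so the landing page is repeated
# as the access URL of the distribution.
ex:dataset-002-dist
  a dcat:Distribution ;
  dcat:accessURL <http://example.org/dataset-002.html> ;
  .

<http://example.org/dataset-002.html> a foaf:Document .
```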
The following properties are recommended for use on this class: description, listing date, primary topic, title, update/modification date
The need to be able to express rights relating to the re-use of DCAT metadata has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to link a metadata record to its original source has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Class: | dcat:CatalogRecord |
---|---|
Definition: | A record in a catalog, describing the registration of a single dataset or data service. |
Usage note: | This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset or service and metadata about the entry in the catalog about the dataset or service. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [PROV-O] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset or its registration. |
See also: | Dataset |
If a catalog is represented as an RDF Dataset with named graphs (as defined in [SPARQL11-QUERY]), then it is appropriate to place the description of each dataset (consisting of all RDF triples that mention the dcat:Dataset, dcat:CatalogRecord, and any of its dcat:Distributions) into a separate named graph. The name of that graph SHOULD be the IRI of the catalog record.
RDF Property: | dct:title |
---|---|
Definition: | A name given to the record. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the record. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | The date of listing (i.e. formal recording) of the corresponding dataset or service in the catalog. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
Usage note: | This indicates the date of listing the dataset in the catalog and not the publication date of the dataset itself. |
See also: | resource release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the catalog entry was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
Usage note: | This indicates the date of last change of a catalog entry, i.e. the catalog metadata description of the dataset, and not the date of the dataset itself. |
See also: | resource modification date |
RDF Property: | foaf:primaryTopic |
---|---|
Definition: | The dcat:Resource (dataset or service) described in the record. |
Usage note: | The foaf:primaryTopic property is functional: each catalog record can have at most one primary topic, i.e. it describes exactly one dataset or service. |
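
The distinction between the dates of a catalog record and the dates of the resource it describes can be seen in the following sketch. All `ex:` IRIs and date values are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:catalog
  a dcat:Catalog ;
  dcat:dataset ex:dataset-001 ;
  dcat:record ex:record-001 ;
  .

# The record: when the dataset was listed, and when its entry last changed.
ex:record-001
  a dcat:CatalogRecord ;
  foaf:primaryTopic ex:dataset-001 ;   # at most one primary topic per record
  dct:issued "2018-06-15"^^xsd:date ;
  dct:modified "2018-09-01"^^xsd:date ;
  .

# The dataset: when the publishing agency originally made it available.
ex:dataset-001
  a dcat:Dataset ;
  dct:issued "2011-12-05"^^xsd:date ;
  .
```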
The class dcat:DataDistributionService has been added to the DCAT vocabulary in this revision.
In addition to the properties inherited from the super-class dcat:DataService, the following properties are recommended for use on this class: servesDataset
RDF Class: | dcat:DataDistributionService |
---|---|
Definition: | A site or end-point that provides access to datasets through distributions of the datasets |
Sub class of: | dcat:DataService |
RDF Property: | dcat:servesDataset |
---|---|
Definition: | A collection of data that this DataDistributionService can distribute |
Range: | dcat:Dataset |
The class dcat:DataService has been added to the DCAT vocabulary in this revision.
In addition to the properties inherited from the super-class dcat:Resource, the following properties are recommended for use on this class: endpointDescription, endpointURL, license, accessRights
RDF Class: | dcat:DataService |
---|---|
Definition: | A site or end-point for discovering, accessing or processing data or related resources. |
Sub class of: | dcat:Resource |
Sub class of: | dctype:Service |
RDF Property: | dcat:endpointURL |
---|---|
Definition: | The IRI of the service end-point. |
Domain: | dcat:DataService |
Range: | xsd:anyURI |
RDF Property: | dcat:endpointDescription |
---|---|
Definition: | A description of the service end-point, for example an OpenAPI (Swagger) description, an OGC GetCapabilities response, an SD Service, an OpenSearch or WSDL document. |
Domain: | dcat:DataService |
Range: | rdfs:Resource |
RDF Property: | dct:license |
---|---|
Definition: | A legal document under which the service is made available. |
Range: | dct:LicenseDocument |
See also: | distribution rights, catalog license |
RDF Property: | dct:accessRights |
---|---|
Definition: | Access Rights MAY include information regarding access or restrictions based on privacy, security, or other policies. |
Range: | dct:RightsStatement |
In addition to the properties inherited from the super-class dcat:Resource, the following properties are recommended for use on this class: creator, distribution, frequency, spatial/geographic coverage, temporal coverage, was generated by
Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
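
The following sketch shows one way to follow this guidance: the license is stated on every distribution and, optionally, repeated unchanged on the dataset. The `ex:` IRIs are hypothetical, and the Creative Commons license IRI is only an example value.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-006
  a dcat:Dataset ;
  dcat:distribution ex:dataset-006-csv , ex:dataset-006-json ;
  # Optional, and identical to the distributions' license to avoid conflicts.
  dct:license <https://creativecommons.org/licenses/by/4.0/> ;
  .

ex:dataset-006-csv
  a dcat:Distribution ;
  dct:license <https://creativecommons.org/licenses/by/4.0/> ;
  .

ex:dataset-006-json
  a dcat:Distribution ;
  dct:license <https://creativecommons.org/licenses/by/4.0/> ;
  .
```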
The need to more formally encode access restrictions for both datasets and distributions has been identified as a requirement to be satisfied in the revision of DCAT.
The need to provide richer descriptions of dataset aspects (e.g. instrument/sensor used, spatial feature, observable property, quantity kind) has been identified as a requirement to be satisfied in the revision of DCAT.
The need to provide better guidance and vocabulary elements for dataset citation has been identified as a requirement to be satisfied in the revision of DCAT.
Dataset citation is one of the requirements identified for the DCAT revision. Data citation is the practice of referencing data in a similar way as when providing bibliographic references, acknowledging data as a first class output in any investigative process. Data citation offers multiple benefits, such as crediting those producing the data, facilitating data discovery, and supporting the tracking of the impact and reuse of data. To support data citation, the dataset description should include at a minimum: the dataset identifier, the dataset creator (added in this DCAT revision as part of the properties recommended for Dataset), the dataset title, the dataset publisher and the dataset release date. The constraints on the availability of such properties in the dataset description can be represented as a DCAT data citation profile. See the wiki page on Data Citation for more discussion.
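
A dataset description carrying the minimum set of citation properties listed above might look like the following sketch. It is not a definition of the data citation profile; the `ex:` IRIs, the identifier and the agent names are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-004
  a dcat:Dataset ;
  dct:identifier "example-id-0004" ;          # dataset identifier
  dct:creator ex:research-group ;             # dataset creator
  dct:title "Example survey results"@en ;     # dataset title
  dct:publisher ex:data-centre ;              # dataset publisher
  dct:issued "2018-05-01"^^xsd:date ;         # dataset release date
  .

ex:research-group
  a foaf:Organization ;
  foaf:name "Example Research Group" ;
  .

ex:data-centre
  a foaf:Organization ;
  foaf:name "Example Data Centre" ;
  .
```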
The need to be able to link a dataset with publications arising from it has been identified as a requirement to be satisfied in the revision of DCAT.
The need to provide a more comprehensive method for describing dataset provenance has been identified as a requirement to be satisfied in the revision of DCAT. A preliminary alignment of DCAT with PROV-O is available.
The need to be able to provide summary statistics about a dataset has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to provide usage notes for a dataset or distribution has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to provide citations for a distribution has been identified as a potential requirement to be satisfied in the revision of DCAT.
RDF Class: | dcat:Dataset |
---|---|
Definition: | A collection of data, published or curated by a single agent, and available for access or download in one or more formats. |
Sub class of: | dcat:Resource |
Usage note: | This class represents the actual dataset as published by the dataset publisher. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date and maintainer might differ), the catalog record class can be used for the latter. |
In DCAT 2014 [VOCAB-DCAT-20140116] dcat:Dataset was a sub-class of dctype:Dataset, which is a member of the DCMI Types vocabulary [DCTERMS]. The scope of dcat:Dataset also includes other members of the DCMI Types vocabulary, so the implicitly limited sub-class relationship from DCAT 2014 [VOCAB-DCAT-20140116] has been removed in this revised DCAT vocabulary - see Issue #98.
Note that members of the DCMI Types vocabulary may appear as the value of the dct:type property, as shown in Classifying dataset types.
New property for Dataset in this revision of DCAT, added when considering the data citation requirement.
RDF Property: | dct:creator |
---|---|
Definition: | The entity responsible for producing the dataset. |
Range: | foaf:Agent |
Usage note: | Resources of type foaf:Agent are recommended as values for this property. |
See also: | Class: Organization/Person |
RDF Property: | dcat:distribution |
---|---|
Definition: | An available distribution of the dataset. |
Sub property of: | dct:relation |
Domain: | dcat:Dataset |
Range: | dcat:Distribution |
Issue #81 concerns the general need to describe relationships between datasets. Guidance on the use of specializations of dct:relation is required in the context of dcat:Dataset.
RDF Property: | dct:accrualPeriodicity |
---|---|
Definition: | The frequency at which the dataset is published. |
Range: | dct:Frequency (A rate at which something recurs) |
The need to indicate the spatial reference system used in the spatial description of a dataset has been identified as a requirement to be satisfied in the revision of DCAT.
The need to be able to describe the spatial coverage of a dataset as a geometry has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Property: | dct:spatial |
---|---|
Definition: | The geographical area covered by the dataset. |
Range: | dct:Location (A spatial region or named place) |
The need to be able to describe the temporal coverage of a dataset in a structured way has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Property: | dct:temporal |
---|---|
Definition: | The temporal period that the dataset covers. |
Range: | dct:PeriodOfTime (An interval of time that is named or defined by its start and end dates) |
RDF Property: | prov:wasGeneratedBy |
---|---|
Definition: | An activity that generated, or provides the business context for, the creation of the dataset. |
Domain: | prov:Entity |
Range: | prov:Activity (An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.) |
Usage note: | The activity associated with generation of a dataset will typically be an initiative, project, mission, survey, on-going activity ("business as usual") etc. Multiple prov:wasGeneratedBy properties can be used to indicate the dataset production context at various levels of granularity. |
Usage note: | Use prov:qualifiedGeneration to attach additional details about the relationship between the dataset and the activity, e.g. the exact time that the dataset was produced during the lifetime of a project. |
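
The two usage notes above can be sketched together as follows, using terms from [PROV-O]. The dataset and activity IRIs and all time values are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-003
  a dcat:Dataset ;
  # Simple form: the activity that generated the dataset.
  prov:wasGeneratedBy ex:survey-2017 ;
  # Qualified form: the same relationship, with the time of generation attached.
  prov:qualifiedGeneration [
    a prov:Generation ;
    prov:activity ex:survey-2017 ;
    prov:atTime "2017-11-30T16:00:00Z"^^xsd:dateTime
  ] ;
  .

ex:survey-2017
  a prov:Activity ;
  prov:startedAtTime "2017-01-01T00:00:00Z"^^xsd:dateTime ;
  prov:endedAtTime "2017-12-31T23:59:59Z"^^xsd:dateTime ;
  .
```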
New property in this context in this revision of DCAT.
Details about how to describe the activity that generated a dataset, such as a project, initiative, on-going activity, mission or survey, are out of scope for this document. prov:Activity provides for some basic properties such as begin and end time, associated agents etc. Further details may be provided through classes defined in applications. A number of ontologies for describing projects are available, for example VIVO for academic research projects [VIVO-ISF], DOAP (Description of a Project) for software projects [DOAP], and DBpedia for general projects [DBPEDIA-ONT], which are expected to be suitable for different applications.
In addition to the properties inherited from the super-class dcat:DataDistributionService, the following properties are recommended for use on this class: none yet
RDF Class: | dcat:DiscoveryService |
---|---|
Definition: | A site or end-point that supports data discovery, usually by providing access to catalogs |
Sub class of: | dcat:DataDistributionService |
The class dcat:DiscoveryService has been added to the DCAT vocabulary in this revision.
The following properties are recommended for use on this class: access URL, access service, byte size, conforms to, description, download URL, format, license, media type, release date, rights, title, update/modification date
The packaging of files in a dcat:Distribution is being considered as part of the revision of DCAT.
The need to more formally encode access restrictions for both datasets and distributions has been identified as a requirement to be satisfied in the revision of DCAT.
RDF Class: | dcat:Distribution |
---|---|
Definition: | A specific representation of a dataset. A dataset might be available in several different forms, and these forms might comprise different serializations or different schematic arrangements of the same data. Examples of distributions include a CSV file, a netCDF file, or a data-cube. |
Usage note: | This represents a general availability of a dataset. It implies no information about the actual access method of the data, i.e. whether by direct download, API, or through a Web page. Use of the dcat:downloadURL property indicates a directly downloadable distribution. |
See also: | Data distribution service |
The scope of dcat:Distribution here is narrower than in DCAT-2014 [VOCAB-DCAT-20140116], where it also included APIs and feeds. Data catalogues designed using DCAT-2014 therefore used instances of type dcat:Distribution to describe data distribution services. Applications consuming DCAT should be aware that catalogues designed using DCAT-2014 might use dcat:Distribution to represent both services and representations.
Under the revised scope, instances of type dcat:Distribution SHOULD be limited to representations of datasets which might be transported as files, and SHOULD NOT be used for data services such as APIs or feeds. Data services including APIs and feeds SHOULD be described using instances of type dcat:DataService whose sub-class dcat:DataDistributionService MAY serve dcat:Distributions.
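
Under this revised scope, a file-based representation and an API for the same dataset would be described separately, as in the following sketch. All `ex:` IRIs and URLs are hypothetical; the end-point is given as an IRI, following the service example earlier in this document.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/catalog/> .   # hypothetical namespace

ex:dataset-005
  a dcat:Dataset ;
  dcat:distribution ex:dataset-005-csv ;
  .

# A representation that can be transported as a file: a Distribution.
ex:dataset-005-csv
  a dcat:Distribution ;
  dct:title "CSV export"@en ;
  dcat:downloadURL <http://example.org/files/dataset-005.csv> ;
  dcat:accessService ex:api ;      # the same data is also served by the API below
  .

# An API: described as a data service, not as a Distribution.
ex:api
  a dcat:DataDistributionService ;
  dct:title "Example data API"@en ;
  dcat:endpointURL <http://example.org/api/> ;
  dcat:servesDataset ex:dataset-005 ;
  .
```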
Links between a dcat:Distribution and services or web addresses where it can be accessed are expressed using dcat:accessURL, dcat:accessService, dcat:downloadURL, as shown in Figure 1 and described in the definitions below.
The definition text of dcat:Distribution has been revised to clarify that distributions are primarily representations of datasets. As such, all distributions of a given dataset should be informationally equivalent.
The intention of the phrase "informationally equivalent" needs to be clarified, in particular as different serializations may have different expressivity.
RDF Property: | dct:title |
---|---|
Definition: | A name given to the distribution. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the distribution. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | Date of formal issuance (e.g., publication) of the distribution. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
Usage note: | This property SHOULD be set using the first known date of issuance. |
See also: | resource release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the distribution was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] |
See also: | resource modification date |
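The typed literals described in the ranges of dct:issued and dct:modified might look as follows. This is a sketch only: the distribution URI is hypothetical, and the choice between xsd:date and xsd:dateTime depends on the precision the publisher actually knows.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

ex:busStops-csv a dcat:Distribution ;
    # First known date of issuance, known only to the day.
    dct:issued   "2018-01-10"^^xsd:date ;
    # Most recent change, known to the second.
    dct:modified "2018-05-27T02:52:02Z"^^xsd:dateTime .
```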
Models for the kind of license or rights representation indicated by the dct:license and dct:rights properties are being considered as part of the revision of DCAT. See also License and rights statements.
RDF Property: | dct:license |
---|---|
Definition: | A legal document under which the distribution is made available. |
Range: | dct:LicenseDocument |
Usage note: | Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. |
See also: | distribution rights, catalog license |
RDF Property: | dct:rights |
---|---|
Definition: | Information about rights held in and over the distribution. |
Range: | dct:RightsStatement |
Usage note: | dct:license, which is a sub-property of dct:rights, can be used to link a distribution to a license document. However, dct:rights allows linking to a rights statement that can include licensing information as well as other information that supplements the licence such as attribution. Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. |
See also: | distribution license, catalog rights |
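The usage notes for dct:license and dct:rights can be illustrated with the following sketch, in which licence and rights information is attached at the level of the distribution. The ex: resources and the wording of the rights statement are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

ex:busStops-csv a dcat:Distribution ;
    # A legal document under which the distribution is made available.
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
    # A rights statement carrying supplementary information such as attribution.
    dct:rights ex:busStops-rights .

ex:busStops-rights a dct:RightsStatement ;
    rdfs:label "Please attribute the (hypothetical) Example Transport Agency."@en .
```

If licence information is also given on the dataset, it would normally repeat the values given here rather than introduce different ones.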
The granularity of dcat:accessURL is being re-considered to provide different usages for list and item endpoints as well as supporting the declaration of different profiles (for list results and data payload).
RDF Property: | dcat:accessURL |
---|---|
Definition: | A URL of the resource that gives access to a distribution of the dataset. E.g. landing page, feed, SPARQL endpoint. |
Domain: | dcat:Distribution |
Range: | rdfs:Resource |
Usage note: | dcat:accessURL SHOULD be used for the address of a service or location that can provide access to this distribution, typically through a web form, query or API call.
dcat:downloadURL is preferred for direct links to downloadable resources. If the distributions are accessible only through a landing page (i.e. direct download URLs are not known), then the dcat:landingPage address associated with the dcat:Dataset SHOULD be duplicated as dcat:accessURL on a distribution (see 5.5 Dataset available only behind some Web page). |
See also: | download address, access service |
dcat:accessURL generally matches the property-chain dcat:accessService/dcat:endpointURL. In the RDF representation of DCAT this is axiomatized as an OWL property-chain axiom.
New property in this revision of DCAT.
RDF Property: | dcat:accessService |
---|---|
Definition: | A site or end-point that gives access to the distribution of the dataset |
Sub-property of: | dcat:accessURL |
Range: | dcat:DataDistributionService |
Usage note: | dcat:accessService SHOULD be used to link to a description of a dcat:DataDistributionService that can provide access to this distribution. |
See also: | download address, access address |
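The property chain relating dcat:accessURL to dcat:accessService and dcat:endpointURL could be written as shown below. This is a sketch of how such an axiom is typically expressed in OWL 2, not a copy of the normative DCAT RDF; the ex: resources and the endpoint address are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

# accessService followed by endpointURL implies accessURL.
dcat:accessURL owl:propertyChainAxiom ( dcat:accessService dcat:endpointURL ) .

ex:busStops-viaService a dcat:Distribution ;
    dcat:accessService ex:busStopsService .

ex:busStopsService a dcat:DataDistributionService ;
    dcat:endpointURL <https://example.org/api/bus-stops> .

# An OWL 2 reasoner applying the chain could then infer:
#   ex:busStops-viaService dcat:accessURL <https://example.org/api/bus-stops> .
```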
RDF Property: | dcat:downloadURL |
---|---|
Definition: | The URL of the downloadable file in a given format, e.g. a CSV file or an RDF file. The format is indicated by the distribution's dct:format and/or dcat:mediaType. |
Domain: | dcat:Distribution |
Range: | rdfs:Resource |
Usage note: | dcat:downloadURL SHOULD be used for the address at which this distribution is available directly, typically through an HTTP GET request. |
See also: | access address, access service |
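The difference between dcat:downloadURL and dcat:accessURL can be sketched as follows. The first distribution is a directly downloadable file; the second is reachable only through the dataset's landing page, so the landing page address is repeated as its access URL. All URIs are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

ex:busStops a dcat:Dataset ;
    dcat:landingPage <https://example.org/bus-stops> ;
    dcat:distribution ex:busStops-csv , ex:busStops-portal .

# Directly downloadable file: a single HTTP GET returns the data.
ex:busStops-csv a dcat:Distribution ;
    dcat:downloadURL <https://example.org/files/bus-stops.csv> .

# Direct download URL not known: duplicate the landing page as the access URL.
ex:busStops-portal a dcat:Distribution ;
    dcat:accessURL <https://example.org/bus-stops> .
```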
The axiomatization of dcat:byteSize is being re-evaluated as part of the revision of DCAT.
RDF Property: | dcat:byteSize |
---|---|
Definition: | The size of a distribution in bytes. |
Domain: | dcat:Distribution |
Range: | rdfs:Literal typed as xsd:decimal. |
Usage note: | The size in bytes can be approximated when the precise size is not known. |
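A minimal sketch of dcat:byteSize with the literal typed as xsd:decimal, as stated in the range above. The distribution URI and the size are hypothetical, and the value is an approximation of the kind the usage note allows.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

ex:busStops-csv a dcat:Distribution ;
    # Approximate size (about 5 MB); the exact byte count is assumed unknown.
    dcat:byteSize "5000000"^^xsd:decimal .
```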
New property in this context in this revision of DCAT.
RDF Property: | dct:conformsTo |
---|---|
Definition: | An established standard to which the described resource conforms. |
Range: | dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.) |
Usage note: | This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation conforms to. This is (generally) a complementary concern to the media-type or format. |
See also: | format, media type |
dct:Standard is defined as "A basis for comparison; a reference point against which other things can be evaluated." It is not restricted to formal standards issued by bodies like ISO and W3C. In this context it will usually be used for a schema, ontology, data model or profile which specifies the structure of a dataset. This is not necessarily tied to a single encoding or serialization.
The range of dcat:mediaType has been tightened from dct:MediaTypeOrExtent to dct:MediaType as part of the revision of DCAT.
RDF Property: | dcat:mediaType |
---|---|
Definition: | The media type of the distribution as defined by IANA [IANA-MEDIA-TYPES]. |
Sub-property of: | dct:format |
Domain: | dcat:Distribution |
Range: | dct:MediaType |
Usage note: | This property SHOULD be used when the media type of the distribution is defined in IANA [IANA-MEDIA-TYPES], otherwise dct:format MAY be used with different values. |
See also: | format, conforms to |
RDF Property: | dct:format |
---|---|
Definition: | The file format of the distribution. |
Range: | dct:MediaTypeOrExtent |
Usage note: | dcat:mediaType SHOULD be used if the type of the distribution is defined by IANA [IANA-MEDIA-TYPES]. |
See also: | media type, conforms to |
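The interplay of dcat:mediaType, dct:format and dct:conformsTo might be sketched as follows. Both distributions conform to the same schema while using different encodings, which shows why conformance is a complementary concern to the media type or format. The ex: resources are hypothetical, and ex:customBinaryFormat is assumed, for illustration only, to have no IANA registration.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

# The format is registered with IANA: use dcat:mediaType.
ex:busStops-csv a dcat:Distribution ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    dct:conformsTo ex:busStopsSchema .

# The format is (by assumption) not registered with IANA: fall back to dct:format.
ex:busStops-custom a dcat:Distribution ;
    dct:format ex:customBinaryFormat ;
    dct:conformsTo ex:busStopsSchema .

ex:customBinaryFormat a dct:MediaTypeOrExtent ;
    rdfs:label "Custom binary format"@en .

# The schema is independent of any single serialization.
ex:busStopsSchema a dct:Standard ;
    dct:title "Bus stops schema"@en .
```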
RDF Class: | skos:ConceptScheme |
---|---|
Definition: | A knowledge organization system (KOS) used to represent themes/categories of datasets in the catalog. |
See also: | catalog themes, dataset theme |
RDF Class: | skos:Concept |
---|---|
Definition: | A category or a theme used to describe datasets in the catalog. |
Usage note: | It is recommended to use either skos:inScheme or skos:topConceptOf on every skos:Concept used to classify datasets to link it to the concept scheme it belongs to. This concept scheme is typically associated with the catalog using dcat:themeTaxonomy. |
See also: | catalog themes, dataset theme |
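The recommendation above can be illustrated with a small, hypothetical concept scheme that is attached to a catalog through dcat:themeTaxonomy. Every concept is linked to its scheme by skos:topConceptOf or skos:inScheme. All ex: names are assumptions made for this sketch.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/> .   # hypothetical namespace

ex:catalog a dcat:Catalog ;
    dcat:themeTaxonomy ex:themes ;
    dcat:dataset ex:busStops .

ex:themes a skos:ConceptScheme ;
    skos:prefLabel "Catalog themes"@en ;
    skos:hasTopConcept ex:transport .

ex:transport a skos:Concept ;
    skos:prefLabel "Transport"@en ;
    skos:topConceptOf ex:themes .

ex:publicTransport a skos:Concept ;
    skos:prefLabel "Public transport"@en ;
    skos:broader ex:transport ;
    skos:inScheme ex:themes .

ex:busStops a dcat:Dataset ;
    dcat:theme ex:publicTransport .
```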
RDF Classes: | foaf:Person for people and foaf:Organization for government agencies or other entities. |
---|---|
Usage note: | [FOAF] provides sufficient properties to describe these entities. |
This section is non-normative.
This section is non-normative because it provides guidance on how to document the quality of DCAT first-class entities (e.g., datasets, distributions) and does not define new DCAT terms. The guidance relies on the Data Quality Vocabulary (DQV) [VOCAB-DQV], which is a W3C Group Note.
The following examples make no assumptions about where the quality information resides or how it is managed; that is out of scope for the DCAT vocabulary. They assume only that the quality individuals are available at the URIs indicated. The examples, and DQV more generally, are also neutral about how a data portal chooses to collect quality information. For example, a data portal can collect DQV instances by implementing a specific UI for annotating data or by taking input from third-party services.
We might want to include examples of quality documentation related to services.
A data consumer (:consumer1) describes the quality of the dataset :genoaBusStopsDataset, which includes a georeferenced list of bus stops in Genoa. The consumer annotates the dataset with a DQV quality note (:genoaBusStopsDatasetCompletenessNote) about data completeness (ldqd:completeness) to warn that the dataset includes only 20500 of the 30000 stops.
:genoaBusStopsDataset a dcat:Dataset ;
    dqv:hasQualityAnnotation :genoaBusStopsDatasetCompletenessNote .

:genoaBusStopsDatasetCompletenessNote a dqv:UserQualityFeedback ;
    oa:hasTarget :genoaBusStopsDataset ;
    oa:hasBody :textBody ;
    oa:motivatedBy dqv:qualityAssessment ;
    prov:wasAttributedTo :consumer1 ;
    prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    dqv:inDimension ldqd:completeness .

:textBody a oa:TextualBody ;
    rdf:value "Incomplete dataset: it contains only 20500 out of 30000 existing bus stops" ;
    dc:language "en" ;
    dc:format "text/plain" .
The activity :myQualityChecking employs the service :myQualityChecker to check the quality of the :genoaBusStopsDataset dataset. The metric :completenessWRTExpectedNumberOfEntities is applied to measure the dataset completeness (ldqd:completeness) and it results in the quality measurement :genoaBusStopsDatasetCompletenessMeasurement.
:genoaBusStopsDataset dqv:hasQualityMeasurement :genoaBusStopsDatasetCompletenessMeasurement .

:genoaBusStopsDatasetCompletenessMeasurement a dqv:QualityMeasurement ;
    dqv:computedOn :genoaBusStopsDataset ;
    dqv:isMeasurementOf :completenessWRTExpectedNumberOfEntities ;
    dqv:value "0.6833333"^^xsd:decimal ;
    prov:wasAttributedTo :myQualityChecker ;
    prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    prov:wasGeneratedBy :myQualityChecking .

:completenessWRTExpectedNumberOfEntities a dqv:Metric ;
    skos:definition "It returns the degree of completeness as the ratio between the actual number of entities included in the dataset and the declared expected number of entities."@en ;
    dqv:expectedDataType xsd:decimal ;
    dqv:inDimension ldqd:completeness .

# :myQualityChecker is a service computing some quality metrics
:myQualityChecker a prov:SoftwareAgent ;
    rdfs:label "A quality assessment service"^^xsd:string .
# Further details about quality service/software can be provided, for example,
# deploying vocabularies such as Dataset Usage Vocabulary (DUV), Dublin Core or ADMS.SW

# :myQualityChecking is the activity that has generated
# :genoaBusStopsDatasetCompletenessMeasurement from :genoaBusStopsDataset
:myQualityChecking a prov:Activity ;
    rdfs:label "The checking of genoaBusStopsDataset's quality"^^xsd:string ;
    prov:wasAssociatedWith :myQualityChecker ;
    prov:used :genoaBusStopsDataset ;
    prov:generated :genoaBusStopsDatasetCompletenessMeasurement ;
    prov:endedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    prov:startedAtTime "2018-05-27T00:52:02Z"^^xsd:dateTime .
The issue suggests representing "the degree a dataset conforms to a stated quality standard" and "the details of data quality conformance test results". This section is a first step towards addressing these requests by relying on existing W3C vocabularies. Comments and suggestions about the following examples, as well as proposals of alternative patterns, are more than welcome.
This subsection shows different modelling patterns combining DQV [VOCAB-DQV] with PROV [PROV-O] and EARL [EARL10-Schema] to represent the conformance degree to a stated quality standard and the details about the conformance tests.
The use of dct:conformsTo and dct:Standard is a well-known pattern to represent the conformance to a standard. The following example declares a fictional a:Dataset conformant to the "Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC".
a:Dataset a dcat:Dataset ;
    dct:conformsTo <http://data.europa.eu/eli/reg/2014/1312/oj> .

# Reference standard / specification
<http://data.europa.eu/eli/reg/2014/1312/oj> a dct:Standard ;
    dct:title "Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services"@en ;
    dct:issued "2010-11-23"^^xsd:date .
Some legal contexts require the degree of conformance to be specified. For example, INSPIRE metadata adopts a specific controlled vocabulary to express non-conformance and non-evaluation in addition to full compliance. As suggested in [GeoDCAT-AP], the following example specifies the degree of conformance (i.e., not conformant) by declaring the dct:type for the result of the conformance test. [GeoDCAT-AP] suggests using a PROV entity to model the conformance test result (e.g., a:TestResult), a PROV activity to model the testing activity (e.g., a:TestingActivity), and a PROV plan derived from the INSPIRE Directive to represent the conformance test (e.g., a:ConformanceTest). A qualified PROV association binds the testing activity to the conformance test.
a:Dataset a dcat:Dataset ;
    prov:wasUsedBy a:TestingActivity .

a:TestingActivity a prov:Activity ;
    prov:generated a:TestResult ;
    prov:qualifiedAssociation [
        a prov:Association ;
        # http://validator.example.org/ is the agent who did the test.
        prov:agent <http://validator.example.org/> ;
        # following the plan a:ConformanceTest
        prov:hadPlan a:ConformanceTest
    ] .

# Conformance test result
a:TestResult a prov:Entity ;
    dcterms:type <http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/notConformant> .

a:ConformanceTest a prov:Plan ;
    # Here you can specify additional information on the test
    prov:wasDerivedFrom <http://data.europa.eu/eli/reg/2014/1312/oj> .
DQV [VOCAB-DQV] can also be used to measure compliance with a specific standard. In the following example, :levelOfComplianceToINSPIRE is a quality metric that measures the compliance of a dataset with INSPIRE as the percentage of passed compliance tests. The example assumes iso as a namespace prefix representing the quality dimensions and categories defined in ISO/IEC 25012.
:levelOfComplianceToINSPIRE a dqv:Metric ;
    skos:definition "It returns the degree of compliance to INSPIRE defined as the percentage of passed compliance tests."@en ;
    dqv:expectedDataType xsd:double ;
    dqv:inDimension iso:compliance .

iso:compliance a dqv:Dimension ;
    skos:prefLabel "Compliance"@en ;
    skos:definition "The degree to which data has attributes that adhere to standards, conventions or regulations in force and similar rules relating to data quality in a specific context of use."@en ;
    dqv:inCategory iso:inherentandSystemDependentDataQuality .

iso:inherentandSystemDependentDataQuality a dqv:Category ;
    skos:prefLabel "Inherent and System-Dependent Data Quality"@en .
The quality measurement :measurement_complianceToINSPIRE represents the level of compliance of the dataset a:Dataset, that is, a measurement of the metric :levelOfComplianceToINSPIRE. If only part of the compliance tests succeed (e.g. half of them), the measurement would look like the following:
:measurement_complianceToINSPIRE a dqv:QualityMeasurement ;
    dqv:computedOn a:Dataset ;
    dqv:value "50"^^xsd:double ;
    sdmx-attribute:unitMeasure <http://www.wurvoc.org/vocabularies/om-1.8/Percentage> ;
    dcterms:date "2018-01-10"^^xsd:date ;
    dqv:isMeasurementOf :levelOfComplianceToINSPIRE .
Further information about the tests can be provided using EARL [EARL10-Schema]. EARL provides specific classes to describe the testing activity, which can be adopted in conjunction with PROV. The following example describes the Testing activity a:TestingActivity as an EARL Assertion instead of a qualified association on the PROV activity. The EARL Assertion states the dataset a:Dataset has been tested with the conformance test a:ConformanceTest, and it has passed the test as described in a:testResult.
a:assertion a earl:Assertion ;
    earl:subject a:Dataset ;
    earl:test a:ConformanceTest ;
    earl:result a:testResult ;
    # let's indicate if the test was manual, automatic, or what ..
    earl:mode earl:automatic ;
    earl:assertedBy <http://validator.example.org/> ;
    prov:wasAttributedTo <http://validator.example.org/> .

a:ConformanceTest a earl:TestRequirement, prov:Plan ;
    dct:title "Set of conformance tests derived from the Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services"@en ;
    # it includes different subtests
    dct:hasPart a:test1, a:test2, ..., a:testn ;
    # it is derived from the reference standard
    prov:wasDerivedFrom <http://data.europa.eu/eli/reg/2014/1312/oj> .

a:testResult a earl:TestResult ;
    # results in conformance.
    dcterms:type <http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/conformant> ;
    # the overall set of tests has been passed
    earl:outcome earl:passed .

# the description of the validator
<http://validator.example.org/> a earl:Assertor, prov:Agent ;
    dcterms:description "A test execution service that runs conformance test suites."@en ;
    dcterms:title "Validator"@en .

# the testing activity
a:TestingActivity a prov:Activity ;
    prov:generated a:assertion, a:testResult ;
    prov:used a:Dataset ;
    prov:wasAssociatedWith <http://validator.example.org/> .
The following example shows how the description would look if the subtest a:testq1 had failed. In particular, dcterms:description and earl:info provide additional warnings or error messages in a human-readable form.
a:assertion1 a earl:Assertion ;
    earl:subject a:Dataset ;
    earl:test a:testq1 ;
    earl:result [
        a earl:TestResult ;
        # results in no conformance.
        dcterms:type <http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/notConformant> ;
        # the overall set of tests have not been passed (!?)
        dcterms:date "2015-09-29T11:50:00+00:00"^^xsd:dateTime ;
        # Some XML encoding of the error
        dcterms:description """<ul xmlns="http://www.w3.org/1999/xhtml">
            <li>test 1 has failed. Some description of the errors found</li>
        </ul>"""^^rdf:XMLLiteral ;
        earl:info """<test-method duration-ms="47" finished-at="2015-09-29T11:50:00Z" name="validate" signature="validate()" started-at="2015-09-29T11:50:00Z" status="FAIL">
            <exception class="java.lang.AssertionError">
                <message> Total validation errors found: 2 </message>
            </exception>
        </test-method>"""^^rdf:XMLLiteral ;
        earl:outcome earl:failed
    ] ;
    # we do not know if the test was manual, automatic, or what ..
    earl:mode earl:automatic .
:errors a dqv:QualityAnnotation ;
    # this annotation is derived by the measurement
    prov:wasGeneratedBy a:TestingActivity ;
    oa:hasTarget a:Dataset ;
    oa:hasBody [
        # errors/failed test description
        a oa:TextualBody ;
        rdf:value """<test-method duration-ms="47" finished-at="2015-09-29T11:50:00Z" name="validate" signature="validate()" started-at="2015-09-29T11:50:00Z" status="FAIL">
            <exception class="java.lang.AssertionError">
                <message> Total validation errors found: 2 </message>
            </exception>
        </test-method>"""^^rdf:XMLLiteral ;
        # it can be in any format supported by dc
        dct:format "text/xml"
    ] ;
    oa:motivatedBy dqv:qualityAssessment, oa:assessing ;
    dqv:inDimension iso:compliance .

a:TestResult a dqv:QualityMetadata ;
    # change the dcterms:type according to the resulting compliance
    dcterms:type <http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/conformant> ;
    prov:wasAttributedTo <http://validator.example.org/> ;
    prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    prov:wasGeneratedBy a:TestingActivity .

# The graph contains the rest of the statements presented in the previous examples.
a:TestResult {
    a:Dataset dqv:hasQualityMeasurement :measurement_complianceToINSPIRE ;
        dqv:hasQualityAnnotation :errors .
}

# the testing activity
a:TestingActivity a prov:Activity ;
    prov:generated a:TestResult ;
    prov:used a:Dataset ;
    prov:wasAssociatedWith <http://validator.example.org/> .
A number of requirements identify the need to provide better support for Dataset and Record provenance - see Issue #78, Issue #77, Issue #76, Issue #71, Issue #66, Issue #63. It has been suggested that many of the requirements can be satisfied by using capabilities from the [PROV-O] ontology, in particular by treating dcat:Resource and/or dcat:CatalogRecord as a sub-class of prov:Entity. A preliminary alignment of DCAT with PROV-O is available.
In this chapter it is planned to describe patterns for the use of the [PROV-O] vocabulary to support the various provenance-related requirements. See the wiki page on Provenance Patterns for more discussion.
The DCAT 2014 handling of license and rights does not appear to satisfy all requirements [VOCAB-DCAT-20140116]. The recently completed W3C ODRL vocabulary [ODRL-VOCAB] provides a rich language for describing many kinds of rights and obligations. In this chapter it is planned to describe some patterns for linking DCAT Datasets and/or Distributions to suitable rights expressions. See the wiki page on License and rights for more discussion.
The need to be able to describe version relationships of datasets has been identified as a requirement to be satisfied in the revision of DCAT. Also see detailed requirements in Issue #89, Issue #91, Issue #92, Issue #93.
In this chapter it is planned to describe some patterns for describing Dataset and/or Distribution versions. See the wiki page on Dataset versioning for more discussion.
See the wiki page on Alignments and Crosswalks for more discussion.
This section is non-normative.
Schema.org [SCHEMA-ORG] includes a number of types and properties based on the original DCAT work (see schema:Dataset as a starting point), and the index for Google's Dataset Search service relies on structured description in web pages about datasets based on both schema.org and DCAT.
Most general purpose web search services that pay attention to metadata at all rely primarily on schema.org, so the detailed relationship of DCAT to schema.org is of interest for data providers who wish their datasets and services to be exposed through those indexes.
A mapping between DCAT 2014 and schema.org was discussed on the original proposal to extend schema.org for describing datasets and data catalogs. Partial mappings between DCAT 2014 [VOCAB-DCAT-20140116] and schema.org were provided earlier by the European Commission and the Spatial Data on the Web Working Group.
A recommended mapping from the revised DCAT (this document) to schema.org is available in an RDF file.
This mapping is axiomatized using the standard predicates rdfs:subClassOf, rdfs:subPropertyOf, owl:equivalentClass, owl:equivalentProperty, and also using the annotation properties schema:domainIncludes and schema:rangeIncludes to match schema.org semantics. The mapping is summarized in the table below, considering the prefix schema as http://schema.org/.
This alignment of DCAT with schema.org is provisional and non-normative. Feedback is invited in the issue tracker.
DCAT element | mapping property | target element from schema.org |
---|---|---|
dct:description | owl:equivalentProperty | schema:description |
dct:format | owl:equivalentProperty | schema:encodingFormat |
dct:identifier | owl:equivalentProperty | schema:identifier |
dct:issued | owl:equivalentProperty | schema:datePublished |
dct:language | owl:equivalentProperty | schema:inLanguage |
dct:license | owl:equivalentProperty | schema:license |
dct:modified | owl:equivalentProperty | schema:dateModified |
dct:publisher | owl:equivalentProperty | schema:publisher |
dct:spatial | owl:equivalentProperty | schema:spatialCoverage |
dct:temporal | owl:equivalentProperty | schema:temporalCoverage |
dct:title | owl:equivalentProperty | schema:name |
dct:type | owl:equivalentProperty | schema:additionalType |
dcat:Catalog | owl:equivalentClass | schema:DataCatalog |
dcat:DataService | owl:equivalentClass | schema:DataFeed |
 | | Unclear if a DataFeed is a data service, or a data collection. From a REST viewpoint there is no difference, but some commonly used APIs support additional queries, slices, etc which make the characterization of a service more efficient than listing the (potentially infinite) set of resources available from it. |
dcat:Dataset | owl:equivalentClass | schema:Dataset |
dcat:Distribution | owl:equivalentClass | schema:DataDownload |
dcat:Resource | rdfs:subClassOf | schema:Thing |
dcat:accessURL | rdfs:subPropertyOf | schema:contentUrl |
 | schema:domainIncludes | dcat:Distribution, schema:DataDownload |
 | schema:rangeIncludes | rdfs:Resource, schema:URL |
dcat:byteSize | rdfs:subPropertyOf | schema:contentSize |
 | schema:domainIncludes | dcat:Distribution, schema:DataDownload |
 | schema:rangeIncludes | rdfs:Literal, schema:Text |
dcat:catalog | schema:domainIncludes | dcat:Catalog, schema:DataCatalog |
 | schema:rangeIncludes | dcat:Catalog, schema:DataCatalog |
dcat:contactPoint | owl:equivalentProperty | schema:contactPoint |
 | schema:domainIncludes | dcat:Resource, dcat:Dataset, dcat:DataService, schema:Dataset |
dcat:dataset | owl:equivalentProperty | schema:dataset |
 | schema:domainIncludes | dcat:Catalog, schema:DataCatalog |
 | schema:rangeIncludes | dcat:Dataset, schema:Dataset |
dcat:distribution | owl:equivalentProperty | schema:distribution |
 | schema:domainIncludes | dcat:Dataset, schema:Dataset |
 | schema:rangeIncludes | dcat:Distribution, schema:DataDownload |
dcat:downloadURL | rdfs:subPropertyOf | schema:contentUrl |
 | schema:domainIncludes | dcat:Distribution, schema:DataDownload |
 | schema:rangeIncludes | rdfs:Resource, schema:Thing |
dcat:keyword | rdfs:subPropertyOf | schema:keywords |
 | | dcat:keyword is singular, schema:keywords is plural |
 | schema:domainIncludes | dcat:Resource, dcat:Dataset, dcat:DataService, schema:Dataset |
 | schema:rangeIncludes | rdfs:Literal, schema:Text |
dcat:landingPage | rdfs:subPropertyOf | schema:url |
 | schema:domainIncludes | dcat:Resource, dcat:Dataset, dcat:DataService, schema:Dataset |
 | schema:rangeIncludes | foaf:Document, schema:WebPage |
dcat:mediaType | owl:equivalentProperty | schema:encodingFormat |
 | schema:domainIncludes | dcat:Distribution, schema:DataDownload |
 | schema:rangeIncludes | dct:MediaTypeOrExtent, schema:Text, schema:url |
dcat:record | schema:domainIncludes | dcat:Catalog, schema:DataCatalog |
 | schema:rangeIncludes | dcat:CatalogRecord |
dcat:service | schema:domainIncludes | dcat:Catalog, schema:DataCatalog |
 | schema:rangeIncludes | dcat:DataService |
dcat:theme | owl:equivalentProperty | schema:about |
 | schema:domainIncludes | dcat:Resource, dcat:Dataset, dcat:DataService, schema:Dataset |
 | schema:rangeIncludes | skos:Concept, schema:Class |
dcat:themeTaxonomy | schema:domainIncludes | dcat:Catalog, schema:DataCatalog |
 | schema:rangeIncludes | skos:ConceptScheme |
foaf:Organization | owl:equivalentClass | schema:Organization |
foaf:Person | owl:equivalentClass | schema:Person |
foaf:homepage | owl:equivalentProperty | schema:url |
foaf:mbox | owl:equivalentProperty | schema:email |
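A few rows of the table, written out as RDF, might look as follows. This is a sketch transcribed from the table for illustration and is not a copy of the mapping file itself.

```turtle
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .

# Class mappings
dcat:Dataset  owl:equivalentClass schema:Dataset .
dcat:Resource rdfs:subClassOf     schema:Thing .

# Property mappings
dct:title owl:equivalentProperty schema:name .

# A sub-property mapping with schema.org-style domain and range annotations
dcat:keyword rdfs:subPropertyOf schema:keywords ;
    schema:domainIncludes dcat:Resource , dcat:Dataset , dcat:DataService , schema:Dataset ;
    schema:rangeIncludes  rdfs:Literal , schema:Text .
```

Because schema:domainIncludes and schema:rangeIncludes are annotation properties, they document the expected subjects and values without licensing the inferences that rdfs:domain and rdfs:range would.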
This section is non-normative.
An alignment of DCAT with PROV-O [PROV-O] is being prepared. A provisional version is available.
This section is non-normative.
This section is non-normative.
An alignment of DCAT with HCLS [HCLS-Dataset] is being prepared.
This section is non-normative.
An alignment of DCAT with ISO 19115 [ISO-19115-1] is being prepared.
This section is non-normative.
This section is non-normative.
DCAT provides a generic metadata vocabulary for cataloguing datasets. Profiles of DCAT are required for specific applications and disciplines. Providing a model and formalization for profiles is planned to be an important part of the work of the Dataset eXchange Working Group (DXWG). Also see Issue #73, Issue #74, Issue #75.
See the Profile Guidance working document and wiki page on Application Profiles for more discussion.
This section will describe security and privacy considerations relevant to the DCAT revision.
DCAT should be aligned with other recent Linked Data based Recommendations.
DCAT provides a data model for representing metadata about datasets in the form of Linked Data, but it does not specify how this metadata can be accessed or modified. DCAT-compatible metadata can be viewed as collections of Catalog Records, Datasets and Data Services contained in a Catalog, and a collection of Distributions contained in a Dataset. The Linked Data Platform [LDP] specification deals with access to and modification of Linked Data Platform Containers (LDPCs). This section provides guidance on how to represent DCAT metadata as LDP Containers, which supports, in particular, the implementation of Solid-based DCAT catalogs.
First, we will present an example of a LDPC for datasets in a catalog.
There is one catalog with one dataset.
The dataset is contained in the </datasets/> LDP Direct Container. To make the LDPC discoverable, we connect it to the Catalog using the dcat:datasets predicate.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@base <https://example.org/resource/catalog> .

<> a dcat:Catalog ;
    dcat:datasets </datasets/> ;
    dcat:dataset </datasets/001> .

</datasets/> a ldp:Container, ldp:DirectContainer ;
    ldp:membershipResource <> ;
    ldp:hasMemberRelation dcat:dataset ;
    ldp:contains </datasets/001> .

</datasets/001> a dcat:Dataset .
In the second example, we add LDPCs </records/> for Catalog Records and </services/> for Data Services, discoverable using the dcat:records and dcat:services predicates from the Catalog:
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@base <https://example.org/resource/catalog> .

<> a dcat:Catalog ;
    dcat:records </records/> ;
    dcat:datasets </datasets/> ;
    dcat:services </services/> ;
    dcat:dataset </datasets/001> .

</records/> a ldp:Container, ldp:DirectContainer ;
    ldp:membershipResource <> ;
    ldp:hasMemberRelation dcat:record ;
    ldp:contains </records/001> .

</datasets/> a ldp:Container, ldp:DirectContainer ;
    ldp:membershipResource <> ;
    ldp:hasMemberRelation dcat:dataset ;
    ldp:contains </datasets/001> .

</services/> a ldp:Container, ldp:DirectContainer ;
    ldp:membershipResource <> ;
    ldp:hasMemberRelation dcat:service ;
    ldp:contains </services/001> .

</records/001> a dcat:CatalogRecord ;
    foaf:primaryTopic </datasets/001> .

</datasets/001> a dcat:Dataset .

</services/001> a dcat:DataService .
Each dataset has its own LDPC for its distributions.
In the third example, we show the LDPC </datasets/001/distributions/> for distributions of a single dataset, </datasets/001>, discoverable through the dcat:distributions predicate.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@base <https://example.org/resource/catalog> .

</datasets/001> a dcat:Dataset ;
    dcat:distributions </datasets/001/distributions/> ;
    dcat:distribution </datasets/001/distributions/001> .

</datasets/001/distributions/> a ldp:Container, ldp:DirectContainer ;
    ldp:membershipResource </datasets/001> ;
    ldp:hasMemberRelation dcat:distribution ;
    ldp:contains </datasets/001/distributions/001> .

</datasets/001/distributions/001> a dcat:Distribution .
For catalogs with many datasets, catalog records, data services or distributions, the Linked Data Platform Paging mechanism [LDP-Paging] SHOULD be used to provide access to them.
In the next sections we formally define the additional properties used for discovery of LDP containers.
RDF Property: | dcat:datasets |
---|---|
Definition: | Connects a catalog to the LDP container of its datasets. |
Domain: | dcat:Catalog |
Range: | ldp:DirectContainer |
RDF Property: | dcat:records |
---|---|
Definition: | Connects a catalog to the LDP container of its catalog records. |
Domain: | dcat:Catalog |
Range: | ldp:DirectContainer |
RDF Property: | dcat:services |
---|---|
Definition: | Connects a catalog to the LDP container of its data services. |
Domain: | dcat:Catalog |
Range: | ldp:DirectContainer |
RDF Property: | dcat:distributions |
---|---|
Definition: | Connects a dataset to the LDP container of its distributions. |
Domain: | dcat:Dataset |
Range: | ldp:DirectContainer |
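The four definitions above could be expressed in RDF Schema as sketched below. This is an illustrative transcription of the domains and ranges stated in the tables, not the normative RDF representation of DCAT.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

dcat:datasets a rdf:Property ;
    rdfs:comment "Connects a catalog to the LDP container of its datasets."@en ;
    rdfs:domain dcat:Catalog ;
    rdfs:range ldp:DirectContainer .

dcat:records a rdf:Property ;
    rdfs:comment "Connects a catalog to the LDP container of its catalog records."@en ;
    rdfs:domain dcat:Catalog ;
    rdfs:range ldp:DirectContainer .

dcat:services a rdf:Property ;
    rdfs:comment "Connects a catalog to the LDP container of its data services."@en ;
    rdfs:domain dcat:Catalog ;
    rdfs:range ldp:DirectContainer .

dcat:distributions a rdf:Property ;
    rdfs:comment "Connects a dataset to the LDP container of its distributions."@en ;
    rdfs:domain dcat:Dataset ;
    rdfs:range ldp:DirectContainer .
```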
Linked Data Notifications (LDN) [LDN] can be used with DCAT e.g. for feedback collection.
Any resource can have an LDN Inbox.
In the following example we show a dataset </datasets/001> as an LDN Target with an LDN Inbox.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@base <https://example.org/resource/catalog> .

</datasets/001> a dcat:Dataset ;
    ldp:inbox </datasets/001/inbox/> .

</datasets/001/inbox/> ldp:contains </datasets/001/inbox/001> .
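LDN does not prescribe the vocabulary of a notification, so the content of </datasets/001/inbox/001> is up to the sender. As one possible sketch for feedback collection, the notification could be a Web Annotation commenting on the dataset. The choice of the oa: vocabulary and the feedback text are assumptions made for this illustration.

```turtle
@prefix oa:  <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@base <https://example.org/resource/catalog> .

# Hypothetical feedback notification stored in the dataset's inbox.
</datasets/001/inbox/001> a oa:Annotation ;
    oa:hasTarget </datasets/001> ;
    oa:motivatedBy oa:commenting ;
    oa:hasBody [
        a oa:TextualBody ;
        rdf:value "The coordinates of some records appear to be swapped."
    ] .
```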
All currently open issues are available at: https://github.com/w3c/dxwg/labels/dcat
The editors gratefully acknowledge the contributions made to this document by all members of the working group.
The editors also gratefully acknowledge the chairs of this Working Group: Karen Coyle, Caroline Burle and Peter Winstanley — and staff contacts Phil Archer and Dave Raggett.
A full change-log is available on GitHub
The document has undergone the following changes since the W3C Recommendation of 16 January 2014 [VOCAB-DCAT-20140116]: