Best Practice: Dataset Criteria

Draft: 7 October 2015

This version
http://www.w3.org/2013/share-psi/bp/dc-20151007/
Latest version
http://www.w3.org/2013/share-psi/bp/dc/

This is one of a set of Best Practices developed by the Share-PSI 2.0 Thematic Network.

Share-PSI Best Practice: Dataset Criteria by Share-PSI 2.0 is licensed under a Creative Commons Attribution 4.0 International License.


Outline

This best practice sets out a number of criteria that can be used to prioritise the publication of some datasets ahead of others.

Management Summary

Challenge

To develop the criteria for ‘high-value datasets’ taking into consideration the likely re-use of open data and to help governments understand which datasets to prioritise for publication.

Solution

To follow this guidance on dataset criteria, which has been developed by engaging with both users and re-users of the data. The characteristics of ‘high-value datasets’ are seen from three perspectives: re-usability, value for data owners and value for re-users.

Re-usability

  • High-value data should reach at least 3 stars on Tim Berners-Lee's 5-star scheme for open data (that is, it is available on the Web under an open license in a structured, non-proprietary format).
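
The 3-star threshold can be checked mechanically once the properties of a dataset are recorded. The following is a minimal sketch, assuming the usual reading of the 5-star scheme in which each level builds on the previous one; the `Dataset` class and its field names are hypothetical and are not part of this Best Practice.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical field names, chosen only for this sketch.
    on_web: bool
    open_license: bool
    structured: bool
    non_proprietary_format: bool
    uses_uris: bool = False
    linked_to_other_data: bool = False


def star_rating(d: Dataset) -> int:
    """Return the star level (0-5); a level counts only if all lower levels are met."""
    levels = [
        d.on_web and d.open_license,  # 1 star: on the Web under an open license
        d.structured,                 # 2 stars: structured data
        d.non_proprietary_format,     # 3 stars: non-proprietary format
        d.uses_uris,                  # 4 stars: URIs used to denote things
        d.linked_to_other_data,       # 5 stars: linked to other data
    ]
    stars = 0
    for met in levels:
        if not met:
            break
        stars += 1
    return stars


def meets_reusability_criterion(d: Dataset) -> bool:
    """High-value data should reach at least 3 stars."""
    return star_rating(d) >= 3
```

Under these assumptions, an openly licensed CSV file published on the Web would pass, while the same data published only as a proprietary spreadsheet would stop at 2 stars.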

Value for data owner

A dataset may be considered of high value when one or more of the following criteria are met:

  • sharing it contributes to transparency;
  • the publication is subject to a legal obligation;
  • the data directly or indirectly relates to the data owner's public task;
  • sharing it helps with cost reduction.
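
Because a single criterion is enough, the data owner's test is a simple "any of" check. A minimal sketch follows; the `OwnerValue` class and its field names are hypothetical labels for the four criteria above.

```python
from dataclasses import dataclass


@dataclass
class OwnerValue:
    # Hypothetical field names, one per criterion listed above.
    contributes_to_transparency: bool = False
    legal_obligation_to_publish: bool = False
    relates_to_public_task: bool = False
    helps_reduce_costs: bool = False


def is_high_value_for_owner(v: OwnerValue) -> bool:
    """True when one or more of the four criteria are met."""
    return any([
        v.contributes_to_transparency,
        v.legal_obligation_to_publish,
        v.relates_to_public_task,
        v.helps_reduce_costs,
    ])
```

A dataset whose publication is required by law would therefore qualify even if none of the other three criteria applies.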

Value for re-users

The value of a dataset primarily depends on its use and re-use potential, which can lead to the generation of business activity. The potential of the dataset is defined by:

  • the size and dynamics of the target audience;
  • the number of systems or services that could use the dataset.
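
This guidance does not prescribe a formula for combining these two factors. As one possible illustration only, candidate datasets could be ordered by estimated audience size, with the number of consuming systems as a tie-breaker. The `Candidate` type, its field names and the ordering rule are all assumptions made for this sketch, and the "dynamics" of the audience are not modelled.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    # Hypothetical fields; the estimates would come from engaging with re-users.
    name: str
    audience_size: int      # estimated number of potential re-users
    consuming_systems: int  # systems or services that could use the dataset


def rank_by_reuse_potential(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from highest to lowest assumed re-use potential.

    Assumption: audience size dominates; consuming systems break ties.
    """
    return sorted(
        candidates,
        key=lambda c: (c.audience_size, c.consuming_systems),
        reverse=True,
    )
```

A publisher preferring a different weighting would only need to change the sort key.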

Datasets contributing to transparency have a strong social impact, and re-users’ interest in these datasets is high.

Engaging with Re-users

It is important to engage directly with re-users to understand the value of your dataset. Recommendations:

  • establish a communication channel, for example a mailing list or a community on Joinup or on the Open Data Portal, that can be used to make announcements to re-users and to gather feedback;
  • use collaborative tools. This encourages collaboration within a community of re-users and the cross-fertilisation of ideas and business opportunities.

Best Practice Identification

Why is this a Best Practice? What’s the impact of the Best Practice?

It’s important to have a shared understanding of what can be considered to be high-value datasets so that publication of these datasets can be prioritised.

Why is there a need for this Best Practice?

Understanding which datasets should be published, according to which criteria and with what priority, will help public authorities to see the benefits of publishing more high-quality datasets.

What do you need for this Best Practice?

An understanding of high-value data, and communication channels with data users and data re-users.

Applicability by other Member States

The approach is applicable to any Member State.

Contact Info

Nicolas Loozen, PwC

$Id: Overview.html,v 1.5 2016/08/19 09:02:16 phila Exp $