Best Practice: Encourage crowdsourcing around PSI

Draft: 6 October 2015

This version
http://www.w3.org/2013/share-psi/bp/ec-20151006/
Latest version
http://www.w3.org/2013/share-psi/bp/ec/
Previous version
http://www.w3.org/2013/share-psi/bp/ec-20150415/

This is one of a set of Best Practices developed by the Share-PSI 2.0 Thematic Network.

Share-PSI Best Practice: Encourage crowdsourcing around PSI by Share-PSI 2.0 is licensed under a Creative Commons Attribution 4.0 International License.


Outline

Preparing public sector information (PSI) for sharing can be time-consuming, expensive and sometimes difficult. Engaging the community in the task can increase the quality and quantity of available data and build enthusiasm among potential users.

Management Summary

Challenge

To increase the quality and quantity of machine-readable data within a constrained budget.

Solution

Crowdsourcing can be an efficient way to increase the quality and availability of machine-readable data, in particular for cultural heritage institutions. Innovative techniques, including gamification, can be used to harness the skill and enthusiasm of the community at large. On a practical level, datasets can be made available on platforms such as GitHub so that users can offer corrections (accepting such corrections remains under the control of the data owner). This is the approach taken by the City of Chicago. On a policy level, identifying community crowdsourcing projects outside government institutions can also indicate valuable datasets that should be prioritised for open publication, since the level of community involvement is generally proportional to the level of interest in that data.

Best Practice Identification

Why is this a Best Practice? What’s the impact of the Best Practice?

Many institutions lack the resources necessary to manually go through large collections of unstructured data that have been created over many years. By engaging external communities to collaborate on this data, it is possible to create more detailed machine-readable data that supports a wider range of re-use cases.

Why is there a need for this Best Practice?

More machine-readable open data supports a wider range of use cases in services and applications.

  • Many institutions lack the resources necessary to manually go through large collections of unstructured data.
  • By engaging external communities to collaborate on this data, it is possible to create more detailed machine-readable data that supports a wider range of re-uses.
  • Crowdsourcing engages the community that the end product serves.

What do you need for this Best Practice?

Planning phase:

  • Identify the exact need first, then seek groups able to help meet that need through crowdsourcing.
  • Think of crowdsourcing as another tool to create or improve datasets: consider the phases of your data collection project and where crowdsourcing could best fit in.
  • Involve stakeholders who could benefit from a free source of certain datasets and ask them to provide funding to sustain crowdsourcing efforts.

Implementation phase:

  • Keep individual tasks very small (micro-tasks).
  • Use a gamification approach if possible, that is, one in which users perform a useful task by playing a game.
  • It is possible to use crowdsourcing without the user's knowledge. The best-known example is the use of CAPTCHAs to solve the micro-task of reading words that optical character recognition software cannot, thereby digitising hard-to-read texts.
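
As a rough illustration of the micro-task idea above, the sketch below shows one way answers from several volunteers to the same small task might be combined before being accepted into a dataset. It is not taken from any Share-PSI project: the function name, the task identifiers, the sample answers and the thresholds are all hypothetical, and simple agreement voting is assumed here only as one plausible quality check.

```python
from collections import Counter


def aggregate_answers(answers, min_votes=3, min_agreement=0.6):
    """Return the agreed answer for one micro-task, or None if there is
    not yet enough agreement (hypothetical thresholds)."""
    # Normalise case and spacing so trivial differences do not split the vote.
    cleaned = [" ".join(a.split()).lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # too few volunteers have answered so far
    top_answer, count = Counter(cleaned).most_common(1)[0]
    if count / len(cleaned) >= min_agreement:
        return top_answer
    return None  # volunteers disagree too much


# Hypothetical micro-tasks: each asks volunteers to read one scanned word.
submissions = {
    "word-001": ["Stockholm", "stockholm", "Stockholm ", "Stockholn"],
    "word-002": ["1887", "1881"],
    "word-003": ["parish", "perish", "parish", "pariah"],
}

accepted = {task: aggregate_answers(a) for task, a in submissions.items()}
for task, value in accepted.items():
    print(task, value if value is not None else "needs more answers or review")
```

Tasks that come back as None could be shown to further volunteers or passed to staff for review, so that accepting a contribution stays under the control of the data owner.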

Applicability by other Member States

The approach is applicable to any Member State.

Contact Info

Peter Krantz peter@peterkrantz.se

$Id: Overview.html,v 1.4 2016/08/19 09:10:34 phila Exp $