Best Practices/Commercial Considerations in Open Data Portal Design
Title of the Best Practice: Commercial Considerations in Open Data Portal Design
Outline of the best practice
Data Marketplaces
Data markets existed long before the advent of the internet: matching information supply and demand has driven many media revolutions, from newspapers to the telegraph. The wholesale exchange of large structured datasets, however, is relatively new.
The Deloitte study "Market assessment of public sector information" for the UK Department for Business, Innovation and Skills identifies data marketplaces and data enrichment as important business models utilizing open data. Many early entrants have been forced to shift their focus from "statistics on speed" (timetrics) to more cautious approaches, or they have left the market altogether.
What do data marketplaces do?
- They function as search engines for (public or exclusive) datasets, data services and API access,
- they provide quality indicators for data (often crowdsourced),
- they facilitate the comparison of datasets,
- they allow the download of data.
How do they do it?
They build an enormous collection of structured data through
- automated methods,
- editorial work and
- crowdsourced commits and edits.
How do they earn money?
- They charge (buyers, sellers or both) a commission on any data that is sold on the site or after referral,
- they charge sellers for listing (akin to the yellow pages),
- they distribute premium data for a commission or sell value-added customer services (the so-called Freemium model),
- they monetize traffic with targeted advertising (rarely),
- they earn money through other cloud services, which become more attractive to developers through the ready availability of third party data,
- they trade data for data or
Data enrichment providers
Data enrichers, often specialists for a narrow business area (like geo marketing), seem to fare better, but here the public data sources are often not spelled out in detail. All this may be because these business models, and the open data policies geared towards them, face a couple of dilemmas that are often overlooked.
What do data enrichment providers do?
- They collect data necessary for a certain task
- They enhance, refine or otherwise improve this data
- They sell datasets or services specifically designed for the needs of a certain business
How do they do it?
They enhance large datasets through e.g.
- Extrapolation
- Error correction
- Matching with (proprietary) data
How do they earn money?
- Sale of datasets
- Service charges
- Sale of software that uses the data
Challenges and solutions
- What X is the thing that should be done to publish or reuse PSI?
- Why does X facilitate the publication or reuse of PSI?
- How can one achieve X and how can you measure or test it?
Chicken egg vs. market foreclosure | Challenge: Possible market foreclosure for data brokers through public data portals
What? Distinguish between the task of a data broker and that of an open data portal.
Why? Because open data portals are not designed for data brokerage. Rather, they should help government streamline its own processes, provide a single, reliable point of access for government data and help manage the open data process (e.g. prevent departments or agencies from prematurely developing their own apps). Also, concentrating solely on optimizing and enriching today's portals with brokerage functionalities might distract from advancing the services that are technologically possible (e.g. machine-based discovery, standard access to DCAT repositories, content delivery networks).
How? Withstand pressure to come up with brokerage solutions; instead, provide services and APIs that commercial data brokers can use.
|
Moving the data vs. moving the computation of data | Challenge: Bandwidth issues for large datasets.
What? Start thinking about accepting computational queries and delivering results instead of data (as search engines already do, only much more elaborate).
Why? Because it is much more efficient for large datasets and might also be a solution to privacy concerns, as it allows more control.
How? Don't be content with today's solutions (metadata catalogues), but keep an eye on data industry developments already underway. |
Privacy vs. information density | Challenge: Overly restrictive application of privacy rules.
What? Treat privacy issues as a risk to be managed, not a yes/no dichotomy.
Why? Because privacy concerns can easily be used as a smokescreen for other motivations not to publish PSI, when privacy is really a matter of the granularity of the data.
How? Keep in mind cultural differences: not everything that is considered possible in one country is feasible in another. Include a prohibition of re-identification in the conditions of use, or even make re-identification a criminal offence. |
TRP and share alike vs. monetization of added value | Challenge: Obstacles for commercial re-use through technical restriction prohibition (TRP) and share-alike clauses
What? Do not demand share-alike for open data and don't prohibit technical restrictions on sharing the data.
Why? Such clauses prevent any commercial usage that relies on selling the identical dataset to many customers.
How? Choose appropriate licensing terms. |
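The "moving the computation" row above can be illustrated with a minimal sketch: the client sends a small query description, the portal evaluates it next to the data, and only the aggregated result travels over the network. Everything here is an assumption made for illustration (the `RECORDS` sample, the `run_query` function and its parameters are hypothetical and not part of any existing portal API).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical in-memory stand-in for a large dataset held by the portal.
RECORDS = [
    {"district": "North", "year": 2013, "permits": 120},
    {"district": "North", "year": 2014, "permits": 135},
    {"district": "South", "year": 2013, "permits": 80},
    {"district": "South", "year": 2014, "permits": 95},
]

# Only whitelisted aggregate functions may be requested by a client.
AGGREGATES = {"sum": sum, "mean": mean, "count": len}


def run_query(records, group_by, measure, aggregate):
    """Evaluate an aggregate query next to the data and return only the result."""
    if aggregate not in AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    groups = defaultdict(list)
    for row in records:
        groups[row[group_by]].append(row[measure])
    func = AGGREGATES[aggregate]
    # The response contains one value per group, not the underlying rows.
    return {key: func(values) for key, values in sorted(groups.items())}


result = run_query(RECORDS, group_by="district", measure="permits", aggregate="sum")
print(result)  # {'North': 255, 'South': 175}
```

Restricting clients to a fixed set of aggregates is one possible way to keep the control mentioned in the table, since the portal decides which computations it is willing to run.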
- Management summary
Challenge
Early commercial data marketplaces and data enrichers have faced a number of challenges (sometimes forcing them to abandon their business models) that were aggravated because the possible commercial re-use of PSI was not adequately addressed in the design of public sector data portals or in licensing policies. Among the challenges are:
- Possible market foreclosure for data brokers through public data portals
- Bandwidth issues for large datasets.
- Overly restrictive application of privacy rules.
- Obstacles for commercial re-use through technical restriction prohibition and share-alike clauses
Solution
- Distinguish between the task of a data broker and that of an open data portal.
- Start thinking about accepting computational queries and delivering results instead of data.
- Treat privacy issues as a risk to be managed, not a yes/no dichotomy.
- Do not demand share-alike for open data and don't prohibit technical restrictions on sharing the data.
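Treating privacy as a matter of granularity rather than a yes/no decision can be sketched as follows: the publisher tries the most detailed form of the data first and coarsens it until every published group contains a minimum number of records. This is a simple minimum-group-size check, similar in spirit to k-anonymity, and not a complete privacy assessment. The sample records, the function names and the threshold of 3 are assumptions chosen for illustration only.

```python
from collections import Counter

# Hypothetical records: (postal code, age) pairs standing in for a detailed dataset.
RECORDS = [
    ("01129", 34), ("01129", 36), ("01127", 35),
    ("01099", 61), ("01097", 64), ("01097", 67),
]


def coarsen(record, postal_digits, age_band):
    """Truncate the postal code and replace the age with an age band."""
    postal, age = record
    low = (age // age_band) * age_band
    return (postal[:postal_digits], f"{low}-{low + age_band - 1}")


def smallest_group(records, postal_digits, age_band):
    """Size of the smallest group after coarsening; small groups carry more risk."""
    counts = Counter(coarsen(r, postal_digits, age_band) for r in records)
    return min(counts.values())


def choose_granularity(records, min_group_size, options):
    """Return the first (finest) option whose smallest group meets the threshold."""
    for postal_digits, age_band in options:
        if smallest_group(records, postal_digits, age_band) >= min_group_size:
            return postal_digits, age_band
    return None  # no option is coarse enough; publish nothing at this level


# Options ordered from finest to coarsest granularity.
OPTIONS = [(5, 5), (4, 10), (3, 10)]
print(choose_granularity(RECORDS, min_group_size=3, options=OPTIONS))  # (4, 10)
```

The acceptable group size is a policy choice and, as noted above, may differ between countries.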
- Best Practice Identification
Why is this a Best Practice? What's the impact of the Best Practice?
Commercial re-use of PSI is an important driver for Open Data. The suggested measures remove common hurdles to commercial re-use and thus strengthen the demand side of the Open Data process.
Link to the PSI Directive
- Open Data platform(s) / Publication and deployment of information/data and metadata
- Licensing of information/data and metadata
Why is there a need for this Best Practice?
Not addressing these issues will hinder important areas of commercial re-use of PSI.
- What do you need for this Best Practice?
Freedom in the design of Open Data Portals (no out-of-the-box solutions) and of license regimes.
- Applicability by other member states?
The best practice can be applied in any member state on all levels of government. It should be especially useful for projects with at least moderate budgets that can afford a custom design of their open data portal.
- Contact info - record of the person to be contacted for additional information or advice.
Dietmar Gattwinkel
Projektleiter Open Government Data | Project Manager Open Government Data
STAATSBETRIEB SÄCHSISCHE INFORMATIK DIENSTE | SAXON IT SERVICES
Fachbereich 3.1 | E-Government- und Querschnittverfahren
Riesaer Straße 7 Haus D | 01129 Dresden
Tel.: +49 351 20545259 | Fax: +49 351 451 3264 310