Encouraging open data usage by commercial developers: Report

The second Share-PSI workshop was very different from the first. Apart from presentations in two short plenary sessions, the majority of the two days was spent in facilitated discussions around specific topics. This followed the success of the bar camp sessions at the first workshop, that is, sessions proposed and organised in an ad hoc fashion, enabling people to discuss whatever subject interested them.

Each session facilitator was asked to focus on three key questions:

  1. What X is the thing that should be done to publish or reuse PSI?
  2. Why does X facilitate the publication or reuse of PSI?
  3. How can one achieve X and how can you measure or test it?

This report summarises the 7 plenary presentations, 17 planned sessions and 7 bar camp sessions. As well as the Share-PSI project itself, the workshop benefited from sessions led by 8 other projects. The agenda for the event includes links to all papers, slides and notes, with many of those notes being available on the project wiki. In addition, the #sharepsi tweets from the event are archived, as are a number of photo albums from Makx Dekkers, Peter Krantz and José Luis Roda. The event received a generous write up on the host's Web site (in Portuguese). The spirit of the event is captured in this video by Noël Van Herreweghe of CORVe.

Share PSI - Lissabon from Noël on Vimeo.

Opening Presentations, Workshop Themes

The event was generously hosted by Portugal's Agência para a Modernização Administrativa (AMA). Its president, Paulo Neves, stated in his opening remarks that open data is expected to have a great impact at the economic, political and research levels. He made 4 key points:

  1. It is not sufficient just to make data available; you have to build communities around the data for using and re-using it. That is the way to discover which information it is most valuable to open.
  2. We should guarantee the quality of the published data. Maintaining the quality of data is a difficult task, but it has to be ensured if others are to create value from it.
  3. Opening, making available and publishing data has a low priority among politicians, since they have to deal with problems that are crucial for citizens' living standards.
  4. Nevertheless, the objective for the Portuguese government is to provide open data by default as part of its daily process.
AMA's João Vasconcelos

AMA's João Vasconcelos continued this theme in his plenary talk. Openness means more responsibility – exposure, accountability – but it also means more strength. How can citizens trust government if government doesn't trust its citizens? This was an echo of the sentiment expressed on behalf of the Greek government in Samos. AMA recognises 3 pillars of open government: (a) transparency, (b) participation, and (c) collaboration. But openness is also a matter of economics. An open government is also a smart government.

These ideas are being put into practice with portals for public software as well as public data, and new laws that require Portuguese public administrations to adopt interoperable standards and a comply or explain policy in favour of free open source software over proprietary alternatives.

Like Paulo Neves, Beatrice Covassi, deputy head of the Data Value Chain unit at the European Commission, used her speech to emphasise the importance of community building. In that context there is a clear need to foster open data policies and to develop an adequate skills base of data professionals.

To strengthen Europe's Big Data community and help lay the foundations for the thriving data-driven economy of the future, the EC signed a Memorandum of Understanding for a Public-Private Partnership (PPP) on Big Data. The EC has earmarked over €500 million of investment that private partners from industry are expected to match at least four times over. To prepare this process, meetings take place on a regular basis in the PSI Expert Group, which consists of representatives from the EU Member States. The EC also provided guidelines on charging, data sets and licences. Open data should generally be accessible and available for all, at zero or at very low cost. Because materials in national museums, libraries and archives now fall within the scope of the revised Directive, there will also be access to more exciting and inspirational content.

Beatrice referred to the Open Data 500 project in which she was involved. The Open Data 500 is the first comprehensive study of U.S. companies that use open government data to generate new business and develop new products and services.

A European example of engaging the private sector in the implementation of open data policies is the Open Data Users Group established as part of the UK's Open Data strategy. Among other things, it helps to build business cases on how additional government funding for the free release of data should be prioritised. Another example is the Spanish government's regular assessment of the impact of PSI re-use on the national market and its dialogue with private sector data re-users. It would be good to see more such engagement with open data re-use in commercial contexts across the EU-28.

The subject matter for the Lisbon workshop, encouraging open data usage by commercial developers, was also the one chosen for the 2014 Open Data Day in Flanders. This was the third annual event in the series and its organiser, Noël Van Herreweghe, used it to ask the open data community what their expectations and recommendations were with respect to things such as the relevance of defined open data policies, the availability of data feeds, standards, challenges, opportunities etc. From that event, a long list of conclusions was drawn up.

CORVe's Noël Van Herreweghe delivering his plenary talk

In summary: we need a reality check. Open data, even when freely available, is not free to use, since so much time has to be spent cleaning it up, converting it, integrating it and maintaining it. There is a marked difference in approach between government and business: government provides long-term investment and slow innovation, while the opposite is true for business, especially activists. Start-ups innovate quickly but are rarely in it for the long haul.

Businesses prefer stable, complete data to simply 'more data' and demand Service Level Agreements. And greater stability can only come from a uniform legislative framework.

This last point was picked up in several of the LAPSI sessions (the LAPSI project focuses on the legal aspects of the PSI Directive and so complements Share-PSI). To what data do citizens actually have a right of access? It's commonly agreed that personal information should only be available to the individual concerned, but the legal barriers to releasing some data, such as tax data in the Czech Republic, are mixed up with privacy issues that only specialists can understand. Therefore the decisions about what data should be made available are often being taken by a very small number of individuals. What's needed is a mixture of legislation and a general culture that is inclined towards sharing. This supports the view from the session on incubators that feedback from users, including businesses, is essential if the benefits of actively making PSI accessible are to be realised.

In their paper and presentation, Marc de Vries (The Green Land) and Georg Hittmair (Compass/PSI Alliance) set out the context and original expectations for PSI. Experience has shown that the results are different from those expected and a lot harder to measure. The value is non-linear and often non-monetary, which makes it hard to measure and correlate.

Graph showing three phases of open data investment, beginning with an initial loss and ending with a positive outcome
The economic effects and their time span. Only Denmark is already on the right hand side of this graph, having released all its geographical and cadastral data.

They made the case for a stable framework and new business models that include:

  • a clear value proposition;
  • a marketing model;
  • a profit model.

Marc and Georg were among the many workshop participants to argue that the demands of users need to be recognised and responded to. If a user requests a dataset, the chances are they have a business model behind the request. Policies need to be in place that take proper account of privacy concerns with a suitable redress mechanism in place to settle disputes. This will only happen if (potential) businesses are made aware of the commercial re-user rights rather than data being published as if it's a gift of ill-defined merit.

A number of key suggestions for improvements were offered:

  • strict national provisions regarding charging;
  • Service Level Agreements to secure investments;
  • standardised interfaces to reduce developers' efforts;
  • liability clauses, helpful in many cases;
  • no restrictions on distribution channels (resellers …).

App contests are all well and good but the outcomes are rarely based on sound business models and it's hard to avoid the suspicion that the public sector body is looking for a diversion from publishing the really valuable data.

Hackathons

Perhaps surprisingly for a workshop about encouraging commercial use of PSI, there was only one session dedicated to the subject of hackathons. There was general recognition that these events often don't lead to anything as businesses generally have little interest in the applications developed – they develop their own applications under very different circumstances.

view from the back of the small auditorium - we see a very steep rake of seating with distant figures
Alberto Abella & Emma Beer (OKF), Amanda Smith (ODI) and Simon Whitehouse (Digital Birmingham) led the session on events, hackathons and challenge series - stimulating open data reuse. This was one of the sessions held in the Anfiteatro (amphitheatre)

Success stories from hackathons begin with the problem, not the data. An example comes from Scotland where the problem was stated as: how do we improve health in our country, given the constraints on spending and an ageing population? A blueprint for a new system was created at a hackathon that then turned into a successful, government-backed open source ecosystem, with the prototype developed by an ex-nurse who happened to be able to code. Challenges were the quality of the data and the willingness of certain government entities to open up. There were clear benefits for all, which explains the success of this project. One way to encourage this in future is to have one or more domain experts present at hackathons.

Often though, hackathons start with the data and 'the use of open data' is not a priority for investors. It's not just data publishers and businesses who have different agendas. Many hackers see hackathons as a social occasion, whether or not they do any coding. It is for these reasons that what often look like really good ideas for bringing investors and project leaders together may have disappointing results. Hackathons are, however, good for proofs of concept and to demonstrate to people within organisations that open data has a value and that open data programmes have internal credibility.

Business Models

INSIGHT's Sanaz (Fatemeh Ahmadi)

The importance of business models was highlighted in a session led by Clemens Wass of openlaws.eu and Fatemeh Ahmadi, Insight Centre for Data Analytics. Repeating a lesson from the hackathon session, the world view held by the public sector is different from that of the commercial sector. Although public sector bodies should not be concerned with, nor try to influence, what their data is used for by others, they do need to understand the kind of business models that exist around their data.

At what point does the public sector compete with the private? Publishers in Austria complained when government legal data was made available for free and a period of adjustment followed where it was understood that, while the data was available for free, services built on top of that data could be provided by the commercial sector.

Different business models apply to different types of data. For example, transport data is very different from legal data.

In the session run by Ingo Keck of the Centre for Advanced Data Analytics Research, participants worked to sketch out a number of different business plans. In each case there were three key questions to answer:

  1. What is the need that the business fulfils?
  2. What is the market?
  3. How is your business unique?

Two very different businesses were discussed. One centred on data quality services – adding value to data by cleaning, standardising, describing and linking the data and then selling it as a service (Data-Publica is an example of such a service). The other, 'Know Your Neighbourhood,' would offer information about services available in a given area that might be useful to residents, businesses looking for the best location etc. Consistent availability and openness of the data were found to be crucial for business development.

Both of these ideas would be classified as infomediary companies in Spain, the topic of two studies into the sector presented by Dolores Hernandez of the Ministry of Finance and Public Administration. Although her remarks were prefaced with the caveat that the two studies, conducted in 2011 and 2012, were not scientifically rigorous, the slides include many interesting statistics that are at least indicative. For example, it is estimated that the infomediary sector employs around 4,000 people in Spain and generates up to €550M per year directly from infomediary activities (around half of the relevant companies' turnover).

UVT's Daniel Pop in full flow …

The most valuable PSI concerns geographic/cartographic data and company data and, as well as companies and self-employed individuals, 65% of infomediary companies cite the public sector as a client. In other words, they are deriving value from processing and selling PSI back to the public sector. Fully one third of infomediaries have overseas clients, so PSI can be seen as an export earner too. Payment per access is the most common revenue model, employed by more than half of Spanish infomediaries. Processed data and generic reports are the dominant products, more than 60% of which are delivered as PDF documents.

Dolores ended her presentation with a list of demands from the infomediary sector to increase the re-use culture:

  • increased coordination and clear leadership by public administrations;
  • recognise the differences between Spain's autonomous regions to ensure a common market;
  • better regulation through modification of existing rules as well as new ones;
  • culture change should be seen as a mechanism for collaboration, not confrontation.

Dietmar Gattwinkel, who heads Saxony's Open Government Data project, led a session dedicated to infomediaries. Again, it's the geo and business data that is the most commercially viable: things like the number of residences and residents in an area and planning data; dates of incorporation, company size, turnover etc. For these areas there is already a competitive market between multiple players.

The discussion in that session looked at the boundary between the public and private sectors, how political aims interact with business needs, and what role data portals should, and do, play compared with data aggregators, visualisations etc. These boundaries and roles need to be more clearly defined, perhaps to encourage the commercial development of brokerage services and applications, which might otherwise distract from the more fundamental task of providing standardised, DCAT-based access to repositories, perhaps over a Content Delivery Network. This perhaps conflicts with, for example, the Portuguese plans and highlights the lack of clarity over what role is to be performed by which sector.

Another topic was the increasingly common notion of moving computation to where the data is rather than moving the data. A search engine is an example of this. You send a small amount of data (the query), the computation is done in the cloud and the results are delivered back to you. This offers a possible route to addressing privacy concerns as control remains within the service.

A final topic in the session centred on privacy, an issue that shows distinct differences in different countries. Privacy is often used as a smokescreen for other motivations not to publish PSI but in reality it's about granularity.

Linda Austere, Michele Osella waiting for the signal from Phil Archer to encourage people to come to their session (Xenia Beltran and Nikolay Tcholtchev are hidden from view)

From Wow to How?

Michele Osella from the Istituto Superiore Mario Boella led a session that asked how we go from the wow – the billions of Euro cited as the potential of open data in various consultants' reports – to the how, that is, how to realise that potential. There were three primary conclusions:

  1. there is a need to educate potential entrepreneurs about what open data is and is not;
  2. access to data is, of course, essential;
  3. data must be maintained in terms of quality, frequency of update, formats and licences.

The second and third of these are inherited from upstream, i.e. from the relevant public sector body that holds the data. If there is no demand for the data, the public sector can carry on not publishing it and no one will notice, so requests for data need to be made clearly by citizens.

The cost of publishing PSI is rarely recouped by the publishing public sector body. Where PSI is published, it is often done out of obligation or even because it is currently fashionable. Ultimately we need evidence to show policy makers that opening data vaults is not merely a cost but brings benefits. Opening data will be beneficial for governments themselves: no more open data as a fad or an obligation, but open data as a necessity.

Michele developed these ideas further in a joint bar camp session with Paolo Dini of the LSE. The discussion centred around the notion of a non-capitalist market, a different socio-economic model that can encourage collaboration between the public and private sectors and that can support both equally. This would reinforce constructive interaction between social and economic spheres, and democratic participation and trust. It might even include the creation of a new type of non-commodity money (a zero-interest mutual credit system) with broad participation by all stakeholders. Another bar camp session on overcoming resistance to publishing data, led by Cristiana Sappa and Muriel Foulonneau, focused on the cultural heritage sector but the same conclusion could apply to the broader public sector: institutions should calculate the full cost of selling data – in many cases they cannot make enough money by selling data to cover the costs – and compare it with the costs and benefits of sharing the data.

Ways of increasing user involvement were among the topics discussed at a bar camp led by Peter Winstanley (Scottish Government), Jan Kucera (University of Economics, Prague) and Harris Alexopoulos (University of the Aegean). It was agreed that users – i.e. the broader community – need not be involved in making the original data available, but can make a significant contribution to its description through additional metadata, tagging etc. They might also be involved with transforming the data into different formats. In each case, there needs to be a distinction between the source data and the related user-generated content.

Incubators and Accelerators

Mateja Presern (Slovene Government) has good news for Share-PSI Project Coordinator, ERCIM's Philippe Rohou

During her speech, Beatrice Covassi mentioned a new EU initiative: the upcoming European Open Data Integration and Reuse Incubator for SMEs “… to foster the development of open data supply chains. It strives to attract the participation of European companies willing to contribute their own data assets as Open Data for experimentation or to integrate open data with their own private data as the basis for innovative applications. This is a very promising avenue. All in all, open data can be used to launch commercial and non-profit ventures, to do research, to take data-driven decisions, and to solve complex problems.”

FINODEX project coordinator, Zabala's Miguel Garcia

The Lisbon workshop included a session led by Miguel Garcia (Zabala), who coordinates a similar accelerator programme, FINODEX, based on FIWARE and open data. Support for entrepreneurship is provided through funding, training and mentoring programmes, all of which enable publishers to see the possibilities offered by opening their data. The selection process is based on the proposed business model, ensuring that sustainable businesses are created rather than ideas that, however good, lack a long-term future. The success of an accelerator can be measured through the number of proposals received, the quality and sustainability of the funded projects, the kind of data re-used and the amount of private funding attracted.

The FINODEX accelerator project has a good deal in common with the Open Data Institute's start-up programme. So far 16 start-ups have been supported with an emphasis on long term sustainability and the sharing of experience. The process includes a good deal of data processing to turn raw, messy data, perhaps PSI published in PDF documents, into clean, usable data that can be analysed and visualised. It's notable that using government data has helped the ODI to identify inefficiencies within the public sector such as delays in the tender procedure. This benefit of open data was highlighted during Michele Osella's session From Wow to How as one of the possible situations where public sector bodies could see a tangible return on the investment made in publishing their data.

The wide ranging support for start-ups, including the culture and training, clearly makes a difference. OpenCorporates, Spend Network, Mastodon C and Open Utility are examples of successful companies supported by the ODI. In comparison, a year after three open data start-ups were awarded funding in Gijon through a much simpler programme, only one is still in operation.

Working with established companies, and highlighting the successful start-ups and other businesses that use open data, helps them to understand the potential benefits. The promotion of best practice across Europe can help data harmonisation and scale; the ODI points to their certificates as a guide. Doing this creates an evidence base that encourages further PSI provision and solutions to common problems. These ideas are not limited to government data; the same is true for the cultural heritage sector as discussed in the COOLTURA session led by Xenia Beltrán Jaunsaras and the bar camp led by Cristiana Sappa and Muriel Foulonneau.

The issues in the cultural heritage sector are the same as elsewhere: the reluctance to publish can only be overcome with a succession of demonstrations not just of potential but of actual value for publishers and users alike. Copyright issues around cultural heritage objects vary enormously and, although libraries have long been used to sharing metadata, museums have a variety of funding models and are often more sceptical. Being able to track data usage was also raised at the bar camp: often a simple 'thank you for the data' is enough, but perhaps the music industry's tracking of usage of its material could be an inspiration?

A possible solution to modelling a business derived from high-level business requirements is the TOGAF® architecture methodology from The Open Group. It can be used to create a complete description of a business problem, both in business and in architectural terms, that enables individual requirements to be viewed in relation to one another in the context of the overall problem. It takes a business process, application, or set of applications that can be enabled by an architecture, and describes the business and technology environment, the people and computing components (called "actors") who execute the scenario, and the desired outcome of proper execution. Without such a complete description to serve as context, the business value of solving the problem is unclear, the relevance of potential solutions is unclear, and there is a danger of solutions being based on an incomplete set of requirements that do not add up to a whole problem description.

Arnold van Overeem speaking on behalf of the Open Group

On behalf of The Open Group, Arnold van Overeem trailed a new standard under development, Open Platform 3.0™, that builds a common architecture environment on top of the Web. It's designed to overcome typical stakeholder concerns such as:

  • the compulsory use of business registers;
  • government imposed deadlines;
  • transparency of administrative decision making;
  • protection of privacy.

Open Platform 3.0 will use an updated UDEF™ standard as an enabler for semantic interoperability. This is a technology-neutral standard that can be encoded in many ways, including RDF.

Infrastructure

An issue highlighted by the FINODEX session and in the Flanders Open Data Day conclusions, among others, concerns infrastructure. A technical infrastructure, such as FIWARE, can support multiple businesses and provide the kind of service level needed if businesses are to rely on the data and related services. Workshop hosts AMA see the future of the Portuguese data portal as a data broker, a provider of data-centric services as much as data. This can be used by several platforms and government Web sites, and as a way to present information about services rather than a simple catalogue. In his session Model-Driven Engineering for Data Harvesters, Nikolay Tcholtchev of Fraunhofer FOKUS explained his ideas around metadata harvesting as a means of increasing the discoverability of data in different portals.
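The idea of harvesting metadata from many portals into one index can be sketched in a few lines. This is a hypothetical illustration, not the FOKUS or COMSODE implementation: the portal names and record fields are invented, with plain dicts standing in for DCAT entries, but it shows the essential step of de-duplicating datasets that appear in more than one portal.

```python
# Hypothetical sketch of metadata harvesting: merge per-portal catalogues
# of dataset records into one index, keyed on the dataset identifier, so
# the same dataset listed in several portals appears only once.

def harvest(catalogues):
    """Merge {portal: [records]} into one identifier-keyed index."""
    index = {}
    for portal, records in catalogues.items():
        for record in records:
            entry = index.setdefault(record["identifier"], {
                "title": record["title"],
                "portals": [],
            })
            entry["portals"].append(portal)  # remember where it was found
    return index

# Invented example data: two portals, one dataset listed in both.
catalogues = {
    "portal-a": [{"identifier": "ds-1", "title": "Road traffic counts"}],
    "portal-b": [
        {"identifier": "ds-1", "title": "Road traffic counts"},
        {"identifier": "ds-2", "title": "Company register extracts"},
    ],
}

index = harvest(catalogues)
print(len(index))                        # 2 distinct datasets
print(sorted(index["ds-1"]["portals"]))  # ['portal-a', 'portal-b']
```

A real harvester would of course fetch DCAT documents over HTTP and cope with conflicting metadata; the point here is only the aggregation step that makes data in many portals discoverable from one place.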

Miroslav Konečný, ADDSEN/COMSODE

One of the major new European initiatives in opening data is the pan-European Open Data portal. The main idea is to build a portal of portals for Open Data to increase synergies and the creation of value. The metadata repository of the pan-European Open Data portal will be an entry point to the more than 70 Open Data portals throughout Europe. The first operational version of this portal is foreseen by the end of 2015. Both the COOLTURA and COMSODE projects include metadata harvesting too. In the latter case, a new open source platform is being developed, Open-Data-Node. This is being used by the Slovak open data portal which, unusually, offers Service Level Agreements and 'certified data' as well as data enhancement tools.

Collecting feedback and crowd-sourcing information about data quality requires additional infrastructure which, of course, increases cost and complexity. This is not always seen as a realistic prospect; however, the Gov4All project is all about providing tools for collaboration, including a mechanism for rating datasets. In the Slovak portal, government employees need to be certified to post data or comments, while users can remain anonymous. Making government representatives identifiable is seen as an important aspect of trust.

In the Open Data Life Cycle and Infrastructure bar camp session, the observation was made that it would be helpful to abandon infrastructure that has its roots in the 20th century and adopt infrastructure developed for the 21st century that better satisfies the needs of publishing data on the Web. This suggests a revolutionary, not evolutionary, approach that the public sector in particular finds hard.

Multilingualism and Location

One of the projects that ran multiple sessions during the workshop was the LIDER project. The re-use of PSI is strongly encouraged if the data is of good quality and semantic conflicts have been resolved before publishing. General information can be combined with domain-specific data and metadata using standardised, linked data ontologies and established terminologies. Such resources are more easily processed by machines and ease discovery and consumption of PSI by the human end user. Language and/or locale are critical to many applications. Is 10.000 exactly ten or ten thousand? Is 'red' a colour (in English) or a network (in Spanish)? Providing multilingual (meta)data can resolve such ambiguities.
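The "10.000" ambiguity is easy to demonstrate. The sketch below (the function name and separator convention are illustrative, not from any particular library) parses the same string under two locale conventions and gets answers three orders of magnitude apart, which is why metadata should state the locale explicitly:

```python
# The same string "10.000" means ten where "." is the decimal separator
# (English convention) but ten thousand where "." groups thousands and
# "," marks the decimal point (e.g. German or Portuguese convention).

def parse_number(text, decimal_sep):
    """Parse a numeric string given the locale's decimal separator."""
    if decimal_sep == ".":
        return float(text)
    # Here "." is a thousands separator and "," the decimal point.
    return float(text.replace(".", "").replace(",", "."))

print(parse_number("10.000", "."))  # 10.0     (English reading)
print(parse_number("10.000", ","))  # 10000.0  (German/Portuguese reading)
print(parse_number("3,14", ","))    # 3.14
```

Without locale information attached to the data, a consumer can only guess which reading was intended.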

Raquel Saraiva (DGT), Ingo Simonis (OGC) and Adomas Svirskas (Advisor to the Lithuanian Cadastre, President of Lithuanian Software and Services Cluster) leading the session on The Central Role of Location

Location, or rather, how location is expressed, is equally important, and the workshop heard from OGC's Ingo Simonis and Raquel Saraiva (DGT) that there is no shortage of standards. However, this variety itself presents a problem – which standards and vocabularies should be used? Two widely used 'standards' actually aren't formal standards at all: GeoJSON is widely respected and massively used but is a community effort, and the Shapefile format is proprietary, developed by a single company (Esri). Does this matter? To many the answer is no, but in government situations it might. Google Maps is a proprietary base map in one reference system while national base maps use a variety of coordinate reference systems, and so on.

The call is for some best practices on vocabulary and modelling choices, perhaps with profiles of different standards for different situations. In this way geospatial data can be more easily used with other data to enhance the value of both. Initiated by the SmartOpenData project, W3C and OGC are in close collaboration to achieve exactly this.
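To make the geospatial discussion concrete, here is a minimal GeoJSON feature built with nothing but the standard library. The coordinates and property values are invented for illustration; the structural points are real: GeoJSON orders coordinates as [longitude, latitude] and assumes a single reference system (WGS 84), whereas national base maps use a variety of coordinate reference systems.

```python
import json

# A minimal GeoJSON Feature: a point roughly at Lisbon.
# Note the [longitude, latitude] ordering and the implicit WGS 84
# reference system -- data in a national CRS must be transformed
# before it can be combined with GeoJSON from other sources.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-9.1393, 38.7223]},
    "properties": {"name": "Workshop venue"},  # invented property
}

encoded = json.dumps(feature)   # serialise for publication
decoded = json.loads(encoded)   # what a consumer would read back
print(decoded["geometry"]["coordinates"][1])  # latitude: 38.7223
```

The simplicity of this structure is a large part of why GeoJSON became so widely used despite not being a formal standard at the time.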

The workshop included plenty of time for networking. In the foreground Peter Krantz and András Micsik share a table with Heike Schuster-James and Valentina Janev, soon to be joined by Mª Dolores Hernandez Maroto

Licensing

The LAPSI track was the focus of much discussion about licences. Intellectual Property Rights were introduced to allow organisations and individuals to profit from their ideas and so were designed as an enabler. Today they are often seen as a barrier. The law on IP varies significantly across the EU-28. Some countries treat databases differently from other PSI, for example. The big question is whether copyright applies to PSI or not.

The session led by Freyja van den Boom of KU Leuven included many examples of the differences in approach. In Ireland, commercial exploitation of PSI is simply not allowed. Finland has a licence that is a translation of CC0, but is not technically a CC0 licence. The situation in Latvia is very confusing: no datasets available from the National Library have associated licences, but some are covered by specific laws that declare the data to be open – although it is not clear whether such openness extends beyond the country's borders, and so on.

The widely used Data Catalogue Vocabulary, DCAT, re-uses a lot of the Dublin Core metadata set but doesn't include properties for the sort of fine-grained machine-readable details that are a minimum requirement if machines are to be able to detect and process licences. Two possible vocabularies exist for this, however:

  • the Open Digital Rights Language (ODRL), which is used in Spain (among other contexts);
  • the similarly named but different Open Data Rights Statement vocabulary (ODRS), developed by Leigh Dodds on behalf of the Open Data Institute.

The session agreed that standardisation of such vocabularies would be beneficial.
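The value of machine-readable licence statements can be sketched as follows. This is a simplified, hypothetical example (the dataset titles are invented and plain dicts stand in for full DCAT/ODRS descriptions, with the licence given as a URL under a "dct:license" key): a client can mechanically separate datasets that link a known open licence from those with no licence statement at all.

```python
# Licence URLs a (hypothetical) client recognises as open.
OPEN_LICENCES = {
    "http://creativecommons.org/publicdomain/zero/1.0/",
    "http://creativecommons.org/licenses/by/4.0/",
}

# Simplified dataset descriptions; the second states no licence,
# so a cautious re-user must treat it as not safely re-usable.
datasets = [
    {"dct:title": "Bus timetables",
     "dct:license": "http://creativecommons.org/publicdomain/zero/1.0/"},
    {"dct:title": "Address file"},
]

def openly_licensed(dataset):
    """True only if the dataset links a licence the client knows is open."""
    return dataset.get("dct:license") in OPEN_LICENCES

usable = [d["dct:title"] for d in datasets if openly_licensed(d)]
print(usable)  # ['Bus timetables']
```

No amount of filtering logic helps if the licence exists only as prose on a web page; attaching it to the dataset description in a standard vocabulary is what makes this automation possible.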

Licence interoperability was an issue at another LAPSI session led by Antigoni Trachaliou (Greek National Documentation Center) and Leda Bargiotti (PwC EU Services). If an openly licensed dataset includes another dataset that is not openly licensed, who is liable, the publisher or the re-user? An example of this would be the UK Address file which is 'closed data' but often included in openly licensed datasets.

It is the responsibility of the data provider to make sure they have the necessary IP rights in third-party data, but are they aware of this? Licences cannot solve all the problems: good IP management, combined with education and awareness raising, is essential.

The session concluded with a number of recommendations to promote the re-use of PSI:

  • limit the number of licences and allow commercial reuse;
  • ensure interoperability between licences;
  • be sure that what you licence as open data does not include third party rights;
  • where there has been improper clearance in terms of copyrights, re-users' liability should be limited;
  • attach a licence to datasets and make licences machine readable.

These steps will provide legal certainty, increase legal interoperability and lower costs.

Away from the LAPSI track, the session on open data start-ups led by Amanda Smith & Elpida Prasopoulou (ODI) and Martin Alvarez-Espinar (CTIC) listed “publish open data with clear licenses” as its first answer to what should be done to promote the publication and re-use of PSI.

Although the Directive mandates the provision of PSI, it doesn't mandate that this be overseen by a regulator, since this is outside the competence of the EU. A dispute resolution mechanism between PSI publishers and re-users is therefore undefined and depends on a member state's existing laws and transposition of the Directive. Member states may create a regulator or assign PSI regulation to an existing one, but such a network of regulators would need coordination. This situation is complicated further in countries with decentralised systems.

Nevertheless, the provision of a redress mechanism, one able to make binding decisions on public sector bodies, would represent a mechanism to challenge a Public Administration over denials of re-use.

A row of 11 people on stage waiting to make their bar camp pitches
The line-up for the bar camp pitches, from left to right: Heike Schuster-James, Michele Osella, Jan Kucera, Deirdre Lee, Cristiana Sappa, Paolo Dini, Peter Krantz, Miguel Garcia (hidden behind Phil Archer), Peter Winstanley (hidden behind time keeper Steinar Skagemo), Muriel Foulonneau (exit stage left).

Additional Bar Camp Topics

Miguel Garcia led a bar camp discussion of an Open Data Exchange Programme as a bottom-up approach to connect communities in the Open Data field. This idea of an 'Erasmus programme for Open Data' was well received and seen as a way to connect isolated communities in Europe.

Muriel Foulonneau (Henri Tudor Research Centre) asked whether the emphasis on RDF, i.e. the 5 stars of Linked Open Data, was a help or a hindrance to commercial re-use of PSI, while Peter Krantz asked “why is standardisation so difficult?” The session attracted a large crowd and much debate, but the conclusion was that data should be self-descriptive (whatever the format) so that applications can automatically display the data in human-readable form. This is best achieved using standard RDF vocabularies such as SKOS and FOAF, but also XML schemas such as SDMX and XBRL. A feedback loop to publishers from developers and data users, such as journalists, is essential. One possible solution to the 'marketing problem of RDF' might be a standardised graphical notation similar to UML – with appropriate tooling.
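To make the self-descriptiveness point concrete, here is a minimal sketch (not from the workshop itself): if published data uses well-known labelling properties from standard vocabularies such as SKOS, FOAF or RDFS, a generic application can pick a human-readable display name without any format-specific code. The example record and function names are hypothetical.

```python
# Well-known labelling properties from standard vocabularies. A generic
# client checks these in order of preference.
LABEL_PROPERTIES = [
    "http://www.w3.org/2004/02/skos/core#prefLabel",  # SKOS
    "http://xmlns.com/foaf/0.1/name",                 # FOAF
    "http://www.w3.org/2000/01/rdf-schema#label",     # RDFS
]

def display_label(resource: dict) -> str:
    """Return the first label found via standard vocabularies,
    falling back to the resource's identifier."""
    for prop in LABEL_PROPERTIES:
        if prop in resource:
            return resource[prop]
    return resource.get("@id", "(unnamed)")

# A hypothetical record using a SKOS preferred label:
record = {
    "@id": "http://example.org/dataset/42",
    "http://www.w3.org/2004/02/skos/core#prefLabel": "Municipal budgets 2014",
}
print(display_label(record))
```

Because the labelling convention is shared rather than invented per dataset, the same few lines of client code work across any publisher's data, which is exactly the benefit the session's conclusion points to.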

Conclusions

There were a total of 31 sessions or presentations and well over 200 registered participants – and this report has not highlighted the two sessions held in Portuguese. Any bullet-point summary of such a substantial exchange of expertise will necessarily miss a lot of detail; however, these appear to be the most repeated themes.

  • There is a lack of knowledge of what can be done with open data which is hampering uptake.
  • There is a need for many examples of success to help show what can be done.
  • Any long term re-use of PSI must be based on a business plan.
  • Incubators/accelerators should select projects to support based on the business plan.
  • Feedback from re-users is an important component of the ecosystem and can be used to enhance metadata.
  • The boundary between what the public and private sectors can, should and should not do needs to be better defined to allow the public sector to focus on its core task and businesses to invest with confidence.
  • It is important to build an open data infrastructure, both legal and technical, that supports the sharing of PSI as part of normal activity.
  • Licences and/or rights statements are essential and should be machine readable. This is made easier if the choice of licences is minimised.
  • The most valuable data is the data that the public sector already charges for.
  • Include domain experts who can articulate real problems in hackathons (whether they write code or not).
  • Involvement of the user community and timely response to requests is essential.
  • There are valid business models that should be judged by their effectiveness and/or social impact rather than financial gain.

Acknowledgements

The Share-PSI team wishes to record its thanks to the staff at the LNEC, the venue for the event, for making sure everything was in the right place at the right time, and to our hosts, AMA, especially André Lapa, for their extraordinary work in making the event such a success.