Privacy Workshop Position Paper - The DAP Perspective
Introducing the DAP WG
The application programming interface (API) work that the W3C Device APIs and Policy Working Group (DAP) [1] is chartered to perform represents the largest and most thorough assault on users' privacy ever undertaken by a single working group. A user agent supporting all of the DAP's APIs would enable arbitrary Web pages to access the user's contacts database, personal calendar, local file system, and to-do list, to capture stills, video, and audio from the user's webcam and microphone, and to send emails and SMS messages on the user's behalf.
Application of this technology should enable a large number of new use cases relying on the integration of personal data into services, interactions based on device capabilities, far more fluid information sharing, and a much more immersive experience for the end user. Locally installed applications are likely to become progressively obsolete as the Web acquires the functionalities to reach ever deeper into our digital lives. To put it in today's lingo, if you don't go to the cloud, DAP will bring the cloud to you.
The traditional view on Web privacy is that it is most frequently and patently violated when users voluntarily provide information which is then reused. That view is only partially true (as we shall see below), but even if it were entirely true, the breadth of the new abilities that DAP is heralding and the vastly increased fluidity in sharing information that will likely accompany them risk causing a major increase in the distribution of private information. Furthermore, we should not fool ourselves that public awareness of privacy issues or regulation will, on their own, be sufficient to face up to this challenge. Such integrated services simplify common tasks, increase convenience to the user and provide for richer and more elaborate interactions that users will want. The simplicity of posting a picture or a video blog directly from inside of the page, or of directly sharing one's contacts and calendar with a social network, will beckon, and successfully at that.
Privacy: It's Worse Than Most Think
The new risks [2] involved in opening up new functionality in the way that DAP endeavors to are generally well understood: a Web site might surreptitiously start capturing a video on the user's device, without consent, perhaps in the bedroom or in a private meeting; a rogue Web application could scan the contacts list to create spam or to post private information; a messaging API could be used to generate SMS messages to a premium number, incurring unwanted billing charges for the user. Each of these APIs that enables interesting and useful use cases can also enable abuse cases through misuse.
The types of privacy violations that are in practice today, without DAP, are however not as well understood by the Web community at large. Most Web developers, including those who create Web standards, tend to believe that privacy issues mostly stem from users volunteering information (typically through input forms), which one can simply refrain from doing, or from ad-related tracking, which is routinely blocked. They generally react with surprise (followed, as they realize how easy such things are at the technical level, by dismay) at the fact that most operations that one performs on an arbitrary page are not private. Every movement of the mouse cursor, the speed at which the page is scrolled, whether the user actually finished reading the page, in some cases whether it was bookmarked or printed, or when some content is selected or copied can all be tracked. And in more than a few cases, they actually are, typically by analytics services that are not routinely blocked.
This is further compounded by the fact that the now common buttons added to Web pages that offer "Share on Twitter" as well as "Bookmark", "Print", "Copy", or "Send to a friend" are, for the vast majority of lay users, the preferred way of performing these operations, over the ones provided by the browser or OS. These are very reliably trackable (and tracked). Yet it is not uncommon for Web developers who claim to be privacy conscious to include services such as AddThis [10] in their Web sites.
Likewise, part of the “common lore” in the Web community is that one can disable tracking by blocking cookies from certain sources. As shown beyond doubt by the EFF's Panopticlick study [11], this is very far from true. The unique fingerprint that a user leaves while browsing the Web is, in the vast majority of cases and including when cookies are disabled, good enough for tracking purposes. The study found that the only efficient technique to prevent such tracking is to disable JavaScript entirely. Given the core importance of JavaScript to a great number of services that users do wish to use, it is unlikely that a recommendation to disable it would be followed by any more than an insignificantly small group.
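To illustrate why blocking cookies is not enough, here is a minimal sketch of how a handful of ordinary browser attributes could be combined into a stable identifier. The attribute names, the BrowserTraits type and the choice of a 32-bit FNV-1a hash are assumptions made for illustration; this is not the method used by the Panopticlick study.

```typescript
// Hypothetical set of attributes a script can read without any cookie.
interface BrowserTraits {
  userAgent: string;
  timezoneOffsetMinutes: number;
  screen: string; // e.g. "1280x800x24"
  plugins: string[];
  fonts: string[];
}

// 32-bit FNV-1a hash over UTF-16 code units, returned as 8 hex digits.
// Any stable hash would do for this illustration.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Serialize the traits in a fixed order so that the same browser always
// yields the same identifier, with no state stored on the client.
function fingerprint(t: BrowserTraits): string {
  const parts = [
    t.userAgent,
    String(t.timezoneOffsetMinutes),
    t.screen,
    t.plugins.slice().sort().join(","),
    t.fonts.slice().sort().join(","),
  ];
  return fnv1a(parts.join("|"));
}
```

Because nothing is written to the device, clearing or blocking cookies has no effect on the value such a function returns.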
Steps Towards a Solution
Given that its work on opening up new functionality can have potentially devastating effects, it is only natural that the DAP WG would have in its core mission preventing unwarranted collection, use, transfer and deletion of personal data, and more generally helping protect users' privacy. DAP goals include supporting the user value proposition related to privacy concerns, such as minimization of data provided, limits on redistribution and use of data, etc., as well as the value proposition of Web developers and Web infrastructure providers, including usability and interoperability.
The first step in attempting to enable sharing and using information effectively while limiting its spread and misuse is to make privacy an immediate part of the security architecture. This approach builds on the notion that privacy, like security, should be dealt with systematically, from the initial design ("Privacy by Design" [3]). Privacy is an “intermediate good”, something that people expect to be built-in, much like an engine in a car [4]. Users expect it by default and will be surprised if it is not there. It cannot be bolted on after the fact.
What this translates to at the technical level is that all API calls that request access to potentially sensitive information are designed to be asynchronous. This enables the user agent to acquire the user's consent without blocking the user's action through a modal prompt. It is well acknowledged in the security community that prompts to accept potentially dangerous operations will be systematically acquiesced to by users (in fact, it is known that users do not even read the content of the prompt, even when insulting). By adopting the same approach for privacy, we enable implementations to adopt the best privacy-enhancing strategy available to them, be it non-modal prompting, prearranged policies, or a Web of trust approach.
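A minimal sketch of why asynchronous calls leave room for non-modal consent follows. The ConsentQueue class and its method names are hypothetical and are not taken from any DAP specification; the sketch only shows a request returning immediately while the decision arrives later through callbacks.

```typescript
type Decision = "granted" | "denied";

interface PendingRequest {
  capability: string;
  onSuccess: () => void;
  onError: (reason: string) => void;
}

// Hypothetical user agent component that collects access requests and lets
// the user answer them whenever convenient, e.g. from a non-modal info bar.
class ConsentQueue {
  private pending: PendingRequest[] = [];

  // Returns immediately; the page keeps running while the request waits.
  request(capability: string, onSuccess: () => void, onError: (reason: string) => void): void {
    this.pending.push({ capability, onSuccess, onError });
  }

  // Capabilities currently awaiting a decision, for display to the user.
  awaiting(): string[] {
    return this.pending.map((p) => p.capability);
  }

  // Called once the user (or a prearranged policy) has decided.
  decide(capability: string, decision: Decision): void {
    const current = this.pending;
    this.pending = []; // requests made from inside a callback are kept
    for (const p of current) {
      if (p.capability !== capability) {
        this.pending.push(p);
      } else if (decision === "granted") {
        p.onSuccess();
      } else {
        p.onError("permission denied");
      }
    }
  }
}
```

The same shape accommodates the other strategies mentioned above: a prearranged policy could call decide without ever showing anything to the user.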
The second step is minimization. An application should not be exposed to more information than it needs to function. In the simplest case, this is just a matter of only including in the API what has a solid use case justifying its presence, which is good design to start with. In more interesting situations, this involves making the user aware of which information is being requested, and offering the ability to control it. For instance, an application wanting to show the many fun anagrams that can be generated from one's friends' names could benefit from accessing the user's contacts database (since that prevents having to type them in) but it doesn't need to see the phone or email details. That is precisely the approach taken by DAP's Contacts API: the Web application requests the fields that it needs, and the user gets to choose which people to include, and potentially which fields to exclude (if several were requested).
This negotiated process has multiple advantages. It makes it clear to users what information they are sharing without requiring much additional work for them. It encourages developers to only ask for the information that they actually need, and not go overboard lest they risk making users suspicious. Finally, it helps introduce the notion that justified access and blanket access are different, and can be supported separately.
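The field negotiation described above can be sketched as follows. This is an illustration only, not the Contacts API itself: the Contact and UserChoice types and the shareContacts function are hypothetical names standing in for what a user agent might do between the application's request and the data it finally hands over.

```typescript
type Contact = { [field: string]: string };

// What the user decides in the picker shown by the user agent.
interface UserChoice {
  selectedIndexes: number[]; // which people to share
  withheldFields: string[]; // requested fields the user declined to share
}

// Returns only the requested fields, for only the chosen people,
// minus any field the user decided to withhold.
function shareContacts(
  addressBook: Contact[],
  requestedFields: string[],
  choice: UserChoice
): Contact[] {
  const allowed = requestedFields.filter(
    (f) => choice.withheldFields.indexOf(f) === -1
  );
  return choice.selectedIndexes
    .filter((i) => i >= 0 && i < addressBook.length)
    .map((i) => {
      const shared: Contact = {};
      for (const field of allowed) {
        const value = addressBook[i][field];
        if (value !== undefined) {
          shared[field] = value;
        }
      }
      return shared;
    });
}
```

In the anagram example, the application would request only the name field, so phone numbers and email addresses would never leave the address book regardless of what the user selects.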
Where applicable, another dimension that we have been working on is the standardization of an access control policy framework that supports different rules for various trust domains. This enables declarative rules to be created to directly control the ability to use APIs or obtain information. This framework includes the means to define trust domains as well as a policy language to define access control to device information and services based on those domains. In some cases these rules may be defined by a third party, such as a network operator in a mobile environment, or they may be learned based on user decisions.
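As a rough sketch of how declarative rules keyed on trust domains might be evaluated, consider the following. The TrustDomain and Rule shapes, the first-match-wins ordering and the fallback to prompting are all assumptions made for illustration; they do not reproduce the policy language of the DAP framework.

```typescript
type Effect = "permit" | "deny" | "prompt";

// A trust domain groups origins that should be treated alike.
interface TrustDomain {
  name: string;
  origins: string[];
}

// A declarative rule: what a given trust domain may do with a given API.
interface Rule {
  domain: string; // name of a trust domain
  api: string; // e.g. "contacts" or "messaging"
  effect: Effect;
}

// First matching rule wins; with no matching rule the user is asked.
function evaluate(
  origin: string,
  api: string,
  domains: TrustDomain[],
  rules: Rule[]
): Effect {
  const memberOf = domains
    .filter((d) => d.origins.indexOf(origin) !== -1)
    .map((d) => d.name);
  for (const rule of rules) {
    if (rule.api === api && memberOf.indexOf(rule.domain) !== -1) {
      return rule.effect;
    }
  }
  return "prompt";
}
```

A rule learned from a user decision could be represented in the same form and placed ahead of rules supplied by a third party, although how such sources should be ordered is a design choice this sketch does not settle.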
We have also been investigating a user-mediated approach toward introducing consumers and providers of information to one another, the Powerbox [7]. How it combines with the policy framework is still an open question; it is possible that they may simplify and enhance one another, or on the other hand that they may compete for the same space. Given the complexities involved in addressing privacy issues, a systematic solution may require more than one component, and some promising approaches might prove unsuitable once they are better understood. For this reason the group is exploring and experimenting with many options. Given the breadth and depth of the challenge, the Working Group is seeking as much involvement from the developer and privacy communities as possible.
The DAP WG has reviewed privacy and policy use cases, is developing requirements, and is exploring technical solutions as well as advancing work on contributions that have been made. The WG is aware of the discussions in the W3C Geolocation WG related to privacy and GeoPriv [8], some of which have been summarized in a recent research paper [6]. The DAP WG continues to learn as it works through issues and invites broad community participation, both in reviewing our work and contributing to the WG. All our discussions are open to the public on our public mailing list.
One of the greatest difficulties involved in producing a privacy-respectful Web is giving users insight into, or control over, aspects of privacy such as redistribution, purpose of use and retention. While the usage of privacy policy documents is common, they are consistently too long and too far removed from the immediate user action to be of much use. Inspired by what the Creative Commons did for copyright, DAP has been investigating “Privacy Rulesets” [9] that provide a pithy summary of what a service does with personal information. While we have yet to bind these rulesets to concrete implementation, several ways to capture and expose them to better inform the user are possible, from semantic markup that a user agent can use to icons that are immediately comprehensible.
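To make the idea of a pithy summary concrete, here is a small sketch of a ruleset expressed as data that a user agent could turn into a one-line notice. The three dimensions come from the paragraph above (redistribution, purpose of use, retention); the field names and the allowed values are hypothetical and are not taken from the Privacy Rulesets draft.

```typescript
// Hypothetical machine-readable ruleset declared by a service.
interface PrivacyRuleset {
  sharing: "none" | "affiliates" | "public";
  purpose: "requested-service-only" | "marketing";
  retentionDays: number; // 0 means nothing is kept after the interaction
}

// Turns a ruleset into a short notice a user agent could show at the
// moment data is requested, instead of a link to a long policy document.
function summarize(r: PrivacyRuleset): string {
  const sharingText = {
    none: "not shared",
    affiliates: "shared with affiliates",
    public: "made public",
  };
  const purposeText = {
    "requested-service-only": "used only for the requested service",
    marketing: "also used for marketing",
  };
  const retention =
    r.retentionDays === 0
      ? "not retained"
      : "kept for " + r.retentionDays + " days";
  return [sharingText[r.sharing], purposeText[r.purpose], retention].join("; ");
}
```

For example, a ruleset with no sharing, use limited to the requested service and zero retention would be summarized as "not shared; used only for the requested service; not retained". The same data could equally drive a set of icons rather than text.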
How You Can Help
The DAP WG is making progress on the details of the APIs, has written draft requirements on policy [2] and privacy [5], has a draft policy framework intended to support privacy, a contribution on user-mediated introduction of information providers to consumers, and a first cut at expressing more complicated privacy expectations in a usable manner (Privacy Rulesets). We are actively seeking feedback on these efforts, participation in our work by any interested party, and any suggestions and assistance in helping us make great APIs that enable far more powerful Web applications while consistently enabling privacy protection.
We believe the work of the DAP WG is important to the future of the Web, both for the functionality and usability that it will offer and for the privacy risks that it may create. We need your participation to make this work a success. Come join us.
Notes
- [1] DAP Home Page
- http://www.w3.org/2009/dap/
- [2] Device API Policy Requirements
- http://dev.w3.org/2009/dap/policy-reqs/
- [3] Privacy By Design
- http://www.privacybydesign.ca/
- [4] Wikipedia
- http://en.wikipedia.org/wiki/Intermediate_good
- [5] Device API Privacy Requirements
- http://dev.w3.org/2009/dap/privacy-reqs/
- [6] Privacy Issues of the W3C Geolocation API
- http://escholarship.org/uc/item/0rp834wf
- [7] Powerbox
- http://lists.w3.org/Archives/Public/public-device-apis/2010May/0133.html
- [8] Geographic Location/Privacy (geopriv), IETF
- http://datatracker.ietf.org/wg/geopriv/charter/
- [9] Privacy Rulesets
- http://dev.w3.org/2009/dap/privacy-rulesets/
- [10] AddThis
- http://www.addthis.com/
- [11] Panopticlick
- https://panopticlick.eff.org/