MinutesOfMeeting 20220119
DPVCG Meeting Call 19 JAN 2022
Attendees
- Present
- :, beatriz, harsh, jan, julian, mark
- Regrets
- paul
- Chair
- harsh
- Scribe
- harsh
Contents
- Jurisdictions - resolving proposed concepts
- Data categories in DPV
- Personal Data Handling in DPV
- Sensitive Data and Special Data
- Data Source and Processing
- Next Meeting
Meeting minutes
Jurisdictions - resolving proposed concepts
No comments or objections raised. Proposed concepts accepted.
Data categories in DPV
We have moved personal data categories to dpv-pd
extension with the namespace/IRI as `https://w3.org/ns/dpv-pd%60 to be redirected to `https://w3c.github.io/dpv/dpv-pd%60. This includes renaming PersonalDataCategory
to PersonalData
.
Some categories have been retained in DPV (main). These include: Sensitive, Special, Derived categories.
New proposals include InferredPersonalData
as subclass of Derived
.
In addition, to indicate something is not personal data the concept NonPersonalData
is proposed along with the umbrella/parent concept Data
to represent both personal and non-personal data.
PseudoAnonymousData
as a subclass of PersonalData
and AnonymousData
as a subclass of NonPersonalData
are proposed.
Jan and Julian raised concerns about NonPersonalData
being combined with other data to become PersonalData
. A note is to be added to the concept and its usage to illustrate caution in such usage and to clarify that non-personal is a 'tag' or 'label' to declare something is not personal data e.g. after anonymisation.
---
The reasons for inclusion of Data
as a concept include 1) as a semantic parent of PersonalData
and NonPersonalData
; and 2) to specify something is just data i.e. it is neither personal nor non-personal e.g. if it is unknown whether something is personal data.
jan: Are or could the existing categories provided by DPV be used to identify if something is personal data. For example by looking up whether a category is present in the list, if yes, its personal data.
harsh: Yes, this is possible. Though the list is neither authoritative (a category in DPV is not always personal data e.g. organisation email) nor exhaustive (not all possible categories are represented).
jan: Is the identifiability associated with data also represented? Or how could this be represented?
harsh: We don't provide a way to currently specify how personal data is related to an identity whether implicitly or explicitly. This is complex to do because identity could be an identifier present in the data itself, present externally and associated with the data, or the data could be used to re-identify an individual either by itself or in combination with other data.
harsh: The use and definition of PII as a term, whether under ISO or NIST is relevant here. As there is discussion on whether the ISO term PII is equivalent to Personal Data under GDPR. There are arguments that it is not.
harsh: Until we have a clear authoritative answer and a good working definition, we refrain from representing PII separately as a concept.
julian: So how would PII fit within DPV concepts? (paraphrased)
harsh: PII
is either equivalent to PersonalData
, in which case we don't model it, or it is a subset of PersonalData
, in which case we add it as a subclass.
---
For relevance, the definitions we refer to are listed here
GDPR → A.4-1 ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cult
ural or social identity of that natural person;
ISO → 29100:2011 2.9 personally identifiable information PII any information that (a) can be used to identify the PII principal to whom such information relates, or (b) is or might be directly or indirectly linked to a PII principal
NIST → Personally Identifiable Information; Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.
Personal Data Handling in DPV
Proposal for a property hasPersonalDataHandling
to indicate applicability or inclusion of PersonalDataHandling
with another concept.
This is useful to nest instances of personal data handling e.g. to distinguish between different legal bases or recipients. Example includes a personal data handling instance associated with a policy that further contains additional personal data handling instances for different legal basis.
Additionally, the personal data handling property can also be used to link a personal data handling with concepts such as DataSubject
or Purpose
for non-conventional modelling of policies. For example, to indicate data subjects (employees) are associated with a specific personal data handling instance.
julian: Does this mean the property hasPersonalDataHandling
can be (effectively) used as an inverse property of hasDataSubject
which links PersonalDataHandling
to DataSubject
?
harsh: Yes, this is one of the possible uses of this.
Sensitive Data and Special Data
We have SensitivePersonalData
which is a subclass of PersonalData
and then SpecialCategoryPersonalData
which is a subclass of SensitivePersonalData
.
Definitions for these are as follows
Sensitive → That which requires additional considerations, measures, or protection
Special → That which is sensitive, prohibited from processing, and which requires additional (separate) legal basis for use.
This distinction helps in identifying data that should be handled cautiously (e.g. location traces) from that which is special category as defined under e.g. GDPR.
beatriz: in PROTECT we have the question whether DPV will be providing a list of sensitive personal data categories ?
harsh: If someone provides or we have an authoritative source for what is sensitive then yes we will provide them, similar to special categories from GDPR.
Data Source and Processing
Proposal for indicating a DataSource
is PublicDataSource
or NonPublicDataSource
.
This is relevant in cases where public data sources have additional or different requirements and legal obligations.
jan: Does public here mean accessible? If so, then how is that accessibility specified?
jan: We also need to know how it applies to e.g. government providing data versus someone's profile being set to be publicly viewable on the internet
harsh: The definition of public and accessible can vary a lot based on jurisdiction, domain, and mediums. So there is no proposal for these at the moment.
jan: DPV should provide tools/concepts for specifying that data collected from public sources should be ensured to be valid (i.e. source is still providing that data publicly) and that it should be deleted after some amount of time.
harsh: This is not for us (DPVCG) to define, but could be from laws or codes of conduct.
harsh: To indicate duration of storage or processing, we have concepts to define these.
julian: Why are the DataSource
concepts in the Processing section?
harsh: For convenience. The spreadsheet and tabs roughly correspond to where that concept will be presented in the DPV documentation. As data sources are related to data collection, they are in the same section i.e. Processing.
jan: When data is reused e.g. from public sources, what should be the legal basis and how is it indicated?
harsh: There's the hasLegalBasis
property. E.g. using a dataset under CC-by license, the license is technically the legal basis for collection/usage of that data, so this makes sense to use.
---
harsh: ISO/IEC 29184 defines four categories of data collection / sources
directly → from data subject e.g. web form
indirectly → from third party e.g. credit agency
observed by controller → from something the data subject does or has but does not directly provide e.g. browser fingerprinting ; this could also understood as 'extracting' i.e. the data is present
inferred by controller → by reusing existing data to produce something that is not present in that data e.g. identifying demographics from non-demographic information (mouse movements)
Proposal is to add these to DPV as collection methods or as data sources. Contributions for these are welcome.
Next Meeting
We will meet again next week as per our regular weekly schedule.
WED JAN-26 13:00 WET / 14:00 CET
We will continue going through the spreadsheet and GitHub issues for resolution of proposed concepts.
Minutes manually created (not a transcript), formatted by scribe.perl version 188 (Sat Jan 8 18:27:23 2022 UTC).