See also: IRC log
<Caroline_> Introductions...
kcoyle: the main goal for our F2F
is to discuss the UCR
... Caroline and I tried to categorize them. If we get to a Use
Case and think it is in another category we just move it
... the idea is to get through all of them even if we
don't get resolutions on all we have listed
<danbri> https://www.w3.org/2017/dxwg/wiki/Main_Page#Working_Documents -> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space & https://www.w3.org/2017/dxwg/wiki/Use_Cases_and_Requirements
kcoyle: if we need we may finish some of them afterwards
kcoyle: the first one we are going to discuss is https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID8
SimonCox: looking at version one
of DCAT there is no ??
... to the extended DCAT part of what we are looking at is part
of dublincore
<danbri> https://www.w3.org/TR/vocab-dcat/#introduction """Data can come in many formats, ranging from spreadsheets over XML and RDF to various speciality formats. DCAT does not make any assumptions about the format of the datasets described in a catalog. Other, complementary vocabularies may be used together with DCAT to provide more detailed format-specific information."""
SimonCox: what is the scope for
DCAT descriptions?
... also dataset
... we recommend the use of existing DCAT recommendations
<danbri> DCAT alludes to http://dublincore.org/documents/2003/02/12/dcmi-type-vocabulary/ """(Dataset) A dataset is information encoded in a defined structure (for example, lists, tables, and databases), intended to be useful for direct machine processing."""
SimonCox: the original Dublin Core
metadata
... the description of the use case is above
... it is clear as well as the requirements "Guidance on use of
dc:type or similar for DCAT records. Recommendation on
content-type vocabularies."
Jaroslav_Pullmann: Is this still
a dataset or is any resource which is not anymore a
dataset?
... I support the dataset
<AndreaPerego> About the different resource types in different metadata standards, I prepared a summary table (incomplete): https://docs.google.com/spreadsheets/d/1nlAgLUGQcBe40oTk5WNCVz-6rud1JtLwjoYyyqAT45U/edit?usp=sharing
Jaroslav_Pullmann: it should be more than separately
s/separetaly/separately
Makx: I am against limiting the scope of what a DCAT dataset is
<AndreaPerego> +1 to Makx
Makx: I am in favor of using vocab to say what dataset is
annette_g: I think the use case
approach should come down to actual use cases
... some of the use cases are questions
... we may consider those as separate questions
LuizBonino: I like the idea of being able to describe different types of information as assets
antoine: it seems this use case is to describe what is the dataset but it can also be understood about the context
alejandra: I think it is
important to discuss the scope of the use cases
... make sure that we provide guidance on the type
... I agree with the Use Case and I think we need to consider
it
<Keith> the problem with using 'type' is that 'type' may be made up of many different attributes
<antoine> Keith++
<Zakim> danbri, you wanted to suggest that ANY collection of 0s and 1s (including empty collection) can be treated as a dataset; "dataset" is about how the data is handled/treated/managed,
Makx: the definition of
dataset
... has to be curated
<alejandra> curated
<roba> seems to me the main thing is not to try to define it now - but to decide if we will maintain (or adopt) a list of types
Makx: I think it is important to clear it up
danbri: I think we agree
... is about the curation of the process around data
<Thomas> +1 for makx and dan
<alejandra> maybe this is useful: software vs data https://github.com/danielskatz/software-vs-data
<Makx> accept
<danbri> [I agree with Makx that being a dataset is around the social context surrounding data, not the data itself]
kcoyle: can we accept the use case ID8 as it is?
Jaroslav_Pullmann: we can just accept it
<annette_g> +1 to Jaroslav
Jaroslav_Pullmann: there are questions that are not stated on the use case
<annette_g> S/can/can't/
Jaroslav_Pullmann: maybe we could check others use case related to see the requirements and descriptions to see if they complete themselves
kcoyle: let's check the use case
ID20
https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID20
... we want to be able to specify a type
... we are probably going to have to point to a small number of
recommended vocabs
... given that, could we vote on the ID8 and ID20 at the same
time?
<SimonCox> The link to parse.insight in the use-case description was unhelpful - I've corrected it
antoine: I think it would be all
right
... maybe SimonCox could explain
SimonCox: there are a lot of
different file types
... they call content type
<roba> dataset type != encoding type - dataset may be exposed in many encodings
SimonCox: there are different formats of media type
antoine: on the web context content type uses media type
<danbri> [media type could be .Z (application/x-compress, LZW) in the case of the Web History collection https://www.w3.org/History/1992/timbl-floppies/TimBerners-Lee_CERN/hype.tar.Z]
<Keith> a problem with the concept of dataset concerns streaming data because of its continuity: is the dataset the whole thing or a defined 'window'
SimonCox: I am talking about
semantic oriented
... the language chosen is certainly conflicted
... talking about content type
<roba> should definitely change "content-type" wording in Use Case
<roba> we are talking about the range of dc:type
<LuizBonino> Is it the "nature" of the dataset instead of how it is serialised, right?
<Thomas> Right; that's how I perceive it also
SimonCox: the Dublin Core
descriptions from 20 years ago recognize datasets which are
images, maps, spreadsheets, etc
... there is a strong sense the images are different
antoine: I accept it
<antoine> https://en.wikipedia.org/wiki/List_of_HTTP_header_fields
antoine: as someone suggested to put a small note saying it
kcoyle: we have mentioned something that was not discussed on the use cases
SimonCox: it says that in the use case
kcoyle: are we at a point where we could vote on this?
Jaroslav_Pullmann: we should merge them
<alejandra> +1 to merge them
<AndreaPerego> Sorry, merge what?
Makx: reminded us that we could merge only the requirements
<Makx> +1 to merging reqs
the use cases ID8 and ID20, AndreaPerego
<antoine> +1 to keeping the use case separated (they were contributed separately) but having the requirements consolidated.
<Jaroslav_Pullmann> +1
<SimonCox> +1 to merge reqs - this will drive DCAT 1.x - keep use cases separate for record keeping
Jaroslav_Pullmann: if we are
looking for audiences we have different ones
... they were not in the discussions. That was my motivation to
merge them
... it might be interesting for researchers to see them
merged
... if we talk about access the question is whether we are talking
about datasets
... we should be talking always about digital access
resources
... the access would be only by protocols
... the definition of data may also be about non digital data.
It can be anything. So we must be sure to be talking about data
accessible
<danbri> [is there anything DCAT can't describe? :]
Thomas: these two use cases could
be about anything
... the discussion about content type and so on is part of
content negotiation
... agree with Jaroslav_Pullmann to merge the requirements
<SimonCox> +1 danbri
Jaroslav_Pullmann: if the purpose
is to have a history we should merge only the
requirements
... sometimes the use cases are very valuable
... it is important to have reports of what we are missing
kcoyle: if you feel there is a use case missing, please create it
AndreaPerego: we should consider including descriptions or resources that are not data
<Zakim> LarsG, you wanted to ask if it's just about to accept or decline use cases
LarsG: I have a metaquestion. are
we discussing the merging and how to proceed?
... we discussed that in a call and agreed to keep the use
cases separated and merge the requirements
... also a catalogue should be considered
<Thomas> Proposal will follow here
<SimonCox> I agree that ID20 partly elaborates ID8, but it is only the requirements arising from these that matters in the end!
PROPOSAL: to accept the use cases ID8 and ID20 as they are
<SimonCox> The use-cases stay on the books so that we can check at the end if the products solve the use-cases
<Makx> +1 to Simon
<antoine> +1
kcoyle: it is up to the group to drive requirements
PROPOSAL: to accept the use cases ID8 and ID20 as they are
<newton> +1
<Thomas> +1
<SimonCox> +1
<riccardoAlbertoni> +1
+1
<alejandra> +1
<kcoyle> +1
<annette_g_> -!
<PWinstanley> +1
<LuizBonino> +1
<AndreaPerego> +1
<Jaroslav_Pullmann> +1
<LarsG> +1
<roba> +1
<Ine_> +1
<Makx> +1
<annette_g_> -1
<Keith> +1
<danbri> +1
<antoine> with or without the requirement part?
<DaveBrowning> +1
<dsr> +1
<Thomas> antoine without for now
annette_g_: I still have a concern about the ID8 being a use case
<antoine> ok then +1
annette_g_: it is too
general
... I feel the use cases should be concrete
kcoyle: annette_g_ do you volunteer to rewrite it?
SimonCox: I agree that annette_g_ should do it
PROPOSAL: to accept the use cases ID8 with edits that annette_g_ will provide and ID20 as it is
<Thomas> +1
+1
<PWinstanley> +1
<newton> +1
<annette_g_> +1
<alejandra> +1
<DaveBrowning> +1
<AndreaPerego> +1
<LarsG> +1
<dsr> +1
<Ine_> +1
<Keith> +1
<kcoyle> +1
<roba> +1
<Jaroslav_Pullmann> +1
<LuizBonino> +1
<riccardoAlbertoni> +1
<SimonCox> +1
<danbri> +1
<antoine> +0
RESOLUTION: to accept the use cases ID8 with edits that annette_g_ will provide and ID20 as it is
<Thomas> philippe keep the space after +
<Thomas> sorry; it works
<Thomas> (still getting used to IRC)
<SimonCox> IMO we should be quite generous in accepting use-cases, since these exemplify concerns in the community. The more challenging part is distilling the _requirements_ and consolidating these where they overlap or duplicate. The requirements will drive the design of the products.
<antoine> sorry I've abstained only because I've missed the explanation of how annette_g_ wanted to make the UC more concrete.
the use case ID36 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID36
Makx: Cross-vocabulary relationships is about the need that might be in the dcat about those other type of datasets
<riccardoAlbertoni> +1 to Makx ( probably is just a matter of providing some examples..)
<Keith> agree with Simon, accept all use cases and get on with the work of distilling requirements
<danbri> [q: couldn't I distribute my qb:DataSet in either Turtle or RDF/XML syntaxes, each being a Distribution?]
Jaroslav_Pullmann: I can refer to
the wikipage
... Makx is right. Some schema.org consider the data being
abstract
roba: I think it is an important
use case
... it is not just a distribution
... we should just double check that we create a situation that
can't be a dcat
<AndreaPerego> +1 to roba
Makx: it is a little bit more
complicated than that
... if you have a dataset as a datacube
... the concept is almost the same, but now you have 2
implementations
... one part would be of what dcat call a dataset
roba: I was saying that
description can be a distribution
... just we don't get confused on describing data
<Zakim> danbri, you wanted to mention CSVW too
danbri: it is a very important problem
<Keith> dataset/distribution: the problem is DCAT does not use the concepts conceptual, logical, physical - this would help
danbri: we have the choice of
going for very specific things
... seems that we have agreed with every domain
... we have to be pragmatic
... if we are describing as a distribution then describe it as
a distribution
... there is no right answer, but having concrete use cases
might help
Jaroslav_Pullmann: if this would
modify the DCAT standard
... concepts of what this dataset is
... if we agree that the dataset is abstract
... with this notion in mind we should compare with other
standards
... these are the differences
... comparing to schema
<danbri> [Dublin Core is scruffy and pragmatic where https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records#FRBR_entities is overly prescriptive; even scoped to libraries, having 4 mutually exclusive types has been hard. It feels like there's a lesson for describing data here.]
kcoyle: is this a use case we want to address?
PROPOSAL: accept the use case ID36
<Thomas> +1
<LuizBonino> +1
<AndreaPerego> +1
<Philippe> +1
<Makx> +1 of course
<Jaroslav_Pullmann> +1
+1
<newton> +1
<alejandra> +1
<PWinstanley> +1
<kcoyle> +1
<LarsG> +1
<Ine_> +1
<danbri> +1
<riccardoAlbertoni> +1
<antoine> +1
<DaveBrowning> +1
<SimonCox> +1
<annette_g_> +1
<dsr> +1
<Keith> +1
<danbri> +2
RESOLUTION: accept the use case ID36
<scribe> scribe: DaveBrowning
<SimonCox> I vote to accept all use cases. But then we will need to distill, and collate, the *requirements* implied by the use cases.
<dsr> scribenick: dsr
<roba> +1
<scribe> scribe: Dave_Raggett
We start with ID9, see https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID9
which talks about Common requirements for scientific data
AndreaPerego: this is a use case based upon experience at JRC
We need to verify requirements for multidisciplinary scientific data
we want to be able to describe the context, including authors, lineage, usage, links to publications about the dataset and links to input data
we should start with a link to the context, and later work on what we can describe in the context
PWinstanley: I would be very hesitant to distinguish scientific in the requirements, although it's fine as a use case
<danbri> +1 to Peter's concern about distinguishing "science" from non
<alejandra> +q
Keith: I would like to go further with a complex set of role bound properties
We need this additional layer if intelligent software is to make use of it effectively
Annette will extend the use case
<Keith> Keith will generate an extended use case referencing ID9 emphasising relationships of dataset to many other entities with role and temporal limits
Jaroslav: for scientific datasets, there will be an appropriate set of metadata
<scribe> ACTION: Keith to generate an extended use case referencing ID9 emphasising relationships of dataset to many other entities with role and temporal limits [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action01]
<trackbot> Error finding 'Keith'. You can review and register nicknames at <https://www.w3.org/2017/dxwg/track/users>.
Thomas: we don’t want to scare people off with long lists of metadata which may be optional
<Zakim> AndreaPerego, you wanted to comment on the use of "scientific" in the use case
<AndreaPerego> Data lineage: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#Modeling_data_lineage
<Caroline_> ACTION: annette_g_ to make the UC ID8 more concrete [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action02]
<trackbot> Error finding 'annette_g_'. You can review and register nicknames at <https://www.w3.org/2017/dxwg/track/users>.
<danbri> [what is the unique content of this usecase, beyond those it covers? e.g. I just noticed data citation is also in UC10 also from AndreaPerego]
<annette_g_> S/annette_g_/annette_g/
AndreaPerego: you put a link to the original dataset, but it could also be interesting to describe the processing involved in the lineage
<Caroline_> ACTION: annette_g to make the UC ID8 more concrete [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action03]
<trackbot> Created ACTION-14 - Make the uc id8 more concrete [on Annette Greiner - due 2017-07-24].
It would be very useful to have two levels
<LarsG> [Do use cases need to be unique? I thought we just decided that they are only there to be hooks for unique _requirements_]
AndreaPerego: it is useful to have a link to a specific community where the metadata is relevant
<Zakim> danbri, you wanted to note that data citation metadata is critical for data(set) discovery (+maybe "scholarly" can substitute for "science" in some places?)
danbri: I wanted to speak up for search indexing, we love text rather than numbers
<danbri> AndreaPerego, is there anything unique in UC9 not in your other related UCs?
LuizBonino talks about different roles of authors
<AndreaPerego> danbri, it's more a "meta" use case, giving the general context
<SimonCox> +1 to danbri "what is the extra requirement from this use case" - we should spend our time on extracting requirements. There will be a lot of overlaps, but I'm not sure that chugging through votes on each uc is good use of time?
Karen: can we vote on ID9, and how it relates to profiles
<danbri> +0 then (it seems a useful aggregation of the others, but if it has no unique content, seems an administrative/editorial matter)
<kcoyle> PROPOSAL: accept id9, and consider this also when we discuss profiles
<danbri> +1
<antoine> +1
<Jaroslav_Pullmann> +1
<SimonCox> +1
<alejandra> +1
<LuizBonino> +1
<Ine_> +1
<riccardoAlbertoni> +1
<newton> +1
<Philippe> +1
<Caroline_> +1
<Keith> +1 accept all use cases and get on with requirements
+1
<LarsG> +1
<annette_g_> +1
<PWinstanley> +1 but with the caveat that we remove 'scientific' from it
<Thomas> +1 and +2 to Keith
<kcoyle> +1
<DaveBrowning> +1
<Makx> +1
RESOLUTION: accept id9, and consider this also when we discuss profiles
<kcoyle> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID10
We look at ID10 Requirements for data citation
Karen invites Andrea to introduce ID10
AndreaPerego: this is about being able to cite bibliographic information and to associate related resources with persistent identifiers
<danbri> "Being able to specify the basic mandatory information for data citation" suggests a relation to using SHACL/SHEX or similar c.f. https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID41. I don't see the word mandatory in https://www.w3.org/TR/vocab-dcat/
DCAT doesn’t yet provide the means to distinguish these identifiers sufficiently
Karen summarises the wording in the DXWG charter. We want to describe what is meant by an application profile, but not to define domain specific vocabularies for profiles
AndreaPerego: the use case is not about that as such, but rather about enabling citations
so ID10 is about DCAT rather than app profiles
<alejandra> +q
Thomas: the title of this use case is a little misleading
<danbri> [i.e. If a DCAT-based description is going to be useful for data citation, we'll need to at least show how it would be modeled i.e. in terms of vocabulary not mandatory-ness. Someone else's problem to represent that profile using shex/shacl/etc.]
Keith: one of the big things with citation is being able to reference a specific version and section of a data set, and this is best handled in terms of a query expression
LarsG asks for clarification about the target of the use case
AndreaPerego: this is about DCAT
alejandra: I think this is an important use case for DCAT, and we need to clarify the requirements as it overlaps with ID9
Jaroslav: we need to consider the query parameters for referencing the distribution
Keith: this can get really complicated with some data stores
LarsG: I am still not sure if this is about DCAT or DCAT-AP
are we here to extend DCAT or to support some form of profile of DCAT usage
<alejandra> +q
Jaroslav: we seem to be missing a use case on data identification
<AndreaPerego> I wonder whether "data identification" could not be too abstract. I see it more as a requirement.
<alejandra> isn't that kind of described in ID11?
<scribe> ACTION: Jaroslav_Pullmann to work with Keith on a use case on data identification [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action04]
<trackbot> Created ACTION-15 - Work with keith on a use case on data identification [on Jaroslav Pullmann - due 2017-07-24].
<kcoyle> PROPOSAL: accept ID10
<annette_g_> +1
<alejandra> +1
<riccardoAlbertoni> +1
<Caroline_> +1
<LuizBonino> +1
<Keith> +1
<Philippe> +1
<Ine_> +1
<Jaroslav_Pullmann> +1
<newton> +1
<Thomas> Meaning some core citation info is essential for DCAT VOC
<Thomas> +1
<antoine> +1
<PWinstanley> +1
<DaveBrowning> +1
annette_g: every scientific domain has its own list of metadata for its data sets
what is the level that a profile sits at?
<Makx> +1 to Karen
Karen: we will define how to express a dataset profile, but we won’t work on specific profiles which will be left to the relevant communities
annette_g: we do need to provide guidance to communities as to what we’re expecting them to do
Thomas talks about how to define profiles
how to provide a consistent set of extensions
Karen: we can look at how Dublin Core tackled this
<Makx> +1
<AndreaPerego> +1
<annette_g_> +1
RESOLUTION: accept ID10
<SimonCox> +1 to accept all use cases ...
<Makx> +1
<Thomas> +1
<Jaroslav_Pullmann> +1
<newton> +1
<antoine> +1
<LarsG> +1
<Ine_> +1
<Keith> +1 accept use cases and get to requirements
<LuizBonino> +1
We move onto ID11 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID11 Modeling identifiers and making them actionable
<danbri> +1
<Thomas> +1
Karen: this is similar to others, can we just vote on accepting it?
Keith: many identifiers are role based, and we need to be general in supporting them
Karen: you need to say what kind of identifier it is to enable search
<alejandra_> it also says alternative identifiers
Annette_G: we need to avoid limiting people to a single de-referenceable link
LuizBonino: people need to state their identifier and the schema it belongs to
<kcoyle> PROPOSED: Accept ID11
<Thomas> +1
<annette_g_> +1
<LarsG> +1
<Philippe> +1
<Caroline_> +1
<LuizBonino> +1
<riccardoAlbertoni> +1
<SimonCox> +1
<Keith> +1
<PWinstanley> +1
<alejandra_> +1
<newton> +1
<Makx> +1
+1
<kcoyle> +1
<Ine_> +1
<antoine> +1
RESOLUTION: Accept ID11
<danbri> +1
<Jaroslav_Pullmann> +1
<DaveBrowning> +1
<Makx> the question should be: is the UC clear?
<danbri> PROPOSAL: We accept all use cases.
<SimonCox> +1
<LuizBonino> +1
<Keith> +1
<Makx> accept them unless someone objects
<danbri> No objections so far
<Makx> +1
<AndreaPerego> +1
danbri proposes we accept all of the use cases so we can discuss requirements
<Zakim> danbri, you wanted to propose we accept all usecases
<antoine> -1
<DaveBrowning> +1
<danbri> antoine, can you repeat?
<SimonCox> notuc = no objection to unanimous consent
<antoine> I am co-chairing a group where people actually submitted UCs that were out of scope
<antoine> so I have to flag this
<antoine> That said, I will not strongly object if the WG here decides to just move on!
<antoine> Maybe this is a better group :-)
Karen: we want to decide whether the use cases are in or out of scope
danbri: has anyone some ideas as to which use cases should be out of scope?
ID9 needs some rewriting to avoid being specific to scientific datasets
Out of scope means that we won’t address the use case in DCAT
<riccardoAlbertoni> +1 to AndreaPerego
AndreaPerego: it is better to be concrete in the use cases and then generalise the requirements
Karen agrees
Do we want to go through each of the use cases to resolve whether they are in or out of scope?
We won’t be able to get a full list of requirements from the use cases today.
<Makx> what time come back?
Our purpose today is to determine what is the scope of DCAT 1.1
<danbri> [Maybe we can get away with a 'bulk' resolution that we believe all UCs submitted are in-scope to *consider* as reasonable asks of DCAT 1.1]
scribe: we break for 15 minutes …
<Caroline_> in 15min we will come back
<SimonCox> I won't come back. Getting late here and I'm still nursing pneumonia
<Caroline_> hope you get better soon SimonCox
<danbri> goodnight, SimonCox!
RESOLUTION: SimonCox to get well soon
<SimonCox> :-)
<Thomas> on a pause here, makx/andrea
<Thomas> starting within a few minutes
<Thomas> (I think)
<Makx> My apologies, will have to disconnect at 13:00 my time/noon Oxford for another call.
<Makx> Hope we'll get through Quality by the top of the hour
<Thomas> scribe Thomas
<scribe> scribenick: Thomas
kcoyle: next two use cases - similar to identifiers
id28 & id29 parallel
are these out of scope?
AndreaPerego: relationship id28-29
both about spatial aspect of data
not related otherwise
28 is about specifying reference systems that the data use (coordinate systems)
29 is about the spatial coverage
29 is more literal by nature
<kcoyle> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID28
<kcoyle> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID38
kcoyle: talking about 28 and 38
29 later
<roba> these are both cases of ID26
looking at id28 and id38 now
<Caroline_> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID28
<Caroline_> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID38
any objections against 28?
<roba> 28 and 27 are basically the same pattern
<danbri> Noting that https://www.w3.org/TR/owl-time/ is in Candidate Recommendation review re time-related aspects in ID38
<LuizBonino> Again, it seems that both 28 and 38 are suitable for extensions/profiles
<alejandra> can we update the agenda to have the correct name of ID28?
roba: 27 and 28 are alike and go about the modelling aspects of semantics
general set of requirements will have to be extracted from there
and then discussed
<AndreaPerego> I would say UC27 (temporal coverage) is related to UC29 (spatial coverage), not UC28 (reference systems).
lot of people are using these things differently
<antoine> AndreaPerego++
kcoyle: grandfather in related aspects to 26
Jaroslav_Pullmann: we're going to look into requirements - we have to ask if the requirement in UC28 is enough?
kcoyle: are there other requirements - they can be added to the UC
PWinstanley: spatial and temporal - have a 'scaling'-property
Maybe some 'superclass'-object for scaling?
Describe a reference system where the scaling info originates from
some ontology etc
<roba> UC1 is such even more general UC in https://www.w3.org/2017/dxwg/wiki/Use_Cases_and_Requirements
dsr: what are we up to here?
describing conventions or validating consistency to a reference system
PWinstanley: do we need a place
for this in DCAT? We shouldn't miss out on flexibility
... keep a future-proof architecture
dsr: do we need to check on integrity?
PWinstanley: yes, but we need a chunk to make that possible
kcoyle: can we make a use case for this?
PWinstanley: ACTION: Peter W will do a use case for this
roba: general use case-attempt for that one
feel free to edit this one
roba: job of just describing a reference is really not easy
a simple property defining the reference is just not enough
especially within spatial world
look at the spatial data on the web WG for that
not overspecify within DCAT
Keith: don't forget astronomical and microscopical coordinate systems
AndreaPerego: in the UC there is a reference to the SDW-WG
review the work from there and follow the best practices might be an option
<Zakim> AndreaPerego, you wanted to point also to UC14 and ID16
AndreaPerego: related to UC14 and UC 16
kcoyle: let's stick with the spatial and temporal for the time being
objections for having these?
<kcoyle> PROPOSED: Accept ID28 and ID38
<AndreaPerego> Relevant SDWBP, linked from UC28: https://www.w3.org/TR/sdw-bp/#bp-crs
<annette_g> +1
+1
<antoine> +1
<alejandra> +1
<LarsG> +1
<Caroline_> +1
<riccardoAlbertoni> +1
<Jaroslav_Pullmann> +1
<Philippe> +1
<DaveBrowning> +1
<roba> +1
<dsr> +1
<Ine_> +1
<kcoyle> +1
<Keith> +1
<AndreaPerego> +1
<PWinstanley> +1
<danbri> +1
RESOLUTION: Accept ID28 and ID38
<kcoyle> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID14
kcoyle: data quality; UC14-15
<kcoyle> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID15
<alejandra> ID14 is related to ID43
<dsr> I note that XBRL supports hypercubes as an abstract coordinate space for financial reporting data
kcoyle: 14 about data quality and 15 about precision and accuracy
<AndreaPerego> alejandra, yep, it is.
PWinstanley: 15 is a subset of 14
And constraints on usability come into that
PWinstanley: What are the things that restrict the re-use of the data?
Partly the data-quality and partly the collection-process
Maybe we should use another pluggable container like reference systems
PWinstanley: we shouldn't focus on the things that we directly see at hand
<Zakim> danbri, you wanted to ask what "Provide patterns for " means here; e.g. is it showing some examples using other vocabs?
all the time
danbri: should we provide examples of other vocabularies
AndreaPerego: what is missing was a recommendation to follow
<danbri> Thomas: what I saw in these 2 use cases is another case of using reference systems. We shouldn't make lots and lots of reference system classes, but have some common modelling structure.
<danbri> ... we shouldn't go further than that in defining DCAT. Anything going deeper is up to the profiles.
Jaroslav_Pullmann: from a
commercial PoV, in order to express quality of service levels,
we need that information also
... they ought to be valid use cases for us
<dsr> Can we enable profiles with annotations covering:
<dsr> * Where the estimates of precision/accuracy come from
<dsr> * When a data point has been interpolated (e.g. lost data, broken sensor)
<dsr> * When a sensor is no longer trusted, despite what it says
<danbri> [ AndreaPerego - do you think the topic of modeling caveats would fit in these UCs? https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0041.html]
<AndreaPerego> [ danbri, possibly, but need to look at it more closely ]
dsr: a question is how to enable profiles to allow the reason/connotation on data quality issues
<kcoyle> PROPOSED: Accept ID14 and ID15
broken sensors vs fawl data
+1
<alejandra> ++1
<alejandra> +1
<annette_g> +1
<Caroline_> +1
<Philippe> +1
<antoine> +1
<LuizBonino> +1
<roba> +1
<dsr> +1
<riccardoAlbertoni> +1
<Ine_> +1
<Keith> +1
<LarsG> +1
<DaveBrowning> +1
<danbri> +1
<newton> +1
<PWinstanley> +1
<newton> +1
<Jaroslav_Pullmann> +1
RESOLUTION: Accept ID14 and ID15
kcoyle: go to ID12 and then have
lunch
... ID12 - data lineage
https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID12
kcoyle: we probably need a place to denote the source of the data
<Keith> does data lineage = provenance?
Is PROV an option?
Jaroslav_Pullmann: provenance is important
Textual or structured?
we need a structured, machine-readable way for this
<AndreaPerego> Keith, indeed, there may be related (lineage / provenance), but it depends on the definition of provenance.
<AndreaPerego> Thomas, PROV is mentioned as an option in the UC.
alejandra: isn't this too generic?
thx, andrea
Keith: vertical vs horizontal provenance including the relationship between those two
goes beyond PROV VOC
AndreaPerego: comment on provenance lineage
sometimes 'who did the job'?
<dsr> Keith: the ability to reconstruct the state at a specified time
you have provenance on the dataset-level and provenance on the agent roles, workload, ...
s/you AndreaPerego :/
PWinstanley: you have also instances where the data is being used etc
don't strictly belong to the provenance of the dataset
<AndreaPerego> Quoting from the DC definition of dct:provenance: "A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation."
kcoyle: we don't have anything that goes beyond the strict provenance
PWinstanley: when a dataset is
used in different contexts, the meaning/nature of the dataset
might change but the dataset itself isn't
... we could have an 'event' and a 'transition' (transition =
change; event = not changed)
all can adhere to 'provenance'
kcoyle: are we describing another requirement
PWinstanley: want to leave that to be decided
Jaroslav_Pullmann: UC 'funding sources' is another related aspect - that UC is linked to this one
also very strongly related to versioning
<scribe> ACTION: Jaroslav_Pullmann will link these [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action05]
<trackbot> Created ACTION-16 - Will link these [on Jaroslav Pullmann - due 2017-07-24].
roba: what is the goal for bringing extra properties into DCAT?
<kcoyle> PROPOSED: Accept ID12
AndreaPerego: we should also take into account the goal to which the dataset should be used
(andrea: correct me if I'm wrong please)
Jaroslav_Pullmann: we should understand why provenance should be modeled
it isn't clear this time
kcoyle: that's what happens when we pull the requirements out of the use cases
<AndreaPerego> Thanks, Thomas. The 2 purposes I see are: data reproducibility and fitness for purpose.
thx
<Caroline_> +1
<roba> +1
<alejandra> +1
<newton> +1
<danbri_> +1
<LuizBonino> +1
+1
<Jaroslav_Pullmann> +1
<LarsG> +1
<PWinstanley> +1
<DaveBrowning> +1
<dsr> +1
<Philippe> +1
<Ine_> +1
<annette_g> +1
<Keith> +1
RESOLUTION: Accept ID12
kcoyle: declares lunch
<antoine> +1
resume in one hour
<AndreaPerego> +1
<roba> bye
<AndreaPerego> Bye.
<roba> have other commitments tomorrow eve - so will be joining later in your morning for a bit.
<Makx> Where are we on the agenda?
<Makx> OK so I just missed the whole item on Quality. Pity.
<Caroline_> chair: Caroline_
<Caroline_> scribenick: alejandra
<Caroline_> Use Case https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID16
ID16: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID16
is it a duplicate of ID43?
<dsr> (we break for lunch)
antoine: wondering if Andrea was
joking, there was a lot of discussion about this use case
... in the DWBP WG
... test whether a dataset complies with a given standard, one
wants to record this
... this was a use case in the data quality vocabulary
... we ended up not being able to implement what Andrea
wanted
... should I reformulate the discussion or confirm that the use
case is still relevant?
Caroline_: do people think it is out of scope or relevant?
Jaroslav_Pullmann: quality is hard to express or assess without reference to evaluation criteria
kcoyle: same as ID14
LarsG: same question about
providing a hook in DCAT core to provide these things?
... or is it outside of DCAT core?
LuizBonino: my understanding is
about compliance with something (standard, quality parameter,
etc)
... then there will be a validator to check the
compliance
... compliance is context dependent
<antoine> For the record here is the part about it in DQV, with a note about our resolution in DWBP: https://www.w3.org/TR/vocab-dqv/#ExpressConformanceWithStandard
annette_g: ... there is a lot of overlap with the data quality vocabulary
<AndreaPerego> Thanks, Antoine.
<Zakim> danbri, you wanted to note that lots of UCs have this structure - a reasonable usecase that may likely be beyond core and addressed by dcat + another vocab. Will we make a
annette_g: maybe we need to discuss how to address that with DCAT and at this point, not how to deal with it
danbri: a lot of the UCs indicate
that we can grow the DCAT core very quickly
... should we collect a list of useful extras?
<Makx> +q
<Keith> can we get requirements from UCs and then decide what is core and what not?
<Makx> -q
kcoyle: we need a document advising about what fits in
<danbri> alejandra, ... sorry I meant to say the opposite. Rather that we keep getting UCs where we could make a small change to the core but not address the full use case.
sorry!
<antoine> Also for the record, I could dig the issue about Andrea's suggestions: https://www.w3.org/2013/dwbp/track/issues/202
<Zakim> AndreaPerego, you wanted to explain about the specificity of this UC
<dsr> The charter makes provision for work on a primer
<danbri> ... and that we don't have a repeatable answer, such as "we'll add this to our 'useful multi-vocabularies cookbook' page/document"
<PWinstanley> https://lists.w3.org/Archives/Public/public.../att.../DCAT-APimplementationguide.pdf
AndreaPerego: this UC is not
included in data quality 1 because we came across it when using
DCAT for spatial metadata
... you should be able to express conformity and
non-conformity
... important for discovery purposes
... what are the data that need to be modified to be
conformant
... general UC wasn't explaining these specific issues
<PWinstanley> https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Jul/att-0010/DCAT-APimplementationguide.pdf
AndreaPerego: we identified this
in the implementation of DCAT
... in some cases, we found a solution, not in others
... 90% of these use cases are meta-UCs
... use of DCAT for supporting cross-domain
interoperability
... there is always a reference to other standards
... we want to support interoperability across metadata
standards
... we had to address this problem on how data standards are
modeling things
LuizBonino: the majority of the
use cases we discussed seem suitable for extensions around
datasets
... other parts may interfere with the structure of DCAT as it is
now
... if you have an approach for versioning, you have a version
and a distribution, the distribution should not be attached to
the dataset anymore but to the version
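[Illustration of the versioning point above: a minimal sketch in Python, assuming a three-level structure in which distributions hang off a version rather than off the dataset. The class names, field names and example URLs are hypothetical and are not DCAT terms or a model adopted by the group.]

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    download_url: str
    media_type: str


@dataclass
class Version:
    label: str
    # Distributions are attached here, not on the dataset.
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Dataset:
    title: str
    versions: List[Version] = field(default_factory=list)

    def distributions_of(self, label: str) -> List[Distribution]:
        """Return the distributions attached to the version with this label."""
        for version in self.versions:
            if version.label == label:
                return version.distributions
        return []


# Hypothetical example data.
air_quality = Dataset(
    title="Air quality measurements",
    versions=[
        Version("1.0", [Distribution("http://example.org/aq/1.0.csv", "text/csv")]),
        Version("2.0", [
            Distribution("http://example.org/aq/2.0.csv", "text/csv"),
            Distribution("http://example.org/aq/2.0.json", "application/json"),
        ]),
    ],
)
```

[In this sketch a client has to pick a version before it can reach a file, which is the kind of structural change to existing DCAT descriptions that the restructuring would imply.]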
<Makx> @PWinstanley https://joinup.ec.europa.eu/asset/dcat-ap_implementation_guidelines/description are based on actual problems brought forward by implementers.
LuizBonino: we need to define the profile description method to define how people are going to use this
kcoyle: I edited some UCs related
to this
... e.g. in ID42
... this is about the dataset itself and not about the data
itself
... I don't know if it needs to be brought to the level of
DCAT
LuizBonino: we have the dataset
and the distribution, and we have the metadata about the
semantics
... each distribution matches to the generic concept
... the constraints on what you have to provide are what I would
consider a profile
kcoyle: a picture would be good
PROPOSAL: accept ID16
<annette_g> +1
<riccardoAlbertoni> +1
<antoine> +1
<newton> +1
<Caroline_> +1
<Ine_> +1
<Makx> +1
<Jaroslav_Pullmann> +1
<PWinstanley> +1
<danbri> +1
<LarsG> +1
<Caroline_> +1
<Keith> +1
<DaveBrowning> +1
<dsr> +1 - yes to in scope as a use case
<antoine> I agree it is very similar to 43
is this overlapping with ID43?
+1
<Makx> @antoine French headphone level limit?
<annette_g> Overlap is okay, no?
<AndreaPerego> Yes
<Thomas> +1
LuizBonino: explaining diagram
diagram here also useful: https://www.w3.org/TR/hcls-dataset/
<danbri> see https://twitter.com/danbri/status/886933651178573824
<Makx> can't you share a screen on Webex?
LuizBonino: dataset can be extended
with any profile
... distinction between dataset, version/release,
distribution
Makx: this is a substantial model change to DCAT
kcoyle: this is LuizBonino's
current model
... it doesn't mean that we will follow this model
Makx: we need to be very careful
in making substantial model changes
... as it can break current implementations
<AndreaPerego> +1 to Makx
RESOLUTION: accepted ID 16
<antoine> +1
Caroline_: now discussing ID23: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID23
riccardoAlbertoni: I'll give some
context to this UC that we collected from the DQV
... Antoine, other people and I contributed
... meta-use case, data quality is very important for any reuse
of data
... data collected in the past is considered also in this
group
... it seems that there is some overlap in the use cases that
Andrea proposed
... even though from a different perspective
... some concrete case studies, how to identify integrity
constraints (e.g.)
... depending on how far we want to go in data quality within
DCAT, there is some DQV housekeeping
... DQV was released last December
... it would be great for us to have the possibility to make
small changes
antoine: general question on what should be the position of the DQV in terms of the core and profiles for DCAT
riccardoAlbertoni: it is quite
difficult to define how far we have to go in data quality
... this discussion should consider the UCs presented by
AndreaPerego
Makx: two comments on DQV
... it makes sense to use UCs as a good point to see how DQV
can be attached to DCAT
... the one that danbri came up with in the last couple of days
w.r.t. caveats on statistical data
<danbri> Caveats discussion: https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0041.html (caveat/footnotes even at data item level)
Makx: I've got a use case I forgot to put in
<danbri> statDCAT AP, https://joinup.ec.europa.eu/node/147940
Makx: people have included
annotations of DQV to datasets
... I will write that UC
... and danbri can write the other UC
Caroline_: yes, please write more use cases
<AndreaPerego> Just to note that another case for the use of DQV in DCAT is UC15, where DQV is indicated in the existing approaches, and mentioned also in SDWBP: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID15
<danbri> ACTION: danbri write UC for data-item level caveat annotations [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action06]
<trackbot> Created ACTION-17 - Write uc for data-item level caveat annotations [on Dan Brickley - due 2017-07-24].
PROPOSAL: accept ID23 (https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID23)
<Thomas> +1
<AndreaPerego> +1
<danbri> +1
<LuizBonino> +1
<newton> +1
+1
<DaveBrowning> +1
<annette_g> +1
<PWinstanley> +1
<Makx> will you give me an action for DQV annotation?
<antoine> +1
<Caroline_> +1
<Ine_> +1
kcoyle: there will be a need to tease out lots of requirements
<dsr> +1
<LarsG> +1
<Keith> +1
<riccardoAlbertoni> +1
<scribe> ACTION: Makx to create a UC for DQV annotation [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action07]
<trackbot> Created ACTION-18 - Create a uc for dqv annotation [on Makx Dekkers - due 2017-07-24].
RESOLUTION: accepted ID23
Next UC ID19: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID19
AndreaPerego: this is a general
or meta UC
... modelling different types of information
... e.g. input data
... property linking a dataset with the publisher and the
author
... you may need to attach to these relationships some
additional information
... such as the temporal context
... general use case where we need some guidance on how to
provide this information
... it can be applied for any type of information to be
attached to a relationship
kcoyle: in our UCs we have mixed
up UCs about the dataset and the data in the dataset
... we need to tease those apart
... as we may want to address them differently
... there may be statements that we may want to make about the
data / data semantics
... we haven't made that distinction in the discussion
AndreaPerego: when we talk about
dataset and when we talk about the data itself
... do we have this in DCAT already? CatalogRecord and
Dataset
kcoyle: ... are we ok in mixing
those or do we need to keep them separately?
... if we need to say something about quality, we need to say
quality about what
AndreaPerego: we had these
discussions in the DQV and we concluded that in most cases we
are talking about data
... we can use the same approach in data or metadata
... this is a topic that may need further discussion
Jaroslav_Pullmann: I didn't make
this difference
... dataset is about whatever data is behind
... I think this is related to what Rob was proposing
... atomic properties and specific descriptions
... I think this is a general approach of modelling
... atomic properties and complex descriptions, which the UC
says qualified descriptions
... what does it mean qualified form?
AndreaPerego: this is from
PROV-O
... different ways of representing the same information: the
core, the extended, and the qualified
... reified representation of a relationship
... where you can attach additional information to a
relationship
<AndreaPerego> PROV properties with qualified forms: https://www.w3.org/TR/prov-o/#inverse-names-table
AndreaPerego: e.g.
prov:qualifiedAssociation
... I don't know if there is a better term, but the idea is to
have another relationship to add more information
... bridge between the dataset and the source data
... where you can attach more information on when the data
was processed and so on
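[Illustration of the simple versus qualified forms described above: a minimal sketch in Python, assuming triples are represented as plain tuples. prov:wasAttributedTo, prov:qualifiedAttribution, prov:Attribution, prov:agent and prov:hadRole are PROV-O terms; the ex: namespace, the ex:since property and the example identifiers are hypothetical.]

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def simple_attribution(dataset, agent):
    """Unqualified form: one triple, with nowhere to attach extra detail."""
    return [(dataset, PROV + "wasAttributedTo", agent)]


def qualified_attribution(dataset, agent, role, since):
    """Qualified form: the relationship becomes a node that carries detail."""
    node = "_:attribution1"  # blank node standing for the relationship itself
    return [
        (dataset, PROV + "qualifiedAttribution", node),
        (node, RDF_TYPE, PROV + "Attribution"),
        (node, PROV + "agent", agent),
        (node, PROV + "hadRole", role),
        (node, EX + "since", since),  # hypothetical property for temporal context
    ]


dataset = EX + "dataset/1"
publisher = EX + "org/acme"
simple = simple_attribution(dataset, publisher)
qualified = qualified_attribution(
    dataset, publisher, EX + "role/publisher", "2017-01-01"
)
```

[The qualified form costs four extra triples here, which is one reason to restrict the pattern to relationships where the additional information is actually needed.]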
Jaroslav_Pullmann: simple atomic
properties and qualified properties
... is there a concrete suggestion on when this pattern
applies?
... e.g. for quality, accuracy
... how would you restrict the application of this pattern?
AndreaPerego: personally I would
stick to what we define as concrete requirements
... I would rely on what the community used in DCAT to identify
what is relevant
... the point is to have concrete use cases where people want
to specify concrete information and either they can't do it or
they do it differently every time
Jaroslav_Pullmann: would this be an extensibility pattern for DCAT?
AndreaPerego: yes, I think so -
but you cannot be sure if the proposal is universally
applicable
... unless you collect use cases
PROPOSAL: accept ID19 as relevant use case https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID19
<riccardoAlbertoni> +1
<Caroline_> +1
<Philippe> +1
<danbri> +1
<Thomas> +1
<Ine_> +1
<Keith> +1
<dsr> +1
<Jaroslav_Pullmann> +1
<DaveBrowning> +1
<newton> +1
+1
<annette_g> +1
<LarsG> +1
<PWinstanley> +1
<LuizBonino> +1
RESOLUTION: ID19 accepted
<antoine> +1
Next UC: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID26
ID26
Jaroslav_Pullmann: this is a meta
UC
... I consider DCAT as a core where one would attach properties
with specific standards/vocabularies
<Thomas> UTC :-)
Jaroslav_Pullmann: identify what
are the specific aspects that need an extension
... and identify a property for each of them
kcoyle: are you anticipating that
there will be particular vocabularies that DCAT needs to
consider?
... how open should we be?
... should we decide for each element whether there are
recommended vocabularies?
<Makx> +q
Jaroslav_Pullmann: what should
happen is an analysis of what is there
... and have simple properties for describing simple
stuff
... and attach further descriptions
<Zakim> danbri, you wanted to suggest UC bakes in a specific kind of solution
danbri: I like the general
intent
... the current formulation of the UC assumes a specific
technical approach
... LuizBonino diagram includes a release structure
... Makx pointed out that we should be careful about changing the
structure
... maybe it would be good to rephrase not to consider specific
implementation
... we shouldn't assume that a specific property is the
solution
Makx: I have difficulties
understanding how this would work
... danbri indicates that we shouldn't mention properties, but
in an RDF world we need to speak about properties
... these 6 bullet points seem to imply that there are separate
properties for separate extensions
... but we already have a potential provenance one
... we already have specific ones for temporal and spatial
coverage
<danbri> Makx, my point was that in some extreme/complex important cases the WG might actually restructure DCAT's overall pattern with new types (e.g. Release/Version as in Luiz's diagram)
Makx: I'm not quite sure on what
the proposal is
... if a catch-all approach
... with loose semantics
... or identifying what extensions are needed
... I prefer the latter
LarsG: I'm pro having extension points in DCAT, I don't think we should mandate which vocabularies to use
<Makx> +q
LarsG: here you can put
provenance, you may want to use PROV
... but we shouldn't say 'you must use PROV', as this is
getting into the area of profiles
Jaroslav_Pullmann: benefit of
this use case is to identify the important aspects that should
not be forgotten
... there should be a dedicated property that relates to
whatever specification of this aspect
Makx: I wanted to react to what
LarsG was saying
... we are absolutely not in a position to recommend
vocabularies
... we can only provide a property where people can put
whatever they think it is relevant
annette_g: I agree with LarsG, it
would be the role of a profile to define what extensions to
use
... we can say in a profile 'we will use PROV-O'
... but not in the core
<riccardoAlbertoni> +q
kcoyle: we can provide guidance
<riccardoAlbertoni> -q
<LarsG> alejandra: for each element listed in the UC there are specific use cases
<LarsG> ... there are areas that might not be relevant for DCAT but for profiles
<LarsG> Jaroslav_Pullmann: it's about grouping the other UCs
<LarsG> ... we have extension points where we can hook that in
so, this UC is for grouping the other UCs
<Caroline_> scribenick: LarsG
Thomas: It would be useful for clients to know how to handle that
antoine: wants to continue on
alejandra 's point
... UCs are not very specific, but more meta
... every property in DCAT can be seen as an extension
point
Jaroslav_Pullmann: it wasn't meant to be implemented in the model
<Makx> good point Antoine
Jaroslav_Pullmann: more a hint
that we need to consider this in the model
... it's a meta use case listing what I think is important
antoine: so it's more like a design principle
<Makx> +q
antoine: "if I want to extend DCAT this is what I should consider"
Jaroslav_Pullmann: if we agree on
accepting this UC, Jaroslav_Pullmann would have a task
... to think about this
Makx: when we worked on the
European profile this is exactly what we did (describing
extension points)
... if there is a large group of people that want the same
thing, we could go back to DCAT and add it
... so the six bullet points in the UC, there might be 200, but
in the end we need to figure
... out which ones we want: what goes into DCAT core and what
is profile
antoine: agrees with Makx, would
be in favour of accepting the UC
... suggests that Jaroslav_Pullmann focuses on the meta
aspect
... should phrase it as a methodological point
<Makx> +100 to antoine
antoine: "if you have own needs, we have a methodology for creating profiles"
Jaroslav_Pullmann: The UC was a
first shot. Compared to the ISO standards there is an
aspect
... of maintenance that isn't covered in DCAT
... DCAT needs a reference to that
... those are aspects that are usually covered by specific
vocabularies
PROPOSED: Accept ID26
<Makx> +1
<antoine> +1
<annette_g> +1
<AndreaPerego> +1
<antoine> with editing!
<PWinstanley> +1
<LuizBonino> +1
<Ine_> +1
<PWinstanley> with editing
<Thomas> +1
PROPOSED: Accept ID26 with editing by Jaroslav_Pullmann
<riccardoAlbertoni> +1
<dsr> +1 subject to editing the text
<Caroline_> +1 with editing
<annette_g> W/e
<newton> +1
<DaveBrowning> +1
<danbri> Jaroslav_Pullman, suggested edit: "extension points (properties) " -> "extension points (typically properties)"
<danbri> +1
<scribe> ACTION: Jaroslav_Pullmann to edit ID26 [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action08]
<trackbot> Created ACTION-19 - Edit id26 [on Jaroslav Pullmann - due 2017-07-24].
[discussion about how to get from Use Cases to Requirements...]
RESOLUTION: Accept ID26 with editing by Jaroslav_Pullmann
<Caroline_> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID33
alejandra: we need a way to
provide an overview of data
... could be statistics
... and might go into a profile
<danbri> https://www.w3.org/TR/hcls-dataset/#s6_6 (mentioned by alejandra) seems to use VoID for rdf triple stats
kcoyle: danbri said that search engines are better at text than at numbers
<alejandra> +q
kcoyle: so like an abstract in a
paper, an overview could improve discovery
... can DCAT help there
<danbri> (text and also entities that can be found via textual queries, rather than raw decontextualized numbers)
alejandra: it's more than just a
description, but telling potential users of how much data to
expect
... ten patients or 1000
... also how many triples etc
<Makx> +q
alejandra: but hard to generalise so might go into a profile
PWinstanley: so it won't be text
kcoyle: if it's not in a particular format, is it for display?
LuizBonino: it might be for
validation
... if you use a profile you want to check
... in the profile you need to attach the vocabulary that
describes how many patients
... and then you can validate: does the metadata contain this
statistical information?
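[Illustration of the validation idea above: a minimal sketch in Python, assuming a profile is just a list of required property names and a metadata record is a dictionary. The profile, the property names and the record are hypothetical placeholders, not DCAT terms.]

```python
# Hypothetical profile: requires a title and a patient count.
PATIENT_PROFILE = {"required": ["title", "numberOfPatients"]}


def missing_properties(metadata, profile):
    """List the required properties that the metadata record lacks."""
    return [name for name in profile["required"] if name not in metadata]


# Hypothetical record that carries a size in bytes but no patient count.
record = {"title": "Cohort study", "byteSize": 2048}
missing = missing_properties(record, PATIENT_PROFILE)
```

[Here the record fails the check because it lacks the patient count, even though it would pass a profile that only asks for a title.]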
Makx: we tried some of that. DCAT
only has byteSize (not clear to everybody). When you start
talking about how many things are in your DS, there are
many
... different ways to define that and that is specific to the
use of the DS
... so this is community-specific and can hardly be generalised
=> profile
<alejandra> I agree Makx - maybe to consider for AP guidelines
<antoine> https://www.w3.org/2013/dwbp/track/issues/164
<antoine> https://www.w3.org/2013/dwbp/track/issues/189
antoine: sends around a couple of
pointers from the data quality work.
... there statistical information was very important
... they point to initiatives about statistics that were
considered relevant
... agrees with Makx that counting is very difficult
<antoine> https://www.w3.org/TR/void/#statistics
<alejandra> +q
antoine: void has some counting
properties, and even if we don't incorporate
... void into DCAT there are similarities that might satisfy
this DCAT
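[Illustration of the VoID counting properties mentioned above: a minimal sketch in Python, assuming an RDF graph is given as a collection of (subject, predicate, object) tuples. The result keys are VoID statistics properties; the ex: identifiers in the sample data are hypothetical.]

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def void_style_counts(triples):
    """Compute a few VoID-style statistics for a set of triples."""
    graph = set(triples)  # an RDF graph is a set, so duplicates count once
    return {
        "void:triples": len(graph),
        "void:distinctSubjects": len({s for s, _, _ in graph}),
        "void:properties": len({p for _, p, _ in graph}),
        "void:distinctObjects": len({o for _, _, o in graph}),
        # Classes are taken to be the objects of rdf:type statements.
        "void:classes": len({o for _, p, o in graph if p == RDF_TYPE}),
    }


# Hypothetical sample: two patients, each with a type and an age.
sample = [
    ("ex:p1", RDF_TYPE, "ex:Patient"),
    ("ex:p2", RDF_TYPE, "ex:Patient"),
    ("ex:p1", "ex:age", "42"),
    ("ex:p2", "ex:age", "37"),
]
counts = void_style_counts(sample)
```

[These counts are well defined only because the data is RDF; a count such as "number of patients" needs domain semantics, which is the part that is hard to generalise.]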
alejandra: void is specific to
RDF so could be more a guidance
... but this UC could be about a generic pattern how to count
things
Thomas: if you leave semantics behind you can count anything, so we need to stay within the domain
antoine: we should at least be able to say why we didn't look at these issues
PWinstanley: if we were to put the summary as an XMLLiteral the search engines could pick that up and leave a hook for people to provide structured data
Jaroslav_Pullmann: usually it's important to provide the range of a property to give hints: Do we plan to do that in our ontologies?
PWinstanley: we're not obliged to, it's an additional layer of modelling
<Zakim> danbri, you wanted to report a bioschemas discussion on "Data record" structures that relates
Jaroslav_Pullmann: but that's an important part of modelling
danbri: It depends on how static your ontology is. schema keeps changing so they are very conservative with domains and ranges
<danbri> http://bioschemas.org/
danbri: Bioschemas do much typing (rows/columns) and it would be good if DXWG do the same
kcoyle: domains and ranges could be part of a profile, not necessarily DCAT
<danbri> ... we saw value in having a DataRecord view into contents of a datset, but even a simple multi-table dataset has two obvious representations as a set of records (1. entities 2. table rows). Either or both may be useful.
kcoyle: profiles can not only add new elements but also add constraints to existing ones
alejandra: much of it could be
put into a profile
... healthcare data is important because it's often not freely
available
PROPOSED: Accept ID33
<alejandra> +1
<annette_g> +1
<Caroline_> +1
<Jaroslav_Pullmann> +1
<LuizBonino> +1
<dsr> +1
<danbri> +1
<PWinstanley> +1
<antoine> +1
+1
<Ine_> +1
<newton> +1
<DaveBrowning> +1
<Thomas> +1; curious about the requirement here
RESOLUTION: Accept ID33
<Keith> +1
<Makx> +1
next: UC35
<Caroline_> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID35
Makx: we have had this discussion
before: There is a dataset and a description of it but no
designated catalogue (or no catalogue at all)
... like people creating datasets of their own.
<alejandra> +q
Makx: Is DCAT the data _catalogue_ or about datasets, too
<Zakim> danbri, you wanted to suggest RDF vocabs don't do 'mandatory'
Makx: you could use DCAT to describe a dataset before it's made part of a catalogue
danbri: doesn't see a big
problem. Vocabularies just provide useful terms
... can provide some statistics from google
Keith: catalogues are created manually or by harvesting from other catalogues, so it's not a problem
LuizBonino: the focus of DCAT is
the dataset and not the catalogue
... the model isn't clear, though. It should be possible to
have datasets without a catalogue, so we need to fix the
cardinality
Jaroslav_Pullmann: one issue could be that the concepts/topics are part of the catalogue and would be lost for datasets without it
<alejandra> dcat:Dataset definition: A collection of data, published or curated by a single agent, and available for access or download in one or more formats.
<Keith> and of course the catalogs are themselves all datasets
<alejandra> doesn't mention catalogue
<AndreaPerego> First sentence from DCAT does: "DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web."
PROPOSED: Accept ID35
<LuizBonino> +1
<newton> +1
<Thomas> +1
<Ine_> +1
<AndreaPerego> +1
<Jaroslav_Pullmann> +1
<Keith> +1
+1
<Caroline_> +1
<PWinstanley> +1
<Makx> +1
<Philippe> +1
<alejandra> +1
<dsr> +1
<antoine> +1
<danbri> Interesting - https://www.w3.org/TR/vocab-dcat/#class-dataset is quite restrictive. Whereas https://www.w3.org/TR/vocab-dcat/#introduction is quite general about data.
<danbri> +1
<annette_g> +0
<DaveBrowning> +1
<Makx> Section 1 is non-normative
<danbri> What does "published or curated by a single agent" mean? If two people publish something together must we treat them as an Organization to meet this semantic?
RESOLUTION: Accept ID35
<Thomas> I agree, Lars
<Makx> OK
Caroline_: we're discussing more than the chairs had planned
LuizBonino: What about the people who participate remotely in specific timeslots?
kcoyle: it's specifically about profile negotiation where Ruben wanted to call in, but LarsG is here to cover that
antoine: would want to have ID37 moved up since he can only join until 3pm
Coffee break until 4pm
<AndreaPerego> ^^ 5PM CEST
<Makx> when will you be back from break?
in 30 minutes
<AndreaPerego> scribe: alejandra
<AndreaPerego> scribe: LarsG
<Caroline> scribe: Jaroslav_Pullmann
<Caroline> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID40
<AndreaPerego> scribenick: Jaroslav_Pullmann
<Caroline> reading the use case ID40
kcoyle: what part of dcat should be aligned with Schema.org?
<AndreaPerego> kcoyle, I made a comment on this topic here http://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0052.html
danbri: summarizing about the evolution approach of Schema.org
kcoyle: the question remains - how do we get both aligned (via sameAs etc.)?
<Makx> can we please respect speaker queue?
kcoyle: all the properties are in dcat/dct namespace ..
<Caroline> kcoyle was on her turn :)
<danbri> view-source:http://schema.org/Dataset
<danbri> <link property="owl:equivalentClass" href="http://rdfs.org/ns/void#Dataset"/> <link property="owl:equivalentClass" href="http://www.w3.org/ns/dcat#Dataset"/>
Keith: suggestion to use a Schema.org-annotated HTML page to make catalog/datasets accessible (~ landing page)
<dsr> How are data catalogues discovered? One idea is to embed schema.org tags in web pages as a means to discover catalogues, and then use the DCAT vocabulary for further queries
Makx: rewrite the use case to exemplify the publishing process
<dsr> Keith: we need a way to expose DCAT to schema.org
danbri: there are multiple classes in Schema.org that might fit the individual Dataset notion like Product, CreativeWork
<Makx> are we talking about the use case or arguing dumping DCAT in favour of schema.org?
danbri: describes how the metadata is being extracted and processed out of the web pages, there should be mapping to this Schema.org subset from DCAT
<danbri> see https://developers.google.com/search/docs/data-types/datasets and associated blog post
<danbri> i.e. https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html
<Makx> +1 to kcoyle
kcoyle: approach to be best found by search engines via web pages annotated with Schema.org metadata?
<Caroline> Jaroslav_Pullmann: at the level of a dataset which may be dynamic we don't need an API endpoint
<Caroline> we can annotate with some of the schema.org elements
<Caroline> which of these properties could be exported to schema.org?
<Caroline> danbri: shared the documentation see https://developers.google.com/search/docs/data-types/datasets and associated blog post i.e. https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html
<danbri> https://www.w3.org/TR/2016/NOTE-csvw-html-20160225/
<danbri> ... is the CSVW WG's note on JSON-LD in HTML
dsr: asking for use cases for searching for particular types of resources (datasets, services) ..
LuizBonino: 2 different ways considered
<Makx> can LuizBonino speak louder please
<Caroline> is it better Makx ?
1) we generate HTML pages annotated by Schema.org metadata for dataset
<Makx> a bit better yes
<dsr> users typing informal queries to discover catalogues and data sets and linking to pages offering a richer more structured search, intent based search involving hidden APIs for quick added value results, or back end use cases where a service initiates a query and generates a composition of services as a design for later instantiation or a dynamically instantiated composition for immediate use.
the dcat:Dataset is described as schema:Dataset as well
<Caroline> LuizBonino was showing on the figure he drew what he is explaining. He is talking about DCAT:dataset in the figure
2) the findability is further supported by indicating a dcat:theme
alejandra: might Google support DCAT natively, in contrast to mapping to Schema.org?
<Makx> European Data Portal 750.000, data.gov 160.000 data sets
Jaroslav_Pullmann: What would be the target of such an indexing by search engines?
<Caroline> Jaroslav_Pullmann: what to do if you are searching for data and are given 200.000 datasets?
<Makx> +q
Makx: coming back to the use case..
<kcoyle> +1
assuming the approach is to define such a landing page for a dataset, what is the guidance on how to expose it in terms of Schema.org annotation?
<Makx> cookbooks are good
<alejandra> +1
<danbri> cookbook feels the right level to me; DCAT-shaped structures...
<Makx> +1
PWinstanley: suggesting a cook book with examples
<danbri> ... with extras from other vocabs, and in schema case maybe mappings
<dsr> +1 to separate cookbook + alignment on terms where possible
<danbri> (prefer "cookbook" to "best practice" given that these things are still in flux)
danbri: in effect, this is mostly about mapping DC terms to Schema, which has been done already
<danbri> see also http://wiki.dublincore.org/index.php/Schema.org_Alignment/MappingIssues
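[Illustration of the landing-page idea discussed above: a minimal sketch in Python that turns a DCAT-style description into schema.org JSON-LD for embedding in an HTML page. The term mapping is an assumption made for this sketch, not an official alignment, and the example record is hypothetical.]

```python
import json

# Illustrative mapping from DCAT / DC Terms names to schema.org names.
# This is an assumption for the sketch, not an agreed alignment.
DCAT_TO_SCHEMA = {
    "dct:title": "name",
    "dct:description": "description",
    "dcat:keyword": "keywords",
    "dcat:landingPage": "url",
}


def to_schema_org(dcat_record):
    """Build a schema.org Dataset description from the mapped properties."""
    doc = {"@context": "http://schema.org/", "@type": "Dataset"}
    for dcat_term, schema_term in DCAT_TO_SCHEMA.items():
        if dcat_term in dcat_record:
            doc[schema_term] = dcat_record[dcat_term]
    return doc


def as_script_tag(doc):
    """Wrap the JSON-LD so it can be placed in a landing page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical DCAT-style record; dct:issued has no mapping here and is dropped.
record = {
    "dct:title": "Air pollution monitoring data",
    "dcat:keyword": ["air quality", "monitoring"],
    "dcat:landingPage": "http://example.org/dataset/air-pollution",
    "dct:issued": "2017-07-17",
}
doc = to_schema_org(record)
tag = as_script_tag(doc)
```

[Properties without a mapping are silently dropped in this sketch; a cookbook would need to say which DCAT properties have no schema.org counterpart and what to do with them.]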
<newton> danbri would the cookbook be like a primer?
PWinstanley: what is the (tool) support for creating these annotations?
dsr: what is our advice on choosing and using such tools?
PWinstanley: there is a commercial potential for creation and provision of such tools and services
<danbri> +1
<kcoyle> PROPOSED: accept ID40 as a use case for a non-normative document
<Ine_> +1
<Thomas> +1
<annette_g> +1
<danbri> +1
+1
<LuizBonino> +1
<newton> +1
<alejandra> +1
<dsr> +1
<LarsG> +1
<kcoyle> +1
<Keith> +1
<Caroline> +1
<DaveBrowning> +1
RESOLUTION: accept ID40 as a use case for a non-normative document
<dsr> We should consider how open source projects could help with building both DCAT and schema.org markup
<antoine> belated +1
<riccardoAlbertoni> /me sorry I have to leave.. Thanks for the interesting discussion, See you tomorrow!
bye
<Caroline> thank you for participating riccardoAlbertoni
<danbri> [PWinstanley talking about https://twitter.com/nwplanet]
<danbri> dsr, I did have a conversation with someone in CKAN community about getting schema.org dataset markup into CKAN per-dataset landing pages. Idea would be to improve and publicise the existing DCAT addon rather than make a rival addon.
PWinstanley: suggesting to create a wiki on tooling support
Caroline: we will create an informal document on topic (cook book)
<antoine> +1 for discussing it now.
<kcoyle> scribenick: kcoyle
<Caroline> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID18
<LarsG> scribe: kcoyle
andrea: ID18; there are three
other use cases that mention same problem
... problem is that in many cases your distribution is not a
direct file download
... could be an api or another service
<danbri> [q: (Makx?) how would I find DCAT from a page like view-source:https://www.europeandataportal.eu/data/en/dataset/air-pollution-monitoring-data-dublin-city?]
andrea: the issue is that both
machines and humans need to know what happens when you follow the link
... the response from the api may be an error or doesn't make
sense to you
... this is a main issue left open by DCAT 1.0
... there was once a subclass of dcat:Distribution ->
dcat:WebService, but this was dropped
... this is a big problem that users have - when they don't get
the data back they are confused
... with a SPARQL endpoint, they get multiple datasets back
dsr: this is about what people expect from a search
Jaroslav_Pullmann: dynamic distribution - let the data be pushed
danbri: does the group consider finding commercial datasets? i.e. finding out that a dataset exists and how much you pay for it
Jaroslav_Pullmann: there could be
domain-specific solutions
... or templated urls
<dsr> where we need to give the url parameters some kind of semantics
LarsG: We are not limiting
ourselves only to open datasets; it's about finding
... this is how Europeana works; you can find things but they
may be behind a firewall
<DaveBrowning> +1 for Lars
AndreaPerego: find a way to model
the info in a domain-independent way, a minimal set
... 2 main things: 1) distribution is not direct, uses API /
service
... 2) type of service - specify the type of
endpoint/service with a code
... even this small amount of info would be helpful to
people
... and could be used by software engines if they know the code
<Ine_> +1 for Andrea
<Zakim> AndreaPerego, you wanted to say that it may be worth finding a domain-independent solution
AndreaPerego: what is missing is that minimal info
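A minimal sketch of the two pieces of information Andrea describes (whether the distribution is a direct download, and a code for the type of endpoint/service). The class, field names, example URLs and the "sparql" code are hypothetical illustrations, not DCAT terms and not something the group agreed on:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Distribution:
    """Hypothetical record; none of these field names are DCAT properties."""
    access_url: str
    direct_download: bool                 # 1) is the distribution a direct file download?
    service_type: Optional[str] = None    # 2) code for the endpoint/service type (assumed)


def describe(dist: Distribution) -> str:
    """Tell a human (or a client) what to expect when following the link."""
    if dist.direct_download:
        return f"{dist.access_url}: direct file download"
    kind = dist.service_type or "unspecified"
    return f"{dist.access_url}: served through an API or service (type: {kind})"


# Example URLs are placeholders.
csv_file = Distribution("http://example.org/data.csv", direct_download=True)
endpoint = Distribution("http://example.org/sparql", direct_download=False,
                        service_type="sparql")
print(describe(csv_file))
print(describe(endpoint))
```

Even with only these two fields, a client could decide whether to fetch the link directly or hand it to software that understands the given service code.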
Keith: 1) searching an individual
dataset
... worst case 2) complex API with distributed data
... are we going to describe APIs or datasets?
<AndreaPerego> Yep, the API description is the complex bit.
<danbri> suggesting that https://www.w3.org/TR/vocab-data-cube/ covers some of Keith's (1.).
dsr: links to goals of WoT in W3C
... links to general services and domain-specific
situations
... DCAT needs to say that the type of this is an API; anything beyond
that is outside of DCAT
LuizBonino: in health area, data is electronic, access process is offline
<Zakim> LarsG, you wanted to say that antoine was accidentally kicked out of the queue...
<Caroline> so sorry antoine
antoine: asking Andrea if his use case includes sparql endpoints, because dcat has that solution
AndreaPerego: DCAT has a solution for SPARQL endpoints?
antoine: there is a dcat access url that could be used for sparql endpoints
Makx: it's true that DCAT says that this could be used with SPARQL endpoints but it never says how
<dsr> DCAT should provide information about where to get further information about an API and if this is machine interpretable, what formats are supported, e.g. thing descriptions for the Web of Things, or schema languages for RESTful APIs
<antoine> https://www.w3.org/TR/vocab-dcat/, search for 'SPARQL' and this eventually gives dcat:accessURL
<AndreaPerego> dcat:WebService: https://www.w3.org/TR/2012/WD-vocab-dcat-20120405/#Class:_WebService
AndreaPerego: dcat:WebService was dropped from the document
<danbri> nearby, sparql, void etc: https://www.w3.org/TR/void/#sparql-sd
antoine: is sparql included in your use case?
AndreaPerego: no, not mentioned
<danbri> ... and then there is some literature around SPARQL as interface to data cubes e.g. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-017-0112-6
<scribe> ACTION: Andrea will add SPARQL endpoint to ID18 [recorded in http://www.w3.org/2017/07/17-dxwg-minutes.html#action09]
<trackbot> Created ACTION-20 - Will add sparql endpoint to id18 [on Andrea Perego - due 2017-07-24].
<danbri> FWIW in last week's IoT/WoT discussions, RAML, Swagger (https://swagger.io/specification/) and JSON-Schema came up a lot.
<danbri> Peter: some don't handle both GET and POST params
dsr: as above, what comes back:
a file or a message? what service or further info do you get?
... is it machine-readable?
annette_g: what you get back ... can differ
<Thomas> +1
PROPOSED: Accept use case ID18 as in scope
<danbri> +1
<newton> +1
<DaveBrowning> +1
<Ine_> +1
<LuizBonino> +1
<antoine> +1
<alejandra> +1
<Caroline> +1
<annette_g> +1
<LarsG> +1
<Philippe> +1
<dsr> +1
<Makx> +1
<PWinstanley> +1
<Keith> +1
RESOLUTION: Accept use case ID18 as in scope
We gratefully acknowledge funding for lunch and coffee breaks from the VRE4EIC project.