11:36:34 RRSAgent has joined #csvw
11:36:34 logging to http://www.w3.org/2014/10/08-csvw-irc
11:36:36 RRSAgent, make logs public
11:36:36 Zakim has joined #csvw
11:36:38 Zakim, this will be CSVW
11:36:38 ok, trackbot; I see DATA_CSVWG()8:00AM scheduled to start in 24 minutes
11:36:39 Meeting: CSV on the Web Working Group Teleconference
11:36:39 Date: 08 October 2014
11:37:28 ivan has changed the topic to: Meeting agenda 2014-10-08: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-10-08
11:43:29 danbri has joined #csvw
11:53:13 ivan, "intitution" typo in http://w3c.github.io/csvw/experiments/simple-templates-jquery/test.html
11:53:29 any suggestions re https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-10-08 ?
11:53:46 we need to sit with jeni next few days to draft f2f agenda, for sure
11:56:24 rrsagent, this will be CSVW
11:56:24 I'm logging. I don't understand 'this will be CSVW', danbri. Try /msg RRSAgent help
11:56:30 zakim, this will be CSVW
11:56:30 ok, danbri; I see DATA_CSVWG()8:00AM scheduled to start in 4 minutes
11:57:13 jtandy has joined #csvw
11:58:58 ivan, the metadata.csvm in http://w3c.github.io/csvw/experiments/simple-templates-jquery/test.html only gives 3 triples in the json-ld playground
12:00:53 jtandy has joined #csvw
12:01:03 agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-10-08
12:01:16 rrsagent, help
12:01:28 rrsagent, start
12:01:53 AndyS has joined #csvw
12:02:32 DATA_CSVWG()8:00AM has now started
12:02:41 + +44.207.286.aaaa
12:02:43 phila has joined #csvw
12:02:43 hi - am having Zakim dial in issues; could be local, will keep trying
12:02:45 zakim, aaaa is me
12:02:45 +danbri; got it
12:02:59 +[IPcaller]
12:03:01 jtandy, I had to wait a while to get through on zakim
12:03:04 zakim, dial ivan-voip
12:03:04 ok, ivan; the call is being made
12:03:06 +Ivan
12:03:10 zakim, IPcaller is me
12:03:10 +AndyS; got it
12:04:01 ericstephan has joined #csvw
12:04:02 +[IPcaller]
12:04:05 zakim, ipcaller is me
12:04:05 +phila; got it
12:04:20 zakim, who is on the phone?
12:04:20 On the phone I see danbri, AndyS, Ivan, phila
12:04:41 agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-10-08
12:04:44 +??P7
12:05:02 +Eric
12:05:07 zakim, ??P7 is me
12:05:08 +jtandy; got it
12:05:37 scribenick: jtandy
12:05:45 scribe: jtandy
12:06:16 agenda hacking
12:06:57 ivan: wonders if we can do anything for planning about the face to face meeting?
12:07:07 ... especially who will dial in?
12:07:36 AndyS: is hoping to dial in ... but only if there's an agenda to help target the conversation
12:07:56 DWBP threw together an outline for planning purposes https://www.w3.org/2013/dwbp/wiki/TPAC_2014
12:07:59 danbri: notes that early morning PST will help for europeans
12:08:20 +[IPcaller]
12:08:40 topic: dc/schema @context and normative refs issue
12:08:46 ivan: and danbri organise their calendars to sort out the agenda
12:08:53 ... for the F2F
12:09:03 topic:
12:09:12 rgrp has joined #csvw
12:09:13 q+
12:09:15 hi all
12:09:40 ack ivan?
12:09:45 danbri: schema.org, DC, plain old english prose
12:09:55 ... can we bottom out this conversation.
12:10:27 ivan: we need to define the json terms and then a one or two sentence description of those terms [to define the semantics]
12:10:50 ... to have fixed json terms is important because then metadata validators can use them
12:11:06 i'm generally plus one on having *some*
12:11:09 ... and we can provide example @context docs to map to other vocabs
12:11:24 q+
12:11:30 zakim, who is on the call?
12:11:30 On the phone I see danbri, AndyS, Ivan, phila, jtandy, Eric, [IPcaller]
12:11:30 ack ivan
12:12:24 rgrp (rufus): a short list of recommended terms is useful ... to define a pattern of usage for the community.
12:12:36 ... basically in general agreement with ivan
12:12:43 jtandy: it's MAY not MUST in terms of use and people can obviously add their own ...
12:13:04 danbri: what is the next action here
12:13:14 "specify few core terms, keep it small, under 10"
12:13:32 ivan: we need to resolve the action and write down the short list of terms
12:13:48 danbri: notes his comparison of DC with schema.org
12:13:59 zakim, mute me
12:13:59 jtandy should now be muted
12:14:15 could have been an outsider here
12:14:20 q+
12:14:25 ack rgrp
12:14:30 ack ivan
12:14:34 created
12:14:35 creator
12:14:35 description
12:14:35 language
12:14:35 license
12:14:35 modified
12:14:35 provenance
12:14:35 publisher
12:14:35 rights
12:14:35 rightsHolder
12:14:35 source
12:14:35 spatial
12:14:37 subject
12:14:37 temporal
12:14:37 title
12:15:02 ivan: shares his list of terms (above) ... but is not convinced that all the terms are necessary
12:15:10 ... e.g. spatial and temporal
12:15:22 ... the rest is probably ok
12:15:33 q+
12:15:44 http://lists.w3.org/Archives/Public/public-csv-wg/2014Oct/0008.html
12:15:53 • created: http://schema.org/dateCreated
12:15:53 • creator: http://schema.org/creator or http://schema.org/author
12:15:53 • description: http://schema.org/description
12:15:53 • language: http://schema.org/language (definition applies to actions;
12:15:53 could be generalized)
12:15:53 • license: http://schema.org/license
12:15:53 • modified: http://schema.org/dateModified
12:15:54 • provenance: no direct. http://schema.org/evidenceOrigin is related.
12:15:54 • publisher: http://schema.org/publisher
12:15:54 • rights: no direct mapping
12:15:55 • rightsHolder: http://schema.org/copyrightHolder
12:15:55 • source: no direct mapping (how does this compare to provenance), not
12:15:55 http://schema.org/source which is medical.
12:15:56 • spatial: https://schema.org/spatial
12:15:56 • subject: http://schema.org/about
12:15:56 • temporal: https://schema.org/temporal
12:15:57 • title: https://schema.org/name (rather than https://schema.org/title)
12:16:15 ivan: need long discussion to resolve the final list & suggests lubrication with beer to help
12:16:24 i'm +0 on dropping spatial / temporal ...
12:16:41 i have to save dateCreated is kind of nicer to be explicit but i'm easy either way
12:16:41 danbri: may be able to tweak schema.org to get rid of the differences
12:16:51 s/save/say/g ...
12:16:56 q+ to talk about cores and onions, rights and licences
12:17:21 ivan: it's not really a problem because we can change the "name" for a given term in the @context
12:17:49 ... also think we don't need "source"; because the metadata already refers to the csv file resource
12:17:52 i'd vote for source ...
12:17:58 q?
12:17:58 provenance is somewhat fancy ...
12:18:05 ack jtandy
12:18:05 zakim, unmute me
12:18:06 jtandy was not muted, jtandy
12:18:10 or even better: "sources" ...
12:18:21 but i think that takes us away from dc ...
12:18:27 jtandy: i was trying some examples myself
12:18:33 dcat metadata for dataset i was working on
12:18:42 things like license in dcat are part of distribution not about dataset
12:18:53 there are some diffs in how we vs dcat handle things, in our csv metadata doc
12:18:56 right, but dcat makes it a bit overcomplex there ...
12:19:12 based on a comment in recent spec, should we not try to normalize around DCAT given that W3C has chosen this for discovery metadata?
12:19:13 +1 to normalising with DCAT (surprise surprise)
12:19:15 i think datasets can have license in dcat no?
12:19:17 or say that we've chosen not to?
12:19:17 q?
12:19:20 ack phila
12:19:20 phila, you wanted to talk about cores and onions, rights and licences
12:19:22 q+
12:19:38 phila: CSV files are distributions
12:20:01 thanks
12:20:08 phila: the dataset / distribution distinction for a CSV (they are sort of the same here)
12:20:24 but agree generally - that's why you should support multiple resources/distributions ...
12:20:24 phila: about the dropping of spatial and temporal ... it _sometimes_ matters
12:20:27 +q
12:20:49 to be clear - this is not dropping spatial and temporal it's about not having them on the special shortlist ...
12:20:49 q+
12:20:52 ... so we need to let people know that there are other terms outside the core data that people might use
12:21:10 ... whether spatial and temporal are important depends on the data
12:21:17 ack ivan
12:21:48 no, no, no ivan ;-)
12:21:51 ... suggests that we have a category of "useful" as well as "core" ... and if you don't use the "useful" terms then you need a reason
12:21:54 dcat definitely about datasets ;-)
12:21:58 +1 rgrp
12:22:14 ivan: you are right that definitely for use by data catalogs to talk about the datasets they hold or point to :-)
12:22:17 schema.org Dataset is very dcat-inspired, and def about datasets
12:22:45 ivan: about spatial and temporal ... very important to consider that we are defining a very small core set of terms that can be used **without qualification**
12:22:46 +1 to ivan's points - want stuff to be very clear ...
12:22:54 db:temporal
12:22:58 jtandy: good summary - +1 to that
12:23:01 dc:temporal
12:23:41 ivan: we can't disallow use of other terms (beyond the small core) ... people can use what they want
12:23:41 q?
12:24:02 ack rgrp
12:24:09 ivan: there are many situations where things are useful, but let's stick to the core set
12:24:46 rgrp (rufus): the core set in no way prohibits people using other terms; the core should be the set of terms applicable to **every** CSV
12:25:54 ... the point is ... that we shouldn't remove spatial because it is applicable to almost every CSV so we should drive that usage
12:26:05 ack me
12:26:05 to every or almost every CSV ...
12:26:20 ... but notes that there are many ways to express "spatial" so can drive complexity
12:26:57 danbri: we need to give people the freedom to use what they need for their local tool chains and keep the mandatory list _very_ short
12:27:18 ... go for provision of _examples_ rather than normative recommendation
12:27:38 q?
12:28:31 rgrp: happy to take the list we have here and update the metadata vocab document
12:29:17 ivan: to avoid misunderstanding, the section listing loads of DC terms should be removed
12:29:22 I note that the EC's DCAT Application Profile does not include spatial and temporal as mandatory https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final
12:29:37 q+
12:29:50 rgrp: agrees, and will make sure that people know they can add their own terms as necessary
12:29:59 danbri: so is the core list normative?
12:30:05 rgrp: yes
12:30:06 ack phila
12:30:10 ack me
12:30:40 https://joinup.ec.europa.eu/system/files/project/f9/42/c0/DCAT-AP_Final_v1.00.png
12:30:48 rgrp, can you 'action:' yourself a suitable editors task here?
12:30:54 danbri: yes ...
12:30:59 phila: references the DCAT application profile for EU (EC?) ... a list of terms that should be used when describing a dataset & talks of mandatory and optional terms
12:31:07 ACTION: (rufus) amend metadata draft with the shortlist
12:31:08 Error finding '(rufus)'. You can review and register nicknames at .
12:31:27 ... spatial and temporal are optional; this might provide support for our decision in establishing the core list
12:31:33 action: rufus to amend metadata draft with the shortlist
12:31:33 Created ACTION-32 - Amend metadata draft with the shortlist [on Rufus Pollock - due 2014-10-15].
12:32:24 issue #29 ...
12:32:29 rgrp: asks for the final list to be shared after this call
12:32:50 [not chair hat] I propose using same short names as schema.org, per my mail above
12:33:37 rgrp: so as above, without spatial and temporal and I get to choose about the english prose of dateCreated or createAt
12:33:46 topic: direct mapping status
12:34:13 ivan: I have a first version running
12:34:21 ... didn't hit any major issues
12:34:30 ... have not implemented datatype handling yet
12:34:49 ... sometimes I find the metadata convoluted (e.g. primarykey)
12:35:15 see also http://lists.w3.org/Archives/Public/public-csv-wg/2014Oct/0032.html -> http://w3c.github.io/csvw/experiments/simple-templates-jquery/test.html
12:35:57 ivan: the complexity makes the implementation more complex than it would otherwise be ... but nonetheless, it is implementable
12:36:45 q+ to ask about http://w3c.github.io/csvw/experiments/simple-templates-jquery/tree-ops/metadata.csvm
12:36:51 ivan: the implementation is the same for both JSON and RDF right up to the point where the output format is chosen
12:36:59 ... this is good
12:37:39 ivan: notes that jtandy found issues with the mapping
12:37:42 http://lists.w3.org/Archives/Public/public-csv-wg/2014Oct/0027.html
12:37:56 q?
12:37:57 ivan: how far can we go with the direct mapping?
12:38:05 ... need to be sure we don't overcomplicate things
12:38:43 danbri: I tried to feed the output from ivan's tool into the JSON-LD playground; something not quite right
12:38:59 ivan: needs to check with Greg
12:39:34 danbri: wonders if properties that are not in a namespace get dropped in JSON-LD
12:39:40 q?
12:39:43 ack me
12:39:43 danbri, you wanted to ask about http://w3c.github.io/csvw/experiments/simple-templates-jquery/tree-ops/metadata.csvm
12:39:49 ivan: I'm aiming for JSON, not full blown JSON-LD
12:39:58 danbri: what about the specifications?
12:40:03 q+
12:40:25 ivan: only a few changes ... but notes the need to update the metadata vocab as agreed earlier
12:40:53 ... datatype area is currently underspecified; esp. date formats
12:41:07 ack jtandy
12:41:24 jtandy: I'm looking forward to contributing as an editor
12:41:29 haha
12:41:37 q+ to ask a q that may not be welcome
12:42:22 ivan: notes the need for help with the specification work ... and notes that there's still an XML doc to do too
12:42:40 phila: lots of people still care about XML
12:43:42 http://www.google.com/trends/explore#q=XML%2C%20JSON%2C%20SQL
12:43:46 i have to drop if that is ok ...
12:43:53 thanks rufus
12:44:23 phila: ivan mentioned dates and datatypes; people write dates inconsistently in CSV files ... how can we handle date normalisation?
12:44:24 +1 on phil's point re bad dates ;-) cf http://okfnlabs.org/bad-data/ex/gla-spending/
12:44:39 -rufus
12:45:52 ivan: from the conversion point of view it is easy ... using the 'format' specification in the metadata we can convert into a "proper" RDF (xsd) datatype
12:46:07 ... but if people write rubbish, what can we do?
12:46:53 phila: because the poor date / datetime writing is so common, can we make a special case for validation?
12:47:12 q?
12:47:14 q+
12:47:18 q-
12:47:37 ivan: there is a "format" metadata term, also there are about 15 well known date forms that could be checked against
12:48:22 danbri: otoh, if date strings are so poor, this could be an argument for tolerance?
12:48:24 phila
12:48:25 ack me
12:49:08 phila: agreed, I worry about enforcement of a detailed pattern introducing errors where people don't know the details
12:49:10 q+
12:49:16 t-10
12:49:42 ack jtandy
12:50:50 danbri: ultimately the only thing that will drive up data quality is getting data used!
12:50:57 topic: R2RML experimentation report (danbri)
12:51:17 danbri: has posted to the mailing list ...
12:51:42 ... implementation has been decoupled from SQL and modified to take CSV as input
12:52:33 ... the "event" example is working; 10 triples per row and exactly the triples I wanted (matching what people actually use)
12:53:01 ... but this is template driven, significantly beyond direct mapping
12:53:11 ... is it beyond mustache?
12:53:30 ... Shall we chase authors for a Working Group Note?
12:53:36 https://github.com/w3c/csvw/tree/gh-pages/examples/tests/scenarios/events
12:53:48 begun https://github.com/w3c/csvw/tree/gh-pages/examples/tests/scenarios/uc-24
12:53:55 ivan: a WG Note (for R2RML) is useful; no problem there.
12:54:07 ... do we also want Notes for mustache etc.
12:54:08 example https://github.com/w3c/csvw/blob/gh-pages/examples/tests/scenarios/events/attempts/attempt-1/mapping-events.rml.ttl
12:54:14 bye AndyS
12:54:14 -AndyS
12:54:31 ivan: we need to say how to ref an RML file from our metadata
12:54:39 ivan: we _do_ need to include in the Recommendation how to refer to these external templates
12:55:24 ... we have 3 mapping processes so far: R2RML, mustache, direct mapping
12:56:07 ... would be useful to run through all the use cases to see where the capabilities of each mapping process reach
12:56:38 danbri: there are a few in progress now ... a few more should be enough?
12:57:06 ivan: to be systematic, would go through all use cases ... to document the pros and cons of each approach
12:57:27 ... this is a lot of work
12:57:40 ... perhaps discuss at the F2F meeting?
12:57:55 ... at some point we'll have to build tests _anyway_
12:58:51 ... we need proper testing in order to progress to Recommendation
12:59:44 danbri: if we share the R2RML and mustache implementations we're working with already, then others in the group could work through the rest of the use cases
12:59:47 -Eric
12:59:52 -danbri
12:59:55 -Ivan
12:59:58 -phila
13:00:07 danbri: will nag the R2RML folks to include an Open source license
13:00:14 -jtandy
13:00:16 DATA_CSVWG()8:00AM has ended
13:00:16 Attendees were +44.207.286.aaaa, danbri, Ivan, AndyS, phila, Eric, jtandy, rufus
13:02:35 trackbot, end telcon
13:02:35 Zakim, list attendees
13:02:35 sorry, trackbot, I don't know what conference this is
13:02:43 RRSAgent, please draft minutes
13:02:43 I have made the request to generate http://www.w3.org/2014/10/08-csvw-minutes.html trackbot
13:02:44 RRSAgent, bye
13:02:44 I see 2 open action items saved in http://www.w3.org/2014/10/08-csvw-actions.rdf :
13:02:44 ACTION: (rufus) amend metadata draft with the shortlist [1]
13:02:44 recorded in http://www.w3.org/2014/10/08-csvw-irc#T12-31-07
13:02:44 ACTION: rufus to amend metadata draft with the shortlist [2]
13:02:44 recorded in http://www.w3.org/2014/10/08-csvw-irc#T12-31-33