12:09:49 RRSAgent has joined #schemata-discussion
12:09:53 logging to https://www.w3.org/2024/03/12-schemata-discussion-irc
12:09:53 RRSAgent, do not leave
12:09:54 RRSAgent, make logs public
12:09:57 Meeting: Schemata Discussion - Follow up from TPAC23
12:09:57 Chair: Ege Korkan
12:09:57 Agenda: https://github.com/w3c/breakouts-day-2024/issues/15
12:09:57 Zakim has joined #schemata-discussion
12:09:57 Zakim, clear agenda
12:09:57 agenda cleared
12:09:57 Zakim, agenda+ Pick a scribe
12:09:57 tidoust has joined #schemata-discussion
12:09:58 agendum 1 added
12:09:58 Zakim, agenda+ Reminders: code of conduct, health policies, recorded session policy
12:09:59 agendum 2 added
12:09:59 Zakim, agenda+ Goal of this session
12:09:59 agendum 3 added
12:09:59 Zakim, agenda+ Discussion
12:10:00 agendum 4 added
12:10:00 Zakim, agenda+ Next steps / where discussion continues
12:10:00 agendum 5 added
12:10:00 tpac-breakout-bot has left #schemata-discussion
12:13:55 tidoust has changed the topic to: Breakout: Schemata Discussion - Follow up from TPAC23 - Kora - 14:00-15:00 UTC
13:57:25 kaz has joined #schemata-discussion
13:57:28 McCool has joined #schemata-discussion
13:59:10 JKRhb has joined #schemata-discussion
13:59:46 ivan has joined #schemata-discussion
13:59:48 scribenick: JKRhb
13:59:55 present+
14:00:17 Ege has joined #schemata-discussion
14:00:27 meeting: Schemata Discussion - Follow up from TPAC23
14:01:33 betehess has joined #schemata-discussion
14:01:37 present+ Kaz_Ashimura, Ege_Korkan, Marcel_Otto, Peter_Bruhn_Andersen, Alexandre_Bertails, Ivan_Herman, J-Y_Rossi, Jan_Romann, Klaus_Hartke, Michael_McCool, Octavian_Nadolu, Tomoaki_Mizushima, Christian_Glomb
14:01:54 present+ Vladimir_Alexiev
14:02:08 rrsagent, make log public
14:02:10 Tomo has joined #schemata-discussion
14:02:12 rrsagent, draft minutes
14:02:13 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz
14:02:20 topic: Intro
14:02:21 ek: We can start
14:02:21
Octavian_Nadolu has joined #schemata-discussion
14:02:24 ... welcome everyone
14:02:35 ktoumura_ has joined #schemata-discussion
14:02:35 pebran has joined #schemata-discussion
14:02:38 ... session is called Schemata Follow Up from TPAC2023
14:02:41 present+ Tomoaki_Mizushima
14:02:50 marcelotto has joined #schemata-discussion
14:02:53 ... Jan is taking minutes
14:03:00 ... will skip the introduction round
14:03:09 present+
14:03:17 VladimirAlexiev_ has joined #schemata-discussion
14:03:20 ... due to large turnout, please introduce yourself before speaking
14:03:23 present+ Michael_McCool
14:03:28 present+ Thomas_Wehr
14:03:45 present+ Kunihiko_Toumura
14:03:47 rrsagent, draft minutes
14:03:48 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz
14:04:11 ... meetings are under two policies: Antitrust and competition policy, encourage competition, furthermore we encourage a good work environment
14:04:30 ... context is to share experience and find a place within W3C for discussion
14:04:47 ... required background is some knowledge of SHACL and JSON-LD
14:04:51 topic: Presentation
14:04:52 present+ Vladimir Alexiev
14:05:19 ek: Maybe we have met before, we have hosted a session with Pierre-Antoine before at TPAC
14:05:22 new
14:05:26 new
14:05:28 new
14:05:32 new
14:05:33 ... if you haven't been there, please write "new" into the IRC
14:05:42 present+ Ben_Hutton, Dorthe_Arndt
14:05:43 ... however, I also prepared a brief intro
14:05:53 ... there are a few new people here, I see
14:06:23 ek: We have slides from previous sessions, which you can look at
14:06:39 doerthe has joined #schemata-discussion
14:06:57 ek: As a quick summary, you have different kinds of schema approaches, and if you have a specification that uses different concepts, it becomes hard to manage
14:07:17 ... in WoT TD, for example, we have the spec document itself
14:07:20 ... ontology documents
14:07:24 ... SHACL shapes
14:07:28 ...
JSON Schema files
14:07:38 ... type and class definitions in TypeScript
14:07:43 ... tests and examples
14:07:54 ... all need to be managed, updated and published
14:08:15 ... we have some tooling, but we still have to do some manual work
14:08:37 ... soon, we will also have a registry for Binding Documents, where authors will also face these issues
14:08:52 ek: Previous presentations were given by Chris Mungall and @@@
14:08:59 see https://github.com/json-ld/yaml-ld/issues/19 for more "polyglot modeling" approaches/frameworks
14:09:14 ek: The work so far included an analysis by the WoT WG
14:09:26 ... concerning versioning, packaging, and serving resources
14:09:46 ... we can discuss how to continue this in the last 10 minutes of this slot
14:10:06 ek: Mahda did most of the work for this presentation actually, but she is currently not available
14:10:16 ... all resources are available on GitHub
14:10:33 present+ David
14:10:46 ek: So far, we have created a very complicated diagram summarizing the very complicated process that has to be done with every PR
14:10:51 present+ Dominik_Tomaszuk, Elodie_Thieblin
14:11:02 ... for example, the JSON Schema needs to be updated or rendering needs to be triggered
14:11:13 ... we are not very proud of it, it is quite messy
14:11:28 ek: We have then been looking into alternative tooling to make our lives easier
14:11:48 ... and collected metrics and other aspects for comparison in a table
14:12:12 ... for example, the handling of different value representations, inheritance, or unknown object keys
14:12:45 ... at the moment, we are in favor of using LinkML in the future, but we have not decided yet and this is not the topic of this session
14:12:59 ... but we want to collect feedback from the tool authors themselves
14:13:12 ... as Vladimir Alexiev has already done, thank you for that
14:13:32 ... we want to update our requirements accordingly, to make sure that the process is transparent
14:13:43 ...
any questions so far regarding the analysis or the diagram?
14:13:48 No questions so far
14:13:57 dezell has joined #schemata-discussion
14:14:07 ek: If you have any questions, then please join the IRC and write "q+"
14:14:20 ek: So we have all of these resources, but there is still a missing point
14:14:22 ethieblin has joined #schemata-discussion
14:14:33 present+ David_Ezell
14:14:39 ... so in the WoT WG, we have a repo with GitHub Pages available
14:14:54 ... after publishing, the W3C team contact adjusts the redirection
14:14:57 present+ elodie.thieblin
14:15:12 ... in general, you can consider this as uploading software to a web server
14:15:22 ... and there is no standardized way to handle this
14:15:26 I've seen many communities that face the same problem: electrical CIM, traceability in trade, GS1 EPCIS in logistics, ACORD in insurance, IFC in AECO etc etc
14:15:39 ... and this process is too slow for our release cycle
14:15:44 ek: Can we do better?
14:15:52 ... we have a PR that tries to address this
14:16:12 ... we need better tooling, could rely on package managers
14:16:30 ... Klaus Hartke has done some work on using npm for this kind of thing
14:16:35 For the Traceability community, I asked them to consider LinkML: https://github.com/w3c-ccg/traceability-vocab/issues/295
14:17:06 ... in the JSON Schema world they are using custom registries (?)
14:17:18 ek: So this finishes the summary for now
14:17:23 More importantly, I wrote up some draft Requirements for such tooling: https://github.com/w3c-ccg/traceability-vocab/issues/296. This could complement the comparison table that WoT showed
14:17:26 ... wanted to keep it brief
14:17:28 rrsagent, draft minutes
14:17:29 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz
14:17:47 ... in the TPAC 2023 discussion, there was the question of where the discussion should continue
14:17:52 q+
14:18:01 ...
not necessarily needed to standardize something like LinkML
14:18:26 q?
14:18:42 ... but there needs to be some process or best practices in my opinion, if anyone has other thoughts, please make a comment
14:18:50 ... any questions?
14:18:54 https://docs.google.com/presentation/d/193OFcFaxD0GqrRuOggwZe5eorgL1C1Epe2cAYN3JEkk/edit?usp=sharing
14:19:11 kaz: A comment regarding logistics: Please paste the link of the slides into the IRC
14:19:39 ... another comment: please list the important questions in the slides
14:19:47 ... like tooling, versioning, and so on
14:20:01 ek: (Updates the slides)
14:20:10 i|Maybe we have met|-> https://docs.google.com/presentation/d/193OFcFaxD0GqrRuOggwZe5eorgL1C1Epe2cAYN3JEkk/edit?usp=sharing Slides|
14:20:13 ... if there are any points, we can categorize them accordingly
14:20:14 .. communities: also AAS (industrial Digital Twins, i.e. Industry 4.0 / RAMI)
14:20:25 ... in the IRC, there were some comments by Vladimir
14:20:36 ... saying that others have been suffering from the same issues?
14:20:55 va: I have been seeing the same issues in other communities as well
14:21:16 ... many communities want to use both JSON Schema and JSON-LD, and they need to be in sync
14:21:32 ... in some cases, they want to borrow schemas from other people and mix and match them
14:21:51 ... in some cases, the results are mixed, in others they are very bad
14:22:00 ... as many ontologies come with their own baggage
14:22:16 ... also difficult if you are relying on a schema in XML
14:22:50 va: In particular the Trade Transparency group has been working on related topics
14:23:09 ... find it very interesting to see how semantic technologies spread into these communities
14:23:17 ... on the other hand, these people need some help
14:23:30 ... need guidance on how to use RDF properly
14:24:00 ... even simple things, how to model triples with literals (?)
14:24:37 ...
in RDF, we have infinite precision, in the case of conventional JSON numbers, we don't
14:24:39 q+
14:24:46 q-
14:24:51 va: The importance of this discussion is very very high
14:25:00 ... question how to get RDF into more communities
14:25:13 ... that try to use it, but don't get it right yet
14:25:29 ... question how W3C can help these communities
14:25:54 ek: Thank you for your comments, tried to update the presentation with the links from the minutes
14:26:01 q?
14:26:07 The import is: how can the sem web community help other data communities "graduate" into linked data?
14:26:15 ... could you elaborate on the Trade Transparency group?
14:26:17 va: Will do
14:26:28 ab: Quick introduction:
14:26:36 ... work at Netflix on an ontology service
14:26:54 ... this group is discussing exactly what we are trying to solve
14:27:11 ... try to combine schemas with ontologies (?)
14:27:21 ... we tried using SHACL, has been working okay so far
14:27:42 ... issue is that people need to learn RDF and SHACL, which is difficult
14:27:57 Another example: AAS is a very important spec in Industrial IoT. But they have fundamental issues that go against the Web Architecture, eg https://github.com/admin-shell-io/aas-specs/issues/383. See https://github.com/admin-shell-io/aas-specs/issues/384 for a list of issues
14:27:57 ... we have a problem of discovery, how to discover ontologies?
14:28:15 ... question how to combine schemas and ontologies
14:28:43 ... very much interested in talking to people, standardization, wich we would like to see at some point
14:28:49 s/wich/which/
14:29:18 ek: Thank you, you've mentioned that you get a lot of questions regarding SHACL, do you have problems with adoption?
14:29:32 ab: Problem is that SHACL is way too powerful
14:29:42 ... difficult to map to a schema
14:30:11 ...
tried to create a subset of SHACL that is powerful enough to write meaningful ontologies which you can then project onto GraphQL for example
14:30:17 rrsagent, draft minutes
14:30:19 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz
14:30:25 ek: I think I can relate a bit to that
14:30:25 Another example: the Allotrope community (lab equipment measurements). It is very active and does things right. They use JSON-LD in nice ways
14:30:50 ... I think a main problem is that @@@
14:31:00 q?
14:31:05 ... they have trouble seeing the benefit of switching
14:31:09 q-
14:31:28 ek: There has been some discussion on whether to standardize or not
14:31:38 ... from my point of view, there wasn't a need yet
14:31:47 ... question: What should be standardized?
14:32:05 ... some aspects from the table could be standardized, but the tools are already quite stable
14:32:22 ... not sure about the benefits, other than that it would give more credibility
14:32:23 q+
14:32:50 ab: I consider the main issue with this slide related to standardization
14:33:07 q+
14:33:20 ... how do we make sure that our projection to GraphQL, for example, is correct and can be injected into an ontology?
14:33:30 q+
14:33:39 ack betehess
14:33:50 ek: From my point of view, it is mostly a tooling question, with LinkML you could go to the other representations
14:34:08 ab: Yeah, for a lot of people it would be about tooling, for us it would be about meaning
14:34:20 ek: Guaranteeing that there is no information loss?
14:34:35 q?
14:34:36 ab: Yeah, and the information is represented correctly
14:35:06 va: I feel very much all of the questions that have been raised as we are facing similar issues
14:35:41 ... I tried to create a community for canonical mapping between SHACL and RDF(?)
14:35:58 s/RDF(?)/Shex/
14:35:59 ... SHACL is very useful to build UIs
14:37:37 ... so the question is how to use the most common subset of SHACL.
How to use with GraphQL and translate to SPARQL and use certain more complex joins
14:38:03 ... very important how to transform without losing meaning
14:38:11 +1 on being lossless. That's our main concern here at Netflix.
14:38:26 ek: Just one point
14:38:42 ... you've mentioned that there are GraphQL implementations that use RDF?
14:38:52 ... could you repeat that?
14:38:56 rrsagent, draft minutes
14:38:57 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz
14:39:04 q?
14:39:04 ack v
14:39:11 GraphQL implementations over RDF, and benchmarks (there's LinGBM and a couple smaller ones): https://www.zotero.org/groups/5393345/semantic_graphql
14:39:12 va: I have a Zotero library with resources regarding this topic, I will send you a link
14:39:25 ih: I have seen similar problems before
14:39:29 q-
14:39:46 ... I am currently participating in the @@@ WG, there have been similar issues
14:40:21 ... Pierre-Antoine mentioned similar issues, there are slight differences between different languages, making it hard to convert
14:40:38 ... one thing we've seen was that SHACL and ShEx cannot model datasets
14:40:43 s/@@@ WG/Verifiable Credentials WG/
14:41:01 Holger Knublauch is convening a SHACL 1.2 CG that should address named-graphs
14:41:17 ... typical problems, can't imagine what we will face with the introduction of RDF 1.2
14:41:49 ih: What I am critical of is the problems JSON-LD introduces
14:42:13 ... as it is sold both as a serialization of RDF and as plain JSON
14:42:26 ... inherent problem of JSON-LD, not sure how to solve it
14:42:51 ... one thing I've seen in communities was a misunderstanding of JSON-LD context files
14:43:01 q?
14:43:11 ... as context files can be seen as a glorified mapping file
14:43:14 q+
14:43:16 "@context" is not an ontology, and it is not a schema: it's only a mapping from JSON to ontology terms
14:43:58 ...
we need to work on making these discrepancies disappear, but I am not really optimistic, as all of these schemata are not the same
14:44:18 ... should not create yet another standard (in the XKCD sense)
14:44:41 ... but we need to be aware of the discrepancies before we can go to the tools
14:45:08 ek: Thank you, very good points, the double nature of JSON-LD is exactly why we are facing these kinds of problems
14:45:15 q?
14:45:18 ih: I have seen this in many communities before
14:45:24 ack ivan
14:45:27 ack betehess
14:45:31 ab: I agree so much with what Ivan just said
14:45:42 ... we are facing the same problems at Netflix
14:46:02 ... JSON-LD has not been a problem so far, everyone is using Turtle and that is working as expected
14:46:37 ... the lack of definition of issues has been an issue, question how to import an asset (?)
14:47:18 ... one aspect that we've noticed is that we can always define a SHACL shape and achieve what we want to do, also with fundamental things
14:47:29 q?
14:47:43 ... want to publish something in that regard soon, SHACL is very good for this kind of thing in our experience
14:48:28 sorry, ntd
14:48:41 ek: In my experience, there is some tooling to help us if we want to go down the route of using SHACL
14:49:01 topic: Check-out
14:49:03 ek: Now we should fill out the check-out slides
14:49:24 ... I think one consensus was that this is an annoying problem
14:49:48 ... and is relevant for different communities (not only WoT)
14:49:55 ... not sure about the next steps
14:50:05 ... should we create a CG or mailing list?
14:50:11 q+
14:50:15 ... what can we do to work on this?
14:50:48 ih: What I feel is that we should try to list, gather, and categorize the problems that make these tools so difficult
14:51:12 ... for example datasets, as Alexandre mentioned, or RDF 1.2 or literals
14:51:23 ... we need to have a clear view of the problem space
14:52:00 ...
I have the impression that we should not jump into creating new tools, we should first understand the problem, step up
14:52:05 s/up/back/
14:52:11 ek: Agree with that
14:52:23 ... what should the next step be then?
14:52:55 va: I think we should begin with the UCR
14:53:05 ... and then begin with SHACL vs ShEx
14:53:19 ... creating a mapping, then see what is missing
14:53:46 ... similar with GraphQL, if W3C wants to create a CG working on a mapping then this could happen
14:54:11 ... not going to solve all problems, as there will always be differences
14:54:18 kaz_ has joined #schemata-discussion
14:54:30 rrsagent, draft minutes
14:54:31 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz_
14:54:54 ... what do you do in case of a discrepancy? Maybe best practices are enough, we do not necessarily need to create a new specification
14:55:17 ... the SHACL 1.2 CG is a good approach
14:55:42 q+
14:55:45 ... @@@
14:55:57 q?
14:56:13 ... so first focus on UCR and then create focus groups to start working on the individual problems
14:56:35 ek: Added creating the catalog to the slides
14:57:01 kaz: I basically agree with Ivan and Vladimir
14:57:25 ... we should clarify requirements, what the problem is, then see what solution would fit
14:57:46 ek: I think there is a question of whose requirements these are
14:58:05 kaz: A better word might be expectations
14:58:28 ek: There is a question of ownership, not only the WoT WG is involved
14:58:31 yes: 1. Catalog the problems/features/questions/issues (UCR), 2. focus CGs to work out specific issues: a) SHEX-SHACL mapping, b) RDF-GraphQL best practices, c) maybe YAML/YAML-LD syntax for mixing schemata approaches (JSON Schema and JSONLD Context are first candidates)
14:58:58 kaz: As I mentioned, this is related to Ivan's comments, we should clarify and categorize problems
14:59:17 ... and then see how they relate to requirements
14:59:39 ...
WoT WG should clarify its own requirements, then we can contact the others again
14:59:46 4. Dissemination/proliferation into various communities. Because these are problems that affect widely different communities, it will not be easy to reach/evangelize to them
15:00:59 ek: Question is how the individual groups will form, as I am not part of the groups mentioned in the discussion
15:01:06 s/As I mentioned, this is related to/As I mentioned at the beginning of this session, and similar to/
15:01:09 rrsagent, draft minutes
15:01:11 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz_
15:01:26 ... if there is not going to be a new CG, then each group will first have to work on its own
15:01:55 0. Catalog of tools/practices. I'd be ecstatic if a single tool (eg LinkML) can solve the problems, but I'm doubtful. So we can borrow from the "KGC" CG (who work on RDB/JSON/XML mapping tools, extending R2RML and RML): features from one tool are borrowed as requirements for another
15:02:01 ... (adds an action item to the slides that each person notes the problems and should do dissemination)
15:02:16 ... that concludes our session
15:02:34 [adjourned]
15:02:35 rrsagent, draft minutes
15:02:37 I have made the request to generate https://www.w3.org/2024/03/12-schemata-discussion-minutes.html kaz_
17:26:47 tidoust has joined #schemata-discussion
20:23:47 RRSAgent, bye
20:23:47 I see no action items