W3C

– DRAFT –
RDF-star Working Group bi-weekly focused meeting

20 June 2024

Attendees

Present
AndyS, AZ, doerthe, draggett, eBremer, enrico, fsasaki, gkellogg, gtw, ktk, Kurt, niklasl, olaf, ora, pchampin, pfps, Souri, TallTed, tl, Tpt
Regrets
-
Chair
ora
Scribe
fsasaki

Meeting minutes

baseline about use cases

ora: I lost track of what the baseline is, there has been so much discussion

<enrico> https://github.com/w3c/rdf-star-wg/wiki/RDF-star-%22baseline-with-IRI-opacity%22

ora: enrico, can you discuss the baseline briefly before we get going?

enrico: according to the latest discussions we have a mix of transparent and opaque triple terms
… there is discussion if you really want to have opaque triples, that could be a discussion today
… peter introduced a minimal baseline: rdf star with only transparent triple terms

<niklasl> https://github.com/w3c/rdf-star-wg/wiki/RDF-star-%22minimal-baseline%22

ora: what I meant by "lost track": where are we with the issue of our view: reifiers vs. many-to-many?

enrico: in the latest version we do not have the functionality of annotations anymore
… so you can have the same annotation for several triple terms
… we discussed that this does not hamper LPGs
… annotations introduce implicit policies, which means: reasoning is not based on matching anymore
… in the baseline with transparent triple terms which are IRI opaque, there is a base language without syntactic restriction
… RDF entailment then adds only syntactic restrictions
… adding semantics only makes sense if we are in the restricted fragment
… that is the discussion of the last weeks

kurt: I had a presentation that I gave to another group that I wanted to talk about at some point, about alternatives that might be worth exploring
… if I can take 15-20 minutes I can present this, today or at a future meeting

ora: we have an agenda for this meeting, we can accommodate you in a future meeting

pa: use cases should guide our decision.
… do we have a use case to model LPGs in RDF? that is one question

<AndyS> https://github.com/w3c/rdf-ucr/wiki/

<TallTed> https://github.com/w3c/rdf-ucr/wiki/RDF-star-for-labelled-property-graphs ?

pa: the other is: do we have a use case capturing the concerns mentioned by ora and others, also felix
… some arguments there are usage based: how to explain things

<AndyS> https://github.com/w3c/rdf-ucr/wiki/RDF-star-for-labelled-property-graphs (but out of date?)

pa: that should be captured somehow

niklas: we have a minimal description of LPGs, they do not capture the concerns
… if we go forward it is hard to frame how we should look at them
… two baselines: one way to assess use cases is with regard to: do they require reasoning
… I do not think that LPG is a use case, we should define use cases and then say: how to express them in LPG
… in that I expect that pattern matching on transparent terms is clearer

<pfps> Questions about what use cases cover can be answered (since six months or so) by consulting https://github.com/w3c/rdf-ucr/wiki/Summary and other pages in this wiki

peter: for 6 months we have had questions about use cases; the answers have been available in the WG and in the wiki
… go look at the wiki

felix: LPGs are often represented as tables for vertices and edges. Showing an example of how that is done could help
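A minimal sketch of the kind of example felix asks for, assuming a hypothetical LPG held in two tables and a hypothetical mapping in which each edge becomes a triple and its properties are attached to a reifier via rdf:reifies. All identifiers, labels and values are made up for illustration; this is not a mapping agreed by the WG.

```python
# Hypothetical LPG stored as two tables (lists of dicts); all names are invented.
vertices = [
    {"id": "v1", "label": "Person", "name": "Alice"},
    {"id": "v2", "label": "Company", "name": "ACME"},
]
edges = [
    {"id": "e1", "src": "v1", "dst": "v2", "label": "worksFor", "since": 2020},
    {"id": "e2", "src": "v1", "dst": "v2", "label": "worksFor", "since": 2023},
]


def lpg_to_triples(vertices, edges):
    """Map the two tables to a set of triples (illustrative mapping only)."""
    triples = set()
    for v in vertices:
        node = ":" + v["id"]
        triples.add((node, "rdf:type", ":" + v["label"]))
        for key, value in v.items():
            if key not in ("id", "label"):
                triples.add((node, ":" + key, value))
    for e in edges:
        # The edge itself; parallel edges collapse into one s-p-o triple.
        triple_term = (":" + e["src"], ":" + e["label"], ":" + e["dst"])
        triples.add(triple_term)
        # One reifier per edge carries that edge's properties.
        reifier = ":" + e["id"]
        triples.add((reifier, "rdf:reifies", triple_term))
        for key, value in e.items():
            if key not in ("id", "src", "dst", "label"):
                triples.add((reifier, ":" + key, value))
    return triples


triples = lpg_to_triples(vertices, edges)
```

In this sketch the two parallel worksFor edges yield a single asserted triple with two reifiers, which is one way to keep their edge properties apart.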

andys: we are looking for a generic mapping to LPG

<Zakim> TallTed, you wanted to say there are two kinds of use case -- one starts from zero and asks whether it can be satisfied by RDF and/or LPG; second says I have both RDF and LPGs and asks how they can be mixed/blended/combined

andys: the scope for the requirement was: LPG mapping, capturing edge annotations

ted: two kinds of use cases: 1) I have a pure, uncommitted data set that I could put in RDF or LPGs 2) we have been talking about: I have some data in LPGs or RDF, and how can I make those things work together
… that has not been discussed deeply until now
… 2) is the more challenging one. How to bring two data sets together from two companies merging, based on RDF or LPG, and have something useful?

niklas: there are many things in LPGs that have to be addressed, e.g. notion of IRIs, vocabularies ...

… that goes in the direction of LPG-LD, something like JSON-LD for LPG
… not necessarily vice versa
… in the support for the simple baseline: that does not stand in the way to go from LPG to RDF, it is not obvious how to go from richer RDF back again
… if RDF data allows to go back and forth without data loss: not sure how much this is in scope for this WG?

ora: lossless roundtrips are possibly a pipe dream

<niklasl> So, how much loss is acceptable...?

kurt: LPG is a class of different types of graph systems, neo4j is the elephant in the room, a lot of the discussion may be "open cypher and RDF equivalency"
… that is an easier domain to address
… if you talk about the broader class of graphs, including e.g. tinkerpop, things get more difficult
… there is no consistency in the LPG side, RDF has the benefit that it is concise and standardized
… open cypher now has standardization; if we had open cypher equivalence to RDF, that may be easier
… we need to be more specific what the target on the LPG side is and address issues one at a time

ora: so open cypher is: the kind of LPG that open cypher implies

ora: I had a discussion with Jesus from neo4j about this WG, he was interested but he could not participate in this WG (neo4j is not a W3C Member)
… how to deal with use cases? do we take one at a time and discuss it with respect to the baseline?

kurt: enumerate UC first and then go through them

ora: peter, can you describe if our UCs can break up into groups?

peter: there is a group that requires transparency
… another requires complete opacity, including blank nodes
… and one about triple origin, which is weird

ted: could that summary be put in the wiki page?
… this is what we would need as a group to say: only this UC needs this feature, so we do not touch it. Or: this feature is needed by all use cases

peter: sure, I can add that to the summary page

ora: can we look into use cases that require transparency now?

kurt: we need to be specific what needs to be changed in RDF vs. what needs to be changed in turtle
… some changes are new notations
… and somehow turtle constructs become RDF constructs
… we need to be careful what changes are syntactic sugar to be added to turtle, without requiring changes to RDF itself

ora: I assume: there are changes to RDF, i.e. the abstract syntax and semantics

<AndyS> "Agreed Syntax" for Turtle :: https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0095.html

ora: and downstream effects on serializations

enrico: regarding LPGs, I would refer to standard GQL

<enrico> https://drops.dagstuhl.de/storage/00lipics/lipics-vol255-icdt2023/LIPIcs.ICDT.2023.1/LIPIcs.ICDT.2023.1.pdf

enrico: that is, to the data model that the GQL standard assumes in order to build a query language

enrico: the paper summarizes GQL nicely, we can easily refer to that in our discussions

+1 to the proposal from enrico

<pchampin> https://htmlpreview.github.io/?https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.html

pa: changes in the abstract syntax vs. concrete syntax need to be separated
… we had this discussion before kurt joined the group, see above link
… we decided to go for option 3
… we are well aware of the separation of concrete vs. abstract syntax changes and found consensus on that
… we are now working on the assumption that we will touch the abstract syntax
… though there are elements that are syntactic sugar like introducing double brackets

ora: should we take "transparent vs. opaque" as a guidance of our use case analysis?

pa: nearly all UCs seem to require transparency
… LPG use cases require transparency plus
… opacity was introduced to enable LPG use cases
… maybe we should discuss that to find consensus

ted: that is one differentiator, but there may be others
… we need a list of requirements to be able to understand potential relations between features

enrico: opacity is needed only for the annotation of syntactic triples
… if nodes become transparent one can do things to graphs that are impossible with LPGs

… with an example like "this triple has been written wrongly ...": you need to have understanding
… the triple is just a piece of syntax that has some properties

ora: for some people the distinction between opaque and transparent is lost
… do we have use cases that demonstrate that distinction?
… people will ask: what is the difference?

peter: I tried to get answers from use case owners. They mostly did not have the situation that IRIs could help to the same thing
… for literals they said: of course that may be possible.

niklas: even the opaque case may need to be connected to something

andy: other important concepts that are hard to understand: triple terms vs. occurrences
… that also needs to be explained carefully

kurt: what is the difference between a transparent and opaque IRI?
… is it: bnode vs. IRI?
… how does one define transparent vs. opaque?

<enrico> https://github.com/w3c/rdf-star-wg/wiki/RDF%E2%80%90star-examples-of-profiles

enrico: see above link, example 7
… a transparent triple term talks about s.t. in your domain
… you want to refer to what the triple is talking about
… the IRI really denotes things in your domain
… in the case of opaque triple terms, you do not want to talk about the meaning of the triple
… you want to talk about the triple in the graph
… opaque triple terms being resolved are a triple
… transparent triples are statements

kurt: I have a structure that says: here is a term, that is a definition of opacity
… the combination of Sub - pred - object graph as being a resource that can be referred to. It is not talking about the subject

<pchampin> "rose is a flower" → rose in this sentence is transparent / "rose has 4 letters" → rose in this sentence is opaque

kurt: there is no semantics, it implies that there is no RDFs
… then you talk about the combination, an Opaque triple as an object
… you say: here is s.t. that points to the components of the triple
… that does not stop the reification, it just says: I have defined a reification for an entity that may or may not be in the graph

<AndyS> "triple T added to graph on 2024-06-20" (different from the fact described by the triple became true)

kurt: maybe that is a way to have a notation to distinguish transparency from opaqueness

<niklasl> I tried, in slide two of https://docs.google.com/presentation/d/e/2PACX-1vQd9lU1j4TPxluCe-cB0t7_BUpy8zAfeY_5hDlbwIyOB8wsiRqkRtSFP4AeflV5UsE4EqT-Y3_Jjx9q/pub (that I sent ~2 weeks ago) to quote the relevant parts as they are defined (one non-normatively) in RDF 1.1.

thomas: question about the unasserted triples
… for many use cases I took the annotation syntax as assigned
… for anything else, like cidoc-crm, I am lost
… there were mentions of customers who may need that
… also in the community group
… so I want to better understand when unasserted assertions are needed

<pchampin> wikidata clearly needs to talk about unasserted triples

enrico: opaque use case is about syntactic annotation
… it is not only about syntax
… clark kent - super man example

<tl> pchampin why?

enrico: names do not co-refer, that is a common use case in logic, not in RDF
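The Clark Kent / Superman point can be sketched as follows. This is only an intuition for "co-referring names are interchangeable in transparent terms but not in opaque ones", not the WG's formal semantics; the DENOTES table, the IRIs and both helper functions are assumptions made for the illustration.

```python
# Hypothetical interpretation: two names denote the same individual.
DENOTES = {
    ":ClarkKent": "person-1",
    ":Superman": "person-1",
    ":LoisLane": "person-2",
}


def interpret(term, opaque=False):
    """Transparent terms are replaced by what they denote; opaque terms stay names."""
    if isinstance(term, tuple):
        return tuple(interpret(t, opaque) for t in term)
    if opaque:
        return term
    return DENOTES.get(term, term)


def same_statement(t1, t2, opaque=False):
    """Do two triple terms count as the same thing under the chosen reading?"""
    return interpret(t1, opaque) == interpret(t2, opaque)


t1 = (":LoisLane", ":admires", ":Superman")
t2 = (":LoisLane", ":admires", ":ClarkKent")

transparent_equal = same_statement(t1, t2)            # names co-refer
opaque_equal = same_statement(t1, t2, opaque=True)    # names do not co-refer
```

Under the transparent reading the two triple terms collapse into one statement about the same individuals; under the opaque reading they remain two distinct pieces of syntax.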

<pchampin> tl, see https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#h-Statement_representation-Data_model

enrico: we talk about things being not rigid

kurt: this use case is what I was talking about

<tl> pchampin thanks, will do

kurt: if I have a triple that asserts a false statement, that statement should not be in the graph
… that is the use case for being opaque
… the s-p-o structure is triple like, but the assertion is not valid
… you can only do that by talking about triples as being s.t. being not part of the graph
… here being opaque makes sense
… you are dealing not with triples but with reified statements

<AndyS> "that graph at U contains the triple <<( :s :p :o )>>" -- "triple T withdrawn on 2024-06-20"

pa: we have two differentiators, quoting ted

<gtw> FWIW from my prior use of CIDOC-CRM, I'd say that often the modeling desire *is* to have unasserted triples. It probably only works with the many-to-many reifier approach where the reifier can denote something in the CRM domain (usually an Activity).

pa: there are some requirements. Would be good to start listing the requirements
… saying: there is consensus on this one, not on others
… e.g. do we need to talk about unasserted triples?
… in a previous meeting we said: we create a document note
… maybe we need to be more proactive here

<TallTed> also, do we need to *query* unasserted triples (e.g., what has been said about Paris, whether or not those statements are asserted within the graph?)

thomas: should the unasserted statements be opaque or transparent?

niklas: see above link from a presentation on opaque nodes, a case we did not have so far
… I wonder if many-to-many is more important
… I fear that opacity may be a distraction
… for true opacity, use literals
… it is dangerous to focus on the opacity edge case

enrico: I do not understand asserted vs. non asserted
… why do we talk about this?

ted: we want to be able to talk about triples that are part of the graph and are part of the reasoning
… and other triples that are not part of the reasoning of the graph
… we want to be able to get all triples back, of both types, that deal with an entity

<niklasl> It's in there, I'm quite certain.

<tl> it's not

ted: you want to be able to discover the non asserted notions of the entity but still have them *not* being part of the reasoning

ora: we do not have a use case for that, can you write that up?

<niklasl> Why not?

<Souri> select ?r ?x ?p ?y { ?x ?p ?y } UNION { ?r rdf:reifies <<( ?x ?p ?y )>> }

ted: I can give it a try

enrico: agree with souri's example
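A rough sketch of what the two branches of Souri's query retrieve, written against a plain in-memory set of tuples rather than a SPARQL engine. The data, the entity filter (taken from TallTed's "what has been said about Paris" question) and the function name are assumptions for illustration. As a simplification, the first branch here skips the rdf:reifies triples themselves.

```python
RDF_REIFIES = "rdf:reifies"

# Hypothetical graph: triple terms are modelled as nested tuples.
graph = {
    (":paris", ":capitalOf", ":france"),                        # asserted
    (":r1", RDF_REIFIES, (":paris", ":capitalOf", ":france")),  # reifies an asserted triple
    (":r2", RDF_REIFIES, (":paris", ":capitalOf", ":italy")),   # reified, not asserted
}


def statements_about(graph, entity):
    """Return rows (reifier, s, p, o) from both branches of the union."""
    rows = []
    for s, p, o in graph:
        if p == RDF_REIFIES and isinstance(o, tuple):
            # Second branch: ?r rdf:reifies <<( ?x ?p ?y )>>
            x, q, y = o
            if entity in (x, y):
                rows.append((s, x, q, y))
        elif entity in (s, o):
            # First branch: plain asserted triple, no reifier bound.
            rows.append((None, s, p, o))
    return sorted(rows, key=lambda row: (row[0] or "", row[1:]))


rows = statements_about(graph, ":paris")
# Reified triples that are not themselves asserted in the graph.
unasserted = [row for row in rows if row[0] is not None and row[1:] not in graph]
```

The unasserted list picks out rows that are reachable through a reifier but do not appear as asserted triples, which is the kind of retrieval ted describes.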

ora: semantics call tomorrow, adjourned

<tl> i will not be there tomorrow

<TallTed> That's not a bad query, Souri. But it's rather complex. and demands syntactic sugar!

<niklasl> There will be sugar.

<niklasl> rdf:reifies is not expected to be written out other than in (gah) edge cases.

<AndyS> "Agreed syntax" allows -- UNION { << ?r | ?x ?p ?y >> }

<AndyS> We do need to write out the triples, not sugar, as well because people relate to different forms + it ties to the semantics.

<niklasl> Absolutely.

<niklasl> I mostly expect nice examples and Turtle serializers to utilize the sugar as much as they utilize ";" and "[ ..]" (i.e. it depends on their capability and the computing resources in relation to size of the graph).

Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

Diagnostics

Succeeded: s/prsent+/

Succeeded: s/could not participate in this WG/could not participate in this WG (neo4j is not a W3C Member)/

Maybe present: andy, felix, niklas, pa, peter, ted, thomas

All speakers: andy, andys, enrico, felix, kurt, niklas, ora, pa, peter, ted, thomas

Active on IRC: AndyS, AZ, doerthe, draggett, eBremer, enrico, fsasaki, gkellogg, gtw, ktk, Kurt, niklasl, olaf, ora, pchampin, pfps, Souri, TallTed, tl, Tpt