RDF-star WG biweekly long meeting

Meeting minutes

<AndyS> Summary link -- https://htmlpreview.github.io/?https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.html

Ora welcomes back niklasl who has renewed invited expert status.

<pchampin> https://htmlpreview.github.io/?https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.html

Adrian invites pchampin to summarise where we are

pchampin: we should be able to straightforwardly adapt the RDF star semantics for all of the proposals

Discussion of updated "Seeking consensus" table https://htmlpreview.github.io/?https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.html

pchampin: the most contentious point is the syntax

pchampin: a simple point: there is a question mark re SPARQL in olaf's email

The triple pattern s p o doesn't match edge statements ...

<pfps> I think that it is more accurate to say that the semantics is the same except the part for embedded triples.

pfps: the semantics are all the same except in relation to (?)

olaf: the last column in the table is less clear for me

gkellogg: talks about updating RDF/XML and pchampin's sugar+ note

Souri: was on vacation last week, so now catching up. An RDF triple should be counted separately from a named occurrence of a triple
… keeping them independent, I like that idea, but the count should be zero

AndyS: 2 points, 1) RDF/XML has its own well-formedness condition (gives details), 2) about count, it's worth noting that in Olaf's example, there is a real triple, so the numbers in the column might be one more than you think. I would like to know how you go between edges and triples

olaf: responding to Souri, I concur with AndyS. (provides explanation)
… Souri should be able to respond to AndyS's 2nd point

Souri: if the data was an annotation, the count of s p o should be 1. For a named occurrence of a triple, in that case SPARQL should see a count of zero
… for a regular triple, count should be 1
… SPARQL should find the references rather than the triples themselves

tl: we're conflating asserted/unasserted triples.

olaf: it depends on which case we are talking about, for the 3rd case, first row, expansion to named triple
… on the next row, the reference is to a different position, and you see two triples
… For the last column it isn't so clear

<pchampin> the variables in the SPARQL query should probably have different names, like ?x ?y ?z, to differentiate them from :s :p and :o in the example :-/

Souri: if the subject in the query is a named triple, the count should be 1. There is one triple if we don't constrain the variable.
… RDF today doesn't have the asserted/unasserted distinction. S P O gets asserted, but when you use a name, it isn't asserted.
… we can avoid introducing the asserted/unasserted distinction.

AndyS: I think you're misreading the example. (explains).

pchampin: what is probably confusing is that the query is not being run on the stuff in the previous row

<AndyS> One triple is :e rdf:nameOf <<( :s :p :o )>> and the second triple is :e :pp :oo . -- :s :p :o is not asserted.

<olaf> First triple is :e rdf:nameOf <<( :s :p :o )>>

<AndyS> +1 to Olaf.
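
A minimal sketch of the two-triple graph that AndyS and olaf describe above, assuming a toy model in which a triple term is a nested Python tuple and `None` acts as a wildcard in a pattern. The `match` helper is hypothetical and is not SPARQL; it only illustrates why `:s :p :o` does not show up as an asserted triple in this example.

```python
# Toy model: a graph is a set of (subject, predicate, object) tuples.
# A triple term <<( :s :p :o )>> is modelled as a nested tuple (an assumption).
TRIPLE_TERM = (":s", ":p", ":o")

graph = {
    (":e", "rdf:nameOf", TRIPLE_TERM),  # first triple: names the triple term
    (":e", ":pp", ":oo"),               # second triple: statement about the name
}

def match(graph, s=None, p=None, o=None):
    """Return the asserted triples matching the pattern; None is a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

all_triples = match(graph)                      # like the pattern ?x ?y ?z
asserted_spo = match(graph, ":s", ":p", ":o")   # like the pattern :s :p :o

print(len(all_triples))   # 2
print(len(asserted_spo))  # 0
```

In this model the triple term only appears in object position, so a pattern over asserted triples never returns it as a row of its own.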

ora: one of the challenges is how we can explain things simply to avoid confusion both to ourselves and to others.
… we now risk confusing vertices and edges in our explanation. The double chevron is a vertex not an edge.
… Do we want special interpretation for rdf:namedOccurrenceOf

<Zakim> TallTed, you wanted to ask for changes to the table, that may bring more clarity

TallTed: I will tweak the markdown for the table shortly to add another row to improve clarity with SELECT *

<olaf> +1 to replacing "SELECT ( COUNT(*) ... " by "SELECT * ..."

TallTed: I will also clean up line breaks, code and non-code blocks

AndyS: are we agreed on the syntax?
… (for the Turtle syntax)

<pchampin> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0095.html

pchampin: I believe you are referring to the above email, right?

AndyS: yes

<Zakim> gkellogg, you wanted to ask about bare use of <<>> when it's not allowed as a subject.

Ora: right

gkellogg: talks about bare use of <<>> when it's not allowed as a subject.
… it makes sense to select over just the term, as the subject.

Souri: I like giving variables that match S, P and O. (gives details in relation to the table examples).

AndyS: the new term in the reification, the RDF star syntax is in the chairs starting point email, a named occurrence, which can come in 3 forms.
… or as the 4 element form. The RDF star examples continue to work for the subject position. +1 to Ted for adding another row

<AndyS> In the "reification atom" column, the new term is <<( )>>. RDF-star CG <<>> syntax is reused for a named occurrence in subject or object position https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0095.html . Without the bar, a new blank node is used; with the bar "|" and now 4 parts, the first part is the name of the reification.
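
A rough sketch of the two forms AndyS describes, under the assumption that both expand to a single `rdf:nameOf` triple (the property name used in the IRC examples earlier in the meeting). The function names, the blank node labelling scheme, and the tuple encoding of `<<( s p o )>>` are all hypothetical.

```python
import itertools

_bnode_ids = itertools.count()

def fresh_bnode():
    """Return a new blank node label (hypothetical labelling scheme)."""
    return f"_:b{next(_bnode_ids)}"

def expand_occurrence(s, p, o, name=None):
    """Expand << s p o >> or << name | s p o >> into (reifier, naming triple).

    Without a name, a fresh blank node is used as the reifier.
    """
    reifier = name if name is not None else fresh_bnode()
    triple_term = (s, p, o)  # stands in for <<( s p o )>>
    return reifier, (reifier, "rdf:nameOf", triple_term)

# 3-part form: no bar, so a new blank node is generated.
anon, anon_triple = expand_occurrence(":s", ":p", ":o")

# 4-part form: the part before the bar is the name.
named, named_triple = expand_occurrence(":s", ":p", ":o", name=":e")
```

Either way the returned reifier is what would then sit in the subject or object position of the enclosing triple.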

niklasl: talks about naming, we need to sort that out

niklasl: the main thing is whether or not we should just have reification or also have descriptors.
… these are not named graphs.
… I'm suggesting using rdf:value.
… I see a lot of value in the 3rd column in the table.

olaf: responding to Souri, refers to first column, 2nd row. This isn't something you can put directly in a Turtle file.

<tl> /me *now* i understand how olaf came to those numbers

ora: we don't write vertices by themselves in a Turtle file.

AndyS: 2 points. The name of a triple is kind of special. Any RDF graph can link a name to an atom.

… re: multiple names, they should refer to the same triple (i.e. edge)
… Atoms can simplify definition of well-formed n-triples

Souri: reducing the number of triples is very important. Regarding vertices and edges, property graphs are easy to think about: properties on vertices and edges.
… A triple is an edge that doesn't allow duplicates.
… you can have multiple named edges, that's fine and helps with authoring.
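
One way to picture Souri's point, as a sketch only: a set of tuples stands in for the graph, and a dictionary stands in for the names. The names `:e1` and `:e2` are made up for illustration.

```python
triple = (":s", ":p", ":o")

# A graph is a set, so asserting the same triple twice leaves one copy.
graph = set()
graph.add(triple)
graph.add(triple)

# Several names may still point at that one triple (hypothetical names).
names = {":e1": triple, ":e2": triple}

distinct_named = set(names.values())

print(len(graph))           # 1
print(len(names))           # 2
print(len(distinct_named))  # 1
```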

<ktk> thanks draggett

pa: edge is a tricky concept. "e - spo" = name on edge, is not true ...
… edge is not asserted.
… edge is an imaginary edge here. Triples are assertions, edges are potential links / assertions
… naming an edge for future usage to talk about it: agree with that. All proposals make that possible
… one can still use the expanded syntax, with more or less syntactic sugar, in all options

Andy: name of triples is just saying "there is name", without anything else
… better to have this as RDF triple, not a separate concept

souri: syntactic sugar helps to keep amount of storage needed for n-triples down
… we reserve a name for an occurrence of "spo"
… as a data creator, I want to keep some obvious names so that others can add annotations
… that is possible with the "spo" approach
… with the potential "edge", rdf triple is a special kind of edge.
… "e spo" is a potential edge. There is no question of its assertion, it is not an rdf triple kind of edge.
… we could call it "occurrence". I just want to enable: having the name so that others can add annotations
… want to allow users of RDF to have this mental model that uses the terminology of edges and nodes
… this is very similar to the idea of the third column. This would make it easier to understand RDF for property graph people and general graph people

ora: I am extremely worried
… I don't want to explain that we have triples and edges
… we already have explained that triples encompass edges
… in property graph there is a structure of vertices and edges
… in RDF, vertices are like points, they don't exist
… Souri says "introduce the name so that others can re-use it": people can do that all the time
… you can make a statement so that you can re-use names
… I don't see the need to introduce names for things that some day could be edges
… without making any statements about them. We already have the practice to use names for statements. Let's try to keep this simple
… let's go back to thinking of vertices and edges.

pa: we said that triples are edges in the graph, agree
… the issue is: until rdf-star, there was a conflation
… now, we might need to distinguish notions

<Souri> I have no problem with column-3 way of introducing a name for a triple (actually, an occurrence of a triple or "RDF edge"). We don't have to introduce the idea of "edge".

pa: make it explicit, like Andy said, the distinction between edges and triples
… that is the key we need to clarify
… Souri's idea of a mental model: for me, realizing that the semantics could be adapted helped me to realize: I have a similar model for all approaches
… difference is only what we put into the abstract syntax or elsewhere
… adapting antoine's semantics in the different columns: they have the same interpretation!
… the only difference is that properties have standard names like "rdf:subject" ...
… or they do not have specific names, in the triple term solution "spo" are not anymore explicitly named,
… some semantic extensions can force them to be named
… in rdfn proposal the name is baked into the syntax
… differences are important, but they end up the same structure and interpretation

greg: triples are members of graphs
… they are asserted to a particular graph
… semantics is the meaning of a triple
… statement is a statement made by a token of an RDF triple

greg: might be confusing to add other things
… occurrence nomenclature is challenging
… occurrence of a triple is not a number of something
… a triple does not need to be a member of a graph
… trying to stick to: triples, graphs, trying to avoid the use of terms such as occurrence

ora: I was not suggesting to add new terminology

souri: "edge" does not have to be included as a term
… developers just think in that way

souri: we need to introduce the ability to use multiple names for a triple
… so that other people can use it
… that will allow the multiple edge idea
… will be nice to have good sugar syntax in n-triple
… I am not insisting on a new term, only discussing a mental model

thomas: +1 to PA: the seeking consensus table shows that we have a lot of points in common, and PA also pointed out differences that I agree with
… about what Souri etc. said: a triple appears once, but we may want to speak about it multiple times
… asserted vs. unasserted, that gave us the opportunity to speak about unasserted
… in the CG we said: we want to speak about assertions without asserting them
… so we need a name for it
… we had discussion about claim and fact
… would be sub classes of rdf:statement
… in the sugar approach, you do not need the type. Could also be type claim or fact.
… we would provide the opportunity to be more precise

<TallTed> { :a | :Ted a :flounder . :a a :fact } ???

ora: suggest a straw poll
… what do people favor?

<pfps> I favour column 2 - sugar+

<tl> sugar

<Souri> +1 to proposals 3 and 4

<AndyS> 3 (reif atom, AZ semantics)

<TallTed> pchampin, AndyS -- I've created a PR on the MD for the table -- w3c/rdf-star-wg#110

<doerthe> 3 (then in that order 1,2,4)

<Dominik_T> sugar+

<olaf> 3

<gkellogg> triple-term and sugar+

<pchampin> 3. triple-terms / descriptor

abstain

<ora> 3, with 2 as my second choice

<niklasl> My current most favoured to least: 3, 1, 2, 4

<AZ> perhaps 1

<niklasl> (I think 3 unstars to 2 and 4 to 1 (or a mix of 1 and 2))

souri: rdf 1.2 has to support minimum base, applications decide what to do
… we need to be able to attach a name to a triple
… then you can say "it is a claim"
… applications can have many things, e.g. use a standard vocabulary for that

thomas: agree

adrian: the triple terms proposal could be called a "native rdf-star" implementation
… would that be correct?

<Zakim> gkellogg, you wanted to discuss impact of conformance level

gregg: we have some notion of conformance level
… 2 might be an entailment of 3
… systems that do not support full conformance would output entailed version
… is also needed for canonicalization
… thinking of these as pairs might be useful

ora: like that idea. Do you see 3 as an optimized version of 2?

gregg: yes

<pfps> I don't see 3 as any version, optimized or not, of 2

andy: we discussed translation in CG
… we could say "the native impl. can be translated into RDF 1.1. graph"
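
A sketch of what such a translation could look like, assuming the toy encoding of a triple term as a nested tuple and assuming the target is the standard reification properties `rdf:subject`, `rdf:predicate` and `rdf:object`. The `unstar` function is hypothetical, handles only one level of nesting, and does not reflect any mapping the group agreed on.

```python
def unstar(graph):
    """Rewrite naming triples into rdf:subject / rdf:predicate / rdf:object."""
    out = set()
    for s, p, o in graph:
        if p == "rdf:nameOf" and isinstance(o, tuple):
            # Replace the triple term by three plain triples about the name.
            ts, tp, to = o
            out.add((s, "rdf:subject", ts))
            out.add((s, "rdf:predicate", tp))
            out.add((s, "rdf:object", to))
        else:
            out.add((s, p, o))
    return out

native = {
    (":e", "rdf:nameOf", (":s", ":p", ":o")),
    (":e", ":pp", ":oo"),
}
plain = unstar(native)
```

The result contains no triple terms, so it fits the RDF 1.1 data model; whether the reverse direction is always well defined is the round-tripping question raised next.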

gregg: rdf xml token stuck to wellformedness constraints could be roundtripped

andy: not sure
… one could have two different graphs with the same base
… for a standalone rdf xml file that would be possible

<pchampin> pfps, az-RDF-Reification semantics does create a link between 2 and 3

pfps: 3-4 change RDF data model
… 1-2 do not
… 3 adds triple terms, a recursive thing
… 4 adds edges

ora: so triple terms could not be identifiers?

pfps: you could, but that then is 2, not 3
… the abstract syntax is different, model theory is different

thomas: we need syntactic sugar, also in sparql
… that is the user facing part
… the implementation can be done in triple terms, in named graphs, or via standard reification
… wellformedness gives you a guarantee that you can work with optimized implementation
… the four approaches are different ways to specify this
… so why do we need to have the term in the abstract syntax, the model
… if all is just encoded in the syntax

souri: 3-4 is introducing s.t. different
… question is: what will go to abstract syntax
… where do we introduce complexity?
… at the abstract syntax level we do not need this
… as long as the surface syntax has this, it will be fine
… once you get to lower level, there will be a lot of verbosity
… let's keep the concise syntax as much as possible

niklasl: hard to access recursion from the abstract syntax
… not sure if it is valuable to be able to encode a recursive silhouette of data sets

andy: rules are recursive, but the output are trees

<Souri> +1 to Andy's comment about implications on SPARQL behavior

pa: difference boils down to how much constraints do we want to put into abstract syntax
… the table explicitly mentions the abstract syntax
… idea of wellformedness is: keep abstract syntax as is
… have weaker constraints in addition
… the further you go to the right in the table, the more the constraints are baked into abstract syntax
… in triple terms, it is impossible to remove the triple from the triple term and fits into abstract syntax
… adding this means: lose some flexibility
… but it provides some guarantees that help esp. sparql implementers
… as we saw in the CG and RDF star implementations
… the 4th is more change to abstraction than 3
… that is why I prefer 3

adrian: agree with souri on being concise, that is why I like 3 more than 1 and 2
… agree with what andy said about sparql
… with n-triples, I still do not understand what to do, serializing it and back
… we use lists in turtle etc. and it works
… but you cannot re-build the same structure in turtle
… this is what worries me with 1 and 2
… if we do not find that, I have an issue
… lists suck
… if we can solve that, I am ok
… we have discussed this several times, but that is the elephant in the room in my view
… triple terms provide that in an abstract way

souri: 4 is not a stronger change than 3
… "e" is saying: there is one thing for association
… about conversion: that is very important
… if we introduce re-ification atom like things
… what happens if there is a mixture?
… should s.t. compact never be expanded?
… we can keep the sparql simpler, otherwise it can be complex
… let's try to avoid complexity

<pchampin> in simple entailment, I don't expect any "automatic conversion" between the two

<pchampin> but AZ demonstrated that this can be achieved via a semantic extension

thomas: never thought about back conversion
… with the reification based proposal, I only need to change the syntax parser

<Zakim> gkellogg, you wanted to note that a TRIPLE-TERM in the abstract syntax, without rdf:nameOf cannot be expressed in any concrete syntax

thomas: what do we gain if we put the change into abstract syntax?

gregg: concrete syntax is restricted to tokenized version of a claim
… about lists, we may want to fix them

<pchampin> we would need concrete syntax for "raw" triple-terms/descriptors, if only for N-Triples

ora: challenge is: find consensus between 2-3
… those get a lot of support
… I could be convinced of 2, although I had thought of 3
… can continue discussion in semantics call
… think it is between 2 and 3
… adjourned, see many of you tomorrow

<TallTed> my belated straw poll answers -- 2, 3, 1, 4 ... roughly

– DRAFT –
01 February 2024
