W3C

– DRAFT –
RDF-star WG biweekly long meeting

15 February 2024

Attendees

Present
AndyS, AZ, doerthe, draggett, eBremer, enrico, fsasaki, gkellogg, gtw, ktk, niklasl, olaf, ora, pchampin, pfps, rubensworks, Souri, TallTed, tl, Tpt
Regrets
-
Chair
ora
Scribe
AZ, pchampin, Tpt

Meeting minutes

the story so far, setting the stage to consider options 1, 2, and 3 of https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.md

ora: the way I see it now is that option 2 seems to have a lot of support, and personally I hesitate between options 2 and 3. Peter might have convinced me to support option 2.

ora: I would like to see if proponents of options 1 and 3 can be convinced to support option 2 to reach consensus

ora: I would like to use the first half of this meeting for that

ora: and maybe vote in the second hour

pfps: I think option 1 has no uniqueness requirement, but some people assume it has

pfps: uniqueness requirement: there can only be a single subject/predicate/object for each rdf:Statement

enrico: The fact is that uniqueness in option 1 implies opacity.

<pfps> that is, in option 2 <<e | s p o>> and <<f | s p o>> produce only one reification node

enrico: There is too much implicit understanding. To me, options 1 and 2 are an assembly-like way of building things, but not every assembly makes sense, so we need some notion of well-formedness

enrico: That's why I prefer option 3

ora: Question 1: how strongly are you in favor of option 3, or is there a way to move you from the option 3 camp to the option 2 camp?

enrico: yes, the option 2 camp allows a lot of freedom, so we would need to write a big best practices section to explain well-formedness

enrico: it requires a lot of explanations

enrico: option 3 is self-explanatory/error-free

ora: Question 2: if we have these two variants of option 2, can we identify pros and cons of these variants?

tl: If we go for formalizing occurrences, does it have to go into the core of RDF, or can we get by with the concrete syntaxes we defined?

tl: From this perspective I am for a semantic extension and not for extending the core

tl: I highlighted in my email that some extension of option 1 can bring a lot of features

tl: we decided that the use cases are about occurrences and we extended the syntax to talk about occurrences, but the types came back again.

tl: thinking about it, we are missing an equivalent of option 1 as an extension to the model.

tl: we don't need the intermediate blank node in option 2, it adds complexity

pchampin: one thing: well-formedness is not a semantic extension, it has to do with syntax

pchampin: my main argument against option 1 is Peter's wording of "Frankenstein reification"

<pfps> My worry about Option 1 is summed up in the problems with the seminal example. That these problems were in the motivation for RDF* and that they lasted for so long is, to me, strong evidence that any solution should make it hard to create this sort of bad modelling.

pchampin: the benefit of extending the abstract syntax is to have something stronger than "well-formedness"

pchampin: My understanding is that this trade-off is not good enough for all parties

pchampin: if I have to pick a side, I prefer to extend the abstract syntax

pchampin: last point: I have a proposal: if we go for option 2, I have some ideas about ensuring the unicity constraint in a practical way that might help with well-formedness

doerthe: I fear there are 3 versions of option 2: "as a syntactic thing", "with well-formedness" and "with well-formedness and unique blank nodes"

doerthe: my question: why is well-formedness so important?

AndyS: I prefer option 3 because it makes a decision on what well-formedness is and removes the need for a graph algorithm to make sense of the data

AndyS: We should reflect on why RDF reification is not a success

enrico: Frankenstein example: as soon as we do expansion we don't know what the original s/p/o are

enrico: This is not backward compatible because upgrading to RDF 1.2 means they suddenly have to comply with RDF 1.2 constraints

<AndyS> Problem with Turtle predicate object lists : non-unique reification https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Feb/0005.html ("Case: Turtle predicate-object lists")

enrico: option 1: if people only use the macro then everything is ok, it is fully compatible

<pchampin> enrico makes a good point: with option 1, people CAN create ill-formed graphs by using ONLY the syntactic sugar (the "frankenstein reification" case)

enrico: I believe we need some sort of unicity constraint, otherwise I can not give any meaning to the blank node.

enrico: you can write things that make no sense and don't map to any meaningful reification. We need to rule out these things or at least note them in the best practices

<Zakim> gkellogg, you wanted to discuss a way a parser might deal with the unlean graph issue with option 2

enrico: option 3 does not have these problems, which is why I slightly prefer option 3

gkellogg: about option 2 and parsers: the problem can be mitigated by collapsing blank nodes
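
A minimal sketch of the kind of mitigation gkellogg mentions, assuming the option 2 expansion has the shape quoted elsewhere in these minutes (`:e rdf:nameOf _:b . _:b rdf:subject :s . ...`). The helper name `expand_option2`, the tuple encoding of triples and the `_:tN` blank node labels are illustrative assumptions, not anything the WG has specified.

```python
def expand_option2(annotations):
    """Expand (name, s, p, o) annotations into plain triples.

    Hypothetical helper: one blank node is allocated per distinct (s, p, o),
    so repeated annotations of the same triple share one reification node.
    """
    bnode_for = {}  # (s, p, o) -> blank node label
    triples = []
    for name, s, p, o in annotations:
        key = (s, p, o)
        if key not in bnode_for:
            bnode = f"_:t{len(bnode_for)}"
            bnode_for[key] = bnode
            triples.append((bnode, "rdf:subject", s))
            triples.append((bnode, "rdf:predicate", p))
            triples.append((bnode, "rdf:object", o))
        triples.append((name, "rdf:nameOf", bnode_for[key]))
    return triples


# << :e | :s :p :o >> and << :f | :s :p :o >> share a single reification node.
graph = expand_option2([(":e", ":s", ":p", ":o"), (":f", ":s", ":p", ":o")])
reification_nodes = {s for s, p, _ in graph if p == "rdf:subject"}
```

With this input the graph holds five triples rather than eight, which is the "only one reification node" behaviour pfps describes for option 2.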

<Zakim> pfps, you wanted to talk about triple size and optimizing querying

pfps: About the number of extra triples: I don't think it's a valid concern: either there won't be many quoted triples, or there will be a lot and SPARQL implementations will add specific structures to manage them at any reasonable speed

<pfps> as far as I have seen in RDF/SPARQL implementations, what consumes space is not the triples themselves but the supporting structures to allow fast querying. If there are a lot of quoted triples then quick querying will have to index the quoted triples somehow, probably ending up with about the same amount of storage.

pfps: ... and if there are a few triples it does not matter

tl: It's an advantage if implementations can use triples because they can still work with it. Number of triples is not something we should focus on.

tl: I expect people to work with the annotation syntax and it's what is important to me

tl: we have to deal with the existing reification syntax anyway

AndyS: On triple counts: what is a strength of RDF is that SPARQL is properly defined. Counts are important

AndyS: we need to define RDF-merge. This is the ability to take 2 graphs together without worrying. If we merge two valid graphs and can get an invalid one, we are missing something

AndyS: 2 and 3 are semantically equivalent because they use the same semantics

ora: I find that graph merging is the thing that moves customers to use RDF and not property graphs

<enrico> :e rdf:subject :s1 . :e rdf:predicate :p1 . :e rdf:object :o1 .

enrico: About the Frankenstein example:

<enrico> :e rdf:subject :s2 . :e rdf:predicate :p2 . :e rdf:object :o2 .

<enrico> :e rdf:nameOf _:b1 . _:b1 rdf:subject :s1 . _:b1 rdf:predicate :p1 . _:b1 rdf:object :o1 .

<enrico> :e rdf:nameOf _:b2 . _:b2 rdf:subject :s2 . _:b2 rdf:predicate :p2 . _:b2 rdf:object :o2 .

enrico: Take these 6 triples: you can mix them up and can't go back to work out what the first triple was
… As opposed to option 2, where you keep the two triples

enrico: In option 1 you mix all the subjects/predicates/objects of the same occurrence of the reification.
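
A small sketch of enrico's point, using his ten triples encoded as Python tuples. The helper `candidate_triples` is hypothetical and only enumerates which (s, p, o) combinations a node could be describing; it is not a proposed algorithm.

```python
from itertools import product


def candidate_triples(graph, node):
    """List every (s, p, o) that `node` could be describing."""
    def values(prop):
        return sorted(o for s, p, o in graph if s == node and p == prop)

    return list(product(values("rdf:subject"),
                        values("rdf:predicate"),
                        values("rdf:object")))


# Option 1: both descriptions hang directly off :e.
option1 = [
    (":e", "rdf:subject", ":s1"), (":e", "rdf:predicate", ":p1"), (":e", "rdf:object", ":o1"),
    (":e", "rdf:subject", ":s2"), (":e", "rdf:predicate", ":p2"), (":e", "rdf:object", ":o2"),
]

# Option 2: each description keeps its own intermediate blank node.
option2 = [
    (":e", "rdf:nameOf", "_:b1"),
    ("_:b1", "rdf:subject", ":s1"), ("_:b1", "rdf:predicate", ":p1"), ("_:b1", "rdf:object", ":o1"),
    (":e", "rdf:nameOf", "_:b2"),
    ("_:b2", "rdf:subject", ":s2"), ("_:b2", "rdf:predicate", ":p2"), ("_:b2", "rdf:object", ":o2"),
]

mixed = candidate_triples(option1, ":e")       # 2 x 2 x 2 = 8 combinations
first = candidate_triples(option2, "_:b1")     # exactly one triple
second = candidate_triples(option2, "_:b2")    # exactly one triple
```

In the option 1 encoding the two original triples are only two of eight equally plausible readings, while the option 2 encoding recovers each one unambiguously.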

<tl> w.r.t. merging: last Friday the discussion seemed to show that in option 2 the reification name (blank node) needs different merging rules, i.e. it's not a normal blank node, and no standard merge/union rules apply

<tl> w.r.t. enrico's point: there's always a way to mess up compound triple structures.

pchampin: my problem with Frankenstein reification: the syntactic sugar does not prevent creating invalid graphs

pchampin: This does not play well for option 1

pchampin: the syntactic sugar can be defined: whenever the 3 terms in the edge are bound (IRIs or literals), we can generate a URI via some rules.

pchampin: every parser would generate the same IRI, the same node

pchampin: whenever there is a blank node, parsers are supposed to generate the exact same blank node

pchampin: so, I believe this would restrict the proliferation of blank nodes and would merge as one would expect and generate the same numbers in SPARQL

pchampin: I am not a big fan, but it serves some purposes
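
A rough sketch of how pchampin's idea could look, under loudly stated assumptions: the `urn:example:triple:` scheme, the SHA-256 hashing rule and the helper name `edge_node` are all invented here for illustration, since the minutes only say "via some rules". Terms starting with `_:` are treated as blank nodes.

```python
import hashlib


def edge_node(s, p, o, bnode_cache):
    """Return a deterministic node for the triple (s, p, o).

    Ground triples (no blank nodes) get an IRI derived from their terms, so
    every parser applying the same rule produces the same node. Otherwise a
    blank node is reused per distinct triple within one parse (bnode_cache).
    """
    if not any(term.startswith("_:") for term in (s, p, o)):
        # "\x1f" (unit separator) is an arbitrary delimiter chosen for this sketch.
        digest = hashlib.sha256("\x1f".join((s, p, o)).encode("utf-8")).hexdigest()
        return f"urn:example:triple:{digest}"
    return bnode_cache.setdefault((s, p, o), f"_:t{len(bnode_cache)}")


# Two independent "parsers" (separate caches) agree on the node for a ground triple.
node_a = edge_node(":s", ":p", ":o", {})
node_b = edge_node(":s", ":p", ":o", {})
```

Because `node_a` and `node_b` are equal, merging the two parsers' outputs would not multiply intermediate nodes, which is the effect on merging and SPARQL counts that pchampin describes.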

doerthe: to come back to enrico's example, it's not bad if we imply that s1 = s2, p1 = p2 and o1 = o2

doerthe: I think the syntactic conditions somehow clash with semantic conditions

enrico: If you think in terms of the semantic web stack, well-formedness is based on unicity, and unicity on the grounds of what?

<pchampin> doerthe, the fact that semantics "interferes" with well-formed-ness is indeed one of the concerns that I now have with well-formed-ness

enrico: when merging with a language with equality, pchampin's trick would not work anymore

<pchampin> enrico: my trick would work: it concerns the intermediate node, not the subject of rdf:nameOf

enrico: because sameAs elements in triples would lead to different triples

<pchampin> but I still prefer option 3, mind you :)

AndyS: Adding a deterministic URI would get round some issues on merge, but at a cost: we can't get back to the subject/predicate/object

tl: option 3 is supposed to have a mapping to triples, all problems with option 2 are also in option 3

tl: is someone arguing we don't need a mapping to legacy triples?

tl: I can make strange cases about option 2 because there is another node and people can do a lot of things with it

tl: Option 1 is reduced to the core, the syntactic sugar: << e | s p o >>

tl: everything else is for people to define

tl: this is certainly enough to model property graphs

ora: about option 1: we have syntactic sugar that gives us a graph that is well-formed. If you go outside of it, bad things could happen. Let's ignore that and talk only about things that are well-formed

AndyS: to tl's option 3 vs option 2. If we pick option 3 we would define a mapping to RDF 1.1 as an entailment regime

AndyS: ... subject to what happens on merge

AndyS: SPARQL would define what the counts are and answer sets.

pchampin: The point raised by doerthe is very valid: under some entailment regime a well-formed graph may entail a graph that is not well-formed

pchampin: saying that all bets are off if you do some entailments on your nice graph is concerning

pchampin: this makes things trickier

enrico: about mapping option 3 into option 2: we can propose an implementation of option 3 by expanding to option 2 constructs, like the CG report does

<doerthe> <<:e|:s :p :o>> :b :c. :s owl:sameAs :s2. entails: :e :nameOf _:x. _:x :subject :s, :s2; :predicate :p; :object :o. The latter is ill-formed.

enrico: this mapping would only work in RDF. If you start adding equality and stuff this does not work anymore
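
A toy illustration of doerthe's example, assuming a deliberately naive reading of owl:sameAs (one substitution step, object position only, one direction). The helper names are hypothetical, and the `rdf:` prefixes follow the other examples in these minutes rather than doerthe's shorthand.

```python
def add_same_as_objects(graph):
    """One naive entailment step: if `a owl:sameAs b`, every triple that has
    a as its object is also added with b as its object."""
    pairs = [(s, o) for s, p, o in graph if p == "owl:sameAs"]
    entailed = set(graph)
    for a, b in pairs:
        for s, p, o in graph:
            if o == a and p != "owl:sameAs":
                entailed.add((s, p, b))
    return entailed


def subjects_of(graph, node):
    """All rdf:subject values attached to `node`."""
    return sorted(o for s, p, o in graph if s == node and p == "rdf:subject")


# Expansion of  <<:e | :s :p :o>> :b :c .  plus  :s owl:sameAs :s2 .
graph = {
    (":e", "rdf:nameOf", "_:x"),
    ("_:x", "rdf:subject", ":s"),
    ("_:x", "rdf:predicate", ":p"),
    ("_:x", "rdf:object", ":o"),
    (":e", ":b", ":c"),
    (":s", "owl:sameAs", ":s2"),
}

before = subjects_of(graph, "_:x")
after = subjects_of(add_same_as_objects(graph), "_:x")
```

The input graph gives `_:x` a single rdf:subject, but the entailed graph gives it two, so a uniqueness-based well-formedness condition holds before the entailment step and fails after it.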

ora: what is the option that everybody prefers?

ora: let's vote

<enrico> 3

<eBremer> 3

<ktk> 3

<AZ> still option 1

<pchampin> 3

<olaf> 3

ora: put the number you prefer

<gkellogg> Prefer option 3, can live with option 2

<ora> 3

<pfps> 2 but can live with 3

<AndyS> option 3, with mapping for RDF 1.1

<gtw> 1 or 3

<rubensworks> 3, but could live with 2

<tl> 1

<TallTed> 3 > 2 > 1 > 4 ... still

<niklasl> Slightly in favour of 3 over 2, since it supports many-to-many, and merges well. Needs mapping to 2 for RDF 1.1 support (ideally with PA:s gen-IRI-thing)

<doerthe> 3 (2 without well-formedness would maybe work for me as well)

<fsasaki> can live with 2 and 3

<Souri> choice 3 (also I like the auto-naming idea if I understood it correctly)

3

ktk: I get tl's argument about not wanting to touch RDF 1.1 but still having syntactic sugar

ktk: I get AndyS on the safety

ora: I see considerable support for option 3 now. I would like to see if we can reach a consensus

<AZ> Alan_Snyder: interested in using RDF with AI systems and also use ontologies, in relation with cryptocurrencies

… I had criticism towards RDF related to the complexities
… but there are RDF success stories
… and RDF-star can fill a gap for property graphs
… I am located in Connecticut

<pfps> does NYC still have a heart?

<Alan_Snyder> oh cool - where abouts in CT AZ?

ora: I see a lot of support for option 3
… but what about supporters of option 1

<Alan_Snyder> haha -- sorry just realized AZ is transcribing :)

tl: I think we only need syntactic sugar with the existing stuff
… we don't need the extra things in the model, just in the concrete syntax

<doerthe> thomas, I think I could also live with option 1 without well-formedness, I am just really against well-formedness in its current form

AZ: I have not followed all the email discussions in the past few weeks, a lot of text to read
… maybe I don't understand everything
… In the 'seeking consensus' table, I'm not sure what the <( ...)> notation really means
… I'm not certain that there is a consensus that my semantics, referenced by the table, is agreed on
… Option 3 introduces something new which has not been tested or implemented.
… Similar experience with RDF 1.1; inventions of the WG that were not implemented before do not work well.
… My problem with option 2 is: what is the meaning of 'rdf:nameOf'? Why not another predicate?
… Probably the connection between the name (:e in the table) and the triple depends on everybody's interpretation.
… The name rdf:nameOf seems to imply that it *identifies* the triple, so why not just identify the triple.
… I see many reasons for people not to be satisfied by this.

<niklasl> +1 for nameOf implying identification ("same as")

AZ: I think there is a way of doing option 2 with something like option 1;
… if we stick with the original concrete syntax << :s :p :o >> (without the ":e |" part)
… we can still put the ":e" outside of the << .. >> .

ora: how unhappy would you be with option 3?

AZ: I like ktk's proposal of providing a way to consider this as pure syntactic sugar.
… This would be consistent with the idea of having RDF basic and RDF full.
… You could have RDF basic + syntactic sugar. Maybe that would be an acceptable middle ground for me.

ktk: AZ, what do you mean by "it has not been tested"?
… Triple terms were implemented by some people, even if others waited.

AZ: implementations that we have seen implemented the CG proposal, which was not option 3.

tl: implementations of RDF-star preceded the CG report
… I would argue that implementations are closer to option 1 than anything

pchampin: yes, implementations are according to the CG proposal
… CG-conformant implementations are very close to what option 3 says

<pchampin> +1 to have a basic+sugar profile

ora: it seems that option 3 is really the winner
… there hasn't been this much agreement so far

enrico: I can write a document that's a variant of what Andy proposed
… with a better formalisation and we can discuss it

ora: let us have a tentative vote

pchampin: responding to AZ
… the semantics is still to be discussed but this is on agreeing on the abstract syntax
… and related to rdf:nameOf, I agree that it is not a good name
… it has to be a very generic relation

ora: pchampin can you find a wording for this vote

pchampin: ok

niklasl: option 3 solves my problems

<pchampin> proposed STRAWPOLL: the WG will pursue with the abstract syntax proposed in option 3, considering 'rdf:nameOf' as a working title

niklasl: it satisfies my needs

<tl> would a basic profile be just with the syntactic sugar? ergo (close to) option 1?

<niklasl> I interpret basic profile to now mean option 2 (possibly with generated IRI:s for the edge nodes)?

ktk: we have to consider that option 3 is not fully designed, this is a vote to go forward

<ora> STRAWPOLL: the WG will pursue with the abstract syntax proposed in option 3, considering 'rdf:nameOf' as a working title

<ktk> tl: that was my idea at least

<gkellogg> +1

<pchampin> +1

<ktk> +1

<ora> +1

<enrico> +1

<olaf> +1

<niklasl> +1

<doerthe> +1

<Souri> +1

<AndyS> +1

<eBremer> +1

<TallTed> +1

<Alan_Snyder> +1

<Tpt> +1

<tl> -1

<rubensworks> +1

<gtw> +0 -- tentatively in support, but have some questions on option 3 I still need to work through

<pfps> >+1

my vote, not as scribe, is that we can go with option 3 given the quasi consensus, even if I am not really satisfied with it

<fsasaki> +1

ora: enrico, can you estimate how long it will take to have a revised proposal ready

gkellogg: we have a nomenclature issue because the one we used before is obsolete
… we need a notion of a triple descriptor
… the "<< ... >>" does not appear in the abstract syntax
… we need to figure out how these descriptors can be used within the abstract syntax

<Souri> ice-cream?

<niklasl> ... replace rdf:nameOf with ... rdf:triple ?

enrico: does the syntax with "nameOf" only appear with this construction
… so that there is a syntactic restriction, or not?

pchampin: (in response to enrico) I would not make it illegal
… (re. nomenclature) I like "triple term" but don't have a strong opinion
… we also should discuss the name of the "nameOf" relation
… in favour of "rdf:triple"
… also, should we allow nested "triple terms"

<Souri> +1 to no nesting

<niklasl> rdf:triple rdfs:range rdf:Triple . :D

AndyS: I was assuming we would define a range for rdf:nameOf

<pchampin> +1 to define its range

<Zakim> niklasl, you wanted to comment on nested triple terms

niklasl: about nested triple terms, I would not like to allow them

<pchampin> you could still talk about their "names", which is what most people would need

niklasl: nested triple terms is a rabbit hole

<TallTed> Forbidding nesting means we can no longer use RDF to describe *anything* that's named/identified, which feels problematic. Forbidding loops feels less problematic.

<enrico> Nesting is useful: :john :believes << :s1 | << [] | :liz :spouse :richard >> :starts 1964 >> .

<pchampin> I can definitely live with nesting, but I know that it worries some people

gkellogg: a triple descriptor can refer to an occurrence

<niklasl> +100 to not regularly *use* nested triple terms.

gkellogg: different triple descriptors may describe the same triple differently
… not nesting simplifies comparison a lot

enrico: why should we make this restriction, because it reduces expressiveness a lot
… if you want to make a statement about a statement
… you may want to model an n-ary relation using triple terms
… and then make an event in relation to the n-ary relation

<Zakim> gtw, you wanted to respond to gkellogg's take on non-nested descriptors

enrico: it is not too complicated to have a recursive structure

<pchampin> enrico, you can still do that, as gkellogg suggests:

<pchampin> :john :believes << :b1 | :s :p :o >>.

<pchampin> :marie :knows << :john :believes :b1 >>.

<enrico> :john :believes << :s1 | << [] | :liz :spouse :richard >> :starts 1964 >> . :s1 :certified-by :us-census .

<enrico> :paul :believes << :s2 | << [] | :liz :spouse :richard >> :starts 1965 >> .

gtw: it is natural to be able to describe anything about anything including statements about statements about statements
… but there are practical issues

<pchampin> which could be also written:

<pchampin> :marie :knows << :john :believes << :b1 | :s :p :o >> >> .

<AndyS> +1 to TallTed : occurrence mean unlikely to be very common. Advantage of named occurrences and separate triple /term/descriptors

gtw: [not sure what to write]

niklasl: what enrico wants is possible without nested terms
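
One way to picture the "use names instead of nesting" suggestion is the sketch below. It assumes a made-up encoding in which a triple term is a tuple `(name, s, p, o)` whose subject or object may itself be such a tuple; `flatten` and the `_:nN` labels are hypothetical. It only rewrites terms and says nothing about which triples are asserted.

```python
from itertools import count


def flatten(term, out, fresh):
    """Replace nested triple terms by their names.

    Appends the resulting non-nested (name, s, p, o) descriptors to `out`
    and returns the name standing for `term`. Unnamed terms (name is None)
    get a fresh blank node label.
    """
    name, s, p, o = term
    if isinstance(s, tuple):
        s = flatten(s, out, fresh)
    if isinstance(o, tuple):
        o = flatten(o, out, fresh)
    if name is None:
        name = f"_:n{next(fresh)}"
    out.append((name, s, p, o))
    return name


# << :john :believes << :b1 | :s :p :o >> >>
nested = (None, ":john", ":believes", (":b1", ":s", ":p", ":o"))
flat = []
top = flatten(nested, flat, count())
```

Here `flat` ends up with two non-nested descriptors, the inner one named `:b1` and the outer one referring to `:b1` in object position, mirroring the rewriting pchampin gives in the minutes.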

ora: this needs to be discussed further, many questions to be answered still
… you can model lots of things with nested terms but query writing becomes difficult

gkellogg: triple descriptors exist only because we want to talk about a triple occurrence

<pchampin> actually, this makes it impossible to talk about the rdf:nameOf triples...

gkellogg: if the occurrence is named then it can be used, even in a triple descriptor

<Souri> +1 to use of occurrence names inside a triple-desc to express nesting when needed

gkellogg: use of triple occurrence should be discouraged in the concrete syntax, but must be in the abstract syntax

pchampin: preventing the nesting makes it impossible to talk about the "nameOf" triples

<AndyS> SELECT ?x { :e rdf:nameOf ?x }

<AndyS> SELECT ?e ?x { ?e rdf:nameOf ?x }

AndyS: triple descriptors might not be generally used in RDF, but they will appear as results in SPARQL queries (examples above)
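
To make AndyS's point concrete, here is a toy stand-in for the two queries above, assuming that under option 3 the object of rdf:nameOf is a triple term, encoded here as a nested Python tuple. This is not a SPARQL engine, and `select_name_of` is a hypothetical helper.

```python
def select_name_of(graph, name=None):
    """Mimic `SELECT ?e ?x { ?e rdf:nameOf ?x }`; pass `name` to fix ?e."""
    return [(s, o) for s, p, o in graph
            if p == "rdf:nameOf" and name in (None, s)]


graph = [
    (":e", "rdf:nameOf", (":s", ":p", ":o")),  # object is a triple term
    (":e", ":b", ":c"),
]

rows = select_name_of(graph)          # ?e and ?x unbound
for_e = select_name_of(graph, ":e")   # ?e fixed to :e
```

In both calls the value bound to `?x` is the triple term itself, so a result format has to be able to carry such terms even if authors rarely write them directly.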

Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

All speakers: AndyS, AZ, doerthe, enrico, gkellogg, gtw, ktk, niklasl, ora, pchampin, pfps, tl

Active on IRC: Alan_Snyder, AndyS, AZ, doerthe, draggett, eBremer, enrico, fsasaki, gkellogg, gtw, ktk, niklasl, olaf, ora, pchampin, pfps, rubensworks, Souri, TallTed, tl, Tpt