RDF-star WG biweekly long meeting

Meeting minutes

the story so far, setting the stage to consider options 1, 2, and 3 of https://github.com/w3c/rdf-star-wg/blob/main/docs/seeking-consensus-2024-01.md

ora: the way I see it now is that option 2 seems to have a lot of support, and personally I hesitate between options 2 and 3. Peter might have convinced me to support option 2.

ora: I would like to see if proponents of options 1 and 3 can be convinced to support option 2 to reach consensus

ora: I would like to use the first half of this meeting for that

ora: and maybe vote in the second hour

pfps: I think option 1 has no uniqueness requirements, but some people who support it think it has

pfps: uniqueness requirement: there can only be a single subject/predicate/object for each rdf:Statement

enrico: The fact is that uniqueness in option 1 implies opacity.

<pfps> that is, in option 2 <<e | s p o>> and <<f | s p o>> produce only one reification node

enrico: There is too much implicit understanding. To me, options 1 and 2 are an assembly-like way of building things, but not every assembly makes sense, so we need some form of well-formedness

enrico: That's why I prefer option 3

ora: Question 1: how strongly are you in favor of option 3, or is there a way to move you from the 3 camp to the 2 camp?

enrico: yes, the 2 camp allows a lot of freedom; we need to write a big best-practices section to explain well-formedness

enrico: it requires a lot of explanations

enrico: option 3 is self-explanatory/error-free

ora: Question 2: if we have these two variants of option 2, can we identify pros and cons of these variants?

tl: If we go for formalizing occurrences, does it have to go into the core of RDF, or can we get by with the concrete syntaxes we defined?

tl: From this perspective I am for a semantic extension rather than extending the core

tl: I highlighted in my email that some extension of option 1 can bring a lot of features

tl: we decided that the use cases are about occurrences and we extended the syntax to talk about occurrences, but the types came back again.

tl: thinking about it, we are missing an equivalent of option 1 as an extension to the model.

tl: we don't need the intermediate blank node in option 2, it adds complexity

pchampin: one thing: well-formedness is not a semantic extension, it has to do with syntax

pchampin: my main argument against option 1 is peter's wording of "Frankenstein reification"

<pfps> My worry about Option 1 is summed up in the problems with the seminal example. That these problems were in the motivation for RDF* and that they lasted for so long is, to me, strong evidence that any solution should make it hard to create this sort of bad modelling.

pchampin: the benefit of extending the abstract syntax is to have something stronger than "well-formedness"

pchampin: My understanding is that this trade-off is not good enough for all parties

pchampin: if I have to pick a side, I prefer to extend the abstract syntax

pchampin: last point: I have a proposal: if we go for option 2, I have some ideas about ensuring the unicity constraint in a practical way that might help with well-formedness

doerthe: I fear there are 3 versions of option 2: "as a syntactic thing", "with well-formedness" and "with well-formedness and unique blank nodes"

doerthe: my question: why is well-formedness so important?

AndyS: I prefer option 3 because it makes a decision on what well-formedness is and allows to prevent some graph algorithm to make sense of the data

AndyS: We should reflect on why RDF reification is not a success

enrico: Frankenstein example: as soon as we do expansion we don't know what the original s/p/o are

enrico: This is not backward compatible because upgrading to RDF 1.2 means they suddenly have to comply with RDF 1.2 constraints

<AndyS> Problem with Turtle predicate object lists : non-unique reification https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Feb/0005.html ("Case: Turtle predicate-object lists")

enrico: option 1: if people only use the macro then everything is ok, it is fully compatible

<pchampin> enrico makes a good point: with option 1, people CAN create ill-formed graphs by using ONLY the syntactic sugar (the "frankenstein reification" case)

enrico: I believe we need some sort of unicity constraint, otherwise I cannot give any meaning to the blank node.

enrico: you can write things that make no sense and don't map to any meaningful reification. We need to rule out these things or at least note them in the best practices

<Zakim> gkellogg, you wanted to discuss a way a parser might deal with the unlean graph issue with option 2

enrico: option 3 does not have these problems, which is why I slightly prefer option 3

gkellogg: about option 2 and parsers: the problem can be mitigated by collapsing blank nodes

<Zakim> pfps, you wanted to talk about triple size and optimizing querying

pfps: About the number of extra triples: I don't think it's a valid concern: either there won't be many quoted triples, or there will be a lot and SPARQL implementations will add specific structures to manage them at any reasonable speed

<pfps> as far as I have seen in RDF/SPARQL implementations, what consumes space is not the triples themselves but the supporting structures to allow fast querying. If there are a lot of quoted triples then quick querying will have to index the quoted triples somehow, probably ending up with about the same amount of storage.

pfps: ... and if there are a few triples it does not matter

tl: It's an advantage if implementations can use triples because they can still work with it. Number of triples is not something we should focus on.

tl: I expect people to work with the annotation syntax and it's what is important to me

tl: we have to deal with the existing reification syntax anyway

AndyS: On triple counts: what is a strength of RDF is that SPARQL is properly defined. Counts are important

AndyS: we need to define RDF-merge. This is the ability to take 2 graphs together without worrying. If we merge two valid graphs and can get an invalid one, we miss something

AndyS: 2 and 3 are semantically equivalent because they use the same semantics

ora: I find that graph merging is the thing that moves customers to use RDF and not property graphs

<enrico> :e rdf:subject :s1 . :e rdf:predicate :p1 . :e rdf:object :o1 .

enrico: About the Frankenstein example:

<enrico> :e rdf:subject :s2 . :e rdf:predicate :p2 . :e rdf:object :o2 .

<enrico> :e rdf:nameOf _:b1 . _:b1 rdf:subject :s1 . _:b1 rdf:predicate :p1 . _:b1 rdf:object :o1 .

<enrico> :e rdf:nameOf _:b2 . _:b2 rdf:subject :s2 . _:b2 rdf:predicate :p2 . _:b2 rdf:object :o2 .

enrico: Take these 6 triples: you can mix them up and can't go back to work out what the first triple was
… As opposed to option 2, where you keep the two triples

enrico: In option 1 you mix all subjects/predicates/objects of the same occurrence of the reification.
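
[Editorial illustration, not part of the discussion] A minimal Python sketch of the point enrico makes above, assuming triples are modelled as plain `(subject, predicate, object)` string tuples. The helper names `candidates` and `option2_candidates` are hypothetical, and `rdf:nameOf` is only the working title used in the meeting.

```python
from itertools import product

SLOTS = ("rdf:subject", "rdf:predicate", "rdf:object")

def candidates(graph, node):
    """All (s, p, o) combinations that `node` could be reifying."""
    values = [sorted(o for s, p, o in graph if s == node and p == slot)
              for slot in SLOTS]
    return set(product(*values))

def option2_candidates(graph, name):
    """Follow rdf:nameOf to each intermediate node, then read its slots."""
    found = set()
    for s, p, o in graph:
        if s == name and p == "rdf:nameOf":
            found |= candidates(graph, o)
    return found

# Option 1 style: both reifications hang directly off :e (6 triples).
option1 = {
    (":e", "rdf:subject", ":s1"), (":e", "rdf:predicate", ":p1"), (":e", "rdf:object", ":o1"),
    (":e", "rdf:subject", ":s2"), (":e", "rdf:predicate", ":p2"), (":e", "rdf:object", ":o2"),
}

# Option 2 style: each reification keeps its own intermediate blank node.
option2 = {
    (":e", "rdf:nameOf", "_:b1"),
    ("_:b1", "rdf:subject", ":s1"), ("_:b1", "rdf:predicate", ":p1"), ("_:b1", "rdf:object", ":o1"),
    (":e", "rdf:nameOf", "_:b2"),
    ("_:b2", "rdf:subject", ":s2"), ("_:b2", "rdf:predicate", ":p2"), ("_:b2", "rdf:object", ":o2"),
}

print(len(candidates(option1, ":e")))          # 8 possible triples, originals lost
print(sorted(option2_candidates(option2, ":e")))  # exactly the two original triples
```

In this sketch the option 1 graph admits eight candidate triples for `:e`, while the option 2 graph recovers exactly the two that were asserted.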

<tl> w.r.t. merging: last friday the discussion seemed to show that in option 2 the reification name (blank node) needs different merging rules, i.e. it's not a normal blank node, and no standard merge/union rules apply

<tl> w.r.t. enrico's point: there's always a way to mess up compound triple structures.

pchampin: my problem with Frankenstein reification: the syntactic sugar does not prevent creating invalid graphs

pchampin: This does not play well for option 1

pchampin: the syntactic sugar can be defined: whenever the 3 terms in the edge are bound (IRIs or literals), we can generate a URI via some rules.


pchampin: every parser would generate the same IRI, the same node

pchampin: whenever there is a blank node, parsers are supposed to generate the exact same blank node

pchampin: so, I believe this would restrict the proliferation of blank nodes, would merge as one would expect, and would generate the same numbers in SPARQL

pchampin: I am not a big fan, but it serves some purposes
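
[Editorial illustration, not part of the discussion] One way the deterministic naming idea above might look, as a rough Python sketch. No concrete rule was given in the meeting, so everything here is an assumption: the function name `reifier_term`, the `urn:example:reifier:` base, the use of SHA-256, and the treatment of terms as N-Triples-like strings in which blank nodes start with `_:`.

```python
import hashlib

def reifier_term(s, p, o, base="urn:example:reifier:"):
    """Derive the same intermediate node for the same (s, p, o) every time.

    Hypothetical rule: hash the three serialized terms. If all are IRIs or
    literals, return an IRI; if any is a blank node, return a blank node
    label so the result stays scoped to the document being parsed.
    """
    key = "\x1f".join((s, p, o)).encode("utf-8")  # unit separator avoids ambiguity
    digest = hashlib.sha256(key).hexdigest()[:16]
    if any(term.startswith("_:") for term in (s, p, o)):
        return "_:r" + digest
    return base + digest

# Two parsers seeing the same triple produce the same node, so merged
# graphs share one intermediate node instead of two blank nodes.
a = reifier_term("<http://example.org/s>", "<http://example.org/p>", '"o"')
b = reifier_term("<http://example.org/s>", "<http://example.org/p>", '"o"')
print(a == b)  # True
```

Because blank node labels are only meaningful within one document, hashing the label as done here is a simplification; a real rule would need to say how blank nodes are identified.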

doerthe: to come back to enrico's example, it's not bad if we imply that s1 = s2, p1 = p2 and o1 = o2

doerthe: I think the syntactic conditions somehow clash with semantic conditions

enrico: If you think in terms of the semantic web stack, well-formedness is based on unicity, and unicity on the grounds of what?

<pchampin> doerthe, the fact that semantics "interferes" with well-formed-ness is indeed one of the concerns that I now have with well-formed-ness

enrico: when merging with a language with equality, pchampin's trick would not work anymore

<pchampin> enrico: my trick would work: it concerns the intermediate node, not the subject of rdf:nameOf

enrico: because sameAs elements in triples would lead to different triples

<pchampin> but I still prefer option 3, mind you :)

AndyS: Adding a deterministic URI would get round some issues on merge, but compared to the losses, we can't get back to the subject/predicate/object

tl: option 3 is supposed to have a mapping to triples, all problems with option 2 are also in option 3

tl: is someone arguing we don't need a mapping to legacy triples?

tl: I can make strange cases about option 2 because there is another node and people can do a lot of things with it

tl: Option 1 is reduced to the core, the syntactic sugar: << e | s p o >>

tl: everything else is for people to define

tl: this is certainly enough to model property graphs

ora: about option 1: we have syntactic sugar that gives us a graph that is well-formed. If you go outside it, bad things could happen. Let's ignore that and talk only about things that are well-formed

AndyS: to tl's option 3 vs option 2. If we pick option 3 we would define a mapping to RDF 1.1 as an entailment regime

AndyS: ... subject to what happens on merge

AndyS: SPARQL would define what the counts are and answer sets.

pchampin: The point raised by doerthe is very valid: under some entailment regime a well-formed graph may entail a graph that is not well-formed

pchampin: saying that all bets are off if you do some entailments on your nice graph is concerning

pchampin: this makes things trickier

enrico: about mapping option 3 into option 2: we can propose an implementation of option 3 by expanding to option 2 constructs, as in the CG report

<doerthe> <<:e|:s :p :o>> :b :c. :s owl:sameAs :s2. entails: :e :nameOf _:x. _:x :subject :s, :s2; :predicate :p; :object :o. The latter is ill-formed.

enrico: this mapping would only work in RDF. If you start adding equality and stuff this does not work anymore
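
[Editorial illustration, not part of the discussion] doerthe's example above can be traced with a small Python sketch. It assumes triples as plain string tuples, uses `rdf:`-prefixed names where the IRC line abbreviates them, and implements only a deliberately simplified, single-round `owl:sameAs` substitution in object position (not a full OWL reasoner). The function names are hypothetical.

```python
SLOTS = ("rdf:subject", "rdf:predicate", "rdf:object")

def is_well_formed(graph):
    """True if every reification node has exactly one value per slot."""
    per_node = {}
    for s, p, o in graph:
        if p in SLOTS:
            per_node.setdefault(s, {slot: set() for slot in SLOTS})[p].add(o)
    return all(len(values) == 1
               for slots in per_node.values()
               for values in slots.values())

def apply_same_as(graph):
    """One round of owl:sameAs substitution in object position (simplified)."""
    same = {(a, b) for a, p, b in graph if p == "owl:sameAs"}
    same |= {(b, a) for a, b in same}  # owl:sameAs is symmetric
    derived = set(graph)
    for s, p, o in graph:
        for a, b in same:
            if o == a:
                derived.add((s, p, b))
    return derived

# Expansion of <<:e | :s :p :o>> :b :c . plus :s owl:sameAs :s2 .
graph = {
    (":e", "rdf:nameOf", "_:x"),
    ("_:x", "rdf:subject", ":s"),
    ("_:x", "rdf:predicate", ":p"),
    ("_:x", "rdf:object", ":o"),
    (":e", ":b", ":c"),
    (":s", "owl:sameAs", ":s2"),
}

entailed = apply_same_as(graph)
print(is_well_formed(graph))     # True
print(is_well_formed(entailed))  # False: _:x now has two rdf:subject values
```

The input graph passes the check, but the entailed graph gives `_:x` two `rdf:subject` values, which is the ill-formed result described in the example.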

ora: what is the option that everybody prefers?

ora: let's vote

<enrico> 3

<eBremer> 3

<ktk> 3

<AZ> still option 1

<pchampin> 3

<olaf> 3

ora: put the number you prefer

<gkellogg> Prefer option 3, can live with option 2

<ora> 3

<pfps> 2 but can live with 3

<AndyS> option 3, with mapping for RDF 1.1

<gtw> 1 or 3

<rubensworks> 3, but could live with 2

<tl> 1

<TallTed> 3 > 2 > 1 > 4 ... still

<niklasl> Slightly in favour of 3 over 2, since it supports many-to-many, and merges well. Needs mapping to 2 for RDF 1.1 support (ideally with PA:s gen-IRI-thing)

<doerthe> 3 (2 without well-formedness would maybe work for me as well)

<fsasaki> can live with 2 and 3

<Souri> choice 3 (also I like the auto-naming idea if I understood it correctly)

ktk: I get tl's argument about not wanting to touch RDF 1.1 while still having syntactic sugar

ktk: I get AndyS on the safety

ora: I see considerable support for option 3 now. I would like to see if we can reach a consensus

<AZ> Alan_Snyder: interested in using RDF with AI systems and also use ontologies, in relation with cryptocurrencies

... I had criticism towards RDF related to the complexities
… but there are RDF success stories
… and RDF-star can fill a gap for property graphs
… I am located in Connecticut

<pfps> does NYC still have a heart?

<Alan_Snyder> oh cool - where abouts in CT AZ?

ora: I see a lot of support for option 3
… but what about supporters of option 1

<Alan_Snyder> haha -- sorry just realized AZ is transcribing :)

tl: I think we only need syntactic sugar with the existing stuff
… we don't need the extra things in the model, just in the concrete syntax

<doerthe> thomas, I think I could also live with option 1 without well-formedness, I am just really against well-formedness in its current form

AZ: I have not followed all the email discussions in the past few weeks, a lot of text to read
… maybe I don't understand everything
… In the 'seeking consensus' table, I'm not sure what the <( ...)> notation really means
… I'm not certain that there is a consensus that my semantics, referenced by the table, is agreed on
… Option 3 introduces something new which has not been tested or implemented.
… Similar experience with RDF 1.1; inventions of the WG that were not implemented before do not work well.
… My problem with option 2 is: what is the meaning of 'rdf:nameOf'? Why not another predicate?
… Probably the connection between the name (:e in the table) and the triple depends on everybody's interpretation.
… The name rdf:nameOf seems to imply that it *identifies* the triple, so why not just identify the triple.
… I see many reasons for people not to be satisfied by this.

<niklasl> +1 for nameOf implying identification ("same as")

AZ: I think there is a way of doing option 2 with something like option 1 ;
… if we stick with the original concrete syntax << :s :p :o >> (without the ":e |" part)
… we can still put the ":e" outside of the << .. >> .

ora: how unhappy would you be with option 3?

AZ: I like ktk's proposal of providing a way to consider this as pure syntactic sugar.
… This would be consistent with the idea of having RDF basic and RDF full.
… You could have RDF basic + syntactic sugar. May be that would be an acceptable middle ground for me.

ktk: AZ, what do you mean by "it has not been tested"?
… Triple terms were implemented by some people, even if others waited.

AZ: implementations that we have seen implemented the CG proposal, which was not option 3.

tl: implementations of RDF-star preceded the CG report
… I would argue that implementations are closer to option 1 than anything

pchampin: yes, implementations are according to the CG proposal
… CG-conformant implementations are very close to what option 3 says

<pchampin> +1 to have a basic+sugar profile

ora: it seems that option 3 is really the winner
… there hasn't been this much agreement so far

enrico: I can write a document that's a variant of what Andy proposed
… with a better formalisation and we can discuss it

ora: let us have a tentative vote

pchampin: responding to AZ
… the semantics is still to be discussed, but this is about agreeing on the abstract syntax
… and related to rdf:nameOf, I agree that it is not a good name
… it has to be a very generic relation

ora: pchampin can you find a wording for this vote

pchampin: ok

niklasl: option 3 solves my problems

<pchampin> proposed STRAWPOLL: the WG will pursue with the abstract syntax proposed in option 3, considering 'rdf:nameOf' as a working title

niklasl: it satisfies my needs

<tl> would a basic profile be just with the syntactic sugar? ergo (close to) option 1?

<niklasl> I interpret basic profile to now mean option 2 (possibly with generated IRI:s for the edge nodes)?

ktk: we have to consider that option 3 is not fully designed, this is a vote to go forward

<ora> STRAWPOLL: the WG will pursue with the abstract syntax proposed in option 3, considering 'rdf:nameOf' as a working title

<ktk> tl: that was my idea at least

<gkellogg> +1

<pchampin> +1

<ktk> +1

<ora> +1

<enrico> +1

<olaf> +1

<niklasl> +1

<doerthe> +1

<Souri> +1

<AndyS> +1

<eBremer> +1

<TallTed> +1

<Alan_Snyder> +1

<Tpt> +1

<tl> -1

<rubensworks> +1

<gtw> +0 -- tentatively in support, but have some questions on option 3 I still need to work through

<pfps> >+1

my vote, not as scribe, is that we can go with option 3 given the quasi consensus, even if I am not really satisfied with it

<fsasaki> +1

ora: enrico, can you estimate how long it will take to have a revised proposal ready

gkellogg: we have a nomenclature issue because the one we used before is obsolete
… we need a notion of a triple descriptor
… the "<< ... >>" does not appear in the abstract syntax
… we need to figure out how these descriptors can be used within the abstract syntax

<Souri> ice-cream?

<niklasl> ... replace rdf:nameOf with ... rdf:triple ?

enrico: does the syntax with "nameOf" only appear with this construction
… so that there is a syntactic restriction, or not?

pchampin: (in response to enrico) I would not make it illegal
… (re. nomenclature) I like "triple term" but don't have a strong opinion
… we also should discuss the name of the "nameOf" relation
… in favour of "rdf:triple"
… also, should we allow nested "triple terms"

<Souri> +1 to no nesting

<niklasl> rdf:triple rdfs:range rdf:Triple . :D

AndyS: I was assuming we would define a range for rdf:nameOf

<pchampin> +1 to define its range

<Zakim> niklasl, you wanted to comment on nested triple terms

niklasl: about nested triple terms, I would not like to allow them

<pchampin> you could still talk about their "names", which is what most people would need

niklasl: nested triple terms is a rabbit hole

<TallTed> Forbidding nesting means we can no longer use RDF to describe *anything* that's named/identified, which feels problematic. Forbidding loops feels less problematic.

<enrico> Nesting is useful: :john :believes << :s1 | << [] | :liz :spouse :richard >> :starts 1964 >> .

<pchampin> I can definitely live with nesting, but I know that it worries some people

gkellogg: a triple descriptor can refer to an occurrence

<niklasl> +100 to not regularly *use* nested triple terms.

gkellogg: different triple descriptors may describe the same triple differently
… not nesting simplifies comparison a lot

enrico: why should we make this restriction, because it reduces expressiveness a lot
… if you want to make a statement about a statement
… you may want to model an n-ary relation using triple terms
… and then make an event in relation to the n-ary relation

<Zakim> gtw, you wanted to respond to gkellogg's take on non-nested descriptors

enrico: it is not too complicated to have a recursive structure

<pchampin> enrico, you can still do that, as gkellogg suggests:

<pchampin> :john :believes << :b1 | :s :p :o >>.

<pchampin> :marie :knows << :john :believes :b1 >>.

<enrico> :john :believes << :s1 | << [] | :liz :spouse :richard >> :starts 1964 >> . :s1 :certified-by :us-census .

<enrico> :paul :believes << :s2 | << [] | :liz :spouse :richard >> :starts 1965 >> .

gtw: it is natural to be able to describe anything about anything including statements about statements about statements
… but there are practical issues

<pchampin> which could be also written:

<pchampin> :marie :knows << :john :believes << :b1 | :s :p :o >> >> .

<AndyS> +1 to TallTed : occurrence mean unlikely to be very common. Advantage of named occurrences and separate triple /term/descriptors

gtw: [not sure what to write]

niklasl: what enrico wants is possible without nested terms

ora: this needs to be discussed further, many questions to be answered still
… you can model lots of things with nested terms but query writing becomes difficult

gkellogg: triple descriptors exist only because we want to talk about a triple occurrence

<pchampin> actually, this makes it impossible to talk about the rdf:nameOf triples...

gkellogg: if the occurrence is named then it can be used, even in a triple descriptor

<Souri> +1 to use of occurrence names inside a triple-desc to express nesting when needed

gkellogg: use of triple occurrence should be discouraged in the concrete syntax, but must be in the abstract syntax

pchampin: preventing the nesting makes it impossible to talk about the "nameOf" triples

<AndyS> SELECT ?x { :e rdf:nameOf ?x }

<AndyS> SELECT ?e ?x { ?e rdf:nameOf ?x }

AndyS: triple descriptors might not be generally used in RDF, but they will appear as results in SPARQL queries (examples above)

– DRAFT –
RDF-star WG biweekly long meeting

15 February 2024
