RDF-star WG biweekly long meeting

Meeting minutes

Seeking consensus

ora: we have a proposal to start from the concrete syntax and go from there

https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0095.html

https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0077.html

pchampin: I wrote about starting on the general form of the concrete syntax
… : we generally agree on this syntax proposal
… : now we need to agree on entailment
… : in this proposal the syntax derives into standard reification
… : we would then introduce a notion of "well-formed" RDF
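
For illustration, the expansion pchampin refers to could look like the sketch below. It is not taken from the proposal: triples are modelled as plain Python tuples of strings, and the function name `expand_to_reification` and the blank node label `_:r1` are made up for the example. Only the four rdf: terms come from the standard reification vocabulary.

```python
# Sketch only: triples are plain tuples of strings, not a real RDF library.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_to_reification(name, s, p, o):
    """Return the four standard-reification triples describing (s, p, o).

    `name` is the IRI or blank node label chosen for the reified statement
    (a hypothetical parameter, not part of any agreed syntax).
    """
    return [
        (name, RDF + "type", RDF + "Statement"),
        (name, RDF + "subject", s),
        (name, RDF + "predicate", p),
        (name, RDF + "object", o),
    ]


triples = expand_to_reification("_:r1", ":s", ":p", ":o")
for t in triples:
    print(" ".join(t), ".")
```

Under this reading, the new concrete syntax adds nothing to the abstract syntax: a parser would simply emit these four ordinary triples.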

enrico: I'm terrified by this proposal with standard reification
… : this would not be compatible with current uses of standard reification
… : we could introduce "non-standard" reification, with a new alphabet, as in the CG

thomas: we should define the most precise kind of reification

pchampin: response to enrico: not worried about backward compatibility. I would be happy with RDF-star quoted triples being very loosely constrained in terms of semantics.

<tl> more precise like "fact" and "claim" as subclasses of rdf:Statement and referring to asserted and unasserted statements respectively

pchampin: but if there is agreement over a risk, I wouldn't oppose a new vocabulary

<pfps> There are a bunch of things here. What I see as the main points are:

<pfps> 1/ no change to the abstract syntax

<pfps> 2/ no change to the semantics

<pfps> 3/ several concrete syntaxes have new shorthands that expand to RDF reification

<pfps> 4/ a new notion of well-formed RDF, which could allow for optimized implementations

pfps: agree with Pierre-Antoine, I don't see any problem with backward compat, as long as there is no change in syntax/semantics

<niklasl> +1 to these, and let 4/ be for optimization (not necessarily "bad meaning")

enrico: I would agree if you can write anything you want. The rdf:Statement class should be semantically defined. If it's just syntactic sugar then it's fine.

<Souri> +1 to well-formed (and ill-formed) reification

<pfps> The semantics that includes named triples in the domain of discourse doesn't really add anything to RDF. Even though named triples are new they don't have any special semantics.

ora: about point 2) (in the referenced email?): the idea that we could optimize is nice. The same idea could be used to optimize lists

<niklasl> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0094.html

niklasl: I think using reification is fine and can cover many use cases.

pchampin: to enrico: I understand that it's a shame to not be more specific about the semantics of the new syntax. However I think it's too late.
… : we could say the same for classes in RDF, they are very loosely defined. People more into ontologies might find they are not used correctly
… : it's a feature not a bug

enrico: the whole RDF alphabet is here, so it's a meta-language for syntax. It's semantically void, and that's fine. But in the wedding example, what does it mean to have a "starting date"? This can't just be syntax.
… : now where to go: say we give special meaning to these things (e.g. believe property and so on)

I prefer "weak semantics" to "semantic-less" :)

enrico: if we want to stay abstract "semantic-less", we need to give support for opacity

AndyS: if the syntax is in N-Triples or N-Quads, the fact that they can be streamed is important for large datasets. If you optimize reification you can't stream anymore, you have to track
… : about well-formedness: you can't split files anymore, because then you get ill-formed files
… : I propose we define a new term, reification descriptor, s-p-o that's a term
… : then it can't be split out
… : they have to be written out together
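
A minimal sketch of AndyS's concern about splitting files, under the assumption (not agreed in the meeting) that a reifier is well-formed when it has exactly one rdf:subject, one rdf:predicate and one rdf:object. The helper name `ill_formed_reifiers` and the `:since` property are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PARTS = (RDF + "subject", RDF + "predicate", RDF + "object")


def ill_formed_reifiers(triples):
    """Return reifiers lacking exactly one value for each of rdf:subject,
    rdf:predicate and rdf:object (the assumed well-formedness rule)."""
    seen = {}  # reifier -> {part predicate -> set of values}
    for s, p, o in triples:
        if p in PARTS:
            seen.setdefault(s, {}).setdefault(p, set()).add(o)
    return sorted(
        r for r, parts in seen.items()
        if any(len(parts.get(part, ())) != 1 for part in PARTS)
    )


whole = [
    ("_:r1", RDF + "subject", ":s"),
    ("_:r1", RDF + "predicate", ":p"),
    ("_:r1", RDF + "object", ":o"),
    ("_:r1", ":since", '"2024"'),
]
# Split the line-based file in the middle of the reification triples.
first_half, second_half = whole[:2], whole[2:]

print(ill_formed_reifiers(whole))        # the complete file
print(ill_formed_reifiers(first_half))   # first chunk after the split
print(ill_formed_reifiers(second_half))  # second chunk after the split
```

The complete list passes the check, while each half taken on its own reports `_:r1` as ill-formed. A single term holding s, p and o together could not be separated this way.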

Souri: I agree with what Andy said. In practice, things come by parts, not as a whole.
… It is important to allow for that streaming.
… Reification may be OK as the abstract syntax, but in the concrete syntax, inc. N-Triples, we need to have those triples packaged into one.
… As part of *one* statement, which removes any possibility of ill-formed data coming in.
… Nobody expected to write the 4 triples in N-Triples.
… N-Triples is used for streaming, it must support that adequately.

ora: What I had in mind for well-formed-ness is similar to the definition of "error" in Common-LISP.
… The spec says, in a number of places "It is an error to...".
… Implementations can choose not to detect or signal those errors.
… This allows well-formed-ness checking to be deferred, not necessarily done at load time. It can happen later on.
… At execution, you get undefined behaviour.
… I realize that the optimization aspect may be difficult.
… I'm not saying the idea is entirely without problem, but I find it attractive.

AndyS: would you be interested in a syntactic solution to that?
… The idea that a term in the data model keeps the three parts together.
… Souri's suggestion about N-Triples changes the spirit of N-Triples, if one line is no longer a single triple.

tl: if we keep the abstract syntax unchanged, but have concrete syntaxes to help, this should address Andy's and Souri's concerns.
… We don't need a quoted term in the model to get to that.

AndyS: we do.

pfps: I'm confused as to what these streaming implementations are going to do.

AndyS: writing out your graph is a streaming operation.

pfps: if you have the graph in advance, this is not streaming.

AndyS: I'm not talking about stream-processing of RDF. Just importing triples in, or serializing triples out.
… Triples can come in any order, but the order may be important for implementations.
… I repeat that we also need to consider the criticism on reification, especially the "bloat" problem.

Souri: the basic idea is to have a triple spo, and add properties to it.
… My idea was to split this in two parts in N-Triples.
… One statement to associate a name to spo. You don't need any more complication.
… Users really need something along that line.
… Also, you do not risk breaking it by separating the N-Triples file into multiple parts.
… From customers, we often get the feedback that RDF is complex and PGs are nice and simple.
… I want to reduce this perceived complexity.

niklasl: N-Triples is now and probably should stay a subset of Turtle.
… We need to preserve that.
… N-Quads has more leeway, in the sense that it is not a subset of TriG.
… [comment on implementations optimized for lists, which are even more complicated]

<AndyS> NQ is more fundamental for databases than NT.

olaf: about N-quads, we should not mix up N-Quads with anything meant to capture triples
… N-Quads is for datasets.
… We should not use the term quads, as it has a meaning different from what we are talking about.
… To Souri: from the point of view of reading the data in, order would matter
… you need to read the "naming statement" first if your named statements are stored in a different data structure
… if a simple "s p o" triple occurs and you don't know yet that s is the name of a named triple / edge, this might cause a problem
… ???
… Generally, I could live with considering the concrete syntax as syntactic sugar for std reification.

<doerthe> I have to leave, bye

Souri: In Turtle, it is ok to have, on the same line, the naming of a triple ("spo is named e") and adding a property to it ("e a b"), but in N-Triples I would separate the two.
… About the order: there is no issue when a "naming statement" comes, there is no issue if the subject was used before.
… What is difficult is to check for circular references.

olaf: circular references are indeed a problem.
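
One way such a check could be sketched is a depth-first search over the naming statements. The data layout (a dict from a triple's name to the triple it names) and the function name `has_naming_cycle` are assumptions made for this example, not something proposed in the meeting.

```python
def has_naming_cycle(names):
    """`names` maps a triple name to the (s, p, o) triple it names.

    Returns True if some name is reachable from itself through the
    subject or object positions of the named triples.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {n: WHITE for n in names}

    def visit(n):
        state[n] = GREY
        s, _p, o = names[n]
        for term in (s, o):
            if term in names:
                if state[term] == GREY:
                    return True  # back edge: we returned to the current path
                if state[term] == WHITE and visit(term):
                    return True
        state[n] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in names)


# :e2 annotates :e1, which names an ordinary triple: no cycle.
acyclic = {":e1": (":s", ":p", ":o"), ":e2": (":e1", ":source", ":doc")}
# :e1 names a triple about :e2, and :e2 names a triple about :e1.
cyclic = {":e1": (":e2", ":p", ":o"), ":e2": (":e1", ":q", ":o")}

print(has_naming_cycle(acyclic), has_naming_cycle(cyclic))
```

Note that the check needs the full set of naming statements, so it cannot be completed while the data is still being streamed in.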

<TallTed> "Turtle, the Terse RDF Triple Language" suggests we'd need a new thing, maybe the somewhat absurd "Turntle, the Terse RDF Named Triple Language".

<TallTed> "N-Triples is an easy to parse line-based subset of Turtle" means that we need a new thing for this naming, perhaps "N-Named-Triples".

<TallTed> "N-Quads [is] an extension of N-Triples" so we probably need another new thing, perhaps "N-Named-Quads, an extension of N-Named-Triples"

olaf: but depending on the order in which you get the triples, you might need to move things from one data structure into another.
… Consider ":s :p :o." You put this in your "regular triples" structure.
… Then you encounter ":a :b :c | :s" (:s is the name of a triple).
… Then you must move the first triple to your "annotations" structure.
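
The relocation olaf describes can be sketched as follows. The two-structure loader, its method names and the `moves` counter are hypothetical, and the sketch deliberately leaves open whether a naming statement also asserts the triple it names.

```python
class Loader:
    """Keeps regular triples and annotations in separate structures."""

    def __init__(self):
        self.regular = []      # ordinary triples
        self.annotations = []  # triples whose subject is the name of a triple
        self.names = {}        # name -> the (s, p, o) it names
        self.moves = 0         # how many triples had to be relocated

    def add_triple(self, s, p, o):
        # The right structure is only known if the naming was seen earlier.
        target = self.annotations if s in self.names else self.regular
        target.append((s, p, o))

    def add_naming(self, s, p, o, name):
        """Handle a naming statement written 's p o | name' in the minutes."""
        self.names[name] = (s, p, o)
        late = [t for t in self.regular if t[0] == name]
        self.regular = [t for t in self.regular if t[0] != name]
        self.annotations.extend(late)
        self.moves += len(late)


# Order from the minutes: the annotation arrives before the naming statement.
late_naming = Loader()
late_naming.add_triple(":s", ":p", ":o")
late_naming.add_naming(":a", ":b", ":c", ":s")

# Reverse order: the naming statement arrives first, nothing has to move.
early_naming = Loader()
early_naming.add_naming(":a", ":b", ":c", ":s")
early_naming.add_triple(":s", ":p", ":o")

print(late_naming.moves, early_naming.moves)
```

Both loaders end up with the same content; only the amount of relocation work depends on the input order.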

TallTed: what is being discussed now cannot be done with the existing language.
… We need new languages, this is a complete re-design and a rabbit hole.

AndyS: about optimization, I think that olaf is talking about a different case.
… We initially talked about representing the reification; olaf is talking about representing the annotations.
… Old-style reification triples can invalidate well-formed-ness.
… With triple terms, we can make the "naming statement" a regular triple, with specific predicate "rdf:nameOf" for example.

tl: would the problems just discussed be resolved by going back to defining the occurrence in a separate triple, eg :X rdfx:occurrenceOf << :s :p :o >>

AndyS: a cycle of references is not so bad, because of the indirection introduced by the name

<niklasl> +1 re. reference cycles

<TallTed> a bit of history -- https://www.w3.org/DesignIssues/Reify.html

ora: I'd like to gauge the group's sentiment about that direction

Souri: I like the direction in which this is going.

<TallTed> perhaps we need rdfstar:subject, rdfstar:predicate, rdfstar:object -- similar to, but not identical to, rdf:subject, rdf:predicate, rdf:object -- to avoid breaking on previously reified data

Souri: introducing a new term, with rdf:occurrenceOf, is basically doing the same

pchampin: hearing naming of a triple needs to be atomic.
… Proposal mentioned: New kind naming statement
… a graph is a usual graph and also the naming of triples.
… Proposal mentioned: Use form in CG report with a naming triple
… two ways of doing the same thing
… see merits in both - prefer second
… circular refs ... maybe define triple terms as non recursive.

<Souri> +1

<ora> Strawpoll: should we pursue this?

Strawpoll by Ora: Is this a productive direction?

ora: do people feel the current discussion is productive?

<ora> +1

<niklasl> +1

<pchampin> +1

<olaf> +1

<Tpt> +1

<tl> +1

<enrico> +1

<ktk> +1

<Timothe> +1

<TallTed> +1

<pfps> +1 it's as good as the other ways that have been explored

<gtw> +1

<eBremer> +1

<AZ> +0

<Dominik_T> +1

Yes to using reification as the basis (or a reification-like thing)

ora: pfps, is this in line with option 1 in your message from some time ago?

pfps: I don't remember that message specifically

The Turtle syntax is good.

pfps: We can hash out the details tomorrow

ora: it makes sense
… we can conclude this meeting now and reconvene tomorrow.
… Or do we want to continue this conversation now?
… People might need some time to digest.

AndyS: if we go for the reification thing, we could go for a different kind of reification.
… It would be good to know what people are expecting.

tl: I don't see what the expansion of the chevron syntax into naming statements means for implementations

Souri: today, reification makes an instance of rdf:Statement. What if we introduced a new type (which I'd like to call Edge, but could be something else)
… we can say "this is a specific type with specific properties", detached from rdf:Statement

:a :b :c | :e.

SELECT * WHERE { :e ?p ?o }

<pfps> A more precise statement of this is

Souri: your query would not match anything.

<pfps> Note that optimized systems must behave correctly so there are five (four?) results from

<pfps> SELECT ?s WHERE {

<pfps> ?s ?p ?o .

<pfps> }

<pfps> when querying

Souri: only an "edge pattern" would match

<pfps> << :a :b :c >> :d :e .

Compare to -- :e rdf:occurrenceOf <<( :a :b :c )>> # A new term thingy
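
One possible reading of pfps's "five (four?)" count is that the shorthand expands to the three rdf:subject/rdf:predicate/rdf:object triples, optionally plus an rdf:type rdf:Statement triple, in addition to the `:d :e` triple itself. The sketch below counts solutions under that assumption; the blank node label `_:r` and the helper names are made up, and graphs are plain sets of tuples rather than a real SPARQL engine.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand(reifier, s, p, o, with_type=True):
    """Reification triples for (s, p, o); the rdf:type triple is optional."""
    triples = [
        (reifier, RDF + "subject", s),
        (reifier, RDF + "predicate", p),
        (reifier, RDF + "object", o),
    ]
    if with_type:
        triples.append((reifier, RDF + "type", RDF + "Statement"))
    return triples


def select_s(graph):
    """Solutions for ?s in SELECT ?s WHERE { ?s ?p ?o }: one per triple."""
    return [s for s, _p, _o in graph]


def graph_for(with_type):
    # << :a :b :c >> :d :e .  with _:r standing for the quoted triple
    return set(expand("_:r", ":a", ":b", ":c", with_type)) | {("_:r", ":d", ":e")}


print(len(select_s(graph_for(True))), len(select_s(graph_for(False))))
```

An implementation that stores the reification in an optimized form would still have to return the same solutions as this naive expansion.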

AndyS: I get nervous when we start saying that an RDF graph is no longer just a set of triples
… now it would be "a set of triples + a set of edges"
… we should be very cautious about that

<pchampin> I agree, and that's why I prefer the 'rdf:occurrenceOf' way of naming

Souri: we are introducing new things anyway. occurrences are not behaving the same as triples.
… Yes, RDF 1.2 is RDF 1.1 + something, here a set of edges, which can be empty

<TallTed> "simple" changes can have VERY complex impacts

Souri: This bridges the gap with PGs, in a backward compatible way
… PGs have shown that this was useful; RDBs have had it from day one.

AndyS: I agree with the objectives you state.
… pchampin has highlighted the difference above:
… are edges a new kind of term, or a new kind of construct in graphs?

Souri: from a user's point of view, I don't really care how this is done internally
… I'm interested in the syntax.
… I think that a triple with the association of a name provides a better mental model.
… That's where I'm coming from.

AndyS: if in your syntax, the vertical bar was replaced by a property "as name of", would that work for you?

Souri: in my opinion, it would be the same, but I aim to avoid the introduction of a new term.
… I like the Turtle syntax. I would like something similar in the N-Triples syntax.

AndyS: we could also have a dedicated keyword for this "has name of" property, similar to the "a" keyword in Turtle, would that meet your usability concerns?

<TallTed> `<< :s :p :o >> named :fred`

Souri: it is basically equivalent, although I would prefer a simple symbol.

<TallTed> `<< :s :p :o >> Ω :fred`

AndyS: it is acceptable if it is a triple. That way we don't need a new set in the definition of graphs.

Souri: we introduce new complexity at one level or another. Either at the term level, or at the graph level.

pchampin: question is the point where the complexity goes.
… I lean towards the term extension; that being said, it might open up other issues

olaf: I wanted to call this out explicitly:
… AndyS, you propose that we have a new type of term (triple terms), but they can be used only as a subject of a specific predicate, and the object would be a blank node or IRI.
… Is that the proposal?

AndyS: yes, with the caveat that this would not be a MUST
… People could use those terms differently, and suffer the consequences
… Just like there are good usage and bad usages of reification, there would be good usage and bad usage of this new kind of terms

<tl> can we understand

<tl> << :a | :s :p :o >> :x :y .

<tl> as syntactic sugar for

<tl> :a :isNameOf << :s :p :o >> .

ktk: did you talk about well-formed? I would like to discuss how they relate to lists.

<tl> :a :x :y .

<tl> and then also << :s :p :o >> as syntactic sugar for standard reification quad

<tl> i.e. a two step sugar-ing?

ktk: I ran into hard problems with lists.

<AndyS> The recommended usage give nice syntax, the less good way gives no helper syntax.

ktk: If we go towards well-formed-ness, we must think of problems like that.

ora: I wrote examples in slides of how to manipulate collections, this is indeed hell.
… What about tomorrow's call?

AndyS: can anybody who is participating actively in this discussion make it tomorrow?

olaf: I can't, but I'll be on vacation next week, so don't postpone.

tl: I can make it

ora: me too

<ora> https://www.w3.org/groups/tf/rdf-star-semantics/calendar/

pchampin: olaf, my understanding is that you are ok with the current approach of extending the abstract syntax (either at the term level, or at the graph level)

olaf: yes; my interest is in defining the behaviour of SPARQL on top of it

<Tpt> Thank you for this discussion, it seems to be moving in a great direction!

ora: I'm encouraged by this discussion; it could be seen as a step back, but I think it is a step forward in order to produce a recommendation

18 January 2024
