W3C

DPUB IG Locator Task Force Call

27 Jan 2016

Agenda

See also: IRC log

Attendees

Present
Ivan Herman, Bill Kasdorf, Ben De Meester, Luc Audrain, Romain Deltour, Tzviya Siegman, Daniel Weck, Markus Gylling
Regrets
Chair
Ben_de_Meester
Scribe
tzviya

ben: There has been some discussion about content negotiation for canonical version of URL
... Romain mentioned that this is not ideal

Romain: What is relevance of content negotiation for PWP?

ivan: (summarizing conversation) What is frequently done with namespaces in sem web world
... When there is a vocab in semweb, there is a namespace, which is a URI
... The vocab is expressed in RDF and can be expressed in various serializations: Turtle, JSON-LD, RDF/XML
... You have different URLs. You have the canonical URL, which is the namespace (example PROV)

<ivan> namespace for prov voc: http://www.w3.org/ns/prov

<ivan> ....prov.ttl

Ivan: This is the official URL for the Provenance vocab

<ivan> ....prov.rdf

Ivan: and the other serializations have different extensions
... What happens if you issue HTTP GET against canonical URL?
... returns Turtle RDF, because there is server preference for turtle
... Content negotiation works such that the client states preference for one serialization
... the ttl and rdf express the same information in different formats
... This is the model I had in mind for PWP
... whether archived or online, the URL of PWP is irrelevant
... packaged or unpackaged version should be an issue for client preferences
... Many dislike content negotiation
... This requires apache work and caching of what?
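
The negotiation Ivan describes can be sketched as follows. This is a minimal illustration, not how the W3C server is actually configured: the `AVAILABLE` table and the function names are assumptions, and only the two file extensions mentioned above are used. It ignores partial wildcards such as `text/*` and returns the server's default instead of a 406 response when nothing matches.

```python
# Hypothetical table of what one canonical URL can be served as.
# Order expresses server preference (Turtle first, as described above).
AVAILABLE = [
    ("text/turtle", "prov.ttl"),
    ("application/rdf+xml", "prov.rdf"),
]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep the client's original order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header):
    """Pick a file for the canonical URL from the client's stated preference."""
    for media_type, q in parse_accept(accept_header or ""):
        if q <= 0:
            continue  # the client listed this type as unacceptable
        for offered, filename in AVAILABLE:
            if media_type in (offered, "*/*"):
                return filename
    # No usable preference: fall back to the server's first choice
    return AVAILABLE[0][1]
```

A client that sends no `Accept` header gets the Turtle file, while one that asks for `application/rdf+xml` gets the RDF/XML file, both from the same canonical URL.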

<Bill_Kasdorf> +1 I really like Ivan's conceptual model

Ivan: If I have PWP in archived, zip format, and I have it online - are these fundamentally the same or different?

romain: I want to look at this as approaching from HTTP standpoint
... on one hand, making GET request
... this is also a question of usability
... With RDF, most users are machines
... We are talking about human users. As a human, I can't make a decision about whether to access packaged/unpackaged form
... I think we can assume that we are talking about 2 different URLs

<Bill_Kasdorf> my comment will be quick

romain: when you make a GET request for a resource, you must look at the semantic equivalence of the resource. On one hand it is the HTML, on the other hand it is the full resource

bill_Kasdorf: we may be drawing too much of a distinction between packaged/unpackaged
... Most PWPs will have some external resources - so we have publications that are both ends of the spectrum and in the middle

ivan: referring to romain - you make assumption that in 1 case, return the whole package, in other case, return the HTML file
... that might not be, maybe return the manifest
... which represent the package

<rdeltour> for the record: the other aspect of my comment was about usability: a user browsing the web doesn't have control on the HTTP headers when she clicks a link (unlike a piece of software accessing an RDF resource). hence a proactive-conneg-only approach cannot work in the general case, we *have to* have 2 distinct URLs.

ivan: this is just like the semweb case. We have different URLs for different purposes
... There is a URL that does not make the distinction because we need it for reference purposes
... Perhaps we are not disagreeing
... If I do not make the distinction between formats, I am just referring to the content, regardless of format
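
The two positions in this exchange (distinct URLs for the packaged and unpackaged forms, plus one format-neutral URL for reference) are compatible, as this rough sketch suggests. All URLs, the `.zip` suffix, and the `respond` function are hypothetical, and the negotiation is simplified to look only at the first media type in the `Accept` header.

```python
# Hypothetical URLs; nothing here is defined by the task force.
CANONICAL = "https://example.org/books/moby-dick"
FIXED = {
    CANONICAL + "/index.html": "unpackaged",  # explicit URL for the unpacked form
    CANONICAL + ".zip": "packaged",           # explicit URL for the archive
}
PACKAGE_TYPE = "application/zip"
# Roughly what a browser sends when a user clicks a link
BROWSER_ACCEPT = "text/html,application/xhtml+xml,*/*;q=0.8"


def respond(url, accept=BROWSER_ACCEPT):
    """Say which form of the publication a request would receive."""
    if url in FIXED:
        # Format-specific URLs ignore the Accept header entirely
        return FIXED[url]
    if url == CANONICAL:
        # Simplification: inspect only the first listed media type
        first = accept.split(",")[0].split(";")[0].strip().lower()
        return "packaged" if first == PACKAGE_TYPE else "unpackaged"
    return None  # not part of this publication
```

With the browser's default header, `respond(CANONICAL)` always yields the unpackaged form, which is Romain's point: a person clicking a link can only reach the package through its own URL, while software that sets `Accept: application/zip` can still use the canonical one.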

markus: would we be helped by stepping back and looking at high level design goals?
... Depending on perspective, we reach different conclusions
... Romain's conclusions are based on effects of using HTTP - rings a bell
... current situation is functionally awful - locating and linking is a mess for EPUB, anno, etc

<mgylling> Two high-level design goals:

<mgylling> 1) when a PWP is online and unpacked at its canonical location, linking and addressing works exactly as on the web

<mgylling> 2) when a PWP is detached from its canonical location, linking and addressing works exactly as if the PWP was located at 1

<mgylling> ... a client may in case 2 have to do active intercepts to enable #2. Compare to how browser

markus: point 2 is the high level issue - must establish equivalence between the two. No user should ever realize the difference between online/offline
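
One way to picture the "active intercepts" of goal 2 is a client that rewrites canonical URLs to a local copy when the publication is detached. The sketch below is only an assumption about how that could look: the `resolve` function, its parameters, and the example paths are hypothetical, query strings are dropped, and `..` segments are not sanitized.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def resolve(url, canonical_base, local_root=None):
    """Return the location a client would actually read `url` from.

    Online (local_root is None): the URL is used as-is (goal 1).
    Detached: URLs under canonical_base map into the local copy (goal 2);
    anything else, such as an external resource, is left untouched.
    """
    base = canonical_base.rstrip("/") + "/"
    if local_root is None or not url.startswith(base):
        return url
    parts = urlsplit(url)
    # Path of the resource relative to the publication's canonical base
    relative = parts.path[len(urlsplit(base).path):]
    local = str(PurePosixPath(local_root) / relative)
    # Keep the fragment so links into a document still land in the same place
    return local + ("#" + parts.fragment if parts.fragment else "")
```

For example, with a canonical base of `https://example.org/books/moby-dick` and a local root of `/home/me/pwp/moby-dick`, a link to `.../moby-dick/chapter1.html#p3` resolves to `/home/me/pwp/moby-dick/chapter1.html#p3`, so the link itself never has to change.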

romain: I agree that we should look at higher level and return to use cases
... we lack use cases around portability
... We have to figure out what we mean by portability
... If I understand correctly, the packaged version has significance when it is downloaded and moved around?

markus: yes

luc: is item 1 shared URL for all users and item 2 local address for individuals?
... The second one could be a unique PWP for my personal use

ivan: I love what luc said
... This differentiation is very important. There is a major difference between the package that is mine - the URL for this packaged format cannot be different from the canonical, and the URL for the package on my disk
... It may not be realistic to maintain the same URL throughout the life of the product for security reasons (especially once downloaded)
... But, differentiating between downloaded and online seems to be the way to go
... We do very badly need the use cases

Luc: Please clarify what Markus's case 1 is
... is it that everyone from every country points to the same address? Or, is it that this URL is unique to me to annotate, etc.?

bill_K: This is not about the state of the publication but about whether it's changed

ivan: (speaking for Markus) for me, the issue is about 1 and 2 together
... as long as we are talking about the online and offline models, the offlinification is only relevant if I pass the publication on to someone else

<laudrain> what is the « canonical location »? the publisher manifestation?

ivan: Once I pass it on to another user, she can do with it what she wants

<Bill_Kasdorf> this is why we need a hierarchical model for the locator (a la FRBR)

luc: I am asking for clarification of canonical location
... the canonical location does not belong to anyone. What happens next, such as annotations, does not belong to anyone

markus: yes, and that is both good and bad
... canonical loc is a term that we use to say the URL that is the reference point for the publication
... If a user clones that URL or annotates, etc, will still be able to refer to canonical URL or local

ivan: We already have this. We called this identifier
... When I make my own copy (in a different state) and manipulate it, what happens to the URL?
... what happens when someone in a PWP makes a cross-reference with a canonical URL?

markus: do we have to define how the URL is mutated when doing FRBR-item ops?

romain: not sure I agree that canonical URL is identifier

ivan: +1

<rdeltour> I'll create gh issues as placeholders for discussion on use cases

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/01/27 16:05:39 $