Publishing F2F, 1st day -- 22 Oct 2018

<ivan> scribenick: bigbluehat

Introductions

<Cristina> +cristina

UNKNOWN_SPEAKER: lots of wonderful people greet each other with their names and titles.

tzviya: anyone on the phone who would like to introduce themselves?

ivan: everyone around the table, please present+ yourself in IRC

tzviya: if you're not familiar with IRC, please let us know

Dinner

<tzviya> https://docs.google.com/document/d/1Mt9PTcOdmrCwIsgfxbGMGjwHlUsySU01I0D4oBkSbcA/edit?usp=sharing

tzviya: please RSVP for dinner on the Google Doc

ivan: also please add your appetizer and dessert choice before noon today

Now Let's Talk About Publishing

tzviya: I'll start with an overview to hopefully catch everyone up
... here's our current position
... the content in a Web Publication is anything the Web can have: SVG, images, audio, HTML, etc.
... the big questions still pivot around the manifest and the content and how those interact
... we still talk about the abstract concept of the infoset
... and we also continue to explore metadata
... as well as the exact structure
... especially with regard to internationalization
... there is a lot of interplay between the pieces
... and we often get hung up on some of these questions
... the details of which are rather important
... but we need to be careful not to debate these at too much length, as we have more to accomplish
... so. what we have right now is:
... the content - html, css, etc.
... the abstract infoset
... the manifest
... the webidl
... and the JSON-LD context
... and we'll be discussing how those all interact
... if there's anyone who's new to this, please ask questions!
... k. I think we'll move on to the next agenda item

ivan: um. I think the whole infoset debate/question should be discussed
... when we started the work, the abstraction was valuable
... if I were comparing it to something else, it would be numbers or numerals
... numbers are the abstract concept, and numerals are the real tangibles...hexadecimal, etc.
... because I'm a mathematician this abstraction helps me a lot
... and currently the whole document is written around this construction
... but we need to determine if it would be better to focus on just the manifest
... which we did consider may happen when we introduced the manifest concept

garth: I would have said the info set is really the content of the manifest

<Hadrien> +1 to removing the infoset

garth: I don't really see them as separate

ivan: so let's not forget that your statement is not quite correct
... the manifest may be incomplete by itself
... and it will need to take information from the surrounding HTML itself
... so it's much clearer to think about the infoset
... which may be filled from many different sources
... the infoset then is important

romain: I feel it's mostly a communication issue
... and fear we'll drive people away with the "abstract infoset" language
... for instance, if we talk about "publication title" it seems obvious this is the abstract concept, and not the specific serialization

<George> +1 to what Romain said

<laudrain> +1

romain: so I'm not sure continuing to distinguish this is helpful

<tzviya> +1 to romain

Avneesh: is there a possibility that the infoset at the abstract level includes beyond the manifest?
... so I guess we may need to have both of them

Rachel: what is the problem that the infoset was attempting to solve in the first place?

ivan: I think at the moment when we began to discuss this, things went all over the place
... what are the things we need, and then we very quickly went into how we expressed them in JSON
... so we introduced the infoset terminology, to avoid always pushing things into JSON

<laurentlemeur> q

ivan: so we were then focused on the things we needed abstractly, without demanding we determine the format/expression
... do we still need that now? does it help the reader? that's really the question for today

Rachel: I'm still struggling to understand what the infoset is actually solving for us

tzviya: we built up this list of metadata and called it the "infoset" for abstraction purposes
... then when we got to the JSON-LD discussion, we started adding MUSTs and SHOULDs
... but we still have the infoset, because things like `title` may be expressed in either HTML or JSON-LD
... and therefore the abstraction may be helpful
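
The point that one abstract item such as `title` may come from either the manifest or the surrounding HTML can be sketched roughly as follows. This is only an illustration: the `name` property, the `TitleManifest` and `InfosetTitle` shapes, and the choice to prefer the manifest over the HTML title are all assumptions of the sketch, not something the group decided.

```typescript
// Hypothetical shapes for illustration; neither is taken from the WPUB draft.
interface TitleManifest {
  name?: string; // assumed manifest property carrying the title
}

interface InfosetTitle {
  title: string;
  source: "manifest" | "html" | "none";
}

// Fill the abstract "title" item from whichever source provides it.
// Preferring the manifest over the HTML <title> is an assumption of this sketch.
function resolveTitle(manifest: TitleManifest, htmlTitle: string | null): InfosetTitle {
  const fromManifest = manifest.name?.trim();
  if (fromManifest) {
    return { title: fromManifest, source: "manifest" };
  }
  const fromHtml = htmlTitle?.trim();
  if (fromHtml) {
    return { title: fromHtml, source: "html" };
  }
  // Neither source supplied a title.
  return { title: "", source: "none" };
}
```

Whatever the serialization, a consumer of the result only sees "the publication title", which is the abstraction romain and tzviya describe.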

Rachel: the problem, then, that we have is that we need to determine the location of this information

tzviya: so one idea is to focus on the origins of the metadata rather than their abstractions

duga: I'm trying to think of specs that do something like this, and CSS 2.1's box model spec takes this approach

<tzviya> https://www.w3.org/TR/CSS2/box.html

duga: there are abstract ideas explained, and then details of their expressions
... I'm not sure, though, that our spec is that complex
... however, it still might be useful to keep some expression of abstractions
... but for things like title, I'm not sure it needs its own abstraction as people understand those already

leonardr: I would agree with duga, and I think ivan's history (that we defined the infoset before we serialized it) is correct
... and that the abstraction may no longer be necessary
... and hopefully we can remove the abstraction
... and fill in the missing pieces that it dealt with

laudrain: so in our intro we say that a Web Publication is "pure Web"
... so the abstractions begin to explain perhaps how a Web Publication differs from a Web Site or a Web App
... it could be very technical and expressed in something like JSON-LD
... but we really should explain how Web Publications differ

gkellogg: we ran into this on the JSON-LD specifications
... we currently discuss things in strings, arrays, and dictionaries
... abstract enough that it can be implemented in something like YAML, but concrete enough for its focus on JSON

laurentlemeur: perhaps we could remove the abstract infoset, and instead focus on the WebIDL or some other expression of the properties

<Avneesh> +1 Laurent. Infoset is not so well understood

laurentlemeur: also the infoset is scattered throughout the document, and perhaps moving it back together into one place could be helpful
... the options for placing the title are encoding expressions

Hadrien: we are working with something conceptual

<laudrain> +1

Hadrien: but I do think the infoset term is confusing
... but this concept of a Web Publication is what we're selling to the world
... perhaps one of the issues with keeping things at the abstract level is that we don't divide well between what actually ends up in the manifest and what doesn't

<laudrain> +1

<Zakim> tzviya, you wanted to explain that duplication is the problem

<Cristina> +1

tzviya: one concern we've had in the past is duplication
... the requirements on the abstract infoset can be confusing
... and using properties of a publication might be more clarifying

ivan: k. I don't think we should go on with this discussion too much longer
... I propose that Matt, I, and perhaps the chairs look through the document
... and take these comments into account
... and essentially remove the term...which I realize may frighten some here
... but I don't think having telcos on this longer makes sense
... so we will work up a pull request to address this

PROPOSED: edit the infoset and properties section, and introduce a PR to the group

Avneesh: should we express that this is a non-operative section?

tzviya: I think that is part of the intent, but as yet it's not that concrete

<clapierre> +1

ivan: we hope to have continued discussion around the specific text once it's sent as a pull request

RESOLUTION: edit the infoset and properties section, and introduce a PR to the group

use case, affordances

tzviya: since we have 10 extra minutes, does anyone have other questions around the interactions of the different pieces, manifest, infoset, webidl, etc?

George: are web browsers today looking for this manifest?

<laudrain> +1

ivan: no. and it remains to be seen if/when that will happen

George: so that is why we begin with an HTML document?

tzviya: correct.

laudrain: so, these are at this point very similar to Web Apps.
... but it will be hard for publishers to make something like that consistently and as a standard

tzviya: I believe this is issue #271...and we've discussed this frequently
... what does it mean to be "WP-aware"?
... I don't want to derail things, but we have yet to define what it means to open a WP

<Ralph> #271 : WP rendering in non WP aware browsers

<tzviya> https://w3c.github.io/dpub-pwp-ucr/

tzviya: Franco has done an amazing amount of work on the use cases and requirements document
... hi Josh!
... are you on IRC?

josh: yes!
... the changes I've made were checked in on Friday
... and I've noticed some have been merged

<tzviya> https://w3c.github.io/dpub-pwp-ucr/

josh: they're good. Franco did some great work identifying areas that weren't covered
... so the status feels like it has what it needs and just needs some editing and then publishing
... there may, however, need to be some trimming of requirements that I didn't understand when I started editing the document

leonardr: can you give us any sort of general overview about the changes?
... were these clean up changes? new use cases? can you generalize these into a summary for us?

josh: sure. there were no new use cases.
... Franco has submitted some that tzviya and I have yet to review
... it was about a two year old document
... so it's been tidied up a bit, and in the last 48 hours there's some new content
... but we've also done some general clean up
... however, i think this group has bigger fish to fry than more use cases
... I don't know how in depth tzviya wants to go
... there's one huge one, what does the UA want to do?
... and that one could take a while
... we'd all agree easily on about 90% of these
... but 10% of the ones related to user experience--layout, user movement through the document, etc
... they sound simple, but they have frequently been hard to discuss
... I don't know how much of these are practical for the first version

tzviya: so, since I merged many of Franco's PRs, I can give a brief overview
... he did add more use cases based on what's gone into the spec
... he also added comments, so you'll see comments like "I did not see these fields, help wanted"
... this is where we ask the working group to review/contribute
... the intent was to make the UCR document match the WPUB document's current reality
... when heather was the editor, she'd started a pattern of adding requirements at the end of the document
... Franco's continued that with the intent of making UCR and WPUB match
... we have a WebIDL and we have metadata, but what does that mean if I open the web publication?
... we keep getting hung up here
... and if I want to understand that with the UCR, I open section 2.2.1
... and it explains more about what could be done with that metadata
... so, today, we can talk about what can be done

ivan: the most important thing is to link the UCR with WPUB...from both sides
... right now, we have the "affordances" section
... and also with the specific metadata
... and those refer to their use cases
... and for many it's obvious, but for some it's not
... we also want to do the same from the UCR document
... we can or can not do the use case described with what's expressed in WPUB with links back to how to make that happen in reality

Hadrien: the document is very useful, but I'm concerned that whenever we talk about a UCR or when we talk about what a non-WP does or does not
... we still don't determine the minimum of what a WP-aware reader must do
... there are some things that are not minimum, like offlining, that we talk about as if they were
... and I fear that by trying to solve them all at once we solve none of them

leonardr: I think that what I'm hearing is two different things
... I completely agree with the desire to map WPUB requirements to the UCR document
... what I have a problem with is using the UCR document to put requirements on UAs
... in PDF specs for example, we talk about format requirements vs. process requirements
... we have format requirements in WPUB, but we lack process requirements
... that seems like a very normative core to our spec which we should add
... we're describing not only what they're expressing, but what they expect to have happen when they do it

<garth> +1 Leonard

leonardr: the UCR, however, is not normative, and has no "Thou Shalt Do X" in it

tzviya: typically, it depends on the spec, but usually that sort of thing doesn't appear in W3C specifications
... I'd spoken with Josh about creating three or so "focal" use cases
... one might be offlining
... it's a lot complicated
... but it seems essential
... so we write some focal use cases around that and other core requirements

<Zakim> tzviya, you wanted to talk about focal use cases

garth: how do you see the UCR? is it just augmenting the WPUB spec?

tzviya: yes. I agree with that

George: I agree with garth and Hadrien. That we should reference and describe the things in our UCR document
... and point to them from the WPUB spec
... but we shouldn't go from the UCR document and then figure out how to modify the WPUB spec to match
... we should focus on implementable WPUB

ivan: I'm in agreement with a minimum viable approach to WPUB
... but the hard part is determining what is normative and what is non-normative
... if we put it as normative, then we MUST have (per W3C process) several implementations for every one of those MUSTs
... if it's informative, then we can just leave it there
... so, for this MVP approach, those things should be normative, but we should be careful here
... because we have to test and verify all these things
... so if it goes into WPUB as normative, we should be very careful

tzviya: so, coming back to the UCR, what things should we add to this MVP for WPUB?

garth: all MUSTs but some SHOULDs?

tzviya: do we even have that language in the UCR?

garth: yeah, we do have that in the UCR currently

josh: I love the MVP idea
... I've been going through the UCR document for weeks
... for instance, showing the TOC while you're anywhere within the document

<George> The TOC must be omnipresent

PROPOSE: the table of contents must be accessible from anywhere in the publication

<scribe> scribenick: duga

<wolfgang> +1 to bigbluehat on TOC

bigbluehat: All for clarifying stuff, but really need a testing champion

tzviya: We have one! Chris

bigbluehat: Need another
... Need to figure out how to make this omnipresence testable

<bigbluehat> scribenick: bigbluehat

ivan: we've discussed the toc issue several times
... there are many more of these though
... like search should be across all the things in the publication
... and figure numbering, etc, should be across all the things in the publication
... these seem like sensible and unique-to-publications features and capabilities

Hadrien: I don't think these are MVP
... frankly, I only see two things as MVP
... going through the reading order, and accessing the ToC

leonardr: I was going the same place as Hadrien.
... what ivan listed requires many more discussions
... and I'd not consider them MVP

garth: I'm very hesitant to put requirements on UAs.
... things like search, etc, seem likely to be added by UAs, but not likely to be an MVP

ivan: searching across multiple documents is something unique to publications

garth: I'm generally agreeing. The top two (toc access and directional progression through the document) do state an MVP
... but then what can happen in non-WP aware UAs is also a consideration
... well, the actual one is understanding the manifest
... so, perhaps this is a great discussion for us to determine these things and build up from there

tzviya: josh perhaps you can think through what's MVP and what's WP-aware and add these things to the UCR

laurentlemeur: so, we can do some of these things by going back and forward in a current browser
... or is this a directional progression from point a to point b to point c without back and forth?

garth: I'd say directional from resource to resource without back and forth

Avneesh: this sounds like determining the bounds of a publication

laudrain_: about WP-aware UAs
... is it possible to speak about a UA that understands at least JS and CSS?
... what kind of engine are we speaking about?
... what must it support?
... what if it can't do CSS or JavaScript?
... what does it mean in that moment?
... I would like to propose that we're talking about UAs that can do JavaScript and CSS as a minimum

Avneesh: as far as I understand, WP-aware vs. non-WP-aware, the keyword is "WP"
... not JavaScript support, etc.
... there should be points where we say "this is a WP aware UA feature"
... the JavaScript and CSS are not part of those requirements

laudrain_: but that is my point by not determining whether we have these things or not, we do not have a foundation to build upon
... I feel like we speak about engines that are too small and lack features, such that we can't build a WP experience in a non-WP-aware browser

<ivan> RickJ asked me to forward his greetings to everyone around

<Ralph> [I suggest that we can describe what it means to be "WP Aware"; what it means to conform to the WP specification but it is not practical to say what a non-"WP Aware" agent does with a WP. We can design WP such that non-aware clients are more likely than not to do something helpful.]

Avneesh: just one note to say that if a non-WP-aware UA is given JavaScript that creates a WP-like experience, then that non-WP-aware UA becomes WP-aware

<Zakim> tzviya, you wanted to say that we can't redefine UA for W3C

Hadrien: this is back to the discussion earlier, but I think going back and forth to a ToC is not an MVP feature

tzviya: I think we need to be careful with defining User Agent
... we can define WP-aware
... but we need to be careful to determine the meaning of UA

garth: I keep thinking Reading System, and I'm not sure if that's a UA. I think it is, but it has been confusing
... I think what laudrain_ and Avneesh are saying is that if you can build the reading experience and distribute that with your publication then you get a WP-aware UA with your publication
... and in as much as WPs are distributed on the open web platform, then a WP can be distributed with such a built-in WP-aware UA
... the publication itself causes the WP to be WP-aware

<leonardr> :)

George: Hadrien I like to be able to move through the document without going back and forth
... but I also want to be able to collapse a collection of 1000+ documents, and get to just part of that

Hadrien: yes. I want that too.

George: then we're in agreement. great :)

<romain> sidenote: in HTML, "user agent" are defined as a conformance class (in 2.1.8: https://html.spec.whatwg.org/#conformance-classes)

tzviya: let's come back to the MVP for WP-aware

Hadrien: there's no concept of this now
... if you build something like this now, it would be a Web App
... and those lack an understanding of scope
... especially with search...so I think that should not be an MVP
... also offline-ability is hard to achieve consistently
... things like comics, etc, don't have space available usually
... and are therefore not good MVPs

leonardr: and this brings us back to the discussion of boundedness

<garth> Perhaps: MVP == (get to TOC | move through reading order); MGP == search within bounds of WP

leonardr: do we list all resources? do we list just some resources?
... do we allow them all to be searched? offlined? etc.
... then we'll need to determine per resource what's possible for search, offline, etc.

ivan: so we have some MVP thoughts
... but if we stop there, then we have this minimal thing
... and a huge blob of features in the UCR
... and then UA developers pick just what they want
... and yet some of these affordances, however hard they may be, should be considered fundamental to a publication
... it perhaps is a difference between MUSTs and SHOULDs, but they should still be expressed

wendyreid: the product creation folks understand this
... and ivan is reading my mind
... what we probably have to do is create tiers
... MVP comes from product design
... one of the core product design concepts is iterative improvement
... and we could benefit here from a list of iterative improvements up from an MVP

duga: so I have an unhelpful comment, but I'm also going to be generally agreeable
... I don't think search is an MVP because it'd be possible to ship something that can be searched, but without expectation that it will be searched
... if a large group says, it's hard to implement and hard to create then is it an MVP feature?

marisa: can someone remind me what MVP means in relation to our stack?
... when I think of our spec, I could see many browsers doing most of these
... what does it give our spec to define mvp?

garth: I'd think of them as just the MUSTs
... and I'll agree with duga
... searching across the bounds of the publication is probably not a MUST

marisa: I don't think we have to split so many hairs here

George: the minimum viable reading experience is probably something beyond what we've discussed so far as MVP

<laudrain_> +1

George: and I'm concerned that if we only spec an MVP, that it won't result in a good reading experience

Hadrien: we should focus on MUSTs, SHOULDs, and MAYs
... there's been a lot of discussion of search and offline
... I think those are intertwined
... once they're offline you could index them

<scribe> scribenick: duga

bigbluehat: The web is trending offline
... service workers, new stuff

<scribe> ... New indexing stuff is coming for searching large documents

UNKNOWN_SPEAKER: Currently being explored in other groups, let's talk to them

<bigbluehat> scribenick: bigbluehat

romain: I had a quick look for UAs in the HTML spec
... and they use conformance classes
... interactive browsers do one set of things
... non-interactive ones get a slightly different list
... things like validators, etc.

https://html.spec.whatwg.org/#conformance-classes

scribe: we could build up from these

Avneesh: I believe everyone generally agrees, but perhaps MVP is a confusing term
... and in fact we're looking for the core affordances that should be provided

tzviya: I agree

liisamk_: I don't think search is MVP, and I don't think changing font size is necessarily MVP
... but I do think internal and external linking experience is MVP

<Avneesh> +1 to Romain, to also consider html classification

ivan: I was happy to hear what bigbluehat was saying. We should use the MUST, SHOULD, and MAY, etc.
... if it's hard today, we should put it as a SHOULD
... but we should have a clear idea of what is being built for the future

<scribe> scribenick: bigbluehat

<Ralph> [more tests are always better; even SHOULD and MAY]

<duga> bigbluehat: You do have to write tests for SHOULD but don't need to pass them (maybe?)

<duga> ... There is an assumption we are on a desktop browser given our title with "Web"

<duga> ... Can we assume this for conformance tests?

<duga> ... If we don't assume a proper browser, we end up in a bad place

<scribe> scribenick: bigbluehat

George: is it possible for a WP-aware browser to not process JavaScript-embedded affordances that it already provides?

ivan: the answer is yes: that's what polyfills are designed to do

tzviya: we've got 13 minutes
... just fyi

<Zakim> tzviya, you wanted to propose a way forward

liisamk_: from a general mapping perspective, MVP is MUSTs, SHOULDs are next level up, and MAY is super awesome product

tzviya: so. I'm going to propose that josh and franco with the UCR and affordances, etc, go through the existing WPUB document
... and go through the MUSTs

<liisamk_> must = minimal, should = desirable, may = optimal

tzviya: and next time we meet we look at the MVP/MUSTs
... and start listing them in some section of the document

<liisamk_> or may = sexy

tzviya: and then we'll see if that's something we can live with
... does that sound like a good plan?

ivan: I would also welcome if someone else took on editorial jobs related to this minimal stuff
... so that we have a text that might actually replace the current section we have on affordances, etc.
... matt will have quite a lot of work already to deal with our infoset choices
... so I think it's unfair to expect matt to do this also
... so help wanted!

<scribe> scribenick: duga

gpellegrino1: Reply to George: JS should check if browser is WP aware
... like jQuery does
... Allows polyfill of specific features

George: That means each affordance can be uniquely identified
... Ran into this with footnotes, with JS footnotes vs RS finding them

Hadrien: We have been talking about MVP, but in terms of JS don't want to test if WP aware
... Instead want to test for features
... Transition from non-WP aware to WP aware is also important
... We haven't discussed it but there is a UX issue
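
Testing for individual features rather than for "WP awareness" as a whole could look roughly like the sketch below. Everything named here is hypothetical: the `supports` hook, the `UaCapabilities` shape, and the affordance names are invented for illustration, and no such query mechanism exists in the draft.

```typescript
// Hypothetical affordance names, used only for this sketch.
type Affordance = "toc" | "readingOrder" | "search";

// Hypothetical capability surface a UA might expose; not part of any spec.
interface UaCapabilities {
  supports?: (affordance: Affordance) => boolean;
}

// Return the affordances the publication's own script would still need to
// polyfill, by asking about each feature individually instead of asking
// "is this UA WP-aware?".
function affordancesToPolyfill(ua: UaCapabilities, wanted: Affordance[]): Affordance[] {
  return wanted.filter(
    (a) => !(typeof ua.supports === "function" && ua.supports(a))
  );
}
```

A UA with no `supports` hook at all is treated as providing nothing natively, so the script would polyfill every requested affordance. This matches George's point that each affordance would need to be uniquely identifiable.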

tzviya: Josh and Franco to find the MUSTs for MVP
... in the future we will do the same with SHOULD
... Need volunteers and an editor for Affordances section

garth: Wanting to define a minimal list of musts
... currently we have 2 and third, is that enough?

tzviya: No
... Break!

marisa: Does WebIDL have a way to query if a specific affordance is available?

Ivan: No

<scribe> scribenick: duga

tzviya: Reminder about dinner stuff

garth: Google covering dinner

Boundaries

tzviya: Reading from the agenda

dauwhe: Need to put some effort into describing this in an operational way

<dauwhe> https://github.com/w3c/wpub/issues/194#issuecomment-428662128

dauwhe: Say I am in a WP context and click a link to another WP?
... what happens then?
... How do we discard a manifest?
... Easy to talk about what boundaries are, but what do they mean?

leonardr: The concept I agree with
... Thinking about boundaries from UX perspective
... eg my goal is to search this publication - what is a WP to accomplish that
... Look at use cases as they relate to boundaries

bigbluehat: Dave mentioned UX. We talked about constraints on UAs
... Biggest one we have to consider is what happens when you cross the boundary
... There is some precedent in the web, eg web manifest
... inverse is iframes, you pull things into your scope
... third is target="_blank" which insists you leave the publication
... Anything out of scope takes you to a browser context

Hadrien: Glad to hear this example, we had inter document linking discussions at epub
... Nice to finally be able to link between pubs
... Earlier we didn't have the right terminology to discuss the boundaries
... No longer true. The scope is now expressed.
... Agree we have established patterns for what happens
... no need to reinvent the wheel
... When you are no longer in the bounds of the pub, the affordances are no longer available

tzviya: We seem to be agreeing
... Goal is to address issue 205
... Maybe we are done with this?

<Zakim> tzviya, you wanted to make sure we define rules and UX separately

tzviya: maybe we just need to be more specific about what happens and the UX for when we leave the pub

garth: Searching - maybe that is a should, clearly you need bounds for that
... Are we comfortable saying we are now done?

leonardr: Concerned we are talking about 2 different things regarding bounds
... 1 is what the UA understands are the bounds (eg for search)
... Seems clear why we need that
... Issues around exiting and entering is a completely different issue
... Nothing to do with the actual bounds
... Just a UX issue, which is still important
... Look at both, but do not combine

romain: Security issues - what happens when you move between origins
... Origin historically undefined in epub world
... Pubs can share local storage, etc
... Bounds is an opportunity for us to tackle this issue
... Do you have to examine every resource to determine origins?
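
A rough sketch of what "examining every resource to determine origins" might involve, using the standard `URL` API. The function names and the example URLs in the usage note are assumptions for illustration only.

```typescript
// Group a publication's resource URLs by origin (scheme + host + port).
// Relative references are resolved against `base`, e.g. the manifest URL.
function groupByOrigin(resourceUrls: string[], base: string): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const ref of resourceUrls) {
    const url = new URL(ref, base);
    const list = groups.get(url.origin) ?? [];
    list.push(url.href);
    groups.set(url.origin, list);
  }
  return groups;
}

// True when every listed resource shares a single origin.
function isSingleOrigin(resourceUrls: string[], base: string): boolean {
  return groupByOrigin(resourceUrls, base).size <= 1;
}
```

For instance, a list containing `ch1.html` and `https://cdn.example.com/fonts/a.woff`, resolved against `https://example.org/book/manifest.json`, spans two origins. This is the case where shared local storage and similar questions arise.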

<Ralph> 205 - We need a section of the document that explicitly defines the bounds of a publication

ivan: Issue of bounds depends on what we discussed before the break
... May be ok to say search is only for things in the resource lists
... but may not be true for offlining

Hadrien: But those are the same bounds?

ivan: But do we need to list eg CSS

dauwhe: case 194 talks about links to items in multiple pubs?
... Do you need to define the various combinations of navigation actions?

bigbluehat: How ready are we to decide things like experiential actions like
... clicking on something outside of bounds is different than inside?
... Are we at a place to deal with that now?

liisamk: Yes, we are!
... If you are in the bounds you should know that
... There should be some experiential way of knowing I am navigating from something I "own" to something I don't

dauwhe: Important web principle - how can a user trust their content (or not)?

<bigbluehat> +1 to experience mapping to user trust

dauwhe: web app manifest has a lot on this, about indicating to user they are in some special mode

bigbluehat: There was a mention about web apps being similar but non-standard
... There is no consistency promise
... Web pubs should have more trust - clear you are in the pub
... Adding that expression of trust is valuable to publishing

tzviya: We are revisiting why we need boundaries
... But we have already discussed that
... Need for security, offlining, wayfinding, etc
... Need to focus on how not why, people!

laudrain_: User trust is fine, but also need to consider author trust
... Something has been "published"
... Bounds are important to verify that

Hadrien: In the case of web apps, it is similar to how epub RS often work
... You have a context, when you go out of it, may open a browser or web view
... so you are now in a different UX context
... Compared to web once I have switched, I don't have the same expectation of how I get back
... Web apps often don't really support back
... From a UX standpoint fairly common way of handling bounds

leonardr: Take a use case, say offlining
... Use as an example to figure out what we need
... Have default reading, resource list, etc
... [reading from spec]
... The bounds are defined as the union of resource list and default reading order
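
The union rule read out here can be illustrated with a minimal sketch. The `readingOrder` and `resources` property names and the `BoundedManifest` shape are assumptions for the example. Ignoring fragments when comparing URLs is also a choice made by this sketch, not something stated in the discussion.

```typescript
// Hypothetical manifest shape; property names are assumed for illustration.
interface BoundedManifest {
  readingOrder: string[];
  resources: string[];
}

// Resolve a reference against a base URL and drop any fragment.
function normalize(ref: string, base: string): string {
  const url = new URL(ref, base);
  url.hash = "";
  return url.href;
}

// Bounds = union of the default reading order and the resource list.
function computeBounds(manifest: BoundedManifest, manifestUrl: string): Set<string> {
  const bounds = new Set<string>();
  for (const ref of [...manifest.readingOrder, ...manifest.resources]) {
    bounds.add(normalize(ref, manifestUrl));
  }
  return bounds;
}

// Does a link target (resolved against the current document) fall inside the bounds?
function isInBounds(bounds: Set<string>, target: string, currentDocUrl: string): boolean {
  return bounds.has(normalize(target, currentDocUrl));
}
```

Under this reading, a stylesheet that is referenced by a chapter but not listed in either place is simply outside the bounds.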

<Zakim> bigbluehat, you wanted to suggest we write some UA reqs based on these bounds :)

bigbluehat: Before we go there
... We have avoided UA requirement so far
... Should we define those now?
... Things like leaving the pub, etc
... Should we just file issues and make Josh do them?

tzviya: Yes

dauwhe: Can we define expectations?

<ivan> guests+ Tess_O'Connor

dauwhe: Eg if you do search, these are the ones you should search
... Those can be tested
... Have an operational definition instead of saying "this is a boundary"
... Breaking the back button is really bad

Hadrien: I was just pointing out how web apps work

dauwhe: I would be unhappy with pubs that did that

josh: Is it possible you have something in bounds that is not in the reading order?

chorus_of_voices: yes

<tzviya> https://w3c.github.io/wpub/#resource-list

josh: Does that mean what Leonard says was wrong?

ivan: No, you had it wrong, it is the union
... I am fine putting this into a resolution
... and then we can close an issue
... it puts responsibility on the authors that if they expect eg offline to work
... then they better put the CSS in the resources
... Which is fine, but we need to decide
... I propose we do it now!

garth: Agree
... Better flesh out your resources

leonardr: I support that
... The things we need to iterate on are various things we have discussed

tzviya: Proposal: leave language as is, change from a note to text, close the issue

leonardr: Change the language to "this defines the bounds of the publication"

dauwhe: What happens if you have a resource that links to CSS outside the bounds

ivan: What happens today if you have something in the cache that refers to an external file?
... That is totes the same

dauwhe: I am ok with not required

tzviya: Objections?

romain: Need to understand what happens when there are multiple origins

ivan: Need extra constraints on resources

Hadrien: If we decide defining bounds is more convenient as one resource, then [something]

leonardr: You are viewing the bounds wrong
... The fact that you can reference external CSS is irrelevant
... Bounds are what the list says, not where they are
... How we deal with bounds is another question

bigbluehat: There is some prior art that is painful
... eg app manifest
... which is being replaced by service workers
... No master list, it just puts referenced things in the cache
... No need for boundaries
... Further constrains a service worker

<scribe> scribenick: bigbluehat

duga: these lists of resources are all great
... once upon a time there was a thing called epub
... we had a manifest
... then we had a package
... which also had a list of files within the zip format
... and the only thing that we got from the manifest
... was errors
... when you're not considering packaging, a manifest sounds great
... but when you get to packaging, you probably will hate the manifest

tzviya: do you want these to be the same in WPUB and EPUB4?

duga: probably not

<scribe> scribenick: duga

garth: A wp doesn't need to be packaged
... but we could make a rule that the list is expunged by the time we package
... I thought we were sort of close to agreeing that the bounds was the union

leonardr: Didn't we agree
... ?

liisamk: We did have call for objections
... I disagree with duga, I think it was very useful to have a list of resources
... and am not opposed to it in WP

dauwhe: Does the current web packaging spec have features that support WP?
... Is there something there that we should pay attention to?

<tzviya> dauwhe++

dauwhe: Having some alignment with web packaging would be a lovely alignment
... Hope we can coordinate with them

<romain> +1

<tzviya> The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.

<ivan> +1

tzviya: Can we agree with the statement in the spec now [reading from spec]?

tzviya: Do we agree?

<Avneesh> +1

<laudrain_> +1

<gpellegrino> +1

<wolfgang> +1

ivan: Do we close 205 with this?

tzviya: Yes?

<dauwhe> +1 in that this statement is necessary, but not sufficient. There is more work to be done with boundaries and user experiences

<wendyreid> +1

RESOLUTION: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication & close #205
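
To make the resolved definition concrete, here is a minimal Python sketch of the union. It assumes a simplified manifest in which `readingOrder` and `resources` are plain lists of URL strings; those key names, the example URLs, and the helpers `publication_bounds` and `in_bounds` are illustrative assumptions, not text from the spec.

```python
from urllib.parse import urldefrag, urljoin


def publication_bounds(manifest: dict, base_url: str) -> set:
    """Return the absolute URLs that belong to the publication.

    Assumption: 'readingOrder' and 'resources' are lists of URL strings.
    The bounds are the union of the two lists, so duplicates collapse.
    """
    urls = list(manifest.get("readingOrder", [])) + list(manifest.get("resources", []))
    return {urljoin(base_url, url) for url in urls}


def in_bounds(url: str, bounds: set) -> bool:
    """A URL is in bounds if it matches a listed resource once its fragment is dropped."""
    return urldefrag(url).url in bounds


# Hypothetical manifest: chapter1.html appears in both lists.
manifest = {
    "readingOrder": ["chapter1.html", "chapter2.html"],
    "resources": ["style.css", "chapter1.html"],
}
bounds = publication_bounds(manifest, "https://example.org/book/")
```

Under this sketch, a stylesheet on another host that is not listed is simply external to the publication, which matches the point that the bounds are what the lists say, not where the files live.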

tzviya: Objections?

<wendyreid> also +1 dauwhe

<leonardr> +q

bigbluehat: Not -1

<leonardr> +1

bigbluehat: Don't want to be the only negative one
... also discussed a CG for exploring this
... and kind of concerned about this
... Opposed because it is underexplored, has security ramifications, etc
... We are pushing ahead due to time, but we need an outlet to properly vet these things

<romain> +1 to what @bigbluehat said

wendyreid: Dave said something like that in his +1

<Hadrien> +1

ivan: If new problems come up, it is our right to reopen
... but don't want to keep issues open forever

<laurentlemeur> +1

<josh> +1

<garth> +1

ivan: at this moment uncertainty is bad

tzviya: Now proposing the PCG!

<laurentlemeur> +1 to Ivan. We have to study implication of this definition of boundaries but can use it as a ground.

tzviya: lunch time!

<ivan> scribejs, set danbri Dan Brickley

<ivan> scribejs, set addison Addison Philips

<tzviya> https://www.w3.org/community/blog/2018/10/22/proposed-group-publishing-community-group/

<scribe> scribenick: leonardr

<garth> scribenick leonardr

<ivan> scribejs, set r12a Richard Ishida

<ivan> guests+ danbri, r12a, addison

tzviya: quick introductions

tzviya: some technical issues; please use the mic

<tzviya> https://w3c.github.io/wpub/#language-and-dir

tzviya: three GitHub issues

language and base direction in JSON-LD

ivan: long running discussion that we've had with you folks in the past, about the need to represent
... the language of content as well as multi-lingual
... schema.org also needs to be able to understand and address whatever "standard" is used
... base direction of a publication is not well defined in schema.org
... (page) progression direction also needs to be reflected/described
... in the case of the WP, the UA needs to know which is "next"

addison: this also applies to Japanese books, for vertical ordering
... it's not the same thing as base direction (for bidi)

ivan: we do have them separately

<tzviya> https://github.com/w3c/wpub/labels/topic%3Ainternationalization

liisamk: are we also going to address mixed direction in a block? specifically in JSON, such as the title

ivan: let's start with the traditional lang and dir issues
... what we did there is to have two separate items (lang + dir)
... but there is gossip to change that

addison: what is the context, as there may be the case.
... metadata about the publication, yes?

ivan: we don't do anything with the content (that's HTML). this is indeed about metadata
... we set a global language via existing schema.org
... inLanguage?

tzviya: clearing up some issues here...
... we are talking about specific tags in JSON-LD and possibly schema.org, which danbri may have input into
... we are particularly talking about how to express the language and/or direction of some tags. For example, the title or author of a document.

addison: the problem that you are encountering is similar to those of other groups.
... if we have a piece of natural lang text, we need to be able to apply language and the base dir of that text.

<r12a> here is our base reference: https://w3c.github.io/string-meta/

addison: (sometimes you may not need one or the other, but you want to be able to set them when important)
... there are presentation cases where you need to know the proper things to do (eg. font selection)
... we recommend that each tag have its own set of values for this. Our string-meta doc tries to address this
... we agree that you need to transport both

ivan: in the current draft, we have a global setting, but we also want to enable overriding for specific items.
... inLanguage from schema.org but they don't have an "inDirection" tag, so we added our own
... but this is an issue because we had to introduce our own. However, there is a bigger issue
... the individual override. Using @value and @language in JSON-LD
... but we don't believe that schema.org understands this
... (see gkellogg who wrote that spec)
... the online tool for schema.org wasn't happy about it
... but direction is even more complicated, because JSON-LD doesn't have this natively.

gkellogg: because JSON-LD is just RDF, then the language is coming from base. but RDF has no direction concept, there is nothing to come from
... but if a future RDF had it, then JSON-LD would get it. (but not in the plans right now)

<r12a> https://w3c.github.io/string-meta/#script_subtag

ivan: going back a bit, the gossip says that for this restrictive usage we could "get away" with just the Lang tag.

<bigbluehat> ack q?

r12a: if there is no other way to do it, then you could assume direction from language tag, if you had script info as well.
... or could reliably guess. Hebrew, for example, might include info about the direction

script - ISO 15924

addison: inferencing the direction is not the same as actually having one

<danbri> are there cases where a single script could be written several ways (eg. maybe japanese sometimes vertical etc?)

addison: esp if you are counting on all downstream processing to do the same (right?) thing
... the challenge is to think of cases where the title is in a lang but non-standard direction

tzviya: I can give you lots of examples!

addison: you can imagine a doc where inferring direction would cause things in the wrong direction.
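
A minimal Python sketch of the inference being discussed: guessing a base direction from the script subtag of a BCP 47 language tag. The list of right-to-left scripts is a small illustrative subset of ISO 15924 codes, and the function names are assumptions for this example. It returns "auto" when no script subtag is present, which is exactly the case where inference breaks down.

```python
# Illustrative subset of right-to-left ISO 15924 script codes (not exhaustive).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo", "Adlm"}


def script_subtag(lang_tag: str):
    """Return the four-letter script subtag of a BCP 47 tag, or None if absent."""
    parts = lang_tag.split("-")
    for sub in parts[1:]:
        if len(sub) == 1:
            # A singleton starts an extension or private-use sequence; stop looking.
            break
        if len(sub) == 4 and sub.isalpha():
            return sub.title()
    return None


def guess_direction(lang_tag: str) -> str:
    """Guess 'rtl' or 'ltr' from the script subtag; 'auto' when there is none."""
    script = script_subtag(lang_tag)
    if script is None:
        return "auto"
    return "rtl" if script in RTL_SCRIPTS else "ltr"
```

With this sketch, `az-Arab` and `az-Latn` resolve to different directions, while a bare `az` or `he` yields only "auto". A guess of this kind also says nothing about a title whose direction deliberately differs from the norm for its script.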

ivan: the reason why I would like to avoid the direction tag is that it gets rid of a headache.

r12a: our preference is to have a separate label for each string.
... but there could be any number of strings and each one needs the same treatment

gkellogg: using HTML literals as string values?

ivan: we considered it but punted on that for other reasons

gkellogg: there is another level of indirection also possible but also not used commonly

ivan: setting the lang for each literal is not a problem, we know how to do that from JSON-LD perspective
... but also having the direction is the problem

addison: we get that it's a problem but you need that info if the string is ever going to be displayed
... otherwise, things may not actually line up properly in display. This is avoidable.
... we do, however, consider this as a major flaw/limitation in RDF/JSON/JSON-LD

ivan: yes, but we are at the end of the stick
... the only thing we can set today in JSON-LD is lang

danbri: there are several dynamics happening here...
... while I am happy to put stuff into schema.org, it may not be the right place
... this group seems to be doing application modelling as well and I can help you do that too
... let's not confuse the two

ivan: I don't see a proper solution here

danbri: cleanup issues in the early 2000s led to the current state and we're not going back there

tzviya: I like getting people angry

Hadrien: what we have right now is similar to your proposal but it wouldn't be understandable by generic processors
... and that is a big part of our concern. we don't necessarily want to go out on our own, we'd rather use general stuff
... esp. if it causes failures by general processors

bigbluehat: concerning the suggestion of using HTML syntax for the strings, ivan is not a fan.
... I'd like to understand why we don't want to go down that path.

<Zakim> danbri, you wanted to discourage notion of a "Schema.org processor". There are applications (and families of application that share infrastructure) which use Schema.org terms and

<duga> +1 to bigbluehat

danbri: there is no such thing as a "schema.org processor". there are search engines, but that's not a general case

ivan: can we take the position that we can use any valid JSON-LD and have it handled?

danbri: no, not aware of a full JSON-LD processor in use today for schema.org
... there are specific use cases where specific needs of JSON-LD are used (or not).
... at Google, we picked a specific subset of things and Bing probably as well

ivan: what we have tried to do so far is to ensure that what we want to do is understood by an existing tool and if not...

Hadrien: we have discussed the HTML route many times, we are concerned that many UAs that would ingest these literals would not know what to do with the HTML anyway
... it would convert it down to a string and would end up dropping the useful bits

laurentlemeur: there is also an issue if improper elements are used
... adding HTML elements de-simplifies JSON

r12a: there are three levels - global, per element, inter-element
... HTML already knows how to do that and the Unicode controls aren't well supported by browsers

ivan: we don't use HTML for any strings

addison: Ruby shouldn't be needed for presentation of metadata but would be useful for sorting
... yes, if you use HTML, you need to then have it parsed and understood.

<Zakim> bigbluehat, you wanted to ask which strings we're needing this for

bigbluehat: it might be helpful to know which strings we are thinking about here. What other data could this apply to?

addison: author, publisher, title

tzviya: let's look at the issues

ivan: they are all around the same issues

tzviya: proposal 1: going to HTML. (but that has been nixed by a few people)

ivan: do we think our UA/RS folks are willing to deal with the HTML?

tzviya: would UAs rather see it as HTML or JSON/JSON-LD? what are common flows?

addison: they have fields with the values and not HTML

you don't get anything for free from JSON-LD but it still works with those processors

<danbri> re Google SDTT, consider an example like https://gist.github.com/danbri/010ee9afeb48806c857775d062caf3ed... it is OK by the Google tester but effectively useless. SDTT is good for testing to see if specific data examples match the information needs of specific google tools, and it also catches some low-level errors.

<Zakim> bigbluehat, you wanted to propose a narrower sub-set of HTML focused on language expression

bigbluehat: social web working group uses the HTML representation (or a subset thereof)
... we should probably also set the set of valid values

<Ralph> Leonard: Adobe's XDM handles some of these issues

leonardr: we follow string-meta recommendations

danbri: there seems to be nothing in between HTML and "plain text".

ivan: we are not in the position to make a new RDF datatype
... we could define a subset of HTML but that would need to be validated during workflows

bigbluehat: in practice, *lots* of folks had the same issue that HTML is too big and scary and don't expect folks to use it
... and ended up using a subset of HTML

tzviya: we need something robust but not as scary as HTML

garth: is having our special stuff ignored a bad thing?

tzviya: why can't we change JSON-LD?

bigbluehat: because JSON-LD is modelled on RDF which we can't change
... why is violating JSON-LD OK?

ivan: because implementors could just ignore the validation errors
... the reason for JSON-LD was so that the metadata would be understood by search engines and other schema.org aware systems. So why violate it?

bigbluehat: but maybe Google or bing will index it anyway

<Ralph> [ JSON punted on the lang/dir issue. RDF tried to punt on it as well, expecting the underlying serialization to handle it. When the first serialization was XML then the RDF punt nearly worked. An underlying serialization in HTML handles lang/dir as well as (potentially) markup for SVG, MathML, ... ]

r12a: HTML was one option. JSON-LD was another. but there is another possibility

gkellogg: the three options I heard. Inside a tag, you have to use HTML (or the like). For the entire tag, use a script as part of the language tag (as that is valid RDF). Or use a fully structured object with lang and direction, which is also valid JSON-LD/RDF
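
The three options can be put side by side in a short Python sketch, written as the JSON values a manifest might carry for a title. The first two use standard JSON-LD value-object keywords; the key names in the third (`value`, `language`, `direction`) are hypothetical, since no such structure had been agreed at this point, and `option_of` is only an illustrative helper.

```python
RDF_HTML = "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML"

# Option 1: markup inside the string, typed as an HTML literal.
title_as_html = {
    "@value": '<span lang="ar" dir="rtl">عنوان</span>',
    "@type": RDF_HTML,
}

# Option 2: a plain literal whose language tag carries a script subtag.
title_with_script = {"@value": "عنوان", "@language": "ar-Arab"}

# Option 3: a structured object with explicit language and direction.
# These key names are hypothetical, chosen only for this sketch.
title_structured = {"value": "عنوان", "language": "ar", "direction": "rtl"}


def option_of(title: dict) -> str:
    """Classify which of the three representations a title value uses."""
    if title.get("@type") == RDF_HTML:
        return "html-literal"
    if "@language" in title:
        return "language-tagged-literal"
    if "direction" in title:
        return "structured-object"
    return "unknown"
```

Only the first option can mark up a change of language or direction inside the string. The second keeps a simple literal but leaves direction to be inferred. The third carries both explicitly but is no longer a simple literal.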

ivan: this means moving away from "simple literals"

bigbluehat: the situation of feeding search engines, I've tested things and Google seems to handle the HTML values in JSON-LD - for some definition of "handle"

the alternative, the complex objects, will not index by the search engine

gkellogg: if there was a standard "indirect object" that would help

liisamk: doesn't ONIX use HTML for the strings?

tzviya: it's XML not HTML. (ONIX is a common vocab for books)

Matthias_Kovatsch: what are the cases where you can't determine the direction from the lang?

<danbri> re JSON-LD, could we use a complex "datatype" to carry lang+direction together? (ugly and horrible and wrong...)

<garth> ONIX: https://www.editeur.org/83/Overview/

addison: writing mode is different. examples like Azerbaijani can be written in either Latin or Arabic script

ivan: what are the real practical cases that we would have in publishing if we *only* used the lang tag?

r12a: you would have to ensure a script tag!

addison: the Unicode/ICU approach also has some options (???)

<Ralph> LeonardR: when you mention Unicode, you're not referring to the deprecated substring stuff, are you?

<Ralph> Addison: I'm referring to BCP47

<Ralph> ... separately, Unicode bidi control characters that could be used in plain strings outside of markup

<Zakim> tzviya, you wanted to ask where we go from here?

<Ralph> ... modern changes on isolating controls are not yet widely in use, though that's what you'd really want to recommend

<danbri> FWIW Google's JSON-LD parser doesn't complain if it sees "name": "Stan <em>Dinkley</em>", etc., but nothing in Schema.org says when/whether to treat '<' as markup vs part of the content. Most of our *applications* would strip or otherwise sanitize it. But that is an application-level decision.

<Ralph> RichardIshida: we really need isolation control to make bidi work
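
For reference, the isolating controls mentioned here are U+2066 (LRI), U+2067 (RLI), U+2068 (FSI) and U+2069 (PDI). A minimal Python sketch, assuming the base direction is already known from separate metadata, of how a consumer might wrap a plain string before inserting it into surrounding text. The helper name `isolate` is an assumption for this example, and this is a display-time fallback, not a substitute for carrying the direction as metadata.

```python
# Unicode directional isolate controls.
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

_OPENERS = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate(text: str, direction: str = "auto") -> str:
    """Wrap text in an isolate matching the given base direction."""
    if direction not in _OPENERS:
        raise ValueError(f"unknown direction: {direction!r}")
    return f"{_OPENERS[direction]}{text}{PDI}"
```

With "auto", the first-strong isolate leaves the direction to be detected from the first strong character of the string.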

tzviya: what happens next?
... danbri suggested evaluating underused attrs like "role"

<tzviya> http://blog.schema.org/2014/06/introducing-role.html

danbri: we tried that, but it wasn't a success
... nobody likes it

<Zakim> bigbluehat, you wanted to mention the hard case is the mixed language title

bigbluehat: I still feel the only way to handle this, esp with mixed language, is to use (a subset of) HTML

<danbri> here's a real world json-ld schema.org mixed language name/title from MusicBrainz, https://search.google.com/structured-data/testing-tool?url=https://musicbrainz.org/work/e664139f-6fb5-4aaf-91f1-3c109753c7ea#url=https%3A%2F%2Fmusicbrainz.org%2Fwork%2Fe664139f-6fb5-4aaf-91f1-3c109753c7ea

laurentlemeur: I want to go the exact opposite way, to not use HTML but standard data elements
... but that does not solve mixed language

<danbri> (currently {"name":"(Si Si) Je Suis un Rock Star"})

<Ralph> Leonard: do we actually know what was indexed (by Google) in Benjamin's experiments?

<glazou> hehe

<Ralph> bigbluehat: removing the tags and keeping only the content gives the wrong result

r12a: if you did it using the HTML, you would have to do it all the time for each string

<Zakim> danbri, you wanted to comment on the 'how google does it' bit

danbri: I would move away from "how Google indexes things", but instead consider that stuff comes in and then out as a "triple"
... and then sent downstream where some processors may strip out bits and others may not

ivan: the perfect is the enemy of the good
... so what is the 80/20 cut on this one?

<r12a> example of strings that need base direction: في HMTL5 يتم تحقيق ذلك بإضافة العنصر المضمن bdi.

ivan: are we willing to sacrifice some of these requirements (eg. mixed language) in favor of supporting the others more simply?

david_clarke: an example in titles such as my text book

ivan: this is where Unicode directionality might help

r12a: you are talking about a different issue. applying bidi inside a string with the Unicode dir chars is fine. *but* that doesn't address the base direction
... my example earlier shows up the problem with missing base dir

addison: when passing data around, when you need the data, you need it! and once computed, you want it understood the same way in all cases
... the options discussed here are all valid ones but you need to decide which pain points you are willing to accept

marisa: another example in arabic...numerals. you switch the directions with numerals

ivan: there is no ideal solution. "which finger should I bite?"

bigbluehat: I have a starting point from PDF for that, which we use for rich text strings...

<Zakim> bigbluehat, you wanted to propose that we write a simple HTML subset for strings in JSON-LD

bigbluehat: propose that we write a simple HTML subset for strings in JSON-LD

danbri: we'd use that if you come up with it!
... alternatively, would you like us to add an inDirection property in schema.org to help move things along?

<Zakim> danbri, you wanted to ask if you'd like a tentative/experimental (i.e. "pending" review) inDirection property in schema.org to help move things along? Is there a definition to use?

<r12a> s/making proposals/writing down proposals/

ivan: will make sure to contact danbri with a formal proposal

<ivan> proposed: we propose for schema.org to add an inDirection term alongside inLanguage, with value of "rtl", "ltr", "auto"

DanielWeck: if we were to carry the HTML in the string literal, then the RS needs to process that? How would you know that it is HTML?

ivan: you would know that from the RDF/JSON-LD
... calling for objections for my proposal?

<garth> Crickets

RESOLUTION: we propose for schema.org to add an inDirection term alongside inLanguage, with value of "rtl", "ltr", "auto"

<Ralph> [acknowledging that this resolution doesn't address multiple-direction literals]
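
A minimal Python sketch of what the resolution would look like in a manifest, and how a reading system might check it. `inDirection` is the proposed term and was not part of schema.org at the time of this meeting. The sample manifest, the helper `check_direction`, and the choice to treat a missing value as "auto" are assumptions made for this example.

```python
ALLOWED_DIRECTIONS = {"ltr", "rtl", "auto"}

# Hypothetical manifest fragment using the proposed term.
manifest = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "ספר לדוגמה",
    "inLanguage": "he",
    "inDirection": "rtl",  # proposed term, values per the resolution
}


def check_direction(data: dict) -> list:
    """Return a list of problems with the inDirection value (empty if fine)."""
    problems = []
    # Assumption: a missing inDirection is treated as "auto".
    direction = data.get("inDirection", "auto")
    if direction not in ALLOWED_DIRECTIONS:
        problems.append(
            f"inDirection must be one of {sorted(ALLOWED_DIRECTIONS)}, got {direction!r}"
        )
    return problems
```

As the note above says, a single value of this kind covers only the base direction and does not help with strings that mix directions.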

DanielWeck: this is not just RS, but authoring, etc. all have to deal with whatever we proposed

<danbri> thanks, noted in https://github.com/schemaorg/schemaorg/issues/2086

<DanielWeck> (entities, whitespace normalization, etc.) the HTML processing model

<Zakim> bigbluehat, you wanted to ask that we also record the sub-set of HTML proposal

ivan: if we do a pure HTML datatype, then anything could be allowed

bigbluehat: which is why I want a specialized version

duga: there is a history here with EPUB that the Japanese couldn't represent some titles - let's not do that again

<Ralph> [is it plausible to sanitize the HTML before putting it through a full HTML parser? Likely reading systems don't want to have to write a separate limited HTML parser?]

tzviya: we have a proposal to work up a small subset of HTML that could be used

<Ralph> Leonard: if someone is proposing to pursue a formal proposal, we should let them do it

<Ralph> Benjamin: I'm writing it now

<ivan> proposed: allow literals (title, publisher, creators) to be expressible using an HTML datatype, restricting the HTML to a subset

<bigbluehat> PROPOSED: to create a sub-set of HTML--narrowed to multiple language tags--to use within all text strings used within the infoset/manifest

PROPOSED: to create a sub-set of HTML--narrowed to multiple language tags--to use within (a set of TBD) strings within the infoset/manifest

tzviya: any objections?

ivan: no one is saying that you *must* use the HTML. just that it would (possibly) be an option

addison: please make sure to include us in those discussions
... we want solutions for the web at large

<danbri> FWIW I asked in #tpac-chat, tantek noted WICG draft around sanitization, https://github.com/WICG/purification

tzviya: break time!
... thanks to all our guests.

George: we also have a11y items outstanding as well. don't forget about them.

<Rachel> scribnick: Rachel

schema.org issues

<ivan> scribenick: Rachel

<ivan> https://github.com/w3c/wpub/wiki/Schema.org-issues

Ivan: we've started with some problems in which we started to define our own context and terms.
... in publishing the order of authors, publishers, translator etc is deadly serious
... currently, these terms are not ordered
... we put these in an ordered list to meet our needs - is this an acceptable solution for schema.org

danbri: lists have always been a pain in RDF... schema.org made an item list construction. 6 months ago we went through an exercise with JSON-LD. there was consensus that the end result was prettier.
... that's as far as we got
... we would introduce new ones, not turn the current ones into lists

ivan: so for the time being it's fine to keep it as it is now

danbri: the most obvious one is recipe instructions

ivan: it will be a timing issue

<danbri> re lists in schema.org, see https://github.com/schemaorg/schemaorg/issues/1910

ivan: we will skip over language setting

tzviya: accessibility issues go back to the first idpf proposals for tags that sit in a non-normative document about certification metadata
... we have outlined what we want
... it picks up conformance rules from dublincore
... certified-by, certified-credential, certified-report

danbri: my concern would be overlap

<garth> https://idpf.github.io/epub-vocabs/package/a11y/#sec-certifierCredential

danbri: This might overlap with the credential work (certifier)

tzviya: we purposefully didn't add accessibility, to keep it generic

wendyreid: certified-by points to the org, which is a recognized org
... for accessibility

leonardr: it should be tied to accessibility

danbri: our preference is to say it's a relationship between two things
... certified credential carries some of that work

gpellegrino: we also need a URL, because it is specific to that publication

<Ralph> Ralph: certification should be a URL so that you can look up properties of the certifier; e.g. org, specifics about the certification

leonardr: if the document is certified to 2 purposes but you only have one set of metadata how do you address that

tzviya: have the field repeat

danbri: what is the purpose

tzviya: to clarify who is saying that something is accessible and what they are using to say it

danbri: if you have multiple things being investigated, you don't know which report is which
... you might need to hack something

<Ralph> [we should also look for overlap with Verifiable Claims work]

<danbri> EOCred work --- https://www.w3.org/community/eocred-schema/

<Ralph> Rachel: re: purpose -- in education this comes up repeatedly; we have two situations -- an actual certification program run by Benetech

<Ralph> ... Benetech will certify that a publisher is producing accessible ebooks

<Ralph> ... and many campuses are requiring publishers to self-certify; accepting responsibility for fixing problems

danbri: it's about each particular published thing
... rather than the publisher
... we have this corner of schema.org where we can throw things and see if they work

<danbri> have a look at this: http://pending.eocred-1779.appspot.com/EducationalOccupationalCredential

tzviya: there is some potential overlap with verifiable claims and open badges
... if a publisher is saying "I assert this is accessible" how do we verify this?

ivan: languages - in json-ld there is something easier to use, called language mapping, from an authoring point of view...???

<Ralph> [Ivan is now discussing https://github.com/w3c/wpub/wiki/Schema.org-issues#language-indexing Language indexing ]
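
For context, a JSON-LD language map lets an author key values by language tag once the term is declared with `"@container": "@language"` in the context. The Python sketch below shows the authored form and a hand-written expansion into the equivalent language-tagged values. It is an illustration only, not a JSON-LD processor; the sample titles and the helper `expand_language_map` are assumptions for this example.

```python
# Context declaring "name" as a language map.
context = {
    "name": {"@id": "http://schema.org/name", "@container": "@language"},
}

# Authored form: one value per language tag.
authored = {
    "@context": context,
    "name": {"en": "The Raven", "fr": "Le Corbeau"},
}


def expand_language_map(lang_map: dict) -> list:
    """Turn {"en": "x", "fr": ["y", "z"]} into a list of language-tagged values."""
    expanded = []
    for lang in sorted(lang_map):
        values = lang_map[lang]
        if isinstance(values, str):
            values = [values]
        for value in values:
            expanded.append({"@value": value, "@language": lang})
    return expanded


names = expand_language_map(authored["name"])
```

A consumer that reads the JSON without processing the context sees `name` as an object rather than a string.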

Hadrien: you need to be able to process the context

danbri: I'm not aware of anyone in the search industry doing complex processing

Ivan: can you find out if this is planned? is it a good idea to use this index?

Hadrien: are there plans to support json?

danbri: that would be company by company

bigbluehat: [starts troubleshooting for the json working group even though he has his own meeting]

ivan: we need something close to the link role

<Ralph> LinkRole

ivan: even though it's experimental
... what we started to do is that we defined our own type
... it is almost the same except linkrole cannot accept mime types
... I raised an issue about this on schema.org

danbri: encoding format shouldn't be used on the link role

ivan: the real question is encoding format

danbri: linkrole is still pending?
... I need to check on the mechanics...

ivan: It would be better for us to rely on schema.org vocabulary than inventing our own here
... audio

<Ralph> [audio] duration values

ivan: the html equivalent is more relevant than schema.org

wendyreid: duration in schema refers more to a CV - as in I was in a job from jan 2017 to feb 2018

<danbri> see lower section of https://schema.org/Duration for properties whose value is datatyped schema.org/Duration, which does lean towards 8601

ivan: we should allow for the iso standard to be used

<Ralph> [HTML 5.3] Durations

wendyreid: we would like to extend the iso standard to be used

<danbri> whereas https://schema.org/temporalCoverage covers date ranges, and we had some issue about openended ranges

<danbri> https://github.com/schemaorg/schemaorg/issues/2086

<danbri> next step is to follow up https://pending.schema.org/duration and clarify whether it is a temporal quantity, versus a period (temporalCoverage); could be clarified
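
As an illustration of the ISO 8601 duration format under discussion, a minimal Python sketch that converts a duration such as `PT1H30M` into seconds. It deliberately handles only days, hours, minutes and seconds, because years and months have no fixed length; that restriction and the function name `duration_seconds` are assumptions for this example.

```python
import re

# Days, hours, minutes and seconds only; years, months and weeks are not handled.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def duration_seconds(value: str) -> float:
    """Convert a simple ISO 8601 duration (e.g. 'PT1H30M') to seconds."""
    match = _DURATION.match(value)
    # Reject strings with no components at all, such as "P", "PT" or "P1DT".
    if not match or value == "P" or value.endswith("T"):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {key: float(num) if num else 0.0 for key, num in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )
```

A value like `PT1H30M` is a temporal quantity, suitable for the length of an audio file, as opposed to a date range with a start and an end.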

ivan: general question - there is a large vocabulary for publications in schema.org, but new types of publications are unavoidable
... what steps should we take to add new types

<Ralph> mechanisms for new publication types?

danbri: we generally push to wikidata to do the work

ivan: proceedings is a good example

<danbri> https://www.wikidata.org/wiki/Q1143604

danbri: you can use wikidata directly

ivan: so in my json-ld, I would use Q1143604 instead of Proceedings?

danbri: yes

ivan: [suggests workaround Rachel doesn't understand]

danbri: that too

Issues

<danbri> workaround is "Proceedings" is a term in the surface syntax defined by a json-ld context, but maps to Q1234-style URLs
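
A minimal Python sketch of that workaround: the author writes the readable term "Proceedings", and a context maps it to the Wikidata item. The URL is the one pasted earlier in the minutes; whether that page URL or another Wikidata URL form should be the mapping target is left open here, and `resolve_type` is an illustrative helper, not a JSON-LD processor.

```python
# Context mapping a readable term to the Wikidata item linked in the minutes.
context = {
    "Proceedings": "https://www.wikidata.org/wiki/Q1143604",
}

# What the author writes: a friendly type name plus the context.
document = {
    "@context": context,
    "@type": "Proceedings",
    "name": "Example Conference Proceedings",  # hypothetical title
}


def resolve_type(doc: dict) -> str:
    """Return the URL a document's @type maps to, or the term itself if unmapped."""
    ctx = doc.get("@context", {})
    term = doc["@type"]
    if isinstance(ctx, dict) and term in ctx:
        mapping = ctx[term]
        # A term definition may be a plain string or an object with "@id".
        return mapping["@id"] if isinstance(mapping, dict) else mapping
    return term
```

The surface syntax stays readable for authors, while processors that apply the context see the Q-numbered URL.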

<Ralph> scribenick: Ralph

issues

Wendy: looking for things that could be closed ("propose closing")

<wendyreid> https://github.com/w3c/wpub/issues/261

Wendy: #261: use of "cover image"

Ivan: 'cover' is a structural property in the manifest

Leonard: cover is optional?

Ivan: yes

Wendy: any objection to closing this?

[none expressed]

Garth: we have to update the editor's draft now to close it

Ivan: yes; that's the mechanics

RESOLUTION: close #261

Wendy: consider #54

<tzviya> https://github.com/w3c/wpub/issues/54

Wendy: "Obtaining language from http headers"
... the issue is whether http headers can be fallback for determining the language of a publication

Tzviya: this is no longer relevant; it's from a long time ago

Rachel: let's add a comment saying this

Ivan: I'll clean the minutes then refer to them in a close comment

Wendy: #270
... Conformance criterion for UA

Ivan: we should add a reference to this morning's discussion
... and add a comment that it will be followed-up later

Wendy: not proposing to close this yet
... #291

<Rachel> https://github.com/w3c/wpub/issues/291

Wendy: Do we need a more detailed definition for the HTML TOC format?
... based on #285

Ivan: and we now have pagelist

Garth: there's a reference to a transcript of a meeting, though it didn't resolve the issue

Ivan: the question is whether we want to define a structure in HTML for TOC
... at the moment the spec doesn't say anything
... we have two extremes: the EPUB model and no restriction (whatever you can express as HTML)
... the discussion was that whatever we define should be machine processable; there should be a clear algorithm for extracting the TOC from the HTML
... the current spec stops at describing an ARIA tag
... it's relatively clear that an algorithm can be defined for what's in EPUB, or even for something more liberal
... on the other hand, nobody so far has come up with an algorithm to get a reasonable TOC from arbitrary HTML
... if you think something more liberal should be permitted, provide an algorithm

Juan: I've been thinking about an algorithm
... my thinking is to require UL/OL with LI and then require SPAN
... I've been looking at the Category content model and have an idea about explicit/implicit P
... I think there's text there about runs of phrasing content that could be used to define a generic algorithm
... I have a tree-walker algorithm that is very preliminary
... would welcome comments
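
To show what a tree-walker over a restricted structure could look like, here is a minimal Python sketch. It is not Juan's algorithm. It assumes the TOC uses only nested OL/UL lists whose items are labelled by an A or SPAN element, and it takes each entry's level from the list nesting depth. The class and function names are assumptions for this example.

```python
from html.parser import HTMLParser

LIST_TAGS = {"ol", "ul"}
LABEL_TAGS = {"a", "span"}


class TocExtractor(HTMLParser):
    """Collect (level, label, href) tuples from nested lists."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # current list nesting depth
        self.entries = []    # extracted (level, label, href) tuples
        self._label = None   # text pieces of the label element being read
        self._href = None
        self._nest = 0       # nesting of label elements inside the open label

    def handle_starttag(self, tag, attrs):
        if tag in LIST_TAGS:
            self.depth += 1
        elif tag in LABEL_TAGS and self.depth > 0:
            if self._label is None:
                self._label = []
                self._href = dict(attrs).get("href")
                self._nest = 1
            else:
                self._nest += 1

    def handle_endtag(self, tag):
        if tag in LIST_TAGS:
            self.depth = max(0, self.depth - 1)
        elif tag in LABEL_TAGS and self._label is not None:
            self._nest -= 1
            if self._nest == 0:
                text = " ".join("".join(self._label).split())
                if text:
                    self.entries.append((self.depth, text, self._href))
                self._label = None

    def handle_data(self, data):
        if self._label is not None:
            self._label.append(data)


def extract_toc(html: str) -> list:
    parser = TocExtractor()
    parser.feed(html)
    parser.close()
    return parser.entries


sample = """
<nav role="doc-toc"><ol>
  <li><a href="c1.html">Chapter 1</a>
    <ol><li><a href="c1.html#s1">Section <span>1.1</span></a></li></ol></li>
  <li><span>Part II</span></li>
</ol></nav>
"""
toc = extract_toc(sample)
```

Taking the level from nesting sidesteps the question of mixed heading ranks, at the cost of ignoring any TOC that is not built from lists.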

Tzviya: sounds interesting; I'd love to see it

Laurent: we have to permit round-tripability
... we must be careful

Liisa: what problem are we trying to solve?
... replacing the nav doc in some way that doesn't sound like addressing missing formatting in that doc
... the inline TOC and the navigation are not a map of each other
... often we include more in the nav than we include in the inline TOC
... we've been thinking that we might not create an inline TOC where the printed version didn't have one
... as a content item
... a renderable TOC is often very heavily designed and that doesn't get put in the nav
... as we're thinking about this, keep the option that they can be separate

Tzviya: nothing prevents you from creating an inline TOC
... in my ideal world they would be the same thing

Liisa: I'd like to have the machine-readable with the option to have the pretty one
... I don't necessarily want the pretty TOC to be the machine-readable TOC

Ivan: so you'll have two structures?

Liisa: yes

Ivan: that's still doable; you just have to identify via ARIA which one is the machine-readable one

Wendy: the pretty one is the one users look at; as a RS, I'd love to offer all the information that's there

Liisa: two separate problems: the amount of information and the prettifying of that

Ivan: can you write down (later) the algorithm and the corresponding HTML structure?
... if this can be written, and if it includes the usual structure in EPUB 3 and other possibilities, that's fine

Juan: I can do that

Luc: this is important both for TOC and for accessibility issues with TOC

Juan: I'm thinking that there's one container, a DIV, and a text node
... that text node becomes an H1
... I get lost with mixing H2, H3, ...
... I saw a proposal for something that takes the level from the nesting
... I don't yet know how to address this

Tzviya: others here might help in understanding the outline algorithm

Rachel: when we present a TOC in a textbook we generally have two of them: a detailed one and a brief one
... in our ebooks we have a third for navigation
... we present the nav TOC in the reader navigation and within that there are links to the brief and full TOC
... the full TOC has additional links that are not in the nav

<Bobbytung> hidden="hidden"

Rachel: we put these in the full TOC because students look for them while paging through a book

Benjamin: there are use cases for several TOC presentations
... is it a requirement that the machine version be extractable from the human-readable one?
... perhaps it's an imagemap

<duga> q+

Benjamin: are we trying really hard to make this possible without saying "the machine SHOULD ..."?

Leonard: similar question; I'm not sure why we need a machine readable TOC in WP
... I understand the need for a human-presentable one
... given that reading order and resources are both handled elsewhere, why does a UA need another TOC in addition?

Tzviya: we've answered this over the course of several meetings

Brady: briefly, for accessibility
... George can explain in more detail

Liisa: were we planning to be able to have the machine readable have extensible formatting?
... embedding styling, images, ...

Tzviya: this ties back to the JSON-LD discussion

Wendy: audiobooks have an example
... an audiobook in a language that does not have a text form
... the TOC might use voice prompts or images
... we'd want a machine-readable TOC to be able to say "there's no text for this chapter"

Luc: I'm concerned about the performance cost of computing a machine-readable TOC
... as a publisher, I have no issue with preparing both a machine-readable and a presentable TOC
... I don't care about duplication
... I care about production and performance for the user experience
... I have no issue with a complex presentation TOC

Laurent: a good discussion would be what the user experience in an RS is if there is no machine-readable TOC, only an HTML page
... the TOC could still be accessible but it might take the whole screen because the RS wouldn't know how to make it smaller
... would this be a good user experience?

Hadrien: we're discussing two concepts
... something nice, presented to the user in many ways
... and something else meant for accessibility or [other] specific UI features
... in EPUB3 we tried to have something serve both and failed
... I'm wondering if we shouldn't simply treat them as separate
... the machine-readable one may have some very specific information
... I'm not hearing yet a step forward

Ivan: Juan might come up with a more general and usable solution
... if JuansAlgorithm fails there can be a fallback for the UA to use it as is
... an imagemap can be marked as a TOC, JuansAlgorithm will fail, and Benjamin will get what he wants
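
A minimal sketch of the fallback Ivan outlines: try to extract a structured TOC and, if nothing usable comes out, hand the marked-up TOC to the UA unchanged. JuansAlgorithm does not exist yet, so `stand_in_extractor` is a hypothetical placeholder that only looks for A elements; `resolve_toc` is likewise an assumed name.

```python
import re


def stand_in_extractor(toc_html):
    """Hypothetical placeholder for the real algorithm: list (href, label) of A elements."""
    return re.findall(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>', toc_html, flags=re.S)


def resolve_toc(toc_html, extract=stand_in_extractor):
    """Return ("structured", entries) on success, else ("as-is", toc_html)."""
    try:
        entries = extract(toc_html)
    except Exception:
        # Any failure in the extractor triggers the fallback.
        entries = []
    if entries:
        return ("structured", entries)
    return ("as-is", toc_html)
```

With an imagemap TOC the extractor finds no A elements, so the UA receives the original markup and can display it as authored.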

Hadrien: one can mark up the same element with two different doc-XXX roles to denote the two different navigation approaches in one element

Tzviya: we get into a tricky area if you're thinking that the accessible version is different

Hadrien: that's not what I'm suggesting
... I was thinking at the manifest level

Tzviya: I think people will be happier with Juan's approach if it works

Ivan: is there another ARIA element?

Tzviya: not really

Dave: it's possible to be both smart and good looking :)

<bigbluehat> https://wileylabs.github.io/no-can-transclude/moby-dick-from-epub-samples/

Benjamin: ^^ shows what one can do
... it's an EPUB in which I renamed the nav file to index.html
... Avneesh says this is sufficient for accessibility
... next and previous are populated from what's underneath what you read
... it's not using LI or anything magical; just finding the next anchor
... keeps reading state

Tzviya: this is navigation, not the TOC

Benjamin: in this document those happen to be the same
... I claim that navigation can be as simple as next anchor
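
A minimal sketch of "navigation as next anchor", assuming the index page lists its links in reading order. Stripping fragments and dropping repeated targets are choices made for this sketch, not a description of what the linked demo does; `AnchorCollector` and `neighbours` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collect link targets in document order, one entry per resource."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urldefrag(href).url      # drop any #fragment
        if url and url not in self.hrefs:
            self.hrefs.append(url)


def neighbours(index_html, current):
    """Return (previous, next) resource URLs around `current`.

    Raises ValueError if `current` is not linked from the index.
    """
    collector = AnchorCollector()
    collector.feed(index_html)
    collector.close()
    order = collector.hrefs
    i = order.index(current)
    prev_url = order[i - 1] if i > 0 else None
    next_url = order[i + 1] if i + 1 < len(order) else None
    return prev_url, next_url
```

No list markup is consulted here: only the order in which anchors appear matters.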

Ivan: JuansAlgorithm might cover this

Juan: yes; I think this idea could still work

Benjamin: tree order can be determined from the parentage of each link

<Bobbytung> +1

Benjamin: if this were structured in an HTML DOM tree they could be represented in a machine-readable navigation

<Hadrien> my proposal is to have two different rel values at a manifest level, if a resource can be both "good looking" and "machine readable", it would simply use both rels
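
A minimal sketch of Hadrien's suggestion, with the manifest modelled as a Python dict. The rel values `toc-presentation` and `toc-machine` are hypothetical placeholders, as are the file names; no rel values were agreed at the meeting.

```python
# One resource serves both purposes, so it carries both rel values.
shared = {
    "resources": [
        {"url": "toc.html", "rel": ["toc-presentation", "toc-machine"]},
        {"url": "cover.jpg", "rel": "cover"},
    ],
}

# The designed TOC and the machine-readable TOC are separate resources.
split = {
    "resources": [
        {"url": "pretty-toc.html", "rel": "toc-presentation"},
        {"url": "nav.html", "rel": "toc-machine"},
    ],
}


def resources_with_rel(manifest, rel):
    """Return the URLs of all resources whose rel includes the given value."""
    found = []
    for res in manifest.get("resources", []):
        rels = res.get("rel", [])
        if isinstance(rels, str):   # accept a single string or a list
            rels = [rels]
        if rel in rels:
            found.append(res["url"])
    return found
```

A UA would ask for the machine-readable rel and get `toc.html` in the first manifest and `nav.html` in the second, without needing to know whether a designed TOC also exists.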

Benjamin: I believe this suggests that the algorithm is not very hard
... the machine readable thing is calculable
... this example calculates from the TOC
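
A minimal sketch of deriving a tree from the parentage of each link, without relying on LI or any other specific element. The rule used here, that a link nests under the nearest preceding link with fewer open ancestors, is an assumption made for illustration; `LinkDepths` and `link_tree` are hypothetical names.

```python
from html.parser import HTMLParser

# Elements that never have an end tag and so never stay open.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class LinkDepths(HTMLParser):
    """Record each link together with its number of open ancestors."""

    def __init__(self):
        super().__init__()
        self.open = []    # stack of currently open element names
        self.links = []   # (depth, href) in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append((len(self.open), href))
        if tag not in VOID:
            self.open.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, ignoring stray end tags.
        if tag in self.open:
            while self.open.pop() != tag:
                pass


def link_tree(html):
    """Nest each link under the nearest preceding link with a smaller depth."""
    parser = LinkDepths()
    parser.feed(html)
    parser.close()
    root = {"href": None, "children": []}
    stack = [(-1, root)]
    for depth, href in parser.links:
        node = {"href": href, "children": []}
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]
```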

<tzviya> proposed CG https://www.w3.org/community/blog/2018/10/22/proposed-group-publishing-community-group/

Tzviya: I think this discussion could proceed in the newly proposed ^^ CG

Garth: the TOC discussion belongs in this WG
... the discussion of options might come from the CG?

Tzviya: yes

Benjamin: I think the WG can keep the discussion

Tzviya: Juan will work on JuansAlgorithm
... Hadrien will work on a double TOC proposal

Garth: I'm skeptical that JuansAlgorithm can work
