Portable Web Publications: Technology Challenges

Ivan Herman, W3C

W3C Track @ WWW2016, Montréal, Canada

2016-04-13

These Slides are Available on the Web

See:
http://www.w3.org/2016/Talks/W3CTrack-IH/

(Slides are in HTML)

Is it a book? Is it a Web site?

Extract from “Big Java”, by Cay Horstmann, John Wiley & Sons, 2013

The main message:

Digital Publishing
=
Web Publishing!

Or, to put it another way…

Web Publishing
=
Digital Publishing!

What does this mean?

Portable Web Publication at a glance

The separation between publishing “online”, as Web sites, and publishing offline and/or packaged should be reduced to zero

ibta arabia

For example: book in a browser

Joseph Reagle's book as a web page
Extract of Joseph Reagle’s Book
  • On a desktop I may want to read a book just like a Web page:
    • easily follow a link “out” of the book
    • create bookmarks pointing “within” a page of the book
    • use the plugins and tools that my browser may have
    • create annotations
    • sometimes I may need the computing power of my desktop for, e.g., interactive 3D content

For example: book in a browser

Joseph Reagle's book as an ebook in reader
Extract of Joseph Reagle’s Book as EPUB
  • But, at other times, I may also want to use a small dedicated reader device to read the book on the beach…
  • All of this with the same book (not conversions from one format to another)!

For example: I may not be online…

Person sitting in a station with a mobile in hand
Bryan Ong, Flickr
  • I may find an article on the Web that I want to review, annotate, etc., while commuting home on a train
  • I want my annotations to be available online again once I am back on the Internet
    • note: some browsers have an “archiving” feature, but these are not interoperable

For example: educational publications

University hall with students, most of them with a tablet
Merrill College of Journalism, Flickr

Synergy effects of convergence

Advantage for the publishers’ community

Photo of a bookshelf with lots of technical books
Jeffrey Zeldman, Flickr
  • The main interest of publishers is to produce, edit, curate, etc., content
  • Publishers have invested heavily in technology development, but the Web developers’ community can complement that with a wider reach and perspective
  • Working closely with Web developers avoids reinventing the wheel

Advantage for the Web community

image of a medieval manuscript
Oliver Byrne's edition of Euclid, University of British Columbia
  • Publishers have experience in:
    • ergonomics, typography, aesthetics…
    • publishing long texts, with the right readability and structure
  • Workflow for producing complex content

But… why not rely only on the Web?
(i.e., forget about downloaded content, it is outdated!)

Several reasons…

How do we get there? (Technically)

Moyan Brenn, Flickr

Warning: everything I say is subject to change!

Catherine Kolodziej, Flickr

Technical Challenge: Fundamental Terminology

Web Publications

a collection of resources with different URL pointers
  • The current Web has the notion of a single resource:
    • conceptually, a single piece of data
      • HTML source, metadata, CSS style sheet, etc.
    • each has its own URL
  • Presentation is based on the interoperation of many such resources

Web Publications

a collection of resources in a 'blob' with one URL pointer
  • But publishers need the concept of a single Publication:
    • a collection of pages, together with the relevant CSS, images, video, etc., files
    • it is the collection that has a real distinct identity (URL), not its constituents

Formally

  • A Web Publication: an aggregated set of interrelated Web Resources, intended to be considered as a single entity, and which can be addressed on the Web as a unit (i.e., it is itself a Web Resource)

Portable Web Publications

More Formally

What kinds of documents are we talking about?

What kinds of documents are we not talking about?

But there are of course differences

Envisioned “states” of a Portable Web Publication

            Protocol Access                              File Access
Packed      PWP as one archive on a server               PWP as one archive on a local disc
Unpacked    PWP spread over several files on a server    PWP spread over several files on a local disc
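
The four states are simply the combinations of two independent choices: packed or unpacked, and protocol or file access. A minimal sketch of that model follows; the type and function names are invented for this illustration and are not defined by the PWP draft.

```typescript
// Illustrative model of the four envisioned states. All names here are
// assumptions made for this sketch, not terms defined by the draft.
type Packaging = "packed" | "unpacked";
type Access = "protocol" | "file";

interface PwpState {
  packaging: Packaging;
  access: Access;
}

// Produce the wording of the corresponding table cell.
function describeState(state: PwpState): string {
  const shape =
    state.packaging === "packed" ? "as one archive" : "spread over several files";
  const place = state.access === "protocol" ? "a server" : "a local disc";
  return `PWP ${shape} on ${place}`;
}

console.log(describeState({ packaging: "packed", access: "protocol" }));
console.log(describeState({ packaging: "unpacked", access: "file" }));
```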

Technical challenge: an overall architecture to handle PWP-s

Envisioned architecture:
a “PWP Processor”

Envisioned architecture:
unpacked state

Document consumed through the Web in a traditional way

Envisioned architecture:
cached state

Document consumed through a Service Worker, possibly cached

Envisioned architecture:
packed state

Document consumed through a Service Worker, possibly unpacked

Draft…

Is this approach at all feasible?

Advances in modern browsers: Web and Service Workers

Work in progress

A PWP Processor could be implemented as a Service Worker
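
To make the idea slightly more concrete, here is a minimal sketch of the decision such a processor might make for each request: serve the resource from a locally cached or unpacked copy when one exists, and fall back to the network otherwise. The class `PwpStore`, its methods, and the in-memory map are assumptions for illustration only; in an actual Service Worker this logic would presumably sit inside a `fetch` event handler and use the browser's Cache API rather than a map.

```typescript
// Sketch only: an in-memory stand-in for resources that a PWP Processor
// has already cached or unpacked from an archive. All names are hypothetical.
type Resolution =
  | { source: "local"; body: string }
  | { source: "network"; url: string };

class PwpStore {
  private entries = new Map<string, string>();
  private pwpLocator: string;

  constructor(pwpLocator: string) {
    this.pwpLocator = pwpLocator;
  }

  // Register a resource by its path relative to the PWP locator.
  add(relativePath: string, body: string): void {
    this.entries.set(relativePath, body);
  }

  // Decide how a request should be served.
  resolve(requestUrl: string): Resolution {
    if (requestUrl.startsWith(this.pwpLocator)) {
      const rest = requestUrl.slice(this.pwpLocator.length);
      // Drop any fragment: it addresses a target inside the resource.
      const hashAt = rest.indexOf("#");
      const path = hashAt === -1 ? rest : rest.slice(0, hashAt);
      const body = this.entries.get(path);
      if (body !== undefined) {
        return { source: "local", body };
      }
    }
    // Not part of this PWP, or not available locally: go to the network.
    return { source: "network", url: requestUrl };
  }
}

const store = new PwpStore("http://www.ex.org/MyPWP/");
store.add("Chapter1.html", "<h1>Chapter 1</h1>");
console.log(store.resolve("http://www.ex.org/MyPWP/Chapter1.html#section1").source);
console.log(store.resolve("http://www.ex.org/MyPWP/Chapter2.html").source);
```

In this example the first request is answered locally and the second falls through to the network, which mirrors the "possibly cached" and "possibly unpacked" variants of the envisioned architecture.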

Not only a wild idea…

Technical challenge: addressing, identification

Is it "addressing" or is it "identification"?

Three layers of addressing

  1. Locator for the PWP itself:
    http://www.ex.org/MyPWP/
  2. Locating a resource within a PWP:
    http://www.ex.org/MyPWP/Chapter1.html
  3. Locating a target within a resource:
    http://www.ex.org/MyPWP/Chapter1.html#section1
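
As a rough illustration of the three layers, the sketch below splits an address into the locator of the PWP, the resource within it, and the target within that resource. The function `splitPwpAddress` and the idea of passing in the PWP locator as a known base are assumptions made for this example; the draft does not define such an API.

```typescript
// Hypothetical helper, not part of any PWP draft: splits an address into
// the three layers listed above, given the locator of the PWP itself.
interface PwpAddress {
  pwpLocator: string;   // layer 1: the PWP itself
  resourcePath: string; // layer 2: resource within the PWP ("" if none)
  fragment: string;     // layer 3: target within the resource ("" if none)
}

function splitPwpAddress(address: string, pwpLocator: string): PwpAddress | null {
  const url = new URL(address);
  const base = new URL(pwpLocator);
  // The address must live under the PWP locator to belong to this publication.
  if (url.origin !== base.origin || !url.pathname.startsWith(base.pathname)) {
    return null;
  }
  return {
    pwpLocator: base.href,
    resourcePath: url.pathname.slice(base.pathname.length),
    fragment: url.hash.replace(/^#/, ""),
  };
}

const parts = splitPwpAddress(
  "http://www.ex.org/MyPWP/Chapter1.html#section1",
  "http://www.ex.org/MyPWP/",
);
console.log(parts);
```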

Locating the different PWP “states”

Canonical locators

The PWP Processor can take care of the rest…

What does an HTTP GET return for L?

Getting hold of all locators

Flow diagram on accessing and combining various sources of Metadata

Work in progress

Manifests

Technical challenge: presentation control
(a.k.a. Personalization)

How do we get there? (Practically)

Moyan Brenn, Flickr

DPUB IG and Portable Web Publications

screen dump of the PWP draft

IDPF, W3C, and others

Some references

DPUB IG Wiki
https://www.w3.org/dpub/IG/wiki/Main_Page
Latest PWP Official Draft:
http://www.w3.org/TR/pwp/
PWP Editors’ draft:
https://w3c.github.io/dpub-pwp/
PWP Issue list:
https://github.com/w3c/dpub-pwp/issues

Thank you for your attention!

This presentation:
http://www.w3.org/2016/Talks/W3CTrack-IH/
(PDF is also available for download)
My contact:
ivan@w3.org