London Meeting
From Fixing Application Cache Community Group
14 August 2012, FT Labs.
Minutes are reproduced below verbatim for archiving purposes. Original minutes available here: https://etherpad.mozilla.org/appcache-london.
* No implementation is correct
* AppCache spec is too complex
* In order to change the standards we need real stories that didn't work and offer proposals
* Results of today
** Use cases -- "what are the various set of things we're trying to build?"
** Things to discuss next week in Mountain View

Case studies
==========

FT app
---------
* Single page app, (currently) uses hash-bang URLs.
* As a result, explicitly caches root URL and depends on user only visiting that one URL
* Otherwise much as described in Lanyrd and Economist case studies below.

Lanyrd
---------
http://m.lanyrd.com/
* separate domain
* not a retrofit, would have required a re-write anyway
* want to provide a subset of overall content, scored by users a-priori
* can't put a manifest on every page
  - lack of control, no way to uncache parts explicitly
  - diveintohtml5 pattern doesn't work; no way to expire things
  - bunfight over cache eviction policies
* description of the problems: http:///articles/application-cache-is-a-douchebag/
* problem with stale data
* problem with user-facing private data
* wanted the app to work without JS
* cached JSON data in localStorage (mustache to render HTML - reason is storing multiple page data)
  - unreliably connected is *worse* than disconnected
• because users are logged in and their data is saved to the cache, if they log out and someone else logs in you get the wrong data. So they include the login as a comment in the manifest to work around that.
• We use XHR just so we can show the "Loading" thing
• there is no error feedback on fallback pages, and there are quite a few rather different error cases
• interesting attack: load the main page with lots of different query strings, cause it to be implicitly cached many, many times
• far-future caching of the manifest will lock out updates until the far future happens (30 FT users still locked into an AppCache from July) (bug 830588)

Economist app
---------------------
* the result of lessons learned from the FT app
* no manifest when users aren't signed in
* offline is an explicit user choice (users are prompted)
* manifest is populated via an iframe, the same as Lanyrd too
* upon user permission, download all of the current edition into WebSQL/IndexedDB
• we don't know how much space there is in appcache, and realistically we find that it usually is not enough for our needs
* storing images as encoded strings in the db.
* batching requests for pages. Pages are atomic, JSON data that contain the text content + the encoded images.
* economist has a native affordance, app store distribution
* FT only runs through web UI, but prompts users to add a web shortcut
  - shortcut is the preferred method for Lanyrd too
* neither FT nor Lanyrd is mobile-only for this stuff, but is mobile-first
* on desktop, one of the important use cases is to pre-fetch content.
* individual issues tend not to change
  - eviction is handled through LocalStorage/etc., not through AppCache, and is LRU based on issues
* All the JS is in LocalStorage
• on startup, a heartbeat is sent to the server with last modification times of the resources in local storage
  - server responds with information about what to update (and the priority of the update) based on compat with currently bootstrapped host code
* very long session times (sometimes weeks)
* everyone is using window.name to hold invalidation/last-boot information in order to avoid reload loops and to force refreshes when the appcache has old templates vs.
  incompatible data
* two fallbacks: /api and /issues, both are needed to make JS apis work gracefully when things aren't working
* want to support content + structure in the same response for performance reasons, crawlers, etc.
* deliver content within the structure first (data included in HTML markup) then bootstrap the JS ("appify").
* origin segmentation is a huge issue. Everyone is checksumming in order to invalidate based on user of the app (different logins in the same browser/session)

Facebook mobile site
------------------------------
* current infrastructure has one giant branch system, no differentiation based on client capability in terms of overall architecture
* 2011 Appcache testing
  - facebook is mainly content
* caching static resources via AppCache
* push a lot, many invalidations, pre-fetching punishes demand-loading of content
* history API via a controller that owns the app state
  - JS for page is serialized in sequence and replayed when view is returned to
* always signed in (no way to cache the whole thing), never offline
* appcache for perf *only*
* would strongly prefer a "page state snapshotting" system
  - don't have a strong idea for how the controller would interact with this
* NO LONGER USING APPCACHE
  - stopped in Dec
  - burned through too many network resources to be useful
  - probabilistic resource packaging means that all content is invalidated on each push
* changing their approach, looking to be more in-line with AppCache's current strengths
  - will go to a more app-like style, which will help
  - progressive enhancement to the Nth degree is untenable at some size
  - spreading a codebase too wide hurts both high and low-end experiences
* concern that relying on HTTP for *SOME* things, *PART* of the time makes the whole system extremely difficult to reason about (see: no store headers and master entries vs.
  other items)
* big apps hit the "manifests hosted on different servers from data and load-balancer's f you" and invalidating everything is the only way to go because you can't do temp responses (because spec doesn't want to be hostage to captive portals)
  - manifest on Server 1, downloads master entry (which can't be fingerprinted)
  - resources are on a CDN and are all fingerprinted

... much quibbling over what spec actually says about 2-phase update...
... some discussion of overlapping resource requests in disjoint manifests from the same domain
... NOBODY UNDERSTANDS THE MANIFEST SELECTION ALGORITHM
... correction... WE HAVE NO MENTAL MODEL FOR APPCACHE, and as a group are relatively convinced that nobody else does either.

Mobile GMail
------------------
* All data in gmail is private, no public data
* All served over SSL
* Multiple versions of the app for different platform / client capabilities
* Offline only supported on most modern browsers
* Heavy client side data model
* Every view is a search
* Gears was designed to make GMail work offline
* App Cache stores the 'shell' of the application
* Main app code mainly sits in localStorage (though it is initially delivered to browser in an HTML comment)
* Time to view inbox, time to view message - key performance targets
* App cache contains just a fallback / master for bootstrap - effectively just a single resource

Use Cases
========
* The ability for a user to be able to choose content (developer-defined chunks, for example alternative and/or additional content) he wants to consume later, maybe when offline (eg wikipedia read later).
* The user wants to update an application and only incur the cost of downloading the delta of that app.

Dev Use Case
==========
* The user sees an indication to make it clear to them when content is available offline, and which specific items of content.
  Items don't refer to HTTP assets but rather App
* User can refer to an index of pages they have cached offline, built by the site
* User visits single page, sees content instantly from cache, application fetches resources to update that single page atomically, user is informed of updates if any
* User visits leaf page (search results, article), want to update/create offline experience for that site without the current page necessarily being part of it
* User visits web game, assets are all downloaded before user is able to play, game is playable without a connection
* User visits web game, levels are downloaded one by one, user is able to play once first level has downloaded, additional levels are downloaded in the background. Game playable without a connection (only pre-downloaded levels)
* User is in a low network area but data for page has been cached offline, wants to see cached data straight away rather than waiting for connection to succeed/fail
* User visited page they have cached, site decides to show them cached data straight away for performance, then update with fresh data when/if it arrives
* User visits an online newspaper and a user-specific selection of several large media assets (audio and video) are downloaded in the background, and the user is able to play these while offline.
* User visits http://foo.bar/articles/whatever but they don't have it cached offline / page does not exist / foo.bar is down, user wants to know whether the error is their fault, server fault or connection fault
* User visits page while online, train goes into tunnel, triggers an async request, request comes from offline cache if present
* User views TV schedule which also contains now+next. Page caches with the exception of now+next which is too time sensitive
* User saves 99 articles for “offline reading” on a large site such as wikipedia. User adds article on “gravity” to their offline list.
  Want to cache that new article without re-requesting the other 99
* User visits article on “Gravity” while offline, want to show cached version straight away but request updates to the article without requesting the other 99
* User selects “update all” from a menu, want to update cache for all 100 articles
* Page uses webfonts, want to cache web font formats used without having to cache other suggested formats. Eg, cache WOFF or TTF, not both (similar cases for media-query determined imagery)
* A tool like Opera Dragonfly - currently an appCached HTML5 app - needs to be pre-seeded so on first run there's already something there. Currently, if a user happens to be offline and for the first time hits "Inspect Element", nothing is there as the appCache is empty.
* The user is informed of cache population progress
* The user is shown an app-specific loading screen between pages, indicating that the resource isn't instantly showable (perhaps offering alternatives that are)
* The user is aware that a particular page is 30 days old and some data may not be relevant
* User clicks "read later" on an article, that article is cached along with a Google Maps image included on the page (which originates from another origin)
* User visits page, page makes itself work offline, some resources are from a CDN for performance (therefore different origin) which the page can use offline

Trying to pin down when/how caches are removed on low-memory/usage - don't think we agreed on these:
* The user has cached 50 levels of a game, the system runs out of space, the browser flushes completed levels, but leaves uncompleted levels and the core of the game (developer has control over priority of atomic packages, and accepts some may be discarded)
* A site guarantees the user a particular set of data to be available offline, the browser or system must not remove this without user knowledge
* A user has 5 games and 1 note taking app stored offline.
  The system is critically short of resources, the user chooses to uncache the games but retains the note taking app, based on importance and space used
* A user is asked to give an app no-flush privileges, meaning it will not be flushed by the browser automatically, user selects "yes please", because they want to rely on the app's offline functions

Potential proposals
==============
* Super cache API: programmatic control over cache - push items into cache, inspect current cache status of any resource, evict items from cache, etc.
* Web worker callback / local server: Route all requests through a local process that can intercept requests and serve them from local data stores rather than the network (and push responses from the network into local stores before making response available to browser)

Discussion Topics
=============
* need a way to know if the current page is served from appcache or not

Arbitrary Notes
============
• Worth looking at Quota Management API http://dvcs.w3.org/hg/quota/raw-file/tip/Overview.html
• Old, currently dead proposal: Programmable HTTP Caching and Serving http://www.w3.org/TR/DataCache/
* we want a fairly low-level API, which gives power to, but also puts responsibility on, developers to be able to very granularly control what/how/when stuff gets cached. no special APIs or "flags" to say "this is important -> put it in a new type of SUPERcache", leave it up to developer to handle exactly how they balance their resources by giving them enough access to info about: what is and isn't cached, how old are the cached versions, are updates available for each individual resource, how much space is left on the device for caching, does the device allow me to ask for more space from the user.
* more granular control needed because there are different kinds of "stuff" that developers want to store. nice to have (images, video?) vs critical (the actual JS logic of an app). it's not just static content, but could be whole framework, data, etc.
* low-resource callback system: Event Pages: http://developer.chrome.com/beta/extensions/event_pages.html

People present:
Alex Russell -- Google
Andrew Betts -- FT Labs
Jackson Gabbard -- Facebook
Jake Archibald -- Lanyrd
Patrick H Lauke -- Opera
Robin Berjon -- Independent standards consultant
Tobie Langel -- Facebook
Christian Heilmann -- Mozilla
Rowan Beentje -- FT Labs (from 12pm)
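
Illustrative note (not part of the original minutes): the Lanyrd case study mentions including the login as a comment in the manifest so that a change of user invalidates the cache. This works because a browser treats any byte-level change in the manifest as a new version. Below is a minimal server-side sketch of that idea, not Lanyrd's actual code. The function name `build_manifest`, the example URLs, and the choice to hash the login rather than write it in plain text are all assumptions made for illustration.

```python
import hashlib


def build_manifest(login: str, cached_urls: list[str], fallback: dict[str, str]) -> str:
    """Return AppCache manifest text whose bytes differ for each login."""
    # Assumption: a short hash stands in for the login, so the raw
    # username is not written into the manifest file itself.
    user_tag = hashlib.sha256(login.encode("utf-8")).hexdigest()[:12]

    lines = ["CACHE MANIFEST", f"# user: {user_tag}", "", "CACHE:"]
    lines.extend(cached_urls)

    # Each FALLBACK entry is "<namespace prefix> <fallback page>".
    lines += ["", "FALLBACK:"]
    lines.extend(f"{prefix} {page}" for prefix, page in fallback.items())

    # Let every other request go to the network.
    lines += ["", "NETWORK:", "*"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(build_manifest("alice", ["/app.js", "/app.css"], {"/api": "/offline.json"}))
```

Serving the result for two different logins yields two different manifests, so the second user's visit triggers a cache update instead of reusing the first user's cached data.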