<cyril> scribe: Cyril
<scribe> meeting: DataCue and "Time marches on" in HTML
chris: the topics to discuss are the DataCue API and the Time marches on algorithm
chris: if anybody is not familiar
so far
... our goal is we want to introduce native user agent
support
... for DASH events
... as part of support for MPEG CMAF content
... alongside, if the UA has native support for DASH playback
... we would like to support out of band MPD events
... HbbTV is an example of a player that has native DASH support
... I'd like to discuss implementer interest for other metadata
cue formats
... for example Safari has support for ID3
... we also want API support for application-generated timed
metadata cues
... when generated by a player
... the existing approach is to use the VTT cue
... either inline or as a reference in the VTTCue object
... having a more convenient DataCue API that lets us store the data in the preferred format would be better
... our goals are: sync arbitrary data with video
... e.g. dashcam sensor data
... there is a lot of interest from the Open Geospatial Consortium
... the applicability is broader than for the M&E IG
... [shows current support for DataCue API]
... basically no support so far in Chrome or Firefox
... some support in Safari with an extended API
... [showing the data structure for emsg]
... there are 2 versions that differ wrt the timing
... there would need to be a mapping between the emsg timing and the cue timing
... in v0 the timing is relative
... the data is a byte array, so you need a schema to identify
the data
... DASH-IF is working in parallel around specifying a delivery
and processing model for DASH events
... they are considering more types of players, not web
platform only
... one of the requirements they have identified
... is that in order for an application to get prepared for
presenting a cue, e.g. a video overlay
... that may require fetching other resources
... they signal the event to the application ahead of
time
... to be able to render at the appropriate time
... so we have 2 events: onreceive and onstart
... I have a number of questions:
... it relates to the early discussion around this, should in
band events be exposed as a byte array
... or should they be exposed as objects
... the second approach makes it easier for app devs
... this may be desirable for cues that are commonly used
... for example within DASH players
... the emsg can be used for application specific events
... and we don't need support from browsers for those
... there is a question of how we identify inband tracks
... there are various fields
... all of them seem to enable identifying the kind of
metadata
... it is not clear to me reading the spec and comparing
implementation
... what the level of support is
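Illustration (not shown in the meeting): a minimal sketch of the emsg-to-cue timing mapping mentioned above. It assumes the usual reading of the emsg box, where version 0 carries a delta relative to the start of the containing segment and version 1 carries a time on the media timeline, and where an event duration of 0xFFFFFFFF means "unknown". The `EmsgBox` and `CueTiming` shapes and the `emsgToCueTiming` name are hypothetical, 64-bit times are simplified to plain numbers, and any offset between the media timeline and the media element timeline is ignored.

```typescript
// Hypothetical shape for an already-parsed emsg box (not a browser API).
interface EmsgBox {
  version: 0 | 1;
  schemeIdUri: string;            // identifies the schema of messageData
  value: string;
  timescale: number;              // ticks per second
  presentationTimeDelta?: number; // version 0: ticks after the segment start
  presentationTime?: number;      // version 1: ticks on the media timeline
  eventDuration: number;          // ticks; 0xFFFFFFFF is treated as unknown
  messageData: Uint8Array;
}

// Hypothetical result: times in seconds, as a text track cue would use.
interface CueTiming {
  startTime: number;
  endTime: number;
}

const UNKNOWN_DURATION = 0xffffffff;

function emsgToCueTiming(box: EmsgBox, segmentStartSeconds: number): CueTiming {
  // Version 0 timing is relative to the segment; version 1 is absolute.
  const startTime =
    box.version === 0
      ? segmentStartSeconds + (box.presentationTimeDelta ?? 0) / box.timescale
      : (box.presentationTime ?? 0) / box.timescale;

  // Assumption: an unknown duration is mapped to an unbounded cue.
  const endTime =
    box.eventDuration === UNKNOWN_DURATION
      ? Infinity
      : startTime + box.eventDuration / box.timescale;

  return { startTime, endTime };
}
```

The segment start has to be supplied by whatever parses the media, since a version 0 box cannot be placed on the timeline on its own.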
chris_c: on your first
bullet
... is it a reasonable behavior to fallback to the opaque array
buffer when you don't understand the type?
chris: I'd like to understand
what common subset can be supported?
... but the fallback could be a good approach if we have an API
for that
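Illustration (not shown in the meeting): a minimal sketch of the fallback behaviour discussed here, seen from the application side. The cue shape (`type`, optional structured `value`, optional raw `data`) and the `readCue` helper are hypothetical and do not describe any shipping API; the type strings are made-up examples.

```typescript
// Hypothetical cue shape: a type identifier plus either a structured value
// supplied by the UA or the raw bytes (a view over the cue's array buffer).
interface MetadataCueLike {
  type: string;
  value?: unknown;
  data?: Uint8Array;
}

type Parser = (bytes: Uint8Array) => unknown;

interface ReadResult {
  source: "structured" | "parsed" | "opaque";
  payload: unknown;
}

function readCue(
  cue: MetadataCueLike,
  parsers: { [type: string]: Parser | undefined }
): ReadResult {
  // 1. Prefer the structured object when the UA understood the type.
  if (cue.value !== undefined) {
    return { source: "structured", payload: cue.value };
  }
  // 2. Otherwise use an application-supplied parser for that type, if any.
  const parser = parsers[cue.type];
  if (parser !== undefined && cue.data !== undefined) {
    return { source: "parsed", payload: parser(cue.data) };
  }
  // 3. Last resort: hand back the opaque bytes unchanged.
  return { source: "opaque", payload: cue.data };
}
```

This shows the concern raised next: an application that wants to work across browsers still has to carry the parsing path for the opaque case.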
francois: you'll end up with 2 representations for the same data
... so in the end the devs have to handle the opaque case
... so in this case we shouldn't bother about the structured
object
eric: I disagree strongly
... there are metadata formats that are very difficult for JS
to parse
... correctly
... so that's why I added the implementation to WebKit
... because we had lots of requests to support datacues
... and just supporting arrays is not doing web authors a service
francois: it is better to have a system that works across browsers
eric: if we decide that structured data is important
... we need to agree on a set of types
... that we want to support
... there will always be custom metadata
... people can put anything in a container format
... and they want to have access to it
... it does not make sense to have support for only a limited set
chris: I can imagine a world
where an impl wants to provide access to ID3 and another
not
... it's the responsibility of the dev to know that
eric: I agree that we should not end up in this situation
mounir: is there benefit in trying to avoid that?
eric: I think so
... we don't have to end up there
... if we can come up with a way to describe the cue
... and require that a browser that uses that identifier have a
structured data
mounir: there could be security issues and different parsing if the browsers do it themselves
richard: if the parsing is done within the browser and uses WebAssembly, how could there be a security issue?
mounir: if you use WebAssembly that's ok
... we try to avoid doing parsing in C++
chris_c: I'm trying to understand
what the fallback would look like
... maybe the ID3 would not be contentious
chris: having an API structure that lets the application introspect the cue
gkatsev: ID3 in HLS, Safari parses it, but in other browsers you have to do it yourself
nigel: is the data in the array buffer a registered type
ericc: no the data has no indication
cyril: no magic number?
ericc: no
chris: the emsg also indicates the scheme id
ericc: with the current DataCue API, the array buffer would have that whole thing from start to end
... and you'd have to sniff the bits to figure out if it's an emsg or ID3
... and it's going to have to parse it to determine if it's an emsg or not
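Illustration (not shown in the meeting): a rough sketch of what "sniffing the bits" could look like in script. It assumes the bytes contain the complete payload from its first byte, and it relies on two well-known conventions: ID3v2 tags begin with the ASCII characters "ID3", and an ISO BMFF box carries its four-character type in bytes 4 to 7. The `sniffMetadata` name is hypothetical, and this is only a heuristic that can misidentify arbitrary data.

```typescript
type SniffedKind = "id3" | "emsg" | "unknown";

// bytes: a view over the cue's array buffer.
function sniffMetadata(bytes: Uint8Array): SniffedKind {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // ID3v2: first three bytes are "I", "D", "3".
  if (
    view.byteLength >= 3 &&
    view.getUint8(0) === 0x49 &&
    view.getUint8(1) === 0x44 &&
    view.getUint8(2) === 0x33
  ) {
    return "id3";
  }

  // ISO BMFF box: 4-byte big-endian size, then the 4-character type.
  // 0x656d7367 is "emsg" in ASCII.
  if (view.byteLength >= 8 && view.getUint32(4) === 0x656d7367) {
    return "emsg";
  }

  return "unknown";
}
```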
nigel: imagine that we expose
this data through MSE
... the bytestream would be identifiable
ericc: the UA, thing that parses
the raw media container, does have a signal about what kind of
metadata it is
... if the data cue had a scheme and identifier for the type of
metadata and an array buffer
... then in theory it could know how to parse it
... the reason I decided that was not practical for us
... is that there are metadata values that are extremely
complex to parse
... like HLS has a pList
... writing a parser for a binary pList in JS
... is not easy
... practically speaking, WebKit does not have access to the raw pList
... the low level does the parsing
... and we get it as a native object
... a representation of the data
... which WebKit converts into a JS object attached to the
datacue
greg: most of the conversation is
about inband
... I can see datacue useful for out of band use cases
ericc: that is a part of
this
... from script you can make a new data cue with
start/end
... and attach anything
chris: the explainer is
incomplete and in a very early stage
... it does not explain everything
ericc: any solution we come up with has to support cues from script
<nigel> scribe: nigel
cyril: Comment on
synchronisation
... The payload of the metadata may trigger behaviour with
unbounded complexity
... so that's why you probably need to process it in advance
and to know in advance the practical bound.
... To me this is similar to how video content is
processed.
... We don't have two timestamps, one for receiving, the other
for presenting.
... The implementation has to know when to preprocess
things.
... So I'm not convinced that having two events is a good
approach.
ericc: I agree and am strongly opposed to having two.
<cyril> eric: I strongly oppose having 2 timestamps
<scribe> scribe: cyril
UNKNOWN_SPEAKER: in addition you cannot predict how much time the app is going to take to do the processing
... if what you are suggesting is that a cue should be delivered as soon as it is available
... that's going to vary widely
... depending on where the parsing happens
francois: perhaps it's useful to look at why
<inserted> scribe: nigel
cyril: I agree, 3 categories of
event:
... 1. Overlay, maybe after js processing.
... 2. Network impact, like making requests or sending
messages
... 3. Modifying the DOM
... The 3rd category - you should be able to pre-render in
advance and keep your frames until they're ready
... The other two I'm not sure about yet.
<scribe> scribe: cyril
chris: I'd like to move on to the next part, synchronization
chris: web apps use the
oncuechange
... triggered by the time marches on
... and the spec says there is an upper limit
... but in practice some implementations only run it at that upper limit
... this means that it is possible for an application to miss a
short duration cue entirely
... the cuechange event is fired, the app inspects the active
cues list
... and acts
... it's quite possible that in between cuechange events there are cues that the app doesn't see
... there is a bug report raised by Jon Piesing, HbbTV
... and the recommendation is not to create short cues
... but it's worse than that
... you have to take execution time into account
... use of oncuechange is problematic for handling cues
... the good news is that if you want to avoid missing cues
... you can attach handlers to onenter and onexit
nigel: but if it was missed,
enter/exit are triggered at the same time
... and if there are visual changes they will be missed
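Illustration (not shown in the meeting): a simplified simulation of the behaviour described here, where a short cue that starts and ends between two runs of the algorithm never appears in the active cue list, yet still gets enter and exit reported together. This is a toy model, not the specification's algorithm; the `simulateTicks` name, the `Cue` and `TickReport` shapes, and the tick times are assumptions chosen for the example.

```typescript
interface Cue {
  id: string;
  startTime: number; // seconds
  endTime: number;   // seconds
}

interface TickReport {
  time: number;
  activeIds: string[]; // what an app would see in the active cue list
  events: string[];    // enter/exit notifications produced at this tick
}

// tickTimes: the playback positions at which the cue check happens to run.
function simulateTicks(cues: Cue[], tickTimes: number[]): TickReport[] {
  const reports: TickReport[] = [];
  let lastTime = 0;
  let previouslyActive: string[] = [];

  for (const time of tickTimes) {
    const activeIds = cues
      .filter((c) => c.startTime <= time && time < c.endTime)
      .map((c) => c.id);
    const events: string[] = [];

    for (const c of cues) {
      const wasActive = previouslyActive.indexOf(c.id) !== -1;
      const isActive = activeIds.indexOf(c.id) !== -1;
      // A cue that began and ended entirely between two ticks was "missed".
      const missed =
        !wasActive && !isActive && c.startTime >= lastTime && c.endTime <= time;

      if (missed) {
        events.push("enter:" + c.id, "exit:" + c.id);
      } else if (!wasActive && isActive) {
        events.push("enter:" + c.id);
      } else if (wasActive && !isActive) {
        events.push("exit:" + c.id);
      }
    }

    reports.push({ time, activeIds, events });
    previouslyActive = activeIds;
    lastTime = time;
  }
  return reports;
}
```

For example, with a long cue A from 0 to 1 s, a short cue B from 0.30 to 0.40 s, and checks at 0.25 s and 0.5 s, B is never in `activeIds`, and its enter and exit both show up in the second report.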
foolip: the time marches on steps are not defined to run every 250ms
... it's meant to be continuous
... only the events are triggered every 250ms
chris: that's not my reading of the spec
foolip: the problem is that implementations are not following the spec because that's easier to do
ericc: if you run a test to look at the variance
... you'll see 10-20ms because we don't use the time marches on
... but look at the cues
... this is a quality of implementation issue
nigel: this is a spec
question
... [reading the spec]
foolip: it's just for the
timeupdate event
... not for the cue events
... [explaining how it worked in Presto]
nigel: chrome does it this way
foolip: not because the spec is wrong
nigel: but the spec allows it
chris: we need a follow-up to understand that
foolip: maybe open a bug in chromium
chris_n: the spec does not mandate 250ms
ericc: we limit how often we fire timeupdate events so as not to overload the system
... we could, but that would cause other issues
pierre: the spec guarantees that
every single cue will be fired
... regardless of the algorithm ?
gkatsev: no some have been missed
scott: the text says some cues can be skipped
ericc: cues can be dropped
pal: if I have a cue that has a duration of d, is there a requirement that the difference between onenter and onexit is close to d?
ericc: no
pal: you could get them simultaneously
ericc: but if there is onenter/onexit it should be fired
foolip: that's a good idea
chris: another related issue
is
... we want a more accurate firing of these events
... driven by the need to align captions with shots or scene
changes in the video
... and we came up with a number of 20ms
... that gives a chance to the application
nigel: you want to replace the number 250 with 20?
chris: no
richard: as the time limit gets shorter, the power you're going to use grows exponentially
foolip: the reason the scheduling is poor is not battery saving
ericc: it was because it was
simpler to write
... it's not possible to guarantee any kind of specific latency
... because the browser is under the same constraints as
anything else
nigel: that depends on the frame rate
ericc: cues are not tied to frames
foolip: but frames have time stamps
ericc: in my system the frames are rendered by a different subsystem
foolip: there is a quality of implementation issue
ericc: no matter what wording we
put in the spec
... it won't help you
... you have to file bugs to get what you need
chris_n: I wouldn't close an issue because it is ok with the spec
chris: I see some inconsistencies
between implementations
... when the application moves cues around in the
timeline
... if you change time of the cues
... and if you seek the media
... and seek over some cues
ericc: have you filed bugs?
chris: not yet
chris_n: a spec update is not necessary but it may be useful to avoid others making the same mistake
ericc: we should not wait for
TPAC to file bugs
... if we want to have the issue fixed quickly
nigel: it's hard to file a bug with the given spec
ericc: if you file a bug with an
example and it is not good enough even if it matches the spec
we should fix it
... we could get the spec improved
foolip: all specs are wrong every other paragraph!
richard: sometimes I've asked to fix an implementation but been told that the impl is within the spec
foolip: it happens that implementers consider the spec as untouchable but you should escalate
chris: [showing a waveform library demo]
... I'm using VTTCues
... adjusting the times on cues
... it's not the only use case
... [showing a table of what events get fired in practice]
ericc: you should file a bug
chris: the next stage is the
meeting on Friday, joint Media WG and Timed Text
... we should figure out how to use that time productively
<nigel> Blink bug
chris: it seems filing bugs is the recommendation
Present: romain, gkatsev, pal, chcunningham, ericc, jkamata, stepsteg, Nigel
Scribes: Cyril, nigel