<kaz> Chair: Giri
<kaz> scribenick: kaz
Media Timed Events use cases document
Giri: One of the things Chris raised
as a next step is publication of this document.
... I believe we said it would come out as an IG Note. There are a
couple of outstanding pull requests,
... but after that we have the option of taking a snapshot in
time.
... What are the formal procedures involved in publishing this as a
note?
<cpn> scribenick: cpn
Kaz: The procedure itself is not very
complicated. During this TF call, we can make our own
decision, and then we can confirm that decision during a main IG
call as well,
... and then talk with PLH as the project manager, to get approval,
and then we can publish the document.
<kaz> scribenick: kaz
Giri: Is there any objection to that?
My suggestion would be to take care of outstanding pull requests
during the next couple of weeks prior to the next IG call,
... then raise this as a topic during that call to say we want to
publish. Does that sound OK?
Chris: I think what we have now is
good as a first public draft. I'm not sure it's ready to be
finalised at this stage.
... I'm certainly happy to publish, but with a view to making some
more updates. I don't necessarily feel we've completed this
yet.
Giri: I tend to view this as a snapshot in time. Kaz, what's the document life cycle for IG notes, as compared to standards track documents? Can we publish a snapshot and keep revising?
<cpn> scribenick: cpn
Kaz: As we're an Interest Group, this
document will become an IG note,
... and we can publish whenever we want as an updated group
note.
<kaz> scribenick: kaz
Giri: OK
Steve: That sounds reasonable to me,
I think that's the best approach.
... If we get something out then it gives people visibility of what
we've been doing.
Giri: I think it's a good idea too, because as we go forward with the collaboration with WICG, we may find we need to revise the document.
Giri: We always need to keep revising
use cases, I think. Timing requirements were another topic.
... Chris and I have both sent a couple of pull requests, we need
another set of eyes on this.
Chris: Let's go through these, to see what we think?
Chris: Looking at issue 22, the
timing requirements.
... We've had some discussions on previous calls about captioning
and the need for frame accurate rendering.
... One of the things I think we discussed at TPAC was to identify
if different use cases have different timing requirements.
Giri: I had an action item from TPAC to document the SCTE-35 requirements.
Giri: There, you can set an insertion
cue as little as 2 seconds prior to the availability of the splice
event,
... which to me seems to put a tight timing requirement on the user
agent for propagating a splice event over to the application.
... It's a use case that doesn't leave much time for client side
processing.
Chris: What's the context for this, is it in MSE playback?
Giri: It could be. By splicing, I'm
primarily referring to ad insertion.
... As far as SCTE is concerned, it's any downstream point where
the splice can take place,
... even all the way down to the user agent, and MSE is the W3C
solution to splicing, currently.
... So this is what the cable guys are looking at as far as splice
requirements are concerned.
... What are we going to do as far as user agent processing is
concerned?
... They don't really give normative requirements that can be
translated into user agent requirements.
... Two seconds seems like something the user agent could meet, but
that's for the entire processing of the ad insertion cue,
... including the transmission delay and processing by the user
agent to extract the event data and propagate it to the
application,
... and delay in the application to handle the event data. So the
timeline isn't very clear to me.
Chris: We've talked about the "time
marches on" algorithm and the 250 millisecond limit that that
specifies.
... Would that be sufficient?
Giri: My guess is it's probably not.
If you consider a 2 second budget, the transmission latency could
be several hundred milliseconds, depending on the quality of the
transmission.
... Then there's a 250ms upper bound on processing by the user
agent. Plus, if the application has to make any networking calls,
that can add several hundred milliseconds.
... So that could be up to 1 second of delay. You would rather have
the bulk of that 2 second budget allocated to transmission delays
rather than client processing.
... That's my initial take. I could break it down into a time budget
for one way transmission.
... I think the BBC might be in a better position to answer this,
as you're actually processing these cues on the back-end, so you
might also be able to provide some insight.
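[Scribe note: a rough illustration of the time budget Giri describes, using assumed figures from the discussion; the numbers below are examples, not normative values.]
```typescript
// Illustrative breakdown of the 2 second SCTE-35 splice cue budget.
// All delay figures are assumptions taken from the discussion above.
const totalBudgetMs = 2000;        // cue may arrive as little as 2s before the splice point
const transmissionDelayMs = 500;   // assumed one-way transport delay (several hundred ms)
const userAgentDispatchMs = 250;   // "time marches on" upper bound on event dispatch
const appNetworkMs = 300;          // assumed application fetch, e.g. requesting the ad creative

const appLogicBudgetMs =
  totalBudgetMs - transmissionDelayMs - userAgentDispatchMs - appNetworkMs;
console.log(`Budget left for application logic: ${appLogicBudgetMs} ms`); // 950 ms here
```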
Chris: I can see what I can find out regarding end to end delay.
Giri: This is an example of additional timing requirements. Are there any others we should consider?
Chris: Somebody contacted me offline,
they're following the work we're doing.
... So we may have some new use cases coming in, I hope they'll
reply to the GitHub issue.
... The caption rendering is another one where we have stringent
timing requirements.
Giri: We have the BBC subtitle guidelines. Are there other sources for caption requirements that could be frame synchronous?
Chris: There was someone who replied to the M&E IG issue 4.
<cpn> https://github.com/w3c/media-and-entertainment/issues/4#issuecomment-396762643
Chris: This comment mentions caption
rendering, he wants the captions to appear and disappear coinciding
with scene changes,
... and he's using the currentTime from the video element.
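[Scribe note: a minimal sketch of the approach in that comment, polling currentTime to toggle a caption at assumed scene-change times. The 'timeupdate' event typically fires only every 250 ms or so, which is why frame accurate alignment is hard this way.]
```typescript
// Show/hide a caption element in sync with an assumed scene change,
// driven by video.currentTime as described in the issue comment.
const video = document.querySelector('video')!;
const caption = document.getElementById('caption')!; // hypothetical caption element
const sceneStart = 12.0; // assumed scene-change times, in seconds
const sceneEnd = 15.5;

video.addEventListener('timeupdate', () => {
  const t = video.currentTime;
  caption.style.visibility = t >= sceneStart && t < sceneEnd ? 'visible' : 'hidden';
});
```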
Giri: I see, this seems to be a
caption that is sync'd with the scene rather than a captioning
requirement.
... I think we can reflect that in the document.
... It seems like a content authoring use case, I can see if there
are other mentions of this.
... From a BBC perspective, do you see this requirement from a
content authoring perspective?
Chris: I would defer to Nigel on
issues of caption rendering,
... but something we are interested in doing, more from a research
area,
... is client side rendering of supplemental content alongside
video, e.g.,
... triggering overlays or graphics rendered with the video
content.
... And we would want to achieve a higher level of timing precision
to do those kinds of things.
... Those are the kinds of use cases where right now you would have
to take video frames and render into a canvas
... rather than using events to trigger DOM updates.
... So, for us, having a much more integrated control over the
video rendering with the ability to mix additional content into
that would be of interest.
... I'm less certain what the solutions would be there.
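[Scribe note: a minimal sketch of the canvas fallback Chris describes, assuming a same-origin video. requestAnimationFrame runs at the display rate rather than the video's frame rate, which is the mismatch Giri raises next.]
```typescript
// Copy each video frame into a canvas so overlays can be composited
// with it, instead of triggering DOM updates from timed events.
const video = document.querySelector('video')!;
const canvas = document.querySelector('canvas')!;
const ctx = canvas.getContext('2d')!;

function drawFrame(): void {
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  // ...composite graphics or overlays on top of the frame here...
  if (!video.paused && !video.ended) requestAnimationFrame(drawFrame);
}
video.addEventListener('play', () => requestAnimationFrame(drawFrame));
```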
Giri: We have a good case here, if
this is critical for content authors.
... My experience with canvas is that you cannot keep the same FPS
as the source content, I haven't been able to get that to
work.
... I don't think content authors would want to sacrifice the
quality of the video playback to satisfy this use case.
... It sounds like the comment is valid, I don't believe that
current browsers can satisfy it.
... I think we need some additional sources to say this use case is
important for content authors.
Chris: What we have now is really
pointing towards a newly defined DataCue API,
... and the ability to surface in-band events of particular types,
which may be different depending on the media format.
... The timing guarantees around that may be somewhat tightened up
based on the existing definitions around "time marches on".
... But I think there may be a bigger thing to be explored, which
is around this more integrated video and graphics pipeline.
... This is a scope issue for our task force, are we looking at all
of this together in one piece of work,
... or are we OK with focusing on a TextTrack like mechanism for
in-band and out-of-band events,
... with some consideration of timing of event extraction and event
propagation to the application?
... The integrated rendering case is a much bigger topic than we've
looked at so far,
... and may be something we would want to take back to the IG to
get guidance on.
... I'm responding to what we've seen in issue 4, frame
accurate seeking.
... It seems to have generated some interest from people, some are
outside the IG,
... people with use cases they're trying to achieve that may
require more precise control over timing and rendering.
... My suggestion would be to go back to the IG and follow up
there.
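[Scribe note: a hedged sketch of the TextTrack-like mechanism Chris describes above. The value/type fields on the cue are assumptions modelled on the DataCue direction; the actual shape is exactly what the WICG incubation would define.]
```typescript
// Receive in-band timed metadata events via a hidden 'metadata' text track.
const video = document.querySelector('video')!;

video.textTracks.addEventListener('addtrack', (e: TrackEvent) => {
  const track = e.track;
  if (!track || track.kind !== 'metadata') return;
  track.mode = 'hidden'; // deliver cue events without rendering anything

  track.addEventListener('cuechange', () => {
    const cues = track.activeCues;
    if (!cues) return;
    for (let i = 0; i < cues.length; i++) {
      // 'value' and 'type' are assumed DataCue-style fields, not standard
      // TextTrackCue members.
      const cue = cues[i] as TextTrackCue & { value?: unknown; type?: string };
      console.log('in-band event', cue.type, cue.value);
    }
  });
});
```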
Giri: We'll put this on the agenda
for the next IG meeting.
... There is an aspect to this which is frame accurate handling of
events, which will be difficult at 60 FPS.
Chris: Going back to your PR 25, is
there anything we want to say in addition?
... It currently says that the propagation should be considerably
less than 2 seconds. Should we try to break that down?
Giri: Yes, I can try to characterize
it from a client side perspective. You might be able to say from a
content originator perspective.
... Particularly for live content, this becomes a tricky
problem.
... we can take the worst case, where you see the ad insertion cue
when it's first ingested, before it's sent over some transport,
cable, over the air, or internet.
... From there, we can take out the expected transport delays and
see what the budget is for client side processing.
Chris: I will see what I can find out.
Giri: I worry some of the browser
vendors may come back and say that they can't make it work.
... SCTE also gives other insertion points that account for
different kinds of implementation.
... These are just examples, we may have some negotiation in WICG
with browser vendors.
Chris: I have a couple of pull requests open. The first one is about restructuring the use cases.
Chris: This is open to discussion, if
you think it's useful. I put each use case as its own
section,
... as opposed to the categories we're using: synchronised events,
synchronised rendering of web resources, and embedded web content
inside media.
... Sometimes what you have is an event carried in the media, which
triggers fetching a resource, which is then rendered. There's a
combination of actions that happens.
... I wanted to describe each use case as a whole. I'd like
feedback on whether this a good way to go.
... We didn't say much about rendering of embedded content at the
moment, e.g., in 3.3.
... It's not clear that these use cases are about embedding inside
the media container.
... If structured this way, I'd like to describe the use cases for
embedded media here.
... The distinction between these different use cases was less
clear, so I felt reorganising may help.
... With those three cases (social media, banner ads, accessibility
assets), do we expect the resources to be retrieved over the Web,
or carried inside the media container?
Giri: From what I understand of the
MPEG work, even if it's carried in the container, it could still be
requested over the internet.
... It may just be a trigger to the application to fetch a resource
at a particular time.
... The advantage of putting it in the media container is that it
allows direct rendering by a media player without an application,
if the track is authored in the right way.
... This is something we've found at ATSC, like with HbbTV as well,
when the user tunes to a channel there's an associated application
that can handle the interactivity.
... We can also remove that from the document for now, as the
standardisation effort is in progress.
Chris: Looking at ATSC as an example, you have the two media players. Does our document need to target the native player, or are we talking about the interaction between an application level player and the user agent?
Giri: MPEG has two models. One where
the media player handles all interactivity as part of the media
container.
... The other is more of a metadata cue model, where if an
application is present, the media player propagates the events to
the application, as we're envisioning with DataCue.
Chris: My overall feeling with
section 3 is that the use cases could use some clarification.
... I'll revisit that and see if I can do it differently.
Giri: That's fine, can you try to include it before the next IG meeting, ready for the snapshot?
Chris: Yes, I agree.
... The next pull request is to add details of the Webkit DataCue
that Eric posted in WICG.
... It's an extension of the HTML5 DataCue, more flexible. I just
added it to the gap analysis section,
... to capture what's presently implemented.
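[Scribe note: for reference, the WebKit extension mentioned here roughly takes the shape below; this is a reconstruction from the discussion, not the authoritative IDL in Eric's WICG post.]
```typescript
// Approximate shape of WebKit's extended DataCue: 'value' generalises the
// ArrayBuffer 'data' of the original HTML5 DataCue, and 'type' identifies
// the kind of metadata carried in 'value'.
interface WebKitDataCue extends TextTrackCue {
  value: any;
  type: string;
}
```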
... The SCTE-35 and WebKit pull requests I feel we can merge.
... I'd like to ask your feedback on section 6,
Recommendations.
... What I've tried to do is summarise the requirements that an API
should support.
... It may be a bit too emsg specific.
... [Describes requirement for subscribing to events by id and
value, as in HbbTV]
... Should there be an opt-in mechanism from the application
side?
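[Scribe note: a hypothetical illustration of the opt-in subscription Chris asks about, modelled loosely on HbbTV's subscription to DASH emsg events by scheme_id_uri and value. None of these function or type names exist; they only sketch what a proposal might look like.]
```typescript
// Hypothetical opt-in API: the application subscribes to in-band events
// by identifier, and the user agent only surfaces matching cues.
interface EventSubscription {
  schemeIdUri: string; // e.g. the emsg scheme_id_uri of interest
  value?: string;      // optional emsg value filter
}

declare function subscribeToInBandEvents(
  video: HTMLVideoElement,
  subscription: EventSubscription,
  onEvent: (event: { type: string; value: unknown }) => void
): () => void; // returns an unsubscribe function

// Example: subscribe to SCTE-35 splice events carried in emsg boxes
// (the scheme URI shown is the registered SCTE-35 binary scheme).
// const unsubscribe = subscribeToInBandEvents(
//   video, { schemeIdUri: 'urn:scte:scte35:2013:bin' },
//   (ev) => handleSplice(ev.value));
```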
Giri: We want to have specific
recommendations to WICG. Since this is a living document, we can
add new recommendations.
... I'm not so concerned it's emsg centric. It's recommendations to
WICG to help structure the work accordingly.
Chris: If it is too specific, I may
want to change that, the first few items in that list, subscribing
and unsubscribing.
... The other recommendations are for in-band and out-of-band
support, the DAInty mode 1 and mode 2 triggering.
... Did we cover all the topics from the agenda?
Giri: Recommendations to WICG and next steps. We had a meeting at TPAC with WICG.
Chris: Yes. There was agreement at
that meeting that we'd met the necessary bar for starting an
incubation.
... We have at least Apple interested in working on it. I contacted the WICG
co-chair, who was happy for us to go ahead.
... We just need a WICG repo, I'll follow that up.
... What I'd like to do is figure out who among us is going to work
on the spec development on the WICG side.
... I can invite people from the browser companies.
... Once we have a repo, we can create an initial specification
template, then start work on the details.
Rob: In the breakout session I
chaired, we agreed that WICG is the place we should be doing this.
... If we have a repo that will encourage people to participate,
and we can get them to join up and contribute.
Chris: I agree. We've more or less
reached the limit of what we can do as an IG,
... aside from publishing the document. We need to get this work
happening on the WICG side as soon as we can really.
Rob: I have some contacts at Mozilla and Chrome I can reach out to.
Chris: My plan is to follow up with
the WICG co-chairs, as only they can create the repo for us.
... Once we have that, we can contact people from all the browser
companies.
Rob: It helps that Apple are already involved.
Chris: I've had some offline
discussions with MS as well,
... as it's a general media industry need.
Giri: Let's go ahead and merge the two PRs discussed today.
Chris: OK, I will merge those two PRs, and think about the use cases section.
Rob: I can see what you're trying to do. I wonder if you should summarise the main points at the start, then follow with the example use cases.
Chris: The existing section headings
could become bullet points,
... with a section per use case
... describing the detail.
Rob: I did a similar thing in WebVMT,
describing the use cases in detail, and then a note at the end of
the benefits, as bullets.
... It picks out the key points, which may otherwise be lost in the detail.
Chris: I'll take a look, it could be a good model to follow.
<RobSmith> https://w3c.github.io/sdw/proposals/geotagging/webvmt/
Chris: I have a few actions, and I hope to have news on WICG soon.
[adjourned]