<kaz> Agenda: https://lists.w3.org/Archives/Public/public-web-and-tv/2020Oct/0010.html
<scribe> scribenick: cpn
JohnS: Two issues are important.
First, the CMAF Byte Stream format, developed by WAVE: there
are restrictions that should be respected for CMAF profiles
that are not captured in ISO BMFF
... When we started on Media Capabilities, we discussed CMAF
profiles, decided to create a polyfill library to ask for
profile support, translated into MCAPI calls
... Looks like this approach may not be possible - certain
capabilities can't be queried about, e.g., in-stream codec
parameters
... or because the profile parameter, which relates to the
container, is not supported in MCAPI
... Had some discussion in the last week, decided it would make
sense to have a WAVE / W3C joint meeting, to bring in the media
experts involved from WAVE
... Yesterday, we discussed why MPEG chose to identify media
profiles to begin with: why use 4 character codes for media
profiles?
... I've done an analysis of the byte stream format, to
identify what the byte stream format requires of the UA, that's
not in the ISOBMFF byte stream format
... Need to do a deep dive on that
... WAVE has produced a spec, not public yet, which is how to
do media capability reporting for CMAF media profiles. Need to
discuss that.
... There needs to be a more regular dialog between the people
working in WAVE, and the MEIG and Media WG
... More regular dialog could help solve these issues
... Summary: How to properly handle CMAF? It's becoming
dominant for commercial content. How to handle with MCAPI? How
to engage with WAVE and MEIG on more regular basis?
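The polyfill approach JohnS describes could be sketched as a table mapping CMAF media profile 4CCs to Media Capabilities configurations. A minimal sketch, assuming a hypothetical profile code 'cfhd' with illustrative bounds (not the registered CMAF values):

```javascript
// Sketch: translate a CMAF media profile 4CC into a
// MediaDecodingConfiguration that Media Capabilities can answer.
// The 'cfhd' entry and its bounds are illustrative assumptions.
const profileTable = {
  // hypothetical AVC HD profile: upper bounds of the profile
  cfhd: {
    type: 'media-source',
    video: {
      contentType: 'video/mp4; codecs="avc1.640028"',
      width: 1920,
      height: 1080,
      bitrate: 10000000,
      framerate: 30,
    },
  },
};

function cmafProfileToDecodingConfig(fourCC) {
  const config = profileTable[fourCC];
  if (!config) throw new Error(`unknown CMAF profile: ${fourCC}`);
  return config;
}

// In a browser, the polyfill would then ask:
//   const info = await navigator.mediaCapabilities.decodingInfo(
//       cmafProfileToDecodingConfig('cfhd'));
//   // info.supported is the yes/no answer for the profile's upper bound
```

Because the profile is a bound, a "yes" for the upper bound implies support for any conforming content in the profile, which is the property the polyfill relies on.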
Matt: Thanks for organizing the
meeting yesterday and doing the analysis between ISO BMFF spec
and CMAF
... You've done a good job to summarise the outcomes and next
steps.
... We want to make the web platform usable for CMAF, but also
don't want to restrict it to where the support for CMAF is so
specific that MSE implementations would say they don't support
CMAF related queries
... We need ways of identifying the right set of things to have
good support for CMAF.
... There's granularity: query using 4 character codes, return
a yes/no answer vs very granular queries, some of which may be
too much for an app to deal with
... We're not sure about the 4 char codes, need more detail on
that and where they came from
JohnS: What's the right level of
granularity? Most desktop playback players, you just try to
play the content however you can. If you have to say no, even
though you support most aspects, that would be unwelcome
... Issue of enhanced audio codecs that need in-stream codec
parameters - would be a yes/no question
... So some tests are yes/no, some are better. For 608
captioning, an app would want to know if it's handled, to use
an alternate method instead
... I'm not the expert on how the CMAF profiles came about,
Dave Singer and Kilroy Hughes, or Cyril or Thomas Stockhammer
could explain that
... One idea from Chris C yesterday: the media profiles are
useful on the encoding side, and we could test if you're
compliant with the profile
... For network efficiency, don't want a combination of media
segments. Could be less useful on the decode side, to create an
efficient set of catalogs
... That seemed it could be true to me. From a playback
perspective, does the 4 character code tell you something
useful about playback? Any input on that?
Will: The profile sets maximum
constraints. If you support the maximum, you support everything
in it. The 4CC codes are essentially a label for
characteristics, expressed as bounds
... The content may not require the maximum itself
... I want to solve the problem of a player inspecting the
environment to see if it can play sets of content
JohnS: It has the color primaries, transfer characteristics
Will: But those aren't in the playlist. The DASH/HLS manifest should be sufficient to decide if it can play a piece of content
ChrisC: Thinking about it as a
bound, this guided us towards the polyfill solution. You could
query using that bound to get a yes/no answer
... The issue that came up yesterday is that there are some
aspects of the container that MCAPI has no way to support,
e.g., this quirk of CMAF
... We could add to MCAPI, but is it a good idea? We have a
stable fixed definition in MSE so far, not had lots of issues
come up for fMP4
... For CMAF to be friendly to users and UAs, you'd want that
model for CMAF also.
JohnS: One of the big issues is
that if you look at a CMAF media profile (e.g., a vanilla
profile, AVC, and one that's more esoteric, e.g., enhanced
audio) - what traits are specific to the CMAF media profile
that would be required to be supported for the UA to render the
content as intended?
... For example, there's a bitrate range or resolution range
specified, the app would want to know if the upper and lower
bounds are supported
... They want to know which format to switch to, depending on
the traits
... Are there container-specific traits, in how CMAF requires
fMP4 to be constructed, that go beyond ISO BMFF?
Matt: Placement of emsg boxes can
vary, but for MSE they need to be collocated in the media
segment, to get deterministic parsing, timestamp handling
... The handling depends on where the box is in relation to the
media segment. The details need to be locked down a bit
more
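Matt's point about deterministic parsing is easier to see with a concrete box walk. A minimal sketch, assuming well-formed 32-bit box sizes at the top level (no 64-bit `largesize` handling, no descent into containers), that scans an ISO BMFF segment for boxes of a given type:

```javascript
// Walk top-level ISO BMFF boxes and return the byte offsets of boxes
// whose 4CC type matches `wanted`. A real parser would also handle
// 64-bit largesize, uuid boxes, and nested containers.
function findTopLevelBoxes(bytes, wanted) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const found = [];
  let offset = 0;
  while (offset + 8 <= view.byteLength) {
    const size = view.getUint32(offset); // box size in bytes, incl. header
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7));
    if (type === wanted) found.push(offset);
    if (size < 8) break; // malformed (or size 0, "extends to EOF"); stop
    offset += size;
  }
  return found;
}
```

Whether an emsg found this way applies to the following moof, or can sit at chunk level, is exactly the placement question that needs to be locked down.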
JohnS: The TPAC discussion on
where emsg boxes could reside - at top level in media segment.
One of the things I mentioned is that emsg is defined by DASH,
not ISO BMFF. It's now in CMAF, but it's never been a base
level box in part 12.
... Can you put it in the init segment, or can it be at the
chunk level, clearly needs to be normalised
... Media profiles are orthogonal to emsg boxes. It could be in
any track, not just the video
Will: It is being used. Want to avoid parsing in JS, can be inserted in any MOOV atom
Matt: Should the content of the
emsg be better represented out of band?
... MSE inherently allows relocation in time of segments. The
emsg could be meant for 10 seconds in the future, so at what
time should the emsg fire?
Will: There are two modes: trigger now, or inform of an impending change in the future. This is a good case for being out of band. There are also good reasons for putting them in-band: it's one less request a client needs to make
Pierre: On the question of
profiles, they're for the sender and receiver to agree on
capabilities that need to be supported. It seems the profiles
aren't adequate to communicate that
... There's no agreement in the industry on what's a reasonable
set of capabilities for players. It could be hopeless to try to
parameterize all the options
JohnS: It used to be that you'd
say: the client device must have the following capabilities,
closed ecosystem. We're moving to a world with feature
capability detection to figure out what the device can
do.
... We don't have a lot of experience with the feature
detection approach. It feels like an intractable problem,
because with too much granularity you have to ask a huge
checklist of questions
... vs having a 4CC which you can query about
... We haven't found the right level of granularity to ask
those questions, which is what we're trying to sort out
... We want to come up with the set of 6 or so questions that
are yes/no
ChrisN: We filed an issue for Chrome and Firefox to support emsg years ago
Pierre: emsg is just one aspect,
e.g., ICtCp, v2, HLG, what's the target peak luminance? Similar
questions for audio. How do we parameterise it?
... Or pick some constrained profiles
JohnS: That was part of the reason why CMAF media profiles were created, to be able to ask these questions
Pierre: Yes, they're for exactly that. The profiles in SC?? were either too constrained or not constrained enough. Need to get the industry together again to figure it out
ChrisC: You mentioned it might be
folly to break the profile into its constituents, as there
could be too many. But if we don't do that, and if MCAPI
understands 4CC, could still be folly
... Under the hood, you still need to understand all the
variants. You want maximum support everywhere, one flavour of
CMAF to accomplish that
... There's been some exploration of what profiles would look
like, would love to see the polyfill, a flat list of the
profiles and what they need
... From what I've seen it may not be constrained enough
Yasser: Do manifests and playlists play a role, or is it just about the CMAF content?
Will: If it's not exposed at the
playlist/manifest level, it would have to load an init segment
or some content and expose it to the browser to ask if it can
be played. That wastes time
... Packaging process should expose the information at the
playlist level, and that should be sufficient, to be most
convenient. We shouldn't have to actually load the content to
see if we can play it
JohnS: Agree with Will. That was
sort of the intent, use the manifest data to query the UA
... So how to do that? What should be in the manifest for the
UA to determine if it can play?
... It's not just about asking if you can play the codec. There
are other essences that need to be asked about. What experience
do people have with the adequacy of what's in the manifest? Is
there a problem there, e.g., color space for example?
... Has the DASH-IF looked at the MCAPI?
Will: We looked at it over a year ago; there have been additions since then, not sure if a gap analysis has been done
JohnS: I'm hoping the WAVE MC
spec can be shared soon, so we can have a call where we go
through the analysis - is it adequate for the polyfill?
... In the CTA WAVE spec, we've identified the primary media
profiles, a formal process to approve what gets in the spec.
There are maybe 12 video profiles and 8 audio profiles
... It summarizes the characteristics of each profile in a
table
... For a follow on meeting, between WAVE, MEIG, Media WG, we
want to have Apple, MS, Mozilla participating. Would be useful
to put together an agenda for discussion
... Should include reviewing MCAPI, the CMAF byte stream
spec
... The objective should be: what does the media stack need to
change to support CMAF content? What are the unanswered
questions?
... Also to have more regular conversation between the
groups
... I'll work at WAVE to get the Media Capability document
released. The review draft of the CMAF bytestream spec may have
changes incoming (Zach Cava), send to this group, and set up a
meeting to discuss soon, before the holiday
season
Matt: I agree with that. Is a text-based document enough to ask what the platform can play? May want to include some optional capabilities. DataCue may be unsupportable on some platforms where the stream would otherwise be playable; let the app decide if the platform is sufficient
John: I don't think we have a document that states what's optional and required for the CMAF profile. It's not just limited to CMAF, so it would be a good thing to bring up at DASH-IF, and also with Roger Pantos at Apple for HLS
ChrisN: Also constraints on the output channel, e.g., HDMI
ChrisC: The separation of responsibilities is that MCAPI is for what the UA understands separately from display capabilities. Those are handled by CSS, CSSOM View, we've landed some spec changes for those cases
JohnS: We could create some test
examples, e.g., this display resolution, with this codec, HDCP
2.2 - can you handle that specific use case?
... If we have those questions, we can test against MCAPI, to
see how we might change the API if it's not yet supported
ChrisC: MCAPI only has an
editor's draft so far. That draft has those additions. The
design of those was hashed out between MS, Apple, Google,
Netflix, lots of back and forth.
... Should meet your needs for questions on HDR, but interested
to learn about any gaps and address them
... For questions about the display, it's not handled by the
MCAPI, as it's about the screen not the decoding. We're working
to surface properties there, also media queries for
resolution
... We've added dynamic range with 'standard' and 'high'
values, in the CSS spec, not implemented yet
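An app could combine that display-side signal with decode-side MCAPI results. A sketch, assuming the drafted 'dynamic-range' media feature with values 'standard' and 'high'; matchMedia is browser-only, so only the query-string assembly is shown as a plain function:

```javascript
// Assemble a media query string for the drafted CSS 'dynamic-range'
// feature. Values other than the two drafted ones are rejected.
function buildDynamicRangeQuery(level) {
  if (level !== 'standard' && level !== 'high') {
    throw new Error(`unknown dynamic-range value: ${level}`);
  }
  return `(dynamic-range: ${level})`;
}

// In a browser, the display-side check would be:
//   const hdrDisplay =
//       window.matchMedia(buildDynamicRangeQuery('high')).matches;
// combined with a decode-side MCAPI query for the HDR codec itself.
```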
JohnS: How does someone know which specs to look at?
<kaz> Media Capabilities draft
ChrisC: It's not written down yet
JohnS: Would be great to have
such a document
... It also would become a useful explainer to help do
capability querying
Matt: Would have to be a living document. EME has a proposal for a key status reflecting HDCP status, to know if HDCP 1.0 is needed. That's in EME, but needed for a playback scenario
ChrisC: I'd be happy to produce
an explainer document for the HDR pieces
... The intent for MCAPI is to have parity with EME
capabilities
... Joey Parrish owns the EME spec
... HDR capabilities can't be answered with
requestMediaKeySystemAccess. Continually adding features to
both specs doesn't make sense, so let's advance MCAPI; for
HDR properties there's no plan to add to EME, do it via
MCAPI
JohnS: Can I query "do you support this encryption mode, this codec" in combination?
ChrisC: Yes, that's a feature of
MCAPI, implemented in Chrome. Need to know if in EME context,
as the decoders can vary widely
... EME will be used as-is for playback after the query. Don't
want to revisit that spec
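The combined question could look something like this sketch; the bounds and the Widevine key system string are example values, and keySystemConfiguration follows the Media Capabilities editor's draft:

```javascript
// Build a MediaDecodingConfiguration that asks about codec and
// encryption support in one decodingInfo() call.
// Width/height/bitrate are illustrative bounds, not normative values.
function buildEncryptedDecodingConfig(contentType, keySystem) {
  return {
    type: 'media-source',
    video: {
      contentType,
      width: 1280,
      height: 720,
      bitrate: 3000000,
      framerate: 30,
    },
    keySystemConfiguration: {
      keySystem, // e.g. 'com.widevine.alpha'
      video: { robustness: '' }, // robustness left open in this sketch
    },
  };
}

// In a browser:
//   const info = await navigator.mediaCapabilities.decodingInfo(
//       buildEncryptedDecodingConfig(
//           'video/mp4; codecs="avc1.640028"', 'com.widevine.alpha'));
//   // info.supported answers the combined question;
//   // info.keySystemAccess can then be used for playback.
```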
JohnS: What we can do as WAVE is create the list of queries we want to do, and compare with the explainer
Kaz: I agree with John's proposal
to look at use cases and gap analysis for those scenarios
... FYI, the WoT group is working on 3 specs: Thing
description, scripting API, profiles of device description.
Looking at these approaches could be useful for this, e.g.,
Thing Description might be useful for the discussion on
possible extension for the manifest
<kaz> Thing Description to describe the server's capability based on JSON-LD
<kaz> Scripting API to handle the description
<kaz> Profiles of the server description for interoperability
ChrisN: So next steps: Chris C will prepare an explainer, showing how to use MCAPI and CSS Media Queries and Screen object together. In the WAVE project, we can look at the media profiles and see how those map to MCAPI calls, possibly produce a polyfill library. Once that's done we can schedule a follow up call to discuss the outcomes, e.g., any possible API gaps. Could be a MEIG meeting or a WAVE meeting
JohnS: I'll follow up with you on that.
[adjourned]