Meeting minutes
TPAC planning: WebRTC WG / Media WG joint meeting
Chris: Topics for WebRTC joint meeting?
Jer: Media capabilities and harmonising with RTC
Jan-Ivar: Main topic is MediaCapture Transform or its replacement, about exposing real-time audio and video between MediaStreamTrack and JS
… It now seems it will be Streams-based. Some issues are open around that
… Specific to WebCodecs are video frame lifetime, close and clone methods
… GC cleanup. If you do a tee to branch a stream into two, there's no cloning by default
… When one branch closes the video frame, it stops working in the other. We also want to discuss audio issues
… We invited the Audio WG to also join
Bernard: TPAC is a good time for overviews or relationships between things
… So the overall direction we're going with media, using Streams to create pipelines to process media
… Future of content protection. S-Frame in WebRTC is an encrypted content form, doesn't work with WebCodecs
… Questions about transports. Looking at the overview, what's missing, how does it all fit together for developers?
… We get lots of questions from developers. How to render: WebGPU, Canvas, etc. Any recommendations?
… Coherent view on Workers
Chris: Anything on WebTransport specifically?
Bernard: WT supports workers, RTCDataChannel doesn't, but that's proposed, it's an extension spec. PeerConnection doesn't support workers
… So lots of different things, good to look at the overall picture. What are we not doing?
Jan-Ivar: Opportunity to look at the overall picture, for example by standing up an alternative to WebRTC using the other APIs. Looking at how that fits together and what does and doesn't work
Bernard: Developers ask how all the parts fit together. Could present what we think the overview is, see where there's agreement. Does it make sense?
Jan-Ivar: We'll make slides in WebRTC WG. Media WG can contribute to that
… A question could be: If a source and sink are streams-based, why not also the bit in the middle?
… Why use streams instead of promises for media capture transform?
… Why isn't encode promise based?
… The streams model would be able to handle it.
Bernard: We could present the story, issues, questions people ask. Streaming and RTC are converging, low latency streaming
… Using EME with WebRTC doesn't work today. Could put some examples in
Jer: Can you come up with the overview?
Bernard: That's for the first hour, then audio for the second hour
MSE Bytestream Format
Cyril: I opened 4 issues
… Generally, the intent was to read and understand the spec, check for conflicts with other specs, if any
… And check interop across browsers. I also wrote some unit tests, hand-editing MP4 files and feeding to an MSE based player
… I found some surprising results
… I'm migrating the tests to WPT
… Issue #4 ftyp box
Cyril: Wording in the BSF spec says the ftyp box is part of the init segment, and the UA should run the error algorithm
… if there's a mismatch. It seems to ask a lot from browsers
… Requires the browser to validate. In my tests, I found the ftyp box is ignored. If I use an init segment with a moov box it's fine
… I can put anything in the ftyp box and everything is accepted
… My understanding of what's implemented today is that we shouldn't say anything about the ftyp box
… Just say an init segment is just a moov box with some constraints
… There's a similar statement about the segment type box that could be safely ignored
Jer: Wasn't there a move in the MEIG to have a restrictive ftyp that would throw errors in cases where extra boxes would be ignored? If we remove this, will there be a request to add it back?
ChrisN: The CMAF BSF discussion
Cyril: I argued against that. If all browsers implement the same thing, why take it out?
Jer: I agree. It's that they didn't have separate mime type so wanted to use ftyp
… I think if we can parse the file, we should, and not throw errors. Being relaxed in error handling is consistent with the web approach
… I'm fine with adding it to the list of boxes to ignore
… But concerned that others will object
Cyril: I doubt browsers will verify conformance to brands. Browsers and players try to do their best with the content
Jer: Making WPTs would answer these questions. Update the spec to match browser behaviour
Cyril: The spec says that if the box contains a compatible brand the UA doesn't support, it should fail. That's the opposite of what ISO BMFF says
Jer: I wonder if what's written was the opposite of the intent
Matt: I agree that the ftyp box is superfluous in Chrome
… Is there a case where folks want to play streams but they want capability detection based on ftyp?
… As no browsers filter on this, we should remove it from the spec. It's in the list of boxes that should be skipped in implementations
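For illustration, a minimal Python sketch of what reading the ftyp box involves, assuming the standard ISO BMFF layout (a 4-byte major brand, a 32-bit minor version, then a list of 4-byte compatible brands). The names `parse_ftyp_payload` and `lenient_brand_check` are hypothetical, and the lenient policy is only one reading of the "parse what we can, don't throw" approach discussed above, not the behaviour of any particular browser.

```python
import struct
from typing import NamedTuple, Set, Tuple


class FileTypeBox(NamedTuple):
    major_brand: str
    minor_version: int
    compatible_brands: Tuple[str, ...]


def parse_ftyp_payload(payload: bytes) -> FileTypeBox:
    """Parse the ftyp payload, i.e. the bytes after the 8-byte box header."""
    if len(payload) < 8 or (len(payload) - 8) % 4 != 0:
        raise ValueError("malformed ftyp payload")
    major, minor = struct.unpack(">4sI", payload[:8])
    # Every remaining 4-byte group is one compatible brand.
    brands = tuple(
        payload[i:i + 4].decode("latin-1") for i in range(8, len(payload), 4)
    )
    return FileTypeBox(major.decode("latin-1"), minor, brands)


def lenient_brand_check(ftyp: FileTypeBox, supported: Set[str]) -> bool:
    """Hypothetical policy: report whether any brand is recognised.

    The result is informational only; a lenient parser would keep going
    either way instead of running an error algorithm on a mismatch.
    """
    return ftyp.major_brand in supported or any(
        brand in supported for brand in ftyp.compatible_brands
    )
```

A caller could log the result of `lenient_brand_check` for telemetry while still appending the segment, which matches the observation that the box contents are ignored in practice.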
Cyril: Next issue, support for edit lists
Cyril: The spec currently says browsers must support one type of edit lists
… An offset edit list, which offsets the composition time when you have B-frames
… The edit list maps the non-zero composition time to a presentation time of zero
… The spec is silent on other types of edit lists
… Can you use fractional or zero rate? Can you use empty or multiple entries in edit list?
… Rare to have interop in those tests. Mostly, browsers ignore edit lists they don't support. I think they should fire an error
… Could lead to A/V sync issues
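For illustration, a minimal Python sketch of the edit list kinds mentioned above, assuming the ISO BMFF edit list entry fields (segment_duration, media_time, and a 16.16 fixed-point media rate split into integer and fraction), where a media_time of -1 marks an empty edit and a rate of zero marks a dwell. The names `EditEntry`, `classify_entry` and `presentation_time` are hypothetical, and the labels are not taken from the spec.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class EditEntry:
    segment_duration: int         # movie timescale units
    media_time: int               # media timescale units; -1 marks an empty edit
    media_rate_integer: int = 1
    media_rate_fraction: int = 0  # low 16 bits of the 16.16 fixed-point rate


def classify_entry(entry: EditEntry) -> str:
    """Label one edit list entry by the kind of edit it describes."""
    if entry.media_time == -1:
        return "empty"            # gap on the presentation timeline
    if entry.media_rate_integer == 0 and entry.media_rate_fraction == 0:
        return "dwell"            # zero rate: hold one media time
    if entry.media_rate_integer != 1 or entry.media_rate_fraction != 0:
        return "rate-change"      # fractional or non-unit playback rate
    if entry.media_time > 0:
        return "offset"           # the kind the spec requires support for
    return "identity"


def presentation_time(composition_time: int, entry: EditEntry,
                      timescale: int) -> Fraction:
    """Map a composition time to seconds on the presentation timeline,
    for a single offset (or identity) edit."""
    return Fraction(composition_time - entry.media_time, timescale)
```

With a 90000 Hz timescale and a first frame whose composition time is 3003 because of B-frame reordering, an offset edit with media_time 3003 maps that frame to presentation time zero.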
Matt: I have a concern about raising a decode or parse error on content that previously played successfully
… Could be a note for clarification on which edit lists have interoperable support, and others would be ignored
Cyril: I tested fractional rates, empty edit list (should fill the timeline with a gap)
… Maybe deprecation first, then removal in a future edition
Matt: Are there components of these edit lists that could be handled with other parts of MSE, such as timestampOffset or playbackRate, so that MSE could allow applications to polyfill them?
Jer: Hard to polyfill with muxed tracks
Matt: Do we have any stats on existence of these kinds of edit lists?
Cyril: There's one that must be supported
Matt: So a note to say the others should be ignored by implementations
Cyril: I'd prefer to say "may be" ignored, and content providers "should not" use
Matt: Makes sense, also gather data
Jer: Other uses? Offset is needed for B-frames. What about multiple playback rates, other use cases?
Cyril: Empty edits could be used, when you want to align audio and video, you can either remove some video content to start at the audio start
… or say the audio has a gap, and the player should play video without audio until audio starts
Jer: Another option, if we find that these are being used in the wild, could add a "should" statement for the empty edits, or others with valid use cases
Matt: We don't have enough data now. If there are use cases that can't be solved ergonomically in the MSE API, can address at a later time
… If we don't see people complain that playback isn't working, should we then add telemetry?
Cyril: I think it's important to document what content creators can rely on
Matt: So, document that they may be ignored, and that content providers should not use them. File a GitHub issue so people can reply and bring it to our attention
Cyril: Next is #6, support for unknown boxes. Boxes accepted and ignored
… Not sure what is meant by valid top-level boxes
Matt: If you put an out of order box, such as a moof before a moov. The spec handles that, but are there other cases?
Cyril: ... That's the next issue on the number and order of boxes...
Cyril: I tested an unkn box
Matt: Is that in the spec? How do we know it's a top-level box?
Cyril: From where it's placed in the stream
Matt: If it's not defined as a top-level box in the normative spec
Mark: All the boxes that the ISO spec says are allowed to appear at the top level
Cyril: Anyone can add other boxes at the top level if they want
Matt: If we need to bind the MSE spec more closely to ISO BMFF, we can
Cyril: Is the intent to ignore unknown boxes at the top level?
Matt: It could indicate the stream is malformed
Cyril: Concerned it doesn't scale. Each time ISOBMFF spec changes, you'd have to change implementation
… It has happened before
Jer: If a box not defined at top level is found at top level, could throw an error. But it's OK to skip an unknown box
Cyril: Just consume the bytes and continue parsing
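For illustration, a minimal Python sketch of "consume the bytes and continue parsing", assuming the standard ISO BMFF box header (a 32-bit size and a 4-character type, where size 1 means a 64-bit size follows and size 0 means the box runs to the end of the data). The names `walk_top_level_boxes` and `known_boxes`, and the `KNOWN_TOP_LEVEL` set, are hypothetical; the set is an illustrative subset, not the list from the byte stream format spec.

```python
import struct
from typing import Iterator, List, Tuple

# Illustrative subset only; a real implementation would use the spec's list.
KNOWN_TOP_LEVEL = {"ftyp", "styp", "moov", "moof", "mdat", "sidx"}


def walk_top_level_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in data."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 8:
            raise ValueError("truncated box header")
        size, raw_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if len(data) - offset < 16:
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("box size out of range")
        yield raw_type.decode("latin-1"), offset, size
        offset += size  # skip the payload without interpreting it


def known_boxes(data: bytes) -> List[str]:
    """Return the types of recognised top-level boxes, skipping the rest."""
    return [t for t, _, _ in walk_top_level_boxes(data) if t in KNOWN_TOP_LEVEL]
```

This sketch needs the whole box in memory before it can skip it; a streaming parser would instead track how many bytes remain to be discarded.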
Matt: Could there be a malformed stream that causes the implementation to hold onto large blocks of data?
Jer: Seems like an implementation detail
Matt: Some implementations may see 2 gigabytes as too large and couldn't skip
… We have a quota exceeded mechanism. Just thinking through implementation-based considerations
… In terms of API usage, there was one user-defined box, proposed by the BSF, but that source was unaware of pre-existing top level boxes they could have used, JS level parsing
Cyril: The unkn box is one I invented
… ISO BMFF recently introduced compressed boxes (gzip). The sidx can be replaced with !sdx, defined in ISO BMFF
Cyril: Let's discuss how to make the spec changes
… Should I open an issue for each problem, then a PR. Or propose a rewrite as a draft and review the whole thing?
Matt: Depends on scale. If just a few, discuss as one-offs
… Design principles about not wanting to regress
Cyril: What about small issues? I'd like to be able to rewrite the text and review as a whole
… Agree on the intent of the issues, then make a PR
Jer: Seems reasonable to close a number of issues in one PR
Matt: An issue per item sounds good, and a PR that addresses multiple
Cyril: The BSF is a Note. Why is it not on the Rec track? If there are tests and implementations can be compliant to it, why a Note?
Matt: We focused on testing MSE itself and not so much the BSFs. An implementation must support *a* BSF, so implementations may support different ones
… Allowed more flexibility at the time
Francois: Another reason it's a note relates to patents. It more directly relates to codecs, so didn't need to ask about the royalty free patent policy
ChrisN: What about Process 2021, provides a structure for registries and entries?
Matt: WebCodecs has taken a similar approach to MSE for registries. Anything we can learn from that?
Francois: Similar reasons, would make sense to have them as Rec track specs
Matt: It's been easy to propose and support new entries fairly quickly
Cyril: I think it's fine if the entries aren't all at the same maturity level
Matt: I'd need to check with colleagues on that
Cyril: I created WPT for this spec. It may need some more work. Is there a link between WPT and this WG?
Matt: Existing tests were bound to the API itself, not so much a specific BSF. The tests are mostly testing the API for a supported format
… Testing BSF format more deeply is good, put into a subfolder
Cyril: I'll start a PR. How do you detect an error? Buffer range, error event, etc?
Matt: There's a proposed introspection API that could help with that
Cyril: Thank you
Matt: FPWD of MSE v2 and short name
Francois: It'll be published on Thursday, no need for a new CfC
Matt: Update the SoTD?
Francois: Yes, feel free to do that. In future we'll switch to automatic publishing to /TR
Chris: Second screen WG would like a joint meeting to talk through some issues around capability detection
… Will schedule that for an upcoming call, possibly next time?
[adjourned]