W3C

– DRAFT –
Media WG meeting

13 June 2023

Attendees

Present
Bernard_Aboba, Chris_Needham, Eugene_Zemtsov, Greg_Freedman, Jean-Yves_Avenard, Jer_Noble, Mark_Watson, Peter_Thatcher, Sun_Shin
Regrets
Dale_Curtis, Francois_Daoust
Chair
Chris_Needham, Jer_Noble
Scribe
cpn

Meeting minutes

Coding tool configuration

Bernard: For AV1 we added the ability to configure screen coding tools. We found an interesting gap, looking at other needs
… We thought WebCodecs would not need to consume content hints, for audio things like speech / music
… and similar for video. We noticed that it wouldn't work in WebRTC
… Looking at the Chromium source, this is how WebRTC optimises for screen sharing
… For H.264 it sets a screen content realtime mode, and changes the QP range
… We have Per-Frame QP in WebCodecs for AVC
… For AV1 it enables the palette control, and turns on screen content mode tuning
… What's interesting is the screen content realtime control. It's implementation dependent; it's in OpenH264
… So we can't put it in the registry, as it's only in OpenH264
… Does it make sense for WebCodecs to consume a content hint to set the screen content realtime control?
w3c/webcodecs#478
… There's an analog for audio, a speech / music hint

Eugene: In libvpx there's tuning for certain content types
… So there are probably more. We couldn't put all the possible switches in the registry. They're not video codec specific, but specific to the particular encoding library
… If they're just hints, and so up to the UA to use or not, we should totally do this

Bernard: So we're agreeing that we don't change anything in the registries, so it doesn't affect per-frame QP?

Eugene: Yes. It would just be a hint in the encoder config. If nothing interferes with the hint, the UA can use it as it thinks necessary

Bernard: I'd be concerned if the content hint moved the per-frame QP; that would be weird

Jean-Yves: Why pass it to WebCodecs? Would it be internal to the WC implementation? Why pass the data if you've configured the encoder or decoder for the particular site?

Bernard: The content hint is a property of the track. There's no equivalent property on the VideoFrame, so there's no current way to pass it as an argument
… Eugene is proposing we have a way to pass it in the encoder config, as it can't be done today

Jean-Yves: I was talking about modifying the encoder itself, as this would be for real-time sharing, for example
… You could have two encoders. You can implement this with what there is today: an encoder for screen sharing and an encoder for non-screen-share content
… I can see this evolving, adding lots of config over time for particular use cases, could be endless

Eugene: To some extent, it's possible we'll add options if the WG thinks they're useful
… Not sure what you mean by multiple encoders. The website needs a way to signal to the UA to be prepared to encode text or screenshare, or to encode camera input
… This is a way for the site to signal to the UA what kind of encoder it needs
… So what you say doesn't contradict the proposal. IIUC you're describing an implementation detail
… WebCodecs has just a single VideoEncoder class, so pass in config

Jean-Yves: I see the problem now, OK

Peter: I think it's a good approach. I like that concrete settings override this; it changes the default behaviour
… Makes sense as a higher level control rather than lots of little ones

Chris: Agree with Peter, higher level seems preferable

Bernard: We'd blow up the registries if we added every possible option

Eugene: I wondered if we need the AV1 content tools in the registry

Bernard: The question is whether the content hint is taken into account. Screen content can be disruptive

Chris: To clarify, this would apply to all codecs, and only be applied if the right implementation is available

Bernard: Yes, but with checks that audio hints don't go to the video encoder and vice versa
… And should we include it for audio too?

Peter: Would make sense, yes

Chris: Hearing no objection, sounds like that's a conclusion
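For illustration, a minimal sketch (not spec text) of what this direction could look like from script, assuming a contentHint member on the WebCodecs encoder configs with values borrowed from MediaStreamTrack.contentHint; the actual member name and values are being worked out in w3c/webcodecs#478:

  // Hypothetical contentHint member on the encoder configs; the name
  // and values are assumptions, not settled spec text.
  const videoEncoder = new VideoEncoder({
    output: (chunk, metadata) => { /* mux or send the encoded chunk */ },
    error: (e) => console.error(e),
  });
  videoEncoder.configure({
    codec: "av01.0.04M.08",
    width: 1920,
    height: 1080,
    contentHint: "text", // hint the UA to tune for screen/text content
  });

  const audioEncoder = new AudioEncoder({
    output: (chunk, metadata) => { /* mux or send the encoded chunk */ },
    error: (e) => console.error(e),
  });
  audioEncoder.configure({
    codec: "opus",
    sampleRate: 48000,
    numberOfChannels: 1,
    contentHint: "speech", // assumed audio analogue: speech vs. music
  });

Because these are only hints, a UA without a matching encoder-specific control (e.g. OpenH264's screen content realtime mode) can ignore them, and explicit settings such as per-frame QP take precedence, per the discussion above.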

ManagedMediaSource

Chris: Nice presentation, Jean-Yves!
… Want to discuss Mark's questions, and then talk about PRs and editing

Jer: Start with a quick refresher on the proposal?

Jean-Yves: We wanted to have the ability to enable MSE on iPhone
… We had concerns about battery life in comparison to HLS, so we wanted to find a way to add hints to the player
… to give guidance so the player fetches media at specific and given times
… the most power-hungry aspect of players is fetching data: turning on the 5G modem drains power
… Initially we thought we'd need to enforce only appending data at a given time
… But it turned out that wasn't necessary to achieve our power saving goal
… We try to provide hints on when to fetch data and when to stop, so the MediaSource has a bit less free rein, reducing power usage
… We added ability to evict content at any time, in low memory situations, rather than killing the page
… This is ManagedMediaSource in WebKit; they're hints only
… Internally, when the streaming attribute is true, we tag all requests as "media". This turns on the 5G modem
… When we send the endstreaming event, we stop tagging requests and the modem goes into a low power state
… We looked at what's useful in HLS, try to provide hints so MSE would behave in a similar fashion
… Hints to request a particular quality could be a fingerprint issue, so may not work
… The original idea was how to make MSE more like HLS to make some manufacturers happy, including the one I work for
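As a rough usage sketch of the hints Jean-Yves describes, using the ManagedMediaSource constructor, streaming attribute, and startstreaming/endstreaming event names from the WebKit proposal (details may change; fetchNextSegment and pauseFetching are hypothetical helpers):

  // Prefer ManagedMediaSource where available, fall back to plain MSE.
  const source = window.ManagedMediaSource
    ? new ManagedMediaSource()
    : new MediaSource();

  const video = document.querySelector("video");
  video.src = URL.createObjectURL(source);

  source.addEventListener("startstreaming", () => {
    // UA hint: the network path is ready (e.g. 5G modem on); fetch now.
    fetchNextSegment();
  });
  source.addEventListener("endstreaming", () => {
    // UA hint: stop issuing media requests so the modem can idle.
    pauseFetching();
  });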

Mark: Sounds cool, good to see MSE coming to iPhone

Chris: +1 Mark

Mark: My comments are exploring options. The buffer removed events, to give the site more info on what's in the buffer, seem independent and could be standalone
… Could they be added to MediaSource?
… For discovery, adding the event handler could be enough
… Would it make sense to separate this off?

Jean-Yves: From an implementation point of view, moving that event is not a technical challenge
… But just the event, not when the eviction can occur?

Mark: Yes, just events telling you when eviction has happened

Jean-Yves: Putting it inside ManagedMediaSource makes it clear you must listen to the event or the player will stall
… With MSE you know when it will be removed

Mark: So as well as the events for data eviction, you're also adding the ability to evict at other times, not just during the two MSE operations

Jean-Yves: The site must listen to the event, otherwise it could stall

Mark: So in using ManagedMediaSource you're giving permission to the UA to do that. Makes sense

Jean-Yves: I believe it greatly helps the player. Every single library works on a time range, so needs to manage the buffer
… So moving it would lose that hint

Mark: So I don't have a strong opinion either way now, you could move it or not

Jean-Yves: Google at the time suggested it could be nice to get the site itself to evict. The issue there was there could be multiple players, e.g., in hidden or suspended tabs
… When there's memory pressure you have to react quickly, so needs to be in the UA to have that response time
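A sketch of how a player might react to this UA-initiated eviction, assuming the bufferedchange event and its removedRanges attribute as in the WebKit explainer (markSegmentsForRefetch is a hypothetical helper):

  const mms = new ManagedMediaSource();
  // ... attach to a video element as in the earlier sketch ...
  mms.addEventListener("sourceopen", () => {
    const sourceBuffer = mms.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
    sourceBuffer.addEventListener("bufferedchange", (e) => {
      // The UA may have evicted data under memory pressure; re-queue any
      // removed ranges the player still needs before playback reaches them.
      for (let i = 0; i < e.removedRanges.length; i++) {
        markSegmentsForRefetch(e.removedRanges.start(i), e.removedRanges.end(i));
      }
    });
  });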

Mark: The MSE design fully decouples downloading content from buffering. The design makes no assumptions about what you're downloading and when you're appending
… We've found that extremely useful at Netflix, lots of use cases enabled. So I get nervous that we might curtail sites' ability to do those things
… So it occurred to me that the duty cycle for using the expensive modem could be useful outside of streaming, so we could talk about it for downloads also

Bernard: RTCDataChannel and WebTransport are also used with MSE, so not just HTTP adaptive streaming

Jean-Yves: The assumption with MMS was to have something very lightweight: replace the MS in the player with MMS
… All are hints, provide as little change over MSE as possible, everything event driven rather than use promises
… It's an MSE extension in some way. Your question makes sense; the way the proposal is phrased is having a way to say: if you don't follow the hints we'll block you
… But it evolved in implementation that this wasn't necessary
… If you call appendBuffer and streaming is not allowed, we could throw. Would you still have the same reservation?

Mark: Blocking the append would make it problematic
… I see three distinct things going on, so could be useful to separate out. Those are:
… The start/stop streaming events, that tell you when you have more efficient transport. Those could go on Window, and time your downloads for any purpose
… There's some logic not in the proposal that's UA specific, to choose when to send those events (turn the radio on/off). That could take into account whether you've created a MMS and how much is buffered
… The site knows what it has buffered; could the site tell the UA more about the buffer state, to inform when to turn the radio on/off?
… The third thing is XXX

Jean-Yves: There are many ways we could skin the cat. The idea of having it all in one place in MMS makes it easier, rather than having multiple places

Mark: So why is it specific to streaming? Other applications would benefit

Jer: With implementer hat on, we found there are some cellular networks that give benefits when streaming media over 5G, e.g., in how it counts against your data cap
… So we wanted to be careful to not waste battery life and not waste people's money
… Your ideas for efficient network traffic management aren't bad ones, but we don't have control over that in this WG
… It's worth bringing to the people who manage the fetch spec, but we can't do that here. Similar on WebTransport and WebRTC DataChannel

Mark: I see the motivation to get this working for streaming. So why not put the start/stop events on Window, to generalise it a bit?

Jer: Sure, this is very particular to the device we were developing on. Some things we're not in control of. We may not want to expose to the entire web when the modem is available or not
… So we had to ride a fine line in selecting a heuristic

Jean-Yves: There's a strong link between the managed media source buffer and the streaming value. So it made more sense to have it in MMS, those events are for when streaming with MMS and only used with MMS

Mark: Feels like a bit too much is left without guidance to the site about how it might work. Don't want to constrain how the UA would behave
… Need guidance for sites. If I append a lot of data, the start/stop streaming events will slow down. I need to know that in my client code, to balance my media vs other requests
… Give some heuristics or guidance on what to expect

Jer: Sounds like you want a non-normative note on how to implement?

Mark: That might be sufficient. Maybe we'd put all our requests in the start/stop region

Jer: One of the reasons for the MMS name is that a similar problem is faced in Apple WebKit, Chromium, and set-top boxes. Much more limited buffers in those devices
… They'll let you do complicated things, such as lots of buffers in JS, but the available memory is much less
… The goal of the MMS design is to make it work on those kinds of devices too

Mark: There are use cases where you don't want to append right away. For example, reaching the end of an episode, where the user is likely to want to watch the next episode rather than the end credits
… I have a bunch of buffers, but haven't appended anything
… If the only effect is using more battery, that's fine, so long as it's rare. It would be problematic if you're not giving me the resources I need

Jer: You mentioned in the thread giving more info about what you're doing
… A scenario where it won't work is live video, where you append every 10 seconds. In that case the page won't be obeying the start/stop events; the UA could fall back to a more efficient transport mechanism
… Earlier we had an explicit signal to say the client would be doing live streaming, and use the more efficient modem up front, when loading small amounts of data more frequently
… In your example, you could signal that you're done. Those could be useful additions to MMS

Jean-Yves: With live streaming, whether we implemented it or not, the heuristic was identical. No need for it
… As for knowing when you won't append more data, can you signal that?

Mark: But in this case you would still be appending
… Happy to do the editing if there's a PR

Jean-Yves: I can volunteer to edit if Matt cannot

Chris: We also need to think about it; there are important use cases such as live streaming, branching narratives, etc.

Jean-Yves: The aim here was to give the UA a bit more control. It's not an ultimate solution; we could have had different objects, but that wasn't our aim
… Our aim was to enable MSE on low-memory, constrained devices, by extending MSE. I hope we've achieved this

Jer: I have some feedback from WAVE, so the people who work on STBs are aware and interested

Chris: So with my MEIG chair hat on, let's set up a discussion so interested WAVE people can ask questions

Jer: Sounds good
… I also want to thank Mark for the productive discussions in the GitHub thread. Good suggestions that can lead to wider improvements

Next meeting

Chris: Scheduled for July 11, we'll see how discussion in issues goes, can meet sooner if needed

[adjourned]

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).

Diagnostics

Maybe present: Bernard, Chris, Eugene, Jean-Yves, Jer, Mark, Peter

All speakers: Bernard, Chris, Eugene, Jean-Yves, Jer, Mark, Peter

Active on IRC: cpn