Meeting minutes
Introduction
Chris: Welcome to the
MEIG meeting, in particular to guests from the CTA WAVE project,
and thank you Jean-Yves for agreeing to present today.
… To start the discussion, Jean-Yves, could you
give an introduction to Managed Media Source, the background, and
design considerations?
Managed Media Source
Jean-Yves: The primary
reason we came up with MMS is that Apple wanted to add MSE support
for iPhone
… It has been supported on all Apple devices: iPad
OS, Mac, WebKit for quite a while
… But since MSE was introduced it was never allowed
on iPhone. It has HLS which gives 100% control to the device on how
we're going to fetch the content
… The device can detect the resolution to be used,
has deeper knowledge on what the user has asked, there's less
privacy concern
… WebKit is open source, we wanted to have MSE. The
issue was how to introduce it on iPhone, and the primary concern
was to reduce power usage
… Tests showed that power usage would go up with
MSE, so viewing time would reduce. We can never introduce a
regression on iPhone
… We were given constraints: if we play a YouTube
video on iPhone with MSE enabled, we shouldn't see a power
regression
… We tried various ways, and what we found was the
primary reason why the power is used, is websites get one segment
at a time rather than downloading in bursts
… The main power usage is when the cellular modem
is on. So we looked at how HLS downloads data, it downloads quite a
big chunk, then stops
… 5G is more power hungry than 4G or 3G, it will
download more seconds then shut down
… Apple still wanted to have control, but provide
hints. We didn't enable MSE on iPhone, as we'd have those
issues
… MMS provides hints: "start fetching content", "I
have enough", "stop fetching content"
… We wanted to enable on 5G, do we need a carrot
and stick approach? If the website does the wrong thing, and finds
a way around it, shut it off completely, and don't enable MSE and
fallback to HLS
… We found that, if we disable the 5G modem, the
power usage is the same whether using HLS or MSE
… The website can do what it wants and it will
still be alright
… MMS evolved into simply trying to get the
site to help us detect internally when to enable the 5G
modem
… If the page sticks within those events and we
only enable 5G during those times, the power usage is acceptable
and there's no regression over HLS
… MMS allows WebKit to be power efficient on
iPhone, allowing easy transition of your website to
MMS
… We needed something different to MSE to avoid
regression, so there's a new API which is MMS which can be dealt
with independently
… WebKit didn't want a regression from a user
perspective, e.g., AirPlay is not compatible with MSE. We want the
site to be aware, they must disable remote playback
… So it's on purpose that you introduce the
regression
… MMS will ship on iPad OS and Mac, it's behind an
experimental flag on iPhone, iOS 17 I think. In Safari settings
there's a feature flag to enable
… That's how WebKit came up with MMS. For
implementation it was designed to allow easy transition from one to
the other
… From discussion with HLS.js folks, we looked at
how we can help with things missing in MSE
… Having more detail on what's evicted and what's
added was useful. It's common to all players, they need a way to
scan the buffer range, timeranges
… It's easier for the UA to perform that operation
than the JS player
… There are things we wanted to add, with HLS the
phone can know the user has selected low data usage, so we can
limit the resolution to fetch
… Unfortunately, providing the quality attribute in
the spec may not be possible from a privacy perspective, as it
could be a fingerprinting vector
… So we have to find a way that can be done or drop
it from the spec
… That's the inner detail on how we came up with
MMS, we've attended FOMS and TPAC meetings where we got feedback on
what's missing in MSE
… We tried to add many of those features, in
particular on SourceBuffer, so I'm hoping we achieve that goal, not
100% sure if that's the case or not
Mike: If a player switches to MMS, how much work is done in the player and how much in the implementation? How much change is needed in players?
Jean-Yves: Very little
change. If an MSE player wants to work on iPhone, you just need to
change the call to ManagedMediaSource, add the disable remote
playback flag, use plain MP4 or HLS
… That doesn't mean you get the best performance,
that's the minimum to get it working. Requirement for an alternate
player isn't part of the spec
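
The minimal migration described above could be sketched as follows. This is an illustration under assumptions, not code from the meeting: `MediaScope`, `VideoLike`, `chooseSourceKind` and `prepareVideo` are hypothetical names, and the sketch only models the decision (prefer ManagedMediaSource when the browser exposes it, and set the disable remote playback flag in that case). Attaching the source to the element and appending data are left out.

```typescript
// Minimal view of the global scope (assumption: real code would inspect
// `window` or `globalThis` directly).
interface MediaScope {
  ManagedMediaSource?: unknown;
  MediaSource?: unknown;
}

// Minimal view of a <video> element; `disableRemotePlayback` is the real
// HTMLMediaElement attribute, the interface name is hypothetical.
interface VideoLike {
  disableRemotePlayback: boolean;
}

type SourceKind = "managed" | "classic" | "none";

// Prefer ManagedMediaSource when present, else fall back to MediaSource.
function chooseSourceKind(scope: MediaScope): SourceKind {
  if (typeof scope.ManagedMediaSource === "function") return "managed";
  if (typeof scope.MediaSource === "function") return "classic";
  return "none";
}

// With a managed source and no alternative source for remote playback,
// the page opts out of remote playback explicitly.
function prepareVideo(video: VideoLike, kind: SourceKind): void {
  if (kind === "managed") {
    video.disableRemotePlayback = true;
  }
}
```

In a page, the result of `chooseSourceKind` would decide which constructor to call before the player attaches the source to the video element as it does today.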
Chris: Can you clarify what you mean by an alternative player?
Jean-Yves: The existing
HTML5 spec for <video> allows you to provide a src attribute, or
a <source> with different links, one for your MSE source and one
for something compatible with remote playback, which iPhone currently
plays
… If in your existing player you listen to the
start/endstreaming events and you only fetch data between those,
you'll get access to the 5G modem, so faster start time and higher
quality playback and better power usage
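
Fetching only between the start and end streaming events could look roughly like the sketch below. `GatedSegmentQueue` and the segment names are hypothetical; the class only models the gating logic, and the wiring to the real `startstreaming` and `endstreaming` events is shown in comments because it needs a browser.

```typescript
// Hands out the next segment URL only while the UA has signalled that
// streaming is welcome. Hypothetical helper, not part of any spec.
class GatedSegmentQueue {
  private streaming = false;
  private index = 0;
  private readonly segments: readonly string[];

  constructor(segments: readonly string[]) {
    this.segments = segments;
  }

  // Call from the "startstreaming" event handler.
  onStartStreaming(): void {
    this.streaming = true;
  }

  // Call from the "endstreaming" event handler.
  onEndStreaming(): void {
    this.streaming = false;
  }

  // Returns the next segment to fetch, or null when the player should wait
  // (not streaming) or when every segment has been handed out.
  next(): string | null {
    if (!this.streaming) return null;
    const segment = this.segments[this.index];
    if (segment === undefined) return null;
    this.index += 1;
    return segment;
  }
}

// Browser wiring (assumes `source` is a ManagedMediaSource instance):
//   source.addEventListener("startstreaming", () => queue.onStartStreaming());
//   source.addEventListener("endstreaming", () => queue.onEndStreaming());
```

A player's download loop would call `next()` before each request and pause when it returns null, resuming on the next start streaming event.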
Chris: Thank you for the explanation. Questions?
Alicia: Thanks for this
work. Something I noticed is there seem to be three different
problems it's trying to fix
… which are not related problems. One is a
performance problem with fetching media reducing power usage. I
like the start/stop streaming events,
… because right now it's a big challenge for anyone
making an MSE player. They have to make that choice but they're not
in the best place to make that choice, so handholding is
good
… The AirPlay alternative is a nice thing, but I
don't see it as related. Why couldn't we have alternative playback
URLs and non-managed MediaSource?
… Also, I saw the buffer change events, it would be
good to have these as events instead of having the client check
periodically. I think they should exist in full-blown MSE
too
… I haven't seen if those will be added to
both?
Jean-Yves: The requirement for an AirPlay source alternative is not something I'm super happy with, it's not an engineering decision
Alicia: I get
that
… Could the events work with both MMS and
MSE?
… So the limitation is only on the iPhone
implementation, not part of the spec?
Jean-Yves: Was added in
Safari last year, hasn't changed in the spec. I'd like to refer you
to the GitHub issue for your last question for the changed event,
Mark Watson asked the same question
… https://
Chris: Please do leave comments in the issue
Joey: I was wondering about the notifications of when to fetch content: what interval are they on, and how do they interact with low latency live streams?
Jean-Yves: Interval is
up to the UA to define. For iPhone, the low water mark is about 10
seconds, high is 30 seconds
… For live streaming, the expected behaviour is
that if you're in a live stream, you won't follow the start/end
streaming events, you'll fetch all the time
… But if you fetch outside those times, the 5G
modem on iPhone is disabled and it's like MSE, using
4G
… Whether other UAs will do that, I don't
know
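
The low and high water marks mentioned above suggest a simple hysteresis. The sketch below is only an illustration of that idea, using the 10 and 30 second figures quoted for iPhone as defaults; it is not WebKit's implementation, and `nextStreamingState` is a hypothetical name.

```typescript
// Decide whether the "streaming" hint should be on, given the current hint
// and how many seconds are buffered ahead of the playhead.
// Below the low mark: ask the page to fetch. At or above the high mark: ask
// it to stop. In between: keep the current state, so the hint does not flap.
function nextStreamingState(
  streaming: boolean,
  bufferedAheadSec: number,
  lowWaterSec: number = 10,
  highWaterSec: number = 30,
): boolean {
  if (bufferedAheadSec < lowWaterSec) return true;
  if (bufferedAheadSec >= highWaterSec) return false;
  return streaming;
}
```

The gap between the two marks is what allows downloads to happen in bursts: once fetching starts it continues until roughly 30 seconds are buffered, then stays off until the buffer drains to about 10 seconds.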
Joey: Is the expectation that if you're at the live edge, those will be overruled?
Jean-Yves: The spec has
language that appends could fail if you append when it doesn't want
you to
… It's a remnant of the stick approach. We left it
in the proposal, if you don't append between start/end streaming
the UA is allowed to deny the append (throw)
… That's not something WebKit has implemented. It's
also mentioned in the GitHub issue, we'll have a follow up
discussion on whether to remove or not
Joey: As a player developer, should we ignore the events when streaming at the live edge or if there's a buffer underflow? What's the guidance?
Jean-Yves: The events
are hints, it's only going to affect how it works under the hood,
so enabling of the 5G modem
… If you want the lowest latency possible and you
start appending constantly, it's a strong signal that you're
playing at the live edge
… We discussed internally and at previous FOMS
about having a dedicated live mode, but we found that just the
start/end streaming events, and whether the player respects them, are
enough to determine if it's playing a live stream
Joey: Would you be willing to write guidance for player developers? It would be good to encourage players to use the API in ways implementers prefer to use it
Jean-Yves: I touched on
those in the WWDC video
… https://
Joey: It wouldn't be in the spec, just a way to help the community do the right thing
Jean-Yves: In summary, if you respect the start/end events you get the 5G modem. We can see about writing a blog post
Alicia: Is there any performance gain for devices other than mobile phones, e.g., for set top boxes?
Jean-Yves: On Mac you get better power usage regardless
Alicia: For a desktop or laptop it's on wifi, similar for STBs. Any gains in those kinds of devices?
Jean-Yves: From a
performance perspective, remembering back to my work on Twitch on
Firefox, as Firefox doesn't support MPEG-TS you append one frame at
a time
… If you do that, the CPU is working all the time,
has to check if each frame is in the existing buffered
range
… Would be better if they were to batch their
content, but Twitch wants lowest latency possible
Alicia: Maybe not so much a technical problem, but pushing them to do the right thing
Jean-Yves: It's a fundamental issue between VOD and live, you can't have it all
Alicia: More
practically, many of our customers work with STBs, I work on
maintaining implementations of MSE in WebKit for media boxes
… Could I have good things to sell them on MMS, or
is there nothing to offer beyond following the standard?
Jean-Yves: Probably not. One benefit is having the latest WebKit version. STBs we know don't update often
Alicia: They also have their own performance constraints. In STBs power can be used in content switching done in the browser
Jean-Yves: If you do things in batch and increase latency, you'll increase performance. MMS favours doing things in batch
Alicia: And batching also increases memory usage, which is an issue on STBs
Chris: Alicia, what kind of improvement to MSE would you be looking for?
Alicia: Don't know if it handles running out of memory, was that mentioned in MMS?
Jean-Yves: It is, that's a good point, I should have mentioned. MMS is allowed to evict at any time, not just during appendBuffer, specifically for low memory devices
Alicia: Extremely useful, you don't know how much RAM the device has. You can have the problem where a website stores too much in memory, the STB might start swapping
Jean-Yves: Forcing a lower memory footprint in the player can be added on the UA side if using MMS
Alicia: I recall we
have some settings to force memory usage precisely because this is
a problem. Video players assume you have as much RAM as on a
desktop
… Good to have hints more than surprise
sticks
Jean-Yves: MMS allows eviction then the bufferchange event is fired
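
A player that mirrors the buffered ranges could apply each buffer change notification instead of rescanning the buffer. The sketch below assumes the event reports added and removed time ranges and that, if both are present, additions are applied first; `TimeSpan`, `normalize`, `subtract` and `applyBufferedChange` are hypothetical names working on plain `[start, end)` pairs in seconds.

```typescript
type TimeSpan = [number, number]; // [start, end) in seconds

// Sort ranges and merge any that overlap or touch; empty ranges are dropped.
function normalize(ranges: TimeSpan[]): TimeSpan[] {
  const sorted = ranges
    .filter(([start, end]) => end > start)
    .sort((a, b) => a[0] - b[0]);
  const merged: TimeSpan[] = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && start <= last[1]) {
      last[1] = Math.max(last[1], end);
    } else {
      merged.push([start, end]);
    }
  }
  return merged;
}

// Remove the interval `cut` from every range, splitting ranges if needed.
function subtract(ranges: TimeSpan[], cut: TimeSpan): TimeSpan[] {
  const result: TimeSpan[] = [];
  for (const [start, end] of ranges) {
    if (cut[1] <= start || cut[0] >= end) {
      result.push([start, end]); // no overlap, keep as is
      continue;
    }
    if (cut[0] > start) result.push([start, cut[0]]); // part before the cut
    if (cut[1] < end) result.push([cut[1], end]); // part after the cut
  }
  return result;
}

// Apply one change notification to the player's mirror of the buffer.
function applyBufferedChange(
  current: TimeSpan[],
  added: TimeSpan[],
  removed: TimeSpan[],
): TimeSpan[] {
  let next = normalize([...current, ...added]);
  for (const cut of removed) {
    next = subtract(next, cut);
  }
  return next;
}
```

For example, starting from `[[0, 20]]`, an eviction of `[5, 10]` leaves `[[0, 5], [10, 20]]`, which tells the player exactly which span it may need to refetch.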
Thasso: Coming back to
Joey's point, the core issue I see in the GitHub issue is the
discrepancy between when to download media and when not to
… It's fine for 5G usage but gives the ability for
the UA to block the append. Maybe this could be removed in a follow
up PR, but could it be removed from the proposal too? It's a big
concern if the append could reject depending on when you fetch the
data
Jean-Yves: I guess it's a matter of you never know. It may be it'll never be used
Thasso: So could it be removed from the proposal now and add it later if they need that ability?
Jean-Yves: I don't know
Alicia: Would it conflict with the live playback case?
Thasso: It doesn't, I'm
good if I fetch between the events, but if the UA decides later to
reject, I'll have the problem and the player gets more
complex
… It also affects VOD playback where players
maintain their own buffers outside of MSE, as buffer management in
MSE is tricky
… If the implementation isn't there, why allow it
in the spec in the first place?
Jean-Yves: It's an ongoing discussion in the issue, may be best to take it offline
Chris: Please raise
this in the GitHub issue, and the Media WG will follow up on
it
… the GitHub thread is getting long so we may want
to break it into specific issues for different questions
Wrap up
Chris: Thank you all for joining and for the productive discussion today. I recommend continuing to follow the work in GitHub and in Media WG.
[adjourned]