Meeting minutes
Audio Session & Media Session
Audio Session
youenn: Audio can be played in different ways for different purposes.
… Music, navigation, multiple audio sources, how you mix them.
… Web Audio on the Web will typically not interrupt other sources.
… That's not really specified, up to implementations.
… Audio session is meant to address some of this.
… Partial implementation in Safari. We see some traction.
… One use case is "playback" type. Some web sites adopted that.
… to interrupt other audio.
… Or people want the opposite.
padenot: Similar functionality in Firefox, though not enabled and with different names.
youenn: We would like the model to be made clearer
… Whenever session type is not defined, it's up to the UA. That's fine. When it's defined, the hope is that it will be the same across sessions.
youenn: Currently, each document has a default audio session.
… If I'm playing audio, the UA decides whether it's ambient (e.g., web audio) or playback (audio tag).
eric_carlson: Audio tag is "playback" as long as the duration is more than a few seconds.
… Actually duration of "you've got mail" mp3 ;)
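Illustrative sketch (not from the meeting): the default behavior described above could be modeled as a small function. The names `AudioSourceInfo` and `resolveAutoType` are hypothetical, and the 3-second threshold is a placeholder, since the minutes only say "a few seconds". Treating short clips as "ambient" is also an assumption.

```typescript
// Hypothetical model of how a UA might resolve the default ("auto") type.
// Nothing here is spec text; names and threshold are assumptions.
type ResolvedType = "ambient" | "playback";

interface AudioSourceInfo {
  kind: "web-audio" | "media-element";
  // Only meaningful for media elements; undefined means unknown (e.g. a stream).
  durationSeconds?: number;
}

// ASSUMPTION: the minutes only say "a few seconds"; 3 is a placeholder value.
const SHORT_CLIP_THRESHOLD_SECONDS = 3;

function resolveAutoType(
  source: AudioSourceInfo,
  thresholdSeconds: number = SHORT_CLIP_THRESHOLD_SECONDS,
): ResolvedType {
  // Web Audio is treated as ambient by default.
  if (source.kind === "web-audio") {
    return "ambient";
  }
  // ASSUMPTION: unknown duration is treated like long-form playback.
  const duration = source.durationSeconds ?? Infinity;
  // ASSUMPTION: short clips (notification sounds) fall back to ambient.
  return duration > thresholdSeconds ? "playback" : "ambient";
}
```

For example, under these assumptions a three-minute audio element resolves to "playback" while a one-second notification sound resolves to "ambient".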
youenn: All the sources/sinks are also tied to a document.
… By default, it makes sense that all the objects that a document is creating be tied to that session.
… It would also make sense to tie activation of AudioSession to sink/source activity.
… Any feedback on the direction?
padenot: I think not changing the default rules is important. Then encoding roughly what implementations actually do.
youenn: [example of active/inactive question to answer]
padenot: What happens with autoplay currently in Safari?
eric_carlson: It depends.
padenot: The last one plays.
eric_carlson: Yes.
padenot: Media element and AudioContext can interrupt each other. With two different documents.
eric_carlson: An AudioContext does not interrupt at all. But with AudioSession, it could be made to interrupt.
youenn: About interruption between different AudioSessions of the same page. Should they interrupt? I think so.
… if they're both playback.
padenot: That makes sense.
youenn: When it's "auto", it's the current behavior that should apply. We cannot change that.
… When it's on the same page, we can define things. When it's two tabs, or one tab and another OS application, it's up to the UA.
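Illustrative sketch (not from the meeting): the interruption rules discussed above could be written as a decision function. All names are hypothetical. The "mix" outcome for combinations involving "ambient" is an assumption; the discussion only covers the playback/playback case, the "auto" case, and the cross-page case.

```typescript
// Hypothetical decision table for what happens when a second audio session
// becomes active. Names are illustrative, not taken from any specification.
type SessionType = "auto" | "ambient" | "playback";
type Scope = "same-page" | "cross-page";
type Decision = "interrupt" | "mix" | "ua-defined";

function onSessionActivated(
  existing: SessionType,
  starting: SessionType,
  scope: Scope,
): Decision {
  // Two tabs, or a tab and another OS application: left to the user agent.
  if (scope === "cross-page") {
    return "ua-defined";
  }
  // "auto" keeps whatever the user agent does today.
  if (existing === "auto" || starting === "auto") {
    return "ua-defined";
  }
  // Two explicit "playback" sessions on the same page interrupt each other.
  if (existing === "playback" && starting === "playback") {
    return "interrupt";
  }
  // ASSUMPTION: any combination involving "ambient" simply mixes.
  return "mix";
}
```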
padenot: [missed on hook]. Typically implemented on mobile. Would make sense to extend to desktop.
tommy: Only if the page explicitly says that audio session type is "playback" would we pause. That's fine.
youenn: Yes, that would be specified.
tommy: If that's explicit, that makes sense.
eric_carlson: What happens when one is auto and the other is playback, what do we do?
youenn: And if it's third-party iframes, we have other issues.
padenot: If you don't opt in, nothing changes. Otherwise, you opt in to being paused.
tommy: Why would a web site opt in to being paused?
youenn: You can play music and have a GPS-like application in the same app. You might want to pause the playback while you play the notifications. Within the same app.
youenn: By default, the principle is that the AudioSession is scoped to the document.
… It can change (if element switches).
… Multiple AudioSessions could perhaps be used, with sinks/sources grouped to a specific AudioSession, so that they share its underlying type.
… That would mean some way to construct an AudioSession from JS, and having getters/setters on each audio sink/source.
eric_carlson: I don't think we should do it right now. I think we should check how well the proposed mechanism at the document level works before we look into complicating the API.
youenn: So a second step. Not part of the MVP.
markafoltz: What kind of sinks?
youenn: Media element, AudioContext. I'm not sure about getUserMedia, more a source.
youenn: AudioSession is used to group audio sinks/sources per document. That makes it a natural context to control speaker routes.
… The possibility is for AudioSession to do it for all sinks connected to it.
… Very similar to the setSinkId proposal.
padenot: That would work. Avoids listing as well.
youenn: The model would be that if the media element had its own sink id, it would use it.
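Illustrative sketch (not from the meeting): the routing model described above, where a sink's own sink id takes precedence over the one set on its audio session, could look like this. `SinkLike`, `SessionRouting`, and `effectiveSinkId` are hypothetical names; no such API shape was agreed.

```typescript
// Hypothetical model of session-level route control with per-sink override.
interface SinkLike {
  label: string;
  // Set only when the page gave this sink its own output device.
  ownSinkId?: string;
}

interface SessionRouting {
  // Output device applied to every sink attached to the session.
  sinkId: string;
}

function effectiveSinkId(session: SessionRouting, sink: SinkLike): string {
  // A sink with its own sink id keeps it; otherwise it follows the session.
  return sink.ownSinkId ?? session.sinkId;
}

const session: SessionRouting = { sinkId: "headphones" };
const music: SinkLike = { label: "music element" };
const alerts: SinkLike = { label: "alert element", ownSinkId: "built-in-speaker" };

const routes = [music, alerts].map((sink) => effectiveSinkId(session, sink));
```

Here `music` follows the session to "headphones", while `alerts` stays on "built-in-speaker". The device ids are made-up placeholders.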
wolfgang: If one audio goes to the speaker and the other to the headphones, I'm thinking that the behavior might need to change.
… I'm not sure it makes sense to group sinks together when they go to different devices.
youenn: There's a possibility to expose further options, modeled after the related iOS API.
… E.g., defaulting to the built-in speaker. Some hints about what I want. No need to enumerate the devices.
… In general, it seems ok for a web page to say "here is what I want, please make it work".
… I think Microsoft is interested in MediaDevice.setSinkId.
SteveBecker: It's global to the page in our proposal.
youenn: We could do the same approach. Where we put the API.
SteveBecker: The proposal is that it affects all children?
youenn: Yes. I think it's exactly the same proposal. Just a matter of aligning where it goes.
youenn: It could happen that you have a sink or source which is incompatible with the AudioSession type.
… In theory, if you set it to "playback" and start recording with getUserMedia(), things happen.
… I'm wondering what the right approach is in terms of guidelines.
tommy: If we already had the ability to create another AudioSession, I'd be more ok with rejecting.
<SteveBecker> Link to the MediaDevices setDefaultSinkId() proposal: https://
youenn: Currently, the approach is basically to call getUserMedia in another document.
… If you start capturing, you might suspend the other session. That's another approach.
… So, reject in the long term. In the short term, make developers' lives easier?
tommy: I can see problems in the short term, example of a game that mixes things, including media capturing.
youenn: Re. minimal scope. The existing API. We describe how things interact. That's doable with some editorial work still needed.
… Grouping API somewhere in the future.
… Route control API perhaps before.
… Request/release focus API, I didn't mention, I do not know where it is in terms of priorities, but it's certainly not in the minimal scope.
padenot: If you play multiple tracks of an album, you could keep the same media session.
youenn: Yes, the media session would be defined in terms of audio session.
… With this API, you would be able to change the metadata in a nicer way.
… I would say:
… 1. Route control API
… 2. Grouping API
… 3. Request/Release focus API
padenot: Agree.
youenn: What would it take for Firefox to be aligned?
padenot: Same as usual. A good spec we can look at and implement without having to review other codebases.
… Alastor can help. He would do both the spec and the implementation. Again, the backend is already there for us.
hta: We're always mainly customer driven. The question is: who is going to use this?
youenn: We have seen Web Audio people wanting to be treated as "playback".
hta: Is that feedback documented?
youenn: We received many bugs, I can provide pointers.
youenn: We have IDL right now, and loose wording. Now we need to describe the behavior.
padenot: More spec work needed there.
cpn: Media Session also has some wording that references audio session. Spec text may need to be adjusted.
… What do we need to include in a First Public Working Draft?
youenn: Aren't we already there?
… I think we're ready for First Public Working Draft.
cpn: That will trigger a patent exclusion period.
tidoust: IDL is already in the scope and matches the MVP, so that seems fine.
cpn: I propose to the group that we do that.
ACTION: cpn to issue a formal CfC about publication of Audio Session as FPWD
Media Session
cpn: An issue raised around making the skipad action more generic.
… I'm not sure that all implementations currently have skipad.
… Comes from the TAG review. Why only skip ads? Skip to the next chapter for example?
… Skipping may not need an ad.
… And then an observation that we now have chapter information. Is there an opportunity to tie those in some way?
… No particular proposal.
tommy: Proposal to replace "skipad" with "skip" in short.
… That's an interesting idea.
… We were discussing adding it with a timer.
youenn: You may also have that at the beginning of a movie as well.
tommy: OK. I think we shipped this recently in Chrome. But we basically launched it without backing API. We should probably change it fairly soon if we are to change it.
cpn: Am I right in thinking that the TAG feedback was triggered by Chromium's implementation?
tommy: Yes, I think that's fair.
cpn: Assuming there's interest to implement, I suggest that we change the action then.
tommy: I don't think we have a concrete proposal right now for the timer. It could be part of the action handler or some separate mechanism.
cpn: And then the relation with chapter metadata?
… You wouldn't necessarily skip based on chapter boundaries.
tommy: I agree.
youenn: It could be a different action.
cpn: If there's an action handler, then you don't skip at all. Otherwise the UA could provide a button.
… At a minimum, I propose that we implement the naming change.
youenn: Agree.
tommy: Agree as well.
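Illustrative sketch (not from the meeting): if the action is renamed, a page might try the new name first and fall back to "skipad". This assumes the renamed action would be called "skip", which is only the proposal discussed here, and that an unsupported action name makes setActionHandler throw a TypeError. `ActionHandlerTarget` and `registerSkipHandler` are hypothetical names; the interface is a minimal stand-in for the real Media Session object so that the sketch is self-contained.

```typescript
// Minimal stand-in for the part of Media Session used here (hypothetical type).
type SkipHandler = () => void;

interface ActionHandlerTarget {
  setActionHandler(action: string, handler: SkipHandler | null): void;
}

// Tries the proposed "skip" action first, then the existing "skipad".
// Returns the action name that was accepted, or null if neither was.
function registerSkipHandler(
  target: ActionHandlerTarget,
  handler: SkipHandler,
): string | null {
  for (const action of ["skip", "skipad"]) {
    try {
      target.setActionHandler(action, handler);
      return action;
    } catch (error) {
      // ASSUMPTION: an unknown action name surfaces as a TypeError.
      if (!(error instanceof TypeError)) {
        throw error;
      }
    }
  }
  return null;
}
```

In a browser, the target would presumably be `navigator.mediaSession`; that wiring is left out so the sketch does not depend on any particular implementation.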
youenn: I don't remember how "skipad" is defined, but the spec probably needs to be updated accordingly.
cpn: Allow a website to disable default action handlers.
tommy: The user agent can easily know what default action to take. But from time to time, default actions do not make sense for web sites, even though no specific action actually makes sense.
… Currently, websites provide an empty action handler. That looks broken.
youenn: It could really be a hint. The UA needs to be free to do whatever it wants.
tommy: Yes, I agree.
youenn: For "pause", looking at the bug, I think audio session is a nicer way to fix the problem.
eric_carlson: An issue with the existing API is that there's no way to indicate the seekable ranges. Instead of having a null action handler, there could be a way for the page to describe the seekable range. If there is none, you would just not provide a scrubber.
cpn: If you have multiple media elements, how would the page know which one is in media session?
jean-yves: It's very vague. Implementations try to match what's currently playing.
… There is no description on how you define what is the current playing element.
… I don't think you can query Media Session to know which media element is currently playing. If the action is "pause", you have to look for a paused element yourself.
… It's a difficult problem. On our side, we regularly see issues with people not expecting the behavior. It's also processing-intensive to determine what's currently playing.
cpn: For actions that should not be overridable, such as "hangup", we could spec that this is the case, and a hint otherwise.
jean-yves: The intent is for the user agent to not believe that an action handler is defined.
youenn: Yes, it's a better signal for user agents to understand that the page does not want the default action behavior.
tommy: That sounds good to me. Question on the shape of the API?
… I was looking at the actual steps to set the action handler to null.
youenn: You should be able to change the boolean as well.
… We need to have an API shape that makes sense.
hta: Handler, default, and no handler.
… 3 states
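Illustrative sketch (not from the meeting): the three states could be modeled as a tagged union, which makes explicit the difference between "no handler set, use the default" and "the page does not want the default". The names `HandlerState` and `dispatchAction` are hypothetical and do not reflect any agreed API shape.

```typescript
// Hypothetical model of the three per-action states discussed above.
type HandlerState =
  | { kind: "handler"; run: () => void } // page supplied its own handler
  | { kind: "default" }                  // nothing set: UA default applies
  | { kind: "none" };                    // page hints it wants no default

function dispatchAction(state: HandlerState, defaultAction: () => void): string {
  switch (state.kind) {
    case "handler":
      state.run();
      return "page-handled";
    case "default":
      defaultAction();
      return "ua-default";
    case "none":
      // Treated as a hint in the discussion: a UA would remain free to act.
      // This sketch simply does nothing.
      return "ignored";
  }
}
```

Today's workaround of registering an empty handler corresponds to the "handler" state with a no-op `run`, which a user agent cannot distinguish from a real handler.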
youenn: It seems that we should continue the discussion on GitHub on the API shape.
tommy: If the website has never called setActionHandler and calls it with "null", maybe that's enough to understand that it does not want the default action handler.
… I don't think there's a use case for a website changing their mind.
… I'll update a bug with a couple of possible approaches.
youenn: There's some dark magic in Web IDL. We got it wrong in Media Session. Now we need to fix it.
… I won't get into details but 1. it's a bit inconvenient for developers and 2. the spec is broken, so implementations are not fully aligned.
… I updated PR #243 to update the spec with something that makes sense. It aligns with Chrome and Safari, but breaks with Firefox implementation.
cpn: How does the PR address the example on the slide?
youenn: It's a baby step. The spec is broken, goal is to align with implementations first, and then improve from there.
… For the artist, yes, that would work.
… The artwork, being a FrozenArray, you cannot change the `src` attribute.
… That's life, that's not convenient for developers.
… I'm open to suggestions.
… If we make media image an interface, then we would be able to change it.
… We need to keep things backwards compatible.
… It would be a good action item once we've cleaned the spec.
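Illustrative sketch (not from the meeting): the artwork limitation described above can be mimicked with frozen arrays and objects. Changing one image's `src` then means building a new array and assigning it as a whole. `ArtworkImage` and `withArtworkSrc` are hypothetical names standing in for the real Media Session types, and `Object.freeze` only approximates what a Web IDL FrozenArray does.

```typescript
// Stand-in for an artwork entry (hypothetical type, not the spec dictionary).
interface ArtworkImage {
  readonly src: string;
  readonly sizes?: string;
  readonly type?: string;
}

// Returns a new frozen array in which the entry at `index` has a new `src`.
// The input array and its entries are left untouched.
function withArtworkSrc(
  artwork: ReadonlyArray<ArtworkImage>,
  index: number,
  src: string,
): ReadonlyArray<ArtworkImage> {
  const copy = artwork.map((image, i) =>
    Object.freeze(i === index ? { ...image, src } : { ...image }),
  );
  return Object.freeze(copy);
}

const before: ReadonlyArray<ArtworkImage> = Object.freeze([
  Object.freeze({ src: "cover-small.png", sizes: "96x96" }),
]);
// In-place edits are not possible on the frozen entries, so replace the array.
const after = withArtworkSrc(before, 0, "cover-large.png");
```

The file names are placeholders. With a real metadata object, the result would presumably be assigned back to its `artwork` attribute in one step.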
jean-yves: The MediaImage used to be an interface.
… We may need to find out why it was and no longer is.
youenn: If we change, we may end up with situations where things start to throw when they were not throwing before.
… Maybe it's backward compatible. But we need a proposal and some proof that it could be done.
… Jan-Ivar was looking into it, I think.
… PR is recent, some comments from Anne. Domenic to review as well.
cpn: Reviewing with Marcos from a high level perspective. You kind of don't know what happens when you update MediaMetadata.
… Similarly, there is no security model for how we fetch the resources.
youenn: That's a great point. We should specify it.
… We could try to be consistent with other similar contexts.
cpn: When we have chapter metadata, you may want to update that quite dynamically.
… If you are adding to the chapters, as the media progresses, does that affect what is showing in the UI?
… What happens when you're still within the same chapter?
… Is it seamless, or does it trigger a refresh, which we'd like to avoid?
… I'll turn the questions into issues.
ACTION: Chris to raise Media Session issues based on slide 51