1. Introduction
This section is non-normative.
Media is used extensively today, and the Web is one of the primary means of consuming media content. Many platforms can display media metadata, such as title, artist, album and album art, on UI elements such as notifications, media control centers, device lockscreens and wearable devices. This specification aims to enable web pages to specify the media metadata to be displayed in platform UI, and to respond to media controls which may come from platform UI or media keys, thereby improving the user experience.
2. Conformance
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and terminate these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
When a method or an attribute is said to call another method or attribute, the user agent must invoke its internal API for that attribute or method so that e.g. the author can’t change the behavior by overriding attributes or methods with custom properties or functions in JavaScript.
Unless otherwise stated, string comparisons use is identical to.
3. Dependencies
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
4. Security and Privacy Considerations
This section is non-normative.
The API introduced in this specification has very low impact with regard to security and privacy. Part of the API allows a website to expose metadata that can be used by the user agent. The user agent needs to use this data with care. Another part of the API allows a website to receive commands from the user via buttons or other forms of control, which might sometimes introduce a new input layer between the user and the website.
4.1. User interface guidelines
The MediaMetadata introduced in this specification allows a website to offer more information about what is being played. The user agent is meant to use this information in any UI related to media playback, either internal to the user agent or within the platform.
MediaMetadata is expected to be used in the context of media playback, which makes spoofing harder. However, because MediaMetadata has text fields and image fields, a malicious website could try to spoof another website’s identity. It is recommended that the user agent offer a way to find the origin, or clearly expose the origin, of the website from which the metadata comes.
If a user agent offers a mechanism to go back to a website from a UI element created based on the MediaMetadata, it is recommended that the action not be noticeable by the website, thus reducing the chances of spoofing.
In general, all security and privacy considerations related to the display of notifications from a website should apply here. It is worth noting that MediaMetadata offers less customization than regular web notifications, and is thus harder to spoof.
4.2. Incognito mode
For privacy purposes, when in incognito mode, the user agent should be careful when sharing the information from MediaMetadata with the system and make sure it will not be used in a way that would harm the user. Displaying this information in a way that is very visible would be against the user’s intent of browsing in incognito mode. When available, the UI elements should be advertised as private to the platform.
4.3. Media Session Actions
Media session actions expose a new input layer to the web platform. User agents should make sure users are aware that their actions might be routed to the website with the active media session, especially when the actions come from a headset or other remote device. It is recommended that the user agent follow the platform conventions when listening to these inputs, in order to help the user understand where their input goes.
5. Model
5.1. Playback State
In order to make play and pause actions work properly, the user agent SHOULD be able to determine if a browsing context of the active media session is playing media or not, which is called the guessed playback state. The RECOMMENDED way for determining the guessed playback state is to monitor the media elements whose node document’s browsing context is the browsing context. The browsing context’s guessed playback state is playing if any of them is potentially playing and not muted, and is paused otherwise. Other information SHOULD also be considered, such as WebAudio and plugins.
The playbackState attribute specifies the declared playback state from the browsing context. The state is combined with the guessed playback state to compute the actual playback state, which is a finalized state and will be used for play and pause actions.
The actual playback state is computed in the following way:
- If the declared playback state is playing, return playing.
- Otherwise, return the guessed playback state.
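The following non-normative sketch illustrates the two steps above. The function name and the use of plain strings for states are assumptions made for illustration; they are not defined by this specification.

```javascript
// Hypothetical helper: combines the declared and guessed playback states.
// Both arguments are plain strings: "none", "paused" or "playing" for the
// declared state, "paused" or "playing" for the guessed state.
function computeActualPlaybackState(declaredState, guessedState) {
  // A declared state of "playing" always wins.
  if (declaredState === "playing") {
    return "playing";
  }
  // Otherwise the guessed playback state is used as is.
  return guessedState;
}
```

Note that, under these steps, a declared state of paused does not override a guessed state of playing: `computeActualPlaybackState("paused", "playing")` still yields `"playing"`.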
The playbackState attribute can be useful when the page wants to perform some preparation steps while the media is paused, but wants to allow those steps to be interrupted by a pause action. See Setting playbackState for an example.
When the actual playback state of the active media session changes, the user agent MUST run the media session actions update algorithm.
5.2. Routing
There could be multiple MediaSession objects existing at the same time, since the user agent could have multiple tabs, each tab could contain a top-level browsing context and multiple nested browsing contexts, and each browsing context could have a MediaSession object.
The user agent MUST select at most one of the MediaSession objects to present to the user, which is called the active media session. The active media session may be null. The selection is up to the user agent and SHOULD be based on preferred user experience. Note that the playbackState attribute MUST NOT affect media session routing. It only takes effect for the active media session.
It is RECOMMENDED that the user agent selects the active media session by managing audio focus. A tab or browsing context is said to have audio focus if it is currently playing audio or the user expects to control the media in it. The AudioFocus API targets this area and could be used once it’s finished.
Whenever the active media session is changed, the user agent MUST run the media session actions update algorithm and the update metadata algorithm.
5.3. Metadata
The media metadata for the active media session MAY be displayed in the platform UI depending on platform conventions. Whenever the active media session changes, or the metadata of the active media session is set, the user agent MUST run the update metadata algorithm. The steps are as follows:
- If the active media session is null, unset the media metadata presented to the platform, and terminate these steps.
- If the metadata of the active media session is an empty metadata, unset the media metadata presented to the platform, and terminate these steps.
- Update the media metadata presented to the platform to match the metadata for the active media session.
- If the user agent wants to display an artwork image, it is RECOMMENDED to run the fetch image algorithm.
The RECOMMENDED fetch image algorithm is as follows:
- If there are other fetch image algorithms running, cancel existing algorithm execution instances.
- If metadata’s artwork of the active media session is empty, then terminate these steps.
- If the platform supports displaying media artwork, select a preferred artwork image from metadata’s artwork of the active media session.
- Fetch the preferred artwork image’s src. Then, in parallel:
  - Wait for the response.
  - If the response’s internal response’s type is default, attempt to decode the resource as an image.
  - If the image format is supported, use the image as the artwork for display in platform UI. Otherwise the fetch image algorithm fails and terminates.
If no images are fetched in the fetch image algorithm, the user agent MAY have fallback behavior such as displaying a default image as artwork.
5.4. Actions
A media session action is an action that the page can handle in order for the user to interact with the MediaSession. For example, a page can handle some actions that will then be triggered when the user presses buttons from a headset or other remote device.
A media session action source is a source that might produce a media session action. Such a source can be the platform or the UI surfaces created by the user agent.
A media session action source has an optional target which should be the recipient of any media session action created by the media session action source. If a media session action source’s target is null, the active media session is the recipient of all media session action source’s actions.
A media session action is represented by a MediaSessionAction which can have one of the following values:
- play: the action’s intent is to resume the playback.
- pause: the action’s intent is to pause the currently active playback.
- seekbackward: the action’s intent is to move the playback time backward by a short period (e.g. a few seconds).
- seekforward: the action’s intent is to move the playback time forward by a short period (e.g. a few seconds).
- previoustrack: the action’s intent is to either start the current playback from the beginning if the playback has a notion of beginning, or move to the previous item in the playlist if the playback has a notion of playlist.
- nexttrack: the action’s intent is to move the playback to the next item in the playlist if the playback has a notion of playlist.
- skipad: the action’s intent is to skip the advertisement that is currently playing.
- stop: the action’s intent is to stop the playback and clear the state if appropriate.
- seekto: the action’s intent is to move the playback time to a specific time.
- togglemicrophone: the action’s intent is to mute or unmute the user’s microphone.
- togglecamera: the action’s intent is to turn the user’s active camera on or off.
- hangup: the action’s intent is to end a call.
- previousslide: the action’s intent is to go back to the previous slide when presenting slides.
- nextslide: the action’s intent is to go to the next slide when presenting slides.
- enterpictureinpicture: the action’s intent is to open the media session in a picture-in-picture window.
Every MediaSession has a map of supported media session actions whose keys are media session actions and whose values are MediaSessionActionHandlers.
When the update action handler algorithm on a given MediaSession with action and handler parameters is invoked, the user agent MUST run the following steps:
- If handler is null, remove action from the supported media session actions for MediaSession and abort these steps.
- Add action to the supported media session actions for MediaSession and associate to it the handler.
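As a non-normative illustration, the update action handler algorithm can be modelled with a JavaScript Map standing in for the supported media session actions. The function name and the Map-based representation are assumptions for this sketch only.

```javascript
// Hypothetical model of the update action handler algorithm.
// supportedActions is a Map from action name to handler function.
function updateActionHandler(supportedActions, action, handler) {
  // A null handler unregisters the action.
  if (handler === null) {
    supportedActions.delete(action);
    return;
  }
  // Otherwise the handler is added, replacing any previous one.
  supportedActions.set(action, handler);
}
```

Calling `updateActionHandler(actions, "play", null)` for an action that was never registered leaves the map unchanged, since deleting a missing key is a no-op.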
When the supported media session actions are changed, the user agent SHOULD run the media session actions update algorithm. The user agent MAY queue a task in order to run the media session actions update algorithm in order to avoid UI flickering when multiple actions are modified in the same event loop.
When the user agent is notified by a media session action source named source that a media session action named action has been triggered, the user agent MUST run the handle media session action steps as follows:
- Let session be source’s target.
- If session is null, set session to the active media session.
- If session is null, abort these steps.
- Let actions be session’s supported media session actions.
- If actions does not contain the key action, abort these steps.
- Let handler be the MediaSessionActionHandler associated with the key action in actions.
- Run handler with the details parameter set to: MediaSessionActionDetails.
- Run the activation notification steps in the browsing context associated with session.
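The handle media session action steps can be sketched as follows. This is a non-normative model: the object shapes (`source.target`, `session.supportedActions`), the function name, and the boolean return value are all assumptions for illustration, and the activation notification steps are omitted.

```javascript
// Hypothetical model: a source is { target }, a session is
// { supportedActions: Map<string, function> }.
// Returns true if a handler ran, false otherwise (illustrative only).
function handleMediaSessionAction(source, action, activeSession) {
  // Prefer the source's target, falling back to the active media session.
  let session = source.target ?? null;
  if (session === null) {
    session = activeSession;
  }
  if (session === null) {
    return false;
  }
  const actions = session.supportedActions;
  if (!actions.has(action)) {
    return false;
  }
  const handler = actions.get(action);
  // The details dictionary carries at least the action name.
  handler({ action: action });
  return true;
}
```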
When the user agent receives a joint command for play and pause, such as a headset button click, it MUST run the following steps:
- If the active media session is null, abort these steps.
- Let action be a media session action.
- If the actual playback state of the active media session is playing, set action to pause.
- Otherwise, set action to play.
- Run the handle media session action steps with action.
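A non-normative sketch of how a joint play/pause command might be resolved into a single action. The function name and the `actualPlaybackState` property on the session object are assumptions for illustration.

```javascript
// Hypothetical helper: decides which action a joint play/pause command
// (such as a headset button click) maps to.
// Returns null when there is no active media session.
function resolveJointPlayPause(activeSession) {
  if (activeSession === null) {
    return null;
  }
  // Playing sessions get paused; everything else gets played.
  return activeSession.actualPlaybackState === "playing" ? "pause" : "play";
}
```

The returned action would then be passed to the handle media session action steps.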
It is RECOMMENDED for user agents to implement a default handler for the play and pause media session actions if none was provided for the active media session.
A user agent MAY implement a default handler for the togglemicrophone, togglecamera, or hangup media session actions if none was provided for the active media session.
A page should only register a MediaSessionActionHandler for a media session action when it can handle the action, given that the user agent will list this as a supported media session action and update the media session action sources.
When the media session actions update algorithm is invoked, the user agent MUST run the following steps:
- Let available actions be an array of media session actions.
- If the active media session is null, set available actions to the empty array.
- Otherwise, set the available actions to the list of keys available in the active media session’s supported media session actions.
- For each media session action source source, run the following substeps:
  - Optionally, if the active media session is not null:
    - If the active media session’s actual playback state is playing, remove play from available actions.
    - Otherwise, remove pause from available actions.
  - If the source is a UI element created by the user agent, it MAY remove some elements from available actions if there are too many of them compared to the available space.
  - Notify the source with the updated list of available actions.
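The computation of available actions in the media session actions update algorithm can be sketched as below. This is non-normative: the session shape, the function name, and the `maxActions` parameter (standing in for the optional trimming a UI element may perform) are assumptions, and truncating from the end is only one possible trimming policy.

```javascript
// Hypothetical model: a session is
// { supportedActions: Map, actualPlaybackState: "playing" | "paused" }.
function computeAvailableActions(activeSession, maxActions = Infinity) {
  // No active media session means no available actions.
  if (activeSession === null) {
    return [];
  }
  let available = Array.from(activeSession.supportedActions.keys());
  // Optional step: hide the action that matches the current state.
  const redundant =
    activeSession.actualPlaybackState === "playing" ? "play" : "pause";
  available = available.filter((action) => action !== redundant);
  // Optional step: a UI surface may drop actions that do not fit.
  return available.slice(0, maxActions);
}
```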
5.5. Position State
A user agent MAY display the current playback position and duration of a media session in the platform UI depending on platform conventions. The position state is the combination of the following:
- The duration of the media in seconds.
- The playback rate of the media. It is a coefficient.
- The last reported playback position of the media. This is the playback position of the media in seconds when the position state was created.
The position state is represented by a MediaPositionState which MUST always be stored with the last position updated time. This is the time the position state was last updated in seconds.
The RECOMMENDED way to determine the position state is to monitor the media elements whose node document’s browsing context is the browsing context.
The actual playback rate is a coefficient computed in the following way:
- If the actual playback state is paused, then return zero.
- Return playback rate.
The current playback position in seconds is computed in the following way:
- Set time elapsed to the system time in seconds minus the last position updated time.
- Multiply time elapsed by actual playback rate.
- Set position to time elapsed added to last reported playback position.
- If position is less than zero, return zero.
- If position is greater than duration, return duration.
- Return position.
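The two computations above can be sketched together as follows. This is a non-normative illustration: the function name, the `lastPositionUpdatedTime` property, and passing the system time as an argument are assumptions made so the example is self-contained.

```javascript
// Hypothetical helper. state is
// { duration, playbackRate, position, lastPositionUpdatedTime },
// with all times in seconds. nowSeconds is the current system time.
function currentPlaybackPosition(state, actualPlaybackState, nowSeconds) {
  // The actual playback rate is zero while paused.
  const actualRate =
    actualPlaybackState === "paused" ? 0 : state.playbackRate;
  // Scale the elapsed wall-clock time by the actual playback rate.
  const elapsed = (nowSeconds - state.lastPositionUpdatedTime) * actualRate;
  const position = state.position + elapsed;
  // Clamp to the range [0, duration].
  if (position < 0) {
    return 0;
  }
  if (position > state.duration) {
    return state.duration;
  }
  return position;
}
```

For example, with a duration of 60, a playback rate of 2, and a last reported position of 10, five seconds of playback yields a current position of 20.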
6. The MediaSession interface
[Exposed=Window]
partial interface Navigator {
  [SameObject] readonly attribute MediaSession mediaSession;
};

enum MediaSessionPlaybackState {
  "none",
  "paused",
  "playing"
};

enum MediaSessionAction {
  "play",
  "pause",
  "seekbackward",
  "seekforward",
  "previoustrack",
  "nexttrack",
  "skipad",
  "stop",
  "seekto",
  "togglemicrophone",
  "togglecamera",
  "hangup",
  "previousslide",
  "nextslide",
  "enterpictureinpicture"
};

callback MediaSessionActionHandler = undefined(MediaSessionActionDetails details);

[Exposed=Window]
interface MediaSession {
  attribute MediaMetadata? metadata;
  attribute MediaSessionPlaybackState playbackState;
  undefined setActionHandler(MediaSessionAction action, MediaSessionActionHandler? handler);
  undefined setPositionState(optional MediaPositionState state = {});
  undefined setMicrophoneActive(boolean active);
  undefined setCameraActive(boolean active);
};
A MediaSession object represents a media session for a given document and allows a document to communicate to the user agent some information about the playback and how to handle it.
A MediaSession has an associated metadata object represented by a MediaMetadata. It is initially null.
The mediaSession attribute MUST return the MediaSession instance associated with the Navigator object.
The metadata attribute reflects the MediaSession's metadata. On getting, it MUST return the MediaSession's metadata. On setting, it MUST run the following steps with value being the new value being set:
- If the MediaSession's metadata is not null, set its media session to null.
- Set the MediaSession's metadata to value.
- If the MediaSession's metadata is not null, set its media session to the current MediaSession.
- In parallel, run the update metadata algorithm.
The playbackState attribute represents the declared playback state of the media session, by which the session declares whether its browsing context is playing media or not. The initial value is none. On setting, the user agent MUST set the IDL attribute to the new value if it is a valid MediaSessionPlaybackState value. On getting, the user agent MUST return the last valid value that was set. The playbackState attribute is a hint for the user agent to determine whether the browsing context is playing or paused.
Setting playbackState may cause the actual playback state to change and run the media session actions update algorithm.
The MediaSessionPlaybackState enum is used to indicate whether a browsing context is playing media or not. The values are described as follows:
- none means the browsing context does not specify whether it’s playing or paused. It can only be used in the playbackState attribute.
- playing means the browsing context is currently playing media and it can be paused.
- paused means the browsing context has paused media and it can be resumed.
The setActionHandler(action, handler) method, when invoked, MUST run the update action handler algorithm with action and handler on the MediaSession.
The setPositionState(state) method, when invoked, MUST perform the following steps:
- If the state is an empty dictionary then clear the position state and abort these steps.
- If the duration is not present or its value is null, throw a TypeError.
- If the duration is negative, throw a TypeError.
- If the position is not present or its value is null, set it to zero.
- If the position is negative or greater than duration, throw a TypeError.
- If the playbackRate is not present or its value is null, set it to 1.0.
- If the playbackRate is zero throw a TypeError.
- Update the position state and last position updated time.
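The validation performed by setPositionState(state) can be sketched as follows. This is a non-normative model, not the user agent's implementation: the function name is hypothetical, returning null stands in for "clear the position state", and the sketch assumes processing stops once the state has been cleared.

```javascript
// Hypothetical model of the setPositionState() validation steps.
// Returns the normalized state, or null when the state is cleared.
function validatePositionState(state) {
  const input = state ?? {};
  // An empty dictionary clears the position state.
  if (Object.keys(input).length === 0) {
    return null;
  }
  const duration = input.duration;
  if (duration === undefined || duration === null) {
    throw new TypeError("duration is required");
  }
  if (duration < 0) {
    throw new TypeError("duration must not be negative");
  }
  // Missing position defaults to zero.
  const position = input.position ?? 0;
  if (position < 0 || position > duration) {
    throw new TypeError("position must be between 0 and duration");
  }
  // Missing playbackRate defaults to 1.0.
  const playbackRate = input.playbackRate ?? 1.0;
  if (playbackRate === 0) {
    throw new TypeError("playbackRate must not be zero");
  }
  return { duration, position, playbackRate };
}
```

A page can avoid the TypeError cases by checking its values before calling the real method, for example by skipping the call while the media duration is not yet known.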
The setMicrophoneActive(active) and setCameraActive(active) methods indicate to the user agent whether the microphone and camera are currently considered by the page to be active (e.g. if the microphone is considered "muted" by the page since it is no longer sending audio through to a call, then the page can invoke setMicrophoneActive(false)).
It is RECOMMENDED that the user agent respect the microphone and camera states indicated by the page in this UI.
The user agent MAY display UI which invokes handlers for media session actions.
7. The MediaMetadata interface
[Exposed=Window]
interface MediaMetadata {
  constructor(optional MediaMetadataInit init = {});
  attribute DOMString title;
  attribute DOMString artist;
  attribute DOMString album;
  attribute FrozenArray<MediaImage> artwork;
};

dictionary MediaMetadataInit {
  DOMString title = "";
  DOMString artist = "";
  DOMString album = "";
  sequence<MediaImage> artwork = [];
};
A MediaMetadata object is a representation of the metadata associated with a MediaSession that can be used by user agents to provide customized user interface.
A MediaMetadata can have an associated media session.
A MediaMetadata has an associated title, artist and album which are DOMString.
A MediaMetadata has an associated list of artwork images.
A MediaMetadata is said to be an empty metadata if it is equal to null or all the following conditions are true:
- Its title is the empty string.
- Its artist is the empty string.
- Its album is the empty string.
- Its artwork images length is 0.
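The empty metadata check can be expressed as a small predicate. This non-normative sketch uses a plain object in place of a real MediaMetadata instance, and the function name is hypothetical.

```javascript
// Hypothetical predicate: metadata is null or an object with
// title, artist, album (strings) and artwork (array).
function isEmptyMetadata(metadata) {
  if (metadata === null) {
    return true;
  }
  return (
    metadata.title === "" &&
    metadata.artist === "" &&
    metadata.album === "" &&
    metadata.artwork.length === 0
  );
}
```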
The MediaMetadata(init) constructor, when invoked, MUST run the following steps:
- Let metadata be a new MediaMetadata object.
- Set metadata’s title to init’s title.
- Set metadata’s artist to init’s artist.
- Set metadata’s album to init’s album.
- Run the convert artwork algorithm with init’s artwork as input and set metadata’s artwork images as the result if it succeeded.
- Return metadata.
When the convert artwork algorithm with input parameter is invoked, the user agent MUST run the following steps:
- Let output be an empty list of type MediaImage.
- For each entry in input’s artwork, perform the following steps:
  - Let image be a new MediaImage.
  - Let baseURL be the API base URL specified by the entry settings object.
  - Parse entry’s src using baseURL. If it does not return failure, set image’s src to the return value. Otherwise, throw a TypeError and abort these steps.
  - Set image’s sizes to entry’s sizes.
  - Set image’s type to entry’s type.
  - Append image to the output.
- Return output as result.
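A non-normative sketch of the convert artwork algorithm, using the standard URL constructor to stand in for URL parsing. The function name and the explicit `baseURL` parameter are assumptions; in a user agent the base URL comes from the entry settings object rather than from the caller.

```javascript
// Hypothetical model of the convert artwork algorithm.
// input is an array of { src, sizes?, type? }; baseURL is a string.
function convertArtwork(input, baseURL) {
  const output = [];
  for (const entry of input) {
    let src;
    try {
      // Resolve a possibly relative src against the base URL.
      src = new URL(entry.src, baseURL).href;
    } catch {
      throw new TypeError("Invalid artwork src: " + entry.src);
    }
    output.push({
      src: src,
      sizes: entry.sizes ?? "",
      type: entry.type ?? ""
    });
  }
  return output;
}
```

One consequence visible in this sketch is that a page which sets a relative `src` reads back an absolute URL from the artwork attribute.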
The title attribute reflects the MediaMetadata's title. On getting, it MUST return the MediaMetadata's title. On setting, it MUST set the MediaMetadata's title to the given value.
The artist attribute reflects the MediaMetadata's artist. On getting, it MUST return the MediaMetadata's artist. On setting, it MUST set the MediaMetadata's artist to the given value.
The album attribute reflects the MediaMetadata's album. On getting, it MUST return the MediaMetadata's album. On setting, it MUST set the MediaMetadata's album to the given value.
The artwork attribute reflects the MediaMetadata's artwork images. On getting, it MUST return the result of the following steps:
- Let frozenArtwork be an empty list of type MediaImage.
- For each entry in the MediaMetadata's artwork images, perform the following steps:
  - Let image be a new MediaImage.
  - Set image’s src to entry’s src.
  - Set image’s sizes to entry’s sizes.
  - Set image’s type to entry’s type.
  - Call Object.freeze on image, to prevent accidental mutation by scripts.
  - Append image to frozenArtwork.
- Create a frozen array from frozenArtwork.
On setting, it MUST run the convert artwork algorithm with the new value as input, and set the MediaMetadata's artwork images as the result if it succeeded.
When MediaMetadata's title, artist, album or artwork images are modified, the user agent MUST run the following steps:
- If the instance has no associated media session, abort these steps.
- Otherwise, queue a task to run the following substeps:
  - If the instance no longer has an associated media session, abort these steps.
  - Otherwise, in parallel, run the update metadata algorithm.
8. The MediaImage dictionary
dictionary MediaImage {
  required USVString src;
  DOMString sizes = "";
  DOMString type = "";
};
The MediaImage dictionary members are inspired by ImageResource in [IMAGE-RESOURCE].
The src dictionary member is used to specify the MediaImage object’s source. It is a URL from which the user agent can fetch the image’s data.
The sizes dictionary member is used to specify the MediaImage object’s sizes. It follows the spec of the sizes attribute in the HTML link element, which is a string consisting of an unordered set of unique space-separated tokens which are ASCII case-insensitive and represent the dimensions of an image. Each keyword is either an ASCII case-insensitive match for the string "any", or a value that consists of two valid non-negative integers that do not have a leading U+0030 DIGIT ZERO (0) character and that are separated by a single U+0078 LATIN SMALL LETTER X or U+0058 LATIN CAPITAL LETTER X character. The keywords represent icon sizes in raw pixels (as opposed to CSS pixels). When multiple image objects are available, a user agent MAY use the value to decide which icon is most suitable for a display context (and ignore any that are inappropriate). The parsing steps for the sizes attribute MUST follow the parsing steps for the HTML link element sizes attribute.
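To make the token syntax concrete, the following non-normative sketch splits a sizes string and recognizes the two keyword forms described above. It is not the HTML parsing algorithm itself: the function name and the returned shapes are hypothetical, and silently skipping malformed tokens is an assumption of this sketch.

```javascript
// Hypothetical parser for a sizes string such as "128x128 256X256 any".
// Returns an array whose items are either the string "any" or
// { width, height } objects in raw pixels.
function parseSizes(sizes) {
  const result = [];
  // Split on ASCII whitespace and drop empty tokens.
  const tokens = sizes.split(/[ \t\n\f\r]+/).filter(Boolean);
  for (const token of tokens) {
    const lower = token.toLowerCase();
    if (lower === "any") {
      result.push("any");
      continue;
    }
    // Two integers without a leading zero, separated by "x" or "X".
    const match = /^([1-9][0-9]*)x([1-9][0-9]*)$/.exec(lower);
    if (match) {
      result.push({ width: Number(match[1]), height: Number(match[2]) });
    }
    // Tokens that match neither form are skipped (an assumption here).
  }
  return result;
}
```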
The type dictionary member is used to specify the MediaImage object’s MIME type. It is a hint as to the media type of the image. The purpose of this attribute is to allow a user agent to ignore images of media types it does not support.
9. The MediaPositionState dictionary
dictionary MediaPositionState {
  double duration;
  double playbackRate;
  double position;
};
The MediaPositionState dictionary is a representation of the current playback position associated with a MediaSession that can be used by user agents to provide a user interface that displays the current playback position and duration.
The duration dictionary member is used to specify the duration in seconds. It should always be positive and positive infinity can be used to indicate media without a defined end such as live playback.
The playbackRate dictionary member is used to specify the playback rate. It can be positive to represent forward playback or negative to represent backwards playback. It should not be zero.
The position dictionary member is used to specify the last reported playback position in seconds. It should always be positive.
10. The MediaSessionActionDetails dictionary
dictionary MediaSessionActionDetails {
  required MediaSessionAction action;
  double seekOffset;
  double seekTime;
  boolean fastSeek;
};
The MediaSessionActionHandler MUST be run with the details parameter which is represented by a dictionary inherited from MediaSessionActionDetails.
The action dictionary member is used to specify the media session action that the MediaSessionActionHandler is associated with.
The seekOffset dictionary member MAY be provided when the media session action is seekbackward or seekforward. It is the time in seconds to move the playback time by. If present, it should always be positive. If it is not provided then the site should choose a sensible time (e.g. a few seconds).
When the media session action is seekto:
- The seekTime dictionary member MUST be provided and is the time in seconds to move the playback time to.
- The fastSeek dictionary member MAY be provided and will be true if the action is being called multiple times as part of a sequence and this is not the last call in that sequence.
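A non-normative sketch of how a page might consume these members in its seek handlers. The function name, the 10-second default offset, and the clamping to the media duration are assumptions of this example; `media` is any object with `currentTime`, `duration` and, optionally, a `fastSeek()` method, as on an HTML media element.

```javascript
// Hypothetical handler body shared by the seekbackward, seekforward
// and seekto actions. Returns the resulting playback time.
function applySeekAction(media, details, defaultOffset = 10) {
  // Fall back to a page-chosen offset when seekOffset is absent.
  const offset = details.seekOffset ?? defaultOffset;
  switch (details.action) {
    case "seekbackward":
      media.currentTime = Math.max(media.currentTime - offset, 0);
      break;
    case "seekforward":
      media.currentTime = Math.min(media.currentTime + offset, media.duration);
      break;
    case "seekto":
      // During a rapid sequence of seeks, prefer the cheaper fastSeek()
      // when the media object provides it.
      if (details.fastSeek && typeof media.fastSeek === "function") {
        media.fastSeek(details.seekTime);
      } else {
        media.currentTime = details.seekTime;
      }
      break;
  }
  return media.currentTime;
}
```

A page could register it with, for example, `navigator.mediaSession.setActionHandler("seekforward", (details) => applySeekAction(audio, details))`, where `audio` is the page's own media element.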
11. Permissions Policy Integration
This specification defines a policy-controlled feature identified by the string "mediasession". Its default allowlist is *.
A document’s permissions policy determines whether any content in that document is allowed to use the MediaSession API. If disabled in the document, the User Agent MUST NOT select the document’s media session as the active media session.
12. Examples
This section is non-normative.
navigator.mediaSession.metadata = new MediaMetadata({
  title: "Episode Title",
  artist: "Podcast Host",
  album: "Podcast Title",
  artwork: [{ src: "podcast.jpg" }]
});
Alternatively, providing multiple artwork images in the metadata lets the user agent select different artwork images for different display purposes and better fit different screens:
navigator.mediaSession.metadata = new MediaMetadata({
  title: "Episode Title",
  artist: "Podcast Host",
  album: "Podcast Title",
  artwork: [
    { src: "podcast.jpg", sizes: "128x128", type: "image/jpeg" },
    { src: "podcast_hd.jpg", sizes: "256x256" },
    { src: "podcast_xhd.jpg", sizes: "1024x1024", type: "image/jpeg" },
    { src: "podcast.png", sizes: "128x128", type: "image/png" },
    { src: "podcast_hd.png", sizes: "256x256", type: "image/png" },
    { src: "podcast.ico", sizes: "128x128 256x256", type: "image/x-icon" }
  ]
});
For example, if the user agent wants to use an image as icon, it may choose "podcast.jpg" or "podcast.png" for a low-pixel-density screen, and "podcast_hd.jpg" or "podcast_hd.png" for a high-pixel-density screen. If the user agent wants to use an image for lockscreen background, "podcast_xhd.jpg" will be preferred.
For playlists or chapters of an audio book, multiple media elements can share a single media session.
var audio1 = document.createElement("audio");
audio1.src = "chapter1.mp3";

var audio2 = document.createElement("audio");
audio2.src = "chapter2.mp3";

audio1.play();
audio1.addEventListener("ended", function () {
  audio2.play();
});
Because the session is shared, the metadata must be updated to reflect what is currently playing.
function updateMetadata(event) {
  navigator.mediaSession.metadata = new MediaMetadata({
    title: event.target == audio1 ? "Chapter 1" : "Chapter 2",
    artist: "An Author",
    album: "A Book",
    artwork: [{ src: "cover.jpg" }]
  });
}

audio1.addEventListener("play", updateMetadata);
audio2.addEventListener("play", updateMetadata);
var tracks = ["chapter1.mp3", "chapter2.mp3", "chapter3.mp3"];
var trackId = 0;

var audio = document.createElement("audio");
audio.src = tracks[trackId];

function updatePlayingMedia() {
  audio.src = tracks[trackId];
  // Update metadata (omitted)
}

navigator.mediaSession.setActionHandler("previoustrack", function () {
  trackId = (trackId + tracks.length - 1) % tracks.length;
  updatePlayingMedia();
});

navigator.mediaSession.setActionHandler("nexttrack", function () {
  trackId = (trackId + 1) % tracks.length;
  updatePlayingMedia();
});

navigator.mediaSession.setActionHandler("seekto", function (details) {
  audio.currentTime = details.seekTime;
});
Setting playbackState:
When a page pauses its media and plays a third-party ad in an iframe, the UA might consider the session as "not playing". However, the page wants to allow the user to pause the ad playback and cancel the pending playback after the ad finishes.
var adFrame;
var audio = document.createElement("audio");
audio.src = "foo.mp3";

function resetActionHandlers() {
  navigator.mediaSession.setActionHandler("play", _ => audio.play());
  navigator.mediaSession.setActionHandler("pause", _ => audio.pause());
}

resetActionHandlers();

// This method will be called when the page wants to play some ad.
function pauseAudioAndPlayAd() {
  audio.pause();
  navigator.mediaSession.playbackState = "playing";
  setUpAdFrame();
  adFrame.contentWindow.postMessage("play_ad");
  navigator.mediaSession.setActionHandler("pause", pauseAd);
}

function pauseAd() {
  adFrame.contentWindow.postMessage("pause_ad");
  navigator.mediaSession.playbackState = "paused";
  navigator.mediaSession.setActionHandler("play", resumeAd);
}

function resumeAd() {
  adFrame.contentWindow.postMessage("resume_ad");
  navigator.mediaSession.playbackState = "playing";
  navigator.mediaSession.setActionHandler("pause", pauseAd);
}

window.onmessage = function (e) {
  if (e.data === "ad finished") {
    removeAdFrame();
    navigator.mediaSession.playbackState = "none";
    resetActionHandlers();
  }
};

function setUpAdFrame() {
  adFrame = document.createElement("iframe");
  adFrame.src = "https://example.com/ad-iframe.html";
  document.body.appendChild(adFrame);
}

function removeAdFrame() {
  adFrame.remove();
}
// Media is loaded, set the duration.
navigator.mediaSession.setPositionState({ duration: 60 });

// Media starts playing at the beginning.
navigator.mediaSession.playbackState = "playing";

// Media starts playing at 2x 10 seconds in.
navigator.mediaSession.setPositionState({ duration: 60, playbackRate: 2, position: 10 });

// Media is paused.
navigator.mediaSession.playbackState = "paused";

// Media is reset.
navigator.mediaSession.setPositionState(null);
var isMicrophoneActive = false;
var isCameraActive = false;

navigator.mediaSession.setMicrophoneActive(isMicrophoneActive);
navigator.mediaSession.setCameraActive(isCameraActive);

navigator.mediaSession.setActionHandler("togglemicrophone", function () {
  if (isMicrophoneActive) {
    // Mute the microphone. Implementation omitted.
  } else {
    // Unmute the microphone. Implementation omitted.
  }
  isMicrophoneActive = !isMicrophoneActive;
  navigator.mediaSession.setMicrophoneActive(isMicrophoneActive);
});

navigator.mediaSession.setActionHandler("togglecamera", function () {
  if (isCameraActive) {
    // Disable the camera. Implementation omitted.
  } else {
    // Enable the camera. Implementation omitted.
  }
  isCameraActive = !isCameraActive;
  navigator.mediaSession.setCameraActive(isCameraActive);
});

navigator.mediaSession.setActionHandler("hangup", function () {
  // End the call. Implementation omitted.
});
var currentSlideIndex = 0;

navigator.mediaSession.setActionHandler("previousslide", function () {
  currentSlideIndex--;
  // Set current slide. Implementation omitted.
});

navigator.mediaSession.setActionHandler("nextslide", function () {
  currentSlideIndex++;
  // Set current slide. Implementation omitted.
});
navigator.mediaSession.setActionHandler("enterpictureinpicture", function () {
  remoteVideo.requestPictureInPicture();
});
Acknowledgments
The editors would like to thank Paul Adenot, Jake Archibald, Tab Atkins, Jonathan Bailey, François Beaufort, Marcos Caceres, Domenic Denicola, Ralph Giles, Anne van Kesteren, Tobie Langel, Michael Mahemoff, Jer Noble, Elliott Sprehn, Chris Wilson, and Jörn Zaefferer for their participation in technical discussions that ultimately made this specification possible.
Special thanks go to Philip Jägenstedt and David Vest for their help in designing every aspect of media sessions and for their seemingly infinite patience in working through the initial design issues; Jer Noble for his help in building a model that also works well within the iOS audio focus model; and Mounir Lamouri and Anton Vayvod for their early involvement, feedback and support in making this specification happen.