Copyright © 2015 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines how a stream of media can be captured from a DOM element, such as a <video>, <audio>, or <canvas> element, in the form of a MediaStream [GETUSERMEDIA].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
It is partially based on existing implementation experience in Firefox; it is nevertheless still an early proposal and, while early experimentation is encouraged, it is not intended for implementation.
This document was published by the Web Real-Time Communication Working Group and Device APIs Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Web Real-Time Communication Working Group, Device APIs Working Group) made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
This section is non-normative.
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
The captured media is formed into a MediaStream [GETUSERMEDIA], which can then be consumed by the various APIs that process streams of media, such as WebRTC [WEBRTC] or Web Audio [WEBAUDIO].
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, and SHOULD are to be interpreted as described in [RFC2119].
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The captureStream() and captureStreamUntilEnded() methods are defined on HTML [HTML5] media elements.
Both MediaStream and HTMLMediaElement expose the concept of a track. Since there is no common type used for HTMLMediaElement, this document uses the term track to refer to either VideoTrack or AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.
partial interface HTMLMediaElement {
    MediaStream captureStream ();
    MediaStream captureStreamUntilEnded ();
};
captureStream
The captureStream() method produces a real-time capture of the media that is rendered to the media element. The captured MediaStream comprises MediaStreamTracks that render the content from the set of selected (for VideoTracks, or other exclusively selected track types) or enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled track of a given type, then no MediaStreamTrack of that type is present in the captured stream.
A <video> element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks. An <audio> element can capture any number of audio MediaStreamTracks. In both cases, the set of captured MediaStreamTracks could be empty.
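For example, a page might capture the rendered output of a playing media element as follows (a non-normative sketch that runs only in a browser; the element id is assumed):

```javascript
// Assumes a <video id="source"> element that is playing media
// with at least one selected video track.
const video = document.getElementById('source');
const stream = video.captureStream();

// The stream contains one MediaStreamTrack per selected or
// enabled track in the element; the set may be empty.
console.log(stream.getVideoTracks().length, 'video track(s)');
console.log(stream.getAudioTracks().length, 'audio track(s)');
```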
captureStream() produces a MediaStream that captures any media that is currently playing on the element. Changes in the media element source do not cause the stream to terminate, though the set of MediaStreamTracks might change over time. If the source stream for the media element ends, or a different source is selected, the MediaStream continues to capture the state of the media element. This means that there could be periods where the captured stream has no active media content.
If the selected VideoTrack changes, or the set of enabled AudioTracks changes for the media element, MediaStreamTracks are added or removed as necessary to ensure that the MediaStreamTracks in the MediaStream correctly reflect the changes. The necessary addtrack and removetrack events are generated to notify applications of these changes.
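An application can observe these changes by listening for the addtrack and removetrack events on the captured stream (a non-normative, browser-only sketch):

```javascript
// Assumes `stream` was obtained from HTMLMediaElement.captureStream().
const stream = document.querySelector('video').captureStream();

stream.addEventListener('addtrack', (event) => {
  // A new track was selected or enabled on the media element.
  console.log('track added:', event.track.kind, event.track.id);
});

stream.addEventListener('removetrack', (event) => {
  // A track was deselected or disabled on the media element.
  console.log('track removed:', event.track.kind, event.track.id);
});
```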
A captured MediaStreamTrack ends when the track that it captures ends. Captured MediaStreamTracks also end when the tracks that are rendered to the media element change, causing the MediaStreamTrack to be removed from the captured stream; that is, when a different VideoTrack is selected or the corresponding AudioTrack is disabled.
If media playback is paused, the captured stream continues to produce whatever is being actively rendered to the element. What is rendered to the captured stream will vary based on the type of media: a VideoTrack might capture a still frame, and an AudioTrack might capture silence.
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding the media element suppress captured video.
captureStreamUntilEnded
A stream captured using captureStreamUntilEnded() captures the rendered output from a single media resource. The resulting stream ends when the media element has ended playback, or when the media element is changed to render a different resource.
captureStreamUntilEnded() operates in the same way that captureStream() does, except that once a captured MediaStreamTrack is removed from the MediaStream, no further tracks are added.
A stream captured with captureStreamUntilEnded() MAY still start with fewer tracks than the media element permits. The first track of any given type that becomes selected or enabled results in a MediaStreamTrack being added to the captured stream. Only the first track of a given type is added; new AudioTracks that are subsequently enabled are not added to the capture.
Once all tracks in the captured stream have been removed, the captured stream becomes permanently inactive. New tracks are not added to the capture, even if they are the first of their type.
This allows for a media element that renders multiple media types to be captured without a complete set of media being present when the capture is initiated. For instance, a <video> element might initially only render audio, but have a VideoTrack added (or selected) after the capture commences. A late-starting VideoTrack would consequently be added to the capture. However, if the VideoTrack commences after the AudioTrack ends, then the VideoTrack will not be added to the captured MediaStream.
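The behavior described above might be used as follows (a non-normative, browser-only sketch):

```javascript
// Assumes an <audio> element rendering a single media resource.
const audio = document.querySelector('audio');
const stream = audio.captureStreamUntilEnded();

// When the element ends playback, or switches to a different
// resource, the captured tracks end and are removed; once all
// tracks are gone the stream is permanently inactive.
stream.addEventListener('removetrack', () => {
  if (stream.getTracks().length === 0) {
    console.log('capture is now permanently inactive');
  }
});
```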
The captureStream() method is added to the HTML [HTML5] canvas element. The resulting CanvasCaptureMediaStream provides methods that allow for controlling when frames are sampled from the canvas.
partial interface HTMLCanvasElement {
    CanvasCaptureMediaStream captureStream (optional double frameRate);
};
captureStream
The captureStream() method produces a real-time video capture of the surface of the canvas. The resulting media stream has a single video MediaStreamTrack that matches the dimensions of the canvas element.
This method throws a SecurityError exception if the canvas is not origin-clean. The captured stream immediately ceases to capture content from the canvas if the origin-clean flag of the canvas becomes false at any time.
A user agent SHOULD await a stable state in the script execution of the current page or worker that has control of the canvas before capturing a frame.
In order to support manual control of frame capture, browsers MUST support a value of 0 for frameRate. The captured stream always captures at least one frame, even if frameRate is zero.
If the frameRate value is omitted, the user agent SHOULD capture new frames each time that the content of the canvas changes.
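The effect of the frameRate argument can be illustrated as follows (a non-normative, browser-only sketch):

```javascript
const canvas = document.querySelector('canvas');

// Capture at most 25 frames per second.
const stream25 = canvas.captureStream(25);

// Omitting frameRate lets the user agent capture a new frame
// whenever the content of the canvas changes.
const streamAuto = canvas.captureStream();

// A frameRate of 0 suppresses automatic capture entirely;
// frames are then captured only via requestFrame().
const streamManual = canvas.captureStream(0);
```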
Parameter | Type | Nullable | Optional | Description
---|---|---|---|---
frameRate | double | ✘ | ✔ |
CanvasCaptureMediaStream
The CanvasCaptureMediaStream is an extension of MediaStream that provides a single requestFrame() method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.
interface CanvasCaptureMediaStream : MediaStream {
    readonly attribute HTMLCanvasElement canvas;
    void requestFrame ();
};
canvas of type HTMLCanvasElement, readonly
requestFrame
The requestFrame() method allows applications to manually request that a frame from the canvas be captured and rendered into the media stream. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.
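A typical manual-capture loop might look like this (a non-normative, browser-only sketch; the drawing code is illustrative):

```javascript
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');

// A frameRate of 0 suppresses automatic capture so that frames
// are added to the stream only when requestFrame() is called.
const stream = canvas.captureStream(0);

function drawAndCapture() {
  ctx.fillStyle = 'black';
  ctx.fillRect(0, 0, canvas.width, canvas.height);
  // ... progressively render the rest of the frame ...

  // Capture only once the frame is completely rendered, so the
  // stream never contains a partially rendered frame.
  stream.requestFrame();
  requestAnimationFrame(drawAndCapture);
}
requestAnimationFrame(drawAndCapture);
```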
Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStream MUST be protected from access by the document origin.
How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a canvas element [2DCONTEXT] causes the origin-clean property of the canvas to become false; attempting to create a Web Audio MediaStreamAudioSourceNode [WEBAUDIO] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using WebRTC [WEBRTC] results in no information being transmitted.
The origin of the media that is rendered by a media element can change at any time; this is even the case for a single media resource. User agents MUST ensure that a change in the origin of media does not result in exposure of cross-origin content.
This section will be removed before publication.
This document is based on the stream processing specification [streamproc] originally developed by Robert O'Callahan.