WebM Byte Stream Format

W3C Group Note

This version:
https://www.w3.org/TR/2024/NOTE-mse-byte-stream-format-webm-20240718/
Latest published version:
https://www.w3.org/TR/mse-byte-stream-format-webm/
Latest editor's draft:
https://w3c.github.io/mse-byte-stream-format-webm/
History:
https://www.w3.org/standards/history/mse-byte-stream-format-webm/
Commit history
Editor:
Chris Needham (British Broadcasting Corporation)
Former editors:
(W3C Invited Expert) (Until February 2024)
Jerry Smith (Microsoft Corporation) (Until September 2017)
Aaron Colwell (Google Inc.) (Until April 2015)
Feedback:
GitHub w3c/mse-byte-stream-format-webm (pull requests, new issue, open issues)
public-media-wg@w3.org with subject line [mse-byte-stream-format-webm] … message topic … (archives)

Abstract

This specification defines a Media Source Extensions™ [MEDIA-SOURCE] byte stream format based on the WebM container format [WEBM].

Status of This Document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

The working group maintains a list of all bug reports that the editors have not yet tried to address; there may also be related open bugs in the GitHub repository of the Media Source Extensions™ specification.

This document was published by the Media Working Group as a Group Note using the Note track.

This Group Note is endorsed by the Media Working Group, but is not endorsed by W3C itself nor its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The W3C Patent Policy does not carry any licensing requirements or commitments on this document.

This document is governed by the 03 November 2023 W3C Process Document.

1. Introduction

This specification describes a byte stream format based on the WebM container format [WEBM].

It defines the MIME-type parameters used to signal codecs, and provides the format-specific definitions for initialization segments, media segments, and random access points required by the Byte Stream Formats section of the Media Source Extensions™ specification.

2. MIME-type parameters

This section specifies the parameters that can be used in the MIME-type passed to isTypeSupported() or addSourceBuffer().

codecs
A comma-separated list of codec IDs that specifies which codecs will be used in the byte stream.
Codec ID    Valid with "audio/webm"    Valid with "video/webm"
vorbis      true                       true
opus        true                       true
vp8         false                      true
vp9         false                      true
vp09...     false                      true
The "vp09..." codec ID is as described in the VP Codec ISO Media File Format Binding document [VP09CODECSPARAMETERSTRING].
Note
Implementations SHOULD support all of the codec IDs mentioned in the table above.
Note
Implementations SHOULD encourage applications to prefer the "vp09..." codec ID over "vp9". The "vp09..." format provides detailed profile and color information, enabling implementations to give more accurate answers for codec support.

Examples of valid MIME-types with a codecs parameter:

  • audio/webm;codecs="vorbis"
  • video/webm;codecs="vorbis"
  • video/webm;codecs="vp8"
  • video/webm;codecs="vp8,vorbis"
  • video/webm;codecs="vp9,opus"
  • video/webm;codecs="vp09.00.10.08"
  • video/webm;codecs="vp09.02.10.10.01.09.16.09.01,opus"
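
The table and examples above can be checked mechanically. The following is a minimal, non-normative sketch in Python; the helper name is_valid_webm_type and the ALLOWED mapping are hypothetical and are not part of any Media Source Extensions™ API. It assumes the MIME-type carries exactly one parameter, codecs, and it accepts any codec ID beginning with "vp09." without validating the remaining fields.

```python
# Hypothetical helper (not an MSE API): checks a MIME-type string against
# the codec ID table in this section.
ALLOWED = {
    "audio/webm": {"vorbis", "opus"},
    "video/webm": {"vorbis", "opus", "vp8", "vp9"},
}


def is_valid_webm_type(mime: str) -> bool:
    base, _, params = mime.partition(";")
    base = base.strip().lower()
    if base not in ALLOWED:
        return False
    params = params.strip()
    # Assumption of this sketch: the only parameter is codecs=...
    if not params.startswith("codecs="):
        return False
    value = params[len("codecs="):].strip().strip('"')
    codecs = [c.strip() for c in value.split(",")]
    if any(c == "" for c in codecs):
        return False
    for codec in codecs:
        if codec in ALLOWED[base]:
            continue
        # "vp09..." IDs are valid only with video/webm. The detailed syntax
        # is defined in [VP09CODECSPARAMETERSTRING] and is not checked here.
        if base == "video/webm" and codec.startswith("vp09."):
            continue
        return False
    return True
```

With this sketch, every example in the list above is accepted, while a string such as audio/webm;codecs="vp8" is rejected because the table marks vp8 as not valid with "audio/webm".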

3. Initialization Segments

A WebM initialization segment MUST contain a subset of the elements at the start of a typical WebM file.

The user agent MUST run the append error algorithm if any of the following conditions are not met:

  1. The initialization segment MUST start with an EBML Header element, followed by a Segment header.
  2. The size value in the Segment header MUST signal an "unknown size" or contain a value large enough to include the Segment Information and Track elements that follow.
  3. A Segment Information element and a Track element MUST appear, in that order, after the Segment header and before any further EBML Header or Cluster elements.

The user agent MUST accept and ignore any elements other than an EBML Header or a Cluster that occur before, in between, or after the Segment Information and Track elements.
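
The three conditions and the "accept and ignore" rule can be illustrated with a small EBML walker. This is a non-normative sketch, not a user agent implementation: the function names are hypothetical, the buffer is assumed to hold the complete initialization segment, and the sketch additionally treats an unknown-size EBML Header or unknown-size child element as an error because it cannot skip over one. The element IDs are the standard EBML/Matroska values.

```python
EBML_ID = 0x1A45DFA3      # EBML Header
SEGMENT_ID = 0x18538067   # Segment
INFO_ID = 0x1549A966      # Segment Information
TRACKS_ID = 0x1654AE6B    # Track element
CLUSTER_ID = 0x1F43B675   # Cluster


def _vint_length(first: int, max_len: int) -> int:
    # The number of leading zero bits in the first byte gives the length.
    length, mask = 1, 0x80
    while length <= max_len and not (first & mask):
        mask >>= 1
        length += 1
    if length > max_len:
        raise ValueError("invalid variable-length integer")
    return length


def read_id(buf: bytes, pos: int):
    """Return (element_id, next_pos). The ID keeps its length-marker bits."""
    length = _vint_length(buf[pos], 4)
    if pos + length > len(buf):
        raise ValueError("truncated element ID")
    return int.from_bytes(buf[pos:pos + length], "big"), pos + length


def read_size(buf: bytes, pos: int):
    """Return (size, next_pos); size is None for an "unknown size"."""
    length = _vint_length(buf[pos], 8)
    if pos + length > len(buf):
        raise ValueError("truncated element size")
    all_ones = (1 << (7 * length)) - 1
    value = int.from_bytes(buf[pos:pos + length], "big") & all_ones
    return (None if value == all_ones else value), pos + length


def check_init_segment(buf: bytes) -> bool:
    """False means the append error algorithm would be run."""
    try:
        eid, pos = read_id(buf, 0)
        size, pos = read_size(buf, pos)
        if eid != EBML_ID or size is None:      # condition 1 (first half)
            return False
        pos += size                             # skip the EBML Header body
        eid, pos = read_id(buf, pos)
        seg_size, pos = read_size(buf, pos)
        if eid != SEGMENT_ID:                   # condition 1 (second half)
            return False
        seg_end = len(buf) if seg_size is None else pos + seg_size
        seen = []
        while pos < len(buf):
            eid, pos = read_id(buf, pos)
            size, pos = read_size(buf, pos)
            if eid in (EBML_ID, CLUSTER_ID):    # end of initialization segment
                break
            if size is None or pos + size > len(buf):
                return False                    # sketch cannot skip this element
            pos += size
            if eid in (INFO_ID, TRACKS_ID):
                if pos > seg_end:               # condition 2
                    return False
                seen.append(eid)
            # Any other element is accepted and ignored.
        return seen == [INFO_ID, TRACKS_ID]     # condition 3
    except (IndexError, ValueError):
        return False
```

For example, a buffer holding an empty EBML Header, an unknown-size Segment header, an empty Segment Information element and an empty Track element (in that order) passes the check, while the same buffer with the last two elements swapped does not.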

The user agent MUST source attribute values for id, kind, label and language for AudioTrack, VideoTrack and TextTrack objects as described for WebM in the in-band tracks spec [INBANDTRACKS].

4. Media Segments

A WebM media segment is a single Cluster element.

The user agent uses the following rules when interpreting content in a Cluster:

  1. The TimecodeScale in the WebM initialization segment most recently appended applies to all timestamps in the Cluster.
  2. The Timecode element in the Cluster contains a presentation timestamp in TimecodeScale units.
  3. The Cluster header MAY contain an "unknown" size value. If it does, then the end of the cluster is reached when another Cluster header or an element header that indicates the start of a WebM initialization segment is encountered.
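
As a non-normative illustration of rules 1 and 2, the sketch below converts a block timestamp to seconds. The function name and parameters are hypothetical. It relies on two facts from the underlying container format rather than from this document: TimecodeScale is expressed in nanoseconds per tick (defaulting to 1,000,000), and each block carries a signed 16-bit offset relative to the Timecode of its Cluster.

```python
def presentation_time_seconds(cluster_timecode: int,
                              block_offset: int,
                              timecode_scale_ns: int = 1_000_000) -> float:
    """Convert a block timestamp to seconds.

    cluster_timecode:  value of the Cluster's Timecode element (in ticks)
    block_offset:      signed 16-bit offset stored in the block (in ticks)
    timecode_scale_ns: TimecodeScale from the most recently appended
                       initialization segment (nanoseconds per tick)
    """
    ticks = cluster_timecode + block_offset
    return ticks * timecode_scale_ns / 1_000_000_000
```

With the default TimecodeScale, a Cluster Timecode of 5000 and a block offset of 40 correspond to a presentation time of 5.04 seconds.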

The user agent MUST run the append error algorithm if any of the following conditions are not met:

  1. The Timecode element MUST appear before any Block and SimpleBlock elements in a Cluster.
  2. Block and SimpleBlock elements MUST be in time-increasing order consistent with [WEBM].
  3. If the most recent WebM initialization segment describes multiple tracks, then blocks from all the tracks MUST be interleaved in time-increasing order. At least one block from each audio and video track MUST be present.
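
Conditions 2 and 3 can be sketched as a single pass over the blocks of a Cluster. This example is non-normative and rests on stated assumptions: the blocks have already been parsed into (track number, timestamp) pairs in stored order, "time-increasing" is read as non-decreasing, and the function name check_cluster_blocks is hypothetical. Condition 1 (the position of the Timecode element) is not covered.

```python
def check_cluster_blocks(blocks, av_track_numbers) -> bool:
    """Return False where the append error algorithm would be run.

    blocks:           list of (track_number, timestamp) in stored order
    av_track_numbers: audio and video track numbers described by the most
                      recent initialization segment
    """
    previous = None
    seen = set()
    for track, timestamp in blocks:
        # Assumption of this sketch: equal timestamps are allowed, so only
        # a decrease counts as a violation of the interleaving order.
        if previous is not None and timestamp < previous:
            return False
        previous = timestamp
        seen.add(track)
    # At least one block from each audio and video track must be present.
    return seen >= set(av_track_numbers)
```

For a two-track stream, the sequence (1, 0), (2, 0), (2, 21), (1, 33) passes, whereas (1, 0), (2, 0), (1, 33), (2, 21) fails because the last block goes back in time.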

The user agent MUST accept and ignore Cues or Chapters elements that follow a Cluster element.

5. Random Access Points

Either a SimpleBlock element with its Keyframe flag set, or a BlockGroup element having no ReferenceBlock elements, signals the location of a random access point for that track. The order of multiplexed blocks within a media segment MUST conform to the WebM Muxer Guidelines.
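
The SimpleBlock case can be illustrated by decoding the start of a SimpleBlock payload. This is a non-normative sketch with a hypothetical function name. It relies on the container format's SimpleBlock layout (a variable-length track number, a signed 16-bit relative timestamp, then a flags byte whose most significant bit is the Keyframe flag) and does not cover the BlockGroup case.

```python
def parse_simple_block_header(payload: bytes):
    """Return (track_number, relative_timestamp, is_keyframe).

    payload is the body of a SimpleBlock element, without its ID and size.
    """
    first = payload[0]
    length, mask = 1, 0x80
    # The number of leading zero bits gives the track number's byte length.
    while length <= 8 and not (first & mask):
        mask >>= 1
        length += 1
    if length > 8 or len(payload) < length + 3:
        raise ValueError("malformed SimpleBlock header")
    raw = int.from_bytes(payload[:length], "big")
    track_number = raw & ((1 << (7 * length)) - 1)   # drop the marker bit
    relative_timestamp = int.from_bytes(
        payload[length:length + 2], "big", signed=True)
    flags = payload[length + 2]
    return track_number, relative_timestamp, bool(flags & 0x80)
```

A payload beginning with the bytes 81 00 28 80 decodes to track 1, a relative timestamp of 40, and a set Keyframe flag, so that block would mark a random access point for track 1.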

6. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

7. Acknowledgments

The editors would like to thank Chris Cunningham, Frank Galligan, and Philip Jägenstedt for their contributions to this specification.

A. References

A.1 Normative references

[HTML]
HTML Standard. Anne van Kesteren; Domenic Denicola; Ian Hickson; Philip Jägenstedt; Simon Pieters. WHATWG. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INBANDTRACKS]
Sourcing In-band Media Resource Tracks from Media Containers into HTML. Silvia Pfeiffer; Bob Lund. W3C. 26 April 2015. Unofficial Draft. URL: https://dev.w3.org/html5/html-sourcing-inband-tracks/
[MEDIA-SOURCE]
Media Source Extensions™. Jean-Yves Avenard; Mark Watson. W3C. 4 July 2024. W3C Working Draft. URL: https://www.w3.org/TR/media-source-2/
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[VP09CODECSPARAMETERSTRING]
VP Codec ISO Media File Format Binding. Frank Galligan; Kilroy Hughes; Thomás Inskip; David Ronca. WebM Project. URL: https://www.webmproject.org/vp9/mp4/#codecs-parameter-string
[WEBM]
WebM Container Guidelines. The WebM Project. 26 April 2016. URL: https://www.webmproject.org/docs/container/