ISO BMFF Byte Stream Format

W3C Group Note

This version:
https://www.w3.org/TR/2024/NOTE-mse-byte-stream-format-isobmff-20240718/
Latest published version:
https://www.w3.org/TR/mse-byte-stream-format-isobmff/
Latest editor's draft:
https://w3c.github.io/mse-byte-stream-format-isobmff/
History:
https://www.w3.org/standards/history/mse-byte-stream-format-isobmff/
Commit history
Editor:
Mark Watson (Netflix Inc.)
Former editors:
(W3C Invited Expert) (Until February 2024)
Jerry Smith (Microsoft Corporation) (Until September 2017)
Aaron Colwell (Google Inc.) (Until April 2015)
Adrian Bateman (Microsoft Corporation) (Until April 2015)
Feedback:
GitHub w3c/mse-byte-stream-format-isobmff (pull requests, new issue, open issues)
public-media-wg@w3.org with subject line [mse-byte-stream-format-isobmff] … message topic … (archives)

Abstract

This specification defines a Media Source Extensions™ [MEDIA-SOURCE] byte stream format specification based on the ISO Base Media File Format [ISOBMFF].

Status of This Document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

The working group maintains a list of all bug reports that the editors have not yet tried to address; there may also be related open bugs in the GitHub repository of the Media Source Extensions™ specification.

This document was published by the Media Working Group as a Group Note using the Note track.

This Group Note is endorsed by the Media Working Group, but is not endorsed by W3C itself nor its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The W3C Patent Policy does not carry any licensing requirements or commitments on this document.

This document is governed by the 03 November 2023 W3C Process Document.

1. Introduction

This specification defines segment formats for implementations of Media Source Extensions™ [MEDIA-SOURCE] that choose to support the ISO Base Media File Format [ISOBMFF].

It defines the MIME-type parameters used to signal codecs, and provides the necessary format specific definitions for initialization segments, media segments, and random access points required by the Byte Stream Formats section of the Media Source Extensions™ specification.

2. MIME-type parameters

This section specifies the parameters that can be used in the MIME-type passed to isTypeSupported() or addSourceBuffer().

MIME-types for this specification MUST conform to the rules outlined for "audio/mp4" and "video/mp4" in [RFC6381].

Note
Implementations MAY implement only a subset of the codecs and profiles mentioned in [RFC6381].
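
As an illustration of the kind of MIME-type string this section covers, the following is a minimal sketch that splits such a string into its media type and its list of codec identifiers. The function name parse_mp4_mime_type is hypothetical, the example codec strings are common RFC 6381 values rather than values required by this specification, and the parser is deliberately simplified (it does not handle every quoting rule a full MIME parser would).

```python
def parse_mp4_mime_type(mime):
    """Split an MP4 MIME type into (media_type, [codec, ...]).

    Simplified sketch: assumes parameters are separated by ';' and that
    the codecs value is an optionally quoted, comma-separated list.
    """
    media_type, _, params = mime.partition(";")
    media_type = media_type.strip().lower()
    if media_type not in ("audio/mp4", "video/mp4"):
        raise ValueError("not an ISO BMFF MIME type: %r" % media_type)

    codecs = []
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "codecs":
            # Drop surrounding quotes, then split the comma-separated list.
            unquoted = value.strip().strip('"')
            codecs = [c.strip() for c in unquoted.split(",") if c.strip()]
    return media_type, codecs


# Example: a video type carrying one video and one audio codec string.
print(parse_mp4_mime_type('video/mp4; codecs="avc1.42E01E, mp4a.40.2"'))
```

A string of this shape is what a page would pass to isTypeSupported() or addSourceBuffer(); whether a given codec string is accepted remains up to the user agent.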

3. Initialization Segments

An ISO BMFF initialization segment is defined in this specification as a single File Type Box (ftyp) followed by a single Movie Box (moov).

The user agent MUST run the append error algorithm if any of the following conditions are met:

  1. A File Type Box contains a major_brand or compatible_brand that the user agent does not support.
  2. A box or field in the Movie Box is encountered that violates the requirements mandated by the major_brand or one of the compatible_brands in the File Type Box.
  3. The tracks in the Movie Box contain samples (i.e., the entry_count in any of the stts, stsc or stco boxes is not set to zero).
  4. A Movie Extends (mvex) box is not contained in the Movie (moov) box to indicate that Movie Fragments are to be expected.
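
The structural parts of these conditions can be sketched as a small box walker. This is an illustrative sketch only, not a user agent's actual parser: the names make_box, iter_boxes and looks_like_initialization_segment are hypothetical, and the brand checks (conditions 1 and 2) and the sample-table check (condition 3) are intentionally left out. It checks only that a single ftyp is followed by a moov, ignoring other top-level boxes that precede the moov, and that the moov contains an mvex box.

```python
import struct


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit largesize follows the type field.
            if pos + 16 > end:
                raise ValueError("truncated box header")
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed or truncated box")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def looks_like_initialization_segment(data):
    """Structural check only: ftyp, then moov, with mvex inside the moov."""
    significant = []
    moov_range = None
    for box_type, payload_start, box_end in iter_boxes(data):
        if box_type in ("ftyp", "moov"):
            significant.append(box_type)
        if box_type == "moov":
            moov_range = (payload_start, box_end)
            break  # boxes after the moov are not part of this segment
    if significant != ["ftyp", "moov"]:
        return False
    children = [t for t, _, _ in iter_boxes(data, *moov_range)]
    return "mvex" in children


# Example: ftyp, an ignorable free box, then a moov that contains mvex.
ftyp = make_box("ftyp", b"isom" + b"\x00\x00\x00\x00" + b"isom")
moov = make_box("moov", make_box("mvhd") + make_box("mvex"))
print(looks_like_initialization_segment(ftyp + make_box("free") + moov))
```

The child boxes in the example have empty payloads, which is enough to exercise the box structure but would not be valid content for a real decoder.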

The user agent MUST support setting the offset from media composition time to movie presentation time by handling an Edit Box (edts) containing a single Edit List Box (elst) that contains a single edit with media rate one. This edit MAY have a duration of 0 (indicating that it spans all subsequent media) or MAY have a non-zero duration (indicating the total duration of the movie including fragments).
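
As a worked illustration of that offset, the sketch below maps a composition time to a presentation time for the single-edit, rate-one case. It assumes the edit's media_time (from the elst box) and the track's media timescale have already been parsed; the function name presentation_time and the example numbers are hypothetical.

```python
from fractions import Fraction


def presentation_time(composition_time, media_time, timescale):
    """Return the presentation time in seconds for a single rate-one edit.

    composition_time and media_time are in media timescale units;
    media_time is the composition time that the edit maps to time zero.
    """
    return Fraction(composition_time - media_time, timescale)


# Example: with a 90000 Hz timescale and an edit starting at media time
# 3000, the sample composed at 3000 is presented at 0 seconds.
print(presentation_time(3000, 3000, 90000))
print(presentation_time(93000, 3000, 90000))
```

Using Fraction keeps the result exact instead of introducing floating-point rounding, which is a design choice of this sketch rather than anything mandated here.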

The user agent MUST support codec configurations stored out-of-band in the sample entry, and for codecs which allow codec configurations stored inband in the samples themselves, the user agent SHOULD support codec configurations stored inband.

Note

For example, for codecs which include SPS and PPS parameter sets, for maximum content interoperability, user agents are strongly advised to support both inband (e.g., as defined for avc3/avc4) and out-of-band (e.g., as defined for avc1/avc2) storage of the SPS and PPS.

Valid top-level boxes such as pdin, free, and sidx are allowed to appear before the moov box. These boxes MUST be accepted and ignored by the user agent and are not considered part of the initialization segment in this specification.

The user agent MUST source attribute values for id, kind, label and language for AudioTrack, VideoTrack and TextTrack objects as described for MPEG-4 ISOBMFF in the in-band tracks spec [INBANDTRACKS].

4. Media Segments

An ISO BMFF media segment is defined in this specification as one optional Segment Type Box (styp) followed by a single Movie Fragment Box (moof) followed by one or more Media Data Boxes (mdat). If the Segment Type Box is not present, the segment MUST conform to the brands listed in the File Type Box (ftyp) in the initialization segment.

Valid top-level boxes defined in [ISOBMFF] other than ftyp, moov, styp, moof, and mdat are allowed to appear between the end of an initialization segment or media segment and the beginning of a new media segment. These boxes MUST be accepted and ignored by the user agent and are not considered part of the media segment in this specification.

The user agent MUST run the append error algorithm if any of the following conditions are met:

  1. A box or field in the Movie Fragment Box is encountered that violates the requirements mandated by the major_brand or one of the compatible_brands in the Segment Type Box in this media segment or the File Type Box in the initialization segment if a Segment Type Box is not present.
  2. This media segment contains a Segment Type Box that is not compatible with the File Type Box in the initialization segment.
  3. The Movie Fragment Box does not contain at least one Track Fragment Box (traf).
  4. The Movie Fragment Box does not use movie-fragment relative addressing.
  5. External data references are being used.
  6. At least one Track Fragment Box does not contain a Track Fragment Decode Time Box (tfdt).
  7. The Media Data Boxes do not contain all the samples referenced by the Track Fragment Run Boxes (trun) of the Movie Fragment Box.
  8. Inband parameter sets are not present in the appropriate samples and parameter sets are not present in the last initialization segment appended.
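
Some of these conditions are purely structural and can be sketched with a simple box walker. The code below is an illustrative sketch under stated assumptions, not a complete validator: the names box, walk and check_media_segment are hypothetical, only 32-bit box sizes are handled, and only the box ordering plus conditions 3 and 6 are checked. Brand compatibility, addressing mode, data references, sample coverage and parameter sets are not examined.

```python
import struct


def box(box_type, payload=b""):
    """Build a box with a 32-bit size field (helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


def walk(data, start=0, end=None):
    """Yield (type, payload_start, box_end); 32-bit box sizes only."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size < 8 or pos + size > end:
            raise ValueError("malformed, truncated, or unsupported box size")
        yield box_type.decode("latin-1"), pos + 8, pos + size
        pos += size


def check_media_segment(data):
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    top = list(walk(data))

    # Other top-level boxes are ignored; only these five affect the shape.
    kinds = [t for t, _, _ in top if t in ("ftyp", "moov", "styp", "moof", "mdat")]
    if kinds[:1] == ["styp"]:
        kinds = kinds[1:]  # the Segment Type Box is optional
    if len(kinds) < 2 or kinds[0] != "moof" or any(k != "mdat" for k in kinds[1:]):
        problems.append("expected [styp] moof mdat+")

    for box_type, start, end in top:
        if box_type != "moof":
            continue
        trafs = [(s, e) for t, s, e in walk(data, start, end) if t == "traf"]
        if not trafs:
            problems.append("moof has no traf")  # condition 3
        for s, e in trafs:
            if "tfdt" not in [t for t, _, _ in walk(data, s, e)]:
                problems.append("traf has no tfdt")  # condition 6
    return problems


# Example: styp, then a moof with one traf (tfhd + tfdt), then one mdat.
segment = (
    box("styp", b"msdh" + b"\x00\x00\x00\x00")
    + box("moof", box("mfhd") + box("traf", box("tfhd") + box("tfdt")))
    + box("mdat", b"\x00\x00")
)
print(check_media_segment(segment))
```

Returning a list of problems rather than raising on the first one is a choice made for readability of the example; a user agent would instead run the append error algorithm as soon as any condition is met.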

A Movie Fragment Box uses movie-fragment relative addressing when the first Track Fragment Run (trun) box in each Track Fragment Box has the data-offset-present flag set and either of the following conditions is met:

5. Random Access Points

A random access point as defined in this specification corresponds to a Stream Access Point of type 1 or 2 as defined in Annex I of [ISOBMFF].

6. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

7. Acknowledgments

The editors would like to thank Chris Poole, Cyril Concolato, David Singer, Jer Noble, Jerry Smith, Joe Steele, John Simmons, Kevin Streeter, Michael Thornburgh, and Steven Robertson for their contributions to this specification.

A. References

A.1 Normative references

[html]
HTML Standard. Anne van Kesteren; Domenic Denicola; Ian Hickson; Philip Jägenstedt; Simon Pieters. WHATWG. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INBANDTRACKS]
Sourcing In-band Media Resource Tracks from Media Containers into HTML. Silvia Pfeiffer; Bob Lund. W3C. 26 April 2015. Unofficial Draft. URL: https://dev.w3.org/html5/html-sourcing-inband-tracks/
[ISOBMFF]
Information technology — Coding of audio-visual objects — Part 12: ISO base media file format. ISO/IEC. Under development. URL: https://www.iso.org/standard/85596.html
[MEDIA-SOURCE]
Media Source Extensions™. Jean-Yves Avenard; Mark Watson. W3C. 4 July 2024. W3C Working Draft. URL: https://www.w3.org/TR/media-source-2/
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC6381]
The 'Codecs' and 'Profiles' Parameters for "Bucket" Media Types. R. Gellens; D. Singer; P. Frojdh. IETF. August 2011. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc6381
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174