Standardized Timed-text Format
March 21, 2002
Why a Standardized Timed-text Format?
On the Web, there is no standard method for displaying text that is
synchronized with other elements, such as video and audio. The three most
popular multimedia players (Apple's QuickTime Player, Microsoft's Windows
Media Player and RealNetworks' RealPlayer) support only their own
proprietary text formats (QText, SAMI and RealText, respectively). As a
result, multimedia authors must write synchronized text
files in multiple formats if they wish to support more than one player. A
standardized timed-text format would eliminate this duplication of work. It
would also simplify the creation and distribution of synchronized text for
use with a multitude of devices, both software and hardware, such as
multimedia players, caption encoders and decoders (EIA-608, EIA-708 and
Teletext, for example), character generators, LED displays and other text-display
devices.
Common uses for a standardized timed-text format include the following:
- closed captions and subtitles (on the Web, on television and in movie
theaters)
- karaoke
- credit rolls
- ticker-tape displays (or crawls)
- text overlay
- hyperlinks and other interactivity
Standardized Timed-text Format Requirements
I ARCHITECTURE
A timed-text format must or should...
- Be simple to author and easy to learn.
- Have a valid XML representation.
- Be streamable.
- Be cross-platform.
- Allow extensibility.
- Support streaming real-time captions. Users should be able to tune in
to the text presentation at any time after it has begun.
- Allow for parallel languages in different documents or within the same
document (e.g., via the <switch> element).
- Allow the language of the text to be identified using xml:lang.
- Support mixed-language text.
- Be usable in all character sets.
- Have a default Unicode font.
- Allow clean integration with sign-language captions.
- Allow hyperlinks via the HTML "a" tag, XHTML or other flexible
mechanism.
- Be searchable.
- Contain a timed-text version identifier in each timed-text file and live
stream.
- Use markup to clearly distinguish one speaker from another. This could
be accomplished by a) using simple placement commands (<center>,
<left>, <right>, etc.); or b) creating a persona for the text
spoken by each speaker, using a speaker="IDREF" attribute.
- Allow the creation of collated transcripts which contain, and
differentiate via markup, captions and audio descriptions.
- Allow motion through the use of the SMIL animate element or another
method.
- Use SVG, MathML, XHTML or another language for complex font displays
(such as math equations).
- Allow the user to navigate through discrete timed media via SMIL
interaction constructs.
- Allow for long-form presentation (e.g., it should support captions or
subtitles for full-length movies or other long presentations).
- Adopt SMIL 2.0 as a base language. Also consider using other W3C
recommendations as a base, including (but not limited to) XML 1.0, CSS2
and SVG 1.1.
- Be no less functional than EIA-708 and other appropriate international
standards.
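As an illustration of how several of the architecture requirements could fit together (an XML representation, a version identifier, xml:lang, mixed-language text and speaker="IDREF"), here is a minimal sketch in Python. The element and attribute names (timedtext, persona, caption, begin, end, version) are hypothetical and are not taken from any specification; only xml:lang and the speaker="IDREF" idea come from the requirements above.

```python
import xml.etree.ElementTree as ET

# xml:lang belongs to the reserved XML namespace; ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical timed-text document. All element and attribute names except
# xml:lang are assumptions made for this sketch.
SAMPLE = """<timedtext version="0.1" xml:lang="en">
  <persona id="anna" name="Anna"/>
  <persona id="luc" name="Luc"/>
  <caption begin="0.0" end="2.5" speaker="anna">Good morning.</caption>
  <caption begin="2.5" end="5.0" speaker="luc" xml:lang="fr">Bonjour.</caption>
</timedtext>"""


def read_captions(xml_text):
    """Return (version, captions) parsed from the hypothetical format above."""
    root = ET.fromstring(xml_text)
    default_lang = root.get(XML_LANG)
    # Resolve speaker IDREFs to persona names.
    names = {p.get("id"): p.get("name") for p in root.iter("persona")}
    captions = []
    for cap in root.iter("caption"):
        captions.append({
            "begin": float(cap.get("begin")),
            "end": float(cap.get("end")),
            "speaker": names[cap.get("speaker")],
            # A caption without its own xml:lang inherits the document language.
            "lang": cap.get(XML_LANG, default_lang),
            "text": cap.text,
        })
    return root.get("version"), captions


if __name__ == "__main__":
    version, captions = read_captions(SAMPLE)
    print("format version:", version)
    for c in captions:
        print(c["begin"], c["end"], c["speaker"], c["lang"], c["text"])
```

Keeping the speaker as a reference to a persona, rather than as positioning markup, lets a player decide how to present each speaker; this is only one of the two options the requirement lists.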
II DISPLAY
A timed-text format must or should...
- Provide a means of giving richness or style to text.
- Support the display of bi-directional characters.
- Allow ruby markup.
- Allow text in different languages to be appropriately styled.
- Permit transparent overlay.
- Permit text highlighting.
- Allow for different display options (pop-on, roll-up, paint-on, crawl,
etc.).
- Support unique symbols, such as the musical note or the generic
closed-caption symbol ("CC" in a box).
- Permit user override of display.
- Permit unlimited positioning of text.
- Be able to display multiple captions simultaneously (for example, when
more than one person is speaking at once).
- Allow other ways to display text; for example, via text balloons.
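The difference between two of the display options named above, pop-on and roll-up, can be sketched in a few lines of Python. This is an informal illustration, not a definition of either mode: the function names and the rows parameter are assumptions made for the example, and paint-on and crawl are not modeled.

```python
from collections import deque


def pop_on(lines):
    """Pop-on: each caption appears all at once and replaces the previous one."""
    for line in lines:
        yield [line]


def roll_up(lines, rows=3):
    """Roll-up: new lines enter at the bottom and older lines scroll off the top.

    `rows` is the number of visible caption rows (an assumed parameter).
    """
    window = deque(maxlen=rows)  # the oldest line is dropped automatically
    for line in lines:
        window.append(line)
        yield list(window)


if __name__ == "__main__":
    text = ["first line", "second line", "third line"]
    for frame in roll_up(text, rows=2):
        print(frame)
```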
III TIMING
A timed-text format must or should...
- Allow text to appear and disappear over time.
- Permit the display of no text; that is, allow text to be erased when
it is not needed.
- Keep text and timing information together.
- Define text and timing markup in two separate modules in the
specification.
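One way to picture the timing requirements is a list of cues that each carry their own begin and end times, so that text and timing stay together. The sketch below, in Python, is an assumption-laden illustration: the Cue class, its fields and the half-open interval convention are choices made for this example and do not come from any format.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    begin: float  # seconds from the start of the presentation
    end: float    # seconds; the cue is no longer shown at this instant
    text: str


def active_text(cues, t):
    """Return the text of every cue visible at time t, in document order."""
    return [c.text for c in cues if c.begin <= t < c.end]


CUES = [
    Cue(0.0, 2.0, "Hello."),
    Cue(1.0, 3.0, "Hi there."),      # overlaps the first cue
    Cue(5.0, 6.0, "Anyone home?"),   # nothing is shown between 3.0 and 5.0
]

if __name__ == "__main__":
    for t in (0.5, 1.5, 4.0, 5.5):
        print(t, active_text(CUES, t))
```

Because the lookup depends only on the current time and not on what was shown earlier, a viewer who joins late gets the same result as one who watched from the start. An empty result corresponds to the "display of no text" case.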
IV THINGS TO LEAVE OUT OF A TIMED-TEXT FORMAT
- a "font" element. (Use SVG or other technologies instead.)
Please send corrections and additions to the timed-text public list at public-tt@w3.org.