This is a working draft of a Document Object Model (DOM) specification for synchronized multimedia functionality. It is part of work in the Synchronized Multimedia Working Group (SYMM) towards a next version of the SMIL language and SMIL modules. Related documents describe the specific application of this SMIL DOM for SMIL documents and for HTML and XML documents that integrate SMIL functionality. The SMIL DOM builds upon the Core DOM functionality, adding support for timing and synchronization, media integration and other extensions to support synchronized multimedia documents.
The first W3C Working Group on Synchronized Multimedia (SYMM) developed SMIL - Synchronized Multimedia Integration Language. This XML-based language is used to express synchronization relationships among media elements. SMIL 1.0 documents describe multimedia presentations that can be played in SMIL-conformant viewers.
SMIL 1.0 did not define a Document Object Model. Because SMIL is XML-based, the basic functionality defined by the Core DOM is available. However, just as HTML and CSS have defined DOM interfaces to make it easier to manipulate these document types, there is a need to define a specific DOM interface for SMIL functionality. The current SYMM charter includes a deliverable for a SMIL-specific DOM to address this need, and this document specifies the SMIL DOM interfaces.
Broadly defined, the SMIL DOM is an Application Programming Interface (API) for SMIL documents and XML/HTML documents that integrate SMIL functionality. It defines the logical structure of documents and the way a document is accessed and manipulated. This is described more completely in "What is the Document Object Model".
The SMIL DOM will be based upon the DOM Level 1 Core functionality. This describes a set of objects and interfaces for accessing and manipulating document objects. The SMIL DOM will also include the additional event interfaces described in the DOM Level 2 Events specification. The SMIL DOM extends these interfaces to describe elements, attributes, methods and events specific to SMIL functionality. Note that the SMIL DOM does not include support for DOM Level 2 Namespaces, Stylesheets, CSS, Filters and Iterators, and Model Range specifications.
The SYMM Working Group is also working towards a modularization of SMIL functionality, to better support integration with HTML and XML applications. Accordingly, the SMIL DOM is defined in terms of the SMIL modules.
The design and specification of the SMIL DOM must meet the following set of requirements.
General requirements:
SMIL-specific requirements:
It is not yet clear what requirements the modularization of SMIL functionality will place on the SMIL DOM. While the HTML Working Group is also working on modularization of XHTML, a modularized HTML DOM is yet to be defined. In addition, there is no general mechanism yet defined for combining DOM modules for a particular profile.
The SMIL DOM has as its foundation the Core DOM. The SMIL DOM includes the support defined in the DOM Level 1 Core API, and the DOM Level 2 Events API.
The DOM Level 1 Core API describes the general functionality needed to manipulate hierarchical document structures, elements and attributes. The SMIL DOM describes functionality that is associated with or depends upon SMIL elements and attributes. Where practical, we would like to simply inherit functionality that is already defined in the DOM Level 1 Core. Nevertheless, we want to present an API that is easy to use, and familiar to script authors who work with the HTML and CSS DOM definitions.
Following the pattern of the HTML DOM, the SMIL DOM follows a naming convention for properties, methods, events, collections and data types. All names are defined as one or more English words concatenated together to form a single string. The property or method name starts with the initial keyword in lowercase, and each subsequent word starts with a capital letter. For example, a method that converts a time on an element local timeline to global document time might be called "localToGlobalTime".
In the ECMAScript binding, properties are exposed as properties of a given object. In Java, properties are exposed with get and set methods.
Most of the properties are directly associated with attributes defined in the SMIL syntax. By the same token, most (or all?) of the attributes defined in the SMIL syntax are reflected as properties in the SMIL DOM. There are also additional properties in the DOM that present aspects of SMIL semantics (such as the current position on a timeline).
The SMIL DOM methods support functionality that is directly associated with SMIL functionality (such as control of an element timeline).
Note that the naming follows the DOM standard for XML, HTML and CSS DOM. This matches the HTML attribute naming scheme, but is in conflict with the SMIL 1.0 (and CSS) attribute naming conventions (all lowercase, with dashes between words). Given that the DOM Level 2 CSS API follows the primary DOM naming conventions, I think we should as well. Although this presents a naming conflict with the SMIL attributes (unless we reconsider attribute naming in the next version of SMIL), it presents a consistent DOM API.
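To make the relationship between the two naming schemes concrete, the following is a minimal sketch of a mapping between dashed SMIL 1.0 attribute names and DOM-style property names. The helper names attributeToPropertyName and propertyToAttributeName are hypothetical and are not part of the SMIL DOM; the sketch assumes that every property name is derived mechanically from its attribute name, which this draft has not decided.

```typescript
// Hypothetical helper (not part of the SMIL DOM): converts a dashed
// SMIL 1.0 attribute name into a DOM-style property name, i.e. first
// word in lowercase and each subsequent word capitalized.
function attributeToPropertyName(attributeName: string): string {
  const words = attributeName
    .toLowerCase()
    .split("-")
    .filter((word) => word.length > 0);
  return words
    .map((word, index) =>
      index === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1)
    )
    .join("");
}

// Hypothetical inverse: inserts a dash before each capital letter and
// lowercases it, recovering the dashed attribute form.
function propertyToAttributeName(propertyName: string): string {
  return propertyName.replace(/[A-Z]/g, (ch) => "-" + ch.toLowerCase());
}
```

Under this assumption the SMIL 1.0 attribute clip-begin corresponds to the property clipBegin, and the property zIndex corresponds to the attribute z-index.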
In some instances, the SMIL DOM defines constraints on the Level 1 Core interfaces. These are introduced to simplify the SMIL associated runtime engines. The constraints include:
These constraints are defined in detail below.
This section will need to be reworked once we have a better handle on the approach we take (w.r.t. modality, etc.) and the details of the interfaces.
We probably also want to include notes on the recent discussion of a presentation or runtime object model as distinct from the DOM.
One of the goals of DOM Level 2 Event Model is the design of a generic event system which allows registration of event handlers, describes event flow through a tree structure, and provides basic contextual information for each event. The SMIL event model includes the definition of a standard set of events for synchronization control and presentation change notifications, a means of defining new events dynamically, and the defined contextual information for these events.
The DOM Level 2 Events specification currently defines a base Event interface and three broad event classifications:
In HTML documents, elements generally behave in a passive (or sometimes reactive) manner, with most events being user-driven (mouse and keyboard events). In SMIL, all timed elements behave in a more active manner, with many events being content-driven. Events are generated for key points or state on the element timeline (at the beginning, at the end and when the element repeats). Media elements generate additional events associated with the synchronization management of the media itself.
The SMIL DOM makes use of the general UI and mutation events, and also defines new event types, including:
Some runtime platforms will also define new UI events, for instance those associated with a control unit for web-enhanced television (such as channel change and simple focus navigation events). In addition, media players within a runtime may also define specific events related to the media player (e.g. low memory).
The SMIL events are grouped into four classifications:
In addition to defining the basic event types, the DOM Level 2 Events specification describes event flow and mechanisms to manipulate the event flow, including:
The SMIL DOM defines the behavior of event capture, bubbling and cancellation in the context of SMIL and SMIL-integrated documents.
In the HTML DOM, events originate from within the DOM implementation, in response to user interaction (e.g. mouse actions), to document changes or to some runtime state (e.g. document parsing). The DOM provides methods to register interest in an event, and to control event capture and bubbling. In particular, events can be handled locally at the target node or centrally at a particular node. This support is included in the SMIL DOM. Thus, for example, synchronization or media events can be handled locally on an element, or re-routed (via the bubbling mechanisms) to a parent element or even the document root.
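The difference between local and central handling can be illustrated with a minimal sketch of bubbling over a toy tree. This is not the DOM Level 2 Events API: the classes SketchNode and SketchEvent and the event name "end" are assumptions made for illustration, capture is omitted, and the sketch bubbles along the content containment tree only.

```typescript
type SketchListener = (event: SketchEvent) => void;

// Hypothetical event object carrying the basic contextual information.
class SketchEvent {
  currentTarget: SketchNode | null = null;
  propagationStopped = false;
  constructor(readonly type: string, readonly target: SketchNode) {}
  // Prevents the event from reaching any further ancestors.
  stopPropagation(): void {
    this.propagationStopped = true;
  }
}

// Hypothetical tree node with listener registration and bubbling dispatch.
class SketchNode {
  private listeners: { [type: string]: SketchListener[] } = {};
  constructor(readonly label: string, readonly parent: SketchNode | null = null) {}

  addEventListener(type: string, listener: SketchListener): void {
    (this.listeners[type] = this.listeners[type] || []).push(listener);
  }

  // Runs the listeners on the target first, then on each ancestor in turn,
  // until the root is passed or a listener stops propagation.
  dispatchEvent(type: string): SketchEvent {
    const event = new SketchEvent(type, this);
    let node: SketchNode | null = this;
    while (node !== null && !event.propagationStopped) {
      event.currentTarget = node;
      (node.listeners[type] || []).forEach((listener) => listener(event));
      node = node.parent;
    }
    return event;
  }
}

// Central handling: one listener on the root sees events from any descendant.
const sketchBody = new SketchNode("body");
const sketchPar = new SketchNode("par", sketchBody);
const sketchVideo = new SketchNode("video", sketchPar);
const endedLabels: string[] = [];
sketchBody.addEventListener("end", (event) => {
  endedLabels.push(event.target.label);
});
sketchVideo.dispatchEvent("end");
```

After the dispatch, the root listener has recorded the label of the video node even though no listener was registered on the video itself.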
Note: It is currently not resolved precisely how event flow (dispatch, bubbling, etc.) will be defined for SMIL timing events. Especially when the timing containment graph is orthogonal to the content structure (e.g. in XML/SMIL integrated documents), it may make more sense to define timing event flow relative to the timing containment graph, rather than the content containment graph. This may also cause problems, as different event types will behave in very different ways within the same document.
Note: It is currently not resolved precisely how certain user interface events (e.g. onmouseover, onmouseout) will be defined and will behave for SMIL documents. It may make more sense to define these events relative to the regions and layout model, rather than the timing graph.
We have found that the DOM has utility in a number of scenarios, and that these scenarios have differing requirements and constraints. In particular, we find that editing application scenarios require specific support that the browser or runtime environment typically does not require. We have identified the following requirements that are directly associated with support for editing application scenarios as distinct from runtime or playback scenarios:
Due to the time-varying behavior of SMIL and SMIL-integrated document types, we need to be able to impose different constraints upon the model depending upon whether the environment is editing or browsing/playing back. As such, we need to introduce the notion of modality to the DOM (and perhaps more generally to XML documents). We need a means of defining modes, of associating a mode with a document, and of querying the current document mode.
We are still considering the details, but it has been proposed to specify an active mode that is most commonly associated with browsers, and a non-active or editing mode that would be associated with an editing tool when the author is manipulating the document structure.
Associated with the requirement for modality is a need to represent a lock or read-only qualification on various elements and attributes, dependent upon the current document mode.
For an example that illustrates this need within the SMIL DOM: To simplify runtime engines, we want to disallow certain changes to the timing structure in an active document mode (e.g. to preclude certain structural changes or to make some properties read-only). However when editing the document, we do not want to impose these restrictions. It is a natural requirement of editing that the document structure and properties be mutable. We would like to represent this explicitly in the DOM specification.
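A minimal sketch of how such a mode-dependent lock might behave is shown below. The names ModalDocument, ModalTimedElement, DocumentMode and assertMutable are hypothetical, as are the two mode names; the sketch assumes that a write to a locked property raises an error, which this draft has not yet specified.

```typescript
// Hypothetical mode names; the draft only proposes an active mode and a
// non-active (editing) mode.
type DocumentMode = "active" | "editing";

// Hypothetical timed element whose begin property is locked in active mode.
class ModalTimedElement {
  private beginValue: number;

  constructor(private readonly owner: ModalDocument, begin: number) {
    this.beginValue = begin;
  }

  get begin(): number {
    return this.beginValue;
  }

  // Assumption: a write in active mode fails instead of being ignored.
  set begin(value: number) {
    this.owner.assertMutable("begin");
    this.beginValue = value;
  }
}

// Hypothetical document that carries the current mode and can be queried.
class ModalDocument {
  mode: DocumentMode = "editing";

  assertMutable(propertyName: string): void {
    if (this.mode === "active") {
      throw new Error(propertyName + " is read-only while the document is active");
    }
  }

  createTimedElement(begin: number): ModalTimedElement {
    return new ModalTimedElement(this, begin);
  }
}
```

In this sketch an editing tool would leave the document in "editing" mode and change begin freely, while a player would switch the mode to "active" before starting playback, after which the same assignment throws.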
There is currently some precedent for this in HTML browsers. E.g. within Microsoft Internet Explorer, some element structures (such as tables) cannot be manipulated while they are being parsed. Also, many script authors implicitly define a "loading" modality by associating script with the document.onLoad event. While this mechanism serves authors well, it nevertheless underscores the need for a generalized model for document modality.
A related requirement to modality support is the need for a simplified transaction model for the DOM. This would allow us to make a set of logically grouped manipulations to the DOM, deferring all mutation events and related notification until the atomic group is completed. We specifically do not foresee the need for a DBMS-style transaction model that includes rollback and advanced transaction functionality. We are prepared to specify a simplified model for the atomic changes. For example, if any error occurs at a step in an atomic change group, the atomicity can be broken at that point.
As an example of our related requirements, we will require support to optimize the propagation of changes to the time-graph modeled by the DOM. A typical operation when editing a timeline shortens one element of a timeline by trimming material from the beginning of the element. The associated changes to the DOM require two steps:
Typically, a timing engine will maintain a cache of the global begin and end times for the elements in the timeline. These caches are updated when a time that they depend on changes. In the above scenario, if the timeline represents a long sequence of elements, the first change will propagate to the whole chain of time-dependents and recalculate the cache times for all these elements. The second change will then propagate, recalculating the cache times again, and restoring them to the previous value. If the two operations could be grouped as an atomic change, deferring the change notice, the cache mechanism will see no effective change to the end time of the original element, and so no cache update will be required. This can have a significant impact on the performance of an application.
When manipulating the DOM for a timed multimedia presentation, the efficiency and robustness of the model will be greatly enhanced if there is a means of grouping related changes and the resulting event propagation into an atomic change.
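The caching argument can be sketched with a toy sequence timeline. Everything in this sketch is an assumption made for illustration: the class SeqTimelineSketch, its atomic method, the use of plain numbers for times, and the choice of the two edit steps (delay the begin of the first element, then shorten its duration by the same amount). The counter records how many cached end times are recomputed.

```typescript
// One child of a sequence. "begin" is an offset from the end of the
// previous child; times are plain numbers for illustration.
interface SeqItem {
  begin: number;
  dur: number;
}

class SeqTimelineSketch {
  recalculations = 0;            // cached end times recomputed so far
  private cachedEnds: number[] = [];
  private dirty: number[] = [];  // indices changed inside an atomic group
  private groupDepth = 0;

  constructor(private readonly items: SeqItem[]) {
    this.propagate(0);           // fill the cache once
    this.recalculations = 0;     // do not count the initial fill
  }

  setBegin(index: number, value: number): void {
    this.items[index].begin = value;
    this.changed(index);
  }

  setDur(index: number, value: number): void {
    this.items[index].dur = value;
    this.changed(index);
  }

  endOf(index: number): number {
    return this.cachedEnds[index];
  }

  // Defers change propagation until the whole group has run. There is no
  // rollback: if a step throws, the changes made so far are still propagated.
  atomic(changes: () => void): void {
    this.groupDepth++;
    try {
      changes();
    } finally {
      this.groupDepth--;
      if (this.groupDepth === 0) {
        const pending = this.dirty.sort((a, b) => a - b);
        this.dirty = [];
        pending.forEach((index) => this.propagate(index));
      }
    }
  }

  private changed(index: number): void {
    if (this.groupDepth > 0) {
      this.dirty.push(index);
    } else {
      this.propagate(index);
    }
  }

  // Recomputes cached end times from "from" onwards, stopping as soon as an
  // end time turns out to be unchanged (its dependents are then unaffected).
  private propagate(from: number): void {
    let previousEnd = from === 0 ? 0 : this.cachedEnds[from - 1];
    for (let i = from; i < this.items.length; i++) {
      const end = previousEnd + this.items[i].begin + this.items[i].dur;
      if (end === this.cachedEnds[i]) {
        break;
      }
      this.cachedEnds[i] = end;
      this.recalculations++;
      previousEnd = end;
    }
  }
}
```

With five children of duration 10, calling setBegin(0, 2) and then setDur(0, 8) as separate changes recomputes all five cached end times twice. Wrapping the same two calls in atomic recomputes none, because the end time of the first element is unchanged once both steps have been applied.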
The IDL interfaces will be moved to specific module documents once they are ready.
Cover document timing, document locking?, linking modality and any other document level issues. Are there issues with nested SMIL files?
Is it worth talking about different document scenarios, corresponding to differing profiles? E.g. Standalone SMIL, HTML integration, etc.
A separate document should describe the integrated DOM associated with SMIL documents, and documents for other document profiles (like HTML and SMIL integrations).
The SMILElement interface is the base for all SMIL element types. It follows the model of the HTMLElement in the HTML DOM, extending the base Element class to denote SMIL-specific elements.
Note that the SMILElement interface overlaps with the HTMLElement interface. In practice, an integrated document profile that includes HTML and SMIL modules will effectively implement both interfaces (see also the DOM documentation discussion of Inheritance vs Flattened Views of the API).
Base interface for all SMIL elements.
interface SMILElement : Element {
  attribute DOMString id;
  // etc. This needs attention
}
This module includes the SMIL, HEAD and BODY elements. These elements are all represented by the core SMIL element interface.
This module includes the META element.
interface SMILMetaElement : SMILElement {
  attribute DOMString content;
  attribute DOMString name;
  attribute DOMString skipContent;
  // Types may be wrong - review
}
This module includes the LAYOUT, ROOT_LAYOUT and REGION elements, and associated attributes.
Declares layout type for the document. See the LAYOUT element definition in SMIL 1.0.
interface SMILLayoutElement : SMILElement {
  attribute DOMString type;
  // Types may be wrong - review
}
Declares layout properties for the root element. See the ROOT-LAYOUT element definition in SMIL 1.0.
interface SMILRootLayoutElement : SMILElement {
  attribute DOMString backgroundColor;
  attribute long height;
  attribute DOMString skipContent;
  attribute DOMString title;
  attribute long width;
  // Types may be wrong - review
}
Controls the position, size and scaling of media object elements. See the REGION element definition in SMIL 1.0.
interface SMILRegionElement : SMILElement {
  attribute DOMString backgroundColor;
  attribute DOMString fit;
  attribute long height;
  attribute DOMString skipContent;
  attribute DOMString title;
  attribute DOMString top;
  attribute long width;
  attribute long zIndex;
  // Types may be wrong - review
}
The layout module also includes the region attribute, used in SMIL layout to associate layout with content elements. This is represented as an individual interface, that is supported by content elements in SMIL documents (i.e. in profiles that use SMIL layout).
Declares the rendering surface for an element. See the region attribute definition in SMIL 1.0.
interface SMILRegionInterface {
  attribute SMILRegionElement region;
}
This module includes the PAR and SEQ elements, and associated attributes.
This will be fleshed out as we work on the timing module. For now, we will define a time leaf interface as a placeholder for media elements. This is just an indication of one possibility - this is subject to discussion and review.
Declares timing information for timed elements.
interface SMILTimeInterface {
  attribute InstantType begin;
  attribute InstantType end;
  attribute DurationType dur;
  attribute DOMString repeat;
  // etc. Types may be wrong - review

  // Presentation methods
  void beginElement();
  void endElement();
  void pauseElement();
  void resumeElement();
  void seekElement(in InstantType seekTo);
}
Attributes
Presentation Methods
Events
This is a placeholder - subject to change. This represents generic timelines.
interface SMILTimelineInterface : SMILTimeInterface {
  attribute NodeList timeChildren;

  // Presentation methods
  NodeList getActiveChildrenAt();
  NodeList getActiveChildrenAt(in InstantType instant);
}
Attributes
Presentation Methods
interface SMILParElement : SMILTimelineInterface, SMILElement {
  attribute DOMString endsync;
}
interface SMILSeqElement : SMILTimelineInterface, SMILElement { }
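As one possible reading of getActiveChildrenAt, the sketch below resolves the children of a par or seq container to begin and end times and then filters them by instant. It is heavily simplified and rests on stated assumptions: times are plain numbers in the container's local time, every child has a known duration and no begin offset, repeat and fill are ignored, and a child is treated as active on the half-open interval from its begin up to but excluding its end. The names ResolvedChild, resolvePar, resolveSeq and activeChildrenAt are hypothetical.

```typescript
// Hypothetical input and output shapes for the sketch.
interface TimedChild {
  id: string;
  dur: number;
}

interface ResolvedChild {
  id: string;
  begin: number;
  end: number;
}

// In a par container all children begin together at local time 0.
function resolvePar(children: TimedChild[]): ResolvedChild[] {
  return children.map((child) => ({ id: child.id, begin: 0, end: child.dur }));
}

// In a seq container each child begins when the previous child ends.
function resolveSeq(children: TimedChild[]): ResolvedChild[] {
  let cursor = 0;
  return children.map((child) => {
    const resolved = { id: child.id, begin: cursor, end: cursor + child.dur };
    cursor = resolved.end;
    return resolved;
  });
}

// Assumption: active means begin <= instant < end.
function activeChildrenAt(children: ResolvedChild[], instant: number): ResolvedChild[] {
  return children.filter((child) => child.begin <= instant && instant < child.end);
}
```

For a seq, at most one child is returned for any instant under these assumptions, whereas a par can return several.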
This module includes the media elements, and associated attributes. They are all currently represented by a single interface, as there are no specific attributes for individual media elements.
Declares media content.
interface SMILMediaInterface : SMILTimeInterface {
  attribute DOMString abstract;
  attribute DOMString alt;
  attribute DOMString author;
  attribute ClipTime clipBegin;
  attribute ClipTime clipEnd;
  attribute DOMString copyright;
  attribute DOMString fill;
  attribute DOMString longdesc;
  attribute DOMString src;
  attribute DOMString title;
  attribute DOMString type;
  // Types may be wrong - review
}
interface SMILRefElement : SMILMediaInterface, SMILElement {
}
// audio, video, ...
This module will include interfaces associated with transition markup. This is yet to be defined.
This module will include interfaces associated with animation behaviors and markup. This is yet to be defined.
This module includes interfaces for hyperlinking elements.
Declares a hyperlink anchor. See the A element definition in SMIL 1.0.
interface SMILAElement : SMILElement {
  attribute DOMString title;
  attribute DOMString href;
  attribute DOMString show;
  // needs attention from the linking folks
}
This module includes interfaces for content control markup.
Defines a block of content control. See the SWITCH element definition in SMIL 1.0.
interface SMILSwitchElement : SMILElement {
  attribute DOMString title;
  // and...?
}
Defines the test attributes interface. See the Test attributes definition in SMIL 1.0.
interface SMILTestInterface {
  attribute DOMString systemBitrate;
  attribute DOMString systemCaptions;
  attribute DOMString systemLanguage;
  attribute DOMString systemOverdubOrCaption;
  attribute DOMString systemRequired;
  attribute DOMString systemScreenSize;
  attribute DOMString systemScreenDepth;
  // and...?
}
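To show how a script might use these properties, the following sketch selects a child of a switch by evaluating three of the test attributes. It follows the SMIL 1.0 rule that the first acceptable child is chosen, but it is only an approximation: the remaining test attributes are ignored, and the types TestAttributes, SwitchChild and PlayerSettings and the functions passesTests and selectSwitchChild are hypothetical, not part of the SMIL DOM.

```typescript
// Subset of the test properties, as strings to match the IDL above.
interface TestAttributes {
  systemBitrate?: string;
  systemCaptions?: string;
  systemLanguage?: string;
}

// Hypothetical child of a switch element.
interface SwitchChild extends TestAttributes {
  id: string;
}

// Hypothetical description of the player and user preferences.
interface PlayerSettings {
  bitrate: number;      // available bandwidth in bits per second
  captions: boolean;    // whether the user wants captions
  languages: string[];  // preferred languages, e.g. ["en"]
}

// A child passes when every test attribute that is present evaluates to true.
function passesTests(attrs: TestAttributes, settings: PlayerSettings): boolean {
  if (attrs.systemBitrate !== undefined &&
      settings.bitrate < Number(attrs.systemBitrate)) {
    return false;
  }
  if (attrs.systemCaptions !== undefined &&
      (attrs.systemCaptions === "on") !== settings.captions) {
    return false;
  }
  if (attrs.systemLanguage !== undefined) {
    const offered = attrs.systemLanguage
      .split(",")
      .map((tag) => tag.trim().toLowerCase());
    // A preference matches an offered tag exactly or as a prefix followed by "-".
    const matched = settings.languages.some((preferred) => {
      const wanted = preferred.toLowerCase();
      return offered.some((tag) => tag === wanted || tag.indexOf(wanted + "-") === 0);
    });
    if (!matched) {
      return false;
    }
  }
  return true;
}

// Returns the first child that passes, or undefined when none does.
function selectSwitchChild(
  children: SwitchChild[],
  settings: PlayerSettings
): SwitchChild | undefined {
  for (let i = 0; i < children.length; i++) {
    if (passesTests(children[i], settings)) {
      return children[i];
    }
  }
  return undefined;
}
```

Because the first acceptable child wins, authors would list alternatives from most to least demanding and end with a child that carries no test attributes as a fallback.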