Copyright ©2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
These are the collected Last Call comments on Timed Text (TT) Authoring Format 1.0 Distribution Format Exchange Profile (DFXP), Last Call WD which were sent to the public Timed Text mailing list public-tt@w3.org (archives) and responses to those comments.
The DFXP LC review announcement was sent to the public-tt@w3.org and chairs@w3.org lists on Sept 15 2006.
The 15 comments have the following status (status on 15 Sept 2006):
Public Comments include:
Notification of these Last Call responses was sent to all of the commenters and to the Timed Text list: http://lists.w3.org/Archives/Public/public-tt/
Comment:
Some questions that came up while reading the draft:
* Why isn't the specification using xml:id?
* Why is the specification using its own attribute rather than CSS?
* Why does the specification refer to CSS2, which has been revised?
* Why do we need a totally new specification for this which reinvents a lot of elements and attributes? (And CSS.)
* Why does the specification have so many namespaces?
* I assume the namespaces will change before this specification goes to CR?
Although this is not really the "final publishing of this specification" the specification and WG do need sufficient feedback from implementors before they can move on to PR and beyond.
-- Anne van Kesteren
The discussion is archived at:
http://lists.w3.org/Archives/Public/public-tt/2005Mar/0032.html
to
/2005Mar/0041.html
The resolution is archived at: F2F Cupertino
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0008.html
Comment:
1. Meeting requirements
[[[ It is intended that a more feature-rich profile, known presently as the Authoring Format Exchange Profile (AFXP), be developed and published to address the full set of documented requirements. ]]]
Is there any concrete reason to believe this will take place? The group has had its charter extended already, just to produce this restricted draft. Is the group working on this more complete version already? Or is this just a hope?
2. Validity
In section 4 on Document Types is
[[[ The definition of validity employed here does not require the presence of any attribute, does not take into account the value of any attribute, and does not take into account any semantics associated with ID/IDREF/IDREFS typed attributes. ]]]
Why not? Attributes are defined all over the spec, so I would have thought they are important.
3. Default length units
Section 6.2.3 defining defaultLengthUnit says
[[[ If not specified, the default length unit must be considered to be pixels. ]]]
For accessibility reasons, it seems better in a pure text format to use an element based on font size, such as the em unit, as a default. I realise this implies a small change in the practice of authors, who are probably generally accustomed to working in pixels. But then they are probably not generally accustomed to working on accessibility.
4. frameRateMultiplier
It is hard to decipher this section. In particular, please explain where the numbers come from. It seems that this is designed to default to NTSC, which as I understand it is a relatively parochial approach since it is widely used only in North America. If the defaults for other systems such as PAL or SECAM are different, it would seem reasonable to include a systemType attribute which determines the default. Actually it seems strange that so much of this is in the spec. I understand that it enables synchronisation with traditional television, but it seems a lot of complexity.
5. Duration and begin/end
It would be helpful if the spec said what happens when the begin/end and the dur attributes don't agree.
6. Compulsory xml:lang attribute
Cool idea.
7. Style, and user styling
Quite a lot of work went into CSS. It is also considered pretty normal to style XML with CSS. For accessibility purposes it is helpful to have something like the CSS2 Cascade rules (which represented a change from CSS1 for enhanced accessibility). It turns out that text is about the only area where it is easy for user styles to make sense, so it seems a shame that there is no mechanism anticipated by the spec for using CSS and taking advantage of the cascading of rules that are important to the user, where appropriate.
8. Timed text and sign language
For people who are deaf and use sign language, it is often very much more appropriate to have sign language than text as the captioning format. It seems that there are already a number of video formats, and one might expect one of them or an SVG animation to be used in the context of a SMIL presentation. It might be worth noting within the spec that this use case effectively falls outside the scope, not because the spec ignores the need but because it is best satisfied by using this spec for its intended purpose (timed text) within the context of a group of specifications for timed multimedia. (In other words, this could be done with a brief note in the introduction or somewhere).
cheers
Charles McCathieNevile Fundacion Sidar
The discussion is archived at:
http://lists.w3.org/Archives/Public/public-tt/2005Mar/0077.html
http://lists.w3.org/Archives/Public/public-tt/2005Mar/0078.html
and
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0000.html
to
2005Apr/0010.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0009.html
Glenn, I have some problems with the current support for background colour in DFXP. The attached files indicate three common modes of using background colours. Given the current attributes tts:displayAlign and tts:showBackground I cannot see a way to trivially implement these effects in DFXP without creating a large number of regions. Can consideration be given to enhancing the tts:showBackground attribute to support the following attribute values?
content - The background attributes are only applied to the part of the region that has content. I.e. there is only a background behind characters of displayed text but not behind white-space characters (e.g. line 21 boxed captions with transparent space).
content-space - The background attributes are only applied to the part of the region that has content and also to white-space within the content. I.e. there is background behind characters of displayed text and behind white-space characters. If no white-space character exists at the end of each line, the background is extended to cover the area of a single space character at the end of each line. (UK boxed style)
active-line - The background attributes are applied to the lines of the region that have content. The background extends the full width of the **region** for any line that has content within the region.
active-stripe - The background attributes are applied to the lines of the region that have content. The background extends the full width of the display.
active-bounds - The background attributes are applied to the bounding box of all content within the region, providing the region has some content.
active-region - The background attributes are applied to the entire region providing the region has some content.
stripe - The background attributes are applied to the entire region regardless of if the region has some content. The background extends the full display width.
region - The background attributes are applied to the entire region regardless of if the region has some content. (E.g. this might be used for censorship).
regards
John Birch Senior Software Engineer, Screen Subtitling Systems Limited,
Huge thread of 70 emails ...
http://lists.w3.org/Archives/Public/public-tt/2005Mar/0005.html to
2005Mar/00075.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0010.html
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0033.html
Glenn, OK, I'll try and frame what I see as the style problem....
The DFXP style model is quite suitable for the carriage of styled text, BUT, in the contexts of accessibility and transcoding, the DFXP style mechanism IMO lacks an essential ingredient, that being the reason for (or context of) the applied style. As an example - an author may choose yellow text on a red background for a warning message. The carriage of that text as simply text characters and colour codes loses one piece of information - the fact that it was intended as a warning. This missing information becomes important when interchanging content between formats that have different support for style (for example between a colour and monochromatic presentation), or in transcoding / translating content between cultural groups. Another example - Green is 'lucky' in Ireland, but Red is 'lucky' in China. Go n'éirí an t-ádh leat could translate to ??? (Note: Internet machine translation!).
You have stated that it is possible (likely) that this style tagging (context) will be a feature of AFXP and that DFXP is a format intended as "a solution that addressed the more pressing and less complex need of interchange among a small number of legacy distribution formats, specifically SAMI, QuickText, RealText, 3GPP TT, CEA-608/708, and WST." It should be noted that CEA-608/708, and WST (and in fact TV subtitling formats in general) are typically not stored in these wire formats by broadcasters, rather these wire distribution formats are created in real-time by insertion equipment working from proprietary file formats. A single common file format already exists as a ratified interchange standard, EBU 3264.
DFXP could replace the use of EBU 3264 - it offers a few advantages, a) it is Unicode, b) it is XML and c) it has a more comprehensive language tagging mechanism. However, DFXP does not offer any significant new features over EBU 3264, and indeed there are features in EBU 3264 that are not present in DFXP (e.g. cumulative mode and boxing). A combination of extension elements and attributes and constrained document structuring (via a sub-profile) can probably be used with DFXP to fully represent EBU 3264 document contents - and other general TV broadcast related subtitling issues. Indeed, it is anticipated that the use of DFXP as an interchange mechanism for TV broadcast subtitling will require the development of guidelines for the interpretation of DFXP documents by transcoders. In addition it will probably require the development of a profile to add elements and attributes to DFXP to carry information and features currently supported by existing formats, (e.g. conditional content, cumulative modes, background styles, embedded glyphs, subtitles as images (DVD, DVB, Imitext)).
The pressing need is not IMO for another interchange format per se, rather it is for a format that preserves more of the authorial intent (inc. understanding / meaning) such that implementing transcoding, translation and accessibility are made easier tasks than they are currently. My main concerns are that using DFXP will encourage the continuation of the existing practice of 'cooked text content' - that is text that has lost contextual meaning - and that AFXP will be too complex and too late for most implementations.
Is there a middle path for DFXP that would encourage a more context sensitive (and accessible) role for text style? DFXP already includes a referenced style mechanism - could that mechanism be strengthened to provide greater support for contextual styling of text?
best regards
John Birch.
An author may use a combination of ttm:role, ttm:agent, as well as user-defined metadata attributes or elements, e.g., placing them in a child of the content, in order to express "the reason for (or context of) the applied style."
For an example, see
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0021.html
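The idea of carrying the reason for a style in metadata can be sketched roughly as follows. This is an illustration only, not the example from the archived message: the namespace URIs are those of the later TTML 1.0 Recommendation (the Last Call draft under review used different URIs), and the role value "x-warning", the sample text and the helper name roles_by_text are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Assumption: namespace URIs taken from the later TTML 1.0 Recommendation,
# not from the Last Call draft discussed in this document.
TT_NS = "http://www.w3.org/ns/ttml"
TTM_NS = "http://www.w3.org/ns/ttml#metadata"
TTS_NS = "http://www.w3.org/ns/ttml#styling"

# Hypothetical document: the first paragraph carries both the colours and
# the reason for them (an extension role value invented for this sketch).
SAMPLE = f"""
<tt xmlns="{TT_NS}" xmlns:ttm="{TTM_NS}" xmlns:tts="{TTS_NS}" xml:lang="en">
  <body>
    <div>
      <p ttm:role="x-warning" tts:color="yellow" tts:backgroundColor="red">Stand clear of the doors.</p>
      <p tts:color="white">Next stop: Central Station.</p>
    </div>
  </body>
</tt>
"""


def roles_by_text(document: str) -> Dict[str, Optional[str]]:
    """Map each paragraph's text to its ttm:role value (None when absent)."""
    root = ET.fromstring(document)
    result: Dict[str, Optional[str]] = {}
    for p in root.iter(f"{{{TT_NS}}}p"):
        result[p.text] = p.get(f"{{{TTM_NS}}}role")
    return result


if __name__ == "__main__":
    for text, role in roles_by_text(SAMPLE).items():
        print(f"{role!r}: {text}")
```

A transcoder targeting a monochrome display could, under these assumptions, key its rendering on the role rather than on the colour pair.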
The discussion is archived at: http://lists.w3.org/Archives/Public/public-tt/2005Apr/0016.html to 0023.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0010.html
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0033.html
Comment:
On Sat, 02 Apr 2005 00:06:17 +1000, Al Gilman <Alfred.S.Gilman@ieee.org> wrote:
>> On the other hand accessibility issues are not addressed by FO. It is
>> not at all clear how a user should expect to provide styling rules to
>> meet their particular needs, as is trivial using CSS for text styling.
> On the one hand, XSL FO does address accessibility through the
> link-to-source provision.
> http://www.w3.org/TR/xsl/slice7.html#common-accessibility-properties
This is for XSL FO documents - not for a different collection that happens to use some of the properties out of XSL-FO. A link to the source where it is some other format isn't likely to be any more helpful than the DFXP document.
> If DFXP does not emulate this, it should be considered.
> On the other hand, CSS already arrogates to itself the ability to
> supersede presentation properties asserted inline in the source being
> styled.
Right. My contention is that DFXP should use CSS as the mechanism by which User Agents provide users with the ability to override presentation, where that is required for accessibility reasons in the case that DFXP is served directly.
cheers
Charles McCathieNevile, Fundacion Sidar
The discussion is archived at:
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0013.html
to
2005Apr/0023.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0009.html
Comment:
Thank you for offering us the chance to refine our input for a little longer. Since you are meeting face-to-face, let me offer the following thoughts of an individual and preliminary nature.
Key thoughts:
- if the user can receive the content on a programmable device, we need to develop the [Web] distribution options and content constraints [with format support] to serve alternative (adaptive) presentation for individuals.
- there is going to be a lot of content that sees the light of intra-broadcast-industry pipelines in DFXP encoding. Deferring adaptive use to the availability of an AFXP spec is not necessarily an acceptable policy from the standpoint of disability access.
While the DFXP specification may not define a CPE player for the format per se, there is still reason to consider use cases for people with disabilities which require an alternate presentation of the material. Just because there is no anticipation that the DFXP would be used directly in mass-market set-top-box processes, it doesn't mean that there aren't authoring-time requirements on the content that should be supported in the intermediate form i.e. the DFXP.
Making the DFXP available to a transcoder of the user's choice is one way that the content encoded in the DFXP could be served to a person with a disability requiring alternate presentation. Or the content could be browsed offline using a mainstream XML reader and a schema-aware assistive technology.
[start use scenario]
Here is a scenario sketch to illustrate what I mean:
There is a meeting held by videoconference over a corporate extranet. To serve strategic partners in other countries and technology platforms, Internet technologies are used including subtitles generated in real time and distributed using DFXP as an intermediate form. One of the people whose job requires interacting with the content of the meeting is Deaf and blind. So a complete log of the meeting is kept for this participant's offline review.
supposition: The DFXP, as an XML format, is the dataset of choice on which to base this person's browse of what transpired in the session. Not just the formal statement of the decisions that were reached, but the dialog that led to the decisions. This would mean that the DFXP would be spooled and archived with the audio and video. Quite possibly there would be a SMIL wrapper created as a replay aid. But the deaf-blind user would be reviewing this through a refreshable Braille device and primarily reviewing the timed text as transcript.
Note that in interactive Braille as the delivery context, right-justification and color are not appropriate as speaker-change cues. So we need the speaker-change semantics available, separable from any particular visual-presentation effects. DFXP gives the author the capability to express this, but will the information be there in instances? So regardless of whether a collated transcript is created by a transcoder, or the several text streams are browsed as is with an adaptive user agent, the availability of speaker identification in the DFXP instance, the working base for the adapted use, or at a minimum speaker-change events if the identity of the speakers was not captured, would be important in affording this user comparable quality of content as those receiving the same information as real-time display integrated with the video and audio.
[end use scenario]
This is just to illustrate that there are people with disabilities for whom the introduction of something like the DFXP into the content pipelines of broadcast happenings reflects an opportunity that should not be wasted to raise the level of service and lower the cost of delivering that service. In particular, the use cases for adapted presentation do not necessarily presume that the DFXP would be pushed to all consumers in the broadcast bundle. The distribution protocol might be on an ask-for or 'pull' basis. And the user interaction might be in non-real-time after the fact and not at speed.
But the non-availability of the AFXP format as a "source in escrow" format for adapted uses means that the user needs the DFXP that gets produced to be as fit an adaptation basis as we can make it. This will be true while the AFXP is undefined, and will still be true for those situations where a copy of the DFXP can be obtained and a copy of a standard, XML source for that content cannot. The latter is likely to be common even after the AFXP has been specified by W3C.
Thank you (the whole group) for bringing this important technology this far. Best wishes for your meeting.
Al
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0034.html
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0035.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0006.html
The usual approach in W3C is to use consensus public formats as a pivot point so that the author can understand the binding of the content schema to the lingo of the domain sourcing the content, and the assistive technology or device independence specialist can understand how to map the content schema to the presentation possibilities of one or another delivery context. The content schema is consolidated through an inter-community negotiation; while the pool of people engaged in the negotiation need to cover the stakeholding domains of activity, nobody has to become an expert in both/all of them.
http://www.w3.org/2004/06/DI-MCA-WS/
On the other hand, with the Semantic Web the W3C gives us an alternate approach with less reliance on standard formats and more reliance on metadata. And the WAI seeks creative solutions using any applicable technology, not simply rote cant. However, a metadata approach would still require that the content sourcing activity a) capture and be prepared to share key information such as speaker identity, where readily achievable, and b) explain the terms in the way *they* are using [whatever format they are using as the source or editable form] in terms of well-established public-use references. The latter is a schema reconciliation or data thesaurus. [There is no policy-free solution, AFAIK.]
The avenue of amelioration that we haven't touched on specifically has to do with the CR checklist. We should be looking at what concrete example-use activities during CR would illuminate the issues we have been discussing so as to make it easier to come to consensus that the DFXP does about what it should in these directions.
Al
The discussion is archived at:
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0006.html
If we allow for extrinsic timing where such a time may not be resolved yet, then there are use cases where it is necessary to express both dur and end. For instance: "Display this text for 20 seconds unless the extrinsic-event-based end time resolves before that time, in which case end when it resolves". Is this mix of extrinsic and intrinsic timing actually supported within DFXP?
I thought that the discontinuous attribute applied to the entire document
(I see discontinuous as synonymous with media marker modality)? Further, I find it a strange balance of features in that DFXP allows such a
sophisticated timing model when it only supports a relatively simplistic styling model. Note: I do not have any real issue with the inclusion of the ttp parameters for timing model,
except that they increase the complexity of a **fully conformant** (non SMIL based) user agent
fairly dramatically. I am currently assuming that since DFXP deliberately avoids talking
about UAs, that an implementation must clarify what aspects of DFXP timing model it supports.
I feel this puts DFXP in an awkward position as a universal distribution format -
since originators of content may use features of the timing model that are not supported
by transcoders or UAs. So my position is that a simpler timing model would be more likely
to be universally adopted; further sophistication of the type in your example could IMO
be handled by any 'container' format e.g. SMIL, and is unnecessary within a 'media track'
format (which is how I view DFXP). best regards John Birch
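The mixed dur/end case quoted at the start of this comment ("display for 20 seconds unless the extrinsic end resolves first") can be sketched as below. This assumes a SMIL-style rule in which the earlier of begin + dur and end wins; this record does not state which rule DFXP adopted, and the function name active_end and the use of None for an unresolved time are hypothetical choices made for the sketch.

```python
from typing import Optional


def active_end(begin: float,
               dur: Optional[float],
               end: Optional[float]) -> Optional[float]:
    """Return the time (in seconds) at which an element stops being active.

    Assumption: the earlier of begin + dur and end applies, as in SMIL.
    None stands for a value that is unspecified or not yet resolved,
    for example an extrinsic event that has not fired.
    """
    candidates = []
    if dur is not None:
        candidates.append(begin + dur)
    if end is not None:
        candidates.append(end)
    # With neither constraint known, the end remains unresolved.
    return min(candidates) if candidates else None


if __name__ == "__main__":
    print(active_end(0.0, 20.0, None))   # extrinsic end not yet resolved
    print(active_end(0.0, 20.0, 12.5))   # extrinsic end resolved early
    print(active_end(0.0, 20.0, 45.0))   # extrinsic end resolved late
```

Under this rule a processor would re-evaluate the result whenever the extrinsic end time resolves.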
The discussion is archived at:
http://lists.w3.org/Archives/Public/public-tt/2005Apr/0008.html
http://lists.w3.org/Archives/Public/public-tt/2005Apr/00014.html
The resolution is archived at: Cupertino F2F
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0011.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Apr/0033.html
and
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0010.html
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0033.html
Comment:
The following three information elements defined in your metadata provisions[1] would appear to replicate capabilities available from the Dublin Core[2]. 12.1.2 ttm:title 12.1.3 ttm:desc 12.1.4 ttm:copyright Please consult with the Dublin Core Usage Board or some expert well versed in their opinions to see if the information intended to be conveyed by these three elements is adequately expressed by existing Dublin Core terms. If there are expressions in terms of Dublin Core terms that convey what you need to convey, please use them. Al /self -- no Group consensus implied [1] http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/#metadata [2] http://dublincore.org/
Finally, the TT WG has carefully reviewed the semantics and intended use cases of the above three metadata elements and compared these with similarly named items in the Dublin Core vocabulary. After this review, we concluded that there is sufficient difference of usage and intended semantics to retain these items in the TT AF metadata vocabulary. The DC metadata vocabulary may be used alongside this vocabulary as desired by an author.
The resolution is archived at:
The discussion is archived at:
http://lists.w3.org/Archives/Public/public-tt/2006Sep/0002.html
http://lists.w3.org/Archives/Public/public-tt/2006Sep/0003.html
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2006Sep/0003.html
Comment:
Hello TT WG, Thanks for permitting a delay. Here are the promised comments:
5.1) Why six different namespaces in one document format? It doesn't seem like you can use any of them on its own.
5.2) If it is important for an implementation to know the profile a document conforms to, shouldn't the profile name be passed out of band (as a MIME type parameter), instead of inside the document? A profile is usually something you author to, but then neither the client nor the server need to know that you do so; or it is something a client uses for content negotiation (e.g., HTTP Accept header or TYPE attribute on an HTML LINK element). It is not useful inside the document, except perhaps for the Unix "file" command...
5.2.b) In fact, the spec doesn't say what an implementation does with profiles. I assume it doesn't do anything with it (unless perhaps if the program is a validator).
5.3.1) Why lowerCamelCase even for names that are borrowed from other specifications? XML names can contain dashes.
6.2.1) EDITORIAL: s/express number/express the number/
6.2.1 until 6.2.13) All (not sure about one or two) of these attributes can only occur on one element, viz., <tt>. So why are they defined as global attributes (with a prefix)?
6.2.2) Where is the syntax of GPS time coordinates defined? I couldn't find the definition in the spec and there is no reference either.
6.2.2) Are GPS time coordinates really needed? Their semantics are the same as UTC, aren't they? What is the use case?
6.2.2) It seems that there may be only one clockMode attribute per document (if I interpret the note at the end of 6.2.13 correctly), but unlike for the other attributes, there is no paragraph in this section that says that clockMode may only occur on the <tt> element.
6.2.3) Is the defaultLengthUnit attribute needed? In CSS, we found it useful to have unitless numbers mean something specific, different from a length, e.g., as a multiplier. That possibility is removed when there is a default unit. Also, in a typical document there probably aren't more than a dozen or so length values, so declaring a default doesn't actually make the document shorter.
6.2.3) The default is "pixels," but are they device pixels or px units, as in XSL/CSS?
6.2.4) Same question: is defaultTimeMetric necessary?
6.2.5) EDITORIAL: s/of document instance/of a document instance/
6.2.6) Is NTSC the only case where a frameRateMultiplier is necessary? If so, then maybe a single keyword ("NTSC" on the frameRate attribute) is enough, and a general multiplier is overkill.
6.2.6) EDITORIAL: s/MHz/Hz/ for the first occurrence in the note.
7.1.1) Why must xml:lang be specified? Isn't omitting it the same as defining it to be the empty string?
7.1.1) Is xml:space necessary? You'll have to have style attributes for space handling anyway, so why complicate matters by doing a half job in XML?
7.1.3) Attributes begin, dur and end are on <tt> and on <body>. Are they needed on both?
7.1.7) Is the <br> needed? You can also use two <p> elements if you need two lines.
7.1.7) What happens if you put two <br> elements in a row, do you get an empty line or not?
7.1.7) Why does the <br> element have an xml:space attribute? Empty elements don't contain spaces...
7.2.3) Rule three seems to imply that the mark-up <p>one<span> two </span>three</p> is displayed as "onetwothree" without any spaces. Maybe you meant to omit leading and trailing spaces from the <p> element only?
8.1.1) What happens if a DFXP document has a style PI? I assume a DFXP application will ignore it (just like a generic XML viewer will ignore the <styling> elements).
8.2.1) The bullet list of elements that accept style attributes should be non-normative (i.e., a note), because that information is already known from earlier sections.
8.2.10) The note says that a horizontal font-size is useful in systems that have two fonts: normal and double-width. But do you expect a horizontal font-size to work on any other system? Or with any other value than "1c" or "2c"?
8.2.16) The note says that a <p> is displayed on one line, unless a <br> is used. But doesn't that also depend on wrapOption?
8.2.16) overflow in CSS/XSL has a value "scroll" but here it is renamed to "dynamic." Why? An automatic scroll, such as the marquee effect of "dynamicFlow," is a valid "scrolling mechanism" in XSL/CSS terms.
8.2.17) "padding" allows one, two or four values. Why not three, as in XSL/CSS?
8.2.18) "showBackground" appears to be similar to 'empty-cells' in CSS. Is there no way to merge them?
8.2.19) "textAlign" doesn't allow values "left" and "right" as in XSL/CSS, although it is much easier for an author to write "left" than "start" (or "end") when he means "left." Also, when converting from/to other formats, it is easier if the value for textAlign in DFXP is a direct translation of the corresponding value in the other format, rather than a function of that other value and the "direction" property.
8.2.20) CSS3 proposes a 'font-effect: outline' property to create outline fonts, but it doesn't give control over the thickness of the outline, let alone the amount of blur. Isn't 'font-effect: outline' enough?
8.2.22) The example is supposed to show that "visibility" can hide text, but there is no text to hide... The first text only appears after 1 second.
8.3.6) The generic font family names suggest specific kinds of fonts, but the spec effectively says to expect nothing. In that case, why aren't they called "font1" to "font5"? Some help for implementers seems useful. If an implementation has different fonts available, I think users would like it if the fonts are mapped somewhat intelligently.
8.3.6) The generic font families are different from those in XSL/CSS. Maybe DFXP doesn't need "fantasy" and "cursive," but it could have kept "sans-serif" and "serif" without renaming them. Also, is the difference "monospace-sans-serif" vs "monospace-serif" really needed? Just one monospace font has been enough for all my uses (which weren't subtitles, I admit).
8.3.11) The units px, em and c are defined syntactically, but what do they mean? I assume px is as in XSL/CSS and em is the font-size, because 8.2.10 mentions an "EM square" in relation to font-size. "c" is probably the cell as defined by 6.2.1 cellResolution.
8.3.11) Is the em unit the vertical font size or the horizontal one? Or does that depend on whether the length is used to measure something horizontal or vertical?
10.1.2) begin, end and dur attributes: what happens if they conflict?
Bert
5.2 [1] - TTWG Response:
A "profile" or "version" could be passed out-of-band, but since DFXP does not
define a transport mechanism, such definition is more appropriately defined
in another specification in order to suit the needs of that specification.
5.2 [2] - TTWG Response:
DFXP explicitly does not specify a schema binding mechanism. Conformance clause 3.2 sub-item 1 places this responsibility on a TT AF Content Processor. Nevertheless, we recognize that some guidance should be provided to processor implementers; therefore, we will add informative language suggesting that a processor may make use of the ttp:profile parameter as a means for identifying the declared language subset (profile).
5.3.1 - TTWG Response:
The TT WG has adopted a consistent convention for all names, and will provide
an informative annex indicating heritage of tokens and names, indicating
whether they had camel case normalization applied.
6.2.1 [1] - TTWG Response:
Editorial: Will fix.
6.2.2 [3] - TTWG Response:
This was an oversight. An appropriate constraint will be added to be consistent
with other time-related attributes.
6.2.3) [2] - TTWG Response:
No longer applicable due to removal of ttp:defaultLengthUnit; however, the
larger question of what "px" means when specified in a length will be
addressed by adding language indicating that the XSL definition applies.
6.2.5) - TTWG Response:
Editorial. Will fix.
6.2.6) [1] - TTWG Response:
NTSC is not the only case; furthermore, there are a variety of operational
usages in studios and in broadcast where frameRateMultiplier may effectively
be a continuous function. Use of an enumeration would be overly restrictive
and require constant updating.
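Illustration (informative, not part of the archived response): the arithmetic behind a frame rate multiplier can be sketched as below. The function name and the (numerator, denominator) pair are assumptions made for this sketch, and the draft's actual attribute syntax may differ. The NTSC value of 30 x 1000/1001 is used only as a familiar example of a non-integer effective rate.

```python
from fractions import Fraction


def effective_frame_rate(frame_rate, multiplier):
    """Return the effective frame rate as an exact fraction.

    frame_rate: nominal whole number of frames per second (e.g. 30).
    multiplier: hypothetical (numerator, denominator) pair standing in
                for the frame rate multiplier parameter.
    """
    numerator, denominator = multiplier
    return Fraction(frame_rate) * Fraction(numerator, denominator)


# NTSC-style rate: nominal 30 fps scaled by 1000/1001.
ntsc = effective_frame_rate(30, (1000, 1001))
# PAL-style rate: nominal 25 fps with no scaling.
pal = effective_frame_rate(25, (1, 1))

print(ntsc, float(ntsc))
print(pal)
```

Because the multiplier is an arbitrary ratio rather than a named system, the same function covers any studio or broadcast rate without a fixed enumeration.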
6.2.6) [2] - TTWG Response:
Editorial. Will fix.
7.1.1) [1] - TTWG Response:
The goal is to strongly encourage authors and authoring systems to be
explicit about language. Specifying xml:lang="" is not the same as not
specifying xml:lang. The former is an explicit authorial expression of "no
default language"; the latter leaves authorial intention unexpressed. We wish
to enforce some intentional expression even if it is "no default
language".
7.1.1) [2] - TTWG Response:
We are not sure what is meant by "doing ... [the] job in XML". Our
understanding is that a compliant XML processor must always "pass all
characters in a document that are not markup through to the application". Our
understanding of xml:space is it permits the author to express whether the
application should use its "default white-space processing mode" or should
"preserve all white-space". Since we have not introduced the full range of
whitespace processing style properties into DFXP, such as XSL's
linefeed-treatment, white-space-collapse, white-space-treatment, and
suppress-at-line-break properties, we instead rely upon use of xml:space as
an alternative mechanism for specifying authorial intention regarding
whitespace preservation. Nevertheless, we believe it may be useful to
re-express the algorithm specifying the meaning of xml:space="default" to
instead normatively reference the semantics of the above mentioned XSL
properties.
7.1.3) - TTWG Response:
Distinct timing context is required on <tt> as opposed to <body>
in order to provide a timing container for <head> and thence to
<layout> and <region>, the latter of which can be animated.
7.1.7) [1] - TTWG Response:
In XSL, <fo:block/> does not produce a block area since it has empty
content. Since TT AF maps <p/> to <fo:block/> semantics, two
empty <p/> elements from the TT namespace would map to:
<fo:block/> <fo:block/>
and thus not produce any visible side effect.
In contrast,
<p> <br/> </p>
is defined to produce the same result as
<fo:block> <fo:character character="&#x000A;"
suppress-at-line-break="retain"/> </fo:block>
We will review if additional clarification is required to express these
intentions.
7.1.7) [2] - TTWG Response:
Given
<p> <br/> <br/> </p>
a compliant presentation processor produces the same results as:
<fo:block> <fo:character character="&#x000A;"
suppress-at-line-break="retain"/> <fo:character character="&#x000A;"
suppress-at-line-break="retain"/> </fo:block>
7.1.7) [3] - TTWG Response:
It was specified for symmetry's sake (in order to be uniform on all content
elements). We will add an informative note that indicates that it is
effectively ignored if specified.
7.2.3) - TTWG Response:
Based on a comment above, we believe it may be useful to re-express the
algorithm specifying the meaning of xml:space="default" to instead
normatively reference the semantics of the above mentioned XSL properties. We
will either re-express thus or correct the algorithm (which we copied from
SVG).
8.1.1) - TTWG Response:
No normative semantic has been defined for any processing instruction;
therefore, a compliant presentation processor may ignore them.
8.2.1) - TTWG Response:
We believe there is no inconsistency in presenting the same normative
requirements twice, given the different contexts of the specification,
provided that there is no inconsistency between the requirements. We believe
there is no inconsistency at this time.
8.2.10) - TTWG Response:
We believe that specifying both horizontal and vertical sizes may produce a
continuously varying anamorphic transformation on devices capable of
rasterizing fonts in a rectangular EM square. We do intend to mandate that a
given compliant presentation processor must support either continuous
anamorphic scaling of EM square or support some discrete set of
anamorphically-transformed font sizes.
We also recognize that this feature constitutes an extension not presently supported in XSL, and isn't adequately addressed by section 9.3.2 sub-items 6 and 7 (pertaining to populating XSL style properties). Therefore, we will add additional normative language to 8.2.10 that expresses the intended semantics.
8.2.16) [1] - TTWG Response:
Good catch. Will fix.
8.2.16) [2] - TTWG Response:
We weren't certain if we could define "scroll" to mean "apply the dynamic
flow semantics defined in DFXP"; but given that you suggest this we will
change to using "scroll" and define "scroll" semantics according to the
dynamic flow features.
8.2.17) - TTWG Response:
We did not find the use of three values to be a particularly useful feature,
and did not need it to support the limited subset of TT AF expressed by DFXP.
It is possible that AFXP will support the larger subset.
8.2.18) - TTWG Response:
Perhaps, but we feel comfortable basing the usage in DFXP on the current SMIL
attribute whose name is "showBackground" [1].
[1] http://www.w3.org/TR/2005/REC-SMIL2-20050107/layout.html#adef-showBackground
8.2.19) - TTWG Response:
We think that when an author writes "left" in a LRTB writing mode, they
actually mean "start". We want to encourage the author to express their
logical intention. We are not certain if there is a strong use-case for
specifying non-relative (absolute) text alignment.
8.2.20) - TTWG Response:
A number of TT WG members have a strong preference for providing the ability to specify blur radius. There has been a request for expressing separately the inner and outer color of the blur and possibly gradient parameters to apply at the transition boundary; we chose a simple compromise that was modeled closely on the text-shadow property of CSS2.
8.2.22) - TTWG Response:
As it turns out, (1) the intent of this example was not "to show that
'visibility' can hide text", and (2) there is a bug in the example, in that
tts:visibility="false" on the paragraph means the paragraph (and its
children) will never be visible. The example DFXP document should have
read:
<p region="r1" dur="4">
  <span tts:visibility="hidden"> <set begin="1" tts:visibility="visible"/> Curiouser </span>
  <span tts:visibility="hidden"> <set begin="2" tts:visibility="visible"/> and </span>
  <span tts:visibility="hidden"> <set begin="3" tts:visibility="visible"/> curiouser! </span>
</p>
We will fix.
8.3.6) [1] - TTWG Response:
Well it is either implementation dependent or not. We really want the former. We prefer to let the market decide whether an implementation does something sensible or not. To do something formal would require introducing PANOSE concepts or equivalent which doesn't seem particularly worthwhile.
8.3.6) [2] - TTWG Response:
The TT WG believes there are a number of examples of all combinations of {monotype,proportional} x {sans-serif,serif} in use in international subtitling applications that justify labeling all combinations.
8.3.11) [1] - TTWG Response:
We will add a cross-reference to XSL definitions and expand further on the
definition of "c". But see more below on "em".
8.3.11) [2] - TTWG Response:
We will elaborate that "em" in the context of a font that expresses an
anamorphic size has two interpretations depending on the context of usage,
i.e., depending on whether the length is being used to express a distance
along the block or inline progression dimensions.
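Illustration (informative, not part of the archived response): one way to read the two interpretations of "em" is sketched below. The function name, the (horizontal, vertical) pair for an anamorphic font size, and the assumption of a horizontal writing mode (inline progression is horizontal, block progression is vertical) are all choices made for this sketch, not definitions taken from the specification.

```python
def em_to_pixels(length_em, font_size_px, dimension):
    """Resolve a length in em against an anamorphic font size.

    length_em:    the length expressed in em units.
    font_size_px: hypothetical (horizontal, vertical) font size in pixels.
    dimension:    "inline" or "block"; a horizontal writing mode is
                  assumed, so inline maps to the horizontal size and
                  block maps to the vertical size.
    """
    horizontal_px, vertical_px = font_size_px
    if dimension == "inline":
        return length_em * horizontal_px
    if dimension == "block":
        return length_em * vertical_px
    raise ValueError("dimension must be 'inline' or 'block'")


# A font stretched vertically: 10px wide, 20px tall EM square.
font_size = (10, 20)
inline_length = em_to_pixels(2, font_size, "inline")
block_length = em_to_pixels(2, font_size, "block")
print(inline_length, block_length)
```

With a non-anamorphic font size both components are equal, so the two interpretations coincide.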
10.1.2) - TTWG Response:
At present, section 10.4 normatively references the semantics of SMIL 2 for
the purpose of interpreting these attributes, as well as dealing with
possible conflicts (over-constraint scenarios). We are considering fully
inlining the timing interval semantics into the spec, which is a greatly
reduced subset of SMIL 2 timing semantics; if we do this, then the usage and
constraints on these attributes will be fully articulated.
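Illustration (informative, not part of the archived response): for the simplest over-constrained case, SMIL 2 lets the earlier of begin+dur and end close the active interval. The sketch below covers only that case. The function name is hypothetical, times are plain seconds relative to the parent's begin, and repeats, parent container constraints, and an end earlier than begin are deliberately ignored.

```python
def active_interval(begin=0.0, dur=None, end=None):
    """Return (active_begin, active_end) for a timed element.

    A simplified reading of SMIL 2 style timing: when both dur and end
    are given, whichever closes the interval first wins. An active_end
    of None means the end is not determined by these attributes alone.
    """
    candidates = []
    if dur is not None:
        candidates.append(begin + dur)  # end implied by the duration
    if end is not None:
        candidates.append(end)          # explicit end
    active_end = min(candidates) if candidates else None
    return begin, active_end


# end is earlier than begin + dur, so end wins.
print(active_interval(begin=1, dur=4, end=3))
# begin + dur is earlier than end, so dur wins.
print(active_interval(begin=1, dur=2, end=10))
```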
----------
The resolution is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0001.html
The discussion is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0001.html
1st Response from CSS WG to TTWG response
http://lists.w3.org/Archives/Public/public-tt/2005Oct/0001.html
2nd Response from TT WG to 2nd SYMM response
http://lists.w3.org/Archives/Public/public-tt/2005Oct/0002.html
Final Response from CSS WG to TTWG response
http://lists.w3.org/Archives/Public/public-tt/2005Oct/0003.html
Comment:
Dear TTWG, On behalf of SYMM WG, I would like to say thank you for extending the review period of DFXP 1.0 LCWD. SYMM WG has eagerly reviewed DFXP 1.0 LCWD and prepared our comments. SYMM WG hopes these comments will be used to improve the Timed Text specification. Please accept our official comments on DFXP 1.0 LCWD from SYMM WG. ================================= SYMM WG Comments on DFXP 1.0 LCWD ================================= 0. Overall / General SYMM-0-1: The SYMM WG is concerned about large scale duplication of functionality of existing W3C specifications in the DFXP LC WD document: The DFXP specification describes functionality that is already defined by other specifications, such as XHTML, CSS, and SMIL. DFXP should re-use these specifications. It should do this without introducing any changes to the syntax or semantics in these specifications; whole units of related functionality should be adopted. To make re-use of an existing spec clearer to the reader and to avoid making any changes the DFXP specification should reference to the original specification instead of including an own description of such a feature. It may extend these existing languages whenever its own requirements exceed what is already available. This will be of benefit to content authors because they do not need to learn a new language. It will also be helpful for implementation of processors because they can re-use software components. 1. Introduction SYMM-1-1: The functional distinction between DFXP and AFXP seems unclear. It's hard to understand why there should be two different profiles. It just appears to be complicating the system model. The functional difference between DFXP and AFXP should be explained more formally than Figure 1. SYMM-1-2: The current draft does not mention any specific example to explain what kind of legacy formats can be transcoded and how much useful that is. The potential markets and application areas should be introduced more specifically. 
SYMM-1-3: SYMM WG believes DFXP should primarily serve the purpose to be rendered directly i.e. it should not be required to first transcode DFXP into a proprietary format for rendering. DFXP should therefore be specified as a distribution format. It should be designed to be delivered to and rendered by a wide range of desktop, embedded and mobile terminals. Such distribution format for TT should integrate well at least with SMIL and XHTML. Preferably, the DFXP specification should define in full the integration to SMIL and to XHTML to achieve full interoperability. 3. Conformance SYMM-3-1: The specification insufficiently defines rules for processing and rendering of DFXP content. 5. Vocabulary SYMM-5-1: It is not visible which vocabulary was newly invented by DFXP or introduced from existing standards such as XHTML, CSS, XSL, SMIL. The original references of all vocabularies should be arranged in a table for readability. 6. Parameters SYMM-6-1: The parameters for time metric seems to have improved very well and sufficient to associate with wide variety of media materials. But it would be helpful to understand them correctly if more specific examples for each feature were provided. 7. Content SYMM-7-1: It appears to be a bad choice to re-define HTML language elements div, span, p, br with different semantics as in XHTML. XHTML syntax and semantics should be adopted without making any changes. SYMM-7-2: Allowing the root tt element to have timing and styling attributes seems redundant. The right place to hold default values of a document would be the body element. 8. Styling SYMM-8-1: CSS and XSL:FO are the W3C standards for styling and layout. Also DFXP should use CSS or XSL:FO for styling. It should use both exact syntax and semantics of CSS/XSL:FO, and then define its own attributes where CSS/XSL specifications are insufficient. Chosen solution must allow a lightweight implementation on constraint embedded devices. 
In case that CSS is used for styling, it is not good enough to use CSS attribute names DFXP, e.g. tt:display, tt:fontFamily. CSS syntax and semantics should be used without changes. DFXP spec should list the CSS properties it supports and reference to CSS 2.1 spec for their definition. To get the CSS working normally and leverage its full power the DFXP spec may also adopt the following CSS 2.1 features by referencing to CSS 2.1 specs: * syntax and basic data types * selectors * assigning property values, Cascading, and Inheritance * media types (with possible restriction of the supported media types ) SYMM-8-2: (8.3.12) <namedColor> should reference some other specification. Stable references should be CSS2. 9. Layout SYMM-9-1: CSS and XSL are the W3C standards for styling and layout. Also DFXP should use either CSS, XSL or SMIL layout. DFXP may define its own attributes where CSS/XSL/SMIL specifications are insufficient. Chosen solution must allow a lightweight implementation on constraint embedded devices. SYMM-9-2: Allowing timing attributes to be placed in layout elements seems interesting, but it could complicate timing structure of a document. Its necessity should be explained reasonably. SYMM-9-3: Allowing style elements to be placed as a child of a region element seems redundant. Allowing style attributes to be placed in a region element would be sufficient. 10. Timing SYMM-10-1: DFXP should use a subset of the SMIL 2 Timing and Synchronization Module functionality. It should use exact syntax and semantics of SMIL. 12. Metadata SYMM-12-1: The metadata attributes should be introduced from or reference to industry standards or existing specifications. DFXP should not develop its own attribute set as a normative part of a Recommendation. SYMM-12-2: The places for metadata should be limited within a head element. 
SMIL already provides a good example: http://www.w3.org/TR/SMIL/metadata.html#smilMetadataNS-example Appendix B: Dynamic Flow Processing Model SYMM-B-1: Text and diagram should be provided. Appendix H: Acknowledgments SYMM-H-1: Listing former/inactive members seems inappropriate. (It looks like accusing specific individuals.) That paragraph should be removed. Best regards, Yoshihisa Gonno, Sony Corporation Co-chair W3C SYMM WG email: ygonno@sm.sony.co.jp
TTWG Response:
The TTWG believes that DFXP [1] does not duplicate, but, rather, reuses existing functionality of existing W3C specifications, and, in particular: XML, XHTML, CSS (through XSL), XSL, and SMIL.
In its reuse of vocabulary and semantics from the cited existing W3C specifications, certain changes were necessitated and warranted to satisfy overall requirements adopted for the Timed Text Authoring Format 1.0 (see [2]). Among these requirements is R105 Ownership of Core, which specifies that core functionality is to be specified by the TT WG:
<quote> The TT AF specification(s) shall be defined in such a manner that core functionality be specified solely by the TT WG or, in the event that the TT WG is terminated, its successors within the W3C.
Note: It is assumed that one or more appropriate namespace mechanisms will be used to segregate core functionality defined or adopted in the TT AF from peripheral functionality defined or adopted by clients of the TT AF. </quote>
In order to satisfy this requirement, the adopted vocabulary is placed in the TT Namespace (http://www.w3.org/2004/11/ttaf1) or a sub-namespace thereof.
When adopting vocabulary from existing W3C specifications into the TT Namespace, the TT WG has taken care to change the usage of that vocabulary only in order to satisfy other requirements established by [2].
In order to make this reuse of vocabulary more clear, and in order to offer explanation of the differences introduced by DFXP, the TT WG proposes to add an informative Annex to DFXP that specifies the derivation of vocabulary and explains the differences in usage. The TT WG believes such an Annex will meet the concerns of authors and users of DFXP content and permit them to fully reuse their knowledge of existing W3C specifications.
Regarding reuse of existing implementations, at least one member has indicated that they have successfully reused a subset of an existing implementation of XHTML, CSS, and SMIL to support DFXP and did so with only minor modifications.
[1] http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/
[2] http://www.w3.org/TR/tt-af-1-0-req/
The resolution is archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
1. Introduction
SYMM-1-1:
TTWG Response:
DFXP is a strict lexical and semantic subset of AFXP. Functionally, DFXP is
restricted to those features that can reasonably be processed by a streaming
parser as opposed to a DOM based parser. In addition, to simplify embedding in
other streaming application content formats, DFXP is wholly self-contained,
and does not require the use of any external resources (such as images, style
sheets, time sheets, etc.). It is expected that these constraints will not
apply to AFXP, which is expected to make use of XPath expressions to
associate styling and timing information with selected elements, and is
expected to support references to external image, font, style sheet, and
timing sheet resources.
The TTWG believes that this additional background information is not strictly necessary in the DFXP specification, but is more appropriately placed in the AFXP specification.
SYMM-1-2:
TTWG Response:
DFXP contains an informative reference to the TTAF 1.0 requirements document
[2], which refers to legacy formats and explains use cases.
SYMM-1-3:
TTWG Response:
DFXP is an implementation that satisfies the requirements adopted by the TTWG
and documented in TTAF 1.0 Use Cases and Requirements [2]. DFXP was
explicitly designed as an interchange format suitable for exchange amongst
existing timed text distribution systems. It was also explicitly designed
such that it could be directly rendered. The TTWG believes that this design
is realized in the current DFXP LC WD and that no technical change is needed
to facilitate such usage.
Regarding integration of DFXP with SMIL and/or XHTML, while such integration may be defined in the future, perhaps by the TTWG or perhaps by the SYMM or HTML WGs, the TTWG does not believe it is a requirement to define such integration at this time in the DFXP LC. DFXP as defined by [1] can be directly used by an appropriately enabled SMIL or XHTML user agent that supports the DFXP document type and its semantics. No additional integration specification is required to accomplish this usage.
The resolutions are archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
3. Conformance
SYMM-3-1:
TTWG Response:
Section 9.3 "Region Layout and Presentation" in combination with Section 3.2
"Processor Conformance" item (5) fully specifies the rules for rendering DFXP
content. Section 3.2 fully specifies all conformance requirements for
processing in general.
The resolution is archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
5. Vocabulary
SYMM-5-1:
TTWG Response:
Accepted. An informative table will be added that provides this information
for the reader.
The resolution is archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
6. Parameters
SYMM-6-1:
TTWG Response:
Accepted. Examples of use of timing parameters will be added.
The resolution is archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
7. Content
SYMM-7-1:
TTWG Response:
TT-AF 1.0 [2] requirement R105 mandates that all core vocabulary be specified
by the TTWG, which has been accomplished by using a TT-specific namespace. The derivation of content vocabulary from XHTML is based on the recommendation made by requirement R209. The TTWG believes it has made judicious reuse of XHTML vocabulary in a manner that is consistent with TT-AF requirements and general practice. In particular, the TTWG does not believe any interoperability problems will derive from this usage, and that greater interoperability will derive from familiarity.
SYMM-7-2:
TTWG Response:
The use of timing attributes (begin, dur, end) on the /tt element is
predicated upon the need that certain elements specified in /tt/head, in
particular, /tt/head/region elements, may have timing intervals associated
with them in order to construct animation timelines on the regions. Since
these elements do not appear as descendants of /tt/body, there was a need to
provide a timing context on a higher level element that includes both
/tt/body and /tt/head/region. Regarding the use of certain styling properties
as attributes on /tt, specifically tts:extent, this property used in this
context defines the extent of an outer containing region, known formally as
the "root container region", which is logically equivalent to the page-width
and page-height attributes expressed on the fo:simple-page-master flow object
as defined by XSL 1.0 [3].
[3] http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo_simple-page-master
The resolutions are archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
8. Styling
SYMM-8-1:
TTWG Response:
DFXP is based primarily on XSL expression of styling matter. The formatting
semantics of XSL are normatively adopted in Section 9.3.2 where the last
paragraph states: "then apply the formatting semantics prescribed by [XSL
1.0]"
DFXP employs only a subset of XSL functionality based on requirements stated in [2]. The TT WG takes exception to the assertion that DFXP must adopt the exact syntax and semantics of either XSL or CSS. Those syntactic and semantic features that are applicable to DFXP are adopted wherever possible, with divergences only when deemed necessary to meet stated requirements. The TT WG notes that there are many precedents in W3C technical specifications of adopting existing solutions when possible and then subsetting, supersetting, and modifying as the need arises. For example, one only has to consider the evolution of CSS/XSL and HTML/XHTML languages to see such design principles in practice.
SYMM-8-2:
TTWG Response:
The TT WG believes that direct specification of named color values is
technically consistent with CSS2 conventions, and believes there is no merit
in forcing authors to resolve external references for such a straightforward
and stable enumeration.
The resolutions are archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
9. Layout
SYMM-9-1:
TTWG Response:
DFXP is based on XSL layout semantics as described above, but is extended to
meet the requirements documented in [2]. The TT WG believes that it is
possible to demonstrate "lightweight" implementations of DFXP content and/or
presentation processing.
SYMM-9-2:
TTWG Response:
Agreed. Additional explanatory material and examples will be added.
SYMM-9-3:
TTWG Response:
Allowing separation of styles into distinct style child elements as opposed
to mandating their merge provides additional flexibility for authoring tools
and transformation processors that may wish to isolate individual styles and
refer to them individually via referential styling.
The resolutions are archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
10. Timing
SYMM-10-1:
TTWG Response:
DFXP is based on a subset of SMIL2 timing as documented in Section 10.4: "The
semantics of time containment, durations, and intervals defined by [SMIL2]
apply to the interpretation of like-named timed elements and timing
vocabulary defined by this specification..."
DFXP departs from the exact syntax and semantics only when requirements dictate such departure. The TT WG takes exception to the notion that DFXP must adopt the exact syntax and semantics of SMIL. DFXP is a distinct media type, and is not intended to replace the functionality of SMIL. Its re-use of timing formulations already adopted in SMIL is merely a convenience for authors that have current familiarity with SMIL concepts.
The resolution is archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
12. Metadata
SYMM-12-1:
TTWG Response:
Careful consideration was given to the possibility of direct use of existing
Metadata vocabulary, in particular, the vocabulary defined by Dublin Core. In
the final analysis, it was determined that the requirements of DFXP did not
match those provided by similar Dublin Core vocabulary. As a consequence, a
very limited set of metadata vocabulary was defined to meet the specific
needs of DFXP content authors in creating interoperable content with agreed
upon meaning.
SYMM-12-2:
TTWG Response:
The TT WG does not agree with this assertion, and believes that there are
many use cases for the direct incorporation of metadata into content. Common
examples of such usage are prevalent in existing W3C standards, such as
XHTML, SMIL, and SVG in their use of title and description attributes or
elements. Similarly, XML Schemas provides the xs:documentation element type
for author supplied metadata.
The resolutions are archived at http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
Appendix B: Dynamic Flow Processing Model
SYMM-B-1: Text and diagram should be provided.
TTWG Response:
Agreed. Either this information will be provided or the feature will be
removed.
Appendix H: Acknowledgments
SYMM-H-1:
TTWG Response:
The intent of this comment is unclear. If specific persons listed thus would
like to have their names removed or attributed in a different manner, then
such specific request will certainly be accommodated.
********************************************************************
The resolutions are archived at
[comments SYMM-0-1 to SYMM-12-2 and SYMM-B-1, SYMM-H-1]
http://lists.w3.org/Archives/Member/member-tt/2005Jul/0015.html
1st Response from TTWG
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0015.html
2nd Response from SYMM WG to TTWG response
http://lists.w3.org/Archives/Public/public-tt/2005Sep/0001.html
2nd Response from TT WG to 2nd SYMM response
http://lists.w3.org/Archives/Public/public-tt/2005Oct/0000.html
Comment:
Dear Timed Text Working Group, In response to your request [1] for Last Call comments on "Timed Text (TT) Authoring Format 1.0 - Distribution Format Exchange Profile (DFXP)", the Multimodal Interaction Working Group has reviewed the document from our perspective, in particular considering how timed text might be incorporated into multimodal applications. The Multimodal Working Group has not an objection, but an observation to make about the Timed Text Group's last call working draft. Timed Text would be easier to use as part of multimodal interfaces if it had a means of handling external asynchronous events. Such events are the standard means of coordinating among modalities in multimodal situations. Consider a multimodal interface that is using Timed Text and text-to-speech simultaneously to prompt the user, while using speech recognition to gather the user's response. Using ttp:timeBase, the text to speech output can be synchronized with the Timed Text display. However, when the user starts speaking, the multimodal interface would normally want to stop the text to speech play and alter, if not stop, the Timed Text display to indicate that it is now listening to the user. Obviously, the timing of the user's utterance can't be known in advance, so the normal way to do this is to generate a 'speech-detected' or 'barge-in' event, which is then delivered to all the modalities where it is caught by appropriate event handlers. (The event handler for text to speech would halt the current text to speech play. A corresponding handler for Timed Text might flash the display or halt it or make it change colors.) In the current specification, there is no apparent way to handle this event in Timed Text markup. This gap does not indicate an inherent weakness in the Timed Text specification, but we think that it will limit the usefulness of Timed Text in multimodal interfaces. 
If you would like more information about the overall multimodal architecture that we're envisioning as a potential container for timed text, you may find our MMI Architecture document useful [3]. We would be happy to discuss our observation in more detail if you have any questions or comments. best regards, Debbie Dahl, MMI WG Chair [1] Request for Last Call Comments: http://lists.w3.org/Archives/Member/chairs/2005JanMar/0118.html [2] TT Authoring Format: http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/ [3] MMI Architecture: http://www.w3.org/TR/2005/WD-mmi-arch-20050422/
The discussion is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Jul/0002.html
The resolution is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0011.html
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0012.html
TTWG response agreed by Requestor at:
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0042.html
Comment:
<background> Please refer to the 'ttm:role' attribute as defined in the current Timed Text DFXP specification. http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/#metadata-attribute-role A caucus of WCAG and UAAG participants concluded that they might want an explicit 'transcription' value for such a role. http://lists.w3.org/Archives/Member/w3c-wai-cg/2005AprJun/thread.html#35 </background> <comment> User agents need clear indications in the format of what text corresponds to speech in some corresponding audio segment. This is needed in order for the User Agents to satisfy UAAG Checkpoint 2.3. http://www.w3.org/TR/UAAG10/guidelines.html#tech-conditional-content We realize that the DFXP will most often be transcoded into another format before transmission to the User Agent. However, in order for the transmitted form to have this information, the distribution format must be clear on this point for the transcoded results to convey the right information. Is this information recognizable from the existing format as it stands? If so, how? Or should the 'ttm:role' attribute have a value of 'transcription' defined? </comment> Al /chair, Protocols and Formats WG This comment has rough consensus support within the WG
The resolution is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0007.html
The discussion is archived at: http://lists.w3.org/Archives/Member/member-tt/2005Jul/0003.html
Comment:
It has been said on other threads that introduction of the timing model into the layout was due to the following: >>(1) we wanted regions to be temporally activated/deactivated; >>(2) we want to animate certain region styles, such as background color >>(which is independent of background colors deriving from content >>elements) and position; >> >>In order to provide these temporally sensitive features, we need to make >>regions timed elements, which implies a timing context, which in turn >>indicated a need for having the root container element <tt/> be a timed >>container. The use of timing inside layout elements (and, in turn, in the root container element <tt/>) is not necessary if certain other attributes are allowed. I feel that adding timing to these elements is adding unneeded complexity; if left as is, I fear a lot more investigation and documentation will need to be done to cover the non-obvious edge cases. You can do both of the things you described, above, in SMIL 2.0 without the existence of region timing attributes. (1) To temporally activate/deactivate regions: You could add the "showBackground" region attribute to TT and allow the value of "whenActive". See: http://www.w3.org/TR/SMIL2/layout.html#adef-showBackground When "whenActive" is active, "...the background color will not be shown in the region when no media object is rendering into that region". Also, you could allow animation of that attribute by using a <set> or <animate> element in the body that targets the region and its showBackground attribute. The latter option would allow you to turn the region's display on and off when no text was displayed in it. When text is displayed in the region and you want it to be hidden, you can move it behind other regions, resize it to 0x0, move it off screen, ...etc. The second two are contrived, I admit, but moving a region behind another is not, IMHO. 
(2) To animate region styles, you could use the <set> and/or <animate> element in the body, with the region's id as the targetElement value. For instance, the following SMIL2 Language-profile presentation animates the region's color from blue to red to yellow then back to red then back to blue at one-second intervals:
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <head>
    <layout>
      <root-layout width="340px" height="280px" />
      <region id="r1" regionName="foo" top="10px" left="10px" height="240px" width="320px" backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <par>
      <img src="data:text/plain,Hello" width="50px" height="50px" region="r1" dur="5s" />
      <set targetElement="r1" attributeName="backgroundColor" to="red" begin="1s" dur="3s" />
      <set targetElement="r1" attributeName="backgroundColor" to="yellow" begin="2s" dur="1s" />
    </par>
  </body>
</smil>
Note: In SMIL 2.0 you can smoothly animate from one value to another if the value is numerical, e.g., from="#0000FF" (blue) to="#FF0000" (red). SMIL 2.0 also has a shortcut for smoothly animating color called (naturally) "animateColor".
- Erik
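The one-second color sequence described in the comment can be checked with a small sketch. It is not part of the original message and is not a SMIL implementation: it assumes that each <set> applies only while active (begin <= t < begin + dur) and that, when two are active at once, the one that began later wins. The names SetAnimation and color_at are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SetAnimation:
    """Hypothetical stand-in for one <set> element targeting backgroundColor."""
    to: str       # value applied while the animation is active
    begin: float  # start time in seconds
    dur: float    # active duration in seconds

    def active_at(self, t: float) -> bool:
        # Assumption: the value is removed once the duration ends.
        return self.begin <= t < self.begin + self.dur


def color_at(t: float, base: str, animations: list) -> str:
    """Return the effective color at time t under the stated assumptions."""
    active = [a for a in animations if a.active_at(t)]
    if not active:
        return base
    # Assumption: the most recently begun animation overrides earlier ones.
    return max(active, key=lambda a: a.begin).to


# The two <set> elements from the example, with the region's base color "blue".
SETS = [SetAnimation("red", 1.0, 3.0), SetAnimation("yellow", 2.0, 1.0)]

# Sample the middle of each one-second interval of the 5s presentation.
timeline = [color_at(second + 0.5, "blue", SETS) for second in range(5)]
print(timeline)
```

Under these assumptions the sampled sequence is blue, red, yellow, red, blue, which matches the order given in the comment.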
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2005Aug/0032.html
Comment:
Hello public-tt,
In 8.3.2 <color> http://www.w3.org/TR/2005/WD-ttaf1-dfxp-20050321/#style-value-color
a) what is the color space for the RGB values (I expect it is sRGB, but the spec does not say so)
b) Do rgba values represent premultiplied values or not?
c) Is the opacity channel linear, or not?
d) What color space is used for compositing of two rgba values?
Also, in section 8.3.12 <namedColor> the same comment applies (I assume the answers are the same).
-- Chris Lilley
Chair, W3C SVG Working Group
W3C Graphics Activity Lead
a) what is the color space for the RGB values (I expect it is sRGB, but the spec does not say so)
TTWG Response:
The intention is to use sRGB. We will ensure the specification normatively
reflects this intent.
b) Do rgba values represent premultiplied values or not?
TTWG Response:
The RGB components of an RGBA tuple expressed by the rgba() function are NOT
premultiplied by alpha; we believe this matches the interpretation used by
CSS3 Color Module. We will ensure the specification normatively reflects this
intent.
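To illustrate the distinction this response draws, the following sketch converts between non-premultiplied ("straight") and premultiplied RGBA tuples. It is only an illustration: the function names are hypothetical, and components are assumed to be floats in the range 0.0 to 1.0 rather than the integer values used by rgba().

```python
def premultiply(rgba):
    """Convert a straight (non-premultiplied) RGBA tuple to premultiplied form."""
    r, g, b, a = rgba
    return (r * a, g * a, b * a, a)


def unpremultiply(rgba):
    """Convert a premultiplied RGBA tuple back to straight form."""
    r, g, b, a = rgba
    if a == 0:
        # Color is undefined when fully transparent; return transparent black.
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)


# Half-transparent pure red as an author would write it (straight alpha).
straight = (1.0, 0.0, 0.0, 0.5)
premultiplied = premultiply(straight)
print(premultiplied)
print(unpremultiply(premultiplied))
```

Under the response's interpretation, a half-transparent red is authored with a full-intensity red component; the scaled form (0.5 red) would only appear inside a renderer that chooses to premultiply internally.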
c) Is the opacity channel linear, or not?
TTWG Response:
The intent is to be linear. We will ensure the specification normatively
reflects this intent.
d) What color space is used for compositing of two rgba values?
TTWG Response:
The intent is that the input and output color spaces from
the compositing function are all sRGB. We will ensure the specification
normatively reflects this intent.
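As a rough illustration of compositing two rgba values, the sketch below blends straight-alpha tuples directly on their sRGB-encoded components, so that inputs and output stay in sRGB. The response does not name a compositing operator; the Porter-Duff "source over" operator used here is an assumption, as are the function name and the 0.0 to 1.0 float range.

```python
def composite_over(src, dst):
    """Composite straight-alpha src over dst, working on sRGB-encoded values.

    Assumption: Porter-Duff "source over" with alpha used as a linear weight.
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        # Both layers fully transparent: the result has no defined color.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        # Weight each layer by its coverage, then return to straight alpha.
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


# Half-transparent red over opaque blue.
result = composite_over((1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0))
print(result)
```

With these inputs the result is an opaque color with equal red and blue components; a renderer that blended in a linear-light space instead would produce a different value, which is why the choice of compositing space matters.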
e) Also, in section 8.3.12 <namedColor> the same comment applies (I assume the answers are the same).
TTWG Response:
The intent is to use sRGBA. We will ensure the specification normatively
reflects this intent.
The resolution is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Aug/0015.html
The discussion is archived at:
http://lists.w3.org/Archives/Member/member-tt/2005Jul/0005.html
TTWG response agreed by Requestor
http://lists.w3.org/Archives/Public/public-tt/2005Sep/0000.html
Thierry MICHEL (tmichel@w3.org)
Last Updated: 2006/09/15 10:53:30