Timed Text Working Group Teleconference

Meeting minutes

nigel: 2 days of meeting
… a lot of different topics to cover

nigel: we have a bunch of documents that we want to progress
… and we haven't begun horizontal review
… a goal is to decide which documents should get to WD to trigger HR
… the new process requires 3 months between WD and CR
… second goal is deciding what to do with TTML3
… 3rd WebVTT
… and 4th MIME types for embedded media

pal: I'd like to resolve the font file size issue

nigel: we have several TTML2 topics
… I listed CSS properties but I don't know when we will discuss it
… for Danmaku, we will meet the Chinese IG tomorrow
… the schedule is arranged to accommodate people's departure times

atai: we probably need 45 min for 360

atai: we could have a short slot for demoing
… I would be interested in giving a 5-10min demo

nigel: the schedule does not include anything about charter
… the WBS survey is closed
… do you know if anybody is planning to explain to us about the charter?

atsushi: I have no information from PLH

nigel: [tweaking the agenda]

M&E IG follow-up

atai: we could also have followups of the break out sessions to prepare for the next meeting

<gkatsev> M&E IG joint meeting minutes

cyril: how many topics were addressed?

atai: most of the time we discussed 360

pal: my take away is that there was no objection to support for text in MSE
… no objection, some support but this needs implementation

cyril: Some details to sort out, main problem is native parsing support, TextTrackCue v2
… Want to discuss MSE, DataCue and v2 TextTrackCue.
… Mental model:
… DataCue is the API used to surface events or information contained inband e.g. in an MP4 file
… The major use case is emsg box and metadata in general, not subtitles
… DataCue is how you surface things
… TextTrackCue v2 is how you pass information in the other direction.
… DataCue flows browser -> app
… TextTrackCue flows app -> browser

Glenn: The native platform could be parsing the inband subtitling then surfacing it as a TextTrackCue if it wanted to.
… The immediate use case is to allow the script to do the parsing by getting data from DataCue or something like that.
… Somehow you need to get the data into the javascript to construct the data model to populate the TextTrackCue which
… can then be presented.

Cyril: Continuing model.
… The content of the text track does not flow into the app.
… MSE controls the synchronisation and what text track is used and so on.
… The added value is management of buffering and synchronisation more properly.

Pierre: It has more opportunity to do sync more accurately.

Cyril: Because there is little browser support for TTML, how can you have MSE support for Text Tracks without native support?
… Wondering if MSE calls the app to do the parsing in Javascript and the result is a TextTrackCue object passed to
… MSE that handles buffering, pre-rendering and synchronisation.

pal: unless it's encrypted, applications can do the extraction
… why ask browsers to do the parsing?
… just do it yourself

nigel: that's based on the model that the raw data that MSE exposes could be an IMSC document
… for example unwrapping the MP4 container

cyril: MSE could do the unwrapping, pass the IMSC/WebVTT doc to the app, the app would create an object and pass it back to MSE

nigel: what about accessibility?
… the BBC player always puts the right attributes on the div that displays the subtitles
… so that an assistive technology would read this out
… we would do ad hoc repositioning when controls appear
… I'm trying to think whether any of these proposals would have impact

gkatsev: you can also change the size of the div

nigel: that's not enough to deconflict the UI from the text

nigel: I'm concerned that if you pass it down you don't have controls

cyril: on the other hand, if you pass it to the browser, it can handle PiP, fullscreen, second screen

gkatsev: people who use captions would not want screen readers over captions

nigel: not everybody who can't hear can see

pal: what can this group do?
… that's a media WG discussion

atai: to understand Cyril's model, is the rendering of the cue done by the browser?

cyril: yes

atai: this whole thing makes sense with the improvement in integrated rendering

nigel: the 2 problems seem to have some symmetry
… why are audio/video not treated the same as subtitles?

pal: if you encrypt subtitles, then there is an issue
… you'd have to discuss how to expose it back or not to the app

cyril: I've never heard of encrypted subs

nigel: the synchronization seems difficult in the browsers
… the DataCue breakout was interesting
… foolip said you misread the time marches on algorithm in the HTML spec

gkatsev: the timeupdate event firing is limited

nigel: I think that spec needs to be updated
… his view is that the intent was that the time marches on algorithm should be run more frequently
… and a good implementation should be able to trigger the onenter/onexit close to the actual time
… Chris Cunningham assigned a Chrome bug to himself
… if you give it to MSE you may have more accurate timing but if the cue events are fired at a better time

gkatsev: I think there is benefit to having that
… but maybe less of an advantage once the time marches on thing is fixed

Cyril: Relying on cue event timing isn't going to be as good as relying on native.
… Implementation is really painful at the moment relying on maintaining synchronisation with timeupdate events etc

gkatsev: In video.js we use timeupdate but we're thinking of adopting requestAnimationFrame

gkatsev: it's been working fairly well so far

nigel: have we got any request for the Media WG?

gkatsev: we can talk about that with them
… they are all tied together?

cyril: is it a good idea to present my model?

gkatsev: yes

pal: we could discuss separation of work between WICG, TTWG and Media WG
… if it deals with Timed Text format and accessibility
… if it is an API question it's the WICG or Media WG
… that means that if they see a need to extend the formats or accessibility aspects, they should come back to us
… in this group, we will not have sufficient browser vendor representation
… and vice versa

atai: the WICG is the place where the 2 communities would meet

nigel: the WICG allows people to develop micro features without process

atai: any activity yet on the WICG?

pal: Tess took the action item to get that started

cyril: this will be on discourse?

pal: yes

gkatsev: should we discuss that with the Media WG tomorrow?

cyril: yes, I can draft some slides

<gkatsev> WICG discourse

CSS WG meeting follow-up

nigel: that was a good meeting, very constructive
… it went better than previous meetings
… there was good progress on all of the issues
… the specs will be updated
… but their feedback was that either we should implement the proposed changes
… or ask someone to do it

gkatsev: we could add tests

nigel: well, it was not mentioned

gkatsev: adding tests is good because browser vendors do pay attention to failures in WPT

glenn: what features?

nigel: line alignment (multiRowAlign) and line padding

pal: if somebody cared enough, PR against Chromium or Firefox
… would help
… they care about doing good rendering

nigel: that's where we are with the CSS WG and we should be happy about that

<nigel> CSS Issue 4319

atsushi: this was discussed in JLTF
… it was mentioned that it may have conflicts with boutens

pal: we could say that text emphasis is not allowed on upright text

glenn: they showed examples of prints that had that
… put the emphasis on the tate chu yoko block

atsushi: there is a new property that allows specifying a limit to the number of characters in Tate Chu Yoko

pal: where should the discussion happen?

atsushi: from an HTML and CSS point of view, you should not use TCY for long text

pal: I'm told that in DCinema 4-digit years are displayed upright

cyril: Netflix's style guide says do not use combine for more than 2 characters

<nigel> JLREQ issue 109

<nigel> i18n issue 726

break until 1045

<atsushi> https://github.com/w3c/charter-timed-text/issues

IMSC 1.1 issues

forcedDisplay and visibility="hidden" imsc#484

<nigel> github: https://github.com/w3c/imsc/issues/484

Nigel: This was discussed in CSS WG and with APA
… there is a view that the CSS speech spec is intended for this kind of purpose

<gkatsev> speak css property

Nigel: erika from CSS WG commented that if you set visibility to hidden but also set the speak property, the content is supposed to be sent to the screen reader
… it was not clear to me from the spec that it should behave that way

gkatsev: spec is ambiguous

Nigel: I now think there is something needed
… CSS WG had volunteers to look at it

gkatsev: Currently this is defined in CSS3 speech

nigel: It is a buggy spec
… values for speak property are possibly wrong

gkatsev: The always value is confusing and needs an update

nigel: they may have thought of the difference between visibility and display but it is not clear
… action: we need to wait for the result of the a11y task force
… after we have an outcome we can see what to do next

glenn: in ttml we ended up merging voice-rate semantics into tta:speak
… we have values like normal, fast

Nigel: There is a difference in CSS:
… if you use speech representation that is how it should be styled

glenn: ...

Nigel: The idea of CSS speak is that you can direct a screen reader to an audio representation of elements even if they
… do not have a pixel representation
… I now see that is a retired note in annotation in the spec

Summary: Wait to see what the CSS a11y TF comes up with

Support `#font` TTML2 feature imsc#472

<nigel> github: https://github.com/w3c/imsc/issues/472

Nigel: It is about the font feature in IMSC 1.1
… Pierre's question is about resource limits

pal: questions:
… should processors support a minimum set of font formats
… should @type be limited to a certain set of values?
… should the number of resources be limited in a document?
… should the size (in bytes) of each resource be limited?
… regarding should @type be limited to a certain set of values?
… my recommendation is not to constrain @type, but browsers need to support a minimum list of font resources
… for the start of the spec, my proposal for the limit on the number of resources is 2

Nigel: Why should we limit?

pal: Download time

cyril: we could distinguish between the number of resources in the document
… and the number that should be loaded at begin time

pal: do we have a strategy for effective font loading in TTML?

glenn: we have a strategy and named it
… lazy loading in our discussion
… it is an implementation-dependent feature

Nigel: The number of font elements can be restricted
… you expect each font element to cause just one font resource to be loaded
… if you want to restrict it you should restrict the number of font elements

glenn: I would not like to constrain either font elements or resources
… the application can decide what to fetch on the basis of the referenced font information

pal: if you use the fetch mechanism you can express the limit in bytes
… the downside of fetch is that it requires full processing of the document
… full processing like style resolution etc.

glenn: This is a constraint you can only test during presentation processing

glenn: could it be a constraint in the HRM?

pal: yes
… as info: digital cinema sets fetch limit to 10 MB
… spoke with adobe colleague
… this is no coincidence
… it just works there

atsushi: In Japan we provide only a subset of a font
… this limits the size

cyril: I don't think that we at Netflix would do font subsetting
… especially for the first episode you have to provide all

glenn: the Noto Sans font is 8.6 MB for Simplified Chinese

gkatsev: We can start with 10 MB and then see if anybody is complaining
… in that case we increase the limit

Pal: everybody agrees to have 10 MB as limit

<nigel> PROPOSAL: For FPWD limit fetched font resource to 10 megabytes

<nigel> Nigel: Any objections?

Resolved: For FPWD limit fetched font resource to 10 megabytes
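
A minimal sketch of how a validator might apply the limit resolved above to fetched font resources. The function name, the dictionary of resource sizes, and the reading of "10 megabytes" as 10 × 1024 × 1024 bytes are assumptions made for illustration; the resolution does not define them.

```python
# Hypothetical check of fetched font resources against a byte limit.
# ASSUMPTION: "10 megabytes" is read as 10 * 1024 * 1024 bytes; the
# resolution itself does not say whether MB or MiB is meant.
FONT_RESOURCE_LIMIT_BYTES = 10 * 1024 * 1024


def oversized_fonts(font_sizes, limit=FONT_RESOURCE_LIMIT_BYTES):
    """Return the sorted names of font resources larger than the limit.

    font_sizes maps a (hypothetical) resource name to its size in bytes.
    A resource exactly at the limit is accepted.
    """
    return sorted(name for name, size in font_sizes.items() if size > limit)


if __name__ == "__main__":
    sizes = {
        "noto-sans-sc.otf": 8_600_000,   # roughly the 8.6 MB figure quoted above
        "oversized.otf": 12_000_000,     # made-up resource above the limit
    }
    print(oversized_fonts(sizes))
```

With these example sizes only the made-up 12,000,000 byte resource is reported, while the 8.6 MB figure quoted in the discussion fits within the limit.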

pal: Constraint on @type
… my suggestion no constraint
… but require IMSC to support minimum set of font formats

PROPOSAL: no constraint on @type but IMSC processors need to support minimum set of font formats

<nigel> group: [no objections]

Resolved: No constraint on @type but IMSC processors need to support minimum set of font formats

pal: But...which font format?

<nigel> Nigel: Any objections?

pal: we need to be careful
… there are a couple of formats: woff, woff2...
… I do not have the expertise to decide what is hard and what is not

cyril: We need one compressed and one uncompressed format

Nigel: There is an example in the DVB spec

<cyril> https://caniuse.com/#search=woff

<glenn> OTF's SVG table defined at https://docs.microsoft.com/en-us/typography/opentype/spec/svg

Nigel: for font download it supports OFF (Open Font Format) and WOFF
… that is where I would start

cyril: we can discuss about woff2
… it is broadly supported
… and compresses better

<nigel> andreas: It would be good to get a view from Vladimir Levantovsky from Monotype who is a member of this group too.

pal: we wanted to specify requirement of processors

cyril: we should constrain it to not support svg-outline

<nigel> PROPOSAL: Require minimally processor support for font/otf with cff and ttf (i.e. no svg outline) plus woff

Nigel: Any objections?

Resolved: Require minimally processor support for font/otf with cff and ttf (i.e. no svg outline) plus woff

gkatsev: woff2 has 30% better compression (as a data point)

cyril: about unicode range
… I am satisfied with the current solution in Pierre's PR

nigel: I have an example with two fonts and different sources
… but overlapping sources
… should we constrain this?
… is there a use case for this?

glenn: Usually we leave it to the implementation to decide what makes sense

pal: In this case we constrain the size of the fetched resource
… are we asking the implementation to find out the minimum set needed?
… seems complicated

glenn: we have a font selection strategy in TTML2

nigel: this is orthogonal to this discussion

pal: what we can do
… we can forbid different fonts with the same font family

nigel: we may have different fonts and resources for the same font family because they are for different faces
… e.g. bold, italics ...

Nigel: we can try to constrain font family together with different properties like weight

pal: how does unicode range go together with fetch?
… it needs to be validated by the processor when the size is exceeded
… the typical use case is one font and one font family, right?

Nigel:

atai: NPO in the Netherlands has the requirement to have two fonts in one document (one for the Arabic and one for the Dutch version of the subtitle)

pal: even if you have two fonts it makes sense
… to constrain the combination of ranges, family, style and weight

Nigel: I see two proposals
… you forbid unicode range overlap between different font elements with the same values for font family, style and weight
… or to have no constraint

glenn: There may be identical font elements with different kerning tables

pal: you can forbid that

<glenn> https://www.w3.org/TR/2018/REC-css-fonts-3-20180920/#font-style-matching

glenn: I would recommend that implementations follow the algorithm defined by CSS

the HRM should use this algorithm

pal: this would solve that issue

glenn: ttv does some (not all) HRM checking

nigel: You cannot specify the HRM constraint
… you cannot statically validate this as font resources may change dynamically

pal: the idea of the HRM is to provide basic guidelines so things are going to work

<glenn> see https://www.w3.org/TR/2018/REC-css-fonts-3-20180920/#composite-fonts for text on handling overlapping ranges

PROPOSAL: Unicode range overlap between different font elements is permitted even if they have identical values for style, family and weight

Nigel: Any objection?

No objections

Resolved: Unicode range overlap between different font elements is permitted even if they have identical values for style, family and weight
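
A minimal sketch of how a tool could detect the overlaps discussed above, for example to report them as informational rather than as errors, since the resolution permits them. The range syntax handled here (`U+XXXX` or `U+XXXX-YYYY`, a CSS-style simplification without wildcards) and the dictionary shape used for a font element are assumptions, not definitions taken from TTML2 or IMSC.

```python
def parse_range(text):
    """Parse 'U+0600-06FF' or 'U+0041' into an inclusive (start, end) pair."""
    body = text.strip().upper()
    if not body.startswith("U+"):
        raise ValueError(f"not a unicode range: {text!r}")
    start, _, end = body[2:].partition("-")
    low = int(start, 16)
    high = int(end, 16) if end else low
    if high < low:
        raise ValueError(f"descending unicode range: {text!r}")
    return low, high


def ranges_overlap(first, second):
    """Return True if two unicode ranges share at least one code point."""
    (a_low, a_high), (b_low, b_high) = parse_range(first), parse_range(second)
    return a_low <= b_high and b_low <= a_high


def overlapping_fonts(fonts):
    """Return index pairs of fonts with the same family, style and weight
    whose unicode ranges overlap.

    Each font is a hypothetical dict with the keys
    'family', 'style', 'weight' and 'range'.
    """
    pairs = []
    for i in range(len(fonts)):
        for j in range(i + 1, len(fonts)):
            same_face = all(
                fonts[i][key] == fonts[j][key]
                for key in ("family", "style", "weight")
            )
            if same_face and ranges_overlap(fonts[i]["range"], fonts[j]["range"]):
                pairs.append((i, j))
    return pairs
```

How a processor chooses between overlapping fonts is a separate question; the discussion above points to the CSS Fonts matching and composite font text for that.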

IMSC 1.2 FPWD

Proposal: Once we have merged the open PR we go to FPWD

Nigel: Any objections?

No objections

Resolved: Once we have merged the open PR we go to FPWD


Lunch

Live Extensions

nigel: [pulls up repo]

nigel: there's a few things we need to cover
… last time we talked about it pal asked for some official communication from EBU
… they had already done so
… in the reflector
… I've taken on the role of editor
… three docs
… 1. one is normative part
… 2. live carriage
… 3. live extensions guide
… take first two to Rec and 3rd one as a Note
… atsushi, how do you do PR preview in repos with ReSpec to make it easier to review?

atsushi: there is a tool for preview

nigel: how to make it work with multiple docs in the same repo?
… maybe pr-preview can be configured
… can you look into it, atsushi?

atsushi: yes

nigel: currently no PRs or issues against it, only editorial notes
… as a reminder, the scope is for contribution of live streamed subs to distribution encoder
… the idea is not that these extensions would apply to a wide number of users
… if you've got an IMSC distribution encoder with a live distribution mechanism such as HLS or DASH
… this provides a way to get the content to that
… approach taken is to define some extension features
… key things are:
… introduces concept of a sequence with an identifier and number
… to resolve temporal issues
… tt-live is the main specification
… has anyone reviewed it?

atai: yes

others: no

atai: I have comments on the basic structure

nigel: let me continue briefly and hear it later

nigel: describes temporal overlap resolution mechanism
… defines processing semantics which are important for use cases in live scenarios
… the ability to delay delivery of documents or time within documents
… handling reference clocks
… the ability to hand over between subtitlers
… typically in live authoring one person will be subtitling and when they're done they hand it over to another subtitler
… this doc describes how to resolve inputs coming from multiple sources

<nigel> TTML Live Extensions Module

nigel: new constraints attributes on time base, sequenceIdentifier, sequenceNumber, authorsGroupIdentifier, authorsGroupControlToken, referenceClockIdentifier

cyril: what's the difference between stream and sequence

nigel: a sequence is a set of documents with the same identifier
… and a stream is the delivery of that sequence between two endpoints
… i've defined a profile which is simple
… borrowing a little from IMSC, I've found that as well as the concepts of permitted and prohibited, I needed optional and required
… the profile describes some extension features
… Appendix has requirements for carriage specifications
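
The sequence concept described above can be sketched as follows, under stated assumptions: documents are represented here as hypothetical `(sequence_identifier, sequence_number, payload)` tuples, and sequence numbers are taken to be integers that order documents within a sequence. The actual attribute syntax and processing semantics are those of the TTML Live Extensions Module, not this sketch.

```python
from collections import defaultdict


def group_into_sequences(documents):
    """Group documents by sequence identifier, ordered by sequence number.

    documents is an iterable of (identifier, number, payload) tuples
    (a hypothetical representation). Returns a dict mapping each
    identifier to its payloads in ascending sequence-number order.
    """
    sequences = defaultdict(list)
    for identifier, number, payload in documents:
        sequences[identifier].append((number, payload))
    return {
        identifier: [payload for _, payload in sorted(items, key=lambda item: item[0])]
        for identifier, items in sequences.items()
    }


if __name__ == "__main__":
    # Documents may arrive out of order and interleaved across sequences.
    received = [("A", 2, "a2"), ("B", 1, "b1"), ("A", 1, "a1")]
    print(group_into_sequences(received))
```

A stream, in the terms used above, would then be the delivery of one such ordered sequence between two endpoints.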

atai: thank you for your work
… I know there has been several years of discussion to get to this state and several edits
… I know what's in there is all necessary input to solve the problem
… but I also would really like to get this adopted by the market
… the market needs less complexity
… in the end, the core scenario is the sequence number and everything else is for handler
… to solve the core problem we possibly just need the timing model and the sequence number
… that would be easier to digest for a lot of people

nigel: so, separate out things like hand off?

atai: system model is important but not relevant to everyone
… main usecase is contribution of live subtitles to a streaming encoder
… out of band live captions to the encoder and get delivered via DASH or HLS
… so, less is more

nigel: thank you

glenn: so you're suggesting this is pared down to core normative pieces

glenn: another option is to move all the normative syntactic pieces to the top and move everything else to appendixes so the meat of it is at the top
… issue if the top section references the system model it could be confusing

nigel: section 6 is the meat of it

glenn: I thought section 10 was

cyril: annex B is also important

cyril: PDF of tt-live is 44 pages

nigel: incredible

glenn: under the namespaces section, it was vague whether you were redefining or defining or merely copying from TTML 1
… in general, I don't like copying normative defs if possible
… I see you've changed the prefix to tt:tt

nigel: might be a bug

glenn: having a copy of the normative defs across docs has some challenges, FYI

nigel: I understand concern, not sure I agree
… if I were defining the namespace, it would be a problem. just saying shall be used

glenn: I would put a note that the namespace is defined in TTML and link to it

nigel: I think this is done by reference, but we can adjust it to make more clear
… it isn't claiming to redefine anything
… new vocabulary are below

glenn: xml schema isn't used

nigel: maybe another bug

nigel: xs:string is used

glenn: make a note to say that normative definitions are found in the appropriate docs

nigel: please raise an issue, it would be good

glenn: the items that have both optional and permitted in the profile section have some overlap
… it's a bit unclear
… both optional or permitted in live handover

nigel: live handover is optional and permitted if it's a handover manager node

glenn: it wasn't clear to me

cyril: maybe make it clearer by saying optional for everything or permitted by handover manager node

glenn: just having OR in there would've alleviated my concerns

nigel: anything else?

pal: I will review and suggest anything further

atai: it isn't about the technical quality
… it's about experience on how people look at long documents
… I think to get the core scenario working, you don't need too much

nigel: I'm definitely happy to hear suggestions on refactoring the document

glenn: I noticed you used SHALL, does IMSC use MUST? We use MUST everywhere

atai: this is a followup from the EBU joint meeting in Geneva
… one idea was also to align with IMSC

nigel: the way that this is structured is that it's an extension
… previously extension to EBU-TT
… nothing specific to EBU-TT
… by moving here by defining extensions means we can apply to IMSC as well
… so we can have an IMSC live that supports both this profile and IMSC itself
… that would provide a path towards live IMSC
… that's a design goal

glenn: a couple other editorial comments
… I would suggest editing extensions to extension, at the top
… instead of v1, I would suggest just using 1 to match TTML1
… I've been calling other modules timed-text instead of tt
… not tie directly to TTML so IMSC can also use it
… we should pick a qualifier and use it for all modules
… I don't have a huge preference

nigel: any more on this one of the 3 docs?

glenn: are you ready to publish as a working draft?

nigel: this has not been published, it's effectively an editor's draft

pal: to me the criterion is whether the normative portions match the source EBU doc

nigel: they do

pal: then publish
… just publish today

nigel: and refactor can happen after

<nigel> WebSocket Carriage Mechanism

nigel: next one is carriage mechanism
… one of the features of the main doc is a carriage mechanism
… this specifies how to do it over web socket

cyril: 14 pages

nigel: unbelievable

glenn: can we get a sense on whether tt or timed-text

nigel: wanted a brand name that's easy to say: TTML Live
… felt as shortest brand name

glenn: should I go and change Timed Text to TTML? In, say, Karaoke
… Timed Text Karaoke or TTML Karaoke?

cyril: one question
… what is the conformance to this document
… is it a node or encoder? how do you verify conformance?

glenn: how do you test it?

nigel: the definitions optional, required etc. translate to a pair of document requirements
… and processor conformance requirements
… those permutations are defined by the profile disposition in section 6

cyril: have you thought about CR exit criteria for this?

nigel: well, they are implementation criteria, so, there should be tests for those
… from my perspective, I know 3 implementations
… one open source led by me
… one is closed source by red beam
… another is closed source by FAB
… closed source can generate docs
… open source can consume and generate docs and output IMSC
… we can certainly contribute back some of the validity tests from the open source

nigel: so the web socket carriage mechanism, it's pretty straightforward
… just iterate through each of the considerations from the main document for carriage and let the web socket do the heavy lifting
… there are certain rules about connection lifecycle and error management that relate to the websocket RFC
… there are some normative provisions that match what ws requires
… any comments on that one?

cyril: so to demonstrate interop, you'd need a websocket server?

nigel: yes, you would have the normative provisions that if you send an invalid document it would break the connection

cyril: so it's more of a protocol?

nigel: yes, it's a good question how to test this

<nigel> TTML Live Guide

nigel: let's move to the guide
… some may remember that EBU-TT has a lot of discussion and informative text that I've moved to this guide
… some duplication from the main doc but it's all informative
… there are things like time graphs showing results of processing
… how overlaps get resolved
… lots of details that hopefully will help
… some of the things atai asked for, examples with computed and resolved times
… there's an editorial note to bring to this document
… all of this needs no conformance language but just explanatory text
… all sections say "This section is non-normative"
… any questions on this?
… in response to pal's point earlier, move the first two docs on the Rec track to FPWD
… I'll create PRs based on glenn's issues
… should we go straight to FPWD or have another review period?
… no requirement for quality to go to FPWD?
… this is implemented already

cyril: we don't have much expertise on this in the group
… would be nice to have a wide review

atai: there's a certain amount of people who looked within EBU
… so it's really important to get peer review on this, it should generally be understandable
… specific live scenarios, conditions, etc
… most complicated is the timing model

glenn: I second motion to FPWD

PROPOSAL: go to FPWD for the Live extension module and the carriage mechanism after resolving current open issues

nigel: any objections?

nigel: no objections

Resolved: go to FPWD for the Live extension module and the carriage mechanism after resolving current open issues

nigel: one other thing to bring to attention that may not be obvious

<nigel> Relationship to TTML in RTP

nigel: there is an IETF document about embedding TTML in RTP
… could be seen as a way to do live
… wrote a doc about creating a TT-Live from a TTML doc in RTP and vice versa
… there's some interesting complexity
… tt-live is done from the perspective that all info is in the doc
… in RTP some of the info is in the wrapper

glenn: you used TT-Live

nigel: I should change it to be consistent

gkatsev: using Timed Text may confuse someone thinking it could apply to WebVTT, when it may not

cyril: should we talk to the WebRTC WG?

nigel: they have a req to make live captioning in WebRTC

atai: there are people doing live captioning in webrtc

glenn: some people were talking about this in the M&EIG
… was quite involved in RTC and proposed some subtitles

nigel: how did they do it?

atai: I can find out and tell everyone

glenn: take a look at the minutes, I think there was a presentation

nigel: any other points?

atsushi: do we need a CfC?

nigel: any resolution has a 2 week period for comments
… two ways for resolutions, record a resolution from a call/meeting and timer starts then
… alternatively, a CfC in an email
… when sending minutes out, mention decision period to get objections in
… no need to do both

atsushi: should I wait for 10 days?

nigel: depends on the kind of resolution
… there needs to be some editorial work

glenn: I don't think it should be published until the 10 days

nigel: yes, you need to wait till the decision review period is done
… ttml topic is next

TTML2

nigel: we have 5 open issues planned for TPAC

<nigel> TPAC labelled TTML2 issues

Remove application of tts:rubyPosition to ruby annotation text. ttml2#945

<nigel> github: https://github.com/w3c/ttml2/issues/945

glenn: the request here is to remove application of ruby position from ruby annotation text
… but to leave it for ruby container text
… right now TTML2 says it applies to ruby container and text
… CSS is not very clear
… the original reason was because you can have text without a text container

<nigel> CSS ruby-position

glenn: in that case you want to be able to specify position

<nigel> TTML2 tts:rubyPosition

glenn: but you can only have one text annotation present if you don't have a text container
… so you cannot have one that says before and one that says after

nigel: in terms of delta with CSS, CSS says applies to ruby annotation containers

glenn: but a container says applicable to boxes not elements
… "ruby annotation containers are internal ruby boxes"
… there is some ambiguity in the language

cyril: CSS spec is still in WD

glenn: 5 year discrepancies between WD and ED
… the status of CSS specs in this area are not very firm

glenn: what problem are we solving by removing it

pal: let's match CSS

glenn: but CSS has not updated the WD in 5 years

pal: also there is no reason to let it apply to text

glenn: but you can have a text without a text container

pal: there is an implied one

glenn: but there is no way to refer to it

glenn: we could just refer to the definition of ruby text container
… in section 10.2.21.1

pal: ruby position is inherited
… you never need to specify on ruby text
… because you can always specify on the container
… and it is inherited on the ruby text container


atai: from what I understand, the difference between CSS and TTML is that the ruby position can be specified on a different structural element?

pal: no, it's only application
… CSS says it does not apply on text, just text container

pal: removing it from text does not remove any functionality

glenn: that's wrong, because you can specify it on a span ruby=text and it has the semantics of applying to that annotation text or the container that's implied that contains it
… if you remove it, it would not have any effect

nigel: is it true that in order to get the same functionality if you don't allow on text, you'd have to create an explicit text container?

pal: you can specify it on the ruby element, and it will inherit to the implied ruby text container

glenn: that doesn't deal with content already fielded that is putting ruby position on ruby annotation text elements

pal: in the order of priority, it is not the most important feature

glenn: do you agree that if we remove the ability, we are breaking content?
… it does make a technical change

atai: we do not need a resolution today
… but we should agree to align with CSS whenever possible

cyril: it's difficult to align with something not stable

nigel: should we tell CSS to do the way we do it?

glenn: yes

atsushi_: i18n and CSS are working on updating the CSS spec
… to finalize the element structure

glenn: it's not the structure in this case, but on the property

nigel: if this work changes the element structure this would have an effect

glenn: I doubt it would change the element structure, just the style definition

nigel: does that mean we have a place to contribute this idea?

atsushi_: current HTML does not have rtc
… that may be why there is a difference

nigel: if they did not have a rtc, how do they apply ruby position

atsushi_: they do it on rt

nigel: we should get alignment with CSS by having them apply it to rt

nigel: anybody wants to take the action?

atai: do you want to wait until the i18n work is finalized?

atsushi_: yes

atai: makes sense to me

glenn: I will file an issue with the CSS WG

atsushi_: we want to introduce rtc and tabular ruby
… if we do that, it might be easier for every one

atai: as a group, we should align

cyril: we should make sure TTWG, CSS and I18N are all discussing the same thing

<nigel> Action on Glenn

<nigel> SUMMARY: Issue not time critical for us, work alongside CSS and i18n to get an aligned solution across TTML and CSS.

example of Cap2TT output: https://raw.githubusercontent.com/skynav/ttt/master/ttt-cap2tt/src/test/resources/com/skynav/cap2tt/app/imsc11/test-015-ruby-position.expected.xml

The set element is included in [resolve computed styles]. ttml2#950

<nigel> github: https://github.com/w3c/ttml2/issues/950

glenn: after discussions, I said we could remove it

pal: I think it was added in error when going from TTML1 to TTML2 and so it should be removed

glenn: we could not remove set without also removing animate

nigel: Pierre says it is editorial and Glenn says it can't be tested

pal: it was introduced by mistake and we should remove it

glenn: I assume it was not an accident
… if we cannot test the removal, why remove it given the editorial pain

pal: the goal was not to create a divergence between TTML1 and TTML2
… it's only a spec divergence
… I checked the differences in algorithms in TTML1 and TTML2 and this one stood out

glenn: it's a change in a normative section

cyril: can we ask to go through the editing work to see if there is any problem?

glenn: I think we have done that
… and I said I am ready to remove them

pal: let's keep the issue open

glenn: fine

SUMMARY: defer this for the time being and don't make this a dependency for TTML2 2nd edition, removed from the milestone

Equivalence between tts:textDecoration="none" and "noUnderline noLineThrough noOverline" #1138

github: https://github.com/w3c/ttml2/issues/1138

nigel: in practice right now they are the same

cyril: in this case, example 2 and 3 should give the same result?

nigel: yes
… but if this was CSS it wouldn't be the same
… example 2 the textDecoration would be displayed
… example 3 is not possible in CSS because there are no values equivalent to no*
… you just can't do it
… once underline has been applied at a parent level, you cannot un-apply it

glenn: in TTML, it does punch a hole

pal: is none identical to specifying the 3 no* values?

glenn: yes

cyril: at least we need a note that none here behaves differently from none in CSS

glenn: the no versions are also different

cyril: because you can undo them and not in CSS

glenn: yes
… this is a feature where we are diverging from CSS
… that does not mean you cannot map TTML to CSS

pal: it is not inherited in CSS

nigel: the only way to get rid of the text decoration is to use an inline block

nigel: we need a note to explain that none is equivalent to no*
… and in the semantic derivation that there are differences (inheritance behavior)

glenn: we could put it directly in the text decoration definition

pal: 2 different notes: TTML-level and CSS/TTML difference

SUMMARY: we agree with having 2 notes, and let the editor decide where they go
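
[Editorial illustration, not part of the discussion] The equivalence discussed above can be sketched roughly as follows. This is a minimal model only, assuming tts:textDecoration is resolved as three independent on/off components inherited from the parent; the function name resolve_text_decoration and the dictionary representation are hypothetical and not taken from TTML2 or any implementation.

```python
# Illustrative model: tts:textDecoration as three independent, inherited flags.
COMPONENTS = ("underline", "lineThrough", "overline")

# Each token switches exactly one component on or off ("none" is handled below).
TOKENS = {
    "underline": ("underline", True),
    "noUnderline": ("underline", False),
    "lineThrough": ("lineThrough", True),
    "noLineThrough": ("lineThrough", False),
    "overline": ("overline", True),
    "noOverline": ("overline", False),
}


def resolve_text_decoration(parent, specified):
    """Return the child's computed flags from the parent's computed flags."""
    computed = dict(parent)  # start from the inherited value
    if specified is None:
        return computed
    for token in specified.split():
        if token == "none":
            # "none" switches every component off.
            computed = {c: False for c in COMPONENTS}
        else:
            component, state = TOKENS[token]
            computed[component] = state
    return computed


initial = {c: False for c in COMPONENTS}
p = resolve_text_decoration(initial, "underline")
span_none = resolve_text_decoration(p, "none")
span_no_star = resolve_text_decoration(p, "noUnderline noLineThrough noOverline")
span_inherit = resolve_text_decoration(p, None)
```

In this model a nested span with either value "punches a hole" in the parent's underline, and both values give the same computed result. As noted in the discussion, CSS text-decoration is not inherited and a descendant cannot un-apply a decoration set on an ancestor, which is the difference the second note would need to explain.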

Ruby constraints cannot be validated prior to ISD construction #1140

github: https://github.com/w3c/ttml2/issues/1140

pal: the problem is that the constraints are expressed based on properties not elements
… so you cannot validate them until you are in an ISD
… we have spans that behave like elements
… there are 3 options: we can say validation is done after ISD; we can constrain the application of a style resolution such that you can validate before style resolution; or say it was a mistake and change it to elements

glenn: I prefer option 1

pal: condition has a problem that we don't have a model that says when it's executed

nigel: there is no disagreement that it has to be after ISD construction

pal: this requires a huge change in IMSC.js

nigel: there are 2 ways you can modify a property: by set/animate or by initial

glenn: the PR makes them non animatable

glenn: in TTT it validates after ISD construction

pal: what I would like is for them to be elements

cyril: it's too late

pal: the way it is specified today is terrible

nigel: I don't see any advantage in making them not alterable by initial
… the conditional one has also some problems

pal: yes, conditional has other problems

nigel: the answer to the question in the issue, is yes that is the intent

cyril: should we close the issue with no change?

nigel: yes

glenn: do we want to let the initial change the initial value from none to ruby?

cyril: it's up to authors not to do crazy things

glenn: I'm ok with that

Resolved: we close the issue with no action

<nigel> SUMMARY: Yes. we confirm that validation of ruby constraints can only happen after ISD construction.
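
[Editorial illustration, not part of the discussion] A rough sketch of why the constraints depend on style resolution. It assumes a much simplified model in which tts:ruby can be specified inline, via a referenced style, or fall back to an initial value; the Span class, the ALLOWED_PARENTS table and the parent rule it encodes are hypothetical simplifications, not the normative TTML2 constraints or the ISD construction procedure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    ruby: Optional[str] = None   # tts:ruby specified inline, if any
    style: Optional[str] = None  # id of a referenced style, if any
    children: List["Span"] = field(default_factory=list)


def computed_ruby(span: Span, styles: Dict[str, Dict[str, str]], initial: str = "none") -> str:
    """Resolve tts:ruby: inline value, then referenced style, then the initial value."""
    if span.ruby is not None:
        return span.ruby
    if span.style is not None and "ruby" in styles.get(span.style, {}):
        return styles[span.style]["ruby"]
    return initial  # tts:ruby is not inherited


# Assumed, simplified nesting rule used only for this illustration.
ALLOWED_PARENTS = {
    "text": {"container", "textContainer"},
    "base": {"container", "baseContainer"},
}


def validate(span, styles, parent_ruby=None, initial="none"):
    """Return a list of nesting errors, checked on computed values only."""
    errors = []
    value = computed_ruby(span, styles, initial)
    allowed = ALLOWED_PARENTS.get(value)
    if allowed is not None and parent_ruby not in allowed:
        errors.append(f"ruby={value!r} under parent ruby={parent_ruby!r}")
    for child in span.children:
        errors.extend(validate(child, styles, value, initial))
    return errors


doc = Span(style="rc", children=[Span(style="rb"), Span(style="rt")])
styles_ok = {"rc": {"ruby": "container"}, "rb": {"ruby": "base"}, "rt": {"ruby": "text"}}
styles_bad = {"rc": {"ruby": "none"}, "rb": {"ruby": "base"}, "rt": {"ruby": "text"}}

errors_ok = validate(doc, styles_ok)
errors_bad = validate(doc, styles_bad)
```

The same element tree passes with one set of styles and fails with the other, so the markup alone does not say whether the constraints hold. That is the point made above about the constraints being expressed on properties rather than elements.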

Clarify luminance gain prose (#1117). #1156

github: https://github.com/w3c/ttml2/pull/1156

nigel: I had a philosophical debate about "determine"

pal: I prefer "determine"

glenn: I can take out the cd/m2 unit

glenn: the other comment I'm not sure it's good

pal: fine, you can ignore it, it is not that important

SUMMARY: Glenn to update the PR to remove the extraneous units and Pierre to approve, and Glenn to merge

FPWD to TTML2 2nd ed

nigel: looking at the issues, can we move to FPWD ?
… there are 10 open substantive ones
… 5 issues without PR open

pal: the goal of FPWD is to get HR started

nigel: this is a REC document, the path is not to go to FPWD

[looking at process]

pal: we can just ask people to do HR on the ED

glenn: can we make changes after HR has started?
… so I need to prepare a list of changes?

nigel: yes

SUMMARY: Glenn to prepare list of changes (editorial and substantive changes) and Nigel to initiate horizontal review based on Editor's Draft

Break until 1600

Karaoke Extensions

Cyril: There are a number of open issues

open issues on the karaoke module

Karaoke explainer

Karaoke draft specification

<cyril> https://github.com/w3c/tt-reqs/issues/9

Cyril: To start with the requirements, that's in tt-reqs issue 9
… Reminder: we began working on these requirements first. Reviewing them:
… 2 types:
… 1. To add more semantics to a document to be processed independently of the styles, which may or may not be in the document.
… Important - you can go deeply into details like the trajectory of the bouncing ball.
… Can get very verbose for each event individually.
… Netflix wanted to specify timing first then secondarily styles that could be in the document or overridden by the presentation processor.
… For example to allow Netflix to use the same style bouncing ball for all shows.
… At the same time, allow for specific styling in the documents if you like.
… That's how we proposed the requirements initially.
… Then the explainer lists possible solutions. [iterates through the options in the document]
… Using animate increased the verbosity of the representation.
… After this we proposed the current spec.
… The feedback initially was that using new elements was not a good idea.
… What I tried to do for the first version was see what the minimum amount of new attributes is to meet the requirements.
… I also wanted to do the minimum to keep compatibility with IMSC 1.x
… Does it make any difference if the features are expressed as new elements or new attributes?

Pierre: What's the most efficient? Using ruby as an example, introducing an element would have been much better.
… I don't have a strong opinion a priori.

Cyril: The specification starts with a model, which is kept simple.
… Imagine a small section of a movie with a song where karaoke mode is needed.
… Or another example is the whole document only represents that song.
… The first approach is a karaoke attribute in some namespace.
… The intention was to specify that here in this section the semantics of karaoke apply and the processor can do its thing.
… Then you need to know what can be overridden.
… That is given by the karaokeMode attribute and the set element to provide the inner times within the karaoke.
… The karaokeMode tells you what kind of change to do, e.g. color, which changes the color of the text with the time.
… For example a color sweep.

Glenn: The default for karaokeMode is auto so the processor can do its own thing

Cyril: [shows example from the draft spec]

Nigel: Where do the end times in the comments derive from?

Cyril: Only one karaoke can be in progress at any time so that sets the limit on some of them.

Nigel: [interested that the timing model seems different]

Cyril: Shows second example with karaokeMode="emphasis"
… No details of how the animation happens such as bouncing or moving horizontally.
… The only thing given is the key times.

Glenn: That's not adequate here because the balls need to be removed after each animation.

Cyril: That's not the intent, it is that the processor will do a first pass on the entire karaoke section, see that the mode
… is emphasis all the time and then see how it can be applied.
… There's only one glyph at a time that has the animation, is the assumption.
… 3rd example, has both begin and end times.

Glenn: Problem here is we can't use animate to vary a discretely animatable style.

Cyril: Two levels. First is only one key time per word. Second is two key times per word. I used set for the first and animate for the second.
… Two attributes: one to turn on karaoke mode the other to set the type.
… There's a style property for setting the image for the emphasis, called imageEmphasis to mimic textEmphasis.
… Now looking at the issues.
… Glenn sent several editorial comments.
… One comment was the same as Andreas's comment, which is that there are very few normative behaviours in the spec.

Andreas's comments in issue 4

Cyril: The idea is that an implementation that does not support the karaoke mode can just skip it.
… If you ignore it then it's a regular timed text file.
… Second level is you understand the timing and do whatever processing.
… Third level is precise rendering details, which has more normative requirements.

Glenn: [shows possible representations]

Cyril: First one is setting the colour by span, can be done in IMSC today.
… There's no information that it is a karaoke animation to allow processor to override styles.

Glenn: You could use ttm:role="lyrics"

Pierre: There's nothing defined.

Glenn: Right
… [shows sweep version with colour sweeping through glyphs]
… You cannot do this in TTML2 today because there's no way to apply a colour gradient linearly across an offset.

Cyril: It's not a Netflix requirement to define this exact feature because the renderer will apply the style itself.

Glenn: Example of simple linear text emphasis. [shows example of ball moving across the top of the text]
… This is different than our current textEmphasis because currently it applies to individual glyph areas.

Cyril: You might be able to discretely set the textEmphasis on a character by character basis.
… It wouldn't be continuous.

Glenn: That's one kind. Another uses a bouncing ball [shows example with orange bouncing ball combined with text colour sweeping through]
… Another, discrete turning on of outline. You could do this today with IMSC1.
… Last one does outlines and animated font size in the vertical direction.

Pierre: A key question is what the right first-order goal is: is it semantic?

Cyril: Semantic and timing.

Pierre: Is that right or also is it necessary to capture the actual visual style.
… To me the first one is a no-brainer. The second is not obvious to me because the karaoke styles are used in
… popular movies and are incredibly complex and high quality.
… How high fidelity do we ultimately want to get to beyond this in TTML.
… Branding is a much bigger topic.

Glenn: I've been talking about a slightly different syntactic variation of doing this with Cyril. In this variation I proposed
… using a single karaoke element that is treated as an Animation.class element like set and animate but with
… special features, for example it does not take style properties but takes color, emphasis or outline, which define
… karaoke effects. One of their values can be auto.
… If you want all of them to be auto you set auto="all"

Cyril: We don't need to look at the details, but can look at the two options and see what goes in the draft.

Glenn: First order, we could just define some semantic element like this, or abbreviated further.
… Then the question is what do we do second order.
… If we only define the semantic element to start with, then that kind of makes it difficult to test. You can't test anything
… interoperably e.g. by generating a sample image.

Pierre: Often you've argued in favour of under-specifying in TTML.

Glenn: That's because we had XSL and CSS for semantics

Cyril: One way to test is if we stick to the minimal key times and semantics implementation without style, is to look at
… the transitions, take the first rendering and subsequent ones and the differences should happen at the key times.
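
[Editorial illustration, not part of the discussion] The test idea Cyril describes could be sketched roughly like this. It assumes renderings are sampled at known times and can be compared for equality (for example as image hashes), and that the effect changes discretely at key times; the function names, the snapshot format and the tolerance parameter are hypothetical.

```python
def transition_times(snapshots):
    """snapshots: list of (time_seconds, rendering) pairs sorted by time.

    Return the sample times at which the rendering differs from the
    previous sample.
    """
    changes = []
    for (_, previous), (time, current) in zip(snapshots, snapshots[1:]):
        if current != previous:
            changes.append(time)
    return changes


def matches_key_times(snapshots, key_times, tolerance=0.0):
    """True if the observed transitions line up one-to-one with the key times."""
    changes = transition_times(snapshots)
    if len(changes) != len(key_times):
        return False
    return all(abs(c - k) <= tolerance for c, k in zip(changes, sorted(key_times)))


# Renderings are stand-in strings here; a real test would compare images.
samples = [(0.0, "A"), (0.5, "A"), (1.0, "B"), (1.5, "B"), (2.0, "C")]
observed = transition_times(samples)
ok = matches_key_times(samples, [1.0, 2.0])
late = matches_key_times(samples, [1.0, 2.5])
```

Such a check says nothing about what the effect looks like, only when it changes, which fits a timing-and-semantics-only level. A continuous effect such as a colour sweep would change on every sample and would need a different comparison.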

Nigel: Is there any documentation of karaoke semantics elsewhere that we should be referencing?

Glenn: There's ASS and a rich application called aegisub used for fan-subbing.

Nigel: That's not specific to karaoke.
… It's the place to go for burning text!

Pierre: At some point we will need to think if we can do those semantics in TTML

Cyril: The question is how to define timing in the first version that is extensible later with styling.

Pierre: Excellent question. Is the goal to expand karaoke mode with more detailed color handling, say?

Cyril: Could be something like that where each extra detail can be ignored by older processors.
… The marker method doesn't force creation of spans.

Pierre: My initial reaction is that this draft is consistent with the timing model and syntax.
… Sounds good to me, not to invent something new.

Glenn: I'm concerned not to proliferate and add a lot of new style properties.

Pierre: Why is that? Here, what I like is that Cyril's example is basically TTML and it looks like it.

Glenn: This [alternative] does too.

Pierre: No it has a karaoke element that I've never seen before. The difference is that Cyril's way is still TTML.
… This is just my initial take.

Cyril: We need to see the proposal, study the merits, see how to merge them, choose levels etc.
… I'd like to discuss timeline wrt IMSC 1.2. I think that when we start doing HR on Karaoke a lot of people will have
… something to say.

Pierre: Is there an initial implementation?

Cyril: No, I have a generation tool but not a rendering tool.

Pierre: What Glenn is proposing sounds substantially the same as Cyril's proposal except with a new element.

Glenn: In my alternative there is just one new element and no new attributes.
… It is an independent level that can be ignored.

Pierre: In capability are the two substantially similar?

Glenn: Cyril's draft doesn't allow multiple effects at the same time.
… There's no parameterisation and if you wanted to do it then you'd need to do something like what I did.

Pierre: One strategy is to send the existing draft to review and see what the reactions are.
… People might look at it and suggest tweaks or say it's entirely terrible.

Glenn: You'd have to take the animate out and replace it with a set so it doesn't break the semantics.

Nigel: Can't you set calcMode to discrete?

Glenn: They become the same thing.
… Another thing is I allow for nested karaoke.

Cyril: I don't want to discuss merits but timeline and ways forward.

Pierre: If the two options were not that far apart then we could go to FPWD with this and see where we are.

Glenn: Asking for this before we have done our homework is too soon. Cyril and I aren't even on the same page yet.

Pierre: The challenge is to set a roadmap for evaluating and publishing FPWD

Nigel: Why did you want to mention IMSC 1.2 before, Cyril?

Cyril: I want this to be in IMSC 1.2

Nigel: It's too late

Pierre: There's no chance.

Cyril: I'm not pushing for wide review today. Maybe in a month if we've got something acceptable to the group.

Nigel: Tying to 1.2 doesn't help, there could just as easily be a 1.3 later that we add this to.

Cyril: We will have people asking for flames and other effects.

Glenn: I'm nervous about going out with a proposal that doesn't have any rendering semantics defined in it whatsoever.

Pierre: You could test transition timings as Cyril suggested above. We've had vague semantics before.

Glenn: I agree, you could do that, is that enough to answer people's desire?
… I know Andreas's comment is about the amount of normative requirements.

Cyril: Can we agree to have a level zero for timing and semantics only and a level one that is more fleshed out in terms of styles and properties?

Glenn: That's possible, and would satisfy your first level requirement.

Nigel: If you don't specify level 1 at the beginning then don't expect anyone ever to implement it, because the level 0
… processors might be all that anyone ever supports.

Pierre: I think it's useful to have something rather than nothing. Having a level 0 is a big win all by itself.

Cyril: The first question is level 0 and level 1. Nigel's point is heard that the incentive to go to level 1 for implementers might be limited.
… The next point is if we focus on level 0 what is the best way to specify timing information?
… a. aegisub marker style showing key times
… b. force creation of spans and then put elements like set/animation/karaoke in it.

Glenn: You can think of the karaoke element being like a marker.

Cyril: My question is do we want to force creation of <span>s or not?

Glenn: I see, do you want a p with elements interspersed in it to signify state changes?

Cyril: Yes

Glenn: I'm uncomfortable with that, it's like going back to the old escape sequences and gets away from structured content.

Cyril: You could say the same for <br/>

Glenn: It is considered a control code.

Cyril: There's precedent then.
… But there's no Unicode character for turning a karaoke effect on.
… If we did this we wouldn't have structured content anymore.

Cyril: My draft does add spans.

Glenn: I don't mind a new thing like set or animate

Pierre: You've not demonstrated clearly why a new element is needed.

Glenn: The problem is indicating karaoke semantic by default.
… One way is, let's say we define a karaoke color and emphasis property, defined at first as auto or none, then you could
… use a set to turn them on at a particular time, and then the implementation does its thing. How does that sound?

Cyril: I thought about it. The problem was you end up having to repeat the same properties multiple times. Maybe we
… can make it more efficient by using style referencing.

Glenn: I'm hearing no new element?

Nigel: I'm hearing, If you need a new element, then explain why

Glenn: I can come up with an alternative solution.

Cyril: Can we come up with a schedule for FPWD?

Nigel: I don't mind either way

Glenn: How about we target 1 Nov?

Cyril: Sounds reasonable to me

CJK extensions

Glenn: I have a repository, I will populate it with a draft spec whose content will be the old TTML2 work that we took out.
… I'm interested to know if we have enough implementation intent.

Cyril: It's not very high in my priority list.
… Initially I didn't intend a module for CJK, but just to clarify the edge cases in TTML2.
… We left a lot of cases undefined, like line breaks in ruby, so I'm not sure we need a module.
… If we want new features, yes, but not for clarifications.
… In both cases we need coordination with CSS and i18n.

Glenn: In both cases I don't see the need for @@@@

Cyril: Ruby has been used a lot, especially in Japanese, I don't know about other languages, and bopomofo.

Cyril: Or pinyin.

Glenn: Any kind of annotation.

Cyril: I don't know about roman character annotation

Glenn: We have some open issues about using non-CJK ruby text for example right now we've defined it so you separate
… by adding space in and then do alignment.
… Question is are there open issues for TTML2 2nd Ed right now and my answer is no. Let's push that out.

Cyril: We can let people know that there's a small window of opportunity to review edge cases, but they have to do it
… very soon.

Glenn: Nobody has done that so far. I can put that document out and see where it goes.

Cyril: I'm fine with that so it doesn't get lost.

Glenn: It would give us something to talk about with the CSS folk.

Nigel: So no particular time driver for this work.

Publication planning

Cyril: We haven't discussed TTML3.

Nigel: Nor 360 and AD profile.

Cyril: What is your opinion about TTML3?

Pierre: It's not on the roadmap, we haven't talked about it.

Nigel: I don't see the need for it because we can add features in extension modules. I don't see any need for
… a change that drives a major version update.

Glenn: I agree, if we do modularisation then there's no strong need for it right now.
… There are a couple of things to change in TTML2 to effectively use the extension modules.
… Right now if you add a new content element in an extension module, say a new element to Animation.class,
… the language in TTML2 doesn't give you that opportunity.
… In other cases we added "or extension modules" but in the case of the vocabulary groups we still have them tied
… concretely to TTML2. We need to add some extensibility hooks - see ttml2#1160.

Cyril: Coming back to TTML3...

Glenn: The reason was to bring in modules, but now that's coming into TTML2 we can put it on the back burner, unless
… there's some other substantive change we find we need.

Pierre: If we decide to go down that path we should retire the ED, because TTML3 returns a result in Google.
… Make a boiler plate to remove confusion, by saying "this is not being worked on"

Cyril: I agree.

Pierre: Just so no one starts implementing it.

Nigel: +1 to banner on the top, we should not relinquish TTML3 and let someone else step in!

Pierre: I want to talk about TTML1 3rd Ed Errata because of XML 1.1 obsoletion.

Cyril: I want to talk about two requirements, one for inline image integration and the other for responsive timed text.

TTML1 3rd Ed Errata

Pierre: There's nothing to do until they obsolete XML 1.1

Nigel: No that's the wrong way around. We want to say "don't obsolete XML 1.1 until we've changed the refs".

<glenn> New issue on ttml3 repo to indicate status as "on hold": https://github.com/w3c/ttml3/issues/34.

Nigel: The sooner we publish the errata the sooner they can go ahead.

<glenn> Github: https://github.com/w3c/ttml3/issues/34

Pierre: Create an issue on TTML1 3rd ed

Atsushi: I will learn how to publish the errata, and will go ahead and find out.

Pierre: If someone creates an issue and assigns it to me I will prepare the text of the errata

TTML1 issue to unreference XML 1.1

Responsive timed text

Cyril: There has been no progress, so nothing to do here for now.

Nigel: I agree

Cyril: I raised it and haven't had time to work on it.

Nigel: I said I'd support, but same here.

Inline graphics

Cyril: What's the status here?

Nigel: I think private use area in Unicode might work but nobody will thank us for it, and it would have huge workflow issues.

Cyril: Glyph substitution would work much better though.

Nigel: So have a specific font that specifies a glyph substitution for a word, and if you don't want to do that, use a different font?

Cyril: Exactly.

Pierre: I'm interested in high fidelity generally, but not for IMSC 1.2.

Cyril: We should create specific examples using glyph substitution.

Pierre: I'm also thinking about the high fidelity branding requirements such as the BBC's red line by the text.

Nigel: With the animation

Pierre: Right
… My secret plan is a mid-summer workshop to figure out how we're going to do this.

Nigel: You have other interested people?

Pierre: Yes, other people have the same use case. It's also tied up with picture in picture, sign language etc.
… We need to figure out what we want to do.
… My personal plan is to come up with requirements by mid-next-year.

Nigel: I think I submitted our BBC requirements already

CSS tunnelling requirements

Nigel: I did this CSS tunnelling requirement because BBC folk showed that you could do BBC branding with CSS already but it's hard in TTML.

Pierre: What about SVG?

Nigel: I'm sure you can get even more precision with SVG
… Maybe use CSS content insertion where the content is SVG, if that's allowed?!

Publication planning

Our current publications

Specification publication timeline

Nigel: This timeline was really useful for us last year. Atsushi, could you try to do something similar for our current
… in progress specifications?

Atsushi: Yes

Nigel: You'll have lots of detail questions for us, which is fine, please go ahead and ask!

Nigel: To summarise today's resolutions:
… IMSC 1.2 - agreed to publish FPWD
… TTML Live Extension Module - agreed to publish FPWD
… TTML2 2nd Ed - agreed to publish CR
… TTML Karaoke Extension Module - agreed to try to get to an ED suitable for FPWD by 1st November
… TTML CJK Extension Module - agreed to put a draft together, no particular timeline for publication
… Tomorrow we will look at WebVTT, 360° and AD Profile.
… TTML3 - we agreed to put this on hold for the time being

Atsushi: I wonder if there are missing TTML modules?

Glenn: if so, nobody has proposed them yet
… TTML Live WebSocket Carriage Mechanism - agreed to publish FPWD

Pierre: Don't forget TTML1 3rd Edition errata.

Nigel: Of course, we agreed to create and publish them.

Nigel: And with that, today's agenda is complete so we'll adjourn until tomorrow. Thanks everyone!

– DRAFT –
Timed Text Working Group Teleconference

18 September 2019
