<scribe> Meeting: Improving Spoken Presentation of Web Content
MH: I will give you background on what we are doing in the Pronunciation TF under APA.
chair+ Irfan
MH: The idea behind the
Pronunciation TF, under APA: the facilitator is Irfan.
... Pearson, DAISY, supported by Microsoft, College Board
participating.
<Irfan> https://w3c.github.io/pronunciation/user-scenarios/
MH: The need comes from the education community; Pearson and College Board do educational assessments. Active since Oct 2018.
<Irfan> https://w3c.github.io/pronunciation/gap-analysis/
<Irfan> https://w3c.github.io/pronunciation/use-cases/
MH: First working drafts were just
published: gap analysis, use cases and user scenarios.
... A student using AT or read aloud technology: students were
listening to content that was being misspoken.
... In an education setting even slight problems are a major problem;
if a word is not spoken exactly like the teacher says it, that is a
problem. Read aloud tools can assist language learners.
... They also help people with learning disabilities to understand
content on the web.
... Voice-based assistants (e.g. Alexa, Siri, Google Home)
etc.
... How do we enable content authors to make these systems
speak the content correctly?
... We can't do this yet in HTML. Audio books in EPUB with TTS,
reading an ebook, or producing books en masse using TTS is a use
case here.
... Accurate spoken content is critical in the publishing and
educational domains.
... We are trying to solve the problem today.
... There are hacks today: improper use of the ARIA standard with
aria-label, but that only helps SR users, not read aloud.
... Data attributes are being used in proprietary
products, but with no interoperability. Refreshable Braille: what ETS,
Pearson put into ARIA will be sent to the speech synth
but is then read on the display incorrectly, which is a real
problem.
... Looking for a standards-based solution. SSML: a growing number of
speech engines support this, e.g. Amazon Polly. CSS Speech is
dead.
AT have nothing to support.
scribe: Decision by the author: speech
synths are getting better, but in an education context the author
is best placed to suggest the spoken content. In the US there is a
consortium for spoken math content.
... People put commas in the text to add pauses, but
this causes issues on the braille display, which gets ,,,, etc. for
a long pause.
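A minimal sketch of the alternative to comma padding, assuming the pause is carried by an SSML break element so the visible (and brailled) text stays clean. The function name, the sample sentences and the 750 ms value are illustrative only, and the xmlns and version attributes a full SSML document would carry are omitted.

```python
import xml.etree.ElementTree as ET


def speak_with_pause(before: str, after: str, pause_ms: int) -> str:
    """Return an SSML fragment with an explicit pause between two phrases."""
    speak = ET.Element("speak")
    speak.text = before
    # The <break> element carries the pause, so no extra commas are
    # needed in the text that a braille display would also render.
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = after
    return ET.tostring(speak, encoding="unicode")


ssml = speak_with_pause("Question one.", " Choose the best answer.", 750)
print(ssml)
```

The text nodes contain only the original punctuation, so a braille display would not show runs of commas.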
SSML is a great standard, CSS Speech is dead, PLS (Pronunciation Lexicon Specification) is another specification, for lexicons.
scribe: PLS can be domain-specific, say in chemistry.
<Makoto_> In the context of EPUB3, we have a standard attribute for embedding SSML within HTML content documents.
<Makoto_> It has been very heavily used in Japan. For example, by the biggest textbook publisher (Tokyo Shoseki).
scribe: Gap analysis: change of
language of content, gender, phonetic pronunciation, substitution,
pitch, volume, emphasis, say-as; see the Gap Analysis document.
... For example, a zip code won't be read as separate digits.
... Pausing is an issue.
... HTML lets us mark up language and semantic emphasis.
Language is supported; emphasis is a capability in
HTML but is not widely supported.
ARIA does not solve the problem; it could be used for substitution, but this would be a hack.
PLS helps with phonetic pronunciation.
CSS Speech did rate, pitch and volume, but not much else.
SSML does support all of these potential gaps.
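A minimal sketch of how two of the gaps above (say-as and emphasis) look in SSML, built with Python's standard library. The sentence and zip code are made up, and the interpret-as value "digits" is an assumption: SSML leaves the set of values to the speech engine, so support should be checked per engine.

```python
import xml.etree.ElementTree as ET

speak = ET.Element("speak")
speak.text = "Mail the form to zip code "

# say-as asks the engine to read the zip code digit by digit
# ("digits" is an engine-dependent value, assumed here).
zip_code = ET.SubElement(speak, "say-as", {"interpret-as": "digits"})
zip_code.text = "02139"
zip_code.tail = ". This step is "

# emphasis marks the word the author wants stressed.
stressed = ET.SubElement(speak, "emphasis", {"level": "strong"})
stressed.text = "required"
stressed.tail = "."

ssml = ET.tostring(speak, encoding="unicode")
print(ssml)
```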
Makoto: A Japanese publishing company uses SSML, but it costs 4x more.
MH: That's only for phonetic
pronunciation; we could make it easier to mark up the
language.
... say-as digits/numbers, emphasis, break, verbosity: we want to
expose these to content creators.
<Makoto_> I am afraid that I have to go to a JLreq TF meeting.
Inline SSML within HTML has been a nonstarter; from talking with AT vendors, at this point they are not looking to support inline SSML.
scribe: An attribute model as in EPUB3, like data-ssml or just ssml, but these are only hacks.
<Makoto_> But let me ask whether the API between browser engines or EPUB reading systems and T2S engines.
Key points: Content must be able to encode SSML in HTML.
<Makoto_> Text only? Or DOM tree? This issue was raised in the joint meeting of the CSS and I18N.
AT and other speech producers must be able to consume the SSML from the content.
TTS must consume the SSML and render the correct speech.
<Makoto_> BTW, DAISY people in Japan are very skeptical about the use of ruby for T2S.
Apple can map most of the SSML to the native speech; it would be great to support this.
whsieh: Apple's position on this has not changed.
<whsieh> CharlesL: whsieh here from Apple (sorry!)
MH: In 2015 we were working with IMS on adding SSML to the QTI standard (an authoring profile for test questions that allowed authors to use SSML in test questions), but QTI gets translated to HTML and the SSML support is then lost.
<Roy> See the Pronunciation Overview at https://www.w3.org/WAI/pronunciation/
MH: The attribute approach, data-ssml,
has some support.
... Simple JSON name/value pairs.
... Some vendors seem to think this is a doable option.
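A rough sketch of the attribute approach, assuming a data-ssml attribute whose JSON value maps an SSML element name to its attributes. That JSON shape, the class and function names, and the sample markup are assumptions for illustration, not the TF's agreed syntax. A speech consumer could walk the HTML and wrap the marked-up text in the named SSML element:

```python
import json
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class DataSsmlConverter(HTMLParser):
    """Wrap text of elements carrying data-ssml in the SSML elements named there."""

    def __init__(self):
        super().__init__()
        self.parts = []   # pieces of the SSML output
        self.stack = []   # SSML closing tags owed per open HTML element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get an end tag
        closers = []
        raw = dict(attrs).get("data-ssml")
        if raw:
            # Assumed shape: {"<ssml element>": {"<attribute>": "<value>"}}
            for element, attributes in json.loads(raw).items():
                attr_text = "".join(
                    f" {name}={quoteattr(value)}" for name, value in attributes.items()
                )
                self.parts.append(f"<{element}{attr_text}>")
                closers.insert(0, f"</{element}>")
        self.stack.append(closers)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        self.parts.extend(self.stack.pop())

    def handle_data(self, data):
        self.parts.append(escape(data))


def html_to_ssml(html: str) -> str:
    parser = DataSsmlConverter()
    parser.feed(html)
    parser.close()
    return "<speak>" + "".join(parser.parts) + "</speak>"


example = (
    "<p>Angle <span data-ssml='{\"say-as\": {\"interpret-as\": \"characters\"}}'>"
    "CAB</span> is 30 degrees.</p>"
)
print(html_to_ssml(example))
```

The sketch assumes well-nested HTML with explicit end tags; a real consumer would work from the DOM or accessibility tree rather than re-parsing markup.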
Irfan: there is a wiki page for the example
<Irfan> https://github.com/w3c/pronunciation/wiki
MH: Examples: angle 30deg; angle CAB, where instead of AT
saying "cab", C A B should be interpreted as separate
characters.
... No speech synth can do coordinates in math. Substitution
method: pm gets expanded to picometer, for example.
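A small sketch of the substitution method, assuming a hypothetical domain lexicon that maps unit abbreviations to spoken forms and emits the SSML sub element. The lexicon contents and function name are illustrative only.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical domain lexicon (chemistry and physics units).
UNIT_ALIASES = {"pm": "picometer", "nm": "nanometer"}


def add_sub_aliases(text: str, aliases: dict) -> str:
    """Wrap each known abbreviation in <sub alias="...">, escaping other text."""
    if not aliases:
        return escape(text)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, aliases)) + r")\b")
    out = []
    pos = 0
    for match in pattern.finditer(text):
        out.append(escape(text[pos:match.start()]))
        token = match.group(1)
        out.append(f"<sub alias={quoteattr(aliases[token])}>{escape(token)}</sub>")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


marked = add_sub_aliases("The bond length is 154 pm.", UNIT_ALIASES)
print(marked)
```

A blind match like this would also expand "pm" in "3 pm", which is one reason the decision is better left to the content author than to automation.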
Judy: Nouns and verbs can be pronounced with different emphasis.
MH: We haven't seen that in
practice.
... Options: creating web components, inline SSML, multiple
attributes.
... A survey has been put out to speech consumers asking which
options are acceptable.
Omar: Our use case: we use SSML for a chatbot to service customers, not for a11y/SR; we send the voice file.
<Judy> [Judy: Wow! (Markku, comprehensive overview, thank you!)]
Omar: We would have to stop doing that from the backend to support SR. The other issue is supporting other languages.
Janina: when we get to the normative part of the spec, we will need to specify language to ensure all TTS is already loaded.
Judy: isn't that a guideline in WCAG?
Janina: with inline you must declare the languages
aaron: We could do a prescan automation.
Omar: But will that require a refresh of the page?
aaron: It should not be a problem, nor require a refresh.
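A minimal sketch of such a prescan, assuming the languages are declared with lang attributes in the markup. The class and function names and the sample page are illustrative. It collects the declared languages in document order so that matching TTS voices could be loaded before speech starts:

```python
from html.parser import HTMLParser


class LangScanner(HTMLParser):
    """Collect the distinct lang attribute values found in a document."""

    def __init__(self):
        super().__init__()
        self.languages = []

    def handle_starttag(self, tag, attrs):
        lang = dict(attrs).get("lang")
        if lang:
            lang = lang.strip().lower()  # language tags are case-insensitive
            if lang and lang not in self.languages:
                self.languages.append(lang)


def declared_languages(html: str) -> list:
    scanner = LangScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.languages


page = (
    '<html lang="en"><body><p>Hello</p>'
    '<p lang="ja">こんにちは</p><p lang="ar">مرحبا</p>'
    '<p lang="JA">またね</p></body></html>'
)
print(declared_languages(page))
```

This only reads markup that is already loaded, which is in line with the comment that no page refresh should be needed.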
MH: WCAG 2.2 might help us. Description of spoken content is a AAA requirement; we would like to see it as AA.
Arabic: Arabic terms depend on the context of the sentence to add specific diacritics.
MH: ruby text
<Irfan> https://w3c.github.io/pronunciation/use-cases/#use-case-ruby
Judy: I found the overview helpful but information dense; I think a very high-level overview would be good.
Bobby: A requirement for Japan is text
layout. The ruby model has technical issues: markup using ruby above
or on the right side.
... There are issues with the ruby model in supporting the Japanese
language: you can hear these annotations twice; it should just skip
the ruby base.
... There are potentially issues with pronunciation with ruby
annotations. Chinese traditional / simplified.
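A rough sketch of the "skip the ruby base" idea, assuming the ruby annotation (rt) carries the reading and that rt and rp have explicit end tags. Whether the base or the annotation should be voiced is a policy choice this sketch does not settle; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class RubyFlattener(HTMLParser):
    """Produce text for speech in which each ruby is voiced once, via its rt."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_ruby = False
        self.in_rt = False
        self.in_rp = False

    def handle_starttag(self, tag, attrs):
        if tag == "ruby":
            self.in_ruby = True
        elif tag == "rt":
            self.in_rt = True
        elif tag == "rp":
            self.in_rp = True

    def handle_endtag(self, tag):
        if tag == "ruby":
            self.in_ruby = self.in_rt = self.in_rp = False
        elif tag == "rt":
            self.in_rt = False
        elif tag == "rp":
            self.in_rp = False

    def handle_data(self, data):
        if self.in_rp:
            return  # fallback parentheses are never spoken
        if self.in_ruby and not self.in_rt:
            return  # skip the ruby base so the word is not heard twice
        self.parts.append(data)


def text_for_speech(html: str) -> str:
    flattener = RubyFlattener()
    flattener.feed(html)
    flattener.close()
    return "".join(flattener.parts)


sample = "<p><ruby>東京<rp>(</rp><rt>とうきょう</rt><rp>)</rp></ruby>に行く</p>"
print(text_for_speech(sample))
```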
MH: One challenge is getting all
the stakeholders in the same room. We haven't had any Chinese
companies as part of the TF. It would be great to get review
from Apple and Google, and to get Apple's involvement.
We welcome more input, more eyes looking at what we are
doing.
... I am looking at Avneesh, representing the Publishing
community.
Avneesh: Matt Garrish has already been assigned to review your specification.
Irfan: The FPWD has been published and we will add more, such as to the gap analysis, and add examples.
The use cases need more examples based on the feedback here. We have some timelines and are working towards meeting those.
MH: ETS has some testing tools that explore the different markup approaches, and they tend to work across platforms, but for Mac there is extra JS to do the mappings.
Irfan: On the survey, we got some feedback but are still waiting, so we extended the date to next week.
MH: We will send out further surveys as we get closer to some recommendations. We are reaching out to the AT vendors and consumers, the Amazons, Googles, etc.
I was working on an Alexa skill that would take content from the web and if there was SSML contained then it would be spoken correctly.
MH: Content editable is an important case.
Input text: can speech markup be done there? JS range case.
Irfan: HTML content editable, JS can manipulate this…
MH: A masters student can take
these: WYHIWYS (What You Hear Is What You See).
... Costs: how do we make this easier, cheaper and easy to
maintain?
... Thank you all for coming here today. Ruby text,
the cost of SSML and text entry input were all great topics to
bring up.
Thanks everyone. Great discussion.
Present: achraf, CharlesL, Joanmarie_Diggs, Irfan, jihye, Roy, Makoto_, Avneesh, Judy, Janina, igarashi, whsieh
Scribe: CharlesL