IRC log of voice on 2021-10-20

Timestamps are in UTC.

06:21:15 [RRSAgent]
RRSAgent has joined #voice
06:21:15 [RRSAgent]
logging to https://www.w3.org/2021/10/20-voice-irc
06:21:16 [dom]
RRSAgent, stay
06:21:19 [dom]
RRSAgent, make log public
10:26:11 [stevelee]
stevelee has joined #voice
12:55:33 [stevelee]
stevelee has joined #voice
14:54:16 [IrfanA]
IrfanA has joined #voice
14:54:27 [IrfanA]
present+
15:00:01 [mhakkinen_]
mhakkinen_ has joined #voice
15:00:53 [avneeshsingh]
avneeshsingh has joined #voice
15:02:13 [r12a]
r12a has joined #voice
15:02:29 [wolfgang]
wolfgang has joined #voice
15:02:30 [bkardell_]
bkardell_ has joined #voice
15:02:32 [jasonjgw]
jasonjgw has joined #voice
15:02:57 [cpn]
cpn has joined #voice
15:03:03 [Mizushima]
Mizushima has joined #voice
15:03:05 [kris_chapman]
kris_chapman has joined #voice
15:03:27 [bkardell_]
present+
15:03:43 [stevelee]
present+
15:03:48 [kris_chapman]
present+
15:04:03 [cpn]
present+ Chris_Needham
15:04:08 [ddahl]
present+
15:04:39 [mhakkinen_]
present+
15:04:50 [jasonjgw]
present+
15:05:16 [kaz]
present+
15:05:21 [LisaSeemanKest_]
LisaSeemanKest_ has joined #voice
15:05:21 [cwilso]
cwilso has joined #voice
15:05:28 [kaz]
scribenick: ddahl
15:05:33 [Zakim]
Zakim has joined #voice
15:05:39 [cwilso]
present+
15:05:45 [kaz]
rrsagent, make log public
15:05:50 [kaz]
rrsagent, draft minutes
15:05:50 [RRSAgent]
I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html kaz
15:06:43 [ddahl]
kaz: agenda, review existing standards from AC meeting, interoperability issues, and possible workshop
15:06:55 [ddahl]
...there are many existing standards
15:06:57 [kaz]
slide 2
15:07:46 [ddahl]
...VoiceXML, SSML, CSS speech, WebSpeechAPI and Specification for Spoken Presentation
15:08:23 [kaz]
slide 3
15:08:45 [ddahl]
...voice agents are getting popular -- accurate pronunciation, flexible speech styles, etc.
15:09:08 [ddahl]
...need for improved voice agents
15:09:34 [kaz]
slide 4
15:09:38 [kaz]
scribenick: kaz
15:09:45 [Kim_patch]
Kim_patch has joined #voice
15:10:00 [kaz]
dd: concerns by voice interaction cg
15:10:17 [kaz]
... generic agents like Siri, Google, Alexa
15:10:28 [kaz]
... other systems as well
15:10:48 [kaz]
... not based on web standards primarily
15:11:00 [kaz]
... would like to get them to be interoperable
15:11:26 [kaz]
... for example, for banking, retail, ...
15:11:57 [kaz]
... CG meeting next week at exactly the same time
15:12:24 [kaz]
slide 5
15:12:31 [kaz]
dd: a lot of parallels
15:13:11 [kaz]
... Web page vs IPA
15:13:28 [kaz]
... web pages have to deal with user interaction
15:13:35 [kaz]
... primarily using GUI
15:14:02 [kaz]
... IPAs use interaction mainly based on voice and natural language
15:14:21 [kaz]
... arbitrary differences as well
15:14:34 [kaz]
... browser vs proprietary platforms
15:14:52 [kaz]
(IPA stands for Intelligent Personal Assistants)
15:15:06 [kaz]
dd: ecosystem of skills, actions or whatever
15:15:15 [kaz]
... have to find them through the platform
15:15:31 [kaz]
slide 6
15:15:40 [kaz]
dd: this is architecture of IPA
15:15:46 [kaz]
... not going into the details
15:15:58 [kaz]
s/IPA/IPA generated by the CG/
15:16:19 [kaz]
... green box on the left is device
15:16:33 [kaz]
... the input device could be a microphone
15:16:55 [kaz]
... in the middle, the red box includes "dialogs"
15:17:13 [kaz]
... and blue box on the right includes "provider selection service"
15:17:40 [kaz]
... we have some component which performs the functions
15:17:48 [kaz]
... analogous with the browsers
15:17:55 [kaz]
slide 7
15:18:12 [ddahl]
scribenick: ddahl
15:18:23 [ddahl]
kaz: potential voice workshop
15:19:06 [ddahl]
...try to solve potential pain points
15:19:23 [ddahl]
...what is the best mechanism for discussion?
15:20:05 [ddahl]
...feedback from the first breakout
15:20:26 [bkardell_]
are the slides available so I can zoom in to see some of these things better than I could here?
15:20:53 [ddahl]
...existing standards, other related technologies, pain points, emotion, common sense database related to people's perception
15:21:15 [ddahl]
...several participants said that emotion would be very interesting
15:21:46 [ddahl]
https://github.com/w3c/strategy/issues/221
15:21:53 [bkardell_]
q+
15:22:00 [ddahl]
kaz: opinions?
15:22:00 [kaz]
topic: Discussion
15:22:18 [jasonjgw]
q+
15:22:30 [cpn]
q+
15:22:31 [avneeshsingh]
q+
15:22:36 [ddahl]
brian: zoom in on IPA architecture
15:23:00 [ddahl]
...currently it's very underspecified how this is implemented
15:23:10 [phila_]
phila_ has joined #voice
15:23:13 [kaz]
i/agenda, review/topic: Presentation/
15:23:17 [ddahl]
...in current implementations
15:23:18 [kaz]
q?
15:24:10 [kaz]
ack bk
15:24:32 [ddahl]
...can you send SSML? Yes in some cases, but sometimes it doesn't work
15:25:44 [phila_]
q+
15:26:12 [kaz]
ack jas
15:26:41 [ddahl]
https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm
15:26:53 [kaz]
dd: welcome to the CG meeting as well
15:27:50 [bkardell_]
... could we please share the link to the APA doc just referenced, for review/minutes as well?
15:27:59 [kaz]
-> https://www.w3.org/TR/2021/WD-naur-20211012/ Natural Language Interface Accessibility User Requirements
15:28:06 [IrfanA]
https://www.w3.org/TR/spoken-html/
15:28:13 [cpn]
q+ lisa_seeman
15:28:33 [ddahl]
jason: points out two accessibility publications
15:28:45 [ddahl]
...that are relevant to this discussion
15:29:20 [IrfanA]
https://github.com/w3c/pronunciation/issues
15:29:38 [jasonjgw]
https://www.w3.org/TR/naur/
15:29:44 [kaz]
q?
15:29:45 [ddahl]
irfan: please add issues to github
15:29:45 [mhakkinen_]
q+
15:29:53 [kaz]
ack cpn
15:30:29 [ddahl]
chris: as a content publisher, we've had to overcome a lot of proprietary content
15:30:57 [ddahl]
...is there interest from device manufacturers?
15:31:15 [ddahl]
kaz: would like to invite these vendors to the workshop
15:31:42 [kaz]
q?
15:31:42 [ddahl]
chrisW: would be interested
15:31:48 [kaz]
ack av
15:31:57 [MURATA_]
MURATA_ has joined #voice
15:31:58 [MURATA_]
present+
15:32:19 [cpn]
s/chrisW/Jason/
15:32:36 [ddahl]
avneesh: this is very important work, in the community group -- what would big players see as business benefits
15:32:40 [bkardell_]
q+
15:32:57 [kaz]
scribenick: kaz
15:33:05 [kaz]
dd: would be really interesting to look into
15:33:11 [kaz]
... have not done yet
15:33:32 [kaz]
... focused on our own short-term interest in gaps in interoperability so far
15:33:38 [kaz]
... but should look into it
15:33:46 [kaz]
q?
15:33:53 [kaz]
ack p
15:34:03 [LisaSeemanKest_]
q+
15:34:38 [ddahl]
philArcher: this is the 3rd TPAC in a row, have we reached a critical mass yet?
15:35:04 [ddahl]
kaz: let's hold the workshop in the next 6 months or so
15:35:16 [ddahl]
...probably will be held remotely
15:35:30 [kaz]
q?
15:35:36 [kaz]
ack lisa
15:35:45 [kaz]
ack lisa
15:36:13 [ddahl]
lisa_seeman: how does this interact with people with cognitive disabilities
15:36:30 [ddahl]
...put some ideas in Content Usable note
15:36:51 [ddahl]
...how can this specification support people with voice disabilities
15:37:05 [stevelee]
https://www.w3.org/TR/coga-usable/#voice-menus-user-story
15:37:33 [ddahl]
...this is full of potential and helps businesses with getting users who are struggling
15:37:53 [ddahl]
...that could be part of the business case
15:38:42 [kaz]
q+ tobias
15:39:12 [ddahl]
... we also requested that audio descriptions have easier and more literal descriptions; made the request to PA
15:39:17 [ddahl]
s/PA/APA
15:39:30 [kaz]
scribenick: kaz
15:39:36 [kaz]
dd: really interesting
15:39:47 [kaz]
... might be difficult to have simple audio description
15:40:07 [kaz]
... but it would be worth considering using external services
15:40:23 [kaz]
... would be possible to use EMMA message
15:40:37 [kaz]
... useful technology
15:40:55 [kaz]
... natural language technology is a difficult technology
15:41:05 [kaz]
... but getting better and better
15:41:40 [kaz]
lisa: you could dialog with your users
15:41:51 [kaz]
dd: someone may need some additional treatment
15:42:18 [kaz]
... e.g., airline reservations, need many parameters
15:42:35 [kaz]
lisa: wondering if it would be possible to put in a facility
15:42:39 [ddahl]
...more directed dialog for people who need simplified dialog
15:42:56 [kaz]
i/more/scribenick: ddahl/
15:43:05 [kaz]
s/...more/lisa: more/
15:43:05 [ddahl]
lisa: how could you make a note for yourself?
15:43:15 [kaz]
qq+ kaz
15:43:25 [kaz]
ack k
15:43:25 [Zakim]
kaz, you wanted to react to lisa_seeman
15:43:29 [ddahl]
...that would be good for people who have memory issues
15:44:14 [ddahl]
kaz: maybe that could be integrated in architecture
15:44:34 [kaz]
q?
15:44:36 [ddahl]
...that could be discussed during the workshop
15:44:40 [kaz]
ack mha
15:45:18 [mhakkinen_]
https://w3c.github.io/pronunciation/gap-analysis_and_use-case/
15:45:30 [ddahl]
mark: we've been raising the issue of getting better pronunciation for several years
15:45:36 [LisaSeemanKest_]
thank you mark
15:46:32 [ddahl]
...the education use case is that many users use computer read aloud. It might be good to bring in vendors from this community
15:46:40 [ddahl]
...for example, Texthelp
15:46:53 [ddahl]
...we would be interested in the workshop
15:47:05 [kaz]
q?
15:47:12 [kaz]
ack bk
15:47:48 [ddahl]
brian: wants to highlight that this and a lot of conversations are in terms of voice assistants like Siri
15:48:18 [ddahl]
...the use cases for TTS and STT are way broader than that
15:49:31 [ddahl]
...my company makes products for embedded devices. There are many use cases that aren't browsers or voice agents
15:49:56 [ddahl]
...we should not limit this to conversational interfaces
15:50:49 [ddahl]
...many devices can't support a full conversational interface
15:51:00 [mhakkinen_]
+q
15:51:02 [cpn]
it's not an either/or question
15:51:10 [ddahl]
...will the workshop cover these things?
15:51:27 [ddahl]
...the SSML has to make it all the way down to what's actually speaking
15:52:09 [phila_]
q+
15:52:31 [ddahl]
... not questioning the value of conversational interfaces, but would like to broaden discussion
15:52:58 [ddahl]
kaz: we should talk about what's to be included
15:52:59 [kaz]
q?
15:53:11 [kaz]
qq+ dwalka
15:53:16 [kaz]
ack dwa
15:53:16 [Zakim]
dwalka, you wanted to react to bkardell_
15:53:35 [ddahl]
dirk: in Voice Interaction group we meant to include other modalities, like chatbots
15:54:01 [kaz]
qq+ mark
15:54:03 [kaz]
ack mark
15:54:03 [Zakim]
mark, you wanted to react to dwalka
15:54:11 [dirk_]
dirk_ has joined #voice
15:54:12 [kaz]
qq+ kaz
15:54:46 [ddahl]
mark: let's consider emergency alerts; synthesized alerts also have problems with pronunciation
15:55:07 [bkardell_]
mark: do you have any link to the oasis stuff you mentioned
15:55:07 [mhakkinen_]
http://docs.oasis-open.org/emergency/cap/v1.2/CAP-v1.2-os.html
15:55:08 [ddahl]
...how can we improve this? did some earlier work
15:55:29 [kaz]
qq+ lisa
15:55:33 [kaz]
ack lisa
15:55:33 [Zakim]
lisa, you wanted to react to mark
15:55:46 [ddahl]
lisa: that would be a great use case. emergency communications have to be available to every single subgroup
15:56:01 [phila_]
q-
15:56:09 [ddahl]
...other use cases will be able to join that ecosystem at a lower cost
15:56:45 [ddahl]
kaz: we might want to talk about not just voice but have a "Smart Agent" workshop
15:57:00 [kaz]
ack kaz
15:57:00 [Zakim]
kaz, you wanted to react to mark
15:57:04 [kaz]
ack tob
15:57:14 [kirkwood]
+1 to ‘smart agent’, it's clearer I think
15:57:31 [bkardell_]
can we get a link to the iso standards mentioned?
15:57:51 [ddahl]
tobias: working on DIN and OASIS standards. voice is very powerful, but the fastest way forward is to agree on minimal requirements
15:57:56 [ddahl]
...and implement them
15:58:01 [phila_]
I'm OK with Smart Agent workshop. My focus, unsurprisingly, is eCommerce and what's necessary for brand owners to help Smart Agents disambiguate products and retailers.
15:58:05 [kirkwood]
‘smart voice agent’
15:58:29 [kaz]
q?
15:58:35 [kaz]
ack mh
15:59:30 [ddahl]
kaz: would like to update the proposal with this feedback. Would like everyone to join the Program Committee
15:59:49 [ddahl]
...please contact me
15:59:50 [kaz]
-> https://www.w3.org/2021/Talks/1018-voice-dd-ka/20211018-voice-breakout-dd-ka.pdf slides
16:00:12 [kaz]
-> https://github.com/w3c/strategy/issues/221 github issue
16:00:18 [kaz]
<ashimura@w3.org>
16:00:40 [kaz]
[adjourned]
16:00:40 [ddahl]
rrsagent, format minutes
16:00:40 [RRSAgent]
I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html ddahl
16:00:59 [ddahl]
rrsagent, make logs public
16:01:09 [kaz]
OVON Open Voice Network
16:01:09 [kaz]
Open Oasis RECITE Initiative
16:01:09 [kaz]
Amazon Voice Interoperability Initiative
16:01:54 [kaz]
rrsagent, draft minutes
16:01:54 [RRSAgent]
I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html kaz
16:02:52 [r12a]
r12a has left #voice
16:29:11 [dom]
RRSAgent, bye
16:29:11 [RRSAgent]
I see no action items