IRC log of voice on 2021-10-20
Timestamps are in UTC.
- 06:21:15 [RRSAgent]
- RRSAgent has joined #voice
- 06:21:15 [RRSAgent]
- logging to https://www.w3.org/2021/10/20-voice-irc
- 06:21:16 [dom]
- RRSAgent, stay
- 06:21:19 [dom]
- RRSAgent, make log public
- 10:26:11 [stevelee]
- stevelee has joined #voice
- 12:55:33 [stevelee]
- stevelee has joined #voice
- 14:54:16 [IrfanA]
- IrfanA has joined #voice
- 14:54:27 [IrfanA]
- present+
- 15:00:01 [mhakkinen_]
- mhakkinen_ has joined #voice
- 15:00:53 [avneeshsingh]
- avneeshsingh has joined #voice
- 15:02:13 [r12a]
- r12a has joined #voice
- 15:02:29 [wolfgang]
- wolfgang has joined #voice
- 15:02:30 [bkardell_]
- bkardell_ has joined #voice
- 15:02:32 [jasonjgw]
- jasonjgw has joined #voice
- 15:02:57 [cpn]
- cpn has joined #voice
- 15:03:03 [Mizushima]
- Mizushima has joined #voice
- 15:03:05 [kris_chapman]
- kris_chapman has joined #voice
- 15:03:27 [bkardell_]
- present+
- 15:03:43 [stevelee]
- present+
- 15:03:48 [kris_chapman]
- present+
- 15:04:03 [cpn]
- present+ Chris_Needham
- 15:04:08 [ddahl]
- present+
- 15:04:39 [mhakkinen_]
- present+
- 15:04:50 [jasonjgw]
- present+
- 15:05:16 [kaz]
- present+
- 15:05:21 [LisaSeemanKest_]
- LisaSeemanKest_ has joined #voice
- 15:05:21 [cwilso]
- cwilso has joined #voice
- 15:05:28 [kaz]
- scribenick: ddahl
- 15:05:33 [Zakim]
- Zakim has joined #voice
- 15:05:39 [cwilso]
- present+
- 15:05:45 [kaz]
- rrsagent, make log public
- 15:05:50 [kaz]
- rrsagent, draft minutes
- 15:05:50 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html kaz
- 15:06:43 [ddahl]
- kaz: agenda, review existing standards from AC meeting, interoperability issues, and possible workshop
- 15:06:55 [ddahl]
- ...there are many existing standards
- 15:06:57 [kaz]
- slide 2
- 15:07:46 [ddahl]
- ...VoiceXML, SSML, CSS speech, WebSpeechAPI and Specification for Spoken Presentation
- 15:08:23 [kaz]
- slide 3
- 15:08:45 [ddahl]
- ...voice agents are getting popular -- accurate pronunciation, flexible speech styles, etc.
- 15:09:08 [ddahl]
- ...need for improved voice agents
- 15:09:34 [kaz]
- slide 4
- 15:09:38 [kaz]
- scribenick: kaz
- 15:09:45 [Kim_patch]
- Kim_patch has joined #voice
- 15:10:00 [kaz]
- dd: concerns by voice interaction cg
- 15:10:17 [kaz]
- ... generic agents like Siri, Google, Alexa
- 15:10:28 [kaz]
- ... other systems as well
- 15:10:48 [kaz]
- ... not based on web standards primarily
- 15:11:00 [kaz]
- ... would like to get them to interoperate
- 15:11:26 [kaz]
- ... for example, for banking, retail, ...
- 15:11:57 [kaz]
- ... CG meeting next week exactly same time
- 15:12:24 [kaz]
- slide 5
- 15:12:31 [kaz]
- dd: a lot of parallels
- 15:13:11 [kaz]
- ... Web page vs IPA
- 15:13:28 [kaz]
- ... web pages have to deal with user interaction
- 15:13:35 [kaz]
- ... primarily using GUI
- 15:14:02 [kaz]
- ... IPAs use interaction mainly based on voice and natural language
- 15:14:21 [kaz]
- ... arbitrary differences as well
- 15:14:34 [kaz]
- ... browser vs proprietary platforms
- 15:14:52 [kaz]
- (IPA stands for Intelligent Personal Assistants)
- 15:15:06 [kaz]
- dd: ecosystem of skills, actions or whatever
- 15:15:15 [kaz]
- ... have to find them through the platform
- 15:15:31 [kaz]
- slide 6
- 15:15:40 [kaz]
- dd: this is architecture of IPA
- 15:15:46 [kaz]
- ... not going into the details
- 15:15:58 [kaz]
- s/IPA/IPA generated by the CG/
- 15:16:19 [kaz]
- ... green box on the left is device
- 15:16:33 [kaz]
- ... the input device could be a microphone
- 15:16:55 [kaz]
- ... in the middle red box includes "dialogs"
- 15:17:13 [kaz]
- ... and blue box on the right includes "provider selection service"
- 15:17:40 [kaz]
- ... we have some component which performs the functions
- 15:17:48 [kaz]
- ... analogous with the browsers
- 15:17:55 [kaz]
- slide 7
- 15:18:12 [ddahl]
- scribenick: ddahl
- 15:18:23 [ddahl]
- kaz: potential voice workshop
- 15:19:06 [ddahl]
- ...try to solve potential pain points
- 15:19:23 [ddahl]
- ...what is the best mechanism for discussion?
- 15:20:05 [ddahl]
- ...feedback from the first breakout
- 15:20:26 [bkardell_]
- are the slides available so I can zoom in to see some of these things better than I could here?
- 15:20:53 [ddahl]
- existing standards, other related technologies, pain points, emotion, common sense database related to people's perception
- 15:21:15 [ddahl]
- ...several participants said that emotion would be very interesting
- 15:21:46 [ddahl]
- https://github.com/w3c/strategy/issues/221
- 15:21:53 [bkardell_]
- q+
- 15:22:00 [ddahl]
- kaz: opinions?
- 15:22:00 [kaz]
- topic: Discussion
- 15:22:18 [jasonjgw]
- q+
- 15:22:30 [cpn]
- q+
- 15:22:31 [avneeshsingh]
- q+
- 15:22:36 [ddahl]
- brian: zoom in on IPA architecture
- 15:23:00 [ddahl]
- ...currently it's very underspecified how this is implemented
- 15:23:10 [phila_]
- phila_ has joined #voice
- 15:23:13 [kaz]
- i/agenda, review/topic: Presentation/
- 15:23:17 [ddahl]
- ...in current implementations
- 15:23:18 [kaz]
- q?
- 15:24:10 [kaz]
- ack bk
- 15:24:32 [ddahl]
- ...can you send SSML? Yes in some cases, but sometimes it doesn't work
- 15:25:44 [phila_]
- q+
- 15:26:12 [kaz]
- ack jas
- 15:26:41 [ddahl]
- https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm
- 15:26:53 [kaz]
- dd: welcome to the CG meeting as well
- 15:27:50 [bkardell_]
- ... if we could please share a link to the APA doc just referenced for review/minutes as well?
- 15:27:59 [kaz]
- -> https://www.w3.org/TR/2021/WD-naur-20211012/ Natural Language Interface Accessibility User Requirements
- 15:28:06 [IrfanA]
- https://www.w3.org/TR/spoken-html/
- 15:28:13 [cpn]
- q+ lisa_seeman
- 15:28:33 [ddahl]
- jason: points out two accessibility publications
- 15:28:45 [ddahl]
- ...that are relevant to this discussion
- 15:29:20 [IrfanA]
- https://github.com/w3c/pronunciation/issues
- 15:29:38 [jasonjgw]
- https://www.w3.org/TR/naur/
- 15:29:44 [kaz]
- q?
- 15:29:45 [ddahl]
- irfan: please add issues to github
- 15:29:45 [mhakkinen_]
- q+
- 15:29:53 [kaz]
- ack cpn
- 15:30:29 [ddahl]
- chris: as a content publisher, we've had to overcome a lot of proprietary content
- 15:30:57 [ddahl]
- ...is there interest from device manufacturers?
- 15:31:15 [ddahl]
- kaz: would like to invite these vendors to the workshop
- 15:31:42 [kaz]
- q?
- 15:31:42 [ddahl]
- chrisW: would be interested
- 15:31:48 [kaz]
- ack av
- 15:31:57 [MURATA_]
- MURATA_ has joined #voice
- 15:31:58 [MURATA_]
- present+
- 15:32:19 [cpn]
- s/chrisW/Jason/
- 15:32:36 [ddahl]
- avneesh: this is very important work, in the community group -- what would big players see as business benefits
- 15:32:40 [bkardell_]
- q+
- 15:32:57 [kaz]
- scribenick: kaz
- 15:33:05 [kaz]
- dd: would be really interesting to look into
- 15:33:11 [kaz]
- ... have not done yet
- 15:33:32 [kaz]
- ... focus on our own short-term interest in gaps on interoperability so far
- 15:33:38 [kaz]
- ... but should look into it
- 15:33:46 [kaz]
- q?
- 15:33:53 [kaz]
- ack p
- 15:34:03 [LisaSeemanKest_]
- q+
- 15:34:38 [ddahl]
- philArcher: this is the 3rd TPAC in a row, have we reached a critical mass yet?
- 15:35:04 [ddahl]
- kaz: let's hold the workshop in the next 6 months or so
- 15:35:16 [ddahl]
- ...probably will be held remotely
- 15:35:30 [kaz]
- q?
- 15:35:36 [kaz]
- ack lisa
- 15:35:45 [kaz]
- ack lisa
- 15:36:13 [ddahl]
- lisa_seeman: how does this interact with people with cognitive disabilities
- 15:36:30 [ddahl]
- ...put some ideas in Content Usable note
- 15:36:51 [ddahl]
- ...how can this specification support people with voice disabilities
- 15:37:05 [stevelee]
- https://www.w3.org/TR/coga-usable/#voice-menus-user-story
- 15:37:33 [ddahl]
- ...this is full of potential and helps businesses with getting users who are struggling
- 15:37:53 [ddahl]
- ...that could be part of the business case
- 15:38:42 [kaz]
- q+ tobias
- 15:39:12 [ddahl]
- ... we also requested that audio descriptions have easier and more literal descriptions made request to PA
- 15:39:17 [ddahl]
- s/PA/APA
- 15:39:30 [kaz]
- scribenick: kaz
- 15:39:36 [kaz]
- dd: really interesting
- 15:39:47 [kaz]
- ... might be difficult to have simple audio description
- 15:40:07 [kaz]
- ... but it would be worth considering the use of external services
- 15:40:23 [kaz]
- ... would be possible to use EMMA message
- 15:40:37 [kaz]
- ... useful technology
- 15:40:55 [kaz]
- ... natural language technology is a difficult technology
- 15:41:05 [kaz]
- ... but getting better and better
- 15:41:40 [kaz]
- lisa: you could dialog with your users
- 15:41:51 [kaz]
- dd: someone may need some additional treatment
- 15:42:18 [kaz]
- ... e.g., airline reservations, need many parameters
- 15:42:35 [kaz]
- lisa: wondering if it would be possible to put in a facility
- 15:42:39 [ddahl]
- ...more directed dialog for people who need simplified dialog
- 15:42:56 [kaz]
- i/more/scribenick: ddahl/
- 15:43:05 [kaz]
- s/...more/lisa: more/
- 15:43:05 [ddahl]
- lisa: how could you make a note for yourself?
- 15:43:15 [kaz]
- qq+ kaz
- 15:43:25 [kaz]
- ack k
- 15:43:25 [Zakim]
- kaz, you wanted to react to lisa_seeman
- 15:43:29 [ddahl]
- ...that would be good for people who have memory issues
- 15:44:14 [ddahl]
- kaz: maybe that could be integrated in architecture
- 15:44:34 [kaz]
- q?
- 15:44:36 [ddahl]
- ...that could be discussed during the workshop
- 15:44:40 [kaz]
- ack mha
- 15:45:18 [mhakkinen_]
- https://w3c.github.io/pronunciation/gap-analysis_and_use-case/
- 15:45:30 [ddahl]
- mark: we've been raising the issue of getting better pronunciation for several years
- 15:45:36 [LisaSeemanKest_]
- thank you mark
- 15:46:32 [ddahl]
- ...the education use case is that many users use computer read aloud. It might be good to bring in vendors from this community
- 15:46:40 [ddahl]
- ...for example, Text Help
- 15:46:53 [ddahl]
- ...we would be interested in the workshop
- 15:47:05 [kaz]
- q?
- 15:47:12 [kaz]
- ack bk
- 15:47:48 [ddahl]
- brian: wants to highlight that this and a lot of conversations are in terms of voice assistants like Siri
- 15:48:18 [ddahl]
- ...the use cases for TTS and STT are way broader than that
- 15:49:31 [ddahl]
- ...my company makes products for embedded devices. There are many use cases that aren't browsers or voice agents
- 15:49:56 [ddahl]
- ...we should not limit this to conversational interfaces
- 15:50:49 [ddahl]
- ...many devices can't support a full conversational interface
- 15:51:00 [mhakkinen_]
- +q
- 15:51:02 [cpn]
- it's not an either/or question
- 15:51:10 [ddahl]
- ...will the workshop cover these things?
- 15:51:27 [ddahl]
- ...the SSML has to make it all the way down to what's actually speaking
- 15:52:09 [phila_]
- q+
- 15:52:31 [ddahl]
- ... not questioning the value of conversational interfaces, but would like to broaden discussion
- 15:52:58 [ddahl]
- kaz: we should talk about what's to be included
- 15:52:59 [kaz]
- q?
- 15:53:11 [kaz]
- qq+ dwalka
- 15:53:16 [kaz]
- ack dwa
- 15:53:16 [Zakim]
- dwalka, you wanted to react to bkardell_
- 15:53:35 [ddahl]
- dirk: in Voice Interaction group we meant to include other modalities, like chatbots
- 15:54:01 [kaz]
- qq+ mark
- 15:54:03 [kaz]
- ack mark
- 15:54:03 [Zakim]
- mark, you wanted to react to dwalka
- 15:54:11 [dirk_]
- dirk_ has joined #voice
- 15:54:12 [kaz]
- qq+ kaz
- 15:54:46 [ddahl]
- mark: let's consider emergency alerts; synthesized alerts also have problems with pronunciation
- 15:55:07 [bkardell_]
- mark: do you have any link to the oasis stuff you mentioned
- 15:55:07 [mhakkinen_]
- http://docs.oasis-open.org/emergency/cap/v1.2/CAP-v1.2-os.html
- 15:55:08 [ddahl]
- ...how can we improve this? did some earlier work
- 15:55:29 [kaz]
- qq+ lisa
- 15:55:33 [kaz]
- ack lisa
- 15:55:33 [Zakim]
- lisa, you wanted to react to mark
- 15:55:46 [ddahl]
- lisa: that would be a great use case. emergency communications have to be available to every single subgroup
- 15:56:01 [phila_]
- q-
- 15:56:09 [ddahl]
- ...other use cases will be able to join that ecosystem at a lower cost
- 15:56:45 [ddahl]
- kaz: we might want to talk about not just voice but have a "Smart Agent" workshop
- 15:57:00 [kaz]
- ack kaz
- 15:57:00 [Zakim]
- kaz, you wanted to react to mark
- 15:57:04 [kaz]
- ack tob
- 15:57:14 [kirkwood]
- +1 to ‘smart agent’, it's clearer I think
- 15:57:31 [bkardell_]
- can we get a link to the iso standards mentioned?
- 15:57:51 [ddahl]
- tobias: working on DIN and OASIS standards. voice is very powerful, but the fastest way forward is to agree on minimal requirements
- 15:57:56 [ddahl]
- ...and implement them
- 15:58:01 [phila_]
- I'm OK with Smart Agent workshop. My focus, unsurprisingly, is eCommerce and what's necessary for brand owners to help Smart Agents disambiguate products and retailers.
- 15:58:05 [kirkwood]
- ‘smart voice agent’
- 15:58:29 [kaz]
- q?
- 15:58:35 [kaz]
- ack mh
- 15:59:30 [ddahl]
- kaz: would like to update the proposal with this feedback. Would like everyone to join the Program Committee
- 15:59:49 [ddahl]
- ...please contact me
- 15:59:50 [kaz]
- -> https://www.w3.org/2021/Talks/1018-voice-dd-ka/20211018-voice-breakout-dd-ka.pdf slides
- 16:00:12 [kaz]
- -> https://github.com/w3c/strategy/issues/221 github issue
- 16:00:18 [kaz]
- <ashimura@w3.org>
- 16:00:40 [kaz]
- [adjourned]
- 16:00:40 [ddahl]
- rrsagent, format minutes
- 16:00:40 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html ddahl
- 16:00:59 [ddahl]
- rrsagent, make logs public
- 16:01:09 [kaz]
- OVON Open Voice Network
- 16:01:09 [kaz]
- Open Oasis RECITE Initiative
- 16:01:09 [kaz]
- Amazon Voice Interoperability Initiative
- 16:01:54 [kaz]
- rrsagent, draft minutes
- 16:01:54 [RRSAgent]
- I have made the request to generate https://www.w3.org/2021/10/20-voice-minutes.html kaz
- 16:02:52 [r12a]
- r12a has left #voice
- 16:29:11 [dom]
- RRSAgent, bye
- 16:29:11 [RRSAgent]
- I see no action items