CG meeting next week at exactly the same time
slide 5
dd: a lot of parallels
... Web page vs IPA
... web pages have to deal with user interaction
... primarily using GUI
... IPAs use interaction mainly based on voice and natural language
... arbitrary differences as well
... browser vs proprietary platforms
(IPA stands for Intelligent Personal Assistants)
dd: ecosystem of skills, actions or whatever
... have to find them through the platform
slide 6
dd: this is architecture of IPA
... not going into the details
s/IPA/IPA generated by the CG/
... green box on the left is device
... the input device could be a microphone
... in the middle, the red box includes "dialogs"
... and blue box on the right includes "provider selection service"
... we have some component which performs the functions
... analogous with the browsers
slide 7
scribenick: ddahl
kaz: potential voice workshop
...try to solve potential pain points
...what is the best mechanism for discussion?
...feedback from the first breakout
are the slides available so I can zoom in to see some of these things better than I could here?
existing standards, other related technologies, pain points, emotion, common sense database related to people's perception
...several participants said that emotion would be very interesting
https://github.com/w3c/strategy/issues/221
q+
kaz: opinions?
topic: Discussion
q+
q+
q+
brian: zoom in on IPA architecture
...currently it's very underspecified how this is implemented
phila_ has joined #voice
i/agenda, review/topic: Presentation/
...in current implementations
q?
ack bk
...can you send SSML? Yes in some cases, but sometimes it doesn't work
q+
ack jas
https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/paArchitecture-1-2.htm
dd: welcome to the CG meeting as well
... could we please share the link to the APA doc just referenced, for review/minutes as well?
-> https://www.w3.org/TR/2021/WD-naur-20211012/ Natural Language Interface Accessibility User Requirements
https://www.w3.org/TR/spoken-html/
q+ lisa_seeman
jason: points out two accessibility publications
...that are relevant to this discussion
https://github.com/w3c/pronunciation/issues
https://www.w3.org/TR/naur/
q?
irfan: please add issues to github
q+
ack cpn
chris: as a content publisher, we've had to overcome a lot of proprietary content
...is there interest from device manufacturers?
kaz: would like to invite these vendors to the workshop
q?
chrisW: would be interested
ack av
MURATA_ has joined #voice
present+
s/chrisW/Jason/
avneesh: this is very important work, in the community group -- what would big players see as business benefits
q+
scribenick: kaz
dd: would be really interesting to look into
... have not done yet
... focus on our own short-term interest in gaps on interoperability so far
... but should look into it
q?
ack p
q+
philArcher: this is the 3rd TPAC in a row, have we reached a critical mass yet?
kaz: let's hold the workshop in the next 6 months or so
...probably will be held remotely
q?
ack lisa
lisa_seeman: how does this interact with people with cognitive disabilities
...put some ideas in Content Usable note
...how can this specification support people with voice disabilities
https://www.w3.org/TR/coga-usable/#voice-menus-user-story
...this is full of potential and helps businesses with getting users who are struggling
...that could be part of the business case
q+ tobias
... we also requested that audio descriptions have easier and more literal descriptions; made that request to PA
s/PA/APA
scribenick: kaz
dd: really interesting
... might be difficult to have simple audio description
... but it would be worth considering the use of external services
... would be possible to use EMMA messages
... useful technology
... natural language technology is a difficult technology
... but getting better and better
lisa: you could dialog with your users
dd: someone may need some additional treatment
... e.g., airline reservations, need many parameters
lisa: wondering if it would be possible to put in a facility
...more directed dialog for people who need simplified dialog
i/more/scribenick: ddahl/
s/...more/lisa: more/
lisa: how could you make a note for yourself?
qq+ kaz
ack k
kaz, you wanted to react to lisa_seeman
...that would be good for people who have memory issues
kaz: maybe that could be integrated in architecture
q?
...that could be discussed during the workshop
ack mha
https://w3c.github.io/pronunciation/gap-analysis_and_use-case/
mark: we've been raising the issue of getting better pronunciation for several years
thank you mark
...the education use case is that many users use computer read-aloud. It might be good to bring in vendors from this community
...for example, Text Help
...we would be interested in the workshop
q?
ack bk
brian: wants to highlight that this and a lot of conversations are in terms of voice assistants like Siri
...the use cases for TTS and STT are way broader than that
...my company makes products for embedded devices. There are many use cases that aren't browsers or voice agents
...we should not limit this to conversational interfaces
...many devices can't support a full conversational interface
+q
it's not an either/or question
...will the workshop cover these things?
...the SSML has to make it all the way down to what's actually speaking
q+
... not questioning the value of conversational interfaces, but would like to broaden discussion
kaz: we should talk about what's to be included
q?
qq+ dwalka
ack dwa
dwalka, you wanted to react to bkardell_
dirk: in the Voice Interaction group we meant to include other modalities, like chatbots
qq+ mark
ack mark
mark, you wanted to react to dwalka
dirk_ has joined #voice
qq+ kaz
mark: let's consider emergency alerts; synthesized alerts also have problems with pronunciation
mark: do you have any link to the OASIS stuff you mentioned?
http://docs.oasis-open.org/emergency/cap/v1.2/CAP-v1.2-os.html
...how can we improve this? did some earlier work
qq+ lisa
ack lisa
lisa, you wanted to react to mark
lisa: that would be a great use case. emergency communications have to be available to every single subgroup
q-
...other use cases will be able to join that ecosystem at a lower cost
kaz: we might want to talk about not just voice but have a "Smart Agent" workshop
ack kaz
kaz, you wanted to react to mark
ack tob
+1 to 'smart agent', it's clearer I think
can we get a link to the ISO standards mentioned?
tobias: working on DIN and OASIS standards. voice is very powerful, but the fastest way forward is to agree on minimal requirements
...and implement them
I'm OK with Smart Agent workshop. My focus, unsurprisingly, is eCommerce and what's necessary for brand owners to help Smart Agents disambiguate products and retailers.
'smart voice agent'
q?
ack mh
kaz: would like to update the proposal with this feedback. 