There is a perception that the Voice Browser Working Group (VBWG) only cares about call centers, and that its specifications are unrelated to and disconnected from others such as HTML. To address this, the group is making substantial efforts at communicating its mission (of which this is one such effort) and at enhancing its VoiceXML specification to simplify its use by those familiar with HTML. With phones becoming browsers, it is more crucial than ever that the expertise and products of the VBWG be leveraged to support the rich and varied applications we see increasingly on our personal mobile devices.
If you miss this presentation, you risk a) re-inventing the wheel in HTML, b) wasting the opportunity to tap the world's largest repository of expertise in phone-based apps, and c) losing the group that produced the most commercially successful specifications since HTML.
Don't miss out -- come and find out what we're doing!
Slides for the project review: http://www.w3.org/2010/Talks/0225-vow-project-review/VBWG_project_review.pdf
kaz: This is a project review for
Voice on the Web.
... We're encouraging people to get to know voice technology
better.
... And to promote the voice activity
... The presenter is Dan Burnett. He is from Voxeo, is an
active participant in VBWG and MMI, and is Co-Editor-in-Chief for the VoiceXML 3 draft.
burn: Slide 2
... Why are we even having this project review? The VBWG
believes that voice technologies are heavily under-utilized
today, specifically in Web applications. We're not seeing voice
used as much as we'd like.
... We think that's because Web developers aren't aware of its
potential. We think W3C is the right place to make sure that
Web developers have and are aware of the appropriate
technologies for the Web.
... Slide 3
... Why do we care?
... HTML provides the primary visual UI for the Web. VoiceXML
was designed to provide an aural UI for the Web.
... In many cases a visual output is the best, but in a number
of cases voice is the best input modality.
... Some examples:
... On commercial sites on the Web today, when purchasing
something, you go through the catalog, find what you want, then
go to a billing screen, where you can select your form of
payment.
... I find it irritating that I can't pay the way I want to,
e.g. with a Discover Card: it's not usable everywhere, and many
sites don't take it.
... Voice tends to be good at allowing people to cut through
menu trees.
<timbl_> Slide 3 omits to mention populations literate and in possession of a $10 phone but who cannot afford a smart phone.
<timbl_> which is a huge population.
burn: Someone just beginning to
look through a catalog might just be able to say "I'd like to
use a Discover Card" right up front. If properly programmed you
can cut right to it.
... Voice search is another example.
<Ralph> [I see no reason why a form couldn't have a non-verbal way to enter 'I want to use my MasterCard'. The mode of entry can be independent of the content of the entry.]
burn: Either the information that needs to be typed is just plain long, and you don't want to type it, or you want to refine a search in a particular way and it's just convenient to say it.
[[Ralph, because that takes up 'space' in the UI. There's no concept of that in voice UIs]]
burn: @@ sweater example @@
... Over the last few years, the VBWG has learned a lot about
how voice is useful where your eyes or hands are busy.
<steph> +1 to tim
burn: e.g. driving a car, where
your eyes and hands are both busy.
... Many states are prohibiting use of hands, if not eyes as
well.
... By addressing that use case you are, if done properly,
also addressing the needs of a community of people who cannot
see or use their hands.
<shepazu> [note that places are also looking at banning voice calls as well... my town, for example]
burn: Many of the use cases are
the same as the accessibility use cases.
... Another use case is illiterate populations. The visual Web
demands a high literacy rate.
... I would not be surprised if we see changes in Africa,
where populations who cannot read, and who previously weren't
online, are coming online.
<SteveB> People with low literacy are very important to the Web Foundation's work
burn: In the Coliseum example it'd be much easier to say that than type it.
timbl_: There's also a huge population of people who are literate but cannot afford a phone with a large screen.
<SteveB> +1 to Tim, inexpensive phones are another target
burn: Yes, I'm going to talk in a
bit about our historic motivators for voice. Our new motivators
include exactly what you say.
... Voice has been a primary medium for human communications
for a very long time.
... Slide 4
<timbl_> (eg Rwanda has 23% cellphone penetration but only 1% internet)
[[ Clarification questions please ask now, otherwise let's bring them up at the end ]]
burn: With respect to the goals
of the VBWG: we are particularly interested in making sure that
voice technologies are available on the Web.
... That may seem obvious, but when you hear our history, and
how we are perceived, it's not necessarily obvious.
... One of the things that's important for this group in
particular to understand is that there is a group filled with
voice experts.
... So, when I talk about this, remember that we have a lot
more experience than just call centers.
... Slide 5
... For more than ten years now we've been building voice
specifications.
... The group was initially driven very heavily by the call
center industry.
... For those not aware, there are massive rooms dedicated to
fulfilling telephone support requests, etc. There was a need
for automation there.
... The most common case is just call routing, figuring out
where the call belongs.
... When we started, there was a dramatic lack of open
standards.
... The call center space was populated with traditional
telephony companies and companies that had proprietary
telephony equipment.
... You had to buy a large piece of hardware that would take
these calls, forward on the rest, etc.
... All protocols were proprietary, lots of lock-in, very
expensive.
... This was all hampering the adoption of voice
technology.
... We've seen a lot of changes, an opening up of the telephony
world.
... There were Java APIs, MS's APIs, but they were definitely
not industry standards.
... In our community we needed standards that could operate on
a large scale.
burn: It was extremely important
that the implementation of the technology could be
efficient.
... The voice technologies themselves were expensive in terms
of time, and in terms of money. That's changing.
... Very expensive in time and in computing resources.
... There was a need for this stuff to happen in the
network.
<timbl_> "in the network"
burn: Voice technologies needed computation to occur in the network in order for accuracy to be good enough to be usable, for response times to be acceptable.
<timbl_> In the internet model, nothing happens in the network.
burn: At the time, no one was
thinking about Web architecture as the model for voice.
... It was a great selling point at the time to say that your
Web presence infrastructure could be reused for voice. We take
it for granted now, but people who are thinking of adding voice
to specifications today, may not realize how you build voice
technology into the Web infrastructure.
... Slide 6
... We've largely solved the problems we set out to solve. We've
got trillions of voice Web pages used every year.
... This is driving the need for the group today to move to
mobile devices.
... Mobile devices are now just mini computers. They're quite
rich feature-wise, they're multimodal in the general
sense.
... We're interested in growing the language in ways that
simplify developing voice technologies on the Web, while also
giving them greater flexibility at the same time.
... Slide 7
... This slide is important because people need to understand
that we're not just driven by call centers.
... As you have questions about Voice on the Web, or about
standards (e.g. MRCP at the IETF), and network constraints, we
know this, we're familiar with this.
... Anyone who looks at voice technologies or any standards out
there today, looks and says "We can do this without the
standard, it's really simple".
... We can tell you from experience that sure, you can
simplify, but once you get beyond demos, it takes a lot of
work.
... We're ready to share that experience.
burn: Another thing we understand
is imprecision in technology. Think about geolocation, whether
it's cell-tower or GPS, there is an error factor here.
... Voice technologies are similar, they're not exact. The
results typically have confidence scores associated with them.
We, as a group, have a lot of experience dealing with that.
Commercial, production experience.
... The medium itself is interesting. In speech, <um> you
can <ah> say many things, without saying things.
... The medium of speech can be imprecise and ambiguous. We
have a lot of experience here.
... We have a lot of experience in how to direct users when
necessary, and how to take what we are given and make the best
of it.
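The point about confidence scores can be made concrete with a small sketch. This was not shown in the review; it is an illustrative Python example that assumes a recognizer returns a list of hypotheses, each with a confidence between 0 and 1. The names `Hypothesis` and `decide`, and both thresholds, are hypothetical choices and not part of any VBWG specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One recognition result (hypothetical structure for illustration)."""
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def decide(nbest: List[Hypothesis],
           accept: float = 0.8,
           confirm: float = 0.5) -> str:
    """Pick an action for the best hypothesis.

    The thresholds are assumptions; real applications tune them per task.
    """
    if not nbest:
        # Nothing was recognized, so ask the caller again.
        return "reprompt"
    best = max(nbest, key=lambda h: h.confidence)
    if best.confidence >= accept:
        # Confident enough to act without bothering the caller.
        return f"accept:{best.text}"
    if best.confidence >= confirm:
        # Plausible but uncertain, so ask "Did you say ...?"
        return f"confirm:{best.text}"
    # Too uncertain to be useful, so ask again.
    return "reprompt"
```

A dialog built this way acts directly on high-confidence input, asks for confirmation in the middle band, and reprompts otherwise, which is one common way to make the best of imprecise input.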
... Slide 8
... What are we after here?
... We want to get the message out. Voice is not just about
call centers.
... It's not just important to the Web today, but a crucial
part of the future Web.
<ddahl> even mouse clicks can be imprecise, especially if the user doesn't have good control of their hands. some of the considerations that you have to take into account in voice are actually more generally useful.
<Ralph> +1 to Debbie's observation!
burn: The technology is not
simple, but it's also not too complex.
... "We can do something simple" or "It's too complex and
imprecise and we can't use it" -- the success of the VBWG, and
the companies within it is a sign that there is a middle
ground.
... We're interested in making sure that various specifications
work well together.
... There are a variety of specifications at W3C that we want
to make sure play well together.
... That doesn't necessarily mean VoiceXML, SCXML, CCXML, etc
-- though we are working on that -- but we are also willing to
discuss other alternatives, other APIs that are more
appropriate, e.g. a JavaScript API for use in HTML if that's
appropriate.
<kaz> [ I think we could use voice input as a supplemental modality for touchpad interface without mouse over event, e.g., by saying "this!" ]
burn: Ultimately we want to make
sure that voice technology is able to be integrated with other
technologies at W3C.
... We're also trying to improve the flexibility of VoiceXML.
Make it easier to customize the user interface.
... We're generalizing right now the way in which the UI is
built.
... e.g. creating Voice UI templates.
... Slide 9
... There are some things that we need from you, each of you
individually and as a group together.
... We want to see voice technology used by W3C itself.
... I'd be interested in brainstorming on how voice
technologies can be used today.
... What if there were a phone number that someone could call
for W3C? What information would be useful today?
<steph> +1 to dan
overheard: [[ read me the HTML5 spec ]]
burn: I'd like people on this call to use the technologies themselves.
<dom> [would be interesting to explore voice usage for some of our internal tools]
burn: There are three companies
listed in this presentation. All three have free web portals
with tutorials.
... I think if you start trying it out, you would quickly come
up with new ideas.
... We also need your help getting other groups to work with
us.
... Particularly HTML 5.
<kaz> [ I'd suggest we include these tutorials sites on the VB public page as well. ]
burn: We'd like to get some advice on how to get together with these people and work together.
<marie> [+1 to the LT suggestion]
<steph> great talk !
burn: @@ last point about mobile devices @@
<Ralph> Thanks, Dan
timbl_: We discussed this when
wandering around Africa. VoiceXML and HTML should be brought
together. But HTML and VoiceXML work entirely
differently.
... The visual browser right now in HTML is on the client. The
VoiceXML is now at a call center, typically no internet
involved at all. No one sees the links.
... Imagine a situation where there is public VoiceXML
on the Web.
... Say someone at CDC has put up a voice dialog about AIDS for
example. Then someone in MA puts up a voice dialog for say
someone who is sick.
... The system may want to redirect them to the CDC center for
AIDS research.
... That's the HTML model.
... That hasn't happened before, because simply you have to
have a client be a voice browser.
... That's becoming possible as phones are becoming
powerful.
<Ralph> [Tim is referring to the verbal analog of a hyperlink; i.e. when the phone operator says "for that question, let me transfer you to ..."]
timbl_: You can't talk about
VoiceXML being used in a Web like way unless you explain how it
works.
... Will I go to a Web site and it'll have a voice browser
enabled? Will it work like a mashup? Do I call the browser or
is it on my computer?
... So what is the model? Has anyone figured it out?
burn: The voice browser as we
called it, has traditionally not lived on the device. The
browser is living in the network, the device is just an access
point.
... The cost associated with this is the cost of speech
synthesis, ASR, or even DTMF detection.
<kaz> [ speech mashup example: http://www.youtube.com/watch?v=OURZpqh-35A&eurl=&feature=player_embedded ]
burn: I don't know of anyone who
has tried to create a public Web.
... I don't have a great answer.
<marie> ['voice links']
burn: Some of the future of voice
could easily live on the device.
... Others on the call who have things to say about that,
please feel free to jump in.
ddahl: There are two models; one is calling the phone number and browsing around in the voice-only world.
burn: When you have an actual
mashup it is possible because of the IP transport of voice to
go ahead and click to begin a conversation right there.
... I don't know if there is a standard right now for a click
on a link to start you talking. There isn't a particular
standard for how that transition occurs. Right now it occurs in
parallel and then you maybe hang up the phone.
timbl_: If you write a VoiceXML document, are there entry points you can give?
<ddahl> the other model is more multimodal where you talk to a visual browser, but you need the voice-only model for the use cases where people don't have a display on their phone
timbl_: So if you can do that, you can link out of a document.
burn: Right, you can make links.
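As an illustration of the exchange above about entry points and links, here is a minimal Python sketch that generates a VoiceXML document containing a `<link>` whose `next` attribute points at a dialog in another document. It assumes VoiceXML 2.x semantics, where a URI fragment names a dialog; the function name `build_vxml_with_link` and the example.org URL are hypothetical, and nothing like this was shown in the review.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_vxml_with_link(target_uri: str, phrase: str, prompt_text: str) -> str:
    """Return a small VoiceXML document with one spoken hyperlink.

    target_uri may carry a fragment (e.g. "...#aids_info") naming the
    dialog to enter in the target document.
    """
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})

    # The link is active throughout the document: saying the phrase
    # transfers the caller to target_uri.
    link = ET.SubElement(vxml, "link", {"next": target_uri})
    grammar = ET.SubElement(link, "grammar", {
        "type": "application/srgs+xml",
        "version": "1.0",
        "root": "cmd",
    })
    ET.SubElement(grammar, "rule", {"id": "cmd"}).text = phrase

    # A single dialog that speaks a prompt to the caller.
    form = ET.SubElement(vxml, "form", {"id": "main"})
    block = ET.SubElement(form, "block")
    ET.SubElement(block, "prompt").text = prompt_text

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(build_vxml_with_link(
        "http://example.org/cdc.vxml#aids_info",
        "more information",
        "Welcome. Say 'more information' at any time.",
    ))
```

The fragment in the target URI plays the role of the entry point asked about here: it selects which dialog in the linked document the caller lands in.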
timbl_: And if it linked to a
picture? There's no extra architecture added.
... The picture could link back to another piece of voice.
burn: Most companies that
implement VoiceXML, which has an <audio> tag, also allow
video. They can play that video and still be taking speech
input.
... Yes, if the browser can do both, the browser could
recognize an image, etc.
... But that's not part of the VoiceXML standard, the browser
would have to know what to do.
<Zakim> steph, you wanted to discuss the need for public voice browsing services, and advocacy on that and to ask how to make the message out, where there is almost no example and to
steph: This is the case today. I have a portal, and I have a set of bookmarks that I put, and then I can select the one I want.
<SteveB> Steph wrote a portal
<SteveB> can access it by browser or voice
steph: You can get a number, have bookmarks to whatever sites you have, etc.
<SteveB> both
timbl_: These are VoiceXML bookmarks.
steph: I think we are mixing both
voice and visual things.
... I was unable to find people putting VoiceXML on the
Web.
... There was no way except for a geeky guy to browse the voice
Web.
... In my world, I imagine you can find a specific number that
you'd call, your own private VoiceXML home page in which you
can search to other pages that you bookmark.
timbl_: Imagine Dom making his
CSS helper mobile app into VoiceXML.
... What would it take to make a VoiceXML interface for it?
<dom> [TimBL is referring to the cheat sheet http://www.w3.org/2009/cheatsheet/]
burn: About voice, the words you might use for someone who is reading are going to be different from those for someone who is speaking.
<Ralph> [ah, perfect example; can I put a link in my HTML document to message #4 in my mobile phone provider's voicemail system? Does the VBWG believe I should be able to record a voicemail message on Tim's mobile carrier's voicemail with a link to a message in my mobile carrier's voicemail?]
timbl_: Dom's cheatsheet has a list of things like CSS properties, etc. If you called 1800w3ccss, you could find references to CSS.
burn: Even say the team contact list, or the list of specifications.
<dom> [I'm sure having a phone number that would answer the questions about W3C spamming people would be useful :) ]
<SteveB> that is still a pretty standard call center application
burn: "Thanks for calling W3C, what are you interested in learning about? Specification work? Our people? Or our mission?"
steph: I don't understand why it should be a phone number.
<ddahl> I wonder if you could build a voice-enabled document retrieval system for specs based on the text of all the specs
<SteveB> would like to see something more open, linking to other sites,
steph: For me, I don't think W3C should have to run its own voice browser, it can just make its VoiceXML application.
<marie> [yes, why a phone # only?]
steph: I think it's a major flaw if everyone who wants to use a voice application needs their own voice browser.
burn: You could run your own
voice browser. Before voice was being commonly sent over VoIP,
it was common if you wanted to build something like a voice
browser, whether it was VoiceXML or something else.
... In those days you'd buy analog cards, put them in a PC, and
hook them to phone lines.
... Now with VoIP we can start to provide a software only voice
browser.
burn: You can download our
software and you can set up a SIP connection to it.
... So the user on the other end then needs either a SIP client
or a SIP/POTS gateway.
... Or you can have a gateway that runs the voice browser. I'm
hoping that becomes so common that it becomes ??
<Zakim> Ralph, you wanted to respond to Steph's "W3C shouldn't have to run its own voice system"
Ralph: I wanted to comment on Steph's "Why should W3C have to run its own service?" -- this is the dichotomy of running a Web server vs Web hosting.
burn: Not everyone needs to do
that, which is one of the reasons that my company provides the
software in different ways.
... You own the visual browser, it may cost you, it may be
free.
... My company has something free up to a point.
... A difference is that it's not the client that does it, it's
not the person viewing the information that has to get the
browser, it's the one who is providing the information who
needs the browser.
... SIP may be changing that.
timbl_: What stops me from taking your software and connecting to it via a microphone?
burn: Nothing, you can do that today.
timbl_: How much of my machine will it take today?
burn: The most compute intensive part is the speech recognition. That's the biggest reason why most people will host it somewhere.
<Ralph> [I think I agree that sip: is the connector for One Web that contains voice as an equal media type]
timbl_: There's a vast amount of
computing in people's laptops.
... My MacBook Pro today, it'll take one of the CPUs? That
means in 3 years time it might only take one of the 16
CPUs.
... The CPU argument seems shortsighted.
burn: I agree strongly. Compute power needs are not something that is going to matter in the future.
timbl_: You also said it'd be too
expensive.
... That sounds like a bug in the pricing model.
... Suppose I want to equip my Firefox to be VoiceXML enabled.
I put a plugin in, and it lets me browse through VoiceXML that
I find on the Web. That's the sort of thing that I imagine that
Google would fold in to Chrome very quickly.
... Don't you have to compete with a 99 cent iPhone app?
burn: The costs are definitely coming down. It doesn't take much of your machine to run the software directly.
<kaz> [ Opera (until 8?) used to support voice... ]
burn: My company provides a low
cost ASR that is just included in the software we
provide.
... The reason I talked about costs is just about how visual
Web browsers do more today than they did ten years ago. Ten
years from now voice browsers will also be more
sophisticated.
... For basic needs, I think you'll have basic recognition
running on the phone right now, but there will likely be
something more advanced coming that won't run there.
... One of the things Voxeo has been doing: we've been taking
our hosted infrastructure and making it available via
JavaScript.
... Those applications today run on the hosted network because
they need access to telephony and speech resources, but shortly
they'll just be remote calls, and they'll only access our
remote network in the cloud as needed to do these other
technologies.
... We see the industry as a whole going that direction.
timbl_: Is there OSS ASR at all?
burn: Yes, I haven't tried it.
<steph> look at http://www.w3.org/2008/MW4D/wiki/Tools
<steph> voice section
<Zakim> ddahl, you wanted to say that the problem isn't so much how to get the voice browser per se, but how to get the speech recognized, which is part of the browser, but they can be
ddahl: There's a difference
between the ASR itself and the service.
... Something like Sphinx from CMU.
<Ralph> links to open source recognizers
ddahl: And there are starting to be more open speech services available, e.g. the AT&T Speech Mashup.
<kaz> [ speech mashup example (speak4it): http://www.youtube.com/watch?v=OURZpqh-35A&eurl=&feature=player_embedded ]
<Ralph> CMU Sphinx
ddahl: There's the experimental WAMI from MIT.
burn: We don't offer a direct speech service. Could do that in the future. The primary platform for that right now is MRCP.
burn: Just like you use HTTP for
Web pages, we have MRCP for controlling speech resources.
... In our product right now, if you had an open source
recognizer running on another machine that supports MRCP
??
... Instead of pointing at the MRCP server that's installed for
our software now, you could just change it to use your own MRCP
server.
<kaz> open source ASR developed by Japanese researchers
burn: You can install open source software on any machine you want and do the same pointing. A lot of the Voice browsers work that way today. That's what MRCP support means today.
timbl_: Voice is one of those
things, like CSS, that we pushed out there
because we thought it was good.
... Multimodal too.
... What standards, what glue do we need to produce?
<kaz> [ I think MMI is one of the possible glues ]
timbl_: If you imagine local
voice capabilities on your machine, that will get out there
sometime. The first person to fold it in to the visual browser
would launch a de facto standard.
... Getting out there ahead of it with standardization is a good
idea.
... We should get people on the W3C staff using it. Playing
with it.
... The sort of thing that DanC did with tel:.
[[ at one time there was a proposed application type for 'universal voice browsing' in SIP at the IETF ]]
timbl_: Someone put some VoiceXML on the Web right now. Stick it in the public Web space. Put together an advanced adoption group who are playing with this stuff with the assumption everyone has this technology on their machines.
ddahl: One thing we have to be
careful about are the use cases of small devices, that require
remote speech recognition.
... It seems like we don't want to rely 100% on the native
platform. We want the flexibility to use remote or local
resources.
timbl_: The beginning of the Web
started with people telnetting to www.cern.ch and getting a Web
browser.
... That's now totally history. I'm glad we didn't design it
for that.
... If we'd stuck with that then... well, there would be lots
of dialup systems that wouldn't be open Web. CERN let you
follow links anywhere.
... We need to push this. Link together voice --
<SteveB> Stephane and Steveb need to leave a couple minutes before bottom of hour
timbl_: In my mind, if I write a VoiceXML file that has a bunch of information about me, my SSN, etc and then I follow a link to elsewhere. Does VoiceXML define how state is transferred?
<SteveB> would be great to share what the Web Foundation is thinking about doing in the field
burn: In general, no. That information is generally stored server side.
timbl_: It'd make up a huge horrible URL typically.
<kaz> [ and SCXML as well ]
<Ralph> Matt: VoiceXML will let you build deep links into an application
<Ralph> ... e.g. you could link directly to AIDS information in a voice site
burn: There are two different
programming models that we see today.
... Pages generated by the server, using JSP to generate
VoiceXML.
... So your context could be maintained on the server, so you
just generate the user interface, and then the information is
maintained by the server.
... Then the other model of doing everything locally on the
client. A data tag that you can use to send information to a
server somewhere.
... Typically not maintained in the UI piece. There's not a
general mechanism for transferring from one part of a UI to
another.
... There's not a way to share that state without going through
the server.
... People are used to that via HTML. The model is very
similar.
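The server-generated model described above can be sketched in Python. This is a minimal illustration under stated assumptions, not anything presented in the review: the in-memory `SESSIONS` store, the function names, and the `/next` URL are all hypothetical, and a real deployment would use a proper Web framework and session store. The generated page carries only a session identifier; everything already collected stays on the server.

```python
import uuid
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical in-memory store: session id -> values collected so far.
SESSIONS: Dict[str, Dict[str, str]] = {}


def start_session() -> str:
    """Create an empty server-side session and return its identifier."""
    sid = uuid.uuid4().hex
    SESSIONS[sid] = {}
    return sid


def record_answer(sid: str, name: str, value: str) -> None:
    """Store a collected value on the server, never in the page."""
    SESSIONS[sid][name] = value


def render_field_page(sid: str, field_name: str,
                      prompt_text: str, submit_url: str) -> str:
    """Generate a VoiceXML page that asks one question.

    Only the session id travels in the submit URL; earlier answers
    remain in SESSIONS on the server.
    """
    vxml = ET.Element("vxml", {
        "version": "2.1",
        "xmlns": "http://www.w3.org/2001/vxml",
    })
    form = ET.SubElement(vxml, "form")
    # "digits" is one of the built-in VoiceXML field types.
    field = ET.SubElement(form, "field", {"name": field_name, "type": "digits"})
    ET.SubElement(field, "prompt").text = prompt_text
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "submit", {
        "next": f"{submit_url}?session={sid}",
        "namelist": field_name,
        "method": "post",
    })
    return ET.tostring(vxml, encoding="unicode")
```

Because the page holds no collected data, following a link to a different voice site carries none of that state with it, which matches the answer given here that state is generally stored server side.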
<marie> [/me would welcome best practices]
<Ralph> [what's the REST analog for linking between voice apps? Does it even make sense to believe that RESTful transfer applies to the voice use cases?]
steph: It's good to look at the
future. One of the major use cases we see at the W3F for
VoiceXML is to deliver content and information in developing
countries.
... How can content providers provide information in
VoiceXML?
... How can they be a host of the Voice browser? Each content
provider cannot afford managing twenty or a hundred lines on
their own.
<kaz> [ Ralph, MMI WG think RESTful connection should be also useful for voice connection ]
steph: That's a level of
investment that is ??
... How to build a kind of framework where a content provider
just provides information in VoiceXML.
<Ralph> [Kaz, that's good to know; thanks. any pointers to discussion?]
steph: So that operators can
provide a connection between the POTS and the voice
application.
... I also want to ask about African languages in TTS and
ASR.
steph: How to develop low cost
ASR/TTS?
... And what about the usability of DTMF vs directed dialog vs
natural language when you talk about people who do not have Web
or ?? experience and are illiterate?
... I think the voice group may be interested in all of those
topics.
burn: There are commercial
drivers for low cost ASR and TTS. A lot of data needs to be
collected, and getting that data costs a lot.
... Any time there's a substantial effort or time to produce
something, you'll occasionally find people to do it for free,
but more often get it for a fee.
... I'd like to see low cost ASR and TTS show up for the long
tail languages. Not the most spoken or most commercially viable
languages.
... VBWG would be very interested in talking about it. You can
join us on a call sometime and we can talk about it.
[[ particularly where SSML and SRGS aren't sufficient for the African languages ]]
steph: [[ and on illiteracy? ]]
<kaz> [ that's one of our motivations for the Conv Apps workshop :) ]
burn: We don't particularly talk
about this in our group. Typically we have knowledge of the
interfaces we need to build and bring that to the group. But
there are Voice UI experts who are very familiar with the best
way to create UIs.
... Happy to talk to you about it, but it hasn't been part of
our core for standardization.
ddahl: One of the things the voice community has found is that there's a big difference in voice user interfaces in more first world type countries. That would be of interest to you to look at.
<Ralph> [just ran into http://www.voice-push.com/voice-push/voice-push.php -- seems to have some useful resources there. Are the author(s) of that site connected with the WG? ]
steph: There are now quite a few
different teams who are focusing on usability for the
illiterate.
... Federating those groups for usability guidelines, would
that be of interest to this group or not?
<SteveB> I'll need to drop off ... see my agenda item re: multi-modal and linked apps, and also was going to make point that we plan to train people in developing countries to help them build voice/html apps -- hope it is easy to do this :)
<SteveB> thanks a lot to everyone... great project review!
<burn> thanks steveb
steph: There are research groups around the world coming up with usability guidelines. Would this be relevant to the VBWG?
ddahl: I think it's relevant,
but the Voice UI would have to be looked at as a distinct thing
as well.
... The voice UI community experience has been that visual UI
experience does not translate well to voice.
<Ralph> Matt: usability guidelines could inform the spec as well
<steph> if people are interested http://www.w3.org/2008/MW4D/wiki/Stories
<Ralph> ... may find some things, e.g. new UI paradigms, that we don't currently support
<steph> voice section has some papers on this topic
burn: If there's a place for us to participate in usability guidelines, even in visual Web, happening within W3C.
[[ Mobile Web Best Practices -- not particular UI, but influences design ]]
<ddahl> actually, i meant that there are even cultural differences in Voice UI in the developed world, so that suggests you would find other differences if you looked at a broader range of cultures. it would be worth looking at
burn: Happy to help participate in that if it exists.
Ralph: steph, can you share your voice portal?
steph: I can write that up in a mail.
<steph> ACTION: steph to send info on his portal [recorded in http://www.w3.org/2010/02/25-vow-minutes.html#action01]
kaz: Are the participants here interested in more?
burn: I think the next step is
important, do what Tim said, go out and try it, play with it.
Give it a try.
... I think it would be good to have some kind of follow
on.
... More brainstorming after people have tried it.
[[ I like Tim's idea of an advanced group looking at some of the issues ]]
burn: I'm not sure what the right way forward is, get some team people together to brainstorm? what would be most effective?
<kaz> e.g.
<kaz> * http://studio.tellme.com
<kaz> * http://cafe.bevocal.com
<kaz> * http://evolution.voxeo.com
Ralph: I put an agenda item in
for more tutorial on how to write a voice app.
... That would give us ideas about
other forums in which to start talking.
<plh> I'll be interested in a hands-on approach
<kaz> [ actually, matt is generating some simple voice app now ]
matt: I'd be happy to help, set up a forum for us to try it out.
timbl_: I'd be happy to try to install a browser for 15 minutes. If it takes longer than that, it needs to be faster :)
burn: You can download a voice browser from Voxeo. Or you can look at one of the other three sites for hosting a browser.
burn: That's the place to try it without installing anything.
Ralph: This was very useful, thank you Dan!
<Ralph> Here's what I wanted to say under agendum 2:
<Ralph> What's the technology layering envisioned by the VBWG?
<Ralph> high modularity allows reuse of:
<Ralph> . natural language interfaces
<Ralph> . techniques for dealing with imprecise input (Debbie's mouse-motor-skill observation)
<Ralph> . location context
burn: I'd still like to hear the next steps.
<Ralph> not just in voice mode, though voice may be driving several of these layers
burn: and please, contact our support, they're very good.
timbl_: Medium term, for future
of VoiceXML, I'd like to push the future of multimedia. Think
beyond call center and integrating VoiceXML from HTML.
... What happens when Firefox includes a voice browser?
<ddahl> yes, natural language can also include natural language processing of typed input. EMMA is designed so that it can accommodate voice, typed, or even handwritten input
timbl_: You could write an add-on that signs up for VoiceXML and starts browsing when it finds it.
<Ralph> CSAIL Spoken Language Systems research group
timbl_: Then browser UI things, like watching a movie and saying 'turn down the volume'.
kaz: Do we want to have another call on voice applications? Next month sometime?
<Ralph> [[one of the things that has been on Ralph.Someday for years is to integrate [-> http://groups.csail.mit.edu/sls/technologies/galaxy.shtml Galaxy ] into Zakim ]]
<Ralph> [kaz, re: another talk on applications -- yes, I'd be interested]
<kaz> [adjourned]