16:02:47 <RRSAgent> RRSAgent has joined #htmlspeech
16:02:47 <RRSAgent> logging to http://www.w3.org/2011/06/30-htmlspeech-irc
16:02:49 <trackbot> RRSAgent, make logs public
16:02:50 <satish> satish has joined #htmlspeech
16:02:51 <trackbot> Zakim, this will be 
16:02:52 <trackbot> Meeting: HTML Speech Incubator Group Teleconference
16:02:52 <trackbot> Date: 30 June 2011
16:02:55 <Zakim> I don't understand 'this will be', trackbot
16:03:04 <ddahl> chair:Dan_Burnett
16:03:05 <glen> glen has joined #htmlspeech
16:03:12 <ddahl> scribe: ddahl
16:03:16 <burn> Agenda:  http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Jun/0078.html 
16:04:03 <ddahl> topic: review updated final report draft
16:04:10 <ddahl> dan: email me if you have problems
16:04:28 <ddahl> http://www.w3.org/2005/Incubator/htmlspeech/live/NOTE-htmlspeech-20110629.html
16:04:48 <Zakim> + +44.207.881.aacc
16:05:18 <ddahl> topic: approve proposed changes to report draft
16:05:42 <ddahl> dan: marc suggested wording changes to requirements, we should approve
16:06:04 <ddahl> ...i don't agree with all of them, redundancy isn't a problem
16:06:41 <ddahl> ...propose making changes based on our current understanding. let me know if you have concerns.
16:06:43 <Zakim> +??P28
16:06:55 <bringert> bringert has joined #htmlspeech
16:07:01 <burn> zakim, ??P28 is Bjorn_Bringert
16:07:01 <Zakim> +Bjorn_Bringert; got it
16:07:34 <ddahl> topic: status report from the WebAPI subgroup
16:07:56 <ddahl> dan: we'll start with the status and bring up anything that should be discussed in the larger group
16:08:41 <ddahl> ...fyi, I will step away for half an hour, starting half an hour into the call
16:08:50 <ddahl> michael: will start discussing drafts
16:09:16 <ddahl> dan: any general discussion?
16:09:39 <ddahl> michael: not yet.
16:10:44 <ddahl> michael: Raj is doing summary of requirements and design decisions, we don't know if there will be directional changes.
16:11:42 <ddahl> dan: is there any discussion from the rest of the group?
16:12:48 <ddahl> topic: WebAPI subgroup
16:14:59 <ddahl> danD: the idea was that I can create an object that isn't necessarily the ASR or TTS object, and then I can bind to the service.
16:15:36 <ddahl> ...the protocol will drive some of the parameters
16:15:53 <ddahl> ...will send an update based on bjorn's comments
16:16:14 <ddahl> bjorn: i'm fine with the functionality, but maybe we do need two objects
16:16:39 <ddahl> danD: will try to blend proposal with bjorn's comments
16:16:55 <ddahl> michael: do we agree or not on two vs. one interface?
16:17:26 <ddahl> danD: I don't know at the time when i do the query what services will be provided, TTS, ASR, or both
16:17:46 <ddahl> bjorn: does it make sense to have a service that can provide both?
16:18:19 <ddahl> michael: we do have a discussion point on this
16:18:29 <ddahl> danD: having an interface bridge won't hurt
16:19:10 <ddahl> bjorn: my objection to having a single one is that it makes the interface more complicated
16:19:46 <ddahl> ...i want to be able to handle the case where i have one or the other or both
16:21:17 <Zakim> -Bjorn_Bringert
16:21:35 <ddahl> michael: other comments on Dan's interface?
16:22:10 <ddahl> danD: this won't be a full-fledged API or module in itself, it's just initialization
16:22:12 <Zakim> + +44.794.417.aadd
16:22:12 <Zakim> - +44.207.881.aacc
16:22:29 <burn> zakim, aadd is Bjorn_Bringert
16:22:29 <Zakim> +Bjorn_Bringert; got it
16:22:34 <ddahl> ...we should start building a table saying "these are the things I want to identify"
16:23:26 <ddahl> bjorn: if i want to have support for ASR or TTS it's hard to see what the API is. what if they are two different services? you have to check a bunch of flags.
16:24:20 <ddahl> olli: it depends on whether the parameters are the same for both cases.
16:24:58 <ddahl> bjorn: you also do totally different things with different services. there would need to be some kind of generic interface
16:25:31 <ddahl> michael: it would succeed or fail depending on what you asked it to do.
16:25:51 <ddahl> bjorn: it's better to specify two objects than having one giant object
16:26:08 <satish> (I got disconnected and will try calling in again)
16:26:10 <ddahl> ...it's a syntactic issue
16:26:30 <ddahl> michael: it also depends on whether there are a lot of services that are one or another
16:27:13 <ddahl> bjorn: what parameters do you need to specify? URI, language, non-standard things like non-standard grammar format.
16:27:31 <ddahl> michael: other parameters?
16:27:54 <ddahl> michaelJ: grammar?
16:28:05 <ddahl> bjorn: this is querying for capabilities of the recognizer
16:29:19 <ddahl> ...it would make sense for the grammar to be a parameter, for example if you had some specific grammars, like "support for a specific grammar like 'date'".
16:29:37 <ddahl> michael: that could be for the moral equivalent of the builtins
16:30:35 <Zakim> -Dan_Burnett
16:30:39 <ddahl> dan: we're touching on some issues that we've already decided on, so we shouldn't revisit decisions that we already made
16:31:23 <ddahl> bjorn: standard queries would be grammar, language, and vendor-specific, so it doesn't matter too much if we have one API or two
16:31:56 <ddahl> michael: you may want to give them to the recognizer, not get them back from the recognizer
16:32:23 <ddahl> danD: we talked about not wanting to disclose what the application wanted to do.
16:32:52 <ddahl> bjorn: should get a list of what grammars and languages the recognizer support
16:33:03 <ddahl> s/support/supports
16:33:35 <ddahl> michael: it should accept a list of grammars and languages as its criteria and you get an engine back
16:34:51 <ddahl> michael: should return failure if the service can't support all the languages, but in the case of languages you might want to know if the service supports a subset
16:35:13 <ddahl> bjorn: someone could pass in a list of all the languages in the world
16:35:39 <ddahl> olli: the user agent should be able to ask the user
16:36:11 <ddahl> danD: if i just ask what languages you support, how is that a privacy issue?
16:36:46 <Zakim> -Bjorn_Bringert
16:36:53 <ddahl> olli: if the service supports only Finnish and English, you could guess that i'm Finnish
16:37:02 <bringert> I got disconnected
16:37:14 <Zakim> +Bjorn_Bringert
16:37:27 <ddahl> michael: you could also use the API for the local device that always has the user's language on it.
16:38:04 <ddahl> ...services don't have to necessarily be honest about their answers
16:38:45 <ddahl> glen: this seems like a major limitation that we're putting on developers for privacy reasons.
16:40:09 <ddahl> bjorn: regardless, we should say "give me a service that supports XYZ", and it's ok for the service to say "no comment"
16:40:29 <ddahl> michael: we want to allow the user to customize the service
16:40:51 <ddahl> charles: web servers already get the locale
16:41:13 <ddahl> olli: getting supported languages is just another piece of data about the user
16:42:00 <ddahl> bjorn: most common use case is ASR and TTS for locale, so how about if we just get the locale language
16:42:05 <ddahl> olli: that might work
16:43:30 <ddahl> danD: so far, we should be able to provide the filter criteria for the grammar and the language, it should be optional, will get another version, we can discuss further
16:44:31 <ddahl> bjorn: we could say that the default locale language is supported, it's the additional languages that are supported that we have to think about
16:44:51 <ddahl> danD: will start a table of other attributes that should be available at initialization
16:45:18 <ddahl> ...and will get an update
16:45:31 <ddahl> michael: now look at HTML bindings
16:47:19 <ddahl> bjorn: would like there to be an element that can be standalone or enclosed in other elements
16:47:59 <ddahl> ...not sure about control element
16:48:56 <ddahl> ...the important thing for me about the recognition element is that it should be possible for the web app author to put it on a form
16:49:18 <ddahl> olli: how do you actually bind the value?
16:49:53 <ddahl> bjorn: the definition of a value for a form control is that it's always a string without formatting
16:50:10 <ddahl> ...not so obvious for checkbox, it has to be defined for each type
16:50:59 <ddahl> ...it's the kind of thing you put in the "value" attribute for non-text elements
16:51:20 <ddahl> ...for textarea or content editable it's the text
16:51:42 <ddahl> olli: automatic binding in X+V was annoying
16:52:39 <ddahl> michael: the difference is the optionality, you don't have to do it. as for the microphone, the reco image is platform-specific, microphone, button, etc.
16:53:18 <ddahl> olli: the graphical presentation could be problematic
16:53:41 <ddahl> bjorn: each browser will have to decide what security model it wants to implement
16:54:37 <ddahl> michael: not sure about usefulness of the form, but the "for" does seem useful
16:54:48 <ddahl> bjorn: form is just a convenience
16:55:07 <burn> hey, sounds like bjorn wants voicexml :)
16:55:32 <ddahl> bjorn: should we look at label? 
16:56:04 <ddahl> ... the HTML label does what we want
16:56:37 <ddahl> ...we want to do the same things that label does
16:56:57 <ddahl> olli: when will user give permission?
16:57:21 <ddahl> michael: each browser will be different
16:58:33 <ddahl> ...some people want the button to appear on the screen without asking permission
16:59:03 <ddahl> bjorn: Google Voice search, for example, you don't want to have to prompt the user every time
16:59:36 <ddahl> olli: worried about when user will give permission
17:00:21 <ddahl> bjorn: easier in the CaptureAPI case if there's no markup
17:01:13 <ddahl> michael: you need to check for permission when you do the reco, not just to have a reco object
17:01:38 <ddahl> olli: if the user never wants speech, maybe the browser doesn't even render the microphone
17:01:51 <Zakim> +Dan_Burnett
17:02:18 <ddahl> bjorn: olli, are you still concerned about consistency of permission policy?
17:03:08 <ddahl> olli: my concerns are that the user agent needs permission before using the reco object
17:04:16 <ddahl> bjorn: is the CaptureAPI similar to the Javascript recognition API?
17:04:42 <ddahl> olli: you get similar data in CaptureAPI and reco
17:05:05 <smaug> http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html#video-conferencing-and-peer-to-peer-communication
17:05:49 <ddahl> bjorn: you can get a "permission denied" error code, that's very similar to our API
17:06:38 <ddahl> michael: what doesn't work is that the permission check happens before the binding
17:07:28 <ddahl> danD: there are two steps, one the rendering of the object, and then the user decides to use that UI element, and that's a privacy and consent issue
17:08:00 <ddahl> ...it makes more sense if it doesn't even prompt the user until it knows something is there
17:09:11 <ddahl> olli: a query to find out what kind of recognizer object is available is ok
17:10:06 <ddahl> bjorn: do you see a problem with the HTML API having a different method?
17:10:30 <ddahl> ...i think browsers should implement permission after the user clicks the button
17:11:06 <ddahl> olli: what if the user has already started speaking?
17:11:49 <ddahl> bjorn: no permission could either cancel or not start recognition
17:12:01 <ddahl> michael: user should be able to revoke permission
17:12:57 <ddahl> bjorn: these things are up to the user agent, having the Javascript API and the button should make it possible to implement appropriate privacy and security
17:13:59 <ddahl> zakim, who is noisy?
17:14:09 <Zakim> ddahl, listening for 10 seconds I heard sound from the following: Olli_Pettay (85%), Michael_Bodell (58%)
17:14:18 <ddahl> michael: let's move on, because there are other topics
17:14:44 <ddahl> ...do we agree that we don't need HTML bindings for TTS?
17:15:50 <ddahl> bjorn: don't have anything against it, but maybe a waste of time.
17:16:02 <ddahl> michael: we can leave it as it is for now.
17:17:02 <ddahl> let's start on bjorn's speech recognition events, similar to what i sent before the f2f
17:17:49 <ddahl> ...added timestamps, there are also a number of error codes that we need to agree on
17:18:19 <ddahl> ...what about nomatch and noinput, are they errors or kinds of input?
17:18:36 <ddahl> michael: i think they're different types of result
17:19:20 <ddahl> ...nomatch seems like a result, but noinput seems like a different kind of event
17:19:32 <ddahl> dan: we look at rejections
17:19:54 <ddahl> michael: if rejection was just below confidence you may want to look at that.
17:20:14 <ddahl> charles: noinput could be like a volume issue
17:20:44 <ddahl> michael: nospeech would not generate an nbest on our platform
17:20:54 <ddahl> dan: for us it would be the same way
17:21:31 <ddahl> glen: why have multiple events instead of a single event that returns different parameters?
17:22:17 <ddahl> michael: i don't think you're typically doing the same thing with noinput vs. nomatch
17:22:55 <ddahl> charles: it's nice to have the engine decide if it's a nomatch
17:23:34 <ddahl> dan: sometimes the engine ends up with no answer, the vast majority of nomatch is confidence-based
17:24:10 <ddahl> glen: should make sure that results returned are in as similar a format as possible
17:24:38 <ddahl> bjorn: what about nospeech? 
17:25:06 <ddahl> dan: error to me means that something broke, not like a normal expected user situation
17:25:49 <ddahl> bjorn: the distinction between error and normal is not always clear
17:26:36 <ddahl> dan: true user interface behavior is not an error, "abort" would only be an error if you grouped together user-initiated abort and engine abort
17:27:16 <ddahl> bjorn: are permission problems or network problems errors?
17:27:39 <ddahl> michael: would not consider abort or noinput errors
17:28:30 <ddahl> glen: I would tie them all into the same event, that would be simpler for the developer
17:29:10 <ddahl> michael: in the continuous case you don't care about noinput
17:29:23 <ddahl> dan: we won't resolve this in the remaining time.
17:30:05 <ddahl> michael: we can continue discussion on the list
17:30:24 <Zakim> -Glen_Shires
17:30:26 <ddahl> rrsagent, format minutes
17:30:26 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html ddahl
17:30:30 <Zakim> -Olli_Pettay
17:30:33 <Zakim> -Michael_Bodell
17:30:34 <Zakim> -Druta
17:30:35 <Zakim> -Debbie_Dahl
17:30:42 <Zakim> -Dan_Burnett
17:30:44 <Zakim> -Patrick_Ehlen
17:30:45 <Zakim> -Charles_Hemphill
17:31:05 <Zakim> -Michael_Johnston
17:31:24 <Zakim> -Bjorn_Bringert
17:31:26 <Zakim> INC_(HTMLSPEECH)11:30AM has ended
17:31:28 <Zakim> Attendees were Dan_Burnett, +1.425.922.aaaa, Patrick_Ehlen, Robert_Brown, Michael_Johnston, Olli_Pettay, Michael_Bodell, Druta, Debbie_Dahl, Charles_Hemphill, +1.650.279.aabb,
17:31:31 <Zakim> ... Glen_Shires, +44.207.881.aacc, Bjorn_Bringert, +44.794.417.aadd
17:31:32 <ddahl> ddahl has left #htmlspeech
17:31:47 <burn> rrsagent, make log public
17:31:53 <burn> rrsagent, draft minutes
17:31:53 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:32:01 <burn> Regrets:  Raj_Tumuluri
17:32:06 <burn> rrsagent, draft minutes
17:32:06 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:33:07 <burn> s/, +1.425.922.aaaa//
17:33:26 <burn> s/, +1.650.279.aabb//
17:33:48 <burn> s/, +44.207.881.aacc//
17:34:02 <burn> s/, +44.794.417.aadd//
17:34:12 <burn> rrsagent, draft minutes
17:34:12 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:35:02 <burn> s/, Robert_Brown//
17:35:11 <burn> s/Druta/Dan_Druta/
17:41:42 <burn> rrsagent, draft minutes
17:41:42 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:43:20 <burn> s/Bjorn_Bringert/Bjorn_Bringert, Satish_Sampath/
17:43:23 <burn> rrsagent, draft minutes
17:43:23 <RRSAgent> I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
19:39:59 <Zakim> Zakim has left #htmlspeech
20:33:24 <smaug> smaug has joined #htmlspeech