16:02:47 RRSAgent has joined #htmlspeech
16:02:47 logging to http://www.w3.org/2011/06/30-htmlspeech-irc
16:02:49 RRSAgent, make logs public
16:02:50 satish has joined #htmlspeech
16:02:51 Zakim, this will be
16:02:52 Meeting: HTML Speech Incubator Group Teleconference
16:02:52 Date: 30 June 2011
16:02:55 I don't understand 'this will be', trackbot
16:03:04 chair:Dan_Burnett
16:03:05 glen has joined #htmlspeech
16:03:12 scribe: ddahl
16:03:16 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Jun/0078.html
16:04:03 topic: review updated final report draft
16:04:10 dan: email me if you have problems
16:04:28 http://www.w3.org/2005/Incubator/htmlspeech/live/NOTE-htmlspeech-20110629.html
16:04:48 + +44.207.881.aacc
16:05:18 topic: approve proposed changes to report draft
16:05:42 dan: marc suggested wording changes to requirements, we should approve
16:06:04 ...i don't agree with all of them, redundancy isn't a problem
16:06:41 ...propose making changes based on our current understanding. let me know if you have concerns.
16:06:43 +??P28
16:06:55 bringert has joined #htmlspeech
16:07:01 zakim, ??P28 is Bjorn_Bringert
16:07:01 +Bjorn_Bringert; got it
16:07:34 topic: status report from the WebAPI subgroup
16:07:56 dan: we'll start with the status and bring up anything that should be discussed in the larger group
16:08:41 ...fyi, will leave for half an hour, half an hour in
16:08:50 michael: will start discussing drafts
16:09:16 dan: any general discussion?
16:09:39 michael: not yet.
16:10:44 michael: Raj is doing summary of requirements and design decisions, we don't know if there will be directional changes.
16:11:42 dan: is there any discussion from the rest of the group?
16:12:48 topic: WebAPI subgroup
16:14:59 danD: the idea was that I can create an object that isn't necessarily the ASR or TTS object, and then I can bind to the service.
16:15:36 ...the protocol will drive some of the parameters
16:15:53 ...will send an update based on bjorn's comments
16:16:14 bjorn: i'm fine with the functionality, but maybe we do need two objects
16:16:39 danD: will try to blend proposal with bjorn's comments
16:16:55 michael: do we agree or not on two vs. one interface?
16:17:26 danD: I don't know at the time when i do the query what services will be provided, TTS, ASR, or both
16:17:46 bjorn: does it make sense to have a service that can provide both?
16:18:19 michael: we do have a discussion point on this
16:18:29 danD: having an interface bridge won't hurt
16:19:10 bjorn: my objection to having a single one is that it makes the interface more complicated
16:19:46 ...i want to be able to handle the case where i have one or the other or both
16:21:17 -Bjorn_Bringert
16:21:35 michael: other comments on Dan's interface?
16:22:10 danD: this won't be a full-fledged API or module in itself, it's just initialization
16:22:12 + +44.794.417.aadd
16:22:12 - +44.207.881.aacc
16:22:29 zakim, aadd is Bjorn_Bringert
16:22:29 +Bjorn_Bringert; got it
16:22:34 ...we should start building a table saying "these are the things I want to identify"
16:23:26 bjorn: if i want to have support for ASR or TTS it's hard to see what the API is. what if they are two different services. you have to do a bunch of checking flags.
16:24:20 olli: it depends on whether the parameters are the same for both cases.
16:24:58 bjorn: you also do totally different things with different services. there would need to be some kind of generic interface
16:25:31 michael: it would succeed or fail depending on what you asked it to do.
16:25:51 bjorn: it's better to specify two objects than having one giant object
16:26:08 (I got disconnected and will try calling in again)
16:26:10 ...it's a syntactic issue
16:26:30 michael: it also depends on whether there are a lot of services that are one or another
16:27:13 bjorn: what parameters do you need to specify?
URI, language, non-standard things like non-standard grammar format.
16:27:31 michael: other parameters?
16:27:54 michaelJ: grammar?
16:28:05 bjorn: this is querying for capabilities of the recognizer
16:29:19 ...it would make sense for the grammar to be a parameter, for example if you had some specific grammars, like "support for a specific grammar like 'date'".
16:29:37 michael: that could be for the moral equivalent of the builtins
16:30:35 -Dan_Burnett
16:30:39 dan: we're touching on some issues that we've already decided on, so we shouldn't revisit decisions that we already made
16:31:23 bjorn: standard queries would be grammar, language, and vendor-specific, so it doesn't matter too much if we have one API or two
16:31:56 michael: you may want to give them to the recognizer, not get them back from the recognizer
16:32:23 danD: we talked about not wanting to disclose what the application wanted to do.
16:32:52 bjorn: should get a list of what grammars and languages the recognizer support
16:33:03 s/support/supports
16:33:35 michael: it should accept a list of grammars and languages as its criteria and you get an engine back
16:34:51 michael: should return failure if the service can't support all the languages, but in the case of languages you might want to know if the service supports a subset
16:35:13 bjorn: someone could pass in a list of all the languages in the world
16:35:39 olli: the user agent should be able to ask the user
16:36:11 danD: if i just ask what languages you support, how is that a privacy issue?
16:36:46 -Bjorn_Bringert
16:36:53 olli: if the service supports only Finnish and English, you could guess that i'm Finnish
16:37:02 I got disconnected
16:37:14 +Bjorn_Bringert
16:37:27 michael: you could also use the API for the local device that always has the user's language on it.
16:38:04 ...services don't have to necessarily be honest about their answers
16:38:45 glenn: this seems like a major limitation that we're putting on developers for privacy reasons.
16:40:09 bjorn: regardless, we should say "give me a service that supports XYZ", and it's ok for the service to say "no comment"
16:40:29 michael: we want to allow the user to customize the service
16:40:51 charles: web servers already get the locale
16:41:13 olli: getting supported languages is just another piece of data about the user
16:42:00 bjorn: most common use case is ASR and TTS for locale, so how about if we just get the locale language
16:42:05 olli: that might work
16:43:30 danD: so far, we should be able to provide the filter criteria for the grammar and the language, it should be optional, will get another version, we can discuss further
16:44:31 bjorn: we could say that the default locale language is supported, it's the additional languages that are supported that we have to think about
16:44:51 danD: will start a table of other attributes that should be available at initialization
16:45:18 ...and will get an update
16:45:31 michael: now look at HTML bindings
16:47:19 bjorn: would like there to be an element that can be standalone or enclosed in other elements
16:47:59 ...not sure about control element
16:48:56 ...the important things for me on the recognition element, it should be possible for the web app author to put it on a form
16:49:18 olli: how do you actually bind the value?
16:49:53 bjorn: the definition of a value for a form control is that it's always a string without formatting
16:50:10 ...not so obvious for checkbox, it has to be defined for each type
16:50:59 ...it's the kind of thing you put in the "value" attribute for non-text elements
16:51:20 ...for textarea or content editable it's the text
16:51:42 olli: automatic binding in X+V was annoying
16:52:39 michael: the difference is the optionality, you don't have to do it.
as for the microphone, the reco image is platform-specific, microphone, button, etc.
16:53:18 olli: the graphical presentation could be problematic
16:53:41 bjorn: each browser will have to decide what security model it wants to implement
16:54:37 michael: not sure about usefulness of the form, but the "for" does seem useful
16:54:48 bjorn: form is just a convenience
16:55:07 hey, sounds like bjorn wants voicexml :)
16:55:32 bjorn: should we look at label?
16:56:04 ... the HTML label does what we want
16:56:37 ...we want to do the same things that label does
16:56:57 olli: when will user give permission?
16:57:21 michael: each browser will be different
16:58:33 ...some people want the button to appear on the screen without asking permission
16:59:03 bjorn: Google Voice search, for example, you don't want to have to prompt the user every time
16:59:36 olli: worried about when user will give permission
17:00:21 bjorn: easier in the CaptureAPI case if there's no markup
17:01:13 michael: you need to check for permission when you do the reco, not just to have a reco object
17:01:38 olli: if the user never wants speech, maybe the browser doesn't even render the microphone
17:01:51 +Dan_Burnett
17:02:18 bjorn: olli, are you still concerned about consistency of permission policy?
17:03:08 olli: my concerns are that the user agent needs permission before using the reco object
17:04:16 bjorn: is the CaptureAPI similar to the Javascript recognition API?
17:04:42 olli: you get similar data in CaptureAPI and reco
17:05:05 http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html#video-conferencing-and-peer-to-peer-communication
17:05:49 bjorn: you can get a "permission denied" error code, that's very similar to our API
17:06:38 michael: what doesn't work is that the permission check happens before the binding
17:07:28 danD: there are two steps, one the rendering of the object, and then the user decides to use that UI element, and that's a privacy and consent issue
17:08:00 ...it makes more sense if it doesn't even prompt the user until it knows something is there
17:09:11 olli: a query to find out what kind of recognizer object is available is ok
17:10:06 bjorn: do you see a problem with the HTML API having a different method?
17:10:30 ...i think browsers should implement permission after the user clicks the button
17:11:06 olli: what if user has already started speaking
17:11:49 bjorn: no permission could either cancel or not start recognition
17:12:01 michael: user should be able to revoke permission
17:12:57 bjorn: these things are up to the user agent, having the Javascript API and the button should make it possible to implement appropriate privacy and security
17:13:59 zakim, who is noisy?
17:14:09 ddahl, listening for 10 seconds I heard sound from the following: Olli_Pettay (85%), Michael_Bodell (58%)
17:14:18 michael: move on, because other topics
17:14:44 ...do we agree that we don't need HTML bindings for TTS?
17:15:50 bjorn: don't have anything against it, but maybe a waste of time.
17:16:02 michael: we can leave it as it is for now.
17:17:02 let's start on bjorn's speech recognition events, similar to what i sent before the f2f
17:17:49 ...added timestamps, there are also a number of error codes that we need to agree on
17:18:19 ...what about nomatch and noinput, are they errors or kinds of input?
17:18:36 michael: i think they're different types of result
17:19:20 ...nomatch seems like a result, but noinput seems like a different kind of event
17:19:32 dan: we look at rejections
17:19:54 michael: if rejection was just below confidence you may want to look at that.
17:20:14 charles: noinput could be like a volume issue
17:20:44 michael: nospeech would not generate an nbest on our platform
17:20:54 dan: for us it would be the same way
17:21:31 glenn: why have multiple events instead of a single event that returns different parameters?
17:22:17 michael: i don't think you're typically doing the same thing with noinput vs. nomatch
17:22:55 charles: it's nice to have the engine decide if it's a nomatch
17:23:34 dan: sometimes the engine ends up with no answer, the vast majority of nomatch is confidence-based
17:24:10 glenn: should make sure that results returned are in as similar a format as possible
17:24:38 bjorn: what about nospeech?
17:25:06 dan: error to me means that something broke, not like a normal expected user situation
17:25:49 bjorn: the distinction between error and normal is not always clear
17:26:36 dan: true user interface behavior is not an error, "abort" would only be an error if you grouped together user-initiated abort and engine abort
17:27:16 bjorn: are permission problems or network problems errors?
17:27:39 michael: would not consider abort or noinput errors
17:28:30 glenn: I would tie them all into the same event, that would be simpler for the developer
17:29:10 michael: in the continuous case you don't care about noinput
17:29:23 dan: we won't resolve this in the remaining time.
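The open question in the events discussion (are nomatch and noinput results, separate events, or errors?) can be illustrated with a minimal TypeScript sketch. It follows glenn's suggestion of one outcome type with different parameters, while keeping dan's and michael's distinction that nomatch and noinput are normal outcomes and not errors. Every name here (`RecoOutcome`, `Alternative`, `isError`, `describe`, the two error codes) is hypothetical; the group had not agreed on any of them, and whether permission or network problems count as errors was still unresolved.

```typescript
// Hypothetical names for illustration; nothing here was agreed by the group.
interface Alternative {
  transcript: string;
  confidence: number;
}

// One outcome type, discriminated by `kind`, each carrying a timestamp.
type RecoOutcome =
  | { kind: "match"; timestamp: number; nbest: Alternative[] }
  // nomatch keeps its n-best so a rejection just below confidence can be inspected.
  | { kind: "nomatch"; timestamp: number; nbest: Alternative[] }
  // noinput carries no n-best, matching what michael and dan reported for their platforms.
  | { kind: "noinput"; timestamp: number }
  // Assumption: permission and network problems are treated as errors here.
  | { kind: "error"; timestamp: number; code: "permission-denied" | "network" };

// "Error" means something broke, not a normal expected user situation.
function isError(outcome: RecoOutcome): boolean {
  return outcome.kind === "error";
}

// A single handler can still branch per kind, since the follow-up differs.
function describe(outcome: RecoOutcome): string {
  switch (outcome.kind) {
    case "match":
      return outcome.nbest.length > 0
        ? "heard: " + outcome.nbest[0].transcript
        : "heard: (empty result)";
    case "nomatch":
      return "rejected, " + outcome.nbest.length + " low-confidence alternative(s)";
    case "noinput":
      return "no speech detected";
    case "error":
      return "failed: " + outcome.code;
  }
}
```

The same union could equally be split across several event types; the sketch only shows that a shared result format does not force nomatch and noinput to be handled the same way.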
17:30:05 michael: we can continue discussion on the list
17:30:24 -Glen_Shires
17:30:26 rrsagent, format minutes
17:30:26 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html ddahl
17:30:30 -Olli_Pettay
17:30:33 -Michael_Bodell
17:30:34 -Druta
17:30:35 -Debbie_Dahl
17:30:42 -Dan_Burnett
17:30:44 -Patrick_Ehlen
17:30:45 -Charles_Hemphill
17:31:05 -Michael_Johnston
17:31:24 -Bjorn_Bringert
17:31:26 INC_(HTMLSPEECH)11:30AM has ended
17:31:28 Attendees were Dan_Burnett, +1.425.922.aaaa, Patrick_Ehlen, Robert_Brown, Michael_Johnston, Olli_Pettay, Michael_Bodell, Druta, Debbie_Dahl, Charles_Hemphill, +1.650.279.aabb,
17:31:31 ... Glen_Shires, +44.207.881.aacc, Bjorn_Bringert, +44.794.417.aadd
17:31:32 ddahl has left #htmlspeech
17:31:47 rrsagent, make log public
17:31:53 rrsagent, draft minutes
17:31:53 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:32:01 Regrets: Raj_Tumuluri
17:32:06 rrsagent, draft minutes
17:32:06 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:33:07 s/, +1.425.922.aaaa//
17:33:26 s/, +1.650.279.aabb//
17:33:48 s/, +44.207.881.aacc//
17:34:02 s/, +44.794.417.aadd//
17:34:12 rrsagent, draft minutes
17:34:12 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:35:02 s/, Robert_Brown//
17:35:11 s/Druta/Dan_Druta/
17:41:42 rrsagent, draft minutes
17:41:42 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
17:43:20 s/Bjorn_Bringert/Bjorn_Bringert, Satish_Sampath/
17:43:23 rrsagent, draft minutes
17:43:23 I have made the request to generate http://www.w3.org/2011/06/30-htmlspeech-minutes.html burn
19:39:59 Zakim has left #htmlspeech
20:33:24 smaug has joined #htmlspeech
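
The service-selection discussion in the WebAPI subgroup topic (optional filter criteria for language and grammar, failure when not all requested languages are supported, and a service being allowed to answer "no comment") can be sketched roughly as follows. This is only an illustration under stated assumptions: `ServiceDescription`, `ServiceQuery`, `Answer`, `supportsAll`, and `matches` are hypothetical names, and the example URI is a placeholder, not anything proposed in the meeting.

```typescript
// Hypothetical shapes for illustration only; not an agreed interface.
type ServiceKind = "asr" | "tts";

interface ServiceDescription {
  uri: string;
  kinds: ServiceKind[];   // a service may offer ASR, TTS, or both
  languages?: string[];   // undefined models a service that declines to say
  grammars?: string[];    // e.g. builtin-style grammars such as "date"
}

interface ServiceQuery {
  kind: ServiceKind;
  languages?: string[];   // optional filter criteria
  grammars?: string[];    // optional filter criteria
}

type Answer = "yes" | "no" | "no-comment";

// "yes" only if every required item is offered; "no-comment" if the
// service does not disclose what it offers.
function supportsAll(required: string[] | undefined, offered: string[] | undefined): Answer {
  if (required === undefined || required.length === 0) return "yes";
  if (offered === undefined) return "no-comment";
  return required.every((item) => offered.indexOf(item) !== -1) ? "yes" : "no";
}

function matches(query: ServiceQuery, service: ServiceDescription): Answer {
  if (service.kinds.indexOf(query.kind) === -1) return "no";
  const lang = supportsAll(query.languages, service.languages);
  const gram = supportsAll(query.grammars, service.grammars);
  if (lang === "no" || gram === "no") return "no";
  if (lang === "no-comment" || gram === "no-comment") return "no-comment";
  return "yes";
}
```

Because the page states what it needs and gets back only yes, no, or no-comment, the service never has to enumerate its full language list, which is the privacy concern olli raised. Treating the default locale language as always supported, as bjorn suggested, would be a further rule layered on top of `supportsAll`.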