IRC log of htmlspeech on 2010-12-16
Timestamps are in UTC.
- 16:40:32 [RRSAgent]
- RRSAgent has joined #htmlspeech
- 16:40:32 [RRSAgent]
- logging to http://www.w3.org/2010/12/16-htmlspeech-irc
- 16:40:49 [Zakim]
- Zakim has joined #htmlspeech
- 16:43:58 [burn]
- zakim, this will be htmlspeech
- 16:43:58 [Zakim]
- ok, burn; I see INC_(HTMLSPEECH)12:00PM scheduled to start in 17 minutes
- 16:44:06 [burn]
- zakim, code?
- 16:44:06 [Zakim]
- the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
- 16:54:28 [Zakim]
- INC_(HTMLSPEECH)12:00PM has now started
- 16:54:35 [Zakim]
- +Michael_Bodell
- 16:54:52 [mbodell_]
- mbodell_ has joined #htmlspeech
- 16:55:13 [Zakim]
- +??P1
- 16:55:41 [smaug_]
- Zakim, ??P1 is Olli_Pettay
- 16:55:41 [Zakim]
- +Olli_Pettay; got it
- 16:55:59 [bringert]
- bringert has joined #htmlspeech
- 16:56:03 [smaug_]
- Zakim, nick smaug is Olli_Pettay
- 16:56:03 [Zakim]
- sorry, smaug_, I do not see 'smaug' on this channel
- 16:56:08 [smaug_]
- Zakim, nick smaug_ is Olli_Pettay
- 16:56:08 [Zakim]
- ok, smaug_, I now associate you with Olli_Pettay
- 16:56:45 [marc]
- marc has joined #htmlspeech
- 16:57:01 [Zakim]
- +Milan_Young
- 16:57:23 [Zakim]
- + +44.122.546.aaaa
- 16:57:35 [Milan]
- Milan has joined #htmlspeech
- 16:57:48 [bringert]
- Zakim, +44.122.546.aaaa is Bjorn_Bringert
- 16:57:48 [Zakim]
- +Bjorn_Bringert; got it
- 16:57:58 [bringert]
- Zakim, I am Bjorn_Bringert
- 16:57:58 [Zakim]
- ok, bringert, I now associate you with Bjorn_Bringert
- 16:58:07 [burn]
- burn has joined #htmlspeech
- 16:58:25 [burn]
- zakim, code
- 16:58:25 [Zakim]
- I don't understand 'code', burn
- 16:58:28 [burn]
- zakim, code?
- 16:58:28 [Zakim]
- the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
- 16:58:45 [Zakim]
- +[IPcaller]
- 16:58:46 [Zakim]
- +Dan_Burnett
- 16:58:57 [burn]
- trackbot, start telcon
- 16:58:58 [Zakim]
- +[Microsoft]
- 16:58:59 [trackbot]
- RRSAgent, make logs public
- 16:59:00 [marc]
- zakim, I am IPCaller
- 16:59:00 [Zakim]
- ok, marc, I now associate you with [IPcaller]
- 16:59:01 [trackbot]
- Zakim, this will be
- 16:59:02 [trackbot]
- Meeting: HTML Speech Incubator Group Teleconference
- 16:59:02 [trackbot]
- Date: 16 December 2010
- 16:59:02 [Zakim]
- I don't understand 'this will be', trackbot
- 16:59:06 [burn]
- Chair: Dan Burnett
- 16:59:22 [burn]
- Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0144.html
- 16:59:24 [ddahl]
- ddahl has joined #htmlspeech
- 16:59:33 [burn]
- zakim, i am Dan_Burnett
- 16:59:33 [Zakim]
- ok, burn, I now associate you with Dan_Burnett
- 16:59:56 [Zakim]
- +Debbie_Dahl
- 17:00:19 [burn]
- zakim, who is on the phone?
- 17:00:19 [Zakim]
- On the phone I see Michael_Bodell, Olli_Pettay, Milan_Young, Bjorn_Bringert, [IPcaller], Dan_Burnett, [Microsoft], Debbie_Dahl
- 17:00:23 [Robert]
- Robert has joined #htmlspeech
- 17:00:34 [marc]
- zakim, I am IPcaller
- 17:00:34 [Zakim]
- ok, marc, I now associate you with [IPcaller]
- 17:00:47 [burn]
- zakim, [Microsoft] is Robert_Brown
- 17:00:47 [Zakim]
- +Robert_Brown; got it
- 17:01:06 [burn]
- zakim, [IPcaller] is Marc_Schroeder
- 17:01:06 [Zakim]
- +Marc_Schroeder; got it
- 17:03:44 [burn]
- Scribe: Robert_Brown
- 17:03:54 [burn]
- ScribeNick: Robert
- 17:06:52 [Robert]
- topic: last week's minutes
- 17:07:10 [Robert]
- Dan: (no comments) last week's minutes approved
- 17:07:43 [Robert]
- topic: comments on the newest version of the requirements draft
- 17:07:50 [Robert]
- Dan: no comments
- 17:08:17 [Robert]
- topic: require encryption http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0099.html
- 17:09:12 [Robert]
- michael: not much mail on this, Bjorn agreed in mail, no other mail comments. seems reasonable
- 17:09:22 [mbodell_]
- proposed req: Web application must be able to encrypt communications to remote speech service
- 17:09:26 [Robert]
- Dan: asked for objections, no objections voiced
- 17:09:51 [Robert]
- topic: require best practices http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0107.html
- 17:10:42 [Robert]
- Milan: not sure we're aligned on the emphasis behind this requirement. maybe should put it on hold. some people are prioritising schedule ahead of features.
- 17:10:57 [Robert]
- ...: put it on hold and see how the other issues we discuss this week play out
- 17:11:17 [mbodell_]
- s/...:/... /
- 17:11:49 [burn]
- zakim, nick mbodell_ is Michael_Bodell
- 17:11:49 [Zakim]
- ok, burn, I now associate mbodell_ with Michael_Bodell
- 17:12:09 [Robert]
- Bjorn: has anybody had experience where this sort of requirement is needed? it seems redundant
- 17:12:10 [Zakim]
- -Bjorn_Bringert
- 17:12:26 [bringert]
- I got disconnected
- 17:12:53 [Zakim]
- +Bjorn_Bringert
- 17:12:54 [burn]
- zakim, nick Milan is Milan_Young
- 17:12:55 [Zakim]
- ok, burn, I now associate Milan with Milan_Young
- 17:13:07 [burn]
- zakim, nick bringert is Bjorn_Bringert
- 17:13:07 [Zakim]
- ok, burn, I now associate bringert with Bjorn_Bringert
- 17:13:08 [Robert]
- Dan: sometimes to prevent avoiding certain architectures
- 17:13:44 [burn]
- zakim, nick ddahl is Debbie_Dahl
- 17:13:44 [Zakim]
- ok, burn, I now associate ddahl with Debbie_Dahl
- 17:13:46 [Robert]
- Milan: intended to avoid the sessions/sockets issue. but let's get on with discussing the other topics and get back to this one
- 17:14:10 [Robert]
- topic: require support for text interpretation http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0122.html
- 17:14:40 [Robert]
- Bjorn: i wouldn't consider it high priority, but okay keeping it for now
- 17:14:50 [Robert]
- Dan: this is certainly in scope
- 17:15:26 [Robert]
- Bjorn: it's already possible and doesn't need a new requirement. just use an xmlhttp request.
- 17:15:38 [Robert]
- Dan: there may be some benefit to having a unified approach
- 17:16:00 [Robert]
- Bjorn: agreed there's a benefit but not high priority
- 17:16:29 [Robert]
- Dan: looks like we have consensus on keeping it
- 17:16:34 [mbodell_]
- proposed req: Web applications must be able to request NL interpretation based only on text input (no audio sent).
- 17:16:59 [Robert]
- topic: re-recognition http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0133.html
- 17:17:22 [Robert]
- Michael: a fair bit of discussion in mail, but it seems people are okay keeping this
- 17:18:07 [Robert]
- Bjorn: okay to have as a requirement, lower priority, if I was making the proposal I wouldn't add it because of the added complexity
- 17:18:26 [mbodell_]
- proposed req: Web applications must be able to request recognition based on previously sent audio.
- 17:18:43 [Robert]
- Michael: no objections? [resounding silence...]
- 17:19:09 [Robert]
- dan: consensus
- 17:19:22 [Robert]
- topic: concept of session http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0130.html
- 17:19:41 [Robert]
- Michael: discussion on whether we need it and whether cookies support it?
- 17:19:53 [Robert]
- Milan: not thrilled, but okay to call this one good enough
- 17:20:07 [Robert]
- ... cookie gets 90% of use cases
- 17:20:37 [Robert]
- Bjorn: do you want to add a requirement like existing mechanisms should be used to manage sessions or something like that
- 17:20:46 [Robert]
- Milan: how about the way it's worded now?
- 17:21:00 [Robert]
- Bjorn: text in original email is okay with me
- 17:21:06 [burn]
- burn has changed the topic to: #htmlspeech agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0144.html (burn)
- 17:21:24 [Robert]
- Olli: okay with me too
- 17:21:50 [Milan]
- Robert nervous about definition of word session
- 17:21:57 [burn]
- robert: wants to confirm meaning of "session". different from what we do in web apps?
- 17:22:40 [burn]
- robert: is there any use case?
- 17:22:56 [burn]
- bjorn: yes. could consider a speech API that does not pass on cookies that are set
- 17:23:15 [burn]
- milan: e.g. a native agent proposal. user agent would be required to tack on cookies
- 17:23:32 [burn]
- robert: can live with this. details will become apparent with the proposals
- 17:23:56 [burn]
- bjorn: IETF specs use the notion of "stateful session" when discussing cookies
- 17:24:01 [mbodell_]
- proposed req: Web application and speech services must have a means of binding session information to communications.
- 17:24:09 [Robert]
- michael: sounds like we have consensus
- 17:24:59 [Robert]
- topic: modify FPR30 to remove "UA" http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0111.html
- 17:25:47 [Robert]
- Bjorn: okay with Milan's restatement in mail
- 17:25:59 [Robert]
- Michael: concerned that this breaks our privacy requirements
- 17:26:15 [Robert]
- Milan: but that's broken (paraphrase)
- 17:26:38 [Robert]
- Michael: if I'm the only one who's nervous I'm okay taking Milan's text
- 17:27:15 [Robert]
- Bjorn: if those mechanisms don't satisfy privacy requirements, we can look at improving them.
- 17:28:28 [Robert]
- Marc: is it part of our specification to make a position on who does it?
- 17:28:56 [Robert]
- Bjorn: xmlhttp talks about web app but implies UA requirements
- 17:29:34 [Robert]
- Michael: objections?
- 17:29:58 [Robert]
- Dan: nervous but won't object. in prioritisation we may need to be more precise
- 17:30:01 [mbodell_]
- proposed change: fpr30 becomes Web applications must be allowed at least one form of communication with a particular speech service that is supported in all UAs.
- 17:30:18 [marc]
- my question was about confirming that at this stage we are not taking any decision how the communication between the web app and the speech service is realised, whether the UA plays a standardised role or not.
- 17:30:19 [Robert]
- Dan: agreed, move on
- 17:30:30 [marc]
- confirmed that this decision is *not* taken at this stage.
- 17:30:41 [marc]
- the new requirement is better because it makes this less explicit.
- 17:30:53 [Robert]
- topic: cancelling requests. http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0134.html
- 17:31:21 [Robert]
- Bjorn: besides efficiency, are there any reasons to add the requirement?
- 17:31:52 [Robert]
- Michael: existing requirements relate to this (barge-in)
- 17:32:23 [Robert]
- Milan: it's efficiency. but if you were going to do real barge-in in most of your transactions, it would be an issue
- 17:32:51 [Robert]
- Bjorn: if the client wants to stop sending audio, it can send a marker saying it's done
- 17:32:59 [Robert]
- Milan: that's what I'm asking for
- 17:33:19 [Robert]
- Bjorn: sender cancelling is easy with HTTP. receiver cancelling is difficult
- 17:33:44 [Robert]
- Milan: how would end of speech be indicated
- 17:34:02 [Robert]
- Bjorn: some sort of end-of-audio packet, which handles the sender cancelling
- 17:35:03 [Robert]
- ... why do we need this?
- 17:35:25 [Robert]
- Milan: the user agent may not be able to detect when done
- 17:35:37 [Robert]
- Bjorn: would server or client do that?
- 17:35:45 [Robert]
- Milan: the client
- 17:36:27 [Anthapu]
- Anthapu has joined #htmlspeech
- 17:36:27 [Robert]
- Bjorn: should split into two discussions: 1 client aborting recognition (fine and required and trivial); 2 client aborting synthesis
- 17:37:03 [Robert]
- ... implied by FPR17
- 17:37:14 [Robert]
- Michael: that says the user can abort it
- 17:37:42 [Robert]
- Bjorn: need a separate requirement that web application should be able to cancel audio capture
- 17:37:54 [Zakim]
- + +1.732.507.aabb
- 17:38:07 [Robert]
- Marc: we used the term "abort" intentionally, with privacy concerns in mind
- 17:38:24 [Robert]
- Bjorn: duplicate FPR17, replacing user with web app
- 17:38:29 [mbodell_]
- proposed new req: While capture is happening, there must be an obvious way for the web application to abort the capture and recognition process.
- 17:38:50 [mbodell_]
- s/obvious //
- 17:39:01 [mbodell_]
- s/an way/a way/
- 17:39:08 [Robert]
- Bjorn: fine with what Michael typed
- 17:39:35 [Robert]
- ... [no other objections] let's move on to synthesis
- 17:40:13 [Robert]
- ... client wants to abort playing of long synthesized speech. if there's no way for the client to signal the server, the only option is to tear down the connection
- 17:40:28 [Robert]
- ... this may have latency implications to establish a new connection
- 17:41:27 [Robert]
- Milan: there's a lot of work that goes into establishing a TCP socket. Email triage is a good example. App reads a few sentences of a message then the user interrupts
- 17:42:10 [Robert]
- ... it would be awkward if the mail app just read the first sentence
- 17:42:46 [Robert]
- Bjorn: or the app could read a sentence at a time until it decides to move to the next message
- 17:43:24 [Robert]
- Milan: not asking for interruption (existing requirement), but to cancel it all the way to the server
- 17:43:45 [Robert]
- Bjorn: reluctant to add a requirement of going all the way to the server
- 17:44:14 [Robert]
- Bjorn: propose "web application must be able to abort TTS output"
- 17:44:30 [Robert]
- Milan: but Bjorn already has to do this for reco, why not TTS?
- 17:45:07 [Robert]
- Bjorn: reco is required, and the sender aborts by sending up a token. this is different, because the receiver is aborting
- 17:45:49 [Robert]
- Milan: but with reco, the server is sending back ack's while the client is speaking, so there is a bi-directional mechanism
- 17:46:07 [Robert]
- Bjorn: are you saying a bidirectional communication is already required?
- 17:46:26 [Robert]
- Milan: we have the requirement that speech has begun and streaming
- 17:46:33 [Robert]
- Bjorn: speech detection is done on the client
- 17:46:50 [Robert]
- Milan: nervous about detection in the client
- 17:47:17 [Robert]
- ... FPR21 apps should be notified when capture starts
- 17:47:52 [Robert]
- ... until we have reco, we can't say that speech has begun, and we can't do hotword from the client
- 17:48:01 [burn]
- zakim, who's noisy?
- 17:48:12 [Zakim]
- burn, listening for 10 seconds I could not identify any sounds
- 17:48:14 [Robert]
- Bjorn: notify -that- speech has begun, not -when- it has begun
- 17:48:31 [Zakim]
- -Marc_Schroeder
- 17:48:35 [Milan]
- Yep
- 17:49:01 [Zakim]
- +??P3
- 17:49:19 [burn]
- zakim, ??P3 is Marc_Schroeder
- 17:49:19 [Zakim]
- +Marc_Schroeder; got it
- 17:49:26 [Robert]
- Milan: this is part of the problem of not having detailed descriptions on this. I brought this up back in the F2F meeting, but didn't catch the nuance of the word "that"
- 17:49:32 [burn]
- zakim, nick marc is Marc_Schroeder
- 17:49:32 [Zakim]
- ok, burn, I now associate marc with Marc_Schroeder
- 17:50:07 [Robert]
- Bjorn: no assumption that detection runs on the client, but also no exclusion of this
- 17:50:20 [Robert]
- Milan: but if it runs on the server, then you need bi-direction communication
- 17:50:35 [Robert]
- ... and if so, it doesn't seem to be a stretch to say we need this for synthesis
- 17:51:02 [Robert]
- Bjorn: i agree with the analysis, but probably wouldn't propose an API for this
- 17:51:24 [Robert]
- Michael: we should agree on whether or not it's a requirement, then prioritise in the next stage
- 17:51:43 [mbodell_]
- proposed req: Web application must be able to programmatically abort TTS output.
- 17:52:22 [Robert]
- Bjorn: can we agree that it's a requirement for the web app to abort TTS, without any specific requirement on how this affects the server
- 17:52:35 [Robert]
- Milan: sounds fine
- 17:52:44 [Robert]
- Michael: (silence) sounds like we have consensus
- 17:53:23 [Robert]
- Bjorn: so the other requirement is that when the client aborts TTS, it should not need to tear down the connection
- 17:53:57 [Robert]
- Marc: is this about functionality or efficiency? if it's about efficiency, the discussion should occur later, when we discuss implementation
- 17:54:12 [Robert]
- Milan: but it's so fundamental it would be crippling not to have this
- 17:54:30 [Robert]
- Bjorn: how about "aborting TTS should be efficient"?
- 17:54:34 [Robert]
- Milan: okay
- 17:54:48 [mbodell_]
- proposed req: Aborting the synthesis should be efficient.
- 17:55:02 [Robert]
- Michael: sounds like we have consensus
- 17:55:19 [Robert]
- Bjorn: "TTS output" rather than "synthesis"
- 17:56:21 [Robert]
- ... one is the effect on the user experience, the other is the effect on efficiency
- 17:56:26 [mbodell_]
- s/the synthesis/the TTS output/
- 17:57:16 [Robert]
- topic: discussion about API, device tag, etc http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0142.html
- 17:57:33 [Robert]
- Michael: is there a set of requirements out of that discussion?
- 17:57:53 [Robert]
- Bjorn: no it's a proposal
- 17:58:17 [Robert]
- Milan: it shows a lot of promise and if we started early we could get done sooner
- 17:58:31 [Robert]
- Bjorn: there's some serious politics going on there
- 17:59:30 [Robert]
- Michael: WHATWG doesn't really represent all browser manufacturers
- 18:00:32 [Robert]
- Milan: could the audio working group handle this?
- 18:00:44 [Robert]
- Michael: they're more about mixing and analysis, rather than capture
- 18:01:47 [Robert]
- ... IE wouldn't tackle this area until it's under some w3c group
- 18:02:11 [Robert]
- Milan: it would be in our group's interest to get some sort of audio capture API into HTML
- 18:02:39 [Robert]
- oops, that should have been Bjorn
- 18:03:19 [Robert]
- Michael: UI is geared around web cam capture
- 18:03:42 [Robert]
- Milan: people have been working on audio capture since 2005, and we only started this year
- 18:03:49 [Robert]
- Michael: but the use cases are different
- 18:04:10 [Robert]
- Bjorn: is there an audio chat scenario?
- 18:05:41 [Robert]
- Bjorn: could we specify an API required for speech without it being general purpose?
- 18:05:54 [Robert]
- Michael: we should propose what we need and explain why we need it
- 18:06:53 [Robert]
- Bjorn: if we don't have a general API for app-specified network recognition, we can still have reco with the default recognizer
- 18:07:45 [Robert]
- Olli: would it be easiest to co-author it with the whatwg and then propose that the HTML wg pick it up
- 18:07:53 [Robert]
- Bjorn: that's my preference
- 18:08:37 [Robert]
- Marc: if the browser captured audio according to the requirements for speech recognition, then we wouldn't need any specific device API
- 18:09:11 [Robert]
- Michael: an alternative is to finish discussing requirements, then look at proposals, for which there may be a spectrum of approaches
- 18:09:27 [Robert]
- Bjorn: there's no reason to exclude a particular approach at this point
- 18:09:53 [Robert]
- Milan: concerned that the device API has promise, and if we don't work together it won't happen
- 18:12:26 [Robert]
- Marc: we're expected to look at the pros and cons of various options and maybe make a decision, or if not, at least recommend options
- 18:13:54 [Robert]
- Dan: people can propose more requirements later on, but we should move on to prioritization
- 18:15:22 [Robert]
- ... begin prioritization in January, but between now and then, review the requirements and talk about those you don't feel are clear enough for you to prioritize
- 18:18:32 [Robert]
- Michael: please send description text where you think it's missing
- 18:21:26 [Robert]
- Milan: would prefer that the chairs propose a description and participants riff on that
- 18:23:27 [Robert]
- Dan: prioritization is a function that will naturally work out issues at the next level of detail
- 18:24:09 [Robert]
- ... So the first thing people should do is review the requirements, and if you can't prioritize, start a conversation
- 18:24:53 [Robert]
- Michael: I will send out another update soon, and you'll have a couple of weeks to review as Dan suggests
- 18:25:08 [Robert]
- Milan: it'll be chaos. 50 requirements. 6 groups here
- 18:26:39 [Robert]
- Dan: if this turns out to not work, we'll change strategies
- 18:27:05 [Robert]
- ... but I think we'll probably have a very small number of threads
- 18:28:04 [Robert]
- ... Plan to have calls at the same timeslot in January, in case we need them
- 18:29:06 [Robert]
- Marc: Michael, could you restructure the list of requirements by topic?
- 18:29:35 [Robert]
- Michael: will move section 3 to an appendix, and can potentially reorder section 4. I'll make an attempt
- 18:30:08 [Zakim]
- - +1.732.507.aabb
- 18:30:09 [Robert]
- ... I'll see what factors out
- 18:31:04 [Robert]
- Great work everybody!
- 18:32:27 [Zakim]
- -Marc_Schroeder
- 18:32:28 [Zakim]
- -Olli_Pettay
- 18:32:29 [Zakim]
- -Milan_Young
- 18:32:30 [Zakim]
- -Bjorn_Bringert
- 18:32:30 [Zakim]
- -Debbie_Dahl
- 18:32:32 [Zakim]
- -Michael_Bodell
- 18:32:33 [Zakim]
- -Dan_Burnett
- 18:32:39 [Zakim]
- -Robert_Brown
- 18:32:40 [Zakim]
- INC_(HTMLSPEECH)12:00PM has ended
- 18:32:42 [Zakim]
- Attendees were Michael_Bodell, Olli_Pettay, Milan_Young, Bjorn_Bringert, Dan_Burnett, Debbie_Dahl, Robert_Brown, Marc_Schroeder, +1.732.507.aabb
- 18:32:49 [marc]
- marc has left #htmlspeech
- 18:33:15 [burn]
- zakim, bye
- 18:33:15 [Zakim]
- Zakim has left #htmlspeech
- 18:33:21 [burn]
- rrsagent, make log public
- 18:33:29 [burn]
- rrsagent, draft minutes
- 18:33:29 [RRSAgent]
- I have made the request to generate http://www.w3.org/2010/12/16-htmlspeech-minutes.html burn
- 18:33:37 [ddahl]
- ddahl has left #htmlspeech
- 18:46:29 [smaug_]
- smaug_ has joined #htmlspeech
- 18:57:27 [smaug_]
- smaug_ has joined #htmlspeech