Each new device manufacturer builds its own browser to suit existing Web content.
This works so far, but let's look at something more complex: "Google Maps in my car."
Google Maps in my car. I want my car navigation system to use Google Maps. Requirements:
Now:
For input we need: Grammars, Integration, Interfaces
Interfaces defined in IDL can be used directly from JavaScript; interfaces defined in WSDL can be used as Web Services
Generic: register, setGrammar, setModel, getData, prompt, pause, events
Specialised: sendVoiceXML
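A minimal sketch of how a modality component exposing the generic interface might look from JavaScript. Only the method names (register, setGrammar, getData, pause) come from the list above; the class name, the signatures, the event shape and the deliver helper are assumptions made for illustration.

```javascript
// Hypothetical modality component. Signatures are assumptions;
// only the method names are taken from the generic interface list.
class ModalityComponent {
  constructor(mode) {
    this.mode = mode;       // e.g. "speech" or "ink"
    this.grammar = null;    // URI of the active grammar
    this.paused = false;
    this.listeners = [];
    this.lastData = null;
  }
  // Register a callback that receives events from this component.
  register(listener) { this.listeners.push(listener); }
  // Select the grammar that constrains recognition.
  setGrammar(uri) { this.grammar = uri; }
  // Stop delivering input until further notice.
  pause() { this.paused = true; }
  // Return the most recent recognised input.
  getData() { return this.lastData; }
  // Illustration only: simulates the recogniser producing a result.
  deliver(data) {
    if (this.paused) return false;
    this.lastData = data;
    for (const listener of this.listeners) {
      listener({ type: "data", mode: this.mode, data: data });
    }
    return true;
  }
}

const speech = new ModalityComponent("speech");
const received = [];
speech.register((event) => received.push(event));
speech.setGrammar("names.grxml"); // hypothetical grammar file
speech.deliver("Michael");
```

The same shape could sit behind a Web Service when the interface is described in WSDL rather than IDL.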
Speech Recognition Grammar Specification / Semantic Interpretation for Speech Recognition
<one-of>
  <item>Michael</item>
  <item>Yuriko</item>
  <item>Mary</item>
  <item>Duke</item>
  <item><ruleref uri="#otherNames"/></item>
</one-of>

<one-of><item>1</item> <item>2</item> <item>3</item></one-of>

<one-of>
  <item weight="10">small</item>
  <item weight="2">medium</item>
  <item>large</item>
</one-of>

<one-of>
  <item weight="3.1415">pie</item>
  <item weight="1.414">root beer</item>
  <item weight=".25">cola</item>
</one-of>
<ink>
  <trace> 10 0 9 14 8 28 7 42 6 56 6 70 8 84 8 98 8 112 9 126 10 140 13 154 14 168 17 182 18 188 23 174 30 160 38 147 49 135 58 124 72 121 77 135 80 149 82 163 84 177 87 191 93 205 </trace>
  <trace> 130 155 144 159 158 160 170 154 179 143 179 129 166 125 152 128 140 136 131 149 126 163 124 177 128 190 137 200 150 208 163 210 178 208 192 201 205 192 214 180 </trace>
  <trace> 227 50 226 64 225 78 227 92 228 106 228 120 229 134 230 148 234 162 235 176 238 190 241 204 </trace>
  <trace> 282 45 281 59 284 73 285 87 287 101 288 115 290 129 291 143 294 157 294 171 294 185 296 199 300 213 </trace>
  <trace> 366 130 359 143 354 157 349 171 352 185 359 197 371 204 385 205 398 202 408 191 413 177 413 163 405 150 392 143 378 141 365 150 </trace>
</ink>
<emma:emma emma:version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma#">
  <emma:one-of emma:id="r1" emma:start="2003-03-26T0:00:00.15" emma:end="2003-03-26T0:00:00.2">
    <emma:interpretation emma:id="int1" emma:confidence="0.75" >
      <origin>Boston</origin>
      <destination>Denver</destination>
      <date>
        <emma:absolute-timestamp emma:start="2003-03-26T0:00:00.15" emma:end="2003-03-26T0:00:00.2"/>
        03112003
      </date>
    </emma:interpretation>
    <emma:interpretation emma:id="int2" emma:confidence="0.68" >
      <origin>Austin</origin>
      <destination>Denver</destination>
      <date>03112003</date>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>
<xsl:if test="@emma:confidence &gt; 0.4">
  <xsl:copy-of select="."/>
</xsl:if>
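The same confidence filter can be sketched in JavaScript once the EMMA result has been parsed. The objects below mirror the two interpretations in the EMMA example (confidence 0.75 and 0.68, on a 0 to 1 scale); the function names and the threshold of 0.7 are assumptions chosen for illustration.

```javascript
// Interpretations as plain objects, mirroring the EMMA example.
const interpretations = [
  { id: "int1", confidence: 0.75, origin: "Boston", destination: "Denver" },
  { id: "int2", confidence: 0.68, origin: "Austin", destination: "Denver" },
];

// Keep only interpretations whose confidence exceeds the threshold.
function filterByConfidence(items, threshold) {
  return items.filter((item) => item.confidence > threshold);
}

// Pick the single most confident interpretation (assumes a non-empty list).
function mostConfident(items) {
  return items.reduce((best, item) =>
    item.confidence > best.confidence ? item : best);
}

const accepted = filterByConfidence(interpretations, 0.7); // hypothetical threshold
const top = mostConfident(interpretations);
```

An interaction manager might act directly on a single accepted interpretation and ask the user to confirm when several remain or none pass the threshold.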
<par>
  <par>
    <ref id="input1" mode="ink" grammar="select.ink" begin="activateEvent"/>
    <ref id="timeout1" dur="2s" begin="input1.activateEvent"/>
  </par>
  <excl end="timeout2.end">
    <priorityClass peers="pause">
      <ref id="timeout2" end="timeout1.end"/>
      <ref id="speech1" mode="speech" grammar="print.grm" begin="activateEvent"/>
    </priorityClass>
  </excl>
</par>
The Dynamic Properties Framework (DPF) specification defines an API to access system properties, for example:
<html>
  <head>
    <title>GPS location example</title>
    <script type="text/javascript">
      <![CDATA[
        SystemEnvironment.location.format="zip code";
        SystemEnvironment.location.updateFrequency="20s";
      ]]>
    </script>
    <script defer="defer" type="text/javascript" ev:event="se:locationUpdate">
      <![CDATA[
        var field = document.getElementById("location");
        var zipcode = SystemEnvironment.location;
        field.childNodes[0].nodeValue = zipcode;
      ]]>
    </script>
  </head>
  <body>
    <h1>Track your location as you walk</h1>
    <p>Your current zip code is: <span id="location">(please wait)</span></p>
  </body>
</html>
The manager...
...
...and shapes the interaction accordingly:
Could be code (JavaScript using the APIs mentioned above) or declarative interaction markup like VoiceXML, with a mapping to API calls (with XBL)
Existing web pages and applications will still work but won't provide:
So extensions will be useful.
Historically precedes the MMI Framework
A specific framework
Now integrates into MMI
VoiceXML2: one of W3C's most successful specifications
Simple form-filling applications on the phone
<form>
  <field name="adjustment_amount">
    <grammar type="application/srgs+xml" src="/grammars/currency.grxml"/>
    <prompt>
      What is the value of your account adjustment?
    </prompt>
    <filled>
      <submit next="/cgi-bin/updateaccount"/>
    </filled>
  </field>
</form>
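The form-filling behaviour behind such a field can be sketched in JavaScript: prompt, listen, and repeat until the answer matches, then submit. This is a simplified illustration and not the VoiceXML form interpretation algorithm; runForm, the accepts check, the retry limit of three and the simulated caller are all assumptions.

```javascript
// Hypothetical form-filling loop: prompt for each field until the
// recogniser returns an answer the field accepts, or give up.
function runForm(fields, recognise) {
  const values = {};
  const prompts = [];
  for (const field of fields) {
    let answer = null;
    let attempts = 0;
    while (answer === null && attempts < 3) { // retry limit is an assumption
      prompts.push(field.prompt);
      const heard = recognise(field.name, attempts);
      if (field.accepts(heard)) answer = heard;
      attempts += 1;
    }
    if (answer === null) {
      return { submitted: false, values: values, prompts: prompts };
    }
    values[field.name] = answer;
  }
  // All fields filled: this is where a real platform would submit.
  return { submitted: true, values: values, prompts: prompts };
}

const fields = [{
  name: "adjustment_amount",
  prompt: "What is the value of your account adjustment?",
  // Stand-in for the currency grammar: digits with optional cents.
  accepts: (s) => /^\d+(\.\d{2})?$/.test(s),
}];

// Simulated caller: unintelligible first, then a valid amount.
const utterances = ["uh", "25.00"];
const result = runForm(fields, (name, attempt) => utterances[attempt]);
```

In the markup version the grammar element plays the role of accepts, and the filled element plays the role of the final submit step.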
a standard for telephony platforms
handles events (e.g. incoming calls)
makes outgoing calls, sets up conference calls, starts (VoiceXML) dialogues
Harel State Tables: a general interaction management paradigm
SCXML provides markup for HST
Can plug in to CCXML, or drive VoiceXML dialogs or MMI interaction
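A minimal sketch of the state-table idea in JavaScript, driving a call from ringing to dialogue. It is a flat transition table only: it leaves out the nested and parallel states that Harel's formalism allows. The state names, event names and createMachine helper are illustrative assumptions and are not taken from any specification.

```javascript
// Transition table: state -> event -> next state (names are illustrative).
const transitions = {
  idle:     { "call.incoming": "alerting" },
  alerting: { "call.accepted": "inDialog", "call.hangup": "idle" },
  inDialog: { "dialog.done": "idle", "call.hangup": "idle" },
};

// Build a machine that follows the table and ignores unknown events.
function createMachine(table, initial) {
  let state = initial;
  return {
    get state() { return state; },
    send(event) {
      const next = (table[state] || {})[event];
      if (next !== undefined) state = next;
      return state;
    },
  };
}

const call = createMachine(transitions, "idle");
call.send("call.incoming");
call.send("dialog.done");   // no such transition from "alerting": ignored
call.send("call.accepted");
```

Keeping the transitions as data rather than code is what lets the same control logic be written as markup and handed to an interpreter.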
How do the Voice Interface Framework specifications fit into the MMI architecture?
New multimodal devices can make a better Web experience
The MMI Framework generalises the standard visual browser model
The MMI Framework generalises the standard voice browser model
New specifications needed, for:
MMI page: w3.org/2002/mmi
VBWG page: w3.org/voice
This presentation: w3.org/2005/Talks/1111-maxf-delhi