Authoring Applications for the Multimodal Architecture

1 Introduction

The W3C Multimodal Interaction (MMI) Working Group develops an architecture [MMI-ARCH] for the Multimodal Interaction framework [MMIF]. The Multimodal Architecture describes, in an abstract way, a general and flexible framework for interoperability of the various components of the multimodal framework (e.g. modality components and the interaction manager). Among other things, it defines interfaces and messages between the constituents of the framework, but in a distributed implementation it is up to the implementation to decide how these messages are transferred.

The intention of this document is to provide a proposal of how to implement a multimodal runtime environment as well as an application based on the W3C Multimodal Architecture using existing W3C technologies. This proposal uses CCXML and VoiceXML to implement a voice modality component. Note that this is just one possibility for implementing it.

Note: The W3C Voice Browser Working Group is currently developing VoiceXML 3.0 which is the next major release of VoiceXML and will enable voice browsers to fit into the W3C Multimodal Architecture as a modality component. As VoiceXML 3.0 implementations are not yet available, this document relies on the existing VoiceXML 2.1 specification [VoiceXML].

The Multimodal Interaction Working Group itself wants to learn from this authoring example where improvements are possible and necessary. We also intend to present how we think that multimodal applications will be authored in the future.

2 Overview

The W3C Multimodal Architecture consists of the following main constituents (see also: MMI Runtime Architecture Diagram):

Runtime Framework containing the Interaction Manager.
Modality components.

In this document we discuss a distributed implementation of the multimodal framework using the following components and technologies:

Interaction Manager (IM) based on a state machine, described using [SCXML].
GUI Modality component based on [XHTML] and ECMAScript.
Voice Modality component based on [VoiceXML] and [CCXML].
Modality component API based on HTTP (event transport) and XML (event representation).

The following figure shows all these components mapped to the MMI Runtime Architecture Diagram:

The dashed boxes correspond to (logical) components within the MMI architecture whereas solid lines correspond to actual software or hardware components used to implement the system.

The voice input/output device shown in the figure above may be a regular (mobile) phone or a Voice-over-IP (soft) phone. In any case a phone connection to a standard voice browser is used.

3 Implementation of the components

This section discusses one possible implementation of the Multimodal Architecture.

3.1 The Runtime Framework

The Runtime Framework provides the environment which hosts the SCXML interpreter. It has to provide an interface to receive events from external components (modality components) and must be able to inject these events into an existing SCXML session or to start new SCXML interpreter sessions. The Runtime Framework also needs to be able to send events to external components (i.e. it needs some implementation of the SCXML <send> tag). In the future this feature might be covered by the "external communications module" of the SCXML specification ([SCXML]).

An implementation of an SCXML interpreter written in Java is available as open source from the Apache Software Foundation [Apache Commons SCXML]. One possibility for implementing a simple runtime framework could be to combine the Apache Commons SCXML library [Apache Commons SCXML] with a J2EE servlet engine (e.g. [Apache Tomcat]). The servlet engine would be used to implement the HTTP I/O processor.

Even though HTTP might not be the most efficient transport protocol, it still has some advantages. It is widely used and available in nearly every programming language. In a distributed scenario, where the Interaction Manager (i.e. Runtime Framework) and the modality components are spread across the network, proxy and firewall problems are easy to solve. Also, our intended modality components (HTML browsers for the graphical modality and VoiceXML browsers for the voice modality) inherently support HTTP. Therefore we use HTTP for this proof-of-concept implementation proposal. Other, more scalable solutions might make use of other protocols.

The Runtime Framework provides the I/O processor which receives HTTP requests from modality components (containing XML based life-cycle event representation). Based on the event semantics the Runtime Framework logic has either to start a new SCXML interpreter instance (when receiving a mmi:newContextRequest message) or to inject an event into a running SCXML interpreter instance.
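
The decision described above can be sketched in ECMAScript. This is only an illustration under stated assumptions: the function createDispatcher(), the simplified event objects ({ name, source, context }) and the session interface with an inject() method are hypothetical and not defined by [MMI-ARCH] or [SCXML]. In a servlet-based framework the same logic would sit behind the HTTP I/O processor.

```javascript
// Hypothetical sketch of the Runtime Framework dispatch logic.
// createSession(context) stands in for starting an SCXML interpreter
// instance; it must return an object with an inject(event) method.
function createDispatcher(createSession) {
  var sessions = new Map(); // context id -> running session
  var nextContextId = 1;    // simple counter used to mint context ids

  return function dispatch(event) {
    if (event.name === "mmi:newContextRequest") {
      // start a new interpreter session and report its context id
      var context = "ctx-" + nextContextId++;
      sessions.set(context, createSession(context));
      return { name: "mmi:newContextResponse", context: context, status: "success" };
    }

    // any other event is injected into the session it belongs to
    var session = sessions.get(event.context);
    if (!session) {
      return { status: "failure", reason: "unknown context" };
    }
    session.inject(event);
    return null; // nothing to answer synchronously
  };
}
```

A caller would pass each decoded life-cycle event to dispatch() and turn a non-null return value into the HTTP response.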

In this scenario the Runtime Framework acts, in terms of transport, as an HTTP server which receives HTTP requests, and the modality components are HTTP clients which send HTTP requests to the Runtime Framework. Sending events from modality components to the Interaction Manager is therefore relatively easy to implement using existing technologies.

The multimodal runtime architecture also requires events to be sent asynchronously from the Interaction Manager to the modality components. To be able to leverage standard components like HTML and VoiceXML browsers as modality components (or modality component containers), events should still be transferred using HTTP, as HTML and VoiceXML browsers support HTTP natively. These browsers act as HTTP clients only, however, so the Interaction Manager retains the role of the HTTP server. According to the HTTP model, the client has to initiate requests. To enable the Interaction Manager to send events to a modality component, the modality component therefore has to send HTTP requests to the Interaction Manager to ask for events. This technique is usually known as polling. Simple implementations have obvious drawbacks (e.g. increased network traffic and additional delay), but they can be optimized to some extent (e.g. by blocking the HTTP request on the server side and using timeouts). The technique certainly has limitations for large-scale implementations, but it is relatively easy to implement with existing technologies and is therefore a good choice for a proof of concept.
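
The optimization mentioned above (blocking the request on the server side and using a timeout) can be sketched as a small per-component mailbox. This is a minimal sketch, not part of any specification: EventMailbox, post() and poll() are hypothetical names, and the HTTP layer that would call them is left out.

```javascript
// Hypothetical mailbox holding events for one modality component.
function EventMailbox() {
  this.queue = [];    // events waiting for the next poll request
  this.waiter = null; // { deliver, timer } of a currently blocked poll
}

// Called when the Interaction Manager sends an event to the component.
EventMailbox.prototype.post = function (event) {
  if (this.waiter) {
    // a poll request is blocked: answer it immediately
    var waiter = this.waiter;
    this.waiter = null;
    clearTimeout(waiter.timer);
    waiter.deliver(event);
  } else {
    this.queue.push(event);
  }
};

// Called when a poll request arrives. deliver(event) is invoked with the
// next event, or with null if nothing arrived within timeoutMs.
EventMailbox.prototype.poll = function (deliver, timeoutMs) {
  if (this.queue.length > 0) {
    deliver(this.queue.shift());
    return;
  }
  if (this.waiter) {
    // only one blocked poll per component: release the older one
    clearTimeout(this.waiter.timer);
    this.waiter.deliver(null);
  }
  var self = this;
  this.waiter = {
    deliver: deliver,
    timer: setTimeout(function () {
      self.waiter = null;
      deliver(null);
    }, timeoutMs)
  };
};
```

With this shape the client issues its next request as soon as the previous one returns, so an event is normally delivered without waiting for a polling interval, and an empty (null) answer simply tells the client to poll again.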

Another promising approach could be [COMET], which uses long-lived HTTP connections to stream data to the client. Again, the client has to open the HTTP connection. The server streams an HTTP response to the client and leaves the connection open until the next event has to be sent. Many applications already use this server-push technology. Unfortunately, it is not yet well standardized and therefore requires browser-dependent implementations, but it is a potential solution for the required server-push channel.

3.2 GUI Modality Component

The GUI modality component may be implemented using HTML and JavaScript.

According to the rules defined for the Multimodal Architecture, the application logic resides within the Interaction Manager. Therefore the modality component has to send events (e.g. user-initiated events like click or change) to the Interaction Manager. The Interaction Manager decides on possible reactions to these events and sends events to the modality component to instruct it to execute some action (e.g. displaying something).

The modality component API may be implemented using [XMLHttpRequest] (also known as AJAX). Event handlers for user-initiated events, like change events for text input elements and click events for button elements, can easily convert these events into XML representations (the MMI life-cycle event representation, e.g. containing the values of input fields) and send them to the Interaction Manager using XMLHttpRequests.

The following code snippet demonstrates the principle of how to send events to a server side Interaction Manager (assuming a servlet at someURL) using ECMAScript and XMLHttpRequests:

/* The sendMmiLifecycleEvent() function sends the MMI lifecycle
   event, potentially containing data values like color. The implementation of
   this function is vendor specific. The function is called to send a life cycle 
   event to the Runtime Framework using AJAX. The parameter "payload" contains a life 
   cycle event object.
*/ 
function sendMmiLifecycleEvent(source, context, payload) 
{
  var xmlHttpRequest = new XMLHttpRequest();
  
  // relative url, assuming that AJAX requests go to a url 
  // relative to the documents url
  var url ="./someURL";

  var XMLpayload = payload.toXML(source, context);
  
  xmlHttpRequest.open("POST", url, true);
  xmlHttpRequest.onreadystatechange = readystatehandler;
  // the request body is an XML document, not form data
  xmlHttpRequest.setRequestHeader('Content-Type', 'application/xml');
  xmlHttpRequest.send(XMLpayload);
  
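  function readystatehandler()
  {
    // the handler fires for every state change; the status is only
    // meaningful once the request has completed (readyState 4)
    if (xmlHttpRequest.readyState != 4) {
      return;
    }
    if (xmlHttpRequest.status == 200 || xmlHttpRequest.status==304) {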
      // be quiet in case of success
      // alert("success");
    } else {
      // alert error
      alert("send failure");
    }
  }
}

// JavaScript Event (pseudo) object
function LifeCycleEvent(mmiEvType, eventType, fieldName, fieldValue)
{
  this.mmiEventType = mmiEvType;        // e.g. extension
  this.eventType = eventType;           // user initiated event, e.g. change
  this.fieldName = fieldName;           // e.g. HTML id of the field
  this.fieldValue = fieldValue;         // e.g. value of the field
}

// method of LifeCycleEvent object to generate XML string from its properties
LifeCycleEvent.prototype.toXML = function(source, context)
{
  var mmiLifeCycleEvent;
     
  mmiLifeCycleEvent  = '<mmi version="1.0" xmlns:mmi="http://www.w3.org/2008/04/mmi-arch">';
  mmiLifeCycleEvent += '  <mmi:' +  this.mmiEventType;
  mmiLifeCycleEvent += '  mmi:source="' +  source + '" mmi:context="' +  context + '">';
  mmiLifeCycleEvent += '  <mmi:data>';
  mmiLifeCycleEvent += '    <eventType>' + this.eventType + '</eventType>';
  mmiLifeCycleEvent += '    <fieldName>' + this.fieldName + '</fieldName>';
  mmiLifeCycleEvent += '    <fieldValue>' + this.fieldValue + '</fieldValue>';
  mmiLifeCycleEvent += '  </mmi:data>';
  mmiLifeCycleEvent += '  </mmi:' +  this.mmiEventType + '>';
  mmiLifeCycleEvent += '</mmi>';

  return mmiLifeCycleEvent;
}
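
The toXML() method above concatenates field values directly into the XML string, so a value containing characters such as & or < would produce a malformed event. The following sketch shows one way to guard against this; escapeXml() and dataElement() are hypothetical helpers that are not part of the original example.

```javascript
// Replace the five characters that are significant in XML markup.
// The ampersand must be handled first so that the entities produced
// by the later replacements are not escaped a second time.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds one child element of <mmi:data>
// with an escaped text value.
function dataElement(name, value) {
  return "<" + name + ">" + escapeXml(value) + "</" + name + ">";
}
```

Within toXML(), a line such as the one for fieldValue could then be written as dataElement('fieldValue', this.fieldValue), and the source and context attribute values could be passed through escapeXml() in the same way.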

As described above, receiving events from the Interaction Manager requires the modality component to send an HTTP request to the server (i.e. the Runtime Framework). The response contains an XML-encoded event which represents an MMI life-cycle event. An event indicating a change of the value of color would be represented as an MMI life-cycle event "mmi:extension" (see [MMI-ARCH]) and could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<mmi version="1.0" xmlns:mmi="http://www.w3.org/2008/04/mmi-arch">
  <mmi:extension mmi:source="" mmi:target="" mmi:context="">
    <mmi:contentURL href="someContentURI" max-age="" fetchtimeout="1s"/>
    <mmi:data>
    </mmi:data>
  </mmi:extension>
</mmi>

Note that the content of the <mmi:data> element is application specific.

This approach requires the modality component to send an asynchronous XMLHttpRequest to the Runtime Framework (expecting the request to be blocked at the server) and to interpret the response accordingly: it takes a local action based on the event semantics and then sends another request to the Runtime Framework to wait for the next event.

/* This function handles all incoming MMI lifecycle events. They may be fetched 
   from the server side interaction manager using AJAX. The returned XML document
   is the MMI lifecycle event.
*/
function handleIncomingMmiEvents(xml)
{ 
  // check if incoming message is MMI lifecycle event
  // perform a very simple check:
  if(xml.match("<mmi:")) 
  {
    // parse incoming xml string to DOM
    var parser = new DOMParser();
    var doc = parser.parseFromString(xml, "text/xml");
    
    var element = doc.documentElement;
    if(element.childNodes[0].nodeName=="mmi:newContextResponse")
    {
      _CONTEXT = element.childNodes[0].getAttribute("mmi:context");
    }
    else if(element.childNodes[0].nodeName=="mmi:extension")
    {
      if(element.childNodes[0].childNodes[0].nodeName=="mmi:data")
       {
        // Application specific extension
        // In this example we receive the name of a function and the params.
        // This has to be evaluated locally using eval().
        var functionname = element.childNodes[0].childNodes[0].childNodes[0].childNodes[0].nodeValue;
        var elementname = element.childNodes[0].childNodes[0].childNodes[1].childNodes[0].nodeValue;
        var elementvalue = element.childNodes[0].childNodes[0].childNodes[2].childNodes[0].nodeValue;
        eval(functionname + "(elementname ,elementvalue)");
      }
    }
    else if(element.childNodes[0].nodeName=="mmi:clearContextRequest")
    {
      // create new mmiLifeCycleEvent Object that signals the removal of the Context
      var lifeCycleEvent = new LifeCycleEvent("clearContextResponse", "", "", "");
      
      // send clearContextResponse lifecycle event
      sendMmiLifecycleEvent(_SOURCE, _CONTEXT, lifeCycleEvent);
    }
    else
    {
      // unknown lifecycle event
      alert("MMI lifecycle event not handled.");
    }

    // send HTTP request to server to receive lifecycle event.
    readMmiLifecycleEvent();
  }
  else // check if contains "<mmi:"
  {
    // --> it is not a valid lifecycle event!
    alert("Error: wrong message!");
  }
}
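
The handler above calls the function named in the event by means of eval(), which executes whatever name the server supplies. As an alternative sketch, assuming the set of callable functions is known in advance, the name can be looked up in a table of permitted functions instead. The names allowedFunctions and callExtensionFunction() are hypothetical, and _check is stubbed here so that the fragment runs outside a browser.

```javascript
// Functions the Interaction Manager is allowed to call, keyed by name.
// In the page, _check would tick a radio button; this stub only records
// the requested id.
var checked = [];
var allowedFunctions = {
  _check: function (elementName, elementValue) {
    checked.push(elementValue);
  }
};

// Look up functionName in the table and call it with the two parameters.
// Returns true if a function was called, false otherwise.
function callExtensionFunction(table, functionName, elementName, elementValue) {
  // only own properties that are functions may be called, so inherited
  // names such as "toString" are rejected
  if (Object.prototype.hasOwnProperty.call(table, functionName) &&
      typeof table[functionName] === "function") {
    table[functionName](elementName, elementValue);
    return true;
  }
  return false; // unknown function name: ignore rather than evaluate
}
```

In handleIncomingMmiEvents() the eval() line would then become a call such as callExtensionFunction(allowedFunctions, functionname, elementname, elementvalue).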

3.3 Voice Modality Component

The Voice Modality Component may be implemented using CCXML and VoiceXML 2.1.

VoiceXML 2.1 does not provide an external eventing functionality. CCXML 1.0, however, defines an external event interface (the Basic HTTP Event I/O Processor) which allows external events to be injected into a running CCXML session and new CCXML sessions to be started. CCXML will therefore be used as an event bridge between VoiceXML and the Interaction Manager. CCXML will receive events from the Interaction Manager and, depending on the event semantics, start a VoiceXML dialog.

VoiceXML will be used to implement the actual voice user interface (play prompt and control ASR). User input collected by VoiceXML will be returned to CCXML. CCXML has the ability to send HTTP requests to external components. This feature will be used to send events back to the multimodal runtime framework to inject events into the SCXML based Interaction Manager.

Because VoiceXML must return to CCXML (and hence exit) to return results (e.g. recognition results), the VoiceXML user interface has to be implemented as small independent scripts. Each script corresponds to a single action, such as playing a prompt or starting a grammar and listening for user input.

4 Initiating multimodal sessions

Now that we have described the basics of all constituents, we need to define the setup of a multimodal session.

A multimodal session may be initiated using a GUI modality component. The user starts a web browser and loads an HTML document from a given URL. Upon load, the HTML document registers the corresponding event handlers (e.g. for change events) and is able to send messages to the Interaction Manager using AJAX (i.e. XMLHttpRequests).

The HTML document may contain a special text input field which is used to collect the user's phone number or SIP URL. Once the user has entered this information, it is sent to the Interaction Manager (e.g. when the user presses a corresponding button). The Interaction Manager generates a message to the CCXML event processor to create a new CCXML session and to initiate a phone call to the given telephone number (or SIP URL).

As soon as the telephone connection has been established successfully the multimodal session is initiated. Now the Interaction Manager is capable of controlling the two modalities by sending life-cycle events.

5 Authoring example

This section ties together the previously described components to implement a sample application. The multimodal T-Shirt example contains a combined graphical and voice user interface and allows the user to fill in a form containing two fields (color and size) either by voice or by pen/keyboard.

The following figure shows the corresponding state machine logic for this example together with the MMI life-cycle events.

State machine logic and MMI life-cycle events

5.1 T-Shirt.scxml

The state machine could be represented in SCXML source code (T-Shirt.scxml) as follows:

<?xml version="1.0" encoding="UTF-8"?>
<scxml version="1.0" profile="ecmascript" initial="getColor" >
  <!-- we assume there is a script library which constructs MMI lifecycle events etc. -->
  <script src="mmi.js"/>
  
  <!-- datamodel definition -->
  <datamodel>
    <data id="color" expr=""/>
    <data id="size" expr=""/>
    <data id="received" expr="0"/>
  </datamodel>
  
  <!-- state getColor -->
  <state id="getColor">
    <onentry>
      <script>
        mmiEvent = new mmiStartRequest();
        mmiEvent.setURL('captureColorSize.html');
      </script>
      <!-- issue startRequest to GUI -->
      <send event="mmi:startRequest" target="GUI" targetType="x-ajax" namelist="mmiEvent"/>
      <script>
        mmiEvent = new mmiStartRequest();
        mmiEvent.setURL('getColor.vxml');
      </script>
      <!-- issue startRequest to VUI -->
      <send event="mmi:startRequest" target="VUI" targetType="basichttp" namelist="mmiEvent"/>
    </onentry>
    
    <!-- handle voice input -->
    <transition event="mmi:done" cond="_event.data..@source.toString() == 'VUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="echoColor">
       <!-- save color to data model -->
       <assign location="_data.color" expr="_event.data..color.toString()"/>
       <!-- send event to GUI to display information -->
       <script>
         mmiEvent = new mmiExtension();
         // construct content of data element of extension event as XML string
         dataFieldValue = "&lt;eventType&gt;_check&lt;/eventType&gt;";
         dataFieldValue += "&lt;fieldName&gt;color&lt;/fieldName&gt;";
         dataFieldValue += "&lt;fieldValue&gt;" + color + "&lt;/fieldValue&gt;";
         mmiEvent.setDataField(dataFieldValue);
       </script>
      <send event="mmi:extension" target="GUI" targetType="x-ajax" namelist="mmiEvent"/>
    </transition>    
    
    <!-- handle GUI input -->
    <transition event="mmi:extension" cond="_event.data..@source.toString() == 'GUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="echoColor">
       <!-- save color to data model -->
       <assign location="_data.color" expr="_event.data..color.toString()"/>
    </transition>    
    
    <!-- error handling -->
    <transition event="mmi:startResponse" cond="_event.data..@status.toString() == 'error'" target="failure"/>
    <transition event="mmi:done" cond="_event.data..@status.toString() == 'error'" target="failure"/>
  </state>
  
  <!-- state echoColor -->
  <state id="echoColor">
    <onentry>
      <!-- play back color to user -->
      <script>
        mmiEvent = new mmiStartRequest();
        mmiEvent.setURL('echoColor.vxml');
        // construct content of data element of extension event as XML string
        dataFieldValue = "&lt;color&gt;" + color + "&lt;/color&gt;";
        mmiEvent.setDataField(dataFieldValue);
      </script>
      <send event="mmi:startRequest" target="VUI" targetType="basichttp" namelist="mmiEvent"/>
    </onentry>

    <!-- play prompt done -->
    <transition event="mmi:done" cond="_event.data..@source.toString() == 'VUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="getSize"/>
        
    <!-- error handling -->
    <transition event="mmi:startResponse" cond="_event.data..@status.toString() == 'error'" target="failure"/>
    <transition event="mmi:done" cond="_event.data..@status.toString() == 'error'" target="failure"/>
  </state>
  
  <!-- state getSize -->
  <state id="getSize">
    <onentry>
      <script>
        mmiEvent = new mmiStartRequest();
        mmiEvent.setURL('getSize.vxml');
      </script>
      <!-- issue startRequest to VUI -->
      <send event="mmi:startRequest" target="VUI" targetType="basichttp" namelist="mmiEvent"/>
    </onentry>
    
    <!-- handle voice input -->
    <transition event="mmi:done" cond="_event.data..@source.toString() == 'VUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="echoSize">
       <!-- save size to data model -->
       <assign location="_data.size" expr="_event.data..size.toString()"/>
       <!-- send event to GUI to display information -->
       <script>
         mmiEvent = new mmiExtension();
         // construct content of data element of extension event as XML string
         dataFieldValue = "&lt;eventType&gt;_check&lt;/eventType&gt;";
         dataFieldValue += "&lt;fieldName&gt;size&lt;/fieldName&gt;";
         dataFieldValue += "&lt;fieldValue&gt;" + size + "&lt;/fieldValue&gt;";
         mmiEvent.setDataField(dataFieldValue);
       </script>
      <send event="mmi:extension" target="GUI" targetType="x-ajax" namelist="mmiEvent"/>
    </transition>    
    
    <!-- handle GUI input -->
    <transition event="mmi:extension" cond="_event.data..@source.toString() == 'GUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="echoSize">
       <!-- save size to data model -->
       <assign location="_data.size" expr="_event.data..size.toString()"/>
    </transition>    
    
    <!-- error handling -->
    <transition event="mmi:startResponse" cond="_event.data..@status.toString() == 'error'" target="failure"/>
    <transition event="mmi:done" cond="_event.data..@status.toString() == 'error'" target="failure"/>  
  </state>
  
  <!-- state echoSize -->
  <state id="echoSize">
    <onentry>
      <!-- play back size to user -->
      <script>
        mmiEvent = new mmiStartRequest();
        mmiEvent.setURL('echoSize.vxml');
        // construct content of data element of extension event as XML string
        dataFieldValue = "&lt;size&gt;" + size + "&lt;/size&gt;";
        mmiEvent.setDataField(dataFieldValue);
      </script>
      <send event="mmi:startRequest" target="VUI" targetType="basichttp" namelist="mmiEvent"/>
    </onentry>

    <!-- play prompt done -->
    <transition event="mmi:done" cond="_event.data..@source.toString() == 'VUI' &amp;&amp; 
         _event.data..@status.toString() == 'success'" target="endOfInteraction"/>
        
    <!-- error handling -->
    <transition event="mmi:startResponse" cond="_event.data..@status.toString() == 'error'" target="failure"/>
    <transition event="mmi:done" cond="_event.data..@status.toString() == 'error'" target="failure"/>  
  </state>
  
  <!-- state  endOfInteraction-->
  <state id="endOfInteraction">
    <onentry>
      <!-- number of received clearContextResponse messages, we are waiting for two -->
      <assign location="received" expr="0"/>
      
      <!-- issue clearContextRequest messages -->
      <script>
        mmiEvent = new mmiClearContextRequest();
      </script>
      <!-- issue clearContextRequest to GUI -->
      <send event="mmi:clearContextRequest" target="GUI" targetType="x-ajax" namelist="mmiEvent"/>
      <script>
        mmiEvent = new mmiClearContextRequest();
      </script>
      <!-- issue clearContextRequest to VUI -->
      <send event="mmi:clearContextRequest" target="VUI" targetType="basichttp" namelist="mmiEvent"/>
    </onentry>
    <transition event="mmi:clearContextResponse" cond="received == 0">
      <!-- increase counter -->
      <assign location="received" expr="1"/>
    </transition>
    <transition event="mmi:clearContextResponse" cond="received > 0" target="end"/>
  </state>
  
  <!-- state failure -->
  <state id="failure">
    <!-- simply stop interaction -->
    <transition target="endOfInteraction"/>
  </state>
  
  <!-- final state -->
  <state id="end" final="true"/>
</scxml>

In this example we assume that the Runtime Framework supports the x-ajax and basichttp values of the targetType attribute of the <send> tag. The GUI modality component uses AJAX to communicate with the Runtime Framework, so we use x-ajax as its targetType. The Voice modality component is implemented using CCXML/VoiceXML; because the external event interface of CCXML is used to inject events into the CCXML session, we use the basichttp targetType for it.
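
To make the control flow of T-Shirt.scxml easier to follow, it can also be sketched as a plain transition function in ECMAScript. This is an illustration only, not a replacement for the SCXML document: createTShirtMachine() is a hypothetical name, events are simplified to objects of the form { name, source, status }, and the sending of life-cycle events and the data model assignments are left out.

```javascript
// Sketch of the T-Shirt.scxml control flow as a transition function.
function createTShirtMachine() {
  var state = "getColor";
  var received = 0; // clearContextResponse messages seen so far

  // successor of each interaction state on a successful result
  var next = {
    getColor: "echoColor",
    echoColor: "getSize",
    getSize: "echoSize",
    echoSize: "endOfInteraction"
  };

  function handle(event) {
    if (state === "end") {
      return state;
    }
    if (state === "endOfInteraction") {
      // wait for one clearContextResponse per modality component
      if (event.name === "mmi:clearContextResponse") {
        received += 1;
        if (received >= 2) {
          state = "end";
        }
      }
      return state;
    }
    // error handling: the failure state leads straight to endOfInteraction
    if ((event.name === "mmi:startResponse" || event.name === "mmi:done") &&
        event.status === "error") {
      state = "endOfInteraction";
      return state;
    }
    var fromVoice = event.name === "mmi:done" && event.source === "VUI";
    var fromGui = event.name === "mmi:extension" && event.source === "GUI";
    // GUI input is only accepted while a field value is being collected
    var acceptsGui = state === "getColor" || state === "getSize";
    if (event.status === "success" && (fromVoice || (fromGui && acceptsGui))) {
      state = next[state];
    }
    return state;
  }

  return { handle: handle, current: function () { return state; } };
}
```

Feeding the machine a successful mmi:done from the VUI in each state walks it from getColor to endOfInteraction, and two mmi:clearContextResponse events then move it to end.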

5.2 captureColorSize.html

The following code fragment provides the basics of the HTML source code for the GUI modality component (i.e. captureColorSize.html):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<meta name="application" content="captureColorSize" />
<meta name="description" content="solicits value for size and color" />

<title>CaptureColorSize</title>

<script src="sendEvent.js" type="text/javascript"></script>
<script src="evaluateResponseXML.js" type="text/javascript"></script>

<!-- Event handlers that create and send external events -->

<script language="JavaScript" type="text/javascript">
   
  var _SOURCE="GUI";
  var _CONTEXT="";
     
  function onloadHandler()
  {
    // create new mmiLifeCycleEvent Object that requests a new Context
    var lifeCycleEvent = new LifeCycleEvent("newContextRequest", "", "", "");
    
    // send newContextRequest lifecycle event
    sendMmiLifecycleEvent(_SOURCE, _CONTEXT, lifeCycleEvent);
    
    // send HTTP request to server to receive lifecycle event.
    readMmiLifecycleEvent();
  }
  
  /* HTML event handler. The function makes use of the browser event object
    which holds the id and the value (in this case the color value) of the HTML object.
  */
  function eventHandler(event)
  {  
    // Internet Explorer has no attribute target and provides srcElement instead;
    // fall back to an empty object if neither is available
    var target = event.target || event.srcElement || {};
    
    var lifeCycleEvent = new LifeCycleEvent("extension", event.type, target.id, target.value);
    
    sendMmiLifecycleEvent(_SOURCE, _CONTEXT, lifeCycleEvent);
  }
    
  /* initiate AJAX request to the interaction manager to read the next MMI lifecycle
     event. The returned event is handled asynchronously within handleIncomingMmiEvents().
     Finally the next MMI lifecycle event is fetched.
  */
  function readMmiLifecycleEvent()
  { 
    // Start asynchronous XMLHttpRequest to receive MMI lifecycle event. 
    // We assume that the IM always returns a lifecycle event, i.e.
    // we do not handle special timeout events. Once the request returns, 
    // handleIncomingMmiEvents() will be called to evaluate the xml encoded event.
  
    var xmlHttpRequest = new XMLHttpRequest();
  
    // relative url, assuming that AJAX requests go to a url 
    // relative to the documents url
    var url ="./getMMILifeCycleEvent";
  
    xmlHttpRequest.open("GET", url, true);
    xmlHttpRequest.onreadystatechange = readyhandler;
    xmlHttpRequest.send(null);
  
    function readyhandler()
    {
      if (xmlHttpRequest.readyState == 4) {
        if (xmlHttpRequest.status == 200) {
          //handle lifecycle event
          handleIncomingMmiEvents(xmlHttpRequest.responseText);
        } else {
          // alert error
          alert("readMmiLifecycleEvent failure");
        }
      }
    }     
  }
    
  // function to check elements with the given html id, e.g. radio buttons
  function _check(elementname, elementvalue)
  {
     document.getElementById(elementvalue).checked=true;
  }     
</script>
</head>

<body id="bodyId" onload="onloadHandler();">

<form action="" name="Color" id="Color">T-shirt color:
<table width="200">
  <tr>
    <td>
      <label> <input type="radio" id="red"
      name="radioGroup1" value="Red" onclick="eventHandler(event);" /> 
      Red</label>
    </td>
  </tr>
  <tr>
    <td>
      <label> <input type="radio" id="green"
      name="radioGroup1" value="Green" onclick="eventHandler(event);" />
      Green</label>
    </td>
  </tr>
  <tr>
    <td>
      <label> <input type="radio" id="blue"
      name="radioGroup1" value="Blue" onclick="eventHandler(event);" />
      Blue</label>
    </td>
  </tr>
</table>
</form>

<form action="" name="Size" id="Size">T-shirt size:
<table width="200">
  <tr>
    <td><label> <input type="radio" id="small"
      name="radioGroup2" value="Small" onclick="eventHandler(event);" />
    Small</label></td>
  </tr>
  <tr>
    <td><label> <input type="radio" id="medium"
      name="radioGroup2" value="Medium" onclick="eventHandler(event);" />
    Medium</label></td>
  </tr>
  <tr>
    <td><label> <input type="radio" id="large"
      name="radioGroup2" value="Large" onclick="eventHandler(event);" />
    Large</label></td>
  </tr>
</table>
</form>

</body>
</html>

sendEvent.js:

/* The sendMmiLifecycleEvent() function sends the MMI lifecycle
   event, potentially containing data values like color. The implementation of
   this function is vendor specific. The function is called to send a life cycle 
   event to the Runtime Framework using AJAX. The parameter "payload" contains a life 
   cycle event object.
*/ 
function sendMmiLifecycleEvent(source, context, payload) 
{
  var xmlHttpRequest = new XMLHttpRequest();
  
  // relative url, assuming that AJAX requests go to a url 
  // relative to the documents url
  var url ="./someURL";

  var XMLpayload = payload.toXML(source, context);
  
  xmlHttpRequest.open("POST", url, true);
  xmlHttpRequest.onreadystatechange = readystatehandler;
  // the request body is an XML document, not form data
  xmlHttpRequest.setRequestHeader('Content-Type', 'application/xml');
  xmlHttpRequest.send(XMLpayload);
  
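  function readystatehandler()
  {
    // the handler fires for every state change; the status is only
    // meaningful once the request has completed (readyState 4)
    if (xmlHttpRequest.readyState != 4) {
      return;
    }
    if (xmlHttpRequest.status == 200 || xmlHttpRequest.status==304) {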
      // be quiet in case of success
      // alert("success");
    } else {
      // alert error
      alert("send failure");
    }
  }
}

// JavaScript Event (pseudo) object
function LifeCycleEvent(mmiEvType, eventType, fieldName, fieldValue)
{
  this.mmiEventType = mmiEvType;        // e.g. extension
  this.eventType = eventType;           // user initiated event, e.g. change
  this.fieldName = fieldName;           // e.g. HTML id of the field
  this.fieldValue = fieldValue;         // e.g. value of the field
}

// method of LifeCycleEvent object to generate XML string from its properties
LifeCycleEvent.prototype.toXML = function(source, context)
{
  var mmiLifeCycleEvent;
     
  mmiLifeCycleEvent  = '<mmi version="1.0" xmlns:mmi="http://www.w3.org/2008/04/mmi-arch">';
  mmiLifeCycleEvent += '  <mmi:' +  this.mmiEventType;
  mmiLifeCycleEvent += '  mmi:source="' +  source + '" mmi:context="' +  context + '">';
  mmiLifeCycleEvent += '  <mmi:data>';
  mmiLifeCycleEvent += '    <eventType>' + this.eventType + '</eventType>';
  mmiLifeCycleEvent += '    <fieldName>' + this.fieldName + '</fieldName>';
  mmiLifeCycleEvent += '    <fieldValue>' + this.fieldValue + '</fieldValue>';
  mmiLifeCycleEvent += '  </mmi:data>';
  mmiLifeCycleEvent += '  </mmi:' +  this.mmiEventType + '>';
  mmiLifeCycleEvent += '</mmi>';

  return mmiLifeCycleEvent;
}

evaluateResponseXML.js:

/* This function handles all incoming MMI lifecycle events. They may be fetched 
   from the server side interaction manager using AJAX. The returned XML document
   is the MMI lifecycle event.
*/
function handleIncomingMmiEvents(xml)
{ 
  // check if incoming message is MMI lifecycle event
  // perform a very simple check:
  if(xml.match("<mmi:")) 
  {
    // parse incoming xml string to DOM
    var parser = new DOMParser();
    var doc = parser.parseFromString(xml, "text/xml");
    
    var element = doc.documentElement;
    if(element.childNodes[0].nodeName=="mmi:newContextResponse")
    {
      _CONTEXT = element.childNodes[0].getAttribute("mmi:context");
    }
    else if(element.childNodes[0].nodeName=="mmi:extension")
    {
      if(element.childNodes[0].childNodes[0].nodeName=="mmi:data")
       {
        // Application specific extension
        // In this example we receive the name of a function and the params.
        // This has to be evaluated locally using eval().
        var functionname = element.childNodes[0].childNodes[0].childNodes[0].childNodes[0].nodeValue;
        var elementname = element.childNodes[0].childNodes[0].childNodes[1].childNodes[0].nodeValue;
        var elementvalue = element.childNodes[0].childNodes[0].childNodes[2].childNodes[0].nodeValue;
        eval(functionname + "(elementname ,elementvalue)");
      }
    }
    else if(element.childNodes[0].nodeName=="mmi:clearContextRequest")
    {
      // create new mmiLifeCycleEvent Object that signals the removal of the Context
      event = new LifeCycleEvent("clearContextResponse", "", "", "");
      
      // send clearContextResponse lifecycle event
      sendMmiLifecycleEvent(_SOURCE, _CONTEXT, event);
    }
    else
    {
      // unknown lifecycle event
      alert("MMI lifecycle event not handled.");
    }

    // send HTTP request to server to receive lifecycle event.
    readMmiLifecycleEvent();
  }
  else // check if contains "<mmi:"
  {
    // --> it is not a valid lifecycle event!
    alert("Error: wrong message!");
  }
}

The ECMAScript function _check(elementname, elementvalue) within captureColorSize.html checks (selects) a radio button. To trigger it, the Interaction Manager sends an mmi:extension life-cycle event in which the application-specific eventType element within the <mmi:data> element is set to _check. The fieldValue element contains the HTML id of the radio button, so the _check(...) function simply uses the DOM API to activate it. The following example shows the MMI life-cycle event that activates the green color radio button.

<?xml version="1.0" encoding="UTF-8"?>
<mmi version="1.0" xmlns:mmi="http://www.w3.org/2008/04/mmi-arch">
  <mmi:extension mmi:source="captureColorSize.html" mmi:context="">
    <mmi:data>
      <eventType>_check</eventType>
      <fieldName>color</fieldName>
      <fieldValue>green</fieldValue>
    </mmi:data>
  </mmi:extension>
</mmi>
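
The _check function itself is not listed in this document. The following is a minimal sketch of how it might look, assuming that fieldValue carries the HTML id of the radio button as described above. The optional third parameter doc is an assumption added here so that the function can be exercised outside a browser; in captureColorSize.html the global document object would be used.

```javascript
// Sketch of the _check function called by handleIncomingMmiEvents.
// elementname:  value of <fieldName>, e.g. "color" (not needed for the lookup)
// elementvalue: value of <fieldValue>, the HTML id of the radio button, e.g. "green"
// doc:          optional document object (an assumption, added for testing);
//               defaults to the browser's global document
function _check(elementname, elementvalue, doc) {
  var targetDocument = doc || document;
  var button = targetDocument.getElementById(elementvalue);
  if (!button) {
    return false;          // no element with this id in the page
  }
  button.checked = true;   // activate the radio button
  return true;
}
```

For the event shown above, the handler would call _check("color", "green"), which selects the radio button whose id is green.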

This event is created within the SCXML script. See the getColor state of the SCXML sample code (5.1 T-Shirt.scxml).

5.3 dispatcher.ccxml

CCXML is used as a dispatcher of events between SCXML and VoiceXML. The script ccxml_events.js contains a collection of support functions to create MMI life-cycle events or to start VoiceXML dialogs.

Note that this script is written to be application independent.

<?xml version="1.0" encoding="UTF-8" ?>
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">

  <!-- we assume there is a library of functions to send data to 
   the Interaction Manager -->
  <script src="ccxml_events.js" />

  <!-- CCXML session ID -->
  <var name="connectionId" expr="''" />

  <!-- SCXML session ID -->
  <var name="interactionId" expr="''" />
  <!-- request ID of lifecycle event -->
  <var name="requestID" expr="'123456'" />
  <!-- target type -->
  <var name="SCXML" expr="'basichttp'" />

  <!-- VXML dialog ID for termination -->
  <var name="vxml_dialogid" expr="0" />
  <!-- whether a VXML dialog is running or not -->
  <var name="vxml_running" expr="false" />
  <!-- whether a VXML dialog is currently being terminated -->
  <var name="vxml_terminating" expr="false" />

  <var name="prompt" expr="''" />
  <var name="audio" expr="''" />
  <var name="grammarUri" expr="''" />
  <var name="fields" expr="''" />
        
        
  <!-- Note: All events which are tagged with "(INTERNAL)" are standard events! -->
  <eventprocessor>
        
    <!-- ===================================================== -->
    <!-- SCXML events -->
    <!-- ===================================================== -->
                
    <!-- CCXML (INTERNAL): when CCXML is started, it throws this internal event -->
    <transition event="ccxml.loaded">
      <script>
        _ccxml.setSCXML_URI(session.values.scxml_serverip,
            session.values.scxml_serverport,
            session.values.scxml_serverpage);
      </script>
      <assign name="connectionId" expr="event$.connectionid" />
      <assign name="interactionId" expr="session.values.interactionid" />
                        
      <!-- call SIP phone -->
      <var name="sipip" expr="session.values.sip_phoneprefix + '@' + session.values.sip_phoneip + 
           ':' + session.values.sip_phoneport" />
      <createcall dest="sipip" connectionid="connectionId" />
    </transition>
                
    <!-- CCXML: terminate -->
    <transition event="ccxml.terminate">
      <send target="_ccxml.clearContextResponse(interactionId, requestID)"
          targettype="SCXML" name="'ccxml.external'" />
      <send target="session.id" targettype="'ccxml'" name="'this.exit'" />
    </transition>
    <transition event="this.exit">
      <log expr="'CCXML.exit'" />
      <exit />
    </transition>
                
    <!-- SIP: disconnect SIP phone -->
    <transition event="sip.disconnect">
      <disconnect connectionid="connectionId" />
    </transition>
                
    <!-- VXML: start -->
    <transition event="vxml.start">
      <assign name="prompt" expr="event$.prompt" />
      <assign name="audio" expr="event$.audio" />
      <assign name="grammarUri" expr="event$.grammarUri" />
      <assign name="fields" expr="event$.fields" />
      
      <!-- If no VXML dialog is running, start one. Otherwise terminate the running
        dialog; the new dialog is then started in the dialog.exit handler -->
      <if cond="vxml_running == false">
        <assign name="vxml_running" expr="true" />
        <var name="sessionid" expr="event$.sessionid" />
        <dialogstart src="_vxml.start(grammarUri, prompt, audio, fields)"
            dialogid="vxml_dialogid" connectionid="connectionId" namelist="sessionid" />
      <else />
        <assign name="vxml_terminating" expr="true" />
        <dialogterminate dialogid="vxml_dialogid" immediate="true" />
      </if>
    </transition>
    
    <!-- VXML: terminate -->
    <transition event="vxml.terminate">
      <var name="immediate" expr="event$.immediate" />
      <dialogterminate dialogid="vxml_dialogid" immediate="immediate" />
    </transition>
    
    
    <!-- ===================================================== -->
    <!-- SIP events -->
    <!-- ===================================================== -->
    
    <!-- SIP (INTERNAL): connection to phone completed -->
    <transition event="connection.connected">
      <send target="_ccxml.createResponse(interactionId, requestID, session.id)"
          targettype="SCXML" name="'ccxml.external'" />
    </transition>
    
    <!-- SIP (INTERNAL): disconnected -->
    <transition event="connection.disconnected">
      <send target="_ccxml.clearContextRequest(interactionId)"
          targettype="SCXML" name="'ccxml.external'" />
      <send target="session.id" targettype="'ccxml'" name="'this.exit'" />
    </transition>
    
    <!-- SIP (INTERNAL): reject call from SIP phone -->
    <transition event="connection.alerting">
      <reject />
    </transition>
    
    
    <!-- ===================================================== -->
    <!-- VXML events -->
    <!-- ===================================================== -->
    
    <!-- VXML (INTERNAL): when VXML is started, it throws this internal event -->
    <transition event="dialog.started">
      <send target="_vxml.startResponse(interactionId, requestID, vxml_dialogid)"
          targettype="SCXML" name="'ccxml.external'" />
    </transition>
    
    <!-- VXML (INTERNAL): an exit in VXML throws this internal event
      if it was not just a prompt, get EMMA from VXML and send response to SCXML -->
    <transition event="dialog.exit">
      <assign name="vxml_running" expr="false" />
      <!-- if the dialog was terminated because a vxml.start event arrived while it was
        still running, start the new dialog now (its _vxml.start() parameters were
        saved in the global variables) -->
      <if cond="vxml_terminating == false">
        <send target="_vxml.doneNotification(interactionId, event$.values.emma, vxml_dialogid)"
            targettype="SCXML" name="'ccxml.external'" />
      <else />
        <send target="_vxml.doneNotification(interactionId, '', vxml_dialogid)"
            targettype="SCXML" name="'ccxml.external'" />
        <assign name="vxml_running" expr="true" />
        <var name="sessionid" expr="event$.sessionid" />
        <dialogstart src="_vxml.start(grammarUri, prompt, audio, fields)"
            dialogid="vxml_dialogid" connectionid="connectionId" namelist="sessionid" />
        <assign name="vxml_terminating" expr="false" />
      </if>
    </transition>
    
    
    <!-- ===================================================== -->
    <!-- error events -->
    <!-- ===================================================== -->
    
    <!-- all errors (INTERNAL) -->
    <transition event="error.*">
      <log expr="'CCXML error'" />
      <!--
      <send target="_vxml.sendEvent(interactionId, _vxml.ERROR)"
          targettype="SCXML" name="'ccxml.external'" />
      -->
    </transition>
    
  </eventprocessor>
  
</ccxml>
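
The support library ccxml_events.js is not listed in this document. In the dispatcher, the src attribute of <dialogstart> is an ECMAScript expression, so _vxml.start(...) has to return the URI of the VoiceXML document to run. The following is a minimal sketch of one way such a function might be written. The base URI, the query parameter names, and the idea of passing the values as query parameters are all assumptions made for illustration, not the behaviour of the actual library.

```javascript
// Hypothetical sketch of the _vxml.start helper from ccxml_events.js.
var _vxml = {
  // Assumed location of the VoiceXML document; not taken from the original.
  baseUri: "http://example.com/dialog.vxml",

  // Returns a URI for <dialogstart>, with all non-empty values
  // appended as URL-encoded query parameters.
  start: function (grammarUri, prompt, audio, fields) {
    var params = [
      ["grammar", grammarUri],
      ["prompt", prompt],
      ["audio", audio],
      ["fields", fields]
    ];
    var query = [];
    for (var i = 0; i < params.length; i++) {
      var value = params[i][1];
      if (value !== undefined && value !== null && value !== "") {
        query.push(params[i][0] + "=" + encodeURIComponent(value));
      }
    }
    return this.baseUri + (query.length > 0 ? "?" + query.join("&") : "");
  }
};
```

Encoding each value with encodeURIComponent keeps prompts that contain spaces or punctuation from breaking the URI.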

5.4 captureColor.vxml

As mentioned in 3.3 Voice Modality Component, VoiceXML must return to CCXML (and hence exit the VoiceXML dialog) to return results (e.g. recognition results). Therefore the VoiceXML user interface has to be implemented as small independent scripts. Each script corresponds to a single action, such as playing a prompt or activating a grammar and listening for user input.

The following code sample shows what the captureColor.vxml document could look like. The script vxml_emma.js, which is referenced in the VoiceXML document, contains a collection of auxiliary ECMAScript functions to create an [EMMA] representation of the user input. See Appendix C of [EMMA] for more information on how to map a recognition result into an EMMA representation.

Note that the other VoiceXML documents are very similar and therefore not shown here.

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <!-- We assume that there is a script library to convert the 
       recognition result into an EMMA string 
       (see http://www.w3.org/TR/emma) -->
  <script src="vxml_emma.js" />
  <form>
    <field name="color">
       <prompt>Which color?</prompt>
       <option>red</option>
       <option>blue</option>
       <option>green</option>
       <filled>
         <!-- generate EMMA string from recognition result -->
         <var name="emma" expr="createEmma(application.lastresult$)"/>
         <!-- exit back to CCXML and return the recognized result -->
         <exit namelist="emma"/>
       </filled>
    </field>
    <catch event="help nomatch noinput">
       Your options are <enumerate/>
    </catch>
  </form>
</vxml>
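
The createEmma function from vxml_emma.js is not listed in this document. The following is a minimal sketch of how it might map the VoiceXML application.lastresult$ properties (utterance, confidence, inputmode, interpretation) to an EMMA string. It assumes that the interpretation is a simple string, as is the case for <option> elements, and wraps it in emma:literal. The escapeXml helper, the id value int1, and the mapping of inputmode to emma:medium and emma:mode are assumptions of this sketch.

```javascript
// Escape characters that are not allowed in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Sketch: build an EMMA document from a VoiceXML lastresult$ object.
// Assumes lastresult.interpretation is a string (not an ECMAScript object).
function createEmma(lastresult) {
  var isVoice = lastresult.inputmode !== "dtmf";
  var emma = '<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">';
  emma += '<emma:interpretation id="int1"';
  emma += ' emma:medium="' + (isVoice ? "acoustic" : "tactile") + '"';
  emma += ' emma:mode="' + (isVoice ? "voice" : "dtmf") + '"';
  emma += ' emma:confidence="' + escapeXml(lastresult.confidence) + '"';
  emma += ' emma:tokens="' + escapeXml(lastresult.utterance) + '">';
  emma += '<emma:literal>' + escapeXml(lastresult.interpretation) + '</emma:literal>';
  emma += '</emma:interpretation>';
  emma += '</emma:emma>';
  return emma;
}
```

The resulting string is what the <exit namelist="emma"/> element hands back to CCXML, where the dialog.exit handler reads it as event$.values.emma.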

Authoring Applications for the Multimodal Architecture

W3C Working Group Note 2 July 2008
