Position Paper for
The W3C Workshop on Web Applications
and Compound Documents
June 1-2, 2004,
San Jose, CA

Paul Topping
Design Science, Inc.
pault@dessci.com
www.dessci.com

Who we are

We at Design Science have been involved with web applications and compound documents for many years. Several of us (this author, Robert Miner, Neil Soiffer) are current or former members of the W3C's Math Working Group, the body that created the MathML Recommendation. We also develop and market several MathML-based products: WebEQ (a MathML editor web application), MathPlayer (a MathML display engine for Microsoft's Internet Explorer web browser), and MathFlow Editor (a MathML editor that works with XML document editors). We are currently working under two NSF grants: one to research ways of making math in online content accessible to people with visual and other disabilities, and the other to hold a workshop (Enhancing the Searching of Mathematics) and conduct research on math-based searching. Both grants involve web applications and compound documents.

Overview of our Position

Our experience with existing plugin architectures leads us to believe that what is needed (as suggested by the Workshop Scope) is a "generic extension architecture" for content embedded in documents. In order to focus discussion, and to prevent it from becoming too generic, we would like to call these extensions or plugins "embedded content handlers". Embedded content handlers would extend web browsers and other content viewers and editors (user agents), enabling them to handle islands of content within compound documents whose format they otherwise do not understand. By handling, we mean display, printing, user interaction, and editing. We also include handling requests from other user agent extensions, such as screen readers that make content accessible to users with visual and other disabilities, search engines, and spelling and grammar checkers. Readers may also be interested in our Manifesto: Requirements for Handling Math in a Document Editor, which covers some of the issues in more detail but from a different point of view.

Existing Plugin Architectures are Limited

Existing web browsers implement two plugin architectures:

It is also possible for user agents to implement embedded languages directly, without requiring a plugin. For example, the Mozilla browser has built-in MathML support, but it is not implemented in an extensible way. Adding support for embedded languages for which support is not already built in, or replacing the built-in support, requires modification of the browser's source code.

Goals for an Embedded Content Handler Architecture

Our experience leads us to believe that what is needed is a new plugin architecture to be implemented within user agents (web browsers and other content viewers and editors) that allows for the creation of embedded content handlers. The goals of such an architecture include:

More Detailed Requirements and Features

Here we will give some more detailed requirements and desirable features for an embedded content handler architecture, along with their motivation where necessary. In many cases, we will refer to the handling of MathML content islands within an XHTML document in order to make the discussion more concrete. However, we feel that every item is applicable to the general problem.

Applications

The possible applications of an embedded content handler architecture are many but include the following major categories:

In our particular area of interest, applications include math-aware collaboration tools (math-enabled message boards, whiteboards, chat), testing and evaluation (online math quizzes), and interactive educational materials (simulations, manipulatives, calculators, etc.).

While the focus here is on user agents that fetch their content across the web, the embedded content handler architecture should also be applicable to plugins that extend general-purpose HTML/XHTML editors (e.g., Dreamweaver, GoLive, FrontPage) and XML editors (e.g., XMetaL, Epic, XMLSpy) in order to provide editing facilities for embedded languages, such as MathML, for which formatted text display or direct tag editing is inadequate. In addition, many kinds of content viewer and editor applications use the services of general HTML and XML display and editing engines. Such applications include email clients, instant messaging clients, and help engines. If the HTML engine used by these applications supports embedded content handlers, the applications should inherit the additional functionality provided by handlers for free.

Document Layout and Display

When the user agent formats a paragraph (or other textual layout primitive) that contains content for which a handler is available, it should gather the ambient properties that apply at the point of the handled content and pass them, along with the embedded content, to the handler to perform layout of its content. These ambient properties include font, point size, character style (bold, italic), text color, background color, transparency, column size, etc. The handler returns placement data that the user agent uses to place the rendering of the embedded content within the paragraph. The placement data items include width and height (non-rectangular outlines may also need to be supported), baseline position, side-bearings, desired line-breaking behavior, and potential page-break locations. Both inline and block formatting should be allowed.
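The exchange described above can be sketched in code. The following is only an illustration under stated assumptions: the names AmbientProperties, Placement, EmbeddedContentHandler, ToyMathHandler, and place_island are hypothetical and do not come from any existing browser API, and the toy handler sizes its content from a character count rather than performing real math layout.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AmbientProperties:
    """Properties in effect at the point of the embedded content (hypothetical names)."""
    font_family: str
    point_size: float
    bold: bool = False
    italic: bool = False
    text_color: str = "#000000"
    background_color: str = "transparent"
    column_width: float = 0.0  # available width, in points


@dataclass(frozen=True)
class Placement:
    """Data the handler returns so the user agent can place the rendering."""
    width: float
    height: float
    baseline: float            # distance from the top of the box to the baseline
    inline: bool = True        # False requests block formatting
    allow_line_break: bool = False


class EmbeddedContentHandler(Protocol):
    """Assumed shape of the layout half of a handler interface."""

    def layout(self, content: str, ambient: AmbientProperties) -> Placement: ...


class ToyMathHandler:
    """Stand-in handler: estimates size from the character count only."""

    def layout(self, content: str, ambient: AmbientProperties) -> Placement:
        width = 0.5 * ambient.point_size * len(content)
        height = 1.2 * ambient.point_size
        # Fall back to block formatting when the content does not fit the column.
        inline = width <= ambient.column_width
        return Placement(width=width, height=height,
                         baseline=0.8 * height, inline=inline)


def place_island(handler: EmbeddedContentHandler, content: str,
                 ambient: AmbientProperties) -> Placement:
    """User-agent side: pass ambient properties and content, get placement data."""
    return handler.layout(content, ambient)
```

In this sketch the user agent never inspects the embedded content itself; it only consumes the returned placement data. A real interface would also need to carry the side-bearings, non-rectangular outlines, and page-break locations mentioned above.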

Dynamic Documents and Editing

Many handlers may implement some kind of editing or user interaction that triggers a change in the document layout and display. Such changes may be due to script execution, browser state changes, changes to handled content, changes to other document content, or any external event. In order to implement a content editor, the handler will need to participate in window activation and keyboard/mouse focus mechanisms. Editors may have their own complex selection, focus, and highlighting model. For example, a math editor might allow the selection of individual sub-expressions for the purposes of clipboard cut/copy/paste, drag-and-drop, or direct keyboard editing. For accessibility purposes, highlighting of sub-expressions may also be synchronized to speech rendering of content.
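One way to picture the interaction between an editing handler and the user agent is a change notification: the handler edits its own content and tells the user agent that the island needs to be laid out again. The sketch below is a minimal illustration, assuming hypothetical names (UserAgentDocument, EditableIsland, invalidate) and a much simpler focus model than a real editor would need.

```python
from typing import Callable, List


class UserAgentDocument:
    """Hypothetical user-agent side: records which islands need relayout."""

    def __init__(self) -> None:
        self.dirty: List[str] = []

    def invalidate(self, island_id: str) -> None:
        # Queue the island for relayout once, however many edits arrive.
        if island_id not in self.dirty:
            self.dirty.append(island_id)


class EditableIsland:
    """Hypothetical handler-side editor for one island of embedded content."""

    def __init__(self, island_id: str, content: str,
                 on_change: Callable[[str], None]) -> None:
        self.island_id = island_id
        self.content = content
        self.has_focus = False
        self._on_change = on_change

    def focus(self) -> None:
        self.has_focus = True

    def blur(self) -> None:
        self.has_focus = False

    def insert_text(self, offset: int, text: str) -> None:
        if not self.has_focus:
            raise RuntimeError("island must have focus before it can be edited")
        self.content = self.content[:offset] + text + self.content[offset:]
        # Tell the user agent that layout for this island is now stale.
        self._on_change(self.island_id)
```

For example, after `island = EditableIsland("eq1", "x+y", doc.invalidate)`, calling `island.focus()` and then `island.insert_text(3, "+z")` updates the content and leaves "eq1" in the document's dirty list, at which point the user agent would request new placement data.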

Inter-plugin Communication

Handlers should be able to participate in features of user agents that may be supplied by other plugins or companion applications. This is exemplified by the relationship of screen readers used by the visually impaired to web browsers. The screen reader application has the ability to read the web page to the user, essentially providing an alternate, audio rendering of the page. This involves programmatic access to the textual rendering of the page and to various user interface controls and embedded content that may be present on the page. Since the browser doesn't know how to "speak" the embedded content, the handler must be involved. This works well in our MathPlayer plugin for Internet Explorer. Screen readers, such as Window-Eyes and JAWS, use the Microsoft Active Accessibility (MSAA) API to access the HTML text of the web page. Since MathPlayer's equation objects support the MSAA interfaces, the screen readers are able to speak the math as well.

Cooperation between multiple software participants (user agent, embedded content handlers, auxiliary applications) may also be useful in areas such as spelling and grammar checkers, search engines, and allowing script code access to custom handler methods. While specific interfaces for accessibility, spell checking, etc. are probably outside the scope of an embedded content handler architecture, it should include facilities that make such interfaces available to other handlers and auxiliary applications. User agent plugins and auxiliary applications, given an embedded content DOM node, should be able to query for arbitrary interfaces that the handler may support.
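The idea of querying a handler for arbitrary interfaces can be illustrated with a small sketch. Everything here is an assumption made for illustration: the Speakable and Searchable interfaces, the MathHandler class, and the query_interface method are hypothetical names, not part of MathPlayer, MSAA, or any existing plugin API.

```python
from typing import Dict, List, Optional, Type, TypeVar

T = TypeVar("T")


class Speakable:
    """Hypothetical interface a screen reader might ask for."""

    def speech_text(self) -> str:
        raise NotImplementedError


class Searchable:
    """Hypothetical interface a search engine might ask for."""

    def search_terms(self) -> List[str]:
        raise NotImplementedError


class _MathSpeech(Speakable):
    def __init__(self, spoken: str) -> None:
        self._spoken = spoken

    def speech_text(self) -> str:
        return self._spoken


class MathHandler:
    """Hypothetical handler that supports speech but not search."""

    def __init__(self, spoken: str) -> None:
        self._interfaces: Dict[type, object] = {Speakable: _MathSpeech(spoken)}

    def query_interface(self, iface: Type[T]) -> Optional[T]:
        # Return the implementation if this handler supports the interface,
        # or None so the caller can degrade gracefully.
        return self._interfaces.get(iface)  # type: ignore[return-value]


def speak_island(handler: MathHandler) -> str:
    """Auxiliary-application side: use the interface only if it is offered."""
    speech = handler.query_interface(Speakable)
    return speech.speech_text() if speech is not None else "[unreadable content]"
```

The design choice sketched here is that the architecture itself defines only the query mechanism. Which interfaces exist, such as accessibility or spell checking, is left to agreements between handlers and the applications that consume them.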