Strawman W3C software architecture notes

There is no reason why a particular set of network protocol standards should imply any particular software architecture within the peer agents, until the mobility of code makes the distinction between remote operation and local interfaces arbitrary. However, for the purposes of making reference code for those protocols, a sound architecture is necessary; and besides, there is call for the standardization of the APIs for their own sakes, for the mixing of software from different manufacturers.

Current (1993-4-5) design

The W3C reference code is required not only to be modular but also to be easily extensible by the addition of new code at build time or run time. The 1993 design of the library involved the notion of registering subclasses of certain given classes. In each case, a specific function (such as HTRegisterProtocol) is defined to allow the subclasses to be added; a separate table of registered objects is kept; and a small number of core subclasses are provided in the library. In each case, at registration the framework is passed an entry point into the new code, typically that of a creation routine for a C++-like object whose first element is an "isa" pointer to a jump table of method entry points for the new module.
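The registration scheme described above can be sketched as follows. This is only an illustration, written in Python rather than in the library's own language; the names register_protocol, lookup_protocol and FileProtocol are hypothetical, and the class instance stands in for the object whose "isa" pointer leads to a jump table of methods.

```python
# Hypothetical sketch of per-class registration; not the library's actual API.

# The separate table of registered objects, keyed by URI scheme.
_protocols = {}

def register_protocol(scheme, create):
    """Record the creation routine (the entry point) for a URI scheme."""
    _protocols[scheme.lower()] = create

def lookup_protocol(scheme):
    """Return a new protocol object for the scheme, or None if unregistered."""
    create = _protocols.get(scheme.lower())
    return create() if create is not None else None

class FileProtocol:
    """Stand-in for a core subclass; its methods play the role of the jump table."""
    def load(self, address):
        return "loading " + address

# Registration happens once, at initialization time.
register_protocol("file", FileProtocol)

handler = lookup_protocol("FILE")
print(handler.load("file:/etc/motd"))
```

The table holds creation routines rather than finished objects, so each lookup yields a fresh instance and the framework never needs to know the concrete class.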

This allows extra functionality to be added at runtime by code linked to the core library. It does not address the issues of dynamic loading or of inter-process communication, so in practice it was used only at initialization time, for code linked in by the application developer. In two cases, provision was made, in entirely different ways, for adding functionality outside the process. Proxy servers can implement new URI schemes, with registration using environment variables (etc) and communication through HTTP; and helper applications can present new content types, with registration through (on Unix) a "mailcap" file and communication through shared files and command line arguments.

Next step

There is now a call for the registration of further types of extension: and so on. These extensions can easily be expressed by the registration of subclasses of generic objects and handled in a similar way. However, rather than write specific code for each occasion, it seems reasonable to use a generic technique.
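As a rough illustration of what a generic technique might look like, the sketch below keeps one table per extension class behind a single registration call. The Registry class, its method names and the kind strings are assumptions made for this example only; none of them comes from the existing library.

```python
# Hypothetical generic registry: one mechanism serving every extension class.

class Registry:
    def __init__(self):
        # One table per extension class ("protocol", "hash", ...).
        self._tables = {}

    def register(self, kind, key, factory):
        """Add a creation routine under the given extension class and key."""
        self._tables.setdefault(kind, {})[key] = factory

    def find(self, kind, key):
        """Return the registered creation routine, or None if there is none."""
        return self._tables.get(kind, {}).get(key)

registry = Registry()
registry.register("protocol", "http", lambda: "http module")
registry.register("hash", "md5", lambda: "md5 module")

create = registry.find("protocol", "http")
print(create())
```

A new kind of extension then needs no new registration function, only a new kind string.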

The need for CCI (client-client interface) standards demonstrates that intra-process communication is not sufficient and that inter-application links are needed. RPC techniques such as ILU, OLE2/DCE, etc. are clearly designed to do this, and would presumably mesh well with a generic registration system.

The typical things you need to be able to do with a new subclass are

These facilities are all typically used in some form already. We just have to generalize how W3C reference code uses them.

The important point, of course, is the general architecture; whether tools are used to generate the stubs or registration code is of secondary importance.

Platform specifics

An advantage of a generic subclass registration system is that it can be mapped onto platform-specific facilities once per platform. It is reasonable to use local OS-specific conventions for IPC, dynamic linking, and program invocation.

Extension modules need callback interfaces, and these too we will have to map onto local IPC conventions.

It is not proposed that we reinvent any IPC work that we can pick up and that is sufficiently well-defined and open.

Specific Extension classes

The reference code consists then of two parts. The framework code contains the basic API, and the registration functions, and the functions that search for and invoke registered subclasses. The other part consists of a set of "kernel" modules which provide basic standard functionality and also provide examples for the creation of extension modules.

URI scheme

Function
Provides access to objects in a given name space.
Current
HTProtocol, HTRegisterProtocol, etc locally; Proxy server.
Parameters
URI scheme string

Local object access

Function
In server, provides access to specific parts of URI space
Current
Communication with the Common Gateway Interface (CGI), registration in server configuration file.
Parameters
URI template
Note similarity with proxy

Format converter

Function
Converts from Content-Type a to Content-Type b
Current
HTConverter
Parameters
Content-Type names a and b; quality factor

Presentation

Function
Renders an object for the user (converts from Content-Type a to dummy type "www/present")
Current
HTConverter
Parameters
Content-Type name a; quality factor
This is currently handled as a special case of a format converter, as that simplifies the code. In fact, in the noninteractive case, www/present simply represents any output format which is acceptable to the user.

Header handler

Function
Performs whatever handling is necessary for an RFC 822-style header h
Current
none
Parameters
The header keyword h
A header handler needs a lot of call-backs to allow it to manipulate the body processing pipeline, change other headers, abort transactions, operate a sub-protocol over the same channel, etc. The design of these callback interfaces is non-trivial.

It is an open question as to whether the bulk of security and payment protocols can be grafted on in this way.
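To make the shape of the problem concrete, here is a small sketch of header handlers registered by keyword and given a callback object. The design of the real callback interfaces is left open above, so everything here (Transaction, register_header_handler, dispatch_header, and the two callbacks shown) is hypothetical, and a real interface would need far more than this.

```python
# Hypothetical sketch: header handlers keyed by keyword, given a callback object.

class Transaction:
    """Callbacks a handler may use; a real design would need many more."""
    def __init__(self):
        self.headers = {}
        self.aborted = False

    def set_header(self, name, value):
        self.headers[name.lower()] = value

    def abort(self):
        self.aborted = True

_header_handlers = {}

def register_header_handler(keyword, handler):
    """Register a handler under a case-insensitive header keyword."""
    _header_handlers[keyword.lower()] = handler

def dispatch_header(transaction, keyword, value):
    """Call the registered handler, if any; return True if one was found."""
    handler = _header_handlers.get(keyword.lower())
    if handler is None:
        return False
    handler(transaction, value)
    return True

def handle_content_length(transaction, value):
    # Abort rather than guess when the value is not a non-negative integer.
    if value.strip().isdigit():
        transaction.set_header("content-length", int(value))
    else:
        transaction.abort()

register_header_handler("Content-Length", handle_content_length)

t = Transaction()
dispatch_header(t, "content-length", " 42")
print(t.headers, t.aborted)
```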

Hash algorithm

Function
Message digest algorithm
Current
none
Parameters
Algorithm name

Conclusions

Clearly the list above is itself extensible. The security area in particular contains the sorts of things which one might register.

A common parameter may be a "quality" factor giving some way of disambiguating a choice of apparently equivalent extensions, and which can be used in a negotiation process.
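One way such a quality factor could disambiguate is sketched below: several modules are registered under the same name and the one with the highest quality is selected. The function names, the module labels, and the choice of a 0.0 to 1.0 range are assumptions of this sketch.

```python
# Hypothetical sketch: several extensions registered under one name,
# disambiguated by a quality factor (assumed here to lie between 0.0 and 1.0).

_candidates = {}

def register(name, module, quality):
    """Add a module as one of possibly several candidates for a name."""
    _candidates.setdefault(name, []).append((quality, module))

def best(name):
    """Return the module with the highest quality for name, or None."""
    entries = _candidates.get(name)
    if not entries:
        return None
    # On a tie, max() keeps the candidate that was registered first.
    return max(entries, key=lambda entry: entry[0])[1]

register("image/gif", "external helper", 0.5)
register("image/gif", "internal viewer", 0.8)

print(best("image/gif"))
```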

What is not apparent is whether a common selection algorithm can be used. It looks as though many subclasses will be registered using one identifying name, and a search for an exact match to that name is all that is required. The format conversion modules are currently unique in that they have two parameters, and so a more complex search is required to construct a stack of such modules given input and output formats. (See HTStreamStack, which does not do the complete job.)
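The more complex search needed for converters can be sketched as a search over chains of registered modules. This is not how HTStreamStack works; it is an illustration only. The function names, the converter names and the "text/x-tex" type are hypothetical, and combining qualities by multiplication is an assumption of this sketch.

```python
# Hypothetical sketch: build a stack of converters from one format to another.
# Combining qualities by multiplication is an assumption of this sketch.

converters = []  # entries are (source type, target type, quality, name)

def register_converter(src, dst, quality, name):
    converters.append((src, dst, quality, name))

def find_stack(src, dst, seen=None):
    """Return (quality, [names]) for the best chain from src to dst, or None."""
    if src == dst:
        return (1.0, [])
    # Remember the types already visited so that the search cannot loop.
    seen = (seen or frozenset()) | {src}
    best = None
    for c_src, c_dst, quality, name in converters:
        if c_src != src or c_dst in seen:
            continue
        rest = find_stack(c_dst, dst, seen)
        if rest is None:
            continue
        total = quality * rest[0]
        if best is None or total > best[0]:
            best = (total, [name] + rest[1])
    return best

register_converter("text/x-tex", "application/postscript", 0.9, "tex2ps")
register_converter("application/postscript", "www/present", 0.8, "ps-viewer")
register_converter("text/x-tex", "www/present", 0.3, "plain-text-viewer")

result = find_stack("text/x-tex", "www/present")
print(result[1])
```

With these registrations the two-step chain is preferred over the direct but low-quality presenter. The search is exhaustive, which is acceptable only while the number of registered converters stays small.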

I feel (June 1995) that getting the set of extension classes defined is an important early step in the design of the next phase. It will separate the tasks of writing framework and core modules, and will give users a good idea of what they are getting. -t


W3C
1995, TimBL