Strawman W3C software architecture notes
There is no reason why a particular set of network
protocol standards should imply any particular software
architecture within the peer agents, until the mobility of code
makes the distinction between remote operation and local
interfaces arbitrary. However, for the purposes of making
reference code for those protocols, a sound architecture is
necessary; and besides, there is call for the standardization
of the APIs for their own sake, to allow the mixing of software
from different manufacturers.
Current (1993-4-5) design
The W3C reference code is required not only to be modular
but also to be easily extensible by the addition of new
code at build time or run time. The 1993 design of the library
involved the notion of registering subclasses of certain given
classes:
- Objects for presenting documents of a given Content-Type
to the user;
- Format converters between different Content-Type values;
- Protocol implementations for different URI scheme values;
In each case, a specific function (such as
HTRegisterProtocol) is defined to allow the new code
to be added; a separate table of registered objects is kept;
and a small number of core subclasses were provided in the
library. In each case, at registration an entry point into
the new code is passed, typically that of a creation routine for a
C++-like object whose first element is an "isa" pointer to a
jump table of method entry points for the new module.
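The registration scheme just described can be sketched in C
roughly as follows. This is a minimal illustration and not the
library's actual code: ProtocolClass, ProtocolObject,
register_protocol, find_protocol and the "dummy" module are all
hypothetical names invented for the example (the real entry
point is HTRegisterProtocol, whose signature is not reproduced
here).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical jump table of method entry points for one module. */
typedef struct ProtocolClass {
    const char *scheme;                        /* URI scheme handled */
    int (*load)(void *self, const char *uri);  /* one method entry point */
} ProtocolClass;

/* C++-like object: its first element is the "isa" pointer. */
typedef struct ProtocolObject {
    const ProtocolClass *isa;
} ProtocolObject;

/* Creation routine: the entry point passed at registration. */
typedef ProtocolObject *(*ProtocolCreate)(void);

#define MAX_PROTOCOLS 16

/* The separate table of registered objects for this class. */
static struct {
    const char *scheme;
    ProtocolCreate create;
} protocol_table[MAX_PROTOCOLS];
static size_t protocol_count = 0;

/* Adds a module to the table; returns 0, or -1 if the table is full. */
int register_protocol(const char *scheme, ProtocolCreate create)
{
    if (protocol_count == MAX_PROTOCOLS)
        return -1;
    protocol_table[protocol_count].scheme = scheme;
    protocol_table[protocol_count].create = create;
    protocol_count++;
    return 0;
}

/* Exact-match lookup by scheme; returns NULL if nothing is registered. */
ProtocolCreate find_protocol(const char *scheme)
{
    for (size_t i = 0; i < protocol_count; i++)
        if (strcmp(protocol_table[i].scheme, scheme) == 0)
            return protocol_table[i].create;
    return NULL;
}

/* A made-up module: its "load" method just reports the URI length. */
static int dummy_load(void *self, const char *uri)
{
    (void)self;
    return (int)strlen(uri);
}
static const ProtocolClass dummy_class = { "dummy", dummy_load };
static ProtocolObject dummy_object = { &dummy_class };
static ProtocolObject *dummy_create(void) { return &dummy_object; }
```

An application would call register_protocol("dummy",
dummy_create) at initialization time; the framework would later
look the scheme up, call the creation routine, and dispatch
through the object's isa pointer.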
This allows extra functionality to be added at runtime by
code linked to the core library. It does not address the
issues of dynamic loading, or of inter-process communication,
so it was in practice only used at initialization time for
code linked in by the application developer. In two cases,
separate provision was made, in quite different ways,
for adding functionality outside the process. Proxy servers
can implement new URI schemes, with registration using
environment variables (etc.) and communication through HTTP,
and helper applications can present new content types, with
registration through (on Unix) a "mailcap" file and
communication through shared files and command-line
arguments.
Next step
There is now a call for the registration of further types
of extension:
- Handlers for previously unknown rfc822 headers in HTTP or
other messages;
- In servers, objects corresponding to certain URIs as in
the CGI interface;
- Compression, encryption and payment algorithms;
and so on. These extensions can easily be expressed by the
registration of subclasses of generic objects and handled in a
similar way. However, rather than write specific code for each
occasion, it seems reasonable to use a generic technique.
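One way to picture such a generic technique is a single
registry keyed by extension class and identifying name, instead
of one table and one registration function per class. The
sketch below is only an assumption about what this might look
like: Extension, ext_register, ext_find and example_create are
hypothetical names, not part of the existing library.

```c
#include <stddef.h>
#include <string.h>

/* One registry entry, whatever the kind of extension. */
typedef struct Extension {
    const char *ext_class;   /* e.g. "protocol", "header", "hash" */
    const char *name;        /* identifying name within that class */
    void *(*create)(void);   /* creation routine for the subclass */
} Extension;

#define MAX_EXTENSIONS 64
static Extension registry[MAX_EXTENSIONS];
static size_t registry_count = 0;

/* Registers a subclass; returns 0, or -1 if the registry is full. */
int ext_register(const char *ext_class, const char *name,
                 void *(*create)(void))
{
    if (registry_count == MAX_EXTENSIONS)
        return -1;
    registry[registry_count].ext_class = ext_class;
    registry[registry_count].name = name;
    registry[registry_count].create = create;
    registry_count++;
    return 0;
}

/* Exact match on both class and name; NULL if not registered. */
const Extension *ext_find(const char *ext_class, const char *name)
{
    for (size_t i = 0; i < registry_count; i++)
        if (strcmp(registry[i].ext_class, ext_class) == 0 &&
            strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}

/* Placeholder creation routine used only to exercise the registry. */
static int example_instance;
static void *example_create(void) { return &example_instance; }
```

With this shape, adding a new kind of extension means choosing
a new class string rather than writing a new table and a new
registration function.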
The need for CCI (Client-client interface) standards
demonstrates that intra-process communication is not
sufficient and inter-application links are needed. RPC
techniques such as ILU, OLE2/DCE, etc. are clearly designed to
do this, and would presumably mesh well with a generic
registration system.
The typical things you need to be able to do with a new
subclass are:
- Register a statically linked module at initialization
time;
- Dynamically load (OS permitting), link and register a
module;
- Launch a new application, and link it in as a module using
IPC;
- Find libraries or applications on demand using some
search algorithm.
These facilities are all typically used in some form
already. We just have to generalize how W3C reference code uses
them.
The important point, of course, is the general architecture;
of secondary importance is the question of whether tools
are used to generate the stubs or registration code.
Platform specifics
An advantage of a generic subclass registration system is
that it can be mapped onto platform-specific facilities once
per platform. It is reasonable to use local OS-specific
conventions for IPC, dynamic linking, and program invocation.
Extension modules need callback interfaces, and these too we
will have to map onto local IPC conventions.
It is not proposed that we reinvent any IPC work which we can
pick up and which is sufficiently well-defined and open.
Specific Extension classes
The reference code consists then of two parts. The
framework code contains the basic API, and the registration
functions, and the functions that search for and invoke
registered subclasses. The other part consists of a set of
"kernel" modules which provide basic standard functionality and
also provide examples for the creation of extension modules.
URI scheme
- Function: Provides access to objects in a given name space.
- Current: HTProtocol, HTRegisterProtocol, etc. locally;
proxy server.
- Parameters: URI scheme string.
Local object access
- Function: In a server, provides access to specific parts of
URI space.
- Current: Communication with the Common Gateway
Interface (CGI), registration in server configuration file.
- Parameters: URI template.
Note similarity with proxy.
Format converter
- Function: Converts from Content-Type a to Content-Type b.
- Current: HTConverter.
- Parameters: Content-Type names a and b; quality factor.
Presentation
- Function: Renders an object for the user (converts from
Content-Type a to dummy type "www/present").
- Current: HTConverter.
- Parameters: Content-Type name a; quality factor.
This is currently handled as a special case of a format
converter as it simplifies the code. In fact in the
noninteractive case, www/present simply represents any output
format which is acceptable to the user.
Header handler
- Function: Performs whatever handling is necessary for an
rfc822-style header h.
- Current: none.
- Parameters: The header keyword h.
A header handler needs a lot of call-backs to allow it to
manipulate the body processing pipeline, change other headers,
abort transactions, operate a sub-protocol over the same
channel, etc. The design of these callback interfaces is
non-trivial.
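To make the shape of the problem concrete, here is a very small
sketch of what a callback interface for header handlers might
look like. Everything in it is an assumption made for
illustration: HeaderCallbacks, HeaderHandler, DemoTransaction
and the "X-Example-Seen" header are invented names, and only
two of the callbacks mentioned above (changing other headers
and aborting the transaction) are shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callbacks the framework hands to a header handler. */
typedef struct HeaderCallbacks {
    void *context;  /* opaque transaction state owned by the framework */
    void (*set_header)(void *context, const char *name, const char *value);
    void (*abort_transaction)(void *context, const char *reason);
} HeaderCallbacks;

/* A handler receives the header value; returns 0 on success, -1 on abort. */
typedef int (*HeaderHandler)(const char *value, const HeaderCallbacks *cb);

/* Stand-in for framework state, used only to exercise the handler. */
typedef struct DemoTransaction {
    int aborted;
    char last_header[64];
} DemoTransaction;

static void demo_set_header(void *context, const char *name,
                            const char *value)
{
    DemoTransaction *t = context;
    snprintf(t->last_header, sizeof t->last_header, "%s: %s", name, value);
}

static void demo_abort(void *context, const char *reason)
{
    DemoTransaction *t = context;
    (void)reason;
    t->aborted = 1;
}

/* Example handler for a made-up header: an empty value aborts the
   transaction; otherwise the value is echoed into another header. */
static int example_handler(const char *value, const HeaderCallbacks *cb)
{
    if (value[0] == '\0') {
        cb->abort_transaction(cb->context, "empty header value");
        return -1;
    }
    cb->set_header(cb->context, "X-Example-Seen", value);
    return 0;
}
```

A real interface would also need callbacks for manipulating the
body processing pipeline and for running a sub-protocol over the
same channel, which is where most of the difficulty lies.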
It is an open question as to whether the bulk of security and
payment protocols can be grafted on in this way.
Hash algorithm
- Function: Message digest algorithm.
- Current: none.
- Parameters: Algorithm name.
Conclusions
Clearly the list above is itself extensible. In the
security area
- PK algorithm
- Bulk symmetric algorithm
- Certificate verification algorithm
are the sorts of things which one might register.
A common parameter may be a "quality" factor giving some way
of disambiguating a choice of apparently equivalent
extensions, and which can be used in a negotiation process.
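As a small illustration of how a quality factor could
disambiguate equivalent extensions, the sketch below picks the
highest-quality candidate among those registered under the same
name. It assumes a quality value where larger means preferred;
Candidate, select_best and the module labels are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One registered extension with its quality factor. */
typedef struct Candidate {
    const char *name;    /* identifying name, e.g. a Content-Type */
    const char *module;  /* label of the module providing it */
    double quality;      /* assumed: larger means preferred */
} Candidate;

/* Returns the best candidate registered under `name`, or NULL if
   none matches. On equal quality the earliest registration wins. */
const Candidate *select_best(const Candidate *list, size_t n,
                             const char *name)
{
    const Candidate *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i].name, name) != 0)
            continue;
        if (best == NULL || list[i].quality > best->quality)
            best = &list[i];
    }
    return best;
}
```

The same comparison could feed a negotiation process, for
instance by advertising each candidate's quality to the peer
rather than choosing locally.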
What is not apparent is whether a common selection algorithm
can be used. It looks as though many subclasses will be
registered using one identifying name, and a search for an
exact match to that name is all that is required. The format
conversion modules are currently unique in that they have two
parameters, and so a more complex search is required to
construct a stack of such modules given input and output
formats. (See HTStreamStack, which does not do the complete
job.)
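The more complex search needed for format converters can be
sketched as a search for a chain of converters leading from the
input format to the output format. The code below is an
illustrative assumption, not a description of HTStreamStack: it
uses iterative deepening so that the shortest chain is found
first, ignores quality factors, and the names Converter,
build_stack and MAX_CHAIN are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* A registered converter from one Content-Type to another. */
typedef struct Converter {
    const char *from;
    const char *to;
} Converter;

#define MAX_CHAIN 8  /* longest chain of converters considered */

/* Depth-limited search. Records converter indices in chain[] and
   returns the chain length, or -1 if `to` is not reachable from
   `from` within `depth` further steps. */
static int try_depth(const Converter *conv, size_t n, const char *from,
                     const char *to, int depth, size_t *chain, int pos)
{
    if (strcmp(from, to) == 0)
        return pos;                 /* reached the output format */
    if (depth == 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(conv[i].from, from) != 0)
            continue;
        chain[pos] = i;
        int len = try_depth(conv, n, conv[i].to, to,
                            depth - 1, chain, pos + 1);
        if (len >= 0)
            return len;
    }
    return -1;
}

/* Fills chain[] with the indices of the converters to apply, in
   order. Returns the number of converters (0 if the formats are
   already equal), or -1 if no chain of at most MAX_CHAIN exists. */
int build_stack(const Converter *conv, size_t n, const char *from,
                const char *to, size_t chain[MAX_CHAIN])
{
    for (int depth = 0; depth <= MAX_CHAIN; depth++) {
        int len = try_depth(conv, n, from, to, depth, chain, 0);
        if (len >= 0)
            return len;
    }
    return -1;
}
```

For example, with converters text/html to text/plain and
text/plain to www/present registered, a request to present
text/html yields a stack of those two converters in that order.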
I feel (June 1995) that getting the set of extension
classes defined is an important early step in the design of
the next phase. It will separate the tasks of writing
framework and core modules, and will give users a good idea
of what they are getting. -t
1995, TimBL