Modular Extension Mechanism for HTTP/1.2

by Rohit Khare

Introduction

Wrapping up considerable internal analysis & debate, W3C has prepared a proposal for working with extension mechanisms in HTTP/1.2

We started with input from several sources:

David Kristol's Internet Draft comments
Spyglass Inc.'s implementation experience with modular software
The identified need from the Security WG for modular security tools
Discussions of future-generation HTTP models (2.x+)

Scenario planning indicated the following requirements:

A well-managed namespace for extension modules
The ability to indicate which modules, if any, have been employed
The ability to indicate facilities available (binary have/don't have)
The ability to negotiate further on module-defined parameters.
The ability to specify precise scope and strength requirements

A "legally binding" upgrade to the HTTP specification (in 1.2) will ensure that "required" and "refused" mechanisms are obeyed by all parties.

Proposed Module: Syntax

HTTP 1.2 will include a new method for working with modular extensions:

Wrapped: WRAPPED HTTP/1.2 indicates that the request is directed at the origin server itself. The server is required to process modular extensions to the message (if any) before further processing. This request includes an optional entity-body.

The proposed syntax for Module: is a minor addition to the HTTP grammar. It borrows two idioms, first by extending module names from MIME "category/class" usage to pathname style and, second, by extending the notion of attribute-value pairs to arbitrary trees by adding {} to the HTTP <tspecials> rule.

Module: <ModSpec> <AVTree>

<ModSpec> ::= <ModName> ¦ (`{' <ModName> <ModSpec>`}') ¦ ( <ModName> `¦' <ModSpec>)

[This does parse properly, right? Precedence? Rewrite?']

<ModName> ::= <token> ¦ (<token>[`/'<token>]*)

[I mean for <token> to be restricted to alnum and `+', `-']

<AVTree> ::= [`;'<token>[`='(<token>¦<quotedstr>¦`{'<AVTree>`}')]]*

Significant Attribute-Value Tree Entries

Scope: ;scope= is one of none, conn, route, or origin. If ;scope is not present or invalid, it defaults to none.
Strength: ;strength= is one of opt, req, or ref. If ;strength is not present or invalid, it defaults to opt.
Tag: ;tag= is any <token>, and may not collide with the tag on any other module specification in the message. If ;tag is not present or invalid, it defaults to the specified <ModName>.
Negotiable: ;negotiable= is an attribute-value tree of attributes that counterparties cn negotiate upon, as defined in the offical specification for <ModName>. If ;negotiable is not present or invalid, it defaults to {}. [The syntax for indicating numeric ranges will be borrowed from SHTTP.]
Headers: ;headers= is an attribute-value tree of the data headers particular to the application of the Module to the present message, as defined in the offical specification for <ModName>. If ;headers is not present or invalid, it defaults to {}.
Domain: ;domain= is a <quotedstr>-form URI to the document defining this module. The URI may be an absolute reference, or a base for concatenating the module name If ;domain is not present or invalid, it defaults to path:/ORG/W3/Registry/HTTP/Modules/.

New error code responses.

100 Class: Roy proposed that "Continue Negotiating" belongs here.
200 Class: None
400 Class: Continue Negotiating. No modules found (a la None Acceptable). Security Validation Failed. Extension Refused. Incompatible Module Parameters
500 Class: Module not Available. Error in processing Module.

Optional understood attribute/value pairs: version?

Proposed Module: semantics:

The syntax is the minimum support we needed to support the behaviors described below; they, in turn should be judged gainst the following test:

Is this modular extension mechanism powerful, precise, and flexible enough to encompass the range of future HTTP protocol evolution?

As we explain each of the syntactic features (and later, usage scenarios), compare the benefits against that standard.

Definition & Specification

Modules are specific processing stages that can be applied to HTTP messages that, in effect, extend the protocol itself.

Modules can do this because the associated code can reprocess an entire HTTP request or reply, like a stage in pipeline.

Modules can also be used to indicate auxilliary capabilities available, e.g. `"I have Scripting/Java"

Modules can invoke side-effect processing, i.e. operations that do not affect the HTTP message: "Change the desktop color to red"

Modules that have been applied to a message can be processed, in order, as part of the Content-Encoding and Content-Transfer-Encoding.

Each module that has been applied to the current message must have a unique ;tag, The tag names are then referenced from the C-E: and C-T-E: headers to invoke the appropriate processing in the correct order.

Modules that indicate properties of an entity-body, like the signature of a pre-signed document on disk, are invoked from C-E:. Modules that indicate processing of the connection, such as a server's signature, belong in C-T-E:.

Modules should not be used to duplicate MIME processing such as content-type handling.

Modules are named in a well-defined, extensible name-space

The namespace hierarchy implies a simple type system, where compatibility checking asks for common prefixes. This allows a party to ask for "Security/Bulk-Cipher/", which means the counterparty can use any enciphrment algorithm that matches the prefix (this is all indicated by the trailing slash..

Each module name is part of a hierachical namespace (possibly managed by W3C). Each level of the hierarchy has a corresponding definition of its conformance requirements.

Each module specification can be found from a URN. Using the ;domain attribute, parties can find the absolute URN of the reference, or calulate one by concatenating the domain and the module name.

possible digressions: 1) the Liskov Substitution Principle 2) The projected openness of the namespace (i.e. infinitely less painful than IANA Media Types. 3) more examples?

Module requirements can also be composed in Boolean combination, as explained below.

There are clearly cases where unambiguous compound negotiation is required: Consider the PC Card that only allows RSA + DES in combination. The syntax is interpreted as: the following scope and strength data applies to any combination of compatible modules on the other side that matches the and/or expression send in place of a plain module name.

Note that for compatibility with HTTP 1.x's stated requirement for unordered, single-ocurrence headers, entire Module: specifications may be `,' concatenated.

Operation

A module corresponds to a single function that transforms HTTP from an input form to an output form. In the process, the code may take advantage of many other resources: key databases, user interface feedback, network access...

This specification intentionally avoids the platform- and browser- specific definition of user interface and calling-convention APIs. It is expected that a companion document will document best practices for several combinations, as has happened for CGI and CCI/SDI.

Modules are evaluated by the following loop: (really bad first cut) [The real point of this loop is that executing one module can dynamically add more modules to the worklist, as would happen with Parser/.]

Pseudocode processing loop:

// Inverse C-T-E

// Inverse C-E

labelsDone := {};

nextModule := C-E_List.next;

while (nextModule && nextModule !in labelsDone)

if nextModule.isOptional -> labelsDone += nextModule

if nextModule.isRequired

if findModule(nextModule)

newHttp := nextModule(Http);

labelsDone += nextModule;

else

error 5xy "Module xx not found"

nextModule := C-E_List.next;

// ... pass result to existing C-T handler

Should accommodation for "functional" be made here?

We expect to build a prototype implementation entirely within a proxy server, so the connection model we expect to design around are pipes and console-log output.

Composition

For a singleton module specification, capability checking asks for a module that matches the name or the name-prefix. For a composite module specification of the frorm "{ a/b c/d } ¦ {e/f g/h}", compatibility means having access to either conjunction, (a/b and c/d) or (e/f and g/h).

Beyond that, we assume that the remainder of the semantics are unchanged. It is merely highly unlikely that compound modules will have refrence standards defining jointly negotiable parameters.

Scope & Strength

In HTTP 1.1 and above, there is no distinction made between proxy servers and origin servers. From either origin's perspective, there is some indeterminate multiple-hop route from end-to-end. This model suggested the four acceptable scopes defining which participants must process Module:s

None: This module is purely an advisory notifiaction of a capability the originating party has. What does the meaning of strength become? Should this be Any instead?
Connection: This module defines processing that the immediate counterparty needs to address (one hop). Regardless of disposition (employed, refused, etc), the module-spec must be stripped before passing along (though a particular proxy may wish to reexpress the same requirement of its next counterparty. See Keep-Alive for an example.
Route: This module indicates processing reelevant to every participant in the conversation. The corresponding module specification must not be removed before passing along. See SSL Tunneling for an example.
Orgin: This module defines processing directed at the originating counterparty. See any end-to-end security attribute for an example.

Borrowing from S-HTTP, there are also strength bindings for each participant in-scope. This part is a little tricky because "refused" makes sense in realtively fewer contexts.

Optional: If the party has no sufficient module, it must remove this module specification from the message. If it does have one, it is not obligated to invoke it.
Required: If the party has no sufficient module it must report a 500-class error. If it has one, it must execute it or determine a 400-class incompability.
Refused: This module (in the modes indicated by the negotiable block) must never be used in a message to the requesting party. Refused specifications must be passed on unmodified to any remaining participants in-scope. If the same module has actually been applied to the current message, participant behavior is undefined. Sounds weaselly, no?

Sufficiency

While the syntax and semantics of this mechanism correspond to the requirements outlined in the beginning, we want to satisfy the additional "sufficiency" goal at the beginning of this section.

Does our compounding syntax offer reasonable power?

Can we forsee outlining the future development of HTTP as 1.2 + future transport imrovements (-NG) + extension modules?

Usage Scenarios

This section exercises the syntax with a few key applications.

Keep-Alive

Motivation

For TCP-based implementations of HTTP clients and servers, it is a clear win to hold a single connection open between requests than to create nd tear down a succession. Negotiable parameters include "maxrequests", the number of transactions envisioned on the current connection.

Description

Clearly, Keep-Alive is restricted to the scope of an immediate connection. Each step in a proxy chain may independently attempt to use this optimization for its traffic, but it is not at the discretion of the origin client or server.

Syntax

The example syntax uses scoping and negotiation. Note that the insertion of a ;tag always implies that a module has been employed -- even if it does not appear in a C-T-E or C-E.

Client: "Module: HTTP/Session/Keep-Alive ;scope=conn ;strength=opt ;negotiable={;maxreq=[1-5]}"

Server: "Module: HTTP/Session/Keep-Alive ;scope=conn ;strength=opt ;negotiable={;maxreq=4} ;tag=foo123"

Open Concerns

What does each party send on the next three transactions over this connection? nothing? a decreasing agreement?

SSL Tunneling

Motivation

As highlighted by Ari Luotonen's SSL Tunnelling RFC, a bidirectional, synchronous message flow needs explicit .setup and agreement by all participants.

Description

Clearly, for session tunneling to succeed, it must be required of an entire route. Each step in a proxy chain must accomodate this extension; any step that doesn't must return a 500-class error. Negotiable parameters may include the protocols that are to be gatewayed (conceivably, some server may wish to gateway SSL, but not SMTP).

Syntax

The example syntax uses scoping and negotiation. Watch for the syntax of negotiable set-elements (though we, could, in theory, use numerical port ranges).

Client: "Module: HTTP/Session/Tunneling ;scope=route ;strength=req ;negotiable={;protocols={;SSL}}"

Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=req ;negotiable={;protocols={;SSL}} ;tag=abc4"

Note that in a negotiation step (using OPTIONS), a server can say:

Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=opt ;negotiable={;protocols={;SSL;Telnet}}"

Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=ref ;negotiable={;protocols={;POP3;SMTP}}"

Open Concerns

Does this cover all the territory of Ari's proposal?

Scripting Capabilities

Motivation

Since this extension mechanism is the HTTP answer for complicated negotiation accomodation as well as message reprocess, it might be able to fulfill one upcoming need: profiling scripting language & library capabilities.

Description

This is a case of pure advertising, where there is no natural scope, but it may be useful for all parties to be informaed that these capabilites exist. Also, for many languages, there may be a language-dependent negotiable component that enumerates available packages/libraries or versions.

Syntax

The example syntax elides a lot because of the defualts. Watch for the syntax of a negotiable set.

Client: "Module: Scripting/JAVA ;negotiable={ ;version=alpha ;packages={;awt ;net}"

Open Concerns

Should this behavior be deprecated in favor of content-negotiation? What about other "machine features" -- may others, such as bit-depth, may also belong in content-negotiation.

Authentication

Another exercise in grandfathering existing patches, Subtly rewrite syntax of Authenticate, Authorize, and innovate on Proxy auth. Figue out how to keeop insecure stuff utside of Security hierarchy.

Signatures

Signatures are a fairly good exercise in scoping and strength, since serveral parties can require signatures to be generated (by several other parties), but only some links in the chain should have to check and enforce signatures.

Encryption

End-to-End: clear scope, strength, tough on negotiation, key-len, and client-side implementation.

Proxy Encryption: Talk to Phill to see if we hit this one.

Content Rating

the immediate driver. Hmm...

Summary

Table

NOTES ON THIS DOCUMENT

Structure? Is it upside down?

Style? Go directly to RFC style, do not pass GO?

Canonicity? Does this document stand alone, with out getting overly wedged in HTTP or security considerations?

Notation: Did everyone catch the status of courier, helvetica, and angle braces? was it too wonkish to refer to ;tag=? italics were supposed to mean "inband commentary"

Should it include historical development: we originally started with CCI & CGI and envisioned a CFI, specifying purely client-side processing instructions, but as we incorporated SHTTP and aimed for symmetry, and incorporated the scoping from KEEP-ALIVE, we came up with...

Does the "Call - response" discussion technique make sense?

OPEN QUESTIONS

Is the processing pipe static? when do we calculate the sequence of modules?

Would Parser/ break this model?

"Messages employing modular extensions should be follow established guidelines of HTTP/MIME consistency on the wire and after all processing has been completed."

Do my two HTTP/Session examples undermine the point of modules as http-reprcessors? Since those two examples clearly don't fit that mold...

Should ;tag be renamed ;label or ;used ?

;cacheable

when do we react to passing through a non-http/1.2 gateway?

FEEDBACK

In tunneling separate port # from protocol type

keep-alive header grandfathering from 1.1 for Connection: and Upgrade:

Go Up (Parent):: [Old-Drafts]
See Also (Siblings):: [Extension Proposal] [W3CSP Core Features] [Extension Syntax] [Modular Extension Mechanism Architecture]

Modular Extension Mechanism for HTTP was converted on Tue Oct 03 21:15:44 EDT 1995 by the eText Engine, version 5, release 0.96