by Rohit Khare
Wrapping up considerable internal analysis & debate, W3C has prepared a proposal for working with extension mechanisms in HTTP/1.2.
We started with input from several sources:
Scenario planning indicated the following requirements:
A "legally binding" upgrade to the HTTP specification (in 1.2) will ensure that "required" and "refused" mechanisms are obeyed by all parties.
HTTP 1.2 will include a new method for working with modular extensions:
The proposed syntax for Module: is a minor addition to the HTTP grammar. It borrows two idioms, first by extending module names from MIME "category/class" usage to pathname style and, second, by extending the notion of attribute-value pairs to arbitrary trees by adding {} to the HTTP <tspecials> rule.
Module: <ModSpec> <AVTree>
<ModSpec> ::= <ModName> | (`{' <ModSpec>+ `}') | (<ModSpec> `|' <ModSpec>)
[Braces group conjunctions explicitly, so the precedence of `|' alternation is unambiguous.]
<ModName> ::= <token> [`/' <token>]* [`/']
[<token> is restricted to alphanumerics and `+', `-'; a trailing `/' denotes a name-prefix.]
<AVTree> ::= [`;'<token>[`='(<token>|<quotedstr>|`{'<AVTree>`}')]]*
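As a sanity check on the compound <ModSpec> grammar, here is a minimal parsing sketch; the tokenizer and the ('and', ...)/('or', ...) tree shapes are illustrative assumptions, not part of the proposal:

```python
# Hypothetical recursive-descent parser for <ModSpec>.
# Tokens: module names (alnum plus '/', '+', '-'), '{', '}', '|'.
import re

def tokenize(s):
    return re.findall(r'[A-Za-z0-9+/-]+|[{}|]', s)

def parse_modspec(tokens, pos=0):
    """Parse one <ModSpec>; returns (tree, next_pos)."""
    node, pos = parse_term(tokens, pos)
    alts = [node]
    while pos < len(tokens) and tokens[pos] == '|':
        node, pos = parse_term(tokens, pos + 1)
        alts.append(node)
    return (alts[0] if len(alts) == 1 else ('or', alts)), pos

def parse_term(tokens, pos):
    if tokens[pos] == '{':              # conjunction: { ModSpec ModSpec ... }
        pos += 1
        parts = []
        while tokens[pos] != '}':
            node, pos = parse_modspec(tokens, pos)
            parts.append(node)
        return ('and', parts), pos + 1
    return tokens[pos], pos + 1         # a bare module name

tree, _ = parse_modspec(tokenize("{ a/b c/d } | { e/f g/h }"))
```

Running the parser on the composite example from later in this document yields ('or', [('and', ['a/b', 'c/d']), ('and', ['e/f', 'g/h'])]).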
New error code responses.
Optional understood attribute/value pairs: version?
The syntax is the minimum support needed for the behaviors described below; they, in turn, should be judged against the following test:
Is this modular extension mechanism powerful, precise, and flexible enough to encompass the range of future HTTP protocol evolution? As we explain each of the syntactic features (and later, usage scenarios), compare the benefits against that standard.
Modules are specific processing stages that can be applied to HTTP messages and that, in effect, extend the protocol itself. Modules can do this because the associated code can reprocess an entire HTTP request or reply, like a stage in a pipeline.
Modules can also be used to indicate auxiliary capabilities available, e.g. "I have Scripting/Java"
Modules can invoke side-effect processing, i.e. operations that do not affect the HTTP message: "Change the desktop color to red"
Modules that have been applied to a message can be processed, in order, as part of the Content-Encoding and Content-Transfer-Encoding. Each module that has been applied to the current message must have a unique ;tag. The tag names are then referenced from the C-E: and C-T-E: headers to invoke the appropriate processing in the correct order.
Modules that indicate properties of an entity-body, like the signature of a pre-signed document on disk, are invoked from C-E:. Modules that indicate processing of the connection, such as a server's signature, belong in C-T-E:.
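To illustrate the tag-reference idea, a receiver can map the tags listed in C-E: or C-T-E: back to module names and undo them last-applied-first. The tag and module names below are invented for the example:

```python
# Hypothetical ;tag declarations from Module: headers (names invented):
modules = {
    "sig1": "Security/Signature/PKCS7",   # property of the stored entity -> C-E:
    "zip9": "Encoding/Compress/Gzip",     # applied for this transfer -> C-T-E:
}

def decode_order(ce_header):
    """Tags in C-E:/C-T-E: name the modules applied; the receiver
    undoes them in reverse (last-applied-first)."""
    tags = [t.strip() for t in ce_header.split(",")]
    return [modules[t] for t in reversed(tags)]
```

For example, a header listing "sig1, zip9" means the receiver un-gzips before checking the signature.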
Modules should not be used to duplicate MIME processing such as content-type handling.
Modules are named in a well-defined, extensible namespace. The namespace hierarchy implies a simple type system, where compatibility checking asks for common prefixes. This allows a party to ask for "Security/Bulk-Cipher/", which means the counterparty can use any encipherment algorithm that matches the prefix (this is all indicated by the trailing slash).
Each module name is part of a hierarchical namespace (possibly managed by W3C). Each level of the hierarchy has a corresponding definition of its conformance requirements.
Each module specification can be found from a URN. Using the ;domain attribute, parties can find the absolute URN of the reference, or calculate one by concatenating the domain and the module name.
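A sketch of the ;domain rule as read here; the `urn:` concatenation convention shown is only a guess at how the absolute URN would be calculated, since no URN scheme is fixed by this proposal:

```python
def spec_urn(mod_name, domain):
    """;domain is either the absolute URN of the module's specification,
    or a base to concatenate with the module name (rule assumed here)."""
    if domain.startswith("urn:"):            # already an absolute URN
        return domain
    return "urn:" + domain + ":" + mod_name  # hypothetical concatenation rule
```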
possible digressions: 1) the Liskov Substitution Principle 2) the projected openness of the namespace (i.e. infinitely less painful than IANA Media Types) 3) more examples?
Module requirements can also be composed in Boolean combination, as explained below. There are clearly cases where unambiguous compound negotiation is required: consider the PC Card that only allows RSA + DES in combination. The syntax is interpreted as: the following scope and strength data applies to any combination of compatible modules on the other side that matches the and/or expression sent in place of a plain module name.
Note that for compatibility with HTTP 1.x's stated requirement for unordered, single-occurrence headers, entire Module: specifications may be `,' concatenated.
A module corresponds to a single function that transforms HTTP from an input form to an output form. In the process, the code may take advantage of many other resources: key databases, user interface feedback, network access...
This specification intentionally avoids the platform- and browser- specific definition of user interface and calling-convention APIs. It is expected that a companion document will document best practices for several combinations, as has happened for CGI and CCI/SDI.
Modules are evaluated by the following loop. [The real point of this loop is that executing one module can dynamically add more modules to the worklist, as would happen with Parser/.]
Pseudocode processing loop:
// Inverse C-T-E
// Inverse C-E
labelsDone := {};
nextModule := C-E_List.next;
while (nextModule && nextModule !in labelsDone)
    if findModule(nextModule)
        Http := nextModule(Http);        // may append further modules to C-E_List
        labelsDone += nextModule;
    else if nextModule.isOptional
        labelsDone += nextModule;        // optional and unavailable: skip
    else
        error 5xy "Module xx not found"; // required and unavailable
    nextModule := C-E_List.next;
// ... pass result to existing C-T handler
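The same loop as a runnable sketch; the registry and message representations are stand-ins, not an API proposal. Note that a handler can extend the worklist while it runs, which is what the Parser/ case needs:

```python
class ModuleNotFound(Exception):
    """Maps to the proposed 5xy 'Module not found' response."""

def apply_modules(http, worklist, registry):
    """Apply each (name, required) entry in worklist to the message.
    Handlers receive the worklist and may append further modules."""
    done = set()
    i = 0
    while i < len(worklist):
        name, required = worklist[i]
        i += 1
        if name in done:
            continue                       # each label processed once
        handler = registry.get(name)
        if handler is not None:
            http = handler(http, worklist) # may extend worklist dynamically
            done.add(name)
        elif required:
            raise ModuleNotFound(name)     # report 5xy to the peer
        else:
            done.add(name)                 # optional and unavailable: skip
    return http

# Invented demo modules: one rewrites the message, one queues another module.
registry = {
    "Demo/Upper": lambda msg, wl: msg.upper(),
    "Demo/Chain": lambda msg, wl: (wl.append(("Demo/Upper", True)) or msg + "!"),
}
result = apply_modules("hi", [("Demo/Chain", True)], registry)
```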
Should accommodation for "functional" be made here?
We expect to build a prototype implementation entirely within a proxy server, so the connection model we expect to design around is pipes and console-log output.
For a singleton module specification, capability checking asks for a module that matches the name or the name-prefix. For a composite module specification of the form "{ a/b c/d } | { e/f g/h }", compatibility means having access to either conjunction, (a/b and c/d) or (e/f and g/h).
Beyond that, we assume that the remainder of the semantics are unchanged. It is merely highly unlikely that compound modules will have reference standards defining jointly negotiable parameters.
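The compatibility test can be sketched in code; the ('and', ...)/('or', ...) tree is an assumed in-memory representation of a compound <ModSpec>, and prefix matching follows the trailing-slash convention above:

```python
def compatible(spec, supported):
    """spec: a module-name string, ('and', [...]), or ('or', [...]);
    supported: the set of module names the counterparty implements."""
    if isinstance(spec, str):
        if spec.endswith("/"):             # prefix request, e.g. "Security/Bulk-Cipher/"
            return any(m.startswith(spec) for m in supported)
        return spec in supported
    op, parts = spec
    combine = all if op == "and" else any  # conjunction vs. alternation
    return combine(compatible(p, supported) for p in parts)
```

So "{ a/b c/d } | { e/f g/h }" is compatible with a party supporting both e/f and g/h, but not with one supporting only a/b and g/h.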
In HTTP 1.1 and above, there is no distinction made between proxy servers and origin servers. From either endpoint's perspective, there is some indeterminate multiple-hop route from end to end. This model suggested the four acceptable scopes defining which participants must process Module: headers.
Borrowing from S-HTTP, there are also strength bindings for each participant in-scope. This part is a little tricky because "refused" makes sense in relatively fewer contexts.
While the syntax and semantics of this mechanism correspond to the requirements outlined in the beginning, we want to satisfy the additional "sufficiency" goal stated at the beginning of this section.
Does our compounding syntax offer reasonable power?
Can we foresee outlining the future development of HTTP as 1.2 + future transport improvements (-NG) + extension modules?
This section exercises the syntax with a few key applications.
Client: "Module: HTTP/Session/Keep-Alive ;scope=conn ;strength=opt ;negotiable={;maxreq=[1-5]}"
Server: "Module: HTTP/Session/Keep-Alive ;scope=conn ;strength=opt ;negotiable={;maxreq=4} ;tag=foo123"
Client: "Module: HTTP/Session/Tunneling ;scope=route ;strength=req ;negotiable={;protocols={;SSL}}"
Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=req ;negotiable={;protocols={;SSL}} ;tag=abc4"
Note that in a negotiation step (using OPTIONS), a server can say:
Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=opt ;negotiable={;protocols={;SSL;Telnet}}"
Server: "Module: HTTP/Session/Tunneling ;scope=route ;strength=ref ;negotiable={;protocols={;POP3;SMTP}}"
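For the maxreq negotiation in the Keep-Alive example above, the server's choice can be sketched as follows; the "[lo-hi]" range syntax is taken from that example and is not a fixed part of the grammar:

```python
import re

def pick_in_range(offer, preferred):
    """Clamp the server's preferred value into the client's offered
    "[lo-hi]" range, e.g. maxreq=[1-5] answered with maxreq=4."""
    lo, hi = map(int, re.match(r'\[(\d+)-(\d+)\]', offer).groups())
    return max(lo, min(hi, preferred))
```

A server preferring 4 requests answers maxreq=4; one preferring 9 would be clamped to maxreq=5.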
Client: "Module: Scripting/Java ;negotiable={ ;version=alpha ;packages={;awt ;net}}"
Another exercise is grandfathering existing patches: subtly rewrite the syntax of Authenticate and Authorize, and innovate on Proxy auth. Figure out how to keep insecure stuff outside of the Security hierarchy.
Signatures are a fairly good exercise in scoping and strength, since several parties can require signatures to be generated (by several other parties), but only some links in the chain should have to check and enforce signatures.
End-to-End: clear scope, strength, tough on negotiation, key-len, and client-side implementation.
Proxy Encryption: Talk to Phill to see if we hit this one.
the immediate driver. Hmm...
Summary
Table
Structure? Is it upside down?
Style? Go directly to RFC style, do not pass GO?
Canonicity? Does this document stand alone, without getting overly wedged in HTTP or security considerations?
Notation: Did everyone catch the status of courier, helvetica, and angle braces? Was it too wonkish to refer to ;tag=? Italics were supposed to mean "inband commentary".
Should it include historical development: we originally started with CCI & CGI and envisioned a CFI, specifying purely client-side processing instructions, but as we incorporated SHTTP and aimed for symmetry, and incorporated the scoping from KEEP-ALIVE, we came up with...
Does the "Call - response" discussion technique make sense?
Is the processing pipe static? when do we calculate the sequence of modules?
Would Parser/ break this model?
"Messages employing modular extensions should follow established guidelines of HTTP/MIME consistency on the wire and after all processing has been completed."
Do my two HTTP/Session examples undermine the point of modules as HTTP-reprocessors? Since those two examples clearly don't fit that mold...
Should ;tag be renamed ;label or ;used ?
;cacheable
When do we react to passing through a non-HTTP/1.2 gateway?
In tunneling, separate the port # from the protocol type.
keep-alive header grandfathering from 1.1 for Connection: and Upgrade: