Webapp Module

From EXPath Community Group

The webapp module provides a way to write web applications directly in XSLT, XQuery and/or XProc. The homepage of the module is http://expath.org/modules/webapp/. The module is being defined by first looking at existing solutions and experimenting with a toy implementation, before writing down a spec. This page gathers several observations as well as drafts. There is also specific information about Servlex, the implementation for Saxon and Calabash built on Java Servlets.

Some notes can also be found in Florent Georges's wiki.

Servlet definition

A servlet is a component that takes a request and a context as input, and provides a response as output. The request is represented by an element web:request and a sequence of zero or more request bodies. The context is represented by an element web:application and an element web:servlet. The response is represented by an element web:response and a sequence of zero or more response bodies.

A servlet can be implemented using one of various technologies. Each kind of servlet has its own rules for receiving requests and providing responses. The available servlet kinds are:

  • an XPath function (provided by an XQuery library module, a stylesheet, or any other implementation-specific means);
  • an XSLT named template;
  • an XSLT stylesheet;
  • an XQuery main module;
  • an XProc pipeline;
  • an XProc step.

Webapp structure

A webapp contains a web descriptor, which maps request URIs (sent to an HTTP server) to servlets (as explained above). A webapp is packaged as a regular EXPath package, with one additional file at the root: the web descriptor, called expath-web.xml (so it sits next to the package descriptor, called expath-pkg.xml, as defined in the Packaging System specification). The web descriptor looks like the following:

<!--
  The prefix 'app' is used here to refer to XPath functions defined
    elsewhere in the package (e.g. in an XQuery library)
  @name is the name of the webapp (a URI)
  @abbrev is the abbreviation of the webapp, used to access it
    (e.g. when deploying on Servlex, use Servlex URI + {abbrev}/ to
    access the webapp root)
  @version is the webapp version
-->
<webapp xmlns="http://expath.org/ns/webapp/descriptor"
        xmlns:app="http://example.org/ns/my-website"
        name="http://example.org/my-website"
        abbrev="mine"
        version="0.1.0">

   <title>My webapp title</title>

   <!-- resource to serve straight away -->
   <resource pattern="/style/.+\.css"  media-type="text/css"/>
   <resource pattern="/images/.+\.png" media-type="image/png"/>

   <!--
       a servlet matching the URI [server]/[abbrev]/index, and
       implemented by an XSLT stylesheet
       @name is just a name in order to clarify things...
   -->
   <servlet name="index">
      <xslt uri="http://example.org/website/index.xsl"/>
      <url pattern="/index"/>
   </servlet>

   <!--
       a servlet matching the URI [server]/[abbrev]/thing/*,
       implemented by an XProc pipeline, and the container passes it
       the param 'thing' extracted from the URI
   -->
   <servlet name="thing">
      <xproc uri="http://example.org/website/thing.xproc"/>
      <url pattern="/thing/([^/]+)">
         <match group="1" name="thing"/>
      </url>
   </servlet>

</webapp>

The package for the webapp with the above descriptor must contain an XSLT stylesheet and an XProc pipeline, registered under the public URIs the descriptor refers to (component names are defined in expath-pkg.xml, as in any package). See http://expath.org/modules/xproject/ for an easy way to build a package (this is a general packaging solution, but if you save the web descriptor in xproject/expath-web.xml, it will be packaged as expected).
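As an illustration, the package descriptor sitting next to the web descriptor above could look like the following sketch. It follows the general shape of expath-pkg.xml from the Packaging System. The file names index.xsl and thing.xproc are hypothetical, and reusing the webapp name, abbreviation and version for the package is an assumption.

```xml
<!-- expath-pkg.xml: sketch only, file names are hypothetical -->
<package xmlns="http://expath.org/ns/pkg"
         name="http://example.org/my-website"
         abbrev="mine"
         version="0.1.0"
         spec="1.0">

   <title>My webapp title</title>

   <!-- the stylesheet referenced by the servlet "index" -->
   <xslt>
      <import-uri>http://example.org/website/index.xsl</import-uri>
      <file>index.xsl</file>
   </xslt>

   <!-- the pipeline referenced by the servlet "thing" -->
   <xproc>
      <import-uri>http://example.org/website/thing.xproc</import-uri>
      <file>thing.xproc</file>
   </xproc>

</package>
```

The import URIs are the link between the two descriptors: the web descriptor names a component by its public URI, and the package descriptor resolves that URI to a file in the package.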

Once the webapp is deployed, when a user sends an HTTP request to, say, http://[server]/[abbrev]/index, Servlex looks at the mapping in the web descriptor, sees that the servlet "index" matches the URI (thanks to the value of url/@pattern, which is a regex), and so knows which component must be evaluated to serve the request (here the XSLT stylesheet with the public import URI http://example.org/website/index.xsl).
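To make the regex matching concrete, the following sketch shows how a group could be extracted from a request path for the servlet "thing" of the descriptor above. It is not how Servlex does it internally: anchoring the pattern with ^ and $, the sample path /thing/42 and the param output element are all assumptions made for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

   <!-- entry point, to be invoked as the initial named template -->
   <xsl:template name="main">
      <!-- hypothetical request path, relative to the webapp root -->
      <xsl:variable name="path" select="'/thing/42'"/>
      <!-- same regex as url/@pattern, anchored here by assumption -->
      <xsl:analyze-string select="$path" regex="^/thing/([^/]+)$">
         <xsl:matching-substring>
            <!-- group 1 becomes the value of the parameter 'thing' -->
            <param name="thing" value="{regex-group(1)}"/>
         </xsl:matching-substring>
      </xsl:analyze-string>
   </xsl:template>

</xsl:stylesheet>
```

With the sample path, the first regex group is the string 42, which is what match/@group="1" with @name="thing" designates in the descriptor.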

The component is passed an element web:request, which looks like the following (the page at http://h2oconsulting.be/tools/dump shows a real dump of the actual web:request passed to the component implementing that page):

<request servlet="index" path="/index" method="get" xmlns="http://expath.org/ns/webapp">
   <uri>http://[server]/[abbrev]/index</uri>
   <authority>http://[domain]</authority>
   <context-root>[...]/[abbrev]</context-root>
   <path>
      <!-- this is cut down into parts, in case of regex groups in url/@pattern -->
      <part>/index</part>
   </path>
   <header name="host" value="[domain]"/>
   <header name="user-agent" value="..."/>
   ...
</request>

In our example, the stylesheet is then executed with the above input document, which represents the HTTP request to be served. The result of the transform, on the other hand, describes the response to be sent back to the user. It looks like the following:

<web:response status="200" message="Ok" xmlns:web="http://expath.org/ns/webapp">
   <web:header name="..." value="..."/>
   <web:body content-type="text/html">
      <html>
         <head>
            <title>Hello</title>
         </head>
         <body>
            <p>Hello, world!</p>
         </body>
      </html>
   </web:body>
</web:response>
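Putting the request and the response together, a stylesheet implementing the servlet "index" could be sketched as follows. This is only an illustration: it assumes that the web:request element (or a document containing it) is the input of the transform, and it ignores request bodies altogether.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:web="http://expath.org/ns/webapp"
                version="2.0">

   <!-- the request description is the input, the response is the result -->
   <xsl:template match="web:request">
      <web:response status="200" message="Ok">
         <web:body content-type="text/html">
            <html>
               <head>
                  <title>Hello</title>
               </head>
               <body>
                  <p>Hello, world!</p>
                  <!-- echo the requested URI, taken from the request -->
                  <p>You asked for: <xsl:value-of select="web:uri"/></p>
               </body>
            </html>
         </web:body>
      </web:response>
   </xsl:template>

</xsl:stylesheet>
```

The pattern is written as web:request rather than /web:request so that it matches whether or not the element has a document node as its parent.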

The web descriptor can also contain error handlers (which use components to handle generic or specific XPath errors thrown in servlets), and filters (which use components to pre- and post-process inputs and outputs to and from servlets and other filters):

   <!-- the pipeline called to treat any error thrown in another servlet -->
   <error catch="*">
      <xproc uri="http://example.org/my-website/error-handler.xproc"/>
   </error>

   <!-- filter post-processing the output of servlets by applying a stylesheet -->
   <filter name="format">
      <out>
         <xslt uri="http://example.org/my-website/page.xsl"/>
      </out>
   </filter>
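As a sketch of what the stylesheet behind the filter "format" could do, the following transform copies a response unchanged except for a banner added at the top of the HTML body. It assumes that an out filter receives the web:response produced by the servlet as its input and returns the modified web:response. That contract, as well as the div banner, is an assumption.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:web="http://expath.org/ns/webapp"
                version="2.0">

   <!-- identity: copy everything that is not handled explicitly -->
   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- add a common banner at the top of the HTML body of the response -->
   <xsl:template match="web:body/html/body">
      <xsl:copy>
         <xsl:apply-templates select="@*"/>
         <div id="header">My webapp title</div>
         <xsl:apply-templates select="node()"/>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>
```

The html and body elements are matched in no namespace, as in the response example above. A servlet returning XHTML would need a namespace-qualified pattern instead.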

TODO

Misc

See the various points at the end of this wiki page, related to session management, a function library and deployment setup.

Filters and error handlers

Filters will probably be based on the following, that is, a filter is defined as a pair of components (for a pure transformer, just define an out part without an in part):

<filter name="one"? group="general"?>
   <in>
      [ xslt | xquery | xproc ]
   </in>?
   <out>
      [ xslt | xquery | xproc ]
   </out>?
   <url pattern="/pages/*"/>?
</filter>
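A concrete instance of this draft grammar might look like the following. The filter name, the group name and the check-request.xproc pipeline are hypothetical, and since the syntax is still a proposal it may well change.

```xml
<filter name="layout" group="general">
   <!-- pre-process the request before it reaches the servlet -->
   <in>
      <xproc uri="http://example.org/my-website/check-request.xproc"/>
   </in>
   <!-- post-process the response by applying the page layout -->
   <out>
      <xslt uri="http://example.org/my-website/page.xsl"/>
   </out>
   <!-- only apply to the pages below /pages/ -->
   <url pattern="/pages/*"/>
</filter>
```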

For error handlers, map error codes (aka QNames) to components in the same way.
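For instance, a specific handler and a catch-all could be declared side by side, as in the sketch below. It assumes that @catch accepts a QName as well as the wildcard used earlier on this page, that the prefix app is bound on the enclosing webapp element, and that an error app:not-found exists. All three are assumptions.

```xml
<!-- handles only the (hypothetical) error app:not-found -->
<error catch="app:not-found">
   <xslt uri="http://example.org/my-website/not-found.xsl"/>
</error>

<!-- fallback for any other error -->
<error catch="*">
   <xproc uri="http://example.org/my-website/error-handler.xproc"/>
</error>
```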

Challenges: how to order them? For instance, should a filter be applied to the output of an error handler? In some cases we want that (e.g. for a website layout), in other cases we don't, and in some cases we want an error handler to catch errors thrown in filters. → Order them as they are declared in the file.

Groups: define groups of servlets / filters / error handlers. The semantics is that the filters and error handlers of a group apply to the servlets of that group. Define a group like a substitution group in XSD or a mode in XSLT: by using a QName and an attribute (and maybe a group element for meta information, like xsl:mode in XSLT 3.0).

Maybe use the name chain instead of group, as it better conveys the idea of an order between items. A chain can be defined either by giving filters and error handlers a @chain attribute with the same name, or by using the element chain:

<chain name="my-chain">
   <filter ...>
      ...
   </filter>
   <error .../>
   <chain name="other-chain"/>
   <filter .../>
</chain>

A chain can itself be nested in another chain.