Introduction to XSL

In a nutshell: XSL is a W3C specification that describes a method for visually presenting XML documents.

This tutorial will cover:

These slides are available at http://www.w3.org/People/maxf/XSLideMaker/

XML Documents

Styling XML Documents

CSS

With CSS one can associate properties with tags (refer to the Hamlet example):

TITLE {
  display: block; 
  font-family: Helvetica;
  font-size: 18pt
}

Simple model: properties are associated with tags or attributes.

CSS lacks complex page layout, i18n properties, and the capability to perform complex styling operations such as generating a table of contents. A TOC is not styling per se, but it should be produced by the process that builds the actual document rather than written by the author.

XSL

XSL is an alternative to CSS that allows greater control over the presentation of the XML data.

What can it do?

Who is it for?

Applications that require high-level quality formatting:

But it is not meant to be used where presentation is deeply tied to the contents (as in graphic design).

Example I: Hamlet

<ACT>
  <SCENE>
    <TITLE>A room in the castle.</TITLE>

    <STAGEDIR>
      Enter KING CLAUDIUS, QUEEN GERTRUDE, 
      POLONIUS, OPHELIA, ROSENCRANTZ, and 
      GUILDENSTERN
    </STAGEDIR>

    <SPEECH speaker="King Claudius">
      <LINE>And can you, by no drift of circumstance,</LINE>
      <LINE>Get from him why he puts on this confusion,</LINE>
      <LINE>Grating so harshly all his days of quiet</LINE>
      <LINE>With turbulent and dangerous lunacy?</LINE>
    </SPEECH>
...

This example could be done in CSS, but the next two cannot.

Formatted for paper output (PDF), formatted for the Web (XHTML)

Example II: Mixed Writing Modes

[Figure: mixed writing modes]

Example III: database

Complex transforms, 2-key sorting (artist/year), grouping (by artist)

...
  <record year="1992">
    <artist>Sundays, The</artist>
    <title>Blind</title>
  </record>


  <record year="1994">
    <artist>(Various)</artist>
    <title>The Glory of Gershwin</title>
    <note>Compilation</note>
  </record>


  <record type="soundtrack" year="1992">
    <artist>Kamen, Michael</artist>
    <title>Brazil</title>
    <location folder="3" page="20"/>
  </record>
...
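
The two-key sort and the grouping could be sketched in XSLT 1.0 roughly as follows. The element and attribute names come from the records above; the key name `by-artist` and the HTML output are assumptions made for illustration, and the grouping uses the key-based idiom often called Muenchian grouping, which is one possible technique rather than the one the original example necessarily used.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Index every record by its artist (the key name is hypothetical) -->
  <xsl:key name="by-artist" match="record" use="artist"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Visit only the first record of each artist, sorted by artist -->
        <xsl:for-each select="//record[generate-id() =
                              generate-id(key('by-artist', artist)[1])]">
          <xsl:sort select="artist"/>
          <h2><xsl:value-of select="artist"/></h2>
          <ul>
            <!-- Within one artist, list the records sorted by year -->
            <xsl:for-each select="key('by-artist', artist)">
              <xsl:sort select="@year" data-type="number"/>
              <li>
                <xsl:value-of select="@year"/>
                <xsl:text>: </xsl:text>
                <xsl:value-of select="title"/>
              </li>
            </xsl:for-each>
          </ul>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The outer loop provides the first sort key (artist) and the grouping; the inner loop provides the second sort key (year).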

PDF

Other Examples

How do they do that?

The XSL Process(es)

XSL transformation and formatting

The result tree is an XML document in which the markup has information about how to display the document: what font to use, the size of a page, etc. This markup is called Formatting Objects (elements) and Properties (attributes). For example:

<block font-family="Helvetica">ACT III</block>
<block font-size="10pt">Scene 1: A room in the castle</block>
<block space-before="10mm" font-style="italic">
  Enter KING CLAUDIUS, QUEEN GERTRUDE, POLONIUS, 
  OPHELIA, ROSENCRANTZ, and GUILDENSTERN
</block>
...

Generated from:

<ACT>
  <SCENE>
    <TITLE>A room in the castle.</TITLE>

    <STAGEDIR>
      Enter KING CLAUDIUS, QUEEN GERTRUDE, 
      POLONIUS, OPHELIA, ROSENCRANTZ, and 
      GUILDENSTERN
    </STAGEDIR>
...

This example is deliberately simplified. Note the automatic numbering.

Server-Side/Client-Side XSL

The client-side approach is superior in that it decreases the load on the server (which only sends the XML and the stylesheet) and allows user style sheets (the client can add or modify transformation rules to make fonts bigger, etc.).
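
For client-side processing, the XML document usually points at its stylesheet with an `xml-stylesheet` processing instruction. The sketch below assumes a stylesheet file named `play2html.xsl` located next to the document; the `text/xsl` media type is the value browsers have traditionally accepted, and other processors may expect a different one.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- The href is an assumed file name; the browser fetches it and runs the transformation -->
<?xml-stylesheet type="text/xsl" href="play2html.xsl"?>
<PLAY>
  <TITLE>The Tragedy of Hamlet, Prince of Denmark</TITLE>
  <!-- rest of the play -->
</PLAY>
```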

XSL and other W3C specs

XSL uses CSS properties to express formatting information, and uses the CSS inheritance model.

XSLT is more verbose than CSS, but it has an XML syntax, so a stylesheet can be processed with an ordinary XML parser.

XSL and SVG, MathML

Transformations: XSLT

XSLT is a transformation language originally designed to transform any XML document into another XML document containing formatting objects: pages, blocks, graphics, text, etc.

[Figure: XSLT transformation]

General-purpose XSLT

XSLT has evolved to become a general-purpose transformation language from XML to XML.

Many people use it to transform their own XML document type to HTML for viewing within a browser:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">...</xsl:template>
  <xsl:template match="/html">...</xsl:template>
</xsl:stylesheet>

Templates

A template says: "when you find this in the input file, then output this":

<xsl:template match="TITLE">
  <fo:block font-family="Helvetica" font-size="14pt">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

meaning: when you read a tag named 'TITLE', print a block start, continue the matching process for what's inside the <TITLE>, and close the block.

Note the use of namespaces to differentiate between formatting objects and transformation instructions. XSL defines both.

So this will transform:

<TITLE>Hamlet</TITLE>

into

<fo:block font-family="Helvetica" font-size="14pt">
  Hamlet
</fo:block>

HTML can also be generated very simply in the template, using for instance <h1> instead of <fo:block>.

<xsl:apply-templates/> means: apply other templates to contents.

Implicit rule: text is copied from input to output, so a style sheet with no rules returns only the character data of the input.
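
The implicit rule comes from the built-in template rules of XSLT 1.0. Written out explicitly, they behave like the following sketch; a style sheet containing only these two templates produces the same output as an empty one.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Elements and the root: output nothing, keep processing the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their string value to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to `<TITLE>Hamlet</TITLE>`, these rules walk down to the text node and copy its value, so only the text of the element is output and the tags are dropped.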

XSLT statements

Allow navigation and iteration within the input document tree

"Play" to HTML

Very simple one-template example using the 'pull' method:

<?xml version="1.0" encoding="utf-8"?>

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xsl:version="1.0">

  <head>
    <title><xsl:value-of select="PLAY/TITLE"/></title>
  </head>

  <body>
    <h1><xsl:value-of select="PLAY/TITLE"/></h1>

    <xsl:for-each select="PLAY/ACT">
      <xsl:for-each select="SCENE">
        <xsl:if test="TITLE">
          <h2><xsl:value-of select="TITLE"/></h2>
        </xsl:if>
        
        <xsl:for-each select="SPEECH">
          <h3 style="color: red"><xsl:value-of select="SPEAKER"/></h3>
          <xsl:for-each select="LINE">
            <p><xsl:value-of select="."/></p>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:for-each>
  </body>
</html>

Result:

<?xml version="1.0" encoding="utf-8"?>
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <title>The Tragedy of Hamlet, Prince of Denmark</title>
    </head>
    <body>
      <h1>The Tragedy of Hamlet, Prince of Denmark</h1>
      <h2>Elsinore. A platform before the castle.</h2>

      <h3 style="color: red">BERNARDO</h3>
      <p>Who's there?</p>

      <h3 style="color: red">FRANCISCO</h3>
      <p>Nay, answer me: stand, and unfold yourself.</p>
...

Extended output: numbering, TOC, etc.

This uses the 'push' method, where the structure of the output follows that of the input. Roughly, there is one template for each tag type in the input.
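
A minimal sketch of the push style, for comparison with the one-template version: it is not the extended style sheet itself, only an illustration that reuses the element names of the play and the same HTML elements as the pull example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml"
                version="1.0">

  <!-- One template per element type; apply-templates pushes the children through -->
  <xsl:template match="PLAY">
    <html>
      <head><title><xsl:value-of select="TITLE"/></title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

  <xsl:template match="PLAY/TITLE">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <xsl:template match="SCENE/TITLE">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <xsl:template match="SPEAKER">
    <h3 style="color: red"><xsl:apply-templates/></h3>
  </xsl:template>

  <xsl:template match="LINE">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

Elements without a template of their own (stage directions, for example) fall through to the implicit rules, so their text appears unstyled in the output until a template is added for them.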

XPath

Formatting Objects basics

Pages

The Page Model

The area model

Areas are laid out on the page; they contain text, images and other areas. An area is a rectangle, with padding and border:

[Figure: an area with its border, padding and content]
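
In formatting-object terms, a block that generates such an area might look like the sketch below. The property values are arbitrary, chosen only to make the border and padding visible.

```xml
<!-- The content rectangle is surrounded by 2mm of padding, then a 1pt border -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          border-style="solid" border-width="1pt" border-color="black"
          padding="2mm">
  A room in the castle.
</fo:block>
```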

Block/inline areas

XSL introduces the concepts of relative orientation and writing modes. Where CSS defines top, bottom, left and right, XSL adds before, after, start and end. Areas are of type block or inline: blocks are stacked from the 'before' side to the 'after' side, and inlines are stacked orthogonally.

[Figure: inline and block areas]
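
As an illustration, the fragment below stacks two blocks in the block-progression direction and places inlines within the second one. The text and property values are made up for the example, and the flow name `body` is an assumption borrowed from the Play example.

```xml
<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format" flow-name="body">

  <!-- Blocks stack from the 'before' side to the 'after' side -->
  <fo:block space-after="5mm">ACT III</fo:block>

  <!-- Inlines stack from the 'start' side to the 'end' side inside a line -->
  <fo:block start-indent="5mm">
    Enter <fo:inline font-weight="bold">KING CLAUDIUS</fo:inline> and
    <fo:inline font-style="italic">QUEEN GERTRUDE</fo:inline>
  </fo:block>

</fo:flow>
```

Because the properties are expressed relative to the writing mode (`space-after`, `start-indent`) rather than as top or left, the same markup should remain meaningful if the writing mode changes.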

Formatting Objects:

Properties

Example: Play to FO

The style sheet has the same structure as play2html, but the output is now FO instead of HTML. TODO: internationalisation of generated text (Act, Scene); refer to DocBook.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format"
  version="1.0">


<!-- ************************************************************ -->

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="PLAY">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="title-page"
          page-width="210mm" page-height="297mm"
          margin-top="2cm" margin-bottom="2cm"
          margin-left="2cm" margin-right="2cm">
          <fo:region-body region-name="body"/>
        </fo:simple-page-master>
        <fo:simple-page-master master-name="act-page"
          page-width="210mm" page-height="297mm"
          margin-top="2cm" margin-bottom="2cm"
          margin-left="2cm" margin-right="2cm">
          <fo:region-body region-name="body" margin-top="1cm" margin-bottom="1cm"/>
          <fo:region-before extent="1cm" region-name="header"/>
          <fo:region-after extent="1cm" region-name="footer"/>
        </fo:simple-page-master>
      </fo:layout-master-set>

      <fo:page-sequence master-reference="title-page">
        <fo:flow flow-name="body">
          <fo:block display-align="center">
            <xsl:apply-templates select="TITLE"/>
            <xsl:apply-templates select="FM"/>
          </fo:block>
        </fo:flow>
      </fo:page-sequence>

      <fo:page-sequence master-reference="title-page">
        <fo:flow flow-name="body">
          <fo:block space-before="5cm">

            <fo:block font-size="16pt" space-after="3em" text-align="center">
              <xsl:text>Table of Contents</xsl:text>
            </fo:block>

            <fo:block start-indent="3cm" font-size="14pt">
               <xsl:apply-templates select="ACT" mode="toc"/>
            </fo:block>
          </fo:block>
        </fo:flow>
      </fo:page-sequence>

      <fo:page-sequence master-reference="title-page">
        <fo:flow flow-name="body">
          <xsl:apply-templates select="PERSONAE"/>
        </fo:flow>
      </fo:page-sequence>

      <fo:page-sequence master-reference="title-page">
        <fo:flow flow-name="body">
          <xsl:apply-templates select="SCNDESCR"/>
        </fo:flow>
      </fo:page-sequence>

      <xsl:apply-templates select="ACT"/>


    </fo:root>
  </xsl:template>

  <xsl:template match="PLAY/TITLE">
    <fo:block text-align="center" 
              font-size="30pt" 
              space-before="1em" 
              space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="TITLE">
    <fo:block text-align="center" 
              font-size="20pt" 
              space-before="1em" 
              space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="ACT/TITLE">
    <fo:block id="{generate-id()}"
              text-align="center" 
              font-size="20pt" 
              space-before="1em" 
              space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="SCENE/TITLE">
    <fo:block text-align="center" 
              font-size="16pt" 
              space-before="1em" 
              space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="FM">
    <fo:block text-align="center" 
              font-size="10pt"
              space-before="1em"
              space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="PERSONAE/PERSONA | PERSONAE/PGROUP">
    <fo:block space-after=".5em"><xsl:apply-templates/></fo:block>
  </xsl:template>

  <xsl:template match="PERSONAE/PGROUP/PERSONA">
    <fo:block><xsl:apply-templates/></fo:block>
  </xsl:template>

  <xsl:template match="GRPDESCR">
    <fo:block start-indent="5mm"><xsl:apply-templates/></fo:block>
  </xsl:template>

  <xsl:template match="SCNDESCR">
    <fo:block text-align="center" 
              font-size="20pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="SCENE">
    <fo:block 
      id="{generate-id()}"
      font-size="20pt" 
      space-before.optimum="10pt" space-after.optimum="5pt"
      text-align="center">
      <xsl:text>Scene </xsl:text>
      <xsl:number/>
    </fo:block>
    <xsl:apply-templates/>
  </xsl:template>
    

  <xsl:template match="ACT">
    <fo:page-sequence master-reference="act-page">
      <fo:static-content flow-name="header">
        <fo:block text-align="end">
          <xsl:value-of select="/PLAY/PLAYSUBT"/>
          <xsl:text> - Act </xsl:text>
          <xsl:number format="I"/>
        </fo:block>
      </fo:static-content>
      <fo:static-content flow-name="footer">
        <fo:block text-align="end">
          <fo:page-number/>
        </fo:block>
      </fo:static-content>
      <fo:flow flow-name="body">

        <fo:block id="{generate-id()}"
          font-size="24pt" 
          space-before.optimum="10pt" space-after.optimum="5pt"
          text-align="center">
          <xsl:text>Act </xsl:text>
          <xsl:number format="I"/>
        </fo:block>


        <xsl:apply-templates/>
      </fo:flow>
    </fo:page-sequence>
  </xsl:template>

  <xsl:template match="ACT" mode="toc">
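    <fo:block>
      <fo:basic-link internal-destination="{generate-id()}">
        <xsl:text>Act </xsl:text>
        <xsl:number format="I"/>
      </fo:basic-link>
      <fo:leader leader-length="5cm" leader-pattern="dots" leader-alignment="reference-area"/>
      p. <fo:page-number-citation ref-id="{generate-id()}"/>
    </fo:block>
    <!-- Only scenes have a toc-mode template; selecting them keeps the
         built-in rules from copying the act title text into the TOC -->
    <xsl:apply-templates select="SCENE" mode="toc"/>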
  </xsl:template>

  <xsl:template match="SCENE" mode="toc">
    <fo:block text-indent="2em">
      <fo:basic-link internal-destination="{generate-id()}">
        <xsl:text>Scene </xsl:text>
        <xsl:number/>
      </fo:basic-link>
      <fo:leader leader-length="5cm" leader-pattern="dots"/>
      p. <fo:page-number-citation ref-id="{generate-id()}"/>
    </fo:block>
  </xsl:template>



  <xsl:template match="STAGEDIR">
    <fo:block text-align="center" 
              font-size="10pt"
              font-style="italic"
              space-before=".5em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="SPEAKER">
    <fo:block text-align="center" 
              font-size="10pt"
              space-before="1em"
              space-after=".5em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="LINE">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>


</xsl:stylesheet>

Top-level Template

Here we set the page format, running headers and footers, and columns.

For the sake of simplicity, we use one type of page, one size, no alternatives.

Page Format

<xsl:template match="/">
  <fo:root>
   <fo:layout-master-set>
     <fo:simple-page-master master-name="article-page"
       page-height="297mm" page-width="210mm"
       margin-top="20mm"  margin-bottom="10mm" 
       margin-left="10mm" margin-right="10mm">
       <fo:region-body region-name="main" column-count="2"/>
       <fo:region-before region-name="header" extent="10pt"/>
       <fo:region-after region-name="footer" extent="10pt"/>
     </fo:simple-page-master>

Page Sequence Master

A sequence of a single page. We could have made two page masters (odd and even) and used an alternating page sequence master.

     <fo:page-sequence-master master-name="article-sequence">
       <fo:single-page-master-reference master-reference="article-page"/>
     </fo:page-sequence-master>
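
The alternating variant could be sketched as follows, using the element and attribute names of the XSL 1.0 Recommendation. The master names `odd-page` and `even-page` are hypothetical; each would need its own `fo:simple-page-master` definition in the layout master set.

```xml
<fo:page-sequence-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
                         master-name="article-sequence">
  <!-- Repeat for as many pages as needed, picking a master by page parity -->
  <fo:repeatable-page-master-alternatives>
    <fo:conditional-page-master-reference master-reference="odd-page"
                                          odd-or-even="odd"/>
    <fo:conditional-page-master-reference master-reference="even-page"
                                          odd-or-even="even"/>
  </fo:repeatable-page-master-alternatives>
</fo:page-sequence-master>
```

This is typically used to mirror the inner and outer margins on facing pages.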

Page Sequence

Flow

A flow contains blocks, which contain text and inlines.
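
A sketch of how a page sequence and its flow fit together, assuming the `article-sequence` master and the `main` and `header` regions defined in the preceding slides. The block contents are placeholders, and `master-reference` is the attribute name used by the XSL 1.0 Recommendation.

```xml
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
                  master-reference="article-sequence">

  <!-- Static content is repeated on every page, in the named region -->
  <fo:static-content flow-name="header">
    <fo:block text-align="end">Running header</fo:block>
  </fo:static-content>

  <!-- The flow is poured into the body region, page after page -->
  <fo:flow flow-name="main">
    <fo:block font-size="14pt">A title</fo:block>
    <fo:block>
      Body text with an <fo:inline font-style="italic">inline</fo:inline> in it.
    </fo:block>
  </fo:flow>

</fo:page-sequence>
```

The `flow-name` values must match the `region-name` values of the page master; otherwise the formatter has nowhere to place the content.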

I18N Formatting Objects and Properties

[Figure: horizontal and vertical baselines]

Other Formatting Objects

And Properties

Example: mixed writing modes


If you are still interested...

Status of the specifications

Implementations

The Future