This document is also available in these non-normative formats: XML.
Copyright © 2013 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document contains requirements for the development of version 2.0 of the XML Processing Model and Language (XProc).
This document is an editors' copy that has no official standing.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The W3C Recommendation for version 1.0 of XProc was produced as part of the XML Activity, following the procedures set out for the W3C Process. The goals of the XML Processing Model Working Group are discussed in its charter.
Comments on this document should be sent to the W3C mailing list public-xml-processing-model-comments@w3.org (archive).
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. This document is informative only. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
2 Terminology
3 Design Principles
4 Requirements
4.1 Simplify parameters
4.2 Non XML document processing
4.3 Abandon support for XPath 1.0
4.4 Explicit Flow Handling
4.5 Fully general XDM values
4.6 Remove the concept of "non-step wrapper"
4.7 Allow AVT
4.8 Document metadata
4.9 Steps with varying numbers of inputs and outputs
4.10 Improved status/debugging information
4.11 Extension Libraries
4.12 Enhance Try / Catch step
4.13 Syntactic simplifications
5 Use cases
5.1 Making parameters easier
5.2 Working with JSON
5.3 Working with Turtle
5.4 Working with JSON-LD
5.5 Website Publishing; Working with Web Assets
5.6 EPUB
6 Logged Issues
This document is heavily influenced by the 'XProc 1.0 Solutions Note'.
Editorial note: WE HAVE NOT PUBLISHED THIS NOTE YET
An XML Pipeline is a conceptualization of the flow of a configuration of steps and their parameters. The XML Pipeline defines a process in terms of order, dependencies, or iteration of steps over XML information sets.
The design principles described in this document are requirements, and compliance with them is an overall goal for the specification. It is not necessarily the case that every specific feature meets each requirement. Instead, the whole set of specifications related to this requirements document should be viewed as meeting the overall goal specified in the design principles.
Provide syntactic changes that improve usability and comprehension and make XML Pipelines easier to create and develop.
Provide facilities for allowing both XML and non-XML data to flow through a pipeline.
Review the existing Bugzilla list and address catastrophic, critical and major bugs that require fixing or amendment.
Parameters as defined in v1.0 proved to be too complicated. XProc v2.0 must dramatically simplify parameters.
Change parameters to be more like options. Adopt the XSLT 3.0 extensions to the data model and functions & operators in XPath 3.0 that support maps.
Experience has shown that real-world pipelines often involve non-XML documents. The limitation that V1.0 can only pass XML between steps has proved to be inconvenient. Several workarounds have been invented for special cases.
Providing native processing of non-XML content, within a constrained scope, enables working with mixed document distributions (EPUB, JSON, JSON-LD, Turtle, etc.).
XProc v2.0 must allow non-XML documents to pass through a pipeline.
Supporting both XPath 1.0 and XPath 2.0 complicates the specification. In the V1.0 timeframe, it was necessary to consider implementations that might be based on XPath 1.0. That is no longer the case.
XProc v2.0 must be based on the XQuery 1.0 and XPath 2.0 Data Model or its successors.
Remove any must requirements for supporting XPath 1.0.
Sometimes the flow of control in a pipeline is not manifest from the data flow analysis and sometimes arranging for the data flow analysis to manage every dependency would require great complexity.
There must be a simple mechanism for asserting that step A must run before step B, even if B has no data flow dependency on A.
Editorial note: see Calabash's cx:depends-on
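The kind of dependency this requirement describes can be illustrated with the Calabash extension mentioned in the note. The following is a minimal sketch of an XProc 1.0 pipeline, assuming the XML Calabash `cx:depends-on` extension attribute and its extension namespace; the step names and the file path are hypothetical, and this is not proposed v2.0 syntax.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Step A: writes the source document to disk. p:store has no
       primary output, so nothing downstream is connected to it. -->
  <p:store name="save" href="out/saved.xml"/>

  <!-- Step B: reads the same file back. It has no data flow dependency
       on "save", so without the extension attribute a processor would be
       free to run it first. cx:depends-on (Calabash-specific, assumed
       here) asserts that "save" must run before "reload". -->
  <p:load name="reload" href="out/saved.xml" cx:depends-on="save"/>
</p:declare-step>
```

Without such an assertion, the only portable option in v1.0 is to invent an artificial connection between the two steps, which is the complexity the requirement aims to avoid.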
Variables, options, and parameters must be able to hold arbitrary XDM values, including sequences and nodes.
Must remove the concept of 'Non-step wrappers' by making p:when/p:otherwise in p:choose and p:group/p:catch in p:try compound steps.
Some documents have associated metadata. For example, documents have a content type. XProc v2.0 should provide a mechanism for associating arbitrary metadata with documents.
Some pipeline steps (split, join, nvdl, eval) don't naturally have a fixed number of inputs and outputs. It should be possible to write pipelines such that the number of inputs and outputs varies.
Pipelines should be provided with a simple mechanism for writing status and debug messages.
Pipelines should be able to import external function libraries and be able to invoke them from XPath (scope TBA).
Editorial note: Review F.4.3 Verbosity
The following list of enhancements should be possible.
<p:pipe step="name"/> should bind to the primary output port of the step named 'name'. It is an error if there is no such primary output port.
<p:pipe port="secondary"/> should bind to the 'secondary' port of the step on which the default readable port occurs. It is an error if there is no such step.
<p:input port="portname"/> should be a shortcut for an empty binding.
<p:input port="portname" href="..."/> should be a shortcut for a document binding to the URI specified in href.
No non-default outputs: all standard steps should have at least one primary output port.
Allow data types on variables, options, and parameters: it should be possible to specify the data type of variables, options, and parameters.
Allow p:inline to be optional
Provide a select attribute to p:for-each
<p:input port="parameters"/> as a shorthand for <p:input port="parameters"><p:empty/></p:input>
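To make some of these shortcuts concrete, the sketch below first shows a small XProc 1.0 pipeline written out in full, and then the same steps using the shorthand forms listed above. The step name `transform` and the stylesheet URI `style.xsl` are hypothetical, and the second fragment is only an illustration of the proposed forms, not settled v2.0 syntax.

```xml
<!-- XProc 1.0: every binding is spelled out. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:xslt name="transform">
    <!-- Document binding to a URI. -->
    <p:input port="stylesheet">
      <p:document href="style.xsl"/>
    </p:input>
    <!-- Explicit empty binding for the parameters port. -->
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <p:identity>
    <p:input port="source">
      <!-- Both the step and the port must be named. -->
      <p:pipe step="transform" port="result"/>
    </p:input>
  </p:identity>
</p:declare-step>
```

```xml
<!-- The same steps with the proposed shortcuts (illustrative only). -->
<p:xslt name="transform">
  <!-- href on p:input stands for a document binding. -->
  <p:input port="stylesheet" href="style.xsl"/>
  <!-- An empty p:input stands for an empty binding. -->
  <p:input port="parameters"/>
</p:xslt>

<p:identity>
  <p:input port="source">
    <!-- Omitting port binds to the primary output of "transform". -->
    <p:pipe step="transform"/>
  </p:input>
</p:identity>
```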
This section contains a set of use cases that should be enabled through the fulfillment of v2.0 requirements. They are provided so that we may trace requirements to real world usage, as well as inform v2.0 design decisions.
To aid navigation, the requirements can be mapped to the use cases of this section as follows:
Requirement | Use Cases |
---|---|
Editorial note: Need to decide what unsatisfied use cases are to be included