Engines

From EXPath Community Group

List, description and comparison of various XPath engines (or engines for related languages).

Raw list

Very basic list, in alphabetical order:

  • BaseX, a native XML database with XQuery support.
  • Calabash, Norman Walsh's XProc processor.
  • EMC Documentum xDB by EMC, a native XML database with XQuery support.
  • EMC Documentum XProc Engine (aka Calumet), EMC's XProc processor, developed by Vojtech Toman.
  • eXist, an open source native XML database with XQuery and XSLT support.
  • Intel SOA Expressway's XSLT 2.0 Processor (this is an alpha version, post comments on this forum).
  • MarkLogic Server, a NoSQL database providing XQuery.
  • MXQuery by ETH Zurich, an in-memory XQuery processor.
  • Oracle Berkeley DB XML, an open source native XML database with XQuery support.
  • Sausalito, by 28msec. Not a processor, but a suite of tools that allow you to write, test, and deploy full-fledged Web-based applications, coded entirely in XQuery.
  • Qizx, by XMLmind, a fast XML repository and search engine fully supporting XQuery.
  • QuiXPath by Innovimax, a streaming XPath 1.0 implementation.
  • QuiXProc by Innovimax, a streaming XProc 1.0 implementation.
  • QuiXPath 2.0 by Innovimax/INRIA, a streaming XPath 2.0 implementation.
  • QuiXSchematron by Innovimax/INRIA, a streaming Schematron implementation.
  • QuiXSLT by Innovimax/INRIA, a streaming XSLT 3.0 implementation.
  • Saxon by Saxonica (by Michael Kay), an in-memory XPath 2.0, XSLT 2.0, XQuery and XML Schema processor.
  • Xalan by Apache, an XPath 1.0 and XSLT 1.0 processor, used in Sun's standard Java distribution.
  • XQilla, an in-memory XQuery processor in C++, by John Snelson.
  • XQSharp, an in-memory XQuery 1.0 and XPath 2.0 processor for .NET.
  • Zorba by the FLWOR Foundation, an embeddable XQuery processor.

Comparison

Comparison matrix for major languages and features, for major engines.

Specifics and notes are in the next section, with details for each engine.

This table compares several engines (the rows) based on the specifications they implement or the features they provide (the columns).

Update, Full Text and Scripting are respectively the draft specs XQuery Update Facility, XQuery and XPath Full Text, and XQuery Scripting Extension.

Packaging and Webapps refer, respectively, to an existing solution for building and installing packages and to facilities for building webapps (this means principally being able to execute a component in response to an HTTP request, to access the attributes and values of that request, and to specify the HTTP response to return).
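The three webapp capabilities named above (run a component for a request, read the request, decide the response) can be pictured with a minimal sketch. The `Request` and `Response` classes and the `handle` method are hypothetical names invented for illustration; they do not correspond to the API of any engine listed on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WebappSketch {

    // Hypothetical request model: HTTP method, path and query parameters.
    static final class Request {
        final String method;
        final String path;
        final Map<String, String> params;

        Request(String method, String path, Map<String, String> params) {
            this.method = method;
            this.path = path;
            this.params = params;
        }
    }

    // Hypothetical response model: status code, content type and body.
    static final class Response {
        final int status;
        final String contentType;
        final String body;

        Response(int status, String contentType, String body) {
            this.status = status;
            this.contentType = contentType;
            this.body = body;
        }
    }

    // The "component": it inspects the request and decides what to return.
    // A real component would also escape the parameter before embedding it in XML.
    static Response handle(Request request) {
        if (!"GET".equals(request.method)) {
            return new Response(405, "text/plain", "Method not allowed");
        }
        String name = request.params.getOrDefault("name", "world");
        return new Response(200, "application/xml",
                "<greeting>Hello, " + name + "!</greeting>");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "EXPath");
        Response response = handle(new Request("GET", "/hello", params));
        System.out.println(response.status + " " + response.contentType);
        System.out.println(response.body);
    }
}
```

In an engine that provides webapp facilities, the component would typically be an XQuery function, an XSLT stylesheet or an XProc pipeline rather than a Java method, but the contract is the same.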

Please note that this matrix is only an overview; be sure to read the next section for notes and details where appropriate.

                                            XQuery Family                             Cross Language Feature
Processor       XPath XSLT XProc XML Schema XQuery XQueryX Update Full Text Scripting Try/catch Function items Packaging Webapps
BaseX           3.1   Yes  No    1.0        3.1    No      Yes    Yes       No        Yes       Yes            Yes       Yes
Berkeley DB XML 2.0   No   No    ?          1.0    No      Yes    No        No        No        No             No        No
Calabash        2.0   2.0  1.0   1.1        1.0    No      No     No        No        No        No             Yes       Yes
Calumet         1.0   Yes  1.0   No         1.0    No      Yes    Yes       No        No        No             No        No
EMC xDB         1.0   Yes  1.0   ?          1.0    No      Yes    Yes       No        No        No             No        No
eXist           3.1   Yes  1.0   1.0        3.1    No      Yes    No        No        Yes       Yes            Yes       Yes
Exselt          No    No   No    No         No     No      No     No        No        No        No             No        No
MarkLogic       2.0   2.0  No    1.0        3.0    No      No     No        No        Yes       Yes            Yes       Yes
Qizx            2.0   Yes  No    No         1.0    No      Yes    Yes       Yes       Yes       No             Yes       No
QuiXTools       2.0   3.0  1.0   Yes        1.0    No      No     No        No        Yes       No             No        No
Saxon           3.0   3.0  No    1.1        3.0    No      Yes    No        No        Yes       Yes            Yes       Yes
Xalan           1.0   1.0  No    No         No     No      No     No        No        No        No             No        No
XQilla          2.0   No   No    ?          1.0    No      Yes    Yes       No        No        Yes            No        No
XQSharp         2.0   2.0  No    ?          1.0    Yes     Yes    No        No        No        No             No        No
Zorba           2.0   1.0  No    ?          3.0    Yes     Yes    Yes       Yes       Yes       No             No        No

In addition to this table, an interesting link (though limited to XQuery) is the official XQuery Test Suite Result Summary hosted by the W3C XML Query working group.

Details

Details of several engines.

BaseX

BaseX is a native, open-source XML database and efficient XPath/XQuery processor, including support for the latest Full Text and Update recommendations. It supports very large XML instances and offers a highly interactive frontend. BaseX is written in Java and freely available for download. It is developed by the Database and Information Systems Group at the University of Konstanz.

XSLT support is provided by integration with an external transformer such as Saxon or Xalan.
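As a rough illustration of what integration with an external transformer can look like on the Java platform, the sketch below uses the standard JAXP API (`javax.xml.transform`). Whether BaseX delegates through exactly this API is an assumption; the class `ExternalXsltDemo` and its `transform` helper are hypothetical names. `TransformerFactory.newInstance()` returns whichever XSLT implementation is configured or found on the classpath (Saxon or Xalan, for example) and otherwise falls back to the processor bundled with the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExternalXsltDemo {

    // Hypothetical helper: applies a stylesheet to a document, both given as strings.
    static String transform(String xml, String xslt) {
        try {
            // JAXP selects the XSLT implementation; the calling code does not
            // name Saxon, Xalan or any other processor directly.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrow as unchecked to keep the sketch short.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // An XSLT 1.0 stylesheet that outputs the text of the root element.
        String xslt =
                "<xsl:stylesheet version='1.0' "
              + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
              + "<xsl:output method='text'/>"
              + "<xsl:template match='/'>"
              + "<xsl:value-of select='/greeting'/>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        System.out.println(transform("<greeting>hello</greeting>", xslt));
    }
}
```

Because the processor is chosen by the factory, swapping one transformer for another is usually a matter of classpath or configuration rather than code changes.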

Berkeley DB XML

Berkeley DB XML does not provide a way to evaluate an XPath expression while rejecting XQuery expressions that are not strictly XPath expressions. But since XPath expressions are valid XQuery expressions, it can be considered to provide a way to evaluate XPath 2.0 as well as XQuery 1.0.

Calabash

XML Calabash is an open-source XProc processor, written in Java by Norman Walsh. Website is at http://xmlcalabash.com/, and the code repository is on GitHub.

Packaging is implemented in Calabash as a third-party open-source implementation: expath-pkg-java.

Webapp is implemented on top of Calabash as a third-party open-source implementation: Servlex.

XML Schema is supported in non-free editions of Saxon.

EMC Documentum xDB

EMC Documentum xDB (formerly known as X-Hive/DB) is a scalable, high-performance native XML database optimized for storing and querying large volumes of content. Written in 100% Java, xDB provides a fully persistent, transactional DOM Level 3 implementation. xDB is easy to embed, with fully configurable storage and memory footprint. EMC provides xDB free of charge for development purposes on the EMC Developer Network website.

You can try a live demo here.

Support for XPath 2.0 in xDB is being considered. In the meantime, XQuery remains the main query language in xDB.

XSLT support is provided by integration with an external transformer such as Saxon or Xalan.

XProc support is provided by integration with EMC Documentum XProc Engine (AKA Calumet).

EMC Documentum XProc Engine (AKA Calumet)

EMC Documentum XProc Engine (aka Calumet) is a Java XProc implementation. It can be used either as an embedded component in larger applications, or as a standalone tool with a simple command-line interface. Calumet features an open architecture that makes it possible to register plug-ins that customize the default behavior of the processor or provide new functionality, such as extension XProc steps. Calumet provides seamless integration with EMC Documentum xDB as well as various other 3rd-party tools. Calumet is available free of charge for development purposes on the EMC Developer Network website.

It supports XPath 1.0, but support for XPath 2.0 is being considered.

XSLT support is provided by integration with an external transformer such as Saxon or Xalan.

XQuery support is provided by integration with EMC Documentum xDB (including XQuery Update and Full Text).

EMC Documentum XForms Engine

EMC Documentum XForms Engine (aka Formula) is an XForms 1.1 engine based on the Google Web Toolkit. This means it is coded in the Java programming language, and compiled into JavaScript which can be executed by most modern browsers, without the need for a plugin or processing outside of the browser. It comes with Java and JavaScript application programming interfaces which make it easy to embed Formula in various types of web applications. EMC provides the Formula XForms engine free of charge for development purposes. It can be obtained from the EMC Developer Network website.

You can try a live demo here.

eXist

eXist is an open source native XML database featuring efficient, index-based XQuery processing. It has a modular indexing architecture and provides a powerful environment for the development of web applications based on XQuery, XSLT and related standards. Entire web applications can be written in XQuery, using XSLT, XHTML, CSS, and JavaScript (for AJAX functionality).

eXist has now been around since late 2000 and provides a huge number of XML technology features. There is a large, active and well established Open Source community supporting and driving forward eXist's development.

XSLT support is provided by integration with an external transformer such as Saxon or Xalan. A native implementation is in development.

The XProc support in eXist is partial; see "8.3. xprocxq compliance and limitations" for further details.

eXist does not support the Full Text spec, but integrates Lucene to provide a similar functionality.

MarkLogic

MarkLogic is a highly scalable NoSQL database that combines an application server, transactional persistent storage (XML, JSON, text, and binary), a full-text search engine and a triple store.

It includes a native XQuery engine, an HTTP server, a JavaScript engine, a SPARQL engine and a shared-nothing cluster architecture to simplify how developers build and deploy rich information applications. MarkLogic is used today in production by leading organizations in media, government, financial services, and healthcare.

You can find more technical information and documentation on the developer network.

MarkLogic implements most of XQuery 3.0, but still lacks some features, such as grouping. XQuery Update and Full Text are not supported, but MarkLogic provides proprietary functions to achieve the same goals.

MarkLogic contains a built-in application server with XQuery and JavaScript as programming languages. It supports HTTP 1.1, integrated SSL, multipart form handling, URL rewriting, error handling, etc.

Qizx

Qizx is a native XML repository and search engine optimized for fast queries. It supports XQuery Update Facility and XQuery Scripting Extension, as well as database features such as transactions and isolation. There is a fully open-source edition, called Qizx/open, with all XQuery features but no database support. See the features and product pages for more information.

XSLT support is provided by integration with an external transformer such as Saxon or Xalan.

The level of support of XQuery is not clear from the website.

QuiXTools

QuiXTools is a family of products based on QuiXPath that implement many XML standards using streaming processing:

  • QuiXProc for XProc 1.0
  • QuiXSLT for XSLT 3.0
  • QuiXSchematron for ISO Schematron
  • QuiXPath for XPath 2.0

Saxon

Saxon is a standalone XSLT, XQuery and XML Schema engine. It exists for Java, C, PHP and JavaScript (in the browser).

Saxon comes in several editions supporting several levels of conformance. The Home Edition supports basic conformance and is open-source and free. Other editions add more advanced features and extensions, and require a commercial license.

See the feature matrix for details.

XML Schema is supported in non-free editions of Saxon.

Packaging is implemented in Saxon as a third-party open-source implementation: expath-pkg-java.

Webapp is implemented on top of Saxon as a third-party open-source implementation: Servlex.

XQilla

The XSLT support for XQilla is under development. A partial implementation is available.

XQSharp

XQSharp is a standards-compliant implementation of XQuery 1.0, XQuery Update Facility 1.0, XPath 2.0 and XSLT 2.0 for the Microsoft .NET Framework. The XQSharp API builds upon the classes in the System.Xml namespace. It is written in 100% managed code for Microsoft .NET Framework version 2.0 or later. It is provided as a single strong-named assembly which is suitable for use in Low and Medium Trust environments.

For an interesting and rather unusual example of XQuery application, see the article written by the XQSharp team about building a raytracer in XQuery.

Zorba

Zorba is a general purpose open-source XQuery processor that is written in C++. It implements most of the XQuery-related specifications: XPath, XQuery, Update, Scripting, Full-Text, XSLT, XQueryX, and more. Moreover, it comes with a rich XQuery library providing modules such as http, cryptography, image processing, geo projections, emails, data cleaning, data converters, or data formatting. Zorba is available for Windows, Linux, and Mac OS and provides language bindings for languages other than C/C++ such as PHP, Ruby, Python, and Java.

You can find a live demo at http://try.zorba-xquery.org.