Copyright ©2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML is an extremely versatile markup language, capable of labeling the information content of diverse data sources including structured and semi-structured documents, relational databases, and object repositories. A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware. Because query languages have traditionally been designed for specific kinds of data, most existing proposals for XML query languages are robust for particular types of data sources but weak for other types. This specification describes a new query language called XQuery, which is designed to be broadly applicable across all types of XML data sources.
This document is the first publicly available W3C Working Draft of XQuery, for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or made obsolete by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
Comments on this document should be sent to the W3C mailing list www-xml-query-comments@w3.org (archived at http://lists.w3.org/Archives/Public/www-xml-query-comments/).
This document was produced by the W3C XML Query Working Group, which is part of the W3C XML Activity. A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
As increasing amounts of information are stored, exchanged, and presented using XML, the ability to intelligently query XML data sources becomes increasingly important. One of the great strengths of XML is its flexibility in representing many kinds of information, including information traditionally considered to be a database and information traditionally considered to be a document. To exploit this flexibility, an XML query language must include the features that are necessary for retrieving information from these diverse sources.
XQuery is designed to meet the requirements identified by the W3C XML Query Working Group [Requirements]. It is designed to be a small, easily implementable language in which queries are concise and easily understood. It is also flexible enough to query a broad spectrum of XML information sources, including both databases and documents. The Query Working Group has identified a requirement for both a human-readable query syntax and an XML-based query syntax. XQuery is designed to meet the first of these requirements. An alternative, XML-based syntax for the XQuery semantics will be defined separately.
XQuery is derived from an XML query language called Quilt, which in turn borrowed features from several other languages. From XPath [XPath] and XQL [XQL] it took a path expression syntax suitable for hierarchical documents. From XML-QL [XML-QL] it took the notion of binding variables and then using the bound variables to create new structures. From SQL [SQL] it took the idea of a series of clauses based on keywords that provide a pattern for restructuring data (the SELECT-FROM-WHERE pattern in SQL). From OQL [ODMG] it took the notion of a functional language composed of several different kinds of expressions that can be nested with full generality. Quilt was also influenced by reading about other XML query languages such as Lorel [Lorel] and YATL [YATL]. Quilt has most recently been described in [Quilt].
Important issues remain open in the design of XQuery. Some of these issues deal with relationships between XQuery and other XML activities, for example:
The semantics of XQuery are defined in terms of the operators of the XML Query Algebra (see Appendix E). The mapping of XQuery operators into Algebra operators is still being designed, and may result in some changes to XQuery and/or the Algebra.
The type system of XQuery is the type system of XML Schema. Work is in progress to ensure that the type systems of XQuery, the XML Query Algebra, and XML Schema are completely aligned.
The details of the operators supported on simple XML Schema datatypes will be defined by a joint XSLT/Schema/Query task force.
XQuery relies on path expressions for navigating in hierarchic documents. XQuery expects these path expressions to conform to the semantics of XPath 2.0, as defined by a joint XSLT/Query task force.
For more details on open issues, see Appendix H.
Like OQL, XQuery is a functional language in which a query is represented as an expression. XQuery supports several kinds of expression, and therefore its queries may take several different forms. The various forms of XQuery expressions can be nested with full generality, so the notion of a "subquery" is natural to XQuery. The input and output of a query are instances of a data model called the XML Query Data Model [QueryDataModel]. This data model is a refinement of the data model described in the XPath specification [XPath], in which a document is modeled as a tree of nodes. A fragment of a document, or a collection of documents, may lack a common root and may therefore be modeled as an ordered forest of nodes of various types, including element nodes, attribute nodes, and text nodes, as illustrated in Figure 1.
Figure 1: An instance of the XML query data model (an ordered forest)
An XQuery syntax is given in Appendix B, and an initial approach to a formal definition of XQuery semantics is given in Appendix E.
The principal forms of XQuery expressions are as follows:
Path expressions
Element constructors
FLWR expressions
Expressions involving operators and functions
Conditional expressions
Quantified expressions
List constructors
Expressions that test or modify datatypes
Each of these expression-types is introduced and explained by a series of examples in the following sections. For syntactic details, please refer to the appendices.
In XQuery, keywords (such as FOR and LET) are case-insensitive, whereas identifiers (such as myBigBook) are case-sensitive.
A query may contain a comment, which is ignored during query processing. As in SQL, the beginning delimiter of a comment is a double hyphen and the ending delimiter is a newline character.
XQuery path expressions use the abbreviated syntax of XPath, extended with a new "dereference" operator and a new type of predicate called a "range predicate" (described below). XPath syntax is used in several XML-related applications including XSLT [XSLT] and XPointer [XPointer].
In XQuery, the result of a path expression is an ordered list of nodes. Each node includes its descendant nodes, so the result can be thought of as an ordered forest. The top-level nodes in the path expression result are ordered according to their position in the original hierarchy, in top-down, left-to-right order. The result of a path expression may contain duplicate values (i.e., multiple nodes with the same type and content), but it will not contain duplicate nodes (i.e., multiple nodes with the same node identity).
A path expression consists of a series of steps. Each step represents movement through a document in a particular direction, and each step can apply one or more predicates to eliminate nodes that fail to satisfy a given condition. The result of each step is a list of nodes that serves as a starting point for the next step.
A path expression can begin with an expression that identifies a specific node, such as the function document(string), which returns the root node of a named document. A query can also contain a path expression beginning with "/" or "//" which represents an implicit root node, determined by the environment in which the query is executed.
A complete discussion of XPath abbreviated syntax can be found in [XPath]. Briefly, the following symbols are used:
. | Denotes the current node. |
.. | Denotes the parent of the current node. |
/ | Denotes the root node, or a separator between steps in a path. |
// | Denotes descendants of the current node. |
@ | Denotes attributes of the current node. |
* | Denotes "any" (node with unrestricted name). |
[ ] | Brackets enclose a Boolean expression that serves as a predicate for a given step. |
[n] | When a predicate consists of an integer, it serves to select the element with the given ordinal number from a list of elements. |
The following example uses a path expression consisting of three steps. The first step locates the root node of a document. The second step locates the second chapter element that is a child of the root element. The third step finds figure elements occurring anywhere within the chapter, but retains only those figure elements that have a caption with the value "Tree Frogs".
(Q1) In the second chapter of the document named "zoo.xml", find the figure(s) with caption "Tree Frogs".
document("zoo.xml")/chapter[2]//figure[caption = "Tree Frogs"]
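The three steps of Q1 can be mimicked in Python as a rough illustration. This is only a sketch: it uses the standard xml.etree.ElementTree module, which supports a subset of XPath rather than XQuery, and the sample document below is a hypothetical stand-in for "zoo.xml".

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for "zoo.xml"; only the element names come from Q1.
ZOO_XML = """
<zoo>
  <chapter><title>Birds</title>
    <figure><caption>Tree Frogs</caption></figure>
  </chapter>
  <chapter><title>Frogs</title>
    <section>
      <figure><caption>Tree Frogs</caption></figure>
      <figure><caption>Bullfrogs</caption></figure>
    </section>
  </chapter>
</zoo>
"""

# Step 1: locate the root of the document.
root = ET.fromstring(ZOO_XML)

# Step 2: keep the second <chapter> child (ordinal numbers start at 1).
second_chapters = root.findall("chapter[2]")

# Step 3: descend to <figure> elements at any depth and apply the predicate.
figures = [
    fig
    for chapter in second_chapters
    for fig in chapter.findall(".//figure[caption='Tree Frogs']")
]
```

Each step produces a list that becomes the starting point of the next step, which is the behavior described for path expressions above.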
It is sometimes desirable to isolate a list of elements whose ordinal numbers span a certain range. For this purpose, XQuery provides a RANGE predicate that is adapted from XQL [XQL], as illustrated in the following example. The first element in a list has ordinal number 1. The ordinal numbers of elements in a list are not affected by the presence of other types of nodes such as comments or processing instructions.
(Q2) Find all the figures in chapters 2 through 5 of the document named "zoo.xml."
document("zoo.xml")/chapter[RANGE 2 TO 5]//figure
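The ordinal arithmetic of a range predicate can be sketched in Python as follows. The helper name select_range is an assumption made for this illustration; the list is assumed to contain only elements, so other node types such as comments do not affect the numbering.

```python
def select_range(items, first, last):
    """Return the items whose 1-based ordinal number lies in first..last, inclusive."""
    return items[first - 1:last]


# Six placeholder chapters; [RANGE 2 TO 5] keeps the second through the fifth.
chapters = ["ch1", "ch2", "ch3", "ch4", "ch5", "ch6"]
selected = select_range(chapters, 2, 5)
```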
In addition to the operators of the XPath abbreviated syntax, XQuery introduces an operator called the dereference operator ("->"). When a dereference operator follows an IDREF-type attribute, it returns the element(s) that are referenced by the attribute. A dereference operator is followed by a "name test" that specifies the target element. Following the usual XPath convention, a name test of "*" allows the target element to be of any type.
Dereference operators can be used in the steps of a path expression. For example, the following query uses a dereference operator to find the caption of the "fig" element referenced by the "refid" attribute of a "figref" element.
(Q3) Find captions of figures that are referenced by <figref> elements in the chapter of "zoo.xml" with title "Frogs".
document("zoo.xml")/chapter[title = "Frogs"] //figref/@refid->fig/caption
The XQuery dereference operator is similar in purpose to the id function of XPath. However, the right-arrow notation is designed to be easier to read, especially in path expressions that involve multiple dereferences. For example, suppose that a given document contains a set of <emp> elements, each of which contains a "mgr" attribute. The "mgr" attribute is of type IDREF, and it references another <emp> element that represents the manager of the given employee. The name of each employee is represented by a <name> element nested inside the <emp> element.
(Q4) List the names of the second-level managers of all employees whose rating is "Poor".
/emp[rating = "Poor"]/@mgr->emp/@mgr->emp/name
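The chained dereferences in Q4 can be approximated in Python by building a lookup table from ID values to elements. This is a sketch under stated assumptions: the ID attribute is assumed to be named "id", the helper deref is hypothetical, and the sample data is invented. Results are returned in the order they are encountered, not necessarily in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee data; "id" is assumed to be the ID-type attribute.
STAFF_XML = """
<staff>
  <emp id="e1"><name>Ann</name></emp>
  <emp id="e2" mgr="e1"><name>Bob</name></emp>
  <emp id="e3" mgr="e2"><name>Cy</name><rating>Poor</rating></emp>
  <emp id="e4" mgr="e2"><name>Di</name><rating>Good</rating></emp>
</staff>
"""

root = ET.fromstring(STAFF_XML)
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}


def deref(elements, attr, target_tag):
    """Follow the IDREF attribute `attr` of each element to elements named `target_tag`."""
    result = []
    for element in elements:
        target = by_id.get(element.get(attr))
        # Keep each referenced node once: a path result holds no duplicate nodes.
        if target is not None and target.tag == target_tag and target not in result:
            result.append(target)
    return result


# /emp[rating = "Poor"]/@mgr->emp/@mgr->emp/name
poor = [e for e in root.findall("emp") if e.findtext("rating") == "Poor"]
second_level = deref(deref(poor, "mgr", "emp"), "mgr", "emp")
names = [m.findtext("name") for m in second_level]
```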
As in XPath, the identifiers used in XQuery expressions can be qualified by namespace prefixes [Namespaces]. XQuery provides a syntax for declaring the Uniform Resource Identifier (URI) associated with each namespace prefix used in a query, as illustrated in the following example:
(Q5) In the document "zoo.xml", find <tiger> elements in the namespace defined by www.abc.com/names that contain any subelement in the namespace defined by www.xyz.com/names.
NAMESPACE abc = "www.abc.com/names"
NAMESPACE xyz = "www.xyz.com/names"
document("zoo.xml")//abc:tiger[xyz:*]
The XQuery NAMESPACE clause also provides a way to declare a default namespace that applies to all unqualified names used in a query, as shown in the following example which is equivalent to Q5:
(Q6) (Equivalent to Q5)
NAMESPACE DEFAULT = "www.abc.com/names"
NAMESPACE xyz = "www.xyz.com/names"
document("zoo.xml")//tiger[xyz:*]
If no default namespace is declared for a query, unqualified names used in the query are considered to match local names in any namespace.
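For comparison, the prefix-to-URI mapping of Q5 might be expressed in Python as a dictionary passed to ElementTree. This is an illustrative sketch only: the sample document and the helper has_child_in_namespace are assumptions, and ElementTree's expanded "{uri}local" tag names are a detail of that library, not of XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical document using the two namespaces named in Q5.
ZOO_XML = """
<zoo xmlns:abc="www.abc.com/names" xmlns:xyz="www.xyz.com/names">
  <abc:tiger><xyz:stripes>12</xyz:stripes></abc:tiger>
  <abc:tiger><abc:stripes>9</abc:stripes></abc:tiger>
  <xyz:tiger><xyz:stripes>7</xyz:stripes></xyz:tiger>
</zoo>
"""

# Plays the role of the two NAMESPACE declarations.
NS = {"abc": "www.abc.com/names", "xyz": "www.xyz.com/names"}

root = ET.fromstring(ZOO_XML)


def has_child_in_namespace(element, uri):
    """True if any child element's expanded name belongs to the namespace `uri`."""
    return any(child.tag.startswith("{" + uri + "}") for child in element)


# //abc:tiger[xyz:*]
tigers = [
    t for t in root.findall(".//abc:tiger", NS)
    if has_child_in_namespace(t, NS["xyz"])
]
```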
An XML element can be created by a kind of XQuery expression called an element constructor. An element constructor consists of a start tag and an end tag, enclosing an optional list of expressions that provide the content of the element. The start tag may also specify the values of one or more attributes. The name of the start tag may be specified either by a constant or a variable.
Although an element constructor is an expression in its own right, its typical use is nested inside another expression that binds one or more variables that are used in the element constructor. Both of the following examples are query fragments that refer to variables that are bound in some enclosing expression.
(Q7) Generate an <emp> element containing an "empid" attribute and nested <name> and <job> elements. The values of the attribute and nested elements are specified by variables that are bound in other parts of the query.
<emp empid = $id>
   <name> $n </name> ,
   <job> $j </job>
</emp>
In the following example, the name of the generated element is specified by a variable named $tagname. Note that, when a start-tag contains a variable name, the matching end-tag must contain the same variable name (see Query Q14 for a more interesting version of this example).
(Q8) Generate an element with a computed name, containing nested elements named <description> and <price>.
<$tagname>
   <description> $d </description> ,
   <price> $p </price>
</$tagname>
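An element with a computed name, as in Q8, can be built in Python along the following lines. The function make_item and its argument values are hypothetical; the sketch only shows that the tag name is an ordinary value supplied at run time.

```python
import xml.etree.ElementTree as ET


def make_item(tagname, description, price):
    """Build <tagname><description>..</description><price>..</price></tagname>."""
    item = ET.Element(tagname)                      # computed element name
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "price").text = str(price)
    return item


item = make_item("widget", "Left-handed widget", 9.95)
xml_text = ET.tostring(item, encoding="unicode")
```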
In addition to elements, it is sometimes necessary to construct other kinds of XML nodes such as comments and processing instructions. Constructor functions are provided for this purpose, which may be invoked using the normal syntax for a function call. The following examples illustrate construction of a comment and a processing instruction:
comment("Houston, we have a problem.")
pi("MyFormatter", "Pagebreak")
Occasionally it is necessary to construct (or search for) an element such that the name of the element, or the name of one of its attributes, is the same as an XQuery keyword. This is made possible by the lexing rule that strings enclosed in single quotes are always considered to be identifiers (such as tagnames), even if they match a keyword of the language. The following table illustrates the use of quotes in XQuery, which is slightly different from that in XML:
Example | Meaning |
FOR | XQuery keyword |
'FOR' | Identifier (e.g., tagname) |
"FOR" | Literal string value |
\" | Literal double-quote (used in a string) |
\' | Literal single-quote (used in a string) |
The following example shows how an element constructor can be used to create an element whose name and/or attribute-name is the same as an XQuery keyword:
<'FOR' 'LET' = "WHERE"/>
The following example shows how a path expression might be used to search for the element created in the previous example:
//'FOR'[@'LET' = "WHERE"]
A FLWR (pronounced "flower") expression is constructed from FOR, LET, WHERE, and RETURN clauses. As in an SQL query, these clauses must appear in a specific order. A FLWR expression binds values to one or more variables and then uses these variables to construct a result (in general, an ordered forest of nodes). The overall flow of data in a FLWR expression is illustrated in Figure 2.
The first part of a FLWR expression consists of FOR-clauses and/or LET-clauses which serve to bind values to one or more variables. The values to be bound to the variables are represented by expressions (for example, path expressions).
A FOR-clause is used whenever iteration is needed. The FOR-clause introduces one or more variables, associating each variable with an expression that returns a list of nodes. The result of the FOR-clause is a list of tuples, each of which contains a binding for each of the variables. The variables are bound to individual nodes returned by their respective expressions, in such a way that the binding-tuples represent the cross-product of the node-lists returned by all the expressions. Each variable in a FOR-clause can be thought of as iterating over the nodes returned by its respective expression.
A LET-clause is also used to bind one or more variables to one or more expressions. Unlike a FOR-clause, however, a LET-clause simply binds each variable to the value of its respective expression without iteration, resulting in a single binding for each variable. The difference between a FOR-clause and a LET-clause can be illustrated by a simple example. The clause FOR $x IN /library/book results in many bindings, each of which binds the variable $x to one book in the library. On the other hand, the clause LET $x := /library/book results in a single binding which binds the variable $x to a list containing all the books in the library.
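The difference between the two kinds of binding can be pictured with plain Python lists. This is an analogy, not XQuery: the list books stands in for the result of /library/book, and each dictionary represents one binding.

```python
# Stand-in for the list of nodes returned by /library/book.
books = ["Book A", "Book B", "Book C"]

# FOR $x IN /library/book: one binding per book.
for_bindings = [{"x": book} for book in books]

# LET $x := /library/book: a single binding that holds the whole list.
let_bindings = [{"x": books}]
```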
A FLWR expression may contain several FOR and LET-clauses, and each of these clauses may contain references to variables bound in previous clauses. The result of the sequence of FOR and LET clauses is an ordered list of tuples of bound variables. The number of tuples generated by a FOR/LET sequence is the product of the cardinalities of the node-lists returned by the expressions in the FOR-clauses. The tuples generated by the FOR/LET sequence have an order that is determined by the order of their bound elements in the input document, with the first bound variable taking precedence, followed by the second bound variable, and so on. However, if some expression used in a FOR-clause is unordered (for example, because it contains a distinct function), the tuples generated by the FOR/LET sequence are unordered.
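The tuple count and ordering described above correspond to a cross-product in which the first variable varies slowest. A minimal Python sketch, assuming two independent FOR expressions whose results are the placeholder lists below:

```python
from itertools import product

# Placeholder node-lists for two FOR variables, already in document order.
xs = ["a1", "a2"]
ys = ["b1", "b2", "b3"]

# One tuple per combination; the first variable takes precedence in the ordering.
tuples = list(product(xs, ys))
```

The sketch does not cover the case in which a later FOR expression refers to an earlier variable.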
Fig. 2: Flow of data in a FLWR expression
Each of the binding-tuples generated by the FOR and LET clauses is subject to further filtering by an optional WHERE-clause. Only those tuples for which the condition in the WHERE-clause is true are used to invoke the RETURN clause. The WHERE-clause may contain several predicates, connected by AND, OR, and NOT. These predicates usually contain references to the bound variables. Variables bound by a FOR-clause represent a single node (with its descendants) and so they are typically used in scalar predicates such as $p/color = "Red". Variables bound by a LET-clause, on the other hand, may represent lists of nodes, and can be used in list-oriented predicates such as avg($p/price) > 100. The ordering of the binding-tuples generated by the FOR and LET clauses is preserved by the WHERE-clause.
The RETURN-clause generates the output of the FLWR expression, which may be a node, an ordered forest of nodes, or a primitive value. The RETURN-clause is executed once for each tuple of bindings that is generated by the FOR and LET-clauses and satisfies the condition in the WHERE-clause. If an ordering exists among these tuples, the RETURN-clause is executed on each tuple, in order, and the order of the results is preserved in the output document. The RETURN-clause contains an expression that often contains element constructors, references to bound variables, and nested subexpressions.
We will consider some examples of FLWR expressions based on a document named "bib.xml" that contains a list of <book> elements. Each <book> element, in turn, contains a <title> element, one or more <author> elements, a <publisher> element, a <year> element, and a <price> element. The first example is so simple that it could have been expressed using a path expression, but it is perhaps more readable when expressed as a FLWR expression.
(Q9) List the titles of books published by Morgan Kaufmann in 1998.
FOR $b IN document("bib.xml")//book
WHERE $b/publisher = "Morgan Kaufmann"
  AND $b/year = "1998"
RETURN $b/title
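The FOR, WHERE, and RETURN clauses of Q9 map fairly directly onto a Python list comprehension. The following is a sketch rather than an equivalent query: the sample data is invented, and the comprehension returns title strings, whereas the RETURN clause in Q9 returns <title> elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for "bib.xml"; titles and values are made up.
BIB_XML = """
<bib>
  <book><title>Title A</title><publisher>Morgan Kaufmann</publisher><year>1998</year></book>
  <book><title>Title B</title><publisher>Morgan Kaufmann</publisher><year>1999</year></book>
  <book><title>Title C</title><publisher>Other Press</publisher><year>1998</year></book>
</bib>
"""

root = ET.fromstring(BIB_XML)

titles = [
    b.findtext("title")                               # RETURN $b/title (text only)
    for b in root.iter("book")                        # FOR $b IN ...//book
    if b.findtext("publisher") == "Morgan Kaufmann"   # WHERE ...
    and b.findtext("year") == "1998"                  #   AND ...
]
```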
Example Q10 uses a distinct function in the FOR-clause to eliminate duplicates from the list of publishers found in the input document. Two elements are considered to be duplicates if their values (including name, attributes, and normalized content) are equal. The result of the distinct function is an unordered set of elements. Example Q10 then uses a LET-clause to bind a variable to the average price of books published by each of the publishers bound in the FOR-clause.
FOR $p IN distinct(document("bib.xml")//publisher)
LET $a := avg(document("bib.xml")/book[publisher = $p]/price)
RETURN
   <publisher>
      <name> $p/text() </name> ,
      <avgprice> $a </avgprice>
   </publisher>
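The combination of distinct, a per-publisher LET binding, and avg can be sketched in Python as a simple grouping computation. The data is hypothetical, and duplicates are detected by comparing publisher text only. The first-seen order is kept purely to make the output reproducible, since the distinct function is described above as returning an unordered set.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for "bib.xml".
BIB_XML = """
<bib>
  <book><publisher>Pub A</publisher><price>30</price></book>
  <book><publisher>Pub B</publisher><price>20</price></book>
  <book><publisher>Pub A</publisher><price>50</price></book>
</bib>
"""

root = ET.fromstring(BIB_XML)
books = root.findall("book")

# distinct(...//publisher), approximated by de-duplicating the publisher text.
publishers = list(dict.fromkeys(p.text for p in root.iter("publisher")))

averages = {}
for p in publishers:                                   # FOR $p IN distinct(...)
    prices = [
        float(b.findtext("price"))
        for b in books
        if b.findtext("publisher") == p                # book[publisher = $p]
    ]
    averages[p] = sum(prices) / len(prices)            # LET $a := avg(...)
```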
The next example uses a LET-clause to bind a variable $b to a set of books, and then uses a WHERE-clause to apply a condition to the set, retaining only bindings in which $b contains more than 100 elements. This query also illustrates the common practice of enclosing a FLWR expression inside an element constructor which provides an enclosing element for the query result.
(Q11) List the publishers who have published more than 100 books.
<big_publishers>
   FOR $p IN distinct(document("bib.xml")//publisher)
   LET $b := document("bib.xml")/book[publisher = $p]
   WHERE count($b) > 100
   RETURN $p
</big_publishers>
FLWR expressions are often useful for performing structural transformations on documents, as illustrated by the next query, which inverts a hierarchy. This example also illustrates how one FLWR expression can be nested inside another.
(Q12) Invert the structure of the input document so that, instead of each book element containing a list of authors, each distinct author element contains a list of book-titles.
<author_list>
   FOR $a IN distinct(document("bib.xml")//author)
   RETURN
      <author>
         <name> $a/text() </name>,
         FOR $b IN document("bib.xml")//book[author = $a]
         RETURN $b/title
      </author>
</author_list>
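The hierarchy inversion of Q12 can be imitated in Python with one loop nested inside another, mirroring the nested FLWR expressions. This is an illustrative sketch with invented data; authors are compared by their text content and listed in first-seen order, which is an arbitrary choice made here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for "bib.xml".
BIB_XML = """
<bib>
  <book><title>Title A</title><author>Author X</author><author>Author Y</author></book>
  <book><title>Title B</title><author>Author Y</author></book>
</bib>
"""

root = ET.fromstring(BIB_XML)
books = root.findall("book")

author_list = ET.Element("author_list")
seen = []
for a in root.iter("author"):                  # outer FOR over distinct authors
    if a.text in seen:
        continue
    seen.append(a.text)
    author = ET.SubElement(author_list, "author")
    ET.SubElement(author, "name").text = a.text
    for b in books:                            # inner FOR over book[author = $a]
        if any(x.text == a.text for x in b.findall("author")):
            ET.SubElement(author, "title").text = b.findtext("title")

# Summarize the result as (author name, list of titles) pairs.
pairs = [
    (a.findtext("name"), [t.text for t in a.findall("title")])
    for a in author_list
]
```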
LET-clauses are useful for breaking up long expressions, making queries more readable. They can also be helpful in simplifying a query that makes multiple uses of the same expression (called a "common subexpression"). In the following example, the average price of books is a common subexpression that is bound to variable $a and then used repeatedly in the body of the query.
(Q13) For each book whose price is greater than the average price, return the title of the book and the amount by which the book's price exceeds the average price.
<result>
   LET $a := avg(//book/price)
   FOR $b IN /book
   WHERE $b/price > $a
   RETURN
      <expensive_book>
         $b/title ,
         <price_difference>
            $b/price - $a
         </price_difference>
      </expensive_book>
</result>
A LET-clause can be used in conjunction with an element constructor to replicate some parts of an existing element, as in the following example. This example uses the XPath functions name(element), which returns the tagname of an element, and number(element), which returns the content of an element expressed as a number. When an expression inside the body of an element constructor evaluates to one or more attributes, those attributes are considered to be attributes of the element that is being constructed.
LET $tagname := name($e)
RETURN
   <$tagname>
      $e/@*, -- replicates the attributes of $e
      2 * number($e)
   </$tagname>
By default, a FLWR expression preserves the ordering of elements in the input document(s). However, it is often important to specify an ordering for the elements in a query result that supplements or supersedes the order derived from the input. If a query result contains several levels of nested elements, an ordering may be required among the elements at each level. XQuery provides a SORTBY clause that may be used after any expression to specify an ordering among the resulting elements. The arguments of the SORTBY clause are evaluated within the context of the individual nodes to be sorted, and may be followed by ASCENDING or DESCENDING to specify the direction of the sort (ASCENDING is the default). The use of SORTBY is illustrated by the following example.
(Q15) Make an alphabetic list of publishers. Within each publisher, make a list of books, each containing a title and a price, in descending order by price.
<publisher_list>
   FOR $p IN distinct(document("bib.xml")//publisher)
   RETURN
      <publisher>
         <name> $p/text() </name> ,
         FOR $b IN document("bib.xml")//book[publisher = $p]
         RETURN
            <book>
               $b/title ,
               $b/price
            </book> SORTBY(price DESCENDING)
      </publisher> SORTBY(name)
</publisher_list>
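The two levels of sorting in Q15 can be pictured in Python with nested calls to sorted. The records below are hypothetical, and plain dictionaries are used in place of XML elements to keep the sketch short.

```python
# Hypothetical book records standing in for the <book> elements of "bib.xml".
books = [
    {"publisher": "Pub B", "title": "Title C", "price": 20.0},
    {"publisher": "Pub A", "title": "Title A", "price": 30.0},
    {"publisher": "Pub A", "title": "Title B", "price": 50.0},
]

publisher_list = [
    {
        "name": name,
        # Inner ordering: SORTBY(price DESCENDING)
        "books": sorted(
            (b for b in books if b["publisher"] == name),
            key=lambda b: b["price"],
            reverse=True,
        ),
    }
    # Outer ordering: SORTBY(name), ascending by default
    for name in sorted({b["publisher"] for b in books})
]
```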
Like most languages, XQuery allows expressions to be constructed using infix and prefix operators, and allows nested expressions inside parentheses to serve as operands. XQuery supports the usual set of arithmetic and logical operators, and the collection operators UNION, INTERSECT, and EXCEPT. A complete list of these operators and a specification of their semantics as applied to various datatypes is provided in Appendix C.
From XQL, XQuery inherits the infix operators BEFORE and AFTER, which are useful in searching for information by its ordinal position. Each instance of the XML Query data model (regardless of whether it is a complete document, a fragment of a document, or a list of documents) is a forest that includes a total ordering, called "document order," among all its nodes. BEFORE operates on two lists of elements and returns those elements in the first list that occur before at least one element of the second list in document order. This is possible only if the two lists are subsets of the same data model instance. AFTER is defined in a similar way. Since BEFORE and AFTER are based on global document ordering, they can compare the positions of elements that do not have the same immediate parent. For the same reason, BEFORE and AFTER do not require their operands to have a local ordering. The next two examples illustrate the use of BEFORE and AFTER by retrieving excerpts from a surgical report that includes <procedure>, <incision>, and <anesthesia> elements.
(Q16) Prepare a "critical sequence" report consisting of all elements that occur between the first and second incision in the first procedure.
<critical_sequence>
   LET $p := //procedure[1]
   FOR $e IN //* AFTER ($p//incision)[1]
                 BEFORE ($p//incision)[2]
   RETURN shallow($e)
</critical_sequence>
The shallow function makes a shallow copy of a node, including attributes but not including subelements.
-- Finds potential lawsuits
FOR $p in //procedure
WHERE empty($p//anesthesia BEFORE ($p//incision)[1])
RETURN $p
The empty function returns True if and only if its argument is an empty list.
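One way to picture BEFORE and AFTER is to number every element in document order and compare the numbers. The Python sketch below does this for a hypothetical surgical report. The functions before and after are assumptions written for this illustration: they follow the prose definition given above and return an empty list when the second operand is empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical surgical report.
REPORT_XML = """
<report>
  <procedure>
    <anesthesia/>
    <incision/>
    <instrument/>
    <observation/>
    <incision/>
  </procedure>
  <procedure>
    <incision/>
    <anesthesia/>
  </procedure>
</report>
"""

root = ET.fromstring(REPORT_XML)

# Element.iter() visits elements in document order, so the index is the rank.
order = {node: i for i, node in enumerate(root.iter())}


def before(first, second):
    """Elements of `first` that precede at least one element of `second`."""
    if not second:
        return []
    latest = max(order[n] for n in second)
    return [n for n in first if order[n] < latest]


def after(first, second):
    """Elements of `first` that follow at least one element of `second`."""
    if not second:
        return []
    earliest = min(order[n] for n in second)
    return [n for n in first if order[n] > earliest]


# Q16: elements between the first and second incision of the first procedure.
incisions = root.find("procedure").findall(".//incision")
between = before(after(list(root.iter()), incisions[:1]), incisions[1:2])
tags = [n.tag for n in between]

# Procedures in which no anesthesia precedes the first incision.
suspicious = [
    p for p in root.iter("procedure")
    if not before(p.findall(".//anesthesia"), p.findall(".//incision")[:1])
]
```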
Conditional expressions are useful when the structure of the information to be returned depends on some condition. Of course, like all XQuery expressions, conditional expressions can be nested and can be used wherever a value is expected.
As an example of a conditional expression, consider a library that has many holdings, each described by a <holding> element with a "type" attribute that identifies its type: book, journal, etc. All holdings have a title and other nested elements that depend on the type of holding.
(Q18) Make a list of holdings, ordered by title. For journals, include the editor, and for all other holdings, include the author.
FOR $h IN //holding
RETURN
   <holding>
      $h/title,
      IF $h/@type = "Journal"
      THEN $h/editor
      ELSE $h/author
   </holding> SORTBY (title)
Occasionally it is necessary to test for existence of some element that satisfies a condition, or to determine whether all elements in some collection satisfy a condition. For this purpose, XQuery provides existential and universal quantifiers. The existential quantifier is illustrated in Q19, and the universal quantifier is illustrated in Q20.
(Q19) Find titles of books in which both sailing and windsurfing are mentioned in the same paragraph.
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
   contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title
(Q20) Find titles of books in which sailing is mentioned in every paragraph.
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
   contains($p, "sailing")
RETURN $b/title
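The two quantifiers correspond loosely to Python's built-in any and all. The sketch below uses invented paragraph text and a dictionary in place of <book> elements. Note that all returns True for an empty list; whether EVERY behaves the same way over an empty list is not stated in this section.

```python
# Hypothetical books, each mapped to the text of its paragraphs.
books = {
    "Title A": ["sailing and windsurfing", "sailing only"],
    "Title B": ["sailing here", "windsurfing there"],
    "Title C": ["sailing", "sailing again"],
}

# Q19: SOME $p IN $b//para SATISFIES contains(...) AND contains(...)
some_titles = [
    title for title, paras in books.items()
    if any("sailing" in p and "windsurfing" in p for p in paras)
]

# Q20: EVERY $p IN $b//para SATISFIES contains($p, "sailing")
every_titles = [
    title for title, paras in books.items()
    if all("sailing" in p for p in paras)
]
```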
One of the functions in the XQuery core function library is called filter. This function takes two operands, each of which is an expression that, in general, evaluates to an ordered forest of nodes. filter returns copies of some of the nodes in the forest represented by the first operand, while preserving their hierarchic and sequential relationships. The nodes that are copied into the result are those nodes that are present at any level in the first operand and are also top-level nodes in the second operand. Thus the second operand is used as a "filter" that selects nodes from the forest represented by the first operand. The filtering process is based on node identity; that is, it requires both operands to contain the same node, not just two nodes with the same value. Obviously, if the two operands do not have a common root, the result of the filter function is the empty list.
The action of a filter function is illustrated by Figure 3, which shows an element hierarchy that might result from evaluating the path expression /C. Figure 3 also shows the result of the function call filter(/C, //A | //B). The result contains copies of all nodes of type A and B, and where a hierarchic or sequential relationship exists among these nodes, the relationship is preserved.
Fig. 3: Action of FILTER on a hierarchy
The filter function is useful in "pruning" a document, eliminating undesired parts while retaining the document structure. The following example illustrates this process by computing a table of contents for a document that contains many levels of nested sections. The query filters the document, retaining only section elements, title elements nested directly inside section elements, and the text of those title elements. Other elements, such as paragraphs and figure titles, are eliminated, leaving only the "skeleton" of the document.
In this example, the first argument of filter is the root of a document, and the second argument is a path expression that identifies the nodes to be preserved from the original document.
LET $b := document("cookbook.xml")
RETURN
   <toc>
      filter($b, $b//section | $b//section/title | $b//section/title/text() )
   </toc>
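The pruning behavior of filter can be sketched as a recursive walk in Python. This is one possible reading of the description above, not a definitive implementation: the name filter_forest is hypothetical, nodes are compared by identity, and text nodes are omitted because ElementTree does not model them as separate nodes. A node that is not selected passes its surviving descendants up to its nearest selected ancestor.

```python
import xml.etree.ElementTree as ET


def filter_forest(roots, keep):
    """Copy the nodes of `roots` that are also in `keep`, preserving nesting and order."""
    keep_ids = {id(node) for node in keep}   # identity, not value, decides membership

    def walk(node):
        # Filter the children first; each child yields a list of surviving subtrees.
        kept_children = [tree for child in node for tree in walk(child)]
        if id(node) in keep_ids:
            copy = ET.Element(node.tag, dict(node.attrib))
            copy.extend(kept_children)
            return [copy]
        # A dropped node hands its surviving descendants to its parent.
        return kept_children

    return [tree for root in roots for tree in walk(root)]


# Hypothetical hierarchy in the spirit of Figure 3: filter(/C, //A | //B)
doc = ET.fromstring("<C><A><X><B/></X><B/></A><Y><A/></Y></C>")
keep = list(doc.iter("A")) + list(doc.iter("B"))
result = filter_forest([doc], keep)
shape = [(tree.tag, [child.tag for child in tree]) for tree in result]
```

In this example the X and Y elements are dropped, the B nested inside X becomes a direct child of the copied A, and the result is a forest of two A elements.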
The type system of XQuery is the type system of XML Schema. Each XQuery expression has a datatype that can be declared using the language of XML Schema.
In XQuery, datatype names appear in function declarations where they specify the datatypes of the function parameters and result. Datatype names are also used in CAST and TREAT expressions and as operands of the INSTANCEOF operator. A datatype name is a qualified name. By using the datatype names defined in the namespace http://www.w3.org/2000/10/XMLSchema-datatypes (hereafter abbreviated as xsd), all the primitive and derived datatypes of XML Schema can be used in queries. More complex datatypes can also be defined, as described in Section 2.10.
Certain XML Schema datatypes have literal forms that are recognized by the XQuery lexer, as illustrated by the following examples:
Type | Example of literal |
xsd:string | "Hello" |
xsd:boolean | TRUE, FALSE |
xsd:integer | 47, -369 |
xsd:decimal | -2.57 |
xsd:float | -3.805E-2 |
Literal values of XML Schema types other than string, boolean, integer, decimal, and float can be specified by means of constructor functions such as date("2000-06-25") or by cast expressions such as CAST AS xsd:positiveInteger(47).
NOTE: The set of constructor functions has not yet been fixed. A task force has been assigned to determine the set of operators on XML Schema types, and we expect this task force to design the set of constructor functions.
It is common to declare functions whose parameters and/or results can be XML elements of any type. XQuery provides a shorthand notation for this common case, which avoids the necessity to make a namespace declaration. The keyword ELEMENT is a shorthand for the type defined by <xsd:any>, which represents an element of any type. The xsd namespace also provides other forms of "wild cards"; for example, a type defined by <xsd:anyAttribute> represents any attribute.
Another datatype that is very commonly used is a repetition of zero or more instances of some type. In XML Schema, this is represented by the facets minOccurs="0" maxOccurs="unbounded". XQuery provides a shorthand notation for this common case: LIST( x ) represents zero or more occurrences of the type denoted by x. For example, LIST(Book) is a shorthand notation for a type defined by <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded">, and LIST(ELEMENT) is a shorthand notation for a type defined by <xsd:any minOccurs="0" maxOccurs="unbounded">.
In XML, lists of elements are always ordered. However, during processing of
a query, some intermediate results may be collections of elements in which
order is not significant. Various XQuery functions operate on lists of elements
and change their properties: for example, the distinct function
operates on a list to remove duplicate elements and to make its ordering
insignificant. The SORTBY operator operates on a list to reorder it and to make
its order significant. Properties of lists, such as length, order-significance,
and presence or absence of duplicates, are not significant for purposes of
function resolution.
A list may be constructed by enclosing zero or more expressions in square brackets, separated by commas. For example, [$x, $y, $z] denotes a list containing three members represented by variables, and [ ] denotes an empty list. The square-bracket list constructor always "flattens" any of its arguments that are lists themselves, so that the result of the list constructor is a one-level list. For example, the following two expressions are equivalent, and both result in a one-level list containing four integers:
[[1, 2], 3, [ ], [[4]]]
[1, 2, 3, 4]
The rule restricting data values to one-level lists avoids complications in function resolution and ensures that all XQuery values can be easily converted to XML format.
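The flattening behavior of the list constructor can be modeled outside XQuery. The following Python sketch is only an illustration: the name make_list is hypothetical, and Python lists stand in for XQuery lists.

```python
def make_list(*items):
    """Model of the square-bracket constructor: always yields a one-level list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            # Nested lists are flattened recursively, so [[4]] contributes 4.
            flat.extend(make_list(*item))
        else:
            flat.append(item)
    return flat


print(make_list([1, 2], 3, [], [[4]]))  # [1, 2, 3, 4]
```

Because flattening is applied at construction time, no later operation in this model ever sees a list nested inside another list.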
XQuery provides a core library of built-in functions for use in queries, as
specified in Appendix D. We have already used some of these core functions,
such as document, which returns the root node of a named document.
The XQuery core function library contains all the functions of the XPath core
function library, all the aggregation functions of SQL (such as avg, sum,
count, max, and min), and a number of other useful functions. For example,
the distinct function eliminates duplicates from a list, and the empty
function returns TRUE if and only if its argument is an empty list.
In addition to the built-in functions, XQuery allows users to define functions of their own. Each function definition must declare the datatypes of its parameters and result, and must provide an expression (the "body" of the function) that defines how the result of the function is computed from its parameters. When a function is invoked, its arguments must be valid instances of the declared parameter types. The result of a function must also be a valid instance of its declared type. These rules are checked using the type-inference rules of the XML Query Algebra.
In future work, we expect to define an extensibility mechanism whereby function definitions with global scope, written in various programming languages, can be added to the XQuery function library.
XQuery Version 1 does not allow user-defined functions to be overloaded--that
is, it does not allow multiple functions to be declared with the same name and
the same number of parameters. We consider function overloading to be a useful
and important feature that deserves further study in future versions of XQuery.
Although XQuery does not allow overloading of user-defined functions, some of
the built-in functions in the XQuery core library are overloaded--for example,
the string function of XPath can convert an instance of almost any
type into a string.
It is possible in XQuery to invoke a function on a list of arguments whose types do not exactly match the declared parameter types of the function. The process of finding the best available function for a given invocation is called function resolution. The rules for function resolution in XQuery Version 1 are as follows:
A fixed "promotion hierarchy" is defined among the primitive and derived types of XML Schema (see Appendix C). For example, part of the promotion hierarchy might look like this:
integer -> decimal -> float -> double
This fragment of the promotion hierarchy would indicate, for example, that a function with a declared parameter of float could be called with an integer argument, causing the integer to be converted to a float. The "best" function for a given call is found by searching among the available functions for the one whose declared parameters are closest in the promotion hierarchy to the arguments of the function call, considering the arguments from left to right.
A declared parameter-type matches an argument that is an instance of the named type or one of its subtypes. This rule is called subtype substitutability.
A function whose parameter type is a list of a given type can be invoked
with an argument that is a single instance of the given type. This rule follows
directly from the definition of a list in XQuery, which is based on the
occurrence attributes minOccurs="0" maxOccurs="unbounded".
A function with a parameter of a given type can be invoked with an argument
that is a list of the given type. The result is a list, in which each member is
the result of invoking the function on one of the members of the argument list,
in their order of occurrence. For example, suppose that the function
price(Book) is declared to take a Book and return an integer. If the price
function is invoked on a list of Books, the result is a list of integers
representing the prices of the Books, in order.
This rule recognizes the fact that most path expressions return lists, and it
is desirable to allow a path expression to be used as the argument of a
function without requiring the function body to explicitly iterate over the
members of the list.
When this rule is applied, any individual function results that are lists are "flattened" before combining them into a single list that represents the final result of the function invocation. This rule prevents functions from returning lists of lists, which are not allowed in the query data model.
This rule generalizes to functions with multiple parameters in the following way: suppose that N arguments of a function-call are lists that match function-parameters where single elements are expected. Then the result of the function-call is a list whose individual members are the results of invoking the function on tuples of arguments taken from the Cartesian product of the N input lists.
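The list-argument rule, including result flattening and the Cartesian-product generalization, can be sketched in Python as follows. This is a model under stated assumptions, not part of XQuery: the helper names flatten and invoke are hypothetical, every parameter is assumed to expect a single item, and the order in which tuples are drawn from the Cartesian product (leftmost argument varying slowest) is an assumption that the rule above does not fix.

```python
from itertools import product


def flatten(value):
    """Return value as a one-level list (a single item becomes a one-item list)."""
    if not isinstance(value, list):
        return [value]
    result = []
    for member in value:
        result.extend(flatten(member))
    return result


def invoke(func, *args):
    """Call func once per tuple in the Cartesian product of the argument lists."""
    # Assumption: every parameter of func expects a single item, so a plain
    # item is treated as a one-member list.
    columns = [flatten(arg) for arg in args]
    results = []
    for combo in product(*columns):
        results.extend(flatten(func(*combo)))  # list-valued results are flattened
    return results


def price(book):
    return book["price"]


books = [{"title": "A", "price": 10}, {"title": "B", "price": 25}]
print(invoke(price, books))                            # [10, 25]
print(invoke(lambda x, y: x * y, [1, 2], [10, 100]))   # [10, 100, 20, 200]
```

Invoking the model on an empty list returns an empty list, which is also why a recursive function applied to a list eventually stops when no members remain.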
A function may be defined recursively; that is, it may be referenced in its
own definition. The next query contains an example of a recursive function that
computes the depth of an element hierarchy. In its definition, the user-defined
function depth calls the built-in functions empty and max.
NAMESPACE xsd = "http://www.w3.org/2000/10/XMLSchema-datatypes"

FUNCTION depth(ELEMENT $e) RETURNS xsd:integer
{
  -- An empty element has depth 1
  -- Otherwise, add 1 to max depth of children
  IF empty($e/*) THEN 1
  ELSE max(depth($e/*)) + 1
}

depth(document("partlist.xml"))
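For readers who want to experiment with the same recursion, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The sample document is made up for illustration; it is not the partlist.xml referred to above.

```python
import xml.etree.ElementTree as ET


def depth(element):
    """Depth of an element hierarchy: 1 for a childless element."""
    children = list(element)
    if not children:
        return 1
    # Otherwise, add 1 to the maximum depth of the children.
    return max(depth(child) for child in children) + 1


# Hypothetical sample data.
doc = ET.fromstring("<part><part><part/></part><part/></part>")
print(depth(doc))  # 3
```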
To further illustrate the power of functions, we will write a function that returns all the elements that are "connected" to a given element by child or reference connections, and a recursive function that returns all the elements that are "reachable" from a given element by child or reference connections.
(Q23) In the document "company.xml", find all the elements that are reachable from the employee with serial number 12345 by child or reference connections.
FUNCTION connected(ELEMENT $e) RETURNS LIST(ELEMENT)
{
  $e/* UNION $e/@*->*
}

FUNCTION reachable(ELEMENT $e) RETURNS LIST(ELEMENT)
{
  $e UNION reachable(connected($e))
}

reachable(document("company.xml")/emp[serial="12345"])
In the above example, the reachable function invokes itself
recursively to find all the elements that are reachable from the elements that
are directly connected to the parameter $e.
As another example, we might use the filter function together
with the reachable function defined in Q23 to return all the
elements that are reachable from a specific employee element, while preserving
their hierarchic and sequential relationships.
LET $c := document("company.xml")
RETURN filter($c, reachable($c/emp[empno="12345"]))
Of course, it is possible to write a recursive function that fails to
terminate for some set of arguments. In fact, the reachable function
in the previous example will fail to terminate if called on an element that
references one of its ancestors. It is the user's responsibility to avoid
writing a nonterminating function call.
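One common way to make such a traversal terminate is to remember which nodes have already been visited. The Python sketch below illustrates that idea only; it is not something XQuery prescribes. The dictionary links is hypothetical data in which each key maps to the nodes it is "connected" to by child or reference connections.

```python
def reachable(start, connected):
    """Return the set of nodes reachable from start, visiting each node once."""
    seen = set()
    pending = [start]
    while pending:
        node = pending.pop()
        if node in seen:
            continue  # already visited: this is what breaks reference cycles
        seen.add(node)
        pending.extend(connected.get(node, []))
    return seen


# Hypothetical data: "emp" references "dept", and "dept" refers back to "emp".
links = {"emp": ["name", "dept"], "dept": ["emp"]}
print(sorted(reachable("emp", links)))  # ['dept', 'emp', 'name']
```

Without the visited check, the cycle between "emp" and "dept" would be followed forever, which is the situation described above for an element that references one of its ancestors.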
In addition to the primitive and derived datatypes of XML Schema, any datatype that can be constructed using the definition facilities of XML Schema can be used as an XQuery datatype. XML Schema language can be used to define an element or datatype and to give it a qualified name, which can then be used in an XQuery function declaration. For example, a schema might define an element named PurchaseOrder by specifying a set of attributes and a content model based on sequences and alternations of various other elements. PurchaseOrder could then be used as the type of a function parameter in a query.
A query can refer to element-names and type-names that are defined in any of the following schemas:
Schemas that are referenced by documents used in the query; i.e., the
implicit input document or any document referenced by the document function.
Schemas that are associated with namespaces declared at the beginning of the
query, such as NAMESPACE xsd = "http://www.w3.org/2000/10/XMLSchema-datatypes".
Under consideration: A query might include a "preamble" in which local datatypes could be defined using XML Schema notation. Since Schema uses XML notation, this feature would require an XQuery implementation to include an XML parser. (See Issues)
In the following example, a schema defines complex types named "emp_type" and "dept_type" in the target namespace "http://www.BigCompany.com/BigNames". A query then uses these datatypes to define and invoke a function.
(Q25) Define complex types named "emp_type" and "dept_type" in a target namespace.
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        targetNamespace="http://www.BigCompany.com/BigNames">
  <complexType name="emp_type">
    <sequence>
      <element name="name" type="string"/>
      <element name="deptno" type="string"/>
      <element name="salary" type="decimal"/>
      <element name="location" type="string"/>
    </sequence>
  </complexType>
  <complexType name="dept_type">
    <sequence>
      <element name="deptno" type="string"/>
      <element name="headcount" type="integer"/>
      <element name="payroll" type="decimal"/>
    </sequence>
  </complexType>
</schema>
(Q26) Using the datatypes defined in Q25, create a function that summarizes employees by department, and use this function to summarize all the employees of Acme Corp. that are located in Denver.
NAMESPACE DEFAULT = "http://www.BigCompany.com/BigNames"

FUNCTION summary(LIST(emp_type) $emps) RETURNS LIST(dept_type)
{
  FOR $d IN distinct($emps/deptno)
  LET $e := $emps[deptno = $d]
  RETURN
    <dept>
      $d,
      <headcount> count($e) </headcount>,
      <payroll> sum($e/salary) </payroll>
    </dept>
}

summary(document("acme_corp.xml")/emp[location = "Denver"] )
It is sometimes desirable to validate that the result of a query conforms to
a specific datatype. This can be done by taking advantage of the fact that
XQuery functions validate the datatypes of their parameters and results. For
example, suppose that some query Q is intended to generate output that conforms
to the datatype abc:PurchaseOrder (for some suitable binding of the namespace
prefix abc). The query Q can be type-validated by
"wrapping" it in a function that takes an instance of the desired type as a
parameter and simply returns it, as in the following example:
FUNCTION check(abc:PurchaseOrder $po) RETURNS abc:PurchaseOrder
{
  $po
}

check(Q)
The Boolean operator INSTANCEOF returns TRUE if its first operand is an
instance of the type named in its second operand; otherwise it returns FALSE.
For example, $x INSTANCEOF zoonames:animal returns TRUE if the dynamic type of
$x is zoonames:animal or a subtype of zoonames:animal. The INSTANCEOF operator
has the same syntax and behavior as the instanceof operator in Java.
Occasionally it is necessary to convert a value from one datatype to
another. For the primitive and derived types of XML Schema, a CAST notation is
supported that provides conversions between certain combinations of types. For
example, the notation CAST AS integer (x DIV y) converts the result
of x DIV y into the integer type. The set of type
conversions that are supported by the CAST notation is specified in Appendix C.
Conversions among user-defined datatypes are not supported by the CAST
notation, but user-defined functions can be written for this purpose.
In addition to CAST, XQuery provides a notation called TREAT. Rather than
converting an expression from one datatype to another, TREAT causes the query
processor to treat an expression as though its datatype were a subtype of its
static type. For example, TREAT AS Cat($mypet) tells the query
processor to treat the variable $mypet as though it were an instance of the
type Cat, even though the static type of $mypet is a supertype of Cat such as
Animal. This notation allows functions that require an argument of type Cat to
be invoked on the variable $mypet. At query execution time, if the dynamic type
of $mypet is not Cat, an error results.
The following example shows how INSTANCEOF and TREAT can be combined to simulate a primitive form of subtype polymorphism. A more robust treatment of polymorphic functions is deferred to a later version of XQuery.
(Q28) Define a function named sound(animal) that returns different strings for various types of animals. Use the function in a query that returns the sounds made by all of Billy's pets.
-- First define some functions to set the stage
NAMESPACE xsd = "http://www.w3.org/2000/10/XMLSchema-datatypes"

FUNCTION quack(duck $d) RETURNS xsd:string
{
  "String depends on properties of duck"
}

FUNCTION woof(dog $d) RETURNS xsd:string
{
  "String depends on properties of dog"
}

-- This function illustrates simulated subtype polymorphism
FUNCTION sound(animal $a) RETURNS xsd:string
{
  IF $a INSTANCEOF duck THEN quack(TREAT AS duck($a))
  ELSE IF $a INSTANCEOF dog THEN woof(TREAT AS dog($a))
  ELSE "No sound"
}

-- This query returns the sounds made by all of Billy's pets
FOR $p IN /kid[name="Billy"]/pet
RETURN sound($p)
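The same pattern can be tried in a general-purpose language. In the Python sketch below, isinstance plays the role of INSTANCEOF, and the hypothetical helper treat_as models TREAT: it performs no conversion and raises an error if the dynamic type does not match. The class names and returned strings are illustrative assumptions.

```python
class Animal: ...
class Duck(Animal): ...
class Dog(Animal): ...


def treat_as(cls, value):
    """Model of TREAT AS: no conversion, only a run-time check of the dynamic type."""
    if not isinstance(value, cls):
        raise TypeError(f"dynamic type {type(value).__name__} is not {cls.__name__}")
    return value


def quack(duck: Duck) -> str:
    return "Quack"


def woof(dog: Dog) -> str:
    return "Woof"


def sound(animal: Animal) -> str:
    # Simulated subtype polymorphism: test the dynamic type, then narrow it.
    if isinstance(animal, Duck):
        return quack(treat_as(Duck, animal))
    if isinstance(animal, Dog):
        return woof(treat_as(Dog, animal))
    return "No sound"


pets = [Duck(), Dog(), Animal()]
print([sound(p) for p in pets])  # ['Quack', 'Woof', 'No sound']
```

Calling treat_as(Duck, Dog()) raises TypeError, mirroring the execution-time error that TREAT produces when the dynamic type is not the named type.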
An XQuery unit is a string that can be recognized by an XQuery parser. At present, the following types of XQuery units are defined:
Queries: A query consists of zero or more namespace declarations, followed by zero or more function definitions, followed by an expression. The expression may invoke the functions declared in the query unit as well as functions declared in other function libraries. The result of the query is the value of the expression.
Function libraries: A function library consists of zero or more namespace declarations, followed by one or more function definitions. Function libraries are used to define functions for general use by multiple queries. The means by which a query gains access to a function library is not yet defined (see Issues).
Future versions of the language may define other types of XQuery units, such as updates and view definitions.
Since much of the world's business data is stored in relational databases, access to relational data is a vitally important application for an XML query language. In this section, we will illustrate the use of XQuery to access relational data by a series of examples based on a schema that is often used in relational database tutorials, containing descriptions of suppliers and parts, as shown in Figure 4. In this schema, Table S contains supplier numbers and names; Table P contains part numbers and descriptions; and Table SP contains the relationships between suppliers and the parts they supply, including the price of each part from each supplier.
Fig. 4: One possible XML representation of relational data
Figure 4 also shows how the schema of parts and suppliers might be translated into an XML view in which each table appears as a document, each row of a table appears as an element inside the document, and each value inside a row appears as a nested element. Other, more richly structured views can be defined on top of this simple view by using XQuery syntax.
SQL [SQL] is the standard relational database language. In many cases, SQL queries can be converted to XQuery syntax in a straightforward way by mapping SQL query-blocks into FLWR-expressions. We illustrate this mapping by the following query:
(Q29) Find part numbers of gears, in numeric order.
SQL version:
SELECT pno FROM p WHERE descrip LIKE 'Gear' ORDER BY pno;
XQuery version:
FOR $p IN document("p.xml")//p_tuple
WHERE contains($p/descrip, "Gear")
RETURN $p/pno SORTBY(.)
In XQuery, the operand of SORTBY is always interpreted within the context of the element to be sorted. Since the <pno> elements generated by Q29 have no internal structure, we use the notation "SORTBY(.)", which causes the <pno> elements to be sorted by their content.
Many relational queries involve forming data into groups and applying some
aggregation function such as count or avg to each
group. In SQL, these queries are expressed using GROUP BY and HAVING clauses.
The following example shows how such a query might be expressed in XQuery:
SQL version:
SELECT pno, avg(price) AS avgprice FROM sp GROUP BY pno HAVING count(*) >= 3 ORDER BY pno;
XQuery version:
FOR $pn IN distinct(document("sp.xml")//pno)
LET $sp := document("sp.xml")//sp_tuple[pno = $pn]
WHERE count($sp) >= 3
RETURN
  <well_supplied_item>
    $pn,
    <avgprice> avg($sp/price) </avgprice>
  </well_supplied_item> SORTBY(pno)
The distinct function in this query eliminates duplicate part
numbers from the set of all part numbers in the document. The result
of distinct is a list in which order is not significant.
Note that $pn, bound by a FOR-clause, represents an individual part number, whereas $sp, bound by a LET-clause, represents a set of sp-tuples. The SQL HAVING clause, which applies a predicate to a set, is mapped into an XQuery WHERE-clause that operates on the set-valued variable $sp. The XQuery version of the query also uses an element constructor to enclose each part number and average price in a containing element called <well_supplied_item>.
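The correspondence between the FOR/LET/WHERE clauses and SQL's GROUP BY/HAVING can be modeled in a few lines of Python. This is only a sketch: the rows in sp are made-up sample data, and the function name well_supplied and the dictionary-shaped results are illustrative choices.

```python
from collections import defaultdict

# Hypothetical rows of table SP: (supplier number, part number, price).
sp = [
    ("s1", "p1", 10.0), ("s2", "p1", 12.0), ("s3", "p1", 14.0),
    ("s1", "p2", 5.0), ("s2", "p2", 7.0),
]


def well_supplied(rows, min_suppliers=3):
    groups = defaultdict(list)            # one group per distinct part number
    for _sno, pno, price in rows:
        groups[pno].append(price)         # like LET $sp := ...[pno = $pn]
    result = []
    for pno in sorted(groups):            # like SORTBY(pno)
        prices = groups[pno]
        if len(prices) >= min_suppliers:  # like WHERE count($sp) >= 3
            result.append({"pno": pno, "avgprice": sum(prices) / len(prices)})
    return result


print(well_supplied(sp))  # [{'pno': 'p1', 'avgprice': 12.0}]
```

The condition is tested on the whole group rather than on individual rows, which is the role HAVING plays in the SQL version.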
Joins, which combine data from multiple sources into a single query result, are among the most important forms of relational queries. In this section we will illustrate how several types of joins can be expressed in XQuery.
A conventional ("inner") join returns information from two or more related tables, as illustrated by example Q31.
(Q31) Return a "flat" list of supplier names and their part descriptions, in alphabetic order.
FOR $sp IN document("sp.xml")//sp_tuple,
    $p IN document("p.xml")//p_tuple[pno = $sp/pno],
    $s IN document("s.xml")//s_tuple[sno = $sp/sno]
RETURN
  <sp_pair>
    $s/sname ,
    $p/descrip
  </sp_pair> SORTBY (sname, descrip)
Q31 returns information only about parts that have suppliers and suppliers that have parts. An "outer join" is a join that preserves information from one or more of the participating tables, including those rows that have no matching row in the joined table. For example, a "left outer join" between suppliers and parts might return information about suppliers that have no matching parts. In place of the missing parts data, relational systems usually return null values; but an XML query might represent the missing data by an empty element or the absence of an element. Q32 is an example of a query that corresponds to a left outer join.
(Q32) Return names of all the suppliers in alphabetic order, including those that supply no parts; inside each supplier element, list the descriptions of all the parts it supplies, in alphabetic order.
FOR $s IN document("s.xml")//s_tuple
RETURN
  <supplier>
    $s/sname,
    FOR $sp IN document("sp.xml")//sp_tuple [sno = $s/sno],
        $p IN document("p.xml")//p_tuple [pno = $sp/pno]
    RETURN $p/descrip SORTBY(.)
  </supplier> SORTBY(sname)
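The nesting in Q32 is what preserves suppliers without parts: the outer loop emits every supplier, and the inner loop may simply produce nothing. A rough Python model of that structure follows; the tables s, p, and sp hold hypothetical sample rows, and the function name is illustrative.

```python
# Hypothetical sample rows for tables S, P, and SP.
s = [{"sno": "s2", "sname": "Zenith"}, {"sno": "s1", "sname": "Acme"}]
p = [{"pno": "p1", "descrip": "Gear"}, {"pno": "p2", "descrip": "Bolt"}]
sp = [{"sno": "s1", "pno": "p1"}, {"sno": "s1", "pno": "p2"}]


def suppliers_with_parts(s, p, sp):
    result = []
    for supplier in sorted(s, key=lambda row: row["sname"]):   # SORTBY(sname)
        parts = sorted(                                        # SORTBY(.)
            part["descrip"]
            for link in sp if link["sno"] == supplier["sno"]
            for part in p if part["pno"] == link["pno"]
        )
        # A supplier with no matching parts still appears, with an empty list.
        result.append({"sname": supplier["sname"], "parts": parts})
    return result


print(suppliers_with_parts(s, p, sp))
# [{'sname': 'Acme', 'parts': ['Bolt', 'Gear']}, {'sname': 'Zenith', 'parts': []}]
```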
Another type of join that is sometimes used in relational systems is a "full outer join," which preserves information from both of the participating tables, including rows of each table that have no matching rows in the other table. In XML, the result of a full outer join can be structured in any of several ways. The example in Q33 uses a format of parts nested inside suppliers, followed by a list of parts that have no supplier. This might be thought of as a "supplier-centered" full outer join. A "part-centered" full outer join, on the other hand, might return a list of suppliers nested inside parts, followed by a list of suppliers that have no parts. Other forms of outer join queries are also possible.
(Q33) Return names of suppliers and descriptions and prices of their parts, including suppliers that supply no parts and parts that have no suppliers.
<master_list>
  (FOR $s IN document("s.xml")//s_tuple
   RETURN
     <supplier>
       $s/sname,
       FOR $sp IN document("sp.xml")//sp_tuple [sno = $s/sno],
           $p IN document("p.xml")//p_tuple [pno = $sp/pno]
       RETURN
         <part>
           $p/descrip,
           $sp/price
         </part> SORTBY (descrip)
     </supplier> SORTBY(sname)
  )
  UNION
  -- parts that have no supplier
  <orphan_parts>
    FOR $p IN document("p.xml")//p_tuple
    WHERE empty(document("sp.xml")//sp_tuple [pno = $p/pno] )
    RETURN $p/descrip SORTBY(.)
  </orphan_parts>
</master_list>
Q33 uses an element constructor to enclose its output inside a <master_list> element. The UNION operator, when used as in Q33 to combine two ordered lists, returns the first list with the second list appended at the end (removing duplicate node identities but not duplicate node values). The result is a <master_list> element containing an ordered list of <supplier> elements followed by an <orphan_parts> element that contains descriptions of all the parts that have no supplier.
With the emergence of XML, the distinctions among various forms of information, such as documents and databases, are quickly disappearing. XQuery is designed to support queries against a broad spectrum of information sources by incorporating features from several languages that were originally designed for diverse purposes. The versatility of XQuery will help XML to realize its potential as a universal medium for data interchange.
This specification describes XQuery Version 1. Future versions of XQuery may include additional features such as the following:
Data definition facilities for persistent views.
Function overloading and polymorphic functions.
Facilities for updating XML data.
The following rather permissive grammar specifies the structure of the XQuery language. An actual XQuery implementation would also augment the grammar with a set of typing rules (for example, an expression used in a predicate must return a Boolean value or an integer).
BNF Notations:
  'aaa'   token
  ?       optional
  *       zero or more
  +       one or more
  |       alternation
  ( )     grouping
------------------------------------------------------------------
XQueryUnit         ::= FunctionLibrary | Query
FunctionLibrary    ::= ContextDecl* FunctionDefn+
Query              ::= ContextDecl* FunctionDefn* Expr
ContextDecl        ::= 'NAMESPACE' Identifier '=' Literal
                     | 'NAMESPACE' 'DEFAULT' '=' Literal
FunctionDefn       ::= 'FUNCTION' QName '(' ParameterList? ')'
                       'RETURNS' Datatype '{' Expr '}'
ParameterList      ::= Parameter ( ',' Parameter )*
Parameter          ::= Datatype Variable
Datatype           ::= SimpleDatatype | 'LIST' '(' SimpleDatatype ')'
SimpleDatatype     ::= QName | '@' QName | 'ELEMENT'
Expr               ::= Expr 'SORTBY' '(' SortSpecList ')'
                     | UnaryOp Expr
                     | Expr BinaryOp Expr
                     | Variable
                     | Literal
                     | '.'
                     | FunctionName '(' ExprList? ')'
                     | ElementConstructor
                     | '(' Expr ')'
                     | '[' ExprList? ']'
                     | PathExpr
                     | Expr Predicate
                     | FlwrExpr
                     | 'IF' Expr 'THEN' Expr 'ELSE' Expr
                     | ('SOME' | 'EVERY') Variable 'IN' Expr 'SATISFIES' Expr
                     | ('CAST' | 'TREAT') 'AS' Datatype '(' Expr ')'
                     | Expr 'INSTANCEOF' Datatype
ExprList           ::= Expr ( ',' Expr )*
FlwrExpr           ::= (ForClause | LetClause)+ WhereClause? ReturnClause
ForClause          ::= 'FOR' Variable 'IN' Expr ( ',' Variable 'IN' Expr )*
LetClause          ::= 'LET' Variable ':=' Expr ( ',' Variable ':=' Expr )*
WhereClause        ::= 'WHERE' Expr
ReturnClause       ::= 'RETURN' Expr
SortSpecList       ::= SortSpec (',' SortSpec)*
SortSpec           ::= Expr ( 'ASCENDING' | 'DESCENDING' )?
ElementConstructor ::= StartTag ExprList? EndTag
                     | '<' TagName AttributeList? '/>'
PathExpr           ::= Path
                     | '/' Path
                     | '//' Path
                     | Expr '/' Path
                     | Expr '//' Path
Path               ::= Step ( ('/' | '//') Step )*
Step               ::= NodeGenerator Predicate*
NodeGenerator      ::= NameTest
                     | NodeType '(' ')'
                     | '@' NameTest ( '->' NameTest )?
                     | '..'
Predicate          ::= '[' Expr ']'
                     | '[' 'RANGE' Expr 'TO' Expr ']'
NameTest           ::= QName
                     | ( NamePrefix ':' )? '*'
                     | '*' ':' LocalName
NodeType           ::= 'NODE' | 'TEXT' | 'COMMENT' | 'PROCESSING-INSTRUCTION'
StartTag           ::= '<' TagName AttributeList? '>'
TagName            ::= QName | Variable
AttributeList      ::= (AttributeName '=' Expr)*
AttributeName      ::= QName | Variable
EndTag             ::= '</' TagName '>'
QName              ::= ( NamePrefix ':' )? LocalName
NamePrefix         ::= Identifier
LocalName          ::= Identifier
BinaryOp           ::= LogicalOp | CompareOp | ArithOp
LogicalOp          ::= 'AND' | 'OR' | 'UNION' | 'INTERSECT' | 'EXCEPT'
                     | 'BEFORE' | 'AFTER' | '|'
CompareOp          ::= '=' | '<' | '<=' | '>' | '>=' | '!='
ArithOp            ::= '+' | '-' | '*' | 'DIV' | 'MOD'
UnaryOp            ::= 'NOT' | '+' | '-'
------------------------------------------------------------------
The terminal symbols of this grammar are as follows:
  Variable     Example: $x
  Literal      Examples: "x", 5, 5.72E-3, TRUE
  Identifier   Examples: x, 'FOR'
------------------------------------------------------------------
The operator precedence of this grammar is as follows:
   1. Path operators: /, //, ->, predicates
   2. Unary +, -
   3. *, DIV, MOD
   4. Binary +, -
   5. =, <, <=, >, >=, !=, INSTANCEOF
   6. FLWR, IF..THEN..ELSE, Quantifiers
   7. BEFORE, AFTER
   8. NOT
   9. INTERSECT, AND
  10. UNION, |, EXCEPT, OR
  11. SORTBY
(To be completed)
(Include a list of valid casts)
(Include a promotion hierarchy for Schema types)
(Include specifications for literals where applicable)
(Include a specification of how existential quantifiers are implicitly generated by certain operations on lists)
(To be completed)
This section contains a mapping from the XQuery syntax to the XML Query Algebra. This mapping provides the basis for the semantics of XQuery, both at run-time (evaluation) and at compile-time (typing).
This mapping is still preliminary and is subject to change. It might still contain inconsistencies; wherever appropriate, we point to the corresponding issues in the issue list.
In XQuery, a query is composed of a preamble (containing schema, namespace and function declarations) and a body (containing a single XQuery expression). First, we give a mapping for XQuery expressions into Algebra expressions, then we give a mapping for XQuery declarations into Algebra declarations. This mapping is based on the XQuery grammar given in appendix B. Note that we assume that the precedence of XQuery and the XML Algebra are aligned (cf. [xquery-align-precedence]).
We will use the following notations.
E                  XQuery expression
e                  XML Algebra expression
l                  Local Name (NCName)
{n}l               QName with namespace n and local name l
a                  Qualified name
$v                 XQuery variables
v                  XML Algebra variables
dot                Distinguished algebra variable used to contain the current node.
T                  XQuery type
t                  XML Algebra type
[[ E ]]_c ==> e    Given the context c, the XQuery expression E is mapped to the algebra expression e.
When the context is clear, we will write: [[ E ]] ==> e.
We give a mapping for each of the classes of expressions identified at the beginning of Section 2. Each of these classes corresponds to a subset of the Expr production in the XQuery Grammar of Appendix B.
XQuery path expressions use a fragment of XPath, plus two additional features: dereference and range predicates. XQuery and the XML Algebra do not currently support all the axes of XPath (cf. [Algebra Issue-0052: Axes of Xpath]).
The mapping assumes that path steps are always given an explicit input expression. That is, XPath abbreviations are already resolved: for instance, name is already in the form ./name, /person[name = "John"] is already in the form /person[./name = "John"], etc. Cf. Issue [xquery-mapping-input-context].
In XQuery, '.' denotes the current node. The current node is a special node, whose value depends on the context of the expression within which it is evaluated. Inside the Algebra, we will use a distinguished variable, called dot, to contain the value of the current node. Variable dot will always be bound explicitly in the algebra (see for instance, navigation and SORT expressions).
[[ . ]] ==> dot
In XQuery, '/' denotes the root of the document of the current node. This is returned by a special algebra function root() that accesses parents of the current node until it reaches the root. This function is given in the preamble section.
[[ / ]] ==> root(dot)
The following mapping is straightforward and works whether a is an element name or an attribute name, with or without wildcards.
[[ E/a ]] ==> [[E]]/a
[[ /a ]]  ==> [[ / ]]/a
Note that in the XML Algebra, this returns elements with name a. This is not always the case in XQuery due to implicit type coercions. These type coercions are not yet taken into account in the mapping to the Algebra (cf. [xquery-xpath-coercions]).
Recursion is supported through recursive functions in the XML Algebra. Recursive navigation uses a special recursive function called descendant that returns the current node as well as all its descendants. This function is defined within the algebra and given in Section E.3.4.
[[ E//a ]] ==> descendant([[E]])/a
[[ //a ]]  ==> descendant([[ / ]])/a
This assumes the corresponding accessors are in the algebra.
[[ E/COMMENT() ]]                ==> [[E]]/comment()
[[ E/PROCESSING-INSTRUCTION() ]] ==> [[E]]/pi()
[[ E/NODE() ]]                   ==> [[E]]/node()
[[ E/TEXT() ]]                   ==> [[E]]/value()
This uses the support for dereference in the Algebra. This assumes an agreement on what is a reference and what is not (e.g., that ID/IDREF are converted to ref() in the data model).
[[E -> a]] ==> for v1 in [[E]] do
                 for v2 in deref(v1) do
                   match v2 with
                     case v3 : a[AnyType] do v3
                     else ()
Local XPath predicates correspond to iteration over the elements in a collection and require the binding of the dot variable, as the predicate might use the current node.
[[ E1[E2] ]] ==> for v in [[E1]] do
                   let dot = v do
                     where [[E2]] do v
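Read operationally, the predicate mapping is a filter loop in which the current node is rebound for each candidate. The Python sketch below mirrors the three algebra clauses line by line; the function name apply_predicate and the dictionary-based sample data are illustrative assumptions, not part of the Algebra.

```python
def apply_predicate(items, predicate):
    """Model of [[ E1[E2] ]]: bind dot to each item and keep those passing E2."""
    kept = []
    for v in items:          # for v in [[E1]] do
        dot = v              #   let dot = v do
        if predicate(dot):   #     where [[E2]] do
            kept.append(v)   #       v
    return kept


# Hypothetical stand-in for /person[./name = "John"].
people = [{"name": "John"}, {"name": "Mary"}]
print(apply_predicate(people, lambda dot: dot["name"] == "John"))
# [{'name': 'John'}]
```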
The parent navigation is mapped to the corresponding parent built-in function in the algebra.
[[ E/.. ]] ==> parent([[E]])
Element and attribute names are mapped to corresponding statements in the XML Algebra.
[[l]] ==> l [[n:l]] ==> {n}l
Note that in XQuery, namespaces are declared in the preamble. We will see in section E.3.2 that namespace declarations are mapped to global variables containing the URI of the corresponding namespace in the XML Algebra.
Element constructors are mapped to similar operations in the Algebra.
[[ <a a1=E1 ... an=En> E1', ..., Ek' </a> ]]
  ==> a [ @a1 [ [[E1]] ], ..., @an [ [[En]] ], [[E1']], ..., [[Ek']] ]
[[ <$v a1=E1 ... an=En> E1', ..., Ek' </$v> ]]
  ==> ~(v) [ @a1 [ [[E1]] ], ..., @an [ [[En]] ], [[E1']], ..., [[Ek']] ]
Note that if the Ei's do not return an attribute value, this will be detected by the XML Algebra type system at compile time. Note also that the Ei's might return attributes, in which case the algebra element constructor should place them at the beginning, together with the explicitly declared attributes.
Comment and PI constructors are mapped to similar operations in the Algebra.
[[ comment(E) ]] ==> comment([[E]])
[[ pi(E) ]]      ==> pi([[E]])
For, Let, Where and Return components of a FLWR expression are mapped to corresponding components in the Algebra.
[[ FOR $v1 IN E1, ..., $vn IN En
   LET $v1' := E1', ..., $vk' := Ek'
   WHERE Ew
   RETURN Er ]]
  ==> for v1 in [[E1]] do
      ...
      for vn in [[En]] do
      let v1' = [[E1']] do
      ...
      let vk' = [[Ek']] do
      where [[Ew]] do [[Er]]
Like for predicates, this mapping supposes that path expressions in the sort criterias are all given an explicit input. This also assumes a stable sort in the Algebra. Finally, this mapping is only possible for ASCENDING criteria, which is the only currently available in the XML Algebra (Cf. [Algebra Issue-0041: Sorting]).
[[ E SORTBY E1 ASCENDING, ..., En ASCENDING ]] ==> sort dot in (sort dot in (... (sort dot in [[E]] by [[En]]) ...) by [[E2]]) by [[E1]]
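The nested-sort mapping relies on stability: sorting by the least significant criterion first and the most significant criterion last gives the same result as a single multi-key sort. The Python sketch below illustrates this under the assumption of ascending order only; nested_stable_sort is a hypothetical helper name, and Python's built-in sorted() is used because it is documented to be stable.

```python
def nested_stable_sort(items, keys):
    """Apply one stable sort per criterion, innermost (last criterion) first."""
    result = list(items)
    for key in reversed(keys):            # sort by En first, ..., by E1 last
        result = sorted(result, key=key)  # sorted() is stable
    return result


rows = [("b", 2), ("a", 2), ("b", 1), ("a", 1)]
by_nested = nested_stable_sort(rows, [lambda r: r[0], lambda r: r[1]])
by_tuple = sorted(rows, key=lambda r: (r[0], r[1]))
```

If the underlying sort were not stable, the earlier (inner) orderings could be destroyed by the later (outer) ones, which is why the mapping states stability as an assumption.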
Operators in XQuery are mapped to corresponding operators in the Algebra.
Boolean operators are simply mapped to the corresponding Algebra operators.
[[ E1 AND E2 ]] ==> [[E1]] and [[E2]]
[[ E1 OR E2 ]] ==> [[E1]] or [[E2]]
[[ NOT E ]] ==> not([[E]])
Mapping predicates from XQuery to the Algebra is not completely straightforward, as XQuery uses implicit existential quantification (like XPath), while the XML Algebra does not.
As a consequence, existential quantification needs to be explicitly introduced by the mapping. It is still an open issue whether XQuery should use existential quantification or not (Cf. [xquery-xpath-coercions]).
For example, in the case of equality, the mapping is:
[[ E1 = E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 = v2 do v1))
Note that without existential quantification, the mapping would be more straightforward: [[ E1 = E2 ]] ==> [[E1]] = [[E2]].
[[ E1 < E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 < v2 do v1))
[[ E1 <= E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 <= v2 do v1))
[[ E1 >= E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 >= v2 do v1))
[[ E1 > E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 > v2 do v1))
[[ E1 != E2 ]] ==> not(empty(for v1 in [[E1]] do for v2 in [[E2]] do where v1 != v2 do v1))
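The existential reading of comparisons can be sketched in Python as follows. This is an illustration only: exists_cmp is a hypothetical helper, and plain Python lists of integers stand in for the values of E1 and E2.

```python
import operator


def exists_cmp(left, right, op):
    """not(empty(for v1 in left do for v2 in right do where op(v1, v2) do v1))"""
    return any(op(v1, v2) for v1 in left for v2 in right)


a = [1, 5]
b = [5, 9]
equal = exists_cmp(a, b, operator.eq)       # some pair is equal (5, 5)
not_equal = exists_cmp(a, b, operator.ne)   # some pair differs (1, 5)
empty_case = exists_cmp([], b, operator.eq)
```

One consequence of existential quantification is visible here: for the same two operands, both "=" and "!=" can hold at once, and any comparison against an empty collection is false.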
Arithmetic operators are not yet integrated in XQuery and in the XML Algebra. See open issues [xquery-operators-and-functions] and [Algebra Issue-0056: Operators on Simple Types]. Note also that we do not perform automatic type coercions (Cf. [xquery-xpath-coercions]).
Here are a few examples of mappings for such expressions.
[[ E1 + E2 ]] ==> [[E1]] + [[E2]]
[[ E1 - E2 ]] ==> [[E1]] - [[E2]]
[[ E1 * E2 ]] ==> [[E1]] MULT [[E2]]
[[ E1 DIV E2 ]] ==> [[E1]] DIV [[E2]]
[[ E1 MOD E2 ]] ==> [[E1]] MOD [[E2]]
...
[[ + E ]] ==> +[[E]]
[[ - E ]] ==> -[[E]]
...
All collection operations in XQuery are based on node identity. See [xquery-equality-identity].
List concatenation corresponds to sequencing in the Algebra.
[[ [E] ]] ==> [[E]]_l
[[ E1, E2 ]]_l ==> [[E1]], [[E2]]
Union in XQuery corresponds to the set Union operation.
[[ E1 UNION E2 ]] ==> distinct([[E1]], [[E2]])
Intersection and exception can be written in the algebra using existential quantification. Note the use of '==', the node identity operator in the XML Algebra.
[[ E1 INTERSECT E2 ]] ==> for v1 in [[E1]] do where not(empty(for v2 in [[E2]] do where v2 == v1 do v2)) do v1
[[ E1 EXCEPT E2 ]] ==> for v1 in [[E1]] do where empty(for v2 in [[E2]] do where v2 == v1 do v2) do v1
The BEFORE and AFTER operators return the subset of the elements in the first expression that are before or after at least one of the elements in the second expression in document order. The mapping to the algebra uses both existential quantification and the document order comparators '<' and '>'.
[[ E1 BEFORE E2 ]] ==> for v1 in [[E1]] do where not(empty(for v2 in [[E2]] do where v2 > v1 do v2)) do v1
[[ E1 AFTER E2 ]] ==> for v1 in [[E1]] do where not(empty(for v2 in [[E2]] do where v1 > v2 do v2)) do v1
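The identity-based set operations can be sketched in Python, where the "is" operator plays the role of the Algebra's node identity operator '=='. The Node class and the function names below are hypothetical; the union function assumes that distinct() removes duplicates by identity and keeps the first occurrence.

```python
class Node:
    """Hypothetical node: two nodes with the same name are still distinct objects."""
    def __init__(self, name):
        self.name = name


def union(e1, e2):
    """distinct([[E1]], [[E2]]): concatenate, then drop repeated nodes by identity."""
    seen, result = set(), []
    for v in list(e1) + list(e2):
        if id(v) not in seen:
            seen.add(id(v))
            result.append(v)
    return result


def intersect(e1, e2):
    """Keep v1 when some v2 in e2 is the very same node."""
    return [v1 for v1 in e1 if any(v2 is v1 for v2 in e2)]


def except_(e1, e2):
    """Keep v1 when no v2 in e2 is the very same node."""
    return [v1 for v1 in e1 if not any(v2 is v1 for v2 in e2)]


a, b, c = Node("x"), Node("x"), Node("y")
```

Note that a and b carry the same name but are different nodes, so intersect([a], [b]) is empty under identity-based semantics.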
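A small Python sketch of the BEFORE and AFTER mappings follows. It assumes that document order is available as an integer position for each node; the pos parameter and the use of plain integers as stand-in nodes are illustrative assumptions only.

```python
def before(e1, e2, pos):
    """Items of e1 that precede at least one item of e2 in document order."""
    return [v1 for v1 in e1 if any(pos(v2) > pos(v1) for v2 in e2)]


def after(e1, e2, pos):
    """Items of e1 that follow at least one item of e2 in document order."""
    return [v1 for v1 in e1 if any(pos(v1) > pos(v2) for v2 in e2)]


# Stand-in nodes: each integer is its own document-order position.
first = [1, 4, 7]
second = [3, 5]
earlier = before(first, second, pos=lambda n: n)
later = after(first, second, pos=lambda n: n)
```

A node can appear in both results (here, position 4), because it only needs to precede or follow one node of the second operand.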
Conditional expressions are mapped directly in the algebra.
[[ IF E1 THEN E2 ELSE E3 ]] ==> if [[E1]] then [[E2]] else [[E3]]
Literal values are mapped to their corresponding values in the algebra.
[[ TRUE ]] ==> true
[[ FALSE ]] ==> false
[[ 1 ]] ==> 1
[[ "" ]] ==> ""
...
Note that the algebra does not support a syntax for Float values, while XQuery does. See Issue [xquery-literal-xml-constructor] and [xquery-embedding-xml]. Further work is required on arithmetic values and operations.
Function calls are mapped to equivalent function calls. Mapping for function declarations is given in the next section.
[[ FunctionName(E1,..., En) ]] ==> FunctionName( [[E1]]; ...; [[En]] )
Notably, the FILTER function will be treated in Section E.3.4.
Quantifiers are simply mapped to existential predicate and iteration in the Algebra.
[[ SOME $v IN E1 SATISFIES E2 ]] ==> not(empty(for v in [[E1]] do where [[E2]] do v))
[[ EVERY $v IN E1 SATISFIES E2 ]] ==> empty(for v in [[E1]] do where not([[E2]]) do v)
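The two quantifier mappings can be read as emptiness tests on a filtered iteration. The Python sketch below follows that reading literally; the function names some and every are hypothetical.

```python
def some(items, pred):
    """not(empty(for v in items do where pred(v) do v))"""
    return len([v for v in items if pred(v)]) != 0


def every(items, pred):
    """empty(for v in items do where not(pred(v)) do v)"""
    return len([v for v in items if not pred(v)]) == 0


prices = [12, 30, 45]
any_cheap = some(prices, lambda p: p < 20)
all_cheap = every(prices, lambda p: p < 20)
vacuous = every([], lambda p: p < 20)
```

Under this mapping EVERY holds vacuously on an empty collection, while SOME does not.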
Variables in XQuery are mapped to variables in the algebra.
[[ $v ]] ==> v (* variables *)
XQuery supports operations on the types.
The INSTANCE OF operation checks whether a value is of a given type. This operation is mapped to a match expression in the algebra.
[[ E INSTANCE OF T ]] ==> match [[ E ]] case [[ T ]] do true else false
The CAST operation in XQuery performs a conversion of the actual value from a type into another. This kind of value conversion is not currently supported in the XML Algebra (Cf. [xquery-mapping-cast]).
The TREAT operation changes the type of a value at run time. It is mapped into the unsafe type casting in the XML Algebra.
[[ TREAT AS T (E) ]] ==> cast [[ E ]] : [[ T ]]
XQuery type declarations are mapped to the corresponding type in the XML Algebra. Note that there are a number of issues related to how XQuery types are written (Cf. [xquery-type-syntax] and [xquery-inline-xml-schema]), how typing is done in XQuery (Cf. [xquery-type-correspondence]), and how XML Algebra types relate to XML Schema (Cf. [Algebra Issue-0018: Align algebra types with schema]).
[[ xsd:string ]]_type ==> String (* String type *)
[[ xsd:integer ]]_type ==> Integer (* Integer type *)
[[ xsd:float ]]_type ==> Float (* Float type *)
[[ 'ELEMENT' ]]_type ==> AnyElement (* Any Element *)
[[ TagName ]]_type ==> TagName[AnyType] (* An Element *)
[[ @AttributeName ]] ==> @AttributeName[AnyType] (* An Attribute *)
[[ SchemaName ]] ==> X (* Where X is the Algebra type name corresponding to the XML Schema name *)
Namespace declarations in XQuery are mapped to global variables in the Algebra. Note that we use here the uri function of the XML Query Data Model, which is not currently in the XML Algebra (See Algebra Issue [Algebra Issue-0071: Alignment with the XML Query Datamodel ]).
[[ NAMESPACE n = d ]] ==> let n : URI = uri([[d]])
Each function declaration in XQuery is mapped to a corresponding function declaration in the Algebra.
[[ FUNCTION f(T1 v1, ..., Tn vn) RETURNS T' { E } ]] ==> fun f(v1 : [[T1]]; ...; vn : [[Tn]]) : [[T']] = [[E]]
In the above mapping, we used a number of Algebra functions, which are not currently part of the XML Algebra. These functions need to be declared accordingly as part of the preamble.
The following function computes the root of the input node.
fun root(x:AnyElement) : AnyElement = let p = parent(x) do if p = () then x else root(p)
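The recursion in root can be mirrored in Python as a walk up the parent chain. The Node class with a parent attribute is a hypothetical stand-in for the data model; None plays the role of the empty result () returned by parent for a node without a parent.

```python
class Node:
    """Hypothetical node with an optional parent reference."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def root(x):
    """Follow parent links until a node without a parent is reached."""
    p = x.parent
    return x if p is None else root(p)


doc = Node("bib")
book = Node("book", parent=doc)
title = Node("title", parent=book)
```

Calling root on the top-most node returns that node itself, which corresponds to the "then x" branch of the Algebra definition.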
The following function computes the descendants of the input nodes.
fun descendant(x:AnyForest) : AnyForest = for v in x do v, descendant(children(v))
The following algebra functions implement the corresponding XQuery filter function. Note that this definition loses all type information, see [xquery-filter-typing].
fun member(x : AnyNode; y : AnyForest) : Boolean =
  not(empty(for v in y do where v == x do v))

fun filter1(x : AnyNode; y : AnyForest) : AnyForest =
  if (member(x;y)) then
    match x
      case x' : (Comment|PI|String|AnyAttribute) do x
      else let tag = name(x) do ~(tag) [ x/@*, filter(x/node();y) ]
  else filter(x/node();y)

fun filter(x : AnyForest; y : AnyForest) : AnyForest =
  for x' in x do filter1(x';y)
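The behaviour of filter can be sketched in Python on a simplified tree. This is an illustration under stated assumptions, not the Algebra definition itself: the Elem class, filter_forest, and shape are hypothetical names, only element nodes are modelled (attributes, comments, PIs and text are omitted), and membership is tested by object identity.

```python
class Elem:
    """Hypothetical element node with a tag and a list of child elements."""
    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)


def member(x, forest):
    """True when x is, by identity, one of the nodes in forest."""
    return any(v is x for v in forest)


def filter_forest(forest, keep):
    """Keep nodes that are in 'keep'; a dropped node is replaced by its kept descendants."""
    result = []
    for x in forest:
        kept_below = filter_forest(x.children, keep)
        if member(x, keep):
            result.append(Elem(x.tag, kept_below))  # rebuild the node with filtered content
        else:
            result.extend(kept_below)               # promote the kept descendants
    return result


def shape(e):
    """Nested (tag, children) view of a tree, for easy comparison."""
    return (e.tag, [shape(c) for c in e.children])


fig1, fig2, sec = Elem("figure"), Elem("figure"), Elem("section")
chapter = Elem("chapter", [sec, fig1])
appendix = Elem("appendix", [fig2])
book = Elem("book", [chapter, appendix])
outline = filter_forest([book], keep=[book, fig1, fig2])
```

In this example the chapter and appendix levels are removed and both figures become direct children of the rebuilt book element, which preserves the relative hierarchy and order of the kept nodes.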
As noted earlier, the simple grammar shown in Appendix B needs to be made more specific in order to be used in a real XQuery implementation. The way in which this is done depends on the parser generator tools being used. This Appendix contains a specification, based on the grammar in Appendix B, that can be used with the Java CUP tool (http://www.cs.princeton.edu/~appel/modern/java/CUP/manual.html) to generate a working XQuery parser.
The grammar shown below allows a series of XQuery units (queries and function libraries) to be separated by semicolons and parsed as a single test case. A parser generated from this grammar has been used to validate all the XQuery solutions to the Use Cases of the Query Working Group.
/**
 * An XQueryFile is one or more XQueryUnits separated by semicolons.
 * An XQueryUnit may be either a function library or a query.
 */
XQueryFile ::=
      XQueryUnit
    | XQueryFile SEMICOLON XQueryUnit
    ;

XQueryUnit ::=
      FunctionLibrary
    | Query
    ;

FunctionLibrary ::=
      OptionalContextDeclarationList FunctionDefinitionList
    ;

/**
 * A Query is zero or more namespace declarations,
 * followed by zero or more local function definitions,
 * followed by an expression
 */
Query ::=
      OptionalContextDeclarationList OptionalFunctionDefinitionList Expression
    ;

OptionalContextDeclarationList ::=
      /* empty */
    | ContextDeclarationList
    ;

ContextDeclarationList ::=
      ContextDeclaration
    | ContextDeclarationList ContextDeclaration
    ;

/**
 * A context declaration is used to declare a namespace prefix,
 * or to declare the default namespace.
 */
ContextDeclaration ::=
      NAMESPACE IDENTIFIER EQUALS STRING_LITERAL
    | NAMESPACE DEFAULT EQUALS STRING_LITERAL
    ;

OptionalFunctionDefinitionList ::=
      /* empty */
    | FunctionDefinitionList
    ;

FunctionDefinitionList ::=
      FunctionDefinition
    | FunctionDefinitionList FunctionDefinition
    ;

FunctionDefinition ::=
      FUNCTION FunctionName L_PAREN OptionalParameterList R_PAREN
      RETURNS Datatype L_BRACE Expression R_BRACE
    ;

FunctionName ::=
      QName
    ;

OptionalParameterList ::=
      /* empty */
    | ParameterList
    ;

ParameterList ::=
      Parameter
    | ParameterList COMMA Parameter
    ;

/**
 * Each function parameter has a datatype and a name
 */
Parameter ::=
      Datatype VAR
    ;

/**
 * A datatype can be the name of an element or attribute, or the
 * keyword ELEMENT or ATTRIBUTE denoting any element or attribute, or
 * one of the predefined types of XML Schema, or a list of any simple type.
 */
Datatype ::=
      SimpleDatatype
    | LIST L_PAREN SimpleDatatype R_PAREN
    ;

SimpleDatatype ::=
      QName
    | AT QName
    | ELEMENT
    ;

Expression ::=
      LogicalOrSetExpression
    | LogicalOrSetExpression SORTBY L_PAREN SortSpecList R_PAREN
    ;

SortSpecList ::=
      SortSpec
    | SortSpecList COMMA SortSpec
    ;

SortSpec ::=
      Expression
    | Expression ASCENDING
    | Expression DESCENDING
    ;

LogicalOrSetExpression ::=
      LogicalOrSetTerm
    | LogicalOrSetExpression OR LogicalOrSetTerm
    | LogicalOrSetExpression UNION LogicalOrSetTerm
    | LogicalOrSetExpression BAR LogicalOrSetTerm
    | LogicalOrSetExpression EXCEPT LogicalOrSetTerm
    ;

LogicalOrSetTerm ::=
      LogicalOrSetPrimitive
    | LogicalOrSetTerm AND LogicalOrSetPrimitive
    | LogicalOrSetTerm INTERSECT LogicalOrSetPrimitive
    ;

LogicalOrSetPrimitive ::=
      SequencedValue
    | NOT SequencedValue
    ;

SequencedValue ::=
      ValueExpression
    | SequencedValue BEFORE ValueExpression
    | SequencedValue AFTER ValueExpression
    ;

ValueExpression ::=
      Comparison
    | SpecialExpression
    ;

Comparison ::=
      ArithmeticExpression
    | Comparison CompareOperator ArithmeticExpression
    | Comparison INSTANCEOF Datatype
    ;

ArithmeticExpression ::=
      ArithmeticTerm
    | ArithmeticExpression PLUS ArithmeticTerm
    | ArithmeticExpression MINUS ArithmeticTerm
    ;

ArithmeticTerm ::=
      ArithmeticFactor
    | ArithmeticTerm STAR ArithmeticFactor
    | ArithmeticTerm DIV ArithmeticFactor
    | ArithmeticTerm MOD ArithmeticFactor
    ;

ArithmeticFactor ::=
      ArithmeticPrimitive
    | PLUS ArithmeticFactor
    | MINUS ArithmeticFactor
    ;

ArithmeticPrimitive ::=
      BasicExpression OptionalPredicateList
    | PathExpression
    ;

PathExpression ::=
      Path
    | SLASH Path
    | DOUBLE_SLASH Path
    | BasicExpression OptionalPredicateList SLASH Path
    | BasicExpression OptionalPredicateList DOUBLE_SLASH Path
    ;

/**
 * In the future, paths may include other forms of regular expressions.
 */
Path ::=
      Step
    | Path SLASH Step
    | Path DOUBLE_SLASH Step
    ;

/**
 * As in XPath, a step represents movement in an XML document along an
 * axis (currently limited to the child, parent, and attribute axes).
 * If followed by DEREFERENCE, the step dereferences an attribute
 * of type IDREF(S) and returns the target element(s) (this requires
 * information from a schema or Document Type Definition.)
 */
Step ::=
      NodeGenerator OptionalPredicateList
    ;

NodeGenerator ::=
      NameTest
    | NodeType L_PAREN R_PAREN
    | AT NameTest
    | AT NameTest DEREFERENCE NameTest
    | DOTDOT
    ;

OptionalPredicateList ::=
      /* empty */
    | OptionalPredicateList Predicate
    ;

/**
 * An expression in a predicate must evaluate to a Boolean or an ordinal number.
 */
Predicate ::=
      L_BRACKET Expression R_BRACKET
    | L_BRACKET RANGE Expression TO Expression R_BRACKET
    ;

/**
 * When used in an expression, DOT denotes the "current node".
 */
BasicExpression ::=
      VAR
    | Literal
    | FunctionName L_PAREN OptionalExpressionList R_PAREN
    | L_PAREN Expression R_PAREN
    | ElementConstructor
    | ListConstructor
    | CastingVerb AS Datatype L_PAREN Expression R_PAREN
    | DOT
    ;

Literal ::=
      STRING_LITERAL
    | INTEGER_LITERAL
    | FLOAT_LITERAL
    | BooleanLiteral
    ;

BooleanLiteral ::=
      TRUE
    | FALSE
    ;

CastingVerb ::=
      CAST
    | TREAT
    ;

OptionalExpressionList ::=
      /* empty */
    | ExpressionList
    ;

ExpressionList ::=
      Expression
    | ExpressionList COMMA Expression
    ;

SpecialExpression ::=
      FlwrExpression
    | IF Expression THEN ValueExpression ELSE ValueExpression
    | Quantifier VAR IN Expression SATISFIES ValueExpression
    ;

FlwrExpression ::=
      ForLetClause WhereReturnClause
    ;

/**
 * A For-Let clause consists of one or more FOR-clauses and/or Let-clauses
 */
ForLetClause ::=
      ForOrLet
    | ForLetClause ForOrLet
    ;

ForOrLet ::=
      ForClause
    | LetClause
    ;

/**
 * A For clause has at least one variable that iterates over a collection
 */
ForClause ::=
      FOR VAR IN Expression
    | ForClause COMMA VAR IN Expression
    ;

/**
 * A Let clause has at least one variable assignment
 */
LetClause ::=
      LET VAR SET_EQUAL_TO Expression
    | LetClause COMMA VAR SET_EQUAL_TO Expression
    ;

WhereReturnClause ::=
      WhereClause ReturnClause
    | ReturnClause
    ;

WhereClause ::=
      WHERE Expression
    ;

ReturnClause ::=
      RETURN ValueExpression
    ;

/**
 * An element constructor is a start tag, followed by an optional expression
 * list, followed by an end-tag; or it might be an empty element constructor.
 *
 * For constructing other types of nodes, use constructor functions such as
 * comment("string") and pi("target", "string").
 */
ElementConstructor ::=
      StartTag OptionalExpressionList EndTag
    | EmptyElementConstructor
    ;

StartTag ::=
      LESS_THAN TagName OptionalAttributeList GREATER_THAN
    ;

TagName ::=
      QName
    | VAR
    ;

OptionalAttributeList ::=
      /* empty */
    | AttributeList
    ;

AttributeList ::=
      AttributeValuePair
    | AttributeList AttributeValuePair
    ;

AttributeValuePair ::=
      AttributeName EQUALS ArithmeticExpression
    ;

AttributeName ::=
      QName
    | VAR
    ;

/**
 * If the end tag contains a variable, it must match the start tag
 */
EndTag ::=
      LESS_THAN_SLASH TagName GREATER_THAN
    ;

EmptyElementConstructor ::=
      LESS_THAN TagName OptionalAttributeList SLASH_GREATER_THAN
    ;

ListConstructor ::=
      L_BRACKET ListContent R_BRACKET
    | L_BRACKET R_BRACKET
    ;

ListContent ::=
      Expression
    | ListContent COMMA Expression
    ;

/**
 * A name test is a Qname where "*" serves as a wild card
 */
NameTest ::=
      QName
    | NamePrefix COLON STAR
    | STAR COLON LocalPart
    | STAR
    ;

NodeType ::=
      NODE
    | TEXT
    | COMMENT
    | PROCESSING_INSTRUCTION
    ;

Quantifier ::=
      SOME
    | EVERY
    ;

/**
 * A QName may consist of a single part or two parts
 */
QName ::=
      LocalPart
    | NamePrefix COLON LocalPart
    ;

/**
 * NamePrefix and LocalPart are both IDENTIFIERs
 */
NamePrefix ::=
      IDENTIFIER
    ;

LocalPart ::=
      IDENTIFIER
    ;

CompareOperator ::=
      EQUALS
    | LESS_THAN
    | LESS_THAN_EQUALS
    | GREATER_THAN
    | GREATER_THAN_EQUALS
    | NOT_EQUALS
    ;
Most commonly used grammars are either LALR or LL(1). The CUP grammar presented in the previous appendix is an example of an LALR grammar that corresponds to the abstract grammar in Appendix B. This appendix contains an LL(1) grammar that works with JavaCC.
/**
 * Xquery parser; 1.26.2001
 *
 */
PARSER_BEGIN(xquery)

import java.io.*;
import java.util.Vector;

class xquery {
  public static void main(String args[]) throws Exception {
    xquery ql = null;
    if (args.length == 0) {
      System.out.println("--Input: test.xq");
      ql = new xquery(new FileInputStream("test.xq"));
    } else {
      System.out.println("--Input: " + args[0]);
      ql = new xquery(new FileInputStream(args[0]));
    }
    try {
      ql.Start();
      System.out.println("--Done.");
    } catch (Exception e) {
      System.out.println("--Undone.");
      System.out.println(e.getMessage());
      e.printStackTrace();
    }
  }
}

PARSER_END(xquery)

void Start() throws Exception : {}
{ XQueryFile() }

void XQueryFile() throws Exception : {}
{ XQueryUnit() [ <SEMICOLON> XQueryFile() ] }

void XQueryUnit() throws Exception : {}
{ FunctionLibrary() [ Query() ] }

void FunctionLibrary() throws Exception : {}
{ [ ContextDeclarationList() ] [ FunctionDefinitionList() ] }

void Query() throws Exception : {}
{ Expression() }

void ContextDeclarationList() throws Exception : {}
{ ContextDeclaration() [ ContextDeclarationList() ] }

void ContextDeclaration() throws Exception : {}
{ <NAMESPACE> [ <DEFAULTT> ] <IDENTIFIER> <EQUALS> <STRING> }

void FunctionDefinitionList() throws Exception : {}
{ FunctionDefinition() [ FunctionDefinitionList() ] }

void FunctionDefinition() throws Exception : {}
{
  <FUNCTION> <IDENTIFIER> <L_PAREN> (ParameterList())? <R_PAREN>
  <RETURNS> DataType() <L_BRACE> Expression() <R_BRACE>
}

void ParameterList() throws Exception : {}
{ Parameter() [ <COMMA> ParameterList() ] }

void Parameter() throws Exception : {}
{ DataType() VAR() }

void DataType() throws Exception : {}
{ SimpleDatatype() | <LIST> <L_PAREN> SimpleDatatype() <R_PAREN> }

void SimpleDatatype() throws Exception : {}
{ QName() | <ELEMENT> | <ATTRIBUTE> | <AT> QName() }

void Expression() throws Exception : {}
{ ExpressionLevel14() }

void ExpressionLevel14() throws Exception : {}
{ ExpressionLevel13() [ LOOKAHEAD(2) <SORTBY> <L_PAREN> SortSpecList() <R_PAREN> ] }

void ExpressionLevel13() throws Exception : {}
{ ExpressionLevel12() [ LOOKAHEAD(2) (<OR> | <BAR> | <UNION> | <EXCEPT>) ExpressionLevel13() ] }

void ExpressionLevel12() throws Exception : {}
{ ExpressionLevel11() [ LOOKAHEAD(2) (<AND> | <INTERSECT>) ExpressionLevel12() ] }

void ExpressionLevel11() throws Exception : {}
{ [ <NOT> ] ExpressionLevel10() }

void ExpressionLevel10() throws Exception : {}
{ ExpressionLevel9() [ LOOKAHEAD(2) (<BEFORE> | <AFTER>) ExpressionLevel10() ] }

void ExpressionLevel9() throws Exception : {}
{ ExpressionLevel7() | ExpressionLevel8() }

void ExpressionLevel8() throws Exception : {}
{
    ExpressionLevel8a()   /* FLWR Expression */
  | ExpressionLevel8b()   /* If Then Else */
  | ExpressionLevel8c()   /* Quantified expression */
}

void ExpressionLevel7() throws Exception : {}
{ ExpressionLevel6() [ LOOKAHEAD(2) ComparisonOperator() ExpressionLevel7() ] }

void ExpressionLevel6() throws Exception : {}
{ ExpressionLevel5() [ LOOKAHEAD(2) ( <INSTANCEOF> DataType() ) ] }

void ExpressionLevel5() throws Exception : {}
{ ExpressionLevel4() [ LOOKAHEAD(2) ( <PLUS> | <MINUS> ) ExpressionLevel5() ] }

void ExpressionLevel4() throws Exception : {}
{ ExpressionLevel3() [ LOOKAHEAD(2) ( <STAR> | <DIV> | <MOD> ) ExpressionLevel4() ] }

void ExpressionLevel3() throws Exception : {}
{ [ <PLUS> | <MINUS> ] ExpressionLevel2() }

void ExpressionLevel2() throws Exception : {}
{
    LOOKAHEAD(4) ExpressionLevel1() [ (<SLASH> | <SLASHSLASH>) Path() ]
  | Path()
  | <SLASH> Path()
  | <SLASHSLASH> Path()
}

void ExpressionLevel1() throws Exception : {}
{ ExpressionLevel0() [ LOOKAHEAD(2) PredicateList() ] }

void ExpressionLevel0() throws Exception : {}   /* Basic expression */
{
    VAR()
  | <DOT>
  | Literal()
  | QName() <L_PAREN> [ ExpressionList() ] <R_PAREN>
  | <L_PAREN> Expression() <R_PAREN>
  | NodeConstructor()
  | ListConstructor()
  | CastingVerb() <AS> DataType() <L_PAREN> Expression() <R_PAREN>
}

void Path() throws Exception : {}
{
  AtomicStep()
  ( LOOKAHEAD(2) Predicate()
  | (<SLASH> | <SLASHSLASH>) AtomicStep()
  | <DEREFERENCE> NameTest()
  )*
}

void AtomicStep() throws Exception : {}
{ NameTest() | NodeType() <L_PAREN> <R_PAREN> | <AT> NameTest() | <DOTDOT> }

void PredicateList() throws Exception : {}
{ Predicate() [ LOOKAHEAD(2) PredicateList() ] }

void Predicate() throws Exception : {}
{ <L_BRACKET> ( Expression() | <RANGE> Expression() <TO> Expression() ) <R_BRACKET> }

void ExpressionLevel8b() throws Exception : {}
{ <IF> Expression() <THEN> Expression() <ELSE> ExpressionLevel9() }

void ExpressionLevel8c() throws Exception : {}
{ Quantifier() VAR() <IN> Expression() <SATISFIES> ExpressionLevel9() }

void ExpressionLevel8a() throws Exception : {}
{ (ForClause() | LetClause())+ [ WhereClause() ] ReturnClause() }

void ForClause() throws Exception : {}
{ <FOR> VAR() <IN> Expression() ( <COMMA> VAR() <IN> Expression() )* }

void LetClause() throws Exception : {}
{ <LET> VAR() <SET_EQUAL_TO> Expression() ( <COMMA> VAR() <SET_EQUAL_TO> Expression() )* }

void WhereClause() throws Exception : {}
{ <WHERE> Expression() }

void ReturnClause() throws Exception : {}
{ <RETURN> ExpressionLevel9() }

void NodeConstructor() throws Exception : {}
{ ElementConstructor() }

void ListConstructor() throws Exception : {}
{ <L_BRACKET> [ ExpressionList() ] <R_BRACKET> }

void ElementConstructor() throws Exception : {}
{ StartTag() ( <SLASH_GREATER_THAN> | ( <GREATER_THAN> [ ExpressionList() ] EndTag() ) ) }

void StartTag() throws Exception : {}
{ <LESS_THAN> TagName() [ AttributeList() ] }

void AttributeList() throws Exception : {}
{ AttributeValuePair() [ AttributeList() ] }

void AttributeValuePair() throws Exception : {}
{ AttributeName() <EQUALS> ExpressionLevel5() }

void AttributeName() throws Exception : {}
{ QName() | VAR() }

void TagName() throws Exception : {}
{ QName() | VAR() }

void EndTag() throws Exception : {}
{ <LESS_THAN_SLASH> TagName() <GREATER_THAN> }

void NameTest() throws Exception : {}
{
    ( <IDENTIFIER> [ <COLON> ( <IDENTIFIER> | <STAR> ) ] )
  | ( <STAR> [ <COLON> <IDENTIFIER> ] )
}

void NodeType() throws Exception : {}
{ <NODE> | <TEXT> | <COMMENT> | <PROCESSING_INSTRUCTION> }

void Quantifier() throws Exception : { boolean b; }
{ <SOME> | <EVERY> }

void QName() throws Exception : {}
{ <IDENTIFIER> [ <COLON> <IDENTIFIER> ] }

void ComparisonOperator() throws Exception : {}
{
    <EQUALS> | <LESS_THAN> | <LESS_THAN_EQUALS>
  | <GREATER_THAN> | <GREATER_THAN_EQUALS> | <NOT_EQUALS>
}

void CastingVerb() throws Exception : {}
{ <CAST> | <TREAT> }

void SortSpecList() throws Exception : {}
{ Expression() [ <ASCENDING> | <DESCENDING> ] [ <COMMA> SortSpecList() ] }

void VAR() throws Exception : {}
{ <DOLLAR> <IDENTIFIER> }

void Literal() throws Exception : {}
{ <STRING> | <INTEGER> | <FLOAT> | Boolean() }

void Boolean() throws Exception : {}
{ <TRUE> | <FALSE> }

void ExpressionList() throws Exception : {}
{ Expression() [ <COMMA> ExpressionList() ] }
This section contains the current issues for XQuery. The individual issues are shown in detail after an abbreviated issues list.
Issue |
1: Collection types (xquery-collection-types) |
2: Definitions of Operators (xquery-definition-of-operators) |
3: Operators and functions (xquery-operators-and-functions) |
4: External Functions (xquery-external-functions) |
5: Function Definition (xquery-function-definition) |
6: Function Resolution (xquery-function-resolution) |
7: CAST expression (xquery-cast-expression) |
8: Type Guard (xquery-type-guard) |
9: Separation of clauses in FLWR (xquery-separation-of-flowers) |
10: Alignment of Syntax (xquery-alignment-of-syntax) |
11: Alternative syntax for element construction (xquery-element-construction) |
12: Fusion (xquery-fusion) |
13: Filter as a Function (xquery-filter-function) |
14: TRY/CATCH and error() (xquery-try-catch-error) |
15: Updates (xquery-updates) |
16: Algebra Mapping (xquery-algebra-mapping) |
17: XPath Productions (xquery-xpath-productions) |
18: Abstract Syntax (xquery-abstract-syntax) |
19: Recursion (xquery-recursion) |
20: Copy and Reference Semantics (xquery-copy-reference) |
21: View Definition (xquery-persist-views-functions) |
22: Human-Readable Syntax for Types (xquery-type-syntax) |
23: What is a Query (xquery-query) |
24: XPath Type Coercions (xquery-xpath-coercions) |
25: Support for Unordered Collections (xquery-unordered-collections) |
26: Identity-based equality operator (xquery-equality-identity) |
27: Deep equality (xquery-deep-equality) |
28: Reference Constructor (xquery-reference-constructor) |
29: Precedence of Operators (xquery-precedence) |
30: Queries with Invalid Content (xquery-invalid-content) |
31: Function Libraries (xquery-function-library) |
32: Correspondence of Types (xquery-type-correspondence) |
33: Excluding Undesired Elements (xquery-exclude-undesireables) |
34: Alignment of Precedence (xquery-align-precedence) |
35: Escape to ABQL (xquery-escape-to-abql) |
36: CAST and TREAT AS Syntax (xquery-cast-syntax) |
37: XML Schema Datatypes Constructors (xquery-datatype-constructors) |
38: Attribute Constructor Function (xquery-attribute-constructor-function) |
39: Attribute Name, Attribute Content (xquery-attribute-name-content) |
40: Dereference Operator and Links (xquery-dereference-links) |
41: XML Constructor (xquery-literal-xml-constructor) |
42: Eval (xquery-eval) |
43: Inline XML Schema Declarations (xquery-inline-xml-schema) |
44: Encoding (xquery-encoding) |
45: Typing of Filter (xquery-filter-typing) |
46: Typeswitch (xquery-typeswitch) |
47: Mapping Input Context (xquery-mapping-input-context) |
48: Accessing Element Data (xquery-data-function) |
49: Embedding XML in XQuery (xquery-embedding-xml) |
50: Embedding XQuery in XML (xquery-embedding-xquery-in-xml) |
51: Naive Implementation Strategy (xquery-naive-implementation) |
52: XML-based Syntax (xquery-abql) |
53: Mapping CAST AS (xquery-mapping-cast) |
54: Defining Behavior for Well Formed, DTD, and Schema Documents (xquery-define-schema-variants) |
Originator: | XQuery Editors |
Locus: | requirements |
XQuery currently considers book and list(book) to be different types. List is an abbreviation of the facets minOccurs=0, maxOccurs=unbounded. We need to confirm that this accurately reflects the type system of XML Schema.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
XPath defines some operators on lists in ways that differ from most commonly used languages. For example, if $X is a list, $X+1 is defined as the result of adding 1 to the first element in $X (ignoring the other elements.) XQuery takes a more regular approach to operations on lists, as described in the section on functions. For example, if the function increment(integer) is defined to add 1 to its argument, then the function call increment($X) where $X is a list of integers will return a list in which every member has been incremented by 1.
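The contrast described above can be sketched in Python. The sketch is only an analogy with hypothetical function names: xpath1_style_plus_one models the XPath behaviour in which only the first list member takes part in the addition, while apply_to_list models the more regular approach in which a function defined on a single integer is applied to every member of the list.

```python
def xpath1_style_plus_one(xs):
    """XPath style for $X+1: only the first member of the list is used."""
    return xs[0] + 1


def increment(n):
    """The increment(integer) function from the text: add 1 to the argument."""
    return n + 1


def apply_to_list(func, xs):
    """Regular approach: apply the function to every member of the list."""
    return [func(x) for x in xs]


X = [10, 20, 30]
first_only = xpath1_style_plus_one(X)
every_member = apply_to_list(increment, X)
```

The first call yields a single number and silently ignores two of the three members; the second yields a list of the same length as its input.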
This issue needs more study, which should probably occur on the joint task force between XML Query and XSL. Operations on lists may be one area in which work is needed to evolve the XPath specification toward Version 2.0.
It is very desirable that operators such as "+" occur only once in the grammar, not separately in the "XPath" and "non-XPath" parts of the language. This probably means that path expressions should be integrated into the XQuery grammar and not treated as a "terminal symbol." This is the approach that has been taken in the current CUP and JavaCC grammars.
Expressions involving operators must also be carefully defined. We expect this to be handled by the task force on operators and the XPath 2.0 task force.
Originator: | XQuery Editorial Team |
Locus: | xquery-algebra |
A complete list of XQuery operators and core functions is needed, with their signatures (datatypes of operands and results). We have reserved an appendix for this, but the appendix is currently empty.
Sources of information: XPath specification, operator lists presented at recent F2F meetings by Frank Olken and others.
Note: The Query Working Group and the Schema Working Group have agreed to spawn a joint task force to investigate this issue. The XQuery language will integrate the results of that work.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
An extensibility mechanism needs to be defined that permits XQuery to access a library of functions written in some other programming language such as Java.
Some sources of information: the definition of external functions in SQL, the implementation of external functions in Kweelt.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
We need more thought about what constitutes a valid parameter-type for a function. Attribute-types as well as element-types? Type-names vs. element-names? Should all the MSL symbol spaces be represented?
It is probably important to have NODE as a type, to allow functions to take any XML node as a parameter, or to return any XML node as a result.
Using universal names similar to those found in the MSL document, but with a different syntax, would allow us to reference any schema type in XQuery function definitions.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
More detailed rules need to be developed for function resolution. What kinds of function overloading are allowed? A promotion hierarchy of basic types needs to be specified. The issue of polymorphic functions with dynamic dispatch needs to be studied. Can overloaded functions be defined such that the parameter-type of one function is a subtype of the parameter-type of another function? If so, what are the constraints on the return-types of these functions? Is function selection based on the static type of the argument or on the dynamic type of the argument (dynamic dispatch, performed at execution time)? If XQuery supports dynamic dispatch, is it based on all the arguments of a function or on only one distinguished argument?
Observation: This is a very complex area of language design. If it proves too difficult to solve in the available time, it may be wise to take a simple approach such as avoiding dynamic dispatch in Version 1 of XML Query.
Proposal:
The Query Algebra does not support overloading or dynamic dispatch. We will attempt to simplify XML Query Level 1 by omitting these, unless it becomes clear that they are needed. We realize that this might happen.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
Does XQuery need a CAST expression for casting an instance of one type into another type?
Proposal:
A "CAST AS" operator has been added to this current Working Draft. We solicit feedback on this operator.
Originator: | Michael Rys |
Locus: | xquery-algebra |
Does XQuery require a syntax for type guards?
Proposal:
We have added the "TREAT AS" operator to this current Working Draft. This is especially helpful for function arguments, since we do not have function polymorphism in the current specification. We solicit feedback on this operator.
Originator: | Algebra Editors |
Locus: | xquery-algebra |
A FOR clause can stand alone: FOR $var IN expr1 RETURN expr2. A LET clause can stand alone: LET $var := expr1 EVAL expr2. It has been suggested that a WHERE clause should also be able to stand alone: WHERE expr1 RETURN expr2.
Note: in the current BNF, at least one FOR or LET clause must occur, but it is not necessary to have a FOR.
Observation: This functionality is already provided by the conditional expression, IF expr1 THEN expr2 ELSE [ ]. There does not seem to be a compelling reason to provide a second way to write this expression. However, making the ELSE clause optional might be a slight improvement.
Proposal:
The following content model should be adopted for FLWR expressions:
FlwrExpr ::= (ForClause | LetClause)+ [WhereClause] ReturnClause
This allows us also to remove the separate LET/EVAL expression, which can now be expressed using LET/RETURN.
Originator: | Algebra Editors |
Locus: | xquery-algebra |
Should the syntaxes of XQuery and the XML Query Algebra be more closely aligned?
Proposal:
Since the Algebra is not a user language, it does not seem necessary for it to use the same keywords as XQuery, as long as it is straightforward to map to it.
Originator: | Algebra Editors |
Locus: | xquery-algebra |
(From Algebra team): We think it would be helpful to have two syntaxes for construction, a[. . .] as used in the Algebra and <a> . . . </a> as used in XQuery. Not least, the a[. . .] syntax matches the syntax used in the Algebra for types.
Proposal:
We do not believe it is helpful to have two syntaxes for the same thing, and we do not feel that this suggestion would make XQuery easier to read or use. In any case, square brackets are used in XQuery to enclose predicates and are not available for the suggested purpose.
Originator: | Michael Rys |
Locus: | xquery-algebra |
Consider adding a fusion operator to XQuery. Michael has supplied us with a few queries in which fusion would be helpful, and fusion seems promising as a feature, but we have not yet done adequate study of this issue. We would like to explore a wider set of use cases to make sure that we take a general approach to the problems related to fusion.
This requires further study. Use cases will need to be developed to determine precisely what forms of a fusion operator are actually helpful for data integration. An extensive and informative thread on this topic can be found in the W3C archives, starting at lists.w3.org/Archives/Member/w3c-archive/2000Dec/0132.html (W3C members only).
Originator: | Dana |
Locus: | xquery-algebra |
Dana has observed that Filter can be a function rather than an operator, if nodes have identity.
Resolution:
Filter is now a function, where the first parameter is the expression to be filtered, and the second parameter is the filter expression. We find this more elegant.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
We believe the following approach to error handling would be very useful: (1) introduce TRY <expression> CATCH <expression>, similar to try/catch in OO languages; (2) instead of having "throw" to throw objects, use error(<expression>), bind the result of the expression to the variable $err, and allow $err to be used in the CATCH clause.
Proposal:
Dana Florescu has been assigned the task of writing a proposal for this.
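The suggested TRY/CATCH behaviour can be sketched in Python. This is only an illustration of the idea described above, not proposed XQuery semantics: the names QueryError, error and try_catch are hypothetical, and the value passed to the catch handler plays the role of the variable $err.

```python
class QueryError(Exception):
    """Hypothetical carrier for the value passed to error(<expression>)."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def error(value):
    """Stand-in for error(<expression>): abort evaluation, carrying a value."""
    raise QueryError(value)


def try_catch(try_expr, catch_expr):
    """Evaluate try_expr; if it calls error(), evaluate catch_expr instead.

    The argument given to catch_expr corresponds to $err in the CATCH clause.
    """
    try:
        return try_expr()
    except QueryError as exc:
        return catch_expr(exc.value)


ok = try_catch(lambda: 1 + 1, lambda err: "caught: " + err)
failed = try_catch(lambda: error("bad input"), lambda err: "caught: " + err)
```

In this sketch only errors raised through error() are caught; whether other dynamic errors should also be catchable is left open here.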
Originator: | XQuery Editors |
Locus: | xquery-algebra |
We believe that a syntax for update would be extremely useful, allowing inserts, updates, and deletion. This might best be added as a non-normative appendix to the syntax proposal, since the algebra is not designed for defining this portion of the language.
Proposal:
Jonathan is working on a proposal for this.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
The algebra mapping is incomplete and out of date.
Proposal:
Jerome has created a new version of the mapping, with help from Mary, Dana and Mugur.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
XPath can't be treated as a terminal symbol in our grammar. We intend XQuery to be a superset of the abbreviated syntax of XPath. We do not use the grammar of XPath directly because it needs to be integrated into our other productions. For example, operators like the union operator, which occur in path expressions, also occur in other contexts in XQuery, and it makes little sense to define two different operators. This raises issues of coordination with XPath.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
Jerome and Mary have suggested that we abandon the separate abstract syntax for XQuery, in favor of a higher-level BNF.
Proposal:
This is the BNF that now appears in Appendix B.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
Should XQuery support general recursion, or should it be limited in some way? Status quo: XQuery currently supports general recursion.
Reference: XML Query Algebra Issues
[Issue-0008: Fixed point operator or recursive functions]
[Issue-0032: Full regular path expressions]
Originator: | XQuery Editors |
Locus: | xquery-algebra |
Copy and reference semantics must be defined properly for updates to work. This must be coordinated with the algebra team.
Originator: | Mugur |
Locus: | xquery-algebra |
Do we need a way to define views?
Resolution:
While a mechanism for view definition is desirable, we do not currently intend to provide one in Level 1.
Originator: | Algebra Editors |
Locus: | xquery-algebra |
The Algebra has a syntax for declaring types. Up to now, XQuery uses XML Schema for declaring types. Is this sufficient? Some important questions:
Are type names sufficient, or does XQuery really need its own syntax for declaring types?
Would Normalized Universal Names (derived from MSL) be sufficient for type names?
How will type names be bound to definitions?
Originator: | Dana |
Locus: | xquery-algebra |
What is a query?
According to the algebra: any number of type declarations, function definitions, variable definitions and expressions.
According to XQuery: any number of namespace declarations, function definitions and a unique expression.
These definitions should be coordinated.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
XPath has a number of implicit type coercions, and also has implied existential quantification in some places. In XML 1.0, which had a small and loose type system, this was less problematic than it is with XML Schema, which introduces many types and relationships among types. If XQuery is to be compatible with XPath, we need to study these rules carefully, and adapt them to be rational and intuitive when used with the XML Schema type system.
Also, there are interactions between quantification and type coercion in XPath, sometimes causing non-intuitive results.
Originator: | Algebra Editors |
Locus: | xquery |
Does XQuery need features to add support for unordered collections? If so, what features are required? In the current draft, "unordered" is a property of a list. The user can create an ordered list from an unordered list by using SORTBY. The distinct() function not only removes duplicates from a list, it also renders the list unordered.
Do we need a function that merely removes the ordered property of a list?
How does the ordered/unordered property of a list affect the semantics of operators applied to it?
Originator: | Algebra Editors |
Locus: | xquery-algebra |
Do we need an identity-based equality operator? Please justify your answer with sample queries. Note that XPath gets along without it.
Originator: | Jonathan |
Locus: | xquery-algebra |
In XPath, <book><title> Mark Twain </title></book> and <book><author> Mark Twain </author></book> are treated as equal in comparisons. Is this acceptable for us? Do we need another notion of deep equality? If so, what are the compatibility issues with XPath?
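The difference between the two notions of equality can be illustrated with a small Python sketch using the standard xml.etree.ElementTree module. The helper names string_value and deep_equal are hypothetical, and deep_equal is only one possible definition of deep equality (element names, attributes, text and children must all match), chosen for illustration.

```python
import xml.etree.ElementTree as ET


def string_value(elem):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(elem.itertext())


def deep_equal(a, b):
    """One possible (assumed) deep equality: names, attributes, text, children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(deep_equal(x, y) for x, y in zip(a, b))
    )


book_by_title = ET.fromstring("<book><title> Mark Twain </title></book>")
book_by_author = ET.fromstring("<book><author> Mark Twain </author></book>")

# Comparing string-values ignores the element names entirely.
same_string_value = string_value(book_by_title) == string_value(book_by_author)

# A structural comparison notices that title and author differ.
same_structure = deep_equal(book_by_title, book_by_author)
```

Under these assumptions the two books have the same string-value but are not deep-equal, which is the distinction this issue asks about.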
Originator: | Jonathan |
Locus: | xquery |
Queries should be able to return references; for instance, a query that generates a table of contents should also be able to create references to the items in the content itself. How can we extend the syntax to support this?
Originator: | Dana |
Locus: | xquery |
The XQuery editorial team is working through a number of cases to determine the best precedence of operators. Our current precedence rules are shown in Appendix B.
Originator: | XQuery Editors |
Locus: | xquery-algebra |
Is it an error for a query to specify content that may not appear, according to the schema definition? Consider the following query:
invoice//nose
If the schema does not allow a nose to appear on an invoice, is this query an error, or will it simply return an empty list?
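For comparison, a schema-unaware processor can only take the second reading. The following Python sketch uses the standard xml.etree.ElementTree module, which knows nothing about schemas; the invoice content is made up for illustration. Treating the query as a static error would additionally require the schema to be consulted before evaluation.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; no schema is involved in this sketch.
invoice = ET.fromstring("<invoice><item><price>10</price></item></invoice>")

# Descendant search for an element that never occurs: the result is empty,
# and no error is raised.
noses = invoice.findall(".//nose")

# The same kind of search for an element that does occur.
prices = invoice.findall(".//price")
```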
Originator: | XQuery Editors |
Locus: | xquery |
XQuery needs a mechanism to allow function definitions to be shared by multiple queries. The XQuery grammar allows function definitions to occur without a query expression.
We must provide a way for queries to access functions in libraries. For instance, we might add an IMPORT statement to XQuery, with the URI of the functions to be imported. It must be possible to specify either that (1) local definitions replace the imported definitions, or (2) imported definitions replace the local ones.
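The two precedence options can be sketched in Python as a merge of two name-to-definition tables. This is only a model of the name resolution question; resolve_functions, the local_wins flag and the sample function names are hypothetical and say nothing about how an IMPORT statement would actually be written.

```python
def resolve_functions(local, imported, local_wins):
    """Merge two function tables; on a name clash the preferred side wins."""
    if local_wins:
        # Option (1): local definitions replace the imported definitions.
        return {**imported, **local}
    # Option (2): imported definitions replace the local ones.
    return {**local, **imported}


local_defs = {"price": "local price()", "tax": "local tax()"}
library_defs = {"price": "library price()", "discount": "library discount()"}

option_1 = resolve_functions(local_defs, library_defs, local_wins=True)
option_2 = resolve_functions(local_defs, library_defs, local_wins=False)
```

In both cases functions defined on only one side remain visible; only the clashing name price resolves differently.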
Originator: | Jerome Simeon |
Locus: | xquery-algebra |
Section 2.9, on functions, portrays XQuery as a statically typed language, but the mechanisms by which static typing is established are still being developed by the XML Query Algebra editorial team. A complete accounting for type requires that the XML Query Algebra conform completely to the XML Schema type system, and that many open issues be resolved.
The semantics of XQuery are defined in terms of the operators of the XML Query Algebra (see Appendix E). The mapping of XQuery operators into Algebra operators is still being designed, and may result in some changes to XQuery and/or the Algebra. The type system of XQuery is the type system of XML Schema. Work is in progress to ensure that the type systems of XQuery, the XML Query Algebra, and XML Schema are completely aligned. The details of the operators supported on simple XML Schema datatypes will be defined by a joint XSLT/Schema/Query task force.
Originator: | Don Chamberlin |
Locus: | xquery |
How do we exclude undesired elements from the results of joins?
This need came out of a thread exploring data integration scenarios, starting with http://lists.w3.org/Archives/Member/w3c-archive/2000Dec/0132.html.
Originator: | Jerome Simeon |
Locus: | xquery-algebra |
The precedence rules of XQuery and the algebra are not completely aligned. This needs to be fixed by the time both specifications are finished.
Originator: | Jerome Simeon |
Locus: | xquery |
Is there a need to be able to escape to ABQL?
Originator: | Jonathan Robie |
Locus: | xquery |
Various approaches to the syntax of CAST AS and TREAT AS have been proposed. Java and SQL have both been sources used to discuss these approaches. We are still exploring other syntactic approaches.
Originator: | Paul Cotton |
Locus: | xquery |
The set of constructor functions for XML Schema simple datatypes has not yet been established. A joint task force with XML Schema has been chartered to determine the operators on datatypes, and we expect them to also determine the set of constructors for datatypes.
Originator: | Don Chamberlin |
Locus: | xquery |
We need a function for constructing attributes, e.g.:
<foo> IF $f/temp > 200 THEN attribute("warning", "about to explode!!") ELSE [] </foo>
In the above example, the first parameter of attribute is the name of the attribute, and the second is the content of the attribute. Attributes created within an element constructor are placed in the attribute list of the element that is constructed.
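The intended effect of the example above can be approximated in Python with the standard xml.etree.ElementTree module. The function build_foo and its temp parameter are hypothetical stand-ins for the element constructor and for $f/temp; the sketch shows only the outcome (an attribute that is present or absent), not how an attribute() function would be defined.

```python
import xml.etree.ElementTree as ET


def build_foo(temp):
    """Construct <foo>, adding a warning attribute only when temp exceeds 200."""
    foo = ET.Element("foo")
    if temp > 200:
        # Corresponds to attribute("warning", "about to explode!!"):
        # the first argument is the attribute name, the second its content.
        foo.set("warning", "about to explode!!")
    # The ELSE [] branch contributes nothing to the element.
    return foo


hot = build_foo(250)
cool = build_foo(100)
```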
Originator: | Don Chamberlin |
Locus: | xquery |
We need functions to return the name of an attribute and the content of an attribute.
Originator: | Jonathan Robie |
Locus: | xquery |
Does the dereference operator work on links, such as XLink or HTML href?
Originator: | Jonathan Robie |
Locus: | xquery-algebra |
Is there a need for a constructor that creates an instance of the XML Query Data Model from a string that contains XML text?
Originator: | Jonathan Robie |
Locus: | xquery-algebra |
Is there a need to be able to execute a string that contains the text of an XQuery query? (This is similar to the eval function in Lisp.)
Originator: | Don Chamberlin |
Locus: | xquery |
Do we need to allow inline XML schema declarations in the prolog of a query? The following example shows one potential syntax for this. It extends the namespace declaration to allow literal XML Schema text to occur instead of a URI in a namespace declaration. The implementation would then assign an internal URI to the namespace.
NAMESPACE fid = "http://www.example.com/fiddlefaddle.xsd"
NAMESPACE loc = [[
  <xsd:schema xmlns:xsd = "http://www.w3.org/2000/10/XMLSchema">
    <xsd:simpleType name="myInteger">
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="10000"/>
        <xsd:maxInclusive value="99999"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:schema>
]]

FUNCTION string-to-myInteger ($s STRING) RETURNS loc:myInteger
{
  -- If the facets of loc:myInteger are not satisfied,
  -- this function raises an error.
  LET $t := round(number($s))
  RETURN TREAT AS loc:myInteger($t)
}

string-to-myInteger("1023")
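The facet check that this example relies on can be sketched in Python. The sketch assumes that a value outside the minInclusive/maxInclusive range raises an error, as the comment in the function body states; the names string_to_my_integer, MIN_INCLUSIVE and MAX_INCLUSIVE are hypothetical, and Python's round() is only an approximation of the round() function used in the query (the two may differ on exact halves).

```python
# Facets of the hypothetical loc:myInteger type from the example above.
MIN_INCLUSIVE = 10000
MAX_INCLUSIVE = 99999


def string_to_my_integer(s):
    """Convert a string to a number, round it, and enforce the range facets."""
    t = round(float(s))  # approximates round(number($s))
    if not (MIN_INCLUSIVE <= t <= MAX_INCLUSIVE):
        # Corresponds to the error raised when the facets are not satisfied.
        raise ValueError(f"{t} does not satisfy the facets of myInteger")
    return t


in_range = string_to_my_integer("12345")

try:
    string_to_my_integer("1023")  # the call shown in the example above
    call_raised = False
except ValueError:
    call_raised = True
```

Under these assumptions the call string-to-myInteger("1023") shown in the example would raise an error, since 1023 is below the minInclusive facet.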
Originator: | Jonathan Robie |
Locus: | xquery |
Does XQuery need a way to specify the encoding of a query? For instance, should the prolog allow statements like the following?
ENCODING utf-16
Originator: | Jerome Simeon |
Locus: | xquery-algebra |
The current mapping of filter to the algebra does not preserve much useful type information. Can the mapping be improved? Is there another approach to filter that would yield better type information?
Originator: | Mary Fernandez |
Locus: | xquery |
Do we need an expression similar to the MATCH expression in the algebra? One proposed syntax looks like this:
TYPESWITCH expr
  CASE typename1: expr1
  CASE typename2: expr2
Since this syntax corresponds more closely to the syntax of the algebra, it can be mapped more easily. Alternatively, the match expression in the algebra could be made to resemble the INSTANCEOF expression of XQuery.
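The dispatch behaviour of such an expression can be sketched in Python. This is a rough model only: the helper typeswitch is hypothetical, Python classes stand in for type names, cases are tried in order, and raising an error when no case matches is an assumption, since the proposed syntax does not say what happens in that situation.

```python
def typeswitch(value, cases):
    """Evaluate the branch of the first case whose type matches the value."""
    for type_, branch in cases:
        if isinstance(value, type_):
            return branch(value)
    # Assumption: no matching CASE is treated as an error.
    raise TypeError(f"no CASE matches value of type {type(value).__name__}")


def describe(value):
    """TYPESWITCH value CASE int: ... CASE str: ... (illustrative types)."""
    return typeswitch(value, [
        (int, lambda v: f"integer {v}"),
        (str, lambda v: f"string {v!r}"),
    ])


as_integer = describe(7)
as_string = describe("seven")
```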
Originator: | Jerome Simeon |
Locus: | xquery-algebra |
Do we need a way to specify the nodes in the input context? Many queries do not specifically state the input to which they will be applied. This allows the same query, for instance, to be applied to a number of databases. It may be helpful for the mapping to introduce a global variable, e.g. $input, to represent the input nodes for a query. This variable might even be useful in the syntax of XQuery itself.
Originator: | Mary Fernandez |
Locus: | xquery |
There is no operator to access the typed constant content of an element. In the Algebra, the data() operator does this. Should XQuery do the same?
Originator: | XQuery Editors |
Locus: | xquery |
Do we need a way to embed literal XML content in a query? For instance, should there be a way to construct an XML element using the native syntax of XML?
Originator: | Steve Tolkin |
Locus: | xquery-algebra |
Do we need a way to escape from XML to XQuery syntax? This could be used to provide functionality similar to that of ASP or JSP.
Originator: | Marton Nagy |
Locus: | xquery |
Marton Nagy has suggested that it would be helpful to describe a naive implementation strategy for XQuery.
A naive XQuery implementation might parse the query, map it to Algebra syntax, and pass it to an Algebra implementation to request type checking, returning an error if there are static type errors. It might then request query execution from the Algebra implementation, collect the results, and return them to the user.
Alternatively, the implementation might have its own algebra for execution, or it might generate statements in a specific implementation language such as XPath or SQL. We expect a wide variety of implementation approaches to be used in practice.
Originator: | XML Query WG |
Locus: | xquery |
XQuery needs an XML representation that reflects the structure of an XQuery query. Drafts of such a representation have been prepared, but it is not yet ready for publication.
Originator: | Jerome Simeon |
Locus: | xquery-algebra |
The algebra does not have coercion functions between values. The mapping requires this for CAST AS.
Originator: | Don Chamberlin |
Locus: | xquery-algebra |
We should specify the behavior of XQuery for well formed XML, XML validated by a schema, and XML validated by a DTD.