Copyright © 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This document defines basic operators and functions on the datatypes defined in [XML Schema Part 2: Datatypes] for use in XQuery, XPath, and other related XML standards. It also discusses operators and functions on nodes and node sequences as defined in the [XQuery 1.0 and XPath 2.0 Data Model] for use in XQuery, XPath, and other related XML standards.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This is the first Public Working Draft of this document for review by W3C Members and other interested parties. It is a draft document and may be updated, replaced or made obsolete by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This is work in progress and does not imply endorsement by the W3C membership.
This document describes constructors, operators, and functions that are used in XQuery 1.0 and XPath 2.0. The document is generally unconcerned with the specific syntax with which these constructors, operators, and functions will be used, and focuses instead on defining their semantics as precisely as feasible.
This document has been produced as part of the [XML Activity], following the procedures set out for the W3C Process. This document was produced through the efforts of a joint task force of the W3C XML Query Working Group and the W3C XML Schema Working Group and a second joint task force of the W3C XML Query Working Group and the W3C XSL Working Group. It is designed to be read in conjunction with the following documents: [XQuery 1.0 and XPath 2.0 Data Model], [XQuery 1.0: An XML Query Language] and [XQuery 1.0 Formal Semantics]. A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
Comments on this document should be sent to the W3C mailing list www-xml-query-comments@w3.org (archived at http://lists.w3.org/Archives/Public/www-xml-query-comments/).
1 Introduction
1.1 Syntax
1.2 Notations
1.3 Namespace Prefix
2 Constructors, Functions, and Operators on Numbers
2.1 Numeric Types
2.2 Numeric Literals
2.3 Numeric Constructors
2.3.1 xf:decimal
2.3.2 xf:integer
2.3.3 xf:long
2.3.4 xf:int
2.3.5 xf:short
2.3.6 xf:byte
2.3.7 xf:float
2.3.8 xf:double
2.4 Operators on Numeric Values
2.5 Comparisons of Numeric Values
2.6 Functions on Numeric Values
2.6.1 xf:floor
2.6.2 xf:ceiling
2.6.3 xf:round
3 Constructors, Functions, and Operators on Strings
3.1 String Types
3.2 String Literals
3.3 String Constructors
3.3.1 xf:string
3.3.2 xf:normalizedString
3.3.3 xf:token
3.3.4 xf:language
3.3.5 xf:Name
3.3.6 xf:NMTOKEN
3.3.7 xf:NCName
3.3.8 xf:ID
3.3.9 xf:IDREF
3.3.10 xf:ENTITY
3.4 Equality and Comparison of Strings
3.4.1 xf:codepoint-compare
3.4.2 xf:compare
3.5 Functions on String Values
3.5.1 Usage Notes
3.5.2 xf:concat
3.5.3 xf:starts-with
3.5.4 xf:ends-with
3.5.5 xf:codepoint-contains
3.5.6 xf:contains
3.5.7 xf:substring
3.5.8 xf:string-length
3.5.9 xf:codepoint-substring-before
3.5.10 xf:substring-before
3.5.11 xf:codepoint-substring-after
3.5.12 xf:substring-after
3.5.13 xf:normalize-space
3.5.14 xf:normalize-unicode
3.5.15 xf:upper-case
3.5.16 xf:lower-case
3.5.17 xf:translate
3.5.18 xf:string-pad-beginning
3.5.19 xf:string-pad-end
3.5.20 xf:match
3.5.21 xf:replace
4 Constructors, Functions, and Operators on Booleans
4.1 Boolean Constructors
4.1.1 xf:true
4.1.2 xf:false
4.1.3 xf:boolean-from-string
4.2 Operators on Boolean Values
4.2.1 Semantics
4.3 Functions on Boolean Values
4.3.1 xf:not
5 Constructors, Functions, and Operators on Dates and Times
5.1 Duration and Datetime Types
5.2 Duration and Datetime Constructors
5.2.1 xf:duration
5.2.2 xf:dateTime
5.2.3 xf:date
5.2.4 xf:time
5.2.5 xf:gYearMonth
5.2.6 xf:gYear
5.2.7 xf:gMonthDay
5.2.8 xf:gMonth
5.2.9 xf:gDay
5.2.10 xf:currentDateTime
5.3 Comparisons of Duration and Datetime Values
5.4 Component Extraction Functions on Datetime Values
5.4.1 xf:get-Century-from-dateTime
5.4.2 xf:get-Century-from-date
5.4.3 xf:get-Century-from-gYear
5.4.4 xf:get-Century-from-gYearMonth
5.4.5 xf:get-gYear-from-dateTime
5.4.6 xf:get-gYear-from-date
5.4.7 xf:get-gYear-from-gYearMonth
5.4.8 xf:get-gMonth-from-dateTime
5.4.9 xf:get-gMonth-from-date
5.4.10 xf:get-gMonth-from-gYearMonth
5.4.11 xf:get-gMonth-from-gMonthDay
5.4.12 xf:get-gDay-from-dateTime
5.4.13 xf:get-gDay-from-date
5.4.14 xf:get-gDay-from-gMonthDay
5.4.15 xf:get-hour-from-dateTime
5.4.16 xf:get-hour-from-time
5.4.17 xf:get-minutes-from-dateTime
5.4.18 xf:get-minutes-from-time
5.4.19 xf:get-seconds-from-dateTime
5.4.20 xf:get-seconds-from-time
5.4.21 xf:get-timezone-from-dateTime
5.4.22 xf:get-timezone-from-date
5.4.23 xf:get-timezone-from-time
5.4.24 xf:get-timezone-from-gYear
5.4.25 xf:get-timezone-from-gYearMonth
5.4.26 xf:get-timezone-from-gMonth
5.4.27 xf:get-timezone-from-gMonthDay
5.4.28 xf:get-timezone-from-gDay
5.5 Component Extraction Functions on Duration Values
5.5.1 xf:get-years
5.5.2 xf:get-months
5.5.3 xf:get-days
5.5.4 xf:get-hours
5.5.5 xf:get-minutes
5.5.6 xf:get-seconds
5.6 Arithmetic Functions on Dates
5.6.1 xf:add-days
5.6.2 xf:add-months
5.6.3 xf:add-years
5.6.4 xf:add-gMonth
5.6.5 xf:add-gYear
5.7 Functions on TimePeriod Values
5.7.1 xf:get-duration
5.7.2 xf:get-end
5.7.3 xf:get-start
5.7.4 xf:temporal-dateTimes-contains
5.7.5 xf:temporal-dateTimeDuration-contains
5.7.6 xf:temporal-durationDateTime-contains
6 Constructors, Functions, and Operators on QNames
6.1 Constructors for QNames
6.1.1 xf:QName-from-uri
6.1.2 xf:QName-from-prefix
6.1.3 xf:QName
6.2 Functions on QNames
6.2.1 xf:get-local-name
6.2.2 xf:get-namespace-uri
7 Constructors, Functions, and Operators for anyURI
7.1 Constructor for anyURI
7.1.1 xf:anyURI
8 Functions and Operators on base64Binary and hexBinary
8.1 Comparisons of base64Binary and hexBinary Values
9 Constructors, Functions, and Operators on NOTATION
9.1 NOTATION Constructor
9.1.1 xf:NOTATION
10 Functions and Operators on Nodes
10.1 Operators on Nodes
10.2 Functions on Nodes
10.2.1 xf:local-name
10.2.2 xf:namespace-uri
10.2.3 xf:number
10.2.4 xf:node-equal
10.2.5 xf:value-equal
10.2.6 xf:node-before
10.2.7 xf:node-after
10.2.8 xf:copy
10.2.9 xf:shallow
10.2.10 xf:boolean
11 Constructors, Functions, and Operators on Sequences
11.1 Sequences
11.2 Constructors on Sequences
11.2.1 TO
11.3 Operators on Sequences
11.4 Functions on Sequences
11.4.1 xf:position
11.4.2 xf:last
11.4.3 xf:item-at
11.4.4 xf:index-of
11.4.5 xf:empty
11.4.6 xf:exists
11.4.7 xf:identity-distinct
11.4.8 xf:value-distinct
11.4.9 xf:sort
11.4.10 xf:reverse-sort
11.4.11 xf:insert
11.4.12 xf:sublist-before
11.4.13 xf:sublist-after
11.4.14 xf:sublist
11.4.15 xf:sequence-pad-beginning
11.4.16 xf:sequence-pad-end
11.4.17 xf:truncate-beginning
11.4.18 xf:truncate-end
11.4.19 xf:resize-beginning
11.4.20 xf:resize-end
11.4.21 xf:unordered
11.5 Equals, Union, Intersection and Except
11.5.1 xf:sequence-value-equal
11.5.2 xf:sequence-node-equal
11.5.3 xf:union
11.5.4 xf:union-all
11.5.5 xf:intersect
11.5.6 xf:intersect-all
11.5.7 xf:except
11.5.8 xf:except-all
11.6 Aggregate Functions
11.6.1 xf:count
11.6.2 xf:avg
11.6.3 xf:max
11.6.4 xf:min
11.6.5 xf:sum
11.7 Functions that Generate Sequences
11.7.1 xf:id
11.7.2 xf:idref
11.7.3 xf:filter
11.7.4 xf:document
12 Casting Functions
12.1 Casting to string and its derived types
12.2 Casting to numeric types
12.3 Casting to datetime and duration types
12.4 Casting to all other simple types
12.5 Miscellaneous casting functions
12.5.1 xf:boolean
12.5.2 xf:string
A References
A.1 Normative
B Functions and Operators Issues List (Non-Normative)
[XML Schema Part 2: Datatypes] defines a number of primitive and derived datatypes, collectively known as built-in datatypes. This document defines operations on those datatypes for use in XQuery, XPath and related XML standards. This document also discusses operators and functions on nodes and node sequences as defined in the [XQuery 1.0 and XPath 2.0 Data Model] for use in XQuery, XPath and other related XML standards.
The [XQuery 1.0 and XPath 2.0 Data Model] also defines several accessors on nodes such as name(), string-value(), typed-value(), parent(), children(), and node-kind(). [XQuery 1.0: An XML Query Language] defines kind tests such as text() and node(). These functions are not included in this document.
[Issue 86: In which document do the node accessors functions and the kind tests go?]
The diagram below shows the built-in [XML Schema Part 2: Datatypes]. Solid lines connect a base datatype above to a derived datatype below. Dashed lines connect a datatype created as a list of an item type above.
Diagram courtesy Asir Vedamuthu, webMethods
[Issue 24: What effect do facets have?]
This document includes examples of the use of a number of functions. Examples for the remaining functions are tentatively planned, but feedback is solicited on the utility of the examples for those functions for which they are supplied, as well as for those for which they are not yet supplied.
The purpose of this document is to catalog the functions and operators required for XML Query and XPath 2.0. The exact syntax used to invoke these functions and operators is specified in [XQuery 1.0: An XML Query Language].
In general, [XQuery 1.0: An XML Query Language] does not support function overloading. Consequently, there are no overloaded functions in this document except for legacy [XPath 1.0] functions such as string() which takes a single argument of a variety of types and concat() which takes a variable number of string arguments. This does not apply to operators such as "+" which may be overloaded. Functions with optional arguments are allowed. If optional arguments are omitted, omissions are assumed to begin from the right.
This document defines, among other things, a number of constructors and other functions that apply to one or more data types. Each constructor and function is defined by specifying its signature, a description of each of its arguments, and its semantics; in addition, examples are given of many constructors and functions to illustrate their use.
Each function's signature is presented in a form like this:
function-name(parameter-type parameter-name ... )
=> return-type
In that notation, function-name is the name of the function whose signature is being specified. If the function takes no parameters, then the name is followed by an empty set of parentheses: (); otherwise, the name is followed by a parenthesized list of parameter declarations, each declaration specifying the static type of the parameter and a non-normative name used to reference the parameter when the function's semantics are specified. If there are two or more parameter declarations, they are separated by commas. The return-type specifies the static type of the value returned by the function.
The names of constructor functions have been chosen so that their local names are "spelled the same" as the local names of the types for which they are constructors. For example, the name of the constructor function that constructs values whose type is xsd:decimal is xf:decimal. Throughout this document, we typically omit the prefix xsd: in the names of XML Schema types.
[Issue 101: Scalar functions should accept the empty sequence.]
The functions and operators discussed in this document are contained within a namespace and referenced using a qualified name. The namespace prefix used in this document, merely for illustrative purposes, is xf. The namespace prefix for these functions can vary, as long as the prefix is bound to the correct URI.
The actual namespace (that is, the URI of the namespace) is http://www.w3.org/2001/08/xquery-operators.
[Issue 13: What is the appropriate namespace to be used?]
[Issue 33: Issues should reference the source of the issues]
This section discusses arithmetic operators on the numeric datatypes defined in [XML Schema Part 2: Datatypes]. It uses an approach that permits lightweight operations whenever possible.
The operators described in this section are defined on the following numeric types.
decimal
integer
int
long
short
byte
float
double
They also apply to user-defined types derived by restriction from these types.
[Issue 39: Should all XML Schema numeric types be supported?]
The following literals are defined on the numeric types.
Literal | Returns |
n, +n, or -n [1] | integer |
nL, +nL, or -nL [2] | long |
n., +n., or -n., .n, +.n, or -.n, n.n, +n.n, or -n.n (with a decimal point) | decimal |
n.nEn, +n.nEn, -n.nEn, n.nE+n, +n.nE+n, -n.nE+n, n.nE-n, +n.nE-n, or -n.nE-n [3] | float |
[Issue 49: There are several syntax problems with numeric literals]
The following constructors are defined on the above numeric types.
Constructor | Meaning |
xf:decimal | Produces a decimal value by parsing and interpreting a string. |
xf:integer | Produces an integer value by parsing and interpreting a string. |
xf:long | Produces a long value by parsing and interpreting a string. |
xf:int | Produces an int value by parsing and interpreting a string. |
xf:short | Produces a short value by parsing and interpreting a string. |
xf:byte | Produces a byte value by parsing and interpreting a string. |
xf:float | Produces a float value by parsing and interpreting a string. |
xf:double | Produces a double value by parsing and interpreting a string. |
For float and double, the string argument can indicate the special values: NaN, INF, -INF, +0, and -0.
Whitespace is not allowed as part of the string argument.
If the argument string passed to a constructor results in an error (for example, if it contains a letter other than "E" or "e" or a string other than the special values named above), the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 40: Some numeric constructors have unexpected return types]
Returns the decimal value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the decimal type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the number of characters contained in the value of $srcval that are digits is greater than the maximum number of decimal digits supported by the implementation, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
xf:decimal('123.5') returns the decimal value corresponding to one hundred twenty three and one-half.
xf:decimal('12.5E2') returns the error value, since the use of the letter "E" is prohibited in the constructor for the decimal type.
xf:decimal('12.5 ') returns the error value, since whitespace is not allowed in the argument string.
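The checks described for the decimal constructor can be sketched in Python. This is an illustration only and is not part of the definitions in this document: the names xf_decimal and ConstructorError, the digit limit of 28, the lexical pattern, and the use of an exception in place of the error value are all assumptions made for the example.

```python
import re
from decimal import Decimal

# Lexical form assumed for decimal: optional sign, then digits with an
# optional decimal point. No exponent letter and no whitespace.
_DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


class ConstructorError(ValueError):
    """Hypothetical stand-in for the error value of the data model."""


def xf_decimal(srcval: str, max_digits: int = 28) -> Decimal:
    """Sketch of xf:decimal; max_digits is an assumed implementation limit."""
    if not _DECIMAL_LEXICAL.fullmatch(srcval):
        raise ConstructorError(f"not a valid decimal: {srcval!r}")
    # Count only the digit characters, as the text above describes.
    digit_count = sum(1 for ch in srcval if ch in "0123456789")
    if digit_count > max_digits:
        raise ConstructorError(f"too many digits: {digit_count}")
    return Decimal(srcval)
```

Under these assumptions, xf_decimal('123.5') yields a decimal value, while '12.5E2' and '12.5 ' both raise ConstructorError.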
Returns the integer value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the integer type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the number of characters contained in the value of $srcval that are digits is greater than the maximum number of digits supported by the implementation, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
Returns the long value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the long type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is greater than 9,223,372,036,854,775,807 or less than -9,223,372,036,854,775,808, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
Returns the int value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the int type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is greater than 2,147,483,647 or less than -2,147,483,648, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
xf:int('1235') returns the int value corresponding to one thousand two hundred thirty five.
xf:int('2147483648') returns an error value, since the value two billion, one hundred forty seven million, four hundred eighty three thousand, six hundred forty eight is not a valid value for the int type.
Returns the short value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the short type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is greater than 32,767 or less than -32,768, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
Returns the byte value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is not a valid lexical representation for the byte type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is greater than 127 or less than -128, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
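The long, int, short, and byte constructors differ only in the range of values they accept. A minimal Python sketch of that shared pattern follows; the helper name construct_bounded, the RANGES table, and the use of ValueError in place of the error value are assumptions made for illustration, while the bounds themselves are the ones stated above.

```python
import re

# Value ranges stated above for the bounded integer types.
RANGES = {
    "long": (-9223372036854775808, 9223372036854775807),
    "int": (-2147483648, 2147483647),
    "short": (-32768, 32767),
    "byte": (-128, 127),
}

# Assumed lexical form: optional sign followed by one or more digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def construct_bounded(type_name: str, srcval: str) -> int:
    """Hypothetical helper: parse srcval as the named bounded integer type."""
    if not _INTEGER_LEXICAL.fullmatch(srcval):
        raise ValueError(f"not a valid {type_name}: {srcval!r}")
    value = int(srcval)
    low, high = RANGES[type_name]
    if not low <= value <= high:
        raise ValueError(f"{value} is out of range for {type_name}")
    return value
```

With this helper, construct_bounded("int", "2147483648") is rejected, while the same string is accepted for "long".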
Returns the float value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is "NaN", then the constructor returns the "not-a-number" value.
If the value of $srcval is "INF" or "+INF", then the constructor returns the "positive infinity" value. If the value of $srcval is "-INF", then the constructor returns the "negative infinity" value.
If the value of $srcval is "0" or "+0", then the constructor returns the value positive zero. If the value of $srcval is "-0", then the constructor returns the value negative zero.
If the value of $srcval is not a valid lexical representation for the float type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is not a valid value for the float type, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 50: Should double("ZZZ") or cast as float("ZZZ") return an error or NaN? ]
xf:float('510E2') returns the float value corresponding to fifty one thousand.
xf:float('15.25') returns the float value corresponding to fifteen and a quarter.
xf:float('51D1') returns an error value, since the use of the letter "D" is prohibited in the constructor for the float type.
Returns the double value that is represented by the characters contained in the value of $srcval.
If the value of $srcval is "NaN", then the constructor returns the "not-a-number" value.
If the value of $srcval is "INF" or "+INF", then the constructor returns the "positive infinity" value. If the value of $srcval is "-INF", then the constructor returns the "negative infinity" value.
NOTE: In XPath 1.0, double("INF") returned the "not-a-number" value.
If the value of $srcval is "0" or "+0", then the constructor returns the value positive zero. If the value of $srcval is "-0", then the constructor returns the value negative zero.
If the value of $srcval is not a valid lexical representation for the double type as specified in [XML Schema Part 2: Datatypes], then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of the number corresponding to the characters contained in the value of $srcval is not a valid value for the double type, then an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 50: Should double("ZZZ") or cast as float("ZZZ") return an error or NaN? ]
xf:double('510E2') returns the double value corresponding to fifty one thousand.
xf:double('15.25') returns the double value corresponding to fifteen and a quarter.
xf:double('51D1') returns an error value, since the use of the letter "D" is prohibited in the constructor for the double type.
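The handling of the special values by the double constructor can be sketched in Python, whose built-in float is a double-precision value. The sketch is illustrative and makes several assumptions: the function name xf_double, the lexical pattern, the treatment of overflow as an out-of-range error, and the use of ValueError in place of the error value. A pattern of our own is used because Python's float() also accepts forms such as 'inf' and surrounding whitespace, which the text above does not allow.

```python
import math
import re

# Special values named above and the results they produce.
_SPECIAL = {
    "NaN": math.nan,
    "INF": math.inf,
    "+INF": math.inf,
    "-INF": -math.inf,
}

# Assumed lexical form: decimal mantissa with an optional E or e exponent.
_NUMERIC = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([Ee][+-]?[0-9]+)?")


def xf_double(srcval: str) -> float:
    """Sketch of xf:double; raises ValueError where an error value is due."""
    if srcval in _SPECIAL:
        return _SPECIAL[srcval]
    if not _NUMERIC.fullmatch(srcval):
        raise ValueError(f"not a valid double: {srcval!r}")
    value = float(srcval)
    if math.isinf(value):
        # The digits denote a number too large to be a finite double.
        raise ValueError(f"out of range for double: {srcval!r}")
    return value
```

Signed zeros come out of float() directly: '-0' produces negative zero, which math.copysign can distinguish from positive zero.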
The following operators are defined on these numeric types:
Operator | Meaning | Source |
+ | Addition | XPath 1.0 |
- | Subtraction | XPath 1.0 |
* | Multiplication | XPath 1.0 |
div | Division | XPath 1.0 |
mod | Division modulus | XPath 1.0 |
+ | Unary plus | XPath 2.0 Req 1.7 Shd. |
- | Unary minus (negation) | XPath 1.0 |
The arguments and return types of all arithmetic operators are primitive numeric types: decimal, float, or double. Although arithmetic operations are defined on only a few types, a simple type promotion scheme allows these operations to be performed on all arithmetic types, including derived types.
Integers, which are primitive in most language environments, are not primitive types in XML Schema. To be efficient, most implementations will probably choose to treat integer as primitive, and define promotions from integer to decimal.
The type promotion scheme includes only two rules:
Any type may be promoted to the type of its primitive ancestor (or integer if appropriate).
integer may be promoted to decimal, decimal may be promoted to float, and float may be promoted to double.
For simplicity, each operator is defined to operate on operands of the same datatype and to return the same datatype. If the two operands are not of the same datatype, one operand is promoted to be the type of the other operand.
The result type of operations (using ¤ to represent a binary operator such as +, -, *, div, and mod, as well as unary operators such as + and -), along with their argument datatypes, is defined in the table below:
Operator | Returns |
integer ¤ integer | integer |
decimal ¤ decimal | decimal |
float ¤ float | float |
double ¤ double | double |
¤ integer | integer |
¤ decimal | decimal |
¤ float | float |
¤ double | double |
These rules define any operation on any pair of arithmetic types. Consider the following example:
int*double => double*double
For this operation, int must be converted to double. This can be done, since int can be promoted to decimal (its most primitive ancestor), decimal can be promoted to float, and float can be promoted to double.
[Issue 78: Type promotion rules appear to be inconsistent]
Consider another example:
byte*short => decimal*decimal
This is true because the * operator is not defined on derived types. Both byte and short must be promoted to decimal so that the operation can be performed.
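The two promotion rules and the two examples above can be modeled with a short Python sketch. It is one possible reading only, and the names BASE_TYPE, PROMOTION_ORDER, operand_type, and result_type are hypothetical. In particular, the sketch assumes that integer is kept as is (because the result table lists integer operands) while every other derived type is promoted to decimal, as in the byte*short example.

```python
# Derivation of the built-in integer types, each mapped to its base type.
BASE_TYPE = {
    "byte": "short",
    "short": "int",
    "int": "long",
    "long": "integer",
    "integer": "decimal",
}

# Promotion order from the second rule: integer, decimal, float, double.
PROMOTION_ORDER = ["integer", "decimal", "float", "double"]


def operand_type(type_name: str) -> str:
    """Apply the first rule: move a derived type to its primitive ancestor.

    Assumption: integer itself is kept as is; every other derived type
    is promoted all the way to decimal.
    """
    if type_name in PROMOTION_ORDER:
        return type_name
    while type_name in BASE_TYPE:
        type_name = BASE_TYPE[type_name]
    return type_name


def result_type(left: str, right: str) -> str:
    """Apply the second rule: promote the lower operand to the higher one."""
    a, b = operand_type(left), operand_type(right)
    return max(a, b, key=PROMOTION_ORDER.index)
```

Under this reading, result_type("int", "double") gives "double" and result_type("byte", "short") gives "decimal", matching the two examples.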
We define the following comparison operators on numeric values. Comparisons take two arguments of the same type. If the arguments are of different types, one argument is promoted to the type of the other. Each comparison operator returns a boolean value, except in the case when one or both operands are the empty sequence. In that case, the result is the empty sequence.
Operator | Meaning | Source |
= | Equality comparison | XPath 1.0 |
< | Less-than comparison | XPath 1.0 |
> | Greater-than comparison | XPath 1.0 |
<= | Less-than-or-equal-to comparison | XPath 1.0 |
>= | Greater-than-or-equal-to comparison | XPath 1.0 |
!= | Inequality comparison | XPath 1.0 |
[Issue 8: Relationships Between Some Numeric Types Should Be Reconsidered]
[Issue 113: Need more complete numeric comparison semantics]
The following functions are defined on these numeric types:
Function | Meaning | Source |
xf:floor | Returns the largest integer less than or equal to the argument | XPath 1.0 |
xf:ceiling | Returns the smallest integer greater than or equal to the argument | XPath 1.0 |
xf:round | Rounds to the nearest integer | XPath 1.0 |
[Issue 53: Certain XPath 1.0 functions and other needed functions are not included]
[Issue 79: How many digits of precision (etc.) are returned from certain functions?]
[Issue 92: The abs() function is required.]
Returns the number that is closest to the argument and that is an integer. More formally, round(x) produces the same result as floor(x+0.5). These semantics are consistent with Java's semantics. If there are two such numbers, then the one that is closest to positive infinity is returned. If the argument is NaN, then NaN is returned. If the argument is positive infinity, then positive infinity is returned. If the argument is negative infinity, then negative infinity is returned. If the argument is positive zero, then positive zero is returned. If the argument is negative zero, then negative zero is returned. If the argument is less than zero, but greater than or equal to -0.5, then negative zero is returned.
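The special cases listed for rounding can be made concrete with a small Python sketch over double values. The function name xf_round is an assumption for illustration, and the sketch applies the floor(x+0.5) formula literally, so it inherits any floating-point rounding that occurs when 0.5 is added.

```python
import math


def xf_round(x: float) -> float:
    """Sketch of the rounding rule described above, for double values."""
    # NaN, the infinities, and both signed zeros are returned unchanged.
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x
    # Arguments in [-0.5, 0) round to negative zero rather than positive zero.
    if -0.5 <= x < 0.0:
        return -0.0
    # General case: ties go toward positive infinity.
    return float(math.floor(x + 0.5))
```

For example, 2.5 rounds to 3.0 and -2.5 rounds to -2.0, since in both cases the candidate closest to positive infinity is chosen.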
This section discusses functions and operators on the [XML Schema Part 2: Datatypes] string datatype and the datatypes derived from string.
The operators described in this section are defined on the following string types.
string
normalizedString
token
language
NMTOKEN
Name
NCName
ID
IDREF
ENTITY
They also apply to user-defined types derived by restriction from these types.
The following literals are defined on string types and produce values of the types indicated.
Literal | Returns |
"xxx" [4] | string |
'xxx' [5] | string |
The following constructors are defined on string types.
Constructor | Meaning |
xf:string | Produces a string value by parsing and interpreting a supplied string. |
xf:normalizedString | Produces a normalizedString value (the XML Schema datatype) by parsing and interpreting a string. |
xf:token | Produces a token value by parsing and interpreting a string. |
xf:language | Produces a language value by parsing and interpreting a string. |
xf:Name | Produces a Name value by parsing and interpreting a string. |
xf:NMTOKEN | Produces an NMTOKEN value by parsing and interpreting a string. |
xf:NCName | Produces an NCName value by parsing and interpreting a string. |
xf:ID | Produces an ID value by parsing and interpreting a string. |
xf:IDREF | Produces an IDREF value by parsing and interpreting a string. |
xf:ENTITY | Produces an ENTITY value by parsing and interpreting a string. |
[Issue 14: Some function signatures are unclear about argument types]
[Issue 90: Constructors for id and idref need a document context for validity.]
[Issue 107: Notion of document context required?]
Returns a string value that is the value of $srcval. This constructor is correctly perceived as a "no-op", but is included for the sake of orthogonality.
xf:string('abc') returns "abc".
xf:string('Jérôme') returns "Jérôme" ("&#x302;" is the numeric character reference for the Unicode character U+0302, called "Combining Circumflex Accent").
NOTE: The preceding semantic is correct if and only if this document requires the use of Unicode Normalization Form C (NFC) semantics for this constructor. [Character Model for the World Wide Web 1.0] requires normalization following certain operations, so it may be appropriate to mandate it here, too.
[Issue 106: Do function invocations expand character references?]
Returns a normalizedString value (the XML Schema datatype) that is the value of $srcval. Every character contained in $srcval that is a line feed (#xA) is removed from the returned value.
If the argument string passed to a constructor is not a valid value in the lexical space of normalizedString as specified in [XML Schema Part 2: Datatypes], then the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model]. Note that the argument to construct a normalizedString cannot contain the carriage return (#xD) or the tab (#x9) character.
xf:normalizedString('abc') returns "abc".
xf:normalizedString('ab&#xA;cd') returns "abcd".
xf:normalizedString('ab&#xD;cd') returns an error value.
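The behavior described for this constructor (line feeds removed, carriage returns and tabs rejected) can be sketched in Python as follows. The function name xf_normalized_string and the use of ValueError in place of the error value are assumptions made for illustration.

```python
def xf_normalized_string(srcval: str) -> str:
    """Sketch of xf:normalizedString as described above."""
    # Carriage return (#xD) and tab (#x9) are not allowed in the argument.
    if "\r" in srcval or "\t" in srcval:
        raise ValueError(f"not a valid normalizedString: {srcval!r}")
    # Every line feed (#xA) is removed from the returned value.
    return srcval.replace("\n", "")
```

A string containing a line feed between "ab" and "cd" therefore yields "abcd", while the same string with a carriage return or tab in that position is rejected.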
[Issue 106: Do function invocations expand character references?]
Returns a token value that is the value of $srcval. Note that the argument to construct a token must not contain line feed (#xA) or tab (#x9) characters, must have no leading or trailing spaces (#x20), and must have no internal sequences of two or more spaces. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
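The three constraints on the argument of the token constructor translate directly into checks. The Python sketch below is illustrative only; the function name xf_token and the use of ValueError in place of the error value are assumptions.

```python
def xf_token(srcval: str) -> str:
    """Sketch of xf:token: validate the argument and return it unchanged."""
    # No line feed (#xA) or tab (#x9) characters.
    if "\n" in srcval or "\t" in srcval:
        raise ValueError("token must not contain line feed or tab")
    # No leading or trailing spaces (#x20).
    if srcval != srcval.strip(" "):
        raise ValueError("token must not have leading or trailing spaces")
    # No internal run of two or more spaces.
    if "  " in srcval:
        raise ValueError("token must not contain consecutive spaces")
    return srcval
```

Only the space character (#x20) is considered here; whether other Unicode space characters should be treated the same way is left open.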
[Issue 46: xf:token: Should other Unicode space characters be considered?]
Returns a language value that is the value of $srcval. Note that the value of $srcval to construct a value of type language must be a valid language identifier as defined in the language identification section of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error (for example, is not a valid language identifier), the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns a Name value that is the value of $srcval. Note that the value of $srcval to construct a value of type Name must match the Name production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns an NMTOKEN value that is the value of $srcval. Note that the value of $srcval to construct a value of type NMTOKEN must match the Nmtoken production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns an NCName value that is the value of $srcval. Note that the value of $srcval to construct a value of type NCName must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns an ID value that is the value of $srcval. Note that the value of $srcval to construct a value of type ID must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns an IDREF value that is the value of $srcval. Note that the value of $srcval to construct a value of type IDREF must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Returns an ENTITY value that is the value of $srcval. Note that the value of $srcval to construct a value of type ENTITY must match the NCName production of [XML 1.0 Recommendation (Second Edition)]. If the argument string passed to a constructor results in an error, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
The [Character Model for the World Wide Web 1.0] discusses the fact that strings from a particular character set may need to be collated (sorted) differently for different applications. Thus, the collation needs to be taken into account when comparing strings in any context. In this document, we assume that collations can be named and the collation name used as an argument to the comparison function. This document will also define the manner in which a default collation is determined, allowing the collation argument to be optional in the functions that allow it.
[Issue 44: Collations: URIs and URI references or short names?]
[Issue 70: How are "default" collations determined?]
Some collations can be "tailored" for various purposes. See [Unicode Collation Algorithm]. This document does not discuss tailoring. Instead, we assume that the collation argument to the various functions below is a tailored and named collation.
NOTE: A user who wishes to preserve the XPath 1.0 semantics of "<" and ">" can define a collation that converts each string to a number and then compares them numerically.
Collations can also indicate that some characters that are rendered differently are, in fact, equal for collation purposes (e.g., "uve" and "uwe" are considered equivalent in some European languages). Thus, strings can be compared character-by-character or in a logical manner based on the collation.
[Issue 45: Collations: Is there a relationship to xml:lang?]
The [Character Model for the World Wide Web 1.0] recommends that all strings be normalized early and, thus, string comparisons need only be defined on normalized strings. If this is not the case, then we may also want to compare unnormalized strings based on their normalized representations.
Function | Meaning | Source |
xf:codepoint-compare | Compares two character strings on a character-by-character basis | |
xf:compare | Compares two character strings; a collation may optionally be specified | XSLT 2.0, Req. 2.13 (Could) |
[Issue 19: Do we need collation-sensitive comparisons and another sort?]
[Issue 73: Is a "between" function needed?]
[Issue 114: codepoint-compare versus compare with special collation]
Returns -1, 0, or 1, depending on whether the value of $comparand1
is respectively less than, equal to, or greater than the value of $comparand2
, compared on a character by character basis using the Unicode values for each character.
If the value of $comparand2
is longer than the value of $comparand1
and starts with the value of $comparand1
, the result is -1. If the value of $comparand1
is longer and starts with the value of $comparand2
, the result is 1.
If either argument is the empty sequence, the result is the empty sequence.
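The code-point comparison rules above can be illustrated with a minimal Python sketch. The function name `codepoint_compare` and the use of `None` to stand in for the empty sequence are assumptions made for illustration only; they are not part of this document.

```python
def codepoint_compare(comparand1, comparand2):
    """Return -1, 0, or 1 by code-point order.

    None is used here (as an assumption) to model the empty sequence.
    """
    if comparand1 is None or comparand2 is None:
        return None
    # Python orders str values code point by code point, and a proper
    # prefix sorts before the longer string, matching the rules above.
    return (comparand1 > comparand2) - (comparand1 < comparand2)
```

Under this sketch, `codepoint_compare("abc", "abcd")` yields -1 because the second value is longer and starts with the first.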
xf:compare(string $comparand1, string $comparand2)
=> integer
xf:compare(string $comparand1, string $comparand2, anyURI $collation)
=> integer
$comparand1
— static type is string
$comparand2
— static type is string
$collation
— static type is anyURI
Returns -1, 0, or 1, depending on whether the value of $comparand1
is respectively less than, equal to, or greater than the value of $comparand2
, according to the rules of the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
If the value of $comparand2
begins with a string that is equal to the value of $comparand1
(according to the collation that is used) and has additional characters following that beginning string, then the result is -1. If the value of $comparand1
begins with a string that is equal to the value of $comparand2
(according to the collation that is used) and has additional characters following that beginning string, then the result is 1.
If either argument is the empty sequence, the result is the empty sequence.
xf:compare('abc', 'abc')
returns 0.
xf:compare('Strasse', 'Straße')
returns 0 if and only if the default collation includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Otherwise, the returned value depends on the semantics of the default collation.)
xf:compare('Strasse', 'Straße', anyURI('deutsch'))
returns 0 if and only if the collation identified by the relative URI constructed from the string
value "deutsch" includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Otherwise, the returned value depends on the semantics of that collation.)
xf:compare('Strassen', 'Straße')
returns 1 if and only if the default collation includes provisions that equate "ss" and the (German) character "ß" ("sharp-s"). (Since the value of $comparand1
has an additional character, an "n", following the string that is equal to "Straße", it is greater than the value of $comparand2
.)
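The examples above can be approximated with a small Python sketch. Everything in it is hypothetical: the `COLLATIONS` registry, the toy key function that equates "ß" with "ss", and the choice of code-point order as the default collation are assumptions made only to show how a named collation changes the result.

```python
def toy_german_key(value):
    # Hypothetical collation rule: treat sharp-s as equal to "ss".
    return value.replace("ß", "ss")


# Hypothetical registry of supported collations, keyed by (relative) URI.
COLLATIONS = {"deutsch": toy_german_key}


def compare(comparand1, comparand2, collation=None):
    """Return -1, 0, or 1 under the chosen collation (None models the empty sequence)."""
    if comparand1 is None or comparand2 is None:
        return None
    if collation is None:
        # Assumed default collation: plain code-point order.
        key = lambda value: value
    else:
        # Raises KeyError when the collation is not supported here.
        key = COLLATIONS[collation]
    key1, key2 = key(comparand1), key(comparand2)
    return (key1 > key2) - (key1 < key2)
```

With this toy collation, `compare('Strasse', 'Straße', 'deutsch')` is 0 and `compare('Strassen', 'Straße', 'deutsch')` is 1, while the assumed code-point default does not treat the two spellings as equal.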
The following functions are defined on these string types:
Function | Meaning | Source |
xf:concat | Concatenates two or more character strings. | XPath 1.0 |
xf:starts-with | Indicates whether the value of one string begins with the characters of the value of another string. | |
xf:ends-with | Indicates whether the value of one string ends with the characters of the value of another string. | |
xf:codepoint-contains | Indicates whether the value of one string contains the characters of the value of another string using the default Unicode collation. | |
xf:contains | Indicates whether the value of one string contains the characters of the value of another string. A collation may optionally be specified. | XPath 1.0 |
xf:substring | Returns a string located at a specified place in the value of a string. | XPath 1.0 |
xf:string-length | Returns the length of the argument. | XPath 1.0 |
xf:codepoint-substring-before | Returns the characters of one string that precede in that string the characters in the value of another string. | |
xf:substring-before | Returns the characters of one string that precede in that string the characters in the value of another string. A collation may optionally be specified. | XPath 1.0 |
xf:codepoint-substring-after | Returns the characters of one string that follow in that string the characters in the value of another string. | |
xf:substring-after | Returns the characters of one string that follow in that string the characters in the value of another string. A collation may optionally be specified. | XPath 1.0 |
xf:normalize-space | Returns the whitespace-normalized value of the argument. | XPath 1.0 |
xf:normalize | Returns the normalized value of the first argument in the normalization form specified by the second argument. | XPath 2.0 Req 2.9 (Should) |
xf:upper-case | Returns the upper-cased value of the argument. | XPath 2.0 Req 2.4.3 (Should) |
xf:lower-case | Returns the lower-cased value of the argument. | XPath 2.0 Req 2.4.3 (Should) |
xf:translate | Returns the first argument string with occurrences of characters in the second argument replaced by the character at the corresponding position in the third string. | XPath 1.0 |
xf:string-pad-beginning | Returns the string padded in front by copies of the third argument. | XPath 2.0 Req 2.4.2, 4.4 (Should) |
xf:string-pad-end | Returns the string padded at the end by copies of the third argument. | XPath 2.0 Req 2.4.2, 4.4 (Should) |
xf:match | Returns a sequence of integers indicating the positions in the value of the first argument that are matched by the regular expression that is the value of the second argument. | XPath 2.0 Req 3. (Must) |
xf:replace | Returns the first argument with every substring matched by the second argument replaced by the value of the third argument. | XPath 2.0 Req 2.4.1. (Should) |
[Issue 23: "Returns a copy" is not appropriate wording]
[Issue 21: What is the precise type returned by each function?]
[Issue 37: Linguistic contains required?]
[Issue 56: Some functions may be redundant]
[Issue 94: Must allow searching for words near other words. ]
[Issue 104: Need equality and inequality operators for strings.]
Note that the resulting string after operations such as concatenation or substring must be normalized. See [Character Model for the World Wide Web 1.0].
[Issue 108: Should strings always be returned in Unicode normalized form?]
Note also that when the above operators and functions are applied to datatypes derived from string
, they are guaranteed to return legal strings, but they may not return legal values for the particular subtype to which they were applied.
Returns the string that is the concatenation of the values of its arguments. XPath allows a variable number of arguments. The resulting string might not be normalized in any Unicode or W3C normalization.
NOTE: The
concat()
function is specified to allow an arbitrary number of string arguments that are concatenated together. This capability is retained for compatibility with [XPath 1.0] and is the only function specified in this document that has that characteristic.
xf:starts-with(string $operand1, string $operand2)
=> boolean
xf:starts-with(string $operand1, string $operand2, anyURI $collation)
=> boolean
$operand1
— static type is string
$operand2
— static type is string
$collation
— static type is anyURI
Returns a boolean indicating whether or not the value of $operand1
starts with a string that is equal to the value of $operand2
according to the collation that is used.
If the value of $operand2
is the zero-length string, then the function returns true
. If the value of $operand1
is the zero-length string and the value of $operand2
is not the zero-length string, then the function returns false
.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
xf:ends-with(string $operand1, string $operand2)
=> boolean
xf:ends-with(string $operand1, string $operand2, anyURI $collation)
=> boolean
$operand1
— static type is string
$operand2
— static type is string
$collation
— static type is anyURI
Returns a boolean indicating whether or not the value of $operand1
ends with a string that is equal to the value of $operand2
according to the collation that is used.
If the value of $operand2
is the zero-length string, then the function returns true
. If the value of $operand1
is the zero-length string and the value of $operand2
is not the zero-length string, then the function returns false
.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
Returns a boolean indicating whether or not the value of $operand1
contains (at the beginning, at the end, or anywhere within) a string equal to the value of $operand2
, considered on a character-by-character basis using the Unicode value of each character for comparison.
If the value of $operand2
is the zero-length string, then the function returns true
. If the value of $operand1
is the zero-length string and the value of $operand2
is not the zero-length string, then the function returns false
.
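The zero-length rules for the code-point containment test can be sketched as follows. The function name `codepoint_contains` is an assumption for illustration; Python's `in` operator on strings already compares code points, and the explicit branches only mirror the two special cases stated above.

```python
def codepoint_contains(operand1, operand2):
    """Report whether operand1 contains operand2, compared code point by code point."""
    if operand2 == "":
        # A zero-length search string is always considered to be contained.
        return True
    if operand1 == "":
        # Nothing non-empty can be found in a zero-length string.
        return False
    return operand2 in operand1
```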
xf:contains(string $operand1, string $operand2)
=> boolean
xf:contains(string $operand1, string $operand2, anyURI $collation)
=> boolean
$operand1
— static type is string
$operand2
— static type is string
$collation
— static type is anyURI
Returns a boolean indicating whether or not the value of $operand1
contains (at the beginning, at the end, or anywhere within) a string equal to the value of $operand2
according to the collation that is used.
If the value of $operand2
is the zero-length string, then the function returns true
. If the value of $operand1
is the zero-length string and the value of $operand2
is not the zero-length string, then the function returns false
.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
xf:substring(string $sourceString, decimal $startingLoc)
=> string
xf:substring(string $sourceString, decimal $startingLoc, decimal $length)
=> string
$sourceString
— static type is string
$startingLoc
— static type is decimal
. The effective value is computed as cast as unsignedInt(floor($startingLoc))
.
$length
— static type is decimal
. The effective value is computed as cast as unsignedInt(floor($length))
.
Returns the portion of the value of $sourceString
beginning at the position indicated by the value of $startingLoc
and continuing for the number of characters indicated by the value of $length
.
If $length
is not specified, then the substring identifies characters to the end of $sourceString
.
The value of $length
can be greater than the number of characters in the value of $sourceString
following the beginning position, in which case the substring identifies characters to the end of $sourceString
.
The first character of a string is located at position 1 (not position 0).
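A minimal Python sketch of these rules follows. The name `substring` and its parameters are illustrative. Rejecting a starting position below 1 and a negative length with `ValueError` is an assumption: this text does not say what happens in those cases, beyond the cast to unsignedInt.

```python
import math


def substring(source_string, starting_loc, length=None):
    """Return the part of source_string that starts at 1-based position starting_loc."""
    start = math.floor(starting_loc)
    if start < 1:
        # Assumption: positions before the first character are rejected.
        raise ValueError("starting location must be at least 1")
    begin = start - 1  # convert the 1-based position to a 0-based index
    if length is None:
        return source_string[begin:]
    count = math.floor(length)
    if count < 0:
        # Assumption: a negative length cannot be cast to unsignedInt.
        raise ValueError("length must not be negative")
    # Slicing past the end simply stops at the end of the string.
    return source_string[begin:begin + count]
```

For instance, `substring("metadata", 4, 3)` selects the three characters starting at position 4, and a length that runs past the end is truncated at the end of the source string.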
Returns the substring of the value of $operand1
that precedes in the value of $operand1
the first occurrence of a string that is equal to the value of $operand2
comparing each character in turn according to its Unicode value.
If the value of $operand2
is the zero-length string, then the function returns the value of $operand1
. If the value of $operand1
is the zero-length string and the value of $operand2
is the zero-length string, then the function returns the zero-length string.
If the value of $operand1
does not contain a string that is equal to the value of $operand2
, then the function returns the zero-length string.
xf:substring-before(string $operand1, string $operand2)
=> string
xf:substring-before(string $operand1, string $operand2, anyURI $collation)
=> string
$operand1
— static type is string
$operand2
— static type is string
$collation
— static type is anyURI
Returns the substring of the value of $operand1
that precedes in the value of $operand1
the first occurrence of a string that is equal to the value of $operand2
according to the collation that is used.
If the value of $operand2
is the zero-length string, then the function returns the value of $operand1
. If the value of $operand1
is the zero-length string and the value of $operand2
is the zero-length string, then the function returns the zero-length string.
If the value of $operand1
does not contain a string that is equal to the value of $operand2
, then the function returns the zero-length string.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
Returns the substring of the value of $operand1
that follows in the value of $operand1
the first occurrence of a string that is equal to the value of $operand2
comparing each character in turn according to its Unicode value.
If the value of $operand2
is the zero-length string, then the function returns the value of $operand1
. If the value of $operand1
is the zero-length string and the value of $operand2
is the zero-length string, then the function returns the zero-length string.
If the value of $operand1
does not contain a string that is equal to the value of $operand2
, then the function returns the zero-length string.
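The code-point forms of substring-before and substring-after can be sketched in Python as shown here. The function names are illustrative. The sketch follows the rules as written in this draft, including the rule that a zero-length second operand returns the value of the first operand.

```python
def codepoint_substring_before(operand1, operand2):
    """Return the part of operand1 that precedes the first occurrence of operand2."""
    if operand2 == "":
        return operand1  # rule as stated in this draft
    index = operand1.find(operand2)
    return "" if index < 0 else operand1[:index]


def codepoint_substring_after(operand1, operand2):
    """Return the part of operand1 that follows the first occurrence of operand2."""
    if operand2 == "":
        return operand1  # rule as stated in this draft
    index = operand1.find(operand2)
    return "" if index < 0 else operand1[index + len(operand2):]
```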
xf:substring-after(string $operand1, string $operand2)
=> string
xf:substring-after(string $operand1, string $operand2, anyURI $collation)
=> string
$operand1
— static type is string
$operand2
— static type is string
$collation
— static type is anyURI
Returns the substring of the value of $operand1
that follows in the value of $operand1
the first occurrence of a string that is equal to the value of $operand2
according to the collation that is used.
If the value of $operand2
is the zero-length string, then the function returns the value of $operand1
. If the value of $operand1
is the zero-length string and the value of $operand2
is the zero-length string, then the function returns the zero-length string.
If the value of $operand1
does not contain a string that is equal to the value of $operand2
, then the function returns the zero-length string.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
$srcval
— static type is string
$normalizationForm
— static type is string
. The effective value is the value returned by upper-case($normalizationForm)
.
Returns the value of $srcval
normalized according to the normalization criteria for a normalization form identified by the value of $normalizationForm
:
If the effective value of
$normalizationForm
is "NFC", then the value returned by the function is the value of $srcval
in Unicode Normalization Form C (NFC).
If the effective value of $normalizationForm
is "NFD", then the value returned by the function is the value of $srcval
in Unicode Normalization Form D (NFD).
If the effective value of $normalizationForm
is "W3C", then the value returned by the function is the value of $srcval
in W3C Normal Form.
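The dispatch on the normalization form can be sketched with Python's standard `unicodedata` module. The function name `normalize` is illustrative. Two points are assumptions: the standard library offers no W3C Normal Form, so the sketch reports it as unimplemented, and rejecting any other form name with `ValueError` is a choice this text does not specify.

```python
import unicodedata


def normalize(srcval, normalization_form):
    """Normalize srcval according to the upper-cased normalization form name."""
    form = normalization_form.upper()  # the "effective value" described above
    if form in ("NFC", "NFD"):
        return unicodedata.normalize(form, srcval)
    if form == "W3C":
        # Assumption: no W3C Normal Form implementation is available here.
        raise NotImplementedError("W3C Normal Form is not provided by unicodedata")
    # Assumption: other form names are rejected; the text above is silent on this.
    raise ValueError("unsupported normalization form: " + normalization_form)
```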
Returns the value of $srcval
after translating every lower-case letter to its upper-case correspondent. Every lower-case letter that does not have an upper-case correspondent, and every character that is not a lower-case letter, is included in the returned value in its original form.
A "lower-case letter" is a character whose Unicode General Category class includes "Ll". The corresponding upper-case letter is determined using [Unicode Case Mappings].
Returns the value of $srcval
after translating every upper-case letter to its lower-case correspondent. Every upper-case letter that does not have a lower-case correspondent, and every character that is not an upper-case letter, is included in the output in its original form.
An "upper-case letter" is a character whose Unicode General Category class includes "Lu". The corresponding lower-case letter is determined using [Unicode Case Mappings].
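A rough Python sketch of the two case conversions follows. The function names are illustrative. It uses the Unicode General Category from `unicodedata` to decide which characters are lower-case ("Ll") or upper-case ("Lu") letters, and it assumes that Python's built-in case methods are an acceptable stand-in for [Unicode Case Mappings]. Those methods apply full case mappings, so a single character may map to more than one character.

```python
import unicodedata


def upper_case(srcval):
    """Upper-case only the characters whose general category is Ll."""
    return "".join(
        ch.upper() if unicodedata.category(ch) == "Ll" else ch
        for ch in srcval
    )


def lower_case(srcval):
    """Lower-case only the characters whose general category is Lu."""
    return "".join(
        ch.lower() if unicodedata.category(ch) == "Lu" else ch
        for ch in srcval
    )
```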
$srcval
— static type is string
$mapString
— static type is string
$transString
— static type is string
Returns the value of $srcval
modified so that every character in the value of $srcval
that occurs at some position N in the value of $mapString
has been replaced by the character that occurs at position N in the value of $transString
.
Every character in the value of $srcval
that does not appear in the value of $mapString
is unchanged.
Every character in the value of $srcval
that appears at some position M in the value of $mapString
, where the value of $transString
is less than M characters in length, is omitted from the returned value.
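The three rules above can be sketched in Python as follows. The function name `translate` is illustrative. When a character occurs more than once in the map string, the sketch uses its first occurrence; that matches XPath 1.0 but is an assumption here, since the text above does not address duplicates.

```python
def translate(srcval, map_string, trans_string):
    """Replace or drop characters of srcval according to map_string and trans_string."""
    table = {}
    for position, char in enumerate(map_string):
        if char in table:
            continue  # assumption: the first occurrence in map_string wins
        if position < len(trans_string):
            table[char] = trans_string[position]
        else:
            table[char] = None  # no counterpart: the character is omitted
    result = []
    for char in srcval:
        if char not in table:
            result.append(char)  # not in the map string: unchanged
        elif table[char] is not None:
            result.append(table[char])
    return "".join(result)
```

For example, `translate("--aaa--", "abc-", "ABC")` drops every hyphen, because the translation string has no character at the hyphen's position in the map string.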
xf:string-pad-beginning(string $srcval, decimal $padCount, string $padChar)
=> string
$srcval
— static type is string
$padCount
— static type is decimal
. The effective value is computed as cast as unsignedInt(floor($padCount))
.
$padChar
— static type is string
.
Returns the value of $srcval
in which zero or more copies of the value of $padChar
have been inserted preceding the first character of the value of $srcval
. The number of copies of the value of $padChar
that are inserted is equal to the value of $padCount
.
If the value of string-length($padChar)
is greater than 1, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
$srcval
— static type is string
$padCount
— static type is decimal
. The effective value is computed as cast as unsignedInt(floor($padCount))
.
$padChar
— static type is string
.
Returns the value of $srcval
in which zero or more copies of the value of $padChar
have been appended following the last character of the value of $srcval
. The number of copies of the value of $padChar
that are appended is equal to the value of $padCount
.
If the value of string-length($padChar)
is greater than 1, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
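Both padding functions can be sketched together in Python. The function names and the helper `_effective_pad_count` are illustrative. Modelling the error value as a raised `ValueError`, and rejecting a negative pad count, are assumptions of this sketch.

```python
import math


def _effective_pad_count(pad_count, pad_char):
    """Validate pad_char and return the floored pad count."""
    if len(pad_char) > 1:
        # The text above requires an error value; an exception models it here.
        raise ValueError("padChar must not be longer than one character")
    count = math.floor(pad_count)
    if count < 0:
        # Assumption: a negative count cannot be cast to unsignedInt.
        raise ValueError("padCount must not be negative")
    return count


def string_pad_beginning(srcval, pad_count, pad_char):
    """Insert pad_count copies of pad_char before srcval."""
    return pad_char * _effective_pad_count(pad_count, pad_char) + srcval


def string_pad_end(srcval, pad_count, pad_char):
    """Append pad_count copies of pad_char after srcval."""
    return srcval + pad_char * _effective_pad_count(pad_count, pad_char)
```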
xf:match(string $srcval, string $regexp)
=> integer*
xf:match(string $srcval, string $regexp, anyURI $collation)
=> integer*
$srcval
— static type is string
$regexp
— static type is string
$collation
— static type is anyURI
Returns a list of integers that identify the offsets of locations within the value of $srcval
that are matched by the regular expression that is the value of $regexp
. Note that this list might often be implicitly converted to a boolean and that it is probably more common to determine whether a string matches than to ask where those matches occur.
The regular expression in the value of $regexp
uses the syntax of regular expressions specified in Appendix F of [XML Schema Part 2: Datatypes].
Comparisons of characters and character strings are, as in all functions, performed in the context of a collation.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
[Issue 74: Is a "match-exact()" function needed?]
[Issue 75: The semantics of match() are incompletely specified]
[Issue 81: What are the precise semantics of regular expressions?]
xf:replace(string $srcval, string $regexp, string $repval)
=> string
xf:replace(string $srcval, string $regexp, string $repval, anyURI $collation)
=> string
$srcval
— static type is string
$regexp
— static type is string
$repval
— static type is string
$collation
— static type is anyURI
Returns the value of $srcval
in which every substring of the value of $srcval
that is matched by the regular expression that is the value of $regexp
, has been replaced by a copy of the value of $repval
.
Ordinary regular expression semantics are used. Among other characteristics, if the value of $regexp
is an ordinary character string without any of the "special characters" that give regular expressions their semantics, then the phrase "matched by the regular expression" is equivalent in meaning to "equal to the string".
The value of $repval
may use the standard regular expression syntax of "$N"
(where N
is some integer) to represent the N-th part of the matched pattern indicated by parentheses in the value of $regexp
.
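The "$N" replacement syntax can be approximated with Python's `re` module, as in the sketch below. The function name `replace` is illustrative. This is only an approximation: Python's regular expression dialect is not the one defined in [XML Schema Part 2: Datatypes], the collation argument is ignored, and reading every run of digits after "$" as one group number is an assumption.

```python
import re


def replace(srcval, regexp, repval):
    """Replace every match of regexp in srcval, expanding "$N" group references."""
    # Keep any backslash in the replacement value literal.
    template = repval.replace("\\", "\\\\")
    # Rewrite "$N" into the "\g<N>" form that re.sub understands.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", template)
    return re.sub(regexp, template, srcval)
```

With a pattern that has no special characters, the call behaves like a plain string replacement, as the text above describes.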
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
This section discusses operators on the [XML Schema Part 2: Datatypes] boolean datatype.
The following constructors are defined on the boolean type.
Constructor | Meaning | Source |
xf:true | boolean | XPath 1.0 |
xf:false | boolean | XPath 1.0 |
xf:boolean-from-string | boolean | |
$srcval
— static type is string
. The effective value is the value returned by upper-case($srcval)
.
If the effective value of $srcval
is "TRUE", then this constructor returns the boolean value true
; if the effective value of $srcval
is "FALSE", then this constructor returns the boolean value false
.
If the effective value of $srcval
is any value other than "TRUE" or "FALSE", the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
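A minimal Python sketch of this constructor follows. The function name `boolean_from_string` is illustrative, and raising `ValueError` is an assumed way of modelling the error value.

```python
def boolean_from_string(srcval):
    """Map "true"/"false" in any letter case to a boolean; reject anything else."""
    effective = srcval.upper()  # the "effective value" described above
    if effective == "TRUE":
        return True
    if effective == "FALSE":
        return False
    # Assumption: the error value is modelled as an exception.
    raise ValueError("not a boolean literal: " + srcval)
```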
The following operators are defined on these boolean types:
Operator | Meaning | Source |
and | Logical conjunction | XPath 1.0 |
or | Logical disjunction | XPath 1.0 |
= | Equality comparison | XPath 1.0 |
!= | Inequality comparison | XPath 1.0 |
[Issue 57: Some boolean operators are incompletely or incorrectly specified]
The arguments and return types of all boolean operators are boolean types.
For and
, if one of the arguments is the empty sequence, the result is the empty sequence except if the other argument is false
in which case the result is false
. For or
, if one of the arguments is the empty sequence, the result is the empty sequence except if the other argument is true
in which case the result is true
. The empty sequence does not compare equal to any value, including itself.
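The treatment of the empty sequence by the two logical operators can be sketched in Python. The names `logical_and` and `logical_or` are illustrative, and `None` is used as an assumed stand-in for the empty sequence.

```python
def logical_and(left, right):
    """Conjunction where None models the empty sequence."""
    if left is False or right is False:
        return False  # a false operand decides the result
    if left is None or right is None:
        return None
    return True


def logical_or(left, right):
    """Disjunction where None models the empty sequence."""
    if left is True or right is True:
        return True  # a true operand decides the result
    if left is None or right is None:
        return None
    return False
```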
The following functions are defined on boolean types:
Function | Meaning | Source |
xf:not | Inverts the boolean value of the argument. | XPath 1.0 |
This section discusses operators on the [XML Schema Part 2: Datatypes] date and time types.
[Issue 109: Calendar context allows for non-Gregorian calendars]
The operators described in this section are defined on the following duration and datetime types.
duration
dateTime
date
time
gYearMonth
gYear
gMonthDay
gMonth
gDay
The following constructors are defined on duration and datetime datatypes.
Constructor | Meaning |
xf:duration | Returns a duration type derived by parsing and interpreting a string value. |
xf:dateTime | Returns a dateTime type derived by parsing and interpreting a string value. |
xf:date | Returns a date type derived by parsing and interpreting a string value. |
xf:time | Returns a time type derived by parsing and interpreting a string value. |
xf:gYearMonth | Returns a gYearMonth type derived by parsing and interpreting a string value. |
xf:gYear | Returns a gYear type derived by parsing and interpreting a string value. |
xf:gMonthDay | Returns a gMonthDay type derived by parsing and interpreting a string value. |
xf:gMonth | Returns a gMonth type derived by parsing and interpreting a string value. |
xf:gDay | Returns a gDay type derived by parsing and interpreting a string value. |
xf:currentDateTime() | Returns the current dateTime. |
If the value of $srcval
conforms to the lexical representation of a duration
as defined in [XML Schema Part 2: Datatypes], the constructor returns the duration
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a dateTime
as defined in [XML Schema Part 2: Datatypes], the constructor returns the dateTime
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
xf:dateTime("1999-05-31T05:00:00")
returns a dateTime
value corresponding to the 31st of May, 1999 at 5:00 AM in an unspecified timezone.
xf:dateTime("1999-05-31T13:20:00-05:00")
returns a dateTime
value corresponding to 1:20 pm on May the 31st, 1999 for Eastern Standard Time, which is 5 hours behind Coordinated Universal Time (UTC).
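The two examples can be reproduced approximately with Python's standard `datetime` module. The function name `date_time` is illustrative. `datetime.fromisoformat` accepts a format that is close to, but not the same as, the lexical representation in [XML Schema Part 2: Datatypes], so this is an approximation. A malformed string raises `ValueError`, which stands in for the error value.

```python
from datetime import datetime


def date_time(srcval):
    """Parse an ISO 8601 style string into a datetime, with or without a timezone."""
    # fromisoformat raises ValueError when the string cannot be parsed.
    return datetime.fromisoformat(srcval)
```

A value parsed without a timezone designator has no `tzinfo`, which corresponds to the "unspecified timezone" case in the first example.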
If the value of $srcval
conforms to the lexical representation of a date
as defined in [XML Schema Part 2: Datatypes], the constructor returns the date
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a time
as defined in [XML Schema Part 2: Datatypes], the constructor returns the time
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
xf:time("11:33:24")
returns a time
value corresponding to 33 minutes and 24 seconds past 11 AM in an unspecified timezone.
xf:time("23:33:24.35-05:00")
returns a time
value corresponding to 33 minutes and 24.35 seconds past 11 PM for Eastern Standard Time, which is 5 hours behind Coordinated Universal Time (UTC).
If the value of $srcval
conforms to the lexical representation of a gYearMonth
as defined in [XML Schema Part 2: Datatypes], the constructor returns the gYearMonth
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a gYear
as defined in [XML Schema Part 2: Datatypes], the constructor returns the gYear
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a gMonthDay
as defined in [XML Schema Part 2: Datatypes], the constructor returns the gMonthDay
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a gMonth
as defined in [XML Schema Part 2: Datatypes], the constructor returns the gMonth
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
conforms to the lexical representation of a gDay
as defined in [XML Schema Part 2: Datatypes], the constructor returns the gDay
corresponding to that representation. Otherwise, the constructor returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
xf:gDay("13")
returns a gDay
corresponding to the thirteenth day in an unspecified month and year in an unspecified timezone.
xf:gDay("14+02:30")
returns a gDay
corresponding to the fourteenth day in an unspecified month and year in a timezone that is 2.5 hours ahead of Coordinated Universal Time (UTC).
Returns the dateTime that is current at some time during the evaluation of the XQuery or XPath expression in which the currentDateTime()
constructor is executed. All invocations of currentDateTime()
that are executed during the course of a single outermost XQuery or XPath expression return the same value; the precise instant during that expression's evaluation that is represented by the value of the currentDateTime()
constructor is implementation-defined.
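One way to obtain the "same value for every invocation" behaviour is to capture the clock once per outermost evaluation, as in the Python sketch below. The `EvaluationContext` class is hypothetical, and the choice to read the clock on first use is just one of the implementation-defined options.

```python
from datetime import datetime, timezone


class EvaluationContext:
    """Hypothetical per-expression context that fixes the current dateTime."""

    def __init__(self):
        self._current = None

    def current_date_time(self):
        # Read the clock on first use, then keep returning the same value.
        if self._current is None:
            self._current = datetime.now(timezone.utc)
        return self._current
```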
The following operators are defined on date and time values:
Operator | Meaning |
= | Equality comparison |
< | Less-than comparison |
> | Greater-than comparison |
<= | Less-than-or-equal-to comparison |
>= | Greater-than-or-equal-to comparison |
!= | Inequality comparison |
These operators take two arguments of the same type. For equality and inequality, these operators return a boolean result. As discussed in [XML Schema Part 2: Datatypes], the order relation on the duration and the date and time datatypes is not a total order but, rather, a partial order. For this reason, the operators <
, >
, <=
, and >=
can return one of three values: true
, false
, or "indeterminate", represented by the symbol "<>".
When comparing durations, there is no indeterminacy if either both arguments contain only years and months or both arguments contain only days, hours, minutes and seconds.
When comparing dateTime values, indeterminacy arises only in the cases when one of the arguments has a timezone and the other does not. If both arguments have timezones or if both arguments do not have timezones, an indeterminate value is never returned.
If one or more arguments is the empty sequence, the result is the empty sequence.
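The three-valued result for dateTime comparison can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative algorithm: None stands in for the indeterminate value "<>", a Python datetime without tzinfo stands in for a dateTime without a timezone, and the 14-hour bound on timezone offsets is an assumption taken from the ordering rule for dateTime in [XML Schema Part 2: Datatypes]. The function name less_than is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)  # assumed bound on timezone offsets


def _to_utc_naive(value):
    """Convert a timezoned value to UTC and drop tzinfo for plain arithmetic."""
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def less_than(a, b):
    """Return True, False, or None (standing in for the indeterminate "<>")."""
    a_zoned = a.tzinfo is not None
    b_zoned = b.tzinfo is not None
    if a_zoned == b_zoned:
        return a < b  # both or neither have a timezone: never indeterminate
    if a_zoned:
        a_utc = _to_utc_naive(a)
        if a_utc < b - MAX_OFFSET:
            return True   # a precedes b whatever timezone b is in
        if a_utc > b + MAX_OFFSET:
            return False  # a follows b whatever timezone b is in
        return None
    b_utc = _to_utc_naive(b)
    if a + MAX_OFFSET < b_utc:
        return True
    if a - MAX_OFFSET > b_utc:
        return False
    return None


utc_noon = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
```

With these definitions, less_than(utc_noon, datetime(2000, 1, 1, 12, 0)) yields None because the second argument has no timezone and lies within 14 hours of the first.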
[Issue 38: How are indeterminate values in date/time values represented?]
[Issue 121: Representation of indeterminate comparison]
[Issue 123: Comparison of durations and datetimes: functions or operators?]
The date and time datatypes may be considered to be composite datatypes in that they contain distinct components. The extraction functions specified below extract one component from a date or time value.
Function | Meaning |
xf:get-Century-from-dateTime | Returns the century component of the years value of the argument. |
xf:get-Century-from-date | Returns the century component of the years value of the argument. |
xf:get-Century-from-gYear | Returns the century component of the years value of the argument. |
xf:get-Century-from-gYearMonth | Returns the century component of the years value of the argument. |
xf:get-gYear-from-dateTime | Returns the years value of the argument. |
xf:get-gYear-from-date | Returns the years value of the argument. |
xf:get-gYear-from-gYearMonth | Returns the years value of the argument. |
xf:get-gMonth-from-dateTime | Returns the months value of the argument. |
xf:get-gMonth-from-date | Returns the months value of the argument. |
xf:get-gMonth-from-gYearMonth | Returns the months value of the argument. |
xf:get-gMonth-from-gMonthDay | Returns the months value of the argument. |
xf:get-gDay-from-dateTime | Returns the days value of the argument. |
xf:get-gDay-from-date | Returns the days value of the argument. |
xf:get-gDay-from-gMonthDay | Returns the days value of the argument. |
xf:get-hour-from-dateTime | Returns the hours value of the argument. |
xf:get-hour-from-time | Returns the hours value of the argument. |
xf:get-minutes-from-dateTime | Returns the minutes value of the argument. |
xf:get-minutes-from-time | Returns the minutes value of the argument. |
xf:get-seconds-from-dateTime | Returns the seconds value of the argument. |
xf:get-seconds-from-time | Returns the seconds value of the argument. |
xf:get-timezone-from-dateTime | Returns the timezone part of the argument. |
xf:get-timezone-from-date | Returns the timezone part of the argument. |
xf:get-timezone-from-time | Returns the timezone part of the argument. |
xf:get-timezone-from-gYearMonth | Returns the timezone part of the argument. |
xf:get-timezone-from-gYear | Returns the timezone part of the argument. |
xf:get-timezone-from-gMonthDay | Returns the timezone part of the argument. |
xf:get-timezone-from-gMonth | Returns the timezone part of the argument. |
xf:get-timezone-from-gDay | Returns the timezone part of the argument. |
xf:get-gMonth-from-gYearMonth(gYearMonth $srcval)
=> positiveInteger
(range 1 to 12)
xf:get-gMonth-from-gMonthDay(gMonthDay $srcval)
=> positiveInteger
(range 1 to 12)
Returns a decimal value representing the seconds and fractional seconds identified in the value of $srcval. The value ranges from 0 to 61.999..., inclusive. The number of digits of fractional seconds precision is determined by the relevant facet of the argument. Note that the value can be greater than 60 seconds to accommodate occasional leap seconds used to keep human time synchronized with the rotation of the planet.
Returns a decimal value representing the seconds and fractional seconds identified in the value of $srcval. The value ranges from 0 to 60.999..., inclusive. The number of digits of fractional seconds precision is determined by the relevant facet of the argument. Note that the value can be greater than 60 seconds to accommodate occasional leap seconds used to keep human time synchronized with the rotation of the planet.
The extraction functions specified below extract one component from a duration.
Function | Meaning |
xf:get-years(duration) | Returns the years component of the argument. |
xf:get-months(duration) | Returns the months component of the argument. |
xf:get-days(duration) | Returns the days component of the argument. |
xf:get-hours(duration) | Returns the hours component of the argument. |
xf:get-minutes(duration) | Returns the minutes component of the argument. |
xf:get-seconds(duration) | Returns the seconds component of the argument. |
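As an illustration of component extraction, the following Python sketch splits the lexical form of a duration (for example "P1Y2M3DT10H30M") into the six components listed in the table. The function name duration_components and the returned dictionary are hypothetical. The sketch assumes that components are reported as written, without normalization, and its pattern is deliberately loose: it does not reject every string that [XML Schema Part 2: Datatypes] would reject, such as a bare "P".

```python
import re
from decimal import Decimal

# Loose pattern for the duration lexical form; absent components count as zero.
_DURATION = re.compile(
    r"^(?P<sign>-)?P"
    r"(?:(?P<years>\d+)Y)?"
    r"(?:(?P<months>\d+)M)?"
    r"(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def duration_components(lexical):
    """Return the components of a duration string as a dictionary."""
    match = _DURATION.match(lexical)
    if match is None:
        raise ValueError(f"not a duration: {lexical!r}")
    parts = match.groupdict()
    result = {
        name: int(parts[name] or 0)
        for name in ("years", "months", "days", "hours", "minutes")
    }
    # Seconds may carry a fraction, so keep them as a decimal value.
    result["seconds"] = Decimal(parts["seconds"] or "0")
    result["negative"] = parts["sign"] == "-"
    return result


example = duration_components("P1Y2M3DT10H30M")
```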
Function | Meaning |
xf:add-days | Adds the number of days indicated by the second argument to the first argument. |
xf:add-months | Adds the number of months indicated by the second argument to the first argument. |
xf:add-years | Adds the number of years indicated by the second argument to the first argument. |
xf:add-gMonth | Adds the number of months indicated by the value of the second argument to the value of the first argument. |
xf:add-gYear | Adds the number of years indicated by the value of the second argument to the value of the first argument. |
These functions add a duration to a date. Appendix E of [XML Schema Part 2: Datatypes] describes an algorithm for adding durations to dates.
Adds the number of months indicated by the value of $incrMonths to the value of $dateParam. The value of $incrMonths may be negative. If the value of $dateParam has a timezone, it remains unchanged. The returned value is always normalized into a correct Gregorian calendar date.
[Issue 124: add-months() and add-years() incompletely specified]
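Because the normalization step is not fully specified here, the following Python sketch shows only one plausible reading. It assumes that a day of the month that does not exist in the target month is pinned to the last day of that month, which is how the algorithm in Appendix E of [XML Schema Part 2: Datatypes] is commonly read. The function name add_months is a hypothetical stand-in for xf:add-months.

```python
import calendar
from datetime import date


def add_months(date_param, incr_months):
    """Add a possibly negative number of months, pinning the day if needed."""
    # Count months from year 0 so that carries across years fall out of divmod.
    total = date_param.year * 12 + (date_param.month - 1) + incr_months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: an out-of-range day becomes the last day of the target month.
    last_day = calendar.monthrange(year, month)[1]
    return date_param.replace(year=year, month=month,
                              day=min(date_param.day, last_day))


end_of_february = add_months(date(2000, 1, 31), 1)
```

Under this reading, adding one month to 2000-01-31 gives 2000-02-29. An implementation that rolled the excess days into March instead would also produce a correct Gregorian date, so the choice is an assumption of this sketch.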
Adds the number of years indicated by the value of $incrYears to the value of $dateParam. The value of $incrYears may be negative. If the value of $dateParam has a timezone, it remains unchanged. The returned value is always normalized into a correct Gregorian calendar date.
[Issue 124: add-months() and add-years() incompletely specified]
Let us define a time period as an interval or duration of time with a fixed start and end. Thus, time periods have three properties, two of which are independent. The first three functions below take two of the properties as arguments and return the third. The last three functions test whether a given dateTime falls within a time period.
Function | Meaning |
xf:get-duration | Returns the difference between two dateTimes as a duration. The two arguments must both have a timezone or both have no timezone. |
xf:get-end | Returns the end of a time period by adding the dateTime that starts the period and the duration of the period. |
xf:get-start | Returns the beginning of a time period by subtracting the duration of the period from the dateTime that ends the period. |
xf:temporal-dateTimes-contains | Indicates whether the time period defined by the first two arguments contains the time specified in the third argument. All three dateTime arguments must have a timezone or all three must not have a timezone. |
xf:temporal-dateTimeDuration-contains | Indicates whether the time period defined by the first two arguments contains the time specified in the third argument. The two dateTime arguments must both have a timezone or both must not have a timezone. |
xf:temporal-durationDateTimes-contains | Indicates whether the time period defined by the first two arguments contains the time specified in the third argument. The two dateTime arguments must both have a timezone or both must not have a timezone. |
[Issue 25: Is a normalize function needed for duration types?]
[Issue 96: These functions on time-period values are better written using operators as in SQL.]
[Issue 111: get-end and get-start should be more unique]
Returns the duration that corresponds to the difference between the value of $parameter1
and the value of $parameter2
. If the value of $parameter1
follows in time the value of $parameter2
, then the returned value is a negative duration. The two arguments must both have a timezone or both have no timezone.
xf:temporal-dateTimes-contains(dateTime $parameter1, dateTime $parameter2, dateTime $parameter3)
=> boolean
$parameter1: static type is dateTime
$parameter2: static type is dateTime
$parameter3: static type is dateTime
If the value of $parameter3 is either (1) equal to the value of $parameter1, (2) equal to the value of $parameter2, (3) greater than the value of $parameter1 and less than the value of $parameter2, or (4) less than the value of $parameter1 and greater than the value of $parameter2, then the function returns true; otherwise, the function returns false. All three dateTime arguments must have a timezone or all three must not have a timezone.
xf:temporal-dateTimes-contains(xf:dateTime("2000-10-30T11:12:00"), xf:dateTime("1999-11-28T09:00:00"), xf:dateTime("2000-01-01T12:00:00"))
returns true
.
xf:temporal-dateTimes-contains(xf:dateTime("2000-10-30T11:12:00Z"), xf:dateTime("2000-01-01T09:00:00"), xf:dateTime("2000-01-01T12:00:00"))
returns error.
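The two examples above can be reproduced with a short Python sketch. The function name temporal_datetimes_contains is a hypothetical stand-in for xf:temporal-dateTimes-contains, a Python datetime without tzinfo stands in for a dateTime without a timezone, and raising ValueError stands in for returning an error. The four conditions in the definition reduce to testing that the third argument lies between the smaller and the larger of the first two, inclusive.

```python
from datetime import datetime, timezone


def temporal_datetimes_contains(p1, p2, p3):
    """Test whether p3 lies in the period bounded by p1 and p2, in either order."""
    zoned = {value.tzinfo is not None for value in (p1, p2, p3)}
    if len(zoned) != 1:
        # Mixed arguments: some have a timezone and some do not.
        raise ValueError("arguments must all have a timezone or all lack one")
    return min(p1, p2) <= p3 <= max(p1, p2)


inside = temporal_datetimes_contains(
    datetime(2000, 10, 30, 11, 12, 0),
    datetime(1999, 11, 28, 9, 0, 0),
    datetime(2000, 1, 1, 12, 0, 0),
)
```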
xf:temporal-dateTimeDuration-contains(dateTime $parameter1, duration $parameter2, dateTime $parameter3)
=> boolean
$parameter1: static type is dateTime
$parameter2: static type is duration
$parameter3: static type is dateTime
If the value of $parameter3 is either (1) equal to the value of $parameter1, (2) equal to the dateTime value resulting from adding the value of $parameter2 to the value of $parameter1, (3) greater than the value of $parameter1 and less than the dateTime value resulting from adding the value of $parameter2 to the value of $parameter1, or (4) less than the value of $parameter1 and greater than the dateTime value resulting from adding the value of $parameter2 to the value of $parameter1, then the function returns true; otherwise, the function returns false. The two dateTime arguments must both have a timezone or must both have no timezone.
xf:temporal-durationDateTime-contains(duration $parameter1, dateTime $parameter2, dateTime $parameter3)
=> boolean
$parameter1: static type is duration
$parameter2: static type is dateTime
$parameter3: static type is dateTime
If the value of $parameter3 is either (1) equal to the value of $parameter2, (2) equal to the dateTime value resulting from subtracting the value of $parameter1 from the value of $parameter2, (3) greater than the value of $parameter2 and less than the dateTime value resulting from subtracting the value of $parameter1 from the value of $parameter2, or (4) less than the value of $parameter2 and greater than the dateTime value resulting from subtracting the value of $parameter1 from the value of $parameter2, then the function returns true; otherwise, the function returns false. The two dateTime arguments must both have a timezone or must both have no timezone.
This section discusses constructors for QNames as defined in [XML Schema Part 2: Datatypes].
Function | Meaning | Source |
xf:QName-from-uri | Returns a QName with the namespace URI given in the first argument and the local name in the second argument. | |
xf:QName-from-prefix | Returns a QName with the namespace URI that maps to the prefix given in the first argument and the local name in the second argument. The prefix-to-URI mapping uses the namespaces in scope. | |
xf:QName | Returns a QName in no namespace with the local name given in the argument. | |
[Issue 115: QName-from-string function]
This section discusses functions on QNames as defined in [XML Schema Part 2: Datatypes].
Function | Meaning | Source |
xf:get-local-name | Returns a string representing the local part of the QName argument. | |
xf:get-namespace-uri | Returns the namespace URI for the QName argument. This may be the empty sequence if the QName is in no namespace. | |
This section discusses a constructor for anyURI as defined in [XML Schema Part 2: Datatypes].
Function | Meaning | Source |
xf:anyURI | Returns an anyURI with a URI reference as given in the argument. | |
Returns an anyURI
value with a URI reference specified as the value of $srcval
.
[Issue 10: Do we need functions to decompose and resolve URIs?]
We define the following comparison operators on base64Binary and hexBinary values. Comparisons take two operands of the same type; that is, both operands must be base64Binary or both must be hexBinary. Each comparison operator returns a boolean value.
The comparison is bit by bit. So, for base64Binary, "A" < "B" < "a". If the two operands are of different lengths and the second operand starts with the first operand, then the first operand < second operand.
Operator | Meaning |
= | Equality comparison |
< | Less-than comparison |
> | Greater-than comparison |
<= | Less-than-or-equal-to comparison |
>= | Greater-than-or-equal-to comparison |
!= | Inequality comparison |
If either argument is the empty sequence, the result is the empty sequence.
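For byte-aligned values, the bit-by-bit rule and the prefix rule described above coincide with lexicographic comparison of the decoded octets. The following Python sketch illustrates this for hexBinary only; the function name hex_binary_less_than is hypothetical, and the sketch assumes the comparison is made on the decoded binary value rather than on the lexical characters.

```python
def hex_binary_less_than(left, right):
    """Compare two hexBinary lexical forms by their decoded octets."""
    # Python compares bytes objects octet by octet as unsigned values, and a
    # strict prefix compares as less than the longer value.
    return bytes.fromhex(left) < bytes.fromhex(right)


prefix_is_less = hex_binary_less_than("0F", "0FB7")
```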
[Issue 125: Do comparisons on base64Binary and hexBinary make sense?]
This section discusses a constructor for NOTATION as defined in [XML Schema Part 2: Datatypes].
Function | Meaning | Source |
xf:NOTATION | Returns a NOTATION with the value given in the argument. | |
This section discusses functions and operators on nodes as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 86: In which document do the node accessors functions and the kind tests go?]
The following operators are defined on nodes.
Operator | Meaning |
== | Tests if two nodes have the same identity. |
!== | Tests if two nodes do not have the same identity. |
=> | The left operand is an element or attribute with a value of type IDREF or IDREFS. The right operand is a node test. The operator dereferences the value and returns the referenced nodes that satisfy the node test. See also 11.7 Functions that Generate Sequences |
Function | Meaning | Source |
xf:local-name | Returns the local name of the context node or the specified node as a QName. | XPath 1.0 modified |
xf:namespace-uri | Returns the namespace URI for the name of the specified node. | XPath 1.0 |
xf:number | Returns the value of the context node or the specified node converted to a number. | XPath 2.0 req 1.5 (Could) |
xf:node-equal | Returns true if the two arguments have the same identity. | Data Model |
xf:value-equal | Returns true if the two arguments have the same value. | Data Model |
xf:node-before | Indicates whether one node appears before another node in document order. | Data Model |
xf:node-after | Indicates whether one node appears after another node in document order. | Data Model |
xf:copy | Returns a deep copy of a node. | Data Model |
xf:shallow | Returns a shallow copy of a node. | Data Model |
xf:boolean | Casts a node to a boolean. See 12.5 Miscellaneous casting functions. | XPath 1.0 |
[Issue 59: Recommend = operator for value-equal and "is" for node-equal.]
[Issue 95: Need if-absent and if-empty functions to handle missing and unknown data.]
[Issue 97: We need a deep-equals function for nodes.]
[Issue 103: Need operators for BEFORE and AFTER.]
For the illustrative examples below, assume an XQuery operating on a Purchase Order document containing a number of item elements. Each item has child elements called description, quantity, etc. Quantity has simple content of type decimal. Further assume that variables $item1, $item2, etc. are bound to the nodes for the item elements in the document in sequence.
Returns the value of the node indicated by $srcval
or, if $srcval
is not specified, the context node, converted to a double. If the value of the node is not a valid lexical representation of a numeric simple type as defined in [XML Schema Part 2: Datatypes], then the function returns an error value as specified in [XQuery 1.0 and XPath 2.0 Data Model].
xf:value-equal(node $parameter1, node $parameter2)
=> boolean
xf:value-equal(node $parameter1, node $parameter2, anyURI $collation)
=> boolean
$parameter1: static type is node
$parameter2: static type is node
$collation: static type is anyURI
If the node identified by the value of $parameter1
has the same value as the node identified by the value of $parameter2
, then the function returns true
; otherwise, the function returns false
.
We define the value-equality as follows. We assume value equality over simple values is defined. Equality of string values is determined according to the collation that is used. Equality over all other data model values is defined recursively:
Given attributes a1 and a2, xf:value-equal(a1, a2) if and only if xf:value-equal(name(a1), name(a2)) and xf:value-equal(value(a1), value(a2)).
Given elements e1 and e2, xf:value-equal(e1, e2) if and only if xf:value-equal(name(e1), name(e2)) and xf:value-equal(attributes(e1), attributes(e2)) and xf:value-equal(children(e1), children(e2)).
Given two sequences (u1, ..., uj) and (v1, ..., vk), xf:value-equal((u1, ..., uj), (v1, ..., vk)) holds if and only if j = k and xf:value-equal(ui, vi) holds for all 1 <= i <= j.
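The recursive definition above can be sketched in Python. The classes Attribute and Element are hypothetical simplifications of data model nodes, Python lists stand in for sequences, and simple values are compared with the built-in equality operator; collations are not modeled. Attribute sequences are compared position by position, exactly as the sequence rule is written.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    value: str


@dataclass
class Element:
    name: str
    attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def value_equal(a, b):
    """Recursive value equality over simple values, nodes, and sequences."""
    if isinstance(a, list) and isinstance(b, list):
        # Sequences: same length and pairwise value-equal members.
        return len(a) == len(b) and all(
            value_equal(x, y) for x, y in zip(a, b)
        )
    if isinstance(a, Attribute) and isinstance(b, Attribute):
        return value_equal(a.name, b.name) and value_equal(a.value, b.value)
    if isinstance(a, Element) and isinstance(b, Element):
        return (
            value_equal(a.name, b.name)
            and value_equal(a.attributes, b.attributes)
            and value_equal(a.children, b.children)
        )
    if isinstance(a, (list, Attribute, Element)) or isinstance(
        b, (list, Attribute, Element)
    ):
        return False  # different kinds of value are never equal
    return a == b  # simple values


left = Element("family", [Attribute("name", "green")], [Element("father", [], ["peter"])])
right = Element("family", [Attribute("name", "green")], [Element("father", [], ["peter"])])
```

Here left and right are distinct objects, so they differ in identity, yet value_equal(left, right) holds.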
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
If the node identified by the value of $parameter1
occurs in document order before the node identified by the value of $parameter2
, this function returns true
; otherwise, the function returns false
. The rules determining "document order" can be found in [XQuery 1.0 and XPath 2.0 Data Model].
If the nodes identified by the values of the two arguments are from different documents, the result is implementation-dependent.
[Issue 117: Data model defines document order for multiple documents]
If the node identified by the value of $parameter1
occurs in document order after the node identified by the value of $parameter2
, this function returns true
; otherwise, the function returns false
. The rules determining "document order" can be found in [XQuery 1.0 and XPath 2.0 Data Model].
If the nodes identified by the values of the two arguments are from different documents, the result is implementation-dependent.
[Issue 117: Data model defines document order for multiple documents]
Returns a copy of the node that is the value of $srcval
including all its attributes and descendants; the copy has a different identity than the node indicated by the value of $srcval
.
[Issue 60: What are the precise semantics of the copy() function?]
$var = xf:copy($item1)
creates a node that is a copy of the value of $item1, including its attributes and descendants, gives it a different identity, and sets the value of $var equal to it. Assume that the value of $item1
was the element node:
<family name='green'> <father>peter</father> <mother>mary</mother> <child>joseph</child> </family>
The value of $var
would be
<family name='green'> <father>peter</father> <mother>mary</mother> <child>joseph</child> </family>
Returns a copy of the node that is the value of $srcval
including all its attributes but not its descendants; the copy has a different identity than the node indicated by the value of $srcval
.
$var = xf:shallow($item1)
creates a node that is a copy of $item1, including only its attributes and not its descendants, gives it a different identity, and sets the value of $var
equal to it. Assume that the value of $item1
was the element node:
<family name='green'> <father>peter</father> <mother>mary</mother> <child>joseph</child> </family>
The value of $var
would be
<family name='green'/>
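The difference between the deep and the shallow copy can be illustrated with the Python standard library. This is a sketch under the assumption that an ElementTree element is an acceptable stand-in for an element node; the function names deep_copy and shallow_copy are hypothetical and are not part of any library.

```python
import copy
import xml.etree.ElementTree as ET


def deep_copy(node):
    """Copy the node with its attributes and all of its descendants."""
    return copy.deepcopy(node)


def shallow_copy(node):
    """Copy the node and its attributes only; descendants are left out."""
    return ET.Element(node.tag, dict(node.attrib))


item1 = ET.fromstring(
    "<family name='green'><father>peter</father>"
    "<mother>mary</mother><child>joseph</child></family>"
)
deep = deep_copy(item1)
shallow = shallow_copy(item1)
```

Both copies are new objects, which mirrors the requirement that the copy have a different identity than the original node.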
Casts a node to a boolean. See 12.5 Miscellaneous casting functions.
[XML Schema Part 2: Datatypes] allows users to define datatypes that consist of space separated lists of values. The values must be of a specified datatype, called the itemType
, which must be an atomic datatype but may be a union type. In addition, it defines three legacy XML datatypes as lists: NMTOKENS, IDREFS and ENTITIES.
[XQuery 1.0 and XPath 2.0 Data Model] defines node sets, now renamed sequences, consisting of lists of nodes. There is a significant semantic difference between lists and sequences as defined in these two specifications: a list with a single value is still a list; a sequence with a single value is equivalent to the single value by itself.
For the purpose of this section, lists and sequences as defined above are referred to as sequences.
[Issue 82: Clarify distinction between node sets, lists, and sequences]
[Issue 89: Functions that have anyType in their return are problematic.]
The operators described in this section are defined on the following types.
Node sequences
User-defined list types
NMTOKENS
IDREFS
ENTITIES
The following constructors are defined for sequences.
Operator | Meaning |
TO | Returns the sequence containing every integer between the values of the operands. (Implemented as an infix operator.) |
The following operators are defined on sequences.
Operator | Meaning |
, | Infix operator. Concatenates two sequences. |
The following functions are defined on sequences.
Function | Meaning | Source |
xf:position | Returns an unsignedInt indicating the position of the given item in the given sequence. See also [Issue 100: Reconcile the definitions of these functions with XPath. ] | XPath 1.0 |
xf:last | Returns an unsignedInt indicating the last position in the given sequence. [Issue 100: Reconcile the definitions of these functions with XPath. ] | XPath 1.0 |
xf:item-at | Returns the item at given index. | XPath 2.0 Req 4.4 (Should) |
xf:index-of | Returns a sequence of unsignedInts, each of which is the index of a member of the specified sequence that is equal to the simple value or node that is the value of the second argument. If no members of the specified sequence are equal to the value of the second argument, the function returns an empty sequence. | XPath 2.0 Req 4.4 (Should) |
xf:empty | Indicates whether or not the provided sequence is empty. | XPath 2.0 Req 4.4 (Should) |
xf:exists | Indicates whether or not the provided sequence is not empty. | |
xf:identity-distinct | Returns a sequence in which all redundant duplicate elements, based on node identity, have been deleted. The specific node in a collection of redundant duplicate nodes that is retained is implementation-dependent. | XPath 2.0 Req 4.4 (Should) |
xf:value-distinct | Returns a sequence in which all redundant duplicate elements, based on value equality, have been deleted. The specific node in a collection of redundant duplicate nodes that is retained is implementation-dependent. | XPath 2.0 Req 4.4 (Should) |
xf:sort | Sorts into ascending order the elements in the provided sequence according to the values of the elements. | XPath 2.0 Req 4.4 (Should) |
xf:reverse-sort | Sorts into descending order the elements in the provided sequence according to the values of the elements. | XPath 2.0 Req 4.4 (Should) |
xf:insert | Inserts an element or sequence of elements into a specified position of a sequence. | XPath 2.0 Req 2.4, 4.4 (Should) |
xf:sublist-before | Returns the part of the first sequence that occurs before the first occurrence of the second sequence. | XPath 2.0 Req 4.4 (Should) |
xf:sublist-after | Returns the part of the first sequence that occurs after the last occurrence of the second sequence. | XPath 2.0 Req 4.4 (Should) |
xf:sublist | Returns a sequence located at a specified place in the value of a given sequence. | XPath 2.0 Req 4.4 (Should) |
xf:sequence-pad-beginning | Returns the sequence padded in front by copies of the third argument. | XPath 2.0 Req 2.4.2, 4.4 (Should) |
xf:sequence-pad-end | Returns the sequence padded at the end by copies of the third argument. | XPath 2.0 Req 2.4.2, 4.4 (Should) |
xf:truncate-beginning | Truncates the beginning of the sequence, resulting in a sequence whose number of elements equals the second argument. | XPath 2.0 Req 2.4, 4.4 (Should) |
xf:truncate-end | Truncates the end of the sequence, resulting in a sequence whose number of elements equals the second argument. | XPath 2.0 Req 2.4, 4.4 (Should) |
xf:resize-beginning | Truncates the beginning of a sequence, or pads a sequence at the beginning with the third argument, so that the sequence has a number of elements equal to the second argument. | XPath 2.0 Req 2.4, 4.4 (Should) |
xf:resize-end | Truncates the end of a sequence, or pads a sequence at the end with the third argument, so that the sequence has a number of elements equal to the second argument. | XPath 2.0 Req 2.4, 4.4 (Should) |
xf:unordered | This function provides a hint to the query optimizer that the order of the argument sequence is not important and can be ignored. | XQuery |
[Issue 63: Do we need variations of index-of for values and identity?]
[Issue 65: Are certain sequence functions both adequate and required?]
[Issue 66: A function to reorder a sequence into document order is needed]
[Issue 98: Need head() and tail() functions on sequences.]
[Issue 99: Remove sort and reverse-sort-functions. Covered by SORTBY operator in the language.]
As in the previous section, for the illustrative examples below, assume an XQuery operating on a Purchase Order document containing a number of item elements. The variable $seq
is bound to the sequence of item nodes in document order. The variables $item1
, $item2
, etc. are bound to individual item nodes in the sequence.
Returns an unsignedInt indicating the position of the $itemParam
in the $seqParam
.
If the $itemParam
is not contained in $seqParam
, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 130: Several functions might become second-order functions]
$seqParam: static type is anyType*
$posParam: static type is decimal. The effective value is floor($posParam).
Returns the item in the sequence that is the value of $seqParam
that is located at the index that is the value of $posParam
.
If the value of $posParam
is greater than the number of items in the sequence, or is less than or equal to zero, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
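The indexing rule can be sketched in Python as follows. The function name item_at is a hypothetical stand-in for xf:item-at, a Python list stands in for the sequence, and raising IndexError stands in for returning an error value.

```python
import math


def item_at(seq_param, pos_param):
    """Return the item at the 1-based position floor(pos_param)."""
    index = math.floor(pos_param)  # the effective value of the position
    if index <= 0 or index > len(seq_param):
        raise IndexError(f"position {index} is outside the sequence")
    return seq_param[index - 1]  # convert to Python's 0-based indexing


second = item_at(["a", "b", "c"], 2.7)
```

Because the effective value is floor($posParam), a position of 2.7 selects the second item.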
xf:index-of(anyType* $seqParam, anyType $srchParam)
=> unsignedInt*
xf:index-of(anyType* $seqParam, anyType $srchParam, anyURI $collation)
=> unsignedInt*
$seqParam: static type is anyType*
$srchParam: static type is anyType
$collation: static type is anyURI
If the value of $seqParam
contains simple values (that is, not nodes), then the function returns a sequence of values indicating the indexes (positions) of items in the value of $seqParam
that are equal to the simple value of $srchParam
. If the data types of the simple values are strings, then equality is determined according to the collation that is used.
If the value of $seqParam
contains nodes, then the function returns a sequence of values indicating the indexes (positions) of nodes whose string values are equal to the string value of the node in the second argument. Equality of string values is determined according to the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
The sequence must contain either simple values or nodes, not both. If the sequence contains both simple values and nodes, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $seqParam
is an empty sequence, then the returned value is an empty sequence.
The index is 1-based (not 0-based).
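For sequences of simple values, the behavior described above can be sketched in Python. The function name index_of and the optional key argument are hypothetical; the key function is only a rough stand-in for a collation (for example, str.casefold for case-insensitive matching) and is not how collations are identified in this document. Node sequences and the mixed-sequence error are not modeled.

```python
def index_of(seq_param, srch_param, key=None):
    """Return the 1-based positions of items equal to srch_param."""
    normalize = key if key is not None else (lambda value: value)
    target = normalize(srch_param)
    return [
        position
        for position, item in enumerate(seq_param, start=1)
        if normalize(item) == target
    ]


positions = index_of([10, 20, 30, 20], 20)
```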
xf:value-distinct(anyType* $srcval)
=> anyType*
xf:value-distinct(anyType* $srcval, anyURI $collation)
=> anyType*
Returns the sequence that results from removing from $srcval all but the first of a set of nodes that are equal to one another, based on the nodes' values (that is, using value-equal()). If the sequence is a sequence of nodes, then the typed values of the nodes are used for the comparison. The specific node in a collection of nodes having equal values that is retained is implementation-dependent. Equality of string values is determined according to the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
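A minimal Python sketch of duplicate removal by value follows. The function name value_distinct and the optional key argument are hypothetical; the key function is a rough stand-in for a collation. The sketch keeps the first item of each group of equal values, which is one permitted choice given that the retained item is implementation-dependent.

```python
def value_distinct(srcval, key=None):
    """Remove later items whose value equals that of an earlier item."""
    normalize = key if key is not None else (lambda value: value)
    seen = []  # a list, so that unhashable values can also be compared
    kept = []
    for item in srcval:
        value = normalize(item)
        if value not in seen:
            seen.append(value)
            kept.append(item)
    return kept


distinct = value_distinct([1, 2, 1, 3, 2])
```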
Sorts into ascending order the sequence that is the value of $srcval
using the typed values of the elements of the sequence.
The sequence must be a sequence of comparable objects.
If the sequence contains any two items that are not comparable with one another, then an error value is returned as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
is a sequence of string values, or if the value of $srcval
is a sequence of nodes that are compared according to their string values, then the ordering is performed based on the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
Sorts into descending order the sequence that is the value of $srcval
using the typed values of the elements of the sequence.
The sequence must be a sequence of comparable objects.
If the sequence contains any two items that are not comparable with one another, then an error value is returned as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If the value of $srcval
is a sequence of string values, or if the value of $srcval
is a sequence of nodes that are compared according to their string values, then the ordering is performed based on the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
[Issue 64: Should reverse-sort be replaced by generic reverse?]
$target: static type is anyType*
$position: static type is decimal. The effective value is computed as floor($position).
$inserts: static type is anyType*
Returns a new sequence constructed from the value of $target
with the value of $inserts
inserted at the position specified by the value of $position
. (The value of $target
is not affected by the sequence construction.)
Let the effective value of $position
be N.
If N is less than zero, the effective value of N is zero. If N is greater than the number of items in $target
, then the effective value of N is equal to the number of items in $target
.
The value returned by the function consists of all items of $target
whose index is less than or equal to N, followed by all items of $inserts
, followed by the remaining elements of $target
, in that sequence.
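The clamping and splicing rules above can be sketched in Python as follows. This is only an illustrative model, not part of the specification: sequences are represented as Python lists, and the name `insert_items` is hypothetical.

```python
import math

def insert_items(target, position, inserts):
    """Return a new list with `inserts` placed after the first N items of `target`."""
    n = math.floor(position)           # effective value of $position
    n = max(0, min(n, len(target)))    # clamp N to the range [0, number of items]
    # Items whose 1-based index is <= N, then the inserts, then the remaining items.
    return list(target[:n]) + list(inserts) + list(target[n:])

print(insert_items(["a", "b", "c"], 1.7, ["x", "y"]))
```

Because a new list is built, the list passed as `target` is left unchanged, which mirrors the statement that the value of $target is not affected.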
xf:sublist-before(anyType* $parameter1, anyType* $parameter2)
=> anyType*
xf:sublist-before(anyType* $parameter1, anyType* $parameter2, anyURI $collation)
=> anyType*
$parameter1 — static type is anyType*
$parameter2 — static type is anyType*
$collation — static type is anyURI
If there is at least one contiguous sequence of elements in the value of $parameter1 that are, taken in order, equal to the elements of the value of $parameter2, then the function returns a sequence constructed of every element of the value of $parameter1, taken in order, that occurs before the first such contiguous sequence of elements in the value of $parameter1. In sequences of simple values, elements are compared for equality by value. In sequences of nodes, they are compared by identity.
If the value of $parameter1 does not contain such a contiguous sequence of elements, then the function returns the empty sequence.
If the values of $parameter1 and $parameter2 are sequences of string values, or if the values of $parameter1 and $parameter2 are sequences of nodes that are compared according to their string values, then the comparison is performed based on the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
[Issue 130: Several functions might become second-order functions]
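As a rough illustration of the xf:sublist-before rule, the following Python sketch searches for the first contiguous match and returns what precedes it. It assumes sequences of simple values held in Python lists, compares items with `==`, and ignores collations entirely; the name `sublist_before` is hypothetical.

```python
def sublist_before(seq, pattern):
    """Items of `seq` that precede the first contiguous occurrence of `pattern`."""
    m = len(pattern)
    for start in range(len(seq) - m + 1):
        if seq[start:start + m] == pattern:
            return seq[:start]
    return []  # no occurrence: the empty sequence

print(sublist_before([1, 2, 3, 4, 2, 3, 5], [2, 3]))
```

The text above does not say what happens when the second argument is empty; in this sketch an empty pattern matches at the first position, so the result is the empty list.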
xf:sublist-after(anyType* $parameter1, anyType* $parameter2)
=> anyType*
xf:sublist-after(anyType* $parameter1, anyType* $parameter2, anyURI $collation)
=> anyType*
$parameter1 — static type is anyType*
$parameter2 — static type is anyType*
$collation — static type is anyURI
If there is at least one contiguous sequence of elements in the value of $parameter1 that are, taken in order, equal to the elements of the value of $parameter2, then the function returns a sequence constructed of every element of the value of $parameter1, taken in order, that occurs after the last such contiguous sequence of elements in the value of $parameter1. In sequences of simple values, elements are compared for equality by value. In sequences of nodes, they are compared by identity.
If the value of $parameter1 does not contain such a contiguous sequence of elements, then the function returns the empty sequence.
If the values of $parameter1 and $parameter2 are sequences of string values, or if the values of $parameter1 and $parameter2 are sequences of nodes that are compared according to their string values, then the comparison is performed based on the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
[Issue 130: Several functions might become second-order functions]
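The xf:sublist-after rule differs from xf:sublist-before in that it is anchored on the last match rather than the first. A minimal Python sketch, under the same assumptions (lists of simple values, `==` comparison, no collation support, hypothetical name `sublist_after`):

```python
def sublist_after(seq, pattern):
    """Items of `seq` that follow the last contiguous occurrence of `pattern`."""
    m = len(pattern)
    # Scan candidate start positions from right to left to find the last match.
    for start in range(len(seq) - m, -1, -1):
        if seq[start:start + m] == pattern:
            return seq[start + m:]
    return []  # no occurrence: the empty sequence

print(sublist_after([1, 2, 3, 4, 2, 3, 5], [2, 3]))
```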
xf:sublist(anyType* $sourceSeq, decimal $startingLoc)
=> anyType*
xf:sublist(anyType* $sourceSeq, decimal $startingLoc, decimal $length)
=> anyType*
$sourceSeq — static type is anyType*
$startingLoc — static type is decimal. The effective value is computed as floor($startingLoc).
$length — static type is decimal. The effective value is computed as floor($length).
Returns the contiguous sequence of items in the value of $sourceSeq
beginning at the position indicated by the value of $startingLoc
and continuing for the number of items indicated by the value of $length
.
If $length
is not specified, then the sublist identifies items to the end of $sourceSeq
.
The value of $length can be greater than the number of items in the value of $sourceSeq following the beginning position, in which case the sublist identifies items to the end of $sourceSeq.
The first item of a sequence is located at position 1 (not position 0).
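The 1-based positioning can be illustrated with a short Python sketch. It assumes list-valued sequences and a hypothetical name `sublist`. The text does not define behaviour for a starting position below 1 or for a negative length, so the sketch rejects the former and treats the latter as zero; both choices are assumptions.

```python
import math

def sublist(source, starting_loc, length=None):
    """Contiguous items of `source` from a 1-based position, optionally length-limited."""
    start = math.floor(starting_loc)          # effective value of $startingLoc
    if start < 1:
        raise ValueError("this sketch assumes a starting position of at least 1")
    first = start - 1                         # convert to a 0-based index
    if length is None:
        return source[first:]                 # run to the end of the sequence
    count = max(0, math.floor(length))        # assumption: negative lengths select nothing
    return source[first:first + count]        # slicing stops at the end of the list

print(sublist([10, 20, 30, 40], 2, 2))
```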
xf:sequence-pad-beginning(anyType* $srcval, decimal $padCount, anyType $padItem)
=> anyType*
$srcval — static type is anyType*
$padCount — static type is decimal. The effective value is computed as cast as unsignedInt(floor($padCount)).
$padItem — static type is anyType
Returns a new sequence constructed from the value of $srcval
in which zero or more copies of the value of $padItem
have been inserted preceding the first item in the value of $srcval
. The number of copies of the value of $padItem
that are inserted is equal to the value of $padCount
.
If the value of $padItem
has a type that is not the type of the items in the value of $srcval
, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 130: Several functions might become second-order functions]
xf:sequence-pad-end(anyType* $srcval, decimal $padCount, anyType $padItem)
=> anyType*
$srcval — static type is anyType*
$padCount — static type is decimal. The effective value is computed as cast as unsignedInt(floor($padCount)).
$padItem — static type is anyType
Returns a new sequence constructed from the value of $srcval
in which zero or more copies of the value of $padItem
have been inserted following the last item in the value of $srcval
. The number of copies of the value of $padItem
that are inserted is equal to the value of $padCount
.
If the value of $padItem
has a type that is not the type of the items in the value of $srcval
, then the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 130: Several functions might become second-order functions]
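The two padding functions can be modelled in Python roughly as follows. This is a sketch under stated assumptions: sequences are lists, Python exceptions stand in for the error value, "same type" is approximated by comparing Python types, and the names `sequence_pad_beginning` and `sequence_pad_end` are hypothetical.

```python
import math

def _padding(srcval, pad_count, pad_item):
    """Build the list of pad items after checking the count and the item type."""
    count = math.floor(pad_count)
    if count < 0:
        # Stand-in for a failed cast to unsignedInt.
        raise ValueError("pad count must not be negative")
    if any(type(item) is not type(pad_item) for item in srcval):
        # Stand-in for the error value returned on a type mismatch.
        raise TypeError("pad item type differs from the sequence item type")
    return [pad_item] * count

def sequence_pad_beginning(srcval, pad_count, pad_item):
    return _padding(srcval, pad_count, pad_item) + list(srcval)

def sequence_pad_end(srcval, pad_count, pad_item):
    return list(srcval) + _padding(srcval, pad_count, pad_item)

print(sequence_pad_beginning([1, 2], 2.9, 0))
```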
$srcval — static type is anyType*
$length — static type is decimal. The effective value is computed as cast as unsignedInt(floor($length)).
Returns a sequence constructed from the sequence that is the value of $srcval
, removing elements from the beginning of the sequence until the number of remaining elements is equal to the effective value of $length
.
If the number of elements in the sequence that is the value of $srcval
is less than or equal to the value of $length
, then the returned sequence is the value of $srcval
.
[Issue 130: Several functions might become second-order functions]
$srcval — static type is anyType*
$length — static type is decimal. The effective value is computed as cast as unsignedInt(floor($length)).
Returns a sequence constructed from the sequence that is the value of $srcval
, removing elements from the end of the sequence until the number of remaining elements is equal to the effective value of $length
.
If the number of elements in the sequence that is the value of $srcval
is less than or equal to the value of $length
, then the returned sequence is the value of $srcval
.
[Issue 130: Several functions might become second-order functions]
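The two truncation behaviours described above (removing from the beginning and removing from the end) might be sketched in Python as shown below. The function names are hypothetical, sequences are modelled as lists, and a negative length is rejected as a stand-in for a failed cast to unsignedInt.

```python
import math

def _effective_length(length):
    n = math.floor(length)
    if n < 0:
        raise ValueError("length must not be negative")
    return n

def truncate_beginning(srcval, length):
    """Drop items from the front until at most `length` items remain."""
    n = _effective_length(length)
    if len(srcval) <= n:
        return list(srcval)
    return list(srcval[len(srcval) - n:])   # keep the last n items (n may be 0)

def truncate_end(srcval, length):
    """Drop items from the back until at most `length` items remain."""
    n = _effective_length(length)
    return list(srcval[:n])                 # keep the first n items

print(truncate_beginning([1, 2, 3, 4], 2), truncate_end([1, 2, 3, 4], 2))
```

The explicit `len(srcval) - n` index avoids the Python pitfall where `srcval[-0:]` would return the whole list instead of an empty one.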
xf:resize-beginning(anyType* $srcval, decimal $finalLength, anyType $padItem)
=> anyType*
$srcval — static type is anyType*
$finalLength — static type is decimal. The effective value is computed as cast as unsignedInt(floor($finalLength)).
$padItem — static type is anyType
Returns a sequence constructed from the sequence that is the value of $srcval
.
If the number of elements in the value of $srcval
is greater than the value of $finalLength
, then elements are removed from the beginning of the value of $srcval
until the number of remaining elements is equal to the value of $finalLength
.
If the number of elements in the value of $srcval
is equal to the value of $finalLength
, then the returned sequence is the value of $srcval
.
If the number of elements in the value of $srcval
is less than the value of $finalLength
, then the returned sequence is modified by adding instances of the value of $padItem
at the beginning of the returned sequence until the number of elements of the returned sequence is equal to the value of $finalLength
.
[Issue 130: Several functions might become second-order functions]
NOTE: This is a "convenience function" that effectively performs a pad-beginning or a truncate-beginning as required to fulfil the $finalLength criterion.
xf:resize-end(anyType* $srcval, decimal $finalLength, anyType $padItem)
=> anyType*
$srcval — static type is anyType*
$finalLength — static type is decimal. The effective value is computed as cast as unsignedInt(floor($finalLength)).
$padItem — static type is anyType
Returns a sequence constructed from the sequence that is the value of $srcval
.
If the number of elements in the value of $srcval
is greater than the value of $finalLength
, then elements are removed from the end of the value of $srcval
until the number of remaining elements is equal to the value of $finalLength
.
If the number of elements in the value of $srcval
is equal to the value of $finalLength
, then the returned sequence is the value of $srcval
.
If the number of elements in the value of $srcval
is less than the value of $finalLength
, then the returned sequence is modified by appending instances of the value of $padItem
at the end of the returned sequence until the number of elements of the returned sequence is equal to the value of $finalLength
.
[Issue 130: Several functions might become second-order functions]
NOTE: This is a "convenience function" that effectively performs a pad-end or a truncate-end as required to fulfil the $finalLength criterion.
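Since both resize functions are described as a choice between padding and truncating, a compact Python model may help. It assumes list-valued sequences and a non-negative effective $finalLength, and the names `resize_beginning` and `resize_end` are hypothetical.

```python
import math

def resize_beginning(srcval, final_length, pad_item):
    """Truncate or pad at the beginning so the result has `final_length` items."""
    n = math.floor(final_length)             # assumed to be non-negative
    items = list(srcval)
    if len(items) > n:
        return items[len(items) - n:]        # remove items from the beginning
    return [pad_item] * (n - len(items)) + items

def resize_end(srcval, final_length, pad_item):
    """Truncate or pad at the end so the result has `final_length` items."""
    n = math.floor(final_length)             # assumed to be non-negative
    items = list(srcval)
    if len(items) > n:
        return items[:n]                     # remove items from the end
    return items + [pad_item] * (n - len(items))

print(resize_beginning([1, 2], 4, 0), resize_end([1, 2, 3], 2, 0))
```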
Function | Meaning | Source |
xf:sequence-value-equal | Returns true if the two arguments have the same value. | Data Model |
xf:sequence-node-equal | Returns true if the two arguments have the same nodes. | Data Model |
xf:union | Returns the union of the two sequence arguments, eliminating duplicates. | XPath 2.0 Req 1.5 (Should) |
xf:union-all | Returns the union of the two sequence arguments without eliminating duplicates. | |
xf:intersect | Returns the intersection of the two sequence arguments, eliminating duplicates. XPath uses "&" for this. | XPath 2.0 Req 1.5 (Should) |
xf:intersect-all | Returns the intersection of the two sequence arguments without eliminating duplicates. | |
xf:except | Returns the difference of the two sequence arguments, eliminating duplicates. | XPath 2.0 Req 1.5 (Should) |
xf:except-all | Returns the difference of the two sequence arguments without eliminating duplicates. | |
[Issue 91: Need value-based functions for Union, Intersect and Except.]
[Issue 102: Need operators for UNION, INTERSECT and EXCEPT.]
[Issue 132: union(), intersect(), and except(): only for simple values?]
As in the previous sections, for the illustrative examples below, assume an XQuery operating on a Purchase Order document containing a number of item elements. The variables $item1, $item2, etc. are bound to individual item nodes in the sequence. We shall use sequences of these nodes in the examples below.
xf:sequence-value-equal(anyType* $parameter1, anyType* $parameter2)
=> boolean
xf:sequence-value-equal(anyType* $parameter1, anyType* $parameter2, anyURI $collation)
=> boolean
$parameter1 — static type is anyType*
$parameter2 — static type is anyType*
$collation — static type is anyURI
If the sequences that are the values of $parameter1
and $parameter2
have the same values (that is, they have the same number of items and items in corresponding positions in the two sequences are value-equal), then the function returns true
; otherwise, the function returns false
. Returns the empty sequence if one or both of its arguments is the empty sequence.
String values are compared according to the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
If the sequences that are the values of $parameter1
and $parameter2
have the same nodes as content (that is, they have the same number of items and items in corresponding positions in the two sequences are the identical nodes), then the function returns true
; otherwise, the function returns false
. Returns the empty sequence if one or both of its arguments is the empty sequence.
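The distinction between value equality and node identity can be illustrated in Python, where `==` compares values and `is` compares object identity. In this sketch, which is an analogy rather than a definition, nodes are modelled as small Python lists, `None` stands in for the empty sequence, collations are ignored, and the function names are hypothetical.

```python
def sequence_value_equal(seq1, seq2):
    """True when both sequences have equal values in corresponding positions."""
    if not seq1 or not seq2:
        return None                      # stands in for the empty sequence
    return len(seq1) == len(seq2) and all(a == b for a, b in zip(seq1, seq2))

def sequence_node_equal(seq1, seq2):
    """True when both sequences hold the identical objects in corresponding positions."""
    if not seq1 or not seq2:
        return None                      # stands in for the empty sequence
    return len(seq1) == len(seq2) and all(a is b for a, b in zip(seq1, seq2))

item1 = ["pen"]   # two distinct "nodes" that happen to have the same value
item2 = ["pen"]
print(sequence_value_equal([item1], [item2]), sequence_node_equal([item1], [item2]))
```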
Constructs a sequence containing every element that occurs in the values of $parameter1 or of $parameter2, eliminating redundant duplicate elements. The specific element in a collection of redundant duplicate elements that is retained is implementation-dependent.
Elements are compared using xf:node-equal()
.
The returned sequence is a sequence of nodes (that is, not of the values of the nodes) in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Note: XPath uses "|" for union().
Constructs a sequence containing every element that occurs in the values of $parameter1
or of $parameter2
, retaining redundant duplicate elements (that is, duplicates are not eliminated).
Elements are compared using xf:node-equal()
.
The returned sequence is in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Constructs a sequence containing every element that occurs in the values of both $parameter1 and $parameter2, eliminating redundant duplicate elements. The specific element in a collection of redundant duplicate elements that is retained is implementation-dependent.
Elements are compared using xf:node-equal()
.
The returned sequence is in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Constructs a sequence containing every element that occurs in the values of both $parameter1
and $parameter2
, retaining redundant duplicate elements (that is, duplicates are not eliminated).
The number of each redundant duplicate element that is included in the result is the minimum of the number of occurrences of the element in $parameter1
and the number of occurrences of the element in $parameter2
.
Elements are compared using xf:node-equal()
.
The returned sequence is in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Constructs a sequence containing every element that occurs in the values of $parameter1, but not in the value of $parameter2, eliminating redundant duplicate elements. The specific element in a collection of redundant duplicate elements that is retained is implementation-dependent.
Elements are compared using xf:node-equal()
.
The returned sequence is in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Constructs a sequence containing every element that occurs in the values of $parameter1
, but not in the value of $parameter2
, retaining redundant duplicate elements (that is, duplicates are not eliminated).
The number of each redundant duplicate element that is included in the result is the difference between the number of occurrences of the element in $parameter1
and the number of occurrences of the element in $parameter2
.
Elements are compared using xf:node-equal()
.
The returned sequence is in document order, as defined in [XQuery 1.0 and XPath 2.0 Data Model].
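The duplicate-counting rules for the "-all" variants behave like multiset operations. The Python sketch below is a loose model only: each node is represented by an integer giving its position in document order, so that sorting reproduces document order and integer equality stands in for xf:node-equal(). The function names are hypothetical.

```python
from collections import Counter

def union_distinct(seq1, seq2):
    """Every node in either sequence, duplicates eliminated, in document order."""
    return sorted(set(seq1) | set(seq2))

def intersect_all(seq1, seq2):
    """Each node kept min(count in seq1, count in seq2) times."""
    return sorted((Counter(seq1) & Counter(seq2)).elements())

def except_all(seq1, seq2):
    """Each node kept (count in seq1 - count in seq2) times, never fewer than zero."""
    return sorted((Counter(seq1) - Counter(seq2)).elements())

print(intersect_all([1, 1, 2, 3], [1, 1, 1, 3, 4]))
print(except_all([1, 1, 1, 2], [1, 2, 2]))
```

Treating a negative difference as zero occurrences is an assumption of this sketch; the text above speaks only of "the difference" between the two counts.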
Function | Meaning | Source |
xf:count | Returns the number of items in the sequence. | XPath 1.0 |
xf:avg | Returns the average of a sequence of numbers. | XSLT 2.0 Req. 1.4 (Must) |
xf:max | Returns the object with maximum value from a collection of comparable objects. | XSLT 2.0 Req. 1.4 (Must) |
xf:min | Returns the object with minimum value from a collection of comparable objects. | XSLT 2.0 Req. 1.4 (Must) |
xf:sum | Returns the sum of a sequence of numbers. | XSLT 1.0 |
Returns the number of items in the value of $srcval
.
[Issue 67: Should duplicates be eliminated for count() and sum()?]
If every item in the value of $srcval
is a number, then the function returns the average of the numbers (computed as sum($srcval) div count($srcval)
).
Otherwise, the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
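The definition of the average as sum($srcval) div count($srcval) can be sketched in Python as follows. The sketch assumes a non-empty list, uses a Python exception as a stand-in for the error value, and uses the hypothetical name `avg`.

```python
from numbers import Number

def avg(srcval):
    """Average of a non-empty list of numbers, computed as sum divided by count."""
    for item in srcval:
        # bool is excluded explicitly because Python treats it as a kind of int.
        if isinstance(item, bool) or not isinstance(item, Number):
            raise TypeError("every item must be a number")
    return sum(srcval) / len(srcval)

print(avg([1, 2, 3, 6]))
```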
xf:max(anyType* $srcval)
=> anyType
xf:max(anyType* $srcval, anyURI $collation)
=> anyType
Returns the item in the value of $srcval
whose value is greater than the value of every other item in the value of $srcval
. If there are two or more such items, then the specific item whose value is returned is implementation-dependent.
If the items in the value of $srcval
are strings, then the determination of the greatest item is made according to the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
xf:min(anyType* $srcval)
=> anyType
xf:min(anyType* $srcval, anyURI $collation)
=> anyType
Returns the item in the value of $srcval
whose value is less than the value of every other item in the value of $srcval
. If there are two or more such items, then the specific item whose value is returned is implementation-dependent.
If the items in the value of $srcval
are strings, then the determination of the least item is made according to the collation that is used.
If $collation
is specified, then the value of $collation
must identify a collation that is supported in the environment in which the function is invoked. Either an absolute URI or a relative URI may be specified. The collation identified by $collation
, while effectively based on the Unicode Collation Algorithm, has implementation-defined semantics.
If no collation is specified, then the default collation is used. The default collation is determined according to the rules in TO BE DETERMINED.
If every item in the value of $srcval
is a number, then the function returns the sum of those numbers.
Otherwise, the function returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 67: Should duplicates be eliminated for count() and sum()?]
Function | Meaning | Source |
xf:id | Returns the sequence of nodes having unique IDs that match the IDREFs represented by the argument sequence. | XPath 1.0 |
xf:idref | Returns the sequence of nodes with IDREFs matching the items in the argument sequence. | XSLT 2.0 Req. 2.11 (Could) |
xf:filter | Returns a shallow copy of the nodes that are selected by the expression argument, preserving any relationships that exist among these nodes. | XQuery |
xf:document | Treats its string argument as a URI Reference and returns the root node of the referenced document. | XSLT 1.0 |
[Issue 43: Are function names ID/id and IDREF/idref confusing?]
[Issue 68: The id-nodes() and id-NMTOKENS() functions must specify result documents]
Returns the sequence of element nodes with an IDREF value
matching the value of one of the items in the sequence argument or an IDREFS value containing an IDREF matching the value of one of the items in the sequence argument. If the value of $srcval
is a single string, it behaves as though a sequence of length one of strings was supplied. This function allows reverse navigation from IDs to IDREFs.
The filter function returns a shallow copy of the nodes that are selected by the expression that is the value of $srcval, preserving any relationships that exist among these nodes.
Suppose that the argument to filter is a path expression that selects nodes X, Y, and Z from some document. Suppose that, in the original document, nodes Y and Z are descendants (at any level) of node X. Then the result of filter is a copy of node X, with copies of nodes Y and Z as its immediate children. Any other intervening nodes from the original document are not included in the result. The name filter suggests a function that operates on a document to extract the parts that are of interest and discard the remainder, while retaining the structure of the original document.
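The X, Y, Z scenario can be modelled with a small tree walk in Python. This is a sketch under stated assumptions, not the function's definition: the `Node` class and the name `filter_tree` are hypothetical, a node carries only a name and a list of children, and each copy is attached to the copy of its nearest selected ancestor.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False keeps comparison identity-based, like node identity
class Node:
    name: str
    children: list = field(default_factory=list)

def filter_tree(root, selected):
    """Shallow-copy the selected nodes, nesting each under its nearest selected ancestor."""
    chosen = {id(node) for node in selected}
    top_level = []

    def walk(node, nearest_copy):
        if id(node) in chosen:
            copy = Node(node.name)   # shallow copy: the original children are not carried over
            siblings = top_level if nearest_copy is None else nearest_copy.children
            siblings.append(copy)
            nearest_copy = copy
        for child in node.children:
            walk(child, nearest_copy)

    walk(root, None)
    return top_level

# Y and Z are descendants of X, separated from it by intervening nodes A, B and C.
y, z = Node("Y"), Node("Z")
x = Node("X", [Node("A", [y]), Node("B", [Node("C", [z])])])
result = filter_tree(Node("doc", [x]), [x, y, z])
print(result[0].name, [child.name for child in result[0].children])
```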
Treats the string value of $srcval as a URI reference and returns the root node of the referenced document. Returns an error if the document cannot be accessed. [XSLT 1.0] allows many other kinds of arguments and returns a sequence of nodes.
Cast functions or cast operators take an expression as their argument and return a value of a given type. There are two basic differences from constructors: casting takes an expression rather than a literal as its argument, and validity checking is done at run time rather than at compile time.
[Issue 17: Is CAST a function/operator for this document?]
In this specification, casting is available for conversions between certain combinations of the primitive and derived types defined by [XML Schema Part 2: Datatypes]. The type conversions that are supported are indicated in the following tables. In these tables, there is a row for each primitive and derived type for which a conversion is defined with that type as the source of the conversion. In addition, there is a column for each primitive and derived type for which a conversion is defined with that type as the target of the conversion. The intersections of rows and columns contain one of three characters: "Y" indicates that a conversion from values of the type to which the row applies to the type to which the column applies is supported; "N" indicates that there are no supported conversions from values of the type to which the row applies to the type to which the column applies; and "M" indicates that a conversion from values of the type to which the row applies to the type to which the column applies may be supported, subject to restrictions given in this section of this specification. (Temporarily, while this specification is under development, there may be table entries containing "?", indicating that it has not yet been determined whether or not a conversion is provided.)
In the following tables, the columns and rows are identified by short codes that identify simple (both primitive and derived) types as follows:
aURI = anyURI
b64 = base64Binary
bool = boolean
byt = byte
dat = date
Day = gDay
dbl = double
dec = decimal
dT = dateTime
dur = duration
ENS = ENTITIES
ENT = ENTITY
flt = float
hxB = hexBinary
ID = ID
IDR = IDREF
IDS = IDREFS
int = int
itg = integer
lan = language
lng = long
MD = gMonthDay
Mon = gMonth
Nam = Name
NC = NCName
nI = negativeInteger
NMS = NMTOKENS
NMT = NMTOKEN
nNI = nonNegativeInteger
NOT = NOTATION
nPI = nonPositiveInteger
nStr = normalizedString
pI = positiveInteger
QN = QName
sh = short
str = string
tim = time
tok = token
uB = unsignedByte
uI = unsignedInt
uL = unsignedLong
uS = unsignedShort
YM = gYearMonth
Yr = gYear
In each of the following tables, the notation "S\T" indicates that the source ("S") of the conversion is indicated in the column below the notation and that the target ("T") is indicated in the row to the right of the notation.
[Issue 83: Are the cast tables appropriately structured and ordered?]
The following table covers conversions to string and its derived types.
S\T | str | nStr | tok | lan | Nam | NC | ID | IDR | IDS | ENT | ENS | NMT | NMS |
str | Y | Y | Y | M | M | M | N | N | N | N | N | M | M |
nStr | Y | Y | Y | M | M | M | N | N | N | N | N | M | M |
tok | Y | Y | Y | M | M | M | N | N | N | N | N | M | M |
lan | Y | Y | Y | Y | Y | Y | N | N | N | N | N | Y | Y |
Nam | Y | Y | Y | Y | Y | Y | N | N | N | N | N | Y | Y |
NC | Y | Y | Y | M | M | Y | N | N | N | N | N | Y | Y |
ID | Y | Y | Y | M | Y | Y | N | N | N | N | N | Y | Y |
IDR | Y | Y | Y | M | Y | Y | N | N | N | N | N | Y | Y |
IDS | Y | Y | Y | M | M | M | N | N | N | N | N | M | Y |
ENT | Y | Y | Y | M | Y | Y | N | N | N | N | N | Y | Y |
ENS | Y | Y | Y | M | M | M | N | N | N | N | N | M | Y |
NMT | Y | Y | Y | M | Y | Y | N | N | N | N | N | Y | Y |
NMS | Y | Y | Y | M | M | M | N | N | N | N | N | M | Y |
flt | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
dbl | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
dec | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
itg | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
nPI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
nI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
lng | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
int | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
sh | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
byt | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
nNI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
uL | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
uI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
uS | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
uB | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
pI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
dur | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
dT | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
tim | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
dat | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
YM | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
Yr | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
MD | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
Day | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
Mon | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
bool | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
b64 | Y | Y | Y | M | M | M | M | M | M | M | M | M | M |
hxB | Y | Y | Y | M | M | M | M | M | M | M | M | M | M |
aURI | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
QN | Y | Y | Y | N | N | N | N | N | N | N | N | N | N |
NOT | Y | Y | Y | M | M | M | M | M | M | M | M | M | M |
As specified in the preceding table, casting is supported from almost every data type (the exception is base64Binary
) into the primitive type string
and into its derived types normalizedString
and token
. Conversion to other types derived from string
depends on factors considered below. (If the entry in the preceding table is "N", then no cast is supported; consequently, the following specifications do not discuss those conversions.)
When a value of any simple type is cast to string
, the derivation of the string
value TV depends on the source type ST and on the source value SV, as follows.
If ST is string
, normalizedString
, token
, language
, Name
, NCName
, ID
, IDREF
, IDREFS
, ENTITY
, ENTITIES
, NMTOKEN
, or NMTOKENS
, then TV is SV.
If ST is float
, double
, decimal
, integer
, nonPositiveInteger
, negativeInteger
, long
, int
, short
, byte
, nonNegativeInteger
, unsignedLong
, unsignedInt
, unsignedShort
, unsignedByte
, or positiveInteger
, then TV is the canonical representation of SV, as defined by [XML Schema Part 2: Datatypes].
If ST is duration
, then TV is the lexical representation of SV, as defined by [XML Schema Part 2: Datatypes], in which each integer and decimal component is expressed in its canonical representation.
If ST is dateTime
or time
, then TV is the canonical representation of SV, as defined by [XML Schema Part 2: Datatypes].
If ST is date
, gYearMonth
, gYear
, gMonthDay
, gDay
, or gMonth
, then TV is the lexical representation of SV, as defined by [XML Schema Part 2: Datatypes].
If ST is boolean
, then TV is "true" if SV is true and "false" if SV is false.
If ST is hexBinary
, then TV is the canonical representation of SV, as defined by [XML Schema Part 2: Datatypes].
If ST is anyURI
, then TV is the lexical representation of SV, as defined in [XML Schema Part 2: Datatypes], with each space replaced by the sequence "%20".
If ST is QName
or NOTATION
, then TV is SV.
When a value of any simple type is cast to normalizedString, the normalizedString value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type string
.
IV is converted to normalizedString
by removing from it all carriage return codes (#xD), all line feed codes (#xA), and all tab characters (#x9).
When a value of any simple type is cast to token, the token value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type string
.
IV is converted to token by removing from it all line feed codes (#xA) and all tab characters (#x9), then removing all spaces (#x20) that precede all non-space characters (leading spaces) and all spaces that follow all non-space characters (trailing spaces), and then replacing all sequences of two or more spaces (#x20) with a single space.
[Issue 84: When casting to token, are linefeed/tab converted to space?]
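The whitespace steps for normalizedString and token, exactly as worded in this draft (characters are removed rather than replaced by spaces, which is the point raised in Issue 84), can be sketched in Python. The intermediate string IV is assumed to be available already, and the function names are hypothetical.

```python
import re

def to_normalized_string(iv):
    """Remove carriage returns (#xD), line feeds (#xA) and tabs (#x9)."""
    return iv.translate({0x0D: None, 0x0A: None, 0x09: None})

def to_token(iv):
    """Remove line feeds and tabs, trim spaces, then collapse runs of spaces."""
    text = iv.translate({0x0A: None, 0x09: None})
    text = text.strip(" ")                 # leading and trailing spaces (#x20) only
    return re.sub(" {2,}", " ", text)      # two or more spaces become one

print(repr(to_normalized_string("a\tb\r\nc")))
print(repr(to_token("  a\tb   c \n")))
```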
When a value of any simple type is cast to language
, the language
value TV is derived from the source type ST and the source value SV as follows:
If ST is language
, then TV is SV and the conversion is complete.
SV is converted to an intermediate value IV of type NMTOKEN
.
If IV is not a valid value for language
as defined in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to Name
, the Name
value TV is derived from the source type ST and the source value SV as follows:
If ST is Name, then TV is SV and the conversion is complete.
SV is converted to an intermediate value IV of type string
.
If IV is not a valid value for Name
as defined in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to NCName
, the NCName
value TV is derived from the source type ST and the source value SV as follows:
If ST is NCName
, ID
, or IDREF
, then TV is SV and the conversion is complete.
SV is converted to an intermediate value IV of type string
.
If IV is not a valid value for NCName
as defined in [Namespaces in XML], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to NMTOKEN
, the NMTOKEN
value TV is derived from the source type ST and the source value SV as follows:
If ST is NMTOKEN
, then TV is SV and the conversion is complete.
SV is converted to an intermediate value IV of type string
.
If IV is not a valid value for NMTOKEN
as defined in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to NMTOKENS
, the NMTOKENS
value TV is derived from the source type ST and the source value SV as follows:
If ST is NMTOKEN
or NMTOKENS
, then TV is SV and the conversion is complete.
SV is converted to an intermediate value IV of type string
.
If IV is not a valid value for NMTOKENS
as defined in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
The following table covers conversions to float and to double, and to decimal and its derived types.
S\T | flt | dbl | dec | itg | nPI | nI | lng | int | sh | byt | nNI | uL | uI | uS | uB | pI |
str | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
nStr | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
tok | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
lan | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
Nam | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
NC | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
ID | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
IDR | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
IDS | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
ENT | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
ENS | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
NMT | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
NMS | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
flt | Y | Y | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
dbl | M | Y | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
dec | Y | Y | Y | M | M | M | M | M | M | M | M | M | M | M | M | M |
itg | Y | Y | Y | Y | M | M | M | M | M | M | M | M | M | M | M | M |
nPI | Y | Y | Y | Y | Y | M | M | M | M | M | M | M | M | M | M | N |
nI | Y | Y | Y | Y | Y | Y | M | M | M | M | N | N | N | N | N | N |
lng | Y | Y | Y | Y | M | M | Y | M | M | M | M | M | M | M | M | M |
int | Y | Y | Y | Y | M | M | Y | Y | M | M | M | M | M | M | M | M |
sh | Y | Y | Y | Y | M | M | Y | Y | Y | M | M | M | M | M | M | M |
byt | Y | Y | Y | Y | M | M | Y | Y | Y | Y | M | M | M | M | M | M |
nNI | Y | Y | Y | Y | M | N | M | M | M | M | Y | M | M | M | M | M |
uL | Y | Y | Y | Y | M | N | M | M | M | M | Y | Y | M | M | M | M |
uI | Y | Y | Y | Y | M | N | Y | M | M | M | Y | Y | Y | M | M | M |
uS | Y | Y | Y | Y | M | N | Y | Y | M | M | Y | Y | Y | Y | M | M |
uB | Y | Y | Y | Y | M | N | Y | Y | Y | M | Y | Y | Y | Y | Y | M |
pI | Y | Y | Y | Y | N | N | M | M | M | M | M | M | M | M | M | Y |
dur | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
dT | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
tim | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
dat | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
YM | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
Yr | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
MD | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
Day | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
Mon | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
bool | Y | Y | Y | Y | M | N | Y | Y | Y | Y | Y | Y | Y | Y | Y | M |
b64 | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
hxB | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
aURI | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
QN | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
NOT | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
As specified in the preceding table, conversion from various simple types to the various numeric types (that is, float, double, decimal, and types derived from decimal) depends on factors considered below. (If the entry in the preceding table is "N", then no cast is supported; consequently, the following specifications do not discuss those conversions.)
When a value of any simple type is cast to float, the float value TV is derived from the source type ST and the source value SV as follows:
If ST is float, then TV is SV and the conversion is complete.
If ST is double and SV cannot be represented in the value space of float as defined in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is double and SV can be represented in the value space of float as defined in [XML Schema Part 2: Datatypes], then TV is SV and the conversion is complete.
If ST is decimal, integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, or positiveInteger, then TV is xf:float(cast as string(SV)) and the conversion is complete.
SV is converted to an intermediate value IV of type token.
If the value of xf:upper(IV) is INF, -INF, or NAN, then TV is positive infinity, negative infinity, or not-a-number, respectively, and the conversion is complete.
If IV does not match the lexical structure of NumericLiteral as defined in [XQuery 1.0: An XML Query Language], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
[Issue 50: Should double("ZZZ") or cast as float("ZZZ") return an error or NaN? ]
Otherwise, let NL be a NumericLiteral comprising the same sequence of characters as IV. TV is xf:float(NL).
When a value of any simple type is cast to double, the double value TV is derived from the source type ST and the source value SV as follows:
If ST is double, then TV is SV and the conversion is complete.
If ST is float, decimal, integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, or positiveInteger, then TV is xf:double(cast as string(SV)) and the conversion is complete.
SV is converted to an intermediate value IV of type token.
If the value of xf:upper(IV) is INF, -INF, or NAN, then TV is positive infinity, negative infinity, or not-a-number, respectively, and the conversion is complete.
If IV does not match the lexical structure of NumericLiteral as defined in [XQuery 1.0: An XML Query Language], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, let NL be a NumericLiteral comprising the same sequence of characters as IV. TV is xf:double(NL).
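The string branch of the float and double rules can be sketched in Python as follows. This is an illustration only: the names `cast_token_to_double` and `CastError`, and the regular expression that approximates NumericLiteral (including the optional leading sign), are assumptions introduced here and are not defined by this document. Raising on a non-numeric string is just one of the two outcomes left open by Issue 50.

```python
import math
import re


class CastError(ValueError):
    """Stands in for the error value returned by a failed cast (hypothetical name)."""


# Rough approximation of NumericLiteral: digits, optional fraction, optional
# exponent. Accepting a leading sign is an assumption of this sketch.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

_SPECIAL = {"INF": math.inf, "-INF": -math.inf, "NAN": math.nan}


def cast_token_to_double(sv: str) -> float:
    # Convert SV to the intermediate token IV: trim and collapse whitespace.
    iv = " ".join(sv.split())
    # Compare the upper-cased token with INF, -INF, and NAN.
    upper = iv.upper()
    if upper in _SPECIAL:
        return _SPECIAL[upper]
    # Anything that is not shaped like a numeric literal is an error.
    if not _NUMERIC.fullmatch(iv):
        raise CastError(f"not a numeric literal: {sv!r}")
    return float(iv)
```

For example, `cast_token_to_double(" 1.5e2 ")` yields 150.0, while `cast_token_to_double("ZZZ")` raises `CastError`.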
When a value of any simple type is cast to decimal, the decimal value TV is derived from the source type ST and the source value SV as follows:
If ST is decimal, then TV is SV and the conversion is complete.
If ST is integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, or positiveInteger, then TV is decimal(cast as string(SV)) and the conversion is complete.
If ST is float or double, then TV is decimal(cast as string(xf:round(SV))) and the conversion is complete.
SV is converted to an intermediate value IV of type token.
If IV does not match the lexical structure of NumericLiteral as defined in [XQuery 1.0: An XML Query Language], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, let NL be a NumericLiteral comprising the same sequence of characters as IV. TV is decimal(cast as string(xf:round(NL))).
When a value of any simple type is cast to integer, the integer value TV is derived from the source type ST and the source value SV as follows:
If ST is integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, or positiveInteger, then TV is integer(cast as string(SV)) and the conversion is complete.
If ST is decimal, float, or double, then TV is integer(cast as string(xf:round(SV))) and the conversion is complete.
SV is converted to an intermediate value IV of type token.
If IV does not match the lexical structure of NumericLiteral as defined in [XQuery 1.0: An XML Query Language], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, let NL be a NumericLiteral comprising the same sequence of characters as IV. TV is integer(cast as string(xf:round(NL))).
When a value of any simple type is cast to nonPositiveInteger, the nonPositiveInteger value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is greater than zero, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to negativeInteger, the negativeInteger value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is greater than or equal to zero, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to long, int, short, or byte, the resulting long, int, short, or byte value TV, respectively, is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is greater than the value of maxInclusive for long, int, short, or byte, respectively, or less than the value of minInclusive for long, int, short, or byte, respectively, as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to nonNegativeInteger, the nonNegativeInteger value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is less than zero, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to unsignedLong, unsignedInt, unsignedShort, or unsignedByte, the resulting unsignedLong, unsignedInt, unsignedShort, or unsignedByte value TV, respectively, is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is greater than the value of maxInclusive for unsignedLong, unsignedInt, unsignedShort, or unsignedByte, respectively, as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
When a value of any simple type is cast to positiveInteger, the positiveInteger value TV is derived from the source type ST and the source value SV as follows:
SV is converted to an intermediate value IV of type integer.
If IV is less than or equal to zero, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, TV is IV.
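The casts to types derived from integer all share one shape: convert to an intermediate integer IV, then check IV against the bounds of the target type. A minimal Python sketch follows. The names `BOUNDS`, `cast_integer_to`, and `CastError` are hypothetical, and the bounds are the minInclusive and maxInclusive values from XML Schema Part 2. The sketch also rejects negative values for the unsigned types, which follows from their XML Schema value spaces rather than from the maxInclusive check stated above.

```python
class CastError(ValueError):
    """Stands in for the error value returned by a failed cast (hypothetical name)."""


# (lower bound, upper bound) per target type; None means unbounded.
BOUNDS = {
    "nonPositiveInteger": (None, 0),
    "negativeInteger": (None, -1),
    "long": (-2**63, 2**63 - 1),
    "int": (-2**31, 2**31 - 1),
    "short": (-2**15, 2**15 - 1),
    "byte": (-2**7, 2**7 - 1),
    "nonNegativeInteger": (0, None),
    "unsignedLong": (0, 2**64 - 1),
    "unsignedInt": (0, 2**32 - 1),
    "unsignedShort": (0, 2**16 - 1),
    "unsignedByte": (0, 2**8 - 1),
    "positiveInteger": (1, None),
}


def cast_integer_to(target: str, iv: int) -> int:
    """Return IV unchanged if it lies in the target's value space, else fail."""
    low, high = BOUNDS[target]
    if low is not None and iv < low:
        raise CastError(f"{iv} is below the minimum for {target}")
    if high is not None and iv > high:
        raise CastError(f"{iv} is above the maximum for {target}")
    return iv
```

Keeping the bounds in a table rather than in per-type branches mirrors the way the rules above differ only in the comparison they apply.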
The following table covers conversions to datetime and duration types.
S\T | dur | dT | tim | dat | YM | Yr | MD | Day | Mon |
str | M | M | M | M | M | M | M | M | M |
nStr | M | M | M | M | M | M | M | M | M |
tok | N | N | N | N | N | N | N | N | N |
lan | N | N | N | N | N | N | N | N | N |
Nam | N | N | N | N | N | N | N | N | N |
NC | N | N | N | N | N | N | N | N | N |
ID | N | N | N | N | N | N | N | N | N |
IDR | N | N | N | N | N | N | N | N | N |
IDS | N | N | N | N | N | N | N | N | N |
ENT | N | N | N | N | N | N | N | N | N |
ENS | N | N | N | N | N | N | N | N | N |
NMT | N | N | N | N | N | N | N | N | N |
NMS | N | N | N | N | N | N | N | N | N |
flt | N | N | N | N | N | N | N | N | N |
dbl | N | N | N | N | N | N | N | N | N |
dec | N | N | N | N | N | N | N | N | N |
itg | N | N | N | N | N | N | N | N | N |
nPI | N | N | N | N | N | N | N | N | N |
nI | N | N | N | N | N | N | N | N | N |
lng | N | N | N | N | N | N | N | N | N |
int | N | N | N | N | N | N | N | N | N |
sh | N | N | N | N | N | N | N | N | N |
byt | N | N | N | N | N | N | N | N | N |
nNI | N | N | N | N | N | N | N | N | N |
uL | N | N | N | N | N | N | N | N | N |
uI | N | N | N | N | N | N | N | N | N |
uS | N | N | N | N | N | N | N | N | N |
uB | N | N | N | N | N | N | N | N | N |
pI | N | N | N | N | N | N | N | N | N |
dur | Y | N | N | N | N | N | N | N | N |
dT | N | Y | Y | Y | Y | Y | Y | Y | Y |
tim | N | Y | Y | N | N | N | N | N | N |
dat | N | Y | N | Y | Y | Y | Y | Y | Y |
YM | N | Y | N | Y | Y | Y | Y | N | Y |
Yr | N | Y | N | Y | Y | Y | N | N | N |
MD | N | Y | N | Y | Y | N | Y | Y | Y |
Day | N | Y | N | Y | N | N | Y | Y | N |
Mon | N | Y | N | Y | Y | N | Y | N | Y |
bool | N | N | N | N | N | N | N | N | N |
b64 | N | N | N | N | N | N | N | N | N |
hxB | N | N | N | N | N | N | N | N | N |
aURI | N | N | N | N | N | N | N | N | N |
QN | N | N | N | N | N | N | N | N | N |
NOT | N | N | N | N | N | N | N | N | N |
As specified in the preceding table, conversion from various simple types to the various duration- and time-related types depends on factors considered below. (If the entry in the preceding table is "N", then no cast is supported; consequently, the following specifications do not discuss those conversions.)
When a value of any simple type is cast to duration, the duration value TV is derived from the source type ST and the source value SV as follows:
If ST is duration, then TV is SV.
Editorial note: It is unclear what effects the pattern facet of either the source or target duration items should have on this conversion.
If ST is string or normalizedString, then TV is xf:duration(cast as string(SV)).
When a value of any simple type is cast to dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, or gMonth, let CYR be cast as string( xf:get-Year( xf:currentDateTime() ) ), let CMO be cast as string( xf:get-month( xf:currentDateTime() ) ), and let CDA be cast as string( xf:get-day( xf:currentDateTime() ) ).
When a value of any simple type is cast to dateTime, the dateTime value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime, then TV is SV.
If ST is time, then let SHR be cast as string( xf:get-hour( SV ) ), let SMI be cast as string( xf:get-minute( SV ) ), and let SSE be cast as string( xf:get-second( SV ) ); TV is xf:dateTime( xf:concat( CYR, '-', CMO, '-', CDA, 'T', SHR, ':', SMI, ':', SSE ) ).
If ST is date, then let SYR be cast as string( xf:get-Year( SV ) ), let SMO be cast as string( xf:get-month( SV ) ), and let SDA be cast as string( xf:get-day( SV ) ); TV is xf:dateTime( xf:concat( SYR, '-', SMO, '-', SDA, 'T00:00:00' ) ).
If ST is gYearMonth, then let SYR be cast as string( xf:get-Year( SV ) ) and let SMO be cast as string( xf:get-month( SV ) ); TV is xf:dateTime( xf:concat( SYR, '-', SMO, '-01T00:00:00' ) ).
If ST is gYear, then let SYR be cast as string( xf:get-Year( SV ) ); TV is xf:dateTime( xf:concat( SYR, '-01-01T00:00:00' ) ).
If ST is gMonthDay, then let SMO be cast as string( xf:get-month( SV ) ) and let SDA be cast as string( xf:get-day( SV ) ); TV is xf:dateTime( xf:concat( CYR, '-', SMO, '-', SDA, 'T00:00:00' ) ).
If ST is gDay, then let SDA be cast as string( SV ); TV is xf:dateTime( xf:concat( CYR, '-', CMO, '-', SDA, 'T00:00:00' ) ).
If ST is gMonth, then let SMO be cast as string( SV ); TV is xf:dateTime( xf:concat( CYR, '-', SMO, '-01T00:00:00' ) ).
If ST is string or normalizedString and SV is not a valid lexical representation for dateTime as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for dateTime as specified in [XML Schema Part 2: Datatypes], then TV is xf:dateTime( SV ).
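The rules for casting the partial date and time types to dateTime follow one pattern: components that the source type carries are kept, components more significant than those come from the current date (CYR, CMO, CDA), and components less significant default to 01 or 00:00:00. The Python sketch below illustrates that pattern under stated assumptions: `cast_to_datetime` is a hypothetical name, the source value is passed as a tuple of integer components rather than as a typed value, seconds are whole numbers, and time zones are ignored.

```python
from datetime import datetime


def cast_to_datetime(st: str, sv: tuple, now: datetime) -> datetime:
    """Fill in the components that the source type ST does not carry.

    `sv` holds the source value's own components, e.g. (year, month) for
    gYearMonth; `now` plays the role of xf:currentDateTime().
    """
    cyr, cmo, cda = now.year, now.month, now.day  # CYR, CMO, CDA
    if st == "time":
        hour, minute, second = sv
        return datetime(cyr, cmo, cda, hour, minute, second)
    if st == "date":
        year, month, day = sv
    elif st == "gYearMonth":
        year, month = sv
        day = 1
    elif st == "gYear":
        (year,) = sv
        month, day = 1, 1
    elif st == "gMonthDay":
        month, day = sv
        year = cyr
    elif st == "gDay":
        (day,) = sv
        year, month = cyr, cmo
    elif st == "gMonth":
        (month,) = sv
        year, day = cyr, 1
    else:
        raise ValueError(f"unsupported source type: {st}")
    return datetime(year, month, day)  # time part defaults to 00:00:00
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would obtain it from the evaluation context.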
When a value of any simple type is cast to time, the time value TV is derived from the source type ST and the source value SV as follows:
If ST is time, then TV is SV.
If ST is dateTime, then TV is xf:time( xf:concat( cast as string( xf:get-hour( SV ) ), ':', cast as string( xf:get-minute( SV ) ), ':', cast as string( xf:get-second( SV ) ) ) ).
If ST is string or normalizedString and SV is a valid lexical representation for time as specified in [XML Schema Part 2: Datatypes], then TV is xf:time( SV ).
When a value of any simple type is cast to date, the date value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime, then let SYR be cast as string( xf:get-Year( SV ) ), let SMO be cast as string( xf:get-month( SV ) ), and let SDA be cast as string( xf:get-day( SV ) ); TV is xf:date( xf:concat( SYR, '-', SMO, '-', SDA ) ).
If ST is date, then TV is SV.
If ST is gYearMonth, then let SYR be cast as string( xf:get-Year( SV ) ) and let SMO be cast as string( xf:get-month( SV ) ); TV is xf:date( xf:concat( SYR, '-', SMO, '-01' ) ).
If ST is gYear, then let SYR be cast as string( xf:get-Year( SV ) ); TV is xf:date( xf:concat( SYR, '-01-01' ) ).
If ST is gMonthDay, then let SMO be cast as string( xf:get-month( SV ) ) and let SDA be cast as string( xf:get-day( SV ) ); TV is xf:date( xf:concat( CYR, '-', SMO, '-', SDA ) ).
If ST is gDay, then let SDA be cast as string( SV ); TV is xf:date( xf:concat( CYR, '-', CMO, '-', SDA ) ).
If ST is gMonth, then let SMO be cast as string( SV ); TV is xf:date( xf:concat( CYR, '-', SMO, '-01' ) ).
If ST is string or normalizedString and SV is not a valid lexical representation for date as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for date as specified in [XML Schema Part 2: Datatypes], then TV is xf:date( SV ).
When a value of any simple type is cast to gYearMonth, the gYearMonth value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime or date, then let SYR be cast as string( xf:get-Year( SV ) ) and let SMO be cast as string( xf:get-month( SV ) ); TV is xf:gYearMonth( xf:concat( SYR, '-', SMO ) ).
If ST is gYearMonth, then TV is SV.
If ST is gYear, then let SYR be cast as string( xf:get-Year( SV ) ); TV is xf:gYearMonth( xf:concat( SYR, '-01' ) ).
If ST is gMonthDay or gMonth, then let SMO be cast as string( xf:get-month( SV ) ); TV is xf:gYearMonth( xf:concat( CYR, '-', SMO ) ).
If ST is string or normalizedString and SV is not a valid lexical representation for gYearMonth as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for gYearMonth as specified in [XML Schema Part 2: Datatypes], then TV is xf:gYearMonth( SV ).
When a value of any simple type is cast to gYear, the gYear value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime, date, or gYearMonth, then let SYR be cast as string( xf:get-Year( SV ) ); TV is xf:gYear( SYR ).
If ST is gYear, then TV is SV.
If ST is string or normalizedString and SV is not a valid lexical representation for gYear as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for gYear as specified in [XML Schema Part 2: Datatypes], then TV is xf:gYear( SV ).
When a value of any simple type is cast to gMonthDay, the gMonthDay value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime or date, then let SMO be cast as string( xf:get-month( SV ) ) and let SDA be cast as string( xf:get-day( SV ) ); TV is xf:gMonthDay( xf:concat( SMO, '-', SDA ) ).
If ST is gYearMonth, then let SMO be cast as string( xf:get-month( SV ) ); TV is xf:gMonthDay( xf:concat( SMO, '-01' ) ).
If ST is gMonthDay, then TV is SV.
If ST is gDay, then let SDA be cast as string( xf:get-day( SV ) ); TV is xf:gMonthDay( xf:concat( CMO, '-', SDA ) ).
If ST is gMonth, then let SMO be cast as string( xf:get-month( SV ) ); TV is xf:gMonthDay( xf:concat( SMO, '-01' ) ).
If ST is string or normalizedString and SV is not a valid lexical representation for gMonthDay as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for gMonthDay as specified in [XML Schema Part 2: Datatypes], then TV is xf:gMonthDay( SV ).
When a value of any simple type is cast to gDay, the gDay value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime, date, or gMonthDay, then let SDA be cast as string( xf:get-day( SV ) ); TV is xf:gDay( SDA ).
If ST is gDay, then TV is SV.
If ST is string or normalizedString and SV is not a valid lexical representation for gDay as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for gDay as specified in [XML Schema Part 2: Datatypes], then TV is xf:gDay( SV ).
When a value of any simple type is cast to gMonth, the gMonth value TV is derived from the source type ST and the source value SV as follows:
If ST is dateTime, date, gYearMonth, or gMonthDay, then let SMO be cast as string( xf:get-month( SV ) ); TV is xf:gMonth( SMO ).
If ST is gMonth, then TV is SV.
If ST is string or normalizedString and SV is not a valid lexical representation for gMonth as specified in [XML Schema Part 2: Datatypes], then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is string or normalizedString and SV is a valid lexical representation for gMonth as specified in [XML Schema Part 2: Datatypes], then TV is xf:gMonth( SV ).
The following table covers conversions to all other simple types.
S\T | bool | b64 | hxB | aURI | QN | NOT |
str | M | Y | M | M | M | N |
nStr | M | Y | M | M | M | N |
tok | N | Y | M | N | N | N |
lan | N | Y | N | N | N | N |
Nam | N | Y | N | N | N | N |
NC | N | Y | N | N | N | N |
ID | N | Y | N | N | N | N |
IDR | N | Y | N | N | N | N |
IDS | N | Y | N | N | N | N |
ENT | N | Y | N | N | N | N |
ENS | N | Y | N | N | N | N |
NMT | N | Y | N | N | N | N |
NMS | N | Y | N | N | N | N |
flt | M | Y | N | N | N | N |
dbl | M | Y | N | N | N | N |
dec | M | Y | N | N | N | N |
itg | M | Y | N | N | N | N |
nPI | M | Y | N | N | N | N |
nI | M | Y | N | N | N | N |
lng | M | Y | N | N | N | N |
int | M | Y | N | N | N | N |
sh | M | Y | N | N | N | N |
byt | M | Y | N | N | N | N |
nNI | M | Y | N | N | N | N |
uL | M | Y | N | N | N | N |
uI | M | Y | N | N | N | N |
uS | M | Y | N | N | N | N |
uB | M | Y | N | N | N | N |
pI | M | Y | N | N | N | N |
dur | N | Y | N | N | N | N |
dT | N | Y | N | N | N | N |
tim | N | Y | N | N | N | N |
dat | N | Y | N | N | N | N |
YM | N | Y | N | N | N | N |
Yr | N | Y | N | N | N | N |
MD | N | Y | N | N | N | N |
Day | N | Y | N | N | N | N |
Mon | N | Y | N | N | N | N |
bool | Y | Y | N | N | N | N |
b64 | M | Y | M | N | N | N |
hxB | M | Y | Y | N | N | N |
aURI | N | N | N | Y | N | N |
QN | N | N | N | N | Y | N |
NOT | N | N | N | N | N | N |
As specified in the preceding table, casting from most simple types to base64Binary and to hexBinary is possible, but sometimes depends on factors considered below; by contrast, casting to anyURI or QName is possible only from the same type or possibly from string or normalizedString. (If the entry in the preceding table is "N", then no cast is supported; consequently, the following specifications do not discuss those conversions.)
When a value of any simple type is cast to boolean, the boolean value TV is derived from the source type ST and the source value SV as follows:
If ST is string or normalizedString and xf:upper(SV) is "TRUE" or "1", then TV is true; if ST is string or normalizedString and xf:upper(SV) is "FALSE" or "0", then TV is false.
If ST is float, double, decimal, integer, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, or positiveInteger and SV is 1, then TV is true.
If ST is float, double, decimal, integer, nonPositiveInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, or unsignedByte and SV is 0, then TV is false.
If ST is nonPositiveInteger or negativeInteger and SV is 1, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is negativeInteger or positiveInteger and SV is 0, then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
If ST is boolean, then TV is SV.
If ST is base64Binary or hexBinary and SV is "1", then TV is true; if ST is base64Binary or hexBinary and SV is "0", then TV is false.
If ST is base64Binary or hexBinary and SV is neither "1" nor "0", then the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
Otherwise, an error value is returned as specified in [XQuery 1.0 and XPath 2.0 Data Model].
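The boolean cast rules can be sketched in Python as a dispatch on the source type. This is an illustration under assumptions: `cast_to_boolean` and `CastError` are hypothetical names, type names are passed as strings, SV is assumed to already be a valid value of ST (so, for instance, a nonPositiveInteger is never 1), and binary values are represented by the strings "1" and "0" as in the rules above.

```python
class CastError(ValueError):
    """Stands in for the error value returned by a failed cast (hypothetical name)."""


STRING_TYPES = {"string", "normalizedString"}
BINARY_TYPES = {"base64Binary", "hexBinary"}
NUMERIC_TYPES = {
    "float", "double", "decimal", "integer", "nonPositiveInteger",
    "negativeInteger", "long", "int", "short", "byte", "nonNegativeInteger",
    "unsignedLong", "unsignedInt", "unsignedShort", "unsignedByte",
    "positiveInteger",
}


def cast_to_boolean(st: str, sv) -> bool:
    if st == "boolean":
        return sv
    if st in STRING_TYPES:
        # Comparison is case-insensitive, as with xf:upper(SV).
        key = sv.upper()
        if key in ("TRUE", "1"):
            return True
        if key in ("FALSE", "0"):
            return False
    elif st in NUMERIC_TYPES:
        if sv == 1:
            return True
        if sv == 0:
            return False
    elif st in BINARY_TYPES:
        if sv == "1":
            return True
        if sv == "0":
            return False
    # Every remaining combination falls under the final "Otherwise" rule.
    raise CastError(f"cannot cast {sv!r} of type {st} to boolean")
```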
When a value of any simple type is cast to base64Binary, the base64Binary value TV is derived from the source type ST and the source value SV as follows:
If ST is not string, then SV is converted to an intermediate value IV of type string.
TV is IV converted to base64Binary as specified in [XML Schema Part 2: Datatypes].
When a value of any simple type is cast to hexBinary, the hexBinary value TV is derived from the source type ST and the source value SV as follows:
If ST is not string, then SV is converted to an intermediate value IV of type string.
If IV is a sequence comprising an even number N of characters, each of which is a hexadecimal digit (taken from '0' through '9', 'a' through 'f', and 'A' through 'F'), then TV is a sequence of N/2 binary octets in which each octet contains the binary value corresponding to the pairs of hexadecimal digits in IV, taken in sequence.
Otherwise, the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
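The hexBinary rule translates directly into a validation step followed by a pairwise decode. The Python sketch below shows one way to express it; `cast_string_to_hexbinary` and `CastError` are hypothetical names, and IV is assumed to be the intermediate string already.

```python
class CastError(ValueError):
    """Stands in for the error value returned by a failed cast (hypothetical name)."""


_HEX_DIGITS = set("0123456789abcdefABCDEF")


def cast_string_to_hexbinary(iv: str) -> bytes:
    # IV must hold an even number of characters, all hexadecimal digits.
    if len(iv) % 2 != 0 or any(ch not in _HEX_DIGITS for ch in iv):
        raise CastError(f"not a hexBinary lexical form: {iv!r}")
    # Each pair of digits becomes one octet, taken in sequence.
    return bytes(int(iv[i:i + 2], 16) for i in range(0, len(iv), 2))
```

For example, `cast_string_to_hexbinary("0FB7")` yields the two octets 0x0F and 0xB7.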
When a value of any simple type is cast to anyURI, the anyURI value TV is derived from the source type ST and the source value SV as follows:
If ST is string or normalizedString and SV conforms to the format of a Uniform Resource Identifier Reference as specified in [XML Schema Part 2: Datatypes], then TV is SV.
Otherwise, the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
When a value of any simple type is cast to QName, the QName value TV is derived from the source type ST and the source value SV as follows:
If ST is string or normalizedString and SV conforms to the format of a QName as specified in [XML Schema Part 2: Datatypes], then TV is SV.
Otherwise, the cast returns an error value as defined in [XQuery 1.0 and XPath 2.0 Data Model].
The following table specifies additional casting functions that already exist in XPath 1.0 and must be retained.
Function | Meaning | Source |
xf:boolean(node) | Returns a boolean based on the argument. | XPath 1.0 |
xf:string(node) | Returns a string based on the argument. | XPath 1.0 |
[Issue 41: Conflict: no function overloading, XPath function retention, constructor orthogonality]
[Issue 85: The semantics of xf:boolean(node) is underspecified]
The boolean function converts its argument to a boolean as follows:
A number is true if and only if it is neither positive zero, negative zero, nor NaN.
A sequence is true if and only if it is non-empty.
A string is true if and only if its length is non-zero.
An object of a type other than the above types is converted to a boolean in a way that is dependent on that type.
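The conversion rules of the boolean function can be illustrated with a short Python sketch. The name `xpath_boolean` is hypothetical, Python lists and tuples stand in for sequences, and raising `TypeError` for other types is an assumption, since the rules above leave that case type-dependent.

```python
import math


def xpath_boolean(arg) -> bool:
    if isinstance(arg, (int, float)):
        # True unless the number is zero (of either sign) or NaN.
        return not (arg == 0 or math.isnan(arg))
    if isinstance(arg, str):
        # True if and only if the string has non-zero length.
        return len(arg) > 0
    if isinstance(arg, (list, tuple)):
        # True if and only if the sequence is non-empty.
        return len(arg) > 0
    raise TypeError(f"conversion of {type(arg).__name__} is type-dependent")
```

Note that the string "false" converts to true under these rules, because only its length is considered.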
The string function converts its argument to a string according to the following rules:
A number is converted to a string as follows:
NaN is converted to the string NaN.
Positive zero is converted to the string 0.
Negative zero is converted to the string 0.
Positive infinity is converted to the string Infinity.
Negative infinity is converted to the string -Infinity.
If the number is an integer, the number is represented in decimal form with no decimal point and no leading zeros, preceded by a minus sign (-) if the number is negative.
Otherwise, the number is represented in decimal form including a decimal point with at least one digit before the decimal point and at least one digit after the decimal point, preceded by a minus sign (-) if the number is negative; there must be no leading zeros before the decimal point apart possibly from the one required digit immediately before the decimal point; beyond the one required digit after the decimal point there must be as many, but only as many, more digits as are needed to uniquely distinguish the number from all other IEEE 754 numeric values.
The boolean false value is converted to the string false. The boolean true value is converted to the string true.
An object of a type other than the above types is converted to a string in a way that is dependent on that type.
If the argument is omitted, it defaults to a node-set with the context node as its only member.
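The number and boolean rules of the string function can be sketched in Python as follows. The name `xpath_string` is hypothetical and only numbers and booleans are handled. The sketch relies on Python's `repr` of a float producing the shortest digit string that round-trips, which is used here as a stand-in for "as many, but only as many, more digits as are needed"; the `decimal` module then rewrites any exponent notation in positional form.

```python
import math
from decimal import Decimal


def xpath_string(value) -> str:
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    x = float(value)
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    if x == 0:
        # Positive and negative zero both become "0".
        return "0"
    if x.is_integer():
        # No decimal point and no leading zeros for integral values.
        return str(int(x))
    # Shortest round-trip digits, written without an exponent.
    return format(Decimal(repr(x)), "f")
```

For example, `xpath_string(1e-7)` yields `0.0000001` rather than an exponent form.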
[Issue 118: Should xf:string(node) return error if argument not scalar?]
This section contains the current issues related to the operators specification.
Originator: | Operator Editors |
Locus: | Syntax |
The mechanism for determining the precise XML Schema type of a numeric literal value has not yet been determined. The two broad approaches are (1) to determine the absolute value of the literal and determine its Schema type based on the range of values into which it falls, versus (2) to use additional syntax in the style of the C or Java programming languages to specify the Schema type.
Resolution:
RESOLVED: Numeric literals' types are determined by their syntax, not by their values.
Originator: | Andrew Eisenberg |
Locus: | Syntax |
An additional level of detail is required for some of the numeric constructors. For example, xf:double(string) takes a string as its argument. We need to carefully specify which strings are valid and which are not.
Resolution:
Originator: | Andrew Eisenberg |
Locus: | Syntax |
We include the "<" operator in this table. "<" followed by a letter, a "_", or a "." is an element constructor. Does this mean that "<" followed by whitespace (or any other character) is the less-than operator?
Resolution:
RESOLVED: "<" is acceptable as the less-than operator as long as it is not followed by a character that can start an XML identifier.
Originator: | Andrew Eisenberg |
Locus: | Syntax |
We state that decimal op decimal returns decimal. Don't we need to be taking precision and scale into account (this issue applies to the constructors as well)? If my expression is 4.5 * 2.11, SQL would state that the operands are DECIMAL (2,1) and DECIMAL (3,2), and the result is DECIMAL (*,3), where * is implementation-defined.
Resolution:
RESOLVED: It is not necessarily the case that there will be facets, so numeric types do not necessarily have to capture precision and scale (which are, in fact, facets). When operations among numeric types are performed, facet information is lost.
Originator: | Andrew Eisenberg |
Locus: | Syntax |
The constructor xf:string takes a string as its argument. Is this constructor a no-op?
Resolution:
RESOLVED: Yes, this constructor is a no-op, but it has been decided to keep it for the sake of orthogonality.
Originator: | Andrew Eisenberg |
Locus: | Syntax |
Should another function, XOR, for "exclusive or", be added as another Boolean function?
Resolution:
RESOLVED: The XOR function will not be introduced at this time.
Originator: | Michael Kay |
Locus: | Syntax |
The document doesn't seem to address conversion between types, e.g., integer to string, boolean to number.
Resolution:
RESOLVED: The document now has support for casting, using the syntax provided in [XQuery 1.0: An XML Query Language].
Originator: | Michael Kay |
Locus: | Syntax |
Why is the relationship of float to double not the same as the relationship of int (and short, etc) to decimal? It seems that all operations on floats should produce a double, just as all operations on integers produce a decimal; and also that the default type for, say, 1.0E6 should be double rather than float. (In fact, some think it amazing that the schema group found a reason to support single-precision float at all; why would anyone want it? And, for that matter, why do they want byte, short, etc?).
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
Does it make any difference whether the constructor "2" produces a byte, an int, or a decimal? There does not seem to be any operator or function in this document that produces a different answer depending on which it is. So why not make it a decimal?
Resolution:
RESOLVED: There are different constructors for each of several types.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
It may be desirable to define functions that will allow URIs to be decomposed into their component parts, resolve relative URIs into their absolute equivalents, and so forth.
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
What should this version of the document do about complex types? Should they be covered in V1 or deferred to a future version? Or should they never be part of this specification?
At the very minimum, this version of this document must contain a statement confirming that complex types are not covered in the current version of the document.
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Prominent mentions of various schema types should be transformed into links into the Schema document(s). This may also be true of other items in this document regarding links to other documents.
Resolution:
RESOLVED: Links to Schema and other relevant documents are found throughout this document.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
It is appropriate for many names (e.g., function names) specified in this document to be QNames that are qualified by a namespace appropriate to this use. Some of the names in the current document are qualified with the identifier "xsd", which is probably inappropriate because that identifier is often used for the XML Schema namespace. Furthermore, having just an identifier is insufficient: a complete URI is required.
Resolution:
PARTIALLY RESOLVED: It has been determined that the namespace prefix shall be "xf:", but the final namespace URI has not yet been determined. However, in this document, it is (tentatively?) specified to be http://www.w3.org/2001/08/query-functions at the suggestion of the Query WG Chair.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
A function signature like "make-string(string)" is misleading since it implies that the argument might have to be of most specific type "string" as opposed to IDREF. This document must make it clear that other types (e.g., subtypes of "string", or types derived from "string", such as "token") are acceptable. A question arose about whether there should also be a version of "make-string" with a "number" argument, as opposed to (or in addition to) explicit and/or implicit "cast" functions. The intent is to have "make-string(string)" only for symmetry, not to allow for subtypes.
Resolution:
Originator: | Michael Brundage; private mail |
Locus: | Syntax |
Strings can be quoted using single or double quotes? Do we require an escape for the quotes?
Resolution:
RESOLVED: This is a syntax issue that should be dealt with in the XQuery document.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Several participants feel that there are three sorts of ways to "create" values of various types: (1) literals, (2) constructors, and (3) casting. There seems to be broad agreement on (1) and (3), but there are questions about the "meaning" of (2).
In particular, are constructors really functions, or do they use functional notation without actually being functions? If they are actual functions, it would appear that their function bodies are unlikely to be much more than a CAST. That sheds some doubt on the requirement to have constructor functions.
Resolution:
RESOLVED: There is much similarity between constructors and casting, but there appears to be sufficient value in distinguishing them. Therefore, the constructors will be retained. See [Issue 31: Distinguish between literals and constructors] for more information.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
CAST behaves in many ways like an operator or a function that is defined to operate on instances of several types. Should the definition of CAST be moved to this document (and eliminated from the XQuery document)?
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Should a constructor be as liberal as CAST in "fixing up" the contents of source data being transformed into another data type? Or should they be more rigid, as suggested in the current document, and require that source data be in precisely the proper "shape"?
Resolution:
Discussion: The discussion indicates that the input values should be extremely close to what schema validation supplies, although allowing things like the terminating "L" for "long literals" might be acceptable. Therefore, "long(3.5)" should be a syntax error rather than a compile-time round/truncate. RESOLVED: Require the specified value to be "of the proper type" per Schema (with a few enhancements such as "L").
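The strictness described in this resolution can be sketched in Python as follows. This is only an illustration: the function name construct_long, the exact lexical pattern (optional sign, digits, optional trailing "L"), and the use of ValueError to stand in for the error are assumptions made for the example, not definitions from this document.

```python
import re

# Assumed lexical form: optional sign, decimal digits, optional trailing "L".
_LONG_LEXICAL = re.compile(r"[+-]?[0-9]+L?")
_LONG_MIN, _LONG_MAX = -2**63, 2**63 - 1


def construct_long(lexical: str) -> int:
    """Strict constructor: accept only input already in the proper shape."""
    if not _LONG_LEXICAL.fullmatch(lexical):
        # No rounding, truncation, or whitespace trimming is attempted.
        raise ValueError(f"not a valid long literal: {lexical!r}")
    value = int(lexical.rstrip("L"))
    if not _LONG_MIN <= value <= _LONG_MAX:
        raise ValueError(f"out of range for long: {lexical!r}")
    return value


print(construct_long("3L"))
try:
    construct_long("3.5")
except ValueError as err:
    print("rejected:", err)
```

A more liberal, CAST-like conversion would instead round or truncate "3.5"; the sketch deliberately rejects it, in line with the resolution.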
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
There is general agreement that character strings must be compared using a collation that gives culturally-correct results. However, there is sometimes the need to determine whether two character strings are "distinguishable" (meaning that they are to be compared without use of a collation).
Furthermore, it is possible to build collations that are "parameterized", perhaps allowing case-sensitive and case-insensitive (or accent-sensitive versus accent-insensitive comparisons) by providing a parameter to the collation when it is invoked. Should such a facility be provided in this specification? If so, how far should it be taken?
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Many uses of the word "character" in this specification would probably be better if changed to "codepoint" to make it clear precisely what Unicode concept is meant.
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
What is the precise type returned by various functions? Is the specific type of the argument the returned type, or does it get "upcast" to "string"? Some operations might not be able to keep the most specific type (e.g., SUBSTRING(NCNAME,2) may not be a NCNAME!)
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
This specification appears to have assumed that "normalize" means "NFC". Should it support other forms (NFD, compatibility variants) in addition to NFC? Instead of NFC?
Resolution:
RESOLVED: Added the normalization form as second argument to the function
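As an illustration of a normalization function that takes the form as a second argument, the following Python sketch wraps the standard unicodedata module. The function name normalize_unicode and the choice of NFC as the default are assumptions made for this example.

```python
import unicodedata

_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_unicode(value: str, form: str = "NFC") -> str:
    """Normalize value to the requested Unicode normalization form."""
    if form not in _FORMS:
        raise ValueError(f"unsupported normalization form: {form!r}")
    return unicodedata.normalize(form, value)


composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(normalize_unicode(decomposed, "NFC") == composed)
print(normalize_unicode(composed, "NFD") == decomposed)
```

The compatibility forms (NFKC, NFKD) additionally fold compatibility characters such as ligatures, which is why the choice of form matters for comparison.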
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Currently, the specification uses the phrase "...returns a copy" in several locations. That phrase is not appropriate and the wording should be changed to accurately describe the intended semantics.
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
For various numeric and datetime types, and possibly others as well, the document must address the facets of those types, specifying precisely what effect the facets have on the operators being defined.
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Should this specification provide a function that normalizes timeDuration types so that the month/day boundary is not violated?
Resolution:
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Is the Gregorian calendar adequate for V1? What about durations in specific additional (e.g., banking) calendars based on Gregorian?
Resolution:
RESOLVED: For the first version of this document, only the Gregorian calendar is supported (particularly since XML Schema offers support for no other calendar).
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
What is the schema type returned by the datetime "contains" operator? It's not Boolean, since "indeterminate" is a possible returned value. Is it, for example, an enumerated type?
Resolution:
RESOLVED. We have redefined the contains functions to return a boolean value.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
What sort or sorts of function overloading do we assume? None, number of arguments, declared types of parameters, or most-specific runtime types of arguments? What about user-defined types derived from a schema type (or from another user-defined type)?
Temporarily (only), it has been determined that functions can be overloaded only by the number of parameters and not by the data types of the parameters.
Resolution:
RESOLVED. No overloading supported for now. See 1.1 Syntax
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
The document currently defines functions such as "getDay(dateTime)" that returns a "gDay". Should there be a function that returns an "integer" (with the value of the day)?
Resolution:
RESOLVED: Return types changed to integers.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
This specification should probably contain a table, such as the one in the SQL standard, indicating what the valid combinations of source type and target type are for the CAST operation.
Resolution:
RESOLVED: Such tables (four of them, actually) are now present in this document.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
This specification must distinguish between constructing a value from a lexical representation (that is, literals) and invoking a constructor function (e.g., "make-date(string)"). Of course, it must also have run-time type conversions ("cast", whether using the currently-proposed CAST AS syntax or some implicit function invocation as in SQL's user-defined type capability).
Some participants suggest that there might be only numeric literals and string literals and everything else (datetime, Booleans, lists) don't have literals but have constructors instead. Others asked why we don't do just what the XQuery document does today (literals for only a few types and constructors and/or casts for everything else).
Resolution:
RESOLVED: The document currently provides literals for many data types, constructors for others, and casting for others.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
In addition to, or instead of, constructors for datetime and timeDuration types, should there be a form of literal (such as is done in the SQL standard)?
Resolution:
RESOLVED: The literals that look a lot like SQL's literals are too similar to the constructors, so there is no gain. Microsoft's approach of using square brackets to delimit them conflicts with XPath's use of square brackets. Resolved: No, there will not be literals; the constructors will be the closest we provide.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
Some participants believe that every issue in this specification should be linked to the URI of the source of the issue (e.g., minutes of a meeting or e-mail in the archives).
Resolution:
Most issues have a source specified, but a few remain to be done.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
The type signature of functions will have a significant impact on some decisions about the type interpreted for numeric literals.
Resolution:
RESOLVED: There should not be any influence on the types of literals caused by function signatures.
Originator: | Operators Task Force F2F 2001-03-21 |
Locus: | Syntax |
The operators document has to address the required capabilities, but might not immediately prescribe the concrete syntax used to produce those capabilities.
Resolution:
Originator: | Michael Kay (member only message) |
Locus: | Syntax |
The functions contains(), substring-before(), and substring-after() [spelt thus] rely on equality matching of substrings. They are therefore locale sensitive in the same way as "=" comparison. Arguably each of these functions needs a version that uses Unicode codepoint comparison, a version that uses a defined collation, and a version that does case-folded comparison.
Resolution:
RESOLVED: Collations are always used for string comparisons and ordering. Another issue ([Issue 70: How are "default" collations determined?]) has been raised to capture questions about collations, default collation selection, and locales.
Originator: | Michael Kay (member only message) |
Locus: | Syntax |
Perhaps we need a "linguistic contains" as in free text searching, e.g. contains-word(., "England"). Would we then want to define the word-breaking, stemming, and matching rules, or leave it to the implementor? See also the Library of Congress Use Cases in e-mail: (member-only message)
Resolution:
Originator: | Mary Fernandez (member only message) |
Locus: | Syntax |
The operators document has to address the required capabilities, but might not immediately prescribe the concrete syntax used to produce those capabilities.
Resolution:
Originator: | Jim Melton |
Locus: | Syntax |
This document offers support for 8 numeric types. Should support be explicitly provided for the remaining numeric types defined by XML Schema ([XML Schema Part 2: Datatypes])?
It has been suggested (see (member-only message)) that the "type promotion" scheme in this document implicitly means that the operators and functions already implicitly support all of XML Schema's numeric types.
Resolution:
Originator: | Jim Melton |
Locus: | Syntax |
Some of the numeric constructors have return types that are unexpected and counter-intuitive. For example, xo:long(string) returns an integer instead of a long. Shouldn't constructors return the type they claim to be constructing?
Resolution:
Originator: | Operator Editors |
Locus: | Syntax |
XPath 1.0 has a function string(node), described thus: "Returns a string representation of the argument".
On 2001-07-17, the Operators Task Force F2F determined that the document will not (at this time) support function overloading solely by argument data type.
This document includes a constructor function string(string) whose signature is identical to the XPath 1.0 function except for the data type of the argument. This function has been stated to be desirable for reasons of orthogonality.
The names of all constructor functions presently in the document are the same as the names of the types for which they are constructors.
There appears to be a conflict between (1) the presence of string(string), (2) the decision to prohibit overloading by argument type, and (3) the XPath 1.0 function string(node). How shall this conflict be resolved? (Incidentally, a previous version of this document included a function string(node) with similar or identical functionality to the XPath 1.0 function; that function has been renamed string-value(node) to remove its overloading and to preserve the characteristic that the name of constructor functions is the same as the name of the type being constructed.)
Resolution:
Originator: | Operator Editors |
Locus: | Syntax |
On 2001-07-17, the Operators Task Force agreed that there will be no function overloading based solely on the data types of arguments, further deciding to rename all functions currently overloaded in this way by adopting the convention of prefixing their names with the names of the "primary" data type to which they apply. Therefore, "concat(string, string+)" became string-concat(string, string+). But the name of the corresponding XPath 1.0 function that requires us to keep a variable number of arguments is just string.
What, then, is the most appropriate name for the string concatenation function?
Resolution:
RESOLVED: named "concat" for XPath compatibility.
Originator: | Operator Editors |
Locus: | Syntax |
This document specifies constructor functions named ID and IDREF. It also specifies functions named id and idref that return sets of nodes identified from the argument values.
Is it confusing to have two (pairs of) functions whose names are so similar, differing only in the case of their names, and is it thus appropriate to consider renaming the latter two to something possibly more descriptive?
Resolution:
Originator: | Michael Sperberg-McQueen |
Locus: | Syntax |
In a presentation at the July 2001 F2F, it was said that collations would be referred to by URI references. Michael Rys said one rationale is to allow relative URI(-reference)s so one can refer to "French" rather than http://www.example.com/i18n/collation-sequences/case-sensitive/French, and so on. There may be negative impacts on interoperability caused by allowing relative URI references for this function. It is tempting to suggest requiring collation names to be absolute URIs without fragment identifiers. If brevity is really important, perhaps we should invent a way to assign short names to collations.
Issue resulted from e-mail: Michael Sperberg-McQueen (member-only message)
Resolution:
Originator: | Michael Sperberg-McQueen |
Locus: | Syntax |
The relation of collation-sequence selection to xml:lang labeling of the data needs to be addressed explicitly, even if there is none. (Steve Zilles suggested that since nothing was said about getting defaults from xml:lang values, it was clear that xml:lang does not affect the selection of collation sequences. But earlier, people had said an implementation was clearly free to take a default-collation value from the user's locale, if one was available, on the grounds that nothing was said about it and thus nothing prevents it. We can't argue both that silence in the discussion allows implementors to do anything they like as regards the user's locale, and that it requires implementors to do nothing as regards xml:lang.)
So I argue that if we want there to be no interaction with xml:lang, or if we want such an interaction to be legal but not required, or if we want it to be required, we ought to say explicitly what we want.
Issue resulted from e-mail: Michael Sperberg-McQueen (member-only message)
Resolution:
Originator: | Anders Berglund |
Locus: | Syntax |
The semantics of xf:token() explicitly calls out #x20 as the only space character that is prohibited in the string that is an argument to the function. What about the "other" Unicode space characters, such as 00A0 NO-BREAK SPACE, 1361 ETHIOPIC WORDSPACE, 1680 OGHAM SPACE MARK, 2002 EN SPACE, 2003 EM SPACE, 2004 THREE-PER-EM SPACE, 2005 FOUR-PER-EM SPACE, 2006 SIX-PER-EM SPACE, 2007 FIGURE SPACE, 2008 PUNCTUATION SPACE, 2009 THIN SPACE, 200A HAIR SPACE, 200B ZERO WIDTH SPACE, 202F NARROW NO-BREAK SPACE, 3000 IDEOGRAPHIC SPACE, 303F IDEOGRAPHIC HALF FILL SPACE, and FEFF ZERO WIDTH NO-BREAK SPACE?
Issue resulted from e-mail: Anders Berglund (member-only message)
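To help weigh this question, the following Python sketch prints the Unicode general category of each code point named above, using the standard unicodedata module. The list and the helper name classify are assumptions made for this example, and the reported categories depend on the Unicode version bundled with the interpreter; the sketch says nothing about how xf:token() should treat these characters.

```python
import unicodedata

# Code points listed in the issue text, plus #x20 for comparison.
CANDIDATES = [
    0x0020, 0x00A0, 0x1361, 0x1680, 0x2002, 0x2003, 0x2004, 0x2005,
    0x2006, 0x2007, 0x2008, 0x2009, 0x200A, 0x200B, 0x202F, 0x3000,
    0x303F, 0xFEFF,
]


def classify(code_point: int) -> tuple:
    """Return (character name, general category) for a code point."""
    ch = chr(code_point)
    return unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch)


for cp in CANDIDATES:
    name, category = classify(cp)
    # Category "Zs" marks a space separator; other categories are not spaces
    # as far as the Unicode general category is concerned.
    print(f"U+{cp:04X}  {category}  {name}")
```

Not every character with "SPACE" in its name is reported as a space separator, which is one reason the set of prohibited characters would need to be enumerated explicitly.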
Resolution:
Originator: | Anders Berglund |
Locus: | Syntax |
The requirement is to have an underlying design that permits (either in version 1.0, or at least permits a natural extension to) supporting:
Converting a "date" to a string in a non-Gregorian calendar.
Converting a string, in an appropriate format for a non Gregorian calendar, to a "date".
Comparisons of "dates" where the XML has these expressed in non-Gregorian calendar(s).
The range of supported calendars should probably be left to the implementation.
A possible design approach could be to:
Keep the schema/ISO-8601 date as the "hub" datatype.
Change some of the existing constructors by adding a "calendar" parameter, which would default to Gregorian.
Add some "date" to string functions for presenting non-Gregorian calendar dates.
A comparison of, say, two BE years would be achieved by
-- Year("2544", "BE") > Year("2540", "BE")
Naturally there is no requirement that the two calendars match, so a comparison
-- Year("2544", "BE") > Year("1374", "AH")
would be perfectly reasonable.
There are some issues that need solving/clear definition. These include:
What to do if the calendars do not match. For example, a year in a lunar calendar (where a year is shorter than a Gregorian year) typically corresponds to two Gregorian years, so arithmetic can become very interesting and certain functions may well need to be changed to accommodate this. This applies to calendars in use today.
Should country and date variations be taken into account? For example in some countries using the Julian calendar the year started September 1st (many of these countries changed it later to January 1st). This applies to "historic" dates.
Should "calendar change over" effects be taken into account? For example, if a country had a "Gregorian April" with more than 30 days in the year a switch from the Julian to the Gregorian calendar took place, should this be reflected in the conversion? This applies to "historic" dates.
Issue resulted from member-only e-mail: Anders Berglund (member-only message) and subsequently amended by (member-only message)
Resolution:
Originator: | Anders Berglund |
Locus: | Syntax |
The various get-timezone-from-*() functions will not really work for those areas that still use "sun time" (Saudi Arabia at least used to). It would thus be better to have "get-timezone" return a string and, e.g., get-timezone-difference-from-GMT give a duration, which for sun time would vary.
Issue resulted from e-mail: Anders Berglund (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
(a) the syntax doesn't agree with the XPath grammar.
(b) having a leading minus or plus sign as part of the literal is inconsistent with having it as a unary operator in front of the literal.
(c) I see no need to define special literal formats for a "long", alone among all the derived numeric types. 3L and 3 are the same value: the value space of int is a subset of the value space for long, so 3 is both an int and a long.
(d) 1.0e6 should be a double, not a float. Users who take the trouble to use floating point are more likely to want double precision than single precision; if they want single precision, they can use a constructor.
Resolution:
Originator: | Michael Kay, Steve Tolkin |
Locus: | Syntax |
I'm less sure whether it's right that xf:double('ZZZZZ') should return an error value, rather than NaN. In XPath 1.0 it's a general principle that NaN is used as a kind of null value for numbers: anything which would normally produce a number, produces NaN when no number is appropriate. But I can live with it.
Similarly, cast as float('ZZZZZ') is currently defined to return an error value, but could instead return NaN.
Issue resulted from e-mail: (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
In XPath 1.0, "mod" is an operator, while it is defined in this document as a function. It should be redefined to be an operator, which will require moving it to a different section of the document.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
Section 2.6.2 - 2.6.4 (floor(), ceiling(), and round()). A complete treatment of these functions requires consideration of negative zero, NaN, Infinity, etc. This is all covered in XPath 1.0.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
RESOLVED. XPath 1.0 definitions used for these functions.
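The special cases mentioned in this issue can be made concrete with a Python sketch of round() that follows the XPath 1.0 rules: NaN and the infinities are returned unchanged, ties round toward positive infinity, and arguments in the range -0.5 up to (but not including) 0 yield negative zero. The function name xpath_round is an assumption made for this example, and the sketch does not guard against floating-point rounding in x + 0.5 for values just below 0.5.

```python
import math


def xpath_round(x: float) -> float:
    """Round to the closest integer, ties toward positive infinity."""
    if math.isnan(x) or math.isinf(x):
        return x  # NaN, +Infinity and -Infinity pass through unchanged
    if -0.5 <= x < 0 or (x == 0 and math.copysign(1.0, x) < 0):
        return -0.0  # negative zero is preserved or produced
    return float(math.floor(x + 0.5))


for sample in (2.5, -2.5, -0.2, float("nan"), float("-inf")):
    print(sample, "->", xpath_round(sample))
```

Note that Python's built-in round() behaves differently (it rounds ties to the nearest even integer), which is why the sketch is written out explicitly.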
Originator: | Michael Kay |
Locus: | Syntax |
General: there do not seem to be equivalents to the XPath 1.0 functions number(), boolean(), and string(), with all argument combinations, e.g. no function to convert a number to a boolean.
In addition, other possible desirable functions (e.g., abs()) are not specified and it should be determined how to decide what functions are to be included and what ones are not to be included.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
Section 3.3: As discussed, I don't think xf:ID, xf:IDREF, or xf:ENTITY make sense.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
Section 3.5, the "Source" column has many inaccuracies. Many of the functions attributed to XPath 1.0 are new.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
RESOLVED. Erroneous references to XPath 1.0 removed from table.
Originator: | Michael Kay |
Locus: | Syntax |
Some functions (e.g., string-pad-beginning and string-pad-end?) seem to be composite functions that could more usefully be composed from concat() and a primitive pad(char, integer).
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
AND and OR should be and and or. NOT should be not. For completeness, the semantics of these operators should be given. (Note this isn't trivial, because of the rules for whether the second argument is evaluated or not).
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
PARTIALLY RESOLVED: Changed names of the functions. Still needs more explanation of semantics.
Originator: | Michael Kay |
Locus: | Syntax |
Section 9.1, I think perhaps xf:name() should return the name as a QName, not as a string. This needs exploring, especially the semantics for comparing a QName with a string, e.g. if (name(x) = 'foo:bar'). (For that matter, do we define "=" on two QNames?) This and several other functions return "" if the node has no name.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
RESOLVED: name returns a QName.
Originator: | Michael Kay |
Locus: | Syntax |
Section 9.9: I would like to see the operators "=" assigned to "value-equal", and "is" assigned to "node-equal".
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
We need to specify which aspects of the newly constructed node are the same as the original. E.g. does it have the same name? string-value? namespaces? parent? children? This is the only function in this document with side-effects, so it needs special care. How does it relate to the node-construction functions defined in the data model?
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
"node sequences", "user defined list types": we only have sequences.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
I'm not sure about NMTOKENS, etc. Perhaps we should simply provide a tokenize() on these that turns the NMTOKENS value into a sequence?
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
index-of. Need find-by-value and find-by-identity.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
reverse-sort() looks like a composite function that could and should be defined in terms of reverse() and sort(): so replace it with a new primitive function reverse().
If there were an xf:reverse() function, would xf:reverse(xf:sort(x,y)) = xf:reverse-sort(x,y)?
Issue resulted from e-mail: Michael Kay (member-only message)
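The question of whether reversing a sorted sequence equals a reverse sort can be explored with a Python sketch. It assumes a stable sort, which this document does not state for xf:sort; under that assumption the two results differ whenever items share a sort key. The sample data and key are hypothetical.

```python
from operator import itemgetter

# Items are (label, sort key); "a" and "b" share the same key.
items = [("a", 1), ("b", 1), ("c", 0)]
key = itemgetter(1)

# reverse(sort(x)): equal-key items end up in reversed input order.
reversed_after_sort = list(reversed(sorted(items, key=key)))

# reverse-sort(x) as a stable descending sort: equal-key items keep input order.
descending_sort = sorted(items, key=key, reverse=True)

print(reversed_after_sort)
print(descending_sort)
print(reversed_after_sort == descending_sort)
```

If the sort is not required to be stable, or if keys are always distinct, the two formulations would be interchangeable; otherwise the specification would need to say which ordering of ties is intended.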
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
sublist-before() etc. need "by value" and "by identity" variants. So it might be better to rely on index-of() and sublist() which can be used to compose these functions.
Similarly, a primitive pad-sequence(item, integer) would be more useful. (Having to write pad-end( (), 3, 'x') is silly!)
truncate-beginning and end seem not especially useful, and can easily be composed from sublist().
resize-beginning and -end: I haven't begun to work out how these might be used, but they don't seem very primitive to me.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay (member-only message). |
Locus: | Syntax |
Need a simple function to sort a sequence in document order (equivalent to union( (), $x )). See also item 20 in member-only e-mail from Don Chamberlin: http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2001Aug/0216.html
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
I think it may be important for 1.0 compatibility for count() and sum() to remove duplicate nodes first. This depends on the decision we make on the "/" operator.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay |
Locus: | Syntax |
id(), need to say which documents the returned nodes come from.
Issue resulted from e-mail: Michael Kay (member-only message)
Resolution:
Originator: | Michael Kay, Mary Fernandez |
Locus: | Syntax |
This idref() function does what id() does in XPath 1.0.
We do not understand what these functions mean. What is the 'node' argument to xf:id? Why does xf:idref return a sequence of nodes when its argument is a singleton IDREF. Also, the type to xf:idref is wrong: it should be xsd:IDREF.
We would expect to have a function with the following signature:
xf:id(xsd:IDREF)
Returns the node having the unique ID represented by the IDREF argument or the empty sequence (or ERROR) if no such node exists.
Issue resulted from e-mail: Michael Kay (member-only message) and Mary Fernandez (member-only message)
Resolution:
RESOLVED: Function names and signatures have been changed. Except that we made the argument to ID a string* rather than an IDREF for convenience.
Originator: | Steve Zilles and F2F 2001-07-17 |
Locus: | Syntax |
The binary comparison operations determine a "default" collation to use. These operators (and related string "operations" such as sort) must specify how the default collation is determined.
The rules for default collation should specify how "locale" information is taken into account in determining the default.
The rules for default collation should specify how any collation information associated with the data is merged with any other default collation specifications.
The F2F on 2001-07-17 agreed that it should be desirable to specify a default collation in a query (XQuery, at least) preamble, but that explicit specification of a collation in a function invocation overrides that default. The F2F also agreed that the Schema WG should be requested to reconsider whether it should be possible to specify default collations on schema and element (and type) definitions in an XML Schema; those default collations would be used in the absence of a query-level or a function invocation-level specification of a collation. The F2F further agreed that the "ultimate fallback" for a default collation must always be the Unicode Collation Algorithm. Finally, the Function and Operators F2F recognizes the value of allowing collations to be specified (or implied) by a client locale in many instances, but also firmly believes that many data-management situations preclude absolute dependence on locale as the governing factor for ordering decisions.
Issue resulted from e-mail: Steve Zilles (member-only message)
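The precedence order agreed at the F2F can be sketched in Python as a simple fallback chain. The function name, parameter names, and the identifier used for the Unicode Collation Algorithm are all placeholders invented for this example; no collation URIs are defined here, and client-locale handling is deliberately left out.

```python
from typing import Optional

# Placeholder identifier, not a URI defined by this document.
UCA_FALLBACK = "urn:example:collation:unicode-collation-algorithm"


def resolve_collation(
    invocation: Optional[str] = None,
    query_default: Optional[str] = None,
    schema_default: Optional[str] = None,
) -> str:
    """Pick the collation: function invocation, then query, then schema, then UCA."""
    for candidate in (invocation, query_default, schema_default):
        if candidate is not None:
            return candidate
    return UCA_FALLBACK


print(resolve_collation(query_default="urn:example:collation:french"))
print(resolve_collation())
```

An explicit collation on the function invocation always wins in this sketch, matching the agreement that it overrides any query-level default.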
Resolution:
Originator: | Andrew Eisenberg |
Locus: | Syntax |
I believe that we are stating that xf:decimal (" 12.5 ") will generate an error. We should be much more explicit about this. A negative example in the text would help. (Similar comments apply to other numeric constructors.)
Issue resulted from e-mail: Andrew Eisenberg (member-only message)
Resolution:
RESOLVED: Whitespace is not allowed in the argument string. Example added.
Originator: | Andrew Eisenberg |
Locus: | Syntax |
I believe that we need to say something about numeric overflow and underflow in this section. An issue to that effect should be added to the document.
Issue resulted from e-mail: Andrew Eisenberg (member-only message)
Resolution:
Originator: | Andrew Eisenberg |
Locus: | Syntax |
I believe that we should have an xf:compare-between function (analogous to SQL's BETWEEN predicate, allowing quick determination of whether one value lies between two other values).
Issue resulted from e-mail: Andrew Eisenberg (member-only message)
Resolution:
Originator: | Andrew Eisenberg |
Locus: | Syntax |
I was surprised to see that match is defined to return a list of offsets where a pattern is found.
Let's suppose that I'd like to know whether $x is a phone number (for which I have some pattern defined). If I test for xf:match ($x, $phone) = 1, then $x may have many extraneous trailing characters.
If we retain xf:match as it is specified, then I suggest an xf:exact-match function as well.
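The difference between a prefix match and an exact match can be sketched in Python. The phone pattern and the function names below are hypothetical, and Python's re module stands in for whatever regular expression semantics the draft finally adopts.

```python
import re

# Hypothetical phone pattern, used only for illustration.
PHONE = re.compile(r"\d{3}-\d{4}")

def matches_at_start(value: str) -> bool:
    """True if the pattern matches a prefix of value (trailing text allowed)."""
    return PHONE.match(value) is not None

def exact_match(value: str) -> bool:
    """True only if the pattern accounts for the whole of value."""
    return PHONE.fullmatch(value) is not None
```

With these definitions, the string "555-1234 and more" passes the prefix test but fails the exact test.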
Issue resulted from e-mail: Andrew Eisenberg (member-only message)
Resolution:
Originator: | Andrew Eisenberg |
Locus: | Syntax |
This definition includes the following: "... that is the value of the second argument, or is matched by the regular expression that is the second argument, ...".
This seems ill-defined; perhaps two functions are needed.
It seems necessary to say something about the matching that is going on here.
xf:replace ("acadae", "a(.*)a", "b$1b") => "bcbdae" or "bcadbe"
The match could be to the shortest string that qualifies, or the longest string that qualifies.
It seems like I will need a special ("escape") mechanism to be able to include "$1" as actual replacement text in the third argument.
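The two candidate results can be reproduced with Python's re module, which is used here only as a stand-in. Note that Python writes the group reference as \1 rather than $1, and that the escape shown for literal replacement text is Python's convention, not a proposal for xf:replace.

```python
import re

SOURCE = "acadae"

# Greedy: (.*) takes as much as possible, so the match is "acada".
greedy = re.sub(r"a(.*)a", r"b\1b", SOURCE, count=1)

# Reluctant: (.*?) takes as little as possible, so the match is "aca".
reluctant = re.sub(r"a(.*?)a", r"b\1b", SOURCE, count=1)

# A doubled backslash in the template yields a literal backslash followed by "1".
literal = re.sub(r"a(.*?)a", r"b\\1b", SOURCE, count=1)
```

Here greedy is "bcadbe" and reluctant is "bcbdae", which are the two outcomes named in the issue.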
Issue resulted from e-mail: Andrew Eisenberg (member-only message)
Resolution:
Originator: | Ashok Malhotra |
Locus: | Syntax |
I had a question from one of our developers re. the "type" function on Node. Does this return the type of the node i.e. element-node, attribute-node, etc. or does it return the Schema type i.e. definition of that node? Since this function merely reflects an accessor in the data model draft, I checked that document. It says that the function returns a Schema component. So we need to make the text clearer to say that the "type" function returns the Schema definition of the node.
Resolution:
RESOLVED. The XPath taskforce decided on 7/24/2001 to remove the xf:type function.
[Issue 77: Should there be a function to convert characters to strings?]
Originator: | Mary Fernandez |
Locus: | Syntax |
Jerome and I are working on the mapping from XQuery to the core. We want to know whether there should be any operators defined on individual characters or whether all operators are on strings. At a minimum, we think we need a constructor that takes an individual character and returns a string.
Issue resulted from e-mail: Mary Fernandez (member-only message)
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
In one place, the type promotion rules say that any type may be promoted to the type of its "primitive ancestor", while in another, the phrase used is "most primitive ancestor". This discrepancy is confusing; is there intended to be a difference in semantics implied by the different wording?
In addition, two paragraphs claim that "one operand is promoted to be the type of the other operand", while the remainder of the section discusses promoting types to their "primitive ancestor".
Issue resulted from marked-up copy of Version 0.6.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
The xf:mod() function, the div operator, and others do not specify what the required number of digits of precision is in the result.
Issue resulted from marked-up copy of Version 0.6.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
The constructor normalizedString() requires that the value of the supplied string argument conform to the lexical requirements of a normalizedString. Is there a function or constructor that will normalize a non-normalized string and return a normalizedString?
Issue resulted from marked-up copy of Version 0.6.
Resolution:
RESOLVED. Yes, it's called normalize-space. It takes a string as argument and returns a normalized string. Casting from string to normalizedString will perform a similar function.
Originator: | Paul Biron |
Locus: | Syntax |
The precise semantics of regular expressions are not clear. The definition of regular expressions in XML Schema is somewhat ambiguous and, in addition, it is not completely clear what semantics would best benefit XQuery. One issue (just as an example): XML Schema's regular expressions do not support "$n" placeholders, even though the examples in this document depend on them.
Another example: Although the XML Schema definition of regular expressions provides a "greedy" algorithm that attempts to match the longest possible strings, the use cases for defining a subset of a datatype's lexical space, on the one hand, and on matching a string against a pattern, on the other hand, are different...and one might want different behavior. Meaning that one might want a variation that is not as "greedy".
See also [Issue 74: Is a "match-exact()" function needed?] and [Issue 75: The semantics of match() are incompletely specified].
Issue resulted from private e-mail exchange.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
This document must properly align with [XQuery 1.0 and XPath 2.0 Data Model] when dealing with node sets (which are defined in [XPath 1.0]), lists, and sequences. The current wording in this document is sometimes confused about the distinctions.
Issue resulted from marked-up copy of Version 0.6.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
The tables would be much more useful if these names were grouped by "family" (e.g., all numeric types started with the letter "n", all date and time types with "d", string-like types with "s", and binary types with "b"). (Of course, boolean could go either in with the numeric types or in with the binary types.) Then the tables could be sorted alphabetically. There is already a system: the types cast to in the next few sections.
Issue resulted from marked-up copy of Version 0.6.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
The current definition of casting to token states that all line feed codes and all tab characters are removed, after which leading and trailing spaces are deleted and multiple spaces are replaced with a single space. Shouldn't line feed codes and tabs be converted to spaces instead of simply deleted?
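The practical difference between the two readings shows up when a tab or line feed is the only separator between two words. The Python sketch below contrasts them; both function names are invented for this example.

```python
import re

def to_token_deleting(text: str) -> str:
    """Delete tabs and line feeds, then collapse and trim spaces."""
    stripped = re.sub(r"[\t\n]", "", text)
    return re.sub(r" +", " ", stripped).strip(" ")

def to_token_replacing(text: str) -> str:
    """Turn tabs and line feeds into spaces, then collapse and trim spaces."""
    spaced = re.sub(r"[\t\n]", " ", text)
    return re.sub(r" +", " ", spaced).strip(" ")
```

For the input "red\tgreen\nblue", deletion fuses the words into "redgreenblue", while replacement keeps them apart as "red green blue".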
Issue resulted from marked-up copy of Version 0.6.
Resolution:
Originator: | Steve Tolkin |
Locus: | Syntax |
The semantics of xf:boolean(node) currently read "Returns a boolean based on the argument". This seems just a teensy bit underspecified.
Issue resulted from marked-up copy of Version 0.6.
Resolution:
The semantics have been specified more completely.
Originator: | Jim Tivy and the WG telcon on 8/1 |
Locus: | Syntax |
Do the node accessor functions such as name(), string-value(), typed-value(), parent(), children(), node-kind() belong in this document or the datamodel document? Similarly, do the kind tests text(), node() belong in this document or the XQuery document?
Resolution:
Originator: | Andrew Eisenberg (member-only message) |
Locus: | Syntax |
Do the node accessor functions such as string-value(), typed-value(), parent(), children(), node-kind(), node(), text() and data() belong in this document or the datamodel document?
Resolution:
Originator: | Michael Rys on the August 15, 2001 telcon. |
Locus: | Syntax |
Should the input parameters to these operators be restricted so that indeterminate values never arise? For example, when comparing dateTimes etc., both arguments either must or must not have a timezone. When comparing durations, both arguments must be either year-month or day-hour-minute-second. This is what SQL allows.
Resolution:
Originator: | Phil Wadler (member-only message) |
Locus: | Syntax |
Functions with AnyType in the return are problematic for two reasons. To be concrete, I discuss the following.
-- xf:item-at(anyType* $seqParam, decimal $posParam) => anyType
(1) Note that the types anyType* and anyType are equivalent, which suggests that the typing here is not quite right. We should define
-- define group AnyItem = AnyElement | AnyAttribute | AnySimpleType
and then give the above the type
-- xf:item-at(anyItem* $seqParam, decimal $posParam) => anyItem
(2) Even having made the above change, the type is too broad to be useful, and one will almost always have to cast the result of calling xf:item-at (and similarly for other functions with anyItem or anyType in the result).
Instead, we should allow parametric polymorphism when specifying the signatures of built-in functions.
-- xf:item-at($anyItem* $seqParam, decimal $posParam) => $anyItem
Here $anyItem is a type variable which ranges over any group $anyItem such that $anyItem <: AnyItem. (Recall that s <: t if the extent of type s is a subset of the extent of type t, where the extent of a type is the set of values that have that type.)
Here are two examples of functions written with the current signature.
-- define function second-integer (integer* $integer-sequence) integer { treat as integer (xf:item($integer-sequence, 2)) }
-- define function third-book (Book* $book-sequence) book { treat as Book (xf:item($book-sequence, 3)) }
Here are two examples of functions that would type check under this scheme.
-- define function second-integer (integer* $integer-sequence) integer { xf:item($integer-sequence, 2) }
-- define function third-book (Book* $book-sequence) book { xf:item($book-sequence, 3) }
The definitions are easier to write and more efficient to execute (since no "treat as" needs to check the structure of the result).
Parametric polymorphism would also be useful for user-defined functions, if it were clear how to define it for user-defined functions in general. But at the very least, we should allow parametric polymorphism for the built-in functions defined in the functions and operators document.
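The idea of a signature with a type variable can be sketched with Python type hints, where T plays the role of $anyItem. This is only an analogy: Python checks such signatures with an external static checker, and the names item_at and second_integer are chosen for this example.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # plays the role of the type variable $anyItem

def item_at(seq: Sequence[T], pos: int) -> T:
    """Return the item at 1-based position pos, typed as the element type."""
    if not 1 <= pos <= len(seq):
        raise IndexError(f"position {pos} is outside the sequence")
    return seq[pos - 1]

def second_integer(integers: Sequence[int]) -> int:
    # No cast is needed: a static checker infers int from Sequence[int].
    return item_at(integers, 2)
```

The caller's function body carries no cast or "treat as" equivalent, because the result type follows from the argument type.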
Resolution:
Originator: | Norm Walsh (member-only message). |
Locus: | Syntax |
The constructor functions for ID and IDREF need a document context for validity.
Resolution:
Originator: | Steve Tolkin in August 15, 2001 telcon. |
Locus: | Syntax |
There also need to be versions of UNION, INTERSECT, and EXCEPT that work on value equality rather than node identity. See also the XQuery Issue 63: Set operations based on value (xquery-set-operators-on-values).
See also [Issue 132: union(), intersect(), and except(): only for simple values?].
Resolution:
Originator: | Steve Tolkin (member-only message). |
Locus: | Syntax |
The abs() (absolute value) function is required. The question is, what should the return type be?
If there were a single type at the top of the numeric type hierarchy, that would be a reasonable choice. But there isn't -- decimal, float and double are siblings.
A better answer is that the return type is the type of the argument, e.g., abs(-1) is 1, abs(-2L) is 2L, abs(-3.4) is 3.4 etc. This implies function overloading. But I think this is OK because this is a system-defined function.
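Python's built-in abs() happens to behave in the way suggested, returning a result of the same type as its argument, so it can serve as a small illustration. The sample values are arbitrary, and Python's int, Decimal and float are only loose analogues of the schema numeric types.

```python
from decimal import Decimal

samples = [-1, Decimal("-3.4"), -2.5]
results = [abs(value) for value in samples]

# Each result has the same type as the argument it was computed from.
same_types = [type(r) is type(v) for r, v in zip(results, samples)]
```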
Resolution:
Originator: | Steve Tolkin (member-only message). |
Locus: | Syntax |
For example, a search for the string "bronte" should find the string "Bronté".
In earlier versions of this Functions and Operators document there were special functions that did this. Then a decision was made that this should instead be based on a named collation. However, at the moment there does not seem to be any vendor who supports a named collation that has this behavior. Nor does XQuery provide any way for a user to define a collation.
Assuming that such a collation exists, we also need to clarify how (or if) a query can specify that it applies to string comparison done with the = operator.
Also, each of the two existing xf:substring functions should have a variant that takes a collation argument.
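The kind of comparison being asked for can be approximated in Python by stripping combining marks and folding case. This is a crude stand-in and not an implementation of any named collation or of the Unicode Collation Algorithm; the helper names are invented for the example.

```python
import unicodedata

def fold(text: str) -> str:
    """Crude comparison key: drop combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFD", text)
    unmarked = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unmarked.casefold()

def loosely_equal(left: str, right: str) -> bool:
    """Compare two strings while ignoring accents and case."""
    return fold(left) == fold(right)
```

Under this key "bronte" and "Bronté" compare equal, although they differ under plain codepoint equality.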
Resolution:
Originator: | Steve Tolkin (member-only message). |
Locus: | Syntax |
Systems that search text often provide a function to search for several terms that are near one another. Often this function is named NEAR and allows an argument to specify the maximum distance allowed between the terms. (Generally it only allows two terms, but sometimes it allows more.)
XQuery should provide a standard function to achieve this. To avoid the difficult problem of deciding what is a word boundary etc. it should be defined instead on a sequence of nodes, and search their content.
One possible signature would be within-window(node* node-sequence, int size, string* strings, boolean order-matters)
The last argument order-matters is optional and if true the strings arguments would have to match in the order given.
A function like this could be used to answer the Library of Congress use cases at http://lcweb.loc.gov/crsinfo/xml/lc_usecases.html
It also could be used to answer the question of how to search for the sequence of pitches: C, D, E, C, as discussed in the member-only e-mail http://list3.w3.org/Archives/Member/w3c-xml-query-wg/2001Jul/0085.html
First the user would remove the extraneous ancestor nodes caused by the measures, leaving a sequence of note nodes. Then the essence of the query is simply: within-window(note-sequence, 4, make-list-of-strings("C","D","E","C"), true) where the 3rd argument is hypothetical syntax for constructing a list of strings.
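One possible reading of within-window can be sketched in Python. The semantics below are assumptions made for the example: node content is modelled as a plain string, a term matches only an item whose content equals it exactly, and the window is a run of consecutive items.

```python
from typing import Sequence

def within_window(contents: Sequence[str], size: int,
                  terms: Sequence[str], order_matters: bool = False) -> bool:
    """True if some run of `size` consecutive items contains every term."""
    for start in range(max(len(contents) - size, 0) + 1):
        window = list(contents[start:start + size])
        if order_matters:
            # Terms must appear as a subsequence of the window, in order.
            remaining = iter(window)
            if all(term in remaining for term in terms):
                return True
        else:
            # Each term must be matched by a distinct item of the window.
            pool = list(window)
            try:
                for term in terms:
                    pool.remove(term)
            except ValueError:
                continue
            return True
    return False
```

For the pitch example, a call such as within_window(["G", "C", "D", "E", "C", "F"], 4, ["C", "D", "E", "C"], True) succeeds because the second through fifth notes form the requested sequence.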
Issue resulted from member-only e-mail http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2001Jul/0127.html
Resolution:
Originator: | Steve Tolkin (member-only message). |
Locus: | Syntax |
Users will need to test for absent elements, and optionally provide a value for them. (There is no support in XML Schema for a default value for absent elements.) Writing this function is quite complex -- perhaps too complex to expect from XQuery users -- and too verbose to include in every query that needs it. It should be provided as a built-in function.
The following code was provided by Mary Fernandez; see more in the e-mail cited below.
define function if_absent ( xs:AnyElement? $e, xs:AnySimpleType $s ) return xs:AnySimpleType {
  typeswitch ($e) as $v
    case () return $s
    case xs:AnyElement return (
      typeswitch ($v)
        case ELEMENT * { xs:AnySimpleType } return data($v)
        default return error )
    default return error
}
Similarly users will need to be able to translate an empty element to an element with a particular value. This could be a separate issue, but for now we can keep it as part of this issue. The e-mail cited below has even more complex code for this function.
There should also be a way to test for an absent attribute. This might be a separate function, or combined into this function.
Issue resulted from member-only e-mail http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2001Jul/0171.html
Resolution:
Originator: | Don Chamberlin (member-only message; Items 13 and 14) |
Locus: | Syntax |
Section 5.7 describes three functions named get-duration(), get-end(), and get-start() whose purpose is to compute sums and differences of dates and durations. This purpose would be served in a much more readable way by using arithmetic operators, as in SQL. For example, get-end(time1, duration2) could be better expressed as time1+duration2.
Section 5.7 also describes three functions named temporal-dateTimes-contains, temporal-dateTimeDuration-contains, and temporal-durationDateTimes-contains whose purpose is to test whether a time is contained in an interval. This purpose would be served in a much more readable way by using comparison operators. For example, temporal-dateTimeDuration-contains(time1, duration2, time3) could be better expressed as time3 >= time1 AND time3 <= time1 + duration2.
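The operator-based formulation can be illustrated with Python's datetime and timedelta types, which already support this style of arithmetic and comparison. The variable names follow the examples above and the values are arbitrary.

```python
from datetime import datetime, timedelta

time1 = datetime(2001, 8, 1, 9, 0)
duration2 = timedelta(hours=8)
time3 = datetime(2001, 8, 1, 12, 30)

# get-end(time1, duration2) written as plain arithmetic.
end = time1 + duration2

# temporal-dateTimeDuration-contains(time1, duration2, time3) as comparisons.
contained = time1 <= time3 <= time1 + duration2
```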
Resolution:
Originator: | Don Chamberlin (member-only message; Item 16) |
Locus: | Syntax |
This section lists two equality-tests between nodes, called "node-equal" and "value-equal". Value-equal is defined to ignore nested markup, as in XPath 1.0. We also need a comparison-test, possibly called "deep-equal", which respects nested markup, requiring the nodes being compared to have matching descendant subtrees with corresponding nodes equal in name and content. For example, consider a book with author "Mark Twain" and title "Tom Sawyer", and another book with title "Mark Twain" and author "Tom Sawyer". These books would be value-equal (according to XPath 1.0) but not deep-equal.
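The book example can be worked through in Python with the standard ElementTree module. The deep_equal function below is one plausible definition, written for illustration; the element order in the two books is assumed so that their string-values coincide.

```python
import xml.etree.ElementTree as ET

book_a = ET.fromstring(
    "<book><author>Mark Twain</author><title>Tom Sawyer</title></book>")
book_b = ET.fromstring(
    "<book><title>Mark Twain</title><author>Tom Sawyer</author></book>")

def string_value(node: ET.Element) -> str:
    """Concatenate all descendant text, ignoring markup (XPath 1.0 style)."""
    return "".join(node.itertext())

def deep_equal(left: ET.Element, right: ET.Element) -> bool:
    """Compare names, attributes, text, and children, recursively."""
    return (left.tag == right.tag
            and left.attrib == right.attrib
            and (left.text or "") == (right.text or "")
            and (left.tail or "") == (right.tail or "")
            and len(left) == len(right)
            and all(deep_equal(l, r) for l, r in zip(left, right)))
```

Both books have the string-value "Mark TwainTom Sawyer", so a comparison that ignores markup treats them as equal, while deep_equal distinguishes them by element name.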
Resolution:
Originator: | Dana Florescu (member-only message). |
Locus: | Syntax |
We need head() and tail() functions on sequences. Some disagreement from Denise Draper: http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2001Aug/0247.html
Resolution:
Originator: | Don Chamberlin (member-only message; Item 20). |
Locus: | Syntax |
Remove the sort and reverse-sort functions. They are covered by the SORTBY operator in the language.
Resolution:
Originator: | Don Chamberlin (member-only message; Item 20) |
Locus: | Syntax |
last() is not correctly described. In the XPath 1.0 function library, last() takes no arguments, and it returns the "context size from the expression evaluation context". This terminology doesn't fit XQuery very well.
position() is not correctly described. In the XPath 1.0 function library, position() takes no arguments, and it returns the "context position from the expression evaluation context". This terminology doesn't fit XQuery very well.
Resolution:
Originator: | Don Chamberlin (member-only message; Item 9) |
Locus: | Syntax |
The WG has decided that if a function expects a scalar value and it is called with an empty sequence, an error results. But many functions are defined that expect scalar values. As specified, none of these functions can be safely used in a path expression. Consider floor(decimal) as an example. Consider the path expression //emp/floor(bonus). This expression will raise an error if any employee does not have a bonus. This is clearly not what is intended. Nearly all of the parameter declarations should be edited to allow for empty sequences.
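The consequence for path expressions can be sketched in Python, with None standing in for the empty sequence. Both function names and the bonus values are hypothetical.

```python
import math
from typing import Optional

def floor_strict(value: Optional[float]) -> int:
    """Rejects the empty sequence (modelled here as None)."""
    if value is None:
        raise TypeError("floor() requires exactly one value")
    return math.floor(value)

def floor_lenient(value: Optional[float]) -> Optional[int]:
    """Passes the empty sequence through unchanged."""
    return None if value is None else math.floor(value)

bonuses = [1200.75, None, 310.2]  # None: an employee with no bonus
floors = [floor_lenient(b) for b in bonuses]
```

Applying floor_strict across the same list would fail at the second employee, which is the behavior the issue objects to.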
Resolution:
Originator: | Don Chamberlin (member-only message; Item 1 (e)). |
Locus: | Syntax |
We need operators for UNION, INTERSECT and EXCEPT; these are operators in XQuery. They can be implemented by mapping to the functions but should be added to the document.
Resolution:
Originator: | Don Chamberlin (member-only message; Item 1 (f)) |
Locus: | Syntax |
We need operators for BEFORE and AFTER. They can be implemented by mapping to the functions but should be added to the document.
Resolution:
Originator: | Don Chamberlin (member-only message; Item 10) |
Locus: | Syntax |
XPath 1.0 supports six comparison operators between string-values. None of these are listed in this document. You should list them, and raise an issue about their meaning, since the XPath definition is probably no longer appropriate.
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
A compatibility issue exists with the div operator. When XPath 1.0 semantics are used, 1 div 4 returns 0.25, but the current specification apparently returns zero.
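The three candidate behaviors for dividing two integers can be seen side by side in Python, which offers each of them under a different operator or type. Which one div should adopt is the open question; the variable names are illustrative.

```python
from decimal import Decimal

xpath1_style = 1 / 4                      # floating-point division, as in XPath 1.0
integer_style = 1 // 4                    # integer division discards the fraction
decimal_style = Decimal(1) / Decimal(4)   # exact decimal quotient
```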
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
The examples may have to be changed, since expansion of character references is performed by a preprocessing step that may happen in some contexts (e.g., as part of an XML document), but not in others (e.g., when used in a Java application).
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
It is not clear that many functions can make sense without some notion of document context. Perhaps those functions should have an optional context argument of some sort. What does it mean to have an XML ENTITY value without some notion of the document in which it occurs?
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
When does it make sense for a function to return a string that is in a form other than some Unicode normalized form?
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
Some sort of (defaulted) calendar context is required for all these functions to allow for future support of non-Gregorian calendars.
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
"...does not contain a timezone, the result is the empty sequence" should be "...does not contain a timezone, the result is the empty string".
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
The functions xf:get-end() and xf:get-start() should have more distinctive names, perhaps xf:get-end-datetime() and xf:get-start-datetime().
Resolution:
Originator: | Norm Walsh (member-only message) |
Locus: | Syntax |
How can QName-from-uri operate without some sort of context?
What is a QName in no namespace?
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
The document does not specify the complete semantics of numeric comparisons. For example, is -0.0e1 equal to +0.0e1? Is NaN equal to NaN?
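For reference, the answers given by IEEE 754 floating-point arithmetic can be observed with Python floats. This shows one existing convention only; it does not settle what this document should specify.

```python
import math

negative_zero = float("-0.0e1")
positive_zero = float("+0.0e1")
nan = float("nan")

zeros_equal = negative_zero == positive_zero   # the two zeros compare equal
nan_equal = nan == nan                         # NaN is not equal to itself
# The zeros remain distinguishable by sign even though they compare equal.
signs_differ = (math.copysign(1.0, negative_zero)
                != math.copysign(1.0, positive_zero))
```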
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
It would be simpler to provide codepoint-compare() by defining a standard collation name that represents "Unicode codepoint order". This would avoid the need for codepoint-substring-after(), etc.
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
Users might reasonably expect us to provide a QName constructor that takes as input the string "prefix:local-name".
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
It might be more user-friendly to return an empty string if the name is in no namespace, since an empty string can never be used as a namespace URI.
Resolution:
Originator: | Mike Kay (member-only message) |
Locus: | Syntax |
These sections state that, if the nodes are in different documents, the results are implementation-dependent. This is unnecessary, since [XQuery 1.0 and XPath 2.0 Data Model] defines what document order means in these cases.
Resolution:
Originator: | Don Chamberlin (member-only message) |
Locus: | Syntax |
string(node) contains the XPath 1.0 definition of the string() function. This definition is based on the type system of XPath 1.0, which has only four types, and it should be reconsidered for XQuery. For example, the definition says that a sequence is converted to a string by returning the string-value of the first item in the sequence. This conflicts with our general policy of raising an error if a scalar function is called with a sequence of length greater than one.
Resolution:
Originator: | Don Chamberlin (member-only message) |
Locus: | Syntax |
The XPath 1.0 definition of the boolean() function is based on the type system of XPath 1.0, which has only four types, and it should be reconsidered for XQuery. For example, the definition says that a sequence is true if and only if it is non-empty. But in XQuery we believe that a sequence that contains the single Boolean value False should in fact be False, not True.
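The two rules can be contrasted in a short Python sketch, with a list standing in for a sequence. The second function is only one hypothetical way to express the behavior argued for above.

```python
from typing import Sequence

def boolean_xpath1(items: Sequence[object]) -> bool:
    """XPath 1.0 style rule: a sequence is true if and only if it is non-empty."""
    return len(items) > 0

def boolean_proposed(items: Sequence[object]) -> bool:
    """Hypothetical rule: a single Boolean item stands for itself."""
    if len(items) == 1 and isinstance(items[0], bool):
        return items[0]
    return len(items) > 0
```

The two rules disagree only on a sequence holding the single value False.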
Resolution:
Originator: | Don Chamberlin (member-only message) |
Locus: | Syntax |
The union-all, intersect-all, and except-all functions return their results in document order. These functions should return their results in implementation-defined order. The union, intersect, and except operators (like path-steps) eliminate duplicates and sort in document order. This is expensive. The "-all" versions of these operators should take the "cheap" strategy: don't eliminate duplicates and don't sort in document order. If the user has a need for either duplicate elimination or sorting, orthogonal functions are (or should be) available for this purpose (see [Issue 66: A function to reorder a sequence into document order is needed]).
Resolution:
Originator: | Don Chamberlin (member-only message) |
Locus: | Syntax |
Comparison of datetime values describes six comparison operators that return true, false, or "indeterminate", said to be represented by the symbol "<>". All other comparison operators in the language represent the unknown truth value by an empty sequence. The operators in that section should be defined consistently with the rest of the language. This issue is not adequately described by [Issue 38: How are indeterminate values in date/time values represented?], which deals with indeterminate date/time values (in the new issue, we need to represent an unknown truth-value, not an indeterminate date/time.)
Resolution:
Originator: | Don Chamberlin (member-only message) |
Locus: | Syntax |
Section 6.1.2.4 refers to namespace URIs as being "in scope". I do not understand what this means. XQuery has a concept of a scope for namespace prefixes, but as far as I know we have not defined the concept of scope for URIs.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
Comparisons of Duration and Datetime values: Should these be functions or operators?
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
add-months() is incompletely specified. What is February 28th, 2001 + 1 month? Does the result always have the same value for DD? Or is there a standard length month? add-years() is also incompletely specified. Is leap-year taken into account? Does the result have the same values for MM and DD?
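One possible answer is to keep DD where the target month has such a day and otherwise fall back to the last day of that month. The Python sketch below implements that policy only; it is an assumption made for illustration, and other policies (such as a fixed-length month) would give different results.

```python
import calendar
from datetime import date

def add_months_clamped(start: date, months: int) -> date:
    """Keep DD where possible; otherwise use the last day of the target month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    last_day = calendar.monthrange(year, month + 1)[1]
    return date(year, month + 1, min(start.day, last_day))
```

Under this policy February 28th, 2001 plus one month is March 28th, 2001, while January 31st, 2001 plus one month is February 28th, 2001.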
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
Since these are essentially blobs, does it make sense to define comparators on them?
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
We need to carefully consider which boolean conversions make sense.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
This document is missing the distinct() function used in the XQuery document. We have added xf:value-distinct() - I think it would be best to remove this latter function and add the former, which occurs liberally in our use cases and XQuery spec.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
This document is missing a function to put nodes in document order, eliminating duplicates by identity. This has been called xf:doc-order() in some emails. I am not sure that we need xf:identity-distinct(). If we have xf:doc-order() and xf:distinct(), that should be enough.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
I also question the utility of the following: xf:sort(): Why not use SORTBY instead? xf:reverse-sort(): Again, use SORTBY.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
xf:position(), xf:sublist-before(), xf:sublist-after(), xf:sequence-pad-beginning(), xf:sequence-pad-end(), xf:truncate-beginning(), xf:truncate-end(), xf:resize-beginning(), xf:resize-end()
All of these rely on complex deep comparisons of nodes. But we will probably need to supply more than one way of comparing nodes. None of these feel like they fall on the right side of the 80/20 decision.
A complete specification of these functions is complex, and there are many different ways we might choose to define the comparisons that all make sense.
If we do choose to keep these functions, I would argue that they should be second order functions, where the user specifies the comparison function. Since we have agreed not to do second-order functions in Level 1.0, that would mean postponing it until Level 2.0.
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
xf:insert() - This is easy to do without the function.
xf:insert($target, $position, $inserts) => $target[1 to $position], $inserts, $target[($position+1) to last()]
Do we really need a function for this?
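The same rewriting can be expressed with Python slices, which supports the point that insertion needs little more than subsequence selection and concatenation. The function below is a sketch in which position counts the items of the target that precede the inserted items.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def insert(target: Sequence[T], position: int, inserts: Sequence[T]) -> List[T]:
    """Place inserts after the first `position` items of target."""
    return list(target[:position]) + list(inserts) + list(target[position:])
```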
Resolution:
Originator: | Jonathan Robie (member-only message) |
Locus: | Syntax |
Since there is no one right way to compare nodes by value, this makes sense only for simple types. See also [Issue 91: Need value-based functions for Union, Intersect and Except.].
Resolution:
n can be any sequence of decimal digits (0 through 9).
The L can appear in either lower case (l) or upper case.
The E can appear in either lower case (e) or upper case.
"xxx" can be any sequence of Unicode characters that does not include the double-quote character (").
"xxx" can be any sequence of Unicode characters that does not include the single-quote (or apostrophe) character (').