W3C

XQuery 1.0 and XPath 2.0 Data Model

W3C Working Draft 23 July 2004

This version:
http://www.w3.org/TR/2004/WD-xpath-datamodel-20040723/
Latest version:
http://www.w3.org/TR/xpath-datamodel/
Previous versions:
http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/ http://www.w3.org/TR/2003/WD-xpath-datamodel-20030502/
Editors:
Mary Fernández (XML Query WG), AT&T Labs <mff@research.att.com>
Ashok Malhotra (XML Query and XSL WGs), Microsoft <ashokma@microsoft.com>
Jonathan Marsh (XSL WG), Microsoft <jmarsh@microsoft.com>
Marton Nagy (XML Query WG), Science Applications International Corporation (SAIC) <marton.nagy@saic.com>
Norman Walsh (XSL WG), Sun Microsystems <Norman.Walsh@Sun.COM>

This document is also available in these non-normative formats: XML.


Abstract

This document defines the W3C XQuery 1.0 and XPath 2.0 Data Model, which is the data model of [XPath 2.0], [XSLT 2.0], and [XQuery], and any other specifications that reference it. This data model is based on the [XPath 1.0] data model and earlier work on an [XML Query Data Model]. This document is the result of joint work by the [XSL Working Group] and the [XML Query Working Group].

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a Public Working Draft for review by W3C Members and other interested parties. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The XQuery 1.0 and XPath 2.0 Data Model has been defined jointly by the XML Query Working Group and the XSL Working Group (both part of the XML Activity).

This working draft includes a number of changes made in response to comments received during the Last Call period that ended on Feb. 15, 2004. The working group is continuing to process these comments, and additional changes are expected.

This document reflects decisions taken up to and including the face-to-face meeting in Cambridge, MA during the week of June 21, 2004. These decisions are recorded in the Last Call issues list (http://www.w3.org/2004/07/data-model-issues.html). However, some of these decisions may not yet have been incorporated into this document.

Public comments on this document and its open issues are invited. Comments should be sent to the W3C mailing list public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with “[DM]” at the beginning of the subject field.

The patent policy for this document is expected to become the 5 February 2004 W3C Patent Policy, pending the Advisory Committee review of the renewal of the XML Query Working Group. Patent disclosures relevant to this specification may be found on the XML Query Working Group's patent disclosure page and the XSL Working Group's patent disclosure page. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Introduction
2 Concepts
    2.1 Terminology
    2.2 Notation
    2.3 Node Identity
    2.4 Document Order
    2.5 Sequences
    2.6 Types
        2.6.1 Representation of Types
        2.6.2 Predefined Types
        2.6.3 Type Hierarchy
        2.6.4 Atomic Values
        2.6.5 String Values
3 Data Model Construction
    3.1 Direct Construction
    3.2 Construction from an Infoset
    3.3 Construction from a PSVI
        3.3.1 Mapping PSVI Additions to Type Names
            3.3.1.1 Element and Attribute Node Type Names
            3.3.1.2 Atomic Value Type Names
        3.3.2 Mapping xsi:nil on Element Nodes
        3.3.3 Dates and Times
            3.3.3.1 Storing xs:dateTime, xs:date, and xs:time Values in the Data Model
            3.3.3.2 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values
    3.4 String and Typed Values
        3.4.1 Consistent with XML Schema Validation
            3.4.1.1 Pattern Facets
        3.4.2 Undefined Values
4 Data Model Serialization
5 Infoset Mapping
6 Accessors
    6.1 base-uri Accessor
    6.2 node-name Accessor
    6.3 parent Accessor
    6.4 string-value Accessor
    6.5 typed-value Accessor
    6.6 type-name Accessor
    6.7 children Accessor
    6.8 attributes Accessor
    6.9 namespaces Accessor
    6.10 nilled Accessor
7 Nodes
    7.1 Document Nodes
        7.1.1 Overview
        7.1.2 Accessors
        7.1.3 Construction from an Infoset
        7.1.4 Construction from a PSVI
        7.1.5 Infoset Mapping
    7.2 Element Nodes
        7.2.1 Overview
        7.2.2 Accessors
        7.2.3 Construction from an Infoset
        7.2.4 Construction from a PSVI
        7.2.5 Infoset Mapping
    7.3 Attribute Nodes
        7.3.1 Overview
        7.3.2 Accessors
        7.3.3 Construction from an Infoset
        7.3.4 Construction from a PSVI
        7.3.5 Infoset Mapping
    7.4 Namespace Nodes
        7.4.1 Overview
        7.4.2 Accessors
        7.4.3 Construction from an Infoset
        7.4.4 Construction from a PSVI
        7.4.5 Infoset Mapping
    7.5 Processing Instruction Nodes
        7.5.1 Overview
        7.5.2 Accessors
        7.5.3 Construction from an Infoset
        7.5.4 Construction from a PSVI
        7.5.5 Infoset Mapping
    7.6 Comment Nodes
        7.6.1 Overview
        7.6.2 Accessors
        7.6.3 Construction from an Infoset
        7.6.4 Construction from a PSVI
        7.6.5 Infoset Mapping
    7.7 Text Nodes
        7.7.1 Overview
        7.7.2 Accessors
        7.7.3 Construction from an Infoset
        7.7.4 Construction from a PSVI
        7.7.5 Infoset Mapping
8 Conformance

Appendices

A XML Information Set Conformance
B Error Summary
C References
    C.1 Normative References
    C.2 Other References
D Glossary (Non-Normative)
E Example (Non-Normative)
F Accessor Summary (Non-normative)
G Infoset Construction Summary (Non-normative)
H PSVI Construction Summary (Non-normative)
I Infoset Mapping Summary (Non-normative)


1 Introduction

This document defines the XQuery 1.0 and XPath 2.0 Data Model, which is the data model of [XPath 2.0], [XSLT 2.0] and [XQuery].

The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model") serves two purposes. First, it defines the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages. A language is closed with respect to a data model if the value of every expression in the language is guaranteed to be in the data model. XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to the data model.

The data model is based on the [Infoset] (henceforth "Infoset"), but it requires the following new features to meet the [XPath 2.0 Requirements] and [XML Query Requirements]:

As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model specifies what information in the documents is accessible, but it does not specify the programming-language interfaces or bindings used to represent or access the data.

The data model can represent various values including not only the input and the output of a stylesheet or query, but all values of expressions used during the intermediate calculations. Examples include the input document or document repository (represented as a Document Node or a sequence of Document Nodes), the result of a path expression (represented as a sequence of nodes), the result of an arithmetic or a logical expression (represented as an atomic value), a sequence expression resulting in a sequence of items, etc.

This document provides a precise definition of the properties of nodes in the XQuery 1.0 and XPath 2.0 Data Model, how they are accessed, and how they relate to values in the Infoset and PSVI.

2 Concepts

This section outlines a number of general concepts that apply throughout this specification.

2.1 Terminology

For a full glossary of terms, see D Glossary.

In this specification the words must, must not, should, should not, may and recommended are to be interpreted as described in [RFC 2119].

This specification distinguishes between the data model as a general concept and specific items (documents, elements, atomic values, etc.) that are concrete examples of the data model by identifying all concrete examples as instances of the data model.

[Definition: Every instance of the data model is a sequence.]

[Definition: A sequence is an ordered collection of zero or more items.] A sequence cannot be a member of a sequence. A single item appearing on its own is modeled as a sequence containing one item. Sequences are defined in 2.5 Sequences.

[Definition: An item is either a node or an atomic value.]

Every node is one of the seven kinds of nodes defined in 7 Nodes. Nodes form a tree that consists of a root node plus all the nodes that are reachable directly or indirectly from the root node via the dm:children, dm:attributes, and dm:namespaces accessors. Every node belongs to exactly one tree, and every tree has exactly one root node.

[Definition: A tree whose root node is a Document Node is referred to as a document.]

[Definition: A tree whose root node is not a Document Node is referred to as a fragment.]

[Definition: An atomic value is a value in the value space of an atomic type and is labeled with the name of that atomic type.]

[Definition: An atomic type is a primitive simple type or a type derived by restriction from another atomic type.] (Types derived by list or union are not atomic.)

[Definition: There are 24 primitive simple types: the 19 defined in Section 3.2 Primitive datatypes of [Schema Part 2] and xdt:anyAtomicType, xdt:untyped, xdt:untypedAtomic, xdt:dayTimeDuration, and xdt:yearMonthDuration], defined in 2.6 Types.

A type is represented in the data model by an expanded-QName.

[Definition: An expanded-QName is a pair of values consisting of a possibly empty namespace URI and a local name. They belong to the value space of the XML Schema type xs:QName. References to xs:QName in this document always mean the value space, i.e. a namespace URI, local name pair (and not the lexical space referring to constructs of the form “prefix:local-name”).]

[Definition: Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.]

[Definition: Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.]

In all cases where this specification leaves the behavior implementation-defined or implementation-dependent, the implementation has the option of providing mechanisms that allow the user to influence the behavior.

This document normatively defines the XQuery 1.0 and XPath 2.0 Data Model. In this document, examples and material labeled as "Note" are provided for explanatory purposes and are not normative.

2.2 Notation

In addition to prose, this specification defines a set of accessor functions to explain the data model. The accessors are shown with the prefix dm:. This prefix is always shown in italics to emphasize that these functions are abstract; they exist to explain the interface between the data model and specifications that rely on the data model: they are not accessible directly from the host language.

Several prefixes are used throughout this document for notational convenience. The following bindings are assumed.

  1. xs: bound to http://www.w3.org/2001/XMLSchema

  2. xsi: bound to http://www.w3.org/2001/XMLSchema-instance

  3. xdt: bound to http://www.w3.org/2004/07/xpath-datatypes

  4. fn: bound to http://www.w3.org/2004/07/xpath-functions

In practice, any prefix that is bound to the appropriate URI may be used.
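The independence of an expanded-QName from the prefix used to write it can be illustrated with a minimal Python sketch. The `ExpandedQName` class and the `expand` function are hypothetical names introduced only for this illustration; they are not part of the data model or of any host language, and treating an unprefixed name as having an empty namespace URI is an assumption of the sketch.

```python
from typing import NamedTuple

class ExpandedQName(NamedTuple):
    """Illustrative pair of (possibly empty) namespace URI and local name."""
    namespace_uri: str
    local_name: str

# Prefix bindings assumed by this document.
BINDINGS = {
    "xs": "http://www.w3.org/2001/XMLSchema",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "xdt": "http://www.w3.org/2004/07/xpath-datatypes",
    "fn": "http://www.w3.org/2004/07/xpath-functions",
}

def expand(lexical_qname: str, bindings: dict[str, str]) -> ExpandedQName:
    """Resolve a lexical 'prefix:local-name' form against prefix bindings."""
    prefix, sep, local = lexical_qname.partition(":")
    if not sep:
        # Assumption of this sketch: an unprefixed name has no namespace.
        return ExpandedQName("", lexical_qname)
    return ExpandedQName(bindings[prefix], local)

# A different prefix bound to the same URI yields the same expanded-QName.
other_bindings = {"schema": BINDINGS["xs"]}
same = expand("xs:integer", BINDINGS) == expand("schema:integer", other_bindings)
```

Because comparison is on the (namespace URI, local name) pair, `same` is true even though the two lexical forms use different prefixes.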

The signature of accessor functions is shown using the same style as [Functions and Operators], described in Section 1.2 Function Signatures and Descriptions of that document.

This document relies on the [Infoset] and PSVI. Information items and properties are indicated by the styles information item and [infoset property], respectively.

Some aspects of type assignment rely on the ability to access properties of the schema components. Such properties are indicated by the style {component property}. Note that this does not mean a lightweight schema processor cannot be used; it only means that the application must have some mechanism to access the necessary properties.

2.3 Node Identity

Each node has a unique identity. Every node in an instance of the data model is unique: identical to itself, and not identical to any other node. (Atomic values do not have identity; every instance of the value “5” as an integer is identical to every other instance of the value “5” as an integer.)

Note:

The concept of node identity should not be confused with the concept of a unique ID, which is a unique name assigned to an element by the author to represent references using ID/IDREF correlation.
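The difference between node identity and atomic value equality can be sketched in Python, under the assumption that a node is modeled as an object (whose identity is the object's identity) and an atomic value as a plain (type name, value) pair. The `Node` class and `is_same_node` function are hypothetical and exist only for this illustration.

```python
class Node:
    """Hypothetical node: its identity is the object's identity, not its content."""
    def __init__(self, name: str, text: str):
        self.name = name
        self.text = text

def is_same_node(a: Node, b: Node) -> bool:
    """Two references denote the same node only if they are the same object."""
    return a is b

# Two nodes with identical content are still distinct nodes.
n1 = Node("price", "5")
n2 = Node("price", "5")

# Atomic values have no identity: equal type and value means identical.
five_a = ("xs:integer", 5)
five_b = ("xs:integer", 5)
```

Here `is_same_node(n1, n2)` is false although the two nodes have the same name and content, while `five_a == five_b` is true.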

2.4 Document Order

[Definition: A document order is defined among all the nodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order corresponds to the order in which the first character of the XML representation of each node occurs in the XML representation of the document.] [Definition: Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.]

Within a tree, document order satisfies the following constraints:

  1. The root node is the first node.

  2. The relative order of siblings is determined by their order in the XML representation of the tree. A node N1 occurs before a node N2 in document order if and only if the start of N1 occurs before the start of N2 in the XML representation.

  3. Namespace Nodes immediately follow the Element Node with which they are associated. The relative order of Namespace Nodes is stable but implementation-dependent.

  4. Attribute Nodes immediately follow the Namespace Nodes of the element with which they are associated. The relative order of Attribute Nodes is stable but implementation-dependent.

  5. Element Nodes occur before their children; children occur before following-siblings.

The relative order of nodes in distinct trees is stable but implementation-dependent, subject to the following constraint: If any node in tree T1 is before any node in tree T2, then all nodes in tree T1 are before all nodes in tree T2.
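The constraints on document order within a single tree can be sketched as a traversal. This is a minimal Python illustration, not a normative algorithm: the `Node` class and its fields are hypothetical, and the order of Namespace Nodes and of Attribute Nodes among themselves (implementation-dependent in the data model) is simply taken to be list order here.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False keeps comparison identity-based
class Node:
    kind: str
    name: str = ""
    namespaces: list["Node"] = field(default_factory=list)
    attributes: list["Node"] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)

def document_order(node: Node) -> list[Node]:
    """Return the nodes of the tree rooted at `node` in document order."""
    ordered = [node]                  # a node precedes everything below it
    ordered.extend(node.namespaces)   # Namespace Nodes follow their element
    ordered.extend(node.attributes)   # Attribute Nodes follow the Namespace Nodes
    for child in node.children:       # children precede following siblings
        ordered.extend(document_order(child))
    return ordered

# Hypothetical tree: <a xmlns:ns="..." x="..."><b y="..."/><c/></a>
b = Node("element", "b", attributes=[Node("attribute", "y")])
c = Node("element", "c")
a = Node("element", "a",
         namespaces=[Node("namespace", "ns")],
         attributes=[Node("attribute", "x")],
         children=[b, c])
names = [n.name for n in document_order(a)]
```

For this tree, `names` is `["a", "ns", "x", "b", "y", "c"]`. Ordering across distinct trees is not covered by the sketch.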

2.5 Sequences

An important characteristic of the data model is that there is no distinction between an item (a node or an atomic value) and a singleton sequence containing that item. An item is equivalent to a singleton sequence containing that item and vice versa.

A sequence may contain nodes, atomic values, or any mixture of nodes and atomic values. When a node is added to a sequence its identity remains the same. Consequently a node may occur in more than one sequence and a sequence may contain duplicate items.

Sequences never contain other sequences; if sequences are combined, the result is always a “flattened” sequence. In other words, appending “(d e)” to “(a b c)” produces a sequence of length 5: “(a b c d e)”. It does not produce a sequence of length 4: “(a b c (d e))”; such a nested sequence never occurs.
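The flattening behavior can be illustrated with a short Python sketch that models a sequence as a tuple. The `concat` function is a hypothetical helper, not an accessor or function defined by this specification.

```python
def concat(*parts):
    """Combine items and sequences into one flat sequence (modeled as a tuple)."""
    result = []
    for part in parts:
        if isinstance(part, tuple):
            # A sequence contributes its items, never itself.
            result.extend(part)
        else:
            # A single item behaves as a singleton sequence.
            result.append(part)
    return tuple(result)

combined = concat(("a", "b", "c"), ("d", "e"))
singleton = concat("a")
```

Since every sequence built this way is already flat, one level of unpacking is sufficient: `combined` has length 5, and `singleton` is indistinguishable from a sequence constructed from `("a",)`.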

Note:

Sequences replace node-sets from XPath 1.0. In XPath 1.0, node-sets do not contain duplicates. In generalizing node-sets to sequences in XPath 2.0, duplicate removal is provided by functions on node sequences.

2.6 Types

The data model supports strongly typed languages such as [XPath 2.0] and [XQuery] that have a type system based on [Schema Part 1]. The type system is formally defined in [Formal Semantics].

Every item in the data model has both a value and a type. In addition to nodes, the data model can represent atomic values like the number 5 or the string “Hello World.” For each of these atomic values, the data model contains both the value of the item (such as 5 or “Hello World”) and its type name (such as xs:integer or xs:string).

2.6.1 Representation of Types

The data model uses expanded-QNames to represent the names of schema types, which include the built-in types defined by [Schema Part 2], five additional types defined by this specification, and may include other user- or implementation-defined types.

For XML Schema types, the namespace name of the expanded-QName is the [target namespace] property of the type definition, and its local name is the [name] property of the type definition.

The data model relies on the fact that an expanded-QName uniquely identifies every named type. (Although it is possible for different schemas to define different types with the same expanded-QName, at most one of them can be used in any given validation episode.)

For anonymous types, the processor must construct an anonymous type name that is distinct from the name of every named type and the name of every other anonymous type. [Definition: An anonymous type name is an implementation-defined, unique type name provided by the processor for every anonymous type declared in the schemas available in the static context.] Anonymous type names must be globally unique across all anonymous types that are accessible to the processor. In the formalism of this specification, the anonymous type names are assumed to be xs:QNames, but in practice implementations are not required to use xs:QNames to represent the implementation-defined names of anonymous types.

The scope over which the names of anonymous types must be meaningful and distinct depends on the processing context. In XSLT, it is the duration of an entire transformation. In XQuery, it is the duration of the evaluation of a top-level expression, i.e. an expression not contained in any other expression.

The data model associates schema type information with Element Nodes, Attribute Nodes and atomic values. The item is guaranteed to be an instance of that kind of item with the given schema type.

The data model does not represent element or attribute declaration schema components, but it supports various type-related operations. The semantics of other operations, for example, checking if a particular instance of an Element Node has a given schema type, are defined in [Formal Semantics].

2.6.2 Predefined Types

In addition to the 19 types defined in Section 3.2 Primitive datatypes of [Schema Part 2], the data model defines five additional types: xdt:anyAtomicType, xdt:untyped, xdt:untypedAtomic, xdt:dayTimeDuration, and xdt:yearMonthDuration:

xdt:anyAtomicType

The abstract datatype xdt:anyAtomicType is a child of xs:anySimpleType and is the base type for all the primitive atomic types described in [Schema Part 2]. This datatype cannot be used in [Schema Part 1] type declarations, nor can it be used as a base for user-defined atomic types. It can be used, as discussed in Section 3.12 Expressions on SequenceTypes of [XQuery], to define a required type (for example in a function signature) to indicate that any of the primitive atomic types or xdt:untypedAtomic is acceptable.

xdt:untyped

The datatype xdt:untyped is a child of xs:anyType and serves as a special type annotation to indicate types that have not been validated by an XML Schema or a DTD. This type cannot be used in [Schema Part 1] type declarations, nor can it be used as a base for user-defined types. It can be used, as discussed in Section 3.12 Expressions on SequenceTypes of [XQuery], to define a required type (for example in a function signature) to indicate that only an untyped value is acceptable.

xdt:untypedAtomic

The datatype xdt:untypedAtomic is a child of xdt:anyAtomicType and serves as a special type annotation to indicate atomic values that have not been validated by an XML Schema or a DTD or have received an instance type annotation of xs:anySimpleType in the PSVI. This datatype cannot be used in [Schema Part 1] type declarations, nor can it be used as a base for user-defined atomic types. It can be used, as discussed in Section 3.12 Expressions on SequenceTypes of [XQuery], to define a required type (for example in a function signature) to indicate that only an untyped atomic value is acceptable.

xdt:dayTimeDuration

The type xdt:dayTimeDuration is derived from xs:duration by restricting its lexical representation to contain only the days, hours, minutes and seconds components. The value space of xdt:dayTimeDuration is the set of fractional second values. The components of xdt:dayTimeDuration correspond to the day, hour, minute and second components defined in Section 5.5.3.2 of [ISO 8601], respectively. xdt:dayTimeDuration is derived from xs:duration as follows:

<xs:simpleType name='dayTimeDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P([0-9]+D(T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S
                       |\.[0-9]+S)?|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+
                       (\.[0-9]*)?S|\.[0-9]+S)?|(\.[0-9]*)?S)|\.[0-9]+S))?
                       |T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
                       |(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
                       |(\.[0-9]*)?S)|\.[0-9]+S))"/>
  </xs:restriction>
</xs:simpleType>

To make the long pattern easier to read, it has been formatted on six lines using additional new line and space characters in the pattern string. These additional characters should not be interpreted as part of the pattern.
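As an illustration of the remark about formatting, the following Python sketch removes the additional new line and space characters from the pattern before using it. Two assumptions are made: that Python's `re` syntax interprets this particular pattern the same way as the XML Schema regular expression language, and that `fullmatch` is an adequate stand-in for the implicit anchoring of schema patterns. The name `is_day_time_duration` is hypothetical.

```python
import re

# Pattern copied from the schema fragment above, still formatted on six lines.
FORMATTED_PATTERN = r"""
[\-]?P([0-9]+D(T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S
|\.[0-9]+S)?|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+
(\.[0-9]*)?S|\.[0-9]+S)?|(\.[0-9]*)?S)|\.[0-9]+S))?
|T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
|(\.[0-9]*)?S)|\.[0-9]+S))
"""

# Drop every whitespace character; none of them belongs to the pattern.
DAY_TIME_DURATION = re.compile("".join(FORMATTED_PATTERN.split()))

def is_day_time_duration(lexical: str) -> bool:
    """Check a lexical form against the pattern facet only."""
    return DAY_TIME_DURATION.fullmatch(lexical) is not None
```

With this sketch, forms such as `P1D`, `PT1H`, `PT1.5S` and `-PT30M` are accepted, while `P1Y` is rejected because it contains a year component. The check covers the pattern facet only, not the other constraints inherited from xs:duration.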

xdt:yearMonthDuration

The type xdt:yearMonthDuration is derived from xs:duration by restricting its lexical representation to contain only the year and month components. The value space of xdt:yearMonthDuration is the set of xs:integer month values. The year and month components of xdt:yearMonthDuration correspond to the Gregorian year and month components defined in Section 5.5.3.2 of [ISO 8601], respectively.

The type xdt:yearMonthDuration is derived from xs:duration as follows:

<xs:simpleType name='yearMonthDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P[0-9]+(Y([0-9]+M)?|M)"/>
  </xs:restriction>
</xs:simpleType>
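The relationship between the lexical representation and the value space (an integer number of months) can be sketched in Python. This is an illustration under stated assumptions: `fullmatch` stands in for the implicit anchoring of schema patterns, and `year_month_duration_to_months` and `COMPONENTS` are hypothetical names that do not appear in any specification.

```python
import re

# Pattern facet copied from the schema fragment above.
YEAR_MONTH_DURATION = re.compile(r"[\-]?P[0-9]+(Y([0-9]+M)?|M)")

# Helper pattern (an assumption of this sketch) that picks out the components
# once the lexical form is known to satisfy the facet.
COMPONENTS = re.compile(r"(-?)P(?:([0-9]+)Y)?(?:([0-9]+)M)?")

def year_month_duration_to_months(lexical: str) -> int:
    """Map a lexical form to its value: a signed integer number of months."""
    if YEAR_MONTH_DURATION.fullmatch(lexical) is None:
        raise ValueError(f"not a valid xdt:yearMonthDuration: {lexical!r}")
    sign, years, months = COMPONENTS.fullmatch(lexical).groups()
    total = int(years or 0) * 12 + int(months or 0)
    return -total if sign else total
```

Under this mapping `P1Y2M` and `P14M` denote the same value, 14 months, and `-P3M` denotes -3 months.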

2.6.3 Type Hierarchy

The diagram below shows how the nodes, primitive simple types, and user defined types fit together into a hierarchy.

The xs:IDREFS, xs:NMTOKENS, xs:ENTITIES and user-defined list and union types are special types in that these types are lists or unions rather than true subtypes.

Type hierarchy graphic

2.6.4 Atomic Values

An atomic value can be constructed from a lexical representation. Given a string and an atomic type, the atomic value is constructed in such a way as to be consistent with validation. If the string does not represent a valid value of the type, an error is raised. When xdt:untypedAtomic is specified as the type, no validation takes place. The details of the construction are described in Section 5 Constructor Functions and the related Section 17 Casting of [Functions and Operators].

2.6.5 String Values

A string value can be constructed from an atomic value. Such a value is constructed by converting the atomic value to its string representation as described in Section 17 Casting of [Functions and Operators]. Using the canonical lexical representation for atomic values may not always be compatible with XPath 1.0. These and other backwards incompatibilities are described in Section H Backwards Compatibility with XPath 1.0 (Non-Normative) of [XPath 2.0].

3 Data Model Construction

This section describes the constraints on instances of the data model.

The data model supports well-formed XML documents conforming to [Namespaces in XML] or [Namespaces in XML 1.1]. Documents that are not well-formed are, by definition, not XML. XML documents that do not conform to [Namespaces in XML] or [Namespaces in XML 1.1] are not supported (nor are they supported by [Infoset]).

In other words, the data model supports the following classes of XML documents:

This document describes how to construct an instance of the data model from an [Infoset] or a Post Schema Validation Infoset (PSVI), the augmented infoset produced by an XML Schema validation episode.

An instance of the data model can also be constructed directly through application APIs, or from non-XML sources such as relational tables in a database.

The data model supports some kinds of values that are not supported by [Infoset]. Examples of these are document fragments and sequences of Document Nodes. The data model also supports values that are not nodes. Examples of these are sequences of atomic values, or sequences mixing nodes and atomic values. These are necessary to be able to represent the results of intermediate expressions in the data model during expression processing.

3.1 Direct Construction

Although this document describes construction of an instance of the data model in terms of infoset properties, an infoset is not an absolutely necessary precondition for building an instance of the data model.

There are no constraints on how an instance of the data model may be constructed directly, save that the resulting instance must satisfy all of the constraints described in this document.

3.2 Construction from an Infoset

An instance of the data model can be constructed from an [Infoset] that satisfies the following general constraints:

  • All general and external parsed entities must be fully expanded. The Infoset must not contain any unexpanded entity reference information items.

  • The infoset must provide all of the properties identified as "required" in this document. The properties identified as "optional" may be used, if they are present. All other properties are ignored.

An instance of the data model constructed from an information set must be consistent with the description provided for each node kind.

3.3 Construction from a PSVI

An instance of the data model can be constructed from a PSVI, whose element and attribute information items have been strictly assessed, laxly assessed, or have not been assessed. Constructing an instance of the data model from a PSVI must be consistent with the description provided in this section and with the description provided for each node kind.

Data model construction requires that the PSVI provide unique names for all anonymous schema types.

Note:

[Schema Part 1] does not require all schema processors to provide unique names for anonymous schema types. In order to build an instance of the data model from a PSVI produced by a processor that does not provide the names, some post-processing will be required in order to assure that they are all uniquely identified before construction begins.

[Definition: An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.]

The data model supports incompletely validated documents. Elements and attributes that are not valid are treated as having unknown schema types.

The most significant difference between Infoset construction and PSVI construction occurs in the area of schema type assignment. Other differences can also arise from schema processing: default attribute and element values may be provided, white space normalization of element content may occur, and the user-supplied lexical form of elements and attributes with atomic schema types may be lost.

3.3.1 Mapping PSVI Additions to Type Names

A PSVI element or attribute information item may have a [validity] property. The [validity] property may be "valid", "invalid", or "notKnown" and reflects the outcome of schema-validity assessment. In the data model, precise schema type information is exposed for Element and Attribute Nodes that are "valid". Nodes that are not "valid" are treated as if they were simply well-formed XML and only very general schema type information is associated with them.

3.3.1.1 Element and Attribute Node Type Names

The precise definition of the schema type of an element or attribute information item depends on the properties of the PSVI. In the PSVI, [Schema Part 1] only guarantees the existence of either the [type definition] property, or the [type definition namespace], [type definition name] and [type definition anonymous] properties. If the type definition refers to a union type, further properties are defined that refer to the type definition which actually validated the item's normalized value. These properties are not used to determine the schema type of the node.

If the [validity] and [validation attempted] properties exist and have the values "valid" and "full", respectively, the schema type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:

  • If the [type definition] property exists:

    • If the {name} property is not absent, the {target namespace} and {name} properties of the [type definition] property;

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

  • If [type definition anonymous] exists:

    • If it is false: the [type definition namespace] and the [type definition name] properties;

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

If the [validity] property does not exist or is not "valid", or the [validation attempted] property does not exist or is not "full", the schema type of an element is xdt:untyped and the type of an attribute is xdt:untypedAtomic.
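The selection rules of this section can be sketched as a single function. The sketch below is illustrative only: it assumes a PSVI information item is represented as a Python dict keyed by property name (a missing key models a property that does not exist), that a schema component is a dict keyed by component property name, and that the processor supplies the anonymous type name. The function name `node_type_name` and the final error case are assumptions of the sketch.

```python
XDT = "http://www.w3.org/2004/07/xpath-datatypes"

def node_type_name(item: dict, kind: str, anonymous_name=None):
    """Return (namespace, local name) for an element or attribute item.

    `kind` is "element" or "attribute"; `anonymous_name` is the pair the
    processor has assigned if the type turns out to be anonymous.
    """
    fully_valid = (item.get("validity") == "valid"
                   and item.get("validation attempted") == "full")
    if not fully_valid:
        local = "untyped" if kind == "element" else "untypedAtomic"
        return (XDT, local)

    if "type definition" in item:
        type_def = item["type definition"]
        if type_def.get("name") is not None:
            return (type_def.get("target namespace"), type_def["name"])
        return anonymous_name

    if "type definition anonymous" in item:
        if item["type definition anonymous"] is False:
            return (item["type definition namespace"],
                    item["type definition name"])
        return anonymous_name

    # Not covered by the text: neither group of properties is present.
    raise ValueError("PSVI provides no type definition properties")
```

The order of the tests mirrors the "first applicable item" wording: the [type definition] property is consulted before the [type definition anonymous] group.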

3.3.1.2 Atomic Value Type Names

The typed value of Attribute Nodes and some Element Nodes is an atomic value. (Elements that have a complex type with element-only content do not contain atomic values; such nodes have no typed value and this section does not apply to them.)

The schema type of each item in the typed value of an Element or Attribute Node depends on the schema type of the node and may be further refined. The type must be further refined when the Element or Attribute Node has a list or union type.

If the schema type definition of a node refers to a union type, the PSVI will contain properties that refer to the type definition which actually validated the item's normalized value. These properties are either the [member type definition], or the [member type definition namespace], [member type definition name] and [member type definition anonymous] properties. If these are available, the schema type of the typed value of an element or attribute will be the member type that actually validated the schema normalized value.

The schema type of the typed value is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:

  • If the schema type of the node (as defined in 3.3.1.1 Element and Attribute Node Type Names) is xdt:untyped or xdt:untypedAtomic, the namespace and local name of xdt:untypedAtomic.

  • If [member type definition] exists:

    • If the {name} property is not absent, the {target namespace} and {name} properties of the [member type definition] property;

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

  • If [member type definition anonymous] exists:

    • If it is false: the [member type definition namespace] and the [member type definition name] properties;

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

  • Otherwise, the namespace and local name of the node’s schema type.

The {variety} of the resulting type will be either atomic or list. If the {variety} is atomic, that is the type of the atomic value.

If the {variety} is list, each member of the list must be examined to determine its atomic value. The [schema normalized value] of the node is a space-separated list of lexical forms. These lexical forms can be used to create a list of strings, where each string represents one member of the list of atomic values.

For each string in the list, the nominal type of each member of the list is identified by the {item type definition} of the type identified above.

  1. If the {variety} of the nominal type is atomic and the string is castable to that type, then the atomic value is the result of casting the string to that type.

  2. If the {variety} of the nominal type is union, then each type listed in the {member type definitions} must be considered in turn as a nominal type.

  3. If the {variety} of the nominal type is list, then the {item type definition} must be considered as the nominal type.

Note that this process is recursive: the member type of a union may be a list which may have an item type that is a union. The process is guaranteed to terminate because (1) it terminates immediately if the initial {variety} is atomic and (2) the initial {variety} can only be non-atomic if validation succeeded.
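
The recursion described above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the SimpleType class, its field names, and the example types are hypothetical, casting is approximated by Python callables that raise ValueError (Python's int accepts some strings that xs:integer does not), and returning None when no type accepts a string is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SimpleType:
    """Hypothetical stand-in for a schema simple type definition."""
    name: str
    variety: str  # "atomic", "union", or "list"
    cast: Optional[Callable[[str], object]] = None  # used when variety is atomic
    item_type: Optional["SimpleType"] = None  # used when variety is list
    member_types: List["SimpleType"] = field(default_factory=list)  # union


def atomic_value(lexical, nominal):
    """Return (type name, value) for one lexical form, or None if nothing accepts it."""
    if nominal.variety == "atomic":
        try:
            return (nominal.name, nominal.cast(lexical))
        except ValueError:
            return None
    if nominal.variety == "union":
        # Consider each member type in turn as the nominal type.
        for member in nominal.member_types:
            result = atomic_value(lexical, member)
            if result is not None:
                return result
        return None
    # variety is list: the item type becomes the nominal type.
    return atomic_value(lexical, nominal.item_type)


def typed_value_of_list(normalized_value, list_type):
    """Split the space-separated normalized value and resolve each member."""
    return [atomic_value(s, list_type.item_type) for s in normalized_value.split()]


# Hypothetical example: a list whose item type is a union of integer and string.
xs_integer = SimpleType("xs:integer", "atomic", cast=int)
xs_string = SimpleType("xs:string", "atomic", cast=str)
int_or_string = SimpleType("intOrString", "union", member_types=[xs_integer, xs_string])
int_or_string_list = SimpleType("intOrStringList", "list", item_type=int_or_string)

result = typed_value_of_list("1 two 3", int_or_string_list)
```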

3.3.2 Mapping xsi:nil on Element Nodes

[Schema Part 1] introduced a mechanism for signaling that an element should be accepted as valid when it has no content despite a content type which does not require or even necessarily allow empty content. That mechanism is the xsi:nil attribute.

The data model exposes this special semantic in the nilled property. (It also exposes the attribute, irrespective of whether or not schema processing has been performed.)

If the [validity] property exists on an Element Node and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".
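
As a minimal sketch of this rule, assume the PSVI properties of an element are available as a Python dictionary in which a missing key means the property does not exist. The dictionary keys and the function name are hypothetical.

```python
def nilled(psvi):
    """Compute the nilled property from a dict of PSVI properties.

    A missing key stands for a property that does not exist.
    """
    return psvi.get("validity") == "valid" and psvi.get("nil") is True
```

Under this sketch an element that was not assessed, or that failed assessment, is never nilled, even if it carries xsi:nil="true".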

3.3.3 Dates and Times

The date and time types require special attention. The following sections apply to xs:dateTime, xs:date, and xs:time types and types derived from them.

3.3.3.1 Storing xs:dateTime, xs:date, and xs:time Values in the Data Model

[Schema Part 2] permits xs:dateTime, xs:date, and xs:time values both with and without timezones and therefore only specifies a partial ordering between date and time values. In the data model, it is necessary to preserve timezone information.

In order to achieve this goal, xs:dateTime, xs:date, and xs:time values must be stored with care. If the lexical representation of the value includes a timezone, it is converted to UTC as defined by [Schema Part 2] and the timezone in the lexical representation is converted to an xdt:dayTimeDuration value (as an offset from UTC). Implementations must keep track of both these values for each xs:dateTime, xs:date, and xs:time stored.

Lexical representations that do not have a timezone are assumed to be in UTC for the purposes of normalization only. An empty sequence is used for their timezone.

Thus, for the purpose of validation, "2003-01-02T11:30:00-05:00" is converted to "2003-01-02T16:30:00Z", but in the data model it must be stored as "(2003-01-02T16:30:00Z, -PT5H0M)". The value "2003-01-16T16:30:00" is stored as "(2003-01-16T16:30:00Z, ())" because it has no timezone.
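
The storage rule can be illustrated with a short sketch for xs:dateTime values. It assumes Python's datetime module as a stand-in for the schema types, represents the stored pair as a tuple, and uses None in place of the empty sequence; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def store_datetime(lexical):
    """Return (value normalized to UTC, timezone offset or None).

    None stands in for the empty sequence used when no timezone is given.
    """
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        # No timezone: treated as UTC for normalization only.
        return (parsed, None)
    offset = parsed.utcoffset()
    utc_value = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return (utc_value, offset)


with_zone = store_datetime("2003-01-02T11:30:00-05:00")
without_zone = store_datetime("2003-01-16T16:30:00")
```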

3.3.3.2 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values

For xs:dateTime, xs:date and xs:time, the typed value is the atomic value that is determined from its stored form as follows:

  • If the timezone component is not the empty sequence (the timezone was specified), then the value contains the time component, normalized to the timezone specified by the timezone component, as well as the timezone component. The stored values "(2003-01-02T16:30:00Z, -PT5H0M)" produce the value "2003-01-02T11:30:00-05:00".

  • If the timezone component is the empty sequence (the timezone was not specified), then the value contains the time component without any indication of timezone. The stored values "(2003-01-02T16:30:00Z, ())" produce the value "2003-01-02T16:30:00".
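
The two retrieval cases above can be sketched as follows for xs:dateTime. The sketch assumes the stored form is a tuple of a UTC-normalized Python datetime and either a timedelta offset or None (standing in for the empty sequence); the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retrieve_datetime(stored):
    """Rebuild the lexical form of the typed value from its stored pair."""
    utc_value, offset = stored
    if offset is None:
        # Timezone was not specified: report the time component as is.
        return utc_value.isoformat()
    # Timezone was specified: shift the UTC value back to that timezone.
    local = utc_value.replace(tzinfo=timezone.utc).astimezone(timezone(offset))
    return local.isoformat()


with_zone = retrieve_datetime((datetime(2003, 1, 2, 16, 30), timedelta(hours=-5)))
without_zone = retrieve_datetime((datetime(2003, 1, 2, 16, 30), None))
```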

3.4 String and Typed Values

The dm:string-value and dm:typed-value of Document Nodes, Element Nodes, and Attribute Nodes are defined in terms of string-value and typed-value properties. This specification describes how values are computed for those properties when constructing an instance of the data model from an Infoset or a PSVI.

This is a formalism used to simplify the explanations in this specification; it is not a constraint on implementations. In practice, implementations are free to adopt any strategy they wish, provided that the results are indistinguishable in every significant respect from the results that would be obtained by following precisely the algorithms described in this specification.

In practice, some implementations, particularly those constructing instances of the data model from sources other than Infosets or PSVIs, may have access to only the string or typed values and not both. In order to support these implementations, this specification explicitly identifies some variations in the string value of a node as insignificant. Implementations that do not have access to, or cannot retain, the original string value of a node may reconstruct it from the typed value. In this case, the string value may differ from the string value that would be returned by an implementation that preserved the original lexical form.

Consider the following node:

<offset xsi:type="xs:integer">0030</offset>

Assuming that the node is valid, it has a typed value of “30” as an xs:integer. Some implementations will return “0030” as the string value and some will return “30”. In this regard, any string value that is a lexical representation of the typed value is acceptable.

Applications that care about the original lexical forms must choose an appropriate implementation.

3.4.1 Consistent with XML Schema Validation

Validity assessment only assures that the lexical forms present in the Infoset or PSVI are within the lexical space of the required schema type. So while such assessment can determine that “0030” is a valid integer, it does not produce a typed integer value.

The data model contains not only the string value of each node, it also contains the typed value. In other words, the process of constructing an instance of the data model does, in principle, require that an implementation produce the typed integer value “30” from the lexical value “0030”.

The phrase “derived from the string-value of the node and its type in a way that is consistent with XML Schema validation” is used to describe this process. It is impossible to define precisely what this means as implementations may choose different internal representations for typed values. However, it is a constraint on the data model that string values and typed values must be consistent. Although variations in string value are allowed, the string value must always be a valid lexical representation of the typed value.
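
For the xs:integer example used earlier, the consistency constraint can be sketched as a simple check. This is an illustration only: the regular expression approximates the xs:integer lexical space, whitespace handling is ignored, and the function name is hypothetical.

```python
import re

# Approximation of the xs:integer lexical space: optional sign, then digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_lexical_form_of(string_value, typed_value):
    """True if string_value is a lexical representation of the integer typed_value."""
    if INTEGER_LEXICAL.fullmatch(string_value) is None:
        return False
    return int(string_value) == typed_value
```

Both "0030" and "30" pass this check for the typed value 30, which is why either is an acceptable string value.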

3.4.1.1 Pattern Facets

Creating a subtype by restriction generally reduces the value space of the original schema type. For example, expressing a hat size as a restriction of decimal with a minimum value of 6.5 and maximum value of 8.0 creates a schema type whose legal values are only those in the range 6.5 to 8.0.

The pattern facet is different because it restricts the lexical space of the schema type, not its value space. Expressing a three-digit number as a restriction of integer with the pattern facet “[0-9]{3}” creates a schema type whose legal values are only those with a lexical form consisting of three digits.

The pattern facet is not reversible in practice; given an arbitrary pattern, there’s no practical way to determine how the lexical form of a typed value must be constructed so that the result will satisfy that pattern.

As a consequence, pattern facets are not respected during serialization and values in the data model that were originally valid with respect to a schema that contains pattern-based restrictions may not be valid after serialization.
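
A small sketch shows how this can happen with the three-digit pattern from the example above. It assumes a serializer that writes integers in their shortest decimal form, which is one plausible choice rather than a requirement of this specification.

```python
import re

# The pattern facet from the three-digit number example.
THREE_DIGITS = re.compile(r"[0-9]{3}")

original = "007"
originally_valid = THREE_DIGITS.fullmatch(original) is not None

# Constructing the data model yields the typed integer value.
typed = int(original)

# A serializer that writes the shortest decimal form loses the leading zeros.
reserialized = str(typed)
still_valid = THREE_DIGITS.fullmatch(reserialized) is not None
```

Here originally_valid is true but still_valid is false: the typed value 7 is unchanged, yet its serialized form no longer satisfies the pattern.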

3.4.2 Undefined Values

Some typed values in the data model are undefined. Attempting to access an undefined property always raises an error.

4 Data Model Serialization

Serialization of an instance of the data model is governed by [Serialization].

5 Infoset Mapping

This specification describes how to map each kind of node to the corresponding information item. This mapping produces an Infoset; it does not and cannot produce a PSVI. Validation must be used to obtain a PSVI for a (portion of a) data model instance.

6 Accessors

A set of accessors is defined on all seven kinds of nodes; see 7 Nodes. Some accessors return a constant empty sequence on certain node kinds. The dm:unparsed-entity-system-id, dm:unparsed-entity-public-id, and dm:document-uri accessors, which are only available on Document Nodes, and the dm:in-scope-namespaces accessor, which is only available on Element Nodes, are not included in this summary.

In order for processors to be able to operate on instances of the data model, the model must expose the properties of the items it contains. The data model does this by defining a family of accessor functions. These are not functions in the literal sense: they are not available for users or applications to call directly. Rather, they are descriptions of the information that an implementation of the data model must expose to applications. Functions and operators available to end-users are described in [Functions and Operators].

6.1 base-uri Accessor

dm:base-uri($n as node()) as xs:anyURI?

The dm:base-uri accessor returns the base URI of a node as a sequence containing zero or one URI reference. For more information about base URIs, see [XML Base].

It is defined on all seven node kinds.

6.2 node-name Accessor

dm:node-name($n as node()) as xs:QName?

The dm:node-name accessor returns the name of the node as a sequence of zero or one xs:QNames.

It is defined on all seven node kinds.

6.3 parent Accessor

dm:parent($n as node()) as node()?

The dm:parent accessor returns the parent of a node as a sequence containing zero or one nodes.

It is defined on all seven node kinds.

6.4 string-value Accessor

dm:string-value($n as node()) as xs:string

The dm:string-value accessor returns the string value of a node.

It is defined on all seven node kinds.

6.5 typed-value Accessor

dm:typed-value($n as node()) as xdt:anyAtomicType*

The dm:typed-value accessor returns the typed-value of the node as a sequence of zero or more atomic values.

It is defined on all seven node kinds.

6.6 type-name Accessor

dm:type-name($n as node()) as xs:QName?

The dm:type-name accessor returns the name of the schema type of a node as a sequence of zero or one xs:QNames.

It is defined on all seven node kinds.

6.7 children Accessor

dm:children($n as node()) as node()*

The dm:children accessor returns the children of a node as a sequence containing zero or more nodes.

It is defined on all seven node kinds.

6.8 attributes Accessor

dm:attributes($n as node()) as attribute()*

The dm:attributes accessor returns the attributes of a node as a sequence containing zero or more Attribute Nodes. The order of Attribute Nodes is stable but implementation dependent.

It is defined on all seven node kinds.

6.9 namespaces Accessor

dm:namespaces($n as node()) as node()*

The dm:namespaces accessor returns the namespaces associated with a node as a sequence containing zero or more Namespace Nodes. The order of Namespace Nodes is stable but implementation dependent.

It is defined on all seven node kinds.

6.10 nilled Accessor

dm:nilled($n as node()) as xs:boolean?

The dm:nilled accessor returns true if the node is "nilled", see 3.3.2 Mapping xsi:nil on Element Nodes.

It is defined on all seven node kinds.

7 Nodes

[Definition: The seven distinct kinds of Node: document, element, attribute, text, namespace, processing instruction, and comment,] are defined in the following subsections.

All nodes must satisfy the following general constraints:

  1. Every node must have a unique identity, distinct from all other nodes.

  2. The children property of a node must not contain two consecutive Text Nodes.

  3. The children property of a node must not contain any empty Text Nodes.

  4. The sequence of nodes in the children property of a node is ordered and must be in document order.

  5. The children and attributes properties of a node must not contain two nodes with the same identity.
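
Constraints 2 and 3 are often established when a children sequence is built. The following sketch shows one way an implementation might do so, assuming children are represented as hypothetical (kind, value) tuples; merging adjacent Text Nodes and dropping empty ones is a design choice of the sketch, not a procedure mandated here.

```python
def normalize_children(children):
    """Merge consecutive text nodes and drop empty ones.

    Each child is a (kind, value) tuple, e.g. ("text", "abc") or ("element", "p").
    """
    result = []
    for kind, value in children:
        if kind == "text":
            if value == "":
                continue  # constraint 3: no empty Text Nodes
            if result and result[-1][0] == "text":
                # constraint 2: no two consecutive Text Nodes
                result[-1] = ("text", result[-1][1] + value)
                continue
        result.append((kind, value))
    return result


normalized = normalize_children(
    [("text", "a"), ("text", ""), ("text", "b"), ("element", "x"), ("text", "")]
)
```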

7.1 Document Nodes

7.1.1 Overview

Document Nodes encapsulate XML documents. Documents have the following properties:

  • base-uri, possibly empty.

  • children, possibly empty.

  • unparsed-entities, possibly empty.

  • document-uri, possibly empty.

  • string-value

  • typed-value

Document Nodes must satisfy the following constraints.

  1. The children must consist exclusively of Element, Processing Instruction, Comment, and Text Nodes if it is not empty. Attribute, Namespace, and Document Nodes can never appear as children.

  2. If a node N is a child of a Document Node D, then the parent of N must be D.

  3. If a node N has a parent Document Node D, then N must be among the children of D.

In the [Infoset], a document information item must have at least one child, its children must consist exclusively of element information items, processing instruction information items and comment information items, and exactly one of the children must be an element information item. This data model is more permissive: a Document Node may be empty, it may have more than one Element Node as a child, and it also permits Text Nodes as children.

Implementations that support DTD processing and access to the unparsed entity accessors use the unparsed-entities property to associate information about an unordered collection of unparsed entities with a Document Node. This property is accessed indirectly through the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id functions.

7.1.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty, otherwise returns ().

dm:node-name

Returns ().

dm:parent

Returns ().

dm:string-value

Returns the value of the string-value property.

dm:typed-value

Returns the value of the typed-value property.

dm:type-name

Returns ().

dm:children

Returns the value of the children property.

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

Three additional accessors are defined on Document Nodes:

dm:document-uri($node as document-node()) as xs:string?

The dm:document-uri accessor returns the absolute URI of the resource from which the Document Node was constructed, if the absolute URI is available. If there is no URI available, or if it cannot be made absolute when the Document Node is constructed, the empty sequence is returned.

For example, if a collection of documents is returned by the fn:collection function, the dm:document-uri may serve to distinguish between them even if each has the same dm:base-uri.

dm:unparsed-entity-system-id($node as document-node(), $entityname as xs:string) as xs:string?

The dm:unparsed-entity-system-id accessor returns the system identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, the empty sequence is returned.

dm:unparsed-entity-public-id($node as document-node(), $entityname as xs:string) as xs:string?

The dm:unparsed-entity-public-id accessor returns the public identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, or if the entity has no public identifier, the empty sequence is returned.

7.1.3 Construction from an Infoset

The document information item is required. A Document Node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, and comment found in the [children] property, a corresponding Element, Processing Instruction, or Comment Node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the document type declaration information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation defined.

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the document has no such descendants, "".

typed-value

The dm:string-value of the node as an xdt:untypedAtomic value.

document-uri

The document-uri property holds the absolute URI for the resource from which the document node was constructed, if one is available and can be made absolute. For example, if a collection of documents is returned by the fn:collection function, the document-uri property may serve to distinguish between them even though each has the same base-uri property.

If the document-uri is not (), then the following constraint must hold: evaluating fn:doc() with the document-uri as its argument must return the Document Node that provided the value of the document-uri property.

In other words, for any Document Node $arg, either fn:document-uri($arg) must return the empty sequence or fn:doc(fn:document-uri($arg)) must return $arg.

7.1.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

7.1.5 Infoset Mapping

A Document Node maps to a document information item. The mapping fails and produces no value if the Document Node contains Text Node children that do not consist entirely of white space or if the Document Node contains more than one Element Node child.

The following properties are specified by this mapping:

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[document element]

The element information item that is among the [children].

[unparsed entities]

An unordered set of unparsed entity information items constructed from the unparsed-entities.

Each unparsed entity maps to an unparsed entity information item. The following properties are specified by this mapping:

[name]

The name of the entity.

[system identifier]

The system identifier of the entity.

[public identifier]

The public identifier of the entity.

[declaration base URI]

The base URI of the entity in which the declaration occurred.

The following properties have no value: [notation name], [notation].

The following properties have no value: [notations] [character encoding scheme] [version] [all declarations processed].

7.2 Element Nodes

7.2.1 Overview

Element Nodes encapsulate XML elements. Elements have the following properties:

  • base-uri, possibly empty.

  • node-name

  • parent, possibly empty

  • type-name

  • children, possibly empty

  • attributes, possibly empty

  • namespaces, possibly empty

  • nilled

  • string-value

  • typed-value

  • in-scope-namespaces

Element Nodes must satisfy the following constraints.

  1. The children must consist exclusively of Element, Processing Instruction, Comment, and Text Nodes if it is not empty. Attribute, Namespace, and Document Nodes can never appear as children.

  2. The attributes of an element must have distinct xs:QNames.

  3. If a node N is a child of an element E, then the parent of N must be E.

  4. Exclusive of Attribute and Namespace Nodes, if a node N has a parent element E, then N must be among the children of E. (Attribute and Namespace Nodes have a parent, but they do not appear among the children of their parent.)

    The data model permits Element Nodes without parents (to represent partial results during expression processing, for example). Such Element Nodes must not appear among the children of any other node.

  5. If an Attribute Node A has a parent element E, then A must be among the attributes of E.

    The data model permits Attribute Nodes without parents. Such Attribute Nodes must not appear among the attributes of any Element Node.

  6. If a Namespace Node N has a parent element E, then N must be among the namespaces of E.

    The data model permits Namespace Nodes without parents. Such Namespace Nodes must not appear among the namespaces of any Element Node.

  7. If the dm:type-name of an Element Node is xdt:untyped, then the dm:type-name of all its descendant elements must also be xdt:untyped and the dm:type-name of all its Attribute Nodes must be xdt:untypedAtomic.

  8. If the dm:type-name of an Element Node is xdt:untyped or xs:anyType, then the nilled property must be false.

  9. If the nilled property is true, then the children property must not contain Element Nodes or Text Nodes.

  10. For every expanded QName that appears in the dm:node-name of the element, the dm:node-name of any Attribute Node among the attributes of the element, or in any value of type xs:QName or xs:NOTATION (or any type derived from those types) that appears in the typed-value of the element or the typed-value of any of its attributes, if the expanded QName has a non-empty URI, then there must be a prefix binding for this URI among the in-scope-namespaces of this Element Node.

    If any of the expanded QNames has an empty URI, then there must not be any binding among the in-scope-namespaces of this Element Node which binds the empty prefix to a URI.

  11. The in-scope-namespaces of every element must include a binding for the prefix xml to the URI http://www.w3.org/XML/1998/namespace and there must be no other prefix bound to that URI.

7.2.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-name

Returns the value of the node-name property.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the string-value property.

dm:typed-value

Returns the value of the typed-value property.

dm:type-name

Returns the value of the type-name property.

dm:children

Returns the value of the children property.

dm:attributes

Returns the value of the attributes property. The order of Attribute Nodes is stable but implementation dependent.

dm:namespaces

Returns the value of the namespaces property. The order of Namespace Nodes is stable but implementation dependent.

dm:nilled

Returns the value of the nilled property.
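
The fallback to the parent described for dm:base-uri above can be sketched as follows. The Element class, its field names, and the example URI are hypothetical, and None stands in for the empty sequence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical minimal Element Node with only the properties needed here."""
    base_uri: Optional[str] = None
    parent: Optional["Element"] = None


def dm_base_uri(node):
    """Return the node's own base-uri if present, else the parent's, else None."""
    if node.base_uri:
        return node.base_uri
    if node.parent is not None:
        return dm_base_uri(node.parent)
    return None


root = Element(base_uri="http://example.org/doc.xml")
child = Element(parent=root)
orphan = Element()
```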

One additional accessor is defined on Element Nodes:

dm:in-scope-namespaces($node as element()) as xs:string*

The dm:in-scope-namespaces accessor returns a set of prefix/URI pairs. In the formalism of this specification, these pairs are represented as a list of strings where each odd-numbered list item is the prefix and the following even-numbered item is the URI. In practice, implementations may choose a more efficient return type.

The prefix for the default namespace is "".
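
As an illustration of the list-of-strings formalism, the sketch below converts such a flat list into a prefix-to-URI mapping. The function name and the example default namespace URI are hypothetical; a dictionary is just one of the "more efficient" representations an implementation might prefer.

```python
def pairs_to_bindings(flat):
    """Convert [prefix1, uri1, prefix2, uri2, ...] into a {prefix: uri} dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of strings (prefix/URI pairs)")
    # Items 1, 3, 5, ... (1-based) are prefixes; items 2, 4, 6, ... are URIs.
    return dict(zip(flat[0::2], flat[1::2]))


bindings = pairs_to_bindings(
    ["xml", "http://www.w3.org/XML/1998/namespace", "", "http://example.org/ns"]
)
```

The empty-string key holds the default namespace, matching the convention stated above.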

7.2.3 Construction from an Infoset

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

parent

The node that corresponds to the value of the [parent] property.

type-name

All Element Nodes constructed from an infoset have the type xdt:untyped.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by both DTD and XML Schema processing are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property.

Implementations may ignore namespace information items for namespaces which do not appear in the expanded QName of the element name or the names of any of its attribute information items. This can arise when QNames are used in content.

nilled

All Element Nodes constructed from an infoset have a nilled property of "false".

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the element has no such descendants, "".

typed-value

Returns the string-value as an xdt:untypedAtomic.
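
The string-value and typed-value rules for elements constructed from an infoset can be approximated with Python's xml.etree.ElementTree, used here only as a convenient stand-in for a data model instance. The (type name, value) tuple is a hypothetical representation of an atomic value.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Concatenate all descendant text in document order ("" if there is none)."""
    return "".join(element.itertext())


def typed_value(element):
    """Without schema information, the typed value is the string value as xdt:untypedAtomic."""
    return ("xdt:untypedAtomic", string_value(element))


root = ET.fromstring("<a>x<b>y</b>z</a>")
empty = ET.fromstring("<a/>")
```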

7.2.4 Construction from a PSVI

The following Element Node properties are affected by PSVI properties.

type-name
children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

For elements with schema simple types, or complex types with simple content, if the [schema normalized value] PSVI property exists, the processor may use a sequence of nodes containing the Processing Instruction and Comment Nodes corresponding to the processing instruction and comment information items found in the [children] property, plus an optional single Text Node whose string value is the [schema normalized value] for the children property. If the [schema normalized value] is the empty string, the Text Node must not be present, otherwise it must be present.

The relative order of Processing Instruction and Comment Nodes must be preserved, but the position of the Text Node, if it is present, among them is implementation defined.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

nilled

The nilled value, as calculated in 3.3.2 Mapping xsi:nil on Element Nodes.

string-value

The string-value is calculated as follows:

  • If the element is empty: its string value is the empty string, "".

  • If the element has a type of xdt:untyped, a complex type with element-only content, or a complex type with mixed content: its string-value is the concatenation of the string-values of all its Text Node descendants in document order.

  • If the element has a simple type or a complex type with simple content: its string-value is the [schema normalized value] of the node.

typed-value

The typed-value is calculated as follows:

  • If the nilled property is true, its typed-value is ().

  • If the element is of type xdt:untyped, its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • If the element is empty: its typed-value is the empty sequence, ().

  • If the element has a simple type or a complex type with simple content: its typed-value is computed as described in 3.3.1.2 Atomic Value Type Names. The result is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

  • If the element has a complex type with mixed content, its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise, the element must be a complex type with element-only content. The typed-value of such an element is undefined. Attempting to access this property with the dm:typed-value accessor always raises an error.

All other properties have values that are consistent with construction from an infoset.
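
The typed-value cases for PSVI construction can be read as a decision procedure. The sketch below assumes the cases are tried in the order listed; the parameter names, the content_kind labels, and the cast callable (standing in for the process of 3.3.1.2) are all hypothetical.

```python
def element_typed_value(nilled, type_name, content_kind, is_empty, string_value, cast=None):
    """Return the typed value as a list of (type name, value) tuples.

    content_kind is "simple" (simple type or complex type with simple content),
    "mixed", or "element-only". cast maps a string to a list of atomic values.
    """
    if nilled:
        return []
    if type_name == "xdt:untyped":
        return [("xdt:untypedAtomic", string_value)]
    if is_empty:
        return []
    if content_kind == "simple":
        return cast(string_value)
    if content_kind == "mixed":
        return [("xdt:untypedAtomic", string_value)]
    # Element-only content: the typed value is undefined and access is an error.
    raise TypeError("typed value is undefined for element-only content")


# Hypothetical cast for xs:integer content.
integer_cast = lambda s: [("xs:integer", int(s))]
offset_value = element_typed_value(False, "xs:integer", "simple", False, "0030", integer_cast)
```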

7.2.5 Infoset Mapping

An Element Node maps to an element information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name, if it is known, otherwise not known.

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[attributes]

A list of information items obtained by processing each of the dm:attributes and mapping each to the appropriate information item(s).

[in-scope namespaces]

An unordered set of namespace information items constructed from the in-scope-namespaces.

Each in-scope namespace maps to a namespace information item. The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The URI associated with the namespace.

[base URI]

The value of dm:base-uri.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following property has no value: [namespace attributes].

7.3 Attribute Nodes

7.3.1 Overview

Attribute Nodes represent XML attributes. Attributes have the following properties:

  • node-name

  • parent, possibly empty

  • type-name

  • string-value

  • typed-value

Attribute Nodes must satisfy the following constraints.

  1. If an Attribute Node A is among the attributes of an element E, then the parent of A must be E.

  2. If an Attribute Node A has a parent element E, then A must be among the attributes of E.

    The data model permits Attribute Nodes without parents (to represent partial results during expression processing, for example). Such attributes must not appear among the attributes of any Element Node.

For convenience, the Element Node that owns this attribute is called its "parent" even though an Attribute Node is not a "child" of its parent element.

7.3.2 Accessors

dm:base-uri

If the attribute has a parent, returns the value of the dm:base-uri of its parent; otherwise it returns ().

dm:node-name

Returns the value of the node-name property.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the string-value property.

dm:typed-value

Returns the value of the typed-value property.

dm:type-name

Returns the value of the type-name property.

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

7.3.3 Construction from an Infoset

The attribute information items are required. An Attribute Node is constructed for each attribute information item.

The following infoset properties are required: [namespace name], [local name], [normalized value], [attribute type], and [owner element].

Attribute Node properties are derived from the infoset as follows:

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

string-value

The [normalized value] property.

parent

The Element Node that corresponds to the value of the [owner element] property.

type-name
  • If the [attribute type] property has one of the following values: ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, or NMTOKENS, an xs:QName with the [attribute type] as the local name and "http://www.w3.org/2001/XMLSchema" as the namespace name.

  • Otherwise, xdt:untypedAtomic.

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xdt:untypedAtomic: its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise: its typed-value is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.
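The type-name rule for Infoset construction can be sketched as a small lookup. This is a minimal illustration under stated assumptions: a QName is modeled as a (namespace name, local name) pair, and XDT_NS is a placeholder string standing for the xdt namespace, whose actual URI is not given in this section.

```python
XS_NS = "http://www.w3.org/2001/XMLSchema"
XDT_NS = "xdt"  # placeholder, not the real namespace URI

# [attribute type] values that map directly to XML Schema type names
DTD_TYPES = {"ID", "IDREF", "IDREFS", "ENTITY", "ENTITIES", "NMTOKEN", "NMTOKENS"}


def infoset_type_name(attribute_type):
    """Return the type-name for an attribute built from an Infoset.

    attribute_type is the value of the [attribute type] property,
    or None when the property has no value.
    """
    if attribute_type in DTD_TYPES:
        return (XS_NS, attribute_type)
    return (XDT_NS, "untypedAtomic")
```

Under this sketch, an attribute declared as CDATA, or one with no declaration at all, is typed as xdt:untypedAtomic.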

7.3.4 Construction from a PSVI

The following Attribute Node properties are affected by PSVI properties.

string-value
  • The [schema normalized value] PSVI property if that exists.

  • Otherwise, the [normalized value] property.

type-name
  • If the [validity] property does not exist on this node or any of its ancestors, Infoset processing applies.

    Note that this processing is only performed if neither the node nor any of its ancestors was schema validated. In particular, Infoset-only processing does not apply to subtrees that are "skip" validated in a document.

  • If the [validity] property exists and is "valid", type is assigned as described in 3.3.1 Mapping PSVI Additions to Type Names.

  • Otherwise, xdt:untypedAtomic.

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xdt:untypedAtomic: its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise: its typed-value is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation. The type of each atomic value is assigned as described in 3.3.1.2 Atomic Value Type Names.

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance", (xsi:schemaLocation, xsi:type, etc.) appear as ordinary attributes in the data model.
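The three-way choice of type-name during PSVI construction can be sketched as follows. The function and parameter names are hypothetical: validity_chain is assumed to hold the [validity] value of the attribute itself followed by those of its ancestors, with None where the property does not exist, and the two candidate type names are assumed to have been computed elsewhere.

```python
def psvi_type_name(validity_chain, infoset_type, psvi_type):
    """Pick the type-name of an attribute constructed from a PSVI.

    validity_chain[0] belongs to the attribute; later entries belong
    to its ancestors. None means the [validity] property is absent.
    """
    if all(validity is None for validity in validity_chain):
        # Neither the node nor any ancestor was schema validated.
        return infoset_type
    if validity_chain[0] == "valid":
        return psvi_type
    return "xdt:untypedAtomic"
```

Note that an attribute inside a "skip" validated subtree has no [validity] of its own but does have a validated ancestor, so in this sketch it falls through to xdt:untypedAtomic rather than to Infoset processing.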

7.3.5 Infoset Mapping

An Attribute Node maps to an attribute information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name, if it is known; otherwise unknown.

[normalized value]

The value of dm:string-value.

[owner element]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following properties have no value: [specified] [attribute type] [references].
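The three cases for [owner element] recur in the [parent] mappings of the other node kinds. A minimal sketch, assuming a hypothetical Node class and a caller-supplied item_for function that returns the information item for a node:

```python
from dataclasses import dataclass
from typing import Optional

UNKNOWN = "unknown"  # stands for the infoset value "unknown"
NO_VALUE = None      # stands for "no value"


@dataclass(eq=False)
class Node:
    label: str
    parent: Optional["Node"] = None


def owner_element(node, mapping_root, item_for):
    """Apply the three cases in the order in which they are listed."""
    if node is mapping_root:
        return UNKNOWN
    if node.parent is not None:
        return item_for(node.parent)
    return NO_VALUE
```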

7.4 Namespace Nodes

7.4.1 Overview

Namespace Nodes encapsulate XML namespaces. Namespaces have the following properties:

  • prefix, possibly empty

  • uri

  • parent, possibly empty

Namespace Nodes must satisfy the following constraints.

  1. If a Namespace Node N is among the namespaces of an element E, then the parent of N, if it has one, must be E.

  2. If a Namespace Node N has a parent element E, then N must be among the namespaces of E.

The data model permits Namespace Nodes without parents; see below.

In XPath 1.0, Namespace Nodes were directly accessible by applications, by means of the namespace axis. In XPath 2.0 the namespace axis is deprecated, and it is not available at all in XQuery 1.0. XPath 2.0 implementations are not required to expose the namespace axis, though they may do so if they wish to offer backwards compatibility.

The information held in namespace nodes is instead made available to applications using functions defined in [Functions and Operators]. Some properties of Namespace Nodes are not exposed by these functions: in particular, properties related to the identity of Namespace Nodes, their parentage, and their position in document order. Implementations that do not expose the namespace axis can therefore avoid the overhead of maintaining this information.

Implementations that expose the namespace axis must provide unique Namespace Nodes for each element. Each element has an associated set of Namespace Nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by [Namespaces in XML]) and one for the default namespace if one is in scope for the element. The element is the parent of each of these Namespace Nodes; however, a Namespace Node is not a child of its parent element. In implementations that expose the namespace axis, elements never share namespace nodes.

Note:

In implementations that do not expose the namespace axis, there is no means by which the host language can tell whether namespace nodes are shared. In such circumstances, sharing namespace nodes may be a very reasonable implementation strategy.
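The set of prefixes for which an element has Namespace Nodes can be sketched by folding namespace declarations from the root down to the element. This is an illustration, not a normative algorithm. The function name is hypothetical, the default namespace is modeled as the empty prefix "", and an empty URI is assumed to undeclare a binding.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def in_scope_namespaces(declarations_root_to_element):
    """Return a prefix -> URI mapping for the innermost element.

    Each entry of the argument is a dict of the namespace declarations
    on one element, ordered from the root down to the element itself.
    """
    scope = {"xml": XML_NS}  # the xml prefix is implicitly declared
    for declared in declarations_root_to_element:
        for prefix, uri in declared.items():
            if uri == "":
                scope.pop(prefix, None)  # undeclaration
            else:
                scope[prefix] = uri
    return scope
```

An implementation that exposes the namespace axis would then create one fresh Namespace Node per entry for each element, rather than sharing nodes between elements.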

7.4.2 Accessors

dm:base-uri

Returns ().

dm:node-name

If the prefix is available, returns an xs:QName with the value of the prefix property in the local-name and an empty namespace name, otherwise returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the uri property.

dm:typed-value

Returns the value of the uri property as an xs:string.

dm:type-name

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().
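The accessors that return a value for Namespace Nodes can be sketched as follows. The class is hypothetical: the empty sequence () is modeled as an empty Python tuple, an xs:QName as a (namespace name, local name) pair, and a missing or empty prefix is taken to mean that the prefix is not available.

```python
from dataclasses import dataclass
from typing import Optional

EMPTY = ()  # stands for the empty sequence ()


@dataclass(frozen=True)
class NamespaceNode:
    prefix: Optional[str]
    uri: str

    def node_name(self):
        # QName with an empty namespace name and the prefix as local name
        if self.prefix:
            return ("", self.prefix)
        return EMPTY

    def string_value(self):
        return self.uri

    def typed_value(self):
        return self.uri  # the uri property as an xs:string

    def type_name(self):
        return EMPTY
```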

7.4.3 Construction from an Infoset

The namespace information items are required.

The following infoset properties are required: [prefix], [namespace name].

Namespace Node properties are derived from the infoset as follows:

prefix

The [prefix] property.

uri

The [namespace name] property.

parent

The element in whose in-scope namespaces property the namespace information item appears, if the implementation exposes any mechanism for accessing the dm:parent accessor of Namespace Nodes.

7.4.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

7.4.5 Infoset Mapping

A Namespace Node maps to a namespace information item.

The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The value of dm:string-value.

7.5 Processing Instruction Nodes

7.5.1 Overview

Processing Instruction Nodes encapsulate XML processing instructions. Processing instructions have the following properties:

  • target

  • content

  • base-uri, possibly empty

  • parent, possibly empty

Processing Instruction Nodes must satisfy the following constraints.

  1. The string "?>" must not occur within the content.

  2. The target must be an NCName.
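A check for these two constraints might look like the sketch below. The function name is hypothetical, and the regular expression is a deliberately simplified, ASCII-only approximation of NCName; the real production admits many more characters.

```python
import re

# ASCII-only approximation of NCName (an assumption, not the full production)
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def check_processing_instruction(target, content):
    """Return a list of constraint violations; an empty list means none."""
    problems = []
    if "?>" in content:
        problems.append('content contains "?>"')
    if not _ASCII_NCNAME.fullmatch(target):
        problems.append("target is not an NCName")
    return problems
```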

7.5.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the processing instruction has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-name

Returns an xs:QName with the value of the target property in the local-name and an empty namespace name.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the value of the content property as an xs:string.

dm:type-name

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().
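The fallback order used by dm:base-uri for Processing Instruction Nodes can be sketched as a single function. The name and parameters are hypothetical: None models a missing base-uri property, and the empty sequence () is modeled as an empty tuple.

```python
EMPTY = ()  # stands for the empty sequence ()


def processing_instruction_base_uri(own_base_uri, has_parent, parent_base_uri=EMPTY):
    """Prefer the node's own base-uri, then the parent's, then ()."""
    if own_base_uri:  # exists and is not empty
        return own_base_uri
    if has_parent:
        return parent_base_uri  # may itself be ()
    return EMPTY
```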

7.5.3 Construction from an Infoset

A Processing Instruction Node is constructed for each processing instruction information item that is not ignored.

The following infoset properties are required: [target], [content], [base URI], and [parent].

Processing Instruction Node properties are derived from the infoset as follows:

target

The value of the [target] property.

The target must be an NCName. It is an error if the [target] property in the Infoset does not conform to an NCName; the processor may recover from this error by ignoring the entire processing instruction. The processor must not create a Processing Instruction Node for such a processing instruction.

content

The value of the [content] property.

base-uri

The value of the [base URI] property.

parent

The node corresponding to the value of the [parent] property.

There are no Processing Instruction Nodes for processing instructions that are children of a document type declaration information item.

7.5.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

7.5.5 Infoset Mapping

A Processing Instruction Node maps to a processing instruction information item.

The following properties are specified by this mapping:

[target]

The local part of the value of dm:node-name.

[content]

The value of dm:string-value.

[base URI]

The value of dm:base-uri.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following property has no value: [notation].

7.6 Comment Nodes

7.6.1 Overview

Comment Nodes encapsulate XML comments. Comments have the following properties:

  • content

  • parent, possibly empty

Comment Nodes must satisfy the following constraints.

  1. The string "--" must not occur within the content.

  2. The character "-" must not occur as the last character of the content.
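Both constraints amount to simple string tests, as in the following sketch (the function name is hypothetical):

```python
def is_valid_comment_content(content):
    """True if the content contains no "--" and does not end with "-"."""
    return "--" not in content and not content.endswith("-")
```

Together the two tests ensure that the content can be written between the comment delimiters without ending the comment early.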

7.6.2 Accessors

dm:base-uri

If the comment has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-name

Returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the value of the content property as an xs:string.

dm:type-name

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

7.6.3 Construction from an Infoset

The comment information items are optional.

A Comment Node is constructed for each comment information item that is not ignored.

The following infoset properties are required: [content] and [parent].

Comment Node properties are derived from the infoset as follows:

content

The value of the [content] property.

parent

The node corresponding to the value of the [parent] property.

There are no Comment Nodes for comments that are children of a document type declaration information item.

7.6.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

7.6.5 Infoset Mapping

A Comment Node maps to a comment information item.

The following properties are specified by this mapping:

[content]

The value of the dm:string-value.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

7.7 Text Nodes

7.7.1 Overview

Text Nodes encapsulate XML character content. Text has the following properties:

  • content, possibly empty

  • parent, possibly empty

Text Nodes must satisfy the following constraint:

  1. If the parent of a text node is not empty, the Text Node must not contain the empty string as its content.

In addition, Document and Element Nodes impose the constraint that two consecutive Text Nodes can never occur as adjacent siblings. When a Document or Element Node is constructed, Text Nodes that would be adjacent are combined into a single Text Node. If the resulting Text Node is empty, it is never placed among the children of its parent; it is simply discarded.
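The merging and discarding described above can be sketched as a single pass over a prospective child list. This is an illustration only; children are modeled as hypothetical (kind, payload) pairs, with kind "text" marking a Text Node.

```python
def normalize_children(children):
    """Merge adjacent text entries, then drop text entries that are empty."""
    merged = []
    for kind, payload in children:
        if kind == "text" and merged and merged[-1][0] == "text":
            # Combine with the preceding Text Node instead of adding a sibling.
            merged[-1] = ("text", merged[-1][1] + payload)
        else:
            merged.append((kind, payload))
    # An empty Text Node is never placed among the children.
    return [child for child in merged if child != ("text", "")]
```

Because every empty text entry has already been merged with any text neighbour, dropping the remaining empty ones cannot leave two Text Nodes adjacent.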

7.7.2 Accessors

dm:base-uri

If the Text Node has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-name

Returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the value of the content property as an xdt:untypedAtomic.

dm:type-name

Returns xdt:untypedAtomic.

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

7.7.3 Construction from an Infoset

The character information items are required. A Text Node is constructed for each maximal sequence of character information items.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content whitespace].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.
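For the children of a single parent, maximal sequences are the runs of consecutive character information items. A minimal sketch, assuming items are hypothetical (kind, payload) pairs in which a character item is ("character", code point):

```python
from itertools import groupby


def text_contents(items):
    """Return the content string of each maximal character sequence.

    items are the children of one parent, in order, so the shared-parent
    constraint holds by construction.
    """
    contents = []
    for is_character, run in groupby(items, key=lambda item: item[0] == "character"):
        if is_character:
            contents.append("".join(chr(code) for _, code in run))
    return contents
```

groupby yields each uninterrupted run exactly once, so no run can be extended and none overlaps another.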

Text Node properties are derived from the infoset as follows:

content

A string composed of the characters that correspond to the [character code] properties of each of the character information items.

If the resulting Text Node consists entirely of white space and the Text Node occurs in element content (as defined by XML), the content of the Text Node is the empty string.

The content of the Text Node is not necessarily normalized as described in the [Character Model]. It is the responsibility of data producers to provide appropriately normalized text, and the responsibility of applications to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.

7.7.4 Construction from a PSVI

For Text Nodes constructed from the [schema normalized value] of elements, content contains the value of the [schema normalized value].

Otherwise, construction from a PSVI is the same as construction from the Infoset. In the PSVI, element content occurs where the {content type} of the element containing the text is not “mixed”.

7.7.5 Infoset Mapping

A Text Node maps to a sequence of character information items.

Each character of the dm:string-value of the node is converted into a character information item as specified by this mapping:

[character code]

The Unicode code point value of the character.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

[element content whitespace]

Unknown.

This sequence of characters constitutes the infoset mapping.
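The per-character mapping can be sketched as follows. The function name and the dictionary representation of an information item are hypothetical; the caller is assumed to have already resolved the [parent] value according to the three cases above.

```python
def text_to_character_items(string_value, parent=None):
    """Map each character of a Text Node to a character information item."""
    return [
        {
            "character code": ord(character),  # Unicode code point
            "parent": parent,
            "element content whitespace": "unknown",
        }
        for character in string_value
    ]
```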

8 Conformance

The data model is intended primarily as a component that can be used by other specifications. Therefore, the data model relies on specifications that use it (such as [XPath 2.0], [XSLT 2.0], and [XQuery]) to specify conformance criteria for the data model in their respective environments. Specifications that set conformance criteria for their use of the data model must not relax the constraints expressed in this specification.

Authors of conformance criteria for the use of the data model should pay particular attention to the following features of the data model:

  1. Support for DTD processing (both validation and unparsed entities).

  2. Support for W3C XML Schema processing.

  3. Support for the normative construction from an infoset described in 3.2 Construction from an Infoset.

  4. Support for the normative construction from a PSVI described in 3.3 Construction from a PSVI.

  5. Support for XML 1.0 and XML 1.1.

A XML Information Set Conformance

This specification conforms to the XML Information Set [Infoset]. The following information items must be exposed by the infoset producer to construct a data model unless they are explicitly identified as optional:

Other information items and properties made available by the Infoset processor are ignored. In addition to the properties above, the following properties are required from the PSVI if the data model is constructed from a PSVI:

B Error Summary

err:DMTY0001, undefined type

This error is raised whenever an accessor is called for a property that is undefined.

C References

C.1 Normative References

Infoset
XML Information Set (Second Edition), John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204. The latest version is available at http://www.w3.org/TR/xml-infoset.
Namespaces in XML
Namespaces in XML, Tim Bray, Dave Hollander, and Andrew Layman, Editors. World Wide Web Consortium, 14 Jan 1999. This version is http://www.w3.org/TR/1999/REC-xml-names-19990114. The latest version is available at http://www.w3.org/TR/REC-xml-names.
Namespaces in XML 1.1
Namespaces in XML 1.1, Andrew Layman, Richard Tobin, Tim Bray, and Dave Hollander, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-names11-20040204. The latest version is available at http://www.w3.org/TR/xml-names11/.
XPath 2.0
XML Path Language (XPath) 2.0, Mary F. Fernández, Michael Kay, Jonathan Robie, et al., Editors. World Wide Web Consortium, 12 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xpath20-20031112. The latest version is available at http://www.w3.org/TR/xpath20.
Functions and Operators
XQuery 1.0 and XPath 2.0 Functions and Operators, Ashok Malhotra, Jim Melton, and Norman Walsh, Editors. World Wide Web Consortium, 12 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xpath-functions-20031112/. The latest version is available at http://www.w3.org/TR/xpath-functions/.
Schema Part 1
XML Schema Part 1: Structures, Henry S. Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 02 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
Schema Part 2
XML Schema Part 2: Datatypes, Paul V. Biron and Ashok Malhotra, Editors. World Wide Web Consortium, 02 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.
Serialization
XSLT 2.0 and XQuery 1.0 Serialization, Michael Kay, Norman Walsh, and Henry Zongaro, Editors. World Wide Web Consortium, 12 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/. The latest version is available at http://www.w3.org/TR/xslt-xquery-serialization/.
Formal Semantics
XQuery 1.0 and XPath 2.0 Formal Semantics, Ashok Malhotra, Kristoffer Rose, Michael Rys, et al., Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-semantics-20030822/. The latest version is available at http://www.w3.org/TR/xquery-semantics/.
RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
Character Model
Character Model for the World Wide Web 1.0: Fundamentals, Tex Texin, Martin J. Dürst, François Yergeau, et al., Editors. World Wide Web Consortium, 25 Feb 2004. This version is http://www.w3.org/TR/2004/WD-charmod-20040225/. The latest version is available at http://www.w3.org/TR/charmod/.

C.2 Other References

XML Query Data Model
XML Query Data Model, Mary Fernández and Jonathan Robie, Editors. World Wide Web Consortium, 15 Feb 2001.
XML Base
XML Base, Jonathan Marsh, Editor. World Wide Web Consortium, 27 Jun 2001. This version is http://www.w3.org/TR/2001/REC-xmlbase-20010627/. The latest version is available at http://www.w3.org/TR/xmlbase/.
XPath 1.0
XML Path Language (XPath) Version 1.0, James Clark and Steven DeRose, Editors. World Wide Web Consortium, 16 Nov 1999. This version is http://www.w3.org/TR/1999/REC-xpath-19991116. The latest version is available at http://www.w3.org/TR/xpath.
XPath 2.0 Requirements
XPath Requirements Version 2.0, Mark Scardina and Mary Fernández, Editors. World Wide Web Consortium, 22 Aug 2003. This version is http://www.w3.org/TR/2003/WD-xpath20req-20030822. The latest version is available at http://www.w3.org/TR/xpath20req.
XSLT 2.0
XSL Transformations (XSLT) Version 2.0, Michael Kay, Editor. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xslt20-20030502/. The latest version is available at http://www.w3.org/TR/xslt20.
XML Query Working Group
XML Query Working Group, World Wide Web Consortium. Home page: http://www.w3.org/XML/Query
XSL Working Group
XSL Working Group, World Wide Web Consortium. Home page: http://www.w3.org/Style/XSL/
XQuery
XQuery 1.0: An XML Query Language, Daniela Florescu, Jonathan Robie, Jérôme Siméon, et al., Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-20030822/. The latest version is available at http://www.w3.org/TR/xquery.
XML Query Requirements
XML Query (XQuery) Requirements, Don Chamberlin, Peter Fankhauser, Massimo Marchiori, and Jonathan Robie, Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-requirements-20030627. The latest version is available at http://www.w3.org/TR/xquery-requirements.
ISO 8601
ISO (International Organization for Standardization). Representations of dates and times, 2000-08-03. Available from: http://www.iso.ch/

D Glossary (Non-Normative)

anonymous type name

An anonymous type name is an implementation-defined, unique type name provided by the processor for every anonymous type declared in the schemas available in the static context.

atomic type

An atomic type is a primitive simple type or a type derived by restriction from another atomic type.

atomic value

An atomic value is a value in the value space of an atomic type and is labeled with the name of that atomic type.

document

A tree whose root node is a Document Node is referred to as a document.

document order

A document order is defined among all the nodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order corresponds to the order in which the first character of the XML representation of each node occurs in the XML representation of the document.

expanded-QName

An expanded-QName is a pair of values consisting of a possibly empty namespace URI and a local name. They belong to the value space of the XML Schema type xs:QName. References to xs:QName in this document always mean the value space, i.e. a namespace URI, local name pair (and not the lexical space referring to constructs of the form “prefix:local-name”).

fragment

A tree whose root node is not a Document Node is referred to as a fragment.

implementation defined

Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.

implementation dependent

Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.

incompletely validated

An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.

instance of the data model

Every instance of the data model is a sequence.

item

An item is either a node or an atomic value.

Node

There are seven distinct kinds of Node: document, element, attribute, text, namespace, processing instruction, and comment.

primitive simple type

There are 24 primitive simple types: the 19 defined in Section 3.2 Primitive datatypes of [Schema Part 2] and xdt:anyAtomicType, xdt:untyped, xdt:untypedAtomic, xdt:dayTimeDuration, and xdt:yearMonthDuration.

sequence

A sequence is an ordered collection of zero or more items.

stable

Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.

E Example (Non-Normative)

Ed. Note: This appendix does not exactly reflect the current state of the documents. It will be updated before the next publication.

The following XML document is used to illustrate the information contained in a data model:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>
<catalog xmlns="http://www.example.com/catalog"
         xmlns:html="http://www.w3.org/1999/xhtml"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.example.com/catalog
                             dm-example.xsd"
         xml:lang="en"
         version="0.1">

<!-- This example is for data model illustration only.
     It does not demonstrate good schema design. -->

<tshirt code="T1534017" label=" Staind : Been Awhile "
        xlink:href="http://example.com/0,,1655091,00.html"
        sizes="M L XL">
  <title> Staind: Been Awhile Tee Black (1-sided) </title>
  <description>
    <html:p>
      Lyrics from the hit song 'It's Been Awhile'
      are shown in white, beneath the large
      'Flock &amp; Weld' Staind logo.
    </html:p>
  </description>
  <price> 25.00 </price>
</tshirt>

<album code="A1481344" label=" Staind : Its Been A While "
       formats="CD">
  <title> It's Been A While </title>
  <description xsi:nil="true" />
  <price currency="USD"> 10.99 </price>
  <artist> Staind </artist>
</album>

</catalog>

The document is associated with the URI "http://www.example.com/catalog.xml", and is valid with respect to the following XML schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cat="http://www.example.com/catalog"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://www.example.com/catalog"
           elementFormDefault="qualified">

<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://www.w3.org/2001/xml.xsd" />

<xs:import namespace="http://www.w3.org/1999/xlink"
           schemaLocation="http://www.cs.rpi.edu/~puninj/XGMML/xlinks-2001.xsd" />

<xs:element name="catalog">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="cat:_item" maxOccurs="unbounded" />
    </xs:sequence>
    <xs:attribute name="version" type="xs:string" fixed="0.1" use="required" />
    <xs:attribute ref="xml:base" />
    <xs:attribute ref="xml:lang" />
  </xs:complexType>
</xs:element>

<xs:element name="_item" type="cat:itemType" abstract="true" />

<xs:complexType name="itemType">
  <xs:sequence>
    <xs:element name="title" type="xs:token" />
    <xs:element name="description" type="cat:description" nillable="true" />
    <xs:element name="price" type="cat:price" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute name="label" type="xs:token" />
  <xs:attribute name="code" type="xs:ID" use="required" />
  <xs:attributeGroup ref="xlink:simpleLink" />
</xs:complexType>

<xs:element name="tshirt" type="cat:tshirtType" substitutionGroup="cat:_item" />

<xs:complexType name="tshirtType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:attribute name="sizes" type="cat:clothesSizes" use="required" />
      <xs:attribute ref="xml:lang" />
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:simpleType name="clothesSizes">
  <xs:union memberTypes="cat:sizeList">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="oneSize" />
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

<xs:simpleType name="sizeList">
  <xs:restriction>
    <xs:simpleType>
      <xs:list itemType="cat:clothesSize" />
    </xs:simpleType>
    <xs:minLength value="1" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="clothesSize">
  <xs:union memberTypes="cat:numberedSize cat:categorySize" />
</xs:simpleType>

<xs:simpleType name="numberedSize">
  <xs:restriction base="xs:integer">
    <xs:enumeration value="4" />
    <xs:enumeration value="6" />
    <xs:enumeration value="8" />
    <xs:enumeration value="10" />
    <xs:enumeration value="12" />
    <xs:enumeration value="14" />
    <xs:enumeration value="16" />
    <xs:enumeration value="18" />
    <xs:enumeration value="20" />
    <xs:enumeration value="22" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="categorySize">
  <xs:restriction base="xs:token">
    <xs:enumeration value="XS" />
    <xs:enumeration value="S" />
    <xs:enumeration value="M" />
    <xs:enumeration value="L" />
    <xs:enumeration value="XL" />
    <xs:enumeration value="XXL" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="album" type="cat:albumType" substitutionGroup="cat:_item" />

<xs:complexType name="albumType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:sequence>
        <xs:element name="artist" type="xs:string" />
      </xs:sequence>
      <xs:attribute name="formats" type="cat:formatsType" use="required" />
      <xs:attribute ref="xml:lang" />
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:simpleType name="formatsType">
  <xs:list itemType="cat:formatType" />
</xs:simpleType>

<xs:simpleType name="formatType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="CD" />
    <xs:enumeration value="MiniDisc" />
    <xs:enumeration value="tape" />
    <xs:enumeration value="vinyl" />
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="description" mixed="true">
  <xs:sequence>
    <xs:any namespace="http://www.w3.org/1999/xhtml" processContents="lax"
            minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:complexType name="price">
  <xs:simpleContent>
    <xs:extension base="cat:monetaryAmount">
      <xs:attribute name="currency" type="cat:currencyType" default="USD" />
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="currencyType">
  <xs:restriction base="xs:token">
    <xs:pattern value="[A-Z]{3}" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="monetaryAmount">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="2" />
    <xs:pattern value="\d+\.\d{2}" />
  </xs:restriction>
</xs:simpleType>

</xs:schema>

This example exposes the data model for a document that has an associated schema and has been validated successfully against it. In general, an XML Schema is not required; that is, the data model can represent a schemaless, well-formed XML document with the rules described in 2.6 Types.

The XML document is represented by the nodes described below. The value D1 represents a Document Node; the values E1, E2, etc. represent Element Nodes; the values A1, A2, etc. represent Attribute Nodes; the values N1, N2, etc. represent Namespace Nodes; the values P1, P2, etc. represent Processing Instruction Nodes; the values T1, T2, etc. represent Text Nodes.

For brevity:

// Document node D1
dm:base-uri(D1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(D1) "document"
dm:string-value(D1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:children(D1) ([E1])
 
// Namespace node N1
dm:node-kind(N1) "namespace"
dm:node-name(N1) xs:QName("", "xml")
dm:string-value(N1) = "http://www.w3.org/XML/1998/namespace"
 
// Namespace node N2
dm:node-kind(N2) "namespace"
dm:node-name(N2) ()
dm:string-value(N2) = "http://www.example.com/catalog"
 
// Namespace node N3
dm:node-kind(N3) "namespace"
dm:node-name(N3) xs:QName("", "html")
dm:string-value(N3) = "http://www.w3.org/1999/xhtml"
 
// Namespace node N4
dm:node-kind(N4) "namespace"
dm:node-name(N4) xs:QName("", "xlink")
dm:string-value(N4) = "http://www.w3.org/1999/xlink"
 
// Namespace node N5
dm:node-kind(N5) "namespace"
dm:node-name(N5) xs:QName("", "xsi")
dm:string-value(N5) = "http://www.w3.org/2001/XMLSchema-instance"
 
// Processing Instruction node P1
dm:base-uri(P1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(P1) "processing-instruction"
dm:node-name(P1) xs:QName("", "xml-stylesheet")
dm:string-value(P1) = "type="text/xsl"  href="dm-example.xsl""
dm:parent(P1) ([D1])
 
// Element node E1
dm:base-uri(E1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E1) "element"
dm:node-name(E1) xs:QName("http://www.example.com/catalog", "catalog")
dm:string-value(E1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:typed-value(E1) fn:error()
// xs:anyType because of the anonymous type definition
dm:type(E1) xs:anyType
dm:parent(E1) ([D1])
dm:children(E1) ([E2], [E7])
dm:attributes(E1) ([A1], [A2], [A3])
dm:namespaces(E1) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A1
dm:node-kind(A1) "attribute"
dm:node-name(A1) xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:schemaLocation")
dm:string-value(A1) = "http://www.example.com/catalog                                                            dm-example.xsd"
dm:typed-value(A1) (xs:anyURI("http://www.example.com/catalog"), xs:anyURI("dm-example.xsd"))
dm:type(A1) xs:anySimpleType
dm:parent(A1) ([E1])
 
// Attribute node A2
dm:node-kind(A2) "attribute"
dm:node-name(A2) xs:QName("http://www.w3.org/XML/1998/namespace", "xml:lang")
dm:string-value(A2) = "en"
dm:typed-value(A2) "en"
dm:type(A2) xs:NMTOKEN
dm:parent(A2) ([E1])
 
// Attribute node A3
dm:node-kind(A3) "attribute"
dm:node-name(A3) xs:QName("", "version")
dm:string-value(A3) = "0.1"
dm:typed-value(A3) "0.1"
dm:type(A3) xs:string
dm:parent(A3) ([E1])
 
// Comment node C1
dm:base-uri(C1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(C1) "comment"
dm:string-value(C1) = "  This  example  is  for  data  model  illustration  only.\n          It  does  not  demonstrate  good  schema  design.  "
dm:typed-value(C1)
dm:parent(C1) ([E1])
 
// Element node E2
dm:base-uri(E2) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E2) "element"
dm:node-name(E2) xs:QName("http://www.example.com/catalog", "tshirt")
dm:string-value(E2) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00  "
dm:typed-value(E2) fn:error()
dm:type(E2) cat:tshirtType
dm:parent(E2) ([E1])
dm:children(E2) ([E3], [E4], [E6])
dm:attributes(E2) ([A4], [A5], [A6], [A7])
dm:namespaces(E2) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A4
dm:node-kind(A4) "attribute"
dm:node-name(A4) xs:QName("", "code")
dm:string-value(A4) = "T1534017"
dm:typed-value(A4) xs:ID("T1534017")
dm:type(A4) xs:ID
dm:parent(A4) ([E2])
 
// Attribute node A5
dm:node-kind(A5) "attribute"
dm:node-name(A5) xs:QName("", "label")
dm:string-value(A5) = "Staind  :  Been  Awhile"
dm:typed-value(A5) xs:token("Staind : Been Awhile")
dm:type(A5) xs:token
dm:parent(A5) ([E2])
 
// Attribute node A6
dm:node-kind(A6) "attribute"
dm:node-name(A6) xs:QName("http://www.w3.org/1999/xlink", "xlink:href")
dm:string-value(A6) = "http://example.com/0,,1655091,00.html"
dm:typed-value(A6) xs:anyURI("http://example.com/0,,1655091,00.html")
dm:type(A6) xs:anyURI
dm:parent(A6) ([E2])
 
// Attribute node A7
dm:node-kind(A7) "attribute"
dm:node-name(A7) xs:QName("", "sizes")
dm:string-value(A7) = "M  L  XL"
dm:typed-value(A7) (xs:anySimpleType("M"), xs:anySimpleType("L"), xs:anySimpleType("XL"))
dm:type(A7) cat:sizeList
dm:parent(A7) ([E2])
 
// Element node E3
dm:base-uri(E3) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E3) "element"
dm:node-name(E3) xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E3) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(E3) xs:token("Staind: Been Awhile Tee Black (1-sided)")
dm:type(E3) xs:token
dm:parent(E3) ([E2])
dm:children(E3) ()
dm:attributes(E3) ()
dm:namespaces(E3) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T1
dm:base-uri(T1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T1) "text"
dm:string-value(T1) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(T1) xs:anySimpleType("Staind:  Been  Awhile  Tee  Black  (1-sided)")
dm:type(T1) xs:anySimpleType
dm:parent(T1) ([E3])
 
// Element node E4
dm:base-uri(E4) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E4) "element"
dm:node-name(E4) xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E4) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E4) fn:error()
dm:type(E4) cat:description
dm:parent(E4) ([E2])
dm:children(E4) ([E5])
dm:attributes(E4) ()
dm:namespaces(E4) ([N1], [N2], [N3], [N4], [N5])
 
// Element node E5
dm:base-uri(E5) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E5) "element"
dm:node-name(E5) xs:QName("http://www.w3.org/1999/xhtml", "html:p")
dm:string-value(E5) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E5) xdt:untypedAtomic(dm:string-value())
dm:type(E5) xs:anyType
dm:parent(E5) ([E4])
dm:children(E5) ()
dm:attributes(E5) ()
dm:namespaces(E5) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T2
dm:base-uri(T2) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T2) "text"
dm:string-value(T2) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(T2) xs:anySimpleType("\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        ")
dm:type(T2) xs:anySimpleType
dm:parent(T2) ([E5])
 
// Element node E6
dm:base-uri(E6) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E6) "element"
dm:node-name(E6) xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E6) = "25.00"
// The typed-value is based on the content type of the complex type for the element
dm:typed-value(E6) cat:monetaryAmount(25.0)
dm:type(E6) cat:price
dm:parent(E6) ([E2])
dm:children(E6) ()
dm:attributes(E6) ()
dm:namespaces(E6) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T3
dm:base-uri(T3) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T3) "text"
dm:string-value(T3) = "25.00"
dm:typed-value(T3) xs:anySimpleType("25.00")
dm:type(T3) xs:anySimpleType
dm:parent(T3) ([E6])
 
// Element node E7
dm:base-uri(E7) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E7) "element"
dm:node-name(E7) xs:QName("http://www.example.com/catalog", "album")
dm:string-value(E7) = "  It's  Been  A  While    10.99    Staind  "
dm:typed-value(E7) fn:error()
dm:type(E7) cat:albumType
dm:parent(E7) ([E1])
dm:children(E7) ([E8], [E9], [E10], [E11])
dm:attributes(E7) ([A8], [A9], [A10])
dm:namespaces(E7) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A8
dm:node-kind(A8) "attribute"
dm:node-name(A8) xs:QName("", "code")
dm:string-value(A8) = "A1481344"
dm:typed-value(A8) xs:ID("A1481344")
dm:type(A8) xs:ID
dm:parent(A8) ([E7])
 
// Attribute node A9
dm:node-kind(A9) "attribute"
dm:node-name(A9) xs:QName("", "label")
dm:string-value(A9) = "Staind  :  Its  Been  A  While"
dm:typed-value(A9) xs:token("Staind : Its Been A While")
dm:type(A9) xs:token
dm:parent(A9) ([E7])
 
// Attribute node A10
dm:node-kind(A10) "attribute"
dm:node-name(A10) xs:QName("", "formats")
dm:string-value(A10) = "CD"
dm:typed-value(A10) cat:formatType("CD")
dm:type(A10) cat:formatsType
dm:parent(A10) ([E7])
 
// Element node E8
dm:base-uri(E8) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E8) "element"
dm:node-name(E8) xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E8) = "It's  Been  A  While"
dm:typed-value(E8) xs:token("It's Been A While")
dm:type(E8) xs:token
dm:parent(E8) ([E7])
dm:children(E8) ()
dm:attributes(E8) ()
dm:namespaces(E8) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T4
dm:base-uri(T4) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T4) "text"
dm:string-value(T4) = "It's  Been  A  While"
dm:typed-value(T4) xs:anySimpleType("It's  Been  A  While")
dm:type(T4) xs:anySimpleType
dm:parent(T4) ([E8])
 
// Element node E9
dm:base-uri(E9) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E9) "element"
dm:node-name(E9) xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E9) = ""
// xsi:nil is true so the typed value is the empty sequence
dm:typed-value(E9) ()
dm:type(E9) cat:description
dm:parent(E9) ([E7])
dm:children(E9) ()
dm:attributes(E9) ([A11])
dm:namespaces(E9) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A11
dm:node-kind(A11) "attribute"
dm:node-name(A11) xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:nil")
dm:string-value(A11) = "true"
dm:typed-value(A11) xs:boolean("true")
dm:type(A11) xs:boolean
dm:parent(A11) ([E9])
 
// Element node E10
dm:base-uri(E10) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E10) "element"
dm:node-name(E10) xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E10) = "10.99"
dm:typed-value(E10) cat:monetaryAmount(10.99)
dm:type(E10) cat:price
dm:parent(E10) ([E7])
dm:children(E10) ()
dm:attributes(E10) ([A12])
dm:namespaces(E10) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A12
dm:node-kind(A12) "attribute"
dm:node-name(A12) xs:QName("", "currency")
dm:string-value(A12) = "USD"
dm:typed-value(A12) cat:currencyType("USD")
dm:type(A12) cat:currencyType
dm:parent(A12) ([E10])
 
// Text node T5
dm:base-uri(T5) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T5) "text"
dm:string-value(T5) = "10.99"
dm:typed-value(T5) xs:anySimpleType("10.99")
dm:type(T5) xs:anySimpleType
dm:parent(T5) ([E10])
 
// Element node E11
dm:base-uri(E11) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E11) "element"
dm:node-name(E11) xs:QName("http://www.example.com/catalog", "artist")
dm:string-value(E11) = "  Staind  "
dm:typed-value(E11) " Staind "
dm:type(E11) xs:string
dm:parent(E11) ([E7])
dm:children(E11) ()
dm:attributes(E11) ()
dm:namespaces(E11) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T6
dm:base-uri(T6) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T6) "text"
dm:string-value(T6) = "  Staind  "
dm:typed-value(T6) xs:anySimpleType("  Staind  ")
dm:type(T6) xs:anySimpleType
dm:parent(T6) ([E11])
 

A graphical representation of the data model for the preceding example is shown below. Document order in this representation can be found by following the traditional in-order, left-to-right, depth-first traversal; however, because the image has been rotated for easier presentation, this appears to be in-order, bottom-to-top, depth-first order.

Figure: Graphic representation of the data model.

F Accessor Summary (Non-Normative)

This section summarizes the return values of each accessor by node type.

F.1 dm:base-uri Accessor

Document Nodes

Returns the value of the base-uri property if it exists and is not empty, otherwise returns ().

Element Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Attribute Nodes

If the attribute has a parent, returns the value of the dm:base-uri of its parent; otherwise it returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the processing instruction has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Comment Nodes

If the comment has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Text Nodes

If the Text Node has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().
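
The per-kind rules above can be read as a single recursive lookup. The following Python sketch is illustrative only: the `Node` class, its field names, and the use of `None` to stand in for the empty sequence `()` are assumptions made for this example and are not part of the data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node record; only the fields needed for this sketch."""
    kind: str                        # "document", "element", "attribute", ...
    base_uri: Optional[str] = None   # the node's own base-uri property, if any
    parent: Optional["Node"] = None


# Node kinds that consult their own base-uri property before their parent.
_OWN_PROPERTY_KINDS = {"document", "element", "processing-instruction"}


def dm_base_uri(node: Node) -> Optional[str]:
    """Return the base URI by node kind; None stands in for ()."""
    if node.kind == "namespace":
        return None
    if node.kind in _OWN_PROPERTY_KINDS and node.base_uri:
        return node.base_uri
    if node.kind != "document" and node.parent is not None:
        return dm_base_uri(node.parent)
    return None
```

Under these assumptions, an Attribute Node whose parent element has no base-uri of its own would receive the base URI of the nearest ancestor that has one.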

F.2 dm:node-name Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the node-name property.

Attribute Nodes

Returns the value of the node-name property.

Namespace Nodes

If the prefix is available, returns an xs:QName with the value of the prefix property in the local-name and an empty namespace name, otherwise returns ().

Processing Instruction Nodes

Returns an xs:QName with the value of the target property in the local-name and an empty namespace name.

Comment Nodes

Returns ().

Text Nodes

Returns ().
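
As a rough illustration of the dm:node-name rules above, the sketch below dispatches on node kind. The `QName` tuple, the keyword parameters, and the use of `None` for `()` are hypothetical conveniences for this example, not names defined by this specification.

```python
from typing import NamedTuple, Optional


class QName(NamedTuple):
    """Hypothetical expanded name: namespace name plus local name."""
    namespace: str
    local: str


def dm_node_name(kind: str, *, node_name: Optional[QName] = None,
                 prefix: Optional[str] = None,
                 target: Optional[str] = None) -> Optional[QName]:
    """Return the node name by node kind; None stands in for ()."""
    if kind in ("element", "attribute"):
        return node_name
    if kind == "namespace":
        # A namespace node without a prefix (the default namespace) has no name.
        return QName("", prefix) if prefix else None
    if kind == "processing-instruction":
        return QName("", target)
    # Document, comment, and text nodes have no name.
    return None
```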

F.3 dm:parent Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the parent property.

Attribute Nodes

Returns the value of the parent property.

Namespace Nodes

Returns the value of the parent property.

Processing Instruction Nodes

Returns the value of the parent property.

Comment Nodes

Returns the value of the parent property.

Text Nodes

Returns the value of the parent property.

F.4 dm:string-value Accessor

Document Nodes

Returns the value of the string-value property.

Element Nodes

Returns the value of the string-value property.

Attribute Nodes

Returns the value of the string-value property.

Namespace Nodes

Returns the value of the uri property.

Processing Instruction Nodes

Returns the value of the content property.

Comment Nodes

Returns the value of the content property.

Text Nodes

Returns the value of the content property.

F.5 dm:typed-value Accessor

Document Nodes

Returns the value of the typed-value property.

Element Nodes

Returns the value of the typed-value property.

Attribute Nodes

Returns the value of the typed-value property.

Namespace Nodes

Returns the value of the uri property as an xs:string.

Processing Instruction Nodes

Returns the value of the content property as an xs:string.

Comment Nodes

Returns the value of the content property as an xs:string.

Text Nodes

Returns the value of the content property as an xdt:untypedAtomic.

F.6 dm:type-name Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the type-name property.

Attribute Nodes

Returns the value of the type-name property.

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns xdt:untypedAtomic.

F.7 dm:children Accessor

Document Nodes

Returns the value of the children property.

Element Nodes

Returns the value of the children property.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

F.8 dm:attributes Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the attributes property. The order of Attribute Nodes is stable but implementation dependent.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

F.9 dm:namespaces Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the namespaces property. The order of Namespace Nodes is stable but implementation dependent.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

F.10 dm:nilled Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the nilled property.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

G Infoset Construction Summary (Non-Normative)

This section summarizes data model construction from an Infoset for each kind of information item. General notes occur elsewhere.

G.1 Document Nodes Information Items

The document information item is required. A Document Node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, and comment found in the [children] property, a corresponding Element, Processing Instruction, or Comment Node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the document type declaration information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation defined.

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the document has no such descendants, "".

typed-value

The dm:string-value of the node as an xdt:untypedAtomic value.

document-uri

The document-uri property holds the absolute URI for the resource from which the document node was constructed, if one is available and can be made absolute. For example, if a collection of documents is returned by the fn:collection function, the document-uri property may serve to distinguish between them even though each has the same base-uri property.

If the document-uri is not (), then the following constraint must hold: evaluating fn:doc() with the document-uri as its argument must return the document node that provided the value of the document-uri property.

In other words, for any Document Node $arg, either fn:document-uri($arg) must return the empty sequence or fn:doc(fn:document-uri($arg)) must return $arg.
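
The constraint can be expressed as a small check. In the Python sketch below, `DocumentNode`, `satisfies_document_uri_constraint`, and the `doc` callable (a stand-in for fn:doc) are hypothetical names introduced for illustration; `None` stands in for the empty sequence.

```python
from typing import Callable, Optional


class DocumentNode:
    """Hypothetical Document Node carrying only a document-uri property."""

    def __init__(self, document_uri: Optional[str] = None):
        self.document_uri = document_uri


def satisfies_document_uri_constraint(
        node: DocumentNode,
        doc: Callable[[str], DocumentNode]) -> bool:
    """True if document-uri is empty or doc(document-uri) is the same node."""
    uri = node.document_uri
    # Node identity, not equality, is what the constraint requires.
    return uri is None or doc(uri) is node
```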

G.2 Element Nodes Information Items

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

parent

The node that corresponds to the value of the [parent] property.

type-name

All Element Nodes constructed from an infoset have the type xdt:untyped.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by both DTD and XML Schema processing are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property.

Implementations may ignore namespace information items for namespaces which do not appear in the expanded QName of the element name or the names of any of its attribute information items. This can arise when QNames are used in content.

nilled

All Element Nodes constructed from an infoset have a nilled property of "false".

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the element has no such descendants, "".

typed-value

Returns the string-value as an xdt:untypedAtomic.

G.3 Attribute Nodes Information Items

The attribute information items are required. An Attribute Node is constructed for each attribute information item.

The following infoset properties are required: [namespace name], [local name], [normalized value], [attribute type], and [owner element].

Attribute Node properties are derived from the infoset as follows:

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

string-value

The [normalized value] property.

parent

The Element Node that corresponds to the value of the [owner element] property.

type-name
  • If the [attribute type] property has one of the following values: ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, or NMTOKENS, an xs:QName with the [attribute type] as the local name and "http://www.w3.org/2001/XMLSchema" as the namespace name.

  • Otherwise, xdt:untypedAtomic.
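
A minimal sketch of this type-name rule follows, assuming the [attribute type] value arrives as a string (or `None` when absent). The function name and the pair representation are hypothetical, and the string `"xdt"` is only a placeholder for whatever namespace name is bound to the xdt prefix.

```python
from typing import Optional, Tuple

XS_NAMESPACE = "http://www.w3.org/2001/XMLSchema"
XDT_PLACEHOLDER = "xdt"  # placeholder for the namespace bound to the xdt prefix

# DTD attribute types that map directly to XML Schema type names.
_MAPPED_TYPES = {"ID", "IDREF", "IDREFS", "ENTITY", "ENTITIES",
                 "NMTOKEN", "NMTOKENS"}


def infoset_attribute_type_name(attribute_type: Optional[str]) -> Tuple[str, str]:
    """Return a (namespace name, local name) pair for the type-name property."""
    if attribute_type in _MAPPED_TYPES:
        return (XS_NAMESPACE, attribute_type)
    return (XDT_PLACEHOLDER, "untypedAtomic")
```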

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xdt:untypedAtomic: its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise: its typed-value is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

G.4 Namespace Nodes Information Items

The namespace information items are required.

The following infoset properties are required: [prefix], [namespace name].

Namespace Node properties are derived from the infoset as follows:

prefix

The [prefix] property.

uri

The [namespace name] property.

parent

The element in whose in-scope namespaces property the namespace information item appears, if the implementation exposes any mechanism for accessing the dm:parent accessor of Namespace Nodes.

G.5 Processing Instruction Nodes Information Items

A Processing Instruction Node is constructed for each processing instruction information item that is not ignored.

The following infoset properties are required: [target], [content], [base URI], and [parent].

Processing Instruction Node properties are derived from the infoset as follows:

target

The value of the [target] property.

The target must be an NCName. It is an error if the [target] property in the Infoset does not conform to an NCName; the processor may recover from this error by ignoring the entire processing instruction. It must not create a Processing Instruction Node for such a processing instruction.
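
The recovery behaviour described above might be sketched as follows. This is an approximation under stated assumptions: the regular expression accepts only an ASCII subset of NCName (the real production admits many more characters), and `make_pi_node` and its tuple result are hypothetical names for this example.

```python
import re
from typing import Optional, Tuple

# ASCII-only approximation of NCName: no colon, must not start with a digit,
# hyphen, or period. The full NCName production is considerably wider.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def make_pi_node(target: str, content: str) -> Optional[Tuple[str, str, str]]:
    """Build a Processing Instruction Node, or return None to ignore the item."""
    if _NCNAME_ASCII.fullmatch(target) is None:
        # Recover from the error by ignoring the entire processing instruction.
        return None
    return ("processing-instruction", target, content)
```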

content

The value of the [content] property.

base-uri

The value of the [base URI] property.

parent

The node corresponding to the value of the [parent] property.

There are no Processing Instruction Nodes for processing instructions that are children of a document type declaration information item.

G.6 Comment Nodes Information Items

The comment information items are optional.

A Comment Node is constructed for each comment information item that is not ignored.

The following infoset properties are required: [content] and [parent].

Comment Node properties are derived from the infoset as follows:

content

The value of the [content] property.

parent

The node corresponding to the value of the [parent] property.

There are no Comment Nodes for comments that are children of a document type declaration information item.

G.7 Text Nodes Information Items

The character information items are required. A Text Node is constructed for each maximal sequence of character information items.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content white space].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.
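
One way to picture maximal sequences is as runs of adjacent character items among the children of a single parent. The sketch below is a simplification: it represents each information item as a hypothetical `(kind, value)` pair and passes non-character items through unchanged rather than constructing real nodes.

```python
from itertools import groupby
from typing import List, Tuple

Item = Tuple[str, str]  # hypothetical (kind, value) pair


def build_children(items: List[Item]) -> List[Item]:
    """Merge each maximal run of adjacent character items into one text item.

    All items are assumed to share the same parent and to be in document order.
    """
    children: List[Item] = []
    for is_char, run in groupby(items, key=lambda item: item[0] == "character"):
        if is_char:
            # groupby yields the longest possible run, so the run is maximal.
            children.append(("text", "".join(value for _, value in run)))
        else:
            children.extend(run)
    return children
```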

Text Node properties are derived from the infoset as follows:

content

A string comprised of characters that correspond to the [character code] properties of each of the character information items.

If the resulting Text Node consists entirely of white space and the Text Node occurs in element content [XML], the content of the Text Node is the empty string.
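
A minimal sketch of the content rule, assuming the [character code] values are available as integers and that the caller already knows whether the text occurs in element content. The function and parameter names are hypothetical.

```python
from typing import Iterable

XML_WHITE_SPACE = " \t\r\n"  # space, tab, carriage return, line feed


def text_node_content(char_codes: Iterable[int], in_element_content: bool) -> str:
    """Build Text Node content from the [character code] properties."""
    content = "".join(chr(code) for code in char_codes)
    if in_element_content and content.strip(XML_WHITE_SPACE) == "":
        # White-space-only text in element content becomes the empty string.
        return ""
    return content
```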

The content of the Text Node is not necessarily normalized as described in the [Character Model]. It is the responsibility of data producers to provide appropriately normalized text, and the responsibility of applications to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.

H PSVI Construction Summary (Non-Normative)

This section summarizes data model construction from a PSVI for each kind of information item. General notes occur elsewhere.

H.1 Document Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

H.2 Element Nodes Information Items

The following Element Node properties are affected by PSVI properties.

type-name
children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

For elements with schema simple types, or complex types with simple content, if the [schema normalized value] PSVI property exists, the processor may use a sequence of nodes containing the Processing Instruction and Comment Nodes corresponding to the processing instruction and comment information items found in the [children] property, plus an optional single Text Node whose string value is the [schema normalized value] for the children property. If the [schema normalized value] is the empty string, the Text Node must not be present, otherwise it must be present.

The relative order of Processing Instruction and Comment Nodes must be preserved, but the position of the Text Node, if it is present, among them is implementation defined.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

nilled

The nilled value, as calculated in 3.3.2 Mapping xsi:nil on Element Nodes.

string-value

The string-value is calculated as follows:

  • If the element is empty: its string value is the empty string, "".

  • If the element has a type of xdt:untyped, a complex type with element-only content, or a complex type with mixed content: its string-value is the concatenation of the string-values of all its Text Node descendants in document order.

  • If the element has a simple type or a complex type with simple content: its string-value is the [schema normalized value] of the node.
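
The three cases can be sketched as one function. The `content_kind` labels ("untyped", "element-only", "mixed", "simple") and the parameter names are assumptions made for this example; "simple" covers both simple types and complex types with simple content.

```python
from typing import Iterable, Optional

# Content kinds whose string-value is built from Text Node descendants.
_CONCATENATING_KINDS = {"untyped", "element-only", "mixed"}


def element_string_value(content_kind: str,
                         text_descendants: Iterable[str] = (),
                         schema_normalized_value: Optional[str] = None,
                         is_empty: bool = False) -> str:
    """Compute an element's string-value under PSVI construction."""
    if is_empty:
        return ""
    if content_kind in _CONCATENATING_KINDS:
        # String-values of Text Node descendants, already in document order.
        return "".join(text_descendants)
    if content_kind == "simple" and schema_normalized_value is not None:
        return schema_normalized_value
    raise ValueError("unsupported combination of inputs for this sketch")
```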

typed-value

The typed-value is calculated as follows:

  • If the nilled property is true, its typed-value is ().

  • If the element is of type xdt:untyped, its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • If the element is empty: its typed-value is the empty sequence, ().

  • If the element has a simple type or a complex type with simple content: its typed-value is computed as described in 3.3.1.2 Atomic Value Type Names. The result is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

  • If the element has a complex type with mixed content, its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise, the element must be a complex type with element-only content. The typed-value of such an element is undefined. Attempting to access this property with the dm:typed-value accessor always raises an error.
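
Because the cases are tested in order, the calculation reads naturally as a chain of conditions. The sketch below is hedged accordingly: `UntypedAtomic`, the `content_kind` labels, and the `atomic_values` parameter (which stands in for the result of schema validation, not computed here) are all hypothetical.

```python
from typing import Sequence, Tuple


class UntypedAtomic(str):
    """Hypothetical stand-in for an xdt:untypedAtomic value."""


def element_typed_value(*, nilled: bool, content_kind: str, string_value: str,
                        is_empty: bool = False,
                        atomic_values: Sequence[object] = ()) -> Tuple[object, ...]:
    """Compute an element's typed-value; an empty tuple stands in for ()."""
    if nilled:
        return ()
    if content_kind == "untyped":
        return (UntypedAtomic(string_value),)
    if is_empty:
        return ()
    if content_kind == "simple":
        # Atomic values are assumed to come from schema validation.
        return tuple(atomic_values)
    if content_kind == "mixed":
        return (UntypedAtomic(string_value),)
    # Element-only content: the typed-value is undefined.
    raise TypeError("typed-value is undefined for element-only content")
```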

All other properties have values that are consistent with construction from an infoset.

H.3 Attribute Nodes Information Items

The following Attribute Node properties are affected by PSVI properties.

string-value
  • The [schema normalized value] PSVI property if that exists.

  • Otherwise, the [normalized value] property.

type-name
  • If the [validity] property does not exist on this node or any of its ancestors, Infoset processing applies.

    Note that this processing is only performed if neither the node nor any of its ancestors was schema validated. In particular, Infoset-only processing does not apply to subtrees that are "skip" validated in a document.

  • If the [validity] property exists and is "valid", type is assigned as described in 3.3.1 Mapping PSVI Additions to Type Names.

  • Otherwise, xdt:untypedAtomic.
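
The decision above might be sketched as follows, assuming the caller supplies the [validity] value (or `None` if absent), whether any ancestor carries a [validity] property, and the two candidate type names. All parameter names are hypothetical, and type names are shown as plain strings for brevity.

```python
from typing import Optional


def psvi_attribute_type_name(validity: Optional[str],
                             ancestor_has_validity: bool,
                             schema_type_name: str,
                             infoset_type_name: str) -> str:
    """Choose the type-name of an Attribute Node constructed from a PSVI."""
    if validity is None and not ancestor_has_validity:
        # Neither the node nor any ancestor was schema validated.
        return infoset_type_name
    if validity == "valid":
        return schema_type_name
    return "xdt:untypedAtomic"
```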

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xdt:untypedAtomic: its typed-value is its dm:string-value as an xdt:untypedAtomic.

  • Otherwise: its typed-value is a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation. The type of each atomic value is assigned as described in 3.3.1.2 Atomic Value Type Names.

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance", (xsi:schemaLocation, xsi:type, etc.) appear as ordinary attributes in the data model.

H.4 Namespace Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

H.5 Processing Instruction Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

H.6 Comment Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

H.7 Text Nodes Information Items

For Text Nodes constructed from the [schema normalized value] of elements, content contains the value of the [schema normalized value].

Otherwise, construction from a PSVI is the same as construction from the Infoset. In the PSVI, element content occurs where the {content type} of the element containing the text is not “mixed”.

I Infoset Mapping Summary (Non-Normative)

This section summarizes the infoset mapping for each kind of node. General notes occur elsewhere.

I.1 Document Nodes Information Items

A Document Node maps to a document information item. The mapping fails and produces no value if the Document Node contains Text Node children that do not consist entirely of white space or if the Document Node contains more than one Element Node child.
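The failure condition above can be sketched as a simple precondition check. This Python is a non-normative illustration: the (kind, content) pair representation of children and the function name are assumptions made for the example, and white space is taken to mean the four XML white space characters (space, tab, carriage return, line feed).

```python
from typing import Iterable, Tuple

XML_WHITESPACE = " \t\r\n"


def document_node_can_map(children: Iterable[Tuple[str, str]]) -> bool:
    """Return False when the Document Node mapping would fail.

    Each child is a hypothetical (kind, content) pair, where kind is one of
    "element", "text", "comment", or "processing-instruction".
    """
    element_count = 0
    for kind, content in children:
        if kind == "text" and any(ch not in XML_WHITESPACE for ch in content):
            return False  # Text Node child with non-white-space characters
        if kind == "element":
            element_count += 1
            if element_count > 1:
                return False  # more than one Element Node child
    return True
```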

The following properties are specified by this mapping:

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[document element]

The element information item that is among the [children].

[unparsed entities]

An unordered set of unparsed entity information items constructed from the unparsed-entities.

Each unparsed entity maps to an unparsed entity information item. The following properties are specified by this mapping:

[name]

The name of the entity.

[system identifier]

The system identifier of the entity.

[public identifier]

The public identifier of the entity.

[declaration base URI]

The base URI of the entity in which the declaration occurred.

The following properties have no value: [notation name], [notation].

The following properties have no value: [notations], [character encoding scheme], [version], [all declarations processed].

I.2 Element Nodes Information Items

An Element Node maps to an element information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name, if it is known; otherwise, not known.

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[attributes]

A list of information items obtained by processing each of the dm:attributes and mapping each to the appropriate information item(s).

[in-scope namespaces]

An unordered set of namespace information items constructed from the in-scope-namespaces.

Each in-scope namespace maps to a namespace information item. The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The URI associated with the namespace.

[base URI]

The value of dm:base-uri.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following property has no value: [namespace attributes].
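The three-way rule for [parent] depends on the order in which the cases are tested, which the following non-normative Python sketch tries to make explicit. The Node class, the UNKNOWN marker, the use of None for "no value", and the item(...) placeholder string for the corresponding information item are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

UNKNOWN = "unknown"  # hypothetical marker for the infoset value "unknown"


@dataclass(eq=False)  # eq=False: nodes are compared by identity
class Node:
    name: str
    parent: Optional["Node"] = None


def parent_property(node: Node, mapping_root: Node) -> Optional[str]:
    """Compute the [parent] property for the information item of `node`.

    mapping_root is the node at which the infoset mapping operation started.
    Returns None to model "no value".
    """
    if node is mapping_root:
        return UNKNOWN  # checked first, even if the node has a parent
    if node.parent is not None:
        # Placeholder for the information item corresponding to dm:parent.
        return f"item({node.parent.name})"
    return None
```

Because the root test comes first in this sketch, starting the mapping at a node deep inside a tree yields "unknown" for that node's [parent], even though dm:parent would return a node.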

I.3 Attribute Nodes Information Items

An Attribute Node maps to an attribute information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name, if it is known; otherwise, not known.

[normalized value]

The value of dm:string-value.

[owner element]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following properties have no value: [specified], [attribute type], [references].
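As a rough illustration of the attribute mapping, the following Python builds a dictionary keyed by infoset property name. It is a sketch, not a normative algorithm: the QName tuple, the dictionary representation, the "not known" string, and the use of None for "no value" are assumptions, and [owner element] is left out for brevity.

```python
from typing import Dict, NamedTuple, Optional


class QName(NamedTuple):
    """Hypothetical stand-in for the value of dm:node-name."""
    namespace_name: Optional[str]
    local_name: str
    prefix: Optional[str]  # None models a prefix that is not known


def attribute_to_info_item(node_name: QName,
                           string_value: str) -> Dict[str, Optional[str]]:
    """Map an Attribute Node's accessors to attribute information item properties."""
    return {
        "namespace name": node_name.namespace_name,
        "local name": node_name.local_name,
        "prefix": node_name.prefix if node_name.prefix is not None else "not known",
        "normalized value": string_value,
        # Properties that this mapping leaves without a value.
        "specified": None,
        "attribute type": None,
        "references": None,
    }
```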

I.4 Namespace Nodes Information Items

A Namespace Node maps to a namespace information item.

The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The value of dm:string-value.

I.5 Processing Instruction Nodes Information Items

A Processing Instruction Node maps to a processing instruction information item.

The following properties are specified by this mapping:

[target]

The local part of the value of dm:node-name.

[content]

The value of dm:string-value.

[base URI]

The value of dm:base-uri.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

The following property has no value: [notation].

I.6 Comment Nodes Information Items

A Comment Node maps to a comment information item.

The following properties are specified by this mapping:

[content]

The value of dm:string-value.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

I.7 Text Nodes Information Items

A Text Node maps to a sequence of character information items.

Each character of the dm:string-value of the node is converted into a character information item as specified by this mapping:

[character code]

The Unicode code point value of the character.

[parent]
  • If this node is the root of the infoset mapping operation, unknown.

  • If this node has a parent, the information item that corresponds to the node returned by dm:parent.

  • Otherwise no value.

[element content whitespace]

Unknown.

This sequence of characters constitutes the infoset mapping.
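The per-character conversion can be sketched in a few lines of Python, whose strings are sequences of Unicode code points, so ord() yields the [character code] directly. This is a non-normative illustration: the dictionary representation and the function name are assumptions, and the parent argument is expected to have been computed already according to the [parent] rule above.

```python
from typing import Dict, List, Optional, Union


def text_node_to_character_items(
        string_value: str,
        parent: Optional[str]) -> List[Dict[str, Union[int, str, None]]]:
    """Convert a Text Node's dm:string-value into character information items."""
    return [
        {
            "character code": ord(ch),  # Unicode code point of the character
            "parent": parent,
            "element content whitespace": "unknown",
        }
        for ch in string_value
    ]
```

Note that a character outside the Basic Multilingual Plane produces a single item in this sketch, since the mapping is defined per character rather than per UTF-16 code unit.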