Copyright © 2011, 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document records all known errors in the Efficient XML Interchange (EXI) Format 1.0 (hereinafter, "the specification" or "the spec").
If you find errors in the specification that are not listed in this document, please report them to public-exi-comments@w3.org. Archives of the mailing list are available in the public archive.
1. Substantive Errata
2. Editorial Errata
3. Clarifications
A. Errata Changes
Below is a paragraph excerpted from section 7 Representing Event Content.
Schemas can provide one or more enumerated values for datatypes. When the Preserve.lexicalValues option is false, EXI exploits those pre-defined values when they are available to represent values of such datatypes in a more efficient manner than would have done otherwise without using pre-defined values. The encoding rule for representing enumerated values is described in 7.2 Enumerations. Datatypes that are derived from another by union and their subtypes are always represented as String regardless of the availability of enumerated values. Representation of values of which the datatype is one of QName, Notation or a datatype derived therefrom by restriction are also not affected by enumerated values if any.
Make the above paragraph the one shown below. The modified part is highlighted in color for distinction purposes only.
Schemas can provide one or more enumerated values for datatypes. When the Preserve.lexicalValues option is false, EXI exploits those pre-defined values when they are available to represent values of such datatypes in a more efficient manner than would have done otherwise without using pre-defined values. The encoding rule for representing enumerated values is described in 7.2 Enumerations. Datatypes that are derived from another by union and their subtypes are always represented as String regardless of the availability of enumerated values. Representation of values of which the datatype is either a list datatypeXS2, or one of QName, Notation or a datatype derived therefrom by restriction are also not affected by enumerated values if any.
Below is a paragraph excerpted from section 7.2 Enumerations.
Exceptions are for schema types derived from others by union and their subtypes, QName or Notation and types derived therefrom by restriction. The values of such types are processed by their respective built-in EXI datatype representations instead of being represented as enumerations.
Make the above paragraph the one shown below. The modified part is highlighted in color for distinction purposes only.
Exceptions are for schema union datatypesXS2, list datatypesXS2, as well as QName or Notation and types derived therefrom by restriction. The values of such types are processed by their respective built-in EXI datatype representations instead of being represented as enumerations.
Change the semantics section that currently reads as follows
All productions in the built-in element grammar of the form LeftHandSide: AT (*) RightHandSide are evaluated as follows:
- Let qname be the qname of the attribute matched by AT (*)
- Create a production of the form LeftHandSide : AT (qname) RightHandSide with an event code 0 and increment the first part of the event code of each production in the current grammar with the non-terminal LeftHandSide on the left-hand side. Add this production to the grammar.
- If qname is xsi:type, let target-type be the value of the xsi:type attribute and assign it the QName datatype representation (see 7.1.7 QName). If there is no namespace in scope for the specified qname prefix, set the uri of target-type to empty ("") and the localName to the full lexical value of the QName, including the prefix. Encode target-type according to section 7. Representing Event Content. If a grammar can be found for the target-type type using the encoded target-type representation, evaluate the element contents using the grammar for target-type type instead of RightHandSide.
to
All productions in the built-in element grammar of the form LeftHandSide: AT (*) RightHandSide are evaluated as follows:
- Let qname be the qname of the attribute matched by AT (*)
- If qname is not xsi:type or if a production of the form LeftHandSide : AT(xsi:type) with an event code of length 1 does not exist in the current element grammar, create a production of the form LeftHandSide : AT (qname) RightHandSide with an event code 0 and increment the first part of the event code of each production in the current grammar with the non-terminal LeftHandSide on the left-hand side. Add this production to the grammar.
- If qname is xsi:type, let target-type be the value of the xsi:type attribute and assign it the QName datatype representation (see 7.1.7 QName). If there is no namespace in scope for the specified qname prefix, set the uri of target-type to empty ("") and the localName to the full lexical value of the QName, including the prefix. Encode target-type according to section 7. Representing Event Content. If a grammar can be found for the target-type type using the encoded target-type representation, evaluate the element contents using the grammar for target-type type instead of RightHandSide.
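As a non-normative illustration, the revised production-learning step can be sketched as follows. The data model is an assumption made for this sketch only: productions for a single LeftHandSide are held as (terminal, event code) pairs, event codes are tuples of integers, qnames are plain strings, right-hand sides are omitted, and the function name `learn_attribute` is hypothetical.

```python
XSI_TYPE = "xsi:type"  # simplified qname; a real processor compares uri and local name


def learn_attribute(productions, qname):
    """Return the productions after AT(*) matched an attribute named qname.

    productions: list of (terminal, event_code) pairs sharing one LeftHandSide,
    where event_code is a tuple of ints such as (0,) or (0, 1).
    """
    has_xsi_type = any(
        terminal == "AT(xsi:type)" and len(code) == 1
        for terminal, code in productions
    )
    # The erratum: AT(xsi:type) is added at most once.
    if qname == XSI_TYPE and has_xsi_type:
        return list(productions)
    # Increment the first part of every existing event code ...
    bumped = [(terminal, (code[0] + 1,) + code[1:]) for terminal, code in productions]
    # ... and add AT(qname) with event code 0.
    return [(f"AT({qname})", (0,))] + bumped


grammar = [("EE", (0, 0)), ("AT(*)", (0, 1))]
grammar = learn_attribute(grammar, "xsi:type")
grammar = learn_attribute(grammar, "xsi:type")  # no second AT(xsi:type) is created
```

Before this erratum, the second call would have added a duplicate AT(xsi:type) production and shifted every event code again.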
Change the fourth paragraph in Section 8.5.4.1.3 Type Grammars from
Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars describe the processes for creating Type i and TypeEmpty i from XML Schema simple type definitionsXS1 and complex type definitionsXS1 defined in schemas as well as built-in primitive typesXS2, built-in derived typesXS2 and simple ur-typeXS2 defined by XML Schema specification [XML Schema Datatypes]. Section 8.5.4.1.3.3 Complex Ur-Type Grammar defines the grammar used for processing instances of element contents of type xsd:anyTypeXS1.
to
Sections 8.5.4.1.3.1 Simple Type Grammars and 8.5.4.1.3.2 Complex Type Grammars describe the processes for creating Type i and TypeEmpty i from XML Schema simple type definitionsXS1 and complex type definitionsXS1 defined in schemas as well as built-in primitive typesXS2, built-in derived typesXS2, simple ur-typeXS2 and complex ur-typeXS1 defined by XML Schema specification [XML Schema Datatypes].
Change the grammar that reads as follows
G n−1, 0 : EE
to the following form
G n−1, 0 : EE
G n−1, 1 : EE
and add the following rule just before the first note in the section:
If there is neither an attribute use nor an {attribute wildcard}, G 0 of the following form is used as an attribute use grammar.
G 0, 0 : EE
Given that the EXI specification is already clear in Section 8.5.4.1.3 Type Grammars about how grammars are built, Section 8.5.4.1.3.3 Complex Ur-Type Grammar and references to it are entirely removed.
Add the following paragraph below the Namespaces in XML reference:
Namespaces in XML 1.1
Namespaces in XML 1.1 (Second Edition), T. Bray, D. Hollander, A. Layman, and R. Tobin, Editors. World Wide Web Consortium, 4 February 2004, revised 16 August 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available at http://www.w3.org/TR/xml-names11/.
Append the following text as 2nd paragraph right after Table 7-2.
The restricted character set for a value that would be represented as an EXI enumeration is the restricted character set of the EXI datatype representation of the enumeration base type.
Below are two paragraphs excerpted from section 7.1.2 Boolean.
In the absence of pattern facets in the schema datatype, the Boolean datatype representation is a n-bit unsigned integer (7.1.9 n-bit Unsigned Integer), where n is one (1). The value zero (0) represents false and the value one (1) represents true.
Otherwise, when pattern facets are available in the schema datatype, the Boolean datatype representation is a n-bit unsigned integer (7.1.9 n-bit Unsigned Integer), where n is two (2) and the values zero (0), one (1), two (2) and three (3) represent the values "false", "0", "true" and "1" respectively.
Change the excerpted text to the one shown below.
When the associated schema datatype is derived from xsd:boolean and pattern facets are available in the schema datatype, the Boolean datatype representation is a n-bit unsigned integer (7.1.9 n-bit Unsigned Integer), where n is two (2) and the values zero (0), one (1), two (2) and three (3) represent the values "false", "0", "true" and "1" respectively.
Otherwise, the Boolean datatype representation is a n-bit unsigned integer (7.1.9 n-bit Unsigned Integer), where n is one (1). The value zero (0) represents false and the value one (1) represents true.
The primary change is in the order of the two paragraphs. In the revised text, the special case is described first, followed by the default case. A clause clarifying the condition is added, highlighted in color above for distinction.
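The revised rule can be illustrated with a short, non-normative sketch. The function name `encode_boolean` and the flag `boolean_with_patterns` are assumptions made for this example; the flag stands for "the schema datatype is derived from xsd:boolean and has pattern facets".

```python
def encode_boolean(lexical: str, boolean_with_patterns: bool) -> tuple[int, int]:
    """Return (value, n) for the n-bit Unsigned Integer that represents lexical."""
    if boolean_with_patterns:
        # Special case: 2 bits, so that each lexical form stays distinguishable.
        table = {"false": 0, "0": 1, "true": 2, "1": 3}
        return table[lexical], 2
    # Default case: 1 bit, zero is false and one is true.
    if lexical in ("false", "0"):
        return 0, 1
    if lexical in ("true", "1"):
        return 1, 1
    raise ValueError(f"not an xsd:boolean lexical form: {lexical!r}")
```

Under these assumptions, `encode_boolean("1", True)` gives `(3, 2)`, while `encode_boolean("1", False)` gives `(1, 1)`.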
Below is a paragraph excerpted from section 7.4 Datatype Representation Map.
EXI processors that support Datatype Representation Maps MAY provide implementation specific means to define and install user-defined datatype representations. EXI processors MAY also provide implementation specific means for applications or users to specify alternate built-in EXI datatype representations or user-defined datatype representations for representing specific schema datatypes. As with the default EXI datatype representations, alternate datatype representations are used for the associated XML Schema types specified in the Datatype Representation Map and XML Schema datatypes derived from those datatypes. When there are built-in or user-defined datatype representations associated with more than one XML Schema datatype in the type hierarchy of a particular datatype, the closest ancestor with an associated datatype representation is used to determine the EXI datatype representation.
Make the above paragraph the one shown below by appending a text. The appended part is highlighted in color for distinction purposes only.
EXI processors that support Datatype Representation Maps MAY provide implementation specific means to define and install user-defined datatype representations. EXI processors MAY also provide implementation specific means for applications or users to specify alternate built-in EXI datatype representations or user-defined datatype representations for representing specific schema datatypes. As with the default EXI datatype representations, alternate datatype representations are used for the associated XML Schema types specified in the Datatype Representation Map and XML Schema datatypes derived from those datatypes. When there are built-in or user-defined datatype representations associated with more than one XML Schema datatype in the type hierarchy of a particular datatype, the closest ancestor with an associated datatype representation is used to determine the EXI datatype representation. For XML Schema datatypes with enumerated values, the encoding rule described in 7.2 Enumerations is used as the representation when the closest ancestor datatype with an associated datatype representation has no enumerated values.
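One possible reading of the appended sentence is sketched below; it is not normative. The data model is assumed for illustration: `base_of` maps each type to its base type, `dtrm` maps types to their associated representations, `enumerated` is the set of types that have enumerated values, and the search for the closest ancestor is assumed to start at the type itself. All names are hypothetical.

```python
def choose_representation(type_name, base_of, dtrm, enumerated):
    """Pick a representation label for type_name under a Datatype Representation Map."""
    current = type_name
    while current is not None:
        if current in dtrm:
            # Closest ancestor with an associated datatype representation.
            if type_name in enumerated and current not in enumerated:
                return "enumeration"  # rule of 7.2 Enumerations applies
            return dtrm[current]
        current = base_of.get(current)
    # No map entry applies: fall back to the usual rules (simplified here).
    return "enumeration" if type_name in enumerated else "default"


# Hypothetical schema: "color" restricts xsd:string with enumerated values.
base_of = {"color": "xsd:string"}
dtrm = {"xsd:string": "my:compactString"}
choice = choose_representation("color", base_of, dtrm, enumerated={"color"})
```

In this example `choice` is `"enumeration"`, because the closest ancestor with an associated representation, xsd:string, has no enumerated values of its own.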
Below is a paragraph excerpted from section 8.5.4.1.5 Particles.
Otherwise, if {max occurs} is unbounded, generate one additional copy of Term 0, G {min occurs} and replace all productions of the form:
G {min occurs}, k : EE
with productions of the form:
G {min occurs}, k : G {min occurs}, 0
indicating this term may be repeated indefinitely. Then if there is no production of the form:
G {min occurs}, 0 : EE
add one after the other productions with the non-terminal G {min occurs}, 0 on the left-hand side, indicating this term may be omitted from the content model. Then, create the grammar for Particle i using the grammar concatenation operator defined in section 8.5.4.1.1 Grammar Concatenation Operator as follows:
Particle i = G 0 ⊕ G 1 ⊕ … ⊕ G {min occurs}
Make the above text the one shown below. The modified part is highlighted in color for distinction purposes only.
Otherwise, if {max occurs} is unbounded, generate one additional copy of Term 0, G {min occurs} and replace all productions of the form:
G {min occurs}, k : EE
with productions of the form:
G {min occurs}, k : G {min occurs}, 0
indicating this term may be repeated indefinitely. Then, when there is no more production of the form:
G {min occurs}, 0 : EE
add one after the other productions with the non-terminal G {min occurs}, 0 on the left-hand side, indicating this term may be omitted from the content model. Then, create the grammar for Particle i using the grammar concatenation operator defined in section 8.5.4.1.1 Grammar Concatenation Operator as follows:
Particle i = G 0 ⊕ G 1 ⊕ … ⊕ G {min occurs}
Append the following text as 4th paragraph after Table 4-1:
The namespace of elements and attributes is specified as part of SE and AT events and hence namespace declarations can be omitted from the EXI stream if preservation of prefixes is not required by the applications. As prescribed by Table B-2 and Table B-11, [namespace attributes] representing namespace declarations are mapped to NS events and SHOULD NOT be represented by AT events. This also implies that the following AT events SHOULD NOT occur in EXI streams: (1) AT events with qname whose uri is "http://www.w3.org/2000/xmlns/"; (2) AT events with qname which has empty uri ("") and local name either of the form "xmlns" or "xmlns:*", where "*" represent string with 0 or more characters.
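The two conditions listed above can be checked with a small predicate. This is a non-normative sketch; the function name `is_namespace_declaration_at` is an assumption made for this example.

```python
XMLNS_URI = "http://www.w3.org/2000/xmlns/"


def is_namespace_declaration_at(uri: str, local_name: str) -> bool:
    """Return True for AT events that SHOULD NOT occur in EXI streams."""
    # (1) qname whose uri is the xmlns namespace
    if uri == XMLNS_URI:
        return True
    # (2) empty uri with local name "xmlns" or "xmlns:*"
    return uri == "" and (local_name == "xmlns" or local_name.startswith("xmlns:"))
```

An encoder could use such a check to emit an NS event, or to omit the declaration when prefixes are not preserved, instead of emitting an AT event.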
Below is the first paragraph and Table 7-3 excerpted from section 7.1.8 Date-Time:
The Date-Time datatype representation is a sequence of values representing the individual components of the Date-Time. The following table specifies each of the possible date-time components along with how they are encoded.
Component | Value | Type |
---|---|---|
Year | Offset from 2000 | Integer ( 7.1.5 Integer) |
MonthDay | Month * 32 + Day | 9-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where day is a value in the range 1-31 and month is a value in the range 1-12. |
Time | ((Hour * 64) + Minutes) * 64 + seconds | 17-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) |
FractionalSecs | Fractional seconds | Unsigned Integer ( 7.1.6 Unsigned Integer) representing the fractional part of the seconds with digits in reverse order to preserve leading zeros |
TimeZone | TZHours * 64 + TZMinutes | 11-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) representing a signed integer offset by 896 ( = 14 * 64 ) |
presence | Boolean presence indicator | Boolean (7.1.2 Boolean) |
Change the content of the paragraph and the table to the one shown below by appending the highlighted text:
The Date-Time datatype representation is a sequence of values representing the individual components of the Date-Time. The following table specifies each of the possible date-time components along with how they are encoded. The value ranges of the date-time components follow the definitions of the XML Schema specification [XML Schema Datatypes] which for example prescribes the value range of the seconds to be between 0 and 60 to account for leap second representation and hour between 0 and 24 among others.
Component | Value | Type |
---|---|---|
Year | Offset from 2000 | Integer ( 7.1.5 Integer) |
MonthDay | Month * 32 + Day | 9-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where day is a value in the range 1-31 and month is a value in the range 1-12. |
Time | ((Hour * 64) + Minutes) * 64 + seconds | 17-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) where Hour is a value in the range 0-24, Minutes is a value in the range 0-59 and seconds is a value in the range 0-60 |
FractionalSecs | Fractional seconds | Unsigned Integer ( 7.1.6 Unsigned Integer) representing the fractional part of the seconds with digits in reverse order to preserve leading zeros |
TimeZone | TZHours * 64 + TZMinutes | 11-bit Unsigned Integer (7.1.9 n-bit Unsigned Integer) representing a signed integer offset by 896 ( = 14 * 64 ) where TZHours is a value in the range [-14 .. 14] and TZMinutes is a value in the range [-59 .. 59] |
presence | Boolean presence indicator | Boolean (7.1.2 Boolean) |
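The arithmetic in the table can be illustrated with the following non-normative sketch. The function names are assumptions made for this example, and the final 11-bit bounds check in `encode_time_zone` is an extra sanity check added here rather than text from the specification.

```python
def encode_year(year: int) -> int:
    return year - 2000  # offset from 2000, encoded as Integer


def encode_month_day(month: int, day: int) -> int:
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("month must be 1-12 and day 1-31")
    return month * 32 + day  # 9-bit Unsigned Integer


def encode_time(hour: int, minutes: int, seconds: int) -> int:
    if not (0 <= hour <= 24 and 0 <= minutes <= 59 and 0 <= seconds <= 60):
        raise ValueError("hour must be 0-24, minutes 0-59, seconds 0-60")
    return ((hour * 64) + minutes) * 64 + seconds  # 17-bit Unsigned Integer


def encode_time_zone(tz_hours: int, tz_minutes: int) -> int:
    if not (-14 <= tz_hours <= 14 and -59 <= tz_minutes <= 59):
        raise ValueError("TZHours must be -14..14 and TZMinutes -59..59")
    encoded = tz_hours * 64 + tz_minutes + 896  # signed value offset by 14 * 64
    if not 0 <= encoded < 2 ** 11:
        raise ValueError("time zone does not fit in 11 bits")
    return encoded
```

For example, a leap second at 23:59:60 gives `encode_time(23, 59, 60) == 98044`, and a time zone of -05:30 gives `encode_time_zone(-5, -30) == 546`.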
Below is the second paragraph of section 7.1.5 Integer:
If the associated schema datatype is derived from xsd:integer and the bounded range determined by its minInclusiveXS2, minExclusiveXS2, maxInclusiveXS2 and maxExclusiveXS2 facets has 4096 or fewer values, the value is represented as an n-bit Unsigned Integer where n is ⌈ log2 m ⌉ and m is the bounded range of the schema datatype.
Change the paragraph to the one shown below by appending the highlighted text:
If the associated schema datatype is derived from xsd:integer and the bounded range determined by its minInclusiveXS2, minExclusiveXS2, maxInclusiveXS2 and maxExclusiveXS2 facets has 4096 or fewer values, the value is represented as an n-bit Unsigned Integer offset from the minimum value in the range where n is ⌈ log2 m ⌉ and m is the bounded range of the schema datatype.
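The clarified encoding can be illustrated as follows. This is a non-normative sketch; the function names are assumptions, and the bounds are taken as inclusive, that is, any minExclusive or maxExclusive facet is assumed to have been converted to its inclusive equivalent beforehand.

```python
def bounded_integer_bits(min_value: int, max_value: int) -> int:
    """Return n = ceil(log2(m)) where m is the number of values in the range."""
    m = max_value - min_value + 1
    if not 1 <= m <= 4096:
        raise ValueError("the bounded range must have between 1 and 4096 values")
    # (m - 1).bit_length() equals ceil(log2(m)) for every m >= 1.
    return (m - 1).bit_length()


def encode_bounded_integer(value: int, min_value: int, max_value: int) -> str:
    """Return the n-bit Unsigned Integer, as a bit string, for value."""
    if not min_value <= value <= max_value:
        raise ValueError("value is outside the bounded range")
    n = bounded_integer_bits(min_value, max_value)
    offset = value - min_value  # offset from the minimum value in the range
    return format(offset, "b").zfill(n) if n else ""
```

With a range of 10 to 17 (m = 8, n = 3), the value 12 is encoded as the offset 2, so `encode_bounded_integer(12, 10, 17)` returns `"010"`.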
Remove the last sentence from Section "8.5.3 Schema-informed Element Fragment Grammar" that reads:
The content index of grammars ElementFragment and ElementFragmentTypeEmpty are both 1 (one).
Remove the third paragraph from Section "8.5.4.1.3 Type Grammars" that reads:
[Definition:] For each type grammar Type i , an unique index number content is determined such that all non-terminal symbols of indices smaller than content have at least one AT terminal symbol and the rest of the non-terminal symbols in Type i do not have AT terminal symbols on their right-hand side, where indices are assigned to non-terminal symbols in ascending order with the entry non-terminal symbol of Type i being assigned index 0 (zero). There is also a content index associated with each TypeEmpty i where its value is determined in the same manner as for Type i .
Remove the last sentence from Section "8.5.4.1.3.1 Simple Type Grammars" that reads:
The content index of grammar Type_i and TypeEmpty_i created from an XML Schema simple type definition is always 0 (zero).
An excerpt from Section "8.5.4.1.3.2 Complex Type Grammars" is given below:
The grammar TypeEmpty i is created by combining the sequence of attribute use grammars terminated by an empty {content type} grammar as follows:
TypeEmpty i = G 0 ⊕ G 1 ⊕ … ⊕ G n−1 ⊕ Content i
where the grammar Content i is created as follows:
Content i, 0 : EE
The content index of grammar TypeEmpty i is the index of its last non-terminal symbol.
Remove the last sentence from this excerpt that reads:
The content index of grammar TypeEmpty i is the index of its last non-terminal symbol.
Also remove the last sentence from Section "8.5.4.1.3.2 Complex Type Grammars" that reads:
The content index of grammar Type i created from an XML Schema complex type definition is the index of the first non-terminal symbol of Content i within the context of Type i .
Remove the paragraph from Section "8.5.4.2.2 Eliminating Duplicate Terminal Symbols" that reads:
When G i is a type grammar, if both k and l are smaller than content index of G i , k ⊔ l is also considered to be smaller than content for the purpose of index comparison purposes. Otherwise, if either k or l is not smaller than content, k ⊔ l is considered to be larger than content.
Insert the following text as a second paragraph in Section "8.5.4.4.1 Adding Productions when Strict is False" right after the first sentence that reads:
This section describes the process for augmenting the normalized grammars when the value of the strict option is false.
insert the following paragraph:
[Definition:] For each normalized element grammar Element_i , an unique index number content is determined such that: for each set of grammar productions with left-hand side non-terminal symbol of index smaller than content there is at least one production with AT terminal symbol and the rest of the productions in Element_i with left-hand side non-terminal symbols of indices equal or greater than content do not have AT terminal symbols. The left-hand side non-terminal symbols indices are assigned in ascending order with the entry non-terminal symbol of Element_i being assigned index 0 (zero). If there are no productions in Element_i that have AT terminal symbols on their right-hand side, the content index is 0.
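As a non-normative illustration of this definition, the content index can be computed as the index of the first non-terminal that has no AT production. The grammar model is an assumption made for this sketch: a list indexed by non-terminal, where each entry lists the terminal symbols of that non-terminal's productions as strings. The function name is hypothetical.

```python
def content_index(grammar: list[list[str]]) -> int:
    """Return the content index of a normalized element grammar."""
    for index, terminals in enumerate(grammar):
        # The first non-terminal without any AT terminal marks the content index.
        if not any(terminal.startswith("AT") for terminal in terminals):
            return index
    return len(grammar)


element_grammar = [
    ["AT(a)", "AT(b)"],  # index 0: has AT productions
    ["AT(b)", "SE(x)"],  # index 1: has an AT production
    ["SE(x)", "EE"],     # index 2: first non-terminal without AT
    ["EE"],
]
content = content_index(element_grammar)
```

Here `content` is 2, and a grammar with no AT terminal symbols at all yields 0, matching the last sentence of the definition.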
Modify the second sentence from Section "8.5.4.4.1 Adding Productions when Strict is False" that reads:
For each normalized element grammar Element_i , create a copy Element_i,content2 of Element_i,content where the index "content" is the content of the type of the element from which Element_i was created.
change it to:
For each normalized element grammar Element_i , create a copy Element_i,content2 of Element_i,content where the index "content" is the content of the Element_i grammar.
Clarified the definition and intended use of content index. (see 19 August 2013 (1), 19 August 2013 (2), 19 August 2013 (3), 19 August 2013 (4), 19 August 2013 (5), 19 August 2013 (6))
Added: 19 August 2013
Clarified the offset of Integer datatype for bounded range schema types. (see 27 June 2013)
Added: 27 June 2013
Added a reference to the Namespaces in XML 1.1 specification. (see 26 June 2013)
Added: 26 June 2013
Clarified the valid value range of the time components in Date-Time datatype. (see 13 June 2013)
Added: 13 June 2013
Clarified that the namespace declarations are mapped to NS events and should not be represented by AT events. (see 06 May 2013)
Added: 06 May 2013
Fixed discrepancy between complex type grammars and ur-type grammar. (see 29 March 2013 (1), 29 March 2013 (2), and 29 March 2013 (3))
Added: 29 March 2013
Clarified that enumerated values do not affect how values of list datatypes are encoded. (see 19 September 2012 (1), 19 September 2012 (2))
Added: 19 September 2012
Clarified that AT(xsi:type) productions are added to a grammar at most once. (see AT(xsi:type) handling in Built-in Element Grammar)
Added: 08 May 2012
Improved the wording describing when to add an extra production representing EE when { max_occurs } is unbounded. (see Unbounded { max_occurs } of Particles)
Added: 03 April 2012
Clarified when patterns, if any, are relevant in Boolean datatype representation. (see Patterns in Boolean)
Added: 22 February 2012
Clarified how values with enumerated values are represented when a DTRM is in effect. (see Enumerated Values with DTRM)
Added: 05 October 2011
Added a clarification regarding the restricted character set used for a value that would be represented as an EXI enumeration. (see Restricted Character Set of Enumeration)
Added: 30 May 2011