XSLT 2.0 and XQuery 1.0 Serialization Last Call Issues

Last Call Comments 2005-02-11

Editors:
Joanne Tong
Scott Boag

Serialization Last Call issues

This document identifies the status of Last Call issues on XSLT 2.0 and XQuery 1.0 Serialization as of February 11, 2005.

The serialization draft has been defined jointly by the XML Query Working Group and the XSL Working Group (both part of the XML Activity).

The February 11, 2005 working draft includes a number of changes made in response to comments received during the Last Call period that ended on Feb. 15, 2004. The working group is continuing to process these comments, and additional changes are expected.

Public comments on this document and its open issues are invited. Comments should be sent to the W3C mailing list public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with “[Serial]” at the beginning of the subject field.

Most issues are classified as either “substantive”, meaning the editor believes technical changes to the document are required to address them, or “editorial”, meaning that the issue is one of specification clarity rather than technical correctness.

An issue transitions through several states. Issue tracking begins when an issue is “raised”. After discussion, the Working Group may have “decided” how to resolve the issue. This decision is “announced” and hopefully “acknowledged” by external commenters. For the most part, once an issue is decided, it is considered closed.


There are 125 issues.

17 raised (9 substantive), 0 proposed, 3 decided, 41 announced and 59 acknowledged.

Id Title Type State
+qt-2003Nov0050-01 WD-xslt-xquery-serialization-20030502 omit-xml-declaration substantive acknowledged
+qt-2003Nov0305-01 [XQuery] SAG-XQ-005 Serializing Arbitrary Sequences substantive acknowledged
+qt-2004Jan0019-04 [XSLT2.0] Specify normalization form for serialization substantive acknowledged
+qt-2004Jan0029-01 [Serial] omit-xml-declaration substantive acknowledged
+qt-2004Feb0049-01 [Serialization] IBM-SE-001: Documentization substantive acknowledged
+qt-2004Feb0053-01 [Serialization] IBM-SE-004: XML Output Method substantive acknowledged
+qt-2004Feb0055-01 [Serialization] IBM-SE-006: Schema used in round-tripping substantive acknowledged
+qt-2004Feb0057-01 [Serialization] IBM-SE-008: Serializing namespace nodes substantive acknowledged
+qt-2004Feb0058-01 [Serialization] IBM-SE-009: Discarding of type annotations substantive acknowledged
+qt-2004Feb0059-01 [Serialization] IBM-SE-010: Namespace nodes after round-trip substantive acknowledged
+qt-2004Feb0060-01 [Serialization] IBM-SE-011: Character expansion substantive acknowledged
+qt-2004Feb0061-01 [Serialization] IBM-SE-012: Version parameter substantive acknowledged
+qt-2004Feb0062-01 [Serialization] IBM-SE-013: XML v1.1 vs. Namespaces v1.1 substantive acknowledged
+qt-2004Feb0064-01 [Serialization] IBM-SE-014: Serializing the "nilled" property substantive acknowledged
+qt-2004Feb0146-01 [Serial] canonicalization substantive announced
+qt-2004Feb0188-01 Serialization (sometimes) needs to include type information substantive announced
+qt-2004Feb0261-01 [Serialization] SCHEMA-A substantive acknowledged
+qt-2004Feb0262-01 [Serialization] SCHEMA-B substantive acknowledged
+qt-2004Feb0263-01 [Serialization] SCHEMA-C substantive acknowledged
+qt-2004Feb0264-01 [Serialization] SCHEMA-D substantive announced
+qt-2004Feb0265-01 [Serialization] SCHEMA-E substantive acknowledged
+qt-2004Feb0266-01 [Serialization] SCHEMA-F substantive acknowledged
+qt-2004Feb0267-01 [Serialization] SCHEMA-G substantive raised
+qt-2004Feb0268-01 [Serialization] SCHEMA-H substantive acknowledged
+qt-2004Feb0269-01 [Serialization] SCHEMA-I substantive acknowledged
+qt-2004Feb0271-01 [Serialization] SCHEMA-K substantive acknowledged
+qt-2004Feb0272-01 [Serialization] SCHEMA-L substantive acknowledged
+qt-2004Feb0362-01 [Serial] I18N WG last call comments [4] substantive objected
+qt-2004Feb0362-02 [Serial] I18N WG last call comments [5] substantive objected
+qt-2004Feb0362-03 [Serial] I18N WG last call comments [6] substantive announced
+qt-2004Feb0362-04 [Serial] I18N WG last call comments [7] substantive announced
+qt-2004Feb0362-05 [Serial] I18N WG last call comments [8] substantive announced
+qt-2004Feb0362-06 [Serial] I18N WG last call comments [9] substantive announced
+qt-2004Feb0362-07 [Serial] I18N WG last call comments [first comment 12] substantive announced
+qt-2004Feb0362-08 [Serial] I18N WG last call comments [11] substantive announced
+qt-2004Feb0362-09 [Serial] I18N WG last call comments [Second comment 12] substantive announced
+qt-2004Feb0362-10 [Serial] I18N WG last call comments [13] substantive announced
+qt-2004Feb0362-11 [Serial] I18N WG last call comments [14] substantive announced
+qt-2004Feb0362-12 [Serial] I18N WG last call comments [15] substantive announced
+qt-2004Feb0362-13 [Serial] I18N WG last call comments [16] substantive decided
+qt-2004Feb0362-14 [Serial] I18N WG last call comments [17] substantive acknowledged
+qt-2004Feb0362-15 [Serial] I18N WG last call comments [18] substantive acknowledged
+qt-2004Feb0362-16 [Serial] I18N WG last call comments [19] substantive announced
+qt-2004Feb0362-17 [Serial] I18N WG last call comments [20] substantive announced
+qt-2004Feb0362-19 [Serial] I18N WG last call comments [22] substantive acknowledged
+qt-2004Feb0362-20 [Serial] I18N WG last call comments [23] substantive announced
+qt-2004Feb0362-21 [Serial] I18N WG last call comments [24] substantive announced
+qt-2004Feb0362-22 [Serial] I18N WG last call comments [25] substantive announced
+qt-2004Feb0362-24 [Serial] I18N WG last call comments [31] substantive announced
+qt-2004Feb0362-25 [Serial] I18N WG last call comments [32] substantive announced
+qt-2004Feb0918-01 ORA-SE-341-B: serialization of XQuery DataModel instance is inadequate substantive acknowledged
+qt-2004Feb0919-01 ORA-SE-292-B: Processing of empty sequence is roundabout and confusing substantive raised
+qt-2004Feb0921-01 ORA-SE-300-B: Implementation-defined output methods need not normalize substantive acknowledged
+qt-2004Feb0922-01 ORA-SE-302-B: Phase 1, "Markup generation", is poorly specified substantive acknowledged
+qt-2004Feb0923-01 ORA-SE-304-Q: possible parameter for how to handle elements with no children substantive acknowledged
+qt-2004Feb0924-01 ORA-SE-308-C: What circumstances are meant by "in all other circumstances"? substantive acknowledged
+qt-2004Feb0926-01 ORA-SE-312-B: Missing exception for additional whitespace added by indent parameter substantive acknowledged
+qt-2004Feb0927-01 ORA-SE-315-Q: How can character expansion create new nodes? substantive acknowledged
+qt-2004Feb0928-01 ORA-SE-326-B: XML declaration is mandatory if the version is not 1.0 substantive acknowledged
+qt-2004Feb0929-01 ORA-SE-320-B: What does it mean to say two data models (sic) are the same? substantive decided
+qt-2004Feb0930-01 ORA-SE-301-B: Indent parameter should not apply to (potentially) mixed-mode elements substantive acknowledged
+qt-2004Feb0932-01 ORA-SE-309-B: Poorly worded constraints on the output substantive acknowledged
+qt-2004Feb0936-01 ORA-SE-317-B: document-uri property cannot be serialized substantive decided
+qt-2004Feb0976-01 [Serial] IBM-SE-100: Default parameter values should account for specifics for particular output methods substantive acknowledged
+qt-2004Feb0977-01 [Serial] IBM-SE-101: Default HTML version substantive acknowledged
+qt-2004Feb0980-01 [Serial] IBM-SE-103: Treatment of whitespace in XHTML attributes substantive acknowledged
+qt-2004Feb0996-01 FW: XSLT 2.0: XML Output Method: the omit-xml-declaration Parameter substantive announced
+qt-2004Feb1040-01 ORA-SE-305-E: Phase 2 should mention generation of character references substantive acknowledged
+qt-2004Feb1042-01 ORA-SE-298-E: Please clarify that all parameters are optional substantive acknowledged
+qt-2004Feb1195-01 [Serialization] MS-SER-LC1-001 substantive acknowledged
+qt-2004Feb1197-01 [Serialization] MS-SER-LC1-002 substantive acknowledged
+qt-2004Feb1198-01 [Serialization] MS-SER-LC1-005 substantive acknowledged
+qt-2004Feb1204-01 [Serialization] MS-SER-LC1-009 substantive acknowledged
+qt-2004Feb1205-01 [Serialization] MS-SER-LC1-012 substantive acknowledged
+qt-2004May0006-01 [Serial] additional last call comment about xml:lang substantive announced
+qt-2004Sep0022-01 [Serial] XHTML indentation substantive acknowledged
+qt-2004Nov0025-01 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0025-02 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0025-03 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0025-04 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0025-07 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0025-09 [Serial] XHTML Serialization substantive raised
+qt-2004Nov0074-01 [Serial] > in processing instructions substantive raised
+qt-2004Nov0075-01 [Serial] 2 Sequence Normalization substantive raised
+qt-2004Feb0050-01 [Serialization] IBM-SE-002: Bugs in example editorial announced
+qt-2004Feb0052-01 [Serialization] IBM-SE-003: Undeclare-namespaces parameter editorial announced
+qt-2004Feb0054-01 [Serialization] IBM-SE-005: Definition of serialized output editorial announced
+qt-2004Feb0056-01 [Serialization] IBM-SE-007: Definition of round-tripping editorial announced
+qt-2004Feb0270-01 [Serialization] SCHEMA-J editorial announced
+qt-2004Feb0273-01 [Serialization] SCHEMA-M editorial announced
+qt-2004Feb0274-01 [Serialization] SCHEMA-N editorial raised
+qt-2004Feb0275-01 [Serialization] SCHEMA-O editorial announced
+qt-2004Feb0276-01 [Serialization] SCHEMA-P editorial announced
+qt-2004Feb0278-01 [Serialization] SCHEMA-Q editorial announced
+qt-2004Feb0362-18 [Serial] I18N WG last call comments [21] editorial announced
+qt-2004Feb0362-23 [Serial] I18N WG last call comments [26-30,33-34] editorial announced
+qt-2004Feb0920-01 ORA-SE-327-B: Surely namespace declaration is part of serializing XML version 1.0 editorial acknowledged
+qt-2004Feb0931-01 ORA-SE-306-C: Confusing definition of the "version" parameter editorial announced
+qt-2004Feb0933-01 ORA-SE-310-E: difficult sentence to parse editorial acknowledged
+qt-2004Feb0934-01 ORA-SE-303-B: undeclare-namespaces parameter is relevant to markup generation editorial acknowledged
+qt-2004Feb0935-01 ORA-SE-311-C: What is the "processor"? editorial announced
+qt-2004Feb0937-01 ORA-SE-314-B: Additional namespace nodes may be present if serialization does not undeclare namespaces editorial announced
+qt-2004Feb0978-01 [Serial] IBM-SE-102: Serialization editorial comments editorial announced
+qt-2004Feb1037-01 ORA-SE-293-E: Redundant phrase that can be deleted editorial acknowledged
+qt-2004Feb1038-01 ORA-SE-307-E: "An xml output method" is better than "the xml output method" editorial announced
+qt-2004Feb1039-01 ORA-SE-328-E: no mention of the standalone property editorial acknowledged
+qt-2004Feb1041-01 ORA-SE-296-E: Please define "serialization error" editorial announced
+qt-2004Feb1043-01 ORA-SE-299-E: misplaced comma editorial acknowledged
+qt-2004Feb1044-01 ORA-SE-297-E: Alphabetization problem editorial acknowledged
+qt-2004Feb1045-01 ORA-SE-295-E: The Note overflow the right margin when printed editorial acknowledged
+qt-2004Feb1046-01 ORA-SE-291-E: Term "empty string" is a poor choice of words editorial raised
+qt-2004Feb1047-01 ORA-SE-290-E: Title misuses the term "data models" editorial acknowledged
+qt-2004Feb1196-01 [Serialization] MS-SER-LC1-003 editorial acknowledged
+qt-2004Feb1199-01 [Serialization] MS-SER-LC1-004 editorial acknowledged
+qt-2004Feb1200-01 [Serialization] MS-SER-LC1-007 editorial announced
+qt-2004Feb1201-01 [Serialization] MS-SER-LC1-008 editorial announced
+qt-2004Feb1202-01 [Serialization] MS-SER-LC1-006 editorial raised
+qt-2004Feb1203-01 [Serialization] MS-SER-LC1-010 editorial acknowledged
+qt-2004Feb1206-01 [Serialization] MS-SER-LC1-011 editorial announced
+qt-2004Nov0025-05 [Serial] XHTML Serialization editorial raised
+qt-2004Nov0025-06 [Serial] XHTML Serialization editorial raised
+qt-2004Nov0025-08 [Serial] XHTML Serialization editorial raised
+qt-2004Nov0037-01 [Serial] Normalization and References editorial raised
+qt-2004Nov0037-02 [Serial] Normalization and References editorial raised
+qt-2004Dec0001-01 [Serial] serialization of xhtml + omit-xml-declaration editorial raised
qt-2003Nov0050-01: WD-xslt-xquery-serialization-20030502 omit-xml-declaration
[substantive, acknowledged] 2004-09-23





According to
http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318

   The omit-xml-declaration parameter should be ignored if the standalone
   parameter is present, or if the encoding parameter specifies a value
   other than UTF-8 or UTF-16.

There is one other case where it would be very useful to omit the
declaration (or at least to use a value of utf-8) namely
iso-646 (aka ASCII aka US-ASCII).

It may be politically incorrect to say that ascii characters are still
more interoperable than non-ascii characters, but in practice this is
still the case. Especially in XML which specifies that a charset
specified in the mime headers takes precedence it is hard to give (say) a
utf8 file to someone to serve from their website without first finding
out what http server they use, and how to make sure it won't serve the
thing as latin 1 resulting in a non-well formed document.
(See current discussion on W3C'S TAG list about this).

One style of producing XML files that avoids these problems is to
produce files that don't have an xml declaration (or have one that
specifies utf-8) but to encode all non-ascii characters as numeric
character references.

Currently in an XSLT 1 usage in production I use
<xsl:output encoding="US-ASCII"/>
with saxon and post process with sed to remove the US-ASCII
encoding declaration (which stops the file being parsed on several XML
systems I have locally) I think that it would be very desirable if

<xsl:output encoding="iso-646" omit-xml-declaration="yes"/>

was defined to work, and produce files of the form described above.

Failing that it would be good if it would be allowed by the
specification if the system understood that encoding.

David






    
draft minutes for day 4, liam@w3.org (2004-01-22)

     Thank you for raising this issue.  The XSL and XQuery working groups
discussed the issue and decided not to require processors to support the
US-ASCII encoding and its aliases.  The working groups decided that the
appropriate way of addressing your comment would be to replace the second
paragraph of Section 4.5 of the last call working draft of XSLT 2.0 and
XQuery 1.0 Serialization [1], which currently reads:

<<
The omit-xml-declaration parameter must be ignored if the standalone
parameter is present, or if the encoding parameter specifies a value other
than UTF-8 or UTF-16.
>>

with the following:

<<
The omit-xml-declaration parameter must be ignored if the standalone
parameter is present, or if the encoding parameter specifies a value that
is not UTF-8, UTF-16 or a subset of either of those encodings.  An
encoding S is a subset of another encoding E if the set of codepoints that
can be encoded in S is a subset of those that can be encoded in E, and the
encodings of those codepoints in S are the same as the encodings of those
same codepoints in encoding E.
>>

     That resolution seems to be in accord with the last sentence of your
comment.  Please let us know whether you consider this resolution
acceptable.
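
[Illustration, not part of the archived thread] The "subset" relation in the proposed replacement text can be checked mechanically. The Python sketch below is only one reading of that definition: the function name is_subset_encoding is hypothetical, codec names are Python's rather than IANA's, and characters are compared one at a time, which ignores stateful details such as byte order marks.

```python
def is_subset_encoding(s: str, e: str, max_codepoint: int = 0x10FFFF) -> bool:
    """Return True if every code point encodable in codec `s` is also
    encodable in codec `e`, and encodes to the same bytes in both."""
    for cp in range(max_codepoint + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(cp)
        try:
            in_s = ch.encode(s)
        except UnicodeEncodeError:
            continue  # not encodable in S, so the definition imposes nothing
        try:
            in_e = ch.encode(e)
        except UnicodeEncodeError:
            return False  # encodable in S but not in E
        if in_s != in_e:
            return False  # encodable in both, but the bytes differ
    return True


if __name__ == "__main__":
    for s, e in [("ascii", "utf-8"), ("latin-1", "utf-8"), ("ascii", "utf-16-le")]:
        print(s, "subset of", e, "->", is_subset_encoding(s, e))
```

Under this reading US-ASCII qualifies as a subset of UTF-8, while ISO-8859-1 does not, because its upper half encodes to different bytes.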


Thanks, looks good to me.



David


David,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
According to 
http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318


   The omit-xml-declaration parameter should be ignored if the standalone
   parameter is present, or if the encoding parameter specifies a value
   other than UTF-8 or UTF-16.

There is one other case where it would be very useful to omit the
declaration (or at least to use a value of utf-8) namely
iso-646 (aka ASCII aka US-ASCII).

It may be politically incorrect to say that ascii characters are still
more interoperable than non-ascii characters, but in practice this is
still the case. Especially in XML which specifies that a charset
specified in the mime headers takes precedence it is hard to give (say) a
utf8 file to someone to serve from their website without first finding
out what http server they use, and how to make sure it won't serve the
thing as latin 1 resulting in a non-well formed document.
(See current discussion on W3C'S TAG list about this).

One style of producing XML files that avoids these problems is to
produce files that don't have an xml declaration (or have one that
specifies utf-8) but to encode all non-ascii characters as numeric
character references.

Currently in an XSLT 1 usage in production I use
<xsl:output encoding="US-ASCII"/>
with saxon and post process with sed to remove the US-ASCII
encoding declaration (which stops the file being parsed on several XML
systems I have locally) I think that it would be very desirable if

<xsl:output encoding="iso-646" omit-xml-declaration="yes"/>

was defined to work, and produce files of the form described above.

Failing that it would be good if it would be allowed by the
specification if the system understood that encoding.
>>

     The XSL and XML Query Working Groups discussed your comment, and 
initially responded in [2], indicating that Serialization would respect 
the setting of the omit-xml-declaration whenever the encoding was UTF-8, 
UTF-16 or some "subset" encoding of those two encodings.

     However, subsequent to making that decision, the working groups 
decided that the setting of the omit-xml-declaration parameter should be 
respected always, regardless of the setting of the encoding parameter. The 
23 July working draft of Serialization [3] reflects that decision.

     Thank you once again for your comment.  May I ask you to confirm that 
the revised response is acceptable to you?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1235.html
[3] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com


>      Thank you once again for your comment.  May I ask you to confirm that 
> the revised response is acceptable to you?

Yes this is perfectly acceptable, thank you.

David
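
[Illustration, not part of the archived thread] The output style David asks for (no XML declaration, every non-ASCII character written as a numeric character reference) can be approximated outside an XSLT processor. This is a minimal Python sketch, assuming the document has already been serialized to a string and that non-ASCII characters occur only in text and attribute values, where character references are legal; the helper name to_ascii_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml(serialized: str) -> bytes:
    """Encode an already-serialized XML string as pure ASCII bytes,
    replacing each non-ASCII character with a decimal character reference."""
    return serialized.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    doc = "<p>caf\u00e9 costs 3\u20ac</p>"
    data = to_ascii_xml(doc)
    print(data)
    # The bytes are identical whether read as US-ASCII, UTF-8 or Latin-1,
    # so a parser recovers the original text without any declaration.
    print(ET.fromstring(data).text == "caf\u00e9 costs 3\u20ac")
```

Because the result contains only ASCII bytes, a server that mislabels the file as Latin-1 or UTF-8 does not change how it parses, which is the interoperability point made in the comment.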

qt-2003Nov0305-01: [XQuery] SAG-XQ-005 Serializing Arbitrary Sequences
[substantive, acknowledged] 2004-03-29

SAG-XQ-005 Serializing Arbitrary Sequences

We don't think that the mechanism for serializing an arbitrary sequence
described in section 2 of the Serialization specification meets any known
user requirement.

We do think that there is a requirement for an interoperable serialization
format for an arbitrary sequence, and that this should be defined. We think
that the requirement is for a format that wraps each item in the sequence in
an XML wrapper providing information about the type, the value, and in the
case of nodes, the name of the node. For example, an attribute node might be
serialized as

<res:attribute name="my:att" type="my:shoesize" value="7.3"/>

In the case of elements and documents, the tree rooted at that node would be
serialized. The format would be extensible to allow implementation-defined
attributes that represent the identity of nodes, allowing the information to
be used for a subsequent update, or for creating hyperlinks.

(Note, technically we are talking here about a representation of an
arbitrary sequence in the form of a document. Serializing that document is
entirely orthogonal).

Michael Kay
for Software AG



    
draft minutes for day 4, Liam Quin (2004-01-22)



Walter,

     In [1] Michael Kay submitted the following comment from Software AG.

Michael Kay wrote on 2003-11-27 06:56:34 AM:
> SAG-XQ-005 Serializing Arbitrary Sequences
> We don't think that the mechanism for serializing an arbitrary
> sequence described in section 2 of the Serialization specification
> meets any known user requirement.
> We do think that there is a requirement for an interoperable
> serialization format for an arbitrary sequence, and that this should
> be defined. We think that the requirement is for a format that wraps
> each item in the sequence in an XML wrapper providing information
> about the type, the value, and in the case of nodes, the name of the
> node. For example, an attribute node might be serialized as
> <res:attribute name="my:att" type="my:shoesize" value="7.3"/>
> In the case of elements and documents, the tree rooted at that node
> would be serialized. The format would be extensible to allow
> implementation-defined attributes that represent the identity of
> nodes, allowing the information to be used for a subsequent update,
> or for creating hyperlinks.
> (Note, technically we are talking here about a representation of an
> arbitrary sequence in the form of a document. Serializing that
> document is entirely orthogonal).

     Thanks to Michael and Software AG for raising the comment.

     The XSL and XQuery working groups considered this comment and related
comments.  There was general agreement that there is some need for a
mechanism for serializing arbitrary sequences that preserves most or all
of the properties of the items in an arbitrary sequence that is being
serialized.

     However, the working groups decided that precisely defining all of
the requirements for such a mechanism at this stage would be difficult,
and would likely lead to a solution that would not satisfy real user
requirements.  Therefore, the working groups decided to consider such a
feature for a future revision of the recommendations, and close this
comment without any changes to the specifications.

     May I ask you to confirm that this resolution is acceptable?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0305.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

AW: [XQuery] SAG-XQ-005 Serializing Arbitrary Sequences, Walter.Waterfeld@softwareag.com (2004-03-29)



Hello Henry,
although I do not understand all the arguments, I can live with that decision;
especially as we have a high interest in a finished XQuery recommendation
as early as possible.

Best regards
Walter

qt-2004Jan0019-04: [XSLT2.0] Specify normalization form for serialization
[substantive, acknowledged] 2004-07-13

SUGGESTION 4:
20 Serialization normalize-unicode?  attribute of the xsl:output element
Problem:
Not clear which normalization forms to use. "NFC"?
Why we should have "NFC" only, if we can support others as in
fn:normalize-unicode?

Solution:
normalize-unicode? = string
The attribute should follow the rules of the second
argument($normalizationForm   )
of the fn:normalize-unicode
(http://www.w3.org/TR/xquery-operators/#func-normalize-unicode)


Igor Hersht
XSLT Development
IBM Canada Ltd., 8200 Warden Avenue, Markham, Ontario L6G 1C7
Office D2-260, Phone (905)413-3240 ; FAX  (905)413-4839



draft minutes for day 4, Liam Quin (2004-01-22)
Re: [XSLT2.0], Henry Zongaro (2004-03-28)



Igor,

     In [1], you submitted the following comment on the XSLT 2.0 and
Serialization last call drafts.

Igor Hersht wrote on 2004-01-11 05:01:13 PM:
> SUGGESTION 4:
> 20 Serialization normalize-unicode?  attribute of the xsl:output element
> Problem:
> Not clear which normalization forms to use. "NFC"?
> Why we should have "NFC" only, if we can support others as in
> fn:normalize-unicode?
>
> Solution:
> normalize-unicode? = string
> The attribute should follow the rules of the second
> argument($normalizationForm   )
> of the fn:normalize-unicode
> (http://www.w3.org/TR/xquery-operators/#func-normalize-unicode)

     Thank you for your comment.

     The XSL and XQuery Working Groups discussed your comment, and agreed
that the serialization parameter for Unicode normalization should be
aligned with the fn:normalize-unicode function and permit additional
normalization forms to be specified.  The working groups decided to make
the following changes to the definition of the normalize-unicode
serialization parameter:

1. Rename the parameter to "normalization-form".

2. The possible values of the parameter will be "NFC", "NFD", "NFKC",
"NFKD", "fully-normalized", "none" or an implementation-defined
normalization form.  The default value is "none".  We will also add
a note advising of the interoperability problems that can arise by
using anything other than NFC.

3. All of "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" and
any implementation-defined value are permitted for the xml, xhtml and
text output methods.  The values "NFC", "fully-normalized" and "none"
must be supported by an implementation for these output methods.

4. The normalization-form parameter is permitted to have the values
"NFC", "NFD", "NFKC", "NFKD", "none" or an implementation-defined value
if the output method is "html".  The values "NFC" and "none" must be
supported for the html output method.  The value "fully-normalized" is
not permitted if the output method is "html".

5. In the case of "fully-normalized", the normalization is the same as for
NFC, but the processor must signal a serialization error if any of the
"relevant constructs" of the result would begin with a combining
character.
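
[Illustration, not part of the archived thread] The parameter values listed in points 2 to 5 above can be sketched with Python's standard unicodedata module. This is a rough approximation under stated assumptions: the names apply_normalization_form and SerializationError are hypothetical, the whole input string stands in for a single "relevant construct", and "combining character" is approximated as a non-zero canonical combining class.

```python
import unicodedata

# The four Unicode forms that unicodedata.normalize accepts directly.
UNICODE_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


class SerializationError(Exception):
    """Stand-in for the serialization error mentioned in point 5."""


def apply_normalization_form(text: str, form: str = "none") -> str:
    """Apply a normalization-form value to one piece of output text."""
    if form == "none":  # the default: leave the text untouched
        return text
    if form in UNICODE_FORMS:
        return unicodedata.normalize(form, text)
    if form == "fully-normalized":
        result = unicodedata.normalize("NFC", text)
        # Approximation: reject a construct that starts with a combining mark.
        if result and unicodedata.combining(result[0]) != 0:
            raise SerializationError("construct begins with a combining character")
        return result
    raise ValueError(f"unsupported normalization form: {form!r}")


if __name__ == "__main__":
    decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
    print(len(apply_normalization_form(decomposed, "none")))
    print(len(apply_normalization_form(decomposed, "NFC")))
```

A real serializer would apply the check to each relevant construct separately and would also enforce the per-output-method restrictions in points 3 and 4, which this sketch leaves out.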



     The XSL Working Group will also make the corresponding changes to the
xsl:output element in XSLT 2.0, replacing the normalize-unicode attribute
with a normalization-form attribute.

     May I ask you to confirm that this response is acceptable?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0019.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Re: [XSLT2.0], Henry Zongaro (2004-07-13)

Hello,

     In [1], I asked Igor Hersht to confirm that he found the response to 
the issue labelled "SUGGESTION 4" in [2] acceptable.  I'm responding on 
Igor's behalf to indicate that the response is acceptable to IBM.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Mar/0276.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0019.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Jan0029-01: [Serial] omit-xml-declaration
[substantive, acknowledged] 2004-09-01
[Serial] omit-xml-declaration, David Carlisle (2004-01-13)





Jonathan said in another thread:

> If you do want to enter this as a last call comment, could you please start
> a new thread that clearly says that?

I made the following comment on the previous draft serialization document:

http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html

Please take that as a last call comment on this draft (as the comment has
not been answered, and the situation is the same in this draft).

David








    
Duplicate issue (Serial qt-2004Jan0029-01), Henry Zongaro (2004-08-30)

David,

     In [1], you submitted the following comment against the Last Call 
Working Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Jonathan said in another thread:

> If you do want to enter this as a last call comment, could you please start
> a new thread that clearly says that?

I made the following comment on the previous draft serialization document:

http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html

Please take that as a last call comment on this draft (as the comment has
not been answered, and the situation is the same in this draft).
>>

     The XSL and XQuery Working Groups actually logged your comment twice, 
in both [1] and [2].  We would like to close the issue raised in [1] as a 
duplicate of the issue raised in [2].  I trust this will be acceptable. 
Our apologies for any confusion.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jan/0029.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com



> We would like to close the issue raised in [1] as a 
> duplicate of the issue raised in [2].  I trust this will be acceptable. 

Yes, That's fine:-)

David


qt-2004Feb0049-01: [Serialization] IBM-SE-001: Documentization
[substantive, acknowledged] 2004-04-05

Serialization Section 2, "Serializing Arbitrary Data Models": This comment
recapitulates a discussion from a recent (21 January) meeting of the Query
and XSLT working groups. It is suggested that the serialization document
should define two separate and independent processes, possibly called
"documentization" and "serialization".

The documentization process should be defined to convert any Query Data
Model (QDM) instance (which in general may contain zero, one, or many
documents, or documents mixed with non-document fragments) into a QDM
instance that contains exactly one document. This can be done by replacing
each top-level item in the QDM instance by a descriptive "wrapper" element
that labels it with its kind: attribute, atomic value, element, document,
etc. A new synthetic document element is then inserted as parent of all
the wrapper elements.  This documentization process (unlike the one
currently described in Section 2) should apply successfully to any QDM
instance whatsoever. Thus (for example) if the QDM instance contains
multiple documents, the boundaries between these documents are preserved.
If documentization is invoked on a QDM instance that already contains a
single document, that document is nevertheless wrapped in a descriptive
element which is placed under a new synthetic parent document node (it is
treated simply as a sequence of documents of length one).

The serialization process then needs to be defined only for QDM instances
that contain exactly one document. A serialization parameter can be
defined to control whether documentization is applied before serialization
(possibly documentization could be defined to occur by default if the
first item in the sequence to be serialized is not a node).

--Don Chamberlin
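
The wrapping step proposed above can be sketched roughly as follows. This
is only an illustration, not anything the comment or the working groups
defined: the (kind, value) pair encoding, the element names "sequence" and
"item", and the "kind" attribute are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def documentize(items):
    """Wrap each top-level item in an element that names its kind.

    `items` is a list of (kind, value) pairs. The pair encoding and the
    names "sequence", "item" and "kind" are hypothetical.
    """
    root = ET.Element("sequence")  # synthetic document element
    for kind, value in items:
        wrapper = ET.SubElement(root, "item", {"kind": kind})
        if isinstance(value, ET.Element):
            wrapper.append(value)      # element content is nested as-is
        else:
            wrapper.text = str(value)  # atomic values become text
    return root

doc = documentize([("atomic-value", 42), ("element", ET.Element("a"))])
print(ET.tostring(doc, encoding="unicode"))
```

Because every item gets its own wrapper, a sequence of several documents
would keep its boundaries, which is the property the comment asks for.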



    



Don,

     In [1] you submitted the following comment on the serialization
draft:

Don Chamberlin wrote on 2004-02-02 06:37:20 PM:
> Serialization Section 2, "Serializing Arbitrary Data Models": This
> comment recapitulates a discussion from a recent (21 January)
> meeting of the Query and XSLT working groups. It is suggested that
> the serialization document should define two separate and
> independent processes, possibly called "documentization" and
> "serialization".
>
> The documentization process should be defined to convert any Query
> Data Model (QDM) instance (which in general may contain zero, one,
> or many documents, or documents mixed with non-document fragments)
> into a QDM instance that contains exactly one document. This can be
> done by replacing each top-level item in the QDM instance by a
> descriptive "wrapper" element that labels it with its kind:
> attribute, atomic value, element, document, etc. A new synthetic
> document element is then inserted as parent of all the wrapper
> elements.  This documentization process (unlike the one currently
> described in Section 2) should apply successfully to any QDM
> instance whatsoever. Thus (for example) if the QDM instance contains
> multiple documents, the boundaries between these documents are
> preserved. If documentization is invoked on a QDM instance that
> already contains a single document, that document is nevertheless
> wrapped in a descriptive element which is placed under a new
> synthetic parent document node (it is treated simply as a sequence
> of documents of length one).
>
> The serialization process then needs to be defined only for QDM
> instances that contain exactly one document. A serialization
> parameter can be defined to control whether documentization is
> applied before serialization (possibly documentization could be
> defined to occur by default if the first item in the sequence to be
> serialized is not a node).



     Thank you for submitting this comment.

     The XSL and XQuery working groups considered your comment and related
comments.  There was general agreement that there is some need for a
mechanism for serializing arbitrary sequences that preserves most or all
of the properties of the items in an arbitrary sequence that is being
serialized.

     However, the working groups decided that precisely defining all of
the requirements for such a mechanism at this stage would be difficult,
and would likely lead to a solution that would not satisfy real user
requirements.  Therefore, the working groups decided to consider such a
feature for a future revision of the recommendations, and close this
comment without any changes to the specifications.

     May I ask you to confirm that this resolution is acceptable?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0049.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Henry,
Thanks for your response. I understand why the working group has chosen 
not to provide this feature in XQuery Version 1, and I do not intend to 
pursue this issue further at this time.
Regards,
--Don Chamberlin
qt-2004Feb0053-01: [Serialization] IBM-SE-004: XML Output Method
[substantive, acknowledged] 2004-06-07

Serialization Section 4, "XML Output Method": The first paragraph states
that serialization produces either an XML document entity or an external
general parsed entity. No indication is given about how the serialization
process chooses between these alternatives. The normalization rules in
Section 2 always reduce the data model instance to exactly one document,
so it is not clear how the second alternative is ever invoked.

Also in this section, the second paragraph adds nothing that is not
already said in the first paragraph. It should be deleted.

--Don Chamberlin



    

The serialization process doesn't choose between these two alternatives
(indeed, the set of well-formed XML document entities and the set of
well-formed EGPEs have a large overlap). Rather, this sentence is
stating a constraint. If the document node contains multiple elements or
text nodes among its children then the result cannot be a well-formed
document entity, therefore it must be a well-formed EGPE. If a
standalone attribute is requested in the serialization parameters, then
the result cannot be a well-formed EGPE, therefore it must be a
well-formed document entity. If both conditions are true, there is a
conflict, and I think there are rules later on about how this conflict
should be resolved.

Michael Kay


Hi, Don.

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

Don Chamberlin wrote on 2004-02-02 07:14:53 PM:
> Serialization Section 4, "XML Output Method": The first paragraph 
> states that serialization produces either an XML document entity or 
> an external general parsed entity. No indication is given about how 
> the serialization process chooses between these alternatives. The 
> normalization rules in Section 2 always reduce the data model 
> instance to exactly one document, so it is not clear how the second 
> alternative is ever invoked. 
> 
> Also in this section, the second paragraph adds nothing that is not 
> already said in the first paragraph. It should be deleted. 

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment.  It was 
noted that, although there is only ever one document node to process, the 
document node could have no element node children, more than one element 
node child or text node children.  If any of those conditions holds, the 
serialization process produces an external general parsed entity; 
otherwise, it produces a document entity (which might also meet the 
syntactic criteria of an external general parsed entity).
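
As a rough illustration of the rule in the preceding paragraph, the choice
can be written as a small function over the kinds of the document node's
children. The string encoding of node kinds is an assumption made for this
sketch; the Serialization draft defines no such interface.

```python
def entity_kind(children):
    """Classify the output from the children of the document node.

    `children` is a list of node-kind strings such as "element", "text",
    "comment" or "processing-instruction" (a hypothetical encoding).
    """
    element_count = children.count("element")
    has_text = "text" in children
    # No element child, more than one element child, or any text child
    # rules out a well-formed document entity.
    if element_count != 1 or has_text:
        return "external general parsed entity"
    return "document entity"

print(entity_kind(["comment", "element"]))
print(entity_kind(["element", "element"]))
```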

     In order to clarify the first and third paragraphs of section 4, the 
working groups decided to make the following changes:

- in the first sentence of the third paragraph, change "and the"
  to "then", to make clear the conditions under which a
  document entity will be the result of the serialization process.

- change the wording to make it clear that these rules describe
  requirements on the processor, rather than on the user.  The
  processor will be required to produce a serialization error if
  it is unable to produce a well-formed entity of the appropriate
  kind, unless that is because of the action of the character
  expansion phase of serialization.

     The working groups further agreed that the second paragraph of 
section 4 adds no useful information, and decided to delete it.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0053.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0055-01: [Serialization] IBM-SE-006: Schema used in round-tripping
[substantive, acknowledged] 2004-09-01

Serialization Section 4, "XML Output Method": The paragraph before the
bullet list says that "if a new tree were constructed by parsing the
[serialized] XML document and converting it into a data model as described
in [Data Model], then the new data model would be the same as the starting
data model." But this conversion process involves validation, which is
schema-dependent. We need to specify that the schema used in this
round-trip process is an effective schema consisting of the in-scope
schema definitions in the static context.

--Don Chamberlin



    

After further discussion, it appears that it is not sufficient to use the
in-scope-schema-definitions (ISSD) during round-tripping (serialization
and re-parsing) of a data model instance. Round-tripping is used in
validation, which in turn is used in every element constructor. It is
necessary for round-tripping to preserve the type annotation of an element
node, which may not be known in the ISSD. I think the schema used during
round-tripping needs to be the union of the ISSD and the schema(s) from
which the type annotations of the nodes were originally derived (called
the "data model schema" in Section 2.2.5, Consistency Constraints).



--Don Chamberlin

Serialization and round-tripping, Henry Zongaro (2004-06-12)
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Serialization Section 4, "XML Output Method": The paragraph before the
bullet list says that "if a new tree were constructed by parsing the
[serialized] XML document and converting it into a data model as described
in [Data Model], then the new data model would be the same as the starting
data model." But this conversion process involves validation, which is
schema-dependent. We need to specify that the schema used in this
round-trip process is an effective schema consisting of the in-scope
schema definitions in the static context.
>>

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment and 
decided that the "round-tripping" description in Serialization was not 
intended to be part of the definition of validation, but only to define 
the requirements on the form of a serialized instance of the data model. 
The issue will be closed without any change to Serialization.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0055.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0057-01: [Serialization] IBM-SE-008: Serializing namespace nodes
[substantive, acknowledged] 2004-04-28

Serialization Section 4, "XML Output Method": This section should specify
the rules for serializing namespace nodes in the form of namespace
declaration attributes. Does every namespace node attached to an element
node result in an xmlns-attribute in that element's start-tag? Can the
xmlns-attribute be omitted if it is present in the start-tag of a parent
element?

--Don Chamberlin



    

I don't think it's necessary to specify these rules in detail. Any
output that satisfies the round-tripping constraints is acceptable. The
philosophy of the serialization spec is to state the basic constraints
that the output must satisfy, and beyond that, to be non-prescriptive.

Michael Kay


Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

> Serialization Section 4, "XML Output Method": This section should 
> specify the rules for serializing namespace nodes in the form of 
> namespace declaration attributes. Does every namespace node attached
> to an element node result in an xmlns-attribute in that element's 
> start-tag? Can the xmlns-attribute be omitted if it is present in 
> the start-tag of a parent element?

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment, and 
concluded that any output that satisfies the round-tripping requirement 
for the XML output method can be used in serializing namespace nodes.  The 
responses to your specific questions are "No, the serialized start-tag for 
an element node does not have to have an xmlns-attribute for every 
namespace node," and "Yes, if an element node has a namespace node, the 
xmlns-attribute can be omitted if the start-tag for an ancestor element 
declares the namespace, subject to the usual constraints imposed by 
namespace undeclaration or changes in binding."  No change to the 
serialization specification is required.
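
One way a serializer might act on these answers is sketched below. It is
an illustration only: the dictionary representation of namespace nodes and
the function name are assumptions, and namespace undeclaration (mentioned
in the response) is deliberately left out.

```python
def declarations_to_emit(namespace_nodes, inherited):
    """Return the xmlns declarations needed on one element's start-tag.

    Both arguments map prefix -> namespace URI ("" stands for the default
    namespace). `inherited` holds the bindings already declared by
    ancestor start-tags. Undeclaration is not handled in this sketch.
    """
    return {
        prefix: uri
        for prefix, uri in namespace_nodes.items()
        if inherited.get(prefix) != uri  # new prefix or changed binding
    }

# The ancestor already declares "a", so only "b" needs a declaration.
needed = declarations_to_emit({"a": "urn:x", "b": "urn:y"}, {"a": "urn:x"})
print(needed)
```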

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0057.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0058-01: [Serialization] IBM-SE-009: Discarding of type annotations
[substantive, acknowledged] 2004-04-13

Serialization Section 4, "XML Output Method": Bullet 5 says that type
annotations are discarded during serialization. The note just below this
bullet says that type annotations are optionally preserved during
serialization. Which is true? If type annotations are optionally
preserved, how is this option controlled? Is it implementation-defined, or
controlled by a serialization parameter? It would be very helpful to have
a note or example to illustrate how (and when) type annotations are
serialized in the form of xsi:type attributes.

Also, the note below Bullet 5 is very awkwardly phrased. If retained, this
note should be condensed as follows: "In order to preserve type
annotations, the serialization process could use mechanisms such as
xsi:type and xsi:schemaLocation attributes."

--Don Chamberlin



    

I agree the wording of the note could be improved. The intent is to say
that type annotations are not retained through serialization, and if
this causes a problem, the user can include xsi:type or
xsi:schemaLocation attributes in the result tree, which will cause type
annotations to be reconstituted when the serialized document is
re-parsed. The serializer will never add these attributes itself. (Well,
there's no ban on an implementor adding options to do this of course,
but it's not part of serialization as specified).

Michael Kay


Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

XML Query wrote on 2004-02-02 07:15:26 PM:
> Serialization Section 4, "XML Output Method": Bullet 5 says that
> type annotations are discarded during serialization. The note just
> below this bullet says that type annotations are optionally
> preserved during serialization. Which is true? If type annotations
> are optionally preserved, how is this option controlled? Is it
> implementation-defined, or controlled by a serialization parameter?
> It would be very helpful to have a note or example to illustrate how
> (and when) type annotations are serialized in the form of xsi:type
> attributes.
> 
> Also, the note below Bullet 5 is very awkwardly phrased. If 
> retained, this note should be condensed as follows: "In order to 
> preserve type annotations, the serialization process could use 
> mechanisms such as xsi:type and xsi:schemaLocation attributes." 

     Thank you for your comment.

     The XSL and XQuery Working Groups discussed your comment and decided
that the note was intended to indicate that if the user would like type
annotations to be preserved, the user should ensure that the sequence
that is input to the serialization process uses xsi:type and
xsi:schemaLocation attributes.  The note wasn't intended to grant a
license to the serialization process to manufacture such attributes.

     In order to clarify this, the working groups decided to replace the
note in bullet 5 of section 4 with the following:

<<
Note: In order to influence the type annotations in the data model that 
would result from processing a serialized XML document, the author of the 
XSLT stylesheet, XQuery expression or other process may wish to create the 
data model that is input to the serialization process so that it makes use 
of mechanisms provided by [XML Schema], such as xsi:type and 
xsi:schemaLocation attributes.  The serialization process will not 
automatically create such attributes in the serialized document if those 
attributes were not part of the result tree that is to be serialized.
>>
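
The division of labour in the replacement note can be illustrated with a
short sketch using Python's standard ElementTree module as a stand-in
serializer. The element name "price" and the type value are made up for
the example, and a real result tree would also need an in-scope
declaration for the xs prefix, which is omitted here.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)

# The author of the stylesheet or query puts xsi:type into the result
# tree (element name and type value are hypothetical)...
price = ET.Element("price", {f"{{{XSI}}}type": "xs:decimal"})
price.text = "9.99"
# ...so the serializer only writes out an attribute that already exists.
annotated = ET.tostring(price, encoding="unicode")

# Without that attribute in the tree, nothing is added on output.
plain = ET.tostring(ET.Element("price"), encoding="unicode")

print(annotated)
print(plain)
```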

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0058.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0059-01: [Serialization] IBM-SE-010: Namespace nodes after round-trip
[substantive, acknowledged] 2004-07-13

Serialization Section 4, "XML Output Method": The statement in Bullet 6 is
backward. Additional namespace nodes may be generated if the serialization
process FAILS to undeclare namespaces. In addition, the namespace nodes
may be different after round-tripping because the process of constructing
an element node from an infoset may ignore namespaces that are not used in
element or attribute names (see Data Model Section 6.2.4, "Element Node
Construction from Infoset").

--Don Chamberlin



    

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Serialization Section 4, "XML Output Method": The statement in Bullet 6 is 
backward. Additional namespace nodes may be generated if the serialization 
process FAILS to undeclare namespaces. In addition, the namespace nodes 
may be different after round-tripping because the process of constructing 
an element node from an infoset may ignore namespaces that are not used in 
element or attribute names (see Data Model Section 6.2.4, "Element Node
Construction from Infoset").
>>

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment, and
decided to make the corrections that you had recommended, with a small
refinement: namespace nodes must not be ignored when the round-tripped
data model instance is constructed from a PSVI and the namespace prefix
was used in a value of type xs:QName.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0059.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0060-01: [Serialization] IBM-SE-011: Character expansion
[substantive, acknowledged] 2004-04-13

Serialization Section 4, "XML Output Method": Bullet 7 says that
"Additional nodes may be present in the new tree" due to character
expansion. Please explain how character expansion could result in new
nodes and provide an example.

Thanks,
--Don Chamberlin



    

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

> Serialization Section 4, "XML Output Method": Bullet 7 says that 
> "Additional nodes may be present in the new tree" due to character 
> expansion. Please explain how character expansion could result in 
> new nodes and provide an example. 

     Thank you for your comment.

     The XSL and XQuery working groups discussed your comment, and decided 
to add a note to clarify the situation.  I would like to add the following 
note to the final bullet of the bulleted list in section 4.

<<
Note:  The use-character-maps parameter can cause arbitrary characters to 
be inserted into the serialized XML document in an unescaped form, 
including characters that would be considered part of XML markup.  Such 
characters could result in arbitrary new element nodes, attribute nodes, 
and so on, in the new tree that results from processing the serialized XML 
document.
>>
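
The effect described in the note can be reproduced with a small sketch.
The helper below only imitates character mapping; it is not the
use-character-maps mechanism itself, and the choice of a private-use
character mapped to "<br/>" is an assumption made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def serialize_text(text, character_map):
    """Escape `text`, but emit replacements for mapped characters as-is."""
    return "".join(
        character_map[ch] if ch in character_map else escape(ch)
        for ch in text
    )

# Hypothetical map: a private-use character stands in for raw markup.
character_map = {"\uE000": "<br/>"}

# The original tree holds a single text node inside <p>.
body = serialize_text("one\uE000two < three", character_map)
reparsed = ET.fromstring(f"<p>{body}</p>")

# After re-parsing, <p> has gained an element child.
print([child.tag for child in reparsed])
```

The ordinary "<" is still escaped, while the mapped character turns into
markup, so the re-parsed tree contains an element node that the original
tree did not.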

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0060.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0061-01: [Serialization] IBM-SE-012: Version parameter
[substantive, acknowledged] 2004-07-13

Serialization Section 4.1, "XML Output Method: the version Parameter":
This section contains the words "If the processor does not support this
version of XML ...". This seems to imply that support for XML versions is
an optional feature. We should clearly specify the requirements in this
area. Possibly support for XML 1.0 is required and support for XML 1.1 is
an optional feature that should be included on our optional feature list?

--Don Chamberlin



    

I think the view of the XSL working group was that we should leave it to
the implementor to decide which versions of XML to support. It would be
commercial suicide for a vendor not to support XML 1.0 in a 2004
product, but by 2010 the situation may look different, and we want our
spec to be durable.

Michael Kay

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
[public-qt-comments] <none>, Henry Zongaro (2004-07-13)
Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Serialization Section 4.1, "XML Output Method: the version Parameter": 
This section contains the words "If the processor does not support this 
version of XML ...". This seems to imply that support for XML versions is 
an optional feature. We should clearly specify the requirements in this 
area. Possibly support for XML 1.0 is required and support for XML 1.1 is 
an optional feature that should be included on our optional feature list?
>>

     Thank you for this comment.

     The XSL and XML Query Working Groups discussed your comment, and 
decided that the Serialization specification should be flexible in this 
regard and not place any requirements on the versions of XML or HTML that 
must be supported, although a particular host language might impose such 
requirements.

     The Serialization draft will be modified to state, for each of the
xml and html output methods, that it is a serialization error if the value
of the version parameter specifies a version of the XML or the HTML
Recommendation that is not supported by the processor.  The Serialization
draft will not place any requirements on which versions of XML or HTML a
processor must support.

     The XQuery Working Group further decided that the XML Query language 
will require the processor to support the value 1.0 in the version 
parameter if the output method is xml. 
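
A processor-side check along these lines might look like the sketch
below. The exception class, the function name and the set of supported
versions are all hypothetical; the decision only says that an unsupported
version is a serialization error.

```python
class SerializationError(Exception):
    """Hypothetical stand-in for the serialization error in the draft."""

# Assumed for illustration: this processor supports only XML 1.0.
SUPPORTED_XML_VERSIONS = frozenset({"1.0"})

def check_version(version, supported=SUPPORTED_XML_VERSIONS):
    """Return `version` if supported, otherwise signal an error."""
    if version not in supported:
        raise SerializationError(f"unsupported XML version: {version}")
    return version

print(check_version("1.0"))
```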

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0061.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0062-01: [Serialization] IBM-SE-013: XML v1.1 vs. Namespaces v1.1
[substantive, acknowledged] 2004-10-08

Serialization Section 4.1, "XML Output Method: the version Parameter":
This section should explain what it means to serialize a data model using
XML Version 1.0 or Version 1.1. These versions are distinguished mainly by
the characters they allow in names. Does the data model need to specify
which XML version it is using? (Currently the data model does not provide
any way to do this.) What happens if serialization is using XML Version
1.0 but it encounters a name that contains a character in the Version 1.1
character set?

Also, this section should specify whether XML Version 1.1 is interpreted
to include Namespaces Version 1.1 as well. If not, should a separate
version parameter be defined for this purpose?

--Don Chamberlin



    

Hi, Don.

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Serialization Section 4.1, "XML Output Method: the version Parameter":
This section should explain what it means to serialize a data model using
XML Version 1.0 or Version 1.1. These versions are distinguished mainly by
the characters they allow in names. Does the data model need to specify
which XML version it is using? (Currently the data model does not provide
any way to do this.) What happens if serialization is using XML Version
1.0 but it encounters a name that contains a character in the Version 1.1
character set?

Also, this section should specify whether XML Version 1.1 is interpreted
to include Namespaces Version 1.1 as well. If not, should a separate
version parameter be defined for this purpose?
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and decided that all NCNames must conform to the 
version of Namespaces in XML specified by the version parameter; if 
NCNames do not conform to the appropriate version of the Namespaces 
recommendation, a serialization error results.

     Similarly, if the instance of the data model contains any characters
that are not permitted by the particular version of XML specified by the
version parameter, a serialization error results.  For instance, if the
version parameter has the value 1.0, and a text node contains one of the
non-whitespace control characters in the range #x1 to #x1F, a
serialization error results, because those characters are not permitted
in XML 1.0 documents; if the version parameter has the value 1.1, and a
comment node contains one of the control characters in the range #x7F to
#x9F, other than NEL, a serialization error results, because those
characters are only permitted to appear as character references in XML 1.1
documents.
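
The two cases in the preceding paragraph can be checked mechanically. The
sketch below follows the Char productions of XML 1.0 and XML 1.1 and the
RestrictedChar production of XML 1.1; the function names and the
`in_comment` flag are assumptions made for the example, not part of the
Serialization draft.

```python
def is_xml10_char(cp):
    """Char production of XML 1.0."""
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)

def is_xml11_char(cp):
    """Char production of XML 1.1."""
    return (0x1 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_xml11_restricted(cp):
    """RestrictedChar production of XML 1.1 (NEL, #x85, is excluded)."""
    return (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)

def can_serialize(ch, version, in_comment=False):
    """True if `ch` can be written under the given XML version."""
    cp = ord(ch)
    if version == "1.0":
        return is_xml10_char(cp)
    if not is_xml11_char(cp):
        return False
    # Restricted characters need a character reference, and character
    # references are not recognized inside comments.
    return not (in_comment and is_xml11_restricted(cp))

print(can_serialize("\x01", "1.0"), can_serialize("\x01", "1.1"))
```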

     Finally, the description of the version parameter should indicate 
that it controls the version of both XML and Namespaces in XML to which 
the serialized result should conform.  No independent parameter specifying 
the version of Namespaces in XML is required.

     I will modify the Serialization specification to reflect these 
decisions.

     As you were present when these decisions were made, I will assume the 
response is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0062.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0064-01: [Serialization] IBM-SE-014: Serializing the "nilled" property
[substantive, acknowledged] 2004-04-13

The Serialization document says nothing about how the "nilled" property of
an element node is serialized. Does this property always result in an
xsi:nil attribute on the generated element? Does this process depend on
anything (for example, the type of the element and/or whether it is
"nillable")?

--Don Chamberlin



    

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

> The Serialization document says nothing about how the "nilled" 
> property of an element node is serialized. Does this property always
> result in an xsi:nil attribute on the generated element? Does this 
> process depend on anything (for example, the type of the element 
> and/or whether it is "nillable")? 

     Thank you for your comment.

     The XSL and XQuery working groups discussed your comment and decided 
that it could happen that an element has the nilled property with the 
value true, but has no xsi:nil attribute.  The working groups decided to 
add a note stating that, in such cases, the serialization process will not 
create an xsi:nil attribute for the element.

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0064.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0146-01: [Serial] canonicalization
[substantive, announced] 2004-04-28
[Serial] canonicalization, Elliotte Rusty Harold (2004-02-07)



It seems useful for the XML output method to allow a canonical
parameter which if true would cause the processor to emit canonical
XML. This should not be required of processors, but should be
allowed. (i.e. it's optional like indent).

The trickiest part is that this could conflict with other properties
like indent and omit-xml-declaration. The processor could either
signal an error if there was an explicit conflict, or recover by
simply outputting canonical XML.  I prefer the latter solution.
--

   Elliotte Rusty Harold
   elharo@metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA



    
Re: [Serial] canonicalization, Henry Zongaro (2004-04-28)

Elliotte,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

> It seems useful for the XML output method to allow a canonical 
> parameter which if true would cause the processor to emit canonical 
> XML. This should not be required of processors, but should be 
> allowed. (i.e. it's optional like indent).
> 
> The trickiest part is that this could conflict with other properties 
> like indent and omit-xml-declaration. The processor could either 
> signal an error if there was an explicit conflict, or recover by 
> simply outputting canonical XML.  I prefer the latter solution.

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment.  The 
working groups decided that it was too late in the process to add a new 
feature to serialization to support canonicalization, particularly in 
light of the fact that a solution to this problem is currently available: 
serialize using the xml output method, and post-process that serialized 
result with a processor that converts the XML documents to the appropriate 
type of canonical XML.  In addition, the lack of type-awareness in 
existing definitions of canonicalization was of concern to the working 
groups.

     May I ask you to confirm that this response is acceptable to you?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0146.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
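
Editorial illustration: the workaround described in the response above
(serialize with the xml output method, then post-process into canonical
XML) might look roughly like the following Python sketch. It assumes the
standard library's ElementTree writer as a stand-in for the xml output
method and its C14N 2.0 canonicalizer as the post-processor; neither is
prescribed by the working groups, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def serialize_then_canonicalize(root):
    """Serialize a tree normally, then canonicalize the serialized text."""
    # Step 1: ordinary serialization (stand-in for the xml output method).
    serialized = ET.tostring(root, encoding="unicode")
    # Step 2: post-process the serialized result into canonical XML.
    # ET.canonicalize implements C14N 2.0 (Python 3.8 or later).
    return ET.canonicalize(xml_data=serialized)


doc = ET.fromstring('<doc b="2" a="1"><empty/></doc>')
print(serialize_then_canonicalize(doc))
```

Canonicalization sorts the attributes and expands the empty-element tag,
which is why it can conflict with parameters such as indent.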
qt-2004Feb0188-01: Serialization (sometimes) needs to include type information
[substantive, announced] 2004-07-13
Consider the following schema fragment:

<xs:element name="A">
    <xs:complexType>
       <xs:sequence>
          <xs:element name="C" type="myns:Type1"/>
       </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="B">
    <xs:complexType>
       <xs:sequence>
          <xs:element name="C" type="myns:Type2"/>
       </xs:sequence>
    </xs:complexType>
</xs:element>

Now if we consider a document (or any other data source) containing
both A and B elements, the following query

<result>
{
    for $x in doc("myDocument")//C
    return $x
}
</result>

returns a result that cannot be strongly typed without losing type
information by any valid schema, as the schema spec forbids elements
with the same name and a different type in the same content model.

It seems to me that the only way of retaining type information would
be to annotate produced C elements with xsi:type. This could be a
serialization parameter, similar to the
cdata-section-elements. However, this would raise another issue, as
anonymous type names would then be exposed, and would thus require to
be handled in a consistent way by different XQuery and XML Schema
processors.

This issue is important, especially for tools that perform distributed
XQuery processing, and that need to retain consistent type information
when moving XML data from one processing node to another.
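
Editorial illustration: a minimal Python sketch of the xsi:type idea
raised above. It assumes each C element arrives paired with the name of
its type annotation; the pairing, the function name, and the myns prefix
binding are hypothetical, and this is not a feature of the serialization
draft.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)


def annotate_with_xsi_type(items):
    """Wrap (element, type name) pairs in <result>, adding xsi:type to each."""
    result = ET.Element("result")
    for element, type_name in items:
        # Record the type annotation explicitly so it survives serialization.
        element.set("{%s}type" % XSI, type_name)
        result.append(element)
    # Note: a real serializer would also have to declare the prefix used
    # inside the attribute value (here "myns"); this sketch does not.
    return ET.tostring(result, encoding="unicode")


first = ET.fromstring("<C>1</C>")    # imagined to come from an A element
second = ET.fromstring("<C>x</C>")   # imagined to come from a B element
print(annotate_with_xsi_type([(first, "myns:Type1"), (second, "myns:Type2")]))
```

The sketch only covers named types; it does nothing about the anonymous
type names that the comment identifies as the harder problem.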
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Hello,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Consider the following schema fragment:

<xs:element name="A">
    <xs:complexType>
       <xs:sequence>
          <xs:element name="C" type="myns:Type1"/>
       </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="B">
    <xs:complexType>
       <xs:sequence>
          <xs:element name="C" type="myns:Type2"/>
       </xs:sequence>
    </xs:complexType>
</xs:element>

Now if we consider a document (or any other data source) containing
both A and B elements, the following query

<result>
{
    for $x in doc("myDocument")//C
    return $x
}
</result>

returns a result that cannot be strongly typed without losing type
information by any valid schema, as the schema spec forbids elements
with the same name and a different type in the same content model.

It seems to me that the only way of retaining type information would
be to annotate produced C elements with xsi:type. This could be a
serialization parameter, similar to the
cdata-section-elements. However, this would raise another issue, as
anonymous type names would then be exposed, and would thus require to
be handled in a consistent way by different XQuery and XML Schema
processors.

This issue is important, especially for tools that perform distributed
XQuery processing, and that need to retain consistent type information
when moving XML data from one processing node to another.
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment and several related comments.  There was general 
agreement that there is some need for a mechanism that preserves most or 
all of the properties of the items in the sequence that is being 
serialized.

     However, the working groups decided that precisely defining all of 
the requirements for such a mechanism at this stage would be difficult, 
and would likely lead to a solution that would not satisfy real user 
requirements.  Therefore, the working groups decided to consider such a 
feature for a future revision of the recommendations, and close this 
comment without any changes to the specifications.

     May I ask you to confirm that this response is acceptable to you?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0188.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0261-01: [Serialization] SCHEMA-A
[substantive, acknowledged] 2004-05-21
[Serialization] SCHEMA-A, Mary Holstege (2004-02-12)





Technical

[A] [Section 3: Serialization Parameters] This section outlines 15
serialization parameters. These parameters have informal descriptions - such
as 'The value must be yes or no', etc. It is possible to describe all the
parameters, except 'use-character-maps', using XML Schema data types.
Suggested descriptions are,

 encoding: a new datatype derived from 'xs:string'
 cdata-section-elements: a list of 'xs:QName'
 doctype-system: a new datatype derived from 'xs:string'
 doctype-public: a new datatype derived from 'xs:string'
 escape-uri-attributes: 'xs:boolean'
 include-content-type: 'xs:boolean'
 indent: 'xs:boolean'
 media-type: a new datatype derived from 'xs:string'
 normalize-unicode: 'xs:boolean'
 omit-xml-declaration: 'xs:boolean'
 standalone: 'xs:boolean'
 undeclare-namespaces: 'xs:boolean'
 version: a new datatype derived from 'xs:string'
 use-character-maps: NONE (see related issue, [N])
 method: 'xs:QName'

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    
RE: [Serialization] SCHEMA-A, Michael Kay (2004-02-13)

Seems a good idea in principle. The types as listed aren't quite right,
for example standalone is xs:boolean? rather than xs:boolean, but this
only adds to the argument for formalising them.

Michael Kay (personal response)

Re: [Serialization] SCHEMA-A, Henry Zongaro (2004-03-28)



Mary,

     In [1], you submitted the following comment on the Serialization last 
call draft on behalf of the XML Schema WG.

Mary Holstege wrote on 2004-02-12 04:11:28 PM:

> [A] [Section 3: Serialization Parameters] This section outlines 15
> serialization parameters. These parameters have informal descriptions - such
> as 'The value must be yes or no', etc. It is possible to describe all the
> parameters, except 'use-character-maps', using XML Schema data types.
> Suggested descriptions are,
> 
>  encoding: a new datatype derived from 'xs:string'
>  cdata-section-elements: a list of 'xs:QName'
>  doctype-system: a new datatype derived from 'xs:string'
>  doctype-public: a new datatype derived from 'xs:string'
>  escape-uri-attributes: 'xs:boolean'
>  include-content-type: 'xs:boolean'
>  indent: 'xs:boolean'
>  media-type: a new datatype derived from 'xs:string'
>  normalize-unicode: 'xs:boolean'
>  omit-xml-declaration: 'xs:boolean'
>  standalone: 'xs:boolean'
>  undeclare-namespaces: 'xs:boolean'
>  version: a new datatype derived from 'xs:string'
>  use-character-maps: NONE (see related issue, [N])
>  method: 'xs:QName'

     Thanks to you and the Schema WG for this comment.

     The XSL and XQuery working groups considered the comment, and agreed 
that the definitions of the permissible sets of values need to be 
specified more clearly.  However, the working groups did not feel it was 
necessary to describe the values with reference to the XML Schema data 
types, as the serialization parameters are not part of an API, but merely 
a formalism used between specifications.

     The working groups would like to replace the descriptions of the 
values of the parameters that appear in the bulleted list in Section 3 
with a table.  The following is my proposed replacement.



<<
+----------------------+------------------------------------------------+
|PARAMETER NAME        |PERMITTED VALUES FOR PARAMETER                  |
+----------------------+------------------------------------------------+
|cdata-section-elements|A list of expanded-QNames, possibly empty.      |
+----------------------+------------------------------------------------+
|doctype-public        |A string of Unicode characters.  This parameter |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|doctype-system        |A string of Unicode characters.  This parameter |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|encoding              |A string of Unicode characters in the range #x21|
|                      |to #x7E (that is, printable ASCII characters);  |
|                      |the value should be a charset registered with   |
|                      |the Internet Assigned Numbers Authority [IANA], |
|                      |[RFC2278] or begin with the characters x- or X-.|
+----------------------+------------------------------------------------+
|escape-uri-attributes |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|include-content-type  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|indent                |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|media-type            |A string of Unicode characters specifying the   |
|                      |media type (MIME content type) [RFC2376]; the   |
|                      |charset parameter of the media type must not be |
|                      |specified explicitly.                           |
+----------------------+------------------------------------------------+
|method                |An expanded-QName with a null namespace URI, and|
|                      |the local part of the name equal to xml, xhtml, |
|                      |html or text, or having a non-null namespace    |
|                      |URI.  If the namespace URI is non-null, the     |
|                      |parameter specifies an implementation-defined   |
|                      |output method.                                  |
+----------------------+------------------------------------------------+
|normalize-unicode     |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|omit-xml-declaration  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|standalone            |One of the enumerated values yes, no or none    |
+----------------------+------------------------------------------------+
|undeclare-namespaces  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|use-character-maps    |A list of pairs, possibly empty, with each pair |
|                      |consisting of a single Unicode character and a  |
|                      |string of Unicode characters.                   |
+----------------------+------------------------------------------------+
|version               |A string of Unicode characters.                 |
+----------------------+------------------------------------------------+
>>



     May I ask you to confirm that this response is acceptable to the XML 
Schema WG?

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
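
Editorial illustration: the permitted values in the proposed table can be
read as a set of simple checks. The Python sketch below is one possible
reading, not part of the proposal. It assumes an expanded-QName is
represented as a (namespace URI or None, local name) tuple, that an
absent optional parameter is None, and that an encoding name is
non-empty; the function names are hypothetical, and the IANA
registration recommendation for encoding is not checked.

```python
import re

YES_NO = {"yes", "no"}
ENUMERATED = {
    "escape-uri-attributes": YES_NO,
    "include-content-type": YES_NO,
    "indent": YES_NO,
    "normalize-unicode": YES_NO,
    "omit-xml-declaration": YES_NO,
    "standalone": {"yes", "no", "none"},
    "undeclare-namespaces": YES_NO,
}
BUILTIN_METHODS = {"xml", "xhtml", "html", "text"}


def is_expanded_qname(value):
    # Assumed representation: (namespace URI or None, local name).
    return (isinstance(value, tuple) and len(value) == 2
            and (value[0] is None or isinstance(value[0], str))
            and isinstance(value[1], str))


def is_character_map_entry(pair):
    # A single Unicode character paired with a replacement string.
    return (isinstance(pair, tuple) and len(pair) == 2
            and isinstance(pair[0], str) and len(pair[0]) == 1
            and isinstance(pair[1], str))


def is_permitted(name, value):
    """Return True if value is permitted for the named parameter."""
    if name in ENUMERATED:
        return value in ENUMERATED[name]
    if name in ("doctype-public", "doctype-system"):
        return value is None or isinstance(value, str)  # optional parameters
    if name == "version":
        return isinstance(value, str)
    if name == "encoding":
        # Printable ASCII only, #x21 to #x7E.
        return isinstance(value, str) and re.fullmatch(r"[\x21-\x7E]+", value) is not None
    if name == "media-type":
        # The charset parameter must not be specified explicitly.
        return isinstance(value, str) and "charset=" not in value.lower()
    if name == "method":
        # Null namespace requires a built-in method name; any other
        # namespace names an implementation-defined method.
        return is_expanded_qname(value) and (bool(value[0]) or value[1] in BUILTIN_METHODS)
    if name == "cdata-section-elements":
        return all(is_expanded_qname(q) for q in value)
    if name == "use-character-maps":
        return all(is_character_map_entry(p) for p in value)
    raise KeyError(name)
```

For example, is_permitted("standalone", "none") holds, while
is_permitted("indent", "none") does not, which reflects Michael Kay's
point that standalone is not a plain boolean.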

Re: [Serialization] SCHEMA-A, Mary Holstege (2004-05-21)


The Schema WG thanks you for your response. We find your response a distinct
improvement and are content with it, although certain members of the WG
continue to feel that the definitions would be cleaner if you did, in fact, go
the final step to using concrete datatypes for the serialization parameter
definitions. 

//Mary
qt-2004Feb0262-01: [Serialization] SCHEMA-B
[substantive, acknowledged] 2004-05-21
[Serialization] SCHEMA-B, Mary Holstege (2004-02-12)





Technical

[B] [Section 2: Serializing Arbitrary Data Models] "cast as xs:string" is a
key phrase in this section. To improve readability, there should be a
pointer to what "cast as xs:string" means.

We found 2 locations where how to "cast as xs:string" is indirectly
described,

[1]
http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#ElementNodeAccessors
(see dm:string-value)
[2]
http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#AttributeNodeAccessors
(see dm:string-value)

We found 1 location where how to "cast to string" is directly described,

[3] http://www.w3.org/TR/2003/WD-xpath-functions-20031112/#casting-to-string

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    
Re: [Serialization] SCHEMA-B, Henry Zongaro (2004-04-28)

Mary,

     In [4], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

> [B] [Section 2: Serializing Arbitrary Data Models] "cast as xs:string" is a
> key phrase in this section. To improve readability, there should be a
> pointer to what "cast as xs:string" means.
> 
> We found 2 locations where how to "cast as xs:string" is indirectly
> described,
> 
> [1]
> http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#ElementNodeAccessors
> (see dm:string-value)
> [2]
> http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/#AttributeNodeAccessors
> (see dm:string-value)
> 
> We found 1 location where how to "cast to string" is directly described,
> 
> [3] http://www.w3.org/TR/2003/WD-xpath-functions-20031112/#casting-to-string

     Thanks to you and to the working group for this comment.

     The XSL and XML Query Working Groups discussed the comment, and 
agreed that Section 2 of Serialization should refer to a normative 
definition of casting to string.  The working groups decided the 
normative definition should be that found in the Functions and Operators 
draft.[3]

     May I ask you to confirm that the working group finds the response 
acceptable?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[4] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0262.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serialization] SCHEMA-B, Mary Holstege (2004-05-21)


The Schema WG thanks you for your response to our comment, and are happy with
it.

//Mary
qt-2004Feb0263-01: [Serialization] SCHEMA-C
[substantive, acknowledged] 2004-05-21
[Serialization] SCHEMA-C, Mary Holstege (2004-02-12)





Technical

[C] [Section 2: Serializing Arbitrary Data Models] Saying the process fails
for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. What
happens if I have such a sequence? This appears to be a serialization error
because processor is unable to cast an atomic value to string. Suggestion:
replace 'process will fail' statement with 'serialization error'.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    
Re: [Serialization] SCHEMA-C, Henry Zongaro (2004-04-28)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

> [C] [Section 2: Serializing Arbitrary Data Models] Saying the process fails
> for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. What
> happens if I have such a sequence? This appears to be a serialization error
> because processor is unable to cast an atomic value to string. Suggestion:
> replace 'process will fail' statement with 'serialization error'.

     Thanks to you and the working group for this comment.

     The XSL and XML Query Working Groups discussed the comment, and 
agreed that the description should indicate that this is a serialization 
error.  In fact, the second and sixth items in the numbered list in 
Section 2 already normatively indicate that fact.  For clarity, the note 
in Section 2 will be changed to use the term "serialization error" as 
well.

     As that is the change the XML Schema Working Group recommended, I 
trust it will be acceptable.  May I ask you to confirm that the working 
group finds the response acceptable?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0263.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serialization] SCHEMA-C, Mary Holstege (2004-05-21)


Henry Zongaro writes:
> 
> Mary,
> 
>      In [1], you submitted the following comment on the Last Call Working 
> Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
> Working Group:
> 
> > [C] [Section 2: Serializing Arbitrary Data Models] Saying the process fails
> > for sequences containing xs:QName or xs:NOTATION nodes seems unhelpful. What
> > happens if I have such a sequence? This appears to be a serialization error
> > because processor is unable to cast an atomic value to string. Suggestion:
> > replace 'process will fail' statement with 'serialization error'.
> 
>      Thanks to you and the working group for this comment.
> 
>      The XSL and XML Query Working Groups discussed the comment, and 
> agreed that the description should indicate that this is a serialization 
> error.  In fact, the second and sixth items in the numbered list in 
> Section 2 already normatively indicate that fact.  For clarity, the note 
> in Section 2 will be changed to use the term "serialization error" as 
> well.
> 
>      As that is the change the XML Schema Working Group recommended, I 
> trust it will be acceptable.  May I ask you to confirm that the working 
> group finds the response acceptable?

The Schema WG thanks you for this response. We find the clarification as a
serialization error an improvement and accept that. We continue to be deeply
troubled, however, by the fact that data models with xs:QNames fail to
serialize and therefore validate correctly. We are heartened by our knowledge
that the Query/XSL Working groups have continued to discuss that matter, and
encourage them in that effort.

//Mary
qt-2004Feb0264-01: [Serialization] SCHEMA-D
[substantive, announced] 2004-07-13
[Serialization] SCHEMA-D, Mary Holstege (2004-02-12)





Technical

[D] [Section 1: Introduction] "Ed. Note: This material has been moved out of
the XSLT draft and into a separate document. The Working Groups also
considered moving this material directly into the Data Model document, but
elected to keep it separate for the moment, principally in order to advance
the Data Model to Last Call. In the future, this material may be moved into
the Data Model. The Working Groups solicit public opinion about which
alternative is superior. "

We prefer keeping this material in this separate document. This way, it
makes serialization as independent as possible.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Re: [Serialization] SCHEMA-D, Henry Zongaro (2004-07-13)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
working group:

<<
[D] [Section 1: Introduction] "Ed. Note: This material has been moved out of
the XSLT draft and into a separate document. The Working Groups also
considered moving this material directly into the Data Model document, but
elected to keep it separate for the moment, principally in order to advance
the Data Model to Last Call. In the future, this material may be moved into
the Data Model. The Working Groups solicit public opinion about which
alternative is superior. "

We prefer keeping this material in this separate document. This way, it
makes serialization as independent as possible.
>>

     Thanks to you and the schema working group for this comment.

     The XSL and XML Query Working Groups discussed the comment, and 
agreed that it would be best to specify the serialization process in a 
separate document.  The editorial note will be deleted.

     I trust the XML Schema Working Group will find that response 
acceptable, as it is as they suggested.  May I ask you to confirm that it 
is?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0264.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0265-01: [Serialization] SCHEMA-E
[substantive, acknowledged] 2004-10-04
[Serialization] SCHEMA-E, Mary Holstege (2004-02-12)





Technical/Editorial

[E] [Section 2: Serializing Arbitrary Data Models] This section talks about
serialization of an "arbitrary" data model but in fact will fail for some
data model instances. In addition, section 4 implicitly adds additional
constraints on the data model (by putting constraints on the serialized
form) without making it clear what those constraints are.

Suggested changes are,

- Change title of section 2 to Normalization.
- Be clearer about actual goal (in particular: to serialize to well formed
XML? to serialize to XML in such a way as to ensure 1:1 roundtrippability?
other?).
- List explicitly the conditions for successful serialization (and/or
successful serialization to Well Formed XML, and/or successful serialization
to XML which when schema-validated will produce the same data model
instance).

Section 4  (XML Output Method)  does state:
   "The xml output method outputs the data model as an XML entity that must
   satisfy the rules for either a well-formed XML document entity or a
   well-formed XML external general parsed entity, or both, unless the
   processor is unable to satisfy those rules due to either serialization
   errors or the requirements of the character expansion phase of
   serialization, as described in 3 Serialization Parameters."

It is not clear what happens when there are problems (when "the
processor is unable to satisfy those rules"): failure? output of
non-WF XML?

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    
Re: [Serialization] SCHEMA-E, Henry Zongaro (2004-09-21)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

<<
[E] [Section 2: Serializing Arbitrary Data Models] This section talks about
serialization of an "arbitrary" data model but in fact will fail for some
data model instances. In addition, section 4 implicitly adds additional
constraints on the data model (by putting constraints on the serialized
form) without making it clear what those constraints are.

Suggested changes are,

- Change title of section 2 to Normalization.
- Be clearer about actual goal (in particular: to serialize to well formed
XML? to serialize to XML in such a way as to ensure 1:1 roundtrippability?
other?).
- List explicitly the conditions for successful serialization (and/or
successful serialization to Well Formed XML, and/or successful serialization
to XML which when schema-validated will produce the same data model
instance).

Section 4  (XML Output Method)  does state:
   "The xml output method outputs the data model as an XML entity that must
   satisfy the rules for either a well-formed XML document entity or a
   well-formed XML external general parsed entity, or both, unless the
   processor is unable to satisfy those rules due to either serialization
   errors or the requirements of the character expansion phase of
   serialization, as described in 3 Serialization Parameters."

It is not clear what happens when there are problems (when "the
processor is unable to satisfy those rules"): failure? output of
non-WF XML?
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed your comment.

     Regarding the first point, the working groups agreed.  The title will 
be changed to "Sequence Normalization"

     Regarding the second point, the goal is to serialize well-formed XML 
that reflects the content of the input sequence to the extent possible. We 
will add a statement to that effect.

     Regarding the third point, in answer to the question as to what 
happens when "those rules" can't be satisfied, we will clarify the cited 
text by changing it to the following:

<<
The xml output method outputs the instance of the data model as an XML 
entity that must satisfy the rules for either a well-formed XML document 
entity or a well-formed XML external general parsed entity, or both.  A 
serialization error results if the serializer is unable to satisfy those 
rules, except for contents modified by the character expansion phase of 
serialization, as described in 4 Phases of Serialization, which may result 
in the serial output being not well-formed rather than a serialization
error.  If a serialization error results, the processor must signal the
error.
>>

     However, describing the conditions under which serialization to XML 
will result in the same data model instance when schema-validated would 
not be practical; because of implementation-defined issues, etc., the list 
would be open ended.  The working groups decided to make no change to 
Serialization in response to this part of the comment.

     May I ask you to confirm that this response is acceptable to the 
Schema Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0265.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
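
Editorial illustration: the revised wording above distinguishes two
acceptable forms of output and says that an error must be signalled
otherwise. The Python sketch below approximates that check under stated
assumptions: wrapping the text in a dummy element stands in for the rules
for an external general parsed entity (text declarations are not
handled), the exception for character expansion is not modelled, and the
names SerializationError and classify_xml_output are hypothetical.

```python
import xml.etree.ElementTree as ET


class SerializationError(Exception):
    """Signalled when the output satisfies neither set of rules."""


def classify_xml_output(text):
    """Say which well-formedness rules the output meets, or signal an error."""
    try:
        ET.fromstring(text)
        # May also qualify as an external general parsed entity ("or both").
        return "document entity"
    except ET.ParseError:
        pass
    try:
        # Assumption: content that parses inside a wrapper element is
        # treated as a well-formed external general parsed entity.
        ET.fromstring("<wrapper>" + text + "</wrapper>")
        return "external general parsed entity"
    except ET.ParseError as error:
        raise SerializationError("output is not well-formed: %s" % error) from error


print(classify_xml_output("<a><b/></a>"))
print(classify_xml_output("some text <b>x</b> more text"))
```

Output such as "<a><b></a>" meets neither set of rules, so the sketch
raises SerializationError rather than returning malformed text.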
Re: [Serialization] SCHEMA-E, David Ezell (2004-10-04)

Dear XSL and XML Query Working Groups:

The XML Schema WG has reviewed your response to this issue during 
the telcon on October 1, 2004 [1], and we are satisfied with 
your answer.  Thanks for your time and consideration.

Sincerely,
David Ezell (on behalf of the XML Schema WG)

[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Oct/0000.html
qt-2004Feb0266-01: [Serialization] SCHEMA-F
[substantive, acknowledged] 2004-09-23
[Serialization] SCHEMA-F, Mary Holstege (2004-02-12)





Technical/Editorial

[F] [Section 2: Serializing Arbitrary Data Models] The first paragraph of
this section states:

"An instance of the data model that is input to the serialization process is
a sequence. The serialization process must first place that input sequence
into a normalized form for serialization; it is the normalized sequence that
is actually serialized. The normalized form for serialization is constructed
by applying all of the following rules in order, with the initial sequence
being input to the first step, and the sequence that results from any step
being used as input to the subsequent step."

We think wording in this section tends to imply a required implementation,
which, given the destructive nature of the implementation described, leads to
the conclusion that serialized data models cannot subsequently be used for
anything else. We believe what is intended is the description of a mapping
between data models and normalized data models, without attempting to constrain
implementations. We request that the text in this section be recast in a more
declarative fashion to make these intentions clear.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com



    

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

<<
[F] [Section 2: Serializing Arbitrary Data Models] The first paragraph of
this section states:

"An instance of the data model that is input to the serialization process is
a sequence. The serialization process must first place that input sequence
into a normalized form for serialization; it is the normalized sequence that
is actually serialized. The normalized form for serialization is constructed
by applying all of the following rules in order, with the initial sequence
being input to the first step, and the sequence that results from any step
being used as input to the subsequent step."

We think wording in this section tends to imply a required implementation,
which, given the destructive nature of the implementation described, leads to
the conclusion that serialized data models cannot subsequently be used for
anything else. We believe what is intended is the description of a mapping
between data models and normalized data models, without attempting to constrain
implementations. We request that the text in this section be recast in a more
declarative fashion to make these intentions clear.
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed your comment, and decided to make the 
following changes to Section 2 of Serialization.  These changes are with 
respect to the July 23 draft of Serialization.[2]


o Change the second sentence of the first paragraph to make it clear
  that the process is not destructive:

<<
Prior to serializing a sequence using any of the output methods whose 
behavior is specified by this document (3 Serialization Parameters) the 
serialization process must first compute a normalized sequence for 
serialization; it is the normalized sequence that is actually serialized.
>>

o Reword the items in the numbered list so that it's clear that the
  result at each step is a new sequence:

<<
1. If the sequence that is input to serialization is empty, create a
sequence S1 that consists of a zero-length string.  Otherwise, copy each
item in the sequence that is input to serialization to create the new
sequence S1.

2. For each item in the sequence S1, if the item is atomic, obtain the 
lexical representation of the item by casting it to an xs:string and copy 
the string representation to the new sequence; otherwise, copy the item, 
which must be a node, to the new sequence. It is a serialization error if 
an atomic value cannot be cast to xs:string.  The new sequence is S2.

3. For each subsequence of adjacent strings in S2, copy a single 
string to the new sequence equal to the values of the strings in the 
subsequence concatenated in order, each separated by a single space.  Copy 

all other items to the new sequence.  The new sequence is S3.

4. For each item in S3, if the item is a string, create a text 
node in the new sequence whose string value is equal to the string; 
otherwise, copy the item to the new sequence.  The new sequence is S4.

5. For each item in S4, if the item is a document node, copy its 
children to the new sequence; otherwise, copy the item to the new 
sequence.  The new sequence is S5.

6. It is a serialization error if an item in S5 is an attribute 
node or a namespace node. Otherwise, construct a new sequence, S6, that 
consists of a single document node and copy all the items in the sequence, 
which are all nodes, as children of that document node.

S6 is the normalized sequence.
>>
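
Editorial illustration (not part of the original message): the six steps 
above can be sketched roughly as follows.  This is a minimal sketch under 
stated assumptions, not the specification's algorithm as implemented by any 
processor.  The Node class, the SerializationError exception and the 
to_lexical helper are hypothetical stand-ins for the data model, and Python 
strings and numbers stand in for atomic values.  Each step builds a new 
list, so the input sequence is never modified.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a data model node."""
    kind: str  # "document", "element", "text", "attribute", "namespace", ...
    value: str = ""
    children: List["Node"] = field(default_factory=list)


class SerializationError(Exception):
    """Raised where the steps call for a serialization error."""


def to_lexical(atom) -> str:
    # Crude stand-in for casting an atomic value to xs:string (assumption).
    if isinstance(atom, bool):
        return "true" if atom else "false"
    return str(atom)


def normalize(sequence):
    items = list(sequence)
    # Step 1: an empty input becomes a single zero-length string.
    s1 = items if items else [""]
    # Step 2: atomic values become strings; nodes are copied as-is.
    s2 = [i if isinstance(i, Node) else to_lexical(i) for i in s1]
    # Step 3: join each run of adjacent strings with single spaces.
    s3, run = [], []
    for item in s2:
        if isinstance(item, str):
            run.append(item)
            continue
        if run:
            s3.append(" ".join(run))
            run = []
        s3.append(item)
    if run:
        s3.append(" ".join(run))
    # Step 4: every remaining string becomes a text node.
    s4 = [Node("text", i) if isinstance(i, str) else i for i in s3]
    # Step 5: document nodes are replaced by their children.
    s5 = []
    for item in s4:
        if item.kind == "document":
            s5.extend(item.children)
        else:
            s5.append(item)
    # Step 6: reject attribute/namespace nodes, then wrap in a document node.
    for item in s5:
        if item.kind in ("attribute", "namespace"):
            raise SerializationError("cannot serialize a top-level " + item.kind)
    return Node("document", children=list(s5))
```

For example, under these assumptions normalize(["red", "green", 3]) yields 
a document node with one text node child whose value is "red green 3".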

     May I ask you to confirm that this response is acceptable to the 
Schema working group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0266.html
[2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serdm
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Dear XSL and XML Query Working Groups:

The XML Schema WG has reviewed your response to this issue during 
the telcon on September 17, 2004 [1], and we are satisfied with 
your answer.  Thanks for your time and consideration.

Sincerely,
David Ezell (on behalf of the XML Schema WG)

[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
qt-2004Feb0267-01: [Serialization] SCHEMA-G
[substantive, raised] 2004-09-16
[Serialization] SCHEMA-G, Mary Holstege (2004-02-12)

Technical/Editorial

[G] [Section 3: Serialization Parameters] Namespace binding generation ought
to be explicitly called out either as its own phase or as part of markup
generation.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)
qt-2004Feb0268-01: [Serialization] SCHEMA-H
[substantive, acknowledged] 2004-05-21
[Serialization] SCHEMA-H, Mary Holstege (2004-02-12)

Technical/Editorial

[H] [Section 4: XML Output Method] The exception to the round-trippability
of the serialization is unclear: "Additional nodes may be present in the new
tree, and the values of attribute nodes and text nodes in the new tree may
be different from those in the original tree, due to the character expansion
phase of serialization."

What additional nodes may be present? How may they differ? As written this
sentence is ambiguous and may be read as allowing _any_ additional nodes in
the tree.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Re: [Serialization] SCHEMA-H, Henry Zongaro (2004-04-13)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group.

> [H] [Section 4: XML Output Method] The exception to the round-trippability
> of the serialization is unclear: "Additional nodes may be present in the new
> tree, and the values of attribute nodes and text nodes in the new tree may
> be different from those in the original tree, due to the character expansion
> phase of serialization."
> 
> What additional nodes may be present? How may they differ? As written this
> sentence is ambiguous and may be read as allowing _any_ additional nodes in
> the tree. 

     Thanks to Mary and the XML Schema Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment, and decided 
to add a note to clarify the situation.  I would like to add the following 
note to the final bullet of the bulleted list in section 4.

<<
Note:  The use-character-maps parameter can cause arbitrary characters to 
be inserted into the serialized XML document in an unescaped form, 
including characters that would be considered part of XML markup.  Such 
characters could result in arbitrary new element nodes, attribute nodes, 
and so on, in the new tree that results from processing the serialized XML 
document.
>>
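
Editorial illustration (not part of the original message): the effect the 
note describes can be seen in a small sketch.  The serialize_text function 
and the particular character map below are hypothetical; they only mimic 
the idea that mapped characters are written unescaped while all other 
characters are escaped as usual.  Python's xml.etree.ElementTree is used 
solely to re-parse the result.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_text(text, char_map):
    """Write text, substituting mapped characters without escaping them."""
    out = []
    for ch in text:
        if ch in char_map:
            out.append(char_map[ch])   # inserted as-is, even if it is markup
        else:
            out.append(escape(ch))     # normal escaping of &, < and >
    return "".join(out)


# Hypothetical character map: guillemets are replaced by markup.
char_map = {"\u00ab": "<b>", "\u00bb": "</b>"}

# The original tree holds one element with one text node and no <b> element.
original_text = "a \u00abbold\u00bb word & more"
serialized = "<p>" + serialize_text(original_text, char_map) + "</p>"

# Re-parsing the serialized form yields an element that was never in the
# original tree.
reparsed = ET.fromstring(serialized)
new_element = reparsed.find("b")
```

Here the re-parsed tree contains a b element child, which is the kind of 
additional node the note warns about.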

     May I ask you to confirm that this response is acceptable to the XML 
Schema Working Group?

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0268.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serialization] SCHEMA-H, Mary Holstege (2004-05-21)


>      The XSL and XQuery Working Groups discussed the comment, and decided 
> to add a note to clarify the situation.  I would like to add the following 
> note to the final bullet of the bulleted list in section 4.
> 
> <<
> Note:  The use-character-maps parameter can cause arbitrary characters to 
> be inserted into the serialized XML document in an unescaped form, 
> including characters that would be considered part of XML markup.  Such 
> characters could result in arbitrary new element nodes, attribute nodes, 
> and so on, in the new tree that results from processing the serialized XML 
> document.
> >>
> 
>      May I ask you to confirm that this response is acceptable to the XML 
> Schema Working Group?

The Schema WG thanks you for your response, and finds it acceptable.

	-- Mary
	   Holstege@mathling.com
qt-2004Feb0269-01: [Serialization] SCHEMA-I
[substantive, acknowledged] 2004-05-21
[Serialization] SCHEMA-I, Mary Holstege (2004-02-12)

Technical/Editorial

[I] [Section 4.3: XML Output Method: the indent Parameter] licenses the
addition of additional whitespace; this is not called out as permitted under
the rules in section 4, however.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Re: [Serialization] SCHEMA-I, Henry Zongaro (2004-04-13)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group.

> [I] [Section 4.3: XML Output Method: the indent Parameter] licenses the
> addition of additional whitespace; this is not called out as permitted under
> the rules in section 4, however.

     Thanks to you and the XML Schema Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment, and agreed. 
The following item will be added to the bulleted list in section 4 to 
address this comment:

<<
o Additional text nodes consisting of whitespace characters may be present 
in the new tree and some text nodes in the new tree may contain additional 
whitespace characters that were not present in the original tree if the 
indent parameter has the value yes, as described in 4.3 XML Output Method: 
the indent Parameter.
>>
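
Editorial illustration (not part of the original message): an analogous 
effect can be observed with any pretty-printer.  The sketch below uses 
Python's xml.dom.minidom rather than an XSLT or XQuery serializer, so it 
is only an analogy for the indent parameter, not a demonstration of it.

```python
from xml.dom import minidom

# Original tree: <a> has exactly two element children and no text nodes.
original = minidom.parseString("<a><b/><c/></a>")

# Pretty-print with indentation, then parse the result again.
indented = original.toprettyxml(indent="  ")
reparsed = minidom.parseString(indented)

original_children = original.documentElement.childNodes
reparsed_children = reparsed.documentElement.childNodes

# Text nodes that exist only because indentation was added.
added_text_nodes = [
    node for node in reparsed_children if node.nodeType == node.TEXT_NODE
]
```

The re-parsed tree has whitespace-only text nodes among the children of a 
that the original tree did not have, which is what the new bullet permits 
when indent has the value yes.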

     May I ask you to confirm that this response is acceptable to the XML 
Schema Working Group?

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0269.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serialization] SCHEMA-I, Mary Holstege (2004-05-21)


Henry Zongaro writes:
>      The XSL and XQuery Working Groups discussed the comment, and agreed. 
> The following item will be added to the bulleted list in section 4 to 
> address this comment:
> 
> <<
> o Additional text nodes consisting of whitespace characters may be present 
> in the new tree and some text nodes in the new tree may contain additional 
> whitespace characters that were not present in the original tree if the 
> indent parameter has the value yes, as described in 4.3 XML Output Method: 
> the indent Parameter.
> >>
> 
>      May I ask you to confirm that this response is acceptable to the XML 
> Schema Working Group?

The Schema WG thanks you for this response and finds it acceptable.


	-- Mary
	   Holstege@mathling.com
qt-2004Feb0271-01: [Serialization] SCHEMA-K
[substantive, acknowledged] 2004-09-23
[Serialization] SCHEMA-K, Mary Holstege (2004-02-12)

Technical

[K] [General] In the absence of 'Conformance' Section, what should a
processor do to claim conformance to this specification?

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

<<
[K] [General] In the absence of 'Conformance' Section, what should a
processor do to claim conformance to this specification?
>>

     Thanks to you and the Schema Working Group for this comment.  The XSL 
and XML Query Working Groups discussed your comment, and decided to add 
the following Conformance section to the Serialization draft:

<<
10. Conformance

Serialization is intended primarily as a component that can be
used by other specifications.  Therefore, this document relies
on specifications that use it to specify conformance criteria
for Serialization in their respective environments.
Specifications that set conformance criteria for their use of
Serialization must not change the semantic definitions of 
Serialization as given in this specification, except by
subsetting and/or compatible extensions.
>>

     May I ask you to confirm that this response is acceptable to the 
Schema Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0271.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Dear XSL and XML Query Working Groups:

The XML Schema WG has reviewed your response to this issue during 
the telcon on September 17, 2004 [1], and we are satisfied with 
your answer.  Thanks for your time and consideration.

Sincerely,
David Ezell (on behalf of the XML Schema WG)

[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
qt-2004Feb0272-01: [Serialization] SCHEMA-L
[substantive, acknowledged] 2004-09-23
[Serialization] SCHEMA-L, Mary Holstege (2004-02-12)

Technical

[L] [Section 4.1] Given that XML 1.1 is not [Should be "now". HZ] a
recommendation, we believe that
the serialization specification should give guidance to users and implementers
about serializing as 1.0 or 1.1. We believe this section (4.1) is a good start,
but needs more details about how serializers should deal with characters in the
range x00 to x1F (ref http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-xml11). 
See our related comment on the data model.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
working group:

<<
Technical

[L] [Section 4.1] Given that XML 1.1 is not [Should be "now". HZ] a
recommendation, we believe that
the serialization specification should give guidance to users and implementers
about serializing as 1.0 or 1.1. We believe this section (4.1) is a good start,
but needs more details about how serializers should deal with characters in the
range x00 to x1F (ref http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-xml11). 
See our related comment on the data model.
>>

     Thanks to you and the XML Schema Working Group for this comment.

     The XSL and XML Query Working Groups discussed your comment.  In 
response to another comment on the Last Call Working Draft, the working 
groups decided to add information on how NEL and LSEP characters must be 
handled, but neglected to add information on the control characters.

     The working groups decided to make the following changes based on the 
July 23 Working Draft of Serialization.[2]

o In Section 5, the first paragraph following the bulleted list,
change "certain whitespace characters" to "certain characters".
(We've not been dealing with whitespace characters alone for some
time.)

o In Section 5, the first paragraph following the bulleted list,
append the following sentence

<<
In addition, the non-whitespace control characters #x1 through #x1F
and #x7F through #x9F in text nodes and attribute nodes must be
output as character references.
>>

o In Section 5, in the last note, remove the words "or CDATA sections".

o In Section 5, in the last note, append the following paragraph

<<
XML 1.0 permitted control characters in the range #x7F through #x9F
to appear as literal characters in an XML document, but XML 1.1
requires such characters to be escaped as character references.  An
external general parsed entity with no text declaration or a text
declaration that specifies a version pseudo-attribute with value "1.0"
that is invoked by an XML 1.1 document entity must follow the rules of
XML 1.1.  Therefore, the non-whitespace control characters in the
ranges #x1 through #x1F and #x7F through #x9F must always be escaped,
regardless of the value of the version parameter.
>>
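
Editorial illustration (not part of the original message): the escaping 
rule proposed above can be sketched as below.  The function name 
escape_control_characters is hypothetical, and the sketch assumes that 
"non-whitespace" excludes only tab (#x9), line feed (#xA) and carriage 
return (#xD).  It covers only this one rule; it does not escape markup 
characters and does not check whether a character is permitted at all in 
the XML version being written.

```python
def escape_control_characters(text: str) -> str:
    """Replace non-whitespace control characters with character references."""
    whitespace_controls = "\t\n\r"  # #x9, #xA, #xD are left untouched
    out = []
    for ch in text:
        code_point = ord(ch)
        in_c0 = 0x1 <= code_point <= 0x1F and ch not in whitespace_controls
        in_del_c1 = 0x7F <= code_point <= 0x9F
        if in_c0 or in_del_c1:
            # Hexadecimal character reference, e.g. &#x7F;
            out.append("&#x%X;" % code_point)
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions, a text node value containing #x7F is written 
with the reference &#x7F; whatever the value of the version parameter.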

     May I ask you to confirm that this response is acceptable to the 
Schema Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0272.html
[2] http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Dear XSL and XML Query Working Groups:

The XML Schema WG has reviewed your response to this issue during 
the telcon on September 17, 2004 [1], and we are satisfied with 
your answer.  Thanks for your time and consideration.

Sincerely,
David Ezell (on behalf of the XML Schema WG)

[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Sep/0090.html
qt-2004Feb0362-01: [Serial] I18N WG last call comments [4]
[substantive, objected] 2004-09-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[4] This only defines serialization into bytes. In some contexts
   (e.g. Databases, in-program,...), serialization into a stream
   of characters is also important. The spec should specify how
   this is done.

Regards,    Martin.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

> [4] This only defines serialization into bytes. In some contexts
>    (e.g. Databases, in-program,...), serialization into a stream
>    of characters is also important. The spec should specify how
>    this is done.

     Thanks to Martin and the I18N Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment.  The working 
groups noted that there is an analogy in parsing XML documents.  XML 1.0 
and XML 1.1 parsed entities are defined as sequences of character code 
points, each in some encoding.  Though it is common practice to parse XML 
documents that have already been decoded into a sequence of characters, 
the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an 
XML processor in those terms.

     Based on this analogy, the working groups decided that it was not 
appropriate for Serialization to specify normatively how to serialize into 
a stream of characters.  The working groups did decide to add a note to 
Section 3 of Serialization indicating that a processor could provide an 
option that would permit the fourth phase of serialization (Encoding) to 
be skipped.

     May I ask the I18N Working Group to confirm that this response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, François Yergeau (2004-06-15)

Henry Zongaro wrote:
>      In [1], Martin Duerst submitted the following comment on the Last 
> Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
> the I18N Working Group.
> 
> 
>>[4] This only defines serialization into bytes. In some contexts
>>   (e.g. Databases, in-program,...), serialization into a stream
>>   of characters is also important. The spec should specify how
>>   this is done.
> 
> 
>      Thanks to Martin and the I18N Working Group for this comment.
> 
>      The XSL and XQuery Working Groups discussed the comment.  The working 
> groups noted that there is an analogy in parsing XML documents.  XML 1.0 
> and XML 1.1 parsed entities are defined as sequences of character code 
> points, each in some encoding.  Though it is common practice to parse XML 
> documents that have already been decoded into a sequence of characters, 
> the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an 
> XML processor in those terms.
> 
>      Based on this analogy, the working groups decided that it was not 
> appropriate for Serialization to specify normatively how to serialize into 
> a stream of characters.  The working groups did decide to add a note to 
> Section 3 of Serialization indicating that a processor could provide an 
> option that would permit the fourth phase of serialization (Encoding) to 
> be skipped.

We are not really satisfied with this resolution and would like to 
request further clarification.  In particular, conformance when one is 
actually serializing to characters instead of bytes is not clear at all 
to us.  Allowing this but not normatively is very strange, one is left 
to wonder what would be the conformance status of an implementation that 
*only* serializes to characters (because that's all that is required in 
a given context).

> [1] 
> http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html

Regards,

-- 
François Yergeau
Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-06-15)

Hello, François.

François Yergeau wrote on 2004-06-14 09:28:11 PM:
>Henry Zongaro wrote:
>>      In [1], Martin Duerst submitted the following comment on the Last 
>> Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
>> the I18N Working Group.
>> 
>>>[4] This only defines serialization into bytes. In some contexts
>>>   (e.g. Databases, in-program,...), serialization into a stream
>>>   of characters is also important. The spec should specify how
>>>   this is done.
>>      The XSL and XQuery Working Groups discussed the comment.  The working 
>> groups noted that there is an analogy in parsing XML documents.  XML 1.0 
>> and XML 1.1 parsed entities are defined as sequences of character code 
>> points, each in some encoding.  Though it is common practice to parse XML 
>> documents that have already been decoded into a sequence of characters, 
>> the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an 
>> XML processor in those terms.
>> 
>>      Based on this analogy, the working groups decided that it was not 
>> appropriate for Serialization to specify normatively how to serialize into 
>> a stream of characters.  The working groups did decide to add a note to 
>> Section 3 of Serialization indicating that a processor could provide an 
>> option that would permit the fourth phase of serialization (Encoding) to 
>> be skipped.
>
>We are not really satisfied with this resolution and would like to 
>request further clarification.  In particular, conformance when one is 
>actually serializing to characters instead of bytes is not clear at all 
>to us.  Allowing this but not normatively is very strange, one is left 
>to wonder what would be the conformance status of an implementation that 
>*only* serializes to characters (because that's all that is required in 
>a given context).
> 
>> [1] 
>> http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html

     Thank you for your response.  The intent of the note was to indicate 
that an implementer might supply such a feature as an extension, because 
it is often required, but that such a feature is explicitly beyond the 
scope of the specification.  An implementer might supply anything as an 
extension, and doesn't require permission to do so - we would just like to 
mention this one as a useful extension.

     Here is the text of the note that I'm proposing:

<<
Note: Serialization is only defined in terms of encoding the result as a 
stream of bytes. However, a processor may provide an option that allows 
the encoding phase to be skipped, so that the result of serialization is a 
stream of Unicode characters. The effect of any such option is 
implementation-defined, and a processor is not required to support such an 
option.
>>
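
Editorial illustration (not part of the original message): the optional 
behaviour described in the note amounts to stopping before the final 
phase.  In the sketch below, finish_serialization and its skip_encoding 
flag are hypothetical names; the earlier phases are assumed to have 
already produced a string of Unicode characters.

```python
def finish_serialization(characters: str, encoding: str = "UTF-8",
                         skip_encoding: bool = False):
    """Apply the encoding phase, or skip it if the caller asks for that."""
    if skip_encoding:
        # Implementation-defined extension: return the characters as-is.
        return characters
    # Defined behaviour: the result of serialization is a stream of bytes.
    return characters.encode(encoding)


as_bytes = finish_serialization("<p>\u00e9</p>")
as_characters = finish_serialization("<p>\u00e9</p>", skip_encoding=True)
```

With the default arguments the result is a bytes object; with 
skip_encoding set it is the unencoded string.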

     I don't believe there is a question of conformance here. 
Serialization to characters is explicitly a usage that is beyond the 
specification, and the behaviour of a processor that supplies such a 
feature is unspecified.  Similarly, many XML parsers are able to parse 
characters in addition to parsing encoded characters, but the conformance 
of such parsers is not in question in spite of the fact that this feature 
is an extension that is not described by the XML 1.0 or 1.1 
Recommendations.

     Does the I18N Working Group feel it would be better not to include 
such a note at all?

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Martin, François.

     In [1] Martin submitted the following comment on the Last Call 
Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the 
I18N working group:

<<
[4] This only defines serialization into bytes. In some contexts
   (e.g. Databases, in-program,...), serialization into a stream
   of characters is also important. The spec should specify how
   this is done.
>>

     In [2], I announced the following decision on behalf of the XSL and 
XML Query Working Groups:

<<
     The XSL and XQuery Working Groups discussed the comment.  The working 
groups noted that there is an analogy in parsing XML documents.  XML 1.0 
and XML 1.1 parsed entities are defined as sequences of character code 
points, each in some encoding.  Though it is common practice to parse XML 
documents that have already been decoded into a sequence of characters, 
the XML 1.0 and XML 1.1 Recommendations do not describe the actions of an 
XML processor in those terms.

     Based on this analogy, the working groups decided that it was not 
appropriate for Serialization to specify normatively how to serialize into 
a stream of characters.  The working groups did decide to add a note to 
Section 3 of Serialization indicating that a processor could provide an 
option that would permit the fourth phase of serialization (Encoding) to 
be skipped.
>>

     In [3], François raised the following objection on behalf of I18N:

<<
We are not really satisfied with this resolution and would like to 
request further clarification.  In particular, conformance when one is 
actually serializing to characters instead of bytes is not clear at all 
to us.  Allowing this but not normatively is very strange, one is left 
to wonder what would be the conformance status of an implementation that 
*only* serializes to characters (because that's all that is required in 
a given context).
>>

     The XSL and XML Query Working Groups discussed this comment again, 
and are unsure what change would resolve this issue.  There does not 
appear to be any interoperability problem with not requiring 
implementations to support skipping the encoding phase.  In addition, XSLT 
1.0 did not require support for skipping the encoding phase of 
serialization, and such support has been raised as a requirement for XSLT 
2.0.  Would it be sufficient to remove the note in the Serialization 
specification that mentions that processors may implement an option that 
allows serialization to characters rather than serialization to bytes?

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0065.html
[3] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0109.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Hello.

     In [1], I wrote:

<<
In addition, XSLT 1.0 did not require support for skipping the encoding 
phase of serialization, and such support has been raised as a requirement 
for XSLT 2.0.
>>

I omitted the word "not" from that sentence!  It should read as follows:

<<
In addition, XSLT 1.0 did not require support for skipping the encoding 
phase of serialization, and such support has NOT been raised as a 
requirement for XSLT 2.0.
>>

     My apologies for any confusion.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Aug/0137.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-02: [Serial] I18N WG last call comments [5]
[substantive, objected] 2004-11-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[5] Section 2, point 3: "each separated by a single space":
   Inserting a space may not be the right thing, in particular for
   Chinese, Japanese, Thai,... which don't have spaces between words.
   This has to be checked very carefully.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

This isn't trying to achieve linguistic separation; it is trying to
achieve separation of tokens that meets the rules defined in XML Schema.
XML Schema allows any sequence of whitespace characters between the
items in a list; we mandate a single space character because that's the
simplest whitespace sequence.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-28)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group:

> [5] Section 2, point 3: "each separated by a single space":
>    Inserting a space may not be the right thing, in particular for
>    Chinese, Japanese, Thai,... which don't have spaces between words.
>    This has to be checked very carefully.

     Thanks to Martin and the working group for this comment.

     The XSL and XML Query Working Groups discussed the comment, and 
decided that no change to the Serialization specification is required. The 
reason for separating each pair of string values by a single space is not 
to achieve any kind of linguistic separation of words, but to separate 
values in a way that would be consistent with the requirements for an XML 
Schema type derived by list, for instance.

     May I ask the I18N Working Group to confirm that this response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-05-05)

Hello Henry,

Many thanks for your responses. The I18N WG (Core TF) has looked
at your response below, and unfortunately, we have to say that we
cannot accept it. At the least, we need some more information
exchange to make sure we understand each other well.

Below, you write that the convention of inserting a space isn't
for linguistic separation, but for creating XML Schema lists.
This may be the intention of the spec-writers, but who guarantees
that this is how this will be used? In cases where it will be used
in other ways, there would be serious problems when adapting a
query or transformation to a different language (in particular
Chinese, Japanese, Thai,...).

So in particular, we need to know more about the following questions:
- How/when/why would sequences of strings (or other atomic data types)
   typically be generated?
- How would e.g. combinations of data values, strings,... be serialized
   other than through this mechanism? We think that in many cases, in
   particular for XML Query, this could be the mechanism of choice to
   write out texts mixed with e.g. stringified numbers.
- What would the effort be to change a script relying on this mechanism
   so that it works for Chinese/Japanese,...?
- How can the distinction between strings and text nodes be used to
   affect/create the right behavior, and how can we make sure that
   programmers use the solution that is easily adapted to all kinds
   of languages?

Regards,    Martin.


At 11:55 04/04/28 -0400, Henry Zongaro wrote:

>Hello,
>
>      In [1], Martin Duerst submitted the following comment on the Last
>Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of
>the I18N Working Group:
>
> > [5] Section 2, point 3: "each separated by a single space":
> >    Inserting a space may not be the right thing, in particular for
> >    Chinese, Japanese, Thai,... which don't have spaces between words.
> >    This has to be checked very carefully.
>
>      Thanks to Martin and the working group for this comment.
>
>      The XSL and XML Query Working Groups discussed the comment, and
>decided that no change to the Serialization specification is required. The
>reason for separating each pair of string values by a single space is not
>to achieve any kind of linguistic separation of words, but to separate
>values in a way that would be consistent with the requirements for an XML
>Schema type derived by list, for instance.
>
>      May I ask the I18N Working Group to confirm that this response is
>acceptable?
>
>Thanks,
>
>Henry [On behalf of the XSL and XML Query Working Groups]
>[1]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
>[2]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
>------------------------------------------------------------------
>Henry Zongaro      Xalan development
>IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
>mailto:zongaro@ca.ibm.com
RE: [Serial] I18N WG last call comments, Michael Kay (2004-05-06)

> 
> Hello Henry,
> 
> Many thanks for your responses. The I18N WG (Core TF) has looked
> at your response below, and unfortunately, we have to say that we
> cannot accept it. At the least, we need some more information
> exchange to make sure we understand each other well.
> 
> Below, you write that the convention of inserting a space isn't
> for linguistic separation, but for creating XML Schema lists.
> This may be the intention of the spec-writers, but who guarantees
> that this is how this will be used? 

Sorry, Martin, but I think you have completely missed the point here. If an
XML Schema declares the colors attribute as having type xs:NMTOKENS, and the
typed value is the sequence ("red", "green", "blue"), then the correct
lexical representation of this according to the rules in XML Schema is
colors="red green blue". If you don't like that, you need to complain to the
XML Schema WG.

The places where XSLT/XQuery use space as a default separator are all
associated with converting a typed value to the string value of a node, and
are therefore closely associated with this XML Schema convention for
representing lists. Of course we can't totally control how the facility is
used, but we do provide a string-join function that allows any separator to
be used in the lexical representation of a sequence, so we are not imposing
any constraints on users.

Michael Kay
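
The list convention described above can be sketched outside XSLT. The following Python fragment is only an illustration: the helper names list_lexical_form and list_typed_value are hypothetical, and the sketch assumes nothing beyond the XML Schema rule that the items of a list type are separated by whitespace in the lexical form.

```python
def list_lexical_form(items):
    """Join atomic values with a single space, as for a type derived by list."""
    return " ".join(str(item) for item in items)


def list_typed_value(lexical):
    """Recover the items by splitting on runs of whitespace."""
    return lexical.split()


colors = ["red", "green", "blue"]

# Space-separated form round-trips to three tokens.
print(list_lexical_form(colors))              # red green blue
print(list_typed_value("red   green  blue"))  # ['red', 'green', 'blue']

# Without a separator, only one token can be recovered.
print(list_typed_value("".join(colors)))      # ['redgreenblue']

# Any other separator can be chosen explicitly (compare string-join).
print(", ".join(colors))                      # red, green, blue
```

The last line plays the role that string-join plays in the message above: the separator is the caller's choice, and the space is only the default.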
RE: [Serial] I18N WG last call comments, Martin Duerst (2004-05-24)

Hello Michael,

At 17:52 04/05/06 +0100, Michael Kay wrote:

> >
> > Hello Henry,
> >
> > Many thanks for your responses. The I18N WG (Core TF) has looked
> > at your response below, and unfortunately, we have to say that we
> > cannot accept it. At the least, we need some more information
> > exchange to make sure we understand each other well.
> >
> > Below, you write that the convention of inserting a space isn't
> > for linguistic separation, but for creating XML Schema lists.
> > This may be the intention of the spec-writers, but who guarantees
> > that this is how this will be used?
>
>Sorry, Martin, but I think you have completely missed the point here.

I may, or I may not. Given the complexity of the XSLT/XQuery specs,
and the fact that I'm dealing with a lot of other things (not to
speak about the rest of the I18N WG), it might not necessarily
come as a surprise.


>If an
>XML Schema declares the colors attribute as having type xs:NMTOKENS, and the
>typed value is the sequence ("red", "green", "blue"), then the correct
>lexical representation of this according to the rules in XML Schema is
>colors="red green blue". If you don't like that, you need to complain to the
>XML Schema WG.

There is no problem with that, if indeed these values are typed as
xs:NMTOKENS. But we strongly suspect that there is a problem if there
are some values that are just simple strings. The fact that simple
strings and text nodes are not treated in the same way, we suspect,
will often lead to confusion.


>The places where XSLT/XQuery use space as a default separator are all
>associated with converting a typed value to the string value of a node, and
>are therefore closely associated with this XML Schema convention for
>representing lists. Of course we can't totally control how the facility is
>used, but we do provide a string-join function that allows any separator to
>be used in the lexical representation of a sequence, so we are not imposing
>any constraints on users.

Would it be possible for you to write the following three examples:

- An example (such as above with "red", "green", "blue", but with the
   actual code) where these are e.g. NMTOKENS, and where the serialization
   with spaces makes sense.

- An example with e.g. strings used as intermediate text in a formatting-like
   operation (a la printf in C), where inserting spaces would happen,
   but would not be desired.

- The previous example with the above 'string-join' function used to
   avoid the problems with spaces.


Regards,    Martin.
RE: [Serial] I18N WG last call comments, Henry Zongaro (2004-06-09)

Hi, Martin.

     In [1], you wrote:

Martin Duerst wrote on 2004-05-24 05:31:53 AM:
> At 17:52 04/05/06 +0100, Michael Kay wrote:
> >The places where XSLT/XQuery use space as a default separator are all
> >associated with converting a typed value to the string value of a node, and
> >are therefore closely associated with this XML Schema convention for
> >representing lists. Of course we can't totally control how the facility is
> >used, but we do provide a string-join function that allows any separator to
> >be used in the lexical representation of a sequence, so we are not imposing
> >any constraints on users.
> 
> Would it be possible for you to write the following three examples:
> 
> - An example (such as above with "red", "green", "blue", but with the
>    actual code) where these are e.g. NMTOKENS, and where the serialization
>    with spaces makes sense.

Assume the following input document, where the type of the colors 
attribute is xs:NMTOKENS.

<elem colors="red   green  blue"/>

and the following stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:template match="/">
    <xsl:sequence select="data(elem/@colors)"/>
  </xsl:template>
</xsl:stylesheet>

The result of serialization will be the following external general parsed 
entity.

<?xml version="1.0" encoding="UTF-8"?>red green blue

That entity might be subsequently referenced in the content of an element 
that has the simple type xs:NMTOKENS.  If the PSVI that results is used to 
construct an instance of the XPath/XQuery Data Model, the typed value of 
the element would be a sequence of three values of type xs:NMTOKEN; 
without the spaces, the typed value would be a sequence of a single value 
of type xs:NMTOKEN:  "redgreenblue".


Compare that with the result of the following stylesheet, where the rules 
for evaluating an attribute value template (section 5.5 of the last call 
draft of XSLT 2.0) state that each atomized value in the sequence that 
results from evaluating each XPath expression will be converted to a 
string, and separated by a space:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:template match="/">
    <elem colors="{data(elem/@colors)}"/>
  </xsl:template>
</xsl:stylesheet>

Result:

<?xml version="1.0" encoding="UTF-8"?><elem colors="red green blue"/>

Again, if that serialized entity is assessed against a schema in which the 
colors attribute has type xs:NMTOKENS, the typed value of the attribute 
will be a sequence of three values of type xs:NMTOKEN.


Similarly, the result of the following stylesheet, where the rules for 
constructing complex content (section 5.6.1 of XSLT 2.0) describe how a 
text node is created from the sequence of atomic values that results from 
evaluating the xsl:sequence instruction:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:template match="/">
    <elem><xsl:sequence select="data(elem/@colors)"/></elem>
  </xsl:template>
</xsl:stylesheet>

Result:

<elem>red green blue</elem>

> - An example with e.g. strings used as intermediate text in a formatting-like
>    operation (a la printf in C), where inserting spaces would happen,
>    but would not be desired.

Is this the kind of example you're looking for?  I've used an XPath 
expression to perform a simple date formatting operation, constructing the 
result as a sequence of strings.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:hz="http://www.example.org"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="2.0"
                exclude-result-prefixes="hz xs">
  <xsl:function name="hz:format">
    <xsl:param name="date" as="xs:date"/>
    <xsl:param name="format" as="xs:string"/>

    <xsl:sequence
       select="
         for $c in
           (for $i in (1 to string-length($format))
            return substring($format, $i, 1))
         return
           if ($c = 'y') then
             get-year-from-date($date)
           else if ($c = 'd') then
             get-day-from-date($date)
           else if ($c = 'm') then
             get-month-from-date($date)
           else if ($c = 'M') then
             ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')
             [get-month-from-date($date)]
           else
             $c"/>
  </xsl:function>

  <xsl:template match="/">
    <doc>
      <v1>
       <xsl:sequence
         select="hz:format(xs:date('2004-12-21'), 'y-m-d')"/>
      </v1>
      <v2>
       <xsl:sequence
         select="hz:format(xs:date('2004-12-31'), 'M d, y')"/>
      </v2>
    </doc>
  </xsl:template>
</xsl:stylesheet>

This stylesheet will produce the following result, which is probably not 
what was intended.

<doc><v1>2004 - 12 - 21</v1><v2>Dec   31 ,   2004</v2></doc>

> - The previous example with the above 'string-join' function used to
>    avoid the problems with spaces.

If I change the definition of hz:format to add in a reference to 
string-join, specifying '' as the separator,

  <xsl:function name="hz:format">
    <xsl:param name="date" as="xs:date"/>
    <xsl:param name="format" as="xs:string"/>

    <xsl:sequence
      select="string-join(
                for $c in
                  (for $i in (1 to string-length($format))
                     return substring($format, $i, 1))
                return
                  ...
              , '')"/>
  </xsl:function>

the result will be:

<doc><v1>2004-12-21</v1><v2>Dec 31, 2004</v2></doc>

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0053.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
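
The difference between the two results above can be reproduced with a rough Python analogue of the date formatting example. This is a sketch only, not the stylesheet itself: format_items, serialize_default and string_join are hypothetical names, and the default serialization is assumed to be "string values separated by a single space".

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_items(d, fmt):
    """Build the sequence of strings, one item per format character."""
    items = []
    for c in fmt:
        if c == "y":
            items.append(str(d.year))
        elif c == "d":
            items.append(str(d.day))
        elif c == "m":
            items.append(str(d.month))
        elif c == "M":
            items.append(MONTHS[d.month - 1])
        else:
            items.append(c)  # literal character from the format string
    return items


def serialize_default(items):
    """Assumed default: adjacent atomic values separated by one space."""
    return " ".join(items)


def string_join(items, separator=""):
    """Explicit separator, in the spirit of string-join."""
    return separator.join(items)


items = format_items(date(2004, 12, 31), "M d, y")
print(serialize_default(items))  # Dec   31 ,   2004
print(string_join(items))        # Dec 31, 2004
```

The unwanted spaces come from treating every item, including the literal punctuation, as a separate value in the sequence; joining with an empty separator before serialization removes them.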
RE: [Serial] I18N WG last call comments, Martin Duerst (2004-10-26)

Hello Henry,

[this is a reply to
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0038.html]

We have looked at your code examples below in detail. The examples
you give look reasonable, but what we are concerned about
is cases where text is not put together programmatically,
but just concatenated, e.g. in an example such as

<p>Document creation date: <xsl:sequence
          select="hz:format(xs:date('2004-12-21'), 'y-m-d')"/>.</p>

Overall, I think that the convention of using a space between
strings, inherited from SGML NMTOKENS and IDREFS, should not be the
default in XQuery and XSLT to concatenate strings. Either there
should be a function, e.g. called stringify-tokens, to handle
cases such as "red green blue", which I guess would make the
first of your examples

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version="2.0">
   <xsl:template match="/">
     <elem colors="{stringify-tokens(data(elem/@colors))}"/>
   </xsl:template>
</xsl:stylesheet>

or alternatively, making sure that the data model can distinguish,
based on schema information, between tokens such as NMTOKENS and IDREFS
and plain strings that don't need spaces when concatenated.

Simply defining that all strings behave like tokens because some
strings are tokens doesn't seem to make sense at all.

Regards,   Martin.

Re: [Serial] I18N WG last call comments, Jonathan Robie (2004-10-26)

Martin Duerst wrote:

> Overall, I think that the convention of using a space between
> strings, inherited from SGML NMTOKENS and IDREFS, should not be the
> default in XQuery and XSLT to concatenate strings.

Hi Martin,

For concatenating strings, which is what the concat() function does, we 
do not insert anything. I think Henry has shown [1] that our string 
manipulation library is pretty good at allowing other delimiters to be 
inserted if needed.

Serializing a sequence of atomic values is not the same thing as 
"concatenating strings". The lexical representation of these atomic 
values is given by XML Schema, and the delimiters used are the 
delimiters used by XML Schema. The default for serializing a sequence of 
tokens defined by XML Schema pretty much has to be the format defined by 
XML Schema, or else XML processors won't be able to read serialized 
documents. So for serialization, I think your beef is with XML Schema.

Linguistic tokens and delimiters are not the same as computer language 
tokens and delimiters. In my opinion, the biggest problem occurs not 
when they differ, but when they are the same. That's why we have to 
invent conventions like camelCase or hyphenated-names to allow ourselves 
to create computer language tokens that consist of multiple linguistic 
tokens. XML Schema could have allowed users to create a sequence of 
string values that contain spaces, as in:

<sequenceOfRoads>Gibson Road, Main Street</sequenceOfRoads>

That would require XML Schema to allow an alternate delimiter to be 
specified. It doesn't. And it shouldn't - in XML, the best way to 
delimit individual items is to use markup:

<roads>
   <road>Gibson Road</road>
   <road>Main Street</road>
</roads>

As a markup language, XML exists for the sole purpose of clearly 
identifying data. Let's use it! The alternative is to use microparsing. 
But that's not how XML works, and XQuery is based on XML.

We support XML Schema, and that's what our serialization does by 
default. If you want a different serialization, you can use string 
manipulation to create whatever you want, but an XML Schema processor 
won't be able to recognize the tokens.

Jonathan
My opinion only. Not on behalf of anyone.
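
The contrast between whitespace-delimited lists and markup can be sketched in Python. This is an illustration under simple assumptions, not part of any specification: split_list_value and road_names are hypothetical helpers, and the road names are the ones used in the message above.

```python
import xml.etree.ElementTree as ET


def split_list_value(text):
    """Read a value as a whitespace-separated list of tokens."""
    return text.split()


def road_names(xml_text):
    """Read one item per <road> element, so items may contain spaces."""
    root = ET.fromstring(xml_text)
    return [road.text for road in root.findall("road")]


ROADS = "<roads><road>Gibson Road</road><road>Main Street</road></roads>"

# Whitespace splitting cannot tell where one road name ends.
print(split_list_value("Gibson Road Main Street"))
# ['Gibson', 'Road', 'Main', 'Street']

# Markup delimits each item unambiguously.
print(road_names(ROADS))
# ['Gibson Road', 'Main Street']
```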
Minutes from Oct. 26 Telcon!!!, Jonathan Robie (2004-11-01)
RE: [Serial] I18N WG last call comments, scott_boag@us.ibm.com (2004-11-08)

Hi Martin.  The joint working groups have discussed your objection to the 
resolution to qt-2004Feb0362-02.  The WG has decided to endorse Jonathan 
Robie's response [1] to your note.  We will leave the issue's status as 
"objected". 

-scott


[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Oct/0064.html

qt-2004Feb0362-03: [Serial] I18N WG last call comments [6]
[substantive, announced] 2004-07-13
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[6] Section 3, 'encoding': Given that this is already required for
   the XML output method, we think it's highly desirable to make
   the requirement for support for UTF-8 and UTF-16 general
   (including text).

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I can't think of any reason not to make this change.

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Re: [Serial] I18N WG last call comments [6], Henry Zongaro (2004-07-13)

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
working group:

<<
[6] Section 3, 'encoding': Given that this is already required for
   the XML output method, we think it's highly desirable to make
   the requirement for support for UTF-8 and UTF-16 general
   (including text).
>>

     Thanks to you and the I18N working group for this comment.

     The XSL and XML Query Working Groups discussed the working group's 
comment, and decided to accept the I18N working group's suggestion.  The 
serialization specification will be modified to require support for UTF-8 
and UTF-16 encodings for all the output methods defined by the 
specification - namely, the xml, xhtml, html and text output methods.

     As this is the change the I18N working group proposed, I believe the 
response should be acceptable to the working group.  May I ask you to 
confirm that it is?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-04: [Serial] I18N WG last call comments [7]
[substantive, announced] 2004-10-13
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

[7] Section 3, 'encoding': Here or for each individual output method,
   something should be said about the BOM. We think it should be
   the following:
   - XML/XHTML: UTF-16: required; UTF-8: may be used.
   - HTML/text: UTF-16: recommended; UTF-8: may be used.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I agree. In Saxon, I've added an extension attribute to control whether
a BOM should be emitted, and I think it would be a good idea to make
this a standard feature. The default should be yes for UTF-16, no for
UTF-8.

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Martin,

     In [1], you submitted the following comments on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[7] Section 3, 'encoding': Here or for each individual output method,
   something should be said about the BOM. We think it should be
   the following:
   - XML/XHTML: UTF-16: required; UTF-8: may be used.
   - HTML/text: UTF-16: recommended; UTF-8: may be used.

[8] Section 3, 'encoding': This should say that for UTF-16,
   endianness is implementation-dependent (or implementation-defined)
>>

     Thanks to you and the I18N Working Group for these comments.

     The XSL and XML Query Working Groups discussed the comments, and 
decided to add a byte-order-mark parameter to the Serialization 
specification to control whether a Byte Order Mark is written.  The actual 
byte order used is implementation-dependent.  If the concept of a Byte 
Order Mark does not make sense for the particular encoding selected, the 
byte-order-mark parameter is ignored.

     May I ask you to confirm that this response is acceptable to the 
working group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
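
The behaviour of the byte-order-mark parameter described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the specification's algorithm: the serialize function and its argument names are hypothetical, and the choice of little-endian for plain "utf-16" is arbitrary, standing in for the implementation-dependent byte order.

```python
import codecs

# Encodings for which a Byte Order Mark is meaningful in this sketch.
_BOMS = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-16-be": codecs.BOM_UTF16_BE,
}


def serialize(text, encoding="utf-8", byte_order_mark=False):
    """Encode text, writing a BOM only when requested and meaningful."""
    codec = codecs.lookup(encoding).name
    if codec == "utf-16":
        # Assumption: this hypothetical implementation picks little-endian.
        codec = "utf-16-le"
    data = text.encode(codec)
    bom = _BOMS.get(codec)
    if byte_order_mark and bom is not None:
        return bom + data
    # No BOM requested, or the concept does not apply: parameter ignored.
    return data


print(serialize("a", "utf-16", byte_order_mark=True))      # b'\xff\xfea\x00'
print(serialize("a", "utf-8", byte_order_mark=True))       # b'\xef\xbb\xbfa'
print(serialize("a", "iso-8859-1", byte_order_mark=True))  # b'a'
```

For an encoding such as ISO-8859-1 the parameter has no effect, which mirrors the statement that it is ignored when a Byte Order Mark does not make sense for the selected encoding.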
qt-2004Feb0362-05: [Serial] I18N WG last call comments [8]
[substantive, announced] 2004-10-13
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

[8] Section 3, 'encoding': This should say that for UTF-16,
   endianness is implementation-dependent (or implementation-defined)

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

Agreed.

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Martin,

     In [1], you submitted the following comments on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[7] Section 3, 'encoding': Here or for each individual output method,
   something should be said about the BOM. We think it should be
   the following:
   - XML/XHTML: UTF-16: required; UTF-8: may be used.
   - HTML/text: UTF-16: recommended; UTF-8: may be used.

[8] Section 3, 'encoding': This should say that for UTF-16,
   endianness is implementation-dependent (or implementation-defined)
>>

     Thanks to you and the I18N Working Group for these comments.

     The XSL and XML Query Working Groups discussed the comments, and 
decided to add a byte-order-mark parameter to the Serialization 
specification to control whether a Byte Order Mark is written.  The actual 
byte order used is implementation-dependent.  If the concept of a Byte 
Order Mark does not make sense for the particular encoding selected, the 
byte-order-mark parameter is ignored.

     May I ask you to confirm that this response is acceptable to the 
working group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-06: [Serial] I18N WG last call comments [9]
[substantive, announced] 2004-09-21
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

[9] Section 3, 'encoding': "If this parameter is not specified, and
   the output method does not specify any additional requirements,
   the encoding used is implementation defined."
   This should be more specific. In the absence of an 'encoding'
   parameter, information e.g. given to an implementation via an
   option, and specific information for a particular 'host language'
   (e.g. other than XQuery or XSLT), there should be a default of
   UTF-8.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

Off-hand, I don't see any objection to this except that it might give
some vendors a backwards compatibility problem.

Re: [Serial] I18N WG last call comments [9], Henry Zongaro (2004-09-21)

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[9] Section 3, 'encoding': "If this parameter is not specified, and
   the output method does not specify any additional requirements,
   the encoding used is implementation defined."
   This should be more specific. In the absence of an 'encoding'
   parameter, information e.g. given to an implementation via an
   option, and specific information for a particular 'host language'
   (e.g. other than XQuery or XSLT), there should be a default of
   UTF-8.
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed your comment.

     In response to another last call issue, the encoding parameter is no 
longer optional.  Therefore, any host specification is obliged to specify 
how the value of the parameter is determined.  This is reflected in the 23 
July Working Draft of Serialization.[2] There is no longer a need for any 
change to Serialization.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-07: [Serial] I18N WG last call comments [first comment 12]
[substantive, announced] 2004-11-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

[12] [Should be "10". HZ]
   Section 3, 'escape-uri-attributes' (and other places in this spec):
   RFC 2396, section 2.4.1, only specifies how to escape a string of
   bytes in an URI, and cannot directly be applied to a string of
   (Unicode) characters. In accordance with the IRI draft and many
   other W3C specifications, this must be specified to use UTF-8
   first and then use RFC 2396, section 2.4.1 (%-escaping).

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

Agreed.

Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
working group:

<<
[12] [Should be "10". HZ]
   Section 3, 'escape-uri-attributes' (and other places in this spec):
   RFC 2396, section 2.4.1, only specifies how to escape a string of
   bytes in an URI, and cannot directly be applied to a string of
   (Unicode) characters. In accordance with the IRI draft and many
   other W3C specifications, this must be specified to use UTF-8
   first and then use RFC 2396, section 2.4.1 (%-escaping).
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed the comment, and noted that Section 
16.1 of XSLT 1.0 [2] relied upon Appendix B.2.1 of HTML 4.0 [3] for the 
normative definition of URI escaping.  The working groups also noted that 
some specifications have duplicated the description of URI escaping, while 
still others have relied on diverse references for the normative 
definition of the URI escaping algorithm.  In particular, the working 
groups noted that Section 3.2.17 of the PER of XML Schema:  Datatypes 2nd 
ed. [4] refers to Section 5.4 of XML Linking Language [5] for the 
normative definition of URI escaping.

     The working groups decided to follow the lead of the XML Schema 
Working Group, and adopted the following changes:

. In Section 6 of Serialization, sixth bullet, change
<<
escape non-ASCII characters in URI attribute values using the method 
recommended in Section 2.4.1 of [RFC2396].
>>

to

<<
escape non-ASCII characters in URI attribute values using the method 
defined by Section 5.4 Locator Attribute of [XML Linking Language], except 
that relative URIs must not be absolutized.
>>

. In Section 7.2 of Serialization, third paragraph change
<<
escape non-ASCII characters in URI attribute values using the method 
recommended in [RFC2396] (section 2.4.1).
>>

to

<<
escape non-ASCII characters in URI attribute values using the method 
defined by Section 5.4 Locator Attribute of [XML Linking Language], except 
that relative URIs must not be absolutized.
>>

     May I ask you to confirm that this response is acceptable to the 
working group?  If not, we would ask the I18N working group to suggest the 
most appropriate normative reference for URI escaping that should be used 
by all new W3C Recommendations.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] http://www.w3.org/TR/xslt#section-HTML-Output-Method
[3] http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
[4] http://www.w3.org/TR/2004/PER-xmlschema-2-20040318/#anyURI
[5] http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
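
The escaping discussed above (encode as UTF-8 first, then %-escape each byte) can be sketched in Python. This is a simplified illustration, not the normative algorithm: escape_non_ascii is a hypothetical helper, it escapes only non-ASCII characters, and it deliberately leaves relative URIs relative. Any other characters that a normative definition disallows are left untouched here.

```python
def escape_non_ascii(uri):
    """Percent-escape each non-ASCII character via its UTF-8 bytes."""
    out = []
    for ch in uri:
        if ord(ch) < 128:
            out.append(ch)  # ASCII characters are copied unchanged
        else:
            # One %HH triplet per UTF-8 byte of the character.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# The relative URI is not absolutized; only the two accented letters change.
print(escape_non_ascii("docs/résumé.html#top"))
# docs/r%C3%A9sum%C3%A9.html#top
```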
Hi Martin.  This is another gentle reminder that we are waiting for a 
confirmation for the resolution of this issue so that we might close it.
qt-2004Feb0362-08: [Serial] I18N WG last call comments [11]
[substantive, announced] 2004-10-26
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

[11] Section 3, 'include-content-type': Why is this parameter needed?
   It seems that it may be better to always include a <meta> element.
   Please remove the parameter or tell us when/why it's necessary to
   not have a <meta> element

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

This parameter has been requested by users a number of times, but the
situations that justify it are difficult to describe concisely. The
simplest case is where the user wants to output the meta element "by
hand", to give greater control. The other cases I've seen are where the
encoding isn't known until after subsequent stages in the processing
pipeline.


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[11] Section 3, 'include-content-type': Why is this parameter needed?
   It seems that it may be better to always include a <meta> element.
   Please remove the parameter or tell us when/why it's necessary to
   not have a <meta> element
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed the comment, and noted that there are 
many situations in which users have found there to be a need for the 
include-content-type parameter.  A user might not want the serialization 
process to produce a META element because some post-processing phase will 
be responsible for creating that element or because the sequence that is 
input to serialization already contains such a META element that the user 
would like the serialization process to preserve.  Users sometimes find it 
necessary to do this in order to work around bugs in web server software.

     The working group decided that no change to the Serialization draft 
was necessary.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
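
To illustrate the behaviour discussed in this thread, here is a minimal
Python sketch of how a serializer might honour include-content-type for
the html output method.  The function name apply_include_content_type,
its default arguments and the use of xml.etree.ElementTree are
assumptions made for this illustration only; nothing here is taken from
the Serialization draft or from any real processor.

```python
import xml.etree.ElementTree as ET


def apply_include_content_type(html_root, include_content_type,
                               media_type="text/html", encoding="UTF-8"):
    """Hypothetical helper: add a Content-Type META element only on request.

    With include_content_type == "no" the tree is left untouched, so a META
    element already present in the input, or one added by a later
    post-processing stage, is not duplicated.
    """
    head = html_root.find("head")
    if include_content_type != "yes" or head is None:
        return html_root
    meta = ET.Element("meta", {
        "http-equiv": "Content-Type",
        "content": f"{media_type}; charset={encoding}",
    })
    # Place the element first in HEAD, ahead of any existing children.
    head.insert(0, meta)
    return html_root


doc = ET.fromstring("<html><head><title>t</title></head><body/></html>")
apply_include_content_type(doc, "yes")
print(ET.tostring(doc, encoding="unicode"))
```

Calling the helper with "no" returns the tree unchanged, which covers
the cases Michael Kay and Henry Zongaro describe: a hand-written META
element, or one supplied later in the pipeline.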

Hello Henry,

The I18N WG has looked at the response below.

We looked at two documents:
       http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/
       http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/
The LC version says 'value is implementation defined'.
The new WD doesn't even say that (or we haven't found it).
Your email below is written as if it was 'yes' by default.
We think that (default 'yes') would be the right thing to do.
If the spec indeed uses 'yes' as the default, please send
us a pointer to the place where it does. If not, we would not
be satisfied with this resolution.

Regards,    Martin.

qt-2004Feb0362-09: [Serial] I18N WG last call comments [Second comment 12]
[substantive, announced] 2004-09-21
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[12] The description of 'media-type' is confusing. Does it change
   something in the output, or only in the way the output is labelled?
   Does it affect the <meta>, if output? Can it affect other things,
   e.g. a Content-Type header in HTTP? This should be clarified.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

You're not the only one who's confused. It's often used by
transformation servlets to set the HTTP headers, but as far as the
serializer itself is concerned, it's documentary.


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[12] The description of 'media-type' is confusing. Does it change
   something in the output, or only in the way the output is labelled?
   Does it affect the <meta>, if output? Can it affect other things,
   e.g. a Content-Type header in HTTP? This should be clarified.
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed the comment.

     Yes, the setting of the media-type parameter may affect the sequence 
of octets that is the result of serialization using the html and xhtml 
output methods, as is clearly indicated in [2] and [3].

     In addition, the serializer may use the parameter to influence things 
that are outside of the scope of this specification, such as an HTTP 
header.  To make this clear, the following will be added to the 
description of the media-type parameter in the table in Section 3 of 
Serialization:

<<
    If the destination of the serialized output is annotated with
    a media type, this parameter may be used to provide such an
    annotation.  For example, it may be used to set the media type
    in an HTTP header.
>>

     Finally, the reference to RFC 2376 that currently appears in the 
description of the media-type parameter is not appropriate; that RFC 
defines XML media types.  The reference will be changed to RFC 2046[4] 
"Multipurpose Internet Mail Extensions (MIME) Part Two:  Media Types."

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#N10F34 
[3] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#xhtml-output
[4] ftp://ftp.rfc-editor.org/in-notes/rfc2046.txt
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
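
As an illustration of the added sentence about annotating the
destination with a media type, the following Python sketch shows how a
host environment (for example a transformation servlet) might derive an
HTTP Content-Type header from the serialization parameters.  The
function name content_type_header, the dictionary layout and the
fallback value "text/html" are assumptions for this example; this
behaviour is outside the scope of the Serialization specification.

```python
def content_type_header(params):
    """Hypothetical helper: build an HTTP header from serialization parameters.

    `params` is assumed to be a plain dict keyed by parameter name.
    """
    media_type = params.get("media-type", "text/html")  # assumed fallback
    encoding = params.get("encoding")
    if encoding is None:
        value = media_type
    else:
        value = f"{media_type}; charset={encoding}"
    return ("Content-Type", value)


print(content_type_header({"media-type": "application/xhtml+xml",
                           "encoding": "UTF-8"}))
```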
qt-2004Feb0362-10: [Serial] I18N WG last call comments [13]
[substantive, announced] 2004-03-28
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[13] Section 3, 'normalize-unicode': Using Normalization Form C is
   the right thing, but XML 1.1, in accordance with the Character
   Model, defines some additional start conditions in some cases.
   How are these guaranteed (e.g. by adding an initial space if
   necessary)? If there is no such guarantee, there should at least
   be a warning, but a guarantee is highly preferable.

Regards,    Martin.

Fw: [Serial] I18N WG last call comments, Henry Zongaro (2004-03-28)



Martin,

     In [1], you submitted the following comment on the Serialization last
call draft.

Martin Duerst wrote on 2004-02-15 12:37:30 PM:
> [13] Section 3, 'normalize-unicode': Using Normalization Form C is
>    the right thing, but XML 1.1, in accordance with the Character
>    Model, defines some additional start conditions in some cases.
>    How are these guaranteed (e.g. by adding an initial space if
>    necessary)? If there is no such guarantee, there should at least
>    be a warning, but a guarantee is highly preferable.

     Our thanks to you and the I18N WG for submitting this comment.

     The XSL and XQuery Working Groups discussed the comment and related
comments, and decided to make the following changes to the
normalize-unicode parameter:

1. Rename the parameter to "normalization-form".

2. The possible values of the parameter will be "NFC", "NFD", "NFKC",
"NFKD", "fully-normalized", "none" or an implementation-defined
normalization form.  The default value is "none".  We will also add a note
advising of the interoperability problems that can arise by using anything
other than NFC.

3. All of "NFC", "NFD", "NFKC", "NFKD", "fully-normalized", "none" and any
implementation-defined value are permitted for the xml, xhtml and text
output methods.  The values "NFC", "fully-normalized" and "none" must be
supported by an implementation for these output methods.

4. The normalization-form parameter is permitted to have the values "NFC",
"NFD", "NFKC", "NFKD", "none" or an implementation-defined value if the
output method is "html".  The values "NFC" and "none" must be supported
for the html output method.  The value "fully-normalized" is not permitted
if the output method is "html".

5. In the case of "fully-normalized", the normalization is the same as for
NFC, but the processor must signal a serialization error if any of the
"relevant constructs" of the result would begin with a combining
character.

     We believe that item 5 on this list addresses the particular concern
raised in the comment, that guarantees should be provided that the start
conditions of the Character Model are not violated.

     May I ask you to confirm that this is an acceptable response to the
I18N WG's comment?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
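
The parameter values listed in this response can be sketched in Python
with the standard unicodedata module.  This is a simplified
illustration, not the specified algorithm: the names normalize_text and
SerializationError are invented here, each input string is treated as a
single "relevant construct", and "combining character" is approximated
by a non-zero canonical combining class.

```python
import unicodedata

UNICODE_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


class SerializationError(Exception):
    """Hypothetical error type standing in for a serialization error."""


def normalize_text(text, normalization_form="none"):
    """Apply the requested normalization to one piece of character data."""
    if normalization_form == "none":
        return text
    if normalization_form in UNICODE_FORMS:
        return unicodedata.normalize(normalization_form, text)
    if normalization_form == "fully-normalized":
        # Same as NFC, plus a check on the first character of the construct.
        result = unicodedata.normalize("NFC", text)
        if result and unicodedata.combining(result[0]) != 0:
            raise SerializationError(
                "construct begins with a combining character")
        return result
    raise ValueError(f"unsupported normalization form: {normalization_form}")


print(normalize_text("e\u0301", "NFC") == "\u00e9")
```

A string such as U+0300 followed by "a" passes through "NFC" unchanged
but is rejected under "fully-normalized", which is the guarantee item 5
is meant to provide.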

qt-2004Feb0362-11: [Serial] I18N WG last call comments [14]
[substantive, announced] 2004-08-31
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[14] Section 3, four phases of serialization: Character expansion
   comes before Encoding, but encoding depends on character
   expansion (using numeric character references for characters
   that don't exist in a certain encoding). This has to be
   sorted out very carefully and explained in detail, ideally
   with examples. There's also an interaction between mapping and
   normalization.  If there's a mapping combining grave->&#x300;,
   normalization must be aware that &#x300; is not an ASCII string!

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

You are probably right that we need to analyze and explain the
interactions between the different options better than we do at the
moment.

Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[14] Section 3, four phases of serialization: Character expansion
   comes before Encoding, but encoding depends on character
   expansion (using numeric character references for characters
   that don't exist in a certain encoding). This has to be
   sorted out very carefully and explained in detail, ideally
   with examples. There's also an interaction between mapping and
   normalization.  If there's a mapping combining grave->&#x300;,
   normalization must be aware that &#x300; is not an ASCII string!
>>

     Thanks to you and the I18N Working Group for this comment.  The XSL 
and XML Query Working Groups discussed the comment, and decided, because 
of the interactions between Unicode normalization and creation of 
character references, to fold together character expansion and Unicode 
normalization.  In addition, the working groups decided to add creation of 
character references to the character expansion phase, because it had not 
been explicitly mentioned as part of that phase.

     Specifically, the working groups decided to replace the second and 
third bullets of Section 4 of Serialization with 
the following text:

<<
2. Character expansion is concerned with the representation of
   characters appearing in text and attribute nodes in the
   instance of the data model. The substitution processes that
   may apply are listed below, in priority order: a character
   that is handled by one process in this list will be
   unaffected by processes appearing later in the list, except
   that a character affected by Unicode normalization may be
   affected by creation of CDATA sections or by character
   escaping

   o URI escaping (in the case of URI-valued attributes in the
     HTML and XHTML output methods), as determined by the
     escape-uri-attributes parameter

   o Character mapping, as determined by the use-character-maps
     parameter.  Text nodes that are children of elements
     specified by the cdata-section-elements parameter are not
     affected by this step. 

   o Unicode Normalization, if requested by the
     normalization-form parameter. Unicode normalization is
     applied to the character stream that results after all
     markup generation and character expansion has taken place.

     For the definitions of the various normalization forms,
     see [Character Model for the World Wide Web 1.0]

     The meanings associated with the possible values of the
     normalization-form parameter are as follows:

     o NFC specifies the serialized result should be in Unicode
       Normalization Form C.

     o NFD specifies the serialized result should be in Unicode
       Normalization Form D.

     o NFKC specifies the serialized result should be in Unicode
       Normalization Form KC.

     o NFKD specifies the serialized result should be in Unicode
       Normalization Form KD.

     o fully-normalized specifies the serialized result should
       be in fully normalized form.

     o none specifies that no Unicode normalization should be
       applied.

     o An implementation-defined value has an implementation-
       defined effect.

   o Creation of CDATA sections, as determined by the
     cdata-section-elements parameter. Note that this is also
     affected by the encoding parameter, in that characters not
     present in the selected encoding cannot be represented in
     a CDATA section.

   o Escaping according to XML or HTML rules of special
     characters and of characters that cannot be represented in
     the selected encoding.  For example replacing < by &lt;.
>>

     The Unicode Normalization phase becomes the third step of character 
expansion.  Character mapping becomes the second step, with the 
clarification that it does not affect elements to which 
cdata-section-elements applies.  This was done to make it clear that any 
characters affected by character mapping are not affected by Unicode 
Normalization.  The lead-in to the bulleted list will be modified so that 
CDATA section creation and escaping still apply to characters affected by 
Unicode Normalization - this is a consequence of trying to fold the two 
together.  Finally, the last bullet will be modified to make it clear that 
not only special characters, but characters that can't be represented in 
the selected encoding are affected by that final step.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
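
The priority order in the replacement text can be sketched as follows.
This Python fragment is one possible reading, offered only as an
illustration: it covers character mapping, Unicode normalization and
escaping for a single text node, and omits URI escaping and CDATA
sections.  The names expand_text and escape_char, and the choice to
normalize each run of unmapped characters separately, are assumptions
of this sketch rather than anything the working groups specified.

```python
import unicodedata


def escape_char(ch, encoding):
    """Escape markup-significant characters and unencodable characters."""
    if ch == "<":
        return "&lt;"
    if ch == ">":
        return "&gt;"
    if ch == "&":
        return "&amp;"
    try:
        ch.encode(encoding)
        return ch
    except UnicodeEncodeError:
        return f"&#x{ord(ch):X};"


def expand_text(text, char_map, normalization_form, encoding):
    """Character expansion for one text node, in priority order."""
    out = []
    run = []  # unmapped characters awaiting normalization and escaping

    def flush():
        if not run:
            return
        chunk = "".join(run)
        if normalization_form != "none":
            chunk = unicodedata.normalize(normalization_form, chunk)
        out.extend(escape_char(ch, encoding) for ch in chunk)
        run.clear()

    for ch in text:
        if ch in char_map:
            # A mapped character is emitted as-is and skips later steps.
            flush()
            out.append(char_map[ch])
        else:
            run.append(ch)
    flush()
    return "".join(out)


print(expand_text("a\u0300<\u00e9", {"\u0300": "&#x300;"}, "NFC", "ascii"))
```

With the combining grave accent mapped to the string &#x300;, the
mapped output is neither normalized nor escaped again, while the
remaining characters are normalized and then escaped for the target
encoding, which is the interaction raised in comment [14].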
qt-2004Feb0362-12: [Serial] I18N WG last call comments [15]
[substantive, announced] 2004-04-28
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[15] Section 4, "To anticipate the proposed changes to end-of-line
   handling in XML 1.1, implementations may also output the characters
   x85 and x2028 as character references. This will not affect the way
   they are interpreted by an XML 1.0 parser.": XML 1.1 is now a REC,
   so this is no longer anticipated. See
   http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-line-ends

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

Yes. Now that XML+NS 1.1 is at Rec status, I think the WGs need to take
a fresh top-level look at our policy towards them; serialization is just
one aspect of this.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-28)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization:

> [15] Section 4, "To anticipate the proposed changes to end-of-line
>    handling in XML 1.1, implementations may also output the characters
>    x85 and x2028 as character references. This will not affect the way
>    they are interpreted by an XML 1.0 parser.": XML 1.1 is now a REC,
>    so this is no longer anticipated. See
>    http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-line-ends

     Thanks to Martin and the working group for this comment.

     The XSL and XML Query Working Groups discussed the comment, and 
agreed that the Serialization specification should be amended so that it 
no longer refers to XML 1.1 as if it were not yet a recommendation. 
Furthermore, the working groups decided that the handling of x85 and x2028 
should be such that they can be successfully processed by either an XML 
1.0 or an XML 1.1 processor without being normalized to a line-feed 
character, even if the value of the version parameter is 1.0.  Following 
are the changes required to implement that change:

Replace the paragraph after the bulleted list in Section 4 with the 
following:

<<
A consequence of this rule is that certain whitespace characters must be
output as character references, to ensure that they survive the round
trip through serialization and parsing. Specifically, CR, NEL and LINE
SEPARATOR characters in text nodes must be output respectively as &#xD;,
&#x85;, and &#x2028;, or their equivalents; while CR, NL, TAB, NEL and
LINE SEPARATOR characters in attribute nodes must be output respectively
as &#xD;, &#xA;, &#x9;, &#x85;, and &#x2028;, or their equivalents
>>

And replace the note following the bulleted list with the following note:

<<
Note:  XML 1.0 did not permit processors to normalize NEL or LINE
SEPARATOR characters to a LINE FEED character.  However, if a document
entity that specifies version 1.1 invokes an external general parsed
entity with no TextDecl or a TextDecl that specifies a version of 1.0,
the external parsed entity is processed according to the rules of XML
1.1.  For this reason, NEL and LINE SEPARATOR characters in text and
attribute nodes must always be escaped using character references or
CDATA sections, regardless of the value of the version parameter.
>>

     May I ask the working group to confirm that this response is 
acceptable to it?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
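
The whitespace rule in the replacement paragraph can be illustrated
with a short Python sketch.  The table and function names are invented
for this example, and only the whitespace characters named in the
proposed text are handled; escaping of characters such as < and & is
deliberately left out.

```python
# Characters that must be written as character references in text nodes.
TEXT_ESCAPES = {
    "\r": "&#xD;",         # CR
    "\u0085": "&#x85;",    # NEL
    "\u2028": "&#x2028;",  # LINE SEPARATOR
}

# Attribute values additionally need NL and TAB written as references.
ATTR_ESCAPES = {**TEXT_ESCAPES, "\n": "&#xA;", "\t": "&#x9;"}


def escape_whitespace(value, in_attribute=False):
    """Replace round-trip-sensitive whitespace with character references."""
    table = ATTR_ESCAPES if in_attribute else TEXT_ESCAPES
    return "".join(table.get(ch, ch) for ch in value)


print(escape_whitespace("a\r\n\u0085b"))
print(escape_whitespace("a\tb\n", in_attribute=True))
```

The same tables apply whatever the value of the version parameter,
which matches the note that NEL and LINE SEPARATOR must always be
escaped.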
qt-2004Feb0362-13: [Serial] I18N WG last call comments [16]
[substantive, decided] 2004-11-01
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[16] Section 4.2 (XML output method, encoding): "If no encoding parameter
   is specified, then the processor must use either UTF-8 or UTF-16.":
   It may be desirable to further narrow this to UTF-8 for higher
   predictability. On the other hand, this should not say
   "If no encoding parameter is specified", but "If no encoding
   is specified (either with an encoding parameter or externally)"
   to allow e.g. specification of encoding with an option.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

On the first point: yes, perhaps.

On the second, the serializer is driven by a set of parameters. I think
that by the time the serializer is invoked, the parameter values have
been fully computed, regardless where they came from, so the
serialization spec does not need to discuss different ways of supplying
the parameters.

RE: [Serial] I18N WG last call comments, François Yergeau (2004-02-18)

Michael Kay wrote:
> On the second, the serializer is driven by a set of parameters. I think
> that by the time the serializer is invoked, the parameter values have
> been fully computed, regardless where they came from, so the
> serialization spec does not need to discuss different ways of supplying
> the parameters.

If the parameters are the only way to influence serialization behaviour,
then this should be clarified.  Section 3 now starts "There are a number
of parameters that influence...", which doesn't seem to claim to
exhaustiveness.


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[16] Section 4.2 (XML output method, encoding): "If no encoding parameter
   is specified, then the processor must use either UTF-8 or UTF-16.":
   It may be desirable to further narrow this to UTF-8 for higher
   predictability. On the other hand, this should not say
   "If no encoding parameter is specified", but "If no encoding
   is specified (either with an encoding parameter or externally)"
   to allow e.g. specification of encoding with an option.
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed your comment.

     Regarding the first point: in response to other Last Call comments, 
Serialization no longer specifies default values for parameters.  This is 
reflected in the 23 July Working Draft of Serialization.[2]  XSLT and 
XQuery now specify how the value of the encoding parameter is determined 
in all circumstances, so no change to the Serialization specification is 
required in response to that part of the comment.

     Regarding the second point:  again, all the serialization parameters 
are fully determined by whatever mechanisms are provided by the host 
specification.  Beyond that, serialization has implementation-dependent 
and implementation-defined aspects, so it should be clear that not all of 
a serializer's behaviour is governed by the settings of the parameters. 
The working groups feel no change to the Serialization specification is 
required in this regard.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Hello Henry,

The I18N WG (Core TF) has looked at your reply. We are not satisfied
with your answer.

On a procedural point, we would like to point out that moving the defaults
elsewhere makes them very difficult to check, and risks that agreements
between WGs are forgotten. In particular, it is not sufficient to just
close a comment on the specification against which it was made; it should
be transferred to the other specification(s) where it now applies.

On the actual issue, our main concern is to make sure that defaults
are actually specified appropriately.
In XQuery, the default seems to be 'implementation-defined':
    http://www.w3.org/TR/2004/WD-xquery-20040723/#id-xq-serialization-parameters
We are not at all convinced that this will lead to the necessary
degree of interoperability.
For XSLT, there is no new public WD. A pointer to or explanation
of the current solution for this issue for XSLT would be appreciated.
Without having a look at it, we cannot assess whether we are satisfied
with the resolution to our comment.

We would also like to mention that while there may be specific considerations
for each specification, using the same defaults where possible will make
things easier for users, and will lead to better overall interoperability.

Regards,    Martin.


Re: [Serial] I18N WG last call comments [16], Jonathan Robie (2004-10-26)

Martin Duerst wrote:

> In XQuery, the default seems to be 'implementation-defined':
> 
> http://www.w3.org/TR/2004/WD-xquery-20040723/#id-xq-serialization-parameters
> 
> 
> We are not at all convinced that this will lead to the necessary 
> degree of interoperability.

He was referring to this earlier comment:

>> [16] Section 4.2 (XML output method, encoding): "If no encoding
>> parameter is specified, then the processor must use either UTF-8 or
>> UTF-16.": It may be desirable to further narrow this to UTF-8 for
>> higher predictability. On the other hand, this should not say "If
>> no encoding parameter is specified", but "If no encoding is
>> specified (either with an encoding parameter or externally)" to
>> allow e.g. specification of encoding with an option.

Hi Martin,

According to the XML Spec:

> All XML processors MUST accept the UTF-8 and UTF-16 encodings of
> Unicode 3.1 [Unicode3]; the mechanisms for signaling which of the two
> is in use, or for bringing other encodings into play, are discussed
> later, in 4.3.3 Character Encoding in Entities.

Serialization produces XML for XML processors. Since all XML processors 
are required to accept the encodings that XQuery serialization is 
allowed to produce, the distinction between the two encodings should not 
make a difference unless an XML processor fails to implement the XML 
specification.

Are you suggesting that XML processors should not be required to accept 
both encodings? It's true that supporting both encodings complicates 
implementations, especially when the various normalizations are taken 
into account. But in XML, I think that's a done deal, and I think that 
we incurred this complication largely at the urging of the I18N community.

Jonathan
Not on behalf of anybody.
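
The point about UTF-8 and UTF-16 can be checked with a small Python
sketch using the standard library's expat-based parser.  The serialize
helper is hypothetical and does no escaping; it exists only to produce
the same document in two encodings.

```python
import xml.etree.ElementTree as ET


def serialize(text, encoding):
    """Hypothetical helper: emit a tiny document in the given encoding."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    return (declaration + f"<doc>{text}</doc>").encode(encoding)


utf8_bytes = serialize("caf\u00e9", "UTF-8")
utf16_bytes = serialize("caf\u00e9", "UTF-16")  # Python adds a byte order mark

# The octet sequences differ, but a conforming parser recovers the same text.
print(utf8_bytes != utf16_bytes)
print(ET.fromstring(utf8_bytes).text == ET.fromstring(utf16_bytes).text)
```

This shows only that both encodings parse to the same content; it does
not address the I18N WG's separate concern about the predictability of
an implementation-defined default.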
qt-2004Feb0362-14: [Serial] I18N WG last call comments [17]
[substantive, acknowledged] 2004-05-05
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[17] Section 4.2 (XML output method, encoding): "When outputting a newline
   character in the data model, the implementation is free to represent
   it using any character sequence that will be normalized to a newline
   character by an XML parser,...": This should probably say that
   for interoperability, it is better to avoid x85 and x2028.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I don't see a specific need to say that: if you're generating XML 1.0
then you need to avoid these characters and if you're generating XML 1.1
then you don't. This seems to be covered by the statement as written.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

> [17] Section 4.2 (XML output method, encoding): "When outputting a newline
>    character in the data model, the implementation is free to represent
>    it using any character sequence that will be normalized to a newline
>    character by an XML parser,...": This should probably say that
>    for interoperability, it is better to avoid x85 and x2028.

     In [2], Michael Kay responded:

> I don't see a specific need to say that: if you're generating XML 1.0
> then you need to avoid these characters and if you're generating XML 1.1
> then you don't. This seems to be covered by the statement as written.

     Thanks to Martin and the I18N Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment, and agreed 
with Michael Kay that the statement regarding the representation of 
newline characters in the serialized document was correct as written, and 
that no change is required.

     May I ask the I18N Working Group to confirm that this response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-05-05)

Hello Henry,

Sorry for the delay in replying to your mails.

The I18N WG (Core TF) has looked at your response,
and we are glad to tell you that it is acceptable for us.

Regards,    Martin.

qt-2004Feb0362-15: [Serial] I18N WG last call comments [18]
[substantive, acknowledged] 2004-06-01
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[18] Section 4.5 (XML output method, omit-xml-declaration): "The
   omit-xml-declaration parameter must be ignored if the standalone
   parameter is present, or if the encoding parameter specifies a
   value other than UTF-8 or UTF-16.": This disallows producing
   XML other than UTF-8 or UTF-16 without an xml declaration even
   though this is legal e.g. if served over HTTP with a corresponding
   charset parameter. We are not sure this is intended, and we
   are not sure this is a good thing. On the other hand,
   omit-xml-declaration must also be ignored if version is not 1.0.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

This rule overriding omit-xml-declaration has proved controversial with
some users, usually because they want to output fragments of XML that
they can concatenate into a single file. We should review it. On the
other hand, users do complain if the serializer produces output that an
XML parser then rejects.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

> [18] Section 4.5 (XML output method, omit-xml-declaration): "The
>    omit-xml-declaration parameter must be ignored if the standalone
>    parameter is present, or if the encoding parameter specifies a
>    value other than UTF-8 or UTF-16.": This disallows producing
>    XML other than UTF-8 or UTF-16 without an xml declaration even
>    though this is legal e.g. if served over HTTP with a corresponding
>    charset parameter. We are not sure this is intended, and we
>    are not sure this is a good thing. On the other hand,
>    omit-xml-declaration must also be ignored if version is not 1.0.

     Thanks to Martin and the I18N Working Group for this comment.

     The XSL and XQuery Working groups discussed this comment.

     Regarding the second point, although XML 1.1 requires a document 
entity to have an XML declaration, it does not require an external general 
parsed entity to have a text declaration.  The setting of the 
omit-xml-declaration parameter could still be meaningful, even if the 
version parameter has a value other than 1.0.

     Regarding the first point, as originally written, XML 1.0 required an 
XML declaration or a text declaration if the encoding of the document or 
external general parsed entity was anything other than UTF-8 or UTF-16. 
XSLT 1.0 enforced that requirement in its serialization mechanism.  The 
draft of Serialization inherited that behaviour from XSLT 1.0.  However, 
an erratum to XML 1.0 removed that requirement.

     In response to both points, the working groups decided that the 
Serialization specification should permit an XML declaration or text 
declaration to be omitted in precisely those circumstances in which it can 
be omitted according to XML 1.0 and XML 1.1.

     In particular, the working groups decided that if the serialized 
result could be considered to be the text declaration of an external 
general parsed entity, the omit-xml-declaration parameter could have the 
value yes or the value no, and the parameter's setting would take effect. 
They further decided that if the serialized result could only be 
considered to be a document entity because

  o the standalone parameter had the value yes or no; or
  o the version parameter had a value other than 1.0 and the
    doctype-system parameter was supplied

the omit-xml-declaration parameter must have the value no.  Otherwise, a 
serialization error results.  A host language would, of course, have the 
option of ensuring such conflicts never arise through whatever 
language-specific mechanism it uses to specify serialization parameters.
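     The decision above can be sketched as a parameter check. This is a 
minimal illustration under stated assumptions, not the wording of the 
specification: the function name, the exception name and the use of None for 
"parameter not supplied" are all hypothetical.

```python
class SerializationError(Exception):
    """Conflict between serialization parameters (hypothetical name)."""


def emit_xml_declaration(omit, standalone=None, version="1.0",
                         doctype_system=None):
    """Return True if an XML declaration should be written.

    omit           -- value of omit-xml-declaration, "yes" or "no"
    standalone     -- "yes", "no", or None when not supplied
    version        -- value of the version parameter
    doctype_system -- system identifier, or None when not supplied
    """
    # The result can only be a document entity in these two cases.
    only_document_entity = (
        standalone in ("yes", "no")
        or (version != "1.0" and doctype_system is not None)
    )
    if only_document_entity and omit != "no":
        raise SerializationError(
            "omit-xml-declaration must be 'no' for this combination"
        )
    # Otherwise the parameter simply takes effect.
    return omit == "no"
```

A host language could run an equivalent check when the parameters are set, 
so that the conflict never reaches the serializer.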

     May I ask the working group to confirm that that response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-05-05)

Hello Henry,

The I18N WG (Core TF) has looked at your response.
We can confirm that we are okay with your solution under the
assumption that the default is still the same (i.e.
omit-xml-declaration='no', so that an XML declaration is
included by default).

Regards,    Martin.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-05-05)

Hello, Martin,

     Regarding the response to the I18N WG's comment number [18], you 
wrote:

Martin Duerst <duerst@w3.org> wrote on 2004-05-05 04:12:39 AM:
> The I18N WG (Core TF) has looked at your response.
> We can confirm that we are okay with your solution under the
> assumption that the default is still the same (i.e.
> omit-xml-declaration='no', so that an XML declaration is
> included by default).

     In response to another last call comment, default settings for 
parameters to serialization will be determined by the process that sets 
those parameters.  The particular default settings specified by XSLT 2.0 
and XQuery 1.0 have not changed, however.  In particular, XSLT 2.0 
specifies a default value of no for the value of the omit-xml-declaration 
parameter, while XQuery 1.0 specifies a default value of yes.
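     The division of responsibility described here can be sketched as a 
lookup performed by the host language before serialization starts. The table 
keys and the function name are hypothetical, and the two default values are 
simply those stated in this message.

```python
# Defaults for omit-xml-declaration as stated in the message above.
# A different host language would supply its own table.
HOST_DEFAULTS = {
    "xslt-2.0": {"omit-xml-declaration": "no"},
    "xquery-1.0": {"omit-xml-declaration": "yes"},
}


def effective_parameters(host, explicit=None):
    """Merge explicitly supplied parameters over the host's defaults."""
    params = dict(HOST_DEFAULTS[host])  # copy so the table is not mutated
    params.update(explicit or {})
    return params
```

Under this arrangement the serializer itself holds no defaults; it only sees 
the merged result.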

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-06-01)

Hello Henry,

We have looked at the response below, and are fine with it,
i.e. we think that it satisfactorily addresses our original comment.

Regards,    Martin.

qt-2004Feb0362-16: [Serial] I18N WG last call comments [19]
[substantive, announced] 2004-06-07
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[19] 6.4 HTML Output Method: Writing Character Data: "When outputting
   a sequence of whitespace characters in the data model, within an
   element where whitespace is treated normally, (but not in elements
   such as pre and textarea) the html output method may represent it
   using any character sequence that will be treated as whitespace
   by an HTML user agent.": @@@ We need to check whether this (which
   allows replacement of whitespace including linebreaks by whitespace
   not including linebreaks and vice-versa) is okay for Chinese,
   Japanese, Thai,... (languages without spaces between words).
   This has to be checked extremely carefully.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I think it's better if we don't try to define the detailed rules here,
but just state the constraint: you can replace one whitespace sequence
by another if user agents treat them as equivalent. If we try to be more
precise than this, we will get it wrong.

RE: [Serial] I18N WG last call comments, François Yergeau (2004-02-18)

The current text does not say that, it says that one sequence of whitespace
can be replaced by another if HTML user agents consider the latter as
whitespace (presumably in the XML sense).  But HTML user agents need to
distinguish line breaks from other whitespace, for the reasons hinted at
by Martin. See list item 9 in
http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/conformance.html#s_conform_user_agent
for the gory details.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-19)

Thanks, distinction noted.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-06-07)

Hi, Martin.

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
working group:

Martin Duerst wrote on 2004-02-15 12:37:30 PM:
> [19] 6.4 HTML Output Method: Writing Character Data: "When outputting
>    a sequence of whitespace characters in the data model, within an
>    element where whitespace is treated normally, (but not in elements
>    such as pre and textarea) the html output method may represent it
>    using any character sequence that will be treated as whitespace
>    by an HTML user agent.": @@@ We need to check whether this (which
>    allows replacement of whitespace including linebreaks by whitespace
>    not including linebreaks and vice-versa) is okay for Chinese,
>    Japanese, Thai,... (languages without spaces between words).
>    This has to be checked extremely carefully.

In [2], François Yergeau added the following information, in response to a 
note from Michael Kay on the topic:

> > I think it's better if we don't try to define the detailed rules here,
> > but just state the constraint: you can replace one whitespace sequence
> > by another if user agents treat them as equivalent.
> 
> The current text does not say that, it says that one sequence of whitespace 
> can be replaced by another if HTML user agents consider the latter as 
> whitespace (presumably in the XML sense).  But HTML user agents need to 
> distinguish line breaks from other whitespace, for the reasons hinted at 
> by Martin. See list item 9 in 
> http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/conformance.html#s_conform_user_agent 
> for the gory details.

     Thanks to you and the I18N working group for this comment.

     The XSL and XML Query Working Groups discussed the comment.  The 
working groups were unable to find any statement in HTML 4.01 that 
different whitespace characters can be treated differently, ignoring such 
elements as pre and textarea.  The reference that François provided was 
from the XHTML Modularization Recommendation, although the original 
comment was on the html output method.

     In discussing the comment, some members of the WGs thought that XHTML 
Modularization probably better reflected the requirements placed on HTML 
user agents in order to support languages such as those you mentioned. The 
WGs decided to add a normative requirement in the description of the html 
output method stating that a sequence of whitespace characters can be 
replaced only by another sequence of whitespace characters that has the 
same effect in a user agent.  The WGs also decided to add a non-normative reference 
pointing to bullet 9 of section 3.5 of XHTML Modularization, to provide 
further information on the issues involved.
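     One conservative way to apply the new requirement is sketched below. 
It rests on an assumption that is not in the specification: that two 
whitespace sequences have "the same effect" whenever both contain a line 
break or neither does, which is the distinction François Yergeau pointed 
out. The set of whitespace characters and all names are likewise chosen for 
this sketch only.

```python
# Whitespace characters considered by this sketch: space, tab, LF, CR.
HTML_WHITESPACE = set(" \t\n\r")
LINE_BREAKS = set("\n\r")


def contains_line_break(sequence):
    """True if the whitespace sequence includes a CR or LF."""
    return any(ch in LINE_BREAKS for ch in sequence)


def is_safe_replacement(original, replacement):
    """Conservatively decide whether replacement may stand in for original.

    Both must be non-empty whitespace-only sequences, and a sequence with a
    line break is never exchanged for one without (or the reverse).
    """
    if not original or not replacement:
        return False
    if not (set(original) <= HTML_WHITESPACE
            and set(replacement) <= HTML_WHITESPACE):
        return False
    return contains_line_break(original) == contains_line_break(replacement)
```

A serializer that indents or rewraps text could consult such a check before 
substituting one whitespace run for another outside pre and textarea.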

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1025.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-17: [Serial] I18N WG last call comments [20]
[substantive, announced] 2004-12-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

[20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
   specifically the control characters #x7F-#x9F, are legal in XML but
   not in HTML. ... The processor may signal the error, but is not
   required to do so.": Please change this to require the processor
   to produce an error.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I worry that we will get many complaints from users who are misusing
these codepoints if we do this. Their code will stop working, and it may
be quite difficult for them to fix it. (Though it's a good use case for
character maps...)

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

> [20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
>    specifically the control characters #x7F-#x9F, are legal in XML but
>    not in HTML. ... The processor may signal the error, but is not
>    required to do so.": Please change this to require the processor
>    to produce an error.

     In [2], Michael Kay responded:

> I worry that we will get many complaints from users who are misusing
> these codepoints if we do this. Their code will stop working, and it may
> be quite difficult for them to fix it. (Though it's a good use case for
> character maps...)

     Thanks to Martin and the I18N Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment, and decided 
to endorse Michael Kay's response without any change to the Serialization 
specification.

     May I ask the I18N Working Group to confirm that this response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-05-05)

Hello Henry,

Many thanks for your replies to our comments.
The I18N WG (Core TF) has looked at your reply below.

We are sorry, but we have to clearly disagree.
We think that producing junk is never a good idea.
See below for further discussion.


At 11:17 04/04/13 -0400, Henry Zongaro wrote:

>Hello,
>
>      In [1], Martin Duerst submitted the following comment on the Last
>Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of
>the I18N Working Group.
>
> > [20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
> >    specifically the control characters #x7F-#x9F, are legal in XML but
> >    not in HTML. ... The processor may signal the error, but is not
> >    required to do so.": Please change this to require the processor
> >    to produce an error.
>
>      In [2], Michael Kay responded:
>
> > I worry that we will get many complaints from users who are misusing
> > these codepoints if we do this.

How are they misusing these code points? The case we know is that
bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with
the intent of giving them the windows-1252 semantics. If somebody
is reading in windows-1252 documents, then it's simple to just
declare them that way. Also, if somebody wants windows-1252 as
output, they can just say so using XSLT. Neither reading windows-1252
nor writing out windows-1252 is in any way a misuse of XML, HTML,
or XSLT. HTML allows using the *bytes* 0x80-0x9F if in the encoding
used, they are encoding *characters* that are allowed by HTML.
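The difference between the two declarations can be seen by decoding the
same bytes both ways. The sample bytes and the helper name below are
made up for illustration; the codec names are the standard Python ones.

```python
# Bytes as a windows-1252 editor would store them: curly quotes around a
# word, followed by an ellipsis (0x93, 0x94 and 0x85 in windows-1252).
raw = b"\x93quoted\x94 \x85"

as_latin1 = raw.decode("iso-8859-1")    # the label such documents often carry
as_cp1252 = raw.decode("windows-1252")  # the encoding actually intended


def c1_controls(text):
    """Return the characters of text in the C1 range U+0080-U+009F."""
    return [ch for ch in text if "\u0080" <= ch <= "\u009f"]


print([hex(ord(ch)) for ch in c1_controls(as_latin1)])  # ['0x93', '0x94', '0x85']
print(c1_controls(as_cp1252))                           # []
```

Declared as windows-1252, the bytes become U+201C, U+201D and U+2026,
all of which HTML allows; declared as iso-8859-1, they become C1
control characters.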

If it is some other misuse that you are speaking about, please
inform us about the details.


> > Their code will stop working,

In some way just a detail, but: There is currently no XSLT 2.0
code that will stop working. XSLT 1.0 doesn't have the XHTML
output method.

With kind regards,    Martin.


> > and it may
> > be quite difficult for them to fix it. (Though it's a good use case for
> > character maps...)
>
>      Thanks to Martin and the I18N Working Group for this comment.
>
>      The XSL and XQuery Working Groups discussed the comment, and decided
>to endorse Michael Kay's response without any change to the Serialization
>specification.
>
>      May I ask the I18N Working Group to confirm that this response is
>acceptable?
>
>Thanks,
>
>Henry [On behalf of the XSL and XQuery Working Groups.]
>[1]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
>[2]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
>------------------------------------------------------------------
>Henry Zongaro      Xalan development
>IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
>mailto:zongaro@ca.ibm.com
RE: [Serial] I18N WG last call comments, Michael Kay (2004-05-06)

> > > I worry that we will get many complaints from users who are misusing
> > > these codepoints if we do this.
> 
> How are they misusing these code points? The case we know is that
> bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with
> the intent of giving them the windows-1252 semantics.

This was the case I had in mind. People create documents in cp1252 and
declare them as iso-8859-1. And it all works, because the errors cancel each
other out. If we oblige processors to detect this situation we will be
asking users to pay for the extra processing cost, and in return the
application that worked before will stop working. Will they thank us?
Because if they won't, we shouldn't do it.
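
The way the two errors cancel can be traced step by step. This is a sketch
under one assumption stated in the comments: that the receiving user agent
treats bytes labelled iso-8859-1 as windows-1252. The variable names are
invented for the example.

```python
text = "\u201cfine\u201d"                   # curly-quoted text as the author typed it
stored = text.encode("windows-1252")        # bytes on disk: 0x93 ... 0x94
parsed = stored.decode("iso-8859-1")        # parser trusts the wrong label
written = parsed.encode("iso-8859-1")       # serializer writes the same bytes back
displayed = written.decode("windows-1252")  # assumed lenient user agent

# Inside the processor the quotes are C1 control characters...
has_c1 = any("\u0080" <= ch <= "\u009f" for ch in parsed)

# ...yet the bytes, and what the reader finally sees, are unchanged.
print(has_c1, written == stored, displayed == text)  # True True True
```

A processor that is required to reject C1 characters would stop at the
"parsed" stage, which is the behaviour change under discussion.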


> 
> In some way just a detail, but: There is currently no XSLT 2.0
> code that will stop working. XSLT 1.0 doesn't have the XHTML
> output method.

I may have lost the thread, but I thought we were discussing the HTML output
method?

> > [20] 6.4 HTML Output Method: Writing Character Data: "Certain
>characters,

Michael Kay
RE: [Serial] I18N WG last call comments, Martin Duerst (2004-05-21)

Hello Michael,

The I18N WG (Core TF) has discussed your mail, and has asked
me to reply. I'm sorry for the delay.

At 17:52 04/05/06 +0100, Michael Kay wrote:

> > > > I worry that we will get many complaints from users who are misusing
> > > > these codepoints if we do this.
> >
> > How are they misusing these code points? The case we know is that
> > bytes in the range 0x80-0x9F are used e.g. in iso-8859-1 but with
> > the intent of giving them the windows-1252 semantics.
>
>This was the case I had in mind. People create documents in cp1252 and
>declare them as iso-8859-1. And it all works, because the errors cancel each
>other out. If we oblige processors to detect this situation we will be
>asking users to pay for the extra processing cost, and in return the
>application that worked before will stop working. Will they thank us?
>Because if they won't, we shouldn't do it.

Some users will be very thankful, others won't. The users that will
be thankful will be those that care about data integrity and interoperability
worldwide and in the long term. They will be able to fix a problem
in their data that they otherwise might not have found. As a result,
they will not only produce correct, valid output, but will also
make sure that their input data will work well in other circumstances,
such as searching, sorting, and any kind of other processing. Not
least, with the introduction of XML 1.1, there are also such issues
as the confusion between NEL and the three-dot ellipsis.

There was a time when the mentality on the Web was 'everything goes',
which led to the slippery slope of bugwards compatibility. We have
learned, with great pain, that this is a dead end, and we don't want
to go there anymore. XML is the clearest example of how this can be
done better. And I sincerely hope that XSLT will not be tempted to
go down the bugwards compatibility slope.

The C1 area is forbidden in HTML exactly because it is a very easy
and cheap way to help people check and (if necessary) clean up their
data. RFC 2070 (http://www.ietf.org/rfc/rfc2070.txt) was written
almost 10 years ago. That C1 is allowed in XML is, according to
James Clark, an oversight. XML 1.1 has corrected it.
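
The kind of check being asked for is small. The sketch below uses the range
#x7F-#x9F quoted from section 6.4 of the draft; the function and exception
names are hypothetical, and the strict flag stands for the choice between
the current "may signal" wording and the requested "must signal" wording.

```python
class HtmlCharacterError(ValueError):
    """Character not permitted in HTML output (hypothetical name)."""


def find_illegal_controls(text):
    """Return (index, character) pairs for characters in #x7F-#x9F."""
    return [(i, ch) for i, ch in enumerate(text) if 0x7F <= ord(ch) <= 0x9F]


def check_html_text(text, strict=True):
    """Check text destined for the html output method.

    With strict=True an error is raised on the first offending character;
    with strict=False the findings are only returned to the caller.
    """
    found = find_illegal_controls(text)
    if found and strict:
        index, ch = found[0]
        raise HtmlCharacterError(
            "U+%04X at index %d is not allowed in HTML" % (ord(ch), index)
        )
    return found
```

The scan is a single pass over the text, so its cost is linear in the
length of the output.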


> > In some way just a detail, but: There is currently no XSLT 2.0
> > code that will stop working. XSLT 1.0 doesn't have the XHTML
> > output method.
>
>I may have lost the thread, but I thought we were discussing the HTML output
>method?

Okay, sorry. There is still no XSLT 2.0 code that will stop working,
even for the HTML output method. And because the XHTML output
method is supposed to work according to the compatibility guidelines,
it of course also should forbid producing C1 character output.

Regards,    Martin.


> > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain
> >characters,
>
>Michael Kay
RE: [Serial] I18N WG last call comments, Michael Kay (2004-05-21)

Thanks. There's no easy right answer on this one. It's similar to the
question of whether products should accept "c:\a\b.xml" in places where a
URI is required. Some products allow it. I've resisted, and report it as an
error. When users find that it works on one product and doesn't work on
mine, it's me they complain to. I tell them they are wrong and they should
read the specs, but I can afford to do that because they aren't (at present)
paying customers.

I would be happy with the stricter rule if we had imposed it from the start.
I'm not happy with the idea that version 2 should be stricter than version
1. That's in good measure because, for the time being, people's first
exposure to XSLT 2.0 is through my product, and when they get compatibility
or usability problems, they report it to me as "a Saxon bug".

In addition, the XSLT spec has always been pragmatic about the reality of
HTML interoperability. If the spec wasn't pragmatic in this way, then I
think XSLT implementors would have to be pragmatic, and the weaknesses of
HTML conformance would spill over into weaknesses in XSLT conformance. There
are many ways that we allow XSLT stylesheets to generate non-conformant
HTML, and I don't see that this one is particularly different from the
others. Most areas where we have tried to be strict about what we generate
(for example, in URI escaping) have led to practical problems for users.

Michael Kay


> -----Original Message-----
> From: Martin Duerst [mailto:duerst@w3.org] 
> Sent: 21 May 2004 08:09
> To: Michael Kay; 'Henry Zongaro'; w3c-i18n-ig@w3.org
> Cc: public-qt-comments@w3.org
> Subject: RE: [Serial] I18N WG last call comments
> 
> Hello Michael,
> 
> The I18N WG (Core TF) has discussed your mail, and has asked
> me to reply. I'm sorry for the delay.
> 
> At 17:52 04/05/06 +0100, Michael Kay wrote:
> 
> > > > > I worry that we will get many complaints from users who
> > > are misusing
> > > > > these codepoints if we do this.
> > >
> > > How are they misusing these code points? The case we know is that
> > > bytes in the rage 0x80-0x9F are used e.g. in iso-8859-1 but with
> > > the intent of giving them the windows-1252 semantics.
> >
> >This was the case I had in mind. People create documents in 
> cp1252 and
> >declare them as iso-8859-1. And it all works, because the 
> errors cancel each
> >other out. If we oblige processors to detect this situation 
> we will be
> >asking users to pay for the extra processing cost, and in return the
> >application that worked before will stop working. Will they thank us?
> >Because if they won't, we shouldn't do it.
> 
> Some users will be very thankful, others won't. The users that will
> be thankful will be those that care about data integrity and 
> interoperability
> worldwide and in the long term. They will be able to fix a problem
> in their data that they otherwise might not have found. As a result,
> they will not only produce correct, valid output, but will also
> make sure that their input data will work well in other circumstances,
> such as searching, sorting, and any kind of other processing. Not the
> least, with the introduction of XML 1.1, there are also such issues
> as the confusion betwen NEL and the three-dot elipsis.
> 
> There was a time when the mentality on the Web was 'everything goes',
> which lead to the slippery slope of bugwards compatibility. We have
> learned, with great pain, that this is a dead end, and we don't want
> to go there anymore. XML is the clearest example of how this can be
> done better. And I sincerely hope that XSLT will not be tempted to
> go down the bugwards compatibility slope.
> 
> The C1 area is forbidden in HTML exactly because it is a very easy
> and cheap way to help people check and (if necessary) clean up their
> data. RFC 2070 (http://www.ietf.org/rfc/rfc2070.txt) was written
> almost 10 years ago. That C1 is allowed in XML is, according to
> James Clark, an oversight. XML 1.1 has corrected it.
> 
> 
> > > In some way just a detail, but: There is currently no XSLT 2.0
> > > code that will stop working. XSLT 1.0 doesn't have the XHTML
> > > output method.
> >
> >I may have lost the thread, but I thought we were discussing the HTML
> >output method?
> 
> Okay, sorry. There is still no XSLT 2.0 code that will stop working,
> even for the HTML output method. And because the XHTML output
> method is supposed to work according to the compatibility guidelines,
> it of course also should forbid producing C1 character output.
> 
> Regards,    Martin.
> 
> 
> > > > [20] 6.4 HTML Output Method: Writing Character Data: "Certain
> > >characters,
> >
> >Michael Kay
> 
> 

FWIW here's what we do:  we tout our products as being standards-based and 
therefore more interoperable.  When a customer complains about something that 
is, in fact, following the standard, but not doing what they want, we provide a 
custom solution (for $$$).  We also take a look at the standard to make sure 
that it makes sense, and if it doesn't, and we have the bandwidth, we try to 
improve the standard.

So the question is, will the majority be happy or unhappy with a particular 
decision on the standard?

I am not trying to answer that question, I'm only saying that customers 
complaining about the standard will always be there.  The issue is if lots of 
customers complain about the same thing, then it's a telling sign that the 
standard isn't serving the purpose.

Andrea
(from the cheap seats)

Michael Kay wrote:

> Thanks. There's no easy right answer on this one. It's similar to the
> question of whether products should accept "c:\a\b.xml" in places where a
> URI is required. Some products allow it. I've resisted, and report it as an
> error. When users find that it works on one product and doesn't work on
> mine, it's me they complain to. I tell them they are wrong and they should
> read the specs, but I can afford to do that because they aren't (at present)
> paying customers.
> 
> I would be happy with the stricter rule if we had imposed it from the start.
> I'm not happy with the idea that version 2 should be stricter than version
> 1. That's in good measure because, for the time being, people's first
> exposure to XSLT 2.0 is through my product, and when they get compatibility
> or usability problems, they report it to me as "a Saxon bug".
> 
> In addition, the XSLT spec has always been pragmatic about the reality of
> HTML interoperability. If the spec wasn't pragmatic in this way, then I
> think XSLT implementors would have to be pragmatic, and the weaknesses of
> HTML conformance would spill over into weaknesses in XSLT conformance. There
> are many ways that we allow XSLT stylesheets to generate non-conformant
> HTML, and I don't see that this one is particularly different from the
> others. Most areas where we have tried to be strict about what we generate
> (for example, in URI escaping) have led to practical problems for users.
> 
> Michael Kay

-- 
I have always wished that my computer would be as easy to use as my telephone. 
My wish has come true. I no longer know how to use my telephone.
-Bjarne Stroustrup, designer of C++ programming language (1950- )
RE: [Serial] I18N WG last call comments, Martin Duerst (2004-05-24)

Hello Michael,

At 11:04 04/05/21 +0100, Michael Kay wrote:

>Thanks. There's no easy right answer on this one. It's similar to the
>question of whether products should accept "c:\a\b.xml" in places where a
>URI is required. Some products allow it. I've resisted, and report it as an
>error. When users find that it works on one product and doesn't work on
>mine, it's me they complain to. I tell them they are wrong and they should
>read the specs, but I can afford to do that because they aren't (at present)
>paying customers.

I think rather than waiting for the customers to complain, the best
solution may be to produce an instructive error message. In this case,
such a message is quite easy to produce. In this way, you can tell them,
without having to write emails individually.

For example, an error message could read as follows:

line x, character y: Illegal C1 codepoint in HTML output.
(Hint: This is most probably due to a problem in the input, in
particular for example due to input declared to be in the iso-8859-1
character encoding that is actually in the windows-1252 character
encoding.)
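
A minimal sketch of how such a message could be produced, assuming the serializer has the output available as a string before encoding. The names `HtmlOutputError`, `find_control_codepoints` and `check_html_output` are hypothetical and not taken from any processor; the range #x7F-#x9F follows the wording of the Serialization draft quoted in comment [20].

```python
# Hypothetical check for the control characters #x7F-#x9F in HTML output.
FIRST, LAST = 0x7F, 0x9F

HINT = ("(Hint: This is most probably due to a problem in the input, for "
        "example input declared as iso-8859-1 that is actually windows-1252.)")


class HtmlOutputError(ValueError):
    """Raised when the output contains a code point that HTML forbids."""


def find_control_codepoints(text):
    """Yield (line, column, codepoint) for each offending character."""
    # split("\n") is used on purpose: str.splitlines() would treat
    # U+0085 (NEL) as a line break and hide it from the scan.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if FIRST <= ord(ch) <= LAST:
                yield line_no, col_no, ord(ch)


def check_html_output(text):
    """Return text unchanged, or raise on the first offending character."""
    for line_no, col_no, cp in find_control_codepoints(text):
        raise HtmlOutputError(
            f"line {line_no}, character {col_no}: "
            f"Illegal C1 codepoint U+{cp:04X} in HTML output.\n{HINT}"
        )
    return text
```

The scan is a single pass over the output, which is the sense in which the check is cheap.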


>I would be happy with the stricter rule if we had imposed it from the start.
>I'm not happy with the idea that version 2 should be stricter than version
>1. That's in good measure because, for the time being, people's first
>exposure to XSLT 2.0 is through my product, and when they get compatibility
>or usability problems, they report it to me as "a Saxon bug".

I have separately complained about the number of XSLT 1.0/2.0 compatibility
issues, so I'm definitely not unsympathetic to this point. But looking
at the various patterns of compatibility issues, this one is really
harmless: The XSLT fails with a very clear error message and a very
clear fix. Although I haven't done an in-depth analysis (I have suggested
such a thing), I strongly suspect that many other incompatibilities
are of a much more dangerous nature: The XSLT still works, but the
output is a little different in some cases, which may be detected
sooner or later, or too late.


>In addition, the XSLT spec has always been pragmatic about the reality of
>HTML interoperability.

Well, if it were only HTML interoperability, that may be another issue.
But the fact is that we know that the XML input is garbage. And I don't
think XSLT should be lenient with garbage XML input in cases where it
is easy to detect that it's garbage.
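
The mislabelling case discussed in this thread can be illustrated with a short sketch. The sample bytes and the helper name `c1_codepoints` are made up for the illustration; the codec names are the ones Python's standard library uses for iso-8859-1 and windows-1252.

```python
# Bytes written by a windows-1252 editor: curly quotes and an ellipsis.
raw = b"\x93quoted\x94 and\x85"

# Decoded as declared (iso-8859-1), bytes 0x80-0x9F become C1 controls.
as_declared = raw.decode("iso-8859-1")

# Decoded as intended (windows-1252), the same bytes are printable characters.
as_intended = raw.decode("cp1252")


def c1_codepoints(text):
    """Return the code points of text that fall in the C1 range."""
    return [ord(ch) for ch in text if 0x80 <= ord(ch) <= 0x9F]


print([hex(cp) for cp in c1_codepoints(as_declared)])
print(as_intended)
```

Note that byte 0x85 is the ellipsis in windows-1252 but U+0085 (NEL) when read as iso-8859-1, which is the confusion mentioned earlier in the thread.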


>If the spec wasn't pragmatic in this way, then I
>think XSLT implementors would have to be pragmatic, and the weaknesses of
>HTML conformance would spill over into weaknesses in XSLT conformance.

I'm sure there will be a test suite for XSLT 2.0. Adding the right
tests would probably go a long way, and wouldn't be very difficult.
(I'd be happy to produce some.)


>There
>are many ways that we allow XSLT stylesheets to generate non-conformant
>HTML, and I don't see that this one is particularly different from the
>others.

Could you point to a list of these, or list (some of) them here?

Regards,     Martin.


>Most areas where we have tried to be strict about what we generate
>(for example, in URI escaping) have led to practical problems for users.
>
>Michael Kay
RE: [Serial] I18N WG last call comments, Michael Kay (2004-05-24)

> 
> >There
> >are many ways that we allow XSLT stylesheets to generate 
> non-conformant
> >HTML, and I don't see that this one is particularly 
> different from the
> >others.
> 
> Could you point to a list of these, or list (some of) them here?
> 

For example:

* you can produce elements and attributes that aren't defined in HTML

* you can nest elements in ways that aren't allowed in HTML

* you can give attributes values that aren't allowed in HTML

* you can use any system ID and public ID that you like in the doctype
declaration

* you can use disable-output-escaping (or now character maps) to produce any
kind of garbage that takes your fancy

* you can suppress the escaping of URIs in URI-valued attributes

* you can suppress the generation of the META element defining the character
encoding or generate your own that contains a value unrelated to the true
character encoding

All these features are occasionally useful either to exploit non-standard
features in browsers, or to generate output designed for processing by
software other than HTML browsers.

Michael Kay
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)
Re: [Serial] I18N WG last call comments, Martin Duerst (2004-09-14)

Hello Henry,

[I have copied the I18N IG as well as a list on your side to reduce
the chance that this gets lost. I suggest doing that with all messages
related to last-call discussions.]

At 16:44 04/08/30 -0400, Henry Zongaro wrote:
>Hello, Martin.
>
>      In [1], you submitted the following comment on the Last Call Working
>Draft of Serialization on behalf of the I18N Working Group:
>
><<
>[20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
>    specifically the control characters #x7F-#x9F, are legal in XML but
>    not in HTML. ... The processor may signal the error, but is not
>    required to do so.": Please change this to require the processor
>    to produce an error.
> >>
>
>      The XSL and XML Query Working Groups made the decision recorded at
>[2], but the I18N Working Group raised an objection [3] to that decision.
>
>      A subsequent e-mail exchange [4-9] ensued between yourself, Michael
>Kay and Andrea Vine.  The final message in the thread came from Michael
>Kay.
>
>      As the XSL and XQuery Working Groups have not heard whether the
>additional discussion satisfactorily clarified the issue for the I18N
>Working Group, we will assume that the issue has been resolved to their
>satisfaction.  If that is not the case, please advise us of any additional
>points requiring clarification.


The I18N WG (Core TF) has had a look at this issue (again!).
We have decided that we need to object to your current resolution.

In particular, we want to point out that while Michael Kay has listed
other cases where it is possible to create non-valid HTML with the
HTML serialization method (see
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0054.html),
all these issues are higher-level than the issue at hand. They are
all related to document structure and/or are not used by default.

Trying to address issues related to document structure would mean
that serialization would have to deal with HTML versioning info
and configuration options for such versioning, which would
considerably complicate the specification. This is not at all
the case for disallowing code points in the C1 range, which is
independent of HTML versioning and at a much more basic level.

Also, as Michael mentioned, character maps can be used to circumvent
any kinds of output restrictions. It is much better to make the
production of clean, correct output the default (in particular
when this can be easily achieved), and have some mechanism for
circumvention, than to tolerate crappy output from the start.
The misuse of codepoints in the C1 range has been a long-standing
problem, and we greatly hope that XQuery and XSLT can help to
solve it rather than contribute to production of more garbage.

Regards,     Martin.

P.S.: For the record, I would also like to point out that the I18N WG
has officially disagreed on this issue at
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html.
The following discussion has brought up some more details, but there
is no indication that the I18N WG would have changed its opinion.
I think that it is clearly inappropriate in such cases to say, as
you do above "we will assume that the issue has been resolved to their
satisfaction". [I think doing such a thing is appropriate when you have
fully (or maybe partially) addressed our comment.]


>Thanks,
>
>Henry
>[1]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
>[2]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0068.html
>[3]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html
>[4]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0012.html
>[5]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0040.html
>[6]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0041.html
>[7]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0049.html
>[8]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0052.html
>[9]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0054.html
>------------------------------------------------------------------
>Henry Zongaro      Xalan development
>IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
>mailto:zongaro@ca.ibm.com
Minutes Oct. 28 telcon, Zarella Rendon (2004-10-28)
HTML/XHTML Serialization Resolutions, Scott Boag (2004-11-08)
In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

<<
[20] 6.4 HTML Output Method: Writing Character Data: "Certain characters,
   specifically the control characters #x7F-#x9F, are legal in XML but
   not in HTML. ... The processor may signal the error, but is not
   required to do so.": Please change this to require the processor
   to produce an error.
>>

The XSL and XQuery working groups gave an initial response [2] to 
this issue, the I18N WG disagreed with this response [3], and a lengthy 
email discussion followed.  After further discussion, the working groups 
have decided to accept your comment, and resolve it by requiring the 
processor to signal the error.

Please let us know if the resolution to this issue is acceptable.

Joanne Tong

[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0068.html
[3] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0004.html
qt-2004Feb0362-19: [Serial] I18N WG last call comments [22]
[substantive, acknowledged] 2004-07-13
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

General last call comments, not i18n-related:

[22] There should be some warning about denormalization when using
   charmaps

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

I agree.

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
working group:

<<
[22] There should be some warning about denormalization when using
   charmaps
>>

     Thanks to you and the I18N working group for this comment.

     The XSL and XML Query Working Groups discussed the working group's 
comment, and decided to add a note indicating that the use of character 
maps may result in a serialized document that is not normalized.
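
The effect the note warns about can be sketched as follows. This is an illustration only: `apply_character_map` is a hypothetical stand-in for a character map, and it assumes replacement strings are emitted as-is, after any normalization has already been applied.

```python
import unicodedata


def apply_character_map(text, character_map):
    """Replace each mapped character with its replacement string."""
    return text.translate(str.maketrans(character_map))


def is_nfc(text):
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


# Output that is in NFC: U+00E9 is the precomposed e-acute.
normalized = unicodedata.normalize("NFC", "caf\u00e9")

# Hypothetical map that emits the decomposed form (e + combining acute).
character_map = {"\u00e9": "e\u0301"}

mapped = apply_character_map(normalized, character_map)
print(is_nfc(normalized), is_nfc(mapped))
```

Here the text before mapping is normalized and the text after mapping is not, because the replacement string reintroduces a combining sequence.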

     May I ask you to confirm that this response is acceptable to the 
working group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

Hello Henry,

We just decided at the I18N WG Core TF teleconf that we are happy
with this resolution.

Regards,    Martin.

At 13:09 04/07/13 -0400, Henry Zongaro wrote:

>Martin,
>
>      In [1], you submitted the following comment on the Last Call Working
>Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N
>working group:
>
><<
>[22] There should be some warning about denormalization when using
>    charmaps
> >>
>
>      Thanks to you and the I18N working group for this comment.
>
>      The XSL and XML Query Working Groups discussed the working group's
>comment, and decided to add a note indicating that the use of character
>maps may result in a serialized document that is not normalized.
>
>      May I ask you to confirm that this response is acceptable to the
>working group?
>
>Thanks,
>
>Henry [On behalf of the XSL and XML Query Working Groups]
>[1]
>http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
>------------------------------------------------------------------
>Henry Zongaro      Xalan development
>IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
>mailto:zongaro@ca.ibm.com
qt-2004Feb0362-20: [Serial] I18N WG last call comments [23]
[substantive, announced] 2004-04-13
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

General last call comments, not i18n-related:

[23] Section 4: "The base URIs of nodes in the two trees may be different."
   Does this mean that base URIs are not serialized? This should be
   checked or at least explained.

Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

Yes, the base URI typically is supplied at the time a tree is built by a
parser, it is not normally explicit in the content of the tree.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

>[23] Section 4: "The base URIs of nodes in the two trees may be different."
>  Does this mean that base URIs are not serialized? This should be
>  checked or at least explained.

     Thanks to Martin and the I18N Working Group for this comment.  In [2] 
Michael Kay responded as follows:

>Yes, the base URI typically is supplied at the time a tree is built by a
>parser, it is not normally explicit in the content of the tree.

     The XSL and XQuery Working Groups discussed this comment, and 
concurred with Michael Kay's response.  The working groups did not feel 
any clarification of the specification was required.

     May I ask the I18N Working Group to confirm that this response is 
acceptable?

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-21: [Serial] I18N WG last call comments [24]
[substantive, announced] 2004-08-31
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,

Below please find the I18N WG's comments on your last call document
"XSLT 2.0 and XQuery 1.0 Serialization"
(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).

Please note the following:
- Please address all replies to these comments to the I18N IG mailing
   list (w3c-i18n-ig@w3.org), not just to me.
- Our comments are numbered in square brackets [nn].

We look forward to further discussion with you.

[this mail is copied to the DOM WG to tell them what we are
telling you about UTF-16 and endianness, which they should
adopt for the
Document Object Model (DOM) Level 3 Load and Save Specification]

General last call comments, not i18n-related:

[24] Cases of creation of non-wellformed XML where the processor is not
   required to signal an error: It would be good to have an option to
   request well-formedness checking even if Character Maps are used.

Regards,    Martin.

Re: [Serial] I18N WG last call comments, Henry Zongaro (2004-04-13)

Hello,

     In [1], Martin Duerst submitted the following comment on the Last 
Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
the I18N Working Group.

> [24] Cases of creation of non-wellformed XML where the processor is not
>    required to signal an error: It would be good to have an option to
>    request well-formedness checking even if Character Maps are used.

     Thanks to Martin and the I18N Working Group for this comment.

     The XSL and XQuery Working Groups discussed the comment, and 
concluded that, although such a mechanism might be useful, an XML parser 
would be capable of performing the same well-formedness checking.  On 
those grounds, the working groups decided it was not necessary to 
duplicate that functionality in Serialization.
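
As an illustration of the kind of check discussed here, the sketch below applies a character map to serialized output and then hands the result to an XML parser. The map, the sample document and the helper names are hypothetical; the parser is the one in Python's standard library and stands in for any conformant XML parser.

```python
import xml.etree.ElementTree as ET


def apply_character_map(text, character_map):
    """Replace each mapped character with its replacement string."""
    return text.translate(str.maketrans(character_map))


def is_well_formed(serialized):
    """Return True if an XML parser accepts the serialized document."""
    try:
        ET.fromstring(serialized)
        return True
    except ET.ParseError:
        return False


# Serializer output before character mapping: well-formed XML.
serialized = "<p>line one\u00a7line two</p>"

# Hypothetical map that injects raw markup, here an unclosed <br> tag.
character_map = {"\u00a7": "<br>"}

mapped = apply_character_map(serialized, character_map)
print(is_well_formed(serialized), is_well_formed(mapped))
```

The mapped output is rejected because the injected `<br>` is never closed, which is the class of error a serializer would not otherwise detect once character maps are in use.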

     May I ask the working group to confirm that this response is 
acceptable?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Re: [Serial] I18N WG last call comments, François Yergeau (2004-06-15)

Henry Zongaro wrote:
>      In [1], Martin Duerst submitted the following comment on the Last 
> Call Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
> the I18N Working Group.
> 
> 
>>[24] Cases of creation of non-wellformed XML where the processor is not
>>   required to signal an error: It would be good to have an option to
>>   request well-formedness checking even if Character Maps are used.
> 
> 
>      Thanks to Martin and the I18N Working Group for this comment.
> 
>      The XSL and XQuery Working Groups discussed the comment, and 
> concluded that, although such a mechanism might be useful, an XML parser 
> would be capable of performing the same well-formedness checking.  On 
> those grounds, the working groups decided it was not necessary to 
> duplicate that functionality in Serialization.

We are not satisfied with this resolution.  We feel that 1) 
well-formedness is very important ; 2) using a parser to check it is 
just a possible implementation strategy and 3) that this strategy may 
not even be available when serializing to other than a local file, e.g. 
to a network socket.

> [1] 
> http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html

Regards,

-- 
François Yergeau
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Martin, François.

     In [1] Martin submitted the following comment on the Last Call 
Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the 
I18N working group:

<<
[24] Cases of creation of non-wellformed XML where the processor is not
   required to signal an error: It would be good to have an option to
   request well-formedness checking even if Character Maps are used.
>>

     In [2], I announced the following decision on behalf of the XSL and 
XML Query Working Groups:

<<
     The XSL and XQuery Working Groups discussed the comment, and 
concluded that, although such a mechanism might be useful, an XML
parser would be capable of performing the same well-formedness
checking.  On those grounds, the working groups decided it was
not necessary to duplicate that functionality in Serialization.
>>

     In [3], François raised the following objection on behalf of I18N:

<<
We are not satisfied with this resolution.  We feel that 1) 
well-formedness is very important ; 2) using a parser to check it is 
just a possible implementation strategy and 3) that this strategy may 
not even be available when serializing to other than a local file, e.g. 
to a network socket.
>>


     The XSL and XML Query Working Groups discussed this issue further, 
and concluded that requiring a serialization component to be capable of 
detecting XML that was not well-formed in the presence of character maps 
would be too much of an implementation burden on a serializer.  The 
working groups also noted that there is no interoperability problem with 
this resolution, and that an implementation could always add an 
implementation-specific option that would perform the sort of checking 
that I18N has suggested.

     Finally, the XSL and XQuery Working Groups noted that the last 
paragraph of Section 5 of the most recent draft of Serialization [4] 
indicates that only character maps and the use of user-written extension 
functions might result in the creation of XML that is not well-formed.  In 
fact it is only character maps that might result in XML that is not 
well-formed without being detected.  The working groups will correct that 
misstatement.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Apr/0059.html
[3] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0108.html
[4] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#xml-output

------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
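
To illustrate the resolution above, the following minimal Python sketch
shows how a character map can yield output that is not well-formed, and how
an XML parser detects this afterwards, which is the kind of check the
working groups had in mind.  The names serialize_text and is_well_formed,
and the character map itself, are hypothetical; none of them comes from the
Serialization draft.

```python
import xml.etree.ElementTree as ET


def escape_text(s):
    """Escape the characters that are special in XML character data."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def serialize_text(text, character_map):
    """Serialize text content; mapped characters are emitted unescaped."""
    out = []
    for ch in text:
        if ch in character_map:
            out.append(character_map[ch])   # substituted as-is (assumption)
        else:
            out.append(escape_text(ch))
    return "".join(out)


def is_well_formed(document):
    """Re-parse the serialized result, as an external XML parser would."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hypothetical character map: guillemets stand in for markup delimiters.
char_map = {"\u00ab": "<", "\u00bb": ">"}

good = "<doc>" + serialize_text("a \u00abb/\u00bb c", char_map) + "</doc>"
bad = "<doc>" + serialize_text("a \u00abb\u00bb c", char_map) + "</doc>"

print(good, is_well_formed(good))
print(bad, is_well_formed(bad))
```

The second document contains a start tag with no matching end tag, so only
the re-parse reveals the problem; the serializer in this sketch performs no
checking of its own.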
qt-2004Feb0362-22: [Serial] I18N WG last call comments [25]
[substantive, announced] 2004-09-21
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,



Below please find the I18N WG's comments on your last call document

"XSLT 2.0 and XQuery 1.0 Serialization"

(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).



Please note the following:

- Please address all replies to these comments to the I18N IG mailing

   list (w3c-i18n-ig@w3.org), not just to me.

- Our comments are numbered in square brackets [nn].





We look forward to further discussion with you.



[this mail is copied to the DOM WG to tell them what we are

telling you about UTF-16 and endianness, which they should

adopt for the

Document Object Model (DOM) Level 3 Load and Save Specification]



General last call comments, not i18n-related:



[25] 7, Text Output Method: "The media-type parameter is applicable for

   the text output method.": What does that mean? How is it applied?



Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

It means, go and read the general (method-independent) description of

this parameter (which in this case, is not very enlightening...)


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
[25] 7, Text Output Method: "The media-type parameter is applicable for
   the text output method.": What does that mean? How is it applied?
>>

     Thanks to you and the working group for this comment.  The XSL and 
XML Query Working Groups discussed the comment.

     In [2], I indicated that, in response to the second comment number 12 
in [1], the working groups will clarify the description of the media-type 
parameter that appears in the table in section 3 of Serialization.[3]  In 
response to the comment at hand, the working groups will add a reference 
to that description to the text you've quoted above, and to similar text 
that appears in Sections 5.9 and 7.8 of Serialization.  The working groups 
believe that will clarify what it means for the parameter to be 
applicable.

     Similar descriptions of the use-character-maps parameter in sections 
5.9, 7.8 and 8 of Serialization will be clarified through the addition of 
references to section 9 of Serialization, which describes character maps 
in detail.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Sep/0079.html
[3] 
http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20040723/#serparam
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0362-24: [Serial] I18N WG last call comments [31]
[substantive, announced] 2004-12-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,



Below please find the I18N WG's comments on your last call document

"XSLT 2.0 and XQuery 1.0 Serialization"

(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).



Please note the following:

- Please address all replies to these comments to the I18N IG mailing

   list (w3c-i18n-ig@w3.org), not just to me.

- Our comments are numbered in square brackets [nn].





We look forward to further discussion with you.



[this mail is copied to the DOM WG to tell them what we are

telling you about UTF-16 and endianness, which they should

adopt for the

Document Object Model (DOM) Level 3 Load and Save Specification]



Editorial:



[31] Section 5 and Section 6: "If the data model includes a head element

   that has a meta element child, the processor should replace any

   content attribute of the meta element, or add such an attribute,

   with the value as described above, rather than output a new meta element."

   This is written as if there would be only one <meta> element.

   Replacement should only take place if the <meta> element has a

   http-equiv attribute with value 'Content-Type'.


Regards,    Martin.

Serialization: adding a <meta> element, Michael Kay (2004-08-31)
Minutes Oct. 28 telcon, Zarella Rendon (2004-10-28)
HTML/XHTML Serialization Resolutions, Scott Boag (2004-11-08)
In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

>>
[31] Section 5 and Section 6: "If the data model includes a head element
   that has a meta element child, the processor should replace any
   content attribute of the meta element, or add such an attribute,
   with the value as described above, rather than output a new meta 
element."
   This is written as if there would be only one <meta> element.
   Replacement should only take place if the <meta> element has a
   http-equiv attribute with value 'Content-Type'.
<<

The XSL and XQuery working groups have accepted your comment.  It is 
related to an issue about a serializer having to explore the entire 
contents of the HEAD element before deciding whether to add a new META 
element as the first child.  See [2].  We intend to resolve this by 
replacing the offending text with:

"If a META element has been added to the HEAD element as described above,
then any existing META element child of the HEAD element having an
http-equiv attribute with the value "Content-Type" MUST be discarded."

Please let us know if the resolution to this issue is acceptable.

Joanne Tong


[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
[2] http://lists.w3.org/Archives/Member/w3c-xsl-wg/2004Aug/0088.html
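
The replacement text quoted above can be sketched as follows.  This is a
minimal Python illustration, not the working groups' wording: the function
name add_content_type_meta is hypothetical, the format of the content value
is assumed, and comparing the element name and the http-equiv value without
regard to case is an assumption made for the html output method.

```python
import xml.etree.ElementTree as ET


def add_content_type_meta(head, media_type, encoding):
    """Add a META element as the first child of HEAD and discard any
    existing META child whose http-equiv attribute is "Content-Type"."""
    for child in list(head):
        is_meta = child.tag.lower() == "meta"
        http_equiv = child.get("http-equiv", "")
        if is_meta and http_equiv.lower() == "content-type":
            head.remove(child)          # discard the pre-existing declaration
    meta = ET.Element("meta", {
        "http-equiv": "Content-Type",
        "content": "%s; charset=%s" % (media_type, encoding),  # assumed format
    })
    head.insert(0, meta)
    return head


head = ET.fromstring(
    "<head><title>t</title>"
    '<meta name="author" content="x"/>'
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>'
    "</head>"
)
add_content_type_meta(head, "text/html", "UTF-8")
print(ET.tostring(head, encoding="unicode"))
```

Other META children, such as the author element here, are left untouched,
which is the point of the original comment.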
qt-2004Feb0362-25: [Serial] I18N WG last call comments [32]
[substantive, announced] 2004-12-08
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,



Below please find the I18N WG's comments on your last call document

"XSLT 2.0 and XQuery 1.0 Serialization"

(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).



Please note the following:

- Please address all replies to these comments to the I18N IG mailing

   list (w3c-i18n-ig@w3.org), not just to me.

- Our comments are numbered in square brackets [nn].





We look forward to further discussion with you.



[this mail is copied to the DOM WG to tell them what we are

telling you about UTF-16 and endianness, which they should

adopt for the

Document Object Model (DOM) Level 3 Load and Save Specification]



Editorial:



[32] Section 5 and Section 6: Note starting: "This escaping is deliberately

   confined to non-ASCII characters,": There are certain ASCII characters

   that are not allowed in URIs. They should be escaped.


Regards,    Martin.

RE: [Serial] I18N WG last call comments, Michael Kay (2004-02-15)

> [32] Section 5 and Section 6: Note starting: "This escaping 

> is deliberately

>    confined to non-ASCII characters,": There are certain 

> ASCII characters

>    that are not allowed in URIs. They should be escaped.



The decision here is very deliberate, as the text says. Note that

appendix B.2.1 of the HTML 4.0 specification also refers to %HH escaping

only in connection with non-ASCII characters.



Although characters such as spaces are not allowed in URIs, if you

escape them in URIs that are interpreted client-side, such as

javascript: URIs, the URI stops working in most browsers.  



Also, you can't escape an id attribute that acts as the target of a

link, because % is not valid in an ID attribute. In practice (whatever

the spec says) if you escape the URI fragment identifier of a same-page

URI reference but don't escape the corresponding ID attribute, the

browser doesn't match them up. In fact, the evidence appears to be that

browsers don't unescape URIs at all, they leave this to be done at the

server. Escaping non-ASCII characters, as we currently specify, appears

to work for fragment identifiers referring to a different page, but not

for same-page references. It's a mess, which is one reason why we now

provide the option to switch off automatic escaping of URIs and allow

the user to do it themselves using the escape-uri() function.



Regards,



Michael Kay

Minutes Oct. 28 telcon, Zarella Rendon (2004-10-28)
HTML/XHTML Serialization Resolutions, Scott Boag (2004-11-08)
In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

>>
[32] Section 5 and Section 6: Note starting: "This escaping is 
deliberately
   confined to non-ASCII characters,": There are certain ASCII characters
   that are not allowed in URIs. They should be escaped.
<<

The XSL and XQuery working groups have discussed your comment, and decline 
to act on it.  We endorse Mike Kay's response [2]:

The decision here is very deliberate, as the text says. Note that
appendix B.2.1 of the HTML 4.0 specification also refers to %HH escaping
only in connection with non-ASCII characters.

Although characters such as spaces are not allowed in URIs, if you
escape them in URIs that are interpreted client-side, such as
javascript: URIs, the URI stops working in most browsers. 

Also, you can't escape an id attribute that acts as the target of a
link, because % is not valid in an ID attribute. In practice (whatever
the spec says) if you escape the URI fragment identifier of a same-page
URI reference but don't escape the corresponding ID attribute, the
browser doesn't match them up. In fact, the evidence appears to be that
browsers don't unescape URIs at all, they leave this to be done at the
server. Escaping non-ASCII characters, as we currently specify, appears
to work for fragment identifiers referring to a different page, but not
for same-page references. It's a mess, which is one reason why we now
provide the option to switch off automatic escaping of URIs and allow
the user to do it themselves using the escape-uri() function.

Please let us know if the resolution to this issue is acceptable.

Joanne Tong


[1] 
http://www.w3.org/XML/Group/xsl-query-specs/last-call-comments/xquery-serialization/issues.xml#qt-2004Feb0362-25

[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html
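
The behaviour defended above, escaping that is confined to non-ASCII
characters, might look like the following minimal Python sketch.  The
function name escape_non_ascii is hypothetical, and the use of the UTF-8
octets of each character for the %HH escapes is an assumption based on the
HTML 4.0 appendix cited in the response.

```python
def escape_non_ascii(uri):
    """Replace each non-ASCII character by %HH escapes of its UTF-8 octets.

    ASCII characters, including ones such as space that are not allowed in
    URIs, are deliberately left alone.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)


print(escape_non_ascii("javascript:alert('a b')"))
print(escape_non_ascii("caf\u00e9.html#r\u00e9sum\u00e9"))
```

The javascript: URI passes through unchanged, so the client-side case
described in the response keeps working, while the fragment identifier in
the second URI is escaped and would no longer match an unescaped ID
attribute on the same page.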
qt-2004Feb0918-01: ORA-SE-341-B: serialization of XQuery DataModel instance is inadequate
[substantive, acknowledged] 2004-03-28



SECTION 2: 



Serialization of the Data Model is important because we want to be

able to share Data Model instances between applications over a

network in a standard way. Today the spec does not support this

(e.g. document boundaries are blurred, scalar values are converted

to text nodes).



We should use XML itself to solve the problem. The idea

is the same as using XML to represent XML schema and using XML to

represent XQuery (XQueryX).



First, we can define an XML schema that describes the

XQuery data model, then serialize the XQuery data model instance

based on this XML schema.

The key is to define a comprehensive XML schema to describe an

XQuery data model instance.



For example, consider an XQuery data model instance that consists

of 2 items, an xs:integer  of value 1, followed by an xs:string of

value 'abc'. It can be serialized as:

<xqdm:seq xmlns:xqdm="http://www.w3.org/2004/xqdm">
   <xqdm:item>
      <xqdm:type>xs_integer</xqdm:type>
      <xqdm:value>1</xqdm:value>
   </xqdm:item>
   <xqdm:item>
      <xqdm:type>xs_string</xqdm:type>
      <xqdm:value>abc</xqdm:value>
   </xqdm:item>
</xqdm:seq>



With this kind of serialization, the Data Model can be serialized

in exactly one way.





- Steve B.



    



Steve,



     In [1] you submitted the following comment on the serialization 

draft:



Steve Buxton wrote on 2004-02-17 06:31:41 AM:

> SECTION 2: 

> 

> Serialization of the Data Model is important because we want to be 

> able to share Data Model instances between applications over a 

> network in a standard way. Today the spec does not support this

> (e.g. document boundaries are blurred, scalar values are converted to 

> text nodes).

> 

> We should use XML itself to solve the problem. The idea

> is the same as using XML to represent XML schema and using XML to 

> represent XQuery (XQueryX).

> 

> First, we can  define an XML schema that describes the

> XQuery data model , then serialize the XQuery data model instance

> based on this XML schema.

> The key is to define a comprehensive XML schema to describe an XQuery

> data model instance.

> 

> For example, consider an XQuery data model instance that consists of

> 2 items, an

> xs:integer  of value 1, followed

> by an xs:string of value 'abc'. It can be serialized as:
> 
> <xqdm:seq xmlns:xqdm="http://www.w3.org/2004/xqdm">
>    <xqdm:item>
>       <xqdm:type>xs_integer</xqdm:type>
>       <xqdm:value>1</xqdm:value>
>    </xqdm:item>
>    <xqdm:item>
>       <xqdm:type>xs_string</xqdm:type>
>       <xqdm:value>abc</xqdm:value>
>    </xqdm:item>
> </xqdm:seq>

> 

> With this kind of serialization, the Data Model can be serialized in

> exactly one way.



     Thank you for submitting this comment.



     The XSL and XQuery working groups considered your comment and related 

comments.  There was general agreement that there is some need for a 

mechanism for serializing arbitrary sequences that preserves most or all 

of the properties of the items in an arbitrary sequence that is being 

serialized.



     However, the working groups decided that precisely defining all of 

the requirements for such a mechanism at this stage would be difficult, 

and would likely lead to a solution that would not satisfy real user 

requirements.  Therefore, the working groups decided to consider such a 

feature for a future revision of the recommendations, and close this 

comment without any changes to the specifications.



     May I ask you to confirm that this resolution is acceptable?



Thanks,



Henry

[1] 

http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0918.html

------------------------------------------------------------------

Henry Zongaro      Xalan development

IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044

mailto:zongaro@ca.ibm.com
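
For readers who want to experiment with the commenter's proposal, which the
working groups did not adopt, the following minimal Python sketch produces
the format shown in the comment and reads it back.  The function name
serialize_sequence is hypothetical, and the xqdm namespace and the type
names such as xs_integer are taken from the comment's example only; no such
schema is defined by any specification.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace URI as used in the commenter's example (not a defined schema).
XQDM_NS = "http://www.w3.org/2004/xqdm"


def serialize_sequence(items):
    """Serialize (type_name, lexical_value) pairs in the proposed format."""
    lines = ['<xqdm:seq xmlns:xqdm="%s">' % XQDM_NS]
    for type_name, value in items:
        lines.append("   <xqdm:item>")
        lines.append("      <xqdm:type>%s</xqdm:type>" % escape(type_name))
        lines.append("      <xqdm:value>%s</xqdm:value>" % escape(value))
        lines.append("   </xqdm:item>")
    lines.append("</xqdm:seq>")
    return "\n".join(lines)


doc = serialize_sequence([("xs_integer", "1"), ("xs_string", "abc")])
print(doc)

# Reading the document back recovers both the type and the value of each item.
root = ET.fromstring(doc)
items = [
    (item.find("{%s}type" % XQDM_NS).text, item.find("{%s}value" % XQDM_NS).text)
    for item in root
]
print(items)
```

Unlike the normalization in section 2, this representation keeps item
boundaries and type names, which is the property the comment asks for.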

qt-2004Feb0919-01: ORA-SE-292-B: Processing of empty sequence is roundabout and confusing
[substantive, raised] 2004-02-17



SECTION 2: Serializing arbitrary data models



Step 1 says to replace an empty sequence by a zero-length string,

which is presumably an atomic value of type xs:string of length

0.  Step 2 casts this xs:string to xs:string, a no-op.

Step 3 would add a space if there were more than one 

atomic value in the sequence, but there is not, so this step is 

a no-op.

Step 4 says to convert this atomic value into a text node

with the same string value.  But a text node may not have a 

zero-length string as its content; this is a bug in the 

specification.  Perhaps step 4 should be 

interpreted as simply deleting any zero-length xs:string 

values from the sequence.  In that case, if we started with an

empty sequence, we are back to an empty sequence, and one wonders

why step 1 is there.  Or perhaps step 4 is intended to raise 

a serialization error.  If that is the intention, please say so.



- Steve B.
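
The walk-through in the comment above can be reproduced with a minimal
Python sketch.  It paraphrases the four steps as the comment describes them,
not the text of the draft; it handles only sequences of atomic values; and
the ("text", value) tuple is a hypothetical stand-in for a text node.

```python
def normalize(sequence):
    """Apply steps 1 to 4 to a sequence of atomic values (str or int)."""
    # Step 1: an empty sequence is replaced by a zero-length string.
    seq = list(sequence) if len(sequence) > 0 else [""]
    # Step 2: each atomic value is cast to a string.
    seq = [str(value) for value in seq]
    # Step 3: adjacent strings are joined, separated by a single space.
    merged = " ".join(seq)
    # Step 4: the resulting string becomes a text node with that value.
    return [("text", merged)]


print(normalize([1, "abc"]))
print(normalize([]))
```

For the empty sequence the result is a text node whose content is the
zero-length string, which is exactly the case the comment identifies as
problematic.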



    
qt-2004Feb0921-01: ORA-SE-300-B: Implementation-defined output methods need not normalize
[substantive, acknowledged] 2004-07-13



SECTION 3: Serialization parameters



It says "The method identifies the overall method...

If the QName is in a namespace, then it identifies an 

implementation-defined output method; the behavior in this case 

is not specified by this document."  However, you have specified

that normalization (section 2) occurs prior to invoking the 

method.  This implies that the implementation-defined method

has no control over the normalization.  It would be desirable to

reverse this, so that normalization occurs inside the method

rather than prior to the method.  In that case, normalization

would be part of the standard-defined methods but 

implementation-defined methods might have other algorithms

for dealing with values permitted by the data model that do

not correspond to well-formed XML.



To accomplish this, simply make the current section 2 into 

the first phase, prior to "Markup generation".



- Steve B.



    
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
SECTION 3: Serialization parameters

It says "The method identifies the overall method...
If the QName is in a namespace, then it identifies an 
implementation-defined output method; the behavior in this case 
is not specified by this document."  However, you have specified
that normalization (section 2) occurs prior to invoking the 
method.  This implies that the implementation-defined method
has no control over the normalization.  It would be desirable to
reverse this, so that normalization occurs inside the method
rather than prior to the method.  In that case, normalization
would be part of the standard-defined methods but 
implementation-defined methods might have other algorithms
for dealing with values permitted by the data model that do
not correspond to well-formed XML.

To accomplish this, simply make the current section 2 into 
the first phase, prior to "Markup generation".
>>

     Thank you for this comment.

     The XSL and XML Query Working Groups discussed your comment, and 
agreed that implementation-defined output methods should be granted 
control of whether the normalization of arbitrary sequences that is 
specified by section 2 occurs.

     As a representative of Oracle was present when this decision was 
made, I will assume the response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0921.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0922-01: ORA-SE-302-B: Phase 1, "Markup generation", is poorly specified
[substantive, acknowledged] 2004-08-31



SECTION 3: Serialization parameters



It is not clear what the scope of phase 1, "Markup generation", is.

It says that this phase "produces the representation of start

and end tags for elements, and other constructs such as ...".

The use of "such as..." is non-specific.  It would be better to

provide a complete list of what is included.  Even the phrase

"representation of start and end tags for elements" is not very

clear -- does this include the attributes and namespace declarations

within the start tag?





- Steve B.



    
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and decided to clarify the description of the 
markup generation phase by replacing the first bullet of Section 4 of 
Serialization with the following:

<<
1. Markup generation produces the character representation of those parts 
of the serialized result that describe the structure of the normalized 
instance of the data model.  In the cases of the xml, html and xhtml 
output methods, this phase produces the character representations of the 
following:

 o the document type declaration;
 o start tags and end tags (except for the attribute values,
   whose representation is produced by the character expansion
   phase);
 o processing instructions; and
 o comments.

In the case of the xml and xhtml output methods, this phase also produces 
the following:

 o the XML or text declaration; and
 o empty element tags (except for the attribute values).

In the case of the text output method, this phase has no effect.
>>

     In addition, the working groups decided to add a statement to the 
effect that the phases of serialization apply to the output methods 
defined by the Serialization specification, and that it is 
implementation-defined whether any apply for an implementation-defined 
output method.

     As a representative of Oracle was present when this decision was 
made, I will assume the response is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0922.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0923-01: ORA-SE-304-Q: possible parameter for how to handle elements with no children
[substantive, acknowledged] 2004-09-21



SECTION 3: Serialization parameters



Perhaps there should be a parameter to indicate whether to 

output elements with no children as start-tag plus end-tag or as

empty-element tags.



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
SECTION 3: Serialization parameters

Perhaps there should be a parameter to indicate whether to 
output elements with no children as start-tag plus end-tag or as
empty-element tags.
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and decided that such a parameter was not 
necessary, as it would not affect the Infoset of the serialized result. In 
addition, it was noted that such a parameter might conflict with the 
requirements of the xhtml and html output methods. 

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0923.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
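
The working groups' observation that the choice does not affect the Infoset
can be checked with a minimal Python sketch.  The helper name shape is
hypothetical, and the tuple it builds is only an approximation of the
Infoset properties that matter here (names, attributes and character
content).

```python
import xml.etree.ElementTree as ET


def shape(xml_text):
    """Return a nested tuple of element names, attributes and text."""
    def walk(elem):
        return (
            elem.tag,
            sorted(elem.attrib.items()),
            elem.text or "",
            [walk(child) for child in elem],
            elem.tail or "",
        )
    return walk(ET.fromstring(xml_text))


with_empty_tag = '<a x="1"><b/></a>'
with_tag_pair = '<a x="1"><b></b></a>'

print(shape(with_empty_tag) == shape(with_tag_pair))
```

Both spellings of the childless element parse to the same structure, so a
consumer that works from the parsed document cannot tell them apart.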
qt-2004Feb0924-01: ORA-SE-308-C: What circumstances are meant by "in all other circumstances"?
[substantive, acknowledged] 2004-07-13



SECTION 4: XML output method



Second para: "In all other circumstances, the serialized form must

comply with the requirements described for the xml output method."

It is not clear what "all other circumstances" is contrasting

itself with.  One naturally looks back to the first paragraph

to see what conditions it lays out.  The only condition in that

paragraph appears to be "unless the processor is unable to 

satisfy those rules...".  Thus the logical structure appears to

be:



test if the processor is able to satisfy such-and-such rules

  if yes, the beginning of the first paragraph applies

  if no, the second paragraph applies.



But this seems unlikely to be your intent.





- Steve B.



    
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
SECTION 4: XML output method

Second para: "In all other circumstances, the serialized form must
comply with the requirements described for the xml output method."
It is not clear what "all other circumstances" is contrasting
itself with.  One naturally looks back to the first paragraph
to see what conditions it lays out.  The only condition in that
paragraph appears to be "unless the processor is unable to 
satisfy those rules...".  Thus the logical structure appears to
be:

test if the processor is able to satisfy such-and-such rules
  if yes, the beginning of the first paragraph applies
  if no, the second paragraph applies.

But this seems unlikely to be your intent.
>>

     Thank you for this comment.

     The XSL and XML Query Working Groups discussed your comment.  In 
response to other comments in Section 4, the working groups decided to 
reword the first and third paragraphs of that section in order to make it 
clear that a serialization error results if the serialized result is not a 
well-formed document entity or external general parsed entity.  In 
addition, the second paragraph of section 4 was deleted.

     The working groups believe that these changes address your comment as 
well.

     I believe that a representative of Oracle was present when this 
decision was made, so I will assume the response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0924.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0926-01: ORA-SE-312-B: Missing exception for additional whitespace added by indent parameter
[substantive, acknowledged] 2004-04-13



SECTION 4: XML output method



Setting the indent parameter to yes may introduce additional

whitespace in the output.  Reparsing the output value may 

retain this additional whitespace, for example, if it is added

to an element of mixed content.  This exception is not listed.

(You have an exception for the character expansion phase, but

the indent parameter is processed by the Markup generation 

phase, so the exception for character expansion does not 

cover the action of the indent parameter.)





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

> SECTION 4: XML output method
> 
> Setting the indent parameter to yes may introduce additional
> whitespace in the output.  Reparsing the output value may 
> retain this additional whitespace, for example, if it is added
> to an element of mixed content.  This exception is not listed.
> (You have an exception for the character expansion phase, but
> the indent parameter is processed by the Markup generation 
> phase, so the exception for character expansion does not 
> cover the action of the indent parameter.)

     Thank you for your comment.

     The XSL and XQuery working groups discussed your comment, and agreed 
with your analysis.  The following item will be added to the bulleted list 
in section 4 to address this comment:

<<
o Additional text nodes consisting of whitespace characters may be present 
in the new tree and some text nodes in the new tree may contain additional 
whitespace characters that were not present in the original tree if the 
indent parameter has the value yes, as described in 4.3 XML Output Method: 
the indent Parameter.
>>

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0926.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
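
The effect described in this issue can be seen in a minimal Python sketch.
The indented string is a hand-written, hypothetical example of what an
indenting serializer might emit for mixed content; it is not the output of
any particular processor.

```python
import xml.etree.ElementTree as ET


def string_value(xml_text):
    """Concatenate all text content of the parsed document."""
    return "".join(ET.fromstring(xml_text).itertext())


original = "<p>some <b>bold</b> text</p>"
# Hypothetical result of serializing the same tree with indent=yes.
indented = "<p>some \n  <b>bold</b>\n text</p>"

print(repr(string_value(original)))
print(repr(string_value(indented)))
```

Because the p element has mixed content, the added line breaks and spaces
become part of its text nodes when the output is parsed again, so the new
tree differs from the original one in whitespace only.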
qt-2004Feb0927-01: ORA-SE-315-Q: How can character expansion create new nodes?
[substantive, acknowledged] 2004-04-13



SECTION 4: XML output method



Final bullet says "Additional nodes may be present in the new tree

.. due to the character expansion phase of serialization."

Could you please give an example of how character expansion can

cause new nodes?  I don't see how any of the four kinds of

character expansion (URI escaping, CDATA sections, character

mapping, special character references) can cause a new node.

While these character expansions might change the physical 

presentation of a text node, I don't see how they can cause one

text node to become two text nodes, for example.





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

> SECTION 4: XML output method
> 
> Final bullet says "Additional nodes may be present in the new tree
> .. due to the character expansion phase of serialization."
> Could you please give an example of how character expansion can
> cause new nodes?  I don't see how any of the four kinds of
> character expansion (URI escaping, CDATA sections, character
> mapping, special character references) can cause a new node.
> While these character expansions might change the physical 
> presentation of a text node, I don't see how they can cause one
> text node to become two text nodes, for example.

     Thank you for your comment.

     The XSL and XQuery working groups discussed your comment, and decided 
to add a note to clarify the situation.  I would like to add the following 
note to the final bullet of the bulleted list in section 4.

<<
Note:  The use-character-maps parameter can cause arbitrary characters to 
be inserted into the serialized XML document in an unescaped form, 
including characters that would be considered part of XML markup.  Such 
characters could result in arbitrary new element nodes, attribute nodes, 
and so on, in the new tree that results from processing the serialized XML 
document.
>>

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0927.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
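
The note proposed above can be illustrated with a minimal Python sketch in
which a single text node turns into text plus an element after
re-parsing.  The function name serialize_element and the character map are
hypothetical, and the sketch assumes that mapped characters are written to
the output without escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_element(name, text, character_map=None):
    """Serialize an element with one text node child, applying a character map."""
    character_map = character_map or {}
    body = "".join(
        character_map[ch] if ch in character_map else escape(ch)
        for ch in text
    )
    return "<%s>%s</%s>" % (name, body, name)


text = "x\u00abhr/\u00bby"                     # one text node, no markup
char_map = {"\u00ab": "<", "\u00bb": ">"}      # hypothetical character map

plain = serialize_element("p", text)
mapped = serialize_element("p", text, char_map)

print(len(ET.fromstring(plain)))    # child elements without the map
print(len(ET.fromstring(mapped)))   # child elements with the map
```

With the character map in effect the serialized form is <p>x<hr/>y</p>, so
the tree built from it contains an hr element that was not present in the
original tree.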
qt-2004Feb0928-01: ORA-SE-326-B: XML declaration is mandatory if the version is not 1.0
[substantive, acknowledged] 2004-04-13



SECTION 4.5: XML output method: the omit-xml-declaration parameter



The last sentence says "The omit-xml-declaration parameter must

be ignored if the standalone parameter is present, or if the 

encoding parameter specifies a value other than UTF-8 or UTF-16."

That is, if standalone is specified, then an XML declaration is 

mandatory in the output.  Isn't an XML declaration also mandatory

if the version is not 1.0?  That should probably be added to 

the list in this sentence.





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

> SECTION 4.5: XML output method: the omit-xml-declaration parameter
> 
> The last sentence says "The omit-xml-declaration parameter must
> be ignored if the standalone parameter is present, or if the 
> encoding parameter specifies a value other than UTF-8 or UTF-16."
> That is, if standalone is specified, then an XML declaration is 
> mandatory in the output.  Isn't an XML declaration also mandatory
> if the version is not 1.0?  That should probably be added to 
> the list in this sentence.

     Thank you for your comment.

     The XSL and XQuery working groups discussed your comment, and 
concluded that, although XML 1.1 requires a document entity to have an XML 
declaration, it does not require an external general parsed entity to have 
a text declaration.

     However, prompted by your comment, the working groups decided to 
formulate their requirements for the omit-xml-declaration parameter to fit 
with the requirements of XML 1.0 and XML 1.1.  The specification will 
require the setting of the omit-xml-declaration parameter to be obeyed 
always, and will require conflicts between the settings of that parameter 
and other parameters to be considered a serialization error.  A host 
language would, of course, have the option of ensuring such conflicts 
never arise through whatever language-specific mechanism it uses to 
specify serialization parameters.

     In particular, the working groups decided that if the serialized 
result could be considered to be the text declaration of an external 
general parsed entity, the omit-xml-declaration parameter could have the 
value yes or the value no, and the parameter's setting would take effect. 
They further decided that if the serialized result could only be 
considered to be a document entity because

  o the standalone parameter had the value yes or no; or
  o the version parameter had a value other than 1.0 and the
    doctype-system parameter was supplied

the omit-xml-declaration parameter must have the value no.  Otherwise, a 
serialization error results.
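
     The conflict rule above can be sketched in a few lines of Python. 
This is only an illustration, not text from the specification: the 
function names and the SerializationError class are hypothetical, and the 
sketch assumes that an absent standalone or doctype-system parameter is 
represented by None.

```python
class SerializationError(ValueError):
    """Hypothetical error type used only by this sketch."""


def must_be_document_entity(standalone=None, version="1.0", doctype_system=None):
    # The result can only be a document entity if a standalone value is
    # given, or if the version is not 1.0 and doctype-system is supplied.
    return standalone in ("yes", "no") or (
        version != "1.0" and doctype_system is not None
    )


def check_omit_xml_declaration(omit_xml_declaration, standalone=None,
                               version="1.0", doctype_system=None):
    """Return True if an XML declaration should be emitted.

    Raises SerializationError when omit-xml-declaration=yes conflicts
    with the other parameter settings.
    """
    if omit_xml_declaration not in ("yes", "no"):
        raise SerializationError("omit-xml-declaration must be yes or no")
    if omit_xml_declaration == "yes" and must_be_document_entity(
            standalone, version, doctype_system):
        raise SerializationError(
            "omit-xml-declaration=yes conflicts with other parameters")
    return omit_xml_declaration == "no"
```

     Under these assumptions, omit-xml-declaration=yes with version 1.1 
and no doctype-system is accepted, while the same setting combined with 
standalone=no is reported as an error.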

     As you were present when this decision was made, I will take it that 
the decision is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0928.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0929-01: ORA-SE-320-B: What does it mean to say two data models (sic) are the same?
[substantive, decided] 2004-11-13



SECTION 4: XML output method

Fifth paragraph says "In addition, the output must be such that 
if a new tree was constructed by parsing the XML document and 
converting it into a data model 
as specified in [Data Model], then the new data model
would be the same as the starting data model, with the following
possible exceptions:...".  The word "same" is not defined for
data models (sic; you mean "sequence", "item", "node" or 
"document node").  One cannot apply the word "same" 
literally to properties that are sequences of nodes, 
such as parent, children, attributes, and namespaces, since it
is impossible to construct nodes with the same node identity as
the original value.  You may wish to look at how SQL/XML:2003
handled this issue (see Subclause 10.3 "Determination of 
identical values").

- Steve B.



    
Serialization and round-tripping, Henry Zongaro (2004-06-12)
Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)
qt-2004Feb0930-01: ORA-SE-301-B: Indent parameter should not apply to (potentially) mixed-mode elements
[substantive, acknowledged] 2004-09-01



SECTION 3: serialization parameters

It says "indent specifies whether the processor may add additional
whitespace when outputting the data model...".  
It is not clear to what extent this interacts with the 
properties of nodes.  For example, if an element's type permits
mixed content, then adding whitespace to that element's content
potentially damages that element's semantics.  If an element has
not been validated, then it is possible that that element is 
intended to have mixed content and the processor just doesn't know
it, so again, the conservative thing to do is to prohibit adding
whitespace.  If an element has been validated and is known to 
have only elements in its content model, then it would be permissible
to add whitespace to that element's content on output as a 
pretty-printing option.  The conclusion is that this parameter
should only govern the output of such elements.

- Steve B.



    
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
SECTION 3: serialization parameters

It says "indent specifies whether the processor may add additional
whitespace when outputting the data model...". 
It is not clear to what extent this interacts with the 
properties of nodes.  For example, if an element's type permits
mixed content, then adding whitespace to that element's content
potentially damages that element's semantics.  If an element has
not been validated, then it is possible that that element is 
intended to have mixed content and the processor just doesn't know
it, so again, the conservative thing to do is to prohibit adding
whitespace.  If an element has been validated and is known to 
have only elements in its content model, then it would be permissible
to add whitespace to that element's content on output as a 
pretty-printing option.  The conclusion is that this parameter
should only govern the output of such elements.
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment.  Recently, the working groups made a decision in 
principle to define serialization in terms of the mapping to Infoset 
defined by Data Model.  As such, the concept of the content model of an 
element will no longer be available when the indent parameter takes effect.

     In order to give some guidance to processors regarding the effect of 
indent on elements with mixed content, the working groups decided to add a 
statement that whitespace should not be added where it might be 
significant.  I would like to add the following item to the bulleted list 
in Section 5.3 of the 23 July draft of Serialization:

<<
o Whitespace characters should not be added in places where the characters 
would be significant - for example, in the content of an element whose 
content model is known to be mixed.
>>
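
     As an illustration only, the following Python sketch shows one 
conservative way a processor might apply that guidance.  It assumes a 
heuristic that is not part of the proposed text: an element is treated as 
possibly mixed whenever it already contains any character data at all, 
including whitespace.  The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def may_be_mixed(elem):
    # Assumption of this sketch: any existing character data between the
    # children might be significant, so leave such elements untouched.
    if elem.text:
        return True
    return any(child.tail for child in elem)


def indent(elem, level=0, step="  "):
    """Add indentation only inside elements that hold nothing but elements."""
    if len(elem) == 0 or may_be_mixed(elem):
        return
    pad = "\n" + step * (level + 1)
    elem.text = pad
    for child in elem:
        indent(child, level + 1, step)
        child.tail = pad
    # The last child is followed by the parent's closing tag.
    elem[-1].tail = "\n" + step * level


root = ET.fromstring("<a><b><c/></b><p>text <i>x</i> more</p></a>")
indent(root)
result = ET.tostring(root, encoding="unicode")
```

     In this example the content of the p element is left exactly as it 
was, while the element-only content of a and b is indented.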

     As a representative of Oracle was present when this decision was 
made, I will assume the response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0930.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0932-01: ORA-SE-309-B: Poorly worded constraints on the output
[substantive, acknowledged] 2004-06-07



SECTION 4: XML output method

The constraints expressed in the third paragraph, as currently
worded, seem incomplete.  The paragraph says
"If the document node of the data model has a single element node 
child and no text node children, and the serialized output is a 
well-formed XML document entity, the serialized output must conform 
to the XML Namespaces Recommendation [XML Names]. If the data model 
does not take this form, and the serialized output is a well-formed 
XML external general parsed entity, then the serialized output must 
be an entity which, when referenced within a trivial XML document 
wrapper like this

<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>

where entity-URI is a URI for the entity, produces a document which 
must itself be a well-formed XML document conforming to the XML 
Namespaces Recommendation [XML Names]."

This language seems to leave open the following possibilities:
1. The document node has a single element node child and no text
node child, but the serialized output is not well-formed XML.
2. The document node does not have a single element node child,
or has a text node child, but the serialized output is not a 
well-formed XML external general parsed entity.

I think the solution is to reword the paragraph as follows:

If the document node of the input value has a single element node 
child and no text node children, then the serialized output shall
be a well-formed XML document entity that conforms to the XML 
Namespaces Recommendation [XML Names]. Otherwise, the serialized
output shall be a well-formed XML external general parsed entity, 
which, when referenced within a trivial XML document wrapper like 
this

<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>

where entity-URI is a URI for the entity, produces a document 
which must itself be a well-formed XML document conforming to the 
XML Namespaces Recommendation [XML Names].

- Steve B.



    

Hi, Steve.

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

Steve Buxton wrote on 2004-02-17 06:44:15 AM:
> SECTION 4: XML output method
> 
> The constraints expressed in the third paragraph, as currently
> worded, seem incomplete.  The paragraph says
> "If the document node of the data model has a single element node 
> child and no text node children, and the serialized output is a 
> well-formed XML document entity, the serialized output must conform 
> to the XML Namespaces Recommendation [XML Names]. If the data model 
> does not take this form, and the serialized output is a well-formed 
> XML external general parsed entity, then the serialized output must 
> be an entity which, when referenced within a trivial XML document 
> wrapper like this
> 
> <!DOCTYPE doc [
> <!ENTITY e SYSTEM "entity-URI">
> ]>
> <doc>&e;</doc>
> 
> where entity-URI is a URI for the entity, produces a document which 
> must itself be a well-formed XML document conforming to the XML 
> Namespaces Recommendation [XML Names]."
> 
> This language seems to leave open the following possibilities:
> 1. The document node has a single element node child and no text
> node child, but the serialized output is not well-formed XML.
> 2. The document node does not have a single element node child,
> or has a text node child, but the serialized output is not a 
> well-formed XML external general parsed entity.
> 
> I think the solution is to reword the paragraph as follows:
> 
> If the document node of the input value has a single element node 
> child and no text node children, then the serialized output shall
> be a well-formed XML document entity that conforms to the XML 
> Namespaces Recommendation [XML Names]. Otherwise, the serialized 
> output shall be a well-formed XML external general parsed entity, 
> which, when referenced within a trivial XML document wrapper like 
> this
> 
> <!DOCTYPE doc [
> <!ENTITY e SYSTEM "entity-URI">
> ]>
> <doc>&e;</doc>
> 
> where entity-URI is a URI for the entity, produces a document 
> which must itself be a well-formed XML document conforming to the 
> XML Namespaces Recommendation [XML Names].

     Thank you for your comment.

     The XSL and XML Query Working Groups discussed your comment.  The 
working groups agreed that the first paragraph of section 4 was intended 
to place a requirement on the serialization process that it must produce a 
well-formed entity (a document entity or external general parsed entity, 
as appropriate), unless it is unable to do so because of the effect of the 
character expansion phase of serialization.  Otherwise, a serialization 
error results.

     In response to your comment and a related comment on the first three 
paragraphs of section 4, the working groups decided to make clear the 
intent of the first and third paragraphs of section 4 by making the 
following changes:

- in the first sentence of the third paragraph, change "and the"
  to "then", to make clear the conditions under which a
  document entity will be the result of the serialization process.

- change the wording to make it clear that these rules describe
  requirements on the processor, rather than on the user.  The
  processor will be required to produce a serialization error if
  it is unable to produce a well-formed entity of the appropriate
  kind, unless that is because of the action of the character
  expansion phase of serialization.
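
     The distinction between the two kinds of entity can be sketched in 
Python as follows.  This is an illustration under stated assumptions, not 
the working groups' wording: the function names are hypothetical, the 
children of the document node are modelled as a plain list of node-kind 
strings, and the wrapper test inlines the entity text into <doc>...</doc> 
instead of resolving an external entity reference, which is only 
equivalent when the entity has no text declaration.

```python
import xml.etree.ElementTree as ET


def expected_entity_kind(child_kinds):
    """Decide which kind of well-formed entity the serializer must produce.

    child_kinds lists the kinds of the document node's children,
    for example ["comment", "element"].
    """
    if child_kinds.count("element") == 1 and child_kinds.count("text") == 0:
        return "document entity"
    return "external general parsed entity"


def well_formed_in_wrapper(entity_text):
    # Approximates referencing the entity from the trivial wrapper document.
    # The ElementTree parser is namespace-aware, so an undeclared prefix
    # is rejected as well.
    try:
        ET.fromstring("<doc>" + entity_text + "</doc>")
    except ET.ParseError:
        return False
    return True
```

     For example, a document node with children of kinds comment and 
element calls for a document entity, whereas one with two element 
children calls for an external general parsed entity such as "<a/><b/>", 
which passes the wrapper test.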

     As this seems to be in agreement with your proposed rewording, and a 
representative of Oracle was present when this decision was made, I will 
assume the response is acceptable.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0932.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0936-01: ORA-SE-317-B: document-uri property cannot be serialized
[substantive, decided] 2004-04-06



SECTION 4: XML output method

The third bullet says "The base URIs in the two trees may be
different."  Document nodes also have a property called 
document-uri; probably the document-uri is not recoverable by
reparsing a serialization either.

- Steve B.



    
qt-2004Feb0976-01: [Serial] IBM-SE-100: Default parameter values should account for specifics for particular output methods
[substantive, acknowledged] 2004-03-28



[My apologies that these comments are coming in after the end of the Last 
Call comment period.]

Section 6

This section states that the default value for the version method is 4.0, 
while section 3 states that the default is implementation defined.  The 
two statements need to be reconciled.  The same comment probably applies 
to other parameters.

Thanks,

Henry
[Speaking on behalf of reviewers from IBM.]
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com



    



> Section 6
> 
> This section states that the default value for the version method is
> 4.0, while section 3 states that the default is implementation 
> defined.  The two statements need to be reconciled.  The same 
> comment probably applies to other parameters.

Actually, I think that the Serialization spec should not define default
values for any parameters. This should be up to the client application
to specify.

Michael Kay




Hello,

     In [1], I submitted the following comment on the last call draft of 
Serialization on behalf of IBM:

Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:45:45 PM:
> Section 6
>
> This section states that the default value for the version method is
> 4.0, while section 3 states that the default is implementation 
> defined.  The two statements need to be reconciled.  The same 
> comment probably applies to other parameters.

In response [2], Michael Kay proposed:

Michael Kay wrote on 2004-02-18 03:31:48 AM:
> Actually, I think that the Serialization spec should not define default
> values for any parameters. This should be up to the client application
> to specify.

     The XSL and XQuery working groups considered this comment, and 
decided to accept Michael Kay's suggestion.

     This note announces the decision and signals my acceptance of the 
response.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0976.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0988.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

qt-2004Feb0977-01: [Serial] IBM-SE-101: Default HTML version
[substantive, acknowledged] 2004-03-28
[Serial] IBM-SE-101: Default HTML version, Henry Zongaro (2004-02-17)



[My apologies that these comments are coming in after the end of the Last 
Call comment period.]

Section 6

The default version of HTML should probably be 4.01.  That's the default 
specified by XSLT.

Thanks,

Henry
[Speaking on behalf of reviewers from IBM.]
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com



    



Hello,

     In [1], I submitted the following comment on the last call draft of 
Serialization on behalf of IBM:

Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:46:35 PM:
> Section 6
> 
> The default version of HTML should probably be 4.01.  That's the 
> default specified by XSLT.

     The XSL and XQuery working groups considered this comment, and 
decided to have client specifications of serialization specify all 
parameter value settings.  No defaults will be specified by the 
serialization specification.

     This note announces the decision and signals my acceptance of the 
response.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0977.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

qt-2004Feb0980-01: [Serial] IBM-SE-103: Treatment of whitespace in XHTML attributes
[substantive, acknowledged] 2004-06-07



[My apologies that these comments are coming in after the end of the Last 
Call comment period.]

Section 5

The third bullet of this section states, "The serializer should avoid 
outputting line breaks and multiple whitespace characters within attribute 
values."  It's not clear what a processor should do in such cases.  This 
should state that these characters should be replaced by a single space 
character.

Thanks,

Henry
[Speaking on behalf of reviewers from IBM.]
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com



    

Hello,

     In [1], I submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of IBM:

Henry Zongaro/Toronto/IBM wrote on 2004-02-17 08:53:23 PM:
> Section 5
> 
> The third bullet of this section states, "The serializer should 
> avoid outputting line breaks and multiple whitespace characters 
> within attribute values."  It's not clear what a processor should do
> in such cases.  This should state that these characters should be 
> replaced by a single space character.

     The XSL and XML Query Working Groups discussed the comment, and 
decided to remove this rule about whitespace in attributes for the xhtml 
output method.  All the other rules for describing the formatting 
requirements of the xhtml output method are strictly under the control of 
the processor, but in this case, the user has control of the content of 
the data model instance that is to be serialized, so the serialization 
process should just leave that to the user's control.  Instead, the rule 
will be replaced with a non-normative reference to the compatibility 
appendix of XHTML as guidance to the user. 

     This note announces and acknowledges that decision.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0980.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0996-01: FW: XSLT 2.0: XML Output Method: the omit-xml-declaration Parameter
[substantive, announced] 2004-04-13

4.5 XML Output Method: the omit-xml-declaration Parameter 
http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3 

This says: "The omit-xml-declaration parameter must be ignored if the 
standalone parameter is present, or if the encoding parameter specifies 
a value other than UTF-8 or UTF-16." 

I would like to control the output of the omit-xml-declaration parameter,
where the encoding parameter specifies a value other than UTF-8 or UTF-16.
I often don't use Unicode. I would like the option to output with
non-standard encoding as XHTML. The XHTML standard
(http://www.w3.org/TR/xhtml1/) specifies that "an XML declaration is not
required in all XML documents"; it is often desirable to omit it, given
that it is known that there are unexpected results with some user agents.

Thanks

Deborah





BBCi at http://www.bbc.co.uk/








    

Deborah,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

> 4.5 XML Output Method: the omit-xml-declaration Parameter 
> http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/#N105F3 
> 
> This says: "The omit-xml-declaration parameter must be ignored if the 
> standalone parameter is present, or if the encoding parameter specifies 
> a value other than UTF-8 or UTF-16." 
> 
> I would like to control the output of the omit-xml-declaration 
> parameter, where the encoding parameter specifies a value other than
> UTF-8 or UTF-16. I often don't use Unicode. I would like the option 
> to output with non-standard encoding as XHTML. The XHTML standard (
> http://www.w3.org/TR/xhtml1/) specifies that "an XML declaration is 
> not required in all XML documents"; it is often desirable to omit 
> it, given that it is known that there are unexpected results with 
> some user agents.

     Thank you for your comment.

     The XSL and XQuery Working groups discussed your comment.  As 
originally written, XML 1.0 required an XML declaration or a text 
declaration if the encoding of the document or external general parsed 
entity was anything other than UTF-8 or UTF-16.  XSLT 1.0 enforced that 
requirement in its serialization mechanism.  The draft of Serialization 
inherited that behaviour from XSLT 1.0.  However, an erratum to XML 1.0 
removed that requirement.

     In response to your comment, the working groups decided to permit 
the XML declaration or text declaration to be omitted, regardless of the 
setting of the encoding parameter.  Serialization will permit an XML 
declaration or text declaration to be omitted in precisely those 
circumstances in which it can be omitted according to XML 1.0 and XML 1.1. 
This would affect both the xml and xhtml output methods.

     As that is the change you requested, I believe that decision will be 
acceptable to you.   May I ask you to confirm that it is?

Thanks,

Henry [On behalf of the XSL and XQuery Working Groups.]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0996.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1040-01: ORA-SE-305-E: Phase 2 should mention generation of character references
[substantive, acknowledged] 2004-08-31



SECTION 3: Serialization parameters

Phase 2, "Character markup", fourth bullet, mentions 
escaping of special characters such as &lt;.  You could 
also mention here the creation of character references 
for characters that are not representable in the encoding.

- Steve B.



    
Final minutes of the Redmond 2004 face to face, Massimo Marchiori (2004-09-06)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
SECTION 3: Serialization parameters

Phase 2, "Character markup", fourth bullet, mentions 
escaping of special characters such as &lt;.  You could 
also mention here the creation of character references 
for characters that are not representable in the encoding.
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and decided, because of the interactions between 
Unicode normalization and creation of character references, to fold 
together character expansion and Unicode normalization, and at the same 
time, add creation of character references to the character expansion 
phase.

     Specifically, the working groups decided to replace the second and 
third bullets of Section 4 of Serialization with the following text:

<<
2. Character expansion is concerned with the representation of
   characters appearing in text and attribute nodes in the
   instance of the data model. The substitution processes that
   may apply are listed below, in priority order: a character
   that is handled by one process in this list will be
   unaffected by processes appearing later in the list, except
   that a character affected by Unicode normalization may be
   affected by creation of CDATA sections and by character
   escaping

   o URI escaping (in the case of URI-valued attributes in the
     HTML and XHTML output methods), as determined by the
     escape-uri-attributes parameter

   o Character mapping, as determined by the use-character-maps
     parameter.  Text nodes that are children of elements
     specified by the cdata-section-elements parameter are not
     affected by this step. 

   o Unicode Normalization, if requested by the
     normalization-form parameter. Unicode normalization is
     applied to the character stream that results after all
     markup generation and character expansion has taken place.

     For the definitions of the various normalization forms,
     see [Character Model for the World Wide Web 1.0]

     The meanings associated with the possible values of the
     normalization-form parameter are as follows:

     o NFC specifies the serialized result should be in Unicode
       Normalization Form C.

     o NFD specifies the serialized result should be in Unicode
       Normalization Form D.

     o NFKC specifies the serialized result should be in Unicode
       Normalization Form KC.

     o NFKD specifies the serialized result should be in Unicode
       Normalization Form KD.

     o fully-normalized specifies the serialized result should
       be in fully normalized form.

     o none specifies that no Unicode normalization should be
       applied.

     o An implementation-defined value has an implementation-
       defined effect.

   o Creation of CDATA sections, as determined by the
     cdata-section-elements parameter. Note that this is also
     affected by the encoding parameter, in that characters not
     present in the selected encoding cannot be represented in
     a CDATA section.

   o Escaping according to XML or HTML rules of special
     characters and of characters that cannot be represented in
     the selected encoding.  For example replacing < by &lt;.
>>

     The Unicode Normalization phase becomes the third step of character 
expansion.  Character mapping becomes the second step, with the 
clarification that it does not affect elements to which 
cdata-section-elements applies.  This was done to make it clear that any 
characters affected by character mapping are not affected by Unicode 
Normalization.  The lead-in to the bulleted list will be modified so that 
CDATA section creation and escaping still apply to characters affected by 
Unicode Normalization - this is a consequence of trying to fold the two 
together.  Finally, the last bullet will be modified to make it clear that 
not only special characters, but characters that can't be represented in 
the selected encoding are affected by that final step.
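
     The ordering described above can be sketched in Python.  This is a 
simplified illustration, not the specified algorithm: the function names 
are hypothetical, URI escaping and CDATA section creation are left out, 
only the NFC, NFD, NFKC and NFKD forms are handled, and normalization is 
applied separately to each run of characters that character mapping left 
untouched.

```python
import unicodedata


def escape_char(ch, encoding):
    # Final step: escape special characters, and create character
    # references for characters the selected encoding cannot represent.
    if ch == "<":
        return "&lt;"
    if ch == ">":
        return "&gt;"
    if ch == "&":
        return "&amp;"
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return "&#x%X;" % ord(ch)
    return ch


def expand_text(text, encoding="us-ascii", character_map=None,
                normalization_form="none"):
    character_map = character_map or {}
    out = []
    run = []  # characters not handled by character mapping

    def flush():
        chunk = "".join(run)
        run.clear()
        if normalization_form != "none":
            chunk = unicodedata.normalize(normalization_form, chunk)
        out.extend(escape_char(ch, encoding) for ch in chunk)

    for ch in text:
        if ch in character_map:
            flush()
            # A mapped character is unaffected by the later steps.
            out.append(character_map[ch])
        else:
            run.append(ch)
    flush()
    return "".join(out)
```

     With these assumptions, a mapped character is emitted exactly as the 
character map specifies, while an unmapped "e" followed by a combining 
acute accent is first composed under NFC and then written as a character 
reference when the encoding is US-ASCII.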

     As a representative of Oracle was present when this decision was 
made, I will assume the response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1040.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1042-01: ORA-SE-298-E: Please clarify that all parameters are optional
[substantive, acknowledged] 2004-03-28



SECTION 3: Serialization parameters

It says "standalone specifies whether the processor is to emit
a standalone document declaration and the value of the declaration;
the value must be yes or no."  The first sentence implies that
standalone is a parameter with three possible values:
don't emit a standalone declaration; do emit and its value is yes;
do emit and its value is no.  Having only two values for the
parameter is not adequate for its task.  

In section 4.5 "XML output method: the omit-xml-declaration
parameter", the last paragraph includes the phrase
"...if the standalone parameter is present...".  This indicates
that your model is that parameters are optional.  With that 
model you can indeed get by with two values for the parameter,
because the third state would be indicated by the absence of
the parameter.  If this is your intent, you should preface the
list of parameters by saying that they are optional.  Also,
most of the parameter descriptions include a sentence of the 
form "If this parameter is not specified..." but a few do not.
It would be good to supply this sentence for all parameters.

- Steve B.



    



Steve,

     In [1], you submitted the following comment on the Serialization last 
call draft:

Steve Buxton wrote on 2004-02-18 05:22:15 PM:
> SECTION 3: Serialization parameters
> 
> It says "standalone specifies whether the processor is to emit
> a standalone document declaration and the value of the declaration;
> the value must be yes or no."  The first sentence implies that
> standalone is a parameter with three possible values:
> don't emit a standalone declaration; do emit and its value is yes;
> do emit and its value is no.  Having only two values for the
> parameter is not adequate for its task. 
> 
> In section 4.5 "XML output method: the omit-xml-declaration
> parameter", the last paragraph includes the phrase
> "...if the standalone parameter is present...".  This indicates
> that your model is that parameters are optional.  With that 
> model you can indeed get by with two values for the parameter,
> because the third state would be indicated by the absence of
> the parameter.  If this is your intent, you should preface the
> list of parameters by saying that they are optional.  Also,
> most of the parameter descriptions include a sentence of the 
> form "If this parameter is not specified..." but a few do not.
> It would be good to supply this sentence for all parameters.

     Thank you for submitting your comment.

     The XSL and XQuery Working groups discussed your comment and several 
related comments.  In most cases, the serialization draft treated a 
parameter whose value was not specified by the client specification as if 
it had been specified with a particular default value that was defined by 
either the serialization draft or by the implementation.  Such parameters, 
though optional from the point of view of the client specification, always 
had some value.  In a few instances - as with the standalone parameter - 
the absence of a parameter was treated as if it was a distinct setting for 
the parameter.

     The working groups decided to place the onus on the client 
specifications (XSLT and XQuery for now) to specify default values for 
parameters, if appropriate, rather than defining any in the Serialization 
specification.  With this change, only the doctype-public and 
doctype-system parameters could be absent.

     The following table, which will replace the descriptions of the 
parameter values that currently appear in Section 3, should clarify this. 
Corresponding changes to the uses of the parameter values in subsequent 
sections will similarly be made.



<<
+----------------------+------------------------------------------------+
|PARAMETER NAME        |PERMITTED VALUES FOR PARAMETER                  |
+----------------------+------------------------------------------------+
|cdata-section-elements|A list of expanded-QNames, possibly empty.      |
+----------------------+------------------------------------------------+
|doctype-public        |A string of Unicode characters.  This parameter |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|doctype-system        |A string of Unicode characters.  This parameter |
|                      |is optional.                                    |
+----------------------+------------------------------------------------+
|encoding              |A string of Unicode characters in the range #x21|
|                      |to #x7E (that is, printable ASCII characters);  |
|                      |the value should be a charset registered with   |
|                      |the Internet Assigned Numbers Authority [IANA], |
|                      |[RFC2278] or begin with the characters x- or X-.|
+----------------------+------------------------------------------------+
|escape-uri-attributes |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|include-content-type  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|indent                |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|media-type            |A string of Unicode characters specifying the   |
|                      |media type (MIME content type) [RFC2376]; the   |
|                      |charset parameter of the media type must not be |
|                      |specified explicitly.                           |
+----------------------+------------------------------------------------+
|method                |An expanded-QName with a null namespace URI, and|
|                      |the local part of the name equal to xml, xhtml, |
|                      |html or text, or having a non-null namespace    |
|                      |URI.  If the namespace URI is non-null, the     |
|                      |parameter specifies an implementation-defined   |
|                      |output method.                                  |
+----------------------+------------------------------------------------+
|normalize-unicode     |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|omit-xml-declaration  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|standalone            |One of the enumerated values yes, no or none    |
+----------------------+------------------------------------------------+
|undeclare-namespaces  |One of the enumerated values yes or no          |
+----------------------+------------------------------------------------+
|use-character-maps    |A list of pairs, possibly empty, with each pair |
|                      |consisting of a single Unicode character and a  |
|                      |string of Unicode characters.                   |
+----------------------+------------------------------------------------+
|version               |A string of Unicode characters.                 |
+----------------------+------------------------------------------------+
>>
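
As an illustration only (not part of the original message), the table can be read against a client-side declaration. The sketch below assumes an XSLT 2.0 host and uses xsl:output attributes that correspond to some of the parameters above. The doctype-public and doctype-system parameters are left absent. How a host language maps an unspecified standalone attribute onto the value none is a matter for that language's specification and is not shown here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Each attribute sets one serialization parameter explicitly.
       doctype-public and doctype-system are deliberately not specified,
       since the table marks them as the only optional parameters. -->
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="no" omit-xml-declaration="no" standalone="yes"/>

  <!-- Copy the input unchanged so that only serialization is exercised. -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

With these settings a conforming serializer would be expected to emit an XML declaration that carries a standalone="yes" pseudo-attribute and no document type declaration.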



     May I ask you to confirm that this response to your comment is
acceptable?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1042.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

qt-2004Feb1195-01: [Serialization] MS-SER-LC1-001
[substantive, acknowledged] 2004-07-13
[Serialization] MS-SER-LC1-001, Michael Rys (2004-02-26)



Section 1 Introduction
Editorial

We think it may be better, if the serialization document is kept
separate, so other serialization formats can be added without impacting
the general datamodel document.

Draft minutes Query/XSLT Cambridge days 1-4, massimo@w3.org (2004-06-24)
Re: [Serialization] MS-SER-LC1-001, Henry Zongaro (2004-07-13)

Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Section 1 Introduction 
Editorial 

We think it may be better, if the serialization document is kept
separate, so other serialization formats can be added without impacting
the general datamodel document.
>>

     Thank you for this comment.

     The XSL and XML Query Working Groups discussed your comment, and 
agreed that it would be best to specify the serialization process in a 
separate document.  The editorial note will be deleted.

     I am unsure whether any representative of Microsoft, apart from the 
chair of the XML Query WG, was present when this decision was made.  May I 
ask you to confirm that this response is acceptable to you?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1195.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1197-01: [Serialization] MS-SER-LC1-002
[substantive, acknowledged] 2004-09-08
[Serialization] MS-SER-LC1-002, Michael Rys (2004-02-26)



Section 2
Editorial/Technical

Please rewrite "Replace any string in the sequence with a text node
whose string value is equal to the string." as "Replace any string with
length greater than 0 in the sequence with a text node whose string
value is equal to the string. Remove any zero-length string from the
sequence."


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Section 2 
Editorial/Technical 

Please rewrite "Replace any string in the sequence with a text node
whose string value is equal to the string." as "Replace any string with
length greater than 0 in the sequence with a text node whose string
value is equal to the string. Remove any zero-length string from the
sequence."
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and noted that the following text was added to 
Section 7.7.1 of the July 23 draft of the Data Model specification [2]:

<<
When a Document or Element Node is constructed, Text Nodes that would be 
adjacent are combined into a single Text Node. If the resulting Text Node 
is empty, it is never placed among the children of its parent, it is 
simply discarded.
>>

     The working groups decided that this text in Data Model makes the 
change that you recommended no longer necessary.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1197.html
[2] 
http://www.w3.org/TR/2004/WD-xpath-datamodel-20040723/#TextNodeOverview
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1198-01: [Serialization] MS-SER-LC1-005
[substantive, acknowledged] 2004-09-08
[Serialization] MS-SER-LC1-005, Michael Rys (2004-02-26)



Section 4
Editorial/Technical

Please rewrite "and this may result in type annotations that are either
more or less precise than those in the original result tree." as "and
this may result in type annotations that are different from those in the
original result tree."


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Section 4 
Editorial/Technical 

Please rewrite "and this may result in type annotations that are either
more or less precise than those in the original result tree." as "and
this may result in type annotations that are different from those in the
original result tree."
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and agreed to make the change that you suggested.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1198.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1204-01: [Serialization] MS-SER-LC1-009
[substantive, acknowledged] 2004-09-08
[Serialization] MS-SER-LC1-009, Michael Rys (2004-02-26)



General
Editorial

Section 2 introduces the term of a normalized sequence. Use this instead
of "data model" in the rules that operate on the normalized sequence.


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization:

<<
Section 2 introduces the term of a normalized sequence. Use this instead
of "data model" in the rules that operate on the normalized sequence.
>>

     Thank you for this comment.  The XSL and XML Query Working Groups 
discussed your comment, and decided to change "instance of the data model" 
throughout Section 2 to "sequence", keeping the distinction between the 
sequence that is input to serialization and the normalized sequence clear 
throughout.

     As you were present when this decision was made, I will assume the 
response is acceptable to you.

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1204.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1205-01: [Serialization] MS-SER-LC1-012
[substantive, acknowledged] 2004-03-28
[Serialization] MS-SER-LC1-012, Michael Rys (2004-02-26)



General
Technical

We see a use for an XML-based markup vocabulary or XQuery
expression-based serialization to describe a full data model instance
that provides for all items using a fixed schema that exposes the data
model's node structure and property information. However, we believe
that such a modus should be done in a future version and not delay the
current publication cycle.

Fw: [Serialization] MS-SER-LC1-012, Henry Zongaro (2004-03-28)



Michael,

     In [1], you submitted the following comment on the serialization last
call:

> General
> Technical
>
> We see a use for an XML-based markup vocabulary or XQuery
> expression-based serialization to describe a full data model instance
> that provides for all items using a fixed schema that exposes the data
> model's node structure and property information. However, we believe
> that such a modus should be done in a future version and not delay the
> current publication cycle.

     Thank you for submitting this comment.

     The XSL and XQuery working groups considered your comment and related
comments.  There was general agreement that there is some need for a
mechanism for serializing arbitrary sequences that preserves most or all
of the properties of the items in an arbitrary sequence that is being
serialized.

     The working groups decided that precisely defining all of the
requirements for such a mechanism at this stage would be difficult, and
would likely lead to a solution that would not satisfy real user
requirements.  Therefore, the working groups decided to consider such a
feature for a future revision of the recommendations, and close this
comment without any changes to the specifications.

     As this seems to be the outcome you proposed, I trust this resolution
is acceptable to you.  May I ask you to confirm?

Thanks,

Henry
[1]
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1205.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com

RE: [Serialization] MS-SER-LC1-012, Michael Rys (2004-03-28)



Confirmed.

Thanks
Michael

qt-2004May0006-01: [Serial] additional last call comment about xml:lang
[substantive, announced] 2004-09-21

Dear XML Query WG and XSL WG,

This is a last call comment on your Serialization document.
We are sorry that this last call comment is late, but it is
very important.

We have earlier sent comments on the Data Model and on XSLT
where we have urged better support for the inheritance of
xml:lang (or for inherited attributes in general). Without
any such support, it is extremely tedious to write a
transformation or query that adequately copies xml:lang
from the input to the output.

In internal discussion, Liam Quin suggested that
it might be more appropriate to submit our comment against
Serialization. The reason for this is that XPath/XSLT offers
reasonable support for extracting xml:lang information from
source documents into the data model. However, when serialized,
this leads in most cases to a completely unnecessary and
undesirable multiplication of xml:lang attributes on
virtually every element. Adding some support for reducing
unnecessary xml:lang attributes from the output on
serialization would be highly desirable. As I think we
have written previously, better support for xml:lang
(and maybe inherited attributes in general) was something
that was left over as 'future work' from XSLT 1.0. Your
current work is the best chance to fix this problem.

Regards,     Martin.

> We have earlier sent comments on the Data Model and on XSLT
> where we have urged better support for the inheritance of
> xml:lang (or for inherited attributes in general). Without
> any such support, it is extremely tedious to write a
> transformation or query that adequately copies xml:lang
> from the input to the output.

I have been monitoring questions and answers on the xsl-list (at one time
there were 100 a day) for five years now, and I have not once seen a
complaint about this from a user. It might be difficult in theory, but I
don't think it is a problem in practice.

Michael Kay

Hello Michael,

At 17:52 04/05/06 +0100, Michael Kay wrote:

> > We have earlier sent comments on the Data Model and on XSLT
> > where we have urged better support for the inheritance of
> > xml:lang (or for inherited attributes in general). Without
> > any such support, it is extremely tedious to write a
> > transformation or query that adequately copies xml:lang
> > from the input to the output.
>
>I have been monitoring questions and answers on the xsl-list (at one time
>there were 100 a day) for five years now, and I have not once seen a
>complaint about this from a user. It might be difficult in theory, but I
>don't think it is a problem in practice.

If you don't think it's a problem in practice, what about taking
the xml-to-xhtml XSLT associated with the xmlspec DTD, and change
it so that multilingual input is output with the correct xml:lang
attributes (but without having xml:lang on every element if not
necessary).

Regards,    Martin.

> If you don't think it's a problem in practice, what about taking
> the xml-to-xhtml XSLT associated with the xmlspec DTD, and change
> it so that multilingual input is output with the correct xml:lang
> attributes (but without having xml:lang on every element if not
> necessary).
> 

There are lots of things in XSLT that aren't easy, such as processing CALS
table models. But on the scale of problems this one is by no means
difficult: for example it can be done by running a three-phase
transformation in which a pre-processing phase adds redundant xml:lang
attributes:

<xsl:template match="*">
  <xsl:copy>
    <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

and a post-processing phase removes redundant xml:lang attributes:

<xsl:template match="*">
  <xsl:copy>
    <xsl:copy-of select="@* except @xml:lang[. =
ancestor::*/@xml:lang[last()]]"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

When I said it doesn't seem to be a problem in practice, I meant that I
don't see evidence of lots of users trying to do this and complaining that
it's difficult. I see a lot more complaints about the difficulty of handling
CALS table models. For a problem that doesn't arise often in practice, a
solution in 12 lines of code seems good enough.
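
[Editorial illustration, not part of the original message.] In the two fragments above, the predicate [last()] binds to the final step @xml:lang rather than to the whole path, so ancestor::*/@xml:lang[last()] selects the xml:lang attribute of every ancestor that has one. In the first fragment this is likely harmless under the XSLT 2.0 rule that a later attribute of the same name replaces an earlier one. In the second, the general comparison succeeds if any ancestor carries the same value, so in a structure such as a (en) containing b (fr) containing c (en), the attribute on c would be dropped even though c would then inherit fr. A minimal sketch of the post-processing rule with the path parenthesized, assuming the intent is to compare against the nearest ancestor only:

```xml
<!-- Sketch only: (ancestor::*/@xml:lang)[last()] is the single xml:lang
     attribute on the nearest ancestor that has one, because the
     parenthesized path is evaluated first and returned in document order. -->
<xsl:template match="*">
  <xsl:copy>
    <xsl:copy-of
        select="@* except @xml:lang[. = (ancestor::*/@xml:lang)[last()]]"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
```

The comparison here is a plain string comparison, so values that differ only in case (en versus EN) would be treated as different.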

Inherited attributes are a bit of an oddity. Formally, there is no such
thing as an inherited attribute, it's only a design convention, and there
are lots of different variations on it - xml:space, xml:base, and xml:lang
work quite differently from each other. Without a formalisation of inherited
attributes in the data model and in the XML Schema type system, it's not
easy to come up with language constructs that would be generic enough to be
useful, and an ad-hoc solution for one particular attribute would be really
bad design, especially in the absence of any evidence of a pressing user
problem.

Michael Kay

Hello Michael,

Sorry for the delay of my answer.

At 10:17 04/05/07 +0100, Michael Kay wrote:

> > If you don't think it's a problem in practice, what about taking
> > the xml-to-xhtml XSLT associated with the xmlspec DTD, and change
> > it so that multilingual input is output with the correct xml:lang
> > attributes (but without having xml:lang on every element if not
> > necessary).
>
>There are lots of things in XSLT that aren't easy, such as processing CALS
>table models. But on the scale of problems this one is by no means
>difficult: for example it can be done by running a three-phase
>transformation in which a pre-processing phase adds redundant xml:lang
>attributes:
>
><xsl:template match="*">
>   <xsl:copy>
>     <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/>
>     <xsl:apply-templates/>
>   </xsl:copy>
></xsl:template>

It turns out that this first pass can be integrated into the main pass.


>and a post-processing phase removes redundant xml:lang attributes:
>
><xsl:template match="*">
>   <xsl:copy>
>     <xsl:copy-of select="@* except @xml:lang[. =
>ancestor::*/@xml:lang[last()]]"/>
>     <xsl:apply-templates/>
>   </xsl:copy>
></xsl:template>
>
>When I said it doesn't seem to be a problem in practice, I meant that I
>don't see evidence of lots of users trying to do this and complaining that
>it's difficult. I see a lot more complaints about the difficulty of handling
>CALS table models. For a problem that doesn't arise often in practice, a
>solution in 12 lines of code seems good enough.

Well, if it were only this code. But the overhead of going from a
one-pass solution to a two-pass solution is quite heavy in many
respects. It is an important barrier.


>Inherited attributes are a bit of an oddity. Formally, there is no such
>thing as an inherited attribute, it's only a design convention, and there
>are lots of different variations on it - xml:space, xml:base, and xml:lang
>work quite differently from each other. Without a formalisation of inherited
>attributes in the data model and in the XML Schema type system, it's not
>easy to come up with language constructs that would be generic enough to be
>useful, and an ad-hoc solution for one particular attribute would be really
>bad design,

I agree. But I'm sure there are ways to do this that are not ad-hoc
for a single attribute, that can adapt to the different inherited
attributes, and that don't necessarily need to be in the data model.

I seem to remember that James Clark at one point said that having
a feature to recursively invoke XSLT (in this case on its output)
would easily solve this problem.

Regards,    Martin.


>especially in the absence of any evidence of a pressing user
>problem.
>
>Michael Kay

> I seem to remember that James Clark at one point said that having
> a feature to recursively invoke XSLT (in this case on its output)
> would easily solve this problem.

You can now indeed invoke one XSLT template to process the output of
another. This is the multi-pass solution that I showed you.

Michael Kay

At 10:07 04/05/25 +0100, Michael Kay wrote:

> > I seem to remember that James Clark at one point said that having
> > a feature to recursively invoke XSLT (in this case on its output)
> > would easily solve this problem.
>
>You can now indeed invoke one XSLT template to process the output of
>another. This is the multi-pass solution that I showed you.

Hello Michael,

Are you saying that this can indeed be done with a single invocation
of an XSLT implementation, with a single stylesheet? Your use of
"pre-processing phase" and so on in your previous mail wasn't
totally clear on this, at least not for me.

If this is true, it would be very nice, and I would assume that our
WG would then be very happy with the result. For our reference, can
you please either point to the section in the spec where this
multi-pass thing is described, or can you resend the code in
your earlier mail with some framework code added that shows how
to define the various passes?

Regards,    Martin.

> At 10:07 04/05/25 +0100, Michael Kay wrote:
> 
> > > I seem to remember that James Clark at one point said that having
> > > a feature to recursively invoke XSLT (in this case on its output)
> > > would easily solve this problem.
> >
> >You can now indeed invoke one XSLT template to process the output of
> >another. This is the multi-pass solution that I showed you.
> 
> Hello Michael,
> 
> Are you saying that this can indeed be done with a single invocation
> of an XSLT implementation, with a single stylesheet? Your use of
> "pre-processing phase" and so on in your previous mail wasn't
> totally clear on this, at least not for me.

Yes, it can all be done within a single transformation in a single
stylesheet.
> 
> If this is true, it would be very nice, and I would assume that our
> WG would then be very happy with the result. For our reference, can
> you please either point to the section in the spec where this
> multi-pass thing is described, or can you resend the code in
> your earlier mail with some framework code added that shows how
> to define the various passes?

There's a simple example showing how temporary trees can be used to support
multi-phase transformations in section 9.4 of the spec:

http://www.w3.org/TR/xslt20/#temporary-trees

I'm afraid I'm too busy today to do a worked example for you.

Michael Kay

At 08:47 04/05/26 +0100, Michael Kay wrote:

> > Hello Michael,
> >
> > Are you saying that this can indeed be done with a single invocation
> > of an XSLT implementation, with a single stylesheet? Your use of
> > "pre-processing phase" and so on in your previous mail wasn't
> > totally clear on this, at least not for me.
>
>Yes, it can all be done within a single transformation in a single
>stylesheet.

This sounds great!

> > If this is true, it would be very nice, and I would assume that our
> > WG would then be very happy with the result. For our reference, can
> > you please either point to the section in the spec where this
> > multi-pass thing is described, or can you resend the code in
> > your earlier mail with some framework code added that shows how
> > to define the various passes?
>
>There's a simple example showing how temporary trees can be used to support
>multi-phase transformations in section 9.4 of the spec:
>
>http://www.w3.org/TR/xslt20/#temporary-trees
>
>I'm afraid I'm too busy today to do a worked example for you.

Okay, I took this example, and the code fragments that you sent
earlier, and put something together below. I'd appreciate if you
could check it. I'm not sure I got everything right, in particular
all the modes.

How to create a stylesheet that cleanly copies xml:lang:
[assuming for simplicity that all xml:lang information
is coming from the source, not from the stylesheet, and
that only whole elements are transferred, not independent
textual pieces]
[I'm using a three-pass solution; this could be done in
many cases as a two-pass solution]

- Start with your stylesheet.
- Make sure that on all elements, xml:lang is copied.
- Assumes that the main mode for the original stylesheet
   is the default mode.


<xsl:stylesheet
   version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="*" mode="expandXmlLang">
   <xsl:copy>
     <xsl:copy-of select="@* | ancestor-or-self::*/@xml:lang[last()]"/>
     <xsl:apply-templates mode="expandXmlLang"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="*" mode="cleanXmlLang">
   <xsl:copy>
     <xsl:copy-of select="@* except @xml:lang[. =
ancestor::*/@xml:lang[last()]]"/>
     <xsl:apply-templates mode="cleanXmlLang"/>
   </xsl:copy>
</xsl:template>

<!-- rest of your stylesheet here or somewhere -->

<xsl:variable name="xmlLangExpanded">
   <xsl:apply-templates select="/" mode="expandXmlLang"/>
</xsl:variable>

<xsl:variable name="processedMain">
   <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/>
</xsl:variable>

<xsl:template match="/">
   <xsl:apply-templates select="$processedMain" mode="cleanXmlLang"/>
</xsl:template>
</xsl:stylesheet>


Regards,     Martin.
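
[Editorial illustration, not part of the original thread.] The three passes can also be chained inside a single entry template. The sketch below is one possible arrangement under stated assumptions: the mode name main and its identity rule are hypothetical stand-ins for the real transformation, and the main pass is kept out of the default mode because applying templates to a temporary tree in the default mode would, on one reading, re-enter the match="/" rule and make the global variables refer to themselves. The paths are parenthesized so that [last()] picks the nearest xml:lang.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pass 1: make the inherited xml:lang explicit on every element. -->
  <xsl:template match="*" mode="expandXmlLang">
    <xsl:copy>
      <xsl:copy-of select="(ancestor-or-self::*/@xml:lang)[last()]"/>
      <xsl:copy-of select="@* except @xml:lang"/>
      <xsl:apply-templates mode="expandXmlLang"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2 (hypothetical placeholder): an identity rule standing in for
       the real transformation, which must copy xml:lang to its output. -->
  <xsl:template match="*" mode="main">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="main"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 3: drop xml:lang where it repeats the nearest ancestor's value. -->
  <xsl:template match="*" mode="cleanXmlLang">
    <xsl:copy>
      <xsl:copy-of
          select="@* except @xml:lang[. = (ancestor::*/@xml:lang)[last()]]"/>
      <xsl:apply-templates mode="cleanXmlLang"/>
    </xsl:copy>
  </xsl:template>

  <!-- Entry point: each pass runs over the temporary tree built by the
       previous one. Text nodes are copied by the built-in rules; comments
       and processing instructions are dropped by them. -->
  <xsl:template match="/">
    <xsl:variable name="xmlLangExpanded">
      <xsl:apply-templates select="." mode="expandXmlLang"/>
    </xsl:variable>
    <xsl:variable name="processedMain">
      <xsl:apply-templates select="$xmlLangExpanded" mode="main"/>
    </xsl:variable>
    <xsl:apply-templates select="$processedMain" mode="cleanXmlLang"/>
  </xsl:template>

</xsl:stylesheet>
```

With the identity placeholder in pass 2, an input such as <doc xml:lang="en"><p>Hello</p><p xml:lang="fr"><em>Bonjour</em></p></doc> should come back with xml:lang only on doc and on the second p.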

Yes, this code looks correct.

Michael Kay 

> -----Original Message-----
> From: public-qt-comments-request@w3.org 
> [mailto:public-qt-comments-request@w3.org] On Behalf Of Martin Duerst
> Sent: 27 May 2004 07:26
> To: Michael Kay; public-qt-comments@w3.org
> Cc: w3c-i18n-ig@w3.org; 'Liam Quin'
> Subject: RE: [Serial] additional last call comment about xml:lang
> 
> 
> At 08:47 04/05/26 +0100, Michael Kay wrote:
> 
> > > Hello Michael,
> > >
> > > Are you saying that this can indeed be done with a single 
> invocation
> > > of an XSLT implementation, with a single stylesheet? Your use of
> > > "pre-processing phase" and so on in your previous mail wasn't
> > > totally clear on this, at least not for me.
> >
> >Yes, it can all be done within a single transformation in a single
> >stylesheet.
> 
> This sounds great!
> 
> > > If this is true, it would be very nice, and I would 
> assume that our
> > > WG would then be very happy with the result. For our 
> reference, can
> > > you please either point to the section in the spec where this
> > > multi-pass thing is described, or can you resend the code in
> > > your earlier mail with some framework code added that shows how
> > > to define the various passes?
> >
> >There's a simple example showing how temporary trees can be 
> used to support
> >multi-phase transformations in section 9.4 of the spec:
> >
> >http://www.w3.org/TR/xslt20/#temporary-trees
> >
> >I'm afraid I'm too busy today to do a worked example for you.
> 
> Okay, I took this example, and the code fragments that you sent
> earlier, and put something together below. I'd appreciate if you
> could check it. I'm not sure I got everything right, in particular
> all the modes.
> 
> How to create a stylesheet that cleanly copies xml:lang:
> [assuming for simplicity that all xml:lang information
> is coming from the source, not from the stylesheet, and
> that only whole elements are transferred, not independent
> textual pieces]
> [I'm using a three-pass solution; this could be done in
> many cases as a two-pass solution]
> 
> - Start with your stylesheet.
> - Make sure that on all elements, xml:lang is copied.
> - Assumes that the main mode for the original stylesheet
>    is the default mode.
> 
> 
> <xsl:stylesheet
>    version="2.0"
>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> 
> <xsl:template match="*" mode="expandXmlLang">
>    <xsl:copy>
>      <xsl:copy-of select="@* | 
> ancestor-or-self::*/@xml:lang[last()]"/>
>      <xsl:apply-templates mode="expandXmlLang"/>
>    </xsl:copy>
> </xsl:template>
> 
> <xsl:template match="*" mode="cleanXmlLang">
>    <xsl:copy>
>      <xsl:copy-of select="@* except @xml:lang[. =
> ancestor::*/@xml:lang[last()]]"/>
>      <xsl:apply-templates mode="cleanXmlLang"/>
>    </xsl:copy>
> </xsl:template>
> 
> <!-- rest of your stylesheet here or somewhere -->
> 
> <xsl:variable name="xmlLangExpanded">
>    <xsl:apply-templates select="/" mode="expandXmlLang"/>
> </xsl:variable>
> 
> <xsl:variable name="processedMain">
>    <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/>
> </xsl:variable>
> 
> <xsl:template match="/">
>    <xsl:apply-templates select="$processedMain" mode="cleanXmlLang"/>
> </xsl:template>
> </xsl:stylesheet>
> 
> 
> Regards,     Martin.
> 

Hello Michael,

I'm sorry for the delay of this message.

I would like to thank you for your help. The I18N WG is satisfied
with the solution to the problem of carrying xml:lang information
from a source document to an output document using XSLT. However,
we think that it will be difficult for the reader to understand
this, and we therefore request that the code below, or something
similar, be added to the specification as an example.

With kind regards,     Martin.

At 11:56 04/05/27 +0100, Michael Kay wrote:

>Yes, this code looks correct.
>
>Michael Kay
>
> > -----Original Message-----
> > From: public-qt-comments-request@w3.org
> > [mailto:public-qt-comments-request@w3.org] On Behalf Of Martin Duerst
> > Sent: 27 May 2004 07:26
> > To: Michael Kay; public-qt-comments@w3.org
> > Cc: w3c-i18n-ig@w3.org; 'Liam Quin'
> > Subject: RE: [Serial] additional last call comment about xml:lang
> >
> >
> > At 08:47 04/05/26 +0100, Michael Kay wrote:
> >
> > > > Hello Michael,
> > > >
> > > > Are you saying that this can indeed be done with a single
> > invocation
> > > > of an XSLT implementation, with a single stylesheet? Your use of
> > > > "pre-processing phase" and so on in your previous mail wasn't
> > > > totally clear on this, at least not for me.
> > >
> > >Yes, it can all be done within a single transformation in a single
> > >stylesheet.
> >
> > This sounds great!
> >
> > > > If this is true, it would be very nice, and I would
> > assume that our
> > > > WG would then be very happy with the result. For our
> > reference, can
> > > > you please either point to the section in the spec where this
> > > > multi-pass thing is described, or can you resend the code in
> > > > your earlier mail with some framework code added that shows how
> > > > to define the various passes?
> > >
> > >There's a simple example showing how temporary trees can be
> > used to support
> > >multi-phase transformations in section 9.4 of the spec:
> > >
> > >http://www.w3.org/TR/xslt20/#temporary-trees
> > >
> > >I'm afraid I'm too busy today to do a worked example for you.
> >
> > Okay, I took this example, and the code fragments that you sent
> > earlier, and put something together below. I'd appreciate if you
> > could check it. I'm not sure I got everything right, in particular
> > all the modes.
> >
> > How to create a stylesheet that cleanly copies xml:lang:
> > [assuming for simplicity that all xml:lang information
> > is coming from the source, not from the stylesheet, and
> > that only whole elements are transferred, not independent
> > textual pieces]
> > [I'm using a three-pass solution; this could be done in
> > many cases as a two-pass solution]
> >
> > - Start with your stylesheet.
> > - Make sure that on all elements, xml:lang is copied.
> > - Assumes that the main mode for the original stylesheet
> >    is the default mode.
> >
> >
> > <xsl:stylesheet
> >    version="2.0"
> >    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >
> > <xsl:template match="*" mode="expandXmlLang">
> >    <xsl:copy>
> >      <xsl:copy-of select="@* |
> > ancestor-or-self::*/@xml:lang[last()]"/>
> >      <xsl:apply-templates mode="expandXmlLang"/>
> >    </xsl:copy>
> > </xsl:template>
> >
> > <xsl:template match="*" mode="cleanXmlLang">
> >    <xsl:copy>
> >      <xsl:copy-of select="@* except @xml:lang[. =
> > ancestor::*/@xml:lang[last()]]"/>
> >      <xsl:apply-templates mode="cleanXmlLang"/>
> >    </xsl:copy>
> > </xsl:template>
> >
> > <!-- rest of your stylesheet here or somewhere -->
> >
> > <xsl:variable name="xmlLangExpanded">
> >    <xsl:apply-templates select="/" mode="expandXmlLang"/>
> > </xsl:variable>
> >
> > <xsl:variable name="processedMain">
> >    <xsl:apply-templates select="$xmlLangExpanded" mode="#default"/>
> > </xsl:variable>
> >
> > <xsl:template match="/">
> >    <xsl:apply-templates select="$processedMain" mode="cleanXmlLang"/>
> > </xsl:template>
> > </xsl:stylesheet>
> >
> >
> > Regards,     Martin.
> >
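
For readers who want to experiment with the final "clean" pass outside
an XSLT 2.0 processor, the idea can be sketched in Python. This is only
an illustration, not the stylesheet from the thread: the function name
strip_redundant_lang is hypothetical, and values are compared as exact
strings, which is an assumption (language tags are case-insensitive).

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def strip_redundant_lang(elem, inherited=None):
    """Remove xml:lang where it repeats the value already in scope.

    Hypothetical helper: 'inherited' is the xml:lang value in effect
    on the parent element, or None at the root.
    """
    own = elem.get(XML_LANG)
    if own is not None and own == inherited:
        del elem.attrib[XML_LANG]
    in_scope = own if own is not None else inherited
    for child in elem:
        strip_redundant_lang(child, in_scope)
    return elem


doc = ET.fromstring(
    '<doc xml:lang="en"><p xml:lang="en">a</p><p xml:lang="fr">b</p></doc>'
)
strip_redundant_lang(doc)
langs = [p.get(XML_LANG) for p in doc]
```

The first paragraph loses its attribute because it only repeats the
value inherited from the root; the second keeps its own value.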

Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group:

<<
This is a last call comment on your Serialization document.
We are sorry that this last call comment is late, but it is
very important.

We have earlier sent comments on the Data Model and on XSLT
where we have urged better support for the inheritance of
xml:lang (or for inherited attributes in general). Without
any such support, it is extremely tedious to write a
transformation or query that adequately copies xml:lang
from the input to the output.

In internal discussion, Liam Quin suggested that
it might be more appropriate to submit our comment against
Serialization. The reason for this is that XPath/XSLT offers
reasonable support for extracting xml:lang information from
source documents into the data model. However, when serialized,
this leads in most cases to a completely unnecessary and
undesirable multiplication of xml:lang attributes on
virtually every element. Adding some support for reducing
unnecessary xml:lang attributes from the output on
serialization would be highly desirable. As I think we
have written previously, better support for xml:lang
(and maybe inherited attributes in general) was something
that was left over as 'future work' from XSLT 1.0. Your
current work is the best chance to fix this problem.
>>

     There was much subsequent discussion of the topic between Michael Kay 
and yourself.[2-9]  In [10], you indicated that the I18N Working group was 
satisfied with the mechanisms that are available for filtering redundant 
xml:lang attributes.  The XSL and XQuery Working Groups discussed the 
issue and decided that, in light of the discussion, no change to 
Serialization is required.

     The XSL WG will consider adding an example to the XSLT 2.0 
specification.

     May I ask you to confirm that this response is acceptable to the I18N 
Working Group?

Thanks,

Henry [On behalf of the XSL and XML Query Working Groups]
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0006.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0010.html
[3] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0013.html
[4] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0014.html
[5] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0055.html
[6] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0056.html
[7] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0067.html
[8] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0068.html
[9] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004May/0074.html
[10] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jul/0052.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Sep0022-01: [Serial] XHTML indentation
[substantive, acknowledged] 2004-11-16
[Serial] XHTML indentation, Colin Paul Adams (2004-09-03)

The public draft says:

"The serialization of the instance of the data model follows the same
rules as for the xml output method, with the exceptions noted below." 

Indentation is not mentioned, which implies that the xml, not the
html, indentation rules should be followed.

So I did.

But then output within a <pre> tag is wrecked.

So I've changed my code to follow the html rules for indentation.
Surely this is what is intended?
-- 
Colin Paul Adams
Preston Lancashire
Re: [Serial] XHTML indentation, Henry Zongaro (2004-09-07)

Colin,

Colin Paul Adams wrote on 09/03/2004 08:10:24 AM:
> The public draft says:
> 
> "The serialization of the instance of the data model follows the same
> rules as for the xml output method, with the exceptions noted below." 
> 
> Indentation is not mentioned, which implies that the xml, not the
> html, indentation rules should be followed.
> 
> So I did.
> 
> But then output within a <pre> tag is wrecked.
> 
> So I've changed my code to follow the html rules for indentation.
> Surely this is what is intended?

     Speaking for myself, I believe you are correct - that would be the 
only thing that would make sense in the context of the xhtml output 
method.

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
Face to face meeting, Redwood Shores, C. M. Sperberg-McQueen (2004-11-11)
Re: [Serial] XHTML indentation, Joanne Tong (2004-11-16)
Colin,

     In [1], you submitted the following comment on the 23 July Working 
Draft of Serialization:

<<
The public draft says:

"The serialization of the instance of the data model follows the same
rules as for the xml output method, with the exceptions noted below." 

Indentation is not mentioned, which implies that the xml, not the
html, indentation rules should be followed.

So I did.

But then output within a <pre> tag is wrecked.

So I've changed my code to follow the html rules for indentation.
Surely this is what is intended?
>>


     Thank you for this comment. 

     The XSL Working Group discussed the comment and intends to resolve 
it by adding the following item to the bulleted list in Section 6 of 
Serialization, "XHTML Output Method", modelled upon the corresponding 
description for the HTML output method.

<<
o If the indent parameter has the value yes, the serializer may add or 
remove whitespace as it outputs the result tree, so long as it does not 
change the way that a conforming HTML user agent would render the output.

Note: This rule can be satisfied by observing the following constraints:

  o Whitespace must only be added before or after an element, or
    adjacent to an existing whitespace character.

  o Whitespace must not be added or removed adjacent to an inline
    element. The inline elements are those elements in the XHTML
    namespace in the %inline category of any of the XHTML 1.0 DTD's,
    in the %inline.class category of the XHTML 1.1 DTD, and elements
    in the XHTML namespace with local names ins and del if they are
    used as inline elements (i.e., if they do not contain element
    children).

  o Whitespace must not be added or removed inside a formatted
    element, the formatted elements being those in the XHTML
    namespace with local names pre, script, style, and
    textarea.

  The HTML definition of whitespace is different from the XML
  definition: see section 9.1 of the HTML 4.01 specification.
>>
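
The constraints in the note above can be sketched as a small indenter.
This is a rough illustration under stated assumptions, not the
behaviour of any particular serializer: element names are taken without
a namespace, the INLINE set is a deliberately short stand-in for the
lists in the XHTML DTDs, and the function name indent is hypothetical.

```python
import xml.etree.ElementTree as ET

# Formatted elements named in the proposed text.
FORMATTED = {"pre", "script", "style", "textarea"}
# Illustrative subset only; the real list comes from the XHTML DTDs.
INLINE = {"a", "b", "em", "i", "span", "strong", "code"}


def indent(elem, level=0):
    """Add whitespace only where the constraints above permit it."""
    if elem.tag in FORMATTED:
        return  # never touch whitespace inside a formatted element
    children = list(elem)
    if not children:
        return
    has_text = bool((elem.text or "").strip()) or any(
        (c.tail or "").strip() for c in children
    )
    has_inline = any(c.tag in INLINE for c in children)
    if not has_text and not has_inline:
        # Element-only content with no inline children: safe to indent.
        pad = "\n" + "  " * (level + 1)
        elem.text = pad
        for c in children:
            c.tail = pad
        children[-1].tail = "\n" + "  " * level
    for c in children:
        indent(c, level + 1)


body = ET.fromstring(
    "<body><div><p>x <em>y</em></p><pre><b>k</b> z</pre></div></body>"
)
indent(body)
out = ET.tostring(body, encoding="unicode")
```

In this sketch the content of pre and the mixed content of p are left
exactly as they were, while body and div receive indentation.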

     May I ask you to confirm that this response is acceptable to you?

Thanks,

Joanne Tong

[1]  
http://lists.w3.org/Archives/Public/public-qt-comments/2004Sep/0022.html
Re: [Serial] XHTML indentation, Colin Paul Adams (2004-11-16)

>>>>> "Joanne" == Joanne Tong <joannet@ca.ibm.com> writes:

    Joanne>      May I ask you to confirm that this response is
    Joanne> acceptable to you?

Yes it is. Thank you.
-- 
Colin Paul Adams
Preston Lancashire
qt-2004Nov0025-01: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  Equally, it is entirely under the control of the person or process
  that creates the instance of the data model whether the output
  conforms to XHTML Strict, XHTML Transitional, XHTML Frameset, or
  XHTML Basic.
[...]

Please change the enumeration of document types to something more
general, there is no "XHTML Strict" document type (it would be "XHTML
1.0 Strict") and some document types such as XHTML 1.1 are missing.

qt-2004Nov0025-02: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  Given an empty instance of an XHTML element whose content model is not
  EMPTY (for example, an empty title or paragraph) the serializer MUST
  NOT use the minimized form.
[...]

It is not clear to me how it is determined whether an element has such a
content model, please specify clearly how this is determined and whether
implementations should/may/etc. apply these rules to elements for which
the algorithm to determine the content model defines no result, e.g. the
algorithm is unlikely to define the rules for the "wbr" element which is
a proprietary element with a content model of EMPTY.
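
One way a serializer might make this decision is a fixed table of
element names, sketched below. The table holds the elements declared
EMPTY in the XHTML 1.0 Strict DTD; using a fixed table at all, and the
function name serialize_empty, are assumptions made for illustration
and are not taken from the Serialization draft.

```python
# Elements declared EMPTY in the XHTML 1.0 Strict DTD.
EMPTY_ELEMENTS = {
    "area", "base", "br", "col", "hr",
    "img", "input", "link", "meta", "param",
}


def serialize_empty(name):
    """Serialize an element that has no children (hypothetical helper)."""
    if name in EMPTY_ELEMENTS:
        return f"<{name} />"      # minimized form
    return f"<{name}></{name}>"   # explicit start and end tags
```

With such a table, an element the table does not know about, such as
wbr, falls through to the non-minimized form, which is exactly the case
the comment asks the specification to address.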

qt-2004Nov0025-03: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  The serializer SHOULD output namespace declarations in a way that is
  consistent with the requirements of the XHTML DTD if this is possible.
  The DTD requires the declaration xmlns="http://www.w3.org/1999/xhtml"
  to appear on the html element, and only on the html element. 
[...]

This is only true for XHTML 1.0 document types, XHTML 1.1 for example
allows xmlns="http://www.w3.org/1999/xhtml" on all elements.

qt-2004Nov0025-04: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

"Note:  Where the process used to construct the input instance of the 
data model does not provide complete control over the prefix used for 
an element name in the instance of the data model or control of 
whether the element is in the default namespace (for instance, the 
XSLT namespace fixup process), implementors are encouraged to provide 
means or endeavor to preserve the obvious intent of a user to place 
the html element in the default namespace, wherever possible. For 
example, implementors of XSLT processors are encouraged to place 
the html element that results from a literal result element like the 
following in the default namespace:"


It is not clear to me whether the note following the item above is a
clarification of the requirement or an additional suggestion. If it is
an additional suggestion, please change the document so that
implementations SHOULD implement what the note describes, if it is a
clarification, please make this more obvious in the document.

qt-2004Nov0025-07: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  If the instance of the data model includes a head element that has a
  meta element child, the serializer SHOULD replace any content
  attribute of the meta element, or add such an attribute, with the
  value as described above, rather than output a new meta element.
[...]

Please change this text to limit the behavior to meta elements with
http-equiv="Content-Type" (where the value of the attribute is case-
insensitive).
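
The requested behaviour could look roughly like the following sketch.
It assumes element names without a namespace and a hypothetical helper
named set_content_type; it shows the case-insensitive match on
http-equiv that the comment asks for, not the text of the draft.

```python
import xml.etree.ElementTree as ET


def set_content_type(head, value):
    """Update the Content-Type meta element, or add one if none exists."""
    for meta in head.findall("meta"):
        # Compare the http-equiv value case-insensitively.
        if (meta.get("http-equiv") or "").lower() == "content-type":
            meta.set("content", value)
            return meta
    meta = ET.Element("meta", {"http-equiv": "Content-Type", "content": value})
    head.insert(0, meta)
    return meta


head = ET.fromstring(
    '<head><meta name="author" content="x"/>'
    '<meta http-equiv="content-TYPE" content="old"/></head>'
)
set_content_type(head, "text/html; charset=UTF-8")
```

The unrelated author meta element keeps its content attribute; only the
element whose http-equiv matches is rewritten.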


qt-2004Nov0025-09: [Serial] XHTML Serialization
[substantive, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

Please add a note that this serialization is insufficient to meet all
the requirements to deliver XHTML documents to legacy user agents, for
example, if the instance of the data model includes

  <p xml:lang="en">...</p>

the serializer does not serialize it to

  <p xml:lang="en" lang="en">...</p>

which would however be necessary to make this information available to
user agents that only look at the lang attribute.
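
A stylesheet or post-processing step would have to add the lang
attribute itself. A minimal sketch of such a step, assuming a
hypothetical helper named mirror_xml_lang and that an existing lang
attribute should never be overwritten:

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def mirror_xml_lang(root):
    """Copy xml:lang to lang on every element that lacks a lang attribute."""
    for node in root.iter():
        value = node.get(XML_LANG)
        if value is not None and node.get("lang") is None:
            node.set("lang", value)
    return root


para = mirror_xml_lang(ET.fromstring('<p xml:lang="en">...</p>'))
```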

qt-2004Nov0074-01: [Serial] > in processing instructions
[substantive, raised] 2004-11-20
[Serial] > in processing instructions, Bjoern Hoehrmann (2004-11-20)

Dear XSL Working Group,
Dear XML Query Working Group,

  Section 7.1.4 of the latest XSLT 2.0 and XQuery 1.0 Serialization
draft notes that "The html output method MUST terminate processing
instructions with > rather than ?>" but it does not seem to require that
serializers signal a serialization error if the processing instruction
data contains a ">", which is legal in XML but not possible in HTML.
Please add a serialization error for this case.

regards.
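
The check being requested can be sketched as follows. The exception
class and function name are hypothetical, and the error itself is what
the comment proposes, not something the draft is known to define.

```python
class SerializationError(Exception):
    """Hypothetical error type for this sketch."""


def serialize_html_pi(target, data):
    """Serialize a processing instruction the way the html method does."""
    if ">" in data:
        # The html method ends a PI at the first ">", so the data
        # cannot be represented faithfully.
        raise SerializationError("processing instruction data contains '>'")
    return f"<?{target} {data}>" if data else f"<?{target}>"
```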
qt-2004Nov0075-01: [Serial] 2 Sequence Normalization
[substantive, raised] 2004-11-22
[Serial] 2 Sequence Normalization, David Carlisle (2004-11-22)


Section 2 defines the normalisation step three ways (in English, in
XSLT and in XQuery); unfortunately, I don't think they are equivalent.
I think the intention is to get the effect of the XSLT/XQuery code,
but I don't think the prose does this. The prose could be corrected, but
defining things in "equivalent" ways, even if two of them are in a
non-normative note, is dangerous, and it might be better _just_ to give
an unambiguous definition (in XSLT and/or XQuery) and drop the prose
description. 


The problem with the existing text definition is I believe with
concatenation of text nodes.

 <xsl:result-document>
  <xsl:copy-of select="$seq"/>
</xsl:result-document>


would merge any adjacent text nodes into a single text node with the
concatenated string and drop any empty text nodes (as they would acquire
a parent after copying). Adjacent text nodes may arise either because
they were in adjacent positions in the original sequence, or may "become
adjacent" as a result of taking the children of document nodes, or
converting atomic strings to text nodes (steps 4 and 5).

So if you want to keep the existing text I think you need to make S6
into S7 and add a new S6 that merges adjacent text nodes and removes
text nodes with value the empty string.

David
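
The additional step proposed above can be sketched over a simplified
model of the sequence. The representation of nodes as (kind, value)
pairs and the function name merge_text_nodes are assumptions made for
illustration only.

```python
def merge_text_nodes(nodes):
    """Merge adjacent text nodes and drop zero-length text nodes.

    Each node is modelled as a (kind, value) pair, for example
    ("text", "abc") or ("element", "x").
    """
    merged = []
    for kind, value in nodes:
        if kind == "text":
            if value == "":
                continue  # empty text nodes disappear
            if merged and merged[-1][0] == "text":
                # Concatenate with the preceding text node.
                merged[-1] = ("text", merged[-1][1] + value)
                continue
        merged.append((kind, value))
    return merged
```

Note that an empty text node between two non-empty ones is dropped
first, so its neighbours still end up merged.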
qt-2004Feb0050-01: [Serialization] IBM-SE-002: Bugs in example
[editorial, announced] 2004-10-15

Serialization Section 2, "Serializing Arbitrary Data Models": The note at
the end of this section contains code that is too wide to print. Also, the
XQuery expression in this note will not parse because it has no
parentheses around the conditional part of the if-expression. This note
should be edited as necessary because of the "documentization" process
described above. The resulting code should be checked for validity and
broken into lines of suitable length.

--Don Chamberlin

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Serialization Section 2, "Serializing Arbitrary Data Models": The note at
the end of this section contains code that is too wide to print. Also, the
XQuery expression in this note will not parse because it has no
parentheses around the conditional part of the if-expression. This note
should be edited as necessary because of the "documentization" process
described above. The resulting code should be checked for validity and
broken into lines of suitable length.
>>

     Thank you for your comment, which I have handled editorially.

     I have applied the editorial changes that you suggested.  I would 
appreciate if you could check the next draft of the specification when it 
becomes available, and verify that I've correctly applied the changes.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0050.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0052-01: [Serialization] IBM-SE-003: Undeclare-namespaces parameter
[editorial, announced] 2004-10-15

Serialization Section 3, "Serialization Parameters": The
"undeclare-namespaces" parameter needs a better explanation. I am guessing
that it means the following: If no namespace node attached to an element E
defines the namespace prefix P, but P is defined by a namespace node
attached to the parent of element E, then the serialization of element E
must contain a namespace declaration attribute that binds the prefix P to
an empty URI.

--Don Chamberlin

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Serialization Section 3, "Serialization Parameters": The
"undeclare-namespaces" parameter needs a better explanation. I am guessing
that it means the following: If no namespace node attached to an element E
defines the namespace prefix P, but P is defined by a namespace node
attached to the parent of element E, then the serialization of element E
must contain a namespace declaration attribute that binds the prefix P to
an empty URI.
>>
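
The reading of the parameter guessed at in the comment can be sketched
as below. This follows the commenter's interpretation, not the revised
text of the draft; namespace nodes are modelled as prefix-to-URI
dictionaries, only non-empty prefixes are considered, and the function
name undeclaration_attrs is hypothetical.

```python
def undeclaration_attrs(parent_ns, element_ns):
    """Return the xmlns:P="" attributes needed on an element.

    parent_ns and element_ns map prefixes to namespace URIs for the
    namespace nodes of the parent and of the element itself.
    """
    dropped = sorted(p for p in parent_ns if p and p not in element_ns)
    return [f'xmlns:{prefix}=""' for prefix in dropped]
```

Declarations of the form xmlns:P="" are permitted by Namespaces in
XML 1.1 but not by Namespaces in XML 1.0.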

     Thank you for your comment, which I am handling editorially.

     The section titled, "Serialization Parameters" no longer includes a 
description of the purpose of the "undeclare-namespaces" parameter. 
Instead, I have attempted to improve the description of the parameter that 
appears in the section titled, "XML Output Method: the 
undeclare-namespaces Parameter".

     I would appreciate if you could check the next draft of the 
specification when it becomes available, and verify that the description 
that appears there is satisfactory.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0052.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0054-01: [Serialization] IBM-SE-005: Definition of serialized output
[editorial, announced] 2004-10-15

Serialization Section 4, "XML Output Method": The third paragraph contains
a circular definition of the serialized output that depends on the
serialized output. Presumably it intends to say something like this: "If
the document node of the data model has a single element node and no text
node children, then the serialized output is a well-formed XML document
entity that conforms to the XML Namespaces Recommendation. Otherwise, the
serialized output is a well-formed XML external general parsed entity
which, when referenced within a trivial XML document wrapper like this
..."   (This will clean up the circular logic. But this whole paragraph
should be edited based on the "documentization" process described in
separate comment IBM-SE-001.)

--Don Chamberlin

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Serialization Section 4, "XML Output Method": The third paragraph contains
a circular definition of the serialized output that depends on the
serialized output. Presumably it intends to say something like this: "If
the document node of the data model has a single element node and no text
node children, then the serialized output is a well-formed XML document
entity that conforms to the XML Namespaces Recommendation. Otherwise, the
serialized output is a well-formed XML external general parsed entity
which, when referenced within a trivial XML document wrapper like this
..."   (This will clean up the circular logic. But this whole paragraph
should be edited based on the "documentization" process described in
separate comment IBM-SE-001.)
>>

     Thank you for your comment, which I am handling editorially.

     I believe that the wording of this section was revised in response to 
another comment on the last call draft, and that revision has fixed the 
circular definition you cite.  I would appreciate if you could check the 
next draft of the specification when it becomes available, and verify that 
it correctly resolves your issue.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0054.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0056-01: [Serialization] IBM-SE-007: Definition of round-tripping
[editorial, announced] 2004-10-15

Serialization Section 4, "XML Output Method": The paragraph before the
bullet list says that the round-tripped data model must be "the same as
the starting data model". This should be clarified to mean the data model
after the "normalization" (or "documentization") process described in
Section 2.

--Don Chamberlin

Don,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Serialization Section 4, "XML Output Method": The paragraph before the 
bullet list says that the round-tripped data model must be "the same as 
the starting data model". This should be clarified to mean the data model 
after the "normalization" (or "documentization") process described in 
Section 2.
>>

     Thank you for your comment, which I am handling editorially.

     I have applied the editorial change that you suggested.  I would 
appreciate if you could check the next draft of the specification when it 
becomes available to verify that I've correctly applied the change.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0056.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0270-01: [Serialization] SCHEMA-J
[editorial, announced] 2004-10-15
[Serialization] SCHEMA-J, Mary Holstege (2004-02-12)

Editorial

[J] [Section 6.5 HTML Output Method] Encoding states that [style-1] "then
unless the include-content-type parameter is present and has the value "no""
But, for all other parameters, the style is [style-2] "If the xxx parameter
has the value yes". We have a preference for [style-2]. For consistency, we
request you to recast the sentence in Section 6.5 using [style-2].

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
RE: [Serialization] SCHEMA-J, Michael Kay (2004-02-13)

Yes. There are a few phrases like this that predate the separation of
the serialization spec from XSLT, and that make assumptions about the
default values of parameters: the theory is that defaults should be
defined in the XSLT specification, not here, but this has not always
been carried through.

Michael Kay (speaking personally)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group.

<<
Editorial

[J] [Section 6.5 HTML Output Method] Encoding states that [style-1] "then
unless the include-content-type parameter is present and has the value "no""
But, for all other parameters, the style is [style-2] "If the xxx parameter
has the value yes". We have a preference for [style-2]. For consistency, we
request you to recast the sentence in Section 6.5 using [style-2].
>>

     Thanks to you and the XML Schema Working Group for this comment, 
which I am handling editorially.

     In response to another comment, the XSL and XML Query Working Groups 
decided to make all serialization parameters mandatory, except for 
doctype-public and doctype-system, which are still optional.  There are no 
longer instances of the undesirable style you cited.

     I would appreciate if you could check the next public draft of the 
specification when it becomes available to verify that the styles used to 
refer to the values of the two serialization parameters that are still 
optional and to those that are non-optional are acceptable to the XML 
Schema Working Group.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0270.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0273-01: [Serialization] SCHEMA-M
[editorial, announced] 2004-10-18
[Serialization] SCHEMA-M, Mary Holstege (2004-02-12)

Editorial

[M] [Section 1: Introduction] The key words defined in RFC 2119 are in
uppercase and should be here too and where used as a term of art. e.g.
s/may/MAY s/must/MUST etc. or a note should be added clarifying that the
lowercase forms are always used as terms of art.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

<<
Editorial

[M] [Section 1: Introduction] The key words defined in RFC 2119 are in
uppercase and should be here too and where used as a term of art. e.g.
s/may/MAY s/must/MUST etc. or a note should be added clarifying that the
lowercase forms are always used as terms of art.
>>

     Thanks to you and the working group for this comment, which I have 
handled editorially.

     I have applied the editorial change that you suggested.  I would 
appreciate if you could check the next public draft of Serialization when 
it becomes available, and verify that I've addressed the comment to the 
Schema WG's satisfaction.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0273.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0274-01: [Serialization] SCHEMA-N
[editorial, raised] 2004-02-12
[Serialization] SCHEMA-N, Mary Holstege (2004-02-12)

Editorial

[N] [Section 2: Serializing Arbitrary Data Models] This section uses terms
from 'XQuery 1.0 and XPath 2.0 Data Model' and 'XQuery 1.0 and XPath 2.0
Functions and Operators' - such as 'sequence', 'atomic value', 'text node',
'document node', 'casting to xs:string', etc. For readability, these terms
should be cross linked to the extent that it is possible.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
qt-2004Feb0275-01: [Serialization] SCHEMA-O
[editorial, announced] 2004-10-15
[Serialization] SCHEMA-O, Mary Holstege (2004-02-12)

Editorial

[O] [Section 3: Serialization Parameters] 'use-character-maps' is a
misnomer. It is not a boolean parameter as it sounds. It is a list of
{character, string} pairs. Similar to cdata-section-elements, it should be
'character-maps'.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group.

<<
Editorial

[O] [Section 3: Serialization Parameters] 'use-character-maps' is a
misnomer. It is not a boolean parameter as it sounds. It is a list of
{character, string} pairs. Similar to cdata-section-elements, it should be
'character-maps'.
>>

     Thanks to you and the XML Schema Working Group for this comment, 
which I am handling editorially.

     The name of this parameter derives from the attribute of the same 
name that is defined for xsl:output and xsl:result-document by XSLT 2.0. 
Admittedly, the name can be interpreted as implying that it has a boolean 
value, but it is intended to mean "these are the character maps to use." 
The name actually has an antecedent in XSLT 1.0:  use-attribute-sets.
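
To illustrate the "list of {character, string} pairs" reading, a
character map can be modelled as a dictionary applied while text is
written out. This is a minimal sketch only: the function name
apply_character_map is hypothetical, and it ignores the escaping that a
real serializer applies to characters that are not mapped.

```python
def apply_character_map(text, char_map):
    """Replace each mapped character with its string; copy the rest."""
    return "".join(char_map.get(ch, ch) for ch in text)
```

For example, a map containing the pair {U+00A0, "&nbsp;"} would cause
every no-break space in the text to be written as the six characters
&nbsp; in the output.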

     As I'd like to keep the names of the serialization parameters the 
same as the corresponding attributes defined by XSLT, I'm inclined not to 
make this editorial change.

     If this response is not acceptable to the XML Schema Working Group, I 
would invite you to reopen the issue.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0275.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0276-01: [Serialization] SCHEMA-P
[editorial, announced] 2004-06-13
[Serialization] SCHEMA-P, Mary Holstege (2004-02-12)

Editorial

[P] [Section 3: Serialization Parameters] Though the title is 'Serialization
Parameters', this section also outlines the four phases of serialization.
Request to break this section into two: 'Serialization Parameters' and
'Serialization'.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Re: [Serialization] SCHEMA-P, Henry Zongaro (2004-06-13)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:


Mary Holstege wrote on 2004-02-12 04:18:57 PM:
> Editorial
> 
> [P] [Section 3: Serialization Parameters] Though the title is
> 'Serialization Parameters', this section also outlines the four phases
> of serialization. Request to break this section into two:
> 'Serialization Parameters' and 'Serialization'.

     I have applied the editorial change that you suggested, splitting the 
section into two.  I named the second section "Phases of Serialization". 
Thanks to you and the Schema WG for this comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0276.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0278-01: [Serialization] SCHEMA-Q
[editorial, announced] 2004-06-13
[Serialization] SCHEMA-Q, Mary Holstege (2004-02-12)

Editorial

[Q] [Section 4.5: XML Output Method: the omit-xml-declaration Parameter]
This section jams two parameters, omit-xml-declaration and standalone.
Suggestion: split them into 2 sections.

On behalf of the XML Schema WG.

	-- Mary
	   Holstege@mathling.com
Re: [Serialization] SCHEMA-Q, Henry Zongaro (2004-06-13)

Mary,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the XML Schema 
Working Group:

Mary Holstege wrote on 2004-02-12 04:19:19 PM:
> Editorial
> 
> [Q] [Section 4.5: XML Output Method: the omit-xml-declaration Parameter]
> This section jams two parameters, omit-xml-declaration and standalone.
> Suggestion: split them into 2 sections.

     Thanks to you and the Schema Working Group for this comment.  I felt 
that the effect of the standalone parameter is too tightly coupled with 
the effect of the omit-xml-declaration parameter to split this section 
into two.  Rather than make the proposed change, I decided to rename the 
section "XML Output Method: the omit-xml-declaration and standalone 
Parameters".

     I hope that change will be acceptable to the working group.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0278.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
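
The coupling mentioned in the reply above comes from the fact that a standalone value can only be written as part of the XML declaration. The following Python sketch illustrates this. It is not taken from the specification: the function name xml_declaration, its argument names and the decision to raise an error when the declaration is omitted while standalone is specified are all assumptions made for this example, and the draft may resolve that combination differently.

```python
def xml_declaration(version="1.0", encoding="UTF-8",
                    omit_xml_declaration=False, standalone=None):
    """Return the XML declaration text, or "" when it is omitted.

    standalone is None (not specified), True ("yes") or False ("no").
    """
    if omit_xml_declaration:
        if standalone is not None:
            # Assumption for this sketch: treat the combination as a conflict,
            # because standalone cannot be written without a declaration.
            raise ValueError("standalone requires an XML declaration")
        return ""
    parts = ['version="%s"' % version, 'encoding="%s"' % encoding]
    if standalone is not None:
        parts.append('standalone="%s"' % ("yes" if standalone else "no"))
    return "<?xml %s?>" % " ".join(parts)


print(xml_declaration(standalone=True))
```

Because the standalone value has no place to go once the declaration is omitted, any description of one parameter has to refer to the other, which is consistent with keeping both in a single section.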
qt-2004Feb0362-18: [Serial] I18N WG last call comments [21]
[editorial, announced] 2004-10-15
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,



Below please find the I18N WG's comments on your last call document

"XSLT 2.0 and XQuery 1.0 Serialization"

(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).



Please note the following:

- Please address all replies to these comments to the I18N IG mailing

   list (w3c-i18n-ig@w3.org), not just to me.

- Our comments are numbered in square brackets [nn].





We look forward to further discussion with you.



[this mail is copied to the DOM WG to tell them what we are

telling you about UTF-16 and endianness, which they should

adopt for the

Document Object Model (DOM) Level 3 Load and Save Specification]



[21] Section 8: There should be a reference to XSLT to show examples

   of use of character maps.



Regards,    Martin.


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

<<
[21] Section 8: There should be a reference to XSLT to show examples
   of use of character maps.
>>

     Thanks to you and the I18N Working Group for this comment, which I'm 
handling editorially.

     I have applied the editorial change that you suggested.  I would 
appreciate it if you could check the next public draft of the specification 
when it becomes available, and verify that I've correctly applied the 
change.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
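
As an illustration of what a character map does during serialization, the following Python sketch replaces each mapped character with a string. It is not an XSLT example and is not taken from the specification. The map contents and the function name apply_character_map are assumptions made for this illustration.

```python
# Hypothetical character map: each key is a single character, each value
# is the string to emit in its place (assumed contents, for illustration).
CHARACTER_MAP = {
    "\u00a0": "&nbsp;",   # NO-BREAK SPACE
    "\u00a9": "&copy;",   # COPYRIGHT SIGN
}


def apply_character_map(text, character_map=CHARACTER_MAP):
    """Replace every mapped character in text with its mapping string."""
    return text.translate(str.maketrans(character_map))


result = apply_character_map("\u00a9\u00a02005 W3C")
print(result)
```

Characters that do not appear in the map pass through unchanged, so the mapping affects only the characters it names.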
qt-2004Feb0362-23: [Serial] I18N WG last call comments [26-30,33-34]
[editorial, announced] 2004-10-15
[Serial] I18N WG last call comments, Martin Duerst (2004-02-15)

Dear XML Query WG and XSL WG,



Below please find the I18N WG's comments on your last call document

"XSLT 2.0 and XQuery 1.0 Serialization"

(http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20031112/).



Please note the following:

- Please address all replies to these comments to the I18N IG mailing

   list (w3c-i18n-ig@w3.org), not just to me.

- Our comments are numbered in square brackets [nn].





We look forward to further discussion with you.



[this mail is copied to the DOM WG to tell them what we are

telling you about UTF-16 and endianness, which they should

adopt for the

Document Object Model (DOM) Level 3 Load and Save Specification]



Editorial:



[26] Normalization: This term is used for different things:

   - Character normalization (Charmod, NFC)

   - Normalization as described in section 2 of this document.

   - Normalization as described in the formal semantics document.

   These should be very clearly distinguished and labeled.



[27] Section 3, 'media-type', says "... the charset parameter of the

   media type must not be specified explicitly". This should be

   changed to "... the charset parameter of the media type must

   not be specified explicitly here." to make clear that this

   is just a statement about this parameter, not in general.



[28] Section 3, "omit-xml-declaration specifies whether the serialization

   process is to output an XML declaration. The value must be yes or no

   If this parameter is not specified, the value is implementation defined."

   The wording should be improved to make clear which is yes and which

   is no. (and please add a period after 'no').



[29] Section 4: "Additional nodes may be present in the new tree, and

   the values of attribute nodes and text nodes in the new tree may be

   different from those in the original tree, due to the character

   expansion phase of serialization.": this should clearly state

   that this applies only to URI escaping and character mapping, and

   that CDATA sections and escaping of special characters cannot

   create differences.



[30] 4.8: "If the output method is xml and the value of the version

   parameter is 1.0, namespace >UN<declaration is not performed,

   and the undeclare-namespace parameter is ignored."



[33] Section 7, freestanding paragraph "The default encoding for the text

   output method is implementation-defined.": this is a repetition from

   the previous paragraph and should be removed.



[34] RFC 2376 is obsoleted by RFC 3023.



Regards,    Martin.


Martin,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

<<
[26] Normalization: This term is used for different things:
   - Character normalization (Charmod, NFC)
   - Normalization as described in section 2 of this document.
   - Normalization as described in the formal semantics document.
   These should be very clearly distinguished and labeled.

[27] Section 3, 'media-type', says "... the charset parameter of the
   media type must not be specified explicitly". This should be
   changed to "... the charset parameter of the media type must
   not be specified explicitly here." to make clear that this
   is just a statement about this parameter, not in general.

[28] Section 3, "omit-xml-declaration specifies whether the serialization
   process is to output an XML declaration. The value must be yes or no
   If this parameter is not specified, the value is implementation defined."
   The wording should be improved to make clear which is yes and which
   is no. (and please add a period after 'no').

[29] Section 4: "Additional nodes may be present in the new tree, and
   the values of attribute nodes and text nodes in the new tree may be
   different from those in the original tree, due to the character
   expansion phase of serialization.": this should clearly state
   that this applies only to URI escaping and character mapping, and
   that CDATA sections and escaping of special characters cannot
   create differences.

[30] 4.8: "If the output method is xml and the value of the version
   parameter is 1.0, namespace >UN<declaration is not performed,
   and the undeclare-namespace parameter is ignored."

[Made [31] and [32] into separate, substantive issues.  HZ]

[33] Section 7, freestanding paragraph "The default encoding for the text
   output method is implementation-defined.": this is a repetition from
   the previous paragraph and should be removed.

[34] RFC 2376 is obsoleted by RFC 3023.
>>

     Thanks to you and the I18N Working Group for these comments, which I 
am handling editorially.

     I have applied the following changes to the serialization draft in 
response to these comments:

[26] I've tried to clarify the use of the first two types of normalization 
by referring to them as sequence normalization and Unicode normalization 
throughout the document.

[27] I made the suggested correction.

[28] Most descriptions of parameters have been removed from the 
"Serialization Parameters" section, including the description of 
omit-xml-declaration.  The description of this parameter that appears in 
the section on the XML output method should be clear.

[29] I have clarified this by referring explicitly to URI escaping, 
character mapping and Unicode normalization.

[30] It is now considered to be a serialization error if 
undeclare-namespaces has the value yes and the output method is xml, so 
this sentence no longer appears in the draft.

[33] The encoding parameter is no longer optional, so the two pieces of 
redundant information have been removed.

[34] Added a new normative reference to RFC 3023.


     I would appreciate it if you could check the next public draft of the 
specification when it becomes available, and verify that I've correctly 
applied all the changes, and that they resolve these issues to the 
satisfaction of the I18N Working Group.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0362.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
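
To make the distinction drawn in comment [26] concrete, the following Python sketch shows Unicode normalization only, using the standard unicodedata module. It does not model sequence normalization, and the sample string is an assumption chosen for this illustration.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT: two code points.
decomposed = "e\u0301"

# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed == "\u00e9")
```

Unicode normalization of this kind changes the characters inside a string value. Sequence normalization, by contrast, operates on the items of the sequence being serialized.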
qt-2004Feb0920-01: ORA-SE-327-B: Surely namespace declaration is part of serializing XML version 1.0
[editorial, acknowledged] 2004-06-13



SECTION 4.7 : XML output method: the undeclare-namespaces parameter



Last sentence: "If the output method is xml and the value of the

version parameter is 1.0, namespace declaration is not performed...".

This statement looks incorrect.  Surely the output of an XML 1.0

method must declare namespaces, otherwise the result will not 

conform to Namespaces 1.0 Recommendation.  I think "declaration"

is a typo here; I think you mean "undeclaration".





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.


Stephen Buxton wrote on 2004-02-17 06:32:42 AM:
> SECTION 4.7 : XML output method: the undeclare-namespaces parameter
> 
> Last sentence: "If the output method is xml and the value of the
> version parameter is 1.0, namespace declaration is not performed...".
> This statement looks incorrect.  Surely the output of an XML 1.0
> method must declare namespaces, otherwise the result will not 
> conform to Namespaces 1.0 Recommendation.  I think "declaration"
> is a typo here; I think you mean "undeclaration".

     Thank you for pointing out this typographical error.  In response to 
another issue, this conflict in the settings of parameters has become a 
serialization error; the subsequent rewording removed the typo.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0920.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0931-01: ORA-SE-306-C: Confusing definition of the "version" parameter
[editorial, announced] 2004-10-15



SECTION 3: Serialization parameters



The parameter "version" is said to be "the version of the output

method".  Viewing an output method as a deliverable product,

or component of a deliverable product,

the version of the output method is not subject to

change from invocation to invocation, and hence is not a 

parameter in the usual meaning of the term, being rather a 

descriptive identifier about the output method, just the same as

the name of the vendor selling the output method is not a 

parameter, it is a descriptive identifier.  Perhaps what is meant

is the version of XML to be generated.  This is corroborated by

section 4.1 "XML output method: the version parameter".



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
SECTION 3: Serialization parameters

The parameter "version" is said to be "the version of the output
method".  Viewing an output method as a deliverable product,
or component of a deliverable product,
the version of the output method is not subject to
change from invocation to invocation, and hence is not a 
parameter in the usual meaning of the term, being rather a 
descriptive identifier about the output method, just the same as
the name of the vendor selling the output method is not a 
parameter, it is a descriptive identifier.  Perhaps what is meant
is the version of XML to be generated.  This is corroborated by
section 4.1 "XML output method: the version parameter".
>>

     Thank you for this comment, which I am handling editorially.

     In response to another last call comment, the description of the 
version parameter no longer appears in Section 3 "Serialization 
Parameters".  Instead, descriptions of this parameter appear in the 
definitions of the XML and HTML output methods.

     I would appreciate it if you could check the next draft of the 
specification when it becomes available, and verify that those definitions 
of the version parameter are acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0931.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0933-01: ORA-SE-310-E: difficult sentence to parse
[editorial, acknowledged] 2004-06-13
ORA-SE-310-E: difficult sentence to parse, Stephen Buxton (2004-02-17)



SECTION 4: XML output method



Sixth open circle bullet: "Additional namespace nodes may be 

present in the new tree if the serialization process undeclared

namespaces...".  On first reading, this sentence seemed 

ungrammatical.  The problem was that it felt like "the 

serialization" was the subject, and "process" was the verb,

which should either be "processes" or "processed", and then

"undeclared namespaces" would be the direct object.

Rewording it as "Additional namespace nodes may be 

present in the new tree if the serialization process undeclared

one or more namespaces..." might help.



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-17 06:44:21 AM:
> SECTION 4: XML output method
> 
> Sixth open circle bullet: "Additional namespace nodes may be 
> present in the new tree if the serialization process undeclared
> namespaces...".  On first reading, this sentence seemed 
> ungrammatical.  The problem was that it felt like "the 
> serialization" was the subject, and "process" was the verb,
> which should either be "processes" or "processed", and then
> "undeclared namespaces" would be the direct object.
> Rewording it as "Additional namespace nodes may be 
> present in the new tree if the serialization process undeclared
> one or more namespaces..." might help.

     Thank you for pointing out the potential difficulty in parsing this 
sentence.

     I have reworded the clause as follows:  "Additional namespace nodes 
may be present in the new tree if the serialization process did not 
undeclare one or more namespaces. . . ."  Note that "undeclared" has 
become "did not undeclare" in response to a substantive comment.  I hope 
you will find that the new version is not so difficult to parse.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0933.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0934-01: ORA-SE-303-B: undeclare-namespaces parameter is relevant to markup generation
[editorial, acknowledged] 2004-06-13



SECTION 3: Serialization parameters



Assuming that phase 1, "Markup generation", is responsible for

creating namespace declarations, the parameter undeclare-namespaces

is relevant to this phase and should be listed here.





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-17 06:44:00 AM:
> SECTION 3: Serialization parameters
> 
> Assuming that phase 1, "Markup generation", is responsible for
> creating namespace declarations, the parameter undeclare-namespaces
> is relevant to this phase and should be listed here.

     Yes, creation of namespace declarations falls under the purview of 
the markup generation phase.  I have added the undeclare-namespaces 
parameter to the list of parameters that influence that phase of 
serialization.  Thank you for pointing out the omission.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0934.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0935-01: ORA-SE-311-C: What is the "processor"?
[editorial, announced] 2004-10-15
ORA-SE-311-C: What is the "processor"?, Stephen Buxton (2004-02-17)



SECTION 4: XML output method



Last sentence concludes with "...if nodes in the model contain

characters that are invalid in XML (introduced, perhaps, by

calling a user-written extension function: this is an error

but the processor is not required to signal it)." It is not

clear what is meant by "processor".  Does this refer to the 

XQuery engine that invoked the user-written extension and 

thereby obtained a corrupt value, or does it refer to the 

XML output method?  Probably the former, since there does 

not appear to be any provision for calling user-written 

functions from the output method.  This could be fixed by

changing "processor" to "XQuery processor".



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
SECTION 4: XML output method

Last sentence concludes with "...if nodes in the model contain
characters that are invalid in XML (introduced, perhaps, by
calling a user-written extension function: this is an error
but the processor is not required to signal it)." It is not
clear what is meant by "processor".  Does this refer to the 
XQuery engine that invoked the user-written extension and 
thereby obtained a corrupt value, or does it refer to the 
XML output method?  Probably the former, since there does 
not appear to be any provision for calling user-written 
functions from the output method.  This could be fixed by
changing "processor" to "XQuery processor".
>>

     Thank you for your comment, which I am handling editorially.

     In response to your comment, and at the suggestion of Michael Kay, 
I've introduced a new term:  serializer.  That term is now used throughout 
the specification.  The only two remaining instances of "processor" are 
qualified and refer to XML processors and XSLT processors.

     I would appreciate it if you could check the next draft of the 
specification when it becomes available, and verify that this new term and 
its definition satisfactorily address your comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0935.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb0937-01: ORA-SE-314-B: Additional namespace nodes may be present if serialization does not undeclare namespaces
[editorial, announced] 2004-10-15



SECTION 4: XML output method



Sixth bullet, "Additional namespace nodes may be present in the 

new tree if the serialization process undeclared namespaces."

This seems to be a misstatement of what you intend.  Given a

document node D with an element node E1 with a child E2 with fewer

inscope namespaces than its parent E1, then there are four scenarios

to consider, forming a two-by-two matrix: The output method may

undeclare namespaces, or it may not; and the parse of the output

may be an XML 1.0 parser or an XML 1.1 parser.  The analysis of

the four cases is:



undeclare, reparse with XML 1.0: this will generate an error 

during the reparse, since undeclaring is not a feature of XML 1.0.



undeclare, reparse with XML 1.1: this will restore the original

value.



no undeclare, reparse with XML 1.0: no error during the reparse

step (at least for namespace undeclarations), so the resulting

document node will have more namespace nodes in the regenerated

E2 than it should have.



no undeclare, reparse with XML 1.1: same analysis as the preceding case.



Thus the correct statement is that additional namespace nodes 

may be present in the new tree if the serialization process did

not undeclare namespaces.



That is, replace "undeclared" with "did not undeclare".





- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
SECTION 4: XML output method

Sixth bullet, "Additional namespace nodes may be present in the 
new tree if the serialization process undeclared namespaces."
This seems to be a misstatement of what you intend.  Given a
document node D with an element node E1 with a child E2 with fewer
inscope namespaces than its parent E1, then there are four scenarios
to consider, forming a two-by-two matrix: The output method may
undeclare namespaces, or it may not; and the parse of the output
may be an XML 1.0 parser or an XML 1.1 parser.  The analysis of
the four cases is:

undeclare, reparse with XML 1.0: this will generate an error 
during the reparse, since undeclaring is not a feature of XML 1.0.

undeclare, reparse with XML 1.1: this will restore the original
value.

no undeclare, reparse with XML 1.0: no error during the reparse
step (at least for namespace undeclarations), so the resulting
document node will have more namespace nodes in the regenerated
E2 than it should have.

no undeclare, reparse with XML 1.1: same analysis as the preceding case.

Thus the correct statement is that additional namespace nodes 
may be present in the new tree if the serialization process did
not undeclare namespaces.

That is, replace "undeclared" with "did not undeclare".
>>

     Thank you for your comment, which I am handling editorially.

     I have applied the correction you pointed out.  I would appreciate it 
if you could check the next draft of the specification when it becomes 
available, and verify that I've correctly applied the change.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0937.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
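
Two of the four cases analysed in the comment above can be reproduced with a namespace-aware XML 1.0 parser. The following Python sketch uses the standard xml.etree.ElementTree module. The element names, the prefix p and the URI urn:example:p are assumptions made for this illustration, and the behaviour shown is that of this one parser rather than a statement about all parsers.

```python
import xml.etree.ElementTree as ET

# No undeclaration: the binding of prefix "p" on E1 is inherited by E2,
# so a reparse sees "p" in scope inside E2 even if the original E2 node
# had no namespace node for it.
inherited = ET.fromstring('<E1 xmlns:p="urn:example:p"><E2><p:x/></E2></E1>')
inner_tag = inherited[0][0].tag


def reparse_fails(document):
    """Return True if this XML 1.0 parser rejects the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return True
    return False


# Undeclaration uses Namespaces in XML 1.1 syntax (xmlns:p=""), which a
# namespace-aware XML 1.0 parser is expected to reject.
undeclared_rejected = reparse_fails(
    '<E1 xmlns:p="urn:example:p"><E2 xmlns:p=""/></E1>'
)

print(inner_tag, undeclared_rejected)
```

The first document corresponds to the "no undeclare" cases, where the reparsed E2 has the prefix in scope. The second corresponds to "undeclare, reparse with XML 1.0".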
qt-2004Feb0978-01: [Serial] IBM-SE-102: Serialization editorial comments
[editorial, announced] 2004-10-15



[My apologies that these comments are coming in after the end of the Last 

Call comment period.]



Hello,



     Following are comments on Serialization that we believe to be 

editorial in nature.





------------------------------------------------------------------



Section 2



The last sentence states that xs:NOTATION cannot be converted to 

xs:string.  That's no longer true.



------------------------------------------------------------------



Section 3



In the second bullet (cdata-section-elements), the value should be a list 

of expanded QNames rather than names.



------------------------------------------------------------------



Section 3



In the second bullet (cdata-section-elements), the clause "no elements 

will be treated specially" appears.  The meaning of "treated specially" is 

not clear - the statement should be made more clear.



------------------------------------------------------------------



Section 3



In the penultimate bullet (use-character-maps), the word "provides" is 

vague.  This should use the word "specifies", as do other bullets.



------------------------------------------------------------------



Section 3



In the penultimate bullet (use-character-maps), the name of the parameter 

is not entirely accurate.  In fact, there is just one mapping, though it 

may map many characters to strings.



------------------------------------------------------------------



Section 4.4



The last bullet refers to the xml:space attribute.  A reference to the 

definition of that attribute would be appropriate.



------------------------------------------------------------------



Section 4.4



In the last bullet, the style used to describe the value of the xml:space 

attribute isn't appropriate.  Change 'xml:space="preserve" attribute' to 

"xml:space attribute with the value 'preserve'".



------------------------------------------------------------------



Section 4.5



The last sentence describes circumstances in which the 

omit-xml-declaration parameter should be ignored.  Rather than saying it's 

ignored, it might be easier to understand if this indicated it's treated 

as if the value was "no".



------------------------------------------------------------------



Section 5



In the note in the fifth bullet, "in in" should be "in".



------------------------------------------------------------------



Section 5



Suggest replacing the note in the fifth bullet with the following:



<<

NOTE: Where the process used to construct the input data model does not 

provide complete control over the prefix (or lack thereof) used for an 

element name in the data model, implementors are encouraged to produce 

namespace syntax appropriate to the kind of document being serialized (when 

possible). For example, when serializing a document as XHTML it is 

preferable to bind "http://www.w3.org/1999/xhtml" as the default namespace 

(no prefix), like so: 



<html xmlns="http://www.w3.org/1999/xhtml"> ... </html> 



for best compatibility with pre-XHTML applications.

>>



------------------------------------------------------------------



Section 5



The sixth bullet states, "The content type should be set to the value 

given for the media-type parameter; the default value for XHTML is 

text/html. The value application/xhtml+xml, registered in [RFC3236], may 

also be used."  It is not clear whether this means that a processor has 

two choices of media-type to use as the default, or "may also be used" 

refers to what the client of the serialization process may specify as the 

value of the parameter.  That needs to be clearly specified.  If the 

latter, it also needs to be clearly specified whether those are the only 

two values permitted.



------------------------------------------------------------------



Appendix A



Add references to XML 1.1 and Namespaces in XML 1.1.



------------------------------------------------------------------



Thanks,



Henry

[Speaking on behalf of reviewers from IBM.]

------------------------------------------------------------------

Henry Zongaro      Xalan development

IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044

mailto:zongaro@ca.ibm.com



    

Hello.

     In [1], I submitted various editorial comments on the Last Call 
Working Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of 
reviewers at IBM.

     I have applied the suggested editorial changes, except those in the 
description of the XHTML output method.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0978.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
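
The note proposed for Section 5 recommends binding the XHTML namespace as the default namespace. The following Python sketch shows one way a serializer can be asked to do that, using the standard xml.etree.ElementTree module. It is an illustration only. The choice of library and the element structure are assumptions, and nothing here is prescribed by the specification.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Bind the XHTML namespace to the empty prefix so that it is written as
# the default namespace rather than with a generated prefix such as ns0.
ET.register_namespace("", XHTML_NS)

html = ET.Element("{%s}html" % XHTML_NS)
ET.SubElement(html, "{%s}body" % XHTML_NS)

serialized = ET.tostring(html, encoding="unicode")
print(serialized)
```

Without the call to register_namespace, this library would choose a prefix of its own, which is the situation the proposed note asks implementors to avoid where possible.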
qt-2004Feb1037-01: ORA-SE-293-E: Redundant phrase that can be deleted
[editorial, acknowledged] 2004-06-13



SECTION 2: Serializing arbitrary data models



Step 2 says "If the data model instance contains any atomic

values, or sequences that contain atomic values, ...".

But the input to the normalization process is a single sequence,

and sequences do not nest, so what does it mean to say that

a sequence contains a sequence?  Perhaps you mean the second 

sequence to be a subsequence of the input sequence.  But there

is no need for this case, since if a subsequence contains an

atomic value, the input sequence also contains an atomic value.

The phrase "or sequences that contain atomic values" can be

deleted.



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.


Stephen Buxton wrote on 2004-02-18 05:21:15 PM:
> SECTION 2: Serializing arbitrary data models
> 
> Step 2 says "If the data model instance contains any atomic
> values, or sequences that contain atomic values, ...".
> But the input to the normalization process is a single sequence,
> and sequences do not nest, so what does it mean to say that
> a sequence contains a sequence?  Perhaps you mean the second 
> sequence to be a subsequence of the input sequence.  But there
> is no need for this case, since if a subsequence contains an
> atomic value, the input sequence also contains an atomic value.
> The phrase "or sequences that contain atomic values" can be
> deleted.

     I am not entirely certain why that phrase appeared here, but I agree 
that it is entirely redundant.  I have applied the editorial change that 
you suggested.  Thank you for your comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1037.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
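
Because sequences do not nest, a single pass over the input sequence is enough to find every atomic value. The following Python sketch illustrates that point. The rule it applies, converting each atomic value to a string and joining adjacent values with a single space, is an assumption about the normalization step, not a quotation from the draft. The function name normalize_atomic_values is hypothetical, and Python's str() stands in for the real conversion to xs:string.

```python
def normalize_atomic_values(sequence):
    """Merge runs of adjacent atomic values into single strings.

    In this sketch, str and int items play the role of atomic values.
    Any other item stands in for a node and is passed through unchanged.
    """
    result, run = [], []
    for item in sequence:
        if isinstance(item, (str, int)):
            run.append(str(item))
        else:
            if run:
                result.append(" ".join(run))
                run = []
            result.append(item)
    if run:
        result.append(" ".join(run))
    return result


node = object()  # placeholder for a node in the data model
print(normalize_atomic_values([1, "a", node, 2]) == ["1 a", node, "2"])
```

The loop never needs to look inside an item for further sequences, which is why the phrase "or sequences that contain atomic values" adds nothing.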
qt-2004Feb1038-01: ORA-SE-307-E: "An xml output method" is better than "the xml output method"
[editorial, announced] 2004-10-15



SECTION 4: XML output method



The first four words of this section are "The xml output method".

The specification is non-constructive and non-deterministic.  

This is a good thing, but it means that there is more than one

acceptable algorithm for an xml output method.  Consequently

calling it "the xml output method" is misleading.  It would be

better to say explicitly that the specification is non-constructive

and non-deterministic, and talk about the requirements on 

"an xml output method" rather than "the xml output method".

Similarly, the title might be "XML output methods" rather than

the singular.



- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
SECTION 4: XML output method

The first four words of this section are "The xml output method".
The specification is non-constructive and non-deterministic. 
This is a good thing, but it means that there is more than one
acceptable algorithm for an xml output method.  Consequently
calling it "the xml output method" is misleading.  It would be
better to say explicitly that the specification is non-constructive
and non-deterministic, and talk about the requirements on 
"an xml output method" rather than "the xml output method".
Similarly, the title might be "XML output methods" rather than
the singular.
>>

     Thank you for your comment, which I am handling editorially.

     You make a good point.  I have added a paragraph to the end of the 
section entitled "Serialization Parameters" indicating that some 
unspecified details of the output methods are implementation-dependent, in 
an attempt to make it clear that the specifications of the output methods 
are non-constructive.

     However, I've decided not to accept your suggestion to refer to "an 
xml output method" rather than "the xml output method."  I'm inclined to 
regard this as a single method that is parameterized by both serialization 
parameters and implementation-defined and -dependent behaviours.  I think 
using the indefinite article would be confusing for readers.

     I would appreciate it if you could check the next draft of 
serialization to verify whether this resolution is acceptable to you.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1038.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1039-01: ORA-SE-328-E: no mention of the standalone property
[editorial, acknowledged] 2004-06-13



SECTION 4.8:  XML output method: other parameters

There is no section for the standalone parameter, and it is not
mentioned here either.  In section 3 "Serialization parameters"
two paragraphs after the list of parameters, last sentence, it
says "If the semantics of a parameter are not described for an
output method, then it is not applicable to that output method."
This would seem to imply that the standalone parameter is not
applicable to the XML output method.  But that can't be; see
section 4.5 "XML output method: the omit-xml-declaration parameter",
which describes interactions between the omit-xml-declaration
parameter and the standalone parameter.  Arguably these mentions
are sufficient, but it would be better if either the standalone
property appeared in the title of some section, or was listed
in this section as one of the "other parameters".

- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-18 05:21:28 PM:
> SECTION 4.8:  XML output method: other parameters
> 
> There is no section for the standalone parameter, and it is not
> mentioned here either.  In section 3 "Serialization parameters"
> two paragraphs after the list of parameters, last sentence, it 
> says "If the semantics of a parameter are not described for an 
> output method, then it is not applicable to that output method."
> This would seem to imply that the standalone parameter is not
> applicable to the XML output method.  But that can't be; see
> section 4.5 "XML output method: the omit-xml-declaration parameter",
> which describes interactions between the omit-xml-declaration
> parameter and the standalone parameter.  Arguably these mentions
> are sufficient, but it would be better if either the standalone
> property appeared in the title of some section, or was listed
> in this section as one of the "other parameters".

     In response to your comment, I applied an editorial change to rename 
the section that was previously entitled "XML output method: the 
omit-xml-declaration Parameter" to "XML Output Method: the 
omit-xml-declaration and standalone Parameters".  I believe that should 
make it clear that the standalone parameter is applicable to the xml 
output method.

     Thank you for submitting your comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1039.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1041-01: ORA-SE-296-E: Please define "serialization error"
[editorial, announced] 2004-10-15



SECTION 2: serializing arbitrary data models

The term "serialization error" is used in various places
but never formally defined and therefore it
is not clear what this term encompasses.  Step 2 says
"It is a serialization error if the value cannot be cast to
xs:string." and step 6 says "It is a serialization error if an
item in the sequence is an attribute node or a namespace node."
So "serialization error" includes at least these two conditions,
but it is not clear on reading this section whether there might
be others defined later in the specification.  The paragraph
after the six steps says "If the normalization process results
in a serialization error, the processor must signal the error."
So it must signal the two just described; are there any others?

- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
SECTION 2: serializing arbitrary data models

The term "serialization error" is used in various places
but never formally defined and therefore it
is not clear what this term encompasses.  Step 2 says 
"It is a serialization error if the value cannot be cast to 
xs:string." and step 6 says "It is a serialization error if an 
item in the sequence is an attribute node or a namespace node."
So "serialization error" includes at least these two conditions,
but it is not clear on reading this section whether there might
be others defined later in the specification.  The paragraph
after the six steps says "If the normalization process results 
in a serialization error, the processor must signal the error."
So it must signal the two just described; are there any others?
>>

     Thank you for your comment, which I am handling editorially.

     I have added a terminology section that includes a definition of 
"serialization error", and what it means to signal a serialization error. 
I would appreciate it if you could check the next draft of the specification
when it becomes available to verify that you find the definition to be 
acceptable.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1041.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
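
The two error conditions quoted in this comment can be illustrated with a short sketch. It is not taken from the specification: the names SerializationError, Node, check_item and normalize are hypothetical, and the set of Python types treated as castable to a string is an assumption made only for illustration.

```python
from dataclasses import dataclass


class SerializationError(Exception):
    """Hypothetical error type signalled when normalization fails."""


@dataclass
class Node:
    """Hypothetical stand-in for a data model node."""
    kind: str          # e.g. "element", "text", "attribute", "namespace"
    value: str = ""


def check_item(item):
    """Apply the two conditions quoted from steps 2 and 6 to one item."""
    if isinstance(item, Node):
        # Step 6 as quoted: attribute and namespace nodes are errors.
        if item.kind in ("attribute", "namespace"):
            raise SerializationError(f"{item.kind} node in sequence")
        return item
    # Step 2 as quoted: the value must be castable to a string.
    # Assumption: only these Python types count as castable here.
    if isinstance(item, (str, int, float, bool)):
        return str(item)
    raise SerializationError(f"cannot cast {type(item).__name__} to a string")


def normalize(sequence):
    """Check every item; the first failing item raises the error."""
    return [check_item(item) for item in sequence]
```

Under these assumptions, a caller that wants to "signal the error" simply lets SerializationError propagate; any further conditions defined later in the specification would be additional raise sites.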
qt-2004Feb1043-01: ORA-SE-299-E: misplaced comma
[editorial, acknowledged] 2004-06-13
ORA-SE-299-E: misplaced comma, Stephen Buxton (2004-02-18)



SECTION 3: Serialization parameters

It says "undeclare-namespaces specifies whether namespaces, are..."
The comma should be removed.

- Steve B.



    
Re: ORA-SE-299-E: misplaced comma, Henry Zongaro (2004-06-13)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-18 05:22:21 PM:
> SECTION 3: Serialization parameters
> 
> It says "undeclare-namespaces specifies whether namespaces, are..."
> The comma should be removed.

     Thank you for submitting your comment.

     In response to another comment, the description of the meaning of a 
parameter appears only in the section on the xml output method.  I removed 
the text with the typographical error you pointed out.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1043.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1044-01: ORA-SE-297-E: Alphabetization problem
[editorial, acknowledged] 2004-06-13
ORA-SE-297-E: Alphabetization problem, Stephen Buxton (2004-02-18)



SECTION 3: serialization parameters

The list of parameters appears to be alphabetized, with the
exception of the first parameter, encoding.  Perhaps this one
should be placed in alphabetic order.  Perhaps there should be
a prefatory note that the parameters are listed in alphabetic
order.

- Steve B.



    
Re: ORA-SE-297-E: Alphabetization problem, Henry Zongaro (2004-06-13)

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-18 05:22:29 PM:
> SECTION 3: serialization parameters
> 
> The list of parameters appears to be alphabetized, with the 
> exception of the first parameter, encoding.  Perhaps this one
> should be placed in alphabetic order.  Perhaps there should be
> a prefatory note that the parameters are listed in alphabetic 
> order.

     I have applied the first editorial change you suggested, and placed 
all of the parameters in alphabetical order.  I did not feel it was 
necessary to include a prefatory note.  Thank you for submitting your 
comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1044.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1045-01: ORA-SE-295-E: The Note overflows the right margin when printed
[editorial, acknowledged] 2004-06-13



SECTION 2: Serializing arbitrary data models

The Note towards the end of this section overflows the right
margin when this specification is printed.

- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-18 05:22:38 PM:
> SECTION 2: Serializing arbitrary data models
> 
> The Note towards the end of this section overflows the right
> margin when this specification is printed.

     I believe you were referring specifically to the XQuery expression 
that appears in the example in that note.  I have adjusted the formatting 
of the example, so that it should fit within a page of reasonable width. 
Thank you for pointing out the problem.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1045.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1046-01: ORA-SE-291-E: Term "empty string" is a poor choice of words
[editorial, raised] 2004-02-18



SECTION 2: Serializing arbitrary data models

The introductory paragraph says that the result of each
step should be another sequence.  Step 1 says "Replace an
empty sequence with a zero-length string."  The term
"empty string" is a poor choice of terminology.  A
"string" might be either a text node or an atomic value of
type xs:string.  Since text nodes
are not allowed to have length zero, you must mean an xs:string
value of length 0.  You could save your readers the trouble
of making these deductions (and the risk of missing the fact
that a text node cannot be empty) by simply saying
"Replace the empty sequence by an atomic value of type
xs:string and length 0".

- Steve B.



    
qt-2004Feb1047-01: ORA-SE-290-E: Title misuses the term "data models"
[editorial, acknowledged] 2004-06-13



SECTION 2: Serializing arbitrary data models

The use of the term "data models" in the title is incorrect.
A data model is an abstract specification of what values are
permissible within a system.  You are not talking about
serializing arbitrary abstract specifications.
What you mean here is "serializing arbitrary sequences".
This abuse of the term "data model" is persistent throughout
the entire specification.  It would be a good idea to scan
the entire specification for "data model" and use the proper
terminology for those so-called "data models" that are
really values.

- Steve B.



    

Steve,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Stephen Buxton wrote on 2004-02-18 05:22:56 PM:
> SECTION 2: Serializing arbitrary data models
> 
> The use of the term "data models" in the title is incorrect.
> A data model is an abstract specification of what values are
> permissible within a system.  You are not talking about 
> serializing arbitrary abstract specifications.
> What you mean here is "serializing arbitrary sequences".
> This abuse of the term "data model" is persistent throughout
> the entire specification.  It would be a good idea to scan
> the entire specification for "data model" and use the proper
> terminology for those so-called "data models" that are 
> really values.

     Thank you for pointing out this error.  The editors of the various 
XQuery and XSLT specifications decided that the appropriate term to use 
when speaking of a value is "instance of the data model", and that the 
term "data model" should be used only when speaking of that specification.

     I have applied this editorial change throughout the Serialization 
specification.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1047.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1196-01: [Serialization] MS-SER-LC1-003
[editorial, acknowledged] 2004-06-13
[Serialization] MS-SER-LC1-003, Michael Rys (2004-02-26)



Section 2
Editorial

Please rewrite "whether namespaces, are to be undeclared " as "whether
namespaces are to be undeclared ".

Fw: [Serialization] MS-SER-LC1-003, Henry Zongaro (2004-06-13)

Michael,

     In [2], I sent the following response to one of your comments:

Henry Zongaro/Toronto/IBM wrote on 2004-06-13 02:05:48 PM:
>      In [1], you submitted the following comment on the Last Call 
> Working Draft of XSLT 2.0 and XQuery 1.0 Serialization.
> 
> Michael Rys wrote on 2004-02-26 04:23:18 PM:
> > Section 2 
> > Editorial 
> > 
> > Please rewrite "whether namespaces, are to be undeclared " as "whether
> > namespaces are to be undeclared ".

>      I have corrected the typographical error that you pointed out. 
> Thank you for submitting this comment.

     In fact, in response to another comment, the description of the 
meaning of the undeclare-namespaces parameter appears only in the section 
on the xml output method.  I removed the text with the typographical error 
you pointed out.

     My apologies for any confusion.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1196.html
[2] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Jun/0062.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1199-01: [Serialization] MS-SER-LC1-004
[editorial, acknowledged] 2004-06-13
[Serialization] MS-SER-LC1-004, Michael Rys (2004-02-26)



Section 2
Editorial

"Serialization can be regarded as involving four phases of processing,
carried out sequentially as follows:" should add normalization step that
is mentioned earlier or make it clear that normalization has already
occurred.

Re: [Serialization] MS-SER-LC1-004, Henry Zongaro (2004-06-13)

Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Michael Rys wrote on 2004-02-26 04:23:21 PM:
> Section 2 
> Editorial 
> 
> "Serialization can be regarded as involving four phases of processing,
> carried out sequentially as follows:" should add normalization step that
> is mentioned earlier or make it clear that normalization has already
> occurred.

     I have applied an editorial change to indicate that the phases of 
serialization that this describes follow the normalization step.  Thank 
you for your comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1199.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1200-01: [Serialization] MS-SER-LC1-007
[editorial, announced] 2004-10-15
[Serialization] MS-SER-LC1-007, Michael Rys (2004-02-26)



Section 4
Editorial

Please insert a subsection title before current 4.1 to improve structure
of section.


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4 
Editorial 

Please insert a subsection title before current 4.1 to improve structure
of section.
>>

     Thank you for your comment, which I am handling editorially.

     I have applied the editorial change that you suggested to both the 
section entitled "XML Output Method" and the section entitled "HTML Output 
Method".  I would appreciate it if you could check the next draft of the
specification when it becomes available, and verify that I've applied the 
change to your satisfaction.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1200.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1201-01: [Serialization] MS-SER-LC1-008
[editorial, announced] 2004-10-15
[Serialization] MS-SER-LC1-008, Michael Rys (2004-02-26)



Section 4.2
Editorial

Add some more concrete examples for last paragraph. E.g., U+0007 in XML
1.0 or U+0000 in XML 1.1.


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4.2 
Editorial 

Add some more concrete examples for last paragraph. E.g., U+0007 in XML
1.0 or U+0000 in XML 1.1.
>>

     Thank you for your comment, which I am handling editorially.

     The section you refer to is entitled "XML Output Method: the encoding 
Parameter".  I decided instead to add an example describing the effect of 
control characters in the section on the version parameter, and I added an 
example describing the effect of using characters that cannot be 
represented in a particular encoding to this section on the encoding 
parameter.

     I would appreciate it if you could check the next draft of the
specification when it becomes available, and verify that this is an 
acceptable response to your comment.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1201.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
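
The two kinds of example discussed in this exchange can be sketched as follows. The ranges come from the Char productions of XML 1.0 and XML 1.1; the function names is_xml_char and encode_text are hypothetical, and replacing an unrepresentable character with a numeric character reference is shown as one possible behaviour, not as the wording of the specification.

```python
def is_xml_char(code_point, version="1.0"):
    """Report whether a code point matches the Char production of the given XML version."""
    if version == "1.0":
        return (code_point in (0x9, 0xA, 0xD)
                or 0x20 <= code_point <= 0xD7FF
                or 0xE000 <= code_point <= 0xFFFD
                or 0x10000 <= code_point <= 0x10FFFF)
    if version == "1.1":
        return (0x1 <= code_point <= 0xD7FF
                or 0xE000 <= code_point <= 0xFFFD
                or 0x10000 <= code_point <= 0x10FFFF)
    raise ValueError(f"unknown XML version: {version}")


def encode_text(text, encoding):
    """Encode text, replacing characters the encoding cannot represent
    with numeric character references (Python's xmlcharrefreplace handler)."""
    return text.encode(encoding, errors="xmlcharrefreplace")


print(is_xml_char(0x0007, "1.0"))   # False: U+0007 is not a Char in XML 1.0
print(is_xml_char(0x0007, "1.1"))   # True: allowed in XML 1.1 (as a character reference)
print(is_xml_char(0x0000, "1.1"))   # False: U+0000 is excluded in both versions
print(encode_text("caf\u00e9 \u20ac", "iso-8859-1"))   # b'caf\xe9 &#8364;'
```

The first function relates to the version parameter and the second to the encoding parameter, which matches the split between sections described in the reply above.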
qt-2004Feb1202-01: [Serialization] MS-SER-LC1-006
[editorial, raised] 2004-02-26
[Serialization] MS-SER-LC1-006, Michael Rys (2004-02-26)



Section 4
Editorial

Remove note about using XSD type annotation mechanisms. If this is
added, it should be added as a separate output method (see comment
MS-SER-LC1-012).

qt-2004Feb1203-01: [Serialization] MS-SER-LC1-010
[editorial, acknowledged] 2004-06-13
[Serialization] MS-SER-LC1-010, Michael Rys (2004-02-26)



Section 4.7
Editorial

Please reword "represented most accurately " as "represented accurately
".

Re: [Serialization] MS-SER-LC1-010, Henry Zongaro (2004-06-13)

Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

Michael Rys wrote on 2004-02-26 04:23:48 PM:
> Section 4.7 
> Editorial 
> 
> Please reword "represented most accurately " as "represented accurately
> ".

     I have applied the editorial change that you recommended.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1203.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Feb1206-01: [Serialization] MS-SER-LC1-011
[editorial, announced] 2004-10-15
[Serialization] MS-SER-LC1-011, Michael Rys (2004-02-26)



Section 4.7
Editorial

Please add the xml namespace node to the example, since that node is
always in-scope. Also consider representing the data model nodes without
using an XML serialized form.


Michael,

     In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization.

<<
Section 4.7 
Editorial 

Please add the xml namespace node to the example, since that node is
always in-scope. Also consider representing the data model nodes without
using an XML serialized form.
>>

     Thank you for this comment, which I am handling editorially.

     I have applied the editorial changes that you suggested.  I would 
appreciate it if you could check the next draft of the specification when it
becomes available, and verify that I've correctly applied the change.

Thanks,

Henry
[1] 
http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/1206.html
------------------------------------------------------------------
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
mailto:zongaro@ca.ibm.com
qt-2004Nov0025-05: [Serial] XHTML Serialization
[editorial, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  If the instance of the data model includes a head element in the XHTML
  namespace, and the include-content-type parameter has the value yes,
  the xhtml output method MUST add a meta element immediately after the
  start-tag of the head element specifying the character encoding
  actually used.
[...]

It is not clear what "immediately after" means here. I would expect that
the meta element is the first child of the head element, but in the
example it is the second node (preceded by a whitespace text node).

The example is non-conforming, as it lacks the trailing space before />.
Please change the example to conform to the specification.
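
The reading in which the meta element is the first child node of head can be sketched with Python's xml.etree.ElementTree. This is only an illustration of that reading, not the behaviour required by the specification; the function name add_content_type_meta and its defaults are assumptions.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def add_content_type_meta(head, media_type="text/html", encoding="utf-8"):
    """Insert a Content-Type meta element as the very first node of head."""
    meta = ET.Element(f"{{{XHTML_NS}}}meta", {
        "http-equiv": "Content-Type",
        "content": f"{media_type};charset={encoding}",
    })
    # ElementTree keeps text that precedes the first child in head.text.
    # Moving it behind the new element leaves nothing between the head
    # start-tag and the meta element.
    meta.tail = head.text
    head.text = None
    head.insert(0, meta)
    return meta


head = ET.fromstring(f'<head xmlns="{XHTML_NS}">\n  <title>t</title></head>')
add_content_type_meta(head)
print(head[0].tag)          # {http://www.w3.org/1999/xhtml}meta
print(repr(head[0].tail))   # '\n  '
```

Calling head.insert(0, meta) without moving head.text would make meta the first child element but leave the whitespace text node in front of it, which is the situation the comment describes in the example.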

qt-2004Nov0025-06: [Serial] XHTML Serialization
[editorial, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

[...]
  The content type SHOULD be set to the value given for the media-type
  parameter; the default value for XHTML is text/html. The value
  application/xhtml+xml, registered in [RFC3236], MAY also be used.
[...]

It is not clear to me what you mean here. Only two behaviors make sense
to me: either

  * the value is always text/html
  * the value is always the value given
    for media-type defaulting to text/html

It is never acceptable to use application/xhtml+xml unless explicitly
requested (among other things, some user agents fail to recognize the
charset parameter if the type is not text/html). Please change the text
to state that the value must be set to the value given for the
media-type or text/html if no media-type is specified.

qt-2004Nov0025-08: [Serial] XHTML Serialization
[editorial, raised] 2004-11-08
[Serial] XHTML Serialization, Bjoern Hoehrmann (2004-11-08)

Dear XSL Working Group,
Dear XML Query Working Group,

Comment on section 6 of the Serialization spec:

Please add a note that this process removes possible parameters in the
attribute value, i.e. that

  <meta http-equiv="Content-Type"
        content="text/html;version='3.0'" />

in the data model instance would be replaced by e.g.

  <meta http-equiv="Content-Type"
        content="text/html;charset=utf-8" />
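
The effect described in this comment can be made concrete with a small sketch. It assumes the replacement value is built only from the media type and the encoding; the helper names replacement_content and parameters_lost are hypothetical.

```python
def replacement_content(encoding, media_type="text/html"):
    """Build the content value a serializer might write for the meta element.
    Assumption: only the media type and the charset parameter are emitted."""
    return f"{media_type};charset={encoding}"


def parameters_lost(old_content, new_content):
    """Return the parameter names present in old_content but absent from new_content."""
    def names(content):
        parts = content.split(";")[1:]   # everything after the media type
        return {p.split("=", 1)[0].strip() for p in parts if p.strip()}
    return names(old_content) - names(new_content)


old = "text/html;version='3.0'"
new = replacement_content("utf-8")
print(new)                         # text/html;charset=utf-8
print(parameters_lost(old, new))   # {'version'}
```

A note of the kind requested would tell readers that parameters such as version in the original attribute value do not survive this replacement.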

qt-2004Nov0037-01: [Serial] Normalization and References
[editorial, raised] 2004-11-10
[Serial] Normalization and References, Bjoern Hoehrmann (2004-11-10)

Dear XSL Working Group,
Dear XML Query Working Group,

  In Section 4 item 2 "Character expansion" it is not clear in which
order these are to be processed. The prose states the list is "in
priority order" but the list is unordered. If the list is meant to be
ordered, please use an ordered list in the markup. Specifically, please
clarify whether URI escaping is affected by the normalization-form
parameter, and please add a note on how far escape-uri-attributes
is consistent with the latest IRI Internet Draft. I further note that
the reference for the normalization forms is

  http://www.w3.org/TR/2002/WD-charmod-20020430/

which is quite outdated; the latest version no longer covers this
subject.
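
Why the processing order matters for URI escaping can be seen in a short sketch. It does not state what the specification requires; it only shows that percent-encoding a string before or after Unicode normalization gives different results. The function names are hypothetical, and NFC is chosen purely as an example normalization form.

```python
import unicodedata
from urllib.parse import quote


def escape_only(text):
    """Percent-encode the UTF-8 bytes of the text as given."""
    return quote(text)


def normalize_then_escape(text, form="NFC"):
    """Normalize first, then percent-encode the UTF-8 bytes."""
    return quote(unicodedata.normalize(form, text))


# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
decomposed = "re\u0301sume\u0301"

print(escape_only(decomposed))             # re%CC%81sume%CC%81
print(normalize_then_escape(decomposed))   # r%C3%A9sum%C3%A9
```

Once the string has been percent-encoded, a later normalization step sees only ASCII characters and can no longer change it, so the two orders are not interchangeable.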
qt-2004Nov0037-02: [Serial] Normalization and References
[editorial, raised] 2004-11-10
[Serial] Normalization and References, Bjoern Hoehrmann (2004-11-10)

I also note that the reference [Unicode Normalization] appears
in the References list but is not actually used in the document. It
would seem you want to reference UAX 15 rather than charmod for the
normative definition of the normalization forms. Please review the
references section for other outdated and/or unused references and
change the section to comply with the recommendations for references
sections outlined in <http://www.w3.org/2001/06/manual/>.
http://esw.w3.org/topic/SiteTools lists several tools that might help
here, for example <http://www.w3.org/2004/07/references-checker-ui>.

qt-2004Dec0001-01: [Serial] serialization of xhtml + omit-xml-declaration
[editorial, raised] 2004-12-07

hello.

working with saxon, i discovered that the omit-xml-declaration value is
ignored if i use any other xml encoding than UTF-8. from the xml point
of view, this makes sense. however, when using xhtml documents as
fragments that are assembled with a non xml-aware server-side mechanism
such as php, this is a problem because unless i use UTF-8, i end up with
xml declarations in the middle of my assembled web page, which makes it
invalid.

http://www.w3.org/TR/xslt-xquery-serialization/#xhtml-output claims that
the omit-xml-declaration should be observed, but does so only in a
note, which leaves me wondering whether saxon is incorrect or just
interpreting the spec differently. i would suggest to clarify this
issue, and i would also suggest to require that the xml declaration is
not output, regardless of the encoding.


> 
> http://www.w3.org/TR/xslt-xquery-serialization/#N10E63 probably is the
> relevant part. as i understand it, it only specifies when a declaration
> MUST be output, but not when it MUST NOT be output. or am i reading it
> wrong or lacking some context?

I think you're right, the specification forgets to say that an XML
declaration must not be output if the omit-xml-declaration parameter has the
value "yes". Sometimes one forgets to say the obvious.

Michael Kay
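
The behaviour requested in this thread, where the declaration is suppressed whenever omit-xml-declaration is "yes" and regardless of the encoding, can be sketched as follows. The functions xml_declaration and serialize_fragment are hypothetical and model only this one rule; they are not the algorithm of any particular processor.

```python
def xml_declaration(encoding, omit_xml_declaration=False, version="1.0"):
    """Return the XML declaration to emit, or an empty string when it is omitted."""
    if omit_xml_declaration:
        # The encoding plays no part in this decision.
        return ""
    return f'<?xml version="{version}" encoding="{encoding}"?>'


def serialize_fragment(markup, encoding, omit_xml_declaration=False):
    """Prefix the markup with the declaration (if any) and encode the result."""
    text = xml_declaration(encoding, omit_xml_declaration) + markup
    return text.encode(encoding, errors="xmlcharrefreplace")


print(serialize_fragment("<p>caf\u00e9</p>", "iso-8859-1", omit_xml_declaration=True))
# b'<p>caf\xe9</p>'
```

With this rule, fragments written in ISO-8859-1 can be concatenated by a non-XML-aware mechanism without stray declarations appearing in the middle of the assembled page. The consumer must then learn the encoding by some other means.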