This document summarizes the comments on XML Schema received during its Candidate-Recommendation comment period which appear to need tracking and formal responses from the XML Schema WG. In its current form it has been prepared by Michael Sperberg-McQueen.
In this version (7 March 2001) the items raised during the final inspection of the spec have been added as CR-64 and following.
The process by which the XML Schema WG plans to handle these issues is described at http://www.w3.org/2000/11/03-xmlschema-crprocess.html.
Material reproduced from comments has been marked up, and obvious typos have been corrected. Postings and documents which raise several substantively distinct points have been silently divided among several comments. To read the original postings, consult the archive of the www-xml-schema-comments list.
Commentators are requested to consult the entries for the comments they have made, and to check to make sure we have correctly understood and paraphrased their comment. (Note that in a few cases the paraphrase may pose a slightly broader question than the commentator appears to have had in mind.)
In addition to the postings to the XML Schema comments list, some postings to the XML Schema Interest-Group mailing list have been included here; this list is W3C-internal and only those with member access to the W3C web site will be able to follow the relevant hyperlinks. Where we have received permission to quote the original posting in this public document, we have done so; in other cases, a paraphrase enclosed in square brackets has been supplied. Links to member-only material included in postings to the public list have been left intact in the interests of completeness (for those who do have member access) and simplicity (for those maintaining this document).
Not all postings to the comments mailing list have resulted in entries in this list. Some postings continue discussion of issues raised during last call and resolved before the Candidate Recommendation was published; others call attention to typographic errors and require attention from the editors but not from the WG as a whole; others comment on the draft, or ask for clarification, but do not appear to raise substantive decidable questions. And of course some postings are wholly off topic.
Some postings which do raise decidable substantive questions have nevertheless not been included in this list, for various reasons. In some cases the issues involved have already been discussed and decided by the WG, the postings appear to provide no new information, and the chairs do not believe it is a useful expenditure of resources for the WG to reconsider the issues in question. In other cases, the issues involved go beyond the power of the XML Schema WG to resolve them. In particular, the following questions have not been included as requiring tracking by the WG:
WG members or others who believe that the points raised in these notes ought to be tracked using the XML Schema WG's formal issue-tracking mechanism should contact the WG chairs and tell them which points need tracking, and why.
Num | Cl | Cluster | Status | Originator | Responsible | Description |
---|---|---|---|---|---|---|
CR-1 | dd | datahead | ok | XML Schema WG | | xsi:null |
CR-2 | d | usability | ok | XML Schema WG | | Setting schema-level defaults |
CR-3 | s | datahead | ok | XML Schema WG | | Local element declarations |
CR-4 | c | usability | nok | XML Schema WG | | Order of content model and attribute declarations |
CR-5 | s | composition | ok | XML Schema WG | | Multiple inclusions/imports/redefinitions |
CR-6 | dd | composition | ok | XML Schema WG | | Redefine |
CR-7 | s | numerics | ok | XML Schema WG | | Minimal level of support for decimals digits |
CR-8 | dd | datetime | ok | XML Schema WG | | Order on timeDuration |
CR-9 | dd | datetime | nok | XML Schema WG | | Interoperability of date/time types |
CR-10 | d | misc-uri | ok | XML Schema WG | | Non-URI characters and URI references |
CR-11 | s | regex | ok | XML Schema WG | | Regular expression and diacritics |
CR-12 | s | regex | ok | XML Schema WG | | Whitespace in regular expressions |
CR-13 | z | qnames | resolved | James Clark | Lee Buck | Interpretation of QNames without prefixes |
CR-14 | s | declarables | ok | James Clark | MSM | Value space of entities, notations, IDREFs (and QNames) |
CR-15 | s | refs | ok | James Clark | Priscilla Walmsley | Value space of ID |
CR-16 | s | declarables | ok | James Clark | David Cleary | Entity-declared constraint |
CR-17 | d | declarables | nok | James Clark | Alex Milowski | Remove notation type? |
CR-18 | z | refs | nok | Eric van der Vlist | Aki Yoshida | Validating XPointer IDREFS? |
CR-19 | s | regex | resolved | Uwe Plonus | Chuck Campbell | Fix regex grammar to outlaw ambiguous regexes? |
CR-20 | d | complextypes | nok | James Clark | Ashok Malhotra | Make wildcard namespace selection more powerful? |
CR-21 | z? | xsi:type | ok | Asir Vedamuthu | Martin Gudgin | Drop use of xsi:type as determinant for unions? |
CR-22 | z | numerics | ok | Mike Cowlishaw | Henry S. Thompson | Allow negative scale? |
CR-23 | z? | numerics | ok | Mike Cowlishaw | Dan Fox | Allow exponential notation for decimals? |
CR-24 | c | simpletypes | ok | Asir Vedamuthu | Allen Brown | Should some types be changed from derived to primitive? |
CR-25 | c | strings | ok | XML Schema WG | Noah Mendelsohn | Should the name CDATA be changed? |
CR-26 | z? | complextypes | ok | XML Schema WG | Bob Streich | Allow fuller constraints on mixed content? |
CR-27 | z | usability | ok | XML Schema WG | | Should the default value for the schema-level elementFormDefault attribute be changed? |
CR-28 | s | datetime | resolved | Graham Ross | Peter Chen | Should canonical forms of date values be allowed to include timezone indications? |
CR-29 | z | components | ok | XML Schema WG | | Should the transfer syntax and the components have identical expressive power? |
CR-30 | c | complextypes | ok | XML Schema WG | | Should optional ALL groups be allowed? |
CR-31 | d | complextypes | silent | I18n WG | Asir Vedamuthu | Should xml:lang be allowed by default in all complex types? |
CR-32 | u | complextypes | ok | Morris Matsa | Lee Buck | Change named model groups? |
CR-33 | s | regex | ok | James Clark | MSM | Add a formal grammar for regular expressions? |
CR-34 | d | complextypes | ok | Asir Vedamuthu | Priscilla Walmsley | Clarify definition of restriction and extension of wildcard attributes? |
CR-35 | s | simpletypes | ok | Asir Vedamuthu | David Cleary | Allow simple types to block further derivation? |
CR-36 | s | simpletypes | ok | Asir Vedamuthu | Alex Milowski | Clarify meaning and processing of repeated facets? |
CR-37 | z | dtd | silent | Curt Arnold | Aki Yoshida | Change the namespace defaults in DTD for Schemas? |
CR-38 | s | composition | resolved | Roberto Galnares | Chuck Campbell | Clarify rules on importing the target namespace? |
CR-39 | s | simpletypes | ok | Bob Schloss | Ashok Malhotra | Allow id attributes on facet elements? |
CR-40 | c | dtd | ok | Curt Arnold | Martin Gudgin | Change default value of block? |
CR-41 | s | qnames | ok | Asir Vedamuthu | Henry S. Thompson | Prefixes for qualified attributes in PSVI? |
CR-42 | z | numerics | nok | Mike Cowlishaw | Dan Fox | Change canonical representation for decimal? |
CR-43 | s | complextypes | silent | Alexander Falk | Allen Brown | Should cyclic substitution groups be forbidden explicitly? |
CR-44 | z | xsi:type | ok | Alexander Falk | Noah Mendelsohn | Drop xsi:type? |
CR-45 | z | misc-agg | resolved | Mike McCaleb | Bob Streich | Distinguish arrays, lists, sets, bags? |
CR-46 | z | refs | resolved | Mike McCaleb | Peter Chen | Allow keyref to model IDREFS? |
CR-47 | z | simpletypes | silent | Bob Schloss, Mike McCaleb | Asir Vedamuthu | Allow simple types to be final? abstract? |
CR-48 | s | simpletypes | ok | Martin Gudgin | Lee Buck | Change canonical form of URIs for builtin datatypes and facets? |
CR-49 | d | refs | nok | Andy Clark | Jim Trezzo | Use subset not full Xpath for keys and keyrefs? |
CR-50 | sd | refs | silent | MPEG 7 | Priscilla Walmsley | How do IDREF and union types interact? |
CR-51 | s | refs | silent | Noah Mendelsohn | David Cleary | Does IDREF validation contradict our validation story? |
CR-52 | s | refs | ok | Noah Mendelsohn | Alex Milowski | Do defaulted values participate in identity-constraint checking? |
CR-53 | sd | misc-bin | ok | James Clark | Aki Yoshida | Add built-in base64Binary and hexBinary types? |
CR-54 | dd | refs | silent | Allen Brown | Chuck Campbell | Remove identity constraints? |
CR-55 | sd | misc-boole | silent | Allen Brown | Ashok Malhotra | Restore 0 and 1 as Boolean values? |
CR-56 | sd | unassigned | ok | Murata Makoto | Martin Gudgin | Specify text/xml or application/xml? |
CR-57 | | unassigned | ok | XML Schema WG | | (X-1) Ensure that out-of-band attributes in the schema are reflected into the schema components? |
CR-58 | | unassigned | ok | XML Schema WG | | (X-2) Provide declarations for xsi attributes? |
CR-59 | | unassigned | ok | XML Schema WG | | (X-3) Specify what happens when the input to a schema processor is a PSVI? |
CR-60 | | unassigned | ok | XML Schema WG | | (X-4) Clarify when two local elements with the same name have the same type? |
CR-61 | | unassigned | ok | XML Schema WG | | (X-5) Use extensional definition of complex-type restriction? |
CR-62 | | unassigned | ok | XML Schema WG | | (X-6) Wildcards and substitution groups? |
CR-63 | | unassigned | ok | WG | | Canonical form of Boolean? |
CR-64 | | unassigned | resolved | WG | | <anyAttribute> and ambiguity |
CR-65 | | unassigned | ok | WG | | Qnames and their value spaces |
CR-66 | | unassigned | ok | WG | | Eating our own cooking (fixed values) |
CR-67 | | unassigned | ok | Noah Mendelsohn | | Type for minOccurs and maxOccurs? |
CR-68 | | unassigned | ok | WG | | Multiple uses of ID type |
CR-69 | | unassigned | ok | WG | | Chameleon include, redefine, etc. |
CR-70 | | unassigned | resolved | WG | | Name of URI type |
CR-71 | | unassigned | ok | XPathTF | | From XPath task force |
Does the xsi:null feature provide useful functionality? Do you have requirements to support null values in the area in which you expect your software to be deployed? If so, does the xsi:null feature provide what you need? Would you have a problem if XML Schema provided no mechanism for document authors / data sources to provide explicit null values?
Does the xsi:null feature create any implementation problems for you?
Input from William Jamieson:
Input from Alexander Falk:
[Item 4 in the list.]
Input from Allen Brown:
[member-confidential]
Input from David Beech:
Input from Andrew Layman:
Input from Andrew Layman:
Chairs propose to discuss and decide. Proposals: keep as is, remove, rename.
The WG discussed this issue in its meeting of January 2001 in London.
The extensive discussion prior to the meeting and during the meeting won't be repeated here. The arguments on each side of the issue were summarized by Jonathan Robie and David Beech.
When the discussion was concluded, there was no consensus in favor of removing xsi:null. There was consensus in favor of renaming it, and the WG agreed to do so. A variety of names were proposed, discussed, considered overnight, and discussed again on the second day of the meeting.
After lengthy discussion and several ballots, it became clear that the names xsi:nil and nillable were acceptable to a larger percentage of the WG than any other names proposed. RESOLVED without dissent: to change xsi:null to xsi:nil and nullable to nillable.
Since this issue was raised by the WG itself, there is no commentator to notify.
A number of the attributes on the <schema> element provide defaults for attributes on subordinate elements. To wit:
This allows setting values for attributes we judge likely to have the same value across a whole schema document in only one place. It does constitute a kind of minimisation, and does not provide any new semantics. Is it a good thing, or not a good thing, to allow the default value to vary from schema to schema? Is this a good way to set such schema-level defaults? These default-setting attributes themselves have default values; are they the correct values?
Input from Alexander Falk:
[Item 7 of list.]
Chairs propose to discuss elementFormDefault briefly and vote on changing default to qualified, removing default, or retaining default of unqualified. Retain other defaults as is.
At the face to face meeting in London, January 2001, the WG discussed this issue. Since the only feedback received had related to elementFormDefault, the discussion was divided.
RESOLVED: to retain the schema-level defaults other than elementFormDefault, and to make no change to their default values.
On elementFormDefault, some WG members preferred to change its default value from unqualified to qualified. Some vendors said that the largest single cause of user error in definition of schemas was this default value: most users prefer qualified local elements. Others agreed that the original choice of default value for this schema-level default had been an error. On the other side, some WG members said that in their experience the current default was pedagogically useful, as it forced learners of the language to think clearly about what they do want. Some argued that although qualified would have been a better default, it was now important for the stability of the spec to retain the current default. Other WG members continued to argue that local elements ought, by default, to be unqualified, and that a change would be a bad idea in itself.
RESOLVED: to make no change to the default value for elementFormDefault. Dissenting: Calico, Software AG, Sun, TIBCO Extensibility, webMethods. Abstaining: Commerce One, Edinburgh, Holstege, HP, MITRE.
Since this issue was raised by the WG itself, there is no commentator to notify.
The provision of local element declarations is in part intended to simplify mapping between programming language and database structures where locally scoped name-type bindings are commonplace. It is a departure from XML 1.0 DTDs, in which the name-type binding for elements (but not for attributes) is constant across a document. Is it a good thing or not a good thing to provide local element declarations? Does it in fact simplify mappings as intended?
At its face to face meeting of January 2001 in London, the WG voted to affirm the current design of this feature. No contrary feedback had been received, and experience shows that local element declarations are useful in the expected cases.
Since this issue was raised by the WG itself, there is no commentator to notify.
Complex types which have both a content model and attributes must have them in that order. Is it a good idea to require a fixed order? (It is sometimes suggested that a fixed order simplifies implementations. Is this true or false, in your case?) If a fixed order is required, should it be this one (content model then attributes), or should it be the other way round (attributes then content model)?
At its face to face meeting in November 2000, the WG voted not to change the CR draft in this respect. Dissenting: Calico, IBM, Intel by proxy, HL7, webMethods.
Sections 6.2.1, 6.2.2, and 6.2.3 each end with a note about multiple inclusion/redefinition/importing of other schema documents. The space of possibilities here, particularly once nesting is considered, is very large: we solicit feedback on ease of implementation, and any interoperability issues which arise.
Chairs propose to make no change.
At its face to face meeting of January 2001 in London, the WG noted that no adverse feedback had been received, and voted to affirm the current design of this feature.
Since this issue was raised by the WG itself, there is no commentator to notify.
In an effort to provide some support for evolution and versioning, it is possible to incorporate components corresponding to a schema document with modifications. The modifications have a pervasive impact, that is, only the redefined components are used, even when referenced from other incorporated components, whether redefined themselves or not.
This facility is very powerful, perhaps too powerful. Reports of implementation experience, in terms of usability for particular purposes, of the constraints on redefinition imposed in the spec and of implementation difficulty, would be very welcome.
At its meeting in November 2000, the WG noted that some users would like to have a way to block redefinition of components.
Input from Asir Vedamuthu:
Input from Norm Walsh:
Input from Asir Vedamuthu:
Input from Henry Thompson:
Input from Rick Jelliffe:
Input from Bob Schloss:
Chairs propose to discuss and decide. Proposals: Proposal to remove the feature. Norm Walsh and Henry Thompson's proposal to allow restriction and extension on content model groups and attribute groups. Also Asir Vedamuthu's proposal to introduce a mechanism to allow schema authors to block redefinition of types and groups.
At the face to face meeting in London, January 2001, the WG discussed this issue.
RESOLVED: not to remove or change the feature. Dissenting: Altova, Commerce One, Microsoft, Progress, Sun, webMethods. Abstaining: Software AG.
Some WG members dislike this feature, and would be happy to place some major restriction on it, but there was no concrete proposal. The majority in the WG felt that the feature was essential for modularization of XML-based languages (XHTML modularization is an outstanding example; without redefine, those involved in the XHTML modularization work have repeatedly doubted that XML Schema can handle their task at all). It is also a key part of our response to the I18n WG's request to make it easier to take schemas written without xml:lang and similar attributes on the appropriate types, and add them.
The WG discussed whether to introduce a mechanism to allow schema authors to block redefinition of types and groups, but there was no consensus in favor of the change at this point (vote of 12 to 11).
Since this issue was raised by the WG itself, there is no commentator to notify.
The I18n WG wrote 8 March 2001 (member-only link) saying (among other things) that they agree with XML Schema that the redefine facility is an acceptable solution to their request for an easy method of adding attributes (and, to the degree possible, sub-elements) to existing declarations; their final agreement is conditional upon their checking the final text of the description of the string type and the Primer, and the use of the xml:lang attribute on natural-language prose in all examples in the spec.
All minimally conforming processors must support decimal numbers with a minimum of 18 decimal digits (i.e., with a precision of 18). However, minimally conforming processors may set an application-defined limit on the maximum number of decimal digits they are prepared to support, in which case that application-defined maximum number must be clearly documented.
As in all such cases, the minimum number of decimal digits that all minimally conforming processors must support is too small for some applications and, perhaps, too large for others. We welcome further input from implementors whether the minimum value of 18 is acceptable.
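As a rough illustration of the rule above, the following Python sketch checks whether a decimal literal fits within a processor's documented digit limit. The names MIN_REQUIRED_DIGITS, count_digits, and accepts are hypothetical, and counting digits through Python's decimal module is an assumption of this sketch about how "decimal digits" are measured, not something the text above prescribes.

```python
from decimal import Decimal

# Every minimally conforming processor must handle at least this many digits.
MIN_REQUIRED_DIGITS = 18


def count_digits(literal: str) -> int:
    """Count the decimal digits in the coefficient of a decimal literal."""
    return len(Decimal(literal).as_tuple().digits)


def accepts(literal: str, processor_limit: int = MIN_REQUIRED_DIGITS) -> bool:
    """Return True if a processor with the given documented limit can hold the literal."""
    if processor_limit < MIN_REQUIRED_DIGITS:
        raise ValueError("a conforming processor must support at least 18 digits")
    return count_digits(literal) <= processor_limit
```

Under these assumptions, a processor that documents a limit of 30 digits would accept a 19-digit literal, while one that supports only the required minimum would not.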
Input from William Jamieson:
Input from Charles Gordon (via Ashok Malhotra):
Input from Charles Gordon:
At its face to face meeting of January 2001 in London, the WG voted to affirm the current design of this feature.
Since this issue was raised by the WG itself, and all commentators ultimately expressed their satisfaction with the status quo, there is no commentator to notify.
Note that the order-relation on timeDuration is defined for some pairs of durations but not for all pairs. In such cases the order relation is said to be indeterminate. For example, while P1M25D > P50D and P1M10D < P50D, the order relation between P1M20D and P50D is indeterminate.
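The partial order described above can be sketched in Python as follows. This is a simplified illustration, not the spec's definition: it assumes durations made only of months and days, and it assumes every month lasts between 28 and 31 days. The names Order, day_bounds, and compare are hypothetical.

```python
from enum import Enum


class Order(Enum):
    LESS = "<"
    GREATER = ">"
    EQUAL = "="
    INDETERMINATE = "?"


def day_bounds(months: int, days: int) -> tuple[int, int]:
    """Shortest and longest possible length in days, assuming 28- to 31-day months."""
    return (28 * months + days, 31 * months + days)


def compare(a: tuple[int, int], b: tuple[int, int]) -> Order:
    """Compare two (months, days) durations under the partial order."""
    if a == b:
        return Order.EQUAL
    a_min, a_max = day_bounds(*a)
    b_min, b_max = day_bounds(*b)
    if a_max < b_min:   # a is shorter however long its months turn out to be
        return Order.LESS
    if a_min > b_max:   # a is longer however short its months turn out to be
        return Order.GREATER
    return Order.INDETERMINATE  # the possible lengths overlap
```

With these assumptions, compare((1, 25), (0, 50)) yields Order.GREATER, compare((1, 10), (0, 50)) yields Order.LESS, and compare((1, 20), (0, 50)) yields Order.INDETERMINATE, because one month plus 20 days may last anywhere from 48 to 51 days.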
The complexity of real world durations of time introduces difficulties into any design that attempts to support them. The XML Schema Working Group acknowledges the undesirability of an order-relation that specifies a partial (as opposed to a total) order; however, it has found no other solution that garnered consensus. Therefore, the XML Schema Working Group welcomes feedback from implementors and schema authors on alternative designs. Specifically, we are interested in knowing whether timeDuration needs to be ordered at all and in hearing how other implemented systems which provide a total order for durations of time have defined that total order.
Input from Jeff Lowery:
Input from William Jamieson:
Chairs propose to discuss and adopt the datatypes editors' proposed changes.
At its London meeting in January 2001, the WG discussed a proposal from the editors for a restructuring of the date and time types, which provides a precise account of when time duration comparisons are determinate, and when not, and stipulates that when the comparison of a value with an inclusive minimum or maximum is indeterminate, the value should be legal, while such a value is illegal if the minimum or maximum is exclusive.
An amendment (stemming from the previous day's joint coordination meeting with members of the XML Query WG) was proposed, to make the value illegal in all indeterminate cases (so that a value is legal if its comparison with the min and max values is determinate and true, and otherwise illegal). The amendment was adopted.
The WG accepted the proposal as amended.
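The difference between the editors' original rule and the adopted amendment can be sketched in Python. The function names are hypothetical, and representing an indeterminate comparison as None is an assumption made only for this illustration.

```python
from typing import Optional

# Outcome of testing a value against a minimum or maximum bound:
# True, False, or None when the order relation is indeterminate.
Comparison = Optional[bool]


def legal_original(satisfied: Comparison, inclusive: bool) -> bool:
    """Editors' original proposal: an indeterminate result passes inclusive bounds only."""
    if satisfied is None:
        return inclusive
    return satisfied


def legal_amended(satisfied: Comparison) -> bool:
    """Adopted amendment: legal only when the comparison is determinate and true."""
    return satisfied is True
```

Under the amendment the inclusive/exclusive distinction no longer matters for indeterminate cases, which is why legal_amended takes no second parameter.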
The I18n WG wrote 8 March 2001 (member-only link) saying (among other things) that they have no problems with the order relation.
While recurringDuration is capable of serving as the base type of datatypes used in many different date and time related applications beyond those supplied by its use as the base type of the built-in datatypes derived from it, recurringDuration is not intended as a general-purpose solution to calendaring and scheduling applications.
The XML Schema Working Group is particularly interested in feedback from implementors and schema authors as to how timeDuration, recurringDuration and the other date and time related datatypes derived from recurringDuration interoperate with other date and time related systems.
Input from Graham Ross:
Input from William Jamieson:
Chairs propose to discuss and accept datatypes editors' proposed changes.
At its London meeting in January 2001, the WG discussed a proposal from the editors for a restructuring of the date and time types, which removes some types from the CR draft and modifies some others in ways which do not affect implementation experience.
Several amendments stemming from discussions with Mark Davis, Martin Duerst, James Clark, and others were discussed.
The WG accepted the proposal as amended.
The I18n WG wrote 8 March 2001 (member-only link) saying (among other things) that they have no problems with any of the changes to the date/time type, but that they dissent from the decision to include the types day, month, year, monthDay, and yearMonth.
URI References require certain ASCII characters and all non-ASCII characters be hex encoded, sometimes called URI-escaping (see Section 2 of [RFC 2396], as amended by Section 3 of [RFC 2732]). Therefore, schema authors need to exercise caution in the use of uriReference. Specifically, schema authors should avoid uriReference in cases where literals should be allowed to directly contain characters that [RFC 2396], as amended by [RFC 2732], require to be hex encoded.
There is ongoing discussion about how to treat URI References that might contain non-ASCII characters. It is extremely important that all W3C specifications that deal with such URI References (at least this specification, [Character Model], [XML 1.0 Recommendation (Second Edition)] and [XPointer], probably others) be aligned; however, it is not clear how best to achieve that alignment with this specification. In addition to the current design, where both the lexical space and value space of uriReference are considered to be hex encoded, there are at least 3 alternative designs that could be considered: 1) have 2 types, the current type and another type (not strictly speaking, a URI Reference) where both the lexical space and value space were allowed to contain non-ASCII characters; 2) a single type whose lexical space is allowed to contain non-ASCII characters, but whose value space was the set of hex-encoded literals; 3) a single type whose lexical space was hex-encoded, but whose value space was allowed to contain non-ASCII characters (i.e., the set of hex-decoded literals). The XML Schema Working Group welcomes feedback from implementors and schema authors on how to further harmonize the affected specifications; in particular, we seek advice on which of the above alternatives (or some other alternative not yet considered) is most desirable. Changes resulting from such further harmonization might result in additional changes to the XML Schema Language in cases where uriReference is used (e.g., xsi:schemaLocation in [XML Schema Part 1: Structures]).
Chairs propose to define two types: URIreference is what the current IETF draft calls an internationalized resource identifier; hexencodedURIreference is the hex-encoded form of a URIreference as described in RFC 2396 and 2732. The spec should refer to the algorithm for converting from the former to the latter in the XLink spec and the i18n character-model draft.
At the face to face meeting of January 2001 in London, the WG considered this issue. The majority of the WG saw no need for the more restricted type, and so the resolution proposed by the chairs was modified.
RESOLVED: to define a single type corresponding to the IRI / internationalized URI, and to refer to an algorithm for transforming values of this type into hex-encoded strings which match the description in RFC 2396. Dissenting: Developmentor, DLIS, Edinburgh, HL7, Holstege, MITRE. Abstaining: Lotus.
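The kind of transformation the resolution refers to can be approximated in Python with the standard library's urllib.parse.quote. This is only a sketch: the choice of UTF-8, the set of characters left unescaped, and the name hex_encode are assumptions of the example, and the authoritative algorithm is the one in the specifications the WG cites.

```python
from urllib.parse import quote

# Characters left untouched: URI delimiters plus "%", so that escapes
# already present are not escaped a second time (an assumption of this sketch).
SAFE = ":/?#[]@!$&'()*+,;=%"


def hex_encode(iri: str) -> str:
    """Percent-encode the UTF-8 bytes of every character outside the safe set."""
    return quote(iri, safe=SAFE)
```

For instance, hex_encode("http://example.org/résumé") returns "http://example.org/r%C3%A9sum%C3%A9", while a reference that contains only ASCII URI characters is returned unchanged.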
At its face to face meeting in Cambridge in March 2001, the WG considered this topic again, in the light of continuing discussion over what the name of this type ought to be. Various commentators had objected to uriReference as being confusing; many WG members felt that IRI and other candidates were equally confusing. Various possibilities were canvassed, and the preponderance of sentiment favored anyURI.
RESOLVED: to name the type for (internationalized) URI references anyURI.
The I18n WG wrote 8 March 2001 (member-only link) saying (among other things) that they are satisfied with the internationalization of URIs, subject to their checking of the final text.
The regular expression language defined here does not attempt to provide a general solution to "regular expressions" over [Unicode3] strings. In particular, it does not easily provide for matching sequences of base characters and combining marks. The language is targeted at support of "Level 1" features as defined in [Unicode Regular Expression Guidelines]. It is hoped that future versions of this specification will provide support for "Level 2" features.
At its face to face meeting of January 2001 in London, the WG noted that no feedback had been received on this issue. We concluded that changes in this area are not an urgent matter for the user community, and voted to affirm the current design of this feature.
Since this issue was raised by the WG itself, there is no commentator to notify.
Future versions of this specification might allow non-significant white space embedded within a regular expression. The XML Schema Working Group welcomes feedback from implementors and schema authors on the advisability of including such white space.
At its face to face meeting of January 2001 in London, the WG noted that no feedback had been received on this issue. We concluded that changes in this area are not an urgent matter for the user community, and voted to affirm the current design of this feature.
Since this issue was raised by the WG itself, there is no commentator to notify.
XML Schema may need a facet for QName-based types which specifies whether unprefixed values of the type have the default namespace name (as with element-type names), or a null namespace name (as with attribute names, and various QName-based types in XSLT).
Input from James Clark:
The Namespaces Rec specifies two ways in which QNames can be interpreted, which differ in how a QName without a prefix is expanded:
A. For element type names, a prefix-less QName is expanded to have the namespace name of the default namespace if there is one, and a null namespace name otherwise
B. For attribute names, a prefix-less QName is always expanded to have a null namespace name.
Applications may differ in which treatment they use. For example, XSLT uses B (for the names of attribute-sets, keys, modes, variables, parameters).
I didn't notice anything in Part 2 which says whether A or B is to be used. I suspect that you need a facet to control this.
Input from Henry Thompson:
Part 1 makes it clear that we take approach (A). I have to say I find the fact that XSLT has taken (B) to be the single most inconvenient aspect of the language, and we've had no other requests to support this behaviour.
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to reply with a polite negative. It is true that our spec specifies that unprefixed names are in the default namespace, instead of following XSL in specifying that they are unqualified. We believe that this is the correct approach.
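The two interpretations discussed in this issue, labelled A and B in James Clark's comment, can be contrasted with a small Python sketch. The function expand_qname and its use_default flag are hypothetical, and the in-scope namespace bindings are modelled as a plain dictionary in which the key None stands for the default namespace.

```python
from typing import Optional

# prefix -> namespace name; the key None holds the default namespace, if any.
Namespaces = dict[Optional[str], str]


def expand_qname(
    qname: str, in_scope: Namespaces, use_default: bool
) -> tuple[Optional[str], str]:
    """Expand a lexical QName to a (namespace name, local name) pair.

    use_default=True follows interpretation A (element-style);
    use_default=False follows interpretation B (attribute-style).
    """
    prefix, sep, local = qname.partition(":")
    if sep:  # prefixed names: both interpretations agree
        return (in_scope[prefix], local)
    namespace = in_scope.get(None) if use_default else None
    return (namespace, qname)
```

With a default namespace of "http://example.org/default" in scope, the unprefixed name "item" expands to ("http://example.org/default", "item") under A and to (None, "item") under B; XML Schema takes approach A.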
Should the value space of entities, notations, and IDREFs be changed to parallel more closely the treatment of QNames?
In all cases, the lexical space of the type contains names which are interpreted relative to some set of other constructs in the document (declarations, or elements with IDs). In the case of QName, the value space contains not the prefix but the namespace name to which the prefix is bound. The other types named should be changed analogously: their value spaces should be sets of the relevant constructs:
Given that the restriction to declared notation names cannot be expressed using XML Schema constructs but must be built into a schema processor, should NOTATION be made a primitive type (rather than being derived from QName)?
Cf. Should some types be changed from derived to primitive?
Input from James Clark:
The ENTITY and QName datatypes are similar in that their lexical spaces both contain names that are interpreted relative to a context containing a set of declarations. For QName, the context is an element containing namespace declarations; for ENTITY, the context is a document containing unparsed entity declarations. Yet the value spaces are handled inconsistently. For QName, the value space contains not the prefix, but the namespace URI to which it is bound by the declaration. However, for ENTITY, the value space contains not the entity to which the entity name is bound, but the original string. This makes a significant difference when you start to manipulate or programmatically construct values (e.g. in XML Query or XSLT). I believe QName has it right, and that the value space of ENTITY should be a set of Entity Declaration information items, one for each unparsed entity declared in the document. Similarly, the value space of NOTATION should be a set of Notation info items declared in the Schema, and the value space of IDREF should be the set of all the Element info items in the document that have an ID attribute.
Input from Asir Vedamuthu:
Chairs propose to adopt the proposal, with the modification that the NOTATION value space should be the set of NOTATION components, not the set of NOTATION declaration information items. (I.e. the schema NOTATION datatype refers to notations declared in the schema, not those declared in the DTD.)
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue.
In practice, since the name suffices to allow a processor to identify the appropriate component, the two ways of defining the value space delimit the same set of acceptable implementations. The main difference appears to be in intension or connotation rather than in extension or denotation. After discussion, the WG concluded that since we are instructed by our charter to make sure that our simple types may be used in contexts other than XML Schema, it would be problematic to define the value space of entities, notations, and IDREFs as consisting of components, rather than of names (QNames or NCNames).
RESOLVED unanimously: to affirm that the value spaces of ENTITY and NOTATION are the value spaces of their respective name types.
RESOLVED unanimously: to make the value space of IDREF be the set of NAMES (rather than, as proposed, the set of element information items).
C. M. Sperberg-McQueen responded formally on 7 March 2001. James Clark replied 9 March 2001 asking for more information (see ensuing thread), and after clarification reports that he won't dissent from the WG decision.
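The distinction discussed in this issue, between a lexical space of prefixed names and a value space in which the prefix has been replaced by the namespace name, can be illustrated with a minimal Python sketch. The function name `resolve_qname` and the prefix-to-namespace dictionaries are hypothetical stand-ins for the in-scope namespace declarations; nothing here is taken from the spec itself.

```python
def resolve_qname(lexical, bindings):
    """Map a QName's lexical form to a (namespace name, local part) pair.

    `bindings` is a hypothetical dict from prefix to namespace name; the
    empty-string key stands for the default namespace, if one is declared.
    """
    prefix, sep, local = lexical.partition(":")
    if not sep:
        # No colon: the whole string is the local part.
        prefix, local = "", lexical
    if prefix == "":
        return (bindings.get(""), local)
    if prefix not in bindings:
        raise ValueError("undeclared prefix: " + prefix)
    return (bindings[prefix], local)


# Two different prefixes bound to the same namespace name...
a = resolve_qname("xsd:date", {"xsd": "http://www.w3.org/2000/10/XMLSchema"})
b = resolve_qname("s:date", {"s": "http://www.w3.org/2000/10/XMLSchema"})
# ...yield the same value, although the lexical forms differ.
same_value = (a == b)
```

Under this reading, comparing values never depends on which prefix a document happened to choose, which is the property the commentator singles out for QName.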
The value space of ID should not be defined as containing only the values which have actually been used in a document; it should be the set of strings which match the NCName production.
Input from James Clark:
Part 2 says "The value space of ID is the set of all strings that match the NCName production in [Namespaces in XML] and have been used in an XML document."
The "and have been used in an XML document" makes no sense to me. The value space is surely the set of all strings that match the NCName production. In addition there is the implicit uniqueness constraint that you can't have the same NCName occurring on more than one distinct element. (XML also has the constraint that you can't have a single element with multiple IDs, but I don't see any need for XML Schemas to enforce this.)
The "and have been used in an XML document" is perhaps an attempt to capture the uniqueness constraint in the specification of the value space, but I don't think that can work.
Chairs propose to adopt the change: the uniqueness constraint is a side condition, not a characteristic of the value space.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue. After discussion (of this issue in connection with CR-14, CR-50, and CR-51), the WG voted unanimously to specify that the value space of ID is the set of NCNames, and to instruct the editor to remove the phrase "and have been used in an XML document" from the description.
Priscilla Walmsley replied formally to James Clark 23 January 2001.
James Clark replied that he was satisfied.
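The separation adopted for ID, with membership in the value space on one side and uniqueness as a side condition on the other, might be sketched as two independent checks. This is only an illustration: the NCName pattern below is an ASCII-only approximation, and the function names are hypothetical.

```python
import re

# ASCII-only approximation of the NCName production (an assumption made for
# brevity; the real production also admits many non-ASCII name characters).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def in_id_value_space(value):
    """True if the string is in the value space, i.e. it matches NCName."""
    return NCNAME.fullmatch(value) is not None


def duplicate_ids(ids_in_document):
    """Side condition: return the set of ID values used more than once."""
    seen, duplicates = set(), set()
    for value in ids_in_document:
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates
```

The first check depends only on the string; the second depends on the document as a whole, which is why it does not belong in the definition of the value space.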
The constraint "ENTITY declared" should require a declaration for the entity in the document, not in the schema.
Input from James Clark:
Chairs propose to adopt the proposed correction.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by accepting the correction proposed and fixing the bug.
David Cleary responded formally to James Clark on 23 January 2001; James Clark replied 24 January 2001 that he was satisfied with the decision.
Declared NOTATIONs can be referred to from ENTITY declarations or from attribute declarations. As things stand in the CR draft, any notation referred to from an ENTITY declaration must be declared in the DTD; any notation referred to from an attribute declaration must be declared in the schema. (This last means that attribute types ENTITY and NOTATION behave differently in a confusing way.)
Should this design be changed?
Cf. Value space of entities, notations, IDREFs (and QNames)
Input from James Clark:
The value of an attribute whose type is declared in the schema as an ENTITY must match the name of an unparsed entity declared in the DTD of the document being validated. (At least that's what I think the spec meant to say.) Such an unparsed entity must also have a notation, which, in a valid document, will also be declared in the same DTD. This makes the ENTITY datatype of Schema truly compatible with the ENTITY type of XML 1.0: if an XML 1.0 valid document contains an attribute declared in the DTD as ENTITY, then that attribute would be schema-valid wrt a schema that also declares that attribute as ENTITY.
However, the value of an attribute whose type is declared in the schema as an NOTATION must match the name of a notation declared not in the DTD of the document being validated, but instead in the Schema. This means that declaring an attribute as NOTATION in the Schema does not mean the same thing at all as declaring an attribute as NOTATION in the DTD. In the former case, the value must match a NOTATION declared in the Schema; in the latter case, the value must match a NOTATION declared in the DTD of the Schema being validated. Thus, although the NOTATION attribute is supposed to be in XML Schemas for compatibility, it isn't really compatible.
An application that wishes to fully support notations as defined in XML Schemas would need to provide two distinct sets of notations each with a separate symbol space. It can't simply ignore the notations declared in the DTD because these are implicitly referenced by the unparsed entities that are referenced by ENTITY attributes declared in the Schema.
This doesn't seem a very sensible design to me. I can think of the following alternative approaches for improving things:
Personally I would prefer F. I would reiterate the comments I made at http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0047.html [See below -MSM] (to which I have yet to receive a response).
Input from James Clark:
Is there really any justification for keeping notations given that external unparsed entities have gone? They don't seem to have received much use in XML 1.0. They have never seemed to me to integrate well into the Web architecture: I would claim that things that in SGML were done with notations should on the Web be done with MIME media types.
Input from Asir Vedamuthu:
Input from Henry Thompson:
Input from Asir Vedamuthu:
Remove part but not all of it.
Chairs propose to reply with polite negative.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue.
In discussion, it proved that there was no support for removing the NOTATION type. Notations provide information about otherwise opaque objects; they are not limited to entities, and there is no reason to remove them from the XML Schema language simply because entities are not in the language. It is true that processors which are both schema-aware and DTD-aware will need to keep track of two distinct sets of notations; notations do not differ, in this way, from many other constructs in schemas and DTDs which have analogues in the other formalism.
The uniqueness constraint on notation names cannot be expressed by our type derivation rules; this seems to require that NOTATION be a primitive, not a derived type.
RESOLVED unanimously: to retain NOTATION.
RESOLVED unanimously: to make NOTATION a primitive, rather than a derived, type.
Alex Milowski responded formally to the commentator on 7 March 2001.
James Clark replied 9 March that he dissents from this decision.
An issue which was raised during review of XPointer, concerning validation of IDREFs in XPointers, needs to be reviewed by the XML Schema WG during CR. The IDREFS type can be used to require that certain link targets be document-internal, and to ensure that the targets linked to actually exist. The URI-reference type can be used with a pattern facet to require that certain link targets be document internal (the pattern requires that they begin with a hash mark), but there is, at present, no way to ensure that the targets of such internal XPointers actually exist.
Should XML Schema be changed to provide this functionality?
Input from Eric van der Vlist, Eve Maler:
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to reply with polite negative.
Aki Yoshida informed the commentator 14 February 2001. Commentator not satisfied, and desires that fields in keys and key references be allowed to be any result of evaluating an XPath expression, instead of being required to be an element or an attribute.
The treatment of character ranges and square brackets in the definition of regular expressions in Appendix E allows for some ambiguous regular expressions.
Input from Uwe Plonus:
I'm working on an implementation for regular expressions. Because we want to use the XML schema after becoming a recommendation, I took the specification of the XML schema part 2 for implementation. In section E.1 Character Classes I've found a part of text, which can lead to misinterpretation. It's the part of character ranges, which can be interpreted wrong.
It says, that the characters '[' and ']' are no valid character ranges. But in the form 's-e' it is allowed to replace 's' with '[' or ']' and to replace 'e' with ']'. There're some constructs, which cannot be interpreted, if these characters are allowed.
First example:
[----[-]]
This example has the following possible interpretations:
'[-]' is a character class expression.
'[-]' is also a character range.
These two interpretations are not equivalent, but both can be interpreted with the definition.
Second example:
[A-]]-z(ab|cd)
The following interpretations are possible:
'[A-]]' means a character class expression and the rest is no character class expression.
'[' I take the part 'A-]' as a character range, the following ']-z]' as a concatenated character range and then I've problems getting an end of this character class expression.
A suggestion to avoid these misinterpretations is not to allow the characters '[' and ']' as start and end of a character range. The only exception could be as the first character of a positive character group.
Input from James Clark:
Chairs propose to remove the ambiguity in the way described by Alexander Falk's grammar.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and then to resolve it by accepting the invitation to eliminate ambiguities in the grammar (by accepting the formal grammar provided by Alexander Falk, see CR-33).
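To see how a concrete engine settles such a case, the commentator's second example can be run through Python's `re` module. This is only an illustration of one possible reading: Python's regular-expression syntax is not the XML Schema regular-expression language, and the grammar adopted by the WG may resolve the pattern differently.

```python
import re

# Python's re module is NOT the XML Schema regular-expression language; it is
# used here only to show one concrete reading of the commentator's pattern.
pattern = re.compile(r"[A-]]-z(ab|cd)")

# Python reads "[A-]" as a class containing "A" and "-", and treats the
# following "]-z" as literal characters.
reads_as_short_class = pattern.fullmatch("A]-zab") is not None
also_accepts_hyphen = pattern.fullmatch("-]-zcd") is not None
rejects_other_letters = pattern.fullmatch("B]-zab") is None
```

A grammar that leaves both readings open would let two conforming processors disagree on inputs such as these, which is the practical cost of the ambiguity.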
Should the ability to define wildcards as matching anything in any namespace except the current one, or except a specified one, be extended? In particular
[N.B. the last three items in this list added 2001-01-09. -MSM]
Input from James Clark:
The wildcard facility in XML Schemas doesn't seem to handle the case of xsl:stylesheet, which allows specific elements from the xsl namespace plus any element whose namespace URI is both not XSL and not absent. This seems to me a pretty reasonable thing to want to do.
Input from Judith A. Slein:
Input from Matthew Fuchs:
Input from Judith A. Slein:
Input from Matthew Fuchs:
Input from Judith A. Slein:
Chairs propose to discuss Martin Gudgin's proposal, and either adopt for 1.0 or put on the list for 1.1.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue.
In discussion, it was argued that the particular use cases in question can in fact be handled by XML Schema. Some WG members wanted to consider a more powerful wildcard selection mechanism in a later version.
RESOLVED without dissent: to close this issue without change to the draft spec.
Ashok Malhotra informed James Clark on 25 January 2001. James Clark replies 26 January to record "a mild dissent" and suggest that the specialized syntax for wildcards is a syntactic dead end better replaced by a more elaborate one which could scale better.
The CR draft says that xsi:type may be used to specify which member of a union type should be used to govern the interpretation of a specific value. Should this be changed?
Input from Asir Vedamuthu:
Chairs propose to reply with polite negative. They note that in response to comments in this area the editor has suggested renaming the "Type derivation OK [Simple]" constraint to "Type allowed [Simple]".
At its meeting in January 2001 in London, the WG decided against opening this as an issue.
Martin Gudgin notified the commentator 22 January 2001. Commentator replied 1 February 2001 saying he is not satisfied with the resolution of the issue but that his dissent is "not grave enough for the Director".
Should negative values be allowed for the scale facet?
Input from Mike Cowlishaw:
At the November 2000 meeting in Menlo Park the WG decided (with two dissents) against allowing negative values for scale.
At its meeting in January 2001 in London, the WG decided against reopening this issue.
Henry Thompson responded formally to Mike Cowlishaw on 5 March 2001. Mike Cowlishaw replied 6 March that he was content for this item to be "deferred until the next round", but that he was unhappy about other issues.
Should the lexical space for decimals be changed to allow exponential (scientific) notation?
Input from Mike Cowlishaw:
At the November 2000 meeting in Menlo Park the WG decided against allowing decimals to be expressed using scientific notation. Dissenting: CommerceOne, Edinburgh, Lotus, Tibco Extensibility.
At its January 2001 meeting in London, the WG declined to reopen this as an issue.
Should the simple types NOTATION, timePeriod, time, date, month, year, century, recurringDate, and recurringDay, which are defined in the CR draft as derived types, be made primitive types?
Cf. Value space of entities, notations, IDREFs (and QNames)
Input from Asir Vedamuthu:
RESOLVED: to deal with NOTATION in the context of other NOTATION-related issues, and to deal with date/time types in the context of the overall review of the date/time changes proposed by the datatypes editors.
[The upshot of this is that all these types are now primitive. -MSM]
Commentator is satisfied (member-only link).
Should the name of the simple type called CDATA in the CR draft be changed? (Leading candidate is normalizedString.)
At its November 2000 ftf meeting, the WG leaned toward making this change, and decided to postpone final decision until later, when this and other similar minor changes would be voted on en bloc. (In the terminology used in the meeting, we "put it on the pile" to come back to.)
[The pile was dealt with in London, 19 January 2001. A new namespace will be made, with a few changes, this among them.]
Noah Mendelsohn sent a notice to the XML Schema comments list 29 January 2001.
Should XML Schema allow fuller constraints on mixed content? In particular,
Chairs propose to respond with polite negative.
At its January 2001 meeting in London, the WG declined to open this as an issue, on the view that there was clearly no consensus in favor of any change in this area.
The I18n WG wrote 8 March 2001 (member-only link) saying (among other things) that they remain unhappy about the omission of support for character-repertoire restrictions on mixed content, and requesting a commitment to add such a capability in XML Schema 1.1. The XML Schema WG agreed to place this feature on the list of candidate requirements for XML Schema 1.1.
The default value for the elementFormDefault attribute on the schema element in the XML transfer syntax for schemas is currently unqualified (which is also the default for attributeFormDefault). Should the default value be changed to qualified? Should the default value be eliminated?
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to treat this as part of CR-2.
In the CR draft, the canonical form for dates is restricted to UTC. This makes some sense for time instants, but translating time periods from other time zones into UTC requires the ability to give time-of-day information in UTC. This is not allowed in dates (which do not allow specification of hours, minutes, etc.). Should time-zone information be allowed in the canonical form of the date type? (Should some other change be made?)
Input from Graham Ross:
Chairs propose to adopt the proposed correction.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by adopting the changes to date and time types proposed by the datatypes editors, which include the correction proposed by the commentator.
Should the definition of XML Schema ensure that the XML transfer syntax and the abstract component level have exactly the same expressive power? This would ensure both that every legal schema document in the XML transfer syntax corresponds to a legal set of components, and that every legal set of components could be described by some set of schema documents in the XML transfer syntax.
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to respond with polite negative. This is without prejudice to considering this as a design goal or requirement for future versions of XML Schema.
Should it be possible for top-level ALL groups to be made optional?
At its ftf meeting in November, 2000, the WG voted without dissent to remove the restriction which requires minOccurs to be 1 on top-level ALL-groups.
Since this issue was raised by the WG itself, there is no commentator to notify.
Should xml:lang be allowed by default in all complex types unless the schema author turns it off (e.g. by setting its maxOccurs to 0)? (Possible alternative: make and recommend a complex type in a type library suitable for text in human languages, which does define xml:lang.)
The WG discussed this issue at its meeting of January 2001 in London. Various possibilities were explored:
None of these commanded a majority. There was substantial sentiment that a solution which was technically well designed would take more work.
RESOLVED: to instruct the editors to add references to the type library solution.
This issue discussed with the I18n WG which originated it in a joint session on 2 March 2001. The I18n WG indicates that this resolution is acceptable.
Asir Vedamuthu confirms and expounds in writing, 4 March 2001.
Should the rules regarding definition and/or use of named model groups be changed? If so, how?
Input from Morris Matsa:
Input from Asir Vedamuthu:
Input from Morris Matsa:
Input from Asir Vedamuthu:
Input from Henry Thompson:
Input from Noah Mendelsohn:
Chairs propose to adopt the proposal to make occurrence indicators illegal (not merely ignored) on named groups.
RESOLVED unanimously (London, January 2001): to modify the schema for schemas in such a way that top-level named groups may not have occurrence indicators.
Commentator (Morris Matsa) indicated on 7 March 2001 that he was satisfied (member-only link).
Should a formal grammar for regular expressions be added to Appendix F of Part 2? Should it replace or supplement the existing description of regular expressions? In case of conflict, which should have priority?
Input from James Clark:
Input from Paul Biron:
Chairs propose to adopt Alexander Falk's proposal.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by accepting Alexander Falk's proposed BNF for regular expressions.
C. M. Sperberg-McQueen responded formally to James Clark on 7 March 2001.
James Clark confirmed 9 March that he was satisfied.
Shall the description of complex-type derivation be changed to specify that
Chairs propose to make the minimal change required to fill the gap identified in the spec.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue.
RESOLVED: to resolve CR-34 by instructing the editors to replace the current mapping for wildcards in derivation by extension, to make it be a union of the sets matched by the wildcard expressions. In favor: 19, opposed: 0, abstaining 5 (Contivo, HP, Informix, Oracle, W3C), concurring 4 (Arbortext, Lotus, Software AG, Xerox).
Priscilla Walmsley notified the commentator 23 January 2001. Asir Vedamuthu replied that he was satisfied, 23 January 2001.
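The resolution above, that the wildcard of an extended type is the union of the sets matched by the base and extension wildcards, can be sketched by modelling each wildcard as a predicate over namespace names. This is a deliberately simplified model: the helper names are hypothetical, and the spec's treatment of the absent namespace and of process contents is ignored.

```python
def other_than(excluded):
    """Wildcard matching any namespace name except `excluded`."""
    return lambda ns: ns != excluded


def one_of(*allowed):
    """Wildcard matching exactly the listed namespace names."""
    allowed_set = frozenset(allowed)
    return lambda ns: ns in allowed_set


def wildcard_union(base, extension):
    """A namespace matches the derived wildcard if it matches either one."""
    return lambda ns: base(ns) or extension(ns)


# Hypothetical namespace names, used only for illustration.
derived = wildcard_union(one_of("urn:example:a"), one_of("urn:example:b"))
widened = wildcard_union(one_of("urn:example:a"), other_than("urn:example:a"))
```

Here `derived` accepts both listed namespaces, and `widened` accepts every namespace name, since the two operands together cover everything.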
Should the mechanisms for declaring simple types be changed to allow schema authors to block derivation of other types from them?
Input from Asir Vedamuthu:
See also http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0337 (the URI in the mail message raising this issue is no longer correct).
Chairs propose to adopt the proposal.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by instructing the editor to add the ability to block further derivation of simple types.
David Cleary responded formally to the commentator on 23 January 2001. Asir Vedamuthu confirmed that he was satisfied with the WG decision on 23 January.
Should Datatypes be modified to specify explicitly whether multiple occurrences of facets (length, minLength, maxLength, whiteSpace, maxInclusive, minInclusive, maxExclusive, minExclusive, precision, scale, encoding, duration, and period) should be an error or not? If it is not an error, which facet specification in the schema document should be followed?
Input from Asir Vedamuthu:
Chairs propose (1) to clarify that repeated facets are an error if not authorized in the spec, and (2) to ensure that facets occurring at multiple points in a derivation (e.g. patterns) can be combined successfully in a single component at the abstract level (need ANDing of patterns).
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and (after brief discussion) to resolve it as proposed by the chairs.
Alex Milowski replied formally on 7 March 2001. Asir Vedamuthu noted that he is satisfied.
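The two parts of the chairs' proposal can be sketched as follows: a check that rejects a facet repeated within one derivation step, and a combination rule that ANDs the patterns accumulated along a derivation chain. The function names and the choice of which facets may repeat are assumptions made for illustration, and Python's `re` syntax stands in for the schema regular-expression language.

```python
import re


def check_no_repeated_facets(facets, repeatable=("pattern", "enumeration")):
    """Raise ValueError if a facet name occurs twice in one derivation step.

    `facets` is a list of (name, value) pairs. Which facets are allowed to
    repeat is an assumption here, passed in through `repeatable`.
    """
    seen = set()
    for name, _value in facets:
        if name in seen and name not in repeatable:
            raise ValueError("facet specified more than once: " + name)
        seen.add(name)


def matches_all(patterns_by_step, value):
    """AND the patterns inherited along a derivation chain."""
    return all(re.fullmatch(p, value) is not None for p in patterns_by_step)


# One pattern per derivation step: upper-case letters, then exactly 3 chars.
chain = [r"[A-Z]+", r".{3}"]
accepted = matches_all(chain, "ABC")
```

A value is accepted only if every step's pattern accepts it, so a derived type can narrow but never widen what its base type allows.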
Should the default declarations for namespace parameter entities be changed to use explicit prefixes? Should extra namespace-declaration (pseudo-)attributes be declared for the 'schema' element, to make it easier for users to build schemas using existing DTD-based tools without having to use an internal DTD subset?
Input from Curt Arnold:
Currently %p and %s's default values of '' would be rarely preferred except while defining schema for schemas. 'xsd:' and ':xsd' would be a better default for most uses. XSLT, for example, can't generate the internal subset declarations that would allow using the default namespace for references to elements in the target schema.
I'd recommend changing in the DTD:
<!ENTITY % p 'xsd:'>
<!ENTITY % s ':xsd'>
In addition (even if the defaults weren't changed), it would be useful if the attribute list of the schema element were something like this:
<!ATTLIST %schema;
   targetNamespace      %URIref;               #IMPLIED
   version              CDATA                  #IMPLIED
   %nds;                %URIref;               #FIXED 'http://www.w3.org/2000/10/XMLSchema'
   xmlns:x              %URIref;               #IMPLIED
   xmlns:target         %URIref;               #IMPLIED
   xmlns                %URIref;               #IMPLIED
   xmlns:import1        %URIref;               #IMPLIED
   xmlns:import2        %URIref;               #IMPLIED
   xmlns:import3        %URIref;               #IMPLIED
   finalDefault         %complexDerivationSet; ''
   blockDefault         %blockSet;             ''
   id                   ID                     #IMPLIED
   elementFormDefault   %formValues;           'unqualified'
   attributeFormDefault %formValues;           'unqualified'
   %schemaAttrs;>
[N.B. The proposal adds the new attributes xmlns:x, xmlns:target, xmlns, xmlns:import1, xmlns:import2, and xmlns:import3 and leaves the existing attributes unchanged. -MSM]
The combination would allow most schemas to be validated against the DTD without requiring the end user to define an internal subset. Just the additional xmlns attribute definitions should allow most schemas to be validated without having to redefine %schemaAttrs.
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to classify this as an editorial question.
Aki Yoshida informed the commentator of this resolution on 14 February 2001.
Should the spec be changed to clarify whether it is legal or not legal to import a namespace which is the same as the target namespace? If such an import is legal, should the spec specify (more clearly) what the consequences of the import would be? Should we specify consequences other than those now implied by the spec?
Input from Roberto Galnares:
Input from Noah Mendelsohn:
Chairs propose to specify that an import of the target namespace means that the schema processor must find a schema for the target namespace; the schema document it has in hand providing such a schema, the instruction has no effect. Short discussion may be in order to decide whether this pointless import should receive an error message.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and (after discussion) to close it with an instruction to the editor substantially as proposed by the chairs (some WG members would prefer something slightly less specific).
Should the schema for schemas (and the DTD for schemas) be changed to allow an id attribute (of type ID) on the facet element in schema documents?
Input from Bob Schloss:
Chairs propose to allow such IDs.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by allowing such IDs.
Ashok Malhotra notified the commentator on 25 January 2001. Bob Schloss replied the same day that he is satisfied.
Should the schema for schemas (and the DTD for schemas) be modified to change the default value of the block and final attributes on complexType and element from '' to #IMPLIED?
Input from Curt Arnold:
Input from Henry Thompson:
Chairs note that this has already been accepted as a typo report by the editor.
Accepted as an error and fixed.
Martin Gudgin notified the commentator of this disposition on 22 January 2001. Curt Arnold replies that he is satisfied on 24 January 2001.
Should Structures specify how / where to generate a prefix and a namespace declaration for the namespace of an attribute with a qualified name for which the schema processor must supply a default? (If the value is specified in the instance, there is no problem.)
Input from Asir Vedamuthu:
[And thread beginning there.]
Chairs propose to specify that the generation of such prefixes is governed by any rules applicable to serialization of the PSVI; since our spec does not have any such rules, our spec can and should remain silent.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it as proposed by the chairs.
Henry Thompson notified the commentator formally on 2 March 2001 (though since the commentator is a WG member, the commentator has known since January). Asir Vedamuthu responded 5 March 2001 to say he was happy.
Should the canonical representation [and value space?] for decimals be changed?
Input from Mike Cowlishaw:
[The commentator forecasts problems if we continue to specify that the values denoted by "1 * 10^0" and "100 * 10^-2" are the same, and proposes a change to [the value space and] the definition of the canonical form which would avert them. It appears to me that the change would require modifying our definition of the value space of decimals. -MSM]
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to respond with a polite negative, and declined to open this as an issue.
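The distinction at stake in this issue, between numeric equality and equality of coefficient-and-exponent representation, can be illustrated with Python's `decimal` module. The module is used here only as a convenient illustration and is not referenced by the spec.

```python
from decimal import Decimal

one = Decimal("1")             # coefficient 1, exponent 0
one_scaled = Decimal("1.00")   # coefficient 100, exponent -2

# Numerically the two are equal...
same_number = (one == one_scaled)
# ...but their (sign, digits, exponent) representations differ.
same_representation = (one.as_tuple() == one_scaled.as_tuple())
```

A value space in which these two are the same value, as in the CR draft, corresponds to the first comparison; the commentator's concern corresponds to the second.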
Should the spec explicitly forbid the creation of circular substitution groups?
Input from Alexander Falk:
[Item 3 in the list.]
Input from C. M. Sperberg-McQueen:
Input from Alexander Falk:
Chairs propose to adopt the proposal to forbid cyclic substitution groups explicitly.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by accepting the proposal to forbid cyclic substitution groups explicitly.
Allen Brown sent a formal response to the commentator 8 March 2001.
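A check for cyclic substitution groups of the kind this issue forbids might look like the following sketch. The representation (a dictionary from element name to the head of its substitution group) and the function name are hypothetical, chosen only to make the condition concrete.

```python
def find_substitution_cycle(heads):
    """Return a list of element names forming a cycle, or None.

    `heads` maps each element name to the head of its substitution group
    (hypothetical representation; elements without a head are omitted).
    """
    for start in heads:
        path, current = [], start
        # Follow head links until they run out or revisit an element.
        while current in heads:
            if current in path:
                return path[path.index(current):]
            path.append(current)
            current = heads[current]
    return None


cyclic = find_substitution_cycle({"a": "b", "b": "c", "c": "a"})
acyclic = find_substitution_cycle({"a": "b", "b": "c"})
```

Because each element names at most one head, following the links from every element is enough to find any cycle.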
Should xsi:type be dropped?
Input from Alexander Falk:
[Item 5 of list.]
On 18 January 2001, at its meeting in London, the WG noted that Alexander Falk had proposed to withdraw this suggestion and decided not to open it as an issue.
Noah Mendelsohn sent a formal response on 29 January 2001. Alexander Falk confirmed on 29 January that he is satisfied with the outcome.
Are there / should there be predefined types, or other constructs, which distinguish / allow application software reliably to distinguish among arrays, lists, sets, and bags (in the sense in which these terms are used by EXPRESS)?
Input from Mike McCaleb:
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to respond with a polite negative. This proposal is identical to several proposals we have considered in the past, and no new information appears to be available which would lead anyone in the WG to change their mind as to the merits of the case.
It is not clear to the commentator whether key and keyref constructs can augment the SGML/XML IDREFS construct. If not, should the spec be changed to make it possible? If so, should the spec be changed to make the functionality more obvious?
Input from Mike McCaleb:
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal not to open this as an issue.
Should XML Schema be changed to allow schema authors to declare simple types (as well as complex types) as final? as abstract?
Cf. Allow simple types to block further derivation?
Issue CR-35. (Overlaps this one.)
Input from Bob Schloss:
Input from Mike McCaleb:
On 18 January 2001, at its meeting in London, the WG agreed with the chairs' proposal to consider the question of final simple types as part of CR-35, and to respond with a polite negative on the suggestion for abstract simple types. A proposal for abstract simple types was considered at some length, and removed from the spec when they proved to have more complex consequences than we felt were appropriate to deal with in XML Schema 1.0.
Asir Vedamuthu responded formally on 5 March 2001.
Should the URIs defined for the builtin datatypes and facets be changed to prescribe that the shorthand form (e.g. http://www.w3.org/2000/10/XMLSchema#date) is preferred over the explicit XPointer form (e.g. http://www.w3.org/2000/10/XMLSchema#xpointer(id("date"))) as the 'correct' form of opaque URI?
Input from Martin Gudgin:
[Consult earlier messages in the same thread for background.]
Chairs propose to respond with polite negative.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by reaffirming the existing design.
At its March 2001 meeting in Cambridge, the WG reopened this question on the basis of new information about expected applications of the canonical URIs, in which the quoting necessary in the long form is cumbersome, and in which the naked-fragment-identifier form is substantially more convenient precisely because it can be copied to URIs of other MIME types and be reinterpreted successfully in the new context. RESOLVED unanimously to shift to the `naked-identifier' as the canonical form for these URIs.
C. M. Sperberg-McQueen sent a formal notice of this to the comments list 8 March 2001. Dan Connolly responded 9 March that he was satisfied.
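The two URI forms discussed in this issue differ only in how the fragment identifier is written, as the following sketch shows. The helper names are hypothetical; the namespace URI and both fragment shapes are the ones quoted in the issue statement.

```python
XSD_NS = "http://www.w3.org/2000/10/XMLSchema"


def short_form(name):
    """Naked-identifier form, e.g. ...XMLSchema#date."""
    return XSD_NS + "#" + name


def long_form(name):
    """Explicit XPointer form, e.g. ...XMLSchema#xpointer(id("date"))."""
    return XSD_NS + '#xpointer(id("' + name + '"))'


def to_short_form(uri):
    """Rewrite a long-form URI to the short form; leave other URIs alone."""
    prefix, suffix = '#xpointer(id("', '"))'
    base, sep, fragment = uri.partition("#")
    fragment = sep + fragment
    if fragment.startswith(prefix) and fragment.endswith(suffix):
        return base + "#" + fragment[len(prefix):-len(suffix)]
    return uri
```

The long form carries quotation marks and parentheses that must be escaped in many contexts, which is the inconvenience cited when the WG reopened the question.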
Should XML Schema define a subset of XPath for use in specifying key and keyref constraints, rather than allowing full XPath?
Chairs propose to discuss and decide.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue. It was discussed in connection with issue CR-54, and a subset of XPath was identified and adopted.
In the ensuing weeks, a joint task force of the XML Schema and XSL Working Groups reviewed the subset in light of the implementation experience in the XSL Working Group, checking to see whether the subset achieved its intended goal of reducing the implementation burden on conforming processors to approximately the same as that imposed by implementing ID/IDREF, and to see whether the subset could be simplified or generalized without jeopardizing that goal. (The WG concluded that the goal of statically determining the type of all key fields was not possible given that in an instance the key might well have an xsi:type attribute.)
The joint task force suggested the removal of some restrictions in the subset, which would have the effect of making the subset slightly larger and its description slightly smaller. The task force did not achieve consensus on the treatment of `//' (i.e. the descendant:: axis) and produced two proposals, recommending that the WG choose between them. The proposal actually adopted is summarized thus:
RestrictExpr ::= Union
Union ::= Path ('|' Path)*
Path ::= ('.//')? Step ('/' Step)*
Step ::= '.' | '@'? NameTest
NameTest ::= QName | '*' | NCName ':' '*'
The other alternative read
Path ::= Step (('/' | '//') Step)*
The WG considered this task force recommendation at its meeting in Cambridge, Massachusetts, 1-2 March 2001. A proposal to extend the subset further by adding either
Path ::= 'ancestor::*/@' NameTest
or
Path ::= 'scope()/@' NameTest
was discussed, but did not achieve consensus. It seems clear that this proposal might help deal with situations in which the scoping elements for sets of keys are themselves uniquely identified by a key within a larger scope (elements representing vehicles might be uniquely identified within states; the elements representing states might themselves be uniquely identified within a larger scope, etc.), but this seems likely to benefit from deeper analysis.
Between the two proposals forwarded by the task force, the WG preferred the one which restricts the use of descendant. The unrestricted form makes matching nondeterministic and thus increases the implementation burden; the restricted form may reduce the complications facing a future generalization of our design, and makes it easier to get static type checking in some cases (though it may be observed that full static prediction of the types of keys is not possible given that an instance may always use xsi:type on keys). The WG could not think of any serious use cases for keys which would require the descendant axis in positions other than first.
A proposal to modify the subset so as to allow long forms using explicit axis keywords (child:: etc.) as well as the short forms was adopted, on the grounds that the purpose of the subset is to restrict the functionality, and restricting the subset to short forms does not conduce to that goal.
RESOLVED without dissent: to adopt the proposal as amended (by restricting descendant to the first step, and by allowing long forms). Abstaining: Lotus. (Lotus expressed concerns about the restriction of some functionality to attributes instead of child elements.)
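For illustration, the short-form grammar summarized above can be sketched as a small checker. This is a minimal sketch under stated assumptions, not the WG's definition: the NCName pattern is a simplified ASCII-only stand-in, whitespace is tolerated only around `|', the long forms with explicit axis keywords are not covered, and the function name in_subset is hypothetical.

```python
import re

# Simplified, ASCII-only NCName (assumption: the real NCName production
# from Namespaces in XML admits many more characters).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"
QNAME = rf"(?:{NCNAME}:)?{NCNAME}"
# NameTest ::= QName | '*' | NCName ':' '*'
NAME_TEST = rf"(?:{QNAME}|\*|{NCNAME}:\*)"
# Step ::= '.' | '@'? NameTest
STEP = rf"(?:\.|@?{NAME_TEST})"
# Path ::= ('.//')? Step ('/' Step)*
PATH_RE = re.compile(rf"(?:\.//)?{STEP}(?:/{STEP})*")


def in_subset(expr: str) -> bool:
    """Return True if expr matches Union ::= Path ('|' Path)* (short forms only)."""
    return all(PATH_RE.fullmatch(path.strip()) for path in expr.split("|"))


if __name__ == "__main__":
    for candidate in (".//vehicle/@plate", "state | .//p:*", "a//b", "a[1]"):
        print(candidate, in_subset(candidate))
```

Because the descendant shorthand is admitted only as a leading `.//', an expression such as a//b is rejected, as are predicates and the ancestor axis.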
Neither the subset of XPath originally adopted, nor the later revision of that subset in cooperation with the XSL Working Group, is identical to the subset proposed by Andy Clark.
At its meeting in Cambridge (1-3 March 2001) the WG affirmed that it accepted that, using the subset it has adopted, it is not possible to predict, from a static analysis, the type of key fields given only the XPath expressions used to identify them. It is the belief of the WG that static type prediction of this type is not feasible in any case, given that in an instance the key field may have an xsi:type attribute.
The use of the descendant axis in the subset of XPath seems essential to providing the desired functionality; since it does not make the difference between feasible static type analysis and infeasible static type analysis, the WG was not forced to make an explicit judgement of the tradeoff between the two. Some WG members did express the opinion that the functionality provided by the descendant axis was more important than the ability to do static type checking, but other WG members might feel differently if there were a real choice.
Jim Trezzo notified Andy Clark of the January decision on 13 February 2001. An extensive correspondence then ensued, the upshot of which is that most of the specific concerns of the commentator regarding implementation difficulty have been addressed by the later revisions to the XPath subset, though his expressed desire to allow predicates has not been followed. At this time, it seems safest to assume the commentator is not wholly satisfied with the compromise arrived at by the XML Schema and XSL WGs.
If IDREF is one member of a union type, and a value in the instance is lexically a legal IDREF, but is not the value of an ID in the document, is this an error, or does the processor proceed to try the value as a value of the other, later types in the union?
Input from Jane Hunter:
Item 3 in list (the editor of this document is not sure the question described above is the question JH had in mind.)
Input from Paul Biron:
Input from C. M. Sperberg-McQueen:
Input from Martin Gudgin:
Chairs propose to specify that the value is attributed to the first type whose value space it matches. (Requiring enforcement of the uniqueness constraint would involve arbitrary lookahead before allowing a processor to know what type a value has.)
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue. After discussion of this and related issues, the WG decided to separate the uniqueness constraints on IDs and the referential integrity constraints on IDREFs from the definitions of the types.
RESOLVED unanimously: to instruct the editors to ensure that the uniqueness and referential integrity constraints on legacy types (including IDREF) are described not as part of the type validity of values, but as constraints imposed at another level.
Since the choice of member types in a union is determined entirely by type validity, no referential integrity constraint is imposed on a candidate IDREF value before deciding whether the value is or is not to be taken as an IDREF.
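The separation described above can be sketched as follows. This is a minimal sketch, assuming simplified ASCII-only lexical checks; the member list, the function names select_member and dangling_idrefs, and the sample values are all hypothetical.

```python
import re

# Simplified, ASCII-only NCName check (assumption; the real rule is broader).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")
INTEGER = re.compile(r"[+-]?[0-9]+")


def is_idref(lexical):
    return NCNAME.fullmatch(lexical) is not None


def is_integer(lexical):
    return INTEGER.fullmatch(lexical) is not None


def is_string(lexical):
    return True


# Member types of a hypothetical union, in declaration order.
UNION_MEMBERS = [("IDREF", is_idref), ("integer", is_integer), ("string", is_string)]


def select_member(lexical, members):
    """Pick the first member type for which the value is type valid.

    No ID lookup happens here: referential integrity is not part of
    type validity.
    """
    for name, accepts in members:
        if accepts(lexical):
            return name
    return None


def dangling_idrefs(idref_values, ids):
    """Referential integrity, checked separately once all IDs are known."""
    return [value for value in idref_values if value not in ids]


if __name__ == "__main__":
    print(select_member("chapter1", UNION_MEMBERS))
    print(dangling_idrefs(["chapter1"], set()))
```

In this sketch the value chapter1 is taken as an IDREF even when no matching ID exists; the missing target surfaces only in the separate referential integrity check, so no lookahead is needed to assign the type.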
RESOLVED unanimously: to instruct the editors to make the legacy types derived types rather than primitives, as appropriate (since in many cases they had been made primitive types because the uniqueness constraints were not expressible using our normal type derivation mechanisms).
Priscilla Walmsley notified Jane Hunter of the resolution on 23 January 2001.
In general, validation of an element involves validation of the subtree rooted at that element, but can be performed without reference to its context. Thus, the spec manages almost completely to avoid any appeal to the notion of 'document' in defining validation.
The rules for type IDREF, however, do make such an appeal. Do those rules mean that we should revise our description of how validation is performed?
Similar arguments apply to keys and keyrefs. Do they mean that validation of subtrees below the scope of the key constraint is (a) impossible, (b) partial (omitting key and keyref constraint checking), (c) legal but requires the processor to climb the tree until it hits the scoping element, (d) other?
Input from Noah Mendelsohn:
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue. After discussion, the WG concluded that there is in fact no problem, and resolved to instruct the editor to add the clarifying statement that schema validation is guaranteed to produce the same result as DTD-based validation only when validation is of the root element information item.
David Cleary posted a note to the comments list 23 January. Noah Mendelsohn replied 23 January that he was satisfied with the resolution.
The spec appears obscure or underspecified on whether defaulted element or attribute values participate in checking identity constraints. Does such checking operate on the unaugmented infoset?
Chairs propose to stipulate that identity constraints are checked against the schema normalized form (so default values do participate in such checking).
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by agreeing explicitly that default values do participate in identity-constraint checking.
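The effect of this decision can be sketched with a toy uniqueness check. This is a minimal sketch under stated assumptions: elements are modelled as plain attribute dictionaries, the attribute names region and code are hypothetical, and a missing field is simply represented as None rather than handled as the spec would require.

```python
def apply_defaults(attrs, defaults):
    """Return the attributes as seen after default values are supplied."""
    merged = dict(defaults)
    merged.update(attrs)
    return merged


def duplicate_keys(elements, fields, defaults):
    """List key tuples that occur more than once, computed after defaulting."""
    seen = set()
    duplicates = []
    for attrs in elements:
        full = apply_defaults(attrs, defaults)
        key = tuple(full.get(field) for field in fields)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


if __name__ == "__main__":
    elements = [{"code": "1"}, {"region": "US", "code": "1"}]
    # Checked against the defaulted values, the two elements collide.
    print(duplicate_keys(elements, ("region", "code"), {"region": "US"}))
    # Checked against the unaugmented values, they would not.
    print(duplicate_keys(elements, ("region", "code"), {}))
```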
Should Datatypes be modified to add built-in binary types (e.g. base64Binary and hexBinary)?
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by deleting the current binary type and adding two distinct types for base64Binary and hexBinary. We accept the unfortunate side effect that these types have no relation visible in the type hierarchy, even though they will share the same value space.
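The shared value space can be illustrated with the Python standard library. This is a sketch only: the function names are hypothetical, and Python's parsers are somewhat more lenient than the lexical rules of the two types (bytes.fromhex, for example, tolerates embedded spaces).

```python
import base64


def parse_hex_binary(lexical: str) -> bytes:
    """Map a hexBinary-style lexical form to a sequence of octets."""
    return bytes.fromhex(lexical)


def parse_base64_binary(lexical: str) -> bytes:
    """Map a base64Binary-style lexical form to a sequence of octets."""
    return base64.b64decode(lexical, validate=True)


if __name__ == "__main__":
    # Two different lexical forms, one value.
    print(parse_hex_binary("68656C6C6F") == parse_base64_binary("aGVsbG8="))
```

The two lexical spaces are different, but both map to the same finite sequences of octets, which is why the types are unrelated in the hierarchy yet equal in value space.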
Aki Yoshida notified James Clark of this outcome on 14 February 2001.
James Clark replied 12 March that he was satisfied.
Should identity constraints be removed from XML Schema?
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue. A substantial amount of time was devoted to discussing the issue.
Allen Brown summarized the arguments for removing identity constraints.
Jim Trezzo summarized the arguments for retaining identity constraints.
Henry Thompson proposed as a compromise to restrict XPath so as to remove some of the characteristics most objected to, and summarized the salient points of the existing proposals (one from him, one from IBM) for such a subset.
After extensive discussion, it was agreed to appoint a task force consisting of Martin Gudgin (chair), Lee Buck, David Cleary, Ashok Malhotra, Jonathan Robie, and Henry Thompson to attempt to devise a suitable subset and report back on the second day of the meeting.
When the task force reported, its proposal was accepted.
For further details, see issue CR-49.
Should the lexical space of Boolean be changed?
[member-confidential link]
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by restoring 0 and 1 as Boolean values. Dissenting: Contivo, IBM, Software AG, TIBCO Extensibility.
In its teleconference of 2001-01-25, the WG decided to specify "true" and "false" as the canonical forms.
Dissenting: GCA, W3C (after the fact).
Abstaining: Microsoft.
(Rationale for the W3C dissent: 0 and 1 are less language- and
culture-specific, and historically more apposite since they are in fact
the lexical forms used by Boole.)
Ashok Malhotra formally notified Allen Brown of the result on 22 January 2001.
Should the Structures part of the spec be changed to specify that schemas should be served as type application/xml? Should it retain the current statements that schemas should be served as type text/xml? Should it allow either, with or without recommending one (and if so which)?
Input from Murata Makoto:
[Member-confidential]
Input from Henry Thompson:
[Member-confidential]
Input from Murata Makoto:
[Member-confidential]
Input from Dan Connolly:
[Member-confidential]
Input from Rick Jelliffe:
[Member-confidential]
Chairs propose to allow all four lexical forms.
At its January 2001 meeting in London, the WG voted to open this as an outstanding issue, and to resolve it by instructing the editor to make the spec agnostic on MIME types. XML Schema processors accept text/xml and application/xml and perhaps others.
Martin Gudgin formally notified Murata Makoto of the outcome on 22 January 2001. Murata-san replied on 22 January saying he is not satisfied, since the IETF-XML-MIME ML has, after a lot of time spent on the issue, concluded that schemas should be served as application/xml, not text/xml. Henry Thompson expanded on the WG rationale on 22 January, explaining that the spec was made 'agnostic' on this issue in order to allow XML Schema processors to continue, rather than raising an error, if the material is served with a MIME type other than the recommended one. Murata-san replied 25 January saying that processors need not raise an error if the server serves the schema under the wrong MIME type, but that the XML Schema specification should mention application/xml as an appropriate MIME type, and adding that if the sentence "XML Schema documents should be served as application/xml" is added to the spec, he does not wish for review of this issue by the Director. In a later note, he agrees on an alternative formulation: "[RFC 3023] recommends that XML-encoded material should be served as MIME type application/xml, unless the XML source code is readable by casual users (in which case it should be served as text/xml). Many observers believe that XML Schema documents fall into this category and should therefore be served as application/xml."
Should the spec specify that attributes from namespaces other than the XSD namespace, which are legal by design in XML Schema documents, must be reflected to the schema components and thence to the PSV infoset?
This question was raised and resolved at the XML Schema WG meeting in London in January 2001. The WG resolved unanimously to make such out-of-band attributes visible in the PSVI.
Should the spec provide declarations for the attributes in the xsi namespace?
This question was raised and resolved at the XML Schema WG meeting in London in January 2001. The WG resolved unanimously to instruct the editors to provide declarations for the items in the XSI namespace.
Should the spec describe what happens when the input to a schema processor is itself the output of another schema processor (as might happen when schema validators are chained together)?
This question was raised and resolved at the XML Schema WG meeting in London in January 2001. On the question "Shall the editor be instructed to specify what happens when input to a processor is a PSV Infoset?", there were four in favor, a large majority against. RESOLVED: to resolve this issue without action.
Should the spec specify exactly when two anonymous types are the same, and when they are not? (The spec can currently be read to allow processors to reuse component structures built for one anonymous type for another anonymous type with the same structure. A processor that does this would on some documents reach results different from those of a schema processor which does not do so.)
This question was raised and resolved at the XML Schema WG meeting in London in January 2001.
RESOLVED unanimously: to instruct the editor to change the "Schema Component Constraint: Element Declarations Consistent" to read "...., all their {type definitions} must be named, and must have the same name and targetnamespace".
Should the spec replace the current definition of when restriction of complex types is legal and when it's not, by a simple statement that it's legal if the set of elements recognized is a subset of the set recognized by the base type?
This question was raised and resolved at the XML Schema WG meeting in London in January 2001.
There were no WG members in favor of removing the current description of constraints on complex-type restriction which guarantee that the resulting complex type is a subset of the base type.
Should the spec specify that if an element matches a wildcard, members of its substitution group match that wildcard, too? Or should the spec specify that wildcard matching takes no account of substitution groups?
This question was raised and resolved at the XML Schema WG meeting in London in January 2001.
There were no WG members in favor of specifying that whenever an element might match a wildcard, any element in its substitution group might match the wildcard.
At its London meeting the WG added "0" and "1" as lexical forms for Boolean but forgot to decide which forms should be the canonical forms.
At its teleconference of 25 January 2001, the WG voted to make true and false the canonical forms.
Dissenting: GCA, W3C.
(Rationale for W3C dissent: 0 and 1 are substantially more language-independent, as well as being the lexical forms actually used by Boole in his algebraic work.)
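The relation between the four lexical forms and the two canonical forms can be sketched as a lookup. This is a minimal sketch; the names parse_boolean and canonical_boolean are hypothetical.

```python
# The four lexical forms and the values they denote.
LEXICAL_FORMS = {"true": True, "false": False, "1": True, "0": False}


def parse_boolean(lexical: str) -> bool:
    """Map a lexical form to its value; anything else is not type valid."""
    try:
        return LEXICAL_FORMS[lexical]
    except KeyError:
        raise ValueError(f"not a legal Boolean lexical form: {lexical!r}") from None


def canonical_boolean(value: bool) -> str:
    """Return the canonical lexical form adopted by the WG."""
    return "true" if value else "false"


if __name__ == "__main__":
    print(canonical_boolean(parse_boolean("1")))
```

Under this mapping "1" and "true" denote the same value, and only "true" is produced when a canonical form is required.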
If we have:
<anyAttribute/>
<attribute name="foo" type="xs:decimal"/>
Is this a bug because a 'foo' attribute can be allowed two different ways?
HST proposes to say 'not a bug', and clarify that the explicit declaration takes precedence, and the foo, when it appears, must be a decimal. This has minor ramifications in the derivation prose.
In its teleconference of 2001-02-23, the WG RESOLVED unanimously: to specify that this is not a problem and that explicit declarations take precedence, as suggested by HST.
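The precedence rule can be sketched as follows. This is a minimal sketch under stated assumptions: the wildcard is treated as accepting any attribute (processContents is ignored), the decimal check is a simple regular expression, and the names attribute_ok and DECLARED are hypothetical.

```python
import re

# Lexical check for a decimal: optional sign, digits, optional fraction.
DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_decimal(lexical: str) -> bool:
    return DECIMAL.fullmatch(lexical) is not None


# Explicit attribute declarations: name -> check on the lexical form.
DECLARED = {"foo": is_decimal}


def attribute_ok(name, value, declared, has_wildcard):
    """Validate one attribute; an explicit declaration takes precedence."""
    check = declared.get(name)
    if check is not None:
        return check(value)
    # Only undeclared attributes fall through to the wildcard.
    return has_wildcard


if __name__ == "__main__":
    print(attribute_ok("foo", "1.5", DECLARED, True))
    print(attribute_ok("foo", "abc", DECLARED, True))
    print(attribute_ok("bar", "abc", DECLARED, True))
```

A foo attribute with the value abc is rejected even though the wildcard alone would have admitted it, because the wildcard is consulted only for attributes that have no explicit declaration.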
Who, if anyone, is responsible for stating the mapping from lexical space to value space for QNames?
For example: if
then several questions arise:
In its teleconference of 2001-02-23, the WG RESOLVED unanimously: to accept PVB's proposal that the value is not type valid (because it is not possible to translate the lexical form into a value), that this is Datatypes' responsibility, and that the spec should make it explicit.
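A mapping from lexical form to value might be sketched as below. This is an illustration under loudly stated assumptions: it takes an undeclared prefix as one case in which no translation is possible, it applies the default namespace to unprefixed names, and the function name qname_value and the namespace URI are hypothetical.

```python
def qname_value(lexical, in_scope):
    """Translate a QName lexical form into a (namespace name, local name) pair.

    in_scope maps prefixes to namespace names; the key "" stands for the
    default namespace (an assumption of this sketch).
    """
    if ":" in lexical:
        prefix, local = lexical.split(":", 1)
    else:
        prefix, local = "", lexical
    if prefix and prefix not in in_scope:
        # No value can be computed, so the lexical form is not type valid.
        raise ValueError(f"prefix {prefix!r} is not bound")
    return (in_scope.get(prefix), local)


if __name__ == "__main__":
    bindings = {"p": "http://example.org/ns"}
    print(qname_value("p:item", bindings))
```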
C. M. Sperberg-McQueen wrote to the comments list to this effect, 5 March 2001. James Clark wrote 6 March to say he found this acceptable.
Suppose we take seriously the maxim about eating our own cooking, and mostly replace references to 'normalized value' in defining the mapping from XML representation to component with references to 'real value', where this is the appropriate point in a value space.
In this case, at least one question arises:
Should 'fixed' value constraints change from strings (which they clearly are in the CR draft) to values, so that e.g. given
<xs:attribute name="foo" type="xs:decimal" fixed="0.1"/>
the following would be valid?
<xxx foo=".1"/>
(Another way of putting this is: should fixed values be construed as constraining the lexical form of the attribute value, or the value denoted?)
In its teleconference of 2001-02-23, the WG RESOLVED: that foo=".1" ought to be legal when there is a fixed value of "0.1" for a decimal attribute. The specification of fixed values constrains the value, not the lexical form. Changes to the Structures draft are required. Changes to the datatypes draft may be made at editorial discretion.
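The difference between constraining the lexical form and constraining the value can be sketched with Python's decimal module standing in for the decimal value space. This is a minimal sketch; the function name satisfies_fixed is hypothetical, and Python's Decimal accepts some forms (such as exponents) that the decimal type does not.

```python
from decimal import Decimal


def satisfies_fixed(instance_lexical, fixed_lexical, parse=Decimal):
    """Compare the denoted values rather than the strings."""
    return parse(instance_lexical) == parse(fixed_lexical)


if __name__ == "__main__":
    # As strings the two forms differ; as decimal values they are equal.
    print(".1" == "0.1", satisfies_fixed(".1", "0.1"))
```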
The status quo would appear to require processors to be able to support values of minOccurs and maxOccurs with up to 18 decimal digits; it is unlikely that most finite-state automata construction software will actually handle these cases. Would it not be better to define a narrower subset of integer to use for these cases? Possibly 2 raised to the 31st power?
The WG discussed this question at its face to face meeting in Cambridge, 1-2 March 2001.
The implementors in the WG reported that indeed no known implementations are in a position to handle occurrence indicators with 18 decimal digits. For most implementations, however, limiting the occurrence indicators to a 31 bit unsigned integer would not help, since numbers far smaller than 2 to the power 31 (2147483648), or even 10 to the fifth power, would cause a stack overflow or hit some other memory constraint. For a change to make any practical difference, the type would have to be given a maximum value somewhat smaller. This would involve us in trying to find a plausible value for the maximum, which struck the WG as a thankless task better avoided.
Some WG members argued that since we do not require any particular amount of memory or capacity of any application, we have already accepted implicitly that for some legal schemas and for some legal document instances, conforming processors may run out of resources. This is exactly analogous to XML, where it is in practice impossible to build a processor which accepts all and only well-formed documents, since the length of identifiers is unbounded and might exceed the capacity of available storage. Since there are an indeterminate and large number of other reasons a processor might run out of resources, it would be pointless (these WG members argued) to try to prevent that in this one case.
The WG resolved to close the issue without change to the spec.
Noah Mendelsohn, who had raised the question, expressed his satisfaction with the WG resolution of the issue.
SGML and XML 1.0 specify that in a DTD, there may be at most one attribute for a given element type which is declared as an ID attribute. For XML Schema, there are three options:
A straw poll during the WG teleconference of 2001-02-23 showed that almost everyone with a preference preferred the first option, which is what is in the CR spec, but that a large proportion of the WG was uncertain. Knowing that each element has either no ID or exactly one ID may be important for some implementations, or some applications, and many in the uncertain group said they felt uneasy about making such a change without more public discussion. On the other hand, no one could actually cite a reason to rely on that knowledge, or a reason one might want to, or any software that does so. And most of the old SGML hands present confessed that they had never really felt the restriction made a great deal of sense.
But we did not reach consensus.
The WG took up this question again at its face to face meeting in Cambridge, 1-2 March 2001.
After extensive discussion, it became clear that the preponderance of sentiment in the WG was for enforcing the old XML/SGML rule. RESOLVED without dissent: to modify the spec by enforcing the rule against multiple uses of the ID type, both on the schema and on the instance. Abstaining: Software AG. The wording of the constraints is left to the editor, but the WG established after discussion that the following is an acceptable approximation:
Several questions have come up regarding so-called `chameleon include', i.e. inclusions and redefinitions which include or redefine material in schema documents which do not have an identified target namespace.
##other, ##targetNamespace with respect to chameleon include/redefine?
The WG discussed this topic extensively at its Cambridge meeting 1-2 March 2001. Some WG members asserted that late assembly is inherent in the fact that we do not require types to be declared before their first mention in the schema; no WG members argued the contrary. At the conclusion of the discussion, the WG RESOLVED without dissent: to agree that late assembly and late binding shall be rules in XML Schema. The editor was instructed to determine whether circular redefinitions posed a logical contradiction, and to forbid them if so.
There has been continuing discussion regarding the proper name to give the URI type. The WG decided this formally in its meeting of 15 February 2001, but this decision does not seem to have ended the discussion. The WG took up the question again at its meeting of 1-2 March 2001 in Cambridge and agreed to change the name to anyURI to satisfy those who object to uriReference. The name anyURI has the advantage of not being identical to any existing technical term, thus avoiding confusion and complaints that our type is not exactly that defined by RFC 2396.
Several questions arose from the report of the joint task force of XSL and XML Schema to discuss the XML Schema subset of XPath.
Should // be restricted to appear at the beginning, e.g. .//..., or should it be allowed to appear anywhere?
ancestor::*/@a sticks out like a sore thumb -- should it be allowed?
scope()/@a?
The WG considered these questions at its face to face meeting in Cambridge, 1-2 March 2001.
The form which restricts the descendant axis (//) to the beginning of a path was preferred.
The ancestor and scope proposals were not adopted (they appear to be partial solutions to a more general problem, and better omitted until we have a more general approach).
The task force recommendation, as amended, was adopted.