This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The discussion of bug 5157 suggests that there are some aspects of symbol spaces which could usefully be made clearer in the spec. Among them:

(1) Symbol spaces contain expanded names (namespace name + local name pairs), not just local names.

(2) Symbol spaces are used to enforce uniqueness constraints: no two components can have the same name within the same symbol space, so each name within a symbol space uniquely denotes a component.

(3) Symbol spaces are also relevant to QName resolution, although not mentioned explicitly in the rules for it. The resolution of QName references to components involves looking for a component with the given expanded name, within the appropriate symbol space.

(4) Symbol spaces are NOT used, however, in attributing element or attribute instances to particular particles and declarations in the schema. An element with a given expanded name will match (other conditions being propitious) any element declaration with that expanded name in the content model; it does not matter whether that element declaration is global or local. Similarly for attributes.

Proposition (1) appears to contradict the current text of section 2.5, which does its best to suggest that symbol spaces are somehow situated within target namespaces. It says, for example:

    There is a single distinct symbol space within a given target namespace for each kind of definition and declaration component ...

But this description contradicts the usage of the term 'symbol space' in the spec. Section 3.1.1, for example, refers to

    ... equality of names (including target namespaces) within symbol spaces.

If equality of names, including equality of target namespaces, can be tested for 'within' symbol spaces, then there cannot be distinct symbol spaces for top-level elements in distinct target namespaces.

Another example: section 3.11.1 says

    Each constraint declaration has a name, which exists in a single symbol space for constraints.

If there is a single symbol space for names of identity constraints, then it cannot be located within any single target namespace.

So: first, section 2.5 needs to be corrected to agree with the spec's actual usage, and second, if possible the relation of symbol spaces to various name matching tasks (QName resolution, instance attribution, etc.) should be made clearer.

This problem exists both in 1.1 and in 1.0.
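Propositions (1) through (3) can be illustrated with a minimal sketch. It is not taken from the spec or from any processor; the names `ExpandedName`, `SymbolSpaces`, `declare` and `resolve`, and the example namespace URIs, are all hypothetical. It assumes one table per kind of component, keyed by expanded name rather than nested inside a target namespace.

```python
from typing import Dict, NamedTuple, Optional


class ExpandedName(NamedTuple):
    """A namespace name + local name pair (hypothetical helper type)."""
    namespace: Optional[str]  # None models an absent target namespace
    local: str


class SymbolSpaces:
    """Assumed model: one symbol space per kind of component, keyed by expanded name."""

    def __init__(self) -> None:
        self._spaces: Dict[str, Dict[ExpandedName, object]] = {}

    def declare(self, kind: str, name: ExpandedName, component: object) -> None:
        # Uniqueness: a name may denote at most one component per symbol space.
        space = self._spaces.setdefault(kind, {})
        if name in space:
            raise ValueError(f"duplicate {kind} name {name}")
        space[name] = component

    def resolve(self, kind: str, name: ExpandedName) -> object:
        # QName resolution: look up the expanded name in the appropriate space.
        # Raises KeyError if nothing is declared under that kind and name.
        return self._spaces[kind][name]


spaces = SymbolSpaces()
a1 = ExpandedName("http://example.com/ns1", "a")
a2 = ExpandedName("http://example.com/ns2", "a")
spaces.declare("element", a1, "element declaration in ns1")
spaces.declare("element", a2, "element declaration in ns2")  # differs by namespace
spaces.declare("type", a1, "type definition in ns1")         # differs by kind
```

Under this model the target namespace is part of the key, so a single symbol space for elements can hold names from several target namespaces, which is the reading that sections 3.1.1 and 3.11.1 appear to presuppose.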
The XML Schema WG agreed today (21 March 2008) that this needs to be fixed in both 1.0 (see bug 5584) and 1.1 (this bug). To the extent that the distinction matters, we believe that this is a clarification, not a substantive change, both in 1.0 and in 1.1.
In view of the WG's expressed view that this is a task of clarifying the spec's prose rather than changing the rules for conforming processors, I am adding the keyword 'editorial' to this issue. A consequence of this will be that the issue may be dealt with after, not before, the publication of the next working drafts.
I believe that symbol spaces are used to enforce uniqueness, in the sense that (as section 3.1.1 puts it) "multiple copies of components with the same name in the same ·symbol space· must not exist". Or at least, I believe that that is what the spec believes, and what the wg believes the spec to say.

And I believe that the spec and wg believe that (as section 2.5 puts it) "Every complex type definition defines its own local attribute and element declaration symbol spaces."

By what casuistry, then, do we explain that the following complex type definition is legal?

    <complexType name="upa-demo">
      <sequence>
        <element name="a"/>
        <element name="a" minOccurs="0"/>
      </sequence>
    </complexType>

Two element declarations named tns:a, both in the same local symbol space?

I think the way out of this quandary is to say, not that complex type definitions create their own local symbol space for element declarations, but only that the names of local element declarations don't go into the symbol space for top-level declarations.
Regarding the example:

    <complexType name="upa-demo">
      <sequence>
        <element name="a"/>
        <element name="a" minOccurs="0"/>
      </sequence>
    </complexType>

Doesn't this bug just remind us once again that our notion of component identity is murky? Specifically, http://www.w3.org/TR/xmlschema11-1/#dcl.elt.local says:

"If the <element> element information item has <complexType> or <group> as an ancestor, and the ref [attribute] is absent, and it does not have minOccurs=maxOccurs=0, then it maps both to a Particle and to a local Element Declaration which is the {term} of that Particle."

The obvious difference between the two name="a" lines is in the minOccurs, which maps to the particle, not the element declaration. That then begs the question of whether the element declaration that is the term of the respective particles is the "same" or not. Turning the argument around, the statement that "Every complex type definition defines its own local attribute and element declaration symbol spaces." can be taken as at least indirect evidence that the answer is: they are the same.

In any case, this suggests another possible resolution, in addition to the one suggested by MSM. We could attempt to make clear that, at least in cases like this, all local element declaration markup in the transfer syntax that shares a complexType ancestor and that declares elements of the same expanded name does indeed map to a single element declaration. I think this would be my preferred casuistry.

Noah
Noah's suggestion in comment 4 seems not to handle cases like

    <complexType name="upa-demo">
      <sequence>
        <element name="a"/>
        <element name="a" nillable="true" minOccurs="0"/>
      </sequence>
    </complexType>

It would also seem to entail that the following schema document should give rise to a legal schema:

    <schema xmlns="http://www.w3.org/2001/XMLSchema">
      <element name="a"/>
      <element name="a"/>
    </schema>

That would be a feasible rule (although it's rather late for such a dramatic clarification), but all the processors I've tested reject it.
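As a rough illustration of the check that processors seem to apply here, the following sketch counts top-level element declarations by expanded name, using only the Python standard library. The function name `duplicate_top_level_elements` is hypothetical, and the sketch makes simplifying assumptions: it looks at a single schema document, ignores include/import/redefine, and treats only direct children of <schema> as entries in the top-level element symbol space. It says nothing about how any real processor is implemented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XS = "http://www.w3.org/2001/XMLSchema"


def duplicate_top_level_elements(schema_text):
    """Return (target namespace, local name) pairs declared more than once at top level."""
    root = ET.fromstring(schema_text)
    tns = root.get("targetNamespace")  # None when the schema has no target namespace
    counts = Counter(
        (tns, child.get("name"))
        for child in root.findall(f"{{{XS}}}element")  # direct children only
    )
    return [name for name, n in counts.items() if n > 1]


TOP_LEVEL = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="a"/>
  <element name="a"/>
</schema>"""

LOCAL = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="upa-demo">
    <sequence>
      <element name="a"/>
      <element name="a" minOccurs="0"/>
    </sequence>
  </complexType>
</schema>"""

print(duplicate_top_level_elements(TOP_LEVEL))  # [(None, 'a')]
print(duplicate_top_level_elements(LOCAL))      # []
```

The second call returns an empty list because the two local declarations never enter the top-level table, which matches the suggestion that local element names simply stay out of the symbol space for top-level declarations.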
Michael Sperberg-McQueen writes:

> That would be a feasible rule (although it's rather late for such a dramatic
> clarification), but all the processors I've tested reject it.

OK, I'm convinced (I don't know why I thought we allowed that). Thanks.

Noah
> It would also seem to entail that the following schema document should give rise to a legal schema:
>
>     <schema xmlns="http://www.w3.org/2001/XMLSchema">
>       <element name="a"/>
>       <element name="a"/>
>     </schema>
>
> That would be a feasible rule (although it's rather late for such a dramatic clarification), but all the processors I've tested reject it.

They reject it, I think, because they have made the decision to base component identity on the identity of nodes in the schema document. I think that's the only approach that is likely to work in practice, but it's not mandated by the spec. I think our rules for component identity are so weak that a processor could legally construct a schema from the above schema document.
MK wrote:

> I think our rules for component
> identity are so weak that a processor
> could legally construct a schema from
> the above schema document.

That's what I thought. Indeed, I think it's the case that EDC ensures that such folding of local references can always or usually be done. Still, regardless of what may have been possible or desirable in principle, I can't see changing the spec in this area if it would cause most or even all widely deployed implementations to become nonconforming. So, I'm convinced by MSM's argument that constructs like this should be disallowed, if only for the reason he cites.

Noah
A wording proposal for bug 5507 is at http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b5507.html
The change proposal mentioned in comment 9 was adopted with amendments by the XML Schema WG at its telcon today and has been integrated into the status-quo documents.