5165 – Editorial: numbering of rules and constraints

This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Summary: Editorial: numbering of rules and constraints

Status:	CLOSED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Structures: XSD Part 1
Version:	1.1 only
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	C. M. Sperberg-McQueen
QA Contact:	XML Schema comments list

Whiteboard:	presentation cluster
Keywords:	editorial, noFurtherAction

Reported:	2007-10-09 00:07 UTC by Michael Kay
Modified:	2010-11-10 17:35 UTC
CC List:	2 users

Description Michael Kay 2007-10-09 00:07:49 UTC

Bug #5152 reminds me to make a comment that has been at the back of my mind for a long time: it would be very much easier on the reader if validation rules, constraints, information set contributions and similar pseudo-sections became regular sections numbered according to the place in the document structure where they occur. This would make it easier to refer to them when discussing the spec (for example, when raising comments on a draft), and when reading the spec, especially by following links, it would reduce the sense of disorientation felt when you have no idea where in the document you have landed. It would ensure that the rules appear in the table of contents, and for those who like to use printed copies of the spec on paper, it would make the links much easier to follow.

Comment 1 C. M. Sperberg-McQueen 2008-01-04 01:31:49 UTC

Just to make sure I follow correctly: the proposal is that,
effectively, wherever the document now has a validation rule,
constraint on schema, etc., we should wrap it in a subsection?

Consider, for example, section 3.4.6 Constraints on Complex Type
Definition Schema Components, which currently has the following
structure

section 3.4.6
  title: Constraints on Complex Type Definition Schema Components
  para: All complex type definitions ...
  const: Complex Type Definition Properties Correct
  const: Derivation Valid (Extension)
  para: A complex type T is a valid extension ...
  const: Derivation Valid (Restriction, Complex)
  note: Valid restriction involves both a subset relation on ...
  const: Content type restricts
  note: To restrict a complex type definition ...
  note: To restrict away a local element declaration ...
  para: The following constraint defines a relation appealed to
    elsewhere in this specification.
  const: Type Derivation OK (Complex)
  note: This constraint is used to check ...
  note: The wording of clause 2.1 above appeals to a notion of
    component identity ...
  note: When a complex type definition S is said to be ...

(where 'const' is short for 'constraintnote')

If I understand your proposal correctly it would be to give the
section a structure something like this:

section 3.4.6
  title: Constraints on Complex Type Definition Schema Components
  para: All complex type definitions ...

  section 3.4.6.1
    title: Complex Type Definition Properties Correct
    const: Complex Type Definition Properties Correct

  section 3.4.6.2
    title: Derivation Valid (Extension)
    const: Derivation Valid (Extension)
    para: A complex type T is a valid extension ...

  section 3.4.6.3
    title: Derivation Valid (Restriction, Complex)
    const: Derivation Valid (Restriction, Complex)
    note: Valid restriction involves both a subset relation on ...

  section 3.4.6.4
    title: Content type restricts
    const: Content type restricts
    note: To restrict a complex type definition ...
    note: To restrict away a local element declaration ...

  section 3.4.6.5
    title: Type Derivation OK (Complex)
    para: The following constraint defines a relation appealed to
      elsewhere in this specification.
    const: Type Derivation OK (Complex)
    note: This constraint is used to check ...
    note: The wording of clause 2.1 above appeals to a notion of
      component identity ...
    note: When a complex type definition S is said to be ...

The placement of paragraphs and notes which now occur between
constraint notes will require a very little judgement on the part of
those making the wording proposal.
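
The mechanical part of this restructuring can be sketched roughly as follows. This is only an illustration in Python, not anything taken from the spec's build tooling: the (kind, text) tuples, the Subsection class and the wrap_constraints function are all hypothetical names. The sketch uses the naive rule that anything following a constraint belongs to that constraint's subsection, so an introductory paragraph that really belongs to the next constraint would be misplaced; that is the judgement call mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Subsection:
    """A hypothetical wrapper for one named constraint and its commentary."""
    number: str
    title: str
    items: list = field(default_factory=list)


def wrap_constraints(section_number, items):
    """Split a flat list of (kind, text) items into one subsection per 'const'.

    Items before the first constraint stay at the parent level; every later
    non-constraint item is attached to the most recent constraint.
    """
    intro = []
    subsections = []
    for kind, text in items:
        if kind == "const":
            number = f"{section_number}.{len(subsections) + 1}"
            subsections.append(Subsection(number, text, [(kind, text)]))
        elif subsections:
            subsections[-1].items.append((kind, text))
        else:
            intro.append((kind, text))
    return intro, subsections


# A shortened version of the section 3.4.6 outline used in this comment.
flat = [
    ("para", "All complex type definitions ..."),
    ("const", "Complex Type Definition Properties Correct"),
    ("const", "Derivation Valid (Extension)"),
    ("para", "A complex type T is a valid extension ..."),
    ("const", "Derivation Valid (Restriction, Complex)"),
    ("note", "Valid restriction involves both a subset relation on ..."),
]

intro, subsections = wrap_constraints("3.4.6", flat)
for sub in subsections:
    print(sub.number, sub.title)
```

Each resulting subsection contains exactly one named constraint, and its title is simply the constraint's name.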

From your remarks in http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2008Jan/0001.html
(member-only link), I think that this is indeed what you have in mind.

If this is what you are proposing, I endorse it so heartily I can
hardly understand why we haven't done it before.

Comment 2 Michael Kay 2008-01-04 10:15:31 UTC

>Just to make sure I follow correctly: the proposal is that, effectively, wherever the document now has a validation rule, constraint on schema, etc., we should wrap it in a subsection?  

Yes, in essence. Though I haven't tried to work out what difficulties it might cause.

The problem with the conventional subsection structure of 2.1, 2.1.1, 2.1.2 etc. is that the only place for text that belongs at the 2.1 level, because it doesn't relate to a particular subsection, is before all the subsections. There are two ways that one can get around this: one is to make the subsidiary components "boxed components" (like figures, tables, equations, boxed notes) that allow a return to the parent level of hierarchic structure when they finish. The other way is by using out-of-line constructs (footnotes, endnotes, appendices). Both of these might be viable here: in reading the document, one so often arrives at a rule or constraint definition via a remote cross-reference that it wouldn't do much harm to have all the rules and constraints appear together in a chapter of their own. This isn't that dissimilar from the ISO convention of putting all the definitions together in one chapter.

In effect we are currently using "boxed components", but without a clear typographical indication of where the box ends, and without any mechanism for identifying the component other than a (usually rather long-winded) name, which gives you no clue whereabouts in the document the component is to be found.

The lack of structure can actually cause confusion. Looking at 3.4.6 which you cite, the introductory prose refers to "the following constraints". Which constraints is it referring to? All those in section 3.4.6? Or only those up to the next prose introduction, which occurs before Type Derivation OK (Complex)?

One advantage (and cost) of moving the constraints out-of-line would be that it would force a clearer definition of the parameters of each constraint. At the moment there are a number of styles. Look again at 3.4.6:

Complex Type Definition Properties Correct starts by talking of "the properties of a complex type definition" (which one?).

Derivation Valid (Extension) starts by talking about "the {derivation method}" (of what?)

Type Derivation OK (Complex) uses the style "For a complex type definition (call it D, for derived) to be validly derived"

Behind these stylistic differences there is actually something more fundamental: some of the constraints are merely definitions of properties that a component may or may not have (These are often introduced with the phrase "The following constraints define relations appealed to elsewhere in this specification."), while some of them describe rules that components must satisfy in order to be valid. 

Having rambled around this, I think I would try to go for the "subsection" style as you suggest. Try to make the subsections free-standing, so their meaning doesn't depend on context: constraints should be in the form "Every complex type definition T must satisfy all of the following", or "Definition: a complex type P is a _valid restriction_ of a complex type Q if all the following conditions are true".

Comment 3 Pete Cordell 2008-01-04 12:21:39 UTC

If you do go with each constraint having its own sub-section (which I am highly in support of), can I suggest that non-trivial rule sets have some non-normative preamble that summarizes the overall intent of the rules.

Comment 4 Dave Peterson 2008-01-06 21:12:24 UTC

(In reply to comment #2)

> Behind these stylistic differences there is actually something more
> fundamental: some of the constraints are merely definitions of properties that
> a component may or may not have (These are often introduced with the phrase
> "The following constraints define relations appealed to elsewhere in this
> specification."), while some of them describe rules that components must
> satisfy in order to be valid. 

>                                   constraints should be in the form "Every
> complex type definition T must satisfy all of the following", or "Definition: a
> complex type P is a _valid restriction_ of a complex type Q if all the
> following conditions are true"

I don't understand why a definition alone should ever be called a constraint.  Why aren't they presented simply as formal definitions?

Comment 5 Michael Kay 2008-01-06 22:19:48 UTC

I agree: calling these things constraints is a misnomer.

Comment 6 C. M. Sperberg-McQueen 2008-01-11 03:48:19 UTC

W.r.t. comment #2:  I think you are right that in principle there is some danger
that there might be text between two constraints that logically belongs not
in either constraint's sub-section but at the next level up.  In practice, it's
suggestive that the random example taken in comment #1 does not in fact have any
such text:  there are notes and paragraphs between constraints, but they are
all commentary on the constraints and belong with them in the sub-sections.  
Without having performed any census, I suspect that all (or at least most) 
sections containing constraints follow this pattern, if only because there 
tends to be so little connective prose in them.

W.r.t. comment #3:  this editor agrees: almost every constraint would benefit
from an introductory paragraph saying what it does, why it's there, and 
perhaps when it does and doesn't apply.

W.r.t. comment #4: I have the same question, but of course you and I were
both members of the Working Group that wrote the 1.0 spec, so we can 
equally direct the question back at ourselves:  why did we vote to move
forward with a spec in which critical concepts appear not as the content of
the definition of a term but as the content of a 'constraint'?  (QName
resolution provides a convenient example.)  You and I are as responsible as 
anyone.

For what it's worth, I think the 1.0 text appears to reflect a particular
way of thinking about validation as the construction of a coherent story
about the validity of the document.  For the story to count as coherent,
this and this and this must be true.  Definitions and constraints are talked
about as if they could only ever be of interest in the course of proving that
a particular PSVI provides a coherent story; the term 'must' makes no sense
if we regard ourselves as describing what QNames in schema documents denote
(they denote X and not Y, there is no 'must' about it) -- but it makes more
sense if we view ourselves as describing what has to be true in order for the
story to make sense and be an officially blessed story about the schema
validity of a particular instance against a particular schema.  In that
situation, what must be true of a particular QName?  It *must* resolve to
(denote) a particular component in the schema.

I am not entirely certain of this analysis, and I'm not sure that it explains
everything in the wording of 1.0, but it does at least tend to explain how
it could conceivably come about that a definition might be formulated as if
it were a constraint.

I think the spec would be clearer if all the definitions in disguise were
reformulated as definitions, but that might in some cases lead to the 
complete disappearance of a constraint; it's not clear to me that the WG
will be willing to countenance such changes in all cases.

Comment 7 David Ezell 2008-01-25 19:27:59 UTC

Out of this ponderous discourse, the WG was able to agree to the following modest points:
1) use subsections so that there is at most one named constraint in a subsection.
2) make it a practice (not a requirement) to have introductory prose for each constraint.

The WG so instructs the editors.

Comment 8 C. M. Sperberg-McQueen 2008-01-29 02:16:16 UTC

One note on what may be a minor point.  Under the current stylesheets,
the tables of contents are limited in depth:  the main toc shows two
levels, and most second-level divisions which have children have a 
local toc showing those children.  

But unless the stylesheets are changed, the new sections wrapped around
validation rules and constraints on schemas (for example) will not appear
in either kind of table of contents.  They will be at the fourth level
(div4 in the source).  

It would probably be worthwhile to include them in the local table of
contents, and I will spend some time seeing how easy it is to change
the depth of coverage in the local tocs.  But I am currently not willing to
make any promises.
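
To illustrate the depth problem, here is a small Python sketch. It is not the stylesheet logic itself, only a model of the behaviour described above; the function names and the sample section numbers are assumptions made for the illustration.

```python
def depth(number):
    """Nesting level of a section number: '3.4.6.1' is at level 4."""
    return number.count(".") + 1


def main_toc(sections, max_depth=2):
    """Main table of contents: only the top max_depth levels."""
    return [n for n in sections if depth(n) <= max_depth]


def local_toc(sections, parent, levels=1):
    """Local table of contents: descendants of parent, `levels` deep."""
    prefix = parent + "."
    return [
        n for n in sections
        if n.startswith(prefix) and depth(n) - depth(parent) <= levels
    ]


sections = ["3", "3.4", "3.4.6", "3.4.6.1", "3.4.6.2"]

print(main_toc(sections))                    # fourth-level sections are absent
print(local_toc(sections, "3.4"))            # children only: still absent
print(local_toc(sections, "3.4", levels=2))  # one level deeper: now listed
```

In this model, letting the local table of contents cover one more level is enough to make the new fourth-level sections visible.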

Comment 9 C. M. Sperberg-McQueen 2008-02-08 23:41:39 UTC

The WG today accepted a wording proposal that partially resolves this
issue:
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b5165.200802.html (member-only link).

The changes have been introduced into the status-quo document, but I'm
leaving this record open in order to track further change proposals
related to this issue.

Comment 10 C. M. Sperberg-McQueen 2009-10-10 01:10:57 UTC

In August and September 2009 the XML Schema working group performed
triage on the remaining open issues in a WBS poll [1], whose results
are summarized at [2] and accepted formally at [3]. In the course of
that triage we decided to close this issue without further action.
Since this is a WG issue, not an external one, I'm going both to mark
it resolved and to close it.  Since the issue was at least partially resolved,
I'm marking it fixed.  It's no longer clear to me what further changes we
at one time envisaged.

[1] http://www.w3.org/2002/09/wbs/19482/200908CRissues/
[2] http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/2009Sep/0005.html
[3] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2009Sep/att-0005/2009-09-11telcon.html#item04
(all links member-only)

Comment 11 David Ezell 2010-11-10 17:35:01 UTC

The WG reported this bug as FIXED on 2009-10-10.  We are closing this bug
as requiring no further work.  If there are issues remaining, you can reopen
this bug and enter a comment to indicate the problem.  Thanks very much for the
feedback.