This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
As part of discharging RQ-21 (provide regex and/or BNF for all primitive types), define a BNF and/or regex, lexical mapping, and canonical mapping for xsd:string. A proposal for wording to accomplish this is part of an omnibus proposal at http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.omnibus.20050824.html
The proposal for this item, which was included in the omnibus proposal of 24 August 2005, has been split out into its own document: http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.b1902.20050831.html (member-only link). The proposal includes a design question: should the lexical and value spaces of string include only those characters which match the Char production of XML 1.1, or should they include a broader range of UCS characters, perhaps all UCS characters? (The proposal is otherwise pretty unsurprising.)
The proposal put forward in August was approved with amendments at the WG meeting in Edinburgh in September 2005. The amended text was incorporated into the status-quo document on 8 December 2005. The WG decided to continue to define the lexical space of string by reference to the set of legal XML characters, rather than expanding the space to allow Unicode characters not legal in XML. One argument for this result was that some users will legitimately want to ensure that their data can be written out in legal XML. If the lexical space of string were expanded, a second type with the restrictions of XML would be needed, and the same would hold for the entire type hierarchy headed by string. The alignment with XML seemed preferable to having a parallel type hierarchy.
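For illustration only, here is a minimal sketch of what "defined by reference to the set of legal XML characters" could look like as a regex check. The character ranges are the Char productions of XML 1.0 and XML 1.1. The function name is_xsd_string_lexical and the xml_version parameter are hypothetical, and the sketch is not the wording adopted by the WG.

```python
import re

# Char production of XML 1.0:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_STRING = re.compile(
    r"[\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)

# Char production of XML 1.1:
#   [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML11_STRING = re.compile(
    r"[\x01-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)


def is_xsd_string_lexical(text: str, xml_version: str = "1.0") -> bool:
    """Return True if every character of `text` matches Char.

    Hypothetical helper: the name and the `xml_version` switch are
    assumptions made for this sketch, not part of any specification.
    """
    pattern = XML11_STRING if xml_version == "1.1" else XML10_STRING
    return pattern.fullmatch(text) is not None
```

Under this sketch the empty string and ordinary text are accepted, U+0000 is rejected under both versions, and a control character such as U+0001 is accepted only when checking against the XML 1.1 production.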
Although no formal request for closure was made, since the reporter also noted the resolution of this bug over two years ago, I'm marking it closed.