This is the public part of the Disposition of Comments received during the Last Call for comments on the Character Model for the World Wide Web 1.0, W3C Working Draft 26 January 2001. Each Last Call Comment (LCC) is associated with a Last Call Issue (LCI). Related LCCs may be grouped together into a single LCI. To find the disposition of an LCC, click on the link to the corresponding LCI, then check the status columns. For various reasons, several of the links in this document are Member-only.
Other versions of the Character Model: Latest public version | Latest internal version (Members only)
Related documents: Requirements for String Identity Matching and String Indexing | Minutes of I18N WG meetings (Members only)
Related mail archives: www-i18n-comments
The original comment is linked from the originator's name. Where a single mail contained multiple comments, these have been chopped up into separate mails linked from the Description. Where this is not the case, the links from the originator's name and from the Description are identical. The value of Visibility is either Public (indicating that the comment was sent to the www-i18n-comments list) or Member (indicating the comment was sent to the w3c-i18n-ig list).
The three Status columns are:
The possible values of Accepted in principle are:
The possible values of Type are:
LCI | Status A | Status M | Status C | T | Ref | Description | LCC | Comment |
---|---|---|---|---|---|---|---|---|
LCI-197 | N | - | Y | S | 4.3 | Should search engines normalize? -- Responsibility for Normalization: "[S] [I] A text-processing component that receives suspect text MUST NOT perform any normalization-sensitive operations unless it has first successfully validated the text for normalization, and MUST NOT normalize the suspect text." I understand that some application such as XML processor MUST NOT normalize the suspect text because the normalization can turn a well-formed document to ill-formed. On the other hand, some application such as search engine SHOULD normalize text so that it can find canonically equivalent text. | LCC-221 | comment-221 |
LCI-196 | P | Y | Y | E | 3.7 | Delimiters for character escaping -- Character Escaping: "[S] Explicit end delimiters MUST be provided. Escapes such as \uABCD where the end delimiter is a space or any character other than [01-9A-F] SHOULD be avoided." MUST and SHOULD are mixed here. If the first requirement is MUST, the second must be also MUST. | LCC-220 | comment-220 |
LCI-195 | Y | Y | Y | E | 3.7 | "character data" vs. "text data" -- Character Encoding Identification: "[S] Specifications MUST NOT use heuristics to determine the encoding of data." In what situation, would specifications "determine" the encoding of data? | LCC-218 | comment-218 |
LCI-194 | Y | Y | Y | E | 3.6.2 | Would specifications "determine" the encoding of data? -- Character Encoding Identification: "[S] Specifications MUST NOT use heuristics to determine the encoding of data." In what situation, would specifications "determine" the encoding of data? | LCC-218 | comment-218 |
LCI-193 | Y | Y | Y | E | 3.6.1 | "There is also no ambiguity if data is transferred non-electronically ..." -- Mandating a unique character encoding: "There is also no ambiguity if data is transferred non-electronically and later has to be converted back to a digital representation." If "transferred non-electronically" means that characters are written on paper, there are a lot of ambiguity to determine characters from glyph, like if this space is SPACE U+0020 or NO-BREAK SPACE U+00A0. | LCC-217 | comment-217 |
LCI-192 | Y | Y | Y | E | 3.5 | XML doesn't allow use of full range of Unicode code points and doesn't justify exceptions -- Reference Processing Model: In the first Note in this section, it says "All specifications that derive from the XML 1.0 specification [XML 1.0] automatically inherit this Reference Processing Model." But XML 1.0 is not very good example because it doesn't allow the use of the full range of Unicode code points and it doesn't justify the exceptions. | LCC-216 | comment-216 |
LCI-190 | y | Y | Y | E | 7 | String Indexing needs more examples | LCC-29 | comment-29 |
LCI-188 | Y | Y | Y | S | 4.3 | Issue 5: Responsibilities "Proxy" versus "Recipient" | LCC-214 | comment-214 |
LCI-187 | Y | Y | Y | S | 4.2.2 | Issue 3: Full Normalization as document syntax dependent | LCC-212 | comment-212 |
LCI-186 | Y | Y | Y | S | 4.3 | Issue 2: XPath string-value | LCC-211 | comment-211 |
LCI-185 | Y | Y | Y | E | 3.3 3.5 | Transcoding | LCC-209 | comment-209 |
LCI-184 | Y | Y | Y | E | 3.5 | Characters above U+10FFFF | LCC-208 | comment-208 |
LCI-183 | P | Y | Y | S | 8 | IURIs, URIs, CHARMOD -- See also subsequent mail | LCC-207 | comment-207 |
LCI-181 | Y | Y | Y | S | 4.2 4.3 | State that W3C N11N is required | LCC-204 LCC-213 | comment-204 comment-213 |
LCI-180 | N | - | Y | S | 4 | Normalization vs. encoding layers -- See also subsequent mail | LCC-202 | comment-202 |
LCI-179 | - | - | Y | S | 8 | This issue has been merged with LCI-8 | - | - |
LCI-121 | Y | Y | Y | E | 2 | "... all applicable requirements MUST be satisfied." | LCC-144 | comment-144 |
LCI-104 | Y | Y | Y | S | 4.3 | Critique of reliance on early normalization | LCC-125 | comment-125 |
LCI-102 | N | - | Y | S | 4.3 | Concerns about impact of early normalization | LCC-123 | comment-123 |
LCI-101 | N | - | Y | S | 4.2 | Concerns about NFC | LCC-122 | comment-122 |
LCI-100 | - | - | Y | N | 8 | (Character Encoding in URI References): Discussion of DSig approach | LCC-121 | comment-121 |
LCI-99 | - | - | Y | N | 4.3 | (Responsibility for Normalization): Discussion of DSig approach | LCC-120 | comment-120 |
LCI-98 | - | - | Y | O | 4.2 | U+0Fnn (Tibetan Block) characters | LCC-126 | comment-126 |
LCI-97 | - | - | Y | N | 4.3 | (Early Uniform Normalization): Discussion of DSig approach | LCC-119 | comment-119 |
LCI-96 | Y | Y | Y | S | 3.7 | (Character Escaping): "There SHOULD be only one way to escape a character." -- See miscellany 40 | LCC-118 | comment-118 |
LCI-95 | N | - | Y | S | 3.6.2 | (Private Use Code Points): Disagreement with our approach | LCC-117 | comment-117 |
LCI-94 | - | - | Y | N | 3.6.1 | (Character Encoding Identification): Discussion of DSig approach | LCC-116 | comment-116 |
LCI-93 | Y | Y | Y | E | 3.6 | (Choice and Identification of Character Encodings): Why do we say "For APIs, UTF-16 is more appropriate"? | LCC-115 | comment-115 |
LCI-92 | N | - | Y | E | 3.2 | (Digital Representation of Characters): "the distinction between CEF and CES is not very clear and might merit an example" | LCC-113 | comment-113 |
LCI-91 | Y | Y | Y | E | 3.1.5 | (Units of Collation): "Software developers MUST NOT merely use a one-to-one mapping as their string-compare function ..." | LCC-112 | comment-112 |
LCI-90 | Y | Y | Y | E | 3.1.3 | (Units of Visual Rendering): Define "logical order" | LCC-111 | comment-111 |
LCI-89 | Y | Y | Y | E | 3.1.2 | (Units of a Writing System, and Units of Aural Rendering): Define phoneme and syllabaries -- See also mail from Richard Ishida | LCC-110 | comment-110 |
LCI-88 | - | - | Y | S | 4.3 | This issue has been merged with LCI-85 | - | - |
LCI-75 | - | - | Y | - | 4.2 | This issue has been merged with LCI-47 | - | - |
LCI-74 | - | - | Y | - | 8 | This issue has been merged with LCI-8 | - | - |
LCI-73 | Y | Y | Y | T | Refs | W3C specs need commas between maturity level and date | LCC-82 | comment-82 |
LCI-72 | Y | Y | Y | T | Refs | "Eve Maler Eds." -> "Eve Maler, Eds." | LCC-81 | comment-81 |
LCI-71 | N | - | Y | E | Refs | In the References section, W3C specs could all have publication dates | LCC-80 | comment-80 |
LCI-70 | Y | Y | Y | T | 3.6.1 | "developers and software that tags" -> "developers and software that tag" | LCC-79 | comment-79 |
LCI-69 | Y | Y | Y | E | 3.1.7 | "Text is then defined as" -> "Text is then defined as" | LCC-78 | comment-78 |
LCI-68 | N | - | Y | E | 1.1 | "The Unicode Standard" -> "the Unicode Standard" | LCC-77 | comment-77 |
LCI-67 | Y | Y | Y | T | 1.1 | "target audience of this document are" -> "target audience of this document is" | LCC-76 | comment-76 |
LCI-66 | N | - | Y | E | 1.1 | "Universal Access" -> "universal access" | LCC-75 | comment-75 |
LCI-65 | Y | Y | Y | T | 3.1.2, 5 | "Hiragana and Katakana" vs "katakana and hiragana" | LCC-74 | comment-74 |
LCI-64 | Y | Y | Y | E | 9 | Unicode Consortium's instructions on how to refer to Unicode | LCC-73 | comment-73 |
LCI-63 | Y | Y | Y | E | 3.5 | "Since its early days, the Web has seen the development of a Reference Processing Model." | LCC-72 | comment-72 |
LCI-62 | y | Y | Y | E | 2 | Provide a conformance checklist | LCC-71 | comment-71 |
LCI-59 | Y | Y | Y | E | 3.5 | Please clarify how to handle "control characters" -- See miscellany 8 | LCC-68 LCC-114 | comment-68 comment-114 |
LCI-57 | Y | Y | Y | E | A.3 | This example "appears oversimplified" | LCC-64 | comment-64 |
LCI-56 | Y | Y | Y | T | 8 | "conversion a legal" -> "conversion to a legal" | LCC-63 | comment-63 |
LCI-55 | Y | Y | Y | S | 4.2 | "turning marked-up W3C-normalised text into plain text may produce non-NFC results" | LCC-60 LCC-98 LCC-134 LCC-210 | comment-60 comment-98 comment-134 comment-210 |
LCI-54 | P | Y | Y | E | 4.2 | Discussion of: "Note: Legacy text is always normalized unless it contains escapes which, once expanded, denormalize it." -- See miscellany 9 | LCC-59 | comment-59 |
LCI-53 | N | - | Y | E | 4.2 | Impact of versioning on normalisation | LCC-58 | comment-58 |
LCI-52 | - | - | Y | ? | 4.2 | This issue has been merged with LCI-47 | - | - |
LCI-51 | P | Y | Y | E | 4.2.2 | (W3C-normalized Text): "the parenthetical definition should be removed, along with its application." | LCC-56 | comment-56 |
LCI-50 | N | - | Y | E | 4.1 | Referencing UTR #15 | LCC-55 | comment-55 |
LCI-49 | Y | Y | Y | E | 3.7 | Recommend against unnecessary use of escapes | LCC-54 | comment-54 |
LCI-48 | N | - | Y | E | 3.7 | Use of character escapes in identifiers | LCC-53 | comment-53 |
LCI-47 | Y | Y | Y | S | 4.2 | Entities and normalization | LCC-52 LCC-57 LCC-84 LCC-85 LCC-91 LCC-96 LCC-97 LCC-206 | comment-52 comment-57 comment-84 comment-85 comment-91 comment-96 comment-97 comment-206 |
LCI-46 | Y | Y | Y | E | 3.7 | Recommend the use of hex rather than decimal NCRs -- See note 2 | LCC-51 | comment-51 |
LCI-45 | N | - | Y | E | 3.6.1 | (Character Encoding Identification) Comments | LCC-50 | comment-50 |
LCI-44 | Y | Y | Y | E | 3.6.1 | "say that XML uses a pseudo-attribute called 'encoding' rather than 'charset'" | LCC-49 | comment-49 |
LCI-43 | N | - | Y | E | 3.2 | (Digital Representation of Characters) "Transfer Encoding Syntax is missing" | LCC-48 | comment-48 |
LCI-42 | Y | Y | Y | E | 3 | "code point" vs "code position" | LCC-47 | comment-47 |
LCI-41 | N | - | Y | E | 3 | "Terms such as 'byte' and 'wyde' are left for the reader to guess, likewise for 'octet' ..." | LCC-45 | comment-45 |
LCI-40 | N | - | Y | E | Gen | "There is no definition of terms in the document." | LCC-44 | comment-44 |
LCI-39 | N | - | Y | E | 3.1.7 | List the allowed meaning of 'character' | LCC-43 LCC-94 | comment-43 comment-94 |
LCI-38 | Y | Y | Y | E | 3.1.6 | "when is multiple 'characters' stored in a single 'physical unit of storage'?" -- See note 3 | LCC-42 | comment-42 |
LCI-37 | Y | Y | Y | T | 2 | "All...specification" -> "All...specifications" | LCC-41 | comment-41 |
LCI-36 | N | - | Y | E | 2 | "The terminology (SHALL, ..., OPTIONAL, ...) should come before the conformity clause" | LCC-40 | comment-40 |
LCI-35 | N | - | Y | E | 2 | "The phrase 'MUST NOT' reflects in itself a lack of internationalisation" | LCC-39 | comment-39 |
LCI-34 | N | - | Y | E | 2 | "Conformance" -> "Conformity" | LCC-38 | comment-38 |
LCI-33 | N | - | Y | E | 1.3 | (Notation): Denoting Unicode code points in this specification | LCC-37 | comment-37 |
LCI-32 | Y | Y | Y | E | 3.5 | "For a specification to use the Reference Processing Model does not require that implementations actually use Unicode." | LCC-35 LCC-65 LCC-66 LCC-93 | comment-35 comment-65 comment-66 comment-93 |
LCI-31 | Y | Y | Y | E | 9 | "in synchronism" | LCC-34 | comment-34 |
LCI-30 | Y | Y | Y | E | 7 | "translation of a document from one language to another" | LCC-30 | comment-30 |
LCI-29 | Y | Y | Y | E | 5 | Use of the acronym "GI" | LCC-27 LCC-61 | comment-27 comment-61 |
LCI-28 | Y | Y | Y | E | 6 | "APIs in addition SHOULD NOT specify single character or single encoding-unit arguments." | LCC-32 | comment-32 |
LCI-27 | - | - | Y | Q | 6 | Is DOM Range spec an example of non-numeric substring identification? | LCC-31 | comment-31 |
LCI-26 | - | - | Y | Q | 6 | "Conversion to a common encoding of UCS" | LCC-28 | comment-28 |
LCI-25 | Y | Y | Y | E | 4.2.3 | (Examples) Clarify -- See miscellany 26, miscellany 27, miscellany 28 | LCC-26 | comment-26 |
LCI-24 | Y | Y | Y | E | 4.2 | Normalizing-transcoders -- See miscellany 25 | LCC-25 | comment-25 |
LCI-23 | Y | Y | Y | E | 4.2 | What is the definition of "legacy text"? "Legacy encoding"? -- See miscellany 24 | LCC-24 | comment-24 |
LCI-22 | Y | Y | Y | E | 4.2 | "Unicode encoding form" -- See miscellany 23 | LCC-23 | comment-23 |
LCI-21 | Y | Y | Y | E | 3.6.2 | "Where specifications need to allow the transmission of symbols not in Unicode ..., they MAY define markup for this purpose." -- See miscellany 22 | LCC-22 | comment-22 |
LCI-20 | Y | Y | Y | E | 3.6 | "Receiving software MUST determine the encoding from available information. It MAY recognize as many encodings ... as appropriate. When no charset is provided the receiving software MUST adhere to the default encoding(s) ..." -- See miscellany 20, miscellany 21 | LCC-21 | comment-21 |
LCI-19 | Y | Y | Y | E | 3.2 | Use of "units of encoding" and "unit" -- See miscellany 15 | LCC-20 | comment-20 |
LCI-18 | Y | Y | Y | E | 3.2 | Use of "encoding" and "character encoding" -- See miscellany 17, miscellany 18, miscellany 19 | LCC-19 | comment-19 |
LCI-17 | Y | Y | Y | E | 3.1.5 | Collation units vs characters -- See miscellany 14 | LCC-18 | comment-18 |
LCI-16 | Y | Y | Y | E | 3.1.4 | "it is not the case that keystrokes and input characters correspond one-to-one" | LCC-17 | comment-17 |
LCI-15 | N | - | Y | S | 2 | "All W3C specifications [have to \| MUST] conform", "all applicable requirements MUST be satisfied" -- See miscellany 13 -- Other part(s) of this item are now in LCI-121 | LCC-16 LCC-109 | comment-16 comment-109 |
LCI-14 | y | Y | Y | E | Gen | Need more examples, and more explanations | LCC-15 | comment-15 |
LCI-9 | - | - | Y | O | 3.1.3 | Directionality of numbers | LCC-9 | comment-9 |
LCI-8 | P | Y | Y | | 8 | (Character Encoding in URI References) General issues | LCC-7 LCC-8 LCC-33 LCC-62 LCC-83 LCC-99 LCC-201 | comment-7 comment-8 comment-33 comment-62 comment-83 comment-99 comment-201 |
LCI-7 | - | - | Y | E | 8 | This issue has been merged with LCI-8 | - | - |
LCI-6 | Y | Y | Y | E | 3.1.6 | Use of "bytes" | LCC-6 LCC-46 LCC-95 | comment-6 comment-46 comment-95 |
LCI-5 | Y | Y | Y | E | 3.2 | Use of "ISO 8859-1" | LCC-5 | comment-5 |
LCI-1 | - | - | Y | E | Gen | This collection of editorial points has been moved elsewhere | - | - |