This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
9.1.4 Character references

Draft document sayeth:

"An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is followed by some text other than a space character, a U+003C LESS-THAN SIGN character (<), or another U+0026 AMPERSAND character (&)."

probably should insert 2nd line as follows:

An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is not a valid character reference, and
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: An ambiguous ampersand is text. A character reference is not text. Therefore an ambiguous ampersand can never be a character reference.
wellll, i guess if you strictly parse the linked definition of "text", then the complete character reference only comprises a single "text" character. nevertheless, the individual characters that follow that initial ampersand within a character reference would otherwise be considered plain text if they weren't in that context. due to overloaded terminology, this logic can get convoluted and doesn't seem immediately obvious to a reader trying to understand the definition.

re-reading the definition for ambiguous ampersand, it still seems to me to include the characters making up a character reference. inserting the phrase "that is not a valid character reference" explicitly disambiguates such a possible misperception and reinforces the sense of the definition. thus i again suggest inserting that phrase.
EDITOR'S RESPONSE:

Status: Did Not Understand Request
Change Description: no spec change
Rationale: I am completely at a loss as to what comment 2 is trying to say.

The concept of ambiguous ampersands is used to restrict what values "text" can have. Its purpose is to make it non-conforming to have an ampersand followed by something that would, when parsed, be confused for a character reference. As such, the only characters that are allowed after & are space characters, "<" characters, and other "&" characters. All other characters, including all the characters that would form a character reference, are not allowed, and thus a & followed by any such character (e.g. "a" or "#") is an ambiguous ampersand.

If we were to _exclude_ characters that formed character references, then this would completely fail to achieve the stated goal. If "&" followed by "gt;" was _not_ an ambiguous ampersand, then there'd be no way to distinguish the text consisting of the four characters "&", "g", "t", ";" from the single character reference "&gt;", and yet both would be legal. This is why ambiguous ampersands are defined as they are.
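The rule described in the rationale above can be illustrated with a minimal Python sketch. It is not taken from the spec or from this thread: the names `SPACE_CHARACTERS` and `ambiguous_ampersands` are hypothetical, and the sketch assumes that every character of the input is being treated as literal text and that a trailing "&" with nothing after it is not ambiguous (the draft wording only speaks of an ampersand "followed by some text").

```python
import html
from typing import List

# HTML5 "space characters": space, tab, line feed, form feed, carriage return.
SPACE_CHARACTERS = " \t\n\f\r"


def ambiguous_ampersands(text: str) -> List[int]:
    """Return the indices of '&' characters that the draft wording would
    call ambiguous, treating every character of `text` as literal text."""
    allowed_followers = SPACE_CHARACTERS + "<&"
    hits = []
    for i, ch in enumerate(text):
        if ch != "&":
            continue
        nxt = text[i + 1 : i + 2]  # empty string when '&' is the last character
        # Assumption: '&' at the very end is followed by no text, so not flagged.
        if nxt and nxt not in allowed_followers:
            hits.append(i)
    return hits


print(ambiguous_ampersands("a & b"))     # [] : '&' followed by a space
print(ambiguous_ampersands("a &gt; b"))  # [2] : '&' followed by 'g'
print(ambiguous_ampersands("AT&T"))      # [2] : '&' followed by 'T'

# A parser decodes the four characters "&gt;" to ">", which is why they
# cannot also be allowed to stand for themselves as literal text.
print(html.unescape("&gt;"))             # >
```

Under these assumptions, "&gt;" written as literal text is flagged, which matches the editor's point: an author who wants the four literal characters has to write "&amp;gt;" instead.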