This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Consider the following in Chrome: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3172 The foo element from innerHTML goes into the MathML namespace, but from the full parse it goes into the HTML namespace. The definition of "adjusted current node" ends up examining a node from outside the current parse, so arguably the adjusted current node doesn't have a start tag token, which is the mechanism through which the encoding attribute special case has been defined. Please make the expected behavior less ambiguous in the spec. I was about to extend the APIs of the Validator.nu HTML Parser and Gecko to take the encoding attribute on an annotation-xml context node into account in fragment parsing, but since Blink, probably accidentally, ignores it, now I'm not sure. CCing davve@opera, who wrote the relevant Blink code, and David Carlisle to argue which way the spec should go.
I would say that ideally the innerHTML result should match that of the full document parse and take note of the encoding attribute on annotation-xml.
Ok I've fixed this by making the fragment parsing mode create a fictional start tag token that the HTML integration point logic can use. It's a bit of a hack, but it's probably good enough. If we ever find we've got more places that need this then I'll make a more elaborate fix.
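For illustration, here is a minimal Python sketch of the idea in the comment above: the HTML integration point check reads the encoding attribute from a start tag token, and fragment parsing supplies a synthesized token for a context element that never had one. Every name here (`Element`, `StartTagToken`, `fictional_start_tag`, `is_html_integration_point`) is hypothetical and not taken from the spec text or from any engine. The list of integration points is as I understand it from the HTML Standard.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"
HTML_ENCODINGS = {"text/html", "application/xhtml+xml"}


def ascii_lower(value: str) -> str:
    """Lowercase only A-Z, for an ASCII case-insensitive comparison."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in value)


@dataclass
class StartTagToken:
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Element:
    namespace: str
    local_name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    # None models a node that did not come from the current parse
    # (hypothetical modelling choice for this sketch).
    start_tag: Optional[StartTagToken] = None


def fictional_start_tag(context: Element) -> StartTagToken:
    """Build a stand-in start tag token from the context element's attributes."""
    return StartTagToken(context.local_name, dict(context.attributes))


def is_html_integration_point(element: Element) -> bool:
    if element.namespace == SVG_NS:
        return element.local_name in {"foreignObject", "desc", "title"}
    if element.namespace == MATHML_NS and element.local_name == "annotation-xml":
        token = element.start_tag
        if token is None:
            return False  # no token to inspect: the ambiguity reported in this bug
        encoding = token.attributes.get("encoding")
        return encoding is not None and ascii_lower(encoding) in HTML_ENCODINGS
    return False


if __name__ == "__main__":
    context = Element(MATHML_NS, "annotation-xml", {"encoding": "text/html"})
    print(is_html_integration_point(context))  # no start tag token yet
    context.start_tag = fictional_start_tag(context)
    print(is_html_integration_point(context))  # token synthesized for the fragment case
```

The sketch copies the whole attribute set into the synthesized token, although only the encoding attribute is ever consulted.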
Checked in as WHATWG revision r8805.
Check-in comment: Clarify what the start tag of the context node is when parsing
https://html5.org/tools/web-apps-tracker?from=8804&to=8805
I think this is a poor way to fix this. The fix is too general and suggests to implementors that they should copy attributes off the DOM node into the parser, when in reality they need to pass one boolean worth of info. That is, by reading the spec, an implementor ends up writing more code than is necessary to address the issue.
How would you fix it?
I'd fix this by having explicitly one bit of extra input to the fragment parsing algorithm: the bit being true if the context node is annotation-xml with an HTML-permitting encoding attribute and false otherwise. This would make it clear that there really is just one bit worth of magic state and not a full arbitrary attribute set worth of magic state.
Where would you use that bit? The reason I did it as I did is that the place that uses it, the definition of an HTML integration point, has no context.
(In reply to Ian 'Hixie' Hickson from comment #7)
> Where would you use that bit?
I'd use it upon pushing the element on the stack, by making the stack entry have an "is HTML integration point" bit. Concretely, in the Validator.nu/Gecko implementation:
* Nodes on the tree builder stack have a bitfield of flags, one of which is "is HTML integration point". For annotation-xml, this bit can go either way. Otherwise, the bitfield depends on the element namespace and local name only.
* There are no "adjusted current node" checks. Instead, when the context node is foreign, what is the "adjusted current node" per spec is simply pushed as the first element on the stack instead of pushing "html" in the HTML namespace there. (Checks for the first node on the stack are done based on the stack position instead of treating "html" in the HTML namespace as an always-present sentinel.)
(Personally, I find the second bullet point above much more elegant than the "adjusted current node" concept plus "html" in the HTML namespace acting as a sentinel.)
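The two bullet points above could be sketched roughly as follows in Python. This is an illustrative sketch only, not the actual Validator.nu or Gecko code: `NodeFlags`, `StackNode`, `flags_for`, `initial_stack_for_fragment` and the `annotation_xml_permits_html` parameter are all hypothetical names, with the last one standing in for the single bit of extra input to the fragment parsing algorithm.

```python
from dataclasses import dataclass
from enum import IntFlag
from typing import List

HTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"


class NodeFlags(IntFlag):
    NONE = 0
    HTML_INTEGRATION_POINT = 1


@dataclass
class StackNode:
    namespace: str
    local_name: str
    flags: NodeFlags


def flags_for(namespace: str, local_name: str,
              annotation_xml_permits_html: bool = False) -> NodeFlags:
    """Flags depend on namespace and local name, except for annotation-xml."""
    if namespace == SVG_NS and local_name in ("foreignObject", "desc", "title"):
        return NodeFlags.HTML_INTEGRATION_POINT
    if (namespace == MATHML_NS and local_name == "annotation-xml"
            and annotation_xml_permits_html):
        return NodeFlags.HTML_INTEGRATION_POINT
    return NodeFlags.NONE


def initial_stack_for_fragment(context_ns: str, context_name: str,
                               annotation_xml_permits_html: bool = False
                               ) -> List[StackNode]:
    """Seed the stack of open elements for a fragment parse."""
    if context_ns == HTML_NS:
        # HTML context: the usual "html" root element goes first.
        return [StackNode(HTML_NS, "html", NodeFlags.NONE)]
    # Foreign context: push a stand-in for the context node itself, so no
    # separate "adjusted current node" lookup is needed.
    flags = flags_for(context_ns, context_name, annotation_xml_permits_html)
    return [StackNode(context_ns, context_name, flags)]


def current_node_is_integration_point(stack: List[StackNode]) -> bool:
    return bool(stack[-1].flags & NodeFlags.HTML_INTEGRATION_POINT)


if __name__ == "__main__":
    stack = initial_stack_for_fragment(MATHML_NS, "annotation-xml",
                                       annotation_xml_permits_html=True)
    print(current_node_is_integration_point(stack))
```

In this shape the caller decides, from the context node's encoding attribute, whether to pass `True`. The parser itself never sees the attribute set.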
The second bullet would help bug 27314, certainly. Not sure it would really help here though. Having a bit, in spec terms, is somewhat less clean than the equivalent code, in terms of readability. Let me consider the second bullet for bug 27314 and then see what that does to this bug. Thanks for your patience on this.
https://github.com/whatwg/html/issues/4467