26783 – Parser: How should innerHTML setter on <annotation-xml encoding=application/xhtml+xml> work?

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED MOVED

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	Unsorted
Assignee:	Ian 'Hixie' Hickson
QA Contact:	contributor

URL:	http://software.hixie.ch/utilities/js...

Reported:	2014-09-11 10:57 UTC by Henri Sivonen
Modified:	2019-03-29 19:43 UTC
CC List:	7 users

Description Henri Sivonen 2014-09-11 10:57:57 UTC

Consider the following in Chrome:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3172

The foo element from innerHTML goes into the MathML namespace but from the full parse, it goes into the HTML namespace.

The definition of "adjusted current node" ends up examining a node from outside the current parse, so arguably the adjusted current node doesn't have a start tag token, which is the mechanism through which the encoding attribute special case has been defined.
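For readers following along, the check at issue can be sketched roughly as below. This is a minimal Python illustration that assumes the HTML specification's definition of an HTML integration point; the `StartTag` class and the function name are hypothetical and are not taken from any implementation. Returning `None` marks the case this bug is about: a context node that has no start tag token from the current parse.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"

# Encoding values that allow annotation-xml to contain HTML content.
HTML_ENCODINGS = {"text/html", "application/xhtml+xml"}


@dataclass
class StartTag:
    """Hypothetical stand-in for a tokenizer start tag token."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


def is_html_integration_point(namespace: str, local_name: str,
                              start_tag: Optional[StartTag]) -> Optional[bool]:
    """Return True or False, or None when the answer is undefined."""
    if namespace == SVG_NS:
        return local_name in {"foreignObject", "desc", "title"}
    if namespace == MATHML_NS and local_name == "annotation-xml":
        if start_tag is None:
            # Fragment case: the node was not created by this parse,
            # so there is no token whose attributes could be inspected.
            return None
        encoding = start_tag.attributes.get("encoding", "")
        return encoding.lower() in HTML_ENCODINGS
    return False
```

In a full document parse the token is always available, so the `None` branch can only be reached through fragment parsing.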

Please make the expected behavior less ambiguous in the spec. I was about to extend the APIs of the Validator.nu HTML Parser and Gecko to take the encoding attribute on an annotation-xml context node into account in fragment parsing, but since Blink, probably accidentally, ignores it, now I'm not sure.

CCing davve@opera who wrote the relevant Blink code and David Carlisle to argue which way the spec should go.

Comment 1 David Carlisle 2014-09-12 14:35:28 UTC

I would say that ideally the innerHTML result should match that of the full document parse and take note of the encoding attribute on annotation-xml.

Comment 2 Ian 'Hixie' Hickson 2014-09-22 23:52:20 UTC

Ok I've fixed this by making the fragment parsing mode create a fictional start tag token that the HTML integration point logic can use. It's a bit of a hack, but it's probably good enough. If we ever find we've got more places that need this then I'll make a more elaborate fix.
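As an illustration only, the "fictional start tag token" approach might look like the following Python sketch. The `Element` and `StartTag` classes and both function names are hypothetical, and only the annotation-xml branch of the integration point logic is shown; the actual spec wording in r8805 may differ.

```python
from dataclasses import dataclass, field
from typing import Dict

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
HTML_ENCODINGS = {"text/html", "application/xhtml+xml"}


@dataclass
class Element:
    """Hypothetical DOM element, reduced to what the parser needs."""
    namespace: str
    local_name: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class StartTag:
    """Hypothetical start tag token."""
    name: str
    attributes: Dict[str, str]


def fictional_start_tag(context: Element) -> StartTag:
    # Copy the attributes so the token is a snapshot of the context node.
    return StartTag(context.local_name, dict(context.attributes))


def annotation_xml_permits_html(context: Element) -> bool:
    """Apply the annotation-xml encoding check to the fictional token."""
    if (context.namespace, context.local_name) != (MATHML_NS, "annotation-xml"):
        return False
    token = fictional_start_tag(context)
    return token.attributes.get("encoding", "").lower() in HTML_ENCODINGS
```

Note that the whole attribute set is copied even though only `encoding` is ever read.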

Comment 3 contributor 2014-09-22 23:53:09 UTC

Checked in as WHATWG revision r8805.
Check-in comment: Clarify what the start tag of the context node is when parsing
https://html5.org/tools/web-apps-tracker?from=8804&to=8805

Comment 4 Henri Sivonen 2014-09-30 06:49:40 UTC

I think this is a poor way to fix this. The fix is too general and suggests to implementors that they should copy attributes off the DOM node into the parser, when in reality they need to pass one boolean worth of info. That is, by reading the spec, an implementor ends up writing more code than is necessary to address the issue.

Comment 5 Ian 'Hixie' Hickson 2014-09-30 17:29:55 UTC

How would you fix it?

Comment 6 Henri Sivonen 2014-10-15 07:56:01 UTC

I'd fix this by having explicitly one bit of extra input to the fragment parsing algorithm: the bit being true if the context node is annotation-xml with an HTML-permitting encoding attribute and false otherwise.

This would make it clear that there really is just one bit worth of magic state and not a full arbitrary attribute set worth of magic state.
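A rough Python sketch of what "one bit of extra input" could mean in practice follows. The `FragmentParseInput` type, its field names, and `make_fragment_input` are all hypothetical; they only show the caller reducing the context node to a single boolean before the parser is invoked.

```python
from dataclasses import dataclass
from typing import Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
HTML_ENCODINGS = {"text/html", "application/xhtml+xml"}


@dataclass(frozen=True)
class FragmentParseInput:
    """Hypothetical bundle of everything the fragment parser receives."""
    markup: str
    context_namespace: str
    context_local_name: str
    context_permits_html: bool  # the single extra bit


def make_fragment_input(markup: str, context_namespace: str,
                        context_local_name: str,
                        encoding: Optional[str]) -> FragmentParseInput:
    # The caller, which has the DOM node, computes the bit; the parser
    # never sees the attribute set itself.
    permits = (
        context_namespace == MATHML_NS
        and context_local_name == "annotation-xml"
        and encoding is not None
        and encoding.lower() in HTML_ENCODINGS
    )
    return FragmentParseInput(markup, context_namespace,
                              context_local_name, permits)
```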

Comment 7 Ian 'Hixie' Hickson 2014-10-15 19:03:46 UTC

Where would you use that bit? The reason I did it as I did is that the place that uses it, the definition of an HTML integration point, has no context.

Comment 8 Henri Sivonen 2014-10-16 07:25:43 UTC

(In reply to Ian 'Hixie' Hickson from comment #7)
> Where would you use that bit?

I'd use it upon pushing the element on the stack by making the stack entry have an "is HTML integration point" bit.

Concretely, in the Validator.nu/Gecko implementation:

 * Nodes on the tree builder stack have a bitfield of flags, one of which is "is HTML integration point". For annotation-xml, this bit can go either way. Otherwise, the bitfield depends only on the element namespace and local name.

 * There are no "adjusted current node" checks. Instead, when the context node is foreign, what's "adjusted current node" per spec is simply pushed as the first element on the stack instead of pushing "html" in the HTML namespace there. (Checks for the first node on the stack are done based on the stack position instead of treating "html" in the HTML namespace as an always-present sentinel.)

(Personally, I find the second bullet point above much more elegant than the "adjusted current node" concept plus "html" in the HTML namespace acting as a sentinel.)
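The two bullet points can be sketched in Python as follows. This is not the Validator.nu or Gecko code; `NodeFlags`, `StackNode`, `flags_for`, and `initial_stack` are hypothetical names chosen for illustration, and only the integration point flag is modelled.

```python
from dataclasses import dataclass
from enum import Flag
from typing import List

HTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"


class NodeFlags(Flag):
    NONE = 0
    HTML_INTEGRATION_POINT = 1


@dataclass
class StackNode:
    namespace: str
    local_name: str
    flags: NodeFlags


def flags_for(namespace: str, local_name: str,
              annotation_xml_permits_html: bool = False) -> NodeFlags:
    # Only annotation-xml needs the extra bit; everything else is
    # decided by namespace and local name alone.
    if namespace == SVG_NS and local_name in {"foreignObject", "desc", "title"}:
        return NodeFlags.HTML_INTEGRATION_POINT
    if (namespace == MATHML_NS and local_name == "annotation-xml"
            and annotation_xml_permits_html):
        return NodeFlags.HTML_INTEGRATION_POINT
    return NodeFlags.NONE


def initial_stack(context_namespace: str, context_local_name: str,
                  annotation_xml_permits_html: bool = False) -> List[StackNode]:
    # A foreign context node becomes the first stack entry itself,
    # in place of the usual "html" element in the HTML namespace.
    if context_namespace == HTML_NS:
        return [StackNode(HTML_NS, "html", NodeFlags.NONE)]
    flags = flags_for(context_namespace, context_local_name,
                      annotation_xml_permits_html)
    return [StackNode(context_namespace, context_local_name, flags)]


def current_node_is_integration_point(stack: List[StackNode]) -> bool:
    return bool(stack[-1].flags & NodeFlags.HTML_INTEGRATION_POINT)
```

With this shape, the integration point check reads a flag off the current stack entry and never needs a start tag token.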

Comment 9 Ian 'Hixie' Hickson 2014-11-26 20:03:13 UTC

The second bullet would help bug 27314, certainly. Not sure it would really help here though. Having a bit, in spec terms, is somewhat less clean than the equivalent code, in terms of readability. Let me consider the second bullet for bug 27314 and then see what that does to this bug. Thanks for your patience on this.

Comment 10 Domenic Denicola 2019-03-29 19:43:00 UTC

https://github.com/whatwg/html/issues/4467