See also: IRC log
-> http://www.w3.org/XML/XProc/2010/06/17-agenda
Norm: Let's add my HTML/encoding question and drop 2.1 because there's nothing new today.
Henry: No, there's one thing we can talk about wrt 2.1
Accepted.
-> http://www.w3.org/XML/XProc/2010/06/10-minutes
Accepted.
Paul is at risk, he'll dial in if he can.
Some discussion of what the expected processing is for an XHTML document sent as text/html
Alex: If you do this with a Reader in Java, you've already made the encoding choice. On an InputStream, you haven't.
... What processors do here is sniff if the content type isn't specified and work out the encoding from the first 200 bytes or so.
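A minimal Java sketch of the distinction Alex describes, for illustration only. A Reader built with a fixed charset has already committed to an encoding, while an InputStream can still be inspected first. The class name, the regular expression, and the fallback to UTF-8 are assumptions made for this example. It only looks for an XML declaration and ignores byte order marks and HTML meta elements, so it is not what any particular processor does. It assumes Java 9 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodingSniff {
    // Hypothetical pattern: the encoding pseudo-attribute of an XML declaration.
    private static final Pattern XML_ENCODING =
        Pattern.compile("<\\?xml[^>]*encoding=[\"']([A-Za-z0-9._-]+)[\"']");

    /** Peek at up to 200 bytes, then rewind; fall back to UTF-8 (an assumption). */
    static Charset sniff(BufferedInputStream in) throws IOException {
        in.mark(200);
        byte[] head = new byte[200];
        int n = in.readNBytes(head, 0, head.length);
        in.reset();
        // ISO-8859-1 maps each byte to one char, so this decoding cannot fail.
        String prefix = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = XML_ENCODING.matcher(prefix);
        // Charset.forName throws if the declared name is illegal or unsupported.
        return m.find() ? Charset.forName(m.group(1)) : StandardCharsets.UTF_8;
    }

    /** Drain a Reader into a String. */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    /** InputStream path: the encoding choice is deferred until after sniffing. */
    static String decodeSniffed(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        return readAll(new InputStreamReader(in, sniff(in)));
    }

    /** Reader path: the caller has already fixed the encoding. */
    static String decodeFixed(InputStream raw, Charset cs) throws IOException {
        return readAll(new InputStreamReader(raw, cs));
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><p>caf\u00e9</p>"
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodeSniffed(new ByteArrayInputStream(doc)));
        System.out.println(decodeFixed(new ByteArrayInputStream(doc), StandardCharsets.UTF_8));
    }
}
```

With the sample bytes above, the sniffed path recovers the accented character, while the fixed UTF-8 Reader substitutes a replacement character for the byte it cannot decode.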
<ht> http://www.rfc-editor.org/rfc/rfc2854.txt
Henry: I've been looking at RFC 2854, the RFC that currently governs text/html
... Oddly, the RFC makes several observations but doesn't actually seem to say what to do.
Spec exploration ensues
Henry: The final note in 7.1.10.4 is clearly wrong; if there's a charset parameter, it is text.
<ht> Content-Type: text/html; charset=utf-8
<scribe> ACTION: Norm to propose an erratum for the note at the end of 7.1.10.4 to add something like "without a charset" [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action01]
<ht> Or you could have said override-content-type="text/html; charset=utf-8"
Norm: Yes, I could. That might be the easiest solution, in fact.
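To make the charset point concrete, here is a rough Java sketch of pulling the charset parameter out of a Content-Type value such as the one Henry pasted. The class and method names are hypothetical, and the parsing is deliberately naive: it splits on semicolons and does not handle quoted values that contain a semicolon. The rule in treatAsText is only one reading of Henry's remark, not the wording of the spec.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Optional;

class ContentTypeCharset {
    /** Return the charset parameter of a Content-Type value, if there is one. */
    static Optional<Charset> charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("charset")) {
                String value = kv[1].trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                // Charset.forName throws if the name is illegal or unsupported.
                return Optional.of(Charset.forName(value));
            }
        }
        return Optional.empty();
    }

    /** Assumed rule for illustration: a text/* type with a charset is treated as text. */
    static boolean treatAsText(String contentType) {
        String type = contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        return type.startsWith("text/") && charsetOf(contentType).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(treatAsText("text/html; charset=utf-8"));
        System.out.println(treatAsText("text/html"));
    }
}
```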
Some discussion of content transfer encoding.
<ht> For what it's worth, RFC2616 defines 'entity body' as the octets in the message
<ht> Wrong
<ht> "The entity-body is obtained from the message-body by decoding any Transfer-Encoding that might have been applied to ensure safe and proper transfer of the message."
<scribe> ACTION: Norm to propose an erratum for 7.1.10.3 to clarify that "decoded if necessary" applies to Content-Encoding headers. [recorded in http://www.w3.org/2010/06/17-xproc-minutes.html#action02]
Henry: The .svgz documents should allow us to demonstrate the problem pretty quickly.
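A small Java sketch of what "decoded if necessary" could mean for a gzip Content-Encoding, which is how .svgz content is commonly served. The class and method names are hypothetical, only gzip is handled, and the compressed body is generated locally instead of being fetched from a server. It assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class ContentDecoding {
    /** Helper for the demo: gzip some bytes, standing in for a .svgz body. */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    /**
     * Undo a gzip Content-Encoding. Bodies with any other (or no) encoding
     * are returned unchanged. contentEncoding may be null.
     */
    static byte[] decodeBody(byte[] body, String contentEncoding) throws IOException {
        if (contentEncoding == null || !contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return body;
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] svg = "<svg xmlns=\"http://www.w3.org/2000/svg\"/>".getBytes(StandardCharsets.UTF_8);
        byte[] wire = gzip(svg);
        // Without decoding, the body is compressed octets, not parseable XML.
        System.out.println(new String(decodeBody(wire, "gzip"), StandardCharsets.UTF_8));
    }
}
```

If the decoding step is skipped, an XML parser sees the gzip octets and fails, which is presumably the quick demonstration Henry has in mind.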
None heard.
Adjourned.