This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The specification states (Section 20 in 2.0; Section 23 in 2.1) about the byte-order-mark property:

"The default value depends on the encoding used. If the encoding is UTF-16, the default is yes; for UTF-8 it is implementation-defined, and for all other encodings it is no."

Surely if it defaults to yes for "UTF-16", it should also default to yes for "utf-16BE" and "utf-16LE" (as one of these is equivalent to it). The specification also does not state that the encoding name should be compared ignoring case (although this is reasonably obvious). One might also argue that the default value should be either true or implementation-defined for "utf-32" as well.
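The rule quoted from the specification can be sketched as a small lookup. This is only an illustration: the function name `default_byte_order_mark` is hypothetical, and the case-insensitive comparison is an assumption that the specification does not state.

```python
def default_byte_order_mark(encoding: str) -> str:
    """Return the default byte-order-mark value per the quoted rule.

    Assumption: encoding names are compared ignoring case; the
    specification text does not say this explicitly.
    """
    name = encoding.strip().lower()
    if name == "utf-16":
        return "yes"
    if name == "utf-8":
        return "implementation-defined"
    # Every other encoding, including utf-16be, utf-16le and utf-32.
    return "no"


for enc in ("UTF-16", "utf-16BE", "utf-16LE", "UTF-8", "utf-32"):
    print(enc, "->", default_byte_order_mark(enc))
```

Read literally, the rule gives "no" for "utf-16BE", "utf-16LE" and "utf-32", which is the behaviour this report questions.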
I have been reading what Section 3.10 "Unicode Encoding Schemes" of Unicode 5.2 has to say about the UTF-16LE, UTF-16BE and UTF-16 encoding schemes.[1] It turns out that the UTF-16 encoding scheme is not equivalent to simply choosing one of UTF-16LE or UTF-16BE.

My understanding is that, according to Unicode 5.2, the byte order mark is only used at the start of the encoded byte sequence in the UTF-16 encoding scheme, not in either UTF-16LE or UTF-16BE. The byte sequence FE FF at the start of a file or what-have-you would be interpreted as a zero-width no-break space in something that was known to be encoded in the UTF-16BE encoding scheme.

For UTF-32, Unicode 5.2 says the byte order mark is optional. Changing the default to true could break existing implementations. Changing the default to implementation-defined wouldn't harm existing implementations, but I think it could have a slight impact on interoperability if some implementations chose a default byte-order-mark value of true for UTF-32.

As an aside, I know far more about the distinction between Unicode character encoding schemes and Unicode character encoding forms than I did when I woke up this morning. :)

[1] http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf#G7404
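The distinction between the encoding schemes can be observed with Python's built-in codecs, which happen to follow the same convention: the `utf-16` codec writes and consumes a byte order mark, while `utf-16-be` and `utf-16-le` do neither. This is a minimal sketch for illustration only; it says nothing about how any particular serializer behaves.

```python
import codecs

text = "A"

# UTF-16 encoding scheme: Python prepends a BOM in the platform's byte order.
with_bom = text.encode("utf-16")
print(with_bom.startswith(codecs.BOM_UTF16))

# UTF-16BE / UTF-16LE: the label fixes the byte order, so no BOM is written.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")
print(be, le)

# The bytes FE FF 00 41, decoded under two different labels.
data = b"\xfe\xff\x00A"

# As UTF-16, FE FF is a byte order mark and is consumed.
as_utf16 = data.decode("utf-16")

# As UTF-16BE, FE FF is the character U+FEFF (zero-width no-break space).
as_utf16be = data.decode("utf-16-be")

print(repr(as_utf16), repr(as_utf16be))
```

The same four bytes decode to one character under the UTF-16 label and to two under UTF-16BE, which is why emitting a byte order mark by default for UTF-16BE or UTF-16LE would change the content of the output.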
The WG decided, following the reasoning of Henry's response, to make no change to the specification. Please feel free to reopen if you think we have missed something.
This is an interesting subtlety that I had not appreciated. I completely agree with Henry's reasoning (and hence the Working Group's decision) and am marking the bug closed.