This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 16687 - iso-2022-kr decoder feedback
Summary: iso-2022-kr decoder feedback
Status: RESOLVED INVALID
Alias: None
Product: WHATWG
Classification: Unclassified
Component: Encoding
Version: unspecified
Hardware: PC Windows 3.1
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+encodingspec
URL:
Whiteboard:
Keywords:
Depends on: 20599
Blocks:
Reported: 2012-04-10 16:28 UTC by Anne
Modified: 2013-08-23 10:36 UTC
CC List: 2 users

See Also:


Attachments

Description Anne 2012-04-10 16:28:56 UTC
Are you sure \n (0x0A) should switch to ASCII? (Same comment as for ISO-2022-JP.)
Comment 1 pub-w3 2012-05-06 13:28:44 UTC
The following illustrates a number of ISO-2022-KR encoding errors:

<p>Space (' ') between two-byte sequences:  \x0EVP VPVP  VPVP VP\x0F
<p>Newline ('\\n') between two-byte sequences:  \x0EVP\nVPVP\n\nVPVP\nVP\x0F
<p>Space (' ') inside two-byte sequences:  \x0EVPV PVP VPVP\x0F
<p>Newline ('\\n') inside two-byte sequences:  \x0EVPV\nPVP\nVPVP\x0F

<p>Aligned escape sequence: \x0EVPVP\x1B\$)CVP\x0F
<p>Misaligned escape sequence: \x0EVPV\x1B\$)CPVP\x0F

<p>Aligned shift in: \x0EVPVP\x0FVP\x0F
<p>Misaligned shift in: \x0EVPV\x0FPVP\x0F

<p>Aligned shift out:    \x0EVPVP\x0EVP\x0F
<p>Misaligned shift out: \x0EVPV\x0EPVP\x0F

<p>Incomplete escape sequence:  VP\x1BVP\x1B\$VP\x1B\$)VP\x1B\$)CVP

<p>Aligned incomplete escape sequence: \x0EVPVP\x1B\$VPVP\x0F
<p>Misaligned incomplete escape sequence: \x0EVPV\x1B\$PVPVP\x0F
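
For readers who want to reproduce a few of these cases as raw bytes, here is a minimal Python sketch. It assumes that `\$` in the listing stands for a literal `$` and that `VP` is simply the byte pair 0x56 0x50; the helper `two_byte_run` and the dictionary keys are hypothetical names introduced for illustration, not part of the original test file.

```python
# Control bytes used by ISO-2022-KR.
ESC = b"\x1b"
SO = b"\x0e"   # shift out: enter two-byte mode
SI = b"\x0f"   # shift in: return to one-byte (ASCII) mode
DESIGNATOR = ESC + b"$)C"  # the escape sequence ESC '$' ')' 'C'


def two_byte_run(payload: bytes) -> bytes:
    """Hypothetical helper: wrap a payload in SO ... SI."""
    return SO + payload + SI


# Three of the cases from the listing, rebuilt as bytes (assumed reading).
cases = {
    "space between": two_byte_run(b"VP VPVP  VPVP VP"),
    "misaligned escape": two_byte_run(b"VPV" + DESIGNATOR + b"PVP"),
    "misaligned shift in": two_byte_run(b"VPV" + SI + b"PVP"),
}

for name, data in cases.items():
    # Print each case as space-separated hex so the control bytes are visible.
    print(f"{name}: {data.hex(' ')}")
```

Printing the hex form makes it easy to see where 0x0E, 0x0F and 0x1B fall relative to the two-byte boundaries.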


Testing in IE (no differences between IE6 and IE9), Opera, Safari and Firefox gives the following results:


Only Opera gets out of sync when a space appears between two-byte sequences.

Firefox and Opera do switch to ASCII when a newline is found in two-byte mode. IE and Safari, however, remain in two-byte mode, and the two share the same error handling.

All browsers recognise a misaligned shift-in (i.e., 0x0F as a trail byte in two-byte mode as a way of switching to one-byte mode). The only divergence is that IE eats the following byte.

A misaligned shift-out or a misaligned escape sequence (ESC '$' ')' 'D') is ignored by IE.  Other browsers misalign/realign.

An incomplete escape sequence in ASCII mode is converted to characters byte by byte in IE and Firefox, starting with an escape character, and with no U+FFFD being inserted.  Perhaps this should not be an error at all.

An incomplete escape sequence in 2-byte mode makes Opera switch to ASCII mode.  All other browsers stay in 2-byte mode and have essentially the same error handling (at least for the example above).
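
The two newline policies described above can be compared with a toy tokenizer. This is only a sketch under stated assumptions: it does not model any browser's actual error handling, it ignores escape sequences entirely, and in the "stay in two-byte mode" case it treats the newline as an ordinary byte, which is a simplification. The function name `tokenize` and the flag `newline_resets` are hypothetical.

```python
SO, SI, LF = 0x0E, 0x0F, 0x0A


def tokenize(data: bytes, newline_resets: bool) -> list:
    """Split bytes into ("ascii", b) and ("pair", lead, trail) tokens (toy model)."""
    out = []
    two_byte = False
    lead = None  # pending lead byte while in two-byte mode
    for b in data:
        if b == SO:
            two_byte, lead = True, None
        elif b == SI:
            # Shift-in returns to one-byte mode even when misaligned;
            # a pending lead byte is dropped.
            two_byte, lead = False, None
        elif two_byte and b == LF and newline_resets:
            # Policy A: a newline in two-byte mode switches back to ASCII.
            two_byte, lead = False, None
            out.append(("ascii", b))
        elif two_byte:
            # Policy B falls through here: the newline is paired like any byte.
            if lead is None:
                lead = b
            else:
                out.append(("pair", lead, b))
                lead = None
        else:
            out.append(("ascii", b))
    return out


sample = b"\x0eVP\nVP\x0f"
print(tokenize(sample, newline_resets=True))
print(tokenize(sample, newline_resets=False))
```

With `newline_resets=True` the second `VP` comes out as two ASCII bytes; with `newline_resets=False` the newline is consumed as a lead byte and the remaining bytes fall out of alignment, which is the kind of divergence the test cases are meant to expose.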
Comment 2 pub-w3 2012-05-06 13:31:04 UTC
Typo:  the escape sequence is ESC '$' ')' 'C'.
Comment 3 Anne 2013-08-23 10:36:01 UTC
This is obsolete now.