Here is an example of what I believe we said happens for wrapping bidi text. This is purely an attempt to clarify a telecon discussion - it is not researched!
Original logically ordered sequence of characters:
0646: ن ARABIC LETTER NOON |
0634: ش ARABIC LETTER SHEEN |
0627: ا ARABIC LETTER ALEF |
0637: ط ARABIC LETTER TAH |
0020: SPACE |
0627: ا ARABIC LETTER ALEF |
0644: ل ARABIC LETTER LAM |
062A: ت ARABIC LETTER TEH |
062F: د ARABIC LETTER DAL |
0648: و ARABIC LETTER WAW |
064A: ي ARABIC LETTER YEH |
0644: ل ARABIC LETTER LAM |
060C: ، ARABIC COMMA |
0020: SPACE |
0057: W LATIN CAPITAL LETTER W |
0033: 3 DIGIT THREE |
0043: C LATIN CAPITAL LETTER C |
[means "Internationalization Activity, W3C"]
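To make the logical order concrete, here is a minimal sketch that rebuilds the string from the code points listed above and prints each character's Unicode name. The names `CODE_POINTS` and `describe` are made up for this illustration; only the standard `unicodedata` module is assumed.

```python
import unicodedata

# Code points in logical (typing) order, copied from the list above.
CODE_POINTS = [
    0x0646, 0x0634, 0x0627, 0x0637, 0x0020,
    0x0627, 0x0644, 0x062A, 0x062F, 0x0648, 0x064A, 0x0644, 0x060C, 0x0020,
    0x0057, 0x0033, 0x0043,
]

# The string as stored in memory, before any reordering for display.
text = "".join(chr(cp) for cp in CODE_POINTS)


def describe(s):
    """Return one 'XXXX: NAME' entry per character, in logical order."""
    return [f"{ord(ch):04X}: {unicodedata.name(ch)}" for ch in s]


for entry in describe(text):
    print(entry)
```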
Establish the possible break opportunities according to UAX #14 (I haven't checked this, so either take it with a pinch of salt or correct it for me):
0646: ن ARABIC LETTER NOON |
0634: ش ARABIC LETTER SHEEN |
0627: ا ARABIC LETTER ALEF |
0637: ط ARABIC LETTER TAH |
0020: SPACE |
Break opportunity |
0627: ا ARABIC LETTER ALEF |
0644: ل ARABIC LETTER LAM |
062A: ت ARABIC LETTER TEH |
062F: د ARABIC LETTER DAL |
0648: و ARABIC LETTER WAW |
064A: ي ARABIC LETTER YEH |
0644: ل ARABIC LETTER LAM |
060C: ، ARABIC COMMA |
0020: SPACE |
Break opportunity |
0057: W LATIN CAPITAL LETTER W |
0033: 3 DIGIT THREE |
0043: C LATIN CAPITAL LETTER C |
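The break opportunities marked above can be approximated with a deliberately simplified rule: a break is allowed before any non-space character that directly follows a space. This is only a sketch of that one rule, not an implementation of the UAX line-breaking rules, and the function name `break_opportunities` is invented for the example.

```python
def break_opportunities(text):
    """Return each index i such that a line may break just before text[i].

    Simplifying assumption: the only break opportunities are those that
    follow a space. Real line breaking has many more rules.
    """
    return [
        i for i in range(1, len(text))
        if text[i - 1] == " " and text[i] != " "
    ]


# The example phrase in logical order.
text = (
    "\u0646\u0634\u0627\u0637 "
    "\u0627\u0644\u062A\u062F\u0648\u064A\u0644\u060C "
    "W3C"
)

print(break_opportunities(text))
```

For this phrase the two indices returned correspond to the second ALEF and the LATIN CAPITAL LETTER W, matching the two "Break opportunity" rows in the list.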
Determine, through shaping algorithms, the actual glyphs to be used in the phrase, and their widths and any kerning, etc. Here is a rendered version of the text all on one line to aid understanding.
نشاط التدويل، W3C
Here is a graphic for those who don't have the Traditional Arabic font:
(I'm making up the actual widths)
Character | Joining form | Glyph width | Cumulative width |
---|---|---|---|
0646: ن ARABIC LETTER NOON | initial | 2 | 2 |
0634: ش ARABIC LETTER SHEEN | medial | 7 | 9 |
0627: ا ARABIC LETTER ALEF | final | 2 | 11 |
0637: ط ARABIC LETTER TAH | independent | 8 | 19 |
0020: SPACE | | 5 | 24 |
Break opportunity | | | |
0627: ا ARABIC LETTER ALEF | independent | 2 | 26 |
0644: ل ARABIC LETTER LAM | initial | 2 | 28 |
062A: ت ARABIC LETTER TEH | medial | 2 | 30 |
062F: د ARABIC LETTER DAL | final | 5 | 35 |
0648: و ARABIC LETTER WAW | independent | 4 | 39 |
064A: ي ARABIC LETTER YEH | initial | 3 | 42 |
0644: ل ARABIC LETTER LAM | final | 7 | 49 |
060C: ، ARABIC COMMA | | 5 | 54 |
0020: SPACE | | 5 | 59 |
Break opportunity | | | |
0057: W LATIN CAPITAL LETTER W | | 6 | 65 |
0033: 3 DIGIT THREE | | 5 | 70 |
0043: C LATIN CAPITAL LETTER C | | 8 | 78 |
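Assuming a line length of 62, the 'W' (cumulative width 65) is the first glyph that does not fit, so we would backtrack to the previous break opportunity (after the space, at cumulative width 59) and wrap 'W3C' onto the following line.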
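The backtracking step can be sketched as a greedy wrap over the made-up widths from the table. Everything here is an assumption for illustration: `WIDTHS`, `BREAKS` and `wrap` are hypothetical names, the widths are the invented ones above, and the code works purely on logical order without attempting any display reordering.

```python
from itertools import accumulate

# Glyph widths in logical order, copied from the table (invented values).
WIDTHS = [2, 7, 2, 8, 5, 2, 2, 2, 5, 4, 3, 7, 5, 5, 6, 5, 8]
# Indices before which a break is allowed (second ALEF, and W).
BREAKS = [5, 14]


def wrap(widths, breaks, line_length):
    """Greedy wrap; returns (start, end) index ranges, one per line."""
    lines = []
    start = 0          # index of the first glyph on the current line
    used = 0           # width consumed on the current line so far
    last_break = None  # most recent break opportunity seen on this line
    i = 0
    while i < len(widths):
        if i in breaks and i > start:
            last_break = i
        if used + widths[i] > line_length and last_break is not None:
            # Overflow: backtrack to the previous break opportunity.
            lines.append((start, last_break))
            start = i = last_break
            used = 0
            last_break = None
            continue
        used += widths[i]
        i += 1
    lines.append((start, len(widths)))
    return lines


print(list(accumulate(WIDTHS)))  # the cumulative width column
print(wrap(WIDTHS, BREAKS, 62))
```

With a line length of 62 the first range ends at index 14, so the Arabic words, comma and trailing space stay on the first line and the three glyphs of 'W3C' form the second. If no break opportunity has been seen on the current line, this sketch simply lets the line overflow.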
Version: $Id: bidi-wrapping-example.html,v 1.2 2005/02/02 11:00:37 rishida Exp $