See also: IRC log
<scribe> Scribe: Addison Phillips
<scribe> ScribeNick: aphillip
Richard: respond on our behalf to CSS on ruby issue
addison: set up document edit transition with dan for ws-i18n
... make a table of date changes for DST
<scribe> in progress
addison: ping martin about iri wg activity
... set up an introductory call or meeting with DOM Events folks to discuss
all: look at XForms 1.1 PR for internationalization issues, especially "input mode" at the back
Unicode 5.2 was released
(and there was much rejoicing)
IUC33 is next week
likely no call next week
<r12a> http://www.w3.org/International/wiki/Xmllang_and_css
richard working on update to document describing styling via CSS (particularly with language info)
<r12a> Using CSS selectors with xml:lang
scribe: see above
... has worked fingers to bone figuring out how xml:lang and lang work with CSS
... please look it over and send comments
http://www.unicode.org/reports/tr46/
Mark: related to IDNA...
... and IDNA2003 contains rules for exchanging things in DNS
with chinese, arabic, etc.
... IDNA2008 is in process
... may finish this year or early next
... and there are incompatibilities between the two
... which means that some old domain names no longer work
... and some in 2008 won't work with 2003 implementations
... 2008 has many good things
... but there are four serious incompatibilities, which uax46 calls deviations
... has implications for spoofing and compat
... Unicode consortium is concerned
... working with browser vendors to see if these incompatibilities can be dealt with
... old UAX#46 was put on hold hoping IETF WG would deal with it
... but they didn't, so this doc revived
... can see in the link
<mark> http://unicode.org/cldr/utility/idna.jsp?a=3%96BB.at%0D%0A8%A1og.com%0D%0AOC%88BB.at%0D%0Afass.de%0D%0Afa3%9F.de%0D%0Af3%A43%9F.de%0D%0ASch3%A4ffer.de%0D%0A%EFC%A1%EFC%A2%EFC%A3%E3%83B%E6%97%A5%E6%9C%AC%0D%0A%E6%97%A5%E6%9C%AC%EFD%A1co%EFD%A1jp%0D%0AxC%81C%A7%0D%0AxC%A7C%81%0D%0AI%E2%99%A5NY%0D%0AE%92F%8CEBEFF%82%0D%0A%EFB%8B%EFA%AE%EFA%91%EFB2
Mark: link shows demo with various punycode transforms
<mark> ȡog.com
Mark: the "d" like character introduced in Unicode after IDNA 2003
<mark> öbb.at
Mark: works in all cases if lower case
<mark> faß.de
Mark: under 2003 remaps to "fass", but encoded as punycode under 2008
<mark> 日本。co。jp
Mark: ideographic full stop fails under 2008
... 2008 allows for pre-processing of the text in a domain name
... called "custom mapping"
... could make them all succeed (or make worse)
... any *user* can do custom mapping
<fsasaki> felix: just on IRC due to bad sound: is any registered label with ideographic full stop character, so does the missing ideographic full stop hurt any existing data?
Mark: full stop treated same as a period
... but no longer
<fsasaki> felix: tx for the explanation
<mark> ＡＢＣ
(full width characters)
scribe: fail under 2008 but treated as ASCII in 2003
... and these occur *frequently* in actual web pages (based on Google inspection of pages)
... due to IME nature
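The 2003 side of the examples above can be reproduced roughly with Python's built-in "idna" codec, which follows RFC 3490 (IDNA2003) and Nameprep. This is only a sketch: it does not model IDNA2008 or TR46, and the helper name to_ascii_2003 is an assumption made for illustration.

```python
def to_ascii_2003(domain: str) -> str:
    """Return the ASCII (wire) form of a domain under IDNA2003 rules."""
    # The stdlib "idna" codec splits on ASCII and ideographic dots,
    # applies Nameprep (case folding, NFKC) and then Punycode per label.
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    # Examples taken from the discussion; ".com" on the last one is added
    # here only to make it a complete domain name.
    for name in ["ÖBB.at", "öbb.at", "faß.de", "日本。co。jp", "ＡＢＣ.com"]:
        print(name, "->", to_ascii_2003(name))
```

Under these 2003 rules the upper and lower case forms of ÖBB.at give the same result, faß.de is remapped to fass.de, and the ideographic full stop and full width letters are treated as their ASCII counterparts.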
Mark: proposal is to bridge gap by making stuff continue to work in 2008 that worked in 2003
<mark> ÖBB.at öbb.at fass.de faß.de fäß.de Schäffer.de ABC・日本 日本。co。jp x̧́ x̧́ I♥NY Βόλος ﻋﺮﺑﻲ
<mark> \u0001.com
Mark: what to do about four deviation characters
<mark> faß.de
Mark: such as eszett
... good reasons for both distinct and same treatment
... Microsoft proposed to say that implementation (registrar) must bundle them
... this is what UAX#46 recommends
... then it'd work with any registrar
David: so bundling means that they would register both "fass" and "faß"
Mark: yes
... two cases: "good" registrar will bundle
... and remapping won't matter
... "bad" registrar won't bundle
... so two domains exist
... so don't support character
... so use 2003 version of behavior
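A minimal sketch of what bundling could look like on the registrar side, assuming the four deviation characters are ß, final sigma, ZWJ and ZWNJ and that each is bundled with its 2003 mapping. The table and the function name bundle are hypothetical; they are not taken from the draft or from any registrar's implementation.

```python
# Assumed deviation characters and their IDNA2003 treatment:
# ß maps to "ss", final sigma maps to sigma, ZWNJ/ZWJ are removed.
DEVIATION_MAP = {
    "\u00df": "ss",      # ß (eszett)
    "\u03c2": "\u03c3",  # ς (final sigma) -> σ
    "\u200c": "",        # ZERO WIDTH NON-JOINER
    "\u200d": "",        # ZERO WIDTH JOINER
}


def bundle(label: str) -> set[str]:
    """Return the labels a bundling registrar would register together."""
    mapped = "".join(DEVIATION_MAP.get(ch, ch) for ch in label)
    # A label without deviation characters bundles only with itself.
    return {label, mapped}


if __name__ == "__main__":
    print(sorted(bundle("faß")))
    print(sorted(bundle("fass")))
```

With a registrar that bundles this way, it does not matter whether a client remaps "faß" to "fass" or not, since both labels belong to the same registrant.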
basically: generate only *one* version for a given input
... even though multiple outcomes are possible
Mark: one of the main reasons for this is so users can see in browser
... users don't like transform of name for display (remapping)
... only transform on wire
Richard: btw, we have a bunch of tests that show what people see
DENIC (German registry) bundles
<r12a> http://www.w3.org/International/tests/test-idn-display-0
Mark: other thing concerned with...
... IDNA very concerned with TLDs
... but in reality there are 1000's of registrars
... e.g. google registers subdomains in google
... or blogspot
<mark> http://ɓlog.blogspot.com/
"xn--log-nsb.blogspot.com"
so many many "actual" registrars
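The blogspot example can be checked with Python's standard "punycode" codec. This is a sketch of the label-level transform only, keeping the Unicode form for display and the "xn--" form for the wire; the helper names to_wire and to_display are assumptions for illustration, and no Nameprep or validity checking is done here.

```python
ACE_PREFIX = "xn--"


def to_wire(label: str) -> str:
    """Encode one non-ASCII label as Punycode with the ACE prefix."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_display(ace_label: str) -> str:
    """Decode an ACE label back to the Unicode form shown to the user."""
    return ace_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    wire = to_wire("ɓlog")
    print(wire + ".blogspot.com")
    print(to_display(wire))
```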
scribe: belief is that "actual" registrars won't know to do anything about 2008
... a long time to replace browsers
... so how many on 2008 v. 2003
... and that's what TR46 is
... would like to inform you
... and also think W3C should have a position on this
... so HTML5 and XLink might need to be influenced
Mark: UTC will consider this, may publish as early as November... or may be longer
... needs some wordsmithing, although that won't influence decision by UTC
richard: do you know what HTML5 is doing about this?
Mark: don't know, but believe that it is just conformant to 2003 at present
richard: what we really want is everyone to implement just this one way of doing it
Mark: think we have a lot of support from browser vendors
... best if everyone does the same way
<fsasaki> felix: again just on IRC (sorry), propose to put Thomas Roessler (W3C) into the loop, who is following IDNA and also ICANN actions also to some extent IIRC
addison: pub together or a pointer or do something to charmod-iri....
Mark: ICU will implement TR46
... have a followup in a couple of weeks?
<scribe> ACTION: all: review UAX#46 draft for review in two weeks [recorded in http://www.w3.org/2009/10/07-core-minutes.html#action01]
<mark> mark@macchiato.com
Mark: send comments offline to above address
In particular Hixie's remarks: http://lists.w3.org/Archives/Public/public-i18n-core/2009OctDec/0001.html
<scribe> ACTION: addison: reply to Hixie in support of text at end of thread [recorded in http://www.w3.org/2009/10/07-core-minutes.html#action02]
http://www.w3.org/International/questions/qa-choosing-language-tags (new article) http://www.w3.org/International/articles/language-tags/temp.php (updated article)
http://lists.w3.org/Archives/Member/member-i18n-core/2009Oct/0001.html
Mark: my concern is that people might use 'cmn' instead of 'zh'
... will mess themselves up, because 'zh' means, for all practical purposes, Mandarin
... fear is that they won't see the squirms that say otherwise
... after reading first two sentences
<r12a> http://www.w3.org/International/questions/qa-choosing-language-tags#extlangsubtags
There is always a 3-letter subtag that is equivalent to any language+extlang pairing. For example, zh-cmn (Mandarin Chinese) can also be expressed with the single subtag cmn.
Mark: so richard's suggestion good
... use more specific tags in most cases, with some important exceptions
... then follow with "there are however"
... and follow with "there are situations where you should still use"
... and move the sentence on Macrolanguage searching to end
addison expresses happiness with that
Mark: in decision two, change examples from 'cmn' to 'yue'
<fsasaki> Felix: (on IRC) Sorry, have to leave at the top of the hour. My comments are editorial and proposals for "would be nice to have" additions, not "this is wrong" comments. So if you want to publish, please go ahead, the articles are good.
addison: tag "Chinese" as 'zh', including all Mandarin
... all other Sinitic languages should use their specific subtags and not use 'zh'
Mark: for predominant form, use the macrolanguage
... including Malay, for example
... ISO distinguishes Filipino and Tagalog... also Twi and Akan
... there are a number of issues with false distinctions
... "be consistent" and "try to be consistent with everyone else"
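The tagging advice above could be sketched as a small normalizer. The two tables are hand-picked from the examples in this discussion ('cmn' and 'yue' under 'zh') and are assumptions for illustration, not an extract of the subtag registry; the function name recommend_tag is likewise hypothetical.

```python
# language+extlang pairs and their equivalent single subtag (illustrative).
EXTLANG_PAIRS = {("zh", "cmn"): "cmn", ("zh", "yue"): "yue"}

# Predominant forms that, per the discussion, are better tagged with the
# macrolanguage subtag (illustrative; only Mandarin is listed here).
PREDOMINANT = {"cmn": "zh"}


def recommend_tag(tag: str) -> str:
    """Suggest a tag following the 'zh' for Mandarin, specific otherwise rule."""
    subtags = tag.split("-")
    primary, rest = subtags[0].lower(), subtags[1:]
    # Collapse language+extlang (e.g. zh-yue) into the single subtag (yue).
    if rest and (primary, rest[0].lower()) in EXTLANG_PAIRS:
        primary = EXTLANG_PAIRS[(primary, rest[0].lower())]
        rest = rest[1:]
    # Use the macrolanguage for the predominant form.
    primary = PREDOMINANT.get(primary, primary)
    return "-".join([primary] + rest)


if __name__ == "__main__":
    for tag in ["zh-cmn", "cmn-Hans", "zh-yue", "yue", "zh-Hant-TW"]:
        print(tag, "->", recommend_tag(tag))
```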
<mark> http://cldr.unicode.org/development/design-proposals/languages-to-show-for-translation
Mark: case sensitivity in file names can be important in file systems that are case sensitive
<r12a> where it is important, consider using bcp 47 approach
(actually bcp47 has a canonicalization section on case)
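For reference, the case conventions in RFC 5646 section 2.1.1 can be applied without the registry. The sketch below assumes that every subtag after the first singleton stays lower case, which is one reading of "after singletons"; the function name format_case is an assumption for illustration.

```python
def format_case(tag: str) -> str:
    """Apply BCP 47 recommended casing: zh-hant-tw -> zh-Hant-TW."""
    result = []
    seen_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        subtag = subtag.lower()
        if index > 0 and not seen_singleton:
            if len(subtag) == 2:
                subtag = subtag.upper()       # region, e.g. TW
            elif len(subtag) == 4:
                subtag = subtag.capitalize()  # script, e.g. Hant
        if len(subtag) == 1:
            # Extension or private use singleton: the rest stays lower case.
            seen_singleton = True
        result.append(subtag)
    return "-".join(result)


if __name__ == "__main__":
    for tag in ["ZH-hant-tw", "EN-ca-X-CA", "sgn-be-fr"]:
        print(tag, "->", format_case(tag))
```

Tags are case insensitive for matching, so a normalization like this matters mainly where tags end up in case sensitive contexts such as file names.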
Mark: ordering of variants
<r12a> note, any particular recipient of the tags may or may not take the order as important
http://www.inter-locale.com/ID/rfc5646.html#canonical
maybe a variant faq
richard: publish for wide review with changes?
<scribe> chair: any opposed?
none opposed
<scribe> ACTION: richard: publish language tag articles for wide review [recorded in http://www.w3.org/2009/10/07-core-minutes.html#action03]
<r12a> after changes discussion so far
Present: aphillips, David, Richard, +1.650.253.aaaa, +1.650.253.aabb, Felix, YvesS, mark, andrewc
Agenda: http://lists.w3.org/Archives/Member/member-i18n-core/2009Oct/0002.html
Date: 07 Oct 2009
Minutes: http://www.w3.org/2009/10/07-core-minutes.html
People with action items: addison, all, richard