Objectives
- Describe the new structure of the W3C Internationalization Activity
- Show sample issues that we address
- Mention key highlights of recent / planned work
- Recommend how you can contribute
Outline
- World Wide Web Consortium (W3C) Overview
- Internationalization (I18N) Activity
- Core Task Force
- Character Model
- Internationalized Resource Identifiers (IRIs)
- Reviews & support
- Web Services Task Force
- GEO (Guidelines, Education & Outreach) Task Force
W3C Overview
The World Wide Web Consortium
- Document technologies: (X)HTML, CSS, SVG, SMIL, ...: the Web for human consumption
- Base technologies: XML, XML Namespaces, XSLT, XML Schema
- Web Services: machine-to-machine communication
- Semantic Web: abstracting and combining information
- Web addresses: URIs/IRIs keeping everything together
about 70 Team members
about 425 Member Organizations
W3C Overview
W3C Goals
I18N Activity
Goals
Hogy a Világháló valóban az egész világé lehessen!
جعل شبكة الويب العالميّة عالميّة حقًّا!
ליצור מהרשת רשת כלל עולמית באמת!
전세계의 월드 와이드 웹으로 만들기!
缔造真正全球通行的万维网
締造真正全球通行的萬維網
ワールド・ワイド・ウェッブを世界中に広げましょう
Κάνοντας τον Παγκόσμιο Ιστό πραγματικά Παγκόσμιο
वर्ल्ड वाईड वेब को सचमुच विश्वव्यापी बना रहें हैं !
Сделаем "Всемирную паутину" действительно всемирной!
"Дүниежүзілік торды" нағыз дүниежүзілік етеміз!
Making the World Wide Web world wide!
I18N Activity
History
- Started in 1995
- Working Group and Interest Group started in 1998
- Rechartering September 2002
- Two new task forces November 2002
I18N Activity
Structure
- Working Group: Work done by email, teleconferences, face-to-face meetings
- Public mailing list for technical discussion: www-international@w3.org (subscribe, archive)
I18N Activity
Related work: Translation activities
- Volunteer based, best effort
- New translations summary pages
- draws together information in many languages
- displays data in several alternative views
- demonstrates internationalization capability of RDF
Core Task Force
Overview
- Continue / complete ongoing work
- Move the Character Model to Proposed Recommendation
- Submit the IRI Internet-Draft to the IESG for Proposed Standard
- Continue reviews of and support for other W3C specifications
Core Task Force
Character issues
This Hungarian word can be coded in more than one way. Unicode allows for 'precomposed' and 'decomposed' representations of accented
characters. This can cause problems for comparisons of strings during searching, sorting, verification, etc. It can also create problems for pointing
to a specific location in a string. To overcome these difficulties you need to normalize the text. W3C recommends early normalization based on
Unicode form NFC.
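The effect of normalization can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the sample letter "ő" and the helper name `nfc` are assumptions chosen for the example, and `unicodedata` is Python's standard-library Unicode database.

```python
import unicodedata

# Hypothetical sample: the letter "ő" (o with double acute), used in Hungarian.
precomposed = "\u0151"   # one code point: LATIN SMALL LETTER O WITH DOUBLE ACUTE
decomposed = "o\u030b"   # two code points: "o" + COMBINING DOUBLE ACUTE ACCENT


def nfc(text: str) -> str:
    """Return the text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# The two strings look identical on screen but differ code point by code point.
print(precomposed == decomposed)            # False
print(len(precomposed), len(decomposed))    # 1 2

# After normalizing both sides to NFC, comparison and indexing agree.
print(nfc(precomposed) == nfc(decomposed))  # True
```

Normalizing as early as possible (for example, when text is created or first received) means later comparisons do not each have to normalize on their own.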
Core Task Force
What is the Character Model?
'Guidelines' for specification writers
- Introduction to character encoding and processing in an I18N context
- 'Think' in Unicode
- Text Normalization (NFC, early normalization)
- Compatibility and formatting characters => Unicode in XML and other Markup Languages
- String identity matching, string indexing
- Use IRIs
- Referencing Unicode/ISO 10646
Core Task Force
URI issues
If people are not able to use non-ASCII text in domain names, this prevents them from using intuitive and memorable names for their
domains. Work is underway to rectify this situation. In each of the two cases above the second link will take you to the site, but the first link is
really what the user wanted.
Core Task Force
What are IRIs?
- URIs are restricted to US-ASCII
- Defining Internationalized Resource Identifiers (IRIs; draft, editing) so that identifiers can contain characters from any script
- Uniform convention to map to URIs (encode in UTF-8 and use %hh-encoding)
- Used already in XML (system identifiers), XLink, XML Schema, and others, and implemented in main browsers
- Coordinating with IETF work on Internationalized Domain Names (IDNs)
- Will be submitted to the IESG for Proposed Standard soon
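The mapping convention mentioned above (encode in UTF-8, then use %hh-encoding) can be sketched as follows. This is a simplified illustration that handles only a path component; the function name `iri_path_to_uri_path` and the sample path are assumptions, and a full IRI-to-URI conversion also has to treat the host name and other components separately.

```python
from urllib.parse import quote


def iri_path_to_uri_path(path: str) -> str:
    """Map the path part of an IRI to a URI path (hypothetical helper).

    Each non-ASCII character is encoded as UTF-8 and every resulting
    byte is written as %hh; "/" is kept as the path separator.
    """
    return quote(path, safe="/", encoding="utf-8")


# "ü" is the two bytes C3 BC in UTF-8, so it becomes %C3%BC.
print(iri_path_to_uri_path("/world/Dürst"))  # /world/D%C3%BCrst
```

Plain ASCII paths pass through unchanged, so the same routine can be applied to every path.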
Core Task Force
Styling issues
Differing scripts require differing typographic features to represent text naturally on the Web. This example shows Japanese. Typographic
features represented here that are not used in English include autospacing around Latin text and numbers, appropriate character-based wrapping
behaviour, ruby, and vertical text that includes some horizontal Latin characters or numbers. Styling and markup on the Web has to be adapted to
support these requirements.
Core Task Force
Reviews
- Recent examples ...
- XForms: new generation of Web forms, replacing HTML forms
- SOAP 1.2
- DOM Text Events, DOM Load/Save
- Speech Synthesis
- RDF
- CSS Text
- XQuery
- many more
Core Task Force
Highlights
Web Services Task Force
What are Web Services?
Web technology (in particular XML) that enables machines to talk to machines
Web services stack (top layer first; alternatives within a layer separated by |):
(business logic)
Choreography | (proprietary)
WSDL 1.2 | WSDL 1.1 | (proprietary)
SOAP 1.2 | SOAP 1.1 | HTTP GET | (proprietary)
HTTP (POST) | (DIME) | (proprietary)
TCP/IP | (proprietary)
Applications: RPC, messaging, application integration
Web Services Task Force
WS issues
- Isn't it all covered by using XML?
- Explore issues with I18N-specific services, e.g.:
- Format today's date in Japanese (2003年5月23日)
- When does the month of Ramadan start?
- Fault messages - how do you know the language of the user?
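As an illustration of the first example service, the sketch below formats a date in the Japanese year-month-day pattern shown above. It is a hand-written formatter under stated assumptions, not a locale-aware implementation: the function name `format_date_ja` is hypothetical, and a real service would more likely rely on locale data than on a hard-coded pattern.

```python
from datetime import date


def format_date_ja(d: date) -> str:
    """Format a date as <year>年<month>月<day>日 (hypothetical helper).

    The pattern is hard-coded for illustration; it uses the Gregorian
    calendar and does not zero-pad the month or day.
    """
    return f"{d.year}年{d.month}月{d.day}日"


print(format_date_ja(date(2003, 5, 23)))  # 2003年5月23日
```

Even this small case shows why such a service needs to know the requester's locale: the same date would be ordered and punctuated differently for other audiences.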
Web Services Task Force
Highlights
- Web Services Internationalization Usage Scenarios (Second Working Draft published 16 May)
- Data Integrity and Interoperability
- Language (Locale) Negotiation for SOAP Fault Messages
- Locale Neutral vs. Sensitive Data Exchanging
- Locale Sensitive Presentation
- Locale Sensitive Data Processing
- Finding Services
- Services for Internationalization Functionality
- Development of Internationalized Web Services
- Contributions and comments welcome!
- Requirements document
- We are looking for your use cases, scenarios, and more!
GEO Task Force
International usability issues
Nearly every aspect of this form creates difficulties for localization or for use in countries other than the USA. The form assumes that people
write names, addresses and dates in the same way around the world, whereas there are actually many different approaches for each of these. Content
authors / developers need advice about what the pitfalls are, and how best to design around these issues.
GEO Task Force
International usability issues
Forms can also present issues relating to character encoding. How do you ensure that you can preserve and handle the wide range of characters
and encodings that could be typed in by your user? In this example, some of the accented letters in Hungarian do not appear in the Latin-1 character
set.
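The Latin-1 limitation can be checked directly. The sketch below is an illustration only: the helper name `unencodable` and the sample phrase are assumptions (the phrase is a commonly used Hungarian test string, not the text from the form on the slide).

```python
from typing import List


def unencodable(text: str, encoding: str) -> List[str]:
    """List the distinct characters of text that encoding cannot represent
    (hypothetical helper), in order of first appearance."""
    missing: List[str] = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Assumed sample: a Hungarian phrase containing all accented vowels.
sample = "árvíztűrő tükörfúrógép"

print(unencodable(sample, "latin-1"))  # ['ű', 'ő']
print(unencodable(sample, "utf-8"))    # []
```

The letters with a double acute accent are the ones Latin-1 lacks, which is why a form that silently assumes Latin-1 can lose or corrupt Hungarian input, while a Unicode encoding such as UTF-8 preserves it.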
GEO Task Force
Questions ...
- How do I ...
- implement culturally adapted name and address forms in XForms?
- ensure that (X)HTML forms return data in the right encoding?
- declare language and encoding for CSS style sheets?
- order XSL output according to French rules?
- approach the creation of multilingual documents in HTML?
- define DTDs / Schemas so that they incorporate all information needed by the localization team?
- help users navigate to the right localized page?
- ensure the table I'm about to create has all the right i18n features?
GEO Task Force
Mission
- Mission: Ensure that I18N aspects of W3C technology are better understood and more widely and consistently used.
GEO Task Force
Highlights
GEO Task Force
Approach
Summary
- Describe the new structure of the W3C Internationalization Activity
- Show sample issues that we address
- Mention key highlights of recent / planned work
We'd like you to help!
- Wide range of areas to get involved with
- Exciting new ways to participate and contribute
- We would like to capture concerns of the localization community with regard to Web internationalization
Ways to help
- Visit the I18N Activity Home Page
- Join a Task Force, the Interest Group, the public mailing list
- Offer to help with reviews
- Take advantage of the i18n-readiness of W3C technology for the world
- Talk to me afterwards