Update from the Internationalization Working Group

Presenter: Fuqiao Xue
Duration: 4 min
Slides: download

The W3C Internationalization (i18n) Activity works with W3C working groups and liaises with other organizations to make it possible to use Web technologies around the world, regardless of language, writing system, or culture.

Slides & Video

In this video, I will briefly describe our work in the W3C Internationalization Activity.

Our work is divided into three parts. Language enablement is about gathering and publishing requirements for the support of local writing systems and languages on the Web.

Developer support is about providing internationalisation advice to other groups developing Web standards, and reviewing their specifications.

Education & outreach is about developing materials to make the internationalisation aspects of W3C technology better understood and more widely and consistently used by content authors and implementers.

Next, I'll describe some of our recent progress.

As I just mentioned, one of our jobs is to review specifications.

Here are some of our comments to the CSS Working Group.

Internationalization Best Practices for Spec Developers, aka specdev, is a checklist of internationalisation-related considerations to take into account when developing a specification.

In many sections of specdev, we have recently added links to the issues related to that particular section, for the benefit of readers of the document.

There are lots of other recent updates as well, including examples of common escaping mechanisms found on the Web, more information about sorting and about defining identifiers, and so on.

Next, I'm going to introduce another document, Language and Direction Metadata of Strings on the Web, aka string-meta.

This document describes the best practices for identifying language and base direction for strings used on the Web.

For example, in the figure on the left, the first line is labeled ja (which stands for Japanese), and the second line is labeled zh-Hant (which stands for Traditional Chinese).

The characters on both lines are the same code points, but the reader expects to see systematic differences between how those code points are rendered in Japanese vs. Chinese.

It's important to associate the right forms with the right language; otherwise, you can make the reader uncomfortable or possibly unhappy.
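
As a rough illustration of why the language label has to travel with the text, the Python sketch below compares two labelled strings. The character U+76F4 is an assumed example of a Han character often cited as having region-specific glyph forms; it is not necessarily the text shown on the slide, and the `LabelledText` name is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledText:
    """Hypothetical pairing of a string with its language tag."""
    value: str
    lang: str  # BCP 47 language tag, e.g. "ja" or "zh-Hant"


def code_points(text: str) -> list[str]:
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Assumed example character; the slide's actual text may differ.
ja = LabelledText("\u76f4", "ja")
zh = LabelledText("\u76f4", "zh-Hant")

# The stored characters are identical...
same_code_points = code_points(ja.value) == code_points(zh.value)

# ...so only the language label tells a renderer which glyph forms to prefer.
labels_differ = ja.lang != zh.lang

print(code_points(ja.value), same_code_points, labels_differ)
```

If the label is lost, nothing in the string itself lets a consumer recover which rendering the author intended.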

The number on the right is an ISBN.

When it is dropped into a right-to-left context and preceded by Arabic text, you will get the result just below, which is incorrect because the sequencing is wrong. This may not even be apparent to the reader, who will expect to read such numbers from left to right.
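
One way to see why such a number is fragile is to look at the bidirectional classes of its characters. The sketch below uses Python's standard `unicodedata` module; the ISBN-like value is a made-up placeholder, not the number from the slide.

```python
import unicodedata

# Made-up ISBN-like string, used only for illustration.
isbn = "978-0-123-4567-8"
arabic_letter = "\u0627"  # ARABIC LETTER ALEF


def bidi_classes(text: str) -> list[str]:
    """Return the distinct Unicode bidirectional classes found in text."""
    return sorted({unicodedata.bidirectional(ch) for ch in text})


# Digits are European numbers (EN) and the hyphen is a separator (ES).
isbn_classes = bidi_classes(isbn)

# Arabic letters are strongly right-to-left (AL).
arabic_class = unicodedata.bidirectional(arabic_letter)

# The number contains no strongly directional character of its own.
has_strong = any(
    unicodedata.bidirectional(ch) in {"L", "R", "AL"} for ch in isbn
)

print(isbn_classes, arabic_class, has_strong)
```

Because the number has no strong direction of its own, the surrounding text and base direction can influence the order in which its groups are displayed.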

To solve this problem, we are discussing with the W3C Technical Architecture Group and TC39 a proposal to include language and direction metadata, in a standardized representation, for every natural language text string value in data structures.
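
A minimal sketch of what carrying such metadata could look like is shown below. The `LocalizedString` class and its field names `value`, `lang`, and `dir` are assumptions made for illustration, not the representation under discussion; the isolate characters are the standard Unicode directional isolate controls.

```python
import json
from dataclasses import asdict, dataclass

# Unicode directional isolate controls.
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


@dataclass
class LocalizedString:
    """Hypothetical string value with language and direction metadata."""
    value: str
    lang: str  # BCP 47 language tag
    dir: str   # "ltr", "rtl", or "auto"


def to_json(s: LocalizedString) -> str:
    """Serialize the string together with its metadata."""
    return json.dumps(asdict(s), ensure_ascii=False)


def isolate(s: LocalizedString) -> str:
    """Wrap the value in isolates so surrounding text cannot reorder it."""
    opener = {"ltr": LRI, "rtl": RLI}.get(s.dir, FSI)
    return f"{opener}{s.value}{PDI}"


# Made-up ISBN-like value; "und" marks the language as undetermined.
isbn = LocalizedString("978-0-123-4567-8", "und", "ltr")
print(to_json(isbn))
print(repr(isolate(isbn)))
```

A consumer that receives the metadata can set the direction in markup or, for plain text, fall back to isolates as in `isolate`; without the metadata it can only guess.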

In addition, we have updates to other documents, such as Language Tags and Locale Identifiers, String Searching, and so on.

If you're interested, we would be pleased to welcome you to our working group.

Thank you.