The Internationalization Tag Set (ITS) 1.0 is a W3C Recommendation that supports the internationalization and localization of XML. ITS 1.0 defines so-called internationalization and localization "data categories", which serve various use cases. These are described in the Recommendation and in the accompanying document Best Practices for XML Internationalization. Here we illustrate a use case based on the ITS 1.0 data category "Translate", which specifies whether a piece of content needs to be translated during the localization process. A typical scenario is:
ITS 1.0 is an important aid in this process. An example input document is given below.
<messages xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
  <msg num="123">Click Resume Button on Status Display or <panelmsg its:translate="no">CONTINUE</panelmsg> Button on printer panel</msg>
</messages>
By default, ITS 1.0 "Translate" treats element content as translatable and attribute values as not translatable. In the example, the content of the <panelmsg> element must not be translated. This is expressed via the so-called "local" ITS 1.0 attribute its:translate with the value no.
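The effect of the local attribute can be sketched in Python. This is only an illustration, not part of ITS 1.0 or of any tool mentioned here: the function name `text_runs` is hypothetical, and the sketch assumes that an its:translate value applies to the whole subtree unless a descendant overrides it. Attribute values are ignored, since they are not translatable by default.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = f"{{{ITS_NS}}}translate"

DOC = """<messages xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
<msg num="123">Click Resume Button on Status Display or <panelmsg its:translate="no">CONTINUE</panelmsg> Button on printer panel</msg>
</messages>"""


def text_runs(elem, inherited=True):
    """Yield (text, translatable) pairs for elem and its descendants."""
    # A local its:translate overrides the inherited value for this subtree.
    flag = elem.get(TRANSLATE)
    translatable = inherited if flag is None else (flag == "yes")
    if elem.text and elem.text.strip():
        yield elem.text.strip(), translatable
    for child in elem:
        yield from text_runs(child, translatable)
        # Tail text belongs to the parent, so it uses the parent's flag.
        if child.tail and child.tail.strip():
            yield child.tail.strip(), translatable


runs = list(text_runs(ET.fromstring(DOC)))
for text, translatable in runs:
    print(("translate:  " if translatable else "keep as is: ") + text)
```

Only the text "CONTINUE" is reported as non-translatable; the two surrounding text runs of the <msg> element remain translatable.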
In addition to such local usage of ITS 1.0, there are global ITS "rules". These rules express the same functionality as local ITS 1.0 markup, but they are independent of any position in the target document and can be applied to several documents or parts of documents. This is achieved by using XPath. An example is given below.
<its:rules version="1.0" xmlns:its="http://www.w3.org/2005/11/its">
  <its:translateRule selector="//panelmsg" translate="no"/>
</its:rules>
The <its:rules> element contains an <its:translateRule> element with two attributes. The value of the selector attribute is an XPath expression which selects all <panelmsg> elements. The value of the translate attribute has the same function as the local its:translate attribute.
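How a processor might apply such a rule to an unmodified document can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `non_translatable` is hypothetical, the sketch handles only selectors of the form `//name`, and it relies on the limited XPath subset of Python's `xml.etree.ElementTree` rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

RULES = """<its:rules version="1.0" xmlns:its="http://www.w3.org/2005/11/its">
  <its:translateRule selector="//panelmsg" translate="no"/>
</its:rules>"""

# The target document carries no ITS markup at all.
DOC = """<messages>
<msg num="123">Click Resume Button on Status Display or <panelmsg>CONTINUE</panelmsg> Button on printer panel</msg>
</messages>"""


def non_translatable(doc_root, rules_root):
    """Return the elements that translateRule elements mark as translate="no"."""
    marked = []
    for rule in rules_root.iter(f"{{{ITS_NS}}}translateRule"):
        if rule.get("translate") != "no":
            continue
        # ElementTree rejects absolute paths on elements, so "//x" becomes ".//x".
        marked.extend(doc_root.findall("." + rule.get("selector")))
    return marked


hits = non_translatable(ET.fromstring(DOC), ET.fromstring(RULES))
print([(e.tag, e.text) for e in hits])
```

The rules and the document are parsed from separate strings, which mirrors the point that the ITS information can live outside the document it describes.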
With the global approach of ITS 1.0, it becomes possible to apply ITS 1.0 data categories to documents without changing them, since the ITS 1.0 information can be stored independently of the target documents. This will be demonstrated in the following section, using the example of HTML 5 documents in various serializations.
Thousands of HTML documents (that is, HTML 4.01, XHTML, HTML 5, ...) are subject to localization. This also means that some parts of them need to be translated, while others do not.
An example input document is given below.
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An HTML Document</title>
  </head>
  <body>
    <h1>Example</h1>
    <p>This is an example HTML document.</p>
    <pre>Some source code</pre>
  </body>
</html>
For this document, the author or localization engineer might want to specify that the content of a specific <pre> element should not be translated. However, HTML has no markup available for this purpose. A solution which is applicable to XHTML is to extend the XHTML schema with ITS markup. However, this is only applicable to XHTML or the XML serialization of HTML 5, and the usage of local ITS 1.0 markup in the XML serialization might create problems for browser display.
Our solution for using ITS 1.0 within XHTML and the HTML serialization of HTML 5 can be summarized as follows: to minimize the impact on HTML in both serializations, we do not use ITS 1.0 markup, but markup without a namespace. By means of global ITS rules, this markup is associated with ITS 1.0 functionality. A detailed description is given below.
In the files listed below, non-translatable content is specified in the following manner: first, with an attribute "translate" with the value "no" on the <pre> element; second, with a separate document, xhtml-sample-rules.xml. This ITS 1.0 rules file contains the following <its:translateRule> element:
<its:translateRule selector="//h:pre[@translate='no'] | //pre[@translate='no']" translate="no" xmlns:h="http://www.w3.org/1999/xhtml"/>
This rule means that all <pre> elements with an attribute translate="no" should not be translated. To put it differently, non-ITS 1.0 markup (the translate attribute) is associated with the functionality of the ITS 1.0 "Translate" data category.
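The selector has two branches because the <pre> element is in the XHTML namespace in the XML serialization and may be in no namespace otherwise. A minimal sketch of evaluating it is shown below. Several things here are assumptions for illustration: the function name `select` is hypothetical, the second <pre> element in the sample is added only to show a non-matching case, and the union operator is emulated by splitting on "|" because Python's `xml.etree.ElementTree` does not implement it.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"h": "http://www.w3.org/1999/xhtml"}
SELECTOR = "//h:pre[@translate='no'] | //pre[@translate='no']"

# Sample in XML serialization; the second <pre> is a hypothetical addition.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<body><p>This is an example HTML document.</p>
<pre translate="no">Some source code</pre>
<pre>Translatable sample</pre></body>
</html>"""


def select(root, selector, namespaces):
    """Evaluate a union of simple '//...' paths with ElementTree."""
    found = []
    for branch in selector.split("|"):
        # "//x" is rewritten to ".//x" because ElementTree needs relative paths.
        for elem in root.findall("." + branch.strip(), namespaces):
            if elem not in found:
                found.append(elem)
    return found


matches = select(ET.fromstring(XHTML), SELECTOR, NAMESPACES)
print([e.text for e in matches])
```

In this sample only the namespaced branch matches; the unprefixed branch finds nothing because every element is in the XHTML namespace.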
The files are provided in an HTML serialization, served as text/html, and in an XML serialization, served as application/xhtml+xml.
No ITS functionality | "translate" attribute at pre element
HTML serialization | HTML serialization
XML serialization | XML serialization
The files with the "translate" attribute have been tested under Windows with the following browsers: DoCoMo P213i (browser unknown), Firefox 2, IE 6, Opera 9, Safari 3. All browsers display the files properly, which demonstrates the limited impact of this approach.
The input file in XML serialization with markup for ITS 1.0 "Translate" functionality is processed with an Ant file in the following manner (note that the Ant file assumes the presence of Saxon 8; the location attribute of the <pathelement> element has to be changed accordingly):
Step 3 involves CSS selectors. The selection of the non-translatable <pre> element is achieved with the following, automatically generated selector:
html > head ~ body > h1 ~ p ~ pre
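The processing that generates this selector is not reproduced here. The following Python sketch only reconstructs the shape of the selector shown above, assuming that each level lists the preceding sibling elements joined by "~" and that levels are joined by ">". The function names `local_name` and `css_selector` are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>An HTML Document</title></head>
<body><h1>Example</h1><p>This is an example HTML document.</p>
<pre translate="no">Some source code</pre></body>
</html>"""


def local_name(elem):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return elem.tag.rsplit("}", 1)[-1]


def css_selector(root, target):
    """Build a 'parent > sibling ~ target' selector for target."""
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    levels = []
    node = target
    while node is not None:
        parent = parents.get(node)
        siblings = list(parent) if parent is not None else [node]
        # Preceding siblings plus the node itself, joined by the sibling combinator.
        upto = siblings[: siblings.index(node) + 1]
        levels.append(" ~ ".join(local_name(s) for s in upto))
        node = parent
    return " > ".join(reversed(levels))


root = ET.fromstring(XHTML)
target = root.find(".//h:pre[@translate='no']", {"h": XHTML_NS})
selector = css_selector(root, target)
print(selector)  # html > head ~ body > h1 ~ p ~ pre
```

Because the selector uses element names only, it does not depend on namespaces and can be matched against either serialization.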
The input file in HTML serialization first needs to be converted into an XHTML serialization. This can be achieved, for example, with HTML Tidy. After this step, the processing described above is applied. The resulting CSS stylesheet and its selectors can nevertheless be applied to the original HTML serialization. In this way, the translator is able to work with the original document.
It has been shown that the global approach of ITS 1.0 data categories can be used to apply ITS 1.0 data category functionality without ITS 1.0 markup. The purpose of this exercise is to minimize the impact on non-ITS processing, e.g. display and editing of HTML 5 documents. Finally, the generation of CSS selectors allows the ITS 1.0 functionality to be applied to the original document in XML or HTML serialization.
$Id: Overview.html,v 1.13 2008/03/17 01:34:40 fsasaki Exp $