LocQuality
NOTE: this page is out of date and not maintained.
Proposed text for the two localization quality-related data categories
1 Localization Quality
1.1 Definition
The Localization Quality data category is used to express information related to localization quality assessment tasks.
This data category can be used in a number of ways, including the following example scenarios:
- An automatic quality checking tool flags a number of potential quality issues in an XML or HTML file and marks them up using ITS 2.0 markup. Other tools in the workflow then examine this markup and decide whether the file needs to be reviewed manually or passed on for further processing without a manual review stage.
- A quality assessment process identifies a number of issues and adds the ITS markup to a rendered HTML preview of an XML file along with CSS styling that highlights these issues. The resulting HTML file is then sent back to the translator to assist his or her revision efforts.
- A human reviewer working with a web-based tool adds quality markup, including comments and suggestions, to a localized text as part of the review process. A subsequent process examines this markup to ensure that changes were made.
The data category defines four pieces of information:
Information | Description | Permissible values | Default value | Notes |
---|---|---|---|---|
Type | A set of broad types of issues into which tool-specific issues can be categorized. | One of the values defined in list of type values | None | ITS 2.0-compliant tools that use these categories MUST map their internal values to these types. |
Comment | A human-readable description of the quality issue | text | None | Since it is not feasible to create machine-readable suggestions for issue resolution in all cases, tools may put suggestions in this attribute. |
Severity | An integer value representing the severity of the issue, as defined by the model generating the metadata | An integer from 0 to 100 (inclusive), with higher values indicating greater severity | None | It is up to tools to map the values of their own system to this scale. If needed, the original value can be passed along using a custom namespace (for XML) or a data- attribute (for HTML). |
Profile Reference | A reference to a document describing the quality assessment model used for the issue. | A URI pointing to the reference document. | None | The use of a resolvable URI is strongly recommended as it provides a way for human evaluators to learn more about the quality issues in use. |
If the type of the issue is set to uncategorized, a comment MUST be specified as well.
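The constraints in the table above can be checked mechanically. The following Python sketch is illustrative only: the function names validate_issue and scale_severity are hypothetical and not part of ITS 2.0, and the set of type values is copied from the list of type values given later on this page.

```python
# Type values copied from the "Values of locQualityType" table on this page.
LOC_QUALITY_TYPES = {
    "terminology", "mistranslation", "omission", "untranslated", "addition",
    "duplication", "inconsistency", "grammar", "legal", "register",
    "locale-specific-content", "locale-violation", "style", "characters",
    "misspelling", "typographical", "formatting", "inconsistent-entities",
    "numbers", "markup", "pattern-problem", "whitespace",
    "internationalization", "length", "uncategorized", "other",
}


def validate_issue(issue_type=None, comment=None, severity=None):
    """Return a list of problems found in one issue (empty list means valid)."""
    problems = []
    if issue_type is not None and issue_type not in LOC_QUALITY_TYPES:
        problems.append(f"unknown type: {issue_type}")
    if issue_type == "uncategorized" and not comment:
        problems.append("type 'uncategorized' requires a comment")
    if severity is not None:
        if isinstance(severity, bool) or not isinstance(severity, int):
            problems.append("severity must be an integer")
        elif not 0 <= severity <= 100:
            problems.append("severity must be between 0 and 100")
    return problems


def scale_severity(native, native_max):
    """Map a tool-native severity in [0, native_max] onto the 0-100 scale."""
    if native_max <= 0 or not 0 <= native <= native_max:
        raise ValueError("native severity out of range")
    return round(100 * native / native_max)
```

For instance, a tool that rates issues from 0.0 to 1.0 could call scale_severity(0.5, 1.0) and keep its original value in a data- attribute, as the Notes column suggests. The linear mapping is an assumption; a tool is free to map its scale differently.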
1.2 Implementation
The Localization Quality data category can be expressed with global rules, or locally on individual elements. The information applies to the textual content of the element, including child elements, but excluding attributes.
1.2.1 GLOBAL
The locQualityRule element contains the following:
- A required selector attribute. It contains an absolute selector which selects the nodes to which this rule applies.
- At least one of the following:
  - Exactly one of the following:
    - A locQualityIssueRef attribute. Its value is a URI pointing to the locQualityIssue element containing the list of issues related to this content.
    - A locQualityIssueRefPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityIssueRef.
  - Exactly one of the following:
    - A locQualityType attribute that implements the type information.
    - A locQualityTypePointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityType.
  - Exactly one of the following:
    - A locQualityComment attribute that implements the comment information.
    - A locQualityCommentPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityComment.
- None or exactly one of the following:
  - A locQualitySeverity attribute that implements the severity information.
  - A locQualitySeverityPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualitySeverity.
- None or exactly one of the following:
  - A locQualityProfileRef attribute that implements the profile reference information.
  - A locQualityProfileRefPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityProfileRef.
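The attribute combinations listed above can be checked with a few lines of code. The following Python sketch uses the standard xml.etree.ElementTree module; the function name check_rule and the wording of the messages are hypothetical, and the check covers only the structural constraints, not the attribute values.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
# At least one of these three pieces of information must be present.
REQUIRED_GROUP = ("locQualityIssueRef", "locQualityType", "locQualityComment")
# These two are optional.
OPTIONAL_GROUP = ("locQualitySeverity", "locQualityProfileRef")


def check_rule(rule):
    """Return a list of structural problems for one locQualityRule element."""
    problems = []
    if "selector" not in rule.attrib:
        problems.append("missing selector")
    present = 0
    for name in REQUIRED_GROUP + OPTIONAL_GROUP:
        direct = name in rule.attrib
        pointer = name + "Pointer" in rule.attrib
        if direct and pointer:
            problems.append(f"{name} and {name}Pointer are mutually exclusive")
        if name in REQUIRED_GROUP and (direct or pointer):
            present += 1
    if present == 0:
        problems.append("need an issue reference, a type, or a comment")
    return problems


rule = ET.fromstring(
    f'<its:locQualityRule xmlns:its="{ITS_NS}" selector="//span" '
    'locQualityType="typographical" locQualitySeverity="50"/>'
)
print(check_rule(rule))  # an empty list means no structural problem was found
```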
1.2.1.1 Example of global markup in XML
<doc> <header> <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityRule selector="//span[@id='q1']" locQualityType="typographical" locQualityComment="Sentence without capitalization" locQualitySeverity="50"/> </its:rules> </header> <para><span id="q1">this</span> is an example</para> </doc>
1.2.1.2 Example of global markup using pointers in XML
<doc> <header> <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityRule selector="//issue" locQualityTypePointer="./@type" locQualityCommentPointer="./@note" locQualitySeverityPointer="./@value" locQualityProfileRefPointer="./@profile"/> </its:rules> </header> <para><issue type="typographical" note="Sentence without capitalization" value="50" profile="http://example.org/qaModel/v13">this</issue> is an example</para> </doc>
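To illustrate how a processor might resolve such pointers, here is a deliberately simplified Python sketch. It is not a conformant ITS processor: it relies on the limited XPath subset of xml.etree.ElementTree, handles only selectors that start with // and only pointers of the form ./@name or @name, and the names apply_rules and resolve_pointer are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <header>
    <its:rules version="2.0">
      <its:locQualityRule selector="//issue"
        locQualityTypePointer="./@type"
        locQualityCommentPointer="./@note"
        locQualitySeverityPointer="./@value"
        locQualityProfileRefPointer="./@profile"/>
    </its:rules>
  </header>
  <para><issue type="typographical" note="Sentence without capitalization"
    value="50" profile="http://example.org/qaModel/v13">this</issue> is an example</para>
</doc>"""

ITS = "{http://www.w3.org/2005/11/its}"
FIELDS = ("locQualityType", "locQualityComment",
          "locQualitySeverity", "locQualityProfileRef")


def resolve_pointer(node, pointer):
    """Resolve an attribute pointer ('./@name' or '@name') against node."""
    if "@" not in pointer:
        raise ValueError("only attribute pointers are supported in this sketch")
    return node.get(pointer.split("@", 1)[1])


def apply_rules(root):
    """Return a list of (selected element, {field: value}) pairs."""
    results = []
    for rule in root.iter(ITS + "locQualityRule"):
        # ElementTree wants a relative path, so '//x' becomes './/x'.
        selector = "." + rule.get("selector")
        for node in root.findall(selector):
            info = {}
            for field in FIELDS:
                if field in rule.attrib:
                    info[field] = rule.get(field)
                elif field + "Pointer" in rule.attrib:
                    info[field] = resolve_pointer(node, rule.get(field + "Pointer"))
            results.append((node, info))
    return results


for node, info in apply_rules(ET.fromstring(DOC)):
    print(node.tag, info)
```

A full implementation would need a complete XPath engine to support arbitrary absolute and relative selectors.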
1.2.1.3 Example of global markup in HTML
The following example shows how to use the global rules in an HTML document.
<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"/> <title>Example</title> <link href="EX-locQuality-global-html5-1.xml" rel="its-rules"/> </head> <body> <p><span id='q1'>this</span> is an example.</p> </body> </html>
Corresponding rules file:
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityRule selector="//span[@id='q1']" locQualityType="typographical" locQualityComment="Sentence without capitalization" locQualitySeverity="50"/> </its:rules>
1.2.1.4 Example of global markup using pointers in HTML
TODO? (Not sure it makes sense)
1.2.2 LOCAL
Using the inline markup to represent the data category locally is limited to a single occurrence for a given piece of content (e.g. one cannot have different locQualityType attributes applied to the same span of text). Because there may be several instances of a localization quality issue for a given piece of content, a local standoff markup allowing such cases is also provided.
The following local markup is available for the Localization Quality data category:
EITHER (inline markup):
- At least one of the following attributes:
  - A locQualityType attribute that implements the type information.
  - A locQualityComment attribute that implements the comment information.
- An optional locQualitySeverity attribute that implements the severity information.
- An optional locQualityProfileRef attribute that implements the profile reference information.
OR (standoff markup):
- A locQualityIssueRef attribute. Its value is a URI pointing to the locQualityIssue element containing the list of issues related to this content.
- Somewhere outside the content, an element locQualityIssue (or <span its-loc-quality-issue> in HTML) which contains:
  - A required attribute xml:id (or id in HTML).
  - One or more elements locQualityIssueItem (or <span its-loc-quality-issue-item> in HTML), each of which contains:
    - At least one of the following attributes:
      - A locQualityType attribute that implements the type information.
      - A locQualityComment attribute that implements the comment information.
    - An optional locQualitySeverity attribute that implements the severity information.
    - An optional locQualityProfileRef attribute that implements the profile reference information.
Important: When the attributes locQualityType, locQualityComment, locQualitySeverity and locQualityProfileRef (or their equivalent representations) are used in a standoff manner, the information they carry pertains to the content of the element that refers to the standoff annotation.
1.2.2.1 Example of local inline markup in XML
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0"> <para><span its:locQualityType="typographical" its:locQualityComment="Sentence without capitalization" its:locQualitySeverity="50">this</span> is an example.</para> </doc>
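The first scenario in the Definition section, where a downstream tool decides whether a file needs manual review, could be sketched as follows for inline markup. The function name needs_manual_review and the cut-off of 40 are hypothetical; each workflow would choose its own policy.

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"

DOC = (
    '<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">'
    '<para><span its:locQualityType="typographical" '
    'its:locQualityComment="Sentence without capitalization" '
    'its:locQualitySeverity="50">this</span> is an example.</para></doc>'
)


def needs_manual_review(xml_text, max_severity=40):
    """Return True if any inline issue has a severity above max_severity.

    Issues that carry no severity are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        severity = element.get(ITS + "locQualitySeverity")
        if severity is not None and int(severity) > max_severity:
            return True
    return False


print(needs_manual_review(DOC))
```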
1.2.2.2 Example of local inline markup in HTML5
The following example uses local HTML5 markup with CSS to highlight quality issues in a browser rendition of the document. For illustration, the markup uses two fictional tools, each indicated by a different set of bracketing characters in the CSS. The markup should not be interpreted as referring to any actual quality assurance system.
<!DOCTYPE html> <html lang="en"> <head> <title>Telharmonium 1897</title> <style type="text/css"> [its-loc-quality-type]{ border:1px solid green; margin:2px; } [its-loc-quality-type = untranslated]{ background-color:red; } [its-loc-quality-type = whitespace]{ background-color:yellow; } [its-loc-quality-type = inconsistent-entities]{ background-color:#9DFFE1; } [its-loc-quality-type = misspelling]{ background-color:#FFE2F7; } [its-loc-quality-severity = "100"]{ border: 6px solid red; } [its-loc-quality-profile-ref = "abc"]:before{ content:"⇛"; } [its-loc-quality-profile-ref = "abc"]:after{ content:"⇚"; } [its-loc-quality-profile-ref = "grammar"]:before{ content:"❮"; } [its-loc-quality-profile-ref = "grammar"]:after{ content:"❯"; } </style> </head> <body> <h1 id="h0001" its-loc-quality-profile-ref="abc" its-loc-quality-type="untranslated" data-mytool-qacode="target_equals_source" >Telharmonium (1897)</h1> <p id="p0001"> <span class="segment" id="s0001"> <span its-loc-quality-profile-ref="abc" its-loc-quality-type="inconsistent-entities" its-loc-quality-comment="Should be Thomas Cahill. Why is Batman in the picture?" its-loc-quality-severity="100" data-mytool-qacode="named_entity_not_found" >Christian Bale</span> <span its-loc-quality-profile-ref="abc" its-loc-quality-type="whitespace" its-loc-quality-severity="10" data-mytool-qacode="extra_space_around_punctuation" >(1867 – 1934)</span> conceived of an instrument that could transmit its sound from a power plant for hundreds of miles to listeners over telegraph wiring. </span> <span class="segment" id="s0002">Beginning in 1889 the sound quality of regular telephone concerts was very poor on account of the buzzing generated by carbon-granule microphones. 
As a result Cahill decided to set a new standard in perfection of sound <span its-loc-quality-profile-ref="grammar" its-loc-quality-type="misspelling" its-loc-quality-severity="50" its-loc-quality-comment="should be &quot;quality&quot;" >qulaity</span> with his instrument, a standard that would not only satisfy listeners but that would overcome all the flaws of traditional instruments. </span> </p> </body> </html>
1.2.2.3 Example of local standoff markup in XML
The following example shows a document using local standoff markup to encode the issues. The mrk element delimits the content to mark up and holds a locQualityIssueRef attribute that points to the locQualityIssue element where the issues are listed.
<xliff version='1.2' xmlns='urn:oasis:names:tc:xliff:document:1.2' xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0"> <file original='example.doc' source-language='en' datatype='plaintext'> <body> <trans-unit id='1'> <source xml:lang='en'>This is the content</source> <target xml:lang='fr'><mrk mtype='x-itslq' its:locQualityIssueRef="#lq1">c'es</mrk> le contenu</target> <its:locQualityIssue xml:id="lq1"> <its:locQualityIssueItem locQualityType="misspelling" locQualityComment="'c'es' is unknown. Could be 'c'est'" locQualitySeverity="50"/> <its:locQualityIssueItem locQualityType="typographical" locQualityComment="Sentence without capitalization" locQualitySeverity="30"/> </its:locQualityIssue> </trans-unit> </body> </file> </xliff>
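A consumer of standoff markup has to follow the reference from the annotated element to the list of issues. The Python sketch below shows one way to do this for same-document references of the form #id. The document is a reduced, hypothetical variant of the example above (no XLIFF namespace), and issues_for is an illustrative name.

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <para><mrk its:locQualityIssueRef="#lq1">c'es</mrk> le contenu</para>
  <its:locQualityIssue xml:id="lq1">
    <its:locQualityIssueItem locQualityType="misspelling"
      locQualityComment="'c'es' is unknown. Could be 'c'est'" locQualitySeverity="50"/>
    <its:locQualityIssueItem locQualityType="typographical"
      locQualityComment="Sentence without capitalization" locQualitySeverity="30"/>
  </its:locQualityIssue>
</doc>"""


def issues_for(root, element):
    """Return the attribute dicts of the issue items that element refers to."""
    ref = element.get(ITS + "locQualityIssueRef")
    if ref is None or not ref.startswith("#"):
        return []  # only same-document fragment references are handled here
    target_id = ref[1:]
    for issue in root.iter(ITS + "locQualityIssue"):
        if issue.get(XML_ID) == target_id:
            return [dict(item.attrib)
                    for item in issue.findall(ITS + "locQualityIssueItem")]
    return []


root = ET.fromstring(DOC)
for item in issues_for(root, root.find(".//mrk")):
    print(item["locQualityType"], item["locQualitySeverity"])
```

Each returned issue pertains to the content of the element that holds the reference, not to the locQualityIssue element itself.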
1.2.2.4 Example of local standoff markup with a global rule in XML
The following example shows a document using local standoff markup to encode the issues. In this case, however, the mrk element does not allow attributes from another namespace, so locQualityIssueRef cannot be used directly. Instead, a global rule is used to map the function of locQualityIssueRef to a non-ITS construct, here the ref attribute of any mrk element that has its type attribute set to "x-itslq".
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0"> <file> <header> <its:rules version="2.0"> <its:locQualityRule selector="//mrk[@type='x-itslq']" locQualityIssueRefPointer="@ref"/> </its:rules> </header> <unit id='1'> <segment> <source>This is the content</source> <target><mrk type='x-itslq' ref="#lq1">c'es</mrk> le contenu</target> </segment> <its:locQualityIssue xml:id="lq1"> <its:locQualityIssueItem locQualityType="misspelling" locQualityComment="'c'es' is unknown. Could be 'c'est'" locQualitySeverity="50"/> <its:locQualityIssueItem locQualityType="typographical" locQualityComment="Sentence without capitalization" locQualitySeverity="30"/> </its:locQualityIssue> </unit> </file> </doc>
1.2.2.5 Example of local standoff markup in HTML5
The following example shows a document using local standoff markup to encode the issues. The span element delimits the content to mark up and holds an its-loc-quality-issue-ref attribute that points to a special span element where the issues are listed within a set of other special span elements.
<!DOCTYPE html> <html lang="en"> <head> <title>Example</title> </head> <body> <p><span its-loc-quality-issue-ref="#lq1">C'es</span> le contenu</p> <span its-loc-quality-issue id="lq1"> <span its-loc-quality-issue-item its-loc-quality-type="misspelling" its-loc-quality-comment="'c'es' is unknown. Could be 'c'est'" its-loc-quality-severity="50"></span> <span its-loc-quality-issue-item its-loc-quality-type="typographical" its-loc-quality-comment="Sentence without capitalization" its-loc-quality-severity="30"></span> </span> </body> </html>
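The HTML examples on this page spell the attribute names differently from the XML ones: an its- prefix, lower case, and hyphens at each former capital letter. The helper below captures that pattern as inferred from the examples; the function name html_attribute_name is hypothetical, and this page does not state the convention as a rule.

```python
import re


def html_attribute_name(xml_name):
    """Convert an ITS XML attribute name to the HTML form seen in the examples."""
    # Insert a hyphen before every capital letter except a leading one.
    hyphenated = re.sub(r"(?<!^)([A-Z])", r"-\1", xml_name)
    return "its-" + hyphenated.lower()


for name in ("locQualityType", "locQualityIssueRef", "locQualitySeverity"):
    print(name, "->", html_attribute_name(name))
```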
1.2.3 Values of locQualityType
The locQualityType attribute provides a basic level of interoperability between different localization quality assurance systems. It offers a list of high-level quality issue types common in automatic and human localization quality assessment. Localization quality assessment tools can map their internal categories to these categories in order to exchange information about the kinds of issues they identify and take appropriate action even if another tool does not know the specific issues identified by the generating tool.
The values listed in the following table are allowed for locQualityType. The values a tool implementing locQualityType produces for the attribute MUST match one of the values provided in this table and MUST be semantically accurate. If a tool can map its internal values to these categories it MUST do so and MUST NOT use the value other, which is reserved strictly for values that cannot be mapped to these values.
Value | Description | Examples | Notes |
---|---|---|---|
terminology | An incorrect term or a term from the wrong domain was used or terms are used inconsistently | | Should not be confused with the ITS terminology data category. |
mistranslation | The content of the target mistranslates the content of the source | | Issues related to translation of specific terms related to the domain or task-specific language should be categorized as terminology issues |
omission | Necessary text has been omitted from the localization or source | | This category should not be used for missing whitespace or formatting codes, but instead should be reserved for linguistic content. |
untranslated | Content that should have been translated was left untranslated | | omission takes precedence over untranslated. Omissions are distinct in that they address cases where text is not present, while untranslated addresses cases where text has been carried from the source untranslated. |
addition | The translated text contains inappropriate additions | | |
duplication | Content has been duplicated improperly | | |
inconsistency | The text is inconsistent with itself (NB: not for use with terminology inconsistency) | | |
grammar | The text contains a grammatical error (including errors of syntax and morphology) | | |
legal | The text is legally problematic (e.g., it is specific to the wrong legal system) | | |
register | The text is written in the wrong linguistic register or uses slang or other language variants inappropriate to the text | | |
locale-specific-content | The localization contains content that does not apply to the locale for which it was prepared | | Legally inappropriate material should be classified as legal |
locale-violation | Text violates norms for the intended locale | | |
style | The text contains stylistic errors | | |
characters | The text contains characters that are garbled or incorrect or that are not used in the language in which the content appears | | |
misspelling | The text contains a misspelling | | |
typographical | The text has typographical errors such as omitted/incorrect punctuation, incorrect capitalization, etc. | | |
formatting | The text is formatted incorrectly | | |
inconsistent-entities | The source and target text contain different named entities (dates, times, place names, individual names, etc.) | | |
numbers | Numbers are inconsistent between source and target | | Some tools may correct for differences in units of measurement to reduce false positives |
markup | There is an issue related to markup or a mismatch in markup between source and target | | |
pattern-problem | The text fails to match a pattern that defines allowable content (or matches one that defines non-allowable content) | | |
whitespace | There is a mismatch in whitespace between source and target content | | |
internationalization | There is an error related to the internationalization of content | | There are many kinds of internationalization errors. This category is therefore very heterogeneous in what it can refer to. |
length | There is a significant difference in source and target length | | What constitutes a “significant” difference in length is determined by the model referred to in the locQualityProfileRef |
uncategorized | The issue has not been categorized | | This category has two uses: (1) a tool can use it to pass through quality data from another tool in cases where the issues from the other tool are not classified (for example, a localization quality assurance tool interfaces with a third-party grammar checker); (2) a tool’s issues are not yet assigned to categories, and, until an updated assignment is made, they may be listed as uncategorized. In the latter case it is recommended that issues be assigned to appropriate categories as soon as possible since uncategorized does not foster interoperability. |
other | Any issue that cannot be assigned to any values listed above. | | This category allows for the inclusion of any issues not included in the previously listed values. This value MUST NOT be used for any tool- or model-specific issues that can be mapped to the values listed above. In addition, this value is not synonymous with uncategorized in that uncategorized issues may be assigned to another precise value, while other issues cannot. If a metric has a “miscellaneous” or “other” category, it should be mapped to this value even if the specific instance of the issue might be mapped to another category. |
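A tool-side mapping from native issue codes to these values might look like the following Python sketch. The native codes are borrowed from the data-mytool-qacode attributes of the fictional tool in the HTML5 inline example; the table NATIVE_TO_ITS and the function to_loc_quality_type are hypothetical. Codes that have no mapping yet are passed through as uncategorized, which requires a comment.

```python
# Hypothetical native codes, taken from the data-mytool-qacode values of the
# fictional tool in the HTML5 example; a real tool would publish its own table.
NATIVE_TO_ITS = {
    "target_equals_source": "untranslated",
    "named_entity_not_found": "inconsistent-entities",
    "extra_space_around_punctuation": "whitespace",
}


def to_loc_quality_type(native_code, comment=None):
    """Return a (type, comment) pair for a tool-native issue code."""
    if native_code in NATIVE_TO_ITS:
        return NATIVE_TO_ITS[native_code], comment
    # Not yet classified: 'uncategorized' MUST be accompanied by a comment.
    return "uncategorized", comment or f"unmapped native code: {native_code}"


print(to_loc_quality_type("target_equals_source"))
print(to_loc_quality_type("some_new_check"))
```

The value other is deliberately not used as the fallback here, since it is reserved for issues that cannot be mapped at all rather than for issues that have not been mapped yet.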
2 Localization Quality Precis
2.1 Definition
The Localization Quality Precis data category is used to express an overall measurement of the localization quality of a document.
This data category makes it possible to specify a quality score for a given document and to indicate what constitutes a passing score. It can also point to a profile that describes the quality assessment model used for the scoring.
2.2 Implementation
The Localization Quality Precis data category can be expressed with global rules, or locally on individual elements. The information applies to the textual content of the element, including child elements, but excluding attributes.
2.2.1 GLOBAL
The locQualityPrecisRule element contains the following:
- A required selector attribute. It contains an absolute selector which selects the nodes to which this rule applies.
- Exactly one of the following:
  - A locQualityPrecisScore attribute. Its value is an integer between 0 and 100 with higher values indicating a better score.
  - A locQualityPrecisScorePointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityPrecisScore.
- None or exactly one of the following:
  - A locQualityPrecisThreshold attribute. Its value is an integer between 0 and 100 which indicates the lowest score value that constitutes a passing score in the profile used.
  - A locQualityPrecisThresholdPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityPrecisThreshold.
- None or exactly one of the following:
  - A locQualityPrecisProfileRef attribute. Its value is a URI pointing to the reference document describing the quality assessment model used for the scoring.
  - A locQualityPrecisProfileRefPointer attribute that contains a relative selector pointing to a node with the exact same semantics as locQualityPrecisProfileRef.
2.2.1.1 Example of global markup in XML
<doc> <header> <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityPrecisRule selector="/doc" locQualityPrecisScore="100" locQualityPrecisThreshold="95" locQualityPrecisProfileRef="http://example.org/qaModel/v13" /> </its:rules> </header> <para>This is an example</para> </doc>
2.2.1.2 Example of global markup using pointers in XML
<doc> <header qaScore="100" qaPassingScore="95" qaProfile="http://example.org/qaModel/v13" > <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityPrecisRule selector="/doc/header" locQualityPrecisScorePointer="./@qaScore" locQualityPrecisThresholdPointer="./@qaPassingScore" locQualityPrecisProfileRefPointer="./@qaProfile" /> </its:rules> </header> <para>This is an example</para> </doc>
2.2.1.3 Example of global markup in HTML
<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"/> <title>Example</title> <link href="EX-locQualityPrecis-global-html5-1.xml" rel="its-rules"/> </head> <body> <p>This is an example.</p> </body> </html>
Corresponding rules file:
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:locQualityPrecisRule selector="/html" locQualityPrecisScore="100" locQualityPrecisThreshold="95" locQualityPrecisProfileRef="http://example.org/qaModel/v13" /> </its:rules>
2.2.2 LOCAL
The following local markup is available for the Localization Quality Precis data category:
- A locQualityPrecisScore attribute. Its value is an integer between 0 and 100 with higher values indicating a better score.
- An optional locQualityPrecisThreshold attribute. Its value is an integer between 0 and 100 which indicates the lowest score value that constitutes a passing score in the profile used.
- An optional locQualityPrecisProfileRef attribute. Its value is a URI pointing to the reference document describing the quality assessment model used for the scoring.
2.2.2.1 Example of local markup in XML
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0" its:locQualityPrecisScore="100" its:locQualityPrecisThreshold="95" its:locQualityPrecisProfileRef="http://example.org/qaModel/v13" > <para>They continued to discuss the documents and Paul’s predicament for some time. At the end, Paul pocketed the whistle back and Renia escorted her visitors outside, leaving the big room empty.</para> <para>For a few moments the foliage of trees played with the sunlight through the many glasses of the high ceiling, dancing silently on the red titles of floor and the polish wood of the bookshelves. Then something moved at the back of the room, behind the one of the last tall bookshelves, in the darkest corner of the room. Someone walked carefully out of the shadows and came at the main table. He paged quickly through the documents and their translations, a smug smile on his thin lisps. After a while he put back the papers as he had found them, slipped into the corridor and left by one of the back doors.</para> </doc>
2.2.2.2 Example of local markup in HTML
<!DOCTYPE html> <html lang="en" its-loc-quality-precis-score="100" its-loc-quality-precis-threshold="95" its-loc-quality-precis-profile-ref="http://example.org/qaModel/v13"> <head> <title>Chapter 5 - The Watchtower</title> </head> <body> <h1>The Watchtower</h1> <p>Far to the east, beyond the Great Forest, and the rolling grasslands beyond the White hills, the wind blew down from an immense wall of high mountains: the Fangs.</p> <p>At the flank of one of the slopes, perched on a tall rocky knoll above a narrow winding road, a tower stood against the clear bleu of the sky.</p> <p>Three men surveyed the passage in silence. On each side of the fort, the huge mass of the Fangs casted long shadows. The cragged peaks glittered of permanent snow and everlasting ice. Behind the watchtower, far to the west, grassy hills dotted with small thickets of dark trees rolled under the warm afternoon sky.</p> </body> </html>
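Since the threshold is defined as the lowest score that still passes, a consumer can derive a pass/fail result by comparing the two values. The following Python sketch does this for local XML markup; the function name precis_passes and the choice to return None when either value is absent are assumptions, not part of the proposal.

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"


def precis_passes(element):
    """Return True or False for pass/fail, or None if score or threshold is missing."""
    score = element.get(ITS + "locQualityPrecisScore")
    threshold = element.get(ITS + "locQualityPrecisThreshold")
    if score is None or threshold is None:
        return None
    # The threshold is the lowest passing score, so equality still passes.
    return int(score) >= int(threshold)


doc = ET.fromstring(
    '<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0" '
    'its:locQualityPrecisScore="100" its:locQualityPrecisThreshold="95">'
    '<para>This is an example</para></doc>'
)
print(precis_passes(doc))
```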
3 Annex: Mapping of Tool-Specific Quality Codes to locQualityType Values (Non-Normative)
This Annex is informative.
The following table provides mappings of native quality assurance issue codes for a number of common localization quality tools to locQualityType values. Tool developers are free to map their own issue codes to the locQualityType values and are encouraged to make their mappings publicly available. Tools that produce ITS 2.0 loc-quality markup should ensure that the output of their tools matches any publicly available mappings they may produce.
Note: These mappings are provided as examples only. In the event of a discrepancy between the mapping published by a developer and this annex, the statements from the developer take precedence over this annex.
locQualityType value | CheckMate | xliff:doc | QA Distiller | SAE J2450 | LISA QA Model (UI) | LISA QA Model (doc)* - language only** |
---|---|---|---|---|---|---|
terminology | | | | | | |
mistranslation | | | | | | |
omission | | | | | | |
untranslated | | | | | | |
addition | | | | | | |
duplication | Not addressed in any of these metrics. It may be possible to treat this as a case of addition. | | | | | |
inconsistency | | | | | | |
grammar | | | | | | |
legal | Not addressed in any of these metrics. However, legal compliance checking is a big deal for regulated industries and forms a core part of their metrics. | | | | | |
register | | | | | | |
locale-specific-content | | | | | | |
locale-violation | | | | | | |
style | | | | | | |
characters | | | | | | |
misspelling | | | | | | |
typographical | | | | | | |
formatting | | (Numerous) | | | | |
inconsistent-entities | | | | | | |
numbers | | | | | | |
markup | | | | | | |
pattern-problem | | | | | | |
whitespace | | | | | | |
internationalization | (The examples for this code are broader than the type category here.) | | | | | |
length | | | | | | |
uncategorized | | | | | | |
other | | | | | | |
(* There are significant discrepancies between the categories in the LISA QA Model software and its documentation. The relationship between the two is unclear, so both are listed here.)
(** The LISA QA Model documentation addresses numerous issues related to software formatting that are outside the scope of the ITS 2.0 loc-quality model. For the sake of conciseness and clarity, these are not listed in this document.)