Intended audience: mainly people developing the internationalization checker, but also anyone who wants to quickly track down the source references behind a particular report.
This page lists all the report messages used by the W3C Internationalization Checker. As well as the text of the report and the severity of the report (with variants), it lists the conditions which trigger that report. It also lists references to articles or specifications that provide authoritative sources for the report.
The conditions that trigger a report are often dependent on the format or mime-type of the page being considered. The checker currently supports only a subset of format/mime-type combinations. The following keywords are used to indicate the various possibilities that are currently tracked:
If there are no keywords, the conditions apply to all formats and mime-types.
The checker has not yet been tailored to deal with XHTML5 or Polyglot documents.
This page will be updated from time to time, as new features are added to the checker or existing features are refined.
A character encoding is specified in the HTTP header (%1), but there was no matching encoding declaration in the page. This may lead to problems later if there is a chance that the document will be read from or saved to disk, CD, etc.
In addition, the W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
Add information to indicate the character encoding of the page inside the page itself.
Declaring the character encoding in an X/HTML document
The UTF-8 Byte Order Mark (BOM) was found at the beginning of the page. It can sometimes introduce blank spaces or short sequences of strange-looking characters (such as ï»¿).
Using an editor or an appropriate tool, remove the byte order mark from the beginning of the file. This can often be achieved by saving the document with the appropriate settings in the editor. On the other hand, some editors (such as Notepad on Windows) do not give you a choice, and always add the byte order mark. In this case you may need to use a different editor.
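As a rough illustration of the removal step, the following Python sketch strips a leading UTF-8 BOM from a page's bytes. The function name strip_utf8_bom and the sample page are assumptions made for this example; they are not part of the checker.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical page content that an editor saved with a BOM.
bom_page = codecs.BOM_UTF8 + b"<!DOCTYPE html><title>Example</title>"
cleaned = strip_utf8_bom(bom_page)
```

Content without a BOM passes through unchanged, so the function can be applied to every file in a build step without first checking which ones are affected.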
Polyglot, 3. Specifying a Document's Character Encoding specification
The UTF-8 Byte Order Mark (BOM) was found below the top of the page. This is often caused when the BOM is at the top of a file or chunk of content that is included into a page. It can sometimes introduce blank spaces or short sequences of strange-looking characters (such as ï»¿).
Using an editor or an appropriate tool, remove the byte order mark from the beginning of the file or chunk of content where it appears.
If the problem does arise from a BOM at the top of an included file, this can often be achieved by saving the content with appropriate settings in the editor. On the other hand, some editors (such as Notepad on Windows) do not give you a choice, and always add the byte order mark. In this case you may need to use a different editor.
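To locate a BOM that has ended up in the middle of an assembled page, something like the following Python sketch could be used. The name find_inner_boms and the sample bytes are hypothetical; the sketch only reports byte offsets and does not identify which included file introduced the BOM.

```python
import codecs


def find_inner_boms(data: bytes) -> list[int]:
    """Return the byte offsets of UTF-8 BOMs found after offset 0."""
    offsets = []
    pos = data.find(codecs.BOM_UTF8, 1)  # start at 1 so a BOM at the very top is ignored
    while pos != -1:
        offsets.append(pos)
        pos = data.find(codecs.BOM_UTF8, pos + 1)
    return offsets


# Hypothetical page assembled from a template plus an included chunk saved with a BOM.
assembled = b"<p>header</p>" + codecs.BOM_UTF8 + b"<p>included</p>"
inner_offsets = find_inner_boms(assembled)
```

The reported offsets point at the start of each included chunk that carries a BOM, which helps narrow down which source file needs to be re-saved.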
The character encoding of this page is indicated using a byte-order mark.
Although this is usually sufficient to tell a browser the encoding of the page, the W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document as well, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
Add a meta tag or XML declaration, as appropriate, to your page to indicate the character encoding used.
Declaring the character encoding for HTML
html
This page currently uses the following XML declaration:
%1
XML declarations are used by XML processors, and are not appropriate for pages that are parsed as HTML.
html5
This page currently uses the following XML declaration:
%1
HTML5 only allows comments before the Doctype, so this prevents the use of the XML declaration.
xhtml
This page currently uses the following XML declaration:
%1
XML declarations are sometimes used with XHTML pages that are served as text/html, so that when those files are read by an XML parser, rather than an HTML parser, the encoding information is recognized. However, an XML declaration in an HTML document can cause Internet Explorer to render in quirks mode rather than standards mode, so it is generally recommended that you avoid its use for such hybrid documents. If you use UTF-8 you don't need an XML declaration for a conforming XML parser.
html,html5
Remove the XML declaration from your page. Use a meta element instead to declare the character encoding of the page.
xhtml
Since you are using XHTML 1.x but serving it as text/html, use UTF-8 for your page and remove the XML declaration.
Declaring the character encoding for HTML
HTML5, 8.1 Writing HTML documents specification
Polyglot Markup: HTML-Compatible XHTML Documents, 2. Processing Instructions and the XML Declaration specification
XHTML 1.0, C.1. Processing Instructions and the XML Declaration specification
XHTML 1.0, C.9. Character Encoding specification
XML 1.0, 4.3.3 Character Encoding in Entities specification
This page only declares a character encoding in the following XML declaration.
%1
An HTML parser does not recognize encoding declarations in the XML declaration, so effectively no encoding has been specified for this page.
Add a meta element to indicate the character encoding of the page. You could also declare the encoding in the HTTP header, but it is recommended that you always use a meta element too.
Declaring the character encoding for HTML
HTML: The Markup Language, 4.2. Character encoding declaration specification
HTML5, 8.1 Writing HTML documents specification
Polyglot Markup: HTML-Compatible XHTML Documents, 2. Processing Instructions and the XML Declaration specification
XHTML 1.0, C.1. Processing Instructions and the XML Declaration specification
XHTML 1.0, C.9. Character Encoding specification
meta character encoding declaration uses http-equiv
This page uses the following character encoding declaration with an http-equiv attribute:
%1
This is acceptable for HTML5; however, you may want to consider using the meta element with a charset attribute instead. For example:
<meta charset="%2">
Replace the http-equiv and content attributes in your meta tag with a charset attribute.
Declaring the character encoding for HTML
HTML5, 4.2.5.3 Pragma directives specification
meta encoding declarations don't work with XML
This page is being served as XML and there is a character encoding declaration in the following meta tag:
%1
Encoding declarations in meta tags are not recognised by XML processors, so this declaration has no actual effect.
In the absence of another declaration, the XML processor will recognize the encoding as UTF-8 or UTF-16 by sniffing the start of the file. If you sometimes serve this page as HTML, the meta tag will be useful then, but otherwise it would be better to use an XML declaration than a meta tag to identify the encoding of the page. A visible in-document declaration is useful in either case, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
Unless you sometimes serve this page as text/html, remove the meta tag and ensure you have an XML declaration with encoding information.
Declaring the character encoding for HTML
HTML: The Markup Language, 4.2. Character encoding declaration specification
HTML5, 4.2.5.5 Specifying the document's character encoding specification
XHTML 1.0, C.9. Character Encoding specification
meta tag with a charset attribute will cause validation to fail
This page is not HTML5 but uses the following meta element to specify the character encoding:
%1
Although all major browsers now recognize this character encoding declaration, and act appropriately, the charset attribute on a meta tag is only referred to by the HTML5 specification. This means that, although things will generally work as you expect in modern browsers, if you try to validate this page you will receive an error message.
If you want this page to be valid HTML, replace the charset attribute with http-equiv and content attributes, e.g. <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>.
Declaring the character encoding for HTML
XHTML 1.0, C.9. Character Encoding specification
HTML 4.01, 5.2.2 Specifying the character encoding specification
meta encoding declarations
The only character encoding declaration for this page is in the following meta element, which specifies an encoding that is neither UTF-8 nor UTF-16:
%1
If a document is treated as XML and is encoded as neither UTF-8 nor UTF-16, you must declare the encoding in the XML declaration. The meta tag declaration is not recognized by an XML processor. When parsed as XML, this document will be treated as UTF-8.
Add an XML declaration with encoding information, or change the character encoding for this page to UTF-8. If this page is never parsed as HTML, you can remove the meta tag.
Declaring the character encoding for HTML
This section still to be worked on
Polyglot Markup, 3. Specifying a Document's Character Encoding specification
XML 1.0, 4.3.3 Character Encoding in Entities specification
meta tag
A document can have only one character encoding, so you only need one meta element to declare it. This page has the following list of meta elements containing character encoding declarations:
%1
Edit the markup to remove all but one meta element.
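The condition above can be sketched in Python with the standard library's html.parser module. The class name MetaEncodingCollector and the sample markup are assumptions for illustration only; this is not the checker's own implementation, and it handles just the two common forms of the meta declaration.

```python
from html.parser import HTMLParser


class MetaEncodingCollector(HTMLParser):
    """Collect the encodings named in meta character encoding declarations."""

    def __init__(self):
        super().__init__()
        self.encodings = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        charset = attributes.get("charset")
        http_equiv = (attributes.get("http-equiv") or "").lower()
        content = (attributes.get("content") or "").lower()
        if charset:
            # HTML5 form: <meta charset="...">
            self.encodings.append(charset.strip().lower())
        elif http_equiv == "content-type" and "charset=" in content:
            # Pragma form: <meta http-equiv="Content-Type" content="...; charset=...">
            value = content.split("charset=", 1)[1].split(";", 1)[0]
            self.encodings.append(value.strip())


# Hypothetical markup with two declarations, which the report above would flag.
sample = (
    '<head><meta charset="utf-8">'
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head>'
)
collector = MetaEncodingCollector()
collector.feed(sample)
has_multiple = len(collector.encodings) > 1
```

A page with more than one entry in collector.encodings would need all but one declaration removed.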
Declaring the character encoding in an X/HTML document
HTML5, 4.2.5 The meta element article
HTML5, 8.2.2 The input stream specification
This HTML5 page has a character encoding declaration in a meta element:
%1
The HTML5 specification disallows the use of meta character encoding declarations with UTF-16 encoded documents. A UTF-16 byte-order mark (BOM) is the only in-document encoding allowed.
Remove the meta encoding declaration.
Declaring the character encoding for HTML
The meta character encoding declaration for this page says that the page is encoded as UTF-16:
%1
The character encoding declaration is incorrect: this is not a UTF-16 encoded file. In HTML 4.01 the page will be parsed as the default encoding for the browser. In XHTML 1.x and HTML5 it will be treated as UTF-8. If your document uses a character encoding other than these, you will likely see corruption of the non-ASCII text on your page.
Change the encoding declaration to reflect the actual encoding of the page.
Declaring the character encoding for HTML
HTML5, 8.2.2 The input stream specification
The following encoding declaration(s) specify that the page is either big-endian (UTF-16BE) or little-endian (UTF-16LE):
%1
You should not use the UTF-16BE and UTF-16LE charset names in character encoding declarations for markup. Instead, use just "UTF-16". All UTF-16 pages should start with a byte-order mark, and this will indicate whether the character encoding used is big- or little-endian.
Ensure that the page starts with a byte-order mark (BOM) and change the encoding declaration(s) to "UTF-16".
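The way a byte-order mark signals endianness can be illustrated with a short Python sketch. The function name utf16_byte_order and the sample bytes are hypothetical, and the sketch deliberately ignores UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 little-endian BOM.

```python
import codecs


def utf16_byte_order(data: bytes):
    """Return 'big-endian', 'little-endian', or None based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little-endian"
    return None  # no UTF-16 BOM at the start of the data


# Hypothetical page bytes: a BOM followed by big-endian UTF-16 text.
utf16_page = codecs.BOM_UTF16_BE + "<html>".encode("utf-16-be")
order = utf16_byte_order(utf16_page)
```

Because the BOM already carries the byte order, the declaration itself only needs to say "UTF-16".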
Declaring the character encoding for HTML
XML 1.0, 4.3.3 Character Encoding in Entities specification
meta tag not within 1024 bytes of the file start
The following character encoding declaration did not fit completely within the first 1,024 bytes of the start of the page:
%1
In HTML5 this will mean that the encoding declaration is not recognized.
Move the character encoding declaration nearer to the top of the page. Usually it is best to make it the first thing in the head element.
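A minimal Python sketch of the 1,024-byte condition follows. The regular expression is a rough approximation rather than a real HTML parser (for instance, it would also match a meta tag inside a comment), and the names META_CHARSET and declaration_fits are assumptions made for this example.

```python
import re

# Rough pattern for a meta tag that mentions charset; not a full HTML parser.
META_CHARSET = re.compile(rb"<meta[^>]*charset[^>]*>", re.IGNORECASE)


def declaration_fits(data: bytes, limit: int = 1024) -> bool:
    """Return True if the first meta charset declaration ends within the first `limit` bytes."""
    match = META_CHARSET.search(data)
    return match is not None and match.end() <= limit


# Hypothetical pages: one declares the encoding early, one after a long comment.
early = b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>t</title>'
late = b"<!DOCTYPE html><html><head><!--" + b"x" * 1100 + b'--><meta charset="utf-8">'
```

Here declaration_fits(early) holds, while declaration_fits(late) does not, because the long comment pushes the declaration past the limit.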
Declaring the character encoding for HTML
HTML5, 8.2.2.1 Determining the character encoding specification
charset attribute used on a or link elements
The following a and/or link elements contained a charset attribute:
%1
The charset attribute has been deprecated on these elements in HTML5, so it is recommended that you avoid using it in future for any format.
Remove the charset attribute. If pointing to a page that is under your control, ensure that any appropriate character encoding information is provided for that page.
Declaring the character encoding for HTML
HTML5, 11.2 Non-conforming features specification
There is no declaration or byte-order mark to indicate the character encoding of the page. You should always specify the encoding used for an HTML page. If you don't, you risk that characters in your content will be incorrectly interpreted. This is not just an issue of human readability: increasingly, machines need to understand your data too.
HTML5 requires a meta encoding declaration if the character encoding is not declared in the HTTP header or a byte-order mark.
The W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
Add information to indicate the character encoding of the page.
Declaring the character encoding in an X/HTML document
HTML: The Markup Language, 4.2. Character encoding declaration specification
HTML5, 4.2.5.5 Specifying the document's character encoding specification
HTML 4.01, 5.2.2 Specifying the character encoding specification
Character Model for the World Wide Web, 4.4.1 Mandating a unique character encoding, C034 specification
No character encoding is declared for this page. Since it is being served as XML, the browser and any XML processor will assume that the encoding is UTF-8. If you are not saving your document as UTF-8, you will find that characters are being corrupted.
Even if you intend the document to be read as UTF-8, the W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
Add information to indicate the character encoding of the page inside the page itself.
Declaring the character encoding for HTML
The page currently uses the following non-UTF-8 character encoding declaration(s):
%1
UTF-8 is based on Unicode. A Unicode character encoding makes it easier to use a wide range of characters, from the registered trademark symbol to characters in multiple languages. It also simplifies the use of scripts and databases for multilingual sites, and allows you to more easily expand your site to cover new languages, when needed. Using non-UTF-8 encodings can also have unexpected results on form submission and URL encodings, which use the document's character encoding by default. It is not a requirement to use UTF-8, but the HTML5 specification recommends its use, and you should consider it.
UTF-16 is also a character encoding based on Unicode, but is little used on the Web, and generally best avoided.
Set your authoring tool to save your content as UTF-8, and change the encoding declarations.
Changing the encoding of a document
lang attribute without an associated xml:lang attribute
xhtml
In the following tag or tags the lang attribute is not accompanied by an xml:lang attribute.
%1
This may cause problems if you try to process this XHTML page as XML, since XML processors recognise xml:lang but don't recognise lang. For XHTML you should normally use both.
xhtml10x,xhtml11x
In the following tag or tags the lang attribute is not accompanied by an xml:lang attribute.
%1
XML processors recognise xml:lang but don't recognise lang. When serving a page as XML, you should have an xml:lang attribute wherever there is a lang attribute. (You only need to have the lang attribute if you plan to serve the page as text/html also.)
Add an xml:lang attribute to each of the above tags, with the same value as the lang attribute.
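As an illustration of the trigger condition, this Python sketch lists tags that carry lang without xml:lang. It uses the standard library's html.parser rather than an XML parser, and the class name LangWithoutXmlLang and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LangWithoutXmlLang(HTMLParser):
    """Record the names of tags that have a lang attribute but no xml:lang attribute."""

    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if "lang" in names and "xml:lang" not in names:
            self.offenders.append(tag)


# Hypothetical XHTML fragment: the html tag is missing xml:lang, the p tag is not.
lang_finder = LangWithoutXmlLang()
lang_finder.feed('<html lang="de"><p lang="fr" xml:lang="fr">Bonjour</p></html>')
```

After feeding the fragment, lang_finder.offenders names each tag that needs an xml:lang attribute added.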
Language declarations explained
Using attributes to declare language
Polyglot markup, 7.2 Language Attributes specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
XML 1.0, 2.12 Language Identification specification
xml:lang attribute without an associated lang attribute
In the following tag or tags the xml:lang attribute is not accompanied by a lang attribute.
%1
This causes a problem if you try to display an XHTML page as HTML, since HTML parsers don't recognise xml:lang; they only recognise the lang attribute.
HTML5 and XHTML5 require you to use a lang attribute if you use an xml:lang attribute (and the values must be the same).
Add a lang attribute to each of the above tags, with the same value as the xml:lang attribute.
Language declarations explained
Using attributes to declare language
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
lang attribute value did not match an xml:lang value when they appeared together on the same tag.
In each of the following tags the language values of the lang and xml:lang attributes don't match:
%1
Change one of the values in each tag by editing the markup, so that both attributes carry the same language value.
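A mismatch check of this kind might look like the following Python sketch. It compares the two values case-insensitively, which is an assumption made here on the basis that language tags are not case-sensitive; the class name LangMismatchFinder and the sample markup are illustrative.

```python
from html.parser import HTMLParser


class LangMismatchFinder(HTMLParser):
    """Record tags whose lang and xml:lang values differ (ignoring case)."""

    def __init__(self):
        super().__init__()
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        lang = attributes.get("lang")
        xml_lang = attributes.get("xml:lang")
        if lang is None or xml_lang is None:
            return  # only tags carrying both attributes are compared
        if lang.lower() != xml_lang.lower():
            self.mismatches.append((tag, lang, xml_lang))


# Hypothetical fragment: the first p mismatches, the second differs only in case.
mismatch_finder = LangMismatchFinder()
mismatch_finder.feed('<p lang="en" xml:lang="fr">a</p><p lang="en-GB" xml:lang="en-gb">b</p>')
```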
Language declarations explained
Using attributes to declare language
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
html tag has no language attribute
There is no language attribute in the html tag.
%1
A language attribute on the html tag sets the default natural language for the page. This information can be used for processing the content in various ways, including such things as spell-checking, accessibility, data formatting, and choice of styles for rendering the page. Every page should have the correct default language specified.
For HTML files, this should be a lang attribute. For XHTML served as HTML you should use both the lang and xml:lang attributes. For files served as XML only, you should have xml:lang, but you don't need to have the lang attribute.
html,html5
Add a lang attribute that indicates the default language of your page.
Example: lang='de'
xhtml
Since this is an XHTML page served as HTML, add both a lang attribute and an xml:lang attribute to the html tag to indicate the default language of your page. The lang attribute is understood by HTML processors but not by XML processors, and the reverse is true of the xml:lang attribute.
Example: lang="de" xml:lang="de"
xhtml10x,xhtml11x
Add an xml:lang attribute that indicates the default language of your page.
Example: xml:lang='de'
Language declarations explained
Using attributes to declare language
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
HTML 4.01, 8.1 Specifying the language of content: the lang attribute specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
XML 1.0, 2.12 Language Identification specification
html tag will have no effect
This is the html tag in this document.
%1
A language attribute on the html tag sets the default natural language for the page. This information can be used for processing the content in various ways, including such things as spell-checking, accessibility, data formatting, and choice of styles for rendering the page. Every page should have the correct default language specified.
HTML parsers only recognize the lang attribute. XML parsers only recognize the xml:lang attribute. On this page the wrong attribute is being used, and so the default language of the page is not being recognized.
html,html5
Since this page is served as HTML, use the lang attribute.
xhtml
Since this page is served as HTML, use the lang attribute. If there is a chance that the same page will also be processed by an XML parser, use both the lang attribute and the xml:lang attribute.
xhtml10x,xhtml11x
Since this page is served as XML, use the xml:lang attribute instead of a lang attribute. If there is a chance that this page will also be served as text/html in some circumstances, use both.
Language declarations explained
Using attributes to declare language
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
HTML 4.01, 8.1 Specifying the language of content: the lang attribute specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
XML 1.0, 2.12 Language Identification specification
xml:lang attributes
The page contains xml:lang attributes in the following places:
%1
The xml:lang attribute is not valid unless you are using XHTML.
Remove the xml:lang attributes from the markup, replacing them, where appropriate, with lang attributes.
Language declarations explained
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
HTML 4.01, 8.1 Specifying the language of content: the lang attribute specification
XHTML 1.0, C.7. The lang and xml:lang Attributes specification
XHTML 1.1, 3. The XHTML 1.1 Document Type specification
In the following tag or tags the language values of the lang and xml:lang attributes are not well-formed according to BCP 47. Attribute values must contain a maximum of one language tag, and a language tag is composed of one or more subtags taken from the IANA Language Subtag Registry, separated by hyphens (e.g. zh-Hans-SG).
%1
Change the attribute values to conform to BCP47 syntax rules.
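The syntax rule can be approximated with a simplified shape check, sketched here in Python. This is deliberately much looser than the full BCP 47 grammar: it does not consult the IANA Language Subtag Registry, does not enforce subtag order, and rejects private-use tags that start with a single letter such as x. The names LANGUAGE_TAG_SHAPE and looks_like_language_tag are hypothetical.

```python
import re

# Simplified shape: a 2-8 letter primary subtag, followed by any number of
# hyphen-separated subtags of 1-8 letters or digits. Not the full BCP 47 grammar.
LANGUAGE_TAG_SHAPE = re.compile(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*")


def looks_like_language_tag(value: str) -> bool:
    """Return True if value has the rough shape of a single language tag."""
    return LANGUAGE_TAG_SHAPE.fullmatch(value) is not None
```

With this check, a value such as zh-Hans-SG passes, while values containing more than one tag (en, fr) or an underscore separator (en_US) are rejected.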
Language declarations explained
Internet-Draft: BCP 47 specification
HTML5, 3.2.3.3 The lang and xml:lang attributes specification
Polyglot markup, 7.2 Language Attributes specification
HTML 4.01, 8.1 Specifying the language of content: the lang attribute specification
XML 1.0, 2.12 Language Identification specification
meta element used to set the default document language
This page uses a meta element with the http-equiv attribute value set to Content-Language.
%1
The HTML5 specification has made this type of meta element obsolete in HTML, so you should not use it for pages written in HTML5. This is due to the widespread confusion surrounding the use of this construct. In addition, browsers are inconsistent in the way they handle this information.
Given this, it is strongly recommended that you not use this Content-Language meta element in any HTML format.
Remove the Content-Language meta element, and ensure that you have used an attribute on the html tag to specify the default language of the page.
Language declarations explained
Using attributes to declare language
HTML5, 4.2.5.3 Pragma directives specification
Polyglot markup, 7.2 Language Attributes specification
Unicode allows you to represent certain letters using different sequences of code points. For example é can be represented as LATIN SMALL LETTER E WITH ACUTE or as LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT. To avoid problems when trying to match class or id names against CSS selectors, or for JavaScript lookup, all your markup tags and CSS and JavaScript code should use the same code point sequences for the same text, i.e. be normalised.
Total number of non-NFC names: %1.
%2
It is recommended to save all content as Unicode Normalization Form C (NFC).
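The é example above can be reproduced with Python's standard unicodedata module. The helper name is_nfc and the sample strings are assumptions for illustration.

```python
import unicodedata


def is_nfc(name: str) -> bool:
    """Return True if name is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", name) == name


composed = "caf\u00e9"     # é as LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # e followed by COMBINING ACUTE ACCENT

# The two strings look identical when rendered but do not compare equal
# until both are normalized to the same form.
normalized = unicodedata.normalize("NFC", decomposed)
```

A class name written in the decomposed form would fail to match a CSS selector written in the composed form, which is why saving all content as NFC avoids the problem.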
Unicode normalization forms article
%1 tags found with no class attribute
One or more %1 tags that don't use a class attribute were found in the source code for this page. These tags may cause problems for localization if the content for which they are used has more than one semantic value.
Total number of %1 tags: %2.
Number of %1 tags without a class attribute: %3.
You should not use %1 tags if there is a more descriptive and relevant tag available. If you do use them, it is usually better to add class attributes that describe the intended meaning of the markup, so that you can distinguish one use from another.
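The two counts in this report could be gathered along the lines of the following Python sketch, which uses the standard library's html.parser. The class name ClassUsageCounter, the class value product-name, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class ClassUsageCounter(HTMLParser):
    """Count occurrences of one tag, and how many of them lack a class attribute."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name
        self.total = 0
        self.without_class = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag_name:
            return
        self.total += 1
        if "class" not in dict(attrs):
            self.without_class += 1


# Hypothetical fragment with two b tags, one of which has a descriptive class.
counter = ClassUsageCounter("b")
counter.feed('<p><b>warning</b> and <b class="product-name">Widget</b> and <i>note</i></p>')
```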
Using <b> and <i> tags article
HTML5, 4.6.17 The b element article
HTML5, 4.6.16 The i element specification
dir attribute
html,xhtml,xhtml10x,xhtml11x
In the following tag or tags the value should be one of rtl or ltr:
%1
html5
In the following tag or tags the value should be one of rtl, ltr, or auto:
%1
Correct the attribute values.
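The format-dependent rule above (auto is accepted only for HTML5) can be sketched in Python as follows. The names LEGACY_VALUES, HTML5_VALUES, and DirValueChecker are assumptions for this example, and values are compared case-insensitively.

```python
from html.parser import HTMLParser

LEGACY_VALUES = {"ltr", "rtl"}           # HTML 4.01 and XHTML 1.x
HTML5_VALUES = LEGACY_VALUES | {"auto"}  # HTML5 additionally allows auto


class DirValueChecker(HTMLParser):
    """Record (tag, value) pairs whose dir attribute value is not in the allowed set."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "dir" and (value or "").lower() not in self.allowed:
                self.invalid.append((tag, value))


# Hypothetical fragment checked under both sets of rules.
dir_sample = '<p dir="rtl">a</p><p dir="auto">b</p><p dir="right">c</p>'
legacy_checker = DirValueChecker(LEGACY_VALUES)
legacy_checker.feed(dir_sample)
html5_checker = DirValueChecker(HTML5_VALUES)
html5_checker.feed(dir_sample)
```

Under the legacy rules both auto and right are reported, while under the HTML5 rules only right is.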
Markup for text direction explained
Setting up a right-to-left page
Changing the direction of a block element
HTML5, 3.2.3.5 The dir attribute specification
HTML 4.01, 8.2 Specifying the direction of text and tables: the dir attribute specification
bdo tags found with no dir attribute
One or more bdo tags that don't use a dir attribute were found in the source code for this page. Without a dir attribute, the bidirectional override will not be applied.
Total number of bdo tags: %2.
Number of bdo tags without a dir attribute: %3.
Add a dir attribute to each bdo tag.
Overriding the Unicode bidirectional algorithm
HTML5, 4.6.24 The bdo element specification
HTML 4.01, 8.2.4 Overriding the bidirectional algorithm: the BDO element specification
By: Richard Ishida, W3C.
Content first published 2011-07-08 18:08. Last substantive update 2011-07-08 18:08 GMT. This version 2011-07-08 18:08 GMT
For the history of document changes, search for article-checker in the i18n blog.
Copyright © 2011 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.