This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This was cloned from bug 17490 as part of operation convergence.
Originally filed: 2012-06-14 19:29:00 +0000

================================================================================
#0 contributor@whatwg.org 2012-06-14 19:29:07 +0000
--------------------------------------------------------------------------------
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html
Multipage: http://www.whatwg.org/C#named-character-references
Complete: http://www.whatwg.org/c#named-character-references

Comment:
`entities.json` is invalid syntax and incorrect content

Posted from: 78.20.165.163 by mathias@qiwi.be
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1173.0 Safari/537.1
================================================================================
#1 Mathias Bynens 2012-06-14 19:30:30 +0000
--------------------------------------------------------------------------------
Created attachment 1144
Valid, working version

Based on http://mathias.html5.org/tests/html/named-character-references/data.json
================================================================================
#2 Mathias Bynens 2012-06-15 07:27:19 +0000
--------------------------------------------------------------------------------
http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json currently has the following format:

    {
      "&AElig;": { "codepoints": [0x000C6], "characters": "\u00C6" },
      …
    }

However, hexadecimal integer literals (although valid in JavaScript) aren’t allowed in JSON.

The easiest solution would be to use the numerical value in decimal notation instead, e.g. `198` instead of `0x000C6`. Another solution would be to make the `codepoints` property an array of strings instead of hexadecimal integers.

(You can check for JSON conformance using a tool like http://jsonlint.com/.)
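The conformance problem described in comment #2 can also be reproduced locally without an online linter. The following is a minimal sketch in Python using the standard `json` module; the helper name `is_valid_json` and the sample strings are illustrative assumptions, not part of the spec tooling.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Entry body in the shape quoted in comment #2, with a hexadecimal literal.
hex_entry = '{"codepoints": [0x000C6], "characters": "\\u00C6"}'
# First suggested fix: the same code point in decimal notation.
decimal_entry = '{"codepoints": [198], "characters": "\\u00C6"}'
# Second suggested fix: the code point kept as a string.
string_entry = '{"codepoints": ["0x000C6"], "characters": "\\u00C6"}'

print(is_valid_json(hex_entry))      # False
print(is_valid_json(decimal_entry))  # True
print(is_valid_json(string_entry))   # True
```

Both suggested fixes parse; the decimal form has the advantage that consumers get integers directly and do not need to parse strings themselves.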
================================================================================
#3 Mathias Bynens 2012-06-15 07:41:41 +0000
--------------------------------------------------------------------------------
Possible fix for `entity-processor-json.py`:

Replace:

    codes = '0x' + value[1:6] + ', 0x' + value[7:]

With:

    codes = str(int(value[1:6], 16)) + ', ' + str(int(value[7:], 16))

And replace:

    codes = '0x' + value[1:]

With:

    codes = str(int(value[1:], 16))
================================================================================
#4 Mathias Bynens 2012-06-16 07:14:21 +0000
--------------------------------------------------------------------------------
Heads up: both http://www.whatwg.org/specs/web-apps/current-work/entities.json and http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json still show the old, invalid version.
================================================================================
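The replacement expressions proposed in comment #3 can be tried in isolation. This is only a sketch: the thread does not show the format of `value`, so the code assumes identifiers such as `U000C6` (one code point) or `U0226B-00338` (two code points), which is what the slice offsets suggest. The function name `decimal_codes` and the length check used to pick a branch are hypothetical and are not taken from `entity-processor-json.py`.

```python
def decimal_codes(value: str) -> str:
    """Render the code point(s) encoded in `value` as decimal text.

    Assumed input shapes (not confirmed by the thread):
      "U000C6"        -> one code point
      "U0226B-00338"  -> two code points separated by "-"
    """
    if len(value) > 6:
        # Two code points: five hex digits after "U", then the rest after "-".
        return str(int(value[1:6], 16)) + ', ' + str(int(value[7:], 16))
    # Single code point: everything after the leading "U".
    return str(int(value[1:], 16))


print(decimal_codes("U000C6"))        # 198
print(decimal_codes("U0226B-00338"))  # 8811, 824
```

Because `int(..., 16)` discards leading zeros and `str()` emits plain decimal digits, the resulting text is safe to place inside a JSON array.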
A JSON format was also requested in bug 17994. As noted there, a version is available from http://www.w3.org/2003/entities/2007/htmlmathml.json. It differs from the version that has been added to the spec in two ways: it doesn't provide the values as integers, just as character strings (although it could do both if that is useful?), and it more clearly distinguishes the entries without semicolons (which is useful for XML use, as they aren't valid there). Bug 17994 can probably be closed in favour of this bug, as there is now a JSON link in the spec.
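On the question of integers versus character strings: either form can be derived from the other, so a consumer of a strings-only file is not stuck. A minimal sketch in Python, where the helper name `codepoints_of` is hypothetical and the sample strings are illustrative rather than taken from either file:

```python
import json


def codepoints_of(characters: str) -> list:
    """Return the Unicode code points of `characters` as integers (hypothetical helper)."""
    return [ord(ch) for ch in characters]


# A single BMP character, as it would appear after JSON decoding.
print(codepoints_of(json.loads('"\\u00C6"')))         # [198]
# A two-character expansion.
print(codepoints_of(json.loads('"\\u226B\\u0338"')))  # [8811, 824]
# A character outside the BMP, written in JSON as a surrogate pair;
# Python's json module joins the pair into one character.
print(codepoints_of(json.loads('"\\uD835\\uDD04"')))  # [120068]
```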
http://www.whatwg.org/specs/web-apps/current-work/entities.json and http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json have been updated and are now valid JSON.
I'll look at this at the same time as <https://www.w3.org/Bugs/Public/show_bug.cgi?id=14430>.
(In reply to comment #3)
> I'll look at this at the same time as
> <https://www.w3.org/Bugs/Public/show_bug.cgi?id=14430>.

See comment #2: you could just merge in the new versions of the files mentioned there. Problem solved.
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the Editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the Tracker Issue; or you may create a Tracker Issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy-v2.html

Status: Accepted
Change Description: https://github.com/w3c/html/commit/ad9564f1a335d0601427637879a0eafb7f0aecce
Rationale: accepted WHATWG change

Additional comments:
Probable original source for entity-processor-json.py: http://damowmow.com/temp/entity-processor-json.txt
Unclear where to find unicode.xml, choosing to parse boilerplate/entities.inc instead.
The output produced matches the following, modulo sort order: http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json
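A check of the "matches modulo sort order" kind can be done by comparing parsed documents rather than raw text. The sketch below is one way to do it in Python and is not necessarily how the editor verified the output; the function name `same_json_content` and the two sample documents are hypothetical.

```python
import json


def same_json_content(a_text: str, b_text: str) -> bool:
    """Compare two JSON documents, ignoring the order of object keys.

    Parsed objects become dicts, whose equality does not depend on key order.
    Arrays become lists, so element order inside arrays still matters.
    """
    return json.loads(a_text) == json.loads(b_text)


# Two illustrative documents with the same entries in a different order.
generated = '{"&amp;": {"codepoints": [38], "characters": "&"}, "&lt;": {"codepoints": [60], "characters": "<"}}'
reference = '{"&lt;": {"characters": "<", "codepoints": [60]}, "&amp;": {"characters": "&", "codepoints": [38]}}'

print(same_json_content(generated, reference))  # True
```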