27252 – Figure out which characters to escape in fragment

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Summary: Figure out which characters to escape in fragment

Status:	RESOLVED FIXED

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	URL
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	Unsorted
Assignee:	Sam Ruby
QA Contact:	sideshowbarker+urlspec

URL:
Whiteboard:
Keywords:

Duplicates (1):	26988
Depends on:
Blocks:

Reported:	2014-11-05 15:04 UTC by Simon Pieters
Modified:	2014-12-16 12:39 UTC
CC List:	4 users

See Also:	https://bugzilla.mozilla.org/show_bug.cgi?id=1093611

Description Simon Pieters 2014-11-05 15:04:00 UTC

https://url.spec.whatwg.org/#url-parsing

[[
utf-8 percent encode c using the simple encode set, and append the result to url's fragment.
]]
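
For readers unfamiliar with the quoted step, here is a minimal Python sketch of what it describes. It assumes the simple encode set is "all code points less than U+0020 and all code points greater than U+007E", which is how the URL Standard defined it around the time of this bug as best I recall. The function names are hypothetical and are not taken from the specification or from any browser.

```python
def in_simple_encode_set(c: str) -> bool:
    # Assumed definition: C0 controls (below U+0020) and everything above U+007E.
    cp = ord(c)
    return cp < 0x20 or cp > 0x7E


def utf8_percent_encode(c: str) -> str:
    # Code points outside the encode set are passed through unchanged.
    if not in_simple_encode_set(c):
        return c
    # Otherwise emit one %XX triplet per UTF-8 byte of the code point.
    return "".join(f"%{b:02X}" for b in c.encode("utf-8"))


def encode_fragment_as_specified(fragment: str) -> str:
    # Hypothetical helper: applies the quoted step to every code point.
    return "".join(utf8_percent_encode(c) for c in fragment)
```

Under these assumptions, encode_fragment_as_specified("a b\u00e9") gives "a b%C3%A9": the space (U+0020) is outside the set and stays as is, while U+00E9 becomes its two UTF-8 bytes.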

See http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3290

Comment 1 Anne 2014-11-27 10:28:40 UTC

*** Bug 26988 has been marked as a duplicate of this bug. ***

Comment 2 Anne 2014-11-27 10:31:18 UTC

Given what IE and Chrome do, it seems we should follow them and not escape code points in the fragment state.

Only Safari follows the specification here. Gecko is somewhere in between.

Comment 3 Sam Ruby 2014-11-30 01:21:50 UTC

Just checking, we should still percent encode anything that is not a URL code point, right? https://url.spec.whatwg.org/#url-code-points

Comment 4 Anne 2014-11-30 10:49:23 UTC

As far as I can tell that is not what happens. All code points except maybe 0x00 and the newlines are passed through as is. Simon's test also needs to be modified to put an "x" at the end of the fragment identifier, to ensure stripping of code points less than 0x21 does not happen.
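
The point about the trailing "x" can be illustrated with a small Python sketch. It assumes, based only on the comment above, that code points less than 0x21 are stripped from both ends of the input before parsing. The helper name and the example URL are hypothetical.

```python
# Assumption: everything below 0x21 (C0 controls and space) is trimmed
# from both ends of the input string before the parser runs.
STRIPPED = "".join(chr(cp) for cp in range(0x21))


def strip_url_input(url: str) -> str:
    # str.strip with an argument removes any of the given characters
    # from the start and end of the string, but never from the middle.
    return url.strip(STRIPPED)


# Without the sentinel, the code point under test disappears before parsing.
without_sentinel = strip_url_input("http://example/#\u0001")

# With a trailing "x", the code point is no longer at the end and survives.
with_sentinel = strip_url_input("http://example/#\u0001x")
```

In this sketch without_sentinel ends in "#" while with_sentinel still contains U+0001, so only the second form actually exercises the fragment state for that code point.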

Comment 5 Simon Pieters 2014-12-01 09:25:03 UTC

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3322

Comment 6 Anne 2014-12-12 12:05:24 UTC

So as a comment in the specification reminded me, different things happen for non-relative schemes. E.g. if you use "test" as your URL scheme, Chrome will percent-encode code points higher than U+007F. IE does not, so I'm still inclined to make this change, but that does indicate that this is a bit riskier.

Comment 7 Anne 2014-12-12 12:40:40 UTC

I gave up on Bikeshed, since I think rubys has a different version of Bikeshed that supports stringifier, which I don't get when I pull the latest Bikeshed and run update on it...

https://github.com/whatwg/url/pull/13

Comment 8 Sam Ruby 2014-12-16 12:39:09 UTC

Fixed by https://github.com/whatwg/url/commit/05cbe06bfacf1e477df3d81234492413ca16acbf and https://github.com/w3c/web-platform-tests/pull/1471

The fix is to no longer escape code points in fragments, and to ignore U+0000 in fragments in addition to newlines.
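
A rough Python sketch of the fixed behavior as described in this comment, not a copy of the committed spec text. It assumes "newlines" means U+000A and U+000D; the actual parser may ignore other code points as well (for example U+0009). The names are hypothetical.

```python
# Assumption: the ignored code points are U+0000 plus LF and CR.
IGNORED_IN_FRAGMENT = {"\u0000", "\n", "\r"}


def fragment_after_fix(fragment: str) -> str:
    # No percent-encoding at all: every remaining code point, including
    # spaces and non-ASCII, is appended to the fragment as is.
    return "".join(c for c in fragment if c not in IGNORED_IN_FRAGMENT)
```

For example, fragment_after_fix("a\u0000b\n\u00e9 x") gives "ab\u00e9 x": the NUL and the newline are dropped, and U+00E9 and the space pass through unescaped.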