This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Consider the following URLs:

1) http://intertwingly.net/projects/pegurl/liveview.html#http://¡/
2) http://intertwingly.net/projects/pegurl/liveview.html#http://¡/
3) http://intertwingly.net/projects/pegurl/liveview.html#http://%C2%A1/

If you UTF-8 encode the third URL, percent decode the result, and then apply "UTF-8 decode without BOM", you end up with the second URL. And indeed browsers treat these two URLs the same.

Unfortunately, the same steps also apply to the first URL: since it contains no percent signs, the percent decoding step leaves it unchanged. I say unfortunately because browsers do not treat the first and second URLs the same.

I propose that the spec text in step 3 of https://url.spec.whatwg.org/#host-parsing be changed to reference a utf8PercentDecode function, one that only UTF-8 decodes bytes that were produced by percent decoding. One possible implementation of such a function can be found in: http://intertwingly.net/projects/pegurl/url.js
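To make the two approaches concrete, here is a minimal Python sketch. It is not the code in url.js and not the spec's algorithm text. The names `spec_style_decode` and `utf8_percent_decode` are hypothetical, and replacing invalid byte sequences with U+FFFD is an assumption made for illustration.

```python
import re
from urllib.parse import unquote_to_bytes

# A run of one or more percent-encoded bytes, e.g. "%C2%A1".
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def spec_style_decode(host: str) -> str:
    """UTF-8 encode the host, percent decode it, then UTF-8 decode everything.

    Python's "utf-8" codec does not strip a leading BOM, which is the
    behaviour assumed here for "UTF-8 decode without BOM".
    """
    raw = unquote_to_bytes(host.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


def utf8_percent_decode(host: str) -> str:
    """Hypothetical variant: UTF-8 decode only bytes that came from %XX escapes.

    Characters that were never percent-encoded are passed through untouched.
    """
    def decode_run(match: re.Match) -> str:
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        return raw.decode("utf-8", errors="replace")

    return _PCT_RUN.sub(decode_run, host)


if __name__ == "__main__":
    for host in ("\u00a1", "%C2%A1"):
        print(repr(host), repr(spec_style_decode(host)), repr(utf8_percent_decode(host)))
```

For the two host strings that appear in the URLs above (`¡` and `%C2%A1`), both functions in this sketch return the single character U+00A1.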
I don't understand why 1) and 2) would be treated the same. Could you elaborate?
After a more careful reading of the spec, I've come to the conclusion that this bug is invalid.