This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
Aside from the space/plus sign mapping, http://url.spec.whatwg.org/#concept-urlencoded-byte-serializer percent-encodes the bytes corresponding to the following ASCII non-control characters: !"#$%&'()+,/:;<=>?@[\]^`{|} This seems like more than necessary. I believe these should be enough: "#&<=>` That is, the characters encoded in the URL parser’s query state, plus & and =, which are the only significant characters in application/x-www-form-urlencoded. Is there a reason I’m missing?
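To make the comparison concrete, here is a rough Python sketch of the two behaviours. It is an illustration only, not the spec's algorithm text: the function names and the two sets are hypothetical, the "current" set assumes the serializer leaves only `*-._` and ASCII alphanumerics untouched (as implied by the list above), and the "proposed" variant assumes that control bytes and bytes above 0x7E are still encoded, as in the query state.

```python
# Bytes assumed to pass through the current serializer unchanged:
# "*", "-", ".", "_" and ASCII alphanumerics.
SPEC_SAFE = frozenset(
    b"*-._0123456789"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
)

# Hypothetical smaller encode set from this report: " # & < = > `
PROPOSED_ENCODE = frozenset(b'"#&<=>`')


def serialize_current(data: bytes) -> str:
    """Sketch of the current behaviour: space -> "+", safe bytes as-is,
    every other byte percent-encoded."""
    out = []
    for b in data:
        if b == 0x20:
            out.append("+")
        elif b in SPEC_SAFE:
            out.append(chr(b))
        else:
            out.append(f"%{b:02X}")
    return "".join(out)


def serialize_proposed(data: bytes) -> str:
    """Sketch of the proposal: space -> "+", percent-encode only controls,
    bytes above 0x7E, and the bytes in PROPOSED_ENCODE."""
    out = []
    for b in data:
        if b == 0x20:
            out.append("+")
        elif b < 0x20 or b > 0x7E or b in PROPOSED_ENCODE:
            out.append(f"%{b:02X}")
        else:
            out.append(chr(b))
    return "".join(out)


print(serialize_current(b"a b/c?d=e"))   # a+b%2Fc%3Fd%3De
print(serialize_proposed(b"a b/c?d=e"))  # a+b/c?d%3De
```

With the input `a b/c?d=e`, the first function encodes `/`, `?` and `=`, while the second encodes only `=`.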
Are you saying existing implementations are not following the specification?
Right, I forgot to leave my sense of logic at the door. I'll do some testing.
I think ideally we align this with the query state from the URL parser.
But also, ideally we first make HTML use this algorithm rather than its own. (They're currently identical, I would not want them to get out of sync before we join them.)
I found that we also have https://github.com/whatwg/url/issues/18 and since GitHub is now preferred I'm going to mark this bug MOVED. Thank you for reporting this bug.