This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The XQueryX tests that contain Unicode characters have those characters encoded as UTF-8 twice. That is, a non-ASCII character that should occupy 2 bytes is encoded as 4 bytes instead. The failing tests are:
XQueryX/EncodeURIfunc/K-EncodeURIfunc-4
XQueryX/EscapeHTMLURIFunc/K-EscapeHTMLURIFunc-5
XQueryX/Functions/AllStringFunc/AssDisassStringFunc/StringToCodepointFunc/fn-string-to-codepoints1args-4
XQueryX/Functions/AllStringFunc/EscapingFuncs/EncodeURIfunc/fn-encode-for-uri1args-2
XQueryX/Functions/AllStringFunc/EscapingFuncs/EscapeHTMLURIFunc/fn-escape-html-uri1args-2
XQueryX/Functions/AllStringFunc/EscapingFuncs/IRIToURIfunc/fn-iri-to-uri1args-2
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-12
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-19
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-20
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-21
The testing was performed using Zorba XQuery 1.1.
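For readers unfamiliar with the symptom, the double encoding described in this report can be reproduced with a minimal Python sketch. It assumes the second pass misreads the UTF-8 bytes as Latin-1 before encoding again, which is one common cause of double encoding; the report does not say what actually happened in the XQTS tooling. The helper name double_encode is hypothetical.

```python
def double_encode(text: str) -> bytes:
    """Encode to UTF-8, misread the bytes as Latin-1, then encode to UTF-8 again.

    Assumption: this Latin-1 misreading is only one plausible way the
    double encoding could arise; the bug report does not name the cause.
    """
    once = text.encode("utf-8")
    return once.decode("latin-1").encode("utf-8")


def undo_double_encode(data: bytes) -> str:
    """Reverse the Latin-1 round trip to recover the original text."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


single = "é".encode("utf-8")   # correct encoding, 2 bytes
double = double_encode("é")    # double-encoded, 4 bytes

print(single.hex(" "))  # c3 a9
print(double.hex(" "))  # c3 83 c2 a9
print(undo_double_encode(double))  # é
```

A 2-byte character growing to 4 bytes, as in the output above, matches the "4 bytes instead of 2" symptom in the report.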
Sorry, but I am not seeing the problem that you describe. I looked at the first test case you listed, K-EncodeURIfunc-4. The XQuery contains encode-for-uri("~bébé") ... I see the string literal as bytes 7E 62 C3 A9 62 C3 A9. The XQueryX that is generated is:
<xqx:functionCallExpr>
  <xqx:functionName>encode-for-uri</xqx:functionName>
  <xqx:arguments>
    <xqx:stringConstantExpr>
      <xqx:value>~bébé</xqx:value>
    </xqx:stringConstantExpr>
  </xqx:arguments>
</xqx:functionCallExpr>
The two-byte Unicode characters are being replaced by charRefs in the XQueryX that is generated.
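The check described in this comment can be sketched in Python: confirm that the literal's UTF-8 bytes are the ones quoted, and that an XML character reference parses back to the same string. The hexadecimal reference form &#xE9; and the namespace-free <value> element are assumptions made for illustration; the thread does not show which reference form the generated XQueryX uses.

```python
import xml.etree.ElementTree as ET

literal = "~bébé"

# The correctly encoded literal is 7 bytes: each é takes two bytes (C3 A9).
utf8_bytes = literal.encode("utf-8")
print(utf8_bytes.hex(" ").upper())  # 7E 62 C3 A9 62 C3 A9

# Assumption: a simplified element using hexadecimal character references.
xqx_fragment = "<value>~b&#xE9;b&#xE9;</value>"
parsed = ET.fromstring(xqx_fragment).text

# The XML parser resolves the references to the same characters.
print(parsed == literal)  # True
```

Because character references are pure ASCII in the serialized file, they cannot be double-encoded by a later UTF-8 pass.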
Daniel, if I don't receive any further information from you, then I will have to close this bug report without making any changes.
Ok, I just checked them all and they work fine. They must have been fixed in the latest XQTS.