This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
In section 7.6.1, Regular Expression Syntax, we say:

"Back-references are allowed. ... A back-reference matches the string that was matched by the nth capturing subexpression within the regular expression, that is, the parenthesized subexpression whose opening left parenthesis is the nth unescaped left parenthesis within the regular expression. The closing right parenthesis of this subexpression must occur before the back-reference. ..."

In Bug #4610 we considered the following query:

replace("abcd", "(a)\2(b)", "")

While I interpreted the violation of "must occur" as requiring that an error be raised, Michael Kay interpreted it as causing the back-reference to fail to match a string.

The replace function acknowledges that a pattern can be invalid, saying:

"An error is raised [err:FORX0002] if the value of $pattern is invalid according to the rules described in section 7.6.1 Regular Expression Syntax."

Let me suggest that this be clarified by changing the existing sentence:

"The closing right parenthesis of this subexpression must occur before the back-reference."

to the following:

"The regular expression is invalid if the closing right parenthesis of this subexpression occurs before the back-reference."
For consistency, this also means that it should be an error to use \3 if no third subexpression exists. So I would suggest changing

"The closing right parenthesis of this subexpression must occur before the back-reference."

to

"The regular expression is invalid if this subexpression does not exist or if its closing right parenthesis occurs after the back-reference."
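The rule in this wording can be illustrated with a small checker. The following Python sketch is not taken from any XPath implementation; the function name `find_invalid_backrefs` is hypothetical, and it makes simplifying assumptions: back-references are a single digit (\1 to \9), every unescaped left parenthesis outside a character class opens a capturing group, and character classes are tracked only by bracket nesting depth.

```python
def find_invalid_backrefs(pattern: str) -> list[tuple[int, int]]:
    """Return (position, n) for each back-reference \\n that is invalid.

    A back-reference is treated as invalid when group n does not exist or
    when its closing parenthesis has not yet been seen at that point.
    Simplifying assumptions (not from the spec text): single-digit
    back-references only, no non-capturing groups.
    """
    invalid = []
    open_groups = []        # group numbers whose ')' has not been seen yet
    closed_groups = set()   # group numbers already closed
    group_count = 0         # number of unescaped '(' seen so far
    class_depth = 0         # > 0 while inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if class_depth == 0 and nxt in "123456789":
                n = int(nxt)
                # Covers both "no such group" and "group not closed yet".
                if n not in closed_groups:
                    invalid.append((i, n))
            i += 2          # skip the escaped character
            continue
        if ch == "[":
            class_depth += 1
        elif ch == "]" and class_depth > 0:
            class_depth -= 1
        elif class_depth == 0:
            if ch == "(":
                group_count += 1
                open_groups.append(group_count)
            elif ch == ")" and open_groups:
                closed_groups.add(open_groups.pop())
        i += 1
    return invalid


# The pattern from Bug #4610: \2 appears before group 2 is even opened.
print(find_invalid_backrefs(r"(a)\2(b)"))
# A back-reference that follows the close of its group is accepted.
print(find_invalid_backrefs(r"(a)(b)\2"))
```

Because the scan is a single left-to-right pass, a reference to a group that never appears and a reference to a group that closes later are both reported the same way, which matches the intent that either case makes the regular expression invalid.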
The WGs accepted the proposal in comment #1.
The change has been merged into erratum E4, which affects another sentence in the same paragraph. See bug #4106.