6028 – [FO] fn:id is broken -- gives wrong answers for elements of type ID

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Functions and Operators 1.0
Version:	Recommendation
Hardware:	PC Linux

Importance:	P1 major
Target Milestone:	---
Assignee:	Michael Kay
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

URL:	http://www.w3.org/TR/xquery-operators...

Reported:	2008-09-05 14:02 UTC by Henry S. Thompson
Modified:	2009-10-16 22:35 UTC
CC List:	4 users

Description Henry S. Thompson 2008-09-05 14:02:44 UTC

fn:id is broken -- it gives wrong answers for elements of type ID:  XML Schema and XPointer both make clear that an element of type ID identifies its _parent_, not the element itself.

This is a serious bug because it means XPointer cannot be simply implemented using an XPath2 implementation out of the box.

I am submitting this personally, but I expect the XML Core WG (the keeper of the XPointer specs) will endorse this comment at their next meeting (2008-09-10).

Comment 1 Michael Kay 2008-09-05 17:03:46 UTC

Personal response.

I agree the design is wrong. (I say as much in my book XSLT/XPath Programmer's Reference 4th ed). The question is what we can now do about it.

I think that issuing an erratum to change the behaviour of the existing function in such an incompatible way would be very damaging. Not just because it would break existing code, but also because there would be a period of at least three years where different implementations in active use would give different results, and also because books (like my own) and other reference and tutorial materials will be on sale and in use for many years and will be giving incorrect information to users, leading to general confusion in the user community. And because the release cycle for some products is long (up to 3 years in the case of major relational database systems), the goal that XPointer could use an XPath implementation "out of the box" would not be achieved for several years.

One solution might be to define a new function with the "improved" behaviour, to exist alongside the old. This isn't ideal either, because there will still be an interoperability problem and an education problem, but at least there won't be a compatibility problem, and the interoperability problem will be that some implementations accept the new function and others don't, which is a lot better than having them return fundamentally different results.

Another softer approach might be to introduce a new way of calling the existing function: if the supplied ID value is prefixed with "#" then the function exhibits the new behaviour. This approach has the benefit that it makes it easier for applications to handle the coexistence of implementations that do or don't implement the erratum - the current specification is that if a value such as "#D001" is passed to the function, no error is reported and nothing is selected. So the expression

(id("#D001"),id("D001")/..)[1]

would give the desired result on both pre- and post-erratum implementations.

But of course this isn't a panacea - one aim of the function is that you can use it to follow a link from an IDREF or IDREFS attribute, and this would require some string manipulation to make it work.

I think we should probably see this problem with a sense of perspective. Many users are reluctant to use id() anyway, because it only works when the input document is validated, and because some parsers don't report IDness even when they are performing validation. Under XSLT, many users prefer to use the key() function. Under XQuery, they prefer to write an expression such as //A[@id="D001"] and rely on the optimizer to use whatever indexes are available. Even if it's broken, do we really need to fix it?
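The attribute-predicate alternative mentioned above does not depend on ID typing at all, which is why it also works on unvalidated input. A minimal sketch in Python with the standard library's ElementTree follows; the sample document and its values are made up for illustration, and ElementTree implements only a small XPath subset with no id() function, so this shows the predicate style only:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: no DTD or schema, so nothing is typed as ID.
doc = ET.fromstring(
    '<root>'
    '<A id="D001">first</A>'
    '<A id="D002">second</A>'
    '</root>'
)

# Similar in spirit to //A[@id="D001"]: a plain string comparison
# on an attribute that merely happens to be named "id".
matches = doc.findall(".//A[@id='D001']")
print([m.text for m in matches])
```

Whether such a predicate is served by an index is up to the processor; the sketch says nothing about performance.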

Comment 2 David Ezell 2008-09-05 17:18:44 UTC

On 2008-09-05 the XML Schema WG resolved (minutes to be published RSN) to endorse this issue, and respectfully request that QT accept this issue as requiring its immediate attention.  The Schema WG believes that the most satisfactory dispensation would be to open an erratum against F&O.

Thank you.

Comment 3 Michael Kay 2008-11-11 17:43:18 UTC

I was asked today by the joint XQuery/XSL WG meeting to produce a proposal. The proposal is to add a new section, identical to 15.5.2 fn:id, with the following differences:

(a) the function name is fn:element-with-id. (It has the same two signatures as fn:id).

(b) Rule 2, bullet 1, sub-bullet 1 is changed to read:

The element has a child element node whose is-id property (see Section 5.5 is-id Accessor [DM]) is true, and whose typed value is equal to V under the rules of the eq operator using the Unicode code point collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).

(c) The existing Notes under fn:id() are referenced rather than being duplicated

(d) A new note is added explaining the problem and the difference between the two functions, and the approach to introducing it as an optional function by means of an erratum.

(e) The new function is added by Erratum to XPath 2.0/XQuery 1.0, with a statement that it is an optional feature for this version. To localize the changes, we can do this, I think, by means of a statement within the F+O specification of the function that says "Where any statement in the conformance rules of a host language requires that all functions in this function library must be provided, such a statement does not apply to the fn:element-with-id function (in both its variants): processors may provide this function but are not required to do so."

(f) The function becomes an integral part of the function library for XPath 2.1/XQuery 1.1.
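The difference between the two functions for elements of type ID can be illustrated with a toy model. This is only a sketch in Python, not an XPath implementation: the Element class, the is_id flag, the id_attrs list and the sample element names are all hypothetical stand-ins, it assumes the ID-attribute case is the same for both functions, and it ignores details such as IDREFS tokenization and selecting only the first match per value.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Element:
    name: str
    text: str = ""
    is_id: bool = False  # stands in for the is-id property of an element node
    id_attrs: List[str] = field(default_factory=list)  # values of attributes whose is-id is true
    children: List["Element"] = field(default_factory=list)
    parent: Optional["Element"] = field(default=None, repr=False)

    def add(self, child: "Element") -> "Element":
        child.parent = self
        self.children.append(child)
        return child


def walk(e: Element) -> Iterator[Element]:
    """Yield e and its descendants in document order."""
    yield e
    for c in e.children:
        yield from walk(c)


def fn_id(value: str, root: Element) -> List[Element]:
    """Toy fn:id: an element of type ID identifies itself."""
    return [e for e in walk(root)
            if value in e.id_attrs or (e.is_id and e.text == value)]


def element_with_id(value: str, root: Element) -> List[Element]:
    """Toy fn:element-with-id: a child element of type ID identifies its parent."""
    return [e for e in walk(root)
            if value in e.id_attrs
            or any(c.is_id and c.text == value for c in e.children)]


root = Element("staff")
person = root.add(Element("person"))
person.add(Element("empno", text="D001", is_id=True))
person.add(Element("name", text="Ada"))

print([e.name for e in fn_id("D001", root)])            # ['empno']
print([e.name for e in element_with_id("D001", root)])  # ['person']
```

In this model, taking the parent of the fn_id result gives the same element as element_with_id, which mirrors the id("D001")/.. step in the workaround from comment 1.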

Comment 4 Michael Kay 2009-01-28 20:18:03 UTC

Erratum E31 has been drafted to define the new function fn:element-with-id().

Comment 5 Jim Melton 2009-02-07 01:04:35 UTC

I'm marking this bug FIXED on the grounds that a new function has been added to satisfy the requirement as stated without "breaking" the existing function. 

If you are satisfied with this resolution, please mark the bug CLOSED.

Comment 6 Michael Kay 2009-10-16 20:49:28 UTC

I'm marking this as closed, since it has been fixed by erratum (E31) and the new text is in the master document for both the 1.02e and 1.1 branches.

Comment 7 John Cowan 2009-10-16 22:00:31 UTC

I'm posting even though I agree that this bug is properly closed, because Michael Kay's comment #1 contains a broken meme I am trying to stamp out.

"id() [...] only works when the input document is validated" is not true.  Conforming non-validating XML parsers MUST process ATTLIST declarations in the internal subset that appear before the first reference to an external parameter entity. Conforming processors that read external DTD subsets and parameter entities, whether validating or non-validating, MUST process all ATTLIST declarations.  Once an ATTLIST declaration for a particular attribute has been processed, the XML parser MUST report it to the application.  So if you write:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="FOO">

every conforming XML parser MUST report the IDness of root/@foo.  What is more, if you write:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="FOO">

every conforming processor MUST still report the IDness of root/@foo, even though 32 is not a valid ID.

Comment 8 John Cowan 2009-10-16 22:35:09 UTC

Erm, make the second example:

<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>
<root foo="32">

(thanks to Paul Grosso)
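
The point about non-validating parsers can be checked against one such parser. The following is a sketch using Python's binding to Expat, which does not validate; the root element is self-closed here so that the corrected example is a complete document, and the handler name on_attlist is arbitrary. It shows only that the declared attribute type reaches the application; whether a particular XPath processor turns that into an is-id property is a separate question.

```python
import xml.parsers.expat

# The corrected example from comment 8, with the root element self-closed.
DOC = b'<!DOCTYPE root [<!ATTLIST root foo ID #REQUIRED>]>\n<root foo="32"/>'

declared = []  # (element name, attribute name, declared type)


def on_attlist(elname, attname, type_, default, required):
    # Called once per attribute declared in an ATTLIST declaration.
    declared.append((elname, attname, type_))


parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOC, True)  # no validation is performed, so "32" is not rejected

print(declared)
```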