This document is also available in these non-normative formats: XML and diff-marked from FPWD.
Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification defines several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a first public Working Draft for review by W3C members and other interested parties. This document is a product of the XML Processing Model Working Group which is part of the W3C XML Activity. The English version of this specification is the only normative version. However, for translations of this document, see http://www.w3.org/2003/03/Translations/byTechnology?technology=xproc.
This draft adds two further profiles, and sets out invariants in terms of document properties which will or will not be guaranteed to be the same for the same document when processed by processors conforming to the same or different profiles. Comments on the utility of these additions are particularly welcome.
The Working Group invites review of this draft, which is likely to be followed soon by a Last Call Working Draft. Please send comments on this draft to the public mailing list public-xml-processing-model-comments@w3.org (public archives are available). Please include the string "[xml-proc-profiles]" in your email subject line.
As this specification is intended for use by other specifications which themselves define one or more XML languages, the Working Group particularly welcomes input from other Working Groups who are responsible for such specifications.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Background
1.1 Terminology
2 XML processor profiles
2.1 The minimum XML processor profile
2.2 The basic XML processor profile
2.3 The modest XML processor profile
2.4 The recommended XML processor profile
3 Invariants
3.1 Data model invariants within a given profile
3.1.1 Underspecified information
3.2 Data model variation between profiles
3.2.1 Between minimum and richer profiles
3.2.2 Between basic and richer profiles
3.2.3 Between modest and recommended profiles
4 Other profiles (non-normative)
5 Conformance
A References
A.1 Normative References
A.2 Non-normative References
The XML specification [Extensible Markup Language (XML) 1.0 (Fifth Edition)] defines an XML processor as "a software module ... used to read XML documents and provide access to their content and structure ... on behalf of another module, called the application." XML applications are often defined by building on top of the [XML Information Set (Second Edition)] or other similar XML data models such as [XML Path Language (XPath) Version 1.0] or [XQuery 1.0 and XPath 2.0 Data Model (XDM)], understood as the output of an XML processor. Such definitions have suffered to some extent from an uncertainty inherent in using that kind of foundation, in that the mapping XML processors perform from XML documents to data model is not rigid. Some of this stems from the XML specification itself, which leaves open the possibility of reading and interpreting external entities, or not. Some stems from the growth of the XML family of specifications: if the input document includes uses of XInclude, for instance, it is not settled whether the data model reflects their expansion.
This specification addresses this issue by defining several XML processor profiles, each of which fully determines a data model for any given XML document. It is intended as a resource for other specifications, which can by a single normative reference establish precisely what input processing they require.
The profiles defined here are appropriate for processing both XML 1.0 [Extensible Markup Language (XML) 1.0 (Fifth Edition)] and XML 1.1 [Extensible Markup Language (XML) 1.1 (Second Edition)] documents. References to XML or XML Namespaces below should be understood as references to 1.0 or 1.1 as required by the relevant document or application.
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC 2119].]
The term base URI is used in this specification as it is defined in [RFC 3986].
All of the profiles describe the steps necessary to construct a data model from a well-formed and namespace well-formed XML document. This specification does not consider documents that are not namespace well-formed. Documents which are not well-formed are not XML.
The minimum approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors without reading any external markup declarations;
2. Maintenance of the base URI property of each element in conformance with [XML Base].
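The base URI requirement can be illustrated with a minimal Python sketch. It assumes the document has already been parsed by a non-validating parser (here the standard library's `xml.etree.ElementTree`), and the helper name `base_uris` and the example URIs are hypothetical. It resolves each `xml:base` value against the base URI inherited from the parent element, and does not attempt the finer points of [XML Base], such as escaping of disallowed characters.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Expanded name of the xml:base attribute as reported by ElementTree.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def base_uris(root, document_uri):
    """Return a dict mapping each element to its base URI.

    An element's base URI is its xml:base value resolved against the
    base URI of its parent (or of the document, for the root element).
    Elements without xml:base inherit the parent's base URI unchanged.
    """
    result = {}

    def walk(elem, inherited):
        # urljoin with an empty reference returns the inherited URI as is.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        result[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, document_uri)
    return result


doc = ET.fromstring(
    '<doc xml:base="http://example.org/a/">'
    '<part xml:base="b/"><leaf/></part>'
    '<other/>'
    '</doc>'
)
uris = base_uris(doc, "http://example.org/index.xml")
by_tag = {elem.tag: uri for elem, uri in uris.items()}
```

Here `by_tag` records the base URI computed for each of the four elements; `leaf` inherits the value established by `part`, and `other` inherits the value established by `doc`.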
The basic approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors without reading any external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0].
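As an illustration of the xml:id step, the following Python sketch collects every `xml:id` attribute and treats its value as an ID. The function name `collect_xml_ids` is hypothetical, and raising an exception on a duplicate value is only one possible way of reporting the error. The sketch applies ID-type whitespace normalization but does not check that each value is an NCName, which [xml:id Version 1.0] also requires.

```python
import xml.etree.ElementTree as ET

# Expanded name of the xml:id attribute as reported by ElementTree.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def collect_xml_ids(root):
    """Map each normalized xml:id value to the element that carries it."""
    ids = {}
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        # ID-type normalization: drop leading and trailing spaces and
        # collapse runs of spaces (#x20) to a single space.
        value = " ".join(part for part in raw.split(" ") if part)
        if value in ids:
            raise ValueError(f"duplicate xml:id value: {value!r}")
        ids[value] = elem
    return ids


doc = ET.fromstring(
    '<doc><sec xml:id="  intro "/><sec xml:id="body"/><sec id="plain"/></doc>'
)
ids = collect_xml_ids(doc)
```

Only the two namespaced `xml:id` attributes are recognized; the unqualified `id` attribute is left alone, since nothing in this step gives it ID type.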
The modest approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors while reading all external markup declarations;
2. Maintenance of the base URI of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0].
The recommended approach to the construction of a data model requires the following:
1. Processing of the document as required of conformant non-validating XML processors while reading and processing all external markup declarations;
2. Maintenance of the base URI property of each element in conformance with [XML Base];
3. Identification of all xml:id attributes as IDs as required by [xml:id Version 1.0];
4. Replacement of all include elements in the XInclude namespace, and namespace, xml:base and xml:lang fixup of the result, as required for conformance to [XML Inclusions (XInclude) Version 1.0 (Second Edition)].
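The replacement of include elements can be sketched with Python's standard `xml.etree.ElementInclude` module. The in-memory `RESOURCES` table and the `loader` function are hypothetical stand-ins for real resource retrieval. Note that this module implements only part of XInclude and, as far as we are aware, performs no xml:base or xml:lang fixup, so it illustrates the replacement step only and is not by itself sufficient for this profile.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "documents" standing in for retrievable resources.
RESOURCES = {
    "chapter.xml": "<chapter><title>One</title></chapter>",
}


def loader(href, parse, encoding=None):
    """Return a parsed element for parse="xml", or raw text otherwise."""
    text = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text


main = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter.xml"/>'
    '</book>'
)

# Replace each xi:include element in place with the loaded content.
ElementInclude.include(main, loader=loader)
```

After the call, the `book` element contains the included `chapter` element where the `include` element used to be.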
The following [XProc: An XML Pipeline Language] pipeline implements 2.4 The recommended XML processor profile when executed by a conformant XProc processor which:
Processes its input as required by point (1) above;
Recognizes and preserves the ID type of all xml:id attributes in conformance with [xml:id Version 1.0].
Data models constructed in conformance with one of the profiles defined above will be guaranteed to share certain properties. The following sub-sections describe this in terms of invariants with respect to the information available in the data model.
Any two data models which are both constructed in conformance with the same profile from a given namespace-well-formed XML document will have exactly the same information with respect to the following information items and properties (per [XML Information Set (Second Edition)]):
Document Information Item: [document element], [base URI], [character encoding scheme], [standalone], [version], [all declarations processed]
Element Information Items: [namespace name], [local name], [prefix], [children], [attributes], [namespace attributes], [in-scope namespaces], [base URI], [parent]
Attribute Information Items: [namespace name], [local name], [prefix], [normalized value], [specified], [attribute type], [references], [owner element]
Processing Instruction Information Items: [target], [content], [base URI], [notation], [parent]
Unexpanded Entity Reference Information Items: [name], [system identifier], [public identifier], [declaration base URI], [parent]. This type of information item will not occur at all when using 2.3 The modest XML processor profile or 2.4 The recommended XML processor profile, or if standalone="yes".
Character Information Items: [character code], [parent]
Comment Information Items: [content], [parent]
Namespace Information Items: [prefix], [namespace name]
Whether the remaining information is present or, if present only partially, whether it is the same, depends on implementation-dependent properties, so no invariant can be guaranteed:
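Document Information Item: [children], [notations], [unparsed entities]
Character Information Items: [element content whitespace]
Document Type Declaration Information Item: entirely or partially
Unparsed Entity Information Items: entirely or partially
Notation Information Items: entirely or partially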
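One way to read the two lists above is as a rule for comparing data models: two processors conforming to the same profile must agree on the guaranteed properties, while the underspecified ones are left out of the comparison. The Python sketch below illustrates this for a few item types. The dictionary representation of information items and the names `INVARIANT`, `invariant_view` and `agree` are hypothetical and not part of this specification.

```python
# Guaranteed properties per item type (an illustrative subset of the
# lists above). [element content whitespace] is deliberately absent
# from the character entry because it is underspecified.
INVARIANT = {
    "character": ("character code", "parent"),
    "comment": ("content", "parent"),
    "namespace": ("prefix", "namespace name"),
}


def invariant_view(item):
    """Project an information item onto its guaranteed properties."""
    props = INVARIANT[item["kind"]]
    return (item["kind"],) + tuple(item.get(p) for p in props)


def agree(model_a, model_b):
    """True if two item sequences agree on every guaranteed property."""
    view_a = [invariant_view(item) for item in model_a]
    view_b = [invariant_view(item) for item in model_b]
    return view_a == view_b


# Two processors report the same space character but differ on the
# underspecified [element content whitespace] property.
a = [{"kind": "character", "character code": 32, "parent": "doc",
      "element content whitespace": True}]
b = [{"kind": "character", "character code": 32, "parent": "doc",
      "element content whitespace": None}]
# A third report differs on a guaranteed property.
c = [{"kind": "character", "character code": 9, "parent": "doc"}]

same_ab = agree(a, b)
same_ac = agree(a, c)
```

In this example `same_ab` is true, because the only difference lies in an underspecified property, and `same_ac` is false, because [character code] is guaranteed.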
When two data models are constructed in conformance with two different profiles from a given namespace-well-formed XML document, the information contained therein will in some cases (depending on the specifics of the document in question) differ with respect to the following information items and properties (per [XML Information Set (Second Edition)]), leaving aside the items and properties identified as implementation-defined above:
Attribute Information Items: [attribute type], [references]. These properties may vary for xml:id attributes.
And all the differences listed in the next two sections.
Element Information Items: Entirely, in that where a basic processor reports an Unexpanded Entity Reference, richer ones will report the entity expansion, which may be or include entire elements.
Attribute Information Items: Entirely, for the same reason, or just with respect to [normalized value], [specified], [attribute type] and [references] where a basic processor has not processed the relevant declaration but a richer one has.
Processing Instruction Information Items: Entirely, per the Element case above.
Unexpanded Entity Reference Information Items: Entirely, in the opposite sense to the Element case above.
Character Information Items: Entirely, per the Element case above.
Comment Information Items: Entirely, per the Element case above.
Namespace Information Items: Entirely, per the Element case above.
And all the differences listed in the next section.
Element Information Items: Entirely, in that where a modest processor reports an include element in the XInclude namespace, a recommended processor will report the result of XInclude processing, which may be or include entire elements.
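Attribute Information Items: Entirely, for the same reason.
Processing Instruction Information Items: Entirely, for the same reason.
Character Information Items: Entirely, for the same reason.
Comment Information Items: Entirely, for the same reason.
Namespace Information Items: Entirely, for the same reason.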
The profiles defined here, particularly 2.4 The recommended XML processor profile, can be used as a starting point for the definition of further profiles. For example, the media type registrations for stylesheet languages applicable to XML such as application/xslt+xml or text/css might define a profile specifying appropriate <?xml-stylesheet type="[their media type]" ...?> processing in addition to the processing required by 2.4 The recommended XML processor profile.
Conformance is a matter for any specification which references this one to mandate, expressed in terms such as "Conforming implementations must construct input data models from XML documents as required by the recommended XML processor profile."