Copyright © 2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document defines the XML-binary Optimized Packaging (XOP) convention, a means of more efficiently serializing XML Infosets that have certain types of content.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the W3C Last Call Working Draft of the XML-binary Optimized Packaging document. It has been produced by the XML Protocol Working Group (WG), which is part of the Web Services Activity.
Discussion of this document takes place on the public xml-dist-app@w3.org mailing list (public archive) under the email communication rules in the XML Protocol Working Group Charter .
Comments on this document are welcome. Send them before 29 June 2004 to xmlp-comments@w3.org mailing list (public archive). Note that all outstanding issues against this document are documented in the Working Group Last Call Issues List. If the feedback on this document is positive, the WG plans to submit it for consideration as a W3C Candidate Recommendation.
This document has been produced under the 24 January 2002 CPP as amended by the W3C Patent Policy Transition Procedure. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy. Patent disclosures relevant to this specification may be found on the Working Group's patent disclosure page.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
1 Introduction
1.1 Terminology
1.2 Example
1.3 Notational Conventions
2 XOP Infosets Constructs
2.1 xop:Include element information
item
2.2 href attribute information item
3 XOP's Processing Model
3.1 Creating XOP Packages
3.2 Interpreting XOP Packages
4 XOP Packages
4.1 MIME Multipart/Related XOP Packages
5 Identifying XOP Documents
6 Security Considerations
6.1 XOP Package Integrity
6.2 XOP Package Confidentiality
A Relationship to other specifications
A.1 Dependencies
A.2 Payload
A.3 Extension
B References
B.1 Normative References
B.2 Informative References
This specification defines the XML-binary Optimized Packaging (XOP) convention, a means of more efficiently serializing XML Infosets (see [XML InfoSet]) that have certain types of content.
A XOP package is created by placing a serialization of the XML Infoset inside of an extensible packaging format (such a MIME Multipart/Related, see [RFC 2387]). Then, selected portions of its content that are base64-encoded binary data are re-encoded (i.e., the data is decoded from base64) and placed into the package. The locations of those selected portions are marked in the XML with a special element that links to the packaged data using URIs.
In a number of important XOP applications, binary data need never be encoded in base64 form. If the data to be included is already available as a binary octet stream, then either an application or other software acting on its behalf can directly copy that data into a XOP package, at the same time preparing suitable linking elements for use in the root part; when parsing a XOP package, the binary data can be made available directly to applications, or, if appropriate, the base64 binary character representation can be computed from the binary data.
However, at the conceptual level, this binary data can be thought of as
being base64-encoded in the XML Document. As this conceptual form might
be needed during some processing of the XML Document (e.g., for signing
the XML document), it is necessary to have a one to one correspondence
between XML Infosets and XOP Packages. Therefore, the conceptual
representation of such binary data is as if it were base64-encoded,
using the canonical lexical form of XML Schema base64Binary
datatype (see [XML Schema Part 2] 3.2.16
base64Binary). In the reverse direction, XOP is capable of
optimizing only base64-encoded Infoset data that is in the canonical
lexical form.
As a result, only element content can be optimized; attributes,
non-base64-compatible character data, and data not in the canonical
representation of the base64Binary
datatype cannot be
successfully optimized by XOP.
The remainder of this specification is organized in the following fashion:
Section Two describes the XOP Infoset, which preserves the non-optimized content and structure of the original XML Infoset.
Section Three specifies XOP's processing model.
Section Four of this specification describes the form of the XOP Package.
Section Five describes how XOP Documents are identified.
Section Six explores the security considerations of using the XOP convention.
This specification uses terminology from the XML Infoset (see [XML InfoSet]) when discussing XML content and structure. This is only a convention for clear specification of XOP's behavior.
The following terms are used in this specification:
xop:Include
element information items.
Example shows an XML Infoset prior to XOP
processing. Example shows the same
Infoset, serialized using the XOP format in a MIME Multipart/Related
package. The base64-encoded content of the m:photo
and
m:sig
elements have been replaced by a
xop:Include
element, while
the binary octets have been serialized in separate MIME parts. Note
that those examples uses [media-type] to identify the
media type of the content of the m:photo
and
m:sig
elements.
<soap:Envelope xmlns:soap='http://www.w3.org/2003/05/soap-envelope' xmlns:xop='http://www.w3.org/2003/12/xop/include' xmlns:xmlmime='http://www.w3.org/2004/06/xmlmime'> <soap:Body> <m:data xmlns:m='http://example.org/stuff'> <m:photo xmlmime:content-type='image/png'> /aWKKapGGyQ= </m:photo> <m:sig xmlmime:content-type='application/pkcs7-signature'> Faa7vROi2VQ= </m:sig> </m:data> </soap:Body> </soap:Envelope>
MIME-Version: 1.0 Content-Type: Multipart/Related;boundary=MIME_boundary; type=text/xml;start="<mymessage.xml@example.org>" Content-Description: An XML document with my picture and signature in it --MIME_boundary Content-Type: text/xml; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-ID: <mymessage.xml@example.org> <soap:Envelope xmlns:soap='http://www.w3.org/2003/05/soap-envelope' xmlns:xop='http://www.w3.org/2003/12/xop/include' xmlns:xmlmime='http://www.w3.org/2004/06/xmlmime'> <soap:Body> <m:data xmlns:m='http://example.org/stuff'> <m:photo xmlmime:content-type='image/png'> <xop:Include href='cid:http://example.org/me.png'/> </m:photo> <m:sig xmlmime:content-type='application/pkcs7-signature'> <xop:Include href='cid:http://example.org/my.hsh'/> </m:sig> </m:data> </soap:Body> </soap:Envelope> --MIME_boundary Content-Type: image/png Content-Transfer-Encoding: binary Content-ID: <http://example.org/me.png> // binary octets for png --MIME_boundary Content-Type: application/pkcs7-signature Content-Transfer-Encoding: binary Content-ID: <http://example.org/my.hsh> // binary octets for signature --MIME_boundary--
Example shows another serialization of
the same XML Infoset, when it happens that the photo contained in the
m:photo
element is available on the web. Taking advantage
of this, the serialization uses a Content-Location
header
instead of a Content-ID
header to identify the MIME part
containing the photo.
MIME-Version: 1.0 Content-Type: Multipart/Related;boundary=MIME_boundary; type=text/xml;start="<mymessage.xml@example.org>" Content-Description: An XML document with my picture and signature in it --MIME_boundary Content-Type: text/xml; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-ID: <mymessage.xml@example.org> <soap:Envelope xmlns:soap='http://www.w3.org/2003/05/soap-envelope' xmlns:xop='http://www.w3.org/2003/12/xop/include' xmlns:xmlmime='http://www.w3.org/2004/06/xmlmime'> <soap:Body> <m:data xmlns:m='http://example.org/stuff'> <m:photo xmlmime:content-type='image/png'> <xop:Include href='http://example.org/me.png'/> </m:photo> <m:sig xmlmime:content-type='application/pkcs7-signature'> <xop:Include href='cid:http://example.org/my.hsh'/> </m:sig> </m:data> </soap:Body> </soap:Envelope> --MIME_boundary Content-Type: image/png Content-Transfer-Encoding: binary Content-Location: http://example.org/me.png // binary octets for png --MIME_boundary Content-Type: application/pkcs7-signature Content-Transfer-Encoding: binary Content-ID: <http://example.org/my.hsh> // binary octets for signature --MIME_boundary--
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC 2119].
This specification uses a number of namespace prefixes throughout; they are listed below. Note that the choice of any namespace prefix is arbitrary and not semantically significant.
Prefix | Namespace |
---|---|
Notes | |
xop | "http://www.w3.org/2003/12/xop/include" |
A non-normative XML Schema [XML Schema Part 1], [XML Schema Part 2] document for the "http://www.w3.org/2003/12/xop/include" namespace and for [XML 1.0] can be found at http://www.w3.org/2000/xp/Group/3/06/Attachments/include.xsd. | |
xmlmime | "http://www.w3.org/2004/06/xmlmime" |
The namespace for the content type attribute. | |
xs | "http://www.w3.org/2001/XMLSchema" |
The namespace of XML Schema data types [XML Schema Part 2]. |
Editorial note: HR | |
Note that the "http://www.w3.org/2004/06/xmlmime" URI is not final and will be changed. |
XOP operates by extracting the Optimized Content from the Original
Infoset to create the XOP Infoset. In particular, the character
information item children of element information
items to be optimized are removed and replaced with an
element information item named xop:Include
.
The xop:Include
element information item
contains an attribute information item with a link to
the part of the XOP Package that carries a binary representation of the
data removed from the original element information item.
Details of the construction and processing of XOP serializations are
provided in 3 XOP's Processing Model.
The Infoset used as input to XOP processing MUST NOT contain any
element information item with a [namespace name]
property of "http://www.w3.org/2003/12/xop/include"
and a [local name] property of Include
. Infosets
containing such element information items cannot be
serialized using XOP.
The following subsections provide formal definitions for the element information item and attribute information items used to construct a XOP serialization. A non-normative XML Schema for [XML 1.0] serializations of those element information item and attribute information items can be found at http://www.w3.org/2000/xp/Group/3/06/Attachments/include.xsd.
xop:Include
element information
item
The xop:Include
element information item
has the following constraints on its properties:
Include
.
xop:Include
element information items and MUST be ignored if it is
not recognized.
href
attribute information
items (see 2.2 href attribute information item).
xop:Include
element information items and MUST be ignored if
not recognized.
href
attribute information item
The href
attribute information item has the
following constraints on its properties:
href
.
xop:Include
element information item). The [normalized value] MUST
be a valid lexical form of the XML Schema xs:anyURI
datatype (see [XML Schema Part 2]3.2.17
anyURI).
xop:Include
element information item containing the
attribute information item.
This section describes the processing model for creating XOP Packages and interpreting XOP Packages. Unless otherwise stated, the result of such processing MUST be semantically equivalent to performing the specified steps separately, and in the order given.
To create a XOP Package from an Original XML Infoset:
Include
. As discussed in 2 XOP Infosets Constructs, XML Infosets with such element
information items cannot be represented using XOP.
xs:base64Binary
(see [XML Schema Part 2]3.2.16
base64Binary) and MUST not contain any whitespace
characters, preceding, inline with or following the non-whitespace
content.
xop:Include
element
information item (see 2.1 xop:Include element information
item) constructed as follows:
href
attribute information item of
the xop:Include
element information
item (see 2.2 href attribute information item).
xop:Include
element information item)
has a xmlmime:content-type
attribute
information item, its value SHOULD be reflected
appropriately in the part's metadata.
Additional parts MAY be added to the package to satisfy application specific requirements. Other content-specific metadata MAY be reflected in the packaging metadata as appropriate.
If content cannot be successfully encoded into the XOP Infoset, implementations SHOULD behave as if that portion of the Original XML Infoset was not nominated for optimization.
This section specifies the means by which the Original XML Infoset can be reconstructed from a XOP Package that has been prepared according to the rules of 3.1 Creating XOP Packages.
Note: conventions or error reporting mechanisms to be used in processing packages that incorrectly purport to be XOP Packages are beyond the scope of this specification.
To create a Reconstituted XML Infoset from a XOP Package:
xop:Include
element information item (as defined in 2.1 xop:Include element information
item):
xop:Include
's href
attribute
information item (i.e., corresponding to the URI
encoded in the attribute information item's
[normalized value]).
xop:Include
element information item
with the data reconstructed from the
package part).
XOP is capable of using a variety of underlying packaging mechanisms. Such packaging mechanisms MUST be able to represent, with full fidelity all the parts created according to 3 XOP's Processing Model (see 3.1 Creating XOP Packages).
The subsection below specifies normatively how a particular packaging mechanism, MIME Multipart/Related, is used, but does not preclude the use of other packaging mechanisms with the XOP convention.
This section describes how MIME Multipart/Related packaging (as specified in [RFC 2387]) is used with XOP.
The root MIME part is the root part of the package, and MUST be a serialization of the XOP Infoset using any W3C recommendation-level version of XML (e.g., [XML 1.0], [XML 1.1]), as defined below. This root MIME part MUST be identified with a media type that is specific to the XOP encoding of the Original XML Infoset's media type, as described in 5 Identifying XOP Documents. This media type MAY specify which version of XML was used for the serialization of the XOP Infoset.
Except for purposes of determining the root MIME part, as specified by [RFC 2387], ordering of MIME parts MUST NOT be considered significant to XOP processing or to the construction of the XOP Infoset.
Part metadata is reflected in MIME header fields. Specifically, if
the URI used in the value of a xop:Include
element's
href
attribute has a 'cid' scheme, the corresponding MIME
part's Content-ID header field (see [RFC 2387] and
[RFC 2392]) MUST have a corresponding field-value.
Otherwise, the MIME part's Content-Location header field (see [RFC 2557]) MUST have a field-value identical to the URI in
the value of the href
attribute.
Furthermore, if a xmlmime:content-type
attribute
information item is found (as described in 3 XOP's Processing Model), it SHOULD be reflected in the MIME
Content-Type header's field-value.
XOP Packages can use any of a variety of packaging mechanisms, and furthermore can serialize any number of XML formats. To enable applications to identify the use of XOP as well as the original Infoset's media type, a distinct media type MUST be registered for each application of XOP to a particular XML application vocabulary. This media type will be used to label the XOP package's root part, while the package's media type itself is unchanged. Formats MAY also restrict which version of XML the media type allows to be used in the XOP Infoset.
For example, if the format identified by "application/soap+xml" is to be packaged as XOP serializations, then a XOP-specific media type (e.g., "application/soap_xop+xml") MUST be registered. A XOP Package using the Multipart/Related packaging mechanism and serializing such an Infoset would have a package media type of "multipart/related" and a root media type of "application/soap_xop+xml".
Processors wishing to identify the use of XOP or the media type of the resulting Output Infoset MAY examine the root media type of the package. However they MAY also rely on other mechanism for doing this identification.
Note that there is currently no convention for structuring the names of XOP-based media types.
The integrity of Infosets optimized using XOP may need to be ensured. As XOP packages can be transformed to recover such Infosets (see 3.2 Interpreting XOP Packages, existing XML Digital Signature techniques can be used to protect them. Note, however, that a signature over the Infoset does not necessarily protect against modifications of other aspects of the XOP packaging; for example, an Infoset signature check might not protect against re-ordering of non-root parts.
The confidentiality of XOP Packages may need to be ensured. As such
packages can be transformed to an XML Information Set, existing XML
Encryption (see [XML Encryption]) techniques can be used to
protect such packages. Any part of a package can be encrypted,
whether it includes base64 characters or not. The resulting
CipherData
element information item can then
be optimized because the content of such an element information
item is base64 characters.
In future a transform algorithm for use with XML Encryption could provide a more efficient processing model where the raw octets are encrypted directly.
This appendix summarizes the dependencies upon XOP's underlying specifications, the nature of appropriate payloads for XOP and the means of extending XOP.
The XOP convention builds upon a number of underlying specifications. They are:
XML (e.g., [XML 1.0], [XML 1.1]) - The XOP Document is encoded using any W3C recommendation-level version of XML (see 3.1 Creating XOP Packages). Formats that use XOP MUST identify which versions of XML are permissible for encoding the XOP Infoset. XOP does not constrain the use of any mechanisms defined by XML, including those explicitly allowing extensions, nor does it constrain the use of underlying specifications.
Namespaces in XML (e.g., [Namespaces in XML], [Namespaces in XML 1.1]) - The XOP Document uses any W3C recommendation-level version of Namespaces in XML compatible with the version(s) of XML used. Formats that use XOP MUST identify which versions of Namespaces in XML are permissible for encoding the XOP Infoset. XOP does not constrain the use of any mechanisms defined by Namespaces in XML, including those explicitly allowing extensions, nor does it constrain the use of underlying specifications.
Uniform Resource Identifiers (see [RFC 2396]) - The XOP Document uses URIs to locate parts in the XOP Package (see 2.2 href attribute information item. XOP does not constrain the use of any mechanisms defined by URIs, including those explicitly allowing extensions, nor does it constrain the use of underlying specifications.
Packaging Mechanism - XOP requires the use of a packaging mechanism that satisfies the requirements in 4 XOP Packages. One such mechanism MUST be in use, but XOP does not require a specific mechanism. Formats using XOP MUST identify at least one such mechanism permissible for creating the XOP Package, and MUST specify how each allowed mechanism is to be used for building the XOP Package.
The relationship of one such mechanism to XOP, The MIME Multipart/Related Content-type, is specified in 4.1 MIME Multipart/Related XOP Packages.
The payload of a XOP Package is an XML Infoset. XOP constrains the
range of admissible characters in the payload to those contained in
the "Char" production of a W3C recommendation-level version of XML.
Additionally, the Original XML Infoset cannot contain an
element information item with a [local name] of of
Include
and a [namespace name] of
"http://www.w3.org/2003/12/xop/include". Finally,
portions of the payload which are nominated for optimization in XOP
MUST be base64-encoded data in the canonical lexical form of XML
Schema base64Binary
datatype (see [XML Schema Part 2] 3.2.16
base64Binary).
XOP Documents allow extensions to the xop:Include
element
when they do not change its semantics. Changes to the format MUST be
identified by a new namespace URI (i.e., they MUST define a new
Include
element information item in another
namespace).
The extensibility of XOP's underlying specifications is not constrained by their use in XOP.