XML Security – Issues and Requirements
BEA Systems is a vendor of distributed systems infrastructure software. We have been shipping our implementations of XML Signature and XML Encryption in our products for several years now. This paper reflects our experiences in implementing these standards, our customers’ experiences in using them, and our experiences defining other standards which normatively reference them. These include SAML, XACML, Digital Signature Services, SPML and WS-Security, all developed at OASIS, and the Basic Security Profile, developed at WS-I.
Our uses of XML and XML security are primarily in the context of network protocols and our requirements are primarily shaped by the characteristics of such environments.
Our experiences with XML Signature and XML Encryption have generally been positive. They provide more flexibility than prior alternatives, such as TLS and IPSec, which enables advanced e-commerce use cases. The following capabilities are particularly useful.
However, we have encountered a variety of issues relating to the use of XML Signature and XML Encryption both separately and in combination. The rest of this paper discusses these issues. While real problems in the field have been rare to date, it is our expectation that as applications begin to take advantage of the capabilities of these specifications and the ones that use them, problems will become more frequent. Failure to resolve these issues, especially the ones causing spurious validation errors, could lead to abandonment of their use.
This section describes issues relating to the use of XML Signature.
Our concerns about Canonicalization Algorithms relate mostly to the handling of namespace declarations, which in fact is the primary difference between the two most commonly implemented Canonicalization Algorithms. Canonical XML has the advantage of retaining all in-scope namespace declarations, thus ensuring all the semantics of the document are integrity protected. Unfortunately, because signatures constructed using this algorithm are invalidated by the addition or removal of namespace declarations in the surrounding context, it is not suitable for network protocol environments as described above.
Exclusive XML Canonicalization retains only namespaces which are visibly used in the portion of the XML document being signed. This works extremely well when a self-contained XML document is constructed, signed and then inserted into some other XML document. An example would be when a SAML assertion is signed and then carried somewhere in a SOAP message. However, in the environment described above, problems can arise even when using Exclusive Canonicalization.
Exclusive XML Canonicalization is not completely immune to spurious validation errors produced by adding or removing namespace declarations in the enclosing document. The following example was originally suggested by Melvin Hughes.
Consider signing this with Exclusive Canonicalization, including the foo prefix in the InclusiveNamespaces PrefixList.
<SomeEnclosingElement>
<ToBeSigned wsu:Id="tbs">
<Data xmlns:foo="urn:foo" Something="foo:Bar"/>
</ToBeSigned>
</SomeEnclosingElement>
If an intermediary introduces the use of this namespace prefix in an ancestor, then the signature will break because xmlns:foo will be canonicalized with the ToBeSigned element instead of the Data element. In both cases, foo will be assigned the same value and will be included under the signature, but since it will appear within a different element, the signature value computed will be different.
Exclusive Canonicalization is also subject to the security risk that a namespace declaration will not be included under the signature if its prefix is used only within element content or an attribute value. When the namespace defines some critical aspect of the semantics of a qname with this prefix, this will create ambiguity even though the XML is digitally signed.
Although it is now recognized that it is best practice to avoid the use of qnames in content, both to avoid this threat of ambiguity and for language-theoretic reasons, it is still done in many applications and even required by some standards. In any event, it would be risky to assume that qnames will never be present in content when performing XML Canonicalization.
The use of the InclusiveNamespaces PrefixList is intended to overcome this risk, but its use is problematic. In practice, application software which understands if and where qnames have been used in content and security software responsible for performing Canonicalization are distinct layered components. In principle it would be possible to define APIs to allow applications to specify prefixes to include in the PrefixList. However, there are practical difficulties. For example, Canonicalization and hence a PrefixList may be associated with each <ds:Reference> element as well as the <ds:SignedInfo> element.
In light of these considerations, many implementers have added a preprocessing pass, in which the message content is scanned for text preceding a colon which matches the prefix of any namespace currently in scope. Any prefixes which are found are added to the InclusiveNamespaces PrefixList for the associated Exclusive Canonicalization step. This creates a slight possibility that a namespace declaration will appear under the signature when it is not strictly required, but 1) it guarantees that any qnames in content will be properly handled and 2) it will not cause any spurious validation errors which would not have occurred for other reasons.
However, this extra processing pass over the data is ugly and tends to undermine the purpose of Exclusive Canonicalization. It also impacts performance, which is already an issue for XML Signature. This is discussed below.
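The preprocessing pass described above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not a description of any shipping implementation. The function name candidate_prefixes and the regular expression are assumptions made for this sketch. It tracks which prefixes are in scope while parsing and reports those that appear immediately before a colon in attribute values or text.

```python
import io
import re
import xml.etree.ElementTree as ET

# Heuristic (an assumption of this sketch): an XML-name-like token
# immediately followed by a colon is treated as a possible prefix.
_PREFIX_BEFORE_COLON = re.compile(r"(?<![\w.\-])([A-Za-z_][\w.\-]*):")


def candidate_prefixes(xml_text):
    """Return the in-scope prefixes that appear before a colon in content."""
    in_scope = []   # stack of declared prefixes, innermost last
    found = set()

    def scan(value):
        for prefix in _PREFIX_BEFORE_COLON.findall(value or ""):
            if prefix in in_scope:
                found.add(prefix)

    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            in_scope.append(payload[0])   # payload is (prefix, uri)
        elif event == "end-ns":
            in_scope.pop()                # a declaration leaves scope
        else:
            # "end": the element's own declarations are still in scope here.
            scan(payload.text)
            for value in payload.attrib.values():
                scan(value)
            for child in payload:
                scan(child.tail)          # text following each child
    return sorted(found)
```

The returned list would then be supplied as the PrefixList for the corresponding canonicalization step. As noted above, the heuristic can report a prefix that is not really part of a qname, which only causes an extra declaration to be signed.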
XML Signature also has an issue which can result in spurious validation errors. It was discovered during the development of the Digital Signature Services Standard that there is no form of XPath expression (except one referencing an Id Attribute) which is guaranteed to return a given piece of XML content which has been inserted into a protocol message, assuming that other arbitrary content has subsequently been added to the message. No matter how the XPath expression is formulated, it will be possible for the reference to match the newly added content.
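A small Python sketch may make this concrete. It uses the limited XPath subset in the standard library's ElementTree module rather than a full XPath engine, and the message layout, the path and the function names are all hypothetical. The same positional expression selects the original content before an intermediary acts and the newly added content afterwards.

```python
import xml.etree.ElementTree as ET

SIGNED_PATH = "./Body[1]/Data"   # hypothetical reference chosen by the signer


def resolve(root, path=SIGNED_PATH):
    """Return the text of the element the path currently selects."""
    return root.find(path).text


def add_content_as_intermediary(root):
    """Simulate an intermediary inserting new content ahead of the original."""
    injected = ET.fromstring("<Body><Data>injected</Data></Body>")
    root.insert(0, injected)


message = ET.fromstring("<Envelope><Body><Data>signed</Data></Body></Envelope>")
before = resolve(message)            # selects the originally signed element
add_content_as_intermediary(message)
after = resolve(message)             # now selects the inserted element
```

A verifier that dereferences the expression after the insertion digests different content from what the signer digested, so validation fails even though the signed content is still present and unmodified.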
Although the use of an Id attribute eliminates this problem, there are a number of problems in using Id attributes to indicate signed content.
All XML Id attributes are defined to match, regardless of the namespace they are defined in. This means that if an Id attribute is duplicated, the behavior of a reference to that Id is undefined. Obviously this is a security threat if an Id which is used to identify signed data is duplicated elsewhere in the message. This can easily happen when the message is constructed by multiple software components acting independently.
A UUID-like procedure can be used whenever an Id is generated to ensure it is unique, but there is no generally accepted best practice in this area. A minor disadvantage to this approach is that it reduces readability.
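As a rough illustration of both points, the following Python sketch generates Id values from random UUIDs and checks a message for duplicated Id values. It is only a sketch under stated assumptions: the function names are hypothetical, and attributes are recognized as Ids purely by local name (Id, ID or id), whereas real ID typing comes from a DTD or schema.

```python
import uuid
from collections import Counter
import xml.etree.ElementTree as ET


def new_unique_id():
    """Generate an Id value that is unlikely to collide across components."""
    # An xs:ID must be an NCName, which cannot start with a digit, so the
    # random hex string is given a leading underscore.
    return "_" + uuid.uuid4().hex


def duplicate_ids(xml_text, local_names=("Id", "ID", "id")):
    """Return Id values that occur more than once, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        for name, value in elem.attrib.items():
            local = name.rsplit("}", 1)[-1]   # strip "{namespace-uri}"
            if local in local_names:
                counts[value] += 1
    return sorted(value for value, n in counts.items() if n > 1)
```

A receiver could run such a check before dereferencing any signature reference and reject the message if the list is not empty.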
Another disadvantage of using Id Attributes for signature references is that the Id is part of the signed data. This means that if the XML has previously been signed and an Id attribute is added, the original signature will be invalid. In other scenarios, removing an Id attribute will also invalidate the signature.
Some uses of Id Attributes can lead to security threats. An obvious example is when a portion of the content is signed in a situation where the relative position of the content has some semantic import. For example, imagine an order form which has a billing address and a delivery address. If they were signed independently using an Id attribute reference, it would be possible to interchange them without invalidating the signatures. A number of these threats are discussed in [McIntosh].
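The address example can be sketched in Python as follows. This is a simplified model and not real XML Signature processing: a bare SHA-256 digest over the standard library's C14N 2.0 output stands in for a signed reference, and the order layout, Id values and function names are invented for the illustration.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical order form; each Address is referenced independently by Id.
ORDER = (
    '<Order>'
    '<BillTo><Address Id="a1"><Street>1 Main St</Street></Address></BillTo>'
    '<ShipTo><Address Id="a2"><Street>9 Dock Rd</Street></Address></ShipTo>'
    '</Order>'
)


def digest_by_id(root, id_value):
    """Digest the canonical form of the element carrying the given Id."""
    for elem in root.iter():
        if elem.get("Id") == id_value:
            fragment = ET.tostring(elem, encoding="unicode")
            canonical = ET.canonicalize(fragment)
            return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    raise KeyError(id_value)


def swap_addresses(root):
    """Interchange the two Address elements between BillTo and ShipTo."""
    bill_to, ship_to = root.find("BillTo"), root.find("ShipTo")
    billing, delivery = bill_to[0], ship_to[0]
    bill_to.remove(billing)
    ship_to.remove(delivery)
    bill_to.append(delivery)
    ship_to.append(billing)


def digests(root):
    """Collect the per-Id digests a verifier would recompute."""
    return {i: digest_by_id(root, i) for i in ("a1", "a2")}
```

Calling digests on the original order and on a copy passed through swap_addresses gives identical results, since each digest covers only the referenced element and not its position in the document.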
General experience with XML Signature has shown that its use can lead to many new types of threats. In general this does not represent any deficiency in XML Signature, but is a consequence of the increased flexibility it provides. The ability to sign portions of messages, create overlapping signatures and add or subtract text after signature creation has led to the discovery of a number of new security threats. It is likely there are others which have not been discovered yet.
Computing and verifying XML signatures has been discovered to be very resource intensive. Anecdotal evidence suggests that Canonicalization is a major element. This has led to the development of at least one alternative method to be used in restricted circumstances. [SIMSIGN] This is an area that needs investigation.
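One way to begin such an investigation is to time the canonicalization and digest steps separately. The Python sketch below is a starting point only. It measures the standard library's C14N 2.0 implementation, which is not one of the algorithms discussed in this paper, on a synthetic document, and the function names and document shape are assumptions. The figures it produces depend on the machine and implementation and should not be read as general results.

```python
import hashlib
import time
import xml.etree.ElementTree as ET


def build_document(items=2000):
    """Build a synthetic document with many small elements."""
    root = ET.Element("Batch")
    for i in range(items):
        item = ET.SubElement(root, "Item", {"n": str(i), "kind": "example"})
        item.text = "value-%d" % i
    return ET.tostring(root, encoding="unicode")


def time_steps(xml_text, repeats=5):
    """Return seconds spent canonicalizing and digesting (repeats >= 1)."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Note: this timing includes parsing the input text.
        canonical = ET.canonicalize(xml_text)
    c14n_seconds = time.perf_counter() - start

    data = canonical.encode("utf-8")
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha256(data).digest()
    digest_seconds = time.perf_counter() - start
    return {"canonicalize": c14n_seconds, "digest": digest_seconds}
```

For example, time_steps(build_document()) returns the two timings for a 2000-element document, which can then be compared across document sizes and canonicalization implementations.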
XML Encryption has generally presented fewer problems than XML Signature. However, one unfortunate effect is that the presence of encrypted data automatically makes a message schema invalid, thus preventing schema checking of the remainder of the XML. Some standards have defined schemas which may include encrypted data, but it would be better to deal with this in a more comprehensive way.
As is the case with Signature, the flexibility of XML Encryption has led to the discovery of new types of security threats. In some cases, the use of encryption has interacted with other protocols to create new threats. [McIntosh] In other cases, the interactions between XML Encryption and XML Signature have been the source of new threats. [WSS 1.1] [BSP]
[McIntosh] XML Signature Element Wrapping Attacks and Countermeasures, Michael McIntosh, Paula Austel, http://domino.research.ibm.com/library/cyberdig.nsf/papers/73053F26BFE5D1D385257067004CFD80/$File/rc23691.pdf
[SIMSIGN] SAMLv2.0 HTTP POST “SimpleSign” Binding http://www.oasis-open.org/committees/download.php/24974/draft-sstc-saml-binding-simplesign-03.pdf
[WSS 1.1] Web Services Security: SOAP Message Security 1.1; Chapter 13 – Security Considerations http://www.oasis-open.org/committees/download.php/16790/wss-v1.1-spec-os-SOAPMessageSecurity.pdf
[BSP] Basic Security Profile Version 1.1; Chapter 18 – Security Considerations http://www.ws-i.org/Profiles/BasicSecurityProfile-1.1.html