Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
The architecture of the Web depends on applications having a shared understanding of the messages exchanged between agents (clients, servers, intermediaries, etc.) and a shared expectation of how the payload of the message -- a representation -- will be interpreted by the recipient. The Web architecture uses representation metadata to indicate the sender's intentions to the recipient whenever the protocols used for communication allow such metadata to be communicated. In particular, dispatching and security-related decisions regarding the processing of a message are often based on Internet media type values provided in representation metadata fields, such as the "Content-Type" field of HTTP and MIME. In this finding, we review the architectural design choice that metadata provided by an origin server be authoritative. We also examine why client behavior that misrepresents the user or resource provider is harmful. Finally, we consider how specification authors should incorporate these points into their work.
This document has been developed for discussion by the W3C Technical Architecture Group. This draft finding addresses issue contentTypeOverride-24 and partly addresses issue errorHandling-20. The TAG finding "Internet Media Type registration, consistency of use" also includes material related to this issue.
Publication of this finding does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time.
Additional TAG findings, both approved and in draft state, may also be available. The TAG expects to incorporate this and other findings into a Web Architecture Document that will be published according to the process of the W3C Recommendation Track.
The terms MUST, SHOULD, and SHOULD NOT are used in this document in accordance with [RFC2119].
Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).
1 Summary of Key Points
2 Scenarios
3 Why metadata from the resource provider is authoritative
3.1 Multiple interpretations possible; only one authoritative
3.2 Metadata and efficiency
3.3 Self-describing data
4 Why user agent behavior that misrepresents the user or resource provider is harmful
4.1 Examples of inconsistencies
4.2 Client handling of inconsistencies
5 Hints in specifications
6 Handling misconfigured servers
7 Conclusion
8 Future Work
9 References
10 Acknowledgments
The following are the key architectural points of this finding:
For example, when an HTTP message contains a representation as its payload (within its message body), the HTTP header field "Content-Type", if present, defines the Internet media type of that representation. The recipient is not allowed to "guess" the media type unless no such metadata was provided with the representation.
Section 2 presents scenarios where these points have been ignored and poses the question of what has been ignored and by whom. Section 3 discusses the motivation for a Web architecture where origin metadata is authoritative. Section 4 examines the potential harm caused by user agents that misrepresent the user or silently disregard authoritative metadata. Section 5 discusses the interaction between metadata hints in format specifications and protocol headers. Section 6 suggests ways in which server management can alleviate some problems due to inconsistencies between provided content and configured metadata.
Scenario 1: Stuart runs his own Web server at "http://www.example.org/". He creates an HTML page and means to serve it as "text/html", but misconfigures the origin server so that the content is served via HTTP/1.1 [RFC2616] as "text/plain". Tim's browser looks inside the page, detects some markup that suggests that this is an HTML document (e.g., a <!DOCTYPE declaration or <title> element), and quietly renders it according to the HTML specification rather than as plain text. Janet's browser displays the content as plain text.
Which party has neglected a principle of Web architecture: Stuart for the server misconfiguration, Tim's browser for silently overriding the server's headers, or Janet's browser for not detecting that the content looked like HTML?
Answer 1: By silently overriding authoritative metadata, Tim's browser did not respect Web architecture principles that promote shared understanding.
Scenario 2: Norm publishes an XHTML document that includes:
<link href="cool-style" type="text/css" rel="stylesheet"/>
Norm's "cool-style" is an XSLT style sheet, but Norm has set type to "text/css". Stuart has configured the origin server so that "cool-style" is served via HTTP/1.1 as "application/xslt+xml". With a user agent that understands XSLT but not CSS, Janet requests the content that includes this link element. As it loads the page, Janet's user agent reads the type hint and does not fetch "cool-style."
Which party is responsible for the fact that Janet did not receive content she should have: Stuart for the server configuration, Norm for stating that the "cool-style" sheet is served as "text/css" when in fact it's served with a different media type, or Janet's user agent for not double-checking the media type with the server?
Answer 2: Though not a violation of principles of Web architecture, Norm's mislabeling of content deprived Janet of content she should have received.
In the sections below, we explore these answers in more detail.
Successful communication between two parties using a piece of information relies on shared understanding of the meaning of the information. Arbitrary numbers of independent parties can identify and communicate about a Web resource. To give these parties the confidence that they are all talking about the same thing when they refer to "the resource identified by the following URI ..." the design choice for the Web is, in general, that the provider of a resource assigns the authoritative interpretation of representations of the resource. A representation is an octet sequence that consists logically of two parts:
In terms of Web architecture, the authoritative interpretation of representations is communicated as follows:
Interpretation of bits on the Internet is governed by protocol specifications. The HTTP/1.1 specification, for example, delegates the assignment of metadata for a message entity (the representation enclosed within a message) to header fields "Content-Encoding" and "Content-Type", where the latter's value is defined by an Internet media type (e.g., "text/html" or "image/jpeg") that, in turn, identifies a registered data format specification (e.g., XHTML, CSS, PNG, XLink, RDF/XML, etc.). The IANA registry maps media types to data format specifications.
For instance, in the IANA registry, the content type "text/html" is associated with [RFC2854], which in turn states that:
Thus, by serving a representation with media type "text/html", the resource provider declares that the HTML 4.01 Recommendation governs the authoritative interpretation. By serving representation data (even HTML data) with media type "text/plain", the resource provider declares that [RFC2046] and [RFC2646] govern the authoritative interpretation. This is the first part of the explanation of why Tim's browser in scenario 1 is the culprit.
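The declaration described above can be pictured as a lookup from the media type in the metadata to the specifications that govern interpretation. The following Python sketch is illustrative only: the table holds just the two registrations discussed in this section, and the names `GOVERNING_SPECS` and `governing_specs` are hypothetical, not part of any protocol or library.

```python
# Illustrative excerpt only: the two media types discussed in this finding.
# The real registry is maintained by IANA and is far larger.
GOVERNING_SPECS = {
    "text/html": ["RFC2854", "HTML 4.01"],
    "text/plain": ["RFC2046", "RFC2646"],
}


def governing_specs(content_type):
    """Return the specifications that govern interpretation for a Content-Type value."""
    # Drop parameters such as "; charset=..." and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return GOVERNING_SPECS.get(media_type, [])


# The same bytes, served under two different declarations.
body = b"<!DOCTYPE html><title>Hello</title>"
as_html = governing_specs("text/html; charset=utf-8")
as_plain = governing_specs("text/plain")
```

Note that `body` never enters the lookup: under this model the authoritative interpretation follows from the declared media type, not from what the bytes happen to look like.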
The fact that there is one authoritative interpretation of representation data does not imply that there is only one possible interpretation. The Web's model is designed to enable global understanding by having parties agree to follow a small set of rules for interpreting bits (starting with the media type). Parties may reach local agreements independently, but they do not change the authoritative interpretation of the representation data.
One benefit of using metadata to guide processing is improved efficiency. For instance, when a server sends XML data and proper metadata, a client can determine the media type after rapid inspection of a short string. It is considerably more expensive in processing time to start up an XML parser to guess the media type.
Data is "self-describing" if it includes enough information to allow two parties to establish a consistent interpretation without additional clues. If the author intends for the data to be interpreted in a manner other than what is self-described (e.g., "treat this XML content as plain text"), then clarifying metadata is required (e.g., in protocol headers). Providing redundant metadata for data that is self-describing can lead to inconsistencies, however. Thus, for example, server managers SHOULD NOT in general specify the character encoding for XML data in protocol headers since the data is self-describing.
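As an illustration of what "self-describing" means for the character encoding of XML data, the following Python sketch reads the encoding from the data itself. It is a deliberately simplified assumption-laden example, not the full detection procedure of the XML specification: it handles only a UTF-8 or UTF-16 byte order mark and an XML declaration written in an ASCII-compatible encoding, and the function name `declared_xml_encoding` is hypothetical.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_xml_encoding(data):
    """Return the character encoding that XML data declares about itself."""
    # A byte order mark identifies the encoding family directly.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Otherwise look for an explicit encoding declaration.
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # With no byte order mark and no declaration, XML data is UTF-8.
    return "utf-8"
```

Because the data already carries this information, a protocol header that repeats it adds a second place where the value can be wrong.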
Below we examine appropriate client behavior when inconsistencies are detected between what the resource provider declares the media type to be through metadata and any type information available by inspection of the representation data itself.
A user agent represents the user for interactions with servers. Misrepresentation may lead to violations of privacy, security holes, and just plain confusion. User agent behavior that misrepresents the user or misrepresents the resource provider ultimately undermines trust on the Web and is thus considered harmful. Some examples of potential security violations include:
A client that ignores authoritative metadata without the user's consent undermines the goal of creating a shared information space.
In scenario 1, in terms of Web architecture, Stuart is innocent: misconfiguration of the server is not an architectural error, just a human error. Instead, Tim's browser is the culprit since it misrepresents the resource provider by ignoring the authoritative metadata without Tim's consent. Janet's browser respected the "Content-Type" header field and, by doing so, helps Janet and Stuart detect a server misconfiguration.
Examples of inconsistencies between headers and representation data that have been observed on the Web include:
Clients should detect such inconsistencies but should not resolve them without involving the user (e.g., by securing permission or at least providing notification).
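A minimal Python sketch of this behavior, using scenario 1 as the example, might look as follows. The inspection heuristic, the function names `looks_like_html` and `check_consistency`, and the wording of the warning are all assumptions made for illustration; a real client would use its own detection logic and user interface.

```python
def looks_like_html(data):
    """Very rough content inspection: a DOCTYPE or <title> near the start."""
    head = data[:512].lower()
    return b"<!doctype html" in head or b"<title" in head


def check_consistency(declared_content_type, data):
    """Return (media_type_to_use, warning_or_None) without overriding the metadata."""
    declared = declared_content_type.split(";", 1)[0].strip().lower()
    if declared == "text/plain" and looks_like_html(data):
        # Detect the inconsistency, but leave the decision to the user.
        warning = (
            "Served as text/plain but the content looks like HTML; "
            "showing it as plain text unless the user chooses otherwise."
        )
        return declared, warning
    return declared, None


media_type, warning = check_consistency(
    "text/plain", b"<!DOCTYPE html><title>Stuart's page</title>"
)
```

The sketch keeps the declared media type as the default and only surfaces the mismatch, which is the difference between Janet's browser (plus a notification) and Tim's browser.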
Another form of inconsistency is when the client expects to receive metadata that includes media type information and the server does not provide this information. For instance, HTTP/1.1 [RFC2616], section 7.2.1 describes client behavior in the case when the server sends no media type:
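Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".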
This excerpt is consistent with the principle that metadata from the origin server is authoritative when present. HTTP/1.1 allows a client to guess when no "Content-Type" header is present; in this case, representation data that is self-describing is likely to lead to a consistent interpretation among multiple parties.
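The ordering implied here (declared media type first, guessing only when none was sent, "application/octet-stream" as the last resort) can be sketched in Python as follows. This is one possible reading rather than a reference implementation: the function name `effective_media_type` is hypothetical, and the guess uses only the URI's name extension via the standard library's `mimetypes.guess_type`, not inspection of the content.

```python
import mimetypes


def effective_media_type(content_type, url):
    """Pick the media type to process, guessing only when none was declared."""
    if content_type:
        # Metadata from the origin server is authoritative when present.
        return content_type.split(";", 1)[0].strip().lower()
    # No Content-Type was sent: guessing from the URI's extension is permitted.
    guessed, _encoding = mimetypes.guess_type(url)
    # If the type is still unknown, fall back to the generic binary type.
    return guessed or "application/octet-stream"
```

For example, `effective_media_type("text/plain", "http://www.example.org/page.html")` keeps the declared "text/plain" even though the extension suggests HTML; the extension is consulted only when the first argument is empty or `None`.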
In the absence of metadata from the origin server, a flexible client would do even more than merely guess and silently proceed. For instance, in different configurations the client could:
In Scenario 2, Norm is responsible for Janet not having access to content she was meant to receive. The HTML 4.01 Recommendation states that "Authors who use [the type] attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address." Janet's client could have done more than merely read the type hint and decide to skip the "cool-style." Users benefit from clients that allow different configurations for handling hints, including:
It is not a violation of Web architecture when a client overrides server metadata and processes representation data in a non-authoritative manner, as long as the client is not misrepresenting the user or resource provider. For instance, an application does not violate Web architecture when it receives a "Content-Type" header of "text/html" and, rather than following the HTML 4.01 Recommendation, provides the service of validating the HTML, detecting broken links, converting it to another format, or rendering it as plain text. The problem arises when the user agent engages in non-authoritative behavior without the user's awareness or consent.
Some format specifications allow authors to include in content "hints" for servers and clients. For instance, the http-equiv attribute of the HTML meta element was intended for servers (not clients). In HTML 2.0 [RFC1866], section 5.2.5, the attribute is specified as follows:
The HTML 4.01 attribute type for the link element (used in Scenario 2) gives clients a hint about what the media type of a representation of the linked resource is likely to be.
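One way a client might treat such a hint as advisory is sketched below in Python. The functions `should_fetch` and `media_type_to_process` and the `trust_hints` setting are hypothetical names introduced for illustration; they stand in for whatever configuration a real user agent offers.

```python
def should_fetch(type_hint, supported_types, trust_hints=True):
    """Decide whether to fetch a linked resource, treating the hint as advisory."""
    if type_hint is None or not trust_hints:
        # With no hint, or when configured to double-check, ask the server.
        return True
    return type_hint.lower() in supported_types


def media_type_to_process(type_hint, served_content_type):
    """After fetching, the served metadata takes precedence over the author's hint."""
    if served_content_type:
        return served_content_type.split(";", 1)[0].strip().lower()
    return type_hint.lower() if type_hint else None


# Scenario 2: the hint says CSS, the server says XSLT, the client supports only XSLT.
supported = {"application/xslt+xml"}
skipped = not should_fetch("text/css", supported)
fetched_anyway = should_fetch("text/css", supported, trust_hints=False)
processed_as = media_type_to_process("text/css", "application/xslt+xml")
```

With hints trusted, Janet's user agent skips "cool-style"; with the double-check configuration it fetches the style sheet and processes it according to the media type the server actually declares.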
A format specification that includes hints for clients should make clear that when these hints interact with server metadata, they are advisory only. Format specifications SHOULD NOT include requirements for clients to override server metadata without user consent. An architecturally sound description of an advisory attribute might read:
The W3C Recommendation SMIL 2.0 [SMIL20] is inconsistent with the current finding in this regard since the definition of the type attribute (section 7.3.1) specifies circumstances in which type is supposed to take precedence over server metadata.
The rationale frequently provided by specification designers for why the author should be able to override server metadata in content is to work around misconfigured servers. In many environments, authors do not have sufficient access to affect server configuration. The TAG does not believe that author-specified overrides are the proper solution to this problem (for the reasons cited above, including security risks and masking of the problem).
Server managers can help reduce the risk of error through careful assignment of representation metadata (especially that which applies across representations). In particular:
Dan Connolly generously provided significant input to this finding. Martin Dürst, Roy Fielding, Philipp Hoschka, Rob Lanphier, Stuart Williams, and Norm Walsh also provided valuable input. Many thanks to all reviewers for their contributions to this finding.