Common User Agent Problems

Once Upon A Time, A User Agent...

W3C Note 06 February 2001

This version: http://www.w3.org/TR/2001/NOTE-cuap-20010206
Latest version: http://www.w3.org/TR/cuap
Authors: Karl Dubost <karl@w3.org>, W3C; Hugo Haas <hugo@w3.org>, W3C; Ian Jacobs <ij@w3.org>, W3C

Abstract

This document explains some common mistakes in user agents due to incorrect or incomplete implementation of specifications, and suggests remedies. It also suggests some "good behavior" where specifications themselves do not specify any particular behavior (e.g., in the face of error conditions). This document is not a complete set of guidelines for good user agent behavior.

Note: This document does not incriminate specific user agents. W3C does not generally track bugs or errors in implementations. That information is generally tracked by the vendors themselves or third parties.

Status of this document

This document is a Note made available by the W3C for discussion only. Publication of this Note by W3C indicates no endorsement by W3C or the W3C Team, or any W3C Members. There is no commitment by W3C to invest additional resources in topics addressed by this Note.

The authors are making this document available for information only. While the authors welcome comments on this document, they do not guarantee a reply or any further action. Please send comments to www-talk@w3.org; public archives are available.

A list of current W3C technical reports and publications, including Working Drafts and Notes, can be found at http://www.w3.org/TR/.

Introduction
1. Usability
2. Rendering
3. Protocols implementation
4. URI handling
Acknowledgements
References

Introduction

This document explains some common mistakes in user agents due to incorrect or incomplete implementation of specifications, and suggests remedies. It also suggests "good behavior" where specifications themselves do not specify any particular behavior (e.g., in the face of error conditions).

Each suggestion in this document, called a "checkpoint", begins with a one-sentence description of the "right thing to do." Checkpoints also include rationale, examples, and references. Checkpoints are not ranked according to importance. They are not listed in any particular order.

This document does not address accessibility issues for user agents. Please refer to W3C's User Agent Accessibility Guidelines 1.0 [UAAG10] for information on how to design user agents that are accessible to people with disabilities.

1. Usability

This section focuses on the user's experience, including customization, user interface, and other usability issues.

1.1 When the user follows a link to a target anchor, highlight the target location.

Techniques:

Put the target location at a consistent location in the viewport (e.g., at the top of a graphical viewport).
Allow configuration to highlight (e.g., graphically, through audio cues) the target location. Ensure that highlight mechanisms do not rely on color alone and are distinguishable from other highlight mechanisms.

1.2 If the user attempts to follow a link that is broken because it designates a missing anchor, let the user know it is broken.

There are many ways to indicate to the user that a link is broken. The recommended behavior is as follows:

Do not scroll or otherwise change the viewport. This could make the user believe the link is not broken.
Indicate to the user (e.g., via a text message in the status bar) that the link is broken. If no message is given to the user, they will not understand why the viewport didn't move.
Ensure that any non-text message to the user has a text equivalent; text may be rendered as visually displayed text, synthesized speech, and braille. Audio cues or visual cues may be used in addition to text messages.

Wrong: Some user agents scroll to the top or bottom of the document when the user attempts to follow a broken link. This behavior is discouraged since it is indistinguishable from the correct behavior when a target is at the beginning or end of a document.

References:

For information about accessible user interfaces, please refer to the User Agent Accessibility Guidelines 1.0 [UAAG10].
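
The recommended behavior can be sketched as follows. This is only an illustration, not a definitive implementation: the names AnchorCollector and follow_fragment are hypothetical, and the sketch assumes that an anchor is either an "id" attribute on any element or a "name" attribute on an A element. When the anchor is missing, the function returns a text message and leaves the decision not to scroll to the caller.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every "id" value and every "name" value found on A elements."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def follow_fragment(html_text, fragment):
    """Return ("scroll", fragment) if the anchor exists, else ("broken", message)."""
    collector = AnchorCollector()
    collector.feed(html_text)
    if fragment in collector.anchors:
        return ("scroll", fragment)
    # Missing anchor: the viewport stays where it is and the user gets a text message.
    return ("broken", f'Broken link: no anchor named "{fragment}" in this document')
```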

1.3 Allow the user to retrieve Web resources even if the browser cannot render them.

User agents may not be able to render certain types of content on the Web either natively or through a plug-in (e.g., XML content, XSLT style sheets, RDF documents, DTDs, or XML schemas). User agents should allow users to retrieve and save these resources; otherwise, users may not be able to access this Web content at all.

1.4 When the user requests to print a frameset, allow the user to select to print an individual frame or the frameset.

The presentation of the frameset could be achieved, for example, by:

proposing a list of frames to the user.
using a graphical representation of the organization of the frames.

Note: The authors do not encourage Web content developers to use frames as they can cause many usability and accessibility problems.

References:

HTML frames are specified in section 16 of the HTML 4.01 Recommendation [HTML 4.01].

1.5 Allow the user to add new URI schemes in a straightforward way.

For instance, allow users to associate external programs with URI schemes. The user agent should inform the user when it does not recognize a URI scheme in content.

Example:

A user may want the "tel" scheme (e.g., tel:+33-4-12-34) to interact with their telephone. Or they may want the "irc" scheme (e.g., irc://irc.example.org/) to activate an IRC client on their desktop with a connection to the specified server.

Wrong: Some user agents ignore the scheme part (before the ":") when the scheme is unknown to them, interpret the colon character as though it were encoded as '%3A' and then treat the URI as though it were a relative URI, usually producing a broken link (and confusing users).

References:

From section 3 of "Uniform Resource Identifiers (URI): Generic Syntax" [RFC2396]:

An absolute URI contains the name of the scheme being used followed by a colon (":") and then a string whose interpretation depends on the scheme.
Refer to information about URI schemes in section 3.1 of "Uniform Resource Identifiers (URI): Generic Syntax" [RFC2396].
For a list of known URI schemes, see "An Index of WWW Addressing Schemes" [SCHEMES].
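
One possible shape for such a mechanism is a user-editable table that maps schemes to external programs. The sketch below is hedged: the table contents, the program names "irc-client" and "dialer", the BUILT_IN set, and the function name dispatch are all assumptions made for illustration. The point it shows is that an unknown scheme produces a message for the user and is never reinterpreted as a relative URI.

```python
from urllib.parse import urlsplit

# Hypothetical user-editable table: URI scheme -> external program.
scheme_handlers = {"irc": "irc-client", "tel": "dialer"}
# Schemes this imaginary user agent handles itself (an assumption).
BUILT_IN = {"http", "ftp"}


def dispatch(uri):
    """Decide how to handle a URI based on its scheme alone."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in BUILT_IN:
        return ("fetch", uri)
    if scheme in scheme_handlers:
        return ("launch", scheme_handlers[scheme], uri)
    if scheme:
        # Unknown scheme: inform the user; do not treat the URI as relative.
        return ("warn", f'Unrecognized URI scheme "{scheme}"')
    return ("relative", uri)
```

Adding a scheme then amounts to adding one entry to the table, for example scheme_handlers["mailto"] = "mail-client".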

1.6 Allow the user to override any mechanism for guessing URIs or keywords.

Many user agents compensate for incomplete URIs by applying a series of transformations with the hope of creating a URI that works. For example, many user agents transform the string www.w3.org into the URI http://www.w3.org/. The user should be able to control whether, for example, typing a keyword should invoke a Web search or whether the user agent should prepend http://www. and append .org/.
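
A minimal sketch of such a user control follows. The setting name "guessing" and its three values ("off", "complete", "search") are hypothetical, as is the function name; the transformations are only the ones mentioned above.

```python
def resolve_typed_text(text, guessing="complete"):
    """Turn what the user typed into an action, honoring the user's preference.

    guessing is a hypothetical user setting: "off", "complete", or "search".
    """
    if "://" in text or guessing == "off":
        return ("open", text)  # use the input exactly as typed
    if guessing == "search":
        return ("search", text)  # hand the keyword to a Web search
    # "complete": apply the transformations described in the text.
    if "." in text:
        return ("open", f"http://{text}/")
    return ("open", f"http://www.{text}.org/")
```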

1.7 Warn users about incomplete documents and transfers.

Rendering an incomplete document as though it were complete is very likely to confuse users. Part of the document is missing, hence some anchors might not be present, possibly breaking some links. The user agent should notify the user that the document is incomplete.

The HTTP/1.1 specification describes this behavior for caches at the protocol level. Partial responses should also be made obvious to the user with a warning.

References:

The correct behavior is specified in section 13.8 of the HTTP/1.1 specification [RFC2616].

A cache MUST NOT return a partial response to a client without explicitly marking it as such, using the 206 (Partial Content) status code. A cache MUST NOT return a partial response using a status code of 200 (OK).
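
One way a user agent might detect an incomplete transfer is to compare the declared Content-Length with the number of bytes actually received. The sketch below assumes that the headers are available as a dictionary keyed by the exact header name, and the function name and message wording are illustrative only.

```python
def transfer_warning(headers, bytes_received, status=200):
    """Return a warning string for the user, or None if the body looks complete."""
    if status == 206:
        return "Partial content: only part of the document was transferred."
    declared = headers.get("Content-Length")
    if declared is not None and declared.isdigit() and bytes_received < int(declared):
        missing = int(declared) - bytes_received
        return f"Incomplete document: {missing} of {declared} bytes are missing."
    # No length declared, or everything arrived: nothing to report.
    return None
```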

1.8 Provide a mechanism to allow authentication information to expire.

Many browsers allow configuration to save HTTP authentication [RFC2616, RFC2617] information ("remember my password"). They should also allow users to "flush" that authentication information on request. For instance, the user may wish to leave the user agent running but tell it to forget the password to access the user's bank account.

Wrong: Most user agents consider that authentication information (e.g., password) provided by a user for a server/realm pair during a session is immutable for the duration of the session.
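
The following sketch shows one way to make stored authentication information both expirable and flushable on request. The class name, the method names, and the optional maximum age are assumptions for illustration; storage is keyed by the server/realm pair mentioned above.

```python
import time


class CredentialCache:
    """Hold credentials per (server, realm) pair, with flush and optional expiry."""

    def __init__(self, max_age_seconds=None, clock=time.monotonic):
        self._entries = {}
        self._max_age = max_age_seconds
        self._clock = clock

    def store(self, server, realm, username, password):
        self._entries[(server, realm)] = (username, password, self._clock())

    def lookup(self, server, realm):
        entry = self._entries.get((server, realm))
        if entry is None:
            return None
        username, password, stored_at = entry
        if self._max_age is not None and self._clock() - stored_at > self._max_age:
            del self._entries[(server, realm)]  # expired: forget it
            return None
        return (username, password)

    def flush(self, server=None, realm=None):
        """Forget one server/realm pair, or everything when called without arguments."""
        if server is None:
            self._entries.clear()
        else:
            self._entries.pop((server, realm), None)
```

A "forget this password" command in the user interface would call flush for the relevant pair while the user agent keeps running.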

1.9 When a Web resource includes metadata that may be recognized by the user agent, allow the user to view that metadata.

Metadata – data about data – can provide very useful context to users about information on the Web. For instance, metadata about a book might include the book's author, title, publication date, publisher, etc. (refer to the Dublin Core [DC] for information about library-type metadata). Authors include metadata in HTML documents through a variety of elements and attributes (e.g., the TITLE and ADDRESS elements, the "alt", "title", and "summary" attributes, etc.). Languages such as the Resource Description Framework [RDF] allow users to populate the Web with rich metadata. User agents should provide a user interface to allow users to view metadata. The user interface may vary according to the underlying markup language. For instance, many graphical browsers render the HTML "title" attribute (e.g., as a tool-tip) when the user selects or hovers over an element with that attribute specified.

References:

Some projects that address the display of metadata are linked from the RDF home page at the W3C Web site.

1.10 Allow the user to keep track of completed HTTP POST requests.

Users may wish to track and archive HTTP POST requests for the same reasons they wish to track and archive email. For instance, if the user places a book order through a form, and that form uses a POST request, the user should be able to store information about that transaction.

References:

HTTP/1.1 POST requests are described in section 9.5 of the HTTP/1.1 specification [RFC2616].
"Axioms of Web architecture: User Agent watch points" [UAWP].

1.11 Allow the user to bookmark negotiated resources.

The HTTP/1.1 protocol [RFC2616] allows the client to request a representation of a resource which is best suited to its needs (language, media type, etc); this mechanism is called "content negotiation".

When a resource is negotiated, the user might want to bookmark a particular version. For example, a document might be available in several languages under the same URI, and the user might want to point somebody to the Canadian version of this document, which has a different URI.

In such a case, it should be possible to bookmark either the original URI or the URI of the view that the user got. The original URI can be interpreted as being the generic object and the retrieved document as one view of this object.

References:

For more information on content negotiation, see section 12 of the HTTP/1.1 specification, [RFC2616].
See also checkpoint 3.4 about temporary redirects.

1.12 Allow the user to choose among supported transfer encodings.

HTTP/1.1 [RFC2616] allows transfer encoding. An example of encoding is data compression, which speeds up Web browsing over a slow connection.

The user agent should allow the user to set the transfer encoding in the HTTP requests sent out.

References:

Refer to information about the "TE" request header, described in section 14.39 of the HTTP/1.1 specification [RFC2616].

1.13 Use the user interface language as the default value for language negotiation.

The user should be allowed to specify the set of languages that the user agent may use for language negotiation.

In case the user does not specify any language, the user agent may use the language of its user interface as the value sent out. The user agent should allow the user to override this behavior.

References:

For more information on content negotiation, see section 12 of the HTTP/1.1 specification, [RFC2616].
For more information about the HTTP Accept-Language header, see section 14.4 of the HTTP/1.1 specification, [RFC2616].
For information about privacy issues related to the Accept-Language header, see section 15.1.4 of the HTTP/1.1 specification, [RFC2616].
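
A minimal sketch of this default follows. The function name is hypothetical, and the way quality values decrease by 0.1 per language is an assumption chosen for illustration; the user's own list always takes precedence over the user interface language.

```python
def accept_language(user_languages, ui_language):
    """Build an Accept-Language value; fall back to the UI language when the user chose none."""
    languages = list(user_languages) or [ui_language]
    parts = []
    for index, tag in enumerate(languages):
        # Assumed scheme: the first language is unqualified, later ones get lower q-values.
        q = max(1.0 - 0.1 * index, 0.1)
        parts.append(tag if index == 0 else f"{tag};q={q:.1f}")
    return ", ".join(parts)
```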

2. Rendering

This section focuses on issues related to style sheets and link types.

2.1 Implement user style sheets. Allow the user to select from author and user style sheets or to ignore them.

A style sheet is a set of rules that specifies how to render a document on a graphical desktop computer monitor, on paper, as synthesized speech, etc. A document may have more than one style sheet associated with it, and users should be able to select from alternative style sheets.

References:

For information about associating style sheets with an HTML document, refer to section 14.3 of the HTML 4.01 Recommendation [HTML 4.01].
For XML, refer to the "Associating Style Sheets with XML documents" Recommendation [XML-STYLE].
User selection of style sheets is a requirement of the User Agent Accessibility Guidelines 1.0 [UAAG10].

2.2 Respect media descriptors when applying style sheets.

Some markup and style sheet languages allow authors (e.g., @media construct in [CSS2], media attribute in [HTML 4.01]) to design documents that are rendered differently according to the characteristics of the output device: whether graphical display, television screen, handheld device, speech synthesizer, braille display, etc.

References:

For information about media descriptors in HTML 4.01, please refer to section 6.13 of the HTML 4.01 Recommendation [HTML 4.01].
For information about media types in CSS2, please refer to section 7 of the CSS2 Recommendation [CSS2].
For information about negotiation of device capabilities, please refer to the W3C Note "Composite Capability/Preference Profiles" [CC/PP].
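
As an illustration, selecting style sheets for a given output medium might look like the sketch below. It assumes each style sheet is described by a (href, media) pair, that the media value is a comma-separated list of descriptors as in the HTML 4.01 "media" attribute, and that an empty value defaults to "screen". The function name is hypothetical, and the sketch does not implement the full parsing rules of section 6.13.

```python
def applicable_style_sheets(style_sheets, output_medium):
    """Keep the style sheets whose media descriptors match the output medium.

    style_sheets: list of (href, media) pairs.
    """
    selected = []
    for href, media in style_sheets:
        descriptors = {entry.strip() for entry in (media or "screen").split(",")}
        if "all" in descriptors or output_medium in descriptors:
            selected.append(href)
    return selected
```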

2.3 If a style sheet is missing, ignore it and continue processing.

Users must be able to view content even without style sheets.

Wrong: In some user agents, missing style sheets result in a fatal error or result in the user agent not rendering content.

References:

From section 3.2 of the CSS2 Recommendation, [CSS2]:

For each source document, [a user agent] must attempt to retrieve all associated style sheets that are appropriate for the supported media types. If it cannot retrieve all associated style sheets (for instance, because of network errors), it must display the document using those it can retrieve.

2.4 Implement the HTML 4 recognized link types.

Section 6.12 of the HTML 4.01 Recommendation [HTML 4.01] lists some link types that may be used by authors to make assertions about linked Web resources. These include alternate, stylesheet, start, next, prev, contents, glossary, and others. Although the HTML 4.01 specification does not specify definitive rendering or behavior for these link types, user agents should interpret them in useful ways. For instance, the start, next, prev, and contents link types may be used to build a table of contents, or may be used to identify the print order of documents, etc.

3. Protocols implementation

This section focuses on the implementation of network protocols used to download resources from the Web.

3.1 Save resources retrieved from the Web on the local system using the appropriate system naming conventions.

The media type of a resource retrieved by HTTP [RFC2616] is determined by the content type and encoding returned by the server in the response headers.

If the user wants to save a resource locally, the user agent should respect the system naming conventions for files (e.g., PNG images usually have a .png extension).

Example:

http://www.w3.org/TR/1999/REC-html401-19991224/html40.ps is a view of the gzip'ed PostScript version of the HTML 4.01 specification. The HTTP headers sent by the server include:

Content-Type: application/postscript; qs=0.001
Content-Encoding: gzip

If saved locally, the filename on most computers should be html40.ps.gz for the applications to recognize the file type.

Wrong: Saving this compressed PostScript document as html40.ps is likely to confuse other applications.

References:

RFC1630 [RFC1630] specifies that URIs are opaque to the client.
Content type information is described in section 7.2.1 of the HTTP/1.1 specification, [RFC2616].
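
The example above can be sketched as a function that derives the local file name from the response headers rather than from the URI alone. The two extension tables are small illustrative assumptions (a real user agent would consult the system's own mappings), and the function name is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative tables only; not a complete or authoritative mapping.
TYPE_EXTENSIONS = {"application/postscript": ".ps", "image/png": ".png", "text/html": ".html"}
ENCODING_EXTENSIONS = {"gzip": ".gz", "compress": ".Z"}


def local_filename(uri, content_type, content_encoding=None):
    """Derive a local file name from Content-Type and Content-Encoding."""
    name = posixpath.basename(urlsplit(uri).path) or "index"
    media_type = content_type.split(";")[0].strip().lower()
    type_ext = TYPE_EXTENSIONS.get(media_type, "")
    encoding_ext = ENCODING_EXTENSIONS.get((content_encoding or "").strip().lower(), "")
    # Remove an encoding suffix already present so it is not duplicated.
    if encoding_ext and name.endswith(encoding_ext):
        name = name[: -len(encoding_ext)]
    if type_ext and not name.lower().endswith(type_ext):
        name += type_ext
    return name + encoding_ext
```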

3.2 Respect the media type of a resource if one is explicitly given using the Content-Type HTTP header.

Example:

If an HTML document is returned with a Content-Type value of text/plain, the user agent must render the document as plain text without interpreting HTML elements and attributes (i.e., the HTML source must be displayed).

Reference:

From section 7.2.1 of the HTTP/1.1 specification, [RFC2616]:

If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource.
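
The quoted rule can be sketched as below. The function name and the "sniff" callback (standing for whatever guessing mechanism a user agent has) are hypothetical, and the headers are assumed to be a dictionary keyed by the exact header name. Guessing is consulted only when no Content-Type is given.

```python
def effective_media_type(headers, sniff):
    """Use the Content-Type header when present; guess only when it is absent."""
    content_type = headers.get("Content-Type")
    if content_type:
        # Drop parameters such as charset and keep the bare media type.
        return content_type.split(";")[0].strip().lower()
    return sniff()
```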

3.3 Respect the character set of a resource when one is explicitly given.

User agents must respect the character set when it is explicitly specified in the response. The character set can be given by the HTTP Content-Type headers and/or by the document-internal fallback (HTML meta element, etc).

References:

From section 3.4.1 of the HTTP/1.1 specification, [RFC2616]:

HTTP/1.1 recipients MUST respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset MUST use the charset from the content-type field if they support that charset [..].
From section 5.2.2 of the HTML 4.01 Recommendation, [HTML 4.01]:
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
1. An HTTP "charset" parameter in a "Content-Type" field.
2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
3. The charset attribute set on an element that designates an external resource.
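
The three priorities quoted above can be sketched as a simple fallback chain. The function and parameter names are assumptions; the sketch only extracts a "charset" parameter from a Content-Type value and does not validate the charset name.

```python
import re


def charset_from_content_type(value):
    """Extract the charset parameter from a Content-Type value, if any."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', value or "", re.IGNORECASE)
    return match.group(1) if match else None


def document_charset(http_content_type=None, meta_content_type=None, link_charset=None):
    """Apply the priorities from highest to lowest: HTTP header, META, linking element."""
    return (
        charset_from_content_type(http_content_type)
        or charset_from_content_type(meta_content_type)
        or link_charset
    )
```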

3.4 Do not treat HTTP temporary redirects as permanent redirects.

The HTTP/1.1 specification [RFC2616] specifies several types of redirects. The two most common are designated by the codes 301 (permanent) and 302 or 307 (temporary):

A 301 redirect means that the resource has been moved permanently and the original requested URI is out-of-date.
A 302 or 307 redirect, on the other hand, means that the resource has a temporary URI, and the original URI is still expected to work in the future. The user should be able to bookmark, copy, or link to the original (persistent) URI or the result of a temporary redirect.

Wrong: User agents usually show the user (in the user interface) the URI that is the result of a temporary (302 or 307) redirect, as they would do for a permanent (301) redirect.

References:

For more information about HTTP/1.1 response codes 301, 302, and 307, refer to section 10.3.2, section 10.3.3, and section 10.3.8, respectively, of the HTTP/1.1 specification [RFC2616].
Refer to "Axioms of Web architecture: User Agent watch points" [UAWP].
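
One way to keep track of both URIs is sketched below. The function name is hypothetical, and the handling of mixed chains (a permanent redirect reached through a temporary one does not replace the persistent URI) is a design assumption rather than something the specification states.

```python
def uri_to_present(original_uri, redirects):
    """Return (persistent_uri, fetched_uri) for a chain of redirects.

    redirects: list of (status_code, location) pairs in the order received.
    """
    persistent = original_uri
    fetched = original_uri
    only_permanent_so_far = True
    for status, location in redirects:
        fetched = location
        if status == 301 and only_permanent_so_far:
            persistent = location  # the old URI is out of date
        else:
            only_permanent_so_far = False  # temporary: keep the persistent URI
    return persistent, fetched
```

The user interface can then offer the persistent URI by default for bookmarks while still letting the user copy the URI that was actually fetched.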

3.5 If a host name has multiple DNS entries, try them all before concluding that the Web site is down.

Many Web sites have a single hostname like www.example.org resolve to multiple servers for the purpose of load balancing or mirroring. If one server is unreachable, others may still be up, so browsers should try to contact all the servers of a Web site before concluding that the Web site is down.
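
A minimal sketch of this behavior follows, assuming a TCP connection is what "contacting a server" means here. The function name is hypothetical; the "resolve" and "connect" parameters exist only so that the logic can be exercised without a network and default to the standard library calls.

```python
import socket


def connect_to_any(host, port, timeout=5.0,
                   resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Try every address the host name resolves to; fail only when all of them fail."""
    errors = []
    for family, socktype, proto, canonname, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        try:
            return connect(sockaddr[:2], timeout=timeout)
        except OSError as error:
            errors.append((sockaddr[0], error))  # remember the failure, try the next one
    raise OSError(f"{host} is unreachable at all {len(errors)} addresses: {errors}")
```

Only after the loop has exhausted every address does the sketch report that the site is down.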

3.6 List only supported media types in an HTTP Accept header.

HTTP/1.1 [RFC2616] defines content negotiation. The client sending out a request gives a list of media types that it is willing to accept; the server then returns a representation of the object requested in one of the specified formats if it is available.

When entities are embedded in a document (such as images in HTML documents), user agents should only send Accept headers for the formats they support.

Example:

If a user agent can render JPEG, PNG and GIF images, the list of media types accepted should be image/jpeg, image/png, image/gif.

Wrong: User agents should not send an HTTP header of Accept: */* since the server may support content types that the user agent does not. For instance, if a server is configured so that SVG images are preferred to PNG images, a user agent that only supports PNG, GIF, and JPEG will receive (unsupported) SVG rather than (supported) PNG.

References:

For more information on content negotiation, see section 12 of the HTTP/1.1 specification, [RFC2616].
For more information about the HTTP Accept header, see section 14.1 of the HTTP/1.1 specification, [RFC2616].
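
The effect described above can be illustrated with a deliberately simplified model. Both function names are hypothetical, and server_choice is only a stand-in for real server-side negotiation (it ignores quality values and type wildcards other than */*).

```python
def accept_value(supported_types):
    """List exactly the media types the user agent can render, in preference order."""
    return ", ".join(supported_types)


def server_choice(server_preferences, accept):
    """Simplified server side: return the first preferred type the Accept value allows."""
    accepted = [item.split(";")[0].strip() for item in accept.split(",")]
    for media_type in server_preferences:
        if media_type in accepted or "*/*" in accepted:
            return media_type
    return None
```

With a server that prefers SVG to PNG, an Accept value of */* yields SVG, whereas an explicit list of JPEG, PNG, and GIF yields PNG.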

4. URI handling

Resources are located on the Web using Uniform Resource Identifiers [RFC2396]. This section discusses how user agents should handle URIs.

4.1 Handle the fragment identifier of a URI when the HTTP request is redirected.

When a resource (URI1) has moved, an HTTP redirect can indicate its new location (URI2).

If URI1 has a fragment identifier #frag, then the new target that the user agent should be trying to reach would be URI2#frag. If URI2 already has a fragment identifier, then #frag must not be appended and the new target is URI2.

Wrong: Most current user agents do implement HTTP redirects but do not append the fragment identifier to the new URI, which generally confuses the user because they end up with the wrong resource.

References:

HTTP redirects are described in section 10.3 of the HTTP/1.1 specification [RFC2616].
The required behavior is described in detail in "Handling of fragment identifiers in redirected URLs" [RURL].
The term "Persistent Uniform Resource Locator (PURL)" designates a URL (a special case of a URI) that points to another one through an HTTP redirect. For more information, refer to "Persistent Uniform Resource Locators" [PURL].

Example:

Suppose that a user requests the resource at http://www.w3.org/TR/WD-ruby/#changes and the server redirects the user agent to http://www.w3.org/TR/ruby/. Before fetching that latter URI, the browser should append the fragment identifier #changes to it: http://www.w3.org/TR/ruby/#changes.
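
The rule can be sketched in a few lines. The function name is hypothetical; the sketch resolves the redirect location against the requested URI and carries the original fragment identifier over only when the new URI does not already have one.

```python
from urllib.parse import urldefrag, urljoin


def redirect_target(requested_uri, location):
    """Compute the URI to present after a redirect, preserving the fragment identifier."""
    target = urljoin(requested_uri, location)
    original_fragment = urldefrag(requested_uri).fragment
    new_url, new_fragment = urldefrag(target)
    if original_fragment and not new_fragment:
        # URI2 has no fragment of its own: append the one from URI1.
        return f"{new_url}#{original_fragment}"
    return target
```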

Acknowledgements

The authors would like to thank the W3C Team for their input.

References

CC/PP: "Composite Capability/Preference Profiles (CC/PP): A user side framework for content negotiation", Franklin Reynolds, Johan Hjelm, Spencer Dawkins, Sandeep Singhal, 27 July 1999. Available at http://www.w3.org/1999/07/NOTE-CCPP-19990727/.
CSS2: "Cascading Style Sheets, Level 2", Bert Bos, Håkon Wium Lie, Chris Lilley, Ian Jacobs, 12 May 1998. Available at http://www.w3.org/TR/1998/REC-CSS2-19980512/.
DC: Dublin Core. Available at http://www.dublincore.org/.
HTML 4.01: "HTML 4.01 Specification", Dave Raggett, Arnaud Le Hors, Ian Jacobs, 24 December 1999. Available at http://www.w3.org/TR/1999/REC-html401-19991224/.
PURL: "Introduction to Persistent Uniform Resource Locators", Keith Shafer, Stuart Weibel, Erik Jul, Jon Fausey. Available at http://purl.oclc.org/OCLC/PURL/INET96.
RDF: "Resource Description Framework (RDF) Model and Syntax Specification", Ora Lassila, Ralph R. Swick, 22 February 1999. Available at http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/.
RFC1630: "Universal Resource Identifiers in WWW", T. Berners-Lee, June 1994. Available at http://www.ietf.org/rfc/rfc1630.txt.
RFC2396: "Uniform Resource Identifiers (URI): Generic Syntax", T. Berners-Lee et al., August 1998. Available at http://www.ietf.org/rfc/rfc2396.txt.
RFC2616: "Hypertext Transfer Protocol -- HTTP/1.1", R. Fielding et al., June 1999. Available at http://www.ietf.org/rfc/rfc2616.txt.
RFC2617: "HTTP Authentication: Basic and Digest Access Authentication", J. Franks et al., June 1999. Available at http://www.ietf.org/rfc/rfc2617.txt.
RURL: "Handling of fragment identifiers in redirected URLs", B. Bos, 30 June 1999. Available at http://www.ics.uci.edu/pub/ietf/http/draft-bos-http-redirect-00.txt.
SCHEMES: "An Index of WWW Addressing Schemes", Dan Connolly, 2000. Available at http://www.w3.org/Addressing/schemes.
UAAG10: "User Agent Accessibility Guidelines 1.0", Jon Gunderson, Ian Jacobs, 10 March 2000. Available at http://www.w3.org/TR/2000/PR-UAAG10-20000310/.
UAWP: "Axioms of Web architecture: User Agent watch points", Tim Berners-Lee, 1998. Available at http://www.w3.org/DesignIssues/UserAgent.
XML-STYLE: "Associating Style Sheets with XML documents Version 1.0", James Clark, 29 June 1999. Available at http://www.w3.org/1999/06/REC-xml-stylesheet-19990629/.