Working with Time Zones

October 2005

This version:
Latest version:
Previous version:
Editors:
Addison Phillips, (Invited Expert)
Felix Sasaki, W3C
Mark Davis, IBM
Martin Dürst, Aoyama University


Abstract

This document discusses some of the problems encountered when working with the date, time, and dateTime values from XML Schema when those values include (or omit) time zone offsets. Many W3C technologies rely on date and time types. Examples include the XQuery 1.0 and XPath 2.0 Functions and Operators specification, which is the basis for XQuery and XSLT processing of date/time values, but the concepts presented affect any date and time processing.

Status of this Document

This document is an editors' copy that has no official standing.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the first public Working Draft of the Internationalization Core Working Group for review by interested parties. Publication as Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced as part of the W3C Internationalization Activity, following the procedures set out for the W3C Process.

Your feedback is welcome, especially regarding the open issues identified in this document. Please send your comments to the public mailing list (public-i18n-core@w3.org), an archived mailing list dedicated to discussion associated with the Internationalization Core Working Group. See W3C mailing list and archive usage guidelines.

As of this publication, the Working Group expects this document to become a W3C Working Group Note.

Table of Contents

1 Working With Time Zones
    1.1 Background
    1.2 Identifying Time Zones and Zone Offsets
    1.3 Incremental versus Field-Based Time
    1.4 Guidelines
        1.4.1 Working with Field-Based Dates and Times
            1.4.1.1 Working with Date and Time Values that Require a Time Zone (and not a zone offset)
        1.4.2 Comparing Times
    1.5 Recommendations for XQuery / XSLT

Appendix

A References (Non-Normative)


1 Working With Time Zones

Time-related data is a common requirement for many applications. XML Schema provides a variety of data types for dates and times, such as date, time, and dateTime. These data types follow internationally friendly formats defined by ISO 8601 and can be used to address a variety of differing date or time applications.

The date, time, and dateTime types can either include or omit the time zone offset. The presence (or absence) of the offset means that the data value must be handled differently for certain kinds of operations. In addition, the particular application and source of the date and time values affect how dates and times with different time zones or zone offsets should be handled, as well as how to handle values that lack any time zone or zone offset indication.

Note:

Users and implementers of languages or specifications that handle time-related data (for example XQuery, XPath, and XSLT) should take the following recommendations into account even if time-zone-sensitive data is rarely used. Sooner or later some data will be affected by the issues described.

1.1 Background

There are three main uses of the date, time, and dateTime data types in applications.

  1. Incremental or Computer Time. Most programming languages and development environments provide data types for handling time which are based on a numeric value: units of some specific length measured from a specific point in time (called the epoch). For example, the Java type java.util.Date wraps a long (integer) value holding the number of milliseconds since 00:00 (midnight) on January 1, 1970 in UTC (Coordinated Universal Time, sometimes also called GMT). Other systems use other units and epochs. Date and time values based on a construct of this type (which we'll call computer time) are time-zone-independent, since at any given moment it is the same time in UTC everywhere on Earth: the values can be transformed for display for any particular time zone offset, but the value itself is not tied to a specific location. Values of this type are commonly used in applications as "time stamps", showing when an event occurred. Some applications for these include:

  2. Time Zone Independent Field-Based Time. The human representation of computer times is more complicated, and represents time using various separate field values, such as hour, minute, month, or year. One application for this type of representation is for values that are time zone independent, representing a logical event divorced from a particular location on the Earth. For example, various kinds of "anniversary date" such as a person's birthdate or an employee hire date would normally fall into this category, partly because time is not expressed, and partly because the actual time of the start and end of the day for a given geographic location may not be considered important. Some other examples of this application of dates and times include:

  3. Time Zone Dependent Field-Based Time. In other cases, field-based dates and times are supposed to represent values linked to a particular location or time zone. For example, if you tell someone in London that you will telephone them at 14:00 Paris time, they will expect the phone to ring at 13:00. As with incremental time, the event happens in the same instant around the globe, and the meaning of the value depends on the offset from UTC. Some other examples of this application of dates and times include:
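
The three applications described above can be sketched in Python using only the standard library. This is an illustration rather than anything defined by XML Schema: the variable names, the sample values, and the fixed offsets used for Paris (+01:00) and London (+00:00), which assume a winter date, are all assumptions made for the example.

```python
from datetime import date, datetime, timedelta, timezone

# 1. Incremental (computer) time: milliseconds since the epoch,
#    1970-01-01T00:00:00Z. The number itself carries no location.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
epoch_millis = 1_130_000_000_000  # hypothetical time stamp
instant = epoch + timedelta(milliseconds=epoch_millis)

# The same instant rendered for display at another zone offset.
instant_minus_8 = instant.astimezone(timezone(timedelta(hours=-8)))

# 2. Time zone independent field-based time: a birthdate has fields
#    (year, month, day) but no offset at all.
birthdate = date(1970, 3, 15)  # hypothetical value

# 3. Time zone dependent field-based time: a call placed at 14:00 in Paris.
paris_winter = timezone(timedelta(hours=1))    # assumed fixed offset
london_winter = timezone(timedelta(hours=0))   # assumed fixed offset
call_in_paris = datetime(2005, 11, 7, 14, 0, tzinfo=paris_winter)
call_in_london = call_in_paris.astimezone(london_winter)

print(instant.isoformat())          # 2005-10-22T16:53:20+00:00
print(instant_minus_8.isoformat())  # 2005-10-22T08:53:20-08:00
print(call_in_london.isoformat())   # 2005-11-07T13:00:00+00:00
```

The two renderings of the time stamp compare as equal because they denote the same instant; only the displayed fields differ.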

1.2 Identifying Time Zones and Zone Offsets

XML Schema follows the ISO 8601 standard for its lexical representation. Date and time values in ISO 8601 are field-based (using the definitions above) and can indicate (or omit) the zone offset from UTC. A zone offset is not the same thing as a time zone, and the difference can be important. XML Schema only supports zone offsets but, confusingly, calls them "timezone"; see, for example, Section 3.2.8.1 (lexical representation) of that specification.

Although ISO 8601 is expressed in terms of the Gregorian calendar, it can be used to represent values in any calendar system. The presentation of date and time values to end users using different calendar and timekeeping systems is separate from the lexical representation.

What is a "zone offset"? A zone offset is the difference in hours and minutes between a particular time zone and UTC. In ISO 8601, the particular zone offset can be indicated in a date or time value. The zone offset can be Z for UTC or it can be a signed value ("+" or "-") in hours and minutes relative to UTC. For example, the value 08:00-08:00 represents 8:00 AM in a time zone 8 hours behind UTC, which is the equivalent of 16:00Z (08:00 plus eight hours). The value 08:00+08:00 represents the opposite increment, or midnight UTC (08:00 minus eight hours, that is, 00:00Z).
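
The arithmetic above can be checked with a short Python sketch. The helper name to_utc and the date attached to the time values are assumptions added for the example; the standard library parser is used in place of a full XML Schema lexical parser.

```python
from datetime import datetime, timezone

def to_utc(lexical: str) -> datetime:
    """Parse an ISO 8601 dateTime that carries a zone offset and shift it to UTC."""
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        raise ValueError("no zone offset present: " + lexical)
    return value.astimezone(timezone.utc)

# 08:00 in a zone eight hours behind UTC, and 08:00 in a zone eight hours ahead.
behind = to_utc("2005-10-01T08:00:00-08:00")
ahead = to_utc("2005-10-01T08:00:00+08:00")

print(behind.time())  # 16:00:00
print(ahead.time())   # 00:00:00
```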

What is a "time zone"? A time zone is an identifier for a specific location or region which translates into a combination of rules for calculating the UTC offset. For example, when a website maintaining a group calendar in the United States schedules a recurring meeting for 08:00 Pacific Time, it is referring to what is sometimes known as wall time (so called because that is the time shown "on the clock (or calendar) on the wall"). This is not equivalent to either 08:00-08:00 or 08:00-07:00, because Pacific Time does not have a fixed offset from UTC; instead, the offset changes during the course of the year. As mentioned before, XML Schema only supports zone offset, and it does not make the terminological distinction between zone offset and time zone. So a wall time expressed as an XML Schema time value must choose which zone offset to use. This may have the unintended effect of causing a scheduled event to shift by an hour (or more) when wall time changes to or from Daylight/Summer Time.
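
The changing offset of a wall time can be illustrated as follows. This sketch assumes Python 3.9 or later with TZ database data available to the zoneinfo module; the two sample dates are arbitrary choices for a winter and a summer meeting.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# "Pacific Time" as a time zone (a set of rules), not a fixed offset.
pacific = ZoneInfo("America/Los_Angeles")

# The same 08:00 wall time on a winter date and on a summer date.
winter = datetime(2005, 1, 15, 8, 0, tzinfo=pacific)
summer = datetime(2005, 7, 15, 8, 0, tzinfo=pacific)

print(winter.isoformat())  # 2005-01-15T08:00:00-08:00
print(summer.isoformat())  # 2005-07-15T08:00:00-07:00
```

Storing only one of these offsets with a recurring 08:00 meeting would shift the meeting by an hour for part of the year.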

To complicate matters, the rules for computing when daylight saving time takes effect may be somewhat complex and may change from year to year or from location to location. In the United States, the state of Indiana, for example, does not follow daylight saving time, but this will change in April 2006. See: http://www.mccsc.edu/time.html. The Northern and Southern hemispheres make their Daylight/Summer Time adjustments at opposite times of the year (corresponding to the seasonal differences between the two hemispheres).

To capture these situations, a calendar system must use an ID for the time zone. The most definitive reference for dealing with wall time is the TZ database (also known as the "Olson time zone database"), which is used by systems such as various commercial UNIX operating systems, Linux, Java, CLDR, ICU, and many other systems and libraries. In the TZ database, "Pacific Time" is denoted with the ID America/Los_Angeles. The TZ database also supplies aliases among different IDs; for example, Asia/Ulan_Bator is equivalent to Asia/Ulaanbaatar. From these alias relations, a canonical identifier can be derived. The Common Locale Data Repository can be used to provide a localized form for the IDs: see Appendix J in UTS #35.

1.3 Incremental versus Field-Based Time

Incremental time and field-based time differ in the way certain operations work. For example, incremental times can be compared directly, because their integer values determine which is earlier or later, while field-based times must be normalized and their individual fields compared. Field-based times can have certain kinds of logical operations performed on them (for example, rolling the date forward or back), while incremental time requires a logical transformation. For example, to set the date 2005-08-30 forward by one day, an implementation can add one unit to the "day" field and adjust the month and year as appropriate. In incremental time, a similar operation might be performed by incrementing the value by 24 hours * 60 minutes * 60 seconds * 1000 milliseconds, which is one logical day, but the result may be wrong when a particular day has more or fewer seconds in it (as happens during daylight saving time transitions).
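
The difference between the two kinds of "add one day" can be sketched in Python. The example assumes Python 3.9 or later with TZ database data available, and it picks the weekend of 30 October 2005, when daylight saving time ended in the America/Los_Angeles zone; the variable names are illustrative.

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Field-based: step the day forward; month and year roll over as needed.
print(date(2005, 8, 30) + timedelta(days=1))  # 2005-08-31
print(date(2005, 8, 31) + timedelta(days=1))  # 2005-09-01

pacific = ZoneInfo("America/Los_Angeles")
start = datetime(2005, 10, 29, 8, 0, tzinfo=pacific)

# Field-based step on the wall time: same clock reading, next calendar day.
field_based = start + timedelta(days=1)

# Incremental step: add 24 * 60 * 60 * 1000 milliseconds to the UTC instant.
millis_per_day = 24 * 60 * 60 * 1000
incremental = (
    start.astimezone(timezone.utc) + timedelta(milliseconds=millis_per_day)
).astimezone(pacific)

print(field_based.isoformat())  # 2005-10-30T08:00:00-08:00
print(incremental.isoformat())  # 2005-10-30T07:00:00-08:00
```

The day of the transition is 25 hours long, so the incremental result lands an hour earlier on the wall clock than the field-based one.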

The SQL data types date, time, and timestamp are field-based time values which are intended to be zone offset independent. The data type timestamp with time zone is the zone offset-dependent equivalent of timestamp in SQL. Programming languages, by contrast, tend to use incremental time and convert to and from a localized textual representation on demand. Databases may use incremental time or either zone offset-dependent or independent field-based structures internally. For example, an Oracle 8 database treats a timestamp field as though it is in the local time of the database instance.

As a result, users may not be clear on the differences between these types or may create a mixture of different representations. For example, a Java programmer using JDBC will retrieve incremental times (java.util.Date objects) from a database, even though the actual field in the database is a (field-based) timestamp value.

In XML Schema, as with SQL, dates and times are always expressed using field-based time. The date or time may express the zone offset from UTC (for example using a format such as 08:00:00+01:00). UTC is indicated by the letter Z (for example 08:00:00Z). Or, the zone offset may be omitted completely.

Properly speaking, an XML Schema date or time value with a zone offset is field-based/zone offset dependent and one without is field-based/zone offset independent.

If the two types are mixed, then the interpretation of the zone offset is not adequately specified in XML Schema. In XQuery 1.0 and XPath 2.0 Functions and Operators, the interpretation is implementation-defined and is based on an implicit zone offset. This is usually either UTC or local time. The presence or absence of the zone offset in the XML Schema representation may not be indicative of the original data's intention because of the confusion described above. Proper comparisons or processing rely on normalizing all date and time values into zone offset-independent (or zone offset-dependent) forms and never mixing the two in a particular operation.
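
The effect of the implicit zone offset can be seen in a small Python sketch. Python's datetime is used here only as a stand-in for an XQuery or XPath processor; the helper apply_implicit and the -08:00 "local" offset are hypothetical.

```python
from datetime import datetime, timedelta, timezone

with_offset = datetime.fromisoformat("2005-10-01T12:00:00+00:00")
without_offset = datetime.fromisoformat("2005-10-01T08:00:00")

# Python refuses to order the two kinds of value directly.
try:
    with_offset < without_offset
    mixed_comparison_allowed = True
except TypeError:
    mixed_comparison_allowed = False

def apply_implicit(value: datetime, implicit: timezone) -> datetime:
    """Attach an implicit zone offset to a value that lacks one."""
    return value if value.tzinfo is not None else value.replace(tzinfo=implicit)

utc = timezone.utc
local = timezone(timedelta(hours=-8))  # hypothetical implementation default

# The same pair of values orders differently under different implicit offsets.
print(apply_implicit(without_offset, utc) < with_offset)    # True
print(apply_implicit(without_offset, local) < with_offset)  # False
```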

1.4 Guidelines

This section describes different guidelines that can be applied to various time and date comparisons.

1.4.1 Working with Field-Based Dates and Times

Field-based time and date values require the user to determine whether to use a fixed zone offset, a time zone, or nothing. While XML Schema times are field-based in terms of the lexical representation, the underlying data may use incremental time, as may the implementation processing the values. Each specific case requires specific handling.

1.4.1.1 Working with Date and Time Values that Require a Time Zone (and not a zone offset)

Documents or systems can also choose to accompany a time value with the appropriate time zone identifier or TZID using a complex type. This is very important with recurring times, such as calendar meeting times. If a regular meeting is at "08:00 Pacific Time", it is insufficient to store and interchange just a zone offset.

There are different ways to compare two <datetime, TZID> pairs. If both the date and time are fixed (2004-09-30T01:30), then this can be done by computing the offsets on that date and at those times, using the TZ database. This order then reflects whether one datetime is (absolutely) before another.
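
A comparison of fully specified pairs might look like the following sketch. It assumes Python 3.9 or later with TZ database data available; the helper to_instant, the sample date 2004-09-30, and the two zones are choices made for the example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def to_instant(local: str, tzid: str) -> datetime:
    """Resolve a <datetime, TZID> pair using the TZ rules for that date and time."""
    return datetime.fromisoformat(local).replace(tzinfo=ZoneInfo(tzid))

a = to_instant("2004-09-30T01:30", "America/Los_Angeles")
b = to_instant("2004-09-30T09:00", "Europe/Paris")

# Offsets in effect on that date: -07:00 and +02:00 respectively.
print(a.utcoffset(), b.utcoffset())

# 09:00 in Paris is 07:00 UTC, which precedes 01:30 in Los Angeles (08:30 UTC).
print(b < a)  # True
```

Wall times that fall inside a daylight saving transition can be ambiguous or nonexistent, and an application needs its own policy for those cases.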

If the dates are not fixed (such as <T01:30, TZID> — notice that the date value is omitted) then in some sense, neither is 'before' the other, since each refers to a repeating, interleaved set of points in time. The simplest comparison mechanism where the dates may not be fully specified is simply to put both in canonical form, then order them first by time then by TZID (alphabetical, caseless order). The Olson database does not maintain a fixed canonical form; however, CLDR does provide such a form (see: ).
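
The simple ordering for undated pairs can be sketched as below. The CANONICAL table is a hypothetical one-entry stand-in for a real alias mapping, and the sample pairs are invented for the example.

```python
from datetime import time

# Hypothetical alias table mapping an alias ID to its canonical form.
CANONICAL = {"Asia/Ulan_Bator": "Asia/Ulaanbaatar"}

def sort_key(pair):
    """Order by time first, then by canonical TZID in caseless alphabetical order."""
    t, tzid = pair
    canonical = CANONICAL.get(tzid, tzid)
    return (t, canonical.casefold())

pairs = [
    (time(1, 30), "Europe/Paris"),
    (time(1, 30), "Asia/Ulan_Bator"),
    (time(0, 45), "america/los_angeles"),
    (time(1, 30), "America/Los_Angeles"),
]
ordered = sorted(pairs, key=sort_key)

for t, tzid in ordered:
    print(t.isoformat(timespec="minutes"), tzid)
```

This ordering is stable and total, but it says nothing about which value occurs first in absolute time.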

(It is also possible to have a looser comparison, whereby <time0, TZID0> is compared to <time1, TZID1> over some interval of time: if one consistently has a smaller offset during that period, it is considered to be less than the other value. However, there are cases where this mechanism results in a partial ordering.)

Unfortunately, XML Schema date and time types do not provide for Olson IDs, so most time operations cannot use TZIDs directly. Time zone identification in the date and time types relies entirely on time zone offset from UTC. It is up to the document designer to keep the TZID in a separate data field from the time value.

1.4.2 Comparing Times

Conversion between or operations on data sets that mix values with and without zone offsets present certain problems.

If one wishes to write a comparison between the value of <aDateTime> and <bDateTime>, then the two values must be reconciled to use the same reference point. <aDateTime> uses UTC and can easily be converted to computer time or shifted to another zone offset. <bDateTime> contains no indication of the zone offset. It may be UTC or any other value (currently up to 14 hours different in either direction from UTC).

It is good practice to use an explicit zone offset wherever possible. If one is not available, best practice is to use UTC as the implicit zone offset for conversions of this nature. This is because the values are exactly centered in the range of possibilities and because representation internally (as computer time) is usually based on UTC. Since a single reference point has been used it may be possible to unwind the change later even if erroneous conversion takes place. When working with multiple documents from various sources, the "implicit" offset of the document may vary widely from that of the implementation doing the processing. If UTC is widely used, the chances of error are reduced.
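
One way to apply this practice is to normalize every value to UTC while recording whether the offset was supplied or assumed, so that an assumed offset can be revisited later. The following Python sketch is illustrative only; the Normalized type, the normalize helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Normalized:
    instant: datetime          # always expressed in UTC
    offset_was_implicit: bool  # True when the source value had no zone offset

def normalize(lexical: str) -> Normalized:
    """Convert a dateTime to UTC, using UTC as the implicit zone offset."""
    value = datetime.fromisoformat(lexical)
    implicit = value.tzinfo is None
    if implicit:
        value = value.replace(tzinfo=timezone.utc)
    return Normalized(value.astimezone(timezone.utc), implicit)

a = normalize("2005-10-01T12:00:00+00:00")  # explicit UTC
b = normalize("2005-10-01T07:30:00")        # offset omitted, UTC assumed
c = normalize("2005-10-01T07:30:00-05:00")  # explicit offset

print(b.instant < a.instant)  # True
print(c.instant > a.instant)  # True
print(b.offset_was_implicit)  # True
```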

Content and query authors are warned that comparing or processing dateTimes with and without zone offsets may produce odd results, and such processing should be avoided whenever possible. Generating content that omits zone offset information (where it exists) is a recipe for errors later. Of course, data that is meant to represent wall time, such as the SQL types cited earlier, should continue to omit the zone offset. Query writers can check for the presence (or absence) of a zone offset and should do so in order to modify dates and times explicitly (instead of allowing implicit conversion) whenever possible.

A References (Non-Normative)

[XMLSchema]
XML Schema Part 2: Datatypes Second Edition, W3C Recommendation, Paul V. Biron, Ashok Malhotra, 28 October 2004 (See http://www.w3.org/TR/xmlschema-2/.)
[XQuery]
XQuery 1.0 and XPath 2.0 Formal Semantics, W3C Working Draft, Denise Draper, et al, 15 September 2005 (See http://www.w3.org/TR/query-semantics/.)
[XPathFO]
XQuery 1.0 and XPath 2.0 Functions and Operators, W3C Working Draft, Ashok Malhotra, Jim Melton, Norman Walsh, 15 September 2005 (See http://www.w3.org/TR/xpath-functions/.)
[Olson]
Sources for Time Zone and Daylight Saving Time Data, Public domain time zone database (See http://www.twinsun.com/tz/tz-link.htm.)
[ISO8601]
ISO 8601:2004, Data elements and interchange formats – Information interchange – Representation of dates and times (See http://www.iso.org/iso/en/prods-services/popstds/datesandtime.html.)
[RFC3339]
Date and Time on the Internet: Timestamps, IETF Standard, G. Klyne, C. Newman, July 2002 (See http://www.ietf.org/rfc/rfc3339.txt.)
[UAX35]
Locale Data Markup Language (LDML), Unicode Technical Standard, Mark Davis, 2 June 2005 (See http://www.unicode.org/reports/tr35/.)
[CLDR]
Common Locale Data Repository, Unicode Consortium (See http://unicode.org/cldr/.)
[NOTE-datetime]
Date and Time Formats, W3C Note, Misha Wolf, Charles Wicksteed, 15 September 1997 (See http://www.w3.org/TR/NOTE-datetime.)