<< Back to previous view

[CLJ-928] instant literal for Date and Timestamp should print in UTC Created: 07/Feb/12  Updated: 01/Mar/13  Resolved: 24/Feb/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Steve Miner Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

1.4-beta1, Mac OS X 10.7.3, java version "1.6.0_30"

Attachments: Text File CLJ-928-instant-literals-for-Date-and-Timestamp-fogus.patch     Text File CLJ-928-instant-literals-for-Date-and-Timestamp.patch    
Patch: Code and Test
Approval: Ok


The default #inst returns a java.util.Date. Date is always in UTC, and doesn't know about time zones, but the implementation of the print method always renders it in the default time zone. For example,

user=> #inst "2012Z"
#inst "2011-12-31T19:00:00.000-05:00"

RFC3339 says:

4.3. Unknown Local Offset Convention

If the time in UTC is known, but the offset to local time is unknown,
this can be represented with an offset of "-00:00". This differs
semantically from an offset of "Z" or "+00:00", which imply that UTC
is the preferred reference point for the specified time.

java.sql.Timestamp should also print in UTC since that class doesn't keep timezone information. The print-method for Timestamp seems broken.

user=> (binding [*data-readers* {'inst #'clojure.instant/read-instant-timestamp}] (read-string "#inst \"2012Z\""))
#inst "2011-12-31T19:000000000000000-05:00"

user=> (binding [data-readers {'inst #'clojure.instant/read-instant-timestamp}]
(read-string "#inst \"2012-01-01T01:23:45.678+00:00\""))
#inst "2011-12-31T20:267800000000000-05:00"

user=> (java.sql.Timestamp. 0)
#inst "1969-12-31T19:000000000000000-05:00"

The implementations of the print-methods for Data, GregorianCalendar and Timestamp do too much string manipulation. I suggest doing some refactoring. (Patch coming soon.)

Also, the documentation needs some updating. clojure.instant/read-instant-date, etc. mention instant-reader but I think that mechanism was generalized as data-readers.

Comment by Steve Miner [ 07/Feb/12 1:28 PM ]

#inst literals for Date and Timestamp should print in UTC. Fixed round-tripping problem with Timestamp nanos (and re-enabled test). Refactored print-methods to avoid some string manipulations, particularly regarding Calendar timezone offsets. Added tests including those from CLJ-926, which this patch also fixes. Fixed doc on read-instant-date, etc.

Comment by Fogus [ 07/Feb/12 3:10 PM ]

Thank you. I will run this patch through the paces ASAP.

Comment by Cosmin Stejerean [ 07/Feb/12 7:09 PM ]

"Date is always in UTC, and doesn't know about time zones"

Given that pretty much all of the methods on Date return results in the local timezone, I wouldn't quite say that Date is always in UTC. Granted, most of those methods are deprecated; toString() however isn't and it also displays the date in the local timezone.

We can certainly force printing of dates in UTC, by overriding the timezone setting on SimpleDateFormat, and there might be other valid reasons for always forcing UTC representation. Dates not being timezone aware however doesn't seem like one to me.

Comment by Steve Miner [ 08/Feb/12 8:54 AM ]

It's only a slight exaggeration to say that a java.util.Date is intrinsically a long (milliseconds since "the epoch"). For Clojure's purposes, an "instant" is a point in time in a universal sense (not situated in a timezone) so UTC makes sense as the canonical form. RFC 3339 section 4.3 says to use "-00:00" for the offset when timezone information is unknown. java.sql.Timestamp is like a Date with an extra field for nanos so it also should use UTC for the canonical form.

My argument for UTC as the canonical form is that it best matches the desired semantics of a Clojure instant. I think it's the most conservative way to go and offers the best approach to interoperability by following RFC 3339.

java.util.Calendar is by design cognizant of timezones so it arguably makes sense to preserve the timezone information. Note that comparisons of Calendars take into account the timezone so two Calendar objects might be the same instant in a sense but not be equal because of the timezones. That might be a bit tricky for Clojure users who want a clean abstraction of "instant" with multiple implementations. Presumably they know what they are doing if they decide to use read-instant-calendar. It might be a good idea to be more explicit about this in the documentation.

Comment by Fogus [ 20/Feb/12 3:22 PM ]

Modified priting to conform to the following:

  • Instant instances are stored in UTC format without offset information.
  • Instant literals with offset information will be parsed and stored with offset applied.
  • Instant literals without offset information will be parsed as UTC format without offset information.
  • Instants will print as time having local offset applied and without offset printed.
  • Instants without offset info will be print-dup'd as UTC format with RFC-3339 default offset information (-00:00)

This provides a canonical storage and read format of UTC and a print format relative to the local offset.

Comment by Stuart Sierra [ 24/Feb/12 2:11 PM ]

With Fogus' patch on Feb. 20, instant literals do not round-trip unless *print-dup* is bound true:

user=>  (let [i (read-string "#inst \"2012-02-24T14:41-02:00\"")
              j (read-string (pr-str i))]
          (prn i)
          (prn j)
          (prn (= i j)))
#inst "2012-02-24T11:41:00.000"
#inst "2012-02-24T06:41:00.000"
user=>(binding [*print-dup* true]
        (let [i (read-string "#inst \"2012-02-24T14:41-02:00\"")
              j (read-string (pr-str i))]
          (prn i)
          (prn j)
          (prn (= i j))))
#inst "2012-02-24T16:41:00.000-00:00"
#inst "2012-02-24T16:41:00.000-00:00"

The docstring of *print-dup* says nothing about print/read values being equal (by Clojure's definition of =) only that print/read values will be of the same type. "Type" is not a particularly meaningful concept when applied to data literals.

We (Stuart & Luke) believe that print/read values should always print in such a way that they can be read back as the same instant. Printing an instant should either:

  • print in local time, with local offset
  • always print in UTC, with zero offset

Instants should never print without any time zone offset, as this admits too much confusion.

Comment by Stuart Sierra [ 24/Feb/12 2:46 PM ]

Just spoke with Rich Hickey, who made the following assertions:

  • Instants should always represent an exact point in time, unambiguously. This representation may include time zone information.
  • Our specification for instant literals allows any part on the right to be omitted. Omitting the time zone offset means UTC is assumed. It is not important whether or not the default printer includes the UTC "00:00" offset or omits it.
  • If we are printing an instant literal based on a type that does not contain time zone information (e.g., java.util.Date) then we should not add time zone information but simply print in UTC.
  • If we are printing an instant literal based on a type that can contain time zone information (e.g., java.util.Calendar) then we may print with the time zone offset included. This is a nice-to-have feature, not a requirement. It is always permissible to print instants in UTC.
  • It is possible that round-trip print/read of an instant may lose time zone information, depending on the types used, but it this should not change the meaning of the instant in UTC.
Comment by Steve Miner [ 24/Feb/12 3:13 PM ]

I believe my original patch satisfies the requirements stated above. I'm happy to work on this if you want something changed.

Comment by Stuart Sierra [ 24/Feb/12 3:22 PM ]

Rejected Fogus' patch of Feb. 20.

Screened Steve Miner's patch of Feb. 7. Ready for Rich.

Comment by Stuart Sierra [ 24/Feb/12 3:29 PM ]

Patch applied.

Generated at Wed Sep 20 13:12:23 CDT 2017 using JIRA 4.4#649-r158309.