Create UTC for Dates #36

Open: wants to merge 2 commits into base: master

stephancb

A proposal to make Dates accept UTC leap seconds

@stephancb
Author

Introduction

Technical systems usually generate absolute time stamps in UTC. Julia's
Base DateTime cannot represent times that fall within a leap second, which
is inserted into UTC from time to time. Several other widely used
representations of time have the same problem: astronomical Julian
days, the time types of the C standard library, Matlab day numbers,
... In practice it is a fairly minor problem. Leap seconds occur on
average only about every 18 months. Most computers and mobile
devices are automatically configured to use NTP and can
circumvent leap seconds by halting the system clock for their
duration. This has greatly reduced the risk that leap second time stamps
get generated.

Nevertheless the leap seconds seem to have been so disturbing for the
Julia community, that the documentation for DateTime is now claiming
not to handle UTC at all, but implementing UT, more precisely
UT1. This is not a good idea, and I have submitted a PR to retract
this claim. The conversion between UTC and UT1 is non-trivial: it
involves large tables that are updated frequently, no package
currently supports it, and there are many more arguments ...

Instead, the internal representation of time in Base DateTime should be
changed to allow for leap seconds, which is what this proposal is about.
Halting the system clock, even for just 1 second, is not always an option,
for example on many satellites. Naturally, solutions for representing time
including leap seconds can be found in the area of space flight. This
proposal is inspired by the CCSDS Day Segmented (CDS) time code described
in a CCSDS Blue Book, https://public.ccsds.org/Pubs/301x0b4e1.pdf, which
the reader may want to consult.

For calendar time arithmetic Julia DateTime assumes that each day has
86400 s. With leap seconds this is not always the case, which at first
glance seems disturbing. I don't think it is. Calendar arithmetic, for what
it is used for, does not need to agree precisely with physically elapsed
seconds, and should continue to be done as it is now.

Day segmented time type

We are used to segmenting time for ourselves into years, months, days,
... Computers are typically instructed to use a single, unsegmented
code for time, probably because we think that it is most efficient in
terms of computation and memory consumption. However, the UTC standard
says that a day has 86399, 86400, or 86401 seconds, so
a day segmented time representation is natural.

Instead of the millisecond counter, an updated Julia Base DateTime would use

immutable UTCinstant
    day2000::Int32   # nr of days since 2000-01-01
    hus::Int32       # nr of 100 microsec on day
end

To mark that this is a new time code, I suggest using 2000-01-01 as
the epoch. Day numbers before 2000-01-01 are then negative, and a signed
32 bit day counter reaches back in time about 6 million years (and the
same into the future).

With 32 bits for a (sub)seconds of day counter the smallest possible
decimal increment is 100 microseconds (us): the largest count on a day
with a leap second, 864009999, is below 2^31-1. Using an unsigned
integer would still not allow a 10x smaller increment. So with 32 bit
signed integers for the day counter and for the 100 us counter, the
UTCinstant uses in total the same amount of memory as the present
millisecond counter and allows for a somewhat higher precision. A
computer's system clock stays synced to UTC within about 100 us if there
is a high precision NTP server on the same LAN (often the case at larger
universities).
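
The size arguments above can be checked with a few lines of arithmetic. This is only a sketch in current Julia syntax (`struct` instead of the `immutable` keyword used in the proposal); the constant and variable names are illustrative assumptions and not part of the proposal.

```julia
# Sketch of the proposed type in current Julia syntax.
struct UTCinstant
    day2000::Int32   # days since 2000-01-01 (negative before the epoch)
    hus::Int32       # 100-microsecond ticks since the start of that day
end

# Illustrative names, not from the proposal.
const TICKS_PER_SECOND = 10_000                   # 100 us resolution
const MAX_TICK = 86_401 * TICKS_PER_SECOND - 1    # last tick of a day with a leap second

# The largest tick count (864_009_999) fits in a signed 32 bit integer ...
fits_int32 = MAX_TICK <= typemax(Int32)

# ... but 10 us ticks would overflow even an unsigned 32 bit integer.
fits_uint32_10us = 86_401 * 100_000 - 1 <= typemax(UInt32)

# Approximate reach of the day counter in years, in each direction.
reach_years = Int64(typemax(Int32)) / 365.2425

# Both fields together occupy 8 bytes, the same as an Int64 millisecond counter.
same_size = sizeof(UTCinstant) == sizeof(Int64)
```

Here `fits_int32` and `same_size` come out true, `fits_uint32_10us` comes out false, and `reach_years` is a little under 5.9 million.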

Date module interface

Constructors

Constructors will of course have to be made to accept leap second time
stamps. So far leap seconds have occurred only on June 30 and December 31,
which perhaps should be enforced in the constructors. But a lookup in a
leap second table is not needed; a user might actually want to simulate
(future) leap seconds that would not be in the table.
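
A constructor along these lines might look like the following sketch. The helper name `utcinstant`, the keyword `enforce_dates`, and the constant `EPOCH2000` are assumptions made for illustration; the proposal does not specify an interface. The sketch accepts only whole seconds and uses no leap second table, so simulated future leap seconds are accepted as well.

```julia
using Dates

struct UTCinstant
    day2000::Int32   # days since 2000-01-01
    hus::Int32       # 100-microsecond ticks on that day
end

const EPOCH2000 = Date(2000, 1, 1)

# Hypothetical constructor helper: second == 60 is accepted only at 23:59 and,
# if `enforce_dates` is true, only on June 30 or December 31. No table lookup.
function utcinstant(y, m, d, h=0, mi=0, s=0; enforce_dates::Bool=true)
    date = Date(y, m, d)                        # Dates validates year/month/day
    0 <= h <= 23  || throw(ArgumentError("hour out of range"))
    0 <= mi <= 59 || throw(ArgumentError("minute out of range"))
    if s == 60
        (h == 23 && mi == 59) ||
            throw(ArgumentError("second 60 is only valid at 23:59"))
        if enforce_dates && !((m, d) == (6, 30) || (m, d) == (12, 31))
            throw(ArgumentError("leap second only accepted on June 30 or December 31"))
        end
    elseif !(0 <= s <= 59)
        throw(ArgumentError("second out of range"))
    end
    days = Dates.value(date - EPOCH2000)        # whole days since the epoch
    ticks = ((h * 60 + mi) * 60 + s) * 10_000   # 100 us ticks on that day
    return UTCinstant(days, ticks)
end

leap = utcinstant(2016, 12, 31, 23, 59, 60)     # a real leap second
simulated = utcinstant(2030, 6, 30, 23, 59, 60) # a simulated future one
```

Passing `enforce_dates=false` would drop the June 30 / December 31 restriction for callers who want to simulate leap seconds on other days.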

Types and functions

Otherwise the existing functions and operators in Dates should continue to
behave as they do now, i.e. assume that every day has 86400 s. Naturally,
returned time stamps could then show up to 60 in the seconds of minute
field/string.

For interaction with software that cannot accept leap seconds, an
iterator restamp(collectionofdatetimes, ...) would return only time
stamps outside leap seconds, similar to how OS calls for system time
behave on many computers. It has to be an iterator because several time
stamps in the collection may fall in the same leap second. Their order
should then be preserved (they cannot all get restamped to the same
value). At least on December 31, restamping should stay within the same
year/day, in case the times stamp, for example, financial
transactions.

Leap second table

A table of leap seconds is not needed for this proposal.

Coding

The code for the Dates module would need to be adjusted to use
UTCinstant, probably in many places. This may be tedious work, but no
serious complications are expected.

Summary

We propose to change the internal representation of time in Julia Base
to day segmented. This makes it possible to represent times in leap
seconds. Leap second time stamps, potentially originating from automatic
systems, could get accepted and returned back to a caller. It would
also make it easier to build higher precision types on top of
UTCinstant, using additional segmentation.

Stephan B., Swedish Institute of Space Physics

@oxinabox

For reference other libraries without leap-second support.

Libraries with leap second support

In some cases these support only inserting, not deleting.
But no deletions have happened yet.
Also these are often capped at inserting at most one.

Also the situation is complicated in some cases for things like C where POSIX std says you must ignore the leap second, even though the language supports it.

Time is really complex.
This comment is not an opinion one way or the other, just links to other languages/libraries for reference

@quinnj
Member

quinnj commented May 22, 2017

I haven't quite followed all the discussion with this, so I won't comment on the specifics here, but more on the process. This is certainly a situation where all this functionality could be developed in a stand-alone package outside of Base and be made to work robustly + plenty of test coverage before needing to be considered to replace the Base implementation. The beauty of what is currently in Base is the simplicity, robustness, and test coverage, being some of the most well-tested code in Base.

Anyway, carry on, just wanted to comment on the process more than proposal.

@nalimilan
Member

That's an interesting proposal, thanks for writing it in detail. However, could you discuss the existing implementations? In particular, does any of them behave like you suggest? Why did they make these design choices (among others: old vs. recent, general-purpose vs. scientific...)? Do we have evidence that this works well in practice, is it annoying in some particular cases?

Also I'm not sure I understand the consequences of this change in terms of time arithmetic. Currently the equality Day(1) == Second(86400) allows simplifying lots of operations. For example, we have:

julia> DateTime(2017, 01, 02) - DateTime(2017, 01, 01)
86400000 milliseconds

julia> step(DateTime(2017, 01, 02):DateTime(2017, 01, 01))
1 day

Do you suggest we get rid of this equality? (Note this is already what happens with months and days, so that's not necessarily a showstopper, but...)

@stephancb
Author

Not supporting leap seconds means not accepting any time stamps in leap seconds (like present Julia Dates).

Support can mean:

  1. accepting them, returning them back as is, but ignoring them for calendar time arithmetic;

  2. accepting them, returning them back as is, and defining and performing calendar time arithmetic, such that it always agrees with physically elapsed seconds (not always clear how to do this)

This proposal would do 1), similar to POSIX.

The real issues with leap seconds have been that data got lost, ended up in the wrong time order, etc., when software did not accept leap second time stamps, threw errors, and so on. But I have never heard that calendar time arithmetic ignoring leap seconds caused any real problems; on the contrary, it is rather what people expect. Therefore I think that 1) is the most sensible approach.

@stephancb
Author

To write a separate package for this, I would need to build on functionality in Dates, otherwise it is a lot of work. However, Dates does not accept leap second time stamps. So catch22....

@stephancb
Author

stephancb commented May 22, 2017

julia> DateTime(2017, 01, 02) - DateTime(2017, 01, 01)
86400000 milliseconds

should perhaps become

julia> DateTime(2017, 01, 02) - DateTime(2017, 01, 01)
1 day, 0 usec 

to match the internal representation (now: milliseconds, new: day, 100 us)

But leap seconds should consistently get ignored in arithmetic:

julia> DateTime(2017, 01, 01) - DateTime(2016, 12, 31)
1 day, 0 usec 

though the physically elapsed time is 86401 sec.

A point of the proposal is that a log file entry like

"2016-12-31 23:59:60: 1,000,314.00 $ from account 123456789 to account 987654321"

should not cause hiccups when it is processed using Julia Dates.
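
As a rough illustration of what accepting such an entry could mean, the sketch below splits the leading time stamp into a Date and a second-of-day count, allowing 23:59:60. The helper `parse_utc_stamp` is hypothetical and not part of Dates; it only shows that the leap second can be carried along without a leap second table.

```julia
using Dates

# Hypothetical helper: split a leading "yyyy-mm-dd HH:MM:SS" stamp into a Date
# and an integer second-of-day, allowing 23:59:60.
function parse_utc_stamp(line::AbstractString)
    m = match(r"^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})", line)
    m === nothing && throw(ArgumentError("no time stamp found"))
    y, mo, d, h, mi, s = parse.(Int, m.captures)
    ok = h <= 23 && mi <= 59 && (s <= 59 || (s == 60 && h == 23 && mi == 59))
    ok || throw(ArgumentError("time of day out of range"))
    return Date(y, mo, d), (h * 60 + mi) * 60 + s
end

entry = "2016-12-31 23:59:60: 1,000,314.00 \$ from account 123456789 to account 987654321"
day, second_of_day = parse_utc_stamp(entry)
```

For the log entry above this yields the date 2016-12-31 and a second-of-day of 86400, i.e. the first second beyond a regular day.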

@omus
Member

omus commented May 22, 2017

Currently the equality Day(1) == Second(86400) allows simplifying lots of operations.

In the package TimeZones.jl this is not true as a day can be equal to Hour(23), Hour(24), or Hour(25). I've been trying to revise some of the Base code where Day(1) == Hour(24) is assumed to always be true. See the documentation for examples of how calendrical arithmetic works with TimeZones.

To write a separate package for this, I would need to build on functionality in Dates, otherwise it is a lot of work. However, Dates does not accept leap second time stamps. So catch22....

This is the same problem that TimeZones faced. Currently the DateTime implementation doesn't have the functionality to handle TimeZones. I ended up solving this problem by introducing a new type: ZonedDateTime. I feel like the best approach for supporting leap seconds would be to add this functionality into a new package.

Additionally, for accurate leap second support we would need to utilize the leap seconds data (ftp://ftp.iana.org/tz/tzdb-2017b/leapseconds) that IANA provides as part of tzdata.

@stephancb
Author

stephancb commented May 22, 2017

For this proposal a leap seconds table is not needed, because calendar arithmetic will ignore them.

The user supplies leap second time stamps, and gets them back, that's all.

The user can of course do calendar arithmetic with his/her leap second time stamps, but the result will be the same as if the time stamp were just before the leap second. People who are interested in things on the second to subsecond level will not use the calendar arithmetic functions/operators, but directly access the UTCinstant fields.
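
One way to read "the same as if the time stamp were just before the leap second" is to clamp the tick counter to the last regular tick of the day before subtracting. The following is a minimal sketch under that assumption; `calendar_ticks` and `calendar_diff` are illustrative names, not existing Dates functions.

```julia
struct UTCinstant
    day2000::Int32   # days since 2000-01-01
    hus::Int32       # 100-microsecond ticks on that day
end

const TICKS_PER_DAY = 864_000_000   # nominal day: 86400 s in 100 us ticks

# Ticks as seen by calendar arithmetic: a stamp inside the leap second counts
# as the last regular tick of its day.
calendar_ticks(t::UTCinstant) = min(Int64(t.hus), TICKS_PER_DAY - 1)

# Calendar difference a - b as (days, ticks), assuming 86400 s in every day.
function calendar_diff(a::UTCinstant, b::UTCinstant)
    total = (Int64(a.day2000) - Int64(b.day2000)) * TICKS_PER_DAY +
            calendar_ticks(a) - calendar_ticks(b)
    return divrem(total, TICKS_PER_DAY)
end

dec31 = UTCinstant(6209, 0)                       # 2016-12-31 00:00:00
jan01 = UTCinstant(6210, 0)                       # 2017-01-01 00:00:00
in_leap = UTCinstant(6209, 864_000_000 + 5_000)   # 2016-12-31 23:59:60.5
before_leap = UTCinstant(6209, 863_999_999)       # last regular tick of the day

one_day = calendar_diff(jan01, dec31)             # leap second is ignored
no_gap = calendar_diff(in_leap, before_leap)      # leap stamp acts like the tick before
```

Anyone who needs sub-second detail inside the leap second would read `hus` directly instead of going through `calendar_diff`.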

@StefanKarpinski
Member

StefanKarpinski commented May 23, 2017

I think the rhetoric in this Julep is a bit over the top:

Nevertheless the leap seconds seem to have been so disturbing for the Julia community, that the documentation for DateTime is now claiming not to handle UTC at all, but implementing UT, more precisely UT1.

The community is not "disturbed" by leap seconds, they're just a pain to deal with and if UT-based time is consistently used, they're not an issue. This was a design decision that was based on a lot of consideration and discussion with astronomers (among others), not some kind of panic response. Whether it was the right design decision or not is debatable but this verbiage is unnecessary and the attitude is not terribly constructive.

There are two core issues with the current scheme:

  1. Hardware systems on which Julia runs will usually be synced to UTC via NTP. When someone calls now(), even if they don't ever get a leap second timestamp (most OSes don't return them), although the value can be interpreted as UT1, that interpretation means the time is off by up to ±0.9 seconds in addition to error due to slippage of NTP syncing. Interpreting this timestamp as UTC, even without leap seconds, would be much closer to the true time – within a few milliseconds, typically – which is clearly much better. In practice, this is probably not a big deal since what we're doing is effectively the same as what many of the above software systems that don't handle leap seconds are doing, but the key difference is that we are calling it UT1, which means we're defining ourselves into telling time much worse than we would be if we just said that our timestamps are UTC without leap seconds. In the absence of the now() function, this wouldn't be a problem, but then again, without the now() function, there would be no reason to have dates and times in Base.

  2. Parsing textual timestamps from external sources. Currently, I believe that we just choke when parsing leap second timestamps because there's no way to represent them as DateTime values. That's a problem since a lot of timestamps are in UTC and many of them come from sources that do produce leap seconds. This could be handled by an external library that implements UTC, and arguably date/time parsing could be moved out of base since it's pretty complex and featurey. The biggest issue with this in my view is the likelihood of this potential exception going undiscovered in normal operation and then suddenly tripping someone up only after a system has been deployed for some time when a leap second actually occurs. That's a bad user experience.

This proposal may indeed be a good way to go. A mundane issue: this file should be named similarly to other Juleps in this repo and have an .md extension so that it renders correctly on GitHub.

@c42f
Member

c42f commented May 27, 2017

It's good that this proposal can represent leap seconds and dodges the need to compute with them. Leap second tables are a real pain.

Functionality to get the real number of SI seconds between two Base date times can then be kept up to date in a package, and it's far easier to update a leap second table in that package than to somehow manage the UT1->TAI mapping.
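
A package-level function of that kind could be sketched as follows. The name `si_seconds` and the table `LEAP_DAYS` are assumptions for illustration, and the table is deliberately incomplete: it lists only the three leap seconds inserted between 2012 and 2016, whereas a real package would ship and update the full list.

```julia
using Dates

# UTC dates at whose end a positive leap second was inserted.
# Deliberately incomplete: only the insertions from 2012 through 2016.
const LEAP_DAYS = [Date(2012, 6, 30), Date(2015, 6, 30), Date(2016, 12, 31)]

# Hypothetical helper: SI seconds elapsed from `a` to `b` (with a <= b), where
# both are ordinary DateTime values that cannot lie inside a leap second.
function si_seconds(a::DateTime, b::DateTime)
    a <= b || throw(ArgumentError("expected a <= b"))
    calendar = Dates.value(b - a) / 1000    # milliseconds to seconds, 86400 s per day
    # A leap second at the end of day d lies just before midnight of d + 1,
    # so it is inside the interval when a is before and b at or after that midnight.
    leaps = count(d -> a < DateTime(d + Day(1)) <= b, LEAP_DAYS)
    return calendar + leaps
end

across_leap = si_seconds(DateTime(2016, 12, 31), DateTime(2017, 1, 1))
ordinary = si_seconds(DateTime(2017, 1, 1), DateTime(2017, 1, 2))
```

With this table the day ending in the 2016 leap second comes out one second longer than the day after it, while Base arithmetic on the same values is untouched.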

@oxinabox - I don't think boost datetime has much in the way of leap second functionality. Leap seconds are mentioned in the documentation, but date time arithmetic ignores them (perhaps the reasoning was exactly the same as given in this julep, but it's not explained in the boost docs).

8 participants