|
|
Subscribe / Log in / New account

2038 is closer than it seems

By Jonathan Corbet
May 21, 2014
Most LWN readers are likely aware of the doom impending upon us in January 2038, when the time_t type used to store time values (in the form of seconds since January 1, 1970) runs out of bits on 32-bit systems. It may be surprising that developers are increasingly worried about this deadline, which is still nearly 24 years in the future, but there are good reasons to be concerned now. Whether those worries will lead to a solution in the near term remains to be seen; not much has happened since this topic came up last August. But recent discussions have at least shed a bit of light on the forms such a solution might take.

At times, developers have hoped that this problem might solve itself. On 64-bit systems, the time_t type has always been defined as a 64-bit quantity and will not run out of space anytime soon. Given that 64-bit systems appear to be taking over the world — even phone handsets seem likely to make the switch in the next few years — might the best solution be to just wait for 32-bit systems to die out and take the problem with them? A "no action required" solution has an obvious appeal.

There are two problems with that reasoning: (1) 32-bit systems are likely to continue to be made for far longer than most people might expect, and (2) there are 32-bit systems being deployed now that can be expected to have lifetimes of 24 years or longer. 32-bit systems will be useful as cheap microcontrollers for a long time, and, once deployed, they will often be expected to work for many years while being difficult or impossible to update. There are almost certainly systems already deployed that are going to provide unpleasant surprises in 2038.

Kernel-based solutions

So it would appear to make sense to solve the problem soon, rather than in, say, 2036 or so. There is only one snag: the problem is not all that easy to solve. At least, it is not easy if one is concerned about little details like not breaking existing programs. Since Linux developers at most levels are quite concerned about compatibility, the simplest solutions (such as a BSD-style ABI break) are not seen as being workable. In a recent discussion, John Stultz outlined a couple of alternative approaches, neither of which is without its difficulties.

The first approach would be to change the 32-bit ABI to use a 64-bit version of time_t (related data structures like, struct timespec and struct timeval would also change). Old binaries could be supported through a compatibility interface, but newly compiled code would normally use the new ABI. There are some advantages to this approach, starting with the fact that lots of applications could be updated simply by rebuilding them. Since a couple of BSD variants have already taken this path, a number of the worst application problems have already been fixed. Embedded microcontrollers typically run custom distributions built entirely from source; changing the ABI in this way would make it possible to build 2038-capable systems in the near future with a minimum of pain.

On the other hand, the kernel would have to maintain a significant compatibility layer for a long time. Developers are also worried that there will be many applications that store 32-bit time_t values in their own data structures, in on-disk formats, and more. Many of these applications could break in surprising ways, and they could prove to be difficult to fix. There are also some concerns about the runtime cost of using 64-bit time_t values on 32-bit systems. Much of this cost could be mitigated within the kernel by using a different format internally, but applications could slow down as well.

The alternative approach is to simply define a new set of system calls, all of which are defined to use better time formats from the beginning. The new formats could address other irritations at the same time; not everybody likes the separate seconds and nanoseconds fields used in struct timespec, for example. All system calls defined to use the old time_t values would be deprecated, with the idea of removing them, if possible, before 2038.

With this approach, there would be no hard ABI break anytime soon and applications could be migrated gradually. Once again, embedded systems could be built using the new system calls in the relatively near future, while desktop systems could be left alone for another decade or so. And it would be a chance to start over and redesign some longstanding system calls with 21st-century needs in mind.

Defining new system calls has its downsides as well, though. It would push Linux further away from being a POSIX system, and would take us down a path different from the one chosen by the BSD world. There are a lot of system calls to replace, and time_t values show up in other places as well, most notably in a long list of ioctl() calls. Applications would have to be updated, including those running only on 64-bit systems, which would not see much of a benefit from the new system calls. And, undoubtedly, there would be lots of applications using the older system calls that would surface in 2037. So this approach is not an easy solution either.

Including glibc

Discussions of these alternatives went on for a surprisingly long time before Christoph Hellwig made an (in retrospect) obvious suggestion: the C library developers are going to have to be involved in the implementation of any real solution to the year-2038 problem, so perhaps they should be part of the discussion now. For years, communications between the kernel community and the developers of C libraries (including the GNU C library — glibc) have been sporadic at best. The changing of the guard at glibc has made productive conversations easier to have, but changing old habits has proved hard. In any case, it is true that the glibc developers will have to be involved in the design of the solution to this problem; the good news is that such involvement appears likely to happen.

Glibc developers are not known for their love of ABI breaks — or of non-POSIX interfaces for that matter. So, once glibc developer Joseph Myers joined the conversation, the tone shifted a bit toward a solution that would allow a smooth transition while retaining existing POSIX system calls and application compatibility. The plan (which was discussed only in rough form and would need a lot of work yet) looks something like this:

  • Create new, 64-bit versions of the affected system calls. So, for example, there would be a gettimeofday64() that returns the time in a struct timeval64. The existing versions of these system calls would be unchanged.

  • Glibc would gain a new feature test macro with a name like TIME_BITS. If TIME_BITS=64 on a 32-bit system, a call to gettimeofday() will be remapped to gettimeofday64() within the library. So applications can opt into the new world by building with an appropriate value of TIME_BITS defined.

  • Eventually, TIME_BITS=64 would become the default, probably after distributions had been shipping in that mode for a while. Even in the 64-bit configuration, compatibility symbols would remain so that older binaries would still work against newer versions of the C library.

Such an approach could possibly allow for a relatively smooth transition to a system that will work in 2038, though, naturally, a number of troublesome details remain. There was talk of remapping ioctl() calls in a similar way, but that looks like a recipe for trouble given just how many of those calls there are and how hard it would be to even find them all. Developers in other C library projects, who often don't wish to maintain the sort of extensive compatibility infrastructure found in glibc, may wish to take a different approach. And so on.

But, even with its challenges, the existence of a vague plan hashed out with participation from kernel and glibc developers is reason for hope. Maybe, just maybe, some sort of reasonably robust solution to the 2038 problem will be found before it becomes absolutely urgent, and, with luck, before lots of systems that will need to function properly in 2038 are deployed. We have the opportunity to avoid a year-2038 panic at a relatively low cost; if we make use of that opportunity, our future selves will thank us.

Index entries for this article
KernelTimekeeping
KernelYear 2038 problem


to post comments

2038 is closer than it seems

Posted May 22, 2014 3:03 UTC (Thu) by adler187 (guest, #80400) [Link] (3 responses)

Sounds like they copied the solution for Large File Support, ie. _FILE_OFFSET_BITS switches between the old 32-bit filesystem interfaces and the 64-bit filesystem interfaces.

2038 is closer than it seems

Posted May 22, 2014 4:20 UTC (Thu) by mezcalero (subscriber, #45103) [Link] (2 responses)

Uh oh. Claiming that TIME_BITS=64 could allow a smooth transition is really wrong. The type time_t is exposed in numerous library APIs. The TIME_BITS=64 thing suggests one could mix and match components compiled with different settings freely, but that is not the case: if you compile one lib with 64bit time_t then you have to do the same for all programs using that lib. And if you do that the you transitively have to recompile all other libs they are using too, and so on. Ultimately you have to recompile the full system that way... Unless of course every single library would do what glibc does and provide both 32bit and 64bit calls for everything. And that's just not going to happen.

The same story happened for large file support (LFS) where off_t got increased in size. Now, off_t is thankfully not that often exposed in APIs, and because people knew how awful the situation was many just avoided exposing it in APIs, but for time_t the situation is much worse.

Also, one particular gem: think of stat() which already exists in two flavours, with LFS and without. Now, this API would also have to be duplicated for 32bit time_t and 64bit time_t. So you get four flavours of this call: stat(), stat64(), stat_t64() and stat64_t64()! Ouch!

And even thinking of duplicating gettimeofday() when there's also clock_gettime(CLOCK_REALTIME) is just wrong...

Lennart

2038 is closer than it seems

Posted May 22, 2014 7:27 UTC (Thu) by niner (subscriber, #26151) [Link] (1 responses)

Since stat() without LFS support is only for compatability, I assume it would suffice to add a stat64_t64 (with hopefully a better name). But having three variants is bad enough.

2038 is closer than it seems

Posted May 22, 2014 9:48 UTC (Thu) by arnd (subscriber, #8866) [Link]

For the particular example, we have been talking about a replacement for stat that solves a number of other problems as well. See http://lwn.net/Articles/394298/ for instance.

2038 is closer than it seems

Posted May 22, 2014 5:50 UTC (Thu) by eru (subscriber, #2753) [Link] (1 responses)

I hope going to 64-bit time_t is selected, like the BSD:s did. Given the human tendency of putting off chores with no immediate benefit, harshly imposing the fix when code is recompiled is the only way to get this solved before the deadline.

2038 is closer than it seems

Posted May 22, 2014 9:55 UTC (Thu) by arnd (subscriber, #8866) [Link]

My feeling is that at the kernel interface side, we won't do a hard break, and keep 32-bit time_t at least for the architectures that have it today while introducing new syscalls to take to the libc.

What user space does is a different matter though. I think you should always have at least the option to build a libc that only supports 64-bit time_t in user space and that uses the new kernel interfaces for a safe implementation. This way, an enterprise distro with e.g. 10 years of guaranteed support and lots of legacy third-party applications can keep working as previously, while an embedded system with 25 years support and no legacy code can go to 64-bit time_t in user space without any backwards compat hacks in user space.

2038 is closer than it seems

Posted May 23, 2014 9:32 UTC (Fri) by danielos (guest, #6053) [Link] (1 responses)

this is just a question, do not blame me.
Does it affect other system, such as TTL and other network staff? (packet format, and such)

2038 is closer than it seems

Posted Dec 11, 2014 15:18 UTC (Thu) by butlerm (subscriber, #13312) [Link]

The TTL is not actually a time, but rather a maximum number of network hops.

2038 is closer than it seems

Posted May 23, 2014 16:24 UTC (Fri) by ScottMinster (subscriber, #67541) [Link] (5 responses)

Would it be possible to change the epoch date? For example, setting an environment variable like $EPOCH_YEAR to 2010 would make the 32 bit time_t relative to midnight UTC, January 1, 2010. Methods like localtime() would take this offset into account. Somehow, the kernel would have to be informed that the process is running with an epoch offset, but presumably that could be done during the process startup in glibc.

Obviously this wouldn't solve situations where the time_t value is recorded outside the process (in a file, sent over the network, etc), so this wouldn't work for some subset of applications. But most applications probably do not do that, so this workaround seems like it could be effective in cases where recompiling is not possible or desirable.

Changing the epoch

Posted May 23, 2014 19:34 UTC (Fri) by corbet (editor, #1) [Link] (3 responses)

Running in a different epoch would work in situations where dates in the past do not need to be represented. But think about things like file time stamps that do indeed need to be in the past.

Changing the epoch

Posted May 23, 2014 21:05 UTC (Fri) by ScottMinster (subscriber, #67541) [Link] (2 responses)

time_t is signed, so it can represent 68 years in either direction of the epoch (2^31/86400/365.25). Setting the epoch to 2010 would allow representation from 1942 to 2078.

It's not a good long term solution, but could be a workaround for some applications.

Changing the epoch

Posted May 24, 2014 5:30 UTC (Sat) by jzbiciak (guest, #5246) [Link] (1 responses)

Are negative time_t values really valid times, though? There are APIs (such as time()) that return -1 to indicate an error. The lazy programmer in me imagines that among programs that actually check for an error (which, admittedly, are probably rare), the majority would just check for less-than-zero.

Changing the epoch

Posted May 24, 2014 10:40 UTC (Sat) by ScottMinster (subscriber, #67541) [Link]

True, that would probably preclude setting your epoch date in the future (say to 2050 so you could span between 1982 and 2118). At the very least, there would be a time (December 31, 23:59:59) when time() would have to legitimately return -1, which is the error code.

But if the epoch date is in the recent past, that wouldn't cause any trouble for the time() function. I don't think any code would care if, for example, the time_t values in a stat structure are null. As long as they are relatively consistent and localtime() returns the correct human translation.

2038 is closer than it seems

Posted Jun 2, 2014 16:00 UTC (Mon) by jch (guest, #51929) [Link]

> setting an environment variable like $EPOCH_YEAR to 2010 would make the 32 bit time_t relative to midnight UTC, January 1, 2010.

Ouch. That would break all of the (perfectly correct) code that does

    fprintf(datafile, "%ld\n", (long)timestamp);

and expects the data to be valid in the next session.

2038 is closer than it seems

Posted May 23, 2014 17:57 UTC (Fri) by lonely_bear (subscriber, #2726) [Link]

It is time to break the ABI (use 64-bit for time_t). It is the cleanest solution in my view. All other solutions introduce unnecessary complexity.

2038 is closer than it seems

Posted May 23, 2014 19:09 UTC (Fri) by kleptog (subscriber, #1183) [Link] (3 responses)

Ok, so now it's 2038, what does the 32-bit interface return after that? Or does it return 0x7FFFFFFF forever?

I suppose what you could do is change the definition to reduce the resolution. So after 2030 it starts counting at half speed. In 2034 at quarter speed, 2036 one-eighth speed, etc. If you fix the APIs for strftime, ctime, etc everything will be transparent for most programs except the resolution gets worse.

This of course screws people who do calculations on time_t though. You'll only see even seconds after that time. It might save you on file formats though. A sort floating point time_t...

2038 is closer than it seems

Posted May 23, 2014 21:37 UTC (Fri) by Jonno (subscriber, #49613) [Link] (1 responses)

> Ok, so now it's 2038, what does the 32-bit interface return after that? Or does it return 0x7FFFFFFF forever?

Presumably it will be treated as an unsigned integer, covering 1970-2106 instead of 1901-2038.

2038 is closer than it seems

Posted Dec 5, 2016 14:26 UTC (Mon) by mirabilos (subscriber, #84359) [Link]

Changing the type of time_t from 31 bit (plus sign) to 32 bit is an API (not just ABI) break just as (or even worse, for the API) changing it to 63 bit (plus sign) is.

(Yes, I know about complement representation, but that was not the point here.)

2038 is closer than it seems

Posted Dec 5, 2016 17:38 UTC (Mon) by zlynx (guest, #2285) [Link]

Start returning -EINVAL on all 32-bit time calls with the time_t fields set to 0. It's really the only sane thing to do.

glibc

Posted May 25, 2014 20:43 UTC (Sun) by robbe (guest, #16131) [Link]

Does the majority of embedded systems (which seem to be the main worry here) even use glibc?

If not, how would a combined glibc+kernel solution help?

2038 is closer than it seems

Posted May 29, 2014 6:18 UTC (Thu) by dlang (guest, #313) [Link]

for a large percentage of the embedded devices, thinking that the date is 1904 instead of 2039 will not actually hurt things much. The biggest question is how everything will handle the rollover

But since embedded devices get reset and power cycled anyway, if they do have problems, resetting them if not going to be that unusual an action.

and then there is a category that cares about the day of the week (or month), and for those you can pick a time in the available window where the calendar matches up.

it's a surprisingly small group of devices that care what year it is


Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds