2038 is closer than it seems
At times, developers have hoped that this problem might solve itself. On 64-bit systems, the time_t type has always been defined as a 64-bit quantity and will not run out of space anytime soon. Given that 64-bit systems appear to be taking over the world — even phone handsets seem likely to make the switch in the next few years — might the best solution be to just wait for 32-bit systems to die out and take the problem with them? A "no action required" solution has an obvious appeal.
There are two problems with that reasoning: (1) 32-bit systems are likely to continue to be made for far longer than most people might expect, and (2) there are 32-bit systems being deployed now that can be expected to have lifetimes of 24 years or longer. 32-bit systems will be useful as cheap microcontrollers for a long time, and, once deployed, they will often be expected to work for many years while being difficult or impossible to update. There are almost certainly systems already deployed that are going to provide unpleasant surprises in 2038.
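To make the deadline concrete, here is a minimal sketch of what happens to a signed 32-bit seconds counter at its limit. The helper names (`utc_from_seconds()`, `wrap_after_one_second()`) are invented for this illustration, and the code assumes it is built on a host whose own time_t is 64 bits wide so that both sides of the wraparound can be decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Illustrative helper (not a libc function): decode a seconds-since-epoch
 * value into UTC calendar fields. Assumes the host time_t is 64 bits wide.
 * Returns 0 on success, -1 if the value cannot be converted. */
int utc_from_seconds(int64_t secs, struct tm *out)
{
    time_t t = (time_t)secs;
    struct tm *tm;

    assert(out != NULL);
    tm = gmtime(&t);
    if (tm == NULL)
        return -1;
    *out = *tm;
    return 0;
}

/* Illustrative helper: advance a signed 32-bit counter by one second.
 * The addition is done in unsigned arithmetic, where wraparound is well
 * defined; converting the result back to int32_t is implementation-defined
 * in C, and wraps on the usual two's-complement compilers. */
int32_t wrap_after_one_second(int32_t now)
{
    return (int32_t)((uint32_t)now + 1u);
}
```

Feeding INT32_MAX to `utc_from_seconds()` yields 03:14:07 UTC on January 19, 2038; one second later, the wrapped counter decodes to December 13, 1901.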
Kernel-based solutions
So it would appear to make sense to solve the problem soon, rather than in, say, 2036 or so. There is only one snag: the problem is not all that easy to solve. At least, it is not easy if one is concerned about little details like not breaking existing programs. Since Linux developers at most levels are quite concerned about compatibility, the simplest solutions (such as a BSD-style ABI break) are not seen as being workable. In a recent discussion, John Stultz outlined a couple of alternative approaches, neither of which is without its difficulties.
The first approach would be to change the 32-bit ABI to use a 64-bit version of time_t (related data structures, like struct timespec and struct timeval, would also change). Old binaries could be supported through a compatibility interface, but newly compiled code would normally use the new ABI. There are some advantages to this approach, starting with the fact that lots of applications could be updated simply by rebuilding them. Since a couple of BSD variants have already taken this path, a number of the worst application problems have already been fixed. Embedded microcontrollers typically run custom distributions built entirely from source; changing the ABI in this way would make it possible to build 2038-capable systems in the near future with a minimum of pain.
On the other hand, the kernel would have to maintain a significant compatibility layer for a long time. Developers are also worried that there will be many applications that store 32-bit time_t values in their own data structures, in on-disk formats, and more. Many of these applications could break in surprising ways, and they could prove to be difficult to fix. There are also some concerns about the runtime cost of using 64-bit time_t values on 32-bit systems. Much of this cost could be mitigated within the kernel by using a different format internally, but applications could slow down as well.
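The stored-timestamp worry can be sketched as follows. The record layout and the `legacy_set_mtime()` helper are hypothetical, standing in for any application that fixed a 32-bit time field into its file format; rebuilding such a program with a wider time_t does nothing for the field on disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk record from an application written with a
 * 32-bit time_t in mind; the field width is part of the file format. */
struct legacy_record {
    int32_t  mtime;   /* seconds since the epoch, stuck at 32 bits */
    uint32_t length;
};

/* Store a 64-bit time value into the legacy field. Returns 0 if the
 * value fits, or -1 (leaving the record untouched) if it would have
 * been silently truncated. */
int legacy_set_mtime(struct legacy_record *rec, int64_t t)
{
    assert(rec != NULL);
    if (t < INT32_MIN || t > INT32_MAX)
        return -1;
    rec->mtime = (int32_t)t;
    return 0;
}
```

A plain assignment without the range check would compile cleanly and work until 2038, which is what makes this class of bug hard to find ahead of time.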
The alternative approach is to simply define a new set of system calls, all of which are defined to use better time formats from the beginning. The new formats could address other irritations at the same time; not everybody likes the separate seconds and nanoseconds fields used in struct timespec, for example. All system calls defined to use the old time_t values would be deprecated, with the idea of removing them, if possible, before 2038.
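As one illustration of what a "better time format" might look like, the sketch below folds the two fields of struct timespec into a single signed 64-bit count of nanoseconds. This format is an assumption made for the example, not something proposed in the discussion; the `ns_time` type and both helpers are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Hypothetical single-field time format: nanoseconds since the epoch.
 * A signed 64-bit count reaches roughly 292 years either side of 1970. */
typedef int64_t ns_time;

/* Combine seconds and nanoseconds into one value. The caller must keep
 * tv_sec within the range above; this sketch does not check for overflow. */
ns_time ns_from_timespec(const struct timespec *ts)
{
    assert(ts != NULL);
    return (ns_time)ts->tv_sec * DEMO_NSEC_PER_SEC + ts->tv_nsec;
}

/* Split the single value back into the two-field form, keeping tv_nsec
 * in [0, 999999999] for times before the epoch as well. */
struct timespec ns_to_timespec(ns_time ns)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(ns / DEMO_NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % DEMO_NSEC_PER_SEC);
    if (ts.tv_nsec < 0) {
        ts.tv_nsec += DEMO_NSEC_PER_SEC;
        ts.tv_sec -= 1;
    }
    return ts;
}
```

The trade-off is range for simplicity: arithmetic and comparisons become ordinary integer operations, but the format runs out a few centuries from now rather than billions of years out.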
With this approach, there would be no hard ABI break anytime soon and applications could be migrated gradually. Once again, embedded systems could be built using the new system calls in the relatively near future, while desktop systems could be left alone for another decade or so. And it would be a chance to start over and redesign some longstanding system calls with 21st-century needs in mind.
Defining new system calls has its downsides as well, though. It would push Linux further away from being a POSIX system, and would take us down a path different from the one chosen by the BSD world. There are a lot of system calls to replace, and time_t values show up in other places as well, most notably in a long list of ioctl() calls. Applications would have to be updated, including those running only on 64-bit systems, which would not see much of a benefit from the new system calls. And, undoubtedly, there would be lots of applications using the older system calls that would surface in 2037. So this approach is not an easy solution either.
Including glibc
Discussions of these alternatives went on for a surprisingly long time before Christoph Hellwig made an (in retrospect) obvious suggestion: the C library developers are going to have to be involved in the implementation of any real solution to the year-2038 problem, so perhaps they should be part of the discussion now. For years, communications between the kernel community and the developers of C libraries (including the GNU C library — glibc) have been sporadic at best. The changing of the guard at glibc has made productive conversations easier to have, but changing old habits has proved hard. In any case, it is true that the glibc developers will have to be involved in the design of the solution to this problem; the good news is that such involvement appears likely to happen.
Glibc developers are not known for their love of ABI breaks — or of non-POSIX interfaces for that matter. So, once glibc developer Joseph Myers joined the conversation, the tone shifted a bit toward a solution that would allow a smooth transition while retaining existing POSIX system calls and application compatibility. The plan (which was discussed only in rough form and would need a lot of work yet) looks something like this:
- Create new, 64-bit versions of the affected system calls. So, for
example, there would be a gettimeofday64() that returns the
time in a struct timeval64. The existing versions of these
system calls would be unchanged.
- Glibc would gain a new feature test macro with a name like
TIME_BITS. If TIME_BITS=64 on a 32-bit system, a
call to gettimeofday() will be remapped to
gettimeofday64() within the library. So applications can opt
into the new world by building with an appropriate value of
TIME_BITS defined.
- Eventually, TIME_BITS=64 would become the default, probably after distributions had been shipping in that mode for a while. Even in the 64-bit configuration, compatibility symbols would remain so that older binaries would still work against newer versions of the C library.
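The remapping step in the plan above might be sketched roughly like this. Everything prefixed with `demo_` is invented for the illustration: the two structures stand in for the old struct timeval and the proposed struct timeval64, and `demo_gettimeofday64()` simply wraps the real gettimeofday() since no such system call exists here. A real C library would do the redirection with its own symbol-versioning machinery rather than a plain macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Stand-ins for the legacy 32-bit layout and the proposed 64-bit one. */
struct demo_timeval32 { int32_t tv_sec; int32_t tv_usec; };
struct demo_timeval64 { int64_t tv_sec; int64_t tv_usec; };

/* Stand-in for the proposed gettimeofday64(): wraps the host's
 * gettimeofday() and widens the result. */
int demo_gettimeofday64(struct demo_timeval64 *tv)
{
    struct timeval native;

    assert(tv != NULL);
    if (gettimeofday(&native, NULL) != 0)
        return -1;
    tv->tv_sec = (int64_t)native.tv_sec;
    tv->tv_usec = (int64_t)native.tv_usec;
    return 0;
}

/* Stand-in for the unchanged legacy call: fails once the current
 * time no longer fits in 32 bits. */
int demo_gettimeofday32(struct demo_timeval32 *tv)
{
    struct demo_timeval64 wide;

    assert(tv != NULL);
    if (demo_gettimeofday64(&wide) != 0)
        return -1;
    if (wide.tv_sec > INT32_MAX)
        return -1;
    tv->tv_sec = (int32_t)wide.tv_sec;
    tv->tv_usec = (int32_t)wide.tv_usec;
    return 0;
}

/* Applications opt in by building with -DTIME_BITS=64; without it,
 * this sketch keeps the legacy behavior. */
#ifndef TIME_BITS
#define TIME_BITS 32
#endif

#if TIME_BITS == 64
typedef struct demo_timeval64 demo_timeval;
#define demo_gettimeofday(tv) demo_gettimeofday64(tv)
#else
typedef struct demo_timeval32 demo_timeval;
#define demo_gettimeofday(tv) demo_gettimeofday32(tv)
#endif
```

Application code written against `demo_timeval` and `demo_gettimeofday()` compiles unchanged in either mode, which is the point of the scheme: the source stays the same and only the build flag selects the width.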
Such an approach could possibly allow for a relatively smooth transition to a system that will work in 2038, though, naturally, a number of troublesome details remain. There was talk of remapping ioctl() calls in a similar way, but that looks like a recipe for trouble given just how many of those calls there are and how hard it would be to even find them all. Developers in other C library projects, who often don't wish to maintain the sort of extensive compatibility infrastructure found in glibc, may wish to take a different approach. And so on.
But, even with its challenges, the existence of a vague plan hashed out
with participation from kernel and glibc developers is reason for hope.
Maybe, just maybe, some sort of reasonably robust solution to the 2038
problem will be found before it becomes absolutely urgent, and, with luck,
before lots of systems that will need to function properly in 2038 are
deployed. We have the opportunity to avoid a year-2038 panic at a
relatively low cost; if we make use of that opportunity, our future selves
will thank us.
Posted May 22, 2014 3:03 UTC (Thu)
by adler187 (guest, #80400)
[Link] (3 responses)
Posted May 22, 2014 4:20 UTC (Thu)
by mezcalero (subscriber, #45103)
[Link] (2 responses)
The same story happened for large file support (LFS) where off_t got increased in size. Now, off_t is thankfully not that often exposed in APIs, and because people knew how awful the situation was many just avoided exposing it in APIs, but for time_t the situation is much worse.
Also, one particular gem: think of stat() which already exists in two flavours, with LFS and without. Now, this API would also have to be duplicated for 32bit time_t and 64bit time_t. So you get four flavours of this call: stat(), stat64(), stat_t64() and stat64_t64()! Ouch!
And even thinking of duplicating gettimeofday() when there's also clock_gettime(CLOCK_REALTIME) is just wrong...
Lennart
Posted May 22, 2014 7:27 UTC (Thu)
by niner (subscriber, #26151)
[Link] (1 responses)
Posted May 22, 2014 9:48 UTC (Thu)
by arnd (subscriber, #8866)
[Link]
Posted May 22, 2014 5:50 UTC (Thu)
by eru (subscriber, #2753)
[Link] (1 responses)
Posted May 22, 2014 9:55 UTC (Thu)
by arnd (subscriber, #8866)
[Link]
What user space does is a different matter though. I think you should always have at least the option to build a libc that only supports 64-bit time_t in user space and that uses the new kernel interfaces for a safe implementation. This way, an enterprise distro with e.g. 10 years of guaranteed support and lots of legacy third-party applications can keep working as previously, while an embedded system with 25 years support and no legacy code can go to 64-bit time_t in user space without any backwards compat hacks in user space.
Posted May 23, 2014 9:32 UTC (Fri)
by danielos (guest, #6053)
[Link] (1 responses)
Posted Dec 11, 2014 15:18 UTC (Thu)
by butlerm (subscriber, #13312)
[Link]
Posted May 23, 2014 16:24 UTC (Fri)
by ScottMinster (subscriber, #67541)
[Link] (5 responses)
Obviously this wouldn't solve situations where the time_t value is recorded outside the process (in a file, sent over the network, etc), so this wouldn't work for some subset of applications. But most applications probably do not do that, so this workaround seems like it could be effective in cases where recompiling is not possible or desirable.
Posted May 23, 2014 19:34 UTC (Fri)
by corbet (editor, #1)
[Link] (3 responses)
Posted May 23, 2014 21:05 UTC (Fri)
by ScottMinster (subscriber, #67541)
[Link] (2 responses)
It's not a good long term solution, but could be a workaround for some applications.
Posted May 24, 2014 5:30 UTC (Sat)
by jzbiciak (guest, #5246)
[Link] (1 responses)
Are negative time_t values really valid times, though? There are APIs (such as time()) that return -1 to indicate an error. The lazy programmer in me imagines that among programs that actually check for an error (which, admittedly, are probably rare), the majority would just check for less-than-zero.
Posted May 24, 2014 10:40 UTC (Sat)
by ScottMinster (subscriber, #67541)
[Link]
But if the epoch date is in the recent past, that wouldn't cause any trouble for the time() function. I don't think any code would care if, for example, the time_t values in a stat structure are null. As long as they are relatively consistent and localtime() returns the correct human translation.
Posted Jun 2, 2014 16:00 UTC (Mon)
by jch (guest, #51929)
[Link]
Ouch. That would break all of the (perfectly correct) code that does
fprintf(datafile, "%ld\n", (long)timestamp);
and expects the data to be valid in the next session.
Posted May 23, 2014 17:57 UTC (Fri)
by lonely_bear (subscriber, #2726)
[Link]
Posted May 23, 2014 19:09 UTC (Fri)
by kleptog (subscriber, #1183)
[Link] (3 responses)
I suppose what you could do is change the definition to reduce the resolution. So after 2030 it starts counting at half speed. In 2034 at quarter speed, 2036 one-eighth speed, etc. If you fix the APIs for strftime, ctime, etc everything will be transparent for most programs except the resolution gets worse.
This of course screws people who do calculations on time_t though. You'll only see even seconds after that time. It might save you on file formats though. A sort of floating-point time_t...
Posted May 23, 2014 21:37 UTC (Fri)
by Jonno (subscriber, #49613)
[Link] (1 responses)
Presumably it will be treated as an unsigned integer, covering 1970-2106 instead of 1901-2038.
Posted Dec 5, 2016 14:26 UTC (Mon)
by mirabilos (subscriber, #84359)
[Link]
(Yes, I know about complement representation, but that was not the point here.)
Posted Dec 5, 2016 17:38 UTC (Mon)
by zlynx (guest, #2285)
[Link]
Posted May 25, 2014 20:43 UTC (Sun)
by robbe (guest, #16131)
[Link]
If not, how would a combined glibc+kernel solution help?
Posted May 29, 2014 6:18 UTC (Thu)
by dlang (guest, #313)
[Link]
But since embedded devices get reset and power cycled anyway, if they do have problems, resetting them is not going to be that unusual an action.
and then there is a category that cares about the day of the week (or month), and for those you can pick a time in the available window where the calendar matches up.
it's a surprisingly small group of devices that care what year it is
I hope going to 64-bit time_t is selected, like the BSD:s did. Given the human tendency of putting off chores with no immediate benefit, harshly imposing the fix when code is recompiled is the only way to get this solved before the deadline.
Does it affect other systems, such as TTL and other network stuff? (packet format, and such)
Running in a different epoch would work in situations where dates in the past do not need to be represented. But think about things like file time stamps that do indeed need to be in the past.
Changing the epoch
glibc