By Jonathan Corbet
August 14, 2013
Many LWN readers have been in the field long enough to remember the
year-2000 problem, caused by widespread use of two decimal digits to store
the year. Said problem was certainly overhyped, but the frantic effort to
fix it was also not entirely wasted; plenty of systems would, indeed, have
misbehaved had all those COBOL programmers not come out of retirement to
fix things up. Part of the problem was that the owners of the affected systems
waited until almost too late to address the issue, despite the fact that it
was highly predictable and had been well understood decades ahead of time.
One would hope that, in the free software world, we would not repeat this
history with another, equally predictable problem.
We'll have the opportunity to find out, since one such problem lurks over
the horizon. The classic Unix representation for time is a signed 32-bit
integer containing the number of seconds since January 1, 1970. This
value will overflow on January 19, 2038, less than 25 years from now.
One might think that the time remaining is enough to approach a fix in a
relaxed manner, and one would be right. But, given the longevity of many
installed systems, including hard-to-update embedded systems, there may be
less time for a truly relaxed fix than one might think.
It is thus interesting to note that, on August 12, OpenBSD developer Philip
Guenther checked in a patch to the OpenBSD
system changing the types of most time values to 64-bit quantities. With
64 bits, there is more than enough room to store time values far past
the foreseeable future, even if high-resolution (nanosecond-based) time
values are used. Once the issues are shaken out, OpenBSD will likely have
left the year-2038 problem behind; one could thus argue that they are well
ahead of Linux on this score. And perhaps that is true, but there are some
good reasons for Linux to proceed relatively slowly with regard to this
problem.
The OpenBSD patch changes types like time_t and clock_t
to 64-bit quantities. Such changes ripple outward quickly; for example,
standard types like struct timeval and
struct timespec contain time_t fields, so those
structures change as well. The struct stat passed to the
stat() system call also contains a set of time_t values.
In other words, the changes made by OpenBSD add up to one huge,
incompatible ABI change. As a
result, OpenBSD kernels with this change will generally not run binaries
that predate the change; anybody updating to the new code is advised to do
so with a great deal of care.
OpenBSD can do this because it is a self-contained system, with the kernel
and user space built together out of a single repository. There is little
concern for users with outside binaries; one is expected to update the
system as a whole and rebuild programs from source if need be. As a
result, OpenBSD developers are much less reluctant to break the kernel ABI
than Linux developers are. Indeed, Philip went ahead and expanded
ino_t (used to represent inode numbers) as well while he was at
it, even though that type is not affected by this problem. As long as
users testing this code follow the recommendations and start fresh with a full
snapshot, everything will still work. Users attempting to update an
installed system will need
to be a bit more careful.
In the Linux world, we are unable to simply drag all of user space forward
with the kernel, so we cannot make incompatible ABI changes in this way.
That is going to complicate the year-2038 transition considerably — all the
more reason why it needs to be thought out ahead of time. That said, not
all systems are at risk. As a general
rule, users of 64-bit systems will not have problems in 2038, since 64-bit
values are already the norm on such machines. The 32-bit x32 ABI was also designed with 64-bit time
values. So many Linux users are already well taken care of.
But users of the pure 32-bit ABI will run into trouble. Of course, there
is a possibility that there
will be no 32-bit systems in the wild 25 years from now, but history argues
otherwise. Even with its memory addressing limitations (a 32-bit processor
with the physical address extension feature will struggle to work with 16GB of
memory which, one assumes, will barely be enough to hold a "hello world"
program in 2038), a 32-bit system can perform a lot of useful tasks. There
may well be large numbers of embedded 32-bit systems running in 2038 that
were deployed many years prior. There will almost certainly be 32-bit
systems running in 2038 that will need to be made to work properly.
During a brief discussion on the topic last June, Thomas Gleixner described a possible approach to the problem:
If we really want to survive 2038, then we need to get rid of the
timespec based representation of time in the kernel altogether and
switch all related code over to a scalar nsec 64bit storage. [...]
Though even if we fix that we still need to twist our brains around
the timespec/timeval based user space interfaces. That's going to
be the way more interesting challenge.
In other words, if a new ABI needs to be created anyway, it would make
sense to get rid of structures like timespec (which split times
into two fields, representing seconds and nanoseconds) and use a simple
nanosecond count. Software could then migrate over to the new system calls
at leisure. Thomas suggested keeping the
older system call infrastructure in place for five years, meaning that
operations using the older time formats would continue to be directly
implemented by the kernel; that would prevent unconverted code from
suffering performance regressions. After that period passed, the
compatibility code would be replaced by wrappers around the new system
calls, possibly slowing the emulated calls down and providing an incentive for
developers to update their code. Then, after about ten years, the old
system calls could be deprecated.
Removal of those system calls could be an interesting challenge, though;
even Thomas suggested keeping them for 100 years to avoid making Linus
grumpy. If the system calls are to be kept up to (and past) 2038, some way
will need to be found to make them work in some fashion. John Stultz had
an interesting suggestion toward that end:
turn time_t into an unsigned value, sacrificing the ability to
represent dates before 1970 to gain some breathing room in the future.
There are some interesting challenges to deal with, and some software would
surely break, but, without a change, all software using 32-bit
time_t values will break in 2038. So this change may well be
worth considering.
Even without legacy applications to worry about, making 32-bit Linux
year-2038 safe would be a significant challenge. The ABI constraints make
the job harder yet. Given that some parts of any migration simply cannot
be rushed, and given that some deployed systems run for many years, it
would make sense to be thinking about a solution to this problem now.
Then, perhaps, we'll all be able to enjoy our retirement without having to
respond to a long-predicted time_t mess.
(
Log in to post comments)