
Approaching the kernel year-2038 end game

By Jonathan Corbet
January 11, 2019
In January 2038, the 32-bit time_t value used on many Unix-like systems will run out of bits and be unable to represent the current time. This may seem like a distant problem, but, as Tom Scott recently observed, the year-2038 apocalypse is now closer to the present than the year-2000 problem. The fact that systems being deployed now will still be operating in 2038 adds urgency to the issue as well. The good news is that work has been underway for years to prepare Linux for this date, so there should be no need to call developers out of retirement in 2037 in a last-minute panic. Some of the final steps in this transition for the core kernel have been posted, and seem likely to be merged for 5.1.
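The arithmetic behind that date is easy to check. The following is a minimal sketch, assuming a signed 32-bit time_t that counts seconds from the Unix epoch (1970-01-01 00:00:00 UTC); the helper names are illustrative and not taken from any library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
TIME32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold

def time32_to_utc(seconds):
    """Convert seconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)

last_moment = time32_to_utc(TIME32_MAX)
print(last_moment.isoformat())  # 2038-01-19T03:14:07+00:00
```

One second after that moment, the value no longer fits in 31 bits plus a sign.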

There have been numerous phases to this work, which has been carried out primarily by Arnd Bergmann and Deepa Dinamani. Timekeeping within the kernel has been reworked to use 64-bit values throughout, even on 32-bit systems, for example. A lot of work was required to get there, but that was, in some sense, the easy part; since the changes were all internal to the kernel, the developers involved were free to change interfaces when needed. Life becomes more difficult when it comes to the system-call interface, since that cannot be changed at whim without breaking user-space applications.

The approach that has been taken here, for many of the relevant system calls, is to recognize that most systems already have a 64-bit solution for 32-bit applications. Most 64-bit kernels are able to run 32-bit processes; to do so, they provide a set of compatibility (or "compat") system calls to perform impedance matching. Typically, these compat calls simply reformat 32-bit types into their 64-bit equivalent, then pass the result to the native 64-bit implementations. In other words, the compat calls do exactly what is needed to connect a user space process using 32-bit times to a kernel that uses 64-bit times throughout.
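The widening step that a compat call performs can be pictured with a small model. This is only a sketch in Python, not kernel code: the two struct layouts (a pair of signed 32-bit fields and a pair of signed 64-bit fields) and both function names are assumptions made for illustration.

```python
import struct

# Illustrative layouts only: a 32-bit timespec as two signed 32-bit fields,
# and a 64-bit timespec as two signed 64-bit fields (little-endian).
TIMESPEC32 = struct.Struct("<ii")
TIMESPEC64 = struct.Struct("<qq")

def native_sleep64(ts64_bytes):
    """Stand-in for a native 64-bit implementation; returns the parsed fields."""
    return TIMESPEC64.unpack(ts64_bytes)

def compat_sleep32(ts32_bytes):
    """Stand-in for a compat wrapper: widen the fields, then call native code."""
    tv_sec, tv_nsec = TIMESPEC32.unpack(ts32_bytes)
    return native_sleep64(TIMESPEC64.pack(tv_sec, tv_nsec))

request = TIMESPEC32.pack(5, 250_000_000)
print(compat_sleep32(request))  # (5, 250000000)
```

The wrapper contains no timekeeping logic of its own; it only translates between layouts, which is why the same code can serve 32-bit user space on either a 64-bit or a 32-bit kernel.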

Much of the work that has been done to this point, thus, has been promoting these compat system calls to become the native 32-bit system calls. User space sees no changes, but the kernel is able to leave 32-bit times behind entirely. To that end, one of the key changes in this patch set posted by Bergmann is to take the compat calls and define them as proper system calls for 32-bit systems. In the process, these calls are renamed; futex() becomes futex_time32(), for example. Then, 32-bit architectures are switched over to use the new _time32() calls.

The only remaining problem, of course, is that user space is still using 32-bit times, so things will still explode on schedule in 2038. Fixing that problem is not something that the kernel can do on its own, but it can provide the infrastructure to make the transition possible. In particular, for all of the _time32() calls described above, the patch set also exposes the 64-bit versions with _time64() suffixes. So, once this patch is applied, both the (broken) 32-bit and (fixed) 64-bit interfaces are available in 32-bit systems.

At this point, the ball moves into the court of the C library and distribution developers. A new C library release can define the system-call interfaces with 64-bit time values, and implement those interfaces with calls to the _time64() versions. Older binaries, instead, will continue to use the 32-bit versions. For many applications, all that will be needed at this point is a rebuild and they will be prepared to survive the 2038 transition. Others, of course, will require more work. Distributors have the option of rebuilding everything they ship for 64-bit time, then disabling 32-bit times entirely by turning off the COMPAT_32BIT_TIME configuration variable. Most distributors, though, are likely to support both modes for some time yet.

For the curious, the system calls affected are: adjtimex(), clock_adjtime(), clock_getres(), clock_gettime(), clock_nanosleep(), clock_settime(), futex(), io_getevents(), io_pgetevents(), mq_timedsend(), mq_timedreceive(), nanosleep(), ppoll(), pselect6(), recvmmsg(), rt_sigtimedwait(), sched_rr_get_interval(), semtimedop(), timer_gettime(), timer_settime(), timerfd_gettime(), timerfd_settime(), and utimensat(). The plan for the GNU C Library transition has been posted in great detail as well.

These changes fix the core kernel system-call interfaces, but that is not the end of the story. There are many other places in the kernel's user-space API where time values appear, and many of them need to be fixed as well. Those are slowly being addressed. Consider, for example, the SO_TIMESTAMP socket option (described in this man page); it enables the reception of control messages with network timestamp values. Those values are specified using struct timeval, which is not year-2038 safe.

This patch set from Dinamani addresses that problem by adding a new set of options that are year-2038 safe. An application can request SO_TIMESTAMP_NEW to get a new control-message format with 64-bit times; the SO_TIMESTAMPNS and SO_TIMESTAMPING options have seen a similar treatment. Socket timeout values also have a year-2038 problem; this patch series adds SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW to address it. Once again, libraries and (possibly) applications will need to be changed to be able to make use of these new options.

Once this work gets in, the kernel community, at least, can begin to think that there is some light at the end of the tunnel. Problems will remain, mostly in filesystem timestamps and time values that are passed to ioctl() calls, but the work as a whole can be seen as entering the clean-up phase. For library developers and distributors, though, the real work is just beginning. The good news is that they still have some time to get their piece of the work done so that systems deployed in the near future will be ready for the not-so-near (but approaching rapidly) 2038 deadline.


Approaching the kernel year-2018 end game

Posted Jan 11, 2019 18:26 UTC (Fri) by gulsef073 (guest, #123117) [Link] (1 responses)

Perhaps the title should have read 'year-2038'?

Approaching the kernel year-201^H38 end game

Posted Jan 11, 2019 18:28 UTC (Fri) by corbet (editor, #1) [Link]

That would have made sense, wouldn't it? It's amazing what can sneak through review...

Approaching the kernel year-2038 end game

Posted Jan 11, 2019 20:50 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> The good news is that work has been underway for years to prepare Linux for this date, so there should be no need to call developers out of retirement in 2037 in a last-minute panic.
Why??? I hoped for an additional income stream around 2035!

Approaching the kernel year-2038 end game

Posted Jan 12, 2019 13:17 UTC (Sat) by madscientist (subscriber, #16861) [Link] (22 responses)

> can request SO_TIMESTAMP_NEW ... SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW ...

I really, REALLY hate it when people add "new" to the end of symbols. It's short-sighted, not to mention uninformative.

Why not choose something clarifying instead, like SO_TIMESTAMP_64?

Approaching the kernel year-2038 end game

Posted Jan 12, 2019 13:28 UTC (Sat) by dskoll (subscriber, #1630) [Link]

Yeah, that should be fixed before it's frozen forever as a warty API that we're stuck with.

Approaching the kernel year-2038 end game

Posted Jan 13, 2019 11:28 UTC (Sun) by erwbgy (subscriber, #4104) [Link]

Agreed! I was just about to post that. They certainly won't be new in 2038.

Approaching the kernel year-2038 end game

Posted Jan 13, 2019 11:37 UTC (Sun) by warp (guest, #14659) [Link]

Quite.

_NEW is really horrible, as are things like 'next generation' especially when three years later you need to do something even newer or as a further generation.

New

Posted Jan 13, 2019 12:43 UTC (Sun) by tialaramex (subscriber, #21167) [Link] (14 responses)

Birmingham's New Street is recorded at least as far back as the 13th Century. At some point presumably not very long before that it genuinely was a new street. In the 19th century expanding railway traffic caused the train companies to construct a major station next to it, at the time with a huge glass roof. Today "Birmingham New Street" station isn't even on New Street per se, the station and adjunct shopping centre having been rebuilt and sprawled over a larger area - but from pretty much anywhere in the UK if you say "New Street" you will be understood to mean that railway station as distinct from other streets that are new just as if you say "Waterloo" you will be understood to mean the railway station in London once named "Waterloo Bridge Station" and not the place in Belgium.

It would be better to have named it SO_TIMESTAMP_64, but ultimately symbols aren't themselves meanings, they're just symbols, and a few examples like this help to make that abundantly clear.

However, good news, if somebody gives something a name you don't like, you can just attach a different name (doing this to people is rude, but it will work). If yours is much more popular soon nobody will remember the "proper name" at all, and it will be regarded as a mistake if used. In principle as well as a bridge named after the battle of Waterloo, and one still named after the original city itself ("London Bridge") London also has one named for William Pitt (Pitt the Younger). But despite signs nobody called it "William Pitt Bridge", they called it "Blackfriars Bridge", soon the maps followed suit and today only historians or bridge nerds will have any idea where William Pitt Bridge even is. Start campaigning for SO_TIMESTAMP_64 today!

New

Posted Jan 14, 2019 10:27 UTC (Mon) by Guhvanoh (subscriber, #4449) [Link] (7 responses)

And while you're on the subject of railway stations shouldn't London Bridge be known as London London Bridge just like London Waterloo, London Victoria, London Cannon Street etc. ?

New

Posted Jan 14, 2019 18:40 UTC (Mon) by mpr22 (subscriber, #60784) [Link] (6 responses)

Apart from Victoria (which needs to be disambiguated from the ones in Manchester and Southend) and Cannon Street (which needs to be disambiguated from the one in Hull), hardly anyone outside the travel industry or London local government bothers with the 'London' bit when referring to the London terminals that are not London Bridge :)

New

Posted Jan 14, 2019 20:00 UTC (Mon) by brother_rat (subscriber, #1895) [Link]

As we're already way off topic... there's also 'Gatwick' which is GTW by train and LGW by plane.

New

Posted Jan 14, 2019 20:49 UTC (Mon) by BlueLightning (subscriber, #38978) [Link] (4 responses)

I think one reason the rail companies use the "London" prefix (or at least did, when I lived in London a few years ago) is so that when you have a "Not via London" rail ticket it's more obvious that you're not allowed to pass through those stations.

New

Posted Jan 24, 2019 2:08 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)

That was back in the days when the Overground came into the London termini, and you had to use the Underground to traverse Central London. Now with travel cards and zones and Thameslink, and the soon-to-open Crossrail, I don't think you get "not via London" any more.

Cheers,
Wol

New

Posted Jan 24, 2019 18:51 UTC (Thu) by mpr22 (subscriber, #60784) [Link] (2 responses)

> I don't think you get "not via London" any more.

To take an example relevant to myself, journeys between Northampton and the South Coast are still about 20-25% cheaper when made with a ticket that requires one to travel via the West London Line through Kensington Olympia rather than permitting interchange in central London.

New

Posted Jan 24, 2019 19:31 UTC (Thu) by TomH (subscriber, #56149) [Link] (1 responses)

Sure, but as http://www.brfares.com/#!fares?orig=NMP&dest=BTN will show you that is because there are tickets routed "ANY PERMITTED" and cheaper tickets routed "KEN OLYMPIA" rather than having some with a "NOT LONDON" restriction.

New

Posted Jan 24, 2019 19:36 UTC (Thu) by TomH (subscriber, #56149) [Link]

That said you can get NOT VIA LONDON if you look at Bristol to York for example: http://www.brfares.com/#!fares?orig=BPW&dest=YRK

New

Posted Jan 17, 2019 18:36 UTC (Thu) by thyrsus (guest, #21004) [Link] (3 responses)

I've always enjoyed this: "Ironically, the New River is considered by some geologists to be one of the oldest rivers in the world.[11]"

https://en.wikipedia.org/wiki/New_River_%28Kanawha_River_...

New & Old

Posted Jan 18, 2019 9:41 UTC (Fri) by rschroev (subscriber, #4164) [Link] (2 responses)

Like how the Pont Neuf (New Bridge) in Paris is actually the oldest bridge in the city.

https://en.wikipedia.org/wiki/Pont_Neuf

New & Old

Posted Jan 18, 2019 15:34 UTC (Fri) by tao (subscriber, #17563) [Link]

At least New York is a lot more recent than York, and New Orleans is a lot more recent than Orléans.

New & Old

Posted Oct 7, 2023 17:59 UTC (Sat) by ceplm (subscriber, #41334) [Link]

You mean like https://en.wikipedia.org/wiki/Old_New_Synagogue in Prague, which is the oldest active synagogue in Europe?

New

Posted Feb 5, 2019 19:12 UTC (Tue) by eythian (subscriber, #86862) [Link] (1 responses)

Amsterdam has solved this with Nieuwe Nieuwstraat.

New

Posted Feb 28, 2019 14:26 UTC (Thu) by remi.chateauneu (subscriber, #51826) [Link]

Cartagena in Spain: "Possessing one of the best harbors in the Western Mediterranean, it was re-founded by the Carthaginian general Hasdrubal in 228 BC as Qart Hadasht ("New City"), a name identical to Carthage, for the purpose of serving as a stepping-off point for the conquest of Spain. The Roman general Scipio Africanus conquered it in 209 BC and renamed it as Carthago Nova (literally "New New City") to distinguish it from the mother city."

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 20:20 UTC (Mon) by k8to (guest, #15413) [Link] (1 responses)

"new" is pretty bad in code in general, unless maybe you're talking about timestamp variables where you know one is more recent than the other. I've dealt with codebases where the "NewProtocol" classes were three versions ago. Since they didn't learn, NG was newer than new, and then "Turbo".

Approaching the kernel year-2038 end game

Posted Jan 17, 2019 19:11 UTC (Thu) by BenHutchings (subscriber, #37955) [Link]

The Linux network receive queue polling API was introduced as "NAPI" (new API) in 2002 and is still called that.

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 22:00 UTC (Mon) by arnd (subscriber, #8866) [Link] (1 responses)

I don't think it's too important here, to clarify: no user space code should actually use the _NEW symbols, but they should just use the existing names that contain no such suffix, e.g. SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING. The fact that we now have nine identifiers instead of three makes it practically impossible to make sense of it anyway, but it can't really be avoided.

The downside of the _64 suffix would be that it's more confusing for 64-bit architectures, on which both _OLD and _NEW refer to 64-bit timestamps. If you have any other suggestions for naming this, please reply on the mailing list with a patch.

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 23:56 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link]

And it's not as if this is a symbol that's likely to undergo a lot of change. Naming something "_NEW" is problematic when it's something that undergoes a lot of churn, but that's not the case with these time symbols. The old versions are quite old, and there's every expectation the "_NEW" version will be sufficient through 2038 and beyond. Complaining about the naming seems like bikeshedding.

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 8:10 UTC (Mon) by gdt (subscriber, #6284) [Link] (6 responses)

What is the behaviour of the current 32-bit APIs after 2038? There seems to be a range of possibilities: time_t values stop incrementing (maybe at some flag value); they two's-complement wrap into years prior to 1970; they wrap to 0? Is the behaviour consistent across system calls and filesystems?

Just removing the 32-bit time_t API isn't a full answer. Even if I'm running a recent 2038-prepared kernel I might still attach an archival volume in the older pre-2038 format.
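For reference, the two's-complement possibility raised in this question works out as follows. This is a sketch of the arithmetic only; it makes no claim about what any particular kernel, C library, or filesystem actually does, and the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def wrap_int32(value):
    """Reduce an integer to the signed 32-bit range, two's-complement style."""
    return (value + 2**31) % 2**32 - 2**31

# One second past the largest signed 32-bit value.
one_past_max = wrap_int32(2**31)
print(one_past_max)  # -2147483648
print((EPOCH + timedelta(seconds=one_past_max)).isoformat())
# 1901-12-13T20:45:52+00:00
```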

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 9:54 UTC (Mon) by arnd (subscriber, #8866) [Link] (5 responses)

I expect that the old APIs will remain available in kernels until shortly before 2038, but I'd like to make it a compile-time option for those that intentionally want to have kernels that leave them out earlier, in particular on architectures that never had them (riscv32 etc).

After 2038, I would expect to just remove that entirely, as there will be very little value in running 20 year old user space that probably won't work anyway, as opposed to running it in a virtual machine with an older kernel and a backdated RTC.

Approaching the kernel year-2038 end game

Posted Jan 20, 2019 1:25 UTC (Sun) by giraffedata (guest, #1954) [Link] (4 responses)

I think you mistook the question. It isn't what will happen to the 32 bit time APIs in versions of the kernel developed after 2038; it's how will the 32 bit APIs in today's kernel function when that kernel runs after 2038?

It's a good question; I'd like to understand that too. I'm pretty sure I'm going to have some computers running then on which running a recent kernel won't be the best option.

OP also posed another question that perhaps you'd like to address: what about stored 32 bit time stamps? Many filesystems contain 32 bit timestamps. What will happen after 2038 if one tries to use an up-to-date kernel to read or write such a filesystem? Will we have to copy all the data to a newer type of filesystem to maintain access to all the data?

Approaching the kernel year-2038 end game

Posted Jan 21, 2019 13:47 UTC (Mon) by arnd (subscriber, #8866) [Link] (3 responses)

Generally speaking, I'd consider it a bug to use the old ABIs beyond 2038. Running a Linux kernel older than maybe 5.2 is certainly going to break in unexpected ways in 2038 (unless you are backdating the RTC and don't use network access etc).

Running a newer kernel that supports both the time32 and time64 interfaces gives us compatibility until 2038, but that doesn't mean you should plan to actually use the time32 interfaces on systems that might have to run after 2038, since that won't be tested well, and generally means undefined behavior (signed integer overflow in C). In practice this means that any existing 32-bit user space should be rebuilt against a kernel and libc with 64-bit time_t in order to have a chance of it working in 20 years.

In case of stored time stamps, you can always access file systems as read-only and get your data. There is already a problem with setting the timestamps of files beyond the valid range, which is different depending on the particular file system. One planned patch set will limit timestamps to the minimum and maximum supported by the file system, and this will likely be true both for calling utimes() and for timestamp updates to the current time that the kernel does itself.
There may be an option to force read-only mounts of file system images that suffer from an overflow in the near future (for some value of "near", which could be as far as "within the next 30 years"), to ensure that you don't accidentally create an embedded system that stops working at a fixed time.
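The clamping idea described above can be sketched in a few lines. This is an illustration of the general technique only, assuming a hypothetical filesystem whose on-disk timestamps are signed 32-bit seconds; the planned kernel patches may well be structured differently.

```python
# Assumed limits for a hypothetical filesystem with signed 32-bit timestamps.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def clamp_timestamp(seconds, fs_min=INT32_MIN, fs_max=INT32_MAX):
    """Pin a timestamp to the range the (hypothetical) filesystem can store."""
    return max(fs_min, min(seconds, fs_max))

in_range = clamp_timestamp(1_547_000_000)  # January 2019: stored unchanged
too_late = clamp_timestamp(2**31 + 3600)   # past the 2038 limit: pinned to fs_max
print(in_range, too_late)  # 1547000000 2147483647
```

Clamping trades accuracy for predictability: a file written after the limit carries a wrong timestamp, but not one that has wrapped around to a date decades in the past.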

Approaching the kernel year-2038 end game

Posted Jan 24, 2019 2:14 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

> In practice this means that any existing 32-bit user space should be rebuilt against a kernel and libc with 64-bit time_t in order to have a chance of it working in 20 years.

And if you want to run programs for which you don't have the source???

Cheers,
Wol

Approaching the kernel year-2038 end game

Posted Jan 24, 2019 10:07 UTC (Thu) by arnd (subscriber, #8866) [Link]

You may be lucky that they don't use time_t in ways that cause an undefined integer overflow in 2038. However, since you don't have the source, there is no way of finding out whether the programs are safe or not, so the assumption would be that they are not.

The plans for glibc are to keep compatibility support for the old symbols, so the application will still work correctly before any overflows happen, so it's likely that you are safe until some time in the 2030s when it has set the first timers long enough into the future that they overflow (e.g. key expiration dates are often months or years in the future). Again, to be sure you'd have to inspect the source code or ask someone who can do it, but that would not be that different from asking them for a rebuild against a C library from the 2020s.

Approaching the kernel year-2038 end game

Posted Jan 26, 2019 22:02 UTC (Sat) by kleptog (subscriber, #1183) [Link]

For filesystems you could simply declare the value on disk to be unsigned, or use a different epoch and then translate in the kernel. Sucks for the people who want to store timestamps from before 1970 though.

Then you're just left with user space programs. You could do funky things like expanding the range by reducing resolution, but changing the assumption that adding 3600 moves you an hour forward will probably break more things than wrapping.

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 10:25 UTC (Mon) by zoobab (guest, #9945) [Link] (1 responses)

Most 32-bit chips that will still be alive in 18 years' time are probably those ARM m0-m3 parts, which are present in many devices now. They do not run Linux and, in most cases, not even an OS like FreeRTOS. Are there any patches for FreeRTOS and the others?

Approaching the kernel year-2038 end game

Posted Jan 14, 2019 16:23 UTC (Mon) by ay (subscriber, #79347) [Link]

FreeRTOS is more of a threading library than an operating system; it doesn't provide calendaring or timekeeping functionality beyond simple timers. The issue on those is likely libc (usually newlib and newlib-nano) plus the MCU's hw rtc and calendar block and what it's capable of (most do handle 2038 just fine) plus any vendor sdk routines around that. That said, most of those systems will have a dead rtc battery by 2038 or be disposed of for some other reason, so it's not likely to be a huge deal.

Unsigned!

Posted Jan 15, 2019 4:06 UTC (Tue) by ncm (guest, #165) [Link]

Really, the simplest thing for everyone to do who is stuck with a 32-bit runtime and 32-bit database/packet date/time fields is to just begin interpreting the time as a 32-bit unsigned value. That extends its usability right through 2106. Anybody still using a 32-bit unix in 2106 will have much bigger problems than the date rolling over. (Don't bring up birthdates before 1970. Not interested.)

The pcap format used for tcpdump took that route. Come 2106, we will be able to say that any captured packet of interest dated apparently before 2038 (i.e., positive) is really post-2106, and everything is fine again for another 78 years. Much of the stock market, with its nanosecond-scale clocks, has settled there too, with split 32-bit-second, 32-bit-nanosecond timestamps. Nobody there worries about 2038, or 2106.
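The 2106 figure follows directly from reading the same 32 bits as an unsigned count of seconds since 1970. A minimal sketch of that reinterpretation, with illustrative helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_unsigned32(value):
    """Reinterpret a signed 32-bit value as unsigned (same bit pattern)."""
    return value & 0xFFFFFFFF

def to_utc(seconds):
    """Convert seconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)

wrapped = -2**31  # bit pattern one second past the signed maximum
print(to_utc(as_unsigned32(wrapped)).isoformat())  # 2038-01-19T03:14:08+00:00
print(to_utc(2**32 - 1).isoformat())               # 2106-02-07T06:28:15+00:00
```

The cost is the one already conceded above: values that a signed reader would treat as dates before 1970 are no longer representable.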

Approaching the kernel year-2038 end game

Posted Jan 31, 2020 15:13 UTC (Fri) by S8iDH5QTk (guest, #136936) [Link] (3 responses)

Is it sure that we have to go 64-bit wide? 64 bits in every timestamp? Repeating the redundant high 32 bits, which will not change for 136 years?
I would want 32-bit timestamps.

In 2038, "after the last second", we just need a change in the epoch. Let's shift the epoch from 1970 to (1970+136).

Given that we know that now (then) there is (will be) Tue Jan 19 03:14:08 2038, and the timestamp shows me that we are 68 years before the epoch, it follows that the (new) epoch is (should be) now (then) + 68 years.

It is similar to how, during any given year (say 2019), a short date like "Fri Jan 31 15:38:17" is sufficient for most purposes. After December 31, nobody thinks they have suddenly been dropped back in time; the next year has simply started. Now it is 2020, and subsequent short dates should be interpreted as ".. in 2020".

And in 2174 (2038+136), another epoch shift is needed: one every 136 years.

Approaching the kernel year-2038 end game

Posted Jan 31, 2020 16:05 UTC (Fri) by excors (subscriber, #95769) [Link]

That might be okay for representing the current time, because device lifetimes are short enough that you can hardcode the current epoch in them. E.g. radio clocks in the UK use a signal that encodes the year as 2 digits. If a device manufactured in 2020 sees the year "50" transmitted, it knows that must mean 2050 not 1950. It would misinterpret the signal in 2150 but there's no chance the device will survive that long anyway.

But that doesn't work when referring to distant past/future times, which the kernel sometimes needs to do. Your computer in 2038 might have files that were genuinely last modified in 1970, and you don't want "ls" to start saying they were modified in 2106. 64-bit timestamps are pretty cheap and they completely solve that problem without the need for heuristics.

It's like how "Jan 31" might be good enough when arranging a holiday with someone, because you both interpret it relative to the same 'now'; but if you print out your holiday photos and put them in an album, you probably want to write "Jan 31 2020" on them, else when you look back in future years you won't have the context to interpret them correctly.

Approaching the kernel year-2038 end game

Posted Jan 31, 2020 16:46 UTC (Fri) by nybble41 (subscriber, #55106) [Link]

> Is it sure that we have to go 64-bit wide? 64 bits in every timestamp?

Yes. We need to be able to represent not only the current time but also all the timestamps recorded up to this point. For example, it should be possible to accurately represent the modification time of a file last modified in 1970 alongside a file modified in 2040, which cannot be accomplished simply by shifting the epoch. Ergo, unless we're willing to sacrifice resolution, we need more than 32 bits. And sizes larger than 32 bits but less than 64 bits are exceedingly awkward to deal with.
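The point can be checked numerically for the two epochs discussed in this thread: the current one (1970) and the proposed one shifted forward by 2^32 seconds (roughly 136 years). The sketch below assumes signed 32-bit offsets in whole seconds; the function name and the two sample dates are illustrative.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_in_time32(moment, epoch):
    """True if `moment` is a signed 32-bit number of seconds away from `epoch`."""
    offset = (moment - epoch) // timedelta(seconds=1)
    return INT32_MIN <= offset <= INT32_MAX

old_epoch = datetime(1970, 1, 1, tzinfo=UTC)
shifted_epoch = old_epoch + timedelta(seconds=2**32)  # early 2106

file_from_1970 = datetime(1970, 6, 1, tzinfo=UTC)
file_from_2040 = datetime(2040, 6, 1, tzinfo=UTC)

for epoch in (old_epoch, shifted_epoch):
    print(epoch.year,
          fits_in_time32(file_from_1970, epoch),
          fits_in_time32(file_from_2040, epoch))
# 1970 True False
# 2106 False True
```

Under either of these epochs, one of the two files has a modification time that cannot be stored, and any other fixed 32-bit epoch with one-second resolution still covers only a window of about 136 years.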

Approaching the kernel year-2038 end game

Posted Feb 1, 2020 22:09 UTC (Sat) by flussence (guest, #85566) [Link]

Your proposal to change every piece of software to use a different 32-bit time system will not work, for the same reason every proposal to "simply" extend 32-bit IPv4 addresses has failed.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds