
Kernel page-table isolation merged

Linus has merged the kernel page-table isolation patch set into the mainline just ahead of the 4.15-rc6 release. This is a fundamental change that was added quite late in the development cycle; it seems a fair guess that 4.15 will have to go to -rc8, at least, before it's ready for release.

Kernel page-table isolation merged

Posted Dec 30, 2017 16:06 UTC (Sat) by jiiksteri (subscriber, #75247) [Link] (7 responses)

This seems uncharacteristically hasty for -rc6, and grsecurity suggests[1] something is being embargoed.

On the other hand any breakage from this isn't exactly subtle I guess, so there's time to resolve potential problems. Who knows.

[1] https://twitter.com/grsecurity/status/946485057849647104

Kernel page-table isolation merged

Posted Dec 30, 2017 17:13 UTC (Sat) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

From the linked LWN article, "KPTI, in other words, has all the markings of a security patch being readied under pressure from a deadline."

Kernel page-table isolation merged

Posted Dec 31, 2017 13:48 UTC (Sun) by lkundrak (subscriber, #43452) [Link]

Well, no need to guess -- X86_BUG_CPU_INSECURE says that pretty clearly.
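
For anyone who wants to check a running kernel, here is a minimal sketch. It assumes the bug bit surfaces in lowercase as `cpu_insecure` on the `bugs` line of /proc/cpuinfo; that spelling is an assumption and may differ between kernel versions. The function names are made up for illustration.

```python
def cpu_bug_flags(cpuinfo_text):
    """Return the names listed on the first 'bugs' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "bugs":
            return set(value.split())
    return set()


def read_cpu_bug_flags(path="/proc/cpuinfo"):
    """Read the file at `path`; return an empty set if it cannot be read."""
    try:
        with open(path) as f:
            return cpu_bug_flags(f.read())
    except OSError:
        return set()


if __name__ == "__main__":
    flags = read_cpu_bug_flags()
    # "cpu_insecure" is the assumed lowercase spelling of X86_BUG_CPU_INSECURE.
    print("cpu_insecure" in flags, sorted(flags))
```

An empty result only means no `bugs` line was found (older kernel, non-x86, or no /proc); it says nothing about the hardware itself.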

Kernel page-table isolation merged

Posted Dec 31, 2017 17:44 UTC (Sun) by MarcB (guest, #101804) [Link] (4 responses)

It seems Microsoft is making the same change in Windows: https://twitter.com/aionescu/status/930412525111296000

Perhaps they were inspired by the publication of KAISER (time frame fits, if they were really fast), perhaps they got the idea independently or perhaps there really is an upcoming publication. Speculation is fun :-)

Kernel page-table isolation merged

Posted Jan 2, 2018 22:12 UTC (Tue) by riel (subscriber, #3142) [Link] (1 responses)

Speculation is fun indeed.

Kernel page-table isolation merged

Posted Jan 3, 2018 22:02 UTC (Wed) by MarcB (guest, #101804) [Link]

I think I get that now ;-)

Kernel page-table isolation merged

Posted Jan 10, 2018 0:31 UTC (Wed) by immibis (subscriber, #105511) [Link] (1 responses)

> Speculation is fun :-)

Foreshadowing!

Kernel page-table isolation merged

Posted Jan 10, 2018 1:50 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

This should be a quote of the week.

Kernel page-table isolation merged

Posted Dec 30, 2017 20:00 UTC (Sat) by bojan (subscriber, #14302) [Link]

Not sure whether I'm reading this correctly, but it seems 4.14.11-rc1 also received this. That's going to be interesting. If so, I'm looking forward to seeing this in action on F27.

Kernel page-table isolation merged

Posted Dec 30, 2017 20:22 UTC (Sat) by jhoblitt (subscriber, #77733) [Link] (4 responses)

If this is indeed a reaction to a high-severity vuln, I wonder whether PTI is going to need to be backported to distro kernels. RHEL/CentOS 6 kernels are based on 2.6.32... I'm sure that would be a fun port.

Kernel page-table isolation merged

Posted Dec 31, 2017 14:17 UTC (Sun) by amacater (subscriber, #790) [Link] (3 responses)

RHEL == Red Hat Enterprise Linux throughout. [I know Red Hat folk don't like the abbreviation but I'm lazy :) ]

If running RHEL/CentOS 6 - move your machines and code to version 7 quickly? There's no very good reason not to run 7 given that 6 is coming to the end of its supported life [November 2020 is less than three years away now] and 6.8 was likely the last major release.

Oh, and if you're not running RHEL 6.8 move to it now for your RHEL machines because there is a relatively supported upgrade script to take 6 machines forward to 7 [This no longer works well for CentOS machines because of version skew between 6 and 7 and the CentOS folk are looking for a new maintainer - but RHEL will presumably pay someone to hand fix it]

And 2.6.32 is well beyond support, realistically, for any but the very worst case: RHEL 7 has a 3.* kernel and I'd expect RHEL 8 betas any day now with a 4.* kernel (probably 4.4). I can't see anyone willingly supporting three supported versions of RHEL concurrently so 6 users are being encouraged to move to 7 now: there's no conceivable upgrade path to skip from 6 to 8 over 15 years worth of development.

Kernel page-table isolation merged

Posted Dec 31, 2017 16:04 UTC (Sun) by pbonzini (subscriber, #60935) [Link] (1 responses)

There are plenty of users still on RHEL 6. RHEL 6.9 was released last March, and I would be surprised if it's the last update.

The last RHEL 5 release was 5.11, and in fact 2020 will be when the last security updates come for RHEL 5, not 6.

Kernel page-table isolation merged

Posted Dec 31, 2017 19:41 UTC (Sun) by amacater (subscriber, #790) [Link]

Only if you're paying for extended lifecycle support - 5 was extended to 30/11/2020 as was 6 but only for RHEL. CentOS stopped supporting 5 this year, CentOS 6 will be 30/11/2020

Kernel page-table isolation merged

Posted Jan 2, 2018 13:46 UTC (Tue) by Wol (subscriber, #4433) [Link]

> there's no conceivable upgrade path to skip from 6 to 8 over 15 years worth of development.

As has been pointed out here before, I believe ... how many systems are installed and never upgraded? If you have no plans to upgrade the system ever, why start now?

Although, in that case, it would probably make sense to start migrating whatever apps you have that matter, across to a new system.

Cheers,
Wol

Kernel page-table isolation merged

Posted Jan 1, 2018 13:02 UTC (Mon) by arekm (guest, #4846) [Link]

Kernel page-table isolation merged

Posted Jan 1, 2018 23:28 UTC (Mon) by GhePeU (subscriber, #56133) [Link] (14 responses)

AMD CPUs could be unaffected by whatever vulnerability KPTI is meant to protect from: https://lkml.org/lkml/2017/12/27/2

The change isn’t in Linus’s tree yet, so I guess we’ll have to wait and see, but if the performance impact is significant on Intel processors and Ryzen and EPYC aren’t affected it could spell a serious problem for Intel.

Kernel page-table isolation merged

Posted Jan 2, 2018 0:00 UTC (Tue) by Felix_the_Mac (guest, #32242) [Link]

I guess now would be a good time to buy AMD stock

Kernel page-table isolation merged

Posted Jan 2, 2018 1:32 UTC (Tue) by andresfreund (subscriber, #69562) [Link] (8 responses)

> The change isn’t in Linus’s tree yet, so I guess we’ll have to wait and see, but if the performance impact is significant on Intel processors and Ryzen and EPYC aren’t affected it could spell a serious problem for Intel.

I don't think it'll be that bad. The patchset uses ASID / PCID support to reduce the syscall overhead on capable hardware (apparently Sandy Bridge onwards).

Kernel page-table isolation merged

Posted Jan 2, 2018 8:06 UTC (Tue) by Lionel_Debroux (subscriber, #30014) [Link]

Ivy Bridge and Skylake, which were spender's ( https://twitter.com/grsecurity ) benchmark targets, saw 34% and 29% slowdowns on a simple `du -s` benchmark. These are newer than Sandy Bridge and feature PCID, so barring bugs, the patchset should have leveraged that feature, shouldn't it?

Kernel page-table isolation merged

Posted Jan 2, 2018 8:51 UTC (Tue) by joib (subscriber, #8541) [Link]

https://www.realworldtech.com/westmere/ claims PCID was introduced already in the Westmere generation.

Kernel page-table isolation merged

Posted Jan 2, 2018 13:27 UTC (Tue) by sync (guest, #39669) [Link]

Fedora rawhide released a PTI-enabled kernel (4.15.0-0.rc6.git0.1.fc28.x86_64). I ran some tests with it. Syscalls are much slower with PTI enabled.

bench_03_getpid_vdso 2.7x slower
bench_11_read_vdso 2.0x slower
bench_21_write_vdso 2.3x slower

I had used this benchmark https://github.com/arkanis/syscall-benchmark on an Intel i5-4200U. Of course this is only a microbenchmark. The real world slowdown is much less.
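
For a rough feel of the same effect without building anything, here is a minimal sketch in Python. It is not the linked benchmark: it times `os.getppid()`, which is assumed here to enter the kernel on every call, against a no-op baseline. The helper names are made up, and interpreter overhead dominates, so only the difference between two kernels on the same machine means anything.

```python
import os
import time


def ns_per_call(func, calls=200_000):
    """Average wall-clock nanoseconds per call of `func` over `calls` calls."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        func()
    return (time.perf_counter_ns() - start) / calls


def noop():
    """Baseline that never enters the kernel."""
    return None


if __name__ == "__main__":
    base = ns_per_call(noop)
    # Assumption: os.getppid() results in a real system call each time.
    sys_ns = ns_per_call(os.getppid)
    print(f"noop: {base:.0f} ns/call, getppid: {sys_ns:.0f} ns/call")
```

Run it once on a kernel booted with PTI and once without, then compare the getppid figures after subtracting the baseline.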

Kernel page-table isolation merged

Posted Jan 2, 2018 22:32 UTC (Tue) by andresfreund (subscriber, #69562) [Link]

> > The change isn’t in Linus’s tree yet, so I guess we’ll have to wait and see, but if the performance impact is significant on Intel processors and Ryzen and EPYC aren’t affected it could spell a serious problem for Intel.

> I don't think it'll be that bad. The patchset uses ASID / PCID support to reduce the syscall overhead on capable hardware (apparently Sandy Bridge onwards).

Some numbers for postgres workloads that are more likely to suffer (i.e. very short statements with plenty of syscalls, rather than analytics workloads):
https://www.postgresql.org/message-id/20180102222354.qikj...

readonly pgbench (tpcb-like), 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 236629.778328

pti=on:
tps = 220791.228297 (~0.93x)

pti=on, nopcid:
tps = 198959.801459 (~0.84x)

To get closer to the worst case, I've also measured:

pgbench SELECT 1, 16 clients, i7-6820HQ CPU (skylake):

pti=off:
tps = 420490.162391

pti=on:
tps = 350746.065039 (~0.83x)

pti=on, nopcid:
tps = 324269.903152 (~0.77x)

So yea, definitely not fun :(
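
The ratios in parentheses can be reproduced from the tps figures above. A small sketch, with the dictionary layout and function name chosen only for illustration:

```python
# tps figures copied from the comment above (i7-6820HQ, 16 clients).
results = {
    "readonly pgbench": {
        "pti=off": 236629.778328,
        "pti=on": 220791.228297,
        "pti=on, nopcid": 198959.801459,
    },
    "pgbench SELECT 1": {
        "pti=off": 420490.162391,
        "pti=on": 350746.065039,
        "pti=on, nopcid": 324269.903152,
    },
}


def relative_throughput(runs, baseline="pti=off"):
    """Throughput of each run as a fraction of the baseline, to two decimals."""
    base = runs[baseline]
    return {
        name: round(tps / base, 2)
        for name, tps in runs.items()
        if name != baseline
    }


if __name__ == "__main__":
    for workload, runs in results.items():
        print(workload, relative_throughput(runs))
```

Put differently, PTI costs about 7% on the first workload and about 17% on the second, and turning PCID off raises that to roughly 16% and 23%.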

Kernel page-table isolation merged

Posted Jan 3, 2018 3:03 UTC (Wed) by rahvin (guest, #16953) [Link] (3 responses)

The benchmarks being reported could be catastrophic (5% to 50% performance degradation depending on workload). They've also added a flag to exempt AMD hardware so I'd presume AMD hardware is not vulnerable.

By all reports this is worse than the Intel lights-out firmware bug: it allows user space code to read protected kernel memory, conceivably allowing one VM to read the memory of another VM, per one of the scenarios I've seen. This has the potential to be Heartbleed plus a remotely exploitable memory read that can be executed by user space code, including JavaScript running in a browser. And it's hard-coded in Intel silicon, so the OS has to separate the kernel and user space page tables, resulting in major performance hits. Talk about ugly, and just like the firmware bug it's in every processor Intel has built for more than a decade.

This is beyond brutal and I expect it's going to exacerbate the AMD processor shortage, good news for AMD at least. Bad news for anyone running an internet connected server.

Kernel page-table isolation merged

Posted Jan 3, 2018 10:54 UTC (Wed) by cesarb (subscriber, #6266) [Link] (2 responses)

> they've also added a flag to exempt AMD hardware so I'd presume AMD hardware is not vulnerable.

Has the commit adding that test been merged already? So far, I've only seen it on the mailing list, but not on the kernel repository, so as far as I can see, AMD hardware is not yet exempted.

Kernel page-table isolation merged

Posted Jan 4, 2018 0:33 UTC (Thu) by rahvin (guest, #16953) [Link] (1 responses)

I don't know if it's been added, but the patch was posted by Tom Lendacky of AMD and he says pretty explicitly that AMD isn't vulnerable. Maybe it was premature and they actually are; I don't know, but I wish they'd lift the embargo and tell us. Particularly given the reports of Intel executives making large share sales.

https://lkml.org/lkml/2017/12/27/2

Kernel page-table isolation merged

Posted Jan 4, 2018 2:11 UTC (Thu) by mjg59 (subscriber, #23239) [Link]

The embargo was lifted 2 hours before your post :)

Kernel page-table isolation merged

Posted Jan 3, 2018 17:33 UTC (Wed) by jezuch (subscriber, #52988) [Link] (3 responses)

Interesting. If that's not considered a violation of x86 architecture spec then Intel really screwed up.

Kernel page-table isolation merged

Posted Jan 3, 2018 17:39 UTC (Wed) by jezuch (subscriber, #52988) [Link] (2 responses)

...I also wonder whether there's any (can there be any?) code out there that relies on Intel's behaviour and does not work on AMD hardware :) Or maybe this kind of problem only affects software APIs...

Kernel page-table isolation merged

Posted Jan 3, 2018 18:03 UTC (Wed) by MarcB (guest, #101804) [Link]

This is unlikely to be a simple, direct read of forbidden memory. Most likely, it is a timing side-channel.

So, no existing code will be affected (unless there is an existing exploit, of course).

Kernel page-table isolation merged

Posted Jan 3, 2018 18:20 UTC (Wed) by excors (subscriber, #95769) [Link]

There can be - for example any code that tries to exploit this vulnerability will 'work' on Intel hardware and not AMD. But if you exclude code that has been specifically designed to exploit it, it seems unlikely.

CPUs are usually allowed to fetch whatever memory they fancy whenever they fancy (which gives them freedom to continually improve caches, prefetchers, speculative execution, etc), which is safe because it has no effect on application behaviour except in terms of instruction timing (and performance counters etc). Intel and AMD fetch memory quite differently, and different generations and models of CPUs by a single vendor fetch memory quite differently, so nobody writes applications that depend precisely on instruction timing. Except for people intentionally using timing attacks to extract information that exists inside the CPU pipeline but that wasn't meant to be observable to applications, which is presumably the problem here. (And except for buggy code with race conditions that get triggered by these timing changes, but that's so broken anyway that it doesn't really matter.)

Kernel page-table isolation merged

Posted Jan 3, 2018 4:58 UTC (Wed) by deater (subscriber, #11746) [Link] (5 responses)

The recent PAPI 5.6 performance library release apparently happened just in time.

It added code using rdpmc while doing self-monitoring reads of the perf_event performance counters.

On haswell a counter read:
Using rdpmc (avoids kernel): 142 cycles
4.14 kernel using read() syscall: 913 cycles.
4.15-rc6 kernel using read() syscall: 1360 cycles.

That's a huge increase when you're trying to do low-latency measurements.
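
To put those three figures side by side, a quick bit of arithmetic (cycle counts copied from above; the helper name is just for illustration):

```python
# Cycles per counter read on Haswell, as reported above.
cycles = {
    "rdpmc": 142,
    "read() on 4.14": 913,
    "read() on 4.15-rc6": 1360,
}


def percent_increase(old, new):
    """Relative increase from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    old = cycles["read() on 4.14"]
    new = cycles["read() on 4.15-rc6"]
    print(f"read() got {percent_increase(old, new):.0f}% slower")
    print(f"read() vs rdpmc on 4.14: {old / cycles['rdpmc']:.1f}x")
    print(f"read() vs rdpmc on 4.15-rc6: {new / cycles['rdpmc']:.1f}x")
```

So the syscall path went from roughly 6.4 times the cost of rdpmc to roughly 9.6 times, a 49% jump between the two kernels.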

Kernel page-table isolation merged

Posted Jan 3, 2018 17:51 UTC (Wed) by Paf (subscriber, #91811) [Link] (4 responses)

Do you know your exact CPU? Do you have PCID support? (I guess that means Westmere and newer)
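
One way to answer the PCID question on an x86 Linux box is to look for `pcid` (and `invpcid`) on the `flags` line of /proc/cpuinfo. A minimal sketch, with made-up function names; other architectures use a different line name, and the flag being present does not prove the kernel actually uses it:

```python
def cpu_feature_flags(cpuinfo_text):
    """Return the names listed on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pcid_support(cpuinfo_text):
    """Report whether the pcid and invpcid feature flags are listed."""
    flags = cpu_feature_flags(cpuinfo_text)
    return {"pcid": "pcid" in flags, "invpcid": "invpcid" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(pcid_support(f.read()))
    except OSError:
        print("could not read /proc/cpuinfo")
```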

Kernel page-table isolation merged

Posted Jan 3, 2018 18:04 UTC (Wed) by zdzichu (subscriber, #17118) [Link] (3 responses)

On a side note, PCID seems to have been a feature of Intel CPUs for 8 (eight!) years now, yet it seems it only got used with kernel 4.14 and later. It is strange to see so many years pass before it got used. Are there more such silicon features ignored by Linux?

Kernel page-table isolation merged

Posted Jan 3, 2018 18:37 UTC (Wed) by Wol (subscriber, #4433) [Link]

Normal development timetable ...

One layer adds a nifty feature. Next layer takes maybe a year to realise the feature exists and work out how to take advantage of it. Allow a year for debugging. Next layer up realises ... rinse lather and repeat.

Innovations typically take TWENTY TO THIRTY YEARS to filter through. How long did it take before the internet (in the form of the WWW) became ubiquitous? Around in the 70s, pushed and *financed* by Gore in the 80s, the web invented in the 90s (as a minor improvement on Gopher!), and ubiquitous by the 2000s. How long is that?

Cheers,
Wol

Kernel page-table isolation merged

Posted Jan 3, 2018 18:40 UTC (Wed) by deater (subscriber, #11746) [Link]

> Are there more such silicon features ignored by Linux?

Definitely. See the sad story that was AMD's Light-Weight Profiling instructions. Though it looks like they gave up on Linux support ever being added and have dropped support for the feature.

Kernel page-table isolation merged

Posted Jan 3, 2018 19:14 UTC (Wed) by Lionel_Debroux (subscriber, #30014) [Link]

PaX and therefore grsecurity have been taking advantage of PCID for years, to get either a stronger (default) or a faster version of the MEMORY_UDEREF protection. But indeed, the mainline Linux kernel only started using PCID very recently.

Kernel page-table isolation merged

Posted Jan 3, 2018 12:12 UTC (Wed) by sorokin (guest, #88478) [Link] (1 responses)

The Register published an article about the issue: Intel processor design flaw forces Linux, Windows redesign. The article refers to a blog post worth reading: Negative Result: Reading Kernel Memory From User Mode.

Kernel page-table isolation merged

Posted Jan 3, 2018 17:02 UTC (Wed) by vstinner (subscriber, #42675) [Link]

I like the last sentence of the blog article: "Further we see that speculative execution does not consistently abide by isolation mechanism, thus it’s a haunting question what we can actually do with speculative execution." Anders Fogh, July 2017

Kernel page-table isolation merged

Posted Jan 3, 2018 16:56 UTC (Wed) by microchp (guest, #93814) [Link] (5 responses)

When this is back-ported to older kernels, such as in RHEL 3.x, will there be a toggle to preserve the old behavior?

I am asking for the folks that will need time to mitigate the performance hit and may have servers with a lower risk profile.

Kernel page-table isolation merged

Posted Jan 3, 2018 17:00 UTC (Wed) by vstinner (subscriber, #42675) [Link] (1 responses)

"There will be a nopti command-line option to disable this mechanism at boot time."
https://lwn.net/Articles/741878/
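
To see what the running kernel was booted with, one can look at /proc/cmdline. A minimal sketch that only checks for the spellings mentioned in this thread (`nopti`, `pti=off`, `nopcid`); the function name is made up, and the sketch does not try to resolve conflicting or repeated options:

```python
def pti_boot_options(cmdline):
    """Report which PTI-related switches appear on a kernel command line."""
    tokens = cmdline.split()
    return {
        "pti_disabled": "nopti" in tokens or "pti=off" in tokens,
        "pcid_disabled": "nopcid" in tokens,
    }


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(pti_boot_options(f.read()))
    except OSError:
        print("could not read /proc/cmdline")
```

This shows only what was requested at boot; a kernel may still leave PTI off for other reasons, such as hardware it considers unaffected.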

Kernel page-table isolation merged

Posted Jan 3, 2018 18:12 UTC (Wed) by microchp (guest, #93814) [Link]

Thank you, I missed that.

Kernel page-table isolation merged

Posted Jan 3, 2018 20:10 UTC (Wed) by cesarb (subscriber, #6266) [Link] (2 responses)

> When this is back-ported to older kernels, such as in RHEL 3.x

Aren't RHEL 3.x and 4.x completely out of support already?

Kernel page-table isolation merged

Posted Jan 3, 2018 20:36 UTC (Wed) by rahulsundaram (subscriber, #21946) [Link]

RHEL 3, 4 and 5 are EOL. Extended support is available for 5, IIRC.

Kernel page-table isolation merged

Posted Jan 9, 2018 16:47 UTC (Tue) by microchp (guest, #93814) [Link]

Sorry, I was multitasking in the Spectre/Meltdown discussions and meant RHEL7 on the 3.10.x kernel.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds