
Shrinking the kernel with link-time optimization


Posted Jan 23, 2018 15:14 UTC (Tue) by mirabilos (subscriber, #84359)
Parent article: Shrinking the kernel with link-time optimization

Now if only the GCC developers would actually care about LTO and not break it every other version… mksh’s regression testsuite picks up miscompilations fairly well, and since GCC 6 LTO has constantly been broken *again*. I don’t even bother reporting this any more because they don’t care, and it’s so bad that I’m disabling LTO support in the mksh build script, because distros blindly use it and then complain instead of fixing their compilers (what fix?).



Shrinking the kernel with link-time optimization

Posted Jan 23, 2018 17:03 UTC (Tue) by peter-b (subscriber, #66996)

Maybe you should ask for your money back?

Shrinking the kernel with link-time optimization

Posted Jan 24, 2018 16:15 UTC (Wed) by ncm (guest, #165)

Congratulations on entirely missing the point.

It is a matter of respect. When one person puts in substantial efforts to improve the kernel in a way that is important to them and to many other users, a trivial effort not to massively break those improvements would hint that you have something better than contempt for the people who do the work.

To complain that LTO doesn't work would be to demand people work for you for free. Making LTO work is contributing. Improvements of any kind typically take far more effort than it is worth to the individual, who is doing the rest of us that favor. Minimal effort not to break others' work is necessary to a healthy project.

Shrinking the kernel with link-time optimization

Posted Jan 25, 2018 12:58 UTC (Thu) by mirabilos (subscriber, #84359)

This is great. One replier misses the point completely, the other assumes I’m not contributing enough and asks me to fix it myself.

Look up my contributions, if you so desire… if you can find them all, that is; I personally *don’t* even know all the places I’ve had my fingers in over the last decades.

It’s just that compilers aren’t what I do well. I’ve patched bootloaders, I’ve got fixes in all Linux libcs except musl (dalias does a great enough job for me to not find any bugs in musl so far), and I’ve been doing tons of other work, and I’m even now expanding to other stuff.

There just aren’t enough hours in a day, considering I have a regular, boring $dayjob. Do you wish to sponsor me for a year so I can take a sabbatical and work only on OSS?

Shrinking the kernel with link-time optimization

Posted Jan 26, 2018 22:20 UTC (Fri) by giraffedata (guest, #1954)

ncm didn't say you don't contribute enough. He said at most that it would be preferable for you to fix GCC (repeatedly, apparently) rather than complain that it keeps breaking. (And that's not saying you should fix GCC.)

And even that is based on some reading between the lines about the moral value of demanding versus contributing, and the idea that complaining about something is demanding that someone fix it. I don't view complaining that way; for example, I complain about the weather all the time without meaning to criticize anyone or demand that someone fix it. I complain about the presence of ads on YouTube the same way.

Shrinking the kernel with link-time optimization

Posted Jan 26, 2018 22:46 UTC (Fri) by mirabilos (subscriber, #84359)

The problem isn’t even about fixing vs. not fixing; the problem is that
GCC developers seem to be uninterested in LTO bugs, and for its predecessor
(-fwhole-program --combine) they outright said it won’t get fixed.

I actually find the idea of using LTO to eliminate dead code, in the
Linux kernel or elsewhere, great — I just wanted to point out that GCC
might, with its current bugs, history of bugs, and history of attitude
towards said bugs¹, be a tad too unreliable to do so without excessive
tests that point out miscompiled builds.

¹ I read “low-hanging fruits” in an LWN article today. One of these,
for the GCC/LTO problem, would be to make building mksh part of the
usual pre-release tests; mksh has a history of spotting compiler,
toolchain, libc, etc. bugs via its testsuite.

Now, with both footnote ¹ and the first paragraph in mind, let’s get to
something: isolating the issue is *hard*. The mksh testsuite is a
bunch of shell scripts together with flags and expected output, with
a Perl driver, run through the shell compiled with the to-be-tested
compiler/toolchain/libc. That’s a few levels of indirection. The latest
LTO bug occurs in only one testcase: arith-ternary-prec-1, which is:

$ mksh -c 'typeset -i x=2; y=$((1 ? 20 : x+=2))'
mksh: 1 ? 20 : x+=2: += requires lvalue

Basically, ?: binds more tightly than +=, so this is '(1 ? 20 : x) += 2',
and a miscompiled shell silently accepts this. This is *very* hard
to isolate.
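
As a rough sketch of how that one check could be pointed at an arbitrary shell binary: the helper name check_ternary_prec and its messages are made up for illustration and are not part of the mksh testsuite, and the sketch assumes the shell under test exits non-zero when it reports the lvalue error.

```shell
# Hypothetical helper, not part of the mksh testsuite.
# Usage: check_ternary_prec /path/to/shell
# Returns 0 when the shell under test rejects the expression,
# 1 when it silently accepts it, as a miscompiled one would.
check_ternary_prec() {
	shell_under_test=$1
	if "$shell_under_test" -c 'typeset -i x=2; y=$((1 ? 20 : x+=2))' >/dev/null 2>&1; then
		echo "FAIL: $shell_under_test accepted '(1 ? 20 : x) += 2'"
		return 1
	fi
	echo "ok: $shell_under_test rejected the assignment to a non-lvalue"
	return 0
}
```

Checking only the exit status is a simplification: any other failure (say, a shell without typeset) would also count as "ok" here, whereas the testsuite proper compares against expected output.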

GCC developers prefer isolated small test cases. Now, with LTO,
isolating gets even more complicated. I can accept that not having
a small isolated test case is not desirable.

On the other hand, a change in testsuite output between two different
versions of the same compiler, ceteris paribus (i.e. you try the same
version of the testsuite, shell, toolchain, libc, …), *does* indicate
a problem (not necessarily in the compiler, but it’s a prime suspect),
and in the time of “git bisect” it’s at least often possible, for someone
with hardware beefy enough to actually build GCC that often, to figure
out which compiler change introduced the breakage. (Then, it’s still a
matter of deciding whether the bug is actually in the compiler or
elsewhere, but the GCC developers at least know their compiler, and each
other on the development team.)
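
To make the bisection idea a bit more concrete, here is a minimal sketch of a verdict function for use with "git bisect run". The name bisect_verdict and both command arguments are placeholders; how GCC and mksh actually get built at each step is left entirely to the caller. The only fixed part is git bisect's exit-code convention: 0 means good, 125 means "skip this commit", and 1 means bad.

```shell
# Hypothetical wrapper for "git bisect run"; build_cmd and test_cmd are
# placeholders supplied by the caller.
# Exit codes follow git bisect's convention:
#   0 = good, 125 = skip (commit cannot be tested), 1 = bad.
bisect_verdict() {
	build_cmd=$1
	test_cmd=$2
	# A compiler revision that does not even build says nothing about
	# the miscompilation, so ask git bisect to skip it.
	sh -c "$build_cmd" >/dev/null 2>&1 || return 125
	# Testsuite, shell source, libc etc. stay fixed; only the compiler
	# revision varies between runs.
	sh -c "$test_cmd" >/dev/null 2>&1 || return 1
	return 0
}
```

A script that defines this function and ends by calling it with the real build and test commands, then exiting with its status, could be handed to "git bisect run" after marking one known-good and one known-bad compiler revision.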

Oh, and: the Linux kernel does not have such a testsuite. Several GNU
distributions’ mksh package maintainers have come to me, independently,
with a testsuite failure report about the above test, and the advice
found after the first analysis (LTO is at fault, GCC miscompiled mksh)
made them compile mksh without LTO, preventing their users from getting
a faulty binary that might misbehave in other situations as well. Now,
the Linux kernel, not so much.

Food for thought?

Shrinking the kernel with link-time optimization

Posted Feb 8, 2018 10:49 UTC (Thu) by dharding (subscriber, #6509)

Idle curiosity: I'm wondering (though I'm not expecting anyone in this thread to have a ready answer) how many of the problems in LTO builds are specific to LTO, and how many are generic optimization bugs exposed because LTO provides more opportunity for optimization.

Shrinking the kernel with link-time optimization

Posted Feb 8, 2018 20:03 UTC (Thu) by mirabilos (subscriber, #84359)

That’s an extremely interesting point.

And, yes, sorry, I don’t have even the beginning of an answer for you,
but someone with a powerful enough machine could certainly bisect this
between GCC 5 and 6, I think…


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds