How to unbreak LTTng
LTTng is a tracing subsystem; to carry out that sort of task, it must be able to hook into the kernel in a number of fairly deep places. It is unsurprising that LTTng was accessing parts of the kernel that are not deemed suitable for export to modules in general. Losing access to kallsyms_on_each_symbol() deprived LTTng of the ability to find those addresses, thus breaking much of its functionality. That is not welcome news to those who work on — or use — LTTng.
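To make the lookup pattern concrete, here is a minimal user-space sketch of finding an address by walking a symbol table with a callback. It is an illustration, not LTTng's code. The table, the addresses, and every `toy_` name are invented for this example. The callback shape loosely follows the kernel iterator of that era, which also passed a `struct module *` argument that is dropped here.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the kernel symbol table; the addresses are made up. */
struct toy_symbol {
    const char *name;
    unsigned long addr;
};

static const struct toy_symbol toy_symtab[] = {
    { "task_prio",             0x1000UL },
    { "disk_name",             0x2000UL },
    { "stack_trace_save_user", 0x3000UL },
};

/* Callback type: return nonzero to stop the walk early. */
typedef int (*toy_symbol_fn)(void *data, const char *name, unsigned long addr);

/* Call fn once per symbol, stopping at the first nonzero return value. */
static int toy_on_each_symbol(toy_symbol_fn fn, void *data)
{
    size_t i;

    for (i = 0; i < sizeof(toy_symtab) / sizeof(toy_symtab[0]); i++) {
        int ret = fn(data, toy_symtab[i].name, toy_symtab[i].addr);

        if (ret)
            return ret;
    }
    return 0;
}

/* State shared between the caller and the matching callback. */
struct toy_lookup {
    const char *wanted;
    unsigned long addr;
};

/* Record the address and stop the walk when the wanted name is seen. */
static int toy_match(void *data, const char *name, unsigned long addr)
{
    struct toy_lookup *lookup = data;

    if (strcmp(name, lookup->wanted) != 0)
        return 0;
    lookup->addr = addr;
    return 1;
}

/* Return the address of the named symbol, or 0 if it is not in the table. */
static unsigned long toy_find_symbol(const char *name)
{
    struct toy_lookup lookup = { name, 0 };

    toy_on_each_symbol(toy_match, &lookup);
    return lookup.addr;
}
```

Calling `toy_find_symbol("disk_name")` yields the table's address for that entry, while a name that is not in the table yields 0. A module doing the equivalent against the real symbol table can reach any function, exported or not, which is why losing the iterator matters so much to LTTng.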
LTTng developer Mathieu Desnoyers has responded to this change with a patch series exporting a number of new symbols; with those available, LTTng can do what it needs to do without using the rather more general kallsyms_on_each_symbol() function. For example, LTTng needs access to stack_trace_save_user() to be able to save user-space stack traces. It also needs access to functions like task_prio(), disk_name(), and get_pfnblock_flags_mask(). LTTng obtains kernel information from tracepoints as well, of course, and that usage will increase as tracepoints replace some of the direct internal accesses that were used before. The patch set raises the number of arguments that can be passed to a BPF program from a tracepoint to an eye-opening 13 (to allow more information to be passed out via a specific tracepoint), but that change may prove to be unnecessary in the end.
Anybody who has watched the kernel community for any period of time can
probably guess what sort of reception this patch series received.
Christoph Hellwig was characteristically
blunt: "Which part of every added export needs an in-tree user
did you not get?" The kernel community as a whole is strongly
resistant to the idea of adding any sort of support for code that is
outside of the kernel repository. Much of that resistance comes from a
dislike for proprietary kernel modules in general, but there is a bit more
to it than that.
LTTng, being free software, should not be affected by any antipathy for proprietary kernel code. But, as Greg Kroah-Hartman explained, there are still reasons to avoid adding support for free, out-of-tree modules. Once those modules are supported in some way, they add constraints to what kernel developers can do. Internal kernel interfaces can be changed as needed; since all of the users of those interfaces are present in the same code base, they can be changed at the same time. If external modules have to be supported, though, it becomes harder to make such changes, since the users cannot be changed to match. Indeed, it becomes difficult to even know when a change might cause problems elsewhere.
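One way to picture the constraint is the version-guarded compatibility code that out-of-tree modules tend to accumulate. The sketch below builds in user space and is only an illustration. `hypo_core_op()` and `compat_core_op()` are hypothetical names, and the idea that an argument was added in 5.7 is an assumption made for the example. Only the `KERNEL_VERSION()` encoding mirrors the kernel's macro of that era, redefined here so that the block stands alone.

```c
/* Same encoding as the kernel's KERNEL_VERSION() macro of that era,
 * redefined locally so this sketch builds outside the kernel tree. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption for the example: pretend the build targets a 5.7 kernel. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(5, 7, 0)
#endif

/* Hypothetical internal interface that gained a "flags" argument in 5.7. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 7, 0)
static int hypo_core_op(int value, unsigned int flags)
{
    return flags ? -value : value;
}
#else
static int hypo_core_op(int value)
{
    return value;
}
#endif

/* Wrapper the out-of-tree module calls; it hides the signature change. */
static int compat_core_op(int value)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 7, 0)
    return hypo_core_op(value, 0);
#else
    return hypo_core_op(value);
#endif
}
```

An in-tree caller would simply be updated in the same patch that changed `hypo_core_op()`. The out-of-tree module has to carry both branches for as long as it supports both kernels, and nobody changing the interface is in a position to notice if the wrapper goes stale.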
Thus, Kroah-Hartman said:
This all suggests that there is not much of a path forward for LTTng. It is unable to function without access to kernel internals, and that access is being expressly denied.
There is, of course, one other option that was first raised
by Steve Rostedt: "I guess we should be open to allowing LTTng
modules in the kernel as well". If LTTng were actually a part of
the mainline kernel, there would no longer be problems with giving access
to the resources that it needs.
This is not a new idea. Numerous attempts have been made to get the LTTng code into the mainline kernel, without success. In the early days, before the kernel had any sort of tracing capability at all, adding that feature was a hard sell. Kernel developers now are heavily dependent on tracing for their own work and would strongly resist any attempt to take that capability away, but it was not that long ago that many of the same developers were unconvinced that tracing was needed at all. During that time, getting any tracing features into the kernel was not easy.
Over time, some low-level LTTng code found its way in, but LTTng as a whole has not followed. More recently, in 2011, LTTng was brought into the staging tree by Kroah-Hartman as a first step toward merging it. That move brought about a great deal of hostility, some of which seems familiar; a rather lengthy thread was set off by an attempt to export task_prio(), for example. In the end, LTTng was pushed back out of the staging tree and out of the kernel, where it had been before and has remained ever since.
So LTTng would appear to be in a difficult position: unable to function
outside of the kernel, and unable to be merged. Leaving LTTng broken would
cause serious harm to a lot of users, though, and seems unlikely to advance
the cause of Linux or free software in general. So perhaps the time has
come for something to give. If a handful of symbols truly cannot be
exported for this subsystem, perhaps some space could be found in the
mainline for a widely used tracing subsystem, even if it somehow duplicates
some of the functionality that is already there.
Posted Apr 20, 2020 23:05 UTC (Mon)
by cesarb (subscriber, #6266)
Back in the day, things like that would be distributed as a patch set against the upstream kernel, and users would be expected to recompile their kernel with it. Is that no longer an option?
Posted Apr 21, 2020 1:05 UTC (Tue)
by neilbrown (subscriber, #359)
I suspect that today a more realistic approach is to ask the various distributors to apply that patch to their kernels.
I think it is worth reflecting for a moment on the motivation behind these changes. They seem to be coming from Android. The Android kernel has good reason to lock down the exported interfaces so that phone vendors cannot 'abuse' them. I fully support that work, but don't think that it should necessarily impose on what I do on my device or what a distro does with their supported kernel.
My preference would be that kallsyms_lookup_name() could be exported or not depending on a CONFIG option. Android could set that to hide the function and LTTng wouldn't work on Android. Probably no loss there.
Posted Apr 21, 2020 11:50 UTC (Tue)
by dyfrgi (guest, #122539)
Is that really adequate? Wouldn't phone vendors just set the config flag? Though tbh I also don't understand why they wouldn't just patch the kernel. My understanding is that they all do it for hardware support anyway.
Posted Apr 21, 2020 12:11 UTC (Tue)
by smurf (subscriber, #17840)
Posted Apr 22, 2020 0:25 UTC (Wed)
by neilbrown (subscriber, #359)
Quoting from https://lwn.net/Articles/813350/
> But that only holds if modules are restricted to the exported-symbol interface; if they start to reach into arbitrary parts of the kernel, all bets are off. Deacon doesn't say so, but it seems clear that some vendors are, at a minimum, thinking about doing exactly that.
So while I agree with what you say, I don't think it is relevant to my comment.
Posted Apr 22, 2020 11:52 UTC (Wed)
by pbonzini (subscriber, #60935)
Posted Apr 22, 2020 14:38 UTC (Wed)
by Wol (subscriber, #4433)
Google can easily stop that. There's nothing to prevent vendors doing it, but Google can just say "if you do that, you can't call it Android".
Cheers,
Posted Apr 22, 2020 16:33 UTC (Wed)
by pizza (subscriber, #46)
...And then those companies start complaining to their various national anti-trust bodies, and Google gets threatened with billion-dollar fines.
Posted Apr 22, 2020 18:41 UTC (Wed)
by Wol (subscriber, #4433)
Cheers,
Posted Apr 22, 2020 20:26 UTC (Wed)
by excors (subscriber, #95769)
(You can still freely use AOSP and ignore those requirements as long as you don't call it Android, and some large companies that compete with Google already do that, which makes it harder to argue that Google is being anti-competitive here.)
Posted Apr 22, 2020 21:15 UTC (Wed)
by pizza (subscriber, #46)
You forget that Google is obviously the only reason why $DomesticBusinessSector is not making lots of money, so any restrictions they impose upon folks using their stuff is clearly anticompetitive behavior that must be harshly punished.
(I'm not saying I agree with this, but many others do)
Posted Apr 24, 2020 16:00 UTC (Fri)
by ballombe (subscriber, #9523)
True, however if you do not call it Android, you cannot ship the Google apps with it. So there are some real consequences.
Posted Apr 22, 2020 16:38 UTC (Wed)
by corbet (editor, #1)
Posted Apr 21, 2020 12:08 UTC (Tue)
by ncultra (✭ supporter ✭, #121511)
It gets worse when considering the purpose of a symbol such as kallsyms_on_each_symbol() which is to be a support to developers, and is critical for maintaining a driver or module that can be compiled and linked on multiple kernel versions. This is an accepted practice that is now made much more difficult. In my opinion, this philosophy that punishes out-of-tree developers (regardless of license choice) is counter-productive to the continued existence of Linux as we know it.
Posted Apr 21, 2020 13:28 UTC (Tue)
by mfuzzey (subscriber, #57966)
Generally excluding duplicate functionality seems to be a good idea for long term maintenance and code size surely.
When competing solutions exist the best bits of each can be, and often are it seems to me, merged to make a better in tree solution.
And in this particular case Steve Rostedt was quoted in the article as being open to including LTTng.
> In-tree status is unrealistic for many free kernel modules
Why? For non-free ones sure but you explicitly said free modules.
> to punish out-of-tree developers.
Why? Sure, some decisions may make their life harder, but "punish" implies an "intention to hurt". What makes you think this is the case rather than just trying to do what is best for the kernel as a whole (even if that makes things harder for out-of-tree modules)?
> with the lack of a stable kernel API
The reasons for this are well documented and make sense to many. This is really part of the Linux philosophy today I'd say. No one forces anyone to use Linux.
Posted Apr 21, 2020 14:09 UTC (Tue)
by Paf (subscriber, #91811)
It’s (in many cases) a huge amount of ongoing work, requiring not only commitment and effort but persuasion of others that you have commitment and effort.
The idea is that not everything that exists is up to the standards for mainline inclusion, and not everything that exists is going to avoid duplication to the level that’s appropriate for the mainline.
This is “let a thousand flowers bloom” stuff - it’s about being open to a world of stuff beyond the domain of what’s good enough to be a permanent part of mainline. Hobby projects (that still have outside users), small stuff, differing approaches to existing functionality that are not clearly better in general (see LTTng for an example of that) but are preferred by some users, etc.
I recognize this is a sticky issue (well, sort of), but this is still a choice to be more closed.
Posted Apr 28, 2020 21:35 UTC (Tue)
by ecree (guest, #95790)
Maintaining an in-tree module is *way* less work than maintaining an out-of-tree one. Heck, most of the time you just have to say "Ack" on the patches that the person changing some kernel infrastructure _writes for you_ to keep your module working. Whereas out-of-tree you end up with elaborate compatibility scripts, that you have to keep updated without any outside help, unless you're happy for your module to either only build on the latest kernel, or be tied to 2.6.32 forever 'cos that's what was ubiquitous when you started the project.
If it's tight enough with the kernel to be a module (rather than a userspace executable, using the APIs that are actually designed to be stable), then it belongs in the upstream tree; and if it's not good enough quality to be upstream (even in staging or behind a 'default n' kconfig) then it doesn't belong anywhere.
Posted Apr 28, 2020 22:06 UTC (Tue)
by mpr22 (subscriber, #60784)
Posted Apr 21, 2020 16:10 UTC (Tue)
by nilsmeyer (guest, #122604)
What about SELinux and AppArmor (and Smack and TOMOYO)? There are probably a lot of other examples. It often feels arbitrary to the outside observer.
Posted Apr 22, 2020 14:42 UTC (Wed)
by Wol (subscriber, #4433)
Because there's evidence that some kernel maintainers, at least, do not make these decisions based on technical evidence, but on how much grief it's going to cause people that these maintainers perceive as "bad actors".
The fact that it causes a lot of grief to innocent bystanders does not get taken into account.
Cheers,
So if distros want to patch out these restrictions, that might make perfect sense.
Of course it could work the other way around - Android could patch in these restrictions. But we have a long history of trying to bring Android back to mainline - and requiring them to patch in the restrictions would hurt the momentum we have.
Other distros could set the config option the other way and LTTng would work fine on them.
When I build my own kernels, I can (of course) do whatever I want.
Problem solved.
Thanks.
That is already a part of the plan as I understand it. The kernel will become part of the generic system image, so it's provided by Google, not the device vendors, who are limited to providing kernel modules. They will indeed not be able to set that config flag.