"Unstable kernel APIs" vs. the embedded reality :-(
"Unstable kernel APIs" vs. the embedded reality :-(
Posted Jan 13, 2018 11:22 UTC (Sat) by darwish (guest, #102479)
Parent article: Opening up the GnuBee open NAS system
I mean, we already break the Linux ecosystem into a "kernel" and "user-space", with a very stable API in between.
Maybe it's the right time to break the kernel itself into more components, with stable APIs in between? It's been like what, 25 years now, and we cannot have a proper internal stable API?
I've seen so many horrible SoC kernels in my day-job embedded Linux work that it just became the norm -- all in very reputable companies and teams. Everyone just gave up on this and is dealing with the ugly reality.
Yeah, I've read gregkh's treatise on the topic, "stable-api-nonsense.txt", etc. But really, why divide the stable boundaries of the Linux ecosystem into just two components: kernel and user-space? Was this an explicit decision, or honestly just a historical accident?
Linus himself said multiple times, especially at Debconf, that the userspace distribution folks not providing stable APIs are "crazy". So, by his own admission, the stable boundaries in the Linux ecosystem must be __more than one__. So maybe it's the right time now to ask ourselves if treating the whole kernel, internally, as a bundle of unstable APIs, is the right decision.
Google itself has realized how unworkable this development model is for the embedded industry and is working on a whole new kernel to solve this. Maybe we should wake up and re-investigate our decisions before it's too late.
I won't be arrogant and claim that I know the answer for such an intractable problem, but starting from "set-in-stone" propositions does not seem to be wise when the whole industry is kind of against you at this point.
Posted Jan 13, 2018 15:45 UTC (Sat) by gregkh (subscriber, #8)
Seriously, it's a non-trivial thing to do. Lots of people and companies have tried over the decades of operating system development, and no one has managed it for a general-purpose operating system that supported more than just a handful of hardware devices/types.
Remember, if your in-kernel API is not changing over time, your operating system is dead :)
People forget that change happens because it has to, as the environment or requirements change over time; it's not done just because it is fun to do so. Well, not usually...
Anyway, take a look at how the enterprise distro kernels do this type of thing: they define a subset of all in-kernel APIs and work hard to keep that stable over time. That works well for their target market, and is also something that embedded people should copy.
But, to get back to the main topic of this article, I fail to see how a stable API would solve any of the issues Neil had here. Which ones do you think it would have helped resolve?
Posted Jan 13, 2018 23:42 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
Both are exceptionally stable. My filesystem driver from 2005 still works on Windows 8 with only build system adjustments.
Posted Jan 20, 2018 0:00 UTC (Sat) by giraffedata (guest, #1954)
You mean it's source code compatible? You have to recompile your filesystem driver?
Long ago, I worked on AIX filesystem and device drivers, and AIX was binary backward compatible with kernel extensions. I could plug a binary driver developed for AIX 4 into an AIX 5 kernel.
Making this possible made AIX considerably more complex and the kernel extension APIs harder to use, so I understand why Linux kernel developers enjoy not having the backward compatibility obligation.
Posted Jan 20, 2018 5:26 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
Posted Jan 13, 2018 21:08 UTC (Sat) by nix (subscriber, #2304)
Posted Jan 13, 2018 22:22 UTC (Sat) by neilbrown (subscriber, #359)
No API is really stable - we add things to the user->kernel API all the time. You might have noticed that a new event notification API was proposed recently. If we add things, then the API isn't really "Stable", it is "Changing".
If legacy driver-API support was really important to someone, I suspect they could create a shim which exports exactly that API. Then they can keep the shim up-to-date as the kernel changes. Probably the most difficult part is creating a good API definition (LWN readers know that we suck at designing new APIs) and ensuring drivers adhere to the intent of the API. I think that is a lot of work, but maybe it would be worth it for someone.
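To make the shim idea concrete, here is a minimal userspace sketch in C. It is not kernel code: every name in it (`legacy_ops`, `current_ops`, `legacy_shim_xmit`, `core_transmit`, `old_driver`) and both calling conventions are invented for illustration. The assumption is that the "current" core passes a private pointer and a `size_t` length and expects a byte count back, while the legacy driver takes an `int` length and returns a success flag.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical legacy driver API: returns 1 on success, 0 on failure. */
struct legacy_ops {
    int (*send)(const unsigned char *buf, int len);
};

/* Hypothetical current API: returns bytes sent, or a negative value. */
struct current_ops {
    long (*xmit)(void *priv, const unsigned char *buf, size_t len);
};

/* The shim: speaks the current API to the core and the legacy API to
 * the driver. 'priv' carries the legacy driver's operations table. */
static long legacy_shim_xmit(void *priv, const unsigned char *buf, size_t len)
{
    const struct legacy_ops *old = priv;

    assert(old != NULL && old->send != NULL);
    if (len > INT_MAX)          /* the legacy API cannot express this */
        return -1;
    return old->send(buf, (int)len) ? (long)len : -1;
}

const struct current_ops legacy_shim_ops = { .xmit = legacy_shim_xmit };

/* The core only ever knows about the current API. */
long core_transmit(const struct current_ops *ops, void *priv,
                   const unsigned char *buf, size_t len)
{
    return ops->xmit(priv, buf, len);
}

/* A made-up legacy driver that refuses anything longer than 4 bytes. */
static int old_send(const unsigned char *buf, int len)
{
    (void)buf;
    return len <= 4;
}

struct legacy_ops old_driver = { .send = old_send };
```

Calling `core_transmit(&legacy_shim_ops, &old_driver, buf, len)` drives the old driver without the core knowing about it. When the core's API changes, only `legacy_shim_xmit` needs updating, which is the maintenance burden the comment describes.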
Posted Jan 13, 2018 22:27 UTC (Sat) by jhoblitt (subscriber, #77733)
A much better solution would be to simply not buy hardware that does not have in-tree support. I realize that is easier said than done with this type of hardware but is, AFAIK, possible.
I even hope to someday follow my own advice WRT GPUs. Sigh.
Posted Jan 14, 2018 12:10 UTC (Sun) by hubcapsc (subscriber, #98078)
I'd like to get one of those studly ThinkPad P71s with the 17 inch screen, but they have NVIDIA graphics.
So I'll get an X1 Carbon with Intel graphics and a 14 in screen...
-Mike "and squint more..."
Posted Jan 25, 2018 11:28 UTC (Thu) by Wol (subscriber, #4433)
I'll just say that I've got a 17" laptop (an old Tosh), and while I do like large screens, imho a 17" screen on a laptop makes the whole caboodle that little bit too big. 15" seems the sweet spot.
Cheers,
Posted Jan 20, 2018 0:39 UTC (Sat) by neilbrown (subscriber, #359)
That depends on what your goal is. Some people like buying jigsaw puzzles. Others like buying open hardware that doesn't have complete in-tree support.
Windows NT, Solaris.
My filesystem driver from 2005 still works on Windows 8 with only build system adjustments.
Yeah, I've read gregkh's treatise on the topic, "stable-api-nonsense.txt", etc. But really, why divide the stable boundaries of the Linux ecosystem into just two components: kernel and user-space? Was this an explicit decision, or honestly just a historical accident?
The division follows the boundaries of the address space -- the idea is that within the domain in which simple function calls to/from the kernel are possible, the ABI is uncontrolled. (Note that for the vDSO, which is part of the domain in which simple function calls from userspace are possible, the ABI is very strictly controlled.)
Internally, we tend to manage this by providing shims to support the legacy API. For example, file change notification has a single internal implementation with three shims: dnotify, inotify, fanotify. You see this sort of pattern of various shims over an internal implementation a lot in the kernel.
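The "several shims over one implementation" pattern can be sketched in a few lines of userspace C. This is a toy model, not the kernel's notification code: the watch table, the `CORE_*` and `V2_*` constants, and the three entry points `v1_watch`, `v2_watch` and `v3_watch` are all hypothetical names standing in for three generations of a user-facing API.

```c
#include <assert.h>
#include <stddef.h>

/* --- Single internal implementation (hypothetical) --- */
#define CORE_MODIFY 0x1u
#define CORE_DELETE 0x2u
#define MAX_WATCHES 8

struct core_watch { int object_id; unsigned int mask; };

static struct core_watch watches[MAX_WATCHES];
static int nr_watches;

/* Registers a watch; returns its index, or -1 if the table is full. */
int core_add_watch(int object_id, unsigned int mask)
{
    if (nr_watches == MAX_WATCHES)
        return -1;
    watches[nr_watches].object_id = object_id;
    watches[nr_watches].mask = mask;
    return nr_watches++;
}

unsigned int core_watch_mask(int id)
{
    assert(id >= 0 && id < nr_watches);
    return watches[id].mask;
}

/* --- Shim 1: oldest API, no mask argument, always means "modify" --- */
int v1_watch(int object_id)
{
    return core_add_watch(object_id, CORE_MODIFY);
}

/* --- Shim 2: later API with its own flag encoding --- */
#define V2_ON_DELETE 0x10u
#define V2_ON_MODIFY 0x20u

int v2_watch(int object_id, unsigned int v2_flags)
{
    unsigned int mask = 0;

    if (v2_flags & V2_ON_MODIFY)
        mask |= CORE_MODIFY;
    if (v2_flags & V2_ON_DELETE)
        mask |= CORE_DELETE;
    return core_add_watch(object_id, mask);
}

/* --- Shim 3: newest API, takes a request structure --- */
struct v3_request { int object_id; int want_modify; int want_delete; };

int v3_watch(const struct v3_request *req)
{
    unsigned int mask = 0;

    assert(req != NULL);
    if (req->want_modify)
        mask |= CORE_MODIFY;
    if (req->want_delete)
        mask |= CORE_DELETE;
    return core_add_watch(req->object_id, mask);
}
```

Each shim only translates arguments. The state and behaviour live in `core_add_watch`, so a change to the core is made once and the three front-ends keep their old signatures.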
This works well for the user->kernel interface because there are lots of applications making calls into one kernel, so the one kernel can easily support multiple APIs. For the kernel->driver interface the situation is a bit different. One kernel is calling lots of drivers.
To manage that, the kernel would need to detect what API the driver expects, and adjust its calling style accordingly. We do do that to some extent. The stand-out example in my mind is "unlocked_ioctl" in 'struct file_operations'. There used to be an 'ioctl' interface. The BKL (Big Kernel Lock) would be held while that was called. Part of removing the BKL was to handle ioctls without it. So unlocked_ioctl was added and for a time we supported drivers with the new interface and drivers with the legacy interface, by detecting which of the two functions was not NULL. This is a thing that happens quite a bit, but it certainly isn't universal. And we do get rid of the legacy support because we find it adds no value, and does entail a cost.
It would have made my job easier if every time an internal API changed, the name changed (like ioctl -> unlocked_ioctl) so that old drivers wouldn't compile. When an old driver doesn't compile it is easy to find out why, find the commit that introduced the change, read the explanation, and fix the code. I might have done that while forward-porting the mt7621 drivers, but it would have been so "normal" that I wouldn't bother to remember.
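The "detect which of the two functions is not NULL" step looks roughly like the following userspace model in C. It is a simplified sketch rather than the kernel's actual dispatch code: `struct file_ops_model`, `dispatch_ioctl`, the depth counter standing in for the BKL, `MODEL_ENOTTY` and the two example drivers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's 'struct file_operations'. */
struct file_ops_model {
    /* Legacy entry point: the caller holds the big lock around it. */
    long (*ioctl)(unsigned int cmd, unsigned long arg);
    /* Newer entry point: the driver does its own locking. */
    long (*unlocked_ioctl)(unsigned int cmd, unsigned long arg);
};

/* Stand-in for the BKL: a depth counter rather than a real lock. */
static int big_lock_depth;
static void big_lock(void)   { big_lock_depth++; }
static void big_unlock(void) { big_lock_depth--; }

#define MODEL_ENOTTY 25  /* hypothetical "no ioctl supported" error */

/* Pick the calling style by checking which pointer the driver set. */
long dispatch_ioctl(const struct file_ops_model *ops,
                    unsigned int cmd, unsigned long arg)
{
    long ret;

    assert(ops != NULL);
    if (ops->unlocked_ioctl)            /* new-style driver */
        return ops->unlocked_ioctl(cmd, arg);
    if (ops->ioctl) {                   /* legacy driver: lock for it */
        big_lock();
        ret = ops->ioctl(cmd, arg);
        big_unlock();
        return ret;
    }
    return -MODEL_ENOTTY;               /* neither entry point is set */
}

/* Two made-up drivers that report the lock depth they were called with. */
static long old_driver_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd; (void)arg;
    return big_lock_depth;
}

static long new_driver_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd; (void)arg;
    return big_lock_depth;
}

const struct file_ops_model old_driver_ops = { .ioctl = old_driver_ioctl };
const struct file_ops_model new_driver_ops = { .unlocked_ioctl = new_driver_ioctl };
const struct file_ops_model empty_ops = { 0 };
```

In this model the legacy driver sees the lock held and the new-style driver does not. Once no driver sets `.ioctl`, the middle branch and the lock helper can be deleted, which mirrors the point that legacy support is removed once it stops adding value.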
The problem is when the API changes and the code still compiles. I think we could do better here, but I'm not sure that we are all that bad. In some cases it might be that the driver was misusing the API in some way. After all, we don't test against a generic API, we test against a particular kernel. If the kernel starts using an API in a new way, a driver might break even though the API didn't change. The only real solution to making drivers work with new kernels is regular testing.
We in the kernel community think it isn't worth it. We would rather keep all drivers in the common tree, and update all API users when we update an API. And we try to make this easy. If you have out-of-tree code, then *please* put it in drivers/staging. The barrier to entry is low and we will do our best to keep the API up-to-date, and may well clean up the code and make it maintainable as well. Free service!
I actually wonder if the swconfig Ethernet driver I mentioned should go into drivers/staging. Davem doesn't want it, but maybe gregkh would take it. Then we can fix it up in the open and convert to switchdev. Then it might be ready for davem.
I think my next sub-project with the gnubee will be to try to get all of the drivers into staging before looking to clean them up. See if it actually works :-)
the 17 inch screen, but they have NVIDIA graphics.
screen...
Wol