
Device trees as ABI

By Jonathan Corbet
July 30, 2013
Last week's device tree article introduced the ongoing discussion on the status of device tree maintainership in the kernel and how things needed to change. Since then, the discussion has only intensified as more developers consider the issues, especially with regard to the stability of the device tree interface. While it seems clear that most (but not all) participants believe that device tree bindings should be treated like any other user-space ABI exported by the kernel, it is also clear that they are not treated in this way currently. Those seeking to change this situation will have a number of obstacles to overcome.

Device tree bindings are a specification of how the hardware is described to the kernel in the device tree data structure. If they change in incompatible ways, users may find that newer kernels will not boot on older systems (or vice versa). The device tree itself may be buried deeply within a system's firmware, making it hard to update, so incompatible binding changes may be more than slightly inconvenient for users. The normal kernel rule is that systems that work with a given kernel should work with all releases thereafter; no explicit exception exists for device tree bindings. So, many feel, bindings should be treated like a stable kernel ABI.
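To make the stakes concrete: a binding defines which "compatible" strings identify a device and which properties its node must carry. A minimal device tree node for a PL011 serial port might look like the following sketch (addresses and clock value are illustrative, not taken from any real board):

```dts
/ {
	serial@101f0000 {
		/* The binding fixes these property names and meanings;
		 * changing them incompatibly breaks deployed trees. */
		compatible = "arm,pl011", "arm,primecell";
		reg = <0x101f0000 0x1000>;
		interrupts = <12>;
		clock-frequency = <24000000>;
	};
};
```

A driver matches on the compatible string and reads the other properties, so every one of them is effectively part of the interface between firmware and kernel.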

Perhaps the strongest advocate of the position that device tree bindings should be treated as any other ABI right now (rather than sometime in the future) is ARM maintainer Russell King:

We can draw the line at an interface becoming stable in exactly the same way that we do every other "stable" interface like syscalls - if it's in a -final kernel, then it has been released at that point as a stable interface to the world. [...]

If that is followed, then there is absolutely no reason why a "Stable DT" is not possible - one which it's possible to write a DT file today, and it should still work in 20 years time with updated kernels. That's what a stable interface _should_ allow, and this is what DT _should_ be.

As is often the case, though, there is a disconnect between what should be and what really is. The current state of device tree stability was perhaps best summarized by Olof Johansson:

Until now, we have been working under the assumption that the bindings are _NOT LOCKED_. I.e. they can change as needed, and we _ARE_ assuming that the device tree has to match the kernel. That has been a good choice as people get up to speed on what is a good binding and not, and has given us much-needed room to adjust things as needed.

Other developers agreed with this view of the situation: for the first few years of the ARM migration from board files to device trees, few developers (if any) had a firm grasp of the applicable best practices. It was a learning experience for everybody involved, with the inevitable result that a lot of mistakes were made. Being able to correct those mistakes in subsequent kernel releases has allowed the quick application of lessons learned and the creation of better bindings in current kernels. But Olof went on to say that the learning period is coming to a close: "That obviously has to change, but doing so needs to be done carefully." This transition will need to be done carefully indeed, as can be seen from the issues raised in the discussion.

Toward stable bindings

For example: what should be done about "broken" bindings that exist in the kernel currently? Would they immediately come under a guarantee of stability, or can they be fixed one last time? There is a fair amount of pressure to stop making incompatible changes to bindings immediately, but to do so would leave kernel developers supporting bindings that do not adequately describe the hardware, are not extensible to newer versions of the hardware, and are inconsistent with other bindings. Thus, Tomasz Figa argued, current device tree bindings should be viewed as a replacement for board files, which were very much tied to a specific kernel version:

We have what we have, it is not perfect, some things have been screwed up, but we can't just leave that behind and say "now we'll be doing everything correctly", we must fix that up.

Others contend that, by releasing those bindings in a stable kernel, the community already committed itself to supporting them. Jon Smirl has advocated for a solution that might satisfy both groups: add a low-level "quirks" layer that would reformat old device trees to contemporary standards before passing them to the kernel. That would allow the definitive bindings to change while avoiding breaking older device trees.
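A quirks layer of the kind Jon Smirl describes would rewrite deprecated bindings into their current form before the rest of the kernel sees the tree. The sketch below models a tree as nested Python dicts purely for illustration; the property names are hypothetical, and a real implementation would operate on the flattened tree (FDT) in early boot code:

```python
# Sketch of a "quirks" fixup pass: upgrade properties from a deprecated
# binding to the current one before the tree reaches the drivers.
# Property names ("clock_freq" -> "clock-frequency") are hypothetical.

def apply_quirks(node):
    """Recursively upgrade an old-style node in place."""
    props = node.get("props", {})
    # Hypothetical quirk: an old binding spelled the clock rate
    # "clock_freq"; the current binding uses "clock-frequency".
    if "clock_freq" in props:
        props["clock-frequency"] = props.pop("clock_freq")
    for child in node.get("children", []):
        apply_quirks(child)
    return node

old_tree = {
    "props": {},
    "children": [
        {"props": {"compatible": "acme,uart", "clock_freq": 24000000},
         "children": []},
    ],
}
apply_quirks(old_tree)
print(old_tree["children"][0]["props"]["clock-frequency"])  # 24000000
```

The appeal of this approach is that the accumulated quirks document exactly which old bindings were ever shipped, while new code only ever deals with the current form.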

Another open question is: what is the process by which a particular set of bindings achieves stable status, and when does that happen? Going back to Olof's original message:

It's likely that we still want to have a period in which a binding is tentative and can be changed. Sometimes we don't know what we really want until after we've used it a while, and sometimes we, like everybody else, make mistakes on what is a good idea and not. The alternative is to grind most new binding proposals to a halt while we spend mind-numbing hours and hours on polishing every single aspect of the binding to a perfect shine, since we can't go back and fix it.

Following this kind of policy almost certainly implies releasing drivers in stable kernels with unstable device tree bindings. That runs afoul of the "once it's shipped, it's an ABI" point of view, so it will not be popular with all developers. Still, a number of developers seem to think that, with the current state of the art, it still is not possible to create bindings that are long-term supportable from the beginning. Whether bindings truly differ from system calls and other kernel ABIs in this manner is a topic of ongoing debate.

Regardless of when a binding is recognized as stable, there is also the question of who does this recognition. Currently, bindings are added to the kernel by driver developers and subsystem maintainers; thus, in some eyes, we have a situation where the community is being committed to support an ABI by people who do not fully understand what they are doing. For this reason, Russell argued that no device tree binding should be merged until it has had an in-depth review by somebody who not only understands device tree bindings, but who also understands the hardware in question. That bar is high enough to make the merging of new bindings difficult indeed.

Olof's message, instead, proposed the creation of a "standards committee" that would review bindings for stable status. These bindings might already be in the kernel but not yet blessed as "locked" bindings. As Mark Rutland (one of the new bindings maintainers) pointed out, this committee would need members from beyond the Linux community; device tree bindings are supposed to be independent of any specific operating system, and users may well want to install a different system without having to replace the device tree. Stephen Warren (another new bindings maintainer) added that bootloaders, too, make use of device trees, both to understand the hardware and to tweak the tree before passing it to the kernel. So there are a lot of constituents who would have to be satisfied by a given set of bindings.

Tied to this whole discussion is the idea of moving device tree bindings out of the kernel entirely and into a repository of their own. Such a move would have the effect of decoupling bindings from specific kernel releases; it would also provide a natural checkpoint where bindings could be carefully reviewed prior to merging. Such a move does not appear to be planned for the immediate future, but it seems likely to happen eventually.

There are also some participants who questioned the value of stable bindings in the first place. In particular, Jason Gunthorpe described the challenges faced by companies shipping embedded hardware with Linux:

There is no way I can possibly ship a product with a DT that is finished. I can't tie my company's product release cycles to the whims of the kernel community.

So embedded people are going to ship with unfinished DT and upgrade later. They have to. There is no choice. Stable DT doesn't change anything unless you can create perfect stable bindings for a new SOC instantaneously.

In Jason's world, there is no alternative to being able to deal with device trees and kernels that are strongly tied together, and, as he sees it, no effort to stabilize device tree bindings is going to help. That led him to ask: "So who is getting the benefit of this work, and is it worth the cost?" That particular question went unanswered in the discussion.

Finally, in a world where device tree bindings have been stabilized, there is still the question of how to ensure that drivers adhere to those bindings and add no novelties of their own. The plan here appears to be the creation of a schema to provide a formal description for bindings, then to augment the dtc device tree compiler to verify device trees against the schema. Any strange driver-specific bindings would fail to compile, drawing attention to the problem.
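In rough terms, such a checker would pair each compatible string with the set of properties its binding requires, then flag nodes that fall short. The schema format below is invented for illustration (the real work would live in dtc, and the binding name is hypothetical):

```python
# Sketch of schema-based binding validation, roughly the kind of check
# an extended dtc could perform. Schema format and binding name are
# invented for illustration.

SCHEMA = {
    # For each compatible string: properties that must be present.
    "acme,uart": {"required": {"compatible", "reg", "interrupts"}},
}

def check_node(props):
    """Return a list of error strings for one device node's properties."""
    compat = props.get("compatible")
    rules = SCHEMA.get(compat)
    if rules is None:
        return ["unknown binding: %r" % compat]
    missing = sorted(rules["required"] - props.keys())
    return ["%s: missing required property '%s'" % (compat, name)
            for name in missing]

# A node that forgot its interrupt specifier fails the check:
print(check_node({"compatible": "acme,uart", "reg": (0x101f0000, 0x1000)}))
# ["acme,uart: missing required property 'interrupts'"]
```

Driver-specific novelties would show up here as either an unknown compatible string or a property the schema does not mention, which is exactly the "fail to compile" behavior the plan calls for.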

The conversation quickly acquired a number of interesting side discussions on how the schema itself should be designed. A suggestion that XML could be used evoked far less violence than one might imagine; kernel developers are still trying hard to be nice, it seems. But David Gibson's suggestion that a more C-like language be used seems more likely to prevail. The process of coming up with a comprehensive schema definition and checking that it works with all device tree bindings is likely to take a while.

Reaching a consensus on when device tree bindings should be stabilized, what to do about substandard existing bindings, and how to manage the whole process will also probably take a while. The topic has already been penciled in for an entire afternoon during the ARM Kernel Summit, to be held in Edinburgh this October. In the meantime, expect a lot of discussion without necessarily binding the community to more stable device trees.

Index entries for this article
Kernel: Development model/User-space ABI
Kernel: Device tree



Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 1:42 UTC (Wed) by martin.langhoff (subscriber, #61417) [Link] (26 responses)

Having seen this up close, I think Jason Gunthorpe's email is spot on (and worth a read of the complete email).

Many (most?) ARM SoC creators and the ODMs around them are focused on quick turnaround, short production runs, and devices that are neither easily hackable nor easily upgradable. For them, there is no point in investing in getting the DT Just Right on the hardware side, nor the DT "ABI" on the driver side.

The short production runs and limited hackability fragment and limit the "community" side. This is unfortunate but true even in the absence of DRM.

On the other hand, there is a handful of ODMs and OEMs building ARM servers. It seems likely that they will want to invest in a stable DT ABI for the relevant drivers, so they can bake a DT into their server kit, land drivers promptly and, as a payoff, future Linux releases will Just Work (modulo bugs, as usual).

Oh, and perhaps someone will take up the mantle TI left behind in terms of good upstream Linux support. Perhaps a smart SoC maker can define a good DT for its hardware early enough that it's useful for its ODMs.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 5:19 UTC (Wed) by pbonzini (subscriber, #60935) [Link]

ARM servers might be using ACPI instead of device trees.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 6:39 UTC (Wed) by dlang (guest, #313) [Link] (19 responses)

The problem is that the cry of "we're embedded, we can't take the time to do things right, we just need to ship product" has been heard far too many times on far too many topics.

It does mean that there needs to be a fast way to get bindings defined, but it doesn't mean that the current free-for-all is the right thing to have happen.

If they want to do their own thing, they can keep their stuff out of the kernel and maintain it on their own forever. But they are (slowly) learning that they don't like the results of that, so they want their work to be in the upstream kernel to save them effort in the future. Part of the deal to get things into the upstream kernel is that they are done in a way that allows them to be maintained reasonably, which means that it's no longer a free-for-all.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 11:48 UTC (Wed) by etienne (guest, #25256) [Link] (7 responses)

> we can't take the time to do things right

FYI, I use a few systems-on-chip; one has a more-than-1000-pin BGA package with more than 145 multi-function pins.
If one had such reconfigurability in the PC world, I think one would have the same problem on the PC.
Also, the PC world has a BIOS in FLASH hiding a lot of details of the boot process, and secondary storage (the hard disk) storing all interesting data.
Where would you put a device tree if you had one on a PC: in the FLASH BIOS or on the hard disk?
If you put the DT in the FLASH BIOS, customers will complain that they cannot edit it, and you will get a lot of customer support calls/returns.
If you put it on the hard disk, describing the disk subsystem, disk partitions (and the device tree location) inside the device tree is not really good design.
The embedded world does not have two storage areas (FLASH BIOS + HD), and the boot loader has to handle everything from the reset vector (initialising external memory from limited SOC internal memory, initialising NOR FLASH before being able to read it, recovery and tests from the command line...).
The boot process will need constants to progress, some of them before a device tree can be read. It is not really good design to put constants in the device tree which may differ from the values actually used.

Now, do you need a device tree on PC, to describe:
- if you want EFI BIOS or regular BIOS
- EFI password and keys
- partition table format (MBR/GPT/BSD)
- partition tables themselves
- location and IRQ of serial/parallel ports
- USB keyboard and mouse
- keyboard language layout and user language
- ....
and please ask Windows and BSD to support that device tree.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 12:50 UTC (Wed) by rvfh (guest, #31018) [Link] (4 responses)

A DT is not meant to describe what is discoverable, so on a PC it could be in the /boot partition.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 15:10 UTC (Wed) by etienne (guest, #25256) [Link] (3 responses)

So you assume that there is only one Hard Disk with one /boot partition (with a known partition scheme and known filesystem) and containing one DT with a discoverable name.
In the PC world, people would say: try everything you can think of to find that DT, and yes by specification a hard disk can take 31 seconds to appear - the process might take time.
In embedded world, people would say: the list of non-volatile memory devices is not known, define it in the DT; the priority of those devices is not known, in the DT; the partition scheme (MBR, GPT, BSD disk slices,...) is not known, in the DT; the filesystem of /boot is not known, in the DT; the filename of the DT is not known, write it in the DT...

Remember that if something is wrong on a PC, you can always count on the BIOS to get to the point you can fix things - not so on an ARM embedded board.

Also, if a new device (with a new name) appears and is 100% compatible with an old one, you have to modify the DT to add a "compatible = "new_device_name";" line - so you have to be able to update the DT.
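For reference, the usual convention is to list compatible strings from most to least specific in the new device's tree, so that a kernel which only knows the older device can still bind to it (device names below are hypothetical):

```dts
serial@101f0000 {
	/* Most specific first, well-understood fallback last: a kernel
	 * that only recognises "acme,uart" still matches this node. */
	compatible = "acme,uart-v2", "acme,uart";
	reg = <0x101f0000 0x1000>;
};
```

That convention helps new hardware work with old kernels, but, as noted above, it does not remove the need to update an already-shipped DT when a genuinely new name must be taught to it.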

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 15:26 UTC (Wed) by rvfh (guest, #31018) [Link] (2 responses)

> So you assume that there is only one Hard Disk with one /boot partition (with a known partition scheme and known filesystem) and containing one DT with a discoverable name.

On a PC the bootloader knows where to load the kernel and initrd from. If it needs a DT then all it needs to do is store that info too. And yes, the BIOS does the init before the kernel re-discovers everything (but I am sure you know all that!)

> In the PC world, people would say: try everything you can think of to find that DT, and yes by specification a hard disk can take 31 seconds to appear - the process might take time.

See reply below, if I understand what you are saying correctly...

> In embedded world, people would say: the list of non-volatile memory devices is not known, define it in the DT; the priority of those devices is not known, in the DT; the partition scheme (MBR, GPT, BSD disk slices,...) is not known, in the DT; the filesystem of /boot is not known, in the DT; the filename of the DT is not known, write it in the DT...

And again the bootloader is what knows where the DT is, so it can
- use it
- pass it to the kernel

> Remember that if something is wrong on a PC, you can always count on the BIOS to get to the point you can fix things - not so on an ARM embedded board.

I don't understand this fully... The BIOS is a pre-bootloader program as you find in the ROM code of your SoC, no?

> Also, if a new device (new name) appears and is 100% compatible with an old one, you have to modify the DT to add a "compatible = "new_device_name";" line - so you have to be able to update the DT.

I don't understand that, sorry. How can a new device 'appear'? Embedded systems tend not to evolve, which I think is the definition of embedded, compared to a PC, in which you can change/add/remove core components such as CPU/GPU/memory/storage...

Sorry if I am missing your point, which I am sure is valid.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 16:33 UTC (Wed) by etienne (guest, #25256) [Link] (1 responses)

> > Remember that if something is wrong on a PC, you can always count on the BIOS to get to the point you can fix things - not so on an ARM embedded board.
>
> I don't understand this fully... The BIOS is a pre-bootloader program as you find in the ROM code of your SoC, no?

On that SOC, you have a very small ROM which can do DHCP/TFTP boot when the FLASH is blank, but that is a recovery system only.
A standard boot (i.e. from the cold reset vector) brings you to U-Boot without any other layer; U-Boot has to initialise the external DDR RAM itself; you have no other software whatsoever.
If you are using the new NOR FLASH (bigger size for lower price), only the first few sectors of that FLASH are present in the address space; you also need to have a driver to read the other sectors.

So U-boot runs in a very restricted environment, and cannot access the DT which is stored in FLASH before initialising both DDR ram and FLASH.
So the constants to initialise the DDR ram and the FLASH are not really needed in the DT.
Once U-boot can access DDR ram and FLASH, it has basically finished its job and just loads two files in memory (linux kernel and DT) without looking at their content.

> I don't understand that, sorry. How can a new device 'appear'?

I was more talking of a DT on a PC, but even on embedded you have chips upgraded during production without software change, or external USB connectors where newer devices can be connected and not auto-probed.
In short you cannot store the DT in real ROM, you may have to update it.

> Sorry if I am missing your point, which I am sure is valid.

For instance, in the PC world the PCI interface is at the well-known I/O address 0xCF8; that is compiled as a constant into every ia32/amd64 Linux kernel. There is no point in writing that address into a DT located on a hard disk, because the hard disk cannot be discovered before the PCI address is known.
In the embedded world there are no well-known addresses, but there is no place to store such a database either.

Device trees as ABI -- flag the stable ones

Posted Aug 2, 2013 8:58 UTC (Fri) by rvfh (guest, #31018) [Link]

> Once U-boot can access DDR ram and FLASH, it has basically finished its job and just loads two files in memory (linux kernel and DT) without looking at their content.

OK. On TI OMAPs we have U-Boot/SPL initialising the DDR and SD card, so U-Boot is directly loaded into it (and the DT could be too). To some extent, this means the DT info is partly duplicated into U-Boot/SPL.

> In the embedded world there is no well-know addresses, but there is no place to store such a database neither.

Yes, and anyway, there is not necessarily a way to determine the board type (and thus choose the correct DT), so the board type is likely hard-coded into the boot loader.

But that's OK to me: it's the SoC case. It remains IMO that in the PC case a DT can be in the /boot partition because we can get there thanks to BIOS/standards/...

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 15:53 UTC (Wed) by raven667 (subscriber, #5198) [Link] (1 responses)

I'm not an expert in this area, but isn't the DT something that should be burned into the board when it is manufactured? At that time all those multi-function pins will be connected to something, and that needs to be described so that the firmware and bootloader can start, enough to get the kernel loaded and for the kernel to find all the hardware. The PC world has the same problems, but they are/were solved by multi-vendor conventions and standards and by building auto-discoverability into the hardware at every possible level.

So parts that can be auto-discovered later don't need to be described in a burned-in data structure but parts that are needed for low-level boot strapping do need to be described and initialized by burned-in code. That's not really different than the PC world, is it? It seems that this discussion of device tree is really working around a discussion of having some sort of firmware standard for ARM, like BIOS and uEFI and ACPI and whatnot in the PC world, so that the hardware is self-sufficient enough to be able to load a kernel in a standard way such that a standard kernel image could be created.

Device trees as ABI -- flag the stable ones

Posted Aug 14, 2013 11:20 UTC (Wed) by broonie (subscriber, #7078) [Link]

That's the theory - in practice what's happened is that in order to get to the point where there is enough DT support in the kernel to allow anything to actually boot people have just been using DT as a replacement for board files (which are embedded in the kernel). Now that platforms are able to run usefully from DT only that is changing which is what the discussion is about.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 16:49 UTC (Wed) by jgg (subscriber, #55211) [Link]

There will never be a fast way to get a binding defined. Since it is considered an ABI everything gets bike-shedded to death. People have been working on bindings now for Kirkwood for over a year, and it still isn't fully in mainline. :(

Device trees as ABI -- flag the stable ones

Posted Aug 12, 2013 1:33 UTC (Mon) by mmarq (guest, #2332) [Link] (9 responses)

>The problem is that the cry of "we're embedded, we can't take the time to do things right, we just need to ship product" has been heard far too many times on far too many topics.

and

> If they want to do their own thing, they can keep their stuff out of the kernel and maintain it on their own forever. But they are (slowly) learning that they don't like the results of that, so they want their work to be in the upstream kernel to save them effort in the future, part of the deal to get things into the upstream kernel is that they are done in a way that allows them to be maintained reasonably, which means that it's no longer a free for all.

Congratulations for the daring... but both parts are contradictory.

Wasn't there a previous thread where the maintainer said he was drowning in load? ... i don't think "they" want to be babysat and enjoy some free candy...

If you care to listen to the "cries" then you have to provide a solution... out of kernel, like an ABI, is exactly what has been proposed, and so that "maintain it forever" sounds more like "piss off" than "let's find a cure".

Why can't DT interfaces be dynamic in some way, like low-level DKMS drivers?

It could be hardened by providing a very well-known safe fall-back for gross errors; it could use "capabilities" so that a DT blob doesn't go poking around where it shouldn't... and most of the firmware already is blobs...

This would kill two birds with one stone... low-level dynamic drivers, yes _more_than_simple_DT_interfaces_... and so instead of a DT buried in firmware blobs, it could provide much more: well-defined interfaces and behavior that could be "negotiated" subsystem to subsystem. OTOH, if a vendor wants more in the future then it would be forced to go into kernel development; that is, instead of getting free candy we will be forced to offer some.

At least it sounds better than "piss off"... hey! it could even be great as an example for open source graphics drivers to use the low-level binary kernel drivers that **without exception are now loaded via DKMS**... well-defined interfaces would provide hints for possible reverse engineering if needed...

In the end "they" don't need to and shouldn't be babysat, at least at this low level. There isn't the factor of upstream "free" maintenance; there shouldn't be that presumption, but on the contrary a request for help, simply because any software without hardware is completely pointless (at this level).

Device trees as ABI -- flag the stable ones

Posted Aug 12, 2013 4:59 UTC (Mon) by dlang (guest, #313) [Link] (8 responses)

> and so instead of DT buried in firmware blobs, it could provide much more, well defined interfaces and behavior that could be "negotiated" subsystem to subsystem,

I think this indicates that one of us isn't understanding what Device Trees are.

Device Trees are not a bunch of things embedded into drivers or in binary blobs.

Device Trees are a description of the hardware that the bootloader hands to the kernel at boot time. Drivers then look at parts of the Device Tree to find the constants that they need to configure hardware (to know that it's there to be configured, and usually configuration information)

One problem is that some people are putting all their config info in the driver, then using the Device Tree to say "create device ABC", this isn't any better than the OOXML spec saying "manage dates like Excel version X did"

This approach means that the drivers have to be updated for every variation of the device, and for every different address the device lives at.

What they should be doing instead is making the Device Tree say "this is device of type X at address Y" so that the kernel source can eliminate all the tables of hardware and drivers that are almost, but not quite identical.

Watching for duplicates, and pushing back on such submissions, telling the submitters that they need to use/update the generic versions is a painful task, both for the maintainers pushing back and the people submitting the updates.

But trying to cry "we are embedded, we can't take the time to follow the process" isn't going to win sympathy.

Device trees as ABI -- flag the stable ones

Posted Aug 12, 2013 10:49 UTC (Mon) by mmarq (guest, #2332) [Link] (7 responses)

> What they should be doing instead is making the Device Tree say "this is device of type X at address Y" so that the kernel source can eliminate all the tables of hardware and drivers that are almost, but not quite identical.

I think that is exactly the problem that must be made to work, there are too many drivers that are almost identical ... "almost"... but all are needed

> Watching for duplicates, and pushing back on such submissions, telling the submitters that they need to use/update the generic versions is a painful task, both for the maintainers pushing back and the people submitting the updates.

I think "they" state "generic" won't fit the bill... as far as I understand... otherwise i don't think anyone would have dared to propose a low-level kernel ABI, after so many years when the slightest mention of that has been enough for an immediate and total incineration of the offender.

OTOH i think they are more than happy to maintain all those bindings and drivers outside of the kernel, as long as the interfaces are long-lived and stable.

Let them have what they want, but take advantage of it... as far as I understand (not a developer) those "bindings" can be embedded into drivers, so **negotiate** subsystem to subsystem (not one ABI but several) the wrapper/shim (whatever) binary interfaces exposed, but harden it, use capabilities. Those blobs could be dynamically loaded (trace parallels with DKMS); a "specific" binding, "specific" firmware and a more generalist low-level "gluing" driver could be a module for a new advanced DKMS system (possible? crazy?), and a SOC could have many of those from the same device tree (as i understand it).

I think this could be a very good thing: more *robust*, more supportive, more portable, more cross-platform, more flexible, more *maintainable*, even for all OSS drivers (you don't have to carry hundreds of drivers' code around only to use a few tens of them on each system)... at least it follows the "hardware logic": SOCs, boards, platforms can be varied inside the same family... "almost identical"... yet those minimal differences can be critical for full support and/or stability (and half-baked drivers are what Linux has to spare).

Device trees as ABI -- flag the stable ones

Posted Aug 12, 2013 21:21 UTC (Mon) by dlang (guest, #313) [Link] (6 responses)

> I think that is exactly the problem that must be made to work, there are too many drivers that are almost identical ... "almost"... but all are needed

No, all the different drivers do not have to exist; support for all the devices needs to exist.

One way to do this is to do a cut-n-paste of an existing driver, change some constants and create a new driver.

The other way is to take the first driver, change the constants to DT parameters (defaulting to the old constants if not defined), and then you don't need yet another driver.
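That second approach shows up in the device tree as two boards sharing one driver and one binding, differing only in properties (the device and property names below are hypothetical):

```dts
/* Board A: omits the tunable, so the driver falls back to the
 * old built-in constant. */
ethernet@f1000000 {
	compatible = "acme,enet";
	reg = <0xf1000000 0x4000>;
};

/* Board B: same driver, one property overrides the default. */
ethernet@f2000000 {
	compatible = "acme,enet";
	reg = <0xf2000000 0x4000>;
	acme,tx-fifo-depth = <512>;
};
```

Defaulting missing properties to the old constants is what keeps the change backward compatible with trees written before the property existed.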

Arguing that there should be a stable ABI isn't going to get you anywhere. It's not something that can be implemented on a per-driver or per-subsystem basis, because part of the ABI the driver would need is the kernel locking rules, power management interfaces, and other things that affect the entire kernel.

Device trees as ABI -- flag the stable ones

Posted Aug 15, 2013 2:34 UTC (Thu) by mmarq (guest, #2332) [Link] (5 responses)

> no, all the different drivers do not have to exist. support for all the devices needs to exist

What is the fundamental difference in principle ?

> One way to do this is to do a cut-n-paste of an existing driver, change some constants and create a new driver.

> The other way is to take the first driver, change the constants to DT parameters (defaulting to the old constants if not defined), and then you don't need yet another driver.

Isn't that exactly how it is done today? ... if yes, why doesn't it work? (meaning maintainers drowning in load, implementers complaining, etc.)

> arguing that there should be a stable ABI isn't going to get you anywhere. It's not something that can be implemented on a per-driver or per-subsystem basis because part of the ABI the driver would need is the kernel locking rules, power management interfaces, and other things that affect the entire kernel.

After all, it was a "piss off" in your earlier post lol

I don't want to go anywhere, i'm not an implementer. But if that is true, wouldn't the entire kernel be the ABI? lol And it wouldn't be possible to devise other ABIs without starting from scratch. I don't think that is the case.

I think what you say is not possible is already done by DKMS. Every distro i use uses DKMS, though it is not officially supported (AFAIK).

I think this is basically NOT a question of code but more a question of perspective. You dared to mention the "free candy" of upstream "free maintenance" and why people complain... i dare to mention the "corporate interest" in keeping everything in house (so to speak), even when everything in house comes up very short in a lot of aspects, and why these complaints.

Employing logic, what i said is: if DKMS works, and it does (otherwise no distro would use it), why not improve it tremendously, harden it, have not one but several DKMSs... i mean, it could have rules that differ according to the subsystems addressed, it could use D-Bus, it could use capabilities for safety... etc. The geniuses of programming wouldn't take long to advance several innovative mechanisms, including interacting with DTs in a way that would ease maintenance by a great deal...

The "plus" for perspective is that it would follow the "LOGIC", a lot of drivers are almost identical... but not quite... what is identical will be kernel; what could vary is a low-level "module" that could be dynamically loaded (like DKMS), and that could be in binary form (as happens with DKMS). And it is more logical and sane, because the "not quite" means every device could/should/must have its own low-level module, and it would be good even for all the OSS drivers IMHO.

Is this an ABI? ... is this a long-term (put "long" in this; only change if there is no other choice) stable API? (a bit like the kernel 3.10 perspective lol)

Device trees as ABI -- flag the stable ones

Posted Aug 15, 2013 2:52 UTC (Thu) by mmarq (guest, #2332) [Link]

> The "plus" for perspective is that it would follow the "LOGIC", a lot of drivers are almost identical... but not quite...

Oops... i meant "a lot of *devices* are almost identical"... not *drivers*... the sentence above doesn't make sense; drivers should follow the devices, not devices following drivers (what some Linux aficionados seem to want).

And in reality there can be quite a lot of differences even for devices implemented around the same processing element (NIC, sound, graphics, storage etc)

Device trees as ABI -- flag the stable ones

Posted Aug 15, 2013 3:25 UTC (Thu) by dlang (guest, #313) [Link] (3 responses)

>> no, all the different drivers do not have to exist. support for all the devices needs to exist

> What is the fundamental difference in principle?

If you say that the drivers must exist, then the status quo is fine, but if you say support for the devices is what must exist, then there is a lot of room for cleanup in the existing code.

>> The other way is to take the first driver, change the constants to DT parameters (defaulting to the old constants if not defined), and then you don't need yet another driver.

> Isn't that exactly how it is done today? ... If yes, why doesn't it work? (meaning maintainers drowning in load, implementers complaining, etc.)

this is part of the problem today. Far too many of the existing device drivers in ARM (and to be fair, in other embedded architectures) are cut-n-paste jobs when they should be consolidated.

Maintainers haven't been pushing back enough. This was one of the items in the discussions that triggered the "the linux-kernel list is not professional" debate a few weeks ago.

> ...wouldn't the entire kernel be the ABI?

> I think what you say is not possible is already done by DKMS. Every distro i use uses DKMS, though it is not officially supported (AFAIK).

DKMS also fails somewhat regularly as the kernel internals change. This is also why the nvidia and ATI proprietary drivers frequently don't work with a new kernel. The changes made elsewhere in the system can frequently break modules, especially ones that are as demanding as the video drivers.

Yes, it would be possible to create a module API that would not need to care about some of this stuff, but it would be far more limited in what the module could do and how well it could do it. The kernel developers have opted not to go this route (see the stable_api_nonsense document if you aren't familiar with it).

the only time I've seen DKMS actually used is for vmware drivers. The various distros may include it, but I don't see them using it much.

DKMS "works (mostly)" pretty well for minor kernel updates (distro kernel patches), but if you try to use it across larger updates, you will see how badly it starts to fail.

Just because something is popular doesn't mean that it's the right thing to do. The situation a year or so ago, where lots of people running nvidia modules were suffering kernel crashes and, as a side effect, running into problems with KDE config file updates disappearing on them, is a good example of this.

> The "plus" for perspective is that it would follow the "LOGIC", a lot of (devices) are almost identical... but not quite...

actually, in the ARM space, this really is the case. A lot of the different drivers are for the exact same thing connected at a different address.

This does fall naturally out of the 'normal' embedded development process, where a programmer is assigned to make Linux work on chip/device X, and frequently all the company cares about _is_ making it work on that chip/device. Once that chip/device ships, the programmer never expects to deal with it again, so they don't care about upgrading things. For these people, the cut-n-paste, modify-constants approach is just fine.

This is part of the oft-repeated claim that "embedded is special, it shouldn't have to follow the rules"

However, the Linux-kernel developers have to maintain things over the long run, and many of the larger companies doing embedded development are starting to learn that they really do want to have the support upstream so they don't have to repeat the driver development for the next kernel/project.

As ARM is stabilizing, they are working to clean this up, but it's a lot of work, and Device Trees are a very significant part of that cleanup.

Device trees as ABI -- flag the stable ones

Posted Aug 17, 2013 6:41 UTC (Sat) by mmarq (guest, #2332) [Link] (2 responses)

> DKMS also fails somewhat regularly as the kernel internals change. This is also why the nvidia and ATI proprietary drivers frequently don't work with a new kernel. The changes made elsewhere in the system can frequently break modules, especially ones that are as demanding as the video drivers.

Yes, it's a shame... but it doesn't have to be that way.

DKMS as it is, is too all-encompassing; it's not hardened (no fail-safe fallbacks), and it has no security but what the kernel provides (it could have capabilities, as an example). The DKMS i had in mind is much more restricted; many more things that are "generalist" from one POV could be "in the kernel", so the chance of breaking something in the DKMS APIs would be much smaller, and since it could be officially maintained, those APIs could also be "extended" according to kernel changes... and most of the time without breaking old stuff... i think...

Sorry for the overly pragmatic views... but i've always been taught that the better way to solve a problem is to think outside the problem, not to live with it.

Device trees as ABI -- flag the stable ones

Posted Aug 17, 2013 9:59 UTC (Sat) by dlang (guest, #313) [Link] (1 responses)

so how exactly is your super DKMS going to figure out that the module now needs to do locking because the code that calls it no longer grabs a lock (a standard thing to happen as locks get more fine-grained)?

Device trees as ABI -- flag the stable ones

Posted Aug 19, 2013 13:00 UTC (Mon) by mmarq (guest, #2332) [Link]

I don't know.

It's not a question of code but more a question of design philosophy. It would have to be the maintainers of the subsystems addressed, who know (or should know) more than anyone else how things on their side work, to give you an answer. But the philosophy, which i think is quite pertinent and true, is that the answer can vary according to the subsystem; this generalist approach may not be the best approach.

I'm starting from the principle that a device can have things different even from devices of the same family: firmware, bus characteristics, etc.

So if we agree on this point, which seems to me very logical and very true, the idea is quite simple: what is "discrete and unique" goes DKMS-like; what could easily be addressed in a generalist way is "in kernel" (modules or otherwise, it doesn't matter).

And in this *logic* (which is sometimes missing from OSS due to politics), this DKMS should be different from subsystem to subsystem. Userland drivers perhaps wouldn't need any DKMS, i think, and it would be for the maintainers to "draw the line in the sand" for the different *long-term stable APIs* that would be dynamically exposed by this DKMS: APIs that could have capabilities to restrict where they "talk/interact" with the kernel, and even how they talk among themselves... that is why i talked of several DKMSs, not one, more so because different subsystems could have different "lines in the sand".

So what would this "new DKMS" dynamically load? ... not sure... but not much more than the firmware and a low-level driver that initializes the devices and exposes the very low-level configs to the kernel.

Would there be worries about locks (and other things)? ... i don't know... i think the maintainers would have the last word, but it would be nice to hear the implementers/vendors/IDMs.

And it's not about allowing closed-source blobs in the kernel; as a matter of fact, OSS drivers for a lot of devices could use the "new DKMS" scheme. As a matter of fact, every pertinent device (falling in the category that applies) should use the "new DKMS", whether the source is available or not (it's about predictability, and the logic of unique vs generalist characteristics)... and if hardened with a safe fallback, this "new DKMS" could be even grand for OSS driver development (and proprietary too), and, as the cherry on top of the cake, due to predictability, for reverse engineering if the case comes along.

In a "political" view, that is what i call allowing some privileges, but taking advantage of it lol.

No kernel upgrade in the embedded world

Posted Jul 31, 2013 12:47 UTC (Wed) by rvfh (guest, #31018) [Link]

> Many (most?) ARM SoC creators and the ODMs around them are focused on a quick turnaround, short production runs and not easily hackable nor easily upgradable devices.

And my experience on smartphones (I suppose it applies to a lot of embedded devices too) is that even when you get an Android upgrade, you keep the same kernel version, as nobody wants to forward-port all the hacks they managed to make work to a newer kernel.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 14:12 UTC (Wed) by linusw (subscriber, #40300) [Link] (3 responses)

It's a bit of both, I would say - mobile handset and tablet chipsets have this quick-turnaround feature. Many of these kernel trees and potential device trees are not even in the upstream kernel.

Then there is a class of embedded chipsets that are deeply embedded in automotive, industrial control, airplane and military products with a support cycle of 20+ years. Here we find Atmel AT91 and some MIPS products. For these, the support cycle is always longer than you think, even if you take into account that the support cycle is longer than you think.

Reference designs are a third class of devices - these have a mid-length lifecycle of some 5-10 years, and are used as a base for mobile handsets and tablets, among other things.

Device trees as ABI -- flag the stable ones

Posted Jul 31, 2013 19:41 UTC (Wed) by martin.langhoff (subscriber, #61417) [Link] (2 responses)

> deeply embedded in automotive, industrial control, airplane and military products

yes, but there is no expectation of running a vanilla kernel on those, nor a generic Linux distro. So as long as the ODM keeps the relevant kernel tree around, and someone can grok it, they can support it.

> reference designs are a third class of devices - these have a mid-length lifecycle of
> some 5-10 years, and are used as a base for mobile handsets and tablets

With ARM SoCs, the reference designs I have seen have a shelf life of 2 years, and essentially have to survive in a crowded market. So some die an early death, some dominate a given generation (and perhaps get... 4 years life?).

The cost/benefit isn't there, even for reference designs, if we think about getting to "stable DT". An unstable (but reasonably well done) DT in their own kernel trees is probably OK. But the benefit is more marginal, or at least looks marginal at implementation time.

Device trees as ABI -- flag the stable ones

Posted Aug 1, 2013 16:23 UTC (Thu) by Jonno (subscriber, #49613) [Link] (1 responses)

> With ARM SoCs, the reference designs I have seen have a shelf life of 2 years
At my last job, I was developing a new medical product based on a four-year-old ARM SoC design on a custom PCB. The product was intended to be manufactured using the same ARM SoC for 10 to 15 years after we were done, with each manufactured product having a lifespan (with support and component replacement) of 15-20 years. In other words, the four-year-old SoC was expected to still be available for another 30 years or so.

Most of us engineers figured it would break down by 2038, seeing how arm32 uses a 32-bit time_t, but there was no telling management that ;-).

> yes, but there is no expectation of running a vanilla kernel on those
We were using a mainline 2.6.39 kernel with a few vendor-supplied 2.6.37-based drivers we forward-ported ourselves.

Device trees as ABI -- flag the stable ones

Posted Aug 1, 2013 18:36 UTC (Thu) by pizza (subscriber, #46) [Link]

> In other words, the four-year-old SoC was expected to still be available for another 30 years or so.

In all fairness, most SoC vendors have products they designate for long-term availability, though that usually translates to the 10-year lifespan that automotive components require.

But in my experience, far too often management chooses an SoC simply because it's cheap, not realizing it's only cheap because it is almost end-of-lifed. Having to source components on the grey market to fulfill unexpected orders is not a fun undertaking.

(Another approach I've seen is to order a massive quantity of the critical parts at the outset of the production run, enough to cover the total projected lifecycle. This can get expensive, but does hedge you when you have a relatively low-volume but long-lived product.)

Device trees as ABI

Posted Jul 31, 2013 11:28 UTC (Wed) by nelljerram (subscriber, #12005) [Link] (6 responses)

Some missing context for me: why was it decided to move from board files to device tree?

The reasons I'd guess are efficiency of expression and/or not wanting the description to be baked into the kernel. If the latter _was_ one of the reasons, surely it was obvious immediately that this would be a new ABI and so would raise the question of stability? If the latter is not a strong reason, why not avoid the stable-ABI question by compiling the DT and baking it into the kernel?

Device trees as ABI

Posted Jul 31, 2013 12:43 UTC (Wed) by pizza (subscriber, #46) [Link] (5 responses)

> Some missing context for me: why was it decided to move from board files to device tree?

So a single kernel image could be used to boot a wide variety of hardware, including boards that didn't exist when the kernel was released.

Device trees as ABI

Posted Jul 31, 2013 13:13 UTC (Wed) by karim (subscriber, #114) [Link] (3 responses)

Can you elaborate on the benefits of this?

Device trees as ABI

Posted Jul 31, 2013 15:58 UTC (Wed) by raven667 (subscriber, #5198) [Link] (2 responses)

I think the benefit is not having to maintain a separate fork of the kernel for each and every type of board manufactured that has baked-in carnal knowledge of how that particular board is manufactured. Once you can boot a standard kernel on many different types of boards you reduce the maintenance of the kernel and make it easy to do version upgrades, or offer standard software distributions that will work across many devices.

Device trees as ABI

Posted Jul 31, 2013 16:05 UTC (Wed) by raven667 (subscriber, #5198) [Link] (1 responses)

Another way to look at this is to ask how Linux 4.20.3 is going to boot, in 10 years, on hardware made today. Is the kernel going to carry around detailed knowledge of how your FooBoard v3 (and FooBoard v4, and BixBoard model 42, and so on) was wired forevermore, so that you can load it on your existing device? Or should we make a standard for detecting how the board is wired, so that a generic kernel can use a standard method to enumerate the board and boot, both now and in the indeterminate future?

Device trees as ABI

Posted Aug 12, 2013 2:18 UTC (Mon) by mmarq (guest, #2332) [Link]

To have that standard you have to force every implementer/ODM to follow the standard, akin to having a Linux open platform... which is never going to happen, because patents of every kind are pervasive, trade secrets are also pervasive, and ODMs regard their IP as very valuable...

OTOH, to break out of the "old" PC world, low-level binary interfaces are almost mandatory, and binary interfaces and open source are not mutually exclusive... at least take advantage of it: know and restrict what a blob is up to, but support them, no matter whether the source is available or not, or whether the info is under NDA or not.

Device trees as ABI

Posted Jul 31, 2013 13:28 UTC (Wed) by mbizon (subscriber, #37138) [Link]

It was/is already possible to boot a single kernel image for multiple boards, you don't need DT for this.

You want DT if the kernel has no knowledge of how the hardware is laid out and you want to provide that information over a separate channel. This requires that the kernel know the DT "schema" that you use to describe your hardware.

For example, DT would tell the kernel "you have an I2C/SPI/... hardware block of type xxxx at location xxxx", allowing the kernel to register the device without needing the platform/pci_register_device calls that you typically find inside board files.
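As a sketch, such a description might look like the following device tree fragment; the compatible string, addresses, and interrupt specifier here are invented for illustration, not taken from any real binding:

```dts
/* Hypothetical I2C controller node: "reg" gives the block's MMIO
 * address and size, from which the kernel registers the device. */
i2c@12c60000 {
        compatible = "vendor,example-i2c";
        reg = <0x12c60000 0x100>;
        interrupts = <0 56 4>;
        clock-frequency = <400000>;
};
```

The kernel only needs to understand the binding (the "schema"), not to have the board's layout compiled in.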

Device trees as ABI

Posted Aug 1, 2013 11:26 UTC (Thu) by t.figa (subscriber, #92170) [Link]

I would like to clarify one little thing. My words might have been misunderstood a bit, as I did not advocate for DT instability. Instead I just showed two possible extreme DT usage examples that we should consider and proposed creating a process that would be something in the middle, trying to get the best of both extreme worlds (i.e. separate stable and staging bindings).

Device trees as ABI

Posted Aug 1, 2013 12:50 UTC (Thu) by jengelh (guest, #33263) [Link]

Systems using OpenFirmware already had device trees for a long time, did they not? So what is the difficulty with the DT ABI here?


Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds