That's a very happy ending for the Linux driver, but what happened to the other platforms? Does the team now have to maintain 2 or 3 separate drivers duplicating almost all of the code? If yes, was the ending really so happy?
Posted Aug 12, 2011 22:54 UTC (Fri) by JoeBuck (subscriber, #2330)
A device manufacturer typically wants to support not only Linux, but also BSD (e.g. Darwin/MacOSX) and Windows (various flavors), without writing three completely different drivers. Abstraction has its costs, but so does the approach outlined in the article.
That's why people attempt abstraction
Posted Aug 12, 2011 23:27 UTC (Fri) by djbw (subscriber, #78104)
Teams can collaborate and maybe find some highest-common-factor code to share, but at the end of the day each environment has its nuances that need special attention. The cost of getting the abstraction wrong is high, and the chance of getting abstraction layers right is low (for non-trivial drivers).
That's why people attempt abstraction
Posted Aug 13, 2011 3:00 UTC (Sat) by quotemstr (subscriber, #45331)
Yet, somehow, nvidia successfully develops its hugely complex drivers on top of an abstraction layer. In fact, these drivers often work better on Linux than bespoke drivers for other graphics hardware. I find it difficult to accept that writing a platform-specific driver for every platform is really the optimal approach.
That's why people attempt abstraction
Posted Aug 13, 2011 3:14 UTC (Sat) by mjg59 (subscriber, #23239)
nvidia benefit hugely from the fact that opengl is already a pretty effective abstraction layer, and the vast majority of their driver is devoted to turning opengl commands into the card's internal format. On the other hand, their lack of integration with the rest of Linux does carry costs. Their driver fails in a variety of situations where nouveau succeeds (handling multi-GPU laptops, systems where the EDID has to be obtained via ACPI, thermal monitoring, backlight control, integration with xrandr) because nouveau is able to take advantage of the infrastructure present in Linux. And nouveau's 2D performance is massively better than the nvidia blob, because nvidia have nothing to abstract 2D operations between Windows and Linux.
So it depends. If you want to do nothing but run 3D applications then their approach is successful, and if you want to do anything else their approach is dreadful.
That's why people attempt abstraction
Posted Aug 14, 2011 19:16 UTC (Sun) by mlankhorst (subscriber, #52260)
There are tales of horror in the nvidia glue code: look up os_smp_*_barrier. Or their shotgun approach to cache flushing, wbinvd instead of a bunch of clflush calls. ;)
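For readers who have not met the two instructions: wbinvd writes back and invalidates the whole cache and is privileged, while clflush evicts a single cache line. A minimal user-space sketch of the targeted approach follows. The function name flush_range and the fixed 64-byte line size are assumptions made for illustration; this is not how the nvidia glue code or any kernel helper is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush, _mm_mfence */
#endif

#define CACHE_LINE 64u          /* assumed line size; real code should query the CPU */

/*
 * Flush only the cache lines that cover [addr, addr + len).
 * Returns the number of lines touched. On builds without SSE2 the
 * flush itself is skipped and only the line count is computed.
 */
size_t flush_range(const void *addr, size_t len)
{
    uintptr_t p, end;
    size_t lines = 0;

    if (len == 0)
        return 0;
    assert(addr != NULL);

    p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);  /* round down to a line start */
    end = (uintptr_t)addr + len;

    for (; p < end; p += CACHE_LINE) {
#if defined(__SSE2__)
        _mm_clflush((const void *)p);   /* evict this one line */
#endif
        lines++;
    }
#if defined(__SSE2__)
    _mm_mfence();                       /* order the flushes against later accesses */
#endif
    return lines;
}
```

An 8-byte object that straddles a line boundary costs two clflush instructions here, whereas wbinvd would throw away every cache line on the CPU regardless of who owns it.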
That's why people attempt abstraction
Posted Aug 13, 2011 19:18 UTC (Sat) by cpeterso (guest, #305)
Perhaps some platform-specific abstraction layers can be avoided if the driver design is inverted: have a cross-platform core that is called *from* the platform-specific code. The core can use event callbacks to invoke platform code.
The abstraction layers described in the article sound like they are focused on implementation details instead of capturing a higher-level "domain model". Admittedly, designing a platform-independent domain model for a kernel driver manipulating hardware sounds challenging.
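To make the inverted design concrete, here is a minimal sketch, assuming a made-up device whose interrupt status word carries a link-change flag and a completed request tag. Every name in it (core_events, core_handle_irq, the IRQ_* bits, the demo_* platform side) is hypothetical and not taken from any real driver. The core never calls into an OS; the platform code calls the core and receives events back through callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits for an imaginary device. */
#define IRQ_LINK_CHANGE 0x1u
#define IRQ_LINK_UP     0x2u
#define IRQ_DONE        0x4u
#define IRQ_TAG_SHIFT   8

/* Events the core reports; each platform supplies its own handlers. */
struct core_events {
    void (*request_done)(void *ctx, uint32_t tag, int status);
    void (*link_changed)(void *ctx, int up);
};

/* Cross-platform core state: no OS types, no OS calls. */
struct core {
    const struct core_events *ev;
    void *ctx;                  /* opaque platform cookie, handed back unchanged */
    int link_up;
};

void core_init(struct core *c, const struct core_events *ev, void *ctx)
{
    assert(c != NULL && ev != NULL);
    c->ev = ev;
    c->ctx = ctx;
    c->link_up = 0;
}

/* Called *from* the platform's interrupt handler with the raw status word. */
void core_handle_irq(struct core *c, uint32_t status)
{
    if (status & IRQ_LINK_CHANGE) {
        c->link_up = (status & IRQ_LINK_UP) != 0;
        if (c->ev->link_changed)
            c->ev->link_changed(c->ctx, c->link_up);
    }
    if (status & IRQ_DONE) {
        uint32_t tag = (status >> IRQ_TAG_SHIFT) & 0xffu;
        if (c->ev->request_done)
            c->ev->request_done(c->ctx, tag, 0);
    }
}

/* Stand-in for one platform's glue; a real one would complete OS requests here. */
struct demo_platform {
    int done_count;
    uint32_t last_tag;
    int link_up;
};

void demo_request_done(void *ctx, uint32_t tag, int status)
{
    struct demo_platform *p = ctx;
    (void)status;
    p->done_count++;
    p->last_tag = tag;
}

void demo_link_changed(void *ctx, int up)
{
    struct demo_platform *p = ctx;
    p->link_up = up;
}

const struct core_events demo_events = {
    .request_done = demo_request_done,
    .link_changed = demo_link_changed,
};
```

Each OS-specific driver would own its interrupt registration, locking and request queues in the native idiom, and only hand the decoded hardware work to the core. Whether a real device's logic can be split this cleanly is exactly the open question.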
Posted Aug 16, 2011 21:01 UTC (Tue) by michaeljt (subscriber, #39183)
> Teams can collaborate and maybe find some highest-common-factor code to share, but at the end of the day each environment has its nuances that need special attention. The cost of getting the abstraction wrong is high, and the chance of getting abstraction layers right is low (for non-trivial drivers).
I could potentially imagine the review process of an "abstraction layer-based driver" going along the lines of "this abstraction is acceptable, this one is not, this one could be acceptable if it were changed a bit". You might then end up with things like some code which is generic for all other supported OSes (or most other? You might find that one or two other target OSes also benefit from having their own implementations) being re-implemented for Linux, but still keep your somewhat refactored core code OS-independent. And of course, like the Linux kernel API, your abstraction layer need not be set in stone and "right the first time". It can be refactored as problems are found or new needs appear.
I wonder though more generally whether it would be acceptable to the Linux kernel community to have code files in the kernel for which kernel.org is not the "primary" site? Otherwise there is little chance of a driver in the upstream kernel having any sort of shared core with other OSes.
That's why people attempt abstraction
Posted Aug 16, 2011 21:05 UTC (Tue) by dlang (✭ supporter ✭, #313)
There is already code in the kernel that is primarily developed and maintained elsewhere.
However, the kernel developers don't feel much need to respect the idea of 'this code is read-only on kernel.org'. If they make an interface change in the kernel, they will make that change even in drivers that have their 'primary source' external to the kernel, and it's up to that external source to pick up the changes so that they don't break with the next merge request.
That's why people attempt abstraction
Posted Aug 18, 2011 20:34 UTC (Thu) by michaeljt (subscriber, #39183)
> There is already code in the kernel that is primarily developed and maintained elsewhere.
>
> However, the kernel developers don't feel much need to respect the idea of 'this code is read-only on kernel.org'. If they make an interface change in the kernel, they will make that change even in drivers that have their 'primary source' external to the kernel, and it's up to that external source to pick up the changes so that they don't break with the next merge request.
I suppose the easiest way around that would be to make sure that the code parts which are "kernel-external" only interface to other parts in the driver, and not to anything outside, without letting those other parts become too much of a simple abstraction layer (see above). Another problem that I see, though, is coding style: having code shared between the kernel and something else in this way forces that something else to adopt the kernel coding style, which is likely not to be the same as their preferred style.
That's why people attempt abstraction
Posted Aug 13, 2011 11:44 UTC (Sat) by justincormack (subscriber, #70439)
iSCSI is not a device though, so that argument does not really apply.
That's why people attempt abstraction
Posted Aug 13, 2011 12:42 UTC (Sat) by ftc (subscriber, #2378)
The article is not about an iSCSI (SCSI over IP) driver, it's about the Intel C600 series chipset SAS controller, which goes by the somewhat unfortunate name of isci (with only one "s").
isc(s)i
Posted Aug 15, 2011 20:08 UTC (Mon) by wilck (subscriber, #29844)
> the somewhat unfortunate name of isci ...
... indeed. This name has caused more confusion than any other driver name I encountered. It's sort of a running gag already.
That's why people attempt abstraction
Posted Aug 15, 2011 4:04 UTC (Mon) by broonie (subscriber, #7078)
The problem you end up with is a cross-platform driver that doesn't work particularly well on any of the platforms; it ends up being non-idiomatic and often reinventing the wheel.
Avoiding the OS abstraction trap
Posted Aug 13, 2011 9:24 UTC (Sat) by intgr (subscriber, #39733)
> Does the team now have to maintain 2 or 3 separate drivers duplicating
> almost all of the code?
So instead of one 60kLOC driver, they would now have three 20kLOC drivers. I'm not sure that's a bad thing.
Avoiding the OS abstraction trap
Posted Aug 14, 2011 2:12 UTC (Sun) by JoeBuck (subscriber, #2330)
... unless they decide to make two 20kLOC drivers and leave the one for Linux out because it's too much of a PITA.
Avoiding the OS abstraction trap
Posted Aug 14, 2011 11:31 UTC (Sun) by ebirdie (subscriber, #512)
Did you read the article, or are you misunderstanding the text, or am I? I'll give it a shot.
On Linux there is no need to maintain any kLOC for the isci driver if the initial driver development is done the "Linux community way", i.e. shared, until the hardware gets revisions that may need new bits and pieces added by the vendor development team.
To put it another way, with some oversimplification and a familiar ring: write once and forget. ;-)
Avoiding the OS abstraction trap
Posted Aug 14, 2011 16:01 UTC (Sun) by bfields (subscriber, #19510)
"Write once and forget" overstates the case--the driver will need some ongoing maintenance to keep up with the rest of the kernel, and it's hard for people who don't know the specific driver well to completely take over maintenance. At a bare minimum, you need someone who has access to the hardware and can test new kernels.
We certainly hope contributors can reduce their maintenance load, but if we lead them to expect it to be reduced to zero, then we risk ending up with a lot of bit-rotted drivers.
Avoiding the OS abstraction trap
Posted Aug 15, 2011 16:38 UTC (Mon) by misiu_mp (guest, #41936)
The point of this rewrite is that future maintenance can be done by someone outside the original development team. So although you will probably need a person with the hardware to test it, that person could be an ordinary savvy user.
That's because the driver is now much simpler and similar to other drivers, instead of being the very specific and complicated cake of layers that it used to be.
It's got a better bus factor.
Avoiding the OS abstraction trap
Posted Aug 16, 2011 6:20 UTC (Tue) by ebirdie (subscriber, #512)
"the driver is now much simpler and similar to other drivers, instead of being the very specific and complicated cake of layers"
I love this, because to me it implies that the piece of hardware (i.e. a chip and its surrounding implementation) has a better chance to shine on its own as hardware, rather than being leveled down to other similar hardware just by a software development method, which does not directly do me any good.
Avoiding the OS abstraction trap
Posted Aug 15, 2011 9:08 UTC (Mon) by linusw (subscriber, #40300)
I have always nurtured the idea that we grossly overestimate the pain of writing code, especially concerning device drivers. Coding is easy, know-how is hard.
So writing three different device drivers tailored for three different OSes isn't necessarily that bad, as long as you have people who are actually comfortable with diving around all three codebases and making sure the know-how about the hardware is shared between all three codebases.
The key assumption I make is that if a developer focuses on one technical aspect at a time across several codebases, rather than on one specific codebase across several technical aspects at a time, the outcome may be better.
This may be true to varying extents in different contexts; it's just my intuitive feeling.
Avoiding the OS abstraction trap
Posted Aug 15, 2011 17:14 UTC (Mon) by misiu_mp (guest, #41936)
"Coding is easy, know-how is hard."
So true. To write a good driver you need to know both the hardware's secrets and the kernel's secrets. It is basically impossible to have a driver that does not need kernel secrets, which is what an abstracted driver tries to be (the exception being the graphics drivers based on the OpenGL specs).
There are many more people with kernel knowledge than with knowledge of a particular piece of hardware. A well-documented driver written in a language that is easy for kernel hackers to understand will widen the potential circle of hardware-knowledgeable people. Once the know-how is out there or easy to reach, the writing is a formality.