Maintainability is much more important than functionality. We (upstream) don't want new features if we cannot fix them when they break, or cannot improve surrounding code for fear that the feature might break.
Developers focused on "make it work so we can ship" are going to be less focused on maintainability (or at least, that is the way it appears).
We don't want the rival solution that works now, we want the rival solution that will still work in 5 years.
Distributors are of course free to use a 'downstream first' policy; the GPL guarantees that freedom. But experience shows that 'upstream first' costs less in the long term.
Posted Jun 3, 2010 10:02 UTC (Thu) by khim (subscriber, #9252)
From what I'm seeing, the companies which employ the 'upstream first' tactic routinely fail in the marketplace. And this is understandable: they cannot ship stuff when there is market demand for it - they are stuck with pleasing the upstream.
Sure, if you try to support your changes indefinitely it'll become a huge drain over time and you'll lose too - so you need to upstream your changes at some point. Sometimes a different solution is accepted, not what you proposed initially - but that's not a problem: the goal is to solve the problem end users are having, not to have your code in the kernel. This is how RedHat worked for a long time (till they got enough clout in the kernel community to muscle through their changes without much effort), this is how successful embedded developers work (including Google), etc. Novell tried to play the 'upstream first' game and the end result does not look good for the company (even if it may be good for the kernel).
If you have stats which show that 'upstream first' is indeed the best policy for the developers, please share them - I've certainly heard this claim often enough, but rarely, if ever, with numbers.
The only exception is "leaf" drivers, which don't change any infrastructure at all and are usually accepted without even looking - here upstreaming is so cheap that it really makes sense to do it.
Care to share your stats?
Posted Jun 3, 2010 10:25 UTC (Thu) by neilbrown (subscriber, #359)
Nope, no statistics. Just "a stitch in time saves nine" style anecdotal observations.
And it is only a long-term benefit. I can easily imagine a situation where the short term cost of going upstream-first would cause the business to fail so there is no possibility of a long term reward. But as soon as the horizon stretches out a bit, the more you get upstream the less you have to carry yourself.
Care to share your stats?
Posted Jun 3, 2010 13:00 UTC (Thu) by corbet (editor, #1)
So companies like Intel, which are very strongly in the upstream first camp these days (most of the time) are failing in the marketplace?
"Upstream first" is not a hard and fast rule. It's also not exactly "get the code into the mainline kernel first"; it's more along the lines of "be sure that the code can get into the mainline kernel first." There is a difference there.
I'm not sure I see "upstream first" holding back Novell. Citation needed. Instead, I see that the times they didn't do things that way (AppArmor) didn't work out all that well for them.
Care to share your stats?
Posted Jun 3, 2010 17:43 UTC (Thu) by jwarnica (subscriber, #27492)
Well, it seems that the "right thing" in the view of some company depends on what kind of market the company is in.
Component hardware companies typically don't sell software. Getting their new code into the kernel means *poof* they now have a bazillion systems that can use their hardware. It isn't to Intel's advantage to keep their own git repository somewhere. If I, as an end user of some Intel chipset, can't get it to work on my software far, far removed from Intel's repo, maybe next time I won't get a mobo with Intel Inside.
Appliance/embedded hardware companies, or OS companies, are a different story. Doing the globally "right thing", "upstream first", means they are slower to deliver their actual product, and (it should be noted) their actual product has less distinction than its competitors do. Sure, the patch may very well be GPL'd, but their competitor's patch which they just threw over the wall is harder for someone to use than something upstream. In a sense, it may as well be a secret.
More simply: If the end user is likely to interact directly with a single vendor, then that vendor can put their patches wherever they want, and not running the gauntlet of the LKML is cheaper. If the end user is far removed from the provider, the provider should try to get that patch far and wide, which means in the upstream kernel.
So companies that do the globally "right thing" are rewarded by being slower, and less distinct, than those that don't.
Moving on:
I think part of the lesson here is that "be sure that the code can get into the mainline kernel first" is impossible to test. Until you actually submit code to the LKML, you have no idea what kinds of helpful, productive, petty, or absurd comments you will get in response. No one can predict with any level of accuracy whether something will be accepted until it actually shows up in a release.
Care to share your stats?
Posted Jun 4, 2010 12:46 UTC (Fri) by kpvangend (guest, #22351)
I don't think bringing in Intel as an example is either fair or correct.
Intel can ship their processors without specific Linux support if they want to and the Linux code is not inside the box they ship.
Doing feature development like Intel or IBM can afford has interesting dynamics. For starters: not much secrecy. Secondly, no time-to-market pressure. Thirdly, the freedom to pick versions and platforms you want.
In contrast, most embedded vendors (and for now, I'm putting Google in that box, too) ship a Linux inside their box, running on some platform the software guys didn't choose.
If they take the time to merge their code upstream, they cannot ship.
And yes, many companies have failed by spending too much time in the community. Just compare the number of announcements on LinuxDevices.com with the amount of code merged and the number of products shipped.
When doing embedded development, your boss will only allow you a small window in which you can merge stuff upstream and benefit from it at the same time:
* after the prototype starts working
* before the code freeze happens
That period (in most cases I've seen, only a month or so) will be quickly over if you get push-back.
And then the madness of everyday work (bug hunts, etc) will draw you back inside your company.
Care to share your stats?
Posted Jun 11, 2010 21:00 UTC (Fri) by aliguori (subscriber, #30636)
> Doing feature development like Intel or IBM can afford has interesting dynamics. For starters: not much secrecy. Secondly, no time-to-market pressure. Thirdly, the freedom to pick versions and platforms you want.
I can promise you, there certainly is time-to-market pressure. And no publicly traded company can discuss products before they've been officially announced, so that does mean working with the community on a feature for a product that you can't talk about.
Care to share your stats?
Posted Jun 3, 2010 16:11 UTC (Thu) by anton (guest, #25547)
So I guess you are saying that "upstream first" costs more in opportunity costs (worse time-to-market) than releasing before it has been upstreamed costs in additional development time for maintenance and increased later upstreaming effort.
Upstream first policy
Posted Jun 3, 2010 11:35 UTC (Thu) by epa (subscriber, #39769)
> Maintainability is much more important than functionality.
To whom? Not to the users. Who is the development process intended to benefit?
Upstream first policy
Posted Jun 3, 2010 12:56 UTC (Thu) by corbet (editor, #1)
Yes, it's important to the users... unless you assume that all of those users want to be running something other than Linux in five years. Without a focus on maintainability you will shortly have a kernel which nobody wants to run.
Upstream first policy
Posted Jun 3, 2010 13:20 UTC (Thu) by michel (subscriber, #10186)
Not sure who you consider the user in this case. If it's Google (as the user/consumer of the kernel), I can agree with the comment. If it's a consumer using an Android-based phone, I think the vast majority of them couldn't care less if it's Linux under the hood.
Upstream first policy
Posted Jun 3, 2010 13:42 UTC (Thu) by rvfh (subscriber, #31018)
Indeed, the user is Google in this case, just like they use Linux in many other places. If Google starts seeing a decline in Linux quality, they will either fork or switch. And to some extent, developing wakelocks behind closed doors could be considered a kind of fork (though on a small scale).
Upstream first policy
Posted Jun 3, 2010 14:41 UTC (Thu) by dgm (subscriber, #49227)
I bet it would be rather the opposite. If Google starts to see that Linux does not have the _functionality_ they want, they will fork or switch.
Why do you think they are _not_ using some of the BSDs?
Upstream first policy
Posted Jun 3, 2010 15:45 UTC (Thu) by iabervon (subscriber, #722)
Users actually care a whole lot about maintainability of the code if it affects the quality of the maintenance that gets done. They'll be unhappy if apps they've gotten for their phones start misbehaving when they either upgrade the OS or get a new phone. This comes down to the question of whether the APIs that the apps use can be maintained across changes to the underlying system, and has implications for whether your favorite third-party IM program drains your battery when you're idle online or alternatively stops exchanging audio if you don't touch anything during a voice chat.
If Google's using a design that hasn't passed muster, and they eventually switch to a better design, and the original API bitrots, that ends up impacting users, especially ones who have the idea that they can buy an Android phone with the expectation that any program that they come to like will keep working forever.
Upstream first policy
Posted Jun 3, 2010 14:56 UTC (Thu) by epa (subscriber, #39769)
Agreed, a focus on maintainability is important. But which is more maintainable? Merging existing, working, widely deployed code - or forcing developers like Google to stay out-of-tree for five years?
My point is that the fact that some code is already being used on millions of devices and works *now* should carry some weight, even in assessing future maintainability. (It's much more likely that little-used features will suffer code rot, no matter what their conceptual purity.) At the moment it appears to get no weight at all.
Upstream first policy
Posted Jun 3, 2010 16:54 UTC (Thu) by cry_regarder (subscriber, #50545)
Of course it got weight...tons of weight. If it hadn't we wouldn't be talking about this now.
Also, the "millions of devices" is a red herring. It is just a handful of different devices, all of the same class. The kernel developers need a solution that works for a vast range of devices over the long haul.
Upstream first policy
Posted Jun 3, 2010 13:36 UTC (Thu) by neilbrown (subscriber, #359)
> Who is the development process intended to benefit?
Make no mistake: the development process is intended to benefit the developers.
In the case of Linux, many of the developers are users first, and developers second (I certainly started that way), so as a consequence it ends up being focused on benefiting users too, which is nice.
Upstream first policy
Posted Jun 3, 2010 14:08 UTC (Thu) by faramir (subscriber, #2327)
> In the case of Linux, many of the developers are users first, and developers second (I certainly started that way), so as a consequence it ends up being focused on benefiting users too, which is nice.
Depending on how you define it, that should read "benefiting A FEW users".
Between TiVos, WRT54G routers, Android phones, some TVs, and a host of similar products, I suspect that the vast majority of users are not developers of any sort. In most cases, the manufacturers of those products discourage development as well (Android is obviously different).
As has already been stated elsewhere, these users usually neither know nor care that Linux is involved. That doesn't mean that kernel policy (to the extent it exists) should change. But let's be honest here: this is about certain kinds of developers, not users.
If one is a developer of an appliance-type product, there would appear to be little reason to even subscribe to LKML, let alone be involved in the development process. Your product life cycle is short and chances are that any significant kernel changes that you propose will either take too long or never get accepted.
Upstream first policy
Posted Jun 3, 2010 14:32 UTC (Thu) by corbet (editor, #1)
If you are the developer of one appliance-type product, then maybe you can ignore the process. However, the life cycle of such products tends not to be very long; soon you'll be developing another one. There comes a point where you can't drag that 2.4.x kernel forward any further; it just won't work on the hardware you're using. So you're stuck with trying to make your stuff work on something newer. And that will be painful.
I've consulted for companies like this. Had they worked with upstream and made sure the stuff they needed got there, they would have found it waiting for them when the time came to move to a newer kernel. Instead, they set themselves up for a bunch of high-intensity, short-deadline pain. That can be lucrative for kernel consultants, but it's not really a good way to run a company.
To me, treating the kernel as a throwaway resource doesn't make sense even for the most myopic of embedded systems developers. Unless they plan to go out of business soon, they will want a maintainable kernel five or ten years down the road, and they will want it to meet their particular needs. And that doesn't just happen by chance.
Upstream first policy
Posted Jun 3, 2010 16:03 UTC (Thu) by fuhchee (subscriber, #40059)
"We don't want the rival solution that works now, we want the rival solution that will still work in 5 years."
Considering how much of the kernel is regularly rewritten or deprecated, this policy appears to be selectively applied.
Upstream first policy
Posted Jun 3, 2010 17:14 UTC (Thu) by martinfick (subscriber, #4455)
Yes, selectively, but the point you likely missed is that it is applied with a strong focus on maintaining the kernel/userspace API.
Upstream first policy
Posted Jun 3, 2010 18:24 UTC (Thu) by foom (subscriber, #14868)
Well, 5 years isn't actually that long. Most parts of the kernel userspace API that get deprecated and removed lasted far longer than 5 years in their previous incarnation. :)
Upstream first policy
Posted Jun 14, 2010 23:10 UTC (Mon) by aigarius (subscriber, #7329)
"Considering how much of the kernel is regularly rewritten ..."
That's actually the whole point - if what you have in the kernel is a custom-made ABI-locked solution that is distributed to millions of devices and can never-ever change, then there can be no rewrite, full or partial, and the kernel stagnates.
There are from time to time changes in the kernel that require kernel developers to change things around. And they need the freedom to do this. Now and in 5 years' time. That is why they insist on keeping out things that they will not be able to change later on, including strict ABIs and narrow use cases in the generic parts of the code.
Google already got the benefit from this code being open so they could add this feature, but here the question is how to balance the maintenance burden of the feature on one hand with the usefulness of this feature to other people on the other. The suggestions in the LKML dealt with both sides - they reduced the maintenance burden by focusing the changes in fewer places and increased the usefulness of the feature by making it more generic.
If before the discussion the usefulness of the code (to people outside Google) was less than the added maintenance burden it put on the kernel developers, then after the new proposal is implemented its usefulness just might be higher than the burden.
Upstream first policy
Posted Jun 3, 2010 16:27 UTC (Thu) by bfields (subscriber, #19510)
> Developers focused on "make it work so we can ship" are going to be less focused on maintainability (or at least, that is the way it appears).
There can also be maintainability risks from designs that look elegant/highly general/whatever but that haven't been tested in the field.
I'm not really arguing one side or the other. In practice I think the really hard stuff is hard to get right without working on both tracks (thinking through the design carefully, and testing it in real situations) in parallel.
Upstream first policy
Posted Jun 4, 2010 3:41 UTC (Fri) by neilbrown (subscriber, #359)
Yes, nothing is really black-or-white, is it?
I actually think there is a place for saying that a given interface is *not* permanent. That seems to be the main sticking point here.
If it were just code, we could import it, tidy it up, and be happy. Maybe it would change completely over a few release cycles. But as there is an interface involved that not everyone agrees with, we are stuck waiting for "perfect".
If we could say "This interface is only guaranteed to work with this library" or in some cases "... with this program", then I feel there would be a lot more room for flexibility. I have a vague feeling that ALSA works like this, but I'm not certain.
We have well-understood infrastructures for versioning libraries, breaking old APIs, having multiple versions available and allowing old versions to be discarded selectively by distros. It would be great if the kernel interface could benefit the same way, and I think it should be possible to head that way.
Specifically, the nfsservctl syscall is probably totally unused these days, but it keeps a quantity of legacy code in the kernel which has to be maintained (though it is entirely possible that it is broken and nobody noticed).
Similarly the ioctls used for md/raid should go (though mdadm would need an update first - I haven't bothered because I "know" the ioctls have to stay) ... actually I now see that the sysfs interface I created to replace the ioctl interface is pretty horrible and really needs to be redone. If I could be sure that all users used mdadm ... or some library that I could create ... it would be a lot easier to deprecate old stuff.
Would that have helped with wakelocks? It is hard to be sure, but I think that it may well have done.