Android, forking, and control: Communication

Posted Jun 7, 2011 15:43 UTC (Tue) by PaulMcKenney (subscriber, #9624)
Parent article: Android, forking, and control

One of the big problems between Android and the Linux kernel community has been an inability to communicate effectively. James noted that Android "did everything we told them not to, and won big." One might also argue that Nokia did everything the community told them to, and lost big. So perhaps part of the problem was that the Linux community did not understand the mobile arena all that well (this certainly applies to me, given my server background).

For its part, the Android community has not always explained its requirements clearly. From what I can see, the Android community was in fact doing its level best, but their (clearly superior) knowledge of the mobile arena includes subconscious as well as conscious components. When people ask them what they need, the Android guys dutifully list out their conscious knowledge. When a Linux hacker cranks out a patch, the result will be unsatisfactory: unless the hacker is luckier than anyone deserves, the patch will fail to meet the requirements in the subconscious list. The Android developer then says "We told you exactly what to do, how could you mess it up so badly???" while the hapless Linux community member says "I did exactly what you told me, but you changed the requirements!!!" This sequence of events clearly will not do much to build trust between the Android and the Linux communities.

By the way, this problem is not specific to the Android folks: Those of us at Sequent had exactly the same problem explaining parallel programming. And some might argue that we still do have this problem!

How can this problem be addressed?

Training at Sequent surrounded new engineers with experienced parallel programmers. Over the course of a few months, the implicit knowledge buried in our subconscious minds would flow to the new engineer. Unfortunately, this apprenticeship model relies on there being lots of experienced engineers and only a few trainees, so it might take quite some time for the few Android developers to train the huge number of Linux kernel hackers.

Another approach is to accept the fact that getting Android functionality into the Linux kernel will be an iterative process. A big part of the purpose of the first N patches is not so much to implement the functionality, but rather to learn what the requirements actually are. This approach can work very well, but it clearly requires considerable patience on the part of both the Android developer and the Linux kernel hacker.

Both these approaches take a lot of time. But given that the Android folks have optimized their patchset for forward porting, perhaps we can afford to take the time needed to get it right.


Android, forking, and control: Communication

Posted Jun 7, 2011 15:47 UTC (Tue) by martinfick (subscriber, #4455)

> One might also argue that Nokia did everything the community told them to, and lost big.

I doubt anyone would argue that. Nokia never lost, they simply gave up right after they started.

Android, forking, and control: Communication

Posted Jun 7, 2011 16:39 UTC (Tue) by boudewijn (subscriber, #14185)

If you define "right after they started" as a period of over five or six years, yeah, then they gave up right after they started. They gave the GTK/Gnome world a long enough chance to produce something decent that a whole ecosystem of small Gnome/GTK-based companies sprouted. The same happened after they had to decide that Gnome/GTK was never going to work across Maemo and Symbian and they had to look for something better, which is Qt.

From my own experience with Nokia, with their involvement with first KOffice and then Calligra, a project that started in 2009, they really did everything right, and did do everything the way the community said it wanted -- if you budget for the fact that there was a learning curve for both the community and the company, which is only fair.

All the work on the KOffice/Calligra engine was done in the open, they took two dozen students as interns in an attempt to grow the community, joined sprints and conferences, used the project bugzilla and the project reviewboard. There's nothing for which they can be blamed and a lot for which they can be praised.

At the MWC in Barcelona, many people felt that Nokia had given working the open source way a fair try twice, and failed hard twice, while Apple with their "grab and don't give anything" mentality are a success and Google with their "grab and dump" mentality are a success, so open source is guaranteed failure.

In the end, though, I am convinced that MeeGo failed for Nokia not because they worked with open source communities the wrong way or the right way, but because of internal problems and because of problems in their partnership with Intel. But nobody in the industry will see it that way.

Android, forking, and control: Communication

Posted Jun 7, 2011 16:43 UTC (Tue) by bronson (subscriber, #4806)

I'm not sure Nokia ever really started! Who wants a 770 or N800 without phone capability? Nobody, that's who. Unless it's a Gnome freebie.

By the time the N900 finally came around, a device that Joe Q. Public might buy, the race was already over.

Android, forking, and control: Communication

Posted Jun 7, 2011 16:57 UTC (Tue) by karim (subscriber, #114)

Actually, it seemed to me very early on that Nokia's Maemo work was considered experimental lab stuff that was worth pursuing but never truly expected to become Nokia's bread and butter. Once the touch-based revolution started with iPhone and Android, it looks like they scrambled to find something and Maemo was the only "viable" route. Unfortunately, Maemo wasn't rooted in a "we're going to conquer the world" philosophy and $$$ backing. And to add insult to injury, once they discovered that Maemo was about their only card, they decided to merge it with Intel's Moblin without ever engaging the Maemo community about it.

I don't think this is a development philosophy issue as much as it's a lack of market understanding, a failed go-to-market strategy and a breach of trust with an established community. Then again, I might be completely beside the track.

Android, forking, and control: Communication

Posted Jun 9, 2011 4:09 UTC (Thu) by tytso (subscriber, #9993)

If you look at the Businessweek article, according to Nokia they gave up on Meego because it took too long. Symbian phones were dying on the vine, and they had been dropping prices, and thus losing profit margins, in order to keep their declines from getting worse. I think it is reasonable to posit that perhaps Meego's strict adherence to open source development principles, with the "it will be done when it is done" attitude, may have caused the schedule to slip out too far for Nokia's needs, and that they couldn't have stayed in business waiting another year or two for Meego to become sufficiently mature for their needs.

Sometimes, you need to know when it's a better path to take on a little technical debt. Too much will of course sink you, but playing things too conservatively can also be a path to losing.

MeeGo

Posted Jun 9, 2011 14:27 UTC (Thu) by corbet (editor, #1)

Interesting thought, Ted, but do you have an example of actual delays caused by "strict adherence to open source development principles"? In the Android case, it's obvious that waiting until wakelocks were upstream would have been fatal. I'm not sure I can find a similar situation on the MeeGo side. I suspect that process delays, in this case, are minimal compared to those caused by merging two projects, changing graphical toolkits, etc.

What am I missing?

MeeGo

Posted Jun 9, 2011 14:52 UTC (Thu) by spaetz (subscriber, #32870)

I would also be interested in examples of this. In my book, changing the underlying fundamentals when nearly finished and restarting basically from scratch is a more likely explanation. Or lack of top-management commitment.

Actually, interviewing the head of Maemo, I learned that they tried to internally fork first, and that didn't play out so well. So I doubt that more "technical debt" would have helped here.

MeeGo

Posted Jun 9, 2011 15:34 UTC (Thu) by PaulMcKenney (subscriber, #9624)

Is it possible that Nokia took the approach of reworking lots of user-mode code in order to avoid depending on something like wakelocks? If so, mightn't this reworking have greatly increased the costs and calendar time required?

MeeGo

Posted Jun 9, 2011 16:06 UTC (Thu) by mjg59 (subscriber, #23239)

There's definitely a cost associated with making sure that your userspace behaves itself in a wakelock-free environment, although there are associated benefits. Assuming a well-behaved userland and kernel, taking a wakelock should be approximately a zero-cost operation as far as power budget goes. But Android's userland and kernel don't seem well behaved. Running powertop on my Nexus One (screen off) shows over 100 wakeups per second. Some of that's because I've got USB connected and some of that's because it's monitoring the battery charging, but there's also bits of userspace doing stuff and active filesystem threads and it's all a bit of a mess. The result of this is that any Android app taking a wakelock pretty much instantly halves your battery life, even if that app doesn't do anything.

So while there's engineering cost involved in reworking some portions of userspace, there's also a measurable engineering benefit in doing so even if you have wakelocks. The market hasn't really been given an opportunity to decide whether or not that's important yet.

(There's technical mechanisms to force userspace to behave without wakelocks, as long as you can assume a userspace that isn't actually pathological. The engineering involved is probably still fairly minimal. They're also undemonstrated, which is more of a problem. I should really get round to trying to prototype that)

Looking forward to seeing your prototype!

Posted Jun 10, 2011 0:00 UTC (Fri) by PaulMcKenney (subscriber, #9624)

Your prototype does sound interesting! Your point is that Nokia was trying to create something similar? Or is your point instead that it should be possible to improve on wakelocks?

Also, when you say that an Android app taking a wakelock pretty much halves your battery life, you are thinking of an Android app that holds a wakelock indefinitely, as opposed to (for example) acquiring it and then immediately releasing it, correct? If the battery lifetime is halved by an Android app holding the wakelock indefinitely, then the Android folks might reasonably argue that this is evidence that wakelocks double battery lifetime.

Looking forward to seeing your prototype!

Posted Jun 10, 2011 0:28 UTC (Fri) by mjg59 (subscriber, #23239)

Wakelocks double your battery life if your userspace is badly behaved, right. But a badly behaved userspace also reduces your battery life when the phone is in active use, so there's still an incentive to fix it - it's just less significant overall.

I honestly haven't investigated Meego's full power management implementation, but the easiest way to implement management of this would be to use the new cgroup timer slack mechanism. You'd still require some sort of userspace-level wakelock implementation, but when no userspace locks are held you extend the timer slack for all untrusted applications out to infinity (or as close as possible). Events will still be handled in a timely manner, but once a task has finished handling one and hits a select or timer it won't be scheduled again until another event hits.
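To make the idea concrete, here is a toy Python model of that policy. It is a sketch under stated assumptions, not kernel code and not anything MeeGo or Android shipped: `SlackPolicy`, `slack_for`, and `timer_fire_time` are hypothetical names, the 50 µs default is just a parameter, and real timer slack is managed by the kernel rather than by a class like this.

```python
import math


class SlackPolicy:
    """Toy model: userspace locks decide how much timer slack untrusted tasks get."""

    def __init__(self, normal_slack=0.00005):
        self.normal_slack = normal_slack  # seconds; illustrative default
        self.held = set()                 # names of userspace locks currently held

    def acquire(self, name):
        self.held.add(name)

    def release(self, name):
        self.held.discard(name)

    def slack_for(self, trusted):
        # Trusted tasks always keep normal slack. Untrusted tasks get
        # unbounded slack whenever no userspace lock is held.
        if trusted or self.held:
            return self.normal_slack
        return math.inf


def timer_fire_time(deadline, slack, next_event):
    """Return when a timer with the given slack would actually run.

    The timer may be deferred as late as deadline + slack, but it is
    batched with an external event if one arrives inside that window.
    next_event is the time of the next external event, or None.
    """
    latest = deadline + slack
    if next_event is not None and deadline <= next_event <= latest:
        return next_event
    return latest
```

With a lock held, an untrusted task's timer runs close to its deadline. With no locks held, the same timer is deferred until the next external event, and never runs if no event arrives, which is the "won't be scheduled again until another event hits" behaviour.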

This ought to handle the case that wakelocks are designed to handle, which is that an aggressive suspend implementation leaves you open to races between your suspend policy and event delivery. It also avoids the problem of using a freezer cgroup, where not all events are delivered through a framework so you still have races.
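The race in question can be shown with a small deterministic replay. This is a simplified sketch, assuming a suspend policy that checks for idleness and then suspends as two separate steps; the function `replay` and the step names are hypothetical, and real wakelock and suspend code paths are considerably more involved.

```python
def replay(steps, use_wakelock):
    """Replay an interleaving of 'check', 'event', 'handle' and 'suspend' steps.

    Returns True if the system suspends while an event is still unhandled
    (the race), and False otherwise.
    """
    pending = 0      # events delivered but not yet handled
    locks = 0        # wakelock count, only used when use_wakelock is True
    decided = False  # the policy has decided the system looks idle

    for step in steps:
        if step == "event":
            pending += 1
            if use_wakelock:
                locks += 1           # lock taken at delivery time
        elif step == "handle":
            if pending > 0:
                pending -= 1
                if use_wakelock:
                    locks -= 1       # lock dropped once the event is handled
        elif step == "check":
            decided = pending == 0
        elif step == "suspend":
            if not decided:
                continue             # policy never asked for suspend
            if use_wakelock and locks > 0:
                decided = False      # suspend is refused while a lock is held
                continue
            return pending > 0       # suspended; a race if work remains
    return False


# An event slips in between the idle check and the suspend itself.
racy = ["check", "event", "suspend"]
lost_without_lock = replay(racy, use_wakelock=False)
lost_with_lock = replay(racy, use_wakelock=True)
```

In this model the event is stranded when nothing blocks the suspend, and the suspend is refused when a lock was taken at delivery time. Any mechanism that replaces wakelocks has to close the same window.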

The main difference between wakelocks and this is that truly pathological applications still hurt you (if something's in a tight loop it'll still be scheduled), but also you have to trust your underlying layers to be correct (ie, never to wake up unless event delivery occurs). Running the current Android implementation in such an environment would result in a measurable decrease in battery life. But if your framework's been designed with this in mind, it means you get the benefits of aggressive suspend without the overhead of maintaining semi-fragile kernel modifications (some of which certainly stand no chance of going upstream).

There's no fundamental reason to think that the Android approach involved less engineering effort than Nokia's, and the long-term outcome should have been pretty similar in terms of ideal-case battery life. My experience is that it's not difficult to ensure that your core code is event driven provided that that's a design goal from the outset. So I don't think this distinction is what led to Nokia ending up so far behind schedule. There's probably more straightforward reasons for that.

Nokia, Android, and Approach to Open Source

Posted Jun 13, 2011 15:23 UTC (Mon) by PaulMcKenney (subscriber, #9624)

You make some good points, but you have not convinced me that Nokia's downfall was totally unrelated to their open-source strategy. Of course, I would not be surprised that there were also other factors at work, but from what I can see, their approach to open source was a contributing factor.

Then again, there is a good chance that Nokia's current experience will become a prominent business-school case study. In which case, few will pay attention to what either you or I think about the matter. ;-)

Nokia, Android, and Approach to Open Source

Posted Jun 13, 2011 15:40 UTC (Mon) by mjg59 (subscriber, #23239)

The engineering time lost rebasing everything on top of Qt because Nokia wanted a migration strategy from Symbian would seem to be massively larger than the time spent on reducing userspace wakeups (something that could probably be done in a few months by a single competent engineer). It's possible that their approach killed them, but when we're talking about dysfunctional management and massive political infighting between the Symbian and Meego teams it's a lot easier to just ascribe it to corporate ineptitude. We're talking about a company that refused to consider the iphone a threat because its camera had fewer megapixels than their latest Symbian device, even if said Symbian device was approximately unusable as a phone.

It's interesting to compare to Palm. Their kernel was pretty much stock (there's some additional drivers), and while they didn't take the effort to attempt to upstream stuff they'd probably have had little trouble in doing so. They ended up shipping a sufficiently attractive mobile OS that it achieved their goal of turning a failed company into an acquisition target. Which model were they closer to?

Nokia, Android, and Approach to Open Source

Posted Jun 17, 2011 21:21 UTC (Fri) by oak (guest, #2786)

> engineering time lost rebasing everything on top of Qt

I might be confusing what's in public MeeGo and "Nokia MeeGo", but if you look at the stuff in the public gitorious repos, you notice that it wasn't just "rebasing everything on top of Qt", but first writing a completely new widget toolkit[1] on top of the Qt GraphicsView (which toolkit seems to have appeared first under a different name[2], so maybe even that was partially rewritten). One assumes that apps were started before this new toolkit was at the maturity level of e.g. (over a decade old) Qt or Gtk, which "can" affect how much time went into writing the apps.

And at some point QML came into the picture and now there seems to be yet another widget toolkit[3], this time done on top of QML...

[1] http://meego.gitorious.org/meegotouch/libmeegotouch
[2] http://qt.gitorious.org/maemo-6-ui-framework
[3] http://qt.gitorious.org/qt-components/qt-components

Android, forking, and control: Communication

Posted Jun 7, 2011 16:35 UTC (Tue) by dgm (subscriber, #49227)

> When people ask them what they need, the [users] dutifully list out their conscious knowledge. When a [developer] cranks out a patch, the result will be unsatisfactory: unless the [developer] is luckier than anyone deserves, the patch will fail to meet the requirements in the subconscious list. The [client] then says "We told you exactly what to do, how could you mess it up so badly???"

This story is universal. It's one of the few reasons why software projects are always late, over-budget and do not deliver the expected features.

What is really needed is someone who has a foot on each side, and is respected by both. If this someone comes from the developer side, he needs to take the time to learn how the client side does its thing, and to feel the "pain" they feel. If he comes from the other side, he has to be someone capable of solving (and willing to solve) problems with logic, not political pressure.

Android, forking, and control: Communication

Posted Jun 7, 2011 17:55 UTC (Tue) by PaulMcKenney (subscriber, #9624)

And isn't this the problem that "extreme programming" and "agile methods" were supposed to solve? ;-)

Android, forking, and control: Communication

Posted Jun 7, 2011 18:35 UTC (Tue) by dlang (✭ supporter ✭, #313)

kernel development is probably the best example of 'agile methods' that you can find.

there is no planned feature set for each release, instead each development team works on their set of features and when the merge window starts, anything that's ready gets pushed upstream.

but this only works if the development teams work on smallish features (or do their development in a way that gradually changes the result rather than big bang changes)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds